CIBRRIG
:book: ReadTheDocs
Support for extraction, preprocessing, sorting, analysis and plotting of physiology and Neuropixel recordings from rig to fig
Description
Code to integrate hardware and software on the Neuropixel rig in JMB 971 at Seattle Children's Research Institute, Center for Integrative Brain Research (SCRI-CIBR). This code is maintained by Nick Bush in the Ramirez Lab and is subject to change.
The rig is designed to monitor breathing and behavior in a head-fixed mouse while recording from Neuropixels probes throughout the brain. The rig can be hot-swapped between awake and anesthetized preps.
The code incorporates both custom code specific to the 971 rig and more general analyses applicable to Neuropixel recordings of respiratory/physiological systems.
IMPORTANT
This code is designed to work in conjunction with the hardware in the pyExpControl repository. Most functionality can be used independently of this hardware, but the most critical piece is the log file that is generated automatically during recording with this hardware.
The log file is a .tsv file with the name _cibrrig_<run_name>.g<x>.t<x>.tsv. It has the required columns [label, category, start_time, end_time] and optional columns that describe parameters of the events (e.g., frequency, duration…). One could create these log files manually if desired, or ignore them entirely, but some functionality will fail without them.
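For anyone creating a log file by hand, the following is a minimal sketch using only the Python standard library. The helper names (write_log, log_filename), the example event values, and the assumption that the two <x> placeholders are integer gate and trigger indices are all illustrative and not taken from cibrrig itself.

```python
import csv
import tempfile
from pathlib import Path

# Required columns as listed above; anything else is optional.
REQUIRED_COLUMNS = ["label", "category", "start_time", "end_time"]


def log_filename(run_name, gate, trigger):
    """Build a name matching _cibrrig_<run_name>.g<x>.t<x>.tsv (hypothetical helper)."""
    return f"_cibrrig_{run_name}.g{gate}.t{trigger}.tsv"


def write_log(path, events, optional_columns=()):
    """Write a list of event dicts to a tab-separated log file (hypothetical helper)."""
    fieldnames = REQUIRED_COLUMNS + list(optional_columns)
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, delimiter="\t", restval="")
        writer.writeheader()
        for event in events:
            missing = [c for c in REQUIRED_COLUMNS if c not in event]
            if missing:
                raise ValueError(f"event is missing required columns: {missing}")
            writer.writerow(event)
    return Path(path)


# Made-up example events; the labels and categories are placeholders.
events = [
    {"label": "opto_pulse", "category": "opto", "start_time": 10.0, "end_time": 10.5, "frequency": 20},
    {"label": "hypoxia", "category": "gas", "start_time": 60.0, "end_time": 360.0},
]

with tempfile.TemporaryDirectory() as tmp:
    out = write_log(Path(tmp) / log_filename("mickey_mouse", 0, 0), events, optional_columns=["frequency"])
    header = out.read_text().splitlines()[0].split("\t")

print(header)
```

Events that do not use an optional column simply get an empty cell, which keeps every row the same width.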
Installation
Create a virtual environment using mamba/conda.
[!WARNING] If on SCRI networks, it is critically important to specify the Python version here. This circumvents the SSL issue we have been running into. BE SURE YOU HAVE MODIFIED YOUR .condarc file (in C:/Users/<user>) appropriately.
mamba create -n cibrrig python=3.12
mamba activate cibrrig
Then change directory to a place to install cibrrig locally.
[!IMPORTANT] If you are on the NPX 971 room computer, this has already been cloned and you should just install into your new venv.
cd C:/helpers/cibrrig
git pull
pip install -e .
(Note the period.) Otherwise, clone the repo:
cd </path/to/somewhere/reasonable/>
git clone https://github.com/nbush257/cibrrig
cd cibrrig
pip install -e .
(Note the period.) Once your virtual (mamba/conda) environment has been set up, running git pull in the cibrrig directory will update cibrrig, so you do not have to redo the pip install.
[!WARNING] To do manual spike curation, you will need to install phy into a separate conda/mamba environment due to some dependency issues at the moment. See: https://github.com/cortex-lab/phy
Then, make sure the GPU is working for Kilosort (see the Kilosort install instructions, steps 7 and 8).
Next, if the CPU version of PyTorch was installed (this will happen on Windows), remove it with:
pip uninstall torch
Then install the GPU version of PyTorch:
conda install pytorch pytorch-cuda=11.8 -c pytorch -c nvidia
Make sure you are using the GPU by running the Kilosort GUI:
python -m kilosort
and confirming that the PyTorch device is the GPU and not the CPU.
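As a quick check that does not require opening the GUI, PyTorch can be asked directly whether it sees a CUDA device. This is a small sketch, not part of cibrrig; describe_torch_device is a hypothetical helper name, and it only reports what PyTorch detects in the currently active environment.

```python
def describe_torch_device():
    """Return 'cuda', 'cpu', or 'torch-not-installed' for the active environment."""
    try:
        import torch
    except ImportError:
        # PyTorch is missing entirely, e.g. after `pip uninstall torch`.
        return "torch-not-installed"
    # torch.cuda.is_available() is True only for a CUDA build with a usable GPU.
    return "cuda" if torch.cuda.is_available() else "cpu"


print(describe_torch_device())
```

If this prints cpu on a machine with a GPU, the CPU-only build of PyTorch is most likely still installed.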
Helper packages (primarily MATLAB packages) should live in C:/helpers on the NPX computer so they are available to all users. Some functionality relies on these packages, but much of it is being phased out.
These include:
Kilosort (versions 2,3)
Chronux http://chronux.org/
Breathmetrics https://github.com/zelanolab/breathmetrics
SALT (Kvitsiani et al. 2013)
:exclamation: Quick start and Data structure
From local computer :computer:
:warning: This performs all processing on the local computer and ties up the resources. This workflow can get backed up if things go sideways.
If you have recorded a dataset on the NPX computer you can simply open a command prompt and run:
mamba activate cibrrig
npx_run_all
This will open a GUI that prompts you to choose some options and point to where you want the files saved.
From sasquatch (HPC) :monkey:
:warning: Performing the computation on sasquatch keeps the acquisition rig cleaner.
First, compress and back up the dataset with:
mamba activate cibrrig
backup </local/run/path> <baker/path>
Example:
backup D:/Subjects/mickey_mouse \\baker.childrens.sea.kids/archive/ramirez_j/ramirezlab/alf_data_repo/ramirez/Subjects
Second, sign on to a sasquatch login node and run:
mamba activate iblenv
pipeline_hpc </baker/path> --no-qc
N.B. This rsyncs the data to the sasquatch drive, submits SLURM jobs on sasquatch nodes, then moves the data to the ramirezlab alf repository.
[!NOTE] There is incomplete code to run the pipeline via a series of SSH commands (run_sasquatch.from_NPX), but it is not finished.
Details:
Main entry points can be run from anywhere as long as the package has been pip installed
:arrow_right: Pipelines (commands involved in end-to-end processing)
npx_run_all - Opens a GUI to perform backup, preprocessing, and spikesorting
backup <local_run_path> <remote_subjects_path> - Just performs backup
pipeline_hpc <run_path> - Copy from run path to sasquatch tempdir, run pipeline, move to ramirezlab alf repo
Modules (Parts of the pipeline that can be run separately if needed)
npx_preproc <session_path> - Just performs preprocessing and extraction.
ephys_to_alf <run_path> - Rename the recorded data to alf format
spikesort <session_path> - run spikesorting
convert_ks_to_alf <session_path> <sorter> - convert sorted neural data from kilosort (i.e., phy) to ALF format, with <sorter> naming the sorter (e.g., kilosort4)
ephys_qc <session_path> - Run IBL ephys qc and plots
In practice, it is easiest to simply run npx_run_all after recording. Previously run steps will be skipped or appropriately overwritten. Some users have shortcuts to batch scripts that activate the virtual environment and run this.
Data structure
We save data in a way consistent with the Open Neurophysiology Environment (ONE). For a detailed description of filenames and structure, see: ONE Naming
Data should be organized with the following structure:
./<lab>/Subjects/<subject-id>/<yyyy-mm-dd>/<session_number>
e.g.:
alf_data_repo/
├─ ramirez/
│ ├─ Subjects/
│ │ ├─ leonardo/
│ │ │ ├─ 2024-08-01/
│ │ │ │ ├─ 000/  <- SESSION_PATH
│ │ │ │ ├─ 001/
│ │ │ ├─ 2024-08-02/
│ │ │ │ ├─ 000/
│ │ ├─ donatello/
│ │ │ ├─ 2024-03-05/
│ │ │ │ ├─ 000/
├─ sessions.pqt
├─ datasets.pqt
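The layout above can be expressed as a small path-building helper. This is only an illustrative sketch using pathlib; build_session_path is a hypothetical name, and the three-digit zero padding of the session number is inferred from the 000/001 folders in the example tree.

```python
from datetime import date
from pathlib import Path


def build_session_path(root, lab, subject, session_date, number):
    """Return <root>/<lab>/Subjects/<subject>/<yyyy-mm-dd>/<session_number>."""
    return (
        Path(root)
        / lab
        / "Subjects"
        / subject
        / session_date.isoformat()  # yyyy-mm-dd
        / f"{number:03d}"           # zero-padded, e.g. 000 (assumed from the tree above)
    )


session = build_session_path("alf_data_repo", "ramirez", "leonardo", date(2024, 8, 1), 0)
print(session.as_posix())
```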
Data files should have names of the form <object>.<attribute>.<ext>, e.g. spikes.times.npy.
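To make the naming convention concrete, here is a minimal sketch that splits a simple three-part filename into its pieces. It deliberately handles only the <object>.<attribute>.<ext> form described here; the full ALF specification allows additional optional parts, and the ONE library's own parsing utilities should be preferred in real code. AlfName and parse_alf_name are hypothetical names.

```python
from typing import NamedTuple


class AlfName(NamedTuple):
    obj: str        # e.g. "spikes"
    attribute: str  # e.g. "times"
    extension: str  # e.g. "npy"


def parse_alf_name(filename):
    """Split '<object>.<attribute>.<ext>' into its three parts."""
    parts = filename.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a simple <object>.<attribute>.<ext> name: {filename!r}")
    return AlfName(*parts)


print(parse_alf_name("spikes.times.npy"))
```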
To work with data, you should set up a ONE instance:
from one.api import One
one = One.setup(cache_dir="</path/to/alf_data_repo>")
[!IMPORTANT] Most commands take either a run or a session as input. There is an important distinction between a run and a session.
A run is in SpikeGLX format and refers to any number of "gates" as recorded by SpikeGLX. This folder structure is: <subject>/<subject>_g0...
A session is in ALF/ONE format and refers to a single gate recorded by SpikeGLX, but processed into the format above.
Rule of thumb: if you are working before spikesorting, you are working with the run format. If you are working after spikesorting, it is the session format.
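To illustrate the run side of this distinction, the sketch below pulls the subject name and gate index out of a gate folder name of the form <subject>_g<N>. It is an assumption-laden example, not cibrrig code: parse_gate_folder is a hypothetical helper, and it only handles folder names that end in the gate index.

```python
import re

# Matches names such as "mickey_mouse_g0": everything before the final "_g<digits>"
# is treated as the subject name (an assumption for this sketch).
_GATE_FOLDER = re.compile(r"^(?P<subject>.+)_g(?P<gate>\d+)$")


def parse_gate_folder(name):
    """Return (subject, gate_index) for a '<subject>_g<N>' folder name."""
    match = _GATE_FOLDER.match(name)
    if match is None:
        raise ValueError(f"not a '<subject>_g<N>' folder name: {name!r}")
    return match.group("subject"), int(match.group("gate"))


print(parse_gate_folder("mickey_mouse_g0"))  # ('mickey_mouse', 0)
```

Each gate found this way corresponds to one session after conversion to ALF/ONE format.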
For SCRI/Ramirez Lab users:
The cache_dir lives on the RSS in:
/helens.childrens.sea.kids/active/ramirez_j/ramirezlab/alf_data_repo
which is mounted on sasquatch as:
/data/rss/helens/ramirez_j/ramirezlab
We mirror all but the raw ephys data to sasquatch work nodes at:
/data/hps/assoc/private/medullary/data/alf_data_repo
Now you can structure analysis scripts around the ONE structure. Scripts for analysis of data specific to projects should be maintained separately from this repo. The user is encouraged to use brainbox to manipulate data.
Hardware
IMEC Neuropixels
Sensapex MPM
NI based auxiliary recording
AM systems 1700 Amplifier
Buxco pressure sensor
Legacy Ramirez homebrew hardware integrator (for integrating EMGs)
Valve manifold for gas presentation
100% O2
Room air
10% O2 hypoxia
5% CO2 hypercapnia
100% N2 anoxia
Hering-Breuer closure valve
Optogenetics - 2x Cobalt 473 nm, 1x Cobalt 635 nm lasers.
Arduino based experiment control (inspired by Bpod)
Chameleon Camera(s) - controlled by a teensy camera pulser
USV mic
Olfactometer
Software
hardware: Control, CAD, and diagrams of the rig hardware. Currently hosted in its own repository. See https://github.com/nbush257/pyExpControl
pyExperimentControl: Firmware, GUI, and scripting of Arduino control
archiving: Routines for backing up raw data on the SCRI RSS
preprocess: Extract physiological data, experimental events
sorting: Spikesorting functions and pipelines
postprocess: Compute secondary analyses that rely on spikesorted data
e.g. optotagging, coherence and respiratory modulation calculations, axon/soma categorization
utils: General utility functions
analysis: Single-cell and population analyses.
plot: Frequently reused plotting functions, including latent space plotting
videos: Code to make frequently created videos, including evolution of latent, rasters, and auxiliary data over time.
Primary preprocessing and sorting pipeline
This code provides a simple way to run most of the preprocessing steps that are necessary after a Neuropixel experiment.
Run the pipeline from the cibrrig root with: python main_pipeline.py. This will take several hours.
This pipeline runs:
Backup and compression of raw data
Conversion of raw data structure to ONE
Extraction of auxiliary data
Sync data
Physiology (e.g. breathing)
Camera frame times
Laser data
Spike sorting with Kilosort 4 via spikeinterface
IBL destriping
Motion correction (DREDGE, in Spikeinterface)
(Optional) Optogenetic artifact removal
Spikesorting
QC metrics of the spikesorted data
UnitRefine assignment of Noise, MUA, SUA
Conversion of spikesorted data to ALF format
Concatenation of multiple triggers of auxiliary data and adjusting of time events across streams
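The last step, concatenating triggers and adjusting event times, can be pictured with a simplified sketch. This is not the pipeline's actual implementation: it assumes each trigger's event times are measured from the start of that trigger, that triggers are laid end to end with no gaps, and that each trigger's duration is known. The function name and inputs are hypothetical.

```python
def concatenate_triggers(event_times_per_trigger, trigger_durations):
    """Shift each trigger's event times onto one continuous timeline.

    event_times_per_trigger: list of lists, times in seconds relative to each trigger start.
    trigger_durations: list of trigger lengths in seconds, same order.
    """
    if len(event_times_per_trigger) != len(trigger_durations):
        raise ValueError("need one duration per trigger")
    merged = []
    offset = 0.0  # start time of the current trigger on the concatenated timeline
    for times, duration in zip(event_times_per_trigger, trigger_durations):
        merged.extend(t + offset for t in times)
        offset += duration
    return merged


# Two made-up triggers lasting 2 s and 3 s.
print(concatenate_triggers([[0.5, 1.0], [0.25]], [2.0, 3.0]))  # [0.5, 1.0, 2.25]
```

Real recordings also need alignment across acquisition streams, which this sketch does not attempt.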