CellBin is an image processing pipeline designed to delineate cell boundaries for spatial analysis. It consists of several image analysis steps. Given the image and gene expression data as input, CellBin performs image registration, tissue segmentation, nuclei segmentation, and molecular labeling (i.e., cell border expansion), ultimately defining the molecular boundaries of individual cells. It incorporates a suite of self-developed algorithms, including deep-learning models, for each of the analysis tasks. The processed data is then mapped onto the chip to extract molecular information, resulting in an accurate single-cell expression matrix. For more information on CellBin, please refer to the following link.
CellBin2 is an upgraded version of the original CellBin platform with two key enhancements:
- Expanded Algorithm Library: Incorporates additional image processing algorithms to serve broader application scenarios such as single-cell RNA-seq and plant CellBin.
- Configurable Architecture: Refactored codebase allows users to customize analysis pipelines through JSON and YAML configuration files.
Linux
# Clone the main repository
git clone https://github.com/STOmics/cellbin2
# git clone -b <branch> https://github.com/STOmics/cellbin2
# Create and activate a Conda environment
conda create --name cellbin2 python=3.8
conda activate cellbin2
# Install package dependencies
cd cellbin2
pip install .[cp,rs]
# For development mode (optional):
# pip install -e .[cp,rs] # Editable install with basic extras
# pip install -e .[cp,rs,rp] # Editable install including report module
# If pip fails to install a package, refer to the pyproject.toml file for dependency details.
# Execute the demo (takes ~30-40 minutes on GPU hardware)
python demo.py
Performance Note:
We strongly recommend using GPU acceleration for optimal performance. Below is the runtime comparison of two processing modes for an S1 chip (1 cm² chip area):
Processing Mode | Runtime |
---|---|
GPU | 30-40 mins |
CPU | 6-7 hours |
Benchmark hardware:
GPU: NVIDIA GeForce RTX 3060
CPU: AMD Ryzen 7 5800H
Memory: 16GB
If the pipeline defaults to CPU mode unexpectedly, follow our GPU troubleshooting guide to verify your hardware setup.
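Before working through the troubleshooting guide, a quick check of whether a GPU is visible to the current process can narrow things down. The following is a minimal sketch, not part of CellBin2: the function name `gpu_visibility_report` is hypothetical, and it only inspects the `CUDA_VISIBLE_DEVICES` environment variable and the `nvidia-smi` driver tool. It does not check whether the inference libraries used by the pipeline were installed with GPU support.

```python
import os
import shutil
import subprocess


def gpu_visibility_report():
    """Collect basic hints about whether an NVIDIA GPU is visible to this process.

    Hypothetical helper for illustration; it is not part of CellBin2.
    """
    report = {
        # The example commands in this README set this variable explicitly (e.g. "0").
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
        # Path to nvidia-smi, or None when the NVIDIA driver tools are not on PATH.
        "nvidia_smi_path": shutil.which("nvidia-smi"),
        "gpus": [],
    }
    if report["nvidia_smi_path"]:
        try:
            result = subprocess.run(
                [report["nvidia_smi_path"], "--query-gpu=name", "--format=csv,noheader"],
                capture_output=True,
                text=True,
                check=False,
            )
        except OSError:
            return report
        if result.returncode == 0:
            # One GPU name per output line.
            report["gpus"] = [ln.strip() for ln in result.stdout.splitlines() if ln.strip()]
    return report


if __name__ == "__main__":
    for key, value in gpu_visibility_report().items():
        print(f"{key}: {value}")
```

An empty `gpus` list or a missing `nvidia-smi` suggests a driver-level problem; a populated list with CPU-only runtimes points instead to the Python environment.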
Output Verification:
After completion, validate the output integrity by comparing your results with the Outputs.
The cellbin_pipeline.py script serves as the main entry point for CellBin2 analysis. It supports two configuration approaches:
- Configuration files: Use JSON files for full customization
- Command-line arguments: Quick setup using key parameters with kit-based defaults
Configuration Guide:
See JSON Configuration Documentation for full parameter specifications.
# Minimal configuration (requires complete parameters in JSON)
CUDA_VISIBLE_DEVICES=0 python cellbin2/cellbin_pipeline.py -c <SN> -p <config.json> -o <output_dir>
# Kit-based configuration (auto-loads predefined settings)
CUDA_VISIBLE_DEVICES=0 python cellbin2/cellbin_pipeline.py -c <SN> -i <image.tif> -s <stain_type> -m <expression.gef> -o <output_dir> -k "Kit Name"
# View all available parameters
python cellbin2/cellbin_pipeline.py -h
Parameter | Required* | Description | Examples |
---|---|---|---|
-c | Always | Serial number of chip | SN |
-o | Always | Output directory | results/SAMPLE123 |
-i | Kit mode | Primary image path (required for kit-based mode) | SN.tif |
-s | Kit mode | Stain type (required for kit-based mode) | DAPI, ssDNA, HE |
-p | Optional | Path to custom configuration file (see JSON Configuration Documentation) | config/custom.json |
-m | Optional | Gene expression matrix | SN.raw.gef |
-mi | Optional | Multi-channel images | IF=SN_IF.tif |
-pr | Optional | Protein expression matrix | SN_IF.protein.gef |
-k | Kit mode | Kit type (required for kit-based mode; see kit list below) | "Stereo-CITE T FF V1.1 R" |
*Always = always required, Kit mode = required for kit-based mode, Optional = optional
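When launching many samples from a script, it can help to assemble the kit-based command programmatically and fail early if a required flag is missing. The sketch below uses only the flags listed in the table above. The function `build_kit_command` and its argument names are hypothetical and not part of CellBin2; the `-mi` and `-pr` values are passed through verbatim.

```python
import shlex


def build_kit_command(chip, image, stain, kit, output_dir,
                      matrix=None, extra_image=None, protein_matrix=None,
                      script="cellbin2/cellbin_pipeline.py"):
    """Return the argv list for a kit-based run (hypothetical helper)."""
    # Flags that the table marks as required in kit-based mode.
    required = {"-c": chip, "-i": image, "-s": stain, "-o": output_dir, "-k": kit}
    missing = [flag for flag, value in required.items() if not value]
    if missing:
        raise ValueError("missing required arguments: " + ", ".join(missing))

    argv = ["python", script]
    for flag, value in required.items():
        argv += [flag, value]
    # Optional flags are appended only when a value is given.
    if matrix:
        argv += ["-m", matrix]
    if extra_image:
        argv += ["-mi", extra_image]  # e.g. "IF=SN_IF.tif", passed as-is
    if protein_matrix:
        argv += ["-pr", protein_matrix]
    return argv


if __name__ == "__main__":
    cmd = build_kit_command("SN", "SN.tif", "ssDNA", "Stereo-seq T FF V1.2",
                            "test/SN", matrix="SN.raw.gef")
    # shlex.join quotes the kit name so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

Returning a list rather than a string lets the result be passed directly to `subprocess.run` without shell quoting concerns.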
KIT_VERSIONS = (
# Standard product versions
'Stereo-seq T FF V1.2',
'Stereo-seq T FF V1.3',
'Stereo-CITE T FF V1.0',
'Stereo-CITE T FF V1.1',
'Stereo-seq N FFPE V1.0',
# Research versions
'Stereo-seq T FF V1.2 R',
'Stereo-seq T FF V1.3 R',
'Stereo-CITE T FF V1.0 R',
'Stereo-CITE T FF V1.1 R',
'Stereo-seq N FFPE V1.0 R',
)
The kit controls the module switches and parameters in the JSON configuration to customize the analysis workflow.
For detailed configurations per kit, see config.md.
For more information about kit types, see the STOmics official website.
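Because the `-k` value must match one of the kit names exactly, a small pre-flight check can catch typos before a long run starts. This is a sketch under stated assumptions: the tuple is copied from the list above, `check_kit` is a hypothetical helper rather than a CellBin2 function, and treating a trailing " R" as the research marker follows the grouping comments in that list.

```python
import difflib

# Copied from the kit list above.
KIT_VERSIONS = (
    'Stereo-seq T FF V1.2',
    'Stereo-seq T FF V1.3',
    'Stereo-CITE T FF V1.0',
    'Stereo-CITE T FF V1.1',
    'Stereo-seq N FFPE V1.0',
    'Stereo-seq T FF V1.2 R',
    'Stereo-seq T FF V1.3 R',
    'Stereo-CITE T FF V1.0 R',
    'Stereo-CITE T FF V1.1 R',
    'Stereo-seq N FFPE V1.0 R',
)


def check_kit(name):
    """Validate a kit name; return True if it is a research ("R") version."""
    if name not in KIT_VERSIONS:
        # Suggest the closest known kit name, if any is reasonably similar.
        close = difflib.get_close_matches(name, KIT_VERSIONS, n=1)
        hint = f" Did you mean {close[0]!r}?" if close else ""
        raise ValueError(f"Unknown kit {name!r}.{hint}")
    return name.endswith(" R")


if __name__ == "__main__":
    print(check_kit("Stereo-CITE T FF V1.1 R"))
```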
ssDNA
CUDA_VISIBLE_DEVICES=0 python cellbin2/cellbin_pipeline.py \
-c SN \
-i SN.tif \
-s ssDNA \
-m SN.raw.gef \
-o test/SN \
-k "Stereo-seq T FF V1.2"
DAPI + IF + trans gef
CUDA_VISIBLE_DEVICES=0 python cellbin2/cellbin_pipeline.py \
-c SN \
-i SN.tif \
-s DAPI \
-mi IF=SN_IF.tif \
-m SN.raw.gef \
-o test/SN \
-k "Stereo-CITE T FF V1.1 R"
DAPI + protein gef
CUDA_VISIBLE_DEVICES=0 python cellbin2/cellbin_pipeline.py \
-c SN \
-i SN_fov_stitched.tif \
-s DAPI \
-pr IF=SN.protein.tissue.gef \
-o /test/SN \
-k "Stereo-CITE T FF V1.1 R"
DAPI + IF + trans gef + protein gef
# Flags: -c chip number; -i ssDNA, DAPI, or HE image path; -s stain type (ssDNA, DAPI, HE);
# -m transcriptomics gef path; -pr protein gef path; -o output directory
CUDA_VISIBLE_DEVICES=0 python cellbin2/cellbin_pipeline.py \
-c SN \
-i SN_DAPI_fov_stitched.tif \
-mi IF=SN_IF.tif \
-s DAPI \
-m SN.raw.gef \
-pr SN.protein.raw.gef \
-o test/SN \
-k "Stereo-CITE T FF V1.1 R"
trans gef
# Flags: -c chip number; -p personalized JSON file; -o output directory
CUDA_VISIBLE_DEVICES=0 python cellbin2/cellbin_pipeline.py \
-c SN \
-p only_matrix.json \
-o test/SN
Modify only_matrix.json to match your data before running.
ssDNA + FB + trans gef
# Flags: -c chip number; -p personalized JSON file; -o output directory
CUDA_VISIBLE_DEVICES=0 python cellbin2/cellbin_pipeline.py \
-c SN \
-p Plant.json \
-o test/SN
Modify Plant.json to match your data before running.
ssDNA + HE + trans gef
# Flags: -c chip number; -i ssDNA or DAPI image path; -s stain type (ssDNA, DAPI);
# -mi HE image path (this image has already been registered with the ssDNA/DAPI image);
# -m transcriptomics gef path; -o output directory
CUDA_VISIBLE_DEVICES=0 python cellbin2/cellbin_pipeline.py \
-c SN \
-i SN_ssDNA_fov_stitched.tif \
-mi HE=SN_HE_fov_stitched.tif \
-s ssDNA \
-m SN.raw.gef \
-o test/SN \
-k "Stereo-CITE T FF V1.1 R"
For more examples, please visit example.md.
For error information, refer to error.md.
File Name | Description |
---|---|
SN_cell_mask.tif | Final cell mask |
SN_mask.tif | Final nuclear mask |
SN_tissue_mask.tif | Final tissue mask |
SN_params.json | CellBin 2.0 input params |
SN.ipr | Image processing record |
metrics.json | CellBin 2.0 Metrics |
CellBin_0.0.1_report.html | CellBin 2.0 report |
SN.rpi | Recorded image processing (for visualization) |
SN.stereo | A JSON-formatted manifest file that records the visualization files in the result |
SN.tar.gz | tar.gz file |
SN_DAPI_mask.tif | Cell mask on registered image |
SN_DAPI_regist.tif | Registered image |
SN_DAPI_tissue_cut.tif | Tissue mask on registered image |
SN_IF_mask.tif | Cell mask on registered image |
SN_IF_regist.tif | Registered image |
SN_IF_tissue_cut.tif | Tissue mask on registered image |
SN_Transcriptomics_matrix_template.txt | Track template on gene matrix |
- Image files (*.tif): Inspect using ImageJ
- Gene expression file (generated only when matrix_extract module is enabled): Visualize with StereoMap v4.
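To check an output directory against the file table above, a short script can report which expected files are absent. This is only a sketch: `missing_outputs` and `EXPECTED_TEMPLATES` are hypothetical names, the selection of files is an assumed subset of the table, and which files are actually produced depends on the modules and inputs enabled for a run. It also assumes that `SN` in the file names stands for the chip serial number passed via `-c`.

```python
from pathlib import Path

# Assumed subset of the output table; adjust to the modules enabled in your run.
EXPECTED_TEMPLATES = (
    "{sn}_cell_mask.tif",
    "{sn}_mask.tif",
    "{sn}_tissue_mask.tif",
    "{sn}_params.json",
    "{sn}.ipr",
    "metrics.json",
)


def missing_outputs(output_dir, sn, templates=EXPECTED_TEMPLATES):
    """Return the expected file names that are not present in output_dir."""
    out = Path(output_dir)
    expected = [template.format(sn=sn) for template in templates]
    return [name for name in expected if not (out / name).is_file()]


if __name__ == "__main__":
    absent = missing_outputs("test/SN", "SN")
    if absent:
        print("Missing:", ", ".join(absent))
    else:
        print("All expected files are present.")
```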
CellBin introduction (Chinese)
https://github.com/STOmics/CellBin
https://github.com/MouseLand/cellpose
https://github.com/matejak/imreg_dft
https://github.com/rezazad68/BCDU-Net
https://github.com/libvips/pyvips
https://github.com/vanvalenlab/deepcell-tf
https://github.com/ultralytics/ultralytics
Tweets
Stereo-seq CellBin introduction (Chinese)
Stereo-seq CellBin application intro (Chinese)
Stereo-seq CellBin cell segmentation database introduction (Chinese)
CellBin: The Core Image Processing Pipeline in SAW for Generating Single-cell Gene Expression Data for Stereo-seq (English)
A Practical Guide to SAW Output Files for Stereo-seq (English)
Paper related
CellBin: a highly accurate single-cell gene expression processing pipeline for high-resolution spatial transcriptomics (GitHub Link)
Generating single-cell gene expression profiles for high-resolution spatial transcriptomics based on cell boundary images (GitHub Link)
CellBinDB: A Large-Scale Multimodal Annotated Dataset for Cell Segmentation with Benchmarking of Universal Models (GitHub Link)
Video tutorial
Cell segmentation tool selection and application (Chinese)
One-stop solution for spatial single-cell data acquisition (Chinese)
Single-cell processing framework for high resolution spatial omics (Chinese)