Visium Segment Input Tutorial
Required files
| Filename / pattern | Required | Format | Description |
|---|---|---|---|
| spatial/tissue_hires_image.png | Yes | PNG | High-resolution image corresponding to the segmentation coordinates |
| spatial/scalefactors_json.json | Yes | JSON | Image scale factors |
| cell_segmentations.geojson | Yes | GeoJSON | Cell segmentation polygons |
| filtered_feature_bc_matrix.h5 | Yes | H5 | Main expression matrix |
| | No | H5 | Alternative compatible matrix filenames |
Where these files come from
Official download: segmented_outputs generated by the 10x Visium segmentation workflow
Experimental output: files produced by the image segmentation pipeline
Placeholder usage: you can first write data/S1 and replace it later with the actual sample directory
Leveraging the cell segmentation algorithm included in Space Ranger v4 by 10x Genomics, Spatialsnake provides a dedicated ingestion channel based on the segmentation output structure to facilitate downstream analysis. The data layout must conform to the structure shown below.
project_root/
├── data/                     (stores your raw data)
│   └── {sample_id}/
├── sample.txt                (key sample description file)
├── results/                  (stores analysis outputs)
└── <analysis_option>.yaml    (optional configuration file)

data/
└── {sample_id}/
    └── segmented_outputs/
        ├── filtered_feature_bc_matrix.h5
        ├── cell_segmentations.geojson
        └── spatial/
            ├── tissue_hires_image.png
            └── scalefactors_json.json
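A quick way to confirm that a sample directory matches this layout is a small shell check. The sketch below is illustrative only: `check_segmented_outputs` is a hypothetical helper, not part of Spatialsnake, and it only looks for the default matrix filename, so a sample that relies on an alternative compatible matrix filename would be reported as incomplete.

```shell
# Hypothetical helper: report which required files exist under
# <sample_dir>/segmented_outputs and return non-zero if any is missing.
check_segmented_outputs() {
    base="$1/segmented_outputs"
    rc=0
    for rel in \
        filtered_feature_bc_matrix.h5 \
        cell_segmentations.geojson \
        spatial/tissue_hires_image.png \
        spatial/scalefactors_json.json
    do
        if [ -f "$base/$rel" ]; then
            echo "found:   $rel"
        else
            echo "missing: $rel" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

For example, `check_segmented_outputs data/S1` run from project_root prints one line per expected file and exits with a non-zero status if anything is absent.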
Demo Dataset Walkthrough
run_type: visium_segment
In this tutorial, we use the cell segmentation output from the public CRC P2 dataset provided by 10x Genomics. These files are generated automatically by Space Ranger v4.
Dataset link: Visium Segmentation Demo (multisample_raw_data.tar.gz)
Before running Spatialsnake, create the project directory, download the archive, and extract the segmentation output into a folder named Colon_Cancer_P2 so that the final directory contains segmented_outputs directly under the sample folder.
Example setup:
mkdir -p project_root/data/Colon_Cancer_P2/segmented_outputs
cd project_root/data/Colon_Cancer_P2/segmented_outputs
curl -L -o multisample_raw_data.tar.gz https://cf.10xgenomics.com/supp/spatial-exp/analysis-workshop/multisample_raw_data.tar.gz
tar -xf multisample_raw_data.tar.gz
mkdir -p spatial
mv tissue_hires_image.png spatial/tissue_hires_image.png
mv scalefactors_json.json spatial/scalefactors_json.json
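The mv commands above assume that the image and scale-factor files end up at the top level of the extraction directory. The internal layout of the archive is not described here, so it may be worth listing its members before moving anything. The sketch below uses a hypothetical helper, `list_needed_members`, which prints only the archive paths that end in one of the four filenames this tutorial expects.

```shell
# Hypothetical helper: list archive members without extracting them,
# keeping only the four filenames used in this tutorial.
list_needed_members() {
    archive="$1"
    tar -tf "$archive" | grep -E '(tissue_hires_image\.png|scalefactors_json\.json|cell_segmentations\.geojson|filtered_feature_bc_matrix\.h5)$'
}
```

Running `list_needed_members multisample_raw_data.tar.gz` shows where each file sits inside the archive. If the paths include extra parent folders, adjust the mv commands accordingly.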
After extraction, make sure the spatial files (tissue_hires_image.png and scalefactors_json.json) are in the data/Colon_Cancer_P2/segmented_outputs/spatial folder. The sample directory should then match the layout shown below.
Example directory layout
project_root/
├── data/                     (stores your raw data)
│   └── Colon_Cancer_P2/
├── sample.txt                (key sample description file)
├── results/                  (stores analysis outputs)
└── <analysis_option>.yaml    (optional configuration file)

data/
└── Colon_Cancer_P2/
    └── segmented_outputs/
        ├── filtered_feature_bc_matrix.h5
        ├── cell_segmentations.geojson
        └── spatial/
            ├── tissue_hires_image.png
            └── scalefactors_json.json
sample.txt for single_analysis:
sample_id input_path
Colon_Cancer_P2 data/Colon_Cancer_P2
Make sure sample.txt is located in your current working directory.
spatialsnake single_analysis sample.txt visium_segment --option=integrate
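Before launching the command above, a short pre-flight check can catch path mistakes in sample.txt. The sketch below is an illustration, not a Spatialsnake feature: `check_sample_sheet` is a hypothetical helper, and it assumes that sample.txt has one header line followed by whitespace-separated sample_id and input_path columns, as in the example shown earlier.

```shell
# Hypothetical helper: verify that every input_path listed in the sample
# sheet contains a segmented_outputs/ directory.
# Assumption: one header line, then whitespace-separated columns.
check_sample_sheet() {
    sheet="${1:-sample.txt}"
    rc=0
    # Skip the header line, then read the first two columns of each row.
    while read -r sample_id input_path rest; do
        [ -n "$sample_id" ] || continue
        if [ -d "$input_path/segmented_outputs" ]; then
            echo "ok: $sample_id -> $input_path"
        else
            echo "missing segmented_outputs: $sample_id -> $input_path" >&2
            rc=1
        fi
    done <<EOF
$(tail -n +2 "$sheet")
EOF
    return "$rc"
}
```

Run `check_sample_sheet sample.txt` from the directory that holds sample.txt, since the relative input paths are resolved against the current working directory.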
Result file structure
results/
└── Colon_Cancer_P2/
    └── integrate/
        ├── Colon_Cancer_P2.zarr        # zarr-formatted data
        ├── total.png                   # histogram of total expression
        ├── total_umi_by_sample.png     # histogram of total UMI counts by sample
        ├── total_genes_by_sample.png   # histogram of detected genes by sample
        ├── genes_by_sample.png         # histogram of mitochondrial signal by sample
        └── scatter.png                 # scatter plot of total expression versus gene counts
Main output: results/<sample>/integrate/<sample>.zarr
Additional output for comparison analysis: results/merge_data/integrate/concatenated_sdata
Additional QC plots: the ingestion script writes five QC figures into the integrate directory. These files are generated during execution even though they are not explicitly listed one by one in the Snakemake output declaration.
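Because the QC figures are not individually declared as outputs, it can be useful to confirm by hand that they were written. The following sketch uses a hypothetical helper, `summarize_integrate_outputs`, which reports whether the zarr store exists and counts the PNG files directly inside the integrate directory. It relies on `find -maxdepth`, which GNU and BSD find both support.

```shell
# Hypothetical helper: summarize the integrate outputs for one sample.
# $1 = results directory, $2 = sample name.
summarize_integrate_outputs() {
    out_dir="$1/$2/integrate"
    # The zarr store may be a directory, so test for existence rather than a regular file.
    if [ -e "$out_dir/$2.zarr" ]; then
        echo "zarr: present"
    else
        echo "zarr: missing"
    fi
    # Count PNG files directly inside the integrate directory.
    n_png=$(find "$out_dir" -maxdepth 1 -type f -name '*.png' 2>/dev/null | wc -l | tr -d ' ')
    echo "png: $n_png"
}
```

For the demo dataset, `summarize_integrate_outputs results Colon_Cancer_P2` should report the zarr store as present and, given the five QC figures described above, a PNG count of 5.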
You have now ingested your data into a zarr object. For the subsequent core analysis, please refer to Core Analysis Workflow. We recommend starting with the example dataset to gain hands-on experience with the basic core-analysis workflow. If you prefer to proceed directly with your own data, each step page begins with a concise summary of the essential parameters.
Simply follow the tutorial to update the sample name and platform-specific parameters, then continue with the next step: Preprocessing.
If you want to run multi-sample integration analysis, continue to Spatialsnake for multi-sample integration.