Visium Segment Input Tutorial

Required files

| Filename / pattern | Required | Format | Description |
| --- | --- | --- | --- |
| segmented_outputs/spatial/tissue_hires_image.png | Yes | PNG | High-resolution image corresponding to the segmentation coordinates |
| segmented_outputs/spatial/scalefactors_json.json | Yes | JSON | Image scale factors |
| segmented_outputs/cell_segmentations.geojson | Yes | GeoJSON | Cell segmentation polygons |
| segmented_outputs/filtered_feature_bc_matrix.h5 or segmented_outputs/raw_feature_bc_matrix.h5 | Yes | H5 | Main expression matrix |
| segmented_outputs/cell_feature_matrix.h5 / segmented_outputs/filtered_feature_cell_matrix.h5 / segmented_outputs/raw_feature_cell_matrix.h5 | No | H5 | Alternative compatible matrix filenames |

Where these files come from

  • Official download: segmented_outputs generated by the 10x Visium segmentation workflow

  • Experimental output: files produced by the image segmentation pipeline

  • Placeholder usage: you can first write data/S1 and replace it later with the actual sample directory

Spatialsnake provides a dedicated ingestion path for the output of the cell segmentation algorithm included in Space Ranger v4 (10x Genomics), so that segmented data can go straight into downstream analysis. Your data must follow the layout shown below.

project_root/
├── data/ (stores your raw data)
│   └── {sample_id}/
├── sample.txt (key sample description file)
├── results/ (stores analysis outputs)
└── <analysis_option>.yaml (optional configuration file)

data/
└── {sample_id}/
    └── segmented_outputs/
        ├── filtered_feature_bc_matrix.h5
        ├── cell_segmentations.geojson
        └── spatial/
            ├── tissue_hires_image.png
            └── scalefactors_json.json
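Before starting a run, it can be useful to confirm that a sample folder actually follows this layout. The following is a minimal sketch, not part of Spatialsnake: check_segment_inputs is a hypothetical helper name, and it only checks the files that the Required files section marks as required (the optional alternative matrix filenames are not considered).

```shell
# Sketch: verify that a sample directory contains the required segmentation inputs.
# check_segment_inputs is a hypothetical helper, not a Spatialsnake command.
check_segment_inputs() {
    local sample_dir="$1"
    local seg="$sample_dir/segmented_outputs"
    local missing=0
    local rel

    # Files that are required under fixed names.
    for rel in \
        spatial/tissue_hires_image.png \
        spatial/scalefactors_json.json \
        cell_segmentations.geojson
    do
        if [ ! -f "$seg/$rel" ]; then
            echo "missing: $seg/$rel" >&2
            missing=1
        fi
    done

    # The main expression matrix may be either the filtered or the raw file.
    if [ ! -f "$seg/filtered_feature_bc_matrix.h5" ] &&
       [ ! -f "$seg/raw_feature_bc_matrix.h5" ]; then
        echo "missing: $seg/filtered_feature_bc_matrix.h5 (or raw_feature_bc_matrix.h5)" >&2
        missing=1
    fi

    # Exit status 0 means every required input was found.
    return "$missing"
}
```

For example, `check_segment_inputs data/Colon_Cancer_P2 && echo "layout OK"` prints the confirmation only when nothing is missing; otherwise each missing path is reported on stderr.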

Demo Dataset Walkthrough

run_type: visium_segment

In this tutorial, we use the cell segmentation output from the public CRC P2 dataset provided by 10x Genomics. These files are generated automatically by Space Ranger v4.

Dataset link: Visium Segmentation Demo (multisample_raw_data.tar.gz)

Before running Spatialsnake, create the project directory, download the archive, and extract the segmentation output into a folder named Colon_Cancer_P2 so that the final directory contains segmented_outputs directly under the sample folder.

Example setup:

mkdir -p project_root/data/Colon_Cancer_P2/segmented_outputs
cd project_root/data/Colon_Cancer_P2/segmented_outputs

curl -L -o multisample_raw_data.tar.gz https://cf.10xgenomics.com/supp/spatial-exp/analysis-workshop/multisample_raw_data.tar.gz
tar -xf multisample_raw_data.tar.gz
mkdir -p spatial
mv tissue_hires_image.png spatial/tissue_hires_image.png
mv scalefactors_json.json spatial/scalefactors_json.json

After extraction, the sample directory should match the layout shown below, with the spatial files moved into the data/Colon_Cancer_P2/segmented_outputs/spatial folder.
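The mv commands in the example setup assume that the archive extracts its files directly into the current directory. If they end up somewhere else, listing the archive first shows where each file actually lives. The following is a hedged sketch: find_in_archive is a hypothetical helper name, and it relies only on the standard `tar -tf` listing.

```shell
# Sketch: print every archive member whose file name matches the given name.
# find_in_archive is a hypothetical helper, not part of Spatialsnake.
find_in_archive() {
    local archive="$1"
    local name="$2"

    # tar -tf lists member paths without extracting anything.
    # awk keeps the entries whose last path component equals the requested name;
    # directory entries end in "/" and therefore never match.
    tar -tf "$archive" | awk -v n="$name" '{
        count = split($0, parts, "/")
        if (parts[count] == n) print
    }'
}
```

For example, `find_in_archive multisample_raw_data.tar.gz tissue_hires_image.png` prints the path of the image inside the archive, which you can then use as the source of the mv command.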

Example directory layout

project_root/
├── data/ (stores your raw data)
│   └── Colon_Cancer_P2/
├── sample.txt (key sample description file)
├── results/ (stores analysis outputs)
└── <analysis_option>.yaml (optional configuration file)

data/
└── Colon_Cancer_P2/
    └── segmented_outputs/
        ├── filtered_feature_bc_matrix.h5
        ├── cell_segmentations.geojson
        └── spatial/
            ├── tissue_hires_image.png
            └── scalefactors_json.json

sample.txt for single_analysis:

sample_id input_path
Colon_Cancer_P2 data/Colon_Cancer_P2
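With several samples, the sample sheet can be generated from the folders under data/ instead of being typed by hand. This is a minimal sketch under stated assumptions: write_sample_sheet is a hypothetical helper, and the column separator defaults to a single space because that is how the example above appears. If your Spatialsnake version expects tab-separated columns, pass a tab as the third argument.

```shell
# Sketch: build sample.txt from the sample folders found under data/.
# write_sample_sheet is a hypothetical helper, not part of Spatialsnake.
write_sample_sheet() {
    local data_dir="${1:-data}"
    local out_file="${2:-sample.txt}"
    local sep="${3:- }"   # assumption: single space, as in the example sheet
    local dir

    # Header row with the two column names used in this tutorial.
    printf 'sample_id%sinput_path\n' "$sep" > "$out_file"

    for dir in "$data_dir"/*/; do
        [ -d "$dir" ] || continue      # no sample folders: the glob stays literal
        dir="${dir%/}"                 # drop the trailing slash
        # The folder name is the sample_id; the relative path is the input_path.
        printf '%s%s%s\n' "${dir##*/}" "$sep" "$dir" >> "$out_file"
    done
}
```

Run it from project_root, for example `write_sample_sheet data sample.txt`, so that the input_path values stay relative to the working directory.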

Make sure sample.txt is located in your current working directory.

spatialsnake single_analysis sample.txt visium_segment --option=integrate

Result file structure

results/
└── Colon_Cancer_P2/
    └── integrate/
        ├── Colon_Cancer_P2.zarr # zarr-formatted data
        ├── total.png # histogram of total expression
        ├── total_umi_by_sample.png # histogram of total UMI counts by sample
        ├── total_genes_by_sample.png # histogram of detected genes by sample
        ├── genes_by_sample.png # histogram of mitochondrial signal by sample
        └── scatter.png # scatter plot of total expression versus gene counts

  • Main output: results/<sample>/integrate/<sample>.zarr

  • Additional output for comparison analysis: results/merge_data/integrate/concatenated_sdata

  • Additional QC plots: the ingestion script writes five QC figures into the integrate directory. They are generated during execution even though the Snakemake output declaration does not list them individually.
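Because the QC figures are not declared as Snakemake outputs, a finished run does not by itself guarantee that they were written. The following sketch checks for them explicitly. check_integrate_outputs is a hypothetical helper, and the file names are taken from the result tree above.

```shell
# Sketch: report which expected ingestion outputs are missing for one sample.
# check_integrate_outputs is a hypothetical helper, not part of Spatialsnake.
check_integrate_outputs() {
    local sample_id="$1"
    local results_dir="${2:-results}"
    local out="$results_dir/$sample_id/integrate"
    local missing=0
    local name

    # -e accepts either a file or a directory, so the check does not
    # depend on how the zarr store is laid out on disk.
    for name in \
        "$sample_id.zarr" \
        total.png \
        total_umi_by_sample.png \
        total_genes_by_sample.png \
        genes_by_sample.png \
        scatter.png
    do
        if [ ! -e "$out/$name" ]; then
            echo "missing: $out/$name" >&2
            missing=1
        fi
    done

    # Exit status 0 means the zarr output and all five figures exist.
    return "$missing"
}
```

For example, `check_integrate_outputs Colon_Cancer_P2` run from project_root returns a non-zero status and names each missing file on stderr if the ingestion step did not complete.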

You have now ingested your data into a zarr object. For the subsequent core analysis, please refer to Core Analysis Workflow. We recommend starting with the example dataset to gain hands-on experience with the basic core-analysis workflow. If you prefer to proceed directly with your own data, each step page begins with a concise summary of the essential parameters. Simply follow the tutorial to update the sample name and platform-specific parameters, then continue with the next step: Preprocessing. If you want to run multi-sample integration analysis, continue to Spatialsnake for multi-sample integration.