Input Arguments

This page outlines the required inputs and optional parameters for running Statescope.

Required Datasets

The signature matrix defines the gene expression profiles of different cell types.

There are two ways to specify the signature matrix:

The first is to use one of the pre-processed signatures that Statescope provides for various tumor types.
To use these signatures, specify the TumorType and the number of cell types (Ncelltypes).
Available options for TumorType and Ncelltypes are listed on the Processed Signatures page.

Example using pre-processed signatures:

Statescope_model = Initialize_Statescope(Bulk, TumorType='PBMC', Ncelltypes=7, Ncores=40)

The second is to provide your own custom single-cell data in .h5ad format.
The cell type annotations must be stored in .obs under the column named by celltype_key.

Example using a custom signature matrix:

Statescope_model = Initialize_Statescope(
    Bulk, 
    Signature=Signature, 
    celltype_key='leiden', 
    Ncores=40
)

⚠️ Note:

Single-cell data should be preprocessed (filtering, QC, normalization).
Statescope handles internal normalization and preprocessing automatically.
Ensure the cell type annotations exist under the key celltype_key in .obs.
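
The last point of the note can be checked before initialization. The sketch below is not part of Statescope: check_celltype_key is a hypothetical helper, and a small pandas DataFrame stands in for the .obs table of the loaded single-cell object (in AnnData, .obs is itself a pandas DataFrame).

```python
import pandas as pd


def check_celltype_key(obs: pd.DataFrame, celltype_key: str) -> pd.Series:
    """Return the number of cells per annotated cell type.

    Hypothetical helper, not part of Statescope. Raises KeyError if the
    annotation column is missing and ValueError if any cell is unlabeled.
    """
    if celltype_key not in obs.columns:
        raise KeyError(
            f"'{celltype_key}' not found in .obs; "
            f"available columns: {list(obs.columns)}"
        )
    labels = obs[celltype_key]
    n_missing = int(labels.isna().sum())
    if n_missing > 0:
        raise ValueError(f"{n_missing} cells have no label in '{celltype_key}'")
    return labels.value_counts()


# Toy stand-in for the annotation table (with real data: obs = Signature.obs)
obs = pd.DataFrame(
    {"leiden": ["T cell", "T cell", "B cell", "Monocyte"]},
    index=["cell1", "cell2", "cell3", "cell4"],
)
counts = check_celltype_key(obs, "leiden")
print(counts)
```

Printing the per-type counts also makes it easy to spot cell types that are represented by very few cells.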

Example of loading bulk expression data in Python:

import pandas as pd
Bulk = pd.read_csv("bulk_expression.csv", index_col=0)
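
After loading, a few generic sanity checks can catch common input problems early. This is a minimal sketch rather than anything Statescope requires: the inline CSV text, the gene and sample names, and the genes-as-rows layout are all assumptions made so that the example runs without an external file.

```python
import io

import pandas as pd

# Inline stand-in for bulk_expression.csv (hypothetical values;
# genes as rows and samples as columns is an assumption).
csv_text = """gene,sample1,sample2
GENE_A,10,0
GENE_B,250,310
GENE_C,5,8
"""
Bulk = pd.read_csv(io.StringIO(csv_text), index_col=0)

problems = []
if Bulk.index.has_duplicates:
    problems.append("duplicate row labels")
if not all(pd.api.types.is_numeric_dtype(t) for t in Bulk.dtypes):
    problems.append("non-numeric columns")
elif (Bulk < 0).any().any():
    # Negative values usually mean the data were centered or log-transformed.
    problems.append("negative values")
if Bulk.isna().any().any():
    problems.append("missing values")

print(Bulk.shape, problems)
```

An empty problems list does not guarantee the data are suitable; it only rules out the issues tested here.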

Example of loading expected cell fractions in Python:

import pandas as pd
expected_fractions = pd.read_csv("expected_cell_fractions.csv", index_col=0)

Bulk RNA-seq data should be in linear scale (not log-transformed).
Signature matrices should be in log scale.
Single-cell .h5ad files should contain filtered, quality-controlled cells with cell type annotations.
pandas DataFrames are recommended for structured inputs.
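
The scale requirements above can be spot-checked with a rough heuristic. The sketch below assumes that a matrix whose largest value stays under a cutoff of 50 is probably log-transformed; looks_log_transformed and the cutoff are hypothetical, not part of Statescope, and the log2(x + 1) transform is used only to generate toy values, since the log base expected by Statescope is not stated here.

```python
import numpy as np
import pandas as pd


def looks_log_transformed(expr: pd.DataFrame, max_cutoff: float = 50.0) -> bool:
    """Heuristic guess: True if the largest value is below max_cutoff.

    Hypothetical helper. Linear-scale counts or TPM values typically reach
    the hundreds or thousands, while log-scale values rarely exceed ~20.
    """
    values = expr.to_numpy(dtype=float)
    return bool(np.nanmax(values) < max_cutoff)


# Toy matrices (hypothetical values; genes as rows is an assumption)
linear_bulk = pd.DataFrame(
    {"sample1": [0.0, 150.0, 3200.0], "sample2": [12.0, 98.0, 2750.0]},
    index=["GENE_A", "GENE_B", "GENE_C"],
)
log_signature = np.log2(linear_bulk + 1)

print(looks_log_transformed(linear_bulk))    # False: values reach the thousands
print(looks_log_transformed(log_signature))  # True: maximum is about 11.6
```

The heuristic can misfire on very shallow data, so it is best treated as a prompt to inspect the input rather than a definitive test.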