Reads a sparse gene-by-cell expression matrix from Cell Ranger output and performs quality control checks. This function is used internally by reference_data_preprocessing_10x.

Usage

obtain_qc_response_data(path_to_cellranger_output)

Arguments

path_to_cellranger_output

Character. Path to a Cell Ranger run folder (e.g., "SRR12345678"). This folder must contain:

  • outs/filtered_feature_bc_matrix/matrix.mtx.gz

  • outs/filtered_feature_bc_matrix/features.tsv.gz

  • outs/filtered_feature_bc_matrix/barcodes.tsv.gz
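Before calling the function, it can help to confirm that the run folder has the layout listed above. The following is a minimal sketch: `missing_cellranger_files()` is a hypothetical helper written for illustration and is not part of perturbplan, and the temporary folder only stands in for a real Cell Ranger run.

```r
# Hypothetical helper (not part of perturbplan): return the required files
# that are missing under a Cell Ranger run folder.
missing_cellranger_files <- function(path_to_cellranger_output) {
  required <- file.path(
    "outs", "filtered_feature_bc_matrix",
    c("matrix.mtx.gz", "features.tsv.gz", "barcodes.tsv.gz")
  )
  full_paths <- file.path(path_to_cellranger_output, required)
  required[!file.exists(full_paths)]
}

# Demonstration on a temporary folder that contains only one of the three files
run_dir <- file.path(tempdir(), "SRR_demo")
mat_dir <- file.path(run_dir, "outs", "filtered_feature_bc_matrix")
dir.create(mat_dir, recursive = TRUE, showWarnings = FALSE)
file.create(file.path(mat_dir, "barcodes.tsv.gz"))

missing_files <- missing_cellranger_files(run_dir)
print(missing_files)
```

An empty result means all three files are present; any names returned identify what still needs to be produced or extracted.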

Value

A sparse matrix of class CsparseMatrix (genes as rows, cells as columns) with:

  • Unique, non-empty gene IDs as row names

  • Unique, non-empty cell barcodes as column names

  • Duplicate genes and barcodes removed (keeping first occurrence)

Details

In some cases, the subfolder filtered_feature_bc_matrix/ must first be created by extracting the filtered_feature_bc_matrix.tar.gz archive from the Cell Ranger output.
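The extraction can be done from R with utils::untar(). The sketch below builds a small stand-in archive in a temporary folder so that it runs on its own. It assumes the archive sits in outs/ and contains a top-level filtered_feature_bc_matrix/ directory; both are assumptions, so check the contents of a real archive before relying on them.

```r
# Build a stand-in archive so the example is self-contained.
# All paths below are illustrative, not taken from a real Cell Ranger run.
demo_root <- file.path(tempdir(), "untar_demo")
src_dir <- file.path(demo_root, "src", "filtered_feature_bc_matrix")
dir.create(src_dir, recursive = TRUE, showWarnings = FALSE)
writeLines("AAAC-1", file.path(src_dir, "barcodes.tsv"))

outs_dir <- file.path(demo_root, "SRR_demo", "outs")
dir.create(outs_dir, recursive = TRUE, showWarnings = FALSE)
archive <- file.path(outs_dir, "filtered_feature_bc_matrix.tar.gz")

# Create the archive relative to src/ so it holds the folder itself
old_wd <- setwd(file.path(demo_root, "src"))
utils::tar(archive, files = "filtered_feature_bc_matrix", compression = "gzip")
setwd(old_wd)

# The step of interest: extract the archive next to itself, inside outs/
utils::untar(archive, exdir = outs_dir)

extracted <- file.exists(
  file.path(outs_dir, "filtered_feature_bc_matrix", "barcodes.tsv")
)
print(extracted)
```

With a real run, only the utils::untar() call is needed, pointed at the actual archive and its parent folder.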

The function:

  1. Reads the sparse matrix in Matrix Market format

  2. Converts to column-compressed sparse format (CsparseMatrix)

  3. Reads gene annotations from features.tsv.gz

  4. Removes rows with duplicate or empty gene IDs (keeping the first occurrence)

  5. Reads cell barcodes from barcodes.tsv.gz

  6. Removes columns with duplicate or empty barcodes (keeping the first occurrence)
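The steps above can be sketched with the Matrix package as follows. This is an illustration under stated assumptions, not the package's actual implementation: the tiny input files are made up, they are written uncompressed for simplicity (real Cell Ranger files are gzipped and could be opened through gzfile()), and the gene ID is assumed to be the first column of the features file.

```r
library(Matrix)

# Made-up stand-in for filtered_feature_bc_matrix/ (uncompressed for simplicity)
mat_dir <- file.path(tempdir(), "qc_demo")
dir.create(mat_dir, showWarnings = FALSE)
counts <- sparseMatrix(i = c(1, 2, 3, 4), j = c(1, 2, 3, 3),
                       x = c(5, 1, 2, 7), dims = c(4, 3))
writeMM(counts, file.path(mat_dir, "matrix.mtx"))
writeLines(c("ENSG1\tA\tGene Expression", "ENSG2\tB\tGene Expression",
             "ENSG1\tA\tGene Expression", "\tC\tGene Expression"),
           file.path(mat_dir, "features.tsv"))
writeLines(c("AAAC-1", "AAAG-1", "AAAC-1"), file.path(mat_dir, "barcodes.tsv"))

# Steps 1-2: read the Matrix Market file and convert to column-compressed format
mat <- as(readMM(file.path(mat_dir, "matrix.mtx")), "CsparseMatrix")

# Steps 3-4: read gene IDs (assumed first column); flag empty or repeated IDs
features <- read.delim(file.path(mat_dir, "features.tsv"),
                       header = FALSE, stringsAsFactors = FALSE)
gene_ids <- features[[1]]
keep_genes <- !is.na(gene_ids) & nzchar(gene_ids) & !duplicated(gene_ids)

# Steps 5-6: read barcodes; flag empty or repeated barcodes
barcodes <- readLines(file.path(mat_dir, "barcodes.tsv"))
keep_cells <- nzchar(barcodes) & !duplicated(barcodes)

# Keep the first occurrence of each gene and barcode, then attach names
mat <- mat[keep_genes, keep_cells, drop = FALSE]
rownames(mat) <- gene_ids[keep_genes]
colnames(mat) <- barcodes[keep_cells]
print(dim(mat))
```

Because duplicated() marks only the second and later occurrences, the first row or column for each ID is the one that survives, matching the "keeping first occurrence" behavior described under Value.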

See also

reference_data_preprocessing_10x for aggregating data from multiple Cell Ranger runs

Examples

# Load example Cell Ranger output
cellranger_path <- system.file("extdata/cellranger_tiny", package = "perturbplan")
response_matrix <- obtain_qc_response_data(cellranger_path)

# Inspect the matrix
dim(response_matrix)
#> [1] 5 8
class(response_matrix)
#> [1] "dgCMatrix"
#> attr(,"package")
#> [1] "Matrix"