
Load and QC Gene Expression Matrix from Cell Ranger Output
obtain_qc_response_data.Rd
Reads a sparse gene-by-cell expression matrix from Cell Ranger output and performs
quality control checks. This function is used internally by reference_data_preprocessing_10x.
Arguments
- path_to_cellranger_output
Character. Path to a Cell Ranger run folder (e.g., "SRR12345678"). This folder must contain:
outs/filtered_feature_bc_matrix/matrix.mtx.gz
outs/filtered_feature_bc_matrix/features.tsv.gz
outs/filtered_feature_bc_matrix/barcodes.tsv.gz
Value
A sparse matrix of class CsparseMatrix (genes as rows, cells as columns) with:
Unique, non-empty gene IDs as row names
Unique, non-empty cell barcodes as column names
Duplicate genes and barcodes removed (keeping first occurrence)
Details
In some cases, the subfolder filtered_feature_bc_matrix/ may not exist yet and must first be
produced by extracting the filtered_feature_bc_matrix.tar.gz archive from the Cell Ranger output.
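If the subfolder is missing, the extraction step might look like the sketch below. The helper name extract_filtered_matrix is hypothetical and not part of the package, and the code assumes that the archive sits in the run's outs/ folder and unpacks to a top-level filtered_feature_bc_matrix/ directory; adjust the paths if your archive is laid out differently.

```r
# Hypothetical helper (not part of perturbplan): extract the archive only
# when the expected subfolder is missing.
extract_filtered_matrix <- function(path_to_cellranger_output) {
  outs_dir   <- file.path(path_to_cellranger_output, "outs")
  matrix_dir <- file.path(outs_dir, "filtered_feature_bc_matrix")
  # Assumption: the archive is stored directly under outs/
  archive    <- file.path(outs_dir, "filtered_feature_bc_matrix.tar.gz")

  if (!dir.exists(matrix_dir) && file.exists(archive)) {
    # Assumption: the archive contains a top-level
    # filtered_feature_bc_matrix/ directory
    utils::untar(archive, exdir = outs_dir)
  }

  # TRUE if the subfolder is now available, FALSE otherwise
  dir.exists(matrix_dir)
}
```

The function returns FALSE rather than raising an error when neither the subfolder nor the archive is found, so the caller can decide how to handle a missing run.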
The function:
Reads the sparse matrix in Matrix Market format
Converts to column-compressed sparse format (CsparseMatrix)
Reads gene annotations from features.tsv.gz
Removes duplicate or empty gene IDs
Reads cell barcodes from barcodes.tsv.gz
Removes duplicate or empty barcodes
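The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's actual implementation: the helper name read_qc_matrix_sketch is hypothetical, gene IDs are assumed to be in the first column of features.tsv.gz, and barcodes.tsv.gz is assumed to hold one barcode per line.

```r
library(Matrix)

# Hypothetical helper that mirrors the documented steps; the real
# obtain_qc_response_data() may differ in its details.
read_qc_matrix_sketch <- function(path_to_cellranger_output) {
  mat_dir <- file.path(path_to_cellranger_output,
                       "outs", "filtered_feature_bc_matrix")

  # Read the Matrix Market file and convert to column-compressed format
  counts <- Matrix::readMM(file.path(mat_dir, "matrix.mtx.gz"))
  counts <- as(counts, "CsparseMatrix")

  # Assumption: the first column of features.tsv.gz holds the gene IDs
  features <- utils::read.delim(file.path(mat_dir, "features.tsv.gz"),
                                header = FALSE, quote = "",
                                stringsAsFactors = FALSE)
  gene_ids <- as.character(features[[1]])
  # Keep the first occurrence of each non-empty gene ID
  keep_genes <- !is.na(gene_ids) & nzchar(gene_ids) & !duplicated(gene_ids)

  # Assumption: barcodes.tsv.gz has one barcode per line
  barcodes <- readLines(file.path(mat_dir, "barcodes.tsv.gz"))
  # Keep the first occurrence of each non-empty barcode
  keep_cells <- nzchar(barcodes) & !duplicated(barcodes)

  # Subset first, then attach the (now unique) names
  counts <- counts[keep_genes, keep_cells, drop = FALSE]
  dimnames(counts) <- list(gene_ids[keep_genes], barcodes[keep_cells])
  counts
}
```

Subsetting before assigning dimnames keeps the names aligned with the retained rows and columns, so the returned matrix never carries duplicate or empty names.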
See also
reference_data_preprocessing_10x for aggregating data from multiple Cell Ranger runs
Examples
# Load example Cell Ranger output
cellranger_path <- system.file("extdata/cellranger_tiny", package = "perturbplan")
response_matrix <- obtain_qc_response_data(cellranger_path)
# Inspect the matrix
dim(response_matrix)
#> [1] 5 8
class(response_matrix)
#> [1] "dgCMatrix"
#> attr(,"package")
#> [1] "Matrix"