Computes the naive mapping efficiency as the proportion of total reads that map to the transcriptome. This function is used internally by reference_data_preprocessing_10x.

Usage

obtain_mapping_efficiency(QC_data, path_to_cellranger_output)

Arguments

QC_data

Data frame. Output of obtain_qc_read_umi_table containing a num_reads column with read counts per UMI.

path_to_cellranger_output

Character. Path to the Cell Ranger run folder, which must contain outs/metrics_summary.csv with a "Number of Reads" column.

Value

Numeric value between 0 and 1 representing the proportion of total reads that successfully mapped to the transcriptome.

Details

The function calculates:

$$\text{mapping_efficiency} = \frac{\text{mapped_reads}}{\text{total_reads}}$$

where:

  • mapped_reads = sum of num_reads from QC_data

  • total_reads = "Number of Reads" from metrics_summary.csv
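As a rough illustration of the ratio above, the following sketch computes it by hand on toy values. The data frame contents and the total read count are invented for the example; only the num_reads column name comes from this page, and the code is not the package's internal implementation.

```r
# Hypothetical toy data: read counts per UMI, mimicking the num_reads
# column of the QC_data argument (values are made up for illustration)
qc_data <- data.frame(num_reads = c(3, 1, 4, 2))

# Hypothetical total read count, standing in for the "Number of Reads"
# value that would come from metrics_summary.csv
total_reads <- 20

# mapped_reads is the sum of num_reads: 3 + 1 + 4 + 2 = 10
mapped_reads <- sum(qc_data$num_reads)

# Naive mapping efficiency: mapped reads divided by total reads
mapping_efficiency <- mapped_reads / total_reads
print(mapping_efficiency)
#> [1] 0.5
```

With these toy numbers, half of the sequenced reads are accounted for by the UMI table, so the naive efficiency is 0.5.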

Important Notes

  • The metrics_summary.csv file must contain a column named "Number of Reads"

  • This column may need to be added or edited manually when Cell Ranger is run with multiple libraries or samples

  • The function removes commas from the "Number of Reads" field before conversion

  • The returned value is a "naive" estimate; reference_data_processing adjusts it when a gene list is specified
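The comma handling mentioned in the notes above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the function's actual code: the temporary file, its contents, and the use of read.csv and gsub are all choices made for the example.

```r
# Write a hypothetical one-row metrics summary to a temporary file.
# The values are invented; fields are quoted because they contain commas.
tmp <- tempfile(fileext = ".csv")
writeLines(
  c('"Estimated Number of Cells","Number of Reads"',
    '"1,000","1,234,567"'),
  tmp
)

# check.names = FALSE keeps the header "Number of Reads" exactly as written
# instead of converting it to a syntactic name like Number.of.Reads
metrics <- read.csv(tmp, check.names = FALSE, stringsAsFactors = FALSE)

# The field is read as the string "1,234,567"; strip the thousands
# separators before converting to a number
total_reads <- as.numeric(gsub(",", "", metrics[["Number of Reads"]]))
print(total_reads)
#> [1] 1234567
```

If the column is missing, metrics[["Number of Reads"]] returns NULL in this sketch, which is one reason the column name has to match exactly.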

See also

reference_data_preprocessing_10x for the complete aggregation workflow

Examples

# Get mapping efficiency from Cell Ranger output
cellranger_path <- system.file("extdata/cellranger_tiny", package = "perturbplan")
qc_data <- obtain_qc_read_umi_table(cellranger_path)
mapping_eff <- obtain_mapping_efficiency(qc_data, cellranger_path)

# View result
print(mapping_eff)
#> [1] 1