
Fit Saturation-Magnitude (S-M) Curve Between Reads and UMIs Using preseqR
library_estimation.Rd
Fits a saturation curve using preseqR to estimate the relationship
between mapped reads per cell and observed UMIs per cell. Automatically selects
between RFA (Rational Function Approximation) and ZTNB (Zero-Truncated Negative Binomial)
methods based on the estimated shape parameter. This function is used internally by
reference_data_processing.
Arguments
- QC_data
Data frame. UMI-level molecule information from
obtain_qc_read_umi_table containing columns num_reads, UMI_id, cell_id, and response_id.
- mt
Integer. Number of terms to use in RFA method (default: 20).
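The sketch below shows one way UMI-level data of this shape could be summarized into the two-column count histogram (j, n_j) that preseqR-style estimators work from, where n_j is the number of UMIs observed with exactly j reads. The toy data frame and its values are hypothetical, and this is an illustration under those assumptions rather than the function's confirmed internals.

```r
# Hypothetical toy input with the columns described above (values are made up)
qc_toy <- data.frame(
  num_reads   = c(1, 1, 2, 3, 1, 2),
  UMI_id      = paste0("umi", 1:6),
  cell_id     = c("c1", "c1", "c1", "c2", "c2", "c2"),
  response_id = c("g1", "g2", "g1", "g1", "g2", "g3")
)

# Count how many UMIs were seen with exactly j reads
freq <- table(qc_toy$num_reads)

# Two-column histogram: j = reads per UMI, n_j = number of UMIs with j reads
n <- cbind(j = as.integer(names(freq)), n_j = as.integer(freq))
print(n)
```

The total number of reads is recovered as sum(j * n_j) and the number of observed UMIs as sum(n_j), which makes a quick sanity check on the summary.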
Value
A list with method-specific parameters:
- method_used
Character. Either "RFA" or "ZTNB" indicating which method was used
- reads_norm
Numeric. Normalization constant (reads per cell at pilot data)
- n_cells
Numeric. Number of cells in pilot data
- UMI_per_cell_at_saturation
Numeric. Maximum UMI per cell at infinite sequencing depth
For RFA method (when shape <= 1):
- valid_estimator
Logical. Whether RFA estimator is valid
- coefs_real
Numeric vector. Real parts of RFA coefficients (if valid)
- coefs_imag
Numeric vector. Imaginary parts of RFA coefficients (if valid)
- poles_real
Numeric vector. Real parts of RFA poles (if valid)
- poles_imag
Numeric vector. Imaginary parts of RFA poles (if valid)
- constant_value
Numeric. Constant prediction value (if invalid)
For ZTNB method (when shape > 1):
- L
Numeric. Total expected distinct UMIs at saturation
- size
Numeric. ZTNB shape parameter
- mu
Numeric. ZTNB mean parameter
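As an illustration of how the ZTNB parameters could be turned into a saturation curve, the sketch below uses the standard negative binomial zero probability: at a relative depth t, a molecule with NB(size, t * mu) reads is observed with probability 1 - (1 + t * mu / size)^(-size). The helper name ztnb_umis_per_cell is hypothetical, and it assumes that size and mu describe the untruncated negative binomial, that t is reads per cell divided by reads_norm, and that L is a total across n_cells cells. The package's own formula may differ.

```r
# Hypothetical helper: expected UMIs per cell at a given sequencing depth,
# under the assumptions stated above (not the package's confirmed formula)
ztnb_umis_per_cell <- function(reads_per_cell, L, size, mu, reads_norm, n_cells) {
  t <- reads_per_cell / reads_norm              # depth relative to the pilot data
  p_seen <- 1 - (1 + t * mu / size)^(-size)     # P(at least one read) under NB
  L * p_seen / n_cells
}

# Made-up parameter values, for illustration only
depths <- c(50, 100, 1000, 1e6)
curve_vals <- ztnb_umis_per_cell(depths, L = 900, size = 2, mu = 1,
                                 reads_norm = 100, n_cells = 100)
print(curve_vals)
```

Under these assumptions the curve increases with depth and approaches L / n_cells, which corresponds to the role of UMI_per_cell_at_saturation.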
Details
Method Selection
The function first fits a ZTNB model using preseqR.ztnb.em() and examines the
shape parameter:
- If shape <= 1: Uses RFA method (ds.rSAC) for better extrapolation
- If shape > 1: Uses ZTNB closed-form formula
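The selection rule above can be written as a small helper. The function name choose_saturation_method is hypothetical, and the sketch takes an already estimated shape value as input instead of calling preseqR, so it only illustrates the threshold logic.

```r
# Hypothetical helper mirroring the documented rule:
# shape <= 1 selects RFA, shape > 1 selects ZTNB
choose_saturation_method <- function(shape) {
  stopifnot(is.numeric(shape), length(shape) == 1, !is.na(shape))
  if (shape <= 1) "RFA" else "ZTNB"
}

# Made-up shape values, for illustration only
methods <- vapply(c(0.4, 1, 2.5), choose_saturation_method, character(1))
print(methods)
```

Note that the boundary value shape = 1 falls on the RFA side of the rule.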
See also
obtain_qc_read_umi_table for input data preparation.
reference_data_processing for the complete preprocessing workflow.
Examples
# Get QC data and compute library parameters
cellranger_path <- system.file("extdata/cellranger_tiny", package = "perturbplan")
qc_data <- obtain_qc_read_umi_table(cellranger_path)
# Fit saturation curve using preseqR
lib_params <- library_estimation(QC_data = qc_data)
# Check which method was used
lib_params$method_used
#> [1] "ZTNB"
lib_params$UMI_per_cell_at_saturation
#> [1] 4.724635