Fits a saturation curve using preseqR to estimate the relationship between mapped reads per cell and observed UMIs per cell. Automatically selects between RFA (Rational Function Approximation) and ZTNB (Zero-Truncated Negative Binomial) methods based on the estimated shape parameter. This function is used internally by reference_data_processing.

Usage

library_estimation(QC_data, mt = 20)

Arguments

QC_data

Data frame. UMI-level molecule information from obtain_qc_read_umi_table containing columns num_reads, UMI_id, cell_id, and response_id.

mt

Integer. Number of terms to use in RFA method (default: 20).

Value

A list. The following elements are returned for both methods; the remaining elements depend on the method used:

method_used

Character. Either "RFA" or "ZTNB" indicating which method was used

reads_norm

Numeric. Normalization constant (reads per cell in the pilot data)

n_cells

Numeric. Number of cells in pilot data

UMI_per_cell_at_saturation

Numeric. Estimated maximum number of UMIs per cell at infinite sequencing depth

For RFA method (when shape <= 1):

valid_estimator

Logical. Whether RFA estimator is valid

coefs_real

Numeric vector. Real parts of RFA coefficients (if valid)

coefs_imag

Numeric vector. Imaginary parts of RFA coefficients (if valid)

poles_real

Numeric vector. Real parts of RFA poles (if valid)

poles_imag

Numeric vector. Imaginary parts of RFA poles (if valid)

constant_value

Numeric. Constant prediction value (if invalid)

For ZTNB method (when shape > 1):

L

Numeric. Total expected distinct UMIs at saturation

size

Numeric. ZTNB shape parameter

mu

Numeric. ZTNB mean parameter

Details

Method Selection

The function first fits a ZTNB model using preseqR.ztnb.em() and examines the estimated shape parameter (size):

  • If shape <= 1: Uses RFA method (ds.rSAC) for better extrapolation

  • If shape > 1: Uses ZTNB closed-form formula

RFA Method (shape <= 1)

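Uses rational function approximation: \(f(t) = \operatorname{Re}\left(\sum_i c_i \left(\frac{t}{t - p_i}\right)^r\right)\), where \(c_i\) and \(p_i\) are the complex coefficients (coefs) and poles.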
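The RFA formula above can be evaluated directly from the returned list elements. The following is a minimal sketch, not the package's own code: eval_rfa is a hypothetical helper, r = 1 is an assumption, and the coefficient and pole values are made up for illustration. It assumes the complex coefficients and poles are rebuilt from the *_real and *_imag vectors.

```r
# Hypothetical helper (not part of the package): evaluate the RFA curve.
# Assumes r = 1 and that coefs/poles are the complex numbers formed from
# the *_real and *_imag vectors in the returned list.
eval_rfa <- function(t, coefs_real, coefs_imag, poles_real, poles_imag, r = 1) {
  coefs <- complex(real = coefs_real, imaginary = coefs_imag)
  poles <- complex(real = poles_real, imaginary = poles_imag)
  # For each depth x, sum the partial-fraction terms and keep the real part
  vapply(t, function(x) Re(sum(coefs * (x / (x - poles))^r)), numeric(1))
}

# Made-up coefficients and poles (a complex-conjugate pair), illustration only
rfa_curve <- eval_rfa(
  t = c(0, 1, 2, 10),
  coefs_real = c(50, 50), coefs_imag = c(5, -5),
  poles_real = c(-1, -1), poles_imag = c(0.5, -0.5)
)
rfa_curve
```

With a real lib_params object, the same helper would take lib_params$coefs_real and the related elements, provided valid_estimator is TRUE.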

ZTNB Method (shape > 1)

Uses closed-form: \(f(t) = L \times P(X > 0 | \text{size}, \mu \times t)\)
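For a negative binomial with shape size and mean m, the probability of a zero count is (size / (size + m))^size, so the closed form above reduces to a one-line expression. The sketch below illustrates this under stated assumptions: eval_ztnb is a hypothetical helper, and the parameter values are invented rather than taken from a fitted model.

```r
# Hypothetical helper (not part of the package): evaluate the ZTNB curve
# f(t) = L * P(X > 0 | size, mu * t).
eval_ztnb <- function(t, L, size, mu) {
  # P(X = 0) for a negative binomial with shape `size` and mean `mu * t`
  p_zero <- (size / (size + mu * t))^size
  L * (1 - p_zero)
}

# Invented parameter values, for illustration only
ztnb_curve <- eval_ztnb(t = c(0, 0.5, 1, 2, 100), L = 1000, size = 2, mu = 1.5)
ztnb_curve
```

The curve starts at 0 for t = 0, increases monotonically, and approaches L as t grows, which matches the description of L as the total expected number of distinct UMIs at saturation.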

See also

obtain_qc_read_umi_table for input data preparation.

reference_data_processing for the complete preprocessing workflow.

Examples

# Get QC data and compute library parameters
cellranger_path <- system.file("extdata/cellranger_tiny", package = "perturbplan")
qc_data <- obtain_qc_read_umi_table(cellranger_path)

# Fit saturation curve using preseqR
lib_params <- library_estimation(QC_data = qc_data)

# Check which method was used
lib_params$method_used
#> [1] "ZTNB"
lib_params$UMI_per_cell_at_saturation
#> [1] 4.724635