Skip to contents

C++ implementation of the preseqR-based saturation curve that relates sequencing reads to unique UMI counts. Supports both ZTNB and RFA methods.

This function computes the effective library size (in UMIs) from sequencing read depth using fitted saturation curves from preseqR that account for PCR amplification bias and UMI saturation. This is an R wrapper for the C++ implementation.

Usage

fit_read_UMI_curve_cpp(reads_per_cell, rSAC_fn_wrapper)

fit_read_UMI_curve_cpp(reads_per_cell, rSAC_fn_wrapper)

Arguments

reads_per_cell

Numeric vector. Total reads per cell.

rSAC_fn_wrapper

List. Parameter structure from library_estimation containing method_used, reads_norm, n_cells, and method-specific parameters.

Value

Numeric vector. Effective library size in UMIs for each read depth.

Numeric vector. Effective library size in UMIs.

Details

This C++ implementation provides significant performance improvements over the R version for large-scale power analysis computations. Supports two methods:

  • ZTNB: Uses L * P(X > 0 | size, mu * t)

  • RFA: Uses rational function approximation with complex arithmetic

The saturation-magnitude (S-M) curve model relates sequencing reads to unique UMI counts. This function supports two methods from preseqR:

  • ZTNB: Zero-truncated negative binomial closed-form formula

  • RFA: Rational function approximation for better extrapolation

The rSAC_fn_wrapper parameter must be obtained from library_estimation.

See also

identify_library_size_range for R wrapper that uses this function

library_estimation for fitting S-M curve parameters

Examples

# Get QC data and estimate library parameters
cellranger_path <- system.file("extdata/cellranger_tiny", package = "perturbplan")
qc_data <- obtain_qc_read_umi_table(cellranger_path)
rSAC_params <- library_estimation(QC_data = qc_data)

# Define read depths to test
read_depths <- c(10000, 25000, 50000, 100000)

# Calculate effective library sizes
effective_umis <- fit_read_UMI_curve_cpp(read_depths, rSAC_params)

# View the results
data.frame(
  reads_per_cell = read_depths,
  effective_UMI = effective_umis,
  saturation_pct = round(100 * effective_umis / rSAC_params$UMI_per_cell_at_saturation, 1)
)
#>   reads_per_cell effective_UMI saturation_pct
#> 1          10000      4.724635            100
#> 2          25000      4.724635            100
#> 3          50000      4.724635            100
#> 4         100000      4.724635            100