
Compute effective library size from read depth using preseqR saturation curve (C++)
fit_read_UMI_curve_cpp.RdC++ implementation of the preseqR-based saturation curve that relates sequencing reads to unique UMI counts. Supports both ZTNB and RFA methods.
This function computes the effective library size (in UMIs) from sequencing read depth using fitted saturation curves from preseqR that account for PCR amplification bias and UMI saturation. This is an R wrapper for the C++ implementation.
Usage
fit_read_UMI_curve_cpp(reads_per_cell, rSAC_fn_wrapper)
fit_read_UMI_curve_cpp(reads_per_cell, rSAC_fn_wrapper)Arguments
- reads_per_cell
Numeric vector. Total reads per cell.
- rSAC_fn_wrapper
List. Parameter structure from
library_estimationcontaining method_used, reads_norm, n_cells, and method-specific parameters.
Value
Numeric vector. Effective library size in UMIs for each read depth.
Numeric vector. Effective library size in UMIs.
Details
This C++ implementation provides significant performance improvements over the R version for large-scale power analysis computations. Supports two methods:
ZTNB: Uses L * P(X > 0 | size, mu * t)
RFA: Uses rational function approximation with complex arithmetic
The saturation-magnitude (S-M) curve model relates sequencing reads to unique UMI counts. This function supports two methods from preseqR:
ZTNB: Zero-truncated negative binomial closed-form formula
RFA: Rational function approximation for better extrapolation
The rSAC_fn_wrapper parameter must be obtained from library_estimation.
See also
identify_library_size_range for R wrapper that uses this function
library_estimation for fitting S-M curve parameters
Examples
# Get QC data and estimate library parameters
cellranger_path <- system.file("extdata/cellranger_tiny", package = "perturbplan")
qc_data <- obtain_qc_read_umi_table(cellranger_path)
rSAC_params <- library_estimation(QC_data = qc_data)
# Define read depths to test
read_depths <- c(10000, 25000, 50000, 100000)
# Calculate effective library sizes
effective_umis <- fit_read_UMI_curve_cpp(read_depths, rSAC_params)
# View the results
data.frame(
reads_per_cell = read_depths,
effective_UMI = effective_umis,
saturation_pct = round(100 * effective_umis / rSAC_params$UMI_per_cell_at_saturation, 1)
)
#> reads_per_cell effective_UMI saturation_pct
#> 1 10000 4.724635 100
#> 2 25000 4.724635 100
#> 3 50000 4.724635 100
#> 4 100000 4.724635 100