
Compute power analysis for experimental planning (underspecified design)
compute_power_plan.Rd
This function performs power analysis during the experimental planning phase from aggregate design parameters (cells per target, sequenced reads per cell), without requiring cell counts to be assigned to individual gRNAs or perturbation-gene pairs. Each design parameter can also be supplied as a set of values, so that several candidate designs are evaluated and compared in a single call. This is useful before data collection, when you want to explore how cell numbers and sequencing depth affect overall statistical power.
Usage
compute_power_plan(
  TPM_threshold,
  minimum_fold_change,
  cells_per_target,
  sequenced_reads_per_cell,
  MOI = 10,
  num_targets = 100,
  non_targeting_gRNAs = 10,
  gRNAs_per_target = 4,
  gRNA_variability = 0.13,
  control_group = "complement",
  side = "left",
  multiple_testing_alpha = 0.05,
  prop_non_null = 0.1,
  baseline_expression_stats,
  library_parameters,
  grid_size = 10,
  min_power_threshold = 0.01,
  max_power_threshold = 0.8,
  mapping_efficiency = 0.72
)
Arguments
- TPM_threshold
Numeric, numeric vector, or character. TPM threshold value, custom sequence, or "varying" for auto-selection.
- minimum_fold_change
Numeric, numeric vector, or character. Minimum fold change value, custom sequence, or "varying" for auto-selection. Pairs with effects at least this large are considered non-null.
- cells_per_target
Numeric, numeric vector, or character. Number of cells per target, custom sequence, or "varying" for auto-generated grid.
- sequenced_reads_per_cell
Numeric, numeric vector, or character. Sequenced reads per cell (raw sequencer output), custom sequence, or "varying" for auto-generated grid.
- MOI
Numeric. Multiplicity of infection (default: 10).
- num_targets
Integer. Number of targets (default: 100).
- non_targeting_gRNAs
Integer. Number of non-targeting gRNAs (default: 10).
- gRNAs_per_target
Integer. Number of gRNAs per target (default: 4).
- gRNA_variability
Numeric. Standard deviation for gRNA effect variation (default: 0.13).
- control_group
String. Control group type (default: "complement").
- side
String. Test sidedness (default: "left").
- multiple_testing_alpha
Numeric. FDR level (default: 0.05).
- prop_non_null
Numeric. Proportion of non-null hypotheses, i.e., the fraction of tested pairs expected to exhibit an effect at least as large as the specified minimum_fold_change (default: 0.1).
- baseline_expression_stats
Data frame. Baseline expression statistics. See reference_data_processing for data format requirements.
- library_parameters
List. Library parameters with UMI_per_cell and variation. See reference_data_processing for parameter specifications.
- grid_size
Integer. Grid size for each dimension (default: 10).
- min_power_threshold
Numeric. Minimum power threshold (default: 0.01).
- max_power_threshold
Numeric. Maximum power threshold to achieve (default: 0.8).
- mapping_efficiency
Numeric. Mapping efficiency for raw reads to usable reads (default: 0.72). See reference_data_processing for typical values.
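To make the role of mapping_efficiency concrete, the sketch below converts raw sequenced reads per cell into usable reads per cell. It assumes a simple multiplicative relationship, which is an illustration of the argument description rather than a statement of the package's internal computation; the name usable_reads_per_cell is hypothetical.

```r
# Raw sequencer output per cell (values from the example on this page)
sequenced_reads_per_cell <- c(10000, 25000, 50000)

# Default mapping efficiency documented above
mapping_efficiency <- 0.72

# Assumption for illustration: usable reads scale linearly with efficiency
usable_reads_per_cell <- sequenced_reads_per_cell * mapping_efficiency

print(usable_reads_per_cell)  # 7200, 18000, 36000
```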
Details
This function provides comprehensive power analysis by:
- Expanding parameter combinations (TPM thresholds, fold changes)
- Creating fold change expression data for each combination
- Running compute_power_plan_per_grid() for each parameter set
- Combining results into a flat data frame for analysis
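The expansion step can be pictured with base R's expand.grid(). This is only a sketch of the idea of a full factorial design grid, not the package's internal code, and design_grid is a hypothetical name. With three values for each of four parameters it yields 81 combinations, which matches the row count reported in the example on this page.

```r
# Candidate values for each design parameter (same as the example below)
TPM_threshold            <- c(5, 10, 15)
minimum_fold_change      <- c(0.7, 0.8, 0.9)
cells_per_target         <- c(50, 100, 200)
sequenced_reads_per_cell <- c(10000, 25000, 50000)

# Full factorial grid: one row per combination of the four parameters
design_grid <- expand.grid(
  minimum_fold_change      = minimum_fold_change,
  TPM_threshold            = TPM_threshold,
  cells_per_target         = cells_per_target,
  sequenced_reads_per_cell = sequenced_reads_per_cell,
  KEEP.OUT.ATTRS = FALSE
)

nrow(design_grid)  # 3 * 3 * 3 * 3 = 81 combinations
```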
Examples
# Define parameter ranges for comprehensive analysis
TPM_threshold <- c(5, 10, 15)
minimum_fold_change <- c(0.7, 0.8, 0.9)
cells_per_target <- c(50, 100, 200)
sequenced_reads_per_cell <- c(10000, 25000, 50000)
# Get pilot data
pilot_data <- get_pilot_data_from_package("K562")
# Run comprehensive power analysis
full_results <- compute_power_plan(
  TPM_threshold = TPM_threshold,
  minimum_fold_change = minimum_fold_change,
  cells_per_target = cells_per_target,
  sequenced_reads_per_cell = sequenced_reads_per_cell,
  baseline_expression_stats = pilot_data$baseline_expression_stats,
  library_parameters = pilot_data$library_parameters,
  MOI = 10,
  num_targets = 100,
  side = "left"
)
# Examine results
dim(full_results)
#> [1] 81 7
head(full_results)
#> # A tibble: 6 × 7
#>   minimum_fold_change TPM_threshold cells_per_target num_captured_cells
#>                 <dbl>         <dbl>            <dbl>              <dbl>
#> 1                 0.7             5               50               512.
#> 2                 0.7             5              100              1025
#> 3                 0.7             5              200              2050
#> 4                 0.7             5               50               512.
#> 5                 0.7             5              100              1025
#> 6                 0.7             5              200              2050
#> # ℹ 3 more variables: sequenced_reads_per_cell <dbl>, library_size <dbl>,
#> #   overall_power <dbl>
summary(full_results$overall_power)
#>      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
#> 0.0000087 0.0035579 0.0244641 0.0968134 0.1208887 0.7185824
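
The num_captured_cells values printed above (512.5, 1025, 2050) are consistent with a simple relationship between cells_per_target, MOI, and the library composition. The sketch below reproduces them under that assumed relationship; the formula is inferred from the printed output and is not taken from the package source, and total_gRNAs is a hypothetical helper name.

```r
# Design inputs used in the example call (defaults where not specified)
cells_per_target    <- c(50, 100, 200)
MOI                 <- 10
num_targets         <- 100
gRNAs_per_target    <- 4
non_targeting_gRNAs <- 10

# Total distinct gRNAs in the library: targeting plus non-targeting
total_gRNAs <- num_targets * gRNAs_per_target + non_targeting_gRNAs

# Assumed relationship (inferred from the output above): each cell carries
# MOI gRNAs on average, and each target is covered by gRNAs_per_target gRNAs
num_captured_cells <- cells_per_target * total_gRNAs / (gRNAs_per_target * MOI)

print(num_captured_cells)  # 512.5, 1025, 2050
```

If this relationship holds, raising MOI lowers the number of cells that must be captured for a given cells_per_target, which is one of the trade-offs the planning grid lets you explore.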