Find Optimal Cost-Efficient Experimental Designs

Identifies cost-optimal experimental designs that achieve target statistical power within specified precision bounds. This function processes cost-power analysis results to find minimal-cost designs for each parameter level and generates detailed cost grids for design optimization.

Usage

find_optimal_cost_design(
  cost_power_df,
  minimizing_variable,
  power_target,
  power_precision,
  MOI = 10,
  num_targets = 100,
  non_targeting_gRNAs = 10,
  gRNAs_per_target = 4,
  cost_per_captured_cell = 0.086,
  cost_per_million_reads = 0.374,
  cost_grid_size = 50
)

Arguments

cost_power_df

Data frame. Output from cost_power_computation() containing power analysis results with cost calculations. Must include the columns overall_power, total_cost, cells_per_target, and sequenced_reads_per_cell, plus a column named after minimizing_variable (not required when minimizing_variable is "cost").

minimizing_variable

Character. The parameter being optimized. Must be one of:

"TPM_threshold": TPM expression threshold optimization
"minimum_fold_change": Minimum fold change threshold optimization
"cost": Total cost optimization across all experimental designs

power_target

Numeric. Target statistical power level (typically 0.8 for 80% power). Must be between 0 and 1.

power_precision

Numeric. Tolerance around the power target. Designs with power within power_target ± power_precision are considered acceptable. Must be between 0 and 1.

MOI

Numeric. Multiplicity of infection parameter for experimental design calculations (default: 10). Used to compute number of captured cells.

num_targets

Integer. Number of target genes in the experiment (default: 100). Used for cost calculations.

non_targeting_gRNAs

Integer. Number of non-targeting gRNAs in the experiment (default: 10). Used to calculate total library size and captured cell requirements.

gRNAs_per_target

Integer. Number of gRNAs per target gene (default: 4). Used to calculate total gRNAs and experimental design parameters.

cost_per_captured_cell

Numeric. Cost per captured cell in dollars (default: 0.086). Used for library preparation cost calculations.

cost_per_million_reads

Numeric. Cost per million sequencing reads in dollars (default: 0.374). Used for sequencing cost calculations.

cost_grid_size

Integer. Number of grid points for the cost optimization grid (default: 50). Higher values provide finer resolution but increase computation time.

Value

A list containing two elements:

optimal_cost_power_df: Data frame with optimal power-cost combinations, including columns from input plus minimum cost information and cost precision.
optimal_cost_grid: Data frame with nested cost grids for each parameter level, containing detailed design alternatives within cost precision bounds.

Details

This function implements a three-stage cost optimization process:

Stage 1: Power Filtering

Filters input data to designs achieving power within target ± precision
Ensures only viable designs (meeting power requirements) are considered
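As a rough illustration of Stage 1, the sketch below applies the power window to a toy data frame in base R. The data values are invented, the inclusive comparison is an assumption, and the actual implementation may differ.

```r
# Toy stand-in for cost_power_computation() output; all values are invented.
cost_power_df <- data.frame(
  TPM_threshold            = c(10, 10, 10, 50, 50, 50),
  cells_per_target         = c(200, 400, 800, 100, 200, 400),
  sequenced_reads_per_cell = c(5000, 5000, 10000, 5000, 10000, 10000),
  overall_power            = c(0.62, 0.79, 0.93, 0.785, 0.81, 0.95),
  total_cost               = c(300, 520, 1100, 180, 330, 610)
)

power_target    <- 0.8
power_precision <- 0.02

# Keep designs whose power lies within power_target +/- power_precision.
# Treating the bounds as inclusive is an assumption of this sketch.
in_window <- abs(cost_power_df$overall_power - power_target) <= power_precision
viable    <- cost_power_df[in_window, ]

print(viable)
```

In this toy input, three designs fall inside the window 0.78 to 0.82; the under-powered and over-powered designs are dropped.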

Stage 2: Cost Optimization

Groups designs by minimizing variable (e.g., TPM_threshold levels)
Identifies minimum cost for each parameter level
Computes cost precision (1% of minimum cost) for grid generation
Records parameter ranges (min/max cells and reads per cell) for each level
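A minimal base R sketch of Stage 2, assuming TPM_threshold is the minimizing variable. The helper summarise_level() and the output column names are hypothetical, chosen only for illustration.

```r
# Toy set of designs that already passed the power filter; values are invented.
viable <- data.frame(
  TPM_threshold            = c(10, 10, 50, 50),
  cells_per_target         = c(400, 500, 100, 200),
  sequenced_reads_per_cell = c(5000, 4000, 5000, 10000),
  total_cost               = c(520, 540, 180, 330)
)

# Hypothetical helper: summarise one level of the minimizing variable.
summarise_level <- function(d) {
  data.frame(
    TPM_threshold  = d$TPM_threshold[1],
    minimum_cost   = min(d$total_cost),
    cost_precision = 0.01 * min(d$total_cost),  # 1% of the minimum cost
    min_cells      = min(d$cells_per_target),
    max_cells      = max(d$cells_per_target),
    min_reads      = min(d$sequenced_reads_per_cell),
    max_reads      = max(d$sequenced_reads_per_cell)
  )
}

# Split by level, summarise each group, and stack the results.
optimal <- do.call(
  rbind,
  lapply(split(viable, viable$TPM_threshold), summarise_level)
)

print(optimal)
```

Each row of the result describes one parameter level: its cheapest viable cost, the 1% tolerance derived from it, and the ranges of cells and reads that bound the Stage 3 grid.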

Stage 3: Design Grid Generation

Creates log-spaced grids within parameter ranges for each level
Computes detailed cost components (library + sequencing costs)
Filters to designs within cost precision bounds (±1% of minimum cost)
Applies sampling to reduce redundant designs while preserving diversity
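The grid step for a single parameter level might look like the sketch below. The parameter ranges and minimum_cost are invented inputs, log_seq() is a hypothetical helper, the cost formulas follow the cost model given in this section with the default arguments, and the final sampling step is omitted because its method is not described here.

```r
# Hypothetical helper: n points evenly spaced on the log scale.
log_seq <- function(from, to, n) exp(seq(log(from), log(to), length.out = n))

cost_grid_size <- 50
minimum_cost   <- 300                   # invented Stage 2 result for one level
cost_precision <- 0.01 * minimum_cost   # 1% of the minimum cost

# Invented parameter ranges for this level.
cells_grid <- log_seq(100, 1000, cost_grid_size)
reads_grid <- log_seq(2000, 20000, cost_grid_size)
grid <- expand.grid(
  cells_per_target         = cells_grid,
  sequenced_reads_per_cell = reads_grid
)

# Cost components, using the documented defaults.
total_gRNAs             <- 4 * 100 + 10
grid$num_captured_cells <- total_gRNAs * grid$cells_per_target / (4 * 10)
grid$library_cost       <- 0.086 * grid$num_captured_cells
grid$sequencing_cost    <- 0.374 * grid$sequenced_reads_per_cell *
  grid$num_captured_cells / 1e6
grid$total_cost         <- grid$library_cost + grid$sequencing_cost

# Keep only designs within +/- 1% of the minimum cost.
near_optimal <- grid[abs(grid$total_cost - minimum_cost) <= cost_precision, ]

print(head(near_optimal))
```

Log spacing places more grid points at the low end of each range, where a fixed relative change in cells or reads corresponds to a smaller absolute step.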

Cost Model:

Total cost calculation:

Total Cost = Library Cost + Sequencing Cost

Where:

Library Cost = cost_per_captured_cell * num_captured_cells
Sequencing Cost = cost_per_million_reads * (reads_per_cell * num_captured_cells) / 1,000,000
num_captured_cells = ((gRNAs_per_target * num_targets + non_targeting_gRNAs) * cells_per_target) / (gRNAs_per_target * MOI)
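The formulas above can be checked with a small worked example. The function compute_total_cost() below is a hypothetical helper written for illustration, not part of the package; it uses the documented default values.

```r
# Hypothetical helper implementing the cost model stated above.
compute_total_cost <- function(cells_per_target, reads_per_cell,
                               MOI = 10, num_targets = 100,
                               non_targeting_gRNAs = 10, gRNAs_per_target = 4,
                               cost_per_captured_cell = 0.086,
                               cost_per_million_reads = 0.374) {
  total_gRNAs        <- gRNAs_per_target * num_targets + non_targeting_gRNAs
  num_captured_cells <- total_gRNAs * cells_per_target / (gRNAs_per_target * MOI)
  library_cost       <- cost_per_captured_cell * num_captured_cells
  sequencing_cost    <- cost_per_million_reads *
    (reads_per_cell * num_captured_cells) / 1e6
  list(
    num_captured_cells = num_captured_cells,
    library_cost       = library_cost,
    sequencing_cost    = sequencing_cost,
    total_cost         = library_cost + sequencing_cost
  )
}

# 1,000 cells per target at 20,000 reads per cell, all other values default.
cost <- compute_total_cost(cells_per_target = 1000, reads_per_cell = 20000)
str(cost)
```

With these inputs the library holds 410 gRNAs, so 10,250 cells are captured. The library cost is 0.086 * 10,250 = 881.50 and the sequencing cost is 0.374 * 205 = 76.67, for a total of 958.17 dollars.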

The function is designed to work with output from cost_power_computation() and provides fine-grained cost optimization for experimental design selection.
