
K562 Ray TAP-seq reference data for CRISPR power analysis
Pre-computed pilot data from K562 chronic myelogenous leukemia cells, generated using TAP-seq (targeted Perturb-seq) with 10x Chromium technology. Contains baseline gene expression parameters and library size information for power analysis of CRISPR-based perturbation experiments. TAP-seq profiles a focused gene panel by targeted sequencing, a cost-effective design for experiments that target specific pathways or gene sets.
Format
A list with 3 elements:
- baseline_expression_stats
Data frame with gene expression data (303 genes × 3 columns):
  - response_id: Character vector of Ensembl gene IDs
  - relative_expression: Numeric vector of relative expression levels (TPM/1e6 scale)
  - expression_size: Numeric vector of dispersion parameters (theta)
- library_parameters
List containing:
  - UMI_per_cell: Maximum UMI per cell parameter (2,377)
  - variation: Variation parameter for PCR bias (0.809)
- mapping_efficiency
Numeric. Mapping efficiency value (0.349)
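The structure described above can be checked with a few assertions. The sketch below is illustrative only: `check_pilot()` is a hypothetical helper, not part of the package, and `pilot` is a hand-built stand-in that copies the first four genes and the scalar parameters from the `str()` output in the Examples section rather than loading the real 303-gene object.

```r
# Hypothetical stand-in with the documented layout (first four genes only).
pilot <- list(
  baseline_expression_stats = data.frame(
    response_id = c("ENSG00000235169", "ENSG00000130764",
                    "ENSG00000116198", "ENSG00000159023"),
    relative_expression = c(0.003732, 0.000337, 0.000121, 0.001622),
    expression_size = c(2.55, 3.92, 18.58, 5.21),
    stringsAsFactors = FALSE
  ),
  library_parameters = list(UMI_per_cell = 2377, variation = 0.809),
  mapping_efficiency = 0.349
)

# Hypothetical helper: stops with an error if the layout differs from the
# format documented on this page, otherwise returns TRUE invisibly.
check_pilot <- function(x) {
  stats <- x$baseline_expression_stats
  stopifnot(
    is.list(x),
    identical(names(x), c("baseline_expression_stats",
                          "library_parameters",
                          "mapping_efficiency")),
    is.data.frame(stats),
    all(c("response_id", "relative_expression", "expression_size") %in%
          names(stats)),
    is.character(stats$response_id),
    is.numeric(stats$relative_expression),
    all(stats$relative_expression >= 0 & stats$relative_expression <= 1),
    is.numeric(stats$expression_size),
    all(stats$expression_size > 0),
    identical(names(x$library_parameters), c("UMI_per_cell", "variation")),
    is.numeric(x$mapping_efficiency),
    x$mapping_efficiency >= 0 && x$mapping_efficiency <= 1
  )
  invisible(TRUE)
}

check_pilot(pilot)
```

With the package installed, the same check could presumably be run on the real object after `data(K562_Ray)`.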
Source
Paper: An unbiased survey of distal element-gene regulatory interactions with direct-capture targeted Perturb-seq
Author and Year: Ray et al., 2025
Journal: (Publication details pending)
Accession: GSE303901
PMID: 41000760
Details
This dataset was generated using DC TAP-seq (Direct-Capture Targeted Perturb-seq), an enhanced version of targeted perturb-seq that integrates CRISPR-based perturbations with direct-capture single-cell RNA sequencing. By capturing guide RNAs alongside targeted gene transcripts within the same sequencing reaction, DC TAP-seq enables high-throughput, unbiased mapping of distal regulatory element–gene interactions with improved sensitivity and reduced technical noise. This approach allows simultaneous measurement of perturbation identity and gene expression in thousands of single cells, facilitating large-scale functional dissection of noncoding regions at single-cell resolution.
Cells Used in Relative Expression Estimate: All cells in the high-MOI condition
See also
get_pilot_data_from_package for accessing this data programmatically
Examples
data(K562_Ray)
str(K562_Ray)
#> List of 3
#> $ baseline_expression_stats:'data.frame': 303 obs. of 3 variables:
#> ..$ response_id : chr [1:303] "ENSG00000235169" "ENSG00000130764" "ENSG00000116198" "ENSG00000159023" ...
#> ..$ relative_expression: num [1:303] 0.003732 0.000337 0.000121 0.001622 0.00034 ...
#> ..$ expression_size : num [1:303] 2.55 3.92 18.58 5.21 1.73 ...
#> $ library_parameters :List of 2
#> ..$ UMI_per_cell: num 2377
#> ..$ variation : num 0.809
#> $ mapping_efficiency : num 0.349
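To give a feel for how these parameters might feed a power calculation, here is a minimal sketch using base R only. It rests on two assumptions that this page does not confirm: that the expected UMI count for a gene at saturating depth is `relative_expression * UMI_per_cell`, and that `expression_size` acts as the negative binomial `size` (theta) parameter. The values are the first four genes from the output above; the package's own power functions may combine the parameters differently.

```r
# Values copied from the str() output above (first four genes only).
rel_expr <- c(0.003732, 0.000337, 0.000121, 0.001622)
theta <- c(2.55, 3.92, 18.58, 5.21)
umi_per_cell <- 2377

# Assumption: expected UMIs per cell for each gene at saturating depth.
mu <- rel_expr * umi_per_cell

# Negative binomial variance under the size/mu parameterization:
# Var = mu + mu^2 / theta, so a smaller theta means more overdispersion.
nb_var <- mu + mu^2 / theta

# Simulate 1000 cells per gene to compare empirical and expected means.
set.seed(1)
counts <- sapply(seq_along(mu), function(i) {
  rnbinom(n = 1000, size = theta[i], mu = mu[i])
})

data.frame(expected_mean = mu,
           expected_var = nb_var,
           simulated_mean = colMeans(counts))
```

Genes with low expected counts and small theta are the ones where a perturbation effect is hardest to detect, which is why both columns matter for power.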