Title: | Adenine base editor analysis |
---|---|
Description: | Base editors are emerging molecular sensors for protein-RNA interaction. This package implements a workflow for analysis of adenine base editor datasets. With minimal adjustment it can be used for systematic inquiry of any known single base-pair mutagenesis patterns. Part of the y3628 analysis suite. |
Authors: | Ye Yuan [aut, cre] |
Maintainer: | Ye Yuan <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.0.1 |
Built: | 2025-02-13 04:08:17 UTC |
Source: | https://github.com/yeyuan98/Rabe |
Assign variant IDs to a GRanges object based on overlap
assignToGRanges(variant_ids, targetGRanges, prune = TRUE)
variant_ids |
Character vector of variant IDs |
targetGRanges |
GenomicRanges::GRanges object |
prune |
Boolean. If TRUE, only return ranges that overlap at least one variant |
This function considers strandedness when finding overlaps: "A/G" variants are assigned only to "+" stranded ranges, and "T/C" variants only to "-" stranded ranges. The variant type itself is not validated, so you must make sure that only "A/G" and "T/C" variants are included in the input.
GenomicRanges::GRanges with an additional mcol $variant_ids. $variant_ids is an IRanges::CharacterList.
vignette("rABE-analysis")
Assign overlapping VCF variant IDs to a GRanges
assignVariantIds(range, vcf, sep = ";")
range |
GenomicRanges::GRanges object |
vcf |
VariantAnnotation::VCF object |
sep |
Separator to use if more than one variant overlaps a range |
The input range, unchanged except for an added $variant_ids column.
vignette("rABE-analysis")
Set difference of two variant tables
diffVariants(varTableX, varTableY)
varTableX |
variant table X |
varTableY |
variant table Y |
variant table of X \ Y (based on $variant_id)
vignette("rABE-analysis")
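The set difference can be pictured with plain data frames. The following is a base R sketch of the idea, not the package's implementation: `diff_by_id` and the toy tables are hypothetical, and the only thing assumed about a variant table is that it carries a `variant_id` column.

```r
# Hypothetical helper: keep rows of x whose variant_id does not occur in y.
diff_by_id <- function(x, y) {
  x[!(x$variant_id %in% y$variant_id), , drop = FALSE]
}

# Toy variant tables with made-up IDs and counts.
varTableX <- data.frame(
  variant_id = c("chr2L:1_A/G", "chr2L:5_A/G", "chr3R:9_T/C"),
  countALT = c(3, 7, 2),
  stringsAsFactors = FALSE
)
varTableY <- data.frame(
  variant_id = "chr2L:5_A/G",
  countALT = 4,
  stringsAsFactors = FALSE
)

# Rows of X whose ID is not present in Y.
x_minus_y <- diff_by_id(varTableX, varTableY)
```

Only `variant_id` decides membership here; other columns of Y are ignored, which matches the "based on $variant_id" wording above.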
Filter variants to include only Adenine base editing
filterAdenineEditVariants(assignedRange, stranded = TRUE, pruning = TRUE)
assignedRange |
GenomicRanges::GRanges with $variant_ids column.
You can assign variants with the function |
stranded |
Boolean, whether to consider strandedness. If TRUE, A/G edits are considered for '+' stranded ranges and T/C edits for '-' stranded ranges; an error is raised if any range has strand '*'. |
pruning |
Whether to prune values in $variant_ids column. If TRUE, $variant_ids will only retain adenine base editing events. If FALSE, $variant_ids is not changed. |
GenomicRanges::GRanges containing ranges with at least one adenine base editing event.
vignette("rABE-analysis")
Filter a single merged variant table
filterVariants(mergedVarTable, perc.limits, plot = TRUE)
mergedVarTable |
Single tibble::tibble of merged variants |
perc.limits |
Lower and upper limits of median mutation percentage. |
plot |
Boolean, whether to plot a simple histogram of mutation percentages |
Filtered merged variant table
vignette("rABE-analysis")
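As a rough illustration of filtering on the median mutation percentage, the base R sketch below builds a toy merged table and keeps variants whose median falls within the limits. It is not the package's code: `filter_by_median` is hypothetical, the column name `mutation.perc`, the 0 to 100 scale, and the inclusive limits are all assumptions, and no histogram is drawn.

```r
# Toy merged table: one row per variant, per-sample data nested in a list column.
merged <- data.frame(variant_id = c("v1", "v2", "v3"), stringsAsFactors = FALSE)
merged$variant_data <- list(
  data.frame(mutation.perc = c(2, 4, 6)),
  data.frame(mutation.perc = c(40, 50, 90)),
  data.frame(mutation.perc = c(99, 100, 100))
)

# Hypothetical helper: keep variants whose median percentage lies within
# limits[1] and limits[2] (both inclusive, an assumption of this sketch).
filter_by_median <- function(tbl, limits) {
  med <- vapply(
    tbl$variant_data,
    function(d) median(d$mutation.perc),
    numeric(1)
  )
  tbl[med >= limits[1] & med <= limits[2], , drop = FALSE]
}

kept <- filter_by_median(merged, limits = c(5, 95))
```

In this toy data the medians are 4, 50, and 100, so only the middle variant survives limits of 5 and 95.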
Get 'confident' variants for downstream analysis
getConfidentVariantIds(variantTableList, print.stats = TRUE, min.occurence = 2)
variantTableList |
A list of variant tables. Each variant table must have $variant_id column. |
print.stats |
Boolean, whether to print statistics. |
min.occurence |
Integer, minimum number of occurrences for a variant to be considered 'confident'. |
A character vector of 'confident' variant IDs.
vignette("rABE-analysis")
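One way to read "confident" is that a variant ID must appear in at least a minimum number of the supplied tables. The base R sketch below follows that reading; it is an assumption rather than the package's definition, and `confident_ids` and the toy tables are hypothetical.

```r
# Hypothetical helper: IDs that appear in at least `min_count` tables.
confident_ids <- function(tables, min_count = 2) {
  # Count each ID at most once per table, then tally across tables.
  ids <- unlist(lapply(tables, function(tbl) unique(tbl$variant_id)))
  counts <- table(ids)
  names(counts)[as.vector(counts) >= min_count]
}

# Three toy replicates with made-up variant IDs.
tables <- list(
  data.frame(variant_id = c("chr2L:1_A/G", "chr2L:5_A/G"), stringsAsFactors = FALSE),
  data.frame(variant_id = c("chr2L:5_A/G", "chr3R:9_T/C"), stringsAsFactors = FALSE),
  data.frame(variant_id = c("chr2L:5_A/G", "chr3R:9_T/C"), stringsAsFactors = FALSE)
)

ids_2 <- confident_ids(tables, min_count = 2)
ids_3 <- confident_ids(tables, min_count = 3)
```

Raising the threshold from 2 to 3 shrinks the result to the IDs shared by every replicate.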
Convert GRanges with assigned variants into a tibble
getNameVarTable(assignedGRanges)
assignedGRanges |
an "assigned GRanges"
(i.e., GenomicRanges::GRanges with names and |
A tibble::tibble with columns:
name, character, taken from names(assignedGRanges)
variant_ids, list(character) of assignedGRanges$variant_ids
vignette("rABE-analysis")
Get data frame of primary variants from a VCF file
getPrimaryVariantTable(vcf, variant_ids)
vcf |
VariantAnnotation::VCF object |
variant_ids |
Identifiers to subset the VCF object |
A data.frame of counts for primary variants of each ID queried.
Return detailed variant information (mutation rate, counts for REF/ALT).
getVariantTable(vcf, assignedRange)
vcf |
VariantAnnotation::VCF object |
assignedRange |
GenomicRanges::GRanges with $variant_ids column.
You can assign variants with the function |
A tibble::tibble
vignette("rABE-analysis")
Merge a list of variant tables into one table.
mergeVariants(variantTableList, variantIDsToKeep)
variantTableList |
list of variant tables. |
variantIDsToKeep |
which variant IDs to keep in the merged frame |
A single tibble::tibble of merged variant tables with the following columns:
variant_id, character vector
variant_data, list of tibble::tibble, each with columns $countREF, $countALT, $mutation.perc.
vignette("rABE-analysis")
Format summary of a numeric vector
prettyFormatSummary(nums)
nums |
A numeric vector |
A character(1) showing standard summary statistics (i.e., 0/25/50/mean/75/100 quantiles).
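The six statistics listed are the ones base R's `summary()` reports for a numeric vector, in the same order. A minimal sketch of collapsing them into one string is shown below; `format_summary` is a hypothetical stand-in, and the separators and rounding are assumptions, not the package's actual output format.

```r
# Hypothetical stand-in: collapse summary() output into a single string.
format_summary <- function(nums) {
  # summary() returns Min., 1st Qu., Median, Mean, 3rd Qu., Max.
  s <- summary(nums)
  paste(names(s), signif(as.numeric(s), 3), sep = " = ", collapse = "; ")
}

out <- format_summary(c(1, 2, 3, 4, 10))
```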
Prune a GenomicRanges containing variants to retain only A->G edits
pruneAdenineEditVariants(rangeWithVariants, stranded = TRUE)
rangeWithVariants |
GenomicRanges::GRanges with $variant_ids.
Each range can contain multiple variant_ids separated by |
stranded |
Boolean, whether to consider strand of the ranges |
GenomicRanges::GRanges with only A->G edits
Convert A/G variant ID vector into GenomicRanges::GRanges
variantID2GRanges(variant_ids)
variant_ids |
Character vector of VCF variant identifiers. Example: "chr2L:1_A/G" |
Variant type is extracted but NOT checked at all. You must make sure that only "A/G" and "T/C" are present as input.
GenomicRanges::GRanges object of the variants. All widths are one, since each record is a single-base mutation. The following mcols are defined:
type, "A/G" (strand = "+") or "T/C" (strand = "-"), reflecting the variant type.
variant_id, the original input, kept as a record.
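To make the ID layout concrete, the base R sketch below splits IDs shaped like the "chr2L:1_A/G" example into sequence name, position, type, and strand. It returns a plain data frame rather than a GRanges, and it is not the package's parser: `parse_variant_id` and the regular expression are hypothetical. Unlike the function documented here, the sketch rejects malformed IDs and marks unexpected types with strand "*", both of which are choices made for illustration.

```r
# Hypothetical parser for IDs shaped like "chr2L:1_A/G".
parse_variant_id <- function(variant_ids) {
  pattern <- "^(.+):([0-9]+)_([ACGT])/([ACGT])$"
  # Fail early on IDs that do not match the assumed layout.
  stopifnot(all(grepl(pattern, variant_ids)))
  type <- sub(pattern, "\\3/\\4", variant_ids)
  data.frame(
    seqnames = sub(pattern, "\\1", variant_ids),
    pos = as.integer(sub(pattern, "\\2", variant_ids)),
    type = type,
    # Strand mapping from the text: "A/G" -> "+", "T/C" -> "-".
    # Any other type gets "*" (an assumption of this sketch).
    strand = ifelse(type == "A/G", "+", ifelse(type == "T/C", "-", "*")),
    variant_id = variant_ids,
    stringsAsFactors = FALSE
  )
}

parsed <- parse_variant_id(c("chr2L:1_A/G", "chr3R:250_T/C"))
```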