alakazam - The alakazam package
alakazam is a member of the Change-O suite of tools and serves five main purposes:
- Providing core functionality for other R packages in the Change-O suite. This includes common tasks such as file I/O, basic DNA sequence manipulation, and interacting with V(D)J segment and gene annotations.
- Providing an R interface for interacting with the output of the pRESTO tool suite.
- Performing lineage reconstruction on clonal populations of immunoglobulin (Ig) sequences.
- Performing clonal abundance and diversity analysis on lymphocyte repertoires.
- Performing physicochemical property analyses of lymphocyte receptor sequences.
For additional details regarding the use of the
alakazam package see the
- maskSeqEnds: Mask ragged ends.
- maskSeqGaps: Mask gap characters.
- collapseDuplicates: Remove duplicate sequences.
- makeChangeoClone: Clean sequences for lineage reconstruction.
- buildPhylipLineage: Perform lineage reconstruction of Ig sequences.
Lineage topology analysis
- tableEdges: Tabulate annotation relationships over edges.
- testEdges: Significance testing of annotation edges.
- testMRCA: Significance testing of MRCA annotations.
- summarizeSubtrees: Various summary statistics for subtrees.
- plotSubtrees: Plot distributions of summary statistics for a population of trees.
- countClones: Calculate clonal abundance.
- estimateAbundance: Infer complete clonal abundance distribution with confidence intervals.
- rarefyDiversity: Generate clonal diversity curves.
- testDiversity: Test significance of clonal diversity scores.
- plotAbundanceCurve: Plot clone size distribution as a rank-abundance curve.
- plotDiversityCurve: Plot clonal diversity curves.
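To make the notion of clonal abundance in the list above concrete, here is a minimal base R sketch that counts sequences per clone and converts the counts to relative frequencies. The toy data frame and the column names `sequence_id`, `clone_id`, `seq_count`, and `seq_freq` are assumptions chosen for illustration. This is not the `countClones` implementation, and the package function may behave differently.

```r
# Toy repertoire: one row per sequence, with a hypothetical clone label column.
repertoire <- data.frame(
  sequence_id = paste0("seq", 1:6),
  clone_id = c("A", "A", "A", "B", "B", "C"),
  stringsAsFactors = FALSE
)

# Count sequences per clone, sorted from the largest clone to the smallest.
counts <- sort(table(repertoire$clone_id), decreasing = TRUE)

# Convert counts to relative frequencies within the repertoire.
abundance <- data.frame(
  clone_id = names(counts),
  seq_count = as.integer(counts),
  seq_freq = as.numeric(counts) / sum(counts),
  stringsAsFactors = FALSE
)

print(abundance)
```

A rank-abundance curve is then a plot of `seq_freq` against the row index of `abundance`, since the rows are already ordered by clone size.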
Ig and TCR sequence annotation
- countGenes: Calculate Ig and TCR allele, gene and family usage.
- extractVRegion: Extract CDR and FWR sub-sequences.
- getAllele: Get V(D)J allele names.
- getGene: Get V(D)J gene names.
- getFamily: Get V(D)J family names.
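The difference between allele, gene, and family names can be sketched with a few regular expressions in base R. The example assumes IMGT-style calls of the form `GENE*ALLELE`, with comma-separated alternatives for ambiguous assignments. The input strings and variable names are illustrative assumptions, and the patterns are deliberately simpler than whatever `getAllele`, `getGene`, and `getFamily` use internally.

```r
# Hypothetical V(D)J calls; the second entry is an ambiguous, comma-separated call.
calls <- c("IGHV1-69*01", "IGHV3-23*04,IGHV3-23D*01", "TRBV20-1*01")

# Allele: keep only the first assignment of each call.
allele <- sub(",.*$", "", calls)

# Gene: drop the allele suffix that follows the asterisk.
gene <- sub("\\*.*$", "", allele)

# Family: drop everything from the first hyphen onward.
family <- sub("-.*$", "", gene)

print(data.frame(call = calls, allele, gene, family, stringsAsFactors = FALSE))
```

Real annotation strings can carry extra text, such as species prefixes or functionality flags, that this sketch does not handle.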
Sequence distance calculation
- seqDist: Calculate Hamming distance between two sequences.
- seqEqual: Test two sequences for equivalence.
- pairwiseDist: Calculate a matrix of pairwise Hamming distances for a set of sequences.
- pairwiseEqual: Calculate a logical matrix of pairwise equivalence for a set of sequences.
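For readers unfamiliar with Hamming distance, the following base R sketch counts mismatched positions between two equal-length sequences and builds a pairwise matrix from that count. The helper name `hamming` and the example sequences are assumptions for illustration. Every mismatch is scored as 1 here, so ambiguous characters such as `N` and gap characters get no special treatment, which may differ from how `seqDist` and `pairwiseDist` score them.

```r
# Count the positions at which two equal-length strings differ.
hamming <- function(a, b) {
  stopifnot(nchar(a) == nchar(b))
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

# Hypothetical named sequences of equal length.
seqs <- c(A = "ATGGC", B = "ATGGG", C = "ATGGG", D = "ATTGC")

# Pairwise distance matrix; rows and columns inherit the sequence names.
dist_matrix <- sapply(seqs, function(a) sapply(seqs, function(b) hamming(a, b)))

# Pairwise equivalence follows directly from a distance of zero.
equal_matrix <- dist_matrix == 0

print(dist_matrix)
```

Deriving the logical equivalence matrix from the distance matrix keeps the two results consistent by construction.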
Amino acid properties
- translateDNA: Translate DNA sequences to amino acid sequences.
- aminoAcidProperties: Calculate various physicochemical properties of amino acid sequences.
- countPatterns: Count patterns in sequences.
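Translation from DNA to amino acids can be illustrated with a lookup table built from the standard genetic code. The sketch below uses base R only; the names `codon_table` and `translate_sketch` are hypothetical, and the choice to emit `X` for any codon outside the table (for example one containing `N` or a gap) is an assumption of this example, not a statement about how `translateDNA` behaves.

```r
# Build the 64-codon standard genetic code table in T, C, A, G order.
bases <- c("T", "C", "A", "G")
grid <- expand.grid(third = bases, second = bases, first = bases,
                    stringsAsFactors = FALSE)
codons <- paste0(grid$first, grid$second, grid$third)
amino_acids <- strsplit(
  "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG", ""
)[[1]]
codon_table <- amino_acids
names(codon_table) <- codons

# Translate one DNA string in reading frame 1; trailing partial codons are dropped.
translate_sketch <- function(dna) {
  dna <- toupper(dna)
  n_codons <- nchar(dna) %/% 3
  if (n_codons == 0) return("")
  starts <- seq(1, by = 3, length.out = n_codons)
  triplets <- substring(dna, starts, starts + 2)
  residues <- unname(codon_table[triplets])
  residues[is.na(residues)] <- "X"  # assumption: unknown codons become X
  paste(residues, collapse = "")
}

print(translate_sketch("ATGGCCTGA"))
```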
General data manipulation
- translateStrings: Perform multiple string replacement operations.
- Vander Heiden JA, Yaari G, et al. pRESTO: a toolkit for processing high-throughput sequencing raw reads of lymphocyte receptor repertoires. Bioinformatics. 2014 30(13):1930-2.
- Stern JNH, Yaari G, Vander Heiden JA, et al. B cells populating the multiple sclerosis brain mature in the draining cervical lymph nodes. Sci Transl Med. 2014 6(248):248ra107.
- Wu Y-CB, et al. Influence of seasonal exposure to grass pollen on local and peripheral blood IgE repertoires in patients with allergic rhinitis. J Allergy Clin Immunol. 2014 134(3):604-12.
- Gupta NT, Vander Heiden JA, et al. Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data. Bioinformatics. 2015 31(20):3356-8.