The alakazam package¶
Description¶
alakazam is a member of the Immcantation framework of tools and serves five main purposes:
- Providing core functionality for other R packages in Immcantation. This includes common tasks such as file I/O, basic DNA sequence manipulation, and interacting with V(D)J segment and gene annotations.
- Providing an R interface for interacting with the output of the pRESTO and Change-O tool suites.
- Performing clonal abundance and diversity analysis on lymphocyte repertoires.
- Performing lineage reconstruction on clonal populations of immunoglobulin (Ig) sequences.
- Performing physicochemical property analyses of lymphocyte receptor sequences.
For additional details regarding the use of the alakazam package, see the vignettes:
browseVignettes("alakazam")
File I/O¶
- readChangeoDb: Input Change-O style files.
- writeChangeoDb: Output Change-O style files.
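As a rough illustration of the kind of round trip these two functions handle, the base R sketch below writes and re-reads a small tab-delimited table with a header row. It assumes the tab-delimited layout of Change-O style files; the column names and records are hypothetical, and the package functions may apply additional type handling that this sketch omits.

```r
# Hypothetical example records; the column names are illustrative only.
db <- data.frame(
    sequence_id = c("seq1", "seq2"),
    sequence = c("ATGGCC", "ATGGCA"),
    stringsAsFactors = FALSE
)

# Write a tab-delimited file with a header row and no row names.
path <- tempfile(fileext = ".tsv")
write.table(db, path, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back; read.delim defaults to tab separation and a header row.
dbIn <- read.delim(path, stringsAsFactors = FALSE)
```

With the package loaded, readChangeoDb and writeChangeoDb would take the place of the base R calls.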
Sequence cleaning¶
- maskSeqEnds: Mask ragged ends.
- maskSeqGaps: Mask gap characters.
- collapseDuplicates: Remove duplicate sequences.
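The idea behind gap masking and duplicate removal can be sketched in base R as follows. The helper maskGaps is hypothetical and is not the package's implementation: it assumes gaps are written as "-" or "." and are masked with "N", and it removes only exact duplicates, whereas the package functions may handle ambiguous characters and annotations more carefully. Masking of ragged ends is not covered here.

```r
# Hypothetical helper: replace gap characters ("-" and ".") with a mask character.
maskGaps <- function(seqs, maskChar = "N") {
    gsub("[-.]", maskChar, seqs)
}

# Illustrative sequences; the first two become identical once gaps are masked.
seqs <- c("ATG-..C", "ATGNNNC", "ATGCCCC")

# Mask gaps, then keep one copy of each exact duplicate.
cleaned <- unique(maskGaps(seqs))
```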
Lineage reconstruction¶
- makeChangeoClone: Clean sequences for lineage reconstruction.
- buildPhylipLineage: Perform lineage reconstruction of Ig sequences.
Lineage topology analysis¶
- tableEdges: Tabulate annotation relationships over edges.
- testEdges: Significance testing of annotation edges.
- testMRCA: Significance testing of MRCA annotations.
- summarizeSubtrees: Various summary statistics for subtrees.
- plotSubtrees: Plot distributions of summary statistics for a population of trees.
Diversity analysis¶
- countClones: Calculate clonal abundance.
- estimateAbundance: Bootstrap clonal abundance curves.
- alphaDiversity: Generate clonal alpha diversity curves.
- plotAbundanceCurve: Plot clone size distribution as a rank-abundance curve.
- plotDiversityCurve: Plot clonal diversity curves.
- plotDiversityTest: Plot diversity test results at a given Hill diversity index.
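For orientation, the Hill diversity index of order q can be computed from clone counts as in the base R sketch below. This is the general textbook formula, (sum of p_i^q)^(1/(1-q)) with the q = 1 case taken as the exponential of Shannon entropy; it is not the package's bootstrapped estimator. The helper hillDiversity and the clone identifiers are hypothetical.

```r
# Hypothetical helper: Hill diversity of order q from a vector of clone counts.
hillDiversity <- function(counts, q) {
    # Convert counts to relative abundances, ignoring empty clones.
    p <- counts[counts > 0] / sum(counts)
    if (isTRUE(all.equal(q, 1))) {
        # Limit as q approaches 1: exponential of Shannon entropy.
        exp(-sum(p * log(p)))
    } else {
        sum(p^q)^(1 / (1 - q))
    }
}

# Illustrative clone assignments, one entry per sequence.
cloneIds <- c("c1", "c1", "c1", "c2", "c2", "c3")

# Clonal abundance: number of sequences per clone.
counts <- as.vector(table(cloneIds))

# Order 0 is richness (number of clones); higher orders weight large clones more.
richness <- hillDiversity(counts, 0)
simpson <- hillDiversity(counts, 2)
```

For a perfectly even repertoire the index equals the number of clones at every order, which makes a convenient sanity check.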
Ig and TCR sequence annotation¶
- countGenes: Calculate Ig and TCR allele, gene and family usage.
- extractVRegion: Extract CDR and FWR sub-sequences.
- getAllele: Get V(D)J allele names.
- getGene: Get V(D)J gene names.
- getFamily: Get V(D)J family names.
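The relationship between allele, gene, and family names can be illustrated with simple pattern matching in base R. The sketch assumes IMGT-style names such as "IGHV1-69*01", where the allele suffix follows "*" and the family is the part before the first "-". The helpers parseGene and parseFamily are hypothetical and much simpler than the package functions, which may also handle multiple or ambiguous assignments.

```r
# Hypothetical helper: drop the allele suffix (everything from "*" onward).
parseGene <- function(allele) {
    sub("\\*.*$", "", allele)
}

# Hypothetical helper: drop the gene number (everything from the first "-" onward).
parseFamily <- function(allele) {
    sub("-.*$", "", parseGene(allele))
}

allele <- "IGHV1-69*01"
gene <- parseGene(allele)      # "IGHV1-69"
family <- parseFamily(allele)  # "IGHV1"
```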
Sequence distance calculation¶
- seqDist: Calculate Hamming distance between two sequences.
- seqEqual: Test two sequences for equivalence.
- pairwiseDist: Calculate a matrix of pairwise Hamming distances for a set of sequences.
- pairwiseEqual: Calculate a logical matrix of pairwise equivalence for a set of sequences.
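As a reminder of what is being measured, the Hamming distance between two equal-length sequences is the number of positions at which they differ. The base R sketch below uses hypothetical helpers, hammingDist and pairwiseHamming, and counts every mismatch equally; the package functions may score ambiguous characters such as N or gaps differently.

```r
# Hypothetical helper: count mismatched positions between two equal-length sequences.
hammingDist <- function(seq1, seq2) {
    stopifnot(nchar(seq1) == nchar(seq2))
    chars1 <- strsplit(seq1, "")[[1]]
    chars2 <- strsplit(seq2, "")[[1]]
    sum(chars1 != chars2)
}

# Hypothetical helper: symmetric matrix of distances for a set of sequences.
pairwiseHamming <- function(seqs) {
    n <- length(seqs)
    mat <- matrix(0, nrow = n, ncol = n, dimnames = list(seqs, seqs))
    for (i in seq_len(n)) {
        for (j in seq_len(n)) {
            mat[i, j] <- hammingDist(seqs[i], seqs[j])
        }
    }
    mat
}

# Illustrative sequences.
distMat <- pairwiseHamming(c("AAA", "AAT", "TTT"))
```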
Amino acid properties¶
- translateDNA: Translate DNA sequences to amino acid sequences.
- aminoAcidProperties: Calculate various physicochemical properties of amino acid sequences.
- countPatterns: Count patterns in sequences.
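Translation itself is a codon lookup, which the base R sketch below illustrates using the standard genetic code. The helper translateSketch is hypothetical and is not the package's implementation: it assumes the sequence starts in frame, drops any trailing partial codon, and reports codons it cannot resolve (for example those containing N) as "X".

```r
# Standard genetic code, with codons ordered by bases T, C, A, G at each position.
bases <- c("T", "C", "A", "G")
grid <- expand.grid(b3 = bases, b2 = bases, b1 = bases, stringsAsFactors = FALSE)
codonTable <- strsplit(
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG", ""
)[[1]]
names(codonTable) <- paste0(grid$b1, grid$b2, grid$b3)

# Hypothetical helper: translate an in-frame DNA string to amino acids.
translateSketch <- function(dna) {
    dna <- toupper(dna)
    n <- nchar(dna) %/% 3
    if (n == 0) {
        return("")
    }
    # Split into consecutive codons, ignoring a trailing partial codon.
    starts <- seq(1, by = 3, length.out = n)
    codons <- substring(dna, starts, starts + 2)
    aa <- codonTable[codons]
    # Codons missing from the table (e.g. containing N) become "X".
    aa[is.na(aa)] <- "X"
    paste(aa, collapse = "")
}

protein <- translateSketch("ATGGCCTAA")
```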
References¶
- Vander Heiden JA, Yaari G, et al. pRESTO: a toolkit for processing high-throughput sequencing raw reads of lymphocyte receptor repertoires. Bioinformatics. 2014;30(13):1930-2.
- Stern JNH, Yaari G, Vander Heiden JA, et al. B cells populating the multiple sclerosis brain mature in the draining cervical lymph nodes. Sci Transl Med. 2014;6(248):248ra107.
- Wu Y-CB, et al. Influence of seasonal exposure to grass pollen on local and peripheral blood IgE repertoires in patients with allergic rhinitis. J Allergy Clin Immunol. 2014;134(3):604-12.
- Gupta NT, Vander Heiden JA, et al. Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data. Bioinformatics. 2015;31(20):3356-8.