immuneSIM: tunable multi-feature simulation of B- and T-cell receptor repertoires for immunoinformatics benchmarking

2020 ◽  
Vol 36 (11) ◽  
pp. 3594-3596 ◽  
Author(s):  
Cédric R Weber ◽  
Rahmad Akbar ◽  
Alexander Yermanos ◽  
Milena Pavlović ◽  
Igor Snapkov ◽  
...  

Abstract Summary B- and T-cell receptor repertoires of the adaptive immune system have become a key target for diagnostics and therapeutics research. Consequently, there is a rapidly growing number of bioinformatics tools for immune repertoire analysis. Benchmarking of such tools is crucial for ensuring reproducible and generalizable computational analyses. Currently, however, it remains challenging to create standardized ground truth immune receptor repertoires for immunoinformatics tool benchmarking. Therefore, we developed immuneSIM, an R package that allows the simulation of native-like and aberrant synthetic full-length variable region immune receptor sequences by tuning the following immune receptor features: (i) species and chain type (BCR, TCR, single and paired), (ii) germline gene usage, (iii) occurrence of insertions and deletions, (iv) clonal abundance, (v) somatic hypermutation and (vi) sequence motifs. Each simulated sequence is annotated by the complete set of simulation events that contributed to its in silico generation. immuneSIM permits the benchmarking of key computational tools for immune receptor analysis, such as germline gene annotation, diversity and overlap estimation, sequence similarity, network architecture, clustering analysis and machine learning methods for motif detection. Availability and implementation The package is available via https://github.com/GreiffLab/immuneSIM and on CRAN at https://cran.r-project.org/web/packages/immuneSIM. The documentation is hosted at https://immuneSIM.readthedocs.io. Contact [email protected] or [email protected] Supplementary information Supplementary data are available at Bioinformatics online.
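The idea of annotating every simulated sequence with the events that produced it can be illustrated with a minimal Python sketch. This is not the immuneSIM API (immuneSIM is an R package). The toy germline segments, the function names, the uniform choice of deletion and insertion lengths, and the Zipf-like rank-frequency rule for clonal abundance are all assumptions made for illustration.

```python
import random

# Hypothetical toy germline segments (illustrative only, not a real reference set).
V_GENES = {"V1": "TGTGCCAGCAGCTTA", "V2": "TGTGCCAGCAGTGAA"}
D_GENES = {"D1": "GGGACAGGGGGC", "D2": "GGGACTAGCGGG"}
J_GENES = {"J1": "AACACTGAAGCTTTC", "J2": "TATGGCTACACCTTC"}


def random_nucleotides(rng, n):
    """Return n random nucleotides (stand-in for non-templated insertions)."""
    return "".join(rng.choice("ACGT") for _ in range(n))


def assemble(events):
    """Rebuild a sequence purely from its recorded simulation events."""
    v_seq = V_GENES[events["v_gene"]]
    d_seq = D_GENES[events["d_gene"]]
    j_seq = J_GENES[events["j_gene"]]
    v_part = v_seq[: len(v_seq) - events["v_del_3"]]
    d_part = d_seq[events["d_del_5"] : len(d_seq) - events["d_del_3"]]
    j_part = j_seq[events["j_del_5"] :]
    return v_part + events["n1"] + d_part + events["n2"] + j_part


def simulate_sequence(rng, max_del=3, max_ins=4):
    """Draw germline genes, deletions and insertions, and record each event."""
    events = {
        "v_gene": rng.choice(sorted(V_GENES)),
        "d_gene": rng.choice(sorted(D_GENES)),
        "j_gene": rng.choice(sorted(J_GENES)),
        "v_del_3": rng.randint(0, max_del),
        "d_del_5": rng.randint(0, max_del),
        "d_del_3": rng.randint(0, max_del),
        "j_del_5": rng.randint(0, max_del),
        "n1": random_nucleotides(rng, rng.randint(0, max_ins)),
        "n2": random_nucleotides(rng, rng.randint(0, max_ins)),
    }
    return {"sequence": assemble(events), "events": events}


def simulate_repertoire(n_clones, alpha=2.0, seed=0):
    """Simulate clones and assign Zipf-like frequencies (assumed abundance model)."""
    rng = random.Random(seed)
    clones = [simulate_sequence(rng) for _ in range(n_clones)]
    weights = [1.0 / (rank ** alpha) for rank in range(1, n_clones + 1)]
    total = sum(weights)
    for clone, weight in zip(clones, weights):
        clone["frequency"] = weight / total
    return clones
```

Because every clone carries its own event record, a benchmark can compare a tool's germline calls or deletion estimates against `clone["events"]` directly, which is the sense in which such a repertoire serves as ground truth.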

2019 ◽  
Author(s):  
Cédric R. Weber ◽  
Rahmad Akbar ◽  
Alexander Yermanos ◽  
Milena Pavlović ◽  
Igor Snapkov ◽  
...  

Abstract Summary B- and T-cell receptor repertoires of the adaptive immune system have become a key target for diagnostics and therapeutics research. Consequently, there is a rapidly growing number of bioinformatics tools for immune repertoire analysis. Benchmarking of such tools is crucial for ensuring reproducible and generalizable computational analyses. Currently, however, it remains challenging to create standardized ground truth immune receptor repertoires for immunoinformatics tool benchmarking. Therefore, we developed immuneSIM, an R package that allows the simulation of native-like and aberrant synthetic full length variable region immune receptor sequences. ImmuneSIM enables the tuning of the immune receptor features: (i) species and chain type (BCR, TCR, single, paired), (ii) germline gene usage, (iii) occurrence of insertions and deletions, (iv) clonal abundance, (v) somatic hypermutation, and (vi) sequence motifs. Each simulated sequence is annotated by the complete set of simulation events that contributed to its in silico generation. immuneSIM permits the benchmarking of key computational tools for immune receptor analysis such as germline gene annotation, diversity and overlap estimation, sequence similarity, network architecture, clustering analysis, and machine learning methods for motif detection. Availability The package is available via https://github.com/GreiffLab/immuneSIM and will also be available at CRAN (submitted). The documentation is hosted at https://[email protected], [email protected]


2021 ◽  
Vol 12 ◽  
Author(s):  
Jonathan Crider ◽  
Sylvie M. A. Quiniou ◽  
Kristianna L. Felch ◽  
Kurt Showmaker ◽  
Eva Bengtén ◽  
...  

The complete germline repertoires of the channel catfish, Ictalurus punctatus, T cell receptor (TR) loci, TRAD, TRB, and TRG were obtained by analyzing genomic data from PacBio sequencing. The catfish TRB locus spans 214 kb, and contains 112 TRBV genes, a single TRBD gene, 31 TRBJ genes and two TRBC genes. In contrast, the TRAD locus is very large, at 1,285 kb. It consists of four TRDD genes, one TRDJ gene followed by the exons for TRDC, 125 TRAJ genes and the exons encoding the TRAC. Downstream of the TRAC, are 140 TRADV genes, and all of them are in the opposite transcriptional orientation. The catfish TRGC locus spans 151 kb and consists of four diverse V-J-C cassettes. Altogether, this locus contains 15 TRGV genes and 10 TRGJ genes. To place our data into context, we also analyzed the zebrafish TR germline gene repertoires. Overall, our findings demonstrated that catfish possesses a more restricted repertoire compared to the zebrafish. For example, the 140 TRADV genes in catfish form eight subgroups based on members sharing 75% nucleotide identity. However, the 149 TRAD genes in zebrafish form 53 subgroups. This difference in subgroup numbers between catfish and zebrafish is best explained by expansions of catfish TRADV subgroups, which likely occurred through multiple, relatively recent gene duplications. Similarly, 112 catfish TRBV genes form 30 subgroups, while the 51 zebrafish TRBV genes are placed into 36 subgroups. Notably, several catfish and zebrafish TRB subgroups share ancestor nodes. In addition, the complete catfish TR gene annotation was used to compile a TR gene segment database, which was applied in clonotype analysis of an available gynogenetic channel catfish transcriptome. Combined, the TR annotation and clonotype analysis suggested that the expressed TRA, TRB, and TRD repertoires were generated by different mechanisms. 
The diversity of the TRB repertoire depends on the number of TRBV subgroups and TRBJ genes, while TRA diversity relies on the many different TRAJ genes, which appear to be only minimally trimmed. In contrast, TRD diversity relies on nucleotide additions and the utilization of up to four TRDD segments.


Author(s):  
Taylor Jones ◽  
Samuel B Day ◽  
Luke Myers ◽  
James E Crowe ◽  
Cinque Soto

Abstract Summary B cell receptor (BCR) and T cell receptor (TCR) repertoires are generated through somatic DNA rearrangements and are responsible for the molecular basis of antigen recognition in the immune system. Next-generation sequencing (NGS) of DNA and the falling cost of sequencing due to continued development of these technologies have made sequencing assays an affordable way to characterize the repertoire of adaptive immune receptors (sometimes termed the ‘immunome’). Many new workflows have been developed to take advantage of NGS and have placed the resulting immunome datasets in the public domain. The scale of these NGS datasets has made it challenging to search through the Complementarity-determining region 3 (CDR3), which is responsible for imparting specific antibody-antigen interactions. Thus, there is an increasing demand for sequence analysis tools capable of searching through CDR3s from immunome data collections containing millions of sequences. To address this need, we created a software package called ClonoMatch that facilitates rapid searches in bulk immunome data for BCR or TCR sequences based on their CDR3 sequence or V3J clonotype. Availability and implementation Documentation, software support and the codebase are all available at https://github.com/crowelab/clonomatch. This software is distributed under the GPL v3 license. Supplementary information Supplementary data are available at Bioinformatics online.
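A rough sketch of exact-match lookup by CDR3 or by V3J clonotype is given below in Python. It is not ClonoMatch's implementation or interface. It assumes that a V3J clonotype is the triple of V gene, CDR3 sequence and J gene, and the record fields (`v_gene`, `cdr3`, `j_gene`) and function names are hypothetical.

```python
from collections import defaultdict


def build_index(records):
    """Index record positions by CDR3 and by (V gene, CDR3, J gene) triple."""
    by_cdr3 = defaultdict(list)
    by_v3j = defaultdict(list)
    for position, record in enumerate(records):
        by_cdr3[record["cdr3"]].append(position)
        key = (record["v_gene"], record["cdr3"], record["j_gene"])
        by_v3j[key].append(position)
    return by_cdr3, by_v3j


def search_cdr3(by_cdr3, cdr3):
    """Return positions of all records with exactly this CDR3."""
    return list(by_cdr3.get(cdr3, []))


def search_v3j(by_v3j, v_gene, cdr3, j_gene):
    """Return positions of all records with exactly this V3J clonotype."""
    return list(by_v3j.get((v_gene, cdr3, j_gene), []))
```

Building the index costs one pass over the data, after which each query is a single hash lookup. That trade-off is what makes repeated searches over millions of sequences practical, although a production tool would also need persistent storage and possibly approximate matching.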


2021 ◽  
Author(s):  
Milena Vujović ◽  
Paolo Marcatili ◽  
Benny Chain ◽  
Joseph Kaplinsky ◽  
Thomas Lars Andresen

Abstract We propose TCRDivER, a global approach to T-cell repertoire comparison using diversity profiles sensitive to both clone size and sequence similarity. As immunotherapies improve, the long-standing biological interest in connecting outcome with T cell receptor (TCR) repertoire status has become more urgent. Here we show that new insights can be extracted from high-throughput repertoire sequencing data. Most current efforts focus on identification of immunisation-specific sequence motifs or on monitoring changes in frequency of individual clones. Applying TCRDivER to murine spleen samples shows that it characterises an additional dimension of repertoire variation, beyond conventional diversity estimates, allowing distinction between immunised and non-immunised samples. We further apply TCRDivER to repertoires from human blood. In both cases we show characteristic relationships between repertoire features. These reveal biologically interpretable relationships between sequence similarity and clonal expansions. We thereby demonstrate a new tool for investigation in clinical and research applications.
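One way to make a diversity profile sensitive to both clone size and sequence similarity is a similarity-weighted Hill number. The Python sketch below illustrates that general idea and is not TCRDivER's code. The similarity kernel exp(-lam * edit distance), the parameter `lam` and all function names are assumptions.

```python
import math


def levenshtein(a, b):
    """Edit distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(previous[j] + 1,                       # deletion
                               current[j - 1] + 1,                    # insertion
                               previous[j - 1] + (char_a != char_b)))  # substitution
        previous = current
    return previous[-1]


def similarity_matrix(sequences, lam=1.0):
    """Assumed kernel: similarity decays exponentially with edit distance."""
    return [[math.exp(-lam * levenshtein(a, b)) for b in sequences]
            for a in sequences]


def diversity(p, Z, q):
    """Similarity-weighted diversity of order q for frequencies p and matrix Z."""
    n = len(p)
    zp = [sum(Z[i][j] * p[j] for j in range(n)) for i in range(n)]
    if abs(q - 1.0) < 1e-12:
        # Order 1 is defined as the limit q -> 1 (a Shannon-type term).
        return math.exp(-sum(pi * math.log(zi) for pi, zi in zip(p, zp) if pi > 0))
    total = sum(pi * zi ** (q - 1.0) for pi, zi in zip(p, zp) if pi > 0)
    return total ** (1.0 / (1.0 - q))


def diversity_profile(p, Z, orders=(0, 1, 2)):
    """Evaluate the diversity at several orders q to form a profile."""
    return {q: diversity(p, Z, q) for q in orders}
```

With an identity matrix for `Z` the function reduces to ordinary Hill numbers, so four equally frequent, fully distinct clones give a diversity of 4 at every order. Any similarity between clones lowers the value, and larger `q` weights expanded clones more heavily.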


Author(s):  
Anja Mösch ◽  
Dmitrij Frishman

Abstract Summary The ability of a T cell to recognize foreign peptides is defined by a single α and a single β hypervariable complementarity determining region (CDR3), which together form the T cell receptor (TCR) heterodimer. In ∼30%-35% of T cells, two α chains are expressed at the mRNA level but only one α chain is part of the functional TCR. This effect can also be observed for β chains, although it is less common. The identification of functional α/β chain pairs is instrumental in high-throughput characterization of therapeutic TCRs. TCRpair is the first method that predicts whether an α and β chain pair forms a functional, HLA-A*02:01 specific TCR without requiring the sequence of a recognized peptide. By taking additional amino acids flanking the CDR3 regions into account, TCRpair achieves an AUC of 0.71. Availability TCRpair is implemented in Python using TensorFlow 2.0 and is freely available at https://www.github.com/amoesch/TCRpair Supplementary information Supplementary data are available at Bioinformatics online.


1994 ◽  
Vol 106 (5) ◽  
pp. 1321-1325 ◽  
Author(s):  
Koji Manabe ◽  
Martin L. Hibberd ◽  
Peter T. Donaldson ◽  
James A. Underhill ◽  
Derek G. Doherty ◽  
...  

1996 ◽  
Vol 168 (2) ◽  
pp. 235-242 ◽  
Author(s):  
Corinna Kayser ◽  
Inge Waase ◽  
Cornelia M. Weyand ◽  
Jörg J. Goronzy
