A full-length transcriptome dataset of normal and Nosema ceranae-challenged midgut tissues of eastern honeybee workers

2020 ◽  
Author(s):  
Yu Du ◽  
Yuanchan Fan ◽  
Huazhi Chen ◽  
Jie Wang ◽  
Cuiling Xiong ◽  
...  

ABSTRACT Apis cerana cerana is a subspecies of the eastern honeybee, Apis cerana, and plays a vital role in ecological maintenance in China. However, A. c. cerana is threatened by many pathogenic microorganisms, including Nosema ceranae, a widespread fungal parasite that infects colonies worldwide. In this article, unchallenged (AcCK1, AcCK2) and N. ceranae-challenged (AcT1, AcT2) midguts of A. c. cerana workers were sequenced using Nanopore long-read sequencing technology. In total, 11,727,628, 6,996,395, 14,383,735 and 11,580,154 raw reads were generated from AcCK1, AcCK2, AcT1 and AcT2; the average lengths were 1147 bp, 908 bp, 992 bp and 1077 bp, and the N50 values were 1308 bp, 911 bp, 1079 bp and 1192 bp, respectively. The length distribution of raw reads ranged from 1 kb to more than 10 kb. Additionally, the quality (Q) scores of raw reads were distributed between Q7 and Q17. Further, 11,617,144, 6,940,895, 14,277,240 and 11,501,562 clean reads were obtained from AcCK1, AcCK2, AcT1 and AcT2, respectively, of which 78.40%, 82.50%, 79.05% and 80.20% were identified as full-length clean reads. In addition, full-length clean reads from AcCK1, AcT1, AcT2 and AcCK2 ranged from 1 kb to more than 10 kb in length. Finally, the full-length transcripts remaining after removal of redundant reads ranged from 1 kb to 5 kb in length.
Value of the data
♦ This dataset enables a better understanding of the complexity of the A. c. cerana transcriptome.
♦ The dataset contributes to the identification of genes and transcripts engaged in the response of the eastern honeybee to N. ceranae stress.
♦ The data provide a valuable genetic resource for deciphering alternative splicing and polyadenylation of A. c. cerana mRNAs involved in the host response to N. ceranae challenge.
♦ The reported data are beneficial for uncovering the molecular mechanism regulating the interaction between the eastern honeybee and the microsporidian.
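The abstract above reports an N50 value alongside the average read length for each sample. As a reminder of what that statistic measures, here is a minimal Python sketch of one common way to compute N50 from a list of read lengths. The function name `n50` and the toy lengths are illustrative assumptions; they are not taken from the dataset or from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of read lengths.

    N50 is the length L such that reads of length >= L together
    contain at least half of all sequenced bases.
    """
    if not lengths:
        raise ValueError("need at least one read length")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


# Toy read lengths in bp (illustrative only, not from the dataset).
toy_lengths = [500, 800, 1000, 1200, 1500, 3000]
mean_length = sum(toy_lengths) / len(toy_lengths)
print("mean length:", round(mean_length, 1))
print("N50:", n50(toy_lengths))
```

Because long reads contribute more bases than short ones, N50 is typically at least as large as the mean length, which matches the pattern in the reported values.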

2020 ◽  
Author(s):  
Huazhi Chen ◽  
Yu Du ◽  
Yuanchan Fan ◽  
Haibin Jiang ◽  
Cuiling Xiong ◽  
...  

ABSTRACT Nosema ceranae, a widespread fungal parasite that infects honeybees and many other bee species, can seriously affect bee health and colony productivity. In this article, N. ceranae spores were purified and then subjected to third-generation sequencing on the Nanopore PromethION platform. In total, 6988795 raw reads were generated from purified spores, with lengths distributed between 1 kb and 10 kb and quality (Q) scores distributed between Q6 and Q12. A total of 6953469 clean reads were obtained, of which 73.98% were identified as full-length. The full-length transcripts remaining after removal of redundant reads ranged from 1 kb to 5 kb in length, with 1 kb being the most abundant length. These data will significantly improve the quality of the N. ceranae transcriptome.


Author(s):  
Yu Du ◽  
Huazhi Chen ◽  
Jie Wang ◽  
Zhiwei Zhu ◽  
Cuiling Xiong ◽  
...  

ABSTRACT Ascosphaera apis is a fungal pathogen that exclusively infects honeybee larvae, leading to chalkbrood disease, which reduces the number of adult honeybees and colony productivity. In this article, A. apis mycelia (Aam) and spores (Aas) were separately purified and then sequenced by Oxford Nanopore sequencing on the PromethION platform. In total, 6,321,704 and 6,259,727 raw reads were generated from Aam and Aas, with lengths distributed between 1 kb and 10 kb. The quality (Q) scores of the majority of raw reads were Q9 (Aam) and Q11 (Aas). Additionally, 5,669,436 and 6,233,159 clean reads were obtained, of which 79.32% and 79.62% were identified as full-length. The full-length transcripts remaining after removal of redundant reads ranged from 1 kb to 8 kb and from 1 kb to 9 kb in length, and the most abundant length for both was 1 kb. Furthermore, the clean reads remaining after removal of redundant transcripts ranged from 1 kb to 7 kb in length, with 1 kb being the largest group. The data reported here provide a beneficial genetic resource for improving genome and transcriptome annotations of A. apis and for exploring alternative splicing and polyadenylation of A. apis mRNAs.
Value of the result
The dataset enables a better understanding of the complexity of the A. apis transcriptome.
The long-read transcriptome data can be used to identify genes and transcripts associated with the A. apis infection mechanism.
The accessible data provide full-length transcripts for improving gene structure and functional annotation of the A. apis transcriptome.
This dataset could be used to investigate alternative splicing and polyadenylation of A. apis mRNAs.


2020 ◽  
Author(s):  
Huazhi Chen ◽  
Xiaoxue Fan ◽  
Yu Du ◽  
Yuanchan Fan ◽  
Jie Wang ◽  
...  

ABSTRACT Apis mellifera ligustica is a subspecies of the western honeybee, Apis mellifera. Nosema ceranae is known to cause bee microsporidiosis, which seriously affects bee survival and colony productivity. In this article, Nanopore long-read sequencing was used to sequence N. ceranae-infected and uninfected midguts of A. m. ligustica workers at 7 d and 10 d post inoculation (dpi). In total, 5942745, 6664923, 7100161 and 6506665 raw reads were generated from AmT1, AmT2, AmCK1 and AmCK2, respectively, with average lengths of 1148, 1196, 1178 and 1201 bp and N50 values of 1328, 1394, 1347 and 1388 bp. The raw reads from AmT1, AmT2, AmCK1 and AmCK2 ranged in length from 1 kb to more than 10 kb. Additionally, the quality scores of raw reads from AmT1 and AmT2 were distributed between Q6 and Q12, while those from AmCK1 and AmCK2 were distributed between Q6 and Q16. Further, 5745048, 6416987, 6928170 and 6353066 clean reads were obtained from AmT1, AmT2, AmCK1 and AmCK2, respectively, of which 4172542, 4638289, 5068270 and 4857960 were identified as full-length. After removing redundant reads, the remaining full-length transcripts ranged from 1 kb to 8 kb in length, with 2 kb being the most abundant length. The long-read transcriptome data reported here contribute to a deeper understanding of the molecular mechanisms regulating the response of A. m. ligustica to N. ceranae and of host-fungal parasite interaction during microsporidiosis.
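Unlike the related abstracts in this listing, the one above gives full-length reads as absolute counts rather than percentages. The short Python sketch below converts the reported counts into per-sample full-length fractions so they can be compared with the percentages quoted elsewhere. The counts are copied from the abstract; the dictionary layout and the function name `full_length_percent` are illustrative assumptions.

```python
# Clean and full-length read counts as reported in the abstract.
clean_reads = {
    "AmT1": 5745048,
    "AmT2": 6416987,
    "AmCK1": 6928170,
    "AmCK2": 6353066,
}
full_length_reads = {
    "AmT1": 4172542,
    "AmT2": 4638289,
    "AmCK1": 5068270,
    "AmCK2": 4857960,
}


def full_length_percent(clean, full):
    """Return the percentage of clean reads that are full-length, per sample."""
    return {sample: 100.0 * full[sample] / clean[sample] for sample in clean}


percentages = full_length_percent(clean_reads, full_length_reads)
for sample, pct in percentages.items():
    print(f"{sample}: {pct:.2f}% full-length")
```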


2020 ◽  
Author(s):  
Huazhi Chen ◽  
Dingding Zhou ◽  
Yu Du ◽  
Cuiling Xiong ◽  
Yanzhen Zheng ◽  
...  

ABSTRACT Apis cerana cerana is a subspecies of the eastern honeybee, Apis cerana. Nosema ceranae is a widespread fungal parasite of honeybees, causing heavy losses for the beekeeping industry all over the world. In this article, total RNA was isolated from normal midguts (AcCK1, AcCK2) and N. ceranae-infected midguts of A. c. cerana workers at 7 d and 10 d post inoculation (AcT1, AcT2), followed by strand-specific cDNA library construction and next-generation RNA sequencing. In total, 56270223688, 44860946964, 78991623806, and 92712308296 raw reads were derived from AcCK1, AcCK2, AcT1 and AcT2, respectively. Following strict quality control, 54495191388, 43570608753, 76708161525, and 89467858351 clean reads were obtained, with Q30 values of 95.80%, 95.99%, 96.07% and 96.04%, and GC contents of 44.20%, 43.44%, 44.83% and 43.63%, respectively. The raw data were submitted to the NCBI Sequence Read Archive database and linked to BioProject PRJNA562784. These data offer a valuable resource for in-depth investigation of the mechanisms underlying the response of the eastern honeybee to N. ceranae infection and host-fungal parasite interaction during microsporidiosis.
Value of the Data
The dataset offers a valuable resource for exploring mRNAs, lncRNAs and circRNAs involved in the response of A. c. cerana workers to N. ceranae infection.
The accessible data can be used to investigate the differential expression patterns and regulatory networks of non-coding RNAs in A. c. cerana workers’ midguts responding to N. ceranae challenge.
These data will enable a better understanding of the molecular mechanism regulating the eastern honeybee-N. ceranae interaction.


2019 ◽  
Author(s):  
Dhaivat Joshi ◽  
Shunfu Mao ◽  
Sreeram Kannan ◽  
Suhas Diggavi

Abstract Motivation Efficient and accurate alignment of DNA / RNA sequence reads to each other or to a reference genome / transcriptome is an important problem in genomic analysis. Nanopore sequencing has emerged as a major sequencing technology and many long-read aligners have been designed for aligning nanopore reads. However, the high error rate makes accurate and efficient alignment difficult. Utilizing the noise and error characteristics inherent in the sequencing process properly can play a vital role in constructing a robust aligner. In this paper, we design QAlign, a pre-processor that can be used with any long-read aligner for aligning long reads to a genome / transcriptome or to other long reads. The key idea in QAlign is to convert the nucleotide reads into discretized current levels that capture the error modes of the nanopore sequencer before running it through a sequence aligner. Results We show that QAlign is able to improve alignment rates from around 80% up to 90% with nanopore reads when aligning to the genome. We also show that QAlign improves the average overlap quality by 9.2%, 2.5% and 10.8% in three real datasets for read-to-read alignment. Read-to-transcriptome alignment rates are improved from 51.6% to 75.4% and 82.6% to 90% in two real datasets. Availability https://github.com/joshidhaivat/QAlign.git
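The key idea named in the abstract above, converting nucleotide reads into discretized current levels before alignment, can be illustrated with a toy Python sketch. Everything specific here is an assumption made for illustration: the k-mer-to-current table produced by `toy_current_model` is invented (it is not a real pore model), the equal-width binning is one simple choice of discretization, and none of the names come from the QAlign code base, whose actual tables and thresholds are not reproduced here.

```python
from itertools import product

BASES = "ACGT"


def toy_current_model(k):
    """Build a HYPOTHETICAL k-mer -> mean current table.

    A real pore model assigns each k-mer a measured current level;
    here each base gets an arbitrary weight purely for illustration.
    """
    weight = {"A": 0.0, "C": 1.0, "G": 2.0, "T": 3.0}
    return {
        "".join(kmer): sum(weight[b] for b in kmer) / k
        for kmer in product(BASES, repeat=k)
    }


def quantize_read(read, model, k, levels):
    """Map each k-mer of `read` to one of `levels` equal-width current bins."""
    lo, hi = min(model.values()), max(model.values())
    width = (hi - lo) / levels
    symbols = []
    for i in range(len(read) - k + 1):
        current = model[read[i:i + k]]
        # Clamp so the maximum current falls into the top bin.
        bin_index = min(int((current - lo) / width), levels - 1)
        symbols.append(str(bin_index))
    return "".join(symbols)


model = toy_current_model(k=3)
quantized = quantize_read("ACGTTGCA", model, k=3, levels=3)
print(quantized)
```

In this sketch, k-mers with similar toy current levels collapse to the same symbol, so a base-calling confusion between two such k-mers no longer produces a mismatch in the quantized sequence. The quantized reads and a reference quantized the same way could then be passed to an ordinary aligner.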


2021 ◽  
Author(s):  
Jun Ke Yu ◽  
Da Fu Chen ◽  
Rui Guo

Apis cerana cerana is an excellent subspecies of Apis cerana, playing a vital role in the pollination of wild flowers and crops as well as in ecological balance. Nosema ceranae, an emergent fungal parasite infecting various bee species, originates from the eastern honeybee. In this article, midguts of N. ceranae-inoculated A. c. cerana workers at 7 days post inoculation (dpi) and 10 dpi (AcT1 and AcT2) and midguts of uninoculated workers (AcCK1, AcCK2) were subjected to Nanopore-based genome-wide DNA methylation sequencing. In total, 1773258, 2151476, 1927874 and 2109961 clean reads were generated from the AcCK1, AcCK2, AcT1, and AcT2 groups, with N50 lengths of 7548, 7936, 7678, and 7291 and average quality values of 8.97, 8.95, 9.24, and 8.98, respectively. Among these, 93.85%, 94.49%, 88.69%, and 81.27% of clean reads could be mapped to the reference genome of A. c. cerana. In the four groups, 2149685, 2614513, 1637018 and 2726985 CHG sites were identified; the numbers of CHH sites were 9581990, 11801082, 7178559, and 12342423, whereas those of CpG sites were 14325356, 15703508, 14856284 and 13956849, respectively. Additionally, 36114, 118867, 30249, and 82984 6mA methylation sites were discovered, respectively. These data can be used to identify differential 5mC and 6mA methylation engaged in the response of eastern honeybee workers to N. ceranae infestation, and to investigate the 5mC or 6mA methylation-mediated mechanism underlying the host response.


Author(s):  
Dhaivat Joshi ◽  
Shunfu Mao ◽  
Sreeram Kannan ◽  
Suhas Diggavi

Abstract Motivation Efficient and accurate alignment of DNA/RNA sequence reads to each other or to a reference genome/transcriptome is an important problem in genomic analysis. Nanopore sequencing has emerged as a major sequencing technology and many long-read aligners have been designed for aligning nanopore reads. However, the high error rate makes accurate and efficient alignment difficult. Utilizing the noise and error characteristics inherent in the sequencing process properly can play a vital role in constructing a robust aligner. In this article, we design QAlign, a pre-processor that can be used with any long-read aligner for aligning long reads to a genome/transcriptome or to other long reads. The key idea in QAlign is to convert the nucleotide reads into discretized current levels that capture the error modes of the nanopore sequencer before running it through a sequence aligner. Results We show that QAlign is able to improve alignment rates from around 80% up to 90% with nanopore reads when aligning to the genome. We also show that QAlign improves the average overlap quality by 9.2, 2.5 and 10.8% in three real datasets for read-to-read alignment. Read-to-transcriptome alignment rates are improved from 51.6% to 75.4% and 82.6% to 90% in two real datasets. Availability and implementation https://github.com/joshidhaivat/QAlign.git. Supplementary information Supplementary data are available at Bioinformatics online.


2021 ◽  
Author(s):  
Lingyun Liu ◽  
Ke Teng ◽  
Xifeng Fan ◽  
Hui Zhang ◽  
Chao Han ◽  
...  

Abstract Pennisetum setaceum ‘Rubrum’ is an ornamental herb with purple leaves, and it is widely used in the construction of landscaping. However, the current next generation sequencing (NGS) transcriptome information is not satisfactory mainly because of the enormous difficulty in obtaining full-length transcripts. What’s more, the molecular mechanisms of anthocyanin accumulation have not been thoroughly studied. In this study, we used PacBio full-length transcriptome sequencing combined with NGS sequencing technology to conduct transcriptome analysis on leaves showing different colors at different stages to clarify the molecular mechanism involved in the color change of P. setaceum ‘Rubrum’. A total of 280,413 full-length non-chimeric reads (FLNC) sequences were obtained based on single-molecule long-read sequencing technology. We obtained 140,633 high quality (HQ) transcripts and 2,683 low quality (LQ) transcripts and identified 5,352 alternative splicing (AS). In addition, a total of 93,066 ORFs, including 57,457 full open links and 2,910 lncRNA sequences were screened out. Furthermore, a total of 10,795 differentially expressed genes were identified. Gene ontology (GO) cluster and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis revealed the underlying mechanism of anthocyanin accumulation. In this study, to our best knowledge, we provided the full-length transcriptome information of P. setaceum ‘Rubrum’ for the first time. The underlying mechanism of anthocyanin accumulation in P. setaceum ‘Rubrum’ was further discussed based on the newly generated transcriptome data. The information will not only facilitate the gene function studies but also pave the way for future breeding projects of Pennisetum setaceum .


Author(s):  
Benjamin J Callahan ◽  
Dmitry Grinevich ◽  
Siddhartha Thakur ◽  
Michael A Balamotis ◽  
Tuval Ben Yehezkel

Abstract Out of the many pathogenic bacterial species that are known, only a fraction are readily identifiable directly from a complex microbial community using standard next generation DNA sequencing technology. Long-read sequencing offers the potential to identify a wider range of species and to differentiate between strains within a species, but attaining sufficient accuracy in complex metagenomes remains a challenge. Here, we describe and analytically validate LoopSeq, a commercially-available synthetic long-read (SLR) sequencing technology that generates highly-accurate long reads from standard short reads. LoopSeq reads are sufficiently long and accurate to identify microbial genes and species directly from complex samples. LoopSeq applied to full-length 16S rRNA genes from known strains in a microbial community perfectly recovered the full diversity of full-length exact sequence variants in a known microbial community. Full-length LoopSeq reads had a per-base error rate of 0.005%, which exceeds the accuracy reported for other long-read sequencing technologies. 18S-ITS and genomic sequencing of fungal and bacterial isolates confirmed that LoopSeq sequencing maintains that accuracy for reads up to 6 kilobases in length. Analysis of rinsate from retail meat samples demonstrated that LoopSeq full-length 16S rRNA synthetic long-reads could accurately classify organisms down to the species level, and could differentiate between different strains within species identified by the CDC as potential foodborne pathogens. The order-of-magnitude improvement in both length and accuracy over standard Illumina amplicon sequencing achieved with LoopSeq enables accurate species-level and strain identification from complex and low-biomass microbiome samples.
The ability to generate accurate and long microbiome sequencing reads using standard short read sequencers will accelerate the building of quality microbial sequence databases and remove a significant hurdle on the path to precision microbial genomics.
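To put the reported per-base error rate of 0.005% on the same scale as the Phred-style Q scores quoted for the Nanopore datasets earlier in this listing, the following Python sketch applies the standard Phred relation Q = -10 log10(p). The function names are illustrative, and the expected-error figure assumes errors are independent and uniform along the read, which is a simplification rather than a claim made by the authors.

```python
import math


def phred_from_error_rate(error_rate):
    """Convert a per-base error probability into a Phred quality score."""
    return -10.0 * math.log10(error_rate)


def error_rate_from_phred(q):
    """Convert a Phred quality score back into a per-base error probability."""
    return 10.0 ** (-q / 10.0)


def expected_errors(read_length_bp, error_rate):
    """Expected number of errors in a read, assuming uniform independent errors."""
    return read_length_bp * error_rate


loopseq_error = 0.005 / 100  # 0.005% expressed as a probability
print("Phred-equivalent quality:", round(phred_from_error_rate(loopseq_error), 1))
print("Expected errors in a 6 kb read:", round(expected_errors(6000, loopseq_error), 3))
print("Per-base error probability at Q7:", round(error_rate_from_phred(7), 3))
```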


2020 ◽  
Vol 79 (Suppl 1) ◽  
pp. 162.2-162
Author(s):  
M. Bakker ◽  
P. Putrik ◽  
J. Rademakers ◽  
M. Van de Laar ◽  
H. Vonkeman ◽  
...  

Background: The prevalence of limited health literacy (i.e. cognitive and social resources of individuals to access, understand and apply health information to promote and maintain good health) in the Netherlands is estimated to be over 36% [1]. Access to and outcomes of rheumatological care may be compromised by limited patient health literacy, yet little is known about how to address this, thus action is required. As influencing individual patients’ health literacy in the rheumatology context is often unrealistic, it is paramount for the health system to be tailored to the health literacy needs of its patients. The OPtimising HEalth LIteracy and Access (Ophelia) process offers a method to inform system change [2].

Objectives: Following the Ophelia approach: a. Identify health literacy profiles reflecting strengths and weaknesses of outpatients with RA, SpA and gout. b. Use the health literacy profiles to facilitate discussions on challenges for patients and professionals in rheumatological care and identify possible solutions the health system could offer to address these challenges.

Methods: Patients with RA, SpA and gout attending outpatient clinics in three centres in the Netherlands completed the Health Literacy Questionnaire (HLQ) and questions on socio-demographic and health-related characteristics. Hierarchical cluster analysis using Ward’s method identified clusters based on the nine HLQ domains. Three researchers jointly examined 24 cluster solutions for meaningfulness by interpreting HLQ domain scores and patient characteristics. Meaningful clusters were translated into health literacy profiles using HLQ patterns and demographic data. A patient research partner confirmed the identified profiles. Patient vignettes were designed by combining cluster analyses results with qualitative patient interviews. The vignettes were used in two two-hour co-design workshops with rheumatologists and nurses to discuss their perspective on health literacy-related challenges for patients and professionals, and generate ideas on how to address these challenges.

Results: In total, 895 patients participated: 49% female, mean age 61 years (±13.0), 25% lived alone, 18% had a migrant background, 6.6% did not speak Dutch at home and 51% had low levels of education. Figure 1 shows a heat map of identified health literacy profiles, displaying the score distribution per profile across nine health literacy domains. Figure 2 shows an excerpt of a patient vignette, describing challenges for a patient with profile number 9. The workshops were attended by 7 and 14 nurses and rheumatologists. Proposed solutions included health literacy communication training for professionals, developing and improving (visual) patient information materials, peer support for patients through patient associations or group consultations, a clear referral system for patients who need additional guidance by a nurse, social worker, lifestyle coach, pharmacist or family doctor, and more time with rheumatology nurses for target populations. Moreover, several system adaptations to the clinic, such as a central desk for all patient appointments, were proposed.

Conclusion: This study identified several distinct health literacy profiles of patients with rheumatic conditions. Engaging with health professionals in co-design workshops led to numerous bottom-up ideas to improve care. Next steps include co-design workshops with patients, followed by prioritising and testing proposed interventions.

References: [1] Heijmans M. et al. Health Literacy in the Netherlands. Utrecht: Nivel 2018 [2] Batterham R. et al. BMC Public Health 2014, 14:694

Disclosure of Interests: Mark Bakker: None declared, Polina Putrik: None declared, Jany Rademakers Speakers bureau: In March 2017, Prof. Dr. Rademakers was invited to speak about health literacy at the “Heuvellanddagen” Conference, hosted by Janssen-Cilag., Mart van de Laar Consultant of: Sanofi Genzyme, Speakers bureau: Sanofi Genzyme, Harald Vonkeman: None declared, Marc R Kok Grant/research support from: BMS and Novartis, Consultant of: Novartis and Galapagos, Hanneke Voorneveld: None declared, Sofia Ramiro Grant/research support from: MSD, Consultant of: Abbvie, Lilly, Novartis, Sanofi Genzyme, Speakers bureau: Lilly, MSD, Novartis, Maarten de Wit Grant/research support from: Dr. de Wit reports personal fees from Eli Lilly, 2019, personal fees from Celgene, 2019, personal fees from Pfizer, 2019, personal fees from Janssen-Cilag, 2017, outside the submitted work., Consultant of: Dr. de Wit reports personal fees from Eli Lilly, 2019, personal fees from Celgene, 2019, personal fees from Pfizer, 2019, personal fees from Janssen-Cilag, 2017, outside the submitted work., Speakers bureau: Dr. de Wit reports personal fees from Eli Lilly, 2019, personal fees from Celgene, 2019, personal fees from Pfizer, 2019, personal fees from Janssen-Cilag, 2017, outside the submitted work., Richard Osborne Consultant of: Prof. Osborne is a paid consultant for pharma in the field of influenza and related infectious diseases., Roy Batterham: None declared, Rachelle Buchbinder: None declared, Annelies Boonen Grant/research support from: AbbVie, Consultant of: Galapagos, Lilly (all paid to the department)

