A Beginner’s Guide on Integrating *Omics Approaches to Study Marine Microbial Communities: Details and Discussions From Sample Collection to Bioinformatics Analysis

Frontiers in Marine Science ◽

10.3389/fmars.2021.692538 ◽

2021 ◽

Vol 8 ◽

Author(s):

Sabrina Natalie Wilms

Keyword(s):

Single Cell ◽

Data Storage ◽

High Throughput Sequencing ◽

Marine Ecosystem ◽

Cloud Services ◽

Small Scale ◽

Species Differentiation ◽

Sequencing Data ◽

The World ◽

Microbial Organisms

The variety of Earth’s organisms is manifold. However, it is the small-scale marine community that makes the world go round. Microbial organisms of pro- and eukaryotic origin drive the carbon supply and nutrient cycling, and thus mediate primary productivity within the world’s largest ecosystem, the ocean. But due to the ocean’s great size and large number of biogeographical habitats, the total number of microbial species can hardly be grasped, and their functional roles are therefore not fully described. However, recent advances in high-throughput sequencing technologies are revolutionizing our understanding of marine microbial diversity, ecology and evolution. Nowadays, research questions on species differentiation can be addressed with genomic approaches such as metabarcoding, while transcriptomics offers the possibility to assign gene functions even to a single cell, e.g., single-cell transcriptomics. On the other hand, due to the sheer amount and diversity of sequencing data, a data crisis is emerging. Scientists are forced to broaden their view on bioinformatics resources for analysis and data storage in the form of, e.g., cloud services, to ensure the data’s exchangeability. As a result, time is now shifting toward solving data problems rather than answering the eco-evolutionary questions posed in the first place. This review is intended to provide an exchange on *omics approaches and key points for discussion on data handling used to decipher the relevant diversity and functions of microbial organisms in the marine ecosystem.

Fishing for a Career: Alternative Livelihoods and the Hardheaded Art of Academic Failure

Journal of Working-Class Studies ◽

10.13001/jwcs.v2i2.6101 ◽

2017 ◽

Vol 2 (2) ◽

pp. 155-167

Author(s):

Deb Cleland

Keyword(s):

East Asia ◽

Marine Ecosystem ◽

Academic Failure ◽

Work Life ◽

Small Scale ◽

South East Asia ◽

Alternative Livelihoods ◽

Alternative Livelihood ◽

The World ◽

History Of

Charting the course: The world of alternative livelihood research brings a heavy history of paternalistic colonial intervention and moralising. In particular, subsistence fishers in South East Asia are cyclical attractors of project funding to help them exit poverty and not ‘further degrade the marine ecosystem’ (Cinner et al. 2011), through leaving their boats behind and embarking on non-oceanic careers. What happens, then, when we turn an autoethnographic eye on the livelihood of the alternative livelihood researcher? What lexicons of lack and luck may we borrow from the fishers in order to ‘render articulate and more systematic those feelings of dissatisfaction’ (Young 2002) of an academic’s life’s work and our work-life? What might we learn from comparing small-scale fishers to small-scale scholars about how to successfully ‘navigate’ the casualised waters of the modern university? Does this unlikely course bring any ideas of ‘possibilities glimmering’ (Young 2002) for ‘exiting’ poverty in Academia?

PIRD: Pan Immune Repertoire Database

Bioinformatics ◽

10.1093/bioinformatics/btz614 ◽

2019 ◽

Cited By ~ 1

Author(s):

Wei Zhang ◽

Longlong Wang ◽

Ke Liu ◽

Xiaofeng Wei ◽

Kai Yang ◽

...

Keyword(s):

Data Storage ◽

High Throughput Sequencing ◽

Homo Sapiens ◽

Adaptive Immune System ◽

B Cell Receptors ◽

Sequencing Data ◽

Immune Repertoire ◽

Adaptive Immune ◽

Potential Applications ◽

Antibody Drug

Motivation: T and B cell receptors (TCRs and BCRs) play a pivotal role in the adaptive immune system by recognizing an enormous variety of external and internal antigens. Understanding these receptors is critical for exploring the process of immunoreaction and exploiting potential applications in immunotherapy and antibody drug design. Although a large number of samples have had their TCR and BCR repertoires sequenced using high-throughput sequencing in recent years, very few databases have been constructed to store these kinds of data. To resolve this issue, we developed a database. Results: We developed a database, the Pan Immune Repertoire Database (PIRD), located in China National GeneBank (CNGBdb), to collect and store annotated TCR and BCR sequencing data, including from Homo sapiens and other species. In addition to data storage, PIRD also provides functions of data visualization and interactive online analysis. Additionally, a manually curated database of TCRs and BCRs targeting known antigens (TBAdb) was also deposited in PIRD. Availability and implementation: PIRD can be freely accessed at https://db.cngb.org/pird.

Exploring cell-specific miRNA regulation with single-cell miRNA-mRNA co-sequencing data

10.1101/2020.10.14.340299 ◽

2020 ◽

Author(s):

Junpeng Zhang ◽

Lin Liu ◽

Taosheng Xu ◽

Wu Zhang ◽

Chunwen Zhao ◽

...

Keyword(s):

Single Cell ◽

Regulatory Networks ◽

Single Cells ◽

Small Scale ◽

Mirna Regulation ◽

Sequencing Data ◽

Resolution Level ◽

Novel Strategy ◽

Cell Cell

Background: Existing computational methods for studying miRNA regulation are mostly based on bulk miRNA and mRNA expression data. However, bulk data only allows the analysis of miRNA regulation regarding a group of cells, rather than the miRNA regulation unique to individual cells. Recent advance in single-cell miRNA-mRNA co-sequencing technology has opened a way for investigating miRNA regulation at single-cell level. However, as currently single-cell miRNA-mRNA co-sequencing data is just emerging and only available at small-scale, there is a strong need of novel methods to exploit existing single-cell data for the study of cell-specific miRNA regulation. Results: In this work, we propose a new method, CSmiR (Cell-Specific miRNA regulation) to use single-cell miRNA-mRNA co-sequencing data to identify miRNA regulatory networks at the resolution of individual cells. We apply CSmiR to the miRNA-mRNA co-sequencing data in 19 K562 single-cells to identify cell-specific miRNA-mRNA regulatory networks to understand miRNA regulation in each K562 single-cell. By analyzing the obtained cell-specific miRNA-mRNA regulatory networks, we observe that the miRNA regulation in each K562 single-cell is unique. Moreover, we conduct detailed analysis on the cell-specific miRNA regulation associated with the miR-17/92 family as a case study. Finally, through exploring cell-cell similarity matrix characterized by cell-specific miRNA regulation, CSmiR provides a novel strategy for clustering single-cells to help understand cell-cell crosstalk. Conclusions: To the best of our knowledge, CSmiR is the first method to explore miRNA regulation at a single-cell resolution level, and we believe that it can be a useful method to enhance the understanding of cell-specific miRNA regulation.

Exploring cell-specific miRNA regulation with single-cell miRNA-mRNA co-sequencing data

BMC Bioinformatics ◽

10.1186/s12859-021-04498-6 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Junpeng Zhang ◽

Lin Liu ◽

Taosheng Xu ◽

Wu Zhang ◽

Chunwen Zhao ◽

...

Keyword(s):

Single Cell ◽

Regulatory Networks ◽

Single Cells ◽

Small Scale ◽

Mirna Regulation ◽

Sequencing Data ◽

Comparison Results ◽

Resolution Level ◽

Novel Strategy ◽

Cell Cell

Background: Existing computational methods for studying miRNA regulation are mostly based on bulk miRNA and mRNA expression data. However, bulk data only allows the analysis of miRNA regulation regarding a group of cells, rather than the miRNA regulation unique to individual cells. Recent advance in single-cell miRNA-mRNA co-sequencing technology has opened a way for investigating miRNA regulation at single-cell level. However, as currently single-cell miRNA-mRNA co-sequencing data is just emerging and only available at small-scale, there is a strong need of novel methods to exploit existing single-cell data for the study of cell-specific miRNA regulation. Results: In this work, we propose a new method, CSmiR (Cell-Specific miRNA regulation) to combine single-cell miRNA-mRNA co-sequencing data and putative miRNA-mRNA binding information to identify miRNA regulatory networks at the resolution of individual cells. We apply CSmiR to the miRNA-mRNA co-sequencing data in 19 K562 single-cells to identify cell-specific miRNA-mRNA regulatory networks for understanding miRNA regulation in each K562 single-cell. By analyzing the obtained cell-specific miRNA-mRNA regulatory networks, we observe that the miRNA regulation in each K562 single-cell is unique. Moreover, we conduct detailed analysis on the cell-specific miRNA regulation associated with the miR-17/92 family as a case study. The comparison results indicate that CSmiR is effective in predicting cell-specific miRNA targets. Finally, through exploring cell–cell similarity matrix characterized by cell-specific miRNA regulation, CSmiR provides a novel strategy for clustering single-cells and helps to understand cell–cell crosstalk. Conclusions: To the best of our knowledge, CSmiR is the first method to explore miRNA regulation at a single-cell resolution level, and we believe that it can be a useful method to enhance the understanding of cell-specific miRNA regulation.

A Scalable Strand-Specific Protocol Enabling Full-Length Total RNA Sequencing From Single Cells

Frontiers in Genetics ◽

10.3389/fgene.2021.665888 ◽

2021 ◽

Vol 12 ◽

Author(s):

Simon Haile ◽

Richard D. Corbett ◽

Veronique G. LeBlanc ◽

Lisa Wei ◽

Stephen Pleasance ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

Rna Sequencing ◽

High Throughput Sequencing ◽

Single Cells ◽

Cell Types ◽

Full Length ◽

Sequencing Data ◽

Total Rna ◽

Specific Protocol

RNA sequencing (RNAseq) has been widely used to generate bulk gene expression measurements collected from pools of cells. Only relatively recently have single-cell RNAseq (scRNAseq) methods provided opportunities for gene expression analyses at the single-cell level, allowing researchers to study heterogeneous mixtures of cells at unprecedented resolution. Tumors tend to be composed of heterogeneous cellular mixtures and are frequently the subjects of such analyses. Extensive method developments have led to several protocols for scRNAseq but, owing to the small amounts of RNA in single cells, technical constraints have required compromises. For example, the majority of scRNAseq methods are limited to sequencing only the 3′ or 5′ termini of transcripts. Other protocols that facilitate full-length transcript profiling tend to capture only polyadenylated mRNAs and are generally limited to processing only 96 cells at a time. Here, we address these limitations and present a novel protocol that allows for the high-throughput sequencing of full-length, total RNA at single-cell resolution. We demonstrate that our method produced strand-specific sequencing data for both polyadenylated and non-polyadenylated transcripts, enabled the profiling of transcript regions beyond only transcript termini, and yielded data rich enough to allow identification of cell types from heterogeneous biological samples.

Broom: Application for non-redundant storage of High Throughput Sequencing data

10.1101/312306 ◽

2018 ◽

Author(s):

Levent Albayrak ◽

Kamil Khanipov ◽

George Golovko ◽

Yuriy Fofanov

Keyword(s):

Data Storage ◽

High Throughput ◽

High Throughput Sequencing ◽

Data Generation ◽

Sequencing Data ◽

High Throughput Sequencing Data ◽

Sequencing Quality ◽

Redundant Storage ◽

Recent Trends ◽

The Cost

Motivation: The data generation capabilities of High Throughput Sequencing (HTS) instruments have exponentially increased over the last few years, while the cost of sequencing has dramatically decreased, allowing this technology to become widely used in biomedical studies. For small labs and individual researchers, however, storage and transfer of large amounts of HTS data present a significant challenge. The recent trends in increased sequencing quality and genome coverage can be used to reconsider HTS data storage strategies. Results: We present Broom, a stand-alone application designed to select and store only high-quality sequencing reads at extremely high compression rates. Written in C++, the application accepts single and paired-end reads in FASTQ and FASTA formats and decompresses data in FASTA format. Availability: C++ code available at https://scsb.utmb.edu/labgroups/fofanov/
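
The idea of keeping only high-quality reads and emitting them as FASTA can be illustrated with a minimal Python sketch. This is not Broom's code or its selection rule, which the abstract does not describe: the mean-Phred threshold, the Phred+33 encoding, and the function names `parse_fastq`, `mean_phred` and `to_fasta` are all assumptions made for illustration, and the compression step is omitted.

```python
from typing import Iterable, Iterator, List, Tuple

Read = Tuple[str, str, str]  # (header, sequence, quality string)


def parse_fastq(lines: List[str]) -> Iterator[Read]:
    """Yield (header, sequence, quality) from consecutive 4-line FASTQ records."""
    for i in range(0, len(lines) - 3, 4):
        header, seq, _plus, qual = (s.strip() for s in lines[i:i + 4])
        yield header[1:], seq, qual  # drop the leading '@'


def mean_phred(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a read, assuming Phred+33 quality encoding."""
    return sum(ord(c) - offset for c in qual) / len(qual)


def to_fasta(reads: Iterable[Read], min_mean_quality: float = 30.0) -> str:
    """Keep reads whose mean quality meets the (hypothetical) threshold
    and return them as FASTA text; quality strings are discarded."""
    kept = [
        f">{header}\n{seq}"
        for header, seq, qual in reads
        if qual and mean_phred(qual) >= min_mean_quality
    ]
    return "\n".join(kept)


if __name__ == "__main__":
    records = ["@r1", "ACGT", "+", "IIII", "@r2", "ACGT", "+", "!!!!"]
    print(to_fasta(parse_fastq(records)))
```

Dropping the quality strings once a read has passed the filter is what makes a FASTA-only output plausible: the stored sequence no longer needs per-base scores.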

Norwegian e-Infrastructure for Life Sciences (NeLS)

F1000Research ◽

10.12688/f1000research.15119.1 ◽

2018 ◽

Vol 7 ◽

pp. 968 ◽

Cited By ~ 2

Author(s):

Kidane M. Tekle ◽

Sveinung Gundersen ◽

Kjetil Klepper ◽

Lars Ailo Bongo ◽

Inge Alexander Raknes ◽

...

Keyword(s):

Data Storage ◽

High Throughput Sequencing ◽

Life Sciences ◽

Large Data ◽

Sequencing Data ◽

Sensitive Data ◽

Help Desk ◽

Web Interfaces ◽

The Galaxy ◽

University Of Oslo

The Norwegian e-Infrastructure for Life Sciences (NeLS) has been developed by ELIXIR Norway to provide its users with a system enabling data storage, sharing, and analysis in a project-oriented fashion. The system is available through easy-to-use web interfaces, including the Galaxy workbench for data analysis and workflow execution. Users confident with a command-line interface and programming may also access it through Secure Shell (SSH) and application programming interfaces (APIs). NeLS has been in production since 2015, with training and support provided by the help desk of ELIXIR Norway. Through collaboration with NorSeq, the national consortium for high-throughput sequencing, an integrated service is offered so that sequencing data generated in a research project is provided to the involved researchers through NeLS. Sensitive data, such as individual genomic sequencing data, are handled using the TSD (Services for Sensitive Data) platform provided by Sigma2 and the University of Oslo. NeLS integrates national e-infrastructure storage and computing resources, and is also integrated with the SEEK platform in order to store large data files produced by experiments described in SEEK. In this article, we outline the architecture of NeLS and discuss possible directions for further development.

PanClassif: Improving pan cancer classification of single cell RNA-seq using machine learning

10.1101/2021.04.10.439266 ◽

2021 ◽

Author(s):

Kazi Ferdous Mahin ◽

Md. Robiuddin ◽

Mujahidul Islam ◽

Shayed Ashraf ◽

Farjana Yeasmin ◽

...

Keyword(s):

Machine Learning ◽

Single Cell ◽

High Throughput Sequencing ◽

Nearest Neighbor ◽

Cancer Classification ◽

The Cancer Genome Atlas ◽

Supplementary Information ◽

Rna Seq ◽

Sequencing Data ◽

Wide Range

Motivation: Cancer is one of the major causes of human death each year. In recent years, cancer identification and classification using machine learning have gained momentum due to the availability of high-throughput sequencing data. With RNA-seq, cancer research is advancing rapidly, and new insights into cancer and related treatments are coming to light. Results: In this paper, we propose PanClassif, a method that requires only a few effective genes to detect cancer from RNA-seq data and is able to provide a performance gain across a wide range of machine learning classifiers. We have taken 22 types of cancer samples from The Cancer Genome Atlas (TCGA), comprising 8287 cancer samples and 680 normal samples. First, PanClassif uses k-Nearest Neighbor (k-NN) smoothing to smooth the samples and handle noise in the data. Then effective genes are selected by an ANOVA-based test. To balance the training data, PanClassif applies an oversampling method, SMOTE. We have performed comprehensive experiments on the datasets using several classification algorithms. Experimental results show that PanClassif outperforms existing state-of-the-art methods and shows consistent performance on two single-cell RNA-seq datasets taken from Gene Expression Omnibus (GEO). PanClassif improves the performance of a wide variety of classifiers for both binary cancer prediction and multi-class cancer classification. Availability: PanClassif is available as a Python package (https://pypi.org/project/panclassif/). Supplementary information: Supplementary data are available online.
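
The three preprocessing steps named in the abstract (k-NN smoothing, ANOVA-based gene selection, SMOTE oversampling) can be sketched in plain Python. This does not use or reproduce the `panclassif` package: every function name and parameter below is hypothetical, the smoothing rule (averaging a sample with its k nearest neighbours) is an assumed simplification, and `smote_like` interpolates only toward the single nearest minority neighbour, whereas SMOTE proper picks among several.

```python
import math
import random
from typing import List, Sequence

Matrix = List[List[float]]  # rows = samples, columns = genes


def knn_smooth(X: Matrix, k: int = 2) -> Matrix:
    """Replace each sample by the mean of itself and its k nearest neighbours."""
    smoothed = []
    for row in X:
        nearest = sorted(X, key=lambda other: math.dist(row, other))[:k + 1]
        smoothed.append([sum(col) / len(nearest) for col in zip(*nearest)])
    return smoothed


def anova_f(values: Sequence[float], labels: Sequence[int]) -> float:
    """One-way ANOVA F statistic of one gene (needs >= 2 classes, n > classes)."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, g = len(values), len(groups)
    grand = sum(values) / n
    between = sum(len(vs) * (sum(vs) / len(vs) - grand) ** 2 for vs in groups.values())
    within = sum((v - sum(vs) / len(vs)) ** 2 for vs in groups.values() for v in vs)
    if within == 0:
        return math.inf if between > 0 else 0.0
    return (between / (g - 1)) / (within / (n - g))


def select_genes(X: Matrix, labels: Sequence[int], n_genes: int) -> List[int]:
    """Indices of the n_genes columns with the largest F statistic."""
    scores = [anova_f(col, labels) for col in zip(*X)]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:n_genes]


def smote_like(minority: Matrix, n_new: int, seed: int = 0) -> Matrix:
    """Synthesize samples on the segment between a random minority sample
    and its nearest minority neighbour (needs >= 2 minority samples)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a = rng.choice(minority)
        b = min((m for m in minority if m is not a), key=lambda m: math.dist(a, m))
        t = rng.random()
        synthetic.append([x + t * (y - x) for x, y in zip(a, b)])
    return synthetic


if __name__ == "__main__":
    X = [[1.0, 5.0], [2.0, 5.1], [3.0, 4.9], [4.0, 5.0], [5.0, 5.1], [6.0, 4.9]]
    y = [0, 0, 0, 1, 1, 1]
    print(select_genes(X, y, n_genes=1))  # gene 0 separates the two classes
```

In practice one would more likely reach for established library implementations of these steps; the stdlib version is used here only to keep the sketch self-contained.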

PIRD: Pan immune repertoire database

10.1101/399493 ◽

2018 ◽

Author(s):

Wei Zhang ◽

Longlong Wang ◽

Ke Liu ◽

Xiaofeng Wei ◽

Kai Yang ◽

...

Keyword(s):

Data Storage ◽

High Throughput Sequencing ◽

Homo Sapiens ◽

Adaptive Immune System ◽

B Cell Receptors ◽

Data Visualisation ◽

Sequencing Data ◽

Immune Repertoire ◽

Adaptive Immune ◽

Potential Applications

Motivation: T and B cell receptors (TCRs and BCRs) play a pivotal role in the adaptive immune system by recognizing an enormous variety of external and internal antigens. Understanding these receptors is critical for exploring the process of immunoreaction and exploiting potential applications in immunotherapy and antibody drug design. Although a large number of samples have had their TCR and BCR repertoires sequenced using high-throughput sequencing in recent years, very few databases have been constructed to store these kinds of data. To resolve this issue, we developed a database. Results: We developed a database, the Pan Immune Repertoire Database (PIRD), located in China National GeneBank (CNGBdb), to collect and store annotated TCR and BCR sequencing data, including from Homo sapiens and other species. In addition to data storage, PIRD also provides functions of data visualisation and interactive online analysis. Additionally, a manually curated database of TCRs and BCRs targeting known antigens (TBAdb) was also deposited in PIRD. Availability and implementation: PIRD can be freely accessed at https://db.cngb.org/pird.

Implementation of digital economy elements in electric power industry

Safety and Reliability of Power Industry ◽

10.24223/1999-5555-2018-11-2-94-102 ◽

2018 ◽

Vol 11 (2) ◽

pp. 94-102 ◽

Cited By ~ 1

Author(s):

A. G. Filimonov ◽

N. D. Chichirova ◽

A. A. Chichirov ◽

A. A. Filimonova

Keyword(s):

Electric Power ◽

Energy Production ◽

Predictive Analytics ◽

Digital Economy ◽

Digital Transformation ◽

Cloud Services ◽

Small Scale ◽

Economic Activities ◽

Industrial Enterprises ◽

Automation Systems

Energy generation, along with other sectors of Russia’s economy, is on the cusp of the era of digital transformation. Modern IT solutions ensure the transition of industrial enterprises from automation and computerization, which were the targets of the second half of the last century, to the digital enterprise concept 4.0. The international record of technological and structural solutions in digitization may be used in Russia’s energy sector to the full extent. The specifics of implementing such systems in different countries are determined only by the level of economic development of each particular state and by the attitude of public authorities toward creating the conditions for their implementation. It is shown that a strong legislative framework has been created in Russia for the transition to the digital economy, with research and applied developments available that are up to the international level. The following digital economy elements may be used today at enterprises producing electrical and thermal energy: (1) dealing with large amounts of data (including operations performed via cloud services and distributed databases); (2) development of small-scale distributed generation and its dispatching; (3) implementation of smart elements in both electric power and heat supply networks; (4) development of production process automation systems, remote monitoring and predictive analytics; 3D modeling of parts and elements; real-time mathematical simulation with feedback in the form of control actions; (5) creation of centres for analytical processing of statistical data and accounting in financial and economic activities with business analytics functions, with expansion of communication networks and computing capacities. Examples are presented of the implementation of smart systems in energy production and distribution. The paper states that state-of-the-art information technologies are currently being implemented in Russia and that new, unique digital transformation projects are being launched in major energy companies. Yet what is required is large-scale and thorough digitization, so that a controllable energy production system, treated as a multi-factor business process, provides the optimum combination of efficient economic activities, reliability and safety of power supply.
