miND pipeline AWS EC2 installation and setup v2

Author(s):  
Andreas B Diendorfer ◽  
Kseniya Khamina ◽  
Marianne Pultar

miND is an NGS data analysis pipeline for small RNA sequencing data. In this protocol, the pipeline is set up and run on an AWS EC2 instance with example data from a public repository. Please see the publication on F1000 for more details on the pipeline and how to use it.

2021 ◽  


2019 ◽  
Author(s):  
Anna James-Bott ◽  
Adam P. Cribbs

Abstract: Many tools have been developed to analyse small RNA sequencing data; however, it remains challenging to accurately process reads aligning to small RNAs because of their short read length. Most pipelines have been developed with miRNA analysis in mind, and there are currently very few workflows focused on the analysis of transfer RNAs. Moreover, these workflows are low throughput, difficult to install, and lack sufficient visualisation to make the output interpretable. To address these issues, we have built a comprehensive and customisable small RNA-seq data analysis pipeline, with emphasis on the analysis of tRNAs. The pipeline takes as input a fastq file of small RNA sequencing reads and performs successive steps of mapping and alignment to transposable elements, gene transcripts, miRNAs, snRNAs, rRNA and tRNAs. Subsequent steps then generate summary statistics on reads of tRNA origin, which are visualised in an HTML report. Unlike other low-throughput analysis tools currently available, our high-throughput method allows for the simultaneous analysis of multiple samples and scales with the number of input files. tRNAnalysis is runnable from the command line and is implemented predominantly in Python and R. The source code is available at https://github.com/Acribbs/tRNAnalysis.


PLoS ONE ◽  
2017 ◽  
Vol 12 (2) ◽  
pp. e0171983 ◽  
Author(s):  
Sarah Sandmann ◽  
Aniek O. de Graaf ◽  
Bert A. van der Reijden ◽  
Joop H. Jansen ◽  
Martin Dugas

2017 ◽  
Vol 45 (21) ◽  
pp. 12140-12151 ◽  
Author(s):  
Xiaogang Wu ◽  
Taek-Kyun Kim ◽  
David Baxter ◽  
Kelsey Scherler ◽  
Aaron Gordon ◽  
...  

2019 ◽  
Vol 24 (3) ◽  
pp. 213-223 ◽  
Author(s):  
Raimo Franke ◽  
Bettina Hinkelmann ◽  
Verena Fetz ◽  
Theresia Stradal ◽  
Florenz Sasse ◽  
...  

Mode of action (MoA) identification of bioactive compounds is very often a challenging and time-consuming task. We used a label-free kinetic profiling method based on an impedance readout to monitor the time-dependent cellular response profiles for the interaction of bioactive natural products and other small molecules with mammalian cells. Such approaches have rarely been used so far because of the lack of data mining tools that properly capture the characteristics of the impedance curves. We developed a data analysis pipeline for the xCELLigence Real-Time Cell Analysis detection platform to process the data, assess and score their reproducibility, and provide rank-based MoA predictions for a reference set of 60 bioactive compounds. The method can reveal additional, previously unknown targets, as exemplified by the identification of tubulin-destabilizing activities of the RNA synthesis inhibitor actinomycin D and the effects on DNA replication of vioprolide A. The data analysis pipeline is based on the statistical programming language R and is available to the scientific community through a GitHub repository.


2021 ◽  
Vol 36 (Supplement_1) ◽  
Author(s):  
A Biricik ◽  
V Bianchi ◽  
F Lecciso ◽  
M Surdo ◽  
M Manno ◽  
...  

Abstract

Study question: To explore ploidy concordance between invasive PGT-A and non-invasive PGT-A (niPGT-A) at different embryo culture times.

Summary answer: A high concordance rate (>84%) for ploidy and sex, with sensitivity (>88%) and specificity (76%), was obtained for both day 6/7 samples and day 5 samples.

What is known already: The analysis of embryo cell-free DNA (cfDNA) released into culture media during in vitro embryo development has the potential to evaluate embryo ploidy status. However, obtaining cfDNA of sufficient quality and quantity is essential to achieve interpretable results for niPGT-A. Longer culture time is expected to be directly proportional to the release of more cfDNA, but embryo culture time is limited by in vitro embryo survival potential. It is therefore important to estimate the culture duration that provides the maximum cfDNA obtainable without adversely affecting the development of the embryo.

Study design, size, duration: A total of 105 spent culture media (SCM) samples from day 5 to day 7 blastocyst-stage embryos were included in this cohort study. The cfDNA of the SCM samples was amplified and analyzed for niPGT-A by NGS. The SCM samples were divided into two subgroups according to the embryo culture hours (Day 5 group and Day 6/7 group). The DNA concentration, informativity and euploidy results were then compared with those of the corresponding embryos after trophectoderm (TE) biopsy and PGT-A analysis by NGS.

Participants/materials, setting, methods: Embryos cultured until Day 3 were washed and cultured again in 20 µl of fresh culture media until embryo biopsy on Day 5, 6, or 7. After biopsy, SCM samples were immediately collected in PCR tubes and stored at –20 °C until whole genome amplification by MALBAC® (Yicon Genomics). The TE and SCM samples were analyzed by next-generation sequencing (NGS) using the Illumina MiSeq® System. NGS data analysis was done with Bluefuse Multi Software 4.5 (Illumina) for SCM and TE samples.

Main results and the role of chance: Only the SCM samples whose embryo had a conclusive result were included in this cohort (n = 105). Overall, 97.1% (102/105) of SCM samples gave a successful DNA amplification, with a concentration ranging from 32.4 to 128.5 ng/µl. Non-informative (NI) results, including a chaotic profile (>5 chromosome aneuploidies), were observed in 17 samples, so 83.3% (85/102) of SCM samples were informative for NGS data analysis. The ploidy concordance rate with the corresponding TE biopsies (euploid vs euploid, aneuploid vs aneuploid) was 84.7% (72/85). Sensitivity and specificity were 92.8% and 76.7%, respectively, with no significant difference in any parameter between day 6/7 samples and day 5 samples. The false-negative rate was 3.5% (3/85), and the false-positive rate was 11.7% (10/85).

Limitations, reasons for caution: The sample size is relatively small, and larger prospective studies are needed. As this is a single-center study, the impact of variations in embryo culture conditions may be underestimated. Maternal DNA contamination cannot be detected in SCM, so the use of molecular markers would increase reliability.

Wider implications of the findings: Non-invasive analysis of embryo cfDNA in spent culture media demonstrates high concordance with TE biopsy results at both early and late culture times. A non-invasive approach to aneuploidy screening offers important advantages, such as avoiding invasive embryo biopsy and decreasing cost, potentially increasing accessibility for a wider patient population.

Trial registration number: Not applicable
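
The rates reported above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the sample counts (105, 102, 85, 72, 3, 10) come from the abstract, but the true-positive and true-negative counts (39 and 33) are not stated there. They are an assumption, inferred as the split of the 72 concordant samples that agrees with the reported sensitivity and specificity.

```python
# Counts taken directly from the abstract.
total_scm = 105        # SCM samples included in the cohort
amplified = 102        # samples with successful DNA amplification
informative = 85       # samples informative for NGS analysis
concordant = 72        # ploidy calls agreeing with the TE biopsy
false_negatives = 3
false_positives = 10

# ASSUMPTION: these two counts are NOT given in the abstract. They are
# inferred so that TP + TN = 72 and the rates match the reported values.
true_positives = 39    # aneuploid in both SCM and TE (inferred)
true_negatives = 33    # euploid in both SCM and TE (inferred)


def pct(numerator, denominator):
    """Return numerator / denominator as a percentage."""
    return 100.0 * numerator / denominator


amplification_rate = pct(amplified, total_scm)
informative_rate = pct(informative, amplified)
concordance = pct(concordant, informative)
sensitivity = pct(true_positives, true_positives + false_negatives)
specificity = pct(true_negatives, true_negatives + false_positives)
fn_rate = pct(false_negatives, informative)
fp_rate = pct(false_positives, informative)

for name, value in [
    ("amplification rate", amplification_rate),
    ("informative rate", informative_rate),
    ("concordance", concordance),
    ("sensitivity", sensitivity),
    ("specificity", specificity),
    ("false-negative rate", fn_rate),
    ("false-positive rate", fp_rate),
]:
    print(f"{name}: {value:.2f}%")
```

Under these assumed counts, every computed rate falls within 0.1 percentage point of the figure in the abstract; the small differences in the last digit are consistent with the abstract truncating rather than rounding.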


2017 ◽  
Author(s):  
Baekdoo Kim ◽  
Thahmina Ali ◽  
Carlos Lijeron ◽  
Enis Afgan ◽  
Konstantinos Krampis

Abstract

Background: Processing of Next-Generation Sequencing (NGS) data requires significant technical skills, involving installation, configuration, and execution of bioinformatics data pipelines, in addition to specialized post-analysis visualization and data mining software. To address some of these challenges, developers have leveraged virtualization containers for seamless deployment of preconfigured bioinformatics software and pipelines on any computational platform.

Findings: We present an approach for abstracting the complex data operations of multi-step bioinformatics pipelines for NGS data analysis. As examples, we have deployed two pipelines for RNAseq and CHIPseq, pre-configured within Docker virtualization containers we call Bio-Docklets. Each Bio-Docklet exposes a single data input and output endpoint, and from a user perspective, running the pipelines is as simple as running a single bioinformatics tool. This is achieved through a “meta-script” that automatically starts the Bio-Docklets and controls the pipeline execution through the BioBlend software library and the Galaxy Application Programming Interface (API). The pipeline output is post-processed using the Visual Omics Explorer (VOE) framework, providing interactive data visualizations that users can access through a web browser.

Conclusions: The goal of our approach is to enable easy access to NGS data analysis pipelines for non-bioinformatics experts on any computing environment, whether a laboratory workstation, university computer cluster, or a cloud service provider. Besides end-users, the Bio-Docklets also enable developers to programmatically deploy and run a large number of pipeline instances for concurrent analysis of multiple datasets.

