Analysis of Big Data in Healthcare and Life Sciences Using Hive and Spark

Author(s):  
A. Sai Hanuman ◽  
R. Soujanya ◽  
P. M. Madhuri
Keyword(s):  
Big Data ◽  
2014 ◽  
Vol 8 (4) ◽  
pp. 192-201 ◽  
Author(s):  
Hongyan Wu ◽  
Atsuko Yamaguchi

2018 ◽  
Vol 24 ◽  
pp. e912
Author(s):  
Sabrina K. Schulze ◽  
Živa Ramšak ◽  
Yen Hoang ◽  
Eftim Zdravevski ◽  
Juliane Pfeil ◽  
...  

On 6th and 7th February 2018, a Think Tank took place in Ljubljana, Slovenia. It was a follow-up of the “Big Data Training School for Life Sciences” held in Uppsala, Sweden, in September 2017. The focus was on identifying topics of interest and optimising the programme for a forthcoming “Advanced” Big Data Training School for Life Science, which we hope will again be supported by the COST Action CHARME (Harmonising standardisation strategies to increase efficiency and competitiveness of European life-science research - CA15110). The Think Tank aimed to go into the details of several topics that were, to a degree, covered by the former training school. Likewise, discussions embraced the recent experience of the attendees in light of the new knowledge obtained at the first edition of the training school and how it relates to their current and upcoming work. The 2018 training school should strive for and further facilitate optimised applications of Big Data technologies in life sciences. This workshop was organised entirely by the attendees of this hackathon.


2019 ◽  
Author(s):  
Serghei Mangul

Recent advances in omics technologies have led to the broad applicability of computational techniques across various domains of life science and medical research. These technologies provide an unprecedented opportunity to collect omics data from hundreds of thousands of individuals and to study gene-disease association without the aid of prior assumptions about the trait biology. Despite the many advantages of modern omics technologies, interpretations of big data produced by such technologies require advanced computational algorithms. Below I outline key challenges that biomedical researchers are facing when interpreting and integrating big omics data. I discuss the reproducibility aspect of big data analysis in the life sciences and review current practices in reproducible research. Finally, I explain the skills which biomedical researchers need to acquire in order to independently analyze big omics data.


2015 ◽  
Vol 9 ◽  
pp. BBI.S12467 ◽  
Author(s):  
Xiaoxi Dong ◽  
Anatoly Yambartsev ◽  
Stephen A. Ramsey ◽  
Lina D Thomas ◽  
Natalia Shulzhenko ◽  
...  

Omics technologies enable unbiased investigation of biological systems through massively parallel sequence acquisition or molecular measurements, bringing the life sciences into the era of Big Data. A central challenge posed by such omics datasets is how to transform these data into biological knowledge, for example, how to use these data to answer questions such as: Which functional pathways are involved in cell differentiation? Which genes should we target to stop cancer? Network analysis is a powerful and general approach to solving this problem and consists of two fundamental stages: network reconstruction and network interrogation. Here we provide an overview of network analysis, including a step-by-step guide on how to perform and use this approach to investigate a biological question. In this guide, we also include the software packages that we and others employ for each of the steps of a network analysis workflow.


2015 ◽  
Vol 26 (22) ◽  
pp. 3894-3897 ◽  
Author(s):  
Keith G. Kozminski

New scientific frontiers and emerging technologies within the life sciences pose many global challenges to society. Big Data is a premier example, especially with respect to individual, national, and international security. Here a Special Agent of the Federal Bureau of Investigation discusses the security implications of Big Data and the need for security in the life sciences.


Author(s):  
Katharina Frey ◽  
Alenka Hafner ◽  
Boas Pucker

The 'big data revolution' has enabled novel types of analyses in the life sciences, facilitated by public sharing and reuse of datasets. Here, we review the prodigious potential of reusing publicly available datasets and the challenges, limitations and risks associated with it. Due to the prominence, abundance and wide distribution of sequencing results, we focus on the reuse of publicly available sequence datasets. Through selected examples of successful reuse of different data (genome, transcriptome, proteome, metabolome, phenotype and ecosystem), with their respective limitations and risks, we illustrate the enormous potential of the practice. A checklist to determine the reuse value and potential of a particular dataset is also provided.


2017 ◽  
Author(s):  
Jonas Almeida, Ph.D. ◽  
Sean Clouston, Ph.D. ◽  
Gary LaFever ◽  
Ted Myerson ◽  
Sandeep Pulim, MD
Keyword(s):  
Big Data ◽  

2019 ◽  
Vol 3 (4) ◽  
pp. 335-341 ◽  
Author(s):  
Serghei Mangul

Recent advances in omics technologies have led to the broad applicability of computational techniques across various domains of life science and medical research. These technologies provide an unprecedented opportunity to collect the omics data from hundreds of thousands of individuals and to study the gene–disease association without the aid of prior assumptions about the trait biology. Despite the many advantages of modern omics technologies, interpretations of big data produced by such technologies require advanced computational algorithms. I outline key challenges that biomedical researchers are facing when interpreting and integrating big omics data. I discuss the reproducibility aspect of big data analysis in the life sciences and review current practices in reproducible research. Finally, I explain the skills that biomedical researchers need to acquire to independently analyze big omics data.

