Large Scale Advanced Data Analytics on Skin Conditions from Genotype to Phenotype

Informatics ◽  
2018 ◽  
Vol 5 (4) ◽  
pp. 39
Author(s):  
Maryam Panahiazar ◽  
Darya Fadavi ◽  
Jihad Aljabban ◽  
Laraib Safeer ◽  
Imad Aljabban ◽  
...  

A crucial factor in Big Data is taking advantage of available data and using it for new discovery or hypothesis generation. In this study, we analyzed large-scale data, from the literature to OMICS sources such as the genome, proteome, and metabolome, for skin conditions. Skin acts as a natural barrier to the world around us, protects our body from various conditions, viruses, and bacteria, and plays a large part in appearance. We included Hyperpigmentation, Postinflammatory Hyperpigmentation, Melasma, Rosacea, Actinic keratosis, and Pigmentation in this study. These conditions were selected based on large-scale UCSF patient data covering 527,273 females from 2011 to 2017, and on related publications from 2000 to 2017 regarding skin conditions. The selected conditions were confirmed with experts in the field from different research centers and hospitals. We proposed a novel framework for large-scale available public data to find the common genotypes and phenotypes of different skin conditions. The outcome of this study, based on Advanced Data Analytics, provides information on skin conditions and their treatments to the research community and introduces new hypotheses for possible genotype and phenotype targets. The novelty of this work is a meta-analysis of different features across different skin conditions. Instead of looking at individual conditions with one or two features, as most previous work has done, we looked at several conditions with different features to find the common factors between them. Our hypothesis is that by finding the overlap in genotype and phenotype between different skin conditions, we can suggest using a drug recommended for one condition to treat another condition that shares similar genes or other common phenotypes. We identified common genes between these skin conditions and were able to find common areas for targeting between conditions, such as common drugs. 
Our work has implications for discovery and new hypotheses to improve health quality, and is geared towards making Big Data useful.
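The overlap-driven hypothesis above can be sketched in a few lines. This is a minimal illustration, assuming hypothetical gene sets per condition (the names below are placeholders, not the study's actual data): pairwise set intersection surfaces the shared genes that could motivate cross-condition drug repurposing.

```python
from itertools import combinations

# Hypothetical gene sets per condition (illustrative placeholders only,
# not the genes identified in the study).
condition_genes = {
    "Melasma": {"TYR", "MITF", "KIT"},
    "Rosacea": {"KIT", "TLR2", "KLK5"},
    "Actinic keratosis": {"TP53", "KIT", "TLR2"},
}

# For every pair of conditions, report the genes they share.
for a, b in combinations(condition_genes, 2):
    shared = condition_genes[a] & condition_genes[b]
    if shared:
        print(f"{a} / {b}: shared genes {sorted(shared)}")
```

In practice the same intersection logic would run over gene and phenotype annotations mined from the literature and OMICS sources, with the shared entries flagged as candidate targets.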

2021 ◽  
Author(s):  
R. Salter ◽  
Quyen Dong ◽  
Cody Coleman ◽  
Maria Seale ◽  
Alicia Ruvinsky ◽  
...  

The Engineer Research and Development Center, Information Technology Laboratory’s (ERDC-ITL’s) Big Data Analytics team specializes in the analysis of large-scale datasets, with capabilities across four research areas that require vast amounts of data to inform and drive analysis: large-scale data governance, deep learning and machine learning, natural language processing, and automated data labeling. Unfortunately, data transfer between government organizations is a complex and time-consuming process requiring coordination of multiple parties across multiple offices and organizations. Past successes in large-scale data analytics have placed a significant demand on ERDC-ITL researchers, highlighting that few individuals fully understand how to successfully transfer data between government organizations; future project success therefore depends on a small group of individuals efficiently executing a complicated process. The Big Data Analytics team set out to develop a standardized workflow for the transfer of large-scale datasets to ERDC-ITL, in part to educate peers and future collaborators on the process required to transfer datasets between government organizations. Researchers also aim to increase workflow efficiency while protecting data integrity. This report provides an overview of the resulting Data Lake Ecosystem Workflow, focusing on the six phases required to efficiently transfer large datasets to supercomputing resources located at ERDC-ITL.


Web Services ◽  
2019 ◽  
pp. 1706-1716
Author(s):  
S. ZerAfshan Goher ◽  
Barkha Javed ◽  
Peter Bloodsworth

Due to the growing interest in harnessing the hidden significance of data, more and more enterprises are moving to data analytics. Data analytics requires the analysis and management of large-scale data to find hidden patterns among various data components and gain useful insight. The derived information is then used to predict future trends that can help a business flourish, such as customers' likes and dislikes and the reasons behind customer churn. In this paper, several techniques for big data analysis are investigated along with their advantages and disadvantages. The significance of cloud computing for big data storage is also discussed. Finally, techniques for making robust and efficient use of big data are discussed.


2017 ◽  
Vol 37 (1) ◽  
pp. 56-74 ◽  
Author(s):  
Thomas Kude ◽  
Hartmut Hoehle ◽  
Tracy Ann Sykes

Purpose Big Data Analytics provides a multitude of opportunities for organizations to improve service operations, but it also increases the threat of external parties gaining unauthorized access to sensitive customer data. With data breaches now a common occurrence, it is becoming increasingly clear that while modern organizations need measures to try to prevent breaches, they must also have processes to deal with a breach once it occurs. Prior research on information technology security and service failures suggests that customer compensation can potentially restore customer sentiment after such data breaches. The paper aims to discuss these issues. Design/methodology/approach In this study, the authors draw on the literature on personality traits and social influence to better understand the antecedents of perceived compensation and the effectiveness of compensation strategies. The authors studied the propositions using data collected in the context of Target’s large-scale data breach of December 2013, which affected the personal data of more than 70 million customers. In total, the authors collected data from 212 breached customers. Findings The results show that customers’ personality traits and their social environment significantly influence their perceptions of compensation. The authors also found that perceived compensation positively influences service recovery and customer experience. Originality/value The results add to the emerging literature on Big Data Analytics and will help organizations manage compensation strategies more effectively in large-scale data breaches.


Author(s):  
Sadaf Afrashteh ◽  
Ida Someh ◽  
Michael Davern

Big data analytics uses algorithms for decision-making and targeting of customers. These algorithms process large-scale data sets and create efficiencies in the decision-making process for organizations, but they are often incomprehensible to customers and inherently opaque. Recent European Union regulations require that organizations communicate meaningful information to customers on the use of algorithms and the reasons behind decisions made about them. In this paper, we explore the use of explanations in big data analytics services. We rely on discourse ethics to argue that explanations can facilitate a balanced communication between organizations and customers, leading to transparency and trust for customers as well as customer engagement and reduced reputation risks for organizations. We conclude the paper by proposing future empirical research directions.


2021 ◽  
pp. 1-7
Author(s):  
Emmanuel Jesse Amadosi

With rapid development in technology, the built industry’s capacity to generate large-scale data is not in doubt. This trend of data upsurge, labelled “Big Data”, is currently being used to seek intelligent solutions in many industries, including construction. As a result, the appeal to embrace Big Data Analytics has gained wide advocacy globally. However, the general knowledge of Nigeria’s built-environment professionals about Big Data Analytics is still limited, and this gap continues to account for the slow adoption of digital technologies like Big Data Analytics and the value it offers. This study set out to assess the level of awareness and knowledge of professionals within the Nigerian built environment, with a view to promoting the adoption of Big Data Analytics for improved productivity. To achieve this aim, a structured questionnaire survey was carried out among 283 professionals drawn from 9 disciplines within the built environment in the Federal Capital Territory, Abuja. The findings revealed that: a) a low level of knowledge of Big Data exists among professionals; b) knowledge among professionals and the level of Big Data Analytics application have a strong relationship; c) professionals are interested in knowing more about the Big Data concept and how Big Data Analytics can be leveraged. The study therefore recommends an urgent paradigm shift towards digitisation to fully embrace and adopt Big Data Analytics, and enjoins stakeholders to promote collaborative schemes between practice-based professionals and academia in seeking intelligent and smart solutions to construction-related problems.


2021 ◽  
Vol 11 (1) ◽  
pp. 6650-6655
Author(s):  
A. Alghamdi ◽  
T. Alsubait ◽  
A. Baz ◽  
H. Alhakami

Big data have attracted significant attention in recent years owing to their hidden potential to improve human life, especially when applied in healthcare. Big data is a large collection of useful information that enables new breakthroughs or understandings. This paper reviews the use and effectiveness of data analytics in healthcare, examining secondary data sources such as books, journals, and other reputable publications between 2000 and 2020, using a strict keyword-selection strategy. Large-scale data have proven to be of great importance in healthcare, and there is therefore a need for advanced forms of data analytics, such as diagnostic data and descriptive analysis, for improving healthcare outcomes. The utilization of large-scale data can form the backbone of predictive analytics, which is the baseline for predicting future individual outcomes.


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Phongphun Kijsanayothin ◽  
Gantaphon Chalumporn ◽  
Rattikorn Hewett

Abstract Introduction Many data analytics algorithms are originally designed for in-memory data. Parallel and distributed computing is a natural first remedy to scale these algorithms into “Big algorithms” for large-scale data. Advances in many Big Data analytics algorithms are contributed by MapReduce, a programming paradigm that enables parallel and distributed execution of massive data processing on large clusters of machines. Much research has focused on building efficient naive MapReduce-based algorithms or extending MapReduce mechanisms to enhance performance. However, we argue that these should not be the only research directions to pursue. We conjecture that when naive MapReduce-based solutions do not perform well, it could be because certain classes of algorithms are not amenable to the MapReduce model, and one should find a fundamentally different approach to a new MapReduce-based solution. Case description This paper investigates a case study of the scaling problem of “Big algorithms” for a popular association rule-mining algorithm, particularly the development of the Apriori algorithm in the MapReduce model. Discussion and evaluation Formal and empirical illustrations are explored to compare our proposed MapReduce-based Apriori algorithm with previous solutions. The findings support our conjecture, and our study shows promising results compared to the state-of-the-art performer, with a 7% average performance increase over transaction sets ranging from 10,000 to 120,000. Conclusions The results confirm that an effective MapReduce implementation should avoid dependent iterations, such as those of the original sequential Apriori algorithm. These findings could lead to many more alternative non-naive MapReduce-based “Big algorithms”.
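To make the MapReduce framing concrete, the sketch below simulates a single map/reduce pass of Apriori-style candidate counting in plain Python. This is an illustrative toy, not the paper's implementation: real deployments run the map and reduce phases on a cluster (e.g. Hadoop or Spark), and the transactions and support threshold here are made up.

```python
from collections import Counter
from itertools import combinations

# Toy transaction database (illustrative only).
transactions = [
    {"milk", "bread"}, {"milk", "eggs"}, {"bread", "eggs"},
    {"milk", "bread", "eggs"},
]

def map_phase(split):
    # Mapper: for each transaction, emit (candidate 2-itemset, 1).
    for txn in split:
        for pair in combinations(sorted(txn), 2):
            yield pair, 1

def reduce_phase(mapped, min_support=2):
    # Reducer: sum counts per candidate and keep frequent itemsets.
    counts = Counter()
    for key, n in mapped:
        counts[key] += n
    return {k: v for k, v in counts.items() if v >= min_support}

frequent = reduce_phase(map_phase(transactions))
print(frequent)
```

Each map/reduce pass is independent and parallelizable; the scaling difficulty the paper addresses arises because the classic sequential Apriori feeds the frequent k-itemsets of one pass into candidate generation for the next, creating the dependent iterations that an effective MapReduce design must avoid.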
