Handling Data Skew in MapReduce Cluster by Using Partition Tuning

2017 ◽  
Vol 2017 ◽  
pp. 1-12 ◽  
Author(s):  
Yufei Gao ◽  
Yanjie Zhou ◽  
Bing Zhou ◽  
Lei Shi ◽  
Jiacai Zhang

The healthcare industry has generated large amounts of data, and analyzing these has emerged as an important problem in recent years. The MapReduce programming model has been successfully used for big data analytics. However, data skew invariably occurs in big data analytics and seriously affects efficiency. To overcome the data skew problem in MapReduce, we have in the past proposed a data processing algorithm called Partition Tuning-based Skew Handling (PTSH). In comparison with the one-stage partitioning strategy used in the traditional MapReduce model, PTSH uses a two-stage strategy and the partition tuning method to disperse key-value pairs in virtual partitions and recombines each partition in case of data skew. The robustness and efficiency of the proposed algorithm were tested on a wide variety of simulated datasets and real healthcare datasets. The results showed that PTSH algorithm can handle data skew in MapReduce efficiently and improve the performance of MapReduce jobs in comparison with the native Hadoop, Closer, and locality-aware and fairness-aware key partitioning (LEEN). We also found that the time needed for rule extraction can be reduced significantly by adopting the PTSH algorithm, since it is more suitable for association rule mining (ARM) on healthcare data.
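The two-stage idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' PTSH implementation: the abstract only states that key-value pairs are dispersed into virtual partitions and then recombined, so the hashing scheme, the greedy largest-first recombination rule, and all names here (`virtual_partition`, `recombine`, `plan_partitions`, `num_virtual`) are assumptions made for illustration.

```python
from collections import Counter
import heapq
import zlib


def virtual_partition(key, num_virtual):
    """Stage 1: map a key to one of many fine-grained virtual partitions.

    crc32 is used as a stable hash (an assumption; Python's built-in
    hash() is salted per process for strings).
    """
    return zlib.crc32(key.encode("utf-8")) % num_virtual


def recombine(virtual_loads, num_reducers):
    """Stage 2: assign virtual partitions to reducers.

    Greedy largest-first heuristic (an assumed stand-in for the paper's
    partition tuning): each virtual partition goes to the reducer that
    currently holds the fewest key-value pairs.
    """
    heap = [(0, r) for r in range(num_reducers)]  # (current load, reducer id)
    heapq.heapify(heap)
    assignment = {}
    # Largest virtual partitions first; ties broken by partition id.
    for vp, load in sorted(virtual_loads.items(), key=lambda kv: (-kv[1], kv[0])):
        total, reducer = heapq.heappop(heap)
        assignment[vp] = reducer
        heapq.heappush(heap, (total + load, reducer))
    return assignment


def plan_partitions(keys, num_reducers, num_virtual):
    """Count pairs per virtual partition, then recombine them onto reducers."""
    loads = Counter(virtual_partition(k, num_virtual) for k in keys)
    assignment = recombine(loads, num_reducers)
    reducer_loads = [0] * num_reducers
    for vp, load in loads.items():
        reducer_loads[assignment[vp]] += load
    return assignment, reducer_loads
```

Using more virtual partitions than reducers gives the second stage room to balance load. Note that a single very hot key still lands in one virtual partition, so this sketch only evens out skew that is spread across several keys.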

2021 ◽  
Vol 13 ◽  
pp. 175628722199813
Author(s):  
B. M. Zeeshan Hameed ◽  
Aiswarya V. L. S. Dhavileswarapu ◽  
Nithesh Naik ◽  
Hadis Karimi ◽  
Padmaraj Hegde ◽  
...  

Artificial intelligence (AI) has a proven record of application in the field of medicine and is used in various urological conditions such as oncology, urolithiasis, paediatric urology, urogynaecology, infertility and reconstruction. Data is the driving force of AI and the past decades have undoubtedly witnessed an upsurge in healthcare data. Urology is a specialty that has always been at the forefront of innovation and research and has rapidly embraced technologies to improve patient outcomes and experience. Advancements made in Big Data Analytics raised the expectations about the future of urology. This review aims to investigate the role of big data and its blend with AI for trends and use in urology. We explore the different sources of big data in urology and explicate their current and future applications. A positive trend has been exhibited by the advent and implementation of AI in urology with data available from several databases. The extensive use of big data for the diagnosis and treatment of urological disorders is still in its early stage and under validation. In future however, big data will no doubt play a major role in the management of urological conditions.


Author(s):  
Pijush Kanti Dutta Pramanik ◽  
Saurabh Pal ◽  
Moutan Mukhopadhyay

Like other fields, the healthcare sector has also been greatly impacted by big data. A huge volume of healthcare data and other related data are being continually generated from diverse sources. Tapping and analysing these data, suitably, would open up new avenues and opportunities for healthcare services. In view of that, this paper aims to present a systematic overview of big data and big data analytics, applicable to modern-day healthcare. Acknowledging the massive upsurge in healthcare data generation, various ‘V's, specific to healthcare big data, are identified. Different types of data analytics, applicable to healthcare, are discussed. Along with presenting the technological backbone of healthcare big data and analytics, the advantages and challenges of healthcare big data are meticulously explained. A brief report on the present and future market of healthcare big data and analytics is also presented. Besides, several applications and use cases are discussed with sufficient details.


Author(s):  
Sheik Abdullah A. ◽  
Selvakumar S. ◽  
Parkavi R. ◽  
Suganya R. ◽  
Abirami A. M.

The growing role of big data in analytics has made the process of solving various real-world problems simpler. The big data and data science toolbox provides a realm of data preparation, data analysis, implementation processes, and solutions. Connecting to any data source and preparing data for analysis have been made simple by the availability of numerous tools in data analytics packages. Some of the analytical tools include R programming, Python programming, rapid analytics, and Weka. The patterns and the granularity of the observed data can be obtained through visualizations and data observations. This chapter provides an insight into the types of analytics from a big data perspective, with a focus on their applicability to healthcare data. The processing paradigms and techniques can also be clearly observed through the chapter contents.


2020 ◽  
pp. 100-117
Author(s):  
Sarah Brayne

This chapter looks at the promise and peril of police use of big data analytics for inequality. On the one hand, big data analytics may be a means by which to ameliorate persistent inequalities in policing. Data can be used to “police the police” and replace unparticularized suspicion of racial minorities and human exaggeration of patterns with less biased predictions of risk. On the other hand, data-intensive police surveillance practices are implicated in the reproduction of inequality in at least four ways: by deepening the surveillance of individuals already under suspicion, codifying a secondary surveillance network of individuals with no direct police contact, widening the criminal justice dragnet unequally, and leading people to avoid institutions that collect data and are fundamental to social integration. Crucially, as currently implemented, “data-driven” decision-making techwashes, both obscuring and amplifying social inequalities under a patina of objectivity.


Author(s):  
Mimoh Ojha

Abstract: This paper gives an insight into how information and communications technology (ICT), in combination with big data analytics, can help to improve healthcare services in Madhya Pradesh, a state in India with a population of approximately 75 million. Ongoing projects like ‘Digital India’ will allow the computerization of hospitals and the digitization of healthcare data. Digital India, coupled with ICT, can play an indispensable role in providing effective healthcare services through e-health applications such as electronic health records, e-prescription, computerized physician order entry, telemedicine and mHealth, along with networks such as the State Wide Area Network (SWAN) and the National Health Information Network, which will allow healthcare records to be shared across the network. Data stored through e-health applications is huge and comes in different formats, which makes it difficult to analyze. With big data analytics, however, analysis can be performed on voluminous healthcare data, and with the useful results obtained, patients can be given better and more specific treatments. It will also help doctors to exchange their knowledge and treatment practices. This paper also illustrates a case study on M.Y. Hospital, located in Indore, Madhya Pradesh. Keywords: ICT, E-health, Digital India, SWAN, CUG, Big Data Analytics.


Author(s):  
Navin Kumar

The amount of healthcare data continues to grow exponentially every day. The complexity of this data further limits the analytical capabilities of traditional healthcare systems. With value-based care, it is far more imperative for healthcare organizations to control costs and to improve the quality of care in order to sustain their business. The purpose of the chapter is to gain insights into the complexities and challenges that exist in current healthcare systems and into how big data analytics and IoT can play a pivotal role in positively influencing the quality of care and patient outcomes. The chapter also provides solutions and strategies for building a cloud-based data asset that can deliver rich data analytics to both healthcare systems and patients.


Author(s):  
HarshmitKaur Saluja ◽  
Vinod Kumar Yadav ◽  
K.M. Mohapatra

On the one hand, big-data analytics has revolutionized predictive modeling by enabling complex data sets to be structured. On the other hand, interactive advertising has changed the advertising sector by structuring advertisement content in a customer-centric way. The paper widens the view to explore the growing urge for customization techniques in the advertising sector with interactive enablers. It further examines how interactive advertisement and big data have helped to represent a product/service from the customer's point of view and have also improved product/service performance. In the course of the study, an exhaustive literature review resulted in three hypotheses, developed to address the above-mentioned concerns.


Author(s):  
Abhishek Bajpai ◽  
Dr. Sanjiv Sharma

As the volume of data produced in our society increases day by day, the exploration of big data in healthcare is growing at an unprecedented rate. Nowadays, big data is a very popular buzzword in various areas. This paper attempts to establish that even the healthcare industry is stepping into the big data pool to take advantage of its various advanced tools and technologies. It reviews research carried out in the healthcare realm using big data approaches and methodologies. Big data methodologies can be used for healthcare data analytics (which involves the 4 V’s) to support better decisions that increase business profit and customer affection, to acquire a better understanding of market behaviours and trends, and to provide e-health services using Digital Imaging and Communications in Medicine (DICOM). Big data techniques such as MapReduce and machine learning can be applied to develop systems for the early diagnosis of disease, i.e., the analysis of chronic diseases such as heart disease, diabetes and stroke. The analysis of the data is performed using the big data analytics framework Hadoop, which is used to process large data sets. Further, the paper presents various big data tools, challenges, opportunities and hurdles, followed by the conclusion.


2021 ◽  
Author(s):  
Mahtab Shahin ◽  
Sijo Arakkal Peious ◽  
Rahul Sharma ◽  
Minakshi Kaushik ◽  
Sadok Ben Yahia ◽  
...  
