huge data
Recently Published Documents


TOTAL DOCUMENTS

496
(FIVE YEARS 266)

H-INDEX

12
(FIVE YEARS 4)

2022 ◽  
Vol 11 (3) ◽  
pp. 0-0

The emergence of big data creates new challenges for sorting strategies to analyze data more effectively. In most analytical techniques, sorting is treated as an implicit step of the technique used. The availability of huge data has changed the way data is analyzed across industries. Healthcare is one of the notable areas where data analytics is making big changes. Efficient analysis has the potential to reduce treatment costs and improve quality of life in general. Healthcare organizations are collecting massive amounts of data and are looking for the best strategies to put these numbers to use. This research proposes a novel non-comparison-based approach to sorting large data sets that can then be used by any big data analytical technique for various analyses.
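The abstract does not spell out the proposed algorithm; as a rough illustration of the non-comparison sorting family it belongs to, here is a minimal least-significant-digit radix sort in Python (a sketch of the general technique, not the paper's method):

```python
def radix_sort(values, base=256):
    """Least-significant-digit radix sort for non-negative integers.

    A non-comparison sort: elements are distributed into buckets by digit
    value rather than compared pairwise, giving O(d * (n + base)) time for
    d-digit keys instead of the O(n log n) comparison lower bound.
    `base` must be a power of two for the bit-mask below to be valid.
    """
    if not values:
        return []
    out = list(values)
    max_val = max(out)
    shift = 0
    while (max_val >> shift) > 0:
        buckets = [[] for _ in range(base)]
        for v in out:
            buckets[(v >> shift) & (base - 1)].append(v)
        out = [v for bucket in buckets for v in bucket]  # stable re-gather
        shift += base.bit_length() - 1                   # next digit (8 bits here)
    return out

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```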


2022 ◽  
Vol 12 (2) ◽  
pp. 651
Author(s):  
Gökhan Demirdöğen ◽  
Zeynep Işık ◽  
Yusuf Arayici

The use of digital technologies such as the Internet of Things (IoT) and smart meters generates huge volumes of data in facility management (FM). However, the use of data analysis techniques has remained limited to converting available data into information within activities performed in FM. In this context, business intelligence and analytics (BI&A) techniques offer a promising opportunity to elaborate facility performance and discover measurable new FM key performance indicators (KPIs), since existing KPIs are too crude to reveal the actual performance of facilities. Besides this, there is no comprehensive study that covers BI&A activities and their importance levels for healthcare FM. Therefore, this study aims to identify healthcare FM KPIs and their importance levels for the Turkish healthcare FM industry using the AHP-integrated PROMETHEE method. The study identified ninety-eight healthcare FM KPIs, categorized under six headings. Comparing the findings with the literature showed similarities and differences between countries' healthcare FM rankings; the differences may stem from the limited sets of FM KPIs considered in existing studies. The FM KPIs proposed in this study are therefore comprehensive and detailed enough to measure and reveal healthcare FM performance. This study can help professionals perform more detailed building performance analyses in FM. Additionally, the findings pave the way for new developments in FM software and the effective use of available data to enable lean FM processes in healthcare facilities.
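The study's pairwise-comparison data are not reproduced in the abstract; the sketch below only shows the general shape of an AHP-integrated PROMETHEE II ranking, with made-up matrices and the simple "usual" preference function (every number, criterion, and alternative here is illustrative, not from the paper):

```python
import numpy as np

# AHP: derive criterion weights from a reciprocal pairwise comparison matrix.
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])
eigvals, eigvecs = np.linalg.eig(A)
w = np.abs(eigvecs[:, eigvals.real.argmax()].real)
weights = w / w.sum()                      # principal-eigenvector weights

# PROMETHEE II: rank alternatives by net outranking flow.
X = np.array([[0.8, 0.6, 0.4],             # rows: hypothetical facilities
              [0.5, 0.9, 0.7],             # cols: the three criteria above
              [0.6, 0.4, 0.9]])            # (higher = better on every criterion)
n = len(X)
phi = np.zeros(n)
for i in range(n):
    for j in range(n):
        if i == j:
            continue
        # "Usual" preference function: P(d) = 1 if d > 0 else 0, per criterion.
        pref = weights @ (X[i] > X[j]).astype(float)
        phi[i] += pref / (n - 1)           # contributes to positive flow of i
        phi[j] -= pref / (n - 1)           # and to negative flow of j

print("criterion weights:", weights.round(3))
print("net flows:", phi.round(3), "-> best facility:", int(phi.argmax()) + 1)
```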


2022 ◽  
Vol 12 (1) ◽  
pp. 0-0

Data mining is an essential task because the digital world creates huge volumes of data daily. Associative classification is one of the data mining tasks used to classify data according to the demands of knowledge users. Most associative classification algorithms cannot analyze big data, which is mostly continuous in nature. This motivates both the analysis of existing discretization algorithms, which convert continuous data into discrete values, and the development of a novel discretizer, the Reliable Distributed Fuzzy Discretizer, for big data sets. Many discretizers suffer from over-splitting of partitions. The proposed method is implemented in a distributed fuzzy environment and aims to avoid over-splitting of partitions by introducing a novel stopping criterion. The proposed discretization method is compared with an existing distributed fuzzy partitioning method and achieves good accuracy in associative classifier performance.
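The paper's discretizer and stopping criterion are not detailed in the abstract; the following is only a generic sketch of triangular fuzzy partitioning with a guard against over-splitting. The minimum-support rule used as the stopping criterion here is an assumption for illustration, not the authors' criterion:

```python
import numpy as np

def triangular_membership(x, left, center, right):
    """Membership of x in a triangular fuzzy set (left, center, right)."""
    return np.clip(np.minimum((x - left) / (center - left + 1e-12),
                              (right - x) / (right - center + 1e-12)), 0.0, 1.0)

def fuzzy_partition(values, max_parts=8, min_support=0.05):
    """Split a continuous attribute into triangular fuzzy sets.

    Doubles the number of partitions until some candidate set would cover
    fewer than `min_support` of the samples with membership > 0.5 --
    a simple, illustrative stopping criterion against over-splitting.
    """
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    k = 2
    while k * 2 <= max_parts:
        centers = np.linspace(lo, hi, k * 2)
        step = centers[1] - centers[0]
        supports = [np.mean(triangular_membership(values, c - step, c, c + step) > 0.5)
                    for c in centers]
        if min(supports) < min_support:    # stop: next split creates weak partitions
            break
        k *= 2
    centers = np.linspace(lo, hi, k)
    step = centers[1] - centers[0]
    return [(c - step, c, c + step) for c in centers]

parts = fuzzy_partition(np.random.default_rng(0).normal(50, 10, 10_000))
print(len(parts), "fuzzy sets:", [tuple(round(v, 1) for v in p) for p in parts])
```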


2022 ◽  
pp. 115-127
Author(s):  
Sagar Sudhir Dhobale ◽  
Sharda Bapat

ICD (International Classification of Diseases) is a system developed by the WHO in which every unique diagnosis and procedure has a unique code. It provides a standardized way to represent medical information and makes it sharable and comparable across different hospitals and countries. Currently, the task of assigning ICD codes to patient discharge summaries is performed manually by medical coders. Manual coding is costly, time-consuming, and inefficient at this scale, so the healthcare industry requires automated solutions to make medical coding more efficient, accurate, and consistent. In this study, automated ICD-9 coding is approached as a multi-label text classification problem. A deep learning system is presented that assigns ICD-9 codes automatically to patient discharge summaries. Convolutional neural networks and a word2vec model are combined to extract features from the input text automatically. The best model achieved 83.28% accuracy. The results demonstrate the usability of deep learning for multi-label text classification and medical coding.
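A minimal Keras sketch of this kind of model follows. It is not the paper's exact architecture: the vocabulary size, sequence length, layer sizes, and number of codes below are placeholder assumptions, and the pretrained word2vec initialization is noted only as a comment.

```python
# Sketch of a CNN for multi-label ICD coding: embedded tokens feed 1-D
# convolutions, and a sigmoid output scores each code independently.
import tensorflow as tf

VOCAB_SIZE, EMBED_DIM, MAX_LEN, NUM_CODES = 50_000, 100, 2_000, 50  # assumed sizes

model = tf.keras.Sequential([
    tf.keras.Input(shape=(MAX_LEN,)),
    # In the paper's setting the embedding matrix would be initialized from
    # a pretrained word2vec model rather than learned from scratch.
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    tf.keras.layers.Conv1D(128, kernel_size=5, activation="relu"),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(NUM_CODES, activation="sigmoid"),  # one unit per ICD-9 code
])
# Multi-label: binary cross-entropy treats each code as an independent yes/no.
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.BinaryAccuracy()])
model.summary()
```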


2022 ◽  
pp. 1054-1070
Author(s):  
Andrew Stranieri ◽  
Venki Balasubramanian

Remote patient monitoring involves the collection of data from wearable sensors that typically requires analysis in real time. The real-time analysis of data streaming continuously to a server challenges data mining algorithms, which have mostly been developed for static data residing in central repositories. Remote patient monitoring also generates huge data sets that present storage and management problems. Although virtual records of every health event throughout an individual's lifespan, known as electronic health records, are rapidly emerging, few electronic records accommodate data from continuous remote patient monitoring. These factors combine to make data analytics with continuous patient data very challenging. In this chapter, the benefits for data analytics inherent in the use of standards for clinical concepts in remote patient monitoring are presented. The openEHR standard, which describes the way concepts are used in clinical practice, is well suited to be adopted as the standard for recording metadata about remote monitoring. The claim is advanced that this is likely to facilitate meaningful real-time analyses with big remote patient monitoring data. The point is made by drawing on a case study involving the transmission of patient vital sign data collected from wearable sensors in an Indian hospital.
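As a purely schematic illustration of the chapter's point, the sketch below shows how a streamed vital-sign sample might carry openEHR-style archetype metadata so downstream analytics can interpret it unambiguously. The field names, class, and archetype identifier are illustrative assumptions, not an actual openEHR library API:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class VitalSignSample:
    patient_id: str
    archetype_id: str      # openEHR-style identifier naming the clinical concept
    value: float
    units: str
    recorded_at: datetime

sample = VitalSignSample(
    patient_id="P-1042",
    archetype_id="openEHR-EHR-OBSERVATION.pulse.v2",  # illustrative archetype id
    value=88.0,
    units="/min",
    recorded_at=datetime.now(timezone.utc),
)
# Because the archetype id names the concept, a stream processor can route
# pulse samples to pulse-specific analysis without per-hospital field mappings.
print(sample)
```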


2022 ◽  
Vol 12 (1) ◽  
pp. 1-14
Author(s):  
Parmeet Kaur ◽  
Sanya Deshmukh ◽  
Pranjal Apoorva ◽  
Simar Batra

Humongous volumes of data are generated every minute by individual users as well as organizations. This data can be turned into a valuable asset only if it is analyzed, interpreted, and used to improve processes or to benefit users. One source contributing huge data every year is the large number of web-based crowdfunding projects. These projects and their campaigns help ventures raise money by acquiring small amounts of funding from many small organizations and individuals. The funds raised for crowdfunded projects, and hence their success, depend on multiple elements of the project. The current work predicts the success of a new venture by analyzing and visualizing existing data and determining the parameters on which a project's success depends. The prediction of a project's outcome is performed by applying machine learning algorithms to crowdfunding data stored in the NoSQL database MongoDB. The results of this work can help investors estimate a project's chances of success before investing in it.
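The abstract names MongoDB but not the specific model or schema; a minimal sketch of such a pipeline might look as follows, where the connection string, database, collection, field names, and the random-forest choice are all assumptions:

```python
import pandas as pd
from pymongo import MongoClient
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Pull campaign records from MongoDB (db/collection/fields are hypothetical).
client = MongoClient("mongodb://localhost:27017")
docs = client["crowdfunding"]["projects"].find(
    {}, {"goal": 1, "duration_days": 1, "backers": 1, "category_code": 1,
         "succeeded": 1, "_id": 0})
df = pd.DataFrame(list(docs))

# Train a success classifier and report held-out accuracy.
X = df[["goal", "duration_days", "backers", "category_code"]]
y = df["succeeded"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)
clf = RandomForestClassifier(n_estimators=200, random_state=42)
clf.fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```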


Author(s):  
Osval Antonio Montesinos López ◽  
Abelardo Montesinos López ◽  
Jose Crossa

Nowadays, huge data quantities are collected and analyzed to deliver deep insights into biological processes and human behavior. This chapter assesses the use of big data for prediction and estimation through statistical machine learning and its applications in agriculture and genetics in general and, specifically, in genome-based prediction and selection. First, we point out the importance of data and how its use is reshaping our way of living. We also provide the key elements of genomic selection and its potential for plant improvement. In addition, we analyze elements of modeling with machine learning methods applied to genomic selection and stress their importance as a predictive methodology. Two cultures of model building are analyzed and discussed: prediction and inference; by understanding model building, researchers will be able to select the best model or method for each circumstance. Within this context, we explain the differences between nonparametric models (predictors are constructed according to information derived from the data) and parametric models (all predictors take predetermined forms with respect to the response), as well as their types of effects: fixed, random, and mixed. Basic elements of linear algebra are provided to facilitate understanding of the contents of the book. This chapter also contains examples of the different types of data using supervised, unsupervised, and semi-supervised learning methods.
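To make the parametric/nonparametric contrast concrete, here is a small sketch on synthetic marker-style data (not from the book): a ridge regression, whose linear form is fixed in advance, against a random forest, whose structure is built from the data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.integers(0, 3, size=(200, 300)).astype(float)  # toy 0/1/2 marker matrix
beta = rng.normal(0, 0.1, size=300)
y = X @ beta + rng.normal(0, 1.0, size=200)            # additive signal + noise

for name, model in [("parametric: ridge regression", Ridge(alpha=10.0)),
                    ("nonparametric: random forest",
                     RandomForestRegressor(n_estimators=200, random_state=1))]:
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name:30s} mean CV R^2 = {r2:.2f}")
```

On purely additive data like this, the parametric model usually wins because its predetermined form matches the generating process; a nonparametric learner earns its keep when that form is unknown or nonlinear.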


2022 ◽  
pp. 1458-1483
Author(s):  
Kamalendu Pal

Heterogeneous data types, widely distributed data sources, huge data volumes, and large-scale business-alliance partners describe typical global supply chain operational environments. Mobile and wireless technologies add an extra layer of data sources to this technology-enriched supply chain operation. This environment also needs to provide its end-users with access to data anywhere, anytime. The new type of data set originating from the global retail supply chain is commonly known as big data because of its huge volume and the velocity with which it arrives in the global retail business environment. Such environments empower, and require, decision makers to act or react more quickly to all decision tasks. Academics and practitioners are researching and building the next generation of big-data-based application software systems. This new generation of software applications is based on complex data analysis algorithms, that is, algorithms operating on data that does not adhere to standard relational data models. Traditional software testing methods are insufficient for big-data-based applications. Testing big-data-based applications is one of the biggest challenges faced by modern software design and development communities because of a lack of knowledge about what to test and how much data to test. Developers of big-data-based applications face a daunting task in defining the best strategies for structured and unstructured data validation, setting up an optimal test environment, and working with testing approaches for non-relational databases. This chapter focuses on big-data-based software testing and quality-assurance issues in the context of Hadoop, an open source framework. It includes discussion of several challenges with respect to massively parallel data generation from multiple sources, testing methods for validation of pre-Hadoop processing, software application quality factors, and some of the software testing mechanisms for this new breed of applications.
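One concrete instance of the pre-Hadoop validation the chapter mentions is checking staged files before they enter the cluster: row counts against the source, basic schema conformance, and parseability of key fields. The sketch below is a generic illustration; the file name, column set, and rules are assumptions, not the chapter's test suite.

```python
import csv

EXPECTED_COLUMNS = ["order_id", "store_id", "amount", "ts"]  # assumed schema

def validate_staged_file(path, source_row_count):
    """Fail fast if a staged extract drifts from the source before ingestion."""
    with open(path, newline="") as fh:
        reader = csv.DictReader(fh)
        assert reader.fieldnames == EXPECTED_COLUMNS, "schema drift detected"
        rows = 0
        for rec in reader:
            rows += 1
            assert rec["order_id"], "null business key"
            float(rec["amount"])             # amount must parse as a number
    assert rows == source_row_count, (
        f"row-count mismatch: staged {rows} != source {source_row_count}")

# Example (hypothetical path and count):
# validate_staged_file("staged/orders_2022-01-01.csv", source_row_count=1_000_000)
```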


Author(s):  
Amanpreet Kaur ◽  
Heena Wadhwa ◽  
Pardeep Singh ◽  
Harpreet Kaur Toor

Fog computing is prominent in ensuring quality of service when handling a huge volume and variety of data, displaying output, and supporting closed-loop process control. It comprises fog devices that manage huge data transmissions, which results in high energy consumption, end-to-end delay, and latency. In this paper, an energy model for the fog computing environment is proposed and implemented based on the teacher-student learning model called Teaching-Learning-Based Optimization (TLBO), to improve the responsiveness of the fog network in terms of energy optimization. The results show the effectiveness of TLBO in choosing the shortest path with the least energy consumption.
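The paper's fog energy model is not given in the abstract, so the objective function below is an illustrative stand-in (energy as a convex function of per-hop transmission-power settings); the TLBO teacher and learner phases themselves follow the standard formulation of the metaheuristic:

```python
import numpy as np

rng = np.random.default_rng(0)

def energy(x):
    """Illustrative energy cost over power settings, minimized near 0.5."""
    return np.sum((x - 0.5) ** 2) + 0.1 * np.sum(np.abs(x))

def tlbo(f, dim=8, pop=20, iters=100, lo=0.0, hi=1.0):
    X = rng.uniform(lo, hi, (pop, dim))            # population of learners
    F = np.apply_along_axis(f, 1, X)
    for _ in range(iters):
        # Teacher phase: move everyone toward the current best learner.
        teacher = X[F.argmin()]
        TF = rng.integers(1, 3)                    # teaching factor, 1 or 2
        cand = np.clip(X + rng.random((pop, dim)) * (teacher - TF * X.mean(0)),
                       lo, hi)
        cf = np.apply_along_axis(f, 1, cand)
        better = cf < F                            # greedy acceptance
        X[better], F[better] = cand[better], cf[better]
        # Learner phase: each learner moves relative to a random peer.
        for i in range(pop):
            j = rng.choice([k for k in range(pop) if k != i])
            direction = X[i] - X[j] if F[i] < F[j] else X[j] - X[i]
            cand_i = np.clip(X[i] + rng.random(dim) * direction, lo, hi)
            ci = f(cand_i)
            if ci < F[i]:
                X[i], F[i] = cand_i, ci
    return X[F.argmin()], F.min()

best_x, best_e = tlbo(energy)
print("lowest energy found:", round(float(best_e), 4))
```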


Author(s):  
Yaasmin Attarwala ◽  
Sakshi Baid

With the progression of technology, the enormous amount of information collected from digital users by various businesses and organizations has resulted in the formation of huge data repositories, commonly known as big data. Data mining is a tool for extracting hidden information from these vast databases to identify unique patterns and rules. The present paper aims to provide a detailed description of the importance of big data today, its characteristics, the important role data mining plays in big data and why it is a necessity, the process of data mining and the functionalities it performs, and data mining techniques such as classification and clustering that help find the patterns used to anticipate future trends in business, along with their applications in various fields. The paper also discusses the important role of data mining in business intelligence (BI) and various industries for identifying unique patterns and obtaining results from data. The second half of the paper further explores the challenges faced in big data and the tools used, the applications and upcoming trends in data science, and, lastly, the scope and importance of data science in the future.
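As a tiny illustration of two techniques the paper surveys, the sketch below runs classification (supervised) and clustering (unsupervised) on scikit-learn's bundled iris data; the specific models chosen here are illustrative, not the paper's:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Classification: learn labels from examples, score by cross-validation.
acc = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5).mean()
print(f"classification (decision tree) CV accuracy: {acc:.2f}")

# Clustering: group samples without using labels at all.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print("clustering (k-means) cluster sizes:",
      [int((labels == k).sum()) for k in range(3)])
```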

