Domain Partitioning: Approach to Computing Average Iceberg Queries

Author(s):  
Pallam Ravi ◽  
D. Haritha

Data analytics and data mining systems often work on data stored in flat files, which do not record relationships among the data items. From such data we compute aggregate values over a set of attributes to find insights; queries that return the attribute values whose aggregate exceeds a given threshold are called iceberg queries. Computing iceberg queries with the average (AVG) aggregate function is difficult because only limited memory is available, and existing methods suffer from re-computation of candidates. We propose a Record Traction Algorithm (RTA) that uses a domain partitioning approach to avoid re-computing candidates during subsequent scans of the data set; it uses bit vectors and bitmap numbers to partition the domain of the data. Our experiments reveal that our approach generates each candidate only once and that the input data is reduced for further candidate sets.
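The RTA internals are not spelled out above, but the underlying query class is easy to illustrate. A minimal single-pass sketch of an average iceberg query in Python (the function name and threshold semantics are illustrative assumptions, not the paper's algorithm):

```python
from collections import defaultdict

def average_iceberg(records, threshold):
    """Return the attribute values whose average measure exceeds threshold.

    records: iterable of (key, value) pairs, e.g. rows of a flat file.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for key, value in records:
        sums[key] += value
        counts[key] += 1
    # Keep only keys whose average crosses the iceberg threshold.
    return {k: sums[k] / counts[k] for k in sums
            if sums[k] / counts[k] > threshold}
```

For example, `average_iceberg([("a", 10), ("a", 20), ("b", 1), ("b", 2)], 5.0)` keeps only `"a"`, whose average of 15.0 exceeds the threshold. The memory pressure this naive version puts on the key space is exactly what partitioning schemes such as the paper's aim to avoid.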

2019 ◽  
Author(s):  
Meghana Bastwadkar ◽  
Carolyn McGregor ◽  
S Balaji

BACKGROUND This paper presents a systematic literature review of existing remote health monitoring systems, with special reference to the neonatal intensive care unit (NICU). Articles on NICU clinical decision support systems (CDSSs) that used cloud computing and big data analytics were surveyed. OBJECTIVE The aim of this study is to review technologies used to provide NICU CDSSs. The literature review highlights the gaps within frameworks providing the Health-Analytics-as-a-Service (HAaaS) paradigm for big data analytics. METHODS Literature searches were performed in Google Scholar, IEEE Digital Library, JMIR Medical Informatics, JMIR Human Factors, and JMIR mHealth, and only English articles published in or after 2015 were included. The overall search strategy was to retrieve articles containing terms related to “health analytics” and “as a service” or “internet of things”/“IoT” and “neonatal intensive care unit”/“NICU”. Titles and abstracts were reviewed to assess relevance. RESULTS In total, 17 full papers met all criteria and were selected for full review. Results showed that in most cases bedside medical devices such as pulse oximeters were used as the sensor device. They revealed a great diversity in data acquisition techniques, although in most cases the same physiological data (heart rate, respiratory rate, blood pressure, blood oxygen saturation) were acquired. In most cases data analytics involved data mining classification techniques and fuzzy-logic NICU decision support systems (DSSs), whereas big data analytics involving Artemis cloud data analysis used the CRISP-TDM and STDM temporal data mining techniques to support clinical research studies. In most scenarios both real-time and retrospective analytics were performed. Most of the research has been performed within small and medium-sized urban hospitals, so there is wide scope for research within rural and remote hospitals with NICU setups.
Results have shown that creating a HAaaS approach in which data acquisition and data analytics are not tightly coupled remains an open research area. The reviewed articles describe architectures and base technologies for neonatal health monitoring with an IoT approach. CONCLUSIONS The current work supports implementation of the expanded Artemis cloud as a commercial offering to healthcare facilities in Canada and worldwide to provide cloud computing services to critical care. However, no work to date has addressed low-resource settings within healthcare facilities in India, which leaves scope for research. All the big data analytics frameworks reviewed in this study have tight coupling of components, so there is a need for a framework with functional decoupling of components.


2018 ◽  
Vol 2018 ◽  
pp. 1-6 ◽  
Author(s):  
Hai Shen ◽  
Lingyu Hu ◽  
Kin Keung Lai

The Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) has been extended in previous literature to handle interval input data. However, the weights associated with the criteria are still subjectively assigned by decision makers. This paper develops a mathematical programming model to determine objective weights for the implementation of the interval extension of TOPSIS. Our method not only takes into account the optimization of interval-valued Multiple Criteria Decision Making (MCDM) problems, but also determines the weights based solely on the data set itself. An illustrative example is presented to compare our results with those of the existing literature.
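The interval extension and the objective weighting model are not reproduced here, but the classic crisp TOPSIS procedure they build on can be sketched as follows (a minimal illustration assuming vector normalization and externally supplied weights):

```python
import math

def topsis(matrix, weights, benefit):
    """Rank alternatives with classic (crisp) TOPSIS.

    matrix:  rows = alternatives, columns = criteria (all values > 0).
    weights: criterion weights, assumed to sum to 1.
    benefit: per-criterion flag, True to maximize, False to minimize.
    Returns the relative closeness to the ideal solution (higher = better).
    """
    m, n = len(matrix), len(matrix[0])
    # 1. Vector-normalize each column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    v = [[weights[j] * matrix[i][j] / norms[j] for j in range(n)]
         for i in range(m)]
    # 2. Ideal (best) and anti-ideal (worst) value per criterion.
    best = [max(v[i][j] for i in range(m)) if benefit[j]
            else min(v[i][j] for i in range(m)) for j in range(n)]
    worst = [min(v[i][j] for i in range(m)) if benefit[j]
             else max(v[i][j] for i in range(m)) for j in range(n)]
    # 3. Euclidean distances to both points, then relative closeness.
    scores = []
    for i in range(m):
        d_best = math.sqrt(sum((v[i][j] - best[j]) ** 2 for j in range(n)))
        d_worst = math.sqrt(sum((v[i][j] - worst[j]) ** 2 for j in range(n)))
        scores.append(d_worst / (d_best + d_worst))
    return scores
```

The paper's contribution replaces the hand-picked `weights` argument with weights derived from the (interval-valued) data itself via a mathematical program; the ranking step is unchanged.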


Author(s):  
Yihao Tian

Big data is an unstructured data set of considerable volume, coming in various formats from sources such as the internet and business organizations. Predicting consumer behavior is a core responsibility for most dealers. Market research can reveal consumer intentions, but it can be a tall order for even a well-designed research project to penetrate the veil protecting real customer motivations from closer scrutiny. Customer behavior analysis usually centers on customer data mining, with each model structured at one stage to answer one query, and customer behavior prediction remains a complex and unpredictable challenge. This paper applies advanced mathematical and big data analytics (BDA) methods to predict customer behavior. Predictive behavior analytics can provide modern marketers with multiple insights to optimize their strategies. The model goes beyond analyzing historical evidence, using mathematical techniques to make informed assumptions about what will happen in the future. Although the underlying methods are complex, most consumer behavior models combine so many variables that, with big data, they produce predictions that are usually quite accurate. This paper develops an association rule mining model to predict customers’ behavior, improve accuracy, and derive major consumer data patterns. The findings show that the recommended BDA method improves big data analytics usability in the organization (98.2%), risk management ratio (96.2%), operational cost (97.1%), customer feedback ratio (98.5%), and demand prediction ratio (95.2%).
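The abstract does not give the mining algorithm's details; as a hedged illustration, a toy association rule miner restricted to single-item antecedents and consequents, using the standard support and confidence measures, might look like:

```python
from itertools import combinations

def mine_pair_rules(transactions, min_support, min_confidence):
    """Mine rules A -> B between single items (a toy sketch).

    support(A, B)   = fraction of transactions containing both items
    confidence(A->B) = support(A, B) / support(A)
    Returns tuples (antecedent, consequent, support, confidence).
    """
    n = len(transactions)
    item_count = {}
    pair_count = {}
    for t in transactions:
        items = sorted(set(t))
        for it in items:
            item_count[it] = item_count.get(it, 0) + 1
        for a, b in combinations(items, 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1
    rules = []
    for (a, b), c in pair_count.items():
        if c / n < min_support:
            continue  # prune infrequent pairs, as in Apriori
        for ante, cons in ((a, b), (b, a)):
            conf = c / item_count[ante]
            if conf >= min_confidence:
                rules.append((ante, cons, c / n, conf))
    return rules
```

On purchase data like `[["bread", "milk"], ["bread", "milk", "eggs"], ["bread"], ["milk"]]`, the pair (bread, milk) has support 0.5, yielding the rules bread→milk and milk→bread, each with confidence 2/3. A production system would generalize this to multi-item itemsets (Apriori, FP-Growth).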


2018 ◽  
Vol 2018 ◽  
pp. 1-10 ◽  
Author(s):  
Johannes Masino ◽  
Jakob Thumm ◽  
Guillaume Levasseur ◽  
Michael Frey ◽  
Frank Gauterin ◽  
...  

This work aims at classifying the road condition with data mining methods, using simple acceleration sensors and gyroscopes installed in vehicles. Two classifiers are developed with a support vector machine (SVM) to distinguish between different types of road surface, such as asphalt and concrete, and obstacles, such as potholes or railway crossings. From the sensor signals, frequency-based features are extracted and evaluated automatically with MANOVA. The selected features and their relevance for predicting the classes are discussed, and the best features are used to design the classifiers. Finally, the methods developed and applied in this work are implemented in a Matlab toolbox with a graphical user interface. The toolbox visualizes the classification results on maps, enabling manual verification of the results. The cross-validated accuracy is 81.0% on average for classifying obstacles and 96.1% on average for classifying road material. The results are discussed on a comprehensive exemplary data set.
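The exact feature set is not listed above; as an illustrative sketch, frequency-based features such as band energies could be computed from an acceleration signal like this (a naive DFT in Python for clarity, not the authors' Matlab pipeline; band edges are hypothetical):

```python
import cmath

def band_energy_features(signal, fs, bands):
    """Energy per frequency band of a sampled signal.

    signal: list of samples; fs: sampling rate in Hz;
    bands:  list of (low_hz, high_hz) half-open intervals.
    Uses a naive O(n^2) DFT; a real pipeline would use an FFT.
    """
    n = len(signal)
    # Power spectrum over the first n//2 bins (real signal assumed).
    spectrum = [abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                        for t in range(n))) ** 2
                for k in range(n // 2)]
    freqs = [k * fs / n for k in range(n // 2)]
    # Sum the spectral power falling inside each band.
    return [sum(p for f, p in zip(freqs, spectrum) if lo <= f < hi)
            for lo, hi in bands]
```

A pure 5 Hz tone sampled at 100 Hz concentrates its energy in a 3–8 Hz band and leaves a 0–3 Hz band nearly empty; feeding such band energies per window into an SVM is one plausible reading of the pipeline described above.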


2020 ◽  
Author(s):  
Daniela De Souza Gomes ◽  
Marcos Henrique Fonseca Ribeiro ◽  
Giovanni Ventorim Comarela ◽  
Gabriel Philippe Pereira

High failure rates are a worrying and relevant problem in Brazilian universities. From a data set of student transcripts, we performed a case study for both the general and the Computer Science contexts, in which data mining techniques were used to find patterns concerning failures. The knowledge acquired can be used for better educational administration and also to build intelligent systems to support students’ decision making.


2021 ◽  
Vol 50 (1) ◽  
pp. 138-152
Author(s):  
Mujeeb Ur Rehman ◽  
Dost Muhammad Khan

Recently, anomaly detection has drawn a realistic response from data mining scientists, as its reputation has grown steadily in practical domains such as product marketing, fraud detection, medical diagnosis, and fault detection. High dimensional data subjected to outlier detection poses exceptional challenges for data mining experts because of the inherent problems of the curse of dimensionality and the resemblance of distant and adjoining points. Traditional algorithms and techniques perform outlier detection on the full feature space. Customary methodologies concentrate largely on low dimensional data and hence prove ineffective at discovering anomalies in data sets comprising a high number of dimensions. It becomes a very difficult and tiresome job to dig out the anomalies present in a high dimensional data set when all subsets of projections need to be explored. All data points in high dimensional data behave like similar observations because of an intrinsic feature of such data: the distance between observations approaches zero as the number of dimensions tends towards infinity. This research work proposes a novel technique that explores the deviation among all data points and embeds its findings inside well-established density-based techniques. It is a state-of-the-art technique, as it opens a new breadth of research towards resolving the inherent problems of high dimensional data, where outliers reside within clusters having different densities. A high dimensional dataset from the UCI Machine Learning Repository is chosen to test the proposed technique, and its results are then compared with those of density-based techniques to evaluate its efficiency.
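The proposed deviation-embedding technique is not specified in detail here; as a hedged sketch of the distance/density family it builds on (for example, LOF-style detectors), a simple k-nearest-neighbour outlier score can be computed as:

```python
import math

def knn_outlier_scores(points, k):
    """Mean distance to the k nearest neighbours of each point.

    A simple distance-based stand-in for density-based detectors such
    as LOF: points in sparse regions receive high scores. O(n^2) for
    clarity; spatial indexes would be used at scale.
    """
    def dist(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
    scores = []
    for i, p in enumerate(points):
        ds = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(ds[:k]) / k)
    return scores
```

On a tight cluster plus one remote point, the remote point gets the largest score. Note that this plain Euclidean score is exactly what degrades in high dimensions, since all pairwise distances concentrate, which is the problem the abstract's technique targets.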

