Big Data Computation Model for Landslide Risk Analysis Using Remote Sensing Data

Effective and efficient strategies to acquire, manage, and analyze data leads to better decision making and competitive advantage. The development of cloud computing and the big data era brings up challenges to traditional data mining algorithms. The processing capacity, architecture, and algorithms of traditional database systems are not coping with big data analysis. Big data are now rapidly growing in all science and engineering domains, including biological, biomedical sciences, and disaster management. The characteristics of complexity formulate an extreme challenge for discovering useful knowledge from the big data. Spatial data is complex big data. The aim of this chapter is to propose a multi-ranking decision tree big data approach to handle complex spatial landslide data. The proposed classifier performance is validated with massive real-time dataset. The results indicate that the classifier exhibits both time efficiency and scalability.

Download Full-text

GIS- BASED APPLICATION FOR GEOTECHNICAL DATA MANAGING

Bulletin of the Geological Society of Greece ◽

10.12681/bgsg.11340 ◽

2017 ◽

Vol 43 (3) ◽

pp. 1656

Author(s):

P. Tsangaratos ◽

I. Koumantakis ◽

D. Rozos

Keyword(s):

Spatial Data ◽

Spatial Query ◽

Borehole Data ◽

Geotechnical Data ◽

Data Mining Algorithms ◽

Field Practice ◽

Open Source Gis ◽

Analysis System ◽

Mining Algorithms ◽

Management Capabilities

The need to provide data management capabilities in geotechnical projects, makes data visualization in a more understanding way vital, while improvements in computer science, have created an opportunity to rethink the manner in which such data is archived and presented. Geographic Information Systems are considered nowadays as principal methods for analysis, utilizing their ability of manipulating, compiling and processing spatial data, such as geotechnical one. In this paper, the development of Borehole Analysis System (BAS) a specific Graphical User Interface (GUI) application is proposed to access geotechnical data with the aim of a relational database and an open source GIS platform, embodied in the application. The BAS, is able to integrate multiple layers of gathered information and to derive additional knowledge by applying statistical and data mining algorithms with the use of spatial query tools. These can give reasonable conclusions and better representation in 2-D and 3-D environment. The presented application is illustrated with an example from field practice, testifying its ability to be a useful tool for management and presentation of geological and geotechnical borehole data.

Download Full-text

Big Data Mining Algorithms for Predicting Dynamic Product Price by Online Analysis

Advances in Intelligent Systems and Computing - Computational Intelligence in Data Mining ◽

10.1007/978-981-13-8676-3_59 ◽

2019 ◽

pp. 701-708

Author(s):

Manjushree Nayak ◽

Bhavana Narain

Keyword(s):

Data Mining ◽

Big Data ◽

Product Price ◽

Online Analysis ◽

Big Data Mining ◽

Data Mining Algorithms ◽

Mining Algorithms

Download Full-text

Parallel Primitives for Vendor-Agnostic Implementation of Big Data Mining Algorithms

2018 32nd International Conference on Advanced Information Networking and Applications Workshops (WAINA) ◽

10.1109/waina.2018.00118 ◽

2018 ◽

Author(s):

Cesare Bandirali ◽

Stefano Lodi ◽

Gianluca Moro ◽

Andrea Pagliarani ◽

Claudio Sartori

Keyword(s):

Data Mining ◽

Big Data ◽

Big Data Mining ◽

Data Mining Algorithms ◽

Mining Algorithms

Download Full-text

BioMedR: an R/CRAN package for integrated data analysis pipeline in biomedical study

Briefings in Bioinformatics ◽

10.1093/bib/bbz150 ◽

2019 ◽

Cited By ~ 2

Author(s):

Jie Dong ◽

Min-Feng Zhu ◽

Yong-Huan Yun ◽

Ai-Ping Lu ◽

Ting-Jun Hou ◽

...

Keyword(s):

Data Mining ◽

Clustering Algorithms ◽

R Package ◽

Integrated Analysis ◽

Analysis Pipeline ◽

Molecular Fingerprints ◽

Useful Knowledge ◽

Data Mining Algorithms ◽

Mining Methods ◽

Mining Algorithms

Abstract Background With the increasing development of biotechnology and information technology, publicly available data in chemistry and biology are undergoing explosive growth. Such wealthy information in these resources needs to be extracted and then transformed to useful knowledge by various data mining methods. However, a main computational challenge is how to effectively represent or encode molecular objects under investigation such as chemicals, proteins, DNAs and even complicated interactions when data mining methods are employed. To further explore these complicated data, an integrated toolkit to represent different types of molecular objects and support various data mining algorithms is urgently needed. Results We developed a freely available R/CRAN package, called BioMedR, for molecular representations of chemicals, proteins, DNAs and pairwise samples of their interactions. The current version of BioMedR could calculate 293 molecular descriptors and 13 kinds of molecular fingerprints for small molecules, 9920 protein descriptors based on protein sequences and six types of generalized scale-based descriptors for proteochemometric modeling, more than 6000 DNA descriptors from nucleotide sequences and six types of interaction descriptors using three different combining strategies. Moreover, this package realized five similarity calculation methods and four powerful clustering algorithms as well as several useful auxiliary tools, which aims at building an integrated analysis pipeline for data acquisition, data checking, descriptor calculation and data modeling. Conclusion BioMedR provides a comprehensive and uniform R package to link up different representations of molecular objects with each other and will benefit cheminformatics/bioinformatics and other biomedical users. It is available at: https://CRAN.R-project.org/package=BioMedR and https://github.com/wind22zhu/BioMedR/.

Download Full-text

Data Mining algorithms in search of effective conditions for conducting chemical reactions

Herald of Tver State University Series Applied Mathematics ◽

10.26456/vtpmk625 ◽

2021 ◽

pp. 29-42

Author(s):

Владимир Арнольдович Биллиг ◽

Николай Васильевич Звягинцев

Keyword(s):

Data Mining ◽

Chemical Reactions ◽

Software Package ◽

Maximum Amount ◽

Minimal Cost ◽

Useful Knowledge ◽

Data Mining Algorithms ◽

Practical Information ◽

Mining Algorithms ◽

Effective Conditions

В настоящее время накоплено значительное количество экспериментальных данных, фиксирующих процесс протекания химических реакций. Анализ этих данных комплексом алгоритмов Data Mining дает важную практическую информацию для поиска эффективных условий проведения реакций, при которых получается максимальное количество целевого продукта при минимальных затратах. В данной работе на примере работы с базой, содержащей данные о протекании реакции карбонилирования различных олефинов, показано, как разработанный нами программный комплекс позволяет извлечь полезные знания, способствующие повышению эффективности химических реакций. At present, a significant amount of experimental data has been accumulated, recording the process of the occurrence of chemical reactions. Analysis of these data by a set of Data Mining algorithms provides important practical information for finding effective conditions for carrying out reactions, at which the maximum amount of the target product is obtained at minimal cost. In this paper, using the example of working with a database containing data on the course of the carbonylation reaction of various olefins, it is shown how the software package developed by us allows us to extract useful knowledge that contributes to an increase in the efficiency of chemical reactions.

Download Full-text

Modelling of e-Governance Framework for Mining Knowledge from Massive Grievance Redressal Data

International Journal of Advances in Applied Sciences ◽

10.11591/ijaas.v6.i1.pp32-41 ◽

2017 ◽

Vol 6 (1) ◽

pp. 32

Author(s):

Sangeetha G ◽

L. Manjunatha Rao

Keyword(s):

Data Mining ◽

Big Data ◽

Knowledge Discovery ◽

Discussion Forums ◽

Discovery Process ◽

Web Based ◽

Governance Framework ◽

Data Mining Algorithms ◽

Massive Number ◽

Mining Algorithms

With the massive proliferation of online applications for the citizens with abundant resources, there is a tremendous hike in usage of e-governance platforms. Right from entrepreneur, players, politicians, students, or anyone who are highly depending on web-based grievance redressal networking sites, which generates loads of massive grievance data that are not only challenging but also highly impossible to understand. The prime reason behind this is grievance data is massive in size and they are highly unstructured. Because of this fact, the proposed system attempts to understand the possibility of performing knowledge discovery process from grievance Data using conventional data mining algorithms. Designed in Java considering massive number of online e-governance framework from civilian’s grievance discussion forums, the proposed system evaluates the effectiveness of performing datamining for Big data.

Download Full-text

Migrating From Data Mining to Big Data Mining

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i3.4.14667 ◽

2018 ◽

Vol 7 (3.4) ◽

pp. 13

Author(s):

Gourav Bathla ◽

Himanshu Aggarwal ◽

Rinkle Rani

Keyword(s):

Data Mining ◽

Big Data ◽

Response Time ◽

Large Scale ◽

Naive Bayes ◽

Naïve Bayes ◽

Data Mining Algorithm ◽

Big Data Mining ◽

Data Mining Algorithms ◽

Mining Algorithms

Data mining is one of the most researched fields in computer science. Several researches have been carried out to extract and analyse important information from raw data. Traditional data mining algorithms like classification, clustering and statistical analysis can process small scale of data with great efficiency and accuracy. Social networking interactions, business transactions and other communications result in Big data. It is large scale of data which is not in competency for traditional data mining techniques. It is observed that traditional data mining algorithms are not capable for storage and processing of large scale of data. If some algorithms are capable, then response time is very high. Big data have hidden information, if that is analysed in intelligent manner can be highly beneficial for business organizations. In this paper, we have analysed the advancement from traditional data mining algorithms to Big data mining algorithms. Applications of traditional data mining algorithms can be straight forward incorporated in Big data mining algorithm. Several studies have analysed traditional data mining with Big data mining, but very few have analysed most important algortihsm within one research work, which is the core motive of our paper. Readers can easily observe the difference between these algorthithms with pros and cons. Mathemtics concepts are applied in data mining algorithms. Means and Euclidean distance calculation in Kmeans, Vectors application and margin in SVM and Bayes therorem, conditional probability in Naïve Bayes algorithm are real examples. Classification and clustering are the most important applications of data mining. In this paper, Kmeans, SVM and Naïve Bayes algorithms are analysed in detail to observe the accuracy and response time both on concept and empirical perspective. Hadoop, Mapreduce etc. Big data technologies are used for implementing Big data mining algorithms. Performace evaluation metrics like speedup, scaleup and response time are used to compare traditional mining with Big data mining.

Download Full-text

De-Identification of Health Data in Big Data using a Novel Bio-Inspired Apoptosis Algorithm

International Journal of Organizational and Collective Intelligence ◽

10.4018/ijoci.2015070101 ◽

2015 ◽

Vol 5 (3) ◽

pp. 1-15

Author(s):

Amine Rahmani ◽

Abdelmalek Amine ◽

Reda Mohamed Hamou

Keyword(s):

Big Data ◽

New Technologies ◽

Sensitive Information ◽

Sensitive Data ◽

Privacy Concerns ◽

Shared Data ◽

Data Mining Algorithms ◽

Identity Disclosure ◽

Using Data ◽

Mining Algorithms

In the last years, with the emergence of new technologies in the image of big data, the privacy concerns had grown widely. However, big data means the dematerialization of the data. The classical security solutions are no longer efficient in this case. Nowadays, sharing the data is much easier as well as saying hello. The amount of shared data over the web keeps growing from day to another which creates a wide gap between the purpose of sharing data and the fact that these last contain sensitive information. For that, the researches turned their attention to new issues and domains in order to minimize this gap. In other way, they intended to ensure a good utility of data by preserving its meaning while hiding sensitive information to prevent identity disclosure. Many techniques had been used for that. Some of it is mathematical and other ones using data mining algorithms. This paper deals with the problem of hiding sensitive data in shared structured medical data using a new bio-inspired algorithm from the natural phenomena of apoptosis cells in human body.

Download Full-text

Data Mining in Proteomics Using Grid Computing

Handbook of Research on Computational Grid Technologies for Life Sciences, Biomedicine, and Healthcare ◽

10.4018/978-1-60566-374-6.ch013 ◽

2011 ◽

pp. 245-267

Author(s):

Fotis Psomopoulos ◽

Pericles Mitkas

Keyword(s):

Data Mining ◽

Data Retrieval ◽

Protein Classification ◽

Classification Problems ◽

Proteomics Data ◽

Grid Environment ◽

Useful Knowledge ◽

Domain Specific ◽

Data Mining Algorithms ◽

Mining Algorithms

The scope of this chapter is the presentation of Data Mining techniques for knowledge extraction in proteomics, taking into account both the particular features of most proteomics issues (such as data retrieval and system complexity), and the opportunities and constraints found in a Grid environment. The chapter discusses the way new and potentially useful knowledge can be extracted from proteomics data, utilizing Grid resources in a transparent way. Protein classification is introduced as a current research issue in proteomics, which also demonstrates most of the domain – specific traits. An overview of common and custom-made Data Mining algorithms is provided, with emphasis on the specific needs of protein classification problems. A unified methodology is presented for complex Data Mining processes on the Grid, highlighting the different application types and the benefits and drawbacks in each case. Finally, the methodology is validated through real-world case studies, deployed over the EGEE grid environment.

Download Full-text

Data Mining for the Internet of Things: Literature Review and Challenges

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.37122 ◽

2021 ◽

Vol 9 (VII) ◽

pp. 3351-3362

Author(s):

Dr. Mohd Zuber

Keyword(s):

Data Mining ◽

Big Data ◽

Internet Of Things ◽

The Internet ◽

Mining System ◽

Research Issues ◽

Data Mining Algorithms ◽

Huge Data ◽

Mining Algorithms ◽

The Internet Of Things

The huge data generate by the Internet of Things (IOT) are measured of high business worth, and data mining algorithms can be applied to IOT to take out hidden information from data. In this paper, we give a methodical way to review data mining in knowledge, technique and application view, together with classification, clustering, association analysis and time series analysis, outlier analysis. And the latest application luggage is also surveyed. As more and more devices connected to IOT, huge volume of data should be analyzed, the latest algorithms should be customized to apply to big data. We reviewed these algorithms and discussed challenges and open research issues. At last a suggested big data mining system is proposed.

Download Full-text