INTERACTIVE MACHINE LEARNING TOOLS FOR DATA ANALYSIS

Machine learning tools are now used for data analysis in almost every area of human activity. This is driven by the large volumes of data that must be processed, for example, to predict specific events (an emergency, a customer contacting an organization’s technical support, a natural disaster, etc.) or to formulate recommendations for interacting with a certain group of people (personalized offers for a customer, a person’s reaction to advertising, etc.). The paper examines the capabilities of the Multitool analytical system, which is based on the decision tree machine learning method, for building predictive models suitable for practical data analysis problems. To this end, a series of ten experiments was conducted in which the results generated by the system were evaluated for reliability and robustness using five criteria: arithmetic mean, standard deviation, variance, probability, and F-measure. The experiments showed that Multitool, despite its limited functionality, can produce predictive models of sufficient quality for practical use.
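
As an illustration of three of the criteria named above, the following Python sketch computes the F-measure for one run and then the arithmetic mean, standard deviation, and variance over a series of runs. It is not Multitool's code: the function names, the binary labels, and the use of population (rather than sample) statistics are assumptions made for the example.

```python
from statistics import mean, pstdev, pvariance


def f_measure(y_true, y_pred, positive=1):
    """F-measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def summarize(scores):
    """Mean, standard deviation, and variance of per-experiment scores.

    Population statistics are an assumption; the abstract does not say
    which variant was used.
    """
    return {
        "mean": mean(scores),
        "std": pstdev(scores),
        "variance": pvariance(scores),
    }


# Hypothetical usage: one F-measure per experiment, then a summary.
runs = [
    ([1, 1, 0, 0], [1, 0, 1, 0]),
    ([1, 1, 0, 0], [1, 1, 0, 0]),
]
scores = [f_measure(t, p) for t, p in runs]
print(summarize(scores))
```

A low standard deviation across runs would indicate that the quality of the model is stable from one experiment to the next.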

Cluster Validation

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch038 ◽

2011 ◽

pp. 231-236

Author(s):

Ricardo Vilalta ◽

Tomasz Stepinski

Keyword(s):

Machine Learning ◽

Pattern Recognition ◽

Data Analysis ◽

Visual Inspection ◽

Automated Analysis ◽

Learning Tools ◽

Planetary Surfaces ◽

Martian Surface ◽

New Classification ◽

Domain Experts

Spacecraft orbiting a selected suite of planets and moons of our solar system are continuously sending long sequences of data back to Earth. The availability of such data provides an opportunity to apply tools from machine learning and pattern recognition to extract patterns that can help explain the geological processes shaping planetary surfaces. Because of the scientific community’s marked interest in this planet, we base our discussion on Mars, where there are presently three spacecraft in orbit (NASA’s Mars Odyssey Orbiter and Mars Reconnaissance Orbiter, and ESA’s Mars Express). Despite the abundance of available data describing the Martian surface, only a small fraction of it is being analyzed in detail, because current techniques for analyzing planetary surfaces rely on simple visual inspection and descriptive characterization of surface landforms (Wilhelms, 1990). The demand for automated analysis of the surface of Mars has prompted the use of machine learning and pattern recognition tools to generate geomorphic maps, which are thematic maps of landforms (or topographical expressions). Examples of landforms are craters, valley networks, hills, and basins. Machine learning can play a vital role in automating the process of geomorphic mapping. A learning system can be employed either to fully automate the discovery of meaningful landform classes using clustering techniques, or to predict the class of unlabeled landforms using classification techniques (after an expert has manually labeled a representative sample of the landforms). The impact of these techniques on the analysis of Mars topography can be of immense value because of the sheer size of the Martian surface that remains unmapped.
While it is now clear that machine learning can greatly help in automating the detailed analysis of Mars’ surface (Stepinski et al., 2007; Stepinski et al., 2006; Bue and Stepinski, 2006; Stepinski and Vilalta, 2005), an interesting problem arises when an automated data analysis has produced a novel classification of a specific site’s landforms. The problem lies in interpreting this new classification in comparison with traditionally derived classifications generated through visual inspection by domain experts. Is the new classification novel in all senses? Or is it only partially novel, with many landforms matching existing classifications? This article discusses how to assess the value of clusters generated by machine learning tools as applied to the analysis of Mars’ surface.
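
The comparison between an automatically generated clustering and an expert classification can be illustrated with a short Python sketch. The abstract does not name a specific validation measure, so the Rand index and the per-cluster majority overlap used here are stand-ins chosen for the example; the function names and landform labels are hypothetical.

```python
from collections import Counter
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two labelings agree.

    A pair counts as agreement when both labelings put the two items
    together, or both keep them apart. Requires at least two items.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        1
        for i, j in pairs
        if (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
    )
    return agree / len(pairs)


def cluster_overlap(clusters, expert):
    """For each cluster, the dominant expert class and its share of the cluster."""
    result = {}
    for c in set(clusters):
        members = [e for k, e in zip(clusters, expert) if k == c]
        cls, count = Counter(members).most_common(1)[0]
        result[c] = (cls, count / len(members))
    return result


# Hypothetical labels for six landforms.
clusters = [0, 0, 0, 1, 1, 1]
expert = ["crater", "crater", "hill", "hill", "hill", "hill"]
print(rand_index(clusters, expert))
print(cluster_overlap(clusters, expert))
```

A cluster whose members fall almost entirely within one expert class largely reproduces an existing classification, whereas a cluster spread across several expert classes is a candidate for a genuinely new landform grouping.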

Data analysis and machine learning tools in MATLAB and Python

Computational Learning Approaches to Data Analytics in Biomedical Applications ◽

10.1016/b978-0-12-814482-4.00009-7 ◽

2020 ◽

pp. 231-290

Author(s):

Khalid K. Al-jabery ◽

Tayo Obafemi-Ajayi ◽

Gayla R. Olbricht ◽

Donald C. Wunsch II

Keyword(s):

Machine Learning ◽

Data Analysis ◽

Learning Tools

Mapping organizational complexity: a network-based approach to behavior through machine learning tools

International Journal of Business and Applied Social Science ◽

10.33642/ijbass.v7n12p5 ◽

2021 ◽

pp. 34-51

Author(s):

Sabrina Bagnato ◽

Antonina Barreca ◽

Roberta Costantini ◽

Francesca Quintiliani

Keyword(s):

Machine Learning ◽

Data Analysis ◽

Organizational Behavior ◽

Professional Growth ◽

Univariate Analysis ◽

Learning Tools ◽

Organizational Complexity ◽

Multivariate Technique ◽

Analysis Methodology ◽

Development Paths

The current uncertain, dynamic scenario calls for a systemic perspective on organizational complexity and behavior. Our research contributes to the analysis of organizational complexity through multidimensional behavioral mapping. Our method uses machine learning tools to detect the interconnections between the different behaviors of a person in their operating context. First, the research project prototyped a model for reading organizational behavior, the related detection tool, and a data analysis methodology. It used machine learning tools and ended with a data visualization phase. We built the model by comparing benchmark theories from the literature with our field experience. The model was organized around 4 areas and 16 behaviors, which were the basis for selecting the indicators and the questionnaire items. The data analysis methodology aimed at detecting the interconnections between behaviors. We designed it by combining univariate analysis with a multivariate technique based on machine learning tools. This led to a high-resolution network map through three specific steps: (a) creating a multidimensional topology based on a Kohonen Map (a type of unsupervised artificial neural network) to represent behavioral relationships geometrically; (b) applying k-means clustering to identify which areas of the map share behavioral similarity or affinity factors; and (c) locating people and the identified clusters within the map. The research highlighted the validity of machine learning tools for detecting the multidimensionality of organizational behavior. We could therefore delineate the network of observed elements and visualize an otherwise unattainable complexity through multimedia and interactive reporting. In applied terms, the research consisted of designing and developing a prototype integrated with our LMS platform via a plugin.
Field experimentation confirmed the effectiveness of the method for creating professional growth and development paths. Furthermore, this experimentation allowed us to obtain significant data by applying our model to several sectors, namely pharmaceutical, telecommunications (TLC), banking, automotive, machinery, and services.
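
Step (b) above, k-means clustering over positions on the map, can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the 2-D coordinates stand in for Kohonen map positions, and the deterministic initialization from the first k points and the fixed iteration count are assumptions made for the example.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means on a list of coordinate tuples.

    Assumption: centroids start at the first k points, which keeps the
    example deterministic; real applications usually randomize this.
    """
    centroids = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assignment, centroids


# Hypothetical map positions of four respondents.
positions = [(0, 0), (0, 1), (10, 10), (10, 11)]
assignment, centroids = kmeans(positions, k=2)
print(assignment, centroids)
```

Each resulting cluster groups map positions that lie close together, which corresponds to the areas of behavioral similarity described in step (b); step (c) then amounts to looking up each person's position and its cluster.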

A Comparative Study of Different Machine Learning Tools

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v7i4.184190 ◽

2019 ◽

Vol 7 (4) ◽

pp. 184-190

Author(s):

Himani Maheshwari ◽

Pooja Goswami ◽

Isha Rana

Keyword(s):

Machine Learning ◽

Comparative Study ◽

Learning Tools

Bipolar Disorder and Oxidative Stress Injury Mechanism - Clinical Big Data Analysis Based on Machine Learning

Case Medical Research ◽

10.31525/ct1-nct03949218 ◽

2019 ◽

Author(s):

Keyword(s):

Oxidative Stress ◽

Machine Learning ◽

Bipolar Disorder ◽

Big Data ◽

Data Analysis ◽

Big Data Analysis ◽

Injury Mechanism ◽

Stress Injury ◽

Oxidative Stress Injury ◽

And Oxidative Stress

Machine Learning Based Predictive Action on Categorical Non-Sequential Data

Recent Advances in Computer Science and Communications ◽

10.2174/2213275912666190417150421 ◽

2020 ◽

Vol 13 (5) ◽

pp. 1020-1030

Author(s):

Pradeep S. ◽

Jagadish S. Kallimani

Keyword(s):

Machine Learning ◽

Data Analysis ◽

Categorical Data ◽

Numerical Data ◽

Processing Technique ◽

Machine Learning Algorithms ◽

Sequential Data ◽

Industry Standard ◽

Robust Model ◽

Future Work

Background: With the advent of data analysis and machine learning, there is a growing impetus to analyze historical data and build models from it. The data comes in numerous forms and shapes and brings an abundance of challenges. The most sought-after form of data for analysis is numerical data, which is quite manageable given the plethora of available algorithms and tools. Another form of data is categorical, which is subdivided into ordinal (ordered) and nominal (named, with no inherent order). This data can be broadly classified as sequential and non-sequential. Sequential data is easier to preprocess using algorithms. Objective: This paper deals with the challenge of applying machine learning algorithms to categorical data of a non-sequential nature. Methods: Applying several data analysis algorithms to such data produces biased results, which makes it impossible to generate a reliable predictive model. In this paper, we address this problem by walking through a handful of techniques that, during our research, helped us deal with large categorical data of a non-sequential nature. In subsequent sections, we discuss the possible implementable solutions and the shortfalls of these techniques. Results: The methods are applied to sample datasets available in the public domain, and the results with respect to classification accuracy are satisfactory. Conclusion: The best pre-processing technique we observed in our research is one-hot encoding, which breaks the categorical features down into binary features that are fed into an algorithm to predict the outcome. The example we used is not an abstract one but a real-time production services dataset with many complex variations of categorical features. Our future work includes creating a robust model on such data and deploying it in industry-standard applications.
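
One-hot encoding, the pre-processing technique singled out in the conclusion, can be illustrated with a small Python sketch. The function name, the column names, and the sample rows are hypothetical and are not taken from the paper's dataset.

```python
def one_hot_encode(rows, columns):
    """Turn categorical columns into binary indicator features.

    rows: list of dicts mapping column name to a categorical value.
    columns: the categorical columns to encode, in output order.
    Returns the generated feature names and one 0/1 vector per row.
    """
    # Collect the distinct values of each column in a stable (sorted) order.
    categories = {col: sorted({row[col] for row in rows}) for col in columns}
    feature_names = [
        f"{col}={value}" for col in columns for value in categories[col]
    ]
    encoded = [
        [1 if row[col] == value else 0
         for col in columns
         for value in categories[col]]
        for row in rows
    ]
    return feature_names, encoded


# Hypothetical service records with two nominal features.
records = [
    {"service": "web", "region": "eu"},
    {"service": "db", "region": "us"},
    {"service": "web", "region": "us"},
]
names, vectors = one_hot_encode(records, ["service", "region"])
print(names)
print(vectors)
```

Because every category becomes its own binary column, the encoding imposes no artificial order on nominal values, at the cost of one extra feature per distinct category.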
