K-Means Algorithm for Clustering of Learners Performance Levels Using Machine Learning Techniques

Data Clustering is the process of grouping the objects in a way which is identical to the objects in the same group than in other classes. In this paper, the clustering of data is used as k-means to assess the output of students. Machine Learning is an area used in all systems. Machine learning is used in education, pattern recognition, sports, industrial applications. Its significance increases with the future of the students in the educational system. Data collection in education is very useful, as data volumes in the education system are growing each day. Higher education is relatively new, but due to the growing database its significance grows. There are several ways to assess the success of students. K-means is one of the best and most successful methods. The secret information in the database is extracted using data mining to increase the output of students. The decision tree is also a way to predict the success of the students. In recent years, educational institutions have the greatest challenges in increasing data growth and using it to increase efficiency, such that better decision-making can be made. Clustering is one of the most important methods used for the analysis of data sets. This trial uses cluster analyses according to their features for section students in various classes. Uncontrolled K-means algorithm is discussed. The mining of education data is used for the study of the knowledge available in the field of education in order to provide secret, significant and useful information. The proposed model considers K-means clustering model for analyzing learners performance. The outcomes and future of students can be strengthened with this support. The results show that the K-means cluster algorithm is useful for grouping students based on similar performance features.

Download Full-text

Improving Reliability Estimation for Individual Numeric Predictions: A Machine Learning Approach

INFORMS Journal on Computing ◽

10.1287/ijoc.2020.1019 ◽

2021 ◽

Author(s):

Gediminas Adomavicius ◽

Yaqiong Wang

Keyword(s):

Machine Learning ◽

General Purpose ◽

Reliability Estimation ◽

Machine Learning Techniques ◽

Data Sets ◽

Real World Data ◽

Learning Techniques ◽

Reliability Indicator ◽

Machine Learning Approach ◽

Prediction Reliability

Numerical predictive modeling is widely used in different application domains. Although many modeling techniques have been proposed, and a number of different aggregate accuracy metrics exist for evaluating the overall performance of predictive models, other important aspects, such as the reliability (or confidence and uncertainty) of individual predictions, have been underexplored. We propose to use estimated absolute prediction error as the indicator of individual prediction reliability, which has the benefits of being intuitive and providing highly interpretable information to decision makers, as well as allowing for more precise evaluation of reliability estimation quality. As importantly, the proposed reliability indicator allows the reframing of reliability estimation itself as a canonical numeric prediction problem, which makes the proposed approach general-purpose (i.e., it can work in conjunction with any outcome prediction model), alleviates the need for distributional assumptions, and enables the use of advanced, state-of-the-art machine learning techniques to learn individual prediction reliability patterns directly from data. Extensive experimental results on multiple real-world data sets show that the proposed machine learning-based approach can significantly improve individual prediction reliability estimation as compared with a number of baselines from prior work, especially in more complex predictive scenarios.

Download Full-text

Decision Tree: A Machine Learning for Intrusion Detection

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.f1234.0486s419 ◽

2019 ◽

Vol 8 (6S4) ◽

pp. 1126-1130

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Detection System ◽

Research Work ◽

Machine Learning Techniques ◽

Data Sets ◽

Legitimate User ◽

Learning Techniques ◽

Three Stages

The Intrusion is a major threat to unauthorized data or legal network using the legitimate user identity or any of the back doors and vulnerabilities in the network. IDS mechanisms are developed to detect the intrusions at various levels. The objective of the research work is to improve the Intrusion Detection System performance by applying machine learning techniques based on decision trees for detection and classification of attacks. The methodology adapted will process the datasets in three stages. The experimentation is conducted on KDDCUP99 data sets based on number of features. The Bayesian three modes are analyzed for different sized data sets based upon total number of attacks. The time consumed by the classifier to build the model is analyzed and the accuracy is done.

Download Full-text

Artificially Generated Training Data-sets for Supervised Machine Learning Techniques in Magnetic Resonance Imaging: An Example in Myocardial Segmentation

2019 Computing in Cardiology Conference (CinC) ◽

10.22489/cinc.2019.220 ◽

2019 ◽

Author(s):

Christos Xanthis ◽

Kostas Haris ◽

Dimitrios Filos ◽

Anthony Aletras

Keyword(s):

Magnetic Resonance Imaging ◽

Machine Learning ◽

Magnetic Resonance ◽

Training Data ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Data Sets ◽

Resonance Imaging ◽

Learning Techniques ◽

Myocardial Segmentation

Download Full-text

Agriculture Analysis Using Data Mining And Machine Learning Techniques

2019 5th International Conference on Advanced Computing & Communication Systems (ICACCS) ◽

10.1109/icaccs.2019.8728382 ◽

2019 ◽

Cited By ~ 3

Author(s):

C.N. Vanitha ◽

N. Archana ◽

R. Sowmiya

Keyword(s):

Machine Learning ◽

Data Mining ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Using Data

Download Full-text

Homeland Security Data Mining and Link Analysis

Encyclopedia of Data Warehousing and Mining ◽

10.4018/978-1-59140-557-3.ch107 ◽

2011 ◽

pp. 566-569

Author(s):

Bhavani Thuraisingham

Keyword(s):

Machine Learning ◽

Data Mining ◽

Homeland Security ◽

Link Analysis ◽

Machine Learning Techniques ◽

Data Mining Technique ◽

Learning Techniques ◽

Terrorist Groups ◽

Using Data ◽

Terrorist Events

Data mining is the process of posing queries to large quantities of data and extracting information often previously unknown using mathematical, statistical, and machine-learning techniques. Data mining has many applications in a number of areas, including marketing and sales, medicine, law, manufacturing, and, more recently, homeland security. Using data mining, one can uncover hidden dependencies between terrorist groups as well as possibly predict terrorist events based on past experience. One particular data-mining technique that is being investigated a great deal for homeland security is link analysis, where links are drawn between various nodes, possibly detecting some hidden links.

Download Full-text

Controlling a Simulated Robot Using Machine Learning Techniques

ASME 2010 World Conference on Innovative Virtual Reality ◽

10.1115/winvr2010-3705 ◽

2010 ◽

Author(s):

Jonathan Becker ◽

Aveek Purohit ◽

Zheng Sun

Keyword(s):

Machine Learning ◽

Reinforcement Learning ◽

Linear Regression ◽

Pid Controller ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Gaming Environment ◽

Using Data

USARSim group at NIST developed a simulated robot that operated in the Unreal Tournament 3 (UT3) gaming environment. They used a software PID controller to control the robot in UT3 worlds. Unfortunately, the PID controller did not work well, so NIST asked us to develop a better controller using machine learning techniques. In the process, we characterized the software PID controller and the robot’s behavior in UT3 worlds. Using data collected from our simulations, we compared different machine learning techniques including linear regression and reinforcement learning (RL). Finally, we implemented a RL based controller in Matlab and ran it in the UT3 environment via a TCP/IP link between Matlab and UT3.

Download Full-text

Malicious web domain identification using online credibility and performance data by considering the class imbalance issue

Industrial Management & Data Systems ◽

10.1108/imds-02-2018-0072 ◽

2019 ◽

Vol 119 (3) ◽

pp. 676-696 ◽

Cited By ~ 5

Author(s):

Zhongyi Hu ◽

Raymond Chiong ◽

Ilung Pranata ◽

Yukun Bao ◽

Yuqing Lin

Keyword(s):

Machine Learning ◽

Class Imbalance ◽

Performance Data ◽

Machine Learning Techniques ◽

Data Sets ◽

Real World Data ◽

Content Type ◽

Domain Identification ◽

Learning Techniques ◽

And Performance

Purpose Malicious web domain identification is of significant importance to the security protection of internet users. With online credibility and performance data, the purpose of this paper to investigate the use of machine learning techniques for malicious web domain identification by considering the class imbalance issue (i.e. there are more benign web domains than malicious ones). Design/methodology/approach The authors propose an integrated resampling approach to handle class imbalance by combining the synthetic minority oversampling technique (SMOTE) and particle swarm optimisation (PSO), a population-based meta-heuristic algorithm. The authors use the SMOTE for oversampling and PSO for undersampling. Findings By applying eight well-known machine learning classifiers, the proposed integrated resampling approach is comprehensively examined using several imbalanced web domain data sets with different imbalance ratios. Compared to five other well-known resampling approaches, experimental results confirm that the proposed approach is highly effective. Practical implications This study not only inspires the practical use of online credibility and performance data for identifying malicious web domains but also provides an effective resampling approach for handling the class imbalance issue in the area of malicious web domain identification. Originality/value Online credibility and performance data are applied to build malicious web domain identification models using machine learning techniques. An integrated resampling approach is proposed to address the class imbalance issue. The performance of the proposed approach is confirmed based on real-world data sets with different imbalance ratios.

Download Full-text

A fusion fuzzy model for detecting phony accounts in social networks

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i4.17345 ◽

2018 ◽

Vol 7 (4) ◽

pp. 2738

Author(s):

P. Srinivas Rao ◽

Jayadev Gyani ◽

G. Narsimha

Keyword(s):

Social Network ◽

Fuzzy Model ◽

Machine Learning Techniques ◽

Personal Attributes ◽

Data Sets ◽

Learning Techniques ◽

The Social ◽

Using Data ◽

Removal Technique ◽

Normal Rank

In online social network’s phony account detection is one of the major task among the ability of genuine user from forged user account. The fundamental objective of detection of phony account framework is to detect fake account and removal technique in Social network user sites. This work concentrates on detection of phony account in which it depends on normal basis framework, transformative Algorithms and fuzzy technique. Initially, the most essential attributes including personal attributes, comparability techniques and various real user review, tweets, or comments are extricated. A direct blend of these attributes demonstrates the significance of each reviews tweets comments etc. To compute closeness measure, a consolidated strategy in view of artiﬁcial honey bee state Algorithm and fuzzy technique are utilized. Second approach is proposed to alter the best weights of the normal user attributes utilizing the social network activities/transaction and inherited Algorithm. Finally, a normal rank rationale framework is utilized to calculate the ﬁnal scoring of normal user activities. The decision making of proposed approach to find phony account are variation with existing techniques user behavioral analysis using data sets and machine learning techniques such as crowdflower_sample and genuine_accounts_sample dataset of facebook and Twitter. The outcomes demonstrate that proposed strategy overcomes the previously mentioned strategies.

Download Full-text

Examination of Technical Efficiency of Public Hospital Associations Using Data Envelopment Analysis and Machine Learning Techniques

Ekonomik Yaklasim ◽

10.5455/ey.36122 ◽

2017 ◽

Vol 28 (104) ◽

pp. 81

Author(s):

Songül ÇINAROĞLU ◽

Onur BAŞER

Keyword(s):

Machine Learning ◽

Data Envelopment Analysis ◽

Technical Efficiency ◽

Public Hospital ◽

Machine Learning Techniques ◽

Data Envelopment ◽

Learning Techniques ◽

Using Data

Download Full-text

Application of multi-sensor unmanned aerial system for identification of hydrothermal alteration zones

10.5194/egusphere-egu2020-12546 ◽

2020 ◽

Author(s):

Yosoon Choi ◽

Jieun Baek ◽

Jangwon Suh ◽

Sung-Min Kim

Keyword(s):

Machine Learning ◽

Classification Accuracy ◽

Training Data ◽

Sensor Data ◽

Machine Learning Techniques ◽

Integrated Analysis ◽

Unmanned Aerial System ◽

Data Sets ◽

Learning Techniques ◽

Hydrothermal Alteration Zones

<p>In this study, we proposed a method to utilize a multi-sensor Unmanned Aerial System (UAS) for exploration of hydrothermal alteration zones. This study selected an area (10m &#215; 20m) composed mainly of the andesite and located on the coast, with wide outcrops and well-developed structural and mineralization elements. Multi-sensor (visible, multispectral, thermal, magnetic) data were acquired in the study area using UAS, and were studied using machine learning techniques. For utilizing the machine learning techniques, we applied the stratified random method to sample 1000 training data in the hydrothermal zone and 1000 training data in the non-hydrothermal zone identified through the field survey. The 2000 training data sets created for supervised learning were first classified into 1500 for training and 500 for testing. Then, 1500 for training were classified into 1200 for training and 300 for validation. The training and validation data for machine learning were generated in five sets to enable cross-validation. Five types of machine learning techniques were applied to the training data sets: k-Nearest Neighbors (k-NN), Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), and Deep Neural Network (DNN). As a result of integrated analysis of multi-sensor data using five types of machine learning techniques, RF and SVM techniques showed high classification accuracy of about 90%. Moreover, performing integrated analysis using multi-sensor data showed relatively higher classification accuracy in all five machine learning techniques than analyzing magnetic sensing data or single optical sensing data only.</p>

Download Full-text