Practical Web Spam Lifelong Machine Learning System with Automatic Adjustment to Current Lifecycle Phase

2019 ◽  
Vol 2019 ◽  
pp. 1-16 ◽  
Author(s):  
Marcin Luckner

Machine learning techniques are a standard approach in spam detection. Their quality depends on the quality of the learning set, and when the set is out of date, classification quality falls rapidly. The most popular public web spam dataset that can be used to train a spam detector, WEBSPAM-UK2007, is over ten years old. Therefore, there is a place for a lifelong machine learning system that can replace detectors based on a static learning set. In this paper, we propose a novel web spam recognition system. The system automatically rebuilds the learning set, using external data from spam traps and popular web services, to avoid classification based on outdated data. Thanks to a built-in automatic selection of the active classifier, the system quickly attains productive accuracy despite a limited learning set. A test on real data from Quora, Reddit, and Stack Overflow confirmed the high recognition quality: both the average accuracy and the F-measure reached 0.98 in semiautomatic mode and 0.96 in fully automatic mode.
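The abstract does not give implementation details for the selection mechanism. As a minimal sketch only, one way an "active classifier" could be chosen automatically each time the learning set is rebuilt is to score a small pool of scikit-learn estimators on a held-out validation split; the pool, function name, and data variables below are illustrative assumptions, not the authors' design.

```python
# Minimal sketch: pick the "active" classifier by validation accuracy
# whenever the learning set is rebuilt (pool and names are assumptions).
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

def select_active_classifier(X, y):
    """Train a small pool of classifiers and return the one that scores
    best on a held-out validation split of the current learning set."""
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)
    pool = [
        GaussianNB(),                                        # usable with very few samples
        LogisticRegression(max_iter=1000),
        RandomForestClassifier(n_estimators=100, random_state=0),
    ]
    scored = []
    for clf in pool:
        clf.fit(X_tr, y_tr)
        scored.append((accuracy_score(y_val, clf.predict(X_val)), clf))
    best_score, best_clf = max(scored, key=lambda t: t[0])
    return best_clf, best_score

# Hypothetical usage, e.g. after ingesting new examples from spam traps:
# active_clf, val_acc = select_active_classifier(X_current, y_current)
```

Re-running such a routine whenever new labelled examples arrive would let the active classifier track the current learning set, which is the general idea the abstract describes.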

Author(s):  
Dr. Vivek Waghmare

Rain prediction is one of the most challenging and uncertain tasks, and it has a profound effect on human society. Timely and accurate forecasting can significantly reduce human and financial losses. This study presents a collection of experiments that use conventional machine learning techniques to build rainfall prediction models from the weather data of a region. The comparative research focuses on three aspects: modeling inputs, modeling methods, and prioritization techniques. The results compare the test metrics of these machine learning methods and their reliability in predicting rain from weather data. The study seeks a distinctive and effective machine learning system for predicting rainfall. It experiments with different rainfall parameters from various regions in order to assess the efficiency and robustness of the model. Rainfall data are collected, and machine learning models are trained and tested on them. The monthly rainfall predictions obtained after training and testing are then compared to real data to verify the accuracy of the model. The results indicate that the model successfully predicts monthly rainfall and the related parameters.
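The abstract does not name the specific models or metrics; purely as an illustration of the comparison step, the sketch below trains two common regressors on a monthly-rainfall target and reports their test error. The weather features, the synthetic data, and the choice of models are assumptions, not the paper's setup.

```python
# Illustrative comparison of two regressors for monthly rainfall
# (features, target, and models are assumptions, not the paper's setup).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))            # columns: temperature, humidity, pressure
y = 30 + 12 * X[:, 1] - 5 * X[:, 0] + rng.normal(scale=4, size=500)  # rainfall (mm)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
for model in (LinearRegression(), RandomForestRegressor(n_estimators=200, random_state=0)):
    model.fit(X_tr, y_tr)
    mae = mean_absolute_error(y_te, model.predict(X_te))
    print(type(model).__name__, f"MAE = {mae:.2f} mm")
```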


Author(s):  
Feidu Akmel ◽  
Ermiyas Birihanu ◽  
Bahir Siraj

Software systems are software products or applications that support business domains such as manufacturing, aviation, health care, insurance, and so on. Software quality is a means of measuring how software is designed and how well it conforms to that design. Some of the variables we look at for software quality are correctness, product quality, scalability, completeness, and absence of bugs. However, the quality standard used by one organization differs from another, so it is better to apply software metrics to measure the quality of software. Attributes gathered from source code through software metrics can serve as input for a software defect predictor. Software defects are errors introduced by software developers and stakeholders. Finally, in this study we review the application of machine learning to software defect prediction as reported in previous research works.
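The study surveys prior work rather than prescribing an implementation. A minimal sketch of the underlying idea, a classifier trained on static code metrics to flag defect-prone modules, might look like the following; the metric columns and synthetic data are assumptions for illustration only.

```python
# Sketch: predict defect-proneness of a module from static code metrics.
# Metric columns and data are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
# columns: lines of code, cyclomatic complexity, number of methods, fan-out
X = rng.integers(low=1, high=500, size=(400, 4)).astype(float)
y = (X[:, 1] + 0.05 * X[:, 0] + rng.normal(scale=20, size=400) > 120).astype(int)  # 1 = defective

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)
clf = RandomForestClassifier(n_estimators=200, random_state=1).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te), target_names=["clean", "defective"]))
```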


1993 ◽  
Vol 18 (2-4) ◽  
pp. 209-220
Author(s):  
Michael Hadjimichael ◽  
Anita Wasilewska

We present here an application of Rough Set formalism to Machine Learning. The resulting Inductive Learning algorithm is described, and its application to a set of real data is examined. The data consists of a survey of voter preferences taken during the 1988 presidential election in the U.S.A. Results include an analysis of the predictive accuracy of the generated rules, and an analysis of the semantic content of the rules.
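As a brief illustration of the rough set machinery behind such rule induction (the attributes and records below are invented toy data, not the 1988 survey), the lower and upper approximations of a decision class can be computed from the indiscernibility classes induced by the condition attributes:

```python
# Sketch: lower/upper approximation of a decision class under an
# indiscernibility relation (toy records, not the election survey data).
from collections import defaultdict

# (condition attributes, decision) -- e.g. (age_group, income_group) -> vote
records = [
    (("young", "low"), "A"),
    (("young", "low"), "B"),     # indiscernible from the record above
    (("old", "high"), "A"),
    (("old", "low"), "B"),
]

# Group objects that share identical condition-attribute values.
classes = defaultdict(list)
for conditions, decision in records:
    classes[conditions].append(decision)

target = "A"
lower = [c for c, ds in classes.items() if all(d == target for d in ds)]   # certainly A
upper = [c for c, ds in classes.items() if any(d == target for d in ds)]   # possibly A
print("lower approximation:", lower)
print("upper approximation:", upper)
```

Rules generated from the lower approximation are certain, while those from the boundary between upper and lower approximations are only possible, which is what the predictive-accuracy analysis of the generated rules measures.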


Work ◽  
2021 ◽  
pp. 1-12
Author(s):  
Zhang Mengqi ◽  
Wang Xi ◽  
V.E. Sathishkumar ◽  
V. Sivakumar

BACKGROUND: Smart cities are growing steadily and bring together many information and communication technologies that are used to maximize the quality of services. Even though the smart city concept provides many valuable services, security management is still one of the major issues due to shared threats and activities. To overcome these problems, the security factors of smart cities should be analyzed continuously so that unwanted activities can be eliminated and the quality of services maintained. OBJECTIVES: To address this problem, active machine learning techniques are used to predict the quality of services in the smart city and to manage security-related issues. In this work, a deep reinforcement learning concept is used to learn the features of the smart city; the learning component captures the entire activity of the city. Within the smart city, information is gathered with the help of security robots called Cobalt robots. Newly incoming features are examined through a modular deep neural network. RESULTS: The system successfully predicts unwanted activity in smart cities by dividing the collected data into smaller subsets, which reduces complexity and improves the overall security management process. The efficiency of the system is evaluated through experimental analysis. CONCLUSION: This exploratory study was conducted with 200 obstacles placed in the smart city, and the introduced DRL with MDNN approach attains the best results for security maintenance.
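The abstract gives few implementation details. One plausible reading of "dividing the collected data into smaller subsets" for a modular network is to partition the data and train one small model per partition; the sketch below illustrates that idea with k-means partitions and per-partition classifiers. This is entirely an assumption for illustration, not the authors' DRL-with-MDNN method.

```python
# Sketch of a modular scheme: partition the data, train one small model
# per partition, and route new samples to the model of their partition.
# Illustrative assumption, not the paper's DRL + MDNN design.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(600, 8))                    # sensor features from city robots (synthetic)
y = (X[:, 0] + X[:, 3] > 0).astype(int)          # 1 = suspicious activity (synthetic rule)

partitioner = KMeans(n_clusters=3, n_init=10, random_state=2).fit(X)
modules = {}
for k in range(3):
    mask = partitioner.labels_ == k
    modules[k] = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                               random_state=2).fit(X[mask], y[mask])

x_new = rng.normal(size=(1, 8))
k = int(partitioner.predict(x_new)[0])           # route to the matching module
print("module", k, "prediction:", modules[k].predict(x_new)[0])
```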


2021 ◽  
Vol 10 (7) ◽  
pp. 436
Author(s):  
Amerah Alghanim ◽  
Musfira Jilani ◽  
Michela Bertolotto ◽  
Gavin McArdle

Volunteered Geographic Information (VGI) is often collected by non-expert users. This raises concerns about the quality and veracity of such data. There has been much effort to understand and quantify the quality of VGI. Extrinsic measures, which compare VGI to authoritative data sources such as National Mapping Agencies, are common, but the cost and slow update frequency of such data hinder the task. On the other hand, intrinsic measures, which compare the data to heuristics or models built from the VGI data itself, are becoming increasingly popular. Supervised machine learning techniques are particularly suitable for intrinsic measures of quality, where they can infer and predict the properties of spatial data. In this article, we are interested in assessing the quality of semantic information, such as the road type, associated with data in OpenStreetMap (OSM). We have developed a machine learning approach which utilises new intrinsic input features collected from the VGI dataset. Specifically, using our proposed novel approach we obtained an average classification accuracy of 84.12%. This result outperforms existing techniques on the same semantic inference task. The trustworthiness of the data used for developing and training machine learning models is important. To address this issue, we have also developed a new trustworthiness measure using direct and indirect characteristics of OSM data, such as its edit history, along with an assessment of the users who contributed the data. An evaluation of the impact of data determined to be trustworthy within the machine learning model shows that the trusted data collected with the new approach improves the prediction accuracy of our machine learning technique. Specifically, our results demonstrate that the classification accuracy of our developed model is 87.75% when applied to a trusted dataset and 57.98% when applied to an untrusted dataset. Consequently, such results can be used to assess the quality of OSM and suggest improvements to the data set.
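The exact intrinsic features are described in the article itself. As a hedged sketch of the general approach only, a classifier can be trained to predict a road-type tag from geometric and topological properties of each way; the feature names and synthetic data below are assumptions, not the authors' feature set.

```python
# Sketch: infer a road-type label from intrinsic OSM-style features.
# Feature set and data are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
# columns: way length (m), number of nodes, number of connected ways, speed-limit tag present (0/1)
X = np.column_stack([
    rng.uniform(50, 5000, 800),
    rng.integers(2, 200, 800),
    rng.integers(1, 12, 800),
    rng.integers(0, 2, 800),
])
y = np.where(X[:, 0] > 2000, "primary",
             np.where(X[:, 2] > 6, "residential", "service"))   # synthetic labelling rule

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=3)
clf = RandomForestClassifier(n_estimators=300, random_state=3).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```

A trustworthiness score derived from edit history and contributor profiles could then be used to filter or weight the training ways before fitting such a model, which is the role the trusted/untrusted comparison plays in the article.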


2018 ◽  
Vol 27 (03) ◽  
pp. 1850011 ◽  
Author(s):  
Athanasios Tagaris ◽  
Dimitrios Kollias ◽  
Andreas Stafylopatis ◽  
Georgios Tagaris ◽  
Stefanos Kollias

Neurodegenerative disorders, such as Alzheimer's and Parkinson's, constitute a major factor in long-term disability and are an increasingly serious concern in developed countries. As there are, at present, no effective therapies, early diagnosis along with avoidance of misdiagnosis seem to be critical in ensuring a good quality of life for patients. In this sense, the adoption of computer-aided diagnosis tools can offer significant assistance to clinicians. In the present paper, we first provide a comprehensive record of the medical examinations relevant to those disorders. Then, a review is conducted concerning the use of machine learning techniques in supporting the diagnosis of neurodegenerative diseases, with reference to the medical datasets used. Special attention has been given to the field of deep learning. In addition, we announce the launch of a newly created dataset for Parkinson's disease, containing epidemiological, clinical, and imaging data, which will be publicly available to researchers for benchmarking purposes. To assess the potential of the new dataset, an experimental study in Parkinson's diagnosis is carried out, based on state-of-the-art deep neural network architectures, yielding very promising accuracy results.
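The specific architectures evaluated are described in the paper itself. Purely as a minimal sketch of the kind of image classifier such experiments rely on, a small convolutional network for two-class scan classification could look like the following; the architecture, input size, and random tensors standing in for scans are assumptions, not the network used in the study.

```python
# Minimal sketch of a two-class image classifier of the kind used for
# such experiments (architecture, input size, and data are assumptions).
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, n_classes)  # for 64x64 inputs

    def forward(self, x):
        x = self.features(x)               # (N, 32, 16, 16) for 64x64 inputs
        return self.classifier(x.flatten(1))

# One illustrative training step on random tensors standing in for scans.
model = SmallCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
images, labels = torch.randn(8, 1, 64, 64), torch.randint(0, 2, (8,))
loss = nn.CrossEntropyLoss()(model(images), labels)
loss.backward()
optimizer.step()
print("loss:", loss.item())
```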


Author(s):  
Chaudhari Shraddha

Activity recognition in humans is an active challenge that finds application in numerous fields such as medical health care, the military, manufacturing, assistive technologies, and gaming. Due to advancements in technology, the use of smartphones in daily life has become inevitable. The sensors in smartphones help us measure essential vital parameters, and these measurements enable us to monitor human activities, which we call human activity recognition. We have applied machine learning techniques on a publicly available dataset; the K-Nearest Neighbors and Random Forest classification algorithms are applied. In this paper, we have designed and implemented an automatic human activity recognition system that independently recognizes human actions. The system is able to recognize activities such as Laying, Sitting, Standing, Walking, Walking downstairs, and Walking upstairs. The results obtained show that the KNN and Random Forest algorithms give 90.22% and 92.70% overall accuracy, respectively, in detecting these activities.
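As a rough illustration of the classification step only, both algorithms can be trained and scored as below. The feature matrix here is random stand-in data, so the printed scores are near chance; on the real public dataset the features would be accelerometer and gyroscope statistics per window, and the reported accuracies are those quoted above.

```python
# Sketch: KNN vs. Random Forest on HAR-style feature vectors.
# Random data stands in for the public smartphone dataset, so the
# resulting scores are not meaningful; the structure is the point.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(4)
activities = ["Laying", "Sitting", "Standing", "Walking",
              "Walking downstairs", "Walking upstairs"]
X = rng.normal(size=(900, 20))           # per-window sensor features (stand-in)
y = rng.choice(activities, size=900)     # activity labels (stand-in)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=4)
for clf in (KNeighborsClassifier(n_neighbors=5),
            RandomForestClassifier(n_estimators=200, random_state=4)):
    clf.fit(X_tr, y_tr)
    print(type(clf).__name__, "accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```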


2021 ◽  
Vol 15 ◽  
Author(s):  
Jesús Leonardo López-Hernández ◽  
Israel González-Carrasco ◽  
José Luis López-Cuadrado ◽  
Belén Ruiz-Mezcua

Nowadays, the recognition of emotions in people with sensory disabilities still represents a challenge due to the difficulty of generalizing and modeling the set of brain signals. In recent years, the technology that has been used to study a person's behavior and emotions based on brain signals is the brain-computer interface (BCI). Although previous works have already proposed the classification of emotions in people with sensory disabilities using machine learning techniques, a model for recognizing emotions in people with visual disabilities has not yet been evaluated. Consequently, in this work, the authors present a twofold framework focused on people with visual disabilities. Firstly, auditory stimuli have been used, and a component for the acquisition and extraction of brain signals has been defined. Secondly, analysis techniques for the modeling of emotions have been developed, and machine learning models for the classification of emotions have been defined. Based on the results, the algorithm with the best performance in the validation is random forest (RF), with an accuracy of 85% and 88% in the classification of negative and positive emotions, respectively. According to the results, the framework is able to classify positive and negative emotions, but the experiments also show that the framework's performance depends on the number of features in the dataset and that the quality of the electroencephalogram (EEG) signals is a determining factor.
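The signal acquisition and feature extraction pipeline is described in the paper itself; as a hedged sketch of only the final classification stage, a random forest can be fit on per-trial EEG feature vectors such as channel band powers. The feature layout and data below are assumptions for illustration.

```python
# Sketch: binary emotion classification (negative vs. positive) with a
# random forest over EEG features; band-power layout and data are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
# e.g. 14 channels x 4 frequency bands = 56 band-power features per trial
X = rng.normal(size=(240, 56))
y = rng.integers(0, 2, size=240)          # 0 = negative emotion, 1 = positive emotion

clf = RandomForestClassifier(n_estimators=300, random_state=5)
scores = cross_val_score(clf, X, y, cv=5)
print("cross-validated accuracy:", scores.mean())
```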


Author(s):  
Gonzalo Vergara ◽  
Juan J. Carrasco ◽  
Jesus Martínez-Gómez ◽  
Manuel Domínguez ◽  
José A. Gámez ◽  
...  

The study of energy efficiency in buildings is an active field of research. Modeling and predicting energy-related magnitudes makes it possible to analyze electric power consumption and can yield economic benefits. In this study, classical time series analysis and machine learning techniques, introducing clustering in some models, are applied to predict active power in buildings. The real data acquired correspond to time, environmental, and electrical data from 30 buildings belonging to the University of León (Spain). Firstly, we segmented the buildings in terms of their energy consumption using principal component analysis. Afterwards, we applied state-of-the-art machine learning methods and compared them. Finally, we predicted daily electric power consumption profiles and compared them with actual data for different buildings. Our analysis shows that multilayer perceptrons have the lowest error, followed by support vector regression and clustered extreme learning machines. We also analyze daily load profiles on weekdays and weekends for different buildings.
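As a compact sketch of the two stages described, buildings can first be projected with PCA and grouped, and a multilayer perceptron can then be fit to predict active power from time and environmental features. The data shapes, features, and numbers below are placeholders, not the University of León measurements.

```python
# Sketch: (1) segment buildings by their consumption profiles with PCA and
# k-means, (2) predict active power with an MLP regressor.
# All data shapes and features are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(6)

# Stage 1: one average daily profile (24 hourly values) per building.
profiles = rng.normal(size=(30, 24))
components = PCA(n_components=2).fit_transform(profiles)
groups = KMeans(n_clusters=3, n_init=10, random_state=6).fit_predict(components)
print("building groups:", groups)

# Stage 2: hourly records (hour of day, outdoor temperature, weekday flag) -> active power (kW).
X = np.column_stack([rng.integers(0, 24, 2000),
                     rng.uniform(-5, 35, 2000),
                     rng.integers(0, 2, 2000)])
y = (50 + 3 * X[:, 2] + 0.8 * X[:, 1]
     + 2 * np.sin(X[:, 0] / 24 * 2 * np.pi) + rng.normal(0, 2, 2000))
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=6)
mlp = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=3000, random_state=6).fit(X_tr, y_tr)
print("MAE (kW):", mean_absolute_error(y_te, mlp.predict(X_te)))
```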

