Practical Web Spam Lifelong Machine Learning System with Automatic Adjustment to Current Lifecycle Phase

2019 ◽  
Vol 2019 ◽  
pp. 1-16 ◽  
Author(s):  
Marcin Luckner

Machine learning techniques are a standard approach in spam detection. Their quality depends on the quality of the learning set, and when the set is out of date, classification quality falls rapidly. The most popular public web spam dataset that can be used to train a spam detector, WEBSPAM-UK2007, is over ten years old. Therefore, there is a place for a lifelong machine learning system that can replace detectors based on a static learning set. In this paper, we propose a novel web spam recognition system. The system automatically rebuilds the learning set, using external data from spam traps and popular web services, to avoid classification based on outdated data. Thanks to a built-in automatic selection of the active classifier, the system quickly attains productive accuracy despite a limited learning set. A test on real data from Quora, Reddit, and Stack Overflow confirmed the high recognition quality: both the average accuracy and the F-measure reached 0.98 in semiautomatic mode and 0.96 in fully automatic mode.
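The abstract does not give implementation details for the selection mechanism. As a minimal sketch only, one way an "active classifier" could be chosen automatically each time the learning set is rebuilt is to score a small pool of scikit-learn estimators on a held-out validation split; the pool, function name, and data variables below are illustrative assumptions, not the authors' design.

```python
# Minimal sketch: pick the "active" classifier by validation accuracy
# whenever the learning set is rebuilt (pool and names are assumptions).
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

def select_active_classifier(X, y):
    """Train a small pool of classifiers and return the one that scores
    best on a held-out validation split of the current learning set."""
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)
    pool = [
        GaussianNB(),                                        # usable with very few samples
        LogisticRegression(max_iter=1000),
        RandomForestClassifier(n_estimators=100, random_state=0),
    ]
    scored = []
    for clf in pool:
        clf.fit(X_tr, y_tr)
        scored.append((accuracy_score(y_val, clf.predict(X_val)), clf))
    best_score, best_clf = max(scored, key=lambda t: t[0])
    return best_clf, best_score

# Hypothetical usage, e.g. after ingesting new examples from spam traps:
# active_clf, val_acc = select_active_classifier(X_current, y_current)
```

Re-running such a routine whenever new labelled examples arrive would let the active classifier track the current learning set, which is the general idea the abstract describes.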

Author(s):  
Dr. Vivek Waghmare

Rain prediction is one of the most challenging and uncertain tasks, and it has a profound effect on human society. Timely and accurate forecasting can significantly reduce human and financial losses. This study presents a collection of experiments that use conventional machine learning techniques to build rainfall prediction models from the weather data of a region. The comparative research focuses on three aspects: modeling inputs, modeling methods, and prioritization techniques. The results compare the test metrics of these machine learning methods and their reliability in predicting rain from weather data. The study seeks a distinctive and effective machine learning system for predicting rainfall. It experiments with different rainfall parameters from various regions in order to assess the efficiency and robustness of the model. Rainfall data are collected, and machine learning models are trained and tested on them. The monthly rainfall predictions obtained after training and testing are then compared to real data to verify the accuracy of the model. The results indicate that the model successfully predicts monthly rainfall and the related parameters.
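The abstract does not name the specific models or metrics; purely as an illustration of the comparison step, the sketch below trains two common regressors on a monthly-rainfall target and reports their test error. The weather features, the synthetic data, and the choice of models are assumptions, not the paper's setup.

```python
# Illustrative comparison of two regressors for monthly rainfall
# (features, target, and models are assumptions, not the paper's setup).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))            # columns: temperature, humidity, pressure
y = 30 + 12 * X[:, 1] - 5 * X[:, 0] + rng.normal(scale=4, size=500)  # rainfall (mm)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
for model in (LinearRegression(), RandomForestRegressor(n_estimators=200, random_state=0)):
    model.fit(X_tr, y_tr)
    mae = mean_absolute_error(y_te, model.predict(X_te))
    print(type(model).__name__, f"MAE = {mae:.2f} mm")
```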


Author(s):  
Feidu Akmel ◽  
Ermiyas Birihanu ◽  
Bahir Siraj

Software systems are software products or applications that support business domains such as manufacturing, aviation, health care, insurance, and so on. Software quality is a means of measuring how software is designed and how well it conforms to that design. Some of the variables we look at for software quality are correctness, product quality, scalability, completeness, and absence of bugs. However, the quality standard used by one organization differs from another, so it is better to apply software metrics to measure the quality of software. Attributes gathered from source code through software metrics can serve as input for a software defect predictor. Software defects are errors introduced by software developers and stakeholders. Finally, in this study we review the application of machine learning to software defect prediction as reported in previous research works.
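The study surveys prior work rather than prescribing an implementation. A minimal sketch of the underlying idea, a classifier trained on static code metrics to flag defect-prone modules, might look like the following; the metric columns and synthetic data are assumptions for illustration only.

```python
# Sketch: predict defect-proneness of a module from static code metrics.
# Metric columns and data are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
# columns: lines of code, cyclomatic complexity, number of methods, fan-out
X = rng.integers(low=1, high=500, size=(400, 4)).astype(float)
y = (X[:, 1] + 0.05 * X[:, 0] + rng.normal(scale=20, size=400) > 120).astype(int)  # 1 = defective

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)
clf = RandomForestClassifier(n_estimators=200, random_state=1).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te), target_names=["clean", "defective"]))
```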


1993 ◽  
Vol 18 (2-4) ◽  
pp. 209-220
Author(s):  
Michael Hadjimichael ◽  
Anita Wasilewska

We present here an application of Rough Set formalism to Machine Learning. The resulting Inductive Learning algorithm is described, and its application to a set of real data is examined. The data consists of a survey of voter preferences taken during the 1988 presidential election in the U.S.A. Results include an analysis of the predictive accuracy of the generated rules, and an analysis of the semantic content of the rules.
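As a brief illustration of the rough set machinery behind such rule induction (the attributes and records below are invented toy data, not the 1988 survey), the lower and upper approximations of a decision class can be computed from the indiscernibility classes induced by the condition attributes:

```python
# Sketch: lower/upper approximation of a decision class under an
# indiscernibility relation (toy records, not the election survey data).
from collections import defaultdict

# (condition attributes, decision) -- e.g. (age_group, income_group) -> vote
records = [
    (("young", "low"), "A"),
    (("young", "low"), "B"),     # indiscernible from the record above
    (("old", "high"), "A"),
    (("old", "low"), "B"),
]

# Group objects that share identical condition-attribute values.
classes = defaultdict(list)
for conditions, decision in records:
    classes[conditions].append(decision)

target = "A"
lower = [c for c, ds in classes.items() if all(d == target for d in ds)]   # certainly A
upper = [c for c, ds in classes.items() if any(d == target for d in ds)]   # possibly A
print("lower approximation:", lower)
print("upper approximation:", upper)
```

Rules generated from the lower approximation are certain, while those from the boundary between upper and lower approximations are only possible, which is what the predictive-accuracy analysis of the generated rules measures.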


Work ◽  
2021 ◽  
pp. 1-12
Author(s):  
Zhang Mengqi ◽  
Wang Xi ◽  
V.E. Sathishkumar ◽  
V. Sivakumar

BACKGROUND: Smart cities are growing steadily and bring together many information and communication technologies that are used to maximize the quality of services. Even though the smart city concept provides many valuable services, security management is still one of the major issues due to shared threats and activities. To overcome these problems, the security factors of smart cities should be analyzed continuously so that unwanted activities can be eliminated and the quality of services maintained. OBJECTIVES: To address this problem, active machine learning techniques are used to predict the quality of services in the smart city and to manage security-related issues. In this work, a deep reinforcement learning concept is used to learn the features of the smart city; the learning component captures the entire activity of the city. Within the smart city, information is gathered with the help of security robots called Cobalt robots. Newly incoming features are examined through a modular deep neural network. RESULTS: The system successfully predicts unwanted activity in smart cities by dividing the collected data into smaller subsets, which reduces complexity and improves the overall security management process. The efficiency of the system is evaluated through experimental analysis. CONCLUSION: This exploratory study was conducted with 200 obstacles placed in the smart city, and the introduced DRL with MDNN approach attains the best results for security maintenance.
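The abstract gives few implementation details. One plausible reading of "dividing the collected data into smaller subsets" for a modular network is to partition the data and train one small model per partition; the sketch below illustrates that idea with k-means partitions and per-partition classifiers. This is entirely an assumption for illustration, not the authors' DRL-with-MDNN method.

```python
# Sketch of a modular scheme: partition the data, train one small model
# per partition, and route new samples to the model of their partition.
# Illustrative assumption, not the paper's DRL + MDNN design.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(600, 8))                    # sensor features from city robots (synthetic)
y = (X[:, 0] + X[:, 3] > 0).astype(int)          # 1 = suspicious activity (synthetic rule)

partitioner = KMeans(n_clusters=3, n_init=10, random_state=2).fit(X)
modules = {}
for k in range(3):
    mask = partitioner.labels_ == k
    modules[k] = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                               random_state=2).fit(X[mask], y[mask])

x_new = rng.normal(size=(1, 8))
k = int(partitioner.predict(x_new)[0])           # route to the matching module
print("module", k, "prediction:", modules[k].predict(x_new)[0])
```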


2021 ◽  
Vol 10 (7) ◽  
pp. 436
Author(s):  
Amerah Alghanim ◽  
Musfira Jilani ◽  
Michela Bertolotto ◽  
Gavin McArdle

Volunteered Geographic Information (VGI) is often collected by non-expert users. This raises concerns about the quality and veracity of such data. There has been much effort to understand and quantify the quality of VGI. Extrinsic measures, which compare VGI to authoritative data sources such as National Mapping Agencies, are common, but the cost and slow update frequency of such data hinder the task. On the other hand, intrinsic measures, which compare the data to heuristics or models built from the VGI data itself, are becoming increasingly popular. Supervised machine learning techniques are particularly suitable for intrinsic measures of quality, where they can infer and predict the properties of spatial data. In this article, we are interested in assessing the quality of semantic information, such as the road type, associated with data in OpenStreetMap (OSM). We have developed a machine learning approach which utilises new intrinsic input features collected from the VGI dataset. Specifically, using our proposed novel approach we obtained an average classification accuracy of 84.12%. This result outperforms existing techniques on the same semantic inference task. The trustworthiness of the data used for developing and training machine learning models is important. To address this issue, we have also developed a new trustworthiness measure using direct and indirect characteristics of OSM data, such as its edit history, along with an assessment of the users who contributed the data. An evaluation of the impact of data determined to be trustworthy within the machine learning model shows that the trusted data collected with the new approach improves the prediction accuracy of our machine learning technique. Specifically, our results demonstrate that the classification accuracy of our developed model is 87.75% when applied to a trusted dataset and 57.98% when applied to an untrusted dataset. Consequently, such results can be used to assess the quality of OSM and suggest improvements to the data set.
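The exact intrinsic features are described in the article itself. As a hedged sketch of the general approach only, a classifier can be trained to predict a road-type tag from geometric and topological properties of each way; the feature names and synthetic data below are assumptions, not the authors' feature set.

```python
# Sketch: infer a road-type label from intrinsic OSM-style features.
# Feature set and data are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
# columns: way length (m), number of nodes, number of connected ways, speed-limit tag present (0/1)
X = np.column_stack([
    rng.uniform(50, 5000, 800),
    rng.integers(2, 200, 800),
    rng.integers(1, 12, 800),
    rng.integers(0, 2, 800),
])
y = np.where(X[:, 0] > 2000, "primary",
             np.where(X[:, 2] > 6, "residential", "service"))   # synthetic labelling rule

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=3)
clf = RandomForestClassifier(n_estimators=300, random_state=3).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```

A trustworthiness score derived from edit history and contributor profiles could then be used to filter or weight the training ways before fitting such a model, which is the role the trusted/untrusted comparison plays in the article.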


2018 ◽  
Vol 27 (03) ◽  
pp. 1850011 ◽  
Author(s):  
Athanasios Tagaris ◽  
Dimitrios Kollias ◽  
Andreas Stafylopatis ◽  
Georgios Tagaris ◽  
Stefanos Kollias

Neurodegenerative disorders, such as Alzheimer's and Parkinson's, constitute a major factor in long-term disability and are an increasingly serious concern in developed countries. As there are, at present, no effective therapies, early diagnosis along with avoidance of misdiagnosis seem to be critical in ensuring a good quality of life for patients. In this sense, the adoption of computer-aided diagnosis tools can offer significant assistance to clinicians. In the present paper, we first provide a comprehensive record of the medical examinations relevant to those disorders. Then, a review is conducted concerning the use of machine learning techniques in supporting the diagnosis of neurodegenerative diseases, with reference to the medical datasets used. Special attention has been given to the field of deep learning. In addition, we announce the launch of a newly created dataset for Parkinson's disease, containing epidemiological, clinical, and imaging data, which will be publicly available to researchers for benchmarking purposes. To assess the potential of the new dataset, an experimental study in Parkinson's diagnosis is carried out, based on state-of-the-art deep neural network architectures, yielding very promising accuracy results.
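The specific architectures evaluated are described in the paper itself. Purely as a minimal sketch of the kind of image classifier such experiments rely on, a small convolutional network for two-class scan classification could look like the following; the architecture, input size, and random tensors standing in for scans are assumptions, not the network used in the study.

```python
# Minimal sketch of a two-class image classifier of the kind used for
# such experiments (architecture, input size, and data are assumptions).
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, n_classes)  # for 64x64 inputs

    def forward(self, x):
        x = self.features(x)               # (N, 32, 16, 16) for 64x64 inputs
        return self.classifier(x.flatten(1))

# One illustrative training step on random tensors standing in for scans.
model = SmallCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
images, labels = torch.randn(8, 1, 64, 64), torch.randint(0, 2, (8,))
loss = nn.CrossEntropyLoss()(model(images), labels)
loss.backward()
optimizer.step()
print("loss:", loss.item())
```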


Author(s):  
Chaudhari Shraddha

Activity recognition in humans is an active challenge that finds application in numerous fields such as medical health care, the military, manufacturing, assistive technologies, and gaming. Due to advancements in technology, the use of smartphones in daily life has become inevitable. The sensors in smartphones help us measure essential vital parameters, and these measurements enable us to monitor human activities, which we call human activity recognition. We have applied machine learning techniques on a publicly available dataset; the K-Nearest Neighbors and Random Forest classification algorithms are applied. In this paper, we have designed and implemented an automatic human activity recognition system that independently recognizes human actions. The system is able to recognize activities such as Laying, Sitting, Standing, Walking, Walking downstairs, and Walking upstairs. The results obtained show that the KNN and Random Forest algorithms give 90.22% and 92.70% overall accuracy, respectively, in detecting these activities.
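As a rough illustration of the classification step only, both algorithms can be trained and scored as below. The feature matrix here is random stand-in data, so the printed scores are near chance; on the real public dataset the features would be accelerometer and gyroscope statistics per window, and the reported accuracies are those quoted above.

```python
# Sketch: KNN vs. Random Forest on HAR-style feature vectors.
# Random data stands in for the public smartphone dataset, so the
# resulting scores are not meaningful; the structure is the point.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(4)
activities = ["Laying", "Sitting", "Standing", "Walking",
              "Walking downstairs", "Walking upstairs"]
X = rng.normal(size=(900, 20))           # per-window sensor features (stand-in)
y = rng.choice(activities, size=900)     # activity labels (stand-in)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=4)
for clf in (KNeighborsClassifier(n_neighbors=5),
            RandomForestClassifier(n_estimators=200, random_state=4)):
    clf.fit(X_tr, y_tr)
    print(type(clf).__name__, "accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```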


2021 ◽  
Vol 15 ◽  
Author(s):  
Jesús Leonardo López-Hernández ◽  
Israel González-Carrasco ◽  
José Luis López-Cuadrado ◽  
Belén Ruiz-Mezcua

Nowadays, the recognition of emotions in people with sensory disabilities still represents a challenge due to the difficulty of generalizing and modeling the set of brain signals. In recent years, the technology that has been used to study a person's behavior and emotions based on brain signals is the brain-computer interface (BCI). Although previous works have already proposed the classification of emotions in people with sensory disabilities using machine learning techniques, a model for recognizing emotions in people with visual disabilities has not yet been evaluated. Consequently, in this work, the authors present a twofold framework focused on people with visual disabilities. Firstly, auditory stimuli have been used, and a component for the acquisition and extraction of brain signals has been defined. Secondly, analysis techniques for the modeling of emotions have been developed, and machine learning models for the classification of emotions have been defined. Based on the results, the algorithm with the best performance in the validation is random forest (RF), with an accuracy of 85% and 88% in the classification of negative and positive emotions, respectively. According to the results, the framework is able to classify positive and negative emotions, but the experiments also show that the framework's performance depends on the number of features in the dataset and that the quality of the electroencephalogram (EEG) signals is a determining factor.
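The signal acquisition and feature extraction pipeline is described in the paper itself; as a hedged sketch of only the final classification stage, a random forest can be fit on per-trial EEG feature vectors such as channel band powers. The feature layout and data below are assumptions for illustration.

```python
# Sketch: binary emotion classification (negative vs. positive) with a
# random forest over EEG features; band-power layout and data are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
# e.g. 14 channels x 4 frequency bands = 56 band-power features per trial
X = rng.normal(size=(240, 56))
y = rng.integers(0, 2, size=240)          # 0 = negative emotion, 1 = positive emotion

clf = RandomForestClassifier(n_estimators=300, random_state=5)
scores = cross_val_score(clf, X, y, cv=5)
print("cross-validated accuracy:", scores.mean())
```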


Author(s):  
Gonzalo Vergara ◽  
Juan J. Carrasco ◽  
Jesus Martínez-Gómez ◽  
Manuel Domínguez ◽  
José A. Gámez ◽  
...  

The study of energy efficiency in buildings is an active field of research. Modeling and predicting energy-related magnitudes makes it possible to analyze electric power consumption and can yield economic benefits. In this study, classical time series analysis and machine learning techniques, introducing clustering in some models, are applied to predict active power in buildings. The real data acquired correspond to time, environmental, and electrical data from 30 buildings belonging to the University of León (Spain). Firstly, we segmented the buildings in terms of their energy consumption using principal component analysis. Afterwards, we applied state-of-the-art machine learning methods and compared them. Finally, we predicted daily electric power consumption profiles and compared them with actual data for different buildings. Our analysis shows that multilayer perceptrons have the lowest error, followed by support vector regression and clustered extreme learning machines. We also analyze daily load profiles on weekdays and weekends for different buildings.
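As a compact sketch of the two stages described, buildings can first be projected with PCA and grouped, and a multilayer perceptron can then be fit to predict active power from time and environmental features. The data shapes, features, and numbers below are placeholders, not the University of León measurements.

```python
# Sketch: (1) segment buildings by their consumption profiles with PCA and
# k-means, (2) predict active power with an MLP regressor.
# All data shapes and features are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(6)

# Stage 1: one average daily profile (24 hourly values) per building.
profiles = rng.normal(size=(30, 24))
components = PCA(n_components=2).fit_transform(profiles)
groups = KMeans(n_clusters=3, n_init=10, random_state=6).fit_predict(components)
print("building groups:", groups)

# Stage 2: hourly records (hour of day, outdoor temperature, weekday flag) -> active power (kW).
X = np.column_stack([rng.integers(0, 24, 2000),
                     rng.uniform(-5, 35, 2000),
                     rng.integers(0, 2, 2000)])
y = (50 + 3 * X[:, 2] + 0.8 * X[:, 1]
     + 2 * np.sin(X[:, 0] / 24 * 2 * np.pi) + rng.normal(0, 2, 2000))
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=6)
mlp = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=3000, random_state=6).fit(X_tr, y_tr)
print("MAE (kW):", mean_absolute_error(y_te, mlp.predict(X_te)))
```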

