Classification of iron oxide aerosols by a single particle soot photometer using supervised machine learning

Abstract. Single particle soot photometers (SP2) use laser-induced incandescence to detect aerosols on a single particle basis. SP2s that have been modified to provide greater spectral contrast between their narrow and broad-band incandescent detectors have previously been used to characterize both refractory black carbon (rBC) and light-absorbing metallic aerosols, including iron oxides (FeOx). However, single particles cannot be unambiguously identified from their incandescent peak height (a function of particle mass) and color ratio (a measure of blackbody temperature) alone. Machine learning offers a promising approach for improving the classification of these aerosols. Here we explore the advantages and limitations of classifying single particle signals obtained with a modified SP2 using a supervised machine learning algorithm. Laboratory samples of different aerosols that incandesce in the SP2 (fullerene soot, mineral dust, volcanic ash, coal fly ash, Fe2O3, and Fe3O4) were used to train a random forest algorithm. The trained algorithm was then applied to test data sets of laboratory samples and atmospheric aerosols. This method provides a systematic approach for classifying incandescent aerosols by providing a score, or conditional probability, that a particle is likely to belong to a particular aerosol class (rBC, FeOx, etc.) given its observed single particle features. We consider two alternative approaches for identifying aerosols in mixed populations based on their single particle SP2 response: one with specific class labels for each species sampled, and one with three broader classes (rBC, anthropogenic FeOx, and dust-like) for particles with similar SP2 responses. 
Predictions of the most likely particle class (the one with the highest mean probability) based on applying the trained random forest algorithm to the single particle features for test data sets comprising examples of each class are compared with the true class for those particles to estimate generalization performance. While the specific class approach performed well for rBC and Fe3O4 (≥99 % of these aerosols are correctly identified), its classification of other aerosol types is significantly worse (only 47 %–66 % of other particles are correctly identified). Using the broader class approach, we find a classification accuracy of 99 % for FeOx samples measured in the laboratory. The method allows for classification of FeOx as anthropogenic or dust-like for aerosols with effective spherical diameters from 170 to >1200 nm. The misidentification of both dust-like aerosols and rBC as anthropogenic FeOx is small, with <3 % of the dust-like aerosols and <0.1 % of rBC misidentified as FeOx for the broader class case. When applying this method to atmospheric observations taken in Boulder, CO, a clear mode consistent with FeOx was observed, distinct from dust-like aerosols.
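
The abstract above picks the most likely class as the one with the highest mean probability. A minimal sketch of that step, assuming the forest's score for a class is the average of the per-tree class probabilities; the per-tree values, the four-tree forest, and the function names are hypothetical and are not taken from the paper:

```python
from statistics import mean

# Hypothetical per-tree class probabilities for ONE particle (values are
# invented for illustration; a real forest would have many more trees).
CLASSES = ["rBC", "anthropogenic FeOx", "dust-like"]
tree_probabilities = [
    [0.10, 0.80, 0.10],
    [0.05, 0.70, 0.25],
    [0.20, 0.60, 0.20],
    [0.05, 0.90, 0.05],
]


def mean_class_probabilities(per_tree):
    """Average each class's probability over all trees in the forest."""
    return [mean(column) for column in zip(*per_tree)]


def most_likely_class(per_tree, classes):
    """Return (label, score) for the class with the highest mean probability."""
    scores = mean_class_probabilities(per_tree)
    best = max(range(len(classes)), key=lambda i: scores[i])
    return classes[best], scores[best]


label, score = most_likely_class(tree_probabilities, CLASSES)
print(label, round(score, 3))
```

Keeping the score alongside the label lets low-confidence particles be flagged rather than forced into a class.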

Comparations of Supervised Machine Learning Techniques in Predicting the Classification of the Household’s Welfare Status

Journal Pekommas ◽

10.30818/jpkm.2019.2040105 ◽

2019 ◽

Vol 4 (1) ◽

pp. 43

Author(s):

Nfn Nofriani

Keyword(s):

Machine Learning ◽

Random Forest ◽

Social Assistance ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Support Vector ◽

Random Forest Algorithm ◽

K Nearest Neighbor ◽

Learning Techniques

Poverty has been a major problem for most countries around the world, including Indonesia. One approach to eradicating poverty is the equitable distribution of social assistance to target households based on the Integrated Database of social assistance. This study compared several well-known supervised machine learning techniques, namely the Naïve Bayes Classifier, Support Vector Machines, K-Nearest Neighbor Classification, the C4.5 Algorithm, and the Random Forest Algorithm, for predicting household welfare status classification, using the Integrated Database as a case study. The main objective of this study was to choose the best supervised machine learning approach for predicting the classification of a household’s welfare status based on attributes in the Integrated Database. The results showed that the Random Forest Algorithm was the best.

Classification of iron oxide aerosols by a single particle soot photometer using supervised machine learning

10.5194/amt-2019-106 ◽

2019 ◽

Author(s):

Kara D. Lamb

Keyword(s):

Machine Learning ◽

Atmospheric Aerosols ◽

Single Particle ◽

Learning Algorithm ◽

Coal Fly Ash ◽

Peak Height ◽

Supervised Machine Learning ◽

Specific Class ◽

Color Ratio

Abstract. Single particle soot photometers (SP2) use laser-induced incandescence to detect aerosols on a single particle basis. Both refractory black carbon (rBC) and other light-absorbing metallic aerosols, including iron oxides (FeOx), have been characterized by the SP2, but single particles cannot be unambiguously identified from their incandescent peak height (a function of particle mass) and color ratio (a measure of blackbody temperature) alone. Machine learning offers a promising approach to improving the classification of these aerosols. Here we explore the advantages and limitations of classifying single particle signals obtained with the SP2 using a supervised learning algorithm. Laboratory samples of different aerosols that incandesce in the SP2 (fullerene soot, mineral dust, volcanic ash, coal fly ash, Fe2O3, and Fe3O4) were used to train a random forest algorithm. The trained algorithm was then applied to test data sets of laboratory samples and atmospheric aerosols. This method provides a systematic approach for classifying incandescent aerosols by providing a score, or conditional probability, that a particle is likely to belong to a particular aerosol class (rBC, FeOx, etc.) given its observed single-particle features. We consider two alternative approaches for identifying aerosols in mixed populations: one with specific class labels for each species sampled, and one with three broader classes for aerosols with similar properties. While the specific class approach performs well for rBC and Fe3O4 (≥99 % of these aerosols are correctly identified), its classification of other aerosol types is significantly worse (only 47 %–66 % of other particles are correctly identified). Using the broader class approach, we find a classification accuracy of 99 % for FeOx samples measured in the laboratory. The method allows for classification of FeOx as anthropogenic or dust-like for aerosols with effective spherical diameters from 170 to >1200 nm.
The misidentification of both dust-like aerosols and rBC as anthropogenic FeOx is small, with

Referee comment to "Classification of iron oxide aerosols by a single particle soot photometer using supervised machine learning" by Lamb

10.5194/amt-2019-106-rc1 ◽

2019 ◽

Author(s):

Anonymous

Keyword(s):

Machine Learning ◽

Iron Oxide ◽

Single Particle ◽

Supervised Machine Learning

Classification of Agriculture Farm Machinery Using Machine Learning and Internet of Things

Symmetry ◽

10.3390/sym13030403 ◽

2021 ◽

Vol 13 (3) ◽

pp. 403

Author(s):

Muhammad Waleed ◽

Tai-Won Um ◽

Tariq Kamal ◽

Syed Muhammad Usman

Keyword(s):

Machine Learning ◽

Random Forest ◽

Decision Tree ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Support Vector ◽

Farm Machinery ◽

Learning Techniques

In this paper, we apply multi-class supervised machine learning techniques to classify agricultural farm machinery. The classification of farm machinery is important when performing the automatic authentication of field activity in a remote setup. In the absence of a sound machine recognition system, there is every possibility of fraudulent activity taking place. To address this need, we classify the machinery using five machine learning techniques: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF) and Gradient Boosting (GB). To train the models, we use the vibration and tilt of the machinery, recorded using accelerometer and gyroscope sensors, respectively. The machinery included the leveler, rotavator and cultivator. The preliminary analysis of the collected data revealed that the farm machinery (when in operation) showed large variations in vibration and tilt but similar means. Additionally, vibration-based and tilt-based classifications of farm machinery each show good accuracy when used alone (with vibration performing slightly better than tilt). However, the accuracies improve further when tilt and vibration are used together. Furthermore, all five machine learning algorithms used for classification have an accuracy of more than 82%, with random forest performing best. Gradient boosting and random forest show slight over-fitting (about 9%), but both algorithms produce high testing accuracy. In terms of execution time, the decision tree takes the least time to train, while gradient boosting takes the most.

A Study of the Classification of Motor Imagery Signals using Machine Learning Tools

10.5121/csit.2021.112104 ◽

2021 ◽

Author(s):

Anam Hashmi ◽

Bilal Alam Khan ◽

Omar Farooq

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Wavelet Transform ◽

Random Forest ◽

Random Forest Algorithm ◽

Eeg Signals ◽

Relaxation State ◽

Wavelet Transform Analysis ◽

Imagined Movement

In this paper, we propose a system for classifying electroencephalography (EEG) signals associated with imagined movement of the right hand and with a relaxation state, using a machine learning algorithm, namely the Random Forest Algorithm. The EEG dataset used in this research was created by the University of Tübingen, Germany. EEG signals associated with the imagined movement of the right hand and the relaxation state were processed using wavelet transform analysis with the Daubechies orthogonal wavelet as the mother wavelet. After the wavelet transform analysis, eight features were extracted. Subsequently, a feature selection method based on the Random Forest Algorithm was employed to identify the best of the eight proposed features. The feature selection stage was followed by a classification stage in which eight different models, combining the features according to their importance, were constructed. The optimum classification performance of 85.41% was achieved with the Random Forest classifier. This research shows that this system for classifying motor movements can be used in a Brain Computer Interface (BCI) system to mentally control a robotic device or an exoskeleton.
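
One plausible reading of "eight different models combining the different features based on their importance" is a set of nested feature subsets: rank the eight features by importance, then train one model on the top 1, top 2, and so on up to all 8. The sketch below illustrates only that ranking and subset step. The feature names `f1` to `f8`, the importance values, and the nesting scheme itself are assumptions, since the abstract does not specify them:

```python
# Hypothetical importance scores for eight features (names and values are
# invented; the abstract does not list the actual features).
importances = {
    "f1": 0.05, "f2": 0.22, "f3": 0.10, "f4": 0.02,
    "f5": 0.31, "f6": 0.08, "f7": 0.15, "f8": 0.07,
}


def rank_features(scores):
    """Sort feature names from most to least important."""
    return sorted(scores, key=scores.get, reverse=True)


def nested_subsets(ranked):
    """Build top-1, top-2, ..., top-n feature subsets from a ranking."""
    return [ranked[:k] for k in range(1, len(ranked) + 1)]


ranked = rank_features(importances)
candidate_models = nested_subsets(ranked)
for features in candidate_models:
    print(len(features), features)
```

Each subset would then be used to train and evaluate a separate classifier, and the subset with the best held-out accuracy would be kept.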

Regional Mapping of Vineyards Using Machine Learning and LiDAR Data

International Journal of Applied Geospatial Research ◽

10.4018/ijagr.2020100101 ◽

2020 ◽

Vol 11 (4) ◽

pp. 1-22

Author(s):

Adriaan Jacobus Prins ◽

Adriaan van Niekerk

Keyword(s):

Machine Learning ◽

Random Forest ◽

Learning Algorithms ◽

Window Size ◽

Machine Learning Algorithms ◽

Surface Model ◽

Lidar Data ◽

Data Sets ◽

Random Forest Algorithm ◽

Spectral Mixing

This study evaluates the use of LiDAR data and machine learning algorithms for mapping vineyards. Vineyards are planted in rows spaced at various distances, which can cause spectral mixing within individual pixels and complicate image classification. Four resolutions were used for generating normalized digital surface model and intensity derivatives from the LiDAR data. In addition, texture measures with window sizes of 3x3 and 5x5 were generated from the LiDAR derivatives. The different combinations of the resolutions and window sizes resulted in eight data sets that were used as input to 11 machine learning algorithms. A larger window size was found to improve the overall accuracy for all the classifier–resolution combinations. The results showed that random forest with texture measures generated at a 5x5 window size outperformed the other experiments, regardless of the resolution used. The authors conclude that the random forest algorithm used on LiDAR derivatives with a resolution of 1.5 m and a window size of 5x5 is the recommended configuration for vineyard mapping using LiDAR data.

ANALYSIS OF SINGLE AND ENSEMBLE MACHINE LEARNING CLASSIFIERS FOR PHISHING ATTACKS DETECTION

International Journal of Computer Systems & Software Engineering ◽

10.15282/ijsecs.7.2.2021.5.0088 ◽

2021 ◽

Vol 7 (2) ◽

pp. 44-49

Author(s):

Oyelakin A. M ◽

Alimi O. M ◽

Mustapha I. O ◽

Ajiboye I. K

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Random Forest ◽

Decision Trees ◽

Random Forest Algorithm ◽

Ensemble Techniques ◽

Learning Classifiers ◽

Phishing Attacks ◽

Ensemble Machine Learning

Phishing attacks have been used in different ways to harvest the confidential information of unsuspecting internet users. To stem the tide of phishing-based attacks, several machine learning techniques have been proposed in the past. However, few studies have investigated both single and ensemble machine learning-based models for the classification of phishing attacks. This study carried out a performance analysis of selected single and ensemble machine learning (ML) classifiers in phishing classification. The focus is to investigate how these algorithms behave in the classification of phishing attacks in the chosen dataset. Logistic Regression and Decision Trees were chosen as single learning classifiers, while simple voting techniques and Random Forest were used as the ensemble machine learning algorithms. Accuracy, precision, recall and F1-score were used as performance metrics. The Logistic Regression algorithm recorded an accuracy of 0.86, a precision of 0.89, a recall of 0.87 and an F1-score of 0.81. Similarly, the Decision Trees classifier achieved an accuracy of 0.87, a precision of 0.83, a recall of 0.88 and an F1-score of 0.81. The voting ensemble achieved an accuracy of 0.92, a precision of 0.90, a recall of 0.92 and an F1-score of 0.92. The Random Forest algorithm recorded 0.98, 0.97, 0.98 and 0.97 for accuracy, precision, recall and F1-score, respectively. In the experimental analyses, the Random Forest algorithm outperformed the simple averaging classifier and the two single algorithms used for phishing URL detection. The study established that the ensemble techniques used in the experiments are more effective for phishing URL identification than the single classifiers.
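
For reference, the four metrics named in this abstract can be computed from the counts of a binary confusion matrix. The following is a minimal sketch using the standard definitions; the counts are hypothetical and are not the study's data, and the abstract does not say whether its reported scores were averaged over classes:

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives for the "phishing" class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts, invented for illustration only.
metrics = classification_metrics(tp=90, fp=10, fn=30, tn=70)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```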

Attribute-oriented Classification with Variable Importance using Random Forest Model

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b1297.0782s319 ◽

2019 ◽

Vol 8 (2S3) ◽

pp. 1630-1635

Keyword(s):

Machine Learning ◽

Random Forest ◽

Learning Algorithm ◽

Large Data ◽

Variable Importance ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Importance Measure ◽

Variable Importance Measure

In the present century, large data sets raise various classification issues, and the most commonly used machine learning algorithms fail to produce accurate classification results. Data mining techniques such as ensembles, which are made up of individual classifiers, are used for the classification process and to generate new data as well. Random forest is an ensemble supervised machine learning technique used in numerous machine learning applications, such as the classification of text and image data. It is popular because it provides useful measures such as the variable importance measure and the out-of-bag error. For viable learning and classification with a random forest, it is necessary to reduce the number of decision trees (pruning) in the forest. In this paper, we present a systematic overview of the random forest algorithm along with its application areas. In addition, we present a brief review of machine learning algorithms proposed in recent years. Animal classification is considered an important problem, and most recent studies classify animals using image datasets. However, very little work has been done on attribute-oriented animal classification, which poses many challenges in extracting accurate features. We took a real-time dataset from Kaggle to classify animals by collecting the most relevant features with the help of the variable importance measure, and compared the results with other popular machine learning models.

Detection of Spam Bots on Twitter using Machine Learning

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.f3691.049620 ◽

2020 ◽

Vol 9 (6) ◽

pp. 249-252

Keyword(s):

Machine Learning ◽

Random Forest ◽

Domain Knowledge ◽

Data Sets ◽

Bag Of Words ◽

Online Environment ◽

Random Forest Algorithm ◽

Research Papers ◽

Twitter Data ◽

Key Phrases

Twitter is a popular microblogging website used to share views, opinions, and updates. However, in recent times, an epidemic of spammer accounts has spread across the website, causing disorder and chaos among normal users. These spammers aim either to promote some commercial agenda or to disturb the peace in the online environment. Our project aims to analyze the tweets made by users and predict whether they might be spammers so that appropriate action can be taken against them. This is done using machine learning. The random forest algorithm has been modified by giving weighted importance to certain variables, assigned using domain knowledge obtained from exploratory analysis of various Twitter data sets and from scientific research papers. A bag of words has also been added to the algorithm in order to quickly identify the key phrases used by spam bots. By identifying the spammers, we can systematically report them and create a more peaceful online environment.
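
The bag-of-words idea mentioned in this abstract can be illustrated as a simple word-count table built over a set of tweets. This is a sketch only: the example tweets, the tokenization rule, and the function names are hypothetical, and the abstract does not describe how the authors built or weighted their bag of words:

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def bag_of_words(texts):
    """Count how often each token appears across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


# Hypothetical spam-like tweets, invented for illustration.
spam_tweets = [
    "Win a FREE phone now",
    "Free followers, click now",
    "Click here to win",
]
bag = bag_of_words(spam_tweets)
print(bag.most_common(4))
```

Tokens that are frequent in known spam but rare in ordinary tweets are candidates for the key phrases the abstract refers to.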

Reply to "'Referee comment to "Classification of iron oxide aerosols by a single particle soot photometer using supervised machine learning" by Lamb'"

10.5194/amt-2019-106-ac1 ◽

2019 ◽

Author(s):

Kara Lamb

Keyword(s):

Machine Learning ◽

Iron Oxide ◽

Single Particle ◽

Supervised Machine Learning
