Remote Sensing Based Binary Classification of Maize. Dealing with Residual Autocorrelation in Sparse Sample Situations

To discuss potential sustainability issues of expanding silage maize cultivation in Rhineland-Palatinate, spatially explicit monitoring is necessary. Publicly available statistical records are often not a sufficient basis for extensive research, especially on soil health, where risk factors like erosion and compaction depend on variables that are specific to every site and hard to generalize for larger administrative aggregates. The focus of this study is to apply established classification algorithms to estimate maize abundance for each independent pixel, while at the same time accounting for their spatial relationship. Therefore, two ways to incorporate spatial autocorrelation of neighboring pixels are combined with three different classification models. The performance of each of these modeling approaches is analyzed and discussed. Finally, one prediction approach is applied to the imagery, and the overall predicted acreage is compared to publicly available data. We show that Support Vector Machine (SVM) and Random Forest (RF) classification distinguished maize pixels reliably, with kappa values well above 0.9 in most cases. The Generalized Linear Model (GLM) performed substantially worse. Furthermore, Regression Kriging (RK) as an approach to integrate spatial autocorrelation into the prediction model is not suitable in use cases with millions of sparsely clustered training pixels. Gaussian Blur improves predictions slightly in these cases, but possibly only because it smooths out impurities in the reference data. The overall prediction with RF classification combined with Gaussian Blur performed well, with out-of-bag error rates of 0.5% in 2009 and 1.3% in 2016. Despite the low error rates, there is a discrepancy between the predicted acreage and the official records of 20% in 2009 and 27% in 2016.
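The kappa values quoted in the abstract measure agreement beyond chance. As a minimal sketch, assuming a 2x2 confusion matrix of maize versus non-maize pixels, Cohen's kappa can be computed as follows; the function name and the counts passed to it are illustrative and are not taken from the study.

```python
def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a 2x2 confusion matrix (e.g. maize vs. non-maize)."""
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: share of pixels on the diagonal.
    observed = (tp + tn) / n
    # Chance agreement from the predicted and reference marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one class occurs")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(cohen_kappa(tp=45, fp=5, fn=5, tn=45))
```

Unlike raw accuracy, kappa discounts the agreement a classifier would reach by chance, which matters when one class (here, non-maize) dominates the scene.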

Implications of Bioenergy Cropping for Soil: Remote Sensing Identification of Silage Maize Cultivation and Risk Assessment Concerning Soil Erosion and Compaction

Land ◽

10.3390/land10020128 ◽

2021 ◽

Vol 10 (2) ◽

pp. 128

Author(s):

Thorsten Ruf ◽

Mario Gilcher ◽

Thomas Udelhoven ◽

Christoph Emmerling

Keyword(s):

Remote Sensing ◽

Soil Erosion ◽

Cropping Systems ◽

Arable Land ◽

Soil Health ◽

Spatially Explicit ◽

Mountain Range ◽

Silage Maize ◽

Maize Cultivation ◽

Spatially Explicit Data

Energy transition strategies in Germany have led to an expansion of energy crop cultivation in the landscape, with silage maize as the most valuable feedstock. The changes in the traditional cropping systems, with increasing shares of maize, raised concerns about the sustainability of agricultural feedstock production regarding threats to soil health. However, spatially explicit data about silage maize cultivation are missing; thus, implications for soil cannot be estimated precisely. With this study, we first aimed to track the fields cultivated with maize based on remote sensing data. Second, available soil data were processed for this purpose to determine the site-specific vulnerability of the soils to erosion and compaction. The generated, spatially explicit data served as the basis for a differentiated analysis of the development of the agricultural biogas sector, the associated maize cultivation and its implications for soil health. In the study area, located in a low mountain range region in Western Germany, the number and capacity of biogas producing units increased by 25 installations and 10,163 kW from 2009 to 2016. The remote sensing-based classification approach showed that the maize cultivation area expanded by 16% from 7305 to 8447 hectares. Thus, maize cultivation accounted for about 20% of the arable land use, although with distinct local differences. About 30% of the maize cultivation took place on fields that show at least high potential for soil erosion, exceeding 25 t soil ha−1 a−1. Furthermore, about 10% of the maize cultivation took place on fields that pedogenetically show an elevated risk of soil compaction. To reach more sustainable cultivation systems of feedstock for anaerobic digestion, changes in cultivated crops and management strategies are urgently required, particularly in view of the first signs of climate change. The presented approach can be modified regionally to develop site-adapted, sustainable bioenergy cropping systems.
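The reported expansion of the maize area can be checked with simple arithmetic. The snippet below is only a sanity check on the two hectare figures given in the abstract; the helper name is an assumption made for illustration.

```python
def relative_change(old, new):
    """Relative change from old to new, as a fraction of old."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old


if __name__ == "__main__":
    # Maize area in hectares, 2009 and 2016, as stated in the abstract.
    growth = relative_change(7305, 8447)
    print(f"{growth:.1%}")
```

The result is consistent with the rounded 16% stated in the abstract.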

GESTURE RECOGNITION SYSTEM FOR NIGERIAN TRIBAL GREETING POSTURES USING SUPPORT VECTOR MACHINE

MALAYSIAN JOURNAL OF COMPUTING ◽

10.24191/mjoc.v5i2.10347 ◽

2020 ◽

Vol 5 (2) ◽

pp. 609

Author(s):

Segun Aina ◽

Kofoworola V. Sholesi ◽

Aderonke R. Lawal ◽

Samuel D. Okegbile ◽

Adeniran I. Oluwaranti

Keyword(s):

Support Vector Machine ◽

Gesture Recognition ◽

Recognition Rate ◽

Recognition Task ◽

Recognition System ◽

Human Interaction ◽

Support Vector ◽

System A ◽

Extraction Algorithm ◽

Gaussian Blur

This paper presents the application of Gaussian blur filters and Support Vector Machine (SVM) techniques for greeting recognition among the Yoruba tribe of Nigeria. Existing efforts have considered different recognition gestures. However, the recognition of tribal greeting postures or gestures in the Nigerian geographical space has not been studied before. Some cultural gestures are not correctly identified by people of the same tribe, let alone by people from other tribes, which leads to misinterpretation of meaning. Also, some cultural gestures are unknown to most people outside a tribe, which could hinder human interaction; hence there is a need to automate the recognition of Nigerian tribal greeting gestures. This work therefore develops a Gaussian Blur – SVM based system capable of recognizing the Yoruba tribe greeting postures for men and women. Videos of individuals performing various greeting gestures were collected and processed into image frames. The images were resized and a Gaussian blur filter was used to remove noise from them. A moment-based feature extraction algorithm was used to extract shape features that were passed as input to the SVM. The SVM was trained to recognize two Nigerian tribe greeting postures. To confirm the robustness of the system, 20%, 25% and 30% of the dataset acquired from the preprocessed images were used to test the system. The SVM achieved a recognition rate of 94%, indicating that the proposed method is efficient.
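The noise-removal step described above can be sketched as a separable Gaussian blur on a grayscale image stored as a list of rows. This is a minimal illustration under stated assumptions: the radius, sigma, and border handling (clamping) are choices made here, not parameters reported in the paper, and the moment-based feature extraction and SVM stages are not shown.

```python
import math


def gaussian_kernel(radius, sigma):
    """Normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    blurred = []
    for row in image:
        width = len(row)
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, weight in enumerate(kernel):
                src = min(max(x + k - radius, 0), width - 1)
                acc += weight * row[src]
            new_row.append(acc)
        blurred.append(new_row)
    return blurred


def transpose(image):
    """Swap rows and columns of a rectangular image."""
    return [list(column) for column in zip(*image)]


def gaussian_blur(image, radius=1, sigma=1.0):
    """Separable Gaussian blur: filter along rows, then along columns."""
    kernel = gaussian_kernel(radius, sigma)
    horizontal = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))
```

Applying the 1-D kernel twice (rows, then columns) gives the same result as a full 2-D Gaussian kernel at lower cost, which is why the separable form is the usual choice.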

Prediction of Incident Cancers in the Lifelines Population-Based Cohort

Cancers ◽

10.3390/cancers13092133 ◽

2021 ◽

Vol 13 (9) ◽

pp. 2133

Author(s):

Francisco O. Cortés-Ibañez ◽

Sunil Belur Nagaraj ◽

Ludo Cornelissen ◽

Gerjan J. Navis ◽

Bert van der Vegt ◽

...

Keyword(s):

Cancer Incidence ◽

Binary Classification ◽

Predictive Performance ◽

Population Based ◽

Support Vector ◽

Clinical Variables ◽

Incident Cancer ◽

History Of ◽

Diagnosis Of Cancer ◽

Auc Value

Cancer incidence is rising, and accurate prediction of incident cancers could be relevant to understanding and reducing cancer incidence. The aim of this study was to develop machine learning (ML) models that could predict an incident diagnosis of cancer. Participants without any history of cancer within the Lifelines population-based cohort were followed for a median of 7 years. Data were available for 116,188 cancer-free participants and 4232 incident cancer cases. At baseline, socioeconomic, lifestyle, and clinical variables were assessed. The main outcome was an incident cancer during follow-up (excluding skin cancer), based on linkage with the national pathology registry. The performance of three ML algorithms was evaluated using supervised binary classification to identify incident cancers among participants. Elastic net regularization and the Gini index were used for variable selection. An overall area under the receiver operating characteristic curve (AUC) <0.75 was obtained. The highest AUC value was for prostate cancer (random forest AUC = 0.82 (95% CI 0.77–0.87), logistic regression AUC = 0.81 (95% CI 0.76–0.86), and support vector machines AUC = 0.83 (95% CI 0.78–0.88)); age was the most important predictor in these models. Linear and non-linear ML algorithms including socioeconomic, lifestyle, and clinical variables produced a moderate predictive performance for incident cancers in the Lifelines cohort.
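The AUC values reported above can be read as the probability that a randomly chosen incident case receives a higher predicted score than a randomly chosen non-case. The sketch below computes the AUC directly from that pairwise definition; it is illustrative only, uses made-up scores, and its quadratic running time would be impractical for a cohort of this size, where a rank-based implementation would be used instead.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical labels (1 = incident cancer) and model scores.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```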

A comparison study: Support vector machines for binary classification in machine learning

2011 4th International Conference on Biomedical Engineering and Informatics (BMEI) ◽

10.1109/bmei.2011.6098517 ◽

2011 ◽

Cited By ~ 4

Author(s):

Wencai Zeng ◽

Jiong Jia ◽

Zhonglong Zheng ◽

Chenmao Xie ◽

Li Guo

Keyword(s):

Machine Learning ◽

Support Vector Machines ◽

Binary Classification ◽

Support Vector ◽

Comparison Study ◽

Vector Machines ◽

Study Support

Gaussian Processes for Classification: Mean-Field Algorithms

Neural Computation ◽

10.1162/089976600300014881 ◽

2000 ◽

Vol 12 (11) ◽

pp. 2655-2684 ◽

Cited By ~ 91

Author(s):

Manfred Opper ◽

Ole Winther

Keyword(s):

Support Vector Machines ◽

Gaussian Processes ◽

Disordered Systems ◽

Binary Classification ◽

Computational Cost ◽

Mean Field ◽

Strong Support ◽

Support Vector ◽

Vector Machines ◽

Leave One Out

We derive a mean-field algorithm for binary classification with Gaussian processes that is based on the TAP approach originally proposed in statistical physics of disordered systems. The theory also yields an approximate leave-one-out estimator for the generalization error, which is computed with no extra computational cost. We show that from the TAP approach, it is possible to derive both a simpler “naive” mean-field theory and support vector machines (SVMs) as limiting cases. For both mean-field algorithms and support vector machines, simulation results for three small benchmark data sets are presented. They show that one may get state-of-the-art performance by using the leave-one-out estimator for model selection and that the built-in leave-one-out estimators are extremely precise when compared to the exact leave-one-out estimate. The second result is taken as strong support for the internal consistency of the mean-field approach.
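For reference, the exact leave-one-out estimate that the built-in estimators are compared against can be sketched as follows: each sample is held out in turn, the classifier is refitted on the rest, and the held-out sample is tested. The sketch uses a 1-nearest-neighbour rule as a stand-in classifier purely for illustration; it is not the Gaussian process or TAP mean-field model from the paper, and all names are hypothetical.

```python
def loo_error(points, labels, classify):
    """Exact leave-one-out error: hold out each sample, refit, and test it."""
    mistakes = 0
    for i in range(len(points)):
        train_x = points[:i] + points[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if classify(train_x, train_y, points[i]) != labels[i]:
            mistakes += 1
    return mistakes / len(points)


def nearest_neighbour(train_x, train_y, query):
    """Stand-in classifier: label of the closest training point."""
    distances = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_x]
    return train_y[distances.index(min(distances))]


if __name__ == "__main__":
    # Toy one-dimensional data with labels -1 and +1.
    xs = [(0.0,), (0.1,), (1.0,), (1.1,)]
    ys = [-1, -1, 1, 1]
    print(loo_error(xs, ys, nearest_neighbour))
```

The exact estimate needs one refit per sample, which is what makes an approximate estimator obtained at no extra cost attractive for model selection.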

A Two-Stage Machine Learning Classification Approach to Identify Extremism in Arabic Opinions

International Journal of Advanced Trends in Computer Science and Engineering ◽

10.30534/ijatcse/2021/391022021 ◽

2021 ◽

Vol 10 (2) ◽

pp. 736-745

Keyword(s):

Machine Learning ◽

Binary Classification ◽

Feature Selection Method ◽

Support Vector ◽

Two Stage ◽

Machine Learning Classification ◽

Second Stage ◽

Testing Data ◽

Stage Classification ◽

Positive Dataset

The increased usage of the Internet and social networks has enabled people to express their views, which has attracted increasing attention lately. Sentiment Analysis (SA) techniques are used to determine the polarity of information, either positive or negative, toward a given topic, including opinions. In this research, we introduce a machine learning approach based on Support Vector Machine (SVM), Naïve Bayes (NB) and Random Forest (RF) classifiers to find and classify extreme opinions in Arabic reviews. To achieve this, a dataset of 1500 Arabic reviews was collected from Google Play Store. A two-stage classification process was then applied to classify the reviews. In the first stage, we built a binary classifier to separate positive from negative reviews. In the second stage, we applied a binary classification mechanism based on a set of proposed rules that distinguishes extreme positive from positive reviews, and extreme negative from negative reviews. Four major experiments were conducted with a total of 10 different sub-experiments to fulfill the two-stage process using different X-validation schemas and the Term Frequency-Inverse Document Frequency feature selection method. The results indicated that SVM was the best during the first-stage classification with 30% testing data, and NB was the best with 20% testing data. The results of the second-stage classification indicated that SVM scored better in identifying extreme positive reviews when dealing with the positive dataset, with an overall accuracy of 68.7%, and NB showed better accuracy in identifying extreme negative reviews when dealing with the negative dataset, with an overall accuracy of 72.8%.
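The Term Frequency-Inverse Document Frequency weighting mentioned above can be sketched in a few lines. This is the plain, unsmoothed textbook variant applied to already tokenized reviews; the abstract does not say which variant or tokenizer the authors used, so both are assumptions here.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))
    weighted = []
    for tokens in documents:
        counts = Counter(tokens)
        weights = {}
        for term, count in counts.items():
            tf = count / len(tokens)
            idf = math.log(n_docs / doc_freq[term])
            weights[term] = tf * idf
        weighted.append(weights)
    return weighted


if __name__ == "__main__":
    # Hypothetical tokenized reviews (English tokens for readability).
    reviews = [["good", "app"], ["bad", "app"]]
    for weights in tf_idf(reviews):
        print(weights)
```

Terms that occur in every review receive a weight of zero, so the classifier's input is dominated by terms that help tell reviews apart.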

Forecasting the risk at infractions: an ensemble comparison of machine learning approach

Industrial Management & Data Systems ◽

10.1108/imds-10-2020-0603 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Lei Li ◽

Desheng Wu

Keyword(s):

Machine Learning ◽

Prediction Models ◽

Short Term Memory ◽

Model Performance ◽

Large Data ◽

Support Vector ◽

Learning Approaches ◽

Content Type ◽

Day To Day Operations ◽

Prediction Approach

Purpose: The infraction of securities regulations (ISRs) by listed firms in their day-to-day operations and management has become a common problem. This paper proposes several machine learning approaches to forecast the risk of infractions by listed corporations, to help solve financial problems for which supervision is not effective and precise. Design/methodology/approach: The proposed research framework for forecasting ISRs includes data collection and cleaning, feature engineering, data split, prediction approach application and model performance evaluation. We select Logistic Regression, Naïve Bayes, Random Forest, Support Vector Machines, Artificial Neural Network and Long Short-Term Memory Networks (LSTMs) as ISR prediction models. Findings: The results show that the proposed models predict ISRs significantly better when prior infractions are included than when they are not, especially for large sample sets. The results also indicate that, when judging whether a company has infractions, attention should be paid to novel artificial intelligence methods, previous infractions of the company, and large data sets. Originality/value: The findings could be used to address the problem of identifying listed corporations' ISRs to a certain degree. Overall, the results elucidate the value of prior infractions of securities regulations. This shows the importance of including more data sources when constructing distress models, rather than only focusing on building increasingly complex models on the same data. This is also beneficial to the regulatory authorities.

Handling binary classification problems with a priority class by using Support Vector Machines

Applied Soft Computing ◽

10.1016/j.asoc.2017.08.023 ◽

2017 ◽

Vol 61 ◽

pp. 661-669 ◽

Cited By ~ 10

Author(s):

L. Gonzalez-Abril ◽

C. Angulo ◽

H. Nuñez ◽

Y. Leal

Keyword(s):

Support Vector Machines ◽

Binary Classification ◽

Support Vector ◽

Classification Problems ◽

Priority Class ◽

Vector Machines ◽

A Priority

Predicting ionizing radiation exposure using biochemically-inspired genomic machine learning

F1000Research ◽

10.12688/f1000research.14048.1 ◽

2018 ◽

Vol 7 ◽

pp. 233

Author(s):

Jonathan Z.L. Zhao ◽

Eliseos J. Mucaki ◽

Peter K. Rogan

Keyword(s):

Machine Learning ◽

Ionizing Radiation ◽

Radiation Exposure ◽

Large Scale ◽

Nearest Neighbor ◽

Error Rates ◽

Support Vector ◽

Dose Estimation ◽

Gene Signatures ◽

Ionizing Radiation Exposure

Background: Gene signatures derived from transcriptomic data using machine learning methods have shown promise for biodosimetry testing. These signatures may not be sufficiently robust for large scale testing, as their performance has not been adequately validated on external, independent datasets. The present study develops human and murine signatures with biochemically-inspired machine learning that are strictly validated using k-fold and traditional approaches. Methods: Gene Expression Omnibus (GEO) datasets of exposed human and murine lymphocytes were preprocessed via nearest neighbor imputation, and genes implicated in the literature as responsive to radiation exposure (n=998) were then ranked by Minimum Redundancy Maximum Relevance (mRMR) based on their expression. Optimal signatures were derived by backward, complete, and forward sequential feature selection using Support Vector Machines (SVM), and validated using k-fold or traditional validation on independent datasets. Results: The best human signatures we derived exhibit k-fold validation accuracies of up to 98% (DDB2, PRKDC, TPP2, PTPRE, and GADD45A) when validated over 209 samples and traditional validation accuracies of up to 92% (DDB2, CD8A, TALDO1, PCNA, EIF4G2, LCN2, CDKN1A, PRKCH, ENO1, and PPM1D) when validated over 85 samples. Some human signatures are specific enough to differentiate between chemotherapy and radiotherapy. Certain multi-class murine signatures have sufficient granularity in dose estimation to inform eligibility for cytokine therapy (assuming these signatures could be translated to humans). We compiled a list of the most frequently appearing genes in the top 20 human and mouse signatures. More frequently appearing genes among an ensemble of signatures may indicate greater impact of these genes on the performance of individual signatures. Several genes in the signatures we derived are present in previously proposed signatures. Conclusions: Gene signatures for ionizing radiation exposure derived by machine learning have low error rates in externally validated, independent datasets, and exhibit high specificity and granularity for dose estimation.
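The k-fold validation mentioned in the Methods can be sketched as an index splitter: the samples are divided into k folds, and each fold serves once as the test set while the remaining folds are used for training. This is a generic illustration, not the authors' pipeline; it uses contiguous folds without shuffling, and in practice the sample order would usually be shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # The first n_samples % k folds get one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


if __name__ == "__main__":
    for train, test in k_fold_indices(10, 3):
        print(len(train), test)
```

Every sample is tested exactly once, so the averaged accuracy over the folds uses all available data without testing a model on samples it was trained on.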

Phybrata Sensors and Machine Learning for Enhanced Neurophysiological Diagnosis and Treatment

Sensors ◽

10.3390/s21217417 ◽

2021 ◽

Vol 21 (21) ◽

pp. 7417

Author(s):

Alex J. Hope ◽

Utkarsh Vashisth ◽

Matthew J. Parker ◽

Andreas B. Ralston ◽

Joshua M. Roper ◽

...

Keyword(s):

Machine Learning ◽

Time Series ◽

Random Forest ◽

Binary Classification ◽

Classification Performance ◽

Support Vector ◽

Use Case ◽

Signal Features ◽

Test Population

Concussion injuries remain a significant public health challenge. A significant unmet clinical need remains for tools that allow related physiological impairments and longer-term health risks to be identified earlier, better quantified, and more easily monitored over time. We address this challenge by combining a head-mounted wearable inertial motion unit (IMU)-based physiological vibration acceleration (“phybrata”) sensor and several candidate machine learning (ML) models. The performance of this solution is assessed for both binary classification of concussion patients and multiclass predictions of specific concussion-related neurophysiological impairments. Results are compared with previously reported approaches to ML-based concussion diagnostics. Using phybrata data from a previously reported concussion study population, four different machine learning models (Support Vector Machine, Random Forest Classifier, Extreme Gradient Boost, and Convolutional Neural Network) are first investigated for binary classification of the test population as healthy vs. concussion (Use Case 1). Results are compared for two different data preprocessing pipelines, Time-Series Averaging (TSA) and Non-Time-Series Feature Extraction (NTS). Next, the three best-performing NTS models are compared in terms of their multiclass prediction performance for specific concussion-related impairments: vestibular, neurological, both (Use Case 2). For Use Case 1, the NTS model approach outperformed the TSA approach, with the two best algorithms achieving an F1 score of 0.94. For Use Case 2, the NTS Random Forest model achieved the best performance in the testing set, with an F1 score of 0.90, and identified a wider range of relevant phybrata signal features that contributed to impairment classification compared with manual feature inspection and statistical data analysis. The overall classification performance achieved in the present work exceeds previously reported approaches to ML-based concussion diagnostics using other data sources and ML models. This study also demonstrates the first combination of a wearable IMU-based sensor and ML model that enables both binary classification of concussion patients and multiclass predictions of specific concussion-related neurophysiological impairments.
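The F1 scores quoted above combine precision and recall into a single number. As a minimal sketch for the binary case (healthy vs. concussion), the score can be computed from true positives, false positives and false negatives; the counts below are hypothetical, and the abstract does not state how the multiclass F1 in Use Case 2 was averaged.

```python
def f1_score(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(f1_score(tp=9, fp=1, fn=1))
```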
