Machine Learning Using Hyperspectral Data Inaccurately Predicts Plant Traits Under Spatial Dependency

2018 ◽  
Vol 10 (8) ◽  
pp. 1263 ◽  
Author(s):  
Alby Rocha ◽  
Thomas Groen ◽  
Andrew Skidmore ◽  
Roshanak Darvishzadeh ◽  
Louise Willemen

Spectral, temporal and spatial dimensions are difficult to model together when predicting in situ plant traits from remote sensing data. Therefore, machine learning algorithms solely based on spectral dimensions are often used as predictors, even when there is a strong effect of spatial or temporal autocorrelation in the data. A significant reduction in prediction accuracy is expected when algorithms are trained using a sequence in space or time that is unlikely to be observed again. The ensuing inability to generalise creates a necessity for ground-truth data for every new area or period, provoking the propagation of “single-use” models. This study assesses the impact of spatial autocorrelation on the generalisation of plant trait models predicted with hyperspectral data. Leaf Area Index (LAI) data generated at increasing levels of spatial dependency are used to simulate hyperspectral data using Radiative Transfer Models. Machine learning regressions to predict LAI at different levels of spatial dependency are then tuned (determining the optimum model complexity) using cross-validation as well as the NOIS method. The results show that cross-validated prediction accuracy tends to be overestimated when spatial structures present in the training data are fitted (or learned) by the model.
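The overestimation described above arises when randomly drawn validation samples sit next to training samples. One common way to reduce this effect is spatially blocked cross-validation, sketched below under stated assumptions. This is not the NOIS method named in the abstract, and the function name `spatial_block_folds`, the square-block scheme and the 6 x 6 plot grid are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def spatial_block_folds(coords, block_size, n_folds):
    """Split sample indices into folds so that no spatial block is shared
    between the training and test parts of a fold.

    coords     : list of (x, y) sample positions (hypothetical units)
    block_size : side length of the square blocks used for grouping
    n_folds    : number of cross-validation folds
    """
    # Group sample indices by the grid cell (block) they fall into.
    blocks = defaultdict(list)
    for i, (x, y) in enumerate(coords):
        key = (int(x // block_size), int(y // block_size))
        blocks[key].append(i)

    # Deal whole blocks into folds in round-robin order.
    fold_members = [[] for _ in range(n_folds)]
    for b, key in enumerate(sorted(blocks)):
        fold_members[b % n_folds].extend(blocks[key])

    # Each fold's test set is one group of blocks; the rest is training data.
    folds = []
    for k in range(n_folds):
        test = sorted(fold_members[k])
        train = sorted(
            i for j, members in enumerate(fold_members) if j != k for i in members
        )
        folds.append((train, test))
    return folds


# Hypothetical 6 x 6 grid of field plots, one unit apart.
coords = [(float(x), float(y)) for x in range(6) for y in range(6)]
for train_idx, test_idx in spatial_block_folds(coords, block_size=2.0, n_folds=3):
    print(len(train_idx), len(test_idx))
```

Because whole blocks are held out together, a model cannot score well simply by memorising the local spatial structure around each test sample. The block size would need to be chosen relative to the range of the spatial autocorrelation, which this sketch leaves to the user.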

2020 ◽  
Author(s):  
Ahmed Tageldin ◽  
Dalia Adly ◽  
Hassan Mostafa ◽  
Haitham S Mohammed

Abstract: The use of technology in agriculture has grown in recent years, with the era of data analytics affecting every industry. The main challenge in using technology in agriculture is identifying the effectiveness of big data analytics algorithms and their application methods. Pest management is one of the most important problems facing farmers. The cotton leafworm, Spodoptera littoralis (Boisd.) (CLW), is one of the major polyphagous key pests, attacking plants that include 73 species recorded in Egypt. In the present study, several machine learning algorithms were implemented to predict plant infestation with CLW. CLW moth data were collected weekly for two years in a commercial hydroponic greenhouse. Among other features, temperature and relative humidity were recorded over the total period of the study. The XGBoost algorithm proved to be the most effective algorithm applied in this study, achieving a prediction accuracy of 84%. The impacts of the environmental features on prediction accuracy were compared with one another to ensure a complete dataset for future results. In conclusion, the present study provides a framework for applying machine learning to the prediction of plant infestation with CLW in greenhouses. Based on this framework, further studies with continuous measurements are warranted to achieve greater accuracy.


Author(s):  
Nicholas Westing ◽  
Brett Borghetti ◽  
Kevin Gross

The increasing spatial and spectral resolution of hyperspectral imagers yields detailed spectroscopy measurements from both space-based and airborne platforms. Machine learning algorithms have achieved state-of-the-art material classification performance on benchmark hyperspectral data sets; however, these techniques often do not consider varying atmospheric conditions experienced in a real-world detection scenario. To reduce the impact of atmospheric effects in the at-sensor signal, atmospheric compensation must be performed. Radiative Transfer (RT) modeling can generate high-fidelity atmospheric estimates at detailed spectral resolutions, but is often too time-consuming for real-time detection scenarios. This research utilizes machine learning methods to perform dimension reduction on the transmittance, upwelling radiance, and downwelling radiance (TUD) data to create high-accuracy atmospheric estimates with lower computational cost than RT modeling. The utility of this approach is investigated using the instrument line shape for the Mako long-wave infrared hyperspectral sensor. This study employs physics-based metrics and loss functions to identify promising dimension reduction techniques. As a result, TUD vectors can be produced in real time, allowing for atmospheric compensation across diverse remote sensing scenarios.


2020 ◽  
Vol 39 (5) ◽  
pp. 6579-6590
Author(s):  
Sandy Çağlıyor ◽  
Başar Öztayşi ◽  
Selime Sezgin

The motion picture industry is one of the largest industries worldwide and has significant importance in the global economy. Considering the high stakes and high risks in the industry, forecast models and decision support systems are gaining importance. Several attempts have been made to estimate the theatrical performance of a movie before or at the early stages of its release. Nevertheless, these models are mostly used for predicting domestic performances, and the industry still struggles to predict box office performances in overseas markets. In this study, the aim is to design a forecast model using different machine learning algorithms to estimate the theatrical success of US movies in Turkey. From various sources, a dataset of 1559 movies is constructed. Firstly, independent variables are grouped as pre-release, distributor type, and international distribution based on their characteristics. The number of attendances is discretized into three classes. Four popular machine learning algorithms (artificial neural networks, decision tree regression, gradient boosting trees and random forest) are employed, and the impact of each group is observed by comparing the performance of the models. Then the number of target classes is increased to five and eight, and the results are compared with the previously developed models in the literature.


Author(s):  
Anik Das ◽  
Mohamed M. Ahmed

Accurate lane-change prediction information in real time is essential to safely operate Autonomous Vehicles (AVs) on the roadways, especially at the early stage of AV deployment, where there will be interaction between AVs and human-driven vehicles. This study proposed reliable lane-change prediction models considering features from vehicle kinematics, machine vision, driver, and roadway geometric characteristics using the trajectory-level SHRP2 Naturalistic Driving Study and Roadway Information Database. Several machine learning algorithms were trained, validated, tested, and comparatively analyzed based on six different sets of features, including Classification And Regression Trees (CART), Random Forest (RF), eXtreme Gradient Boosting (XGBoost), Adaptive Boosting (AdaBoost), Support Vector Machine (SVM), K Nearest Neighbor (KNN), and Naïve Bayes (NB). In each feature set, relevant features were extracted through a wrapper-based algorithm named Boruta. The results showed that the XGBoost model outperformed all other models, with the highest overall prediction accuracy (97%) and F1-score (95.5%) when considering all features. However, the highest overall prediction accuracy of 97.3% and F1-score of 95.9% were observed in the XGBoost model based on vehicle kinematics features. Moreover, it was found that XGBoost was the only model that achieved a reliable and balanced prediction performance across all six feature sets. Furthermore, a simplified XGBoost model was developed for each feature set considering the practical implementation of the model. The proposed prediction model could help in trajectory planning for AVs and could be used to develop more reliable advanced driver assistance systems (ADAS) in a cooperative connected and automated vehicle environment.
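The abstract reports overall accuracy and F1-score side by side. The two can differ noticeably when one class (for example, lane-change events) is rarer than the other. The sketch below shows how both are derived from binary confusion-matrix counts. The function name `classification_scores` and the counts in the example are hypothetical and are not taken from the study.

```python
def classification_scores(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from binary confusion counts.

    tp, fp, fn, tn : true positives, false positives, false negatives,
                     true negatives (non-negative integers)
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts: 50 actual positives out of 200 samples.
scores = classification_scores(tp=40, fp=10, fn=10, tn=140)
print(scores)
```

With these illustrative counts, accuracy comes out higher than F1 because the many correctly classified negatives raise accuracy without affecting precision or recall. This is one reason to report both metrics.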


Processes ◽  
2021 ◽  
Vol 9 (7) ◽  
pp. 1241
Author(s):  
Véronique Gomes ◽  
Marco S. Reis ◽  
Francisco Rovira-Más ◽  
Ana Mendes-Ferreira ◽  
Pedro Melo-Pinto

The high quality of Port wine is the result of a sequence of winemaking operations, such as harvesting, maceration, fermentation, extraction and aging. These stages require proper monitoring and control, in order to consistently achieve the desired wine properties. The present work focuses on the harvesting stage, where the sugar content of grapes plays a key role as one of the critical maturity parameters. Our approach makes use of hyperspectral imaging technology to rapidly extract information from wine grape berries; the collected spectra are fed to machine learning algorithms that produce estimates of the sugar level. A consistent predictive capability is important for establishing the harvest date, as well as for selecting the best grapes to produce specific high-quality wines. We compared four different machine learning methods (including deep learning), assessing their generalization capacity for different vintages and varieties not included in the training process. Ridge regression, partial least squares, neural networks and convolutional neural networks were the methods considered to conduct this comparison. The results show that the estimated models can successfully predict the sugar content from hyperspectral data, with the convolutional neural network outperforming the other methods.


2020 ◽  
Vol 5 (19) ◽  
pp. 32-35
Author(s):  
Anand Vijay ◽  
Kailash Patidar ◽  
Manoj Yadav ◽  
Rishi Kushwah

In this paper, an analytical survey on the role of machine learning algorithms in intrusion detection is presented and discussed. The paper covers the analytical aspects of developing an efficient intrusion detection system (IDS). The related work on the development of such a system is presented in terms of computational methods. The methods discussed are data mining, artificial intelligence and machine learning. These are discussed along with the attack parameters and attack types. The paper also elaborates on the impact of different attacks and handling mechanisms, based on previous papers.


2021 ◽  
Vol 251 ◽  
pp. 01017
Author(s):  
Zhixiang Lu

With the vigorous development of the sharing economy, the short-term rental industry has also spawned many emerging industries that belong to the sharing economy. However, due to the impact of the COVID-19 pandemic in 2020, many sharing economy industries, including the short-term housing leasing industry, have been affected. Taking the rental information of 1,004 short-term rental houses in New York in April 2020 as an example, this study uses machine learning and quantitative analysis to conduct statistical and visual analysis of the impact of different factors on housing rental status. Machine learning models are used to predict changes in the rental status of a house over time. The results show that the prediction accuracy of the random forest model reached more than 94%, and the prediction accuracy of the logistic model reached more than 74%. We also further explored the impact of time span differences and regional differences on housing rental status.


Author(s):  
Noor Asyikin Sulaiman ◽  
Md Pauzi Abdullah ◽  
Hayati Abdullah ◽  
Muhammad Noorazlan Shah Zainudin ◽  
Azdiana Md Yusop

An air conditioning system is a complex system and consumes the most energy in a building. Any fault in the system operation, such as a faulty cooling tower fan, compressor failure or a stuck damper, could lead to energy wastage and a reduction in the system’s coefficient of performance (COP). Due to the complexity of the air conditioning system, detecting those faults is hard, as it requires exhaustive inspections. This paper consists of two parts: i) to investigate the impact of different faults related to the air conditioning system on COP and ii) to analyse the performance of machine learning algorithms in classifying those faults. Three supervised learning classifier models were developed: deep learning, support vector machine (SVM) and multi-layer perceptron (MLP). The performance of each classifier was investigated in terms of six different classes of faults. Results showed that different faults have different negative impacts on the COP. Also, the three supervised learning classifier models were able to classify all faults with more than 94% accuracy, and MLP produced the highest accuracy and precision of all.


Author(s):  
Francesc López Seguí ◽  
Ricardo Ander Egg Aguilar ◽  
Gabriel de Maeztu ◽  
Anna García-Altés ◽  
Francesc García Cuyàs ◽  
...  

Background: the primary care service in Catalonia has operated an asynchronous teleconsulting service between GPs and patients since 2015 (eConsulta), which has generated some 500,000 messages. New developments in big data analysis tools, particularly those involving natural language, can be used to accurately and systematically evaluate the impact of the service. Objective: the study was intended to examine the predictive potential of eConsulta messages through different combinations of vector representation of text and machine learning algorithms and to evaluate their performance. Methodology: 20 machine learning algorithms (based on 5 types of algorithms and 4 text representation techniques) were trained using a sample of 3,559 messages (169,102 words) corresponding to 2,268 teleconsultations (1.57 messages per teleconsultation) in order to predict the three variables of interest (avoiding the need for a face-to-face visit, increased demand and type of use of the teleconsultation). The performance of the various combinations was measured in terms of precision, sensitivity, F-value and the ROC curve. Results: the best-trained algorithms are generally effective, proving themselves to be more robust when approximating the two binary variables "avoiding the need of a face-to-face visit" and "increased demand" (precision = 0.98 and 0.97, respectively) rather than the variable "type of query" (precision = 0.48). Conclusion: to the best of our knowledge, this study is the first to investigate a machine learning strategy for text classification using primary care teleconsultation datasets. The study illustrates the possible capacities of text analysis using artificial intelligence. The development of a robust text classification tool could be feasible by validating it with more data, making it potentially more useful for decision support for health professionals.
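The figures in the methodology can be checked with a short sketch: 5 algorithm types crossed with 4 text representations give 20 combinations, and 3,559 messages over 2,268 teleconsultations give about 1.57 messages each. The abstract does not name the algorithms or representations, so the labels below are placeholders, not the ones used in the study.

```python
from itertools import product

# Placeholder labels: the abstract gives only the counts (5 and 4).
algorithms = [f"algorithm_{i}" for i in range(1, 6)]
representations = [f"representation_{j}" for j in range(1, 5)]

# Every algorithm is paired with every text representation.
combinations = list(product(algorithms, representations))

# Sample size figures quoted in the abstract.
messages = 3559
teleconsultations = 2268
messages_per_teleconsultation = round(messages / teleconsultations, 2)

print(len(combinations), messages_per_teleconsultation)
```

A full grid of this kind makes it possible to separate the contribution of the text representation from that of the classifier when comparing results.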

