Estimating Coastal Chlorophyll-A Concentration from Time-Series OLCI Data Based on Machine Learning

Chlorophyll-a (chl-a) is an important parameter of water quality and its concentration can be directly retrieved from satellite observations. The Ocean and Land Color Instrument (OLCI), a new-generation water-color sensor onboard Sentinel-3A and Sentinel-3B, is an excellent tool for marine environmental monitoring. In this study, we introduce a new machine learning model, Light Gradient Boosting Machine (LightGBM), for estimating time-series chl-a concentration in Fujian’s coastal waters using multitemporal OLCI data and in situ data. We applied the Case 2 Regional CoastColour (C2RCC) processor to obtain OLCI band reflectance and constructed four spectral indices based on OLCI feature bands as supplementary input features. We used root-mean-square error (RMSE), mean absolute error (MAE), median absolute percentage error (MAPE), and R2 as performance indicators. The results indicate that adding spectral indices improves the prediction accuracy of the model, and that the normalized fluorescence height index (NFHI) performs best, with an RMSE of 0.38 µg/L, MAE of 0.22 µg/L, MAPE of 28.33%, and R2 of 0.785. Moreover, we used the well-known band ratio and three-band methods to validate the chl-a estimates, and compared the results against two OLCI chl-a products (OC4Me chl-a and Inverse Modelling Technique (IMT) Neural Net chl-a). The results confirmed that the LightGBM model outperforms both the traditional methods and the OLCI chl-a products. This study provides an effective remote sensing technique for coastal chl-a concentration estimation and demonstrates the advantages of OLCI data in ocean color remote sensing.
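
The four performance indicators named in this abstract can be computed as in the minimal sketch below. It is an illustration only, not the authors' code: the function name and the sample values are hypothetical, MAPE is implemented as the median absolute percentage error because that is how the abstract spells it out, and R2 is taken to be the coefficient of determination.

```python
import math
from statistics import mean, median


def regression_metrics(observed, predicted):
    """Return RMSE, MAE, MAPE (median, in percent) and R^2 for paired samples."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    errors = [p - o for o, p in zip(observed, predicted)]
    rmse = math.sqrt(mean(e * e for e in errors))
    mae = mean(abs(e) for e in errors)
    # Median absolute percentage error; observed values must be non-zero.
    mape = 100.0 * median(abs(e) / abs(o) for e, o in zip(errors, observed))
    # Coefficient of determination: 1 - residual sum of squares / total sum of squares.
    obs_mean = mean(observed)
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}


# Hypothetical chl-a values in µg/L, for illustration only.
metrics = regression_metrics([1.0, 2.0, 4.0], [1.5, 2.0, 3.0])
```

RMSE and MAE carry the units of the target (µg/L here), while MAPE and R2 are dimensionless, which is why the abstract reports them separately.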

Retrieval and Evaluation of Chlorophyll-A Spatiotemporal Variability Using GF-1 Imagery: Case Study of Qinzhou Bay, China

Sustainability ◽

10.3390/su13094649 ◽

2021 ◽

Vol 13 (9) ◽

pp. 4649

Author(s):

Ze-Lin Na ◽

Huan-Mei Yao ◽

Hua-Quan Chen ◽

Yi-Ming Wei ◽

Ke Wen ◽

...

Keyword(s):

Remote Sensing ◽

Chlorophyll A ◽

Near Infrared ◽

Red Tide ◽

Accurate Determination ◽

Satellite Image ◽

Spatiotemporal Variability ◽

Percentage Error ◽

Chl A ◽

Qinzhou Bay

Chlorophyll-a (Chl-a) concentration is a measure of phytoplankton biomass, and has been used to identify ‘red tide’ events. However, nearshore waters are optically complex, making the accurate determination of the chlorophyll-a concentration challenging. Therefore, in this study, a typical area affected by the Phaeocystis ‘red tide’ bloom, Qinzhou Bay, was selected as the study area. Based on the Gaofen-1 remote sensing satellite image and water quality monitoring data, the sensitive bands and band combinations of the nearshore Chl-a concentration of Qinzhou Bay were screened, and a Qinzhou Bay Chl-a retrieval model was constructed through stepwise regression analysis. The main conclusions of this work are as follows: (1) The Chl-a concentration retrieval regression model based on 1/B4 (near-infrared band (NIR)) has the best accuracy (R2 = 0.67, root-mean-square-error = 0.70 μg/L, and mean absolute percentage error = 0.23) for the remote sensing of Chl-a concentration in Qinzhou Bay. (2) The spatiotemporal distribution of Chl-a in Qinzhou Bay is varied, with lower concentrations (0.50 μg/L) observed near the shore and higher concentrations (6.70 μg/L) observed offshore, with a gradual decreasing trend over time (−0.8).
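
A single-predictor model of the form Chl-a = slope × (1/B4) + intercept can be fitted as in the sketch below. The assumptions are substantial: the abstract reports stepwise regression and does not give the fitted coefficients, so this uses ordinary least squares on one predictor, and the function names and reflectance values are hypothetical.

```python
def fit_inverse_band_model(b4_reflectance, chl_a):
    """Least-squares fit of chl_a = slope * (1 / B4) + intercept."""
    x = [1.0 / b for b in b4_reflectance]  # predictor is the inverse NIR band
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(chl_a) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, chl_a))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def predict_chl_a(b4_value, slope, intercept):
    """Apply the fitted model to a single B4 reflectance value."""
    return slope * (1.0 / b4_value) + intercept


# Synthetic samples generated from chl_a = 0.3 / B4 + 0.5 (not real measurements).
b4_samples = [0.1, 0.2, 0.25, 0.5]
chl_samples = [0.3 / b + 0.5 for b in b4_samples]
slope, intercept = fit_inverse_band_model(b4_samples, chl_samples)
```

Because the synthetic data follow the model exactly, the fit recovers the generating coefficients; real matchups would scatter around the line.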

Short-Term Energy Forecasting Using Machine-Learning-Based Ensemble Voting Regression

Symmetry ◽

10.3390/sym14010160 ◽

2022 ◽

Vol 14 (1) ◽

pp. 160

Author(s):

Pyae-Pyae Phyo ◽

Yung-Cheol Byun ◽

Namje Park

Keyword(s):

Machine Learning ◽

Mean Squared Error ◽

Weather Data ◽

Gradient Boosting ◽

Percentage Error ◽

Short Term ◽

Energy Forecasting ◽

Light Gradient ◽

Proposed Model ◽

Term Energy

Balancing energy supply and demand is indispensable for energy manufacturers. Accordingly, electric industries have paid attention to short-term energy forecasting to assist their management systems. This paper first compares multiple machine learning (ML) regressors during the training process. The five best ML algorithms, namely extra trees regressor (ETR), random forest regressor (RFR), light gradient boosting machine (LGBM), gradient boosting regressor (GBR), and K neighbors regressor (KNN), are trained to build our proposed voting regressor (VR) model. Final predictions are made using the proposed ensemble VR and compared with the five selected ML benchmark models. A statistical autoregressive integrated moving average (ARIMA) model is also compared with the proposed model. For the experiments, energy usage and weather data are gathered from four regions of Jeju Island. Error measurements, including mean absolute percentage error (MAPE), mean absolute error (MAE), and mean squared error (MSE), are computed to evaluate the forecasting performance. Our proposed model outperforms the six baseline models, giving a minimum MAPE of 0.845% on the whole test set. This improved performance shows that our approach is promising for symmetrical forecasting using time series energy data in the power system sector.
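
The core of a voting regressor is an average over the base models' predictions. The sketch below shows that step together with a mean absolute percentage error; it assumes unweighted averaging (the abstract does not state the weights), and the three base models are hypothetical stand-ins for fitted regressors, not the ETR, RFR, LGBM, GBR, or KNN models from the paper.

```python
from statistics import mean


def voting_predict(models, features):
    """Average the predictions of several fitted regressors for one sample."""
    return mean(model(features) for model in models)


def mape_percent(actual, forecast):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    return 100.0 * mean(abs(a - f) / abs(a) for a, f in zip(actual, forecast))


# Hypothetical stand-ins for fitted base regressors (each maps features to a prediction).
def model_a(x):
    return 2.0 * x[0] + 1.0


def model_b(x):
    return 2.0 * x[0] - 1.0


def model_c(x):
    return 2.0 * x[0] + 3.0


samples = [[1.0], [2.0], [4.0]]
actual = [3.0, 5.0, 10.0]
forecast = [voting_predict([model_a, model_b, model_c], x) for x in samples]
error = mape_percent(actual, forecast)
```

Averaging tends to cancel the individual models' uncorrelated errors, which is the usual motivation for comparing the ensemble against each of its members.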

Retrieval of Chlorophyll-a Concentrations in the Coastal Waters of the Beibu Gulf in Guangxi Using a Gradient-Boosting Decision Tree Model

Applied Sciences ◽

10.3390/app11177855 ◽

2021 ◽

Vol 11 (17) ◽

pp. 7855

Author(s):

Huanmei Yao ◽

Yi Huang ◽

Yiming Wei ◽

Weiping Zhong ◽

Ke Wen

Keyword(s):

Remote Sensing ◽

Water Quality ◽

Spatial Distribution ◽

Decision Tree ◽

Chlorophyll A ◽

Coastal Waters ◽

Gradient Boosting ◽

Beibu Gulf ◽

Landsat 8 ◽

Chl A

Remote sensing for the monitoring of chlorophyll-a (Chl-a) is essential to compensate for the shortcomings of traditional water quality monitoring, strengthen red tide disaster monitoring and early warnings, and reduce marine environmental risks. In this study, a machine learning approach called the Gradient-Boosting Decision Tree (GBDT) was employed to develop an algorithm for estimating the Chl-a concentrations of the coastal waters of the Beibu Gulf in Guangxi, using Landsat 8 OLI image data as the image source in combination with field measurements of Chl-a concentrations. The GBDT model with B4, B3 + B4, B3, B1 − B4, B2 + B4, B1 + B4, and B2 − B4 as input features exhibited higher accuracy (MAE = 0.998 μg/L, MAPE = 19.413%, and RMSE = 1.626 μg/L) compared with different physics models, providing a new method for remote sensing inversion of water quality parameters. The GBDT model was used to study the spatial distribution and temporal variation of Chl-a concentrations in the coastal sea surface of the Beibu Gulf of Guangxi from 2013 to 2020. The results showed a spatial distribution with high concentrations in nearshore waters and low concentrations in offshore waters. The Chl-a concentration exhibited seasonal changes (concentration in summer > autumn > spring ≈ winter).
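
The seven input features listed in this abstract are simple sums and differences of four band values, which can be assembled as in the sketch below. The function name is hypothetical, the reflectance values are invented for illustration, and the mapping of B1 to B4 onto specific Landsat 8 OLI bands is assumed to follow the paper, which the abstract does not spell out.

```python
def gbdt_input_features(b1, b2, b3, b4):
    """Build the seven band features named in the abstract, in the order given there."""
    return {
        "B4": b4,
        "B3+B4": b3 + b4,
        "B3": b3,
        "B1-B4": b1 - b4,
        "B2+B4": b2 + b4,
        "B1+B4": b1 + b4,
        "B2-B4": b2 - b4,
    }


# Hypothetical reflectance values for a single pixel.
features = gbdt_input_features(b1=0.10, b2=0.08, b3=0.06, b4=0.04)
feature_vector = list(features.values())  # row that would be passed to a regressor
```

Tree ensembles split on one feature at a time, so supplying sums and differences explicitly gives the model access to band combinations it could otherwise only approximate.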

Monitoring the Foliar Nutrients Status of Mango Using Spectroscopy-Based Spectral Indices and PLSR-Combined Machine Learning Models

Remote Sensing ◽

10.3390/rs13040641 ◽

2021 ◽

Vol 13 (4) ◽

pp. 641

Author(s):

Gopal Ramdas Mahajan ◽

Bappa Das ◽

Dayesh Murgaokar ◽

Ittai Herrmann ◽

Katja Berger ◽

...

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Partial Least Square ◽

Least Square ◽

Partial Least Square Regression ◽

Support Vector ◽

Spectral Indices ◽

Learning Models ◽

Leaf Nutrients ◽

Machine Learning Models

Conventional methods of plant nutrient estimation for nutrient management need a huge number of leaf or tissue samples and extensive chemical analysis, which is time-consuming and expensive. Remote sensing is a viable tool to estimate the plant’s nutritional status to determine the appropriate amounts of fertilizer inputs. The aim of the study was to use remote sensing to characterize the foliar nutrient status of mango through the development of spectral indices, multivariate analysis, chemometrics, and machine learning modeling of the spectral data. A spectral database within the 350–1050 nm wavelength range of the leaf samples and leaf nutrients were analyzed for the development of spectral indices and multivariate model development. The normalized difference and ratio spectral indices and multivariate models–partial least square regression (PLSR), principal component regression, and support vector regression (SVR) were ineffective in predicting any of the leaf nutrients. An approach of using PLSR-combined machine learning models was found to be the best to predict most of the nutrients. Based on the independent validation performance and summed ranks, the best performing models were cubist (R2 ≥ 0.91, the ratio of performance to deviation (RPD) ≥ 3.3, and the ratio of performance to interquartile distance (RPIQ) ≥ 3.71) for nitrogen, phosphorus, potassium, and zinc, SVR (R2 ≥ 0.88, RPD ≥ 2.73, RPIQ ≥ 3.31) for calcium, iron, copper, boron, and elastic net (R2 ≥ 0.95, RPD ≥ 4.47, RPIQ ≥ 6.11) for magnesium and sulfur. The results of the study revealed the potential of using hyperspectral remote sensing data for non-destructive estimation of mango leaf macro- and micro-nutrients. The developed approach is suggested to be employed within operational retrieval workflows for precision management of mango orchard nutrients.
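
RPD and RPIQ both scale a measure of spread in the observed values by the prediction RMSE: RPD uses the standard deviation and RPIQ uses the interquartile range. The sketch below follows those common definitions; the choice of the sample standard deviation and of the "inclusive" quartile convention are assumptions, since the abstract does not say which variants the authors used, and the data are invented.

```python
import math
from statistics import mean, quantiles, stdev


def rpd_rpiq(observed, predicted):
    """Return (RPD, RPIQ) = (SD / RMSE, IQR / RMSE) for paired samples."""
    rmse = math.sqrt(mean((o - p) ** 2 for o, p in zip(observed, predicted)))
    # Quartiles of the observed values; the interpolation convention is an assumption.
    q1, _, q3 = quantiles(observed, n=4, method="inclusive")
    return stdev(observed) / rmse, (q3 - q1) / rmse


# Hypothetical nutrient values and predictions, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.5, 2.5, 4.5, 4.5]
rpd, rpiq = rpd_rpiq(observed, predicted)
```

Higher values of either ratio mean the model's error is small relative to the natural variation in the samples; RPIQ is less sensitive to skewed distributions because it relies on quartiles.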

Development and validation of a difficult laryngoscopy prediction model using machine learning of neck circumference and thyromental height

BMC Anesthesiology ◽

10.1186/s12871-021-01343-4 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Jong Ho Kim ◽

Haewon Kim ◽

Ji Su Jang ◽

Sung Mi Hwang ◽

So Young Lim ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Confidence Interval ◽

Neck Circumference ◽

Difficult Laryngoscopy ◽

Gradient Boosting ◽

Test Set ◽

Equal Distribution ◽

Light Gradient ◽

Extreme Gradient Boosting

Background: Predicting a difficult airway is challenging in patients with limited airway evaluation. The aim of this study is to develop and validate a model that predicts difficult laryngoscopy by machine learning of neck circumference and thyromental height as predictors that can be used even for patients with limited airway evaluation. Methods: Variables for prediction of difficult laryngoscopy included age, sex, height, weight, body mass index, neck circumference, and thyromental distance. Difficult laryngoscopy was defined as Grade 3 and 4 by the Cormack-Lehane classification. The preanesthesia and anesthesia data of 1677 patients who had undergone general anesthesia at a single center were collected. The data set was randomly stratified into a training set (80%) and a test set (20%), with equal distribution of difficult laryngoscopy. The training data sets were trained with five algorithms (logistic regression, multilayer perceptron, random forest, extreme gradient boosting, and light gradient boosting machine). The prediction models were validated through a test set. Results: The model’s performance using random forest was best (area under receiver operating characteristic curve = 0.79 [95% confidence interval: 0.72–0.86], area under precision-recall curve = 0.32 [95% confidence interval: 0.27–0.37]). Conclusions: Machine learning can predict difficult laryngoscopy through a combination of several predictors including neck circumference and thyromental height. The performance of the model can be improved with more data, a new variable, and a combination of models.
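
A stratified 80/20 split keeps the proportion of positive cases the same in the training and test sets. The sketch below shows one way to do this with the standard library; the function name, the seed, and the 10% positive rate in the example are hypothetical and are not taken from the study.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices into train and test sets, preserving class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # randomize within each class
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Hypothetical labels: 1 = difficult laryngoscopy, 0 = not difficult.
labels = [1] * 10 + [0] * 90
train_idx, test_idx = stratified_split(labels)
```

With a rare outcome, an unstratified random split can leave the test set with very few positive cases, which makes metrics such as the area under the precision-recall curve unstable.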

Online Prediction of Derived Remote Sensing Image Time Series: An Autonomous Machine Learning Approach

IGARSS 2020 - 2020 IEEE International Geoscience and Remote Sensing Symposium ◽

10.1109/igarss39084.2020.9324428 ◽

2020 ◽

Cited By ~ 1

Author(s):

Monidipa Das

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Time Series ◽

Remote Sensing Image ◽

Learning Approach ◽

Machine Learning Approach ◽

Online Prediction ◽

Autonomous Machine

Prediction of grape yields from time-series vegetation indices using satellite remote sensing and a machine-learning approach

Remote Sensing Applications Society and Environment ◽

10.1016/j.rsase.2021.100485 ◽

2021 ◽

Vol 22 ◽

pp. 100485

Author(s):

Sara Tokhi Arab ◽

Ryozo Noguchi ◽

Shusuke Matsushita ◽

Tofael Ahamed

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Time Series ◽

Satellite Remote Sensing ◽

Vegetation Indices ◽

Learning Approach ◽

Machine Learning Approach

Prediction and Analysis of Gold Prices using Ensemble Machine Learning Algorithms

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.36028 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 4367-4374

Author(s):

Gudipally Chandrashakar

Keyword(s):

Machine Learning ◽

Time Series ◽

Time Series Data ◽

Gold Price ◽

Machine Learning Algorithms ◽

Series Data ◽

Gradient Boosting ◽

Support Vector ◽

Average Value ◽

Ensemble Machine Learning

In this article, we use historical time-series data of gold prices up to the present day. To predict the gold price, we consider a few correlated factors: silver price, copper price, the Standard and Poor’s 500 value, the dollar-rupee exchange rate, and the Dow Jones Industrial Average value. The price data for gold and for each correlated factor range from January 2008 to February 2021. The machine learning algorithms used to analyze the time-series data are Random Forest Regression, Support Vector Regressor, Linear Regressor, ExtraTrees Regressor, and Gradient Boosting Regression. Of these, the ExtraTrees Regressor predicts gold prices most accurately.

Boosting Algorithm Choice in Predictive Machine Learning Models for Fracturing Applications

10.2118/205642-ms ◽

2021 ◽

Author(s):

Abdul Muqtadir Khan

Keyword(s):

Machine Learning ◽

Data Science ◽

Oil And Gas ◽

Oil And Gas Industry ◽

Injection Rate ◽

Model Construction ◽

Gradient Boosting ◽

Light Gradient ◽

Fracture Damage ◽

Boosting Technique

With the advancement in machine learning (ML) applications, some recent research has been conducted to optimize fracturing treatments. There are a variety of models available using various objective functions for optimization and different mathematical techniques. There is a need to extend the ML techniques to optimize the choice of algorithm. For fracturing treatment design, the literature on comparative algorithm performance is sparse. The research predominantly shows that, compared to the most commonly used regressors and classifiers, some sort of boosting technique consistently performs better on model testing and prediction accuracy. A database was constructed for a heterogeneous reservoir. Four widely used boosting algorithms were applied to the database to predict the design from only the output of a short injection/falloff test. Feature importance analysis was done on eight output parameters from the falloff analysis, and six were finalized for the model construction. The outputs selected for prediction were fracturing fluid efficiency, proppant mass, maximum proppant concentration, and injection rate. Extreme gradient boost (XGBoost), categorical boost (CatBoost), adaptive boost (AdaBoost), and light gradient boosting machine (LGBM) were the algorithms finalized for the comparative study. A sensitivity analysis was performed for different numbers of classes (four, five, and six) to establish a balance between accuracy and prediction granularity. The results showed that the best algorithm choice was between XGBoost and CatBoost for the predicted parameters under certain model construction conditions. The accuracy for all outputs for the holdout sets varied between 80 and 92%, showing robust significance for a wider utilization of these models. Data science has contributed to various oil and gas industry domains and has tremendous applications in the stimulation domain. The research and review conducted in this paper add a valuable resource for the user to build digital databases and use the appropriate algorithm without much trial and error. Implementing this model reduced the complexity of the proppant fracturing treatment redesign process, enhanced operational efficiency, and reduced fracture damage by eliminating minifrac steps with crosslinked gel.

Temporal Variation of Chlorophyll-a Concentrations in Highly Dynamic Waters from Unattended Sensors and Remote Sensing Observations

Sensors ◽

10.3390/s18082699 ◽

2018 ◽

Vol 18 (8) ◽

pp. 2699 ◽

Cited By ~ 2

Author(s):

Jian Li ◽

Liqiao Tian ◽

Qingjun Song ◽

Zhaohua Sun ◽

Hongjing Yu ◽

...

Keyword(s):

Remote Sensing ◽

Water Quality ◽

Chlorophyll Fluorescence ◽

Chlorophyll A ◽

Temporal Variations ◽

Lake Area ◽

Short Term ◽

Chl A ◽

Spatio Temporal

Monitoring of water quality changes in highly dynamic inland lakes is frequently impeded by insufficient spatial and temporal coverage, for both field surveys and remote sensing methods. To track short-term variations of chlorophyll fluorescence and chlorophyll-a concentrations in Poyang Lake, the largest freshwater lake in China, high-frequency in situ measurements were collected from two fixed stations. The K-means clustering method was also applied to identify clusters with similar spatio-temporal variations, using remote sensing Chl-a data products from the MERIS satellite, taken from 2003 to 2012. Four lake area classes were obtained with distinct spatio-temporal patterns, two of which were selected for in situ measurement. Distinct daily periodic variations were observed, with peaks at approximately 3:00 PM and troughs at night or early morning. Short-term variations of chlorophyll fluorescence and Chl-a levels were revealed, with a maximum intra-diurnal ratio of 5.1 and inter-diurnal ratio of 7.4, respectively. Using geostatistical analysis, the temporal range of chlorophyll fluorescence and corresponding Chl-a variations was determined to be 9.6 h, which indicates that there is a temporal discrepancy between Chl-a variations and the sampling frequency of current satellite missions. An analysis of the optimal sampling strategies demonstrated that the influence of the sampling time on the mean Chl-a concentrations observed was higher than 25%, and the uncertainty of any single Terra/MODIS or Aqua/MODIS observation was approximately 15%. Therefore, sampling twice a day is essential to resolve Chl-a variations with a bias level of 10% or less. The results highlight short-term variations of critical water quality parameters in freshwater, and they help identify specific design requirements for geostationary earth observation missions, so that they can better address the challenges of monitoring complex coastal and inland environments around the world.
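
Clustering pixels by the similarity of their Chl-a time series can be done with K-means. The sketch below is a plain implementation of Lloyd's algorithm under stated assumptions: each pixel is represented as a short vector of Chl-a values over time, squared Euclidean distance is the similarity measure, and the initial centroids are supplied by the caller. The function name and the data are hypothetical, and the abstract does not describe how the authors initialized or configured their clustering.

```python
def kmeans(points, initial_centroids, max_iterations=100):
    """Cluster equal-length vectors with Lloyd's algorithm; return (assignments, centroids)."""
    centroids = [list(c) for c in initial_centroids]
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(
                range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])),
            )
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


# Hypothetical Chl-a time series (three dates) for four pixels.
series = [[1.0, 1.2, 0.9], [1.1, 1.0, 1.0], [5.0, 5.2, 4.9], [5.1, 4.8, 5.0]]
clusters, centers = kmeans(series, initial_centroids=[series[0], series[2]])
```

K-means results depend on the starting centroids, so in practice the algorithm is usually run from several initializations and the number of clusters is chosen by inspecting how the within-cluster spread changes.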
