Classification of Mangrove Species Using Combined WordView-3 and LiDAR Data in Mai Po Nature Reserve, Hong Kong

2019 ◽  
Vol 11 (18) ◽  
pp. 2114 ◽  
Author(s):  
Qiaosi Li ◽  
Frankie Kwan Kit Wong ◽  
Tung Fung

Mangroves have significant social, economic, environmental, and ecological value, but they are under threat from human activities. An accurate map of mangrove species distribution is required to conserve mangrove ecosystems effectively. This study evaluates the synergy of WorldView-3 (WV-3) spectral bands and elevation metrics derived from high-return-density airborne light detection and ranging (LiDAR) data for classifying seven species in the mangrove habitat of Mai Po Nature Reserve, Hong Kong, China. A recursive feature elimination algorithm was used to identify important spectral bands and LiDAR metrics, and the appropriate spatial resolution for pixel-based classification was investigated for discriminating mangrove species. Two classifiers, support vector machine (SVM) and random forest (RF), were compared. The combination of 2 m resolution WV-3 and LiDAR data yielded the best overall accuracy, 0.88 with the SVM classifier, compared with WV-3 alone (0.72) and LiDAR alone (0.79). The important features were the green (510–581 nm), red edge (705–745 nm), red (630–690 nm), yellow (585–625 nm), and NIR (770–895 nm) bands of WV-3, together with LiDAR metrics related to canopy height (e.g., canopy height model), canopy shape (e.g., canopy relief ratio), and the variation of height (e.g., variation and standard deviation of height). LiDAR features contributed more information than spectral features. The study shows that the proposed classification scheme can produce a mangrove species distribution map with satisfactory accuracy. In addition, the LiDAR data allowed the vertical stratification of mangrove forests in Mai Po to be mapped for the first time, which is significant for bio-parameter estimation and ecosystem service evaluation in future studies.
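The recursive feature elimination step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a small logistic regression stands in for the SVM and RF models of the study, feature importance is assumed to be the absolute model weight, and the feature names and values are made up for illustration.

```python
import math

def fit_logistic(X, y, epochs=200, lr=0.5):
    """Fit logistic regression by batch gradient descent; return feature weights."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            err = 1.0 / (1.0 + math.exp(-z)) - label
            grad_b += err
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w

def rfe(X, y, names, n_keep):
    """Refit on the surviving features each round and drop the weakest one."""
    active = list(range(len(names)))
    eliminated = []
    while len(active) > n_keep:
        subset = [[row[j] for j in active] for row in X]
        weights = fit_logistic(subset, y)
        # Assumption: importance = absolute weight of the stand-in linear model.
        worst = min(range(len(active)), key=lambda k: abs(weights[k]))
        eliminated.append(names[active.pop(worst)])
    return [names[j] for j in active], eliminated

# Hypothetical standardised features per pixel (illustrative values only).
names = ["canopy_height", "green_band", "noise"]
X = [
    [-1.0, -0.5, 0.1], [-1.2, -0.3, -0.1], [-0.8, -0.6, 0.1], [-1.0, -0.4, -0.1],
    [1.0, 0.5, 0.1], [1.2, 0.3, -0.1], [0.8, 0.6, 0.1], [1.0, 0.4, -0.1],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

selected, eliminated = rfe(X, y, names, n_keep=1)
print(selected, eliminated)
```

Because the model is refitted after every removal, the ranking reflects each feature's contribution given the features still in play, which is what distinguishes recursive elimination from a one-off filter score.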

2020 ◽  
Vol 12 (4) ◽  
pp. 656 ◽  
Author(s):  
Luoma Wan ◽  
Yinyi Lin ◽  
Hongsheng Zhang ◽  
Feng Wang ◽  
Mingfeng Liu ◽  
...  

Hyperspectral data, which carry rich spectral information in hundreds of bands, have been widely used to discriminate plant species, but their limited availability has hindered application in many specific cases. The successful operation of the Chinese satellite Gaofen-5 (GF-5) provides a potentially promising new hyperspectral dataset with 330 spectral bands in the visible and near-infrared range. There is therefore strong demand for assessing the effectiveness and superiority of GF-5 hyperspectral data in plant species mapping, particularly mangrove species mapping, to better support efficient mangrove management. In this study, the mangrove forest in Mai Po Nature Reserve (MPNR), Hong Kong, was selected as the study area. Four dominant native mangrove species, identified from field surveys, were investigated. Two machine learning methods, random forests and support vector machines, were employed to classify mangrove species with Landsat 8, simulated Hyperion, and GF-5 datasets. The results showed that the 97 additional bands of GF-5 over Hyperion brought a higher overall accuracy of 87.12%, compared with 86.82% from Hyperion and 73.89% from Landsat 8. The higher spectral resolution of 5 nm in GF-5 was identified as the major contributor, especially for mapping Aegiceras corniculatum. GF-5 is therefore likely to improve the classification accuracy of mangrove species mapping by enhancing spectral resolution, and it has promising potential to improve species-level mangrove monitoring in support of mangrove management.
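The overall accuracy figures compared in this abstract are normally derived from a confusion matrix. The sketch below shows that arithmetic, along with a per-class accuracy of the kind that would reveal a gain for a single species. The matrix counts are made up and the function names are illustrative, not taken from the study.

```python
def overall_accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

def producers_accuracy(confusion):
    """Per-class accuracy: diagonal count divided by the reference (row) total."""
    return [row[i] / sum(row) for i, row in enumerate(confusion)]

# Rows = reference class, columns = predicted class (made-up counts).
confusion = [
    [40, 5, 5],
    [4, 30, 6],
    [2, 8, 50],
]

oa = overall_accuracy(confusion)
per_class = producers_accuracy(confusion)
print(round(oa, 4), [round(p, 4) for p in per_class])
```

Two sensors can have nearly identical overall accuracy while differing noticeably for one class, so reading the per-class values next to the overall figure is usually worthwhile.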


2020 ◽  
pp. 3397-3407
Author(s):  
Nur Syafiqah Mohd Nafis ◽  
Suryanti Awang

Text documents are unstructured and high dimensional. Effective feature selection is required to select the most important and significant features from the sparse feature space. This paper therefore proposes an embedded feature selection technique based on Term Frequency-Inverse Document Frequency (TF-IDF) and Support Vector Machine-Recursive Feature Elimination (SVM-RFE) for unstructured and high-dimensional text classification. The technique can measure a feature's importance in a high-dimensional text document. It also aims to increase the efficiency of feature selection and thereby obtain promising text classification accuracy. In the first stage, TF-IDF acts as a filter that measures the importance of features in the text documents. In the second stage, SVM-RFE uses a backward feature elimination scheme to recursively remove insignificant features from the filtered feature subsets. The experiments use text documents retrieved from a benchmark repository comprising a collection of Twitter posts. Pre-processing is applied to extract relevant features, and the pre-processed features are then divided into training and testing datasets. Next, feature selection is performed on the training dataset by calculating the TF-IDF score for each feature. SVM-RFE is then applied for feature ranking as the next feature selection step. Only the top-ranked features are selected for text classification using the SVM classifier. The experiments show that the proposed technique achieves 98% accuracy, outperforming other existing techniques. In conclusion, the proposed technique is able to select the significant features in unstructured and high-dimensional text documents.
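The first-stage TF-IDF filter can be sketched as follows. The abstract does not say which TF-IDF variant or cut-off the authors used, so this assumes the plain form (relative term frequency times log(N / document frequency)) and a hypothetical threshold. The toy documents are invented.

```python
import math
from collections import Counter

def tfidf_scores(docs):
    """Return one {term: tf-idf} dict per tokenised document."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

def filter_vocabulary(docs, threshold):
    """Filter stage: keep terms whose best TF-IDF score exceeds the threshold."""
    best = {}
    for doc_scores in tfidf_scores(docs):
        for term, score in doc_scores.items():
            best[term] = max(best.get(term, 0.0), score)
    return sorted(term for term, score in best.items() if score > threshold)

# Invented, already tokenised posts; the threshold value is an assumption.
docs = [
    ["good", "phone", "good"],
    ["bad", "phone"],
    ["good", "service", "phone"],
]
kept = filter_vocabulary(docs, threshold=0.1)
print(kept)
```

A term that occurs in every document, such as "phone" here, gets an inverse document frequency of zero and is filtered out. The surviving terms would then be passed to the second-stage ranking.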


Forests ◽  
2020 ◽  
Vol 11 (12) ◽  
pp. 1324
Author(s):  
Xi Peng ◽  
Anjiu Zhao ◽  
Yongfu Chen ◽  
Qiao Chen ◽  
Haodong Liu ◽  
...  

Knowledge of forest structure is vital for sustainable forest management decisions. Terrestrial laser scanning cannot describe the canopy trees over a large area, and it is unclear whether unmanned aerial vehicle light detection and ranging (UAV-LiDAR) data can capture forest canopy structural parameters in tropical forests. In this study, we estimated five forest canopy structural parameters (stand density (N), basal area (G), above-ground biomass (AGB), Lorey’s mean height (HL), and under-crown height (hT)) with four modeling algorithms (linear regression (LR), bagged tree (BT), support vector regression (SVR), and random forest (RF)) based on UAV-LiDAR data and 60 sample plots from tropical forests in Hainan, and determined the optimal algorithm for each parameter by comparing the performance of the four algorithms. First, we defined a canopy tree as a tree with a height ≥70% of HL. Then, UAV-LiDAR metrics were calculated and screened by recursive feature elimination (RFE). Finally, prediction models of the five forest canopy structural parameters were established with the four algorithms, and the results were compared. The screening results show that the most important LiDAR indexes for estimating HL, AGB, and hT are the leaf area index and some height metrics, while the most important indexes for estimating N and G are the kurtosis of heights and the coefficient of variation of height. The relative root mean squared errors (rRMSEs) of the five structural parameters were as follows: when modeling HL, the rRMSEs obtained by the four algorithms (10.60%–12.05%) showed little difference; when N was modeled, BT, RF, and SVR had lower rRMSEs (26.76%–27.44%); when G was modeled, the rRMSEs of RF and SVR (15.37%–15.87%) were lower; when hT was modeled, BT, RF, and SVR had lower rRMSEs (10.24%–11.07%); when AGB was modeled, RF had the lowest rRMSE (26.75%). Our results will help in choosing LiDAR indexes and modeling algorithms for tropical forest resource inventories.
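The rRMSE values above can be read with a small worked example. The abstract does not define rRMSE, so this sketch assumes the common definition: root mean squared error divided by the mean of the observed values, expressed in percent. The plot values are invented.

```python
import math

def relative_rmse(observed, predicted):
    """RMSE divided by the mean observed value, in percent (assumed definition)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return 100.0 * math.sqrt(mse) / (sum(observed) / len(observed))

# Invented plot-level heights in metres, for illustration only.
observed_height = [10.0, 20.0, 30.0]
predicted_height = [12.0, 18.0, 30.0]

rrmse = relative_rmse(observed_height, predicted_height)
print(round(rrmse, 2))
```

Normalising by the observed mean is what allows errors for parameters with different units, such as height and biomass, to be compared on one percentage scale.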


2020 ◽  
Vol 14 (3) ◽  
pp. 269-279
Author(s):  
Hayet Djellali ◽  
Nacira Ghoualmi-Zine ◽  
Souad Guessoum

This paper investigates feature selection with a hybrid architecture called Adapted Fast Correlation-Based Feature Selection and Support Vector Machine Recursive Feature Elimination (AFCBF-SVMRFE). AFCBF-SVMRFE combines the SVMRFE embedded method with correlation-based feature selection and has three stages: the first is a relevance analysis, the second is a redundancy analysis, and the third is a performance evaluation and feature restoration stage. Experiments with two classifiers, support vector machine (SVM) and k-nearest neighbors (KNN), show that the proposed method provides the best accuracy on various datasets. The SVM classifier outperforms the KNN classifier on these data. AFCBF-SVMRFE outperforms the FCBF multivariate filter, SVMRFE, particle swarm optimization (PSO), and artificial bee colony (ABC).


Author(s):  
F. Samadzadega ◽  
H. Hasani

Hyperspectral imagery is a rich source of spectral information and plays a very important role in discriminating similar land-cover classes. Several efforts have been made to improve hyperspectral imagery classification. Recently, interest in the joint use of LiDAR data and hyperspectral imagery has increased remarkably, because LiDAR provides structural information about the scene while hyperspectral imagery provides spectral and spatial information. The complementary information in LiDAR and hyperspectral data may greatly improve classification performance, especially in complex urban areas. This paper proposes feature-level fusion of hyperspectral and LiDAR data, in which spectral and structural features are extracted from both datasets and a hybrid feature space is then generated by feature stacking. A support vector machine (SVM) classifier is applied to the hybrid feature space to classify the urban area. To optimize classification performance, two issues should be considered: determining the SVM parameter values and selecting the feature subset. The Bees Algorithm (BA), a powerful meta-heuristic optimization algorithm, is applied to determine the optimum SVM parameters and select the optimum feature subset simultaneously. The results show that the proposed method can improve classification accuracy while significantly reducing the dimension of the feature space.
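Feature stacking, as described in this abstract, amounts to concatenating the per-pixel feature vectors from the two sources. The sketch below illustrates that step and adds a min-max rescaling, which is an assumption here rather than something the abstract states. The pixel values and function names are hypothetical.

```python
def stack_features(spectral, structural):
    """Concatenate per-pixel spectral and structural feature vectors."""
    if len(spectral) != len(structural):
        raise ValueError("both sources must cover the same pixels")
    return [list(s) + list(t) for s, t in zip(spectral, structural)]

def min_max_scale(rows):
    """Rescale every column to [0, 1] so no feature dominates by unit alone."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]  # guard constant columns
    return [[(v - lo) / sp for v, lo, sp in zip(row, lows, spans)] for row in rows]

# Hypothetical values: two reflectance bands and one LiDAR height (m) per pixel.
spectral = [[0.12, 0.30], [0.10, 0.45], [0.20, 0.25]]
structural = [[3.0], [12.5], [0.5]]

stacked = stack_features(spectral, structural)
scaled = min_max_scale(stacked)
print(scaled)
```

Rescaling matters for kernel classifiers such as SVMs because heights in metres would otherwise outweigh reflectances, which lie between 0 and 1, in any distance computation.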


2021 ◽  
Author(s):  
Eric Adua ◽  
Emmanuel Awuni Kolog ◽  
Ebenezer Afrifa-Yamoah ◽  
Bright Amankwah ◽  
Christian Obirikorang ◽  
...  

Abstract. Background: Accurate prediction and early recognition of type II diabetes (T2DM) will lead to timely and meaningful interventions, while preventing T2DM-associated complications. In this context, machine learning (ML) is promising, as it can transform vast amounts of T2DM data into clinically relevant information. This study compares multiple ML techniques for predictive modelling based on different T2DM-associated variables in an African population, Ghana. Methods: The study involves 219 T2DM patients and 219 healthy individuals who were recruited from the hospital and the local community, respectively. Anthropometric and biochemical information including glycated haemoglobin (HbA1c), body mass index (BMI), blood pressure, fasting blood sugar (FBS), and serum lipids [total cholesterol (TC), triglycerides (TG), high- and low-density lipoprotein cholesterol (HDL-c and LDL-c)] was collected. From these data, four ML classification algorithms, Naïve Bayes (NB), K-Nearest Neighbor (KNN), Support Vector Machines (SVM) and Decision Tree (DT), were used to predict T2DM. Precision, recall, F1-scores, Receiver Operating Characteristic (ROC) scores and the confusion matrix were computed to determine the performance of the various algorithms, while the importance of the feature attributes was determined by the recursive feature elimination technique. Results: All the classifiers performed beyond the acceptable threshold of 70% for precision, recall, F-score and accuracy. After building the predictive model, 82% of diabetic test data was detected by the NB classifier, of which 93% were accurately predicted. The SVM classifier was the second-best performing classifier, which yielded an overall accuracy of 84%. The non-T2DM test data yielded an accurate prediction score of 75% from the 98% of the proportion of the non-T2DM test data. KNN and DT yielded accuracies of 83% and 81%, respectively. NB has the best performance (AUC = 0.87) followed by SVM (AUC = 0.84), KNN (AUC = 0.85) and DT (AUC = 0.81). The three best feature attributes, in order of importance, are HbA1c, TC and BMI, whereas the three least important features are age, HDL-c and LDL-c. Conclusion: Based on the predictive performance and high accuracy, the study has shown the potential of ML as a robust forecasting tool for T2DM. Our results can be a benchmark for guiding policy decisions in T2DM surveillance in countries with limited resources and medical expertise, such as Ghana.
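The precision, recall and F1 scores used to compare the classifiers follow directly from counts of true and false positives and negatives. A minimal sketch with invented labels (1 = T2DM, 0 = non-T2DM) is given below. It illustrates the metric definitions only and does not reproduce the study's evaluation code.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}

# Invented test labels and predictions, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Precision answers how many predicted cases were real cases, recall answers how many real cases were found, and F1 is their harmonic mean, so a classifier cannot score well on F1 by favouring one at the expense of the other.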


Author(s):  
N. Munir ◽  
M. Awrangjeb ◽  
B. Stantic ◽  
G. Lu ◽  
S. Islam

Extraction of individual pylons and wires is important for modelling 3D objects in a power line corridor (PLC) map. However, existing methods mostly classify points into distinct classes such as pylons and wires, but hardly into individual pylons or wires. The proposed method extracts standalone pylons, vegetation and wires from LiDAR data. The extraction of individual objects is needed for detailed PLC mapping. The proposed approach starts with the separation of ground and non-ground points. The non-ground points are then classified into vertical (e.g., pylons and vegetation) and non-vertical (e.g., wires) object points using the vertical profile feature (VPF) with a binary support vector machine (SVM) classifier. Individual pylons and vegetation are then separated using their shape and area properties. The locations of pylons are further used to extract the span points between two successive pylons. Finally, the span points are voxelised, and the alignment properties of wires in the voxel grid are used to extract individual wire points. The results are evaluated on a dataset that has multiple spans with bundled wires in each span. The evaluation results show that the proposed method and features are very effective for extracting individual wires, pylons and vegetation, with 99% correctness and 98% completeness.
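The voxelisation step can be pictured as bucketing each LiDAR point by the integer index of the cube that contains it. The sketch below illustrates that idea only. The voxel size and point coordinates are invented, and the wire-alignment analysis that follows in the paper is not reproduced.

```python
import math
from collections import defaultdict

def voxelise(points, voxel_size):
    """Group (x, y, z) points by the integer index of the voxel containing them."""
    grid = defaultdict(list)
    for x, y, z in points:
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        grid[key].append((x, y, z))
    return dict(grid)

# Invented span points in metres; the 0.5 m voxel size is an assumption.
points = [
    (0.2, 0.1, 10.1),
    (0.4, 0.3, 10.4),
    (1.6, 0.2, 10.2),
    (-0.3, 0.2, 9.9),
]

grid = voxelise(points, voxel_size=0.5)
print(sorted(grid))
```

Using `math.floor` rather than integer truncation keeps points with negative coordinates in the correct voxel, since truncation would fold values just below zero into voxel 0.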


2019 ◽  
Vol 20 (9) ◽  
pp. 2344
Author(s):  
Yang Yang ◽  
Huiwen Zheng ◽  
Chunhua Wang ◽  
Wanyue Xiao ◽  
Taigang Liu

To reveal the working pattern of programmed cell death, knowledge of the subcellular location of apoptosis proteins is essential. Besides the costly and time-consuming method of experimental determination, research into computational locating schemes, focusing mainly on the innovation of representation techniques on protein sequences and the selection of classification algorithms, has become popular in recent decades. In this study, a novel tri-gram encoding model is proposed, which is based on using the protein overlapping property matrix (POPM) for predicting apoptosis protein subcellular location. Next, a 1000-dimensional feature vector is built to represent a protein. Finally, with the help of support vector machine-recursive feature elimination (SVM-RFE), we select the optimal features and put them into a support vector machine (SVM) classifier for predictions. The results of jackknife tests on two benchmark datasets demonstrate that our proposed method can achieve satisfactory prediction performance level with less computing capacity required and could work as a promising tool to predict the subcellular locations of apoptosis proteins.


2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
Mei-Ling Huang ◽  
Yung-Hsiang Hung ◽  
W. M. Lee ◽  
R. K. Li ◽  
Bo-Ru Jiang

Support vector machines (SVMs) perform well in classification and prediction and are widely used in disease diagnosis and medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for the Dermatology and Zoo databases. The Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order of explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, the Taguchi method was combined with the SVM classifier to optimize the parameters C and γ and so increase accuracy for multiclass classification. The experimental results show that the classification accuracy can exceed 95% after SVM-RFE feature selection and Taguchi parameter optimization for the Dermatology and Zoo databases.

