ground truth data
Recently Published Documents

Monitoring the extent of plateau forests has drawn much attention from governments because plateau forests play a key role in global carbon circulation. Despite recent advances in remote-sensing applications of satellite imagery over large regions, accurate mapping of plateau forests remains challenging due to limited ground truth information and high uncertainty in their spatial distribution. In this paper, we aim to generate a better segmentation map for plateau forests using high-resolution satellite imagery with limited ground-truth data. We present the first 2 m spatial resolution large-scale plateau forest dataset of Sanjiangyuan National Nature Reserve, including 38,708 plateau forest imagery samples and 1187 accurate, hand-labeled plateau forest ground truth masks. We then propose a few-shot learning method for mapping plateau forests. The proposed method has two stages: unsupervised feature extraction that leverages domain knowledge, and model fine-tuning using limited ground truth data. The proposed few-shot learning method reached an F1-score of 84.23% and outperformed state-of-the-art object segmentation methods. The results indicate that the proposed few-shot learning model could help large-scale plateau forest monitoring. The dataset proposed in this paper will soon be available online for the public.


Explainable AI for CNN-based Prostate Tumor Segmentation in Multi-parametric MRI Correlated to Whole Mount Histopathology

10.21203/rs.3.rs-1225229/v1 ◽ 2022

Author(s): Deepa Darshini Gunashekar ◽ Lars Bielak ◽ Leonard Hägele ◽ Arnie Berlin ◽ Benedict Oerther ◽ ...

Keyword(s): Prostate Gland ◽ Ground Truth ◽ Prostate Tumor ◽ Tumor Segmentation ◽ Ground Truth Data ◽ Heat Maps ◽ Whole Mount ◽ Histopathology Images ◽ Deep Learning Model ◽ Activation Map

Automatic prostate tumor segmentation is often unable to identify the lesion even if multi-parametric MRI data is used as input, and the segmentation output is difficult to verify due to the lack of clinically established ground truth images. In this work we use an explainable deep learning model to interpret the predictions of a convolutional neural network (CNN) for prostate tumor segmentation. The CNN uses a U-Net architecture and was trained on multi-parametric MRI data from 122 patients to automatically segment the prostate gland and prostate tumor lesions. In addition, co-registered ground truth data from whole mount histopathology images were available for 15 patients, who were used as the test set. To interpret the segmentation results of the CNN, heat maps were generated using the Gradient Weighted Class Activation Map (Grad-CAM) method. The CNN achieved mean Dice-Sørensen coefficients of 0.62 for the prostate gland and 0.31 for tumor lesions against the radiologist-drawn ground truth, and 0.32 for tumor lesions against the whole mount histology ground truth. Dice-Sørensen coefficients between CNN predictions and manual segmentations from MRI and histology data were not significantly different. In the prostate, the Grad-CAM heat maps could differentiate between tumor and healthy prostate tissue, which indicates that the image information in the tumor was essential for the CNN segmentation.
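For readers unfamiliar with the overlap metric quoted in this abstract, the following is a minimal Python sketch of the Dice-Sørensen coefficient for two binary masks. The function name, the flat 0/1 list representation, and the convention for two empty masks are assumptions made for illustration; they are not taken from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice-Sørensen coefficient between two binary masks (flat 0/1 sequences)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labelled positive in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total positives in each mask, summed.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Both masks empty: treated as perfect agreement (an assumed convention).
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [0, 1, 1, 1, 0, 0]   # hypothetical CNN output
    reference = [0, 0, 1, 1, 1, 0]   # hypothetical ground truth mask
    print(round(dice_coefficient(predicted, reference), 3))
```

A value of 1 means the masks coincide and 0 means they do not overlap at all, which puts the reported 0.31 to 0.62 range in context.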


Rice Mapping in Training Sample Shortage Regions Using a Deep Semantic Segmentation Model Trained on Pseudo-Labels

Remote Sensing ◽ 10.3390/rs14020328 ◽ 2022 ◽ Vol 14 (2) ◽ pp. 328

Author(s): Pengliang Wei ◽ Ran Huang ◽ Tao Lin ◽ Jingfeng Huang

Keyword(s): Large Scale ◽ Ground Truth ◽ Semantic Segmentation ◽ Training Sample ◽ Transfer Model ◽ Ground Truth Data ◽ Rice Distribution ◽ Multi Temporal ◽ Crop Mapping ◽ Model Training

A deep semantic segmentation model-based method can achieve state-of-the-art accuracy and high computational efficiency in large-scale crop mapping. However, the model cannot be widely used in actual large-scale crop mapping applications, mainly because the annotation of ground truth data for deep semantic segmentation model training is time-consuming. At the operational level, it is extremely difficult to obtain a large amount of ground reference data by photointerpretation for the model training. To solve this problem, this study introduces a workflow that aims to extract rice distribution information in training sample shortage regions, using a deep semantic segmentation model (i.e., U-Net) trained on pseudo-labels. Based on the time series Sentinel-1 images, Cropland Data Layer (CDL) and U-Net model, the optimal multi-temporal datasets for rice mapping were summarized, using the global search method. Then, based on the optimal multi-temporal datasets, the proposed workflow (a combination of K-Means and random forest) was directly used to extract the rice-distribution information of Jiangsu (i.e., the K–RF pseudo-labels). For comparison, the optimal well-trained U-Net model acquired from Arkansas (i.e., the transfer model) was also transferred to Jiangsu to extract local rice-distribution information (i.e., the TF pseudo-labels). Finally, the pseudo-labels with high confidences generated from the two methods were further used to retrain the U-Net models, which were suitable for rice mapping in Jiangsu. For different rice planting pattern regions of Jiangsu, the final results showed that, compared with the U-Net model trained on the TF pseudo-labels, the rice area extraction errors of pseudo-labels could be further reduced by using the U-Net model trained on the K–RF pseudo-labels. In addition, compared with the existing rule-based rice mapping methods, the U-Net model trained on the K–RF pseudo-labels could robustly extract the spatial distribution information of rice. Generally, this study could provide new options for applying a deep semantic segmentation model to training sample shortage regions.


Classification of Daily Crop Phenology in PhenoCams Using Deep Learning and Hidden Markov Models

Remote Sensing ◽ 10.3390/rs14020286 ◽ 2022 ◽ Vol 14 (2) ◽ pp. 286

Author(s): Shawn D. Taylor ◽ Dawn M. Browning

Keyword(s): Time Series ◽ Deep Learning ◽ Image Classification ◽ Hidden Markov ◽ Classification Model ◽ Common Source ◽ Post Processing ◽ Crop Phenology ◽ Ground Truth Data ◽ Near Surface

Near-surface cameras, such as those in the PhenoCam network, are a common source of ground truth data in modelling and remote sensing studies. Despite having locations across numerous agricultural sites, few studies have used near-surface cameras to track the unique phenology of croplands. Due to management activities, crops do not follow the natural vegetation cycle on which many phenological extraction methods are based. For example, a field may experience abrupt changes due to harvesting and tillage throughout the year. A single camera can also record several different plants due to crop rotations, fallow fields, and cover crops. Current methods to estimate phenology metrics from image time series compress all image information into a relative greenness metric, which discards a large amount of contextual information. This can include the type of crop present, whether snow or water is present on the field, the crop phenology, or whether a field lacking green plants consists of bare soil, fully senesced plants, or plant residue. Here, we developed a modelling workflow to create a daily time series of crop type and phenology, while also accounting for other factors such as obstructed images and snow-covered fields. We used a mainstream deep learning image classification model, VGG16. Deep learning classification models do not have a temporal component, so to account for temporal correlation among images, our workflow incorporates a hidden Markov model in the post-processing. The initial image classification model had out-of-sample F1 scores of 0.83–0.85, which improved to 0.86–0.91 after all post-processing steps. The resulting time series show the progression of crops from emergence to harvest, and can serve as a daily, local-scale dataset of field states and phenological stages for agricultural research.
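The hidden Markov post-processing idea can be illustrated with a small Viterbi decoder that smooths per-day classifier probabilities. This is only a sketch under stated assumptions: the two states, the transition matrix, the initial distribution, and the use of classifier probabilities as emission scores are all hypothetical choices for illustration and are not the authors' configuration.

```python
import math


def viterbi(emission_probs, transition, initial):
    """Most likely state sequence for a series of per-day class probabilities.

    emission_probs[t][s]: classifier probability of state s on day t
                          (used here as an emission score, an assumption)
    transition[i][j]:     P(state j tomorrow | state i today)
    initial[s]:           P(state s on the first day)
    """
    def log(x):
        return math.log(x) if x > 0 else float("-inf")

    n_states = len(initial)
    score = [log(initial[s]) + log(emission_probs[0][s]) for s in range(n_states)]
    back = []  # back[t][j] = best previous state when day t+1 is in state j
    for obs in emission_probs[1:]:
        prev = score
        pointers, score = [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: prev[i] + log(transition[i][j]))
            pointers.append(best_i)
            score.append(prev[best_i] + log(transition[best_i][j]) + log(obs[j]))
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


if __name__ == "__main__":
    # Hypothetical states: 0 = growing crop, 1 = bare soil.
    daily_probs = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.9, 0.1], [0.8, 0.2]]
    sticky = [[0.9, 0.1], [0.1, 0.9]]  # states rarely change from day to day
    print(viterbi(daily_probs, sticky, [0.5, 0.5]))
```

In this toy input the third day would be labelled bare soil by a per-image argmax, while the decoder keeps it as growing crop because a one-day switch and back is unlikely under the assumed transition matrix.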


Social Media Discussions Predict Mental Health Consultations on College Campuses

Scientific Reports ◽ 10.1038/s41598-021-03423-4 ◽ 2022 ◽ Vol 12 (1)

Author(s): Koustuv Saha ◽ Asra Yousuf ◽ Ryan L. Boyd ◽ James W. Pennebaker ◽ Munmun De Choudhury

Keyword(s): Mental Health ◽ College Students ◽ Social Media ◽ Ground Truth ◽ Treatment Needs ◽ Ground Truth Data ◽ Social Media Data ◽ Passive Sensor ◽ Mental Health Consultations ◽ Media Data

The mental health of college students is a growing concern, and their mental health needs are difficult to assess in real time and at scale. To address this gap, researchers and practitioners have encouraged the use of passive technologies. Social media is one such "passive sensor" that has shown potential for assessing mental health. However, the construct validity and in-practice reliability of computational assessments of mental health constructs with social media data remain largely unexplored. Towards this goal, we study how assessments of college students’ mental health based on social media data correspond with ground-truth data of on-campus mental health consultations. For a large U.S. public university, we obtained ground-truth data of on-campus mental health consultations between 2011 and 2016, and collected 66,000 posts from the university’s Reddit community. We adopted machine learning and natural language methodologies to measure symptomatic mental health expressions of depression, anxiety, stress, suicidal ideation, and psychosis on the social media data. Seasonal auto-regressive integrated moving average (SARIMA) models of forecasting on-campus mental health consultations showed that incorporating social media data led to predictions with r = 0.86 and SMAPE = 13.30, outperforming models without social media data by 41%. Our language analyses revealed that social media discussions during months with many mental health consultations consisted of discussions on academics and career, whereas months with few mental health consultations saliently show expressions of positive affect, collective identity, and socialization. This study reveals that social media data can improve our understanding of college students’ mental health, particularly their mental health treatment needs.
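The SMAPE figure in this abstract is a symmetric mean absolute percentage error. Several variants of the formula are in use and the abstract does not say which one the authors applied, so the sketch below assumes the common form that averages |F - A| / ((|A| + |F|) / 2) and scales to percent. The function name and the sample numbers are hypothetical.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent (0 to 200).

    Assumed variant: mean of |f - a| / ((|a| + |f|) / 2), times 100.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    terms = []
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2.0
        # When both values are zero the forecast is exact, so the term is 0.
        terms.append(0.0 if denom == 0 else abs(f - a) / denom)
    return 100.0 * sum(terms) / len(terms)


if __name__ == "__main__":
    consultations = [100, 200]   # hypothetical monthly counts
    predicted = [110, 180]       # hypothetical forecasts
    print(round(smape(consultations, predicted), 2))
```

Lower values indicate closer forecasts under this definition.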


Monitoring of Spectral Signatures of Maize Crop using Temporal SAR and Optical Remote Sensing data

International Journal of Bio-resource and Stress Management ◽ 10.23910/1.2021.2482 ◽ 2021 ◽ Vol 12 (6) ◽ pp. 745-750

Author(s): D. Anil Kumar ◽ P. Srikanth ◽ T. L. Neelima ◽ M. Uma Devi ◽ ...

Keyword(s): Vegetation Index ◽ Plant Density ◽ Optical Data ◽ Maturity Stage ◽ Optical Remote Sensing ◽ Maize Crop ◽ Ground Truth Data ◽ Backscatter Intensity ◽ Multi Temporal ◽ Sentinel 2A

A study was carried out using temporal Sentinel-1B microwave data (June to November at 12-day intervals) and Sentinel-2A/2B optical data (June to November) to discriminate the maize crop from the competing crops, rice and cotton, in Siddipet district, Telangana state, India during kharif 2019 (June to November). The study combined data from multiple sources, namely multi-temporal VH backscatter intensity from Sentinel-1B SAR and NDVI values from Sentinel-2A/2B, with field data to discriminate the maize crop. Synchronous with the satellite pass, ground truth data on crop parameters, viz. crop stage, crop vigour, biomass, plant height, plant density, soil moisture, LAI and chlorophyll content, were collected. Multi-temporal VH backscatter intensity and Normalized Difference Vegetation Index (NDVI) data were used to characterize the backscatter and greenness behaviour of the maize crop. The backscatter intensity (dB) for the maize crop ranged from -21.83 (the lowest value) at planting to -12.52 (the highest value) at peak growth stage. The NDVI values during the vegetative and reproductive stages (August and September) were >0.6, and from senescence to harvesting the values were less than or equal to 0.52. The increase in backscatter intensity from the initial vegetative stage to the peak stage was due to increased volume scattering from the maize crop canopy, and the continuous decline in VH backscatter intensity at the maturity stage was due to the decrease in greenness and moisture content in the leaves. These patterns helped discriminate the maize crop from the other dominant kharif crops in the study area.
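The NDVI values quoted in this abstract come from the standard normalized difference of near-infrared and red reflectance. A minimal Python sketch follows; the function name, the sample reflectances, and the zero-denominator convention are assumptions for illustration.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    denom = nir + red
    if denom == 0:
        # No signal in either band; returning 0.0 is an assumed convention.
        return 0.0
    return (nir - red) / denom


if __name__ == "__main__":
    # Hypothetical surface reflectances for a dense green canopy.
    print(round(ndvi(0.5, 0.1), 3))
```

Dense green vegetation reflects strongly in the near infrared and absorbs red light, so the index rises during vigorous growth and falls as the canopy senesces.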


QuestionComb: A Gamification Approach for the Visual Explanation of Linguistic Phenomena through Interactive Labeling

ACM Transactions on Interactive Intelligent Systems ◽ 10.1145/3429448 ◽ 2021 ◽ Vol 11 (3-4) ◽ pp. 1-38

Author(s): Rita Sevastjanova ◽ Wolfgang Jentner ◽ Fabian Sperrle ◽ Rebecca Kehlbeck ◽ Jürgen Bernard ◽ ...

Keyword(s): Machine Learning ◽ Information Seeking ◽ Visual Analytics ◽ Evaluation Studies ◽ Model Performance ◽ Ground Truth ◽ Training Data ◽ Supervised Machine Learning ◽ Ground Truth Data ◽ The Creation

Linguistic insight in the form of high-level relationships and rules in text builds the basis of our understanding of language. However, the data-driven generation of such structures often lacks labeled resources that can be used as training data for supervised machine learning. The creation of such ground-truth data is a time-consuming process that often requires domain expertise to resolve text ambiguities and characterize linguistic phenomena. Furthermore, the creation and refinement of machine learning models is often challenging for linguists because the models tend to be complex, non-transparent, and difficult to understand. To tackle these challenges, we present a visual analytics technique for interactive data labeling that applies concepts from gamification and explainable Artificial Intelligence (XAI) to support complex classification tasks. The visual-interactive labeling interface promotes the creation of effective training data. Visual explanations of learned rules unveil the decisions of the machine learning model and support iterative and interactive optimization. The gamification-inspired design guides the user through the labeling process and provides feedback on the model performance. As an instance of the proposed technique, we present QuestionComb, a workspace tailored to the task of question classification (i.e., information-seeking vs. non-information-seeking questions). Our evaluation studies confirm that gamification concepts are beneficial to engage users through continuous feedback, offering an effective visual analytics technique when combined with active learning and XAI.


Integrating SAR and Optical Remote Sensing for Conservation-Targeted Wetlands Mapping

Remote Sensing ◽ 10.3390/rs14010159 ◽ 2021 ◽ Vol 14 (1) ◽ pp. 159

Author(s): Hossein Sahour ◽ Kaylan M. Kemink ◽ Jessica O’Connell

Keyword(s): Surface Water ◽ Google Earth ◽ Machine Learning Algorithms ◽ Emergent Vegetation ◽ Water Bodies ◽ Depressional Wetlands ◽ Support Vector ◽ Ground Truth Data ◽ Synthetic Aperture Radar Data ◽ Sentinel 2

The Prairie Pothole Region (PPR) contains numerous depressional wetlands known as potholes that provide habitats for waterfowl and other wetland-dependent species. Mapping these wetlands is essential for identifying viable waterfowl habitat and conservation planning scenarios, yet it is a challenging task due to the small size of the potholes, and the presence of emergent vegetation. This study develops an open-source process within the Google Earth Engine platform for mapping the spatial distribution of wetlands through the integration of Sentinel-1 C-band SAR (synthetic aperture radar) data with high-resolution (10-m) Sentinel-2 bands. We used two machine-learning algorithms (random forest (RF) and support vector machine (SVM)) to identify wetlands across the study area through supervised classification of the multisensor composite. We trained the algorithms with ground truth data provided through field studies and aerial photography. The accuracy was assessed by comparing the predicted and actual wetland and non-wetland classes using statistical coefficients (overall accuracy, Kappa, sensitivity, and specificity). For this purpose, we used four different out-of-sample test subsets, including the same year, next year, small vegetated, and small non-vegetated test sets to evaluate the methods on different spatial and temporal scales. The results were also compared to Landsat-derived JRC surface water products, and the Sentinel-2-derived normalized difference water index (NDWI). The wetlands derived from the RF model (overall accuracy 0.76 to 0.95) yielded favorable results, and outperformed the SVM, NDWI, and JRC products in all four testing subsets. To provide a further characterization of the potholes, the water bodies were stratified based on the presence of emergent vegetation using Sentinel-2-derived NDVI, and, after excluding permanent water bodies, using the JRC surface water product. The algorithm presented in the study is scalable and can be adopted for identifying wetlands in other regions of the world.
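The four accuracy coefficients named in this abstract can all be derived from a two-class confusion matrix. The following Python sketch shows one way to compute them; the function name, the dictionary layout, and the example counts are hypothetical and do not come from the study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Overall accuracy, Cohen's kappa, sensitivity and specificity
    from the four cells of a wetland / non-wetland confusion matrix."""
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    # Agreement expected by chance, from the row and column totals.
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else float("nan")
    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


if __name__ == "__main__":
    # Hypothetical counts: 40 wetlands found, 10 false alarms,
    # 5 wetlands missed, 45 non-wetlands correctly rejected.
    for name, value in binary_metrics(tp=40, fp=10, fn=5, tn=45).items():
        print(name, round(value, 3))
```

Kappa discounts the agreement that would occur by chance, which is why it can be much lower than overall accuracy when one class dominates the test set.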


CARTOGRAPHY OF MOROCCAN ARGAN TREE USING COMBINED OPTICAL AND SAR IMAGERY INTEGRATED WITH DIGITAL ELEVATION MODEL

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽ 10.5194/isprs-archives-xlvi-4-w5-2021-211-2021 ◽ 2021 ◽ Vol XLVI-4/W5-2021 ◽ pp. 211-217

Author(s): E. Elmoussaoui ◽ A. Moumni ◽ A. Lahrouni

Keyword(s): Machine Learning ◽ Time Series ◽ Digital Elevation Model ◽ Optical Data ◽ Support Vector ◽ Ground Truth Data ◽ Argan Tree ◽ Digital Elevation ◽ Elevation Model ◽ Sentinel 2

Forest tree species mapping has become easier due to the global availability of high spatio-temporal resolution images acquired from multiple sensors. Such data can lead to better forest resource management. Machine-learning pixel-based analysis was applied to multi-spectral Sentinel-2 and Synthetic Aperture Radar Sentinel-1 time series integrated with a Digital Elevation Model acquired over the Argan forest of Essaouira province, Morocco. The argan tree constitutes a fundamental resource for the populations of this arid area of Morocco. This research aims to use the combination of multi-sensor data to detect and map the argan tree and distinguish it from other forest species using three Machine Learning algorithms: Support Vector Machine (SVM), Maximum Likelihood (ML) and Artificial Neural Networks (ANN). The exploited datasets included Sentinel-1 (S1) and Sentinel-2 (S2) time series, a Shuttle Radar Topography Mission Digital Elevation Model (DEM) layer, and ground truth data. We tested several scenarios, including single S1 derived features, single S2 time series, and combined S1 and S2 derived layers with the DEM layer. The best results (overall accuracy OA and Kappa coefficient K) were obtained as follows: from time series of optical data (NDVI), OA = 86.87%, K = 0.84; from time series of SAR data (VV+VH/VV), OA = 45.90%, K = 0.36; from the combination of optical and SAR time series (NDVI+VH+DEM), OA = 93.01%, K = 0.914; and from the fusion of optical time series and DEM layer (NDVI+DEM), OA = 93.25%, K = 0.91. These results indicate that a single sensor (S2) integrated with the DEM layer gave the highest classification accuracy.


Developing Multi-Source Indices to Discriminate between Native Tropical Forests, Oil Palm and Rubber Plantations in Indonesia

Remote Sensing ◽ 10.3390/rs14010003 ◽ 2021 ◽ Vol 14 (1) ◽ pp. 3

Author(s): Inggit Lolita Sari ◽ Christopher J. Weston ◽ Glenn J. Newnham ◽ Liubov Volkova

Keyword(s): Land Cover ◽ Oil Palm ◽ Forest Monitoring ◽ Rubber Plantation ◽ Native Forest ◽ Landsat 8 ◽ Ground Truth Data ◽ Rubber Plantations ◽ Radar Images ◽ Land Cover Maps

Over the last 18 years, Indonesia has experienced significant deforestation due to the expansion of oil palm and rubber plantations. Accurate land cover maps are essential for policymakers to track and manage land change to support sustainable forest management and investment decisions. An automatic digital processing (ADP) method is currently used to develop land cover change maps for Indonesia, based on optical imaging (Landsat). Such maps produce only forest and non-forest classes, and often oil palm and rubber plantations are misclassified as native forests. To improve accuracy of these land cover maps, this study developed oil palm and rubber plantation discrimination indices using the integration of Landsat-8 and synthetic aperture radar Sentinel-1 images. Sentinel-1 VH and VV difference (>7.5 dB) and VH backscatter intensity were used to discriminate oil palm plantations. A combination of Landsat-8 NDVI, NDMI with Sentinel-1 VV and VH were used to discriminate rubber plantations. The improved map produced four land cover classes: native forest, oil palm plantation, rubber plantation, and non-forest. High-resolution SPOT 6/7 imagery and ground truth data were used for validation of the new classified maps. The map had an overall accuracy of 92%; producer’s accuracy for all classes was higher than 90%, except for rubber (65%), and user’s accuracy was over 80% for all classes. These results demonstrate that indices developed from a combination of optical and radar images can improve our ability to discriminate between native forest and oil palm and rubber plantations in the tropics. The new mapping method will help to support Indonesia’s national forest monitoring system and inform monitoring of plantation expansion.
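The polarization-difference rule mentioned in this abstract can be sketched as a simple threshold test in Python. Only the 7.5 dB figure comes from the abstract. The sign convention (VV minus VH), the function name, and the sample backscatter values are assumptions, and the additional VH intensity criterion is omitted because the abstract gives no threshold for it.

```python
OIL_PALM_DIFF_DB = 7.5  # difference threshold quoted in the abstract


def is_oil_palm_candidate(vv_db, vh_db, diff_threshold_db=OIL_PALM_DIFF_DB):
    """Flag a pixel whose VV - VH backscatter difference exceeds the threshold.

    Inputs are backscatter intensities in dB. Taking the difference as
    VV - VH is an assumption; the abstract does not state the order.
    """
    return (vv_db - vh_db) > diff_threshold_db


if __name__ == "__main__":
    # Hypothetical pixels: (VV, VH) in dB.
    for vv, vh in [(-6.0, -15.0), (-8.0, -13.0)]:
        print(vv, vh, is_oil_palm_candidate(vv, vh))
```

A rule of this kind is cheap to apply per pixel, which suits national-scale mapping, but the threshold would need to be checked against local reference data before reuse elsewhere.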


Mapping Large-Scale Plateau Forest in Sanjiangyuan Using High-Resolution Satellite Imagery and Few-Shot Learning