SYNERGISTIC USE OF SENTINEL-1 AND SENTINEL-2 TIME SERIES FOR POPLAR PLANTATIONS MONITORING AT LARGE SCALE

Abstract. The current availability of Earth Observation satellite data at high spatial and temporal resolutions makes it possible to map large areas. Although supervised classification is the most widely adopted approach, its performance is highly dependent on the availability and the quality of training data. However, gathering samples from field surveys or through photo interpretation is often expensive and time-consuming, especially when the area to be classified is large. In this paper we propose the use of an active learning-based technique to address this issue by reducing the labelling effort required for supervised classification while increasing the generalisation capabilities of the classifier across space. Experiments were conducted to identify poplar plantations in three different sites in France using Sentinel-2 time series. In order to characterise the age of the identified poplar stands, temporal means of Sentinel-1 backscatter coefficients were computed. The results are promising and show that the active learning-based approach can achieve performance similar to traditional passive learning (i.e. with random selection of samples), with a Poplar F-score ≥ 90%, using up to 50% fewer training samples. Sentinel-1 annual means have demonstrated their potential to differentiate two stand ages with an overall accuracy of 83% regardless of the cultivar considered.

From Local to Global: A Transfer Learning-Based Approach for Mapping Poplar Plantations at Large Scale

10.20944/preprints202004.0302.v1 ◽

2020 ◽

Author(s):

Yousra Hamrouni ◽

Éric Paillassa ◽

Véronique Chéret ◽

Claude Monteil ◽

David Sheeren

Keyword(s):

Active Learning ◽

Random Sampling ◽

Large Scale ◽

Forest Cover ◽

Classification Performance ◽

Global Model ◽

Source Image ◽

National Scale ◽

Training Samples ◽

Poplar Plantations

Reliable estimates of poplar plantations area are not available at the French national scale due to the unsuitability and low update rate of existing forest databases for this short-rotation species. While supervised classification methods have been shown to be highly accurate in mapping forest cover from remotely sensed images, their performance depends to a great extent on the labelled samples used to build the models. In addition to their high acquisition cost, such samples are often scarce and not fully representative of the variability in class distributions. Consequently, when classification models are applied to large areas with high intra-class variance, they generally yield poor accuracies. In this paper, we propose the use of active learning (AL) to efficiently adapt a classifier trained on a source image to spatially distinct target images with minimal labelling effort and without sacrificing classification performance. The adaptation consists in actively adding new relevant training samples from other areas to the initial local model, in a cascade that iteratively improves the generalisation capabilities of the classifier, leading to a global model tailored to different areas. This active selection relies on uncertainty sampling to focus directly on the most informative pixels, i.e. those whose class labels the algorithm is least certain of. Experiments conducted on Sentinel-2 time series showed that when the same number of training samples was used, active learning outperformed passive learning (random sampling) by up to 5% of overall accuracy and up to 12% of class F-score. In addition, and depending on the class considered, random sampling required up to 50% more samples to achieve the same performance as an active learning-based model. Moreover, the results demonstrate the suitability of the derived global model to accurately map poplar plantations among other tree species with overall accuracy values up to 14% higher than those obtained with local models. The proposed approach paves the way for national-scale mapping in an operational context.
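
The uncertainty sampling step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which uncertainty measure the authors use, so Shannon entropy is assumed here purely for illustration. The function names, the class list and the probability values are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(class_probs, n_queries):
    """Return the indices of the n_queries samples with the highest entropy.

    class_probs is a list of per-sample probability vectors, as produced by
    any probabilistic classifier (assumption: probabilities sum to 1).
    """
    ranked = sorted(
        range(len(class_probs)),
        key=lambda i: entropy(class_probs[i]),
        reverse=True,
    )
    return ranked[:n_queries]


# Hypothetical posterior probabilities for four target-image pixels over
# three classes (e.g. poplar, other deciduous, conifer).
pixel_probs = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # uncertain
    [0.70, 0.20, 0.10],  # fairly confident
    [0.34, 0.33, 0.33],  # almost uniform: most uncertain
]

# Pixels proposed for labelling in this iteration.
queries = select_most_uncertain(pixel_probs, n_queries=2)
print(queries)
```

In an active learning loop of this kind, the selected pixels would be labelled, added to the training set, and the classifier retrained before the next selection round.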

CLASSIFICATION OF TIME SERIES OF SENTINEL-2 IMAGES FOR LARGE SCALE MAPPING IN CAMEROON

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xliii-b3-2020-633-2020 ◽

2020 ◽

Vol XLIII-B3-2020 ◽

pp. 633-640

Author(s):

H. Tagne ◽

A. Le Bris ◽

D. Monkam ◽

C. Mallet

Keyword(s):

Time Series ◽

Land Cover ◽

Large Scale ◽

State Of The Art ◽

Training Data ◽

Dense Image ◽

Supervised Classifiers ◽

Long Time ◽

Sentinel 2

Abstract. Sentinel-2 satellites provide dense image time series exhibiting high spectral, spatial and temporal resolution. These images are of particular interest for mapping Land-Cover (LC) at large scale. LC maps can now be computed on a yearly basis at the scale of a country with efficient supervised classifiers, assuming suitable training data are available. However, the efficient exploitation of large amounts of Sentinel-2 imagery still remains challenging over unexplored areas where state-of-the-art classifiers are prone to fail. This paper focuses on Land-Cover mapping over Cameroon for the purpose of updating the national topographic geodatabase. The ι2 framework is adopted and tested against the specificities of the country. Here, experiments focus on five generic classes, which makes it possible to provide robust focusing masks for higher-resolution classifications. Two strategies are compared: (i) an LC map is calculated from a year-long time series and (ii) monthly LC maps are generated and merged into a single yearly map. Satisfactory accuracy scores are obtained, providing a first step towards finer-grained map retrieval.

New active learning algorithms for near-infrared spectroscopy in agricultural applications

at - Automatisierungstechnik ◽

10.1515/auto-2020-0143 ◽

2021 ◽

Vol 69 (4) ◽

pp. 297-306

Author(s):

Julius Krause ◽

Maurice Günder ◽

Daniel Schulz ◽

Robin Gruna

Keyword(s):

Active Learning ◽

Near Infrared ◽

Agricultural Products ◽

Training Data ◽

Calibration Model ◽

Learning Approaches ◽

Training Samples ◽

Agricultural Applications ◽

Selection Of

Abstract. The selection of training data determines the quality of a chemometric calibration model. In order to cover the entire parameter space of known influencing parameters, an experimental design is usually created. Nevertheless, even with a carefully prepared Design of Experiment (DoE), redundant reference analyses are often performed during the analysis of agricultural products. Because the number of possible reference analyses is usually very limited, the presented active learning approaches are intended to provide a tool for better selection of training samples.

MetaTP

Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies ◽

10.1145/3478083 ◽

2021 ◽

Vol 5 (3) ◽

pp. 1-28

Author(s):

Weida Zhong ◽

Qiuling Suo ◽

Abhishek Gupta ◽

Xiaowei Jia ◽

Chunming Qiao ◽

...

Keyword(s):

Time Series ◽

Large Scale ◽

Multivariate Time Series ◽

Modern Society ◽

Training Data ◽

Traffic Prediction ◽

Temporal Prediction ◽

Reference Space ◽

Meta Learning ◽

Real World Datasets

With the popularity of smartphones, large-scale road sensing data is being collected to perform traffic prediction, which is an important task in modern society. Due to the nature of the roving sensors on smartphones, the collected traffic data, which is in the form of multivariate time series, is often temporally sparse and unevenly distributed across regions. Moreover, different regions can have different traffic patterns, which makes it challenging to adapt models learned from regions with sufficient training data to target regions. Given that many regions may have very sparse data, it is also impossible to build individual models for each region separately. In this paper, we propose a meta-learning based framework named MetaTP to overcome these challenges. MetaTP has two key parts, i.e., a basic traffic prediction network (base model) and meta-knowledge transfer. In the base model, a two-layer interpolation network is employed to map the original time series onto uniformly spaced reference time points, so that temporal prediction can be effectively performed in the reference space. The meta-learning framework is employed to transfer knowledge from source regions with a large amount of data to target regions with only a few data examples via fast adaptation, in order to improve model generalizability on target regions. Moreover, we use two memory networks to capture the global patterns of spatial and temporal information across regions. We evaluate the proposed framework on two real-world datasets, and experimental results show the effectiveness of the proposed framework.

Adapting SVM for data sparseness and imbalance: a case study in information extraction

Natural Language Engineering ◽

10.1017/s1351324908004968 ◽

2009 ◽

Vol 15 (2) ◽

pp. 241-271 ◽

Cited By ~ 31

Author(s):

Yaoyong Li ◽

Kalina Bontcheva ◽

Hamish Cunningham

Keyword(s):

Active Learning ◽

Language Learning ◽

Information Extraction ◽

Language Processing ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Training Data ◽

Support Vector ◽

Passive Learning ◽

Wide Range

Abstract. Support Vector Machines (SVM) have been used successfully in many Natural Language Processing (NLP) tasks. The novel contribution of this paper is in investigating two techniques for making SVM more suitable for language learning tasks. Firstly, we propose an SVM with uneven margins (SVMUM) model to deal with the problem of imbalanced training data. Secondly, SVM active learning is employed in order to alleviate the difficulty in obtaining labelled training data. The algorithms are presented and evaluated on several Information Extraction (IE) tasks, where they achieved better performance than the standard SVM and the SVM with passive learning, respectively. Moreover, by combining SVMUM with the active learning algorithm, we achieve the best reported results on the seminars and jobs corpora, which are benchmark data sets used for evaluation and comparison of machine learning algorithms for IE. In addition, we also evaluate the token based classification framework for IE with three different entity tagging schemes. In comparison to previous methods dealing with the same problems, our methods are both effective and efficient, which are valuable features for real-world applications. Due to the similarity in the formulation of the learning problem for IE and for other NLP tasks, the two techniques are likely to be beneficial in a wide range of applications.

Combining Self-supervised Learning and Active Learning for Disfluency Detection

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3487290 ◽

2022 ◽

Vol 21 (3) ◽

pp. 1-25

Author(s):

Shaolei Wang ◽

Zhongyuan Wang ◽

Wanxiang Che ◽

Sendong Zhao ◽

Ting Liu

Keyword(s):

Neural Network ◽

Active Learning ◽

Supervised Learning ◽

Large Scale ◽

Training Data ◽

Fine Tuning ◽

Training Dataset ◽

Performance Gap ◽

Annotation Costs ◽

Trained Neural Network

Spoken language is fundamentally different from written language in that it contains frequent disfluencies, or parts of an utterance that are corrected by the speaker. Disfluency detection (removing these disfluencies) is desirable to clean the input for use in downstream NLP tasks. Most existing approaches to disfluency detection rely heavily on human-annotated data, which is scarce and expensive to obtain in practice. To tackle the training data bottleneck, in this work, we investigate methods for combining self-supervised learning and active learning for disfluency detection. First, we construct large-scale pseudo training data by randomly adding or deleting words from unlabeled data and propose two self-supervised pre-training tasks: (i) a tagging task to detect the added noisy words and (ii) sentence classification to distinguish original sentences from grammatically incorrect sentences. We then combine these two tasks to jointly pre-train a neural network. The pre-trained neural network is then fine-tuned using human-annotated disfluency detection training data. The self-supervised learning method can capture task-specific knowledge for disfluency detection and achieve better performance when fine-tuning on a small annotated dataset compared to other supervised methods. However, because the pseudo training data are generated with simple heuristics and cannot fully cover all disfluency patterns, there is still a performance gap compared to the supervised models trained on the full training dataset. We further explore how to bridge the performance gap by integrating active learning during the fine-tuning process. Active learning strives to reduce annotation costs by choosing the most critical examples to label and can address the weakness of self-supervised learning with a small annotated dataset. We show that by combining self-supervised learning with active learning, our model is able to match state-of-the-art performance with just about 10% of the original training data on both the commonly used English Switchboard test set and a set of in-house annotated Chinese data.
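
The pseudo training data construction described in this abstract (randomly adding or deleting words, then tagging the added ones) can be sketched as follows. This is only an illustration, not the authors' implementation: the function name, the noise vocabulary and the probabilities `p_add` and `p_delete` are hypothetical choices that the abstract does not specify.

```python
import random


def make_pseudo_example(tokens, vocabulary, rng, p_add=0.15, p_delete=0.15):
    """Corrupt a clean sentence and return (noisy_tokens, tags).

    tags[i] is 1 if noisy_tokens[i] was inserted as noise and 0 if it is an
    original word. Deleted words simply disappear from the output.
    """
    noisy, tags = [], []
    for token in tokens:
        if rng.random() < p_add:
            # Insert a random word before the current token.
            noisy.append(rng.choice(vocabulary))
            tags.append(1)
        if rng.random() < p_delete:
            # Drop the original word.
            continue
        noisy.append(token)
        tags.append(0)
    return noisy, tags


# Hypothetical clean sentence and noise vocabulary.
rng = random.Random(0)
sentence = "we want a flight to boston tomorrow".split()
noise_vocab = ["uh", "the", "flight", "i"]

noisy_tokens, noise_tags = make_pseudo_example(sentence, noise_vocab, rng)
print(list(zip(noisy_tokens, noise_tags)))
```

The per-token tags would feed the tagging pre-training task. A sentence-level label (original versus corrupted) for the second task can be derived by checking whether the noisy sentence differs from the original.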

Mapping Winter Crops in China with Multi-Source Satellite Imagery and Phenology-Based Algorithm

Remote Sensing ◽

10.3390/rs11070820 ◽

2019 ◽

Vol 11 (7) ◽

pp. 820 ◽

Cited By ~ 8

Author(s):

Haifeng Tian ◽

Ni Huang ◽

Zheng Niu ◽

Yuchu Qin ◽

Jie Pei ◽

...

Keyword(s):

Time Series ◽

Large Scale ◽

Time Windows ◽

Vegetation Indices ◽

Optical Images ◽

Landsat 7 ◽

Multi Temporal ◽

Winter Crop ◽

Winter Crops ◽

Sentinel 2

Timely and accurate mapping of winter crop planting areas in China is important for food security assessment at a national level. Time-series of vegetation indices, such as the normalized difference vegetation index (NDVI), are widely used for crop mapping, as they can characterize the growth cycle of crops. However, with the moderate spatial resolution optical imagery acquired by Landsat and Sentinel-2, it is difficult to obtain complete time-series curves for vegetation indices due to the influence of the revisit cycle of the satellite and weather conditions. Therefore, in this study, we propose a method for compositing the multi-temporal NDVI, in order to map winter crop planting areas with the Landsat-7 and -8 and Sentinel-2 optical images. The algorithm composites the multi-temporal NDVI into three key values, according to two time-windows for the winter crops: a period of low NDVI values and a period of high NDVI values. First, we identify the two time-windows, according to the time-series of the NDVI obtained from daily Moderate Resolution Imaging Spectroradiometer observations. Second, the 30 m spatial resolution multi-temporal NDVI curve, derived from the Landsat-7 and -8 and Sentinel-2 optical images, is composited by selecting the maximal value in the high NDVI value period, and the minimal and median values in the low NDVI value period, using an algorithm of the Google Earth Engine. Third, a decision tree classification method is utilized to perform the winter crop classification at a pixel level. The results indicate that this method is effective for the large-scale mapping of winter crops. In the study area, the area of winter crops in 2018 was determined to be 207,641 km², with an overall accuracy of 96.22% and a kappa coefficient of 0.93. The method proposed in this paper is expected to contribute to the rapid and accurate mapping of winter crops in large-scale applications and analyses.
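
The compositing rule stated in this abstract (maximum NDVI in the high-value window, minimum and median NDVI in the low-value window) can be sketched for a single pixel in plain Python. The study itself runs on the Google Earth Engine. The function names, the day-of-year windows and the sample values below are hypothetical and only serve to show the rule.

```python
from statistics import median


def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def composite_ndvi(series, low_window, high_window):
    """Reduce one pixel's NDVI time series to three key values.

    series: list of (day_of_year, ndvi_value) observations.
    low_window, high_window: inclusive (start_day, end_day) tuples.
    Returns None when either window has no valid observation.
    """
    low = [v for d, v in series if low_window[0] <= d <= low_window[1]]
    high = [v for d, v in series if high_window[0] <= d <= high_window[1]]
    if not low or not high:
        return None
    return {
        "high_max": max(high),
        "low_min": min(low),
        "low_median": median(low),
    }


# Hypothetical cloud-free observations for one pixel.
observations = [
    (90, 0.78), (100, 0.82), (110, 0.80),   # assumed high-NDVI window
    (160, 0.21), (175, 0.18), (190, 0.25),  # assumed low-NDVI window
]

features = composite_ndvi(observations, low_window=(150, 200), high_window=(80, 120))
print(features)
```

The three resulting values per pixel would then be passed to the decision tree classifier mentioned in the abstract.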

Potential of Multiway PLS (N-PLS) Regression Method to Analyse Time-Series of Multispectral Images: A Case Study in Agriculture

Remote Sensing ◽

10.3390/rs14010216 ◽

2022 ◽

Vol 14 (1) ◽

pp. 216

Author(s):

Eva Lopez-Fornieles ◽

Guilhem Brunel ◽

Florian Rancon ◽

Belal Gaci ◽

Maxime Metz ◽

...

Keyword(s):

Remote Sensing ◽

Time Series ◽

Heat Wave ◽

Large Scale ◽

Regional Scale ◽

Pls Regression ◽

Wave Impact ◽

Sensing Applications ◽

Sentinel 2

Recent literature reflects the substantial progress in combining spatial, temporal and spectral capacities for remote sensing applications. As a result, new issues are arising, such as the need for methodologies that can simultaneously process the different dimensions of satellite information. This paper presents PLS regression extended to three-way data in order to integrate multiple wavelengths as variables measured at several dates (time-series) and locations with Sentinel-2 at a regional scale. Considering that the multi-collinearity problem is present in remote sensing time-series to estimate one response variable and that the dataset is multidimensional, a multiway partial least squares (N-PLS) regression approach may be relevant to relate image information to ground variables of interest. N-PLS is an extension of the ordinary PLS regression algorithm where the bilinear model of predictors is replaced by a multilinear model. This paper presents a case study within the context of agriculture, conducted on a time-series of Sentinel-2 images covering regional scale scenes of southern France impacted by the heat wave episode that occurred on 28 June 2019. The model has been developed based on available heat wave impact data for 107 vineyard blocks in the Languedoc-Roussillon region and multispectral time-series predictor data for the period May to August 2019. The results validated the effectiveness of the proposed N-PLS method in estimating yield loss from spectral and temporal attributes. The performance of the model was evaluated by the R² obtained on the prediction set (0.661), and the root mean square error (RMSE), which was 10.7%. The limitations of the approach when dealing with time-series of large-scale images, which are a source of challenges, are discussed; however, N-PLS regression seems to be a suitable choice for analysing complex multispectral imagery data with different spectral domains and with a clear temporal evolution, such as an extreme weather event.

Reduced Annotation Based on Deep Active Learning for Arabic Text Detection in Natural Scene Images

10.36227/techrxiv.17327963 ◽

2021 ◽

Author(s):

Khalil Boukthir ◽

Abdulrahman M. Qahtani ◽

Omar Almutiry ◽

Habib Dhahri ◽

Adel Alimi

Keyword(s):

Active Learning ◽

Text Detection ◽

Training Data ◽

Arabic Text ◽

Natural Scene ◽

Novel Approach ◽

Training Samples ◽

Scene Text ◽

Text Images ◽

Natural Scene Images

- A novel approach is presented to reduce annotation based on Deep Active Learning for Arabic text detection in Natural Scene Images.
- A new Arabic text image dataset (7k images), named TSVD, built using the Google Street View service.
- A new semi-automatic method for generating natural scene text images from the streets.
- Training samples are reduced to 1/5 of the original training size on average.
- Much less training data is needed to achieve a better Dice index: 0.84.
