Improving SAR Altimeter processing over the coastal zone and inland waters - the ESA HYDROCOASTAL project

Mapping Intimacies ◽

10.5194/egusphere-egu21-9 ◽

2021 ◽

Author(s):

David Cotton ◽

Keyword(s):

Coastal Zone ◽

Test Data ◽

River Discharge ◽

Altimeter Data ◽

Inland Waters ◽

Data Sets ◽

Data Set ◽

Discharge Data ◽

Processing Algorithms ◽

The Impact

Introduction: HYDROCOASTAL is a two-year project funded by ESA. Its objective is to maximise the exploitation of SAR and SARin altimeter measurements in the coastal zone and inland waters by evaluating and implementing new approaches to processing SAR and SARin data from CryoSat-2 and SAR altimeter data from Sentinel-3A and Sentinel-3B. Optical data from the Sentinel-2 MSI and Sentinel-3 OLCI instruments will also be used in generating River Discharge products. New SAR and SARin processing algorithms for the coastal zone and inland waters will be developed, implemented, and evaluated through an initial Test Data Set for selected regions. From the results of this evaluation, a processing scheme will be implemented to generate global coastal zone and river discharge data sets. A series of case studies will assess these products in terms of their scientific impacts. All the data sets produced will be available on request to external researchers, and full descriptions of the processing algorithms will be provided.
Objectives: The scientific objectives of HYDROCOASTAL are to enhance our understanding of the interactions between inland waters and the coastal zone, between the coastal zone and the open ocean, and of the small-scale processes that govern these interactions. The project also aims to improve our capability to characterize the variation, at different time scales, of inland water storage, exchanges with the ocean, and the impact on regional sea-level changes. The technical objectives are to develop and evaluate new SAR and SARin altimetry processing techniques in support of the scientific objectives, including stack processing, filtering, and retracking. An improved Wet Troposphere Correction will also be developed and evaluated.
Project Outline: There are four tasks in the project. (1) Scientific Review and Requirements Consolidation: review the current state of the art in SAR and SARin altimeter data processing as applied to the coastal zone and to inland waters. (2) Implementation and Validation: new processing algorithms will be implemented to generate Test Data Sets, which will be validated against models, in-situ data, and other satellite data sets. Selected algorithms will then be used to generate global coastal zone and river discharge data sets. (3) Impacts Assessment: the impact of these global products will be assessed in a series of Case Studies. (4) Outreach and Roadmap: outreach material will be prepared and distributed to engage with the wider scientific community and to provide recommendations for the development of future missions and future research.
Presentation: The presentation will provide an overview of the project and present the different SAR altimeter processing algorithms that are being evaluated in the first phase of the project, together with early results from the evaluation of the initial test data set.

Investigating the impact of pre-processing techniques and pre-trained word embeddings in detecting Arabic health information on social media

Journal Of Big Data ◽

10.1186/s40537-021-00488-w ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Yahya Albalawi ◽

Jim Buckley ◽

Nikola S. Nikolov

Keyword(s):

Social Media ◽

Deep Learning ◽

Comprehensive Evaluation ◽

Classification Problem ◽

Data Sets ◽

Word Embeddings ◽

Data Set ◽

Lower Accuracy ◽

Health Related ◽

The Impact

Abstract: This paper presents a comprehensive evaluation of data pre-processing and word embedding techniques in the context of Arabic document classification in the domain of health-related communication on social media. We evaluate 26 text pre-processings applied to Arabic tweets within the process of training a classifier to identify health-related tweets. For this task we use the (traditional) machine learning classifiers KNN, SVM, Multinomial NB and Logistic Regression. Furthermore, we report experimental results with the deep learning architectures BLSTM and CNN for the same text classification problem. Since word embeddings are more typically used as the input layer in deep networks, in the deep learning experiments we evaluate several state-of-the-art pre-trained word embeddings with the same text pre-processing applied. To achieve these goals, we use two data sets: one for both training and testing, and another only for testing the generality of our models. Our results point to the conclusion that only four out of the 26 pre-processings improve the classification accuracy significantly. For the first data set of Arabic tweets, we found that Mazajak CBOW pre-trained word embeddings as the input to a BLSTM deep network led to the most accurate classifier, with an F1 score of 89.7%. For the second data set, Mazajak Skip-Gram pre-trained word embeddings as the input to BLSTM led to the most accurate model, with an F1 score of 75.2% and accuracy of 90.7%, compared to an F1 score of 90.8% achieved by Mazajak CBOW for the same architecture but with a lower accuracy of 70.89%. Our results also show that the performance of the best of the traditional classifiers we trained is comparable to the deep learning methods on the first data set, but significantly worse on the second data set.

The Midlatitude Continental Convective Clouds Experiment (MC3E) sounding network: operations, processing and analysis

Atmospheric Measurement Techniques ◽

10.5194/amt-8-421-2015 ◽

2015 ◽

Vol 8 (1) ◽

pp. 421-434 ◽

Cited By ~ 18

Author(s):

M. P. Jensen ◽

T. Toto ◽

D. Troyan ◽

P. E. Ciesielski ◽

D. Holdridge ◽

...

Keyword(s):

Large Scale ◽

Scale Model ◽

Data Sets ◽

Central Plains ◽

Data Set ◽

Convective Systems ◽

Convective Clouds ◽

Quality Checks ◽

Network Operations ◽

The Impact

Abstract. The Midlatitude Continental Convective Clouds Experiment (MC3E) took place during the spring of 2011, centered in north-central Oklahoma, USA. The main goal of this field campaign was to capture the dynamical and microphysical characteristics of precipitating convective systems in the US Central Plains. A major component of the campaign was a six-site radiosonde array designed to capture the large-scale variability of the atmospheric state with the intent of deriving model forcing data sets. Over the course of the 46-day MC3E campaign, a total of 1362 radiosondes were launched from the enhanced sonde network. This manuscript provides details on the instrumentation used as part of the sounding array, the data processing activities, including quality checks and humidity bias corrections, and an analysis of the impacts of bias correction and algorithm assumptions on the determination of convective levels and indices. It is found that corrections for known radiosonde humidity biases and assumptions regarding the characteristics of the surface convective parcel result in significant differences in the derived values of convective levels and indices in many soundings. In addition, the impact of including the humidity corrections and quality controls on the thermodynamic profiles that are used in the derivation of a large-scale model forcing data set is investigated. The results show a significant impact on the derived large-scale vertical velocity field, illustrating the importance of addressing these humidity biases.

DEVELOPMENT OF A GLOBAL RIVER DISCHARGE DATA SET AND ANALYSES ON THE TEMPORAL VARIATIONS OF ANNUAL RUNOFF

PROCEEDINGS OF HYDRAULIC ENGINEERING ◽

10.2208/prohe.43.151 ◽

1999 ◽

Vol 43 ◽

pp. 151-156

Author(s):

Taikan OKI ◽

Katumi MUSIAKE

Keyword(s):

River Discharge ◽

Temporal Variations ◽

Annual Runoff ◽

Data Set ◽

Discharge Data

Data Analysis With Shapley Values For Automatic Subject Selection in Alzheimer's Disease Data Sets Using Interpretable Machine Learning

10.21203/rs.3.rs-245707/v1 ◽

2021 ◽

Author(s):

Louise Bloch ◽

Christoph M. Friedrich

Keyword(s):

Alzheimer's Disease ◽

Test Data ◽

Noisy Data ◽

Training Data ◽

Data Sets ◽

Data Set ◽

Model Interpretation ◽

Percentage Points ◽

Shapley Values

Abstract Background: Predicting whether subjects with Mild Cognitive Impairment (MCI) will prospectively develop Alzheimer's Disease (AD) is important for the recruitment and monitoring of subjects for therapy studies. Machine Learning (ML) is suitable for improving early AD prediction. The etiology of AD is heterogeneous, which leads to noisy data sets. Additional noise is introduced by multicentric study designs and varying acquisition protocols. This article examines whether an automatic and fair data valuation method based on Shapley values can identify subjects with noisy data. Methods: An ML workflow was developed and trained for a subset of the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort. The validation was executed for an independent ADNI test data set and for the Australian Imaging, Biomarker and Lifestyle Flagship Study of Ageing (AIBL) cohort. The workflow included volumetric Magnetic Resonance Imaging (MRI) feature extraction, subject sample selection using data Shapley, Random Forest (RF) and eXtreme Gradient Boosting (XGBoost) for model training, and Kernel SHapley Additive exPlanations (SHAP) values for model interpretation. This model interpretation enables clinically relevant explanation of individual predictions. Results: On the independent ADNI test data set, the XGBoost models trained on the entire training data set reached a mean classification accuracy of 58.54 %. The XGBoost models that excluded 116 of the 467 subjects from the training data set, based on their Logistic Regression (LR) data Shapley values, outperformed them by 14.13 % (8.27 percentage points). For the AIBL data set, the XGBoost models trained on the entire training data set reached a mean accuracy of 60.35 %. An improvement of 24.86 % (15.00 percentage points) could be reached for the XGBoost models if the 72 subjects with the smallest RF data Shapley values were excluded from the training data set.
Conclusion: The data Shapley method was able to improve the classification accuracies for the test data sets. Noisy data was associated with the number of ApoE ε4 alleles and volumetric MRI measurements. Kernel SHAP showed that the black-box models learned biologically plausible associations.
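
The Results above quote each gain twice, once as a relative improvement and once in percentage points. The following minimal sketch shows how the two figures relate, using only the numbers given in the abstract. The helper name `relative_improvement` is hypothetical and is not taken from the paper.

```python
def relative_improvement(baseline_pct: float, gain_points: float) -> float:
    """Relative improvement in percent for an absolute gain in percentage points."""
    return 100.0 * gain_points / baseline_pct


# Baseline accuracies and absolute gains as quoted in the abstract.
adni_baseline, adni_gain = 58.54, 8.27
aibl_baseline, aibl_gain = 60.35, 15.00

adni_relative = relative_improvement(adni_baseline, adni_gain)
aibl_relative = relative_improvement(aibl_baseline, aibl_gain)

# Accuracy after subject selection is baseline plus the gain in percentage points.
adni_improved = adni_baseline + adni_gain
aibl_improved = aibl_baseline + aibl_gain

print(f"ADNI: +{adni_relative:.2f} % relative, accuracy {adni_improved:.2f} %")
print(f"AIBL: +{aibl_relative:.2f} % relative, accuracy {aibl_improved:.2f} %")
```

The relative figure divides the gain by the baseline accuracy, which is why it is larger than the percentage-point figure whenever the baseline is below 100 %.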

Six years of total ozone column measurements from SCIAMACHY nadir observations

Atmospheric Measurement Techniques ◽

10.5194/amt-2-87-2009 ◽

2009 ◽

Vol 2 (1) ◽

pp. 87-98 ◽

Cited By ~ 39

Author(s):

C. Lerot ◽

M. Van Roozendael ◽

J. van Geffen ◽

J. van Gent ◽

C. Fayt ◽

...

Keyword(s):

Cross Sections ◽

Total Ozone ◽

Large Scale ◽

European Space Agency ◽

Data Sets ◽

Data Set ◽

Ozone Data ◽

Space Agency ◽

German Aerospace ◽

The Impact

Abstract. Total O3 columns have been retrieved from six years of SCIAMACHY nadir UV radiance measurements using SDOAS, an adaptation of the GDOAS algorithm previously developed at BIRA-IASB for the GOME instrument. GDOAS and SDOAS have been implemented by the German Aerospace Center (DLR) in version 4 of the GOME Data Processor (GDP) and in version 3 of the SCIAMACHY Ground Processor (SGP), respectively. The processors are being run at the DLR processing centre on behalf of the European Space Agency (ESA). We first focus on the description of the SDOAS algorithm, with particular attention to the impact of uncertainties on the reference O3 absorption cross-sections. Second, the resulting SCIAMACHY total ozone data set is globally evaluated through large-scale comparisons with results from GOME and OMI as well as with ground-based correlative measurements. The various total ozone data sets are found to agree within 2% on average. However, a negative trend of 0.2–0.4%/year has been identified in the SCIAMACHY O3 columns; this probably originates from instrumental degradation effects that have not yet been fully characterized.

Investigation of the international comparability of population-based routine hospital data set derived comorbidity scores for patients with lung cancer

Thorax ◽

10.1136/thoraxjnl-2017-210362 ◽

2017 ◽

Vol 73 (4) ◽

pp. 339-349 ◽

Cited By ~ 4

Author(s):

Margreet Lüchtenborg ◽

Eva J A Morris ◽

Daniela Tataru ◽

Victoria H Coupland ◽

Andrew Smith ◽

...

Keyword(s):

Lung Cancer ◽

Cancer Survival ◽

Population Based ◽

Data Sets ◽

Hospital Data ◽

Data Set ◽

Discharge Data ◽

International Differences ◽

Cancer Data ◽

Comorbidity Scores

Introduction: The International Cancer Benchmarking Partnership (ICBP) identified significant international differences in lung cancer survival. Differing levels of comorbid disease across ICBP countries have been suggested as a potential explanation of this variation but, to date, no studies have quantified its impact. This study investigated whether comparable, robust comorbidity scores can be derived from the different routine population-based cancer data sets available in the ICBP jurisdictions and, if so, use them to quantify international variation in comorbidity and determine its influence on outcome. Methods: Linked population-based lung cancer registry and hospital discharge data sets were acquired from nine ICBP jurisdictions in Australia, Canada, Norway and the UK, providing a study population of 233 981 individuals. For each person in this cohort, Charlson, Elixhauser and inpatient bed day Comorbidity Scores were derived relating to the 4–36 months prior to their lung cancer diagnosis. The scores were then compared to assess their validity and feasibility of use in international survival comparisons. Results: It was feasible to generate the three comorbidity scores for each jurisdiction, which were found to have good content, face and concurrent validity. Predictive validity was limited and there was evidence that the reliability was questionable. Conclusion: The results presented here indicate that interjurisdictional comparability of recorded comorbidity was limited due to probable differences in coding and hospital admission practices in each area. Before the contribution of comorbidity to international differences in cancer survival can be investigated, an internationally harmonised comorbidity index is required.

Evaluating Infill Well Performance and Fracture Driven Interactions Using Intervention Based Distributed Fiber Optics

10.2118/204184-ms ◽

2021 ◽

Author(s):

Ahmed Attia ◽

Matthew Lawrence

Keyword(s):

Fiber Optics ◽

Fracture Network ◽

Data Sets ◽

Well Performance ◽

Design Strategies ◽

Data Set ◽

Geological Features ◽

Production Output ◽

Long Carbon Fiber ◽

The Impact

Abstract: Distributed Fiber Optics (DFO) technology has become the new face of unconventional well diagnostics. This technology focuses on measuring Distributed Acoustic Sensing (DAS) and Distributed Temperature Sensing (DTS) to give an in-depth understanding of well productivity before and after stimulation. Many different completion design strategies, both on surface and downhole, are used to obtain the best fracture network outcome; however, with complex geological features, different fracture designs, and fracture driven interactions (FDIs) affecting nearby wells, it is difficult to gain a full understanding of completion design performance for each well. Validating completion designs and improving on the learnings found in each data set should be the foundation in developing each field. Capturing a data set with strong evidence of what works and what doesn't can help the operator make better engineering decisions, produce more efficient wells, and gauge the spacing between wells. The focus of this paper is on a few case studies in the Bakken which vividly show how infill wells greatly interfered with production output. A DFO system was deployed with a 0.6" OD, 23,000-foot-long carbon fiber rod to acquire DAS and DTS for post frac flow, completion, and interference evaluation. This paper examines the DFO measurements taken post frac to further explain what effects are seen on completion designs caused by interference with infill wells; the learnings taken from the DFO post frac were applied to further the understanding and awareness of how infill wells will perform on future pad sites. A showcase of three separate data sets from the Bakken identifies how effective DFO technology can be in evaluating and making informed decisions on future frac completions. In this paper we also show and discuss how DFO can measure real-time FDI events and what measures can be taken to lessen the impact of negative interference caused by infill wells.

The shifting of climate types: manifestation to phenology and ecosystems structure

10.5194/egusphere-egu21-5689 ◽

2021 ◽

Author(s):

Gunta Kalvāne ◽

Andis Kalvāns ◽

Agrita Briede ◽

Ilmārs Krampis ◽

Dārta Kaupe ◽

...

Keyword(s):

Climate Change ◽

Breaking Point ◽

Data Sets ◽

Precipitation Regime ◽

Climate Type ◽

Data Set ◽

Temperature And Precipitation ◽

Spatial Changes ◽

Temporal And Spatial ◽

The Impact

According to the Köppen climate classification, almost the entire area of Latvia belongs to the same climate type, Dfb, which is characterized by humid continental climates with warm (sometimes hot) summers and cold winters. In the last decades, weather conditions on the western coast of Latvia have been more characteristic of temperate maritime climates. In this area there has been a transition, still ongoing, to the climate type Cfb.
Temporal and spatial changes of the temperature and precipitation regime have been examined across the whole territory to identify the breaking point of climate type shifts. We used two types of climatological data sets: gridded daily temperature from the E-OBS data set version 21.0e (Cornes et al., 2018) and direct observations from meteorological stations (data source: Latvian Environment, Geology and Meteorology Centre). The temperature and precipitation regime has changed significantly in the last century; seasonal and regional differences can be observed in the territory of Latvia.
We have digitized and analysed more than 47 thousand phenological records made by volunteers in the period 1970–2018. The study has shown that significant seasonal changes have taken place across the Latvian landscape due to climate change (Kalvāne and Kalvāns, 2021). The largest changes have been recorded for the unfolding (BBCH11) and flowering (BBCH61) phases of plants: almost 90% of the data included in the database demonstrate a negative trend. The winter of 1988/1989 may be considered a breaking point; many phases have commonly begun sooner (particularly spring phases), while abiotic autumn phases have been characterized by late years.
The study gives an overview of climate change (including climate type shift) impacts on ecosystems in Latvia, particularly on forests and semi-natural grasslands, and of temporal and spatial changes in vegetation structure and distribution areas.
This study was carried out within the framework of the Impact of Climate Change on Phytophenological Phases and Related Risks in the Baltic Region (No. 1.1.1.2/VIAA/2/18/265) ERDF project and the Climate change and sustainable use of natural resources institutional research grant of the University of Latvia (No. AAP2016/B041//ZD2016/AZ03).
Cornes, R. C., van der Schrier, G., van den Besselaar, E. J. M. and Jones, P. D.: An Ensemble Version of the E-OBS Temperature and Precipitation Data Sets, J. Geophys. Res. Atmos., 123(17), 9391–9409, doi:10.1029/2017JD028200, 2018.
Kalvāne, G. and Kalvāns, A.: Phenological trends of multi-taxonomic groups in Latvia, 1970–2018, Int. J. Biometeorol., doi:10.1007/s00484-020-02068-8, 2021.

Feature-Based Uncertainty Visualization

Big Data ◽

10.4018/978-1-4666-9840-6.ch014 ◽

2016 ◽

pp. 261-287

Author(s):

Keqin Wu ◽

Song Zhang

Keyword(s):

Scientific Data ◽

Data Sets ◽

Data Set ◽

Uncertainty Visualization ◽

Scalar Data ◽

Critical Issues ◽

Contour Level ◽

Feature Based ◽

2D Data ◽

The Impact

While uncertainty in scientific data attracts an increasing research interest in the visualization community, two critical issues remain insufficiently studied: (1) visualizing the impact of the uncertainty of a data set on its features and (2) interactively exploring 3D or large 2D data sets with uncertainties. In this chapter, a suite of feature-based techniques is developed to address these issues. First, an interactive visualization tool for exploring scalar data with data-level, contour-level, and topology-level uncertainties is developed. Second, a framework of visualizing feature-level uncertainty is proposed to study the uncertain feature deviations in both scalar and vector data sets. With quantified representation and interactive capability, the proposed feature-based visualizations provide new insights into the uncertainties of both data and their features which otherwise would remain unknown with the visualization of only data uncertainties.

Influence of the weights in IHS and Brovey methods for pan-sharpening WorldView-3 satellite images

International Journal of Engineering & Technology ◽

10.14419/ijet.v6i3.7702 ◽

2017 ◽

Vol 6 (3) ◽

pp. 71 ◽

Cited By ~ 5

Author(s):

Claudio Parente ◽

Massimiliano Pepe

Keyword(s):

Satellite Images ◽

Urban Landscape ◽

Spectral Response ◽

Rural Landscape ◽

Spectral Radiance ◽

Data Sets ◽

Data Set ◽

Inertial Moment ◽

The Impact

The purpose of this paper is to investigate the impact of weights in pan-sharpening methods applied to satellite images. Different data sets of weights have been considered and compared in the IHS and Brovey methods. The first data set contains the same weight for each band, while the second takes into account the weights obtained from the spectral radiance response; these two data sets are the most common in pan-sharpening applications. The third data set results from a new method, which consists of computing the first-order inertial moment of each band, taking into account the spectral response. To test the impact of the weights of the different data sets, WorldView-3 satellite images have been considered. In particular, two different scenes (the first in an urban landscape, the second in a rural landscape) have been investigated. The quality of the pan-sharpened images has been analysed by three different quality indexes: Root mean square error (RMSE), Relative average spectral error (RASE) and Erreur Relative Global Adimensionnelle de Synthèse (ERGAS).
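
To make the role of the band weights and the three quality indexes concrete, here is a minimal sketch in plain Python. It assumes the commonly used formulation of the weighted Brovey transform (each band scaled by the panchromatic value divided by a weighted sum of the bands) and the usual textbook definitions of RMSE, RASE and ERGAS. The function names, the toy pixel values and the resolution ratio of 0.25 are illustrative assumptions, not values or code from the paper.

```python
import math


def brovey(ms_bands, pan, weights):
    """Weighted Brovey transform on flattened bands (lists of floats)."""
    fused = [[] for _ in ms_bands]
    for i, pan_value in enumerate(pan):
        # Synthetic intensity: weighted sum of the multispectral bands.
        intensity = sum(w * band[i] for w, band in zip(weights, ms_bands))
        for k, band in enumerate(ms_bands):
            fused[k].append(band[i] * pan_value / intensity if intensity else 0.0)
    return fused


def rmse(ref, test):
    """Root mean square error between two bands."""
    return math.sqrt(sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref))


def rase(ref_bands, test_bands):
    """Relative average spectral error, in percent."""
    n = len(ref_bands)
    mean_all = sum(sum(b) / len(b) for b in ref_bands) / n
    mean_sq = sum(rmse(r, t) ** 2 for r, t in zip(ref_bands, test_bands)) / n
    return 100.0 / mean_all * math.sqrt(mean_sq)


def ergas(ref_bands, test_bands, ratio):
    """ERGAS; ratio is pan pixel size divided by multispectral pixel size."""
    n = len(ref_bands)
    total = sum(
        (rmse(r, t) / (sum(r) / len(r))) ** 2
        for r, t in zip(ref_bands, test_bands)
    )
    return 100.0 * ratio * math.sqrt(total / n)


# Toy data: two bands of two pixels each, equal weights (the first weight set).
ms = [[10.0, 20.0], [30.0, 40.0]]
pan = [40.0, 30.0]
equal_weights = [0.5, 0.5]

fused = brovey(ms, pan, equal_weights)
print("fused bands:", fused)
print("RMSE per band:", [rmse(r, t) for r, t in zip(ms, fused)])
print("RASE:", rase(ms, fused))
print("ERGAS:", ergas(ms, fused, ratio=0.25))
```

Swapping `equal_weights` for another weight vector is the only change needed to compare weight sets under this formulation; lower RMSE, RASE and ERGAS values indicate a fused image closer to the reference.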
