The Complete Data Fusion as a ready-to-use tool for the exploitation of atmospheric Sentinel ozone profiles

Author(s):  
Nicola Zoppetti ◽  
Simone Ceccherini ◽  
Flavio Barbara ◽  
Samuele Del Bianco ◽  
Marco Gai ◽  
...  

Remote sounding of atmospheric composition makes use of satellite measurements with very heterogeneous characteristics. In particular, the determination of vertical profiles of gases in the atmosphere can be performed using measurements acquired in different spectral bands and with different observation geometries. The most rigorous way to combine heterogeneous measurements of the same quantity into a single Level 2 (L2) product is simultaneous retrieval. The main drawback of simultaneous retrieval is its complexity, due to the necessity of embedding the forward models of different instruments in the same retrieval application. To overcome this shortcoming, we developed a data fusion method, referred to as Complete Data Fusion (CDF), that provides an efficient and adaptable alternative to simultaneous retrieval. In general, the CDF input is any number of profiles retrieved with the optimal estimation technique, each characterized by its a priori information, covariance matrix (CM), and averaging kernel (AK) matrix. The output of the CDF is a single product, also characterized by an a priori, a CM and an AK matrix, which collects all the available information content. To account for the geo-temporal differences and the different vertical grids of the profiles to be fused, a coincidence error and an interpolation error have to be included in the error budget.

In the first part of the work, the CDF method is applied to ozone profiles simulated in the thermal infrared and ultraviolet bands, according to the specifications of the Sentinel-4 (geostationary) and Sentinel-5 (low Earth orbit) missions of the Copernicus program. The simulated data were produced in the context of the Advanced Ultraviolet Radiation and Ozone Retrieval for Applications (AURORA) project, funded by the European Commission in the framework of the Horizon 2020 program. The use of synthetic data and the assumption of negligible systematic error in the simulated measurements allow studying the behavior of the CDF under ideal conditions. Synthetic data also allow evaluating the performance of the algorithm in terms of differences between the products of interest and the reference truth, represented by the atmospheric scenario used to simulate the L2 products. This analysis aims at demonstrating the potential benefits of the CDF for the synergy of products measured from different platforms in a realistic near-future scenario, when the Sentinel-4 and Sentinel-5/5P ozone profiles will be available.

In the second part of this work, the CDF is applied to a set of real ozone measurements acquired by GOME-2 onboard the MetOp-B platform. The quality of the CDF products, obtained for the first time from operational products, is compared with that of the original GOME-2 products. This aims to demonstrate the concrete applicability of the CDF to real data and its possible use to generate Level-3 (or higher) gridded products.

The results discussed in this presentation offer a first consolidated picture of the actual and potential value of an innovative technique for post-retrieval processing and the generation of Level-3 (or higher) products from the atmospheric Sentinel data.
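As a rough illustration of the update the CDF performs, the following numpy sketch fuses optimal-estimation retrievals that are assumed to be already interpolated to a common grid; the function name is invented, and the coincidence and interpolation error terms are omitted for brevity, so this is a sketch of the basic formulas rather than the operational implementation.

```python
import numpy as np

def complete_data_fusion(profiles, akms, covs, aprioris, x_a, S_a):
    """Minimal sketch of the Complete Data Fusion (CDF) update.

    profiles : list of retrieved profiles x_i, already on the fusion grid
    akms     : list of averaging kernel matrices A_i
    covs     : list of noise covariance matrices S_i
    aprioris : list of the a priori profiles x_a_i used by each retrieval
    x_a, S_a : a priori profile and covariance chosen for the fused product
    """
    n = x_a.size
    S_a_inv = np.linalg.inv(S_a)
    G = S_a_inv.copy()        # accumulates sum_i A_i^T S_i^-1 A_i + S_a^-1
    v = S_a_inv @ x_a         # accumulates sum_i A_i^T S_i^-1 alpha_i + S_a^-1 x_a
    for x_i, A_i, S_i, x_a_i in zip(profiles, akms, covs, aprioris):
        S_i_inv = np.linalg.inv(S_i)
        # remove the contribution of the individual a priori
        alpha_i = x_i - (np.eye(n) - A_i) @ x_a_i
        G += A_i.T @ S_i_inv @ A_i
        v += A_i.T @ S_i_inv @ alpha_i
    S_f = np.linalg.inv(G)     # CM of the fused product
    x_f = S_f @ v              # fused profile
    A_f = S_f @ (G - S_a_inv)  # AK matrix of the fused product
    return x_f, S_f, A_f
```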

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Camilo Broc ◽  
Therese Truong ◽  
Benoit Liquet

Abstract
Background: The increasing number of genome-wide association studies (GWAS) has revealed several loci that are associated with multiple distinct phenotypes, suggesting the existence of pleiotropic effects. Highlighting these cross-phenotype genetic associations could help to identify and understand the common biological mechanisms underlying some diseases. Common approaches test the association between genetic variants and multiple traits at the SNP level. In this paper, we propose a novel gene- and pathway-level approach for the case where several independent GWAS on independent traits are available. The method is based on a generalization of the sparse group Partial Least Squares (sgPLS) that takes groups of variables into account, with a Lasso penalization that links all the independent data sets. This method, called joint-sgPLS, is able to convincingly detect signal at both the variable level and the group level.
Results: Our method has the advantage of proposing a globally readable model while coping with the architecture of the data. It can outperform traditional methods and provides wider insight in terms of a priori information. We compared the performance of the proposed method to other benchmark methods on simulated data and gave an example of application on real data, with the aim of highlighting common susceptibility variants to breast and thyroid cancers.
Conclusion: The joint-sgPLS shows interesting properties for detecting a signal. As an extension of PLS, the method is suited to data with a large number of variables. The choice of Lasso penalization copes with architectures of groups of variables and sets of observations. Furthermore, although the method has been applied to a genetic study, its formulation is adapted to any data with a high number of variables and an exposed a priori architecture in other application fields.
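The penalization structure can be illustrated with a toy numpy sketch: for each study, the one-component PLS direction X_k^T y_k is shrunk by a group-level threshold shared across studies, followed by variable-level soft-thresholding within surviving groups. This is an illustrative reading of the sparse-group idea, not the authors' joint-sgPLS algorithm, and all names are assumptions.

```python
import numpy as np

def soft_threshold(v, lam):
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def joint_sparse_group_weights(X_list, y_list, groups, lam_group, lam_var):
    """Toy sketch of the joint sparse-group penalty behind joint-sgPLS.

    For each study k, z_k = X_k^T y_k is the covariance direction a
    one-component PLS would use. The group penalty acts on each group's
    norm across *all* studies, so groups are kept or dropped jointly;
    the Lasso part then sparsifies variables inside surviving groups.
    """
    Z = np.column_stack([X.T @ y for X, y in zip(X_list, y_list)])  # p x K
    W = np.zeros_like(Z)
    for g in np.unique(groups):
        idx = groups == g
        block = Z[idx, :]
        norm = np.linalg.norm(block)
        if norm > lam_group:                             # group survives jointly
            shrunk = (1.0 - lam_group / norm) * block
            W[idx, :] = soft_threshold(shrunk, lam_var)  # within-group sparsity
    # normalize each study's weight vector to unit norm (PLS convention)
    norms = np.linalg.norm(W, axis=0)
    W[:, norms > 0] /= norms[norms > 0]
    return W  # column k: sparse weight vector for study k
```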


Geophysics ◽  
1993 ◽  
Vol 58 (1) ◽  
pp. 91-100 ◽  
Author(s):  
Claude F. Lafond ◽  
Alan R. Levander

Prestack depth migration still suffers from the problems associated with building appropriate velocity models. The two main after‐migration, before‐stack velocity analysis techniques currently used, depth focusing and residual moveout correction, have found good use in many applications but have also shown their limitations in the case of very complex structures. To address this issue, we have extended the residual moveout analysis technique to the general case of heterogeneous velocity fields and steep dips, while keeping the algorithm robust enough to be of practical use on real data. Our method is not based on analytic expressions for the moveouts and requires no a priori knowledge of the model, but instead uses geometrical ray tracing in heterogeneous media, layer‐stripping migration, and local wavefront analysis to compute residual velocity corrections. These corrections are back projected into the velocity model along raypaths in a way that is similar to tomographic reconstruction. While this approach is more general than existing migration velocity analysis implementations, it is also much more computer intensive and is best used locally around a particularly complex structure. We demonstrate the technique using synthetic data from a model with strong velocity gradients and then apply it to a marine data set to improve the positioning of a major fault.
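The back-projection step can be illustrated with a toy sketch in which each residual, expressed as a traveltime misfit, is spread along the cells crossed by its raypath, as in a single damped iteration of ART-style tomography. The data structures and the `step` damping factor are assumptions of this sketch; the paper's actual corrections come from layer-stripping migration and local wavefront analysis.

```python
import numpy as np

def backproject_residuals(slowness, rays, step=0.1):
    """Spread each ray's residual traveltime uniformly along its raypath,
    as in one damped iteration of ART-style tomographic back projection.

    slowness : 2-D slowness grid (s/m), updated in place
    rays     : list of (cells, lengths, dt) tuples, where
               cells   - (i, j) indices of grid cells crossed by the ray
               lengths - segment length of the ray in each cell (m)
               dt      - residual traveltime attributed to the ray (s)
    """
    for cells, lengths, dt in rays:
        L = float(np.sum(lengths))
        ds = step * dt / L            # uniform slowness update along the ray
        for (i, j) in cells:
            slowness[i, j] += ds
    return slowness
```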


2014 ◽  
Vol 2014 ◽  
pp. 1-11 ◽  
Author(s):  
Francesca Pizzorni Ferrarese ◽  
Flavio Simonetti ◽  
Roberto Israel Foroni ◽  
Gloria Menegaz

Validation and accuracy assessment are the main bottlenecks preventing the adoption of image processing algorithms in clinical practice. In the classical approach, a posteriori analysis is performed through objective metrics. In this work, a different approach based on Petri nets is proposed. The basic idea consists in predicting the accuracy of a given pipeline based on the identification and characterization of the sources of inaccuracy. The concept is demonstrated on a case study: intrasubject rigid and affine registration of magnetic resonance images. Both synthetic and real data are considered. While synthetic data allow benchmarking the performance with respect to the ground truth, real data make it possible to assess the robustness of the methodology in real contexts as well as to determine the suitability of using synthetic data in the training phase. Results revealed a higher correlation and a lower dispersion among the metrics for simulated data, while the opposite trend was observed for pathological ones. The results show that the proposed model not only provides good prediction performance but also leads to the optimization of the end-to-end chain in terms of accuracy and robustness, setting the ground for its generalization to different and more complex scenarios.
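A minimal sketch of the underlying idea, with tokens carrying error magnitudes through a Petri net, is given below. The places, transitions, token values, and the root-sum-square combination rule are all assumptions for illustration, not the paper's actual model.

```python
import math

# Token values represent error contributions (e.g. in mm) injected by each
# source of inaccuracy in the registration pipeline.
places = {
    "acquisition_noise": [0.4],
    "interpolation":     [0.3],
    "optimizer":         [0.5],
    "registered_image":  [],
}

# A transition consumes one token from each input place and produces a
# combined token on its output place; errors are assumed independent and
# combined root-sum-square in this sketch.
transitions = [
    {"inputs": ["acquisition_noise", "interpolation", "optimizer"],
     "output": "registered_image"},
]

def fire(t):
    if all(places[p] for p in t["inputs"]):            # transition enabled?
        errs = [places[p].pop(0) for p in t["inputs"]]
        places[t["output"]].append(math.hypot(*errs))  # combined error token

for t in transitions:
    fire(t)

print(places["registered_image"])  # predicted end-to-end error
```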


2019 ◽  
Author(s):  
Ahmad Ilham

Determining the number of clusters for k-Means is one of the most popular problems among data mining researchers, because this information is difficult to determine from the data a priori, so the resulting clusters may be suboptimal and the algorithm can quickly become trapped in local minima. Automatic clustering methods based on the evolutionary computation (EC) approach can solve this k-Means problem. The automatic clustering differential evolution (ACDE) method is one of the most popular methods of the EC approach because it can handle high-dimensional data and improve k-Means clustering performance with low cluster validity values. However, the determination of the k activation threshold in ACDE still depends on user judgment, so the process of determining the number of k-Means clusters is not yet efficient. In this study, ACDE is improved using the u-control chart (UCC) method, which is shown to solve the k-Means problem automatically and efficiently. The proposed method is evaluated on state-of-the-art datasets, both synthetic data and real data (iris, glass, wine, vowel, ruspini) from the UCI machine learning repository, using the Davies-Bouldin index (DBI) and the cosine similarity measure (CS) for evaluation. The results of this study indicate that the UCC method successfully improves the k-Means method, with the lowest objective values of DBI and CS being 0.470 and 0.577, respectively; the lowest objective value of DBI and CS indicates the best method. The proposed method performs well compared with other current methods, such as genetic clustering for unknown k (GCUK), dynamic clustering PSO (DCPSO) and the automatic clustering approach based on a differential evolution algorithm combined with k-Means for crisp clustering (ACDE), for almost all DBI and CS evaluations. It can be concluded that the UCC method corrects the weakness of the ACDE method in determining the number of k-Means clusters by determining the k activation threshold automatically.
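For reference, scoring candidate numbers of clusters with the Davies-Bouldin index (lower is better) can be sketched with scikit-learn as below; this reproduces only the evaluation criterion used in the study, not the ACDE or UCC machinery.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import davies_bouldin_score

X = load_iris().data
scores = {}
for k in range(2, 11):
    # run k-means for each candidate k and score the resulting partition
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = davies_bouldin_score(X, labels)

best_k = min(scores, key=scores.get)   # smallest DBI wins
print(best_k, scores[best_k])
```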


2020 ◽  
Vol 12 (5) ◽  
pp. 771 ◽  
Author(s):  
Miguel Angel Ortíz-Barrios ◽  
Ian Cleland ◽  
Chris Nugent ◽  
Pablo Pancardo ◽  
Eric Järpe ◽  
...  

Automatic detection and recognition of Activities of Daily Living (ADL) are crucial for providing effective care to frail older adults living alone. A step forward in addressing this challenge is the deployment of smart home sensors capturing the intrinsic nature of ADLs performed by these people. As the real-life scenario is characterized by a comprehensive range of ADLs and smart home layouts, deviations are expected in the number of sensor events per activity (SEPA), a variable often used for training activity recognition models. Such models, however, rely on the availability of suitable and representative data, whose collection is habitually expensive and resource-intensive. Simulation tools are an alternative for tackling these barriers; nonetheless, an ongoing challenge is their ability to generate synthetic data representing the real SEPA. Hence, this paper proposes the use of Poisson regression modelling for transforming simulated data into a better approximation of real SEPA. First, synthetic and real data were compared to verify the equivalence hypothesis. Then, several Poisson regression models were formulated for estimating real SEPA using simulated data. The outcomes revealed that real SEPA can be better approximated ($R^2_{\mathrm{pred}} = 92.72\%$) if synthetic data are post-processed through Poisson regression incorporating dummy variables.
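A minimal statsmodels sketch of such a calibration step, assuming a toy table of real and simulated event counts per activity; the column names and the exact model specification are assumptions, not the paper's fitted models.

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# toy data: real and simulated sensor-event counts per activity instance
df = pd.DataFrame({
    "real_sepa":      [12, 30, 7, 25, 14, 9],
    "simulated_sepa": [10, 27, 9, 22, 15, 8],
    "activity":       ["cooking", "sleeping", "toileting",
                       "sleeping", "cooking", "toileting"],
})

# C(activity) expands the categorical activity label into dummy variables
model = smf.glm("real_sepa ~ simulated_sepa + C(activity)",
                data=df, family=sm.families.Poisson()).fit()
print(model.summary())

# calibrated SEPA predictions for the simulated counts
df["calibrated"] = model.predict(df)
```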


Author(s):  
Zhanpeng Wang ◽  
Jiaping Wang ◽  
Michael Kourakos ◽  
Nhung Hoang ◽  
Hyong Hark Lee ◽  
...  

Abstract
Population genetics relies heavily on simulated data for validation, inference, and intuition. In particular, since real data is always limited, simulated data is crucial for training machine learning methods. Simulation software can accurately model evolutionary processes, but requires many hand-selected input parameters. As a result, simulated data often fails to mirror the properties of real genetic data, which limits the scope of methods that rely on it. In this work, we develop a novel approach to estimating parameters in population genetic models that automatically adapts to data from any population. Our method is based on a generative adversarial network that gradually learns to generate realistic synthetic data. We demonstrate that our method is able to recover input parameters in a simulated isolation-with-migration model. We then apply our method to human data from the 1000 Genomes Project, and show that we can accurately recapitulate the features of real data.
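The adversarial loop can be sketched with a toy one-parameter "simulator" standing in for a population-genetic one: a discriminator is repeatedly retrained to separate real from simulated data, and the simulation parameter is perturbed, simulated-annealing style, until the discriminator is fooled. Everything here (the Gaussian stand-in, the acceptance rule) is an assumption for illustration, not the authors' implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def simulate(theta, n=500):
    # stand-in for a population-genetic simulator: theta plays the role of a
    # demographic parameter controlling the distribution of a summary statistic
    return rng.normal(theta, 1.0, size=(n, 1))

real = rng.normal(2.0, 1.0, size=(500, 1))   # "real data", true theta = 2

def discriminator_accuracy(theta):
    sim = simulate(theta)
    X = np.vstack([real, sim])
    y = np.r_[np.ones(len(real)), np.zeros(len(sim))]
    clf = LogisticRegression().fit(X, y)
    return clf.score(X, y)   # 0.5 means the discriminator is fooled

# simulated-annealing-style search over the simulation parameter
theta, best = 0.0, discriminator_accuracy(0.0)
for temp in np.linspace(1.0, 0.01, 50):
    cand = theta + rng.normal(0, temp)
    acc = discriminator_accuracy(cand)
    if acc < best or rng.random() < temp:   # accept improvements, sometimes worse
        theta, best = cand, acc

print(theta, best)   # theta should approach 2.0, accuracy should approach 0.5
```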


2020 ◽  
Vol 37 (4) ◽  
pp. 573-587 ◽  
Author(s):  
Cecilia Tirelli ◽  
Simone Ceccherini ◽  
Nicola Zoppetti ◽  
Samuele Del Bianco ◽  
Marco Gai ◽  
...  

Abstract
The complete data fusion method, generalized to the case of fusing profiles of atmospheric variables retrieved on different vertical grids and referred to different true values, is applied to ozone profiles retrieved from simulated measurements in the ultraviolet, visible, and thermal infrared spectral ranges for the Sentinel-4 and Sentinel-5 missions of the Copernicus program. In this study, the production and characterization of combined low Earth orbit (Sentinel-5) and geostationary Earth orbit (Sentinel-4) fused ozone data are performed. Fused and standard products have been compared, and a performance assessment of the generalized complete data fusion is presented. The analysis of the output products of the complete data fusion algorithm and of the standard processing using quality quantifiers demonstrates that the generalized complete data fusion algorithm provides products of better quality when compared with standard products.
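In the generalized formulation, an interpolation matrix maps the common fusion grid onto each retrieval grid, and coincidence and interpolation errors augment the noise covariances. A sketch of the update, written in the notation commonly used in the complete data fusion literature with $\boldsymbol{\alpha}_i = \mathbf{x}_i - (\mathbf{I} - \mathbf{A}_i)\mathbf{x}_{a,i}$, and not reproducing the paper's exact derivation, is:

```latex
% H_i: interpolation matrix from the fusion grid to the i-th retrieval grid;
% coincidence and interpolation errors are added to the noise covariance.
\begin{align}
  \tilde{\mathbf{S}}_i &= \mathbf{S}_i + \mathbf{S}_i^{\mathrm{coin}}
                          + \mathbf{S}_i^{\mathrm{int}} \\
  \mathbf{S}_f &= \Big[\textstyle\sum_i (\mathbf{A}_i \mathbf{H}_i)^T
      \tilde{\mathbf{S}}_i^{-1} (\mathbf{A}_i \mathbf{H}_i)
      + \mathbf{S}_a^{-1}\Big]^{-1} \\
  \mathbf{x}_f &= \mathbf{S}_f \Big[\textstyle\sum_i (\mathbf{A}_i \mathbf{H}_i)^T
      \tilde{\mathbf{S}}_i^{-1} \boldsymbol{\alpha}_i
      + \mathbf{S}_a^{-1} \mathbf{x}_a\Big]
\end{align}
```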


2018 ◽  
Vol 2018 ◽  
pp. 1-14
Author(s):  
Karim El mokhtari ◽  
Serge Reboul ◽  
Georges Stienne ◽  
Jean Bernard Choquel ◽  
Benaissa Amami ◽  
...  

In this article, we propose a multimodel filter for circular data. The so-called Circular Interacting Multimodel filter is derived in a Bayesian framework with the circular normal von Mises distribution. The aim of the proposed filter is to obtain the same performance in the circular domain as the classical IMM filter in the linear domain. In our approach, the mixing and fusion stages of the Circular Interacting Multimodel filter are defined, respectively, from the a priori and from the a posteriori circular distributions of the state angle given the measurements and according to a set of models. We propose in this article a set of circular models that are used to detect vehicle maneuvers from heading measurements. The performance of the Circular Interacting Multimodel filter is assessed on synthetic data, and a vehicle maneuver detection application is demonstrated on real data.
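The fusion of circular estimates exploits a convenient property of the von Mises family: the product of two von Mises densities is again von Mises, playing the role the Gaussian product plays in the linear IMM. A minimal numpy sketch of this single fusion step (not the full filter) is:

```python
import numpy as np

def fuse_von_mises(mu1, kappa1, mu2, kappa2):
    """Fuse VM(mu1, kappa1) and VM(mu2, kappa2): their product is von Mises,
    with parameters obtained by adding the two unit vectors scaled by their
    concentrations."""
    C = kappa1 * np.cos(mu1) + kappa2 * np.cos(mu2)
    S = kappa1 * np.sin(mu1) + kappa2 * np.sin(mu2)
    mu_f = np.arctan2(S, C)     # fused mean direction
    kappa_f = np.hypot(C, S)    # fused concentration
    return mu_f, kappa_f

# e.g. fusing a predicted heading with a measured heading (radians)
print(fuse_von_mises(np.deg2rad(10), 5.0, np.deg2rad(30), 10.0))
```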


2021 ◽  
Vol 13 (2) ◽  
pp. 210
Author(s):  
Marco Gai ◽  
Flavio Barbara ◽  
Simone Ceccherini ◽  
Ugo Cortesi ◽  
Samuele Del Bianco ◽  
...  

Remote sensing of the atmospheric composition from current and future satellites, such as the Sentinel missions of the Copernicus programme, yields an unprecedented amount of data to monitor air quality, ozone, UV radiation and other climate variables. Hence, full exploitation of the growing wealth of information delivered by spaceborne observing systems requires addressing the technological challenges of developing new strategies and tools capable of dealing with these huge data volumes. The H2020 AURORA (Advanced Ultraviolet Radiation and Ozone Retrieval for Applications) project investigated a novel approach for the synergistic use of ozone profile measurements acquired at different frequencies (ultraviolet, visible, thermal infrared) by sensors onboard Geostationary Equatorial Orbit (GEO) and Low Earth Orbit (LEO) satellites in the framework of the Copernicus Sentinel-4 and Sentinel-5 missions. This paper outlines the main features of the technological infrastructure, designed and developed to support the AURORA data processing chain as a distributed data processing system, and describes in detail the key components of the infrastructure and the software prototype. The latter demonstrates the technical feasibility of the automatic execution of the full processing chain with simulated data. The Data Processing Chain (DPC) presented in this work thus replicates a processing system that, starting from the operational satellite retrievals, carries out their fusion and ends with the assimilation of the fused products. These consist of ozone vertical profiles, from which further modules of the chain deliver tropospheric ozone and UV radiation at the Earth's surface. The conclusions highlight the relevance of this novel approach to the synergistic use of operational satellite data and underline that the infrastructure uses general-purpose technologies and is open to applications in different contexts.
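As an illustration of the chain's shape, the toy sketch below wires the three stages (retrieval, fusion, assimilation) into a linear pipeline. All module names, signatures, and payloads are invented for illustration and do not reflect the AURORA prototype's actual interfaces or its distributed execution.

```python
from typing import List

def retrieve(l1b: dict) -> dict:
    # per-sensor L2 retrieval (stand-in)
    return {"profiles": f"L2 ozone profiles from {l1b['sensor']}"}

def fuse(l2_products: List[dict]) -> dict:
    # complete data fusion of the per-sensor retrievals (stand-in)
    return {"fused": [p["profiles"] for p in l2_products]}

def assimilate(fused: dict) -> dict:
    # assimilation plus downstream application modules (stand-in)
    return {"analysis": fused["fused"],
            "tropospheric_o3": "derived product",
            "surface_uv": "derived product"}

def run_chain(l1b_inputs: List[dict]) -> dict:
    l2 = [retrieve(x) for x in l1b_inputs]   # retrieval stage
    fused = fuse(l2)                         # fusion stage
    return assimilate(fused)                 # assimilation stage

print(run_chain([{"sensor": "Sentinel-4 UVN"}, {"sensor": "Sentinel-5 UVNS"}]))
```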


2013 ◽  
Vol 31 (3) ◽  
pp. 427 ◽  
Author(s):  
Dionisio Uendro Carlos ◽  
Marco Antonio Braga ◽  
Henry F. Galbiatti ◽  
Wanderson Roberto Pereira

ABSTRACT. This paper discusses some processing techniques (all codes were implemented with open source software) developed for airborne gravity gradient systems, aiming at outlining geological features by applying mathematical formulations based on the potential field properties and its decomposition into gradiometric tensor components. These techniques were applied to both synthetic and real data. Applying them to synthetic data allows working in a controlled environment, understanding the different processing results and establishing a comparative parameter. The same methodologies were applied to a survey area of the Quadrilátero Ferrífero to map iron ore targets, resulting in a set of very helpful and important information for geological mapping activities and a priori information for geophysical inversion models.

Keywords: processing, airborne gravity gradiometry, iron ore exploration, FTG system, FALCON system.
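One typical tensor-domain product of such processing is the set of rotation invariants of the measured gradient tensor, often used to outline geological features. The numpy sketch below computes them on gridded components; array names and units are assumptions, and the paper's specific processing codes are not reproduced.

```python
import numpy as np

def tensor_invariants(Gxx, Gyy, Gzz, Gxy, Gxz, Gyz):
    """Rotation invariants of the symmetric gravity gradient tensor,
    evaluated element-wise on gridded components (e.g. in Eotvos)."""
    # trace: ~0 for a source-free (Laplacian) field, so often a quality check
    I0 = Gxx + Gyy + Gzz
    # sum of principal 2x2 minors
    I1 = Gxx * Gyy + Gyy * Gzz + Gxx * Gzz - Gxy**2 - Gyz**2 - Gxz**2
    # determinant of the full tensor
    I2 = (Gxx * (Gyy * Gzz - Gyz**2)
          - Gxy * (Gxy * Gzz - Gyz * Gxz)
          + Gxz * (Gxy * Gyz - Gyy * Gxz))
    return I0, I1, I2

# example with random 2x2 grids standing in for gridded tensor components
rng = np.random.default_rng(1)
comps = [rng.normal(size=(2, 2)) for _ in range(6)]
print(tensor_invariants(*comps))
```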

