Data Processing Strategies in Data Lakes

Author(s):  
Saurabh Gupta ◽  
Venkata Giri
Sensors ◽  
2017 ◽  
Vol 17 (11) ◽  
pp. 2469 ◽  
Author(s):  
Gianluca Gennarelli ◽  
Obada Al Khatib ◽  
Francesco Soldovieri

2013 ◽  
Vol 6 (6) ◽  
pp. 10443-10480 ◽  
Author(s):  
H. L. Brantley ◽  
G. S. W. Hagler ◽  
S. Kimbrough ◽  
R. W. Williams ◽  
S. Mukerjee ◽  
...  

Abstract. The collection of real-time air quality measurements while in motion (i.e., mobile monitoring) is currently conducted worldwide to evaluate in situ emissions, local air quality trends, and air pollutant exposure. This measurement strategy pushes the limits of traditional data analysis with complex second-by-second multipollutant data varying as a function of time and location. Data reduction and filtering techniques are often applied to deduce trends, such as pollutant spatial gradients downwind of a highway. However, rarely do mobile monitoring studies report the sensitivity of their results to the chosen data processing approaches. The study being reported here utilized a large mobile monitoring dataset collected on a roadway network in central North Carolina to explore common data processing strategies including time-alignment, short-term emissions event detection, background estimation, and averaging techniques. One-second time resolution measurements of ultrafine particles ≤ 100 nm in diameter (UFPs), black carbon (BC), particulate matter (PM), carbon monoxide (CO), carbon dioxide (CO2), and nitrogen dioxide (NO2) were collected on twelve unique driving routes that were repeatedly sampled. Analyses demonstrate that the multiple emissions event detection strategies reported produce generally similar results and that utilizing a median (as opposed to a mean) as a summary statistic may be sufficient to avoid bias in near-source spatial trends. Background levels of the pollutants are shown to vary with time, and the estimated contributions of the background to the mean pollutant concentrations were: BC (6%), PM2.5–10 (12%), UFPs (19%), CO (38%), PM10 (45%), NO2 (51%), PM2.5 (56%), and CO2 (86%). Lastly, while temporal smoothing (e.g., 5 s averages) results in weak pair-wise correlation and the blurring of spatial trends, spatial averaging (e.g., 10 m) is demonstrated to increase correlation and refine spatial trends.
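The contrast the abstract draws between a median and a mean under spatial averaging can be illustrated with a minimal sketch. This is not the authors' code: the function name `spatial_summary`, the 10 m bin width default, and the sample values are all hypothetical, chosen only to show how a single short-lived emissions spike pulls a bin mean upward while the bin median stays near the typical level.

```python
from collections import defaultdict
from statistics import mean, median

def spatial_summary(distances_m, concentrations, bin_width_m=10.0, stat=median):
    """Group samples into fixed-width distance bins and summarize each bin.

    Returns a dict mapping each bin's starting distance (m) to the chosen
    summary statistic of the concentrations that fall in that bin.
    """
    bins = defaultdict(list)
    for d, c in zip(distances_m, concentrations):
        bins[int(d // bin_width_m)].append(c)
    return {idx * bin_width_m: stat(vals) for idx, vals in sorted(bins.items())}

# Hypothetical 1 Hz samples: distance along a route (m) and a pollutant reading.
# The value 200.0 stands in for a brief emissions event (e.g., a passing truck).
distances = [1.0, 4.0, 8.0, 12.0, 15.0, 19.0]
readings = [10.0, 12.0, 200.0, 11.0, 13.0, 12.0]

median_by_bin = spatial_summary(distances, readings, stat=median)
mean_by_bin = spatial_summary(distances, readings, stat=mean)

print(median_by_bin)  # the spike barely affects the first bin's median
print(mean_by_bin)    # the spike dominates the first bin's mean
```

Binning by position rather than by time means repeated passes over the same route segment land in the same bin regardless of vehicle speed, which is one plausible reason spatial averaging can sharpen trends that temporal smoothing blurs.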


Author(s):  
Francesco Soldovieri ◽  
Gianluca Gennarelli ◽  
Ilaria Catapano ◽  
D. Liao ◽  
T. Dogaru

2013 ◽  
Author(s):  
Yijin Liu ◽  
Korneel H. Cats ◽  
Johanna Nelson Weker ◽  
Joy C. Andrews ◽  
Bert M. Weckhuysen ◽  
...  

2021 ◽  
Vol 478 (10) ◽  
pp. 1827-1845
Author(s):  
Euan Pyle ◽  
Giulia Zanetti

Cryo-electron tomography (cryo-ET) can be used to reconstruct three-dimensional (3D) volumes, or tomograms, from a series of tilted two-dimensional images of biological objects in their near-native states in situ or in vitro. 3D subvolumes, or subtomograms, containing particles of interest can be extracted from tomograms, aligned, and averaged in a process called subtomogram averaging (STA). STA overcomes the low signal-to-noise ratio within the individual subtomograms to generate structures of the particle(s) of interest. In recent years, cryo-ET with STA has increasingly been capable of reaching subnanometer resolution due to improvements in microscope hardware and data processing strategies. There has also been an increase in the number and quality of software packages available to process cryo-ET data with STA. In this review, we describe and assess the data processing strategies available for cryo-ET data and highlight the recent software developments that have enabled the extraction of high-resolution information from cryo-ET datasets.
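Why averaging helps with a low signal-to-noise ratio can be shown with a toy sketch. It covers only the averaging step and assumes the copies are already perfectly aligned; real STA packages also handle alignment, the missing wedge, and CTF correction, none of which is modeled here. The one-dimensional "particle", the noise level, and the helper name `average_aligned` are hypothetical.

```python
import random
from statistics import pstdev

def average_aligned(copies):
    """Element-wise mean of equally sized, already-aligned arrays."""
    n = len(copies)
    return [sum(vals) / n for vals in zip(*copies)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical 1D stand-in for a particle density (64 samples).
signal = [0.0, 1.0, 0.0, -1.0] * 16

# 100 noisy observations of the same particle; noise std is twice the signal amplitude.
noisy = [[s + rng.gauss(0.0, 2.0) for s in signal] for _ in range(100)]

averaged = average_aligned(noisy)

# Residual noise (observation minus true signal) before and after averaging.
noise_single = pstdev([a - s for a, s in zip(noisy[0], signal)])
noise_avg = pstdev([a - s for a, s in zip(averaged, signal)])

print(f"noise in one copy:   {noise_single:.2f}")
print(f"noise after average: {noise_avg:.2f}")
```

For independent noise, averaging N aligned copies reduces the noise standard deviation by roughly a factor of the square root of N, so 100 copies give about a tenfold reduction in this sketch.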


2021 ◽  
pp. 233-241
Author(s):  
Paolo Castelli ◽  
Michele Rizzo ◽  
Ostilio Spadaccini
