OzFlux Data: Network integration from collection to curation

Abstract. Measurement of the exchange of energy and mass between the surface and the atmospheric boundary-layer by the eddy covariance technique has undergone great change in the last two decades. Early studies of these exchanges were confined to brief field campaigns in carefully controlled conditions followed by months of data analysis. Current practice is to run tower-based eddy covariance systems continuously over several years due to the need for continuous monitoring as part of a global effort to develop local-, regional-, continental- and global-scale budgets of carbon, water and energy. Efficient methods of processing the increased quantities of data are needed to maximise the time available for analysis and interpretation. Standardised methods are needed to remove differences in data processing as possible contributors to observed spatial variability. Furthermore, public availability of these datasets assists with undertaking global research efforts. The OzFlux data path has been developed (i) to provide a standard set of quality control and post-processing tools across the network, thereby facilitating inter-site integration and spatial comparisons; (ii) to increase the time available to researchers for analysis and interpretation by reducing the time spent collecting and processing data; (iii) to propagate both data and metadata to the final product; and (iv) to facilitate the use of the OzFlux data by adopting a standard file format and making the data available from web-based portals. The fundamentals of the OzFlux data path include the adoption of netCDF as the underlying file format to integrate data and metadata, a suite of Python scripts to provide a standard quality control, post-processing, gap filling and partitioning environment, a portal from which data can be downloaded and an OPeNDAP server offering internet access to the latest version of the OzFlux data set. 
Discovery of the OzFlux data set is facilitated through incorporation in FLUXNET data syntheses and the publication of collection metadata via the RIF-CS format. This paper serves two purposes. The first is to describe the datasets, along with their quality control and post-processing, for the other papers of this Special Issue. The second is to provide an example of one solution to the data collection and curation challenges that are encountered by similar flux tower networks worldwide.
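The idea of a standard quality control step that carries metadata through to the final product can be sketched as follows. This is a minimal illustration only, not the OzFlux code: the `Variable` container, the `range_check` function, the flag values (0 = passed, 1 = failed), the -9999 missing value and the limits are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    """A data series bundled with its metadata and QC flags (hypothetical layout)."""
    name: str
    values: list
    attrs: dict = field(default_factory=dict)
    qc_flags: list = field(default_factory=list)


def range_check(var, lower, upper, missing=-9999.0):
    """Return a new Variable with out-of-range values masked and flagged.

    Flag 0 means the value passed, flag 1 means it failed (assumed convention).
    """
    values, flags = [], []
    for v in var.values:
        ok = lower <= v <= upper
        values.append(v if ok else missing)
        flags.append(0 if ok else 1)
    attrs = dict(var.attrs)                        # carry existing metadata forward
    attrs["range_check"] = f"[{lower}, {upper}]"   # record what was applied
    attrs["missing_value"] = missing
    return Variable(var.name, values, attrs, flags)


# Hypothetical CO2 flux series with one implausible spike.
fc = Variable("Fc", [-3.2, 1.1, 250.0, -0.4], {"units": "umol/m2/s"})
fc_qc = range_check(fc, lower=-50.0, upper=50.0)
print(fc_qc.values)
print(fc_qc.qc_flags)
print(fc_qc.attrs)
```

Returning a new object rather than editing in place keeps the unprocessed series available, so each processing level can be stored alongside a record of what was done to it.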



Biogeosciences ◽

10.5194/bg-14-2903-2017 ◽

2017 ◽

Vol 14 (12) ◽

pp. 2903-2928 ◽

Cited By ~ 25

Author(s):

Peter Isaac ◽

James Cleverly ◽

Ian McHugh ◽

Eva van Gorsel ◽

Cacilia Ewenz ◽

...

Keyword(s):

Quality Control ◽

Eddy Covariance ◽

Global Scale ◽

Data Sets ◽

Post Processing ◽

Data Set ◽

Web Based ◽

Standard File Format ◽

Standard Set ◽

Field Campaigns



Quality control in scRNA-Seq can discriminate pacemaker cells: the mtRNA bias

Cellular and Molecular Life Sciences ◽

10.1007/s00018-021-03916-5 ◽

2021 ◽

Author(s):

Anne-Marie Galow ◽

Sophie Kussauer ◽

Markus Wolfien ◽

Ronald M. Brunner ◽

Tom Goldammer ◽

...

Keyword(s):

Quality Control ◽

Cardiac Tissue ◽

High Energy ◽

Tissue Type ◽

Pacemaker Cells ◽

Data Set ◽

Public Data ◽

Standard Quality ◽

Mitochondrial Transcripts

Abstract. Single-cell RNA-sequencing (scRNA-seq) provides high-resolution insights into complex tissues. Cardiac tissue, however, poses a major challenge due to the delicate isolation process and the large size of mature cardiomyocytes. Regardless of the experimental technique, captured cells are often impaired and some capture sites may contain multiple or no cells at all. All this refers to “low quality” potentially leading to data misinterpretation. Common standard quality control parameters involve the number of detected genes, transcripts per cell, and the fraction of transcripts from mitochondrial genes. While cutoffs for transcripts and genes per cell are usually user-defined for each experiment or individually calculated, a fixed threshold of 5% mitochondrial transcripts is standard and often set as default in scRNA-seq software. However, this parameter is highly dependent on the tissue type. In the heart, mitochondrial transcripts comprise almost 30% of total mRNA due to high energy demands. Here, we demonstrate that a 5%-threshold not only causes an unacceptable exclusion of cardiomyocytes but also introduces a bias that particularly discriminates pacemaker cells. This effect is apparent for our in vitro generated induced-sinoatrial-bodies (iSABs; highly enriched physiologically functional pacemaker cells), and also evident in a public data set of cells isolated from embryonal murine sinoatrial node tissue (Goodyer William et al. in Circ Res 125:379–397, 2019). Taken together, we recommend omitting this filtering parameter for scRNA-seq in cardiovascular applications whenever possible.
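The effect of a fixed mitochondrial cutoff can be illustrated with a small sketch. It assumes that mitochondrial genes are recognizable by an `mt-` name prefix and uses invented counts for two hypothetical cells; the function names are illustrative and do not belong to any scRNA-seq package.

```python
def mito_fraction(counts, mito_prefix="mt-"):
    """Fraction of a cell's transcripts that come from mitochondrial genes."""
    total = sum(counts.values())
    mito = sum(n for gene, n in counts.items()
               if gene.lower().startswith(mito_prefix))
    return mito / total if total else 0.0


def filter_cells(cells, max_mito_fraction=None):
    """Keep cells at or below the cutoff; None disables the filter entirely."""
    if max_mito_fraction is None:
        return list(cells)
    return [cell_id for cell_id, counts in cells.items()
            if mito_fraction(counts) <= max_mito_fraction]


# Hypothetical counts: one cell with 4% and one with 28% mitochondrial transcripts.
cells = {
    "fibroblast_like": {"Col1a1": 960, "mt-Co1": 40},
    "cardiomyocyte_like": {"Myh6": 720, "mt-Co1": 280},
}

kept_default = filter_cells(cells, 0.05)   # the common 5% default
kept_unfiltered = filter_cells(cells)      # filter omitted
print(kept_default)
print(kept_unfiltered)
```

With the 5% cutoff only the low-mitochondrial cell survives, which is the kind of tissue-dependent exclusion the abstract describes.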


Application of the MODIS evapotranspiration product to a pasture area in the western Amazon [Aplicação do produto de evapotranspiração do MODIS para uma área de pastagem na Amazônia ocidental]

Ciência e Natura ◽

10.5902/2179460x30738 ◽

2018 ◽

Vol 40 ◽

pp. 162

Author(s):

Agni Cristina de Carvalho Brito ◽

Nara Luisa Reis de Andrade ◽

Larissa Santos Fambri ◽

Camila Bermond Ruezzene ◽

Renata Gonçalves Aguiar

Keyword(s):

Land Use ◽

Quality Control ◽

Eddy Covariance ◽

Linear Correlation ◽

Seasonal Variations ◽

Natural Ecosystems ◽

Data Set ◽

Water Cycling ◽

Eddy Covariance System ◽

Pasture Area

The processes of land use and occupation generate interventions in natural ecosystems that make them susceptible to reactions, such as changes in the processes that govern water cycling, which emphasizes the importance of monitoring evapotranspiration behavior. In this sense, the objective of this study was to verify the applicability of the evapotranspiration product derived from the MODIS sensor to a pasture area, from 2003 to 2010, at Fazenda Nossa Senhora in the municipality of Ouro Preto do Oeste, Rondônia. Evapotranspiration data were used from the MODIS (Terra/Aqua) sensor, estimated by the MOD16 algorithm, and from a micrometeorological tower located in the pasture area, generated by an eddy covariance system. For the ET Eddy x ET MOD16 (quality control, QC 0/8) data set, the ET MOD16 (QC 0/8) data showed evapotranspiration values above those of ET Eddy and with a greater amplitude. A linear correlation between the study data sets was not identified; however, seasonal variations are captured by the product, which shows good approximation to the ET Eddy data, especially in the transition periods.
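A comparison of this kind usually rests on two simple statistics: the mean difference between the product and the tower values, and the linear (Pearson) correlation between them. The sketch below shows both, using invented numbers rather than data from the study; the function and variable names are assumptions made for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_bias(reference, estimate):
    """Average of (estimate - reference); positive means overestimation."""
    return sum(e - r for r, e in zip(reference, estimate)) / len(reference)


# Invented evapotranspiration values in mm/day, for illustration only.
et_eddy = [2.0, 3.0, 4.0, 3.0]
et_mod16 = [5.0, 3.0, 6.0, 6.0]

bias = mean_bias(et_eddy, et_mod16)
r = pearson_r(et_eddy, et_mod16)
print(f"mean bias = {bias:.2f} mm/day, r = {r:.2f}")
```

A positive bias together with a low correlation coefficient would correspond to a product that overestimates the tower values without tracking them linearly.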


Unlocking the potential of eddy covariance data with the R software package openeddy

10.5194/egusphere-egu2020-2382 ◽

2020 ◽

Author(s):

Ladislav Šigut ◽

Pavel Sedlák ◽

Milan Fischer ◽

Georg Jocher ◽

Thomas Wutzler ◽

...

Keyword(s):

Quality Control ◽

Eddy Covariance ◽

Data Aggregation ◽

Software Package ◽

Input Data ◽

Ease Of Use ◽

Post Processing ◽

R Software ◽

Software Packages ◽

Processing Steps

The eddy covariance method provides important insights into CO2, water and energy exchange-related processes at the ecosystem scale. Data are collected quasi-continuously at a sampling frequency of at least 10 Hz, often over multiple years, producing large datasets. Standard data processing methods have already been devised but undergo continuous refinement that should be reflected in the available software. Currently, a suite of software packages is available for computing half-hourly products from high-frequency raw data. However, software packages that consolidate the further post-processing computations are not yet common. The post-processing steps can consist of quality control, footprint modelling, computation of storage fluxes, gap-filling, flux partitioning and data aggregation. They can also be implemented in different programming languages and require various input data formats. Users therefore often evaluate only certain aspects of the dataset, which limits the amount of information that can be extracted from the data, and they may omit features that could affect data quality or interpretation. Here we present the free R software package openeddy (https://github.com/lsigut/openeddy) that provides utilities for input data handling, extended quality control checks, data aggregation and visualization and that includes a workflow (https://github.com/lsigut/EC_workflow) that attempts to integrate all post-processing steps through incorporation of other free software packages, such as REddyProc (https://github.com/bgctw/REddyProc/). The framework is designed for the standard set of eddy covariance fluxes, i.e. of momentum, latent and sensible heat as well as CO2.
Special attention was paid to the visualization of results at different stages of processing and at different time resolutions and aggregation steps. This makes it possible to quickly check that computations were performed as expected and also helps to reveal issues in the dataset itself. Finally, the proposed folder structure with defined post-processing steps makes it possible to organize data in different stages of processing for improved ease of use. Produced workflow files document the whole processing chain and its possible adaptations for a given site. We believe that such a tool can be particularly useful for eddy-covariance novices, groups that cannot or do not contribute their data to regional networks for further processing or users that want to evaluate their data independently. This or similar efforts can also help to save human resources or speed up the development of new methods. This work was supported by the Ministry of Education, Youth and Sports of the Czech Republic within the CzeCOS program, grant number LM2015061, and within Mobility CzechGlobe 2, grant number CZ.02.2.69/0.0/0.0/18_053/0016924.
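Two of the post-processing steps named above, quality control and data aggregation, can be sketched together. The example is written in Python for illustration and does not reproduce the openeddy API, which is an R package; the record layout and the flag scheme (0 = good, 1 = acceptable, 2 = rejected) are assumptions.

```python
from collections import defaultdict
from datetime import datetime


def daily_means(records, max_flag=1):
    """Average half-hourly values per calendar day, skipping rejected records.

    `records` is an iterable of (timestamp, value, qc_flag) tuples.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, value, flag in records:
        if flag > max_flag:
            continue  # excluded by quality control
        day = timestamp.date()
        sums[day] += value
        counts[day] += 1
    return {day: sums[day] / counts[day] for day in sorted(sums)}


# Hypothetical half-hourly flux records.
records = [
    (datetime(2020, 6, 1, 0, 0), 1.0, 0),
    (datetime(2020, 6, 1, 0, 30), 3.0, 1),
    (datetime(2020, 6, 1, 1, 0), 50.0, 2),   # flagged as rejected
    (datetime(2020, 6, 2, 0, 0), 4.0, 0),
]

result = daily_means(records)
for day, mean in result.items():
    print(day, mean)
```

Filtering before aggregating matters here: including the rejected record would raise the first daily mean from 2.0 to 18.0.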


A likelihood ratio approach for identifying three-quarter siblings in genetic databases

Heredity ◽

10.1038/s41437-020-00392-8 ◽

2021 ◽

Author(s):

Iván Galván-Femenía ◽

Carles Barceló-Vidal ◽

Lauro Sumoy ◽

Victor Moreno ◽

Rafael de Cid ◽

...

Keyword(s):

Likelihood Ratio ◽

Family Relationships ◽

Genetic Research ◽

Data Set ◽

Scientific Disciplines ◽

Genome Wide ◽

Standard Quality ◽

Control Procedures ◽

Identity By State ◽

Second Degree

Abstract. The detection of family relationships in genetic databases is of interest in various scientific disciplines such as genetic epidemiology, population and conservation genetics, forensic science, and genealogical research. Nowadays, screening genetic databases for related individuals forms an important aspect of standard quality control procedures. Relatedness research is usually based on an allele sharing analysis of identity by state (IBS) or identity by descent (IBD) alleles. Existing IBS/IBD methods mainly aim to identify first-degree relationships (parent–offspring or full siblings) and second degree (half-siblings, avuncular, or grandparent–grandchild) pairs. Little attention has been paid to the detection of in-between first and second-degree relationships such as three-quarter siblings (3/4S) who share fewer alleles than first-degree relationships but more alleles than second-degree relationships. With the progressively increasing sample sizes used in genetic research, it becomes more likely that such relationships are present in the database under study. In this paper, we extend existing likelihood ratio (LR) methodology to accurately infer the existence of 3/4S, distinguishing them from full siblings and second-degree relatives. We use bootstrap confidence intervals to express uncertainty in the LRs. Our proposal accounts for linkage disequilibrium (LD) by using marker pruning, and we validate our methodology with a pedigree-based simulation study accounting for both LD and recombination. An empirical genome-wide array data set from the GCAT Genomes for Life cohort project is used to illustrate the method.
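A bootstrap confidence interval for a likelihood ratio can be sketched in general terms as follows. The abstract does not describe the authors' resampling scheme, so this is a generic percentile bootstrap under the assumption that the overall log LR is a sum of independent per-marker contributions; the function name and the input values are hypothetical.

```python
import random


def bootstrap_ci(per_marker_log_lr, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the summed log likelihood ratio.

    Markers are resampled with replacement, which assumes they are
    independent (for example, after pruning for linkage disequilibrium).
    """
    rng = random.Random(seed)
    n = len(per_marker_log_lr)
    totals = sorted(
        sum(rng.choices(per_marker_log_lr, k=n)) for _ in range(n_boot)
    )
    lower = totals[int((alpha / 2) * n_boot)]
    upper = totals[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Invented per-marker log10 LR contributions for one pair of individuals.
contributions = [0.30, -0.10, 0.25, 0.05, -0.20, 0.40, 0.15, 0.10]
point_estimate = sum(contributions)
ci_lower, ci_upper = bootstrap_ci(contributions)
print(f"log10 LR = {point_estimate:.2f}, 95% CI = ({ci_lower:.2f}, {ci_upper:.2f})")
```

An interval that excludes zero on the log scale would indicate that the data favor one relationship hypothesis over the other beyond resampling noise.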


Rapid multi-directed cholinergic transmission in the central nervous system

Nature Communications ◽

10.1038/s41467-021-21680-9 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Santhosh Sethuramanujam ◽

Akihiro Matsumoto ◽

Geoff deRosenroll ◽

Benjamin Murphy-Baum ◽

J Michael McIntosh ◽

...

Keyword(s):

Central Nervous System ◽

Nervous System ◽

Ganglion Cells ◽

Release Site ◽

Amacrine Cells ◽

Global Scale ◽

Cholinergic Transmission ◽

Data Set ◽

The Central Nervous System ◽

Synaptic Mechanisms

Abstract. In many parts of the central nervous system, including the retina, it is unclear whether cholinergic transmission is mediated by rapid, point-to-point synaptic mechanisms, or slower, broad-scale ‘non-synaptic’ mechanisms. Here, we characterized the ultrastructural features of cholinergic connections between direction-selective starburst amacrine cells and downstream ganglion cells in an existing serial electron microscopy data set, as well as their functional properties using electrophysiology and two-photon acetylcholine (ACh) imaging. Correlative results demonstrate that a ‘tripartite’ structure facilitates a ‘multi-directed’ form of transmission, in which ACh released from a single vesicle rapidly (~1 ms) co-activates receptors expressed in multiple neurons located within ~1 µm of the release site. Cholinergic signals are direction-selective at a local, but not global scale, and facilitate the transfer of information from starburst to ganglion cell dendrites. These results suggest a distinct operational framework for cholinergic signaling that bears the hallmarks of synaptic and non-synaptic forms of transmission.


User password repetitive patterns analysis and visualization

Information and Computer Security ◽

10.1108/ics-06-2015-0026 ◽

2016 ◽

Vol 24 (1) ◽

pp. 93-115 ◽

Cited By ~ 2

Author(s):

Xiaoying Yu ◽

Qi Liao

Keyword(s):

Security Policy ◽

Large Data ◽

Efficient Algorithms ◽

Privacy And Security ◽

Data Set ◽

Web Based ◽

Content Type ◽

Word Cloud ◽

Individual Privacy ◽

Password Security

Purpose – Passwords have been designed to protect individual privacy and security and are widely used in almost every area of our lives. The strength of passwords is therefore critical to the security of our systems. However, due to the explosion of user accounts and the increasing complexity of password rules, users are struggling to find ways to make up sufficiently secure yet easy-to-remember passwords. This paper aims to investigate whether there are repetitive patterns when users choose passwords and how such behaviors may prompt us to rethink password security policy. Design/methodology/approach – The authors develop a model to formalize the password repetition problem and design efficient algorithms to analyze the repeat patterns. To help security practitioners analyze patterns, the authors design and implement a lightweight, Web-based visualization tool for interactive exploration of password data. Findings – Through case studies on a real-world leaked password data set, the authors demonstrate how the tool can be used to identify various interesting patterns, e.g. shorter substrings of the same type used to make up longer strings, which are then repeated to make up the final passwords, suggesting that the length requirement of a password policy does not necessarily increase security. Originality/value – The contributions of this study are two-fold. First, the authors formalize the problem of password repetitive patterns by considering both short and long substrings and in both directions, which have not yet been considered in the past. Efficient algorithms are developed and implemented that can analyze various repeat patterns quickly even in large data sets. Second, the authors design and implement four novel visualization views that are particularly useful for exploration of password repeat patterns, i.e. the character frequency charts view, the short repeat heatmap view, the long repeat parallel coordinates view and the repeat word cloud view.
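The finding that a repeated short string can satisfy a length requirement without adding much strength can be illustrated with a simple check. This is not the authors' algorithm, which the abstract does not specify; it is a minimal sketch that looks only for passwords made entirely of one repeated unit, and the function name is hypothetical.

```python
def smallest_repeating_unit(password):
    """Return (unit, repeats) for the shortest substring whose repetition
    reproduces the whole password; repeats is 1 when nothing repeats."""
    n = len(password)
    for size in range(1, n + 1):
        if n % size == 0 and password[:size] * (n // size) == password:
            return password[:size], n // size
    return password, 1  # only reached for the empty string


for candidate in ["abc123abc123", "aaaa", "hunter2"]:
    unit, repeats = smallest_repeating_unit(candidate)
    print(f"{candidate!r}: unit={unit!r}, repeats={repeats}, "
          f"distinct length={len(unit)}")
```

A 12-character password such as `abc123abc123` passes a minimum-length rule, yet an attacker only has to guess a 6-character unit and the fact that it is doubled.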


Improvement of the safety of the red pepper spice with FMEA and post processing EWMA quality control charts

Journal of Food Science and Technology ◽

10.1007/s13197-011-0371-7 ◽

2011 ◽

Vol 50 (3) ◽

pp. 466-476 ◽

Cited By ~ 6

Author(s):

Sibel Ozilgen ◽

Seyda Bucak ◽

Mustafa Ozilgen

Keyword(s):

Quality Control ◽

Control Charts ◽

Red Pepper ◽

Post Processing ◽

Quality Control Charts


Artificial intelligence-based automatic visual inspection system for built heritage

Smart and Sustainable Built Environment ◽

10.1108/sasbe-09-2020-0139 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Lukman E. Mansuri ◽

D.A. Patel

Keyword(s):

Artificial Intelligence ◽

Visual Inspection ◽

Image Data ◽

Inspection System ◽

Data Set ◽

Web Based ◽

Content Type ◽

Built Heritage ◽

Automatic Visual Inspection ◽

Inspection Systems

Purpose – Heritage is the latent part of a sustainable built environment. Conservation and preservation of heritage is one of the United Nations' (UN) sustainable development goals. Many social and natural factors seriously threaten heritage structures by deteriorating and damaging the original. Therefore, regular visual inspection of heritage structures is necessary for their conservation and preservation. Conventional inspection practice relies on manual inspection, which takes more time and human resources. The inspection system seeks an innovative approach that should be cheaper, faster, safer and less prone to human error than manual inspection. Therefore, this study aims to develop an automatic system of visual inspection for the built heritage. Design/methodology/approach – The artificial intelligence-based automatic defect detection system is developed using the faster R-CNN (faster region-based convolutional neural network) model of object detection to build an automatic visual inspection system. From the English and Dutch cemeteries of Surat (India), images of heritage structures were captured by digital camera to prepare the image data set. This image data set was used for training, validation and testing to develop the automatic defect detection model. While validating this model, its optimum detection accuracy is recorded as 91.58% to detect three types of defects: “spalling,” “exposed bricks” and “cracks.” Findings – This study develops the model of automatic web-based visual inspection systems for the heritage structures using the faster R-CNN. Then it demonstrates detection of defects of spalling, exposed bricks and cracks existing in the heritage structures. Comparison of conventional (manual) and developed automatic inspection systems reveals that the developed automatic system requires less time and staff. 
Therefore, the routine inspection can be faster, cheaper, safer and more accurate than the conventional inspection method. Practical implications – The study presented here can improve inspecting the built heritages by reducing inspection time and cost, eliminating chances of human errors and accidents and having accurate and consistent information. This study attempts to ensure the sustainability of the built heritage. Originality/value – For ensuring the sustainability of built heritage, this study presents the artificial intelligence-based methodology for the development of an automatic visual inspection system. The automatic web-based visual inspection system for the built heritage has not been reported in previous studies so far.


Influence of tranexamic acid use on venous thromboembolism risk in patients undergoing surgery for spine tumors

Journal of Neurosurgery Spine ◽

10.3171/2021.1.spine201935 ◽

2021 ◽

pp. 1-11

Author(s):

Zach Pennington ◽

Jeff Ehresman ◽

Andrew Schilling ◽

James Feghali ◽

Andrew M. Hersh ◽

...

Keyword(s):

Venous Thromboembolism ◽

Tranexamic Acid ◽

Vertebral Column ◽

Comprehensive Cancer Center ◽

Primary Study ◽

High Dose ◽

Data Set ◽

Web Based ◽

Spine Tumors ◽

Increased Risk

OBJECTIVE Patients with spine tumors are at increased risk for both hemorrhage and venous thromboembolism (VTE). Tranexamic acid (TXA) has been advanced as a potential intervention to reduce intraoperative blood loss in this surgical population, but many fear it is associated with increased VTE risk due to the hypercoagulability noted in malignancy. In this study, the authors aimed to 1) develop a clinical calculator for postoperative VTE risk in the population with spine tumors, and 2) investigate the association of intraoperative TXA use and postoperative VTE. METHODS A retrospective data set from a comprehensive cancer center was reviewed for adult patients treated for vertebral column tumors. Data were collected on surgery performed, patient demographics and medical comorbidities, VTE prophylaxis measures, and TXA use. TXA use was classified as high-dose (≥ 20 mg/kg) or low-dose (< 20 mg/kg). The primary study outcome was VTE occurrence prior to discharge. Secondary outcomes were deep venous thrombosis (DVT) or pulmonary embolism (PE). Multivariable logistic regression was used to identify independent risk factors for VTE and the resultant model was deployed as a web-based calculator. RESULTS Three hundred fifty patients were included. The mean patient age was 57 years, 53% of patients were male, and 67% of surgeries were performed for spinal metastases. TXA use was not associated with increased VTE (14.3% vs 10.1%, p = 0.37). After multivariable analysis, VTE was independently predicted by lower serum albumin (odds ratio [OR] 0.42 per g/dl, 95% confidence interval [CI] 0.23–0.79, p = 0.007), larger mean corpuscular volume (OR 0.91 per fl, 95% CI 0.84–0.99, p = 0.035), and history of prior VTE (OR 2.60, 95% CI 1.53–4.40, p < 0.001). Longer surgery duration approached significance and was included in the final model. 
Although TXA was not independently associated with the primary outcome of VTE, high-dose TXA use was associated with increased odds of both DVT and PE. The VTE model showed a fair fit of the data with an area under the curve of 0.77. CONCLUSIONS In the present cohort of patients treated for vertebral column tumors, TXA was not associated with increased VTE risk, although high-dose TXA (≥ 20 mg/kg) was associated with increased odds of DVT or PE. Additionally, the web-based clinical calculator of VTE risk presented here may prove useful in counseling patients preoperatively about their individualized VTE risk.
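The odds ratios reported above act multiplicatively on the odds, which is how a logistic regression model is turned into a risk calculator. The sketch below shows only that arithmetic. The baseline odds of 0.10 is a hypothetical value chosen for illustration, since the abstract does not report the model intercept, and the function names are assumptions; this is not the published calculator.

```python
def adjusted_odds(baseline_odds, odds_ratio, units=1.0):
    """Scale the odds by an odds ratio applied over `units` of change."""
    return baseline_odds * odds_ratio ** units


def odds_to_probability(odds):
    """Convert odds to a probability between 0 and 1."""
    return odds / (1.0 + odds)


baseline_odds = 0.10  # hypothetical, not taken from the study

# OR 0.42 per g/dl of serum albumin: a value 1 g/dl lower is -1 unit.
lower_albumin = adjusted_odds(baseline_odds, 0.42, units=-1.0)

# OR 2.60 for a history of prior VTE.
prior_vte = adjusted_odds(baseline_odds, 2.60)

print(f"albumin 1 g/dl lower: odds {lower_albumin:.3f}, "
      f"probability {odds_to_probability(lower_albumin):.3f}")
print(f"prior VTE: odds {prior_vte:.3f}, "
      f"probability {odds_to_probability(prior_vte):.3f}")
```

Because an odds ratio below 1 is protective per unit increase, a decrease of one unit divides the odds by that ratio, which is why lower albumin raises the estimated risk.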
