From Field Observations and Plant Specimens to a Trans-continental Knowledge Base: Efficient, semantically rich integration of highly heterogeneous plant phenological data

Author(s):  
Brian Stucky ◽  
John Deck ◽  
Ramona Walls ◽  
Robert Guralnick

Ideally, an information system that automates the integration of disparate datasets should be able to minimize the loss of information from any one dataset, achieve computational complexity suitable for working with large datasets, be flexible enough to easily incorporate new data sources, and produce output that is easily analyzed and understood by data users. Achieving all of these goals within highly heterogeneous and highly complex data domains is a major challenge. In this talk, we present the results of our recent efforts to develop such a system for data about plant phenology. Our data integration system, which is built around the Plant Phenology Ontology, currently supports semantically fine-grained integration of phenological data from both field observations and herbarium specimens. We show that even with a heavily axiomatized ontology and sophisticated, machine-reasoning-based data analysis, it is possible to implement a high-throughput data integration pipeline capable of processing millions of individual records in a matter of minutes while running on modest, server-class hardware. Success requires careful ontology design and judicious application of machine reasoning techniques. We also discuss some of the many challenges that remain for designing efficient, general-purpose data integration systems.

2020 ◽  
Vol 8 (6) ◽  
Author(s):  
Hervé Goëau ◽  
Adán Mora‐Fallas ◽  
Julien Champ ◽  
Natalie L. Rossington Love ◽  
Susan J. Mazer ◽  
...  

2021 ◽  
pp. 1-30
Author(s):  
Lisa Grace S. Bersales ◽  
Josefina V. Almeda ◽  
Sabrina O. Romasoc ◽  
Marie Nadeen R. Martinez ◽  
Dannela Jann B. Galias

With the advancement of technology, digitalization, and the internet of things, large amounts of complex data are being produced daily. This vast quantity of varied data produced at high speed is referred to as Big Data. Big Data is being used successfully in the private sector, yet the public sector seems to be falling behind despite the considerable potential Big Data has already shown. In this regard, this paper explores ways in which the government can make use of Big Data for official statistics. It begins by gathering and presenting Big Data-related initiatives and projects across the globe, covering the various types and sources of Big Data that have been used. Further, this paper discusses the opportunities, challenges, and risks associated with using Big Data, particularly in official statistics. This paper also aims to assess the current utilization of Big Data in the country through focus group discussions and key informant interviews. Based on desk review, discussions, and interviews, the paper then concludes with a proposed framework that provides ways in which Big Data may be utilized by the government to augment official statistics.


Author(s):  
Zhuliang Yao ◽  
Shijie Cao ◽  
Wencong Xiao ◽  
Chen Zhang ◽  
Lanshun Nie

In trained deep neural networks, unstructured pruning can reduce redundant weights to lower storage cost. However, it requires customized hardware to speed up practical inference. Another trend accelerates sparse model inference on general-purpose hardware by adopting coarse-grained sparsity to prune or regularize consecutive weights for efficient computation, but this method often sacrifices model accuracy. In this paper, we propose a novel fine-grained sparsity approach, Balanced Sparsity, to achieve high model accuracy efficiently on commercial hardware. Our approach adapts to the high-parallelism property of GPUs, showing incredible potential for sparsity in the wide deployment of deep learning services. Experimental results show that Balanced Sparsity achieves up to 3.1x practical speedup for model inference on GPU while retaining the same high model accuracy as fine-grained sparsity.
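
The abstract does not spell out the pruning rule, so the following is only a minimal sketch of one way a balanced, fine-grained scheme can work. It assumes that each weight row is split into equal-sized blocks and that every block keeps the same number of largest-magnitude weights. The function name `balanced_prune` and the parameters `block_size` and `keep_per_block` are hypothetical and not taken from the paper.

```python
def balanced_prune(row, block_size, keep_per_block):
    """Zero all but the `keep_per_block` largest-magnitude weights in each block.

    Illustrative assumption: every block of `block_size` consecutive weights
    ends up with the same number of non-zeros, so parallel workers that each
    handle one block would get an equal amount of work.
    """
    if len(row) % block_size != 0:
        raise ValueError("row length must be a multiple of block_size")
    if not 0 <= keep_per_block <= block_size:
        raise ValueError("keep_per_block must be between 0 and block_size")

    pruned = []
    for start in range(0, len(row), block_size):
        block = row[start:start + block_size]
        # Rank positions by descending magnitude; sorted() is stable, so
        # ties are resolved in favour of the lower index.
        ranked = sorted(range(block_size), key=lambda i: -abs(block[i]))
        keep = set(ranked[:keep_per_block])
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(block))
    return pruned


if __name__ == "__main__":
    weights = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.8, 0.01]
    print(balanced_prune(weights, block_size=4, keep_per_block=2))
```

Under this assumption the non-zero count is identical across blocks, which is what distinguishes the sketch from plain unstructured magnitude pruning, where the surviving weights may cluster in a few regions of the row.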


2008 ◽  
Vol 5 (2) ◽  
Author(s):  
Robert Pesch ◽  
Artem Lysenko ◽  
Matthew Hindle ◽  
Keywan Hassani-Pak ◽  
Ralf Thiele ◽  
...  

Summary: The automated annotation of data from high throughput sequencing and genomics experiments is a significant challenge for bioinformatics. Most current approaches rely on sequential pipelines of gene finding and gene function prediction methods that annotate a gene with information from different reference data sources. Each function prediction method contributes evidence supporting a functional assignment. Such approaches generally ignore the links between the information in the reference datasets. These links, however, are valuable for assessing the plausibility of a function assignment and can be used to evaluate the confidence in a prediction. We are working towards a novel annotation system that uses the network of information supporting the function assignment to enrich the annotation process for use by expert curators and predicting the function of previously unannotated genes. In this paper we describe our success in the first stages of this development. We present the data integration steps that are needed to create the core database of integrated reference databases (UniProt, PFAM, PDB, GO and the pathway database AraCyc) which has been established in the ONDEX data integration system. We also present a comparison between different methods for integration of GO terms as part of the function assignment pipeline and discuss the consequences of this analysis for improving the accuracy of gene function annotation. The methods and algorithms presented in this publication are an integral part of the ONDEX system which is freely available from http://ondex.sf.net/.


1983 ◽  
Vol 61 (1) ◽  
pp. 179-187 ◽  
Author(s):  
Mark P. Widrlechner

Through a review of floristic and taxonomic literature and an examination of over 1500 herbarium specimens, this report documents the rapid spread of Chaenorrhinum minus (L.) Lange along railroads across North America. The relationship between C. minus and railroads is described and phenological data on flowering and fruiting are presented. The combination of an effective dispersal mechanism and the rapid onset of reproductive maturity contributes to the species' adaptive success.


2018 ◽  
Vol 51 (6) ◽  
pp. 1571-1585 ◽  
Author(s):  
Graeme Hansford

A conceptual design for a handheld X-ray diffraction (HHXRD) instrument is proposed. Central to the design is the application of energy-dispersive XRD (EDXRD) in a back-reflection geometry. This technique brings unique advantages which enable a handheld instrument format, most notably, insensitivity to sample morphology and to the precise sample position relative to the instrument. For fine-grained samples, including many geological specimens and the majority of common alloys, these characteristics negate sample preparation requirements. A prototype HHXRD device has been developed by minor modification of a handheld X-ray fluorescence instrument, and the performance of the prototype has been tested with samples relevant to mining/quarrying and with an extensive range of metal samples. It is shown, for example, that the mineralogical composition of iron-ore samples can be approximately quantified. In metals analysis, identification and quantification of the major phases have been demonstrated, along with extraction of lattice parameters. Texture analysis is also possible and a simple example for a phosphor bronze sample is presented. Instrument formats other than handheld are possible and online process control in metals production is a promising area. The prototype instrument requires extended measurement times but it is argued that a purpose-designed instrument can achieve data-acquisition times below one minute. HHXRD based on back-reflection EDXRD is limited by the low resolution of diffraction peaks and interference by overlapping fluorescence peaks and, for these reasons, cannot serve as a general-purpose XRD tool. However, the advantages of in situ, nondestructive and rapid measurement, tolerance of irregular surfaces, and no sample preparation requirement in many cases are potentially transformative. For targeted applications in which the analysis meets commercially relevant performance criteria, HHXRD could become the method of choice through sheer speed and convenience.
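
As a rough illustration of why a back-reflection geometry can tolerate imprecise sample positioning, the sketch below applies Bragg's law in its energy-dispersive form, E = hc / (2 d sin θ). Near 2θ = 180° the term sin θ is close to 1 and changes very slowly with angle, so a small angular error barely moves the peak energy. The d-spacing and the angles are illustrative values chosen for this example, not figures from the paper, and the function name is hypothetical.

```python
import math

HC_KEV_ANGSTROM = 12.398  # Planck constant times speed of light, in keV·Å


def bragg_energy_kev(d_angstrom, two_theta_deg):
    """Photon energy (keV) diffracted by lattice spacing d at scattering angle 2θ."""
    theta = math.radians(two_theta_deg / 2.0)
    return HC_KEV_ANGSTROM / (2.0 * d_angstrom * math.sin(theta))


def peak_shift_kev(d_angstrom, two_theta_deg, angle_error_deg):
    """Change in peak energy caused by an error in the scattering angle."""
    nominal = bragg_energy_kev(d_angstrom, two_theta_deg)
    displaced = bragg_energy_kev(d_angstrom, two_theta_deg - angle_error_deg)
    return abs(displaced - nominal)


if __name__ == "__main__":
    d = 2.0  # illustrative d-spacing in Å (assumption)
    for two_theta in (40.0, 90.0, 178.0):
        shift = peak_shift_kev(d, two_theta, angle_error_deg=2.0)
        print(f"2θ = {two_theta:5.1f}°  shift for a 2° error: {shift:.4f} keV")
```

The same 2° error produces a far smaller energy shift near 178° than at 40°, which is consistent with the insensitivity to sample position described in the abstract.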

