data format Latest Research Papers

Clustering and Smoothing Pipeline for Management Zone Delineation Using Proximal and Remote Sensing

Sensors ◽

10.3390/s22020645 ◽

2022 ◽

Vol 22 (2) ◽

pp. 645

Author(s):

S. Hamed Javadi ◽

Angela Guerrero ◽

Abdul M. Mouazen

Keyword(s):

Feature Selection ◽

Precision Agriculture ◽

Input Data ◽

Clustering Algorithm ◽

Near Infrared ◽

Variable Rate ◽

Yield Data ◽

Data Format ◽

Variable Rate Application ◽

Cross Correlation Analysis

In precision agriculture (PA) practices, the accurate delineation of management zones (MZs), with each zone having similar characteristics, is essential for map-based variable rate application of farming inputs. However, there is no consensus on an optimal clustering algorithm or on the input data format. In this paper, we evaluated the performance of five clustering algorithms (k-means, fuzzy C-means (FCM), hierarchical, mean shift, and density-based spatial clustering of applications with noise (DBSCAN)) in different scenarios and assessed the impacts of input data format and feature selection on MZ delineation quality. We used key soil fertility attributes (moisture content (MC), organic carbon (OC), calcium (Ca), cation exchange capacity (CEC), exchangeable potassium (K), magnesium (Mg), sodium (Na), exchangeable phosphorus (P), and pH) collected with an online visible and near-infrared (vis-NIR) spectrometer, along with Sentinel-2 and yield data of five commercial fields in Belgium. We demonstrated that k-means is the optimal clustering method for MZ delineation, and that the input data should be normalized (range normalization). Feature selection was also shown to improve MZ delineation quality. Furthermore, we proposed an algorithm based on DBSCAN for smoothing the MZ maps to allow smooth actuating during variable rate application by agricultural machinery. Finally, the whole process of MZ delineation was integrated in a clustering and smoothing pipeline (CaSP), which automatically performs the following steps sequentially: (1) range normalization, (2) feature selection based on cross-correlation analysis, (3) k-means clustering, and (4) smoothing. It is recommended to adopt the developed platform for automatic MZ delineation for variable rate applications of farming inputs.
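The first three CaSP steps named in the abstract can be sketched in plain Python. This is an illustrative sketch only, not the authors' implementation: the function names, the 0.9 correlation threshold, and the greedy "drop a feature if it correlates strongly with one already kept" rule are assumptions, and the DBSCAN-based smoothing step is omitted because the abstract does not describe it in enough detail.

```python
import random
from statistics import mean


def range_normalize(rows):
    """Scale every column to [0, 1] using (x - min) / (max - min)."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(x - l) / (h - l) if h > l else 0.0 for x, l, h in zip(row, lo, hi)]
        for row in rows
    ]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va and vb else 0.0


def select_features(rows, threshold=0.9):
    """Keep a column only if it is not strongly correlated with a kept one.

    The threshold and the greedy rule are assumptions for illustration.
    """
    cols = list(zip(*rows))
    kept = []
    for j, col in enumerate(cols):
        if all(abs(pearson(col, cols[k])) < threshold for k in kept):
            kept.append(j)
    return kept


def kmeans(rows, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row."""
    rng = random.Random(seed)
    centers = rng.sample(rows, k)
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(r, centers[c])))
            for r in rows
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [mean(col) for col in zip(*members)]
    return labels


def delineate_zones(rows, k=2, threshold=0.9):
    """Steps (1)-(3): normalize, select features, cluster."""
    normed = range_normalize(rows)
    kept = select_features(normed, threshold)
    reduced = [[r[j] for j in kept] for r in normed]
    return kept, kmeans(reduced, k)
```

Each row stands for one sampling location and each column for one attribute (for example OC, pH, or yield). Range normalization comes first so that attributes measured on large scales do not dominate the Euclidean distances used by k-means.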

Aird: a computation-oriented mass spectrometry data format enables a higher compression ratio and less decoding time

BMC Bioinformatics ◽

10.1186/s12859-021-04490-0 ◽

2022 ◽

Vol 23 (1) ◽

Author(s):

Miaoshan Lu ◽

Shaowei An ◽

Ruimin Wang ◽

Jinyin Wang ◽

Changbin Yu

Keyword(s):

Mass Spectrometry ◽

Data Storage ◽

High Speed ◽

Lossless Compression ◽

Mass Spectrometry Data ◽

Compression Rate ◽

Search Performance ◽

Data Format ◽

Link Type ◽

Decoding Speed

Background: As the precision of mass spectrometry (MS) increases, MS file sizes grow rapidly. Beyond the widely used open format mzML, near-lossless and lossless compression algorithms and formats have emerged for scenarios with different precision requirements. The data precision is often related to the instrument and the subsequent processing algorithms. Unlike storage-oriented formats, which focus more on the lossless compression rate, computation-oriented formats concentrate as much on decoding speed as on the compression rate. Results: Here we introduce “Aird”, an open-source, computation-oriented format with controllable precision, flexible indexing strategies, and a high compression rate. Aird provides a novel compressor called Zlib-Diff-PforDelta (ZDPD) for m/z data. Compared with Zlib only, the m/z data size is about 55% lower in Aird on average. With the high-speed decoding and encoding performance of the single instruction multiple data technology used in ZDPD, Aird takes only 33% of the decoding time of Zlib. We downloaded seven datasets from ProteomeXchange and Metabolights, acquired on different SCIEX, Thermo, and Agilent instruments. We then converted the raw data into the mzML, mgf, and mz5 file formats with MSConvert and compared them with the Aird format. Aird uses JavaScript Object Notation for metadata storage. Aird-SDK is written in Java, and AirdPro is a GUI client for vendor file conversion written in C#. They are freely available at https://github.com/CSi-Studio/Aird-SDK and https://github.com/CSi-Studio/AirdPro. Conclusions: With the innovation of MS acquisition modes, MS data characteristics are also constantly changing. New data features can bring more effective compression methods and new index modes to achieve high search performance. The MS data storage mode will also become professional and customized. ZDPD uses multiple MS digital features, and researchers can also use it in other formats such as mzML. Aird is designed to become a computing-oriented data format with high scalability, a high compression rate, and fast decoding speed.
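To illustrate why a "Diff" stage helps before Zlib, the following sketch quantizes sorted m/z values to integers at a chosen precision, stores successive differences, and compresses the result. It is not ZDPD itself: the PforDelta bit-packing stage and the SIMD decoding are left out, and the function names and the `scale` parameter are hypothetical.

```python
import struct
import zlib


def encode_mz(mz_values, scale=10_000):
    """Quantize sorted m/z values, delta-encode them, then zlib-compress.

    `scale` controls precision: 10_000 keeps four decimal places.
    """
    ints = [round(v * scale) for v in mz_values]
    deltas = []
    previous = 0
    for value in ints:
        deltas.append(value - previous)  # first delta is the first value itself
        previous = value
    raw = struct.pack(f"<{len(deltas)}q", *deltas)  # little-endian int64
    return zlib.compress(raw)


def decode_mz(blob, scale=10_000):
    """Invert encode_mz: decompress, undo the deltas, rescale to floats."""
    raw = zlib.decompress(blob)
    deltas = struct.unpack(f"<{len(raw) // 8}q", raw)
    values, total = [], 0
    for d in deltas:
        total += d
        values.append(total / scale)
    return values
```

Because m/z values within a spectrum are sorted, the deltas are small and repetitive, which gives the general-purpose compressor far more redundancy to exploit than the raw floating-point values do.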

iLDM: An Interoperable Graph-Based Local Dynamic Map

Vehicles ◽

10.3390/vehicles4010003 ◽

2022 ◽

Vol 4 (1) ◽

pp. 42-59

Author(s):

Mikel García ◽

Itziar Urbieta ◽

Marcos Nieto ◽

Javier González de Mendibil ◽

Oihana Otaegui

Keyword(s):

System Development ◽

Local Dynamic ◽

Driver Assistance ◽

Data Format ◽

Common Reference ◽

Multiple Data ◽

Discovery Service ◽

Static Data ◽

Dynamic Map ◽

Database Engine

The local dynamic map (LDM) is a key component in the future of autonomous and connected vehicles. An LDM serves as a local database with the necessary tools to provide a common reference system for both static data (i.e., map information) and dynamic data (vehicles, pedestrians, etc.). The LDM should have a common and well-defined input system in order to be interoperable across multiple data sources such as sensor detections or V2X communications. In this work, we present an interoperable graph-based LDM (iLDM) using Neo4j as our database engine and OpenLABEL as a common data format. We report an analysis of data insertion and querying times for the iLDM, including a vehicle discovery service function used to test the capabilities of our work, and a comparative analysis with other LDM implementations. The comparison shows that the proposed iLDM outperformed them in several relevant features, furthering its practical utilisation in advanced driver assistance system development.

The Problem of Reference Rot in Spatial Metadata Catalogues

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi11010027 ◽

2021 ◽

Vol 11 (1) ◽

pp. 27

Author(s):

Sergio Martin-Segura ◽

Francisco Javier Lopez-Pellicer ◽

Javier Nogueras-Iso ◽

Javier Lacasta ◽

Francisco Javier Zarazaga-Soria

Keyword(s):

Spatial Data ◽

Actual Data ◽

Web Pages ◽

Web Page ◽

Data Types ◽

Data Format ◽

Data Services ◽

Data Formats ◽

Break Link

The content at the end of any hyperlink is subject to two phenomena: the link may break (Link Rot) or the content at the end of the link may no longer be the same as it was when it was created (Content Drift). Reference Rot denotes the combination of both effects. Spatial metadata records rely on hyperlinks for indicating the location of the resources they describe. Therefore, they are also subject to Reference Rot. This paper evaluates the presence of Reference Rot and its impact on the 22,738 distribution URIs of 18,054 metadata records from 26 European INSPIRE spatial data catalogues. Our Link Rot checking method detects broken links while considering the specific requirements of spatial data services. Our Content Drift checking method uses the data format as an indicator. It compares the data formats declared in the metadata with the actual data types returned by the hyperlinks. Findings show that 10.41% of the distribution URIs suffer from Link Rot and at least 6.21% of records suffer from Content Drift (they do not declare their distribution types correctly). Additionally, 14.94% of metadata records only contain intermediate HTML web pages as distribution URIs and 31.37% contain at least one HTML web page; thus, they cannot be accessed or checked directly.
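The comparison of declared formats with returned data types can be sketched as a small classifier over an already-fetched HTTP response. This is an assumption-laden illustration, not the paper's method: the `EXPECTED_MEDIA_TYPES` table, the category names, and the rule that any status of 400 or above counts as Link Rot are all hypothetical, and the special handling of spatial data services mentioned in the abstract is not modelled.

```python
# Hypothetical mapping from declared format names to acceptable media types.
EXPECTED_MEDIA_TYPES = {
    "GML": {"application/gml+xml", "application/xml", "text/xml"},
    "GeoJSON": {"application/geo+json", "application/json"},
    "Shapefile": {"application/zip", "application/x-zip-compressed"},
    "PDF": {"application/pdf"},
}


def classify_distribution(declared_format, status_code, content_type):
    """Classify one distribution URI from an already-fetched HTTP response.

    status_code is None when the request failed before any response arrived.
    """
    if status_code is None or status_code >= 400:
        return "link_rot"
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type == "text/html":
        return "intermediate_html"
    expected = EXPECTED_MEDIA_TYPES.get(declared_format)
    if expected is None:
        return "unknown_format"
    return "ok" if media_type in expected else "content_drift"
```

Keeping the network request outside the function makes the classification rule easy to test and lets the same rule be applied to stored crawl results.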

GTF: An Adaptive Network Anomaly Detection Method at the Network Edge

Security and Communication Networks ◽

10.1155/2021/3017797 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Renjie Li ◽

Zhou Zhou ◽

Xuan Liu ◽

Da Li ◽

Wei Yang ◽

...

Keyword(s):

Anomaly Detection ◽

Input Data ◽

Rapid Development ◽

Poor Performance ◽

Gradient Boosting ◽

Data Format ◽

Detection Model ◽

Imbalanced Classes ◽

Public Datasets ◽

Network Anomaly Detection

Network Anomaly Detection (NAD) has become the foundation for network management and security due to the rapid development and adoption of edge computing technologies. There are two main characteristics of NAD tasks: tabular input data and imbalanced classes. The tabular input data format means that NAD tasks take both sparse categorical features and dense numerical features as input. In order to achieve good performance, the detection model needs to handle both types of features efficiently. Among all widely used models, Gradient Boosting Decision Tree (GBDT) and Neural Network (NN) are the two most popular ones. However, each method has its limitation: GBDT is inefficient when dealing with sparse categorical features, while NN cannot yield satisfactory performance for dense numerical features. Imbalanced classes may degrade the classifier’s performance and bias results towards the majority classes, a problem often neglected by existing NAD studies. Most of the existing solutions addressing imbalance suffer from poor performance, high computational consumption, or loss of vital information under such a scenario. In this paper, we propose an adaptive ensemble-based method, named GTF, which combines TabTransformer and GBDT to leverage categorical and numerical features effectively and introduces Focal Loss to mitigate class imbalance. Our comprehensive experiments on two public datasets demonstrate that GTF can outperform other well-known methods in both multiclass and binary cases. Our implementation also shows that GTF has limited complexity, making it a good candidate for deployment at the network edge.
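For readers unfamiliar with Focal Loss, a minimal per-example sketch of its standard binary form follows. It shows only the loss itself, not how GTF integrates it with TabTransformer and GBDT; the function name is hypothetical, and the defaults `gamma=2.0` and `alpha=0.25` are commonly used values rather than settings reported in this abstract.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for one example: -alpha_t * (1 - p_t)**gamma * log(p_t).

    p is the predicted probability of the positive class, y is 0 or 1.
    """
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # clamp to keep log() finite
    p_t = p if y == 1 else 1.0 - p       # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

The factor `(1 - p_t)**gamma` shrinks the loss of examples the model already classifies confidently, so the abundant, easy majority-class examples contribute less to training than the rare, hard ones. With `gamma=0` the expression reduces to alpha-weighted cross-entropy.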

A statistical interpolation code for ocean analysis and forecasting

Journal of Atmospheric and Oceanic Technology ◽

10.1175/jtech-d-21-0033.1 ◽

2021 ◽

Author(s):

T. M. Chin ◽

E. P. Chassignet ◽

M. Iskandarani ◽

N. Groves

Keyword(s):

Data Assimilation ◽

Ocean Circulation ◽

Markov Random Fields ◽

Information Matrix ◽

Ocean Model ◽

Global Ocean ◽

Error Statistics ◽

Data Format ◽

Covariance Models

We present a data assimilation package for use with ocean circulation models in analysis, forecasting and system evaluation applications. The basic functionality of the package is centered on a multivariate linear statistical estimation for a given predicted/background ocean state, observations and error statistics. Novel features of the package include support for multiple covariance models, and the solution of the least squares normal equations using either the covariance matrix or its inverse, the information matrix. The main focus of this paper, however, is on the solution of the analysis equations using the information matrix, which offers several advantages for solving large problems efficiently. Details of the parameterization of the inverse covariance using Markov Random Fields are provided and its relationship to finite difference discretizations of diffusion equations is pointed out. The package can assimilate a variety of observation types from both remote sensing and in-situ platforms. The performance of the data assimilation methodology implemented in the package is demonstrated with a yearlong global ocean hindcast with a 1/4° ocean model. The code is implemented in modern Fortran, supports distributed memory, shared memory, and multi-core architectures, and uses Climate and Forecasts compliant Network Common Data Format for Input/Output. The package is freely available with an open source license from www.tendral.com/tsis/
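The two solution routes mentioned above can be written in the textbook form of linear statistical estimation. The notation is an assumption, since the abstract defines no symbols: x_b is the background state, x_a the analysis, y the observations, H the linear observation operator, B the background error covariance, and R the observation error covariance. The package's actual formulation may differ in detail.

```latex
\begin{aligned}
\text{covariance form:}\quad
  & x_a = x_b + B H^{\top}\left(H B H^{\top} + R\right)^{-1}\left(y - H x_b\right) \\
\text{information form:}\quad
  & \left(B^{-1} + H^{\top} R^{-1} H\right)\left(x_a - x_b\right)
    = H^{\top} R^{-1}\left(y - H x_b\right)
\end{aligned}
```

The two forms give the same analysis by the Sherman-Morrison-Woodbury identity. When the information matrix B^{-1} is parameterized directly as a sparse operator, as a Markov Random Field allows, the second system is sparse in state space, which is what makes it attractive for large problems.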

Description of buckwheat cultivars from Belarus and Ukraine in the environments of Leningrad Province

PROCEEDINGS ON APPLIED BOTANY GENETICS AND BREEDING ◽

10.30901/2227-8834-2021-4-61-70 ◽

2021 ◽

Vol 182 (4) ◽

pp. 61-70

Author(s):

O. I. Romanova

Keyword(s):

Statistical Data ◽

Stem Growth ◽

Average Score ◽

Data Format ◽

Fruit Setting ◽

The World ◽

Leningrad Province ◽

Growing Conditions ◽

Modal Values ◽

Fruit Formation

Background. Buckwheat is an extremely valuable groat crop in demand both in Russia and abroad. The buckwheat collection held by VIR is the largest in the world. Studying and systematizing knowledge about the conserved diversity of the genus Fagopyrum Mill. representatives cannot be efficient without switching to the use of the most detailed descriptors containing plant characters least dependent on differences in growing conditions. Materials and methods. Seventeen buckwheat cultivars from Ukraine and Belarus and two references from Russia were studied in Leningrad Province. The methodological basis of the study included the IPGRI buckwheat descriptors and personal recommendations of N. V. Fesenko. Statistical data processing was performed according to P. F. Rokitsky. Results. The cultivars formed their typical plant habitus and demonstrated good fruit setting − an average score was 3.3–4.9 out of five. The determinant stem growth was observed in 10 cultivars. The stem developed 2.7−6.7 generative nodes and 4–6 vegetative ones, while 1.9–4.7 generative and 0.8−2.3 vegetative nodes were formed on the two upper branches. Conclusion. The study confirmed that medium-ripening buckwheat can be grown in Leningrad Province. The modal value of the number of vegetative nodes for the studied cultivars was 4−5, which is an indicator of intermediate ripening. The results of studying the metamerism of the stem and the two upper branches, expressed by modal values, were recorded in the “agricultural fitness” passport for the tested cultivars as follows: determinant stem; branching zone 4+1+2; fruit-forming zone 3+3+3; average score of fruit formation 4.8. The presented data format most fully characterizes a cultivar in terms of the potential of its earliness and productivity. Depending on the task, indicators for the main stem or for the two upper branches can be used. Recording values in the form of a formula is convenient and does not imply any other meanings.

Indonesia's socio-political developments during Jokowi's leadership

Linguistics and Culture Review ◽

10.21744/lingcure.v5ns1.1966 ◽

2021 ◽

Vol 5 (S1) ◽

pp. 1588-1598

Author(s):

Wempi Feber ◽

Deandlles Christover

Keyword(s):

Separation Of Powers ◽

Secondary Data ◽

Power Sharing ◽

Future Research ◽

Coding System ◽

Data Format ◽

Data Coding ◽

Additional Input ◽

System Data ◽

The Republic

This paper discusses the latest socio-political developments in Indonesia during President Jokowi's leadership, drawing on political journals and perspectives from the international community. The literature sources are various international publications and media highlights published in the last five years, in both national and international journals. The method is literature analysis involving a data coding system, data evaluation, and interpretation leading to conclusions, so that the findings address the study question with a high degree of validity. Our searches were electronic. This study relies on secondary data. The reports for this study are in a descriptive qualitative data format. The findings are that the politics of the Jokowi era involved a division of power between the executive and the legislature within the Unitary State of the Republic of Indonesia, with a presidential system with a parliamentary system. In other words, Indonesia does not adhere to a system of separation of powers but rather to a system of power-sharing between the executive at the center and the regions. These findings serve as additional input for future research on the same theme.

PC based new software developed to create an input pilot balloon data file to an alternative to Hand Held Data Logger (HHDL) for using PC based SAMEER Pibal computation software

MAUSAM ◽

10.54302/mausam.v67i2.1359 ◽

2021 ◽

Vol 67 (2) ◽

pp. 499-504

Author(s):

N. MEENATCHI NATHAN ◽

CHANABASANAGOUDA. S. PATIL ◽

J. P. IMMANUEL JAYAPRAKASH

Keyword(s):

Data Processing ◽

Computer System ◽

India Meteorological Department ◽

Data File ◽

Input File ◽

National Data ◽

Data Logger ◽

Data Format ◽

Data Centre ◽

Pilot Balloon

Pilot balloon observatories of the India Meteorological Department (IMD) have been using the Hand Held Data Logger (HHDL), manufactured by SAMEER, to compute upper air data since 2007. The HHDL, a sleek, microcontroller-based, battery-operated unit, accepts all information pertaining to the PB ascent through its numeric keypad for raw file generation and pilot balloon data processing. The raw file can be transferred to a computer system as an input file to the PC based Pibal computation software. This software generates Pibal messages similar to those of the HHDL, in addition to the National Data Centre (NDC) data format and monthly climate. In case of any hardware failure, neither the HHDL nor the PC based Pibal computation software can be used. To overcome this problem, PC based Pibal data keying software has been developed using Visual C#. The newly developed software creates an input file similar to that of the HHDL; it was tested with the PC based Pibal computation software and works successfully as an alternative in case of failure of the HHDL and its hardware accessories.

Definition of an FHIR-based multiprotocol IoT home gateway to support the dynamic plug of new devices within instrumented environments

Journal of Reliable Intelligent Environments ◽

10.1007/s40860-021-00161-2 ◽

2021 ◽

Author(s):

Paolo Zampognaro ◽

Giovanni Paragliola ◽

Vincenzo Falanga

Keyword(s):

Medical Devices ◽

Semantic Annotation ◽

Health Resources ◽

Data Retrieval ◽

Original Data ◽

Seamless Integration ◽

Data Format ◽

Continuous Growth ◽

Home Gateway ◽

Mobile Phone Technology

Internet of Things (IoT) technologies have become a milestone advancement in the digital healthcare domain, since the number of IoT medical devices has grown exponentially, and it is now anticipated that by 2020, there will be over 161 million of them connected worldwide. Therefore, in an era of continuous growth, IoT healthcare faces various challenges, such as the collection over multiple protocols (e.g. Bluetooth, MQTT, CoAP, ZigBee, etc.), the interpretation, and the harmonization of the data formats that derive from the huge number of existing heterogeneous IoT medical devices. In this respect, this study proposes an advanced Home Gateway architecture that offers a unique data collection module, supporting direct data acquisition over multiple protocols (i.e. BLE, MQTT) and indirect data retrieval from cloud health services (i.e. GoogleFit). Moreover, the solution proposes a mechanism to automatically convert the original data format, carried over BLE, into HL7 FHIR by exploiting a semantic annotation of device capabilities that is itself implemented by means of FHIR resources. The adoption of such annotation enables the dynamic plug of new sensors within the instrumented environment without the need to stop and adapt the gateway. This simplifies the customization of the dynamic device landscape requested by the various telemedicine application contexts (e.g. CVD, Diabetes) and demonstrates, for the first time, a concrete example of using the FHIR standard not only (as usual) for health resource representation and storage but also as an instrument to enable seamless integration of IoT devices. The proposed solution also relies on mobile phone technology, which is widely adopted, aiming to reduce any obstacle to larger adoption.
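As an illustration of the target data format, the sketch below wraps one already-decoded heart-rate reading in a minimal HL7 FHIR Observation. It does not reproduce the gateway's annotation-driven conversion: the function name and its parameters are hypothetical, and parsing of the BLE payload is assumed to have happened beforehand. The LOINC code 8867-4 and the UCUM unit code `/min` are the standard identifiers for heart rate.

```python
import json


def heart_rate_to_fhir(bpm, patient_id, device_id, timestamp):
    """Build a minimal FHIR Observation for one heart-rate reading.

    timestamp is an ISO 8601 string; the identifiers are illustrative.
    """
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [
                {"system": "http://loinc.org", "code": "8867-4", "display": "Heart rate"}
            ]
        },
        "subject": {"reference": f"Patient/{patient_id}"},
        "device": {"reference": f"Device/{device_id}"},
        "effectiveDateTime": timestamp,
        "valueQuantity": {
            "value": bpm,
            "unit": "beats/minute",
            "system": "http://unitsofmeasure.org",
            "code": "/min",
        },
    }


if __name__ == "__main__":
    observation = heart_rate_to_fhir(72, "example", "hr-sensor-1", "2021-06-01T10:00:00Z")
    print(json.dumps(observation, indent=2))
```

Because every device's readings end up in the same resource shape, downstream storage and telemedicine applications do not need to know which protocol delivered a given measurement.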
