Lossless Compression of Sensor Signals Using an Untrained Multi-Channel Recurrent Neural Predictor

2021 ◽  
Vol 11 (21) ◽  
pp. 10240
Author(s):  
Qianhao Chen ◽  
Wenqi Wu ◽  
Wei Luo

The use of sensor applications has been steadily increasing, leading to an urgent need for efficient data compression techniques to facilitate the storage, transmission, and processing of digital signals generated by sensors. Unlike other sequential data such as text, sensor signals have more complex statistical characteristics. Specifically, at every signal point, each bit corresponds to a specific precision scale and follows its own conditional distribution, which depends on its history and even on the other bits. Applying existing general-purpose data compressors therefore usually yields a relatively low compression ratio, since these compressors do not fully exploit such internal features. Worse still, partitioning a bit stream into groups of a preset size can break the integrity of individual signal points. In this paper, we present a lossless data compressor dedicated to sensor signals, built upon a novel recurrent neural architecture named the multi-channel recurrent unit (MCRU). Each channel in the proposed MCRU models a specific precision range of each signal point without breaking data integrity. During compression and decompression, mirrored networks are trained on the observed data, so no pre-training is needed. The superiority of our approach over other compressors is demonstrated experimentally on various types of sensor signals.
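The channel decomposition described above can be illustrated with a minimal sketch (the function names and the 16-bit sample width are assumptions for illustration, not taken from the paper): each integer sample is split into bit-plane channels so that every precision scale can be modeled separately, without chopping a sample across fixed-size bit groups.

```python
def to_bit_channels(samples, width=16):
    """Split each integer sample into `width` bit-plane channels.

    Channel 0 holds the least significant bit of every sample and
    channel width-1 the most significant, so each precision scale can
    be handled by its own predictor without breaking sample integrity.
    """
    return [[(s >> b) & 1 for s in samples] for b in range(width)]

def from_bit_channels(channels):
    """Reassemble samples from bit-plane channels (lossless inverse)."""
    width = len(channels)
    n = len(channels[0])
    return [sum(channels[b][i] << b for b in range(width))
            for i in range(n)]

samples = [0, 1023, 4095, 70]
assert from_bit_channels(to_bit_channels(samples)) == samples
```

Splitting by bit plane rather than by fixed-size byte groups is what keeps every channel aligned with one precision scale of the signal.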

Entropy ◽  
2020 ◽  
Vol 22 (2) ◽  
pp. 245
Author(s):  
István Finta ◽  
Sándor Szénási ◽  
Lóránt Farkas

In this contribution, we provide a detailed analysis of the search operation for the Interval Merging Binary Tree (IMBT), an efficient data structure proposed earlier to handle typical anomalies in the transmission of data packets. A framework is provided to decide under which conditions the IMBT outperforms other data structures typically used in the field, as a function of the statistical characteristics of the commonly occurring anomalies in the arrival of data packets. The modeling draws on the Bernstein theorem, the Markov property, Fibonacci sequences, bipartite multi-graphs, and contingency tables.
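The interval-merging idea behind the IMBT can be sketched in a few lines (this toy version uses a sorted list rather than a balanced binary tree, and all names are illustrative): contiguous packet sequence numbers collapse into single intervals, so storage tracks the number of gaps rather than the number of packets.

```python
import bisect

def insert(intervals, seq):
    """Insert a packet sequence number into a sorted list of disjoint,
    non-adjacent [lo, hi] intervals, merging with neighbours whenever
    a gap closes (toy list analogue of the IMBT idea)."""
    i = bisect.bisect_left(intervals, [seq, seq])
    if i < len(intervals) and intervals[i][0] == seq:
        return intervals                      # already stored
    if i > 0 and intervals[i - 1][1] >= seq:
        return intervals                      # covered by left interval
    left = i > 0 and intervals[i - 1][1] + 1 == seq
    right = i < len(intervals) and intervals[i][0] - 1 == seq
    if left and right:                        # seq fills the gap: merge
        intervals[i - 1][1] = intervals[i][1]
        del intervals[i]
    elif left:                                # extend left interval
        intervals[i - 1][1] = seq
    elif right:                               # extend right interval
        intervals[i][0] = seq
    else:                                     # isolated new interval
        intervals.insert(i, [seq, seq])
    return intervals

# Packets 1..6 arrive out of order; the store never holds more
# intervals than there are gaps plus one.
store = []
for seq in [1, 2, 6, 4, 3, 5]:
    insert(store, seq)
print(store)  # collapses to a single interval once all gaps close
```

The paper's framework asks precisely when this gap-driven node count beats the packet-driven node count of conventional structures.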


2018 ◽  
Vol 35 (15) ◽  
pp. 2674-2676 ◽  
Author(s):  
Shubham Chandak ◽  
Kedar Tatwawadi ◽  
Idoia Ochoa ◽  
Mikel Hernaez ◽  
Tsachy Weissman

Motivation: High-throughput sequencing technologies produce huge amounts of data in the form of short genomic reads, associated quality values and read identifiers. Because of the significant structure present in these FASTQ datasets, general-purpose compressors are unable to fully exploit much of the inherent redundancy. Although there has been a lot of work on designing FASTQ compressors, most of them lack support for one or more crucial properties, such as variable-length reads, scalability to high-coverage datasets, pairing-preserving compression and lossless compression. Results: In this work, we propose SPRING, a reference-free compressor for FASTQ files. SPRING supports a wide variety of compression modes and features, including lossless compression, pairing-preserving compression, lossy compression of quality values, long-read compression and random access. SPRING achieves substantially better compression than existing tools; for example, SPRING compresses 195 GB of 25× whole-genome human FASTQ from Illumina's NovaSeq sequencer to less than 7 GB, around 1.6× smaller than previous state-of-the-art FASTQ compressors. SPRING achieves this improvement while using comparable computational resources. Availability and implementation: SPRING can be downloaded from https://github.com/shubhamchandak94/SPRING. Supplementary information: Supplementary data are available at Bioinformatics online.
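The headline figures can be sanity-checked with a line of arithmetic (the 195 GB, 7 GB, and 1.6× numbers come from the abstract; treating "less than 7 GB" as exactly 7 GB is an approximation):

```python
raw_gb, spring_gb, gain_vs_prev = 195, 7, 1.6

overall_ratio = raw_gb / spring_gb       # end-to-end compression ratio
prev_tool_gb = spring_gb * gain_vs_prev  # implied size under prior SOTA

print(round(overall_ratio, 1))  # roughly 28x overall
print(round(prev_tool_gb, 1))   # prior tools: roughly 11 GB
```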


Author(s):  
Kamal Al-Khayyat ◽  
Imad Al-Shaikhli ◽  
Mohamad Al-Hagery

This paper examines a particular case of data compression in which a second compression pass removes the redundancy that arises when edge-based compression algorithms compress (previously compressed) pixelated images. The newly created redundancy can be removed by another round of compression. This work used JPEG-LS as an example of an edge-based compression algorithm for compressing pixelated images. Its output was then subjected to a second round of compression using a more powerful but slower compressor (PAQ8f). The compression ratio of the second pass was, on average, 18%, which is high for seemingly random data. The results of the two successive lossless compressions were superior to lossy JPEG: under the data set used, lossy JPEG had to sacrifice about 10% on average to come close to the total compression ratio of the two passes. To generalize the results, fast general-purpose compression algorithms (7z, bz2, and gzip) were also evaluated.
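The two-stage idea, a fast domain-specific pass whose output still contains structure, followed by a stronger general-purpose pass, can be mimicked with standard-library tools (a naive run-length encoder stands in for JPEG-LS and bz2 for PAQ8f; the synthetic "image" and both codecs are illustrative assumptions, not the paper's pipeline):

```python
import bz2

def rle(data):
    """Naive run-length encoding: (count, byte) pairs (first stage)."""
    out = bytearray()
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and j - i < 255 and data[j] == data[i]:
            j += 1
        out += bytes([j - i, data[i]])
        i = j
    return bytes(out)

# Pixelated stand-in: long runs of a few gray levels, repeated.
image = (b"\x10" * 40 + b"\x80" * 40 + b"\xf0" * 40) * 100

stage1 = rle(image)            # removes runs but leaves periodic structure
stage2 = bz2.compress(stage1)  # a second round exploits that structure

print(len(image), len(stage1), len(stage2))
```

The second pass shrinks the first pass's output because the first-stage encoder, like an edge-based image codec, leaves behind its own regular patterns.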


Author(s):  
Jacob J. Oleson ◽  
Michelle A. Jones ◽  
Erik J. Jorgensen ◽  
Yu-Hsiang Wu

Purpose: The analysis of ecological momentary assessment (EMA) data can be difficult to conceptualize due to the complexity of how the data are collected. The goal of this tutorial is to provide an overview of statistical considerations for analyzing observational data arising from EMA studies. Method: EMA data are collected in a variety of ways, complicating the statistical analysis. We focus on fundamental statistical characteristics of the data and on general-purpose statistical approaches to analyzing EMA data, which we implement using a recent study involving EMA. Results: A linear or generalized linear mixed model, when properly specified, can adequately handle the challenges posed by EMA-collected data. Additionally, while the effective sample size depends on both the number of participants and the number of survey responses per participant, having more participants matters more than having more responses per participant. Conclusion: Using modern statistical methods when analyzing EMA data, and carefully considering all of the statistical assumptions involved, can lead to interesting and important findings. Supplemental Material: https://doi.org/10.23641/asha.17155961
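The claim that participants matter more than responses per participant follows from the variance of the grand mean under a simple random-intercept model, a textbook result rather than one specific to this tutorial: Var(mean) = σ²_b/n + σ²_e/(n·m), so extra responses m shrink only the second term. The variance components below are illustrative values.

```python
def var_grand_mean(n, m, var_between=1.0, var_within=1.0):
    """Variance of the overall mean with n participants giving m
    responses each, under a random-intercept model: the between-person
    variance is divided only by n, not by the total response count."""
    return var_between / n + var_within / (n * m)

# Same total of 200 responses, allocated two ways:
many_people = var_grand_mean(n=40, m=5)   # 1/40 + 1/200 = 0.030
many_reps = var_grand_mean(n=10, m=20)    # 1/10 + 1/200 = 0.105
print(many_people, many_reps)
```

With the same 200 responses, recruiting 40 participants yields a far more precise estimate than collecting 20 responses from each of 10 participants.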


Author(s):  
L. De Micco ◽  
H. A. Larrondo ◽  
A. Plastino ◽  
O. A. Rosso

We deal with randomness quantifiers and concentrate on their ability to discern the hallmark of chaos in time series used in connection with pseudo-random number generators (PRNGs). Workers in the field are motivated to use chaotic maps for generating PRNGs because of the simplicity of their implementation. Although very efficient general-purpose benchmarks exist for testing PRNGs, we feel that the analysis provided here sheds additional didactic light on the importance of the main statistical characteristics of a chaotic map, namely (i) its invariant measure and (ii) its mixing constant. This helps answer two questions that arise in applications: (i) Which is the best PRNG among the available ones? (ii) If a given PRNG turns out not to be good enough and a randomization procedure must still be applied to it, which randomization procedure is best? Our answer provides a comparative analysis of several quantifiers advanced in the extant literature.
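The role of the invariant measure can be seen in a few lines (the logistic map at r = 4 is a standard example chosen here for illustration; the seed and bin count are arbitrary): iterates pile up near 0 and 1 according to the map's non-uniform invariant density 1/(π√(x(1−x))), which is exactly why raw chaotic iterates need a randomizing post-processing step before use as a PRNG.

```python
def logistic_orbit(x0, n, burn=1000):
    """Iterate x -> 4x(1-x), discarding an initial transient."""
    x = x0
    for _ in range(burn):
        x = 4.0 * x * (1.0 - x)
    orbit = []
    for _ in range(n):
        x = 4.0 * x * (1.0 - x)
        orbit.append(x)
    return orbit

orbit = logistic_orbit(0.1234, 100_000)

# Histogram over 4 equal bins: the invariant density gives mass 1/3
# to each edge bin and only 1/6 to each middle bin.
counts = [0, 0, 0, 0]
for x in orbit:
    counts[min(3, int(4 * x))] += 1
print(counts)
```

A good quantifier must detect this skew; a uniform PRNG would fill all four bins equally.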


2021 ◽  
Author(s):  
Qingxi Meng ◽  
Shubham Chandak ◽  
Yifan Zhu ◽  
Tsachy Weissman

Motivation: The amount of data produced by genome sequencing experiments has been growing rapidly over the past several years, making compression important for efficient storage, transfer and analysis of the data. In recent years, nanopore sequencing technologies have seen increasing adoption since they are portable, real-time and provide long reads. However, there has been limited progress on compression of nanopore sequencing reads obtained in FASTQ files. Previous work ENANO focuses mostly on quality score compression and does not achieve significant gains for the compression of read sequences over general-purpose compressors. RENANO achieves significantly better compression for read sequences but is limited to aligned data with a reference available. Results: We present NanoSpring, a reference-free compressor for nanopore sequencing reads, relying on an approximate assembly approach. We evaluate NanoSpring on a variety of datasets including bacterial, metagenomic, plant, animal, and human whole genome data. For recently basecalled high quality nanopore datasets, NanoSpring achieves close to 3x improvement in compression over state-of-the-art reference-free compressors. The computational requirements of NanoSpring are practical, although it uses more time and memory during compression than previous tools to achieve the compression gains. Availability: NanoSpring is available on GitHub at https://github.com/qm2/NanoSpring.
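The flavor of an approximate-assembly compressor can be conveyed with a toy sketch (exact suffix/prefix overlap stands in for NanoSpring's approximate matching, and all names and reads are illustrative): overlapping reads are merged into a contig, after which each read can be stored as a cheap (offset, length) reference instead of its full sequence.

```python
def best_overlap(a, b, min_len=3):
    """Longest suffix of a matching a prefix of b (exact toy overlap;
    a real tool uses approximate, index-based matching)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0

def assemble(reads):
    """Greedy left-to-right merge into one contig; each read is then
    recoverable as a slice contig[offset:offset+len(read)]."""
    contig = reads[0]
    offsets = [0]
    for r in reads[1:]:
        k = best_overlap(contig, r)
        offsets.append(len(contig) - k)
        contig += r[k:]
    return contig, offsets

reads = ["ACGTAC", "TACGGA", "GGATTT"]
contig, offs = assemble(reads)
print(contig, offs)
```

Storing one contig plus small offsets, rather than every overlapping read in full, is where the compression gain on redundant long-read data comes from.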


Author(s):  
A. Sarmiento ◽  
A. A. Diakité

Abstract. Sensors are the vehicle through which Internet of Things (IoT) applications collect timely data, which are communicated to objects, or “Things”, to make them aware of their environment. With multiple sensors within an IoT system sending continuous streams of data, the potential scale of the data is large, so efficient data management and useful representation are key concerns. As the information required from sensors benefits from a spatial context, 3D indoor models, such as IndoorGML, have been identified to support this need. As it stands, a standardised structure for the amalgamation of sensors with IndoorGML has not been defined. The goal of this paper is to explore this opportunity by first reviewing previous approaches to the integration of the two systems. Research into the interpretation of sensor information through existing standards is then conducted before narrowing these attributes down to a minimal profile according to identified functional requirements of sensor applications. Finally, this knowledge is organised into a conceptual data model and presented as a thematic module for IndoorGML.
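The shape of such a conceptual model, a minimal sensor profile anchored to an indoor cell, can be sketched with dataclasses (all class and field names here are illustrative assumptions, not the module actually defined in the paper):

```python
from dataclasses import dataclass, field

@dataclass
class CellSpace:
    """Stand-in for an IndoorGML primal-space unit, e.g. a room."""
    cell_id: str

@dataclass
class Observation:
    phenomenon: str   # what is measured, e.g. "temperature"
    value: float
    unit: str
    timestamp: str    # ISO 8601

@dataclass
class Sensor:
    """Minimal sensor profile tied to a spatial context."""
    sensor_id: str
    location: CellSpace
    observations: list = field(default_factory=list)

    def observe(self, obs: Observation):
        self.observations.append(obs)

room = CellSpace("C1-Room101")
s = Sensor("temp-01", room)
s.observe(Observation("temperature", 21.5, "degC", "2021-06-01T12:00:00Z"))
print(s.location.cell_id, len(s.observations))
```

Anchoring each sensor to a cell, rather than to raw coordinates, is what lets IoT queries inherit the indoor model's spatial semantics.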


2020 ◽  
Vol 8 (6) ◽  
pp. 3647-3650

An efficient data hiding method is proposed to counter smart cyber-attacks, such as data theft and virus attacks, during the transmission of fully or partially confidential data over private and government networks. Transmitting data over the internet makes work easy and fast, yet protecting that data from skilled attackers plays a vital role in preventing cyber-crime. Various techniques exist for protecting confidential data, such as digital watermarking and embedding data into images, audio, and video. Existing watermarking techniques based on RSA are time-consuming because of the number of iterations required, even for decryption of an image. Attempts to overcome this iteration complexity using least significant bits, the DWT transformation, and the Arnold transformation fail to satisfy all security requirements. The proposed reversible data hiding method provides efficient encryption for medical images, health-care images, and data transmission in various organizations.
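The least-significant-bit embedding mentioned above can be shown in a few lines (this is the generic LSB technique for illustration, not the paper's full reversible scheme; names are illustrative):

```python
def embed_lsb(cover, payload_bits):
    """Hide one payload bit in the least significant bit of each
    cover byte (generic LSB embedding sketch)."""
    stego = bytearray(cover)
    for i, bit in enumerate(payload_bits):
        stego[i] = (stego[i] & 0xFE) | bit
    return bytes(stego)

def extract_lsb(stego, n):
    """Recover the first n hidden bits."""
    return [b & 1 for b in stego[:n]]

cover = bytes([120, 121, 122, 123, 124, 125, 126, 127])
bits = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_lsb(cover, bits)
assert extract_lsb(stego, 8) == bits
# Each cover byte changes by at most 1, keeping distortion small.
assert all(abs(a - b) <= 1 for a, b in zip(cover, stego))
```

Plain LSB embedding like this is easy to detect and not reversible, which is precisely the gap that reversible, encryption-based schemes aim to close.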


2017 ◽  
Vol 46 ◽  
pp. 57-63
Author(s):  
Li Shuang Liu ◽  
Xiu Jian Chou ◽  
Tao Chen ◽  
Li Ning Sun

This paper presents a type of Ag/polydimethylsiloxane (Ag/PDMS) nanocomposite material for use in strain gauge element applications. In these elements, the Ag nanoparticles act as conductive elements via electron tunneling, and the PDMS forms the tunneling dielectric structure. In our experiments, the piezoresistance and piezocapacitance characteristics of these Ag/PDMS composites were studied under applied strain/stress. From the results, the gauge factors for piezoresistance and piezocapacitance can reach up to 153 and 224, respectively. Also, when acting as strain elements, the Ag/PDMS composites show good repeatability and stability. These results prove that the material can be used to form a new type of high-sensitivity element for sensor applications. The detection method used for the sensor signals also offers diversity by including both piezoresistance and piezocapacitance.
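The quoted gauge factors follow the standard definition GF = (ΔR/R)/ε, the relative resistance (or capacitance) change per unit strain (the example numbers below are illustrative, not measurements from the paper):

```python
def gauge_factor(delta_r_over_r, strain):
    """Gauge factor: relative resistance change per unit strain."""
    return delta_r_over_r / strain

# A 15.3% resistance change at 0.1% strain implies GF = 153,
# matching the piezoresistive figure quoted in the abstract.
gf = gauge_factor(0.153, 0.001)
print(gf)
```

For comparison, a conventional metal-foil strain gauge has a gauge factor of only about 2, which is why values above 100 count as high sensitivity.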

