RNAseq in the mosquito maxillary palp: a little antennal RNA goes a long way

2015 ◽  
Author(s):  
David C. Rinker ◽  
Xiaofan Zhou ◽  
Ronald Jason Pitts ◽  
Patrick L. Jones ◽  
Antonis Rokas ◽  
...  

A comparative transcriptomic study of mosquito olfactory tissues recently published in BMC Genomics (Hodges et al., 2014) reported several novel findings that have broad implications for the field of insect olfaction. In this brief commentary, we outline why the conclusions of Hodges et al. are problematic under the current models of insect olfaction and then contrast their findings with those of other RNAseq-based studies of mosquito olfactory tissues. We also generated a new RNAseq data set from the maxillary palp of Anopheles gambiae in an effort to replicate the novel results of Hodges et al. but were unable to reproduce their results. Instead, our new RNAseq data support the more straightforward explanation that the novel findings of Hodges et al. were a consequence of contamination by antennal RNA. In summary, we find strong evidence to suggest that the conclusions of Hodges et al. were spurious, and that at least some of their RNAseq data sets were irrevocably compromised by cross-contamination between samples.

2021 ◽  
Author(s):  
Nathanael Andrews ◽  
Martin Enge

Abstract CIM-seq is a tool for deconvoluting RNA-seq data from cell multiplets (clusters of two or more cells) in order to identify physically interacting cells in a given tissue. The method requires two RNA-seq data sets from the same tissue: one of single cells to be used as a reference, and one of cell multiplets to be deconvoluted. CIM-seq is compatible both with droplet-based sequencing methods, such as Chromium Single Cell 3′ Kits from 10x Genomics, and with plate-based methods, such as Smart-seq2. The pipeline consists of three parts: 1) dissociation of the target tissue, FACS sorting of single cells and multiplets, and conventional scRNA-seq; 2) feature selection and clustering of cell types in the single-cell data set, generating a blueprint of transcriptional profiles in the given tissue; 3) computational deconvolution of multiplets through maximum likelihood estimation (MLE) to determine the most likely cell-type constituents of each multiplet.
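The deconvolution in step 3 can be illustrated with a minimal, self-contained sketch, assuming Gaussian-distributed expression and hypothetical cell-type profiles; this illustrates the MLE idea only, not the actual CIM-seq implementation:

```python
import math
from itertools import combinations_with_replacement

# Hypothetical mean expression profiles per cell type (the single-cell
# "blueprint" from step 2); columns are genes, values are illustrative.
blueprint = {
    "A": [10.0, 1.0, 1.0],
    "B": [1.0, 10.0, 1.0],
    "C": [1.0, 1.0, 10.0],
}

def log_likelihood(multiplet, expected, sigma=1.0):
    """Gaussian log-likelihood of the observed multiplet profile given the
    summed expected profile of a candidate cell-type combination."""
    return sum(-((m - e) ** 2) / (2 * sigma ** 2)
               for m, e in zip(multiplet, expected))

def deconvolute(multiplet, k=2):
    """Return the most likely combination of k cell types by exhaustive
    maximum likelihood over all combinations with replacement."""
    best, best_ll = None, -math.inf
    for combo in combinations_with_replacement(blueprint, k):
        expected = [sum(vals) for vals in zip(*(blueprint[c] for c in combo))]
        ll = log_likelihood(multiplet, expected)
        if ll > best_ll:
            best, best_ll = combo, ll
    return best

# A doublet whose profile looks like one A cell plus one C cell:
print(deconvolute([11.0, 2.0, 11.0]))  # → ('A', 'C')
```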


2021 ◽  
Author(s):  
Jannis Hoch ◽  
Edwin Sutanudjaja ◽  
Rens van Beek ◽  
Marc Bierkens

<p>Developing and applying hyper-resolution models over large extents has long been a quest in hydrological sciences. With the recent development of global-scale yet fine-resolution data sets and advances in computational power, achieving this goal becomes increasingly feasible.</p><p>We here present the development, application, and results of the novel 1 km version of PCR-GLOBWB for the period 1981 to 2020. Although it employs global data sets only, we developed, ran, and evaluated the 1 km model for the European continent only. In comparison to past versions of PCR-GLOBWB, input data were replaced with sufficiently fine data sets, for example the recent SoilGrids and MERIT-DEM data. Preliminary results indicate an improvement in model outcome when evaluating simulated discharge, evaporation, and terrestrial water storage.</p><p>Additionally, we aim to answer the question of to what extent developing hyper-resolution models is actually needed, or whether run times could be saved by using state-of-the-art hyper-resolution meteorological forcing instead. Therefore, the relative importance of model resolution and forcing resolution was cross-compared. To that end, the ERA5-Land data set was employed at different resolutions, matching the model resolutions of 1 km, 10 km, and 50 km.</p><p>Although multiple challenges still lie ahead before true hyper-resolution is achieved, this application of a 1 km model across an entire continent can form the basis for the next steps to be taken.</p>


2021 ◽  
Author(s):  
Morsy Ismail ◽  
Osama Galal ◽  
Waleed Saad

Abstract Given the circumstances the world is going through due to the novel coronavirus (Covid-19), this paper proposes a new smart system that aims to reduce the spread of the virus. The proposed Covid-19 containment system is designed to be installed outside hospitals and medical centers, and it works at night as well as in daylight. The system is based on deep learning applied to pedestrian temperature data sets collected using thermal cameras; the data set consists primarily of temperature readings of pedestrians around medical centers. The thermal cameras are paired with conventional cameras for image capture and for cross-referencing the target pedestrian against an existing central database (Big Data). If a target is positive, the system sends a text message to the potentially infected person's cell phone upon recognition. The advisory text may contain useful information such as the nearest testing or isolation facility. The proposed system is assumed to be linked with the wider network of the country's Covid-19 response efforts. The simulation results reveal that the system can achieve an average precision of 90% for fever detection among pedestrians.
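The reported 90% figure is a precision: the fraction of fever alerts that are correct. A minimal sketch of the metric, using hypothetical detections rather than the paper's data:

```python
def precision(predicted, actual):
    """Fraction of pedestrians flagged as feverish who actually had fever."""
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    flagged = sum(predicted)
    return true_pos / flagged if flagged else 0.0

# Hypothetical flags for 10 pedestrians (1 = fever flagged / fever present):
# 10 alerts were raised, 9 of them for genuinely feverish pedestrians.
predicted = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
actual    = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
print(precision(predicted, actual))  # → 0.9
```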



2012 ◽  
pp. 875-897
Author(s):  
Guo-Zheng Li

This chapter introduces the major challenges in clinical data processing and the novel machine learning techniques employed to address them. It argues that these techniques, including support vector machines, ensemble learning, feature selection, feature reuse through multi-task learning, and multi-label learning, provide potentially more substantive solutions for decision support and clinical data analysis. The authors demonstrate the generalization performance of these techniques on real-world data sets, including a brain glioma data set, a coronary heart disease data set from Chinese medicine, and several microarray tumor data sets. More machine learning techniques will continue to be developed to improve the analysis precision of clinical data sets.


2018 ◽  
Vol 154 (2) ◽  
pp. 149-155
Author(s):  
Michael Archer

1. Yearly records of worker Vespula germanica (Fabricius) taken in suction traps at Silwood Park (28 years) and at Rothamsted Research (39 years) are examined. 2. Using the autocorrelation function (ACF), a significant negative 1-year lag followed by a lesser, non-significant positive 2-year lag was found in all, or parts of, each data set, indicating an underlying population dynamic of a 2-year cycle with a damped waveform. 3. The minimum number of years before the 2-year cycle with damped waveform became evident varied between 17 and 26, or the cycle was not found at all in some data sets. 4. Ecological factors delaying or preventing the occurrence of the 2-year cycle are considered.
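The ACF signature described in point 2 can be reproduced on synthetic data: an idealized 2-year cycle yields a negative lag-1 and a positive lag-2 autocorrelation (a sketch with made-up counts, not the Silwood or Rothamsted records; a damped waveform would show a weaker lag-2 value):

```python
def acf(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return cov / var

# A synthetic 28-year record alternating high and low worker counts,
# i.e. a pure 2-year cycle.
counts = [120, 40] * 14
print(round(acf(counts, 1), 2))  # → -0.96 (negative 1-year lag)
print(round(acf(counts, 2), 2))  # → 0.93 (positive 2-year lag)
```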


2018 ◽  
Vol 21 (2) ◽  
pp. 117-124 ◽  
Author(s):  
Bakhtyar Sepehri ◽  
Nematollah Omidikia ◽  
Mohsen Kompany-Zareh ◽  
Raouf Ghavami

Aims & Scope: In this research, eight variable selection approaches were used to investigate the effect of variable selection on the predictive power and stability of CoMFA models. Materials & Methods: Three data sets, comprising 36 EPAC antagonists, 79 CD38 inhibitors, and 57 ATAD2 bromodomain inhibitors, were modelled by CoMFA. First, for each of the three data sets, a CoMFA model was created with all CoMFA descriptors; then, by applying each variable selection method, a new CoMFA model was developed, so that nine CoMFA models were built per data set. The results show that noisy and uninformative variables affect CoMFA results. Based on the created models, applying five of the variable selection approaches, namely FFD, SRD-FFD, IVE-PLS, SRD-UVE-PLS, and SPA-jackknife, increases the predictive power and stability of CoMFA models significantly. Result & Conclusion: Among them, SPA-jackknife removes most of the variables, while FFD retains most of them. FFD and IVE-PLS are time-consuming processes, whereas SRD-FFD and SRD-UVE-PLS run in a few seconds. Moreover, applying FFD, SRD-FFD, IVE-PLS, or SRD-UVE-PLS preserves CoMFA contour map information for both fields.


Author(s):  
Kyungkoo Jun

Background & Objective: This paper proposes a Fourier-transform-inspired method to classify human activities from time series sensor data. Methods: Our method begins by decomposing a 1D input signal into 2D patterns, which is motivated by the Fourier transform. The decomposition is aided by a Long Short-Term Memory (LSTM) network, which captures the temporal dependency of the signal and produces encoded sequences. These sequences, once arranged into a 2D array, can represent the fingerprints of the signals. The benefit of such a transformation is that we can exploit recent advances in deep learning models for image classification, such as Convolutional Neural Networks (CNNs). Results: The proposed model is therefore a combination of an LSTM and a CNN. We evaluate the model over two data sets. For the first data set, which is more standardized than the other, our model outperforms, or at least matches, previous works. For the second data set, we devise schemes to generate training and testing data by varying the window size, the sliding size, and the labeling scheme. Conclusion: The evaluation results show that the accuracy exceeds 95% in some cases. We also analyze the effect of these parameters on performance.
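The data-generation schemes mentioned in the Results can be sketched as a simple windowing routine; the majority-vote labeling below is one hypothetical choice among the labeling schemes such a study might compare:

```python
def sliding_windows(signal, labels, window_size, slide_size):
    """Cut a labelled 1D sensor stream into fixed-size windows, labeling
    each window by majority vote over its per-sample labels."""
    windows = []
    for start in range(0, len(signal) - window_size + 1, slide_size):
        seg = signal[start:start + window_size]
        seg_labels = labels[start:start + window_size]
        label = max(set(seg_labels), key=seg_labels.count)
        windows.append((seg, label))
    return windows

# Hypothetical accelerometer magnitudes and per-sample activity labels.
stream = [0.1, 0.2, 0.1, 0.3, 1.0, 1.2, 0.9, 1.1]
activity = ["walk"] * 4 + ["run"] * 4
wins = sliding_windows(stream, activity, window_size=4, slide_size=4)
print([lbl for _, lbl in wins])  # → ['walk', 'run']
```

Shrinking `slide_size` below `window_size` produces overlapping windows, which multiplies the amount of training data at the cost of correlated samples.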


2019 ◽  
Vol 73 (8) ◽  
pp. 893-901
Author(s):  
Sinead J. Barton ◽  
Bryan M. Hennelly

Cosmic ray artifacts may be present in all photo-electric readout systems. In spectroscopy, they present as random unidirectional sharp spikes that distort spectra and may have an effect on post-processing, possibly affecting the results of multivariate statistical classification. A number of methods have previously been proposed to remove cosmic ray artifacts from spectra, but the goal of removing the artifacts while making no other change to the underlying spectrum is challenging. One of the most successful and commonly applied methods for the removal of cosmic ray artifacts involves the capture of two sequential spectra that are compared in order to identify spikes. The disadvantage of this approach is that at least two recordings are necessary, which may be problematic for dynamically changing spectra, and which can reduce the signal-to-noise (S/N) ratio when compared with a single recording of equivalent duration, due to the inclusion of two instances of read noise. In this paper, a cosmic ray artifact removal algorithm is proposed that works in a similar way to the double acquisition method but requires only a single capture, so long as a data set of similar spectra is available. The method employs normalized covariance in order to identify a similar spectrum in the data set, from which a direct comparison reveals the presence of cosmic ray artifacts, which are then replaced with the corresponding values from the matching spectrum. The advantage of the proposed method over the double acquisition method is investigated in the context of the S/N ratio, and the method is applied to various data sets of Raman spectra recorded from biological cells.
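The single-capture idea can be sketched with the standard library: normalized covariance selects the most similar spectrum in the data set, and points far above it are treated as spikes. The median-absolute-difference threshold below is an illustrative assumption, not the authors' actual criterion:

```python
def pearson(a, b):
    """Normalized covariance (Pearson correlation) between two spectra."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a) ** 0.5
    vb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (va * vb)

def remove_spikes(spectrum, reference_set, threshold=5.0):
    """Replace cosmic-ray spikes with values from the most similar spectrum.
    A point is flagged as a spike when it exceeds the matched spectrum by
    more than `threshold` times the median absolute difference."""
    match = max(reference_set, key=lambda ref: pearson(spectrum, ref))
    diffs = sorted(abs(s - r) for s, r in zip(spectrum, match))
    mad = diffs[len(diffs) // 2] or 1e-12
    return [r if (s - r) > threshold * mad else s
            for s, r in zip(spectrum, match)]

clean = [1, 2, 3, 4, 5, 4, 3, 2] * 2     # a toy periodic spectrum
spiked = list(clean)
spiked[4] += 50                           # simulated cosmic ray spike
refs = [clean, [5, 4, 3, 2, 1, 2, 3, 4] * 2]
print(remove_spikes(spiked, refs) == clean)  # → True
```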


2013 ◽  
Vol 756-759 ◽  
pp. 3652-3658
Author(s):  
You Li Lu ◽  
Jun Luo

In the context of kernel methods, this paper puts forward two improved algorithms, R-SVM and I-SVDD, to cope with imbalanced data sets in closed systems. R-SVM uses the K-means algorithm to cluster samples in feature space, while I-SVDD improves the performance of the original SVDD through imbalanced sample training. Experiments on two system call data sets show that the two algorithms are more effective and that R-SVM has lower complexity.
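The clustering step can be illustrated with a plain k-means undersampling sketch (1-D, stdlib only); this shows the general idea of clustering-based rebalancing, not the paper's R-SVM implementation:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on 1-D points; returns the k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[idx].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

# Hypothetical imbalanced set: 12 majority samples vs 3 minority samples.
majority = [0.9, 1.0, 1.1, 1.2, 4.8, 5.0, 5.2, 5.1, 9.0, 9.1, 8.9, 9.2]
minority = [2.0, 2.1, 1.9]

# Replace the majority class by 3 cluster centroids to balance the classes.
reduced = kmeans(majority, k=3)
print(len(reduced), len(minority))  # → 3 3
```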

