Framework for Parallel Preprocessing of Microarray Data Using Hadoop

2018 ◽  
Vol 2018 ◽  
pp. 1-9 ◽  
Author(s):  
Amirhossein Sahlabadi ◽  
Ravie Chandren Muniyandi ◽  
Mahdi Sahlabadi ◽  
Hossein Golshanbafghy

Microarray technology has become a popular way to study gene expression and to diagnose disease. The National Center for Biotechnology Information (NCBI) hosts public databases containing large volumes of biological data that must be preprocessed because they carry high levels of noise and bias. Robust Multiarray Average (RMA) is one of the standard methods used to preprocess the data and remove noise. Most preprocessing algorithms are time-consuming and cannot handle large numbers of datasets with thousands of experiments. Parallel processing can address these issues. Hadoop is a well-known distributed storage and processing framework that provides a parallel environment in which to run the experiment. In this research, for the first time, the capability of Hadoop and the statistical power of R are leveraged to parallelize RMA so that microarray data can be processed efficiently. The experiment was run on a cluster of 5 nodes, each with 16 cores and 16 GB of memory. It compares the efficiency and performance of RMA parallelized with Hadoop against RMA parallelized with the affyPara package and against sequential RMA. The results show that the proposed approach achieves a higher speed-up than the affyPara approach and outperforms sequential RMA.
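RMA is commonly described as three steps: background correction, quantile normalization, and median-polish summarization. The abstract does not say how these steps are split across Hadoop nodes, so the following is only a minimal single-machine sketch of the quantile normalization step, written in plain Python rather than R. The function name `quantile_normalize` and the toy intensities are assumptions made for illustration, and ties are broken by sort order rather than averaged.

```python
from statistics import mean


def quantile_normalize(arrays):
    """Quantile-normalize equal-length intensity lists (one list per array).

    Hypothetical helper for illustration; not the authors' implementation.
    """
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same number of probes")

    # Sort each array, then average the sorted values rank by rank.
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [mean(a[rank] for a in sorted_arrays) for rank in range(n)]

    normalized = []
    for a in arrays:
        # order[rank] is the index of the probe that holds that rank.
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Toy data: two arrays with three probes each (made-up values).
arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
result = quantile_normalize(arrays)
```

After normalization every array contains the same set of values, which is the property that makes arrays comparable. The rank-wise mean is the one quantity in this step that depends on all arrays at once.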

2012 ◽  
Vol 9 (1) ◽  
pp. 32-43 ◽  
Author(s):  
Jinlu Cai ◽  
Henry L. Keen ◽  
Curt D. Sigmund ◽  
Thomas L. Casavant

Summary Microarrays have been widely used to study differential gene expression at the genomic level. They can also provide genome-wide co-expression information. Biologically related datasets from independent studies are publicly available, which calls for robust combined approaches for integration and validation. Previously, meta-analysis has been adopted to solve this problem. As an alternative to meta-analysis, for microarray data with high similarity in biological experimental design, a more direct combined approach is possible. Gene-level normalization across datasets is motivated by the different scale and distribution of data due to separate origins. However, there has been limited discussion about this point in the past. Here we describe a combined approach for microarray analysis, including gene-level normalization and the Coex-Rank approach. After normalization, a linear modeling process is used to identify lists of differentially expressed genes. The Coex-Rank approach incorporates co-expression information into a rank-aggregation procedure. We applied this computational approach to our data, which illustrated an improvement in statistical power and a complementary advantage of the Coex-Rank approach from a biological perspective. Our combined approach for microarray data analysis (Coex-Rank) is based on normalization, which is naturally driven. The Coex-Rank process not only takes advantage of merging the power of multiple methods regarding normalization but also assists in the discovery of functional clusters of genes.


2014 ◽  
Vol 74 (2) ◽  
pp. 483-488
Author(s):  
R. Campitelli-Ramos ◽  
JV. Lucca ◽  
LLD. Oliveira ◽  
MR. Marchese ◽  
O. Rocha

Annelid worms represent a significant part of freshwater benthic communities worldwide, and Oligochaeta is a particularly species-rich group. Dero (A) bimagnasetus (Naididae), previously found and described from a small marsh in Surinam in 1974, has now been found for the first time in Barra Lake, MG, Brazil. Because biological data are scarce and ecological information on this species is absent from the literature, we present morphological information on the specimens obtained and the physical and chemical characteristics of the habitat in which they were found. This species occurred only in the littoral zone of Barra Lake, in muddy sediment with low oxygen, low conductivity and low organic matter. The four individuals collected ranged from 3.17 to 4.15 mm in total length, 0.25 to 0.26 mm in body width and 0.16 to 0.21 mm³ in total volume. Considering the present anthropic pressures on freshwater biota and fast biodiversity losses worldwide, it is now recognized that attention must be paid to low-abundance species and to the urgent preservation of their habitats.


2020 ◽  
Vol 50 (3) ◽  
pp. 252-255
Author(s):  
Nathalie CITELI ◽  
Mariana DE-CARVALHO ◽  
Reuber BRANDÃO

ABSTRACT The rare Amazonian snake Eutrachelophis papilio is known from only five individuals, from four localities, belonging to its type-series, the most recent of which was collected over 10 years ago. Here, we expand its distribution and describe its color in life for the first time. We also provide an estimate of its distribution area using the minimum convex polygon method and identify the values of anthropic pressure within its known distribution range with the Human Footprint Index. The new occurrence is located 291 km from the nearest known locality and its distribution is associated with pristine forests. Considering its rarity, and the absence of demographic and biological data, we suggest that the species should be classified as Data Deficient by IUCN criteria.
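The minimum convex polygon method estimates a distribution area as the area of the convex hull of the occurrence points. A minimal sketch follows, assuming the localities have already been projected to planar coordinates (for example, kilometres in an equal-area projection). The coordinates are made up for illustration and are not the species' records, and the helper names `convex_hull` and `polygon_area` are hypothetical.

```python
def convex_hull(points):
    """Return hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when the turn o -> a -> b is counter-clockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


# Made-up planar coordinates for five localities; one lies inside the hull.
localities = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1)]
hull = convex_hull(localities)
area = polygon_area(hull)
```

Interior points do not change the estimate, so a single new record far from the others can enlarge the polygon substantially.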


2008 ◽  
Vol 295 (6) ◽  
pp. R1914-R1920 ◽  
Author(s):  
Charles Hindmarch ◽  
Mark Fry ◽  
Song T. Yao ◽  
Pauline M. Smith ◽  
David Murphy ◽  
...  

We have employed microarray technology using Affymetrix 230 2.0 genome chips to initially catalog the transcriptome of the subfornical organ (SFO) under control conditions and to also evaluate the changes (common and differential) in gene expression induced by the challenges of fluid and food deprivation. We have identified a total of 17,293 genes tagged as present in one of our three experimental conditions; these transcripts were then used as the basis for further filtering and statistical analysis. In total, the expression of 46 genes was changed in the SFO following dehydration compared with control animals (22 upregulated and 24 downregulated), with the largest change being the greater than fivefold increase in brain-derived neurotrophic factor (BDNF) expression, while significant changes in the expression of the calcium-sensing (upregulated) and apelin (downregulated) receptors were also reported. In contrast, food deprivation caused greater than twofold changes in a total of 687 transcripts (222 upregulated and 465 downregulated), including significant reductions in vasopressin, oxytocin, promelanin concentrating hormone, cocaine amphetamine-related transcript (CART), and the endothelin type B receptor, as well as increases in the expression of the GABAB receptor. Of these regulated transcripts, we identified 37 that are commonly regulated by fasting and dehydration, nine that are uniquely regulated by dehydration, and 650 that are uniquely regulated by fasting. We also found five transcripts that were differentially regulated by fasting and dehydration, including BDNF and CART. In these studies we have for the first time described the transcriptome of the rat SFO and have in addition identified genes whose expression is significantly modified by either water or food deprivation.


2019 ◽  
pp. 77-92
Author(s):  
Chad West ◽  
C. Michael Palmer ◽  
Michael Grace ◽  
Daniel Fabricius

How does one take a concert band snare drummer, classically trained pianist, orchestral bass player, and self-taught guitar player and turn them into a jazz rhythm section? The drummer has never had so many drums and cymbals to worry about, the pianist may be playing with a group for the very first time, the bass player has to learn to “walk” a bass line, and the guitar player has to play in foreign keys. This chapter addresses the teaching of the rhythm section with regard to (a) rhythm section notation, (b) bass, (c) piano, (d) guitar, (e) drum set, (f) auxiliary instruments, and (g) rhythm section rehearsal strategies. It presents a sequential approach to teaching the beginning rhythm section: (a) walking bass lines, (b) voicing chords, (c) comping patterns, (d) playing setups and fills, and (e) interpreting and realizing instrument-specific rhythm section notation markings.


2002 ◽  
Vol 85 (4) ◽  
pp. 906-910 ◽  
Author(s):  
Sufian F Al-Khaldi ◽  
Scott A Martin ◽  
Avraham Rasooly ◽  
Jeff D Evans

Abstract Microarray analysis is an emerging technology that has the potential to become a leading trend in bacterial identification in food and feed improvement. The technology uses fluorescent-labeled probes amplified from bacterial samples that are then hybridized to thousands of DNA sequences immobilized on chemically modified glass slides. The whole gene or open reading frame(s) is represented either by a polymerase chain reaction fragment of double-stranded DNA of approximately 1000 base pairs (bp) or by single-stranded oligonucleotides of 20–70 bp. The technology can be used to identify bacteria and to study gene expression in complex microbial populations, such as those found in food and gastrointestinal tracts. Data generated by microarray analysis can potentially be used to improve the safety of our food supply as well as to ensure the efficiency of animal feed conversion to human food, e.g., in meat and milk production by ruminants. This minireview addresses the use of microarray technology in bacterial identification and gene expression in different microbial systems and in habitats containing mixed populations of bacteria.


2015 ◽  
Vol 138 (1) ◽  
Author(s):  
Daniel Maier ◽  
Corinna Hager ◽  
Hartmut Hetzler ◽  
Nicolas Fillot ◽  
Philippe Vergne ◽  
...  

In order to obtain a fast solution scheme, the trajectory piecewise linear (TPWL) method is applied to the transient elastohydrodynamic (EHD) line contact problem for the first time. TPWL approximates the nonlinearity of a dynamical system by a weighted superposition of reduced linearized systems along specified trajectories. The method is compared to another reduced order model (ROM), based on Galerkin projection, Newton–Raphson scheme and an approximation of the nonlinear reduced system functions. The TPWL model provides further speed-up compared to the Newton–Raphson based method at a high accuracy.
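The weighted superposition can be written out for a generic first-order system. The notation below is an assumption made for illustration (state $x$, input $u$, linearization points $x_i$, weights $w_i$, projection basis $V$) and is not taken from the paper, whose EHD formulation may differ.

```latex
% Generic nonlinear system (assumed form)
\dot{x} = f(x) + B u

% TPWL: weighted sum of linearizations at points x_i along training trajectories
\dot{x} \approx \sum_{i=1}^{s} w_i(x) \left[ f(x_i) + A_i (x - x_i) \right] + B u,
\qquad A_i = \left. \frac{\partial f}{\partial x} \right|_{x = x_i},
\qquad \sum_{i=1}^{s} w_i(x) = 1

% Reduced model after projection x \approx V z with V^{\top} V = I
\dot{z} \approx \sum_{i=1}^{s} w_i(z) \left[ V^{\top} f(x_i) + V^{\top} A_i (V z - x_i) \right] + V^{\top} B u
```

Because the terms $V^{\top} f(x_i)$ and $V^{\top} A_i V$ can be precomputed, evaluating the reduced model needs no Newton iteration on the full nonlinear system, which is one plausible source of the reported speed-up.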


2020 ◽  
Author(s):  
Fulei Ji ◽  
Wentao Zhang ◽  
Tianyou Ding

Abstract Automatic search methods have been widely used for cryptanalysis of block ciphers, especially for the most classic methods, differential and linear cryptanalysis. However, automatic search methods, whether based on MILP, SMT/SAT or CP techniques, can be inefficient when the search space is too large. In this paper, we propose three new methods to improve Matsui’s branch-and-bound search algorithm, which is known as the first generic algorithm for finding the best differential and linear trails. The three methods, namely reconstructing the DDT and LAT according to weight, executing linear layer operations at minimal cost, and merging two 4-bit S-boxes into one 8-bit S-box, can efficiently speed up the search process by reducing the search space as much as possible and reducing the cost of executing linear layer operations. We apply our improved algorithm to DESL and GIFT, which are still hard instances for automatic search methods. As a result, we find the best differential trails for DESL (up to 14 rounds) and GIFT-128 (up to 19 rounds). The best linear trails for DESL (up to 16 rounds), GIFT-128 (up to 10 rounds) and GIFT-64 (up to 15 rounds) are also found. To the best of our knowledge, these security bounds for DESL and GIFT in the single-key scenario are given for the first time. They are also the longest exploitable (differential or linear) trails for DESL and GIFT. Furthermore, benefiting from the efficiency of the improved algorithm, we run experiments to demonstrate that the clustering effect of differential trails is weak for both 13-round DES and DESL.
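The abstract does not spell out how the DDT is reconstructed according to weight. One plausible reading is that the transitions of the difference distribution table are grouped by weight, so that a branch-and-bound search can try low-weight (high-probability) transitions first and prune early. The sketch below illustrates that reading only. The 4-bit S-box is an arbitrary permutation chosen for illustration, and `difference_distribution_table` and `transitions_by_weight` are hypothetical helper names, not the authors' code.

```python
from math import log2

# Illustrative 4-bit S-box (a permutation of 0..15); an assumption, not taken from the paper.
SBOX = [0x1, 0xA, 0x4, 0xC, 0x6, 0xF, 0x3, 0x9,
        0x2, 0xD, 0xB, 0x7, 0x5, 0x0, 0x8, 0xE]


def difference_distribution_table(sbox):
    """ddt[dx][dy] counts inputs x with sbox[x] ^ sbox[x ^ dx] == dy."""
    n = len(sbox)
    ddt = [[0] * n for _ in range(n)]
    for x in range(n):
        for dx in range(n):
            dy = sbox[x] ^ sbox[x ^ dx]
            ddt[dx][dy] += 1
    return ddt


def transitions_by_weight(ddt):
    """Group possible (dx, dy) transitions by weight = -log2(probability)."""
    n = len(ddt)
    groups = {}
    for dx in range(n):
        for dy in range(n):
            count = ddt[dx][dy]
            if count:
                weight = log2(n / count)  # equals -log2(count / n)
                groups.setdefault(weight, []).append((dx, dy))
    # Ascending weight, so the most probable transitions come first.
    return dict(sorted(groups.items()))


ddt = difference_distribution_table(SBOX)
groups = transitions_by_weight(ddt)
weights = list(groups)
```

Iterating over `groups` in order lets a search stop as soon as the accumulated weight exceeds the best bound found so far, instead of scanning every table entry.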


1968 ◽  
Vol 46 (5) ◽  
pp. 981-985 ◽  
Author(s):  
Ian R. Ball ◽  
C. H. Fernando

Urceolaria mitra (von Sieb.) is described for the first time from North America, and new host and geographic records are given. Brief biological data on host specificity, distribution and dispersal, and survival of the epizoite in temporary waters are also provided.


Zootaxa ◽  
2019 ◽  
Vol 4674 (2) ◽  
pp. 283-295 ◽  
Author(s):  
ŁUKASZ PRZYBYŁOWICZ ◽  
VINCENT MAICHER ◽  
GYULA M. LÁSZLÓ ◽  
SZABOLCS SÁFIÁN ◽  
ROBERT TROPEK

Amerila is one of the most studied Afrotropical genera of Arctiinae. However, based on a regionally constrained sample of specimens from Mount Cameroon, we show how superficial our knowledge of these tiger moths is. Among six collected Amerila species, the female of A. femina is described here for the first time, and A. mulleri and A. roseomarginata had never been recorded before in the country. Moreover, novel biological data are presented, including individual species’ elevational ranges. Finally, female reproductive organs of the genus are illustrated here for the first time. The value of such regional studies is highlighted, with some remarks on the requirements of such small-scale field sampling.

