Identifying emerging phenomenon in long temporal phenotyping experiments

Abstract Motivation: The rapid improvement of phenotyping capability, accuracy and throughput has greatly increased the volume and diversity of phenomics data. A remaining challenge is to find an efficient way to identify phenotypic patterns, in order to improve our understanding of the quantitative variation of complex phenotypes and to attribute gene functions. To address this challenge, we developed a new algorithm to identify emerging phenomena from large-scale temporal plant phenotyping experiments. An emerging phenomenon is defined as a group of genotypes that exhibit a coherent phenotype pattern during a relatively short time. Emerging phenomena are highly transient and diverse, and depend in complex ways on both environmental conditions and development. Identifying emerging phenomena may help biologists to examine potential relationships among phenotypes and genotypes in a genetically diverse population and to associate such relationships with changes in environment or development. Results: We present an emerging phenomenon identification tool called Temporal Emerging Phenomenon Finder (TEP-Finder). Using large-scale longitudinal phenomics data as input, TEP-Finder first encodes the complicated phenotypic patterns into a dynamic phenotype network. Then, emerging phenomena at different temporal scales are identified from the dynamic phenotype network using a maximal-clique-based approach. Meanwhile, a directed acyclic network of emerging phenomena is composed to model the relationships among the emerging phenomena. An experiment comparing TEP-Finder with two state-of-the-art algorithms shows that the emerging phenomena identified by TEP-Finder are more functionally specific, robust and biologically significant. Availability and implementation: The source code, manual and sample data of TEP-Finder are all available at: http://phenomics.uky.edu/TEP-Finder/. Supplementary information: Supplementary data are available at Bioinformatics online.
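The abstract does not spell out how the dynamic phenotype network is built or which clique algorithm TEP-Finder uses, so the following is only a minimal sketch of the general idea under stated assumptions: genotypes are linked within a time window when their phenotype values stay within a tolerance of each other, and maximal cliques are then enumerated with the standard Bron-Kerbosch procedure. The names `build_snapshot` and `maximal_cliques`, the threshold, and the toy measurements are all hypothetical and are not taken from TEP-Finder.

```python
from itertools import combinations

def build_snapshot(series, start, width, threshold):
    """Link two genotypes if their values differ by at most `threshold`
    at every time point in the window [start, start + width).
    (Assumed similarity rule, for illustration only.)"""
    graph = {g: set() for g in series}
    for a, b in combinations(series, 2):
        window_a = series[a][start:start + width]
        window_b = series[b][start:start + width]
        if max(abs(x - y) for x, y in zip(window_a, window_b)) <= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def maximal_cliques(graph):
    """Enumerate all maximal cliques with Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-processed vertices
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques

# Hypothetical phenotype measurements for four genotypes at four time points.
series = {
    "g1": [0.50, 0.52, 0.90, 0.91],
    "g2": [0.51, 0.50, 0.60, 0.62],
    "g3": [0.49, 0.51, 0.88, 0.93],
    "g4": [0.80, 0.82, 0.61, 0.60],
}

for start in (0, 2):
    snapshot = build_snapshot(series, start, width=2, threshold=0.05)
    groups = [sorted(c) for c in maximal_cliques(snapshot) if len(c) >= 2]
    print(f"window starting at t={start}:", sorted(groups))
```

In this toy data the coherent groups differ between the two windows, which is the kind of transient grouping the abstract calls an emerging phenomenon. Linking cliques across windows into a directed acyclic network is not attempted here.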

Identifying Emerging Phenomenon in Plant Long Temporal Phenotyping Experiments

10.1101/454686 ◽ 2018

Author(s): Jiajie Peng ◽ Junya Lu ◽ Donghee Hoh ◽ Ayesha S Dina ◽ Xuequn Shang ◽ ...

Keyword(s): Large Scale ◽ Maximal Clique ◽ Plant Phenotyping ◽ Rapid Improvement ◽ Complex Phenotypes ◽ Acyclic Network ◽ Sample Data ◽ Short Time ◽ Phenotype Network ◽ Identification Tool

Abstract: The rapid improvement of phenotyping capability, accuracy, and throughput has greatly increased the volume and diversity of phenomics data. A remaining challenge is to find an efficient way to identify phenotypic patterns, in order to improve our understanding of the quantitative variation of complex phenotypes and to attribute gene functions. To address this challenge, we developed a new algorithm to identify emerging phenomena from large-scale temporal plant phenotyping experiments. An emerging phenomenon is defined as a group of genotypes that exhibit a coherent phenotype pattern during a relatively short time. Emerging phenomena are highly transient and diverse, and depend in complex ways on both environmental conditions and development. Identifying emerging phenomena may help biologists to examine potential relationships among phenotypes and genotypes in a genetically diverse population and to associate such relationships with changes in environment or development. We present an emerging phenomenon identification tool called Temporal Emerging Phenomenon Finder (TEP-Finder). Using large-scale longitudinal phenomics data as input, TEP-Finder first encodes the complicated phenotypic patterns into a dynamic phenotype network. Then, emerging phenomena at different temporal scales are identified from the dynamic phenotype network using a maximal-clique-based approach. Meanwhile, a directed acyclic network of emerging phenomena is composed to model the relationships among the emerging phenomena. An experiment comparing TEP-Finder with two state-of-the-art algorithms shows that the emerging phenomena identified by TEP-Finder are more functionally specific, robust, and biologically significant. The source code, manual, and sample data of TEP-Finder are all available at: http://phenomics.uky.edu/TEP-Finder/.

Engineering in the time of cholera: overcoming institutional and political challenges to rebuild Zimbabwe's water and sanitation infrastructure in the aftermath of the 2008 cholera epidemic

Journal of Water Sanitation and Hygiene for Development ◽ 10.2166/washdev.2013.143 ◽ 2013 ◽ Vol 3 (2) ◽ pp. 222-229 ◽ Cited By ~ 2

Author(s): Clarissa Brocklehurst ◽ Murtaza Malik ◽ Kiwe Sebunya ◽ Peter Salama

Keyword(s): Urban Areas ◽ Large Scale ◽ Humanitarian Aid ◽ Urban Systems ◽ Political Crisis ◽ Rapid Improvement ◽ Cholera Epidemic ◽ Rapid Action ◽ Water And Sanitation Infrastructure ◽ Urban Rehabilitation

A devastating cholera epidemic swept Zimbabwe in 2008, causing over 90,000 cases, and leaving more than 4,000 dead. The epidemic raged predominantly in urban areas, and the cause could be traced to the slow deterioration of Zimbabwe's water and sewerage utilities during the economic and political crisis that had gripped the country since the late 1990s. Rapid improvement was needed if the country was to avoid another cholera outbreak. In this context, donors, development agencies and government departments joined forces to work in a unique partnership, and to implement a programme of swift improvements that went beyond emergency humanitarian aid but did not require the time or massive investment associated with full-scale urban rehabilitation. The interventions ranged from supply of water treatment chemicals and sewer rods to advocacy and policy advice. The authors analyse the factors that made the programme effective and the challenges that partners faced. The case of Zimbabwe offers valuable lessons for other countries transitioning from emergency to development, and particularly those that need to take rapid action to upgrade failing urban systems. It illustrates that there is a ‘middle path’ between short-term humanitarian aid delivered in urban areas and large-scale urban rehabilitation, which can provide timely and highly effective results.

Towards Control of an Estuary

Water Science & Technology ◽ 10.2166/wst.1987.0077 ◽ 1987 ◽ Vol 19 (9) ◽ pp. 155-174

Author(s): Henk L. F. Saeijs

Keyword(s): Shallow Water ◽ Storm Surge ◽ Salt Marshes ◽ Large Scale ◽ Western Europe ◽ Oxygen Depletion ◽ Negative Effects ◽ Intertidal Flats ◽ Environmental Consequences ◽ Short Time

The Delta Project is in its final stage. In 1974 it was subjected to political reconsideration, but it is scheduled now for completion in 1987. The final touches are being put to the storm-surge barrier and two compartment dams that divide the Oosterschelde into three areas: one tidal, one with reduced tide, and one a freshwater lake. Compartmentalization will result in 13% of channels, 45% of intertidal flats and 59% of salt marshes being lost. There is a net gain of 7% of shallow-water areas. Human interventions with large-scale impacts are not new in the Oosterschelde, but the large scale and short time in which these interventions are taking place are, as is the creation of a controlled tidal system. This article focusses on the area with reduced tide and compares present-day and expected characteristics. In this reduced tidal part salt marshes will extend by 30–70%; intertidal flats will erode to a lower level and at their edges, and the area of shallow water will increase by 47%. Biomass production on the intertidal flats will decrease, with consequences for crustaceans, fishes and birds. The maximum number of waders counted on one day and the number of ‘bird-days’ will decrease drastically, with negative effects for the wader populations of western Europe. The net area with a hard substratum in the reduced tidal part has more than doubled. Channels will become shallower. Detritus import will not change significantly. Stratification and oxygen depletion will be rare and local. The operation of the storm-surge barrier and the closure strategy chosen are very important for the ecosystem. Two optional closure strategies can be followed without any additional environmental consequences. It was essential to determine a clearly defined plan of action for the whole area, and to make land-use choices from the outset. How this was done is briefly described.

Environmental Engineering Techniques to Restore Degraded Posidonia oceanica Meadows

Water ◽ 10.3390/w13050661 ◽ 2021 ◽ Vol 13 (5) ◽ pp. 661

Author(s): Luigi Piazzi ◽ Stefano Acunto ◽ Francesca Frau ◽ Fabrizio Atzori ◽ Maria Francesca Cinti ◽ ...

Keyword(s): Large Scale ◽ Natural Habitat ◽ Cost Benefit ◽ Cost Effective ◽ Posidonia Oceanica ◽ Environmental Engineering ◽ Bottom Surface ◽ Sessile Invertebrates ◽ Set Up ◽ Short Time

Seagrass planting techniques have been shown to be an effective tool for restoring degraded meadows and ecosystem function. In the Mediterranean Sea, most restoration efforts have been directed at the endemic seagrass Posidonia oceanica, but cost-benefit analyses have shown unpromising results. This study aimed to evaluate the effectiveness of environmental engineering techniques generally employed in terrestrial systems for restoring P. oceanica meadows: two different restoration efforts were considered, one using non-degradable mats and the other, for the first time, degradable mats. Both provided encouraging results, as the loss of transplanting plots was null or very low and the survival of cuttings stabilized at about 50%. The data collected can be considered positive, as the surviving cuttings are enough to allow the future spread of the patches. The techniques provided a cost-effective restoration tool that is likely affordable for large-scale projects, as they allowed a wide bottom surface to be prepared for restoration in a relatively short time without any particularly expensive device. Moreover, the mats, compared with other anchoring methods, enhanced colonization by other organisms such as macroalgae and sessile invertebrates, contributing to the generation of a natural habitat.

Synthesis of Metal Organic Frameworks by Ball-Milling

Crystals ◽ 10.3390/cryst11010015 ◽ 2020 ◽ Vol 11 (1) ◽ pp. 15

Author(s): Cheng-An Tao ◽ Jian-Fang Wang

Keyword(s): Ball Milling ◽ Large Scale ◽ Raw Materials ◽ Metal Organic Frameworks ◽ Research Topic ◽ Mass Ratio ◽ Time Consumption ◽ Metal Organic ◽ Short Time ◽ Made In

Metal-organic frameworks (MOFs) have been used in adsorption, separation, catalysis, sensing, photo/electro/magnetics, and biomedical fields because of their unique periodic pore structure and excellent properties, and have become a hot research topic in recent years. Ball milling is a low-pollution, fast method suited to the large-scale synthesis of MOFs, and many important advances have been made in recent years. In this paper, the factors influencing MOFs synthesized by grinding are reviewed systematically from four aspects: auxiliary additives, metal sources, organic linkers, and reaction-specific conditions (such as frequency, reaction time, and mass ratio of ball and raw materials). Prospects for the future development of MOF synthesis by grinding are proposed.

TreeMerge: a new method for improving the scalability of species tree estimation methods

Bioinformatics ◽ 10.1093/bioinformatics/btz344 ◽ 2019 ◽ Vol 35 (14) ◽ pp. i417-i426 ◽ Cited By ~ 7

Author(s): Erin K Molloy ◽ Tandy Warnow

Keyword(s): Large Scale ◽ Species Tree ◽ New Method ◽ Divide And Conquer ◽ Supplementary Information ◽ Estimation Methods ◽ Running Time ◽ Tree Estimation ◽ Computationally Intensive ◽ A Minor

Abstract Motivation: At RECOMB-CG 2018, we presented NJMerge and showed that it could be used within a divide-and-conquer framework to scale computationally intensive methods for species tree estimation to larger datasets. However, NJMerge has two significant limitations: it can fail to return a tree and, when used within the proposed divide-and-conquer framework, has O(n^5) running time for datasets with n species. Results: Here we present a new method called ‘TreeMerge’ that improves on NJMerge in two ways: it is guaranteed to return a tree and it has dramatically faster running time within the same divide-and-conquer framework, only O(n^2) time. We use a simulation study to evaluate TreeMerge in the context of multi-locus species tree estimation with two leading methods, ASTRAL-III and RAxML. We find that the divide-and-conquer framework using TreeMerge has a minor impact on species tree accuracy, dramatically reduces running time, and enables both ASTRAL-III and RAxML to complete on datasets (that they would otherwise fail on), when given 64 GB of memory and 48 h maximum running time. Thus, TreeMerge is a step toward a larger vision of enabling researchers with limited computational resources to perform large-scale species tree estimation, which we call Phylogenomics for All. Availability and implementation: TreeMerge is publicly available on Github (http://github.com/ekmolloy/treemerge). Supplementary information: Supplementary data are available at Bioinformatics online.
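To give a rough sense of the gap between the two running-time bounds quoted in the abstract, the ratio can be worked out directly. This ignores constant factors and lower-order terms, which the abstract does not state, and the dataset size n = 1000 is an assumption chosen only for illustration.

```latex
\[
\frac{n^{5}}{n^{2}} = n^{3},
\qquad\text{so for } n = 1000:\quad
\frac{(10^{3})^{5}}{(10^{3})^{2}} = \frac{10^{15}}{10^{6}} = 10^{9}.
\]
```

The asymptotic bounds therefore differ by a factor that itself grows as the cube of the number of species; actual wall-clock differences depend on constants that are not given here.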

Genome-wide inferring gene–phenotype relationship by walking on the heterogeneous network

Bioinformatics ◽ 10.1093/bioinformatics/btq108 ◽ 2010 ◽ Vol 26 (9) ◽ pp. 1219-1224 ◽ Cited By ~ 238

Author(s): Yongjin Li ◽ Jagdish C. Patra

Keyword(s): Heterogeneous Network ◽ Gene Network ◽ Genetic Diseases ◽ Supplementary Information ◽ Disease Genes ◽ Phenotypic Data ◽ Disease Associations ◽ Improved Performance ◽ Leave One Out ◽ Phenotype Network

Abstract Motivation: Clinical diseases are characterized by distinct phenotypes. To identify disease genes is to elucidate the gene–phenotype relationships. Mutations in functionally related genes may result in similar phenotypes. It is reasonable to predict disease-causing genes by integrating phenotypic data and genomic data. Some genetic diseases are genetically or phenotypically similar. They may share common pathogenetic mechanisms. Identifying the relationship between diseases will facilitate better understanding of the pathogenetic mechanism of diseases. Results: In this article, we constructed a heterogeneous network by connecting the gene network and phenotype network using the phenotype–gene relationship information from the OMIM database. We extended the random walk with restart algorithm to the heterogeneous network. The algorithm prioritizes the genes and phenotypes simultaneously. We used leave-one-out cross-validation to evaluate its ability to find gene–phenotype relationships. Results showed improved performance over previous work. We also used the algorithm to disclose hidden disease associations that cannot be found by the gene network or phenotype network alone. We identified 18 hidden disease associations, most of which were supported by literature evidence. Availability: The MATLAB code of the program is available at http://www3.ntu.edu.sg/home/aspatra/research/Yongjin_BI2010.zip Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
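The core iteration of random walk with restart can be sketched in a few lines. The version below is a deliberate simplification and not the authors' method: it treats gene and phenotype nodes as one unweighted graph with uniform transition probabilities, whereas the paper builds a dedicated transition matrix for the heterogeneous network that the abstract does not describe. The function name, the restart probability of 0.7, and the toy network are all assumptions made for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence,
    where W spreads each node's probability uniformly over its neighbours."""
    nodes = list(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            # A node without neighbours sends its probability back to the seeds.
            targets = adjacency[u] or seeds
            share = (1.0 - restart) * p[u] / len(targets)
            for v in targets:
                nxt[v] += share
        if sum(abs(nxt[n] - p[n]) for n in nodes) < tol:
            return nxt
        p = nxt
    return p

# Hypothetical toy network: genes G1..G4, phenotypes P1 and P2.
# Edges are undirected, so each one appears in both adjacency sets.
network = {
    "G1": {"G2", "P1"},
    "G2": {"G1", "G3"},
    "G3": {"G2", "G4", "P2"},
    "G4": {"G3"},
    "P1": {"G1", "P2"},
    "P2": {"P1", "G3"},
}

scores = random_walk_with_restart(network, seeds={"P1"})
ranked_genes = sorted((n for n in scores if n.startswith("G")),
                      key=scores.get, reverse=True)
print(ranked_genes)
```

Seeding the walk at a phenotype and ranking genes by their steady-state probability is the basic prioritization idea; nodes closer to the seed in the combined network receive higher scores.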

EARRINGS: an efficient and accurate adapter trimmer entails no a priori adapter sequences

Bioinformatics ◽ 10.1093/bioinformatics/btab025 ◽ 2021

Author(s): Ting-Hsuan Wang ◽ Cheng-Ching Huang ◽ Jui-Hung Hung

Keyword(s): Open Source Software ◽ Large Scale ◽ A Priori ◽ Supplementary Information ◽ Supplementary Data ◽ Comparable Accuracy ◽ Meta Analyses ◽ Next Generation Sequencing Ngs ◽ Adapter Trimming ◽ Generation Sequencing

Abstract Motivation: Cross-sample comparisons or large-scale meta-analyses based on next generation sequencing (NGS) involve replicable and universal data preprocessing, including removing adapter fragments in contaminated reads (i.e. adapter trimming). Modern adapter trimmers require users to provide candidate adapter sequences for each sample, which are sometimes unavailable or falsely documented in the repositories (such as GEO or SRA); large-scale meta-analyses are therefore jeopardized by suboptimal adapter trimming. Results: Here we introduce a set of fast and accurate adapter detection and trimming algorithms that entail no a priori adapter sequences. These algorithms were implemented in modern C++ with SIMD and multithreading for speed. Our experiments and benchmarks show that the implementation (i.e. EARRINGS), without being given any hint of adapter sequences, can reach comparable accuracy and higher throughput than existing adapter trimmers. EARRINGS is particularly useful in meta-analyses of a large batch of datasets and can be incorporated into sequence analysis pipelines at any scale. Availability and implementation: EARRINGS is open-source software and is available at https://github.com/jhhung/EARRINGS. Supplementary information: Supplementary data are available at Bioinformatics online.

TADOSS: computational estimation of tandem domain swap stability

Bioinformatics ◽ 10.1093/bioinformatics/bty974 ◽ 2018 ◽ Vol 35 (14) ◽ pp. 2507-2508 ◽ Cited By ~ 2

Author(s): Aleix Lafita ◽ Pengfei Tian ◽ Robert B Best ◽ Alex Bateman

Keyword(s): Large Scale ◽ Domain Swapping ◽ Coarse Grained ◽ Supplementary Information ◽ Simulation Studies ◽ Computational Tools ◽ Domain Swap ◽ Computational Estimation ◽ High Propensity ◽ The Stability

Abstract Summary: Proteins with highly similar tandem domains have shown an increased propensity for misfolding and aggregation. Several molecular explanations have been put forward, such as swapping of adjacent domains, but there is a lack of computational tools to systematically analyze them. We present the TAndem DOmain Swap Stability predictor (TADOSS), a method to computationally estimate the stability of tandem domain-swapped conformations from the structures of single domains, based on previous coarse-grained simulation studies. The tool is able to discriminate domains susceptible to domain swapping and to identify structural regions with high propensity to form hinge loops. TADOSS is a scalable method and suitable for large-scale analyses. Availability and implementation: Source code and documentation are freely available under an MIT license on GitHub at https://github.com/lafita/tadoss. Supplementary information: Supplementary data are available at Bioinformatics online.

GWASpro: a high-performance genome-wide association analysis server

Bioinformatics ◽ 10.1093/bioinformatics/bty989 ◽ 2018 ◽ Vol 35 (14) ◽ pp. 2512-2514 ◽ Cited By ~ 4

Author(s): Bongsong Kim ◽ Xinbin Dai ◽ Wenchao Zhang ◽ Zhaohong Zhuang ◽ Darlene L Sanchez ◽ ...

Keyword(s): High Performance ◽ Large Scale ◽ Linear Mixed Model ◽ Association Studies ◽ Learning Curves ◽ Experimental Designs ◽ Genome Wide Association ◽ Supplementary Information ◽ Genome Wide Association Studies ◽ Genome Wide

Abstract Summary: We present GWASpro, a high-performance web server for the analysis of large-scale genome-wide association studies (GWAS). GWASpro was developed to provide data analyses for large-scale molecular genetic data coupled with complex replicated experimental designs, such as those found in plant science investigations, and to overcome the steep learning curves of existing GWAS software tools. GWASpro supports building complex design matrices, by which complex experimental designs that may include replications, treatments, locations and times can be accounted for in the linear mixed model. GWASpro is optimized to handle GWAS data that may consist of up to 10 million markers and 10 000 samples from replicable lines or hybrids. GWASpro provides an interface that significantly reduces the learning curve for new GWAS investigators. Availability and implementation: GWASpro is freely available at https://bioinfo.noble.org/GWASPRO. Supplementary information: Supplementary data are available at Bioinformatics online.
