LDkit: a parallel computing toolkit for linkage disequilibrium analysis

2020 ◽  
Vol 21 (1) ◽  
Author(s):  
You Tang ◽  
Zhuo Li ◽  
Chao Wang ◽  
Yuxin Liu ◽  
Helong Yu ◽  
...  

Abstract. Background: Linkage disequilibrium (LD) analysis is widely used in genetics to study evolutionary and demographic history, and it helps geneticists identify genes associated with inherited traits of interest, such as diseases. Several tools for LD analysis exist, either locally installed or online; however, none of them supports both a graphical user interface (GUI) and parallel computing. Results: We developed LDkit, a GUI software package for LD analysis that supports parallel computing. LDkit accepts both variant call format (VCF) and PLINK ‘ped + map’ input. Users can also restrict the analysis to a subset of individuals from the whole population. LDkit reads the data block by block and parallelizes the computation while monitoring process usage. An assessment on 1000 Genomes Project human data showed that, on the chromosome 22 dataset with 1,103,547 SNPs and 2504 individuals, running with 32 threads reduced the running time from ~77 minutes to less than 6 minutes. Conclusions: LDkit can be used effectively to calculate and plot LD decay and LD blocks, and to analyze LD between a site and a given region. Most importantly, both a GUI and stand-alone packages are available for users’ convenience. LDkit is written in Java and is cross-platform.
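
The block-wise, parallel scheme described in the abstract can be illustrated with a minimal sketch. This is not LDkit's code (LDkit is a Java program, and the abstract does not state which LD statistic it uses or how blocks are scheduled). The sketch assumes genotypes coded as 0/1/2 dosages, uses the squared correlation of dosages as a stand-in LD measure, and only compares SNP pairs within the same block. The names `r_squared`, `ld_for_block`, `ld_by_blocks`, and the parameters `block_size` and `workers` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # monomorphic site: the statistic is undefined
    return cov * cov / (var_x * var_y)


def ld_for_block(block):
    """Pairwise r^2 for every SNP pair inside one block.

    `block` is a list of (snp_id, genotype_vector) tuples.
    """
    return {
        (id_a, id_b): r_squared(g_a, g_b)
        for (id_a, g_a), (id_b, g_b) in combinations(block, 2)
    }


def ld_by_blocks(snps, block_size=2, workers=4):
    """Split SNPs into fixed-size blocks and process the blocks concurrently."""
    blocks = [snps[i:i + block_size] for i in range(0, len(snps), block_size)]
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each block is an independent unit of work, so blocks can run in parallel.
        for partial in pool.map(ld_for_block, blocks):
            results.update(partial)
    return results


if __name__ == "__main__":
    toy_snps = [
        ("s1", [0, 1, 2, 1]),
        ("s2", [0, 1, 2, 1]),
        ("s3", [2, 1, 0, 1]),
        ("s4", [0, 0, 1, 2]),
    ]
    for pair, value in ld_by_blocks(toy_snps, block_size=2, workers=2).items():
        print(pair, round(value, 4))
```

A thread pool keeps the sketch short; in CPython, CPU-bound work like this would normally need `ProcessPoolExecutor` to see a real speedup.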

2008 ◽  
Vol 06 (06) ◽  
pp. 1193-1211 ◽  
Author(s):  
MIHAILO KAPLAREVIC ◽  
ALISON E. MURRAY ◽  
STEPHEN C. CARY ◽  
GUANG R. GAO

Short-insert shotgun sequencing approaches have been applied in recent years to environmental genomic libraries. In the case of complex multispecies microbial communities, there can be many sequence reads that are not incorporated into assemblies, and thus need to be annotated and accessible as single reads. Most existing annotation systems and genome databases accommodate assembled genomes containing contiguous gene-encoding sequences. Thus, a solution is required that can work effectively with environmental genomic annotation information to facilitate data analysis. The Environmental Genome Informational Utility System (EnGenIUS) is a comprehensive environmental genome (metagenome) research toolset that was specifically designed to accommodate the needs of large (> 250 K sequence reads) environmental genome sequencing efforts. The core EnGenIUS modules consist of a set of UNIX scripts and PHP programs used for data preprocessing, an annotation pipeline with accompanying analysis tools, two entity relational databases, and a graphical user interface. The annotation pipeline has a modular structure and can be customized to best fit input data set properties. The integrated entity relational databases store raw data and annotation analysis results. Access to the underlying databases and services is facilitated through a web-based graphical user interface. Users have the ability to browse, upload, download, and analyze preprocessed data, based on diverse search criteria. The EnGenIUS toolset was successfully tested using the Alvinella pompejana epibiont environmental genome data set, which comprises more than 300 K sequence reads. A fully browsable EnGenIUS portal is available at (access code: "guest"). The scope of this paper covers the implementation details and technical aspects of the EnGenIUS toolset.


2016 ◽  
Vol 3 (1) ◽  
Author(s):  
LAL SINGH ◽  
PARMEET SINGH ◽  
RAIHANA HABIB KANTH ◽  
PURUSHOTAM SINGH ◽  
SABIA AKHTER ◽  
...  

WOFOST version 7.1.3 is a computer model that simulates the growth and production of annual field crops. All run options are accessible through a graphical user interface named WOFOST Control Center version 1.8 (WCC). WCC facilitates selection of the production level, of the input data sets on crop, soil, weather, crop calendar, hydrological field conditions and soil fertility parameters, and of the output options. The files with crop, soil and weather data are explained, as well as the run files and the output files. A general overview is given of the development and the applications of the model. Its underlying concepts are discussed briefly.

