Parallel Evolutionary Computation in R

Multidisciplinary Computational Intelligence Techniques ◽

10.4018/978-1-4666-1830-5.ch020 ◽

2012 ◽

pp. 351-377

Author(s):

Cedric Gondro ◽

Paul Kwan

Keyword(s):

Evolutionary Algorithms ◽

Evolutionary Computation ◽

Genome Wide Association Study ◽

Optimization Methods ◽

Large Field ◽

Search Spaces ◽

Parallel Evolutionary Algorithms ◽

Genome Wide ◽

Statistical Programming ◽

Candidate Regions

Evolutionary Computation (EC) is a branch of Artificial Intelligence that encompasses heuristic optimization methods loosely based on biological evolutionary processes. These methods are efficient at finding optimal or near-optimal solutions in large, complex, non-linear search spaces. While evolutionary algorithms (EAs) are slow compared with deterministic or sampling approaches, they are also inherently parallelizable. As technology shifts towards multicore and cloud computing, this overhead becomes less relevant, provided a parallel framework is used. In this chapter the authors discuss how to implement and run parallel evolutionary algorithms in the popular statistical programming language R. R has become the de facto language for statistical programming, and it is widely used in biostatistics and bioinformatics due to the availability of thousands of packages to manipulate and analyze data. It is also extremely easy to parallelize routines within R, which makes it a perfect environment for evolutionary algorithms. EC is a large field of research, and many different algorithms have been proposed. While there is no single silver bullet that can handle all classes of problems, Differential Evolution (DE) is an algorithm that is extremely simple and efficient and has good generalization properties. Herein the authors discuss step by step how to implement DE in R and how to parallelize it. They then use a toy genome-wide association study (GWAS) to illustrate how to identify candidate regions associated with a quantitative trait of interest.
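The abstract does not reproduce the chapter's R code. As a rough illustration of the idea only, here is a minimal Python sketch of one common DE variant (DE/rand/1/bin with a synchronous generation update). It is not the authors' implementation: the function names, parameter values, and the sphere test function are all assumptions made for this example. Fitness evaluations go through a `map_fn` argument, which is the step that can be handed to a parallel map.

```python
import random


def differential_evolution(fitness, bounds, pop_size=20, f_weight=0.8,
                           cr=0.9, generations=200, seed=0, map_fn=map):
    """Minimize `fitness` over box `bounds` with DE/rand/1/bin (sketch).

    All names and defaults are illustrative assumptions. `map_fn` evaluates
    a whole generation of trial vectors; the built-in `map` is serial, and
    a parallel map with the same call shape can be substituted.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = list(map_fn(fitness, pop))

    for _ in range(generations):
        trials = []
        for i in range(pop_size):
            # Three distinct population members, none of them the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene always crosses over
            trial = list(pop[i])
            for j in range(dim):
                if j == forced or rng.random() < cr:
                    lo, hi = bounds[j]
                    mutant = pop[a][j] + f_weight * (pop[b][j] - pop[c][j])
                    trial[j] = min(max(mutant, lo), hi)  # clip to the bounds
            trials.append(trial)

        # Evaluations are independent of each other: the parallelizable step.
        trial_scores = list(map_fn(fitness, trials))

        # Greedy selection: a trial replaces its target if it is no worse.
        for i in range(pop_size):
            if trial_scores[i] <= scores[i]:
                pop[i], scores[i] = trials[i], trial_scores[i]

    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def sphere(x):
    """Toy objective (an assumption for the demo): sum of squares."""
    return sum(v * v for v in x)


best_x, best_score = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

To spread evaluations across cores, one option is to pass the `map` method of a `concurrent.futures.ProcessPoolExecutor` as `map_fn`. This only pays off when a single fitness evaluation is expensive enough to outweigh the inter-process overhead.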


Pool-based genome-wide association study identified novel candidate regions on BTA9 and 14 for oleic acid percentage in Japanese Black cattle

Animal Science Journal ◽

10.1111/asj.13035 ◽

2018 ◽

Vol 89 (8) ◽

pp. 1060-1066 ◽

Cited By ~ 2

Author(s):

Fuki Kawaguchi ◽

Hiroto Kigoshi ◽

Ayaka Nakajima ◽

Yuta Matsumoto ◽

Yoshinobu Uemoto ◽

...

Keyword(s):

Oleic Acid ◽

Association Study ◽

Genome Wide Association Study ◽

Genome Wide Association ◽

Japanese Black Cattle ◽

Genome Wide ◽

Candidate Regions ◽

Acid Percentage


Comparative analysis of different approaches for dealing with candidate regions in the context of a genome-wide association study

BMC Proceedings ◽

10.1186/1753-6561-3-s7-s93 ◽

2009 ◽

Vol 3 (Suppl 7) ◽

pp. S93 ◽

Cited By ~ 5

Author(s):

Francesca Lantieri ◽

Min A Jhun ◽

Jungsun Park ◽

Taesung Park ◽

Marcella Devoto

Keyword(s):

Comparative Analysis ◽

Association Study ◽

Genome Wide Association Study ◽

Genome Wide Association ◽

Genome Wide ◽

A Genome ◽

Candidate Regions


ESTIMATES FROM EVOLUTIONARY ALGORITHMS THEORY APPLIED TO DIRECTED EVOLUTION

Mathematical Structures and Modeling ◽

10.24147/2222-8772.2020.1.56-76 ◽

2020 ◽

pp. 56-76

Author(s):

A. V. Eremeev ◽

A. V. Spirov

Keyword(s):

Evolutionary Algorithms ◽

Evolutionary Computation ◽

DNA Sequences ◽

Upper Bound ◽

Fitness Function ◽

Optimization Methods ◽

Target Area ◽

Expected Time ◽

Royal Road ◽

Expected Hitting Time

The field of evolutionary computation emerged within computer science through the transfer of ideas from biology and then developed independently for several decades, enriched with techniques from probability theory, complexity theory and optimization methods. Our aim is to consider how some recent results from the theory of evolutionary computation may be transferred back into biology. It has been noted that non-elitist evolutionary algorithms optimizing Royal Road fitness functions may be considered as models of evolutionary search for synthetic enhancer sequences “from scratch”. This problem asks for a tight cluster of supposedly unknown motifs to be found from an initial random (or partially random) set of DNA sequences using SELEX approaches. We apply upper bounds on the expected hitting time of a target area of genotypic space in order to upper-bound the expected time to find a sufficiently fit series of motifs in a SELEX procedure. In addition, using the theory of evolutionary computation, we propose an upper bound on the expected proportion of DNA sequences with sufficiently high fitness at a given round of a SELEX procedure. Both approaches are evaluated in a computational experiment, using a Royal Road fitness function as a model of the SELEX procedure for the regulatory FIS factor binding site.
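For readers unfamiliar with Royal Road functions, the following minimal Python sketch shows the classical bitstring form: the string is split into fixed-size blocks, and only blocks that are entirely ones contribute to the fitness. This is an assumption-laden illustration, not the model used in the paper, which works with DNA sequences and motifs. The function name and the default block size are hypothetical.

```python
def royal_road(bits, block_size=8):
    """Classical Royal Road fitness on a bitstring (illustrative sketch).

    Each complete all-ones block adds `block_size` to the fitness;
    partially filled blocks add nothing, which gives the landscape its
    stepwise structure.
    """
    if len(bits) % block_size != 0:
        raise ValueError("length must be a multiple of block_size")
    total = 0
    for start in range(0, len(bits), block_size):
        block = bits[start:start + block_size]
        if all(block):  # reward the block only when every bit is set
            total += block_size
    return total


# One complete block out of two; the single zero voids the second block.
example = royal_road([1] * 8 + [0] + [1] * 7)
```

Here `example` is 8, because the lone zero prevents the second block from scoring even though seven of its eight bits are set.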
