An Analysis of Selection in Genetic Programming

2021 ◽  
Author(s):  
Huayang Xie

<p>This thesis presents an analysis of the selection process in tree-based Genetic Programming (GP), covering the optimisation of both parent and offspring selection, and provides a detailed understanding of selection and guidance on how to improve GP search effectively and efficiently. The first part of the thesis provides models and visualisations to analyse selection behaviour in standard tournament selection, clarifies several issues in standard tournament selection, and presents a novel solution to automatically and dynamically optimise parent selection pressure. The fitness evaluation cost of parent selection is then addressed and some cost-saving algorithms are introduced. In addition, the feasibility of using good predecessor programs to increase parent selection efficiency is analysed. The second part of the thesis analyses the impact of offspring selection pressure on the overall GP search performance. The fitness evaluation cost of offspring selection is then addressed by investigating heuristics that efficiently locate good offspring: crossover point selection is constrained structurally, based on an analysis of the characteristics of good crossover events.
The main outcomes of the thesis are three new algorithms and four observations: 1) a clustering tournament selection method is developed to automatically and dynamically tune parent selection pressure; 2) a passive evaluation algorithm is introduced for reducing parent fitness evaluation cost for standard tournament selection using small tournament sizes; 3) a heuristic population clustering algorithm is developed to reduce parent fitness evaluation cost while taking advantage of clustering tournament selection and avoiding the tournament size limitation; 4) population size has little impact on parent selection pressure, so the tournament size configuration is independent of population size, and different sampling replacement strategies have little impact on the selection behaviour in standard tournament selection; 5) premature convergence occurs more often when stochastic elements are removed from both parent and offspring selection processes; 6) good crossover events have a strong preference for whole program trees and, less strongly, for single-node or small subtrees at the bottom of parent program trees; 7) the ability of standard GP crossover to generate good offspring is far below what was expected.</p>
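
As a rough illustration of the selection scheme the thesis analyses, the sketch below implements standard tournament selection in its commonly described form: draw `k` programs uniformly at random and keep the fittest. It is not code from the thesis. The function names, the choice of sampling with replacement, and the convention that higher fitness is better are all assumptions made for this example. The second function computes the chance that a given program is never drawn across one generation of tournaments, which under these assumptions approaches `exp(-k)` and so depends on the tournament size far more than on the population size.

```python
import random


def tournament_select(fitnesses, k, rng):
    """Return the index of the winner of one tournament of size k.

    Assumptions (hypothetical, not from the thesis): candidates are drawn
    uniformly with replacement, and a higher fitness value is better.
    """
    candidates = [rng.randrange(len(fitnesses)) for _ in range(k)]
    return max(candidates, key=lambda i: fitnesses[i])


def prob_never_sampled(pop_size, k):
    """Probability that one given individual is never drawn when pop_size
    tournaments of size k are run, sampling with replacement.

    Each of the k * pop_size draws misses the individual with
    probability 1 - 1 / pop_size.
    """
    return (1.0 - 1.0 / pop_size) ** (k * pop_size)
```

For a tournament size of 4, `prob_never_sampled(100, 4)` and `prob_never_sampled(10000, 4)` both lie close to `exp(-4)`, which gives a simple arithmetic view of why tournament size can be configured without reference to population size.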


2022 ◽  
Vol 23 (1) ◽  
Author(s):  
Junhui Qiu ◽  
Qi Zhou ◽  
Weicai Ye ◽  
Qianjun Chen ◽  
Yun-Juan Bao

Background: A gene-specific sweep is a selection process in which an advantageous mutation, together with the nearby neutral sites in a gene region, increases in frequency in the population. It has been demonstrated to play important roles in ecological differentiation or phenotypic divergence in microbial populations. Therefore, identifying gene-specific sweeps in microorganisms will not only provide insights into the evolutionary mechanisms, but also unravel potential genetic markers associated with biological phenotypes. However, current methods were mainly developed for detecting selective sweeps in eukaryotic data of sparse genotypes and are not readily applicable to prokaryotic data. Furthermore, some challenges have not been sufficiently addressed by these methods, such as the low spatial resolution of sweep regions and the lack of consideration of the spatial distribution of mutations. Results: We proposed a novel gene-centric and spatial-aware approach for identifying gene-specific sweeps in prokaryotes and implemented it in a Python tool, SweepCluster. Our method searches for gene regions with a high level of spatial clustering of pre-selected polymorphisms in genotype datasets assuming a null distribution model of neutral selection. The pre-selection of polymorphisms is based on their genetic signatures, such as elevated population subdivision, excessive linkage disequilibrium, or significant phenotype association. Performance evaluation using simulation data showed that the sensitivity and specificity of the clustering algorithm in SweepCluster are above 90%. The application of SweepCluster to two real datasets from the bacteria Streptococcus pyogenes and Streptococcus suis showed that the impact of pre-selection was dramatic and significantly reduced the uninformative signals. We validated our method using the genotype data from Vibrio cyclitrophicus, the only available dataset of gene-specific sweeps in bacteria, and obtained a concordance rate of 78%. We noted that the concordance rate could be underestimated due to distinct reference genomes and clustering strategies. The application to the human genotype datasets showed that SweepCluster is also applicable to eukaryotic data and is able to recover 80% of a catalog of known sweep regions. Conclusion: SweepCluster is applicable to a broad category of datasets. It will be valuable for detecting gene-specific sweeps in diverse genotypic data and will provide novel insights into adaptive evolution.


2021 ◽  
Author(s):  
Mathilde Barthe ◽  
Claire Doutrelant ◽  
Rita Covas ◽  
Martim Melo ◽  
Juan Carlos Illera ◽  
...  

Shared ecological conditions encountered by species that colonize islands often lead to the evolution of convergent phenotypes, commonly referred to as "island syndrome". Reduced immune functions have been previously proposed to be part of the island syndrome, as a consequence of the reduced diversity of pathogens on island ecosystems. According to this hypothesis, immune genes are expected to exhibit genomic signatures of relaxed selection pressure in island species. In this study, we used comparative genomic methods to study immune genes in island species (N = 20) and their mainland relatives (N = 14). We gathered public data as well as generated new data on innate (Toll-Like Receptors, Beta Defensins) and acquired immune genes (Major Histocompatibility Complex classes I and II), but also on hundreds of genes annotated as involved in various immune functions. As a control, we used a set of 97 genes not involved in immune functions, to account for the lower effective population sizes in island species. We used synonymous and non-synonymous variations to estimate the selection pressure acting on immune genes. For the genes evolving under balancing selection, we used simulation to estimate the impact of population size variation. We found a significant effect of drift on immune genes of island species leading to a reduction in genetic diversity and efficacy of selection. However, the intensity of relaxed selection was not significantly different from control genes, except for MHC class II genes. These genes exhibit a significantly higher level of non-synonymous loss of polymorphism than expected assuming only drift and an evolution under frequency dependent selection, possibly due to a reduction of extracellular parasite communities on islands. Overall, our results showed that demographic effects lead to a decrease in the immune functions of island species, but the relaxed selection caused by a reduced parasite pressure may only occur in some immune genes categories.


2021 ◽  
Author(s):  
Junhui Qiu ◽  
Qi Zhou ◽  
Weicai Ye ◽  
Qianjun Chen ◽  
Yun-Juan Bao

Background: A gene-specific sweep is a selection process in which an advantageous mutation, together with the nearby neutral sites in a gene region, increases in frequency in the population. It has been demonstrated to play important roles in ecological differentiation or phenotypic divergence in microbial populations. Therefore, identifying gene-specific sweeps in microorganisms will not only provide insights into the evolutionary mechanisms, but also unravel potential genetic markers associated with biological phenotypes. However, current methods were mainly developed for detecting selective sweeps in eukaryotic data of sparse genotypes and are not readily applicable to prokaryotic data. Furthermore, some challenges have not been sufficiently addressed by these methods, such as the low spatial resolution of sweep regions and the lack of consideration of the spatial distribution of mutations. Results: We proposed a novel gene-centric and spatial-aware approach for identifying gene-specific sweeps in prokaryotes and implemented it in a Python tool, SweepCluster. Our method searches for gene regions with a high level of spatial clustering of pre-selected polymorphisms in genotype datasets assuming a null distribution model of neutral selection. The pre-selection of polymorphisms is based on their genetic signatures, such as elevated population subdivision, excessive linkage disequilibrium, or significant phenotype association. Performance evaluation using simulation data showed that the accuracy and sensitivity of the clustering algorithm in SweepCluster are above 90%. The application of SweepCluster to two real datasets from the bacteria Streptococcus pyogenes and Streptococcus suis showed that the impact of pre-selection was dramatic and significantly reduced the uninformative signals. We validated our method using the genotype data from Vibrio cyclitrophicus, the only available dataset of gene-specific sweeps in bacteria, and obtained a concordance rate of 78%. We noted that the concordance rate could be underestimated due to distinct reference genomes and clustering strategies. The application to the human genotype datasets showed that SweepCluster is also applicable to eukaryotic data and recovered the known sweep regions over a wide dynamic range of pre-selection parameters. Conclusions: SweepCluster is applicable to a broad category of datasets. It will be valuable for detecting gene-specific sweeps in diverse genotypic data and will provide novel insights into adaptive evolution.


Author(s):  
Sidik Wibowo Akhmad

The purpose of this study was to describe student management for improving character and achievement at MAN 2 Banjarnegara, including: (1) the enrollment process for new students; (2) guiding students through discipline, noble character building, and academic and non-academic achievement; and (3) the impact of character building and achievement on students at MAN 2 Banjarnegara. This research used a descriptive qualitative approach. The data collection techniques were in-depth interviews, observation, and documentation study. The validity of the data was assessed using three criteria: credibility, dependability, and confirmability. The findings were as follows. First, the enrollment process for new students introduced a breakthrough in the form of registration for academic and non-academic achievement scholarships; selection was based on official learning report grades, championship or achievement certificates, academic and non-academic potential tests, and a skills test. Students who passed the selection process were required to sign an achievement contract covering their period of study at MAN 2 Banjarnegara. Second, character building was carried out through habituation and through programmes integrated into curricular and extracurricular activities. Third, students who joined the academic and non-academic achievement programmes at MAN 2 Banjarnegara had strong motivation and a competitive spirit to achieve more, focused more on self-development, and made positive use of their spare time.


2021 ◽  
Vol 13 (12) ◽  
pp. 6581
Author(s):  
Jooyoung Hwang ◽  
Anita Eves ◽  
Jason L. Stienmetz

Travellers have high standards and regard restaurants as important travel attributes. In the tourism and hospitality industry, the use of developed tools (e.g., smartphones and location-based tablets) has been popularised as a way for travellers to easily search for information and to book venues. Qualitative research using semi-structured interviews based on the face-to-face approach was adopted for this study to examine how consumers’ restaurant selection processes are performed with the utilisation of social media on smartphones. Then, thematic analysis was adopted. The findings of this research show that the adoption of social media on smartphones is positively related with consumers’ gratification. More specifically, when consumers regard that process, content and social gratification are satisfied, their intention to adopt social media is fulfilled. It is suggested by this study that consumers’ restaurant decision-making process needs to be understood, as each stage of the decision-making process is not independent; all the stages of the restaurant selection process are organically connected and influence one another.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Jie Zhu ◽  
Blanca Gallego

Epidemic models are being used by governments to inform public health strategies to reduce the spread of SARS-CoV-2. They simulate potential scenarios by manipulating model parameters that control processes of disease transmission and recovery. However, the validity of these parameters is challenged by the uncertainty of the impact of public health interventions on disease transmission, and the forecasting accuracy of these models is rarely investigated during an outbreak. We fitted a stochastic transmission model on reported cases, recoveries and deaths associated with SARS-CoV-2 infection across 101 countries. The dynamics of disease transmission were represented in terms of the daily effective reproduction number ($R_t$). The relationship between public health interventions and $R_t$ was explored, firstly using a hierarchical clustering algorithm on initial $R_t$ patterns, and secondly computing the time-lagged cross correlation among the daily number of policies implemented, $R_t$, and daily incidence counts in subsequent months. The impact on the model's forecasting accuracy of updating $R_t$ every time a prediction is made was investigated. We identified 5 groups of countries with distinct transmission patterns during the first 6 months of the pandemic. Early adoption of social distancing measures and a shorter gap between interventions were associated with a reduction in the duration of outbreaks. The lagged correlation analysis revealed that increased policy volume was associated with lower future $R_t$ (75 days lag), while a lower $R_t$ was associated with lower future policy volume (102 days lag). Lastly, the outbreak prediction accuracy of the model using dynamically updated $R_t$ produced an average AUROC of 0.72 (0.708, 0.723) compared to 0.56 (0.555, 0.568) when $R_t$ was kept constant.
Monitoring the evolution of $R_t$ during an epidemic is an important complementary piece of information to reported daily counts, recoveries and deaths, since it provides an early signal of the efficacy of containment measures. Using updated $R_t$ values produces significantly better predictions of future outbreaks. Our results found variation in the effect of early public health interventions on the evolution of $R_t$ over time and across countries, which could not be explained solely by the timing and number of the adopted interventions.
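
The time-lagged cross correlation mentioned in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the two series are assumed to be equal-length daily values with no missing entries, and Pearson correlation is assumed as the correlation measure. A lag of `d` compares the first series on day `t` with the second series on day `t + d`.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(x, y, lag):
    """Correlation between x[t] and y[t + lag] for a non-negative lag."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    if not 0 <= lag < len(x) - 1:
        raise ValueError("lag must leave at least two overlapping points")
    # Drop the last `lag` points of x and the first `lag` points of y
    # so that the remaining values line up with the requested offset.
    return pearson(x[:len(x) - lag], y[lag:])


def strongest_lag(x, y, max_lag):
    """Return (lag, correlation) with the largest absolute correlation."""
    scored = [(lag, lagged_correlation(x, y, lag)) for lag in range(max_lag + 1)]
    return max(scored, key=lambda item: abs(item[1]))
```

With `x` as a daily policy count and `y` as a daily $R_t$ estimate, a strongly negative value returned by `strongest_lag` at some lag would correspond to the kind of association the abstract reports, although the sketch says nothing about significance or causation.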


2020 ◽  
Vol 2020 (12) ◽  
Author(s):  
Roberto Mondini ◽  
Ulrich Schubert ◽  
Ciaran Williams

In this paper we present a fully-differential calculation for the contributions to the partial widths $H \to b\overline{b}$ and $H \to c\overline{c}$ that are sensitive to the top quark Yukawa coupling $y_t$ to order $\alpha_s^3$. These contributions first enter at order $\alpha_s^2$ through terms proportional to $y_t y_q$ ($q = b, c$). At order $\alpha_s^3$ corrections to the mixed terms are present as well as a new contribution proportional to $y_t^2$. Our results retain the mass of the final-state quarks throughout, while the top quark is integrated out resulting in an effective field theory (EFT). Our results are implemented into a Monte Carlo code allowing for the application of arbitrary final-state selection cuts. As an example we present differential distributions for observables in the Higgs boson rest frame using the Durham jet clustering algorithm. We find that the total impact of the top-induced (i.e. EFT) pieces is sensitive to the nature of the final-state cuts, particularly b-tagging and c-tagging requirements. For bottom quarks, the EFT pieces contribute to the total width (and differential distributions) at around the percent level. The impact is much bigger for the $H \to c\overline{c}$ channel, with effects as large as 15%. We show however that their impact can be significantly reduced by the application of jet-tagging selection cuts.
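
For readers unfamiliar with the Durham jet clustering algorithm named above, its usual textbook distance measure between two final-state objects $i$ and $j$ is given below. The symbols $E_i$, $\theta_{ij}$, $Q$ and $y_{\mathrm{cut}}$ are the conventional ones and do not appear in the abstract. Taking $Q^2 = m_H^2$ in the Higgs boson rest frame is an assumption made here for illustration.

```latex
y_{ij} = \frac{2\,\min\!\left(E_i^2,\, E_j^2\right)\left(1 - \cos\theta_{ij}\right)}{Q^2}
```

In the standard formulation, the pair with the smallest $y_{ij}$ is merged repeatedly until every remaining pair satisfies $y_{ij} > y_{\mathrm{cut}}$, and the objects left over are the jets.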

