genetic program
Recently Published Documents


TOTAL DOCUMENTS

198
(FIVE YEARS 29)

H-INDEX

36
(FIVE YEARS 4)

2021 ◽  
Author(s):  
Urvesh Bhowan

<p>In classification, machine learning algorithms can suffer a performance bias when data sets are unbalanced. Binary data sets are unbalanced when one class is represented by only a small number of training examples (called the minority class), while the other class makes up the rest (majority class). In this scenario, the induced classifiers typically have high accuracy on the majority class but poor accuracy on the minority class. As the minority class typically represents the main class-of-interest in many real-world problems, accurately classifying examples from this class can be at least as important as, and in some cases more important than, accurately classifying examples from the majority class. Genetic Programming (GP) is a promising machine learning technique based on the principles of Darwinian evolution to automatically evolve computer programs to solve problems. While GP has shown much success in evolving reliable and accurate classifiers for typical classification tasks with balanced data, GP, like many other learning algorithms, can evolve biased classifiers when data is unbalanced. This is because traditional training criteria, such as the overall success rate used in the GP fitness function, can be influenced by the larger number of examples from the majority class.  This thesis proposes a GP approach to classification with unbalanced data. The goal is to develop new internal cost-adjustment techniques in GP to improve classification performance on both the minority class and the majority class. By focusing on internal cost-adjustment within GP rather than the traditional data-balancing techniques, the unbalanced data can be used directly or "as is" in the learning process. This removes any dependence on a sampling algorithm to first artificially re-balance the input data prior to the learning process. 
This thesis shows that by developing a number of new methods in GP, genetic program classifiers with good classification ability on the minority and the majority classes can be evolved. This thesis evaluates these methods on a range of binary benchmark classification tasks with unbalanced data. This thesis demonstrates that, unlike tasks with multiple balanced classes, where some dynamic (non-static) classification strategies perform significantly better than the simple static classification strategy, static and dynamic strategies show no significant difference in the performance of evolved GP classifiers on these binary tasks. For this reason, the rest of the thesis uses the static classification strategy.  This thesis proposes several new fitness functions in GP to perform cost adjustment between the minority and the majority classes, allowing the unbalanced data sets to be used directly in the learning process without sampling. Using the Area under the Receiver Operating Characteristic (ROC) curve (also known as the AUC) to measure how well a classifier performs on the minority and majority classes, these new fitness functions find genetic program classifiers with high AUC on the tasks on both classes, and with fast GP training times. These GP methods outperform two popular learning algorithms, namely Naive Bayes and Support Vector Machines, on the tasks, particularly when the level of class imbalance is large, where both algorithms show biased classification performance.  This thesis also proposes a multi-objective GP (MOGP) approach which treats the accuracies of the minority and majority classes separately in the learning process. The MOGP approach evolves a good set of trade-off solutions (a Pareto front) in a single run that perform as well as, and in some cases better than, multiple runs of canonical single-objective GP (SGP). 
In SGP, individual genetic program solutions capture the performance trade-off between the two objectives (minority and majority class accuracy) using an ROC curve, whereas in MOGP, this requirement is delegated to multiple genetic program solutions along the Pareto front.  This thesis also shows how multiple Pareto front classifiers can be combined into an ensemble where individual members vote on the class label. Two ensemble diversity measures are developed in the fitness functions which treat the diversity on both the minority and the majority classes as equally important; otherwise, these measures risk being biased toward the majority class. The evolved ensembles outperform their individual members on the tasks due to good cooperation between members.  This thesis further improves ensemble performance by developing a GP approach to ensemble selection, to quickly find small groups of individuals that cooperate very well together in the ensemble. The pruned ensembles use far fewer individuals to achieve performance that is as good as that of larger (unpruned) ensembles, particularly on tasks with high levels of class imbalance, thereby reducing the total time to evaluate the ensemble.</p>
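
The bias described in this abstract can be illustrated with a small sketch. The abstract does not give the thesis's actual fitness functions, so the averaged per-class accuracy below is an assumed, commonly used stand-in for a cost-adjusted fitness, and all function names (`class_accuracies`, `overall_accuracy`, `balanced_fitness`, `pareto_front`) are hypothetical. The last function shows one simple way to keep the non-dominated trade-offs between minority and majority accuracy, in the spirit of the Pareto front mentioned for MOGP.

```python
from typing import List, Sequence, Tuple


def class_accuracies(y_true: Sequence[int], y_pred: Sequence[int],
                     minority: int = 1) -> Tuple[float, float]:
    """Return (minority accuracy, majority accuracy) for a binary task.

    Assumes both classes occur at least once in y_true.
    """
    min_total = sum(1 for t in y_true if t == minority)
    maj_total = len(y_true) - min_total
    min_hits = sum(1 for t, p in zip(y_true, y_pred) if t == minority and p == t)
    maj_hits = sum(1 for t, p in zip(y_true, y_pred) if t != minority and p == t)
    return min_hits / min_total, maj_hits / maj_total


def overall_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Traditional success rate: fraction of all examples classified correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def balanced_fitness(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Assumed stand-in fitness: mean of the two per-class accuracies,
    so each class contributes equally regardless of its size."""
    min_acc, maj_acc = class_accuracies(y_true, y_pred)
    return (min_acc + maj_acc) / 2


def pareto_front(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Keep (minority_acc, majority_acc) pairs not dominated by any other pair."""
    def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
        # a is at least as good on both objectives and differs from b
        return a[0] >= b[0] and a[1] >= b[1] and a != b
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # 95 majority examples (label 0) and 5 minority examples (label 1)
    y_true = [0] * 95 + [1] * 5
    always_majority = [0] * 100  # a classifier that ignores the minority class
    print(overall_accuracy(y_true, always_majority))
    print(balanced_fitness(y_true, always_majority))
    print(pareto_front([(0.0, 1.0), (0.8, 0.9), (0.9, 0.7), (0.7, 0.7)]))
```

In this toy case a classifier that never predicts the minority class still scores highly on overall accuracy, while the averaged per-class measure exposes the failure. Other cost adjustments are possible; this one is chosen only because it is easy to check by hand.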


2021 ◽  
Vol 220 (11) ◽  
Author(s):  
Sourabh Bhide ◽  
Denisa Gombalova ◽  
Gregor Mönke ◽  
Johannes Stegmaier ◽  
Valentyna Zinchenko ◽  
...  

The intrinsic genetic program of a cell is not sufficient to explain all of the cell’s activities. External mechanical stimuli are increasingly recognized as determinants of cell behavior. In the epithelial folding event that constitutes the beginning of gastrulation in Drosophila, the genetic program of the future mesoderm leads to the establishment of a contractile actomyosin network that triggers apical constriction of cells and thereby tissue folding. However, some cells do not constrict but instead stretch, even though they share the same genetic program as their constricting neighbors. We show here that tissue-wide interactions force these cells to expand even when an otherwise sufficient amount of apical, active actomyosin is present. Models based on contractile forces and linear stress–strain responses do not reproduce experimental observations, but simulations in which cells behave as ductile materials with nonlinear mechanical properties do. Our models show that this behavior is a general emergent property of actomyosin networks in a supracellular context, in accordance with our experimental observations of actin reorganization within stretching cells.


2021 ◽  
Vol 22 (4) ◽  
pp. 2058
Author(s):  
Dmitry O. Ivanov ◽  
Inna I. Evsyukova ◽  
Ekaterina S. Mironova ◽  
Victoria O. Polyakova ◽  
Igor M. Kvetnoy ◽  
...  

The review summarizes the results of experimental and clinical studies aimed at elucidating the causes and pathophysiological mechanisms of the development of endocrine pathology in children. It considers current data on the epigenetic influence, in early ontogenesis, of unfavorable factors that disrupt the formation of regulatory mechanisms during critical periods of fetal organ and system development and contribute to the delayed development of pathological conditions. The mechanisms by which melatonin participates in the regulation of metabolic processes are presented, along with the key role of maternal melatonin in forming the circadian system of regulation in the fetus and in protecting the genetic program of its morphofunctional development during pregnancy complications. Melatonin, by controlling DNA methylation and histone modification, prevents changes in gene expression that are directly related to the programming of endocrine pathology in offspring. Deficiency and absence of the circadian rhythm of maternal melatonin underlie disruptions of the genetic program for the development of the hormonal and metabolic regulatory mechanisms of the child's functional systems, which determines the programming and implementation of endocrine pathology in early ontogenesis and contributes to its development in later life. The significance of this factor in the pathophysiological mechanisms of endocrine disorders calls for a new approach to risk assessment and timely prevention of offspring diseases, even at the stage of family planning.


Author(s):  
Shamsi Emtenani ◽  
Elliott T. Martin ◽  
Attila Gyoergy ◽  
Julia Bicher ◽  
Jakob-Wendelin Genger ◽  
...  

Summary: Metabolic adaptation to changing demands underlies homeostasis. During inflammation or metastasis, cells leading migration into challenging environments require an energy boost; however, what controls this capacity is unknown. We identify a previously unstudied nuclear protein, Atossa, as changing metabolism in Drosophila melanogaster immune cells to promote tissue invasion. Atossa’s vertebrate orthologs, FAM214A-B, can fully substitute for Atossa, indicating functional conservation from flies to mammals. Atossa increases mRNA levels of Porthos, an unstudied RNA helicase, and of two metabolic enzymes, LKR/SDH and GR/HPR. Porthos increases translation of a gene subset, including those affecting mitochondrial functions, the electron transport chain, and metabolism. Respiration measurements and metabolomics indicate that Atossa and Porthos power up mitochondrial oxidative phosphorylation to produce sufficient energy for leading macrophages to forge a path into tissues. As increasing oxidative phosphorylation enables many crucial physiological responses, this unique genetic program may modulate a wide range of cellular behaviors beyond migration.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Pengyang Li ◽  
Dania Nanes Sarfati ◽  
Yuan Xue ◽  
Xi Yu ◽  
Alexander J. Tarashansky ◽  
...  

Abstract: Schistosomes are parasitic flatworms causing one of the most prevalent infectious diseases, from which millions of people are currently suffering. These parasites have high fecundity, and their eggs are both the transmissible agents and the cause of the infection-associated pathology. Given its biomedical significance, the schistosome germline has been a research focus for more than a century. Nonetheless, molecular mechanisms that regulate its development are only now being understood. In particular, it is unknown what balances the fate of germline stem cells (GSCs) in producing daughter stem cells through mitotic divisions versus gametes through meiosis. Here, we perform single-cell RNA sequencing on juvenile schistosomes and capture GSCs during de novo gonadal development. We identify a genetic program that controls the proliferation and differentiation of GSCs. This program centers around onecut, a homeobox transcription factor, and boule, an mRNA binding protein. Their expression is mutually dependent in the schistosome male germline, and knocking down either of them causes over-proliferation of GSCs and blocks germ cell differentiation. We further show that this germline-specific regulatory program is conserved in the planarian, the schistosome’s free-living evolutionary cousin, but the function of onecut has changed during evolution to support GSC maintenance.


2020 ◽  
Vol 218 (1) ◽  
Author(s):  
Harvey Cantor ◽  
Hye-Jung Kim

CD8+ T reg cells play an important role in the maintenance of self-tolerance and can inhibit the development of autoimmune disease. In this issue of JEM, Mishra et al. (https://doi.org/10.1084/jem.20200030) reveal that TGF-β signaling and an Eomes-dependent genetic program contribute to CD8 T reg cell differentiation and function.

