Problem-Solving Benefits of Down-Sampled Lexicase Selection

2021 ◽  
pp. 1-21
Author(s):  
Thomas Helmuth ◽  
Lee Spector

Abstract In genetic programming, an evolutionary method for producing computer programs that solve specified computational problems, parent selection is ordinarily based on aggregate measures of performance across an entire training set. Lexicase selection, by contrast, selects on the basis of performance on random sequences of training cases; this has been shown to enhance problem-solving power in many circumstances. Lexicase selection can also be seen as better reflecting biological evolution, by modeling sequences of challenges that organisms face over their lifetimes. Recent work has demonstrated that the advantages of lexicase selection can be amplified by down-sampling, meaning that only a random subsample of the training cases is used each generation. This can be seen as modeling the fact that individual organisms encounter only subsets of the possible environments and that environments change over time. Here we provide the most extensive benchmarking of down-sampled lexicase selection to date, showing that its benefits hold up to increased scrutiny. The reasons that down-sampling helps, however, are not yet fully understood. Hypotheses include that down-sampling allows for more generations to be processed with the same budget of program evaluations; that the variation of training data across generations acts as a changing environment, encouraging adaptation; or that it reduces overfitting, leading to more general solutions. We systematically evaluate these hypotheses, finding evidence against all three, and instead draw the conclusion that down-sampled lexicase selection's main benefit stems from the fact that it allows the evolutionary process to examine more individuals within the same computational budget, even though each individual is examined less completely.
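The selection procedure the abstract describes can be sketched compactly. The following is a minimal Python sketch, assuming an error matrix where errors[i][j] is the error of individual i on training case j; the function names and the uniform-random subsampling rate are illustrative assumptions, not the authors' implementation.

```python
import random

def sample_cases(num_cases, down_sample_rate):
    """Draw this generation's random subsample of training cases."""
    k = max(1, int(num_cases * down_sample_rate))
    return random.sample(range(num_cases), k)

def down_sampled_lexicase_select(population, errors, case_indices):
    """Select one parent via lexicase selection restricted to the
    down-sampled set of training cases.

    population   -- list of candidate programs
    errors       -- errors[i][j] = error of population[i] on training case j
    case_indices -- indices of the cases in this generation's subsample
    """
    candidates = list(range(len(population)))
    cases = list(case_indices)
    random.shuffle(cases)  # new random case ordering for each selection event
    for case in cases:
        best = min(errors[i][case] for i in candidates)
        candidates = [i for i in candidates if errors[i][case] == best]
        if len(candidates) == 1:
            break
    return population[random.choice(candidates)]
```

In this sketch, sample_cases would be called once per generation and the same subsample reused for every parent selection in that generation, which is what reduces the per-generation evaluation budget and allows more individuals to be examined within the same total number of program evaluations.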

Author(s):  
Liane Gabora ◽  
Maegan Merrifield

This chapter begins by outlining a promising new theoretical framework for the process by which human culture evolves, inspired by the views of complexity theorists on the problem of how life began. Elements of culture, like species, evolve over time; that is, they exhibit cumulative change that is adaptive in nature. By studying how biological evolution got started, it is possible to gain insight not just into the specifics of biological evolution, but also into the initiation of any evolutionary process that may be applicable to culture. The authors thus explore what this new framework for culture implies for the transformative processes of individuals. Specifically, they address what this emerging perspective on cultural evolution implies for how to go about attaining a sustainable worldview; that is, a web of habits, understandings, and ways of approaching situations that is conducive to the development of a sustainable world.


2021 ◽  
Vol 13 (3) ◽  
pp. 368
Author(s):  
Christopher A. Ramezan ◽  
Timothy A. Warner ◽  
Aaron E. Maxwell ◽  
Bradley S. Price

The size of the training data set is a major determinant of classification accuracy. Nevertheless, the collection of a large training data set for supervised classifiers can be a challenge, especially for studies covering a large area, which may be typical of many real-world applied projects. This work investigates how variations in training set size, ranging from a large sample size (n = 10,000) to a very small sample size (n = 40), affect the performance of six supervised machine-learning algorithms applied to classify large-area high-spatial-resolution (HR) (1–5 m) remotely sensed data within the context of a geographic object-based image analysis (GEOBIA) approach. GEOBIA, in which adjacent similar pixels are grouped into image-objects that form the unit of the classification, offers the potential benefit of allowing the use of multiple additional variables, such as measures of object geometry and texture, thus increasing the dimensionality of the classification input data. The six supervised machine-learning algorithms are support vector machines (SVM), random forests (RF), k-nearest neighbors (k-NN), single-layer perceptron neural networks (NEU), learning vector quantization (LVQ), and gradient-boosted trees (GBM). RF, the algorithm with the highest overall accuracy, was notable for its negligible decrease in overall accuracy, 1.0%, when the training sample size decreased from 10,000 to 315 samples. GBM provided similar overall accuracy to RF; however, the algorithm was very expensive in terms of training time and computational resources, especially with large training sets. In contrast to RF and GBM, NEU and SVM were particularly sensitive to decreasing sample size, with NEU classifications generally producing overall accuracies that were on average slightly higher than SVM classifications for larger sample sizes, but lower than SVM for the smallest sample sizes. NEU, however, required a longer processing time. The k-NN classifier saw less of a drop in overall accuracy than NEU and SVM as training set size decreased; however, the overall accuracies of k-NN were typically lower than those of the RF, NEU, and SVM classifiers. LVQ generally had the lowest overall accuracy of all six methods, but was relatively insensitive to sample size, down to the smallest sample sizes. Overall, due to its relatively high accuracy with small training sample sets, minimal variation in overall accuracy between very large and small sample sets, and relatively short processing time, RF was a good classifier for large-area land-cover classifications of HR remotely sensed data, especially when training data are scarce. However, as the performance of different supervised classifiers varies in response to training set size, investigating multiple classification algorithms is recommended to achieve optimal accuracy for a project.
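As a rough illustration of the kind of experiment described, the sketch below trains scikit-learn stand-ins for three of the six classifiers (RF, SVM, k-NN) on nested subsets of a training set and scores each on a fixed held-out test set; the feature matrix, sample sizes, and hyperparameters are placeholder assumptions, not the authors' GEOBIA workflow.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def compare_classifiers_by_training_size(X, y, sizes=(40, 315, 1000, 10000)):
    """Train several classifiers on nested subsets of the training data and
    report overall accuracy on a fixed held-out test set."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=0)
    models = {
        "RF": RandomForestClassifier(n_estimators=500, random_state=0),
        "SVM": SVC(kernel="rbf"),
        "k-NN": KNeighborsClassifier(n_neighbors=5),
    }
    results = {}
    for n in sizes:
        n = min(n, len(y_train))
        for name, model in models.items():
            model.fit(X_train[:n], y_train[:n])   # refit on the first n samples
            results[(name, n)] = accuracy_score(y_test, model.predict(X_test))
    return results
```

Keeping the test set fixed across all training sizes, as above, is what makes the accuracy curves comparable between classifiers and sample sizes.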


2002 ◽  
Vol 1 (1) ◽  
pp. 125-143 ◽  
Author(s):  
Rolf Pfeifer

Artificial intelligence is by its very nature synthetic; its motto is "Understanding by building". In the early days of artificial intelligence the focus was on abstract thinking and problem solving. These phenomena could be naturally mapped onto algorithms, which is why AI was originally considered part of computer science and its tool was computer programming. Over time, it turned out that this view was too limited to understand natural forms of intelligence and that embodiment must be taken into account. As a consequence, the focus shifted to systems that are able to autonomously interact with their environment, and the main tool became the robot. The "developmental robotics" approach incorporates the major implications of embodiment with regard to what has been and can potentially be learned about human cognition by employing robots as cognitive tools. The use of "robots as cognitive tools" is illustrated in a number of case studies by discussing the major implications of embodiment, which are of a dynamical and information-theoretic nature.


Author(s):  
Bryan G. Levman

Abstract This article continues the discussion on the nature of the early language of Buddhism and the language that the Buddha spoke, arguing that the received Pāli transmission evolved out of an earlier Middle Indic idiom, which is identified as a koine. Evidence for this koine can be found by examining correspondence sets within Pāli and its various varieties, and by examining parallel, cognate correspondence sets between Pāli and other surviving Prakrits. This article compares 30 correspondence sets transmitted in the Dhammapada recensions: the Gāndhārī Prakrit verses, the partially Sanskritized Pāli and Patna Dhammapada Prakrit verses, and the fully Sanskritized verses of the Udānavarga. By comparing cognate words, it demonstrates the existence of an underlying inter-language which in many cases can be shown to be the source of the phonological differences in the transmission. The paper includes a discussion of the two major factors of dialect change: evolution with variation over time, and the diffusionary, synchronic influence of dialect variation. It concludes that both are important, with dialect variation – and the phonological constraints of indigenous speakers who adopted Middle Indic as a second language – providing the pathways along which the natural evolutionary process was channeled.


Author(s):  
Kate Crowley ◽  
Jenny Stewart ◽  
Adrian Kay ◽  
Brian W. Head

Although institutions are central to the study of public policy, the focus upon them has shifted over time. This chapter is concerned with the role of institutions in problem solving and the utility of an evolving institutional theory that has significantly fragmented. It argues that the rise of new institutionalism in particular is symptomatic of the growing complexity in problems and policy making. We review the complex landscape of institutional theory, we reconsider institutions in the context of emergent networks and systems in the governance era, and we reflect upon institutions and the notion of policy shaping in contemporary times. We find that network institutionalism, which draws upon policy network and community approaches, has a particular utility for depicting and explaining complex policy.


2014 ◽  
Vol 539 ◽  
pp. 181-184
Author(s):  
Wan Li Zuo ◽  
Zhi Yan Wang ◽  
Ning Ma ◽  
Hong Liang

Accurate text classification is a basic prerequisite for efficiently extracting various types of information from the Web and making proper use of network resources. This paper proposes a new text classification method based on consistency analysis. Consistency analysis is an iterative algorithm that trains different (weak) classifiers on the same training set and then combines them to test the degree of consistency among the classification methods for the same text, thereby exposing the knowledge captured by each type of classifier. The method determines the weight of each sample according to whether that sample was classified correctly in each training round, as well as the accuracy of the previous overall classification, and then passes the reweighted data set to the next classifier for training. Finally, the classifiers obtained during training are integrated into the final decision classifier. The consistency-analysis classifier can eliminate some unnecessary training-data characteristics and place greater emphasis on key training data. According to the experimental results, the average accuracy of this method is 91.0%, while the average recall is 88.1%.
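The iterative reweighting described above resembles boosting. The sketch below is a minimal AdaBoost-style interpretation of that description, with decision stumps as the weak classifiers and ±1 labels assumed; it illustrates the sample-weight update and final integration, not the authors' exact implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_weighted_ensemble(X, y, rounds=10):
    """Iteratively train weak classifiers, raising the weights of samples the
    previous round misclassified. Labels y are assumed to be +1 / -1."""
    n = len(y)
    w = np.full(n, 1.0 / n)                  # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        clf = DecisionTreeClassifier(max_depth=1)
        clf.fit(X, y, sample_weight=w)
        pred = clf.predict(X)
        err = np.dot(w, pred != y) / w.sum()
        if err == 0 or err >= 0.5:           # perfect or no-better-than-chance learner
            break
        alpha = 0.5 * np.log((1 - err) / err)  # classifier weight from its accuracy
        w *= np.exp(-alpha * y * pred)         # emphasize misclassified samples
        w /= w.sum()
        ensemble.append((alpha, clf))
    return ensemble

def ensemble_predict(ensemble, X):
    """Combine the weak classifiers into the final decision classifier."""
    score = sum(alpha * clf.predict(X) for alpha, clf in ensemble)
    return np.sign(score)
```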


2020 ◽  
Vol 10 (6) ◽  
pp. 2104
Author(s):  
Michał Tomaszewski ◽  
Paweł Michalski ◽  
Jakub Osuchowski

This article presents an analysis of the effectiveness of object detection in digital images with a limited quantity of input data. The possibility of using a limited set of learning data was achieved by developing a detailed scenario of the task, which strictly defined the conditions under which the detector, a convolutional neural network, operates in the considered case. The described solution utilizes known architectures of deep neural networks in the process of learning and object detection. The article compares detection results from the most popular deep neural networks while maintaining a limited training set composed of a specific number of images selected from diagnostic video. The analyzed input material was recorded during an inspection flight conducted along high-voltage lines, and the object detector was built for a power insulator. The main contribution of the presented paper is the evidence that a limited training set (in our case, just 60 training frames) can be used for object detection, assuming an outdoor scenario with low variability of environmental conditions. Deciding which network will generate the best result for such a limited training set is not a trivial task. The conducted research suggests that deep neural networks achieve different levels of effectiveness depending on the amount of training data. The most beneficial results were obtained for two convolutional neural networks: the faster region-based convolutional neural network (Faster R-CNN) and the region-based fully convolutional network (R-FCN). Faster R-CNN reached the highest AP (average precision), at a level of 0.8 for 60 frames. The R-FCN model achieved a lower AP; however, the number of input samples had a significantly smaller influence on its results than was the case for the other CNN models, which, in the authors' assessment, is a desirable feature when the training set is limited.
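For readers who want to reproduce the general setup, the sketch below fine-tunes a COCO-pretrained torchvision Faster R-CNN for a single "insulator" class on a small set of frames. The dataset format, optimizer settings, and the weights="DEFAULT" argument (torchvision ≥ 0.13) are assumptions; the authors' actual training pipeline and hyperparameters are not specified in the abstract.

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def build_insulator_detector(num_classes=2):
    """Load a COCO-pretrained Faster R-CNN and replace its box predictor head
    for two classes: background + insulator."""
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model

def fine_tune(model, data_loader, epochs=10, lr=0.005):
    """Minimal fine-tuning loop; data_loader is assumed to yield
    (images, targets) in the torchvision detection format (boxes, labels)."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device).train()
    params = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.SGD(params, lr=lr, momentum=0.9, weight_decay=5e-4)
    for _ in range(epochs):
        for images, targets in data_loader:
            images = [img.to(device) for img in images]
            targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
            loss_dict = model(images, targets)   # returns per-component losses in train mode
            loss = sum(loss_dict.values())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```

With only on the order of 60 annotated frames, the pretrained backbone does most of the work; only the detection head needs to adapt to the single target class.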


2018 ◽  
Author(s):  
Kenny Smith

Recent work suggests that linguistic structure develops through cultural evolution, as a consequence of the repeated cycle of learning and use by which languages persist. This work has important implications for our understanding of the evolution of the cognitive basis for language: in particular, human language and the cognitive capacities underpinning it are likely to have been shaped by co-evolutionary processes, where the cultural evolution of linguistic systems is shaped by and in turn shapes the biological evolution of the capacities underpinning language learning. I review several models of this co-evolutionary process, which suggest that the precise relationship between evolved biases in individuals and the structure of linguistic systems depends on the extent to which cultural evolution masks or unmasks individual-level cognitive biases from selection. I finish by discussing how these co-evolutionary models might be extended to cases where the biases involved in learning are themselves shaped by experience, as is the case for language.


Author(s):  
Hengyi Cai ◽  
Hongshen Chen ◽  
Yonghao Song ◽  
Xiaofang Zhao ◽  
Dawei Yin

Humans benefit from previous experiences when taking actions. Similarly, related examples from the training data also provide exemplary information for neural dialogue models when responding to a given input message. However, effectively fusing such exemplary information into dialogue generation is non-trivial: useful exemplars must be not only literally similar to the given context but also related to its topic. Noisy exemplars impair the neural dialogue model's understanding of the conversation topics and can even corrupt response generation. To address these issues, we propose an exemplar-guided neural dialogue generation model in which exemplar responses are retrieved on the basis of both text similarity and topic proximity through a two-stage exemplar retrieval model. In the first stage, a small subset of conversations is retrieved from the training set given a dialogue context. These candidate exemplars are then re-ranked by topical proximity to choose the best-matched exemplar response. To further induce the neural dialogue generation model to consult the exemplar response and the conversation topics more faithfully, we introduce a multi-source sampling mechanism that provides the dialogue model with both local exemplary semantics and global topical guidance during decoding. Empirical evaluations on a large-scale conversation dataset show that the proposed approach significantly outperforms the state of the art in terms of both quantitative metrics and human evaluations.
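A minimal sketch of the two-stage retrieval idea follows, using TF-IDF cosine similarity for the coarse first stage and LDA topic vectors for the topical re-ranking. Both components are stand-ins chosen for illustration, since the abstract does not specify the authors' similarity or topic models, and the multi-source sampling decoder is omitted.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.metrics.pairwise import cosine_similarity

class TwoStageExemplarRetriever:
    def __init__(self, contexts, responses, n_topics=20):
        self.responses = responses
        # Stage-1 index: lexical similarity over the stored dialogue contexts.
        self.tfidf = TfidfVectorizer().fit(contexts)
        self.context_vecs = self.tfidf.transform(contexts)
        # Stage-2 index: topic proximity via LDA over the same contexts.
        self.counts = CountVectorizer().fit(contexts)
        self.lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
        self.topic_vecs = self.lda.fit_transform(self.counts.transform(contexts))

    def retrieve(self, query, coarse_k=50):
        # Stage 1: retrieve a small candidate subset by text similarity.
        q_tfidf = self.tfidf.transform([query])
        sims = cosine_similarity(q_tfidf, self.context_vecs).ravel()
        candidates = np.argsort(sims)[::-1][:coarse_k]
        # Stage 2: re-rank the candidates by topical proximity to the query.
        q_topic = self.lda.transform(self.counts.transform([query]))
        topic_sims = cosine_similarity(q_topic, self.topic_vecs[candidates]).ravel()
        best = candidates[int(np.argmax(topic_sims))]
        return self.responses[best]   # best-matched exemplar response
```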

