scholarly journals AUTOMAT[R]IX: learning simple matrix pipelines

2021 ◽  
Author(s):  
Lidia Contreras-Ochando ◽  
Cèsar Ferri ◽  
José Hernández-Orallo

AbstractMatrices are a very common way of representing and working with data in data science and artificial intelligence. Writing a small snippet of code to make a simple matrix transformation is frequently frustrating, especially for those people without an extensive programming expertise. We present AUTOMATIX, a system that is able to induce R program snippets from a single (and possibly partial) matrix transformation example provided by the user. Our learning algorithm is able to induce the correct matrix pipeline snippet by composing primitives from a library. Because of the intractable search space—exponential on the size of the library and the number of primitives to be combined in the snippet, we speed up the process with (1) a typed system that excludes all combinations of primitives with inconsistent mapping between input and output matrix dimensions, and (2) a probabilistic model to estimate the probability of each sequence of primitives from their frequency of use and a text hint provided by the user. We validate AUTOMATIX with a set of real programming queries involving matrices from Stack Overflow, showing that we can learn the transformations efficiently, from just one partial example.

2020 ◽  
Vol 36 (Supplement_2) ◽  
pp. i831-i839
Author(s):  
Dong-gi Lee ◽  
Myungjun Kim ◽  
Sang Joon Son ◽  
Chang Hyung Hong ◽  
Hyunjung Shin

Abstract Motivation Recently, various approaches for diagnosing and treating dementia have received significant attention, especially in identifying key genes that are crucial for dementia. If the mutations of such key genes could be tracked, it would be possible to predict the time of onset of dementia and significantly aid in developing drugs to treat dementia. However, gene finding involves tremendous cost, time and effort. To alleviate these problems, research on utilizing computational biology to decrease the search space of candidate genes is actively conducted. In this study, we propose a framework in which diseases, genes and single-nucleotide polymorphisms are represented by a layered network, and key genes are predicted by a machine learning algorithm. The algorithm utilizes a network-based semi-supervised learning model that can be applied to layered data structures. Results The proposed method was applied to a dataset extracted from public databases related to diseases and genes with data collected from 186 patients. A portion of key genes obtained using the proposed method was verified in silico through PubMed literature, and the remaining genes were left as possible candidate genes. Availability and implementation The code for the framework will be available at http://www.alphaminers.net/. Supplementary information Supplementary data are available at Bioinformatics online.


2014 ◽  
Vol 665 ◽  
pp. 643-646
Author(s):  
Ying Liu ◽  
Yan Ye ◽  
Chun Guang Li

Metalearning algorithm learns the base learning algorithm, targeted for improving the performance of the learning system. The incremental delta-bar-delta (IDBD) algorithm is such a metalearning algorithm. On the other hand, sparse algorithms are gaining popularity due to their good performance and wide applications. In this paper, we propose a sparse IDBD algorithm by taking the sparsity of the systems into account. Thenorm penalty is contained in the cost function of the standard IDBD, which is equivalent to adding a zero attractor in the iterations, thus can speed up convergence if the system of interest is indeed sparse. Simulations demonstrate that the proposed algorithm is superior to the competing algorithms in sparse system identification.


2012 ◽  
Vol 49 (2) ◽  
pp. 285-327 ◽  
Author(s):  
RUI P. CHAVES

Subject phrases impose particularly strong constraints on extraction. Most research assumes a syntactic account (e.g. Kayne 1983, Chomsky 1986, Rizzi 1990, Lasnik & Saito 1992, Takahashi 1994, Uriagereka 1999), but there are also pragmatic accounts (Erteschik-Shir & Lappin 1979; Van Valin 1986, 1995; Erteschik-Shir 2006, 2007) as well as performance-based approaches (Kluender 2004). In this work I argue that none of these accounts captures the full range of empirical facts, and show that subject and adjunct phrases (phrasal or clausal, finite or otherwise) are by no means impermeable to non-parasitic extraction of nominal, prepositional and adverbial phrases. The present empirical reassessment indicates that the phenomena involving subject and adjunct islands defies the formulation of a general grammatical account. Drawing from insights by Engdahl (1983) and Kluender (2004), I argue that subject island effects have a functional explanation. Independently motivated pragmatic and processing limitations cause subject-internal gaps to be heavily dispreferred, and therefore, extremely infrequent. In turn, this has led to heuristic parsing expectations that preempt subject-internal gaps and therefore speed up processing by pruning the search space of filler–gap dependencies. Such expectations cause processing problems when violated, unless they are dampened by prosodic and pragmatic cues that boost the construction of the correct parse. This account predicts subject islands and their (non-)parasitic exceptions.


2017 ◽  
Author(s):  
Eric Schulz ◽  
Charley M. Wu ◽  
Quentin J. M. Huys ◽  
Andreas Krause ◽  
Maarten Speekenbrink

AbstractHow do people pursue rewards in risky environments, where some outcomes should be avoided at all costs? We investigate how participant search for spatially correlated rewards in scenarios where one must avoid sampling rewards below a given threshold. This requires not only the balancing of exploration and exploitation, but also reasoning about how to avoid potentially risky areas of the search space. Within risky versions of the spatially correlated multi-armed bandit task, we show that participants’ behavior is aligned well with a Gaussian process function learning algorithm, which chooses points based on a safe optimization routine. Moreover, using leave-one-block-out cross-validation, we find that participants adapt their sampling behavior to the riskiness of the task, although the underlying function learning mechanism remains relatively unchanged. These results show that participants can adapt their search behavior to the adversity of the environment and enrich our understanding of adaptive behavior in the face of risk and uncertainty.


Author(s):  
Karem A. Sakallah

Symmetry is at once a familiar concept (we recognize it when we see it!) and a profoundly deep mathematical subject. At its most basic, a symmetry is some transformation of an object that leaves the object (or some aspect of the object) unchanged. For example, a square can be transformed in eight different ways that leave it looking exactly the same: the identity “do-nothing” transformation, 3 rotations, and 4 mirror images (or reflections). In the context of decision problems, the presence of symmetries in a problem’s search space can frustrate the hunt for a solution by forcing a search algorithm to fruitlessly explore symmetric subspaces that do not contain solutions. Recognizing that such symmetries exist, we can direct a search algorithm to look for solutions only in non-symmetric parts of the search space. In many cases, this can lead to significant pruning of the search space and yield solutions to problems which are otherwise intractable. This chapter explores the symmetries of Boolean functions, particularly the symmetries of their conjunctive normal form (CNF) representations. Specifically, it examines what those symmetries are, how to model them using the mathematical language of group theory, how to derive them from a CNF formula, and how to utilize them to speed up CNF SAT solvers.


Author(s):  
Otokar Grošek ◽  
Pavol Zajac

Classical ciphers are used to encrypt plaintext messages written in a natural language in such a way that they are readable for sender or intended recipient only. Many classical ciphers can be broken by brute-force search through the key-space. Methods of artificial intelligence, such as optimization heuristics, can be used to narrow the search space, to speed-up text processing and text recognition in the cryptanalytic process. Here we present a broad overview of different AI techniques usable in cryptanalysis of classical ciphers. Specific methods to effectively recognize the correctly decrypted text among many possible decrypts are discussed in the next part Automated cryptanalysis – Language processing.


1994 ◽  
Vol 05 (01) ◽  
pp. 67-75 ◽  
Author(s):  
BYOUNG-TAK ZHANG

Much previous work on training multilayer neural networks has attempted to speed up the backpropagation algorithm using more sophisticated weight modification rules, whereby all the given training examples are used in a random or predetermined sequence. In this paper we investigate an alternative approach in which the learning proceeds on an increasing number of selected training examples, starting with a small training set. We derive a measure of criticality of examples and present an incremental learning algorithm that uses this measure to select a critical subset of given examples for solving the particular task. Our experimental results suggest that the method can significantly improve training speed and generalization performance in many real applications of neural networks. This method can be used in conjunction with other variations of gradient descent algorithms.


2020 ◽  
Author(s):  
Fulei Ji ◽  
Wentao Zhang ◽  
Tianyou Ding

Abstract Automatic search methods have been widely used for cryptanalysis of block ciphers, especially for the most classic cryptanalysis methods—differential and linear cryptanalysis. However, the automatic search methods, no matter based on MILP, SMT/SAT or CP techniques, can be inefficient when the search space is too large. In this paper, we propose three new methods to improve Matsui’s branch-and-bound search algorithm, which is known as the first generic algorithm for finding the best differential and linear trails. The three methods, named reconstructing DDT and LAT according to weight, executing linear layer operations in minimal cost and merging two 4-bit S-boxes into one 8-bit S-box, respectively, can efficiently speed up the search process by reducing the search space as much as possible and reducing the cost of executing linear layer operations. We apply our improved algorithm to DESL and GIFT, which are still the hard instances for the automatic search methods. As a result, we find the best differential trails for DESL (up to 14-round) and GIFT-128 (up to 19-round). The best linear trails for DESL (up to 16-round), GIFT-128 (up to 10-round) and GIFT-64 (up to 15-round) are also found. To the best of our knowledge, these security bounds for DESL and GIFT under single-key scenario are given for the first time. Meanwhile, it is the longest exploitable (differential or linear) trails for DESL and GIFT. Furthermore, benefiting from the efficiency of the improved algorithm, we do experiments to demonstrate that the clustering effect of differential trails for 13-round DES and DESL are both weak.


Molecules ◽  
2018 ◽  
Vol 23 (7) ◽  
pp. 1729
Author(s):  
Yinghan Hong ◽  
Zhifeng Hao ◽  
Guizhen Mai ◽  
Han Huang ◽  
Arun Kumar Sangaiah

Exploring and detecting the causal relations among variables have shown huge practical values in recent years, with numerous opportunities for scientific discovery, and have been commonly seen as the core of data science. Among all possible causal discovery methods, causal discovery based on a constraint approach could recover the causal structures from passive observational data in general cases, and had shown extensive prospects in numerous real world applications. However, when the graph was sufficiently large, it did not work well. To alleviate this problem, an improved causal structure learning algorithm named brain storm optimization (BSO), is presented in this paper, combining K2 with brain storm optimization (K2-BSO). Here BSO is used to search optimal topological order of nodes instead of graph space. This paper assumes that dataset is generated by conforming to a causal diagram in which each variable is generated from its parent based on a causal mechanism. We designed an elaborate distance function for clustering step in BSO according to the mechanism of K2. The graph space therefore was reduced to a smaller topological order space and the order space can be further reduced by an efficient clustering method. The experimental results on various real-world datasets showed our methods outperformed the traditional search and score methods and the state-of-the-art genetic algorithm-based methods.


Sign in / Sign up

Export Citation Format

Share Document