Learning Implicitly with Noisy Data in Linear Arithmetic

Author(s):  
Alexander Rader ◽  
Ionela G Mocanu ◽  
Vaishak Belle ◽  
Brendan Juba

Robust learning in expressive languages with real-world data continues to be a challenging task. Numerous conventional methods appeal to heuristics without any assurances of robustness. While probably approximately correct (PAC) semantics offers strong guarantees, learning explicit representations is not tractable, even in propositional logic. However, recent work on so-called “implicit” learning has shown tremendous promise in terms of obtaining polynomial-time results for fragments of first-order logic. In this work, we extend implicit learning in PAC semantics to handle noisy data in the form of intervals and threshold uncertainty in the language of linear arithmetic. We prove that our extended framework retains the existing polynomial-time complexity guarantees. Furthermore, we provide the first empirical investigation of this hitherto purely theoretical framework. Using benchmark problems, we show that our implicit approach to learning optimal linear programming objective constraints significantly outperforms an explicit approach in practice.
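
A minimal sketch of the implicit-learning idea behind this abstract, under the assumption that noisy observations arrive as per-variable intervals: rather than inducing explicit constraints, each linear-arithmetic query is tested directly against every sample (plus any known background constraints) by a small LP optimisation, and accepted if enough samples entail it. The function names `entails` and `implicit_query`, and the 1 − ε acceptance rule, are illustrative assumptions, not the authors' exact algorithm.

```python
# Hedged sketch: answer "does c.x <= b follow?" directly from noisy
# interval samples plus background constraints A x <= d, without ever
# learning an explicit constraint representation.
import numpy as np
from scipy.optimize import linprog

def entails(c, b, lo, hi, A=None, d=None):
    """True iff c.x <= b holds for every x with lo <= x <= hi and A x <= d."""
    # Maximise c.x over the region; the query is entailed iff the max is <= b.
    res = linprog(-np.asarray(c), A_ub=A, b_ub=d,
                  bounds=list(zip(lo, hi)), method="highs")
    if not res.success:               # an infeasible region entails everything
        return res.status == 2
    return -res.fun <= b + 1e-9

def implicit_query(samples, c, b, eps=0.1, A=None, d=None):
    """Accept the query if it is entailed by at least a (1 - eps) fraction of samples."""
    hits = sum(entails(c, b, lo, hi, A, d) for lo, hi in samples)
    return hits >= (1.0 - eps) * len(samples)

# Two noisy interval observations of (x1, x2); query: x1 + x2 <= 3 ?
samples = [([0.0, 0.0], [1.0, 1.5]), ([0.2, 0.1], [1.2, 1.6])]
print(implicit_query(samples, c=[1.0, 1.0], b=3.0))   # True
```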

2021 ◽  
Author(s):  
Jaspreet Kaur Bassan

This work proposes a technique for classifying unlabelled streaming data using grammar-based immune programming, a hybrid meta-heuristic in which the space of grammar-generated solutions is searched by an algorithm inspired by artificial immune systems. Data are labelled using an active learning technique and buffered until the system has trained adequately on the labelled data. The system is employed in both static and streaming data environments, and is tested and evaluated on synthetic and real-world data. Its performance in the different data settings is compared across settings and against two benchmark problems. The proposed classification system adapted well to the changing nature of streaming data, and the active learning technique made the process less computationally expensive by retaining only those instances that favoured the training process.
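
A rough sketch of the active-learning buffering loop described above, with the grammar-based immune programming searcher abstracted behind an ordinary scikit-learn classifier; the margin threshold, buffer size, and `oracle_label` stand-in are illustrative assumptions, not details from the work itself.

```python
# Stream instances arrive unlabelled; only low-confidence instances are sent
# to the labeller and buffered, and the model is retrained when the buffer fills.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
model = SGDClassifier(random_state=0)
classes = np.array([0, 1])
buffer_X, buffer_y = [], []
trained = False

def oracle_label(x):                       # stand-in for the true/human labeller
    return int(x.sum() > 1.0)

for _ in range(2000):                      # simulated unlabelled stream
    x = rng.random(2)
    if trained:
        margin = abs(model.decision_function(x.reshape(1, -1))[0])
        query = margin < 0.5               # query only low-confidence instances
    else:
        query = True                       # label everything until the first fit
    if query:
        buffer_X.append(x)
        buffer_y.append(oracle_label(x))
    if len(buffer_X) >= 50:                # retrain once the buffer is full
        model.partial_fit(np.array(buffer_X), np.array(buffer_y), classes=classes)
        buffer_X, buffer_y = [], []
        trained = True
```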


2020 ◽  
Vol 34 (04) ◽  
pp. 6853-6860
Author(s):  
Xuchao Zhang ◽  
Xian Wu ◽  
Fanglan Chen ◽  
Liang Zhao ◽  
Chang-Tien Lu

The success of training accurate models strongly depends on the availability of a sufficient collection of precisely labeled data. However, real-world datasets contain erroneously labeled samples that substantially hinder the performance of machine learning models. Meanwhile, well-labeled data is usually expensive to obtain, and only a limited amount is available for training. In this paper, we consider the problem of training a robust model using large-scale noisy data in conjunction with a small set of clean data. To leverage the information contained in the clean labels, we propose a novel self-paced robust learning algorithm (SPRL) that trains the model on a progression from more reliable (clean) data instances to less reliable (noisy) ones under the supervision of well-labeled data. The self-paced learning process hedges the risk of selecting corrupted data into the training set. Moreover, theoretical analyses of the convergence of the proposed algorithm are provided under mild assumptions. Extensive experiments on synthetic and real-world datasets demonstrate that our proposed approach achieves a considerable improvement in effectiveness and robustness over existing methods.
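
The self-paced selection idea can be pictured with an ordinary regressor. The sketch below is a simplified stand-in for SPRL, assuming a squared loss, a Ridge model, and a hand-tuned threshold schedule (`lam`, `growth`); none of these specifics come from the paper.

```python
# Start from a model fit on the small clean set, then repeatedly admit only
# the noisy samples whose current loss is below a threshold, refit on
# clean + admitted samples, and relax the threshold so harder samples enter later.
import numpy as np
from sklearn.linear_model import Ridge

def self_paced_fit(X_clean, y_clean, X_noisy, y_noisy,
                   lam=0.5, growth=2.0, rounds=5):
    model = Ridge().fit(X_clean, y_clean)           # warm start on clean data
    for _ in range(rounds):
        losses = (model.predict(X_noisy) - y_noisy) ** 2
        keep = losses < lam                          # easy (reliable) samples first
        X = np.vstack([X_clean, X_noisy[keep]])
        y = np.concatenate([y_clean, y_noisy[keep]])
        model = Ridge().fit(X, y)
        lam *= growth                                # let harder samples in later
    return model
```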


2017 ◽  
Vol 59 ◽  
pp. 133-173 ◽  
Author(s):  
Robert Bredereck ◽  
Jiehua Chen ◽  
Rolf Niedermeier ◽  
Toby Walsh

We study computational problems for two popular parliamentary voting procedures: the amendment procedure and the successive procedure. Both work in multiple stages, where the result of each stage may influence the result of the next, and both proceed according to a given linear order of the alternatives, the agenda. We obtain the following results for both voting procedures. On the one hand, deciding whether one can make a specific alternative win by having the fewest possible voters report insincere preferences (the Manipulation problem), or whether there is a suitable ordering of the agenda (the Agenda Control problem), takes polynomial time. On the other hand, our experimental studies with real-world data indicate that most preference profiles cannot be manipulated by only a few voters, and successful agenda control is typically impossible. If the voters' preferences are incomplete, then deciding whether an alternative can possibly win is NP-hard for both procedures. Whilst deciding whether an alternative necessarily wins is coNP-hard for the amendment procedure, it is polynomial-time solvable for the successive procedure.
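
For concreteness, here are sincere-voting reference implementations of the two procedures, under the usual modelling assumption that each voter's vote at every stage follows their complete preference ranking; the function names and the small Condorcet-cycle example are illustrative only.

```python
# Preference lists rank alternatives from best to worst; the agenda fixes the
# order in which alternatives are processed.
def amendment(agenda, prefs):
    """Sequential pairwise majority: the current survivor meets the next alternative."""
    winner = agenda[0]
    for challenger in agenda[1:]:
        votes = sum(p.index(challenger) < p.index(winner) for p in prefs)
        if votes > len(prefs) / 2:
            winner = challenger
    return winner

def successive(agenda, prefs):
    """Vote alternative by alternative; accept the first one that a majority
    prefers to everything still remaining on the agenda."""
    for i, cand in enumerate(agenda[:-1]):
        remaining = agenda[i + 1:]
        votes = sum(all(p.index(cand) < p.index(r) for r in remaining) for p in prefs)
        if votes > len(prefs) / 2:
            return cand
    return agenda[-1]

prefs = [["a", "b", "c"], ["b", "c", "a"], ["c", "a", "b"]]   # a Condorcet cycle
print(amendment(["a", "b", "c"], prefs), successive(["a", "b", "c"], prefs))
```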


2016 ◽  
Vol 55 ◽  
pp. 685-714
Author(s):  
Roderick Sebastiaan De Nijs ◽  
Christian Landsiedel ◽  
Dirk Wollherr ◽  
Martin Buss

This article discusses the quadratization of Markov Logic Networks, which enables efficient approximate MAP computation by means of maximum flows. The procedure relies on a pseudo-Boolean representation of the model and can handle models of any order. The pseudo-Boolean representation can also be used to identify problems that are guaranteed to be solvable in low-order polynomial time. Results on common benchmark problems show that the proposed approach finds optimal assignments for most variables in excellent computational time, together with approximate solutions that match the quality of ILP-based solvers.
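
To make the max-flow connection concrete, the sketch below shows the textbook reduction from MAP inference over a quadratic pseudo-Boolean energy with non-negative (submodular) pairwise penalties to an s-t minimum cut. It is a generic illustration using networkx for readability, not the paper's MLN quadratization pipeline.

```python
# MAP as min-cut for a submodular quadratic pseudo-Boolean energy.
import networkx as nx

def map_via_mincut(unary, pairwise):
    """unary[i] = (cost if x_i = 0, cost if x_i = 1); pairwise[(i, j)] = w >= 0,
    paid whenever x_i != x_j.  Returns the minimising 0/1 assignment."""
    G = nx.DiGraph()
    for i, (c0, c1) in unary.items():
        G.add_edge("s", i, capacity=c1)   # cut iff i lands on the sink side (x_i = 1)
        G.add_edge(i, "t", capacity=c0)   # cut iff i stays on the source side (x_i = 0)
    for (i, j), w in pairwise.items():
        G.add_edge(i, j, capacity=w)      # cut iff x_i = 0 and x_j = 1
        G.add_edge(j, i, capacity=w)      # cut iff x_i = 1 and x_j = 0
    _, (source_side, sink_side) = nx.minimum_cut(G, "s", "t")
    return {i: int(i in sink_side) for i in unary}

unary = {"x1": (0.0, 2.0), "x2": (1.5, 0.0)}      # x1 prefers 0, x2 prefers 1
pairwise = {("x1", "x2"): 1.0}                    # disagreement costs 1
print(map_via_mincut(unary, pairwise))            # {'x1': 0, 'x2': 1}
```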


2020 ◽  
Vol 34 (04) ◽  
pp. 4428-4435
Author(s):  
Hiroshi Kera ◽  
Yoshihiko Hasegawa

In the last decade, the approximate vanishing ideal and its basis construction algorithms have been extensively studied in computer algebra and machine learning as a general model to reconstruct the algebraic variety on which noisy data approximately lie. In particular, the basis construction algorithms developed in machine learning are widely used in applications across many fields because of their monomial-order-free property; however, they lose many of the theoretical properties of computer-algebraic algorithms. In this paper, we propose general methods that equip monomial-order-free algorithms with several advantageous theoretical properties. Specifically, we exploit the gradient to (i) sidestep the spurious vanishing problem in polynomial time to remove symbolically trivial redundant bases, (ii) achieve consistent output with respect to the translation and scaling of input, and (iii) remove nontrivially redundant bases. The proposed methods work in a fully numerical manner, whereas existing algorithms require the awkward monomial order or exponentially costly (and mostly symbolic) computation to realize properties (i) and (iii). To our knowledge, property (ii) has not been achieved by any existing basis construction algorithm of the approximate vanishing ideal.
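
As a purely numerical illustration of what "approximately vanishing" means (not the authors' gradient-normalised construction), one can evaluate a fixed monomial set on the data and read off near-null right-singular vectors. The degree bound, tolerance rule, and circle example below are illustrative assumptions.

```python
# Find coefficient vectors of degree-<=2 monomials whose evaluation vector is
# nearly zero on every data point.
import numpy as np

def approx_vanishing(X, tol=1e-1):
    x, y = X[:, 0], X[:, 1]
    # Monomial evaluations up to degree 2: 1, x, y, x^2, xy, y^2.
    M = np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])
    _, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Rows of Vt with small singular values give approximately vanishing polynomials.
    return Vt[s < tol * np.sqrt(len(X))]

# Noisy samples from the unit circle x^2 + y^2 - 1 = 0.
t = np.linspace(0, 2 * np.pi, 200)
X = np.column_stack([np.cos(t), np.sin(t)]) + 0.01 * np.random.default_rng(0).normal(size=(200, 2))
print(approx_vanishing(X))   # one row, close to (-1, 0, 0, 1, 0, 1) up to scale
```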

