Learning Implicitly with Noisy Data in Linear Arithmetic

Author(s):  
Alexander Rader ◽  
Ionela G Mocanu ◽  
Vaishak Belle ◽  
Brendan Juba

Robust learning in expressive languages with real-world data continues to be a challenging task. Numerous conventional methods appeal to heuristics without any assurances of robustness. While probably approximately correct (PAC) semantics offers strong guarantees, learning explicit representations is not tractable, even in propositional logic. However, recent work on so-called “implicit” learning has shown tremendous promise in terms of obtaining polynomial-time results for fragments of first-order logic. In this work, we extend implicit learning in PAC semantics to handle noisy data in the form of intervals and threshold uncertainty in the language of linear arithmetic. We prove that our extended framework retains the existing polynomial-time complexity guarantees. Furthermore, we provide the first empirical investigation of this hitherto purely theoretical framework. Using benchmark problems, we show that our implicit approach to learning optimal linear programming objective constraints significantly outperforms an explicit approach in practice.
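
A minimal sketch of the implicit-learning idea behind this abstract, under the assumption that noisy observations arrive as per-variable intervals: rather than inducing explicit constraints, each linear-arithmetic query is tested directly against every sample (plus any known background constraints) by a small LP optimisation, and accepted if enough samples entail it. The function names `entails` and `implicit_query`, and the 1 − ε acceptance rule, are illustrative assumptions, not the authors' exact algorithm.

```python
# Hedged sketch: answer "does c.x <= b follow?" directly from noisy
# interval samples plus background constraints A x <= d, without ever
# learning an explicit constraint representation.
import numpy as np
from scipy.optimize import linprog

def entails(c, b, lo, hi, A=None, d=None):
    """True iff c.x <= b holds for every x with lo <= x <= hi and A x <= d."""
    # Maximise c.x over the region; the query is entailed iff the max is <= b.
    res = linprog(-np.asarray(c), A_ub=A, b_ub=d,
                  bounds=list(zip(lo, hi)), method="highs")
    if not res.success:               # an infeasible region entails everything
        return res.status == 2
    return -res.fun <= b + 1e-9

def implicit_query(samples, c, b, eps=0.1, A=None, d=None):
    """Accept the query if it is entailed by at least a (1 - eps) fraction of samples."""
    hits = sum(entails(c, b, lo, hi, A, d) for lo, hi in samples)
    return hits >= (1.0 - eps) * len(samples)

# Two noisy interval observations of (x1, x2); query: x1 + x2 <= 3 ?
samples = [([0.0, 0.0], [1.0, 1.5]), ([0.2, 0.1], [1.2, 1.6])]
print(implicit_query(samples, c=[1.0, 1.0], b=3.0))   # True
```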

2021 ◽  
Author(s):  
Jaspreet Kaur Bassan

This work proposes a technique for classifying unlabelled streaming data using grammar-based immune programming, a hybrid meta-heuristic in which the space of grammar-generated solutions is searched by an algorithm inspired by artificial immune systems. Data are labelled using an active learning technique and buffered until the system has trained adequately on the labelled data. The system is employed in both static and streaming data environments, and is tested and evaluated on synthetic and real-world data. Its performance in the different data settings is compared across settings and against two benchmark problems. The proposed classification system adapted well to the changing nature of streaming data, and the active learning technique made the process less computationally expensive by retaining only those instances that favoured the training process.
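
A rough sketch of the active-learning buffering loop described above, with the grammar-based immune programming searcher abstracted behind an ordinary scikit-learn classifier; the margin threshold, buffer size, and `oracle_label` stand-in are illustrative assumptions, not details from the work itself.

```python
# Stream instances arrive unlabelled; only low-confidence instances are sent
# to the labeller and buffered, and the model is retrained when the buffer fills.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
model = SGDClassifier(random_state=0)
classes = np.array([0, 1])
buffer_X, buffer_y = [], []
trained = False

def oracle_label(x):                       # stand-in for the true/human labeller
    return int(x.sum() > 1.0)

for _ in range(2000):                      # simulated unlabelled stream
    x = rng.random(2)
    if trained:
        margin = abs(model.decision_function(x.reshape(1, -1))[0])
        query = margin < 0.5               # query only low-confidence instances
    else:
        query = True                       # label everything until the first fit
    if query:
        buffer_X.append(x)
        buffer_y.append(oracle_label(x))
    if len(buffer_X) >= 50:                # retrain once the buffer is full
        model.partial_fit(np.array(buffer_X), np.array(buffer_y), classes=classes)
        buffer_X, buffer_y = [], []
        trained = True
```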


2020 ◽  
Vol 34 (04) ◽  
pp. 6853-6860
Author(s):  
Xuchao Zhang ◽  
Xian Wu ◽  
Fanglan Chen ◽  
Liang Zhao ◽  
Chang-Tien Lu

The success of training accurate models strongly depends on the availability of a sufficient collection of precisely labeled data. However, real-world datasets contain erroneously labeled samples that substantially hinder the performance of machine learning models. Meanwhile, well-labeled data is usually expensive to obtain, and only a limited amount is available for training. In this paper, we consider the problem of training a robust model using large-scale noisy data in conjunction with a small set of clean data. To leverage the information contained in the clean labels, we propose a novel self-paced robust learning algorithm (SPRL) that trains the model on a progression from more reliable (clean) data instances to less reliable (noisy) ones under the supervision of well-labeled data. The self-paced learning process hedges the risk of selecting corrupted data into the training set. Moreover, theoretical analyses of the convergence of the proposed algorithm are provided under mild assumptions. Extensive experiments on synthetic and real-world datasets demonstrate that our proposed approach achieves a considerable improvement in effectiveness and robustness over existing methods.
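
The self-paced selection idea can be pictured with an ordinary regressor. The sketch below is a simplified stand-in for SPRL, assuming a squared loss, a Ridge model, and a hand-tuned threshold schedule (`lam`, `growth`); none of these specifics come from the paper.

```python
# Start from a model fit on the small clean set, then repeatedly admit only
# the noisy samples whose current loss is below a threshold, refit on
# clean + admitted samples, and relax the threshold so harder samples enter later.
import numpy as np
from sklearn.linear_model import Ridge

def self_paced_fit(X_clean, y_clean, X_noisy, y_noisy,
                   lam=0.5, growth=2.0, rounds=5):
    model = Ridge().fit(X_clean, y_clean)           # warm start on clean data
    for _ in range(rounds):
        losses = (model.predict(X_noisy) - y_noisy) ** 2
        keep = losses < lam                          # easy (reliable) samples first
        X = np.vstack([X_clean, X_noisy[keep]])
        y = np.concatenate([y_clean, y_noisy[keep]])
        model = Ridge().fit(X, y)
        lam *= growth                                # let harder samples in later
    return model
```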


2017 ◽  
Vol 59 ◽  
pp. 133-173 ◽  
Author(s):  
Robert Bredereck ◽  
Jiehua Chen ◽  
Rolf Niedermeier ◽  
Toby Walsh

We study computational problems for two popular parliamentary voting procedures: the amendment procedure and the successive procedure. Both work in multiple stages, where the result of each stage may influence the result of the next, and both proceed according to a given linear order of the alternatives, the agenda. We obtain the following results for both voting procedures. On the one hand, deciding whether one can make a specific alternative win by having the fewest possible voters report insincere preferences (the Manipulation problem), or whether there is a suitable ordering of the agenda (the Agenda Control problem), takes polynomial time. On the other hand, our experimental studies with real-world data indicate that most preference profiles cannot be manipulated by only a few voters, and successful agenda control is typically impossible. If the voters' preferences are incomplete, then deciding whether an alternative can possibly win is NP-hard for both procedures. Whilst deciding whether an alternative necessarily wins is coNP-hard for the amendment procedure, it is polynomial-time solvable for the successive procedure.
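
For concreteness, here are sincere-voting reference implementations of the two procedures, under the usual modelling assumption that each voter's vote at every stage follows their complete preference ranking; the function names and the small Condorcet-cycle example are illustrative only.

```python
# Preference lists rank alternatives from best to worst; the agenda fixes the
# order in which alternatives are processed.
def amendment(agenda, prefs):
    """Sequential pairwise majority: the current survivor meets the next alternative."""
    winner = agenda[0]
    for challenger in agenda[1:]:
        votes = sum(p.index(challenger) < p.index(winner) for p in prefs)
        if votes > len(prefs) / 2:
            winner = challenger
    return winner

def successive(agenda, prefs):
    """Vote alternative by alternative; accept the first one that a majority
    prefers to everything still remaining on the agenda."""
    for i, cand in enumerate(agenda[:-1]):
        remaining = agenda[i + 1:]
        votes = sum(all(p.index(cand) < p.index(r) for r in remaining) for p in prefs)
        if votes > len(prefs) / 2:
            return cand
    return agenda[-1]

prefs = [["a", "b", "c"], ["b", "c", "a"], ["c", "a", "b"]]   # a Condorcet cycle
print(amendment(["a", "b", "c"], prefs), successive(["a", "b", "c"], prefs))
```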


2016 ◽  
Vol 55 ◽  
pp. 685-714
Author(s):  
Roderick Sebastiaan De Nijs ◽  
Christian Landsiedel ◽  
Dirk Wollherr ◽  
Martin Buss

This article discusses the quadratization of Markov Logic Networks, which enables efficient approximate MAP computation by means of maximum flows. The procedure relies on a pseudo-Boolean representation of the model and can handle models of any order. The pseudo-Boolean representation can also be used to identify problems that are guaranteed to be solvable in low-order polynomial time. Results on common benchmark problems show that the proposed approach finds optimal assignments for most variables in excellent computational time, together with approximate solutions that match the quality of ILP-based solvers.
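
To make the max-flow connection concrete, the sketch below shows the textbook reduction from MAP inference over a quadratic pseudo-Boolean energy with non-negative (submodular) pairwise penalties to an s-t minimum cut. It is a generic illustration using networkx for readability, not the paper's MLN quadratization pipeline.

```python
# MAP as min-cut for a submodular quadratic pseudo-Boolean energy.
import networkx as nx

def map_via_mincut(unary, pairwise):
    """unary[i] = (cost if x_i = 0, cost if x_i = 1); pairwise[(i, j)] = w >= 0,
    paid whenever x_i != x_j.  Returns the minimising 0/1 assignment."""
    G = nx.DiGraph()
    for i, (c0, c1) in unary.items():
        G.add_edge("s", i, capacity=c1)   # cut iff i lands on the sink side (x_i = 1)
        G.add_edge(i, "t", capacity=c0)   # cut iff i stays on the source side (x_i = 0)
    for (i, j), w in pairwise.items():
        G.add_edge(i, j, capacity=w)      # cut iff x_i = 0 and x_j = 1
        G.add_edge(j, i, capacity=w)      # cut iff x_i = 1 and x_j = 0
    _, (source_side, sink_side) = nx.minimum_cut(G, "s", "t")
    return {i: int(i in sink_side) for i in unary}

unary = {"x1": (0.0, 2.0), "x2": (1.5, 0.0)}      # x1 prefers 0, x2 prefers 1
pairwise = {("x1", "x2"): 1.0}                    # disagreement costs 1
print(map_via_mincut(unary, pairwise))            # {'x1': 0, 'x2': 1}
```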


2020 ◽  
Vol 34 (04) ◽  
pp. 4428-4435
Author(s):  
Hiroshi Kera ◽  
Yoshihiko Hasegawa

In the last decade, the approximate vanishing ideal and its basis construction algorithms have been extensively studied in computer algebra and machine learning as a general model to reconstruct the algebraic variety on which noisy data approximately lie. In particular, the basis construction algorithms developed in machine learning are widely used in applications across many fields because of their monomial-order-free property; however, they lose many of the theoretical properties of computer-algebraic algorithms. In this paper, we propose general methods that equip monomial-order-free algorithms with several advantageous theoretical properties. Specifically, we exploit the gradient to (i) sidestep the spurious vanishing problem in polynomial time to remove symbolically trivial redundant bases, (ii) achieve consistent output with respect to the translation and scaling of input, and (iii) remove nontrivially redundant bases. The proposed methods work in a fully numerical manner, whereas existing algorithms require the awkward monomial order or exponentially costly (and mostly symbolic) computation to realize properties (i) and (iii). To our knowledge, property (ii) has not been achieved by any existing basis construction algorithm of the approximate vanishing ideal.
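
As a purely numerical illustration of what "approximately vanishing" means (not the authors' gradient-normalised construction), one can evaluate a fixed monomial set on the data and read off near-null right-singular vectors. The degree bound, tolerance rule, and circle example below are illustrative assumptions.

```python
# Find coefficient vectors of degree-<=2 monomials whose evaluation vector is
# nearly zero on every data point.
import numpy as np

def approx_vanishing(X, tol=1e-1):
    x, y = X[:, 0], X[:, 1]
    # Monomial evaluations up to degree 2: 1, x, y, x^2, xy, y^2.
    M = np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])
    _, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Rows of Vt with small singular values give approximately vanishing polynomials.
    return Vt[s < tol * np.sqrt(len(X))]

# Noisy samples from the unit circle x^2 + y^2 - 1 = 0.
t = np.linspace(0, 2 * np.pi, 200)
X = np.column_stack([np.cos(t), np.sin(t)]) + 0.01 * np.random.default_rng(0).normal(size=(200, 2))
print(approx_vanishing(X))   # one row, close to (-1, 0, 0, 1, 0, 1) up to scale
```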

