PaLM: Pipelined Architecture to Label Legacy Multispectral Data using Unsupervised Learning Algorithm

Author(s):  
Anitha Modi ◽  
Priyanka Sharma ◽  
Kavita Tewani
2021 ◽  
Vol 14 (11) ◽  
pp. 2445-2458

Author(s):  
Valerio Cetorelli ◽  
Paolo Atzeni ◽  
Valter Crescenzi ◽  
Franco Milicchio

We introduce landmark grammars, a new family of context-free grammars aimed at describing the HTML source code of pages published by large, templated websites, and therefore at effectively tackling Web data extraction problems. Indeed, they address the inherent ambiguity of HTML, one of the main challenges of Web data extraction which, despite over twenty years of research, has been largely neglected by the approaches presented in the literature. We then formalize the Smallest Extraction Problem (SEP), an optimization problem for finding the grammar of a family that best describes a set of pages and contextually extracts their data. Finally, we present an unsupervised learning algorithm to induce a landmark grammar from a set of pages sharing a common HTML template, and we present an automatic Web data extraction system. Experiments on consolidated benchmarks show that the approach can substantially improve on the state of the art.
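As a loose illustration of the template-induction idea (this is not the authors' landmark-grammar algorithm), substrings shared by two pages generated from the same template can act as "landmarks", with the varying spans between them extracted as data. A minimal character-level sketch using Python's difflib, on two hypothetical pages:

```python
from difflib import SequenceMatcher

def extract_fields(page_a: str, page_b: str):
    """Toy template induction: substrings shared by both pages play the
    role of 'landmarks'; the varying spans between them are the data."""
    sm = SequenceMatcher(None, page_a, page_b, autojunk=False)
    fields_a, fields_b = [], []
    prev_a = prev_b = 0
    for a, b, size in sm.get_matching_blocks():
        if page_a[prev_a:a]:          # unmatched span in page A -> data
            fields_a.append(page_a[prev_a:a])
        if page_b[prev_b:b]:          # unmatched span in page B -> data
            fields_b.append(page_b[prev_b:b])
        prev_a, prev_b = a + size, b + size
    return fields_a, fields_b

# Two hypothetical pages sharing one template (illustrative data):
p1 = "<html><b>Title:</b> Alpha <i>Price:</i> 10</html>"
p2 = "<html><b>Title:</b> Beta <i>Price:</i> 25</html>"
fields_a, fields_b = extract_fields(p1, p2)
```

Character-level alignment is crude (shared letters inside two different values can be mistaken for template text, splitting a field); the grammar-based formulation in the abstract is precisely about handling such ambiguity in a principled way.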


2001 ◽  
Vol 27 (3) ◽  
pp. 351-372 ◽  
Author(s):  
Anand Venkataraman

A statistical model for segmentation and word discovery in continuous speech is presented, along with an incremental unsupervised learning algorithm that infers word boundaries based on this model. Results of empirical tests show that the algorithm is competitive with other models that have been used for similar tasks.
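The incremental idea can be sketched as follows: keep a running lexicon of word counts, segment each new utterance by dynamic programming over smoothed word probabilities, and fold the resulting words back into the lexicon before the next utterance. This is a simplified unigram sketch, not the paper's full model; the smoothing constant `alpha` and the word-length cap are arbitrary illustrative choices:

```python
import math
from collections import Counter

def segment(utterance, counts, total, alpha=0.01, max_len=10):
    """Best segmentation by dynamic programming: maximize the sum of
    log word probabilities under add-alpha smoothing."""
    n = len(utterance)
    best = [0.0] + [-math.inf] * n
    back = [0] * (n + 1)
    denom = total + alpha * max(len(counts), 1) * max_len
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):
            w = utterance[i:j]
            score = best[i] + math.log((counts[w] + alpha) / denom)
            if score > best[j]:
                best[j], back[j] = score, i
    words, j = [], n            # recover the segmentation via backpointers
    while j > 0:
        words.append(utterance[back[j]:j])
        j = back[j]
    return words[::-1]

def learn(utterances):
    """Incrementally segment utterances, adding each discovered word
    to the lexicon before the next utterance is processed."""
    counts, total, segmented = Counter(), 0, []
    for u in utterances:
        words = segment(u, counts, total)
        for w in words:
            counts[w] += 1
            total += 1
        segmented.append(words)
    return segmented
```

After seeing "the" and "dog" as isolated utterances, the sketch segments "thedog" into the two known words, since two high-count words outscore a single unseen string.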


2018 ◽  
Vol 299 ◽  
pp. 45-54 ◽  
Author(s):  
Hadeel K. Aljobouri ◽  
Hussain A. Jaber ◽  
Orhan M. Koçak ◽  
Oktay Algin ◽  
Ilyas Çankaya

2020 ◽  
Vol 19 (01) ◽  
pp. 283-316 ◽  
Author(s):  
Luis Morales ◽  
José Aguilar ◽  
Danilo Chávez ◽  
Claudia Isaza

This paper proposes a new approach to improve the performance of the Learning Algorithm for Multivariable Data Analysis (LAMDA). The algorithm can be used for both supervised and unsupervised learning, and is based on computing the Global Adequacy Degree (GAD) of an individual to a class from the contributions of all of its descriptors. LAMDA can create new classes after the training stage: if an individual is not sufficiently similar to the preexisting classes, it is evaluated against a threshold called the Non-Informative Class (NIC), which is the novelty of the algorithm. However, LAMDA can produce poor classifications, either because the NIC is constant for all classes or because the GAD calculation is unreliable. In this work, its efficiency is improved by two strategies: first, by computing an adaptable NIC for each class, which prevents correctly classified individuals from creating new classes; and second, by computing the Higher Adequacy Degree (HAD), which makes the algorithm more robust. LAMDA-HAD is validated on different benchmarks and compared with LAMDA and other classifiers through a statistical analysis to determine the cases in which our algorithm performs better.
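One common formulation of the GAD computation described above uses a fuzzy-binomial marginal adequacy per descriptor, aggregated by a linear mix of min and max; the NIC then corresponds to a class whose descriptor means are all 0.5. A minimal sketch of baseline LAMDA classification (the `alpha` exigency level and the rejection rule are illustrative assumptions; the paper's adaptive NIC and HAD refinements are not included):

```python
import numpy as np

def mad(x, rho):
    """Marginal adequacy of each descriptor (fuzzy-binomial form):
    rho is the class's learned mean for that descriptor, x is in [0, 1]."""
    return rho ** x * (1.0 - rho) ** (1.0 - x)

def gad(x, rho, alpha=0.7):
    """Global adequacy: linear mix of a t-norm (min) and a t-conorm (max)
    over the marginal adequacies; alpha is an illustrative exigency level."""
    m = mad(np.asarray(x, dtype=float), np.asarray(rho, dtype=float))
    return alpha * m.min() + (1.0 - alpha) * m.max()

def classify(x, class_rhos, alpha=0.7):
    """Assign x to the class with the highest GAD; the Non-Informative
    Class (all descriptor means at 0.5) acts as the rejection threshold.
    Returning None signals that a new class should be created."""
    nic = gad(x, [0.5] * len(x), alpha)
    scores = [gad(x, rho, alpha) for rho in class_rhos]
    best = int(np.argmax(scores))
    return best if scores[best] > nic else None
```

Because the NIC's marginal adequacy is 0.5 regardless of the input, it yields the same constant threshold for every class, which is exactly the limitation the paper's per-class adaptable NIC is designed to remove.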

