pattern discovery
Recently Published Documents


Total documents: 795 (five years: 114)

H-index: 37 (five years: 5)

2022 ◽  
Vol 139 ◽  
pp. 102629
Author(s):  
Haoran Wang ◽  
Haiping Zhang ◽  
Shangjing Jiang ◽  
Guoan Tang ◽  
Xueying Zhang ◽  
...  

2021 ◽  
Author(s):  
Leonardo Duarte Rodrigues Alexandre ◽  
Rafael S. Costa ◽  
Rui Henriques

Motivation: Pattern discovery and subspace clustering play a central role in the biological domain, supporting, for instance, the discovery of putative regulatory modules from omic data for both descriptive and predictive ends. In the presence of target variables (e.g. phenotypes), regulatory patterns should further satisfy well-defined discriminative power properties. These properties are well established for categorical outcomes, yet largely disregarded for numerical outcomes such as risk profiles and quantitative phenotypes. Results: DISA (Discriminative and Informative Subspace Assessment), a Python software package, is proposed to assess patterns in the presence of numerical outcomes using well-established measures together with a novel principle that statistically assesses the correlation gain of the subspace against the overall space. Results confirm that discriminative criteria can be soundly extended to numerical outcomes without the drawbacks commonly associated with discretization procedures. A case study is provided to show the properties of the proposed method. Availability: DISA is freely available at https://github.com/JupitersMight/DISA under the MIT license.


2021 ◽  
Vol 27 (4) ◽  
Author(s):  
Aaron Carter-Ényì ◽  
Gilad Rabinovitch

Onset (metric position) and contiguity (pitch adjacency and time proximity) are two melodic features that contribute to the salience of individual notes (core tones) in a monophonic voice or polyphonic texture. Our approach to reductions prioritizes contextual features like onset and contiguity. By awarding points to notes with such features, our process selects core tones from melodic surfaces to produce a reduction. Through this reduction, a new form of musical pattern discovery is possible that has similarities to Gjerdingen’s (2007) galant schemata. Recurring n-grams (scale degree skeletons) are matched in an algorithmic approach that we have tested manually (with a printed score and pen and paper) and implemented computationally (with symbolic data and scripted algorithms in MATLAB). A relatively simple method successfully identifies the location of all statements of the subject in Bach’s Fugue in C Minor (BWV 847) identified by Bruhn (1993) and the location of all instances of the Prinner and Meyer schemata in Mozart’s Sonata in C Major (K. 545/i) identified by Gjerdingen (2007). We also apply the method to an excerpt by Kirnberger analyzed in Rabinovitch (2019). Analysts may use this flexible method for pattern discovery in reduced textures through software freely accessible at https://www.atavizm.org. While our case studies in the present article are from eighteenth-century European music, we believe our approach to reduction and pattern discovery is extensible to a variety of musics.
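
The matching of recurring n-grams described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation: the function name `recurring_ngrams` and the scale-degree sequence are hypothetical, and the reduction step that selects core tones is assumed to have been done already.

```python
from collections import defaultdict


def recurring_ngrams(degrees, n):
    """Return every n-gram that occurs more than once, with its start indices."""
    positions = defaultdict(list)
    for start in range(len(degrees) - n + 1):
        gram = tuple(degrees[start:start + n])
        positions[gram].append(start)
    # Keep only the n-grams that recur at least once.
    return {gram: idx for gram, idx in positions.items() if len(idx) > 1}


# Hypothetical reduced melody written as scale degrees (not data from the article).
core_tones = [1, 7, 1, 6, 5, 4, 3, 1, 7, 1, 6, 5, 4, 3]
patterns = recurring_ngrams(core_tones, 4)

for gram, starts in sorted(patterns.items()):
    print(gram, starts)
```

Exact matching on tuples is the simplest possible choice here; a real analysis would presumably also need to tolerate intervening notes or rhythmic displacement, which this sketch does not attempt.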


2021 ◽  
Author(s):  
Xing Li ◽  
Qiquan Shi ◽  
Gang Hu ◽  
Lei Chen ◽  
Hui Mao ◽  
...  

2021 ◽  
Author(s):  
Peiyuan Zhou ◽  
Andrew K.C. Wong ◽  
Yang Yang ◽  
Scott T. Leatherdale ◽  
Kate Battista ◽  
...  

Abstract Background: COMPASS is a longitudinal, prospective cohort study collecting data annually from students attending high school in jurisdictions across Canada. We aimed to discover significant frequent/rare associations of behavioral factors among Canadian adolescents related to cannabis use. Methods: We used a subset of the COMPASS dataset which contains 18,761 records of students in grades 9 to 12 with 31 selected features (attributes) involving various characteristics, from living habits to academic performance. We then used the Pattern Discovery and Disentanglement (PDD) algorithm to detect strong and rare (yet statistically significant) associations from the dataset. Results: Cohort characteristics, factors associated with cannabis use, and other associations detected by PDD are consistent with common sense and literature surveys. In addition, PDD outperformed methods using other criteria popular in the literature (i.e. support and confidence). Association results showed that PDD could discover: i) a smaller set of succinct significant associations in clusters; ii) frequent and rare, yet significant, patterns supported by population health relevant study; iii) patterns from a dataset with extremely imbalanced groups (majority class (None-user): minority class (Regular) = 88.3%: 11.7%). Conclusions: Results on the COMPASS dataset have validated PDD’s efficacy in discovering succinct interpretable frequent associations with comprehensive coverage and rare yet significant associations from datasets with extremely imbalanced class distribution without relying on any balancing process. The frequent associations are consistent with common sense and literature surveys, while the rare patterns show very special cases. The success of PDD on this project indicates that PDD has great potential for population health data analysis.
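
For readers unfamiliar with the support and confidence criteria that this abstract names as the comparison baseline, a minimal Python sketch follows. It does not implement PDD, and the feature names and records are hypothetical toy values, not COMPASS data.

```python
def support_confidence(records, antecedent, consequent):
    """Compute support and confidence of the rule antecedent -> consequent.

    Each record, and each side of the rule, is a dict of feature -> value.
    """
    # Records that satisfy every feature/value pair in the antecedent.
    has_antecedent = [r for r in records if antecedent.items() <= r.items()]
    # Of those, the records that also satisfy the consequent.
    has_both = [r for r in has_antecedent if consequent.items() <= r.items()]
    support = len(has_both) / len(records)
    confidence = len(has_both) / len(has_antecedent) if has_antecedent else 0.0
    return support, confidence


# Hypothetical toy records (assumed feature names, not the COMPASS attributes).
records = [
    {"skips_class": "yes", "cannabis": "regular"},
    {"skips_class": "yes", "cannabis": "none"},
    {"skips_class": "no", "cannabis": "none"},
    {"skips_class": "no", "cannabis": "none"},
]
support, confidence = support_confidence(
    records, {"skips_class": "yes"}, {"cannabis": "regular"}
)
print(support, confidence)
```

Because support is a fraction of all records, a rule about a minority class can never reach a high support value, which is one reason threshold-based methods tend to miss rare associations in imbalanced data such as the 88.3% to 11.7% split reported above.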

