New CNF Features and Formula Classification

10.29007/b8t1 ◽  
2018 ◽  
Author(s):  
Enrique Alfonso ◽  
Norbert Manthey

In this paper we first present three new features for classifying CNF formulas. These features are based on the structural information of the formula and consider AND-gates as well as exactly-one constraints. Next, we use these features to construct a machine learning approach that selects a SAT solver configuration for CNF formulas with random decision forests. Based on this classification task we show that our new features are useful compared to existing features. Since the computation time for these features is small, the constructed classifier improves the performance of SAT solvers on application and hand-crafted benchmarks. Moreover, the comparison shows that the set of new features also results in a better classification.
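As a rough illustration of what structural features of this kind can look like, the sketch below counts two patterns in a clause list: AND-gates in their usual Tseitin-style encoding and exactly-one constraints in the pairwise encoding. The function names, the DIMACS-style integer literals, and the choice of encodings are assumptions made for this example; the abstract does not say how the authors extract their features.

```python
def _binary_clauses(clauses):
    """Collect all two-literal clauses as frozensets for fast lookup."""
    return {frozenset(c) for c in clauses if len(c) == 2}


def count_and_gates(clauses):
    """Count (clause, output literal) pairs that encode an AND-gate.

    Assumed encoding of x <-> (a1 AND ... AND ak):
      one long clause (x OR -a1 OR ... OR -ak)
      plus binary clauses (-x OR ai) for every input ai.
    """
    binary = _binary_clauses(clauses)
    gates = 0
    for clause in clauses:
        if len(clause) < 3:
            continue
        for out in clause:
            others = [lit for lit in clause if lit != out]
            if all(frozenset((-out, -lit)) in binary for lit in others):
                gates += 1
    return gates


def count_exactly_one(clauses):
    """Count clauses whose literals are also pairwise mutually exclusive.

    Assumed pairwise encoding: (l1 OR ... OR lk) for at-least-one,
    plus (-li OR -lj) for every pair for at-most-one.
    """
    binary = _binary_clauses(clauses)
    found = 0
    for clause in clauses:
        if len(clause) < 3:
            continue
        pairs = [(a, b) for i, a in enumerate(clause) for b in clause[i + 1:]]
        if all(frozenset((-a, -b)) in binary for a, b in pairs):
            found += 1
    return found


# Hypothetical toy formulas (not taken from any benchmark).
and_gate_cnf = [[-3, 1], [-3, 2], [3, -1, -2]]            # 3 <-> (1 AND 2)
exactly_one_cnf = [[1, 2, 3], [-1, -2], [-1, -3], [-2, -3]]
```

Counts like these could then be placed in a feature vector alongside existing features and passed to a classifier such as a random decision forest.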

2016 ◽  
Vol 11 (4) ◽  
pp. 791-799 ◽  
Author(s):  
Rina Kagawa ◽  
Yoshimasa Kawazoe ◽  
Yusuke Ida ◽  
Emiko Shinohara ◽  
Katsuya Tanaka ◽  
...  

Background: Phenotyping is an automated technique that can be used to distinguish patients based on electronic health records. To improve the quality of medical care and advance type 2 diabetes mellitus (T2DM) research, the demand for T2DM phenotyping has been increasing. Some existing phenotyping algorithms are not sufficiently accurate for screening or identifying clinical research subjects. Objective: We propose a practical phenotyping framework using both expert knowledge and a machine learning approach to develop 2 phenotyping algorithms: one is for screening; the other is for identifying research subjects. Methods: We employ expert knowledge as rules to exclude obvious control patients and machine learning to increase accuracy for complicated patients. We developed phenotyping algorithms on the basis of our framework and performed binary classification to determine whether a patient has T2DM. To facilitate development of practical phenotyping algorithms, this study introduces new evaluation metrics: area under the precision-sensitivity curve (AUPS) with a high sensitivity and AUPS with a high positive predictive value. Results: The proposed phenotyping algorithms based on our framework show higher performance than baseline algorithms. Our proposed framework can be used to develop 2 types of phenotyping algorithms depending on the tuning approach: one for screening, the other for identifying research subjects. Conclusions: We develop a novel phenotyping framework that can be easily implemented on the basis of proper evaluation metrics, which are in accordance with users’ objectives. The phenotyping algorithms based on our framework are useful for extraction of T2DM patients in retrospective studies.
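The trade-off between a screening-oriented and a research-oriented algorithm can be sketched by computing precision (positive predictive value) and sensitivity at different decision thresholds. This is only a minimal illustration: the function name, the toy labels and scores, and the thresholds are hypothetical, and the sketch does not reproduce the AUPS metrics defined in the paper.

```python
def precision_sensitivity(labels, scores, threshold):
    """Return (precision, sensitivity) when scores >= threshold count as positive.

    labels: 1 for a patient with the condition, 0 for a control.
    scores: classifier scores, higher meaning more likely positive.
    """
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    precision = tp / (tp + fp) if tp + fp else 1.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    return precision, sensitivity


# Hypothetical toy data, for illustration only.
labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]

# A low threshold favours sensitivity (screening use case).
screening = precision_sensitivity(labels, scores, threshold=0.3)

# A high threshold favours precision (selecting research subjects).
research = precision_sensitivity(labels, scores, threshold=0.75)
```

Sweeping the threshold over all observed scores yields the points of a precision-sensitivity curve, from which an area can be computed over the region of interest.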


2018 ◽  
Vol 11 (1) ◽  
pp. 34
Author(s):  
Alfan Farizki Wicaksono ◽  
Sharon Raissa Herdiyana ◽  
Mirna Adriani

A person's understanding of and stance on a particular controversial topic can be influenced by the news or articles they read every day. Unfortunately, readers usually do not realize that they are reading controversial articles. In this paper, we address the problem of automatically detecting controversial articles from citizen journalism media. To solve the problem, we employ a supervised machine learning approach with several hand-crafted features that exploit linguistic information, meta-data of an article, structural information in the commentary section, and sentiment expressed inside the body of an article. The experimental results show that our proposed method performs the addressed task effectively. The best performance so far is achieved when we use all proposed features with Logistic Regression as our model (82.89% accuracy). Moreover, we found that information from the commentary section (structural features) contributes most to the classification task.


2019 ◽  
Vol 62 ◽  
pp. 15-19 ◽  
Author(s):  
Birgit Ludwig ◽  
Daniel König ◽  
Nestor D. Kapusta ◽  
Victor Blüml ◽  
Georg Dorffner ◽  
...  

Methods of suicide have received considerable attention in suicide research. The common approach to differentiating methods of suicide is the classification into “violent” versus “non-violent” methods. Interestingly, since this dichotomous differentiation was proposed, no further efforts have been made to question the validity of such a classification of suicides. This study aimed to challenge the traditional separation into “violent” and “non-violent” suicides by generating a cluster analysis with a data-driven, machine learning approach. In a retrospective analysis, data on all officially confirmed suicides (N = 77,894) in Austria between 1970 and 2016 were assessed. Based on a defined distance metric between distributions of suicides over age group and month of the year, a standard hierarchical clustering method was performed with the five most frequent suicide methods. In cluster analysis, poisoning emerged as distinct from all other methods, both in the entire sample and in the male subsample. Violent suicides could be further divided into sub-clusters: hanging, shooting, and drowning on the one hand and jumping on the other. In the female sample, two different clusters were revealed: hanging and drowning on the one hand and jumping, poisoning, and shooting on the other. Our data-driven results in this large epidemiological study confirmed the traditional dichotomization of suicide methods into “violent” and “non-violent” methods, but on closer inspection “violent methods” can be further divided into sub-clusters, and a different cluster pattern could be identified for women, requiring further research to support these refined suicide phenotypes.


Author(s):  
Ranjan Raj Aryal ◽  
Ankit Bhattarai

Social media is one platform where people share their opinions and views on different topics, services, or behaviors that happen around them. Since the COVID19 pandemic started at the end of 2019, it has been a topic on which people express their sentiments. Recently, the COVID19 vaccination programs have received a lot of responses. In this paper, we propose two models, one based on a machine learning approach (Naive Bayes) and the other based on deep learning (LSTM), whose goal is to determine the sentiment of tweets from the Asian region towards the vaccine through sentiment analysis. The data were extracted with the help of the Twitter API from March 23, 2021, to April 2, 2021. The extraction approach uses keywords with geocoding of some Asian countries, specifically Nepal, India and Singapore. After the data were collected, preprocessing was applied: removing numbers, non-English words, stop words, special characters, and hyperlinks. The polarity of tweets was assigned using the TextBlob library. The tweets were classified into one of three classes: positive, negative, or neutral. The data were then split into training and testing sets. Both models were trained and tested using 10,767 unique tweets. This experiment shows that a number of people in these three countries (Nepal, India and Singapore) have positive sentiment towards the vaccine and are taking the first dose of the Covid19 vaccine. Finally, the accuracy of the LSTM model was found to be 7% greater than that of the Naive Bayes-based model.
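For readers unfamiliar with the Naive Bayes side of this comparison, a minimal multinomial Naive Bayes classifier with add-one smoothing can be sketched in plain Python. This is an illustrative assumption, not the authors' implementation: the function names and the toy tweets are hypothetical, and the sketch leaves out the preprocessing and the TextBlob labelling described above.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs):
    """Fit class and word counts from a list of (tokens, label) pairs."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, tokens):
    """Return the label with the highest log-posterior for the tokens."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)          # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue                               # ignore unseen words
            # Add-one (Laplace) smoothed word likelihood.
            likelihood = (word_counts[label][tok] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy tweets, already tokenized; not from the study's data.
training_data = [
    ("vaccine safe good".split(), "positive"),
    ("got first dose happy".split(), "positive"),
    ("vaccine bad side effects".split(), "negative"),
    ("scared bad reaction".split(), "negative"),
]
model = train_naive_bayes(training_data)
```

Calling `predict(model, "good dose".split())` on this toy model selects the label whose training tweets share the most vocabulary with the input.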


2019 ◽  
Vol 25 (2) ◽  
pp. 145-167 ◽  
Author(s):  
Nicholas Guttenberg ◽  
Nathaniel Virgo ◽  
Alexandra Penn

Natural evolution gives the impression of leading to an open-ended process of increasing diversity and complexity. If our goal is to produce such open-endedness artificially, this suggests an approach driven by evolutionary metaphor. On the other hand, techniques from machine learning and artificial intelligence are often considered too narrow to provide the sort of exploratory dynamics associated with evolution. In this article, we hope to bridge that gap by reviewing common barriers to open-endedness in the evolution-inspired approach and how they are dealt with in the evolutionary case—collapse of diversity, saturation of complexity, and failure to form new kinds of individuality. We then show how these problems map onto similar ones in the machine learning approach, and discuss how the same insights and solutions that alleviated those barriers in evolutionary approaches can be ported over. At the same time, the form these issues take in the machine learning formulation suggests new ways to analyze and resolve barriers to open-endedness. Ultimately, we hope to inspire researchers to be able to interchangeably use evolutionary and gradient-descent-based machine learning methods to approach the design and creation of open-ended systems.


2011 ◽  
Vol 95 (1) ◽  
pp. 33-50 ◽  
Author(s):  
Kamlesh Dutta ◽  
Saroj Kaushik ◽  
Nupur Prakash

Machine Learning Approach for the Classification of Demonstrative Pronouns for Indirect Anaphora in Hindi News Items

In this paper, we present a machine learning approach for the classification of indirect anaphora in a Hindi corpus. A direct anaphor has a noun phrase antecedent that can be found within a sentence or across a few sentences. An indirect anaphor, on the other hand, does not have an explicit referent in the discourse. We suggest looking for certain patterns following the indirect anaphor and marking the demonstrative pronoun as directly or indirectly anaphoric accordingly. Our focus of study is pronouns without a noun phrase antecedent. We analyzed 177 news items containing 1334 sentences and 780 demonstrative pronouns, of which 97 (12.44%) were indirectly anaphoric. We also carried out an experiment with machine learning approaches for the classification of these pronouns, based on the semantic cue provided by the collocation patterns following the pronoun.

