A Machine Learning Approach to Coreference Resolution of Noun Phrases

2001 ◽  
Vol 27 (4) ◽  
pp. 521-544 ◽  
Author(s):  
Wee Meng Soon ◽  
Hwee Tou Ng ◽  
Daniel Chung Yong Lim

In this paper, we present a learning approach to coreference resolution of noun phrases in unrestricted text. The approach learns from a small, annotated corpus and the task includes resolving not just a certain type of noun phrase (e.g., pronouns) but rather general noun phrases. It also does not restrict the entity types of the noun phrases; that is, coreference is assigned whether they are of “organization,” “person,” or other types. We evaluate our approach on common data sets (namely, the MUC-6 and MUC-7 coreference corpora) and obtain encouraging results, indicating that on the general noun phrase coreference task, the learning approach holds promise and achieves accuracy comparable to that of nonlearning approaches. Our system is the first learning-based system that offers performance comparable to that of state-of-the-art nonlearning systems on these data sets.

Author(s):  
Tomer Raviv ◽  
Asaf Schwartz ◽  
Yair Be'ery

Tail-biting convolutional codes extend the classical zero-termination convolutional codes: both encoding schemes force the equality of start and end states, but under tail-biting each state is a valid termination. This paper proposes a machine-learning approach to improve the state-of-the-art decoding of tail-biting codes, focusing on the widely employed short-length regime, as in the LTE standard. This standard also includes a CRC code. First, we parameterize the circular Viterbi algorithm (CVA), a baseline decoder that exploits the circular nature of the underlying trellis. An ensemble combines multiple such weighted decoders; each decoder specializes in decoding words from a specific region of the channel words' distribution. A region corresponds to a subset of termination states; the ensemble covers the entire state space. A non-learnable gating satisfies two goals: it filters easily decoded words and mitigates the overhead of executing multiple weighted decoders. The CRC criterion is employed to choose only a subset of experts for decoding. Our method achieves an FER improvement of up to 0.75 dB over the CVA in the waterfall region for multiple code lengths, adding negligible computational complexity compared to the CVA at high SNRs.
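The tail-biting property described in this abstract (equal start and end states, with any state allowed) can be illustrated with a minimal Python sketch. It assumes a rate-1/2 feedforward encoder with generators 7 and 5 (octal) and memory 2; this small code, the function name `tail_biting_encode`, and the example message are illustrative assumptions and are not taken from the paper or from the LTE standard. The shift register is preloaded with the last message bits, so the encoder ends in the state it started from.

```python
def tail_biting_encode(bits, generators=(0b111, 0b101), memory=2):
    """Encode `bits` with a feedforward convolutional code, tail-biting style.

    Returns (coded_bits, start_state, end_state). The state stores the
    `memory` most recent input bits, newest bit in the highest position.
    The default generators (7, 5 octal) are an illustrative choice.
    """
    if len(bits) < memory:
        raise ValueError("message must contain at least `memory` bits")

    # Preload the register with the last `memory` message bits so that
    # the encoder starts in the state it will also end in.
    state = 0
    for b in bits[-memory:]:
        state = (state >> 1) | (b << (memory - 1))
    start_state = state

    coded = []
    for b in bits:
        # Register contents: the current input bit, then the stored bits.
        register = (b << memory) | state
        for g in generators:
            # Each output bit is the parity of the taps selected by g.
            coded.append(bin(register & g).count("1") % 2)
        # Shift the current bit into the state, dropping the oldest bit.
        state = register >> 1
    return coded, start_state, state


message = [1, 0, 1, 1, 0, 0, 1]  # made-up example message
coded, start_state, end_state = tail_biting_encode(message)
```

Because the start state depends on the message, a decoder cannot assume the all-zero state as it can with zero termination, which is why decoders such as the circular Viterbi algorithm are used for these codes.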


2021 ◽  
Author(s):  
Diti Roy ◽  
Md. Ashiq Mahmood ◽  
Tamal Joyti Roy

Heart disease is the most dominant disease, causing a large number of deaths every year. A 2016 WHO report stated that at least 17 million people die of heart disease every year. This number is gradually increasing, and WHO estimated that the death toll will reach 75 million by 2030. Despite modern technology and health care systems, predicting heart disease remains difficult. Because machine learning algorithms can make predictions from available data sets, we used a machine learning approach to predict heart disease. We collected data from the UCI repository. In our study, we used the Random Forest, Zero R, Voted Perceptron, and K Star classifiers. The best result was obtained with the Random Forest classifier, with an accuracy of 97.69.
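Of the classifiers listed, Zero R (often written ZeroR) is the simplest: it ignores all features and always predicts the most frequent training label, which makes it a common baseline against which other accuracies are judged. The following minimal Python sketch shows that idea; the function names and the label lists are made up for illustration and are not the UCI data or the authors' code.

```python
from collections import Counter


def zero_r_fit(labels):
    """Return the most frequent training label (the whole Zero R 'model')."""
    if not labels:
        raise ValueError("need at least one training label")
    return Counter(labels).most_common(1)[0][0]


def accuracy(predicted, actual):
    """Fraction of predictions that match the actual labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Made-up labels: 1 = disease present, 0 = disease absent.
train_labels = [1, 0, 0, 1, 0, 0, 0, 1]
test_labels = [0, 1, 0, 0]

majority = zero_r_fit(train_labels)
baseline_accuracy = accuracy([majority] * len(test_labels), test_labels)
```

A classifier such as Random Forest is only informative to the extent that it beats this majority-class baseline on the same test split.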


2020 ◽  
Author(s):  
Mareen Lösing ◽  
Jörg Ebbing ◽  
Wolfgang Szwillus

Improving the understanding of geothermal heat flux in Antarctica is crucial for ice-sheet modelling and glacial isostatic adjustment. It affects the ice rheology and can lead to basal melting, thereby promoting ice flow. Direct measurements are sparse, and models inferred from, e.g., magnetic or seismological data differ immensely. By Bayesian inversion, we evaluated the uncertainties of some of these models and studied the interdependencies of the thermal parameters. In contrast to previous studies, our method allows the parameters to vary laterally, which leads to a heterogeneous West Antarctica and a slightly more homogeneous East Antarctica with overall lower surface heat flux. The Curie isotherm depth and radiogenic heat production have the strongest impact on our results, but both parameters have a high uncertainty.

To overcome such shortcomings, we adopt a machine learning approach, more specifically a Gradient Boosted Regression Tree model, in order to find an optimal predictor for locations with sparse measurements. However, this approach largely relies on global data sets, which are notoriously unreliable in Antarctica. Therefore, the validity and quality of the data sets are reviewed and discussed. Using regional and more detailed data sets of Antarctica's Gondwana neighbors might improve the predictions due to their similar tectonic history. The performance of the machine learning algorithm can then be examined by comparing the predictions to the existing measurements. From our study, we expect to gain new insights into the geothermal structure of Antarctica, which will help with future studies on the coupling of Solid Earth and Cryosphere.
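As a rough illustration of how a gradient boosted regression tree model works in general, the following Python sketch fits depth-1 trees (stumps) to the residuals of the current prediction and adds them with a small learning rate. It assumes squared-error loss and a single numeric feature; the function names, hyperparameters, and toy data are hypothetical and have nothing to do with the Antarctic data sets or the authors' actual model.

```python
def fit_stump(xs, residuals):
    """Find the single split on a 1-D feature that minimises squared error."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


def fit_gbrt(xs, ys, n_rounds=50, learning_rate=0.1):
    """Boost stumps on the residuals of the running prediction."""
    base = sum(ys) / len(ys)  # initial prediction: the mean target
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the base value and the scaled contribution of every stump."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(
        left if x <= threshold else right for threshold, left, right in stumps
    )


# Toy data, invented for illustration only.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
model = fit_gbrt(xs, ys, n_rounds=100)
```

Comparing `predict(model, x)` with held-back measurements mirrors the validation step the abstract describes, in which predictions are checked against existing measurements.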


2018 ◽  
Vol 8 (10) ◽  
pp. 1927 ◽  
Author(s):  
Zuzana Dankovičová ◽  
Dávid Sovák ◽  
Peter Drotár ◽  
Liberios Vokorokos

This paper addresses the processing of speech data and their utilization in a decision support system. The main aim of this work is to utilize machine learning methods to recognize pathological speech, particularly dysphonia. We extracted 1560 speech features and used these to train the classification model. As classifiers, three state-of-the-art methods were used: K-nearest neighbors, random forests, and support vector machines. We analyzed the performance of the classifiers with and without gender taken into account. The experimental results showed that it is possible to recognize pathological speech with a classification accuracy as high as 91.3%.
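Of the three classifiers named, K-nearest neighbors is the easiest to show compactly: a query is assigned the majority label among its k closest training points. The Python sketch below assumes Euclidean distance and uses two made-up features per sample instead of the 1560 speech features; the function name `knn_predict`, the feature values, and the labels are illustrative assumptions only.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest neighbours."""
    # Pair each training point's Euclidean distance with its label.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Made-up two-dimensional feature vectors, for illustration only.
train_features = [
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.15),
    (0.90, 0.80), (0.80, 0.90), (0.85, 0.95),
]
train_labels = ["healthy"] * 3 + ["pathological"] * 3

prediction = knn_predict(train_features, train_labels, (0.80, 0.80))
```

Taking gender into account, as the abstract mentions, could be sketched in this setting by training and querying separate sets of samples per gender; the abstract does not specify how the authors did it.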


2021 ◽  
Author(s):  
Nobonita Saha ◽  
Aninda Mohanta ◽  
Jannatun Tuba Jyoti ◽  
Tamal Joyti Roy ◽  
Diti Roy

We collected two data sets. The first data set consisted of 45 thousand entries and the second of 43. One data set contained food information, such as calorie count, sugar per 100 grams, fat per 100 grams, and so on. The second data set contained obesity rates among people in the USA aged 0 to 80. We wanted to show a relationship between sugar intake and obesity rate. Our experiment found significant evidence of a link between obesity and sugar intake. We used a machine learning approach for our experimental analysis.


Author(s):  
Tsehay Admassu Assegie ◽  
Pramod Sekharan Nair

Handwritten digit recognition is an area of machine learning in which a machine is trained to identify handwritten digits. One method of achieving this is with a decision tree classification model. Decision tree classification is a machine learning approach that uses the predefined labels from past, known data sets to predict the classes of future data sets whose class labels are unknown. In this paper, we used the standard Kaggle digits dataset for recognition of handwritten digits with a decision tree classification approach. We evaluated the accuracy of the model for each digit from 0 to 9.
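One possible reading of evaluating accuracy per digit is the fraction of samples of each true digit that the model labels correctly. The Python sketch below computes that quantity from two label lists; the function name `per_class_accuracy` and the labels are invented for illustration and are not results from the paper or from the Kaggle dataset.

```python
from collections import defaultdict


def per_class_accuracy(true_labels, predicted_labels):
    """Return {label: fraction of that label's samples predicted correctly}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, prediction in zip(true_labels, predicted_labels):
        total[truth] += 1
        if truth == prediction:
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in sorted(total)}


# Made-up labels for a handful of digit images.
true_digits = [0, 0, 1, 1, 1, 7, 7, 9]
predicted_digits = [0, 0, 1, 7, 1, 7, 1, 9]

scores = per_class_accuracy(true_digits, predicted_digits)
```

Reporting the score per digit rather than one overall figure shows which digits the classifier confuses, such as 1 and 7 in this toy example.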