Learning to Disambiguate Syntactic Relations

2003 ◽  
Vol 17 (5) ◽  
Author(s):  
Gerold Schneider

Natural language is highly ambiguous at every level. This article describes a fast, broad-coverage, state-of-the-art parser that combines a carefully hand-written grammar with probability-based machine learning for disambiguation at the syntactic level. It is shown in detail which statistical learning models based on Maximum-Likelihood Estimation (MLE) can support a highly developed linguistic grammar in the disambiguation process.
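For orientation, the following is a minimal sketch of MLE-based disambiguation in the style the abstract describes, applied to the classic prepositional-phrase attachment problem. The training tuples and the choice to condition only on the preposition are invented for illustration; the parser in the article conditions on a much richer linguistic context.

```python
from collections import Counter

# Hypothetical training data: (verb, noun, preposition, attachment) tuples,
# e.g. extracted from a treebank. Attachment is "V" (verbal) or "N" (nominal).
training = [
    ("eat", "pizza", "with", "N"),   # "pizza with anchovies"
    ("eat", "pizza", "with", "V"),   # "eat ... with a fork"
    ("see", "man", "with", "V"),
    ("see", "man", "with", "N"),
    ("eat", "pasta", "with", "V"),
]

# Maximum-Likelihood Estimation: the probability of each attachment decision
# is its relative frequency given the context (here, just the preposition).
counts = Counter((prep, site) for _, _, prep, site in training)
totals = Counter(prep for _, _, prep, _ in training)

def p_attach(prep, site):
    """MLE estimate P(site | prep) = count(prep, site) / count(prep)."""
    return counts[(prep, site)] / totals[prep] if totals[prep] else 0.0

def disambiguate(prep):
    """Pick the attachment site with the higher MLE probability."""
    return max(("V", "N"), key=lambda site: p_attach(prep, site))

print(disambiguate("with"), p_attach("with", "V"))  # -> V 0.6
```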

2013 ◽  
Vol 274 ◽  
pp. 359-362
Author(s):  
Shuang Zhang ◽  
Shi Xiong Zhang

Abstract. Shallow parsing is a relatively recent strategy in natural language processing. Rather than building the full parse tree, it requires only the recognition of certain simple structural components. It separates parsing into two subtasks: the recognition and analysis of chunks, and the analysis of the relationships among chunks. This article introduces applied techniques for shallow parsing and reports experiments with a new method.
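A minimal sketch of the first subtask, chunk recognition, using NLTK's rule-based RegexpParser on a hand-tagged sentence. The sentence, tags, and grammar are illustrative assumptions only and are not the new method experimented with in the paper.

```python
import nltk

# Pre-tagged sentence (hand-tagged here to avoid model downloads);
# in practice the tags would come from a POS tagger.
tagged = [("the", "DT"), ("quick", "JJ"), ("brown", "JJ"), ("fox", "NN"),
          ("jumps", "VBZ"), ("over", "IN"), ("the", "DT"), ("lazy", "JJ"),
          ("dog", "NN")]

# Subtask 1: chunk recognition via a simple regular-expression grammar
# (an optional determiner, any adjectives, then nouns form an NP chunk).
grammar = "NP: {<DT>?<JJ>*<NN.*>+}"
chunker = nltk.RegexpParser(grammar)
tree = chunker.parse(tagged)
print(tree)
# (S (NP the/DT quick/JJ brown/JJ fox/NN) jumps/VBZ over/IN
#    (NP the/DT lazy/JJ dog/NN))

# Subtask 2 would then analyse relations over the flat sequence of chunks,
# e.g. pairing the verb with its neighbouring NP chunks.
```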


2016 ◽  
Vol 27 (1) ◽  
pp. 286-297 ◽  
Author(s):  
Romain Pirracchio ◽  
John K Yue ◽  
Geoffrey T Manley ◽  
Mark J van der Laan ◽  
Alan E Hubbard ◽  
...  

Standard statistical practice for determining the relative importance of competing causes of disease typically relies on ad hoc methods, often byproducts of machine learning procedures (stepwise regression, random forest, etc.). A causal inference framework and data-adaptive methods may help tailor parameters to match the clinical question and free one from arbitrary modeling assumptions. Our focus is on implementations of such semiparametric methods for a variable importance measure (VIM). We propose a fully automated procedure for VIM estimation based on collaborative targeted maximum likelihood estimation (cTMLE), a method that optimizes the estimate of an association in the presence of potentially numerous competing causes. We applied the approach to data collected from traumatic brain injury patients, specifically a prospective, observational study at three US Level-1 trauma centers. The primary outcome was a disability score, the Glasgow Outcome Scale-Extended (GOSE), collected three months post-injury. We identified clinically important predictors among a set of risk factors using a variable importance analysis based on targeted maximum likelihood estimators (TMLE) and on cTMLE. Via a parametric bootstrap, we demonstrate that the latter procedure has the potential for robust automated estimation of variable importance measures based upon machine-learning algorithms. The cTMLE estimator was associated with substantially less positivity bias than TMLE and with higher coverage of the 95% confidence interval. This study confirms the power of an automated cTMLE procedure that can target model selection via machine learning to estimate VIMs in complicated, high-dimensional data.
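For readers unfamiliar with the targeting step underlying both estimators, the following is a minimal sketch of plain TMLE for an average treatment effect on simulated data. It shows only the non-collaborative variant with simple logistic learners; the data, learners, and positivity clipping bound are assumptions for illustration, and this is not the authors' cTMLE-based VIM procedure.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Simulated data: covariates W confound both treatment A and binary outcome Y.
n = 2000
W = rng.normal(size=(n, 2))
pA = 1 / (1 + np.exp(-(0.4 * W[:, 0] - 0.3 * W[:, 1])))
A = rng.binomial(1, pA)
pY = 1 / (1 + np.exp(-(0.8 * A + 0.5 * W[:, 0] + 0.2 * W[:, 1])))
Y = rng.binomial(1, pY)

X = np.column_stack([A, W])

# Step 1: initial outcome model Q(A, W); any ML learner could stand in here.
Q_fit = LogisticRegression().fit(X, Y)
Q_AW = Q_fit.predict_proba(X)[:, 1]
Q_1W = Q_fit.predict_proba(np.column_stack([np.ones(n), W]))[:, 1]
Q_0W = Q_fit.predict_proba(np.column_stack([np.zeros(n), W]))[:, 1]

# Step 2: propensity model g(W) = P(A = 1 | W), clipped for positivity.
g_fit = LogisticRegression().fit(W, A)
g = np.clip(g_fit.predict_proba(W)[:, 1], 0.025, 0.975)

# Step 3: "clever covariate" and intercept-free logistic fluctuation of the
# initial fit, using logit(Q) as an offset, to estimate epsilon.
H = A / g - (1 - A) / (1 - g)
eps = sm.GLM(Y, H.reshape(-1, 1), family=sm.families.Binomial(),
             offset=np.log(Q_AW / (1 - Q_AW))).fit().params[0]

def expit(x):
    return 1 / (1 + np.exp(-x))

# Step 4: targeted update of the counterfactual predictions.
Q1_star = expit(np.log(Q_1W / (1 - Q_1W)) + eps / g)
Q0_star = expit(np.log(Q_0W / (1 - Q_0W)) - eps / (1 - g))

# TMLE estimate of the average causal effect (risk difference).
print("ATE estimate:", np.mean(Q1_star - Q0_star))
```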


2021 ◽  
Vol 50 (Supplement_1) ◽  
Author(s):  
Ghazaleh Dashti ◽  
Katherine J. Lee ◽  
Julie A. Simpson ◽  
Ian R. White ◽  
John B. Carlin ◽  
...  

Abstract
Background: Causal inference from cohort studies is central to epidemiological research. Targeted Maximum Likelihood Estimation (TMLE) is an appealing doubly robust method for causal effect estimation, but it is unclear how missing data should be handled when it is used in conjunction with machine learning approaches for the exposure and outcome models. This is problematic because missing data are ubiquitous and can result in biased estimates and loss of precision if handled inappropriately.
Methods: Based on a motivating example from the Victorian Adolescent Health Cohort Study, we conducted a simulation study to evaluate the performance of available approaches for handling missing data when using TMLE with machine learning. These included complete-case analysis; an extended TMLE approach incorporating an outcome missingness probability model; the missing indicator approach for missing covariate data (MCMI); and multiple imputation (MI) using standard parametric approaches or machine learning algorithms. We considered 11 missingness mechanisms typical of cohort studies, and a simple and a complex setting, in which the exposure and outcome generation models included two-way and higher-order interactions.
Results: MI using regression with no interactions and MI with random forest yielded estimates with the highest bias. MI with regression including two-way interactions was the best-performing method overall. Of the non-MI approaches, MCMI performed the worst.
Conclusions: When using TMLE with machine learning to estimate the average causal effect, avoiding standard MI with no interactions and MCMI is recommended.
Key messages: We provide novel guidance for handling missing data for causal effect estimation using TMLE.
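As a concrete illustration of one of the compared strategies, the following is a minimal sketch of multiple imputation with a two-way interaction included in the imputation model, using scikit-learn's IterativeImputer on simulated data. The data-generating model, the number of imputations, and the stand-in regression at the analysis step are assumptions for illustration; in the study itself the analysis step was TMLE with machine learning.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(1)

# Toy data: exposure A, confounder W, outcome Y; Y is partly missing.
n = 500
W = rng.normal(size=n)
A = rng.binomial(1, 1 / (1 + np.exp(-W))).astype(float)
Y = 1.0 * A + 0.5 * W + 0.5 * A * W + rng.normal(size=n)
Y[rng.random(n) < 0.3] = np.nan  # roughly 30% missing outcomes

# Include the two-way interaction A*W as a column in the imputation model,
# mirroring the MI-with-interactions approach that performed best.
X = np.column_stack([A, W, A * W, Y])

m = 20  # number of imputations
estimates = []
for i in range(m):
    imputer = IterativeImputer(sample_posterior=True, random_state=i)
    completed = imputer.fit_transform(X)
    A_c, W_c, Y_c = completed[:, 0], completed[:, 1], completed[:, 3]
    # Stand-in analysis: a simple regression-adjusted exposure effect;
    # the study would run the TMLE estimator on each completed dataset.
    beta = np.linalg.lstsq(
        np.column_stack([np.ones(n), A_c, W_c]), Y_c, rcond=None)[0]
    estimates.append(beta[1])

# Pool the point estimates across imputations (the mean part of
# Rubin's rules; full pooling would also combine the variances).
print("Pooled effect estimate:", np.mean(estimates))
```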


2020 ◽  
Author(s):  
Saeed Nosratabadi ◽  
Amir Mosavi ◽  
Puhong Duan ◽  
Pedram Ghamisi ◽  
Ferdinand Filip ◽  
...  

This paper provides a state-of-the-art investigation of advances in data science for emerging economic applications. The analysis covers novel data science methods in four classes: deep learning models, hybrid deep learning models, hybrid machine learning models, and ensemble models. Application domains include a wide and diverse range of economics research, from the stock market, marketing, and e-commerce to corporate banking and cryptocurrency. The PRISMA method, a systematic literature review methodology, was used to ensure the quality of the survey. The findings reveal that the trends follow the advancement of hybrid models, which, based on the accuracy metric, outperform other learning algorithms. It is further expected that the trends will converge toward the advancement of sophisticated hybrid deep learning models.

