WORD2VEC NOT DEAD: PREDICTING HYPERNYMS OF CO-HYPONYMS IS BETTER THAN READING DEFINITIONS

Author(s):  
N. V. Arefyev ◽  
M. V. Fedoseev ◽  
A. V. Kabanov ◽  
V. S. Zizov ◽  
...  

Expert-built lexical resources are known to provide information of good quality at the cost of low coverage. This property limits their applicability in modern NLP applications. Building descriptions of lexical-semantic relations manually in sufficient volume requires a huge amount of qualified human labour. However, once an initial version of a taxonomy has been built, automatic or semi-automatic taxonomy enrichment systems can greatly reduce the required effort. We propose and experiment with two approaches to taxonomy enrichment, one utilizing information from word usages and another from word definitions, as well as a combination of the two. The first method retrieves co-hyponyms for the target word from distributional semantic models (word2vec) or language models (XLM-R) and then looks for hypernyms of those co-hyponyms in the taxonomy. The second method tries to extract hypernyms directly from Wiktionary definitions. The proposed methods were evaluated on the Dialogue-2020 shared task on taxonomy enrichment. We found that predicting hypernyms of co-hyponyms achieves better results on this task. The combination of both methods improves results further and is among the three best-performing systems for verbs. An important part of the work is a detailed qualitative and error analysis of the proposed methods, which provides interesting observations about their behaviour and ideas for future work.
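
To make the first approach concrete, the following is a minimal sketch (not the authors' released code) of predicting hypernyms from co-hyponyms with a word2vec model: the target word's nearest neighbours are treated as candidate co-hyponyms, and the hypernyms they already have in the taxonomy are ranked by similarity-weighted voting. The vector file, the toy taxonomy mapping and the vote weighting are illustrative assumptions.

```python
from collections import Counter

from gensim.models import KeyedVectors

# Hypothetical inputs: a pre-trained word2vec model and a taxonomy mapping
# known words to their hypernyms (placeholders, not the shared-task resources).
vectors = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)
taxonomy = {"cat": ["feline", "animal"], "dog": ["canine", "animal"]}

def predict_hypernyms(target, topn_neighbours=20, topn_hypernyms=10):
    """Rank hypernym candidates for a word that is not yet in the taxonomy."""
    votes = Counter()
    for neighbour, similarity in vectors.most_similar(target, topn=topn_neighbours):
        # Neighbours already present in the taxonomy act as co-hyponyms:
        # their hypernyms are proposed for the target, weighted by similarity.
        for hypernym in taxonomy.get(neighbour, []):
            votes[hypernym] += similarity
    return [hypernym for hypernym, _ in votes.most_common(topn_hypernyms)]

print(predict_hypernyms("puppy"))
```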

2019 ◽  
Author(s):  
Amanda Goodwin ◽  
Yaacov Petscher ◽  
Jamie Tock

Various models have highlighted the complexity of language. Building on foundational ideas regarding three key aspects of language, our study contributes to the literature by 1) exploring broader conceptions of morphology, vocabulary, and syntax, 2) operationalizing this theoretical model into a gamified, standardized, computer-adaptive assessment of language for fifth- to eighth-grade students entitled Monster, PI, and 3) uncovering further evidence regarding the relationship between language and standardized reading comprehension via this assessment. Multiple-group item response theory (IRT) analyses across grades show that morphology was best fit by a bifactor model of task-specific factors along with a global factor related to each skill. Vocabulary was best fit by a bifactor model that identifies performance overall and on specific words. Syntax, though, was best fit by a unidimensional model. Next, Monster, PI produced reliable scores, suggesting language can be assessed efficiently and precisely for students via this model. Lastly, performance on Monster, PI explained more than 50% of the variance in standardized reading, suggesting that operationalizing language via Monster, PI can provide meaningful understandings of the relationship between language and reading comprehension. Specifically, considering just a subset of a construct, like identification of units of meaning, explained significantly less variance in reading comprehension. This highlights the importance of considering these broader constructs. Implications indicate that future work should consider a model of language where component areas are considered broadly and contributions to reading comprehension are explored via general performance on components as well as skill-level performance.


2021 ◽  
Vol 11 (11) ◽  
pp. 4742
Author(s):  
Tianpei Xu ◽  
Ying Ma ◽  
Kangchul Kim

In recent years, the telecom market has become highly competitive. The cost of retaining existing telecom customers is lower than that of attracting new customers. It is therefore necessary for a telecom company to understand customer churn through customer relationship management (CRM), and CRM analysts are required to predict which customers will churn. This study proposes a customer-churn prediction system that uses an ensemble-learning technique consisting of stacking models and soft voting. The XGBoost, logistic regression, decision tree, and naïve Bayes machine-learning algorithms are selected to build a two-level stacking model, and the three outputs of the second level are used for soft voting. Feature construction on the churn dataset includes equidistant grouping of customer behavior features to expand the feature space and discover latent information in the churn dataset. The original and new churn datasets are analyzed with the stacking ensemble model using four evaluation metrics. The experimental results show that the proposed customer-churn predictions have accuracies of 96.12% and 98.09% for the original and new churn datasets, respectively. These results are better than those of state-of-the-art churn recognition systems.
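
As a rough illustration of the stacking-plus-soft-voting idea, the sketch below builds stacking classifiers over the four named base learners and combines their probability outputs by soft voting. It is a simplified approximation under assumed hyperparameters, not the authors' exact pipeline or feature construction.

```python
from sklearn.ensemble import StackingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from xgboost import XGBClassifier

# First level: the four base learners named in the abstract.
base_learners = [
    ("xgb", XGBClassifier()),
    ("lr", LogisticRegression(max_iter=1000)),
    ("dt", DecisionTreeClassifier(max_depth=5)),
    ("nb", GaussianNB()),
]

# Second level: three stacking models whose meta-learners differ; their
# probability outputs are then combined by soft voting.
stacks = [
    (name, StackingClassifier(estimators=base_learners, final_estimator=meta))
    for name, meta in [
        ("stack_lr", LogisticRegression(max_iter=1000)),
        ("stack_dt", DecisionTreeClassifier(max_depth=5)),
        ("stack_nb", GaussianNB()),
    ]
]
churn_model = VotingClassifier(estimators=stacks, voting="soft")

# Usage (with a prepared churn dataset):
# churn_model.fit(X_train, y_train); churn_model.predict(X_test)
```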


Sensors ◽  
2021 ◽  
Vol 21 (12) ◽  
pp. 4127
Author(s):  
Will Farlessyost ◽  
Kelsey-Ryan Grant ◽  
Sara R. Davis ◽  
David Feil-Seifer ◽  
Emily M. Hand

First impressions make up an integral part of our interactions with other humans by providing an instantaneous judgment of the trustworthiness, dominance and attractiveness of an individual prior to engaging in any other form of interaction. Unfortunately, this can lead to unintentional bias in situations that have serious consequences, whether it be in judicial proceedings, career advancement, or politics. The ability to automatically recognize social traits presents a number of highly useful applications: from minimizing bias in social interactions to providing insight into how our own facial attributes are interpreted by others. However, while first impressions are well-studied in the field of psychology, automated methods for predicting social traits are largely non-existent. In this work, we demonstrate the feasibility of two automated approaches—multi-label classification (MLC) and multi-output regression (MOR)—for first impression recognition from faces. We demonstrate that both approaches are able to predict social traits with better than chance accuracy, but there is still significant room for improvement. We evaluate ethical concerns and detail application areas for future work in this direction.
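
For readers unfamiliar with the two formulations, the sketch below contrasts them on placeholder data: multi-output regression predicts continuous scores for several traits at once, while multi-label classification predicts binary high/low labels for the same traits. The face features, trait set, binarization rule and base models are assumptions for illustration, not the system described above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.multioutput import MultiOutputClassifier, MultiOutputRegressor

rng = np.random.default_rng(0)
face_features = rng.normal(size=(200, 128))   # placeholder face feature vectors
trait_scores = rng.uniform(size=(200, 3))     # e.g. trustworthiness, dominance, attractiveness

# Multi-output regression (MOR): predict continuous trait scores jointly.
mor = MultiOutputRegressor(RandomForestRegressor(n_estimators=100))
mor.fit(face_features, trait_scores)

# Multi-label classification (MLC): binarize each trait at its median and
# predict high/low labels for all traits at once.
trait_labels = (trait_scores > np.median(trait_scores, axis=0)).astype(int)
mlc = MultiOutputClassifier(RandomForestClassifier(n_estimators=100))
mlc.fit(face_features, trait_labels)

print(mor.predict(face_features[:1]), mlc.predict(face_features[:1]))
```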


2003 ◽  
Vol 12 (3) ◽  
pp. 311-325 ◽  
Author(s):  
Martin R. Stytz ◽  
Sheila B. Banks

The development of computer-generated synthetic environments, also called distributed virtual environments, for military simulation relies heavily upon computer-generated actors (CGAs) to provide accurate behaviors at reasonable cost so that the synthetic environments are useful, affordable, complex, and realistic. Unfortunately, the pace of synthetic environment development and the level of desired CGA performance continue to rise at a much faster rate than CGA capability improvements. This insatiable demand for realism in CGAs for synthetic environments arises from the growing understanding of the significant role that modeling and simulation can play in a variety of venues. These uses include training, analysis, procurement decisions, mission rehearsal, doctrine development, force-level and task-level training, information assurance, cyberwarfare, force structure analysis, sustainability analysis, life cycle costs analysis, material management, infrastructure analysis, and many others. In these and other uses of military synthetic environments, computer-generated actors play a central role because they have the potential to increase the realism of the environment while also reducing the cost of operating the environment. The progress made in addressing the technical challenges that must be overcome to realize effective and realistic CGAs for military simulation environments and the technical areas that should be the focus of future work are the subject of this series of papers, which survey the technologies and progress made in the construction and use of CGAs. In this, the first installment in the series of three papers, we introduce the topic of computer-generated actors and issues related to their performance and fidelity and other background information for this research area as related to military simulation. We also discuss CGA reasoning system techniques and architectures.


Author(s):  
Mingwen Yang ◽  
Zhiqiang (Eric) Zheng ◽  
Vijay Mookerjee

Online reputation has become a key marketing-mix variable in the digital economy. Our study helps managers decide how much effort they should devote to managing online reputation. We consider an online reputation race in which it is important to manage not just the absolute reputation, but also the relative rating. That is, to stay ahead, a firm should try to have ratings that are better than those of its competitors. Our findings are particularly significant for platform owners (such as Expedia or Yelp) seeking to strategically grow their base of participating firms: growing the middle of the market (firms with average ratings) is the best option considering the goals of the platform and the other stakeholders, namely incumbents and consumers. For firms, we find that they should increase their effort when the mean market rating increases. Another key insight for firms is that, sometimes, adversity can come disguised as an opportunity. When an adverse event strikes the industry (such as a reduction in sales margin or an increase in the cost of effort), a firm’s profit can increase if it can manage this event better than its competitors.


2018 ◽  
Vol 22 (3) ◽  
pp. 440-445 ◽  
Author(s):  
Denise S Taylor ◽  
Dominique Medaglio ◽  
Claudine T Jurkovitz ◽  
Freda Patterson ◽  
Zugui Zhang ◽  
...  

Abstract Introduction The hospitalization and post-discharge periods provide an opportune time for tobacco cessation. This study tested the feasibility, uptake, and cessation outcomes of a hospital-based tobacco cessation program, delivered at the bedside by volunteers with post-discharge referral to Quitline services. Patient characteristics associated with Quitline uptake and cessation were assessed. Methods Between February and November 2016, trained hospital volunteers approached inpatient tobacco users on six pilot units. Volunteers shared a cessation brochure and used the ASK-ADVISE-CONNECT model to connect ready-to-quit patients to the Delaware Quitline via fax-referral. Volunteers administered a follow-up survey to all admitted tobacco users via telephone or email at 3 months post-discharge. Results Of the 743 admitted tobacco users, 531 (72%) were visited by a volunteer, and 97% (531/547) of those approached accepted the visit. Over one-third (201/531; 38%) were ready to quit and fax-referred to the Quitline, and 36% of those referred accepted Quitline services. At 3 months post-discharge, 37% (135/368) reported not using tobacco in the last 30 days; the intent-to-treat cessation rate was 18% (135/743). In a multivariable regression model of Quitline fax-referral completion, receiving nicotine replacement therapy (NRT) during hospitalization was the strongest predictor (odds ratio [OR] = 1.97; 95% confidence interval [CI] = 1.34 to 2.90). In a model of 3-month cessation, receiving Quitline services (OR = 3.21, 95% CI = 1.35 to 7.68) and having coronary artery disease (OR = 2.28; 95% CI = 1.11 to 4.68) were associated with tobacco cessation, but a volunteer visit was not. Conclusions An “opt-out” tobacco cessation service using trained volunteers is feasible for connecting patients to Quitline services. Implications This study demonstrates the feasibility of a systems-based approach to link inpatients to evidence-based treatment for tobacco use. This model used trained bedside volunteers to connect inpatients to a state-funded Quitline after discharge that offers free cessation treatment consisting of telephone coaching and cessation medications. Receiving NRT during hospitalization positively impacted Quitline referral, and engagement with Quitline resources was critical to tobacco abstinence post-discharge. Future work is needed to evaluate the cost-effectiveness and sustainability of this volunteer model.


2021 ◽  
Author(s):  
Céline Marquet ◽  
Michael Heinzinger ◽  
Tobias Olenyi ◽  
Christian Dallago ◽  
Michael Bernhofer ◽  
...  

Abstract The emergence of SARS-CoV-2 variants has increased the demand for tools to interpret the effect of single amino acid variants (SAVs) on protein function. While Deep Mutational Scanning (DMS) sets continue to expand our understanding of the mutational landscape of single proteins, the results continue to challenge analyses. Protein Language Models (LMs) use the latest deep learning (DL) algorithms to leverage growing databases of protein sequences. These methods learn to predict missing or masked amino acids from the context of entire sequence regions. Here, we explored how to benefit from learned protein LM representations (embeddings) to predict SAV effects. Although we have so far failed to predict SAV effects directly from embeddings, this input alone predicted residue conservation from single sequences almost as accurately as using multiple sequence alignments (MSAs), with a two-state per-residue accuracy (conserved/not) of Q2 = 80% (embeddings) vs. 81% (ConSeq). Considering all SAVs at all residue positions predicted as conserved to affect function reached Q2 = 68.6% (effect/neutral; for PMD) without optimization, compared to an expert solution such as SNAP2 (Q2 = 69.8%). By combining predicted conservation with BLOSUM62 to obtain variant-specific binary predictions, DMS experiments on four human proteins were predicted better than by SNAP2, and better than by applying the same simplistic approach to conservation taken from ConSeq. Thus, embedding methods have become competitive with methods relying on MSAs for SAV effect prediction at a fraction of the computing/energy cost. This allowed prediction of SAV effects for the entire human proteome (~20k proteins) within 17 minutes on a single GPU.
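
The final combination step lends itself to a compact illustration. The sketch below (a simplification, not the published pipeline) calls a SAV an "effect" only when its position is predicted as conserved and the BLOSUM62 exchange score is unfavorable; the embedding-based conservation predictor itself is not reproduced, and the score threshold is an illustrative assumption.

```python
from Bio.Align import substitution_matrices

# Standard BLOSUM62 substitution matrix shipped with Biopython.
blosum62 = substitution_matrices.load("BLOSUM62")

def predict_sav_effect(wild_aa, mutant_aa, position, conserved_positions,
                       blosum_threshold=0):
    """Return 'effect' or 'neutral' for a single amino acid variant."""
    # Non-conserved positions are assumed tolerant of substitution.
    if position not in conserved_positions:
        return "neutral"
    # At conserved positions, an unfavorable exchange score flags an effect.
    score = blosum62[wild_aa, mutant_aa]
    return "effect" if score < blosum_threshold else "neutral"

# Example: position 42 predicted conserved, tryptophan replaced by glycine.
print(predict_sav_effect("W", "G", 42, conserved_positions={10, 42, 57}))
```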


2021 ◽  
Vol 6 (2) ◽  
pp. 72-81
Author(s):  
Reham Ahmed El-Shahed ◽  
Maryam Al-Berry ◽  
Hala Ebied ◽  
Howida A. Shedeed ◽  
...  

Steganography is one of the most important tools in the data security field, as a huge amount of data is transferred over the internet every moment. Hiding secret messages in images is widely used because images are common in social media applications. The proposed algorithm is a simple method for hiding one image inside another. It uses QR factorization to conceal the secret image. The technique successfully hid both grayscale and color images in another image, and the performance of the algorithm was measured by PSNR, SSIM and NCC. The PSNR for the cover image was in the range of 41 to 51 dB. DWT was added to increase the security of the method, and this enhanced technique increased the cover PSNR to 48 to 56 dB. The SSIM is 100% and the NCC is 1 for both implementations, which shows that the imperceptibility of the algorithm is very high. The comparative analysis showed that the performance of the algorithm is better than that of other state-of-the-art algorithms.
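
The core QR-based embedding step can be illustrated with a much-simplified, non-blind sketch: the secret image is decomposed as S = QR, the R factor is added to the cover with a small strength, and Q together with the original cover serves as the extraction key. The DWT stage, the paper's exact embedding rule, and the strength alpha are not taken from the paper; they are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
cover = rng.uniform(0, 255, size=(256, 256))    # placeholder grayscale cover image
secret = rng.uniform(0, 255, size=(256, 256))   # placeholder grayscale secret image
alpha = 0.01                                    # embedding strength (assumed)

# Embedding: hide the R factor of the secret inside the cover.
q_factor, r_factor = np.linalg.qr(secret)
stego = cover + alpha * r_factor

# Extraction: recover R from the stego image, then rebuild the secret with Q.
recovered_r = (stego - cover) / alpha
recovered_secret = q_factor @ recovered_r

# Imperceptibility check: peak signal-to-noise ratio of the stego image.
mse = np.mean((stego - cover) ** 2)
psnr = 10 * np.log10(255.0 ** 2 / mse)
print(f"cover PSNR: {psnr:.1f} dB, max reconstruction error: "
      f"{np.abs(recovered_secret - secret).max():.2e}")
```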


2021 ◽  
Vol 2 ◽  
Author(s):  
Zekun Cao ◽  
Jeronimo Grandi ◽  
Regis Kopper

Dynamic field of view (FOV) restrictors have been successfully used to reduce visually induced motion sickness (VIMS) during continuous viewpoint motion control (virtual travel) in virtual reality (VR). This benefit, however, comes at the cost of losing peripheral awareness during provocative motion. Likewise, the use of visual references that are stable in relation to the physical environment, called rest frames (RFs), has also been shown to reduce discomfort during virtual travel tasks in VR. We propose a new RF-based design called Granulated Rest Frames (GRFs) with a soft-edged circular cutout in the center that leverages the rest frames’ benefits without completely blocking the user’s peripheral view. The GRF design is application-agnostic and does not rely on context-specific RFs, such as commonly used cockpits. We report on a within-subjects experiment with 20 participants. The results suggest that, by strategically applying GRFs during a visual search session in VR, we can achieve better item-searching efficiency compared to a restricted FOV. The effect of GRFs on reducing VIMS remains to be determined by future work.


2017 ◽  
Vol 21 (23) ◽  
pp. 1-188 ◽  
Author(s):  
Karoline Freeman ◽  
Hema Mistry ◽  
Alexander Tsertsvadze ◽  
Pam Royle ◽  
Noel McCarthy ◽  
...  

Background Gastroenteritis is a common, transient disorder usually caused by infection and characterised by the acute onset of diarrhoea. Multiplex gastrointestinal pathogen panel (GPP) tests simultaneously identify common bacterial, viral and parasitic pathogens using molecular testing. By providing test results more rapidly than conventional testing methods, GPP tests might positively influence the treatment and management of patients presenting in hospital or in the community. Objective To systematically review the evidence for GPP tests [xTAG® (Luminex, Toronto, ON, Canada), FilmArray (BioFire Diagnostics, Salt Lake City, UT, USA) and Faecal Pathogens B (AusDiagnostics, Beaconsfield, NSW, Australia)] and to develop a de novo economic model to compare the cost-effectiveness of GPP tests with conventional testing in England and Wales. Data sources Multiple electronic databases including MEDLINE, EMBASE, Web of Science and the Cochrane Database were searched from inception to January 2016 (with supplementary searches of other online resources). Review methods Eligible studies included patients with acute diarrhoea; comparing GPP tests with standard microbiology techniques; and patient, management, test accuracy or cost-effectiveness outcomes. Quality assessment of eligible studies used tailored Quality Assessment of Diagnostic Accuracy Studies-2, Consolidated Health Economic Evaluation Reporting Standards and Philips checklists. The meta-analysis included positive and negative agreement estimated for each pathogen. A de novo decision tree model compared patients managed with GPP testing or comparable coverage with patients managed using conventional tests, within the Public Health England pathway. Economic models included hospital and community management of patients with suspected gastroenteritis. The model estimated costs (in 2014/15 prices) and quality-adjusted life-year losses from a NHS and Personal Social Services perspective. Results Twenty-three studies informed the review of clinical evidence (17 xTAG, four FilmArray, two xTAG and FilmArray, 0 Faecal Pathogens B). No study provided an adequate reference standard with which to compare the test accuracy of GPP with conventional tests. A meta-analysis (of 10 studies) found considerable heterogeneity; however, GPP testing produces a greater number of pathogen-positive findings than conventional testing. It is unclear whether or not these additional ‘positives’ are clinically important. The review identified no robust evidence to inform consequent clinical management of patients. There is considerable uncertainty about the cost-effectiveness of GPP panels used to test for suspected infectious gastroenteritis in hospital and community settings. Uncertainties in the model include length of stay, assumptions about false-positive findings and the costs of tests. Although there is potential for cost-effectiveness in both settings, key modelling assumptions need to be verified and model findings remain tentative. Limitations No test–treat trials were retrieved. The economic model reflects one pattern of care, which will vary across the NHS. Conclusions The systematic review and cost-effectiveness model identify uncertainties about the adoption of GPP tests within the NHS. GPP testing will generally correctly identify pathogens identified by conventional testing; however, these tests also generate considerable additional positive results of uncertain clinical importance. 
Future work An independent reference standard may not exist to evaluate alternative approaches to testing. A test–treat trial might ascertain whether or not additional GPP ‘positives’ are clinically important or result in overdiagnoses, whether or not earlier diagnosis leads to earlier discharge in patients and what the health consequences of earlier intervention are. Future work might also consider the public health impact of different testing treatments, as test results form the basis for public health surveillance. Study registration This study is registered as PROSPERO CRD2016033320. Funding The National Institute for Health Research Health Technology Assessment programme.

