Linguistic Approach to Semantic Correlation Rules

2021 ◽  
Vol 102 ◽  
pp. 02004
Author(s):  
Charlotte Effenberger

As communication between humans and machines in natural language remains essential, especially for end users, Natural Language Processing (NLP) methods are used to classify and interpret it. NLP, as a technology, combines grammatical, semantic, and pragmatic analysis with statistics or machine learning to make language logically understandable to machines and to allow new interpretations of data, in contrast to predefined logical structures. Some NLP methods do not go far beyond retrieving an indexation of content; indexation can therefore be considered a very simple linguistic approach. Semantic Correlation Rules (SCRs) make it possible to retrieve simple semantic relations without a special tool, using a set of predefined rules. This paper therefore examines to what extent SCRs can retrieve linguistic semantic relations, and to what extent a simple NLP method can be set up to allow further interpretation of data. To do so, a simple linguistic model was built from an indexation enriched with semantic relations to give the data more context. These semantic relations were then queried by SCRs to set up an NLP method.
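The abstract does not specify the rule formalism, but the idea of predefined rules that retrieve semantic relations from indexed text can be sketched as follows. This is a hypothetical illustration, not the paper's actual rule set: each "rule" pairs a surface pattern with the relation it retrieves.

```python
import re

# Hypothetical rule set: a surface pattern paired with the semantic
# relation it retrieves (invented for illustration; not the paper's SCRs).
RULES = [
    # "X such as Y" suggests Y is a kind of X (hyponymy)
    (re.compile(r"(\w+) such as (\w+)"), "hyponym"),
    # "X is part of Y" suggests meronymy (optional article allowed)
    (re.compile(r"(\w+) is part of (?:a |an |the )?(\w+)"), "meronym"),
]

def apply_rules(text):
    """Return (term_a, term_b, relation) triples matched by the rule set."""
    relations = []
    for pattern, relation in RULES:
        for a, b in pattern.findall(text):
            relations.append((a, b, relation))
    return relations

print(apply_rules("vehicles such as cars move; a wheel is part of a car"))
```

Querying the enriched indexation then amounts to running each rule over the indexed content and collecting the relation triples it yields.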

2021 ◽  
pp. 1-14
Author(s):  
Kristen Edwards ◽  
Aoran Peng ◽  
Scarlett Miller ◽  
Faez Ahmed

Abstract A picture is worth a thousand words, and in design metric estimation, a word may be worth a thousand features. Pictures are awarded this worth because they encode a plethora of information. When evaluating designs, we aim to capture a range of information, including usefulness, uniqueness, and novelty of a design. The subjective nature of these concepts makes their evaluation difficult. Still, many attempts have been made and metrics developed to do so, because design evaluation is integral to the creation of novel solutions. The most common metrics used are the consensual assessment technique (CAT) and the Shah, Vargas-Hernandez, and Smith (SVS) method. While CAT is accurate and often regarded as the “gold standard,” it relies on using expert ratings, making CAT expensive and time-consuming. Comparatively, SVS is less resource-demanding, but often criticized as lacking sensitivity and accuracy. We utilize the complementary strengths of both methods through machine learning. This study investigates the potential of machine learning to predict expert creativity assessments from non-expert survey results. The SVS method results in a text-rich dataset about a design. We utilize these textual design representations and the deep semantic relationships that natural language encodes to predict more desirable design metrics, including CAT metrics. We demonstrate the ability of machine learning models to predict design metrics from the design itself and SVS survey information. We show that incorporating natural language processing improves prediction results across design metrics, and that clear distinctions in the predictability of certain metrics exist.
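The study's models are not described in the abstract, but the core move, predicting an expert CAT-style rating from the free-text representation a non-expert SVS survey produces, can be sketched minimally. The training pairs, ratings, and nearest-neighbour approach below are invented for illustration only:

```python
from collections import Counter
import math

def bow(text):
    """Bag-of-words vector from a free-text design description."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical training data: SVS-style free-text descriptions paired
# with expert CAT creativity ratings (numbers invented for illustration).
train = [
    ("a folding bicycle helmet made of recycled cardboard", 4.5),
    ("a standard steel water bottle with a screw cap", 1.5),
]

def predict_cat(text):
    """1-nearest-neighbour prediction of the expert rating."""
    return max(train, key=lambda pair: cosine(bow(text), bow(pair[0])))[1]

print(predict_cat("a collapsible helmet built from cardboard"))  # → 4.5
```

A real pipeline would replace the bag-of-words vectors with deep semantic embeddings and the nearest-neighbour lookup with a trained regressor, but the input/output contract, text in, expert-metric estimate out, is the same.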


Author(s):  
Marek Maziarz ◽  
Ewa Rudnicka

Expanding WordNet with Gloss and Polysemy Links for Evocation Strength Recognition

Evocation, a phenomenon of sense associations going beyond standard (lexico-)semantic relations, is difficult for natural language processing systems to recognise. Machine learning models give predictions that are only moderately correlated with evocation strength. It is believed that ordinary graph measures are not as good at this task as methods based on vector representations. The paper proposes a new method of enriching the WordNet structure with weighted polysemy and gloss links, and shows that Dijkstra's algorithm, applied to such expanded structures, performs as well as other, more sophisticated measures.
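The graph-based measure the abstract names is standard Dijkstra shortest-path search; on a WordNet-like graph whose edges carry weights (here mixing ordinary relations with the added gloss/polysemy links), a cheaper path between two senses can serve as a proxy for stronger evocation. The toy graph and weights below are invented for illustration:

```python
import heapq

# Toy WordNet-style sense graph with weighted links (weights invented;
# in the paper these would include gloss and polysemy links).
graph = {
    "car":     {"vehicle": 1.0, "road": 2.0},
    "vehicle": {"car": 1.0, "wheel": 1.5},
    "road":    {"car": 2.0, "traffic": 1.0},
    "wheel":   {"vehicle": 1.5},
    "traffic": {"road": 1.0},
}

def dijkstra(start, goal):
    """Cheapest-path cost between two senses; lower cost ≈ stronger evocation."""
    dist = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, w in graph[node].items():
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return float("inf")

print(dijkstra("car", "traffic"))  # 2.0 + 1.0 = 3.0
```

The paper's contribution is not the algorithm but the enriched structure it runs over: with weighted gloss and polysemy links added, this plain shortest-path cost correlates with evocation strength as well as more elaborate measures.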


2020 ◽  
Author(s):  
Vadim V. Korolev ◽  
Artem Mitrofanov ◽  
Kirill Karpov ◽  
Valery Tkachenko

The main advantage of modern natural language processing methods is the ability to turn an amorphous human-readable task into a strict mathematical form. This makes it possible to extract chemical data and insights from articles and to find new semantic relations. We propose a universal engine for processing chemical and biological texts. We successfully tested it on various use cases and applied it to the search for a therapeutic agent for the COVID-19 disease by analyzing the PubMed archive.


Author(s):  
Rohan Pandey ◽  
Vaibhav Gautam ◽  
Ridam Pal ◽  
Harsh Bandhey ◽  
Lovedeep Singh Dhingra ◽  
...  

BACKGROUND The COVID-19 pandemic has uncovered the potential of digital misinformation to shape the health of nations. The deluge of unverified information, which spreads faster than the epidemic itself, is an unprecedented phenomenon that has put millions of lives in danger. Mitigating this 'infodemic' requires strong health messaging systems that are engaging, vernacular, scalable and effective, and that continuously learn new patterns of misinformation.

OBJECTIVE We created WashKaro, a multi-pronged intervention for mitigating misinformation through conversational AI, machine translation and natural language processing. WashKaro provides the right information, matched against WHO guidelines through AI, and delivers it in the right format in local languages.

METHODS We theorize (i) an NLP-based AI engine that continuously incorporates user feedback to improve the relevance of information, (ii) bite-sized audio in the local language to improve penetrance in a country with skewed gender literacy ratios, and (iii) conversational yet interactive AI engagement with users to increase health awareness in the community.

RESULTS A total of 5026 people downloaded the app during the study window; of these, 1545 were active users. Our study shows that 3.4 times more females than males engaged with the app in Hindi, that the relevance of AI-filtered news content doubled within 45 days of continuous machine learning, and that the prudence of the integrated AI chatbot "Satya" increased, demonstrating the usefulness of an mHealth platform in mitigating health misinformation.

CONCLUSIONS We conclude that a multi-pronged machine learning application delivering vernacular bite-sized audio and conversational AI is an effective approach to mitigating health misinformation.

CLINICALTRIAL Not Applicable


2021 ◽  
Vol 28 (1) ◽  
pp. e100262
Author(s):  
Mustafa Khanbhai ◽  
Patrick Anyadi ◽  
Joshua Symons ◽  
Kelsey Flott ◽  
Ara Darzi ◽  
...  

Objectives Unstructured free-text patient feedback contains rich information, and analysing these data manually would require personnel resources that are not available in most healthcare organisations. We undertook a systematic review of the literature on the use of natural language processing (NLP) and machine learning (ML) to process and analyse free-text patient experience data.

Methods Databases were systematically searched to identify articles published between January 2000 and December 2019 that examined NLP to analyse free-text patient feedback. Because of the heterogeneous nature of the studies, a narrative synthesis was deemed most appropriate. Data related to the study purpose, corpus, methodology, performance metrics and indicators of quality were recorded.

Results Nineteen articles were included. The majority (80%) of studies applied language analysis techniques to patient feedback from social media sites (unsolicited), followed by structured surveys (solicited). Supervised learning was used most frequently (n=9), followed by unsupervised (n=6) and semi-supervised (n=3) learning. Comments extracted from social media were analysed using an unsupervised approach, whereas free-text comments held within structured surveys were analysed using a supervised approach. Reported performance metrics included precision, recall and F-measure, with support vector machine and Naïve Bayes being the best-performing ML classifiers.

Conclusion NLP and ML have emerged as important tools for processing unstructured free text. Both supervised and unsupervised approaches have a role, depending on the data source. With the advancement of data analysis tools, these techniques may help healthcare organisations generate insight from large volumes of unstructured free-text data.
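Naïve Bayes, one of the best-performing classifiers the review reports, is simple enough to sketch end-to-end on patient-feedback-style text. The comments and labels below are invented for illustration; a real study would train on a labelled survey corpus:

```python
from collections import Counter
import math

# Toy solicited-survey comments labelled positive/negative (invented data).
train = [
    ("the staff were kind and helpful", "pos"),
    ("excellent care very clean ward", "pos"),
    ("long wait and rude reception", "neg"),
    ("appointment delayed poor communication", "neg"),
]

def fit(data):
    """Per-class word counts for a multinomial Naive Bayes model."""
    counts = {"pos": Counter(), "neg": Counter()}
    for text, label in data:
        counts[label].update(text.split())
    return counts

def classify(counts, text):
    """Multinomial Naive Bayes with add-one smoothing and uniform priors."""
    vocab = {w for c in counts.values() for w in c}
    scores = {}
    for label, c in counts.items():
        total = sum(c.values())
        scores[label] = sum(
            math.log((c[w] + 1) / (total + len(vocab))) for w in text.split()
        )
    return max(scores, key=scores.get)

model = fit(train)
print(classify(model, "kind staff and clean ward"))  # → 'pos'
```

Evaluating such a classifier against held-out labels yields exactly the precision, recall and F-measure figures the included studies report.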

