Word embedding for French natural language in healthcare: a comparative study (Preprint)

2018 ◽  
Author(s):  
Emeric Dynomant ◽  
Romain Lelong ◽  
Badisse Dahamna ◽  
Clément Massonnaud ◽  
Gaétan Kerdelhué ◽  
...  

BACKGROUND Word embedding technologies are now used in a wide range of applications. However, no formal evaluation and comparison have been made of the models produced by the three most famous implementations (Word2Vec, GloVe and FastText). OBJECTIVE The goal of this study is to compare embedding implementations on a corpus of documents produced in a working context by health professionals. METHODS Models have been trained on documents coming from the Rouen university hospital. These data are not structured and cover a wide range of documents produced in a clinic (discharge summaries, prescriptions ...). Four evaluation tasks have been defined (cosine similarity, odd one, mathematical operations and human formal evaluation) and applied to each model. RESULTS Word2Vec had the highest score for three of the four tasks (mathematical operations, odd one similarity and human validation), particularly regarding the Skip-Gram architecture. CONCLUSIONS Although this implementation had the best rate, each model has its own qualities and defects, such as the training time, which is very short for GloVe, or the morphosyntactic similarity conservation observed with FastText. Models and test sets produced by this study will be the first to be publicly available through a graphical interface to help advance French biomedical research.

10.2196/12310 ◽  
2019 ◽  
Vol 7 (3) ◽  
pp. e12310 ◽  
Author(s):  
Emeric Dynomant ◽  
Romain Lelong ◽  
Badisse Dahamna ◽  
Clément Massonnaud ◽  
Gaétan Kerdelhué ◽  
...  

Background Word embedding technologies, a set of language modeling and feature learning techniques in natural language processing (NLP), are now used in a wide range of applications. However, no formal evaluation and comparison have been made of the ability of each of the 3 currently most famous unsupervised implementations (Word2Vec, GloVe, and FastText) to keep track of the semantic similarities existing between words when trained on the same dataset. Objective The aim of this study was to compare embedding methods trained on a corpus of French health-related documents produced in a professional context. The best method will then help us develop a new semantic annotator. Methods Unsupervised embedding models have been trained on 641,279 documents originating from the Rouen University Hospital. These data are not structured and cover a wide range of documents produced in a clinical setting (discharge summaries, procedure reports, and prescriptions). In total, 4 rated evaluation tasks were defined (cosine similarity, odd one, analogy-based operations, and human formal evaluation) and applied to each model, as well as embedding visualization. Results Word2Vec had the highest score on 3 out of 4 rated tasks (analogy-based operations, odd one similarity, and human validation), particularly regarding the skip-gram architecture. Conclusions Although this implementation had the best rate for semantic properties conservation, each model has its own qualities and defects, such as the training time, which is very short for GloVe, or the morphological similarity conservation observed with FastText. Models and test sets produced by this study will be the first to be publicly available through a graphical interface to help advance French biomedical research.
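Three of the rated tasks named in this abstract (cosine similarity, odd one, and analogy-based operations) can be illustrated with a few lines of NumPy. The sketch below is only one common way to implement them; the abstract does not give the study's exact procedures, and the three-dimensional toy vectors and function names are invented for illustration, not taken from the trained models.

```python
import numpy as np

# Hypothetical toy embeddings (real models use hundreds of dimensions).
vectors = {
    "man":   np.array([1.0, 0.0, 0.0]),
    "woman": np.array([1.0, 1.0, 0.0]),
    "king":  np.array([1.0, 0.0, 1.0]),
    "queen": np.array([1.0, 1.0, 1.0]),
    "table": np.array([-1.0, 0.0, -1.0]),
}

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def odd_one_out(words, vectors):
    """Return the word least similar to the mean vector of the group."""
    mean = np.mean([vectors[w] for w in words], axis=0)
    return min(words, key=lambda w: cosine_similarity(vectors[w], mean))

def analogy(a, b, c, vectors):
    """Solve a : b :: c : ? as the nearest word to b - a + c."""
    target = vectors[b] - vectors[a] + vectors[c]
    # Exclude the query words themselves, as is usual for this task.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine_similarity(vectors[w], target))

print(cosine_similarity(vectors["man"], vectors["woman"]))
print(odd_one_out(["man", "woman", "king", "table"], vectors))
print(analogy("man", "woman", "king", vectors))
```

With real embeddings, the same three functions would be run over a test set of word pairs, word groups, and analogy quadruples, and the fraction of correct answers would give a score per model.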


2021 ◽  
Vol 10 (12) ◽  
pp. 2627
Author(s):  
Pierre-Edouard Fournier ◽  
Sophie Edouard ◽  
Nathalie Wurtz ◽  
Justine Raclot ◽  
Marion Bechet ◽  
...  

The Méditerranée Infection University Hospital Institute (IHU) is housed in a recent building that brings together experts on a wide range of infectious diseases. The IHU strategy is to develop innovative tools, including epidemiological monitoring, point-of-care laboratories, and the ability to mass screen the population. In this study, we review the strategy and guidelines proposed by the IHU and their application to the COVID-19 pandemic and summarise the various challenges it raises. Early diagnosis enables contagious patients to be isolated and treatment to be initiated at an early stage to reduce the microbial load and contagiousness. In the context of the COVID-19 pandemic, we had to deal with a shortage of personal protective equipment and reagents and a massive influx of patients. Between 27 January 2020 and 5 January 2021, 434,925 nasopharyngeal samples were tested for the presence of SARS-CoV-2. Of them, 12,055 patients with COVID-19 were followed up in our out-patient clinic, and 1888 patients were hospitalised in the Institute. By constantly adapting our strategy to the ongoing situation, the IHU has succeeded in expanding and upgrading its equipment and improving circuits and flows to better manage infected patients.


Geomatics ◽  
2021 ◽  
Vol 1 (1) ◽  
pp. 34-49
Author(s):  
Mael Moreni ◽  
Jerome Theau ◽  
Samuel Foucher

The combination of unmanned aerial vehicles (UAV) with deep learning models has the capacity to replace manned aircraft for wildlife surveys. However, the scarcity of animals in the wild often leads to highly unbalanced, large datasets for which even a good detection method can return a large number of false detections. Our objectives in this paper were to design a training method that would reduce training time, decrease the number of false positives and alleviate the fine-tuning effort of an image classifier in the context of animal surveys. We acquired two highly unbalanced datasets of deer images with a UAV and trained a ResNet-18 classifier using hard-negative mining and a series of recent techniques. Our method achieved sub-decimal false positive rates on two test sets (1 false positive per 19,162 and 213,312 negatives respectively), while training on small but relevant fractions of the data. The resulting training times were therefore significantly shorter than they would have been using the whole datasets. This high level of efficiency was achieved with little tuning effort and using simple techniques. We believe this parsimonious approach to dealing with highly unbalanced, large datasets could be particularly useful to projects with either limited resources or extremely large datasets.
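The selection step at the core of hard-negative mining can be sketched generically as follows. This is not the authors' procedure, which the abstract does not detail; the function name, the 0.5 threshold, and the toy scores are assumptions for illustration. The idea is to keep only those negative examples that the current classifier wrongly scores as positive, so that the next training round concentrates on a small, informative fraction of the data.

```python
import numpy as np

def mine_hard_negatives(scores, labels, threshold=0.5, max_keep=None):
    """Return indices of negatives scored at or above `threshold`, hardest first.

    scores: predicted probability of the positive class for each example.
    labels: ground-truth labels (1 = animal, 0 = background).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    neg_idx = np.flatnonzero(labels == 0)            # all true negatives
    hard = neg_idx[scores[neg_idx] >= threshold]     # negatives predicted positive
    hard = hard[np.argsort(-scores[hard], kind="stable")]  # most confident errors first
    return hard if max_keep is None else hard[:max_keep]

# Hypothetical scores from a first-round classifier.
scores = [0.90, 0.10, 0.70, 0.95, 0.40, 0.60]
labels = [1, 0, 0, 0, 0, 1]
print(mine_hard_negatives(scores, labels))              # indices 3 and 2
print(mine_hard_negatives(scores, labels, max_keep=1))  # index 3 only
```

In a full training loop, the returned indices would be added to the training subset, the classifier retrained, and the mining step repeated until few false positives remain.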


2021 ◽  
Vol 17 ◽  
Author(s):  
Alaa Ibrahim Ali ◽  
Wassan Nori Mohammed Hassan ◽  
Sumaya Alrawi

Background: Polycystic ovarian syndrome (PCOS) is a common endocrine syndrome in which women have a wide range of clinical presentations; insulin resistance has been linked to its pathogenesis. Objective: We aimed to investigate the role of copeptin as a predictive marker of insulin resistance among women with PCOS. Material and Methods: In a university hospital, we included 280 women: 140 healthy controls and 140 women with PCOS. The PCOS cases were subdivided into two groups according to insulin resistance: group 1, with a homeostasis model assessment of insulin resistance < 2.5, and group 2, with a homeostasis model assessment of insulin resistance > 2.5. Body mass index and blood pressure were evaluated for all participants, and blood samples were taken to estimate follicle-stimulating hormone, luteinizing hormone, prolactin, estradiol, sex hormone-binding globulin, total testosterone, fasting insulin, dehydroepiandrosterone sulfate, C-reactive protein, plasma glucose, free androgen index, and plasma copeptin (using the Copeptin-Human EIA Kit); transvaginal ultrasound was used for ovarian assessment. Results: Compared with the other groups, PCOS women with insulin resistance > 2.5 had a significantly higher plasma copeptin level. The ROC curve gave a plasma copeptin cutoff value of 1.94 pmol/L for detecting insulin resistance in PCOS, with 88% sensitivity and 36% specificity; the AUC was 0.88. Conclusion: The significant positive relationship between serum copeptin and insulin resistance, together with the high sensitivity, implies its usefulness as a marker of insulin resistance among PCOS patients and as a predictor of its complications.
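The grouping by homeostasis model assessment of insulin resistance (HOMA-IR) can be made concrete with a short sketch. The abstract does not state which formula was used; the code below assumes the commonly cited form, fasting insulin (µU/mL) × fasting glucose (mmol/L) / 22.5, and the function names and sample values are hypothetical. The abstract also leaves a value of exactly 2.5 unassigned, so placing it in group 2 here is an assumption.

```python
def homa_ir(fasting_insulin_uU_ml, fasting_glucose_mmol_l):
    """HOMA-IR under the commonly cited formula (assumed, not from the abstract)."""
    return fasting_insulin_uU_ml * fasting_glucose_mmol_l / 22.5

def insulin_resistance_group(homa_value, cutoff=2.5):
    """Group 1 below the cutoff, group 2 otherwise (boundary handling is an assumption)."""
    return 1 if homa_value < cutoff else 2

# Hypothetical patients: (fasting insulin in µU/mL, fasting glucose in mmol/L).
for insulin, glucose in [(10.0, 5.0), (15.0, 5.5)]:
    value = homa_ir(insulin, glucose)
    print(round(value, 2), insulin_resistance_group(value))
```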


2021 ◽  
Vol 39 (6_suppl) ◽  
pp. 123-123
Author(s):  
Gunhild Von Amsberg ◽  
Mirjam Zilles ◽  
Philipp Gild ◽  
Winfried Alsdorf ◽  
Lukas Boeckelmann ◽  
...  

123 Background: Recent developments in the treatment of metastatic castration-resistant prostate cancer (mCRPC) have led to a revival of platinum-based chemotherapy, which has demonstrated increased activity in patients with aggressive variants of the disease. Here, we report the results of a combinational salvage therapy with cisplatin, ifosfamide and paclitaxel (TIP) in mCRPC. Methods: We retrospectively analyzed patients with mCRPC treated with TIP at the University Hospital Hamburg-Eppendorf between November 2013 and September 2020. Accompanying in vitro analyses were performed using human prostate carcinoma cell lines harboring different levels of drug resistance, including the docetaxel-resistant sublines PC3-DR and DU45-DR. Results: In total, 17 mCRPC patients treated with TIP, with a median age of 65 years, were eligible for efficacy analyses. At baseline, liver metastases were present in 88%, metastases at other visceral sites (lung, adrenal gland, brain) in 47% and bone metastases in 76% of the patients. Median hemoglobin was 9.8 mg/dl, LDH 903 U/l and AP 205 U/l. Median PSA value was 77 ng/ml, covering a wide range and including three patients with a PSA value below 1 ng/ml. NSE was evaluated in 83% of the patients (median 38.5 U/l). Patients were extensively pretreated, with a median of three treatment lines before TIP (100% docetaxel, 82% abiraterone and/or enzalutamide, 47% cabazitaxel, 41% others). A median of 3.5 cycles of TIP was applied, with 29% of the patients receiving the maximum of 6 cycles. Four patients discontinued treatment due to side effects (PNP, infection, ifosfamide-induced psychosis). At interim analyses, 59% of the patients showed a radiological response or stable disease, with only one patient progressing until the end of treatment. Median PFS was 2.5 months, median OS 6 months. A decrease of PSA > 30% and LDH > 50% was observed in 41% and 35% of the patients, respectively.
In vitro experiments revealed additive effects of TIP in 22Rv1, LNCaP and DU45 cells and synergistic effects in neuroendocrine LASCPC-01 cells. In PC3 cells, TIP induced antagonistic effects at lower doses, whereas dose-independent additive effects were observed in docetaxel-resistant PC3-DR. Surprisingly, preliminary data from combined therapies with different drug pairs suggest an antagonistic effect of paclitaxel in combination with both cisplatin and ifosfamide. Conclusions: Combinational therapy with cisplatin, ifosfamide and paclitaxel showed promising activity in some patients with aggressive mCRPC. Preclinical data suggest that the combination of cisplatin and ifosfamide drives the efficacy of TIP in mCRPC.


2020 ◽  
Vol 47 (2) ◽  
pp. 202-214
Author(s):  
Joao Soliman-Junior ◽  
Carlos T. Formoso ◽  
Patricia Tzortzopoulos

Healthcare projects are known for having a high degree of complexity. Furthermore, the design of healthcare facilities is highly constrained by regulations containing a wide range of requirements. Using BIM for automated rule checking has been pointed out as an opportunity to improve requirements management in these projects. However, most existing research is focused on hard-coded approaches or on limited sets of requirements. The aim of this investigation is to propose a semantic-based framework for automated rule checking in the context of healthcare design. An empirical study was conducted in the redevelopment of a university hospital, using Design Science Research as a methodological approach. Results indicate that the nature of regulations and the subjectivity of requirements have a major impact on the possibility of their translation into logical rules, which is needed to enable automated checking. The main theoretical contribution is a taxonomy for automated rule checking and information transformation.


2018 ◽  
Vol 7 (4) ◽  
pp. 2153
Author(s):  
P A. Dhulekar ◽  
S T. Gandhe

In recent years, a large amount of work has been carried out on recognizing human actions, perhaps because of its wide range of applications in the fields of surveillance, human-machine interaction and video analysis. Several methods have been proposed by researchers to resolve action recognition challenges such as variations in viewpoint, occlusion, cluttered backgrounds and camera motion. To address these challenges, we propose a novel method comprising feature extraction using the histogram of oriented gradients (HOG) and classification using k-nearest neighbor (k-NN) and support vector machine (SVM). Six different experiments were carried out on the basis of hybrid combinations of feature extractors and classifiers. Two gold-standard datasets, KTH and Weizmann, were used for training and testing. Quantitative parameters such as recognition accuracy, training time and prediction speed were used for evaluation. To validate the applicability of the proposed algorithm, its performance has been compared with the spatio-temporal interest points (STIP) technique, which was proposed as a state-of-the-art method in the domain.


1979 ◽  
Vol 23 (1) ◽  
pp. 75-79 ◽  
Author(s):  
Dennis B. Beringer

Systematic and economic design and evaluation strategies were applied to a computer-generated 4-D aerial navigation system. During the evaluation each of 24 experienced instrument pilots received training in a PLATO-based digital flight simulator using either a keyboard entry/static map, keyboard entry/dynamic map, or touch entry/dynamic map system. Tasks performed during the execution of an area navigation course included continuous flight control, navigation data updating, digital data entry, and amended course plotting. Digital data entry training time was comparable for all three systems but the touch-map proved superior for the plotting tasks, greatly reducing training and task execution times while virtually eliminating errors. Subsequent performance evaluation showed that the touch-map reduced flight path tracking error, increased processing rates on a digit-cancelling secondary task, and increased the accuracy of manual plotting operations. It was concluded that a touch entry system could significantly reduce cockpit workload across a wide range of operational environments.


2020 ◽  
Vol 10 (19) ◽  
pp. 6893
Author(s):  
Yerai Doval ◽  
Jesús Vilares ◽  
Carlos Gómez-Rodríguez

Research on word embeddings has mainly focused on improving their performance on standard corpora, disregarding the difficulties posed by noisy texts in the form of tweets and other types of non-standard writing from social media. In this work, we propose a simple extension to the skipgram model in which we introduce the concept of bridge-words, which are artificial words added to the model to strengthen the similarity between standard words and their noisy variants. Our new embeddings outperform baseline models on noisy texts on a wide range of evaluation tasks, both intrinsic and extrinsic, while retaining good performance on standard texts. To the best of our knowledge, this is the first explicit approach to dealing with these types of noisy texts at the word embedding level that goes beyond support for out-of-vocabulary words.

