Extending BERT for Clinical Semantic Textual Similarity (Preprint)

Mapping Intimacies ◽

10.2196/preprints.22795 ◽

2020 ◽

Author(s):

Klaus Kades ◽

Jan Sellner ◽

Gregor Koehler ◽

Peter M. Full ◽

T.Y. Emmy Lai ◽

...

Keyword(s):

Correlation Coefficient ◽

Pearson Correlation ◽

Language Models ◽

Training Dataset ◽

Pearson Correlation Coefficient ◽

Text Data ◽

Test Dataset ◽

Clinical Text ◽

Starting Point ◽

Semantic Textual Similarity

BACKGROUND Natural Language Understanding enables automatic extraction of relevant information from clinical text data which are acquired every day in hospitals. In 2018, the language model BERT was introduced generating new state of the art results on several downstream tasks. The National NLP Clinical Challenges (n2c2) was initiated to tackle such downstream tasks on clinical text data where domain adapted methods might be a way to further improve language models like BERT. OBJECTIVE Optimally leverage BERT for the task of semantic textual similarity on clinical text data. METHODS We used BERT as an initial baseline and analysed its results which we used as a starting point to develop three different approaches where we (1) added additional, handcrafted sentence similarity features to the classifier token of BERT and combined the results with more features in multiple regression estimators, (2) incorporated a built-in ensembling method, M-Heads, into BERT by duplicating the regression head and applying an adapted training strategy to facilitate the focus of the heads on different input patterns of the medical sentences and (3) developed a graph-based similarity approach for medications which allows extrapolating similarities across known entities from the training set. The approaches were evaluated with the Pearson correlation coefficient between the predicted scores and ground truth on the official training and test dataset. RESULTS We improve the performance of BERT on the test dataset from a Pearson correlation coefficient of 0.859 to 0.883 using a combination of the M-Heads and the graph-based similarity approach. We also show differences between the test and training dataset and how they influence the results. CONCLUSIONS We found that using a graph-based similarity approach has the potential to extrapolate domain specific knowledge to unseen sentences. For the evaluation, we observed that it is easily possible to get deceived by results on the test dataset especially when the distribution of the data samples is different between the training and test datasets.

Download Full-text

On the Psychological Reality of the Pearson Correlation Coefficient: Effects of Objective Covariation on Subjective Estimates

PsycEXTRA Dataset ◽

10.1037/e537052012-416 ◽

2004 ◽

Author(s):

Richard B. Anderson ◽

Andrea M. Angott ◽

Michael E. Doherty

Keyword(s):

Correlation Coefficient ◽

Pearson Correlation ◽

Pearson Correlation Coefficient ◽

Psychological Reality ◽

Subjective Estimates

Download Full-text

Relationship between Invasive and Non-Invasive Hemodynamic Measures in Experimental Pulmonary Hypertension

Current Respiratory Medicine Reviews ◽

10.2174/1573398x16666200516180118 ◽

2020 ◽

Vol 16 (1) ◽

pp. 47-53

Author(s):

Vicente Benavides-Córdoba ◽

Mauricio Palacios Gómez

Keyword(s):

Heart Rate ◽

Pulmonary Hypertension ◽

Ejection Fraction ◽

Right Ventricle ◽

Correlation Coefficient ◽

Pearson Correlation ◽

Systolic Pressure ◽

Control Group ◽

Pearson Correlation Coefficient ◽

Non Invasive

Introduction: Animal models have been used to understand the pathophysiology of pulmonary hypertension, to describe the mechanisms of action and to evaluate promising active ingredients. The monocrotaline-induced pulmonary hypertension model is the most used animal model. In this model, invasive and non-invasive hemodynamic variables that resemble human measurements have been used. Aim: To define if non-invasive variables can predict hemodynamic measures in the monocrotaline-induced pulmonary hypertension model. Materials and Methods: Twenty 6-week old male Wistar rats weighing between 250-300g from the bioterium of the Universidad del Valle (Cali - Colombia) were used in order to establish that the relationships between invasive and non-invasive variables are sustained in different conditions (healthy, hypertrophy and treated). The animals were organized into three groups, a control group who was given 0.9% saline solution subcutaneously (sc), a group with pulmonary hypertension induced with a single subcutaneous dose of Monocrotaline 30 mg/kg, and a group with pulmonary hypertension with 30 mg/kg of monocrotaline treated with Sildenafil. Right ventricle ejection fraction, heart rate, right ventricle systolic pressure and the extent of hypertrophy were measured. The functional relation between any two variables was evaluated by the Pearson correlation coefficient. Results: It was found that all correlations were statistically significant (p <0.01). The strongest correlation was the inverse one between the RVEF and the Fulton index (r = -0.82). The Fulton index also had a strong correlation with the RVSP (r = 0.79). The Pearson correlation coefficient between the RVEF and the RVSP was -0.81, meaning that the higher the systolic pressure in the right ventricle, the lower the ejection fraction value. Heart rate was significantly correlated to the other three variables studied, although with relatively low correlation. Conclusion: The correlations obtained in this study indicate that the parameters evaluated in the research related to experimental pulmonary hypertension correlate adequately and that the measurements that are currently made are adequate and consistent with each other, that is, they have good predictive capacity.

Download Full-text

Bed-Based Ballistocardiography: Dataset and Ability to Track Cardiovascular Parameters

Sensors ◽

10.3390/s21010156 ◽

2020 ◽

Vol 21 (1) ◽

pp. 156

Author(s):

Charles Carlson ◽

Vanessa-Rose Turpin ◽

Ahmad Suliman ◽

Carl Ade ◽

Steve Warren ◽

...

Keyword(s):

Blood Pressure ◽

Stroke Volume ◽

Systolic Blood Pressure ◽

Correlation Coefficient ◽

Pearson Correlation ◽

Health Assessment ◽

Initial Study ◽

Pearson Correlation Coefficient ◽

Medical Systems ◽

Cardiovascular Parameters

Background: The goal of this work was to create a sharable dataset of heart-driven signals, including ballistocardiograms (BCGs) and time-aligned electrocardiograms (ECGs), photoplethysmograms (PPGs), and blood pressure waveforms. Methods: A custom, bed-based ballistocardiographic system is described in detail. Affiliated cardiopulmonary signals are acquired using a GE Datex CardioCap 5 patient monitor (which collects ECG and PPG data) and a Finapres Medical Systems Finometer PRO (which provides continuous reconstructed brachial artery pressure waveforms and derived cardiovascular parameters). Results: Data were collected from 40 participants, 4 of whom had been or were currently diagnosed with a heart condition at the time they enrolled in the study. An investigation revealed that features extracted from a BCG could be used to track changes in systolic blood pressure (Pearson correlation coefficient of 0.54 +/− 0.15), dP/dtmax (Pearson correlation coefficient of 0.51 +/− 0.18), and stroke volume (Pearson correlation coefficient of 0.54 +/− 0.17). Conclusion: A collection of synchronized, heart-driven signals, including BCGs, ECGs, PPGs, and blood pressure waveforms, was acquired and made publicly available. An initial study indicated that bed-based ballistocardiography can be used to track beat-to-beat changes in systolic blood pressure and stroke volume. Significance: To the best of the authors’ knowledge, no other database that includes time-aligned ECG, PPG, BCG, and continuous blood pressure data is available to the public. This dataset could be used by other researchers for algorithm testing and development in this fast-growing field of health assessment, without requiring these individuals to invest considerable time and resources into hardware development and data collection.

Download Full-text

Assessment of the Dissimilarities of EDI and SPI Measures for Drought Determination in South Africa

Water ◽

10.3390/w13010082 ◽

2021 ◽

Vol 13 (1) ◽

pp. 82

Author(s):

Omolola M. Adisa ◽

Muthoni Masinde ◽

Joel O. Botai

Keyword(s):

South Africa ◽

South African ◽

Correlation Coefficient ◽

Pearson Correlation ◽

Standardized Precipitation Index ◽

Drought Monitoring ◽

Drought Indices ◽

Drought Severity ◽

Pearson Correlation Coefficient ◽

Drought Duration

This study examines the (dis)similarity of two commonly used indices Standardized Precipitation Index (SPI) computed over accumulation periods 1-month, 3-month, 6-month, and 12-month (hereafter SPI-1, SPI-3, SPI-6, and SPI-12, respectively) and Effective Drought Index (EDI). The analysis is based on two drought monitoring indicators (derived from SPI and EDI), namely, the Drought Duration (DD) and Drought Severity (DS) across the 93 South African Weather Service’s delineated rainfall districts over South Africa from 1980 to 2019. In the study, the Pearson correlation coefficient dissimilarity and periodogram dissimilarity estimates were used. The results indicate a positive correlation for the Pearson correlation coefficient dissimilarity and a positive value for periodogram of dissimilarity in both the DD and DS. With the Pearson correlation coefficient dissimilarity, the study demonstrates that the values of the SPI-1/EDI pair and the SPI-3/EDI pair exhibit the highest similar values for DD, while the SPI-6/EDI pair shows the highest similar values for DS. Moreover, dissimilarities are more obvious in SPI-12/EDI pair for DD and DS. When a periodogram of dissimilarity is used, the values of the SPI-1/EDI pair and SPI-6/EDI pair exhibit the highest similar values for DD, while SPI-1/EDI displayed the highest similar values for DS. Overall, the two measures show that the highest similarity is obtained in the SPI-1/EDI pair for DS. The results obtainable in this study contribute towards an in-depth knowledge of deviation between the EDI and SPI values for South Africa, depicting that these two drought indices values are replaceable in some rainfall districts of South Africa for drought monitoring and prediction, and this is a step towards the selection of the appropriate drought indices.

Download Full-text

Fault Detection and Location Method for Mesh-type DC Microgrid using Pearson Correlation Coefficient

IEEE Transactions on Power Delivery ◽

10.1109/tpwrd.2020.3008924 ◽

2020 ◽

pp. 1-1 ◽

Cited By ~ 1

Author(s):

Liang Kong ◽

Heng Nian

Keyword(s):

Fault Detection ◽

Correlation Coefficient ◽

Pearson Correlation ◽

Pearson Correlation Coefficient ◽

Dc Microgrid ◽

Location Method

Download Full-text

Use of Multiple Visits to Increase Blood Pressure Tracking Correlations in Childhood

PEDIATRICS ◽

10.1542/peds.87.5.708 ◽

1991 ◽

Vol 87 (5) ◽

pp. 708-711

Author(s):

Matthew W. Gillman ◽

Bernard Rosner ◽

Denis A. Evans ◽

Laurel A. Smith ◽

James O. Taylor ◽

...

Keyword(s):

Blood Pressure ◽

Diastolic Blood Pressure ◽

Correlation Coefficient ◽

Diastolic Pressure ◽

Pearson Correlation ◽

Systolic Pressure ◽

Pearson Correlation Coefficient ◽

Increase Blood Pressure ◽

Multiple Visit

Previous studies of childhood blood pressure have shown tracking correlations, which estimate the magnitude of association between initial and subsequent measurements, to be lower than corresponding adult values. Inasmuch as this disparity could arise from failing to account for a larger week-to-week variability in children, blood pressure was measured for 4 successive years, on four weekly visits in each year, and with three measurements at each visit, using a random-zero sphygmomanometer, in a cohort of 333 schoolchildren aged 8 through 15 at entry. Ninety percent of subjects had measurements in 1 or more years of follow-up. For all follow-up periods (1, 2, and 3 years from baseline), the Pearson correlation coefficient (r) for both systolic and diastolic blood pressure rose substantially with the number of weekly visits used to calculate each subject's yearly blood pressure (P < .0001). For systolic pressure, the 3-year r values for 1, 2, 3, and 4 visits were .45, .55, .64, and .69, respectively. For diastolic pressure (Korotkoff phase 4), the corresponding values were .28, .41, .47, and .54. These higher multiple-visit estimates of tracking approximate published adult values and raise the possibility that prediction of adult blood pressure from childhood measurements may be improved by averaging readings from multiple weekly visits.

Download Full-text

Algorithm for Eliminating Mismatched Points Based on Pearson Correlation Coefficient

Laser & Optoelectronics Progress ◽

10.3788/lop202158.0810025 ◽

2021 ◽

Vol 58 (8) ◽

pp. 0810025

Author(s):

李硕 Li Shuo ◽

韩迎东 Han Yingdong ◽

王双 Wang Shuang ◽

刘琨 Liu Kun ◽

江俊峰 Jiang Junfeng ◽

...

Keyword(s):

Correlation Coefficient ◽

Pearson Correlation ◽

Pearson Correlation Coefficient

Download Full-text

A new sampling method in particle filter based on Pearson correlation coefficient

Neurocomputing ◽

10.1016/j.neucom.2016.07.036 ◽

2016 ◽

Vol 216 ◽

pp. 208-215 ◽

Cited By ~ 83

Author(s):

Haomiao Zhou ◽

Zhihong Deng ◽

Yuanqing Xia ◽

Mengyin Fu

Keyword(s):

Particle Filter ◽

Correlation Coefficient ◽

Sampling Method ◽

Pearson Correlation ◽

Pearson Correlation Coefficient

Download Full-text

Analisis Pertumbuhan Tanaman Restorasi di Taman Nasional Matalawa

Journal of Tropical Silviculture ◽

10.29244/j-siltrop.11.3.170-176 ◽

2020 ◽

Vol 11 (3) ◽

pp. 170-176

Author(s):

Suyogia Nur Azis ◽

Nurheni Wijayanto ◽

Arum Sekar Wulandari

Keyword(s):

Environmental Factors ◽

Correlation Coefficient ◽

Forest Restoration ◽

National Park ◽

Pearson Correlation ◽

Ecosystem Restoration ◽

Strong Relationship ◽

Pearson Correlation Coefficient ◽

Open Area ◽

Plant Restoration

Ecosystem restoration in Matalawa National Park is an effort to restore the condition of the forest so as to achieve the function of forest to maintain biodiversity. This research aims to analyze the biophysical influence against growth of plant restoration. The research was conducted in Manurara, Taman Mas, Tangairi and Waimanu. The object of the research consists of adinu plant (Melochia umbellata), cimung plant (Timonius timon), kihi plant (Canarium acutifolium), langaha plant (Planchonia valida) and mara plant (Tetrameles nudiflora). The experiment was analyzed by pearson correlation coefficient (r) and t test. The results showed biophysical environmental factors have a very strong relationship with the growth of restoration plants in Matalawa National Park is phosphorus, CEC, pH and altitude of site. Besides, the growth of adinu plant is higher than other plants in open area conditions. Keywords: biophysical, characteristics of plant species, forest restoration, national park

Download Full-text

Evaluation of Pearson correlation coefficient and Parisi parameter of replica symmetry breaking in a hybrid electronically addressable random fiber laser

Optics Express ◽

10.1364/oe.431981 ◽

2021 ◽

Author(s):

Ernesto Raposo ◽

EDWIN DANELLI CORONEL SANCHEZ ◽

Avishek Das ◽

Iván González ◽

Anderson S. Gomes ◽

...

Keyword(s):

Symmetry Breaking ◽

Fiber Laser ◽

Correlation Coefficient ◽

Pearson Correlation ◽

Pearson Correlation Coefficient ◽

Replica Symmetry ◽

Replica Symmetry Breaking

Download Full-text