similarity metrics
Recently Published Documents


Total documents: 380 (five years: 120)

H-index: 23 (five years: 4)

Author(s):  
Asma Islam ◽  
Eshrat Jahan Esha ◽  
Sheikh Farhana Binte Ahmed ◽  
Md. Kafiul Islam

Motion artifacts complicate the acquisition of clean electroencephalography (EEG) data and are one of the major challenges for ambulatory EEG. Reducing motion artifacts can significantly improve the performance of mobile health monitoring, neurological disorder diagnosis, and surgery. Although various papers have proposed novel approaches for removing motion artifacts, the datasets used to validate those algorithms are questionable. This paper presents a unique EEG dataset recorded while ten different activities were performed. No previous EEG recordings made with the EMOTIV EEG headset are available that explicitly identify and consider a number of daily activities that induce motion artifacts. A quantitative study shows that coherence analysis provides a better similarity measure between motion artifacts and motion sensor data than the correlation coefficient. The motion artifacts were characterized by very low frequencies that overlap with the delta rhythm of the EEG. A general wavelet-transform-based approach for removing motion artifacts is also presented. Further experiments and analysis with more similarity metrics and longer recordings of each activity are required to finalize the characteristics of motion artifacts and, from there, to reliably identify and remove them from contaminated EEG recordings.
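The comparison between the correlation coefficient and coherence can be illustrated with a small sketch. It is not the authors' pipeline: the two signals are synthetic stand-ins (a 2 Hz sine and a quarter-period-delayed copy, each with added noise), and the helper names `pearson`, `dft_bin`, and `coherence_at_bin` are hypothetical. It shows only one general property, namely that a phase lag can push the correlation coefficient toward zero while the magnitude-squared coherence at the shared frequency stays close to one. Whether this explains the result reported in the abstract is an assumption.

```python
import cmath
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def dft_bin(segment, k):
    """Single DFT coefficient (bin k) of one segment, computed directly."""
    n = len(segment)
    return sum(v * cmath.exp(-2j * math.pi * k * i / n)
               for i, v in enumerate(segment))


def coherence_at_bin(x, y, seg_len, k):
    """Magnitude-squared coherence at bin k, averaged over
    non-overlapping segments without windowing (a simplification)."""
    pxx = pyy = 0.0
    pxy = 0j
    for start in range(0, len(x) - seg_len + 1, seg_len):
        fx = dft_bin(x[start:start + seg_len], k)
        fy = dft_bin(y[start:start + seg_len], k)
        pxx += abs(fx) ** 2
        pyy += abs(fy) ** 2
        pxy += fx * fy.conjugate()
    return abs(pxy) ** 2 / (pxx * pyy)


# Synthetic stand-ins (hypothetical, not the paper's data).
FS = 64                      # samples per second
rng = random.Random(0)       # fixed seed for repeatability
t = [i / FS for i in range(8 * FS)]
sensor = [math.sin(2 * math.pi * 2 * ti) + rng.gauss(0, 0.2) for ti in t]
artifact = [math.sin(2 * math.pi * 2 * ti - math.pi / 2) + rng.gauss(0, 0.2)
            for ti in t]

r = pearson(sensor, artifact)
# With one-second segments the bin spacing is 1 Hz, so bin 2 is 2 Hz.
c = coherence_at_bin(sensor, artifact, seg_len=FS, k=2)
print(f"correlation: {r:.3f}, coherence at 2 Hz: {c:.3f}")
```

In this toy setting the correlation is near zero and the coherence is near one, even though both signals carry the same 2 Hz component.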


2021 ◽  
pp. 1-12
Author(s):  
Jianfei Zhang ◽  
Wenge Rong ◽  
Dali Chen ◽  
Zhang Xiong

Traditional end-to-end Neural Question Generation (NQG) models tend to generate generic and bland questions because two points remain obscure: 1) the modifications of the answer in the context can serve as clues to the answer within the question, but they are generally not unique and can be used independently to generate diverse questions; 2) the same question content can be asked in diverse ways, depending on personal preference in practice. These two points are in effect two variables that govern question generation, but they are not annotated in the original dataset and are thus ignored by traditional end-to-end models. In this paper we propose a framework that clarifies these two points through two sub-modules to better conduct question generation. We conduct experiments with the GPT-2 model and the SQuAD dataset and show that our framework improves performance as measured by similarity metrics, while also providing appropriate alternatives for controllable diversity enhancement.


2021 ◽  
Author(s):  
Jong-Kang Lee ◽  
Jue-Ni Huang ◽  
Kun-Ju Lin ◽  
Richard Tzong-Han Tsai

BACKGROUND Electronic records provide rich clinical information for biomedical text mining. However, a system developed on one hospital department may not generalize to other departments. Here, we use hospital medical records as a research data source and explore the heterogeneity problem posed by different hospital departments. OBJECTIVE We use MIMIC-III hospital medical records as the research data source. We collaborate with medical experts to annotate the data, with 328 records included in the analyses. Disease named entity recognition (NER), which helps medical experts consolidate diagnoses, is undertaken as a case study. METHODS To compare the heterogeneity of medical records across departments, we access text from multiple departments and apply similarity metrics. We apply transfer learning to NER on different departments’ records and test the correlation between performance and similarity metrics. We use the TF-IDF cosine similarity of the named entities as our similarity metric. We use three pretrained models on the disease NER task to validate the consistency of the results. RESULTS The disease NER dataset we release consists of 328 medical records from MIMIC-III, with 95629 sentences and 8884 disease mentions in total. The inter-annotator agreement, measured by Cohen’s kappa coefficient, is 0.86. Similarity metrics indicate that medical records from different departments are heterogeneous, with similarity to the Medical department ranging from 0.1004 to 0.3541. In the transfer learning task using the Medical department as the training set, the average F1 scores of the three pretrained models range from 0.847 to 0.863. F1 scores correlate with similarity metrics with a Spearman’s coefficient of 0.4285. CONCLUSIONS We propose a disease NER dataset based on medical records from MIMIC-III and demonstrate the effectiveness of transfer learning using BERT. Similarity metrics reveal noticeable heterogeneity between department records. The deep learning-based transfer learning method generalizes well across departments and achieves decent NER performance, thus eliminating the concern that training material from one hospital might compromise model performance when applied to another. However, model performance does not correlate strongly with the departments’ similarity.
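The two measures named in this abstract, TF-IDF cosine similarity over named entities and Spearman's rank correlation, can be sketched as follows. This is a minimal illustration under stated assumptions: the department names and entity lists are invented toy data, the smoothed IDF formula is one common variant and may differ from the authors' weighting, and `spearman_no_ties` handles only inputs without tied values.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each document name to a sparse {term: weight} TF-IDF vector.

    docs: dict of name -> list of entity strings.
    Uses a smoothed IDF (an assumption, not necessarily the paper's choice).
    """
    n = len(docs)
    df = Counter()
    for terms in docs.values():
        df.update(set(terms))          # document frequency per term
    vectors = {}
    for name, terms in docs.items():
        tf = Counter(terms)            # raw term frequency
        vectors[name] = {
            term: count * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def spearman_no_ties(xs, ys):
    """Spearman rank correlation for sequences without tied values."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        result = [0] * len(values)
        for rank, index in enumerate(order, start=1):
            result[index] = rank
        return result

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical entity mentions per department (toy data).
departments = {
    "medical": ["pneumonia", "sepsis", "pneumonia", "hypertension"],
    "surgical": ["fracture", "sepsis", "hernia"],
    "cardiac": ["hypertension", "infarction", "hypertension"],
}
vecs = tfidf_vectors(departments)
for name in ("surgical", "cardiac"):
    print(name, round(cosine(vecs["medical"], vecs[name]), 4))
```

Given one similarity value and one F1 score per target department, `spearman_no_ties(similarities, f1_scores)` would yield the kind of rank correlation the abstract reports.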


2021 ◽  
Author(s):  
Steven Frank

Pathology slides of malignancies are segmented using lightweight convolutional neural networks (CNNs) that may be deployed on mobile devices. This is made possible by preprocessing candidate images to make CNN analysis tractable and also to exclude regions unlikely to be diagnostically relevant. In a training phase, labeled whole-slide histopathology images are first downsampled and decomposed into square tiles. Tiles corresponding to diseased regions are analyzed to determine boundary values of a visual criterion, image entropy. A lightweight CNN is then trained to distinguish tiles of diseased and non-diseased tissue, and if more than one disease type is present, to discriminate among these as well. A segmentation is generated by downsampling and tiling a candidate image, and retaining only those tiles with values of the visual criterion falling within the previously established extrema. The sifted tiles, which now exclude much of the non-diseased image content, are efficiently and accurately classified by the trained CNN. Tiles classified as diseased tissue (or, in the case of multiple possible subtypes, as the dominant subtype in the tile set) are combined, either as a simple union or at a pixel level, to produce a segmentation mask or map. This approach was applied successfully to two very different datasets of large whole-slide images, one (PAIP2020) involving multiple subtypes of colorectal cancer and the other (CAMELYON16) single-type breast-cancer metastases. Scored using standard similarity metrics, the segmentations exhibited notably high recall, even when tiles were large relative to tumor features.
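The entropy-based sifting step can be sketched as below. The abstract does not define "image entropy" precisely, so this sketch assumes the Shannon entropy of the grayscale intensity histogram; the function names and the tiny tiles are hypothetical, and the CNN classification stage is omitted.

```python
import math
from collections import Counter


def tile_entropy(tile):
    """Shannon entropy in bits of a tile's pixel-intensity histogram.

    tile: 2D list of integer grayscale values.
    """
    pixels = [p for row in tile for p in row]
    n = len(pixels)
    counts = Counter(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def entropy_bounds(diseased_tiles):
    """Extrema of the entropy criterion over tiles labeled as diseased."""
    values = [tile_entropy(t) for t in diseased_tiles]
    return min(values), max(values)


def sift(tiles, low, high):
    """Keep only tiles whose entropy falls within [low, high]."""
    return [t for t in tiles if low <= tile_entropy(t) <= high]


# Toy 2x2 tiles (hypothetical): blank background versus textured tissue.
blank = [[255, 255], [255, 255]]
two_tone = [[10, 200], [200, 10]]
textured = [[10, 80], [160, 240]]

low, high = entropy_bounds([two_tone, textured])
kept = sift([blank, two_tone, textured], low, high)
print(len(kept))
```

Here the blank tile has zero entropy and falls below the lower bound learned from the diseased tiles, so only the other two tiles would be passed on to a classifier.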


2021 ◽  
Vol 9 ◽  
Author(s):  
M. Yu. Cherbunina ◽  
E. S. Karaevskaya ◽  
Yu. K. Vasil’chuk ◽  
N. I. Tananaev ◽  
D. G. Shmelev ◽  
...  

Biotracers marking the geologic history and permafrost evolution in Central Yakutia, including Yedoma Ice Complex (IC) deposits, were identified in a multiproxy analysis of water chemistry, isotopic signatures, and microbial datasets. The key study sections were the Mamontova Gora and Syrdakh exposures, both well covered in the literature. In the Mamontova Gora section, two distinct IC strata with massive ice wedges, the upper and the lower, were described and sampled, whereas previously published studies focused only on the lower IC horizon. Our results suggest that these two IC horizons differ in the water origin of their wedge ice and in their cryogenic evolution, as evidenced by differences in their chemistry, water isotopic signatures, and microbial community compositions. Microbial community similarity between ground ice and host deposits is shown to be a proxy for syngenetic deposition and freezing. High community similarity indicates syngenetic formation of ice wedges and host deposits in the lower IC horizon at the Mamontova Gora exposure. The upper IC horizon in this exposure has much lower similarity metrics between ice wedges and host sediments, and we suggest epigenetic ice wedge development in this stratum. We found a certain correspondence between the water origin and degree of evaporative transformation in ice wedges and the microbial community composition, notably the presence of Chloroflexia bacteria, represented by the Gitt-GS-136 and KD4-96 classes. These bacteria are absent from the ice wedges of the lower IC stratum at Mamontova Gora, which originate from snowmelt, but are abundant in the Syrdakh ice wedges, where the meltwater underwent evaporative isotopic fractionation. Minor evaporative transformation of water in the upper IC horizon of Mamontova Gora, whose ice wedges formed from meltwater that was additionally fractionated, corresponds with a moderate abundance of these classes in its bacterial community.


2021 ◽  
Vol 2021 (29) ◽  
pp. 300-305
Author(s):  
Mirko Agarla ◽  
Simone Bianco ◽  
Luigi Celona ◽  
Raimondo Schettini ◽  
Mikhail Tchobanou

In this paper we analyze the most widely used measures for assessing the spectral similarity of reflectance and radiance signals. First, we divide them into five groups on the basis of the type of errors they measure. We then analyze their mathematical definitions to identify unintended behaviors and the types of errors they are blind to. Next, using the Munsell atlas, we analyze the correlation between metrics in terms of both Pearson's Linear Correlation Coefficient (PLCC) and Spearman's Rank Order Correlation Coefficient (SROCC). Finally, we analyze the behaviour of the selected metrics with respect to two different color properties: the Chroma and the Lightness computed in the CIE L* a* b* color space. The source code of the spectral measures considered is available at the following link: https://celuigi.github.io/spectral-similarity-metrics-comparison/.
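The idea that a spectral measure can be blind to a type of error can be illustrated with two widely used measures, root mean square error and the spectral angle. The abstract does not list the measures it analyzes, so treating these two as representative is an assumption, and the four-band reflectance values below are invented. The sketch shows that the spectral angle ignores a uniform intensity scaling that RMSE registers.

```python
import math


def rmse(a, b):
    """Root mean square error between two spectra of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def spectral_angle(a, b):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / (na * nb)))
    return math.acos(cos_angle)


# Hypothetical reflectance spectrum and a uniformly brighter copy.
reference = [0.10, 0.25, 0.40, 0.30]
brighter = [2 * v for v in reference]

print("RMSE:", round(rmse(reference, brighter), 4))
print("spectral angle:", spectral_angle(reference, brighter))
```

The spectral angle is effectively zero for the scaled copy, so it reports a perfect match in shape while RMSE reports a clear magnitude error.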


2021 ◽  
Author(s):  
Caroline Skirrow ◽  
Marton Meszaros ◽  
Udeepa Meepegama ◽  
Raphael Lenain ◽  
Kathryn V Papp ◽  
...  

INTRODUCTION: Longitudinal data is key to identifying cognitive decline and treatment response in Alzheimer's disease (AD). METHODS: The Automatic Story Recall Task (ASRT) is a novel, fully automated test that can be self-administered remotely. In this longitudinal case-control observational study, 151 participants (mean age: 69.99 (range 54-82), 73 mild cognitive impairment/mild AD and 78 cognitively unimpaired) completed parallel ASRT assessments on their smart devices over 7-8 days. Responses were automatically transcribed and scored using text similarity metrics. RESULTS: Participants reported good task usability. Adherence to optional daily assessment was moderate. Parallel forms correlation coefficients between ASRTs were moderate-high. ASRTs correlated moderately with established tests of episodic memory and global cognitive function. Poorer performance was observed in participants with MCI/Mild AD. DISCUSSION: Unsupervised ASRT assessment is feasible in older and cognitively impaired people. This automated task shows good parallel forms reliability and convergent validity with established cognitive tests.


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
John C. Thomas ◽  
Anirudh Raju Natarajan ◽  
Anton Van der Ven

Measuring the similarity between two arbitrary crystal structures is a common challenge in crystallography and materials science. Although there are an infinite number of ways to mathematically relate two crystal structures, only a few are physically meaningful. Here we introduce both a geometry-based and a symmetry-adapted similarity metric to compare crystal structures. Using crystal symmetry and combinatorial optimization we describe an algorithm to arrive at the structural relationship that minimizes these similarity metrics across all possible maps between any pair of crystal structures. The approach makes it possible to (i) identify pairs of crystal structures that are identical, (ii) quantitatively measure the similarity between crystal structures, and (iii) find and rank structural transformation pathways between any pair of crystal structures. We discuss the advantages of using the symmetry-adapted cost metric over the geometric cost. Finally, we show that all known structural transformation pathways between common crystal structures are recovered with the mapping algorithm. The methodology presented in this study will be of value to efforts that seek to catalogue crystal structures, identify structural transformation pathways or prune large first-principles datasets used to parameterize on-lattice Hamiltonians.


Computers ◽  
2021 ◽  
Vol 10 (10) ◽  
pp. 123
Author(s):  
Triyanna Widiyaningtyas ◽  
Indriana Hidayah ◽  
Teguh Bharata Adji

Memory-based collaborative filtering, which relies on similarity metrics, is one of the best-known approaches to recommendation systems. Recent similarity metrics take into account both user ratings and user behavior scores. The user behavior score indicates the user's preference for each product type (genre). Adding the user behavior score to the similarity metric makes the computation more complex. To reduce this complexity, we combined a clustering method with user behavior score-based similarity. The clustering method applies k-means, with the number of clusters determined using the Silhouette Coefficient, while the user behavior score-based similarity uses User Profile Correlation-based Similarity (UPCSim). Experimental results on the MovieLens 100k dataset showed a faster computation time of 4.16 s. In addition, the Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) decreased by 1.88% and 1.46%, respectively, compared to the baseline algorithm.
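The role of the Silhouette Coefficient in choosing a clustering can be sketched as follows. This is an illustration only: the points and the two candidate labelings are hand-written rather than produced by k-means, the function name is hypothetical, and Euclidean distance is assumed. The function expects at least two clusters.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient of a labeling (Euclidean distance).

    For each point, a is its mean distance to the other members of its
    own cluster and b is the smallest mean distance to any other cluster;
    the point's score is (b - a) / max(a, b).
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)   # usual convention for singleton clusters
            continue
        # The point's zero distance to itself is included in the sum,
        # so dividing by len(own) - 1 averages over the other members.
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for key, other in clusters.items() if key != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy user profiles in two dimensions (hypothetical data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
compact = [0, 0, 0, 1, 1, 1]     # labeling that matches the two groups
scrambled = [0, 1, 0, 1, 0, 1]   # labeling that mixes the two groups

good = silhouette_score(points, compact)
bad = silhouette_score(points, scrambled)
print(f"compact: {good:.3f}, scrambled: {bad:.3f}")
```

In a setting like the one the abstract describes, one would compute this score for the k-means result at each candidate number of clusters and keep the value of k with the highest score.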

