mean similarity
Recently Published Documents


Total documents: 35 (five years: 9)
H-index: 13 (five years: 1)

2021, Vol 13 (1)
Author(s): Ramón Alain Miranda-Quintana, Dávid Bajusz, Anita Rácz, Károly Héberger

Abstract: Quantification of the similarity of objects is a key concept in many areas of computational science. This includes cheminformatics, where molecular similarity is usually quantified based on binary fingerprints. While there is a wide selection of available molecular representations and similarity metrics, there were no previous efforts to extend the computational framework of similarity calculations to the simultaneous comparison of more than two objects (molecules). The present study bridges this gap by introducing a straightforward computational framework for comparing multiple objects at the same time and providing extended formulas for as many similarity metrics as possible. In the binary case (i.e., when comparing two molecules pairwise), these reduce naturally to their well-known formulas. We provide a detailed analysis of the effects of various parameters on the similarity values calculated by the extended formulas. The extended similarity indices are entirely general and do not depend on the fingerprints used. Two types of analysis of variance (ANOVA) help to understand the main features of the indices: (i) ANOVA of mean similarity indices; (ii) ANOVA of sum of ranking differences (SRD). Practical aspects and applications of the extended similarity indices are detailed in the accompanying paper: Miranda-Quintana et al. J Cheminform. 2021. 10.1186/s13321-021-00504-4. Python code for calculating the extended similarity metrics is freely available at: https://github.com/ramirandaq/MultipleComparisons.
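
For orientation, the conventional pairwise baseline that the abstract contrasts with can be sketched as follows: compute the Jaccard-Tanimoto similarity for every pair of binary fingerprints and average the results. This is only a minimal illustration of the pairwise case. It is not the extended n-ary formulation of the paper and not the code from the linked repository. The function names and the toy fingerprints are assumptions made for the example.

```python
from itertools import combinations


def tanimoto(a, b):
    """Jaccard-Tanimoto similarity of two equal-length binary fingerprints."""
    both = sum(1 for x, y in zip(a, b) if x and y)    # bits set in both
    either = sum(1 for x, y in zip(a, b) if x or y)   # bits set in at least one
    # Two all-zero fingerprints are treated as identical (a convention, assumed here).
    return both / either if either else 1.0


def mean_pairwise_similarity(fingerprints):
    """Average Tanimoto similarity over all unordered pairs."""
    pairs = list(combinations(fingerprints, 2))
    return sum(tanimoto(a, b) for a, b in pairs) / len(pairs)


# Hypothetical 4-bit fingerprints, for illustration only.
fps = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
]
mean_sim = mean_pairwise_similarity(fps)
```

The number of pairs grows quadratically with the number of fingerprints, which is one practical reason to compare many objects in a single calculation instead.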


Author(s): Zhibiao Peng, Jian Rong, Yiping Wu, Chenjing Zhou, Yuan Yuan, ...

Driving fatigue is one of the main causes of traffic accidents in monotonous environments such as grassland highways. However, how driving fatigue develops on grassland highways is still not clear. A driving simulation experiment with 23 participants was performed to collect data on driving behavior, reaction time and electrocardiogram (ECG) results when driving on a grassland highway. The effective feature indicators of driving fatigue based on driving behavior data were calculated using the Pearson correlation coefficient and principal component analysis. A matter-element model based on the entropy weight method was used to quantify the generation process of driving fatigue (GPDF). GPDF was classified into different patterns according to the eigenvalue of the GPDF curves. Reaction time and ECG data were used to verify the rationality of GPDF. Results show that 13 feature indicators of driving behavior were suitable for describing driving fatigue. GPDF was not completely consistent among participants and was classified into three patterns (i.e., mild, moderate and severe fatigue). The mean similarity for GPDF in each pattern was 0.87, 0.61 and 0.50, respectively. A validation test showed that the accuracy of driving fatigue detection by GPDF was 72%. The mean similarity of the GPDF between driving behavior and ECG was 0.72. Driving fatigue tended to occur after a driving time of 19 min or 33 min. This study helps to clarify GPDF on grassland highways from the perspective of individual driving behavior and can inform the placement of anti-fatigue devices.
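
The entropy weight step named in the abstract can be sketched in its generic textbook form: indicators whose values vary more across samples carry more information and receive larger weights. The sketch below is not the authors' implementation and does not include the matter-element model. It assumes rows are samples, columns are indicators, and all values are non-negative. The function name and the toy data are hypothetical.

```python
import math


def entropy_weights(matrix):
    """Entropy weights for the columns (indicators) of a non-negative matrix."""
    n = len(matrix)            # number of samples
    m = len(matrix[0])         # number of indicators
    k = 1.0 / math.log(n)      # scales each entropy into [0, 1]
    divergences = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        entropy = 0.0
        for x in column:
            p = x / total      # share of sample i in indicator j
            if p > 0:          # 0 * log(0) is taken as 0
                entropy -= k * p * math.log(p)
        divergences.append(1.0 - entropy)
    scale = sum(divergences)
    return [d / scale for d in divergences]


# Hypothetical data: the first indicator is constant, the second varies.
data = [
    [1.0, 1.0],
    [1.0, 3.0],
    [1.0, 8.0],
]
weights = entropy_weights(data)
```

In this toy case the constant first indicator receives a weight close to zero, because it does not distinguish between samples.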


2020
Author(s): Angelo Fortunato, Diego Mallo, Shawn M. Rupp, Lorraine King, Timothy Hardman, ...

Abstract: Most tissue collections of neoplasms are composed of formalin-fixed and paraffin-embedded (FFPE) excised tumor samples used for routine diagnostics. DNA sequencing is becoming increasingly important in cancer research and clinical management; however, it is difficult to accurately sequence DNA from FFPE samples. We developed and validated a new bioinformatic algorithm to robustly identify somatic single nucleotide variants (SNVs) using small amounts of DNA extracted from archival FFPE samples of breast cancers. We optimized this strategy using 28 pairs of technical replicates, that is, the same DNA sample sequenced twice independently. After optimization, the mean similarity between replicates increased 5-fold, reaching 88% (range 0-100%), with a mean of 21.4 SNVs (range 1-68) per sample. We found that SNV-identification accuracy declined when less than 40 ng of DNA was available and that insertion-deletion variant calls are unreliable. This new algorithm provides a crucial improvement in detecting SNVs in FFPE samples.


2020, Vol 8 (2), pp. 171
Author(s): Erni Mariana

Several things must be considered when selecting learning media, including ease of use; handouts are learning media that are easy to make and use. This study aims to determine the effect of using handouts on the physics learning outcomes of grade VIII students of SMP Negeri 3 Tumijajar in the even semester. The research design used was a Posttest-Only Control Design, and the research instruments were observation sheets, questionnaire sheets and tests. The data analysis consisted of the normality test, the homogeneity test, the two-mean similarity test, and the two-mean difference test. The results show that the calculated t value of 3.49 was greater than the table t value of 2.00 at significance level α = 5%, and also greater than the table t value of 2.66 at significance level α = 1%. The hypothesis that the use of handouts affects the physics learning outcomes of grade VIII students of SMP N 3 Tumijajar in the even semester is therefore accepted.


2020
Author(s): Ralf Loritz, Markus Hrachowitz, Malte Neuper, Erwin Zehe

Abstract. This study investigates the role and value of distributed rainfall for the runoff generation of a mesoscale catchment (20 km²). We compare the performance of three hydrological models over different periods and show that a distributed model driven by distributed rainfall yields improved performance only during certain periods. These periods are dominated by convective storms, which are typically characterized by higher spatial and temporal variability than the stratiform precipitation events that dominate rainfall generation in winter. Motivated by these findings, we develop a spatially adaptive model that is capable of dynamically adjusting its spatial structure during runtime to represent the varying importance of distributed rainfall within a hydrological model without losing predictive performance compared to a spatially distributed model. Our results highlight that adaptive modeling might be a promising way to better understand the varying relevance of distributed rainfall in hydrological models, and reiterate that it might be one way to reduce computational times. They furthermore show that hydrological similarity concerning runoff generation does not necessarily mean similarity for other dynamic variables such as the distribution of soil moisture.


Water, 2020, Vol 12 (2), pp. 449
Author(s): Lukas Taxböck, Dirk Nikolaus Karger, Michael Kessler, Daniel Spitale, Marco Cantonati

Understanding the drivers of species richness gradients is a central challenge of ecological and biodiversity research in freshwater science. Species richness along elevational gradients reveals a great variety of patterns. Here, we investigate elevational changes in species richness and turnover between microhabitats in near-natural spring habitats across Switzerland. Species richness was determined for 175 subsamples from 71 near-natural springs, and Poisson regression was applied between species richness and environmental predictors. Compositional turnover was calculated between the different microhabitats within single springs using the Jaccard index based on observed species and the Chao index based on estimated species numbers. In total, 539 diatom species were identified. Species richness increased monotonically with elevation. Habitat diversity and elevation explained some of the species richness per site. The Jaccard index for the measured compositional turnover showed a mean similarity of 70% between microhabitats within springs, whereas the Chao index, which accounts for sampling artefacts, estimated a turnover of only 37%. Thus, the commonly applied method of counting 500 valves led to an undersampling of the rare species and might need to be reconsidered when assessing diatom biodiversity.


2020, Vol 19, pp. 153303382092062
Author(s): Xinzhuo Wang, Raymond Miralbell, Odile Fargier-Bochaton, Shelley Bulling, Jean Paul Vallée, ...

Objective: Delineation of organs at risk is a time-consuming task. This study evaluates the benefits of using single-subject atlas-based automatic segmentation of organs at risk in patients with breast cancer treated in prone position, with 2 different criteria for choosing the atlas subject. Together with laterality (left/right), the criteria used were either (1) breast volume or (2) body mass index and breast cup size. Methods: An atlas supporting different selection criteria for automatic segmentation was generated from contours drawn by a senior radiation oncologist (RO_A). Atlas organs at risk included heart, left anterior descending artery, and right coronary artery. Manual contours drawn by RO_A and automatic segmentation contours of organs at risk and breast clinical target volume were created for 27 nonatlas patients. A second radiation oncologist (RO_B) manually contoured (M_B) the breast clinical target volume and the heart. Contouring times were recorded and the reliability of the automatic segmentation was assessed in the context of 3-D planning. Results: Accounting for body mass index and breast cup size improved automatic segmentation results compared to breast volume-based sampling, especially for the heart (mean similarity indexes >0.9 for automatic segmentation organs at risk and clinical target volume after RO_A editing). Mean similarity indexes for the left anterior descending artery and the right coronary artery edited by RO_A expanded by 1 cm were ≥0.8. Using automatic segmentation reduced contouring time by 40%. For each parameter analyzed (eg, D2%), the difference in dose, averaged over all patients, between automatic segmentation structures edited by RO_A and the same structure manually drawn by RO_A was <1.5% of the prescribed dose. The mean heart dose was reliable for the unedited heart segmentation, and for right-sided treatments, automatic segmentation was adequate for treatment planning with 3-D conformal tangential fields. 
Conclusions: Automatic segmentation for prone breast radiotherapy stratified by body mass index and breast cup size improved segmentation accuracy for the heart and coronary vessels compared to breast volume sampling. A significant reduction in contouring time can be achieved by using automatic segmentation.
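
The abstract does not state which overlap measure underlies its similarity index. Assuming a Dice-type coefficient, a common choice for comparing an automatic contour with a manual one, the calculation can be sketched as twice the shared volume divided by the sum of the two volumes. The function name and the toy voxel sets below are hypothetical and are not taken from the study.

```python
def dice(a, b):
    """Dice coefficient of two segmentations given as collections of voxel coordinates."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty structures are treated as identical (assumed convention)
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical 2-D voxel sets standing in for a manual and an automatic contour.
manual = {(0, 0), (0, 1), (1, 0), (1, 1)}
automatic = {(0, 1), (1, 1), (1, 2)}
overlap = dice(manual, automatic)
```

A value of 1 means the two structures coincide exactly and 0 means they share no voxels, so thresholds such as those quoted in the abstract indicate substantial overlap under this assumption.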


Diversity, 2019, Vol 11 (12), pp. 229
Author(s): Anouk E. van Breukelen, Harmen P. Doekes, Jack J. Windig, Kor Oldenbroek

In this study, we characterized genetic diversity in the gene bank for Dutch native cattle breeds. A total of 715 bulls from seven native breeds and a sample of 165 Holstein Friesian bulls were included. Genotype data were used to calculate genetic similarities. Based on these similarities, most breeds were clearly differentiated, except for two breeds (Deep Red and Improved Red and White) that have recently been derived from the MRY breed, and for the Dutch Friesian and Dutch Friesian Red, which have frequently exchanged bulls. Optimal contribution selection (OCS) was used to construct core sets of bulls with minimized similarity. The composition of the gene bank appeared to be partly optimized in the semen collection process, i.e., the mean similarity within breeds based on the current number of straws per bull was 0.32% to 1.49% lower than if each bull had contributed equally. Mean similarity could be further reduced within core sets by 0.34% to 2.79% using OCS. Material not needed for the core sets can be made available for supporting in situ populations and for research. Our findings provide insight into genetic diversity in Dutch cattle breeds and help to prioritize material in gene banking.


Genetika, 2019, Vol 51 (1), pp. 227-236
Author(s): Lawrence Akinro, Adenubi Adesoye, Taiye Fasola

Cola species constitute an important non-timber forest product. Besides its food value, Cola is rich in numerous phytochemicals, which makes it important both for its use in African traditional medicine and for its potential in industrial pharmacopoeia. Knowledge about genetic diversity is essential for conservation. In this paper, we report the genetic variability of Cola acuminata and C. nitida germplasm across the Cola-producing states (the rain forest and derived savannah zones) in Nigeria using Random Amplified Polymorphic DNA (RAPD) markers. Fifteen primers, which gave an average of 6.5 bands per primer, were selected for both species. C. acuminata exhibited a higher level of variation, with 71.5% of the detected markers being polymorphic (223 polymorphic alleles), whereas C. nitida presented 58.3% variation with 182 polymorphic alleles. Inter-population differentiation was measured as Jaccard's similarity coefficient. The mean similarity index amounted to 42.5% in C. acuminata and 46.7% in C. nitida, respectively. The results reveal the genetic structure of both species, and conservation strategies are suggested.


Author(s): Sajjad Ahmad, Rajvinder Kaur, Mark Lefsrud, Jaswinder Singh

Retrotransposon diversity has been extensively studied in monocots, but it is not well documented in dicot species. The transposition activity of transposons creates DNA polymorphism, and their abundance in genomes makes transposons a promising marker system for varietal identification and fingerprinting. In this study, four transposon-based markers (two DNA and two RNA transposons) were employed to evaluate the effectiveness of the Inter-Retrotransposon Amplified Polymorphism (IRAP) transposon system in assessing genetic diversity in pea germplasm accessions. A total of 28 alleles were detected across the 35 pea accessions, with the number of alleles per locus ranging from 5 (Mutator) to 9 (Cyclops). RNA transposons produced a higher number of polymorphic alleles (Ogre: 8, Cyclops: 9) than DNA transposon markers (Mutator: 5, MITE: 6). The overall mean PIC and D values for these transposon markers were 0.810 and 0.817, respectively. Genetic similarity values ranged from 0.143 to 0.823, with a mean similarity value of 0.403. Cluster analysis classified the pea genotypes into six major groups that were somewhat consistent with their geographical origins. The molecular analyses differentiated all 35 accessions and generated high PIC and D values, which can be useful for MAS-based breeding programs in pea.
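
The abstract does not say which definition of the polymorphism information content (PIC) was used. As a rough illustration, the sketch below uses one widely cited multi-allele form, usually attributed to Botstein et al., in which PIC is one minus the sum of squared allele frequencies minus a correction term over all allele pairs. The function name and the example frequencies are hypothetical and are not data from the study.

```python
from itertools import combinations


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    pair_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - pair_term


# Hypothetical allele frequencies at one locus (each list must sum to 1).
two_equal_alleles = pic([0.5, 0.5])
five_equal_alleles = pic([0.2] * 5)
```

Under this definition PIC rises with the number of alleles and with how evenly they are distributed, which is consistent with loci that carry more alleles being more informative markers.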

