discrimination power
Recently Published Documents


Total documents: 464 (five years: 217)

H-index: 30 (five years: 5)

2022 ◽  
Vol 12 (1) ◽  
Author(s):  
Metawee Srikummool ◽  
Suparat Srithawong ◽  
Kanha Muisuk ◽  
Sukrit Sangkhano ◽  
Chatmongkon Suwannapoom ◽  
...  

Abstract Southern Thailand is home to various populations; the Moklen, Moken and Urak Lawoi’ sea nomads and the Maniq negrito are minorities, while the southern Thai groups (Buddhist and Muslim) are the majority. Although previous studies have generated forensic STR datasets for the major groups, such data for the southern Thai minorities have not been included; here we generated a regional forensic database of southern Thailand. We newly genotyped 15 common autosomal STRs in 184 unrelated southern Thais, including all minority and majority groups. When combined with previously published data on the major southern Thai groups, this provides a total of 334 southern Thai samples. The forensic parameter results show appropriate values for personal identification and paternity testing; the probability of excluding paternity is 0.99999622, and the combined discrimination power is 0.999999999999999. Probably driven by genetic drift and/or isolation with small census size, we found genetic distinction of the Maniq and sea nomads from the major groups, which were closer to the Malay and central Thais than the other Thai groups. The allelic frequency results can strengthen the regional forensic database in southern Thailand and also provide useful information from an anthropological perspective.
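The combined discrimination power quoted above is conventionally obtained by combining per-locus values under the assumption that the loci are independent: it is one minus the product of the per-locus match probabilities. The sketch below illustrates that standard formula only; the function name and the three per-locus values are hypothetical and are not taken from the study.

```python
from math import prod


def combined_discrimination_power(per_locus_pd):
    """Combine per-locus power of discrimination (PD) values.

    Assumes the loci are statistically independent, so the chance that two
    unrelated people match at every locus is the product of (1 - PD_i).
    """
    return 1.0 - prod(1.0 - pd for pd in per_locus_pd)


# Hypothetical per-locus PD values for illustration, NOT data from the study.
example_pd = [0.90, 0.95, 0.92]
print(combined_discrimination_power(example_pd))
```

With 15 loci the result sits so close to 1 that it approaches the limits of double-precision arithmetic, so it can be more informative to report the product of match probabilities directly.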


BMC Genomics ◽  
2022 ◽  
Vol 23 (1) ◽  
Author(s):  
Mei Jiang ◽  
Shu-Fei Xu ◽  
Tai-Shan Tang ◽  
Li Miao ◽  
Bao-Zheng Luo ◽  
...  

Abstract Background Bioassessment and biomonitoring of meat products are aimed at identifying and quantifying adulterants and contaminants, such as meat from unexpected sources and microbes. Several methods for determining the biological composition of mixed samples have been used, including metabarcoding, metagenomics and mitochondrial metagenomics. In this study, we aimed to develop a method based on next-generation DNA sequencing to estimate the composition of samples that might contain meat from 15 mammalian and avian species that are commonly related to meat bioassessment and biomonitoring. Results In this project, we found that the meat composition from 15 species could not be identified with the metabarcoding approach because of the lack of universal primers or insufficient discrimination power. Consequently, we developed and evaluated a meat mitochondrial metagenomics (3MG) method. The 3MG method has four steps: (1) extraction of sequencing reads from mitochondrial genomes (mitogenomes); (2) assembly of mitogenomes; (3) mapping of mitochondrial reads to the assembled mitogenomes; and (4) biomass estimation based on the number of uniquely mapped reads. The method was implemented in a Python script called 3MG. The analysis of simulated datasets showed that the method can determine contaminant composition at a proportion of 2% and the relative error was < 5%. To evaluate the performance of 3MG, we constructed and analysed mixed samples derived from 15 animal species in equal mass. Then, we constructed and analysed mixed samples derived from two animal species (pork and chicken) in different ratios. DNAs were extracted and used in constructing 21 libraries for next-generation sequencing. The analysis of the 15 species mix with the method showed the successful identification of 12 of the 15 (80%) animal species tested. 
The analysis of the mixed samples of the two species revealed correlation coefficients of 0.98 for pork and 0.98 for chicken between the number of uniquely mapped reads and the mass proportion. Conclusion To the best of our knowledge, this study is the first to demonstrate the potential of the non-targeted 3MG method as a tool for accurately estimating biomass in meat mix samples. The method has potential broad applications in meat product safety.
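Step (4) of the method, biomass estimation from uniquely mapped reads, can be sketched as a simple proportion calculation. This is only an illustration under a strong assumption: that the share of uniquely mapped reads is used directly as the mass share. The actual 3MG script may normalise differently (for example by mitogenome length or copy number). The function names and read counts below are hypothetical.

```python
def estimate_proportions(unique_read_counts):
    """Turn per-species uniquely mapped read counts into estimated shares.

    Assumption (not confirmed by the abstract): read share is used directly
    as the biomass share, with no further normalisation.
    """
    total = sum(unique_read_counts.values())
    if total == 0:
        raise ValueError("no uniquely mapped reads")
    return {species: count / total for species, count in unique_read_counts.items()}


def relative_error(estimated, expected):
    """Relative error of an estimated proportion against the known proportion."""
    return abs(estimated - expected) / expected


# Hypothetical read counts for a two-species mix, NOT data from the study.
counts = {"pork": 9800, "chicken": 200}
shares = estimate_proportions(counts)
print(shares["chicken"], relative_error(shares["chicken"], 0.02))
```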


2022 ◽  
Author(s):  
Haoyu Wen ◽  
Hong-Jia Chen ◽  
Chien-Chih Chen ◽  
Massimo Pica Ciamarra ◽  
Siew Ann Cheong

Abstract. Geoelectric time series (TS) have long been studied for their potential for probabilistic earthquake forecasting, and a recent model (GEMSTIP) directly used the skewness and kurtosis of geoelectric TS to provide Times of Increased Probability (TIPs) for earthquakes several months into the future. We followed up on this work by applying the Hidden Markov Model (HMM) to the correlation, variance, skewness, and kurtosis TSs to identify two Hidden States (HSs) with different distributions of these statistical indexes. More importantly, we tested whether these HSs could separate time periods into times of higher/lower earthquake probabilities. Using 0.5-Hz geoelectric TS data from 20 stations across Taiwan over 7 years, we first computed the statistical index TSs, and then applied the Baum-Welch Algorithm with multiple random initializations to obtain a well-converged HMM and its HS TS for each station. We then divided the map of Taiwan into a 16-by-16 grid map and quantified the forecasting skill, i.e., how well the HS TS could separate times of higher/lower earthquake probabilities in each cell, in terms of a discrimination power measure that we defined. Next, we compared the discrimination power of the empirical HS TSs against those of 400 simulated HS TSs, then organized the statistical significance values obtained from these cell-level hypothesis tests of the forecasting skill into grid maps of discrimination reliability. Having found such significance values to be high for many grid cells for all stations, we proceeded with a statistical hypothesis test of the forecasting skill at the global level, and found high statistical significance across large parts of the hyperparameter spaces of most stations. We therefore concluded that geoelectric TSs indeed contain earthquake-related information, and that the HMM approach is capable of extracting this information for earthquake forecasting.
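The first processing step, turning a raw series into statistical index series, can be sketched as windowed moment calculations. This is a minimal illustration only: the window length, the step, the use of population moments and excess kurtosis, and the function names are all assumptions, and the correlation index, the HMM fitting and the authors' discrimination power measure are not reproduced here.

```python
from statistics import fmean


def window_moments(values):
    """Return (variance, skewness, excess kurtosis) of one window.

    Uses population (biased) central moments; this choice is an assumption.
    """
    mean = fmean(values)
    m2 = fmean([(v - mean) ** 2 for v in values])
    if m2 == 0:
        return 0.0, 0.0, 0.0  # constant window: higher moments are undefined
    m3 = fmean([(v - mean) ** 3 for v in values])
    m4 = fmean([(v - mean) ** 4 for v in values])
    return m2, m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def sliding_indexes(series, window, step):
    """Compute the index tuple for each window of a longer series."""
    return [
        window_moments(series[start:start + window])
        for start in range(0, len(series) - window + 1, step)
    ]


# Hypothetical toy series, NOT geoelectric data.
print(sliding_indexes([1, 2, 3, 4, 5, 4, 3, 2, 1], window=5, step=2))
```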


Author(s):  
Baida Hamdan ◽  
Davood Zabihzadeh

Similarity/distance measures play a key role in many machine learning, pattern recognition, and data mining algorithms, which has led to the emergence of the metric learning field. Many metric learning algorithms learn a global distance function from data that satisfies the constraints of the problem. However, in many real-world datasets, where the discrimination power of features varies across different regions of the input space, a global metric is often unable to capture the complexity of the task. To address this challenge, local metric learning methods have been proposed which learn multiple metrics across the different regions of the input space. The advantages of these methods include high flexibility and the ability to learn a nonlinear mapping, but these typically come at the expense of higher time requirements and overfitting problems. To overcome these challenges, this research presents an online multiple metric learning framework. Each metric in the proposed framework is composed of a global and a local component learned simultaneously. Adding a global component to a local metric efficiently reduces the problem of overfitting. The proposed framework is also scalable with both sample size and the dimension of input data. To the best of our knowledge, this is the first local online similarity/distance learning framework based on Passive/Aggressive (PA). In addition, for scalability with the dimension of input data, Dual Random Projection (DRP) is extended for local online learning in the present work. It enables our methods to run efficiently on high-dimensional datasets while maintaining their predictive performance. The proposed framework provides a straightforward local extension to any global online similarity/distance learning algorithm based on PA. Experimental results on some challenging datasets from the machine vision community confirm that the extended methods considerably enhance the performance of the related global ones without increasing the time complexity.
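The idea of a metric built from a global and a local component can be illustrated with a Mahalanobis-style distance. The abstract does not say how the two components are combined, so the additive form below is an assumption made for illustration, as are the function name and the example matrices; the PA updates and the DRP extension are not shown.

```python
def squared_distance(x, y, global_metric, local_metric):
    """Squared distance (x - y)^T (M_global + M_local) (x - y).

    The additive combination of the two matrices is an assumption; the
    source only states that each metric has a global and a local component.
    """
    diff = [a - b for a, b in zip(x, y)]
    dim = len(diff)
    combined = [
        [global_metric[i][j] + local_metric[i][j] for j in range(dim)]
        for i in range(dim)
    ]
    return sum(
        diff[i] * combined[i][j] * diff[j]
        for i in range(dim)
        for j in range(dim)
    )


# Hypothetical matrices: a shared identity metric plus a local component that
# gives extra weight to the first feature in one region of the input space.
identity = [[1.0, 0.0], [0.0, 1.0]]
local = [[1.0, 0.0], [0.0, 0.0]]
print(squared_distance([1.0, 2.0], [0.0, 0.0], identity, local))
```

A zero local matrix reduces this to the global metric alone, which is one way to see how a global component can act as a fallback where a region has little data.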


2021 ◽  
Vol 21 (4) ◽  
pp. 1-31
Author(s):  
Yolanda A. Rankin ◽  
Jakita O. Thomas ◽  
Sheena Erete

Despite the increasing number of women receiving bachelor’s degrees in computing (i.e., Computer Science, Computer Engineering, Information Technology, etc.), a closer look reveals that the percentage of Black women in computing has significantly dropped in recent years, highlighting the underrepresentation of Black women and its negative impact on broadening participation in the field of computing. The literature reveals that several K-16 interventions have been designed to increase the representation of Black women and girls in computing. Despite these best efforts, the needle seems to have barely moved in increasing the representation or the retention of Black women in computing. Instead, the primary goals have been to recruit and retain women in the CS pipeline using gender-focused efforts intended to increase the number of women who also identify as members of racialized groups. However, these gender-focused efforts have fallen short of increasing the number of Black women in computing because they fail to acknowledge or appreciate how intersectionality (the overlapping social constructs of gender, race, ethnicity, class, etc.) has shaped the lived experiences of Black women navigating the computing pipeline. Without honest dialogue about how power operates in the field of computing, the push for racial equality and social justice in CS education remains an elusive goal. Leveraging intersectionality as a critical framework to address systemic oppression (i.e., racism, gender discrimination, power, and privilege), we interview 24 Black women in different phases of the computing pipeline about their experiences navigating the field of computing. An intersectional analysis of Black women’s experiences reveals that CS education consists of saturated sites of violence in which interconnected systems of power converge to enact oppression. 
Findings reveal three primary saturated sites of violence within CS education: (1) traditional K-12 classrooms; (2) predominantly White institutions; and (3) internships as supplementary learning experiences. We conclude the article with implications for how the field of CS education can begin to address racial inequality that negatively impacts Black girls and women, thus contributing to a more equitable and socially just field of study that benefits all students.


2021 ◽  
Vol 14 (1) ◽  
pp. 130
Author(s):  
Sunghyon Kyeong ◽  
Daehee Kim ◽  
Jinho Shin

The credit scoring model is one of the most important decision-making tools for the sustainability of banking systems. This study is the first to examine whether it can be improved by using system log data that are stored extensively for system operation. We used the log data recorded by the mobile application system of KakaoBank, a leading internet bank used by more than 14 million people in Korea. After generating candidate variables from KakaoBank’s log data, we created a credit scoring model by utilizing variables with high information values and logistic regression, the most common method for developing credit scoring models in financial institutions. To prove our hypothesis on the improvement of credit scoring model performance, we performed an independent sample t-test using the simulation results of repeated model development and performance measurement based on randomly sampled data. Consequently, the discrimination power of the proposed model using logistic regression (neural network) compared to the credit bureau-based model significantly improved by 1.84 (2.22) percentage points based on the Kolmogorov–Smirnov statistics. The results of this study suggest that a bank can utilize the accumulated log data inside the bank to improve decision-making systems, including credit scoring, at a low cost.
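In credit scoring, the Kolmogorov–Smirnov (KS) statistic is commonly taken as the largest gap between the cumulative score distributions of good and bad borrowers. The sketch below shows that two-sample calculation in plain Python; the function name and the scores are hypothetical and do not come from the study.

```python
def ks_statistic(good_scores, bad_scores):
    """Largest absolute gap between the two empirical score distributions."""
    thresholds = sorted(set(good_scores) | set(bad_scores))
    largest_gap = 0.0
    for threshold in thresholds:
        # Fraction of each group scoring at or below the threshold.
        cdf_good = sum(s <= threshold for s in good_scores) / len(good_scores)
        cdf_bad = sum(s <= threshold for s in bad_scores) / len(bad_scores)
        largest_gap = max(largest_gap, abs(cdf_good - cdf_bad))
    return largest_gap


# Hypothetical model scores (higher means more creditworthy), NOT study data.
good = [0.6, 0.7, 0.8, 0.9]
bad = [0.2, 0.3, 0.65]
print(100 * ks_statistic(good, bad))  # expressed in percentage points
```

Comparing two models then amounts to computing this value for each on the same sample and taking the difference in percentage points.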


2021 ◽  
pp. 1-18
Author(s):  
Gang Wang ◽  
Wenju Zhou ◽  
Deping Kong ◽  
Zongshuai Qu ◽  
Maowen Ba ◽  
...  

Background: A univariate neurodegeneration biomarker (UNB) based on MRI with strong statistical discrimination power would be highly desirable for studying hippocampal surface morphological changes associated with APOE ɛ4 genetic risk for AD in the cognitively unimpaired (CU) population. However, existing UNB work either fails to model large group variances or does not capture AD-induced changes. Objective: We proposed a subspace decomposition method capable of exploiting a UNB to represent the hippocampal morphological changes related to the APOE ɛ4 dose effects among the longitudinal APOE ɛ4 homozygotes (HM, N = 30), heterozygotes (HT, N = 49) and non-carriers (NC, N = 61). Methods: A rank minimization mechanism combined with a sparse constraint that considers the local continuity of the hippocampal atrophy regions is used to extract group common structures. Based on the group common structures of amyloid-β (Aβ) positive AD patients and Aβ negative CU subjects, we identified the regions-of-interest (ROI), which reflect significant morphometry changes caused by AD development. A univariate morphometry index (UMI) is then constructed from these ROIs. Results: The proposed UMI demonstrates more substantial statistical discrimination power to distinguish the longitudinal groups with different APOE ɛ4 genotypes than the hippocampal volume measurements. In addition, the APOE ɛ4 allele load affects the shrinkage rate of the hippocampus, i.e., the HM genotype causes the largest atrophy rate, followed by HT, and the smallest is NC. Conclusion: The UMIs may capture the APOE ɛ4 risk allele-induced brain morphometry abnormalities and reveal the dose effects of APOE ɛ4 on the hippocampal morphology in cognitively normal individuals.


Equilibrium ◽  
2021 ◽  
Vol 16 (4) ◽  
pp. 859-883
Author(s):  
Michal Karas ◽  
Mária Režňáková

Research background: SMEs face financial constraints in their development, which limits their access to external funds, tightens their investment possibilities, and limits their growth. Much research effort has been devoted to understanding the nature and sources of this phenomenon. In sharp contrast to this, very little has been said about the role of these factors in explaining the default probability of these types of enterprises. Understanding such interrelationships could help to adopt policies to alleviate the situation of constrained SMEs and lower their default rates. Purpose of the article: This study analyses the role of financial constraint factors in SME defaults. This is done by utilising the financial constraint factors in a newly derived default prediction model. A comparison of the derived model and other SME default prediction models is carried out to assess the potential of financial constraints in the discrimination power of the model. Methods: In this study, we use the Cox semiparametric model, while leaving the baseline hazard rate unspecified and employing macroeconomic variables as explanatory variables. The discrimination power was addressed in terms of the area under the curve (AUC), resulting in out-of-sample testing. The DeLong test was used to compare the AUC of the created and analysed models. The model was estimated on a set of over 213,731 SMEs from 28 counties, covering the period 2014–2019. Findings & value added: It was found that adopting the financial constraint measures can explain the default of small and medium enterprises with high accuracy; however, they do not explain the default of micro enterprises.
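The AUC used here as the measure of discrimination power has a direct interpretation: it is the probability that a randomly chosen defaulting firm receives a higher risk score than a randomly chosen non-defaulting firm, with ties counted as half. The sketch below computes it by pairwise comparison; the function name and scores are hypothetical, and neither the Cox model nor the DeLong test is reproduced.

```python
def auc(default_scores, non_default_scores):
    """AUC as the share of (default, non-default) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in sample size, so this is
    meant for illustration rather than for large datasets.
    """
    correct = 0.0
    for d in default_scores:
        for n in non_default_scores:
            if d > n:
                correct += 1.0
            elif d == n:
                correct += 0.5
    return correct / (len(default_scores) * len(non_default_scores))


# Hypothetical risk scores (higher means riskier), NOT data from the study.
defaults = [0.9, 0.6, 0.4]
non_defaults = [0.5, 0.3, 0.2, 0.1]
print(auc(defaults, non_defaults))
```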


2021 ◽  
Vol 11 ◽  
Author(s):  
Tomas Bertok ◽  
Aniko Bertokova ◽  
Eduard Jane ◽  
Michal Hires ◽  
Juvissan Aguedo ◽  
...  

Colorectal cancer (CRC) is one of the most common types of cancer among men and women worldwide. Efforts are currently underway to find novel and more cancer-specific biomarkers that could be detected in a non-invasive way. The analysis of aberrant glycosylation of serum glycoproteins is a way to discover novel diagnostic and prognostic CRC biomarkers. The present study investigated a whole-serum glycome with a panel of 16 different lectins in search of age-independent and CRC-specific glycomarkers using receiver operating characteristic (ROC) curve analyses and glycan heat matrices. Glycosylation changes present in the whole serum were identified, which could lead to the discovery of novel biomarkers for CRC diagnostics. In particular, the change in the bisecting glycans (recognized by Phaseolus vulgaris erythroagglutinin) had the highest discrimination potential for CRC diagnostics in combination with human L selectin, providing an area under the ROC curve (AUC) of 0.989 (95% CI 0.950–1.000), specificity of 1.000, sensitivity of 0.900, and accuracy of 0.960. We also implemented novel tools for identification of lectins with strong discrimination power.
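Sensitivity, specificity and accuracy are all derived from the same four confusion-matrix counts at a chosen cut-off. The sketch below shows those standard definitions; the function name and the counts are hypothetical illustrations and are not the study's patient numbers.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic metrics from confusion-matrix counts.

    tp/fn: patients correctly/incorrectly classified;
    tn/fp: controls correctly/incorrectly classified.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts for illustration, NOT data from the study.
print(diagnostic_metrics(tp=18, fn=2, tn=27, fp=3))
```

Accuracy depends on the ratio of patients to controls in the sample, which is why it is usually reported alongside sensitivity and specificity rather than instead of them.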


Author(s):  
Md Babul Akter ◽  
Azad Mosab-Bin ◽  
Mohammad Kamruzzaman ◽  
Reflinur Reflinur ◽  
Nazmun Nahar ◽  
...  

Rice is one of the frontline cereals in the world and the major cultivated crop in Bangladesh. A total of eleven simple sequence repeat (SSR) and thirteen sequence-tagged site (STS) markers were used to characterize twenty-four rice cultivars in Bangladesh. The twenty-four markers generated 60 alleles, an average of 2.5 alleles per locus. The average polymorphism information content (PIC) value was 0.40, while the mean values of heterozygosity, gene diversity, and major allele frequency were 0.10, 0.48 and 0.62, respectively. The SSR markers showed more specificity and a higher discrimination power than the STS markers. The cluster analysis displayed four major clusters with a genetic similarity coefficient value of 0.73. The morphological analyses of the grain identified that Binadhan-20 and BRRI dhan34 had the longest and the shortest seeds, respectively, with a variable correlation between the seed length, width and length/width ratio. The phenol reaction test distinguished seven cultivars as japonica and seventeen cultivars as indica or an intermediate type. All these results regarding the phenotypic data and marker information will be useful for parental selection in modern rice breeding programmes.
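The per-marker statistics reported above follow from allele frequencies. As a sketch, gene diversity is commonly computed as one minus the sum of squared allele frequencies, and PIC subtracts a further term for pairs of alleles. Whether the authors used exactly these estimators is an assumption; the function names and frequencies below are hypothetical.

```python
from itertools import combinations


def gene_diversity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content.

    PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    """
    pair_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return gene_diversity(freqs) - pair_term


# Hypothetical allele frequencies at one locus, NOT data from the study.
allele_freqs = [0.5, 0.5]
print(gene_diversity(allele_freqs), pic(allele_freqs))
```

PIC is never larger than gene diversity for the same locus, which is consistent with the reported averages of 0.40 and 0.48.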

