Identifying the Primary Odor Perception Descriptors by Multi-Output Linear Regression Models

2021 ◽  
Vol 11 (8) ◽  
pp. 3320
Author(s):  
Xin Li ◽  
Dehan Luo ◽  
Yu Cheng ◽  
Kin-Yeung Wong ◽  
Kevin Hung

Semantic odor perception descriptors, such as “sweet”, are widely used for product quality assessment in the food, beverage, and fragrance industries to profile odor perceptions. The current literature focuses on developing as many odor perception descriptors as possible. A large number of odor descriptors poses challenges for odor sensory assessment. In this paper, we propose the task of narrowing down the number of odor perception descriptors. To this end, we devise a novel selection mechanism based on machine learning to identify the primary odor perceptual descriptors (POPDs). The perceptual ratings of non-primary odor perception descriptors (NPOPDs) can be predicted precisely from those of the POPDs. Therefore, the NPOPDs are redundant and can be removed from the odor vocabulary. The experimental results indicate that dozens of odor perceptual descriptors are redundant. It is also observed that the sparsity of the data is negatively correlated with model performance, while the Pearson correlation between odor perceptions plays an active role. Reducing the odor vocabulary size could simplify odor sensory assessment and helps in understanding the human odor perceptual space.
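
The core idea of the abstract, predicting NPOPD ratings from POPD ratings with a multi-output linear model and scoring each descriptor by Pearson correlation, can be sketched as follows. This is a minimal illustration, not the authors' selection mechanism: the data are synthetic, and the function names (`fit_multi_output_linear`, `predict`, `column_pearson`) and the array sizes are assumptions made for the example.

```python
import numpy as np

def fit_multi_output_linear(X, Y):
    """Least-squares fit of Y ~ [X, 1] @ W for all output columns at once."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])  # append an intercept column
    W, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return W

def predict(W, X):
    """Apply the fitted weights (including the intercept row) to new inputs."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    return X1 @ W

def column_pearson(Y_true, Y_pred):
    """Pearson correlation between true and predicted ratings, per descriptor."""
    return np.array([np.corrcoef(Y_true[:, j], Y_pred[:, j])[0, 1]
                     for j in range(Y_true.shape[1])])

# Synthetic stand-in data (hypothetical): 200 odorants rated on
# 5 "primary" descriptors and 3 "non-primary" descriptors.
rng = np.random.default_rng(0)
popd = rng.uniform(0, 100, size=(200, 5))
npopd = popd @ rng.normal(size=(5, 3)) + rng.normal(scale=1.0, size=(200, 3))

W = fit_multi_output_linear(popd[:150], npopd[:150])    # train on 150 odorants
r = column_pearson(npopd[150:], predict(W, popd[150:]))  # evaluate on the rest
print(r)
```

A descriptor whose held-out correlation is high under such a model is a candidate for being treated as redundant; how the threshold is chosen is left open here.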

2019 ◽  
Author(s):  
Chin Lin ◽  
Yu-Sheng Lou ◽  
Chia-Cheng Lee ◽  
Chia-Jung Hsu ◽  
Ding-Chung Wu ◽  
...  

BACKGROUND An artificial intelligence-based algorithm has shown a powerful ability for coding the International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) in discharge notes. However, its performance still requires improvement compared with human experts. The major disadvantage of the previous algorithm is its lack of understanding of medical terminology. OBJECTIVE We propose some methods based on the human learning process and conduct a series of experiments to validate their improvements. METHODS We compared two data sources for training the word-embedding model: English Wikipedia and PubMed journal abstracts. Moreover, fixed, changeable, and double-channel embedding tables were used to test their performance. Some additional tricks were also applied to improve accuracy. We used these methods to identify the three-chapter-level ICD-10-CM diagnosis codes in a set of discharge notes. Subsequently, 94,483 labeled discharge notes from June 1, 2015 to June 30, 2017 from the Tri-Service General Hospital in Taipei, Taiwan, were used. To evaluate performance, 24,762 discharge notes from July 1, 2017 to December 31, 2017, from the same hospital were used. Moreover, 74,324 additional discharge notes collected from seven other hospitals were also tested. The F-measure is the major global measure of effectiveness. RESULTS In understanding medical terminologies, the PubMed-embedding model (Pearson correlation = 0.60/0.57) shows better performance than the Wikipedia-embedding model (Pearson correlation = 0.35/0.31). In the accuracy of ICD-10-CM coding, the changeable model using both the PubMed- and Wikipedia-embedding models has the highest testing mean F-measure (0.7311 and 0.6639 in Tri-Service General Hospital and the seven other hospitals, respectively). Moreover, a proposed hybrid sampling method, an augmentation trick to avoid algorithms identifying negative terms, was found to further improve model performance. CONCLUSIONS The proposed model architecture and training method is named ICD10Net, which is the first expert-level model practically applied to daily work. This model can also be applied to unstructured information extraction from free-text medical writing. We have developed a web app to demonstrate our work (https://linchin.ndmctsgh.edu.tw/app/ICD10/).
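
Since the abstract reports a mean F-measure, a small sketch of how such a figure can be computed may help. The abstract does not say how the mean is taken, so averaging the per-code F-measure across codes (a macro average) is an assumption here, as are the function names and the toy labels.

```python
def f_measure(y_true, y_pred):
    """F-measure (harmonic mean of precision and recall) for one binary label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def mean_f_measure(true_by_code, pred_by_code):
    """Average the per-code F-measure over all codes (assumed macro average)."""
    scores = [f_measure(true_by_code[c], pred_by_code[c]) for c in true_by_code]
    return sum(scores) / len(scores)

# Toy example with two hypothetical code labels over four notes.
truth = {"code_A": [1, 1, 0, 0], "code_B": [0, 1, 1, 1]}
preds = {"code_A": [1, 0, 0, 1], "code_B": [0, 1, 1, 0]}
score = mean_f_measure(truth, preds)
print(score)
```

In the toy data, `code_A` has precision and recall of 0.5 each (F = 0.5) and `code_B` has precision 1 and recall 2/3 (F = 0.8), so the mean is 0.65.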


2021 ◽  
Vol 5 (1) ◽  
Author(s):  
Osman Mamun ◽  
Madison Wenzlick ◽  
Arun Sathanur ◽  
Jeffrey Hawk ◽  
Ram Devanathan

Abstract The Larson–Miller parameter (LMP) offers an efficient and fast scheme to estimate the creep rupture life of alloy materials for high-temperature applications; however, poor generalizability and dependence on the constant C often result in sub-optimal performance. In this work, we show that the direct rupture life parameterization without intermediate LMP parameterization, using a gradient boosting algorithm, can be used to train ML models for very accurate prediction of rupture life in a variety of alloys (Pearson correlation coefficient >0.9 for 9–12% Cr and >0.8 for austenitic stainless steels). In addition, the Shapley value was used to quantify feature importance, making the model interpretable by identifying the effect of various features on the model performance. Finally, a variational autoencoder-based generative model was built by conditioning on the experimental dataset to sample hypothetical synthetic candidate alloys from the learnt joint distribution not existing in both 9–12% Cr ferritic–martensitic alloys and austenitic stainless steel datasets.
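
For context on the baseline the abstract argues against, the Larson–Miller parameter is commonly written as LMP = T (C + log10 t_r), with T the absolute temperature in kelvin, t_r the rupture time in hours, and C a material constant often taken near 20. The sketch below uses that textbook form; the default C = 20 and the example temperatures are illustrative assumptions, not values from the paper.

```python
import math

def larson_miller(temp_kelvin, rupture_hours, C=20.0):
    """LMP = T * (C + log10(t_r)); C = 20 is a common default, assumed here."""
    return temp_kelvin * (C + math.log10(rupture_hours))

def rupture_life_from_lmp(lmp, temp_kelvin, C=20.0):
    """Invert the LMP relation to estimate rupture life (hours) at a temperature."""
    return 10 ** (lmp / temp_kelvin - C)

# Hypothetical test point: rupture after 10,000 h at 873.15 K (600 degrees C).
lmp = larson_miller(873.15, 10_000)
# Estimate the life at 923.15 K, assuming the same LMP applies at the same stress.
life_at_650C = rupture_life_from_lmp(lmp, 923.15)
print(lmp, life_at_650C)
```

The estimate moves with the chosen C, which is the dependence the abstract identifies as a weakness of the LMP route.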


2021 ◽  
Author(s):  
Mekonnen Bogale Abegaz ◽  
Kenenisa Lemi Debela ◽  
Reta Megersa Hundie

Abstract The purpose of this study is to analyze the effect of governance indicators on entrepreneurship. An explanatory research design with Pearson correlation and multiple linear regression models was applied. Five-year World Bank data (2014–2018) of 126 countries from all economic development levels were used. The worldwide governance indicators considered are voice and accountability, political stability, government effectiveness, regulatory quality, rule of law, and corruption control. Gross net income was taken as a control variable. To measure entrepreneurship, the number of formally registered limited liability businesses as a percentage of the working-age population was used. To bring the highly skewed time series data of the dependent variable (entrepreneurship) closer to normal, a logarithmic transformation was applied, and heteroscedasticity of residuals was checked. The Pearson correlation results show that there are strong and significant correlations (r > 0.466, p < 0.01) between the predictors and the outcome variable and among the predictor variables. Regression analysis was computed after two highly collinear variables were dropped from the model using the VIF test. The study found that the remaining four independent variables and the control variable predict 71.5% of the variance in the outcome variable. Except for voice and accountability, all predictors have their own statistically significant influence on entrepreneurship. Thus, working on each predictor up to the standard application can bring incremental changes in new business formation and entry. The researchers believe that this study is of significant interest to policymakers, program developers, entrepreneurs, analysts, and supporters since it provides useful insight into how governance indicators influence entrepreneurship.
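
The VIF test mentioned above can be illustrated with a short sketch. The variance inflation factor of predictor j is 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors. The data below are synthetic and the function name `vif` is an assumption; the study's own variables and cut-off are not reproduced here.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of the predictor matrix X."""
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.hstack([others, np.ones((n, 1))])  # other predictors plus intercept
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - resid.var() / y.var()          # R^2 of predictor j on the rest
        out[j] = 1.0 / (1.0 - r2)
    return out

# Synthetic predictors (hypothetical): x3 is almost a copy of x1, x2 is independent.
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = rng.normal(size=500)
x3 = x1 + rng.normal(scale=0.1, size=500)
v = vif(np.column_stack([x1, x2, x3]))
print(v)
```

Here the first and third columns receive large VIF values because each is nearly determined by the other, while the independent second column stays close to 1.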


2015 ◽  
Vol 19 (7) ◽  
pp. 1195-1199 ◽  
Author(s):  
Kotsedi Daniel Monyeki ◽  
Michael Matome Sekhotha

Abstract Objective: Height is required for the assessment of growth and nutritional status, as well as for predictions and standardization of physiological parameters. To determine whether arm span, mid-upper arm and waist circumferences and sum of four skinfolds can be used to predict height, the relationships between these anthropometric variables were assessed among Ellisras rural children aged 8–18 years. Design: The following parameters were measured according to the International Society for the Advancement of Kinanthropometry: height, arm span, mid-upper arm circumference, waist circumference and four skinfolds (suprailiac, subscapular, triceps and biceps). Associations between the variables were assessed using Pearson correlation coefficients and linear regression models. Setting: Ellisras Longitudinal Study (ELS), Limpopo Province, South Africa. Subjects: Boys (n = 911) and girls (n = 858) aged 8–18 years. Results: Mean height was higher than arm span, with differences ranging from 4 cm to 11·5 cm between boys and girls. The correlation between height and arm span was high (ranging from 0·74 to 0·91) with P<0·001. The correlation between height and mid-upper arm circumference, waist circumference and sum of four skinfolds was low (ranging from 0·15 to 0·47) with P<0·00 among girls in the 15–18 years age group. Conclusions: Arm span was found to be a good predictor of height. The sum of four skinfolds was significantly associated with height in the older age groups for girls, while waist circumference showed a negative significant association in the same groups.


2012 ◽  
Vol 51 (2) ◽  
pp. 185-190 ◽  
Author(s):  
Alex J. Cannon

Abstract Regression-guided clustering is introduced as a means of constructing circulation-to-environment synoptic climatological classifications. Rather than applying an unsupervised clustering algorithm to synoptic-scale atmospheric circulation data, one instead augments the atmospheric circulation dataset with predictions from a supervised regression model linking circulation to environment. The combined dataset is then entered into the clustering algorithm. The level of influence of the environmental dataset can be controlled by a simple weighting factor. The method is generic in that the choice of regression model and clustering algorithm is left to the user. Examples are given using standard multivariate linear regression models and the k-means clustering algorithm, both established methods in synoptic climatology. Results for southern British Columbia, Canada, indicate that model performance can be made to range between that of a fully unsupervised algorithm and a fully supervised algorithm.
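
The procedure described in this abstract (fit a regression from circulation to environment, append the weighted predictions to the circulation data, then cluster the combined data) can be sketched as below. This is one possible reading under stated assumptions: the abstract does not say how the weighting factor is applied, so multiplying standardized predictions by `weight` is a guess, the k-means routine is a plain Lloyd's iteration, and all data and names are synthetic or hypothetical.

```python
import numpy as np

def standardize(M):
    """Scale each column to zero mean and unit variance."""
    return (M - M.mean(axis=0)) / M.std(axis=0)

def kmeans(Z, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of Z."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):  # leave an empty cluster's center unchanged
                centers[c] = Z[labels == c].mean(axis=0)
    return labels

def regression_guided_clusters(X, Y, k, weight):
    """Cluster circulation data X augmented with weighted regression predictions of Y."""
    Xs = standardize(X)
    A = np.hstack([Xs, np.ones((len(Xs), 1))])      # predictors plus intercept
    B, *_ = np.linalg.lstsq(A, Y, rcond=None)       # multivariate linear regression
    Y_hat = A @ B                                   # predicted environment
    Z = np.hstack([Xs, weight * standardize(Y_hat)])
    return kmeans(Z, k)

# Synthetic stand-ins (hypothetical): 300 days, 6 circulation features, 2 environment variables.
rng = np.random.default_rng(1)
circulation = rng.normal(size=(300, 6))
environment = (circulation[:, :2] @ rng.normal(size=(2, 2))
               + rng.normal(scale=0.5, size=(300, 2)))
labels = regression_guided_clusters(circulation, environment, k=4, weight=2.0)
print(np.bincount(labels, minlength=4))
```

With `weight=0` the added columns are all zero and the clustering depends on the circulation data alone; larger weights let the predicted environment dominate, which mirrors the unsupervised-to-supervised range the abstract describes.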


2019 ◽  
Author(s):  
Debem Henry ◽  
Aminu Yakubu ◽  
Mukhtar Ahmed ◽  
Gwamna Jerry ◽  
Dalhatu Ibrahim

Abstract Nigeria relies on data from periodic resource-intensive surveys such as antenatal HIV seroprevalence sentinel surveys (ANC-HSS) and population-based National AIDS and Reproductive Health Surveys (NARHS) for its HIV control efforts. Nigeria has not explored the use of readily available routine programmatic data (RPD) to easily inform and monitor epidemic control efforts at local settings in near real time. This study aimed to determine the utility of RPDs (Prevention of Mother-To-Child Transmission [PMTCT] and HIV Testing and Counseling [HTC]) as a proxy for monitoring the HIV epidemic in Nigeria. Using the World Health Organization 12-step triangulation procedures, we compared state-level seropositivity data from PMTCT and HTC programs to HIV prevalence data from NARHS and ANC-HSS reports in relevant pairs from 2010 to 2014 in Nigeria. The study population was pregnant women and the general population. We abstracted relevant data from the PEPFAR Nigeria data source and published national survey reports. We compared visual (scatterplots and maps) patterns and trends, and performed Pearson correlation and univariate linear regression models of the estimates for best matched/contiguous years for which data were available. Correlation between PMTCT2014 and ANC-HSS2014 was positive and significant (R=0.7, p<0.001). ANC-HSS2014 and HTC2014 were slightly correlated (R=0.4, p<0.05). Significant correlation was observed between ANC-HSS2010 and PMTCT2013 (R=0.8, p<0.001) and between ANC-HSS2010 and HTC2013 (R=0.6, p<0.001). All RPD sources and ANC-HSS indicated a decreasing trend in national HIV prevalence in Nigeria. PMTCT2014 data showed strong capability of predicting HIV prevalence in ANC-HSS2014 in the regression model (B=2.09, p<0.0001). Use of routine PMTCT data in monitoring HIV prevalence among women of reproductive age could be more valid and reliable in local settings than the use of HTC data. Use of RPD to monitor the national and sub-national-level HIV epidemic in between national surveys in Nigeria could maximize program resources and promote more responsive and efficient actions toward epidemic control.


2019 ◽  
Author(s):  
Kaifeng Ding ◽  
Xiaoyuan Wang ◽  
Dmitry Rinberg ◽  
Terry Acree

There is evidence in mice and honeybees that signals initiated by odorants at the olfactory epithelium arrive downstream in the olfactory bulb between 10 and 200 ms later and that these latencies are ligand dependent. It has recently been proposed that these latencies could be used by mice to identify or classify odorants. Here we demonstrate that humans are sensitive to the timing of individual odorant presentations. In a two-alternative forced choice (2AFC) paradigm, subjects chose which odorant they recognized first after they experienced two 70 ms puffs separated in time by some interval in the range of -450 ms to +450 ms. All subject recognition probabilities yielded the same linear function of latency (p<0.05) even though they differed in their recognition thresholds for the components and their recognition probability to detect them in binary mixtures. These results indicate that the temporal structure of odor delivery affects human odor perception and that sniff olfactometry (SO) has the temporal resolution necessary to measure these effects.


2018 ◽  
Author(s):  
Ji Hyun Bak ◽  
Seogjoo Jang ◽  
Changbong Hyeon

Binding of odorants to olfactory receptors (ORs) elicits downstream chemical and neural signals, which are further processed into odor perception in the brain. Recently, Mainland et al. [Sci. data, (2015) 2:sdata20152] have measured ≳ 500 pairs of odorant-OR interactions by a high-throughput screening assay method, opening a new avenue to understanding the principles of human odor coding. Here, using a recently developed minimal model for OR activation kinetics [J. Phys. Chem. B (2017) 121, 1304–1311], we characterize the statistics of OR activation by odorants in terms of three empirical parameters: the half-maximum effective concentration EC50, the efficacy, and the basal activity. While the data size of odorants is still limited, the statistics offer meaningful information on the breadth and optimality of the tuning of human ORs to odorants, and allow us to relate the three parameters to the microscopic rate constants and binding affinities that define the OR activation kinetics. Despite the stochastic nature of the response expected at the individual OR-odorant level, we assess that the confluence of signals in a neuron released from the multitude of ORs is effectively free of noise and deterministic with respect to changes in odorant concentration. Thus, setting a threshold on the fraction of activated OR copy number for neural spiking binarizes the electrophysiological signal of the olfactory sensory neuron, thereby making an information theoretic approach a viable tool in studying the principles of odor perception.


2020 ◽  
pp. 026921552094759
Author(s):  
Javier Aceituno-Gómez ◽  
Juan Avendaño-Coy ◽  
Juan José Criado-Álvarez ◽  
Gerardo Ávila-Martín ◽  
Ana Cecilia Marín-Guerrero ◽  
...  

Objective: To compare the correlation of the Visual Analog Scale with the pain subsections of the Shoulder Pain and Disability Index and the Constant-Murley Score in subacromial pain syndrome patients. Design: Single cross-sectional analysis. Setting: Hospital Rehabilitation Department. Methods: The assessment tools were applied at baseline. Correlations between the Visual Analog Scale and the Shoulder Pain and Disability Index and Constant-Murley Score pain subsections were assessed by the Pearson correlation coefficient. Linear regression models were calculated between scales. Statistical significance was set at two-sided p < 0.05. Results: Forty-three patients were included. Pearson’s correlation was r = 0.61 (p < 0.001) between Visual Analog Scale and Shoulder Pain and Disability Index-pain, and r = −0.74 (p < 0.001) between Visual Analog Scale and Constant-Murley Score-pain. The coefficient of determination was r2 = 0.37 for Visual Analog Scale versus Shoulder Pain and Disability Index-pain and r2 = 0.54 for Visual Analog Scale versus Constant-Murley Score-pain. Conclusions: Visual Analog Scale showed better correlation with Constant-Murley Score-pain than with Shoulder Pain and Disability Index-pain in subacromial pain syndrome patients.
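
The link between the two statistics reported above is that, for a simple linear regression with one predictor, the coefficient of determination equals the square of the Pearson correlation. Squaring the rounded values gives 0.61² ≈ 0.37 and 0.74² ≈ 0.55, close to the reported 0.37 and 0.54 (the small gap is presumably due to rounding of r). The sketch below checks the identity on synthetic data; the sample of 43 points only mirrors the study's sample size, and the scores are invented for illustration.

```python
import numpy as np

def r_and_r2(x, y):
    """Return Pearson r and the R^2 of the least-squares line y ~ a*x + b."""
    r = np.corrcoef(x, y)[0, 1]
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    r2 = 1.0 - resid.var() / y.var()
    return r, r2

# Synthetic scores (hypothetical), negatively related like the VAS and Constant-Murley pain pair.
rng = np.random.default_rng(0)
x = rng.normal(size=43)
y = -0.7 * x + rng.normal(scale=0.7, size=43)
r, r2 = r_and_r2(x, y)
print(r, r2, r ** 2)
```

The sign of r is lost in r2, which is why the negative correlation with the Constant-Murley pain subsection still yields the larger coefficient of determination.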


2009 ◽  
Vol 27 (15_suppl) ◽  
pp. e19101-e19101
Author(s):  
S. J. Ayirookuzhi ◽  
J. McLarty ◽  
R. Mansour ◽  
G. M. Mills

e19101 Background: Carboplatin is a commonly used drug in stage III/IV non-small cell lung cancer (NSCLC). Its dose is typically calculated with the Calvert equation, which uses the glomerular filtration rate (GFR) from various predictive formulae such as MDRD, Cockcroft-Gault, Jelliffe and Wright as well as the targeted area under the curve (AUC) for carboplatin. Myelosuppression is a common toxicity of carboplatin, and this study aimed to assess the relationship between dosing and toxicity as well as whether there would be any significant difference in dosing based on calculation of GFR from the above formulae in our patient population. Methods: Data from patients with Stage III and IV NSCLC seen between 1/1/99 and 12/31/2007 were analyzed. Patients who received concurrent radiation, who died before the first cycle was completed, as well as patients with missing lab values were excluded. Only the first cycle of the carboplatin-based regime was used for analysis, with nadir platelets, hemoglobin and wbc's used as endpoints. SPSS software was used for statistical analysis including Pearson correlation, ANCOVA, independent t-tests as well as multivariate linear regression models. Results: Of the 216 patients initially abstracted for analysis, only 132 patients were analyzable. Demographically, 71 were Caucasian and the rest were African-American, while 92 were male. The carboplatin doses calculated from all four formulae were highly correlated (p < 0.0001). The drops in the three cell counts were correlated with each other (p< 0.05), particularly the drop in platelet count with the other two cell counts. The correlation between the drop in wbc and the drop in hemoglobin approached significance (p=0.075). The nadir wbc was significantly associated with BSA (p=0.004) and wbc level at baseline (p< 0.000) and approached significance for carboplatin dose (p=0.059) by ANCOVA analysis, while significance was not reached for nadir platelets or nadir hemoglobin. Conclusions: The different predictive equations resulted in similar, significantly correlated doses of carboplatin. BSA correlated significantly with myelosuppression while carboplatin dose only approached significance. No significant financial relationships to disclose.
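
The dosing calculation the abstract refers to can be sketched with the commonly cited forms of two of the named formulae: the Calvert equation, dose (mg) = target AUC × (GFR + 25), and the Cockcroft-Gault creatinine clearance estimate, (140 − age) × weight / (72 × serum creatinine), multiplied by 0.85 for women. The patient values below are hypothetical, the function names are assumptions, and the sketch only illustrates the arithmetic; it does not reproduce the study's computations or the MDRD, Jelliffe and Wright formulae.

```python
def cockcroft_gault(age_years, weight_kg, serum_creatinine_mg_dl, female=False):
    """Estimated creatinine clearance in mL/min (Cockcroft-Gault)."""
    crcl = (140 - age_years) * weight_kg / (72.0 * serum_creatinine_mg_dl)
    return 0.85 * crcl if female else crcl

def calvert_dose_mg(target_auc, gfr_ml_min):
    """Carboplatin dose in mg from the Calvert equation: AUC * (GFR + 25)."""
    return target_auc * (gfr_ml_min + 25.0)

# Hypothetical patient: 60-year-old man, 72 kg, serum creatinine 1.0 mg/dL, target AUC 5.
gfr = cockcroft_gault(60, 72.0, 1.0)
dose = calvert_dose_mg(5.0, gfr)
print(gfr, dose)
```

For this hypothetical patient the estimated clearance is 80 mL/min and the resulting dose is 525 mg; swapping in a different GFR formula changes only the first step, which is why the abstract compares doses across formulae.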

