Differential Peripheral Blood Glycoprotein Profiles in Symptomatic and Asymptomatic COVID-19

Author(s):  
Chad Pickering ◽  
Bo Zhou ◽  
Gege Xu ◽  
Rachel Rice ◽  
Hector Huang ◽  
...  

Glycosylation is the most common form of post-translational modification of proteins, critically affecting their structure and function. Using liquid chromatography and mass spectrometry for high-resolution site-specific quantification of glycopeptides coupled with high-throughput artificial intelligence-powered data processing, we analyzed differential protein glyco-isoform distributions of 597 abundant serum glycopeptides and non-glycosylated peptides in 50 individuals who had been seriously ill with COVID-19 and in 22 individuals who had recovered after an asymptomatic course of COVID-19. As additional comparison reference phenotypes, we included 12 individuals with a history of infection with a common cold coronavirus, 16 patients with bacterial sepsis, and 15 healthy subjects without history of coronavirus exposure. We found statistically significant differences, at FDR<0.05, for normalized abundances of 374 of the 597 peptides and glycopeptides interrogated, between symptomatic and asymptomatic COVID-19 patients. Similar statistically significant differences were seen when comparing symptomatic COVID-19 patients to healthy controls (350 differentially abundant peptides and glycopeptides) and common cold coronavirus seropositive subjects (353 differentially abundant peptides and glycopeptides). Between healthy controls and sepsis patients, 326 peptides and glycopeptides were found to be differentially abundant, of which 277 overlapped with biomarkers that showed differential expression between symptomatic COVID-19 cases and healthy controls. Between symptomatic COVID-19 cases and sepsis patients, 101 glycopeptide and peptide biomarkers were found to be differentially abundant at statistical significance. Using both supervised and unsupervised machine learning techniques, we found specific glycoprotein profiles to be strongly predictive of symptomatic COVID-19 infection. LASSO-regularized multivariable logistic regression and K-means clustering yielded accuracies of 100% in an independent test set and of 96% overall, respectively. Our findings are consistent with the interpretation that most of the observed glycoprotein modifications shared among symptomatic COVID-19 and sepsis patients likely represent a generic consequence of a severe systemic immune and inflammatory state. However, there are glyco-isoform changes that are specific to severe COVID-19 infection. These may be representative of either COVID-19-specific consequences or of susceptibility to or predisposition for a severe course of the disease. Our findings support the potential value of glycoproteomic biomarkers in the biomedical understanding, and, potentially, the clinical management of serious acute infectious conditions.
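The abstract reports differences "at FDR<0.05" but does not name the correction procedure. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up procedure (an assumption, not confirmed by the source), the following shows how a list of per-peptide p-values could be screened at a 5% false discovery rate. The function name and the p-values are hypothetical and purely illustrative.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected
    under the Benjamini-Hochberg procedure at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank

    # Reject every hypothesis whose rank is at most max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for ten peptides (not data from the study).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(example_p)
print(sum(flags), "of", len(example_p), "peptides pass the FDR threshold")
```

Because the rule is step-up, a p-value that misses its own rank threshold can still be rejected when a larger p-value further down the sorted list meets its threshold.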

Author(s):  
Qi Wang ◽  
Xia Zhao ◽  
Jincai Huang ◽  
Yanghe Feng ◽  
Zhong Liu ◽  
...  

The concept of ‘big data’ has been widely discussed, and its value has been illuminated throughout a variety of domains. To quickly mine potential values and alleviate the ever-increasing volume of information, machine learning is playing an increasingly important role and faces more challenges than ever. Because few studies exist regarding how to modify machine learning techniques to accommodate big data environments, we provide a comprehensive overview of the history of the evolution of big data, the foundations of machine learning, and the bottlenecks and trends of machine learning in the big data era. More specifically, based on learning principles, we discuss regularization to enhance generalization. The challenges of quality in big data are reduced to the curse of dimensionality, class imbalances, concept drift and label noise, and the underlying reasons and mainstream methodologies to address these challenges are introduced. Learning model development has been driven by domain specifics, dataset complexities, and the presence or absence of human involvement. In this paper, we propose a robust learning paradigm by aggregating the aforementioned factors. Over the next few decades, we believe that these perspectives will lead to novel ideas and encourage more studies aimed at incorporating knowledge and establishing data-driven learning systems that involve both data quality considerations and human interactions.


2019 ◽  
Author(s):  
Yuanwei Xu ◽  
Hollie Topliffe ◽  
James Stimson ◽  
Helen R. Stagg ◽  
Ibrahim Abubakar ◽  
...  

Abstract: Outbreaks of tuberculosis, such as the large isoniazid-resistant outbreak centered on London, United Kingdom, which originated in 1995, provide excellent opportunities to model transmission of this devastating disease. Transmission chains for tuberculosis are notoriously difficult to ascertain, but mathematical modelling approaches, combined with whole-genome sequencing (WGS) data, have strong potential to contribute to transmission analyses. Using such data, we aimed to reconstruct transmission histories for the outbreak using a Bayesian approach, and to use machine learning techniques with patient-level data to identify the key covariates associated with transmission. By using our transmission reconstruction method that accounts for phylogenetic uncertainty, we are able to identify 24 transmission events with reasonable confidence, 11 of which have zero single nucleotide polymorphism (SNP) distance, and a maximum distance of 3. Patient age, alcohol abuse and history of homelessness were found to be the most important predictors of being credible tuberculosis transmitters.


2021 ◽  
Author(s):  
Akhil Saji

Objectives: The annual addresses of the President of the American Urological Association (AUA) may articulate and reflect the goals, values, and concerns of the contemporary AUA membership. There is no organized archive of such addresses. We aimed to create a searchable database of all AUA Presidents and their addresses to determine variables associated with speech sentiment including positivity, negativity, and emotional tone through the 117 years of the AUA’s history. Methods: We queried AUA archives, journals, recorded tape, and personal records to create a database of all existing AUA Presidential addresses and biographic data. We applied natural language processing and machine learning techniques to evaluate the addresses for overall sentiment, with validation using analog analyses (i.e., reading and annotation). Multivariable logistic regression was performed to identify significant predictors of Presidential address sentiment. Results: Between 1902 and 2019, a total of 113 AUA meetings were held. A total of 85 of 113 (75.22%) presidential addresses were transcribed and archived in the database, representing 254,124 words by male presidents with a median (IQR) age of 61.43 (53.1-66.5) years. AUA Presidents during the second half of the history of the AUA (1960-2019) were significantly older at time of inauguration and gave more positive speeches in the active voice than presidents during the first half (1902-1959) (p < .05). The only significant independent predictor of the degree of positivity in an AUA President’s annual address was speaker age (95% CI 1.007-1.119). Conclusions: We created the first digital, searchable database of all AUA Presidential speeches from 1902-2019 and aim to add additional addresses prospectively. Artificial intelligence analyses mirrored the findings of human reading and demonstrated that from 1902-2019 AUA Presidential addresses became more positive and optimistic with increasing speaker age but without consistent predictors of a speech’s emotional or factual content.


Author(s):  
Baran Tokar ◽  
Mukaddes Baskaya ◽  
Ozer Celik ◽  
Fatih Cemrek ◽  
Ayfer Acikgoz

Introduction: As a subset of artificial intelligence, machine learning techniques (MLTs) may evaluate very large and raw datasets. In this study, the aim is to establish a model by MLT for the prediction of enuresis in children. Materials and Methods: The study included 8,071 elementary school students. A total of 704 children had enuresis. For analysis of data with MLT, another group including 704 nonenuretic children was structured with stratified sampling. Out of 34 independent variables, 14 with high feature values significantly affecting enuresis were selected. A model of estimation was created by training the data. Results: Fourteen independent variables in order of feature importance value were starting age of toilet training, having urinary urgency, holding maneuvers to prevent voiding, frequency of defecation, history of enuresis in mother and father, having child's own room, parent's education level, history of enuresis in siblings, consanguineous marriage, incomplete bladder emptying, frequent voiding, gender, history of urinary tract infection, and surgery in the past. Logistic regression was found to be the best MLT algorithm for the prediction of enuresis. The total accuracy rate of the model in prediction was 81.3%. Conclusion: MLT might provide a faster and easier evaluation process for studies on enuresis with a large dataset. The model in this study may suggest that selected variables with high feature values could be preferred with priority in any screening studies for enuresis. MLT may prevent clinical errors due to human cognitive biases and may help physicians to be proactive in diagnosis and treatment of enuresis.


2020 ◽  
pp. jnnp-2020-324371
Author(s):  
Olivia Rousseau ◽  
Matilde Karakachoff ◽  
Alban Gaignard ◽  
Lise Bellanger ◽  
Philippe Bijlenga ◽  
...  

Background and purpose: The ever-growing availability of imaging has led to an increasing number of incidentally discovered unruptured intracranial aneurysms (UIAs). We leveraged machine-learning techniques and advanced statistical methods to provide new insights into ruptured intracranial aneurysm (RIA) risks. Methods: We analysed the characteristics of 2505 patients with intracranial aneurysms (IA) discovered between 2016 and 2019. Baseline characteristics, familial history of IA, tobacco and alcohol consumption, pharmacological treatments before the IA diagnosis, cardiovascular risk factors and comorbidities, headaches, allergy and atopy, IA location, absolute IA size and adjusted size ratio (aSR) were analysed with a multivariable logistic regression (MLR) model. A random forest (RF) method globally assessed the risk factors and evaluated the predictive capacity of a multivariate model. Results: Among 994 patients with RIA (39.7%) and 1511 patients with UIA (60.3%), the MLR showed that IA location appeared to be the most significant factor associated with RIA (OR, 95% CI: internal carotid artery, reference; middle cerebral artery, 2.72, 2.02–3.58; anterior cerebral artery, 4.99, 3.61–6.92; posterior circulation arteries, 6.05, 4.41–8.33). Size and aSR were not significant factors associated with RIA in the MLR model, and patients taking antiplatelet treatment were less likely to have RIA (OR: 0.74; 95% CI: 0.55–0.98). IA location and age, followed by aSR, were the best predictors of RIA using the RF model. Conclusions: The location of IA is the most consistent parameter associated with RIA. The use of ‘artificial intelligence’ RF helps to re-evaluate the contribution and selection of each risk factor in the multivariate model.
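The odds ratios above come from a multivariable logistic regression, so they are adjusted for the other covariates and cannot be recomputed from the abstract alone. As a simpler illustration of what an odds ratio with a 95% confidence interval expresses, the sketch below computes an unadjusted odds ratio and a Wald interval from a single 2x2 table. The function name and all counts are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_ci(exposed_cases, exposed_noncases,
                  unexposed_cases, unexposed_noncases, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval (z=1.96 gives ~95%).

    All four cell counts must be positive."""
    a, b = exposed_cases, exposed_noncases
    c, d = unexposed_cases, unexposed_noncases
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: ruptured vs. unruptured aneurysms at one location
# compared with a reference location (illustrative only).
or_value, ci_low, ci_high = odds_ratio_ci(40, 60, 20, 80)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

An interval that excludes 1 corresponds to a statistically significant association at roughly the 5% level, which is how intervals such as 0.55–0.98 in the abstract can be read.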


Author(s):  
Qi Wang ◽  
Xia Zhao ◽  
Jincai Huang ◽  
Yanghe Feng ◽  
Jiahao Su ◽  
...  

The concept of ‘big data’ has been widely discussed, and its value has been illuminated throughout a variety of domains. To quickly mine potential values and alleviate the ever-increasing volume of information, machine learning is playing an increasingly important role and faces more challenges than ever. Because few studies exist regarding how to modify machine learning techniques to accommodate big data environments, we provide a comprehensive overview of the history of the evolution of big data, the foundations of machine learning, and the bottlenecks and trends of machine learning in the big data era. More specifically, based on learning principles, we discuss regularization to enhance generalization. The challenges of quality in big data are reduced to the curse of dimensionality, class imbalances, concept drift and label noise, and the underlying reasons and mainstream methodologies to address these challenges are introduced. Learning model development has been driven by domain specifics, dataset complexities, and the presence or absence of human involvement. In this paper, we propose a robust learning paradigm by aggregating the aforementioned factors. Over the next few decades, we believe that these perspectives will lead to novel ideas and encourage more studies aimed at incorporating knowledge and establishing data-driven learning systems that involve both data quality considerations and human interactions.


PLoS ONE ◽  
2021 ◽  
Vol 16 (3) ◽  
pp. e0248526
Author(s):  
Yu Takahashi ◽  
Kenbun Sone ◽  
Katsuhiko Noda ◽  
Kaname Yoshida ◽  
Yusuke Toyohara ◽  
...  

Endometrial cancer is a ubiquitous gynecological disease with increasing global incidence. Therefore, despite the lack of an established screening technique to date, early diagnosis of endometrial cancer assumes critical importance. This paper presents an artificial-intelligence-based system to detect the regions affected by endometrial cancer automatically from hysteroscopic images. In this study, 177 patients (60 with normal endometrium, 21 with uterine myoma, 60 with endometrial polyp, 15 with atypical endometrial hyperplasia, and 21 with endometrial cancer) with a history of hysteroscopy were recruited. Machine-learning techniques based on three popular deep neural network models were employed, and a continuity-analysis method was developed to enhance the accuracy of cancer diagnosis. Finally, we investigated if the accuracy could be improved by combining all the trained models. The results reveal that the diagnosis accuracy was approximately 80% (78.91–80.93%) when using the standard method, and it increased to 89% (83.94–89.13%) and exceeded 90% (i.e., 90.29%) when employing the proposed continuity analysis and combining the three neural networks, respectively. The corresponding sensitivity and specificity equaled 91.66% and 89.36%, respectively. These findings demonstrate the proposed method to be sufficient to facilitate timely diagnosis of endometrial cancer in the near future.


2020 ◽  
Author(s):  
Petter Jakobsen ◽  
Enrique Garcia-Ceja ◽  
Michael Riegler ◽  
Lena Antonsen Stabell ◽  
Tine Nordgreen ◽  
...  

Abstract: Current practice of assessing mood episodes in affective disorders largely depends on subjective observations combined with semi-structured clinical rating scales. Motor activity is an objective observation of the inner physiological state expressed in behavior patterns. Alterations of motor activity are essential features of bipolar and unipolar depression. The aim was to investigate if objective measures of motor activity can aid existing diagnostic practice, by applying machine-learning techniques to analyze activity patterns in depressed patients and healthy controls. Random Forest, Deep Neural Network and Convolutional Neural Network algorithms were used to analyze 14 days of actigraph-recorded motor activity from 23 depressed patients and 32 healthy controls. Statistical features analyzed in the dataset were mean activity, standard deviation of mean activity and proportion of zero activity. Various techniques to handle data imbalance were applied, and to ensure generalizability and avoid overfitting a Leave-One-User-Out validation strategy was utilized. All outcomes are reported as measures of accuracy for binary tests. A Deep Neural Network combined with a random oversampling class-balancing technique outperformed the other approaches, with a true positive rate of 0.82 (sensitivity) and a true negative rate of 0.84 (specificity). Accuracy was 0.84 and the Matthews Correlation Coefficient 0.65. Misclassifications appear related to data overlapping among the classes, so an appropriate future approach will be to compare mood states intra-individually. In summary, machine-learning techniques present promising abilities in discriminating between depressed patients and healthy controls in motor activity time series.
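The four measures reported above (sensitivity, specificity, accuracy and the Matthews Correlation Coefficient) all derive from a binary confusion matrix. The sketch below shows the standard formulas. The function name and the counts are hypothetical: they were chosen only to match the group sizes of 23 patients and 32 controls, and they are not expected to reproduce the published figures, which may have been aggregated differently across validation folds.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute common binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # Matthews Correlation Coefficient: ranges from -1 to 1, 0 means chance level.
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for 23 depressed patients and 32 healthy controls.
metrics = binary_metrics(tp=19, fn=4, tn=27, fp=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews Correlation Coefficient uses all four cells of the confusion matrix, which makes it a more informative summary when the two classes differ in size, as they do here.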


2018 ◽  
Vol 6 (2) ◽  
pp. 155-168 ◽  
Author(s):  
Naresh Babu Bynagari ◽  
Takudzwa Fadziso

Machine learning techniques have been successfully used to analyze neuroimaging data in the context of disease diagnosis in recent years. In this study, we present an overview of contemporary support vector machine-based methods developed and used in psychiatric neuroimaging for schizophrenia research. We focus in particular on our group's algorithms, which have been used to categorize schizophrenia patients and healthy controls, and compare their accuracy findings to those of other recently published studies. First, we review some basic pattern recognition and machine learning terms. Then, we describe and discuss each study independently, emphasizing the key characteristics that distinguish each approach. Finally, we draw conclusions by comparing the data obtained using the various methodologies presented, to determine how beneficial automatic categorization systems are in understanding the molecular underpinnings of schizophrenia. The primary implications of applying these approaches in clinical practice are then discussed.

