CT Image Analysis Using Grayscale Statistics to Categorise Severity of Lung Abnormalities of COVID-19 Patients

Author(s):  
Sara Ghashghaei ◽  
David A. Wood ◽  
Erfan Sadatshojaei ◽  
Mansooreh Jalilpoor

Abstract Grayscale image attributes from 456 images extracted from CT scan slices of 53 patients (49 with COVID-19 and 4 without) are used to establish a visual scale of severity of lung abnormalities (five classes: 0 to 4). The complex trends of these easy-to-derive image attributes can be used graphically to discern the visual scale of lung abnormalities in broad terms. With the aid of machine learning algorithms, the visual classes can be distinguished with close to 95% accuracy using combinations of selected grayscale attributes. Confusion matrices reveal that the best-performing machine learning models are able to distinguish more accurately between certain classes than visual inspection of CT images by experts. The AdaBoost, decision tree and random forest models confused, on average, fewer than 25 of the 456 CT-scan image extracts evaluated between the visual classes of lung abnormalities.
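The abstract does not list which grayscale attributes were used, so the following is only a minimal sketch of how easy-to-derive statistics might be computed from one image extract. The function name `grayscale_stats`, the chosen attributes (mean, standard deviation, minimum, median, maximum) and the sample pixel values are all assumptions made for illustration.

```python
from statistics import mean, pstdev

def grayscale_stats(pixels):
    """Summarise a flat list of grayscale intensities (e.g. 0-255)."""
    values = sorted(pixels)
    n = len(values)

    def percentile(p):
        # Nearest-rank style percentile on the sorted intensities.
        idx = min(n - 1, max(0, round(p / 100 * (n - 1))))
        return values[idx]

    return {
        "mean": mean(values),
        "std": pstdev(values),   # population standard deviation
        "min": values[0],
        "p50": percentile(50),   # median intensity
        "max": values[-1],
    }

# Hypothetical pixel values, not data from the study.
stats = grayscale_stats([0, 50, 100, 150, 200])
print(stats)
```

A vector of such statistics per image extract could then serve as the input features for a classifier such as a decision tree or random forest.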

2020 ◽  
Author(s):  
Liam Brierley ◽  
Anna Fowler

Abstract The COVID-19 pandemic has demonstrated the serious potential for novel zoonotic coronaviruses to emerge and cause major outbreaks. The immediate animal origin of the causative virus, SARS-CoV-2, remains unknown, and identifying it is a notoriously challenging task for emerging disease investigations. Coevolution with hosts leads to specific evolutionary signatures within viral genomes that can inform likely animal origins. We obtained a set of 650 spike protein and 511 whole genome nucleotide sequences from 225 and 187 viruses belonging to the family Coronaviridae, respectively. We then trained random forest models independently on genome composition biases of spike protein and whole genome sequences, including dinucleotide and codon usage biases, in order to predict animal host (of nine possible categories, including human). In hold-one-out cross-validation, predictive accuracy on unseen coronaviruses consistently reached ~73%, indicating evolutionary signal in spike proteins to be just as informative as whole genome sequences. However, different composition biases were informative in each case. Applying optimised random forest models to classify human sequences of MERS-CoV and SARS-CoV revealed evolutionary signatures consistent with their recognised intermediate hosts (camelids, carnivores), while human sequences of SARS-CoV-2 were predicted as having bat hosts (suborder Yinpterochiroptera), supporting bats as the suspected origins of the current pandemic. In addition to phylogeny, variation in genome composition can act as an informative approach to predict emerging virus traits as soon as sequences are available. More widely, this work demonstrates the potential in combining genetic resources with machine learning algorithms to address long-standing challenges in emerging infectious diseases.
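The abstract names dinucleotide bias as one of the genome composition features but does not give a formula. A common definition, assumed here, is the ratio of the observed dinucleotide frequency to the frequency expected from the individual base frequencies. The function name `dinucleotide_bias` and the toy sequence are hypothetical.

```python
from collections import Counter

def dinucleotide_bias(seq):
    """Observed/expected ratio for each dinucleotide that occurs in seq.

    Assumed definition: f(XY) / (f(X) * f(Y)), using overlapping pairs.
    Dinucleotides that never occur are simply absent from the result.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    base_freq = {base: count / n for base, count in Counter(seq).items()}
    pair_counts = Counter(seq[i:i + 2] for i in range(n - 1))
    total_pairs = n - 1
    return {
        pair: (count / total_pairs) / (base_freq[pair[0]] * base_freq[pair[1]])
        for pair, count in pair_counts.items()
    }

# Toy sequence for illustration only.
bias = dinucleotide_bias("ACGT")
print(bias)
```

Ratios above 1 indicate over-representation and ratios below 1 under-representation; a vector of such ratios per sequence is the kind of feature a random forest could be trained on.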


2021 ◽  
Vol 17 (4) ◽  
pp. e1009149
Author(s):  
Liam Brierley ◽  
Anna Fowler

The COVID-19 pandemic has demonstrated the serious potential for novel zoonotic coronaviruses to emerge and cause major outbreaks. The immediate animal origin of the causative virus, SARS-CoV-2, remains unknown, and identifying it is a notoriously challenging task for emerging disease investigations. Coevolution with hosts leads to specific evolutionary signatures within viral genomes that can inform likely animal origins. We obtained a set of 650 spike protein and 511 whole genome nucleotide sequences from 222 and 185 viruses belonging to the family Coronaviridae, respectively. We then trained random forest models independently on genome composition biases of spike protein and whole genome sequences, including dinucleotide and codon usage biases, in order to predict animal host (of nine possible categories, including human). In hold-one-out cross-validation, predictive accuracy on unseen coronaviruses consistently reached ~73%, indicating evolutionary signal in spike proteins to be just as informative as whole genome sequences. However, different composition biases were informative in each case. Applying optimised random forest models to classify human sequences of MERS-CoV and SARS-CoV revealed evolutionary signatures consistent with their recognised intermediate hosts (camelids, carnivores), while human sequences of SARS-CoV-2 were predicted as having bat hosts (suborder Yinpterochiroptera), supporting bats as the suspected origins of the current pandemic. In addition to phylogeny, variation in genome composition can act as an informative approach to predict emerging virus traits as soon as sequences are available. More widely, this work demonstrates the potential in combining genetic resources with machine learning algorithms to address long-standing challenges in emerging infectious diseases.


Author(s):  
Sergiu Apostu

This paper presents an analysis of voice traffic in telephone networks, based on machine learning algorithms, to detect fraud committed by callers. Starting from a raw data set that includes the call date, destination number, duration and caller's number, our approach was able to identify fraudulent calls at an early stage. For balance, the data set was split into two parts: one for training and one for testing. To obtain mean-centred values, a standardization technique was applied to scale the data before dimensionality reduction using Principal Component Analysis. Then, the first two components were used as inputs for Logistic Regression and Random Forest models, with the caller as the target. Finally, the target was moved to the destination file in order to identify the caller and the moment when the call started, based on a vector representation of words.
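The standardization step mentioned above can be sketched as a z-score transform applied to each feature column before Principal Component Analysis. The abstract does not say which scaler was used, so the z-score form, the function name `standardize` and the sample call durations are assumptions.

```python
from statistics import mean, pstdev

def standardize(column):
    """Scale one feature column to zero mean and unit variance (z-scores)."""
    mu = mean(column)
    sigma = pstdev(column)  # population standard deviation
    if sigma == 0:
        # A constant column carries no variance; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]

# Hypothetical call durations in seconds, not data from the paper.
scaled = standardize([60, 120, 180])
print(scaled)
```

Scaling each column this way keeps a feature with a large numeric range, such as call duration, from dominating the principal components.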


Atmosphere ◽  
2021 ◽  
Vol 12 (1) ◽  
pp. 109
Author(s):  
Ashima Malik ◽  
Megha Rajam Rao ◽  
Nandini Puppala ◽  
Prathusha Koouri ◽  
Venkata Anil Kumar Thota ◽  
...  

Over the years, rampant wildfires have plagued the state of California, creating economic and environmental loss. In 2018, wildfires cost nearly 800 million dollars in economic loss and claimed more than 100 lives in California. Over 1.6 million acres of land burned, causing extensive environmental damage. Although researchers have recently introduced machine learning models and algorithms for predicting wildfire risks, these results focused on special perspectives and were restricted to a limited number of data parameters. In this paper, we propose two data-driven machine learning approaches based on random forest models to predict the wildfire risk in areas near Monticello and Winters, California. This study demonstrates how the models were developed and applied with comprehensive data parameters such as powerlines, terrain, and vegetation from different perspectives, which improved the spatial and temporal accuracy in predicting the risk of wildfire, including fire ignition. The combined model uses the spatial and the temporal parameters as a single combined dataset to train and predict the fire risk, whereas the ensemble model was fed separate parameters that were later stacked to work as a single model. Our experiment shows that the combined model produced better results in terms of accuracy than the ensemble of random forest models on separate spatial data. The models were validated with Receiver Operating Characteristic (ROC) curves, learning curves, and evaluation metrics such as accuracy, confusion matrices, and classification reports. The study achieved a cutting-edge accuracy of 92% in predicting wildfire risks, including ignition, by utilizing regional spatial and temporal data along with standard data parameters in Northern California.
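Two of the evaluation metrics named above, accuracy and the confusion matrix, can be illustrated with a short sketch. The labels below are hypothetical and are not data from the study; the function names `confusion_counts` and `accuracy` are likewise assumptions.

```python
from collections import Counter

def confusion_counts(y_true, y_pred):
    """Count (actual, predicted) label pairs; each key is one confusion-matrix cell."""
    return Counter(zip(y_true, y_pred))

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the actual label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical labels for illustration only.
actual = ["fire", "fire", "none", "none", "none"]
predicted = ["fire", "none", "none", "none", "fire"]

cells = confusion_counts(actual, predicted)
acc = accuracy(actual, predicted)
print(cells, acc)
```

The off-diagonal cells, where the actual and predicted labels differ, show which kinds of error a model makes, which a single accuracy figure cannot.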


2019 ◽  
Author(s):  
Karen-Inge Karstoft ◽  
Ioannis Tsamardinos ◽  
Kasper Eskelund ◽  
Søren Bo Andersen ◽  
Lars Ravnborg Nissen

BACKGROUND Posttraumatic stress disorder (PTSD) is a relatively common consequence of deployment to war zones. Early postdeployment screening with the aim of identifying those at risk for PTSD in the years following deployment would help deliver interventions to those in need but has so far proved unsuccessful. OBJECTIVE This study aimed to test the applicability of automated model selection and the ability of automated machine learning prediction models to transfer across cohorts and predict screening-level PTSD 2.5 years and 6.5 years after deployment. METHODS Automated machine learning was applied to data routinely collected 6-8 months after return from deployment from 3 different cohorts of Danish soldiers deployed to Afghanistan in 2009 (cohort 1, N=287 or N=261 depending on the timing of the outcome assessment), 2010 (cohort 2, N=352), and 2013 (cohort 3, N=232). RESULTS Models transferred well between cohorts. For screening-level PTSD 2.5 and 6.5 years after deployment, random forest models provided the highest accuracy as measured by area under the receiver operating characteristic curve (AUC): 2.5 years, AUC=0.77, 95% CI 0.71-0.83; 6.5 years, AUC=0.78, 95% CI 0.73-0.83. Linear models performed equally well. Military rank, hyperarousal symptoms, and total level of PTSD symptoms were highly predictive. CONCLUSIONS Automated machine learning provided validated models that can be readily implemented in future deployment cohorts in the Danish Defense with the aim of targeting postdeployment support interventions to those at highest risk for developing PTSD, provided the cohorts are deployed on similar missions.
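The AUC reported above can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below computes it directly from that pairwise definition; the function name `roc_auc` and the example labels and scores are hypothetical and unrelated to the study data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for positive cases, 0 for negative cases.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical predictions for illustration only.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which places the reported 0.77 and 0.78 in context.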


Author(s):  
Soundariya R.S. ◽  
Tharsanee R.M. ◽  
Vishnupriya B ◽  
Ashwathi R ◽  
...  

Coronavirus disease (COVID-19) has spread rapidly worldwide from April 2020 to date, leading to massive loss of life across various countries. In accordance with the advice of the WHO, diagnosis is presently performed by Reverse Transcription Polymerase Chain Reaction (RT-PCR) testing, which takes four to eight hours to process test samples and a further 48 hours to categorize whether the samples are positive or negative. Laboratory tests are therefore time consuming, and a faster diagnosis of the disease is urgently needed. This can be attained through several Artificial Intelligence methodologies for early diagnosis and tracing of the disease. Those methodologies fall into three categories: (i) predicting the pandemic spread using mathematical models; (ii) empirical analysis using machine learning models to forecast the global corona transition by considering susceptible, infected and recovered rates; (iii) utilizing deep learning architectures for corona diagnosis using input data in the form of X-ray images and CT scan images. When X-ray and CT scan images are taken into account, supplementary data such as medical signs, patient history and laboratory test results can also be considered while training the learning model to improve the testing efficacy. Thus the proposed investigation summarizes the several mathematical models, machine learning algorithms and deep learning frameworks that can be executed on the datasets to forecast the traces of COVID-19 and detect the risk factors of coronavirus.


2021 ◽  
Vol 5 (CHI PLAY) ◽  
pp. 1-29
Author(s):  
Alessandro Canossa ◽  
Dmitry Salimov ◽  
Ahmad Azadvar ◽  
Casper Harteveld ◽  
Georgios Yannakakis

Is it possible to detect toxicity in games just by observing in-game behavior? If so, what are the behavioral factors that will help machine learning to discover the unknown relationship between gameplay and toxic behavior? In this initial study, we examine whether it is possible to predict toxicity in the MOBA game For Honor by observing in-game behavior for players that have been labeled as toxic (i.e. players that have been sanctioned by Ubisoft community managers). We test our hypothesis of detecting toxicity through gameplay with a dataset of almost 1,800 sanctioned players, comparing these sanctioned players with unsanctioned players. Sanctioned players are defined by their toxic action type (offensive behavior vs. unfair advantage) and degree of severity (warned vs. banned). Our findings, based on supervised learning with random forests, suggest that it is not only possible to behaviorally distinguish sanctioned from unsanctioned players based on selected features of gameplay; it is also possible to predict both the sanction severity (warned vs. banned) and the sanction type (offensive behavior vs. unfair advantage). In particular, all random forest models predict toxicity, its severity, and type, with an accuracy of at least 82%, on average, on unseen players. This research shows that observing in-game behavior can support the work of community managers in moderating and possibly containing the burden of toxic behavior.


2021 ◽  
Author(s):  
Enzo Losi ◽  
Mauro Venturini ◽  
Lucrezia Manservigi ◽  
Giuseppe Fabio Ceschini ◽  
Giovanni Bechini ◽  
...  

Abstract A gas turbine trip is an unplanned shutdown, of which the most relevant consequences are business interruption and a reduction of equipment remaining useful life. Thus, understanding the underlying causes of gas turbine trip would allow predicting its occurrence in order to maximize gas turbine profitability and improve its availability. In the ever competitive Oil & Gas sector, data mining and machine learning are increasingly being employed to support a deeper insight and improved operation of gas turbines. Among the various machine learning tools, Random Forests are an ensemble learning method consisting of an aggregation of decision tree classifiers. This paper presents a novel methodology aimed at exploiting information embedded in the data and develops Random Forest models, aimed at predicting gas turbine trip based on information gathered during a timeframe of historical data acquired from multiple sensors. The novel approach exploits time series segmentation to increase the amount of training data, thus reducing overfitting. First, data are transformed according to a feature engineering methodology developed in a separate work by the same authors. Then, Random Forest models are trained and tested on unseen observations to demonstrate the benefits of the novel approach. The superiority of the novel approach is proved by considering two real-world case studies, involving field data taken during three years of operation of two fleets of Siemens gas turbines located in different regions. The novel methodology allows values of Precision, Recall and Accuracy in the range 75–85%, thus demonstrating the industrial feasibility of the predictive methodology.
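The abstract does not describe how the time series are segmented, so the following is only a sketch of one plausible scheme: cutting a sensor recording into fixed-length, overlapping windows so that one recording yields several training examples. The function name `segment`, the window length and the step size are assumptions.

```python
def segment(series, window, step):
    """Split a sequence into fixed-length windows taken every `step` samples.

    Windows overlap when step < window; a trailing partial window is dropped.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        series[start:start + window]
        for start in range(0, len(series) - window + 1, step)
    ]

# Hypothetical sensor readings (sample indices stand in for measurements).
segments = segment(list(range(10)), window=4, step=2)
print(segments)
```

Here one 10-sample recording produces four overlapping training segments. Because overlapping windows share samples, segments from the same recording would normally be kept on the same side of any train/test split.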


2021 ◽  
Author(s):  
Naser Zaeri

The coronavirus disease 2019 (COVID-19) outbreak has been designated a worldwide pandemic by the World Health Organization (WHO) and has raised an international call for a global health emergency. In this regard, recent technological advances in the field of artificial intelligence and machine learning provide opportunities for researchers and scientists to step into this battlefield and convert the related data into meaningful knowledge through computational models, for the tasks of containing the virus, diagnosis and providing treatment. In this study, we present recent developments and practical implementations of artificial intelligence modeling and machine learning algorithms proposed by researchers and practitioners during the pandemic period, which show serious potential as compliant solutions for diagnosis and decision making using computerized tomography (CT) scan imaging. We review the modern algorithms in CT scan imaging modeling that may be used for detection, quantification, and tracking of coronavirus, and study how they can differentiate coronavirus patients from those who do not have the disease.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Vishwesh Venkatraman

Abstract Motivation The absorption, distribution, metabolism, excretion, and toxicity (ADMET) of drugs play a key role in determining which among the potential candidates are to be prioritized. In silico approaches based on machine learning methods are becoming increasingly popular, but are nonetheless limited by the availability of data. With a view to making both data and models available to the scientific community, we have developed FPADMET, which is a repository of molecular fingerprint-based predictive models for ADMET properties. Summary In this article, we have examined the efficacy of fingerprint-based machine learning models for a large number of ADMET-related properties. The predictive ability of a set of 20 different binary fingerprints (based on substructure keys, atom pairs, local path environments, as well as custom fingerprints such as all-shortest paths) for over 50 ADMET and ADMET-related endpoints has been evaluated as part of the study. We find that for a majority of the properties, fingerprint-based random forest models yield comparable or better performance compared with traditional 2D/3D molecular descriptors. Availability The models are made available as part of open access software that can be downloaded from https://gitlab.com/vishsoft/fpadmet.

