Cancer Diagnosis and Disease Gene Identification via Statistical Machine Learning

2020 ◽  
Vol 15 ◽  
Author(s):  
Liuyuan Chen ◽  
Juntao Li ◽  
Mingming Chang

Diagnosing cancer and identifying disease genes from DNA microarray gene expression data are hot topics in current bioinformatics. This paper reviews the latest developments in cancer diagnosis and gene selection via statistical machine learning. The support vector machine is first introduced for binary cancer diagnosis. Then the 1-norm support vector machine, doubly regularized support vector machine, adaptive huberized support vector machine and other extensions are presented to improve the performance of gene selection. Lasso, elastic net, partly adaptive elastic net, group lasso, sparse group lasso, adaptive sparse group lasso and other sparse regression methods are also introduced for performing binary cancer classification and gene selection simultaneously. In addition to three strategies for reducing multiclass problems to binary ones, methods that consider all classes of data directly in a single learning model (multi-class support vector machines, sparse multinomial regression, adaptive multinomial regression and so on) are presented for multiple cancer diagnosis. Limitations and promising directions are also discussed.
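The link between sparse regression and gene selection can be illustrated with the soft-thresholding operator, which is the proximal operator of the L1 penalty used by lasso-type methods. The sketch below is only an illustration and not code from the paper: the gene names, coefficient values, and penalty of 0.5 are all hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink a coefficient toward zero by `penalty`; small values become exactly 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


# Hypothetical unpenalized coefficients for four genes (illustrative values only).
coefficients = {"gene_a": 1.5, "gene_b": -0.2, "gene_c": 0.05, "gene_d": -2.0}
penalty = 0.5  # hypothetical L1 penalty strength

shrunk = {gene: soft_threshold(c, penalty) for gene, c in coefficients.items()}

# Genes whose coefficients survive the shrinkage count as "selected".
selected = [gene for gene, c in shrunk.items() if c != 0.0]
print(shrunk)
print(selected)
```

Coefficients smaller in magnitude than the penalty are set exactly to zero, which is why an L1 penalty performs classification and gene selection at the same time.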

2012 ◽  
Vol 3 (1) ◽  
pp. 76-88
Author(s):  
Hiroshi Sato ◽  
Julien Rossignol

Statistical machine learning approaches to understanding human behavior have attracted considerable attention in recent years. A better understanding of humans makes it possible to build more user-friendly machines. In this paper, the authors propose a method for recognizing drivers from records of their manipulations using a support vector machine. They demonstrate the effectiveness of the method using a Segway. Recognition performance is quite good, especially when pre-processing with FFT is introduced.
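The abstract does not say how the FFT pre-processing is applied, so the following is only a sketch of one common approach: converting a window of a recorded control signal into magnitude-spectrum features that a classifier such as a support vector machine could consume. It uses a naive discrete Fourier transform from the standard library rather than a fast FFT implementation, and the window length and test signal are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(window):
    """Return normalized DFT magnitudes for frequency bins 0 .. n // 2."""
    n = len(window)
    magnitudes = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(window)
        )
        magnitudes.append(abs(coeff) / n)
    return magnitudes


# Hypothetical signal: a sine wave completing 2 cycles over a 16-sample window.
window = [math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
features = magnitude_spectrum(window)

# The strongest frequency bin should be the one matching the sine wave.
dominant_bin = max(range(len(features)), key=features.__getitem__)
print(dominant_bin)
```

Each window then yields a fixed-length feature vector, here 9 values for a 16-sample window, regardless of where in the window a driver's characteristic oscillation starts.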


2019 ◽  
Vol 59 (1) ◽  
Author(s):  
Isolde Van Dorst

This study creates a prediction model to identify which linguistic and extra-linguistic features influence pronoun choices in the plays of Shakespeare. In the English of Shakespeare’s time, the now-archaic distinction between you and thou persisted, and is usually reported as being determined by relative social status and personal closeness of speaker and addressee. But it remains to be determined whether statistical machine learning will support this traditional explanation. 23 features are investigated, having been selected from multiple linguistic areas, such as pragmatics, sociolinguistics and conversation analysis. The three algorithms used, Naive Bayes, decision tree and support vector machine, are selected as illustrative of a range of possible models in light of their contrasting assumptions and learning biases. Two predictions are performed, firstly on a binary (you/thou) distinction and then on a trinary (you/thou/thee) distinction. Of the three algorithms, the support vector machine models score best. The features identified as the best predictors of pronoun choice are the words in the direct linguistic context. Several other features are also shown to influence the pronoun prediction, including the names of the speaker and addressee, the status differential, and positive and negative sentiment.


2020 ◽  
Vol 25 (1) ◽  
pp. 24-38
Author(s):  
Eka Patriya

Stocks are a financial market instrument that many investors choose as an alternative source of funds, but stocks traded on the financial market often experience high price fluctuations (rises and falls). Investors stand not only to gain but also to incur losses in the future. One indicator that investors need to watch when investing in stocks is the movement of the Composite Stock Price Index (Indeks Harga Saham Gabungan, IHSG). Analyzing the IHSG is important for investors because it can reveal a trend or pattern in past stock price movements that may recur, and that can therefore be used to predict future stock price movements. One method that can be used to predict stock price movements accurately is machine learning. In this study, a model for predicting the IHSG closing price is built using the Support Vector Regression (SVR) algorithm. The model shows good predictive and generalization ability, with training and testing RMSE values of 14.334 and 20.281, and training and testing MAPE values of 0.211% and 0.251%. The results of this study are expected to help investors make decisions when formulating stock investment strategies.
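For readers unfamiliar with the two error measures reported above, the sketch below computes RMSE and MAPE using their standard definitions. The price values are hypothetical and are not data from the study; the function names are likewise illustrative.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical closing prices and model predictions (illustrative values only).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(rmse(actual, predicted))
print(mape(actual, predicted))
```

RMSE is expressed in the units of the index itself, while MAPE is relative to the actual value, which is why the two measures can differ greatly in magnitude for the same model.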


2021 ◽  
Vol 186 (Supplement_1) ◽  
pp. 445-451
Author(s):  
Yifei Sun ◽  
Navid Rashedi ◽  
Vikrant Vaze ◽  
Parikshit Shah ◽  
Ryan Halter ◽  
...  

Introduction: Early prediction of the acute hypotensive episode (AHE) in critically ill patients has the potential to improve outcomes. In this study, we apply different machine learning algorithms to the MIMIC III Physionet dataset, containing more than 60,000 real-world intensive care unit records, to test commonly used machine learning technologies and compare their performances. Materials and Methods: Five classification methods, including K-nearest neighbor, logistic regression, support vector machine, random forest, and a deep learning method called long short-term memory, are applied to predict an AHE 30 minutes in advance. An analysis comparing model performance when including versus excluding invasive features was conducted. To further study the pattern of the underlying mean arterial pressure (MAP), we apply a regression method to predict the continuous MAP values using linear regression over the next 60 minutes. Results: Support vector machine yields the best performance in terms of recall (84%). Including the invasive features in the classification improves the performance significantly, with both recall and precision increasing by more than 20 percentage points. We were able to predict the MAP 60 minutes in the future with a root mean square error (a frequently used measure of the differences between the predicted values and the observed values) of 10 mmHg. After converting continuous MAP predictions into AHE binary predictions, we achieve a 91% recall and 68% precision. In addition to predicting AHE, the MAP predictions provide clinically useful information regarding the timing and severity of the AHE occurrence. Conclusion: We were able to predict AHE with precision and recall above 80% 30 minutes in advance with the large real-world dataset. The predictions of the regression model can provide a more fine-grained, interpretable signal to practitioners. Model performance is improved by the inclusion of invasive features in predicting AHE, when compared to predicting the AHE based on only the available, restricted set of noninvasive technologies. This demonstrates the importance of exploring more noninvasive technologies for AHE prediction.
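The step of converting continuous MAP predictions into binary AHE predictions and scoring them with recall and precision can be sketched as follows. The abstract does not give the AHE definition used, so the 65 mmHg cutoff is a hypothetical assumption, and the MAP values are made up for illustration.

```python
def to_ahe_flags(map_values, threshold_mmhg=65.0):
    """Flag a reading as hypotensive when MAP is below the (assumed) threshold."""
    return [value < threshold_mmhg for value in map_values]


def precision_recall(actual, predicted):
    """Compute precision and recall from two equal-length lists of booleans."""
    tp = sum(a and p for a, p in zip(actual, predicted))
    fp = sum((not a) and p for a, p in zip(actual, predicted))
    fn = sum(a and (not p) for a, p in zip(actual, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical observed and predicted MAP values in mmHg (not study data).
observed_map = [70.0, 60.0, 58.0, 80.0, 62.0]
predicted_map = [68.0, 63.0, 66.0, 64.0, 61.0]

precision, recall = precision_recall(
    to_ahe_flags(observed_map), to_ahe_flags(predicted_map)
)
print(precision, recall)
```

Recall measures how many true episodes are caught, and precision measures how many alarms are real, so a regression model thresholded this way can trade one for the other by moving the cutoff.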


2021 ◽  
Vol 13 (6) ◽  
pp. 3497
Author(s):  
Hassan Adamu ◽  
Syaheerah Lebai Lutfi ◽  
Nurul Hashimah Ahamed Hassain Malim ◽  
Rohail Hassan ◽  
Assunta Di Vaio ◽  
...  

Sustainable development plays a vital role in information and communication technology. In times of pandemics such as COVID-19, vulnerable people need help to survive. This help includes the distribution of relief packages and materials by the government with the primary objective of lessening the economic and psychological effects on the citizens affected by disasters such as the COVID-19 pandemic. However, there has not been an efficient way to monitor public funds’ accountability and transparency, especially in developing countries such as Nigeria. The government's understanding of public emotions about distributed palliatives is important, as it would indicate the reach and impact of the distribution exercise. Although several studies on English emotion classification have been conducted, these studies are not portable to a wider, inclusive Nigerian case. This is because Informal Nigerian English (Pidgin), which Nigerians widely speak, has quite a different vocabulary from Standard English, which limits the applicability of emotion classification models trained on Standard English. An Informal Nigerian English (Pidgin English) emotions dataset is constructed, pre-processed, and annotated. The dataset is then used to classify five emotion classes (anger, sadness, joy, fear, and disgust) on the COVID-19 palliatives and relief aid distribution in Nigeria using standard machine learning (ML) algorithms. Six ML algorithms are used in this study, and a comparative analysis of their performance is conducted. The algorithms are Multinomial Naïve Bayes (MNB), Support Vector Machine (SVM), Random Forest (RF), Logistic Regression (LR), K-Nearest Neighbor (KNN), and Decision Tree (DT). The experiments reveal that the Support Vector Machine outperforms the remaining classifiers, with the highest accuracy of 88%. The “disgust” emotion class surpassed the other emotion classes, i.e., sadness, joy, fear, and anger, with the highest number of counts from the classification conducted on the constructed dataset. Additionally, the correlation analysis shows a significant relationship between the emotion classes of “Joy” and “Fear”, which implies that the public is excited about the palliatives’ distribution but afraid of inequality and a lack of transparency in the distribution process due to reasons such as corruption. In conclusion, the results from this experiment clearly show that public emotions on the distribution of COVID-19 support and relief aid packages in Nigeria were not satisfactory, considering that negative emotions from the public outnumbered public happiness.

