Identifying Polarity in Tweets from an Imbalanced Dataset about Diseases and Vaccines Using a Meta-Model Based on Machine Learning Techniques

2020 ◽  
Vol 10 (24) ◽  
pp. 9019
Author(s):  
Alejandro Rodríguez-González ◽  
Juan Manuel Tuñas ◽  
Lucia Prieto Santamaría ◽  
Diego Fernández Peces-Barba ◽  
Ernestina Menasalvas Ruiz ◽  
...  

Sentiment analysis is one of the most active topics in natural language processing and has attracted considerable interest from both the scientific and industrial communities. Identifying the sentiment expressed in a piece of text is a challenging task that several commercial tools have tried to address. While trying to capture the sentiment expressed in a set of tweets retrieved for a study about vaccines and diseases during the period 2015–2018, we found that some of the main commercial tools did not accurately identify the sentiment expressed in a tweet. For this reason, we set out to create a meta-model that uses the outputs of the commercial tools to improve on the results each tool achieves individually. As part of this research, we had to deal with the problem of imbalanced data. This paper presents the main results of creating a meta-model from three commercial tools for the correct identification of sentiment in tweets, using different machine learning techniques and methods while dealing with the imbalanced data problem.
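To make the meta-model idea concrete, here is a minimal sketch, assuming the simplest possible learner: a lookup table that maps each combination of three tool outputs to the gold label most often seen with it in training. This is not the authors' model; the abstract does not say which learners or features were used. The function names, labels, and toy data are hypothetical, and the sketch does nothing about class imbalance.

```python
from collections import Counter, defaultdict


def fit_meta_model(tool_outputs, gold_labels):
    """Learn, for each combination of tool outputs, the most frequent gold label."""
    votes = defaultdict(Counter)
    for outputs, gold in zip(tool_outputs, gold_labels):
        votes[tuple(outputs)][gold] += 1
    table = {combo: counts.most_common(1)[0][0] for combo, counts in votes.items()}
    # Unseen combinations fall back to the most frequent training label.
    fallback = Counter(gold_labels).most_common(1)[0][0]
    return table, fallback


def predict(model, outputs):
    """Predict a label for one tweet from the three tool outputs."""
    table, fallback = model
    return table.get(tuple(outputs), fallback)


# Hypothetical toy data: each tuple holds the labels returned by three tools.
train_x = [
    ("neg", "neu", "neg"),
    ("neg", "neu", "neg"),
    ("pos", "pos", "neu"),
    ("neu", "neu", "neu"),
    ("neu", "pos", "neu"),
    ("neu", "pos", "neu"),
]
train_y = ["neg", "neg", "pos", "neu", "pos", "pos"]

model = fit_meta_model(train_x, train_y)
print(predict(model, ("neu", "pos", "neu")))
```

In this toy data, a plain majority vote over ("neu", "pos", "neu") would return "neu", while the learned table returns "pos" because that is what the gold labels say for this pattern. Correcting systematic tool errors in this way is the kind of improvement a meta-model is meant to provide.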

2020 ◽  
Author(s):  
Phyo Phyo Zin ◽  
Xinhao Li ◽  
Dhoha TRIKI ◽  
Denis Fourches

This study presents CryptoChem, a new method and associated software to securely store and transfer information using chemicals. Relying on the concept of Big Chemical Data, molecular descriptors and machine learning techniques, CryptoChem offers a highly complex and robust system with multiple layers of security for transmitting confidential information. This revolutionary technology adds fully untapped layers of complexity and is thus of relevance for different types of applications and users. The algorithm directly uses chemical structures and their properties as the central element of the secured storage. QSDR (Quantitative Structure-Data Relationship) models are used as private keys to encode and decode the data. Herein, we validate the software with a series of five datasets consisting of numerical and textual information with increasing size and complexity. We discuss (i) the initial concept and current features of CryptoChem, (ii) the associated Molread and Molwrite programs, which encode messages as series of molecules and decode them with an ensemble of QSDR machine learning models, (iii) the Analogue Retriever and Label Swapper methods, which enforce additional layers of security, (iv) the results of encoding and decoding the five datasets using CryptoChem, and (v) the comparison of CryptoChem to contemporary encryption methods. CryptoChem is freely available for testing at https://github.com/XinhaoLi74/CryptoChem


2020 ◽  
Author(s):  
Hao Li ◽  
Liqian Cui ◽  
Liping Cao ◽  
Yizhi Zhang ◽  
Yueheng Liu ◽  
...  

Background: Bipolar disorder (BPD) is a common mood disorder that often goes misdiagnosed or undiagnosed. Recently, machine learning techniques have been combined with neuroimaging methods to aid in the diagnosis of BPD. However, most studies have focused on the construction of classifiers based on single-modality MRI. Hence, in this study, we aimed to construct a support vector machine (SVM) model using a combination of structural and functional MRI, which could be used to accurately identify patients with BPD. Methods: In total, 44 patients with BPD and 36 healthy controls were enrolled in the study. Clinical evaluation and MRI scans were performed for each subject. Next, image pre-processing, VBM and ReHo analyses were performed. The ReHo values of each subject in the clusters showing significant differences were extracted. The LASSO approach was then used to screen features. Based on the selected features, the SVM model was established, and discriminant analysis was performed. Results: After using the two-sample t-test with multiple comparisons, a total of 8 clusters were extracted from the data (VBM = 6; ReHo = 2). Next, we used both VBM and ReHo data to construct the new SVM classifier, which could effectively identify patients with BPD at an accuracy of 87.5% (95% CI: 72.5–95.3%), sensitivity of 86.4% (95% CI: 64.0–96.4%), and specificity of 88.9% (95% CI: 63.9–98.0%) in the test data (p = 0.0022). Conclusions: A combination of structural and functional MRI can be of added value in the construction of SVM classifiers to aid in the accurate identification of BPD in the clinic.


2020 ◽  
Author(s):  
Hao Li ◽  
Liqian Cui ◽  
Liping Cao ◽  
Yizhi Zhang ◽  
Yueheng Liu ◽  
...  

Background: Bipolar disorder (BPD) is a common mood disorder that often goes misdiagnosed or undiagnosed. Recently, machine learning techniques have been combined with neuroimaging methods to aid in the diagnosis of BPD. However, most studies have focused on the construction of classifiers based on single-modality MRI. Hence, in this study, we aimed to construct a support vector machine (SVM) model using a combination of structural and functional MRI, which could be used to accurately identify patients with BPD. Methods: In total, 44 patients with BPD and 36 healthy controls were enrolled in the study. Clinical evaluation and MRI scans were performed for each subject. Next, image pre-processing, VBM and ReHo analyses were performed. The ReHo values of each subject in the clusters showing significant differences were extracted. The LASSO approach was then used to screen features. Based on the selected features, the SVM model was established, and discriminant analysis was performed. Results: After using the two-sample t-test with multiple comparisons, a total of 8 clusters were extracted from the data (VBM = 6; ReHo = 2). Next, we used both VBM and ReHo data to construct the new SVM classifier, which could effectively identify patients with BPD at an accuracy of 87.5% (95% CI: 72.5–95.3%), sensitivity of 86.4% (95% CI: 64.0–96.4%), and specificity of 88.9% (95% CI: 63.9–98.0%) in the test data (p = 0.0022). Limitations: The sample size was small, and we were unable to eliminate the potential effects of medications. Conclusions: A combination of structural and functional MRI can be of added value in the construction of SVM classifiers to aid in the accurate identification of BPD in the clinic.


2020 ◽  
Vol 20 (1) ◽  
Author(s):  
Hao Li ◽  
Liqian Cui ◽  
Liping Cao ◽  
Yizhi Zhang ◽  
Yueheng Liu ◽  
...  

Background: Bipolar disorder (BPD) is a common mood disorder that often goes misdiagnosed or undiagnosed. Recently, machine learning techniques have been combined with neuroimaging methods to aid in the diagnosis of BPD. However, most studies have focused on the construction of classifiers based on single-modality MRI. Hence, in this study, we aimed to construct a support vector machine (SVM) model using a combination of structural and functional MRI, which could be used to accurately identify patients with BPD. Methods: In total, 44 patients with BPD and 36 healthy controls were enrolled in the study. Clinical evaluation and MRI scans were performed for each subject. Next, image pre-processing, VBM and ReHo analyses were performed. The ReHo values of each subject in the clusters showing significant differences were extracted. The LASSO approach was then used to screen features. Based on the selected features, the SVM model was established, and discriminant analysis was performed. Results: After using the two-sample t-test with multiple comparisons, a total of 8 clusters were extracted from the data (VBM = 6; ReHo = 2). Next, we used both VBM and ReHo data to construct the new SVM classifier, which could effectively identify patients with BPD at an accuracy of 87.5% (95% CI: 72.5–95.3%), sensitivity of 86.4% (95% CI: 64.0–96.4%), and specificity of 88.9% (95% CI: 63.9–98.0%) in the test data (p = 0.0022). Conclusions: A combination of structural and functional MRI can be of added value in the construction of SVM classifiers to aid in the accurate identification of BPD in the clinic.
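As a worked illustration of how accuracy, sensitivity, and specificity relate, the sketch below computes all three from confusion-matrix counts. The counts are an assumption chosen only because they reproduce the reported percentages; the abstract does not state the size or composition of the test set.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of patients correctly identified
    specificity = tn / (tn + fp)  # share of controls correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Hypothetical counts (not taken from the paper): 22 patients, 18 controls.
acc, sens, spec = classification_metrics(tp=19, fn=3, tn=16, fp=2)
print(f"accuracy={acc:.1%} sensitivity={sens:.1%} specificity={spec:.1%}")
# prints: accuracy=87.5% sensitivity=86.4% specificity=88.9%
```

With a test set this small, one additional misclassified subject moves each metric by several percentage points, which is consistent with the wide confidence intervals reported.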


2016 ◽  
Vol 2016 ◽  
pp. 1-11 ◽  
Author(s):  
Jieru Zhang ◽  
Ying Ju ◽  
Huijuan Lu ◽  
Ping Xuan ◽  
Quan Zou

Cancerlectins are cancer-related proteins that function as lectins. They have been identified through computational identification techniques, but these techniques have sometimes failed to identify proteins because of sequence diversity among the cancerlectins. Advanced machine learning methods, such as support vector machines with basic sequence features (n-grams), have also been used to identify cancerlectins. In this study, various protein fingerprint features and advanced classifiers, including ensemble learning techniques, were utilized to identify this group of proteins. We improved the prediction accuracy of the original feature extraction methods and classification algorithms by more than 10% on average. Our work provides a basis for the computational identification of cancerlectins and reveals the power of hybrid machine learning techniques in computational proteomics.


2021 ◽  
Vol 17 (1) ◽  
pp. 19-24
Author(s):  
Siti Masturoh ◽  
Fitra Septia Nugraha ◽  
Siti Nurlela ◽  
M. Rangga Ramadhan Saelan ◽  
Daniati Uki Eka Saputri ◽  
...  

Telemarketing is considered an effective way to promote a product to consumers by telephone; it is also more readily accepted because products are offered to consumers directly. Telemarketing is also considered to help increase a company's revenue. Predicting the success of a bank's telemarketing campaign can be addressed with machine learning techniques. In this study, machine learning is applied to available historical data, a bank dataset of 45,211 instances with 17 features, using the multilayer perceptron (MLP) algorithm with resampling. Resampling is used to balance the unbalanced data, resulting in an accuracy of 90.18% and a ROC value of 0.89. Without resampling, the MLP algorithm achieves an accuracy of 88.6% and a ROC value of 0.88. Resampling the data is therefore more effective and yields higher accuracy.
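The abstract does not say which resampling scheme was used. As a minimal sketch, assuming random oversampling of the minority class (one common variant), balancing could look like the following. The function name, the seed, and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Draw extra rows with replacement to reach the majority-class size.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: 8 "no" outcomes and 2 "yes" outcomes.
rows = [[i] for i in range(10)]
labels = ["no"] * 8 + ["yes"] * 2

bal_rows, bal_labels = random_oversample(rows, labels)
counts = Counter(bal_labels)
print(counts["no"], counts["yes"])  # prints: 8 8
```

Resampling is normally applied to the training split only. Duplicating rows before the train/test split lets copies of the same instance appear on both sides and inflates the measured accuracy.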


2020 ◽  
Author(s):  
Hao Li ◽  
Liqian Cui ◽  
Liping Cao ◽  
Yizhi Zhang ◽  
Yueheng Liu ◽  
...  

Background: Bipolar disorder (BPD) is a common mood disorder that often goes misdiagnosed or undiagnosed for years. Recently, machine learning techniques have been combined with neuroimaging methods to aid in the diagnosis of BPD. However, most studies have focused on the construction of classifiers based on single-modality MRI. Hence, in this study, we aimed to construct a support vector machine (SVM) model using a combination of structural and functional MRI, which could be used to accurately identify patients with BPD. Methods: In total, 44 patients with BPD and 36 healthy controls were enrolled in the study. Clinical evaluation and MRI scans were performed for each subject. Next, image pre-processing, voxel-based morphometry (VBM), and ReHo analyses were performed. The ReHo values of each subject in the clusters showing significant differences were extracted. The LASSO approach was then used to screen features. Based on the selected features, the SVM model was established, and discriminant analysis was performed. Results: After using the two-sample t-test with multiple comparisons, a total of 8 clusters were extracted from the data (VBM = 6; ReHo = 2). Next, we used both VBM and ReHo data to construct the new SVM classifier, which could effectively identify patients with BPD at an accuracy of 87.5%, sensitivity of 86.4%, and specificity of 88.9% in the test data (p = 0.0022). Conclusions: A combination of structural and functional MRI can be of added value in the construction of SVM classifiers to aid in the accurate identification of BPD in the clinic.

