Application of Machine Learning Techniques in Drug-Target Interactions Prediction

2020 ◽  
Vol 26 ◽  
Author(s):  
Shengli Zhang ◽  
Jiesheng Wang ◽  
Zhenhui Lin ◽  
Yunyun Liang

Background: Drug-target interactions (DTIs) are vital for drug design and drug repositioning. However, traditional lab experiments are both expensive and time-consuming. Various computational methods that apply machine learning techniques have performed efficiently and effectively in the field. Results: Machine learning methods can be divided into three basic categories: supervised, semi-supervised, and unsupervised methods. We reviewed recent representative methods from each category that apply machine learning techniques to DTI prediction and summarized a brief list of databases frequently used in drug discovery. In addition, we compared the advantages and limitations of the methods in each category. Conclusion: Every prediction model has both strengths and weaknesses and should be adopted in proper ways. Three major problems in DTI prediction should be seriously considered: the lack of nonreactive drug-target pair data sets, overoptimistic results due to biases, and the use of regression models for DTI prediction.

2020 ◽  
Author(s):  
Mark Daly Reed ◽  
Timothy James Le Souef ◽  
Elliot Rampono

BACKGROUND Arthritis is a common condition, which frequently involves the hands. Patients with inflammatory arthritis have been shown to experience significant delays in diagnosis. OBJECTIVE We sought to develop and test a screening tool combining an image of a patient’s hands, a short series of questions, and a single examination technique, to determine the most likely diagnosis in a patient presenting with hand arthritis. Machine learning techniques were used to develop separate algorithms for each component, which were combined to produce a diagnosis. METHODS 280 consecutive new patients presenting to a Rheumatology practice with hand arthritis were enrolled. Each patient completed a 9-part questionnaire, had photographs taken of each hand, and had a single examination result recorded. The Rheumatologist’s diagnosis was recorded following a 45-minute consultation. The photograph algorithm was developed from a library of 1000 images, and machine learning techniques were applied to the questionnaire results, training several models against the diagnosis from the Rheumatologist. RESULTS The combined algorithms in this study were able to predict inflammatory arthritis with an accuracy, precision, recall and specificity of 96·8%, 97·2%, 98·6% and 90·5% respectively. Similar results were found when inflammatory arthritis was subclassified into rheumatoid arthritis and psoriatic arthritis. The corresponding figures for osteoarthritis were 79·6%, 85·9%, 61·9% and 92·6%. CONCLUSIONS This study demonstrates a novel application of combined image processing and a patient questionnaire, with applied machine-learning methods, to facilitate the diagnosis of patients presenting with hand arthritis. Preliminary results are encouraging for the application of such techniques in clinical practice. CLINICALTRIAL Not applicable.
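
For readers less familiar with the four measures reported above, the following is a minimal sketch of how accuracy, precision, recall and specificity are derived from a binary confusion matrix. The function name `binary_metrics` and the counts in the example are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and specificity from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all cases classified correctly
        "precision": tp / (tp + fp),     # share of predicted positives that are true positives
        "recall": tp / (tp + fn),        # share of actual positives that were found (sensitivity)
        "specificity": tn / (tn + fp),   # share of actual negatives that were correctly rejected
    }


# Hypothetical counts for illustration only, NOT data from the study.
example = binary_metrics(tp=90, fp=5, fn=10, tn=95)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Reporting specificity alongside recall matters in a screening setting like this one, because a tool can reach high recall simply by labelling most patients as positive.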


2021 ◽  
Author(s):  
Giulio Mario Cappelletti ◽  
Luca Grilli ◽  
Carlo Russo ◽  
Domenico Santoro

Abstract Thanks to the development of increasingly sophisticated machine-learning techniques, it is possible to improve predictions of a given phenomenon. In this paper, after analyzing data on the mobility habits of University of Foggia (UniFG) community members and determining their emissions of pollutants, we applied machine-learning techniques to these data to estimate the quantities of pollutants (in a certain time period) produced by new subjects not present in the data sets, using very little information. In this way, we developed a method that the university could apply to inform new students about what their emissions of pollutants could be in the near future, based on several easily obtainable features. This method could allow the UniFG Rectorate to improve its sustainable mobility policies by encouraging the use of methods that are as appropriate as possible to the users’ needs. In addition, any public or private organization outside the academic environment can use the method, because it requires little information.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Tomoaki Mameno ◽  
Masahiro Wada ◽  
Kazunori Nozaki ◽  
Toshihito Takahashi ◽  
Yoshitaka Tsujioka ◽  
...  

Abstract The purpose of this retrospective cohort study was to create a model for predicting the onset of peri-implantitis by using machine learning methods and to clarify interactions between risk indicators. This study evaluated 254 implants, 127 with and 127 without peri-implantitis, from among 1408 implants with at least 4 years in function. Demographic data and parameters known to be risk factors for the development of peri-implantitis were analyzed with three models: logistic regression, support vector machines, and random forests (RF). RF had the highest performance in predicting the onset of peri-implantitis (AUC: 0.71, accuracy: 0.70, precision: 0.72, recall: 0.66, and f1-score: 0.69). The factor that had the most influence on prediction was implant functional time, followed by oral hygiene. In addition, PCR of more than 50% to 60%, smoking more than 3 cigarettes/day, KMW less than 2 mm, and the presence of fewer than two occlusal supports tended to be associated with an increased risk of peri-implantitis. Moreover, these risk indicators were not independent and had complex effects on each other. The results of this study suggest that peri-implantitis onset was predicted in 70% of cases by RF, which allows consideration of nonlinear relational data with complex interactions.
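
As a quick consistency check on the figures above, the F1-score is conventionally the harmonic mean of precision and recall. The short sketch below applies that standard formula to the reported RF precision and recall; the helper name `f1_score` is introduced here for illustration, and the abstract does not state how the authors computed or averaged their scores.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the random forest model in the abstract.
rf_f1 = f1_score(0.72, 0.66)
print(round(rf_f1, 2))  # 0.69
```

The result agrees with the reported f1-score of 0.69.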


Author(s):  
Gediminas Adomavicius ◽  
Yaqiong Wang

Numerical predictive modeling is widely used in different application domains. Although many modeling techniques have been proposed, and a number of different aggregate accuracy metrics exist for evaluating the overall performance of predictive models, other important aspects, such as the reliability (or confidence and uncertainty) of individual predictions, have been underexplored. We propose to use estimated absolute prediction error as the indicator of individual prediction reliability, which has the benefits of being intuitive and providing highly interpretable information to decision makers, as well as allowing for more precise evaluation of reliability estimation quality. As importantly, the proposed reliability indicator allows the reframing of reliability estimation itself as a canonical numeric prediction problem, which makes the proposed approach general-purpose (i.e., it can work in conjunction with any outcome prediction model), alleviates the need for distributional assumptions, and enables the use of advanced, state-of-the-art machine learning techniques to learn individual prediction reliability patterns directly from data. Extensive experimental results on multiple real-world data sets show that the proposed machine learning-based approach can significantly improve individual prediction reliability estimation as compared with a number of baselines from prior work, especially in more complex predictive scenarios.
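
The idea of treating reliability estimation as an ordinary numeric prediction problem can be sketched as follows. This is a deliberately simple illustration under stated assumptions, not the authors' implementation: the data are synthetic, both models are one-variable least-squares lines, and the names `fit_line`, `predict` and `estimated_abs_error` are hypothetical. A first model predicts the outcome; a second model is then trained on separate data to predict the first model's absolute error.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


# Synthetic data (an assumption for illustration): the noise magnitude grows with x.
xs = list(range(1, 41))
ys = [2 * x + 0.1 * x * (-1) ** x for x in xs]

# One part of the data fits the outcome model; the other part is used to learn its error pattern.
train = [(x, y) for x, y in zip(xs, ys) if x % 4 in (0, 1)]
calib = [(x, y) for x, y in zip(xs, ys) if x % 4 in (2, 3)]

# Step 1: outcome prediction model.
a, b = fit_line([x for x, _ in train], [y for _, y in train])


def predict(x):
    return a + b * x


# Step 2: reliability model, trained to predict the absolute error of step 1.
abs_err = [abs(y - predict(x)) for x, y in calib]
c, d = fit_line([x for x, _ in calib], abs_err)


def estimated_abs_error(x):
    """Estimated absolute prediction error for an individual input (clipped at zero)."""
    return max(0.0, c + d * x)


print(estimated_abs_error(2), estimated_abs_error(40))
```

Because the second step is itself a regression task, any numeric prediction method could stand in for the least-squares line used here, which is the general-purpose property the abstract describes.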


Intrusion is a major threat in which data or a legitimate network is accessed without authorization, using a legitimate user's identity or any of the back doors and vulnerabilities in the network. Intrusion detection system (IDS) mechanisms are developed to detect intrusions at various levels. The objective of this research work is to improve IDS performance by applying machine learning techniques based on decision trees for the detection and classification of attacks. The methodology adopted processes the data sets in three stages. The experimentation is conducted on KDDCUP99 data sets based on the number of features. The Bayesian three modes are analyzed for data sets of different sizes based upon the total number of attacks. The time consumed by the classifier to build the model is analyzed and the accuracy is evaluated.


F1000Research ◽  
2017 ◽  
Vol 6 ◽  
pp. 2012 ◽  
Author(s):  
Hashem Koohy

In the era of explosion in biological data, machine learning techniques are becoming more popular in life sciences, including biology and medicine. This research note examines the rise and fall of the most commonly used machine learning techniques in life sciences over the past three decades.


2020 ◽  
Vol 11 (2) ◽  
pp. 71-85
Author(s):  
Nhat-Vinh Lu ◽  
Trong-Nhan Vuong ◽  
Duy-Tai Dinh

Sensory evaluation plays an important role in the food and consumer goods industry. In recent years, the application of machine learning techniques to support food sensory evaluation has become popular. Many different machine learning methods have been applied and produced positive results in this field. In this article, the authors propose a new method to support sensory evaluation on multiple criteria based on the use of a correlation-based feature selection technique, combined with machine learning methods such as linear regression, multilayer perceptron, support vector machine, and random forest. Experimental results are based on considering the correlation between physicochemical components and sensory factors on the Saigon beer dataset.


2019 ◽  
Vol 119 (3) ◽  
pp. 676-696 ◽  
Author(s):  
Zhongyi Hu ◽  
Raymond Chiong ◽  
Ilung Pranata ◽  
Yukun Bao ◽  
Yuqing Lin

Purpose Malicious web domain identification is of significant importance to the security protection of internet users. With online credibility and performance data, the purpose of this paper is to investigate the use of machine learning techniques for malicious web domain identification by considering the class imbalance issue (i.e. there are more benign web domains than malicious ones). Design/methodology/approach The authors propose an integrated resampling approach to handle class imbalance by combining the synthetic minority oversampling technique (SMOTE) and particle swarm optimisation (PSO), a population-based meta-heuristic algorithm. The authors use SMOTE for oversampling and PSO for undersampling. Findings By applying eight well-known machine learning classifiers, the proposed integrated resampling approach is comprehensively examined using several imbalanced web domain data sets with different imbalance ratios. Compared to five other well-known resampling approaches, experimental results confirm that the proposed approach is highly effective. Practical implications This study not only inspires the practical use of online credibility and performance data for identifying malicious web domains but also provides an effective resampling approach for handling the class imbalance issue in the area of malicious web domain identification. Originality/value Online credibility and performance data are applied to build malicious web domain identification models using machine learning techniques. An integrated resampling approach is proposed to address the class imbalance issue. The performance of the proposed approach is confirmed based on real-world data sets with different imbalance ratios.
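
The oversampling half of the approach can be illustrated with a minimal sketch of the core SMOTE step: a synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority-class neighbours. The function `smote_sketch`, its parameters and the example points are hypothetical, and the PSO-based undersampling described in the abstract is not shown.

```python
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` (excluding itself), by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


# Hypothetical minority-class feature vectors, for illustration only.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=6)
print(new_points)
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay within the region already occupied by the minority class rather than duplicating existing samples.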

