Evolution of Machine Learning Algorithms in the Prediction and Design of Anticancer Peptides

Peptides are promising anticancer agents owing to their ease of synthesis and modification, enhanced tumor penetration, and lower systemic toxicity. However, only limited success has been achieved so far, as the experimental design and synthesis of anticancer peptides (ACPs) are prohibitively costly and time-consuming. Furthermore, the steady increase in protein sequence data from high-throughput sequencing makes it difficult to identify ACPs through experimentation alone, which often involves months or years of speculation and failure. These limitations could be overcome by applying machine learning (ML), a field of artificial intelligence that automates analytical model building for rapid and accurate outcome prediction. ML approaches have recently shown great promise in the rapid discovery of ACPs, as evidenced by the growing number of ML-based anticancer prediction tools. In this review, we aim to provide a comprehensive view of the existing ML approaches for ACP prediction. We first briefly discuss the currently available ACP databases. This is followed by the main text, where the working principles of state-of-the-art ML approaches and their performance based on the underlying ML algorithms are reviewed. Lastly, we discuss the limitations and future directions of ML methods in the prediction of ACPs.

iLearn: an integrated platform and meta-learner for feature engineering, machine-learning analysis and modeling of DNA, RNA and protein sequence data

Briefings in Bioinformatics ◽

10.1093/bib/bbz041 ◽

2019 ◽

Vol 21 (3) ◽

pp. 1047-1057 ◽

Cited By ~ 57

Author(s):

Zhen Chen ◽

Pei Zhao ◽

Fuyi Li ◽

Tatiana T Marquez-Lago ◽

André Leier ◽

...

Keyword(s):

Machine Learning ◽

Dimensionality Reduction ◽

Sequence Data ◽

Machine Learning Algorithms ◽

User Friendliness ◽

Data Set ◽

Protein Sequence Data ◽

Learning Analysis ◽

High Throughput Manner ◽

Online Web

With the explosive growth of biological sequences generated in the post-genomic era, one of the most challenging problems in bioinformatics and computational biology is to computationally characterize sequences, structures and functions in an efficient, accurate and high-throughput manner. A number of online web servers and stand-alone tools have been developed to address this to date; however, all these tools have their limitations and drawbacks in terms of their effectiveness, user-friendliness and capacity. Here, we present iLearn, a comprehensive and versatile Python-based toolkit, integrating the functionality of feature extraction, clustering, normalization, selection, dimensionality reduction, predictor construction, best descriptor/model selection, ensemble learning and results visualization for DNA, RNA and protein sequences. iLearn was designed for users that only want to upload their data set and select the functions they need calculated from it, while all necessary procedures and optimal settings are completed automatically by the software. iLearn includes a variety of descriptors for DNA, RNA and proteins, and four feature output formats are supported so as to facilitate direct output usage or communication with other computational tools. In total, iLearn encompasses 16 different types of feature clustering, selection, normalization and dimensionality reduction algorithms, and five commonly used machine-learning algorithms, thereby greatly facilitating feature analysis and predictor construction. iLearn is made freely available via an online web server and a stand-alone toolkit.
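To make the notion of a sequence descriptor concrete, the sketch below computes two simple, widely used encodings: amino acid composition for proteins and normalized k-mer frequencies for nucleotide sequences. It is an illustration only and does not use or reproduce iLearn's own code or interface; the function names, the alphabets and the example sequences are assumptions made here.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (assumed alphabet)
NUCLEOTIDES = "ACGT"                  # DNA alphabet (assumed; use "ACGU" for RNA)


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in a protein sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def kmer_frequencies(sequence, k=2, alphabet=NUCLEOTIDES):
    """Return normalized counts of every k-mer that can be built from the alphabet."""
    seq = sequence.upper()
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[km] for km in kmers)
    if total == 0:
        raise ValueError("sequence contains no k-mers over the given alphabet")
    return [counts[km] / total for km in kmers]


# Hypothetical example sequences, for illustration only.
protein_features = amino_acid_composition("MKTAYIAKQR")
dna_features = kmer_frequencies("ACGTACGTGG", k=2)
```

Each function returns a fixed-length numeric vector (20 values for the protein encoding, 16 for dinucleotides), which is the form that downstream normalization, selection and classification steps generally expect.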

Machine Learning for Design Optimization of Electromagnetic Devices: Recent Developments and Future Directions

Applied Sciences ◽

10.3390/app11041627 ◽

2021 ◽

Vol 11 (4) ◽

pp. 1627

Author(s):

Yanbin Li ◽

Gang Lei ◽

Gerd Bramerdorfer ◽

Sheng Peng ◽

Xiaodong Sun ◽

...

Keyword(s):

Machine Learning ◽

Design Optimization ◽

Optimization Methods ◽

Machine Learning Algorithms ◽

Cloud Services ◽

Robust Design Optimization ◽

Support Vector ◽

Future Directions ◽

Electromagnetic Devices ◽

Recent Developments

This paper reviews recent developments in design optimization methods for electromagnetic devices, with a focus on machine learning methods. First, recent advances in multi-objective, multidisciplinary, multilevel, topology, fuzzy, and robust design optimization of electromagnetic devices are surveyed. Second, the performance prediction and design optimization of electromagnetic devices based on machine learning algorithms are reviewed, including artificial neural networks, support vector machines, extreme learning machines, random forests, and deep learning. Last, to meet modern requirements for high manufacturing/production quality and lifetime reliability, several promising topics, including the application of cloud services and digital twins, are discussed as future directions for design optimization of electromagnetic devices.

Deep Learning Approaches for Sentiment Analysis Challenges and Future Issues

10.4018/978-1-7998-8161-2.ch003 ◽

2022 ◽

pp. 27-50

Author(s):

Rajalaxmi Prabhu B. ◽

Seema S.

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Model Building ◽

Large Data ◽

Machine Learning Algorithms ◽

Large Data Sets ◽

Data Sets ◽

Learning Approaches ◽

Learning Techniques ◽

Important Challenge

A large amount of user-generated data is available these days from large platforms, blogs, websites, and review sites. These data are usually unstructured. Automatically analyzing sentiment in these data is considered an important challenge. Several machine learning algorithms have been implemented to extract opinions from large data sets, and much research has been undertaken on machine learning approaches to sentiment analysis. Machine learning depends mainly on the data used for model building, and hence suitable feature extraction techniques also need to be applied. In this chapter, several deep learning approaches, their challenges, and future issues are addressed. Deep learning techniques are considered important in predicting the sentiments of users. This chapter aims to analyze deep learning techniques for predicting sentiment and to explain the importance of several approaches for mining opinions and determining sentiment polarity.

Review on the Application of Machine Learning Algorithms in the Sequence Data Mining of DNA

Frontiers in Bioengineering and Biotechnology ◽

10.3389/fbioe.2020.01032 ◽

2020 ◽

Vol 8 ◽

Author(s):

Aimin Yang ◽

Wei Zhang ◽

Jiahao Wang ◽

Ke Yang ◽

Yang Han ◽

...

Keyword(s):

Machine Learning ◽

Data Mining ◽

Sequence Data ◽

Learning Algorithms ◽

Machine Learning Algorithms

Application of different machine learning techniques in identifying features of protein sequence data

2016 1st India International Conference on Information Processing (IICIP) ◽

10.1109/iicip.2016.7975376 ◽

2016 ◽

Author(s):

Swati Mishra ◽

Mukesh Kumar ◽

Santanu Kumar Rath

Keyword(s):

Machine Learning ◽

Protein Sequence ◽

Sequence Data ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Protein Sequence Data

Predicting Alert Source Device using Machine Learning Algorithms

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.d1526.079920 ◽

2020 ◽

Vol 9 (9) ◽

pp. 1-10

Keyword(s):

Neural Network ◽

Machine Learning ◽

Model Building ◽

Learning Algorithm ◽

Learning Algorithms ◽

Research Work ◽

Machine Learning Algorithms ◽

Training Dataset ◽

Imbalanced Dataset ◽

Daunting Task

In a large distributed virtualized environment, predicting the alerting source from its alert text is a daunting task. This paper explores the use of machine learning algorithms to solve this problem. Unfortunately, our training dataset is highly imbalanced: 96% of the alerting data is reported by 24% of the alerting sources. This is the expected dataset in any live distributed virtualized environment, where new versions of a device will have relatively fewer alerts than older devices. Any classification effort with such an imbalanced dataset presents a different set of challenges compared to binary classification. This type of skewed data distribution makes conventional machine learning less effective, especially when predicting alerts from minority device types. Our challenge is to build a robust model that can cope with this imbalanced dataset and achieve a relatively high level of prediction accuracy. This research work started with traditional regression and classification algorithms using a bag-of-words model. Then word2vec and doc2vec models were used to represent the words in vector form, which preserves the semantic meaning of the sentence, so that alerting texts with similar messages have the same vector representation. This vectorized alerting text was used with logistic regression for model building. This yielded better accuracy, but the model is relatively complex and demands more computational resources. Finally, a simple neural network was used for this multi-class text classification problem, built with the Keras and TensorFlow libraries. A simple two-layer neural network yielded 99% accuracy, even though our training dataset was not balanced. This paper presents a qualitative evaluation of the different machine learning algorithms and their respective results. Finally, the two-layer deep learning model was selected as the final solution, since it takes relatively less time and fewer resources while giving better accuracy.
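The first stage described in this abstract, a bag-of-words representation of alert text, can be sketched in a few lines of standard-library Python. The second function shows inverse-frequency class weighting, which is one common way to compensate for a skewed label distribution; the abstract does not say how the authors handled the imbalance, so that part is an assumption, as are the function names and the toy alert texts.

```python
from collections import Counter


def bag_of_words(texts):
    """Map each text to a count vector over a shared, sorted vocabulary."""
    tokenized = [text.lower().split() for text in texts]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for toks in tokenized:
        vec = [0] * len(vocab)
        for tok in toks:
            vec[index[tok]] += 1
        vectors.append(vec)
    return vocab, vectors


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes receive larger weights. This heuristic is an assumption
    for illustration, not the method reported in the paper.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}


# Hypothetical alert texts and source labels, for illustration only.
alerts = ["disk usage high on host", "disk failure on host", "fan speed low"]
sources = ["storage", "storage", "chassis"]

vocab, vectors = bag_of_words(alerts)
weights = inverse_frequency_weights(sources)
```

The count vectors can be passed to any classifier that accepts numeric features, and the per-class weights can be supplied to a weighted loss so that errors on minority sources cost more than errors on the dominant ones.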

Pathways to Consumers’ Minds: Using Machine Learning and Multiple EEG Metrics to Increase Preference Prediction Above and Beyond Traditional Measurements

10.1101/317073 ◽

2018 ◽

Cited By ~ 3

Author(s):

Adam Hakim ◽

Shira Klorfeld ◽

Tal Sela ◽

Doron Friedman ◽

Maytal Shabat-Simon ◽

...

Keyword(s):

Machine Learning ◽

Predictive Power ◽

Rank Order ◽

Marketing Research ◽

Machine Learning Algorithms ◽

Great Promise ◽

Neural Signals ◽

Novel Approach ◽

First Time ◽

Better Than

A basic aim of marketing research is to predict consumers’ preferences and the success of marketing campaigns in the general population. However, traditional behavioral measurements have various limitations, calling for novel measurements to improve predictive power. In this study, we use neural signals measured with electroencephalography (EEG) in order to overcome these limitations. We recorded the EEG signals of subjects as they watched commercials for six food products. We introduce a novel approach in which, instead of using one type of EEG measure, we combine several measures and use state-of-the-art machine learning algorithms to predict subjects’ individual future preferences over the products and the commercials’ population success, as measured by their YouTube metrics. As a benchmark, we acquired measurements of the commercials’ effectiveness using a standard questionnaire commonly used in marketing research. We reached 68.5% accuracy in predicting between the most and least preferred items and a lower-than-chance RMSE score for predicting the rank-order preferences of all six products. We also predicted the commercials’ population success better than chance. Most importantly, we demonstrate for the first time that, for all of our predictions, the EEG measurements increased the prediction power of the questionnaires. Our analysis methods and results show great promise for utilizing EEG measures by managers, marketing practitioners, and researchers as a valuable tool for predicting subjects’ preferences and marketing campaigns’ success.
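A lower-than-chance RMSE for rank-order prediction means the prediction error falls below what a random ordering would produce. The sketch below shows one way such a comparison could be set up: it computes the RMSE between predicted and true ranks, and estimates the chance level as the mean RMSE over all possible orderings of six items. The abstract does not describe how the authors defined chance, so the exhaustive-permutation baseline, the function names and the example ranks are all assumptions.

```python
from itertools import permutations
from math import sqrt


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences of ranks."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def chance_rmse(n_items):
    """Mean RMSE of every possible ranking against the true ranking 1..n_items.

    Enumerates all n_items! permutations, so it is only practical for small n.
    """
    truth = list(range(1, n_items + 1))
    scores = [rmse(perm, truth) for perm in permutations(truth)]
    return sum(scores) / len(scores)


# Hypothetical ranks for six products, for illustration only.
true_ranks = [1, 2, 3, 4, 5, 6]
predicted_ranks = [2, 1, 3, 4, 6, 5]

model_score = rmse(predicted_ranks, true_ranks)
baseline = chance_rmse(6)
beats_chance = model_score < baseline
```

With six items the baseline averages over only 720 orderings, so exhaustive enumeration is cheap; for longer rankings a random sample of permutations would be the usual substitute.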

Automated clinical computational biology: an interpretable machine learning framework to predict disease severity and stratify patients from clinical data

10.31219/osf.io/9xc2j ◽

2018 ◽

Author(s):

Soumya Banerjee

Keyword(s):

Machine Learning ◽

Disease Severity ◽

Clinical Data ◽

Model Building ◽

Learning Experience ◽

Machine Learning Algorithms ◽

Close Collaboration ◽

Learning Framework ◽

Novel Biomarkers ◽

Automated Machine Learning

We outline an automated computational and machine learning framework that predicts disease severity and stratifies patients. We apply our framework to available clinical data. Our algorithm automatically generates insights and predicts disease severity with minimal operator intervention. The computational framework presented here can be used to stratify patients, predict disease severity and propose novel biomarkers for disease. Insights from machine learning algorithms coupled with clinical data may help guide therapy, personalize treatment and help clinicians understand the change in disease over time. Computational techniques like these can be used in translational medicine in close collaboration with clinicians and healthcare providers. Our models are also interpretable, allowing clinicians with minimal machine learning experience to engage in model building. This work is a step towards automated machine learning in the clinic.

A Systematic Review of Machine Learning Algorithms in Cyberbullying Detection: Future Directions and Challenges

Journal of Information Security and Cybercrimes Research ◽

10.26735/gbtv9013 ◽

2021 ◽

Vol 4 (1) ◽

pp. 01-26

Author(s):

Muhammad Arif

Keyword(s):

Machine Learning ◽

Systematic Review ◽

Social Media ◽

Language Processing ◽

Machine Learning Algorithms ◽

Future Directions ◽

Social Media Networks ◽

Current State ◽

Art Research ◽

Cyberbullying Detection

Social media networks are becoming an essential part of life for most of the world’s population. Detecting cyberbullying using machine learning and natural language processing algorithms is getting the attention of researchers. There is a growing need for automatic detection and mitigation of cyberbullying events on social media. In this study, research directions and the theoretical foundation in this area are investigated. A systematic review of the current state-of-the-art research in this area is conducted. A framework considering all possible actors in the cyberbullying event must be designed, including various aspects of cyberbullying and its effect on the participating actors. Furthermore, future directions and challenges are also discussed.

Heterogeneous Network Representation Learning

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/677 ◽

2020 ◽

Cited By ~ 2

Author(s):

Yuxiao Dong ◽

Ziniu Hu ◽

Kuansan Wang ◽

Yizhou Sun ◽

Jie Tang

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Heterogeneous Network ◽

Representation Learning ◽

Machine Learning Algorithms ◽

Future Directions ◽

Relational Properties ◽

Open Research ◽

Different Types ◽

Graph Neural Networks

Representation learning has offered a revolutionary learning paradigm for various AI domains. In this survey, we examine and review the problem of representation learning with a focus on heterogeneous networks, which consist of different types of vertices and relations. The goal of this problem is to automatically project objects, most commonly vertices, in an input heterogeneous network into a latent embedding space such that both the structural and relational properties of the network can be encoded and preserved. The embeddings (representations) can then be used as features for machine learning algorithms that address the corresponding network tasks. To learn expressive embeddings, current research developments fall into two major categories: shallow embedding learning and graph neural networks. After a thorough review of the existing literature, we identify several critical challenges that remain unaddressed and discuss future directions. Finally, we build the Heterogeneous Graph Benchmark to facilitate open research on this rapidly developing topic.
