Startup Investment Decision Support: Application of Venture Capital Scorecards Using Machine Learning Approaches

This research uses machine learning methods to explore which kinds of metrics are most valuable to a venture capital firm when making investment decisions. We measure how well developed companies fit a venture capital firm’s investment thesis with a balanced scorecard based on quantitative and qualitative characteristics of the companies. Collaborating with the management team of Rose Street Capital (RSC), we explore the most influential factors of their balanced scorecard using their retrospective investment decisions on successful and failed startup companies. Our study employs six standard machine learning models and their counterparts with an additional feature selection technique. Our findings suggest that “planning strategy” and “team management” are the two most determinant factors in the firm’s investment decisions, implying that qualitative factors could be more important to startup evaluation. Furthermore, we analyze which machine learning models are most accurate in predicting the firm’s investment decisions. Our experimental results demonstrate that the best machine learning models achieve an overall accuracy of 78% in making the correct investment decisions, averaging 87% and 69% when predicting decisions on companies the firm would and would not have invested in, respectively. Our study provides evidence that qualitative criteria could be more influential in investment decisions and that machine learning models can be adapted to help identify which criteria may matter most to a venture capital firm.
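The relationship between the overall accuracy and the two per-class figures quoted above can be illustrated with a minimal sketch. The labels and predictions below are hypothetical, not RSC data, and the helper name `per_class_accuracy` is an assumption introduced for illustration. The reported numbers are consistent with roughly equal class sizes, since (87% + 69%) / 2 = 78%, although the abstract does not state the class balance.

```python
def per_class_accuracy(y_true, y_pred, label):
    """Fraction of cases whose true label is `label` that were predicted correctly."""
    relevant = [(t, p) for t, p in zip(y_true, y_pred) if t == label]
    return sum(t == p for t, p in relevant) / len(relevant)


# Hypothetical decisions: 1 = "would invest", 0 = "would not invest".
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 1, 1]

acc_invest = per_class_accuracy(y_true, y_pred, 1)
acc_pass = per_class_accuracy(y_true, y_pred, 0)
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# With equally sized classes, overall accuracy equals the mean of the
# per-class accuracies; with unequal classes the two can differ.
balanced = (acc_invest + acc_pass) / 2

print(acc_invest, acc_pass, overall, balanced)
```

Reporting the per-class figures alongside the overall figure, as the abstract does, shows whether a model is stronger at recognizing one kind of decision than the other.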

Machine learning approaches to understand and predict rate constants for organic processes in mixtures containing ionic liquids

Physical Chemistry Chemical Physics ◽ 10.1039/d0cp04227g ◽ 2021 ◽ Vol 23 (4) ◽ pp. 2742-2752

Author(s): Tamar L. Greaves ◽ Karin S. Schaffarczyk McHale ◽ Raphael F. Burkart-Radke ◽ Jason B. Harper ◽ Tu C. Le

Keyword(s): Machine Learning ◽ Ionic Liquids ◽ Rate Constants ◽ Learning Approaches ◽ Learning Models ◽ Organic Reaction ◽ Machine Learning Models ◽ Selection Of

Machine learning models were developed for an organic reaction in ionic liquids and validated on a selection of ionic liquids.

EVALUATING INTONATIONAL FEATURES FOR EMOTION RECOGNITION FROM SPEECH

International Journal of Artificial Intelligence Tools ◽ 10.1142/s0218213007003679 ◽ 2007 ◽ Vol 16 (06) ◽ pp. 1001-1014 ◽ Cited By ~ 1

Author(s): PANAGIOTIS ZERVAS ◽ IOSIF MPORAS ◽ NIKOS FAKOTAKIS ◽ GEORGE KOKKINAKIS

Keyword(s): Machine Learning ◽ Decision Tree ◽ Emotion Recognition ◽ Bayesian Learning ◽ Experimental Results ◽ Speech Signals ◽ Learning Approaches ◽ Learning Models ◽ C4.5 Decision Tree ◽ Machine Learning Models

This paper presents and discusses the problem of emotion recognition from speech signals using features that carry intonational information. In particular, parameters extracted from Fujisaki's model of intonation are presented and evaluated. Machine learning models were built using the C4.5 decision tree inducer, an instance-based learner, and Bayesian learning. The datasets used to train the machine learning models were extracted from two emotional databases of acted speech. Experimental results showed the effectiveness of the Fujisaki model attributes: they enhanced the recognition process for most of the emotion categories and learning approaches, helping to separate the emotion categories.

Sentiment Analysis and Topic Modeling on Tweets about Online Education during COVID-19

Applied Sciences ◽ 10.3390/app11188438 ◽ 2021 ◽ Vol 11 (18) ◽ pp. 8438

Author(s): Muhammad Mujahid ◽ Ernesto Lee ◽ Furqan Rustam ◽ Patrick Bernard Washington ◽ Saleem Ullah ◽ ...

Keyword(s): Machine Learning ◽ Deep Learning ◽ Online Education ◽ Sentiment Analysis ◽ Topic Modeling ◽ Support Vector ◽ Learning Approaches ◽ Learning Models ◽ E Learning ◽ Machine Learning Models

Amid the worldwide COVID-19 pandemic lockdowns, the closure of educational institutes led to an unprecedented rise in online learning. To limit the impact of COVID-19 and curb its spread, educational institutions closed their campuses immediately and academic activities were moved to e-learning platforms. The effectiveness of e-learning is a critical concern for both students and parents, specifically in terms of its suitability for students and teachers and its technical feasibility in different social scenarios. Such concerns must be reviewed from several aspects before e-learning can be adopted at such a large scale. This study investigates the effectiveness of e-learning by analyzing people's sentiments about it. With the recent rise of social media as an important mode of communication, people’s views can be found on platforms such as Twitter, Instagram, and Facebook. This study uses a Twitter dataset containing 17,155 tweets about e-learning. Machine learning and deep learning approaches have shown their suitability, capability, and potential for image processing, object detection, and natural language processing tasks, and text analysis is no exception. Machine learning approaches have been widely used for annotation as well as for text and sentiment analysis. In view of the adequacy and efficacy of machine learning models, this study adopts TextBlob, VADER (Valence Aware Dictionary for Sentiment Reasoning), and SentiWordNet to analyze the polarity and subjectivity scores of the tweets’ text. Furthermore, because machine learning models display high classification accuracy, various machine learning models have been used for sentiment classification. Two feature extraction techniques, TF-IDF (Term Frequency-Inverse Document Frequency) and BoW (Bag of Words), have been used to build and evaluate the models. All the models have been evaluated in terms of important performance metrics such as accuracy, precision, recall, and F1 score. The results reveal that the random forest and support vector machine classifiers achieve the highest accuracy of 0.95 when used with BoW features. A performance comparison is carried out between the results of TextBlob, VADER, and SentiWordNet, as well as the classification results of machine learning models and deep learning models such as CNN (Convolutional Neural Network), LSTM (Long Short Term Memory), CNN-LSTM, and Bi-LSTM (Bidirectional-LSTM). Additionally, topic modeling is performed to find the problems associated with e-learning; it indicates that uncertainty about the campus opening date, children’s inability to grasp online education, and the lack of efficient networks for online education are the top three problems.
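The two feature extraction techniques named in this abstract, BoW and TF-IDF, can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the three example tweets are invented, tokenization is a simple whitespace split, every document is assumed to be non-empty, and the unsmoothed weighting `tf * log(N / df)` is only one common variant (library implementations typically apply smoothing and normalization).

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return the sorted vocabulary and one term-count row per document."""
    vocab = sorted({word for doc in docs for word in doc.lower().split()})
    rows = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        rows.append([counts[word] for word in vocab])
    return vocab, rows


def tf_idf(docs):
    """Weight each count by term frequency times log(N / document frequency)."""
    vocab, rows = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: in how many documents each vocabulary word occurs.
    doc_freq = [sum(1 for row in rows if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / df) for df in doc_freq]
    weighted = []
    for row in rows:
        total = sum(row)  # assumes the document is not empty
        weighted.append([(count / total) * w for count, w in zip(row, idf)])
    return vocab, weighted


# Hypothetical tweets, for illustration only.
tweets = ["online class is good", "online class is bad", "network is slow"]

vocab, bow_rows = bag_of_words(tweets)
_, weights = tf_idf(tweets)

print(vocab)
print(bow_rows[0])
print(weights[0])
```

In this sketch a word that appears in every tweet ("is") receives a TF-IDF weight of zero, while BoW keeps its raw count. That is the main practical difference between the two representations that a downstream classifier sees.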

Significance Of Multilayer Perceptron Model For Early Detection Of Diabetes Over ML Methods

Journal of University of Shanghai for Science and Technology ◽ 10.51201/jusst/21/08358 ◽ 2021 ◽ Vol 23 (08) ◽ pp. 148-160

Author(s): Dr. V.Vasudha Rani ◽ Dr. G. Vasavi ◽ Dr. K.R.N Kiran Kumar ◽ ...

Keyword(s): Machine Learning ◽ Multilayer Perceptron ◽ Predictive Analytics ◽ Early Stage ◽ Feature Selection Method ◽ Health Condition ◽ Performance Comparison ◽ Learning Approaches ◽ Learning Models ◽ Machine Learning Models

Diabetes is a chronic disease that affects people worldwide. Every year, millions of people suffer from other health issues caused by diabetes. Diabetes has three stages: type 2, type 1, and insulin. Curing diabetes at later stages is difficult in practice. In this paper, we propose a DNN model and compare its performance with several machine learning models for predicting the disease at an early stage based on the patient's current health condition. An artificial neural network (ANN) is a predictive model designed to work the way a human brain does, and it works better with larger datasets. Because they use hidden layers, neural networks perform well at predictive analytics and can make more accurate predictions. The novelty of this work lies in integrating a feature selection method that optimizes the Multilayer Perceptron (MLP) by reducing the number of required input attributes. The results achieved with this method are compared with those of several conventional machine learning approaches, such as Logistic Regression and Random Forest Classifier (RFC). The proposed DNN method shows better accuracy than the machine learning models for early-stage detection of diabetes. This work is applicable to clinical support as a tool that helps doctors and physicians make preliminary decisions.

FedSpeech: Federated Text-to-Speech with Continual Learning

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽ 10.24963/ijcai.2021/527 ◽ 2021

Author(s): Ziyue Jiang ◽ Yi Ren ◽ Ming Lei ◽ Zhou Zhao

Keyword(s): Machine Learning ◽ Global Model ◽ Learning Approaches ◽ Learning Models ◽ Text To Speech ◽ Training Samples ◽ Task Training ◽ Collaborative Training ◽ Machine Learning Models ◽ Continual Learning

Federated learning enables collaborative training of machine learning models under strict privacy restrictions, and federated text-to-speech aims to synthesize natural speech for multiple users from a few audio training samples stored locally on their devices. However, federated text-to-speech faces several challenges: very few training samples are available from each speaker, the training samples are all stored on each user's local device, and the global model is vulnerable to various attacks. In this paper, we propose a novel federated learning architecture based on continual learning approaches to overcome these difficulties. Specifically, 1) we use gradual pruning masks to isolate parameters for preserving speakers' tones; 2) we apply selective masks to effectively reuse knowledge from tasks; 3) we introduce a private speaker embedding to protect users' privacy. Experiments on a reduced VCTK dataset demonstrate the effectiveness of FedSpeech: it nearly matches multi-task training in terms of multi-speaker speech quality; moreover, it sufficiently retains the speakers' tones and even outperforms multi-task training in the speaker similarity experiment.

Machine learning approaches for predicting difficult airway and first-pass success in the emergency department: A multiple prospective observational study (Preprint)

10.2196/preprints.28366 ◽ 2021

Author(s): Syunsuke Yamanaka ◽ Tadahiro Goto ◽ Koji Morikawa ◽ Hiroko Watase ◽ Hiroshi Okamoto ◽ ...

Keyword(s): Machine Learning ◽ Random Forest ◽ Difficult Airway ◽ Nearest Neighbor ◽ Learning Approaches ◽ Learning Models ◽ Discrimination Ability ◽ C Statistic ◽ First Pass ◽ Machine Learning Models

BACKGROUND: There is still room for improvement in the modified LEMON criteria for difficult airway prediction, and there is no prediction tool for first-pass success in the ED.

OBJECTIVE: We applied modern machine learning approaches to predict difficult airway and first-pass success.

METHODS: In a multicenter prospective study that enrolled consecutive patients who underwent tracheal intubation in 13 EDs, we developed seven machine learning models (e.g., random forest model) using routinely collected data (e.g., demographics, initial airway assessment). The outcomes were difficult airway and first-pass success. Model performance was evaluated by c-statistics, calibration slope, and association measures (e.g., sensitivity) in the test set (a randomly selected 20% of the data). Performance was compared with the modified LEMON criteria for the difficult airway and with a logistic regression model for first-pass success.

RESULTS: Of 10,741 patients who underwent intubation, 543 patients (5%) had a difficult airway, and 7,690 patients (71%) had first-pass success. In predicting the difficult airway, the machine learning models (except for k-point nearest neighbor and multilayer perceptron) had a higher discrimination ability than the modified LEMON criteria (P<0.01). For example, the ensemble method had the highest c-statistic (0.74 vs 0.62 for the modified LEMON criteria; P<0.01). For first-pass success, the machine learning models (except for k-point nearest neighbor and random forest models) had a higher discrimination ability. In particular, the ensemble model had the highest c-statistic (0.81 vs 0.76 for the reference regression; P<0.01).

CONCLUSIONS: Machine learning models demonstrated a greater ability to predict difficult airway and first-pass success in the ED.
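The c-statistic used above to compare discrimination ability can be computed directly from its definition: the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The sketch below is a minimal pairwise implementation on invented labels and scores; it is not the study's evaluation code, and the function name `c_statistic` is an assumption for illustration.

```python
def c_statistic(y_true, scores):
    """Concordance probability over all (positive, negative) pairs.

    Assumes binary labels (1 = event, 0 = no event) and at least one
    case of each class. Ties in score count as half-concordant.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(positives) * len(negatives))


# Hypothetical outcomes and predicted risks, for illustration only.
y_true = [1, 0, 1, 0, 0]
scores = [0.9, 0.3, 0.4, 0.5, 0.1]

c = c_statistic(y_true, scores)
print(round(c, 3))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.62, 0.74, 0.76, and 0.81 should be read. The pairwise loop is quadratic in the number of cases, so rank-based implementations are preferable for a dataset of the size described here.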

Evaluation of Short-Term Freeway Speed Prediction Based on Periodic Analysis Using Statistical Models and Machine Learning Models

Journal of Advanced Transportation ◽ 10.1155/2020/9628957 ◽ 2020 ◽ Vol 2020 ◽ pp. 1-16 ◽ Cited By ~ 19

Author(s): Xiaoxue Yang ◽ Yajie Zou ◽ Jinjun Tang ◽ Jian Liang ◽ Muhammad Ijaz

Keyword(s): Machine Learning ◽ Statistical Models ◽ Prediction Performance ◽ Periodic Component ◽ Learning Approaches ◽ Learning Models ◽ Short Term ◽ Speed Prediction ◽ The Impact ◽ Machine Learning Models

Accurate prediction of traffic information (e.g., traffic flow, travel time, traffic speed) is a key component of an Intelligent Transportation System (ITS). Traffic speed is an important indicator for evaluating traffic efficiency. To date, although a few studies have considered the periodic feature in traffic prediction, very few have comprehensively evaluated the impact of the periodic component on statistical and machine learning prediction models. This paper selects several representative statistical models and machine learning models to analyze the influence of the periodic component on short-term speed prediction under different scenarios: (1) multi-horizon ahead prediction (5, 15, 30, and 60 minutes ahead), (2) with and without the periodic component, (3) two data aggregation levels (5-minute and 15-minute), (4) peak hours and off-peak hours. Specifically, three statistical models (i.e., space time (ST) model, vector autoregressive (VAR) model, autoregressive integrated moving average (ARIMA) model) and three machine learning approaches (i.e., support vector machines (SVM) model, multi-layer perceptron (MLP) model, recurrent neural network (RNN) model) are developed and examined. Furthermore, the periodic features of the speed data are considered via a hybrid prediction method, which assumes that the data consist of two components: a periodic component and a residual component. The periodic component is described by a trigonometric regression function, and the residual component is modeled by the statistical models or the machine learning approaches. The main conclusions can be summarized as follows: (1) the multi-step ahead prediction accuracy improves when the periodic component of the speed data is considered, for all three statistical models and all three machine learning models, especially in peak hours; (2) when the periodic component is considered, the improvement in prediction performance for all models grows as the time step increases; (3) under the same prediction horizon, the prediction performance of all models for 15-minute speed data is generally better than that for 5-minute speed data. Overall, the findings in this paper suggest that the proposed hybrid prediction approach is effective for both statistical and machine learning models in short-term speed prediction.
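The hybrid decomposition described in this abstract, a periodic component fitted by trigonometric regression plus a residual component passed to another model, can be sketched as follows. This is a simplified illustration rather than the paper's method: it assumes a single harmonic with a known period, evenly spaced samples spanning a whole number of periods (at least three samples per period), in which case the least-squares coefficients have the closed form used below. The speed values are synthetic, and the function names are hypothetical.

```python
import math


def fit_periodic(y, period):
    """Least-squares fit of mean + a*cos(2*pi*t/period) + b*sin(2*pi*t/period).

    The closed form relies on len(y) being a whole multiple of `period`,
    which makes the constant, cosine, and sine regressors orthogonal.
    """
    n = len(y)
    mean = sum(y) / n
    a = 2.0 / n * sum(v * math.cos(2 * math.pi * t / period) for t, v in enumerate(y))
    b = 2.0 / n * sum(v * math.sin(2 * math.pi * t / period) for t, v in enumerate(y))
    return mean, a, b


def periodic_value(t, period, mean, a, b):
    """Evaluate the fitted periodic component at time index t."""
    angle = 2 * math.pi * t / period
    return mean + a * math.cos(angle) + b * math.sin(angle)


# Synthetic speeds: two full cycles of a 12-step pattern around 60.
period = 12
speeds = [60 + 10 * math.cos(2 * math.pi * t / period) for t in range(24)]

mean, a, b = fit_periodic(speeds, period)

# Residual component: what remains after removing the periodic part.
# In the hybrid approach this series is what a statistical or machine
# learning model would be trained on; here it is only computed.
residuals = [v - periodic_value(t, period, mean, a, b) for t, v in enumerate(speeds)]

# A forecast adds the periodic value at the future step to the residual
# model's prediction (taken as zero in this noise-free example).
forecast = periodic_value(24, period, mean, a, b)
print(round(mean, 6), round(a, 6), round(b, 6), round(forecast, 6))
```

Because the synthetic series is purely periodic, the residuals here are numerically zero. On real speed data they would carry the non-periodic variation that the ARIMA, SVM, or neural network models in the study are meant to capture.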

Machine Learning in Epidemiology and Health Outcomes Research

Annual Review of Public Health ◽ 10.1146/annurev-publhealth-040119-094437 ◽ 2020 ◽ Vol 41 (1) ◽ pp. 21-36 ◽ Cited By ~ 6

Author(s): Timothy L. Wiemken ◽ Robert R. Kelley

Keyword(s): Machine Learning ◽ Health Outcomes ◽ Outcomes Research ◽ Treatment Effects ◽ Supervised Machine Learning ◽ Learning Approaches ◽ Learning Models ◽ Health Outcomes Research ◽ Daunting Task ◽ Machine Learning Models

Machine learning approaches to modeling epidemiologic data are becoming increasingly prevalent in the literature. These methods have the potential to improve our understanding of health and opportunities for intervention, far beyond our past capabilities. This article provides a walkthrough for creating supervised machine learning models with current examples from the literature. From identifying an appropriate sample and selecting features through training, testing, and assessing performance, the end-to-end approach to machine learning can be a daunting task. We take the reader through each step in the process and discuss novel concepts in the area of machine learning, including identifying treatment effects and explaining the output from machine learning models.

Predicting Energy Demand in Semi-Remote Arctic Locations

Energies ◽ 10.3390/en14040798 ◽ 2021 ◽ Vol 14 (4) ◽ pp. 798

Author(s): Odin Foldvik Eikeland ◽ Filippo Maria Bianchi ◽ Harry Apostoleris ◽ Morten Hansen ◽ Yu-Cheng Chiou ◽ ...

Keyword(s): Machine Learning ◽ Rural Areas ◽ Energy Demand ◽ Alternative Energy ◽ Distribution Network ◽ Energy Resources ◽ The Arctic ◽ Learning Approaches ◽ Learning Models ◽ Machine Learning Models

Forecasting energy demand within a distribution network is essential for developing strategies to manage and optimize available energy resources and the associated infrastructure. In this study, we consider remote communities in the Arctic located at the end of the radial distribution network without an alternative energy supply. It is therefore crucial to develop an accurate forecasting model to manage and optimize the limited energy resources available. We first compare the accuracy of several models that perform short- and medium-term load forecasts in rural areas, where a single industrial customer dominates the electricity consumption. We consider both statistical methods and machine learning models to predict energy demand. Then, we evaluate the transferability of each method to a rural geographical area different from the one considered for training. Our results indicate that statistical models achieve higher accuracy on longer forecast horizons relative to neural networks, while the machine learning approaches perform better in predicting load at shorter time intervals. The machine learning models also exhibit good transferability, as they predict the load well at new locations that were not accounted for during training. Our work will serve as a guide for selecting the appropriate prediction model and applying it to energy load forecasting in rural areas and in locations where historical consumption data may be limited or even unavailable.

A pan-ontology view of machine-derived knowledge representations and feedback mechanisms for curation

10.1101/2021.03.02.433532 ◽ 2021

Author(s): Tomasz Konopka ◽ Damian Smedley

Keyword(s): Machine Learning ◽ Biological Research ◽ Formal Ontology ◽ Learning Approaches ◽ Learning Models ◽ Knowledge Representations ◽ Plain Text ◽ Research Areas ◽ Internal Properties ◽ Machine Learning Models

Biomedical ontologies are established tools that organize knowledge in specialized research areas. They can also be used to train machine-learning models. However, it is unclear to what extent representations of ontology concepts learned by machine-learning models capture the relationships intended by ontology curators. It is also unclear whether the representations can provide insights to improve the curation process. Here, we investigate ontologies from across the spectrum of biological research and assess the concordance of formal ontology hierarchies with representations based on plain-text definitions. By comparing the internal properties of each ontology, we describe general patterns across the pan-ontology landscape and pinpoint areas with discrepancies in individual domains. We suggest specific mechanisms through which machine-learning approaches can lead to clarifications of ontology definitions. Synchronizing patterns in machine-derived representations with those intended by the ontology curators will likely streamline the use of ontologies in downstream applications.
