What factors determine reviewer credibility?

Purpose The experiential nature of travel and tourism services has increased the importance of electronic word-of-mouth (EWOM) among potential customers. EWOM has a significant influence on customers’ hotel booking intentions, as they tend to trust EWOM more than the messages spread by marketers. Amid the abundant reviews available online, it becomes difficult for travelers to identify the most significant ones. This calls the credibility of reviewers into question, as various online businesses allow reviewers to post their feedback using a nickname or email address rather than a real name, photo or other personal information. Therefore, this study aims to determine the factors leading to reviewer credibility. Design/methodology/approach The paper proposes an econometric model to determine the variables that affect the reviewer’s credibility in the hospitality and tourism sector. The proposed model uses quantifiable variables of reviewers and reviews to estimate reviewer credibility, defined as the ratio of the number of helpful votes a reviewer has received to the total number of reviews that reviewer has written. This covers both aspects of source credibility, i.e. trustworthiness and expertness. The authors have used a data set from TripAdvisor.com to validate the models. Findings Regression analysis significantly validated the econometric models proposed here. To check the predictive efficiency of the models, predictive modeling is performed using five commonly used classifiers, namely random forest (RF), linear discriminant analysis, k-nearest neighbor, decision tree and support vector machine. RF gave the best accuracy for the overall model. Practical implications The findings of this research paper suggest various implications for hoteliers and managers to help retain credible reviewers in the online travel community. This will help them to achieve long-term relationships with clients and increase their trust in the brand.
Originality/value To the best of the authors’ knowledge, this study applies an econometric modeling approach to find the determinants of reviewer credibility, which has not been done in previous studies. Moreover, the study contrasts with earlier works by treating reviewer credibility as an endogenous variable rather than an exogenous one.
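
The credibility measure described in this abstract (helpful votes received divided by total reviews written) can be illustrated with a minimal Python sketch. The function name, the reviewer labels and the sample counts are hypothetical and are not taken from the paper or from TripAdvisor data.

```python
def reviewer_credibility(helpful_votes: int, total_reviews: int) -> float:
    """Ratio of helpful votes received to reviews written (hypothetical helper)."""
    if total_reviews <= 0:
        raise ValueError("total_reviews must be positive")
    if helpful_votes < 0:
        raise ValueError("helpful_votes must be non-negative")
    return helpful_votes / total_reviews


# Made-up reviewers: (helpful votes received, reviews written)
reviewers = {"reviewer_a": (30, 12), "reviewer_b": (4, 16)}
scores = {name: reviewer_credibility(votes, count)
          for name, (votes, count) in reviewers.items()}

for name, score in scores.items():
    print(f"{name}: {score:.2f} helpful votes per review")
```

Because the ratio is not bounded by 1, a reviewer with few but widely appreciated reviews can score higher than a prolific reviewer whose reviews attract few votes.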

Application of Machine Learning Approaches for the Design and Study of Anticancer Drugs

Current Drug Targets ◽

10.2174/1389450119666180809122244 ◽

2019 ◽

Vol 20 (5) ◽

pp. 488-500 ◽

Cited By ~ 6

Author(s):

Yan Hu ◽

Yi Lu ◽

Shuo Wang ◽

Mengying Zhang ◽

Xiaosheng Qu ◽

...

Keyword(s):

Machine Learning ◽

Drug Design ◽

Anticancer Drugs ◽

Nearest Neighbor ◽

Cost Effective ◽

Support Vector ◽

Learning Approaches ◽

K Nearest Neighbor ◽

Activity Prediction ◽

Linear Discriminant

Background: Globally, the number of cancer patients and cancer deaths continues to increase every year, and cancer has therefore become one of the world's leading causes of morbidity and mortality. In recent years, the study of anticancer drugs has become one of the most popular medical topics. Objective: In this review, in order to study the application of machine learning in predicting anticancer drug activity, several machine learning approaches, namely Linear Discriminant Analysis (LDA), Principal Component Analysis (PCA), Support Vector Machine (SVM), Random Forest (RF), k-Nearest Neighbor (kNN), and Naïve Bayes (NB), were selected, and examples of their applications in anticancer drug design are listed. Results: Machine learning contributes substantially to anticancer drug design and helps researchers by saving time and cost. However, it can only be an assisting tool for drug design. Conclusion: This paper introduces the application of machine learning approaches in anticancer drug design. Many successful examples of identification and prediction in the area of anticancer drug activity are discussed, and anticancer drug research is still progressing actively. Moreover, the merits of some web servers related to anticancer drugs are mentioned.

An Incremental Isomap Method for Hyperspectral Dimensionality Reduction and Classification

Photogrammetric Engineering & Remote Sensing ◽

10.14358/pers.87.7.445 ◽

2021 ◽

Vol 87 (6) ◽

pp. 445-455

Author(s):

Yi Ma ◽

Zezhong Zheng ◽

Yutang Ma ◽

Mingcang Zhu ◽

Ran Huang ◽

...

Keyword(s):

Manifold Learning ◽

Nearest Neighbor ◽

Hyperspectral Image ◽

Hyperspectral Data ◽

Training Data ◽

Support Vector ◽

Data Sets ◽

K Nearest Neighbor ◽

Data Set ◽

Data Points

Many manifold learning algorithms conduct an eigenvector analysis on a data-similarity matrix of size N×N, where N is the number of data points. Thus, the memory complexity of the analysis is no less than O(N^2). We present in this article an incremental manifold learning approach to handle large hyperspectral data sets for land use identification. In our method, the number of dimensions for the high-dimensional hyperspectral-image data set is obtained with the training data set. A local curvature variation algorithm is utilized to sample a subset of data points as landmarks. Then a manifold skeleton is identified based on the landmarks. Our method is validated on three AVIRIS hyperspectral data sets, outperforming the comparison algorithms with a k–nearest-neighbor classifier and achieving the second best performance with support vector machine.
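
To make the quadratic memory bound concrete, the following Python sketch estimates the storage needed for a dense N×N similarity matrix. It assumes dense storage with 8-byte floating-point entries; both the assumption and the function name are illustrative and do not come from the article.

```python
def dense_similarity_matrix_bytes(n_points: int, bytes_per_entry: int = 8) -> int:
    """Bytes needed to store a dense n_points x n_points matrix.

    Assumes every entry is stored explicitly (no sparsity or symmetry savings).
    """
    if n_points < 0 or bytes_per_entry <= 0:
        raise ValueError("n_points must be >= 0 and bytes_per_entry > 0")
    return n_points * n_points * bytes_per_entry


# Memory grows by a factor of 100 each time N grows by a factor of 10.
for n in (1_000, 10_000, 100_000):
    gib = dense_similarity_matrix_bytes(n) / 2**30
    print(f"N={n:>7,}: {gib:,.2f} GiB")
```

This growth is the motivation for working with a sampled subset of landmark points instead of the full similarity matrix.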

Cardiovascular Disease Prediction from Electrocardiogram by using Machine Learning Method: A Snapshot from the Subjects of the Malaysian Cohort

10.21203/rs.2.22561/v1 ◽

2020 ◽

Author(s):

Nazrul Anuar Nayan ◽

Hafifah Ab Hamid ◽

Mohd Zubir Suboh ◽

Noraidatulakma Abdullah ◽

Rosmina Jaafar ◽

...

Keyword(s):

Machine Learning ◽

Cardiovascular Disease ◽

Nearest Neighbor ◽

Total Cholesterol Level ◽

Population Based ◽

Support Vector ◽

K Nearest Neighbor ◽

Cvd Risk ◽

Linear Discriminant ◽

Artificial Neural Network Ann

Abstract Background: Cardiovascular disease (CVD) is the leading cause of deaths worldwide. In 2017, CVD contributed to 13,503 deaths in Malaysia. The current approaches for CVD prediction are usually invasive and costly. Machine learning (ML) techniques allow an accurate prediction by utilizing the complex interactions among relevant risk factors. Results: This study presents a case–control study involving 60 participants from The Malaysian Cohort, which is a prospective population-based project. Five parameters, namely, the R–R interval and root mean square of successive differences extracted from electrocardiogram (ECG), systolic and diastolic blood pressures, and total cholesterol level, were statistically significant in predicting CVD. Six ML algorithms, namely, linear discriminant analysis, linear and quadratic support vector machines, decision tree, k-nearest neighbor, and artificial neural network (ANN), were evaluated to determine the most accurate classifier in predicting CVD risk. ANN, which achieved 90% specificity, 90% sensitivity, and 90% accuracy, demonstrated the highest prediction performance among the six algorithms. Conclusions: In summary, by utilizing ML techniques, ECG data can serve as a good parameter for CVD prediction among the Malaysian multiethnic population.
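
One of the ECG parameters named above, the root mean square of successive differences (RMSSD) between adjacent R–R intervals, can be sketched in a few lines of Python. The function name and the sample intervals are hypothetical; the abstract does not describe the authors' preprocessing or provide raw data.

```python
import math


def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between R-R intervals."""
    if len(rr_intervals_ms) < 2:
        raise ValueError("at least two R-R intervals are required")
    # Differences between each interval and the one that follows it
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Made-up R-R intervals in milliseconds
sample_rr = [800, 810, 790, 805]
print(f"RMSSD: {rmssd(sample_rr):.2f} ms")
```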

Determinants of rental strategy: short-term vs long-term rental strategy

International Journal of Contemporary Hospitality Management ◽

10.1108/ijchm-03-2020-0185 ◽

2020 ◽

Vol 13 (12) ◽

pp. 3873-3894

Author(s):

Sina Shokoohyar ◽

Ahmad Sobhani ◽

Anae Sobhani

Keyword(s):

Nearest Neighbor ◽

Performance Metrics ◽

Sharing Economy ◽

Real Estate Market ◽

Support Vector ◽

Attractive Alternative ◽

K Nearest Neighbor ◽

Short Term ◽

Content Type

Purpose The short-term rental option enabled via accommodation sharing platforms is an attractive alternative to conventional long-term rental. The purpose of this study is to compare rental strategies (short-term vs long-term) and explore the main determinants for strategy selection. Design/methodology/approach Using logistic regression, this study predicts the rental strategy with the highest rate of return for a given property in the City of Philadelphia. The modeling result is then compared with the applied machine learning methods, including random forest, k-nearest neighbor, support vector machine, naïve Bayes and neural networks. The best model is finally selected based on different performance metrics that determine the prediction strength of the underlying models. Findings By analyzing 2,163 properties, the results show that properties with more bedrooms, closer to the historic attractions, in neighborhoods with lower minority rates and a higher nightlife vibe are more likely to have a higher return if they are rented out through a short-term rental contract. Additionally, the property location is found to have a significant impact on the selection of the rental strategy, which underscores the widely known phrase “location, location, location” in the real estate market. Originality/value The findings of this study contribute to the literature by determining the neighborhood and property characteristics that make a property more suitable for short-term rental than for long-term rental. This contribution is important as it helps to differentiate short-term rentals from long-term rentals and to better understand the supply side of the sharing economy-based accommodation market.

A New Approach to Fall Detection Based on Improved Dual Parallel Channels Convolutional Neural Network

Sensors ◽

10.3390/s19122814 ◽

2019 ◽

Vol 19 (12) ◽

pp. 2814 ◽

Cited By ~ 2

Author(s):

Xiaoguang Liu ◽

Huanliang Li ◽

Cunguang Lou ◽

Tie Liang ◽

Xiuling Liu ◽

...

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Nearest Neighbor ◽

Fall Detection ◽

Classification Performance ◽

Daily Activities ◽

Support Vector ◽

K Nearest Neighbor ◽

Linear Discriminant ◽

Parallel Channels

Falls are the major cause of fatal and non-fatal injury among people aged more than 65 years. Due to the grave consequences of falls, it is necessary to conduct thorough research on them. This paper presents a method for fall detection using surface electromyography (sEMG) based on an improved dual parallel channels convolutional neural network (IDPC-CNN). The proposed IDPC-CNN model is designed to distinguish falls from daily activities using the spectral features of sEMG. Firstly, the classification accuracies of time domain features and spectrograms are compared using linear discriminant analysis (LDA), k-nearest neighbor (KNN) and support vector machine (SVM). Results show that spectrograms provide a richer way to extract pattern information and yield better classification performance. Therefore, the spectrogram features of sEMG are selected as the input of the IDPC-CNN to distinguish between daily activities and falls. Finally, the IDPC-CNN is compared with SVM and three CNNs with different structures under the same conditions. Experimental results show that the proposed IDPC-CNN achieves 92.55% accuracy, 95.71% sensitivity and 91.7% specificity. Overall, the IDPC-CNN is more effective than the comparison methods in accuracy, efficiency, training and generalization.

Cardiovascular Disease Prediction from Electrocardiogram by Using Machine Learning

International Journal of Online and Biomedical Engineering (iJOE) ◽

10.3991/ijoe.v16i07.13569 ◽

2020 ◽

Vol 16 (07) ◽

pp. 34

Author(s):

Nayan Nazrul Anuar ◽

Ab Hamid Hafifah ◽

Suboh Mohd Zubir ◽

Abdullah Noraidatulakma ◽

Jaafar Rosmina ◽

...

Keyword(s):

Machine Learning ◽

Cardiovascular Disease ◽

Nearest Neighbor ◽

Total Cholesterol Level ◽

Population Based ◽

Support Vector ◽

K Nearest Neighbor ◽

Cvd Risk ◽

Linear Discriminant ◽

Artificial Neural Network Ann

Cardiovascular disease (CVD) is the leading cause of deaths worldwide. In 2017, CVD contributed to 13,503 deaths in Malaysia. The current approaches for CVD prediction are usually invasive and costly. Machine learning (ML) techniques allow an accurate prediction by utilizing the complex interactions among relevant risk factors. This study presents a case–control study involving 60 participants from The Malaysian Cohort, which is a prospective population-based project. Five parameters, namely, the R–R interval and root mean square of successive differences extracted from electrocardiogram (ECG), systolic and diastolic blood pressures, and total cholesterol level, were statistically significant in predicting CVD. Six ML algorithms, namely, linear discriminant analysis, linear and quadratic support vector machines, decision tree, k-nearest neighbor, and artificial neural network (ANN), were evaluated to determine the most accurate classifier in predicting CVD risk. ANN, which achieved 90% specificity, 90% sensitivity, and 90% accuracy, demonstrated the highest prediction performance among the six algorithms. In summary, by utilizing ML techniques, ECG data can serve as a good parameter for CVD prediction among the Malaysian multiethnic population.
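
The three performance figures quoted above (specificity, sensitivity and accuracy) are all derived from the counts in a binary confusion matrix. The Python sketch below shows the standard definitions. The counts are made up for illustration and are unrelated to the study's results, since the abstract does not report a confusion matrix.

```python
def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    if min(tp, fn, tn, fp) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "sensitivity": tp / (tp + fn),   # share of true cases detected
        "specificity": tn / (tn + fp),   # share of controls correctly cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts, not taken from the paper
metrics = binary_metrics(tp=40, fn=10, tn=45, fp=5)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```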

Identification of Leukemia Subtypes from Microscopic Images Using Convolutional Neural Network

Diagnostics ◽

10.3390/diagnostics9030104 ◽

2019 ◽

Vol 9 (3) ◽

pp. 104 ◽

Cited By ~ 11

Author(s):

Ahmed ◽

Yigit ◽

Isik ◽

Alpkocak

Keyword(s):

Machine Learning ◽

Data Augmentation ◽

Nearest Neighbor ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Training Data ◽

Support Vector ◽

K Nearest Neighbor ◽

Data Set ◽

Leukemia Data

Leukemia is a fatal cancer and has two main types: acute and chronic. Each type has two more subtypes: lymphoid and myeloid. Hence, in total, there are four subtypes of leukemia. This study proposes a new approach for diagnosing all subtypes of leukemia from microscopic blood cell images using convolutional neural networks (CNN), which require a large training data set. Therefore, we also investigated the effects of data augmentation for synthetically increasing the number of training samples. We used two publicly available leukemia data sources: ALL-IDB and ASH Image Bank. Next, we applied seven different image transformation techniques as data augmentation. We designed a CNN architecture capable of recognizing all subtypes of leukemia. In addition, we explored other well-known machine learning algorithms such as naive Bayes, support vector machine, k-nearest neighbor, and decision tree. To evaluate our approach, we set up a set of experiments and used 5-fold cross-validation. The experimental results showed that our CNN model achieves 88.25% and 81.74% accuracy in leukemia versus healthy classification and in multiclass classification of all subtypes, respectively. Finally, we also showed that the CNN model performs better than the other well-known machine learning algorithms.
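
The 5-fold cross-validation mentioned above partitions the samples into five folds and uses each fold once for testing while training on the other four. A minimal pure-Python sketch of the index splitting follows; the function name, the shuffling seed and the sample count are assumptions for illustration, and the abstract does not say how the authors formed their folds (for example, whether they stratified by class).

```python
import random


def k_fold_indices(n_samples: int, k: int = 5, seed: int = 0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]     # k nearly equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical data set of 23 samples
for fold_number, (train, test) in enumerate(k_fold_indices(23, k=5), start=1):
    print(f"fold {fold_number}: {len(train)} train, {len(test)} test")
```

Every sample appears in exactly one test fold, so each one contributes to the reported accuracy once.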

Detection and Characterization of Physical Activity and Psychological Stress from Wristband Data

Signals ◽

10.3390/signals1020011 ◽

2020 ◽

Vol 1 (2) ◽

pp. 188-208

Author(s):

Mert Sevil ◽

Mudassir Rashid ◽

Mohammad Reza Askari ◽

Zacharie Maloney ◽

Iman Hajizadeh ◽

...

Keyword(s):

Physical Activity ◽

Signal Processing ◽

Feature Extraction ◽

Psychological Stress ◽

Training Data ◽

Support Vector ◽

K Nearest Neighbor ◽

Data Set ◽

Linear Discriminant ◽

Physiological Variables

Wearable devices continuously measure multiple physiological variables to inform users of health and behavior indicators. The computed health indicators must rely on informative signals obtained by processing the raw physiological variables with powerful noise- and artifacts-filtering algorithms. In this study, we aimed to elucidate the effects of signal processing techniques on the accuracy of detecting and discriminating physical activity (PA) and acute psychological stress (APS) using physiological measurements (blood volume pulse, heart rate, skin temperature, galvanic skin response, and accelerometer) collected from a wristband. Data from 207 experiments involving 24 subjects were used to develop signal processing, feature extraction, and machine learning (ML) algorithms that can detect and discriminate PA and APS when they occur individually or concurrently, classify different types of PA and APS, and estimate energy expenditure (EE). Training data were used to generate feature variables from the physiological variables and develop ML models (naïve Bayes, decision tree, k-nearest neighbor, linear discriminant, ensemble learning, and support vector machine). Results from an independent labeled testing data set demonstrate that PA was detected and classified with an accuracy of 99.3%, and APS was detected and classified with an accuracy of 92.7%, whereas the simultaneous occurrences of both PA and APS were detected and classified with an accuracy of 89.9% (relative to actual class labels), and EE was estimated with a low mean absolute error of 0.02 metabolic equivalent of task (MET). The data filtering and adaptive noise cancellation techniques used to mitigate the effects of noise and artifacts on the classification results increased the detection and discrimination accuracy by 0.7% and 3.0% for PA and APS, respectively, and by 18% for EE estimation. The results demonstrate that the physiological measurements from wristband devices are susceptible to noise and artifacts, and elucidate the effects of signal processing and feature extraction on the accuracy of detection, classification, and estimation of PA and APS.

Machine Learning to Classify Intracardiac Electrical Patterns During Atrial Fibrillation

Circulation Arrhythmia and Electrophysiology ◽

10.1161/circep.119.008160 ◽

2020 ◽

Vol 13 (8) ◽

Cited By ~ 1

Author(s):

Mahmood I. Alhusseini ◽

Firas Abuzaid ◽

Albert J. Rogers ◽

Junaid A.B. Zaman ◽

Tina Baykaner ◽

...

Keyword(s):

Atrial Fibrillation ◽

Nearest Neighbor ◽

Support Vector ◽

K Nearest Neighbor ◽

Unique Identifier ◽

Multiple Systems ◽

Linear Discriminant ◽

Link Type ◽

Decision Logic ◽

The Hilbert Transform

Background: Advances in ablation for atrial fibrillation (AF) continue to be hindered by ambiguities in mapping, even between experts. We hypothesized that convolutional neural networks (CNN) may enable objective analysis of intracardiac activation in AF, which could be applied clinically if CNN classifications could also be explained. Methods: We performed panoramic recording of bi-atrial electrical signals in AF. We used the Hilbert-transform to produce 175 000 image grids in 35 patients, labeled for rotational activation by experts who showed consistency but with variability (kappa [κ]=0.79). In each patient, ablation terminated AF. A CNN was developed and trained on 100 000 AF image grids, validated on 25 000 grids, then tested on a separate 50 000 grids. Results: In the separate test cohort (50 000 grids), CNN reproducibly classified AF image grids into those with/without rotational sites with 95.0% accuracy (CI, 94.8%–95.2%). This accuracy exceeded that of support vector machines, traditional linear discriminant, and k-nearest neighbor statistical analyses. To probe the CNN, we applied gradient-weighted class activation mapping which revealed that the decision logic closely mimicked rules used by experts (C statistic 0.96). Conclusions: CNNs improved the classification of intracardiac AF maps compared with other analyses and agreed with expert evaluation. Novel explainability analyses revealed that the CNN operated using a decision logic similar to rules used by experts, even though these rules were not provided in training. We thus describe a scaleable platform for robust comparisons of complex AF data from multiple systems, which may provide immediate clinical utility to guide ablation. Registration: URL: https://www.clinicaltrials.gov ; Unique identifier: NCT02997254. Graphic Abstract: A graphic abstract is available for this article.
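
The abstract reports expert agreement as a kappa statistic. As a hedged illustration of what such a figure measures, the Python sketch below computes Cohen's kappa for two raters, i.e. observed agreement corrected for the agreement expected by chance. Whether the authors used this two-rater form or a multi-rater variant is not stated in the abstract, and the labels below are made up.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Fraction of items on which the two raters agree
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    # Agreement expected if both raters labelled at random with their own rates
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels: 1 = rotational activation present, 0 = absent
rater_1 = [1, 1, 0, 0]
rater_2 = [1, 0, 0, 0]
print(f"kappa = {cohens_kappa(rater_1, rater_2):.2f}")
```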

Feature Selection Algorithm for Hyperlipidemia Classification

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.701-702.110 ◽

2014 ◽

Vol 701-702 ◽

pp. 110-113

Author(s):

Qi Rui Zhang ◽

He Xian Wang ◽

Jiang Wei Qin

Keyword(s):

Feature Selection ◽

Nearest Neighbor ◽

Information Gain ◽

Classification Systems ◽

Support Vector ◽

K Nearest Neighbor ◽

Data Set ◽

Document Frequency ◽

Selection Algorithms ◽

Term Weights

This paper reports a comparative study of feature selection algorithms on a hyperlipidemia data set. Three methods of feature selection were evaluated: document frequency (DF), information gain (IG) and the χ2 statistic (CHI). The classification systems use a vector to represent a document and use tfidfie (term frequency, inverted document frequency, and inverted entropy) to compute term weights. In order to compare the effectiveness of feature selection, we used three classification methods: Naïve Bayes (NB), k Nearest Neighbor (kNN) and Support Vector Machines (SVM). The experimental results show that IG and CHI significantly outperform DF, and that SVM and NB are more effective than kNN when the macro-averaged F1 measure is used. DF is suitable for the task of large text classification.
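
The abstract names the χ2 statistic as a feature selection criterion without giving a formula. The Python sketch below uses the standard 2×2 contingency form commonly applied to term/class pairs in text classification. That the authors used exactly this variant is an assumption, and the document counts are made up.

```python
def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square score of a term for a class from a 2x2 contingency table.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        return 0.0  # a row or column is empty, so the term carries no signal
    return n * (a * d - c * b) ** 2 / denominator


# Hypothetical counts: one term independent of the class, one tied to it
print(chi_square(10, 10, 10, 10))
print(chi_square(20, 0, 0, 20))
```

Terms are then ranked by score and the highest-scoring ones are kept as features, which is how a criterion of this kind reduces the vocabulary before classification.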
