Fast Characteristic of Skin Lesions by Machine-Learning of Raman Spectrum

Author(s):  
Hua Zhang ◽  
Danhua Wang ◽  
Limei Qu ◽  
Ying Xue ◽  
Xinli Li ◽  
...  

Abstract Background: The traditional diagnosis of skin lesions mainly relies on dermoscopy and pathological biopsy, of which the former is non-objective and the latter is invasive and time-consuming. It is necessary to find an objective and non-invasive inspection method for the diagnosis of skin cancer, which is the most common malignant tumor. Herein, we aimed to rapidly identify skin cancers on ultrathin frozen fresh tissue sections by combining Raman spectroscopy detection and machine learning technology. Methods and material: 22 fresh frozen tissue sections, including 3 squamous cell carcinomas, 11 basal cell carcinomas, 2 malignant melanomas, 3 seborrheic keratoses, and 3 melanocytic nevi, were included and underwent Raman detection. To prevent the discrete Raman data distribution from affecting the generalization ability of the learning model, a series of adaptive preprocessing algorithms was first applied to standardize the raw Raman data of the five skin lesion types. The processed Raman data were visualized by cluster analysis using principal components analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE). Using K-nearest Neighbor (KNN) and support vector machine (SVM) classifiers, two predictive models for diagnosis were established and evaluated on the training set and test set by confusion matrices and receiver operating characteristic (ROC) curves. Results: The mean-variance Raman spectrum graphs of the 5 skin lesion types were acquired after standardization processing, and 4 peak positions with large differences were found. Through dimensionality reduction by PCA and t-SNE, the visual clustering results of Raman data showed heterogeneous intra-cluster homogeneity and inter-cluster dispersion. The test accuracies reached 94.56% and 98.94% for the KNN and SVM classifiers, respectively. 
The areas under the ROC curves of the two classifiers, in the category dimension and the sample dimension, were all greater than 0.99, which is close to perfect classification. Conclusions: Raman spectroscopy is a competitive candidate for the fast and accurate diagnosis of skin lesions, and the molecular information it provides may be used in pathological classification, predicting immunotherapy responsiveness, and stratifying prognostic risk. Furthermore, the combination of Raman spectroscopy and machine learning methods showed great diagnostic capability with high accuracy and is a promising tool for the diagnosis of skin lesions.
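As a rough illustration of the KNN classification and confusion-matrix evaluation described in this abstract, the following minimal sketch uses only the Python standard library. The two-feature points, the class abbreviations `BCC` and `SCC`, the value `k=3`, and the helper names `knn_predict` and `confusion_matrix` are all assumptions made for the example; they are not the study's data, preprocessing, or code.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label, then sort.
    dists = sorted((math.dist(x, query), label) for x, label in zip(train_X, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def confusion_matrix(y_true, y_pred, labels):
    """Build a matrix whose rows are true classes and columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


# Made-up two-feature stand-ins for reduced spectra (hypothetical values).
train_X = [(0.1, 0.2), (0.2, 0.1), (0.0, 0.3), (0.9, 0.8), (0.8, 0.9), (1.0, 0.7)]
train_y = ["BCC", "BCC", "BCC", "SCC", "SCC", "SCC"]
test_X = [(0.15, 0.15), (0.85, 0.85), (0.3, 0.2)]
test_y = ["BCC", "SCC", "BCC"]

pred = [knn_predict(train_X, train_y, q, k=3) for q in test_X]
cm = confusion_matrix(test_y, pred, labels=["BCC", "SCC"])
# Accuracy is the sum of the diagonal divided by the number of test samples.
accuracy = sum(cm[i][i] for i in range(len(cm))) / len(test_y)
```

In practice the spectra would first be standardized and reduced (for example by PCA), and `k` would be tuned on held-out data rather than fixed.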

Author(s):  
Kristiawan Kristiawan ◽  
Andreas Widjaja

Abstract: The application of machine learning technology in various industrial fields is currently developing rapidly, including in the retail industry. This study aims to find the most accurate algorithmic model so that it can be used to help retailers choose a store location more precisely. Several methods, namely Pearson Correlation, Chi-Square Features, Recursive Feature Elimination and Tree-based selection, are used to select features (predictive variables). These features are then used to train and build models using 6 different classification algorithms, namely Logistic Regression, K Nearest Neighbor (KNN), Decision Tree, Random Forest, Support Vector Machine (SVM) and Neural Network, to classify whether or not a location is recommended as a new store location. Keywords: Application of Machine Learning, Pearson Correlation, Random Forest, Neural Network, Logistic Regression.
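The Pearson-correlation feature selection mentioned in this abstract can be sketched as below. The feature names (`foot_traffic`, `rent`), the toy values, and the 0.5 cutoff are hypothetical choices for illustration; the abstract does not state which variables or threshold the authors used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (spread_x * spread_y)


def select_features(features, target, threshold=0.5):
    """Keep features whose absolute correlation with the target reaches the threshold."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    selected = [name for name, r in scores.items() if abs(r) >= threshold]
    return selected, scores


# Hypothetical candidate locations: 1 = recommended, 0 = not recommended.
features = {
    "foot_traffic": [10, 20, 30, 40, 50],
    "rent": [5, 3, 6, 2, 4],
}
target = [0, 0, 1, 1, 1]

selected, scores = select_features(features, target)
```

Here `foot_traffic` correlates strongly with the label (about 0.87) and is kept, while `rent` is uncorrelated in this toy data and is dropped.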


2019 ◽  
Vol 20 (5) ◽  
pp. 488-500 ◽  
Author(s):  
Yan Hu ◽  
Yi Lu ◽  
Shuo Wang ◽  
Mengying Zhang ◽  
Xiaosheng Qu ◽  
...  

Background: Globally the number of cancer patients and deaths are continuing to increase yearly, and cancer has, therefore, become one of the world's highest causes of morbidity and mortality. In recent years, the study of anticancer drugs has become one of the most popular medical topics. Objective: In this review, in order to study the application of machine learning in predicting anticancer drugs activity, some machine learning approaches such as Linear Discriminant Analysis (LDA), Principal components analysis (PCA), Support Vector Machine (SVM), Random forest (RF), k-Nearest Neighbor (kNN), and Naïve Bayes (NB) were selected, and the examples of their applications in anticancer drugs design are listed. Results: Machine learning contributes a lot to anticancer drugs design and helps researchers by saving time and is cost effective. However, it can only be an assisting tool for drug design. Conclusion: This paper introduces the application of machine learning approaches in anticancer drug design. Many examples of success in identification and prediction in the area of anticancer drugs activity prediction are discussed, and the anticancer drugs research is still in active progress. Moreover, the merits of some web servers related to anticancer drugs are mentioned.


2021 ◽  
Vol 186 (Supplement_1) ◽  
pp. 445-451
Author(s):  
Yifei Sun ◽  
Navid Rashedi ◽  
Vikrant Vaze ◽  
Parikshit Shah ◽  
Ryan Halter ◽  
...  

ABSTRACT Introduction Early prediction of the acute hypotensive episode (AHE) in critically ill patients has the potential to improve outcomes. In this study, we apply different machine learning algorithms to the MIMIC III Physionet dataset, containing more than 60,000 real-world intensive care unit records, to test commonly used machine learning technologies and compare their performances. Materials and Methods Five classification methods, including K-nearest neighbor, logistic regression, support vector machine, random forest, and a deep learning method called long short-term memory, are applied to predict an AHE 30 minutes in advance. An analysis comparing model performance when including versus excluding invasive features was conducted. To further study the pattern of the underlying mean arterial pressure (MAP), we apply a regression method to predict the continuous MAP values using linear regression over the next 60 minutes. Results Support vector machine yields the best performance in terms of recall (84%). Including the invasive features in the classification improves the performance significantly, with both recall and precision increasing by more than 20 percentage points. We were able to predict the MAP with a root mean square error (a frequently used measure of the differences between the predicted values and the observed values) of 10 mmHg 60 minutes in the future. After converting continuous MAP predictions into AHE binary predictions, we achieve a 91% recall and 68% precision. In addition to predicting AHE, the MAP predictions provide clinically useful information regarding the timing and severity of the AHE occurrence. Conclusion We were able to predict AHE with precision and recall above 80% 30 minutes in advance with the large real-world dataset. The predictions of the regression model can provide a more fine-grained, interpretable signal to practitioners. 
Model performance is improved by the inclusion of invasive features in predicting AHE, when compared to predicting the AHE based on only the available, restricted set of noninvasive technologies. This demonstrates the importance of exploring more noninvasive technologies for AHE prediction.
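The step of converting continuous MAP predictions into binary AHE predictions, and then scoring them by recall and precision, can be illustrated as follows. The 65 mmHg cutoff, the variable names, and the sample values are assumptions for this sketch; the abstract does not give the AHE definition or any raw data.

```python
import math

# Hypothetical cutoff: a MAP value below this is treated as hypotensive in this sketch.
AHE_THRESHOLD_MMHG = 65.0


def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))


def to_ahe_flags(map_values, threshold=AHE_THRESHOLD_MMHG):
    """Turn continuous MAP values into binary flags (True = hypotensive)."""
    return [value < threshold for value in map_values]


def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels given as booleans."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up MAP values in mmHg (not from MIMIC III).
observed_map = [80, 60, 70, 55, 62, 90]
predicted_map = [78, 63, 64, 58, 68, 85]

error = rmse(predicted_map, observed_map)
precision, recall = precision_recall(to_ahe_flags(observed_map), to_ahe_flags(predicted_map))
```

Because the flags come from thresholding, a small regression error near the cutoff can flip a prediction, which is one reason precision and recall can differ noticeably even when the RMSE looks modest.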


2021 ◽  
Vol 13 (3) ◽  
pp. 168781402110027
Author(s):  
Jianchen Zhu ◽  
Kaixin Han ◽  
Shenlong Wang

With economic growth, automobiles have become an irreplaceable means of transportation and travel. Tires are important parts of automobiles, and their wear causes a large number of traffic accidents. Therefore, predicting tire life has become one of the key factors determining vehicle safety. This paper presents a tire life prediction method based on image processing and machine learning. We first build an original image database as the initial sample. Since there are usually only a few sample image libraries in engineering practice, we propose a new image feature extraction and expression method that shows excellent performance for a small sample database. We extract the texture features of the tire image by using the gray-gradient co-occurrence matrix (GGCM) and the Gauss-Markov random field (GMRF), and classify the extracted features by using the K-nearest neighbor (KNN) classifier. We then conduct experiments and predict the wear life of automobile tires. The experimental results are estimated by using the mean average precision (MAP) and confusion matrix as evaluation criteria. Finally, we verify the effectiveness and accuracy of the proposed method for predicting tire life. The obtained results are expected to be used for real-time prediction of tire life, thereby reducing tire-related traffic accidents.


2021 ◽  
pp. 1-17
Author(s):  
Ahmed Al-Tarawneh ◽  
Ja’afer Al-Saraireh

Twitter is one of the most popular platforms used to share and post ideas. Hackers and anonymous attackers use these platforms maliciously, and their behavior can be used to predict the risk of future attacks, by gathering and classifying hackers’ tweets using machine-learning techniques. Previous approaches for detecting infected tweets are based on human efforts or text analysis, thus they are limited in capturing the hidden text between tweet lines. The main aim of this research paper is to enhance the efficiency of hacker detection for the Twitter platform using the complex networks technique with adapted machine learning algorithms. This work presents a methodology that collects a list of users with their followers who are sharing their posts that have similar interests from a hackers’ community on Twitter. The list is built based on a set of suggested keywords that are the commonly used terms by hackers in their tweets. After that, a complex network is generated for all users to find relations among them in terms of network centrality, closeness, and betweenness. After extracting these values, a dataset of the most influential users in the hacker community is assembled. Subsequently, tweets belonging to users in the extracted dataset are gathered and classified into positive and negative classes. The output of this process is utilized with a machine learning process by applying different algorithms. This research builds and investigates an accurate dataset containing real users who belong to a hackers’ community. Correctly classified instances were measured for accuracy using the average values of K-nearest neighbor, Naive Bayes, Random Tree, and the support vector machine techniques, demonstrating about 90% and 88% accuracy for cross-validation and percentage split respectively. 
Consequently, the proposed network cyber Twitter model is able to detect hackers, and determine if tweets pose a risk to future institutions and individuals to provide early warning of possible attacks.
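To make the network measures in this abstract more concrete, here is a minimal sketch of degree and closeness centrality on a tiny undirected graph, using breadth-first search. The user names `a` to `e` and the edges are invented for the example, and betweenness is omitted for brevity; this is not the authors' pipeline.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest hop counts from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


def closeness_centrality(graph):
    """Reachable node count divided by the total distance to those nodes."""
    result = {}
    for node in graph:
        dist = bfs_distances(graph, node)
        total = sum(dist.values())
        result[node] = (len(dist) - 1) / total if total else 0.0
    return result


# Hypothetical follower graph, stored as an undirected adjacency map.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "e"},
    "e": {"d"},
}

degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
most_influential = max(closeness, key=closeness.get)
```

In this toy graph, user `a` has the highest degree and closeness, so it would be the first candidate for a "most influential users" dataset of the kind the abstract describes.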


Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4324
Author(s):  
Moaed A. Abd ◽  
Rudy Paul ◽  
Aparna Aravelli ◽  
Ou Bai ◽  
Leonel Lagos ◽  
...  

Multifunctional flexible tactile sensors could be useful to improve the control of prosthetic hands. To that end, highly stretchable liquid metal tactile sensors (LMS) were designed, manufactured via photolithography, and incorporated into the fingertips of a prosthetic hand. Three novel contributions were made with the LMS. First, individual fingertips were used to distinguish between different speeds of sliding contact with different surfaces. Second, differences in surface textures were reliably detected during sliding contact. Third, the capacity for hierarchical tactile sensor integration was demonstrated by using four LMS signals simultaneously to distinguish between ten complex multi-textured surfaces. Four different machine learning algorithms were compared for their successful classification capabilities: K-nearest neighbor (KNN), support vector machine (SVM), random forest (RF), and neural network (NN). The time-frequency features of the LMSs were extracted to train and test the machine learning algorithms. The NN generally performed the best at the speed and texture detection with a single finger and had a 99.2 ± 0.8% accuracy to distinguish between ten different multi-textured surfaces using four LMSs from four fingers simultaneously. The capability for hierarchical multi-finger tactile sensation integration could be useful to provide a higher level of intelligence for artificial hands.


2021 ◽  
Vol 13 (6) ◽  
pp. 3497
Author(s):  
Hassan Adamu ◽  
Syaheerah Lebai Lutfi ◽  
Nurul Hashimah Ahamed Hassain Malim ◽  
Rohail Hassan ◽  
Assunta Di Vaio ◽  
...  

Sustainable development plays a vital role in information and communication technology. In times of pandemics such as COVID-19, vulnerable people need help to survive. This help includes the distribution of relief packages and materials by the government with the primary objective of lessening the economic and psychological effects on the citizens affected by disasters such as the COVID-19 pandemic. However, there has not been an efficient way to monitor public funds’ accountability and transparency, especially in developing countries such as Nigeria. The understanding of public emotions by the government on distributed palliatives is important as it would indicate the reach and impact of the distribution exercise. Although several studies on English emotion classification have been conducted, these studies are not portable to a wider inclusive Nigerian case. This is because Informal Nigerian English (Pidgin), which Nigerians widely speak, has quite a different vocabulary from Standard English, thus limiting the applicability of the emotion classification of Standard English machine learning models. An Informal Nigerian English (Pidgin English) emotions dataset is constructed, pre-processed, and annotated. The dataset is then used to classify five emotion classes (anger, sadness, joy, fear, and disgust) on the COVID-19 palliatives and relief aid distribution in Nigeria using standard machine learning (ML) algorithms. Six ML algorithms are used in this study, and a comparative analysis of their performance is conducted. The algorithms are Multinomial Naïve Bayes (MNB), Support Vector Machine (SVM), Random Forest (RF), Logistic Regression (LR), K-Nearest Neighbor (KNN), and Decision Tree (DT). The conducted experiments reveal that Support Vector Machine outperforms the remaining classifiers with the highest accuracy of 88%. 
The “disgust” emotion class surpassed other emotion classes, i.e., sadness, joy, fear, and anger, with the highest number of counts from the classification conducted on the constructed dataset. Additionally, the conducted correlation analysis shows a significant relationship between the emotion classes of “Joy” and “Fear”, which implies that the public is excited about the palliatives’ distribution but afraid of inequality and transparency in the distribution process due to reasons such as corruption. Conclusively, the results from this experiment clearly show that the public emotions on COVID-19 support and relief aid packages’ distribution in Nigeria were not satisfactory, considering that the negative emotions from the public outnumbered the public happiness.


Nutrients ◽  
2019 ◽  
Vol 11 (7) ◽  
pp. 1681 ◽  
Author(s):  
Ramyaa Ramyaa ◽  
Omid Hosseini ◽  
Giri P. Krishnan ◽  
Sridevi Krishnan

Nutritional phenotyping can help achieve personalized nutrition, and machine learning tools may offer novel means to achieve phenotyping. The primary aim of this study was to use energy balance components, namely input (dietary energy intake and macronutrient composition) and output (physical activity) to predict energy stores (body weight) as a way to evaluate their ability to identify potential phenotypes based on these parameters. From the Women’s Health Initiative Observational Study (WHI OS), carbohydrates, proteins, fats, fibers, sugars, and physical activity variables, namely energy expended from mild, moderate, and vigorous intensity activity, were used to predict current body weight (both as body weight in kilograms and as a body mass index (BMI) category). Several machine learning tools were used for this prediction. Finally, cluster analysis was used to identify putative phenotypes. For the numerical predictions, the support vector machine (SVM), neural network, and k-nearest neighbor (kNN) algorithms performed modestly, with mean approximate errors (MAEs) of 6.70 kg, 6.98 kg, and 6.90 kg, respectively. For categorical prediction, SVM performed the best (54.5% accuracy), followed closely by the bagged tree ensemble and kNN algorithms. K-means cluster analysis improved prediction using numerical data and identified 10 clusters suggestive of phenotypes, with a minimum MAE of ~1.1 kg. A classifier was used to phenotype subjects into the identified clusters, with MAEs <5 kg for 15% of the test set (n = ~2000). This study highlights the challenges, limitations, and successes in using machine learning tools on self-reported data to identify determinants of energy balance.
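For readers unfamiliar with the error figures quoted here, the sketch below computes a per-sample average error in kilograms. It assumes that MAE is calculated as the mean of absolute differences between predicted and observed body weight, which is the usual meaning of the abbreviation; the abstract itself expands MAE as "mean approximate error" and does not give the formula. The weights are made-up values.

```python
def mean_absolute_error(predicted, observed):
    """Average of the absolute differences between predictions and observations."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


# Hypothetical body weights in kilograms (not WHI OS data).
observed_kg = [64, 80, 75, 88]
predicted_kg = [70, 82, 65, 90]

mae_kg = mean_absolute_error(predicted_kg, observed_kg)
```

An MAE of 5 kg in this toy case means the predictions are off by 5 kg on average, regardless of direction.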


Author(s):  
Noman Ashraf ◽  
Abid Rafiq ◽  
Sabur Butt ◽  
Hafiz Muhammad Faisal Shehzad ◽  
Grigori Sidorov ◽  
...  

On YouTube, billions of videos are watched online and millions of short messages are posted each day. YouTube, along with other social networking sites, is used by individuals and extremist groups for spreading hatred among users. In this paper, we consider religion as the most targeted domain for spreading hate speech among people of different religions. We present a methodology for the detection of religion-based hate videos on YouTube. Messages posted on YouTube videos generally express the opinions of users related to that video. We provide a novel dataset for religious hate speech detection on YouTube comments. The proposed methodology applies data mining techniques to comments extracted from religious videos in order to filter religion-oriented messages and detect those videos which are used for spreading hate. The supervised learning algorithms Support Vector Machine (SVM), Logistic Regression (LR), and k-Nearest Neighbor (k-NN) are used for baseline results.


Author(s):  
Jonas Marx ◽  
Stefan Gantner ◽  
Jörn Städing ◽  
Jens Friedrichs

In recent years, the demands of Maintenance, Repair and Overhaul (MRO) customers to provide resource-efficient after market services have grown. One way to meet these requirements is by making use of predictive maintenance methods. These are ideas that involve the derivation of workscoping guidance by assessing and processing previously unused or undocumented service data. In this context a novel approach to predictive maintenance is presented in the form of a performance-based classification method for high pressure compressor (HPC) airfoils. The procedure features machine learning algorithms that establish a relation between the airfoil geometry and the associated aerodynamic behavior and is thereby able to divide individual operating characteristics into a finite number of distinct aero-classes. By this means the introduced method not only provides a fast and simple way to assess piece part performance through geometrical data, but also facilitates the consideration of stage matching (axial as well as circumferential) in a simplified manner. It thus serves as a prerequisite for an improved customary HPC performance workscope as well as for an automated optimization process for compressor buildup with used or repaired material that would be applicable in an MRO environment. The methods of machine learning that are used in the present work enable the formation of distinct groups of similar aero-performance by unsupervised (step 1) and supervised learning (step 2). The application of the overall classification procedure is demonstrated on an artificially generated dataset based on real characteristics of a front and a rear rotor of a 10-stage axial compressor that contains both geometry and aerodynamic information. In step 1 of the investigation only the aerodynamic quantities in terms of multivariate functional data are used in order to benchmark different clustering algorithms and generate a foundation for a geometry-based aero-classification. 
Corresponding classifiers are created in step 2 by means of both the k Nearest Neighbor and the linear Support Vector Machine algorithms. The methods’ fidelities are put to the test by attempting to recover the aero-based similarity classes solely from normalized and reduced geometry data. This results in high classification probabilities of up to 96%, which is verified using stratified k-fold cross-validation.
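Stratified k-fold cross-validation, the validation scheme named at the end of this abstract, can be sketched in a few lines. The class labels `A` and `B`, the fold count, and the round-robin assignment without shuffling are simplifying assumptions for illustration, not details taken from the paper.

```python
from collections import defaultdict


def stratified_k_fold(labels, k):
    """Return k lists of sample indices whose class proportions mirror `labels`."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's samples round-robin so every fold gets a similar share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def train_test_splits(labels, k):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    folds = stratified_k_fold(labels, k)
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


# Hypothetical aero-class labels: six samples of class A, three of class B.
labels = ["A"] * 6 + ["B"] * 3
folds = stratified_k_fold(labels, k=3)
splits = list(train_test_splits(labels, k=3))
```

Each of the three folds here contains two `A` samples and one `B` sample, matching the 2:1 ratio of the full set. A practical implementation would normally shuffle within each class before dealing, and would balance fold sizes when class counts are not divisible by `k`.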

