A Sentence Classification Framework to Identify Geometric Errors in Radiation Therapy from Relevant Literature

The objective of systematic reviews is to address a research question by summarizing relevant studies following a detailed, comprehensive, and transparent plan and search protocol to reduce bias. Systematic reviews are very useful in the biomedical and healthcare domain; however, the data extraction phase of the systematic review process necessitates substantive expertise and is labour-intensive and time-consuming. The aim of this work is to partially automate the process of building systematic radiotherapy treatment literature reviews by summarizing the required data elements of geometric errors of radiotherapy from relevant literature using machine learning and natural language processing (NLP) approaches. A framework is developed in this study that initially builds a training corpus by extracting sentences containing different types of geometric errors of radiotherapy from relevant publications. The publications are retrieved from PubMed following a given set of rules defined by a domain expert. Subsequently, the method develops a training corpus by extracting relevant sentences using a sentence similarity measure. A support vector machine (SVM) classifier is then trained on this training corpus to extract the sentences from new publications which contain relevant geometric errors. To demonstrate the proposed approach, we have used 60 publications containing geometric errors in radiotherapy to automatically extract the sentences stating the mean and standard deviation of different types of errors between planned and executed radiotherapy. The experimental results show that the recall and precision of the proposed framework are, respectively, 97% and 72%. The results clearly show that the framework is able to extract almost all sentences containing required data of geometric errors.
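The corpus-building step above selects sentences with a sentence similarity measure that the abstract does not name. The following is a minimal sketch of that idea, assuming cosine similarity over bag-of-words counts; the measure, the seed sentence, and the threshold are all hypothetical choices for illustration, not the study's actual settings.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine_similarity(a, b):
    """Cosine similarity between the bag-of-words vectors of two sentences."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def select_candidates(seed, sentences, threshold=0.3):
    """Keep sentences whose similarity to the seed meets the threshold.

    Both the seed sentence and the threshold value are assumptions.
    """
    return [s for s in sentences if cosine_similarity(seed, s) >= threshold]


if __name__ == "__main__":
    seed = "mean setup error standard deviation"  # hypothetical seed sentence
    corpus = [
        "The mean setup error was 2 mm with a standard deviation of 1 mm.",
        "Patients were recruited in 2010.",
    ]
    for sentence in select_candidates(seed, corpus):
        print(sentence)
```

Sentences selected this way could then serve as positive training examples for the classifier; a real pipeline would likely add stop-word handling and a weighting scheme such as TF-IDF.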

Analyzing Behavior of Cancer Patients using Machine Learning Techniques

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.i8414.078919 ◽

2019 ◽

Vol 8 (9) ◽

pp. 1547-1556

Keyword(s):

Machine Learning ◽

Natural Language ◽

Cancer Patients ◽

Language Processing ◽

Machine Learning Techniques ◽

Support Vector ◽

Svm Classifier ◽

Operating Characteristics ◽

Decision Tree Classifier ◽

Tree Classifier

Online discussion forums and blogs are vibrant platforms where cancer patients express their views in the form of stories. These stories sometimes become a source of inspiration for patients who are anxiously searching for similar cases. This paper proposes a method using natural language processing and machine learning to analyze unstructured texts collected from patients' reviews and stories. The proposed methodology aims to identify the behavior, emotions, side effects, decisions, and demographics associated with cancer victims. The pre-processing phase of our work involves extraction of web text, followed by text cleaning in which some special characters and symbols are removed, and finally tagging of the texts using the NLTK (Natural Language Toolkit) POS (part-of-speech) tagger. The post-processing phase trains seven machine learning classifiers (refer to Table 6). The Decision Tree classifier shows the highest precision (0.83) among the classifiers, while the area under the receiver operating characteristic curve (AUC) is highest for the Support Vector Machine (SVM) classifier (0.98).
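The two metrics reported above can be computed as in the following minimal sketch. AUC is taken here to mean the area under the receiver operating characteristic curve, computed as the fraction of positive/negative pairs in which the positive example receives the higher score (ties count as half). The labels and scores are made-up illustration data, not values from the paper.

```python
def precision(y_true, y_pred):
    """Fraction of predicted positives that are truly positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    labels = [1, 0, 1, 0]          # hypothetical ground truth
    scores = [0.9, 0.8, 0.4, 0.1]  # hypothetical classifier scores
    preds = [1 if s >= 0.5 else 0 for s in scores]
    print(precision(labels, preds), roc_auc(labels, scores))
```

Precision depends on a decision threshold while AUC summarizes ranking quality across all thresholds, which is why different classifiers can lead on each metric.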

AUTOMATED SHAPE-BASED PAVEMENT CRACK DETECTION APPROACH

Transport ◽

10.3846/transport.2018.1559 ◽

2018 ◽

Vol 33 (3) ◽

pp. 598-608 ◽

Cited By ~ 4

Author(s):

Teng Wang ◽

Kasthurirangan Gopalakrishnan ◽

Omar Smadi ◽

Arun K. Somani

Keyword(s):

Crack Detection ◽

Support Vector ◽

Svm Classifier ◽

False Alarms ◽

Polynomial Curve ◽

Detection Approach ◽

Vector Machines ◽

Different Types ◽

Surface Irregularities ◽

Pavement Crack Detection

Pavements are critical man-made infrastructure systems that undergo repeated traffic and environmental loadings. Consequently, they deteriorate with time and manifest certain distresses. To ensure long-lasting performance and an appropriate level of service, they need to be preserved and maintained. Highway agencies routinely employ semiautomated and automated image-based methods for network-level pavement-cracking data collection, and they collect different types of pavement-cracking data for reporting and management purposes. We design a shape-based crack detection approach for pavement health monitoring, which takes advantage of the spatial distribution of potential cracks. To achieve this, we first extract Potential Crack Components (PCrCs) from pavement images. Next, we fit a polynomial curve to all pixels within these components. Finally, we define a Shape Metric (SM) to distinguish crack blocks from the background. We evaluate the shape-based crack detection approach on different datasets and compare its detection results with an alternative method based on a Support Vector Machine (SVM) classifier. Experimental results show that our approach is capable of producing more detections and fewer false alarms. Additional research is needed to improve the robustness and accuracy of the developed approach in the presence of anomalies and other surface irregularities.

Video Data Extraction and Processing for Investigation of Vehicles’ Impact on the Asphalt Deformation Through the Prism of Computational Algorithms

Traitement du signal ◽

10.18280/ts.370603 ◽

2020 ◽

Vol 37 (6) ◽

pp. 899-906

Author(s):

Sabahudin Vrtagić ◽

Edis Softić ◽

Mirza Ponjavić ◽

Željko Stević ◽

Marko Subotić ◽

...

Keyword(s):

Traffic Accidents ◽

Data Extraction ◽

Video Data ◽

Digital Data ◽

Vehicle Classification ◽

Support Vector ◽

Svm Classifier ◽

Small Data ◽

Video File ◽

Linear Svm

There are numerous algorithms and solutions for car or object detection as humanity moves towards smart city solutions. Most solutions are based on counting, speed detection, traffic accidents, and vehicle classification. These solutions mostly rely on high-quality videos, wide-angle camera views, and vehicles in motion, and are optimized for intervals of good visibility. The novelty of the proposed algorithm and solution is more accurate digital data extraction from video files generated by security cameras on the M18 roadway in Bosnia and Herzegovina, although it is not limited to that particular source. From the video files, data on the number of vehicles, speed, traveling direction, and time intervals for the region of interest will be collected. Since the contour-finding approach is effective only on moving objects, and because applying it to traffic junctions did not yield the desired results, a more specific classification approach combining Histogram of Oriented Gradients (HOG) and Support Vector Machines (Linear SVM) has proven more appropriate. The original source data can be used for training, and the main benefits are the preservation of local second-order interactions, tolerance to local geometric misalignment, and the ability to work with small data samples. The features of the objects within a frame are extracted first by standardizing the feature variables and then computing the first-order gradients of the frame. In the next stage, an encoding that remains robust to small changes while being sensitive to local frame content is produced. Finally, the HOG descriptors are generated and normalized again. In this way, the channel histogram and spatial vector become the feature vector for the Linear SVM classifier. With this setup and these parameters, system accuracy was around 85 to 95%. In the next phase, after cleaning protocols are applied to the collected data parameters, the data will be used to research asphalt deformation effects.
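The gradient and histogram stages described above can be illustrated with a deliberately simplified sketch. It computes first-order gradients by central differences for one image cell, accumulates a magnitude-weighted histogram of unsigned orientations, and L2-normalizes the result. The nine-bin layout, the single-cell scope, and the function names are assumptions for illustration; a full HOG descriptor also groups cells into overlapping blocks, and the Linear SVM stage is omitted here.

```python
import math


def cell_orientation_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    `cell` is a 2D list of grayscale intensities. Border pixels are skipped
    so that central differences are always defined.
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, len(cell) - 1):
        for x in range(1, len(cell[0]) - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical gradient
            magnitude = math.hypot(gx, gy)
            # Fold the direction into [0, 180) so opposite directions match.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


def l2_normalize(vector, eps=1e-6):
    """Scale a vector to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in vector) + eps * eps)
    return [v / norm for v in vector]


if __name__ == "__main__":
    # Hypothetical 4x4 cell containing a vertical intensity edge.
    cell = [[0, 0, 10, 10] for _ in range(4)]
    print(l2_normalize(cell_orientation_histogram(cell)))
```

Concatenating such normalized histograms over all cells of a detection window would give the feature vector that a linear classifier consumes.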

Penguin Search Optimization Based Feature Selection for Automated Opinion Mining

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b2629.098319 ◽

2019 ◽

Vol 8 (3) ◽

pp. 648-653

Keyword(s):

Feature Selection ◽

Language Processing ◽

Opinion Mining ◽

Sentiment Classification ◽

Support Vector ◽

Svm Classifier ◽

Np Hard ◽

Search Optimization ◽

Selection Approach ◽

Feature Selection Approach

Twitter sentiment analysis is a vital concept in determining public opinion about products, services, events, or personalities. Analyzing medical tweets on a specific topic can provide immense benefits to the medical industry. However, medical tweets require an efficient feature selection approach to produce significantly accurate results. The penguin search optimization algorithm (PeSOA) has the ability to resolve NP-hard problems. This paper aims at developing an automated opinion mining framework by modeling the feature selection problem as an NP-hard optimization problem and using a PeSOA-based feature selection approach to solve it. Initially, medical tweets based on cancer and drug keywords are extracted and pre-processed to filter the relevant informative tweets. Then the features are extracted based on Natural Language Processing (NLP) concepts, and the optimal features are selected using PeSOA. The selected features are fed as input to three baseline classifiers to achieve optimal and accurate sentiment classification. The experimental results obtained through MATLAB simulations on cancer and drug tweets using k-Nearest Neighbor (KNN), Naïve Bayes (NB), and Support Vector Machine (SVM) classifiers indicate that the proposed PeSOA feature selection based tweet opinion mining improves classification performance significantly. They also show that PeSOA feature selection with the SVM classifier provides better sentiment classification than the other classifiers.

BurnoutWords - Detecting Burnout for a Clinical Setting

10.5121/csit.2021.111815 ◽

2021 ◽

Author(s):

Sukanya Nath ◽

Mascha Kurpicz-Briki

Keyword(s):

Mental Health ◽

Language Processing ◽

Workplace Stress ◽

Support Vector ◽

Svm Classifier ◽

Free Text ◽

Clinical Use ◽

Real Patient ◽

Global Pandemic ◽

Clinical Methods

Burnout, a syndrome conceptualized as resulting from major workplace stress that has not been successfully managed, is a major problem in today's society, particularly in times of crisis such as a global pandemic. Burnout detection is hard because the symptoms often overlap with those of other diseases and syndromes. Typical clinical approaches use inventories to assess burnout in patients, even though free-text approaches are considered promising. Research on natural language processing (NLP) applied to mental health often uses data from social media rather than real patient data, which limits its application in clinical use cases. In this paper, we fill the gap and provide a dataset of 216 records consisting of extracts from interviews with burnout patients. We train a support vector machine (SVM) classifier to detect burnout in text snippets with an accuracy of around 80%, which is clearly higher than the random baseline of our setup. This provides the foundation for a next generation of clinical methods based on NLP.

A Feature Extraction Method of Ship-Radiated Noise Based on Fluctuation-Based Dispersion Entropy and Intrinsic Time-Scale Decomposition

Entropy ◽

10.3390/e21070693 ◽

2019 ◽

Vol 21 (7) ◽

pp. 693 ◽

Cited By ~ 9

Author(s):

Zhaoxi Li ◽

Yaan Li ◽

Kai Zhang

Keyword(s):

Feature Extraction ◽

Time Scale ◽

Recognition Rate ◽

Support Vector ◽

Svm Classifier ◽

Feature Extraction Method ◽

Radiated Noise ◽

Intrinsic Time ◽

Different Types ◽

Time Scale Decomposition

To improve the feature extraction of ship-radiated noise in a complex ocean environment, fluctuation-based dispersion entropy is used to extract the features of ten types of ship-radiated noise. Since fluctuation-based dispersion entropy analyzes the ship-radiated noise signal at only a single scale and cannot distinguish different types of ship-radiated noise effectively, a new method of ship-radiated noise feature extraction is proposed based on fluctuation-based dispersion entropy (FDispEn) and intrinsic time-scale decomposition (ITD). First, ten types of ship-radiated noise signals are decomposed into a series of proper rotation components (PRCs) by ITD, and the FDispEn of each PRC is calculated. Then, the correlation between each PRC and the original signal is calculated, and the FDispEn of each PRC is analyzed to select the Max-relative PRC fluctuation-based dispersion entropy as the feature parameter. Finally, by comparing the Max-relative PRC fluctuation-based dispersion entropy of a certain number of the above ten types of ship-radiated noise signals with FDispEn, it is found that the Max-relative PRC fluctuation-based dispersion entropy is at the same level for similar ship-radiated noise but is distinct for different types of ship-radiated noise. The Max-relative PRC fluctuation-based dispersion entropy is fed as the feature vector into a support vector machine (SVM) classifier to classify and recognize the ten types of ship-radiated noise. The experimental results demonstrate that the recognition rate of the proposed method reaches 95.8763%. Consequently, the proposed method can effectively classify ship-radiated noise.

Functional Classification of Urban Parks Based on Urban Functional Zone and Crowd-Sourced Geographical Data

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10120824 ◽

2021 ◽

Vol 10 (12) ◽

pp. 824

Author(s):

Su Cao ◽

Shihong Du ◽

Shuwen Yang ◽

Shouhang Du

Keyword(s):

Data Fusion ◽

Language Processing ◽

Data Extraction ◽

Urban Parks ◽

Urban Park ◽

Social Functions ◽

Functional Zone ◽

Two Cities ◽

Different Types

Urban parks have important impacts on urban ecosystems and in disaster prevention. They also have diverse social functions that are important to the living conditions and spatial structures of cities. Identifying and classifying the different types of urban parks are important for analyzing the sustainable development and the greening progress in cities. Existing studies have predominantly focused on the data extraction of urban green spaces as a whole, while there have been relatively few studies that have considered different categories of urban parks and their impact, which makes it difficult to characterize or predict the spatial distribution and structures of urban parks and limits further refinement of urban research. At present, the classification of urban parks relies on the physical features observed in remote sensing images, but these methods are limited when mapping the diverse functions and attributes of urban parks. Crowd-sourced geographic data may more accurately express the social functions of points of interest (POIs) in cities, and, therefore, employing open data sources may assist in data extraction and the classification of different types of urban parks. This paper proposed a multi-source data fusion approach for urban park classification including POI and urban functional zone (UFZ) data. First, the POI data were automatically reclassified using improved natural language processing (NLP) (i.e., text similarity measurements and topic modeling) to establish the links between urban park green-space types and POIs. The reclassified POI data as well as the UFZ data were then subjected to scene-based data fusion, and various types of urban parks were extracted using data attribute analysis and social attribute recognition for urban park mapping. Experimental analysis was conducted across Beijing and Hangzhou to verify the effectiveness of the proposed method, which had an overall classification accuracy of 82.8%. Finally, the urban park types of the two cities were compared and analyzed to obtain the characteristics of urban park types and structures in the two cities, which have different climates and urban structures.

Sentiment Analysis for Social Media using SVM Classifier of Machine Learning

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.i1107.0789s419 ◽

2019 ◽

Vol 8 (9S4) ◽

pp. 39-47

Keyword(s):

Machine Learning ◽

Social Media ◽

Feature Selection ◽

Sentiment Analysis ◽

Language Processing ◽

Cuckoo Search ◽

Support Vector ◽

Svm Classifier ◽

Feature Selection Technique ◽

Performance Factors

Sentiment analysis is an area of natural language processing (NLP) and machine learning in which text is categorized into predefined classes, i.e., positive and negative. As the internet and social media both grow day by day, customers now provide far more feedback than before. Text generated through social media, blogs, posts, product reviews, etc. has become the best-suited source for gauging consumer sentiment about a particular product. Features are important for the classification task: the better the features are optimized, the more accurate the results. Therefore, this research paper proposes a hybrid feature selection technique that combines particle swarm optimization (PSO) and cuckoo search. Due to the subjective nature of social media reviews, the hybrid feature selection technique outperforms the traditional technique. Performance factors such as f-measure, recall, precision, and accuracy are tested on a Twitter dataset using a Support Vector Machine (SVM) classifier and compared with a convolutional neural network. Experimental results on the basis of different parameters show that the proposed work outperforms the existing work.

Systematic Reviews of Health Care Interventions: An Essential Component of Health Sciences Graduate Programs

International Journal of Nursing Education Scholarship ◽

10.2202/1548-923x.1042 ◽

2004 ◽

Vol 1 (1) ◽

Author(s):

Shelley Peacock ◽

Dorothy Forbes

Keyword(s):

Systematic Review ◽

Systematic Reviews ◽

Graduate Programs ◽

Data Extraction ◽

Research Question ◽

Opportunity To Learn ◽

Health Science ◽

Policy Makers ◽

Unpublished Studies ◽

Unpublished Research

Systematic reviews are an objective, rigorous assessment of both published and unpublished research that enables the reviewer to make recommendations to clinicians, policy-makers, consumers, and researchers. The steps in a systematic review include: (a) developing a research question, (b) developing relevance and validity tools, (c) conducting a thorough literature search of published and unpublished studies, (d) using relevance and validity tools to assess the studies, (e) completing data extraction for each study, (f) synthesizing the findings, and (g) writing the report. The purpose of this paper is to demonstrate the value of providing health science graduate students with the opportunity to learn about the conduct of a systematic review. An example of a thesis utilizing the method of a systematic review is presented.

Voice Feature Extraction for Gender and Emotion Recognition

ITM Web of Conferences ◽

10.1051/itmconf/20214003008 ◽

2021 ◽

Vol 40 ◽

pp. 03008

Author(s):

Madhu M. Nashipudimath ◽

Pooja Pillai ◽

Anupama Subramanian ◽

Vani Nair ◽

Sarah Khalife

Keyword(s):

Feature Extraction ◽

Data Extraction ◽

Principal Component ◽

Feature Reduction ◽

Healthcare Sector ◽

Support Vector ◽

Svm Classifier ◽

Human Machine Interaction ◽

Interaction Domain ◽

Voice Activity Detector

Voice recognition plays a key role in spoken communication, helping to identify a person's emotions as reflected in the voice. Gender classification through speech is a popular Human Computer Interaction (HCI) method because determining gender by computer is hard. This led to the development of a model for "Voice feature extraction for Emotion and Gender Recognition". The speech signal consists of semantic information and speaker information (gender, age, emotional state), accompanied by noise. Females and males have specific vocal traits because of their acoustical and perceptual variations, along with a variety of emotions that bring their own specific perceptions. Feature extraction in this area requires pre-processing of the data, which is necessary for increasing accuracy. The proposed model follows steps such as data extraction, pre-processing using a Voice Activity Detector (VAD), feature extraction using Mel-Frequency Cepstral Coefficients (MFCC), feature reduction by Principal Component Analysis (PCA), and classification with a Support Vector Machine (SVM) classifier. The proposed combination of techniques produced better results, which can be useful in the healthcare sector, virtual assistants, security applications, and other fields related to the Human Machine Interaction domain.
