Deep Level Analysis of Legitimacy in Bengali News Sentences

Soma Das; Pooja Rai; Sanjay Chatterji

doi:10.1145/3459928

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3459928 ◽

2022 ◽

Vol 21 (1) ◽

pp. 1-18

Author(s):

Soma Das ◽

Pooja Rai ◽

Sanjay Chatterji

Keyword(s):

Deep Level ◽

Research Direction ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

News Article ◽

Support Vector ◽

Potential Threat ◽

Annotation Guideline ◽

Level 1 ◽

Level 2

The rapid growth of misinformation in news articles poses a potential threat to society. Hence, the detection of misinformation in news data has become an appealing research area. Annotating and detecting distorted sentences in news articles is an immediate need in this research direction. Therefore, an attempt has been made to formulate a legitimacy annotation guideline, followed by annotation and detection of legitimacy in Bengali e-papers. The sentence-level manual annotation of Bengali news has been carried out at two levels, namely “Level-1 Shallow Level Classification” and “Level-2 Deep Level Classification”, based on semantic properties of Bengali sentences. A total of 1,300 anonymous Bengali e-paper sentences have been tagged using the guideline-based tags for both levels. The annotation guideline has been validated by applying benchmark supervised machine learning algorithms using lexical, syntactic, domain-specific, and Level-2 specific features at both levels. The performance of these classifiers is evaluated in terms of Accuracy, Precision, Recall, and F-Measure. At both levels, Support Vector Machine outperforms the other benchmark classifiers, with an accuracy of 72% at Level-1 and 65% at Level-2.
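
The four evaluation measures named in this abstract can be computed from gold and predicted tags as in the following minimal sketch. The tag names `legit` and `distorted` and the sample data are hypothetical placeholders, not the paper's tag set or results.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive):
    """Accuracy, precision, recall and F-measure, treating `positive` as the target class."""
    pairs = Counter(zip(y_true, y_pred))
    tp = sum(n for (t, p), n in pairs.items() if t == positive and p == positive)
    fp = sum(n for (t, p), n in pairs.items() if t != positive and p == positive)
    fn = sum(n for (t, p), n in pairs.items() if t == positive and p != positive)
    correct = sum(n for (t, p), n in pairs.items() if t == p)

    accuracy = correct / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}


# Hypothetical sentence-level tags (illustration only).
y_true = ["legit", "legit", "distorted", "distorted", "legit", "distorted"]
y_pred = ["legit", "distorted", "distorted", "legit", "legit", "distorted"]
scores = binary_metrics(y_true, y_pred, positive="distorted")
```

For a multi-class tag set such as a deeper annotation level, the same function could be called once per tag and the results averaged.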

Financial Context News Sentiment Analysis for the Lithuanian Language

Applied Sciences ◽

10.3390/app11104443 ◽

2021 ◽

Vol 11 (10) ◽

pp. 4443

Author(s):

Rokas Štrimaitis ◽

Pavel Stefanovič ◽

Simona Ramanauskaitė ◽

Asta Slotkienė

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Short Term Memory ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Experimental Investigations ◽

Support Vector ◽

Applied Machine Learning ◽

Bayes Algorithm ◽

Website Content

Financial area analysis is not limited to enterprise performance analysis. It is worth analyzing as wide an area as possible to obtain a full impression of a specific enterprise. News website content is a data source that expresses the public’s opinion on enterprise operations, status, etc. Therefore, it is worth analyzing the text of news portal articles. Sentiment analysis methods for English texts and financial texts exist and are accurate, whereas work on the more complex Lithuanian language has mostly concentrated on sentiment analysis of comment texts and does not provide high accuracy. Therefore, in this paper, a supervised machine learning model was implemented to assign sentiment to financial context news gathered from Lithuanian language websites. The analysis was made using three classification algorithms commonly used in the field of sentiment analysis. Hyperparameter optimization using grid search was performed to discover the best parameters of each classifier. All experimental investigations were made using newly collected datasets from four Lithuanian news websites. The results of the applied machine learning algorithms show that the highest accuracy is obtained using a non-balanced dataset, via the multinomial Naive Bayes algorithm (71.1%). The accuracies of the other algorithms were slightly lower: long short-term memory (71%) and support vector machine (70.4%).
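
As a rough illustration of grid search over a classifier's hyperparameters, the sketch below fits a small multinomial Naive Bayes model for each candidate smoothing value and keeps the one with the best validation accuracy. The implementation, the `alpha` grid, and the English toy headlines are assumptions made for illustration; the paper's actual features, grid, and data are not given in this abstract.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, alpha):
    """Fit multinomial Naive Bayes; alpha is the additive smoothing constant (must be > 0)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = doc.split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab, alpha


def predict_nb(model, doc):
    """Return the class with the highest log posterior for one document."""
    class_counts, word_counts, vocab, alpha = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for token in doc.split():
            if token in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][token] + alpha)
                                  / (total_words + alpha * len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def grid_search(train, valid, alphas):
    """Try every alpha and return (best alpha, validation accuracy per alpha)."""
    results = {}
    valid_docs, valid_labels = valid
    for alpha in alphas:
        model = train_nb(train[0], train[1], alpha)
        hits = sum(predict_nb(model, d) == y
                   for d, y in zip(valid_docs, valid_labels))
        results[alpha] = hits / len(valid_labels)
    return max(results, key=results.get), results


# Hypothetical toy data (illustration only).
train = (["profit rises strongly", "record profit growth",
          "loss widens sharply", "bankruptcy loss fears"],
         ["pos", "pos", "neg", "neg"])
valid = (["profit growth", "loss fears"], ["pos", "neg"])
alphas = [0.1, 0.5, 1.0]
best_alpha, results = grid_search(train, valid, alphas)
```

With several hyperparameters, the loop over `alphas` would become a loop over the Cartesian product of the candidate values, for example via `itertools.product`.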

PREDICTIVE MODELLING AND ANALYTICS FOR DIABETES USING A MACHINE LEARNING APPROACH

INFORMATION TECHNOLOGY IN INDUSTRY ◽

10.17762/itii.v9i1.121 ◽

2021 ◽

Vol 9 (1) ◽

pp. 215-223

Author(s):

Prateek Mishra, Dr. Anurag Sharma, Dr. Abhishek Badholia

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Machine Learning Algorithms ◽

Computational Method ◽

Supervised Machine Learning ◽

Undiagnosed Diabetes ◽

Support Vector ◽

Entire Body ◽

Data Manipulation ◽

Kernel Support Vector Machine

Diabetes is a major disorder whose adverse effects can be seen throughout the entire body. Undiagnosed diabetes can severely increase the risk of complications such as diabetic nephropathy, cardiac stroke, and other disorders. People around the globe suffer from this disease. Early detection of the disease is crucial for a healthy life. As the number of diabetes cases is increasing rapidly, the disease may become a matter of worldwide concern. Machine Learning (ML) is a computational method for automatic learning from experience, and it may improve the chances of more accurate predictions. In the current research, machine learning techniques are applied to the Pima Indian diabetes dataset, using the R data manipulation tool to develop trends and detect risk factor patterns. Using R, five different predictive models are developed and analyzed for categorizing patients as diabetic or non-diabetic. The supervised machine learning algorithms used for this purpose are multifactor dimensionality reduction (MDR), k-nearest neighbor (k-NN), artificial neural network (ANN), radial basis function (RBF) kernel support vector machine, and linear kernel support vector machine (SVM-linear).
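
The two support vector machine variants named here differ only in their kernel function. The sketch below writes out the standard linear and RBF kernels for illustration. It is in Python rather than the R used in the paper, and the feature vectors and the `gamma` value are hypothetical.

```python
import math


def linear_kernel(x, y):
    """Linear kernel: the plain dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Hypothetical, already-scaled feature vectors for two patients.
patient_a = [0.0, 0.0]
patient_b = [1.0, 1.0]

k_lin = linear_kernel([1.0, 2.0], [3.0, 4.0])
k_rbf_same = rbf_kernel(patient_a, patient_a, gamma=0.5)
k_rbf_diff = rbf_kernel(patient_a, patient_b, gamma=0.5)
```

The RBF kernel equals 1 for identical inputs and decays toward 0 as the inputs move apart, which lets the classifier draw non-linear decision boundaries. The linear kernel restricts it to a separating hyperplane.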

Encrypted DNP3 Traffic Classification Using Supervised Machine Learning Algorithms

Machine Learning and Knowledge Extraction ◽

10.3390/make1010022 ◽

2019 ◽

Vol 1 (1) ◽

pp. 384-399 ◽

Cited By ~ 2

Author(s):

Thais de Toledo ◽

Nunzio Torrisi

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Decision Tree ◽

Smart Grids ◽

Learning Algorithms ◽

Electric Utility ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Support Vector ◽

Communication Link

The Distributed Network Protocol (DNP3) is predominantly used by the electric utility industry and, consequently, in smart grids. The Peekaboo attack was created to compromise DNP3 traffic, in which a man-in-the-middle on a communication link can capture and drop selected encrypted DNP3 messages by using support vector machine learning algorithms. The communication networks of smart grids are an important part of their infrastructure, so it is of critical importance to keep this communication secure and reliable. The main contribution of this paper is to compare the use of machine learning techniques to classify messages of the same protocol exchanged in encrypted tunnels. The study considers four simulated cases of encrypted DNP3 traffic scenarios and four different supervised machine learning algorithms: decision tree, nearest-neighbor, support vector machine, and naive Bayes. The results obtained show that it is possible to extend a Peekaboo attack over multiple substations, using a decision tree learning algorithm, and to gather significant information from a system that communicates using encrypted DNP3 traffic.

Measuring Biomechanical Risk in Lifting Load Tasks Through Wearable System and Machine-Learning Approach

Sensors ◽

10.3390/s20061557 ◽

2020 ◽

Vol 20 (6) ◽

pp. 1557 ◽

Cited By ~ 4

Author(s):

Ilaria Conforti ◽

Ilaria Mileti ◽

Zaccaria Del Prete ◽

Eduardo Palermo

Keyword(s):

Machine Learning ◽

Material Handling ◽

Learning Algorithm ◽

Occupational Injuries ◽

Wearable Sensors ◽

High Stress ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Support Vector ◽

Kinematic Parameters

Ergonomics evaluation through measurements of biomechanical parameters in real time has great potential for reducing non-fatal occupational injuries, such as work-related musculoskeletal disorders. Assuming a correct posture guarantees the avoidance of high stress on the back and on the lower extremities, while an incorrect posture increases spinal stress. Here, we propose a solution for the recognition of postural patterns through wearable sensors and machine-learning algorithms fed with kinematic data. Twenty-six healthy subjects equipped with eight wireless inertial measurement units (IMUs) performed manual material handling tasks, such as lifting and releasing small loads, using two postural patterns: correct and incorrect. Kinematic parameters, such as the range of motion of lower limb and lumbosacral joints, along with the displacement of the trunk with respect to the pelvis, were estimated from IMU measurements through a biomechanical model. Statistical differences were found for all kinematic parameters between the correct and the incorrect postures (p < 0.01). Moreover, with the weight increase of load in the lifting task, changes in hip and trunk kinematics were observed (p < 0.01). To automatically identify the two postures, a supervised machine-learning algorithm, a support vector machine, was trained, and an accuracy of 99.4% (specificity of 100%) was reached by using the measurements of all kinematic parameters as features. Meanwhile, an accuracy of 76.9% (specificity of 76.9%) was reached by using the measurements of kinematic parameters related to the trunk body segment.

Automatic recognition of self-acknowledged limitations in clinical research literature

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocy038 ◽

2018 ◽

Vol 25 (7) ◽

pp. 855-861 ◽

Cited By ~ 4

Author(s):

Halil Kilicoglu ◽

Graciela Rosemblat ◽

Mario Malički ◽

Gerben ter Riet

Keyword(s):

Machine Learning ◽

Clinical Research ◽

Binary Classification ◽

Classification Performance ◽

Research Literature ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Support Vector ◽

Rule Based ◽

Research Transparency

Objective: To automatically recognize self-acknowledged limitations in clinical research publications to support efforts in improving research transparency. Methods: To develop our recognition methods, we used a set of 8431 sentences from 1197 PubMed Central articles. A subset of these sentences was manually annotated for training/testing, and inter-annotator agreement was calculated. We cast the recognition problem as a binary classification task, in which we determine whether a given sentence from a publication discusses self-acknowledged limitations or not. We experimented with three methods: a rule-based approach based on document structure, supervised machine learning, and a semi-supervised method that uses self-training to expand the training set in order to improve classification performance. The machine learning algorithms used were logistic regression (LR) and support vector machines (SVM). Results: Annotators had good agreement in labeling limitation sentences (Krippendorff’s α = 0.781). Of the three methods used, the rule-based method yielded the best performance with 91.5% accuracy (95% CI [90.1-92.9]), while self-training with SVM led to a small improvement over fully supervised learning (89.9%, 95% CI [88.4-91.4] vs 89.6%, 95% CI [88.1-91.1]). Conclusions: The approach presented can be incorporated into the workflows of stakeholders focusing on research transparency to improve reporting of limitations in clinical studies.
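
The self-training idea mentioned in this abstract can be sketched as a loop: train on the labeled set, label the unlabeled examples the model is confident about, add them to the training set, and repeat. The sketch below uses a toy nearest-centroid classifier on a single numeric feature as a stand-in for the SVM used in the paper. The confidence measure, the `threshold` value, and the data are all assumptions made for illustration.

```python
def fit_centroids(xs, ys):
    """Toy stand-in classifier: one mean value per class on a 1-D feature."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_with_confidence(centroids, x):
    """Return (label, confidence); confidence is the relative margin between
    the two nearest centroids, in [0, 1]. Requires at least two classes."""
    dists = sorted((abs(x - c), y) for y, c in centroids.items())
    (d1, label), (d2, _) = dists[0], dists[1]
    confidence = (d2 - d1) / (d2 + d1) if d2 + d1 else 0.0
    return label, confidence


def self_train(labeled_x, labeled_y, unlabeled_x, threshold=0.8, max_rounds=10):
    """Grow the training set with confidently pseudo-labeled examples."""
    xs, ys, pool = list(labeled_x), list(labeled_y), list(unlabeled_x)
    for _ in range(max_rounds):
        model = fit_centroids(xs, ys)
        confident, remaining = [], []
        for x in pool:
            label, conf = predict_with_confidence(model, x)
            if conf >= threshold:
                confident.append((x, label))
            else:
                remaining.append(x)
        if not confident:
            break  # nothing new to add, so stop early
        for x, label in confident:
            xs.append(x)
            ys.append(label)
        pool = remaining
    return fit_centroids(xs, ys), xs, ys


# Hypothetical data: 1 = limitation sentence, 0 = other sentence.
model, train_x, train_y = self_train([0.0, 1.0], [0, 1], [0.05, 0.95, 0.5])
```

In this toy run the two examples near the class centroids are pseudo-labeled and added, while the ambiguous example at 0.5 is never confident enough and stays out of the training set.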

Bitcoin Theft Detection Based on Supervised Machine Learning Algorithms

Security and Communication Networks ◽

10.1155/2021/6643763 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Binjie Chen ◽

Fushan Wei ◽

Chunxiang Gu

Keyword(s):

Support Vector Machine ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Economic Losses ◽

Support Vector ◽

K Nearest Neighbor ◽

Security Threat ◽

Adaptive Boosting ◽

Supervised Methods ◽

Unsupervised Methods

Since its inception, Bitcoin has been subject to numerous thefts due to its enormous economic value. Hackers steal Bitcoin wallet keys to transfer Bitcoin from compromised users, causing huge economic losses to victims. To address the security threat of Bitcoin theft, supervised learning methods were used in this study to detect and provide warnings about Bitcoin theft events. To overcome the shortcomings of the existing work, more comprehensive features of Bitcoin transaction data were extracted, the unbalanced dataset was equalized, and five supervised methods—the k-nearest neighbor (KNN), support vector machine (SVM), random forest (RF), adaptive boosting (AdaBoost), and multi-layer perceptron (MLP) techniques—as well as three unsupervised methods—the local outlier factor (LOF), one-class support vector machine (OCSVM), and Mahalanobis distance-based approach (MDB)—were used for detection. The best performer among these algorithms was the RF algorithm, which achieved recall, precision, and F1 values of 95.9%. The experimental results showed that the designed features are more effective than the currently used ones. The results of the supervised methods were significantly better than those of the unsupervised methods, and the results of the supervised methods could be further improved after equalizing the training set.
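
The abstract does not say how the unbalanced dataset was equalized. One common option, shown here purely as an assumption, is random oversampling: minority-class rows are duplicated at random until every class matches the size of the largest one. The row values and the class names `normal` and `theft` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def oversample(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    target = max(len(items) for items in by_class.values())
    out_rows, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for row in items + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical transaction features: four normal rows, one theft row.
rows = [[0.1], [0.2], [0.3], [0.4], [9.0]]
labels = ["normal", "normal", "normal", "normal", "theft"]
balanced_rows, balanced_labels = oversample(rows, labels)
counts = Counter(balanced_labels)
```

Resampling of this kind is normally applied to the training split only, so that the test split keeps its original class distribution.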

Deep Learning and Conventional Machine Learning for Image-Based in-Situ Fault Detection During Laser Welding: A Comparative Study

10.20944/preprints202105.0272.v1 ◽

2021 ◽

Author(s):

Christian Knaak ◽

Moritz Kröger ◽

Frederic Schulze ◽

Peter Abels ◽

Arnold Gillner

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Near Infrared ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Support Vector ◽

Detection Rates ◽

Feature Extraction And Selection ◽

Welding Defects

An effective process monitoring strategy is a requirement for meeting the challenges posed by increasingly complex products and manufacturing processes. To address these needs, this study investigates a comprehensive scheme based on classical machine learning methods, deep learning algorithms, and feature extraction and selection techniques. In a first step, a novel deep learning architecture based on convolutional neural networks (CNN) and gated recurrent units (GRU) is introduced to predict the local weld quality based on mid-wave infrared (MWIR) and near-infrared (NIR) image data. The developed technology is used to discover critical welding defects including lack of fusion (false friends), sagging and lack of penetration, and geometric deviations of the weld seam. Additional work is conducted to investigate the significance of various geometrical, statistical, and spatio-temporal features extracted from the keyhole and weld pool regions. Furthermore, the performance of the proposed deep learning architecture is compared to that of classical supervised machine learning algorithms, such as multi-layer perceptron (MLP), logistic regression (LogReg), support vector machines (SVM), decision trees (DT), random forest (RF) and k-Nearest Neighbors (kNN). Optimal hyperparameters for each algorithm are determined by an extensive grid search. Ultimately, the three best classification models are combined into an ensemble classifier that yields the highest detection rates and achieves the most robust estimation of welding defects among all classifiers studied, which is validated on previously unknown welding trials.
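
The abstract does not state how the three best models are combined. A simple possibility, assumed here for illustration, is majority voting over the per-sample predictions. The class names `ok` and `defect` and the predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine equal-length prediction lists (one per model) by majority vote."""
    combined = []
    for votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the highest vote count.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions of three models on three weld images.
model_a = ["ok", "defect", "ok"]
model_b = ["ok", "ok", "defect"]
model_c = ["defect", "defect", "defect"]
ensemble = majority_vote([model_a, model_b, model_c])
```

With three models and two classes a tie cannot occur. With more classes, a tie-breaking rule such as averaging predicted probabilities would be needed.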

Detecting Real-Time Fall of Elderly People Using Machine Learning

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.39635 ◽

2021 ◽

Vol 9 (12) ◽

pp. 1913-1918

Author(s):

Prathima P

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Random Forest ◽

Elderly People ◽

Fall Detection ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Support Vector ◽

False Alarms ◽

Severe Injuries

Falls are a significant national health issue for elderly people, generally resulting in severe injuries when the person lies on the floor for an extended period without any aid after a serious fall. Thus, elders need to be cared for very attentively. A supervised machine learning based fall detection approach using an accelerometer and a gyroscope is devised. The system detects falls by classifying different actions as fall or non-fall events, and the caretaker is alerted as soon as the person falls. The public dataset SisFall, with an efficient set of features, is used to identify falls. The Random Forest (RF) and Support Vector Machine (SVM) machine learning algorithms are employed to detect falls with fewer false alarms. The SVM algorithm obtains the highest accuracy, 99.23%, outperforming the RF algorithm. Keywords: Fall detection, Machine learning, Supervised classification, SisFall, Activities of daily living, Wearable sensors, Random Forest, Support Vector Machine

Artificial Intelligence Supports Decision Making during Open-Chest Surgery of Rare Congenital Heart Defects

Journal of Clinical Medicine ◽

10.3390/jcm10225330 ◽

2021 ◽

Vol 10 (22) ◽

pp. 5330

Author(s):

Francesco Paolo Lo Muzio ◽

Giacomo Rozzi ◽

Stefano Rossi ◽

Giovanni Battista Luciani ◽

Ruben Foresti ◽

...

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Decision Making ◽

Intraoperative Imaging ◽

Complex Function ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Support Vector ◽

Open Chest ◽

Chest Surgery

The human right ventricle is barely monitored during open-chest surgery due to the absence of intraoperative imaging techniques capable of elaborating its complex function. Accordingly, artificial intelligence could not be adopted for this specific task. We recently proposed a video-based approach for the real-time evaluation of the epicardial kinematics to support medical decisions. Here, we employed two supervised machine learning algorithms based on our technique to predict the patients’ outcomes before chest closure. Videos of the beating hearts were acquired before and after pulmonary valve replacement in twelve Tetralogy of Fallot patients and recordings were properly labeled as the “unhealthy” and “healthy” classes. We extracted frequency-domain-related features to train different supervised machine learning models and selected their best characteristics via 10-fold cross-validation and optimization processes. Decision surfaces were built to classify two additional patients having good and unfavorable clinical outcomes. The k-nearest neighbors and support vector machine showed the highest prediction accuracy; the patients’ class was identified with a true positive rate ≥95% and the decision surfaces correctly classified the additional patients in the “healthy” (good outcome) or “unhealthy” (unfavorable outcome) classes. We demonstrated that classifiers employed with our video-based technique may aid cardiac surgeons in decision making before chest closure.
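
The 10-fold cross-validation mentioned in this abstract can be sketched as follows: the sample indices are split into ten nearly equal folds, and each fold serves once as the test set while the rest form the training set. The sample count of 25 is a hypothetical value, and the sketch does not shuffle or group recordings by patient, which the authors may have done.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    # The first (n_samples % k) folds get one extra sample each.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Hypothetical dataset of 25 labeled recordings.
folds = list(k_fold_indices(25, k=10))
```

A model would be trained and scored once per fold, and the ten scores averaged to compare candidate classifiers or settings.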

Hydraulic Flow Unit Classification and Prediction Using Machine Learning Techniques: A Case Study from the Nam Con Son Basin, Offshore Vietnam

Energies ◽

10.3390/en14227714 ◽

2021 ◽

Vol 14 (22) ◽

pp. 7714

Author(s):

Ha Quang Man ◽

Doan Huy Hien ◽

Kieu Duy Thong ◽

Bui Viet Dung ◽

Nguyen Minh Hoa ◽

...

Keyword(s):

Machine Learning ◽

Machine Learning Algorithms ◽

Flow Unit ◽

Supervised Machine Learning ◽

Support Vector ◽

Learning Methods ◽

Log Data ◽

Hydraulic Flow ◽

Core Data ◽

Machine Learning Methods

The test study area is the Miocene reservoir of the Nam Con Son Basin, offshore Vietnam. In the study, we used unsupervised learning to automatically cluster hydraulic flow units (HU) based on flow zone indicators (FZI) in a core plug dataset. Then we applied supervised learning to predict HU by combining core and well log data. We tested several machine learning algorithms. In the first phase, we derived hydraulic flow unit clustering of porosity and permeability of core data using unsupervised machine learning methods such as Ward’s method, K-means, Self-Organizing Map (SOM), and Fuzzy C-means (FCM). Then we applied supervised machine learning methods including Artificial Neural Networks (ANN), Support Vector Machines (SVM), Boosted Tree (BT), and Random Forest (RF). We combined both core and log data to predict HU logs for the full well section of the wells without core data. We used four wells with six logs (GR, DT, NPHI, LLD, LSS and RHOB) and 578 cores from the Miocene reservoir to train, validate, and test the models. Our goal was to show that the correct combination of core and well log data would provide reservoir engineers with a tool for HU classification and estimation of permeability in a continuous geological profile. Our research showed that machine learning effectively boosts the prediction of permeability, reduces uncertainty in reservoir modeling, and improves project economics.
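
The flow zone indicator used for clustering is commonly computed from core porosity and permeability as sketched below. This is the widely used definition, assumed here to match the paper: the reservoir quality index in micrometres, with permeability in millidarcies and porosity as a fraction, is divided by the normalized porosity. The sample core plug values are hypothetical.

```python
import math


def flow_zone_indicator(permeability_md, porosity):
    """FZI in micrometres from permeability (mD) and fractional porosity."""
    if not 0.0 < porosity < 1.0:
        raise ValueError("porosity must be a fraction between 0 and 1")
    rqi = 0.0314 * math.sqrt(permeability_md / porosity)  # reservoir quality index
    phi_z = porosity / (1.0 - porosity)                   # normalized porosity
    return rqi / phi_z


# Hypothetical core plug: 100 mD permeability, 20% porosity.
fzi = flow_zone_indicator(permeability_md=100.0, porosity=0.2)
```

Core plugs with similar FZI values are then grouped into the same hydraulic flow unit, for example by clustering the logarithm of FZI.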
