Deep Level Analysis of Legitimacy in Bengali News Sentences

Soma Das ◽  
Pooja Rai ◽  
Sanjay Chatterji

The tremendous increase in the growth of misinformation in news articles has the potential threat for the adverse effects on society. Hence, the detection of misinformation in news data has become an appealing research area. The task of annotating and detecting distorted news article sentences is the immediate need in this research direction. Therefore, an attempt has been made to formulate the legitimacy annotation guideline followed by annotation and detection of the legitimacy in Bengali e-papers. The sentence-level manual annotation of Bengali news has been carried out in two levels, namely “Level-1 Shallow Level Classification” and “Level-2 Deep Level Classification” based on semantic properties of Bengali sentences. The tagging of 1,300 anonymous Bengali e-paper sentences has been done using the formulated guideline-based tags for both levels. The validation of the annotation guideline has been done by applying benchmark supervised machine learning algorithms using the lexical feature, syntactic feature, domain-specific feature, and Level-2 specific feature in both levels. Performance evaluation of these classifiers is done in terms of Accuracy, Precision, Recall, and F-Measure. In both levels, Support Vector Machine outperforms other benchmark classifiers with an accuracy of 72% and 65% in Level-1 and Level-2, respectively.

2021 ◽  
Vol 11 (10) ◽  
pp. 4443
Rokas Štrimaitis ◽  
Pavel Stefanovič ◽  
Simona Ramanauskaitė ◽  
Asta Slotkienė

Financial area analysis is not limited to enterprise performance analysis. It is worth analyzing as wide an area as possible to obtain the full impression of a specific enterprise. News website content is a datum source that expresses the public’s opinion on enterprise operations, status, etc. Therefore, it is worth analyzing the news portal article text. Sentiment analysis in English texts and financial area texts exist, and are accurate, the complexity of Lithuanian language is mostly concentrated on sentiment analysis of comment texts, and does not provide high accuracy. Therefore in this paper, the supervised machine learning model was implemented to assign sentiment analysis on financial context news, gathered from Lithuanian language websites. The analysis was made using three commonly used classification algorithms in the field of sentiment analysis. The hyperparameters optimization using the grid search was performed to discover the best parameters of each classifier. All experimental investigations were made using the newly collected datasets from four Lithuanian news websites. The results of the applied machine learning algorithms show that the highest accuracy is obtained using a non-balanced dataset, via the multinomial Naive Bayes algorithm (71.1%). The other algorithm accuracies were slightly lower: a long short-term memory (71%), and a support vector machine (70.4%).

2021 ◽  
Vol 9 (1) ◽  
pp. 215-223
Prateek Mishra, Dr.Anurag Sharma, Dr. Abhishek Badholia

Adverse effects can be seen in the entire body due to the major disorders known as Diabetes. The risk of dangers like diabetic nephropathy, cardiac stroke and other disorders can increase severally because of the undiagnosed diabetes. Around the globe the people are suffering from this disease. For a healthy life early detection of this disease is very curtail. As the causes of the diabetes is increasing rapidly this disease might turn up as a reason for worldwide concern. Increasing the chances for a more accurate predictions and form experiences automatic learning by computational method may be provided by Machine Learning (ML). With the help of R data manipulation tool for trends development and with risk factor patterns detection in Pima Indian diabetes technique of machine learning is been used in the current researches. With the use of R data manipulation tool analysis and development five different predictive models is done for the categorization of patients into diabetic and non- diabetic.  supervised machine learning algorithms namely multifactor dimensionality reduction (MDR), k-nearest neighbor (k-NN), artificial neural network (ANN) radial basis function (RBF) kernel support vector machine and linear kernel support vector machine (SVM-linear) are used for this purpose.

2019 ◽  
Vol 1 (1) ◽  
pp. 384-399 ◽  
Thais de Toledo ◽  
Nunzio Torrisi

The Distributed Network Protocol (DNP3) is predominately used by the electric utility industry and, consequently, in smart grids. The Peekaboo attack was created to compromise DNP3 traffic, in which a man-in-the-middle on a communication link can capture and drop selected encrypted DNP3 messages by using support vector machine learning algorithms. The communication networks of smart grids are a important part of their infrastructure, so it is of critical importance to keep this communication secure and reliable. The main contribution of this paper is to compare the use of machine learning techniques to classify messages of the same protocol exchanged in encrypted tunnels. The study considers four simulated cases of encrypted DNP3 traffic scenarios and four different supervised machine learning algorithms: Decision tree, nearest-neighbor, support vector machine, and naive Bayes. The results obtained show that it is possible to extend a Peekaboo attack over multiple substations, using a decision tree learning algorithm, and to gather significant information from a system that communicates using encrypted DNP3 traffic.

Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1557 ◽  
Ilaria Conforti ◽  
Ilaria Mileti ◽  
Zaccaria Del Prete ◽  
Eduardo Palermo

Ergonomics evaluation through measurements of biomechanical parameters in real time has a great potential in reducing non-fatal occupational injuries, such as work-related musculoskeletal disorders. Assuming a correct posture guarantees the avoidance of high stress on the back and on the lower extremities, while an incorrect posture increases spinal stress. Here, we propose a solution for the recognition of postural patterns through wearable sensors and machine-learning algorithms fed with kinematic data. Twenty-six healthy subjects equipped with eight wireless inertial measurement units (IMUs) performed manual material handling tasks, such as lifting and releasing small loads, with two postural patterns: correctly and incorrectly. Measurements of kinematic parameters, such as the range of motion of lower limb and lumbosacral joints, along with the displacement of the trunk with respect to the pelvis, were estimated from IMU measurements through a biomechanical model. Statistical differences were found for all kinematic parameters between the correct and the incorrect postures (p < 0.01). Moreover, with the weight increase of load in the lifting task, changes in hip and trunk kinematics were observed (p < 0.01). To automatically identify the two postures, a supervised machine-learning algorithm, a support vector machine, was trained, and an accuracy of 99.4% (specificity of 100%) was reached by using the measurements of all kinematic parameters as features. Meanwhile, an accuracy of 76.9% (specificity of 76.9%) was reached by using the measurements of kinematic parameters related to the trunk body segment.

2018 ◽  
Vol 25 (7) ◽  
pp. 855-861 ◽  
Halil Kilicoglu ◽  
Graciela Rosemblat ◽  
Mario Malički ◽  
Gerben ter Riet

Abstract Objective To automatically recognize self-acknowledged limitations in clinical research publications to support efforts in improving research transparency. Methods To develop our recognition methods, we used a set of 8431 sentences from 1197 PubMed Central articles. A subset of these sentences was manually annotated for training/testing, and inter-annotator agreement was calculated. We cast the recognition problem as a binary classification task, in which we determine whether a given sentence from a publication discusses self-acknowledged limitations or not. We experimented with three methods: a rule-based approach based on document structure, supervised machine learning, and a semi-supervised method that uses self-training to expand the training set in order to improve classification performance. The machine learning algorithms used were logistic regression (LR) and support vector machines (SVM). Results Annotators had good agreement in labeling limitation sentences (Krippendorff’s α = 0.781). Of the three methods used, the rule-based method yielded the best performance with 91.5% accuracy (95% CI [90.1-92.9]), while self-training with SVM led to a small improvement over fully supervised learning (89.9%, 95% CI [88.4-91.4] vs 89.6%, 95% CI [88.1-91.1]). Conclusions The approach presented can be incorporated into the workflows of stakeholders focusing on research transparency to improve reporting of limitations in clinical studies.

2021 ◽  
Vol 2021 ◽  
pp. 1-10
Binjie Chen ◽  
Fushan Wei ◽  
Chunxiang Gu

Since its inception, Bitcoin has been subject to numerous thefts due to its enormous economic value. Hackers steal Bitcoin wallet keys to transfer Bitcoin from compromised users, causing huge economic losses to victims. To address the security threat of Bitcoin theft, supervised learning methods were used in this study to detect and provide warnings about Bitcoin theft events. To overcome the shortcomings of the existing work, more comprehensive features of Bitcoin transaction data were extracted, the unbalanced dataset was equalized, and five supervised methods—the k-nearest neighbor (KNN), support vector machine (SVM), random forest (RF), adaptive boosting (AdaBoost), and multi-layer perceptron (MLP) techniques—as well as three unsupervised methods—the local outlier factor (LOF), one-class support vector machine (OCSVM), and Mahalanobis distance-based approach (MDB)—were used for detection. The best performer among these algorithms was the RF algorithm, which achieved recall, precision, and F1 values of 95.9%. The experimental results showed that the designed features are more effective than the currently used ones. The results of the supervised methods were significantly better than those of the unsupervised methods, and the results of the supervised methods could be further improved after equalizing the training set.

Christian Knaak ◽  
Moritz Kröger ◽  
Frederic Schulze ◽  
Peter Abels ◽  
Arnold Gillner

An effective process monitoring strategy is a requirement for meeting the challenges posed by increasingly complex products and manufacturing processes. To address these needs, this study investigates a comprehensive scheme based on classical machine learning methods, deep learning algorithms, and feature extraction and selection techniques. In a first step, a novel deep learning architecture based on convolutional neural networks (CNN) and gated recurrent units (GRU) is introduced to predict the local weld quality based on mid-wave infrared (MWIR) and near-infrared (NIR) image data. The developed technology is used to discover critical welding defects including lack of fusion (false friends), sagging and lack of penetration, and geometric deviations of the weld seam. Additional work is conducted to investigate the significance of various geometrical, statistical, and spatio-temporal features extracted from the keyhole and weld pool regions. Furthermore, the performance of the proposed deep learning architecture is compared to that of classical supervised machine learning algorithms, such as multi-layer perceptron (MLP), logistic regression (LogReg), support vector machines (SVM), decision trees (DT), random forest (RF) and k-Nearest Neighbors (kNN). Optimal hyperparameters for each algorithm are determined by an extensive grid search. Ultimately, the three best classification models are combined into an ensemble classifier that yields the highest detection rates and achieves the most robust estimation of welding defects among all classifiers studied, which is validated on previously unknown welding trials.

Prathima P

Abstract: Fall is a significant national health issue for the elderly people, generally resulting in severe injuries when the person lies down on the floor over an extended period without any aid after experiencing a great fall. Thus, elders need to be cared very attentively. A supervised-machine learning based fall detection approach with accelerometer, gyroscope is devised. The system can detect falls by grouping different actions as fall or non-fall events and the care taker is alerted immediately as soon as the person falls. The public dataset SisFall with efficient class of features is used to identify fall. The Random Forest (RF) and Support Vector Machine (SVM) machine learning algorithms are employed to detect falls with lesser false alarms. The SVM algorithm obtain a highest accuracy of 99.23% than RF algorithm. Keywords: Fall detection, Machine learning, Supervised classification, Sisfall, Activities of daily living, Wearable sensors, Random Forest, Support Vector Machine

2021 ◽  
Vol 10 (22) ◽  
pp. 5330
Francesco Paolo Lo Muzio ◽  
Giacomo Rozzi ◽  
Stefano Rossi ◽  
Giovanni Battista Luciani ◽  
Ruben Foresti ◽  

The human right ventricle is barely monitored during open-chest surgery due to the absence of intraoperative imaging techniques capable of elaborating its complex function. Accordingly, artificial intelligence could not be adopted for this specific task. We recently proposed a video-based approach for the real-time evaluation of the epicardial kinematics to support medical decisions. Here, we employed two supervised machine learning algorithms based on our technique to predict the patients’ outcomes before chest closure. Videos of the beating hearts were acquired before and after pulmonary valve replacement in twelve Tetralogy of Fallot patients and recordings were properly labeled as the “unhealthy” and “healthy” classes. We extracted frequency-domain-related features to train different supervised machine learning models and selected their best characteristics via 10-fold cross-validation and optimization processes. Decision surfaces were built to classify two additional patients having good and unfavorable clinical outcomes. The k-nearest neighbors and support vector machine showed the highest prediction accuracy; the patients’ class was identified with a true positive rate ≥95% and the decision surfaces correctly classified the additional patients in the “healthy” (good outcome) or “unhealthy” (unfavorable outcome) classes. We demonstrated that classifiers employed with our video-based technique may aid cardiac surgeons in decision making before chest closure.

Energies ◽  
2021 ◽  
Vol 14 (22) ◽  
pp. 7714
Ha Quang Man ◽  
Doan Huy Hien ◽  
Kieu Duy Thong ◽  
Bui Viet Dung ◽  
Nguyen Minh Hoa ◽  

The test study area is the Miocene reservoir of Nam Con Son Basin, offshore Vietnam. In the study we used unsupervised learning to automatically cluster hydraulic flow units (HU) based on flow zone indicators (FZI) in a core plug dataset. Then we applied supervised learning to predict HU by combining core and well log data. We tested several machine learning algorithms. In the first phase, we derived hydraulic flow unit clustering of porosity and permeability of core data using unsupervised machine learning methods such as Ward’s, K mean, Self-Organize Map (SOM) and Fuzzy C mean (FCM). Then we applied supervised machine learning methods including Artificial Neural Networks (ANN), Support Vector Machines (SVM), Boosted Tree (BT) and Random Forest (RF). We combined both core and log data to predict HU logs for the full well section of the wells without core data. We used four wells with six logs (GR, DT, NPHI, LLD, LSS and RHOB) and 578 cores from the Miocene reservoir to train, validate and test the data. Our goal was to show that the correct combination of cores and well logs data would provide reservoir engineers with a tool for HU classification and estimation of permeability in a continuous geological profile. Our research showed that machine learning effectively boosts the prediction of permeability, reduces uncertainty in reservoir modeling, and improves project economics.

Sign in / Sign up

Export Citation Format

Share Document