Neural Embeddings for Text Analysis: A Case Study in Neoliberal Discourse

This paper examines the notions of neoliberalism and the financialization and marketisation of public life by using computational tools such as sentence embeddings on a novel corpus of neoliberal articles. More specifically, we experimented with distributional semantics along with several Natural Language Processing (NLP) techniques and machine learning algorithms in order to extract conceptual dictionaries and “seed” words. Our findings show that sentence embeddings reveal repetitive patterns constructed around the given concepts and highlight the mechanical character of an ideology in its function of providing solutions, policies and constructing stereotypes. This work introduces a novel pipeline for computer-assisted research in discourse analysis and ideology.

Download Full-text

Comparative Analysis of Machine Learning Algorithms for Computer-Assisted Reporting Based on Fully Automated Cross-Lingual RadLex® Mappings

10.20944/preprints202004.0354.v1 ◽

2020 ◽

Author(s):

Máté E. Maros ◽

Chang Gyu Cho ◽

Andreas G. Junge ◽

Benedikt Kämpgen ◽

Victor Saase ◽

...

Keyword(s):

Machine Learning ◽

Language Processing ◽

Confusion Matrix ◽

Imbalanced Data ◽

Machine Learning Algorithms ◽

Imaging Biomarkers ◽

Brier Score ◽

Support Vector ◽

Computer Assisted ◽

Cross Lingual

Objectives: Studies evaluating machine learning (ML) algorithms on cross-lingual RadLex® mappings for developing context-sensitive radiological reporting tools are lacking. Therefore, we investigated whether ML-based approaches can be utilized to assist radiologists in providing key imaging biomarkers – such as The Alberta stroke programme early CT score (APECTS). Material and Methods: A stratified random sample (age, gender, year) of CT reports (n=206) with suspected ischemic stroke was generated out of 3997 reports signed off between 2015-2019. Three independent, blinded readers assessed these reports and manually annotated clinico-radiologically relevant key features. The primary outcome was whether ASPECTS should have been provided (yes/no: 154/52). For all reports, both the findings and impressions underwent cross-lingual (German to English) RadLex®-mappings using natural language processing. Well-established ML-algorithms including classification trees, random forests, elastic net, support vector machines (SVMs) and boosted trees were evaluated in a 5 x 5-fold nested cross-validation framework. Further, a linear classifier (fastText) was directly fitted on the German reports. Ensemble learning was used to provide robust importance rankings of these ML-algorithms. Performance was evaluated using derivates of the confusion matrix and metrics of calibration including AUC, brier score and log loss as well as visually by calibration plots. Results: On this imbalanced classification task SVMs showed the highest accuracies both on human-extracted- (87%) and fully automated RadLex® features (findings: 82.5%; impressions: 85.4%). FastText without pre-trained language model showed the highest accuracy (89.3%) and AUC (92%) on the impressions. Ensemble learner revealed that boosted trees, fastText and SVMs are the most important ML-classifiers. Boosted trees fitted on the findings showed the best overall calibration curve. Conclusions: Contextual ML-based assistance suggesting ASPECTS while reporting neuroradiological emergencies is feasible, even if ML-models are restricted to be developed on limited and highly imbalanced data sets.

Download Full-text

Digital Humanities: New Tools and New Knowledge

KnE Social Sciences ◽

10.18502/kss.v3i18.4770 ◽

2019 ◽

Author(s):

Simon Musgrave

Keyword(s):

Machine Learning ◽

Digital Humanities ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Distributional Semantics ◽

Computational Tools ◽

New Methods ◽

Digital World ◽

Theory Of Meaning ◽

New Knowledge

A key aspect of the rapidly growing field of Digital Humanities is the application of computational tools to problems in humanistic research, a process which can lead to exciting new knowledge. I will illustrate this development with examples from my own research and from that of other scholars showing how the new tools are applicable across many areas of research in the humanities. In particular, I will discuss how the recent development of machine learning algorithms has made it possible to investigate more fully insights based on a theory of meaning (distributional semantics) which is over 60 years old. Although most of my discussion will focus on the application of new methods for research in the humanities, I will end by switching the perspective and considering how such approaches can enrich education in the humanities and produce graduates equipped with diverse skills which will serve them well in our digital world.

Download Full-text

Twitter Sentiment Analysis Using Machine Learning Algorithms: A Case Study

2020 International Conference on Advances in Computing, Communication & Materials (ICACCM) ◽

10.1109/icaccm50413.2020.9213011 ◽

2020 ◽

Author(s):

Sheresh Zahoor ◽

Rajesh Rohilla

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Learning Algorithms ◽

Machine Learning Algorithms

Download Full-text

Citizen Participation and Machine Learning for a Better Democracy

Digital Government: Research and Practice (DGOV) ◽

10.1145/3452118 ◽

2021 ◽

Author(s):

Robert Procter ◽

Miguel Arana-Catania ◽

Felix-Anselm van Lier ◽

Nataliya Tkachenko ◽

Yulan He ◽

...

Keyword(s):

Machine Learning ◽

Citizen Participation ◽

Language Processing ◽

Information Overload ◽

City Council ◽

Decision Making Processes ◽

Development Goals ◽

Democratic Systems ◽

The City

The development of democratic systems is a crucial task as confirmed by its selection as one of the Millennium Sustainable Development Goals by the United Nations. In this article, we report on the progress of a project that aims to address barriers, one of which is information overload, to achieving effective direct citizen participation in democratic decision-making processes. The main objectives are to explore if the application of Natural Language Processing (NLP) and machine learning can improve citizens? experience of digital citizen participation platforms. Taking as a case study the ?Decide Madrid? Consul platform, which enables citizens to post proposals for policies they would like to see adopted by the city council, we used NLP and machine learning to provide new ways to (a) suggest to citizens proposals they might wish to support; (b) group citizens by interests so that they can more easily interact with each other; (c) summarise comments posted in response to proposals; (d) assist citizens in aggregating and developing proposals. Evaluation of the results confirms that NLP and machine learning have a role to play in addressing some of the barriers users of platforms such as Consul currently experience.

Download Full-text

A content spectral-based text representation

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-219248 ◽

2021 ◽

pp. 1-12

Author(s):

Melesio Crespo-Sanchez ◽

Ivan Lopez-Arevalo ◽

Edwin Aldana-Bobadilla ◽

Alejandro Molina-Villegas

Keyword(s):

Machine Learning ◽

Text Analysis ◽

Question Answering ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Text Representation ◽

Feature Vectors ◽

Learning Tasks ◽

Semantic Component ◽

Vector Representations

In the last few years, text analysis has grown as a keystone in several domains for solving many real-world problems, such as machine translation, spam detection, and question answering, to mention a few. Many of these tasks can be approached by means of machine learning algorithms. Most of these algorithms take as input a transformation of the text in the form of feature vectors containing an abstraction of the content. Most of recent vector representations focus on the semantic component of text, however, we consider that also taking into account the lexical and syntactic components the abstraction of content could be beneficial for learning tasks. In this work, we propose a content spectral-based text representation applicable to machine learning algorithms for text analysis. This representation integrates the spectra from the lexical, syntactic, and semantic components of text producing an abstract image, which can also be treated by both, text and image learning algorithms. These components came from feature vectors of text. For demonstrating the goodness of our proposal, this was tested on text classification and complexity reading score prediction tasks obtaining promising results.

Download Full-text

Estimating Lead Time Using Machine Learning Algorithms: A Case Study by a Textile Company

10.1109/asyu52992.2021.9599012 ◽

2021 ◽

Author(s):

Ceren Atik ◽

Recen Alp Kut ◽

Safak Birol

Keyword(s):

Machine Learning ◽

Lead Time ◽

Learning Algorithms ◽

Machine Learning Algorithms

Download Full-text

An Efficient SMOTE-Based Deep Learning Model for Heart Attack Prediction

Scientific Programming ◽

10.1155/2021/6621622 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Muhammad Waqar ◽

Hassan Dawood ◽

Hussain Dawood ◽

Nadeem Majeed ◽

Ameen Banjar ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Heart Attack ◽

High Reliability ◽

Learning Algorithms ◽

Research Work ◽

Machine Learning Algorithms ◽

Feature Engineering ◽

Unequal Distribution ◽

The Given

Cardiac disease treatments are often being subjected to the acquisition and analysis of vast quantity of digital cardiac data. These data can be utilized for various beneficial purposes. These data’s utilization becomes more important when we are dealing with critical diseases like a heart attack where patient life is often at stake. Machine learning and deep learning are two famous techniques that are helping in making the raw data useful. Some of the biggest problems that arise from the usage of the aforementioned techniques are massive resource utilization, extensive data preprocessing, need for features engineering, and ensuring reliability in classification results. The proposed research work presents a cost-effective solution to predict heart attack with high accuracy and reliability. It uses a UCI dataset to predict the heart attack via various machine learning algorithms without the involvement of any feature engineering. Moreover, the given dataset has an unequal distribution of positive and negative classes which can reduce performance. The proposed work uses a synthetic minority oversampling technique (SMOTE) to handle given imbalance data. The proposed system discarded the need of feature engineering for the classification of the given dataset. This led to an efficient solution as feature engineering often proves to be a costly process. The results show that among all machine learning algorithms, SMOTE-based artificial neural network when tuned properly outperformed all other models and many existing systems. The high reliability of the proposed system ensures that it can be effectively used in the prediction of the heart attack.

Download Full-text

DEVELOPMENT OF A MACHINE LEARNING ALGORITHM TO PREDICT AUTHOR’S AGE FROM TEXT

International Journal of Research -GRANTHAALAYAH ◽

10.29121/granthaalayah.v7.i10.2019.408 ◽

2020 ◽

Vol 7 (10) ◽

pp. 380-389

Author(s):

Asogwa D.C ◽

Anigbogu S.O ◽

Anigbogu G.N ◽

Efozia F.N

Keyword(s):

Machine Learning ◽

Language Processing ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Machine Learning Algorithm ◽

Age Group ◽

Political Views ◽

Learning Techniques ◽

Age Prediction

Author's age prediction is the task of determining the author's age by studying the texts written by them. The prediction of author’s age can be enlightening about the different trends, opinions social and political views of an age group. Marketers always use this to encourage a product or a service to an age group following their conveyed interests and opinions. Methodologies in natural language processing have made it possible to predict author’s age from text by examining the variation of linguistic characteristics. Also, many machine learning algorithms have been used in author’s age prediction. However, in social networks, computational linguists are challenged with numerous issues just as machine learning techniques are performance driven with its own challenges in realistic scenarios. This work developed a model that can predict author's age from text with a machine learning algorithm (Naïve Bayes) using three types of features namely, content based, style based and topic based. The trained model gave a prediction accuracy of 80%.

Download Full-text

Detection of Slums from Very High-Resolution Satellite Images Using Machine Learning Algorithms: A Case Study of Fustat Area in Cairo, Egypt

Proceedings of International Exchange and Innovation Conference on Engineering & Sciences (IEICES) ◽

10.5109/4102491 ◽

2020 ◽

Vol 6 ◽

pp. 219-224

Author(s):

Muhammad Salem ◽

Naoki Tsurusaki ◽

Ahmed Eissa ◽

Taher Osman

Keyword(s):

Machine Learning ◽

High Resolution ◽

Satellite Images ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

High Resolution Satellite Images ◽

Very High

Download Full-text

A Comparative Analysis of Machine Learning Techniques for Spam Detection

International Journal of Advanced Research in Science, Communication and Technology ◽

10.48175/ijarsct-1308 ◽

2021 ◽

pp. 657-661

Author(s):

Rashida Ali ◽

Ibrahim Rampurawala ◽

Mayuri Wandhe ◽

Ruchika Shrikhande ◽

Arpita Bhatkar

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Comparative Analysis ◽

Natural Language ◽

Language Processing ◽

High Volume ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Spam Detection ◽

Learning Techniques

Internet provides a medium to connect with individuals of similar or different interests creating a hub. Since a huge hub participates on these platforms, the user can receive a high volume of messages from different individuals creating a chaos and unwanted messages. These messages sometimes contain a true information and sometimes false, which leads to a state of confusion in the minds of the users and leads to first step towards spam messaging. Spam messages means an irrelevant and unsolicited message sent by a known/unknown user which may lead to a sense of insecurity among users. In this paper, the different machine learning algorithms were trained and tested with natural language processing (NLP) to classify whether the messages are spam or ham.

Download Full-text