language detection Latest Research Papers

Domain-Specific Keyword Extraction Using Joint Modeling of Local and Global Contextual Semantics

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3494560 ◽

2022 ◽

Vol 16 (4) ◽

pp. 1-30

Author(s):

Muhammad Abulaish ◽

Mohd Fazil ◽

Mohammed J. Zaki

Keyword(s):

Hybrid Approach ◽

Joint Modeling ◽

Keyword Extraction ◽

Semantic Association ◽

Target Domain ◽

Domain Specific ◽

Online Social Media ◽

Word Representation ◽

E Mail ◽

Language Detection

Domain-specific keyword extraction is a vital task in the field of text mining. There are various research tasks, such as spam e-mail classification, abusive language detection, sentiment analysis, and emotion mining, where a set of domain-specific keywords (aka lexicon) is highly effective. Existing works for keyword extraction list all keywords rather than domain-specific keywords from a document corpus. Moreover, most of the existing approaches perform well on formal document corpuses but fail on noisy and informal user-generated content in online social media. In this article, we present a hybrid approach by jointly modeling the local and global contextual semantics of words, utilizing the strength of distributional word representation and contrasting-domain corpus for domain-specific keyword extraction. Starting with a seed set of a few domain-specific keywords, we model the text corpus as a weighted word-graph. In this graph, the initial weight of a node (word) represents its semantic association with the target domain calculated as a linear combination of three semantic association metrics, and the weight of an edge connecting a pair of nodes represents the co-occurrence count of the respective words. Thereafter, a modified PageRank method is applied to the word-graph to identify the most relevant words for expanding the initial set of domain-specific keywords. We evaluate our method over both formal and informal text corpuses (comprising six datasets), and show that it performs significantly better in comparison to state-of-the-art methods. Furthermore, we generalize our approach to handle the language-agnostic case, and show that it outperforms existing language-agnostic approaches.
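The graph-ranking step described in this abstract can be illustrated with a minimal sketch. The abstract does not spell out the three semantic association metrics or how PageRank is modified, so this example makes two assumptions: the initial node weights are supplied directly (the `seed_weight` values are hypothetical), and a standard biased (personalized) PageRank stands in for the authors' modified method. The function names `build_word_graph` and `biased_pagerank` are illustrative only.

```python
from collections import defaultdict


def build_word_graph(sentences, window=2):
    """Undirected word graph; each edge weight is a co-occurrence count."""
    graph = defaultdict(lambda: defaultdict(float))
    for tokens in sentences:
        for i, word in enumerate(tokens):
            # Pair each word with the next `window` words in the sentence.
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:
                    graph[word][other] += 1.0
                    graph[other][word] += 1.0
    return graph


def biased_pagerank(graph, node_weight, damping=0.85, iterations=50):
    """PageRank whose teleport distribution follows the initial node weights.

    Assumption: this is ordinary personalized PageRank, used here as a
    stand-in for the paper's modified PageRank.
    """
    nodes = list(graph)
    total = sum(node_weight.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("at least one node needs a positive initial weight")
    prior = {n: node_weight.get(n, 0.0) / total for n in nodes}
    out_weight = {n: sum(graph[n].values()) for n in nodes}
    rank = dict(prior)
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Neighbours pass on rank in proportion to edge weight.
            incoming = sum(rank[m] * graph[m][n] / out_weight[m] for m in graph[n])
            new_rank[n] = (1 - damping) * prior[n] + damping * incoming
        rank = new_rank
    return rank


# Toy corpus; the weights below are hypothetical domain-association scores.
sentences = [
    ["spam", "offer", "free"],
    ["free", "offer", "click"],
    ["meeting", "agenda", "notes"],
]
seed_weight = {"spam": 1.0, "free": 0.8, "offer": 0.5, "click": 0.2,
               "meeting": 0.05, "agenda": 0.05, "notes": 0.05}
ranks = biased_pagerank(build_word_graph(sentences), seed_weight)
top_words = sorted(ranks, key=ranks.get, reverse=True)
```

Words that co-occur with highly weighted seeds accumulate rank, so the head of `top_words` is a candidate list for expanding the seed set.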

Hand Sign Language Detection Using Deep Learning

Lecture Notes in Networks and Systems - Advances in Distributed Computing and Machine Learning ◽

10.1007/978-981-16-4807-6_47 ◽

2022 ◽

pp. 493-502

Author(s):

Subham Sharma ◽

Apala Ghosh ◽

Sharmila Subudhi

Keyword(s):

Deep Learning ◽

Sign Language ◽

Language Detection

Towards Automated Moderation: Enabling Toxic Language Detection with Transfer Learning and Attention-Based Models

10.24251/hicss.2022.098 ◽

2022 ◽

Author(s):

Matthew Caron ◽

Frederik S. Bäumer ◽

Oliver Müller

Keyword(s):

Transfer Learning ◽

Language Detection

Text Mining Techniques for Identify Islamophobic Conversation Language by Selecting Preprocessing Feature

10.21203/rs.3.rs-1105114/v1 ◽

2021 ◽

Author(s):

Fachrul Kurniawan ◽

Badruddin ◽

Aji Prasetya Wibawa

Keyword(s):

Text Mining ◽

Sentiment Analysis ◽

Muslim Country ◽

Processing Stage ◽

Language Detection ◽

Mining Tools

Abstract: Sentiment analysis is a technique for extracting information about a person's attitude toward an issue or occurrence by identifying a text's polarity. The grouping indicates whether the attitude expressed is positive or negative. In the pre-processing stage, dropping duplicates reduces the data set from 10997 to 4339, and language detection finds 31 languages. Although the data comes from the world's largest Muslim country, the problem is not limited to it, as evidenced by the use of text mining tools to identify languages.
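The two pre-processing steps mentioned here, dropping duplicates and counting detected languages, might look roughly like the sketch below. The abstract does not name the tools used, so the language detector is passed in as a function, and `toy_detect` is a deliberately crude, hypothetical stand-in based on a handful of stopwords; a real pipeline would use a proper language-identification library.

```python
from collections import Counter


def drop_duplicates(texts):
    """Keep the first occurrence of each text, ignoring case and extra spaces."""
    seen = set()
    unique = []
    for text in texts:
        key = " ".join(text.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(text)
    return unique


def count_languages(texts, detect):
    """Count texts per language code returned by the supplied detector."""
    return Counter(detect(text) for text in texts)


# Hypothetical stopword lists, far too small for real use.
STOPWORDS = {
    "en": {"the", "is", "and"},
    "id": {"yang", "dan", "di"},
}


def toy_detect(text):
    """Pick the language whose stopwords overlap most; 'und' if none match."""
    tokens = set(text.lower().split())
    scores = {lang: len(tokens & words) for lang, words in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "und"


posts = [
    "The mosque is open",
    "the  mosque is OPEN",
    "Masjid yang besar dan indah",
]
unique_posts = drop_duplicates(posts)
language_counts = count_languages(unique_posts, toy_detect)
```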

Real-Time Sign Language Detection and Recognition

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.39103 ◽

2021 ◽

Vol 9 (11) ◽

pp. 1944-1948

Author(s):

Sarthak Sharma

Keyword(s):

Neural Networks ◽

American Sign Language ◽

Real Time ◽

Sign Language ◽

American Sign ◽

Hand Gestures ◽

Natural Form ◽

To Come ◽

Language Detection ◽

Detection And Recognition

Abstract: Sign language is one of the oldest and most natural forms of language for communication, but since most people do not know sign language and interpreters are very difficult to come by, we have come up with a real-time method using neural networks for fingerspelling-based American Sign Language. In our method, the hand is first passed through a filter, and after the filter is applied, the hand is passed through a classifier that predicts the class of the hand gestures.

Protecting marginalized communities by mitigating discrimination in toxic language detection

10.1109/istas52410.2021.9629201 ◽

2021 ◽

Author(s):

Farshid Faal ◽

Ketra Schmitt ◽

Jia Yuan Yu

Keyword(s):

Marginalized Communities ◽

Language Detection

Hate Speech and Offensive Language Detection from Social Media

10.1109/icecube53880.2021.9628255 ◽

2021 ◽

Author(s):

Vildan Mercan ◽

Akhtar Jamil ◽

Alaa Ali Hameed ◽

Irfan Ahmed Magsi ◽

Sibghatullah Bazai ◽

...

Keyword(s):

Social Media ◽

Hate Speech ◽

Offensive Language ◽

Language Detection

Abusive language detection in youtube comments leveraging replies as conversational context

PeerJ Computer Science ◽

10.7717/peerj-cs.742 ◽

2021 ◽

Vol 7 ◽

pp. e742

Author(s):

Noman Ashraf ◽

Arkaitz Zubiaga ◽

Alexander Gelbukh

Keyword(s):

Social Media ◽

Classification Accuracy ◽

Contextual Information ◽

Original Description ◽

Linguistic Features ◽

Abusive Behavior ◽

Conversational Context ◽

Media Experience ◽

Language Detection

Nowadays, social media experience an increase in hostility, which leads to many people suffering from online abusive behavior and harassment. We introduce a new publicly available annotated dataset for abusive language detection in short texts. The dataset includes comments from YouTube, along with contextual information: replies, video, video title, and the original description. The comments in the dataset are labeled as abusive or not and are classified by topic: politics, religion, and other. In particular, we discuss our refined annotation guidelines for such classification. We report a number of strong baselines on this dataset for the tasks of abusive language detection and topic classification, using a number of classifiers and text representations. We show that taking into account the conversational context, namely, replies, greatly improves the classification results as compared with using only linguistic features of the comments. We also study how the classification accuracy depends on the topic of the comment.
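One simple way to give a classifier access to conversational context is to keep reply tokens in a separate feature namespace from the comment's own tokens. The sketch below is an assumption about how such features could be built, not the representation used by the authors; the `c:` and `r:` prefixes and the function name `context_features` are hypothetical.

```python
from collections import Counter


def context_features(comment, replies):
    """Bag-of-words features with separate namespaces for comment and replies."""
    features = Counter(f"c:{tok}" for tok in comment.lower().split())
    for reply in replies:
        features.update(f"r:{tok}" for tok in reply.lower().split())
    return features


features = context_features("You are wrong", ["wrong and rude", "so rude"])
```

Keeping the namespaces apart lets a downstream model weight a word differently depending on whether it appears in the comment itself or in the replies to it.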

Mitigating Racial Biases in Toxic Language Detection with an Equity-Based Ensemble Framework

10.1145/3465416.3483299 ◽

2021 ◽

Author(s):

Matan Halevy ◽

Camille Harris ◽

Amy Bruckman ◽

Diyi Yang ◽

Ayanna Howard

Keyword(s):

Racial Biases ◽

Language Detection

Using Machine Learning to Detect Events on the Basis of Bengali and Banglish Facebook Posts

Electronics ◽

10.3390/electronics10192367 ◽

2021 ◽

Vol 10 (19) ◽

pp. 2367

Author(s):

Noyon Dey ◽

Md. Sazzadur Rahman ◽

Motahara Sabah Mredula ◽

A. S. M. Sanwar Hosen ◽

In-Ho Ra

Keyword(s):

Social Media ◽

Detection System ◽

Classification Model ◽

Support Vector ◽

Proposed Model ◽

Prime Concern ◽

Use Of Social Media ◽

Bengali Language ◽

Language Detection ◽

Media Data

In modern times, ensuring social security has become the prime concern for security administrators. The widespread and recurrent use of social media sites is creating a huge risk for the lives of the general public, as these sites are frequently becoming potential sources for organizing various types of immoral events. To protect society from these dangers, a prior detection system that can effectively detect events by analyzing social media data is essential. However, automating the process of event detection has been difficult, as existing processes must account for diverse writing styles, languages, dialects, post lengths, and so on. To overcome these difficulties, we developed an effective model for detecting events, which, for our purposes, were classified as either protesting, celebrating, religious, or neutral, using Bengali and Banglish Facebook posts. First, the text of the collected posts was processed for language detection, and then the detected posts were pre-processed using stopword removal and tokenization. Features were then extracted from these pre-processed texts using three sub-processes: filtering, phrase matching of specific events, and sentiment analysis. The collected features were ultimately used to train our Bernoulli Naive Bayes classification model, which was capable of detecting events with 90.41% accuracy (for Bengali-language posts) and 70% accuracy (for Banglish-form posts). To evaluate the effectiveness of our proposed model more precisely, we compared it with two other classifiers: Support Vector Machine and Decision Tree.
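For readers unfamiliar with the classifier named in this abstract, here is a minimal from-scratch sketch of Bernoulli Naive Bayes over binary word-presence features with Laplace smoothing. It is not the authors' implementation: their features come from filtering, phrase matching, and sentiment analysis, whereas this example uses bare tokens, and the English placeholder words and the class name `BernoulliNB` are assumptions made for illustration.

```python
import math
from collections import defaultdict


class BernoulliNB:
    """Bernoulli Naive Bayes on word presence/absence, with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.vocab = sorted({tok for doc in docs for tok in doc})
        self.classes = sorted(set(labels))
        n_docs = {c: 0 for c in self.classes}
        doc_freq = {c: defaultdict(int) for c in self.classes}
        for doc, label in zip(docs, labels):
            n_docs[label] += 1
            for tok in set(doc):  # presence only, not counts
                doc_freq[label][tok] += 1
        self.log_prior = {c: math.log(n_docs[c] / len(docs)) for c in self.classes}
        # P(word present | class), smoothed so it is never exactly 0 or 1.
        self.prob = {
            c: {t: (doc_freq[c][t] + self.alpha) / (n_docs[c] + 2 * self.alpha)
                for t in self.vocab}
            for c in self.classes
        }
        return self

    def predict(self, doc):
        present = set(doc)

        def score(c):
            s = self.log_prior[c]
            # Absent vocabulary words also contribute, via 1 - P(present).
            for t in self.vocab:
                p = self.prob[c][t]
                s += math.log(p if t in present else 1.0 - p)
            return s

        return max(self.classes, key=score)


# Hypothetical tokenized posts (English placeholders for illustration).
train_docs = [
    ["march", "protest", "road"],
    ["protest", "strike"],
    ["festival", "celebrate", "music"],
    ["celebrate", "wedding"],
]
train_labels = ["protesting", "protesting", "celebrating", "celebrating"]
model = BernoulliNB().fit(train_docs, train_labels)
prediction = model.predict(["protest", "road", "today"])
```

Unlike the multinomial variant, the Bernoulli model penalizes a class for vocabulary words that are missing from the post, which can suit short social media texts.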
