Investigating translated Chinese and its variants using machine learning

Natural Language Engineering ◽

10.1017/s1351324920000182 ◽

2020 ◽

pp. 1-34

Author(s):

Hai Hu ◽

Sandra Kübler

Keyword(s):

Machine Learning ◽

Target Language ◽

Machine Learning Techniques ◽

English Studies ◽

Source Language ◽

European Languages ◽

Language Studies ◽

Frequent Use ◽

Learning Techniques ◽

Different Characteristics

Translations are generally assumed to share universal features that distinguish them from texts that are originally written in the same language. Thus, we can argue that these translations constitute their own variety of a language, often called translationese. However, translations are also influenced by their source languages and thus show different characteristics depending on the source language. Consequently, we argue that these variants constitute different “dialects” of translations into the same target language. Studies using machine learning techniques on Indo-European languages have investigated the universal characteristics of translationese and how translations from various source languages differ. However, for typologically very different languages such as Chinese, there are only a few corpus studies that tap into the intricate relation between translations and the originals, as well as into the relations among translations themselves. In this contribution, we investigate the following questions: (1) What are the characteristics of Chinese translationese, both in general and with respect to different source languages? (2) Can we find differences not only at the lexical but also at the syntactic level? and (3) Based on the characteristics found in the previous questions, which of the proposed laws and universals can we corroborate based on our evidence from Chinese? We use machine learning to operationalize determining the importance of different characteristics and comparing their importance for our Chinese dataset with characteristics previously reported in studies on English. In addition, our methodology allows us to add syntactic features, which have rarely been used to study translations into Chinese. Our results show that Chinese translations as a whole can be reliably distinguished from non-translations, even based on only five features. More interestingly, typological traces from the source languages can often be found in their translations, therefore creating what we call dialects of translationese. For instance, translations from two Altaic languages exhibit more noun repetition and less frequent use of pronouns. Additionally, some characteristics that are not discriminative for English work well for Chinese, possibly because the distance between Chinese and the source languages is greater than that in English studies.

Forecasting Monthly Discharge Using Machine Learning Techniques

International Research Journal of Multidisciplinary Technovation ◽

10.34256/irjmtcon1 ◽

2019 ◽

Vol 1 (6) ◽

pp. 1-6

Author(s):

Bharthavarapu Srikanth ◽

Geetha Selvarani A. ◽

Bibhuti Bhusan Sahoo

Keyword(s):

Machine Learning ◽

Research Area ◽

Machine Learning Techniques ◽

Support Vector ◽

Promising Tool ◽

Learning Techniques ◽

Different Types ◽

Monthly Discharge ◽

Property Destruction ◽

Different Characteristics

Discharge prediction methods play a crucial role in providing early warnings and in helping local people and government agencies prepare well before a flood or manage available water for various purposes. The ability to predict future river flows helps people anticipate and plan for upcoming flooding, preventing deaths and decreasing property destruction. Different hydrological models supporting these predictions have different characteristics, driven by the available data and the research area. This study applied two different types of machine learning techniques to the Tikarpara station at the lower end of the Mahanadi river basin, India. The two techniques are the multi-layer perceptron (MLP) and support vector regression (SVR). Across the cases used in the study, and given the available data and the study area, MLP was more accurate and more applicable than SVR. MLP outperformed the SVR model, with r² = 0.75 and the lowest RMSE = 0.58. MLP can therefore be used as a promising tool for forecasting monthly discharge at the selected station.
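
The two scores quoted in this abstract can be computed as in the minimal sketch below. It assumes that "r2" denotes the coefficient of determination (the abstract does not say whether it is this or a squared correlation coefficient). The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed meaning of r2)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical monthly discharge values in arbitrary units (not data from the study).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

print(f"RMSE = {rmse(observed, predicted):.2f}")
print(f"R^2  = {r_squared(observed, predicted):.2f}")
```

A lower RMSE and a higher R² both indicate a closer fit, which is the sense in which the abstract ranks MLP above SVR.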

Comparing Classical and Modern Machine Learning Techniques for Monitoring Pedestrian Workers in Top-View Construction Site Video Sequences

Applied Sciences ◽

10.3390/app10238466 ◽

2020 ◽

Vol 10 (23) ◽

pp. 8466

Author(s):

Marcel Neuhausen ◽

Dennis Pawlowski ◽

Markus König

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Kalman Filter ◽

Safety Management ◽

Machine Learning Techniques ◽

Training Dataset ◽

Learning Approaches ◽

Construction Sites ◽

Learning Techniques ◽

Different Characteristics

Keeping an overview of all ongoing processes on construction sites is almost unfeasible, especially for the construction workers executing their tasks. It is difficult for workers to concentrate on their work while paying attention to other processes. If their workflows in hazardous areas do not run properly, this can lead to dangerous accidents. Tracking pedestrian workers could improve the productivity and safety management on construction sites. For this, vision-based tracking approaches are suitable, but the training and evaluation of such a system requires a large amount of data originating from construction sites. These are rarely available, which complicates deep learning approaches. Thus, we use a small generic dataset and juxtapose a deep learning detector with an approach based on classical machine learning techniques. We identify workers using a YOLOv3 detector and compare its performance with an approach based on a soft cascaded classifier. Afterwards, tracking is done by a Kalman filter. In our experiments, the classical approach outperforms YOLOv3 on the detection task given a small training dataset. However, the Kalman filter is sufficiently robust to compensate for the drawbacks of YOLOv3. We found that both approaches generally yield satisfactory tracking performance but feature different characteristics.

Using machine learning techniques to reduce data annotation time

PsycEXTRA Dataset ◽

10.1037/e577762012-020 ◽

2006 ◽

Author(s):

Christopher Schreiner ◽

Kari Torkkola ◽

Mike Gardner ◽

Keshu Zhang

Keyword(s):

Machine Learning ◽

Machine Learning Techniques ◽

Data Annotation ◽

Learning Techniques

Using Machine Learning Algorithms on Prediction of Stock Price

Journal of Modeling and Optimization ◽

10.32732/jmo.2020.12.2.84 ◽

2020 ◽

Vol 12 (2) ◽

pp. 84-99

Author(s):

Li-Pang Chen

Keyword(s):

Machine Learning ◽

Stock Price ◽

Short Term Memory ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Short Term ◽

Learning Techniques ◽

Historical Database ◽

Long Short Term Memory

In this paper, we investigate the analysis and prediction of time-dependent data. We focus on four different stocks selected from the Yahoo Finance historical database. To build models and predict future stock prices, we consider three different machine learning techniques: Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), and Support Vector Regression (SVR). By treating close price, open price, daily low, daily high, adjusted close price, and volume of trades as predictors in the machine learning methods, it can be shown that the prediction accuracy is improved.

Blind Spoofing Detection for Multi-Antenna Snapshot Receivers using Machine-Learning Techniques

Proceedings of the 33rd International Technical Meeting of the Satellite Division of The Institute of Navigation (ION GNSS+ 2020) ◽

10.33012/2020.17564 ◽

2020 ◽

Author(s):

J. Rossouw van der Merwe ◽

Ana Nikolikj ◽

Sebastian Kram ◽

Ivana Lukcin ◽

Gorjan Nadzinski ◽

...

Keyword(s):

Machine Learning ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Spoofing Detection

389-P: Ability for Detecting or Predicting Hypoglycemia with the Aid of Machine Learning Techniques: A Meta-analysis

Diabetes ◽

10.2337/db20-389-p ◽

2020 ◽

Vol 69 (Supplement 1) ◽

pp. 389-P

Author(s):

SATORU KODAMA ◽

MAYUKO H. YAMADA ◽

YUTA YAGUCHI ◽

MASARU KITAZAWA ◽

MASANORI KANEKO ◽

...

Keyword(s):

Machine Learning ◽

Meta Analysis ◽

Machine Learning Techniques ◽

Learning Techniques

A Comparative Study of Different Machine Learning Algorithms for Disease Prediction

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse/v7i7/0177 ◽

2017 ◽

Vol 7 (7) ◽

pp. 172

Author(s):

Anantvir Singh Romana

Keyword(s):

Machine Learning ◽

Subsequent Treatment ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Disease Prediction ◽

Classification Problems ◽

Learning Techniques ◽

Neural Network Classifiers ◽

Diagnostic Detection

Accurate diagnostic detection of a disease in a patient is critical: it may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently being used in various classification problems due to their accurate prediction performance. Different techniques may achieve different accuracies, and it is therefore imperative to use the most suitable method, the one that provides the best results. This research seeks to provide a comparative analysis of Support Vector Machine, Naïve Bayes, J48 Decision Tree, and neural network classifiers on breast cancer and diabetes datasets.

Impact of Machine Learning Techniques with Cache Replacement Algorithms in Enhancing the Performance of the Webserver

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse/v7i6/0323 ◽

2017 ◽

Vol 7 (6) ◽

pp. 812-816

Author(s):

Muralidharan Murugesan ◽

E. Kirubakaran

Keyword(s):

Machine Learning ◽

Machine Learning Techniques ◽

Cache Replacement ◽

Learning Techniques

A Brief Survey on Text Classification Using Various Machine Learning Techniques

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse.v8i1.521 ◽

2018 ◽

Vol 8 (1) ◽

pp. 14

Author(s):

Padmavathi S. ◽

M. Chidambaram

Keyword(s):

Machine Learning ◽

Text Classification ◽

Fixed Number ◽

Machine Learning Techniques ◽

Online Information ◽

Rule Based ◽

Learning Techniques ◽

Machine Learning Approach ◽

Rule Based Approach

Text classification has become more significant in managing and organizing text data due to the tremendous growth of online information. It classifies documents into a fixed number of predefined categories. The rule-based approach and the machine learning approach are the two ways of performing text classification. In the rule-based approach, documents are classified using manually defined rules. In the machine learning approach, the classification rules, or the classifier, are derived automatically from example documents. The latter offers higher recall and faster processing. This paper surveys text classification using different machine learning techniques.
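
The contrast between the two approaches described in this abstract can be illustrated with a minimal sketch. It is not the method of the surveyed papers: the keyword rules, the category names, and the training sentences are all hypothetical, and the learned classifier is a small multinomial Naive Bayes with add-one smoothing, chosen here only because it fits in a few lines.

```python
import math
from collections import Counter, defaultdict

# Hypothetical hand-written rules: keyword -> category.
RULES = {"goal": "sports", "match": "sports",
         "election": "politics", "vote": "politics"}


def classify_by_rules(text, default="unknown"):
    """Rule-based approach: return the category of the first keyword that fires."""
    for token in text.lower().split():
        if token in RULES:
            return RULES[token]
    return default


class NaiveBayes:
    """Machine learning approach: multinomial Naive Bayes with add-one smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(documents, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocabulary)
            score = math.log(n_docs / total_docs)  # log prior
            for token in text.lower().split():
                if token in self.vocabulary:  # words never seen in training are ignored
                    score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical example documents from which the classifier is learned.
train_docs = ["the team scored a late goal", "the striker won the match",
              "voters went to the election", "parliament held a vote"]
train_labels = ["sports", "sports", "politics", "politics"]
model = NaiveBayes().fit(train_docs, train_labels)

query = "the striker scored"
print(classify_by_rules(query))  # no hand-written keyword matches this text
print(model.predict(query))      # the learned model still assigns a category
```

The query contains none of the hand-written keywords, so the rule-based function falls back to its default, while the learned model uses word statistics from the examples. This is one way to read the abstract's remark about higher recall.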

A Literature Review Study of Software Defect Prediction using Machine Learning Techniques

International Journal of Emerging Research in Management and Technology ◽

10.23956/ijermt.v6i6.286 ◽

2018 ◽

Vol 6 (6) ◽

pp. 300 ◽

Cited By ~ 3

Author(s):

Feidu Akmel ◽

Ermiyas Birihanu ◽

Bahir Siraj

Keyword(s):

Machine Learning ◽

Software Metrics ◽

Quality Standard ◽

Machine Learning Techniques ◽

Software Systems ◽

Health Care Insurance ◽

Software Defect ◽

Learning Techniques ◽

Software Product

Software systems are software products or applications that support business domains such as manufacturing, aviation, health care, insurance, and so on. Software quality is a means of measuring how software is designed and how well the software conforms to that design. Some of the variables we look for in software quality are correctness, product quality, scalability, completeness, and absence of bugs. However, the quality standard used by one organization differs from that used by another; for this reason, it is better to apply software metrics to measure the quality of software. Attributes gathered from source code through software metrics can serve as input to a software defect predictor. Software defects are errors introduced by software developers and stakeholders. Finally, in this study we examine the application of machine learning to software defects, as gathered from previous research works.
