Construction of Corpus in Artificial Intelligence Age

2018 ◽  
Vol 175 ◽  
pp. 03037
Author(s):  
Changchun Hong

As a revolutionary technology, artificial intelligence has marked an explosive transformation in many fields of study. Nowadays, much of the translation work that used to be done by humans has been taken over by machines. The construction of a corpus is a crucial step toward successful machine translation. This paper aims to explore the mode of corpus construction from the perspectives of information mining, information retrieval, and information processing. The retrieval system uses web crawlers to collect network information and automatic tagging technology to index the collected information, then applies the corresponding language processing techniques to establish correspondences between two languages and form an index database. In the age of artificial intelligence, machines can keep track of many users’ searches and queries, so as to record, extract, and feed back different translations to build a new corpus. In this way, machine translation is improving in scope and accuracy, with the goal of taking over the tedious work of human translation while increasing its speed and reducing its cost.
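The crawl-tag-align-index pipeline described above can be sketched minimally; the sentence pairs and the word-level indexing scheme below are hypothetical stand-ins for illustration, not the paper's actual system:

```python
from collections import defaultdict

def build_bilingual_index(sentence_pairs):
    """Index aligned sentence pairs so that each source-language word
    maps to the target sentences it co-occurs with (a toy stand-in for
    the tagging-and-alignment stage described above)."""
    index = defaultdict(list)
    for src, tgt in sentence_pairs:
        for word in src.lower().split():
            index[word].append(tgt)
    return index

# Hypothetical crawled English-Chinese pairs, invented for illustration.
pairs = [
    ("machine translation", "机器翻译"),
    ("translation quality", "翻译质量"),
]
index = build_bilingual_index(pairs)
print(index["translation"])  # both target sentences contain the word
```

A real system would align at the sentence and phrase level rather than by bag-of-words co-occurrence, but the lookup structure is the same idea.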

Author(s):  
Roy Rada

The techniques of artificial intelligence include knowledge-based, machine learning, and natural language processing techniques. The discipline of investing requires data identification, asset valuation, and risk management. Artificial intelligence techniques apply to many aspects of financial investing, and published work has shown an emphasis on the application of knowledge-based techniques for credit risk assessment and machine learning techniques for stock valuation. However, in the future, knowledge-based, machine learning, and natural language processing techniques will be integrated into systems that simultaneously address data identification, asset valuation, and risk management.


Author(s):  
Lucia Specia ◽  
Yorick Wilks

Machine Translation (MT) is, and always has been, a core application in the field of natural-language processing. It is a very active research area that has been attracting significant commercial interest, most of which has been driven by the deployment of corpus-based, statistical approaches, which can be built in a much shorter time and at a fraction of the cost of traditional, rule-based approaches, and yet produce translations of comparable or superior quality. This chapter aims at introducing MT and its main approaches. It provides a historical overview of the field, an introduction to different translation methods, both rationalist (rule-based) and empirical, and a more in-depth description of state-of-the-art statistical methods. Finally, it covers popular metrics for evaluating the output of machine translation systems.
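One widely used evaluation metric of the kind the chapter covers, BLEU, can be sketched at the sentence level. This is a simplified illustration (single reference, no smoothing), not the chapter's exact formulation:

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU: geometric mean of modified n-gram
    precisions, multiplied by a brevity penalty. Smoothing is
    omitted, so any zero precision yields a score of 0."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        overlap = sum((cand_ngrams & ref_ngrams).values())  # clipped counts
        total = max(sum(cand_ngrams.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    bp = min(1.0, math.exp(1 - len(ref) / len(cand)))  # brevity penalty
    return bp * math.exp(log_avg)

print(bleu("the cat sat on the mat", "the cat sat on the mat"))  # 1.0
```

Production toolkits compute corpus-level BLEU with smoothing and standardized tokenization, but the core computation is as above.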


Author(s):  
Nadezhda Yarushkina ◽  
Gleb Guskov ◽  
Pavel Dudarin

Software engineers all over the world independently solve many similar problems. Under these conditions, reusing code, or better yet architecture, becomes a pressing issue. In this paper, a two-phase approach to determining the functional and structural likeness of software projects is proposed. This approach combines two methods of artificial intelligence: natural language processing techniques and a novel method for comparing software projects based on an ontological representation of their architecture, obtained automatically from the projects' source code. Additionally, several similarity metrics are proposed to estimate the similarity between projects.
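As an illustration of the kind of similarity metric such an approach might employ, a Jaccard score over extracted identifier sets is a minimal sketch; the identifier names below are invented, and the paper's own ontology-based metrics are considerably more involved:

```python
def jaccard(a, b):
    """Jaccard similarity: |intersection| / |union| of two term sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical identifier sets extracted from two projects' source code.
project_a = {"UserController", "UserRepository", "AuthService"}
project_b = {"UserController", "OrderRepository", "AuthService"}
print(jaccard(project_a, project_b))  # 0.5 (2 shared of 4 total)
```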


2021 ◽  
Vol 12 (4) ◽  
pp. 1035-1040
Author(s):  
Vamsi Krishna Vedantam

Natural Language Processing using Deep Learning is one of the critical areas of Artificial Intelligence to focus on in the coming decades. Over the last few years, artificial intelligence has evolved by maturing critical areas in research and development. The latest developments in Natural Language Processing have contributed to the successful implementation of machine translation, language models, speech recognition, and automatic text generation applications. This paper covers the recent advancements in Natural Language Processing using Deep Learning and some of the much-anticipated areas in NLP to watch in the next few years. The first section explains Deep Learning architectures and Natural Language Processing techniques; the second section highlights the developments in NLP using Deep Learning; and the last part concludes with the critical takeaways from the article.


Author(s):  
Rajarshi SinhaRoy

In this digital era, Natural Language Processing is not just a computational process; rather, it is a way to communicate with machines in a humanlike manner. It has been used in several fields, from smart artificial assistants to health and emotion analyzers. A digital era without Natural Language Processing is something we cannot even imagine. In Natural Language Processing, the machine first reads the given information and then begins making sense of it. Once the data has been properly processed, the machine takes the actual steps, returning responses or completing the task. In this paper, I review the journey of natural language processing from the late 1940s to the present. The paper also covers the salient and most important works along this timeline, which lead us to where we currently stand in the field. The review separates the history of Natural Language Processing into four eras, marked in turn by a focus on machine translation, the impact of artificial intelligence, the adoption of a logico-grammatical style, and the turn to massive linguistic data. This paper helps readers understand the historical aspects of Natural Language Processing and may inspire others to work and research in this domain.


2002 ◽  
Vol 26 (4) ◽  
pp. 407-425 ◽  
Author(s):  
Donald E. Powers ◽  
Jill C. Burstein ◽  
Martin S. Chodorow ◽  
Mary E. Fowles ◽  
Karen Kukich

Automated, or computer-based, scoring represents one promising possibility for improving the cost-effectiveness (and other features) of complex performance assessments (such as direct tests of writing skill) that require examinees to construct responses rather than select them from a set of multiple choices. Indeed, significant advances have been made in applying natural language processing techniques to the automatic scoring of essays. Thus far, most of the validation of automated scoring has focused appropriately (but too narrowly, we contend) on the correspondence between computer-generated scores and those assigned by human readers. Far less effort has been devoted to assessing the relation of automated scores to independent indicators of examinees' writing skills. This study examined the relationship of scores from a graduate-level writing assessment to several independent, non-test indicators of examinees' writing skills—both for automated scores and for scores assigned by trained human readers. The extent to which automated and human scores exhibited similar relations with the non-test indicators was taken as evidence of the degree to which the two methods of scoring reflect similar aspects of writing proficiency. Analyses revealed significant, but modest, correlations between the non-test indicators and each of the two methods of scoring. These relations were somewhat weaker for automated scores than for scores awarded by human readers. Overall, however, the results provide some evidence of the validity of one specific procedure for automated scoring.
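The correlational analysis described above can be illustrated with a plain Pearson coefficient; the score values below are invented for demonstration and do not come from the study:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two paired score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical data: automated essay scores paired with a non-test
# indicator (e.g. self-reported writing grades); values are invented.
auto_scores = [3.0, 4.5, 2.5, 5.0, 3.5]
indicator = [2.8, 4.0, 3.0, 4.8, 3.2]
print(round(pearson(auto_scores, indicator), 2))
```

In practice one would also report significance and compare the automated-score correlation against the human-score correlation, as the study does.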


2021 ◽  
Author(s):  
Rohan Pandey ◽  
Vaibhav Gautam ◽  
Ridam Pal ◽  
Harsh Bandhey ◽  
Lovedeep Singh Dhingra ◽  
...  

Abstract Background: The COVID-19 pandemic has uncovered the potential of digital misinformation in shaping the health of nations. The deluge of unverified information that spreads faster than the epidemic itself is an unprecedented phenomenon that has put millions of lives in danger. Mitigating this ‘Infodemic’ requires robust health messaging systems that are engaging, vernacular, scalable, effective, and continuously learn new misinformation patterns. Objective: We created WashKaro, a multi-pronged intervention for mitigating misinformation through conversational Artificial Intelligence (AI), machine translation and natural language processing (NLP). WashKaro provides the correct information matched against WHO guidelines through AI and delivers it in a suitable format in local languages. Results: A total of 5026 people downloaded the app during the study window; among those, 1545 were actively engaged users. Our study shows that 3.4 times more females engaged with the App in Hindi as compared to males, the relevance of AI-filtered news content doubled within 45 days of continuous machine learning, and the prudence of the integrated AI chatbot “Satya” increased, thus proving the usefulness of an mHealth platform to mitigate health misinformation. Conclusion: We conclude that a machine learning application delivering bite-sized vernacular audios and conversational AI is a practical approach to mitigating health misinformation.


2020 ◽  
pp. 3-17
Author(s):  
Peter Nabende

Natural Language Processing for under-resourced languages is now a mainstream research area. However, there are limited studies on Natural Language Processing applications for many indigenous East African languages. As a contribution toward closing this knowledge gap, this paper focuses on evaluating the application of well-established machine translation methods for one heavily under-resourced indigenous East African language called Lumasaaba. Specifically, we review the most common machine translation methods in the context of Lumasaaba, including both rule-based and data-driven methods. Then we apply a state-of-the-art data-driven machine translation method to learn models for automating translation between Lumasaaba and English using a very limited data set of parallel sentences. Automatic evaluation results show that a transformer-based Neural Machine Translation model architecture leads to consistently better BLEU scores than the recurrent neural network-based models. Moreover, the automatically generated translations can be comprehended to a reasonable extent and are usually associated with the source language input.


AI Magazine ◽  
2019 ◽  
Vol 40 (3) ◽  
pp. 67-78
Author(s):  
Guy Barash ◽  
Mauricio Castillo-Effen ◽  
Niyati Chhaya ◽  
Peter Clark ◽  
Huáscar Espinoza ◽  
...  

The workshop program of the Association for the Advancement of Artificial Intelligence’s 33rd Conference on Artificial Intelligence (AAAI-19) was held in Honolulu, Hawaii, on Sunday and Monday, January 27–28, 2019. There were sixteen workshops in the program: Affective Content Analysis: Modeling Affect-in-Action, Agile Robotics for Industrial Automation Competition, Artificial Intelligence for Cyber Security, Artificial Intelligence Safety, Dialog System Technology Challenge, Engineering Dependable and Secure Machine Learning Systems, Games and Simulations for Artificial Intelligence, Health Intelligence, Knowledge Extraction from Games, Network Interpretability for Deep Learning, Plan, Activity, and Intent Recognition, Reasoning and Learning for Human-Machine Dialogues, Reasoning for Complex Question Answering, Recommender Systems Meet Natural Language Processing, Reinforcement Learning in Games, and Reproducible AI. This report contains brief summaries of all the workshops that were held.


Author(s):  
Rohan Pandey ◽  
Vaibhav Gautam ◽  
Ridam Pal ◽  
Harsh Bandhey ◽  
Lovedeep Singh Dhingra ◽  
...  

BACKGROUND The COVID-19 pandemic has uncovered the potential of digital misinformation in shaping the health of nations. The deluge of unverified information that spreads faster than the epidemic itself is an unprecedented phenomenon that has put millions of lives in danger. Mitigating this ‘Infodemic’ requires strong health messaging systems that are engaging, vernacular, scalable, effective, and continuously learn new patterns of misinformation. OBJECTIVE We created WashKaro, a multi-pronged intervention for mitigating misinformation through conversational AI, machine translation and natural language processing. WashKaro provides the right information matched against WHO guidelines through AI, and delivers it in the right format in local languages. METHODS We theorize (i) an NLP-based AI engine that could continuously incorporate user feedback to improve the relevance of information, (ii) bite-sized audio in the local language to improve penetrance in a country with skewed gender literacy ratios, and (iii) conversational yet interactive AI engagement with users toward increased health awareness in the community. RESULTS A total of 5026 people downloaded the app during the study window; among those, 1545 were active users. Our study shows that 3.4 times more females engaged with the App in Hindi as compared to males, the relevance of AI-filtered news content doubled within 45 days of continuous machine learning, and the prudence of the integrated AI chatbot “Satya” increased, thus proving the usefulness of an mHealth platform to mitigate health misinformation. CONCLUSIONS We conclude that a multi-pronged machine learning application delivering vernacular bite-sized audios and conversational AI is an effective approach to mitigate health misinformation. CLINICALTRIAL Not Applicable
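The matching of user queries against WHO guidance can be sketched with a simple bag-of-words cosine similarity; this is a hypothetical stand-in for WashKaro's actual NLP engine, and the guideline snippets below are invented:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    common = set(a) & set(b)
    dot = sum(a[w] * b[w] for w in common)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_guideline(query, guidelines):
    """Return the guideline snippet most similar to the user query."""
    q = Counter(query.lower().split())
    return max(guidelines, key=lambda g: cosine(q, Counter(g.lower().split())))

# Invented snippets; the real system matches against WHO guideline text.
guidelines = [
    "wash hands with soap and water for twenty seconds",
    "wear a mask in crowded indoor spaces",
]
print(best_guideline("how long should i wash my hands", guidelines))
```

A production system would use learned embeddings rather than raw word counts, but the retrieve-by-similarity structure is the same.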

