Classification of Astrophysics Journal Articles with Machine Learning to Identify Data for NED

2022 ◽  
Vol 134 (1031) ◽  
pp. 014501
Author(s):  
Tracy X. Chen ◽  
Rick Ebert ◽  
Joseph M. Mazzarella ◽  
Cren Frayer ◽  
Scott Terek ◽  
...  

Abstract The NASA/IPAC Extragalactic Database (NED) is a comprehensive online service that combines fundamental multi-wavelength information for known objects beyond the Milky Way and provides value-added, derived quantities and tools to search and access the data. The contents and relationships between measurements in the database are continuously augmented and revised to stay current with astrophysics literature and new sky surveys. The conventional process of distilling and extracting data from the literature requires human experts to review the journal articles and determine whether an article is extragalactic in nature and, if so, what types of data it contains. This is both labor intensive and unsustainable, especially given the ever-increasing number of publications each year. We present here a machine learning (ML) approach developed and integrated into the NED production pipeline to help automate the classification of journal article topics and their data content for inclusion into NED. We show that this ML application can successfully reproduce the classifications of a human expert to an accuracy of over 90% in a fraction of the time it takes a human, allowing us to focus human expertise on tasks that are more difficult to automate.

2018 ◽  
Vol 17 (03) ◽  
pp. 1850034
Author(s):  
Mandava Kranthi Kiran ◽  
K. Thammi Reddy

The number of research journal articles available in digital form is growing explosively. As a result, more and more researchers have turned to digital publications, downloading and maintaining them on their standalone personal computers. This growth has made it difficult for researchers to manage, link, and search the research journal articles on their personal computers. Many reference managers, such as RefWorks, Zotero, EndNote, and Mendeley, are available on the market and provide an easy way for a researcher to organize and manage research journal articles. These reference managers rely on extracted basic metadata such as Title, Author, and Abstract to organize the journal articles and maintain links between them. However, the essential feature of “reference linking” has received little attention. Reference linking is a feature by which a cited article can be tracked from the article citing it within a large collection of journal articles. This feature is mostly available online, but not on offline standalone personal computers. This paper addresses this problem in detail, explores the existing reference linking features for online scholarly literature, and presents algorithms for retrieving the metadata, along with the references at the end of each journal article, and a way of storing and linking them to the full texts held in reference management software that runs on a standalone personal computer, using semantic technology. A reference management software, “sodhanaRef”, which includes the reference linking feature and semantic search, has been built, and its architecture is explained.
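
As a rough illustration of the reference linking idea described above, the sketch below matches cited titles against a local library by normalized title. It is not the sodhanaRef algorithm: plain string matching stands in for the semantic technology the abstract mentions, and the function names, titles, and file paths are all hypothetical.

```python
import re


def normalize(title):
    """Lowercase, drop punctuation, and collapse whitespace so that small
    formatting differences do not block a match."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", title.lower())
    return " ".join(cleaned.split())


def link_references(reference_titles, library):
    """Map each cited title to the local path of the matching article.

    `library` maps article titles to file paths (hypothetical data).
    Unmatched references map to None.
    """
    index = {normalize(title): path for title, path in library.items()}
    return {ref: index.get(normalize(ref)) for ref in reference_titles}


# Hypothetical local library and reference list, for illustration only.
library = {
    "Text Classification: A Survey": "papers/text_classification.pdf",
    "Clustering Around Medoids": "papers/medoids.pdf",
}
references = ["Text classification - a survey.", "An Unrelated Article"]
links = link_references(references, library)
print(links)
```

Exact matching on a normalized title is the simplest possible linking rule; a real system would likely also compare authors and years to avoid linking two different articles that share a title.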


2018 ◽  
Vol 8 (9) ◽  
pp. 1654 ◽  
Author(s):  
Krzysztof Gajowniczek ◽  
Tomasz Ząbkowski ◽  
Mariya Sodenkamp

In this article, Grade Correspondence Analysis (GCA) with posterior clustering and visualization is introduced and applied to extract important features that reveal households’ characteristics from electricity usage data. The main goal of the analysis is to automatically extract, in a non-intrusive way, a number of socio-economic household properties, including family type, age of inhabitants, employment type, house type, and number of bedrooms. Knowledge of these properties enables energy utilities to develop targeted energy conservation tariffs and to ensure balanced operation management. In particular, classifying households by their electricity usage delivers value-added information that allows accurate demand planning, with the goal of enhancing the overall efficiency of the network. The approach was evaluated by analyzing smart meter data collected from 4182 households in Ireland over a period of 1.5 years. The results show that revealing household characteristics from smart meter data is feasible: the proposed machine learning methods yielded an accuracy of approximately 90% and an Area Under the Receiver Operating Characteristic Curve (AUC) of 0.82.
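
The two evaluation metrics quoted above measure different things: accuracy depends on a decision threshold, while AUC is the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. The minimal sketch below computes both; the labels and scores are made up for illustration (they are not the study's data), and the function names and the 0.5 threshold are assumptions.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def auc_score(labels, scores):
    """AUC via pairwise comparison of positives and negatives (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Illustrative labels and classifier scores only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
print(accuracy(labels, scores), auc_score(labels, scores))
```

The pairwise loop is quadratic in the number of examples; it is written this way to mirror the definition, not for speed.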


Author(s):  
Padmavathi S. ◽  
M. Chidambaram

Text classification has become more significant in managing and organizing text data due to the tremendous growth of online information. It classifies documents into a fixed number of predefined categories. The rule-based approach and the machine learning approach are the two ways of performing text classification. In the rule-based approach, documents are classified based on manually defined rules. In the machine learning approach, classification rules, or a classifier, are defined automatically using example documents. The machine learning approach has higher recall and is a quick process. This paper presents an investigation of text classification using different machine learning techniques.
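
The contrast between the two approaches can be sketched in a few lines. In the sketch below, the rule-based classifier uses a hand-written keyword rule, while the learned classifier is a small multinomial naive Bayes model with add-one smoothing fitted from example documents. Naive Bayes is chosen here only as a familiar example of a machine learning technique; the abstract does not say which techniques the paper examines, and the categories, keywords, and training texts are hypothetical.

```python
import math
from collections import Counter, defaultdict


def rule_based_classify(text):
    """Rule-based approach: a manually defined keyword rule (hypothetical)."""
    words = text.lower().split()
    return "sports" if "match" in words or "team" in words else "science"


class NaiveBayes:
    """Machine learning approach: the classifier is learned from examples."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.class_counts = Counter(labels)      # label -> number of documents
        self.vocab = set()
        for text, label in zip(texts, labels):
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus log likelihood with add-one smoothing.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical example documents, for illustration only.
train_texts = [
    "the team won the match",
    "the striker scored in the match",
    "the telescope observed a galaxy",
    "the experiment measured particle energy",
]
train_labels = ["sports", "sports", "science", "science"]
model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("galaxy telescope data"), rule_based_classify("galaxy telescope data"))
```

Adding a new category to the rule-based version means writing new rules by hand, whereas the learned version only needs more labeled examples.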


Author(s):  
Hyeuk Kim

Unsupervised learning in machine learning divides data into several groups. Observations in the same group have similar characteristics, and observations in different groups have different characteristics. In this paper, we classify data by partitioning around medoids, which has some advantages over k-means clustering. We apply it to baseball players in the Korea Baseball League. We also apply principal component analysis to the data and draw a graph using two components as the axes. We interpret the meaning of the clustering graphically through this procedure. The combination of partitioning around medoids and principal component analysis can be applied to any other data, and the approach makes it easy to identify the characteristics of the groups.
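
A minimal sketch of partitioning around medoids follows. Unlike k-means, each cluster center (medoid) is an actual data point, and any distance function can be used. The sketch implements only the swap phase: it starts from the first k points, which is a simplifying assumption (the full PAM algorithm uses a dedicated BUILD step for initialization), and the sample points are made up, not the baseball data from the paper.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def total_cost(points, medoids, dist):
    """Sum of distances from each point to its nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def pam(points, k, dist=euclidean):
    """Return (medoid indices, nearest-medoid index for each point)."""
    medoids = list(range(k))  # simplified initialization (assumption)
    best = total_cost(points, medoids, dist)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for candidate in range(len(points)):
                if candidate in medoids:
                    continue
                # Try replacing medoid i with a non-medoid point.
                trial = medoids[:i] + [candidate] + medoids[i + 1:]
                cost = total_cost(points, trial, dist)
                if cost < best:
                    medoids, best, improved = trial, cost, True
    assignments = [min(medoids, key=lambda m: dist(p, points[m])) for p in points]
    return medoids, assignments


# Two well-separated groups of illustrative 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
medoids, assignments = pam(points, k=2)
print(medoids, assignments)
```

Because a swap is accepted only when it strictly lowers the total cost, the loop always terminates, though it may stop at a local optimum on less clearly separated data.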


2019 ◽  
Vol 1 (3) ◽  
pp. 73-78
Author(s):  
Rumintang Harianja ◽  
Ratih Saltri Yudar ◽  
Susy Deliani ◽  
Mutia Sari Nursafira ◽  
Budianto Hamuddin

This study aims to identify the pronouns used in journal articles in terms of number and familiarity. The data were taken from three journals in three different fields, i.e., Education, Medicine, and Engineering. They consist of 21 articles taken from the current issues of 2018, when this study started. The journals were selected by convenience because of their distinctiveness and standing as disciplines and as reputable sources. To collect the data, the researchers accessed journals published on ScienceDirect (Q1, Scopus indexed). The data were then coded and transcribed to ease the analysis. The analysis showed that the writers in these three international journals commonly used several pronouns interchangeably. However, some articles in the Medicine and Engineering journals consistently used only one chosen pronoun, which was found in different sections of the journal article. In total, 19 kinds of pronouns were found to be used across these three fields. These results show that pronoun usage in scientific articles from these three fields varies, with a choice of different pronouns. The pronouns seem to be used to soften the impact of imposition and to show politeness or the quality of the articles.


Author(s):  
Ivan Herreros

This chapter discusses basic concepts from control theory and machine learning to facilitate a formal understanding of animal learning and motor control. It first distinguishes between feedback and feed-forward control strategies, and later introduces the classification of machine learning applications into supervised, unsupervised, and reinforcement learning problems. Next, it links these concepts with their counterparts in the domain of the psychology of animal learning, highlighting the analogies between supervised learning and classical conditioning, reinforcement learning and operant conditioning, and between unsupervised and perceptual learning. Additionally, it interprets innate and acquired actions from the standpoint of feedback vs anticipatory and adaptive control. Finally, it argues how this framework of translating knowledge between formal and biological disciplines can serve us to not only structure and advance our understanding of brain function but also enrich engineering solutions at the level of robot learning and control with insights coming from biology.


2020 ◽  
Vol 13 (5) ◽  
pp. 508-523 ◽  
Author(s):  
Guan‐Hua Huang ◽  
Chih‐Hsuan Lin ◽  
Yu‐Ren Cai ◽  
Tai‐Been Chen ◽  
Shih‐Yen Hsu ◽  
...  
