Exploring Earth Science Applications using Word Embeddings

Author(s):  
Derek Koehl ◽  
Carson Davis ◽  
Rahul Ramachandran ◽  
Udaysankar Nair ◽  
Manil Maskey

<p>Word embeddings are numeric representations of text that capture meanings and semantic relationships. Embeddings can be constructed using different methods, such as one-hot encoding, frequency-based approaches, or prediction-based approaches. Prediction-based approaches such as Word2Vec can be used to generate word embeddings that capture the underlying semantics and word relationships in a corpus. Studies have shown that Word2Vec embeddings generated from a domain-specific corpus can both predict relationships and augment word vectors to improve classification. We describe results from two experiments utilizing word embeddings for Earth science, constructed with Word2Vec from a corpus of over 20,000 journal papers.</p><p>The first experiment explores the analogy prediction performance of word embeddings built from the Earth science journal corpus and trained using domain-specific vocabulary. Our results demonstrate that domain-specific word embeddings predict Earth science analogy questions more accurately than general-corpus embeddings predict general analogy questions. While the results are as anticipated, the substantial increase in accuracy, particularly in the lexicographical domain, was encouraging. The results point to the need for a comprehensive Earth science analogy test set that covers the full breadth of lexicographical and encyclopedic categories for validating word embeddings.</p><p>The second experiment uses the word embeddings to augment metadata keyword classification. Metadata describing NASA datasets include science keywords that are assigned manually, which can lead to errors and inconsistencies. These science keywords come from a controlled vocabulary and are used to aid data discovery via faceted search and relevancy ranking. Because few metadata records have proper descriptions and keywords, word embeddings were used for augmentation.
A fully connected neural network was trained to suggest keywords given a description text. This approach provided the best accuracy, at ~76%, compared with the other methods tested.</p>
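The analogy prediction described in the first experiment can be illustrated with a minimal sketch. It assumes the common vector-offset method (answer "a is to b as c is to ?" by finding the word closest to b - a + c); the abstract does not state which scoring method the authors used. The three-dimensional vectors, the vocabulary, and the names `toy_vectors`, `cosine`, and `solve_analogy` are all hypothetical and chosen only for illustration. They are not taken from the authors' corpus or model.

```python
import math

# Hypothetical toy embeddings, invented for illustration only.
# Real Word2Vec vectors typically have hundreds of dimensions.
toy_vectors = {
    "rain":  [1.0, 1.0, 0.0],
    "water": [0.0, 1.0, 0.0],
    "snow":  [1.0, 0.0, 1.0],
    "ice":   [0.0, 0.0, 1.0],
    "cloud": [0.5, 0.5, 0.5],
    "wind":  [0.2, 0.1, 0.1],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def solve_analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' with the vector-offset method."""
    # Target point in embedding space: b - a + c
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # The three query words are excluded from the candidate answers.
    candidates = (w for w in vectors if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(vectors[w], target))


answer = solve_analogy(toy_vectors, "rain", "water", "snow")
print(answer)
```

Analogy accuracy on a test set would then be the fraction of questions for which the returned word matches the expected answer.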

Author(s):  
Muthukumaran Ramasubramanian ◽  
Hassan Muhammad ◽  
Iksha Gurung ◽  
Manil Maskey ◽  
Rahul Ramachandran

1964 ◽  
Vol 12 (2) ◽  
pp. 64-68
Author(s):  
ROBERT L. HELLER

Radiocarbon ◽  
2001 ◽  
Vol 43 (2B) ◽  
pp. 731-742 ◽  
Author(s):  
D Lal ◽  
A J T Jull

Nuclear interactions of cosmic rays produce a number of stable and radioactive isotopes on the earth (Lal and Peters 1967). Two of these, 14C and 10Be, find applications as tracers in a wide variety of earth science problems by virtue of their special combination of attributes: 1) their source functions, 2) their half-lives, and 3) their chemical properties. The radioisotope 14C (half-life = 5730 yr), produced in the earth's atmosphere, was the first to be discovered (Anderson et al. 1947; Libby 1952). The next longer-lived isotope also produced in the earth's atmosphere, 10Be (half-life = 1.5 myr), was discovered independently by two groups within a decade (Arnold 1956; Goel et al. 1957; Lal 1991a). Both isotopes are produced efficiently in the earth's atmosphere, and also in solids on the earth's surface. Independently and jointly, they serve as useful tracers for characterizing the evolutionary history of a wide range of materials and artifacts. Here, we focus specifically on the production of 14C in terrestrial solids, designated as in-situ-produced 14C (to differentiate it from atmospheric 14C, initially produced in the atmosphere). We also illustrate its application to several earth science problems. This relatively new area of investigation, using 14C as a tracer, was made possible by the development of accelerator mass spectrometry (AMS). The availability of in-situ 14C has enormously enhanced the overall scope of 14C as a tracer (singly or together with in-situ-produced 10Be), which eminently qualifies it as a unique tracer for studying earth sciences.


2018 ◽  
Vol 24 (1) ◽  
pp. 553-562 ◽  
Author(s):  
Shusen Liu ◽  
Peer-Timo Bremer ◽  
Jayaraman J. Thiagarajan ◽  
Vivek Srikumar ◽  
Bei Wang ◽  
...  

2020 ◽  
Vol 42 (4) ◽  
pp. 478-484
Author(s):  
Kirill Golikov ◽  
Ekaterina LAPTEVA ◽  
A. SOCHIVKO

The article discusses the use of live plants to supplement the botanical component of the exposition in the “Natural areas” (hall № 17 “Natural zonality and its components” and № 20 “Desert, subtropical, tropical countries, high-altitude zone”) and “Physico-geographic regions” (hall № 24 “Continents and parts of the world”) departments, in order to visualize information presented in the Earth Science Museum. Demonstrating plants that originate from different regions of the world, represent different life forms, and are structural components of various plant communities makes it possible to visually characterize thematic aspects of an exposition. This in turn reveals such principles of the systematic organization of nature as the ecobiomorphic and the phytocenotic.

