The Secret is in the Spectra: Predicting Cross-lingual Task Performance with Spectral Similarity Measures

Author(s):  
Haim Dubossarsky ◽  
Ivan Vulić ◽  
Roi Reichart ◽  
Anna Korhonen
Sensors ◽  
2019 ◽  
Vol 19 (23) ◽  
pp. 5176
Author(s):  
Guannan Li ◽  
Ying Li ◽  
Bingxin Liu ◽  
Peng Wu ◽  
Chen Chen

Polarimetric synthetic aperture radar is an important tool in the effective detection of marine oil spills. In this study, two cases of Radarsat-2 Fine mode quad-polarimetric synthetic aperture radar datasets, collected over the Gulf of Mexico using the same research area, sensor, and time, are exploited to detect a well-known oil seep area. A novel oil spill detection scheme is proposed, based on a multi-polarimetric features model matching method that uses the spectral pan-similarity measure (SPM). A multi-polarimetric features curve is generated from optimal polarimetric features, which are selected using the Jeffreys–Matusita distance for its ability to discriminate between thick and thin oil slicks and seawater. The SPM is used to search for and match homogeneous unlabeled pixels and assign them to the class with the highest similarity in terms of spectral vector size, spectral curve shape, and spectral information content. The superiority of the SPM for oil spill detection is demonstrated for the first time through accuracy assessments and computational complexity analysis, comparing it with four traditional spectral similarity measures, random forest (RF), support vector machine (SVM), and decision tree (DT). Experimental results indicate that the proposed method has better oil spill detection capability, with a higher average accuracy and kappa coefficient (1.5–7.9% and 1–25% higher, respectively) than the four traditional spectral similarity measures under the same computational complexity. Furthermore, in most cases, the proposed method produces valuable and acceptable results that are better than those of the RF, SVM, and DT in terms of accuracy and computational complexity.
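
The abstract above names the Jeffreys–Matusita distance as its feature-selection criterion but does not give the formula. As a rough illustration only, the sketch below computes one common form of that distance for a single feature, assuming each class is modelled as a univariate Gaussian; the Gaussian assumption, the `2 * (1 - exp(-B))` convention, and the function names are assumptions made here, not details taken from the paper.

```python
import math


def bhattacharyya_gaussian_1d(mean_a, var_a, mean_b, var_b):
    """Bhattacharyya distance between two univariate Gaussians (assumed model)."""
    # Term driven by the separation of the class means.
    mean_term = (mean_a - mean_b) ** 2 / (4.0 * (var_a + var_b))
    # Term driven by the difference between the class variances.
    var_term = 0.5 * math.log((var_a + var_b) / (2.0 * math.sqrt(var_a * var_b)))
    return mean_term + var_term


def jeffreys_matusita(mean_a, var_a, mean_b, var_b):
    """Jeffreys-Matusita distance in the range [0, 2] (one common convention)."""
    b = bhattacharyya_gaussian_1d(mean_a, var_a, mean_b, var_b)
    return 2.0 * (1.0 - math.exp(-b))
```

Under this convention, values near 2 suggest that a feature separates the two classes well, while values near 0 suggest heavy overlap.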


2016 ◽  
Vol 8 (4) ◽  
pp. 344 ◽  
Author(s):  
Ke Wang ◽  
Bin Yong

Author(s):  
H. Chauhan ◽  
B. Krishna Mohan

The present study was undertaken to check the effectiveness of spectral similarity measures for developing precise crop spectra from collected hyperspectral field spectra. In multispectral and hyperspectral remote sensing, pixels are classified by statistical comparison (by means of spectral similarity) of known field or library spectra with unknown image spectra. Though these algorithms are readily used, little emphasis has been placed on the use of various spectral similarity measures to select precise crop spectra from a set of field spectra. Conventionally, crop spectra are developed after rejecting outliers based only on broad-spectrum analysis. Here, a successful attempt has been made to develop precise crop spectra based on spectral similarity. As the use of unevaluated data leads to uncertainty in image classification, it is crucial to evaluate the data. Hence, in contrast to the conventional method, the data were evaluated for precision to serve the purpose of the present research work. The effectiveness of the developed precise field spectra was evaluated with spectral discrimination measures, which yielded higher discrimination values than for spectra developed conventionally. Overall classification accuracy is 51.89% for the image classified with conventionally selected field spectra and 75.47% for the image classified with field spectra selected precisely based on spectral similarity. KHAT values are 0.37 and 0.62, and Z values are 2.77 and 9.59, for the images classified using conventional and precise field spectra, respectively. The reasonably higher classification accuracy, KHAT, and Z values show the possibility of a new approach to field spectra selection based on spectral similarity measures.
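
The overall accuracy and KHAT (kappa estimate) figures reported above are typically derived from a classification confusion matrix. The following is a minimal sketch of that calculation, assuming a square matrix with reference classes in rows and predicted classes in columns; the function name and matrix layout are illustrative assumptions, and the study's own confusion matrices are not reproduced here.

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, kappa estimate) for a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the main diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Chance agreement expected from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa
```

Kappa discounts the agreement expected by chance, which is why it can be noticeably lower than the overall accuracy for the same classification.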


2009 ◽  
Vol 126 (6) ◽  
pp. 3227-3235 ◽  
Author(s):  
Annika Hämäläinen ◽  
Michele Gubian ◽  
Louis ten Bosch ◽  
Lou Boves

2018 ◽  
Vol 24 (5) ◽  
pp. 677-694 ◽  
Author(s):  
D. LANGLOIS ◽  
M. SAAD ◽  
K. SMAILI

The objective of this article is to address the comparability of documents that are extracted from different sources and written in different languages. These documents are not necessarily translations of each other. This material is referred to as multilingual comparable corpora. These language resources are useful for multilingual natural language processing applications, especially for low-resourced language pairs. In this paper, we collect different data in Arabic, English, and French. Two corpora are built by using available hyperlinks for Wikipedia and Euronews. Euronews is an aligned multilingual (Arabic, English, and French) corpus of 34k documents collected from the Euronews website. A more challenging issue is to build a comparable corpus from two different and independent media having two distinct editorial lines, such as the British Broadcasting Corporation (BBC) and Al Jazeera (JSC). To build such a corpus, we propose to use the Cross-Lingual Latent Semantic approach. For this purpose, documents were harvested from the BBC and JSC websites for each month of the years 2012 and 2013. The comparability is calculated for each Arabic–English pair of documents of each month. This automatic task is then validated by hand. This led to a multilingual (Arabic–English) aligned corpus of 305 pairs of documents (233k English words and 137k Arabic words). In addition, a study is presented to analyze the performance of three methods from the literature that measure the comparability of documents on the multilingual reference corpora. A recall at rank 1 of 50.16 per cent is achieved with the Cross-lingual LSI approach on the BBC–JSC test corpus, while the dictionary-based method reaches a recall of only 35.41 per cent.
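
The recall at rank 1 figures quoted above can be read as the share of source documents whose top-ranked candidate is the correct counterpart. A minimal sketch of that metric follows, assuming comparability scores are already available as nested dictionaries; the data layout, document identifiers, and function name are hypothetical and do not come from the article.

```python
def recall_at_1(similarity, gold):
    """Fraction of source documents whose best-scoring target is the gold match.

    similarity: {source_id: {target_id: score}} (hypothetical layout)
    gold:       {source_id: correct_target_id}
    """
    hits = 0
    for source_id, scores in similarity.items():
        # Pick the target document with the highest comparability score.
        best_target = max(scores, key=scores.get)
        if best_target == gold[source_id]:
            hits += 1
    return hits / len(similarity)
```

A call such as `recall_at_1(scores, gold_pairs)` returns a value between 0 and 1, which is multiplied by 100 to express it as a percentage.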

