An Unsupervised Data-Driven Cross-Lingual Method for Building High Precision Sentiment Lexicons

The data-driven Bulgarian WordNet: BTBWN

Cognitive Studies | Études cognitives ◽

10.11649/cs.1713 ◽

2018 ◽

Author(s):

Petya Osenova ◽

Kiril Simov

Keyword(s):

Information Retrieval ◽

Machine Translation ◽

Semantic Information ◽

Data Driven ◽

Lexical Resources ◽

Multilingual Information Retrieval ◽

Cross Lingual ◽

Princeton Wordnet ◽

Word Senses

The paper presents our work towards the simultaneous creation of a data-driven WordNet for Bulgarian and a manually annotated treebank with semantic information. Such an approach requires synchronization of the word senses in both the syntactic and the lexical resources, without limiting the WordNet senses to the corpus or vice versa. Our strategy focuses on identifying the senses used in BulTreeBank, but the missing senses of a lemma have also been covered through exploration of bigger corpora. The identified senses have been organized into synsets for the Bulgarian WordNet and then aligned to the Princeton WordNet synsets. Various types of mappings between the two resources are considered, both from a cross-lingual perspective and with respect to ensuring maximum connectivity and the potential for incorporating language-specific concepts. The mapping between the two WordNets (English and Bulgarian) is a basis for applications such as machine translation and multilingual information retrieval.


Data driven methods for improving mono- and cross-lingual IR performance in noisy environments

Proceedings of the second workshop on Analytics for noisy unstructured text data - AND '08 ◽

10.1145/1390749.1390762 ◽

2008 ◽

Cited By ~ 1

Author(s):

Antti Järvelin ◽

Tuomas Talvensaari ◽

Anni Järvelin

Keyword(s):

Data Driven ◽

Noisy Environments ◽

Cross Lingual


Improving Efficiency of Volunteer-Based Food Rescue Operations

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i08.7051 ◽

2020 ◽

Vol 34 (08) ◽

pp. 13369-13375

Author(s):

Zheyuan Ryan Shi ◽

Yiwen Yuan ◽

Kimberly Lo ◽

Leah Lizarondo ◽

Fei Fang

Keyword(s):

Food Insecurity ◽

High Precision ◽

Branch And Bound ◽

Food Waste ◽

Optimization Algorithm ◽

Data Driven ◽

Data Generation ◽

Rescue Operation ◽

Optimal Intervention ◽

Near Future

Food waste and food insecurity are two challenges that coexist in many communities. To mitigate the problem, food rescue platforms match excess food with the communities in need, and leverage external volunteers to transport the food. However, the external volunteers bring significant uncertainty to the food rescue operation. We work with a large food rescue organization to predict the uncertainty and furthermore to find ways to reduce the human dispatcher's workload and the redundant notifications sent to volunteers. We make two main contributions. (1) We train a stacking model which predicts whether a rescue will be claimed with high precision and AUC. This model can help the dispatcher better plan for backup options and alleviate their uncertainty. (2) We develop a data-driven optimization algorithm to compute the optimal intervention and notification scheme. The algorithm uses a novel counterfactual data generation approach and the branch and bound framework. Our result reduces the number of notifications and interventions required in the food rescue operation. We are working with the organization to deploy our results in the near future.


Data-driven gradient algorithm for high-precision quantum control

Physical Review A ◽

10.1103/physreva.97.042122 ◽

2018 ◽

Vol 97 (4) ◽

Cited By ~ 14

Author(s):

Re-Bing Wu ◽

Bing Chu ◽

David H. Owens ◽

Herschel Rabitz

Keyword(s):

High Precision ◽

Quantum Control ◽

Gradient Algorithm ◽

Data Driven


High Precision Data-driven Force Control of Compact Elastic Module for a Lower Extremity Augmentation Device

Journal of Bionic Engineering ◽

10.1007/s42235-018-0068-y ◽

2018 ◽

Vol 15 (5) ◽

pp. 805-819 ◽

Cited By ~ 1

Author(s):

Likun Wang ◽

Chaofeng Chen ◽

Zhengyang Li ◽

Wei Dong ◽

Zhijiang Du ◽

...

Keyword(s):

Lower Extremity ◽

High Precision ◽

Force Control ◽

Data Driven ◽

Precision Data ◽

Elastic Module ◽

High Precision Data


Multi-Level Model Reduction and Data-Driven Identification of the Lithium-Ion Battery

Energies ◽

10.3390/en13153791 ◽

2020 ◽

Vol 13 (15) ◽

pp. 3791

Author(s):

Yong Li ◽

Jue Yang ◽

Wei Long Liu ◽

Cheng Lin Liao

Keyword(s):

Lithium Ion Battery ◽

High Precision ◽

Tracking Error ◽

Lithium Ion ◽

Data Driven ◽

System Level ◽

Identification Algorithm ◽

Identification Methods ◽

Battery Model ◽

Non Linear System

The lithium-ion battery is a complicated non-linear system with multiple electrochemical processes, including mass and charge conservation as well as electrochemical kinetics. Computing the electrochemical model depends on an in-depth understanding of the physicochemical characteristics and parameters, which can be costly and time-consuming to obtain. We investigated electrochemical modeling, reduction, and identification methods for the lithium-ion battery from the electrode level to the system level. A reduced 9th order linear model was proposed using electrode-level physicochemical modeling and a cell-level mathematical reduction method. A data-driven predictor-based subspace identification algorithm was presented for estimating the lithium-ion battery model at the system level. The effectiveness of the proposed modeling and identification methods was validated in an experimental study based on LiFePO4 cells. The accuracy and dynamic characteristics of the identified model were found to be closely related to the operating State of Charge (SOC) range. Experimental results showed that the proposed methods perform well, with high precision and good robustness, in the SOC range of 90% to 10%, and that the tracking error increases significantly within the higher (100–90%) and lower (10–0%) SOC ranges. Moreover, to achieve an optimal balance between high precision and low complexity, statistical analysis revealed that the 6th, 3rd, and 5th order battery models are the optimal choices in the SOC ranges of 100% to 90%, 90% to 10%, and 10% to 0%, respectively.


Data-driven Laser Plane Optimization for High-precision Numerical Calibration of Line Structured Light Sensors

IEEE Access ◽

10.1109/access.2021.3072662 ◽

2021 ◽

pp. 1-1

Author(s):

Jingbo Zhou ◽

Laisheng Pan ◽

Yuehua Li ◽

Renjie Du ◽

Fuxiang Zhang

Keyword(s):

High Precision ◽

Structured Light ◽

Data Driven ◽

Laser Plane


Tracking Child Language Development With Neural Network Language Models

Frontiers in Psychology ◽

10.3389/fpsyg.2021.674402 ◽

2021 ◽

Vol 12 ◽

Author(s):

Kenji Sagae

Keyword(s):

Neural Network ◽

Language Development ◽

Child Language ◽

Structural Characteristics ◽

Language Models ◽

Data Driven ◽

Evaluation Methodology ◽

Specific Information ◽

Data Driven Approach ◽

Cross Lingual

Recent work on the application of neural networks to language modeling has shown that models based on certain neural architectures can capture syntactic information from utterances and sentences even when not given an explicitly syntactic objective. We examine whether a fully data-driven model of language development that uses a recurrent neural network encoder for utterances can track how child language utterances change over the course of language development in a way that is comparable to what is achieved using established language assessment metrics that use language-specific information carefully designed by experts. Given only transcripts of child language utterances from the CHILDES Database and no pre-specified information about language, our model captures not just the structural characteristics of child language utterances, but how these structures reflect language development over time. We establish an evaluation methodology with which we can examine how well our model tracks language development compared to three known approaches: Mean Length of Utterance, the Developmental Sentence Score, and the Index of Productive Syntax. We discuss the applicability of our model to data-driven assessment of child language development, including how a fully data-driven approach supports the possibility of increased research in multilingual and cross-lingual issues.
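As a point of reference for the baseline metrics named above, the simplest of them, Mean Length of Utterance, can be sketched in a few lines. This is a minimal illustration only: it counts whitespace-separated words, whereas MLU is conventionally computed over morphemes, and the function name and sample utterances are hypothetical rather than taken from the paper or from CHILDES.

```python
def mean_length_of_utterance(utterances):
    """Average number of whitespace-separated tokens per non-empty utterance.

    Assumption: words stand in for morphemes, which simplifies the
    conventional morpheme-based definition of MLU.
    """
    lengths = [len(u.split()) for u in utterances if u.strip()]
    if not lengths:
        raise ValueError("need at least one non-empty utterance")
    return sum(lengths) / len(lengths)


# Hypothetical transcript fragment, for illustration only.
sample = ["more juice", "doggie go", "I want the red ball"]
mlu = mean_length_of_utterance(sample)
```

A rising value of this score across sessions is the kind of developmental signal against which the abstract compares the neural model.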


Extracting Cylinder Individual Combustion Data From a High Precision Torque Sensor

ASME 2010 Internal Combustion Engine Division Fall Technical Conference ◽

10.1115/icef2010-35101 ◽

2010 ◽

Cited By ~ 4

Author(s):

Hans Aulin ◽

Per Tunestal ◽

Thomas Johansson ◽

Bengt Johansson

Keyword(s):

High Precision ◽

Complex Dynamics ◽

Signal To Noise Ratio ◽

Data Driven ◽

Coefficient Of Determination ◽

Combustion Model ◽

Torque Sensor ◽

Sensor Position ◽

Data Driven Approach ◽

Data Driven Modeling

A high-precision torque sensor is used to extract combustion timing information from cylinder-individual pressure estimates constructed from the torque measurements. A combination of physics-based and data-driven modeling is used: the physical part of the model is based on equations describing the contributions of inertial and gas forces, while the flexing of the crankshaft, which has rather complex dynamics, is modeled using the data-driven approach. The first part of the study shows the derivation of the models and how well the torque at the sensor position can be estimated from the pressures in the four cylinders. The second part demonstrates how cylinder-individual torque and pressure can be reconstructed by inverting the pressure-to-torque model. Going from measured torque to the pressure in each cylinder is not trivial, since the inverted model is ill-conditioned around top dead centre, which causes large errors where precision is most needed. A parameterized combustion model is therefore introduced to improve the signal-to-noise ratio in the estimated parameters. The proposed method for detecting combustion demonstrated good results, with a coefficient of determination of 0.95 against “true” combustion phasing.


Automatic measurement of SAED patterns from asbestos minerals

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100106296 ◽

1988 ◽

Vol 46 ◽

pp. 846-847

Author(s):

J. C. Russ ◽

T. Taguchi ◽

P. M. Peters ◽

E. Chatfield ◽

...

Keyword(s):

Diffraction Pattern ◽

High Precision ◽

Diffraction Data ◽

Automatic Measurement ◽

Automatic Method ◽

Important Method ◽

X Ray ◽

Integrated Intensity ◽

Asbestiform Minerals ◽

True Center

Conventional SAD patterns as obtained in the TEM present difficulties for the identification of materials such as asbestiform minerals, although diffraction is considered to be an important method for this purpose. The preferred orientation of the fibers and the spotty patterns that are obtained do not readily lend themselves to measurement of the integrated intensity values for each d-spacing, and even the d-spacings may be hard to determine precisely because the true center location for the broken rings requires estimation. We have implemented an automatic method for diffraction pattern measurement to overcome these problems. It automatically locates the center of patterns with high precision, measures the radius of each ring of spots in the pattern, and integrates the density of spots in that ring. The resulting spectrum of intensity vs. radius is then used just as a conventional X-ray diffractometer scan would be, to locate peaks and produce a list of d,I values suitable for search/match comparison to known or expected phases.
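The measurement steps described above (locate the center, then integrate intensity ring by ring) can be illustrated with a small sketch. It is not the authors' implementation: the intensity-weighted centroid used here for the center is a stand-in assumption that only works for roughly symmetric patterns, and the function names and the toy image are hypothetical.

```python
import math


def intensity_centroid(image):
    """Estimate the pattern center as the intensity-weighted centroid.

    Assumption: the pattern is roughly symmetric about its center;
    the abstract does not say how the center is actually located.
    """
    total = sum_x = sum_y = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            total += value
            sum_x += value * x
            sum_y += value * y
    if total == 0:
        raise ValueError("image contains no intensity")
    return sum_x / total, sum_y / total


def radial_profile(image, center):
    """Sum pixel intensity in one-pixel-wide rings around the center."""
    cx, cy = center
    rings = {}
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            radius = int(round(math.hypot(x - cx, y - cy)))
            rings[radius] = rings.get(radius, 0.0) + value
    return [rings.get(r, 0.0) for r in range(max(rings) + 1)]


# Toy 7x7 "pattern": four spots lying on a ring of radius 2 around (3, 3).
image = [[0.0] * 7 for _ in range(7)]
for x, y in [(1, 3), (5, 3), (3, 1), (3, 5)]:
    image[y][x] = 1.0

center = intensity_centroid(image)
profile = radial_profile(image, center)
peak_radius = profile.index(max(profile))
```

Peaks in the resulting profile play the role of the intensity-vs-radius spectrum; converting a peak radius to a d-spacing would additionally need the instrument's camera constant, which is not given here.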
