Advanced Applications on Bilingual Document Analysis and Processing Systems

2022 ◽  
pp. 625-674
Author(s):  
Shalini Puri ◽  
Satya Prakash Singh

Today, rapid digitization requires efficient bilingual non-image and image document classification systems. Although many bilingual NLP and image-based systems provide solutions for real-world problems, they primarily focus on text extraction, identification, and recognition tasks with limited document types. This article discusses a journey of these systems and provides an overview of their methods, feature extraction techniques, document sets, classifiers, and accuracy for English-Hindi and other language pairs. The gaps found lead toward the idea of a generic and integrated bilingual English-Hindi document classification system, which classifies heterogeneous documents using a dual class feeder and two character corpora. Its non-image and image modules include pre- and post-processing stages and pre-and post-segmentation stages to classify documents into predefined classes. This article discusses many real-life applications on societal and commercial issues. The analytical results show important findings of existing and proposed systems.

2020 ◽  
Vol 11 (4) ◽  
pp. 149-193
Author(s):  
Shalini Puri ◽  
Satya Prakash Singh

Today, rapid digitization requires efficient bilingual non-image and image document classification systems. Although many bilingual NLP and image-based systems provide solutions for real-world problems, they primarily focus on text extraction, identification, and recognition tasks with limited document types. This article discusses a journey of these systems and provides an overview of their methods, feature extraction techniques, document sets, classifiers, and accuracy for English-Hindi and other language pairs. The gaps found lead toward the idea of a generic and integrated bilingual English-Hindi document classification system, which classifies heterogeneous documents using a dual class feeder and two character corpora. Its non-image and image modules include pre- and post-processing stages and pre-and post-segmentation stages to classify documents into predefined classes. This article discusses many real-life applications on societal and commercial issues. The analytical results show important findings of existing and proposed systems.


Author(s):  
Rohitkumar R Upadhyay

Abstract: Historically, many students have struggled with mathematics, which makes them wonder whether they will ever apply the knowledge in real life. Teachers and parents, when asked, admit that students know very little about the relevance of mathematics to real life. That is why this paper focuses on the application of mathematics in real life. The most common and essential applications are discussed, such as finance and banking, weather prediction, computers and computer games, search engines (Google), music, and transportation and logistics. In addition, some advanced applications are discussed, such as satellite navigation, military and defence, and crime prediction. Keywords: Mathematics, Real life, Finance and Banking, Satellite Navigation, Military and Defence


2017 ◽  
Vol 33 (S1) ◽  
pp. 139-140
Author(s):  
Giuditta Callea ◽  
Maria Caterina Cavallo ◽  
Rosanna Tarricone

INTRODUCTION: Administrative data (for example, hospital discharge databases, HDDs) can be used as a real-world source of clinical and economic evidence for assessing new medical devices (MDs), provided that their use can be identified in the data. In the absence of updated classification systems for procedures and diagnoses that make it possible to identify the use of new technologies in the data, traceability can still be achieved through the authorities' coding guidelines (that is, indications on how to combine the existing codes for procedures and/or diagnoses when new technologies are used). In 2009 Italy adopted version 2007 of the International Classification System of Diseases (ICD-9-CM) and version 24 of Diagnosis Related Groups (DRGs), which are still in use. The aim of this work was to investigate the capacity of the classification system currently used in Italy, which is at high risk of obsolescence, to identify innovative MDs. METHODS: To achieve our goal, we performed a systematic search of all the national and regional coding guidelines published from 2009 (that is, the year of introduction of the new classification systems) to 2015. From each document we extracted the list of technologies for which the Ministry of Health and/or the Regional Authorities provided coding indications. RESULTS: Our results show that only a few recent technological innovations can be identified in the Italian HDDs. This reduces the possibility for decision makers to measure the outcomes and costs of new technologies in real-world clinical practice. CONCLUSIONS: The traceability of new MDs can support Health Technology Assessment (HTA). Indeed, HTA programs should use real-world evidence to re-assess MDs 2–3 years after their introduction in clinical practice. The use of routinely collected data, such as HDDs, would make it possible to measure the “real” effectiveness of new technologies in the “real” world, on “real” patients in “real” hospitals, to complement the evidence from Randomized Controlled Trials.


2014 ◽  
Vol 25 (4) ◽  
pp. 233-238 ◽  
Author(s):  
Martin Peper ◽  
Simone N. Loeffler

Current ambulatory technologies are highly relevant for neuropsychological assessment and treatment as they provide a gateway to real life data. Ambulatory assessment of cognitive complaints, skills and emotional states in natural contexts provides information that has a greater ecological validity than traditional assessment approaches. This issue presents an overview of current technological and methodological innovations, opportunities, problems and limitations of these methods designed for the context-sensitive measurement of cognitive, emotional and behavioral function. The usefulness of selected ambulatory approaches is demonstrated and their relevance for an ecologically valid neuropsychology is highlighted.


Author(s):  
Manish M. Kayasth ◽  
Bharat C. Patel

The entire character recognition system is logically divided into sections such as Scanning, Pre-processing, Classification, Processing, and Post-processing. In the targeted system, the scanned image is first passed through the pre-processing modules and then through feature extraction and classification in order to achieve a high recognition rate. This paper mainly describes the feature extraction and classification techniques. These methodologies play an important role in identifying offline handwritten characters, specifically in the Gujarati language. Feature extraction provides methods by which characters can be identified uniquely and with a high degree of accuracy. Feature extraction helps to find the shape contained in the pattern. Several techniques are available for feature extraction and classification; however, the selection of an appropriate technique based on its input decides the degree of recognition accuracy.


2021 ◽  
Author(s):  
Amarildo Likmeta ◽  
Alberto Maria Metelli ◽  
Giorgia Ramponi ◽  
Andrea Tirinzoni ◽  
Matteo Giuliani ◽  
...  

Abstract: In real-world applications, inferring the intentions of expert agents (e.g., human operators) can be fundamental to understanding how possibly conflicting objectives are managed, helping to interpret the demonstrated behavior. In this paper, we discuss how inverse reinforcement learning (IRL) can be employed to retrieve the reward function implicitly optimized by expert agents acting in real applications. Scaling IRL to real-world cases has proved challenging, as typically only a fixed dataset of demonstrations is available and further interactions with the environment are not allowed. For this reason, we resort to a class of truly batch model-free IRL algorithms and present three application scenarios: (1) the high-level decision-making problem in the highway driving scenario, (2) inferring user preferences in a social network (Twitter), and (3) the management of the water release in Lake Como. For each of these scenarios, we provide a formalization, experiments, and a discussion to interpret the obtained results.


Neurosurgery ◽  
2021 ◽  
Author(s):  
Kenny Yat Hong Kwan ◽  
J Naresh-Babu ◽  
Wilco Jacobs ◽  
Marinus de Kleuver ◽  
David W Polly ◽  
...  

BACKGROUND: Existing adult spinal deformity (ASD) classification systems are based on radiological parameters, but management of ASD patients requires a holistic approach. A comprehensive, clinically oriented patient profile and classification of ASD that can guide decision-making and correlate with patient outcomes is lacking. OBJECTIVE: To perform a systematic review to determine the purpose, characteristics, and methodological quality of classification systems currently used in ASD. METHODS: A systematic literature search was conducted in MEDLINE, EMBASE, CINAHL, and Web of Science for literature published between January 2000 and October 2018. From the included studies, the list of classification systems, their methodological measurement properties, and their correlation with treatment outcomes were analyzed. RESULTS: Out of 4470 screened references, 163 were included, and 54 different classification systems for ASD were identified. The most commonly used was the Scoliosis Research Society-Schwab classification system. A total of 35 classifications were based on radiological parameters, and no correlation was found between the levels of any classification system and patient-related outcomes. Only limited evidence, of limited quality, was available on the methodological quality of the classification systems. For studies that reported the data, intraobserver and interobserver reliability were good (kappa = 0.8). CONCLUSION: This systematic literature search revealed that current classification systems in clinical use neither include a comprehensive set of dimensions relevant to decision-making nor correlate with outcomes. A classification system comprising a core set of patient-related, radiological, and etiological characteristics relevant to the management of ASD is needed.
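The reliability figure quoted above (kappa = 0.8) is an agreement statistic. The abstract does not say which kappa variant the underlying studies used, so the following is only a minimal sketch that assumes Cohen's kappa for two raters grading the same cases. The function name `cohen_kappa` and the grade lists are hypothetical and are not taken from the review.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items (assumed variant)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(rater_a)
    # Observed agreement: fraction of items that received the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical grades assigned by two observers to ten cases.
grades_a = ["I", "I", "II", "II", "III", "III", "I", "II", "III", "I"]
grades_b = ["I", "I", "II", "II", "III", "III", "I", "II", "III", "II"]
print(round(cohen_kappa(grades_a, grades_b), 3))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is usually reported instead of a plain percentage of matching grades.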


2021 ◽  
pp. 193229682110075
Author(s):  
Rebecca A. Harvey Towers ◽  
Xiaohe Zhang ◽  
Rasoul Yousefi ◽  
Ghazaleh Esmaili ◽  
Liang Wang ◽  
...  

The algorithm for the Dexcom G6 CGM System was enhanced to retain accuracy while reducing the frequency and duration of sensor error. The new algorithm was evaluated by post-processing raw signals collected from G6 pivotal trials (NCT02880267) and by assessing the difference in data availability after a limited, real-world launch. Accuracy was comparable with the new algorithm—the overall %20/20 was 91.7% before and 91.8% after the algorithm modification; MARD was unchanged. The mean data gap due to sensor error nearly halved and total time spent in sensor error decreased by 59%. A limited field launch showed similar results, with a 43% decrease in total time spent in sensor error. Increased data availability may improve patient experience and CGM data integration into insulin delivery systems.
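The two accuracy figures in this abstract, MARD and %20/20, can be illustrated with a short calculation. This is a minimal sketch under stated assumptions, not the study's analysis code: MARD is taken as the mean absolute relative difference between CGM and reference values, and %20/20 as the share of CGM values within ±20 mg/dL of the reference when the reference is below 100 mg/dL, or within ±20% otherwise. The exact handling of the 100 mg/dL boundary, the function names, and the sample readings are assumptions.

```python
def mard(cgm, ref):
    """Mean absolute relative difference between CGM and reference, in percent."""
    if len(cgm) != len(ref) or not ref:
        raise ValueError("need two equally long, non-empty value lists")
    return 100 * sum(abs(c - r) / r for c, r in zip(cgm, ref)) / len(ref)


def pct_20_20(cgm, ref, threshold=100.0):
    """Percent of CGM values within 20 mg/dL (ref < threshold) or 20% (otherwise)."""
    if len(cgm) != len(ref) or not ref:
        raise ValueError("need two equally long, non-empty value lists")
    hits = 0
    for c, r in zip(cgm, ref):
        # Assumed boundary rule: absolute tolerance below the threshold,
        # relative tolerance at or above it.
        tolerance = 20.0 if r < threshold else 0.2 * r
        if abs(c - r) <= tolerance:
            hits += 1
    return 100 * hits / len(ref)


# Hypothetical paired readings in mg/dL (reference vs. sensor).
reference = [80, 120, 200, 250]
sensor = [90, 110, 250, 240]
print(f"MARD: {mard(sensor, reference):.2f}%")
print(f"%20/20: {pct_20_20(sensor, reference):.1f}%")
```

A lower MARD and a higher %20/20 both indicate closer agreement with the reference, so "comparable accuracy" in the abstract corresponds to both figures staying essentially unchanged.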


Author(s):  
Marcelo N. de Sousa ◽  
Ricardo Sant’Ana ◽  
Rigel P. Fernandes ◽  
Julio Cesar Duarte ◽  
José A. Apolinário ◽  
...  

Abstract: In outdoor RF localization systems, particularly where line of sight cannot be guaranteed or where multipath effects are severe, information about the terrain may improve the position estimate’s performance. Given the difficulties in obtaining real data, a ray-tracing fingerprint is a viable option. Nevertheless, despite good simulation results, the performance of systems trained with simulated features only suffers degradation when they are employed to process real-life data. This work intends to improve the localization accuracy when using ray-tracing fingerprints and a few field data obtained from an adverse environment where a large number of measurements is not an option. We employ a machine learning (ML) algorithm to explore the multipath information. We selected the random forest and gradient boosting algorithms, both considered efficient tools in the literature. In a strict simulation scenario (simulated data for training, validating, and testing), we obtained the same good results found in the literature (error around 2 m). In a real-world system (simulated data for training, real data for validating and testing), both ML algorithms resulted in a mean positioning error around 100 m. We have also obtained experimental results for noisy (artificially added Gaussian noise) and mismatched (with a null subset of) features. From the simulations carried out in this work, our study revealed that enhancing the ML model with a few real-world data improves the overall localization performance. For the ML algorithms employed herein, we also observed that, under noisy conditions, the random forest algorithm achieved a slightly better result than the gradient boosting algorithm. However, they achieved similar results in a mismatch experiment.
This work’s practical implication is that multipath information, once rejected in old localization techniques, now represents a significant source of information whenever we have prior knowledge to train the ML algorithm.


2021 ◽  
Vol 11 (15) ◽  
pp. 6748
Author(s):  
Hsun-Ping Hsieh ◽  
Fandel Lin ◽  
Jiawei Jiang ◽  
Tzu-Ying Kuo ◽  
Yu-En Chang

Flourishing public bike-sharing systems have been widely studied in recent years. Many existing works focus on accurately predicting demand at individual stations over a short time horizon. This work, in contrast, aims to predict long-term bike rental/drop-off demands at given bike station locations in expansion areas. Real-world bike stations are mainly built in batches for expansion areas. To address the problem, we propose LDA (Long-Term Demand Advisor), a framework to estimate the long-term characteristics of newly established stations. In LDA, several engineering strategies are proposed to extract discriminative and representative features for long-term demands. Moreover, for original and newly established stations, we propose several feature extraction methods and an algorithm to model the correlations between urban dynamics and long-term demands. Our work is the first to address the long-term demand of new stations, providing the government with a tool to pre-evaluate the bike flow of new stations before deployment; this can avoid wasting resources such as personnel expense or budget. We evaluate on real-world data from New York City’s bike-sharing system and show that our LDA framework outperforms baseline approaches.

