Automation of Morphological Tagging of Archival Documents

Author(s):  
Anatoly Komendantov ◽  
Alexander Matveev ◽  
Andrey Svetlov

The paper describes an add-on to MyStem, the stemming tool by I. Segalovich. We designed the application to give MyStem a convenient graphical interface that is intuitive and easy to learn for users who do not specialize in information technology. It turned out that MyStem correctly processes outdated vocabulary if the text is passed to the program in modern Cyrillic. In addition to the convenient interface, our program therefore has an option for working with the outdated Cyrillic alphabet: when it is turned on, outdated letters are replaced first (for instance, the letters zelo and omega are replaced by «ks» and «o» respectively), only then is the text passed to MyStem for analysis, and afterwards the original characters are restored in the processed document. Thus, our add-on intercepts the output of the MyStem tool, then reformats and analyzes it. The application also lets the user resolve homonymy manually when the automatic tagging has assigned incorrect morphological characteristics to a word. The main purpose of the application is to prepare the morphological tagging of documents from the archival fund «Mikhailovsky Stanichny Ataman» in order to create a linguistic corpus. While working on the application, we solved the problem of correctly processing texts that contain outdated Cyrillic characters. To implement the functional and user-friendly graphical interface, we use the JavaFX platform (OpenJFX).
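The replace, analyze, restore cycle described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the mapping table is an assumption (the abstract does not list the full set of substitutions), the names `OLD_TO_MODERN`, `modernize` and `tag_text` are hypothetical, and `analyze` is a stand-in callable for whatever actually invokes MyStem.

```python
import re

# Illustrative mapping only (an assumption): the abstract does not give the
# full substitution table used by the add-on.
OLD_TO_MODERN = {
    "\u0461": "\u043e",        # omega -> modern "o"
    "\u046f": "\u043a\u0441",  # ksi -> modern "ks"
    "\u0463": "\u0435",        # yat -> modern "e"
    "\u0456": "\u0438",        # decimal i -> modern "i"
}


def modernize(word):
    """Replace outdated Cyrillic letters with modern equivalents."""
    return "".join(OLD_TO_MODERN.get(ch, ch) for ch in word)


def tag_text(text, analyze):
    """Tag each word of `text`, keeping the original spelling in the result.

    `analyze` is a hypothetical callable standing in for the morphological
    analyzer: it receives a word in modern spelling and returns its analysis.
    """
    tagged = []
    for match in re.finditer(r"\w+", text):
        original = match.group(0)
        normalized = modernize(original.lower())
        # Pair the analysis with the original word, so the outdated
        # characters reappear in the processed document.
        tagged.append((original, analyze(normalized)))
    return tagged
```

Keeping the original word next to its analysis avoids having to reverse the substitution afterwards, which matters when one outdated letter expands to two modern ones.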

Author(s):  
Daniil Filimonov ◽  
Andrey Svetlov ◽  
Oksana Gorban ◽  
Marina Kosova

The main goal of this project is to create a corpus of documents from the «Mikhailovsky stanichny ataman» archival fund. The methods of corpus linguistics seem to be the most suitable in this case, since they involve processing a large number of texts in order to solve a wide variety of linguistic problems. Our group joined the team of philologists to provide the technical and software part of the project. Our main task is to create a document corpus engine, that is, software that stores a database of marked-up texts, executes queries against this database, and provides users with a convenient interface that does not require special qualifications in information technology. However, documents must first be prepared for inclusion in the corpus: all texts must undergo special markup. There are many types of markup, and in previous publications [6; 9] our group has already described a solution to the problem of morphological tagging. This article is about meta tagging, that is, the assignment of certain descriptive attributes to a text. In the case of office documents, these are parameters such as the type of document (genre), author (compiler), addressee, and date and place of creation. Meta tagging is necessary for implementing the corpus search features, so that researchers can retrieve text samples with specified external parameters: for example, texts of a certain type, created in a certain period, addressed to a certain addressee, etc. The archives of the «Mikhailovsky stanichny ataman» fund mainly contain documents from the Chanceries of the Don Army from the mid-18th to the first third of the 19th century, so the variety of document types is limited.
Moreover, these are mostly official documents written according to certain templates, or forms, whose parameters can be extracted relatively easily through preliminary analysis. This work is also carried out by the team of philologists from VolSU under the guidance of Professor O.A. Gorban. The result of their systematization of the documents was a description of special speech markers of genre parameters for all document types in the archive. Thus, in our case, there is no need for heavy methods of statistical analysis or machine learning: it is enough to search the document for certain markers. Moreover, the main marker in all reviewed documents is a direct indication of their type; other markers are auxiliary elements of meta tagging. The paper describes the application we created for determining the type of a document and meta tagging it by searching the text for regular expressions derived from the markers.
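The marker-based approach can be illustrated with a short sketch. Everything concrete here is an assumption made for illustration: the marker words, the date pattern and the names `TYPE_MARKERS`, `DATE_MARKER` and `meta_tag` are hypothetical and are not the markers or the code described in the paper.

```python
import re

# Hypothetical genre markers (assumptions): each document type is recognized
# by a direct mention of its name somewhere in the text.
TYPE_MARKERS = {
    "report": re.compile(r"\bрапорт\w*", re.IGNORECASE),
    "order": re.compile(r"\bприказ\w*", re.IGNORECASE),
    "petition": re.compile(r"\bпрошени\w*", re.IGNORECASE),
}

# Hypothetical auxiliary marker: a date written as day, month name, year.
DATE_MARKER = re.compile(r"\b(\d{1,2})\s+(\w+)\s+(\d{4})\s+года\b")


def meta_tag(text):
    """Return a dict of meta attributes found by searching for markers."""
    doc_type = None
    for name, pattern in TYPE_MARKERS.items():
        if pattern.search(text):
            doc_type = name
            break  # the first matching type marker decides the genre
    date = DATE_MARKER.search(text)
    return {"type": doc_type, "date": date.group(0) if date else None}
```

A real marker set would come from the philologists' description of each document type; the point of the sketch is only that a plain regular-expression search is enough, with no statistical model involved.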


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Leanne Kosowan ◽  
Alan Katz ◽  
Gayle Halas ◽  
Alexander Singer

Abstract. Background: Primary care provides an opportunity to introduce prevention strategies and identify risk behaviours. Algorithmic information technology such as the Risk Factor Identification Tool (RFIT) can support primary care counseling. This study explores the integration of the tablet-based RFIT in primary care clinics to support exploration of patient risk factor information. Methods: Qualitative study to explore patients’ perspectives of RFIT. RFIT was implemented in two primary care clinics in Manitoba, Canada. There were 207 patients who completed RFIT, offered to them by eight family physicians. We conducted one-on-one patient interviews with 86 patients to capture the patient’s perspective. Responses were coded and categorized into five common themes. Results: RFIT had a completion rate of 86%. Clinic staff reported that very few patients declined the use of RFIT or required assistance to use the tablet. Patients reported that the tablet-based RFIT provided a user-friendly interface that enabled self-reflection while in the waiting room. Patients discussed the impact of RFIT on the patient-provider interaction, utility for the clinician, their concerns and suggested improvements for RFIT. Among the patients who used RFIT, 12.1% smoked, 21.2% felt their diet could be improved, 9.3% reported high alcohol consumption, 56.4% reported less than 150 min of PA a week, and 8.2% lived in poverty. Conclusion: RFIT is a user-friendly tool for the collection of patient risk behaviour information. RFIT is particularly useful for patients lacking continuity in the care they receive. Information technology can promote self-reflection while providing useful information to the primary care clinician. When combined with practical tools and resources, RFIT can assist in the reduction of risk behaviours.


2005 ◽  
Vol 38 (2) ◽  
pp. 381-388 ◽  
Author(s):  
Maria C. Burla ◽  
Rocco Caliandro ◽  
Mercedes Camalli ◽  
Benedetta Carrozzini ◽  
Giovanni L. Cascarano ◽  
...  

SIR2004 is the evolution of the SIR2002 program [Burla, Camalli, Carrozzini, Cascarano, Giacovazzo, Polidori & Spagna (2003). J. Appl. Cryst. 36, 1103]. It is devoted to the solution of crystal structures by direct and Patterson methods. Several new features implemented in SIR2004 make this program efficient: it is able to solve ab initio both small/medium-size structures as well as macromolecules (up to 2000 atoms in the asymmetric unit). In favourable circumstances, the program is also able to solve protein structures with data resolution up to 1.4–1.5 Å, and to provide interpretable electron density maps. A powerful user-friendly graphical interface is provided.


1998 ◽  
Vol 31 (6) ◽  
pp. 963-964
Author(s):  
Leonard J. Barbour ◽  
Jerry L. Atwood

RES2INS runs under the MS-DOS operating system and allows the user to view graphically the results of successive SHELX structure solution and refinement runs. In addition, the structural model can be edited in a user-friendly manner and these changes can be carried through to a new SHELX instruction file. The program is menu driven and extensive use is made of the mouse for the facilitation of operations on individual atoms.


2018 ◽  
Vol 2 (20) ◽  
pp. 2637-2645
Author(s):  
Jason Xu ◽  
Yiwen Wang ◽  
Peter Guttorp ◽  
Janis L. Abkowitz

Abstract Stochastic simulation has played an important role in understanding hematopoiesis, but implementing and interpreting mathematical models requires a strong statistical background, often preventing their use by many clinical and translational researchers. Here, we introduce a user-friendly graphical interface with capabilities for visualizing hematopoiesis as a stochastic process, applicable to a variety of mammal systems and experimental designs. We describe the visualization tool and underlying mathematical model, and then use this to simulate serial transplantations in mice, human cord blood cell expansion, and clonal hematopoiesis of indeterminate potential. The outcomes of these virtual experiments challenge previous assumptions and provide examples of the flexible range of hypotheses easily testable via the visualization tool.


2019 ◽  
Vol 34 (3) ◽  
pp. 233-241 ◽  
Author(s):  
Justin R. Blanton ◽  
Robert J. Papoular ◽  
Daniel Louër

A straightforward intuitive user-friendly compact graphical interface, PreDICT (Premier DICVOL Tool) has been developed to take full advantage of the new capabilities of the most recent version of the DICVOL14 Indexing Software. The latter, an updated version of DICVOL04, includes optimizations, e.g. for monoclinic and triclinic cases, a detailed review of the input data from the indexing solutions, cell centering tests, as well as the handling of a moderate number of impurity peaks. Among the most salient features of PreDICT, one can mention the ability (1) to use 2θ non-equistepped input 1D X-ray powder diffraction patterns as can be obtained from 2D detectors, (2) to strip laboratory data from its Kα2 contribution when present, (3) to generate 2θ equistepped output 1D X-ray powder diffraction patterns in both the “.XY” and “.GSA” formats. In addition, PreDICT allows for the following features: (1) full access to the native DICVOL14 input/output ASCII file system is retained, (2) for any selection of a DICVOL14 suggested unit cell, all predicted Bragg peaks up to a certain 2θMAX value are clearly displayed and indicated, thereby emphasizing the contribution of the unaccounted peaks (if any) to the 1D X-ray powder diffraction pattern under current investigation.


Author(s):  
Guo Jia ◽  
Yang Ming

Since safety-critical software is crucial to nuclear safety when an accident occurs, it must meet higher reliability and safety requirements than non-safety software. However, because of the complexity of a software product, ensuring its reliability and safety remains a challenging task. The paper presents the design of a platform for the safety justification of safety-critical software for nuclear power plants. A syllogism referred to as Claim, Argument and Evidence (CAE) is applied to clarify the key factors that affect software reliability and the dependencies between them. The proposed safety justification platform offers a user-friendly graphical interface that helps construct a CAE model by drag and drop. The platform could be used to argue rigorously about the various factors that may affect the reliability of a safety-critical software product during different phases of its life cycle and to establish their causal relationships. In this way, it could greatly improve its credibility and applicability and lower the uncertainties in software development and application, and it therefore has significant engineering value in ensuring and improving the quality and reliability of nuclear software products.


2010 ◽  
Vol 85 (10-12) ◽  
pp. 1957-1965
Author(s):  
Young-Seok Lee ◽  
Seok-Heun Yoon ◽  
Jung-Hoon Han

Robotica ◽  
1997 ◽  
Vol 15 (1) ◽  
pp. 99-103 ◽  
Author(s):  
Tamio Arai ◽  
Toshiyuki Itoko ◽  
Hidetoshi Yago

A graphical robot programming system has been developed. This system with a graphical interface is user-friendly and easy-to-learn for low-skill users. It has been developed as a prototype system under a project by the Japan Robot Association (JARA) since 1994. The system runs on a personal computer and consists of a graphical user interface and an editing system. It is designed for programming an arc welding robot in small batch production and is expected to provide low-skill users with a means to use industrial robots with ease.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
W. J. Pereira ◽  
F. M. Almeida ◽  
D. Conde ◽  
K. M. Balmant ◽  
P. M. Triozzi ◽  
...  

Abstract. Background: Single-cell RNA sequencing (scRNA-seq) has revolutionized the study of transcriptomes, arising as a powerful tool for discovering and characterizing cell types and their developmental trajectories. However, scRNA-seq analysis is complex, requiring a continuous, iterative process to refine the data and uncover relevant biological information. A diversity of tools has been developed to address the multiple aspects of scRNA-seq data analysis. However, an easy-to-use web application capable of conducting all critical steps of scRNA-seq data analysis is still lacking. Summary: We present Asc-Seurat, a feature-rich workbench, providing a user-friendly and easy-to-install web application encapsulating tools for an all-encompassing and fluid scRNA-seq data analysis. Asc-Seurat implements functions from the Seurat package for quality control, clustering, and differential gene expression. In addition, Asc-Seurat provides a pseudotime module containing dozens of models for trajectory inference and a functional annotation module that allows recovering gene annotation and detecting enriched gene ontology terms. We showcase Asc-Seurat’s capabilities by analyzing a peripheral blood mononuclear cell dataset. Conclusions: Asc-Seurat is a comprehensive workbench providing an accessible graphical interface for scRNA-seq analysis by biologists. Asc-Seurat significantly reduces the time and effort required to analyze and interpret the information in scRNA-seq datasets.

