textual databases
2021 ◽  
Vol 25 (109) ◽  
pp. 80-87
Author(s):  
Fredy Humberto Troncoso Espinosa ◽  
Nicolás Esteban Fernández Rozas

Georeferencing a large number of addresses first requires geocoding them through public or private systems. Geocoding is not an exact science, because addresses are generally written and stored by people, which introduces precision problems in the records such as spelling errors, superfluous data, or missing essential data. To address this problem, this article describes a methodology that cleans and corrects addresses, optimizing the geocoding process with existing systems. The methodology is developed following the Knowledge Discovery in Text (KDT) process. It is applied to a database of crime-incident addresses provided by the criminal analysis unit of the Regional Prosecutor's Office of Biobío, Chile. The results show an increase in the number of addresses geocoded by the systems implemented, which varies with the system used. Keywords: Georeferencing, Geocoding, Text Mining.
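One common way to correct misspelled street names before geocoding is approximate string matching by edit distance. The abstract does not spell out the exact cleaning steps, so the following is only a minimal sketch under stated assumptions: the function names `levenshtein`, `normalize`, and `correct_street`, the `max_distance` threshold, and the sample street names are all hypothetical and are not taken from the article.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) by dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalize(address: str) -> str:
    """Lowercase the text and collapse repeated whitespace."""
    return " ".join(address.lower().split())


def correct_street(raw, known_streets, max_distance=2):
    """Return the closest known street, or None if nothing is close enough.

    `known_streets` is an assumed reference list (a gazetteer);
    `max_distance` is an arbitrary illustrative threshold.
    """
    if not known_streets:
        return None
    cleaned = normalize(raw)
    best = min(known_streets, key=lambda s: levenshtein(cleaned, normalize(s)))
    if levenshtein(cleaned, normalize(best)) <= max_distance:
        return best
    return None


# Hypothetical sample data, for illustration only.
streets = ["Avenida Collao", "Calle Barros Arana"]
suggestion = correct_street("  avenida   colao ", streets)
```

A corrected string like `suggestion` would then be passed to whichever geocoding service is in use. The threshold trades recall against the risk of "correcting" an address to the wrong street, so it would need tuning on real data.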


Author(s):  
Vlada A. Chernysheva ◽  

This article examines the idea of inchoativity in the works of Roman grammarians. It aims to trace how the usage of the term inchoativus developed in the Roman grammatical tradition. The study is based on Latin grammatical treatises dating from the 3rd to 7th centuries A.D., most of which were published by Heinrich Keil in the second half of the 19th century. Besides Keil's edition, the article refers to recent editions of grammatical treatises. The study was conducted using three digital textual databases: Corpora Corporum, Digital Library of Latin Texts, and PHI Latin Texts. The Latin adjective inchoativus (or incohativus, a less common spelling), which literally means ‘inceptive, initial’, is attested in three meanings and is used in collocations concerning verbal tense, verbal inflection, and conjunctions respectively. The first two usages were widespread and refer to verbal categories, while the last one is attested only once. The article is divided into two parts. The first discusses collocations with types of verbal tense such as gradus ‘grade, degree’, distantia ‘distance’, differentia ‘difference’, discertio ‘difference’, species ‘aspect’ and tempus ‘tense’ itself. The second part deals with Roman grammatical categories including forma ‘form’, qualitas ‘quality’, species ‘aspect’, genus ‘voice’, and figura ‘figure’. The study concludes that the adjective inchoativus/incohativus is used with categories of tense and aspect only in the works of early grammarians including Probus, Sacerdos, Diomedes, Charisius, and Pseudo-Probus. However, these grammarians also mention this term with regard to verb forms ending in -sco. Mostly, inchoativity is bound to the Roman verbal category of forma, as can be observed in the works of Dositheus, Phocas, Eutyches, Audax, Pseudo-Victorinus, Donatus and his commentators Sergius, Servius, Pompeius, Cledonius, and Julian of Toledo, and to the category of species (Macrobius, Priscian), which is not to be confused with the species of tense mentioned above. Pseudo-Asper is the only Roman grammarian who places inchoativity in the category of figura and spells inchoativus as incohativus. If the category of forma is absent, inchoativity is reckoned to be a verbal quality (Diomedes). Inchoativity is included in the category of voice when voice is regarded as a subcategory of quality (Sacerdos, Pseudo-Probus, and Cledonius). With respect to forms ending in -sco, inchoativity is a manifestation of the so-called grammatical category of quality.


Author(s):  
Julie Stauder-Porchet

The history of Egyptian lexicography is marked by the early twentieth-century Wörterbuch der ägyptischen Sprache (Wb). The decades after the appearance of the Wb saw the publication of many specialized dictionaries of language phases and corpuses. Ancient Egyptian lexicography is conditioned by sampling issues and the discontinuous nature of the artefactual record, by cultural distance, and by the nature of the writing systems of Egyptian. Current studies in Egyptian lexicography develop a variety of approaches, ranging from the more philological to the more linguistic. Since the publication of the Wb, many new texts have been published and many new words and meanings have been attested, while the lemmatization of the Wb (i.e. the process of grouping together the inflected forms of a word so that they can be analysed as a single item, identified by the word’s lemma, or dictionary form) appears in need of partial revision. The debate as to whether to produce a new comprehensive dictionary or a general text database based on the lexicon was decided in favour of the latter in the 1990s. While digital textual databases based on the lexicon are being built, several voices call for a new dictionary, observing that a thesaurus does not replace a lexicographer’s work.


2020 ◽  
pp. 114013
Author(s):  
Maikel Lázaro Pérez Gort ◽  
Martina Olliaro ◽  
Agostino Cortesi ◽  
Claudia Feregrino Uribe

2020 ◽  
Vol 1 (2) ◽  
pp. 114-119
Author(s):  
Nur Aniq Syafiq Rodzuan ◽  
Shahreen Kasim ◽  
Mohanavali Sithambranathan ◽  
Muhammad Zaki Hassan

Textual information is presented in words and characters, which makes it easy for humans to understand. Text mining was introduced as a technology for extracting this kind of information: it is the process of extracting non-trivial patterns or knowledge from text documents or from textual databases. The purpose of this research paper is to perform and compare keyword extraction using statistical and linguistic extraction tools on 120 text documents related to hypertension and diabetes. To draw this comparison, RStudio, a statistics-based tool, and TerMine, a linguistics-based tool, are used to demonstrate the process of extracting the specified keywords from the biomedical literature. A Naïve Bayes classifier is then used to evaluate and compare the performance of the statistical and linguistic approaches implemented with these tools. The experimental results show how the two tools differ in keyword extraction.
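To make the idea of statistical keyword extraction concrete, here is a minimal TF-IDF sketch in Python. It does not reproduce the paper's RStudio or TerMine pipeline; the function names `tf_idf` and `top_keywords`, the whitespace tokenizer, and the two sample sentences are assumptions made purely for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Score each term in each document by term frequency times inverse document frequency."""
    tokenized = [doc.lower().split() for doc in docs]  # naive whitespace tokenizer
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def top_keywords(docs, k=3):
    """Return the k highest-scoring terms for each document."""
    return [
        [term for term, _ in sorted(s.items(), key=lambda kv: kv[1], reverse=True)[:k]]
        for s in tf_idf(docs)
    ]


# Hypothetical sample documents, not drawn from the study's corpus.
docs = [
    "insulin lowers blood glucose",
    "salt raises blood pressure",
]
scores = tf_idf(docs)
keywords = top_keywords(docs, k=3)
```

A term that occurs in every document, such as "blood" here, gets a score of zero and drops out of the keyword list, which is the statistical intuition behind this family of methods. A linguistic tool would instead rely on part-of-speech patterns to propose multi-word terms.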


2019 ◽  
Vol 22 (3) ◽  
Author(s):  
Dildre Georgiana Vasques ◽  
Paulo Sérgio Martins ◽  
Solange Oliveira Rezende

The discovery of knowledge in textual databases is an approach that seeks implicit relationships between different concepts in different documents written in natural language, in order to identify new useful knowledge. To assist in this process, the approach can draw on Text Mining techniques. Despite all the progress made, researchers in this area must still deal with the large number of false relationships generated by most of the available processes. A statistical and verbal semantic approach that supports the understanding of the logic between relationships may bridge this gap. Thus, the objective of this work is to support the user in identifying implicit relationships between concepts present in different texts, considering the causal relationships between concepts in the texts. To this end, this work proposes a hybrid approach for the discovery of implicit knowledge present in a text corpus, using analysis based on association rules together with metrics from complex networks and verbal semantics. In a case study, a set of texts from alternative medicine was selected, and the different extractions showed that the proposed approach facilitates the identification of implicit knowledge by the user.
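As a rough illustration of the association-rule component only, the sketch below computes support and confidence for concept pairs that co-occur in documents. It is not the authors' hybrid method: the complex-network metrics and verbal semantics are omitted, and the function name `rule_metrics` and the sample concepts are hypothetical.

```python
from collections import Counter
from itertools import combinations


def rule_metrics(documents):
    """Compute support and confidence for every directed concept pair.

    `documents` is a list of concept collections, one per document.
    For a rule x -> y:
      support    = fraction of documents containing both x and y
      confidence = documents with both x and y / documents with x
    """
    n_docs = len(documents)
    single = Counter()
    pair = Counter()
    for concepts in documents:
        unique = set(concepts)
        single.update(unique)
        pair.update(frozenset(p) for p in combinations(sorted(unique), 2))
    rules = {}
    for concepts, count in pair.items():
        a, b = sorted(concepts)
        for x, y in ((a, b), (b, a)):
            rules[(x, y)] = {
                "support": count / n_docs,
                "confidence": count / single[x],
            }
    return rules


# Hypothetical concept sets, one per document, for illustration only.
documents = [
    {"ginger", "nausea"},
    {"ginger", "nausea", "inflammation"},
    {"ginger"},
]
rules = rule_metrics(documents)
```

In this toy data the rule `("nausea", "ginger")` has confidence 1.0 while the reverse rule has confidence 2/3, which shows why the direction of a rule matters. Filtering by minimum support and confidence is one simple way to reduce spurious relationships, although the abstract suggests that such statistics alone are not sufficient.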


Author(s):  
Derek Heng

Premodern Southeast Asian history has primarily been predicated upon the exploitation of Chinese written documents. Reliance has been placed on several texts that detail Southeast Asian polities, products, and their respective societies. As indigenously generated sources of data have become available, primarily through archaeology, the trend has been to seek convergence between these two bodies of information. The availability of searchable digital databases has opened Chinese documents to the discovery of information previously unknown to historians of premodern Southeast Asia. This unutilized information has the potential to throw new light on previously held conclusions. This article argues for the exploitation of digitized Chinese textual databases, through keyword search methodologies, to expand our understanding of Southeast Asia’s past, and discusses the challenges that need to be addressed so that this new source base can be made sufficiently usable for Southeast Asian studies.

