text indexing
Recently Published Documents


TOTAL DOCUMENTS

129
(FIVE YEARS 11)

H-INDEX

17
(FIVE YEARS 1)

Author(s):  
Mariana D. A. Salgueiro ◽  
Veronica dos Santos ◽  
André L. C. Rêgo ◽  
Daniel S. Guimarães ◽  
Edward H. Haeusler ◽  
...  

Quem@PUC is a Web-based Information Retrieval System for finding researchers and professors from a list of research-related keywords. It publicizes the research and teaching activities of the PUC-Rio community to society in general. The system integrates information about professors drawn from administrative systems, course offerings, and researchers’ Lattes CVs. Data sources are converted to RDF format using domain ontologies, then stored in a NoSQL database that supports native free-text indexing on triple objects. Search results include names, academic papers, teaching activities, and contact links.
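The idea of free-text indexing on triple objects can be illustrated with a minimal sketch. It is not the Quem@PUC implementation: the triples, the `ex:` identifiers, and the function names below are hypothetical, and a plain in-memory inverted index stands in for the NoSQL database's native index.

```python
import re
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples; all names are placeholders.
TRIPLES = [
    ("ex:prof1", "ex:name", "Professor A"),
    ("ex:prof1", "ex:researchInterest", "text indexing and information retrieval"),
    ("ex:prof2", "ex:name", "Professor B"),
    ("ex:prof2", "ex:teaches", "Introduction to Information Retrieval"),
    ("ex:prof2", "ex:researchInterest", "semantic web ontologies"),
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(triples):
    """Map each token found in a triple object to the positions of those triples."""
    index = defaultdict(set)
    for i, (_subject, _predicate, obj) in enumerate(triples):
        for token in tokenize(obj):
            index[token].add(i)
    return index


def search(index, triples, query):
    """Return the subjects that match every keyword in at least one triple object."""
    result = None
    for token in tokenize(query):
        subjects = {triples[i][0] for i in index.get(token, ())}
        result = subjects if result is None else result & subjects
    return sorted(result or [])


INDEX = build_index(TRIPLES)
print(search(INDEX, TRIPLES, "information retrieval"))
print(search(INDEX, TRIPLES, "ontologies"))
```

Each keyword is resolved to a set of subjects and the sets are intersected, so a subject is returned only if all keywords appear somewhere in its triple objects.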


2021 ◽  
Vol 26 ◽  
pp. 1-26
Author(s):  
Giulia Bernardini ◽  
Huiping Chen ◽  
Gabriele Fici ◽  
Grigorios Loukides ◽  
Solon P. Pissis

We introduce the notion of reverse-safe data structures. These are data structures that prevent the reconstruction of the data they encode (i.e., they cannot be easily reversed). A data structure D is called z-reverse-safe when there exist at least z datasets with the same set of answers as the ones stored by D. The main challenge is to ensure that D stores as many answers to useful queries as possible, is constructed efficiently, and has size close to the size of the original dataset it encodes. Given a text of length n and an integer z, we propose an algorithm that constructs a z-reverse-safe data structure (z-RSDS) that has size O(n) and answers decision and counting pattern matching queries of length at most d optimally, where d is maximal for any such z-RSDS. The construction algorithm takes O(n^ω log d) time, where ω is the matrix multiplication exponent. We show that, despite the n^ω factor, our engineered implementation takes only a few minutes to finish for million-letter texts. We also show that plugging our method in data analysis applications gives insignificant or no data utility loss. Furthermore, we show how our technique can be extended to support applications under realistic adversary models. Finally, we show a z-RSDS for decision pattern matching queries, whose size can be sublinear in n. A preliminary version of this article appeared in ALENEX 2020.
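The definition of z-reverse-safety can be made concrete with a brute-force sketch on tiny inputs. This is not the authors' O(n^ω log d) construction: it enumerates every candidate text, so it runs in exponential time. It assumes that the "datasets" are texts and that the stored answers are the occurrence counts of all patterns of length at most d. The function names are hypothetical.

```python
from collections import Counter
from itertools import product


def substring_counts(text, d):
    """Occurrence counts of every substring of length 1..d (the stored answers)."""
    counts = Counter()
    for length in range(1, d + 1):
        for i in range(len(text) - length + 1):
            counts[text[i:i + length]] += 1
    return counts


def consistent_texts(text, d):
    """Count the texts whose answers to all counting queries of length <= d
    match those of `text`. For d >= 1 the single-letter counts fix both the
    length and the alphabet, so enumerating strings of the same length over
    the same alphabet is exhaustive."""
    target = substring_counts(text, d)
    alphabet = sorted(set(text))
    return sum(
        1
        for candidate in product(alphabet, repeat=len(text))
        if substring_counts("".join(candidate), d) == target
    )


def is_reverse_safe(text, d, z):
    """True if at least z texts are consistent with the answers stored for `text`."""
    return consistent_texts(text, d) >= z


print(consistent_texts("abab", 1))  # 6: every arrangement of two a's and two b's
print(consistent_texts("abab", 2))  # 1: only "abab" itself
```

For the text "abab", raising d from 1 to 2 drops the number of consistent texts from 6 to 1, which shows the trade-off the abstract describes: larger d answers more queries but leaves fewer alternative texts to hide behind.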


Author(s):  
Giulia Bernardini ◽  
Huiping Chen ◽  
Gabriele Fici ◽  
Grigorios Loukides ◽  
Solon P. Pissis

2019 ◽  
Vol 14 (2) ◽  
pp. 72
Author(s):  
Yessy Prima Putri ◽  
Ridwan Lawson

When working on a final project or undergraduate thesis (skripsi), students at STMIK Indonesia Padang very often make writing errors, both typing mistakes and errors stemming from limited knowledge of the most up-to-date spelling and word equivalents prescribed by the KBBI. Common causes are students' lack of knowledge of standard spelling, unintentional carelessness, misconfigured settings in the application used for typing (Microsoft Word, Notepad, Open Office Word), and several other factors. A thesis writing-error detection application is a solution that helps students prepare their theses and detect writing errors in thesis documents. Full Text Indexing is one method for indexing plain text that reduces storage usage and improves search performance. It is the method used to find errors in a text and serves as the main tool in the design of this application. In Full Text Indexing, two stages are carried out before word search: tokenizing and cleansing. The application was built with features for checking writing errors and for storing the bibliography and the list of figures. The application is expected to help students write their theses, particularly in checking them for writing errors.
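The two stages named above, tokenizing and cleansing, followed by a word lookup, can be sketched roughly as follows. This is an illustration under assumptions, not the application described in the abstract: the tiny `DICTIONARY` set is a hypothetical stand-in for a KBBI word list, and the function names are invented for the example.

```python
import re

# Hypothetical stand-in for a KBBI word list.
DICTIONARY = {"skripsi", "adalah", "tugas", "akhir", "mahasiswa"}


def tokenize(text):
    """Tokenizing stage: split the raw text on whitespace."""
    return text.split()


def cleanse(tokens):
    """Cleansing stage: lowercase each token and strip non-letter characters."""
    cleaned = []
    for token in tokens:
        word = re.sub(r"[^a-z]", "", token.lower())
        if word:
            cleaned.append(word)
    return cleaned


def build_index(words):
    """Map each distinct word to the list of positions where it occurs."""
    index = {}
    for position, word in enumerate(words):
        index.setdefault(word, []).append(position)
    return index


def find_errors(text, dictionary):
    """Return the indexed words that are missing from the dictionary, with positions."""
    index = build_index(cleanse(tokenize(text)))
    return {word: positions for word, positions in index.items() if word not in dictionary}


print(find_errors("Skripsi adalah tugas akir mahasiswa, skripsi!", DICTIONARY))
```

Because the index stores each distinct word once, a repeated word is checked against the dictionary a single time, and its recorded positions point back to every occurrence in the cleansed word sequence.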


Author(s):  
Arnab Ganguly ◽  
Wing-Kai Hon ◽  
Yu-An Huang ◽  
Solon P. Pissis ◽  
Rahul Shah ◽  
...  

2019 ◽  
Vol 762 ◽  
pp. 41-50 ◽  
Author(s):  
Gonzalo Navarro ◽  
Nicola Prezza