Apache Lucene
Recently Published Documents


TOTAL DOCUMENTS

15
(FIVE YEARS 1)

H-INDEX

3
(FIVE YEARS 0)

2021 ◽  
pp. 71-80
Author(s):  
Apurva Aggarwal ◽  
Ajay Kumar Kushwaha ◽  
Somil Rastogi ◽  
Sangeeta Lal ◽  
Sarishty Gupta
Keyword(s):  


2020 ◽  
Vol 7 (1/2/3) ◽  
pp. 203
Author(s):  
E. Laxmi Lydia ◽  
Sivakoti Satyanarayan ◽  
K. Vijaya Kumar ◽  
Dasari Ramya


2020 ◽  
Author(s):  
Atri Sharma
Keyword(s):  


2019 ◽  
Vol 22 (3) ◽  
Author(s):  
Evelyn Maria Aranda Acuna ◽  
Jose Luis Vazquez ◽  
Cynthia Villalba

The current process of medical terminology coding in health standards at the Hospital of Clinics of Paraguay is performed by physicians, normally interns or residents. They look up medical terminology codes in printed coding manuals or on the internet using their cell phones, a search that takes considerable time during a medical consultation. This work proposes and evaluates a user-friendly medical terminology server exposed through web services, using Apache Lucene as a search engine library and the Metathesaurus of the Unified Medical Language System (UMLS) as the information source. The server is developed for Spanish speakers. Results show that physicians can find medical terminology codes with the terminology server, using friendly or familiar terms, 18 times faster than with the current search process. The user satisfaction degree is "Good" according to the adjective rating of the System Usability Scale (SUS). In addition, a comparison with MetamorphoSys, a search engine for medical terminology, shows that the implemented terminology server is quite competitive and responds in a similar average time.
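The core retrieval step described above — matching a physician's familiar term against concept codes — can be sketched with a minimal in-memory inverted index. This is a hedged Python illustration, not the paper's actual Lucene/UMLS implementation; the term–code pairs below are invented stand-ins for Metathesaurus entries:

```python
from collections import defaultdict

# Toy stand-in for the UMLS Metathesaurus: friendly Spanish terms
# mapped to invented concept codes. The real server indexes such
# entries with Apache Lucene; a word-level inverted index conveys
# the idea of matching familiar wording to a canonical code.
TERMINOLOGY = {
    "dolor de cabeza": "C0018681",
    "cefalea": "C0018681",
    "presion alta": "C0020538",
    "hipertension": "C0020538",
}

def build_index(terminology):
    """Map each word to the set of full terms containing it."""
    index = defaultdict(set)
    for term in terminology:
        for word in term.split():
            index[word].add(term)
    return index

def search(query, index, terminology):
    """Return codes for indexed terms sharing any word with the query."""
    hits = set()
    for word in query.lower().split():
        hits |= index.get(word, set())
    return sorted({terminology[t] for t in hits})

index = build_index(TERMINOLOGY)
print(search("dolor de cabeza", index, TERMINOLOGY))  # ['C0018681']
```

A real Lucene deployment adds analyzers, fuzzy matching, and relevance ranking on top of this basic inverted-index lookup.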



2019 ◽  
Vol 6 (3) ◽  
pp. 1
Author(s):  
E.Laxmi Lydia ◽  
Sivakoti Satyanarayan ◽  
K. Vijaya Kumar ◽  
Dasari Ramya


Author(s):  
Armstrong Gomes Brito ◽  
Luiz Claudio Gomes Maia
Keyword(s):  

The growing complexity of stored objects and the large volume of data demand increasingly sophisticated retrieval and recommendation models. The goal of this work is to propose a content recommendation model based on subtitle files of movies and series. Using Apache Lucene for information retrieval and the OGMA tool for text analysis, the model was organized into three distinct stages: keyword search, classification of movies and series by genre, and identification of similar titles. An adaptation of the model is also presented that assigns a sentiment to each title, called sentiment analysis. As a result, we note that keyword search produced relevant recommendations, since it gives the user freedom to search within specific content. Genre classification achieved a 73% accuracy rate compared with the genres listed on the IMDb site, facilitating content recommendation. Sentiment analysis produced cohesive recommendations, identifying appropriate titles for each sentiment. Finally, the identification of similar titles produced only preliminary results, returning movies and series with the same theme but no results in common with IMDb. We conclude that, despite the great difficulty of being assertive in information retrieval, there are advantages in using subtitle files as part of recommendation systems. Keywords: Content recommendation. Information retrieval. Movie and series recommendation. Subtitle files. Genre classification. Apache-Lucene. OGMA. Recommendation systems. Link: http://www.periodicos.ufpb.br/ojs/index.php/itec/article/view/38189/20173
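The keyword-search and title-similarity stages described above rest on standard vector-space retrieval, the family of scoring Lucene implements. A minimal TF-IDF cosine-similarity sketch, assuming toy subtitle texts rather than the study's real subtitle corpus:

```python
import math
from collections import Counter

# Toy subtitle texts (invented); the study indexed real subtitle
# files with Apache Lucene and analyzed them with OGMA.
DOCS = {
    "movie_a": "space ship crew explores a distant planet",
    "movie_b": "the crew of the ship fights in space",
    "movie_c": "a detective solves a murder in the city",
}

def tfidf_vectors(docs):
    """Build a TF-IDF weight vector for each document."""
    tokenized = {d: text.split() for d, text in docs.items()}
    n = len(docs)
    df = Counter()
    for words in tokenized.values():
        df.update(set(words))
    return {
        d: {w: tf[w] * math.log(n / df[w]) for w in tf}
        for d, tf in ((d, Counter(ws)) for d, ws in tokenized.items())
    }

def cosine(u, v):
    """Cosine similarity between two sparse weight vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

vecs = tfidf_vectors(DOCS)
# movie_a and movie_b share space/ship/crew vocabulary, so they
# score as more similar than movie_a and movie_c.
print(cosine(vecs["movie_a"], vecs["movie_b"]) >
      cosine(vecs["movie_a"], vecs["movie_c"]))  # True
```

Lucene's default scoring is a refinement of this idea (BM25 since version 6), but the similar-titles stage reduces to exactly this kind of document-to-document comparison.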



2018 ◽  
Vol 1 (1) ◽  
pp. 399-406
Author(s):  
Ahmet Arslan ◽  
Ahmet Alkılınç ◽  
Bekir Taner Dinçer
Keyword(s):  

With the emergence of the internet and social web sites, the volume of digital data grows every day. Extracting and processing meaningful information from this large amount of data is not easy, and doing so with traditional methods and tools is cumbersome and time-consuming. In such cases, big data processing tools step in as a solution. This study describes how the most frequently occurring e-mail addresses, web addresses, and emojis were detected in half a billion web pages using Apache Lucene, a big data indexing and search library.
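The extract-and-count task described above can be sketched at toy scale with regular expressions and a frequency counter. This is a hedged Python illustration of the technique, not the study's Lucene-based pipeline; the page texts and patterns are invented, and at half a billion pages the counting would run over a Lucene index rather than a linear regex scan:

```python
import re
from collections import Counter

# Toy page texts (invented); the study scanned half a billion
# real web pages indexed with Apache Lucene.
PAGES = [
    "Contact info@example.com or visit https://example.com today",
    "Mail info@example.com; see https://other.org for details",
    "Write to admin@other.org about https://example.com",
]

# Simplified patterns for illustration; production-grade e-mail and
# URL recognition needs considerably more care.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
URL_RE = re.compile(r"https?://[^\s;,]+")

def most_common(pages, pattern, k=1):
    """Count pattern matches across all pages, most frequent first."""
    counts = Counter()
    for page in pages:
        counts.update(pattern.findall(page))
    return counts.most_common(k)

print(most_common(PAGES, EMAIL_RE))  # [('info@example.com', 2)]
print(most_common(PAGES, URL_RE))    # [('https://example.com', 2)]
```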



Author(s):  
D. Kakkar ◽  
B. Lewis

With funding from the Sloan Foundation and Harvard Dataverse, the Harvard Center for Geographic Analysis (CGA) has developed a prototype spatio-temporal visualization platform called the Billion Object Platform, or BOP. The goal of the project is to lower barriers for scholars who wish to access large, streaming, spatio-temporal datasets. The BOP is now loaded with the latest billion geo-tweets and is fed a real-time stream of about 1 million tweets per day. The geo-tweets are enriched with sentiment and census/admin boundary codes when they enter the system. The system is open source and is currently hosted on the Massachusetts Open Cloud (MOC), an OpenStack environment, with all components deployed in Docker containers orchestrated by Kontena. This paper provides an overview of the BOP architecture, which is built on an open source stack consisting of Apache Lucene, Solr, Kafka, Zookeeper, Swagger, scikit-learn, OpenLayers, and AngularJS. The paper further discusses the approach used for harvesting, enriching, streaming, storing, indexing, visualizing and querying a billion streaming geo-tweets.
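The enrich-on-ingest step described above — attaching sentiment and boundary codes to each geo-tweet before it is indexed — can be sketched as a streaming transform. This is a hedged Python illustration only: the field names, the keyword sentiment rule, and the admin-code lookup are all invented placeholders, and in the real BOP the stream comes from Kafka and lands in Solr:

```python
# Minimal sketch of an enrich-then-index step for streaming geo-tweets.
# Plain Python generators stand in for the Kafka consumer and Solr
# indexer used by the actual platform.

POSITIVE = {"great", "love", "happy"}
NEGATIVE = {"bad", "hate", "sad"}

def enrich(tweets):
    """Attach a crude keyword sentiment and a placeholder admin code."""
    for tweet in tweets:
        words = set(tweet["text"].lower().split())
        if words & POSITIVE:
            tweet["sentiment"] = "pos"
        elif words & NEGATIVE:
            tweet["sentiment"] = "neg"
        else:
            tweet["sentiment"] = "neu"
        # Placeholder for the census/admin boundary lookup, which in
        # reality is a point-in-polygon query against boundary data.
        tweet["admin_code"] = "US" if tweet["lon"] < 0 else "EU"
        yield tweet

stream = [
    {"text": "I love this city", "lon": -71.1, "lat": 42.3},
    {"text": "bad traffic today", "lon": 2.35, "lat": 48.9},
]
for doc in enrich(stream):
    print(doc["sentiment"], doc["admin_code"])
```

Structuring enrichment as a per-record transform in the stream is what lets the platform keep up with roughly a million incoming tweets per day.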



Respati ◽  
2017 ◽  
Vol 9 (27) ◽  
Author(s):  
Moammar Mohamed Abdalmjied

Today, Twitter has become not only a way of life but also a very large data source that anyone can access at any time, as long as they are connected to the internet. Twitter data is now widely used for many purposes, for example surveys, and many organizations, both governmental and non-governmental, use Twitter as a data source. In this study, an application was developed to analyze tweets posted by Libyan people, most of which are written in Arabic script. The analysis is expected to surface the topics of conversation among Libyan people on Twitter. Development was done using the Carrot2 framework, with analysis and design carried out through an object-oriented approach using UML. The application also uses several open libraries: Apache Lucene for indexing and retrieval of tweets, and Twitter4J, an API library for accessing tweets from Twitter. The resulting application was then tested with multiple queries, and the accuracy with which the system generates conversation topics was calculated from the results. With 10 queries, the system generated 197 topics, of which 185 were relevant and 12 were considered irrelevant. The accuracy of the system in generating relevant topics is therefore 93.91%.
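The reported accuracy follows directly from the relevance counts: 185 relevant topics out of 197 generated. A quick check of the arithmetic:

```python
# Verify the reported topic-relevance accuracy: 185 of 197 topics.
relevant, total = 185, 197
accuracy = 100 * relevant / total
print(round(accuracy, 2))  # 93.91
```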



Author(s):  
Mayank Aggarwal ◽  
Mani Madhukar

With the advent of the internet and computers, information technology (IT) has become a major tool for addressing medical issues. IBM Watson is one such initiative by IBM: it integrates with applications to build Internet of Things (IoT) based health applications and also assists through its existing services. Watson's strength is its data analytics and artificial intelligence. Its four variants are Watson Discovery Advisor, Oncology, Clinical Trial Matching, and Curam. It is built on the open source Apache UIMA and Apache Lucene. Its integration with the IBM Bluemix cloud, a Platform as a Service (PaaS), makes it easily available to users.


