Apache Lucene
Recently Published Documents


TOTAL DOCUMENTS

15
(FIVE YEARS 1)

H-INDEX

3
(FIVE YEARS 0)

2021 ◽  
pp. 71-80
Author(s):  
Apurva Aggarwal ◽  
Ajay Kumar Kushwaha ◽  
Somil Rastogi ◽  
Sangeeta Lal ◽  
Sarishty Gupta
Keyword(s):  


2020 ◽  
Vol 7 (1/2/3) ◽  
pp. 203
Author(s):  
E. Laxmi Lydia ◽  
Sivakoti Satyanarayan ◽  
K. Vijaya Kumar ◽  
Dasari Ramya


2020 ◽  
Author(s):  
Atri Sharma
Keyword(s):  


2019 ◽  
Vol 22 (3) ◽  
Author(s):  
Evelyn Maria Aranda Acuna ◽  
Jose Luis Vazquez ◽  
Cynthia Villalba

The current process of medical terminology coding in health standards at the Hospital of Clinics of Paraguay is performed by physicians, normally interns or residents. They look up medical terminology codes in printed coding manuals or on the internet using their cell phones, a search that takes considerable time during a medical consultation. This work proposes and evaluates a user-friendly medical terminology server exposed through web services, using Apache Lucene as a search engine library and the Metathesaurus of the Unified Medical Language System (UMLS) as the information source. The server is developed for Spanish speakers. Results show that physicians can find medical terminology codes with the terminology server, using friendly or familiar terms, 18 times faster than with the current search process. The user satisfaction degree is "Good" according to the adjective rating of the System Usability Scale (SUS). In addition, a comparison with MetamorphoSys, a search engine for medical terminology, shows that the implemented terminology server is quite competitive and responds in a similar average time.
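The core retrieval step described above — matching a physician's familiar term against concept codes — can be sketched with a minimal in-memory inverted index. This is a hedged Python illustration, not the paper's actual Lucene/UMLS implementation; the term–code pairs below are invented stand-ins for Metathesaurus entries:

```python
from collections import defaultdict

# Toy stand-in for the UMLS Metathesaurus: friendly Spanish terms
# mapped to invented concept codes. The real server indexes such
# entries with Apache Lucene; a word-level inverted index conveys
# the idea of matching familiar wording to a canonical code.
TERMINOLOGY = {
    "dolor de cabeza": "C0018681",
    "cefalea": "C0018681",
    "presion alta": "C0020538",
    "hipertension": "C0020538",
}

def build_index(terminology):
    """Map each word to the set of full terms containing it."""
    index = defaultdict(set)
    for term in terminology:
        for word in term.split():
            index[word].add(term)
    return index

def search(query, index, terminology):
    """Return codes for indexed terms sharing any word with the query."""
    hits = set()
    for word in query.lower().split():
        hits |= index.get(word, set())
    return sorted({terminology[t] for t in hits})

index = build_index(TERMINOLOGY)
print(search("dolor de cabeza", index, TERMINOLOGY))  # ['C0018681']
```

A real Lucene deployment adds analyzers, fuzzy matching, and relevance ranking on top of this basic inverted-index lookup.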



2019 ◽  
Vol 6 (3) ◽  
pp. 1
Author(s):  
E.Laxmi Lydia ◽  
Sivakoti Satyanarayan ◽  
K. Vijaya Kumar ◽  
Dasari Ramya


Author(s):  
Armstrong Gomes Brito ◽  
Luiz Claudio Gomes Maia
Keyword(s):  

The growing complexity of stored objects and the large volume of data demand increasingly sophisticated retrieval and recommendation models. The goal of this work is to propose a content recommendation model based on subtitle files of movies and series. Using Apache Lucene for information retrieval and the OGMA tool for text analysis, the model was organized into three distinct stages: keyword search, classification of movies and series by genre, and identification of similar titles. An adaptation of the model is also presented that assigns a sentiment to each title, called sentiment analysis. As a result, we note that keyword search produced relevant recommendations, since it gives the user freedom to search within specific content. Genre classification achieved a 73% accuracy rate compared with the genres listed on the IMDb site, facilitating content recommendation. Sentiment analysis produced cohesive recommendations, identifying appropriate titles for each sentiment. Finally, the identification of similar titles produced only preliminary results, returning movies and series with the same theme but no results in common with IMDb. We conclude that, despite the great difficulty of being assertive in information retrieval, there are advantages in using subtitle files as part of recommendation systems. Keywords: Content recommendation. Information retrieval. Movie and series recommendation. Subtitle files. Genre classification. Apache-Lucene. OGMA. Recommendation systems. Link: http://www.periodicos.ufpb.br/ojs/index.php/itec/article/view/38189/20173
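The keyword-search and title-similarity stages described above rest on standard vector-space retrieval, the family of scoring Lucene implements. A minimal TF-IDF cosine-similarity sketch, assuming toy subtitle texts rather than the study's real subtitle corpus:

```python
import math
from collections import Counter

# Toy subtitle texts (invented); the study indexed real subtitle
# files with Apache Lucene and analyzed them with OGMA.
DOCS = {
    "movie_a": "space ship crew explores a distant planet",
    "movie_b": "the crew of the ship fights in space",
    "movie_c": "a detective solves a murder in the city",
}

def tfidf_vectors(docs):
    """Build a TF-IDF weight vector for each document."""
    tokenized = {d: text.split() for d, text in docs.items()}
    n = len(docs)
    df = Counter()
    for words in tokenized.values():
        df.update(set(words))
    return {
        d: {w: tf[w] * math.log(n / df[w]) for w in tf}
        for d, tf in ((d, Counter(ws)) for d, ws in tokenized.items())
    }

def cosine(u, v):
    """Cosine similarity between two sparse weight vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

vecs = tfidf_vectors(DOCS)
# movie_a and movie_b share space/ship/crew vocabulary, so they
# score as more similar than movie_a and movie_c.
print(cosine(vecs["movie_a"], vecs["movie_b"]) >
      cosine(vecs["movie_a"], vecs["movie_c"]))  # True
```

Lucene's default scoring is a refinement of this idea (BM25 since version 6), but the similar-titles stage reduces to exactly this kind of document-to-document comparison.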



2018 ◽  
Vol 1 (1) ◽  
pp. 399-406
Author(s):  
Ahmet Arslan ◽  
Ahmet Alkılınç ◽  
Bekir Taner Dinçer
Keyword(s):  

With the emergence of the internet and social web sites, the volume of digital data grows every day. Extracting and processing meaningful information from this large amount of data is not easy, and doing so with traditional methods and tools is cumbersome and time-consuming. In such cases, big data processing tools step in as a solution. This study describes how the most frequently occurring e-mail addresses, web addresses, and emojis were detected in half a billion web pages using Apache Lucene, a big data indexing and search library.
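The extract-and-count task described above can be sketched at toy scale with regular expressions and a frequency counter. This is a hedged Python illustration of the technique, not the study's Lucene-based pipeline; the page texts and patterns are invented, and at half a billion pages the counting would run over a Lucene index rather than a linear regex scan:

```python
import re
from collections import Counter

# Toy page texts (invented); the study scanned half a billion
# real web pages indexed with Apache Lucene.
PAGES = [
    "Contact info@example.com or visit https://example.com today",
    "Mail info@example.com; see https://other.org for details",
    "Write to admin@other.org about https://example.com",
]

# Simplified patterns for illustration; production-grade e-mail and
# URL recognition needs considerably more care.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
URL_RE = re.compile(r"https?://[^\s;,]+")

def most_common(pages, pattern, k=1):
    """Count pattern matches across all pages, most frequent first."""
    counts = Counter()
    for page in pages:
        counts.update(pattern.findall(page))
    return counts.most_common(k)

print(most_common(PAGES, EMAIL_RE))  # [('info@example.com', 2)]
print(most_common(PAGES, URL_RE))    # [('https://example.com', 2)]
```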



Author(s):  
D. Kakkar ◽  
B. Lewis

With funding from the Sloan Foundation and Harvard Dataverse, the Harvard Center for Geographic Analysis (CGA) has developed a prototype spatio-temporal visualization platform called the Billion Object Platform, or BOP. The goal of the project is to lower barriers for scholars who wish to access large, streaming, spatio-temporal datasets. The BOP is now loaded with the latest billion geo-tweets and is fed a real-time stream of about 1 million tweets per day. The geo-tweets are enriched with sentiment and census/admin boundary codes when they enter the system. The system is open source and is currently hosted on the Massachusetts Open Cloud (MOC), an OpenStack environment, with all components deployed in Docker containers orchestrated by Kontena. This paper provides an overview of the BOP architecture, which is built on an open source stack consisting of Apache Lucene, Solr, Kafka, Zookeeper, Swagger, scikit-learn, OpenLayers, and AngularJS. The paper further discusses the approach used for harvesting, enriching, streaming, storing, indexing, visualizing and querying a billion streaming geo-tweets.
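The enrich-on-ingest step described above — attaching sentiment and boundary codes to each geo-tweet before it is indexed — can be sketched as a streaming transform. This is a hedged Python illustration only: the field names, the keyword sentiment rule, and the admin-code lookup are all invented placeholders, and in the real BOP the stream comes from Kafka and lands in Solr:

```python
# Minimal sketch of an enrich-then-index step for streaming geo-tweets.
# Plain Python generators stand in for the Kafka consumer and Solr
# indexer used by the actual platform.

POSITIVE = {"great", "love", "happy"}
NEGATIVE = {"bad", "hate", "sad"}

def enrich(tweets):
    """Attach a crude keyword sentiment and a placeholder admin code."""
    for tweet in tweets:
        words = set(tweet["text"].lower().split())
        if words & POSITIVE:
            tweet["sentiment"] = "pos"
        elif words & NEGATIVE:
            tweet["sentiment"] = "neg"
        else:
            tweet["sentiment"] = "neu"
        # Placeholder for the census/admin boundary lookup, which in
        # reality is a point-in-polygon query against boundary data.
        tweet["admin_code"] = "US" if tweet["lon"] < 0 else "EU"
        yield tweet

stream = [
    {"text": "I love this city", "lon": -71.1, "lat": 42.3},
    {"text": "bad traffic today", "lon": 2.35, "lat": 48.9},
]
for doc in enrich(stream):
    print(doc["sentiment"], doc["admin_code"])
```

Structuring enrichment as a per-record transform in the stream is what lets the platform keep up with roughly a million incoming tweets per day.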



Respati ◽  
2017 ◽  
Vol 9 (27) ◽  
Author(s):  
Moammar Mohamed Abdalmjied

Today, Twitter has become not only a way of life but also a very large data source that anyone can access at any time, as long as they are connected to the internet. Twitter data is now widely used for many purposes, for example surveys, and many organizations, both governmental and non-governmental, use Twitter as a data source. In this study, an application was developed to analyze tweets posted by Libyan people, most of which are written in Arabic script. The analysis is expected to surface the topics of conversation among Libyan people on Twitter. Development was done using the Carrot2 framework, with analysis and design carried out through an object-oriented approach using UML. The application also uses several open libraries: Apache Lucene for indexing and retrieval of tweets, and Twitter4J, an API library for accessing tweets from Twitter. The resulting application was then tested with multiple queries, and the accuracy with which the system generates conversation topics was calculated from the results. With 10 queries, the system generated 197 topics, of which 185 were relevant and 12 were considered irrelevant. The accuracy of the system in generating relevant topics is therefore 93.91%.
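The reported accuracy follows directly from the relevance counts: 185 relevant topics out of 197 generated. A quick check of the arithmetic:

```python
# Verify the reported topic-relevance accuracy: 185 of 197 topics.
relevant, total = 185, 197
accuracy = 100 * relevant / total
print(round(accuracy, 2))  # 93.91
```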



Author(s):  
Mayank Aggarwal ◽  
Mani Madhukar

With the advent of the internet and computers, information technology (IT) has become a major tool for addressing medical issues. IBM Watson is one such initiative by IBM: it integrates with applications to build Internet of Things (IoT) based health applications and also assists through its existing services. Watson's strength is its data analytics and artificial intelligence. Its four variants are Watson Discovery Advisor, Oncology, Clinical Trial Matching, and Curam. It is built on the open source Apache UIMA and Apache Lucene. Its integration with the IBM Bluemix cloud, a Platform as a Service (PaaS), makes it easily available to users.


