A figure search engine architecture for a chemistry digital library

Author(s):  
Sagnik Ray Choudhury ◽  
Suppawong Tuarob ◽  
Prasenjit Mitra ◽  
Lior Rokach ◽  
Andi Kirk ◽  
...  



AI Magazine ◽  
2015 ◽  
Vol 36 (3) ◽  
pp. 35-48 ◽  
Author(s):  
Jian Wu ◽  
Kyle Mark Williams ◽  
Hung-Hsuan Chen ◽  
Madian Khabsa ◽  
Cornelia Caragea ◽  
...  

CiteSeerX is a digital library search engine providing access to more than five million scholarly documents with nearly a million users and millions of hits per day. We present key AI technologies used in the following components: document classification and de-duplication, document and citation clustering, automatic metadata extraction and indexing, and author disambiguation. These AI technologies have been developed by CiteSeerX group members over the past 5–6 years. We show the usage status, payoff, development challenges, main design concepts, and deployment and maintenance requirements. We also present AI technologies implemented in table and algorithm search, which are special search modes in CiteSeerX. While it is challenging to rebuild a system like CiteSeerX from scratch, many of these AI technologies are transferable to other digital libraries and/or search engines.



Author(s):  
Menzo Windhouwer ◽  
Albrecht Schmidt ◽  
Roelof van Zwol ◽  
Milan Petkovic ◽  
Henk E. Blok

In this chapter the development of a specialised search engine for a digital library is described. The proposed system architecture consists of three levels: the conceptual, the logical and the physical level. The conceptual level exposes a domain-specific schema, which enables semantically rich conceptual search. The logical level provides a description language to achieve a high degree of flexibility for multimedia retrieval. The physical level takes care of scalable and efficient persistent data storage. The role played by each level changes during the various stages of a search engine’s lifecycle: (1) modeling the index, (2) populating and maintaining the index and (3) querying the index. Integrating all this functionality allows conceptual and content-based querying to be combined in the query stage. A search engine for the Australian Open tennis tournament website is used as a running example, which shows the power of the complete architecture and its various components.



2000 ◽  
Vol 09 (03) ◽  
pp. 229-254
Author(s):  
A. N. ZINCIR-HEYWOOD ◽  
M. I. HEYWOOD ◽  
C. R. CHARTWIN ◽  
T. TUNALI

A platform for performing multi-agent searches in heterogeneous digital libraries is proposed. This differs significantly from previous approaches by completely removing the concept of a centralized search engine. Specifically, the organization of information held on domain index servers is constrained to conform to a virtual tree representation based on facets and global keyword concept schema particular to the set of information providers associated with the domain of interest (e.g. preparatory intranet). Simulation studies are used to compare this platform against a digital library platform presently in use, which employs the traditional central server scheme. Improvements in terms of query service time and robustness are demonstrated.







2012 ◽  
Vol 182-183 ◽  
pp. 915-918
Author(s):  
Huai Feng Wang ◽  
Guang Yao Gao

The future search engine should understand the content of Web pages and perform logical reasoning in order to support complex searches and return correct results. This paper introduces the relevant theory of components and RDF, creates a conceptual architecture for a semantic search engine, and discusses its components and their relationships. Finally, the advantages of this architecture are shown.



CCIT Journal ◽  
2018 ◽  
Vol 11 (1) ◽  
pp. 15-25
Author(s):  
Zudha Pratama ◽  
Yans Safarid Hudha ◽  
M Lukman Prayoghi

Virtually all database-related systems provide search features, ranging from a complex search engine like Google to a relatively simple search feature on a digital library page. A good search engine delivers fast, accurate, and fault-tolerant results. Speed may be affected by server hardware capabilities and by complex combinations of algorithms. The conditions used in search queries generally rely on LIKE for partial search, REGEXP for multi-key search, and MATCH-AGAINST for multi-key search with a full-text index. However, these functions are not sufficient for matching a slightly misspelled key; that is, their fault tolerance is still poor. The researchers therefore analyze what happens when one of these search functions is combined with a dictionary table. The dictionary table serves as a comparator for finding a more appropriate key when the entered key is wrong. On the other hand, the additional comparison step is expected to increase processing time. The researchers assume that this weakness can be overcome by improving the capability of the server.
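The idea of pairing a partial-match query with a dictionary table can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes SQLite in place of MySQL, uses Python's `difflib.get_close_matches` as a stand-in for the dictionary comparison step, and the table names (`books`, `dictionary`), function names, and the 0.7 similarity cutoff are all hypothetical choices made for illustration.

```python
import difflib
import sqlite3


def correct_key(key, dictionary_words, cutoff=0.7):
    """Return the closest dictionary word to `key`, or `key` itself if none is close.

    The similarity measure (difflib ratio) and the cutoff are assumptions,
    not taken from the abstract.
    """
    matches = difflib.get_close_matches(key.lower(), dictionary_words, n=1, cutoff=cutoff)
    return matches[0] if matches else key


def search(conn, key):
    """Correct `key` against the dictionary table, then run a LIKE partial search."""
    words = [row[0] for row in conn.execute("SELECT word FROM dictionary")]
    corrected = correct_key(key, words)
    rows = conn.execute(
        "SELECT title FROM books WHERE lower(title) LIKE ? ORDER BY title",
        (f"%{corrected}%",),
    ).fetchall()
    return corrected, [row[0] for row in rows]


def build_demo_db():
    """Create a small in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE books (title TEXT)")
    conn.execute("CREATE TABLE dictionary (word TEXT)")
    conn.executemany(
        "INSERT INTO books VALUES (?)",
        [("Database Systems",), ("Chemistry Basics",), ("Search Engines",)],
    )
    conn.executemany(
        "INSERT INTO dictionary VALUES (?)",
        [("database",), ("chemistry",), ("search",)],
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    # A misspelled key is mapped to the nearest dictionary word before querying.
    print(search(demo, "databse"))
```

In this sketch the extra cost mentioned in the abstract shows up as the dictionary lookup that runs before every query; with a large dictionary that comparison step would likely dominate the processing time.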



Author(s):  
Anka Letic-Gavrilovic

In this chapter, the author demonstrates and describes a project to develop a unique database providing a multilingual information and knowledge resource for biomedical dental materials and their properties. The database will be populated with high-quality, peer-reviewed information and equipped with an original search engine that would include all the information necessary to (1) standardize therapeutic treatments; (2) understand the tissue response to biomaterials; (3) identify biomaterials and the tissue matrix environment, allowing a deeper understanding of the underlying relationships that permit more effective device design and engineering; (4) develop enabling tools through improvements in high-throughput assays and instrumentation, imaging modalities, fabrication technologies, computational modelling and bioinformatics; and (5) promote scale-up, translation and commercialisation.


