An efficient and scalable search engine for models

Author(s):  
José Antonio Hernández López ◽  
Jesús Sánchez Cuadrado

Search engines extract data from relevant sources and make them available to users via queries. A search engine typically crawls the web to gather data, analyses and indexes it, and provides some query mechanism to obtain ranked results. There exist search engines for websites, images, code, etc., but the specific properties required to build a search engine for models have not been explored much. In previous work, we presented MAR, a search engine for models designed to support a query-by-example mechanism with fast response times and improved precision over simple text search engines. The goal of MAR is to assist developers in the task of finding relevant models. In this paper, we report new developments of MAR aimed at making it a useful and stable resource for the community. We present the crawling and analysis architecture with which we have processed about 600,000 models. The indexing process is now incremental, and a new index for keyword-based search has been added. We have also added a web user interface intended to facilitate writing queries and exploring the results. Finally, we have evaluated the indexing times, the response time and the search precision using different configurations. MAR has currently indexed over 500,000 valid models of different kinds, including Ecore meta-models, BPMN diagrams, UML models and Petri nets. MAR is available at http://mar-search.org.
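
As a rough illustration of the query-by-example idea, the sketch below encodes models as bags of path strings and ranks them with BM25. The path encoding, example path format and constants here are illustrative assumptions, not MAR's exact implementation:

```python
# Toy query-by-example scoring: models are encoded as bags of path strings
# (e.g. "EClass:Library/EReference:books/EClass:Book") and ranked with BM25.
# The encoding and constants are illustrative, not MAR's actual internals.
from collections import Counter
from math import log

def bm25_score(query_paths, doc_paths, df, n_docs, avg_len, k1=1.2, b=0.75):
    """Score one indexed model (doc_paths) against the query model's paths.

    df: document frequency of each path across the index.
    """
    counts = Counter(doc_paths)
    score = 0.0
    for path in set(query_paths):
        tf = counts.get(path, 0)
        if tf == 0:
            continue  # path absent from this model
        idf = log(1 + (n_docs - df[path] + 0.5) / (df[path] + 0.5))
        score += idf * tf * (k1 + 1) / (
            tf + k1 * (1 - b + b * len(doc_paths) / avg_len))
    return score
```

Ranking a query model then amounts to computing this score against every indexed model and sorting; an incremental index only needs the document frequencies and per-model bags to be updated as new models arrive.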

2016 ◽  
Author(s):  
Paolo Corti ◽  
Benjamin G Lewis ◽  
Tom Kralidis ◽  
Jude Mwenda

A Spatial Data Infrastructure (SDI) is a framework of geospatial data, metadata, users and tools intended to provide the most efficient and flexible way to use spatial information. One of the key software components of an SDI is the catalogue service, needed to discover, query and manage the metadata. Catalogue services in an SDI are typically based on the Open Geospatial Consortium (OGC) Catalogue Service for the Web (CSW) standard, which defines common interfaces for accessing metadata. A search engine is a software system able to perform very fast and reliable searches, with features such as full-text search, natural language processing, weighted results, fuzzy matching, faceting, hit highlighting and many others. The Center for Geographic Analysis (CGA) at Harvard University is working to integrate the benefits of both worlds (OGC catalogues and search engines) within its public-domain SDI, WorldMap. Harvard Hypermap (HHypermap) is a component that will be part of WorldMap, built entirely on an open-source stack, implementing an OGC catalogue based on pycsw to provide access to metadata in a standard way, and a search engine based on Solr/Lucene to provide the advanced search features typically found in search engines.
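
A minimal sketch of the standard-interface side: querying a CSW catalogue (such as one backed by pycsw) with the OWSLib client library. The endpoint URL and search term are hypothetical; any CSW 2.0.2 service is queried the same way:

```python
# Query a CSW catalogue for metadata records using OWSLib.
# The endpoint URL below is hypothetical.
from owslib.csw import CatalogueServiceWeb
from owslib.fes import PropertyIsLike

csw = CatalogueServiceWeb('https://example.org/csw')   # hypothetical endpoint
query = PropertyIsLike('csw:AnyText', '%landsat%')     # simple full-text filter
csw.getrecords2(constraints=[query], maxrecords=10)

for rec_id, rec in csw.records.items():
    print(rec_id, '-', rec.title)
```

The search-engine side (Solr/Lucene) would sit alongside this interface to supply the faceting, weighting and fuzzy matching that CSW alone does not offer.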


Author(s):  
Pavel Šimek ◽  
Jiří Vaněk ◽  
Jan Jarolímek

The majority of Internet users use the global network to search for information with full-text search engines such as Google, Yahoo! or Seznam. Website operators try, with the help of various optimization techniques, to reach the top places in full-text search engine results. This is where Search Engine Optimization and Search Engine Marketing matter greatly, because ordinary users usually follow only the links on the first few pages of results for given keywords, and in catalogues they primarily use links placed higher in each category's hierarchy. The key to success is the application of optimization methods that address keywords, the structure and quality of content, domain names, individual pages, and the quantity and reliability of backlinks. The process is demanding, long-lasting and without a guaranteed outcome. A website operator without advanced analytical tools cannot identify the contribution of the individual documents of which the entire website consists. If website operators want an overview of their documents and of the website as a whole, it is appropriate to quantify these positions in a specific way, depending on specific keywords. This is the purpose of quantifying the competitive value of documents, which in turn yields the global competitive value of a website. Quantification of competitive values is performed on a specific full-text search engine, and the results can, and often do, differ between engines. According to published reports by the ClickZ agency and Market Share, Google is, by number of searches among English-speaking users, the most widely used search engine, with a market share of more than 80%. The overall procedure for quantifying competitive values is common to all engines; however, the initial step, the analysis of keywords, depends on the choice of full-text search engine.
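
The abstract does not reproduce the paper's formula, so the sketch below is a hypothetical position-discounted scoring of this kind: each document earns more value the higher it ranks for a keyword, and the site value aggregates over documents:

```python
# Hypothetical competitive-value scoring; the discounting scheme and
# max_rank cutoff are our assumptions, not the paper's exact formula.
def document_value(positions, max_rank=50):
    """positions: {keyword: rank of this document in the SERP (1 = top)}."""
    return sum(max(0.0, (max_rank - rank + 1) / max_rank)
               for rank in positions.values())

def site_value(doc_positions):
    """doc_positions: {url: {keyword: rank}} -> global site value."""
    return sum(document_value(p) for p in doc_positions.values())

site = {
    "example.com/a": {"precision farming": 3, "crop sensors": 12},
    "example.com/b": {"crop sensors": 7},
}
print(site_value(site))  # higher ranks for more keywords -> higher value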


2012 ◽  
Vol 532-533 ◽  
pp. 1282-1286
Author(s):  
Zhi Chao Lin ◽  
Lei Sun ◽  
Xiao Liu

A great deal of information is contained in the World Wide Web, and obtaining the required resources quickly and accurately from the web through content-based search engines has become a research focus. Most current full-text web search tools, such as Lucene, a widely used open-source retrieval library in the information retrieval field, are purely keyword based. This may not be sufficient for users searching the web. In this paper, we employ a method to overcome these limitations of current full-text search engines, taking Lucene as the representative. We propose a Query Expansion and Information Retrieval approach that helps users acquire more accurate content from the web. The Query Expansion component finds expanded candidate words for the query word through WordNet, which contains synonyms in several different senses; in the Information Retrieval component, the query word and its candidate words are used together as the input of the search module to obtain the result items. Furthermore, we can put the result items into different classes based on the expansion. Some experiments and their results are described in the latter part of this paper.
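
A minimal sketch of the WordNet expansion step described above, using NLTK's WordNet interface (a Lucene-based system would likely use a Java WordNet API instead; grouping candidates by synset mirrors the sense-based classification of results):

```python
# Sketch of WordNet-based query expansion, grouped by word sense.
# Requires: pip install nltk; then nltk.download('wordnet') once.
from nltk.corpus import wordnet as wn

def expand_query(word):
    """Return synonym candidates for `word`, keyed by WordNet sense."""
    senses = {}
    for synset in wn.synsets(word):
        lemmas = {l.name().replace('_', ' ') for l in synset.lemmas()}
        lemmas.discard(word)
        if lemmas:
            senses[synset.name()] = sorted(lemmas)
    return senses

# e.g. expand_query('car') -> {'car.n.01': ['auto', 'automobile', ...], ...}
# The query word plus these candidates are then submitted to the search
# module together, and result items can be bucketed by the matching sense.
```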


2010 ◽  
Vol 143-144 ◽  
pp. 1270-1274 ◽  
Author(s):  
Fan Zhang ◽  
Xiu Lan Feng ◽  
Jin Sheng Yuan

With the rapid growth of forestry information, general-purpose search engines, although powerful, are limited in the speed and accuracy of industry-specific search owing to the sheer volume of information. Based on the definition of vertical search engines, the Heritrix web crawler and the Lucene full-text search framework, this paper is mainly concerned with information capture, indexing and search strategies for the design of an effective forestry vertical search engine. Experiments comparing it with a general-purpose search engine demonstrate the effectiveness of the proposed method.
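
As a rough sketch of the capture-side strategy a vertical crawler applies, the filter below keeps only pages that mention enough domain terms (Heritrix itself does this through configurable decide rules; the term list and threshold here are illustrative):

```python
# Illustrative relevance filter for a topic-focused (vertical) crawl.
FORESTRY_TERMS = {"forestry", "silviculture", "timber", "afforestation"}

def is_relevant(page_text, threshold=2):
    """Keep a fetched page only if enough domain terms occur in it."""
    text = page_text.lower()
    hits = sum(1 for term in FORESTRY_TERMS if term in text)
    return hits >= threshold
```

Pages passing the filter go on to Lucene indexing, which keeps the index small and domain-specific and is what gives a vertical engine its speed and accuracy advantage.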


Author(s):  
M.J. Kim ◽  
L.C. Liu ◽  
S.H. Risbud ◽  
R.W. Carpenter

When the size of a semiconductor is reduced by an appropriate materials processing technique to a dimension less than about twice the radius of an exciton in the bulk crystal, the band-like structure of the semiconductor gives way to discrete molecular-orbital electronic states. Clusters of semiconductors in a size regime below 2R (where R is the exciton Bohr radius; e.g. 3 nm for CdS and 7.3 nm for CdTe) are called Quantum Dots (QD) because they confine optically excited electron-hole pairs (excitons) in all three spatial dimensions. Structures based on QD are of great interest because of their fast response times and non-linearity in optical switching applications. In this paper we report the first HREM analysis of the size and structure of CdTe and CdS QD formed by precipitation from a modified borosilicate glass matrix. The glass melts were quenched by pouring onto brass plates and then annealed to relieve internal stresses. QD precipitate particles were formed during subsequent "striking" heat treatments above the glass crystallization temperature, which was determined by differential thermal analysis.
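
The confinement criterion can be stated compactly with the standard effective-mass expression for the exciton Bohr radius (a textbook result, not taken from this paper):

```latex
a_B = \frac{4\pi\varepsilon\varepsilon_0\hbar^2}{\mu e^2},
\qquad
\frac{1}{\mu} = \frac{1}{m_e^*} + \frac{1}{m_h^*},
\qquad
\text{confinement for } d \lesssim 2 a_B ,
```

so, using the radii quoted above, quantum confinement sets in below particle diameters of roughly 6 nm for CdS and 14.6 nm for CdTe.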


The Analyst ◽  
2020 ◽  
Vol 145 (1) ◽  
pp. 122-131 ◽  
Author(s):  
Wanda V. Fernandez ◽  
Rocío T. Tosello ◽  
José L. Fernández

Gas diffusion electrodes based on nanoporous alumina membranes electrocatalyze hydrogen oxidation at high diffusion-limiting current densities with fast response times.


2021 ◽  
pp. 089443932110068
Author(s):  
Aleksandra Urman ◽  
Mykola Makhortykh ◽  
Roberto Ulloa

We examine how six search engines filter and rank information in relation to queries on the 2020 U.S. presidential primary elections under default, that is, nonpersonalized, conditions. For that, we utilize an algorithmic auditing methodology that uses virtual agents to conduct large-scale analysis of algorithmic information curation in a controlled environment. Specifically, we look at the text search results for the queries "us elections," "donald trump," "joe biden," and "bernie sanders" on Google, Baidu, Bing, DuckDuckGo, Yahoo, and Yandex during the 2020 primaries. Our findings indicate substantial differences in the search results between search engines and multiple discrepancies within the results generated for different agents using the same search engine. This highlights that whether users see certain information is decided by chance due to the inherent randomization of search results. We also find that some search engines prioritize different categories of information sources with respect to specific candidates. These observations demonstrate that algorithmic curation of political information can create information inequalities between search engine users even under nonpersonalized conditions. Such inequalities are particularly troubling considering that search results are highly trusted by the public and can shift the opinions of undecided voters, as demonstrated by previous research.
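
One simple way to quantify the between-agent discrepancies described above is the overlap of the top-k organic results returned to two virtual agents issuing the same query; the metric choice and example URLs below are ours, not the paper's:

```python
# Jaccard overlap of the top-k result URLs seen by two virtual agents.
def top_k_overlap(results_a, results_b, k=10):
    """1.0 means identical top-k sets; 0.0 means fully disjoint results."""
    a, b = set(results_a[:k]), set(results_b[:k])
    return len(a & b) / len(a | b) if a | b else 1.0

agent1 = ["cnn.com/x", "nytimes.com/y", "foxnews.com/z"]
agent2 = ["nytimes.com/y", "apnews.com/w", "cnn.com/x"]
print(top_k_overlap(agent1, agent2))  # 0.5: half the combined set is shared
```

Low overlap across identically configured agents is exactly the chance-driven exposure the study points to, since no personalization signal distinguishes the agents.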


Photonics ◽  
2021 ◽  
Vol 8 (4) ◽  
pp. 119
Author(s):  
Anastasiia Tukmakova ◽  
Ivan Tkhorzhevskiy ◽  
Artyom Sedinin ◽  
Aleksei Asach ◽  
Anna Novotelnova ◽  
...  

Terahertz (THz) filters and detectors find wide application in fields such as sensing, imaging, security systems, medicine, wireless communication, and detection of substances. Thermoelectric materials are a promising basis for developing THz detectors owing to their sensitivity to THz radiation: they can be heated by it and produce a voltage via the Seebeck effect. Thermoelectric thin films of Bi-Sb solid solutions are semimetals/semiconductors with a band gap comparable to THz photon energies and high thermoelectric conversion efficiency at room temperature. The detecting film surface can be patterned into a periodic frequency-selective surface (FSS) that operates as a frequency filter and increases the absorption of THz radiation. We report, for the first time, the simulation of a THz detector based on a thermoelectric Bi-Sb thin-film frequency-selective surface. We show that such a structure can act as both a detector and a frequency filter. Moreover, the FSS design increases not only the heating due to absorption but also the temperature gradient in the Bi-Sb film, by two orders of magnitude compared with continuous films. Local temperature gradients can reach values on the order of 100 K·mm−1, which opens new perspectives for increasing the efficiency of thin-film thermoelectric detectors. The temperature difference formed by THz radiation absorption can reach values on the order of 1 K. Transient calculations show the dependence of the film temperature on time, with characteristic saturation at times of around several milliseconds, which points to the prospect of fast response times for such structures.
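
To put the quoted temperature difference in perspective, the open-circuit Seebeck voltage is V = S·ΔT; assuming a room-temperature Seebeck coefficient for Bi-Sb thin films on the order of 100 µV/K (a typical literature value, not taken from this paper):

```latex
V = S\,\Delta T \approx 100~\mu\mathrm{V\,K^{-1}} \times 1~\mathrm{K}
  = 100~\mu\mathrm{V},
```

a signal level readily measurable with standard electronics, which supports the case for such structures as practical detectors.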

