Microsoft Concept Graph: Mining Semantic Concepts for Short Text Understanding

2019 ◽  
Vol 1 (3) ◽  
pp. 238-270 ◽  
Author(s):  
Lei Ji ◽  
Yujing Wang ◽  
Botian Shi ◽  
Dawei Zhang ◽  
Zhongyuan Wang ◽  
...  

Knowledge is important for text-related applications. In this paper, we introduce Microsoft Concept Graph, a knowledge graph engine that provides concept tagging APIs to facilitate the understanding of human languages. Microsoft Concept Graph is built upon Probase, a universal probabilistic taxonomy consisting of instances and concepts mined from the Web. We start by introducing the construction of the knowledge graph through iterative semantic extraction and taxonomy construction procedures, which extract 2.7 million concepts from 1.68 billion Web pages. We then use conceptualization models to represent text in the concept space to empower text-related applications, such as topic search, query recommendation, Web table understanding and Ads relevance. Since its release in 2016, Microsoft Concept Graph has received more than 100,000 pageviews, 2 million API calls and 3,000 registered downloads from 50,000 visitors across 64 countries.
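As a rough illustration of representing text in the concept space, the sketch below turns isA co-occurrence counts into a probability distribution over concepts and averages those distributions over the instances found in a short text. The counts, the function names, and the averaging step are assumptions made for this example; the abstract does not specify the conceptualization models actually used.

```python
from collections import defaultdict

# Made-up (instance, concept) co-occurrence counts, for illustration only.
ISA_COUNTS = {
    ("apple", "fruit"): 60,
    ("apple", "company"): 40,
    ("microsoft", "company"): 90,
    ("microsoft", "brand"): 10,
}


def concept_distribution(instance, counts):
    """Estimate P(concept | instance) by normalizing the instance's counts."""
    row = {c: n for (e, c), n in counts.items() if e == instance}
    total = sum(row.values())
    return {c: n / total for c, n in row.items()} if total else {}


def conceptualize(instances, counts):
    """Average the concept distributions of all known instances in a text."""
    scores = defaultdict(float)
    known = 0
    for instance in instances:
        dist = concept_distribution(instance, counts)
        if dist:
            known += 1
        for concept, prob in dist.items():
            scores[concept] += prob
    if known == 0:
        return {}
    return {concept: score / known for concept, score in scores.items()}


if __name__ == "__main__":
    ranked = sorted(
        conceptualize(["apple", "microsoft"], ISA_COUNTS).items(),
        key=lambda item: item[1],
        reverse=True,
    )
    for concept, prob in ranked:
        print(f"{concept}: {prob:.2f}")
```

With these toy counts, the ambiguous instance "apple" is pulled toward the concept "company" once "microsoft" appears in the same text, which is the kind of context-sensitive tagging a conceptualization model aims for.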

10.29007/fvc9 ◽  
2019 ◽  
Author(s):  
Gautam Kishore Shahi ◽  
Durgesh Nandini ◽  
Sushma Kumari

Schema.org creates, supports and maintains schemas for structured data on web pages. For a non-technical author, it is difficult to publish content in a structured format. This work presents an automated way of inducing Schema.org markup from the natural language context of web pages by applying knowledge base creation techniques. As a dataset, Web Data Commons was used, and the scope of the experimental part was limited to RDFa. The approach was implemented using the knowledge graph building techniques Knowledge Vault and KnowMore.


Author(s):  
Feng Xu ◽  
Yu-Jin Zhang

Content-based image retrieval (CBIR) has wide applications in public life. Either from a static image database or from the Web, one can search for a specific image, generally browse to make an interactive choice, and search for a picture to go with a broad story or to illustrate a document. Although CBIR has been well studied, it is still a challenging problem to search for images from a large image database because of the well-acknowledged semantic gap between low-level features and high-level semantic concepts. An alternative solution is to use keyword-based approaches, which usually associate images with keywords by either manually labeling or automatically extracting surrounding text from Web pages. Although such a solution is widely adopted by most existing commercial image search engines, it is not perfect. First, manual annotation, though precise, is expensive and difficult to extend to large-scale databases. Second, automatically extracted surrounding text might be incomplete and ambiguous in describing images; moreover, surrounding text may not be available in some applications. To overcome these problems, automated image annotation is considered a promising approach to understanding and describing the content of images.


2013 ◽  
Vol 7 (2) ◽  
pp. 574-579 ◽  
Author(s):  
Dr Sunitha Abburu ◽  
G. Suresh Babu

The volume of information available on the web is growing significantly day by day. Information on the web comes in several forms: structured, semi-structured and unstructured. The majority of information on the web is presented in web pages, and that information is semi-structured. However, the information required for a given context is scattered across different web documents. It is difficult to analyze the large volumes of semi-structured information presented in web pages and to make decisions based on the analysis. The current research work proposes a framework for a system that extracts information from various sources and prepares reports based on the knowledge built from the analysis. This simplifies data extraction, data consolidation, data analysis and decision making based on the information presented in web pages. The proposed framework integrates web crawling, information extraction and data mining technologies for better information analysis that helps in effective decision making. It enables people and organizations to extract information from various sources on the web and to analyze the extracted data for effective decision making. The proposed framework is applicable to any application domain; manufacturing, sales, tourism and e-learning are a few examples. The framework was implemented and tested for effectiveness, and the results are promising.


Think India ◽  
2019 ◽  
Vol 22 (2) ◽  
pp. 174-187
Author(s):  
Harmandeep Singh ◽  
Arwinder Singh

Nowadays, the internet serves people with services across many different fields. Profit as well as non-profit organizations use the internet for various business purposes. One of the major purposes is communicating financial as well as non-financial information on their respective websites. This study is conducted on the top 30 BSE-listed public sector companies to measure the extent of governance disclosure (non-financial information) on their web pages. The disclosure index approach was used to examine the extent of governance disclosure on the internet. The governance index was constructed and broadly categorized into three dimensions, i.e., organization and structure, strategy & planning and accountability, compliance, philosophy & risk management. The empirical evidence of the study reveals that all the Indian public sector companies have a website, and on average, 67% of companies disclosed some kind of governance information directly on their websites. Further, we found extreme variations in web disclosure between the three categories, i.e., the Maharatnas, the Navratnas, and the Miniratnas. However, the result of the Kruskal-Wallis test indicates that there is no significant difference between the three categories. The study provides valuable insights into the Indian economy. It finds that Indian public sector companies use the internet for governance disclosure to some extent, but the disclosure lacks symmetry because there is no regulation for web disclosure. Thus, the study recommends a regulated framework for web disclosure so that stakeholders can be assured of the transparency and reliability of the information.
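The arithmetic behind a disclosure index can be sketched as follows, assuming an unweighted checklist in which each item scores 1 if it appears on the company website and 0 otherwise. The dimension keys and checklist items are hypothetical placeholders; the abstract does not list the study's actual items or say whether any weighting was applied.

```python
# Hypothetical checklist grouped into three dimensions (placeholder names;
# the study's actual items are not given in the abstract).
CHECKLIST = {
    "dimension_1": ["board_composition", "committee_charters"],
    "dimension_2": ["vision_statement", "annual_plan"],
    "dimension_3": ["code_of_conduct", "risk_policy"],
}


def disclosure_index(disclosed, checklist):
    """Return the share of checklist items disclosed, per dimension and overall."""
    scores = {}
    total_items = 0
    total_hits = 0
    for dimension, items in checklist.items():
        hits = sum(1 for item in items if item in disclosed)
        scores[dimension] = hits / len(items)
        total_items += len(items)
        total_hits += hits
    scores["overall"] = total_hits / total_items
    return scores


if __name__ == "__main__":
    # Items found on one (made-up) company website.
    found = {"board_composition", "vision_statement", "annual_plan"}
    for name, score in disclosure_index(found, CHECKLIST).items():
        print(f"{name}: {score:.2f}")
```

Per-company index values of this kind are what a rank-based test such as Kruskal-Wallis would then compare across company categories.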


2013 ◽  
Vol 347-350 ◽  
pp. 2758-2762
Author(s):  
Zhi Juan Wang

Negative Internet information is harmful to social stability and national unity. Opinion tendency analysis can identify such negative information. Here, a method based on regular expressions is introduced that does not need complex semantic technologies. The method comprises three steps: building a bank of negative information, designing the regular expressions, and implementing the program. The results obtained verify that the method works perfectly for judging the opinion of web pages.
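The three steps named in the abstract (term bank, regular expression, program) might look roughly like the sketch below. The term list, the whole-word matching rule, and the 5% threshold are all assumptions chosen for illustration; the abstract does not give the bank's contents or the decision rule.

```python
import re

# Step 1: a hypothetical bank of negative terms (the real bank is not given).
NEGATIVE_TERMS = ["scam", "fraud", "riot"]


def build_pattern(terms):
    """Step 2: compile one case-insensitive, whole-word pattern from the bank."""
    escaped = sorted((re.escape(t) for t in terms), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def negative_score(text, pattern):
    """Step 3a: fraction of words in the text that match the negative bank."""
    words = re.findall(r"\w+", text)
    if not words:
        return 0.0
    return len(pattern.findall(text)) / len(words)


def is_negative(text, pattern, threshold=0.05):
    """Step 3b: flag a page as negative when the score reaches the threshold."""
    return negative_score(text, pattern) >= threshold


if __name__ == "__main__":
    pattern = build_pattern(NEGATIVE_TERMS)
    for page in ["This is a scam and a fraud", "A calm sunny day"]:
        print(page, "->", is_negative(page, pattern))
```

Matching on word boundaries keeps a term like "scam" from firing inside "scammer"; whether that is desirable depends on how the bank is built, so it is a design choice of this sketch rather than of the paper.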


2021 ◽  
pp. 1-11
Author(s):  
Zhinan Gou ◽  
Yan Li

With the development of Web 2.0 communities, information retrieval based on collaborative tagging systems has been widely applied. However, a user's query is often brief, with only one or two keywords, which leads to a series of problems such as inaccurate query words, information overload and information disorientation. Query expansion addresses this issue by reformulating each search query with additional words. By analyzing the limitations of existing query expansion methods in folksonomy, this paper proposes a novel query expansion method, based on user profile and topic model, for search in folksonomy. In detail, the topic model is first constructed by a variational autoencoder with Word2Vec. Then, query expansion is conducted using the user profile and the topic model. Finally, the proposed method is evaluated on a real dataset. Evaluation results show that the proposed method outperforms the baseline methods.
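A much simplified picture of profile-aware query expansion is sketched below. It replaces the paper's topic model with hand-written toy embeddings and ranks candidate terms by a blend of cosine similarity to the query and the user's profile weight. The vectors, the profile, the blending parameter `alpha`, and the function names are all assumptions for illustration and are not taken from the paper.

```python
import math

# Toy 3-dimensional term vectors standing in for learned representations.
EMBEDDINGS = {
    "python": (0.9, 0.1, 0.0),
    "programming": (0.8, 0.2, 0.0),
    "snake": (0.5, 0.0, 0.8),
    "reptile": (0.1, 0.0, 0.9),
    "recipe": (0.0, 0.9, 0.1),
}

# Hypothetical user profile: how strongly the user has used each tag.
USER_PROFILE = {"programming": 1.0, "reptile": 0.2}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(query_terms, embeddings, profile, k=2, alpha=0.5):
    """Append the k candidates with the best blend of similarity and profile."""
    scored = []
    for cand in embeddings:
        if cand in query_terms:
            continue
        sim = max(
            (cosine(embeddings[q], embeddings[cand])
             for q in query_terms if q in embeddings),
            default=0.0,
        )
        score = (1 - alpha) * sim + alpha * profile.get(cand, 0.0)
        scored.append((score, cand))
    scored.sort(reverse=True)
    return list(query_terms) + [cand for _, cand in scored[:k]]


if __name__ == "__main__":
    print(expand_query(["python"], EMBEDDINGS, USER_PROFILE))
```

The same one-word query expands differently for different users: a profile dominated by "reptile" would push that term ahead of "programming", which is the personalization effect the abstract attributes to the user profile.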


2014 ◽  
Vol 30 (4) ◽  
pp. 15-17 ◽  

Purpose – This paper aims to review the latest management developments across the globe and pinpoint practical implications from cutting-edge research and case studies. Design/methodology/approach – This briefing is prepared by an independent writer who adds their own impartial comments and places the articles in context. Findings – Becoming increasingly reliant on the web as a principal source of finding information is altering our brains and the way that we obtain and hold knowledge. We are becoming less reliant on our memories to hold knowledge, instead using technology – and search engines like Google in particular – to deposit and retrieve information. Practical implications – The paper provides strategic insights and practical thinking that have influenced some of the world's leading organizations. Social implications – The paper provides strategic insights and practical thinking that can have a broader social impact. Originality/value – The briefing saves busy executives and researchers hours of reading time by selecting only the very best, most pertinent information and presenting it in a condensed and easy-to-digest format.


Author(s):  
Carmen Domínguez-Falcón ◽  
Domingo Verano-Tacoronte ◽  
Marta Suárez-Fuentes

Purpose – The strong regulation of the Spanish pharmaceutical sector encourages pharmacies to modify their business model, giving the customer a more relevant role by integrating 2.0 tools. However, the study of the implementation of these tools is still quite limited, especially in terms of customer-oriented web page design. This paper aims to analyze the online presence of Spanish community pharmacies by studying the profile of their web pages to classify them by their degree of customer orientation. Design/methodology/approach – In total, 710 community pharmacies were analyzed, of which 160 had web pages. Using items drawn from the literature, content analysis was performed to evaluate the presence of these items on the web pages. Then, after analyzing the scores on the items, a cluster analysis was conducted to classify the pharmacies according to the degree of development of their online customer orientation strategy. Findings – The number of pharmacies with a web page is quite low. The development of these websites is limited, and they have a more informational than relational role. The statistical analysis allows the pharmacies to be classified into four groups according to their level of development. Practical implications – Pharmacists should make incremental use of their websites to facilitate real two-way communication with customers and other stakeholders and to maintain a relationship with them by incorporating Web 2.0 and social media (SM) platforms. Originality/value – This study analyzes, from a marketing perspective, the degree of Web 2.0 adoption and the characteristics of the websites in terms of aiding communication and interaction with customers in the Spanish pharmaceutical sector.

