Microsoft Concept Graph: Mining Semantic Concepts for Short Text Understanding

Knowledge is important for text-related applications. In this paper, we introduce Microsoft Concept Graph, a knowledge graph engine that provides concept tagging APIs to facilitate the understanding of human languages. Microsoft Concept Graph is built upon Probase, a universal probabilistic taxonomy consisting of instances and concepts mined from the Web. We start by introducing the construction of the knowledge graph through iterative semantic extraction and taxonomy construction procedures, which extract 2.7 million concepts from 1.68 billion Web pages. We then use conceptualization models to represent text in the concept space to empower text-related applications, such as topic search, query recommendation, Web table understanding and Ads relevance. Since its release in 2016, Microsoft Concept Graph has received more than 100,000 pageviews, 2 million API calls and 3,000 registered downloads from 50,000 visitors across 64 countries.
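To make the idea of representing text in a concept space concrete, here is a minimal sketch in Python. It assumes a tiny hand-written table of isA counts (`ISA_COUNTS`, with invented numbers) and a naive averaging rule. Neither is taken from Probase or from the Microsoft Concept Graph API, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical isA co-occurrence counts: (instance, concept) -> count.
# The numbers are invented for illustration, not taken from Probase.
ISA_COUNTS = {
    ("apple", "fruit"): 60,
    ("apple", "company"): 40,
    ("microsoft", "company"): 90,
    ("microsoft", "brand"): 10,
}


def concept_distribution(instance, counts=ISA_COUNTS):
    """Estimate P(concept | instance) from relative frequencies."""
    matches = {c: n for (i, c), n in counts.items() if i == instance}
    total = sum(matches.values())
    return {c: n / total for c, n in matches.items()} if total else {}


def conceptualize(terms, counts=ISA_COUNTS):
    """Average the per-term concept distributions of a short text.

    Terms missing from the table are skipped. This averaging rule is an
    assumption made for the sketch, not the paper's model.
    """
    scores = defaultdict(float)
    known = 0
    for term in terms:
        dist = concept_distribution(term.lower(), counts)
        if not dist:
            continue
        known += 1
        for concept, p in dist.items():
            scores[concept] += p
    return {c: s / known for c, s in scores.items()} if known else {}
```

With these toy counts, the ambiguous term "apple" leans toward "fruit" on its own, while the pair "apple microsoft" leans toward "company", because the second term supplies disambiguating context.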

Inducing Schema.org markup from Natural Language Context

10.29007/fvc9 ◽

2019 ◽

Author(s):

Gautam Kishore Shahi ◽

Durgesh Nandini ◽

Sushma Kumari

Keyword(s):

Natural Language ◽

Knowledge Base ◽

Structured Data ◽

Knowledge Graph ◽

Web Pages ◽

Web Data ◽

Experimental Part ◽

Language Context ◽

Data Commons ◽

The Web

Schema.org creates, supports, and maintains schemas for structured data on web pages. For a non-technical author, it is difficult to publish content in a structured format. This work presents an automated way of inducing Schema.org markup from the natural language context of web pages by applying knowledge base creation techniques. Web Data Commons was used as the dataset, and the scope of the experimental part was limited to RDFa. The approach was implemented using two knowledge graph building techniques: Knowledge Vault and KnowMore.

Probability Association Approach in Automatic Image Annotation

Handbook of Research on Public Information Technology ◽

10.4018/978-1-59904-857-4.ch056 ◽

2008 ◽

pp. 615-626

Author(s):

Feng Xu ◽

Yu-Jin Zhang

Keyword(s):

Large Scale ◽

Image Annotation ◽

Public Life ◽

Image Database ◽

Web Pages ◽

Challenging Problem ◽

Semantic Concepts ◽

Automated Image Annotation ◽

High Level ◽

The Web

Content-based image retrieval (CBIR) has wide applications in public life. Either from a static image database or from the Web, one can search for a specific image, browse to make an interactive choice, or search for a picture to go with a broad story or to illustrate a document. Although CBIR has been well studied, searching for images in a large image database remains a challenging problem because of the well-acknowledged semantic gap between low-level features and high-level semantic concepts. An alternative solution is to use keyword-based approaches, which usually associate images with keywords either by manual labeling or by automatically extracting surrounding text from Web pages. Although such a solution is adopted by most existing commercial image search engines, it is not perfect. First, manual annotation, though precise, is expensive and difficult to extend to large-scale databases. Second, automatically extracted surrounding text might be incomplete and ambiguous in describing images, and surrounding text may not be available at all in some applications. To overcome these problems, automated image annotation is considered a promising approach to understanding and describing the content of images.

A FRAME WORK FOR WEB INFORMATION EXTRACTION AND ANALYSIS

INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY ◽

10.24297/ijct.v7i2.3459 ◽

2013 ◽

Vol 7 (2) ◽

pp. 574-579 ◽

Cited By ~ 3

Author(s):

Dr Sunitha Abburu ◽

G. Suresh Babu

Keyword(s):

Information Extraction ◽

Data Extraction ◽

Research Work ◽

Web Pages ◽

Web Documents ◽

E Learning ◽

Structured Information ◽

Frame Work ◽

Effective Decision ◽

The Web

Day by day, the volume of information available on the web is growing significantly. Information on the web comes in several forms: structured, semi-structured, and unstructured. The majority of information on the web is presented in web pages, and that information is semi-structured. However, the information required for a given context is scattered across different web documents. It is difficult to analyze the large volumes of semi-structured information presented in web pages and to make decisions based on the analysis. The current research work proposes a framework for a system that extracts information from various sources and prepares reports based on the knowledge built from the analysis. This simplifies data extraction, data consolidation, data analysis, and decision making based on the information presented in web pages. The proposed framework integrates web crawling, information extraction, and data mining technologies for better information analysis that supports effective decision making. It enables people and organizations to extract information from various web sources and to analyze the extracted data effectively. The proposed framework is applicable to any application domain; manufacturing, sales, tourism, and e-learning are a few examples. The framework was implemented and tested for effectiveness, and the results are promising.

Governance Disclosure on the Internet by Leading Indian Public Sector Companies

Think India ◽

10.26643/think-india.v22i2.8716 ◽

2019 ◽

Vol 22 (2) ◽

pp. 174-187

Author(s):

Harmandeep Singh ◽

Arwinder Singh

Keyword(s):

Public Sector ◽

Financial Information ◽

Three Dimensions ◽

The Internet ◽

Web Pages ◽

Non Profit ◽

Index Approach ◽

Significant Difference ◽

Governance Disclosure ◽

The Web

Nowadays, the internet serves people with services across many fields. Profit and non-profit organizations alike use the internet for various business purposes. One of the major uses is communicating financial as well as non-financial information on their respective websites. This study examines the top 30 BSE-listed public sector companies to measure the extent of governance disclosure (non-financial information) on their web pages. A disclosure index approach was used to examine the extent of governance disclosure on the internet. The governance index was constructed and broadly categorized into three dimensions: organization and structure; strategy and planning; and accountability, compliance, philosophy, and risk management. The empirical evidence reveals that all the Indian public sector companies have a website and that, on average, 67% of companies disclosed some kind of governance information directly on their websites. Further, we found extreme variations in web disclosure among the three categories, i.e., the Maharatnas, the Navratnas, and the Miniratnas. However, the Kruskal-Wallis test indicates no significant difference among the three categories. The study provides valuable insights into the Indian economy. It finds that Indian public sector companies use the internet for governance disclosure to some extent but lack symmetry in disclosure, because there is no regulation for web disclosure. The study therefore recommends a regulated framework for web disclosure so that stakeholders can be assured of the transparency and reliability of the information.
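A disclosure index of this kind is commonly computed as the share of checklist items a company discloses. The sketch below assumes an unweighted index and a made-up four-item checklist; the item names and the scoring rule are illustrative assumptions, not the instrument used in the study.

```python
def disclosure_index(items):
    """Return the percentage of checklist items that are disclosed.

    `items` maps a checklist item name to True (disclosed on the website)
    or False (not disclosed). Every item carries equal weight, which is an
    assumption of this sketch.
    """
    if not items:
        raise ValueError("checklist must contain at least one item")
    disclosed = sum(1 for present in items.values() if present)
    return 100.0 * disclosed / len(items)


# Hypothetical checklist for one company; the item names are invented.
example_company = {
    "board_composition": True,
    "risk_management_policy": False,
    "code_of_conduct": True,
    "audit_committee_charter": True,
}
example_score = disclosure_index(example_company)
```

Scores computed this way for each company can then be grouped by category and compared with a rank-based test such as Kruskal-Wallis.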

Using Knowledge Graph and Search Query Click Logs in Statistical Language Model for Speech Recognition

10.21437/interspeech.2017-1790 ◽

2017 ◽

Author(s):

Weiwu Zhu

Keyword(s):

Speech Recognition ◽

Language Model ◽

Knowledge Graph ◽

Search Query ◽

Statistical Language Model

A Method of Opinion Tendency Analyzing Based on Regular Expression

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.347-350.2758 ◽

2013 ◽

Vol 347-350 ◽

pp. 2758-2762

Author(s):

Zhi Juan Wang

Keyword(s):

Regular Expression ◽

Negative Information ◽

Web Pages ◽

Social Stability ◽

National Unity ◽

Internet Information ◽

Information Bank ◽

The Web

Negative Internet information is harmful to social stability and national unity. Opinion tendency analysis can identify negative Internet information. Here, a method based on regular expressions is introduced that does not need complex semantic technologies. The method consists of building a negative information bank, designing the regular expressions, and implementing the program. The results obtained verify that the method works perfectly in judging the opinion of web pages.
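The three steps named in the abstract (term bank, regular expression, program) can be sketched as follows. The bank entries, the whole-word matching rule, and the hit threshold are all assumptions made for illustration; the abstract does not specify them.

```python
import re

# Hypothetical "negative information bank"; the entries are placeholders.
NEGATIVE_BANK = ["scam", "fraud", "riot"]


def build_pattern(bank):
    """Compile one case-insensitive regex matching any bank entry as a whole word."""
    alternatives = "|".join(re.escape(term) for term in bank)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def judge_page(text, pattern, threshold=2):
    """Label a page 'negative' when it has at least `threshold` matches.

    The threshold is an assumed decision rule, not one from the paper.
    Returns the label together with the matched strings.
    """
    hits = pattern.findall(text)
    label = "negative" if len(hits) >= threshold else "neutral"
    return label, hits
```

Escaping each entry with `re.escape` keeps punctuation in the bank from being read as regex syntax, and the `\b` anchors stop "scam" from matching inside "scammer".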

Composite analysis of web pages in adaptive environment through Modified Salp Swarm algorithm to rank the web pages

Journal of Ambient Intelligence and Humanized Computing ◽

10.1007/s12652-021-03033-y ◽

2021 ◽

Author(s):

E. Manohar ◽

E. Anandha Banu ◽

D. Shalini Punithavathani

Keyword(s):

Web Pages ◽

Composite Analysis ◽

Salp Swarm Algorithm ◽

Adaptive Environment ◽

Swarm Algorithm ◽

The Web

A method of query expansion based on topic models and user profile for search in folksonomy

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-210508 ◽

2021 ◽

pp. 1-11

Author(s):

Zhinan Gou ◽

Yan Li

Keyword(s):

Information Retrieval ◽

Query Expansion ◽

Information Overload ◽

Topic Model ◽

User Profile ◽

Expansion Method ◽

Collaborative Tagging ◽

Search Query ◽

Tagging System ◽

The Web

With the development of Web 2.0 communities, information retrieval based on collaborative tagging systems has been widely applied. However, users often issue brief queries with only one or two keywords, which leads to a series of problems such as inaccurate query words, information overload, and information disorientation. Query expansion addresses this issue by reformulating each search query with additional words. After analyzing the limitations of existing query expansion methods in folksonomy, this paper proposes a novel query expansion method, based on user profile and topic model, for search in folksonomy. In detail, a topic model is first constructed by a variational autoencoder with Word2Vec. Then, query expansion is conducted using the user profile and the topic model. Finally, the proposed method is evaluated on a real dataset. Evaluation results show that the proposed method outperforms the baseline methods.
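As a rough illustration of combining a topic model with a user profile, the sketch below replaces the paper's variational autoencoder and Word2Vec components with two small hand-written tables. `TOPIC_WORDS`, `USER_PROFILE`, and the scoring rule are hypothetical and only show the general shape of profile-aware expansion.

```python
# Hypothetical topic model: topic -> {word: weight}. A real system would
# learn this; these values are invented for illustration.
TOPIC_WORDS = {
    "programming": {"python": 0.5, "code": 0.3, "library": 0.2},
    "zoology": {"python": 0.4, "snake": 0.4, "reptile": 0.2},
}

# Hypothetical user profile: how strongly the user prefers each topic.
USER_PROFILE = {"programming": 0.8, "zoology": 0.2}


def expand_query(query, topic_words=TOPIC_WORDS, profile=USER_PROFILE, k=2):
    """Append up to k expansion words to the query terms.

    A candidate word is scored by summing, over topics, the user's interest
    in the topic times the topic's relevance to the query times the word's
    weight in the topic. This rule is an assumption of the sketch.
    """
    terms = query.lower().split()
    scores = {}
    for topic, words in topic_words.items():
        relevance = sum(words.get(term, 0.0) for term in terms)
        if relevance == 0.0:
            continue  # topic shares no word with the query
        interest = profile.get(topic, 0.0)
        for word, weight in words.items():
            if word in terms:
                continue  # never re-add an original query term
            scores[word] = scores.get(word, 0.0) + interest * relevance * weight
    best = sorted(scores, key=lambda w: (-scores[w], w))[:k]
    return terms + best
```

For the one-word query "python", a profile weighted toward programming yields "code" and "library", while a profile weighted toward zoology yields "snake" and "reptile". This is the kind of disambiguation a user profile is meant to supply for brief queries.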

The Google Knowledge Graph

Strategic Direction ◽

10.1108/sd-04-2014-0049 ◽

2014 ◽

Vol 30 (4) ◽

pp. 15-17 ◽

Cited By ~ 1

Keyword(s):

Case Studies ◽

Design Methodology ◽

Reading Time ◽

Social Impact ◽

Knowledge Graph ◽

Content Type ◽

Principal Source ◽

Pertinent Information ◽

Practical Implications ◽

The Web

Purpose – This paper aims to review the latest management developments across the globe and pinpoint practical implications from cutting-edge research and case studies. Design/methodology/approach – This briefing is prepared by an independent writer who adds their own impartial comments and places the articles in context. Findings – Becoming increasingly reliant on the web as a principal source of finding information is altering our brains and the way that we obtain and hold knowledge. We are becoming less reliant on our memories to hold knowledge, instead using technology – and search engines like Google in particular – to deposit and retrieve information. Practical implications – The paper provides strategic insights and practical thinking that have influenced some of the world's leading organizations. Social implications – The paper provides strategic insights and practical thinking that can have a broader social impact. Originality/value – The briefing saves busy executives and researchers hours of reading time by selecting only the very best, most pertinent information and presenting it in a condensed and easy-to-digest format.

Exploring the customer orientation of Spanish pharmacy websites

International Journal of Pharmaceutical and Healthcare Marketing ◽

10.1108/ijphm-04-2018-0025 ◽

2018 ◽

Vol 12 (4) ◽

pp. 447-462 ◽

Cited By ~ 2

Author(s):

Carmen Domínguez-Falcón ◽

Domingo Verano-Tacoronte ◽

Marta Suárez-Fuentes

Keyword(s):

Web 2.0 ◽

Customer Orientation ◽

Community Pharmacies ◽

Web Pages ◽

Web Page ◽

Pharmaceutical Sector ◽

Content Type ◽

Web Page Design ◽

Page Design ◽

The Web

Purpose: The strong regulation of the Spanish pharmaceutical sector encourages pharmacies to modify their business model, giving the customer a more relevant role by integrating 2.0 tools. However, the study of the implementation of these tools is still quite limited, especially in terms of customer-oriented web page design. This paper aims to analyze the online presence of Spanish community pharmacies by studying the profile of their web pages to classify them by their degree of customer orientation. Design/methodology/approach: In total, 710 community pharmacies were analyzed, of which 160 had web pages. Using items drawn from the literature, content analysis was performed to evaluate the presence of these items on the web pages. Then, after analyzing the scores on the items, a cluster analysis was conducted to classify the pharmacies according to the degree of development of their online customer orientation strategy. Findings: The number of pharmacies with a web page is quite low. The development of these websites is limited, and they have a more informational than relational role. The statistical analysis classifies the pharmacies into four groups according to their level of development. Practical implications: Pharmacists should make incremental use of their websites to facilitate real two-way communication with customers and other stakeholders and to maintain a relationship with them by incorporating Web 2.0 and social media (SM) platforms. Originality/value: This study analyzes, from a marketing perspective, the degree of Web 2.0 adoption and the characteristics of the websites in terms of aiding communication and interaction with customers in the Spanish pharmaceutical sector.
