Low-Quality Error Detection for Noisy Knowledge Graphs

2021 ◽  
Vol 32 (4) ◽  
pp. 48-64
Author(s):  
*Chenyang Bu ◽  
Xingchen Yu ◽  
Yan Hong ◽  
Tingting Jiang

The automatic construction of knowledge graphs (KGs) from multiple data sources has received increasing attention. Automatic construction inevitably introduces considerable noise, especially when KGs are built from unstructured text. The noise in a KG can be divided into two categories: factual noise and low-quality noise. Factual noise refers to plausible but incorrect triples that still satisfy the ontology constraints. For example, the plausible triple <New_York, IsCapitalOf, America> satisfies the constraints that the head entity “New_York” is a city and the tail entity “America” is a country. Low-quality noise denotes the obvious errors commonly created in information extraction processes. This study focuses on one kind of low-quality noise: entity type errors. Most existing approaches concentrate on refining an existing KG, assuming that the type information of most entities or the ontology information in the KG is known in advance. However, such methods may not be suitable at the start of a KG's construction. Therefore, the authors propose a framework to eliminate entity type errors. The experimental results demonstrate the effectiveness of the proposed method.
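The difference between the two noise categories can be illustrated with a minimal sketch. This is not the authors' framework: the type table, the relation signature, and the function name `satisfies_type_constraints` are all hypothetical, chosen only to mirror the <New_York, IsCapitalOf, America> example from the abstract.

```python
# Hypothetical entity types and relation signature (assumptions, not from the paper).
ENTITY_TYPES = {"New_York": "City", "America": "Country"}
RELATION_SIGNATURES = {"IsCapitalOf": ("City", "Country")}  # (head type, tail type)


def satisfies_type_constraints(head, relation, tail):
    """Return True if the head and tail types match the relation's expected types."""
    signature = RELATION_SIGNATURES.get(relation)
    if signature is None:
        return False  # unknown relation: no constraint to satisfy
    head_type, tail_type = signature
    return ENTITY_TYPES.get(head) == head_type and ENTITY_TYPES.get(tail) == tail_type


# Factual noise: the triple is false, yet it passes the type check.
factual_noise_passes = satisfies_type_constraints("New_York", "IsCapitalOf", "America")

# Entity type error: head and tail are swapped, so the type check fails.
type_error_passes = satisfies_type_constraints("America", "IsCapitalOf", "New_York")
```

A check of this kind depends on the type table and relation signatures being available, which is the assumption the abstract says may not hold at the start of a KG's construction.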

Organization ◽  
2017 ◽  
Vol 25 (3) ◽  
pp. 374-400 ◽  
Author(s):  
Vivien Blanchet

Although categories are important cogs of market dynamics, their construction process has been largely overlooked to date. Drawing on Actor–Network Theory, the article tackles this issue by redefining categorisation as a translation process that transforms multiplicity into unity through inscriptions. This process sheds light on the very practices of categorising, the devices involved and their agency. Combining multiple data sources, the article describes how organisers and exhibitors at a trade fair use visual inscriptions such as pictures and movies, logos and maps, catalogues and fashion parades to define ethical fashion, make compromises between ethics and aesthetics, and project a fashionable image of the nascent category. This offers new insights into the construction of markets by breaking down the performative process of categorisation and revealing the visual mediations involved.


2021 ◽  
pp. 1-22
Author(s):  
Emily Berg ◽  
Jongho Im ◽  
Zhengyuan Zhu ◽  
Colin Lewis-Beck ◽  
Jie Li

Statistical and administrative agencies often collect information on related parameters. Discrepancies between estimates from distinct data sources can arise due to differences in definitions, reference periods, and data collection protocols. Integrating statistical data with administrative data is appealing for saving data collection costs, reducing respondent burden, and improving the coherence of estimates produced by statistical and administrative agencies. Model-based techniques for combining multiple data sources, such as small area estimation and measurement error models, offer transparency, reproducibility, and the ability to provide an estimate of uncertainty. Issues associated with integrating statistical data with administrative data are discussed in the context of data from Namibia. The national statistical agency in Namibia produces estimates of crop area using data from probability samples. Simultaneously, the Namibia Ministry of Agriculture, Water, and Forestry obtains crop area estimates through extension programs. We illustrate the use of a structural measurement error model to synthesize the administrative and survey data into a unified estimate of crop area. Limitations on the available data preclude us from conducting a genuine, thorough application. Nonetheless, our illustration of the methodology may be useful to a general practitioner.


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Tomas Kos

Although foreign language instruction in mixed-age (M-A) classrooms is gaining popularity (Heizmann, Ries, and Wicki 2015; Lau, Juby-Smith, and Desbiens 2017; Shahid Kazi, Moghal, and Aziz 2018; Thurn 2011), research on it is scarce. Drawing from multiple data sources, this study investigated to what extent peer interactions among M-A and same-age (S-A) pairs aid L2 development and how students perceive their interactions. In this study, the same learners (N=24), aged between 10 and 12, interacted with same-age and different-age partners during regular classroom lessons in two EFL classrooms. The results suggest that both S-A and M-A peer interactions aided L2 development. Although S-A pairs outperformed M-A pairs on the post-test, the difference is not statistically significant. The analysis of students’ perceptions revealed that the majority of students prefer working in S-A pairs to M-A pairs. In addition to age/proficiency differences, factors such as students’ relationships and perceptions of one’s own and partner’s proficiency greatly impact how they interact with one another.


Author(s):  
Lijing Wang ◽  
Aniruddha Adiga ◽  
Srinivasan Venkatramanan ◽  
Jiangzhuo Chen ◽  
Bryan Lewis ◽  
...  

Omega ◽  
2021 ◽  
pp. 102479
Author(s):  
Zhongbao Zhou ◽  
Meng Gao ◽  
Helu Xiao ◽  
Rui Wang ◽  
Wenbin Liu

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Jin Chen ◽  
Tianyuan Chen ◽  
Yifei Song ◽  
Bin Hao ◽  
Ling Ma

Prior literature emphasizes the distinct roles of differently affiliated venture capitalists (VCs) in nurturing innovation and entrepreneurship. Although China has become the second largest VC market in the world, the unavailability of high-quality datasets on VC affiliation in China’s market hinders such research efforts. To fill this gap, we compiled a new panel dataset of VC affiliation in China’s market from multiple data sources. Specifically, we drew on a list of 6,553 VCs that invested in China between 2000 and 2016 from the CVSource database, collected each VC’s shareholder information from public sources, and developed a multi-stage procedure to label each VC as one of the following types: GVC (public agency-affiliated, state-owned enterprise-affiliated), CVC (corporate VC), IVC (independent VC), BVC (bank-affiliated VC), FVC (financial/non-bank-affiliated VC), UVC (university endowment/spin-out unit), and PenVC (pension-affiliated VC). We also denoted whether a VC has a foreign background. This dataset helps researchers conduct more nuanced investigations into the investment behaviors of different VCs and their distinct impacts on innovation and entrepreneurship in China’s context.

