Data Harmonization for Heterogeneous Datasets: A Systematic Literature Review

2021 ◽  
Vol 11 (17) ◽  
pp. 8275
Author(s):  
Ganesh Kumar ◽  
Shuib Basri ◽  
Abdullahi Abubakar Imam ◽  
Sunder Ali Khowaja ◽  
Luiz Fernando Capretz ◽  
...  

As data size increases drastically, its variety also increases. Investigating such heterogeneous data is one of the most challenging tasks in information management and data analytics. The heterogeneity and decentralization of data sources affect data visualization and prediction, thereby influencing analytical results. Data harmonization (DH) is the field concerned with unifying the representation of such disparate data. Over the years, multiple solutions have been developed to minimize heterogeneity and disparity in the formats of big-data types. In this study, a systematic review of the literature was conducted to assess state-of-the-art DH techniques. This study aimed to understand the issues faced due to heterogeneity, the need for DH, and the techniques that deal with substantial heterogeneous textual datasets. The search produced 1355 articles, of which only 70 were found to be relevant after applying inclusion and exclusion criteria. The results show that the heterogeneity of structured, semi-structured, and unstructured (SSU) data can be managed by using DH and its core techniques, such as text preprocessing, natural language processing (NLP), machine learning (ML), and deep learning (DL). These techniques are applied to many real-world applications centered on the information-retrieval domain. Several assessment criteria were used to measure the efficiency of these techniques, such as precision, recall, F1 score, accuracy, and time. A detailed explanation of each research question, common techniques, and performance measures is also provided. Lastly, we present readers with a detailed discussion of the existing work, contributions, and managerial and academic implications, along with the conclusion, limitations, and future research directions.
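The abstract above names precision, recall, and F1 among its assessment criteria. As a minimal sketch of how these are commonly computed for a retrieval-style task, the following uses set overlap between retrieved and relevant items. The function name `precision_recall_f1` and the sample identifiers are illustrative assumptions, not taken from the reviewed paper.

```python
def precision_recall_f1(predicted, relevant):
    """Return (precision, recall, F1) for predicted items against relevant items.

    Hypothetical helper for illustration; both arguments are iterables of IDs.
    """
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)

    # Precision: share of predicted items that are actually relevant.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of relevant items that were predicted.
    recall = true_positives / len(relevant) if relevant else 0.0

    # F1: harmonic mean of precision and recall (0 when both are 0).
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up example: 4 records retrieved, 3 truly relevant, 2 in common.
    p, r, f = precision_recall_f1({"a", "b", "c", "d"}, {"a", "b", "e"})
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Empty inputs return zero instead of raising a division error, which is one common convention; other tools report such cases as undefined.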

Author(s):  
Marco Mesiti ◽  
Ernesto Jiménez Ruiz ◽  
Ismael Sanz ◽  
Rafael Berlanga Llavori ◽  
Giorgio Valentini ◽  
...  

A growing number of research and industrial organizations produce huge amounts of biological data from experimentation with biological systems. In order to make these heterogeneous data sources easy to use, several data integration efforts are currently being undertaken, based mainly on XML. Starting from a discussion of the main biological data types and system interactions that need to be represented, the authors deal with the main approaches proposed for their modelling through XML. Then, they show the current efforts in biological data integration and how an increasing amount of semantic information is required in terms of vocabulary control and ontologies. Finally, future research directions in biological data integration are discussed.


Author(s):  
Fred Luthans ◽  
Carolyn M. Youssef

Over the years, both management practitioners and academics have generally assumed that positive workplaces lead to desired outcomes. Unlike psychology, considerable attention has also been devoted to the study of positive topics such as job satisfaction and organizational commitment. However, to place a scientifically based focus on the role that positivity may play in the development and performance of human resources, and largely stimulated by the positive psychology initiative, positive organizational behavior (POB) and psychological capital (PsyCap) have recently been introduced into the management literature. This chapter first provides an overview of both the historical and contemporary positive approaches to the workplace. Then, more specific attention is given to the meaning and domain of POB and PsyCap. Our definition of POB includes positive psychological capacities or resources that can be validly measured, developed, and have performance impact. The constructs that have been determined so far to best meet these criteria are efficacy, hope, optimism, and resiliency. When combined, they have been demonstrated to form the core construct of what we term psychological capital (PsyCap). A measure of PsyCap is being validated and this chapter references the increasing number of studies indicating that PsyCap can be developed and have performance impact. The chapter concludes with important future research directions that can help better understand and build positive workplaces to meet current and looming challenges.


2010 ◽  
Vol 21 (3) ◽  
pp. 223-237 ◽  
Author(s):  
Alberto Sa Vinhas ◽  
Sharmila Chatterjee ◽  
Shantanu Dutta ◽  
Adam Fein ◽  
Joseph Lajos ◽  
...  

2015 ◽  
Vol 11 (3) ◽  
pp. 427-440 ◽  
Author(s):  
Chris Marquis ◽  
Susan E. Jackson ◽  
Yuan Li

As China shifts its development model from focusing on economic growth at all costs to a model in which economic growth is balanced with solving pressing societal and environmental problems, there is an increasing need for management research on building sustainable organizations in China. This collection of papers focuses attention on the role of business in promoting sustainable economic development, highlighting a number of key processes including: the factors that foster transparency and CSR reporting, how stakeholders can influence corporations to abandon their CSR commitments, the benefits of environmental branding and labeling, and the antecedents and performance consequences of proactive environmental strategies. In this introductory essay we reflect on recent trends in sustainability research in China, and to encourage this important movement, provide recommendations for future research directions.


1998 ◽  
Vol 24 (1) ◽  
pp. 21-42 ◽  
Author(s):  
T. K. Das ◽  
Bing-Sheng Teng

Resource-based and risk-based views of strategic alliances have not been adequately reflected in the literature. This paper identifies four types of critical resources that the partners bring to an alliance: financial, technological, physical, and managerial resources. It also suggests two basic types of risk in strategic alliances: relational risk and performance risk. The alliance-making process is examined in terms of the interactive effects of resource and risk on the orientations and objectives of the prospective alliance partners. Managerial implications are discussed and future research directions indicated in the form of propositions for empirical testing.


2013 ◽  
Vol 30 (1) ◽  
pp. 76-105 ◽  
Author(s):  
Sylvester O. Orimaye ◽  
Saadat M. Alhashmi ◽  
Eu-Gene Siew

This paper presents trends and performance of opinion retrieval techniques proposed within the last 8 years. We identify major techniques in opinion retrieval and group them into four popular categories. We describe the state-of-the-art techniques for each category and emphasize their performance and limitations. We then summarize with a performance comparison table for the techniques on different datasets. Finally, we highlight possible future research directions that can help solve existing challenges in opinion retrieval.


2021 ◽  
Vol 17 (8) ◽  
pp. e1009283
Author(s):  
Tomasz Konopka ◽  
Sandra Ng ◽  
Damian Smedley

Integrating reference datasets (e.g. from high-throughput experiments) with unstructured and manually-assembled information (e.g. notes or comments from individual researchers) has the potential to tailor bioinformatic analyses to specific needs and to lead to new insights. However, developing bespoke analysis pipelines from scratch is time-consuming, and general tools for exploring such heterogeneous data are not available. We argue that by treating all data as text, a knowledge-base can accommodate a range of bioinformatic data types and applications. We show that a database coupled to nearest-neighbor algorithms can address common tasks such as gene-set analysis as well as specific tasks such as ontology translation. We further show that a mathematical transformation motivated by diffusion can be effective for exploration across heterogeneous datasets. Diffusion enables the knowledge-base to begin with a sparse query, impute more features, and find matches that would otherwise remain hidden. This can be used, for example, to map multi-modal queries consisting of gene symbols and phenotypes to descriptions of diseases. Diffusion also enables user-driven learning: when the knowledge-base cannot provide satisfactory search results in the first instance, users can improve the results in real-time by adding domain-specific knowledge. User-driven learning has implications for data management, integration, and curation.
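To make the idea in the abstract above concrete, here is a toy sketch of text-based nearest-neighbor search in which a sparse query is enriched before matching. It is not the authors' transformation: as a stated simplification, it uses one step of token co-occurrence spreading as a stand-in for diffusion, with cosine similarity for ranking. All function names, the `alpha` parameter, and the three sample records are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def build_cooccurrence(docs):
    """Map each token to normalized counts of tokens sharing a record with it."""
    counts = defaultdict(Counter)
    for text in docs.values():
        tokens = set(tokenize(text))
        for t in tokens:
            for u in tokens:
                if u != t:
                    counts[t][u] += 1
    return {
        t: {u: c / sum(neigh.values()) for u, c in neigh.items()}
        for t, neigh in counts.items()
    }


def diffuse(query_vec, cooc, alpha=0.5):
    """Spread a fraction alpha of each query weight onto co-occurring tokens."""
    out = dict(query_vec)
    for t, w in query_vec.items():
        for u, p in cooc.get(t, {}).items():
            out[u] = out.get(u, 0.0) + alpha * w * p
    return out


def cosine(a, b):
    """Cosine similarity between two sparse token-weight mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def search(query, docs, cooc=None, alpha=0.5):
    """Rank records by similarity to the query; enrich the query if cooc is given."""
    q = dict(Counter(tokenize(query)))
    if cooc is not None:
        q = diffuse(q, cooc, alpha)
    scored = [(cosine(q, Counter(tokenize(text))), doc_id)
              for doc_id, text in docs.items()]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Made-up records mixing gene symbols and phenotype-like terms.
    docs = {
        "d1": "brca1 breast cancer",
        "d2": "breast carcinoma tumor",
        "d3": "tp53 lung cancer",
    }
    print(search("brca1", docs))                            # plain matching
    print(search("brca1", docs, build_cooccurrence(docs)))  # enriched query
```

With plain matching, the query `brca1` scores zero against record `d2` because they share no token. After enrichment the query also carries weight on `breast` and `cancer`, so `d2` receives a nonzero score. This mirrors, in a very reduced form, the abstract's point that imputing features can surface matches that would otherwise remain hidden.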


Nanophotonics ◽  
2019 ◽  
Vol 8 (5) ◽  
pp. 747-769 ◽  
Author(s):  
Henrik Mäntynen ◽  
Nicklas Anttu ◽  
Zhipei Sun ◽  
Harri Lipsanen

Single-photon sources are one of the key components in quantum photonics applications. These sources ideally emit a single photon at a time, are highly efficient, and could be integrated in photonic circuits for complex quantum system designs. Various platforms to realize such sources have been actively studied, among which semiconductor quantum dots have been found to be particularly attractive. Furthermore, quantum dots embedded in bottom-up-grown III–V compound semiconductor nanowires have been found to exhibit relatively high performance as well as beneficial flexibility in fabrication and integration. Here, we review fabrication and performance of these nanowire-based quantum sources and compare them to quantum dots in top-down-fabricated designs. The state of the art in single-photon sources with quantum dots in nanowires is discussed. We also present current challenges and possible future research directions.

