Uncertainty-aware Visual Analytics - Scope, Opportunities and Challenges

Author(s):  
Robin Georg Claus Maack ◽  
Gerik Scheuermann ◽  
Hans Hagen ◽  
Jose Tiberio Hernández Peñaloza ◽  
Christina Gillmann

Abstract In many applications, Visual Analytics (VA) has developed into a standard tool to ease data access and knowledge generation. Unfortunately, many data sources used in the VA process are affected by uncertainty. In addition, the VA cycle itself can introduce uncertainty into the knowledge generation process. The classic VA cycle does not provide a mechanism to handle these sources of uncertainty. In this manuscript, we aim to provide an extended VA cycle that is capable of handling uncertainty by quantification, propagation, and visualization. We describe the data types and application scenarios that can be handled by such a cycle, give examples, and provide a list of open challenges in the area of uncertainty-aware VA.
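As a rough illustration of the quantification, propagation, and visualization steps (not the paper's actual method; the measured value, its standard deviation, and the analysis function are all made up), uncertainty can be pushed through an analysis step by Monte Carlo sampling:

```python
import math
import random
import statistics as stats

random.seed(42)

# Quantification: model the uncertain input as a distribution
# (here a Gaussian centred on the measured value; values are illustrative).
samples = [random.gauss(3.0, 0.2) for _ in range(20_000)]

def analyze(x):
    """A hypothetical analysis step of the VA cycle."""
    return math.log1p(x ** 2)

# Propagation: push every sample through the analysis step so the
# output uncertainty reflects the input uncertainty.
result = [analyze(x) for x in samples]

# Visualization input: summarise the propagated distribution so a view
# can encode both the estimate and its uncertainty (e.g. error bands).
estimate = stats.fmean(result)
spread = stats.stdev(result)
print(f"estimate={estimate:.3f} +/- {spread:.3f}")
```

A real uncertainty-aware pipeline would carry such distributions through every stage of the cycle rather than a single step.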

2016 ◽  
Vol 55 (02) ◽  
pp. 107-113 ◽  
Author(s):  
M. Löpprich ◽  
C. Karmen ◽  
M. Ganzinger ◽  
M. Gietzelt

Summary Background: Systems medicine is a new approach to the development and selection of treatment strategies for patients with complex diseases. It is often referred to as the application of systems biology methods to decision making in patient care. For systems medicine computer applications, many different data sources have to be integrated and included in models. This is a challenging task for Medical Informatics, since the approach exceeds traditional systems like Electronic Health Records. To prioritize research activities for systems medicine applications, it is necessary to get an overview of the modelling methods and data sources already used in this field. Objectives: We performed a systematic literature review with the objective of capturing the current use of 1) modelling methods and 2) data sources in systems medicine related research projects. Methods: We queried the MEDLINE and ScienceDirect databases for papers associated with the search term systems medicine and related terms. Papers were screened and assessed in full text in a two-step process according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement guidelines. Results: The queries returned 698 articles, of which 34 papers were finally included in the study. A multitude of modelling approaches, such as machine learning and network analysis, was identified and classified. Since these approaches are also used in other domains, no methods specific to systems medicine could be identified. Omics data are the most widely used data types, followed by clinical data. Most studies include only a rather limited number of data sources. Conclusions: Currently, many different modelling approaches are used in systems medicine. Thus, highly flexible modular solutions are necessary for systems medicine clinical applications. However, the number of data sources included in the models is limited, and most projects currently focus on prognosis. To leverage the potential of systems medicine further, it will be necessary to focus on treatment strategies for patients and to consider a broader range of data.


2005 ◽  
Vol 9 (4) ◽  
pp. 51-63 ◽  
Author(s):  
Anurag Mishra ◽  
M. Akbar

Literature on medium-sized enterprises (MSEs) is limited in both developed and emerging markets. This paper addresses this gap and explores MSEs from a knowledge-based perspective. Grounded in case-based research on ten MSEs, the paper identifies the knowledge assets employed by highly successful firms. The paper performs a detailed case analysis of three such firms from our sample. We trace the knowledge generation process through a detailed line diagram and, based on the case analysis, build a generic model for analyzing the knowledge conversion process in MSEs. The contribution of this work is articulated in the process model that integrates the various classes of knowledge assets in the context of transitional firms in India. The paper also develops a few empirically testable propositions, filling a major gap in the existing literature on knowledge management.


Author(s):  
Sabrina T. Wong ◽  
Julia M. Langton ◽  
Alan Katz ◽  
Martin Fortin ◽  
Marshall Godwin ◽  
...  

Abstract Aim: To describe the process by which the 12 community-based primary health care (CBPHC) research teams worked together and fostered cross-jurisdictional collaboration, including collection of common indicators with the goal of using the same measures and data sources. Background: A pan-Canadian mechanism for common measurement of the impact of primary care innovations across Canada is lacking. The Canadian Institutes for Health Research and its partners funded 12 teams to conduct research and collaborate on development of a set of commonly collected indicators. Methods: A working group representing the 12 teams was established. They undertook an iterative process to consider existing primary care indicators identified from the literature and by stakeholders. Indicators were agreed upon with the intention of addressing three objectives across the 12 teams: (1) describing the impact of improving access to CBPHC; (2) examining the impact of alternative models of chronic disease prevention and management in CBPHC; and (3) describing the structures and context that influence the implementation, delivery, cost, and potential for scale-up of CBPHC innovations. Findings: Nineteen common indicators within the core dimensions of primary care were identified: access, comprehensiveness, coordination, effectiveness, and equity. We also agreed to collect data on health care costs and utilization within each team. Data sources include surveys, health administrative data, interviews, focus groups, and case studies. Collaboration across these teams sets the foundation for a unique opportunity for new knowledge generation, over and above any knowledge developed by any one team. Keys to success are each team's willingness to engage and commitment to working across teams, funding to support this collaboration, and distributed leadership across the working group. Reaching consensus on collection of common indicators is challenging but achievable.


Author(s):  
Jon Hael Simon Brenas ◽  
Mohammad S. Al-Manir ◽  
Kate Zinszer ◽  
Christopher J. Baker ◽  
Arash Shaban-Nejad

Objective: Malaria is one of the top causes of death in Africa and some other regions of the world. Data-driven surveillance activities are essential for enabling timely interventions to alleviate the impact of the disease and eventually eliminate malaria. Improving the interoperability of data sources through the use of shared semantics is a key consideration when designing surveillance systems, which must be robust in the face of dynamic changes to one or more components of a distributed infrastructure. Here we introduce a semantic framework to improve the interoperability of malaria surveillance systems (SIEMA).

Introduction: In 2015, there were 212 million new cases of malaria and about 429,000 malaria deaths worldwide. African countries accounted for almost 90% of global cases of malaria and 92% of malaria deaths. Currently, malaria data are scattered across different countries, laboratories, and organizations in heterogeneous data formats and repositories. The diversity of access methodologies makes it difficult to retrieve relevant data in a timely manner. Moreover, the lack of rich metadata limits the reusability of data and its integration. The current process of discovering, accessing and reusing the data is inefficient and error-prone, profoundly hindering surveillance efforts. As our knowledge about malaria and appropriate preventive measures becomes more comprehensive, malaria data management systems, data collection standards, and data stewardship are certain to change regularly. Collectively, these changes will make it more difficult to perform accurate data analytics or achieve reliable estimates of important metrics, such as infection rates. Consequently, there is a critical need to rapidly re-assess the integrity of the data and knowledge infrastructures that experts depend on to support their surveillance tasks.

Methods: To address the challenge of heterogeneity of malaria data sources, we recruit domain-specific ontologies in the field (e.g. IDOMAL (1)) that define a shared lexicon of concepts and relations. These ontologies are expressed in the standard Web Ontology Language (OWL). To overcome challenges in accessing distributed data resources, we have adopted the Semantic Automated Discovery and Integration framework (SADI) (2) to ensure interoperability. SADI provides a way to describe services that provide access to data, detailing the inputs and outputs of services and a functional description. Existing ontology terms are used when building SADI service descriptions. The services can be discovered by querying a registry and combined into complex workflows. Users can issue queries in SPARQL syntax to a query engine, which can plan complex workflows to fetch the actual data, without having to know how the target data is structured or where it is located. To tackle changes in the target data sources, the ontologies or the service definitions, we created a Dashboard (3) that can report any changes. The Dashboard reuses existing tools to perform a series of checks; these tools compare versions of ontologies and databases, allowing the Dashboard to report the changes. Once a change has been identified, a series of recommendations can be made, e.g. services can be retired or updated so that data access can continue.

Results: We used the Mosquito Insecticide Resistance Ontology (MIRO) (4) to define the common lexicon for our data sources and queries. The sources we created are CSV files that use the IRbase (4) schema. With the data defined in this way, we specified several SPARQL queries and the SADI services needed to answer them. These services were designed to enable access to data separated into different files using different formats. To showcase the capabilities of our Dashboard, we also modified parts of the service definitions, of the ontology and of the data sources, which allowed us to test our change detection capabilities. Once changes were detected, we manually updated the services to comply with the revised ontology and data sources and checked that the changes we proposed yielded services that gave the right answers. In the future, we plan to make the updating of the services automatic.

Conclusions: Making the relevant information accessible to a surveillance expert in a seamless way is critical in tackling and ultimately eliminating malaria. To achieve this, we used existing ontologies and semantic web services to increase the interoperability of the various sources. Since the data as well as the ontologies are likely to change frequently, we also designed a tool that detects and identifies changes and updates the services so that the whole surveillance system becomes more resilient.

References:
1. P. Topalis, E. Mitraka, V. Dritsou, E. Dialynas and C. Louis, "IDOMAL: the malaria ontology revisited", Journal of Biomedical Semantics, vol. 4, no. 1, p. 16, Sep 2013.
2. M. D. Wilkinson, B. Vandervalk and L. McCarthy, "The Semantic Automated Discovery and Integration (SADI) web service design-pattern, API and reference implementation", Journal of Biomedical Semantics, vol. 2, no. 1, p. 8, 2011.
3. J. H. Brenas, M. S. Al-Manir, C. J. O. Baker and A. Shaban-Nejad, "Change management dashboard for the SIEMA global surveillance infrastructure", International Semantic Web Conference, 2017.
4. E. Dialynas, P. Topalis, J. Vontas and C. Louis, "MIRO and IRbase: IT Tools for the Epidemiological Monitoring of Insecticide Resistance in Mosquito Disease Vectors", PLOS Neglected Tropical Diseases, 2009.
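The SADI-style discovery and workflow planning described above can be sketched minimally: services are annotated with the ontology class they consume and the one they produce, and a planner chains them without the user knowing where the data lives. The service names and class IRIs below are hypothetical placeholders, not actual MIRO or IDOMAL terms:

```python
# A toy registry of SADI-style service descriptions. Each entry names the
# ontology class a service consumes ("input") and produces ("output").
SERVICES = [
    {"name": "getCollectionSites", "input": ":Region", "output": ":CollectionSite"},
    {"name": "getBioassays", "input": ":CollectionSite", "output": ":Bioassay"},
    {"name": "getResistanceRate", "input": ":Bioassay", "output": ":ResistanceRate"},
]

def plan(start, goal):
    """Chain services by matching each service's output class to the next
    service's input class, mimicking workflow planning over a registry."""
    chain, current = [], start
    while current != goal:
        nxt = next((s for s in SERVICES if s["input"] == current), None)
        if nxt is None:
            return None  # no registered service can advance the workflow
        chain.append(nxt["name"])
        current = nxt["output"]
    return chain

# "From a region, fetch insecticide resistance rates."
workflow = plan(":Region", ":ResistanceRate")
print(workflow)
```

In the real framework the registry is queried with SPARQL and the classes are OWL IRIs; this sketch only shows why shared class identifiers make services discoverable and composable.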


2021 ◽  
Vol 2 (2) ◽  
pp. 95-103
Author(s):  
Trie Nadia Ayu Lizara ◽  
Timbul Simangunsong

This study examines the influence of the implementation of the corporate annual e-SPT, understanding of taxation, and tax awareness on the compliance of corporate taxpayers in filing their annual corporate tax returns. The study uses quantitative data; the data source is primary data, obtained from questionnaires distributed at random to corporate taxpayers, using the purposive sampling method, at the West Jakarta Middle Tax Office. A total of 100 questionnaires were distributed. The results indicate that the implementation of the corporate annual e-SPT and taxpayer awareness each have a significant effect on taxpayer compliance, while understanding of taxation has no significant effect on taxpayer compliance.


2021 ◽  
Author(s):  
Ekaterina Chuprikova ◽  
Abraham Mejia Aguilar ◽  
Roberto Monsorno

Increasing agricultural production challenges, such as climate change, environmental concerns, energy demands, and growing expectations from consumers, have triggered the need for innovation using data-driven approaches such as visual analytics. Although the visual analytics concept was introduced more than a decade ago, the latest developments in data mining capacities have made it possible to fully exploit the potential of this approach and gain insights into high-complexity datasets (multi-source, multi-scale, and at different stages). The current study focuses on developing a prototypical visual analytics system for an apple variety testing program in South Tyrol, Italy. Thus, the work aims (1) to establish a visual analytics interface able to integrate and harmonize information about apple variety testing and its interaction with climate by designing a semantic model; and (2) to create a single visual analytics user interface that can turn the data into knowledge for domain experts.

This study extends the visual analytics approach with a structured way of organizing data (ontologies), data mining, and visualization techniques to retrieve knowledge from an extensive collection of apple variety testing and environmental data. The prototype stands on three main components: ontology, data analysis, and data visualization. Ontologies provide a representation of expert knowledge and create standard concepts for data integration, opening the possibility of sharing knowledge using a unified terminology and allowing for inference. Building upon relevant semantic models (e.g., the agri-food experiment ontology, the plant trait ontology, GeoSPARQL), we propose to extend them based on the apple variety testing and climate data. Data integration and harmonization through an ontology-based model provides a framework for integrating relevant concepts and the relationships between them, linking data sources from different repositories, and defining a precise specification for knowledge retrieval. Besides, as variety testing is performed at different locations, the geospatial component can enrich the analysis with spatial properties. Furthermore, the visual narratives designed within this study will give a better integrated view of the relations between data entities and of meaningful patterns and clusterings based on semantic concepts.

Therefore, the proposed approach is designed to improve decision-making about variety management through an interactive visual analytics system that can answer "what" and "why" questions about fruit-growing activities. Thus, the prototype has the potential to go beyond traditional ways of organizing data by creating an advanced information system able to manage heterogeneous data sources and to provide a framework for more collaborative scientific data analysis. This study unites various interdisciplinary aspects, in particular Big Data analytics in the agricultural sector and visual methods; the findings will thus contribute to the EU priority program on digital transformation in the European agricultural sector.

This project has received funding from the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No 894215.
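The ontology-based linking of variety testing and climate data can be illustrated with a toy triple store and a pattern query. The predicates, site names, and values below are invented for the example and are not the project's actual schema:

```python
# A minimal triple store: facts about varieties, trial sites, and climate,
# expressed as (subject, predicate, object) triples under a shared vocabulary.
triples = {
    ("Gala", "testedAt", "SiteA"), ("Fuji", "testedAt", "SiteB"),
    ("SiteA", "meanTempC", 14.2), ("SiteB", "meanTempC", 11.7),
    ("Gala", "hasTrait", "scab-resistant"),
}

def query(subject=None, predicate=None, obj=None):
    """Match triples against a pattern; None acts as a wildcard,
    like variables in a SPARQL basic graph pattern."""
    return [t for t in triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)]

# "Which varieties were tested at sites warmer than 12 degrees C on average?"
# The shared vocabulary is what lets variety data and climate data combine.
warm_sites = {s for s, _, temp in query(predicate="meanTempC") if temp > 12}
varieties = sorted(v for v, _, site in query(predicate="testedAt") if site in warm_sites)
print(varieties)
```

A production system would store such triples in an RDF store and express the query in (Geo)SPARQL; the sketch only shows how a unified terminology makes the cross-dataset join trivial.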


2021 ◽  
Author(s):  
Benjamin Moreno-Torres ◽  
Christoph Völker ◽  
Sabine Kruschwitz

Non-destructive testing (NDT) data in civil engineering is regularly used for scientific analysis. However, there is no uniform representation of the data yet. An analysis of distributed data sets across different test objects is therefore too difficult in most cases.

To overcome this, we present an approach for integrated data management of distributed data sets based on Semantic Web technologies. The cornerstone of this approach is an ontology, a semantic knowledge representation of our domain. This NDT-CE ontology is later populated with the data sources. Using the properties and the relationships between concepts that the ontology contains, we make these data sets meaningful also for machines. Furthermore, the ontology can be used as a central interface for database access. Non-domain data sources can be integrated by linking them with the NDT ontology, making them directly available for generic use in terms of digitization. Based on extensive literature research, we outline the possibilities that result for NDT in civil engineering, such as computer-aided sorting and analysis of measurement data, and the recognition and explanation of correlations.

A common knowledge representation and data access allows the scientific exploitation of existing data sources with data-based methods (such as image recognition, measurement uncertainty calculations, factor analysis or material characterization) and simplifies bidirectional knowledge and data transfer between engineers and NDT specialists.
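The role of the ontology as a central interface for database access can be sketched as a mapping from heterogeneous source schemas to shared concepts. The field names, concept identifier, and data sets below are hypothetical, not the actual NDT-CE ontology:

```python
# Two NDT data sources with different schemas, each field annotated with
# the shared ontology concept it instantiates (identifiers are made up).
DATASETS = {
    "lab_a.csv": {"schema": {"ultrasonic_velocity": "ndt:PulseVelocity"},
                  "rows": [{"ultrasonic_velocity": 4120.0}]},
    "site_b.db": {"schema": {"upv_m_per_s": "ndt:PulseVelocity"},
                  "rows": [{"upv_m_per_s": 3980.0}]},
}

def values_for(concept):
    """Resolve an ontology concept to the concrete fields of every source,
    so callers query by meaning rather than by column name."""
    out = []
    for source in DATASETS.values():
        fields = [f for f, c in source["schema"].items() if c == concept]
        out += [row[f] for row in source["rows"] for f in fields]
    return out

# One query over both sources, despite their incompatible column names.
print(values_for("ndt:PulseVelocity"))
```

This is the payoff of a populated ontology: analyses (sorting, correlation search, factor analysis) run against concepts, and new sources join by adding schema annotations rather than by rewriting the analysis code.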


2019 ◽  
pp. 254-277 ◽  
Author(s):  
Ying Zhang ◽  
Chaopeng Li ◽  
Na Chen ◽  
Shaowen Liu ◽  
Liming Du ◽  
...  

Since large amounts of geospatial data are produced by various sources, geospatial data integration is difficult because of the shortage of semantics. Although standardised data formats and data access protocols, such as Web Feature Service (WFS), enable end-users to access heterogeneous data stored in different formats from various sources, access is still time-consuming and ineffective due to the lack of semantics. To solve this problem, a prototype implementing geospatial data integration is proposed, addressing the following four problems: geospatial data retrieving, modeling, linking and integrating. We adopt four kinds of geospatial data sources to evaluate the performance of the proposed approach. The experimental results illustrate that the proposed linking method achieves high performance in generating matched candidate record pairs in terms of Reduction Ratio (RR), Pairs Completeness (PC), Pairs Quality (PQ) and F-score. The integration results show that each data source gains considerable Complementary Completeness (CC) and Increased Completeness (IC).
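The linking measures named above have standard definitions in record linkage: RR is the fraction of the full pair space pruned away, PC is the recall of true matches among the candidates, and PQ is the precision of the candidate set. A sketch with made-up numbers (not the paper's results):

```python
def linkage_metrics(n_total_pairs, candidates, true_matches):
    """Standard blocking/linking quality measures over a candidate set."""
    tm = len(candidates & true_matches)        # true matches kept by linking
    rr = 1 - len(candidates) / n_total_pairs   # Reduction Ratio
    pc = tm / len(true_matches)                # Pairs Completeness (recall)
    pq = tm / len(candidates)                  # Pairs Quality (precision)
    f = 2 * pc * pq / (pc + pq)                # harmonic mean of PC and PQ
    return rr, pc, pq, f

cand = {(1, 9), (2, 8), (3, 7), (4, 6)}        # pairs proposed by the linker
truth = {(1, 9), (2, 8), (5, 5)}               # ground-truth matched pairs
rr, pc, pq, f = linkage_metrics(n_total_pairs=100, candidates=cand, true_matches=truth)
print(f"RR={rr:.2f} PC={pc:.2f} PQ={pq:.2f} F={f:.2f}")
```

The tension the metrics capture: aggressive pruning raises RR and usually PQ, but risks dropping true matches and lowering PC; the F-score balances the last two.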


2019 ◽  
pp. 230-253
Author(s):  
Ying Zhang ◽  
Chaopeng Li ◽  
Na Chen ◽  
Shaowen Liu ◽  
Liming Du ◽  
...  

Since large amounts of geospatial data are produced by various sources and stored in incompatible formats, geospatial data integration is difficult because of the shortage of semantics. Although standardised data formats and data access protocols, such as Web Feature Service (WFS), enable end-users to access heterogeneous data stored in different formats from various sources, access is still time-consuming and ineffective due to the lack of semantics. To solve this problem, a prototype implementing geospatial data integration is proposed, addressing the following four problems: geospatial data retrieving, modeling, linking and integrating. First, we provide a uniform integration paradigm for users to retrieve geospatial data. Then, we align the retrieved geospatial data in the modeling process to eliminate heterogeneity with the help of Karma. Our main contribution focuses on addressing the third problem. Previous work has been done by defining a set of semantic rules for performing the linking process. However, geospatial data has some specific geospatial relationships, which are significant for linking but cannot be handled by Semantic Web techniques directly. We take advantage of such unique features of geospatial data to implement the linking process. In addition, previous work encounters a complicated problem when the geospatial data sources are in different languages. In contrast, our proposed linking algorithms are endowed with a translation function, which saves the cost of translating among geospatial sources in different languages. Finally, the geospatial data is integrated by eliminating data redundancy and combining the complementary properties from the linked records. We adopt four kinds of geospatial data sources, namely OpenStreetMap (OSM), Wikimapia, USGS and EPA, to evaluate the performance of the proposed approach. The experimental results illustrate that the proposed linking method achieves high performance in generating matched candidate record pairs in terms of Reduction Ratio (RR), Pairs Completeness (PC), Pairs Quality (PQ) and F-score. The integration results show that each data source gains considerable Complementary Completeness (CC) and Increased Completeness (IC).
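The use of geospatial relationships for linking can be illustrated with proximity-based candidate generation: two records from different sources become a candidate pair when their coordinates are close. The coordinates and the 100 m threshold below are illustrative assumptions, not the paper's parameters:

```python
import math

def haversine_km(p, q):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(a))

# Records from two sources (identifiers and coordinates are made up).
osm = {"osm_1": (40.7128, -74.0060)}
wikimapia = {"wm_1": (40.7130, -74.0055),   # likely the same feature
             "wm_2": (41.9000, -73.0000)}   # clearly a different place

# Use the geospatial relationship (proximity) to generate candidate links:
# only pairs within 100 m survive, pruning the cross-product of records.
candidates = [(a, b) for a, pa in osm.items() for b, pb in wikimapia.items()
              if haversine_km(pa, pb) < 0.1]
print(candidates)
```

Language-independent spatial predicates like this are exactly why geospatial features help linking where purely lexical Semantic Web rules struggle, e.g. when the same place is named in different languages across sources.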



