Text and Data Quality Mining in CRIS

Information ◽  
2019 ◽  
Vol 10 (12) ◽  
pp. 374 ◽  
Author(s):  
Azeroual

Scientific institutions that provide comprehensive and well-maintained documentation of their research information in a current research information system (CRIS) have the best prerequisites for implementing text and data mining (TDM) methods. Using TDM helps to better identify and eliminate errors, improve processes, develop the business, and make informed decisions. In addition, TDM increases understanding of the data and its context. This improves not only the quality of the data itself, but also the institution’s handling of the data and, consequently, the analyses. This paper deploys TDM in CRIS to analyze, quantify, and correct unstructured data and its quality issues. Bad data leads to increased costs or wrong decisions. Ensuring high data quality is an essential requirement when setting up a CRIS project. User acceptance of a CRIS depends, among other things, on data quality: the decisive criterion is not only objective data quality but also the subjective quality that individual users assign to the data.
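As an illustration of the kind of quality profiling the abstract describes, the following minimal sketch counts missing or empty required fields in hypothetical CRIS publication records. The field names and record structure are assumptions for illustration only, not the paper's implementation.

```python
# Minimal sketch (not the paper's method): quantify simple data quality
# issues in hypothetical CRIS publication records.
from collections import Counter

REQUIRED_FIELDS = ["title", "authors", "year", "doi"]  # assumed schema

def profile_records(records):
    """Count missing or empty required fields across a batch of records."""
    issues = Counter()
    for rec in records:
        for field in REQUIRED_FIELDS:
            value = rec.get(field)
            if value is None or str(value).strip() == "":
                issues[field] += 1
    return issues

records = [
    {"title": "Text and Data Quality Mining in CRIS", "authors": "Azeroual", "year": 2019, "doi": ""},
    {"title": "", "authors": "Azeroual; Schöpfel", "year": None, "doi": "10.0000/example.doi"},
]
print(profile_records(records))  # e.g. Counter({'doi': 1, 'title': 1, 'year': 1})
```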

Publications ◽  
2019 ◽  
Vol 7 (1) ◽  
pp. 14 ◽  
Author(s):  
Otmane Azeroual ◽  
Joachim Schöpfel

Collecting, integrating, storing and analyzing data in a database system is nothing new in itself. Introducing a current research information system (CRIS) means that scientific institutions must provide the required information on their research activities and research results at high quality. A one-time cleanup is not sufficient; data must be continuously curated and maintained. Some data errors (such as missing values, spelling errors, inaccurate data, incorrect formatting, inconsistencies, etc.) can be traced across different data sources and are difficult to find. Small mistakes can make data unusable, and corrupted data can have serious consequences. The sooner quality issues are identified and remedied, the better. For this reason, new techniques and methods of data cleansing and data monitoring are required to ensure data quality and its measurability in the long term. This paper examines data quality issues in current research information systems and introduces new techniques and methods of data cleansing and data monitoring with which organizations can guarantee the quality of their data.
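A minimal sketch of cleansing rules for the error types named above (whitespace and formatting normalization, unparsable values, duplicates), assuming CRIS records are simple dictionaries; this is an illustration, not the paper's cleansing method.

```python
# Minimal sketch of illustrative data cleansing steps for hypothetical
# CRIS records (assumed field names, not the paper's implementation).
import re

def cleanse(record):
    cleaned = dict(record)
    # Normalize whitespace and obvious formatting issues in text fields.
    for key, value in cleaned.items():
        if isinstance(value, str):
            cleaned[key] = re.sub(r"\s+", " ", value).strip()
    # Standardize the year; values that cannot be parsed become None.
    year = str(cleaned.get("year", "")).strip()
    cleaned["year"] = int(year) if year.isdigit() else None
    return cleaned

def deduplicate(records):
    # Treat records with the same normalized title and year as duplicates.
    seen, unique = set(), []
    for rec in map(cleanse, records):
        key = ((rec.get("title") or "").lower(), rec.get("year"))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique
```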


2019 ◽  
Vol 12 (4) ◽  
pp. 84 ◽  
Author(s):  
Otmane Azeroual

With the increased accessibility of research information, the demands on research information systems (RIS), which are expected to automatically generate and process knowledge, are increasing. Furthermore, the quality of RIS data entries from the individual information sources causes problems. If the data in a RIS is structured, users can read it and filter out the information and knowledge they need without any problems. The technique that allows text databases and text sources to be analyzed and knowledge to be extracted from unknown texts is referred to as text mining, or text data mining, and is based on the principles of data mining. Text mining makes it possible to automatically classify large, heterogeneous sources of research information and assign them to specific topics. Research information has always played a major role in higher education and academic institutions, although it is usually available in unstructured form in RIS and grows faster than structured data. This wastes the time of RIS staff at universities who search for information and can lead to poor decision-making. For this reason, the present paper proposes a new approach to obtaining structured research information from heterogeneous information systems. It is a subset of an approach to the semantic integration of unstructured data, using a RIS as an example. The purpose of this paper is to investigate text and data mining methods in the context of RIS and to develop a quality improvement model that helps universities and academic institutions using RIS to enrich unstructured research information.
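The following minimal sketch shows the kind of topic classification of research texts that text mining enables, assuming scikit-learn is available; the training texts, topic labels, and model choice are illustrative assumptions, not the paper's approach.

```python
# Minimal sketch: assign short research texts to topics with TF-IDF
# features and a simple classifier (illustrative, not the paper's model).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_texts = [
    "gene expression sequencing of crop plants",
    "deep learning for image classification",
    "archival metadata quality in repositories",
]
train_topics = ["biology", "computer science", "information science"]

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(train_texts, train_topics)

print(model.predict(["metadata curation for research repositories"]))
```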


Symmetry ◽  
2019 ◽  
Vol 11 (10) ◽  
pp. 1248 ◽  
Author(s):  
Shao ◽  
Weng ◽  
Chuang

Improving the quality of research information systems is an important goal in the process of improving the performance of research management in Chinese universities. Since the evaluation of information system (IS) quality is a multicriteria decision problem, it is critical to identify the interrelationships among the dimensions and criteria, and decide on the important criteria for proposed improvement strategies. This paper suggests a hybrid multicriteria decision-making (MCDM) model for improving the quality of a research information system. First, a rough method combined with the decision-making trial and evaluation laboratory and analytical network process (rough DANP) model is used to improve the objectivity of expert judgements. Additionally, the rough DANP can be used to construct an influential network relationship map (INRM) between research information system components to derive the criterion weights. The complex proportional assessment of alternatives with rough numbers (COPRAS-R) is applied to evaluate the performance of the research information system. A Chinese university research information system is chosen to illustrate the usefulness of the proposed model. The results show that efficiency, effectiveness, and user frequency have the highest priorities for improvement. Selected management implications based on the actual case study are supplied.
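For orientation, the sketch below shows the classical DEMATEL step that underlies DANP: computing the total-influence matrix from a direct-influence matrix. The paper uses a rough-number variant combined with ANP and COPRAS-R, which is not reproduced here; the judgement matrix is invented for illustration.

```python
# Minimal sketch of classical DEMATEL (not the rough DANP of the paper):
# total-influence matrix T = N (I - N)^(-1) from a direct-influence matrix.
import numpy as np

def dematel_total_influence(direct):
    D = np.asarray(direct, dtype=float)
    # Normalize by the largest row sum so the influence series converges.
    N = D / D.sum(axis=1).max()
    I = np.eye(D.shape[0])
    return N @ np.linalg.inv(I - N)

# Illustrative expert judgements (0 = no influence, 4 = very high influence).
direct = [[0, 3, 2],
          [1, 0, 3],
          [2, 1, 0]]
T = dematel_total_influence(direct)
print(T.sum(axis=1) + T.sum(axis=0))  # prominence of each criterion
```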


2021 ◽  
Author(s):  
Isabell Krisch ◽  
Oliver Reitebuch ◽  
Jonas von Bismarck ◽  
Alain Dabas ◽  
Peggy Fischer ◽  
...  

The European Space Agency (ESA)’s Earth Explorer Aeolus was launched in August 2018 carrying the world’s first spaceborne wind lidar, the Atmospheric Laser Doppler Instrument (ALADIN). ALADIN uses a high-spectral-resolution Doppler wind lidar operating at 355 nm to determine profiles of line-of-sight wind components in near-real-time (NRT). ALADIN samples the atmosphere from 30 km altitude down to the Earth’s surface, or to the level where the lidar signal is attenuated by optically thick clouds.

The global wind profiles provided by ALADIN help to improve weather forecasting and the understanding of atmospheric dynamics, as they fill observational gaps in vertically resolved wind profiles, mainly in the tropics, the southern hemisphere, and over the northern hemisphere oceans. Since 2020, multiple national and international weather centres (e.g. ECMWF, DWD, Météo France, MetOffice) have assimilated Aeolus observations in their operational forecasting. Additionally, the scientific exploitation of the Aeolus dataset has started.

A main prerequisite for beneficial impact and scientific exploitation is data of sufficient quality. Such high data quality has been achieved through close collaboration of all involved parties within the Aeolus Data Innovation and Science Cluster (DISC), which was established after launch to study and improve the data quality of Aeolus products. The tasks of the Aeolus DISC include instrument and platform monitoring, calibration, characterization, retrieval algorithm refinement, processor evolution, quality monitoring, product validation, and impact assessment for NWP.

The achievements of the Aeolus DISC with respect to NRT data quality and the currently available reprocessed dataset will be presented. The data quality of the Aeolus wind measurements will be described, and an outlook on planned improvements of the dataset and processors will be provided.
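As a pointer to the physics behind the line-of-sight wind retrieval, the sketch below applies the basic Doppler relation for a lidar, v_los = (λ/2)·Δf. This is only the textbook relation at the ALADIN wavelength, not the Aeolus L2B processor; the example frequency shift is invented.

```python
# Minimal sketch of the basic Doppler relation behind a wind lidar
# (illustrative only, not the Aeolus processing chain).
WAVELENGTH_M = 355e-9  # ALADIN ultraviolet wavelength, 355 nm

def los_wind_from_doppler_shift(delta_f_hz):
    """Convert a Doppler frequency shift (Hz) to line-of-sight wind speed (m/s)."""
    return 0.5 * WAVELENGTH_M * delta_f_hz

# A shift of about 56 MHz corresponds to roughly 10 m/s along the line of sight.
print(los_wind_from_doppler_shift(56.3e6))
```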


Author(s):  
Markus Oppermann ◽  
Stephan Weise

In the wide-ranging field of biodiversity conservation, genebanks play a major role in the preservation of cultivated plants. An important focus of genebanks is the comprehensive documentation of the maintained material. This is a prerequisite for enabling users to select the most suitable material for, e.g., research or breeding programs (Hoisington et al. 1999). The German Federal ex situ Genebank for Agricultural and Horticultural Crops, which is hosted at IPK, is the largest genebank in Western Europe. Within the multitude of data associated with plant material (e.g. from various -omics areas or conservation management), the so-called passport data represent the most original and oldest data in genebanks. These metadata are often subject to heterogeneity due to historically different collection and curation practices, especially if they were received from different institutions around the world. This leads to difficulties in handling these data and can result in misinterpretations. In addition, there are correlations between the individual attributes of the passport data, which can give individual data points different importance for different users. Major challenges for users are estimating the completeness, correctness and reliability of these data. Thus, it is necessary to assess the quality of these data by defining a suitable set of metrics. Unfortunately, classical data quality measurement metrics, e.g. (Klier 2008), are not sufficient to fulfill the users' needs. Depending on the intention of the user, a different focus is placed on the data. Moreover, the individual attributes of the respective areas can be related to each other. Therefore, a single index value for estimating the quality of a passport record is not sufficient. Rather, it seems more promising to generate more differentiated quality statements. We are working on a metrics system that is sensitive to the users' focus. Through a practical set of data quality metric rules for accession-related data, the user will be able to influence the weighting of individual domains (e.g. geographical origin, biological status) according to their context (fit-for-use index). The presentation will discuss the background and give an overview of the progress of this research activity.
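A minimal sketch of what a user-weighted, fit-for-use style score could look like: per-domain completeness combined with user-chosen weights. The domain grouping, field names, and weights are hypothetical and do not reflect IPK's actual metrics system.

```python
# Minimal sketch of a user-weighted "fit-for-use" style score for a
# passport record (hypothetical domains and fields, not IPK's metrics).
DOMAINS = {
    "geographical_origin": ["origin_country", "latitude", "longitude"],
    "biological_status": ["biological_status", "taxon_name"],
}

def domain_completeness(record, fields):
    filled = sum(1 for f in fields if record.get(f) not in (None, ""))
    return filled / len(fields)

def fit_for_use(record, weights):
    """Weighted average of per-domain completeness; weights reflect user focus."""
    total_weight = sum(weights.values())
    return sum(
        weights[d] * domain_completeness(record, fields)
        for d, fields in DOMAINS.items()
    ) / total_weight

record = {"origin_country": "PER", "latitude": None, "longitude": None,
          "biological_status": "landrace", "taxon_name": "Hordeum vulgare"}
print(fit_for_use(record, {"geographical_origin": 1, "biological_status": 2}))
```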


2018 ◽  
Vol 2 ◽  
pp. e25970
Author(s):  
Andrew Bentley

The recent incorporation of standardized data quality metrics into the GBIF, iDigBio, and ALA portal infrastructures provides data providers with useful information they can use to clean or augment Darwin Core data at the source based on these recommendations. Numerous taxonomy- and geography-based metrics provide useful information on the quality of various Darwin Core fields in this realm, while also providing input on Darwin Core compliance for others. As a provider/data manager for the Biodiversity Institute, University of Kansas, I have spent some time evaluating their efficacy and reliability; this presentation will highlight some of the positive and negative aspects of my experience with specific examples, while highlighting concerns regarding the user experience and the standardization of these metrics across the aggregator landscape. These metrics have indicated both data and publishing issues; addressing them has increased the utility and cleanliness of our data, while also highlighting batch-processing challenges and issues with the process of inferring "bad" data. The integration of these metrics into source database infrastructure will also be postulated, with Specify Software as an example.
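To make the idea concrete, the sketch below applies the kind of geographic quality check aggregators run on Darwin Core records. The rules and flag names are illustrative assumptions, not GBIF's, iDigBio's, or ALA's actual flag logic; only the field names decimalLatitude and decimalLongitude are standard Darwin Core terms.

```python
# Minimal sketch of an illustrative geographic quality check for a Darwin
# Core record (invented flag names, not any aggregator's real flags).
def flag_coordinates(record):
    flags = []
    lat, lon = record.get("decimalLatitude"), record.get("decimalLongitude")
    if lat is None or lon is None:
        flags.append("COORDINATES_MISSING")
    else:
        if not (-90 <= float(lat) <= 90) or not (-180 <= float(lon) <= 180):
            flags.append("COORDINATES_OUT_OF_RANGE")
        if float(lat) == 0 and float(lon) == 0:
            flags.append("ZERO_COORDINATES")
    return flags

print(flag_coordinates({"decimalLatitude": 95.0, "decimalLongitude": 24.7}))
```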


2020 ◽  
Vol 66 (suppl 1) ◽  
pp. s59-s67 ◽  
Author(s):  
Raíssa Antunes Pereira ◽  
Christiane Ishikawa Ramos ◽  
Renata Rodrigues Teixeira ◽  
Gisselma Aliny Santos Muniz ◽  
Gabriele Claudino ◽  
...  

SUMMARY A healthy diet is an essential requirement for promoting and preserving health, even in the presence of diseases such as chronic kidney disease (CKD). In this review, nutritional therapy for CKD will be addressed, considering not only the main nutrients, such as protein, phosphorus, potassium, and sodium, which require adjustments as a result of the changes that accompany the reduction of renal function, but also the benefits of adopting dietary patterns associated with better outcomes for both preventing and treating CKD. We will also emphasize that these aspects should be combined with a process of giving new meaning to a healthy diet so that it can be promoted. Finally, we will present the perspective of an integrated approach to the individual with CKD, exploring the importance of considering biological, psychological, social, cultural, and economic aspects. This approach has the potential to contribute to better adherence to treatment, thus improving the patient's quality of life.


Author(s):  
B. Carragher ◽  
M. Whittaker

Techniques for three-dimensional reconstruction of macromolecular complexes from electron micrographs have been used successfully for many years. These include methods that take advantage of the natural symmetry properties of the structure (for example, helical or icosahedral) as well as those that use single-axis or other tilting geometries to reconstruct from a set of projection images. These techniques have traditionally relied on a very experienced operator to manually perform the often numerous and time-consuming steps required to obtain the final reconstruction. While the guidance and oversight of an experienced and critical operator will always be an essential component of these techniques, recent advances in computer technology, microprocessor-controlled microscopes and the availability of high-quality CCD cameras have provided the means to automate many of the individual steps. During data acquisition, automation provides benefits not only in terms of convenience and time saving but also in circumstances where manual procedures limit the quality of the final reconstruction.


2010 ◽  
Vol 39 (2) ◽  
pp. 34-36
Author(s):  
Vaia Touna

This paper argues that the rise of what is commonly termed "personal religion" during the Classic-Hellenistic period is not the result of an inner need or even a quality of the self, as often argued by those who see in ancient Greece a foreshadowing of Christianity, but rather the result of social, economic, and political conditions that made it possible for Hellenistic Greeks to redefine the perception of the individual and its relationship to others.


2017 ◽  
Vol 3 (1) ◽  
pp. 112-126 ◽  
Author(s):  
Ilaria Cristofaro

From a phenomenological perspective, the reflective quality of water has a visually dramatic impact, especially when combined with the light of celestial phenomena. However, the possible presence of water as a means for reflecting the sky is often undervalued when interpreting archaeoastronomical sites. From artificial water spaces, such as ditches, huacas and wells, to natural ones such as rivers, lakes and puddles, water spaces add a layer of interacting reflections to landscapes. In the cosmological understanding of skyscapes and waterscapes, a cross-cultural metaphorical association between water spaces and the underworld is often revealed. In this research, water-skyscapes are explored through the practice of auto-ethnography and reflexive phenomenology. The mirroring of the sky in water opens up themes such as the continuity, delimitation and manipulation of sky phenomena on land: water spaces act as a continuation of the sky on earth; depending on water spaces’ spatial extension, selected celestial phenomena can be periodically reflected within architectures, so as to make the heavenly dimension easily accessible and a possible object of manipulation. Water-skyscapes appear as specular worlds, where water spaces are assumed to be doorways to the inner reality of the unconscious. The fluid properties of water have the visual effect of dissipating borders, of merging shapes, and, therefore, of dissolving identities; in the inner landscape, this process may represent symbolic death experiences and rituals of initiation, where the annihilation of the individual allows the creative process of a new life cycle. These contextually generalisable results aim to inspire new perspectives on sky-and-water related case studies and give value to the practice of reflexive phenomenology as a crucial method of research.

