data element
Recently Published Documents

TOTAL DOCUMENTS: 147 (FIVE YEARS: 43)
H-INDEX: 7 (FIVE YEARS: 1)

2021 ◽  
Vol 17 (4) ◽  
pp. 67-100
Author(s):  
Thang Truong Nguyen ◽  
Nguyen Long Giang ◽  
Dai Thanh Tran ◽  
Trung Tuan Nguyen ◽  
Huy Quang Nguyen ◽  
...  

Attribute reduction from decision tables is one of the crucial topics in data mining. The problem is NP-hard, and many approximation algorithms based on the filter or filter-wrapper approaches have been designed to find reducts. The intuitionistic fuzzy set (IFS) has been regarded as an effective tool for this problem because it attaches two degrees, membership and non-membership, to each data element. Separating attributes into these two counterparts, as in the IFS, can increase classification quality and reduce the size of the reducts. Motivated by this, this paper proposes a new filter-wrapper algorithm based on the IFS for attribute reduction from decision tables. The contributions include a new intuitionistic fuzzy distance between partitions, accompanied by theoretical analysis. The filter-wrapper algorithm is designed on top of that distance, with a new stopping condition based on the concept of delta-equality. Experiments are conducted on benchmark datasets from the UCI Machine Learning Repository.
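To make the membership/non-membership idea concrete, here is a minimal sketch of one common IFS distance, the normalized Hamming distance between two intuitionistic fuzzy sets. This is an illustrative measure, not the partition distance proposed in the paper:

```python
def ifs_distance(A, B):
    """Normalized Hamming distance between two intuitionistic fuzzy sets.

    A, B: lists of (mu, nu) pairs of equal length, where mu is the membership
    degree, nu the non-membership degree, and pi = 1 - mu - nu the hesitancy.
    """
    n = len(A)
    total = 0.0
    for (mu_a, nu_a), (mu_b, nu_b) in zip(A, B):
        pi_a = 1.0 - mu_a - nu_a  # hesitancy degree in A
        pi_b = 1.0 - mu_b - nu_b  # hesitancy degree in B
        total += abs(mu_a - mu_b) + abs(nu_a - nu_b) + abs(pi_a - pi_b)
    return total / (2.0 * n)
```

The distance is 0 for identical sets and 1 for fully opposite ones (e.g. (1, 0) versus (0, 1)), which is the kind of property a partition distance for attribute reduction needs.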


2021 ◽  
Author(s):  
Laura Dover Wandner ◽  
Anthony F. Domenichiello ◽  
Jennifer Beierlein ◽  
Leah Pogorzala ◽  
Guadalupe Aquino ◽  
...  

2021 ◽  
Author(s):  
Huaqin Pan ◽  
Cataia Ives ◽  
Meisha Mandal ◽  
Ying Qin ◽  
Tabitha Hendershot ◽  
...  

Objectives: To adopt the FAIR principles (Findable, Accessible, Interoperable, Reusable) and enhance data sharing, the Cure Sickle Cell Initiative (CureSCi) MetaData Catalog (MDC) was developed to make Sickle Cell Disease (SCD) study datasets more Findable by curating study metadata and making them available through an open-access web portal. Methods: Study metadata, including the study protocol, data collection forms, and data dictionaries, describe a study's patient-level data. We curated key metadata of 16 SCD studies into a three-tiered conceptual framework of category, subcategory, and data element, using ontologies and controlled vocabularies to organize the study variables. We developed the CureSCi MDC by indexing study metadata to enable effective browse and search capabilities at three levels: study, Patient-Reported Outcome (PRO) Measures, and data elements. Results: The CureSCi MDC offers several browse and search tools to discover studies at the study, PRO Measure, and data element levels. The “Browse Studies,” “Browse Studies by PRO Measures,” and “Browse Studies by Data Elements” tools allow users to identify studies through pre-defined conceptual categories. “Search by Keyword” and “Search Data Element by Concept Category” can be used separately or in combination to refine search results with more granularity. This resource helps investigators find information about specific data elements across studies using public browsing and search tools before going through data request procedures to access controlled datasets. The MDC makes SCD studies more Findable by supporting browsing and searching of study information, PRO Measures, and data elements, aiding the reuse of existing SCD data.
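The combinable keyword and concept-category search described above can be sketched as a filter over a tiered metadata index. The study names, categories, and element names below are invented placeholders, not the actual CureSCi MDC contents:

```python
# Hypothetical three-tiered index (category > subcategory > data element),
# one row per (study, data element) pair.
INDEX = [
    {"study": "Study-A", "category": "Pain", "subcategory": "Pain interference",
     "element": "pain_interference_score"},
    {"study": "Study-A", "category": "Demographics", "subcategory": "Age",
     "element": "age_at_enrollment"},
    {"study": "Study-B", "category": "Pain", "subcategory": "Pain intensity",
     "element": "worst_pain_24h"},
]

def search(keyword=None, category=None):
    """Apply keyword and concept-category filters separately or together,
    mirroring how the MDC's two search tools can be combined."""
    hits = INDEX
    if category is not None:
        hits = [r for r in hits if r["category"] == category]
    if keyword is not None:
        hits = [r for r in hits if keyword.lower() in r["element"].lower()]
    return sorted({r["study"] for r in hits})
```

For example, `search(category="Pain")` finds every study with a pain-related data element, and adding `keyword="worst"` narrows the result to the study recording worst pain.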


Author(s):  
Allison Gates ◽  
Michelle Gates ◽  
Shannon Sim ◽  
Sarah A. Elliott ◽  
Jennifer Pillay ◽  
...  

Background. Machine learning tools that semi-automate data extraction may create efficiencies in systematic review production. We prospectively evaluated an online machine learning and text mining tool’s ability to (a) automatically extract data elements from randomized trials, and (b) save time compared with manual extraction and verification. Methods. For 75 randomized trials published in 2017, we manually extracted and verified data for 21 unique data elements. We uploaded the randomized trials to ExaCT, an online machine learning and text mining tool, and quantified performance by evaluating the tool’s ability to identify the reporting of data elements (reported or not reported) and the relevance of the extracted sentences, fragments, and overall solutions. For each randomized trial, we measured the time to complete manual extraction and verification, and the time to review and amend the data extracted by ExaCT (simulating semi-automated data extraction). We summarized the relevance of the extractions for each data element using counts and proportions, and calculated the median and interquartile range (IQR) across data elements. We calculated the median (IQR) time for manual and semi-automated data extraction, and the overall time savings. Results. The tool identified the reporting of data elements (reported or not reported) with a median (IQR) accuracy of 91 percent (75% to 99%). Performance was perfect for four data elements: eligibility criteria, enrolment end date, control arm, and primary outcome(s). Among the top five sentences for each data element, at least one sentence was relevant in a median (IQR) 88 percent (83% to 99%) of cases. Performance was perfect for four data elements: funding number, registration number, enrolment start date, and route of administration. In a median (IQR) 90 percent (86% to 96%) of relevant sentences, the system had highlighted pertinent fragments; exact matches were unreliable (median (IQR) 52 percent [32% to 73%]).
A median 48 percent of solutions were fully correct, but performance varied greatly across data elements (IQR 21% to 71%). Using ExaCT to assist the first reviewer resulted in a modest time savings compared with manual extraction by a single reviewer (17.9 vs. 21.6 hours total extraction time across 75 randomized trials). Conclusions. Using ExaCT to assist with data extraction resulted in modest gains in efficiency compared with manual extraction. The tool was reliable for identifying the reporting of most data elements. The tool’s ability to identify at least one relevant sentence and highlight pertinent fragments was generally good, but changes to sentence selection and/or highlighting were often required.
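The median (IQR) summaries reported above are straightforward to compute from per-data-element results; a sketch using Python's standard library, with invented accuracy figures rather than the study's actual data:

```python
import statistics

# Illustrative per-data-element accuracies (invented numbers, not the
# study's results); the real analysis spans 21 data elements.
accuracy_by_element = [0.75, 0.80, 0.85, 0.91, 0.93, 0.97, 0.99]

# Median across data elements.
median = statistics.median(accuracy_by_element)

# quantiles(n=4) returns the three quartile cut points [Q1, Q2, Q3];
# Q1 and Q3 bound the interquartile range.
q1, _, q3 = statistics.quantiles(accuracy_by_element, n=4, method="inclusive")

print(f"median {median:.2f} (IQR {q1:.3f} to {q3:.2f})")
```

Reporting the spread across data elements, rather than a single pooled accuracy, is what exposes the large variation the authors note (IQR 21% to 71% for fully correct solutions).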


2021 ◽  
Vol 12 (04) ◽  
pp. 826-835
Author(s):  
Lorenz A. Kapsner ◽  
Jonathan M. Mang ◽  
Sebastian Mate ◽  
Susanne A. Seuchter ◽  
Abishaa Vengadeswaran ◽  
...  

Abstract Background Many research initiatives aim at using data from electronic health records (EHRs) in observational studies. Participating sites of the German Medical Informatics Initiative (MII) established data integration centers to integrate EHR data within research data repositories to support local and federated analyses. To address concerns regarding possible data quality (DQ) issues of hospital routine data compared with data specifically collected for scientific purposes, we previously presented a data quality assessment (DQA) tool providing a standardized approach to assess the DQ of the research data repositories at the MIRACUM consortium's partner sites. Objectives Major limitations of the former approach included manual interpretation of the results and hard-coded analyses, making expansion to new data elements and databases time-consuming and error-prone. Here we present an enhanced version of the DQA tool that links it to common data element definitions stored in a metadata repository (MDR), adopts the harmonized DQA framework from Kahn et al, and applies it within the MIRACUM consortium. Methods Data quality checks were aligned to a harmonized DQA terminology. Database-specific information was systematically identified and represented in an MDR. Furthermore, a structured representation of logical relations between data elements was developed to model plausibility statements in the MDR. Results The MIRACUM DQA tool was linked to data element definitions stored in a consortium-wide MDR. Additional databases used within MIRACUM were linked to the DQ checks by extending the respective data elements in the MDR with the required information. The evaluation of DQ checks was automated. An adaptable software implementation is provided with the R package DQAstats. Conclusion The enhancements of the DQA tool facilitate the future integration of new data elements and make the tool scalable to other databases and data models.
It has been provided to all ten MIRACUM partners and was successfully deployed and integrated into their respective data integration center infrastructure.
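The core idea of driving checks from MDR definitions rather than hard-coding them can be sketched as follows. This is a schematic Python illustration with invented element names and limits, not the DQAstats R implementation:

```python
# Hypothetical metadata repository: each data element definition carries
# its own plausibility limits, so the check logic stays generic.
MDR = {
    "age_years":  {"min": 0,  "max": 120},
    "heart_rate": {"min": 20, "max": 300},
}

def run_dq_checks(records):
    """Return per-element counts of missing and implausible values,
    driven entirely by the definitions in the MDR."""
    report = {name: {"missing": 0, "implausible": 0} for name in MDR}
    for record in records:
        for name, rule in MDR.items():
            value = record.get(name)
            if value is None:
                report[name]["missing"] += 1          # completeness check
            elif not (rule["min"] <= value <= rule["max"]):
                report[name]["implausible"] += 1      # plausibility check
    return report
```

Adding a new data element or database then only requires a new MDR entry, which is exactly the scalability gain the enhanced tool aims for.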


2021 ◽  
Author(s):  
Varsha Gouthamchand ◽  
Andre Dekker ◽  
Leonard Wee ◽  
Johan van Soest

One of the common concerns in clinical research is improving the infrastructure to facilitate the reuse of clinical data and to deal with interoperability issues. The FAIR (Findable, Accessible, Interoperable and Reusable) Data Principles enable the reuse of data by providing descriptive metadata that explains what the data represent and where they can be found. In addition to aiding scholars, the FAIR guidelines enhance the machine-readability of data, making it easier for machine algorithms to find and utilize them. Data can therefore be interpreted more accurately, helping researchers obtain the most from their work. FAIR-ification is done by embedding knowledge in the data. This can be achieved by annotating the data with terminologies and concepts expressed in the Web Ontology Language (OWL). By attaching a terminological value, we add semantics to a specific data element, increasing interoperability and reuse. However, this FAIR-ification of data can be a complicated and time-consuming process. Our main objective is to disentangle the process of making data FAIR by using both domain and technical expertise. We apply this process in a workflow that FAIR-ifies four independent public HNSCC datasets from The Cancer Imaging Archive (TCIA). The approach converts the data from the four datasets into Linked Data using RDF triples and then annotates these datasets with standardized terminologies. The annotations link all four datasets together through their semantics, so a single query can retrieve the intended information from all of them.
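A simplified sketch of the annotation step: a tabular value becomes RDF triples (represented here as plain string tuples rather than via a triple-store library), and the column predicate is linked to a terminology concept. The URIs below are illustrative placeholders, not the actual TCIA mappings:

```python
def to_triples(dataset, patient_id, column_uri, value, concept_uri):
    """Convert one tabular cell into RDF-style triples and attach a
    semantic annotation to the column it came from."""
    subject = f"http://example.org/{dataset}/patient/{patient_id}"
    return [
        # The raw data value, as a triple about this patient.
        (subject, column_uri, str(value)),
        # The semantic annotation: this column means this standard concept.
        (column_uri, "http://example.org/annotatedWith", concept_uri),
    ]
```

Because all four datasets annotate their own column URIs with the same standard concept URI, a single query over the merged triples can match on the concept and retrieve the corresponding values from every dataset at once.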


Epilepsia ◽  
2021 ◽  
Author(s):  
Mark P. Fitzgerald ◽  
Michael C. Kaufman ◽  
Shavonne L. Massey ◽  
Sara Fridinger ◽  
Marisa Prelack ◽  
...  

2021 ◽  
Author(s):  
Francis Lau ◽  
Marcy Antonio ◽  
Kelly Davison ◽  
Roz Queen ◽  
Katie Bryski

BACKGROUND Historically, the terms sex and gender have been used interchangeably as a binary attribute to describe a person as male or female, even though there is growing recognition that sex and gender are distinct concepts. The lack of sex and gender delineation in electronic health records (EHRs) may be perpetuating the inequities experienced by the transgender and gender nonbinary (TGNB) populations. OBJECTIVE This study aims to conduct an environmental scan to understand how sex and gender are defined and implemented in existing Canadian EHRs and current international health information standards. METHODS We examined public information sources on sex and gender definitions in existing Canadian EHRs and international standards communities. Definitions refer to data element names, code systems, and value sets in the descriptions of EHRs and standards. The study was built on an earlier environmental scan by Canada Health Infoway, supplemented with sex and gender definitions from international standards communities. For the analysis, we examined the definitions for clarity, consistency, and accuracy. We also received feedback from a virtual community interested in sex-gender EHR issues. RESULTS The information sources consisted of public website descriptions of 52 databases and 55 data standards from 12 Canadian entities and 10 standards communities. There are variations in the definition and implementation of sex and gender in Canadian EHRs and international health information standards. There is a lack of clarity in some sex and gender concepts. There is inconsistency in the data element names, code systems, and value sets used to represent sex and gender concepts across EHRs. The appropriateness and adequacy of some value options are questioned as our societal understanding of sexual health evolves. Outdated value options raise concerns about current EHRs supporting the provision of culturally competent, safe, and affirmative health care.
The limited options also perpetuate the inequities faced by the TGNB populations. The expanded sex and gender definitions from leading Canadian organizations and international standards communities pose challenges for migrating these definitions into existing EHRs. We propose six high-level actions: articulate the need for this work; reach consensus on sex and gender concepts; reach consensus on expanded sex and gender definitions in EHRs; develop a coordinated action plan; embrace EHR change in both its socio-organizational and technical aspects to ensure success; and demonstrate the benefits in tangible terms. CONCLUSIONS There are variations in sex and gender concepts across Canadian EHRs and the health information standards that support them. Although there are efforts to modernize sex and gender concept definitions, we need decisive and coordinated actions to ensure clarity, consistency, and competency in the definition and implementation of sex and gender concepts in EHRs. This work has implications for addressing the inequities of TGNB populations in Canada.
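Delineating sex and gender as distinct data elements, each with its own value set, can be sketched as below. The element names and value options are illustrative placeholders for discussion, not an endorsed code system or standard:

```python
# Hypothetical value sets: sex assigned at birth and gender identity are
# separate data elements rather than one interchangeable binary field.
VALUE_SETS = {
    "sex_assigned_at_birth": {"male", "female", "intersex", "unknown"},
    "gender_identity": {"man", "woman", "nonbinary",
                        "self-described", "prefer-not-to-say"},
}

def validate(record):
    """Return the names of sex/gender elements whose value falls
    outside that element's own value set."""
    errors = []
    for element, value_set in VALUE_SETS.items():
        value = record.get(element)
        if value is not None and value not in value_set:
            errors.append(element)
    return errors
```

Keeping the two value sets independent is the point: a value that is valid for one element (e.g. "male" for sex assigned at birth) is not silently accepted for the other.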


2021 ◽  
Author(s):  
Lingtong Min ◽  
Xiangang Liu ◽  
Deyun Zhou ◽  
Yuanjie Zhi ◽  
Xiaoyang Li ◽  
...  

BACKGROUND The continuous application of electronic health record information systems has accumulated enormous amounts of medical data with potential value. Semantic interoperability is the premise of mining and realizing this value by sharing and reusing vast amounts of scattered, heterogeneous, multi-source clinical data. Several initiatives have been developing information models to realize and improve semantic interoperability, among which the openEHR model is one of the most prominent. Reusing archetypes from Clinical Knowledge Manager (CKM) instances is the backbone of achieving semantic interoperability based on the openEHR approach, and archetype reuse among different CKMs is of great significance for achieving a broader range of semantic interoperability. However, the extent of archetype reuse among existing CKMs is still unclear. OBJECTIVE This study aims to investigate the reuse of openEHR archetypes across the CKMs approved by openEHR International, with a view to achieving and improving semantic interoperability. METHODS This study analyzes and compares archetype reuse across the given five CKMs (Fi-CKMs) and across four of them (Fo-CKMs). First, reused archetypes were analyzed and compared in terms of archetype reuse number and archetype reuse ratio. Then, the data elements of the 25 reused archetypes were analyzed and compared across the Fi-CKMs and the Fo-CKMs, including the type, number, and ratio of the reused data elements. RESULTS The archetype reuse numbers across the Fi-CKMs and the Fo-CKMs are 25 and 58, respectively. Comparison of the archetype reuse ratio showed that removing one CKM increased the ratio from 5.56%-19.23% to 12.89%-44.62%.
The analysis and comparison of data elements across the Fi-CKMs and the Fo-CKMs show that removing one CKM increased the number of reusable data elements from 231 to 355, increased the data element reuse ratio from 54.20%-70.64% to 90.33%-100%, and increased the data element direct-reuse ratio from 89% to 100%. CONCLUSIONS A large number of archetypes have accumulated in the existing CKM instances, providing an important foundation for the archetype reuse needed for semantic interoperability. As the number of CKMs involved decreases, archetype reuse improves, including the number of reused archetypes, the number of reused data elements, and the ratio of directly reused data elements. A more effective coordination mechanism between multiple CKMs needs to be established to promote archetype reuse across CKMs.
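The "archetype reuse number" and "archetype reuse ratio" can be read as set-intersection arithmetic; a sketch with invented CKM names and archetype IDs (the paper's actual inventories are not reproduced here):

```python
def cross_ckm_reuse(ckms):
    """Return the archetypes reused in every CKM (the reuse number is its
    size) and each CKM's reuse ratio in percent."""
    common = set.intersection(*ckms.values())
    ratios = {name: 100.0 * len(common) / len(archetypes)
              for name, archetypes in ckms.items()}
    return common, ratios

# Hypothetical archetype inventories for three CKMs.
ckms = {
    "CKM-A": {"blood_pressure", "body_weight", "pulse", "height"},
    "CKM-B": {"blood_pressure", "body_weight", "pulse"},
    "CKM-C": {"blood_pressure", "medication_order"},
}
```

Across all three CKMs only `blood_pressure` is shared, but dropping `CKM-C` raises the reuse number to three and every remaining CKM's ratio, mirroring the Fi-CKM versus Fo-CKM effect the abstract reports.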


2021 ◽  
Vol 34 (1) ◽  
pp. 1-15
Author(s):  
Alexander J. Towbin ◽  
Christopher J. Roth ◽  
Cheryl A. Petersilge ◽  
Kimberley Garriott ◽  
Kenneth A. Buckwalter ◽  
...  

Abstract In order for enterprise imaging to be successful across a multitude of specialties, systems, and sites, standards are essential to categorize and classify imaging data. The HIMSS-SIIM Enterprise Imaging Community believes that the Digital Imaging and Communications in Medicine (DICOM) Anatomic Region Sequence, or its equivalent in other data standards, is a vital data element for this role when populated with standard coded values. We believe that labeling images with standard Anatomic Region Sequence codes will enhance the user’s ability to consume data, facilitate interoperability, and allow greater control of privacy. Image consumption: when a user views a patient’s images, he or she often wants relevant comparison images of the same lesion or anatomic region for the same patient to be presented automatically. Relevant comparison images may have been acquired from a variety of modalities and specialties. The Anatomic Region Sequence data element provides a basis for efficient comparison in both instances. Interoperability: as patients move between health care systems, it is important to minimize friction for data transfer. Health care providers and facilities need to be able to consume and review the increasingly large and complex volume of data efficiently. The use of Anatomic Region Sequence, or its equivalent, populated with standard values enables seamless interoperability of imaging data regardless of whether images are used within a site or across different sites and systems. Privacy: as more visible light photographs are integrated into electronic systems, it becomes apparent that some images may need to be sequestered.
Although additional work is needed to protect sensitive images, standard coded values in Anatomic Region Sequence support the identification of potentially sensitive images, enable facilities to create access control policies, and can be used as an interim surrogate for more sophisticated rule-based or attribute-based access control mechanisms. To satisfy such use cases, the HIMSS-SIIM Enterprise Imaging Community encourages the use of a pre-existing body part ontology. Through this white paper, we will identify potential challenges in employing this standard and provide potential solutions for these challenges.
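The comparison-retrieval use case above can be sketched in a few lines: each image carries an Anatomic Region Sequence code triplet (Code Value, Coding Scheme Designator, Code Meaning), and priors are matched on the coded value rather than free text. The code values and scheme designator below are illustrative placeholders, not verified entries from a real coding scheme:

```python
def anatomic_region(code_value, scheme, meaning):
    """Model the DICOM Anatomic Region Sequence code triplet."""
    return {"CodeValue": code_value,
            "CodingSchemeDesignator": scheme,
            "CodeMeaning": meaning}

def relevant_priors(current, archive):
    """Find prior images whose coded anatomic region matches the current
    image, regardless of the acquiring modality or specialty."""
    key = (current["region"]["CodeValue"],
           current["region"]["CodingSchemeDesignator"])
    return [img for img in archive
            if (img["region"]["CodeValue"],
                img["region"]["CodingSchemeDesignator"]) == key]
```

Matching on (Code Value, Coding Scheme Designator) pairs is what makes the lookup reliable across sites and systems; the same predicate could drive the privacy use case by sequestering images whose coded region is on a sensitive list.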

