Comparison of Keyword Search Techniques with Respect to Electronic Health Records

2021 ◽  
Vol 16 (4) ◽  
pp. 30-35
Author(s):  
Prachi Gurav ◽  
Sanjeev Panandikar

As the world progresses towards automation, searching for data in large databases also needs to keep pace. When the database holds health data, even minute aspects need careful scrutiny. Keyword search techniques help extract data from large databases, and they fall into two groups: exact and approximate. A user searching through electronic health records (EHRs) expects a short search time. To this end, this work investigates the Metaphone (exact search) and Similar_Text (approximate search) techniques. We applied keyword search to data that includes symptoms and names of medicines. Our results indicate that Similar_Text achieves a better search time than Metaphone.
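To make the approximate-search idea concrete, the sketch below scores strings in the style of a similar-text measure: it finds the longest common substring, then recurses on the pieces to its left and right. The technique names in the abstract match PHP's built-in `metaphone` and `similar_text` functions, but the abstract does not say which implementation was used, so this Python version, the `approximate_search` helper, the 80% threshold, and the sample medicine names are all illustrative assumptions.

```python
def similar_text(first: str, second: str) -> int:
    """Count matching characters: longest common substring, then recurse
    on the unmatched pieces to its left and to its right."""
    if not first or not second:
        return 0
    best_len = best_i = best_j = 0
    for i in range(len(first)):
        for j in range(len(second)):
            k = 0
            while (i + k < len(first) and j + k < len(second)
                   and first[i + k] == second[j + k]):
                k += 1
            if k > best_len:
                best_len, best_i, best_j = k, i, j
    if best_len == 0:
        return 0
    return (best_len
            + similar_text(first[:best_i], second[:best_j])
            + similar_text(first[best_i + best_len:], second[best_j + best_len:]))


def similarity_percent(first: str, second: str) -> float:
    """Matching characters as a percentage of the combined length."""
    total = len(first) + len(second)
    return 200.0 * similar_text(first, second) / total if total else 0.0


def approximate_search(query: str, terms: list[str], threshold: float = 80.0) -> list[str]:
    """Return terms whose similarity to the query meets the (assumed) threshold,
    most similar first."""
    scored = [(similarity_percent(query.lower(), t.lower()), t) for t in terms]
    return [t for score, t in sorted(scored, reverse=True) if score >= threshold]


if __name__ == "__main__":
    medicines = ["paracetamol", "ibuprofen", "aspirin"]  # hypothetical records
    print(approximate_search("paracetmol", medicines))
```

A misspelled query such as "paracetmol" still retrieves "paracetamol" here, which an exact string comparison would miss. The nested loops make each comparison costly for long strings, so timing results depend heavily on term length and dictionary size.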

2018 ◽  
Author(s):  
Temesgen Hailemariam Dadi ◽  
Enrico Siragusa ◽  
Vitor C. Piro ◽  
Andreas Andrusch ◽  
Enrico Seiler ◽  
...  

Abstract. Motivation: Mapping-based approaches have become limited in their application to very large sets of references, since computing an FM-index for very large databases (e.g. > 10 GB) has become a bottleneck. This affects many analyses that need such an index as an essential step for approximate matching of NGS reads to reference databases. For instance, in typical metagenomics analysis, the size of the reference sequences has become prohibitive for computing a single full-text index on standard machines. Even on large-memory machines, computing such an index takes about one day of computing time. As a result, updates of indices are rarely performed. Hence, it is desirable to create an alternative way of indexing while preserving fast search times. Results: To solve the index construction and update problem, we propose the DREAM (Dynamic seaRchablE pArallel coMpressed index) framework and provide an implementation. The main contribution is the introduction of approximate search distributor directories via a novel use of Bloom filters. We combine several Bloom filters to form an interleaved Bloom filter and use this new data structure to quickly exclude reads for parts of the databases where they cannot match. This allows us to keep the databases in several indices, which can be easily rebuilt if parts are updated, while maintaining a fast search time. The second main contribution is an implementation of DREAM-Yara, a distributed version of a fully sensitive read mapper under the DREAM framework. Availability: https://gitlab.com/pirovc/dream_yara/
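The interleaved Bloom filter idea can be sketched as follows. This is a minimal illustration, not the DREAM-Yara implementation: the class name, the use of `hashlib.blake2b` as the hash family, the k-mer length, and the `min_hits` threshold are all assumptions. Each row stores one bit per database bin, so a single lookup answers "which bins might contain this k-mer?" for all bins at once.

```python
import hashlib


class InterleavedBloomFilter:
    """One Bloom filter per bin, stored interleaved: rows[i] is an integer
    whose bit b equals bit i of bin b's Bloom filter."""

    def __init__(self, num_bins: int, bits_per_bin: int, num_hashes: int = 3):
        self.num_bins = num_bins
        self.bits_per_bin = bits_per_bin
        self.num_hashes = num_hashes
        self.rows = [0] * bits_per_bin

    def _positions(self, kmer: str):
        # Derive num_hashes row positions from salted BLAKE2b digests.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(kmer.encode(), digest_size=8,
                                     salt=bytes([seed])).digest()
            yield int.from_bytes(digest, "big") % self.bits_per_bin

    def add(self, kmer: str, bin_id: int) -> None:
        for pos in self._positions(kmer):
            self.rows[pos] |= 1 << bin_id

    def query(self, kmer: str) -> int:
        """Bitmask of bins that may contain the k-mer (false positives
        are possible, false negatives are not)."""
        mask = (1 << self.num_bins) - 1
        for pos in self._positions(kmer):
            mask &= self.rows[pos]
        return mask


def build_index(references: list[str], k: int, bits_per_bin: int = 1024) -> InterleavedBloomFilter:
    """Insert every k-mer of each reference into that reference's bin."""
    ibf = InterleavedBloomFilter(len(references), bits_per_bin)
    for bin_id, seq in enumerate(references):
        for start in range(len(seq) - k + 1):
            ibf.add(seq[start:start + k], bin_id)
    return ibf


def candidate_bins(ibf: InterleavedBloomFilter, read: str, k: int, min_hits: int) -> list[int]:
    """Bins in which at least min_hits of the read's k-mers are (possibly) present."""
    counts = [0] * ibf.num_bins
    for start in range(len(read) - k + 1):
        hits = ibf.query(read[start:start + k])
        for b in range(ibf.num_bins):
            if (hits >> b) & 1:
                counts[b] += 1
    return [b for b, c in enumerate(counts) if c >= min_hits]


if __name__ == "__main__":
    refs = ["ACGTACGTTAGCCGATAGGCTA", "TTGACCAGTAGGACCATTTGAC"]  # toy sequences
    index = build_index(refs, k=4)
    print(candidate_bins(index, "CGATAGGC", k=4, min_hits=5))
```

Because the read is taken verbatim from the first toy reference, bin 0 is always among the candidates; other bins may appear only through Bloom filter false positives. Only the candidate bins then need to be searched with a full index, which is what lets each bin's index be rebuilt independently.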


2021 ◽  
Author(s):  
Nawar Shara ◽  
Kelley M. Anderson ◽  
Noor Falah ◽  
Maryam F. Ahmad ◽  
Darya Tavazoei ◽  
...  

BACKGROUND Healthcare data are fragmenting as patients seek care from diverse sources. Consequently, patient care is negatively impacted by disparate health records. Machine learning (ML) offers a disruptive force in its ability to inform and improve patient care and outcomes [6]. However, the differences that exist in each individual’s health records, combined with the lack of health-data standards, in addition to systemic issues that render the data unreliable and that fail to create a single view of each patient, create challenges for ML. While these problems exist throughout healthcare, they are especially prevalent within maternal health, and exacerbate the maternal morbidity and mortality (MMM) crisis in the United States. OBJECTIVE Maternal patient records were extracted from the electronic health records (EHRs) of a large tertiary healthcare system and made into patient-specific, complete datasets through a systematic method so that a machine-learning-based (ML-based) risk-assessment algorithm could effectively identify maternal cardiovascular risk prior to evidence of diagnosis or intervention within the patient’s record. METHODS We outline the effort that was required to define the specifications of the computational systems, the dataset, and access to relevant systems, while ensuring data security, privacy laws, and policies were met. Data acquisition included the concatenation, anonymization, and normalization of health data across multiple EHRs in preparation for its use by a proprietary risk-stratification algorithm designed to establish patient-specific baselines to identify and establish cardiovascular risk based on deviations from the patient’s baselines to inform early interventions. RESULTS Patient records can be made actionable for the goal of effectively employing machine learning (ML), specifically to identify cardiovascular risk in pregnant patients. 
CONCLUSIONS Upon acquiring data, including the concatenation, anonymization, and normalization of said data across multiple EHRs, the use of a machine-learning-based (ML-based) tool can provide early identification of cardiovascular risk in pregnant patients. CLINICALTRIAL N/A


2018 ◽  
Vol 27 (4) ◽  
pp. 555-563
Author(s):  
M. Priya ◽  
R. Kalpana

Abstract. Sophisticated search mechanisms are required to meet the needs of search engine users probing voluminous web databases. Matching query keywords with a probabilistic approach is attractive in many application areas, such as spell checking and data cleaning, because it allows approximate search. A probabilistic approach with maximum likelihood estimation is used to handle real-world problems; however, it suffers from overfitting. In this paper, a rule-based approach to keyword searching is presented. The process consists of two phases: the rule generation phase and the learning phase. The rule generation phase uses a new technique called N-Gram based Edit distance (NGE) to generate the rule dictionary. A Turing machine model is used to describe rule generation with the NGE technique. In the learning phase, a log model with maximum a posteriori estimation is used to select the best rule. When evaluated in real time, our system produces the best result in terms of efficiency and accuracy.
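The abstract does not define NGE, so the sketch below only illustrates the two standard building blocks its name refers to: character n-grams and Levenshtein edit distance. Combining them as "filter by shared n-grams, then rank by edit distance" is an assumption made for illustration, as are the function names, the bigram size, and the distance cutoff.

```python
def char_ngrams(word: str, n: int = 2) -> set[str]:
    """All contiguous character n-grams of a word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def suggest(query: str, dictionary: list[str], n: int = 2, max_distance: int = 2) -> list[str]:
    """Keep words sharing at least one n-gram with the query, then rank
    them by edit distance and drop those beyond max_distance."""
    query_grams = char_ngrams(query, n)
    candidates = [w for w in dictionary if query_grams & char_ngrams(w, n)]
    scored = sorted((edit_distance(query, w), w) for w in candidates)
    return [w for dist, w in scored if dist <= max_distance]


if __name__ == "__main__":
    print(suggest("feaver", ["fever", "cough", "fatigue"]))
```

The n-gram filter avoids running the quadratic edit-distance computation against every dictionary entry, which is the usual reason for pairing the two techniques in approximate keyword search.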


Despite improvements in diagnosis and management, cardiovascular disease (CVD) is the leading cause of death and hospitalization throughout the world. The expansion of digital cardiology presents outstanding opportunities for clinicians, researchers, and health care administrators to improve outcomes and the sustainability of health systems. Big health data that combine electronic health records (EHRs) from diverse individuals across a wide variety of platforms may provide a real-time solution to questions and problems relating to health. Very large population studies based on EHRs are efficient and cost-effective, and offer an alternative to traditional research approaches. Indeed, digital cardiology can help researchers improve the diagnosis and management of CVD using dedicated algorithms that allow targeted and personalized CVD treatments.


2021 ◽  
Author(s):  
Joon-Hyop Lee ◽  
Suhyun Kim ◽  
Kwangsoo Kim ◽  
Young Jun Chai ◽  
Hyeong Won Yu ◽  
...  

BACKGROUND Post-thyroidectomy hypoparathyroidism may result in various transient or permanent symptoms, ranging from a tingling sensation to severe breathing difficulties. Its incidence varies among surgeons and institutions, making it difficult to determine its actual incidence and associated factors. OBJECTIVE This study attempted to estimate the incidence of post-operative hypoparathyroidism in patients at two tertiary institutions that share a common data model, the Observational Health Data Sciences and Informatics. METHODS This study used the Common Data Model to extract explicitly specified encoding and relationships among concepts using standardized vocabularies. The EDI-codes of various thyroid disorders and thyroid operations were extracted from two separate tertiary hospitals between January 2013 and December 2018. Patients were grouped into no evidence of/transient/permanent hypoparathyroidism groups to analyze the likelihood of hypoparathyroidism occurrence related to operation types and diagnosis. RESULTS Of the 4848 eligible patients at the two institutions who underwent thyroidectomy, 1370 (28.26%) experienced transient hypoparathyroidism and 251 (5.18%) experienced persistent hypoparathyroidism. Univariate logistic regression analysis indicated that, relative to total bilateral thyroidectomy, radical tumor resection was associated with a 48% greater likelihood of transient hypoparathyroidism and a 102% greater likelihood of persistent hypoparathyroidism. Moreover, multivariate logistic analysis found that radical tumor resection was associated with a 50% greater likelihood of transient hypoparathyroidism and a 97% greater likelihood of persistent hypoparathyroidism than total bilateral thyroidectomy. CONCLUSIONS These findings, by integrating and analyzing two databases, suggest that this analysis could be expanded to include other large databases that share the same Observational Health Data Sciences and Informatics protocol.
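The reported incidence percentages follow directly from the patient counts in the RESULTS. The short check below recomputes them; the function name is illustrative, and treating the three groups as mutually exclusive when deriving the "no evidence" count is an assumption based on how the METHODS describe the grouping.

```python
ELIGIBLE = 4848    # patients who underwent thyroidectomy
TRANSIENT = 1370   # transient hypoparathyroidism
PERSISTENT = 251   # persistent hypoparathyroidism


def incidence_percent(cases: int, total: int) -> float:
    """Incidence as a percentage, rounded to two decimals as in the abstract."""
    return round(100 * cases / total, 2)


# Assumes the three groups do not overlap (not stated explicitly in the abstract).
no_evidence = ELIGIBLE - TRANSIENT - PERSISTENT

if __name__ == "__main__":
    print(incidence_percent(TRANSIENT, ELIGIBLE))   # 28.26
    print(incidence_percent(PERSISTENT, ELIGIBLE))  # 5.18
    print(no_evidence)                              # 3227
```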


Author(s):  
Sam Goundar ◽  
Karpagam Masilamani ◽  
Akashdeep Bhardwaj ◽  
Chandramohan Dhasarathan

This chapter provides a better understanding of big data in healthcare, along with use cases. The healthcare industry generates a lot of data every day, and without proper analytical tools it is difficult to extract meaningful information. It is essential to understand big data tools, since traditional systems cannot manage this volume, and big data technology addresses the major issue of handling massive healthcare data. Health data from numerous health records are collected from various sources and combined to form big data. Conventional databases cannot be used for this purpose because the diversity of data formats makes the data difficult to merge and nearly impossible to process. Big data tools solve this problem and can process highly variable data from different sources.


Author(s):  
Luan Ibraimi ◽  
Qiang Tang ◽  
Pieter Hartel ◽  
Willem Jonker

Commercial Web-based Personal Health Record (PHR) systems can help patients share their personal health records (PHRs) anytime, from anywhere. PHRs are very sensitive data, and an inappropriate disclosure may cause serious problems for an individual. Therefore, commercial Web-based PHR systems have to ensure that patient health data is secured using state-of-the-art mechanisms. In current commercial PHR systems, even though patients have the power to define the access control policy on who can access their data, they have to trust the access-control manager of the commercial PHR system entirely to enforce these policies properly. Patients therefore hesitate to upload their health data to these systems, as the data is processed unencrypted on untrusted platforms. Recent proposals use encryption techniques to enforce access control policies. In such systems, information is stored in encrypted form by the third party and there is no need for an access control manager. This implies that data remains confidential even if the database maintained by the third party is compromised. In this paper we propose a new encryption technique, called a type-and-identity-based proxy re-encryption scheme, which is suitable for use in the healthcare setting. The proposed scheme allows users (patients) to securely store their PHRs on commercial Web-based PHR systems and securely share their PHRs with other users (doctors).
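The general mechanism of proxy re-encryption can be illustrated with a toy ElGamal-style construction. This is not the type-and-identity-based scheme proposed in the paper, which the abstract does not specify; it is a textbook-style sketch with deliberately tiny, insecure parameters. The group constants, function names, and sample keys are assumptions chosen so the arithmetic is easy to follow. It requires Python 3.8+ for modular inverses via `pow(x, -1, m)`.

```python
# Toy parameters (insecure, for illustration only).
P = 23  # safe prime, P = 2*Q + 1
Q = 11  # prime order of the subgroup generated by G
G = 4   # generator of the order-Q subgroup of Z_P*


def public_key(secret: int) -> int:
    return pow(G, secret, P)


def encrypt(message: int, pub: int, r: int) -> tuple[int, int]:
    """Ciphertext (m * g^r, pub^r); r must be a fresh random value in practice."""
    return (message * pow(G, r, P)) % P, pow(pub, r, P)


def re_encryption_key(secret_a: int, secret_b: int) -> int:
    """Key b/a mod Q that lets a proxy turn A's ciphertexts into B's.
    Computing it here from both secrets is a simplification."""
    return (secret_b * pow(secret_a, -1, Q)) % Q


def re_encrypt(ciphertext: tuple[int, int], rk: int) -> tuple[int, int]:
    """The proxy transforms the ciphertext without learning the message."""
    c1, c2 = ciphertext
    return c1, pow(c2, rk, P)


def decrypt(ciphertext: tuple[int, int], secret: int) -> int:
    c1, c2 = ciphertext
    g_r = pow(c2, pow(secret, -1, Q), P)  # recover g^r
    return (c1 * pow(g_r, -1, P)) % P


if __name__ == "__main__":
    patient, doctor = 3, 5                 # hypothetical secret keys
    record = 7                             # message encoded as a group element
    ct = encrypt(record, public_key(patient), r=6)
    shared = re_encrypt(ct, re_encryption_key(patient, doctor))
    print(decrypt(ct, patient), decrypt(shared, doctor))
```

The storage provider in this sketch only ever holds ciphertexts and the re-encryption key, never the plaintext record, which mirrors the abstract's point that data stays confidential even if the third-party database is compromised.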


2020 ◽  
Vol 12 (9) ◽  
pp. 142
Author(s):  
Zhijun Wu ◽  
Bohua Cui

To address the low interconnection efficiency caused by the wide variety of data in SWIM (System-Wide Information Management) and by inconsistent data naming methods, this paper proposes a new hybrid data naming scheme with a TLC (Type-Length-Content) structure, combined with Bloom filters. This solution can meet the uniqueness and durability requirements of SWIM data names, solve the “suffix loopholes” encountered in prefix-based route aggregation in hierarchical naming, and realize scalable and effective route state aggregation. Simulation results show that the hybrid naming scheme outperforms prefix-based aggregation in the probability of route identification errors. In terms of search time, this scheme has increased by 17.8% and 18.2%, respectively, compared with the commonly used hierarchical and flat naming methods. Compared with the same two naming methods, scalability has increased by 19.1% and 18.4%, respectively.

