Distributed NLP Framework to create new federated data-sets

Author(s):  
Simon Thompson

Introduction: Although healthcare systems generate significant amounts of structured data, there remains an untapped wealth of unstructured narrative data. In the UK, 70% of all NHS digital information is in unstructured form. The NHS has no plans to computerise this data, as it simply would not be cost-effective. Objectives and Approach: Our aim was to make all digitised free text within partner organisations accessible for NLP processing for research, while overcoming information governance challenges. We developed a distributed GATE-based NLP platform enabling NLP models to be automatically distributed and materialised against the free text data in each organisation to create new conventional datasets, which can then be transmitted back using an established governance model. This work adds NLP capability to the UK’s National Research Data Appliances, deployed throughout Wales and beyond, and uses many open source components, enabling deployment without additional software licence costs and increasing the range of potential use cases. Results: We have been able to demonstrate a fully federated network of analytical nodes into NHS Wales, which takes the analytical NLP model to the free text data, as opposed to the data having to travel. Under a common, acceptable governance model, an approval system enables organisations such as health boards to give permission for projects and NLP models to be used against their data. In a proof of concept project, we have run a number of NLP models over large numbers of documents, which the platform has ingested, converted and analysed. We have developed a proposal for a common NLP model definition format to enable models to be interchangeable between different research groups and systems. Sharing/discovery of established NLP models is a key deliverable.
Conclusion/Implications: Being able to send the query to the data enables access to this untapped data source, finally enabling the realisation of new datasets while abiding by any IG framework. The low cost and simplicity will enable many research opportunities, some of which are already being realised.
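
The "model travels to the data" pattern described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Node` class, the `run_federated` helper, the model name `smoking_v1` and the regular-expression "model" are hypothetical stand-ins, and none of them reflects the actual GATE-based platform or its model definition format.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical analytical node for one organisation; free text stays here."""
    name: str
    documents: dict  # doc_id -> free text, held locally
    approved_models: set = field(default_factory=set)

    def run(self, model_name, model):
        # Governance check: the organisation must have approved this model.
        if model_name not in self.approved_models:
            return []
        # Only structured rows leave the node, never the free text itself.
        return [
            {"node": self.name, "doc_id": doc_id, **row}
            for doc_id, text in self.documents.items()
            for row in model(text)
        ]


def smoking_model(text):
    """Toy stand-in for an NLP model: flags a smoking mention with a regex."""
    if re.search(r"\bsmok(er|es|ing)\b", text, flags=re.IGNORECASE):
        return [{"concept": "smoking_mention"}]
    return []


def run_federated(nodes, model_name, model):
    """Send the model to every node and collect the structured results."""
    rows = []
    for node in nodes:
        rows.extend(node.run(model_name, model))
    return rows


nodes = [
    Node("board_a", {"d1": "Patient is a smoker.", "d2": "No complaints."}, {"smoking_v1"}),
    Node("board_b", {"d3": "Smoking cessation advised."}),  # has not approved the model
]
dataset = run_federated(nodes, "smoking_v1", smoking_model)
print(dataset)
```

In this sketch the second node holds a matching document but contributes nothing, because it has not approved the model; the returned dataset contains conventional rows only.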

2016 ◽  
Vol 38 (3) ◽  
pp. 585-590 ◽  
Author(s):  
Y. Ahmed-Little ◽  
V. Bothra ◽  
D. Cordwell ◽  
D. Freeman Powell ◽  
D. Ellis ◽  
...  

Background The burden of disease relating to undiagnosed HIV infection is significant in the UK. BHIVA (British HIV Association) recommends population screening in high prevalence areas, expanding outside traditional antenatal/GUM settings. Methods RUClear 2011–12 piloted expanding HIV testing outside traditional settings using home-sampling kits (dry-blood-spot testing) ordered online. Greater Manchester residents (≥age 16) could request testing via an established, online chlamydia testing service (www.ruclear.co.uk). Participant attitudes towards this new service were assessed. Qualitative methods (thematic analysis) were used to analyse free-text data submitted by participants via hard copy questionnaires issued in all testing kits. Results 79.9% (2447/3062) of participants completed questionnaires, of whom 30.9% (756/2447) provided free-text data. Participants overwhelmingly supported the service, particularly valuing its accessibility and convenience: individuals could order tests at any time of day and self-sample comfortably at home, avoiding the invasive nature of venipuncture and the need for face-to-face interaction with health services. The pilot was also clinically and cost-effective. Conclusion Testing via home-sampling kits ordered online (dry-blood-spot testing) was felt to be an acceptable and convenient method for accessing an HIV test. Many individuals undertook HIV testing where they would otherwise not have been tested at all. Expansion of similar services may increase the uptake of HIV testing.


Crime Science ◽  
2020 ◽  
Vol 9 (1) ◽  
Author(s):  
Daniel Birks ◽  
Alex Coleman ◽  
David Jackson

Abstract We present a novel exploratory application of unsupervised machine-learning methods to identify clusters of specific crime problems from unstructured modus operandi free-text data within a single administrative crime classification. To illustrate our proposed approach, we analyse police recorded free-text narrative descriptions of residential burglaries occurring over a two-year period in a major metropolitan area of the UK. Results of our analyses demonstrate that topic modelling algorithms are capable of clustering substantively different burglary problems without prior knowledge of such groupings. Subsequently, we describe a prototype dashboard that allows replication of our analytical workflow and could be applied to support operational decision making in the identification of specific crime problems. This approach to grouping distinct types of offences within existing offence categories, we argue, has the potential to support crime analysts in proactively analysing large volumes of modus operandi free-text data—with the ultimate aims of developing a greater understanding of crime problems and supporting the design of tailored crime reduction interventions.


2003 ◽  
Vol 2003 (1) ◽  
pp. 269-272
Author(s):  
David Salt ◽  
Roger Stockham ◽  
Stuart Byers

ABSTRACT Recent changes in legislation within the United Kingdom created pressure for change in the response strategies applicable in the UK offshore environment. To meet the new requirements, innovative technology was required which was capable of speedily delivering a payload of approximately one ton of dispersant. To provide a cost efficient solution, a system was developed capable of being mounted on a non-dedicated aircraft, which can be rapidly adapted to meet the response requirements. This paper describes the design criteria for the system and goes on to detail the development, construction and flight testing programme for the dispersant pods. It then goes on to briefly describe the operational response system which has been established to provide a response for the offshore operators in the United Kingdom Continental Shelf (UKCS). The development represents a significant step forward in providing a low cost, effective solution to changing response requirements using innovative engineering solutions, allowing for potential application in other parts of the world.


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 404
Author(s):  
Chan H. See ◽  
Kirill V. Horoshenkov ◽  
M. Tareq Bin Ali ◽  
Simon J. Tait

Combined sewer overflow structures (CSO) play an important role in sewer networks. When the local capacity of a sewer system is exceeded during intense rainfall events, they act as a “safety valve” and discharge excess rainfall run-off and wastewater directly to a natural receiving water body, thus preventing widespread urban flooding. There is a regulatory requirement that solids in CSO spills must be small and their amount strictly controlled. Therefore, a vast majority of CSOs in the UK contain screens. This paper presents the results of a feasibility study of using low-cost, low-energy acoustic sensors to remotely assess the condition of CSO screens to move to cost-effective reactive maintenance visits. In situ trials were carried out in several CSOs to evaluate the performance of the acoustic sensor under realistic screen and flow conditions. The results demonstrate that the system is robust within ±2.5% to work successfully in a live CSO environment. The observed changes in the screen condition resulted in 8–39% changes in the values of the coefficient in the proposed acoustic model. These changes are detectable and consistent with observed screen and hydraulic data. This study suggested that acoustic-based sensing can effectively monitor the CSO screen blockage conditions and hence reduce the risk of non-compliant CSO spills.


2020 ◽  
Author(s):  
Helen Harper ◽  
Amanda J. Burridge ◽  
Mark Winfield ◽  
Adam Finn ◽  
Andrew D. Davidson ◽  
...  

Abstract: Tracking genetic variations from positive SARS-CoV-2 samples yields crucial information about the number of variants circulating in an outbreak and the possible lines of transmission, but sequencing every positive SARS-CoV-2 sample would be prohibitively costly for population-scale test and trace operations. Genotyping is a rapid, high-throughput and low-cost alternative for screening positive SARS-CoV-2 samples in many settings. We have designed a SNP identification pipeline to identify genetic variation using sequenced SARS-CoV-2 samples. Our pipeline identifies a minimal marker panel that can define distinct genotypes. To evaluate the system, we developed a genotyping panel to detect variants (identified from SARS-CoV-2 sequences surveyed between March and May 2020) and tested this on 50 stored qRT-PCR positive SARS-CoV-2 clinical samples that had been collected across the South West of the UK in April 2020. The 50 samples split into 15 distinct genotypes and there was a 76% probability that any two randomly chosen samples from our set of 50 would have a distinct genotype. In a high throughput laboratory, qRT-PCR positive samples pooled into 384-well plates could be screened with our marker panel at a cost of < £1.50 per sample. Our results demonstrate the usefulness of a SNP genotyping panel to provide a rapid, cost-effective, and reliable way to monitor SARS-CoV-2 variants circulating in an outbreak. Our analysis pipeline is publicly available and will allow for marker panels to be updated periodically as viral genotypes arise or disappear from circulation.
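
The abstract does not say how the probability that two randomly chosen samples have distinct genotypes was computed. One common way, assumed here, is to draw two samples without replacement and subtract the chance that both share a genotype. The function name `prob_distinct` and the six genotype calls below are hypothetical illustration data, not the study's 50 samples.

```python
from collections import Counter


def prob_distinct(genotype_calls):
    """Probability that two samples drawn without replacement differ in genotype.

    Assumed formula: 1 - sum_i n_i * (n_i - 1) / (N * (N - 1)),
    where n_i is the number of samples with genotype i and N is the total.
    """
    counts = Counter(genotype_calls)
    n = len(genotype_calls)
    if n < 2:
        raise ValueError("need at least two samples")
    same = sum(c * (c - 1) for c in counts.values())
    return 1 - same / (n * (n - 1))


# Hypothetical calls: three samples of G1, two of G2, one of G3.
calls = ["G1", "G1", "G1", "G2", "G2", "G3"]
p = prob_distinct(calls)
print(round(p, 3))
```

With these made-up counts, 8 of the 30 ordered pairs share a genotype, so the function returns 22/30. The value rises as samples spread more evenly over more genotypes.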


PLoS ONE ◽  
2021 ◽  
Vol 16 (2) ◽  
pp. e0243185 ◽  
Author(s):  
Helen Harper ◽  
Amanda Burridge ◽  
Mark Winfield ◽  
Adam Finn ◽  
Andrew Davidson ◽  
...  

Tracking genetic variations from positive SARS-CoV-2 samples yields crucial information about the number of variants circulating in an outbreak and the possible lines of transmission, but sequencing every positive SARS-CoV-2 sample would be prohibitively costly for population-scale test and trace operations. Genotyping is a rapid, high-throughput and low-cost alternative for screening positive SARS-CoV-2 samples in many settings. We have designed a SNP identification pipeline to identify genetic variation using sequenced SARS-CoV-2 samples. Our pipeline identifies a minimal marker panel that can define distinct genotypes. To evaluate the system, we developed a genotyping panel to detect variants (identified from SARS-CoV-2 sequences surveyed between March and May 2020) and tested this on 50 stored qRT-PCR positive SARS-CoV-2 clinical samples that had been collected across the South West of the UK in April 2020. The 50 samples split into 15 distinct genotypes and there was a 61.9% probability that any two randomly chosen samples from our set of 50 would have a distinct genotype. In a high throughput laboratory, qRT-PCR positive samples pooled into 384-well plates could be screened with a marker panel at a cost of < £1.50 per sample. Our results demonstrate the usefulness of a SNP genotyping panel to provide a rapid, cost-effective, and reliable way to monitor SARS-CoV-2 variants circulating in an outbreak. Our analysis pipeline is publicly available and will allow for marker panels to be updated periodically as viral genotypes arise or disappear from circulation.
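
The idea of a small marker panel that still separates every genotype can be sketched with a greedy heuristic. This is not the authors' published pipeline: the function `greedy_marker_panel`, the four genotypes and their alleles are all hypothetical, and a greedy choice gives a small panel but is not guaranteed to give the smallest possible one.

```python
from itertools import combinations


def greedy_marker_panel(genotypes):
    """Pick SNP sites until every pair of genotypes differs at a chosen site.

    genotypes: dict mapping genotype name -> tuple of alleles, one per site.
    Returns (panel, unresolved): chosen site indices and any pairs that no
    site can tell apart.
    """
    names = sorted(genotypes)
    n_sites = len(genotypes[names[0]])
    unresolved = set(combinations(names, 2))
    panel = []

    def resolved_by(site):
        # Pairs still unresolved that this site would separate.
        return {(a, b) for a, b in unresolved
                if genotypes[a][site] != genotypes[b][site]}

    while unresolved:
        # Choose the site separating the most pairs; ties go to the lowest index.
        best = max(range(n_sites), key=lambda s: (len(resolved_by(s)), -s))
        gained = resolved_by(best)
        if not gained:
            break  # remaining pairs are identical at every site
        panel.append(best)
        unresolved -= gained
    return panel, unresolved


# Hypothetical genotypes over four SNP sites.
genotypes = {
    "G1": ("A", "C", "T", "G"),
    "G2": ("A", "T", "T", "G"),
    "G3": ("G", "C", "T", "G"),
    "G4": ("G", "T", "T", "A"),
}
panel, unresolved = greedy_marker_panel(genotypes)
print(panel, unresolved)
```

In this toy input, two of the four sites are enough to separate all four genotypes, which is the kind of reduction that makes per-sample genotyping cheaper than full sequencing.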


BMJ Open ◽  
2021 ◽  
Vol 11 (9) ◽  
pp. e045250
Author(s):  
Mike Bracher ◽  
Banyana C Madi-Segwagwe ◽  
Emma Winstanley ◽  
Helen Gillan ◽  
Tracy Long-Sutehall

Objectives: Long-standing undersupply of eye tissue exists both in the UK and globally, and the UK National Health Service Blood and Transplant Service (NHSBT) has called for further research exploring barriers to eye donation. This study aims to: (1) describe reported reasons for non-donation of eye tissue from solid organ donors in the UK between 1 April 2014 and 31 March 2017 and (2) discuss these findings with respect to existing theories relating to non-donation of eyes by family members. Design: Secondary analysis of a national primary data set of recorded reasons for non-donation of eyes from 2790 potential solid organ donors. Data analysis including descriptive statistics and qualitative content analysis of free-text data for 126 recorded cases of family decline of eye donation. Setting: National data set covering solid organ donation (secondary care). Participants: 2790 potential organ donors were assessed for eye donation eligibility between 1 April 2014 and 31 March 2017. Results: Reasons for non-retrieval of eyes were recorded as: family wishes (n=1339, 48% of total cases); medical reasons (n=841, 30%); deceased's wishes (n=180, 7%). In >50% of recorded cases, reasons for non-donation were based on the family’s knowledge of the deceased's wishes, their perception of the deceased's wishes and specific concerns regarding processes or effects of eye donation (for the deceased's body). Findings are discussed with respect to the existing theoretical perspectives. Conclusion: Eye donation involves distinct psychological and sociocultural factors for families and HCPs that have not been fully explored in research or integrated into service design.
We propose areas for future research and service development, including: the potential of retrieving only corneal discs, as opposed to full eyes, to reduce disfigurement concerns; public education regarding donation processes; exploration of how request processes potentially influence acceptance of eye donation; and procedures for assessment of familial responses to information provided during consent conversations.


Author(s):  
Sheng-Jun Huang ◽  
Jia-Lve Chen ◽  
Xin Mu ◽  
Zhi-Hua Zhou

In traditional active learning, there is only one labeler that always returns the ground truth of queried labels. However, in many applications, multiple labelers are available to offer diverse qualities of labeling with different costs. In this paper, we perform active selection on both instances and labelers, aiming to achieve the greatest improvement in the classification model at the lowest cost. While the cost of a labeler is proportional to its overall labeling quality, we also observe that different labelers usually have diverse expertise, and thus it is likely that labelers with a low overall quality can provide accurate labels on some specific instances. Based on this fact, we propose a novel active selection criterion to evaluate the cost-effectiveness of instance-labeler pairs, which ensures that the selected instance is helpful for improving the classification model, while the selected labeler can provide an accurate label for the instance at a relatively low cost. Experiments on both UCI and real crowdsourcing data sets demonstrate the superiority of our proposed approach in selecting cost-effective queries.
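
The abstract does not give the selection criterion itself, so the following is only a sketch of the general idea of scoring instance-labeler pairs by value per unit cost. The scoring rule (model uncertainty times estimated labeler accuracy, divided by labeler cost), the function `select_pair` and all numbers are assumptions for illustration, not the authors' method.

```python
def select_pair(uncertainty, est_accuracy, cost):
    """Return the (instance, labeler) pair with the highest value per unit cost.

    uncertainty:  dict instance -> how unsure the current model is (0 to 1)
    est_accuracy: dict (instance, labeler) -> estimated chance the labeler
                  labels this particular instance correctly
    cost:         dict labeler -> positive cost of one query
    """
    best_pair, best_score = None, float("-inf")
    for (instance, labeler), accuracy in est_accuracy.items():
        # Assumed score: informative instance, reliable labeler, low price.
        score = uncertainty[instance] * accuracy / cost[labeler]
        if score > best_score:
            best_pair, best_score = (instance, labeler), score
    return best_pair, best_score


# Hypothetical data: an expensive expert and a cheap novice.
uncertainty = {"x1": 0.9, "x2": 0.4}
cost = {"expert": 5.0, "novice": 1.0}
est_accuracy = {
    ("x1", "expert"): 0.95,
    ("x1", "novice"): 0.60,
    ("x2", "expert"): 0.95,
    ("x2", "novice"): 0.90,
}
pair, score = select_pair(uncertainty, est_accuracy, cost)
print(pair, score)
```

Under these made-up numbers the cheap novice is chosen for the most uncertain instance, which mirrors the observation that a labeler with lower overall quality can still be the cost-effective choice on specific instances.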


2021 ◽  
Author(s):  
Emily J Harrop ◽  
Silvia Goss ◽  
Damian JJ Farnell ◽  
Mirella Longo ◽  
Anthony Byrne ◽  
...  

Background: The COVID-19 pandemic is a mass bereavement event which has profoundly disrupted grief experiences. Understanding support needs and access to support among people bereaved at this time is crucial to ensuring appropriate bereavement support infrastructure. Aim: To investigate grief experiences, support needs and use of formal and informal bereavement support among people bereaved during the pandemic. Design: Baseline results from a longitudinal survey. Support needs and experiences of accessing support are reported using descriptive statistics and thematic analysis of free-text data. Setting/Participants: 711 adults bereaved in the UK between March-December 2020, recruited via media, social media, national associations and community/charitable organisations. Results: High-level needs for emotional support were identified. Most participants had not sought support from bereavement services (59%, n=422) or their GP (60%, n=428). Of participants who had sought such support, over half experienced difficulties accessing bereavement services (56%, n=149)/GP support (52%, n=135). 51% reported high/severe vulnerability in grief; among these, 74% were not accessing bereavement or mental-health services. Barriers included limited availability, lack of appropriate support, discomfort asking for help, and not knowing how to access services. 39% (n=279) experienced difficulties getting support from family/friends, including relational challenges, little face-to-face contact, and disrupted collective mourning. The perceived uniqueness of pandemic bereavement and wider societal strains exacerbated their isolation. Conclusions: People bereaved during the pandemic have high levels of support needs alongside difficulties accessing support. We recommend increased provision and tailoring of bereavement services, improved information on support options, and social/educational initiatives to bolster informal support and ameliorate isolation.


2020 ◽  
Vol 46 (6) ◽  
pp. 367-377 ◽  
Author(s):  
Elizabeth Ford ◽  
Malcolm Oswald ◽  
Lamiece Hassan ◽  
Kyle Bozentko ◽  
Goran Nenadic ◽  
...  

Background: Use of routinely collected patient data for research and service planning is an explicit policy of the UK National Health Service and UK government. Much clinical information is recorded in free-text letters, reports and notes. These text data are generally lost to research, due to the increased privacy risk compared with structured data. We conducted a citizens’ jury which asked members of the public whether their medical free-text data should be shared for research for public benefit, to inform an ethical policy. Methods: Eighteen citizens took part over 3 days. Jurors heard a range of expert presentations as well as arguments for and against sharing free text, and then questioned presenters and deliberated together. They answered a questionnaire on whether and how free text should be shared for research, gave reasons for and against sharing and suggestions for alleviating their concerns. Results: Jurors were in favour of sharing medical data and agreed this would benefit health research, but were more cautious about sharing free-text than structured data. They preferred processing of free text where a computer extracted information at scale. Their concerns were lack of transparency in uses of data, and privacy risks. They suggested keeping patients informed about uses of their data, and giving clear pathways to opt out of data sharing. Conclusions: Informed citizens suggested a transparent culture of research for the public benefit, and continuous improvement of technology to protect patient privacy, to mitigate their concerns regarding privacy risks of using patient text data.

