EDGE20: A Cross Spectral Evaluation Dataset for Multiple Surveillance Problems

Author(s):  
Ha Le ◽  
Christos Smailis ◽  
Lei Shi ◽  
Ioannis Kakadiaris
2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Michael Rutherford ◽  
Seong K. Mun ◽  
Betty Levine ◽  
William Bennett ◽  
Kirk Smith ◽  
...  

Abstract We developed a DICOM dataset that can be used to evaluate the performance of de-identification algorithms. DICOM objects (a total of 1,693 CT, MRI, PET, and digital X-ray images) were selected from datasets published in the Cancer Imaging Archive (TCIA). Synthetic Protected Health Information (PHI) was generated and inserted into selected DICOM Attributes to mimic typical clinical imaging exams. The DICOM Standard and TCIA curation audit logs guided the insertion of synthetic PHI into standard and non-standard DICOM data elements. A TCIA curation team tested the utility of the evaluation dataset. With this publication, the evaluation dataset (containing synthetic PHI) and de-identified evaluation dataset (the result of TCIA curation) are released on TCIA in advance of a competition, sponsored by the National Cancer Institute (NCI), for algorithmic de-identification of medical image datasets. The competition will use a much larger evaluation dataset constructed in the same manner. This paper describes the creation of the evaluation datasets and guidelines for their use.
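As a rough illustration of how a paired dataset of this kind (one copy containing synthetic PHI, one curated copy) could be used to score a de-identification algorithm, the minimal sketch below compares attribute values before and after processing. It is not the authors' evaluation procedure: the function `score_deidentification`, the plain-dictionary representation of DICOM attributes, and all attribute values are assumptions made up for this example.

```python
# Hypothetical sketch: score a de-identification run against an answer key.
# Attribute names and values are illustrative, not taken from the dataset.

def score_deidentification(original, deidentified, phi_keys):
    """Return (removed, leaked) lists of attribute names.

    original:     dict of attribute -> value containing synthetic PHI
    deidentified: dict of attribute -> value produced by the algorithm under test
    phi_keys:     attributes known to hold synthetic PHI (the answer key)
    """
    removed, leaked = [], []
    for key in sorted(phi_keys):
        before = original.get(key)
        after = deidentified.get(key)
        # PHI counts as leaked if the original value survives unchanged.
        if after is not None and after == before:
            leaked.append(key)
        else:
            removed.append(key)
    return removed, leaked


original = {
    "PatientName": "DOE^JANE",       # synthetic value invented for this example
    "PatientID": "123456",
    "StudyDescription": "CT CHEST",
}
deidentified = {
    "PatientName": "ANON",
    "PatientID": "123456",           # the algorithm under test missed this one
    "StudyDescription": "CT CHEST",
}

removed, leaked = score_deidentification(
    original, deidentified, phi_keys={"PatientName", "PatientID"}
)
# Fraction of known PHI attributes that were actually changed or removed.
recall = len(removed) / (len(removed) + len(leaked))
print(removed, leaked, recall)
```

A real evaluation would read the attributes from the DICOM files themselves and would also need to check non-standard (private) data elements, which the abstract notes were seeded with synthetic PHI as well.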


Author(s):  
J A Hall ◽  
R J Harris ◽  
A Zaidi ◽  
S C Woodhall ◽  
G Dabrera ◽  
...  

Abstract
Background: Household transmission of SARS-CoV-2 is an important component of the community spread of the pandemic. Little is known about the factors associated with household transmission, at the level of the case, contact or household, or how these have varied over the course of the pandemic.
Methods: The Household Transmission Evaluation Dataset (HOSTED) is a passive surveillance system linking laboratory-confirmed COVID-19 cases to individuals living in the same household in England. We explored the risk of household transmission according to age of case and contact, sex, region, deprivation, month and household composition between April and September 2020, and built a multivariate model.
Results: In the period studied, on average, 5.5% of household contacts in England were diagnosed as cases. Household transmission was most common between adult cases and contacts of a similar age. There was some evidence of lower transmission rates to under-16s [adjusted odds ratio (aOR) 0.70, 95% confidence interval (CI) 0.66–0.74]. There were clear regional differences, with higher rates of household transmission in the north of England and the Midlands. Less deprived areas had a lower risk of household transmission. After controlling for region, there was no effect of deprivation, but houses of multiple occupancy had lower rates of household transmission [aOR 0.74 (0.66–0.83)].
Conclusions: Children are less likely to acquire SARS-CoV-2 via household transmission, and consequently there was no difference in the risk of transmission in households with children. Households in which cases could isolate effectively, such as houses of multiple occupancy, had lower rates of household transmission. Policies to support the effective isolation of cases from their household contacts could lower the level of household transmission.


2017 ◽  
Vol 7 (6) ◽  
pp. 1802-1809 ◽  
Author(s):  
Pedro M. Rodrigo ◽  
Eduardo F. Fernandez ◽  
Marios Theristis ◽  
Florencia Almonacid Cruz

2020 ◽  
Vol 34 (05) ◽  
pp. 8592-8599
Author(s):  
Sheena Panthaplackel ◽  
Milos Gligoric ◽  
Raymond J. Mooney ◽  
Junyi Jessy Li

Comments are an integral part of software development; they are natural language descriptions associated with source code elements. Understanding explicit associations can be useful in improving code comprehensibility and maintaining the consistency between code and comments. As an initial step towards this larger goal, we address the task of associating entities in Javadoc comments with elements in Java source code. We propose an approach for automatically extracting supervised data using revision histories of open source projects and present a manually annotated evaluation dataset for this task. We develop a binary classifier and a sequence labeling model by crafting a rich feature set which encompasses various aspects of code, comments, and the relationships between them. Experiments show that our systems outperform several baselines learning from the proposed supervision.

