test set
Recently Published Documents





Xiaomian Kang ◽  
Yang Zhao ◽  
Jiajun Zhang ◽  
Chengqing Zong

Document-level neural machine translation (DocNMT) has yielded attractive improvements. In this article, we systematically analyze the discourse phenomena in Chinese-to-English translation and focus on the most obvious one, namely lexical translation consistency. To alleviate lexical inconsistency, we propose an effective approach that is aware of the words that need to be translated consistently and constrains the model to produce more consistent translations. Specifically, we first introduce a global context extractor to extract the document context and the consistency context. Then, the two types of global context are integrated into an encoder enhancer and a decoder enhancer to improve lexical translation consistency. We create a test set to evaluate lexical consistency automatically. Experiments demonstrate that our approach can significantly alleviate lexical translation inconsistency. In addition, our approach can also substantially improve translation quality compared to the sentence-level Transformer.

2022 ◽  
Vol 11 ◽  
Huangqi Zhang ◽  
Binhao Zhang ◽  
Wenting Pan ◽  
Xue Dong ◽  
Xin Li ◽  

Purpose: This study aimed to develop a repeatable MRI-based machine learning model to differentiate between low-grade gliomas (LGGs) and glioblastoma (GBM) and provide more clinical information to improve treatment decision-making. Methods: Preoperative MRIs of gliomas from The Cancer Imaging Archive (TCIA)–GBM/LGG database were selected. The tumor on contrast-enhanced MRI was segmented. Quantitative image features were extracted from the segmentations. A random forest classification algorithm was used to establish a model in the training set. In the test phase, a random forest model was tested using an external test set. Three radiologists reviewed the images for the external test set. The area under the receiver operating characteristic curve (AUC) was calculated. The AUCs of the radiomics model and radiologists were compared. Results: The random forest model was fitted using a training set consisting of 142 patients [mean age, 52 years ± 16 (standard deviation); 78 men] comprising 88 cases of GBM. The external test set included 25 patients (14 with GBM). Random forest analysis yielded an AUC of 1.00 [95% confidence interval (CI): 0.86–1.00]. The AUCs for the three readers were 0.92 (95% CI 0.74–0.99), 0.70 (95% CI 0.49–0.87), and 0.59 (95% CI 0.38–0.78). Statistical differences were only found between AUC and Reader 1 (1.00 vs. 0.92, respectively; p = 0.16). Conclusion: An MRI radiomics-based random forest model was proven useful in differentiating GBM from LGG and showed better diagnostic performance than that of two inexperienced radiologists.
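The AUC used to compare the model with the readers can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The following is a minimal sketch of that pairwise definition, not the study's code; the function name `roc_auc` and the toy labels and scores are hypothetical and are not data from the abstract.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 ground-truth classes (1 = positive, e.g. GBM).
    scores: iterable of model scores, higher meaning "more likely positive".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: one positive case is ranked below one negative.
toy_auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
```

With only 25 external cases, a single misranked pair moves this estimate noticeably, which is consistent with the wide confidence intervals reported above.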

2022 ◽  
Vol 11 ◽  
Haolin Yin ◽  
Yu Jiang ◽  
Zihan Xu ◽  
Wenjun Huang ◽  
Tianwu Chen ◽  

Background and Purpose: Breast ductal carcinoma in situ (DCIS) has no metastatic potential, and has better clinical outcomes compared with invasive breast cancer (IBC). Convolutional neural networks (CNNs) can adaptively extract features and may achieve higher efficiency in apparent diffusion coefficient (ADC)-based tumor invasion assessment. This study aimed to determine the feasibility of constructing an ADC-based CNN model to discriminate DCIS from IBC. Methods: The study retrospectively enrolled 700 patients with primary breast cancer between March 2006 and June 2019 from our hospital, and randomly selected 560 patients as the training and validation sets (ratio of 3 to 1), and 140 patients as the internal test set. An independent external test set of 102 patients between July 2019 and May 2021 from a different scanner of our hospital was selected as the primary cohort using the same criteria. In each set, the status of tumor invasion was confirmed by pathologic examination. The CNN model was constructed to discriminate DCIS from IBC using the training and validation sets. The CNN model was evaluated using the internal and external test sets, and compared with the discriminating performance using the mean ADC. The area under the curve (AUC), sensitivity, specificity, and accuracy were calculated to evaluate the performance of the previous model. Results: The AUCs of the ADC-based CNN model using the internal and external test sets were larger than those of the mean ADC (AUC: 0.977 vs. 0.866, P = 0.001; and 0.926 vs. 0.845, P = 0.096, respectively). Regarding the internal test set and external test set, the ADC-based CNN model yielded sensitivities of 0.893 and 0.873, specificities of 0.929 and 0.894, and accuracies of 0.907 and 0.902, respectively. Regarding the two test sets, the mean ADC showed sensitivities of 0.845 and 0.818, specificities of 0.821 and 0.829, and accuracies of 0.836 and 0.824, respectively. Using the ADC-based CNN model, the prediction only takes approximately one second for a single lesion. Conclusion: The ADC-based CNN model can improve the differentiation of IBC from DCIS with higher accuracy and less time.
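Sensitivity, specificity, and accuracy all derive from the four cells of a confusion matrix. The sketch below shows that arithmetic under stated assumptions: the function name `diagnostic_metrics` is hypothetical, and the counts are illustrative only, since the abstract does not report the confusion matrices.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp / fn: positive cases classified correctly / incorrectly.
    tn / fp: negative cases classified correctly / incorrectly.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)          # true-positive rate
    specificity = tn / (tn + fp)          # true-negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Illustrative counts only (NOT taken from the study).
sens, spec, acc = diagnostic_metrics(tp=8, fn=2, tn=9, fp=1)
```

Accuracy depends on the class mix of the test set, whereas sensitivity and specificity do not, so the three figures are best read together when comparing the internal and external sets.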

2022 ◽  
James Devasia ◽  
Hridyanand Goswami ◽  
Subitha Lakshminarayanan ◽  
Manju Rajaram ◽  
Subathra Adithan ◽  

Chest X-ray based diagnosis of active Tuberculosis (TB) is one of the oldest ubiquitous tests in medical practice. Artificial Intelligence (AI) based automated detection of abnormality in chest radiography is crucial in radiology workflow. Most deep convolutional neural networks (DCNNs) for diagnosing TB are built by transfer learning from natural images and use the same dataset to evaluate model performance and diagnostic accuracy. However, dataset shift is a known issue in predictive models in AI, and it remains unexplored. In this work, we fine-tuned, validated, and tested two benchmark architectures and utilized the transfer learning methodology to measure the diagnostic accuracy on cross-population datasets. We achieved a remarkable classification accuracy of 100% and an area under the receiver operating characteristic curve (AUC) of 1.000 [1.000 – 1.000] (with a sensitivity of 0.985 [0.971 – 1.000] and a specificity of 0.986 [0.971 – 1.000]) on the intramural test set, but a significant drop on the extramural test sets. Accuracy on the various extramural test sets varied from 50% to 70%, and AUC ranged from 0.527 to 0.865 (sensitivity and specificity fluctuated between 0.394 and 0.995 and between 0.443 and 0.864, respectively). The diagnostic performance on the intramural test set observed in this study shows that DCNNs can accurately classify active TB and normal chest radiographs; however, the external test sets show that DCNNs trained on a specific population dataset are less likely to generalize well.

2022 ◽  
Vol 11 ◽  
Elena Bertelli ◽  
Laura Mercatelli ◽  
Chiara Marzi ◽  
Eva Pachetti ◽  
Michela Baccini ◽  

Prostate cancer (PCa) is the most frequent male malignancy and the assessment of PCa aggressiveness, for which a biopsy is required, is fundamental for patient management. Currently, multiparametric (mp) MRI is strongly recommended before biopsy. Quantitative assessment of mpMRI might provide the radiologist with an objective and noninvasive tool for supporting the decision-making in clinical practice and decreasing intra- and inter-reader variability. In this view, high dimensional radiomics features and Machine Learning (ML) techniques, along with Deep Learning (DL) methods working on raw images directly, could assist the radiologist in the clinical workflow. The aim of this study was to develop and validate ML/DL frameworks on mpMRI data to characterize PCas according to their aggressiveness. We optimized several ML/DL frameworks on T2w, ADC and T2w+ADC data, using a patient-based nested validation scheme. The dataset was composed of 112 patients (132 peripheral lesions with Prostate Imaging Reporting and Data System (PI-RADS) score ≥ 3) acquired following both PI-RADS 2.0 and 2.1 guidelines. Firstly, ML/DL frameworks trained and validated on PI-RADS 2.0 data were tested on both PI-RADS 2.0 and 2.1 data. Then, we trained, validated and tested ML/DL frameworks on a multi PI-RADS dataset. We reported the performances in terms of Area Under the Receiver Operating Characteristic curve (AUROC), specificity and sensitivity. The ML/DL frameworks trained on T2w data achieved the overall best performance. Notably, ML and DL frameworks trained and validated on PI-RADS 2.0 data obtained median AUROC values equal to 0.750 and 0.875, respectively, on the unseen PI-RADS 2.0 test set. Similarly, ML/DL frameworks trained and validated on multi PI-RADS T2w data showed median AUROC values equal to 0.795 and 0.750, respectively, on the unseen multi PI-RADS test set. Conversely, all the ML/DL frameworks trained and validated on PI-RADS 2.0 data achieved AUROC values no better than the chance level when tested on PI-RADS 2.1 data. Both ML/DL techniques applied on mpMRI seem to be a valid aid in predicting PCa aggressiveness. In particular, ML/DL frameworks fed with T2w image data (objective, fast and non-invasive) show good performances and might support decision-making in patient diagnostic and therapeutic management, reducing intra- and inter-reader variability.

2022 ◽  
Tianyuan Lu ◽  
Vincenzo Forgetta ◽  
J. Brent Richards ◽  
Celia Greenwood

Genomic risk prediction is on the emerging path towards personalized medicine. However, the accuracy of polygenic prediction varies strongly in different individuals. In this study, based on up to 352,277 White British participants in the UK Biobank, we constructed polygenic risk scores for 15 physiological and biochemical quantitative traits after performing genome-wide association studies (GWASs). We identified 185 polygenic prediction variability quantitative trait loci (pvQTLs) for 11 traits by Levene’s test among 254,376 unrelated individuals. We validated the effects of pvQTLs using an independent test set of 58,927 individuals. A score aggregating 51 pvQTL SNPs for triglycerides had the strongest Spearman correlation of 0.185 (p-value < 1.0 × 10⁻³⁰⁰) with the squared prediction errors. We found a strong enrichment of complex genetic effects conferred by pvQTLs compared to risk loci identified in GWASs, including 89 pvQTLs exhibiting dominance effects. Incorporation of dominance effects into polygenic risk scores significantly improved polygenic prediction for triglycerides, low-density lipoprotein cholesterol, vitamin D, and platelet count. After including 87 dominance effects for triglycerides, the adjusted R² for the polygenic risk score had an 8.1% increase on the test set. In addition, 108 pvQTLs had significant interaction effects with measured environmental or lifestyle exposures. In conclusion, we have discovered and validated genetic determinants of polygenic prediction variability for 11 quantitative biomarkers, and partially profiled the underlying complex genetic effects. These findings may assist interpretation of genomic risk prediction in various contexts, and encourage novel approaches for constructing polygenic risk scores with complex genetic effects.
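Levene's test checks whether several groups share the same variance, which is what a variability locus would violate. The abstract does not spell out how the groups or tested values were defined, so the sketch below assumes one value per individual (for example, a prediction residual) grouped by genotype at a single SNP, and computes the classic mean-centered W statistic. The function name `levene_w` and the toy numbers are hypothetical.

```python
from statistics import fmean


def levene_w(groups):
    """Mean-centered Levene W statistic for a list of numeric groups.

    W = (N - k) / (k - 1) * sum_i n_i (Zbar_i - Zbar)^2
                          / sum_i sum_j (Z_ij - Zbar_i)^2
    where Z_ij = |Y_ij - mean(Y_i)|. Under equal variances, W is compared
    against an F distribution with (k - 1, N - k) degrees of freedom.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need >= 2 non-empty groups and more values than groups")

    # Absolute deviations of each value from its own group mean.
    z = []
    for g in groups:
        centre = fmean(g)
        z.append([abs(v - centre) for v in g])

    z_means = [fmean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total

    between = sum(len(zi) * (m - z_grand) ** 2 for zi, m in zip(z, z_means))
    within = sum((v - m) ** 2 for zi, m in zip(z, z_means) for v in zi)
    if within == 0:
        raise ValueError("no within-group spread in the deviations; W is undefined")
    return (n_total - k) / (k - 1) * between / within


# Hypothetical toy groups: the second has twice the spread of the first.
w_stat = levene_w([[1, 2, 3], [0, 2, 4]])
```

Turning W into a p-value requires the F distribution, which the Python standard library does not provide; a statistics package would be needed for that step.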

Cancers ◽  
2022 ◽  
Vol 14 (2) ◽  
pp. 376
Natália Alves ◽  
Megan Schuurmans ◽  
Geke Litjens ◽  
Joeran S. Bosma ◽  
John Hermans ◽  

Early detection improves prognosis in pancreatic ductal adenocarcinoma (PDAC), but is challenging as lesions are often small and poorly defined on contrast-enhanced computed tomography scans (CE-CT). Deep learning can facilitate PDAC diagnosis; however, current models still fail to identify small (<2 cm) lesions. In this study, state-of-the-art deep learning models were used to develop an automatic framework for PDAC detection, focusing on small lesions. Additionally, the impact of integrating the surrounding anatomy was investigated. CE-CT scans from a cohort of 119 pathology-proven PDAC patients and a cohort of 123 patients without PDAC were used to train a nnUnet for automatic lesion detection and segmentation (nnUnet_T). Two additional nnUnets were trained to investigate the impact of anatomy integration: (1) segmenting the pancreas and tumor (nnUnet_TP), and (2) segmenting the pancreas, tumor, and multiple surrounding anatomical structures (nnUnet_MS). An external, publicly available test set was used to compare the performance of the three networks. The nnUnet_MS achieved the best performance, with an area under the receiver operating characteristic curve of 0.91 for the whole test set and 0.88 for tumors <2 cm, showing that state-of-the-art deep learning can detect small PDAC and benefits from anatomy information.

2022 ◽  
Challenger Mishra ◽  
Niklas von Wolff ◽  
Abhinav Tripathi ◽  
Eric Brémond ◽  
Annika Preiss ◽  

Catalytic hydrogenation of esters is a sustainable approach for the production of fine chemicals and pharmaceutical drugs. However, the efficiency and cost of catalysts are often the bottlenecks in the commercialization of such technologies. The conventional approach of catalyst discovery is based on empiricism that makes the discovery process time-consuming and expensive. There is an urgent need to develop effective approaches to discover efficient catalysts for hydrogenation reactions. We demonstrate here the approach of machine learning for the prediction of outcomes for the catalytic hydrogenation of esters. Our models can predict the reaction yields with high mean accuracies of up to 91% (test set) and suggest that the use of certain chemical descriptors selectively can result in a more accurate model. Furthermore, catalysts and some of their corresponding descriptors can also be predicted with mean accuracies of 85% and >90%, respectively.

2022 ◽  
Vol 12 ◽  
Guangying Cui ◽  
Shanshuo Liu ◽  
Zhenguo Liu ◽  
Yuan Chen ◽  
Tianwen Wu ◽  

Objective: The gut microecosystem is the largest microecosystem in the human body and has been proven to be linked to neurological diseases. The main objective of this study was to characterize the fecal microbiome, investigate the differences between epilepsy patients and healthy controls, and evaluate the potential efficacy of the fecal microbiome as a diagnostic tool for epilepsy. Design: We collected 74 fecal samples from epilepsy patients (Eps, n = 24) and healthy controls (HCs, n = 50) in the First Affiliated Hospital of Zhengzhou University and subjected the samples to 16S rRNA MiSeq sequencing and analysis. We set up a train set and a test set, identified the optimal microbial markers for epilepsy after characterizing the gut microbiome in the former and built a diagnostic model, then validated it in the validation group. Results: There were significant differences in microbial communities between the two groups. The α-diversity of the HCs was higher than that of the epilepsy group, but the Venn diagram showed that there were more unique operational taxonomic units (OTUs) in the epilepsy group. At the phylum level, Proteobacteria and Actinobacteriota increased significantly in Eps, while the relative abundance of Bacteroidota increased in HCs. Compared with HCs, Eps were enriched in 23 genera, including Faecalibacterium, Escherichia-Shigella, Subdoligranulum and Enterobacteriaceae-unclassified. In contrast, 59 genera including Bacteroides, Megamonas, Prevotella, Lachnospiraceae-unclassified and Blautia increased in the HCs. In Spearman correlation analysis, age, WBC, RBC, PLT, ALB, CREA, TBIL, Hb and Urea were positively correlated with most of the different OTUs. Seizure type, course and frequency were negatively correlated with most of the different OTUs. In addition, twenty-two optimal microbial markers were identified by a fivefold cross-validation of the random forest model. In the established train set and test set, the area under the curve was 0.9771 and 0.993, respectively. Conclusion: Our study was the first to characterize the gut microbiome of Eps and HCs in central China and demonstrate the potential efficacy of microbial markers as a noninvasive biological diagnostic tool for epilepsy.

2022 ◽  
Vol 12 (1) ◽  
Ashish Sarraju ◽  
Andrew Ward ◽  
Jiang Li ◽  
Areli Valencia ◽  
Latha Palaniappan ◽  

Statin therapy is the cornerstone of preventing atherosclerotic cardiovascular disease (ASCVD), primarily by reducing low density lipoprotein cholesterol (LDL-C) levels. Optimal statin therapy decisions rely on shared decision making and may be uncertain for a given patient. In areas of clinical uncertainty, personalized approaches based on real-world data may help inform treatment decisions. We sought to develop a personalized statin recommendation approach for primary ASCVD prevention based on historical real-world outcomes in similar patients. Our retrospective cohort included adults from a large Northern California electronic health record (EHR) aged 40–79 years with no prior cardiovascular disease or statin use. The cohort was split into training and test sets. Weighted-K-nearest-neighbor (wKNN) regression models were used to identify historical EHR patients similar to a candidate patient. We modeled four statin decisions for each patient: none, low-intensity, moderate-intensity, and high-intensity. For each candidate patient, the algorithm recommended the statin decision that was associated with the greatest percentage reduction in LDL-C after 1 year in similar patients. The overall cohort consisted of 50,576 patients (age 54.6 ± 9.8 years) with 55% female, 48% non-Hispanic White, 32% Asian, and 7.4% Hispanic patients. Among 8383 test-set patients, 52%, 44%, and 4% were recommended high-, moderate-, and low-intensity statins, respectively, for a maximum predicted average 1-yr LDL-C reduction of 16.9%, 20.4%, and 14.9%, in each group, respectively. Overall, using aggregate EHR data, a personalized statin recommendation approach identified the statin intensity associated with the greatest LDL-C reduction in historical patients similar to a candidate patient. Recommendations included low- or moderate-intensity statins for maximum LDL-C lowering in nearly half the test set, which is discordant with their expected guideline-based efficacy. A data-driven personalized statin recommendation approach may inform shared decision making in areas of uncertainty, and highlight unexpected efficacy-effectiveness gaps.
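The recommendation step described above can be sketched as one weighted nearest-neighbor prediction per statin option, followed by picking the option with the largest predicted LDL-C reduction. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the distance metric, the weighting scheme, the value of k, or the patient features, so Euclidean distance, inverse-distance weights, k = 3, the two-number feature vectors, and the names `wknn_predict` and `recommend` are all hypothetical.

```python
import math


def wknn_predict(query, patients, k=3, eps=1e-9):
    """Inverse-distance-weighted mean outcome of the k nearest patients.

    query:    feature vector of the candidate patient.
    patients: list of (feature_vector, ldl_reduction_pct) pairs.
    eps:      small constant that avoids division by zero on exact matches.
    """
    nearest = sorted((math.dist(query, x), y) for x, y in patients)[:k]
    weights = [1.0 / (d + eps) for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


def recommend(query, history, k=3):
    """Return (best_option, predictions) across the statin options.

    history maps each option name to the historical patients who received it.
    """
    predictions = {
        option: wknn_predict(query, pts, k)
        for option, pts in history.items()
        if pts
    }
    best = max(predictions, key=predictions.get)
    return best, predictions


# Hypothetical toy history: features are (age, baseline LDL-C); outcomes are
# percentage LDL-C reductions after one year. None of this is study data.
toy_history = {
    "none": [((50, 130), 0.0), ((60, 150), 1.0)],
    "moderate": [((50, 130), 30.0), ((55, 140), 28.0)],
    "high": [((70, 160), 45.0), ((65, 155), 40.0)],
}
best_option, predicted = recommend((52, 135), toy_history)
```

In practice the features would need to be put on a common scale before distances are computed, since otherwise the variable with the largest numeric range dominates the neighbor search.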
