Blastocyst Prediction of Day-3 Cleavage-Stage Embryos Using Machine Learning

2021 ◽  
pp. 1-6
Author(s):  
Dung P. Nguyen ◽  
Quan T. Pham ◽  
Thanh L. Tran ◽  
Lan N. Vuong ◽  
Tuong M. Ho

Background: Embryo selection plays an important role in the success of in vitro fertilization (IVF). However, morphological embryo assessment has a number of limitations, including the time required, lack of accuracy, and inconsistency. This study determined whether a machine learning-based model could predict blastocyst formation using day-3 embryo images. Methods: Day-3 embryo images from IVF/intracytoplasmic sperm injection (ICSI) cycles performed at My Duc Phu Nhuan Hospital between August 2018 and June 2019 were retrospectively analyzed to inform model development. Day-3 embryo images derived from two-pronuclear (2PN) zygotes with known blastocyst formation data were extracted from the CCM-iBIS time-lapse incubator (Astec, Japan) at 67 hours post ICSI, and labeled as blastocyst/non-blastocyst based on results at 116 hours post ICSI. Images were used as the input dataset to train (85%) and validate (15%) the convolutional neural network (CNN) model, then model accuracy was determined using the training and validation datasets. The performance of 13 experienced embryologists in predicting blastocyst formation based on 100 day-3 embryo images was also evaluated. Results: A total of 1,135 images were allocated into training (n = 967) and validation (n = 168) sets, with an even distribution for blastocyst formation outcome. The accuracy of the final model for blastocyst formation was 97.72% in the training dataset and 76.19% in the validation dataset. The final model predicted blastocyst formation from day-3 embryo images in the validation dataset with an area under the curve of 0.75 (95% confidence interval [CI] 0.69–0.81). Embryologists predicted blastocyst formation with an accuracy of 70.07% (95% CI 68.12%–72.03%), sensitivity of 87.04% (95% CI 82.56%–91.52%), and specificity of 30.93% (95% CI 29.35%–32.51%). Conclusions: The CNN-based machine learning model using day-3 embryo images predicted blastocyst formation more accurately than experienced embryologists. The CNN-based model is a potential tool to predict additional IVF outcomes.
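The figures reported in this abstract can be sanity-checked with a little arithmetic. The Python sketch below recomputes the training share from the stated set sizes and shows how accuracy, sensitivity, and specificity follow from confusion-matrix counts. The function names and the example counts are hypothetical; the abstract does not report per-class counts.

```python
def training_share(n_train, n_valid):
    """Fraction of all images assigned to the training set."""
    return n_train / (n_train + n_valid)


def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # blastocysts correctly predicted
        "specificity": tn / (tn + fp),  # non-blastocysts correctly predicted
    }


# Set sizes stated in the abstract: 967 training and 168 validation images.
share = training_share(967, 168)

# Hypothetical counts for illustration only (not taken from the study).
example = classification_metrics(tp=45, fp=35, tn=15, fn=5)
```

With the stated sizes the training share comes out at roughly 85%, matching the reported split. A validation accuracy of 76.19% is what 128 correct predictions out of 168 would give; that count is inferred from the arithmetic and is not stated in the abstract.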

2021 ◽  
Vol 12 ◽  
Author(s):  
Haixia Jin ◽  
Xiaoxue Shen ◽  
Wenyan Song ◽  
Yan Liu ◽  
Lin Qi ◽  
...  

It is well known that the transfer of embryos at the blastocyst stage is superior to the transfer of embryos at the cleavage stage in many respects. However, the rate of blastocyst formation remains low in clinical practice. To reduce the possibility of wasting embryos and to accurately predict the possibility of blastocyst formation, we constructed a nomogram based on a range of clinical characteristics to predict blastocyst formation rates in patients with different types of infertility. We divided patients into three groups based on female etiology: a tubal factor group, a polycystic ovary syndrome group, and an endometriosis group. Multiple logistic regression was used to analyze the relationship between patient characteristics and blastocyst formation. Each group of patients was divided into a training set and a validation set. The training set was used to construct the nomogram, while the validation set was used to test the performance of the model in terms of discrimination and calibration. The area under the curve (AUC) for the three groups indicated that the models performed fairly and that calibration was acceptable in each model.
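Discrimination of the kind summarised by the AUC can be computed directly from predicted scores and observed outcomes. The following is a minimal Python sketch, not the authors' code: it uses the pairwise definition of the AUC, and the scores and labels are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted probabilities and outcomes (1 = blastocyst formed).
predicted = [0.9, 0.8, 0.3, 0.1]
observed = [1, 0, 1, 0]
result = auc(predicted, observed)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two outcome groups.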


2019 ◽  
Vol 17 (1) ◽  
Author(s):  
Jiahui Qiu ◽  
Pingping Li ◽  
Meng Dong ◽  
Xing Xin ◽  
Jichun Tan

Background: Infertility has become a global health issue, with the number of couples seeking in vitro fertilization (IVF) worldwide continuing to rise. Some couples remain childless after several IVF cycles. Women undergoing IVF face greater risks and financial burden. A model to predict the chance of live birth prior to the first IVF treatment is needed in clinical practice for patient counselling and shaping expectations. Methods: Clinical data of 7188 women who underwent their first IVF treatment at the Reproductive Medical Center of Shengjing Hospital of China Medical University during 2014–2018 were retrospectively collected. Machine learning-based models were developed on 70% of the dataset using pre-treatment variables, and prediction performance was evaluated on the remaining 30% using receiver operating characteristic (ROC) analysis and calibration plots. Nested cross-validation was used to make an unbiased estimate of the generalization performance of the machine learning algorithms. Results: The XGBoost model achieved an area under the ROC curve of 0.73 on the validation dataset and showed the best calibration compared with other machine learning algorithms. Nested cross-validation resulted in an average accuracy score of 0.70 ± 0.003 for the XGBoost model. Conclusions: A prediction model based on XGBoost was developed using age, AMH, BMI, duration of infertility, previous live birth, previous miscarriage, previous abortion and type of infertility as predictors. This study might be a promising step to provide personalized estimates of the cumulative live birth chance of the first complete IVF cycle before treatment.
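Nested cross-validation keeps hyperparameter tuning (inner loop) separate from performance estimation (outer loop), so the held-out outer fold never influences model selection. The Python sketch below illustrates only the structure. It swaps in a one-dimensional nearest-neighbour classifier for XGBoost so that it stays dependency-free, and the data, fold counts, and function names are all hypothetical.

```python
def knn_predict(train, x, k):
    """Majority label among the k training points nearest to x (1-D feature)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def accuracy(train, test, k):
    """Share of test points whose label the k-NN rule predicts correctly."""
    return sum(knn_predict(train, x, k) == y for x, y in test) / len(test)


def folds(data, n):
    """Split data into n disjoint, roughly equal parts."""
    return [data[i::n] for i in range(n)]


def cross_val(data, k, n_folds):
    """Mean held-out accuracy of k-NN over n_folds folds."""
    parts = folds(data, n_folds)
    scores = []
    for i, held_out in enumerate(parts):
        train = [p for j, part in enumerate(parts) if j != i for p in part]
        scores.append(accuracy(train, held_out, k))
    return sum(scores) / len(scores)


def nested_cv(data, k_candidates, outer=3, inner=2):
    """Outer loop estimates performance; inner loop picks k on training data only."""
    parts = folds(data, outer)
    outer_scores = []
    for i, test in enumerate(parts):
        train = [p for j, part in enumerate(parts) if j != i for p in part]
        best_k = max(k_candidates, key=lambda k: cross_val(train, k, inner))
        outer_scores.append(accuracy(train, test, best_k))
    return sum(outer_scores) / len(outer_scores)


# Hypothetical toy data: (feature, label) pairs with label 1 when feature >= 6.
toy = [(x, int(x >= 6)) for x in range(12)]
estimate = nested_cv(toy, k_candidates=[1, 3])
```

Because the value of k is chosen without ever seeing the outer test fold, the averaged outer score is a less optimistic estimate than one obtained by tuning and testing on the same folds.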


2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Qingsong Xi ◽  
Qiyu Yang ◽  
Meng Wang ◽  
Bo Huang ◽  
Bo Zhang ◽  
...  

Background: Significant efforts have been made to minimize the rate of in vitro fertilization (IVF)-associated multiple-embryo gestation. Previous studies of machine learning in IVF mainly focused on selecting top-quality embryos to improve outcomes; however, in patients with sub-optimal prognosis or with medium- or inferior-quality embryos, the choice between single embryo transfer (SET) and double embryo transfer (DET) could be perplexing. Methods: This was an application study including 9211 patients with 10,076 embryos treated from 2016 to 2018 at Tongji Hospital, Wuhan, China. A hierarchical model was established using the machine learning system XGBoost to learn embryo implantation potential and the impact of DET simultaneously. The performance of the model was evaluated with the area under the receiver operating characteristic curve (AUC). Multiple regression analyses were also conducted on the 19 selected features to demonstrate the differences between feature importance for prediction and statistical relationship with outcomes. Results: For SET pregnancy, the following variables remained significant: age, attempts at IVF, estradiol level on hCG day, and endometrial thickness. For DET pregnancy, age, attempts at IVF, endometrial thickness, and the newly added P1 + P2 remained significant. For DET twin risk, age, attempts at IVF, 2PN/MII, and P1 × P2 remained significant. The algorithm was repeated 30 times, and average AUCs of 0.7945, 0.8385, and 0.7229 were achieved for SET pregnancy, DET pregnancy, and DET twin risk, respectively. The trends of predicted and observed rates for both pregnancy and twin risk were nearly identical. XGBoost outperformed the other two algorithms: logistic regression and classification and regression tree. Conclusion: Artificial intelligence based on determinant-weighting analysis could offer an individualized embryo selection strategy for any given patient, and predict clinical pregnancy rate and twin risk, therefore optimizing clinical outcomes.
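The abstract does not define P1 and P2. One possible reading, offered here only as an assumption, is that they are per-embryo implantation probabilities for the two transferred embryos. Under that reading, and assuming further that the embryos implant independently, the sum and product are the building blocks of the pregnancy and twin probabilities. The Python sketch below illustrates this; the function name and example values are hypothetical, and monozygotic twinning is ignored.

```python
def det_outcome_probabilities(p1, p2):
    """Outcome probabilities for a two-embryo transfer under independence.

    ASSUMPTION: p1 and p2 are per-embryo implantation probabilities and the
    embryos implant independently; the abstract states neither point.
    """
    both = p1 * p2                    # twin outcome: both embryos implant
    at_least_one = p1 + p2 - p1 * p2  # pregnancy: one or both implant
    return {"pregnancy": at_least_one, "twins": both}


# Hypothetical per-embryo probabilities, for illustration only.
outcome = det_outcome_probabilities(0.5, 0.4)
```

This reading is at least consistent with the reported pattern, in which P1 + P2 was retained for DET pregnancy and P1 × P2 for DET twin risk, but it should not be taken as the authors' stated rationale.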


2021 ◽  
Author(s):  
Fang He ◽  
John H Page ◽  
Kerry R Weinberg ◽  
Anirban Mishra

BACKGROUND The current COVID-19 pandemic is unprecedented. In resource-constrained settings, predictive algorithms can help to stratify disease severity and alert physicians to high-risk patients; however, there are few risk scores derived from a substantially large EHR dataset using simplified predictors as input. OBJECTIVE To develop and validate simplified machine learning algorithms that predict COVID-19 adverse outcomes; to evaluate the AUC (area under the receiver operating characteristic curve), sensitivity, specificity and calibration of the algorithms; and to derive clinically meaningful thresholds. METHODS We conducted machine learning model development and validation via a cohort study using multi-center, patient-level, longitudinal electronic health records (EHR) from the Optum® COVID-19 database, which provides anonymized, longitudinal EHR from across the US. The models were developed based on clinical characteristics to predict 28-day in-hospital mortality, ICU admission, respiratory failure, and mechanical ventilator use in the inpatient setting. Data from patients who were admitted prior to Sep 7, 2020 were randomly sampled into development, test and validation datasets; data collected from Sep 7, 2020 through Nov 15, 2020 were reserved as a prospective validation dataset. RESULTS Of 3.7M patients in the analysis, a total of 585,867 patients were diagnosed with or tested positive for SARS-CoV-2, and 50,703 adult patients were hospitalized with COVID-19 between Feb 1 and Nov 15, 2020. Among the study cohort (N=50,703), there were 6,204 deaths, 9,564 ICU admissions, 6,478 mechanically ventilated or ECMO patients, and 25,169 patients who developed ARDS or respiratory failure within 28 days of hospital admission. The algorithms demonstrated high accuracy (AUC = 0.89 (0.89–0.89) on the validation dataset (N=10,752)), consistent prediction through the second wave of the pandemic from September to November (AUC = 0.85 (0.85–0.86) on post-development validation (N=14,863)), and great clinical relevance and utility. In addition, a comprehensive set of 386 input covariates from baseline and admission was included in the analysis; the end-to-end pipeline automates the feature selection and model development process, producing 10 key predictors as input, such as age, blood urea nitrogen, and oxygen saturation, which are both commonly measured and concordant with recognized risk factors for COVID-19. CONCLUSIONS The systematic approach and rigorous validations demonstrate consistent model performance even beyond the time period of data collection, with satisfactory discriminatory power and great clinical utility. Overall, the study offers an accurate, validated and reliable prediction model based on only ten clinical features as a prognostic tool for stratifying COVID-19 patients into intermediate, high and very high-risk groups. This simple predictive tool could be shared with a wider healthcare community to serve as an early warning system that alerts physicians to possible high-risk patients, or as a resource triaging tool to optimize healthcare resources. CLINICALTRIAL N/A
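The temporal design described in the methods, with earlier admissions used for development and later ones held back for prospective validation, can be expressed as a simple date rule. The Python sketch below uses the dates given in the abstract; the function name, the period labels, and the treatment of boundary dates as inclusive are assumptions.

```python
from datetime import date

STUDY_START = date(2020, 2, 1)        # first admission date in the cohort
PROSPECTIVE_START = date(2020, 9, 7)  # cutoff between the two periods
STUDY_END = date(2020, 11, 15)        # last admission date in the cohort


def assign_period(admission):
    """Return the study period an admission date falls into, or None."""
    if admission < STUDY_START or admission > STUDY_END:
        return None  # outside the study window
    if admission < PROSPECTIVE_START:
        # This pool is later sampled at random into development,
        # test, and validation datasets.
        return "development_pool"
    return "prospective_validation"
```

Splitting by calendar date rather than at random means the prospective set tests whether the model still performs on patients admitted after every development record was collected.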


2020 ◽  
Author(s):  
Wanjun Zhao ◽  
Yong Zhang ◽  
Xinming Li ◽  
Yonghong Mao ◽  
Changwei Wu ◽  
...  

Background: By extracting spectrum features from urinary proteomics using an advanced mass spectrometer and machine learning algorithms, more accurate results can be achieved for disease classification. We attempted to establish a novel diagnosis model of kidney diseases by combining the extreme gradient boosting (XGBoost) machine learning algorithm with complete mass spectrum information from urinary proteomics. Methods: We enrolled 134 patients (including those with IgA nephropathy, membranous nephropathy, and diabetic kidney disease) and 68 healthy participants as controls, and used a total of 610,102 mass spectra from their urinary proteomics, produced using high-resolution mass spectrometry, to train and validate the diagnostic model. We divided the mass spectrum data into a training dataset (80%) and a validation dataset (20%). The training dataset was used directly to create diagnosis models using XGBoost, random forest (RF), a support vector machine (SVM), and artificial neural networks (ANNs). Diagnostic accuracy was evaluated using a confusion matrix. We also constructed receiver operating characteristic, Lorenz, and gain curves to evaluate the diagnosis model. Results: Compared with RF, the SVM, and ANNs, the modified XGBoost model, called the Kidney Disease Classifier (KDClassifier), showed the best performance. The accuracy of the diagnostic XGBoost model was 96.03% (CI = 95.17%–96.77%; Kappa = 0.943; McNemar's test, P value = 0.00027). The area under the curve of the XGBoost model was 0.952 (CI = 0.9307–0.9733). The Kolmogorov-Smirnov (KS) value of the Lorenz curve was 0.8514. The Lorenz and gain curves showed the strong robustness of the developed model. Conclusions: This study presents the first XGBoost diagnosis model, i.e., the KDClassifier, combined with complete mass spectrum information from urinary proteomics for distinguishing different kidney diseases. KDClassifier achieves high accuracy and robustness, providing a potential tool for the classification of all types of kidney diseases.
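Accuracy and Cohen's kappa, both reported in the results, are derived from the same confusion matrix. The Python sketch below shows the standard calculation for any number of classes. The function name and the example matrix are hypothetical; the abstract does not publish the study's own confusion matrix.

```python
def accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    matrix[i][j] is the number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance if predictions were independent of truth.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class matrix for illustration only.
acc, kappa = accuracy_and_kappa([[20, 5], [10, 15]])
```

Kappa discounts the agreement that would occur by chance, which is why it is lower than raw accuracy whenever the classifier is imperfect.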


2021 ◽  
Author(s):  
Itay Erlich ◽  
Assaf Ben-Meir ◽  
Iris Har-Vardi ◽  
James A Grifo ◽  
Assaf Zaritsky

Automated live embryo imaging has transformed in vitro fertilization (IVF) into a data-intensive field. Unlike clinicians, who rank embryos from the same IVF cycle cohort based on the embryos' visual quality and determine how many embryos to transfer based on clinical factors, machine learning solutions usually combine these steps by optimizing for implantation prediction and using the same model for ranking the embryos within a cohort. Here we establish that this strategy can lead to sub-optimal selection of embryos. We reveal that, despite enhancing implantation prediction, inclusion of clinical properties hampers ranking. Moreover, we find that ambiguous labels of failed implantations, due to either low-quality embryos or poor clinical factors, confound both optimal ranking and implantation prediction. To overcome these limitations, we propose conceptual and practical steps to enhance machine learning-driven IVF solutions. These consist of separating the optimization of implantation prediction from ranking, by focusing on visual properties for ranking, and reducing label ambiguity.


2019 ◽  
Author(s):  
Zied Hosni ◽  
Annalisa Riccardi ◽  
Stephanie Yerdelen ◽  
Alan R. G. Martin ◽  
Deborah Bowering ◽  
...  

Polymorphism is the capacity of a molecule to adopt different conformations or molecular packing arrangements in the solid state. This is a key property to control during pharmaceutical manufacturing because it can affect a range of properties, including stability and solubility. In this study, a novel approach based on machine learning classification methods is used to predict the likelihood that an organic compound will crystallise in multiple forms. A training dataset of drug-like molecules was curated from the Cambridge Structural Database (CSD) and filtered according to entries in the DrugBank database. The number of separate forms in the CSD for each molecule was recorded. A metaclassifier was trained using this dataset to predict the expected number of crystalline forms from the compound descriptors. This approach was used to estimate the number of crystallographic forms for an external validation dataset. These results suggest this novel methodology can be used to predict the extent of polymorphism of new drugs or not-yet experimentally screened molecules. This promising method complements expensive ab initio methods for crystal structure prediction and, as an integral part of experimental physical form screening, may identify systems with unexplored potential.

