Abstract
Study question
Can a group-wide quality assurance scheme be developed to effectively determine inter-operator agreement for morphokinetic parameters of interest?
Summary answer
Very strong agreement was found between all operators except one; the scheme therefore effectively identified areas for improvement in inter-operator annotation.
What is known already
Where fertility clinics use embryo morphokinetics to determine viability potential, quality assurance of annotations is essential. Embryo selection algorithms rely on the manual determination of certain morphokinetic parameters. Variations in these parameters can lead to differences in the algorithm score attributed to an embryo, thus potentially affecting its fate. It is vital that all embryologists involved in embryo annotation and selection are consistent in their annotation approach, which requires regular quality assurance mechanisms.
Study design, size, duration
Each participant was required to annotate the same three embryos for morphokinetic parameters of interest, including tPB2, tPNf, t2 to t5, t8, tM, tSB and tB. Participants were also required to grade embryos at 68 and 112 hours post insemination (hpi) and to assess additional parameters used for embryo selection or future investigations, such as the extent of morula compaction. The scheme aims to release a new distribution each quarter to ensure regular participation.
Participants/materials, setting, methods
All embryologists responsible for embryo annotation in a single UK fertility group were enrolled in the scheme. A total of 59 participants from 10 fertility clinics in the UK were included. Inter-operator agreement was assessed using a two-way mixed intraclass correlation coefficient (ICC) for consistency. Five categories of agreement were defined based on ICC score: very weak (0–0.2), weak (0.21–0.4), moderate (0.41–0.6), strong (0.61–0.8) and very strong (0.81–1.0).
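The agreement analysis described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the scheme: it assumes the single-measures form of the two-way mixed consistency ICC, often written ICC(3,1), which the abstract does not specify, and it assumes that values falling between the published bands (for example 0.205) are assigned to the lower band. The function names `icc_consistency` and `agreement_category` are hypothetical.

```python
import numpy as np


def icc_consistency(ratings):
    """Two-way mixed, consistency, single-measures ICC, i.e. ICC(3,1).

    ratings: 2-D array-like, rows = annotated targets (e.g. one
    parameter on one embryo), columns = operators.
    Assumption: single-measures form; the source does not say which
    form was used.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * ((x.mean(axis=1) - grand_mean) ** 2).sum()
    ss_cols = n * ((x.mean(axis=0) - grand_mean) ** 2).sum()
    ss_total = ((x - grand_mean) ** 2).sum()
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Consistency ignores systematic offsets between operators,
    # so the column (operator) mean square is not in the formula.
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


def agreement_category(icc):
    """Map an ICC score to the five agreement bands in the abstract.

    Assumption: negative ICC values are treated as 'very weak'.
    """
    if icc <= 0.2:
        return "very weak"
    if icc <= 0.4:
        return "weak"
    if icc <= 0.6:
        return "moderate"
    if icc <= 0.8:
        return "strong"
    return "very strong"
```

Calling `agreement_category(icc_consistency(matrix))` on a targets-by-operators matrix of annotation times returns the band label. Because the consistency form discounts a constant offset between operators, an operator who annotates every event slightly later than colleagues would still score highly; an absolute-agreement ICC would penalise that offset.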
Main results and the role of chance
Very strong agreement (0.81–1.0) was observed between all operators for all parameters assessed, except for one operator who showed weak agreement (0.21–0.4) with all other operators. Descriptive statistics revealed standard deviations (SD) ranging from 0.34 (t3) to 3.43 (t5). For each parameter the SD across the three assessed embryos ranged from 0.34 to 3.43: tPB2 (0.11–0.98), tPNf (2.06–4.40), t2 (0.22–0.80), t3 (0.16–0.70), t4 (0.39–0.65), t5 (2.40–5.44), t8 (0.33–2.72), tM (1.00–2.72), tSB (1.08–2.67) and tB (1.12–1.81). These results indicate high concordance for less subjective annotations, such as the cell stage divisions, and more variability for more subjective annotations, such as the blastulation parameters. Concordance for less well practised or understood annotations, such as the extent of morula compaction, planar or tetrahedral orientation at the four-cell stage, and the angle of extrusion of the second polar body in relation to the first polar body, was poorer, as indicated by descriptive statistics. This highlights the need for experience in performing these annotations before drawing conclusions regarding their predictive value in relation to an embryo’s viability.
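The per-parameter SD ranges reported above could be derived along the following lines. This is a sketch under assumptions not stated in the abstract: that each SD is the sample standard deviation (`ddof=1`) of all operators' annotations for one parameter on one embryo, and that the reported range is the minimum and maximum of those SDs across the three embryos. The function name `sd_range_per_parameter` and the input layout are hypothetical.

```python
import numpy as np


def sd_range_per_parameter(annotations):
    """Return {parameter: (min SD, max SD)} across embryos.

    annotations: dict mapping a parameter name (e.g. "t2") to a 2-D
    array-like with rows = embryos and columns = operators, holding
    annotation times in hpi.
    Assumption: sample SD (ddof=1); the source does not say which
    estimator was used.
    """
    ranges = {}
    for name, values in annotations.items():
        x = np.asarray(values, dtype=float)
        # One SD per embryo, taken across operators.
        sd_per_embryo = np.std(x, axis=1, ddof=1)
        ranges[name] = (float(sd_per_embryo.min()), float(sd_per_embryo.max()))
    return ranges
```

A wide range for a parameter, as reported for t5 and tPNf, would then point to at least one embryo on which operators disagreed substantially about that event.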
Limitations, reasons for caution
Variability would be expected to be higher for the more subjective parameters than for others. Participation in such schemes can create an artificial environment that does not reflect how an embryologist would usually score; for example, participants may spend longer on some decisions given the nature of the scheme.
Wider implications of the findings
Quality assurance of morphokinetic annotations across clinics utilising standardised selection models is crucial. Robust annotation policies and education programmes are essential in achieving consistent results between operators. Quality assurance schemes can identify individuals who lack consistency overall and can identify reliably annotated parameters to inform inclusion in embryo selection algorithms.
Trial registration number
Not applicable