Differentiating Laparoscopic Skills of Trainees with Computer Vision Based Metrics

Author(s):  
Shiyu Deng ◽  
Chaitanya Kulkarni ◽  
Tianzi Wang ◽  
Jacob Hartman-Kenzler ◽  
Laura E. Barnes ◽  
...  

Context-dependent gaze metrics, derived from eye movements explicitly associated with how a task is being performed, are particularly useful for formative assessment that includes feedback on specific behavioral adjustments for skill acquisition. In laparoscopic surgery, context-dependent gaze metrics are underinvestigated and are commonly derived either by qualitatively inspecting videos frame by frame or by mapping fixations onto a static surgical task field. This study collected eye-tracking and video data from 13 trainees practicing the peg transfer task. Machine learning algorithms in computer vision were employed to derive metrics of tool speed, fixation rate on (moving or stationary) target objects, and fixation rate on tool-object combinations. Preliminary results from a clustering analysis of the measurements from 499 practice trials indicated that the metrics were able to differentiate three skill levels among the trainees, suggesting high sensitivity and potential of context-dependent gaze metrics for surgical assessment.

2020 ◽  
Author(s):  
Alex J. C. Witsil

Volcanoes are dangerous and complex with processes coupled to both the subsurface and atmosphere. Effective monitoring of volcanic behavior during and in between periods of crisis requires a diverse suite of instruments and processing routines. Acoustic microphones and video cameras are typical in long-term deployments and provide important constraints on surficial and observational activity yet are underutilized relative to their seismic counterpart. This dissertation increases the utility of infrasound and video datasets through novel applications of computer vision and machine learning algorithms, which help constrain source dynamics and track shifts in activity. Data analyzed come from infrasound and camera installations at Stromboli Volcano, Italy and Villarrica Volcano, Chile and are diverse in terms of the recorded activity. At Villarrica, a computer vision algorithm quantifies video data into a set of characteristic features that are used in a multiparametric analysis with seismic and infrasound data to constrain activity during a period of crisis in 2015. Video features are also input into a machine learning algorithm that classifies data into five modes of activity, which helps track behavior over weekly and monthly time scales. At Stromboli, infrasound signals radiating from the multiple active vents are synthesized into characteristic features and then clustered via an unsupervised learning algorithm. Time histories of cluster activity at each vent reveal concurrent shifts in behavior that suggest a linked plumbing system between the vents. The algorithms presented are general and modular and can be implemented at monitoring agencies that already collect acoustic and video data.


10.2196/27663 ◽  
2021 ◽  
Vol 8 (5) ◽  
pp. e27663
Author(s):  
Sandersan Onie ◽  
Xun Li ◽  
Morgan Liang ◽  
Arcot Sowmya ◽  
Mark Erik Larsen

Background Suicide is a recognized public health issue, with approximately 800,000 people dying by suicide each year. Among the different technologies used in suicide research, closed-circuit television (CCTV) and video have been used for a wide array of applications, including assessing crisis behaviors at metro stations, and using computer vision to identify a suicide attempt in progress. However, there has been no review of suicide research and interventions using CCTV and video.
Objective The objective of this study was to review the literature to understand how CCTV and video data have been used in understanding and preventing suicide. Furthermore, to more fully capture progress in the field, we report on an ongoing study to respond to an identified gap in the narrative review, by using a computer vision–based system to identify behaviors prior to a suicide attempt.
Methods We conducted a search using the keywords “suicide,” “cctv,” and “video” on PubMed, Inspec, and Web of Science. We included any studies which used CCTV or video footage to understand or prevent suicide. If a study fell into our area of interest, we included it regardless of the quality as our goal was to understand the scope of how CCTV and video had been used rather than quantify any specific effect size, but we noted the shortcomings in their design and analyses when discussing the studies.
Results The review found that CCTV and video have primarily been used in 3 ways: (1) to identify risk factors for suicide (eg, inferring depression from facial expressions), (2) to understand suicide after an attempt (eg, forensic applications), and (3) as part of an intervention (eg, using computer vision and automated systems to identify if a suicide attempt is in progress). Furthermore, work in progress demonstrates how we can identify behaviors prior to an attempt at a hotspot, an important gap identified by papers in the literature.
Conclusions Thus far, CCTV and video have been used in a wide array of applications, most notably in designing automated detection systems, with the field heading toward an automated detection system for early intervention. Despite many challenges, we show promising progress in developing an automated detection system for preattempt behaviors, which may allow for early intervention.


2019 ◽  
Vol 9 (15) ◽  
pp. 3065 ◽  
Author(s):  
Dresp-Langley ◽  
Ekseth ◽  
Fesl ◽  
Gohshi ◽  
Kurz ◽  
...  

Detecting quality in large unstructured datasets requires capacities far beyond the limits of human perception and communicability and, as a result, there is an emerging trend towards increasingly complex analytic solutions in data science to cope with this problem. This new trend towards analytic complexity represents a severe challenge for the principle of parsimony (Occam’s razor) in science. This review article combines insight from various domains such as physics, computational science, data engineering, and cognitive science to review the specific properties of big data. Problems for detecting data quality without losing the principle of parsimony are then highlighted on the basis of specific examples. Computational building block approaches for data clustering can help to deal with large unstructured datasets in minimized computation time, and meaning can be extracted rapidly and parsimoniously from large sets of unstructured image or video data through relatively simple unsupervised machine learning algorithms. The review then considers why we still massively lack the expertise to exploit big data wisely, whether to extract relevant information for specific tasks, recognize patterns and generate new information, or simply store and further process large amounts of sensor data, and brings forward examples illustrating why we need subjective views and pragmatic methods to analyze big data contents. The review concludes by discussing how cultural differences between East and West are likely to affect the course of big data analytics, and the development of increasingly autonomous artificial intelligence (AI) aimed at coping with the big data deluge in the near future.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Ritaban Dutta ◽  
Cherry Chen ◽  
David Renshaw ◽  
Daniel Liang

The extraordinary shape recovery capabilities of shape memory alloys (SMAs) have made them a crucial building block for the development of next-generation soft robotic systems and associated cognitive robotic controllers. In this study we sought to determine whether combining video data analysis techniques with machine learning techniques could yield a computer vision based predictive system that accurately predicts the force generated by the movement of an SMA body capable of multi-point actuation. We found that rapid video capture of the bending movements of an SMA body undergoing external electrical excitement, with that characterisation adapted through a computer vision approach into a machine learning model, can accurately predict the amount of actuation force generated by the body. This is fundamental to achieving superior control of the actuation of SMA bodies. We demonstrate that a supervised machine learning framework, trained with Restricted Boltzmann Machine (RBM) inspired features extracted from 45,000 digital thermal infrared video frames captured during excitement of various SMA shapes, can estimate and predict force and stress with 93% global accuracy, with very low false negatives and a high level of predictive generalisation.


2014 ◽  
Vol 128 (2) ◽  
pp. 147-152 ◽  
Author(s):  
S Dowthwaite ◽  
C Szeto ◽  
B Wehrli ◽  
T Daley ◽  
F Whelan ◽  
...  

Objective: We aimed to investigate the diagnostic accuracy of contact endoscopy in evaluating oral and oropharyngeal mucosal lesions.
Methods: Between January 2010 and December 2011, 34 patients with lesions of the oral and oropharyngeal mucosa were enrolled in the study. Comparison between initial contact endoscopy results and ‘gold standard’ tissue biopsy was undertaken.
Results: Nine patients had histologically confirmed squamous cell carcinoma, 2 had carcinoma in situ, 3 had dysplastic lesions and 20 patients had various benign lesions. Contact endoscopy demonstrated sensitivity and specificity of 89 and 100 per cent respectively in the evaluation of malignant lesions. Benign lesions were correctly categorised in 50 per cent of cases (10/20). The video images from contact endoscopy could not be interpreted in six cases.
Conclusions: Contact endoscopy demonstrates high sensitivity and specificity in the imaging of malignant lesions with reduced reliability in the evaluation of benign lesions. Significant shortcomings also exist in the design of current technology that we believe represent a significant barrier to the reliable collection of useful video data.
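
For readers less familiar with the two accuracy measures reported above, the following minimal sketch shows how sensitivity and specificity are computed from a comparison against a reference standard such as biopsy. The counts in the usage lines are hypothetical and are not taken from the study, whose abstract does not give the underlying confusion table.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of reference-positive cases that the test also calls positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of reference-negative cases that the test also calls negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only (not data from the study).
sens = sensitivity(true_pos=18, false_neg=2)
spec = specificity(true_neg=27, false_pos=3)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

Both measures are computed only over cases with an interpretable test result, so uninterpretable recordings have to be reported separately, as the abstract does.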


Sensors ◽  
2020 ◽  
Vol 20 (9) ◽  
pp. 2684 ◽  
Author(s):  
Obed Tettey Nartey ◽  
Guowu Yang ◽  
Sarpong Kwadwo Asare ◽  
Jinzhao Wu ◽  
Lady Nadia Frempong

Traffic sign recognition is a classification problem that poses challenges for computer vision and machine learning algorithms. Although both computer vision and machine learning techniques have constantly been improved to solve this problem, the sudden rise in the number of unlabeled traffic signs has made it even more challenging. Large-scale data collation and labeling are tedious and expensive tasks that demand much time, expert knowledge, and fiscal resources to satisfy the hunger of deep neural networks. In addition, unbalanced data makes it harder for computer vision and machine learning algorithms to achieve good performance. These problems raise the need to develop algorithms that can fully exploit a large amount of unlabeled data, use a small amount of labeled samples, and be robust to data imbalance to build an efficient and high-quality classifier. In this work, we propose a novel semi-supervised classification technique that is robust to small and unbalanced data. The framework integrates weakly-supervised learning and self-training with self-paced learning to generate attention maps to augment the training set and utilizes a novel pseudo-label generation and selection algorithm to generate and select pseudo-labeled samples. The method improves the performance by: (1) normalizing the class-wise confidence levels to prevent the model from ignoring hard-to-learn samples, thereby solving the imbalanced data problem; (2) jointly learning a model and optimizing pseudo-labels generated on unlabeled data; and (3) enlarging the training set to satisfy the hunger of deep learning models. Extensive evaluations on two public traffic sign recognition datasets demonstrate the effectiveness of the proposed technique and provide a potential solution for practical applications.
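
The idea behind point (1) can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify. It is a simple stand-in that assumes each prediction's confidence is divided by the highest confidence seen for its predicted class before a single threshold is applied. The function name `select_pseudo_labels` and the threshold value are hypothetical.

```python
def select_pseudo_labels(probs, threshold=0.8):
    """Pick pseudo-labels using class-wise normalized confidence.

    probs: list of per-sample class-probability lists for unlabeled data.
    Returns a list of (sample_index, predicted_class) pairs.
    Assumption: normalization divides by the per-class maximum confidence.
    """
    # Predicted class and raw confidence for every unlabeled sample.
    preds = [max(range(len(p)), key=p.__getitem__) for p in probs]
    confs = [p[k] for p, k in zip(probs, preds)]

    # Highest confidence observed among samples predicted as each class.
    class_max = {}
    for k, c in zip(preds, confs):
        class_max[k] = max(c, class_max.get(k, 0.0))

    # Keep samples whose confidence is high relative to their own class.
    return [
        (i, k)
        for i, (k, c) in enumerate(zip(preds, confs))
        if c / class_max[k] >= threshold
    ]


# Class 1 plays the role of a hard-to-learn class with low raw confidence.
probs = [[0.90, 0.10], [0.60, 0.40], [0.45, 0.55], [0.48, 0.52]]
print(select_pseudo_labels(probs))
```

With a raw threshold of 0.8 only the first sample would be selected, so no pseudo-labels would be produced for class 1. Normalizing per class keeps the two class 1 samples in the pool, which is the kind of effect the abstract attributes to class-wise normalization.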


2019 ◽  
Vol 263 ◽  
pp. 288-298 ◽  
Author(s):  
Innocent Nyalala ◽  
Cedric Okinda ◽  
Luke Nyalala ◽  
Nelson Makange ◽  
Qi Chao ◽  
...  
