Assessing Surgical Competency Through Automated AI-Powered Surgical Video Analysis

The Doctors Company Foundation supports the development and implementation of an artificial intelligence (AI) system to evaluate surgeon performance through automated analysis of surgical video recordings.

Grantee Profile

The University of Michigan W.K. Kellogg Eye Center is a nationally recognized center for vision care and research. Kellogg’s faculty includes outstanding ophthalmologists and vision researchers. Its clinical and research programs attract talented residents and fellows who go on to practice ophthalmology throughout the world. Kellogg ophthalmologists conduct more than 220,000 patient visits each year, offering an exceptional range of services.

“Funding from The Doctors Company Foundation allowed highly skilled graduate and postgraduate trainees to dedicate time to this project. Traditional approaches to assessing surgical competency are fraught with limitations that include subjectivity and lack of multiple raters and longitudinal observation. Applying machine learning and AI to cataract surgical videos provides a unique opportunity to design new evaluation tools that not only measure a surgeon’s skill level, but can also inform future surgical instruction.”
Nambi Nallasamy, MD, Project Lead, Assistant Professor of Ophthalmology and Visual Sciences and Assistant Professor of Computational Medicine and Bioinformatics at the University of Michigan, Kellogg Eye Center

The Challenge

Achieving competency in cataract surgery is an essential component of ophthalmology residency training. Finding a way to assess surgical competency in a rigorous and objective manner has, however, been an elusive goal. Although video-based analysis of surgical performance has the potential to fundamentally change surgical training, manually editing and reviewing the videos is inefficient and cumbersome, and in practice it covers only a small sample of each surgeon's cases.

Our goal is to develop and implement an AI system that evaluates surgeon performance in cataract procedures through automated analysis of surgical video recordings.


Cataract surgery is one of the most commonly performed surgeries in the world. Complications—which include retinal detachment, intraocular infection, corneal decompensation, and macular edema—can be devastating.

The ability to ensure surgeon competency through AI-powered surgical video assessment has the potential to reduce the risk of adverse events that harm patients. Cataract surgery lends itself well to high-resolution video recording through the operating microscope. Because each surgery can be recorded, it offers significant opportunities to apply machine learning and AI to automate video-based competency analysis and to generate continuous data about each step a surgeon takes during a procedure.

Our Approach

Our team collected video recordings of cataract surgeries performed by attending surgeons at University of Michigan’s Kellogg Eye Center and obtained institutional review board approval for the study.

Frame-by-frame annotations were generated for each video, yielding data that could be used to train machine learning models to identify and track instruments, identify phases of surgery, and compute metrics of surgical performance. Annotation of “ground truth” (that is, the interpretation of a human expert) is essential to training machines to analyze and interpret videos. The dataset of surgical videos we were able to create—called “BigCat”—contains millions of annotated frames.

We also developed an approach for producing specialized, project-specific annotations. This enabled us to generate associated subsets of the data, including tens of thousands of manually drawn semantic segmentations. These have been essential to creating machine learning algorithms that segment key anatomical and surgical landmarks intraoperatively.


  • Creation of the BigCat Surgery Video Database: High-quality annotated surgical video is essential to training deep neural networks. A significant accomplishment of our project was gathering a dataset of surgical videos and their annotations to create the BigCat cataract surgery video database. It is the largest cataract surgery video database reported worldwide to date. At the time of this case study, BigCat contains more than 4 million deeply annotated frames of surgical video.
  • Real-Time Surgical Instrument Identification: Our machine learning model can accurately identify in real time when a surgical tool is being used during a procedure. Knowing the order, duration, and location of instrument use at different points throughout a surgery may indicate how well a surgeon performed or whether complications occurred during the procedure.
  • Surgical Phase Identification: We built a machine learning algorithm that identifies the phases of surgery and recognizes the boundaries of each phase with state-of-the-art performance. This model, called “CatStep,” enables the fully automated segmentation of a complete surgical video into its component steps. This type of action recognition is essential to the qualitative and quantitative analysis of surgical maneuvers. Additionally, the time spent on each surgical step can be an important indicator of how well the surgeon is performing.
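
To illustrate how per-step timing could be derived once a video has been segmented into phases, here is a minimal Python sketch. It is not the CatStep model or its output format. It assumes, hypothetically, that a phase classifier has already produced one phase label per video frame and that the frame rate is known. The function names, the phase names, and the 30 frames-per-second value are all assumptions made for this example.

```python
from itertools import groupby


def phase_segments(frame_labels, fps=30.0):
    """Collapse per-frame phase labels into (phase, start_seconds, duration_seconds).

    frame_labels is a hypothetical list with one predicted phase name per frame.
    """
    segments = []
    frame_index = 0
    for phase, run in groupby(frame_labels):
        run_length = sum(1 for _ in run)  # number of consecutive frames in this phase
        segments.append((phase, frame_index / fps, run_length / fps))
        frame_index += run_length
    return segments


def time_per_phase(frame_labels, fps=30.0):
    """Total seconds spent in each phase, summed over repeated visits to a phase."""
    totals = {}
    for phase, _start, duration in phase_segments(frame_labels, fps):
        totals[phase] = totals.get(phase, 0.0) + duration
    return totals


# Illustrative labels only; real surgical videos are far longer.
labels = (
    ["incision"] * 60
    + ["capsulorrhexis"] * 150
    + ["phacoemulsification"] * 300
)
segments = phase_segments(labels, fps=30.0)
totals = time_per_phase(labels, fps=30.0)
```

With these made-up labels, `totals` maps each phase name to the seconds spent in it, and `segments` also records where each phase begins, which is the kind of boundary information a reviewer could use to jump directly to a given step.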

Takeaways and Next Steps

  • One of the most challenging steps for beginning surgeons is creating the capsulorrhexis (the round anterior capsular opening through which disassembly of the cataract nucleus is performed). Successfully identifying the capsulorrhexis phase of surgery will enable the identification of appropriate frames for capsulorrhexis segmentation. We are working toward automating the analysis of the circularity, smoothness, size, and centration of the capsulorrhexis from surgical video frames.
  • The progression of the project raised the prospect of an additional goal of surgical video analysis: providing intraoperative decision support to surgeons. We realized that using real-time analyses of patient tissue responses to intraoperative maneuvers could enable us to improve patient safety by suggesting patient-specific risk mitigation procedures to surgeons intraoperatively.
  • The next phases of the project will involve development of machine learning algorithms for more deeply examining performance within each individual phase of cataract surgery.
  • The work we have done with the support of The Doctors Company Foundation and our other funding groups has allowed us to generate preliminary data and findings that will form the basis of an NIH R01 grant submission.
  • We have submitted and published a number of peer-reviewed abstracts and journal articles examining the concept of video-based surgical analysis. As a result, we have been invited to present our findings at meetings in the U.S. and internationally. The topic has also garnered significant interest within ophthalmology, across medical specialties, and from ophthalmic surgical device manufacturers. These connections may result in direct clinical application.
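
As an illustration of the capsulorrhexis measurements mentioned in the first takeaway above (circularity, size, and centration), the following Python sketch computes simple geometric versions of these quantities from a segmented outline. It is a sketch under stated assumptions, not the project's analysis method. It assumes the outline is available as an ordered list of (x, y) boundary points, uses the common shape measure 4πA/P² for circularity (1.0 for a perfect circle, lower for irregular shapes), and takes a hypothetical reference center, such as a pupil center, as input. All function names are invented for this example.

```python
import math


def polygon_area(points):
    """Area of a closed outline via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of the edge lengths of the closed outline."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def circularity(points):
    """4*pi*A / P^2: equals 1.0 for a circle and decreases for irregular shapes."""
    perimeter = polygon_perimeter(points)
    return 4.0 * math.pi * polygon_area(points) / (perimeter * perimeter)


def equivalent_diameter(points):
    """Diameter of the circle with the same area as the outline (a size measure)."""
    return 2.0 * math.sqrt(polygon_area(points) / math.pi)


def centration_offset(points, reference_center):
    """Distance from the mean of the outline points to a reference center.

    The vertex mean is only an approximation of the true centroid; it is
    reasonable when the points are spaced evenly along the outline.
    """
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return math.dist((cx, cy), reference_center)


# Synthetic outlines for demonstration: a near-circle and a square.
circle = [
    (5.0 * math.cos(2 * math.pi * k / 360), 5.0 * math.sin(2 * math.pi * k / 360))
    for k in range(360)
]
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
```

On the synthetic outlines, the near-circle scores close to 1.0 and the square scores π/4, which shows how a single number can separate a round opening from an irregular one. Smoothness is left out of the sketch because it depends on how the outline is sampled and would need a definition agreed on by clinical experts.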

For Further Information