Introduction
Given the novelty of artificial intelligence (AI) and its wide-reaching research applications, there has been an exponential rise in literature focusing on AI’s use within otolaryngology.1–3 Published studies range from examining the performance of ChatGPT in theoretical clinical vignettes4–7 and advancing the early detection of oropharyngeal malignancies8 to improving the readability of patient education materials9 and predicting audiologic outcomes among patients with sudden sensorineural hearing loss.10
Despite the rise in publications focusing on AI within otolaryngology, few have examined how AI may be harnessed to improve comprehensive education for trainees within the field. A recent scoping review examined how AI can reshape the assessment of surgical training in otolaryngology, which is undoubtedly an integral aspect of an otolaryngology residency training program.11 In addition to covering surgical training, the present scoping review aims to address other important areas of residency training such as the development of clinical acumen, scholarship, and career preparedness. Research on this topic within other surgical subspecialties such as ophthalmology,12–21 neurosurgery,22–24 and urology25–28 has exceeded that within otolaryngology, underscoring the gap that currently exists within the otolaryngology literature.
Given the rapid growth in AI’s proficiency and use, the aim of this scoping review is to systematically examine how AI can improve comprehensive otolaryngology trainee education.
Methods
A systematic search of the literature following PRISMA-ScR guidelines was performed across the following databases: Embase (1947 to July 2025), Scopus (1970 to July 2025), and PubMed (1946 to July 2025). Key words included terms related to three broad categories: artificial intelligence, medical education, and otolaryngology. Examples of key words used include: “Artificial Intelligence,” “machine learning,” “large language model,” “surgical assessment,” “residency,” “learner skills,” “otolaryngology,” and “ENT.” The full search strategy for each database is provided in Appendix 1. Two authors (M.S., J.Z.) independently reviewed all abstracts and full texts using Covidence, resolving conflicts throughout the process. We included original articles related to the uses of artificial intelligence within otolaryngology training. We excluded letters to the editor and opinion letters, conference abstracts, and publications in a language other than English. Articles that failed to address any one of artificial intelligence, medical education, or the field of otolaryngology were also excluded. After full-text review, themes were extracted and organized into overarching categories.
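As a hedged illustration of how key words from the three categories might be combined, the sketch below joins terms within a category with OR and joins the categories with AND. Both the Boolean structure and the assignment of each example key word to a category are assumptions made for illustration, and the function name `build_query` is hypothetical; the database-specific strategies actually used are those in Appendix 1.

```python
# Illustrative sketch only. The term lists reuse the example key words quoted
# in the Methods; grouping them into categories and combining them with
# OR/AND is an assumption, not the search strategy reported in Appendix 1.
CATEGORIES = {
    "artificial intelligence": [
        "Artificial Intelligence",
        "machine learning",
        "large language model",
    ],
    "medical education": ["surgical assessment", "residency", "learner skills"],
    "otolaryngology": ["otolaryngology", "ENT"],
}


def build_query(categories):
    """OR the terms within each category, then AND the categories together."""
    blocks = []
    for terms in categories.values():
        quoted = " OR ".join(f'"{term}"' for term in terms)
        blocks.append(f"({quoted})")
    return " AND ".join(blocks)


if __name__ == "__main__":
    print(build_query(CATEGORIES))
```

Requiring a match in every category mirrors the exclusion of articles that lacked an AI, education, or otolaryngology component, although real database syntax (field tags, controlled vocabulary, truncation) is considerably richer than this sketch.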
Results
The initial search returned 1964 articles, of which 657 were duplicates. After initial abstract screening, 120 articles were selected for full-text review. Of these, 56 articles lacked an AI component, 19 were not related to medical or surgical education, 12 were response letters or editorials, 7 were conference abstracts, 4 were review articles, 4 were not within the field of otolaryngology, and 1 was not available in English. Sixteen articles were selected for inclusion in the final thematic analysis (Figure 1). The articles were reviewed and a thematic analysis was performed. Included studies were published between 2015 and 2025. We identified the following themes surrounding the use of AI: AI-Enhanced Surgical Training Models,29–32 AI-Based Surgical Skill Assessment,33–35 Instrument Tracking and Computer Vision,34–37 AI as a Research Aid,38 AI Augmentation of Didactic Learning,39–41 and AI in Residency Admissions.42–44
Artificial Intelligence as a Research Aid in Otolaryngology
Shaari et al.38 evaluated the ability of ChatGPT to generate unique research ideas relevant to otolaryngology. Only a small portion of the systematic review ideas that ChatGPT generated had not already been published. The authors concluded that ChatGPT was only occasionally able to generate novel systematic review ideas within the field of otolaryngology, although the ideas that it did generate were largely feasible and clinically relevant.
Augmenting Didactic Learning
Patel et al.39 evaluated the efficacy of using ChatGPT 4 to generate study guides for selected chapters of “Cummings: Otolaryngology Head and Neck Surgery” and assessed whether the generated content retained sufficient material, clarity, and relevance to serve as an effective educational tool for otolaryngology residents. The study guides received modest mean scores for accuracy, relevancy, and clarity. Qualitative feedback from reviewers identified the organization and comprehensiveness of the study guides as strengths, highlighting their potential utility as a learning tool. However, several reviewers thought that the guides often included excessive detail on non-critical concepts and were not always concise enough.
Turning to anatomy education, Singal et al.40 evaluated whether ChatGPT is a reliable source of gross anatomical information about the scalenovertebral triangle. The authors found that the answers provided by ChatGPT were not appropriate (incorrect, partially correct, or incomplete) for many of the queries.
An article by Brennan et al.41 aimed to provide guidance on optimizing ChatGPT use by otolaryngology trainees. The article elaborates on the advantages AI can offer for didactic learning, highlighting how trainees may use ChatGPT to create customized quizzes or flashcards to reinforce their knowledge. Furthermore, trainees can use ChatGPT to facilitate learning in other important aspects of residency education, including presenting cases, discussing treatment plans, and explaining procedures to patients.
Residency Selection
An additional theme that emerged from our review was the potential for AI to augment professional development and residency selection. Vasan et al.42 retrieved over 1500 letters of recommendation (LORs) submitted to a single institution during the 2022-2023 application cycle. A variety of machine learning models were used to analyze the letters, both to help predict interview invitations and to identify words frequently used in the letters of applicants chosen for interview.
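To make the idea of identifying frequently used words more concrete, the following is a minimal sketch of a per-letter word-frequency comparison between two groups of letters. It is not the method used by Vasan et al., whose models are not detailed here; the function names `word_counts` and `distinctive_words`, the tokenization rule, and the ranking by difference in per-letter frequency are all assumptions, and the sample sentences are invented placeholders rather than study data.

```python
import re
from collections import Counter


def word_counts(letters):
    """Count lowercase word occurrences across a list of letter texts."""
    counts = Counter()
    for text in letters:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


def distinctive_words(invited, not_invited, top_n=5):
    """Rank words by how much more often (per letter) they appear in the
    invited group than in the non-invited group. Hypothetical heuristic."""
    if not invited or not not_invited:
        raise ValueError("both groups must contain at least one letter")
    inv, non = word_counts(invited), word_counts(not_invited)
    diff = {w: inv[w] / len(invited) - non[w] / len(not_invited) for w in inv}
    # Sort by descending difference, breaking ties alphabetically
    return sorted(diff, key=lambda w: (-diff[w], w))[:top_n]


if __name__ == "__main__":
    # Invented placeholder text, not taken from any real letter
    invited = ["An outstanding and dedicated applicant.", "Outstanding work ethic."]
    not_invited = ["A dedicated applicant."]
    print(distinctive_words(invited, not_invited, top_n=3))
```

A simple frequency difference like this is only a descriptive starting point; a predictive model would also need held-out validation and, given the bias concerns raised later in this section, auditing of which terms drive its predictions.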
Wilhidal et al.43 compared the quality of artificial intelligence-generated personal statements with those written by successful applicants to otolaryngology-head and neck surgery (OHNS) residency programs in Ontario, Canada, from 2017 to 2022. Strikingly, the authors found that artificial intelligence-generated personal statements significantly outperformed applicant-written statements in all assessed domains, including authenticity, readability, and personability.
Taken together, these studies suggest that automated processes, such as those that help predict an applicant’s likelihood of obtaining an interview invitation, could be valuable tools for aspiring otolaryngology residents and training programs alike.
Despite this proposed benefit, one study in our review was more cautionary. Using ChatGPT as a case study, Halagur et al.44 concluded that utilizing publicly available large language models to aid in otolaryngology residency selection may introduce significant racial, gender, and sexual orientation bias.
AI-Enhanced Surgical Training Models
The uses of AI in surgical training have been centered around providing real-time feedback and guidance to trainees. The most common training models utilized AI-augmented overlays within virtual reality simulators or endoscopic camera views.29–31 One study integrated AI to provide real-time feedback within a 3D-printed model for cochlear implant insertion.32 In this study, a realistic 3D-printed cochlea model was created to help train clinicians in cochlear implant electrode insertion. The system used a mini camera and custom software with a graphical interface to give real-time visual feedback on insertion depth, speed, back-outs, and electrode kinking, along with tactile feedback from the model itself. Using this simulator, researchers found that slower, controlled insertion speed was an important factor in improving trainee technique.
Latour et al.29 developed a virtually augmented surgical navigation (VASN) system within endoscopic sinus surgery simulation to assist learners with visual overlays for optimal surgery and proximity alerts. Additionally, this system used generative AI to render endoscopic viewpoints for additional visual reference. The study found that VASN significantly improved trainee performance, including fewer surgical complications, faster procedure completion, more surgical steps completed, and higher technical skill scores. Additionally, 93% of trainees reported improved confidence after using the model.
Miller et al.30 developed an AI software “copilot” to identify anatomy during flexible fiberoptic laryngoscopy for medical students. The model was trained using machine learning from human-labeled images.
Wijewickrema et al.31 created a machine-learning model that provided automated feedback within a VR temporal bone surgery simulator. They trained the model to incorporate data on several metrics, including average speed, duration, force, and bone removal level, and to provide real-time feedback to trainees. Trainees using the model received suggestions to increase or decrease metrics such as magnification level or stroke speed, as well as proximity feedback. Feedback was 84.2% accurate and resulted in improved bone removal rates.
Dauterman et al.32 created a 3D-printed model of the cochlea for training in cochlear implant electrode insertion. The model provided real-time feedback based on insertion depth, speed, and technique. After using the model, residents, fellows, and early-career surgeons improved their performance with regard to insertion depth, speed, and reduction in back-outs and kinks/fold-overs.
Instrument Tracking and Computer Vision
Other studies investigated the use of AI to track surgical instruments from video feeds for the purposes of skill assessment and providing real-time feedback. These AI models not only identified distinct surgical instruments,34–37 but were also capable of providing real-time feedback37 or skill evaluation.34,35
Lee et al.34 created a deep learning-based tracking algorithm for robotic thyroid surgery to determine instrument position and assess skill across quantitative movement metrics. In this study, videos from robotic thyroid surgeries and training simulations were analyzed to develop an automated system for surgical skill evaluation. A deep-learning model (Mask R-CNN with Deep SORT tracking) was used to identify and track surgical instruments, allowing extraction of quantitative motion metrics such as instrument speed, trajectory, and efficiency. These metrics were then used to train machine-learning models to classify surgeon skill level using expert-assigned standard assessment scores as the reference standard. The model achieved up to 88.1% accuracy in identifying surgical instruments from videos. The authors also developed a surgical skill prediction model, which identified economy of motion as the most important metric.
Liu et al.35 developed an artificial intelligence model to track otologic instruments in mastoidectomy videos and achieved 93% precision for tracking the otologic drill. This model also tracked drill speed and found a statistically significant difference in drill speed between attendings and residents.
Nwosu et al.37 also developed an AI model to track drill motion in mastoidectomy videos. The model achieved a mean average precision of 81.5%. The model was also able to provide real-time feedback on drill motion across several metrics: average velocity, total distance, and smoothness of motion.
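For readers less familiar with motion-based metrics, the sketch below shows one way such quantities could be derived from a sequence of tracked drill-tip positions. It is an illustration under stated assumptions, not the implementation used in any of the included studies: the function name `motion_metrics` is hypothetical, positions are assumed to be 2D pixel coordinates sampled once per video frame, "average velocity" is computed as path length divided by elapsed time, and smoothness is approximated by the mean absolute frame-to-frame change in speed (lower is smoother).

```python
import math


def motion_metrics(points, fps):
    """Compute simple motion metrics from tracked tip positions.

    points: list of (x, y) positions, one per video frame (assumed units: pixels)
    fps:    video frame rate in frames per second
    """
    if len(points) < 2:
        raise ValueError("need at least two tracked positions")
    if fps <= 0:
        raise ValueError("fps must be positive")
    dt = 1.0 / fps
    # Frame-to-frame displacement of the tracked tip
    steps = [math.dist(p, q) for p, q in zip(points, points[1:])]
    total_distance = sum(steps)
    duration = dt * len(steps)
    # Path length over elapsed time (strictly an average speed)
    average_velocity = total_distance / duration
    # Smoothness proxy (assumption): mean absolute change in speed between
    # consecutive frame intervals; lower values indicate smoother motion
    speeds = [s / dt for s in steps]
    changes = [abs(b - a) for a, b in zip(speeds, speeds[1:])]
    speed_variation = sum(changes) / len(changes) if changes else 0.0
    return {
        "total_distance": total_distance,
        "average_velocity": average_velocity,
        "speed_variation": speed_variation,
    }


if __name__ == "__main__":
    track = [(0, 0), (3, 4), (6, 8)]  # made-up positions for illustration
    print(motion_metrics(track, fps=1))
```

In practice the positions would come from a detection or tracking model, and pixel distances would need calibration before values could be compared across videos or surgeons.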
Raymond et al.36 also developed an AI computer vision model to track instrument motion in mastoidectomy videos. This computer vision model achieved 99% mean average precision for tracking the drill and 86% for tracking the suction-irrigator. The model demonstrated the ability to track objective metrics such as stroke direction and relative distance, providing potential measures for evaluation of skill.
AI-Based Surgical Skill Assessment
Integrated within many of these studies was the use of AI to determine the most predictive objective metrics for surgical skill and operative success.33–35
Das et al.33 developed a neural network capable of real-time instrument tracking and surgical skill assessment for endoscopic pituitary surgeries. The AI model achieved object tracking precision scores between 59.1% and 71.9%. Additionally, the model was capable of skill evaluation based on time- and motion-based metrics, distinguishing between novice and expert skill levels with 87% accuracy.
Discussion
In this scoping review, we systematically examined how AI is presently employed in comprehensive otolaryngology trainee education. Studies were broadly centered on instrument tracking and skills assessment, didactic learning and research aids, and residency selection.
While literature examining AI in otolaryngology education remains somewhat limited, other surgical subspecialties have made considerable progress in integrating AI into trainee surgical education, assessment, and professional development. Ophthalmology, in particular, has emerged as a leading subspecialty in incorporating AI into surgical education.12–21 Machine learning algorithms have been applied to cataract and vitreoretinal surgery videos to objectively assess surgical proficiency, track instrument motion, and accurately differentiate novice from expert performance.45,46 These systems offer the potential to overcome long-standing limitations of subjective faculty evaluations by providing continuous, standardized, and reproducible feedback across training environments.
Neurosurgery has similarly leveraged AI to enhance both technical and cognitive aspects of resident education. Studies have employed computer vision and deep learning models to analyze operative footage and neuro-navigation data to characterize surgical workflow, procedural efficiency, and error patterns.22–24 Furthermore, natural language processing techniques have been explored to analyze operative reports and educational documentation, enabling assessment of clinical reasoning, case complexity, and operative exposure.47–49
A considerable body of published work has examined the integration of AI in urology trainee education, particularly in simulation-based education and robotic surgery training. AI-powered platforms have been used to analyze metrics such as economy of motion, instrument collisions, and task completion time to provide objective feedback on surgical performance.25–28 Importantly, several studies have demonstrated that AI-derived performance metrics correlate with validated assessment tools and predict future operative proficiency, supporting their use as educational instruments.50,51 Across surgical subspecialties, an underlying theme is the use of AI to transition surgical education away from time-based apprenticeship models and toward data-driven, outcomes-focused training.
Beyond surgical training, multiple surgical subspecialties have explored AI’s role in enhancing didactic learning, scholarship, and development of clinical acumen. Large language models have been evaluated for their ability to generate educational content, summarize literature, assist with examination preparation, and support scholarly writing across general surgery, plastic surgery, and ophthalmology.52–58 While several studies demonstrate improvements in efficiency and learner engagement, concerns remain regarding factual inaccuracies and oversimplification of complex concepts, which was similarly documented by the studies identified in our scoping review.
Additionally, AI-assisted research idea generation, manuscript drafting, and statistical analysis have been examined across disciplines, most prominently orthopedic surgery, with findings that parallel those observed in otolaryngology.59–64 Specifically, AI tools tend to perform well in organizing ideas, revising and rephrasing text, ensuring feasibility, and improving clarity, but remain limited in originality, hypothesis generation, and critical interpretation. Importantly, these observations reinforce the notion that AI should serve as an adjunct to, rather than a replacement for, human expertise, particularly within academic medicine where innovation and critical appraisal are essential.
While AI can benefit residency applicants and admission committees alike, the broader literature has similarly cautioned about the possibility of amplifying existing biases related to race, gender, and socioeconomic status.65–68 These findings closely align with the otolaryngology-specific evidence identified in this review and emphasize the ethical necessity of transparency, bias auditing, and human oversight when using AI-assisted selection tools.69–71
Collectively, evidence from other surgical subspecialties suggests that otolaryngology trainees are well-positioned to benefit from AI-driven educational tools; however, the field currently remains at an early stage of adoption. Existing otolaryngology studies largely focus on discrete applications such as instrument tracking, skills assessment, research assistance, didactic augmentation, and residency selection rather than fully integrated educational systems.
Looking ahead, the future of AI in otolaryngology education hinges on augmenting and optimizing integration, personalization, and oversight. Rather than isolated tools, AI systems may evolve into longitudinal educational platforms capable of tracking operative experience, knowledge acquisition, and scholarly development across residency.
To fully leverage the potential of AI in trainee education, otolaryngology must prioritize the development of expansive datasets that include diverse videos and metadata to improve AI model accuracy and reliability.11 Advances in multimodal AI (integrating video, audio, text, and sensor data) may further enable nuanced assessments of both technical skill and intraoperative decision-making.72 Other promising avenues to enrich learning experiences for otolaryngology trainees include large language models that could more robustly simulate patient interactions or clinical scenarios, allowing residents to practice diagnostic reasoning and decision-making in a risk-free environment.73
Equally important will be the establishment of ethical, regulatory, and educational frameworks to guide AI implementation. Ensuring data privacy, mitigating algorithmic bias, and preserving human judgment will be essential as AI becomes increasingly embedded in training. Faculty will play a critical role, as educators must be equipped to interpret new and evolving AI applications and integrate them meaningfully into training programs and assessment practices.
Our scoping review not only discusses the current uses of AI within comprehensive otolaryngology education, but also underscores the potential for AI to revolutionize otolaryngology education for trainees. Importantly, these applications align closely with the comprehensive goals of residency training, supporting not only surgical competence but also clinical reasoning, didactic learning and academic development, research, and career development.
Conclusion
Several themes emerged in our scoping review on the use of AI in comprehensive otolaryngology education: AI-Enhanced Surgical Training Models, AI-Based Surgical Skill Assessment, Instrument Tracking and Computer Vision, AI as a Research Aid, AI Augmentation of Didactic Learning, and AI in Residency Admissions. This scoping review identified present and future applications of AI technology for otolaryngology residency education. We discuss future AI applications in otolaryngology education, considering new approaches to selection, onboarding, training, and resident evaluation over time, with the ultimate goal of improving residents’ preparedness for practice.