Artificial intelligence (AI) is a general term for computer systems that imitate human intelligence. Deep learning (DL) is a subset of machine learning, itself a branch of AI, that uses multi-layer neural networks to extract high-level features and make complex judgments1).
Ophthalmology is one of the medical fields where AI has advanced the most. Fundus photos, OCT (optical coherence tomography), visual field tests, and other image data are standardized, making it easier to secure large amounts of training data. AI is mainly used for the following three purposes.
In 2018, the FDA approved the first fully autonomous AI diagnostic system (IDx-DR), accelerating the practical use of ophthalmic AI diagnosis2). IDx-DR can be operated by non-ophthalmic staff in internal medicine and primary care settings, and it automatically determines whether referral to an ophthalmology specialist is needed2).
Deep learning systems have shown accuracy comparable to that of specialists in detecting diabetic retinopathy, glaucoma, and age-related macular degeneration (AMD) from fundus photographs, demonstrating the potential of AI diagnosis8).
AI automatically analyzes fundus photographs and OCT images to detect diseases such as diabetic retinopathy, glaucoma, and age-related macular degeneration. Screening AI (fully autonomous) can be operated by non-ophthalmologists and is used for primary screening in regions with a shortage of specialists. The ophthalmology knowledge of AI chatbots (such as GPT-4) and their use in patient education are also being studied3). In all cases AI is positioned as a support tool, and the final diagnosis is made by an ophthalmology specialist.
Ophthalmic AI is broadly divided into the following three types according to function and level of autonomy.
Screening AI (fully autonomous)
It automatically analyzes fundus photographs and determines whether referral is unnecessary or needed. It can operate even where ophthalmology specialists are not available, and is applied to the following diseases2).
Diagnostic support AI (semi-autonomous)
A system that assists physicians in image interpretation. It is used for AMD subtype classification through automatic segmentation of OCT layer structures, and for severity assessment of diabetic macular edema (DME).
AI chatbot (multimodal)
An application of a large language model that analyzes text (history-taking information) and images (fundus photographs and OCT) at the same time. ChatGPT-4’s ophthalmic knowledge and image interpretation ability have been evaluated, and its use for patient education and remote history-taking is being considered3).
| AI type | Representative system | Target | Accuracy metric |
|---|---|---|---|
| Screening AI (autonomous) | IDx-DR2) | Diabetic retinopathy | Sensitivity 87.2%, specificity 90.7% |
| Screening AI (autonomous) | i-ROP DL5) | ROP | Sensitivity 91%, specificity 91% |
| Screening AI (autonomous) | EyeArt4) | Diabetic retinopathy | Evaluated and used in the UK NHS |
| AI chatbot | ChatGPT-43) | Ophthalmology knowledge assessment | Overall accuracy 70% |
IDx-DR2) is the first fully autonomous AI diagnostic system, approved by the FDA in 2018. Non-ophthalmic staff take images with a non-mydriatic fundus camera, and the AI automatically analyzes them and decides whether referral is needed. It is being introduced in primary care settings.
Key performance indicators (Abràmoff et al. 2018 pivotal trial)2): sensitivity 87.2%, specificity 90.7%.
IDx-DR has made autonomous DR screening possible in internal medicine and primary care settings, allowing efficient selection of cases that need referral to an ophthalmology specialist2).
The accuracy of GPT-4 on multiple-choice ophthalmology questions has been evaluated3), with an overall accuracy of 70%.
| Field | Accuracy |
|---|---|
| Retina | 77% (highest)3) |
| Eye tumors | 72%3) |
| Pediatric ophthalmology | 68%3) |
| Uveitis | 67%3) |
| Glaucoma | 61%3) |
| Neuro-ophthalmology | 58% (lowest)3) |
This difference shows that the chatbot’s image interpretation ability still lags behind its non-image text comprehension. It has been pointed out that proper integration of multimodal chatbots in clinical settings is essential3).
IDx-DR (FDA-approved in 2018)
Target disease: diabetic retinopathy
Accuracy: sensitivity 87.2%, specificity 90.7%
Features: fully autonomous. Can be operated by non-ophthalmologists. Used in internal medicine and primary care2)
EyeArt (Eyenuk)
Target disease: diabetic retinopathy
Accuracy: evaluated and put into practical use in the UK NHS
Features: integrated into screening programs4)
i-ROP DL (2018)
Target disease: retinopathy of prematurity (ROP)
Accuracy: sensitivity 91%, specificity 91%
Feature: automatic detection of plus disease in the NICU5)
ChatGPT-4 (OpenAI)
Scope: ophthalmology knowledge and image interpretation assessment
Accuracy: overall accuracy 70% (retina 77%, neuro-ophthalmology 58%)
Feature: research stage for applications in patient education and remote consultations3)
Diabetic retinopathy screening AI (IDx-DR) achieved 87.2% sensitivity and 90.7% specificity, with accuracy comparable to ophthalmologist interpretation2). AI for retinopathy of prematurity (ROP) (i-ROP DL) also achieved 91% sensitivity and 91% specificity5). By contrast, in the ophthalmology knowledge evaluation of the AI chatbot (ChatGPT-4), the overall correct answer rate was 70%, and in neuro-ophthalmology it was lower at 58%3). In all cases, AI is only an assistive tool, and if any abnormality is detected, a detailed examination by an ophthalmology specialist is needed.
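The sensitivity and specificity figures quoted above come from a confusion matrix of AI results against a reference standard. The following is a minimal sketch of that arithmetic; the counts are hypothetical, scaled to 1,000 diseased and 1,000 healthy patients so that they reproduce the reported IDx-DR percentages, and are not the trial's actual data.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of patients with disease that the system flags (true positive rate)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of patients without disease that the system clears (true negative rate)."""
    return tn / (tn + fp)


# Hypothetical counts for illustration only (not the pivotal trial's data).
tp, fn = 872, 128   # diseased patients: flagged / missed
tn, fp = 907, 93    # healthy patients: cleared / falsely flagged

print(f"sensitivity = {sensitivity(tp, fn):.1%}")  # sensitivity = 87.2%
print(f"specificity = {specificity(tn, fp):.1%}")  # specificity = 90.7%
```

In a screening setting, the missed cases (`fn`) are the patients who are not referred despite having disease, which is why sensitivity is usually the first figure examined.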
Evidence on the cost-effectiveness of AI-based ophthalmic screening has accumulated across multiple studies1).
In Wu’s systematic review (2021), 11 of 15 studies evaluating the economics of AI-based DR screening found it to be cost-effective1).
| Region / setting | Cost-effectiveness assessment | Source |
|---|---|---|
| NHS Scotland | Annual savings of $403,200 | Wu 20211) |
| U.S. primary care | 23.3% cost reduction (per patient) | Wu 20211) |
| Rural areas in China | $34.86 cheaper than human graders, +0.04 QALY | Wu 20211) |
| Japan (AMD, Tamura et al. 2022) | ICER $99,283/QALY (above the threshold) | Wu 20211) |
Autonomous AI screening has been reported to be the most cost-effective compared with telemedicine, ophthalmoscopy, and assisted AI1). At a willingness-to-pay threshold of $7, it was found to be cost-effective compared with assisted screening1).
In a Japanese cohort simulation (500,000 people aged 40 and over, prevalence 3.85%), the ICER for AI screening every 3 years was $99,283/QALY ($92,890-$99,283)1). This exceeds Japan’s willingness-to-pay threshold (about $47,286/QALY), so the cost-effectiveness of AMD screening remains uncertain for now1). However, cost-effectiveness may improve as AI technology advances and costs fall.
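The comparison above rests on the incremental cost-effectiveness ratio (ICER): the extra cost of a strategy divided by the extra QALYs it gains, judged against a willingness-to-pay threshold. The sketch below shows that check using the ICER and threshold reported in the text; the function names and the cost and QALY increments in the last example are hypothetical.

```python
def icer(delta_cost: float, delta_qaly: float) -> float:
    """ICER = (cost_new - cost_old) / (QALY_new - QALY_old), in $ per QALY gained."""
    if delta_qaly <= 0:
        raise ValueError("this sketch only handles strategies that gain QALYs")
    return delta_cost / delta_qaly


def cost_effective(icer_value: float, wtp_threshold: float) -> bool:
    """A strategy is considered cost-effective when its ICER is at or below the threshold."""
    return icer_value <= wtp_threshold


# Figures reported in the text for the Japanese AMD cohort simulation.
REPORTED_ICER = 99_283.0   # $/QALY
WTP_THRESHOLD = 47_286.0   # $/QALY (approximate)

print(cost_effective(REPORTED_ICER, WTP_THRESHOLD))   # False
print(f"{REPORTED_ICER / WTP_THRESHOLD:.2f}")         # 2.10

# Hypothetical increments, for illustration only: $500 more per person, 0.02 QALY gained.
example = icer(delta_cost=500.0, delta_qaly=0.02)
print(cost_effective(example, WTP_THRESHOLD))          # True
```

The reported ICER is roughly 2.1 times the threshold, which is why the text treats AMD screening as not yet clearly cost-effective.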
The following are the ethical and legal issues raised by ophthalmic AI1).
Systems approved by regulatory authorities such as the FDA (for example, IDx-DR) have undergone rigorous clinical trials, and a certain level of safety has been confirmed2). However, AI diagnosis is an assistive tool, and the final diagnosis and treatment plan should be determined by an ophthalmologist. Self-diagnosis using only an AI chatbot (such as ChatGPT) is not recommended. AI accuracy may decrease with poor image quality, rare diseases, and neuro-ophthalmology cases3), so if an abnormality is suspected, it is important to see an eye doctor promptly.

The convolutional neural network (CNN) is the core technology of AI diagnosis in ophthalmology.
Transfer learning (applying pre-trained models from other domains such as ImageNet to ophthalmic images) is widely used as a method to achieve high accuracy even when training data are limited.
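The idea behind transfer learning can be sketched in a few lines: the pre-trained feature extractor stays frozen and only a small new classification head is trained on the limited target data. The example below is a toy in pure Python, not a real CNN. The `FROZEN_WEIGHTS` matrix is a hypothetical stand-in for a backbone pre-trained on another domain, and the three-value "images" and labels are synthetic.

```python
import math

# Hypothetical stand-in for a backbone pre-trained elsewhere (e.g. on ImageNet).
# These weights are never updated during training on the new task.
FROZEN_WEIGHTS = [[0.9, -0.4, 0.1], [-0.3, 0.8, 0.5]]


def frozen_features(pixels):
    """Map a 3-value toy 'image' to 2 features using the frozen backbone (ReLU)."""
    return [max(0.0, sum(w * x for w, x in zip(row, pixels))) for row in FROZEN_WEIGHTS]


def predict(head, feats):
    """Probability of the positive class from the trainable logistic head."""
    z = head["bias"] + sum(w * f for w, f in zip(head["weights"], feats))
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples, labels, epochs=200, lr=0.5):
    """Fit only the new head by stochastic gradient descent on the log loss."""
    head = {"weights": [0.0, 0.0], "bias": 0.0}
    feats = [frozen_features(s) for s in samples]  # backbone output, computed once
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            err = predict(head, f) - y
            head["weights"] = [w - lr * err * fi for w, fi in zip(head["weights"], f)]
            head["bias"] -= lr * err
    return head


# Tiny synthetic training set (hypothetical): four "images" with binary labels.
samples = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1], [0.1, 1.0, 0.2], [0.0, 0.9, 0.3]]
labels = [1, 1, 0, 0]

head = train_head(samples, labels)
preds = [round(predict(head, frozen_features(s))) for s in samples]
print(preds)  # [1, 1, 0, 0]
```

Because only the head's three parameters are learned, a handful of labeled examples is enough here; in practice the same pattern is applied with a deep CNN backbone and, often, later fine-tuning of some backbone layers.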
Research is also advancing on using GANs (generative adversarial networks) to generate synthetic images and artificially expand training data for rare diseases.
Multimodal AI that processes text (history-taking information) and images (fundus photographs and OCT) at the same time is being applied to ophthalmology as large language models (such as GPT-4) continue to advance3). While it can integrate more diverse information than a single-modality CNN, it has been shown that its ability to interpret images is still weaker than its understanding of text3).
Deep-learning analysis has shown that systemic risk factors such as age, sex, systolic blood pressure, smoking history, and HbA1c may be predictable from fundus photographs alone6). Some accuracy has also been reported in predicting the future risk of cardiovascular events (myocardial infarction and stroke), drawing attention to fundus photographs as a possible window into overall health. AI models for predicting dementia, kidney disease, and anemia are still at the research stage6).
Fundus photography with a small clip-on lens attached to a smartphone, combined with AI analysis, has been shown to make DR screening practical in patients with diabetes in India7). Sensitivity and specificity were comparable to those of dedicated fundus cameras, and AI screening on low-cost general-purpose devices could help expand access in developing countries and rural areas.
Combining AI screening with telemedicine is expected to improve access to eye care in remote and developing regions. Even in facilities without an eye specialist, AI can perform the initial screening and send only positive cases for remote review by a specialist, allowing more efficient use of medical resources.
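The triage step described above can be sketched as a simple routing rule. Everything in this example is hypothetical: the `ScreeningResult` fields, the 0.5 threshold, and the choice to escalate ungradable images (an assumption motivated by the note that AI accuracy may drop with poor image quality).

```python
from dataclasses import dataclass


@dataclass
class ScreeningResult:
    patient_id: str
    ai_score: float   # hypothetical model output in [0, 1]; higher means more suspicious
    gradable: bool    # False when image quality is too poor for the AI to analyze


def triage(results, threshold=0.5):
    """Split screened patients into a remote-review queue and a routine queue."""
    remote_review, routine = [], []
    for r in results:
        # Assumption: ungradable images are escalated rather than cleared.
        if not r.gradable or r.ai_score >= threshold:
            remote_review.append(r.patient_id)
        else:
            routine.append(r.patient_id)
    return remote_review, routine


results = [
    ScreeningResult("A", 0.91, True),    # positive: sent to a specialist remotely
    ScreeningResult("B", 0.12, True),    # negative: routine follow-up
    ScreeningResult("C", 0.30, False),   # ungradable: escalated for review
]
review, routine = triage(results)
print(review)   # ['A', 'C']
print(routine)  # ['B']
```

Only the review queue reaches the specialist, which is where the saving in specialist time comes from.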
Research is progressing on AI that predicts a patient's response to anti-VEGF therapy (ranibizumab, aflibercept, faricimab, etc.) before treatment and suggests the best dosing plan for each patient. Models that predict treatment effect from OCT images may help reduce the number of injections and improve visual prognosis.
Large language models (such as GPT-4) are being studied for uses such as explaining diseases to patients, preparing informed consent documents, and assisting with interviews3). However, challenges remain in preventing errors and bias in medical information and in maintaining the doctor-patient relationship. It is not recommended for patients to rely only on chatbots to make decisions about self-diagnosis or self-treatment3).
1) Wu JH, Liu TYA, Hsu WT, et al. Performance and limitation of machine learning algorithms for diabetic retinopathy screening: meta-analysis. J Med Internet Res. 2021;23(11):e23863.
2) Abràmoff MD, Lavin PT, Birch M, Shah N, Folk JC. Pivotal trial of an autonomous AI-based diagnostic system for detection of diabetic retinopathy in primary care offices. NPJ Digit Med. 2018;1:39. doi:10.1038/s41746-018-0040-6. PMID:31304320; PMCID:PMC6550188.
3) Mihalache A, Popovic MM, Guo MZ, et al. Performance of an upgraded artificial intelligence chatbot for ophthalmic knowledge assessment. JAMA Ophthalmol. 2024;142(3):234-241.
4) Olvera-Barrios A, Heeren TF, Balaskas K, et al. Diagnostic accuracy of diabetic retinopathy grading by an artificial intelligence-enabled algorithm compared with a human standard reference. Diabetologia. 2023;66(5):857-866.
5) Brown JM, Campbell JP, Beers A, et al. Automated diagnosis of plus disease in retinopathy of prematurity using deep convolutional neural networks. JAMA Ophthalmol. 2018;136(7):803-810.
6) Poplin R, Varadarajan AV, Blumer K, Liu Y, McConnell MV, Corrado GS, et al. Prediction of cardiovascular risk factors from retinal fundus photographs via deep learning. Nat Biomed Eng. 2018;2(3):158-164. doi:10.1038/s41551-018-0195-0. PMID:31015713.
7) Rajalakshmi R, Subashini R, Anjana RM, et al. Automated diabetic retinopathy detection in smartphone-based fundus photography using artificial intelligence. Eye. 2018;32(6):1138-1144.
8) Ting DSW, Cheung CY, Lim G, Tan GSW, Quang ND, Gan A, et al. Development and validation of a deep learning system for diabetic retinopathy and related eye diseases using retinal images from multiethnic populations with diabetes. JAMA. 2017;318(22):2211-2223. doi:10.1001/jama.2017.18152. PMID:29234807; PMCID:PMC5820739.