AI outperforms nurses at predicting ER admissions in Mount Sinai study
Overcrowding has dogged hospitals for decades, and now emergency departments are buckling under the pressure. Patients who need beds often languish in hallways, sometimes for hours, while staff scramble to secure space. The bottleneck costs money, strains clinicians, and, in some cases, costs lives.
A new study from Mount Sinai Health System suggests machine learning could give hospitals valuable breathing room, but it also raises questions about how human judgment fits into increasingly automated triage processes.
Published in Mayo Clinic Proceedings: Digital Health, the study compared predictions made by triage nurses to those generated by a machine learning model trained on nearly 2 million historical emergency department visits. The researchers found that the algorithm, which combines a gradient boosting model (XGBoost) with a natural language processing system known as Bio-Clinical BERT, outperformed nurses when it came to forecasting whether patients would ultimately be admitted to the hospital.
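The paper does not publish its implementation, but the architecture it describes (structured triage features plus a vector representation of the free-text triage note, fed to a gradient boosting classifier) can be sketched. The snippet below is a toy illustration only: it substitutes scikit-learn's `GradientBoostingClassifier` for XGBoost and a simple hashing vectorizer for Bio-Clinical BERT embeddings, and all feature names and data are synthetic stand-ins.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500

# Synthetic structured triage features (e.g., age, vitals, history)
structured = rng.normal(size=(n, 3))

# Synthetic admission labels and matching triage notes
y = (structured[:, 0] > 0).astype(int)
notes = ["chest pain shortness of breath" if label else "minor laceration"
         for label in y]

# Stand-in for Bio-Clinical BERT: any encoder mapping a note to a fixed vector
text_vec = HashingVectorizer(n_features=32).fit_transform(notes).toarray()

# Concatenate structured and text features, then train the boosted model
X = np.hstack([structured, text_vec])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = GradientBoostingClassifier().fit(X_tr, y_tr)
print(round(clf.score(X_te, y_te), 2))
```

The key design idea, reflected in the study's description, is that everything the model consumes is available at the moment triage ends, so a prediction can be made before any physician workup.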
A clear winner and huge implications
The research team, led by clinicians and data scientists at Mount Sinai, pulled data from six hospitals across its network, which collectively handle about half a million ER visits annually. Between September and October 2024, 574 nurses recorded a yes-or-no prediction for more than 46,000 patients at the end of triage. Those judgments were then measured against the machine learning model’s predictions, which drew on demographics, vital signs, medical history, and triage notes.
The results favored the model: nurses predicted admissions with 81.6 percent accuracy, while the machine learning model reached 85.4 percent at its optimal probability threshold. Sensitivity, the ability to catch true admissions, also went to the model, which flagged 70.8 percent of patients who were ultimately admitted, against 64.8 percent for nurses. And when the researchers tested whether combining nurse predictions with the machine learning system would improve outcomes, they found no added benefit; the model stood on its own.
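For readers unfamiliar with the metrics, accuracy and sensitivity at a fixed probability threshold can be computed from predicted probabilities as follows. The numbers here are illustrative, not the study's data.

```python
import numpy as np

def accuracy_and_sensitivity(y_true, probs, threshold):
    """Accuracy and sensitivity (recall on the admitted=1 class) at a cutoff."""
    preds = (probs >= threshold).astype(int)
    accuracy = (preds == y_true).mean()
    # Sensitivity: of the patients actually admitted, what fraction was caught?
    sensitivity = preds[y_true == 1].mean()
    return accuracy, sensitivity

# Toy data: 1 = admitted, 0 = discharged
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
probs  = np.array([0.9, 0.2, 0.6, 0.4, 0.3, 0.1, 0.8, 0.55])

acc, sens = accuracy_and_sensitivity(y_true, probs, threshold=0.5)
print(acc, sens)  # 0.75 0.75
```

Sweeping the threshold trades sensitivity against specificity, which is why the study reports the model's accuracy "at its optimal probability threshold" rather than at a default cutoff.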
A few percentage points of accuracy may seem modest, but in the emergency department even small predictive gains can ripple through the system. Research has repeatedly shown that delays in admitting patients from the ER drive worse outcomes, with one landmark study finding that ICU patients who were admitted without delay had a 12.9 percent mortality rate, compared with 17.4 percent for those who waited. Other studies have also linked surgical delays to increased mortality risk.
Each prolonged admission adds cost, too, and extended hospitalizations represent a significant portion of U.S. healthcare spending. If hospitals can predict admissions earlier, they can allocate beds, notify inpatient teams, and arrange staffing hours in advance, steps that can lead to both cost savings and improved outcomes for patients.
Algorithms challenge clinical instincts (by learning from them)
The Mount Sinai study joins a growing body of work evaluating whether artificial intelligence can outperform human clinicians in triage tasks. Prior research has shown nurses can predict admissions with varying degrees of accuracy, though confidence levels can affect performance.
On the AI side, models that incorporate comprehensive EHR data have achieved area-under-the-curve scores of 0.90 or higher for certain tasks, though those systems often demand heavier computing resources and run the risk of overfitting.
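Unlike the accuracy figures, area under the ROC curve is threshold-independent: it measures how well a model ranks admitted patients above discharged ones across all possible cutoffs. A minimal illustration with made-up probabilities:

```python
from sklearn.metrics import roc_auc_score

# Toy data: 1 = admitted, 0 = discharged
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
probs  = [0.9, 0.2, 0.6, 0.4, 0.3, 0.1, 0.8, 0.55]

# AUC = probability a randomly chosen admitted patient receives a
# higher score than a randomly chosen discharged patient
auc = roc_auc_score(y_true, probs)
print(round(auc, 4))  # 0.9375
```

An AUC of 0.90 or higher, as the comprehensive-EHR models cited above report, means the model almost always ranks a true admission ahead of a discharge, though it says nothing about which single threshold a hospital should act on.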
The Mount Sinai model struck a balance by sticking to data readily available at triage. That design makes it potentially more scalable, especially for hospitals without sophisticated IT infrastructure. “We’re not waiting for a provider to finish evaluating the case,” the authors noted. Instead, the system could, in theory, generate a “reservation list” for beds as patients walk through the door.
Can AI help emergency nurses, not replace them?
One of the study’s most surprising findings was that combining human and AI predictions didn’t improve accuracy. The researchers hypothesized that while nurses bring valuable intuition, the binary yes/no input they were asked to provide may not have captured the nuance of clinical reasoning.
Still, the findings highlight a tension that has cropped up across medicine: clinicians want AI tools that augment judgment, not replace it. Emergency nurses, in particular, have long relied on experience to make rapid calls. Reducing their role to a binary checkbox raises cultural and professional stakes that go beyond accuracy metrics.
Practical hurdles yet to be cleared
For all its promise, the Mount Sinai model has yet to prove real-world impact. The predictions were run offline; they did not influence actual triage decisions or bed assignments. The study also covered only two months within a single health system, limiting generalizability. And the model excluded lab and imaging data, which could further refine predictions but are typically available only after triage.
Implementation will also raise practical questions, as hospitals would need to integrate the algorithm into EHR workflows, train staff, and guard against bias in the data. Interpretability matters, as well. The study used SHAP (SHapley Additive exPlanations) analysis to show that age, acuity, and past admissions were among the strongest predictors, factors that align with clinical reasoning and could help build trust among frontline staff.
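Running true SHAP analysis requires the `shap` package and the trained model; as a rough stand-in that conveys the same idea of ranking predictors by influence, scikit-learn's permutation importance can be applied to a toy admissions model. The features below (age, acuity, prior admissions) mirror the predictors the study names, but the data are synthetic, and permutation importance is a different attribution method than SHAP.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(1)
n = 400

# Synthetic triage-time predictors
age = rng.integers(18, 95, n)
acuity = rng.integers(1, 6, n)        # triage acuity score, 1 (sickest) to 5
prior_admits = rng.poisson(1.0, n)
noise = rng.normal(size=n)

# Synthetic label: admission driven mostly by age, with smaller contributions
y = ((0.03 * age - 0.5 * acuity + 0.4 * prior_admits + noise) > 0).astype(int)

X = np.column_stack([age, acuity, prior_admits])
clf = GradientBoostingClassifier(random_state=0).fit(X, y)

# Permute each feature and measure the drop in accuracy it causes
result = permutation_importance(clf, X, y, n_repeats=10, random_state=0)
for name, imp in zip(["age", "acuity", "prior_admits"],
                     result.importances_mean):
    print(f"{name}: {imp:.3f}")
```

Whatever the attribution method, the point the study makes holds: when the top-ranked predictors match what clinicians already weigh at the bedside, the model's output is easier for frontline staff to trust.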
Hospital leaders are already exploring predictive analytics as part of broader strategies to tackle boarding and overcrowding, with some research showing that forecasting models can help anticipate patient flow patterns and inform staffing decisions. More recent work has tied AI-driven forecasting to potential gains in bed management and cost savings.
Whether the Mount Sinai model can deliver similar benefits remains to be seen. Future studies, the authors noted, will focus on piloting the system in live environments and measuring its impact on length of stay, staffing, and patient safety. The promise of shorter waits, smoother patient flow, and less wasted money is enticing, but the risks of misallocation, overreliance, and cultural pushback must be addressed.
In a system where every minute counts, a few percentage points of predictive power could make a real difference. The question is whether hospitals can harness that edge without losing the human judgment that still anchors emergency care.