Everything you need to know (but were afraid to ask) about AI buzzwords
Artificial intelligence is everywhere, yet so many aspects of it remain a mystery to the majority of its users. That’s partially because AI is really complex by nature and still evolving in terms of maturity and potential applications. But it’s also because some of the people selling AI-enabled tools can’t resist casting a smoke screen of buzzwords to disguise the fact that, well, it’s still evolving and maturing.
As a result, healthcare organizations looking to invest in AI tools don’t always know exactly how to decipher what they’re being pitched, or how to ask the right questions to cut through the clutter.
The first step to clarity is a better understanding of some of the terms commonly used in the AI conversation. Here are some of the top buzzwords being thrown around the healthcare industry.
Generative AI
The basic definition of artificial intelligence is “a computer model or algorithm that simulates human activities, including learning and problem solving.” Generative AI is a type of AI that takes this concept a step further: it is able to create net-new content (text, images, videos, or audio) by synthesizing large amounts of data, identifying common features of interest across that data, and presenting its own take on content that arises from those themes.
ChatGPT is the most famous example. The “GPT” part stands for “generative pre-trained transformer,” which means it’s a model trained on enormous amounts of unstructured, unlabeled text (much of the public internet). By combing through this data for patterns, it can summarize information and get “creative” with its output as it participates in human-like conversations in response to prompts.
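To make “prompting” a little more concrete, here is a minimal sketch of what asking a generative model for text can look like from code. It assumes the OpenAI Python client (openai 1.x) with an API key already configured in the environment; the model name and the example prompt are purely illustrative, and other vendors offer similar interfaces.

```python
# A minimal sketch of prompting a generative AI model.
# Assumes the OpenAI Python client (openai >= 1.0) and an API key
# available as the OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # picks up the API key from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name; substitute your vendor's
    messages=[
        {"role": "system", "content": "You are a concise clinical documentation assistant."},
        {"role": "user", "content": "Summarize this discharge note in two sentences: ..."},
    ],
)

# The model returns newly generated text rather than a database lookup.
print(response.choices[0].message.content)
```

The point is simply that the model receives a natural language request and produces newly generated text in response, which is what distinguishes generative AI from a search or retrieval tool.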
It’s a hugely promising area of research for healthcare, especially as the volume of data outpaces the human capacity to understand it. But it’s also very, very new and prone to a number of pitfalls, as we’ll discuss in just a moment.
Large language models (LLMs)
LLMs are the underpinnings of text-based generative AI tools. AI models need huge amounts of training data (another good term to know) to learn how to respond correctly to queries. We’re talking billions upon billions of data points. This training data can be unstructured, semi-structured, or fully structured, depending on the use case. LLMs tend to fall on the unstructured side of things: using natural language processing (NLP) and related methods in a semi-supervised manner, they parse vast amounts of text, learn the statistical relationships between text elements (from grammar and syntax to colloquial meanings), and mimic the understanding of human language in their narrative outputs.
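Real LLMs use deep neural networks trained on billions of documents, but the intuition behind “learning statistical relationships between text elements” can be shown with a toy sketch: a tiny, made-up corpus, a table of which word tends to follow which, and new text generated from those counts.

```python
# A toy illustration of learning statistical relationships between words:
# count which word follows which in a tiny made-up corpus, then sample new
# text from those counts. Real LLMs use neural networks over billions of
# documents, but the next-word intuition is similar.
import random
from collections import defaultdict

corpus = (
    "the patient reports mild chest pain . "
    "the patient denies shortness of breath . "
    "the patient reports shortness of breath ."
).split()

# Build a table of word -> words observed to follow it.
following = defaultdict(list)
for current, nxt in zip(corpus, corpus[1:]):
    following[current].append(nxt)

# Generate a short sequence by repeatedly sampling a plausible next word.
word = "the"
output = [word]
for _ in range(8):
    word = random.choice(following.get(word, ["."]))
    output.append(word)

print(" ".join(output))
```

The output reads vaguely like the training text because the model has only learned which words tend to follow which; scale that idea up by many orders of magnitude and you get something that can hold a convincing conversation.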
These are the models that can pass medical board exams, summarize patient histories, suggest next-best treatment options, and assist with research, because they can “read” massive amounts of data more quickly and thoroughly than people.
Hallucinations
Now come the caveats. Generative AI and LLMs are sometimes prone to hallucinations, which are incorrect or misleading responses to a user’s request. Hallucinations can include straight-up factual errors and nonsensical fabrications, which are understandably pretty risky in the healthcare context.
Hallucinations can occur for a variety of reasons, including incomplete, biased, or poor-quality training data. In fact, there is growing concern that as models like ChatGPT are increasingly fed internet-based training content that was itself created by generative AI, hallucinations will spiral out of control as input and output become more and more inbred.
There’s a long list of very obvious (and very funny) examples of hallucinations, but they’re no joke when the errors are subtle or the user doesn’t know what to look for. In these cases, hallucinations can easily spread misinformation or introduce mistakes that propagate through systems unchecked.
Responsible AI and ethical AI
It’s no wonder, then, that many AI experts are strongly pushing for “responsible” or “ethical” development and use of AI, especially in high-risk applications such as medical decision-making. Ethics and responsibility are top of mind for a variety of leading organizations, such as CHIME, CHAI, WHO, FDA, AMA, and many others, all of whom have released guidance on how to make it happen.
Advocates for responsible and ethical AI promote principles of transparency, privacy, accountability, equity, and beneficial use. Developers and users will need to take a systematic approach to identifying and eliminating risks throughout the AI lifecycle.
It boils down to keeping humans in the loop to monitor safety, dignity, and fairness across all use cases where AI might be involved. This should include the administrative environment, even if AI-enabled back office functions don’t directly touch patients.
Black box vs. explainable AI
Responsible and ethical AI is difficult to achieve in a world of “black box” models, or AI tools whose operations are not visible and explainable to the end user. That’s a hard thing to avoid, since AI is designed, by nature, to complete tasks that are more complex than humans can fathom. In fact, since developers themselves don’t always know exactly how deep neural networks and other computational techniques reach their conclusions, explaining those answers to others is nearly impossible.
This is likely to be a growing problem as models become more sophisticated, and as AI models are increasingly used in contexts that affect life-or-death situations, such as insurance coverage decisions. Potential purchasers should be very clear about the explainability and transparency of AI products, and should push back on companies that cannot provide adequate evidence that they know how their models produce their results.
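To give a sense of what the opposite of a black box can look like, here is a deliberately simple sketch of an interpretable model whose learned weights can be printed and inspected. The data and feature names are made up purely for illustration, and scikit-learn is assumed to be installed; the contrast is that deep neural networks generally offer no comparably direct readout of why they decided what they decided.

```python
# One simple flavor of explainability: an interpretable model whose learned
# weights can be inspected directly. Toy, made-up data; scikit-learn assumed.
import numpy as np
from sklearn.linear_model import LogisticRegression

feature_names = ["age_over_65", "prior_admission", "abnormal_lab"]

# Each row is a (fictional) patient, each column a yes/no feature,
# and y marks whether the fictional outcome occurred.
X = np.array([
    [1, 1, 1],
    [1, 0, 1],
    [0, 1, 0],
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
])
y = np.array([1, 1, 0, 0, 1, 0])

model = LogisticRegression().fit(X, y)

# Each coefficient shows how strongly a feature pushes the prediction up or
# down — the kind of readout a deep neural network does not provide.
for name, weight in zip(feature_names, model.coef_[0]):
    print(f"{name}: {weight:+.2f}")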
Prompt engineering
The hottest new job category of the future, prompt engineering is the act of designing natural language queries for AI models that will produce desired outputs. A good prompt engineer understands the capabilities (and limitations) of LLMs and other models so they can get the best results out of these tools.
A high-quality prompt is simple, specific, and well-structured, creating the best chance for an accurate and useful answer.
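As a purely invented illustration of that difference, compare a vague prompt with a more structured one:

```python
# A made-up example contrasting a vague prompt with a well-structured one.
vague_prompt = "Tell me about diabetes treatment."

structured_prompt = (
    "You are assisting a primary care physician.\n"
    "Task: list first-line treatment options for type 2 diabetes in adults.\n"
    "Format: a numbered list with one sentence per option.\n"
    "Constraints: note when an option is typically contraindicated, and say "
    "'I don't know' rather than guessing if you are unsure."
)
```

The second version tells the model who it is helping, what to produce, in what format, and how to behave when it is uncertain, which narrows the room for irrelevant or fabricated answers.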
Skilled prompt engineering is important for recognizing and avoiding hallucinations, as well as for understanding when and how models might give different answers to exactly the same question, or the same answer to different questions. Because generative AI is such a new area of study, we’re still working out how to maximize its potential and avoid its known (and unknown) problems.
While there are plenty of other concepts to explore, getting to grips with some of the basic terminology is a good first step toward creating an AI-driven world that is transparent, fair, ethical, and equitable. Ideally, developers, users, and regulators will work together to promote the good and avoid the potential harm that AI can bring, especially in healthcare and other high-risk, highly regulated areas.
Jennifer Bresnick is a journalist and freelance content creator with a decade of experience in the health IT industry. Her work has focused on leveraging innovative technology tools to create value, improve health equity, and achieve the promises of the learning health system. She can be reached at jennifer@inklesscreative.com.