Following the #PharmaUSA conference in Philadelphia last week, I had the opportunity to connect with three different clients over the course of just a few days. Different companies, different priorities, different business questions.
But the conversations converged in a very clear way.
Each of them, in one form or another, centered on what AI means for us in the insights and analytics (I&A) function -- and for the partners who support it.
That is worth noting.
As senior leadership across pharma organizations continues to ask more pointed questions about, and place greater expectations on, what I&A can deliver, the need for AI fluency is becoming more urgent. Not just familiarity with terminology, but a grounded understanding of what these systems can actually do, where they add value, and where their limitations lie.
And that is the intent behind the series of essays I've been producing over the past two weeks.
To help build a shared, practical language. One that allows us to engage more effectively with stakeholders, evaluate solutions more rigorously, and ultimately make better decisions in an increasingly AI-enabled environment.
As with the previous piece, I’m grateful to my colleague Himavanth Chandra, MBA, for continuing to partner in shaping this evolving lexicon, both in identifying the concepts that matter most and in helping ensure we are defining them in ways that reflect how they show up in real-world applications.
What follows is a continuation of that effort.
One of the more subtle, but increasingly relevant limitations in AI systems is context window saturation.
Every model can only process a finite amount of information at once. When that limit is exceeded -- or even approached -- performance begins to degrade. Important details get truncated, earlier context is “forgotten,” and outputs become less coherent or less accurate.
In practice, this shows up when analyzing long patient journeys, large qualitative datasets, or multi-document inputs. The model may appear to respond confidently, but it is doing so with partial visibility.
Understanding this constraint is critical. Because when outputs feel incomplete or inconsistent, the issue is often not the model itself -- but the volume and structure of what we are asking it to process.
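To make this concrete, here is a minimal sketch of a pre-flight check a team might run before sending a long document to a model. It uses the tiktoken tokenizer; the context limit and safety margin below are illustrative assumptions, not properties of any particular model.

```python
# Minimal sketch: count tokens before submission so saturation is
# caught upstream rather than discovered in degraded outputs.
# CONTEXT_LIMIT and SAFETY_MARGIN are illustrative assumptions.
import tiktoken

CONTEXT_LIMIT = 128_000   # assumed context window, in tokens
SAFETY_MARGIN = 0.80      # leave headroom well before the hard limit

def fits_in_context(text: str, encoding_name: str = "cl100k_base") -> bool:
    """Return True if text stays comfortably inside the context window."""
    encoding = tiktoken.get_encoding(encoding_name)
    return len(encoding.encode(text)) <= CONTEXT_LIMIT * SAFETY_MARGIN
```

If the check fails, the right response is usually to restructure the input (see chunking below), not to push the model harder.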
AI systems are not static. And neither is the data they rely on.
Model drift refers to the degradation of a model’s performance over time, often because the patterns it learned no longer hold. Data drift refers to shifts in the underlying data itself; for example, evolving treatment paradigms, new clinical guidelines, or changes in patient behavior.
In pharma, where the landscape evolves constantly, this matters. A model trained on last year’s data may no longer reflect current reality.
Without monitoring and recalibration, even a well-performing model can become misaligned. And because the outputs may still appear plausible, drift can go undetected until it impacts decisions.
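One pragmatic way to catch data drift is to compare today’s input distributions against the distributions the model was trained on. A hedged sketch using a two-sample Kolmogorov-Smirnov test follows; the feature, threshold, and simulated values are illustrative assumptions.

```python
# Sketch of a data drift check: does a feature still follow its
# training-time distribution? Values below are simulated for illustration.
import numpy as np
from scipy.stats import ks_2samp

def has_drifted(train_values, current_values, alpha: float = 0.05) -> bool:
    """A small p-value suggests the current data no longer matches
    the training-time distribution."""
    _, p_value = ks_2samp(train_values, current_values)
    return p_value < alpha

rng = np.random.default_rng(0)
baseline = rng.normal(loc=45, scale=10, size=1000)  # e.g., days to treatment
current = rng.normal(loc=38, scale=10, size=1000)   # paradigm has shifted
print(has_drifted(baseline, current))  # True: flag for recalibration
```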
A knowledge graph is a structured way of representing relationships between entities -- for example, linking drugs, indications, mechanisms of action, and patient populations.
Unlike traditional databases, which store information in tables, knowledge graphs map how concepts relate to one another.
For AI applications, this is relevant. It enables systems to move beyond isolated data points and begin to understand context and connection.
In pharma, where relationships drive insight -- between symptoms and diagnoses, treatments and outcomes -- this structure becomes particularly valuable.
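As a small illustration of what that structure looks like in code, here is a toy sketch using the networkx library; the entities and relationships are invented placeholders.

```python
# Toy knowledge graph: entities as nodes, typed relationships as edges.
# All names here are illustrative placeholders.
import networkx as nx

kg = nx.DiGraph()
kg.add_edge("Drug A", "Indication X", relation="treats")
kg.add_edge("Drug A", "Mechanism M", relation="acts_via")
kg.add_edge("Indication X", "Population P", relation="prevalent_in")

# Unlike a table lookup, we can traverse relationships:
for _, target, data in kg.out_edges("Drug A", data=True):
    print(f"Drug A --{data['relation']}--> {target}")
```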
Traditional search looks for exact matches. Semantic search looks for meaning.
Instead of retrieving documents that contain the same words, semantic search retrieves those that convey similar concepts. It understands that “adverse event” and “side effect” may be related, even if the phrasing differs.
For I&A teams, this has practical implications. Literature reviews, social listening, and qualitative analysis become more comprehensive and more nuanced.
But it also introduces a shift: from searching for what a respondent actually said to interpreting what was truly meant.
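A hedged sketch of the mechanics: documents and queries are embedded as vectors, and similarity of meaning, not wording, drives retrieval. The model name below is a common open-source choice, used here as an assumption.

```python
# Sketch of semantic search with sentence embeddings: the query
# "adverse event" should surface the side-effect report even though
# that exact phrase never appears in the text.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model choice
docs = [
    "Patient reported a side effect after the second dose.",
    "Refill rates improved in the northeast region.",
]
doc_vecs = model.encode(docs, convert_to_tensor=True)
query_vec = model.encode("adverse event", convert_to_tensor=True)

scores = util.cos_sim(query_vec, doc_vecs)[0]
print(docs[int(scores.argmax())])  # the side-effect report
```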
A medical ontology is a standardized framework for organizing clinical concepts and terminology.
It ensures that when different systems refer to a condition, treatment, or outcome, they are speaking the same language -- even if the wording differs.
In AI applications, ontologies provide structure and consistency. They reduce ambiguity and improve the reliability of outputs, particularly in functional areas like medical affairs and real-world evidence.
Without this standardization, even sophisticated models can misinterpret meaning or miss connections.
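In code, the simplest version of this idea is a normalization layer that resolves different wordings to one canonical concept. A toy sketch; the codes below are placeholders, not real SNOMED CT or MedDRA identifiers.

```python
# Toy ontology-backed normalization: many surface forms, one concept.
# Concept codes are illustrative placeholders.
SYNONYM_TO_CONCEPT = {
    "heart attack": "CONCEPT:0001",
    "myocardial infarction": "CONCEPT:0001",
    "mi": "CONCEPT:0001",
    "high blood pressure": "CONCEPT:0002",
    "hypertension": "CONCEPT:0002",
}

def normalize(term: str) -> str:
    """Resolve a free-text term to its canonical concept code."""
    return SYNONYM_TO_CONCEPT.get(term.lower().strip(), "CONCEPT:UNKNOWN")

assert normalize("Heart attack") == normalize("myocardial infarction")
```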
Chunking is the process of breaking large volumes of data into smaller, manageable pieces before feeding them into a model.
This is often necessary because of context window limitations. But it is not just a technical workaround -- it shapes how the model interprets information.
If chunks are too small, context is lost. If they are too large, saturation occurs.
The way data is segmented can directly influence the quality of outputs. Which means that structuring inputs is not a trivial step. It is part of the analytical process.
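A minimal sketch of what this looks like in practice. The sizes are illustrative assumptions; in real workflows they are tuned to the model's context window and the structure of the source documents.

```python
# Minimal chunking with overlap, so context carries across boundaries.
# chunk_size and overlap are illustrative; tune to the model and data.
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping pieces. Too small loses context;
    too large risks saturating the context window."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start : start + chunk_size])
        start += chunk_size - overlap
    return chunks
```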
AI systems learn from data. And data reflects human decisions, behaviors, and assumptions.
Algorithmic bias occurs when those underlying patterns lead to systematically skewed outputs -- whether by overrepresenting certain populations, underrepresenting others, or reinforcing existing disparities.
In pharma, this is highly relevant. Bias can influence patient identification, treatment recommendations, and even how insights are interpreted.
Mitigating this risk requires intentionality: in data selection, in model design, and in ongoing validation. Because left unchecked, bias does not just persist. It compounds.
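One concrete place to start is measurement. A hedged sketch of a simple subgroup check follows; the groups, data, and four-fifths threshold are illustrative assumptions, not a compliance standard.

```python
# Sketch: compare a model's positive-prediction rate across subgroups.
# Data and the 0.8 threshold are illustrative assumptions.
from collections import defaultdict

def selection_rates(records):
    """records: iterable of (group, predicted_positive) pairs."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for group, positive in records:
        counts[group][0] += int(positive)
        counts[group][1] += 1
    return {g: pos / total for g, (pos, total) in counts.items()}

rates = selection_rates([("A", 1), ("A", 1), ("B", 1), ("B", 0), ("B", 0)])
if min(rates.values()) / max(rates.values()) < 0.8:
    print("Potential disparate impact: investigate before relying on outputs.")
```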
One of the most important distinctions to understand is between deterministic and probabilistic systems.
Deterministic systems produce the same output given the same input, every time. Traditional analytics often falls into this category.
AI systems, particularly generative models, are probabilistic. They generate outputs based on likelihood, which means variability is inherent.
This is not a flaw. It is a feature.
But it does change how we evaluate results. Consistency is no longer guaranteed. Instead, we assess outputs based on plausibility, coherence, and alignment with known information.
And that requires a different kind of scrutiny.
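To see why variability is inherent, it helps to look at the sampling step itself. A toy sketch with an invented three-token distribution; real models sample over vast vocabularies, but the principle is the same.

```python
# Toy sketch of temperature sampling: the same input can yield
# different outputs because tokens are drawn from a distribution.
# The logits below are invented for illustration.
import math
import random

def sample(logits: dict[str, float], temperature: float = 1.0) -> str:
    """Exponentiate scaled logits, then draw one token at random."""
    scaled = [(t, math.exp(v / temperature)) for t, v in logits.items()]
    tokens, weights = zip(*scaled)
    return random.choices(tokens, weights=weights, k=1)[0]

logits = {"improved": 2.0, "stable": 1.5, "worsened": 0.5}
print([sample(logits) for _ in range(5)])  # varies run to run
# As temperature approaches zero, the top token dominates and the
# system behaves almost deterministically.
```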
A digital twin is a virtual representation of a real-world entity -- a patient, a healthcare system, or even a market.
These models can be used to simulate scenarios, test interventions, and explore potential outcomes without real-world risk.
In pharma, the applications are compelling: modeling disease progression, predicting treatment responses, or optimizing patient pathways.
But the value of a digital twin is only as strong as the data and assumptions that underpin it.
It is not reality. It is a representation of reality.
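As a small illustration of the idea: a hedged sketch of a patient twin as a state machine, simulated forward under assumed transition probabilities. Every number here is invented; a real twin would be calibrated to real-world evidence.

```python
# Toy digital twin: disease severity as a Markov chain, simulated
# month by month. Transition probabilities are invented placeholders.
import random

TRANSITIONS = {
    "mild":     [("mild", 0.7), ("moderate", 0.3)],
    "moderate": [("mild", 0.3), ("moderate", 0.5), ("severe", 0.2)],
    "severe":   [("moderate", 0.2), ("severe", 0.8)],
}

def simulate(start: str = "mild", months: int = 12) -> list[str]:
    """Generate one plausible trajectory: a representation, not reality."""
    path = [start]
    for _ in range(months):
        states, probs = zip(*TRANSITIONS[path[-1]])
        path.append(random.choices(states, weights=probs, k=1)[0])
    return path

print(simulate())  # one simulated patient journey among many possible
```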
Fine-tuning refers to adapting a pre-trained model to perform better on a specific task or within a specific domain.
Rather than constructing a model from scratch, we refine it using targeted data -- for example, clinical literature, brand materials, or proprietary datasets.
This is how general-purpose models become more relevant in specialized contexts.
But fine-tuning introduces its own considerations: data quality, overfitting, and the balance between general knowledge and domain specificity.
Done well, it enhances precision. Done poorly, it narrows perspective.
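For orientation, much of the practical work in fine-tuning is preparing the training examples themselves. A minimal sketch of domain examples written to JSONL; the chat-style schema is a common convention, used here as an assumption rather than any vendor's required format.

```python
# Sketch: writing fine-tuning examples to JSONL. The schema is a
# common chat-style convention and an assumption, not a fixed standard.
import json

examples = [
    {"messages": [
        {"role": "user", "content": "Summarize the mechanism of Drug A."},
        {"role": "assistant", "content": "Drug A works by ..."},
    ]},
]

with open("finetune_train.jsonl", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```

Data quality matters more than volume here: a small set of carefully curated examples often outperforms a large, noisy one.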
Across all of these concepts, a common theme emerges.
AI performance is not just about the model. It is about how data is structured, how inputs are framed, and how systems are designed and maintained over time.
In other words, the outputs we see are a reflection of the decisions we make upstream.
For those of us in I&A, this is where our role continues to evolve.
Not just interpreting outputs, but shaping the conditions that produce them.
If the earlier stages of this journey were about understanding terminology, this stage is about applying it.
How do we design workflows that account for context limitations? How do we monitor for drift? How do we ensure transparency, reduce bias, and validate outputs?
These are not technical questions alone. They are strategic ones.
And increasingly, they sit at the intersection of analytics, technology, and decision-making.
What is becoming clear is that fluency is no longer optional. It is quickly becoming a prerequisite for how we operate in I&A.
The question is not whether these concepts matter, but how quickly we can translate them into better decisions, better workflows, and more credible guidance to the stakeholders we support.
That translation — from terminology to application — is where the real work now happens.