
The Bayesian network from lesson 16 was a static snapshot: a patient’s current findings give a posterior over current diagnoses. Real clinical reasoning is inherently temporal. A patient’s condition evolves. Vital signs trend. New symptoms emerge. Test results arrive at different times. An AI system that reasons about patient trajectories needs to model change over time under uncertainty.

This lesson covers the graphical model formalisms that handle temporal uncertainty, along with the practical question of uncertainty quantification: how do we know when to trust an AI prediction?

Before the formal sections, four terms are worth naming. Aleatoric uncertainty comes from genuine randomness or overlap in the data and cannot be removed just by collecting more examples. Epistemic uncertainty comes from limited knowledge and often can be reduced with better data or models. Calibration means predicted probabilities match observed frequencies. Expected utility means the average value of an action after weighting all possible outcomes by their probabilities.

Core learnings about uncertainty and graphical models

  • Temporal uncertainty requires models that represent latent state evolution across time, not only static snapshots.
  • HMMs and DBNs formalize hidden dynamics and evidence generation in sequence data.
  • Uncertainty type matters: aleatoric uncertainty calls for risk-aware decisions, while epistemic uncertainty motivates additional data.
  • Calibrated probabilities plus expected utility give a principled basis for clinical action selection.

Markov Chains and the Markov Property

A Markov chain models a sequence of states where the future depends only on the present, not on the full history:

P(state_{t+1} | state_t, state_{t-1}, ..., state_0) = P(state_{t+1} | state_t)

Symbol meaning in this expression:

  • state_t is the latent/system state at time step t.
  • state_{t+1} is the next-step state.
  • The left side conditions on the full history; the right side conditions only on the current state.
  • The equality states the Markov assumption: the current state is a sufficient summary of the relevant past.

This is the Markov property or “memoryless” assumption. It is rarely exactly true but often approximately true over short timescales, making Markov chains useful in practice.

A triage patient’s acuity level might evolve as a Markov chain: from Stable to Deteriorating to Critical to either Stabilised or Death. Given current acuity, the transition probabilities encode how quickly patients evolve. A planner managing ICU capacity can use this chain to forecast bed demand.
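The bed-demand forecast described above can be sketched by propagating a state distribution through a transition matrix. The transition probabilities below are hypothetical, purely for illustration:

```python
import numpy as np

# Hypothetical acuity transition matrix; each row sums to 1.
# Stabilised and Death are absorbing states.
states = ["Stable", "Deteriorating", "Critical", "Stabilised", "Death"]
P = np.array([
    [0.90, 0.08, 0.02, 0.00, 0.00],
    [0.20, 0.60, 0.20, 0.00, 0.00],
    [0.00, 0.10, 0.55, 0.30, 0.05],
    [0.00, 0.00, 0.00, 1.00, 0.00],
    [0.00, 0.00, 0.00, 0.00, 1.00],
])

# Start with every patient Stable; propagate the distribution 6 steps ahead.
dist = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
for _ in range(6):
    dist = dist @ P   # Markov property: next distribution depends only on the current one

print(dict(zip(states, dist.round(3))))
```

The fraction of probability mass in Critical at each step is exactly the quantity an ICU capacity planner would read off as expected bed demand.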

Hidden Markov Models

A Hidden Markov Model (HMM) separates hidden states from observations. The hidden state (e.g., true disease state: stable/worsening/critical) is not directly observed. What we observe is evidence correlated with the hidden state: vital sign trends, lab values, nursing assessments.

Formally:

  • A sequence of hidden states S1, S2, …, ST following a Markov chain.
  • At each time t, an observation Ot generated according to P(Ot | St).

To read this notation clearly:

  • S_t is the hidden clinical state at time t (not directly observed).
  • O_t is the observed evidence at time t (vitals, labs, notes).
  • T is the sequence length (how many time steps are modeled).
  • P(O_t | S_t) is the emission distribution mapping the hidden state to observable signals.

HMMs answer three questions:

Evaluation: Given an observation sequence, how probable is it under this model? Used to choose among competing disease trajectory models.

Decoding (Viterbi algorithm): Given an observation sequence, what is the most likely hidden state sequence? Applied to ICU monitoring: given the last 24 hours of vital signs, what is the most likely disease trajectory?

Learning (Baum-Welch algorithm): Given observation sequences without hidden state labels, estimate the model parameters. Used to learn disease progression models from EHR data where true disease state is unobserved.
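A minimal Viterbi decoder for a two-state HMM can be sketched as follows; the transition and emission probabilities are hypothetical stand-ins for learned parameters:

```python
import numpy as np

# Toy HMM (hypothetical numbers): hidden states Stable (0) / Worsening (1),
# observations Normal (0) / Abnormal (1) vitals.
pi = np.array([0.8, 0.2])                # initial state distribution
A = np.array([[0.9, 0.1],                # transition: P(S_{t+1} | S_t)
              [0.3, 0.7]])
B = np.array([[0.85, 0.15],              # emission: P(O_t | S_t)
              [0.25, 0.75]])

def viterbi(obs):
    """Most likely hidden state sequence for a sequence of observation indices."""
    T, N = len(obs), len(pi)
    logd = np.log(pi) + np.log(B[:, obs[0]])   # log-probabilities avoid underflow
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        scores = logd[:, None] + np.log(A)     # score of each possible transition
        back[t] = scores.argmax(axis=0)        # best predecessor for each state
        logd = scores.max(axis=0) + np.log(B[:, obs[t]])
    path = [int(logd.argmax())]
    for t in range(T - 1, 0, -1):              # backtrack through best predecessors
        path.append(int(back[t, path[-1]]))
    return path[::-1]

obs = [0, 0, 1, 1, 1]    # Normal, Normal, Abnormal, Abnormal, Abnormal
print(viterbi(obs))      # [0, 0, 1, 1, 1]: decoder infers a switch to Worsening
```

This is the ICU-monitoring question in miniature: given a run of abnormal vitals, the decoder infers when the hidden trajectory most likely turned.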

Dynamic Bayesian Networks

HMMs limit the hidden state to a single variable. Dynamic Bayesian networks (DBNs) generalise this to multiple interacting hidden variables, each with its own conditional dependencies across time. A two-slice temporal Bayes net (2-TBN) specifies:

  • The structure of dependencies within one time slice.
  • The dependencies between time slice t and time slice t+1.

For a sepsis monitoring application, DBN nodes might represent infection status, organ function, and fluid responsiveness at each time step, with cross-time edges representing how each variable at time t affects the variables at time t+1.
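As a sketch (plain dictionaries, not a real DBN library), the 2-TBN structure for this sepsis example might be encoded as two edge maps, one within a slice and one across slices; all variable names here are hypothetical:

```python
# Within-slice edges: dependencies among variables at the same time step.
within_slice = {
    "SIRS": ["Infection"],             # SIRS criteria depend on infection status
    "OrganFunction": ["Infection"],    # organ function depends on infection status
}

# Cross-slice edges: how variables at time t influence variables at time t+1.
cross_slice = {
    "Infection@t+1": ["Infection@t"],                        # infection persists
    "OrganFunction@t+1": ["OrganFunction@t", "Infection@t"], # damage accumulates
    "FluidResponse@t+1": ["OrganFunction@t"],
}

print(sorted(within_slice), sorted(cross_slice))
```

Unrolling this 2-TBN over T time steps yields an ordinary Bayesian network, so the inference machinery from lesson 16 applies directly.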

Beyond Bayesian Networks: Markov Random Fields

Bayesian networks use directed edges, often given a causal reading. Markov Random Fields (MRFs, undirected graphical models) use undirected edges representing symmetric correlation.

MRFs are natural for problems where direction is unclear or where consistency constraints must be symmetric. Image segmentation is the canonical example: adjacent pixels should have similar labels (tissue/bone/organ) without any causal direction specified. The Ising model from statistical physics is an MRF, and its inference algorithms (belief propagation on a grid) underlie many medical image processing pipelines.
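A minimal Ising-style energy function illustrates the symmetric, undirected nature of an MRF; the grid size and coupling strength below are arbitrary:

```python
import numpy as np

def ising_energy(labels, coupling=1.0):
    """Energy of a grid of +/-1 labels: -coupling * sum over neighbour pairs.

    Lower energy = more internally consistent labelling. Each neighbour pair
    contributes symmetrically; no causal direction is specified.
    """
    horiz = np.sum(labels[:, :-1] * labels[:, 1:])   # horizontal neighbour pairs
    vert = np.sum(labels[:-1, :] * labels[1:, :])    # vertical neighbour pairs
    return -coupling * (horiz + vert)

uniform = np.ones((4, 4), dtype=int)   # perfectly smooth segmentation
checker = np.fromfunction(lambda i, j: (-1.0) ** (i + j), (4, 4)).astype(int)

# The smooth labelling has lower (better) energy than the checkerboard.
print(ising_energy(uniform), ising_energy(checker))
```

Segmentation under this model amounts to finding low-energy labellings that also fit the observed pixel intensities, which is exactly what grid belief propagation approximates.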

Uncertainty Quantification: The Practical Problem

Knowing that a model is uncertain is as important as knowing its prediction. There are two types of uncertainty:

Aleatoric uncertainty (irreducible): inherent randomness in the data-generating process. Even with infinite data, a 60-year-old male smoker presenting with chest pain has genuine uncertainty in diagnosis; multiple conditions are plausible and some overlap.

Epistemic uncertainty (reducible): uncertainty due to limited knowledge or data. A model that has seen very few patients with this atypical presentation is uncertain because it lacks information, not because the problem is inherently ambiguous.

Distinguishing these matters because aleatoric uncertainty should trigger caution, while epistemic uncertainty should trigger more data collection.

Bayesian Neural Networks place probability distributions over weights rather than point estimates. The posterior weight distribution induces a predictive distribution rather than a point prediction. In practice this is approximated via:

  • Monte Carlo Dropout: run the network multiple times with dropout active at inference time; the variance in predictions estimates uncertainty.
  • Deep Ensembles: train multiple networks with different random initialisations; disagreement between ensemble members signals uncertainty.
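The ensemble idea can be illustrated without a deep learning framework. Below, tiny random linear "models" stand in for independently trained networks, and the spread of their predictions serves as the uncertainty signal; all weights and inputs are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_model(rng):
    """A random linear classifier with sigmoid output, standing in for one
    independently initialised ensemble member."""
    W = rng.normal(size=(2,))
    return lambda x: 1 / (1 + np.exp(-(x @ W)))

ensemble = [make_model(rng) for _ in range(10)]

x_typical = np.array([0.2, 0.1])    # small-magnitude input: members tend to agree
x_atypical = np.array([5.0, -4.0])  # extreme input: members disagree strongly

for x in (x_typical, x_atypical):
    preds = np.array([m(x) for m in ensemble])
    # Mean is the ensemble prediction; spread estimates epistemic uncertainty.
    print(preds.mean().round(3), preds.std().round(3))
```

The second input produces a much larger standard deviation: the members extrapolate differently where they are unconstrained, which is precisely the epistemic signal an ensemble provides.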

Calibration Revisited

A model is calibrated if its stated confidence reflects empirical accuracy. Calibrated models are essential in clinical settings:

  • An overconfident model will not prompt the physician to seek additional tests.
  • An underconfident model will create alert fatigue.

Expected Calibration Error (ECE) bins predictions by confidence (e.g., 0.5-0.6, 0.6-0.7, …) and measures the gap between mean confidence and mean accuracy within each bin. Well-calibrated modern medical AI models typically achieve ECE below 0.05 on held-out test data.
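The binning procedure just described can be sketched directly; the bin scheme (equal-width, half-open bins) is one common convention:

```python
import numpy as np

def ece(confidences, correct, n_bins=10):
    """Expected Calibration Error: weighted average of |confidence - accuracy|
    over equal-width confidence bins (lo, hi]."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    total = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(confidences[in_bin].mean() - correct[in_bin].mean())
            total += in_bin.mean() * gap   # weight gap by fraction of samples in bin
    return total

# Perfectly calibrated toy data: 0.7-confidence predictions right 70% of the time.
print(round(ece([0.7] * 10, [1] * 7 + [0] * 3), 3))   # 0.0
```

An overconfident model (say, 0.9 confidence but only 50% accuracy) would score a large ECE under the same computation.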

Temperature scaling is the simplest post-hoc calibration method: divide logits, the raw pre-softmax output scores of the model, by a learned scalar T before the final softmax. This does not change the model’s predictions but adjusts their confidence, typically reducing overconfidence.
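A sketch of temperature scaling on a single made-up logit vector shows both properties at once: the predicted class is unchanged, and the confidence softens:

```python
import numpy as np

def softmax(z):
    z = z - z.max()           # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical raw logits for three diagnoses; T > 1 softens confidence.
logits = np.array([4.0, 2.0, 1.0])
for T in (1.0, 2.0):
    probs = softmax(logits / T)
    print(T, probs.round(3), "argmax:", probs.argmax())
```

In practice T is fit on a held-out validation set by minimising negative log-likelihood, then frozen; since dividing by a positive scalar preserves the ordering of logits, accuracy is untouched.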

The Sepsis Network

The network below models a simplified sepsis diagnosis scenario. Infection causes SIRS criteria and hypotension, both of which contribute to a sepsis diagnosis. Try setting evidence about SIRS or blood pressure and watch how posterior probability of sepsis updates.

[Interactive widget: Sepsis Bayesian Network. Set SIRS criteria and blood pressure observations to update the sepsis posterior probability; click a node to view its CPT.]

Decision Making Under Uncertainty

Once we have a posterior probability distribution, the AI system must recommend an action. This requires utility theory: assigning value to outcomes.

Expected utility of an action A:

EU(A) = sum over outcomes O of: P(O | A, evidence) * Utility(O)

Interpretation of symbols:

  • A is a candidate action (discharge/admit/ICU transfer).
  • O indexes the possible outcomes under that action.
  • P(O | A, evidence) is the outcome probability conditioned on the action and current evidence.
  • Utility(O) is the clinical value/cost assigned to outcome O.

The action with the highest expected utility is the rational choice. In a triage setting:

  • Action: discharge vs. admit vs. transfer to ICU.
  • Outcomes: correct discharge, missed critical illness, unnecessary admission.
  • Utilities: assign costs to false negatives (missed meningitis: catastrophic) and false positives (unnecessary admission: moderate).

Maximising expected utility automatically produces asymmetric decision thresholds: a rational system accepts more false positives to avoid false negatives when the cost asymmetry is large. This is not cherry-picking; it is the mathematically correct response to the utility structure of clinical decisions.
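The triage example above can be worked through numerically. Every probability and utility below is hypothetical, chosen only to show the asymmetry:

```python
# Posterior probability (from the diagnostic model) that the patient is critical.
p_critical = 0.08

# P(outcome | action, evidence) for two candidate actions.
actions = {
    "discharge": {"correct_discharge": 1 - p_critical,
                  "missed_critical": p_critical},
    "admit":     {"unnecessary_admission": 1 - p_critical,
                  "treated_critical": p_critical},
}

# Utilities encode the cost asymmetry: a missed critical illness is catastrophic,
# an unnecessary admission only moderately costly.
utility = {
    "correct_discharge": 0.0,
    "missed_critical": -100.0,
    "unnecessary_admission": -5.0,
    "treated_critical": -10.0,
}

def expected_utility(action):
    """EU(A) = sum over outcomes O of P(O | A, evidence) * Utility(O)."""
    return sum(p * utility[o] for o, p in actions[action].items())

best = max(actions, key=expected_utility)
print({a: round(expected_utility(a), 2) for a in actions}, "->", best)
```

Even at only 8% posterior probability of critical illness, admission wins: the 100-to-5 cost asymmetry pushes the rational decision threshold far below 50%, which is the asymmetric-threshold behaviour described above.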

Key Takeaways

  • Markov chains and HMMs model temporal evolution under uncertainty.
  • DBNs generalise HMMs to multiple interacting variables across time.
  • Markov Random Fields handle undirected, symmetric correlations (e.g., spatial coherence in medical images).
  • Aleatoric uncertainty is irreducible; epistemic uncertainty can be reduced by more data.
  • Bayesian neural networks, Monte Carlo Dropout, and deep ensembles provide principled uncertainty estimates.
  • Calibration (ECE, temperature scaling) ensures predicted probabilities are trustworthy.
  • Expected utility maximisation provides a principled framework for action selection under uncertainty.

Relation to earlier lessons

  1. Lesson 16 introduced Bayesian-network uncertainty in a static setting.
  2. Lesson 17 extends that framework to temporal trajectories and uncertainty decomposition.
  3. This keeps the same triage thread while upgrading from snapshot reasoning to sequence reasoning.

Concrete bridge: lesson 16 asked “What is the posterior now?” This lesson asks “How does that uncertainty evolve over time, and what action should we take?”

Notation quick reference

  • P(s_{t+1} | s_t): Markov transition probability (see Markov Chains and the Markov Property)
  • HMM: hidden Markov model (see Hidden Markov Models)
  • DBN: dynamic Bayesian network (see Dynamic Bayesian Networks)
  • MRF: Markov random field (see Beyond Bayesian Networks: Markov Random Fields)
  • ECE: expected calibration error (see Calibration Revisited)
  • EU(A): expected utility of action A (see Decision Making Under Uncertainty)
  • Aleatoric: irreducible data-generating uncertainty (see Uncertainty Quantification: The Practical Problem)
  • Epistemic: uncertainty from limited model knowledge (see Uncertainty Quantification: The Practical Problem)

What comes next

Lesson 18 closes the course by integrating symbolic AI, search, learning, deep architectures, and probabilistic reasoning into one coherent operating map.

To recap the thread, revisit Lesson 16: Probabilistic AI for static uncertainty, then continue to Lesson 18: Wrapping Up.


References and Further Reading

  • Rabiner, L. “A Tutorial on Hidden Markov Models.” Proceedings of the IEEE 77(2), 1989.
  • Koller, D. and Friedman, N. Probabilistic Graphical Models. MIT Press, 2009.
  • Murphy, K. Dynamic Bayesian Networks: Representation, Inference and Learning. PhD Thesis, 2002.

This is Lesson 17 of 18 in the AI Starter Course.