In lesson 2, we studied goal trees: how to decompose one complex decision into a tree of smaller sub-decisions. That gave us a structure for reasoning, but it still left open an important practical question: where does the actual knowledge come from? In this lesson, that structure gets filled with explicit rules, confidence values, and recommendations.
Rule-based expert systems were the first major proof that AI could be genuinely useful in the real world. They kept goal trees as the underlying reasoning structure, but added a layer on top: explicit IF/THEN rules paired with confidence factors. They did not learn from data, and they did not look like modern neural networks. Instead, they stored expert knowledge explicitly and applied it through logical rules with weights attached to reflect uncertainty.
This lesson transforms lesson 2’s question about structure into a question about knowledge: how do we encode what we know, and how do we apply that knowledge when facts are uncertain? The answer is expert systems.
That may sound old-fashioned, but the core idea is still everywhere: compliance systems, medical checkers, fraud rules, underwriting systems, industrial control logic, and many modern agent workflows all still rely on explicit rule application.
Core learnings about expert systems
- How do expert systems store knowledge in a form a machine can execute?
- What does an inference engine actually do when it applies rules?
- Why were certainty factors introduced instead of using only clean true/false logic?
- Why did expert systems work so well in narrow domains and then struggle to scale?
What an expert system is
In lesson 1, we described AI as a mapping from inputs to outputs. Expert systems still fit that same frame. The difference is that the mapping is not hidden inside learned parameters. It is written down as explicit rules.
One rule can be written as

Rᵢ: C₁ ∧ C₂ ⇒ H (CF = 0.7)

If the symbolic rule notation is unclear, read Certainty factors in expert systems first.
You can read this line as follows:
- Rᵢ means rule number i
- C₁, C₂ are the conditions that must hold
- ∧ means logical “and”
- ⇒ means “implies” or “leads to”
- H is the conclusion or hypothesis produced by the rule
- CF stands for certainty factor
- 0.7 is the confidence attached to the rule
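As a concrete illustration, a rule of this shape can be held in a small data structure. This is a minimal sketch with invented names, not MYCIN’s actual encoding:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    """One IF/THEN rule: conditions imply a conclusion, with a certainty factor."""
    number: int                 # the rule number (the i in R_i)
    conditions: frozenset[str]  # C_1 ... C_n, all of which must hold (logical "and")
    conclusion: str             # H, the hypothesis the rule produces
    cf: float                   # CF, the confidence attached to the rule

# R1: fever AND stiff neck AND severe headache => possible meningitis (CF = 0.7)
r1 = Rule(1, frozenset({"fever", "stiff neck", "severe headache"}),
          "possible meningitis", 0.7)

facts = {"fever", "stiff neck", "severe headache"}
print(r1.conditions <= facts)  # True: every condition of R1 is satisfied
```

The subset test `<=` is all the “matching” a rule needs: a rule is applicable exactly when its condition set is contained in the known facts.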
This is one of the main advantages of rule-based AI: the system’s reasoning is visible.
The three moving parts
Every expert system is built from three connected pieces.
Knowledge base
The knowledge base stores domain knowledge.
- Facts describe the current case, for example high fever or stiff neck.
- Rules describe expert knowledge in IF/THEN form.
In our triage example, a rule might say that fever, neck stiffness, and severe headache together support a meningitis hypothesis.
Inference engine
The inference engine decides how rules are applied.
It follows a repeated cycle:
- Match rules whose conditions are currently satisfied.
- Select one rule to fire.
- Execute the rule and add its conclusion to working memory.
If we call the current set of known facts Fₜ, then firing a rule that concludes H produces

Fₜ₊₁ = Fₜ ∪ {H}
For a quick notation refresh on expressions like this, see Function notation for AI.
The new symbol here is ∪, which means “union” or, more simply, “add this new item to the set.”
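The match–select–execute cycle can be sketched as a small forward-chaining loop. The rule contents below are illustrative stand-ins for the triage example, not the simulator’s actual rule base:

```python
# Each rule: (conditions that must all hold, conclusion to add).
RULES = [
    ({"fever", "stiff neck", "severe headache"}, "possible meningitis"),
    ({"possible meningitis", "positive culture"}, "confirmed meningitis"),
    ({"confirmed meningitis", "no allergy"}, "recommend ceftriaxone"),
]

def forward_chain(initial_facts):
    """Repeatedly fire applicable rules: each firing performs F_{t+1} = F_t U {H}."""
    facts = set(initial_facts)  # working memory, F_t
    trace = []                  # inference trace: which rule fired, and why
    fired = True
    while fired:
        fired = False
        for conditions, conclusion in RULES:
            # Match: all conditions satisfied, conclusion not yet derived.
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)  # Execute: add H to working memory
                trace.append((sorted(conditions), conclusion))
                fired = True
    return facts, trace

facts, trace = forward_chain({"fever", "stiff neck", "severe headache",
                              "positive culture", "no allergy"})
print("recommend ceftriaxone" in facts)  # True
```

The `trace` list is the seed of an explanation facility: because every derived fact records the rule that produced it, the system can later answer “why was this concluded?” by replaying those entries.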
Explanation facility
Expert systems can usually explain their own reasoning, because each conclusion came from an explicit rule firing.
That means the system can answer questions like:
- why was this diagnosis proposed?
- which rules fired?
- which facts were required?
This built-in traceability is one reason expert systems are still attractive in regulated environments.
Why MYCIN mattered
MYCIN is the classic expert-system case study. It was designed to diagnose bacterial infections and recommend antibiotic treatments.
Why it mattered:
- it showed that a rule-based system could perform at or above specialist level in a narrow task
- it used confidence-weighted rules rather than pretending all medical evidence was certain
- it could explain the chain of reasoning behind a recommendation
The key lesson is not just that MYCIN performed well. It is that expert knowledge, when encoded clearly enough, could become executable.
Why certainty factors were needed
Medical reasoning is rarely perfectly certain. Symptoms suggest possibilities; they do not guarantee them.
That is why systems like MYCIN attached a certainty factor to each rule. A certainty factor is not exactly a probability. It is better understood as a practical confidence score used to rank or strengthen conclusions.
For a dedicated explanation, see Certainty factors in expert systems.
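As a brief taste of how such scores behave, here is the classic MYCIN combination rule for the case where two positive certainty factors support the same hypothesis. The numeric values are illustrative:

```python
def combine_cf(cf1: float, cf2: float) -> float:
    """Combine two positive certainty factors for the same hypothesis
    (MYCIN's combination rule for the both-positive case)."""
    return cf1 + cf2 * (1.0 - cf1)

# Two independent rules each support "possible meningitis":
print(combine_cf(0.7, 0.5))  # 0.85: combined confidence grows but stays below 1
```

Note how this differs from probability arithmetic: evidence accumulates monotonically toward 1 without ever reaching it, which matches the ranking role certainty factors play.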
The triage rule base in action
Our triage example turns the abstract architecture into something concrete.
- some facts are directly observed, like fever or stiff neck
- some facts are derived, like possible meningitis
- some conclusions are recommendations, like prescribing ceftriaxone
What you are looking for is not only the final recommendation. You are looking at the path the system takes to get there.
For example:
- if strong symptoms are present, one rule may derive a meningitis hypothesis
- if test evidence is also present, another rule may strengthen that conclusion
- if the patient has no allergy, a treatment recommendation rule can fire
So the general topic of expert systems becomes concrete here: explicit knowledge, explicit updates, explicit explanations.
Explore the expert system simulator
The interactive simulator below shows a simplified MYCIN-style reasoning loop for hospital triage.
As you use it, focus on four panels:
- Observable facts: what the system knows at the start
- Rule base: what knowledge has been encoded
- Working memory: what new facts the system derives while it runs
- Inference trace: why each step happened
The preset scenarios are meant to show different kinds of behavior:
- Meningitis presentation: symptom-based reasoning
- Lab confirmed: stronger evidence and stronger downstream conclusions
- Viral pattern: a different branch of the rule base
- Allergy contraindication: same diagnosis, different treatment outcome because of a safety constraint
Walkthrough: Allergy contraindication
Run this one once in this exact order:
- Choose Allergy contraindication.
- Keep Backward mode and click Run.
- In the trace, observe that meningitis-supporting rules still fire from symptoms plus culture evidence.
- Then follow the treatment split: the “no allergy” condition is absent, so the ceftriaxone rule cannot fire.
- The allergy-specific rule can fire instead, producing the chloramphenicol recommendation.
What this means: one changed fact (allergy status) can redirect treatment while keeping the diagnosis reasoning transparent.
These results can be interpreted quite directly: the system does not recommend ceftriaxone because the patient is allergic to penicillin, which leaves the safety precondition for ceftriaxone unsatisfied. It therefore redirects treatment to the allergy-safe alternative. In practice, that is exactly the kind of transparent safety behavior rule systems are good at.
If you switch to Forward mode, the clinical outcome should stay consistent, but the reasoning flow is different:
- Forward mode starts from observed facts and derives all reachable consequences.
- Backward mode starts from a target recommendation and asks what must be true to justify it.
- In this allergy scenario, Forward mode shows both treatment branches as consequences of available facts, while Backward mode more directly highlights why the ceftriaxone branch fails.
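Backward chaining can be sketched as a recursive goal test: prove a target by finding a rule that concludes it and proving each of that rule’s conditions in turn. The rule contents below are illustrative, not the simulator’s actual rule base:

```python
# Map each conclusion to the condition sets of rules that can derive it.
RULES = {
    "possible meningitis": [{"fever", "stiff neck", "severe headache"}],
    "confirmed meningitis": [{"possible meningitis", "positive culture"}],
    "recommend ceftriaxone": [{"confirmed meningitis", "no allergy"}],
    "recommend chloramphenicol": [{"confirmed meningitis", "penicillin allergy"}],
}

def provable(goal, facts):
    """Backward chaining: a goal holds if it is an observed fact, or if some
    rule concludes it and every condition of that rule is itself provable."""
    if goal in facts:
        return True
    for conditions in RULES.get(goal, []):
        if all(provable(c, facts) for c in conditions):
            return True
    return False

observed = {"fever", "stiff neck", "severe headache",
            "positive culture", "penicillin allergy"}
print(provable("recommend ceftriaxone", observed))      # False: "no allergy" fails
print(provable("recommend chloramphenicol", observed))  # True
```

This mirrors the allergy walkthrough: the ceftriaxone goal fails precisely at the missing “no allergy” condition, while the allergy-specific branch succeeds.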
Expert-style reasoning in modern LLM systems
Rule-based reasoning is still used in many modern LLM stacks, usually as a control layer around neural generation:
- Policy and safety rules: deterministic checks enforce contraindications, compliance, and guardrails.
- Routing rules: IF/THEN logic selects tools, workflows, or prompt templates by request type.
- Verification rules: symbolic validators check whether generated outputs satisfy domain constraints.
- Hybrid loops: the LLM proposes outputs, while explicit rules accept, reject, revise, or escalate.
So expert-system methods did not disappear; they often provide reliability and governance around probabilistic models.
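A hybrid loop of this kind can be sketched in a few lines. Both the generator and the rule names below are hypothetical stand-ins, not a real library API:

```python
def generate_recommendation(case):
    """Stand-in for an LLM call; returns a canned proposal for this sketch."""
    return "ceftriaxone"

# Deterministic safety rules applied to whatever the model proposes.
GUARDRAILS = [
    # Contraindication check: block ceftriaxone for penicillin-allergic patients.
    lambda case, drug: not (drug == "ceftriaxone" and "penicillin allergy" in case),
]

def recommend(case):
    proposal = generate_recommendation(case)
    if all(rule(case, proposal) for rule in GUARDRAILS):
        return proposal                  # accept the model's output
    return "escalate to clinician"       # reject and escalate

print(recommend({"fever"}))              # proposal passes the rules
print(recommend({"penicillin allergy"})) # proposal rejected, case escalated
```

The division of labor is the point: the probabilistic component proposes, and the explicit rules retain veto power, exactly as in the safety and verification layers listed above.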
If the simulator works well, you should come away with two intuitions. First, the system is easy to inspect because every step is explicit. Second, adding more rules quickly increases maintenance complexity.
You can think of this lesson as the next step after lesson 2. The tree structure is still there in the background, but now the reasoning feels more concrete because reusable rules are doing the work and confidence values can influence what gets recommended.
What comes next
In lesson 4, we move away from a fixed rule base and start searching through possible states directly. That shift matters because not every problem can be solved by storing enough rules in advance.
Why expert systems struggled to scale
The central problem was the knowledge acquisition bottleneck.
That means useful knowledge had to be extracted from experts and written down manually. This is harder than it sounds for three reasons:
- experts often know more than they can clearly articulate
- rules interact with one another in ways that become difficult to manage at scale
- real domains change, so the rule base has to be updated continuously by hand
This is where systems like XCON become important. They proved expert systems could generate real business value, but they also showed that maintaining thousands of rules is expensive and fragile.
Why this still matters today
Expert systems are not dead. In many industries they were absorbed into production software and rebranded as decision engines, compliance logic, or workflow rules.
Their core strengths still matter:
- transparent reasoning
- fast execution
- explicit control over decisions
- easy auditing
Their core weakness also still matters: they do not learn their own knowledge.
Notation quick reference
| Symbol | Meaning | Detailed Explanation |
|---|---|---|
| Rᵢ | rule number | What an expert system is |
| Cⱼ | condition number | What an expert system is |
| ∧ | logical and | What an expert system is |
| ⇒ | implies / leads to | What an expert system is |
| H | conclusion or hypothesis | What an expert system is |
| CF | certainty factor | Why certainty factors were needed |
| 0.7 | confidence attached to a rule | What an expert system is |
| Fₜ | fact set at step t | The three moving parts |
| Fₜ₊₁ | updated fact set after one firing step | The three moving parts |
| ∪ | add to / union of sets | The three moving parts |
For standalone math deep dives, see Function notation for AI and Certainty factors in expert systems.
References and Further Reading
- Shortliffe, E.H. Computer-Based Medical Consultations: MYCIN. Elsevier, 1976.
- Winston, Patrick H. Artificial Intelligence, 3rd ed. Addison-Wesley, 1992. Chapter 5.
- McDermott, J. “R1: A Rule-Based Configurer of Computer Systems.” Artificial Intelligence 19(1), 1982.
- Hayes-Roth, F., Waterman, D.A., Lenat, D.B. (eds.) Building Expert Systems. Addison-Wesley, 1983.
- Buchanan, B.G. and Shortliffe, E.H. (eds.) Rule-Based Expert Systems. Addison-Wesley, 1984.
This is Lesson 3 of 18 in the AI Starter Course.