
Everyone is suddenly an AI expert. Teams are asked to “use AI” quickly, products are rebranded, and expectations rise fast. So it helps to slow down for a moment and build a definition that is actually useful when you read papers, test tools, or try to build something yourself.

John McCarthy, who coined the term artificial intelligence, described AI as “the science and engineering of making intelligent machines.” That is already a good starting point. I find it even more useful to think of AI as the science and engineering of machines turning raw inputs into meaningful results. That is the lens you will keep using throughout this course.

Core learnings about artificial intelligence

  • What does AI actually mean in a practical, non-hyped way?
  • How can you describe an AI system mathematically as a mapping from input to output?
  • Why do symbols like X, Y, x, y, and f matter once you start reading more technical material?
  • How does one simple notation frame the rest of the course?

AI in one line

In this course, we make that definition practical with one simple sentence:

“AI is a system that maps inputs to useful outputs under real-world constraints.”

To introduce the notation you will need later at the same time, so that research-style expressions become easier to read over time, we write this mapping as:

f : X \to Y

If this notation feels new, open Function notation for AI and then come back.

Think of it like this:

  • X is the full set of possible inputs the system could receive
  • Y is the full set of possible outputs the system could return
  • f is the rule, model, or process in between
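As a minimal sketch of this view (the spaces and the rule below are toy examples, not from the course material), X and Y can be written as explicit sets and f as the rule connecting them:

```python
# Illustrative sketch: X and Y as explicit (toy) sets, f as the rule between them.

X = {"low", "medium", "high"}   # every input the system is designed to accept
Y = {"monitor", "treat"}        # every output it is allowed to return

def f(risk: str) -> str:
    """One possible rule f : X -> Y."""
    return "treat" if risk == "high" else "monitor"

# Each element of X maps to some element of Y.
for risk in sorted(X):
    print(risk, "->", f(risk))
```

The capital letters describe the whole sets; calling f on one element produces one element of Y.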

The notation symbols themselves also matter:

  • : means “is a function from”
  • \to means “maps to”
  • = means “is equal to” or “is given by”
  • \in means “is an element of” or more simply “belongs to”

The capital letters matter here. They describe the whole space of possibilities. So X does not mean one patient. It means the set of all valid patient descriptions the system is designed to handle. In the same way, Y does not mean one diagnosis. It means the set of all allowed outputs.

When we talk about one concrete case, we switch to lowercase letters:

  • x \in X means one specific input example
  • y \in Y means one specific output for that example

If a hospital triage system receives patient signals and returns a treatment recommendation, that is already an AI mapping problem.

For one patient we can write:

y = f(x)
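As a hedged sketch of this triage reading of y = f(x) (the feature names and thresholds below are invented for illustration), applying f to one concrete patient x yields one concrete recommendation y:

```python
# Toy triage rule; feature names and thresholds are invented for illustration.

def f(patient: dict) -> str:
    """Map one concrete patient x to one concrete recommendation y."""
    if patient["temperature_c"] >= 39.0 and patient["stiff_neck"]:
        return "urgent"
    return "routine"

x = {"temperature_c": 39.4, "stiff_neck": True}   # one specific input x in X
y = f(x)                                          # one specific output y in Y
print(y)  # -> urgent
```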

Same notation, different zoom level. If you want a short refresher first, use Function notation primer.

This is not a different idea from f : X \to Y. It is the same mapping written at a different zoom level:

  • f : X \to Y says what kind of function we are talking about in general
  • f(x) = y says what that function does for one concrete input

So the first notation is the global view, and the second notation is the worked example.

This short equation becomes a useful anchor for the full course. In every lesson, you can come back to the same three questions:

  • what exactly is the input x?
  • what kind of output y do we need?
  • how is the mapping f built, evaluated, and improved?

The triage example we reuse throughout the course

Consider a hospital triage process in which patients arrive with symptoms, measurements, and test results, and the clinical team has to decide what to do next under time pressure. We will keep returning to that setting throughout the course because it makes abstract ideas easier to compare. The method changes from lesson to lesson, but the practical situation stays familiar.

One patient can be represented as:

x = (x_1, x_2, \dots, x_n)

This vector-style notation is part of the same mapping language, so the same Function notation primer applies here too.

Read this as a checklist of features:

  • x_1: temperature
  • x_2: stiff-neck indicator
  • x_3: headache severity
  • x_4: white blood cell count
  • \dots: more features continue in the same pattern

If this looks abstract, imagine a spreadsheet row: each column is one feature and the full row is x. The capital letter X would then mean the set of all rows that fit the format, while the lowercase x means one specific row for one specific patient.
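The spreadsheet-row picture can be sketched in code. The values below are invented; the point is only how the subscript notation lines up with indexing:

```python
# One patient as a feature vector x = (x_1, x_2, ..., x_n); values invented.
x = (
    38.7,  # x_1: temperature
    1,     # x_2: stiff-neck indicator (1 = present)
    6,     # x_3: headache severity (assumed 0-10 scale)
    14.2,  # x_4: white blood cell count
)

n = len(x)   # number of features
# Note: the math subscripts start at 1 while Python indices start at 0,
# so x_1 in the text corresponds to x[0] here.
print(n, x[0])
```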

Why AI progress comes in waves

People often ask: why does AI seem revolutionary in one decade and disappointing in another?

A practical answer is to track three ingredients together:

Performance = g(Model\ Capacity,\ Data\ Quality,\ Compute\ Budget)

Treat this as a simple planning lens rather than a strict physical law. It says AI progress usually becomes visible when model capacity, data quality, and compute improve together rather than in isolation.

  • Model Capacity: can the model represent the patterns we care about?
  • Data Quality: are examples accurate, relevant, and diverse enough?
  • Compute Budget: do we have enough compute for training and serving?

When all three improve together, AI systems make visible jumps. When one is weak, progress stalls.
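One way to make the weakest-link intuition concrete is a toy version of g in which the smallest ingredient caps overall progress. The 0-to-1 scores and the min rule are illustrative assumptions, not a real scaling law:

```python
# Toy planning lens, not a law: score each ingredient from 0 to 1
# and let the weakest one cap the visible performance.

def g(model_capacity: float, data_quality: float, compute_budget: float) -> float:
    return min(model_capacity, data_quality, compute_budget)

print(g(0.9, 0.9, 0.9))  # all three strong -> 0.9
print(g(0.9, 0.2, 0.9))  # one weak ingredient stalls progress -> 0.2
```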

A visual timeline of AI progress

The timeline below is not a side note. It is another way of reading the same equation above.

As you move through the milestones, ask two simple questions:

  • which part of the system improved?
  • which bottleneck still held progress back?

Seen this way, the history of AI becomes much easier to follow. The field does not move forward because of hype alone. It moves forward when representation, data, and compute become strong enough at the same time.

Two common ways to build the mapping

Different AI approaches still solve the same mapping. What changes is how the middle part f is represented and where the intelligence of the system is stored.

Symbolic style (explicit logic):

\operatorname{IF}(condition_1 \land condition_2) \Rightarrow conclusion

Connectionist style (learned parameters):

y = f_\theta(x)

If the subscripted function form is unfamiliar, Function notation primer covers this notation pattern.

Quick interpretation:

  • f_\theta means the mapping is controlled by parameters \theta
  • in neural networks, \theta is the set of learned weights and biases
  • symbolic systems are usually easier to inspect step by step
  • connectionist systems often learn richer patterns from large datasets

The practical difference is this: symbolic systems are explicit because a human writes down the rules in advance, while connectionist systems are flexible because they learn many small interacting patterns from data. That usually makes connectionist systems better at capturing messy real-world regularities, but also harder to inspect directly. In the next lessons you will see both sides more clearly, first with reasoning trees and rule systems, and later with learned models.

For the symbols in the symbolic rule above:

  • \land means logical “and”
  • \Rightarrow means “implies” or “leads to”
  • the subscript in condition_1 simply labels the first condition, just like x_1 labels the first feature in a vector
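To make the contrast concrete, here is a minimal sketch of both styles. The rule, the features, and the linear form of f_\theta are all invented for illustration:

```python
# Symbolic style: a human writes the rule down explicitly.
def f_symbolic(fever: bool, stiff_neck: bool) -> bool:
    # IF (condition_1 AND condition_2) => conclusion
    return fever and stiff_neck

# Connectionist style: the behavior lives in parameters theta.
# Here f_theta is a toy linear score; real systems learn theta from data.
theta = [0.5, 0.25]  # invented "learned" weights

def f_theta(x: list) -> float:
    return sum(w * xi for w, xi in zip(theta, x))

print(f_symbolic(True, True))  # -> True
print(f_theta([1.0, 1.0]))     # -> 0.75
```

Inspecting f_symbolic is trivial because the rule is written out; inspecting f_theta means interpreting the numbers in theta, which is why learned models are harder to audit.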

What comes next

In lesson 2 we keep the same mapping frame y = f(x), but add a more practical question: how much computation does reasoning require as the problem grows? That is where trees, branching, and search cost start to matter.

Notation quick reference

Symbol     | Meaning                     | Introduced in
f          | decision function / model   | AI in one line
X          | full input space            | AI in one line
Y          | full output space           | AI in one line
x          | one input example           | The triage example we reuse throughout the course
y          | one output example          | AI in one line
x_i        | feature i of input x        | The triage example we reuse throughout the course
n          | number of features          | The triage example we reuse throughout the course
f_\theta   | parameterized model         | Two common ways to build the mapping
\theta     | learnable parameter set     | Two common ways to build the mapping

Concept deep dives (separate math posts)

For readers who want standalone math explanations before continuing:

These are placeholders and will be expanded with interactive math visuals in dedicated concept lessons.

References and Further Reading

  • Winston, Patrick H. Artificial Intelligence, 3rd ed. Addison-Wesley, 1992.
  • Turing, Alan. “Computing Machinery and Intelligence.” Mind 59(236), 1950.
  • McCarthy, J. et al. “A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence.” 1955.
  • Lighthill, James. “Artificial Intelligence: A General Survey.” Science Research Council, 1973.
  • Russell, S. and Norvig, P. Artificial Intelligence: A Modern Approach, 4th ed. Pearson, 2020.

This is Lesson 1 of 18 in the AI Starter Course.