Types of Learning
IB Syllabus: Builds on A4.1.1 (the types of machine learning) with the HL techniques behind them: A4.3.2 supervised classification (k-NN, decision trees), A4.3.4 clustering, A4.3.5 association rule learning, A4.3.6 reinforcement learning, A4.3.7 genetic algorithms.
This page is HL only. It goes deeper into how each type of learning actually works. The overview of the types, which is the SL and HL core, is on What Machine Learning Is.
Table of Contents
- Why This Page Exists
- Recap: Three Ways to Learn
- Supervised Learning Up Close
- Unsupervised Learning Up Close
- Reinforcement Learning Up Close
- Genetic Algorithms
- Quick Check
- Match the Technique
- Practice Exercises
- Connections
Why This Page Exists
On the previous page you met the three ways a system can learn: supervised (from labelled examples), unsupervised (finding structure in unlabelled data), and reinforcement (trial and error with rewards). That is the what. This page is the how: the actual algorithms that carry out each kind of learning.
You do not need to implement these (the syllabus asks you to explain and describe them, not to code them). What you need is a clear mental model of how each one works, so you can say why it fits a problem and where it goes wrong. Each technique below links to a simulator so you can watch it run.
Try them yourself: the machine learning simulators include a live demo for every technique on this page. Watching a boundary form or clusters settle makes these far easier to remember than reading alone.
Recap: Three Ways to Learn
A quick reminder of the paradigms these techniques belong to.
| Paradigm | Data it uses | What it produces |
|---|---|---|
| Supervised | Labelled examples (input plus correct answer) | A model that predicts the answer for new inputs |
| Unsupervised | Unlabelled data | Structure found in the data (groups, associations) |
| Reinforcement | Rewards and penalties from acting | A strategy that maximises reward |
Genetic algorithms sit slightly apart: they are a search-and-optimisation method inspired by evolution, often grouped with machine learning because they learn good solutions over many generations.
Supervised Learning Up Close
Supervised learning predicts an answer from labelled examples. Two of the clearest algorithms are k-nearest neighbours and decision trees.
k-Nearest Neighbours (k-NN)
k-NN is the simplest classifier to picture. To classify a new item, it finds the k closest already-labelled items (its “neighbours”) and takes a majority vote of their labels. That is the whole idea: you are the company you keep.
A few details make it work well:
- k is a choice. A small k (like 1) follows the data very closely and is easily fooled by noise; a larger k is smoother but can blur real boundaries. k is a hyperparameter, a setting you choose rather than one the model learns.
- Use an odd k for two classes, so the vote cannot tie.
- Features must be on a comparable scale. “Closest” is measured by distance, so if one feature runs 0 to 1 and another 0 to 100000, the large one drowns out the small one unless you rescale first.
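The syllabus does not ask you to code k-NN, but a short sketch can make the vote concrete. The one below is a minimal illustration, not a reference implementation: the function names `knn_classify` and `min_max_rescale`, the sample readings, and the choice of straight-line (Euclidean) distance are all assumptions made for this example.

```python
from collections import Counter
from math import dist


def knn_classify(labelled, new_point, k=3):
    """Return the majority label among the k labelled points nearest to new_point."""
    # Sort the (point, label) pairs by straight-line distance and keep the k closest.
    nearest = sorted(labelled, key=lambda pair: dist(pair[0], new_point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def min_max_rescale(points):
    """Rescale each feature to the range 0..1 so no single feature dominates the distance."""
    columns = list(zip(*points))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(p, lows, highs))
        for p in points
    ]


# Hypothetical labelled readings: two features per reading.
readings = [
    ((1.0, 1.0), "normal"),
    ((1.2, 0.8), "normal"),
    ((0.9, 1.1), "normal"),
    ((3.0, 3.2), "abnormal"),
    ((3.1, 2.9), "abnormal"),
]

print(knn_classify(readings, (1.1, 1.0), k=3))  # normal
print(knn_classify(readings, (2.9, 3.0), k=3))  # abnormal (2 votes to 1)

# A feature running 0 to 100000 is squeezed into the same 0..1 range as the other.
print(min_max_rescale([(0, 0), (1, 100000)]))  # [(0.0, 0.0), (1.0, 1.0)]
```

Notice there is no training step: the "model" is just the stored labelled points, and all the work happens when a new point arrives.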
Application: recommending items similar to ones a user liked, or classifying a reading as normal or abnormal by comparing it to labelled past readings.
Try it yourself: the k-NN simulator lets you place labelled points, drop a new one, and watch it connect to its k nearest neighbours and take their vote. Slide k up and down and watch the decision boundary change.
Decision Trees
A decision tree classifies by asking a sequence of yes/no questions about the features, each answer leading down a branch until it reaches a decision at a leaf. “Is income above X? If yes, is age below Y?” and so on. Learning the tree means choosing which questions, in which order, best split the data.
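To see what "best split" could mean, here is a rough sketch of choosing a single question. It scores each candidate threshold with Gini impurity, which is one common measure of how mixed a group is; the page does not name a measure, so treat that choice, the function names, and the sample data as assumptions for illustration only.

```python
def gini(labels):
    """Gini impurity: 0 when every label matches, higher when classes are mixed."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_threshold(values, labels):
    """Try 'is value <= t?' for each candidate t; return (t, weighted impurity) of the best."""
    best = None
    for t in sorted(set(values)):
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        if not left or not right:
            continue  # a question that sends everything one way splits nothing
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[1]:
            best = (t, score)
    return best


# Hypothetical training data: income in thousands, with the past decision.
incomes = [18, 22, 25, 40, 48, 55]
outcomes = ["decline", "decline", "decline", "approve", "approve", "approve"]

print(best_threshold(incomes, outcomes))  # (25, 0.0): both sides are perfectly pure
```

A full tree-learning algorithm would repeat this search on each side of the split, across every feature, until the groups are pure enough or a depth limit is reached.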
Their great strength is interpretability: unlike many models, you can read a decision tree and see exactly why it decided what it did, which matters enormously when a decision affects someone’s life. Their weakness is a tendency to overfit (covered on the training page): a deep enough tree can memorise the training data and generalise poorly.
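One way to picture that interpretability is to write a tiny tree by hand as nested conditions and have it report the path it took. This is only a sketch: the thresholds (30000 and 25), the outcomes, and the function name `loan_decision` are invented for illustration and do not come from any real lending rule.

```python
def loan_decision(income, age):
    """Walk a small hand-written tree and return (decision, reasons for it)."""
    reasons = []
    if income > 30000:
        reasons.append("income above 30000")
        if age < 25:
            reasons.append("age below 25")
            return "refer", reasons
        reasons.append("age 25 or above")
        return "approve", reasons
    reasons.append("income 30000 or below")
    return "decline", reasons


decision, reasons = loan_decision(income=40000, age=30)
print(decision)  # approve
print(reasons)   # ['income above 30000', 'age 25 or above']
```

The list of reasons is the branch the input followed, which is exactly what a person affected by the decision could be shown.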
Application: loan and insurance decisions where the reasoning must be explainable, and medical triage tools.
Unsupervised Learning Up Close
Unsupervised learning finds structure in data with no labels. Two forms appear in the syllabus: clustering and association rule learning.
Clustering (k-means)
Clustering groups similar items together without being told the groups in advance. The best-known method is k-means, and it repeats two steps until things settle:
- Assign: put each point in the cluster whose centre (centroid) is nearest.
- Update: move each centroid to the average position of the points assigned to it.
Repeat, and the centroids drift to the heart of natural groups. A few honest limitations:
- You must choose k (the number of clusters) in advance, which you often do not know.
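- Different starting positions can settle differently, so the same data can give different clusters on different runs. Each run stops at a local optimum: a grouping that small adjustments cannot improve, but which may not be the best grouping overall.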
- It assumes roughly round, similar-sized clusters, and does poorly on long, thin, or very uneven groups.
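The assign-then-update loop is short enough to sketch in plain Python. This is a minimal illustration under stated assumptions: the function name `k_means`, the `seed` parameter, the sample points, and the decision to start the centroids on randomly chosen data points are all choices made for this example, not part of the syllabus.

```python
import random
from math import dist


def k_means(points, k, seed=0, max_iters=100):
    """Return (centroids, clusters) after repeating assign and update until nothing moves."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: start on k randomly chosen points
    clusters = []
    for _ in range(max_iters):
        # Assign: put each point in the cluster whose centroid is nearest.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update: move each centroid to the average position of its points.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                new_centroids.append(tuple(sum(axis) / len(cluster) for axis in zip(*cluster)))
            else:
                new_centroids.append(centroids[i])  # an empty cluster keeps its old centroid
        if new_centroids == centroids:
            break  # settled: no centroid moved
        centroids = new_centroids
    return centroids, clusters


# Hypothetical data with two obvious groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = k_means(points, k=2)
for centre, members in zip(centroids, clusters):
    print(centre, members)
```

Changing `seed` changes the starting centroids. On well-separated data like this the result is the same two groups; on messier data, different seeds can settle into different clusterings, which is the second limitation above.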
Application: customer segmentation, grouping documents by topic, compressing an image by clustering its colours.
Try it yourself: the k-means simulator shows the assign-then-recentre loop step by step, and lets you re-seed the starting centroids to see the same data settle into different clusters.
Association Rule Learning
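Association rule learning uncovers relationships between items that frequently occur together. The classic example is market-basket analysis: “customers who buy bread and butter often also buy jam.” A rule is judged by two measures. Its support is the fraction of all transactions in which the items appear together. Its confidence is the fraction of transactions containing the “if” items (bread and butter) that also contain the “then” item (jam), so it measures how reliably one implies the other.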
It is not predicting a label or a number; it is surfacing patterns of co-occurrence a human can then act on, such as placing associated products together or recommending a likely add-on.
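Support and confidence are simple counts, so a small sketch can show the arithmetic. The five baskets below are made up for this example, and the function names `support` and `confidence` are just labels chosen here.

```python
def support(baskets, items):
    """Fraction of all baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(baskets, if_items, then_items):
    """Of the baskets containing if_items, the fraction that also contain then_items.

    Assumes if_items appears in at least one basket (otherwise this divides by zero).
    """
    together = support(baskets, set(if_items) | set(then_items))
    return together / support(baskets, if_items)


# Hypothetical shopping baskets.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"milk", "bread"},
    {"butter", "jam"},
]

# Rule: "bread and butter -> jam"
print(support(baskets, {"bread", "butter", "jam"}))              # 0.4 (2 of 5 baskets)
print(round(confidence(baskets, {"bread", "butter"}, {"jam"}), 2))  # 0.67 (2 of the 3 bread-and-butter baskets)
```

A rule with high confidence but very low support describes something that rarely happens, so both numbers are usually checked before acting on a rule.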
Application: shop “you might also like” suggestions, spotting which symptoms tend to appear together, and store-layout decisions.
Reinforcement Learning Up Close
Reinforcement learning has an agent that acts in an environment. In each state it chooses an action, and the environment responds with a new state and a reward (or penalty). The agent’s goal is to maximise its cumulative reward over time, not just the next step, so it must learn which early actions lead to good outcomes later.
The central difficulty is the exploration versus exploitation trade-off:
- Exploitation: take the action you currently believe is best, to earn reward now.
- Exploration: try a different, uncertain action, to discover whether something better exists.
Too much exploitation and the agent gets stuck in a mediocre habit; too much exploration and it never settles on what works. Good reinforcement learning balances the two, exploring more early on and exploiting more as it learns.
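One simple way to balance the two is an "epsilon-greedy" rule: with a small probability (epsilon) pick a random action, otherwise pick the action currently believed best. The sketch below is a deliberately stripped-down illustration with no states at all, just three actions with fixed payoffs; the rule itself, the payoffs, and every name in the code are assumptions for this example rather than anything the page prescribes.

```python
import random


def epsilon_greedy(reward_fn, n_actions, steps, epsilon, seed=0):
    """Return (estimates, counts) after `steps` choices under an epsilon-greedy rule."""
    rng = random.Random(seed)
    estimates = [0.0] * n_actions  # current belief about each action's average reward
    counts = [0] * n_actions       # how many times each action has been tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore: try a random action
        else:
            action = max(range(n_actions), key=lambda a: estimates[a])  # exploit
        reward = reward_fn(action)
        counts[action] += 1
        # Nudge the running average for this action towards the reward just seen.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


# Hypothetical environment: action 2 is the best, but the agent does not know that.
PAYOFFS = [0.2, 0.5, 0.8]


def fixed_reward(action):
    return PAYOFFS[action]


greedy_only = epsilon_greedy(fixed_reward, 3, steps=500, epsilon=0.0)
some_exploring = epsilon_greedy(fixed_reward, 3, steps=500, epsilon=0.1)

print(greedy_only[1])     # [500, 0, 0]: stuck on the first action that paid anything
print(some_exploring[1])  # counts vary with the seed; most choices should go to action 2
```

With epsilon at zero the agent never discovers the better actions, which is the "mediocre habit" described above. A fuller agent would also lower epsilon over time, exploring more early on and exploiting more as its estimates improve.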
Application: game-playing agents, robot control, and adaptive systems like recommendation or resource scheduling that improve through feedback.
Try it yourself: the reinforcement learning simulator trains an agent to cross a grid to a goal while avoiding a pit. Raise the exploration setting and watch it wander more; lower it and watch it commit to the first decent path it finds.
Genetic Algorithms
Genetic algorithms borrow the mechanism of evolution to search for good solutions. You keep a population of candidate solutions and improve it over many generations:
- Fitness: a fitness function scores how good each candidate is.
- Selection: fitter candidates are more likely to be chosen as “parents”.
- Crossover: parents are combined to produce new candidates that mix their features.
- Mutation: small random changes are introduced, to keep exploring new possibilities.
- Evaluation and termination: score the new generation and repeat until a solution is good enough or time runs out.
The fitness function is the heart of it: it defines what “good” means, and everything else is a way of pushing the population towards higher fitness. Genetic algorithms shine on huge search spaces where you can recognise a good solution but cannot directly calculate the best one.
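The five steps above can be sketched in a few lines by evolving random text towards a target word. This is a rough illustration, not a definitive design: keeping the fitter half as parents is a simplified form of selection, carrying the best candidate forward unchanged is an extra choice made here, and the names, alphabet, and parameter values are all assumptions.

```python
import random
import string

ALPHABET = string.ascii_lowercase + " "


def fitness(candidate, target):
    """Score a candidate: the number of characters that match the target in position."""
    return sum(a == b for a, b in zip(candidate, target))


def evolve(target, population_size=100, mutation_rate=0.02, max_generations=1000, seed=0):
    """Return (best candidate, generations used). Assumes the target has at least 2 characters."""
    rng = random.Random(seed)
    population = ["".join(rng.choice(ALPHABET) for _ in target)
                  for _ in range(population_size)]
    for generation in range(max_generations):
        # Evaluation: rank the population by fitness, best first.
        ranked = sorted(population, key=lambda c: fitness(c, target), reverse=True)
        best = ranked[0]
        if best == target:
            return best, generation  # termination: solution is good enough
        # Selection (simplified): only the fitter half may become parents.
        parents = ranked[: population_size // 2]
        children = [best]  # assumption: keep the current best so fitness never drops
        while len(children) < population_size:
            mum, dad = rng.sample(parents, 2)
            # Crossover: take the start of one parent and the end of the other.
            cut = rng.randrange(1, len(target))
            child = mum[:cut] + dad[cut:]
            # Mutation: occasionally replace a character with a random one.
            child = "".join(rng.choice(ALPHABET) if rng.random() < mutation_rate else ch
                            for ch in child)
            children.append(child)
        population = children
    return max(population, key=lambda c: fitness(c, target)), max_generations


best, generations = evolve("hello")
print(best, "after", generations, "generations")
```

Setting `mutation_rate` to 0 removes the only source of new characters, so the population can stall if a needed letter was never present; setting it very high scrambles good candidates faster than selection can keep them.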
Application: designing shapes or schedules, tuning other systems’ settings, and evolving strategies for hard optimisation problems.
Try it yourself: the genetic algorithm simulator evolves random text toward a target phrase you type. Watch fitness climb generation by generation, and change the mutation rate to see exploration speed up or stall.
Quick Check
Q1. How does k-nearest neighbours classify a new data point?
Q2. Why is an odd value of k often chosen for a two-class k-NN problem?
Q3. What does the k-means clustering algorithm repeatedly do?
Q4. What does association rule learning discover?
Q5. In reinforcement learning, the "exploration versus exploitation" trade-off is about:
Q6. In a genetic algorithm, what is the job of the fitness function?
Match the Technique
Name the technique each description points to: k-NN, decision tree, k-means, association rules, reinforcement learning, or genetic algorithm.
Fill in the blanks.
// Classifies by a majority vote of the nearest labelled points
// Technique:
// Groups unlabelled data by moving centroids to the average of their points
// Technique:
// Evolves a population using selection, crossover, and mutation
// Technique:
// Learns by acting for rewards and balancing exploration and exploitation
// Technique:
// Classifies with a readable sequence of yes/no questions
// Technique:
Practice Exercises
Note for IB CS learners: these A4.3 techniques are examined with Explain and Describe. The exercises below practise that. At least one asks for a full prose response. All content here is HL.
Core
- Explain (4 marks) – Explain how k-nearest neighbours classifies a new data point, including the role of k.
- Describe (4 marks) – Describe the two repeating steps of k-means clustering and one limitation of the method.
- Describe (4 marks) – Describe the exploration versus exploitation trade-off in reinforcement learning, with an original example.
Extension
- Explain (6 marks) – Explain the roles of the fitness function, crossover, and mutation in a genetic algorithm, and why mutation matters.
- Compare (6 marks) – Compare k-NN and decision trees for a task where the reasoning behind each decision must be explainable, and recommend one.
Challenge
- Discuss (8 marks) – A supermarket wants to (a) group customers into similar segments and (b) discover which products are bought together. Explain which technique fits each task and why, then discuss what the shop could responsibly do with each result. (Write in prose.)
Connections
- Prerequisite: What Machine Learning Is – the overview these techniques deepen
- Next: Training and Evaluating a Model – how any of these models is built and judged
- Related: Ethics of Machine Learning – fairness and accountability when these techniques make real decisions
- Related: Algorithms – the classical algorithms these learning methods sit alongside