CPU Architecture

IB Syllabus: A1.1.1 – CPU components, A1.1.5 – FDE cycle, A1.1.6 (HL) – Pipelining

Table of Contents

  1. Key Concepts
    1. CPU Components (A1.1.1)
    2. Buses
    3. CPU Architecture Diagram
    4. Single-Core vs Multi-Core Processors
    5. Co-processors
  2. Fetch-Decode-Execute Cycle (A1.1.5)
    1. Phase 1: Fetch
    2. Phase 2: Decode
    3. Phase 3: Execute
  3. Enrichment: Half Adder and Full Adder
    1. Half Adder
    2. Full Adder
    3. Connection to the ALU
  4. Enrichment: Little Man Computer (LMC)
  5. HL Section: Pipelining (A1.1.6)
    1. What is Pipelining?
    2. 4-Stage Pipeline
    3. Why Pipelining Improves Throughput
    4. Pipeline Hazards
    5. Why Multi-Core Processors Help
  6. Worked Examples
    1. Example 1: Labelling the CPU
    2. Example 2: FDE Cycle Trace
  7. Quick Code Check
  8. Trace Exercise
  9. Spot the Error
  10. Fill in the Blanks
  11. Predict the Output
  12. Practice Exercises
    1. Core
    2. Extension
    3. Challenge
  13. Connections

Key Concepts

CPU Components (A1.1.1)

The Central Processing Unit (CPU) is the brain of the computer. It fetches instructions from memory, decodes them, and executes them. Every program you run – from a web browser to a game – is ultimately a sequence of instructions processed by the CPU.

The CPU contains three main functional units:

ALU (Arithmetic Logic Unit)

  • Performs arithmetic operations: addition, subtraction, multiplication, division
  • Performs logic operations: AND, OR, NOT, comparisons (equal, greater than, less than)
  • Takes inputs from registers and writes results back to the Accumulator (AC)

CU (Control Unit)

  • Directs the operation of the entire CPU
  • Sends control signals to other components (ALU, memory, I/O devices)
  • Manages the timing and sequencing of operations
  • Interprets (decodes) instructions fetched from memory

Registers – small, ultra-fast storage locations inside the CPU:

| Register | Full Name | Purpose |
|----------|-----------|---------|
| PC | Program Counter | Holds the address of the next instruction to fetch |
| MAR | Memory Address Register | Holds the address currently being accessed in memory |
| MDR | Memory Data Register | Holds the data being read from or written to memory |
| IR | Instruction Register | Holds the current instruction being decoded and executed |
| AC | Accumulator | Stores the result of ALU operations |

A common exam mistake is confusing MAR and MDR. Remember: MAR holds the address (where to look), MDR holds the data (what was found there).

Buses

Buses are electrical pathways that carry signals between CPU components and memory:

| Bus | Direction | Purpose |
|-----|-----------|---------|
| Address Bus | Unidirectional (CPU -> memory) | Carries the memory address being accessed |
| Data Bus | Bidirectional (CPU <-> memory) | Carries the data being read or written |
| Control Bus | Bidirectional | Carries control signals: read/write, clock pulse, interrupt |

The address bus is the only unidirectional bus – the CPU tells memory where to look, but memory never sends addresses back. The data bus must be bidirectional because data flows in both directions (reading and writing).

CPU Architecture Diagram

+------------------- CPU -------------------+
|  +---------+         +------------------+ |
|  |   CU    |         |    Registers     | |
|  |         |         |  PC  MAR  MDR    | |
|  |         |         |  IR  AC          | |
|  +----+----+         +--------+---------+ |
|       |                       |           |
|  +----+-----------------------+-----+     |
|  |              ALU                 |     |
|  +----------------------------------+     |
+-------------------+-------------------+---+
                    |
         +----------+----------+
         | Address  | Data     | Control
         | Bus      | Bus      | Bus
         +----------+----------+
                    |
              +-----+-----+
              |  Memory   |
              +-----------+

Single-Core vs Multi-Core Processors

A single-core processor has one processing unit and can execute one instruction stream at a time. It gives the illusion of multitasking by rapidly switching between tasks (context switching).

A multi-core processor contains two or more independent cores on the same chip. Each core can execute instructions simultaneously, enabling true parallel processing. A quad-core CPU can genuinely run four independent tasks at the same time.

However, not all tasks benefit equally from multiple cores. Tasks that are sequential (each step depends on the previous result) cannot be parallelised – they run on a single core regardless of how many cores are available.
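The timing difference can be sketched with a toy model. This is an illustration only: `total_time` is a hypothetical helper, the 4-second and 3-second durations are assumed values, and real schedulers are far more complex.

```python
# Toy timing model for tasks on single- vs multi-core CPUs.
# Assumption: enough cores means independent tasks run fully in parallel.
def total_time(tasks, cores, dependent):
    """Return total run time for a list of task durations (seconds)."""
    if dependent or cores == 1:
        # Sequential tasks (or a single shared core): durations add up.
        return sum(tasks)
    # Independent tasks on separate cores run concurrently,
    # so the total is bounded by the slowest task.
    return max(tasks)

task_a, task_b = 4, 3  # assumed durations in seconds
print(total_time([task_a, task_b], cores=1, dependent=False))  # 7
print(total_time([task_a, task_b], cores=2, dependent=True))   # 7: B must wait for A
print(total_time([task_a, task_b], cores=2, dependent=False))  # 4: true parallelism
```

Note that the dependent case gains nothing from a second core, which is exactly why sequential workloads do not scale with core count.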

Co-processors

A co-processor is a specialised processor that works alongside the CPU to handle specific types of computation more efficiently:

  • GPU (Graphics Processing Unit) – optimised for parallel graphics and mathematical operations
  • DSP (Digital Signal Processor) – optimised for audio, video, and signal processing
  • TPU (Tensor Processing Unit) – optimised for machine learning operations

Co-processors free the CPU to focus on general-purpose tasks while offloading specialised work.


Fetch-Decode-Execute Cycle (A1.1.5)

The fetch-decode-execute (FDE) cycle is the fundamental process by which the CPU processes every instruction. It repeats continuously while the computer is running.

Phase 1: Fetch

  1. The address stored in the PC is copied to the MAR
  2. The CPU sends a read signal along the control bus and the address along the address bus
  3. The instruction stored at that memory address is fetched via the data bus into the MDR
  4. The instruction is copied from the MDR into the IR
  5. The PC is incremented by 1 (pointing to the next instruction)

Phase 2: Decode

  1. The CU examines the instruction in the IR
  2. The CU determines the type of operation (e.g., load, add, store, branch)
  3. The CU identifies any operands (data or memory addresses the instruction needs)
  4. The CU prepares the necessary control signals for execution

Phase 3: Execute

  1. The ALU or other components carry out the decoded instruction
  2. For arithmetic/logic operations, the ALU performs the calculation and stores the result in the AC
  3. For memory operations (load/store), data is transferred between registers and memory
  4. For branch instructions, the PC is updated to the target address

The cycle then repeats from Fetch, using the updated PC to get the next instruction.

The FDE cycle is one of the most frequently examined topics on IB exams. You must be able to describe each phase step by step, naming the specific registers involved.
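The whole cycle can be sketched in a few lines of Python. This is a teaching model, not a real encoding: instructions are hypothetical `(mnemonic, operand)` tuples, and only LOAD, ADD, and HALT are implemented.

```python
# Minimal fetch-decode-execute sketch.
# Memory cells hold either an instruction tuple or a plain number.
memory = {
    0: ("LOAD", 5),    # AC = Memory[5]
    1: ("ADD", 6),     # AC = AC + Memory[6]
    2: ("HALT", None),
    5: 7,
    6: 3,
}

pc, ac = 0, 0
while True:
    # FETCH: PC -> MAR, Memory[MAR] -> MDR, MDR -> IR, then increment PC
    mar = pc
    mdr = memory[mar]
    ir = mdr
    pc += 1
    # DECODE: the CU splits the instruction into opcode and operand
    opcode, operand = ir
    # EXECUTE: carry out the decoded operation
    if opcode == "LOAD":
        ac = memory[operand]
    elif opcode == "ADD":
        ac += memory[operand]
    elif opcode == "HALT":
        break

print(ac)  # 10
```

Tracing the loop by hand mirrors the register tables used in the worked examples below: each iteration is one full FDE cycle.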


Enrichment: Half Adder and Full Adder

This goes beyond the IB syllabus but helps build understanding of how the ALU performs arithmetic using logic gates.

The ALU does not “know” how to add numbers the way humans do. Instead, it uses combinations of logic gates to perform binary addition.

Half Adder

A half adder adds two single binary digits (A and B) and produces:

  • Sum = A XOR B
  • Carry = A AND B

| A | B | Sum (A XOR B) | Carry (A AND B) |
|---|---|---------------|-----------------|
| 0 | 0 | 0 | 0 |
| 0 | 1 | 1 | 0 |
| 1 | 0 | 1 | 0 |
| 1 | 1 | 0 | 1 |

Full Adder

A full adder extends the half adder by accepting a carry-in from a previous addition. This allows multiple adders to be chained together to add multi-bit numbers.

  • Sum = A XOR B XOR Carry-in
  • Carry-out = (A AND B) OR (Carry-in AND (A XOR B))
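Both formulas can be checked directly in Python using `^` (XOR), `&` (AND), and `|` (OR) on single bits. This is a sketch of the logic only, not of gate-level hardware.

```python
# Half adder and full adder expressed as boolean operations on single bits.
def half_adder(a, b):
    """Add two bits; return (sum, carry)."""
    return a ^ b, a & b          # Sum = A XOR B, Carry = A AND B

def full_adder(a, b, carry_in):
    """Add two bits plus a carry-in; return (sum, carry_out)."""
    s1, c1 = half_adder(a, b)            # first half adder: A + B
    s2, c2 = half_adder(s1, carry_in)    # second half adder: partial sum + Cin
    return s2, c1 | c2                   # Carry-out = (A AND B) OR (Cin AND (A XOR B))

# Reproduce the half-adder truth table from the text:
for a in (0, 1):
    for b in (0, 1):
        print(a, b, half_adder(a, b))
```

Building the full adder from two half adders plus an OR gate mirrors how the circuit is drawn in most textbooks.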

Connection to the ALU

By chaining full adders together (one for each bit), the ALU can add numbers of any bit width. An 8-bit adder uses 8 full adders in sequence, with the carry-out of each feeding into the carry-in of the next. This is a ripple-carry adder.

This is why understanding logic gates is the foundation for understanding how CPUs actually compute.
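The ripple-carry idea can be sketched as a loop that threads the carry through one full adder per bit. Assumptions: bit lists are least-significant-bit first, and `ripple_carry_add` is an illustrative helper name, not a standard API.

```python
# Ripple-carry adder: one full adder per bit, carry rippling LSB to MSB.
def full_adder(a, b, carry_in):
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return s, carry_out

def ripple_carry_add(a_bits, b_bits):
    """Add two equal-length bit lists (LSB first); return (sum_bits, final_carry)."""
    carry = 0
    result = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)  # carry-out feeds the next stage
        result.append(s)
    return result, carry

# 6 + 3 = 9 with 4-bit numbers, LSB first: 6 = [0,1,1,0], 3 = [1,1,0,0]
print(ripple_carry_add([0, 1, 1, 0], [1, 1, 0, 0]))  # ([1, 0, 0, 1], 0) = 9
```

A final carry of 1 would signal overflow: the true sum needs more bits than the adder provides.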


Enrichment: Little Man Computer (LMC)

The LMC is a simplified model that demonstrates how a CPU processes instructions. It is not part of the IB syllabus but is a useful teaching tool.

The Little Man Computer is an instructional model of a simple CPU. It imagines a “little man” inside the computer who follows a fixed procedure to process instructions.

The LMC has a limited instruction set:

| Mnemonic | Code | Description |
|----------|------|-------------|
| LDA | 5xx | Load the value from address xx into the accumulator |
| STA | 3xx | Store the accumulator value at address xx |
| ADD | 1xx | Add the value at address xx to the accumulator |
| SUB | 2xx | Subtract the value at address xx from the accumulator |
| INP | 901 | Read input into the accumulator |
| OUT | 902 | Output the value in the accumulator |
| BRA | 6xx | Branch (jump) to address xx unconditionally |
| BRZ | 7xx | Branch to address xx if accumulator is zero |
| BRP | 8xx | Branch to address xx if accumulator is positive or zero |
| HLT | 000 | Halt the program |

The LMC follows the same fetch-decode-execute pattern as a real CPU, making it an excellent way to trace through programs and understand how registers change at each step.
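A minimal interpreter for the numeric codes in the table can be sketched in Python. This is a simplification: INP and OUT (901/902) are omitted, and `run_lmc` is an illustrative name, not part of any standard LMC tool.

```python
# Minimal LMC interpreter sketch (no INP/OUT, no overflow handling).
def run_lmc(mailboxes):
    mem = mailboxes + [0] * (100 - len(mailboxes))  # 100 mailboxes, 00-99
    pc, ac = 0, 0
    while True:
        instr = mem[pc]                     # fetch
        pc += 1
        opcode, addr = divmod(instr, 100)   # decode: opcode digit + 2-digit address
        if instr == 0:                      # HLT (000)
            return ac, mem
        elif opcode == 1:                   # ADD
            ac += mem[addr]
        elif opcode == 2:                   # SUB
            ac -= mem[addr]
        elif opcode == 3:                   # STA
            mem[addr] = ac
        elif opcode == 5:                   # LDA
            ac = mem[addr]
        elif opcode == 6:                   # BRA
            pc = addr
        elif opcode == 7 and ac == 0:       # BRZ
            pc = addr
        elif opcode == 8 and ac >= 0:       # BRP
            pc = addr

# Program: LDA 10, ADD 11, STA 12, HLT; data 42 and 8 in mailboxes 10-11.
program = [510, 111, 312, 0, 0, 0, 0, 0, 0, 0, 42, 8, 0]
ac, mem = run_lmc(program)
print(ac, mem[12])  # 50 50
```

Stepping through this program by hand (or with print statements in the loop) is a good way to watch the FDE cycle update the PC and accumulator one instruction at a time.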


HL Section: Pipelining (A1.1.6)

HL Only – The following section on pipelining is assessed at HL level only.

What is Pipelining?

Without pipelining, the CPU completes all phases of one instruction before starting the next. This means the fetch unit sits idle while the ALU executes, and the ALU sits idle while the next instruction is fetched.

Pipelining overlaps the phases of multiple instructions so that different parts of the CPU are active simultaneously.

4-Stage Pipeline

| Clock Cycle | Fetch | Decode | Execute | Write-back |
|-------------|-------|--------|---------|------------|
| 1 | I1 | | | |
| 2 | I2 | I1 | | |
| 3 | I3 | I2 | I1 | |
| 4 | I4 | I3 | I2 | I1 |
| 5 | I5 | I4 | I3 | I2 |
| 6 | | I5 | I4 | I3 |
| 7 | | | I5 | I4 |
| 8 | | | | I5 |

Without pipelining, 5 instructions would take 20 clock cycles (4 stages x 5 instructions). With pipelining, the same 5 instructions complete in just 8 cycles. Once the pipeline is full (cycle 4 onwards), one instruction completes every clock cycle.
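The cycle counts generalise: an ideal k-stage pipeline finishes n instructions in k + (n − 1) cycles, versus n × k without pipelining. A quick sketch confirms the numbers above:

```python
# Ideal pipeline cycle counts (no hazards, no stalls).
def cycles_unpipelined(n, k):
    # Each instruction occupies all k stages before the next one starts.
    return n * k

def cycles_pipelined(n, k):
    # k cycles to fill the pipeline, then one instruction completes per cycle.
    return k + (n - 1)

print(cycles_unpipelined(5, 4))  # 20
print(cycles_pipelined(5, 4))    # 8
```

For large n the speedup approaches k, which is why deeper pipelines were a major driver of clock-speed gains.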

Why Pipelining Improves Throughput

  • Each CPU component (fetch unit, decoder, ALU, write-back unit) is kept busy rather than idle
  • Throughput increases – more instructions completed per unit of time
  • Individual instruction latency does not decrease (each still takes 4 cycles), but overall program execution is faster

Pipeline Hazards

Pipelining does not always work perfectly. Hazards are situations that prevent the next instruction from executing in the expected clock cycle:

Data Dependency Hazard Instruction 2 needs the result of Instruction 1, but Instruction 1 has not finished executing yet. Example: ADD R1, R2 followed by STORE R1 – the second instruction needs the updated value of R1.

Branch Hazard A conditional branch instruction means the CPU does not know which instruction comes next until the branch condition is evaluated. Instructions fetched speculatively into the pipeline may need to be discarded.

Solutions: CPUs use techniques like forwarding (passing results directly between pipeline stages), stalling (inserting a wait), and branch prediction (guessing which way a branch will go).
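A rough way to quantify the cost: each unresolved hazard inserts a bubble (stall cycle) into the ideal count. This simplified model assumes exactly one stall cycle per hazard, which real CPUs often reduce via forwarding.

```python
# Ideal pipeline count plus one bubble per unresolved hazard (simplified model).
def cycles_with_stalls(n, k, stalls):
    return k + (n - 1) + stalls

print(cycles_with_stalls(5, 4, stalls=0))  # 8 : hazard-free pipeline
print(cycles_with_stalls(5, 4, stalls=2))  # 10: two data dependencies, one bubble each
```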

Why Multi-Core Processors Help

Even with pipelining, a single core can only execute one instruction stream. Multi-core processors run multiple independent pipelines simultaneously, each processing a separate thread of execution. This is especially beneficial for:

  • Running multiple applications at the same time
  • Parallelisable tasks (e.g., rendering different parts of an image)
  • Server workloads handling many simultaneous requests

Worked Examples

Example 1: Labelling the CPU

Label each component in the CPU diagram below:

+------------------- CPU -------------------+
|  +---------+         +------------------+ |
|  |  (A)    |         |  (B) Registers   | |
|  |         |         |  (C)  (D)  (E)   | |
|  |         |         |  (F)  (G)        | |
|  +----+----+         +--------+---------+ |
|       |                       |           |
|  +----+-----------------------+-----+     |
|  |             (H)                  |     |
|  +----------------------------------+     |
+-------------------+-------------------+---+
                    |
         +----------+----------+
         |  (I)     | (J)      | (K)
         +----------+----------+
                    |
              +-----+-----+
              |    (L)    |
              +-----------+

Answers:

| Label | Component | Role |
|-------|-----------|------|
| (A) | Control Unit (CU) | Directs operations, sends control signals |
| (B) | Registers | Fast internal storage |
| (C) | Program Counter (PC) | Address of next instruction |
| (D) | Memory Address Register (MAR) | Address being accessed |
| (E) | Memory Data Register (MDR) | Data being read/written |
| (F) | Instruction Register (IR) | Current instruction being decoded |
| (G) | Accumulator (AC) | Stores ALU results |
| (H) | Arithmetic Logic Unit (ALU) | Performs arithmetic and logic operations |
| (I) | Address Bus | Carries memory addresses (unidirectional) |
| (J) | Data Bus | Carries data (bidirectional) |
| (K) | Control Bus | Carries control signals (bidirectional) |
| (L) | Memory (RAM) | Stores instructions and data |

Example 2: FDE Cycle Trace

Trace the FDE cycle for these three instructions. Assume the initial memory state:

| Address | Contents |
|---------|----------|
| 5 | LOAD 20 |
| 6 | ADD 21 |
| 7 | STORE 22 |
| 20 | 42 |
| 21 | 8 |
| 22 | 0 |

PC starts at 5.

Instruction 1: LOAD 20 (load value from address 20 into AC)

| Phase | Step | PC | MAR | MDR | IR | AC |
|-------|------|----|-----|-----|-----|-----|
| Fetch | PC -> MAR | 5 | 5 | | | |
| Fetch | Memory[5] -> MDR | 5 | 5 | LOAD 20 | | |
| Fetch | MDR -> IR; PC+1 | 6 | 5 | LOAD 20 | LOAD 20 | |
| Decode | CU reads IR | 6 | 5 | LOAD 20 | LOAD 20 | |
| Execute | MAR = 20; Memory[20] -> MDR | 6 | 20 | 42 | LOAD 20 | |
| Execute | MDR -> AC | 6 | 20 | 42 | LOAD 20 | 42 |

Instruction 2: ADD 21 (add value from address 21 to AC)

| Phase | Step | PC | MAR | MDR | IR | AC |
|-------|------|----|-----|-----|-----|-----|
| Fetch | PC -> MAR | 6 | 6 | | | 42 |
| Fetch | Memory[6] -> MDR | 6 | 6 | ADD 21 | | 42 |
| Fetch | MDR -> IR; PC+1 | 7 | 6 | ADD 21 | ADD 21 | 42 |
| Decode | CU reads IR | 7 | 6 | ADD 21 | ADD 21 | 42 |
| Execute | MAR = 21; Memory[21] -> MDR | 7 | 21 | 8 | ADD 21 | 42 |
| Execute | AC = AC + MDR | 7 | 21 | 8 | ADD 21 | 50 |

Instruction 3: STORE 22 (store AC value to address 22)

| Phase | Step | PC | MAR | MDR | IR | AC |
|-------|------|----|-----|-----|-----|-----|
| Fetch | PC -> MAR | 7 | 7 | | | 50 |
| Fetch | Memory[7] -> MDR | 7 | 7 | STORE 22 | | 50 |
| Fetch | MDR -> IR; PC+1 | 8 | 7 | STORE 22 | STORE 22 | 50 |
| Decode | CU reads IR | 8 | 7 | STORE 22 | STORE 22 | 50 |
| Execute | MAR = 22; MDR = AC | 8 | 22 | 50 | STORE 22 | 50 |
| Execute | MDR -> Memory[22] | 8 | 22 | 50 | STORE 22 | 50 |

After execution, Memory[22] = 50 (which is 42 + 8).


Quick Code Check

Q1. Which register holds the address of the next instruction to be fetched?

Q2. What is the role of the ALU?

Q3. Which bus is unidirectional?

Q4. What happens during the Fetch phase of the FDE cycle?

Q5. (HL) What is the main benefit of pipelining?


Trace Exercise

Trace the FDE cycle for two instructions. Fill in the register values at each phase.

Initial memory state:

| Address | Contents |
|---------|----------|
| 10 | LOAD 20 |
| 11 | ADD 21 |
| 20 | 42 |
| 21 | 8 |

PC starts at 10. All other registers start empty (shown as –).

Instruction 1: LOAD 20

| Phase | PC | MAR | MDR | IR | AC |
|-------|----|-----|-----|-----|-----|
| Before Fetch | 10 | – | – | – | – |
| After Fetch | | | | | |
| After Execute | | | | | |

Instruction 2: ADD 21

| Phase | PC | MAR | MDR | IR | AC |
|-------|----|-----|-----|-----|-----|
| After Fetch | | | | | |
| After Execute | | | | | |

Final value in the Accumulator:


Spot the Error

A student describes the Fetch phase of the FDE cycle but gets the register roles wrong. One line contains an error about which register receives the fetched instruction. Identify the line with the error, then state the correct fix.

Line 1: FETCH PHASE:
Line 2: 1. The address in the PC is copied to the MAR.
Line 3: 2. The instruction is fetched from memory into the MDR, then copied into the MAR.
Line 4: 3. The PC is incremented by 1.

What is the correct fix for line 3?

Remember the fetch pathway: PC -> MAR -> Memory -> MDR -> IR. The MAR only ever holds addresses. The MDR is the “middle step” that receives data from memory before it goes to its final destination (IR for instructions, AC for data values).


Fill in the Blanks

Complete the description of the FDE cycle by filling in the correct register names.

The Fetch-Decode-Execute cycle describes how the CPU processes each instruction. Fill in the blanks with the correct register or component names:

FETCH PHASE:
1. The address in the ______ is copied to the ______.
2. The instruction at that address is fetched from memory into the ______.
3. The instruction is then copied from the MDR into the ______.
4. The ______ is incremented by 1.

DECODE PHASE:
5. The ______ interprets the instruction stored in the IR.

EXECUTE PHASE:
6. The ______ performs the operation.
7. The result is stored in the ______.

Predict the Output

Given the following initial CPU state and memory contents, the CPU executes two complete FDE cycles. The first instruction loads a value into the accumulator, and the second adds another value to it. What is the final value in the AC after both instructions execute?

PC = 0
Memory:
  Address 0: LOAD 5
  Address 1: ADD 6
  Address 5: 7
  Address 6: 3

AC before execution: 0

Final value in AC:

The CPU processes three instructions in sequence. The first loads a value, the second subtracts another value from it, and the third stores the result in memory. What value is stored at memory address 22 after all three instructions execute?

PC = 10, AC = 15
Memory:
  Address 10: LOAD 20
  Address 11: SUB 21
  Address 12: STORE 22
  Address 20: 42
  Address 21: 12
  Address 22: 0

Value stored at address 22:


Practice Exercises

Core

  1. Register roles – Describe the role of each of the following registers in one sentence each: PC, MAR, MDR, IR, AC.

  2. FDE phases – Describe the three phases of the fetch-decode-execute cycle. For each phase, name the registers involved and explain what happens to them.

  3. Bus types – Name the three types of bus in a computer system. For each bus, state its direction (unidirectional or bidirectional) and what it carries.

Extension

  1. FDE trace – Given the following memory contents and PC = 0, trace all register values through each phase of the FDE cycle for all three instructions:

    | Address | Contents |
    |---------|----------|
    | 0 | LOAD 10 |
    | 1 | ADD 11 |
    | 2 | STORE 12 |
    | 10 | 25 |
    | 11 | 17 |
    | 12 | 0 |
  2. Bus analysis – A program loads a value from memory address 100. Explain which buses are used during this operation, what signals travel on each bus, and in which direction.

Challenge

  1. Single-core vs multi-core – A program has two tasks: Task A takes 4 seconds and Task B takes 3 seconds. Task B depends on the output of Task A (it cannot start until A finishes). Compare the total execution time on a single-core processor versus a dual-core processor. Then consider a different scenario where Task A and Task B are independent – how does the comparison change?

  2. (HL) Pipeline hazards – Consider this sequence of instructions:

    I1: ADD R1, R2, R3    (R1 = R2 + R3)
    I2: SUB R4, R1, R5    (R4 = R1 - R5)
    I3: LOAD R6, [100]    (R6 = Memory[100])
    

    Explain why a data dependency hazard occurs between I1 and I2. Describe two techniques the CPU could use to handle this hazard. Would I3 be affected by the same hazard? Why or why not?


Connections

  • Prerequisites: Logic Gates – gates build the ALU; understanding AND, OR, XOR is essential for understanding how arithmetic works at the hardware level
  • Related: Primary Memory – the CPU reads instructions and data from memory and writes results back; understanding the memory hierarchy explains why registers are faster than RAM
  • Related: GPU – a co-processor that works alongside the CPU; the HL comparison (A1.1.3) requires understanding CPU architecture first
  • Forward: Operating Systems Layer – the OS schedules processes on CPU cores, manages context switching between tasks, and handles interrupts that interact with the FDE cycle


© EduCS.me — A resource hub for IB Computer Science
