Testing Strategy
Where this helps: proving your solution works, on purpose rather than by luck. In the IB Computer Science Internal Assessment this is Criterion C (the test plan). Choosing good test data is a skill any developer needs.
Table of Contents
- Why This Page Exists
- Types of Testing
- The Real Skill: Choosing Test Data
- Writing a Test Plan
- Quick Check
- Classify the Test Data
- Practice Exercises
- Connections
Why This Page Exists
Most beginners “test” their program by running it once with sensible input, seeing it work, and moving on. That is not testing. That is hoping.
Real testing is deliberate. You choose inputs that are likely to break the program, decide in advance what should happen, and check whether it does. The goal is not to prove your program is perfect (you cannot); it is to find the places where it is wrong while you still have time to fix them. A tester’s mindset is not “look, it works”; it is “let me try to break it”.
Types of Testing
Testing is described in two useful ways: by how much you can see and by how much of the system you are testing.
By what you can see
| Type | What it means |
|---|---|
| Black-box testing | You test against the requirements without looking at the code. You know what the program should do, and you check that it does, treating the inside as a sealed box. |
| White-box testing | You test knowing the code inside, deliberately choosing inputs that exercise each branch and path (every if, every loop). |
Both matter. Black-box testing catches “it does the wrong thing”; white-box testing catches “this particular path was never checked”.
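To make the difference concrete, here is a minimal sketch in Python. The `grade` function and its thresholds (40 and 70) are invented for illustration and are not part of any real system. The first set of tests is written from the requirement alone; the second is written after reading the code, so that every branch runs at least once.

```python
def grade(score):
    """Hypothetical example: turn a score from 0 to 100 into a grade band."""
    if score < 0 or score > 100:
        return "invalid"
    if score >= 70:
        return "distinction"
    if score >= 40:
        return "pass"
    return "fail"


# Black-box tests: chosen from the requirement only
# ("70 or more is a distinction, 40 or more is a pass").
black_box_cases = {85: "distinction", 55: "pass", 20: "fail"}

# White-box tests: chosen after reading the code, so that each if
# (both sides of the range check included) and the final return all run.
white_box_cases = {
    -5: "invalid",
    101: "invalid",
    70: "distinction",
    40: "pass",
    39: "fail",
}

for cases in (black_box_cases, white_box_cases):
    for score, expected in cases.items():
        assert grade(score) == expected, score
```

Notice that the black-box set never touches the "invalid" branch. That is exactly the kind of unchecked path white-box testing is meant to find.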
By how much you are testing
| Level | What it tests |
|---|---|
| Unit testing | One small piece on its own (a single method or function). |
| Integration testing | Several pieces working together, to catch problems where they connect. |
| System testing | The whole finished program, end to end. |
| User acceptance testing | Whether the system meets the client’s real needs, tested with (or by) the client. |
These build up. You test a part alone, then parts together, then the whole thing, then whether the whole thing actually satisfies the person you built it for.
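The first two levels can be sketched in a few lines of Python. All three function names below are hypothetical, chosen only for this illustration: two small units are tested alone, then a third function that combines them is tested as a whole.

```python
def is_valid_score(score):
    """Unit 1 (hypothetical): is the score a whole number from 0 to 100?"""
    return isinstance(score, int) and 0 <= score <= 100


def average(scores):
    """Unit 2 (hypothetical): mean of a non-empty list of numbers."""
    return sum(scores) / len(scores)


def class_average(raw_scores):
    """Combines both units: drop invalid scores, then average the rest."""
    valid = [s for s in raw_scores if is_valid_score(s)]
    return average(valid) if valid else None


# Unit tests: each piece on its own.
assert is_valid_score(100) is True
assert is_valid_score(101) is False
assert average([40, 60]) == 50

# Integration tests: the pieces working together.
assert class_average([40, 60, 101, "hello"]) == 50
assert class_average([]) is None
```

The last assertion is the kind of problem integration testing exists to catch: `average` on its own would divide by zero for an empty list, so the connecting code has to guard against it.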
The Real Skill: Choosing Test Data
Anyone can run a program with a “normal” value. The skill that separates real testing from hoping is choosing test data on purpose, in three categories.
| Category | What it is | Why it matters |
|---|---|---|
| Normal | Typical valid input the program is meant to handle | Confirms the program does the ordinary job correctly |
| Boundary | Values right at the edges of what is allowed | The edges are where off-by-one errors and wrong comparisons hide |
| Invalid (erroneous) | Input the program should reject or handle gracefully | Real users type wrong things; the program must not crash |
Boundary values are the most valuable and the most forgotten. If a field accepts a score from 0 to 100, the interesting values are not 50; they are 0 and 100 (the valid edges) and -1 and 101 (the first invalid values on each side). A bug like if (score < 100) instead of <= 100 is invisible to normal data and obvious the moment you test the boundary.
Testing only normal data is the most common testing mistake. A program that handles 50 correctly tells you almost nothing. Test the edges (0, 100) and the just-outside values (-1, 101), and test rubbish input (“hello”, empty, a negative number). That is where the bugs live.
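The comparison bug described above can be shown in a short Python sketch. Both function names are made up for this example; the only difference between them is `<` versus `<=` at the upper edge.

```python
def accepts_buggy(score):
    # Bug: uses < where the requirement needs <=, so 100 is wrongly rejected.
    return 0 <= score < 100


def accepts_fixed(score):
    # Correct: both edges of the 0 to 100 range are allowed.
    return 0 <= score <= 100


normal = [45, 78]
# Boundary values mapped to whether they should be accepted.
boundary = {0: True, 100: True, -1: False, 101: False}

# Normal data cannot tell the two versions apart.
assert all(accepts_buggy(s) == accepts_fixed(s) for s in normal)

# Boundary data exposes the bug immediately.
failures = [s for s, expected in boundary.items() if accepts_buggy(s) != expected]
print(failures)  # [100]
```

Every normal value passes on both versions, so a tester who stopped there would ship the bug. Only the boundary value 100 reveals it.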
Worked example: a score field (valid range 0 to 100)
| Category | Test values | Expected result |
|---|---|---|
| Normal | 45, 78 | Accepted |
| Boundary (valid edges) | 0, 100 | Accepted |
| Boundary (first invalid) | -1, 101 | Rejected |
| Invalid | 500, “hello”, (empty) | Rejected without crashing |
The nine well-chosen values in this table test the field far more thoroughly than fifty random valid numbers.
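The values in the table above translate directly into an automated check. The sketch below assumes the score arrives as text typed into a field and uses a hypothetical `validate_score` function; your own validation may look different.

```python
def validate_score(text):
    """Hypothetical validator: accept only whole numbers from 0 to 100."""
    try:
        value = int(text)
    except (TypeError, ValueError):
        # Not a whole number at all ("hello", empty input): reject, do not crash.
        return False
    return 0 <= value <= 100


# (input as typed, should it be accepted?)
cases = [
    ("45", True), ("78", True),        # normal
    ("0", True), ("100", True),        # boundary, valid edges
    ("-1", False), ("101", False),     # boundary, first invalid
    ("500", False), ("hello", False), ("", False),  # invalid
]

for text, expected in cases:
    assert validate_score(text) == expected, repr(text)
```

Each row of the table becomes one entry in `cases`, so adding a new test value is a one-line change.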
Writing a Test Plan
A test plan is a table, written before you test, that says exactly what you will try and what should happen. Deciding the expected result in advance is what makes a test honest: you cannot talk yourself into “well, that output looks about right” if you already wrote down what right was.
A good test plan has these columns:
| Test | What it checks | Test data | Type | Expected result | Actual result | Pass/Fail |
|---|---|---|---|---|---|---|
| 1 | Accepts a normal score | 45 | Normal | Stored, shown as 45 | | |
| 2 | Accepts the lower edge | 0 | Boundary | Stored, shown as 0 | | |
| 3 | Accepts the upper edge | 100 | Boundary | Stored, shown as 100 | | |
| 4 | Rejects just over the top | 101 | Boundary | Rejected with a message | | |
| 5 | Rejects text | “hello” | Invalid | Rejected without crashing | | |
You leave “Actual result” and “Pass/Fail” blank until you run the tests, then fill them in honestly. A failed test is not a disaster; it is the test doing its job. An untested boundary that ships as a bug is the real disaster.
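If your tests can be automated, the same table can be held as data and the last two columns filled in by the program. This is only a sketch under some assumptions: `validate_score` and `run_plan` are hypothetical names, and the expected results are simplified to "accepted" or "rejected" rather than the fuller wording in the table.

```python
def validate_score(text):
    """Hypothetical validator: accept only whole numbers from 0 to 100."""
    try:
        value = int(text)
    except (TypeError, ValueError):
        return False
    return 0 <= value <= 100


# Written BEFORE testing: the expected result is fixed in advance.
test_plan = [
    {"test": 1, "data": "45", "type": "Normal", "expected": "accepted"},
    {"test": 2, "data": "0", "type": "Boundary", "expected": "accepted"},
    {"test": 3, "data": "100", "type": "Boundary", "expected": "accepted"},
    {"test": 4, "data": "101", "type": "Boundary", "expected": "rejected"},
    {"test": 5, "data": "hello", "type": "Invalid", "expected": "rejected"},
]


def run_plan(plan, check):
    """Fill in the 'actual' and 'result' columns for every row of the plan."""
    for row in plan:
        row["actual"] = "accepted" if check(row["data"]) else "rejected"
        row["result"] = "Pass" if row["actual"] == row["expected"] else "Fail"
    return plan


for row in run_plan(test_plan, validate_score):
    print(row["test"], row["type"], row["expected"], row["actual"], row["result"])
```

Because the expected column is written before `run_plan` is called, the program cannot adjust what "right" means after seeing the output, which is the same honesty rule the paper test plan enforces.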
Note for IB CS learners: in the Internal Assessment, the test plan is part of the design (Criterion C) and the evidence of testing appears in development and the video. Tests should be relevant to your solution: test the features your success criteria promised, with data chosen across the three categories, not a random assortment.
Quick Check
Q1. What are the three standard categories of test data?
Q2. A quantity field accepts whole numbers from 1 to 10. Which of these is a boundary value?
Q3. Which approach tests a feature most thoroughly?
Q4. A tester checks whether the program meets its requirements without ever looking at the source code. This is:
Q5. What is the real purpose of testing?
Q6. Why should you decide a test's expected result before you run it?
Classify the Test Data
A login system accepts an age from 13 to 120. Classify each test value.
Fill in the blanks with normal, boundary, or invalid.
// Age entered: 40
// Category:
// Age entered: 13
// Category:
// Age entered: 120
// Category:
// Age entered: 150
// Category:
// Age entered: "twenty"
// Category:
12 and 121 would also be excellent tests: they are the boundary values just outside the allowed range, and they catch the most common comparison bugs.
Practice Exercises
Note for IB CS learners: these mirror the testing thinking behind Internal Assessment Criterion C. Command terms and a suggested mark weight are shown. At least one asks for a full prose response.
Core
- Define (3 marks) – Define normal, boundary, and invalid test data, giving one original example of each for a field that accepts a month number from 1 to 12.
- Select (4 marks) – A password must be 8 to 16 characters. Give one normal, two boundary, and one invalid test value, and state the expected result for each.
- Construct (4 marks) – Write a five-row test plan (with the standard columns) for a feature that adds an item to a shopping cart.
Extension
- Explain (4 marks) – Explain why testing only normal data can let a serious bug reach users, using a boundary example to illustrate.
- Distinguish (4 marks) – Explain the difference between black-box and white-box testing, and give one situation where each is the better choice.
Challenge
- Discuss (8 marks) – “If the program runs correctly on my own inputs, it is tested.” Discuss why this belief is dangerous, referring to categories of test data, expected results, and the purpose of testing. Reach a reasoned conclusion. (Write in prose.)
Connections
- Prerequisite: Requirements Gathering – success criteria are what your tests check against
- Prerequisite: Design Representations – you plan tests alongside the design
- Related: Debugging – what you do once a test finds a problem
- Next: Documentation and User Guides – recording how the finished system works
- Related: Internal Assessment – the test plan is assessed in Criterion C