Topic Selection

Most weak CS Extended Essays are not weak because the writing is poor. They are weak because the question was wrong: too broad, not actually computer science, or not answerable through technical investigation. Spend serious time here.


The three-question topic test

Before you settle on a research question, every answer below should be yes.

  1. Can this question be answered through computational principles? Algorithm complexity, system performance, computational theory, data structure properties, security analysis, machine-learning model evaluation – pick a lens.
  2. Does it require investigation, not just description? Comparing, measuring, evaluating, benchmarking – not “explain how X works.”
  3. Is it specific enough to address in 4,000 words? Narrow scope, defined conditions, measurable variables.

If any answer is “no”, refine.


Research question quality test

Quality | Test
Clear | Will the reader understand what is being investigated from the RQ alone?
Focused | Is the scope tight enough to be properly explored in 4,000 words?
Arguable | Does it allow analysis, comparison, evaluation – not just explanation?
CS-grounded | Could it only be answered by applying computer science knowledge?

Use higher-order framing: “To what extent…?”, “How does X compare to Y under Z conditions?”, “What is the impact of X on Y?”.

Avoid: “How does X work?”, “Is X better than Y?” (without a defined metric), “What is the future of X?”.

Shape of the question

  • Aim for a single sentence, around 20–30 words.
  • One question, not two. Compound RQs joined by “and” or “or” split your essay’s focus and almost always cost a mark on Criterion A.
  • The title and the research question are different things. The title appears on the title page; the RQ appears in the introduction. They can match, but a short title paired with a longer, more specific RQ is more common. Make sure both exist and that they are consistent with each other.

Refining a broad idea: a worked example

Stage | Question or scope
Broad interest | “I am interested in machine learning.”
Focused topic | “Classification algorithms for medical data.”
Preliminary investigation | “SVMs and neural networks are commonly compared. MNIST and several medical datasets are publicly available.”
Research question | “To what extent is a feed-forward neural network more accurate in classifying malignant cancer compared to k-nearest neighbour?”

Notice what happens at each step: the topic gets narrower, the metric becomes measurable, and the comparison becomes specific.


Topic areas with example research questions

These are illustrative – not “approved topics.” Use them as templates for the kind of question that works.

Algorithms and complexity

  • To what extent does the choice of sorting algorithm affect performance on partially sorted datasets of varying sizes?
  • What is the impact of grid density on the execution time and path accuracy of A* and Dijkstra’s algorithms in two-dimensional environments?
  • How does the average number of node expansions of iterative deepening depth-first search compare to breadth-first search for solving the 8-puzzle?
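
To show what a "measurable variable" might look like for the first question above, here is a minimal Python sketch. It is an illustration under stated assumptions, not a prescribed method: the function names, the swap-based way of producing partially sorted data, and the choice of insertion sort and merge sort are all hypothetical. It counts comparisons rather than wall-clock time, so the results are repeatable across machines.

```python
import random


def partially_sorted(n, swap_fraction, seed=0):
    """Return 0..n-1 with about swap_fraction * n random pairwise swaps applied."""
    rng = random.Random(seed)  # fixed seed so every run uses the same input
    data = list(range(n))
    for _ in range(int(swap_fraction * n)):
        i, j = rng.randrange(n), rng.randrange(n)
        data[i], data[j] = data[j], data[i]
    return data


def insertion_sort_comparisons(values):
    """Sort a copy with insertion sort; return (sorted_list, comparison_count)."""
    a = list(values)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if a[j] <= key:
                break
            a[j + 1] = a[j]  # shift the larger element one slot to the right
            j -= 1
        a[j + 1] = key
    return a, comparisons


def merge_sort_comparisons(values):
    """Sort a copy with top-down merge sort; return (sorted_list, comparison_count)."""
    a = list(values)
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, left_count = merge_sort_comparisons(a[:mid])
    right, right_count = merge_sort_comparisons(a[mid:])
    merged = []
    comparisons = left_count + right_count
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons


if __name__ == "__main__":
    # Independent variables: dataset size and degree of disorder.
    for n in (100, 1000):
        for swap_fraction in (0.0, 0.05, 0.5):
            data = partially_sorted(n, swap_fraction)
            _, insertion = insertion_sort_comparisons(data)
            _, merge = merge_sort_comparisons(data)
            print(f"n={n} swaps={swap_fraction:.2f} insertion={insertion} merge={merge}")
```

The design choice worth noting is that both independent variables (size and degree of disorder) are explicit parameters, which is what turns a broad interest in "sorting performance" into something that can be tabulated and analysed.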

Machine learning and AI

  • How does the choice of kernel function in support vector machines affect classification accuracy on handwritten digit datasets?
  • To what extent is a feed-forward neural network more accurate in classifying malignant cancer compared to k-nearest neighbour?
  • How does the number of hidden layers in a neural network affect image classification accuracy on CIFAR-10?

Cryptography and security

  • How do RSA and elliptic curve cryptography compare in key generation time and encryption speed when implemented in Python?
  • To what extent does AES maintain performance and security under varying key sizes (128, 192 and 256 bits) and input volumes?

Data compression and information theory

  • To what extent does Huffman coding outperform arithmetic coding in compressing English text datasets of varying sizes?
  • How does LZ77 compare to LZW in compressing different types of data (text, image, binary)?
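
For compression questions like these, the dependent variable is usually a compression ratio. The sketch below shows one way to compute it in Python. It uses the standard-library `zlib` module (DEFLATE, which combines LZ77 with Huffman coding) purely as a stand-in; an actual essay would substitute its own Huffman, arithmetic, LZ77, or LZW implementations. The function name `compression_ratio` and the sample inputs are assumptions made for illustration.

```python
import os
import zlib


def compression_ratio(data: bytes, level: int = 9) -> float:
    """Compressed size divided by original size; lower means better compression."""
    if not data:
        raise ValueError("data must be non-empty")
    return len(zlib.compress(data, level)) / len(data)


if __name__ == "__main__":
    # Two contrasting input types: highly redundant text and incompressible noise.
    samples = {
        "repetitive text": b"the quick brown fox jumps over the lazy dog " * 500,
        "random bytes": os.urandom(20_000),
    }
    for name, payload in samples.items():
        ratio = compression_ratio(payload)
        print(f"{name}: {len(payload)} bytes, ratio {ratio:.3f}")
```

Defining the ratio once, in one function, keeps the metric identical across every algorithm and data type being compared.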

Computer architecture and systems

  • To what extent does the type of core (logical or physical) of a CPU influence the speed at which algorithms are executed?
  • To what extent do cache-locality optimisations (loop tiling, array-of-structs vs. struct-of-arrays) affect the execution time of matrix multiplication on large matrices?

Data structures

  • How does the choice of data structure (hash table vs. balanced BST) affect search performance as dataset size scales?
  • To what extent does trie-based indexing outperform hash-based indexing for prefix search operations?
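
As a rough illustration of the second question above, the following Python sketch sets a minimal trie against a hash-set baseline for prefix search. Everything here is an assumption made for the example: the class and function names, the `"$"` end-of-word marker, and the decision to model "hash-based indexing" as a full scan of a set (an index that hashes every prefix would be another legitimate baseline).

```python
class Trie:
    """Minimal trie built from nested dictionaries."""

    def __init__(self):
        self.root = {}

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = True  # end-of-word marker; assumes "$" never appears in a key

    def with_prefix(self, prefix):
        """Return all stored words that start with prefix, in sorted order."""
        node = self.root
        for ch in prefix:
            if ch not in node:
                return []
            node = node[ch]
        results = []
        stack = [(node, prefix)]
        while stack:  # depth-first walk of the subtree below the prefix
            current, text = stack.pop()
            for key, child in current.items():
                if key == "$":
                    results.append(text)
                else:
                    stack.append((child, text + key))
        return sorted(results)


def hash_with_prefix(words, prefix):
    """Baseline: a hash set has no prefix structure, so every key is checked."""
    return sorted(w for w in words if w.startswith(prefix))


if __name__ == "__main__":
    words = {"car", "card", "care", "cat", "dog"}
    trie = Trie()
    for word in words:
        trie.insert(word)
    print(trie.with_prefix("car"))
    print(hash_with_prefix(words, "car"))
```

Checking that both structures return the same answers before timing them is a useful first step: a performance comparison only means something once correctness is established.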

Networking

  • How does packet loss rate affect the throughput of TCP Reno vs. TCP CUBIC in simulated network conditions?

Topics to avoid in CS

These belong in other Diploma subjects, even if the surface topic sounds technical.

Topic type | Why not CS | Better suited to
Social impact of social media algorithms | No computational analysis | Digital Society, Psychology
History of the internet | Descriptive, not analytical | History
Business/financial analysis of a tech company | Strategic, not technical | Business Management, Economics
Ethical implications of facial recognition | Ethics-focused, not technical | Philosophy, Global Politics
User-experience study of a mobile app | Design-focused, not computational | Design Technology
How technology affects education | Social-science question | Digital Society, Psychology

Ethical or social considerations may be briefly addressed (one paragraph in the introduction or conclusion), but they cannot dominate the analysis.


Subject-focused or interdisciplinary?

Most CS Extended Essays use the subject-focused pathway – the essay sits inside Computer Science only. This is what we recommend by default.

The interdisciplinary pathway is also available if your topic genuinely needs two DP subjects to make sense – for example, climate-change modelling (Mathematics + Computer Science) or computational analysis of political text (Computer Science + Global Politics). Interdisciplinary essays must place the work inside one of five interdisciplinary frameworks the IB defines (e.g., movement/time/space, evidence/measurement/innovation, sustainability/development/change).

In practice, choose the subject-focused pathway unless you have a strong reason to combine subjects. Whichever you choose, the essay must remain genuinely rooted in computer science.


Common topic-selection mistakes

  • Too broad. “How effective is encryption?” is too general. Better: “To what extent does AES maintain performance and security under varying key sizes (128, 192 and 256 bits) and input volumes?”
  • Too descriptive. Describing how a technology works is not the same as analysing it. The question must require investigation.
  • Not actually CS. If your question could be answered without applying computer science principles, the EE probably belongs in another subject.
  • No measurable variable. “Is X better than Y?” is unanswerable until you define “better” with a metric.
  • Unrealistic resource needs. Datasets you cannot access, models you cannot train, hardware you do not have. Plan around what you can actually do in ~40 hours.
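
To make the "no measurable variable" point concrete, the sketch below defines "better" as median wall-clock runtime over repeated trials and applies it to two search routines. This is only one possible metric, and the helper name `median_runtime`, the repeat count, and the linear-versus-binary-search pairing are assumptions chosen for illustration. A short pilot run like this also gives an early sense of whether the full experiment fits the time available.

```python
import statistics
import time


def median_runtime(func, *args, repeats=7):
    """Median wall-clock seconds over `repeats` calls of func(*args)."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def linear_search(items, target):
    """Return the index of target in items, or -1 if it is absent."""
    for index, value in enumerate(items):
        if value == target:
            return index
    return -1


def binary_search(items, target):
    """Return the index of target, or -1. Assumes items is sorted ascending."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        data = list(range(n))
        target = n - 1  # worst case for linear search
        linear = median_runtime(linear_search, data, target)
        binary = median_runtime(binary_search, data, target)
        print(f"n={n}: linear={linear:.6f}s binary={binary:.6f}s")
```

Taking the median rather than a single measurement reduces the influence of one-off delays from the operating system, which is the kind of methodological choice an essay should state and justify.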

Practical considerations before you commit

  • Resources. Do you have access to the datasets, libraries, hardware, or environments you need?
  • Feasibility. Can the experiment or analysis be completed – and properly written up – in ~40 hours?
  • Scope. An overly ambitious investigation that runs out of time will not score well even if the idea was strong.
  • Prior knowledge. You should be studying Computer Science in the DP. The CS-specific subject guidance assumes this; the EE builds on that foundation.
  • Supervisor expertise. Talk to your supervisor early. They can spot scope and methodology problems before you commit weeks to them.

© EduCS.me — A resource hub for Computer Science education
