A (somewhat subjective) Guide to Part II CompSci Dissertations
December 11, 2020
I’ve been fortunate enough to get some great advice about Part II dissertations throughout the course of the last year, which culminated in being awarded the prize for the top dissertation.
This post is a compilation of the advice I’ve received and the lessons I’ve learnt along the way. Part II dissertations are wonderfully varied, so this post is really just one perspective. Remember to look at the Pink Book!
I’ll try to give some general advice, with anecdotes in a box like below:
Skim this and dip into sections however you please!
If you have a clear project idea, then you can skip this section.
I want to say right off the bat that you don’t need to self-propose a project idea for Part II or even know much about the field your project is in to do well. (I didn’t.)
You’re not doing research, so your idea doesn’t have to be novel or ground-breaking. This is a software engineering project, so if anything I think it’s more a case of executing it well.
My supervisor Alan Mycroft summed it up very nicely in one of our early emails, when I asked him about what made a great project:
“Beware trying to “design in” the idea of high-flying. Just build yourself an interesting base idea (core project) and have ideas (extensions) for what to do beyond that.”
If you’re unsure, I would suggest you:
I framed the project as a learning opportunity. The Part II project is going to take up a quarter of your time for 6 months, so it’s an ideal opportunity to dive into a topic you’re interested in but don’t know much about.
Enjoy a specific 1B lecture course? Maybe it’s worth doing a project in that area.
Many supervisors have a list of project ideas on their website, or will suggest some if you email them.
I didn’t know much about type systems and programming language theory but really enjoyed the 1B Concepts in Programming Languages course. So I emailed the lecturer, Alan Mycroft, who suggested I read an interesting paper about a type system that prevented data races. From there we came up with a project idea that seemed exciting - reimplement this type system in my own toy language Bolt.
The dissertation is about:
- Picking a project that has a clear motivation
- Understanding and explaining the background material
- Implementing your project using good software engineering practices
- Critically evaluating it.
The first two points usually take care of themselves. Conversations with your supervisor and overseers should filter out any unclear project ideas. And just through the course of implementing your project, you’ll naturally encounter a good amount of theory that goes beyond the 1B Tripos.
In practice, it’s easier to demonstrate the latter two points in certain fields compared to others.
Systems projects work especially well as you’re writing plenty of code, so there’s ample opportunity to demonstrate good engineering practices like testing. Systems evaluation tends to be quantitative (lots of pretty graphs to evaluate!).
For other topics, it’s a case of augmenting your project idea so it’ll do better. Remember, the examiners won’t appreciate the nuances of your field that make it especially hard, as they’re not experts in that field. So you need to add a more general engineering component.
For example, take a machine learning project. Naturally, you’re likely to write a lot less code than someone writing a compiler, and ML models are not as suited to traditional testing. Examiners might not appreciate the time-consuming nature of tuning model architectures and hyperparameters. So perhaps you could also write a data pipeline that ingests data for your machine learning model, or implement some environment to deploy your model in. You can then demonstrate good engineering practice when implementing the pipeline/environment.
Let’s look now at another common project idea: an X -> Y compiler (insert your languages here). This ticks off the systems project boxes mentioned above, so scores decently. Compilers are a great demonstration of how much execution matters. A bog-standard compiler will score middling marks, but implementing extra compiler optimisations gives you more theory and implementation to talk about, and more benchmarks to write in your evaluation.
In a nutshell, you want as minimal a core project as possible, with scope for extensions. The proposal is a safety net, rather than a wishlist. Save talking about the proper stretch goals for the dissertation (in case they end up harder than expected).
Give yourself plenty of slack time in your project timeline, and overestimate how long it’ll take you, to budget for unexpected obstacles. If it’s going super well, you’ll have a lot of slack time to work on extensions!
You can use your proposal’s timeline to evidence forward thinking / good planning. So mark key project milestones (deliverables) in your timeline, as you can later display these in a Gantt chart in your dissertation.
In my proposal, I realised implementing the entire type system was pretty hard, given that I didn’t really understand the typing rules very well. So for my core success criteria, I cut the type system down to a minimal subset of typing rules, and made the language as simple as possible. Implementing the whole type system was firmly in extension territory. (This way if I didn’t fully understand it, it wasn’t a failure!)
Your mileage might vary here, so I’ll try to give some high-level suggestions.
6 months feels like plenty of time to get the implementation done. The biggest challenge is to balance the project work with lots of short-term deadlines (supervision work etc.). I would suggest trying to do a little bit every day or every few days, otherwise it’s very easy to go 2 weeks without getting any work done on your project.
Your supervisor should be your first port of call if you’re stuck, and can also suggest interesting avenues for extensions. When it comes to reaching out to your supervisor, be mindful that the feedback loop might be up to a week (they’re busy too!). Don’t forget friends can be great for debugging (shout out to Jamie for helping me with pthreads).
If you’re reimplementing a paper, don’t expect to understand everything in one go. I read the paper I was implementing a few times at the start of the project, and then dove deeper into each of the sections as I was implementing them. I was still re-reading it when writing up my dissertation!
With a Part II Project, you're likely going to end up googling topics which lack tutorials. This is why I'm writing tutorials for each part of my compiler project! At the time of writing this post, I've written 7 out of 11 posts in the series.
Throughout your project, I would suggest also learning more broadly about your project area (your supervisor is a fountain of knowledge - use them!). Even if not purely for fun, this extra context is useful for setting the scene in your introduction and preparation chapters.
Once you’ve got your core project done, everything you do is a bonus in the eyes of the examiners.
I would say beware of feature creep. It’s easy to keep adding quick wins / low hanging fruit, but at some point you have to wrap up. You’re marked on your dissertation, not your project code.
Tip: if you’re short on time, first write up your dissertation. Include the section about planned extensions in the dissertation, and then actually implement the extensions whilst waiting on feedback. (Though this did mean I was writing code a few days before the deadline…)
If you want to know what you need to do to get a strong mark, the department provide marking guidelines.
The department doesn’t publish the marks the top dissertations were awarded. Unfortunately, I’ll have to disappoint you and say the marks can’t be meaningfully compared year-on-year. I scored 92, and I know someone who scored 86 for their top dissertation; without a shadow of a doubt, their dissertation project was much better than mine.
From what I can tell, the marks are scaled based on two factors:
- Other students’ marks - so a first-class student scores ~70/100.
- Scores in the other exams that year - so your dissertation mark doesn’t massively skew your overall results. E.g. if you scored 100 in the dissertation, you’d only need to score an average of 50/100 in your exams to get a 2.1 overall.
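To see why a high dissertation mark leaves so much headroom, here is a rough sketch of the arithmetic in Python. The 25% weighting and the 2.1 boundary of 60 are my assumptions for illustration, not figures from the department, and the sketch ignores any scaling.

```python
# Assumed for illustration, not published by the department:
# the dissertation counts for 25% of the overall mark, and a 2.1 starts at 60/100.
DISSERTATION_WEIGHT = 0.25
UPPER_SECOND_BOUNDARY = 60.0


def overall_mark(dissertation, exam_average, weight=DISSERTATION_WEIGHT):
    """Weighted average of the dissertation mark and the average exam mark."""
    return weight * dissertation + (1 - weight) * exam_average


overall = overall_mark(dissertation=100, exam_average=50)
print(overall, overall >= UPPER_SECOND_BOUNDARY)
```

Under those assumed numbers, a perfect dissertation plus an exam average of 50 comes out at 62.5 overall, which clears the assumed boundary.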
Whilst marks can’t be compared, you can still look at which dissertations were commended / prize-winning to get a feel for what makes a good dissertation. The Computer Lab lists an archive of previous dissertations and commended students.
Yes, everyone says this - it does take much longer than you expect to write it up. Painfully so, at times :P
I can tell you that my dissertation would have likely scored about 20 marks lower if it weren’t for the feedback given by my DoS, my supervisor and also my friends (shout out to Jamie, Zeb and Pali)! Not only is helping improve your friends’ dissertations super wholesome, it’s great for getting feedback about accessibility.
The easiest way to do badly on your dissertation is to write too much for your preparation chapter, and not enough for your evaluation.
Plan out a high-level dissertation outline, with word budgets for each of the sections. Aim for an overall budget of 9,000-10,000 words. Don’t forget that you’ll need space to incorporate feedback (e.g. if an explanation needs more detail).
You need to motivate the problem your project is solving by crafting a narrative in the Introduction chapter. Why should the examiners care about the problem you are solving? Explain the problem at hand, related approaches and where your project comes in.
The end of the chapter is a chance to list the key contributions of your project: why is the outcome of your project significant? Detail matters here. E.g. saying you can compile programs that are hundreds of lines long is more impressive than saying you have a working compiler.
What makes my programming language Bolt useful? Will anyone use it? Probably not. However I framed it as a Java-style language, and explained that Bolt demonstrated how the type system in the paper I implemented could be incorporated into Java.
Related: the “Work Completed” section in the Proforma is a 100-word pitch for why examiners should give you a good mark. Use this to extol the virtues of your project, as they’ll use this pitch to refresh their memory when assigning marks!
The marks for the preparation chapter are for clarity of explanation. It’s hard to give specific advice here, as each project’s background material will vary a lot. Roughly speaking, anything that the reader needs to know to understand your project implementation needs to go in this chapter.
The biggest pitfall is to think something is obvious because you’ve spent the last 6 months wrapped up in it. So here are a couple of tips to recalibrate your sense of “obvious”.
Firstly, look at previous years’ dissertations on similar topics to see what level they started their preparation chapter at.
And secondly, signpost! Say things thrice: introduce it, explain it, summarise it. The examiners are skimming your dissertation when marking it, so make their life easier by signposting so they don’t miss the key ideas.
To tie together the chapter, use a running example and include additional context about the big picture.
To avoid your project being seen as a “toy” project, use a real-world running example. E.g. for an example program, don’t use
The second half of the preparation chapter is a requirements analysis and engineering practices section. To be honest, this is just a hoop to jump through. Pick some form of requirements analysis method e.g. MoSCoW.
Pepper in some metrics for the “software engineering practices” section. It’s the difference between writing “some tests” and achieving 94% test coverage with 302 tests. This helps convince the assessors you’ve actually taken a professional approach.
Here are some ideas of what you can put in as “good practice”:
- Version control - please use Git to back up your project! (tutorial here)
- Filing issues to track subtasks - GitHub has a useful Kanban board that integrates with issues, which you can reference as “agile” methodology
- Tests - and discuss why you chose your testing frameworks
- Build systems
- Continuous Integration
- Docstrings or some other form of generated documentation
- Bash scripts
- Linters and autoformatters
Have a look at previous projects to get more ideas.
The main pitfall in this chapter is going into the nitty-gritty of individual classes and methods. The goal isn’t to maximise the coverage of each line of your codebase; it’s to take a step back and explain the bigger design patterns, e.g. through UML diagrams or the datatype representations you chose. Implementation doesn’t just mean code: it could be the details of the algorithms you used, or the modifications you made to the theory introduced in the preparation chapter.
To score especially highly, you need to demonstrate some way you extended or adapted the theory presented in the preparation chapter. In my case, it was adapting the type system I used to work with class-based inheritance. Don’t worry though, the bar for “contribution to the field” in a Part II project is much lower than a research contribution that would warrant a conference paper.
The evaluation chapter is about two things: how critically you evaluated the success of your project, and the methodology you used to reach that conclusion.
With all evaluations, assessors are looking for you to present a nuanced discussion of the successes and failures. If possible, find something to compare against, and use this to really explore the limits of your project. So don’t just say it was successful overall, but find inputs where it performed particularly well or badly, and try to explain why. If you’ve hit your core success criteria, your project is already successful enough, so don’t feel this critical evaluation detracts from your implementation.
It’s good to have something quantitative to talk about, even if it feels artificial, if only to allow you to demonstrate good methodology.
Showing a pretty graph is only half the story; the other half is explaining why that graph is legitimate. Graphs should have error bars. Explain your evaluation methodology, listing the configurations used for benchmarks (for reproducibility). Talk about how you avoided systematic errors etc.
I benchmarked Bolt against Java. Okay, it doesn’t evaluate the type system (the heart of my dissertation) but rather the overall Bolt compiler. Did it show anything revelatory? Nah, it showed that LLVM optimisations beat the JVM. However, I could talk about my approach to benchmarking.
E.g. Java’s classloader lazily loads in classes at runtime. I therefore ran the Java program a few times before recording times for the benchmark, to ensure the classes had been loaded for all benchmark runs.
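The warm-up idea generalises beyond Java, so here is a minimal sketch of it in Python. This isn’t the harness I used for Bolt: the `benchmark` helper, its parameters and the example workload are all hypothetical, purely for illustration. It discards a few untimed runs, then reports the mean and standard error of the timed runs, which is the sort of number you’d use for error bars.

```python
import statistics
import time


def benchmark(run_once, warmup_runs=3, timed_runs=10):
    """Time run_once after some untimed warm-up runs.

    Returns (mean, standard error) of the timed runs, in seconds.
    """
    # Warm-up: let caches, lazy loading etc. settle before measuring.
    for _ in range(warmup_runs):
        run_once()

    samples = []
    for _ in range(timed_runs):
        start = time.perf_counter()
        run_once()
        samples.append(time.perf_counter() - start)

    mean = statistics.mean(samples)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    std_error = statistics.stdev(samples) / len(samples) ** 0.5
    return mean, std_error


def example_workload():
    # Hypothetical stand-in for the program being benchmarked.
    return sum(i * i for i in range(10_000))


mean, std_error = benchmark(example_workload)
print(f"{mean * 1e3:.3f} ms ± {std_error * 1e3:.3f} ms")
```

Whatever you end up measuring, state the number of warm-up and timed runs in your methodology so the benchmark is reproducible.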
You’ve submitted your dissertation, and completed your exams, but… you’re not free. If you get called up for a viva, rest assured it’s nothing to worry about. It’s neither good nor bad, just that the examiners want to know more about your work. It could be because of plagiarism concerns, to decide whether it’s worth a prize, or because they want to achieve consensus on the mark (if examiners disagree on the mark).
The viva lasts about 20 minutes. It consists of:
- An 8-min presentation on your project. This is like your progress report - keep it pretty high-level and sell your significant achievements!
- A 5-min demonstration of your project.
- The rest of the time is for questions about your project. Don’t worry, these aren’t a grilling, if anything the discussion is quite chill.