There is a lot to be said about learning to program: what languages to use, what didactics to use, whether to learn it at a university or in a boot camp. But one thing I think is similar in all those situations: a strong focus on creating code, very often ‘green-field’ code, starting from a specification or a set of tests. As a result, programmers in general are really poorly equipped to read code, even their own code, but especially code written by others.
Starting anew will be easier
I myself am no exception. Very often, if I am in need of a certain piece of code, rather than reusing open source code I prefer to write it myself because “it will be less work this way”. This is a bit of a self-fulfilling prophecy: because I do not read code, I will not get better at reading code, and I will forever write my own stuff.
How to write code?
When writing code, even if it is for a really tough problem, I know what to do; I have a set of strategies I can employ. I could start by writing tests TDD style, I could start to write down requirements and specifications, I could draw an architecture diagram or sketch out the user interface. Each of these processes is relatively well described in books and blogs, and more importantly: I practiced these things extensively when I was in college, and afterwards I kept using them when writing code, and I learned new techniques too.
How to read code?
But how to read code? Frankly… I do not know either. How to start? There is some research on how professional programmers read code; for example, Uwano et al. observed that programmers ‘scan’ code to get an idea of what the program does. They found that 70% of lines were seen in the first 30% of time spent. So apparently, scanning the code is a thing we do, but why? And how did we learn to do this? I am not saying it does not make sense to do so, it makes a lot of sense, but how do we know? There might be some transfer from natural language, or developers might learn this from seeing others do it. And while there will certainly be transfer from natural language reading skills, reading code is not completely the same. Busjahn et al. observed that programmers read code less linearly than natural language; they follow the call stack rather than reading from top to bottom. How do we know that is a smart thing to do? Maybe again by observing people do it, or by mimicking how you would write code?
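To make the difference between reading top to bottom and following the calls concrete, here is a small Python sketch. It is not taken from the Busjahn et al. study; the sample program and the helper names `linear_order` and `call_order` are made up for illustration, and "call order" is approximated as the order in which calls appear in the source, starting from an assumed entry point `main`.

```python
import ast
from typing import List

# A made-up sample program: helpers are defined first, main comes last.
SAMPLE = '''\
def format_report(rows):
    return ", ".join(rows)

def load_rows():
    return ["a", "b"]

def main():
    rows = load_rows()
    print(format_report(rows))
'''


def linear_order(tree: ast.Module) -> List[str]:
    """Function names in the order they appear in the file (top to bottom)."""
    return [node.name for node in tree.body if isinstance(node, ast.FunctionDef)]


def call_order(tree: ast.Module, entry: str) -> List[str]:
    """Function names in the order a reader meets them when following calls."""
    functions = {
        node.name: node for node in tree.body if isinstance(node, ast.FunctionDef)
    }
    order: List[str] = []

    def visit(name: str) -> None:
        # Skip functions already seen and names not defined in this file
        # (for example built-ins such as print).
        if name in order or name not in functions:
            return
        order.append(name)
        calls = [
            node
            for node in ast.walk(functions[name])
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
        ]
        # Visit callees in the order the calls appear in the source text.
        for call in sorted(calls, key=lambda c: (c.lineno, c.col_offset)):
            visit(call.func.id)

    visit(entry)
    return order


if __name__ == "__main__":
    tree = ast.parse(SAMPLE)
    print("top to bottom:  ", linear_order(tree))
    print("following calls:", call_order(tree, "main"))
```

For this sample, the top-to-bottom order starts with the helper functions, while the call-following order starts at `main` and only then visits the helpers it uses.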
However we come to know what to do, code reading skills do not come immediately to new programmers. Busjahn et al. also found (same paper) that novices read code more linearly than expert programmers. Learning to follow the call stack is a thing that comes with experience. But here is the thing: as far as I am aware, this practice, unlike the practice of writing code, is not deliberate practice. It happens as a byproduct of other things (like code writing skill?). This is a problem, firstly because it leads to the thought that “writing afresh will be easier than learning existing code”, and secondly because I worry that it might keep people out of projects.
Deliberate reading
Coming back to the comparison with natural language, which I have talked and written about a lot: in language, learners practice reading a lot! Not just technical reading, but also close reading: the practice of carefully interpreting a text. I have been studying close reading practices for a long time (I would recommend Boyles as an easy and practical intro) and the more I read, the more I became enticed by the idea that all of this would be useful for programming too!
For example, Boyles suggests questions like:
- What is the first thing that jumps out at me? Why?
- What’s the next thing I notice? Are these two things connected? How?
- Do they seem to be saying different things?
These questions encourage and train ‘scanning’ behavior, which we know is a thing programmers do. The beautiful thing about this technique is that it gives me a thing “to do” when reading code. Rather than mindlessly scrolling through the code, or stepping through it with a debugger in the faint hope of an epiphany, I can now do a thing, which makes the task so much less daunting!
There are more syntactic techniques to be applied to programming too. Greenham’s book Close Reading: the basics (which I would recommend if you want to dive into the topic more) describes 6 contexts of close reading, one of which is syntactic. In this context you study words, and how they relate to each other. A simple exercise to study syntax is to circle words which appear in the text commonly, see what patterns arise and how words are connected to similar ones.
This too is an exercise where I see a lot of value for programming. Circling variables and their uses, or classes and their instances, gives a lot of insight into an unknown code base, and again is a thing I can *do*. In fact, when teaching variables to my high schoolers, I was using a similar technique even before I dove into close reading!
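On paper, circling is done with a pen, but the same exercise can be approximated with a short script. The sketch below is only one possible way to do it, assuming the code being read is Python; the function name `circle_names` and the sample snippet are hypothetical, chosen for illustration. It lists, for every variable or parameter name, the lines on which that name appears.

```python
import ast
from collections import defaultdict
from typing import Dict, List

# A made-up snippet to "circle" names in.
SAMPLE = '''\
def total_price(items, tax_rate):
    subtotal = 0
    for item in items:
        subtotal += item
    return subtotal * (1 + tax_rate)
'''


def circle_names(source: str) -> Dict[str, List[int]]:
    """Map each variable or parameter name to the lines where it appears."""
    occurrences = defaultdict(set)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            # A variable being read or assigned.
            occurrences[node.id].add(node.lineno)
        elif isinstance(node, ast.arg):
            # A parameter in a function definition.
            occurrences[node.arg].add(node.lineno)
    return {name: sorted(lines) for name, lines in occurrences.items()}


if __name__ == "__main__":
    for name, lines in sorted(circle_names(SAMPLE).items()):
        print(f"{name}: lines {lines}")
```

Reading the result next to the code shows, for instance, that `subtotal` is touched on three lines while `tax_rate` only appears in the signature and the return statement, which is the kind of pattern the circling exercise is meant to surface.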
A similar technique was also described in an ITiCSE working group paper on Program Comprehension by Izu et al.
I think that these techniques can certainly add value when learning to program, and I will continue to apply them in my lessons.
Code reading clubs for programmers
But what to do if you are already a professional? After my talk last year at StrangeLoop, in which I also proposed this idea, I was approached by Katja Mordaunt, who was very excited to try practicing code reading in a professional setting, within her company NeonTribe. She has been running a Code Club with my exercises for the past 5 months.
In all honesty, I would have expected this idea to die out soon. It seems like a nice idea of course, but would people really want to spend an hour every week reading code? Turns out, they do! The initiative was very successful, and in the meantime, 3 new clubs have spun out of Katja’s first club! Participants have shared the value that they get out of these sessions, for example:
- “Increased my confidence of reading Python as a beginner”
- “Makes me think about comments more”
- “Making my comments less noisy – not new but a visual reminder of what that can do to a codebase”
I also loved the impact the club had on things that are not code reading, like people improving interviews with a shared reading exercise instead of a whiteboard exercise! I would have never thought of this, but I do think that makes a lot of sense too and might be way less daunting than creating code.
In summary, I think we all agree that reading code is a great skill to have, and I think my concrete exercises might help in practising this, for learners and experts alike!
On GitHub you can find all info on the Code Reading Club, including my exercises, weekly notes and a retrospective. If you are considering starting a club, I would love to hear about it and sit in sometimes!