Some people take to programming very readily, whilst others find it more challenging. But what does programming involve? Here I speculate on aspects of programming as a cognitive activity.
Consider a hypothetical programmer; let’s call him Algy. Algy is tasked with writing an application that has to connect to a database, obtain various pieces of textual and numerical data, perform various operations on them, and then display the results.
Algy needs to know many things, besides the basic facts about such entities as databases and data. He needs to know how, in whatever language he is using, to connect to a database of a certain kind, how to write suitable queries, how to get the database to execute the queries, how to extract any query results from the result sets returned by the database, how to combine and format the information from the various result sets, and how to display the end result in his application. For each operation Algy wants to perform, there are specific commands and keywords he needs to know, and he needs to be able to construct statements that are valid in themselves and which connect together as a whole so as to conform to the syntax of the language. And he will quite possibly already have had to conceive the architecture of, and then build, an overall application into which the database connection/data extraction code can slot. In relation to all of these tasks he will have had to draw on considerable amounts of knowledge.
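To give a feel for how much has to be known even in a trivial case, here is a minimal sketch of the steps just listed (connect, query, extract, combine, format, display), written in Python against an in-memory SQLite database. The table names, column names, sample rows and the functions `make_demo_database` and `build_report` are all invented for illustration; nothing here is meant to represent Algy's actual application.

```python
import sqlite3


def make_demo_database():
    """Create a throwaway in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE sales (product_id INTEGER, quantity INTEGER);
        INSERT INTO products VALUES (1, 'widget'), (2, 'gadget');
        INSERT INTO sales VALUES (1, 3), (1, 4), (2, 5);
    """)
    return conn


def build_report(conn):
    """Run two queries and combine their result sets into display lines."""
    cur = conn.cursor()

    # Query 1: textual data (product names keyed by id).
    cur.execute("SELECT id, name FROM products ORDER BY id")
    names = dict(cur.fetchall())

    # Query 2: numerical data (total quantity sold per product).
    cur.execute(
        "SELECT product_id, SUM(quantity) FROM sales "
        "GROUP BY product_id ORDER BY product_id"
    )
    totals = dict(cur.fetchall())

    # Combine the two result sets and format one line per product.
    return [f"{names[pid]}: {totals.get(pid, 0)} sold" for pid in sorted(names)]


if __name__ == "__main__":
    conn = make_demo_database()
    for line in build_report(conn):
        print(line)  # "display" here is just printing to the console
    conn.close()
```

Even this toy draws on two languages (Python and SQL), a library API, and a convention for mapping rows to Python values, which is rather the point.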
What, cognitively, is involved in what Algy is seeking to accomplish? At the very outset there probably comes a conceptualization and design stage, based perhaps on a written project brief — or maybe a sketch on the back of an envelope — from a product manager or some other stakeholder. Algy has to interpret the brief and turn it into an idea for an application. If the objective is a GUI application, that is presumably in part a visualization problem (what should the running application look like?), but one where what must be envisaged is not a static image but a composite object that has latent functional potentialities, akin in some ways to a machine. (Developers sometimes speak of the ‘moving parts’ of a software system.) Probably there will be decisions to be made about how best to combine the information extracted by the different database queries, and how to display it. Perhaps the text is stored in the database in a different format from that which can be displayed, in which case Algy needs to find a way for his application to transform data from one format to the other. That might involve using a transformation-specific language distinct from the language he’s using to write his application or that used to query the database. Maybe his application will have to execute another, external, application or script (written in yet another language?) to carry out the transformation. Potential complexities abound.
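As a small illustration of the format-transformation problem, suppose (purely as an assumption) that the text is stored as XML fragments while the application can display only plain text. A sketch of the conversion in Python, using the standard library's `xml.etree.ElementTree`, might look like this; the function name `stored_to_display` and the sample fragment are hypothetical.

```python
import xml.etree.ElementTree as ET


def stored_to_display(stored):
    """Convert a stored XML fragment into a plain-text display string."""
    root = ET.fromstring(stored)
    # itertext() yields the text of the element and of all its descendants,
    # in document order, so the markup itself is dropped.
    text = "".join(root.itertext())
    # Collapse runs of whitespace (including newlines) into single spaces.
    return " ".join(text.split())


if __name__ == "__main__":
    fragment = "<p>Total for <b>widget</b>:\n  7 units</p>"
    print(stored_to_display(fragment))
```

A real application might instead hand the job to a dedicated transformation language or an external tool, which is where the additional languages mentioned above come in.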
One of the most fundamental questions relating to what Algy is doing, given all the languages potentially involved, is what it is to ‘know’ a programming language. The parallel with knowing a natural language appears close, for there is a vocabulary and a syntax, and expressions composed using the terms of the language may be meaningful or not. As with a natural language, some terms refer to objects and entities of various kinds, while others serve to establish larger causal/semantic structures (e.g. in relation to flow control). To know a language one needs to know what the key terms are, have an intuitive sense of the syntactic categories to which they belong, and grasp the rules for combining them to create meaningful statements. One way to think about the specific functions or roles of terms within a programming language is as valences or affinities for other terms; knowing about the role of some specific term serves to constrain the set of terms it can legitimately succeed, and that can legitimately succeed it.
Priming and programming fluency
It’s interesting to compare how a programmer attains familiarity with the key terms in a programming language with how natural languages are learnt, as regards how term roles are assimilated. Michael Hoey’s theory of lexical priming posits that we develop a sense of how words function in a natural language (what their roles are) not by explicitly learning the rules of grammar, but rather by being exposed over time to large numbers of word instances in their natural contexts. It is this that drives the automatic development of the appropriate associations between terms, as the brain acquires a sense of word association probabilities (Hoey 2005). Grammar, for Hoey, is thus effectively an emergent phenomenon. It does not seem unreasonable to suppose that priming is a significant contributor to the development of fluency in a programming language, even if we might baulk at the idea that priming is the fundamental driver of programming language learning in the way that Hoey contends it is for natural language acquisition. In other words, maybe learning a programming language is something of a hybrid case where lexical priming is concerned, where the understanding that comes from explicit learning of syntax is reinforced by priming effects that come from exposure to quantities of code.
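The notion of word association probabilities can be made concrete with a toy example. The sketch below counts which token follows which in a scrap of Python code and turns the counts into conditional probabilities. It is emphatically not a model of Hoey's theory or of how the brain works; the function `successor_probabilities` and the sample code are invented for illustration, and the statistic is just a simple bigram frequency.

```python
import io
import tokenize
from collections import Counter, defaultdict


def successor_probabilities(source):
    """Estimate P(next token | current token) from a sample of Python code."""
    # Keep only names/keywords and operators; newlines and indentation tokens
    # are dropped, so pairs may span line boundaries.
    tokens = [
        tok.string
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type in (tokenize.NAME, tokenize.OP)
    ]
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return {
        current: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for current, followers in counts.items()
    }


SAMPLE = '''
for x in xs:
    if x in seen:
        continue
for y in ys:
    print(y)
'''

if __name__ == "__main__":
    probs = successor_probabilities(SAMPLE)
    print(probs["for"])  # which tokens followed 'for', and how often
    print(probs["x"])    # in this sample 'x' is always followed by 'in'
```

On a large body of code, tables like this would presumably show strong regularities (for instance, what tends to follow `for` or `in`), which is the kind of regularity that exposure-based learning could pick up on.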
There are also several notable quantitative differences between natural languages and programming languages, from which might follow important qualitative differences. The first concerns the size of the vocabularies involved: most programming languages consist of several dozen or so keywords, whereas natural language vocabularies are some hundreds of times larger (at the individual speaker level). The second contrast relates to syntactic complexity. Now natural language grammars — viewed as codifications of emergent syntax, to buy into lexical priming — are no less complex than the syntax of a typical programming language. (Au contraire, surely.) So it seems reasonable to suppose that natural language assertions of some given length are liable to be more varied than assertions of the same length in any programming language. What this might mean — and this is something of a leap, I acknowledge — is that more cognitive capacity is available in principle to the programmer for conceiving structures over large scales in a programming language than is available to the user of a natural language. Thus, with exposure to sufficient volumes of code, a programmer might find it possible to conceive of code running to thousands of lines. Indeed, isn’t this what we already know to be the case: that experienced programmers do have such a facility? Moreover, the capability difference between the least able and the most able programmers is marked, as regards speed of coding and time spent debugging. Should we say this is down to priming effects, or imaginative capability? Most likely a combination of the two, but weighted — so my intuitions incline me — towards the latter?
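The vocabulary-size point can be checked for one language at least. Python exposes its reserved words through the standard `keyword` module, so a short sketch can list and count them; the function name `keyword_vocabulary` is my own. Note the caveat that keywords are only the core of what a programmer must know: built-in functions and library names enlarge the working vocabulary considerably.

```python
import keyword


def keyword_vocabulary():
    """Return Python's reserved keywords, sorted alphabetically."""
    return sorted(keyword.kwlist)


if __name__ == "__main__":
    words = keyword_vocabulary()
    # The exact count varies a little between Python versions.
    print(len(words), "reserved keywords")
    print(", ".join(words))
```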
Software applications as explanations
Talk of imagination brings me to another radical idea: that the artefacts the programmer produces, i.e. tech solutions, stand comparison with the explanations we routinely give of events and phenomena in the world. Explanation is a well-developed branch of philosophy of science, but as might be expected it is a topic that interests psychologists too. One explanatory phenomenon which has received considerable attention is the so-called self-explanation effect, whereby explaining a problem or situation to oneself or another person often enables one to reach an understanding that is otherwise elusive (Chi et al. 1994; Ainsworth & Loizou 2003; see also Lombrozo 2016). My rough take on the phenomenon is that it stems from the fact that when we explain, we are compelled to straighten out our thoughts, as it were, by identifying salient relationships between relevant elements. We must relate our thoughts to the properties of those elements, make explicit the operative patterns of causality, be clear about issues of temporal sequence, and so on. Often this amounts to a quasi-imaginative process in which sets of ideas are, perhaps partly sub-consciously, explored, manipulated and configured. Successfully imagining how something happens or why it is the case means that our thoughts become structured in such a way that the dispositions we have around individual conceptual elements (some of which dispositions we might refer to as beliefs) are mutually supporting.
Prior to the act of explaining, we entertain a set of ideas and beliefs, but they are somewhat disconnected and disorganized; the relations between different elements in the complex are not fully worked out; we have not configured them such that their dispositions sum constructively. An explanation is a serialized version of the original idea complex, where the conceptual elements form a dispositionally coherent whole in which term affinities (and perhaps other, higher order, constraints) are honoured, and the whole can be verbally expressed. When we achieve a concatenation of ideas where one thought leads naturally to the next, something seems to us to be intelligible, and we experience the feeling of understanding. When such an explanation is received by another, its structure is such as to engender an aptly configured set of ideas in that other’s mind, and the expression is thus explanatory for the recipient.
A tech solution is, I’d like to claim, rather like an explanation, in that as with the pre-explanatory state described above, the starting point (or at least one of the starting points) is a set or complex of inter-related ideas. In the case of application building, these ideas don’t get linearized into a temporal sequence, but rather they get worked out and transformed into a machine-like causal structure, in a process of manipulation, exploration and configuration that has to satisfy multiple constraints — syntactic, functional, and possibly also aesthetic. And at the code block level, there is a quasi-linguistic linearization of ideas. The important point is that almost never do the initial ideas get worked out or configured fully unless, or until, an attempt is made to build the application. This, I suspect, is generally for reasons of cognitive burden: the ideas are too numerous or too complex for the mind to readily manipulate them ‘in memory’. Hence, to see whether a software idea is viable, or to work out how best to implement an idea, as often as not there’s no substitute for rolling up one’s sleeves and getting stuck into some practical software development.
 This glosses over some important differences between programming languages, which potentially impact on the extent to which, and the respects in which, they impose cognitive loads of various kinds on their users. XSLT and Prolog spring to mind as ‘troublesome’ languages, and I suspect their difficulty stems from the way in which lines of code written in these languages often fail to reflect the complexities of internal processing that determine the results of code execution. In other words, it can be hard for the programmer to mentally simulate the results of running the code they write. Languages like these generally fail to achieve widespread or enduring adoption.
 See e.g. Weinberg (1971), p.135, where productivity differences of 30:1 are granted as plausible, albeit in relation to specific aspects of the programming role. (It is suggested that because individual programmers excel in different aspects of their roles, but few excel in all, overall differences for the role viewed in toto will be smaller.) Differences of these sorts of magnitude have been disputed, however, e.g. by Nichols (2019).
 Although often rules of code layout implicate, to a greater or lesser extent, a second spatial dimension, making coding arguably more akin to writing poetry than prose.
Ainsworth, S., and Loizou, A.T. (2003) The effects of self-explaining when learning with text or diagrams. Cognitive Science 27: 669–681.
Chi, M.T.H., de Leeuw, N., Chiu, M.-H., and LaVancher, C. (1994) Eliciting self-explanations improves understanding. Cognitive Science 18: 439–477.
Hoey, M. (2005) Lexical Priming: A new theory of words and language. Routledge.
Lombrozo, T. (2016) Learning isn’t just about getting the right information. YouTube: https://www.youtube.com/watch?v=YZDqlxbfoVQ
Nichols, W.R. (2019) The End to the Myth of Individual Programmer Productivity. IEEE Software 36(5): 71–75.
Weinberg, G.M. (1971) The Psychology of Computer Programming. Van Nostrand Reinhold.