Friday, May 9, 2008

Is Functional Programming the new Python?

Back in 2004 Paul Graham wrote an essay on the Python Paradox:

if a company chooses to write its software in a comparatively esoteric language, they'll be able to hire better programmers, because they'll attract only those who cared enough to learn it. And for programmers the paradox is even more pronounced: the language to learn, if you want to get a good job, is a language that people don't learn merely to get a job.

Some tentative support for this theory comes from a study of programming languages done in 2000. The same task was given to over 80 programmers. The chart shows how long they took. Obviously the average for some languages was a lot less than for others, but the interesting thing for the Python Paradox is the variability. Java had huge variability: one developer took over 60 hours to complete the task. Meanwhile the Python developers were the most consistent, with the lowest variance as a percentage of the mean. I suspect (but can't prove) that this was because of the kind of programmers who wrote in Java and Python back in 2000. Java was the language of the Web start-up and the dot-com millionaire, but Python was an obscure open source scripting language. The Pythonistas in this study didn't learn it to get a job, but many of the Java programmers did.
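As a rough illustration of "variance as a percentage of the mean", here is a minimal sketch using the coefficient of variation (sample standard deviation divided by the mean). That is one common way to compare spread between groups with different averages; whether the study used exactly this measure is an assumption. The function name and the numbers are hypothetical, made up for illustration, and are not data from the study.

```python
from statistics import mean, stdev


def coefficient_of_variation(hours):
    """Sample standard deviation expressed as a fraction of the mean."""
    return stdev(hours) / mean(hours)


# Hypothetical completion times in hours (illustrative only, NOT study data).
steady_group = [3.0, 4.0, 5.0]     # everyone finishes in roughly the same time
erratic_group = [2.0, 4.0, 60.0]   # one developer takes far longer than the rest

for name, hours in [("steady", steady_group), ("erratic", erratic_group)]:
    cv = coefficient_of_variation(hours)
    print(f"{name}: mean={mean(hours):.1f}h, spread={cv:.0%} of the mean")
```

Dividing by the mean is what makes the comparison fair: a language whose programmers are slower on average is not penalised for that alone, only for being inconsistent.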

But if this study was repeated today I bet the spread for Python would be a lot larger. Maybe still not as big as Java, but more like C++ or Perl. Because today you can get a good job writing Python. A quick check of job listings found 1450 Python jobs against 7732 C++ jobs and 15640 jobs for Java. Python hasn't taken over the world, but the jobs are there.

So the smart employers and developers need something new to distinguish themselves from the crowd, and it looks like functional programming might be it. Programming Reddit carries lots of cool stuff about Haskell, and job adverts are starting to list a grab-bag of functional languages in the "would also be an advantage" list. For instance:
- Programming experience with more esoteric and powerful languages for data manipulation (Ruby, Python, Haskell, Lisp, Erlang)
So it looks like the with-it job-seekers and recruiters may be starting to use functional programming to identify each other, just as they used Python up to 2004.

Update: Oops. I just remembered this post which started me thinking along these lines.

Sunday, May 4, 2008

An Under-Appreciated Fact: We Don't Know How We Program

I was talking to a colleague from another part of the company a couple of weeks ago, and I mentioned the famous ten-to-one productivity variation between the best and worst programmers. He was surprised, so I sketched some graphs and added a few anecdotes. He then proposed a simple solution: "Obviously the programmers at the bottom end are using the wrong process, so send them on a course to teach them the right process."

My immediate response, I freely admit, was to open and shut my mouth a couple of times while trying to think of a response more diplomatic than "How could anyone be so dumb as to suggest that?". But I have been mulling over that conversation, and I have come to the conclusion that the suggestion was not dumb at all. The problem lies not with my colleague's intelligence but in a simple fact. It is so basic that nobody in the software industry notices it, but nobody outside the industry knows it. The fact is this: there is no process for programming.

Software development abounds with processes of course: we have processes for requirements engineering, requirements management, configuration management, design review, code review, test design, test review, and on and on. Massive process documents are written. Huge diagrams are drawn with dozens of boxes to try to encompass the complexity of the process, and still they are gross oversimplifications of what needs to happen. And yet in every one of these processes and diagrams there is a box which basically says "write the code", and ought to be subtitled "(and here a miracle occurs)". Because the process underneath that box is very simple: read the problem, think hard until a solution occurs to you, and then write down the solution. That is all we really know about it.

To anyone who has written a significant piece of software this fact is so obvious that it seems to go without saying. We were taught to program by having small examples of code explained to us, and then we practiced producing similar examples. Over time the examples got larger and the concepts behind them more esoteric. Loops and arrays were introduced, then pointers, lists, trees, recursion, all the things you have to know to be a competent programmer. Like many developers I took a three-year degree course in this stuff. But at no point during those three years did any lecturer actually tell me how to program. Like everyone else, I absorbed it through osmosis.

But to anyone outside the software world this seems very strange. Think about other important areas of human endeavor: driving a car, flying a plane, running a company, designing a house, teaching a child, curing a disease, selling insurance, fighting a lawsuit. In every case the core of the activity is well understood: it is written down, taught and learned. The process of learning the activity is repeatable: if you apply yourself sufficiently then you will get it. Aptitude consists mostly of having sufficient memory capacity and mental speed to learn the material and then execute it efficiently and reliably. Of course in all these fields there are differences in ability that transcend the mere application of process. But basic competence is generally within reach of anyone with a good memory and average mental agility. It is also true that motor skills such as swimming or steering a car take practice rather than book learning, but programming does not require any of those.

People outside the software industry assume, quite reasonably, that software is just like all the other professional skills; that we take a body of knowledge and apply it systematically to particular circumstances. It follows that variation in productivity and quality is a solvable problem, and that the solution lies in imposing uniformity. If a project is behind schedule then people need to be encouraged to crank through the process longer and faster. If quality is poor then either the process is defective or people are not following it properly. All of this is part of the job of process improvement, which is itself a professional skill that consists of systematically applying a body of knowledge to particular circumstances.

But if there is no process then you can't improve it. The whole machinery of process improvement loses traction and flails at thin air, like Wile E. Coyote running off a cliff. So the next time someone in your organisation says something seemingly dumb about software process improvement, try explaining that software engineering has processes for everything except actually writing software.

Update: Some of the discussion here, and on Reddit and Hacker News, is arguing that many other important activities are creative, such as architecture and graphic design. Note that I didn't actually mention "architecture" as a profession, I said "designing a house" (i.e. the next McMansion on the subdivision, not one of Frank Lloyd Wright's creations). People give architects and graphic designers room to be creative because social convention declares that their work needs it. The problem for software is that non-software-developers don't see anything creative about it.

The point of this post is not that software "ought" to be more creative or that architecture "ought" to be less. The point is that we need to change our rhetoric when explaining the problem. Declaring software to be creative looks to the rest of the world like a sort of "art envy", or else special pleading to be let off the hook for project overruns and unreliable software. Emphasising the lack of a foundational process helps demonstrate that software really does have something in common with the "creative" activities.