This was originally posted to my old blog on December 14th, 2006. It was discussed on Reddit, so please don't repost it there. Since it was posted, even stronger evidence has emerged showing the productivity increase from functional languages.
-----------------------------------------------------
This is the promised posting about why new software technology finds it so difficult to gain acceptance even when major improvements are likely.
To give you some idea of the scale of the problem, in 1997 Ulf Wiger wrote a paper entitled "Four-fold Increase in Productivity and Quality". It described a practical experience with a mature technology (the Erlang language) on a significant new system by a large company.
Now Ulf Wiger is a well-known proponent of Erlang, so the uncharitable might suspect some degree of bias and selective reporting. But however skeptical one might be of Ulf Wiger's claims, it would be a big stretch to think that he had invented or imagined the whole thing. The most likely explanation is that the reported results are broadly accurate.
So how come we are not all now programming in Erlang? I believe the answer lies in the “hill climbing” approach that most companies take to optimising their behaviour.
If you are lost on a foggy mountain then one way to reach the top is to simply head up hill. You don’t need a map or compass to tell which way that is, and eventually you will reach a point where every way is downhill. That is the top. Businesses are very much in this situation. There are lots of things a business could do that might work to increase profits. Some are big steps, others are small. The likely result, especially of big steps, is shrouded in fog. So the best thing is to move up the hill of profitability in small steps. Eventually you get to the top, and then you can rest for a while.
The trouble with this algorithm is that you are likely to have climbed a foothill, and when the fog clears you see that the real mountain is somewhere else.
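For the programmers in the audience, here is a minimal sketch of the hill-climbing idea in Haskell (purely illustrative; the function and the toy landscape are invented for this example): take the best small uphill step until every direction is downhill, and you stop on a local maximum, which may well be only a foothill.

    -- A minimal, illustrative hill-climbing sketch (names invented for the example).
    -- Starting from x, repeatedly move to the best neighbouring point, stopping
    -- when no neighbour is higher. The result is a local maximum, not necessarily
    -- the highest peak on the landscape.
    hillClimb :: Ord b => (a -> [a]) -> (a -> b) -> a -> a
    hillClimb neighbours height = go
      where
        go x = case filter (\y -> height y > height x) (neighbours x) of
                 []      -> x                 -- every way is downhill: stop here
                 betters -> go (best betters) -- otherwise take the best uphill step
        best = foldr1 (\p q -> if height p >= height q then p else q)

    -- Example: on the curve h(n) = -(n-3)^2 a climb from 0 stops at n = 3,
    -- even if a far higher peak exists somewhere else entirely.
    main :: IO ()
    main = print (hillClimb (\n -> [n - 1, n + 1]) (\n -> negate ((n - 3) ^ 2)) (0 :: Int))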
Here the analogy breaks down because unlike real mountains the business environment keeps changing. Mountains go up and down over millions of years. In business the equivalent happens much faster. In fact many businesses have to run as fast as they can just to keep up with the foothills.
So now what happens when someone claims to have discovered a huge mountain in the distance? Three questions will immediately occur to the managers:
- Is it real?
- Will we survive long enough in the Lowlands of Unprofitability to get there?
- Will it still be there when we arrive?
All three are extremely good questions, and I’ll analyse them in detail below. For brevity I’ll talk about new programming languages but the same arguments apply to many new software technologies, especially the ones that affect the way that you program.
Is it real?

Managers in the software business are bombarded by sales pitches for things that will make their software faster, cheaper and better. 90 out of 100 of these pitches are for pure snake oil. A further 9 are stuff that will work, but nowhere near as well as advertised. The last 1 will change the world, and possibly make you a fortune if you time it right. The trouble is, how do you find that diamond in all the dross? Each new sales pitch requires a lot of time and effort to evaluate, most of which will give no return on the investment. And in the meantime there are those foothills to keep up with. So managers learn to listen to the sales pitch, nod sagely, and then carry on as before.
I say “managers” because they are usually the ones who make the decisions, and are therefore the target of the sales pitches. Sometimes they can be evaded. The early days of Linux adoption were punctuated by anecdotes of IT managers declaring that Linux was verboten in their shop, only to be gently told that it was already running some piece of key infrastructure.
Will we survive long enough to get there?

At first sight a new programming language looks simple to deploy: just start using it. Unfortunately things are not that simple.
Any significant project is going to require a team of developers, and then on-going maintenance and development of new versions. This means putting a team of people together who all know the language, and then keeping them on the staff. Do you train them? If so how long is it going to take them to get productive? In the days of the great OO paradigm shift it was generally agreed to take months. On the other hand you could hire them, but how many people out there know the language? Probably not very many. Either way, if somebody leaves then replacing them will be problematic.
A software house that has been earning money for a while will have been doing so on the back of some body of software (the exception being pure "body shops" who just write code for customers). This software is the major strategic asset of the company, and in practice most of the development effort in the company is devoted to maintaining and extending existing packages. The only way that you can apply a new programming language to an existing software package is to throw the whole thing away and rewrite it. At the very least this is a huge and risky enterprise: revenue from the old legacy system will drop off fast if you stop developing it, and in the meantime you just have to hope that the new system gets delivered on time and on budget, because if it doesn't you will go bust. Of course a rewrite of this sort will eventually be necessary, but the sad thing is that by then the company is often not in good enough financial shape to take the project on.
Most software companies have diversified and do not depend on one monolithic software asset, so in theory you could do the rewrites one system at a time. This is still expensive and risky, but at least you don’t have to bet the company. But typically each major asset has a division looking after it, and from within the division the sums look just the same as for a smaller company with one big asset. So the only people who can make such a decision are the board of directors. I’ll come back to this point later.
The last option for a new programming language is a completely new product line. Here you are starting with a clean sheet. You still have training and recruitment issues, not to mention long term support, and you have to put together a whole new toolchain for the developers, but the proposition does at least look sensible.
Will it still be there when we arrive?

New technologies often don't hit the big time. If the suppliers go out of business, or the open source community loses interest, then anyone who adopted the technology early is going to be left high and dry. A previous employer of mine opted for Transputers in a big digital signal processing project. The Transputer was technically ideal, but then INMOS went out of business.
Geoffrey Moore has described a “chasm” between the Innovator market (who will buy anything just because it is new) and the Early Adopters (who make a rational decision to invest in new things). I’m not convinced that there is really a chasm: people seem to have continuous variations rather than discrete types. But either way there is a real obstacle here. In effect everyone is waiting for everyone else to jump first.
So those were the rational reasons why companies tend to avoid new technology. Now for the, err, less rational reasons.
Most of these come down to the fact that companies are not like Hobbes' Leviathan. As described in The Tipping Point, once you get past about 150 people in an organisation the people in it cannot keep track of everyone else. Hence you find lots of people at all levels working hard to optimise their bit, but inadvertently messing up the stuff done by someone else. Bear with me while I take you on a short tour of the theory and practice of management motivation.
Companies try hard to reward people who do the Right Thing (at least, the Right Thing for the company). This generally means short term evaluation on how they are doing at their main task. Sales people, for instance, get paid partly by commission, which is a very direct linkage of short term performance to pay. Other people get annual bonuses if their bosses recommend them for it, and of course promotion happens from time to time. And it's backed up by social pressure as well: these people are being rewarded for doing the Right Things, and everyone else takes note.
All of this is absolutely vital for a company to survive: you have to keep your eye on the ball, nose to the grindstone and ear to the ground. However, as Clayton Christensen describes in The Innovator's Dilemma, it also leads to a problem when a "disruptive technology" arrives.
An example of what goes wrong was Xerox PARC. As is well known, the researchers at PARC pretty much invented the modern office software suite, along with graphical user interfaces, laser printers and Ethernet. The usual myth has it that Xerox executives were too dumb to realise what they had, but the real story is more interesting. Xerox did actually go to market with a serious office machine, called the Xerox Star. You or I could sit down at one of those things and feel right at home. But when it was launched in 1981 it only sold 25,000 units, which was far too few to make a profit.
The reason (I believe, although I haven’t seen this anywhere else) is that Xerox salesmen (and they were almost all men at that time) were experts at selling big photocopiers to big companies. That was the bread-and-butter of Xerox business, and the quarterly bonuses of those salesmen depended on doing that as much as possible. Anything else was a distraction. So when this funny computer thing appeared in their catalog they basically ignored it. If someone specifically asked for some I’m sure that any salesman would be happy to fill the order, but they weren’t going to waste valuable face time with corporate buyers trying to explain why a whizzy and very expensive piece of equipment was going to revolutionise everything. So Xerox concluded that there was no market for networked desktop computers and sold the whole concept off to Steve Jobs in exchange for some Apple stock.
Christensen has a number of other examples of this phenomenon, all of which are market based. This is probably because you can observe the market behaviour and success of a company, whereas just about everything else they do tends to be visible only on the inside, and often not even then. But the same logic applies.
Suppose you are a project manager, entrusted with Project X to develop some new software. You have had your plans and associated budget approved by the Committee That Spends Money (every company has one, but the name varies). And then some engineer walks into your office and starts talking about a programming language, ending with “… so if you used this on Project X you could do it for a quarter of the cost”.
Now, strange to relate, a project manager will not actually be rewarded for coming in 75% under budget. Instead he (even today it is usually “he”) will be told off for not submitting a better estimate. Senior managers do not like padded estimates because it prevents the money being invested more profitably elsewhere. Coming in a bit under your original estimate is OK: it shows you are a good manager. But coming in way under shows you are either bad at estimation or just plain dishonest (managers watch Star Trek too). Besides, you already have approval for your original plan, so why bother changing course now?
But you have also been around a bit longer than this engineer, and have seen some technology changes. So you ask some pertinent questions, like who else has used it, how long it will take the programmers to learn it, and where the support is going to come from. At the end of this you conclude that, even if this technology is as good as claimed, if you use it on Project X you stand a very good chance of blowing your entire budget just teaching your programmers to use it. This will not get you promoted, and might even get you fired for incompetence. So you thank the engineer for bringing this matter to your attention, promise to look into it carefully, and show him the door.
So now the engineer tries going up the ladder. Next stop is the Product Manager, who looks after the product line that Project X will fit into. He can see that there just might be a case for making the investment, but he has already committed to a programme of improvements and updates to the existing product line to keep it competitive. His annual bonus depends on delivering that plan, and this new language will obviously disrupt the important work he has been entrusted with. So he too thanks the engineer and points him out of the door.
Next stop is the Chief Technology Officer. He is vaguely aware of programming languages, but being a wise man he seeks advice from those who understand these issues (most geeks will find this surprising, but very few senior managers got there by being stupid). Meaning, of course, the project and product managers mentioned earlier, possibly with a trusted engineer or two as well.
These engineers know about programming. In fact they owe their position to knowing more about it than anyone else. This new language will make that valuable knowledge obsolete, so they are not well disposed to it. On top of that they find the technical arguments in favour of the new language highly unconvincing. Paul Graham has christened this phenomenon the Blub Paradox. If you haven't already read his essay please do so: it explains this far better than I ever could.
In short, everyone in the company with any interest in the selection of a new programming language can see a lot of very good reasons why it would be a bad idea. The only people who disagree are the ones who have taken the trouble to learn a new language and understand its power. But they are generally in a minority of one.
And this is true in every company. Every company has a few eccentric engineers who try to explain why this or that new technology would be a great investment. Sometimes they are even right. But they are almost never taken seriously. And so great technologies that could actually save the world a great deal of money on software development (not to mention improve quality a lot as well) languish on the shelf.
December 9th, 2006 at 4:38 pm
Paul:
I enjoyed your first article quite a bit - it got me thinking about technical language issues again (always fun).
I'd like to comment on your update to the original article. Specifically, I have some comments regarding C++.
C++ is not an "old" language; it incorporates many language features of more "modern" languages, including exceptions, automatic memory management (via garbage collection libraries and RAII techniques), and templates, a language feature that is only available in C++ and that provides support for generic programming and template metaprogramming, two relatively new programming paradigms. Yes, C++ has been around a while, but until I see programmers constantly exhausting the design and implementation possibilities of C++, I won't call the language "old."
C++ was not designed to support just OO programming: From “Why C++ Isn’t Just An Object-Oriented Programming Language” (http://www.research.att.com/~bs/oopsla.pdf):
“If I must apply a descriptive label, I use the phrase ‘multiparadigm language’ to describe C++.”
Stroustrup identifies functional, object-oriented, and generic programming as the three paradigms supported by C++, and I would also include metaprogramming (via C++ templates or Lisp macros) as another paradigm, though it is not often used by most developers.
Of course, we should also keep in mind Stroustrup's statement regarding language comparisons ("The Design and Evolution of C++", Bjarne Stroustrup, 1994, p. 5): "Language comparisons are rarely meaningful and even less often fair."
Take care, and have a good weekend!
Stephen
December 12th, 2006 at 11:26 am
I found it so weird that, on the one hand, you argue that Haskell is fast (to the extent that it might even be faster than some compiled languages such as C++), while on the other hand you said "where correctness matters more than execution speed its fine today".
Does that sound paradoxical?
December 12th, 2006 at 3:59 pm
Paul:
“I think that Dijkstra had it right: a line of code is a cost, not an asset. It costs money to write, and then it costs money to maintain. The more lines you have, the more overhead you have when you come to maintain or extend the application”
By that measure, there’s no such thing as an asset. Think about that a moment - someone buys a general ledger or CAD/CAM system and modifies it as companies do. Either system reduces staff, provides more accurate information much more quickly, and renders the company more competitive. Take it away and what happens?
It’s been my experience that while these systems require maintenance (and sometimes a lot) they usually result in a net reduction in staff and the cost of doing business. And some types of systems provide a clear competitive edge as well. I think that makes many systems just as much an asset as a house, factory building, or a lathe.
Interesting article. Thanks.
Another Paul
December 12th, 2006 at 6:07 pm
>> An order of magnitude is a factor of 10, no less
> Well, the Wikipedia entry does say about 10. All this stuff is so approximate that anything consistently in excess of 5 is close enough.
0.5 orders of magnitude = power(10.0,0.5) = sqrt(10.0) = 3.1623 (approx)
1.5 orders of magnitude = power(10.0,1.5) = sqrt(1000.0) = 31.623 (approx)
If we are rounding off, a factor of 4 is about one order of magnitude; also, a factor of 30 is about one order of magnitude.
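Put another way, here is a small illustrative Haskell check of the same arithmetic (the function name is invented): the nearest whole order of magnitude of a factor f is just the rounded base-10 logarithm.

    -- Illustrative check of the arithmetic above: the nearest whole order of
    -- magnitude of a factor f is the rounded base-10 logarithm, so any factor
    -- between sqrt 10 (~3.16) and sqrt 1000 (~31.6) counts as one order of magnitude.
    nearestOrderOfMagnitude :: Double -> Integer
    nearestOrderOfMagnitude f = round (logBase 10 f)

    -- nearestOrderOfMagnitude 4  == 1
    -- nearestOrderOfMagnitude 30 == 1
    -- nearestOrderOfMagnitude 3  == 0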
December 12th, 2006 at 6:36 pm
You missed my point with Python, or at least failed to address it.
My point wasn't that Python is also good. My point was that you leapt from "10x improvement" to "it must be the chewy functional goodness!" But your logic falls down in the face of the fact that Python, Perl, Ruby, and a number of other non-functional languages also show a 10x improvement over C++; therefore it clearly is not a sound argument to just leap to the conclusion that "it must be the chewy functional goodness!" when there are clearly other factors in play.
I’m not criticizing Haskell or functional programming, I’m criticizing your logic, and you’ve still got nothing to back it up.
(This is par for the course for a claim of a silver bullet, though.)
December 12th, 2006 at 8:14 pm
“Libraries and languages are complicit: they affect each other in important ways. In the long run the language that makes libraries easier to write will accumulate more of them, and hence become more powerful.”
This argument has a large flaw in it, namely that the current state of libraries doesn't reflect this claim. The largest and most powerful collections of libraries seem to be .NET, CPAN, and the Java libs, certainly not Lisp libraries.
But the advocates of Lisp would argue that it’s the most powerful language, and it’s clearly been around for a long time, yet the Lisp community has not accumulated the most powerful collection of libraries. So unless the next 40 years are going to be different from the previous 40 years, you can’t really assert that language power is going to automatically lead to a rich set of libraries.
I stand by my original comment to the previous article that programming is more about APIs and libraries than about writing your own code, and that if you are focused on measuring code-writing performance, you are just measuring the wrong thing.
I also disagree with the claim that this is unmeasurable because doing a real-world test is too expensive. As long as the project is solvable in a few programmer-weeks, you can test it out with different languages. I took a computer science class (Comp314 at Rice) where we were expected to write a web browser in 2 weeks. It wouldn’t be that hard to have a programming test which incorporated a database, a web or GUI front end, and some kind of client/server architecture, e.g. implementing a small version of Nagios, or an IM client, or some other toy application.
I’m sorry but writing a command line application that parses a CSV file and does some fancy algorithm to simulate monkeys writing Shakespeare is about as relevant to modern software engineering as voodoo is to modern medicine.
December 12th, 2006 at 8:27 pm
pongba:
I’m arguing that Haskell programs are faster to *write*. Execution speed is a much more complicated issue. FP tends to lose in simple benchmarks, but big systems seem to do better in higher level languages because the higher abstraction allows more optimisation.
December 12th, 2006 at 10:54 pm
Another Paul:
The functionality that a package provides is an asset, but the production and maintenance of each line in that package is a cost. If you can provide the same asset with fewer lines of code then you have reduced your liabilities.
Paul.
December 12th, 2006 at 11:12 pm
Jeremy Bowers:
Teasing apart what it is about Haskell and Erlang that gives them such a low line count is tricky, because it is more than the sum of its parts. One part of it is the high level data manipulation and garbage collection that Python shares with functional languages. Another part of it is the chewy functional goodness. Another part, for Haskell at least, is the purity. OTOH for Erlang it is the clean and simple semantics for concurrency.
What I see in the results from the Prechelt paper is that Python was, on average, about 3 times better than C++ while the average Haskell program (from a sample of 2) was about 4 times better. Actually the longer Haskell program was mine, and I was really embarrassed when someone else showed me how much simpler it could have been.
In terms of pure line count I have to concede that Python and Haskell don't have a lot to choose between them. A 25% improvement isn't that much. It's a pity we can't do a controlled test on a larger problem: I think that Haskell's type system and monads are major contributors to code that is both reliable and short. Unfortunately I can't prove it, any more than I could prove that garbage collection was a win back in the days when I was advocating Eiffel over C++.
Paul.
December 12th, 2006 at 11:59 pm
If you cannot “tease apart” what it is about Haskell and Erlang that makes them so productive then you cannot say that any one improvement is a silver bullet. It just feels truthy to you. Furthermore, if you are presented with counter-examples in the form of Python and Ruby then surely you must discard your thesis entirely. The best you can say is that there exist non-functional languages that are 10 times less productive than some functional languages for some projects.
Duh.
December 13th, 2006 at 12:26 am
Sam Carter:
On languages with expressive power gathering libraries: point mostly conceded, although Perl certainly is a very expressive language, so I don't think it supports your point, and .NET has Microsoft paying its Mongolian hordes, so it's not a fair comparison.
There are two sorts of libraries: general purpose ones (e.g. data structures, string manipulation, file management) that get used in many applications, and vertical libraries (HTTP protocol, HTML parsing, SMTP protocol) that are only useful in specific applications. There is no hard dividing line of course, but the usefulness of a language for general purpose programming depends on the language and its general purpose libraries. The vertical libraries have a big impact for applications that use them, but not elsewhere. So I would generally judge a language along with the general purpose libraries that are shipped with it. The special purpose libraries are useful as well, but it's a secondary consideration.
Paul.
December 13th, 2006 at 12:33 am
Sam Carter (again):
Sorry, just looked back at your post and realised I’d forgotten the second point.
A worthwhile test is going to take about 10 versions to average out the impact of different developers. So that's 2 weeks times 10 coders, or 20 developer-weeks: almost half a man-year. Say a coder is on $30K per year and total cost of employment is three times that (which is typical). In round numbers that is $40-50K per language. Ten languages will cost the best part of half a million dollars to evaluate. Not small beer.
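Spelling the same back-of-envelope sum out in code (the figures are just the assumptions above, not real data):

    -- Back-of-envelope cost of the proposed evaluation, using the assumed figures
    -- above (illustrative only, not real data).
    salary, totalCost, weeklyCost, perLanguage, allLanguages :: Double
    salary       = 30000            -- assumed annual salary
    totalCost    = 3 * salary       -- total cost of employment ~ 3x salary
    weeklyCost   = totalCost / 52
    perLanguage  = weeklyCost * 20  -- 10 coders x 2 weeks = 20 developer-weeks
    allLanguages = 10 * perLanguage -- ten languages under test

    -- perLanguage  ~ 35,000 (call it $40-50K with overheads and contingency)
    -- allLanguages ~ 350,000: the best part of half a million dollars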
Of course you could use students, but on average they will know Java or Javascript better than Python or Haskell, so how do you correct for that?
Paul.
December 13th, 2006 at 7:10 am
I always hear people saying that, but I really don’t get it.
I know that *theoretically* abstraction (or the absence of side effects, etc.) gives more opportunity for optimization, but I have never seen anyone show real data that *really* proves it.
One question constantly annoys me: if a higher level of abstraction allows more optimization, then why does .NET put the burden of discriminating between value types and reference types on us programmers? Shouldn't referential transparency handle this better?
December 13th, 2006 at 10:58 am
I have two specific criticisms (and one general one) to make about your line of argumentation:
First, I think you do not adequately address the criticisms about lines of code as a metric. The cost of a line of code is the sum of five factors: (a) Difficulty of formulating the operation involved (original coder*1), (b) Difficulty of translating that operation into the target programming language (original coder*1), (c) Difficulty of parsing the code involved to understand what the line does (maintainer*n), (d) Difficulty of later understanding the purpose of that operation (maintainer*n), and (e) Difficulty of modifying that line while keeping it consistent with the rest of the program (maintainer*n).
(a) and (b) are done only once, but (c), (d), and (e) are done many times whenever the program needs to be fixed or modified. Brooks' argument was specifically that in the general case the time for (a) is more than 1/9 the time for (b), and the time for (d) is more than 1/9 the time for (c) and (e). This is important because (a) and (d) are both language and tool independent.
When comparing the lines of code from different languages, it is important to realize that the formulation of the operations and the understanding of purpose are spread across those lines. And the verbosity of the language usually doesn’t impede either of these problems (unless it is extreme).
For instance, take the creation of an iterator or enumeration in C++ or Java respectively and compare that to creating a fold function in Scheme. These are roughly equivalent tasks. In C++, an iterator is defined first by defining a class with various access operators like *, ->, ++ and -- and then implementing them. This adds a lot of baggage because there are half a dozen or so functions that must be defined and there is a separate class specification. In contrast, a Scheme fold function is much simpler from the language perspective. A single function is defined rather than half a dozen. It will almost certainly have fewer lines, possibly by 4 or 5 times.
But let us look at what the creation of the iterator or fold function means from the perspective of items (a) and (d). Both of these are common idioms in their respective languages, so all of the code specifically dealing with iteration/folding is trivial to conceptualize and trivial to understand the purpose of. The difficulty in writing either a custom iterator or a custom fold function lies within the subtleties of the iteration. If it is a tree, what information needs to be maintained and copied to successive iterations (whether that be in the form of state, or in the form of argument passing)? Are there multiple kinds of iterations? How would they be supported? (For example, sometimes a user wants to traverse a tree in pre-order, sometimes in post-order, sometimes in-order, and sometimes level by level in a breadth-first order.) These are the questions which the original coder and the later maintainers will have to contend with. And these are really orthogonal to lines of code counts.
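To make the comparison concrete, here is roughly what the fold side looks like (a minimal sketch in Haskell rather than Scheme, with an invented tree type, but the point is the same): the fold itself is one short function, while the genuinely hard questions above about the traversal remain exactly the same.

    -- A minimal sketch (in Haskell rather than Scheme, with an invented tree type,
    -- but the point is the same): a custom fold over a binary tree is one short
    -- function, where a C++ iterator needs a class plus half a dozen operators
    -- and the bookkeeping state to resume the traversal.
    data Tree a = Leaf | Node (Tree a) a (Tree a)

    -- In-order fold; pre-order and post-order variants just reorder the same
    -- three pieces, which is where the real design questions live.
    foldTree :: (b -> a -> b) -> b -> Tree a -> b
    foldTree _ z Leaf         = z
    foldTree f z (Node l x r) = foldTree f (f (foldTree f z l) x) r

    -- e.g. foldTree (+) 0 t sums the elements of t, and
    --      foldTree (flip (:)) [] t collects them in reverse in-order.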
But there is another factor at work here which makes lines of code a faulty cross-language measurement. Every language has a grain to it. If you program with the grain, then any difficulty will be easily solved by the tools in the language. If you program against the grain, then you will run into difficulty after difficulty. This applies to fundamental language properties. You can bypass the type system in C++ and avoid all the type checks, but it is cumbersome and unpredictable if you do it wrong. Ruby allows you to be much more flexible with types and provides a safety net. If you try to enforce a more strict typing in Ruby, then you will have to fight the language every step.
But the grain of the language also includes the question of scale. Some languages provide a lot of flexibility. They allow compact and loose representations of programs which can be customized to the problem domain easily. These languages include Scheme and Ruby and Haskell. These flexible languages are very useful for small projects with one or a few developers because they can be metaphorically molded to fit the hand of the person who wields them. But there is a trade-off because they tend to be more difficult to use in large groups because it is harder for others to understand what is going on. This is a fundamental trade-off that programming languages must make. And it means that a language which is great at one end of the spectrum will likely be lousy at the other end. And this is reflected in the lines of code required for a particular scale of problem.
My second criticism is in regard to your discussion of state. You point out that Brooks considered managing of state to be a major essential difficulty of programming and you then claim that functional languages obviate this difficulty and hypothesize this as the reason that they can be a silver bullet.
I believe that you have misunderstood the kind of state that Brooks was referring to. He was not talking about run-time state but compile-time state. He was not talking about what variables are changed at run-time. He was talking about the interactions between components of the program. These interactions are still there and just as complex in functional languages as in imperative languages.
Second, even when considering just the run-time state, the referential transparency of functional languages simplifies only the theoretical analysis of a program. A normal programmer reasoning informally about what a program does must consider how state is transformed in the same way whether a modified copy is made or a destructive write is made. This is the same kind of reasoning.
Finally, I have seen many people talk about getting an order of magnitude improvement by finding some incredible programming tool. Functional programming is not unique in that respect. But in my experience this is more likely to be about finding a methodology that suits the person's mindset than about finding the one true language or system. Somebody who thinks about programming in terms of a conceptual universe that changes over time will be an order of magnitude less effective in a functional environment. And somebody who thinks about programming in terms of a conceptual description of the result which is broken up into first class functions will be an order of magnitude less effective in an imperative environment.
I have programmed in both imperative and functional languages. I know and understand the functional idioms and have used them. My mindset tends to the imperative, and I am a less effective programmer in functional languages. But I have seen programmers who can pull a metaphorical rabbit out of a hat while tapdancing in them. This says to me that evangelism about functional languages or imperative languages is fundamentally misguided regardless of the side.
December 13th, 2006 at 7:20 pm
Jonathon Duerig:
I had decided not to respond to any further comments and instead get on with my next article. But yours is long and carefully argued, so it merits a response regardless. It's also nice to be arguing the point with someone who knows what a fold is.
You make the point that during maintenance the difficulty of later understanding the purpose of an operation is language independent. I’m not so sure. A maintainer may suspect that a C++ iterator is truly orthogonal, but it can’t be taken for granted. There may be a bug hiding in those methods, or perhaps someone fixed a bug or worked around a problem by tweaking the semantics in an irregular way. Also a lot of the understanding of a piece of code comes from context, and it helps a lot to be able to see all the context at once (why else would 3 big monitors be a selling point for coding jobs?). So terse code makes it a lot easier to deduce context because you can see more at once.
(Aside: I remember in my final year project at Uni going into the lab at night because then I could get two VT100s to myself).
You say that Scheme, Ruby and Haskell can be moulded to fit the hand of the user, making them more productive for single person tasks, but less productive for groups because of mutual comprehension difficulties.
This is hard to test because of the lack of statistics, but Haskell is strongly typed and the community has already developed conventions and tools for documentation and testing (Haddock and QuickCheck). I can see that Scheme macros can be used to construct an idiosyncratic personal language, but I really don't see how this could happen in Haskell. Things that get done with macros in Scheme are usually done with monads in Haskell, but whereas Scheme macros are procedural, monads are declarative and must conform to mathematical laws, making them tractable. My experience with Haskell monads is that you generally build a monadic sub-language in a single module and provide libraries for it in some other modules (e.g. Parsec), and that the end result is intuitive and simple to use. But maybe I've only been exposed to well-designed monads.
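To give a flavour of what I mean by a monadic sub-language, here is a minimal sketch using the classic Parsec combinators (the particular grammar is invented purely for illustration):

    -- A minimal sketch of a monadic sub-language built from Parsec combinators
    -- (the grammar here is invented purely for illustration).
    import Text.ParserCombinators.Parsec

    number :: Parser Int
    number = do
      ds <- many1 digit                -- one or more digits
      return (read ds)

    numbers :: Parser [Int]
    numbers = number `sepBy` char ','  -- numbers separated by commas

    -- parse numbers "" "1,2,3"  gives  Right [1,2,3]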
On the subject of state and informal reasoning: personally I use whatever reasoning forms that will work. In debugging a particularly complex monad I once resorted to writing out the algebraic substitutions long-hand in order to understand how the bind and return operators were interacting. It worked, and I got the monad to do what I wanted. I routinely use informal algebraic reasoning of this sort in simpler cases in order to understand what my program is doing. Any informal reasoning must be a hasty short-hand version of what a full formal proof would do, and it follows that language features that make full formal proof easier will make the informal short-hand mental reasoning easier too.
Pure functions are particularly valuable when trying to understand a large program because you don’t have to worry about the context and history of the system for each call; you just look at what the function does to its arguments. In a real sense this is as big a step forwards as garbage collection, and for the same reason: any time you overwrite a value you are effectively declaring the old value to be garbage. Functional programs (at least notionally) never require you to make this decision, leaving it up to the GC and compiler to figure it out for you based on the global system context. Thus complex design patterns like Memento and Command are rendered trivial or completely obsolete.
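As a small illustration of the Memento point (the types here are invented for the example): with immutable values, "saving state for undo" is just holding on to the previous value, and the garbage collector reclaims whatever you stop referring to.

    -- A small illustration of the Memento point (types invented for the example).
    -- With immutable values nothing is ever overwritten, so an undo history is
    -- just a list of earlier documents; the GC reclaims whatever we drop.
    data Document = Document { docText :: String } deriving Show

    edit :: (String -> String) -> Document -> Document
    edit f d = d { docText = f (docText d) }

    -- Apply an edit, remembering the previous document as the "memento".
    apply :: (String -> String) -> (Document, [Document]) -> (Document, [Document])
    apply f (d, history) = (edit f d, d : history)

    -- Undo by restoring the most recent remembered document, if any.
    undo :: (Document, [Document]) -> (Document, [Document])
    undo (d, [])          = (d, [])
    undo (_, prev : rest) = (prev, rest)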
Finally you talk about the many over-hyped technologies in this industry. Yes, hype is a common problem. Those of you who think you have a silver bullet are very annoying for those of us who actually do.
Paul.
December 15th, 2006 at 7:36 am
Since I happened to stumble upon an actual Dijkstra cite just now, I thought I'd add it here (having read and appreciated your original post a few days ago).
In EWD513, "Trip Report E.W. Dijkstra, Newcastle, 8-12 September 1975," he writes,
“The willingness to accept what is known to be wrong as if it were right was displayed very explicitly by NN4, who, as said, seems to have made up his mind many years ago. Like so many others, he expressed programmer productivity in terms of ‘number of lines of code produced’. During the discussion I pointed out that a programmer should produce solutions, and that, therefore, we should not talk about the number of lines of code produced, but the number of lines used, and that this number ought to be booked on the other side of the ledger. His answer was ‘Well, I know that it is inadequate, but it is the only thing we can measure.’. As if this undeniable fact also determines the side of the ledger….”
That is the edited version as printed in “Selected Writings on Computing: A Personal Perspective”. The original text can be found in the EWD Archive, at http://www.cs.utexas.edu/users/EWD/transcriptions/EWD05xx/EWD513.html