For this reason, many software professionals encountering science software for the first time may be horrified. How, they ask, can we rely on this crude software, developed in primitive conditions by amateurs working with such poor tools and such poor understanding of the field? This is a common reaction to GISTEMP, and is exactly the reaction which many critics have had, some very publicly, to the software published with the CRU emails. Such critics do have a point. Science software should be better than it is. Scientists should be provided with more training, and more support.
But consider the uses to which science software is put. Most software written by scientists:
* consists of tiny programs;
* which will only ever be run a small number of times;
* over the course of a few weeks as it is being developed;
* by the scientist who wrote it;
* on data gathered by that scientist's team;
* concerning a scientific field in which that scientist is expert;
* to perform data processing on which that scientist is expert;
and will be discarded, never to be used again, as soon as the paper containing the results is accepted for publication.
There are hardly any scientists today who don't do some programming of some sort; there's not much science that doesn't involve churning through really big data sets. As a result, there's a lot of this kind of code about. Which reminds me of this Eric Sink post from 2006, about the distinctions between "me-ware, us-ware, and them-ware". Me-ware is software that you write and only you use; us-ware is software that is used by the same organisation that produces it; them-ware is software that is produced by a software company or open-source project for the general public.
There's a gradient of difficulty; the further from you the end-user is, the less you know about their needs. On the other hand, if you're just trying to twiddle the chunks to fit through the ChunkCo Chunkstrainer without needing to buy a ChunkCo Hyperchunk, well, although you know just how big they are, you're unlikely to spend time building a pretty user interface or doing code reviews. Which only matters up to a point; nobody else would bother solving your problem.
But this can bite you on the arse, which is what happened to the climate researchers. It's fair to say that if you're processing a scientific data set, what actually matters is the data, or the mathematical operation you want to do to it. You won't get the paper into Nature because you hacked up a really elegant list comp or whatever; they won't refuse it because the code is ugly. Anyone who wants to replicate your results will probably roll their own.
This is OK, but the failure mode is when the political equivalent of Brian Coat comes snooping around your #comments or lack of them. Perhaps I should tidy up the Vfeed scripts while I'm at it.