For Some Value of "Magic": programming

November 30, 2014

“Rock Star” Programmers

I've finally realized just what, exactly, so gets up my nose when I see organizations advertising for “rock star” programmers. To understand, you need to cast your mind back (assuming it then existed) to 1971, when I was in my first decade as a programmer and Gerald Weinberg had just produced The Psychology of Computer Programming, in which he concluded that over-identification with the code one produced could, in interesting ways, be counter-productive.

He observed that programmers who were reluctant to share their code tended to hold on to false views of what might be causing issues rather longer than those who were open to review and discussion. My own development as a programmer was greatly aided by this approach: at university a couple of close friends and I discussed every aspect of the code we were creating.

Forty years later Weinberg's “egoless” approach, in which mistakes are accepted as inevitable and reviews are performed in a collegial way, remains the sanest way to produce code. Given that computer programming is fast becoming a mainstream activity it seems perverse to deliberately select for ego when seeking programming talent, since the inevitable shortfall in humility will ultimately work to undermine the rock star's programming skills.

When I think of the best programmers I know, the foremost characteristic they share is a modesty about their own achievements which others would do well to emulate. So, can we please dispel the myth of the “rock star” programmer? The best programmers can't be rock stars. Rock star egoism will stand in the way of developing your programming skills.

May 17, 2012

Refactor All the Laws

In the Python world we try to keep things simple. There are very good reasons for this. Brian Kernighan, a well-known programmer responsible for parts of the design of UNIX™*, once famously observed that

Debugging is twice as hard as writing code in the first place. Therefore, if you write the code as cleverly as possible you are, by definition, not smart enough to debug it.

This principle is not observed by programming beginners, who know enough to get themselves in trouble, but often not enough to get out of it. Anyone who has been coding a long time (and believe me, I have been coding a long time) knows not to get too smart lest they write code they cannot debug. It's slightly different when you are struggling to establish an architecture (are we a library or a framework?), but once the production environment is established you really want to be able to crank it out without having to rethink each corner case.

I even once went so far as to write a pre-processor for BASIC PLUS that operated on BASIC lines (which had to be numbered, whereas the input to the pre-processor did not) much as an assembler did to lines of symbolic machine code. In other words, I made BASIC code relocatable and made library sharing between projects much simpler. Because it was really more like a linking loader than an assembler (though it had features in common with both), I called it Blink. But that was thirty-odd years ago, when I was writing accounting systems. It certainly let us crank the code out. Happy days.

Of course Blink no longer exists. I am not one of those programmers who keeps every line of code they have ever written, regarding most of it as ephemeral: built to perform a task, and no longer relevant once the task is complete. I do not envy the curators of computer museums, who must decide what is worth keeping, and what can be kept running. You can do that with hardware, just about, but with software the profusion makes it impossible to track what's going on. Perhaps soon the open source world will find fixes for this bug—certainly the appearance of public DVCS systems will help. Some software, however, seemingly goes on for ever and ever.

It was very pleasing earlier this year to see a bunch of BBC Micros of ancient vintage still doing what they were designed for thirty years later. I wonder whether we'll ever see a 30-year-old iPhone anywhere but a museum? (To be fair, you don't see many BBC Micros nowadays). There are relatively few engineers building control systems and the like for a lifetime much longer than that.

Of course most systems of any age have been modified somewhat from their original purpose. You build a billing system, then the sales department come along and say "we could double our revenues if we could bill more flexibly," and the dance begins. You fix your programs to handle new requirements, they come along with even newer requirements, you add more fixes, and so on. Unless you are very, very disciplined, and are working with well-designed well-written code (e.g. if someone else wrote the program badly you may be screwed) you can end up finding that the change you make, while it meets the new requirements, no longer meets earlier requirements because of unintended consequences from your change. In simple terms, your program has become so complex that fixing it in one place breaks it in another.

In the software world we can use regression tests to alleviate the worst pain from this kind of activity (in addition to the unit tests we use to establish that basic functions operate correctly). Whenever you find an error in the program, you write a test that fails with the problematic release but should pass when the system is fixed. This has the advantage that if a later change causes an unanticipated failure there is a high probability that at least one existing test will fail. The presence of such regression tests sets a sort of “high water mark” for software performance. It has to be at least good enough to pass all the tests, or something is broken. In the presence of the tests we can refactor our code (reorganize and re-structure it) with some confidence that we will find out quickly if we break anything.
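To make the idea concrete, here is a minimal sketch using Python's standard unittest module. The billing function, its name `invoice_total` and the bug it guards against are all invented for illustration; they are not taken from any real system.

```python
import unittest


def invoice_total(unit_price, quantity, discount_percent=0):
    """Return the amount to bill, in pence, after a percentage discount.

    Hypothetical example function, not from any real billing system.
    """
    # The fix for the imagined bug: clamp the discount to the range 0..100.
    discount_percent = max(0, min(discount_percent, 100))
    gross = unit_price * quantity
    return gross * (100 - discount_percent) // 100


class InvoiceTests(unittest.TestCase):
    def test_basic_total(self):
        # Unit test: establishes that the basic function operates correctly.
        self.assertEqual(invoice_total(250, 4), 1000)

    def test_discount_over_100_percent_is_not_a_refund(self):
        # Regression test: written when the (imagined) bug was reported.
        # Without the clamp above, a 150% discount gives a negative bill.
        self.assertEqual(invoice_total(250, 4, discount_percent=150), 0)
```

Saved in a module of its own, the tests can be run with `python -m unittest`. If a later "improvement" removes the clamp, the second test fails and the regression is caught before release.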

Imagine now, if you can, a computer program two hundred years old**. Yes, I know, that pre-dates even Charles Babbage's analytical engine by more than a century. Never mind that. Just suppose that by some freak of probability some primitive computing technology had been developed by an unsung genius, and that its output is so valuable that it must be kept running†. The order of society itself depends on this program running, and yet it has never had a single test written for it. Don't blame the authors: when the constitution was written there was no such thing as test-driven development.

And yes, I am talking about the law of the land as something in need of refactoring. In just the same way as software engineering has benefited from test-driven development, so the law would, in my immodest contention, benefit from principle-driven development. By this I mean to suggest that the lawyers, when proposing a law, should list some desirable outcomes (tests) which somehow embody the law's purpose. If we could at least get agreement on what such principles might be, and the fact that they are desirable, then we might establish benchmarks for the operation of a law and be able to reject amendments that violate the principles (break the tests, in coding terms).

Just as we sometimes get our tests wrong, and have to rewrite them, legislators will occasionally get the principles wrong and need to rewrite those. But a discussion of principle would be a matter for all-out debate, whereas it seems to me that modifications of the law would be less contentious as long as none of the underlying principles were violated (in other words, as long as the law introduces no regression errors).

The present laws are full of special cases, introduced because of the lobbying of those with special interests or to suit one particular constituency. It's time we stopped placing so much emphasis on passing new laws and decided instead to add principles to the existing law so that we could start to detect more easily when the law started to diverge from society’s desires about the way it operates. In time the law could be cleaned up in much the same way as a crufty old program can be re-engineered to bring it in line with more modern requirements.

At present the law is a big ugly ball of string, and there are many professionals making a good living finding and exploiting loopholes that operate to the advantage of their clients. We need more foresight, and we need a legal system that effectively says “this law cannot be amended, and is not intended to operate, to provide tax benefits to those who do not require them” or “this law cannot be used to the benefit of anyone with above-average income.” While this isn't a perfect proposal, it would perhaps serve to focus people's interest on those who are specifically intended to benefit from the passage of particular laws, and the principles might over time become an accepted set of goals for new legislation.

When I think of how crufty code gets after just a few years I shudder to think what the law must look like from the inside. It's certainly obvious that the legislature has not been operating “by the people, for the people and of the people.” It's time we changed that. Since I have no vote I'd appreciate it if my voter friends could execute this change at the first available opportunity.

* Merely one in a very long list of achievements, as any Internet search will reveal
** Or, perhaps, 236 years old
† Believe it or not, at the time of the "year 2k" panic some banks discovered they were running (in compatibility mode) some programs originally written in 1400-series autocode for which they no longer had the source. This would have been more surprising back in the days when banks were regarded as reliable and responsible. Happy days.

December 4, 2008

Python 3.0 Is Out

Even though I am on vacation this is worth a quick note. After long efforts by many developers Python 3.0 was released today!

I posted a short article a while ago about 3.0 (in)compatibility, but the differences between 2.6 and 3.0 aren't so great. It's perfectly possible to write 3.0 code that will run on 2.6 too, as most of the language hasn't changed at all.
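As a small illustration of that common subset, here is a sketch that should behave the same on 2.6 and 3.0. The function name `parse_port` and its default value are invented for the example; the point is only that it sticks to syntax both versions accept.

```python
def parse_port(text, default=8080):
    """Return text as a TCP port number, or default if it is not usable.

    Hypothetical helper, written using only syntax common to 2.6 and 3.0.
    """
    try:
        port = int(text)
    except ValueError as exc:  # the "as" form is accepted by both versions
        # print with a single parenthesized argument reads the same
        # whether print is a statement (2.6) or a function (3.0)
        print("ignoring %r: %s" % (text, exc))
        return default
    if not 0 < port < 65536:
        return default
    return port


print(parse_port("8000"))
print(parse_port("not a number"))
```

The older `except ValueError, exc` spelling and a bare `print x, y` are the sort of 2.x-only constructs that this style avoids.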

The preferred strategy for writing code that runs on both versions is to write in 2.6 and then apply the 2to3 converter and verify that it produces a correct 3.0 program. There's no guarantee that it will, so you may need to paraphrase the 2.6 code a few times before you get a translatable program.

Once all the third-party modules and extensions you rely on are 3.0-ready, and you no longer have clients requiring a 2.6 version of your software, you can simply drop the 2.6 compatibility requirement and start to make use of the few 3.0-only constructs that have been introduced.

December 21, 2007

On Programming

F**king brilliant.

December 19, 2007

Cheese Shop Doing Good Business

PyPI, the Python Package Index, is affectionately known as the Cheese Shop due to the Python world's affinity with Monty Python and its tendency to relish obscure puns based on the sketches. I can't say I have ever liked the nickname, but PyPI is a valuable repository of often highly usable code and currently contains over 3,000 contributions. So it's worth a look.

The RSS feed contains the last thirty updates, which currently cover a 48-hour span--the feed appears to be updated once daily at 7 pm. So there's clearly a lot going on, and when I get time (note the ironic sound of hollow laughter ringing metaphorically in your ears) I shall be using the browse interface to take a closer look.

September 18, 2007

Blogger Behind the Curve?

I am sure I wasn't the only one who greeted Blogger's acquisition by Google with enthusiasm. At last, I thought, I will be free of the editor applet, which has become rather tedious to use.

Alas not, though. While much has changed, and the AJAX-based layout editor is a great improvement, we are still left with a content editor that creates horrible HTML and whose toggle buttons easily lose synchronization with the editor's state.

A recent post by Paddy also highlights the fact that there is inadequate support for posting code. I realise this probably isn't a majority interest, but in this day and age you would think the world's leading web company could do better.

August 17, 2007

Close Enough?

I have long admired the formula often known as Euler's identity. It was probably known before Euler's time, but it is associated inextricably with his name because it is a special case of a more general formula, with π as the value of the bound variable. The identity asserts that

e^(iπ) + 1 = 0

To me this is a thing of beauty and a joy forever, but I have long since given up trying to explain to other people why or how I perceive beauty in mathematics. Anyway I thought I would see how close my trusty laptop could get to emulating this mystic identity (it's amazing what I get up to when procrastinating), and naturally chose Python (though I believe the results would be just as disappointing in any other language). Here's what I got:

>>> import math
>>> math.e**(math.pi*-1j)
(-1-1.2246467991473532e-16j)

Definitely not quite the same mystical properties there, even though numerically quite close. No wonder I never liked applied mathematics!
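For anyone who wants to check how close "quite close" is, here is a small sketch using the standard cmath module. It uses the exponent iπ rather than the -iπ of the session above, and the tolerance passed to `cmath.isclose` (available from Python 3.5) is an arbitrary choice of mine. The stray imaginary part arises because `math.pi` is only the nearest floating-point value to π, so the sine of it is not exactly zero.

```python
import cmath
import math

value = cmath.exp(1j * math.pi)  # e^(iπ), as the machine sees it
residual = value + 1             # would be exactly 0 in pure mathematics

print(value)
print(abs(residual))             # tiny, on the order of 1e-16, but not zero

# Comparing with a tolerance is the usual way to ask "close enough?"
print(cmath.isclose(value, -1, abs_tol=1e-12))
```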

July 18, 2007

Obscurantism

I may have mentioned before that I have started to use the C# language. Overall it isn't bad - it uses static typing, which I have always found irksome, but the language is reasonably expressive and compact compared with something like Java or C++. Because it is heavily promoted by Microsoft there are any number of tutorial and example web sites, not all of them by programming experts.

Somehow the Java message that it's bad to allow direct access to instance and class attributes appears to have permeated the C# world; I have no idea why. I have just been reading one of the better-written tutorials, which offers the following code as an example of instance creation and manipulation.

static void Main(string[] args)
{
    Person Michael = new Person();
    Person Mary = new Person();
    // Specify some values for the instance variables
    Michael.Age = 20;
    Michael.HairColor = "Brown";
    Mary.Age = 25;
    Mary.HairColor = "Black";
    // print the console's screen some of the variable's values
    Console.WriteLine("Michael's age = {0}, and Mary's age = {1}",
                      Michael.Age, Mary.Age);
    Console.ReadLine();
}

As code goes that isn't bad, and apart from the different comment styles and the declarations it could almost be Python. Unfortunately the author has tasted the Java Kool-Aid, and shortly after this example writes

So each object now contains different data. Note that we directly accessed the variables and we put any values we wanted, right? But wait there is a solution to this problem. We will use properties.

There is no attempt to explain what the "problem" is, and you will probably not be surprised to learn that the "solution" involves making the attributes private to the class and then providing public getter and setter methods for instance users. This turns a five-line class declaration into a 32-line one (though to be fair to the author, he does at least include checking code that demonstrates the value of properties in applying class-based logic during assignment).

I have now read any number of texts where instead of something like

class Person
{
    public int Age;
    public string HairColor;
}

that allows direct access to the instance variables by client code, readers are encouraged to write horrendous code like

class Person
{
    private int age;
    private string hairColor;
    public int Age
    {
        get
        {
            return age;
        }
        set
        {
            age = value;
        }
    }
    public string HairColor
    {
        get
        {
            return hairColor;
        }
        set
        {
            hairColor = value;
        }
    }
}

which actually offers no benefit over the short version at all. Python users are used to accessing attributes directly in their code, which clearly has performance benefits, and then implementing properties if and when they are required to add logic to the setting or retrieval of attribute values. I just wish that C# users could see that empty getters and setters offer no measurable benefit over direct access to attributes.
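For comparison, here is a sketch of the Python way of working described above. The class names and the validation rule are my own inventions for illustration, loosely mirroring the tutorial's Person class: start with plain attributes, and introduce a property only on the day some logic is actually needed.

```python
class Person:
    """First version: plain attributes that clients read and write directly."""

    def __init__(self, age, hair_color):
        self.age = age
        self.hair_color = hair_color


class CheckedPerson:
    """Later version: age gains validation, and client code does not change."""

    def __init__(self, age, hair_color):
        self.age = age  # this assignment goes through the property setter
        self.hair_color = hair_color

    @property
    def age(self):
        return self._age

    @age.setter
    def age(self, value):
        # Hypothetical rule, added only once it was actually required.
        if value < 0:
            raise ValueError("age cannot be negative")
        self._age = value


michael = Person(20, "Brown")
mary = CheckedPerson(25, "Black")
michael.age += 1  # direct attribute access
mary.age += 1     # identical syntax, now checked on assignment
print("Michael's age = %d, and Mary's age = %d" % (michael.age, mary.age))
```

Client code is written the same way against both classes, which is why there is nothing to gain from writing the empty getters and setters in advance.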

May 17, 2007

Another Great Python Blog Entry

I've been following Doug Hellmann's Python Module of the Week series, but I already knew most of the modules he'd covered, until along came PyMOTW: logging, which describes a module I have always found difficult to understand, in terms that make it comprehensible. Nice job!

Wyatt Baldwin Blog - Google Maps

Wyatt Baldwin has recently made a couple of interesting blog entries (I'm just catching up after an extended Windows repair session [spit, spit]). In Google Maps Encoded Polylines he uses Python to draw shapes on Google Maps, and in Creating a (Google Maps) Tosca Widget he explains in considerable detail just how to do that. Both great posts demonstrating Python's power, and much kudos to Wyatt for this excellent work.

April 11, 2007

Help Wanted!

Another post from Doug Napoleone says (among many other things) "I need to take the very clean and and elegant code of Steve Holden and stuff in into the somewhat organic and undocumented PyCon-Tech code."

It's strange how other people's perceptions of our code can be so different from our own. I am conscious of several areas where that code (published, by the way, as a part of my PyCon 2007 tutorial) really needs revision. But anyway, thanks for the tip of the hat, Doug.

Anyone wanting to help to improve the software support for PyCon should sign up for the PyCon Tech project. It would be a great way to get started in Django in a supportive environment.