So my rant yesterday about good and bad programmers did leave me musing about an important corollary question: how do you make good programmers? The answer is obviously complex, but here's a starting point: teach them The First Law of Programming, which is:
Duplication Is Evil
Really, that's it -- one nice simple sentence, with huge ramifications.
The odd part is how ill-taught this rule is. Most programming courses teach it as an afterthought, if at all, which is strange because it motivates so much of the structure of programming. I mean, the evolution of computer languages has mostly been about finding higher and higher-level ways to eliminate duplication in code, and many language features are all about ways to remove duplication. For example:
- If you find the same expression being used in multiple places in the program -- even if it is just one complex line -- it most often makes sense to lift that out into its own parameterized function or method.
- If you have the same basic functional pattern being used repeatedly -- that is, when you can comfortably say, "This is just doing the same thing as that except for X" -- then you probably want to lift out a higher-order function, encapsulating X as a functional parameter or in a closure.
- If you have multiple classes that are doing essentially the same things, except *to* different types -- for instance, a List of integers vs. a List of strings vs. a List of Customers -- then you almost certainly want a Generic class.
- If you have multiple classes that are trying to do the same tidbit of functionality, then you probably want a trait or a mixin. (Or if you are trapped in single-inheritance land, at least change the way you're aggregating those functions.)
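To make those four cases concrete, here is a minimal sketch in Python. Every name in it (`with_tax`, `total`, `Stack`, `DescribeMixin`, `Customer`, `Order`) is invented for illustration, and Python's mixin classes stand in for the traits you would find in other languages.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Generic, TypeVar


# 1. A repeated expression lifted into a parameterized function.
def with_tax(amount: float, rate: float) -> float:
    return round(amount * (1 + rate), 2)


# 2. "The same thing as that, except for X": X becomes a function parameter.
def total(items: list[float], adjust: Callable[[float], float]) -> float:
    return round(sum(adjust(item) for item in items), 2)


T = TypeVar("T")


# 3. One generic container instead of separate int, string and Customer stacks.
class Stack(Generic[T]):
    def __init__(self) -> None:
        self._items: list[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()


# 4. A tidbit of shared behavior packaged as a mixin.
class DescribeMixin:
    def describe(self) -> str:
        return f"{type(self).__name__}({vars(self)})"


@dataclass
class Customer(DescribeMixin):
    name: str


@dataclass
class Order(DescribeMixin):
    order_id: int
```

A call such as `total(prices, lambda p: with_tax(p, 0.5))` shows the first two working together: the varying part travels in the closure, and the summing loop is written only once.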
Mind, I am not advocating removing duplication for the usual squishy reasons like "reuse". (Itself a source of many sins, because it misses the fact that *sometimes*, it really is much cheaper, easier and more reliable to reinvent the wheel.) The real reason is much simpler: Duplication Causes Bugs. Period. And I don't mean occasionally: in my experience, *most* serious programming bugs trace back to duplication in one way or another. Sometimes it is because duplicated code makes the code bulkier and harder to reason about. Frequently, it is because you copied this code into four places, tweaked it in one of them, and forgot to tweak it in the others. Most often, the duplication is simply a symptom of the fact that you don't really understand the abstractions in your code.
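The "tweaked it in one of them" failure is easy to reproduce. The sketch below is a made-up example, not code from any real project: an age check is pasted into two hypothetical functions, the threshold changes, and only one copy gets the update.

```python
# Before: the same rule was pasted into two places.
def can_register_copy(age: int) -> bool:
    return age >= 21  # this copy was updated when the rule changed


def can_checkout_copy(age: int) -> bool:
    return age >= 18  # this copy was forgotten


# After: the rule lives in one place, so a change reaches every caller.
MINIMUM_AGE = 21


def meets_minimum_age(age: int) -> bool:
    return age >= MINIMUM_AGE


def can_register(age: int) -> bool:
    return meets_minimum_age(age)


def can_checkout(age: int) -> bool:
    return meets_minimum_age(age)
```

With the pasted copies, a 19-year-old is refused at registration but allowed at checkout. With the combined version the two answers cannot drift apart.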
So if you are learning programming, I commend to you this rule. Whenever you notice *any* kind of duplication, ask why, and really dig into whether those duplicates should be combined. While it is technically possible to carry it too far, it's really pretty difficult to do so -- the exceptions are at least a bit unusual. And continually saying to yourself, "Surely there must be *some* way to remove this duplication" will force you to think in ways that will teach you a huge amount about why modern programming languages work the way they do, and why you want to use those fancy constructs.
To give you a leg up, I'll point you specifically to the little-known bible of programming: Refactoring: Improving the Design of Existing Code, by Martin Fowler. Fowler can be a bit of a loon (albeit glorious fun to listen to), but he's a brilliant loon and one of the more insightful thinkers about the art of programming. This book, in particular, is the one I hand to *every* intermediate-level engineer. It starts with a fairly modest section on how to think about the structure of code, and then spends the rest of the book on an encyclopedia of "code smells" and how to fix them. Tom Leonard insisted that I read it back when I was working for him, a dozen or so years ago, and of all the things I learned from Tom, this was probably the single most valuable. It isn't quite perfect -- it is very Java-centric, so misses lots of functional-programming options available in more modern languages, and it is very focused on fixing existing code. But it'll teach you a lot about how to *think* about code properly.
As the title says, this is just part 1. When I have some time (possibly later today, but we'll see), I'll get into Part 2: Duplicate Data is Evil...
Programming & Business
Date: 2011-11-04 08:06 pm (UTC)
Duplication is evil.
This is typically described in "the office" world that I am so accustomed to as "Don't do double work." Overlapping. Two people doing the same task to achieve the same goal... or not... two people simply doing the same task and not knowing that it's being done. It slows down productivity on a whole bunch of levels. Then it breeds apathy. 'Why should I do this if someone else is anyway?'
This is something I talk about in interviews. If a problem is recurring (duplicating) then a process should be put in place to correct the problem at the core. At which point, other duplications... and issues (hey, 'Terry' did that same solution and now these three people are doing it, but not the rest of the team) can come to light. Yay! Problems. (Not being sarcastic.) Now that problems have been identified, processes can be put into place, so that duplications can cease and productivity can increase.
If you ever burn out on programming, might I suggest a 2nd career in HR?
(no subject)
Date: 2011-11-04 09:16 pm (UTC)
Duplication Is Evil
Really, that's it -- one nice simple sentence, with huge ramifications.
Ha. That was strongly in mind just last night, while signing 120+ pages of paperwork for a refinance. (*There's* something that desperately needs refactoring.)
While it is technically possible to carry it too far, it's really pretty difficult to do so -- the exceptions are at least a bit unusual.
"Too far" may be rare, but "badly" is also a pitfall of the overzealous. A perhaps-canonical example: the method which does one of seventeen different things depending on which flags get passed in, those things being related only by the fact that pieces of their internal logic overlap. All done in the name of avoiding duplication, much like the Inquisition was done in the name of promoting faith and virtue.
...and it is very focused on fixing existing code.
I'd call that a point in its favor - maintaining and modifying an existing codebase is far more prevalent in industry than in your average comp sci degree program; formally passed-down knowledge on this skill is a good thing to have around.
(no subject)
Date: 2011-11-05 12:41 am (UTC)
I certainly try to make this point in class. "When you find yourself writing the same thing over and over, you're doing something wrong." I say this in introducing variables, again in introducing functions, again in introducing higher-order functions, again in introducing inheritance....
(no subject)
Date: 2011-11-05 04:59 pm (UTC)
Reminds me of one of my best quips from a past job where we were designing a large class library: "It's better to reinvent the wheel than to subclass a square wheel and try to make it round." :-)
(no subject)
Date: 2011-11-08 04:00 am (UTC)
Today, for instance, we were presented with the flexible new system that is supposed to replace some ad hoc perl reports - now instead of writing a new 50 line script for each query, we only need to create a new DB table, DAO, service class, model, view, controller, and command-line application...