[personal profile] jducoeur
So my rant yesterday about good and bad programmers did leave me musing about an important corollary question: how do you make good programmers? The answer is obviously complex, but here's a starting point: teach them The First Law of Programming, which is:
Duplication Is Evil
Really, that's it -- one nice simple sentence, with huge ramifications.

The odd part is how ill-taught this rule is. Most programming courses teach it as an afterthought, if at all, which is strange because it motivates so much of the structure of programming. I mean, the evolution of computer languages has mostly been about finding higher and higher-level ways to eliminate duplication in code, and many language features are all about ways to remove duplication. For example:
  • If you find the same expression being used in multiple places in the program -- even if it is just one complex line -- it most often makes sense to lift that out into its own parameterized function or method.

  • If you have the same basic functional pattern being used repeatedly -- that is, when you can comfortably say, "This is just doing the same thing as that except for X" -- then you probably want to lift out a higher-order function, encapsulating X as a functional parameter or in a closure.

  • If you have multiple classes that are doing essentially the same things, except *to* different types -- for instance, a List of integers vs. a List of strings vs. a List of Customers -- then you almost certainly want a Generic class.

  • If you have multiple classes that are trying to do the same tidbit of functionality, then you probably want a trait or a mixin. (Or if you are trapped in single-inheritance land, at least change the way you're aggregating those functions.)
And so on. While not every programming-language feature is about removing duplication, many are, and for good reason.
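
To make those four bullets concrete, here is a rough sketch in Python. The language is chosen only for illustration, and every name in it (with_tax, total_where, Stack, DescribeMixin, the 6.25% rate) is a hypothetical stand-in rather than anything from a real codebase.

```python
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


# 1. Repeated expression -> one parameterized function.
#    Stands in for "round(amount * (1 + rate), 2)" pasted at many call sites.
def with_tax(amount: float, rate: float = 0.0625) -> float:
    return round(amount * (1 + rate), 2)


# 2. "The same thing as that except for X" -> higher-order function.
#    X (the filtering rule) becomes a function argument.
def total_where(amounts: list[float], keep: Callable[[float], bool]) -> float:
    return sum(a for a in amounts if keep(a))


# 3. Same container logic applied to different element types -> generic class.
class Stack(Generic[T]):
    def __init__(self) -> None:
        self._items: list[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()


# 4. Same tidbit of functionality in otherwise unrelated classes -> mixin.
class DescribeMixin:
    def describe(self) -> str:
        return f"{type(self).__name__}({vars(self)})"


class Customer(DescribeMixin):
    def __init__(self, name: str) -> None:
        self.name = name


class Invoice(DescribeMixin):
    def __init__(self, total: float) -> None:
        self.total = total
```

In each case the shared logic lives in exactly one place, and only the part that differs (the rate, the predicate, the element type, the host class) is supplied by the caller.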

Mind, I am not advocating removing duplication for the usual squishy reasons like "reuse". (Itself a source of many sins, because it misses the fact that *sometimes*, it really is much cheaper, easier and more reliable to reinvent the wheel.) The real reason is much simpler: Duplication Causes Bugs. Period. And I don't mean occasionally: in my experience, *most* serious programming bugs trace back to duplication in one way or another. Sometimes it is because duplicated code makes the code bulkier and harder to reason about. Frequently, it is because you copied this code into four places, tweaked it in one of them, and forgot to tweak it in the others. Most often, the duplication is simply a symptom of the fact that you don't really understand the abstractions in your code.
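
The "tweaked it in one of them" failure can be sketched in a few lines of Python. This is an invented example, not a bug from any particular project: register_user, update_user, and normalize_email are hypothetical names, and the validation rule is deliberately simplistic.

```python
# Duplicated: the same rule was pasted into two functions. One copy was later
# tweaked to strip whitespace; the other copy was forgotten.
def register_user(email: str) -> str:
    cleaned = email.strip().lower()
    if "@" not in cleaned:
        raise ValueError("invalid email")
    return cleaned


def update_user(email: str) -> str:
    cleaned = email.lower()  # stale copy: the strip() tweak never landed here
    if "@" not in cleaned:
        raise ValueError("invalid email")
    return cleaned


# Deduplicated: a single definition, so any tweak applies to every caller.
def normalize_email(email: str) -> str:
    cleaned = email.strip().lower()
    if "@" not in cleaned:
        raise ValueError("invalid email")
    return cleaned
```

With the duplicated versions, the same input is stored differently depending on which code path handled it; routing both paths through normalize_email makes that divergence impossible.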

So if you are learning programming, I commend to you this rule. Whenever you notice *any* kind of duplication, ask why, and really dig into whether those duplicates should be combined. While it is technically possible to carry it too far, it's really pretty difficult to do so -- the exceptions are at least a bit unusual. And continually saying to yourself, "Surely there must be *some* way to remove this duplication" will force you to think in ways that will teach you a huge amount about why modern programming languages work the way they do, and why you want to use those fancy constructs.

To give you a leg up, I'll point you specifically to the little-known bible of programming: Refactoring: Improving the Design of Existing Code, by Martin Fowler. Fowler can be a bit of a loon (albeit glorious fun to listen to), but he's a brilliant loon and one of the more insightful thinkers about the art of programming. This book, in particular, is the one I hand to *every* intermediate-level engineer. It starts with a fairly modest section on how to think about the structure of code, and then spends the rest of the book on an encyclopedia of "code smells" and how to fix them. Tom Leonard insisted that I read it back when I was working for him, a dozen or so years ago, and of all the things I learned from Tom, this was probably the single most valuable. It isn't quite perfect -- it is very Java-centric, so misses lots of functional-programming options available in more modern languages, and it is very focused on fixing existing code. But it'll teach you a lot about how to *think* about code properly.

As the title says, this is just part 1. When I have some time (possibly later today, but we'll see), I'll get into Part 2: Duplicate Data is Evil...

Programming & Business

Date: 2011-11-04 08:06 pm (UTC)
From: [identity profile] unicornpearlz.livejournal.com
And again, you have hit on something so fundamental that it actually spans the genres from programming to business (i.e., non-programming).

Duplication is evil.

This is typically described in "the office" world that I am so accustomed to as "Don't do double work." Overlapping. Two people doing the same task to achieve the same goal... or not... two people simply doing the same task and not knowing that it's being done. It slows down productivity on a whole bunch of levels. Then it breeds apathy. 'Why should I do this if someone else is anyway?'

This is something I talk about in interviews. If a problem is recurring (duplicating) then a process should be put in place to correct the problem at the core. At which point, other duplications... and issues (hey, 'Terry' did that same solution and now these three people are doing it, but not the rest of the team) can come to light. Yay! Problems. (Not being sarcastic.) Now that problems have been identified, processes can be put into place, so that duplications can cease and productivity can increase.

If you ever burn out on programming, might I suggest a 2nd career in HR?

Re: Programming & Business

Date: 2011-11-04 08:50 pm (UTC)
From: [identity profile] unicornpearlz.livejournal.com
And this - the reading the same thing and getting two different, yet slightly overlapping ideas out of it - is what happens when managerial vs secretarial brains look at it. In a good company, both would work together to make the team stronger. But, that's a different conversation entirely.

(no subject)

Date: 2011-11-04 09:16 pm (UTC)
From: [personal profile] mindways
Very nice - I look forward to the continuation!

Duplication Is Evil

Really, that's it -- one nice simple sentence, with huge ramifications.


Ha. That was strongly in mind just last night, while signing 120+ pages of paperwork for a refinance. (*There's* something that desperately needs refactoring.)

While it is technically possible to carry it too far, it's really pretty difficult to do so -- the exceptions are at least a bit unusual.

"Too far" may be rare, but "badly" is also a pitfall of the overzealous. A perhaps-canonical example: the method which does one of seventeen different things depending on which flags get passed in, those things being related only by the fact that pieces of their internal logic overlap. All done in the name of avoiding duplication, much like the Inquisition was done in the name of promoting faith and virtue.

...and it is very focused on fixing existing code.

I'd call that a point in its favor - maintaining and modifying an existing codebase is far more prevalent in industry than in your average comp sci degree program; formally passed-down knowledge on this skill is a good thing to have around.

(no subject)

Date: 2011-11-04 10:10 pm (UTC)
From: [identity profile] hudebnik.livejournal.com

That was strongly in mind just last night, while signing 120+ pages of paperwork for a refinance.

Things for me to look forward to. (My closing is scheduled for Tuesday. Pain in the ass, but the savings will be very nice.)


My closing was scheduled for Oct. 17, but one business day earlier they discovered a problem I had told them about two months earlier, so they rescheduled it for Oct. 27. On Oct. 27, two hours before closing, I was told that they hadn't found a solution to the aforementioned problem, and closing was therefore canceled.

But to get back to your point... the previous time I tried to refinance, the deal fell through precisely because of Evil Duplication. Two mortgage company employees each had half of my dossier of paperwork, and each was waiting for me to send in the other half before they could proceed. If the responsibility had been in a single point of control, this would have been discovered and fixed much sooner.

(no subject)

Date: 2011-11-05 04:04 am (UTC)
From: [personal profile] mindways
I will admit to being slightly surprised that you're refinancing this soon after the purchase, though -- significant interest rate drop since you bought?

Not large, but not trivial: 3/8 of a percent lower, paying very little cash to do so. Naive payback time (ignoring mortgage interest deduction, future value of money, etc) is about 2-3 years, and we're planning on being here at least 10, if not 15-20, so the math makes sense.

(It could have been even better to pay points and get an even more ridiculously good rate, but we're hurting for liquidity after the renovations, and - strangely - at the time we refinanced, points wouldn't have lowered the rate a huge amount, making the marginal benefit slim.)

(no subject)

Date: 2011-11-05 12:41 am (UTC)
From: [identity profile] hudebnik.livejournal.com
Most programming courses teach it as an afterthought, if at all, which is strange because it motivates so much of the structure of programming. I mean, the evolution of computer languages has mostly been about finding higher and higher-level ways to eliminate duplication in code, and many language features are all about ways to remove duplication.

I certainly try to make this point in class. "When you find yourself writing the same thing over and over, you're doing something wrong." I say this in introducing variables, again in introducing functions, again in introducing higher-order functions, again in introducing inheritance....

(no subject)

Date: 2011-11-05 01:03 pm (UTC)
From: [identity profile] metageek.livejournal.com
<looks innocent> And again when introducing macros?

(no subject)

Date: 2011-11-05 01:19 pm (UTC)
From: [identity profile] metageek.livejournal.com
I was thinking Lisp.

macros

Date: 2011-11-05 10:00 pm (UTC)
From: [identity profile] hudebnik.livejournal.com
Sure, I would, if I ever got to teach macros. I teach Scheme to non-majors who are lucky if they get to HOFs in a semester. I teach Scheme to CS majors, but only for a third of a semester, so at best I get to mention the existence of macros. I teach C++ for more than half a semester, but there are so many difficult and necessary things to cover in C++ that I don't get to parameterized macros.

Edited Date: 2011-11-05 10:03 pm (UTC)

(no subject)

Date: 2011-11-05 04:59 pm (UTC)
From: [identity profile] doubleplus.livejournal.com
Itself a source of many sins, because it misses the fact that *sometimes*, it really is much cheaper, easier and more reliable to reinvent the wheel.

Reminds me of one of my best quips from a past job where we were designing a large class library: "It's better to reinvent the wheel than to subclass a square wheel and try to make it round." :-)

(no subject)

Date: 2011-11-08 04:00 am (UTC)
From: [identity profile] learnedax.livejournal.com
I couldn't agree more. I have recently encountered a number of head-scratching cases of seemingly bright people inventing systems where every new feature is forced to rewrite 90% of the same functionality, presumably because they've learned lots of lessons along the way but not this one.

Today, for instance, we were presented with the flexible new system that is supposed to replace some ad hoc Perl reports - now instead of writing a new 50-line script for each query, we only need to create a new DB table, DAO, service class, model, view, controller, and command-line application...
