Could Bill Gates write code? (theregister.co.uk)
83 points by cavedave on June 7, 2011 | 54 comments


Programmers working with modern compilers and tools and in what are comparatively resource-rich environments often haven't had to bum instructions and bum bytes.

In (most) modern environments, the optimization is for maintainability, supportability, and speed of development, not for memory.

All sorts of bizarre constructs can be commonplace when you are looking to stuff an application into a tiny ROM or a tiny RAM; into the 32 KB of main memory that was typical of a number of systems years ago, for instance. This includes using instructions as data storage for constants, modifying the return pointer on the call stack, exploiting instruction side effects, and so on.

When you're at the edge of needing a bigger (and then far more expensive) ROM or falling out of available memory or needing to switch over to overlaid segments or other run-time hackery, getting a dozen bytes back somewhere could be a big deal.

Some of these coding techniques are still appropriate for, and still show up in, very tight loops and embedded applications, but they're far less common now.

And thankfully, cheap "large" physical memory and virtual memory means we seldom have to deal with anything like TKB and its overlays:

http://wjh.conflux.net:16080/RSTS-V9/documents/DEC/AA-5072C-...


I recently had to do this for an embedded processor. 1920 bytes to fit some controller code in. It was a ton of fun, to the point where one of my cow-orkers found himself compelled to help out. I think we're at around 20 bytes free right now.

"I managed to save three bytes here."

"My God, you're a sick person. I love it."

Then I went back to my "day job" data mining a couple dozen terabytes of stuff in a database.

I love working with computers.


This is called "code golf". You try to get the final program with the minimum of "shots" (bytes, lines, tokens or characters). It's amusingly addictive.

And I really hope you misplaced that hyphen in "cow-orkers".


Scott Adams popularized this construction, which suggests that one's co-workers are likely to ork cows (but he didn't originate it: http://en.wikipedia.org/wiki/Scott_Adams#Coined_phrases).

EDIT: As far as I know, the definition of "ork" as a verb is left to the imagination.


I wish I could do more embedded stuff. I desperately need more free time.


Even today, squeezing your code and data so it fits within your processor's L1 cache can give you a very significant performance boost.

When I was developing educational software for Apple II computers in the mid-'80s, one program we did had, at a given moment, 43 and a half bytes of free memory in the whole machine. The half byte is there because we were using that byte, but the counter wouldn't exceed 12, so we had 4 bits we could use for something else.

But my most impressive feat there was a graphics window-overlay library that used RLE to store obscured regions and was implemented in less than 1K of 6502 code. If you didn't want to preserve color information, it would shave one bit off every byte of screen data (increasing RLE efficiency).
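For anyone curious, here's a rough C sketch of the run-length idea (the original was 6502 assembly; treating the discarded bit as the Apple II hi-res palette bit, bit 7, is my assumption):

  #include <stddef.h>
  #include <stdint.h>

  /* Minimal run-length encoder. When keep_color is 0, the high bit of each
     screen byte (assumed here to be the hi-res palette bit) is masked off
     first, so runs get longer. dst must hold up to 2*n bytes. */
  static size_t rle_encode(const uint8_t *src, size_t n,
                           uint8_t *dst, int keep_color) {
      size_t out = 0;
      for (size_t i = 0; i < n; ) {
          uint8_t v = keep_color ? src[i] : (uint8_t)(src[i] & 0x7F);
          size_t run = 1;
          while (i + run < n && run < 255) {
              uint8_t w = keep_color ? src[i + run] : (uint8_t)(src[i + run] & 0x7F);
              if (w != v) break;
              run++;
          }
          dst[out++] = (uint8_t)run;   /* run length, 1..255 */
          dst[out++] = v;              /* byte value */
          i += run;
      }
      return out;                      /* encoded size in bytes */
  }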


Bill Gates wrote code very well, and was an exceedingly good programmer.

I know, because I used to sit with him late at night at the Harvard CRCT PDP-10 consoles (graduate research center in computing technology), ribbing him about hacking on such silly hobbyist computers as 8008's and 8080's. (He was working on his 8008 assembler/linker/simulator which he used to write the Altair Basic before he ever saw the hardware. Worked the first time he tried it on the real thing.)

He and I also had the same fate as undergrads (I was '76, he was '77): we knew enough CS that the undergraduate courses at the time (fairly underdeveloped) were too Mickey Mouse, so we took only grad CS courses (which were good even for their time). And he did well in those courses.

So his brilliance and his skill aren't in question.

Nor are his drive and competitiveness--those were obvious even back then. He was a serious player in the Currier House poker (bridge?) tournaments that would go on for days and involve many $K pots. (Way over my head.)


This reminds me of the following piece from http://www.sorehands.com/humor/real5.htm

"Allegedly, one Real Programmer managed to tuck a pattern-matching program into a few hundred bytes of unused memory in a Voyager spacecraft that searched for, located, and photographed a new moon of Jupiter."

BTW, I would be very grateful if someone could verify this.


It appears that the story is first mentioned in a letter by Ed Post of Tektronix to the editor of Datamation. A transcript can be found here:

http://www.ee.ryerson.ca/~elf/hack/realmen.html

Every other instance of the quotation seems to reference that letter (if anything), and the letter appears to contain no further references.


Nice story. People still do this stuff, by the way, for example in 4K democoding. Here's how these guys put an entire world (plus music) in 4 kilobytes:

http://www.pouet.net/prod.php?which=50063 ('download' for the Windows executable)


Gates was an extremely talented programmer. There's a great chapter on him in "Programmers at Work" - http://www.amazon.com/Programmers-Work-Interviews-Computer-I...

Joel Spolsky also mentions his talents in some of his old posts.


The BillG Review article:

http://www.joelonsoftware.com/items/2006/06/16.html

"...a person who came along from my team whose whole job during the meeting was to keep an accurate count of how many times Bill said the F word. The lower the f*-count, the better."


Thanks, that's exactly the post I was thinking of.


>if that bit of code was small enough (ie one or two bytes) you could simply encode those one or two bytes inside a two or three byte instruction thus saving the three-byte instruction needed to jump over it.

The 8080 and Z80 had a relative jump, which took only 2 bytes. It had a more limited jump range, but it was definitely wide enough to jump over 3 bytes.

>All that... for three bytes.

No, two.


If I recall correctly, the opcode for the absolute jump was C3, and the conditional versions were C2, CA, etc. The destination was two bytes, 16 bits, little endian.

The JR - jump relative unconditional - was hex 18 followed by the number of bytes to jump, calculated from the byte after the JR instruction to account for the pipelining.

From memory - I could be wrong, but it's close enough.
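For illustration, a tiny C sketch that just spells out the byte counts of the two encodings described above (hand-encoded byte values, not an emulator; the specific addresses are made up):

  #include <stdio.h>

  int main(void) {
      /* JP nn (absolute, unconditional): opcode C3 followed by a 16-bit
         little-endian address -- 3 bytes total. */
      unsigned char jp_abs[] = { 0xC3, 0x34, 0x12 };   /* JP 0x1234 */

      /* JR e (relative, unconditional): opcode 18 followed by a signed
         displacement counted from the byte after the instruction -- 2 bytes. */
      unsigned char jr_rel[] = { 0x18, 0x02 };         /* JR +2 */

      printf("JP: %zu bytes, JR: %zu bytes, saved: %zu\n",
             sizeof jp_abs, sizeof jr_rel, sizeof jp_abs - sizeof jr_rel);
      return 0;
  }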


It's a pity that people think these sorts of tricks are amazing in any way. This is the sort of stuff you have to do when you have a very small amount of available memory and need to make something as powerful as possible.

You are likely to still see the same sort of stuff going on when working with microcontrollers. Now, if you want to read about something really cool, read about drum memory and optimizing code so that instructions are ready for execution when the drum they are on has rotated to the right spot: http://www.columbia.edu/cu/computinghistory/650.html


The only reason you're able to make a statement like "This is the sort of stuff you have to do when you have a very small amount of available memory..." is because people like Gates came up with it first. It's only common knowledge now because the trailblazers made it so.

The real pity is not appreciating the incredible creativity of those who laid the groundwork for everything we take for granted today.


That's not correct.

I have been programming for a very long time and have been involved in all sorts of nasty tricks to fit things in memory (self modifying code, code that uses subroutines from the OS to save having duplicates in its own base, code that relocates itself while running, storing tiny amounts of code in 'free space' inside the BIOS, temporarily storing code in screen memory because there's nowhere else to go and you hope the user won't notice the funny image on screen, etc.)


Not to make light of his accomplishments, but Bill Gates is only 55: a mere whippersnapper compared to some of the greybeards many of us have worked with over the years. Gates blazed many a trail, but this wasn't one of them.


This was also the reason floppy disks had interleaved sectors: so that the CPU had some time to empty the read/write buffer.

http://en.wikipedia.org/wiki/Interleave#Interleaving_in_disk...


but... "You never know where it's going to put things," he explained, "so you'd have to use separate constants."

http://www.pbm.com/~lindahl/mel.html


This is a great quote: "I have often felt that programming is an art form, whose real value can only be appreciated by another versed in the same arcane art; there are lovely gems and brilliant coups hidden from human view and admiration, sometimes forever, by the very nature of the process. You can learn a lot about an individual just by reading through his code"


I found the proposition absurd: Why would you suspect Gates to be less of a coder than, say, Page, Brin, or Zuckerberg?


Someone is doing technical historical research by disassembling a piece of software written by Bill Gates. They found the software used numerous clever hacks to stay small. What proposition are you referring to?


From the article:

"'Could Bill Gates Write Code?' Or was he merely the luckiest man alive,..."

Maybe I'm wrong, but I found the title kind of linkbait-ish, trying to ride the prevalent anti-Gates sentiment. I'm no Gates fan, but to doubt his coding skills or technical prowess either shows you're ignorant or just young enough not to know. I use and like a Radio Shack Model 100 (http://en.wikipedia.org/wiki/TRS-80_Model_100) - a dinosaur, but the battery life on that thing is awesome, and it works with standard AA batteries, too. It's agreed to be the last piece of hardware that has Gates' code on it.

I, too, thought he was just a corporate raider (like our managers, sigh), but I got corrected many times by friends who used to work at Microsoft Research, who said his assessment and knowledge of new technology was spot on.


I think I still have one of those... now I really want to dust it off and play around with it.


I thought it was junk but then found a vibrant community of users online. Now I use it for word processing, great for focusing, like WriteRoom on Mac. Also get some interesting looks. "Retro all the way baby" :-)


The headline probably.


This doesn't mean Bill could write code. This means either Bill Gates, Paul Allen, or both could code.

I'm inclined to say Bill could code, just because of his background and upbringing, but being a developer on a joint project doesn't say much.


The more relevant conclusion is that Gates/Allen could hack.


Apparently this sort of thing was relatively common. In the late 60's my aunt worked for MetLife as a programmer and would routinely hide data in execution code to save space. When she became pregnant they put her on 'leave' but refused to reinstate her after my cousin was born. Unfortunately for them only she knew about the hidden data and they were forced to hire her back as a consultant, for much more money, when things inevitably broke.


Another similar story, from my interview with Guy Steele in Coders at Work:

This may seem like a terrible waste of my effort, but one of the most satisfying moments in my career was when I realized that I had found a way to shave one word off an 11-word program that Gosper had written. It was at the expense of a very small amount of execution time, measured in fractions of a machine cycle, but I actually found a way to shorten his code by 1 word and it had only taken me 20 years to do it.

Seibel: So 20 years later you said, “Hey Bill, guess what?”

Steele: It wasn’t that I spent 20 years doing it, but suddenly after 20 years I came back and looked at it again and suddenly had an insight I hadn’t had before: I realized that by changing one of the op codes, it would also be a floating point constant close enough to what I wanted, so I could use the instruction both as an instruction and as a floating point constant.

Seibel: That’s straight out of “The Story of Mel, a Real Programmer.”

Steele: Yeah, exactly. It was one of those things. And, no, I wouldn’t want to do it in real life, but it was the only time I’d managed to reduce some of Gosper’s code. It felt like a real victory. And it was a beautiful piece of code. It was a recursive subroutine for computing sines and cosines.

So that’s the kind of thing we worried about back then.


Not really related, but Gosper's algorithm (http://en.wikipedia.org/wiki/Gospers_algorithm) is the thing I've learned in the last few years that blew my mind the hardest. It's how Maple or Mathematica can reduce your crazy sums to a single formula. Earlier work was apparently pioneered by a nun: http://en.wikipedia.org/wiki/Mary_Celine_Fasenmyer


I find this a bit of a moot point. Could Bill Gates really code? Yeah, probably. Does it really matter? Not as much as we like to believe.

Right time, right place, right people on the team and whatever other circumstances you might want to describe as "luck" play a much bigger role than people like to admit because it would diminish their own accomplishments.

Also, his and Microsoft's success has less to do with how much or how well Mr Gates could code; just think of Steve Jobs and many others. Good techs don't necessarily make for successful CEOs - quite the opposite is just as plausible, if not likely, considering how techs typically detest politics.


“Right time, right place, right people on the team and whatever other circumstances you might want to describe as "luck" play a much bigger role than people like to admit because it would diminish their own accomplishments.”

True, but equally true is that initiative, experience, domain knowledge, persistence and hard work play a much bigger role in outsized success than most people would like to admit because it would diminish their own accomplishments.


I think coding at this level -- successfully, and well -- reflects a kind of relentlessness that was essential.

Gates had other qualities that contributed to Microsoft's success: The desire and ability to focus on getting people to do things and on business, etc.

But his coding work demonstrates a focus, an intensity and persistence, that was instrumental.

I'm not particularly fond of some of his and Microsoft's business techniques. But he damned well executed them, relentlessly. As I see it (from afar), he was, and is, never one for half measures -- not with regard to his true interests.


And he came at the right time to the right place - with an operating system for a new platform that would prove to be more widespread than all platforms before.


I think the example provided was valid but it's bad, unintuitive code. Sort of showboating.


You're missing the point. Such "tricks" were essential to fit the required code into the necessary space. The same thing still happens (although rarely) when fitting required functionality into limited devices such as FPGAs and PICs.

You're not aiming for readable, maintainable code. You're trying to get the cheapest device, and then squeezing the essential into what little space you get. Such "tricks" as jumping into the middle of instructions are unavoidable.


Yeah, tricks like that are still done for maximum speed, for things like fitting as much data as possible into structs/classes.

Like assuming (or enforcing) that items are allocated at 4-byte-aligned addresses, so that you can mask off the low two bits of a pointer and use them for storing flags.

Similarly, you can pre-allocate tree nodes (left and right children) in a contiguous array and then store only a single pointer to the left node; the right node will be at the address of the left node + 1, so you save 8 bytes in the struct/class.

This allows more items into the processor cache line.
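A minimal C sketch of both tricks, with hypothetical names (the 4-byte alignment and the side-by-side child layout are exactly the assumptions being exploited):

  #include <assert.h>
  #include <stdint.h>
  #include <stdlib.h>

  /* Hypothetical node type, just for illustration. */
  typedef struct Node {
      int key;
      struct Node *left;            /* right child is implicitly left + 1 */
  } Node;

  /* Pointer tagging: with nodes aligned to at least 4 bytes, the low two
     bits of a Node* are always zero and can carry flags. */
  static Node *tag(Node *p, unsigned flags) {
      assert(((uintptr_t)p & 0x3) == 0 && flags < 4);
      return (Node *)((uintptr_t)p | flags);
  }
  static Node *untag(Node *p)        { return (Node *)((uintptr_t)p & ~(uintptr_t)0x3); }
  static unsigned get_flags(Node *p) { return (unsigned)((uintptr_t)p & 0x3); }

  int main(void) {
      /* Implicit layout: both children are pre-allocated side by side,
         so only one pointer is stored and right = left + 1. */
      Node *kids = calloc(2, sizeof *kids);
      Node root  = { 42, kids };
      Node *right = root.left + 1;          /* no second pointer needed */
      (void)right;

      Node *tagged = tag(&root, 1);
      assert(untag(tagged) == &root && get_flags(tagged) == 1);
      free(kids);
      return 0;
  }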


They had enough spare bytes to not have to pull that trick. It'd have been better to clean up some algorithms somewhere or reuse some code.

Prime example: COS/SIN = same operation with a 180° phase shift. I've seen BASIC implementations with two separate implementations...

Note: I've worked with very memory constrained systems in assembly before (actually hand assembled code on paper as well).


  > They had enough spare bytes to not have to pull
  > that trick. It'd have been better to clean up some
  > algorithms somewhere or reuse some code.
I assume from your clear and unequivocal statement that you have first hand knowledge of that. I'll bow to your better information.

  > Note: I've worked with very memory constrained
  > systems in assembly before
As have I.

  > (actually hand assembled code on paper as well).
That's how I started, although largely I ended up writing directly in machine code since it was quicker, and after I found my third bug in the assembler I had occasional access to, I gave up on writing mnemonics at all. It was only much later, when I had other people to communicate with, that I went back to writing assembly.

And I remember writing code that really, really needed to do things like jumping into the middle of instructions.


Sorry, should always "cite your sources": http://web.archive.org/web/20011211233332/www.rjh.org.uk/alt...

Total assembled output was 3826 bytes, so they had a few bytes left, including the RAM used to switch in the tape loader.


Thank you - useful. Perhaps they really didn't need to use that specific trick on that specific occasion. It might be interesting for someone to comb through and find out if they used it lots of times, or just a few. Perhaps they simply got into the mindset and used it because they could, in anticipation of needing it.

Certainly that sort of thing is easier to do first time round, rather than having to go round again to find bytes when you find later that you need them. It becomes a habit, much like these days it's a habit to lay out code clearly, name variables carefully, and comment tricky code.


A possibility, and a fair one! I might write something to scan through looking for jumps that jump inside opcodes (added to the list of projects I will do one day).


"They had enough spare bytes to not have to pull that trick"

Hogwash. When you're writing an interpreter for a low-specced system you have no truly spare bytes, because every one you take is one that programs running on your interpreter cannot have.


Actually the phase shift between the cosine and sine is 90 degrees, not 180.


Very true - I stand corrected.
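To make the correction concrete, a rough C sketch of the shared-table idea: one sine table, with cosine read a quarter period (90°) ahead. Table size and names are illustrative.

  #include <math.h>
  #include <stdio.h>

  #ifndef M_PI
  #define M_PI 3.14159265358979323846
  #endif

  #define N 256                    /* table covers one full period */

  static double table[N];

  static void init_table(void) {
      for (int i = 0; i < N; i++)
          table[i] = sin(2.0 * M_PI * i / N);
  }

  /* One table serves both: cosine is the same table read a quarter
     period (90 degrees) ahead. */
  static double sin_lut(int i) { return table[i & (N - 1)]; }
  static double cos_lut(int i) { return table[(i + N / 4) & (N - 1)]; }

  int main(void) {
      init_table();
      printf("%f %f\n", cos_lut(10), cos(2.0 * M_PI * 10 / N));  /* should match */
      return 0;
  }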


Only if the engineer picked the wrong part for the job...


That doesn't make sense to me at all. Sure, pick a larger part, pay more money, and don't squeeze your code. It's a decision to be made, but don't label the decision for a smaller part as "wrong" until you know the trade-offs being made.

When you talk about huge volumes, that cent or two for the part can make a significant difference. Really, it can. I've been there. I've been part of a team that commissioned silicon and had to contribute to the decision about how much programming space to have. Compromises like this can make a very large difference to the economics.


I think most people these days tend to just play suppliers off against each other until the price is right, if the volume is enough. Farnell bend 10% instantly if you mention RS, for example. *choke* *innocent whistle*


It seems appropriate for its environment. If you write code like that today, it wouldn't be.


True. I think this is one of the saddest things about our history - the fact that hardware limitations thwarted good explanatory coding styles for so long.


Yes, but does Altair BASIC still work on the Pentium 4 and its trace cache? I'm afraid this trick may not have been any more future-proof than self-modifying code.

Still, nice.



