The Forrest Conjecture
(I originally wrote this in 2003, and posted it to the comp.arch USENET group then. It generated a fair number of insightful comments which I’ve attempted to incorporate into this new version).
You might not know it, but the programs that run on your computer are actually divided into two pieces. One piece is where the instructions that your computer executes are stored. These instructions are things like “ADD” or “JUMP”. This is called the “text space”. The other piece is where the data that the program accesses is stored. This is called the “data space”.
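On a Linux system you can see this split for yourself with the `size` tool (my use of `/bin/ls` here is just an arbitrary example binary):

```shell
# `size` (part of GNU binutils) reports a binary's segment sizes:
# the "text" column is the instruction space; "data" and "bss"
# together make up the static part of the data space.
size /bin/ls || echo "install binutils to get the size tool"
```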
32-bit processors in PCs started to appear in about 1985. A 32-bit processor can address a 4GB text space and a 4GB data space. At the time, a 32-bit processor was a huge improvement over the 16-bit processors that came before, which could address only 64KB without resorting to painful tricks. 32-bit processors made it possible to run much larger programs that could process much more data than before. All was well until applications started appearing that needed to access more than 4GB of data. To solve this, AMD, and then Intel, released 64-bit processors. (Because of the way processors are designed, the next increment from 32 bits is 64 bits.)
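The 64KB and 4GB figures follow directly from the word sizes: an n-bit address can reach 2^n distinct bytes, which any shell's arithmetic will confirm:

```shell
# An n-bit address reaches 2^n bytes.
echo $((1 << 16))   # 16-bit: 65536 bytes (64KB)
echo $((1 << 32))   # 32-bit: 4294967296 bytes (4GB)
```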
Today, 64-bit processors are ubiquitous, and everyone has enough address space to do what they need. However, I claim that 64-bit processors are being pushed in one way that’s completely unnecessary: although it’s crystal clear that a 64-bit data space is critical, there’s no need at all for a 64-bit text space. A 32-bit text space would be fine even today, roughly 30 years after 32-bit processors first appeared. The reason is simple: it’s too complicated for a human, or a group of humans, to write a program that comes close to filling up a 32-bit text space. Unless humans get much smarter, this isn’t likely to change.
To prove this, I measured the total text size of every single executable and library on a large Ubuntu 16.10 server system. This size was slightly under 2GB. This means if every program and library on this system were somehow combined into one giant program, it would still fit in a 32-bit text space.
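A sketch of how such a measurement can be made (this is my reconstruction, not necessarily the exact commands I used): feed every binary in a directory to `size` and sum the text column. Extend the list of directories to cover all of the system's executables and libraries.

```shell
# Sum the text-segment sizes of the executables in /usr/bin.
# Non-ELF files (scripts etc.) make `size` complain on stderr,
# which we discard; awk skips the header line and sums column 1.
size /usr/bin/* 2>/dev/null |
  awk 'NR > 1 && $1 ~ /^[0-9]+$/ { total += $1 }
       END { printf "total text: %d bytes (%.2f GB)\n", total, total / 2^30 }'
```

Running this over /usr/bin, /usr/sbin, and the library directories gives the whole-system total.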
Notice that I’m talking about a program written by a human. Obviously you could write a program that itself generates a program of any size. I’m told that Computer Aided Design programs generate huge amounts of text space. I’m also talking about one program running on one computer. I suppose it’s possible to design a processor in which different parts of the text space are actually running on remote processors, but I haven’t seen one. I’m also not talking about programs run by an interpreter. A classical interpreter treats programs as consisting entirely of data, and, as I mention above, the need for a 64-bit data space is clear.
To be clear, I’m not seriously suggesting that somebody make a processor with a 64-bit data space and a 32-bit text space. After all, a 64-bit text space might be unnecessary but it doesn’t do any harm.
Does anybody know of any programs that require more than 32 bits of text?