The genetic code is the programming language used to make living organisms. Right now, biologists are attempting to ‘reverse engineer’ this genetic code, in order to determine how it works and to make changes to it. I (in my infinite wisdom) think they’re going about this the wrong way.
To understand why, we must first delve into the languages of computers. When humans program computers, they usually write in “high-level languages”, which look like a series of mathematical formulas and instructions. Here is an example of a simple function to compute the nth Fibonacci number, written in a high-level language called C:
    // computes the nth fibonacci number
    unsigned int fibonacci(unsigned int n)
    {
        unsigned int term1 = 1;
        unsigned int term2 = 1;
        unsigned int result = term2;
        unsigned int i = 1;
        while (i < n) {
            result = term1 + term2;
            term1 = term2;
            term2 = result;
            i = i + 1;
        }
        return result;
    }
The computer cannot understand this program as it is written. It must first be translated into instructions that make sense to the computer. This translation process is called compiling. The compiled program will look different depending on what computer it is compiled for, but all computers basically operate the same way: they can read from and write to memory locations, and they can add, subtract, multiply, and divide numbers. Here is an English-language translation of what the compiled Fibonacci function might look like. To make things clearer, here is a mapping of variable names to their memory locations:
| Variable | Memory location |
|----------|-----------------|
| n        | 0               |
| term1    | 1               |
| term2    | 2               |
| result   | 3               |
| i        | 4               |
1. set the value in location 1 to 1
2. set the value in location 2 to 1
3. set the value in location 3 to the value stored in location 2
4. set the value in location 4 to 1
5. if the value in location 4 is not less than the value in location 0, go to instruction 11
6. set the value in location 3 to the result of the value in location 1 plus the value in location 2
7. set the value in location 1 to the value in location 2
8. set the value in location 2 to the value in location 3
9. set the value in location 4 to the value in location 4 plus 1
10. jump to instruction 5
11. return the value in location 3
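If you want to convince yourself that the instruction list really does the same job as the C function, here is a minimal sketch that steps through the eleven instructions one at a time. The names `run_machine`, `mem`, and `pc` are my own inventions for illustration, and a real compiler would emit something different; this only mimics the English listing above.

```c
#include <assert.h>

/* The high-level version, repeated here so the two can be compared. */
unsigned int fibonacci(unsigned int n)
{
    unsigned int term1 = 1;
    unsigned int term2 = 1;
    unsigned int result = term2;
    unsigned int i = 1;
    while (i < n) {
        result = term1 + term2;
        term1 = term2;
        term2 = result;
        i = i + 1;
    }
    return result;
}

/* Hypothetical toy machine: mem[0..4] stand in for memory locations 0..4,
   and each case number matches an instruction number in the listing. */
unsigned int run_machine(unsigned int n)
{
    unsigned int mem[5] = {0};
    int pc = 1;          /* number of the instruction to execute next */

    mem[0] = n;          /* the argument n lives in location 0 */

    for (;;) {
        assert(pc >= 1 && pc <= 11);
        switch (pc) {
        case 1:  mem[1] = 1;                      pc = 2;  break;
        case 2:  mem[2] = 1;                      pc = 3;  break;
        /* result = term2 in the C source; locations 1 and 2 both hold 1 here */
        case 3:  mem[3] = mem[2];                 pc = 4;  break;
        case 4:  mem[4] = 1;                      pc = 5;  break;
        case 5:  pc = (mem[4] < mem[0]) ? 6 : 11;          break;
        case 6:  mem[3] = mem[1] + mem[2];        pc = 7;  break;
        case 7:  mem[1] = mem[2];                 pc = 8;  break;
        case 8:  mem[2] = mem[3];                 pc = 9;  break;
        case 9:  mem[4] = mem[4] + 1;             pc = 10; break;
        case 10: pc = 5;                                   break;
        default: return mem[3];                   /* instruction 11 */
        }
    }
}
```

Calling `run_machine(10)` and `fibonacci(10)` should both give 89. Notice how much harder the `switch` version is to follow, even though it does exactly the same thing.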
On a simple machine, each of the above instructions would be represented by a single word, which on a modern 64-bit computer is 64 bits of information. The human genetic code is also a sequence of simple instructions, interpreted not by a central processing unit but by machinery inside the cell, such as ribosomes. Instead of changing values in memory locations, the genetic code’s instructions specify the amino acids used to build proteins. There are approximately 6 billion base pairs in the human genome (counting both copies of each chromosome). If we loosely treat each base pair as one instruction, the genome is roughly 500 million times as long as our 11-instruction Fibonacci function.
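To make the "instructions for amino acids" idea concrete, here is a rough sketch of how a stretch of DNA could be read three letters at a time, the way a cell reads codons. The table holds only a handful of entries from the standard genetic code (the real one has 64), and the names `CODON_TABLE`, `lookup_codon`, and `translate` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A small subset of the standard genetic code, written as DNA codons. */
struct codon_entry {
    const char *codon;
    char amino_acid;   /* one-letter code; '*' marks a stop codon */
};

static const struct codon_entry CODON_TABLE[] = {
    {"ATG", 'M'},  /* methionine, also the usual start codon */
    {"TTT", 'F'},  /* phenylalanine */
    {"GGC", 'G'},  /* glycine */
    {"GCT", 'A'},  /* alanine */
    {"AAA", 'K'},  /* lysine */
    {"TGG", 'W'},  /* tryptophan */
    {"TAA", '*'}, {"TAG", '*'}, {"TGA", '*'},  /* the three stop codons */
};

/* Returns the amino acid for a three-letter codon, or '?' if the codon
   is not in this illustrative subset. */
static char lookup_codon(const char *codon)
{
    size_t entries = sizeof CODON_TABLE / sizeof CODON_TABLE[0];
    for (size_t i = 0; i < entries; i++) {
        if (strncmp(CODON_TABLE[i].codon, codon, 3) == 0) {
            return CODON_TABLE[i].amino_acid;
        }
    }
    return '?';
}

/* Reads dna three letters at a time and writes one amino acid per codon
   into protein, stopping at a stop codon, the end of the input, or a full
   buffer. Returns the number of amino acids written. */
size_t translate(const char *dna, char *protein, size_t capacity)
{
    assert(dna != NULL && protein != NULL && capacity > 0);

    size_t length = strlen(dna);
    size_t count = 0;
    for (size_t i = 0; i + 3 <= length && count + 1 < capacity; i += 3) {
        char amino_acid = lookup_codon(dna + i);
        if (amino_acid == '*') {
            break;              /* stop codon: the protein ends here */
        }
        protein[count++] = amino_acid;
    }
    protein[count] = '\0';
    return count;
}
```

For example, translating `"ATGTTTGGCTAAGCT"` yields `"MFG"`: three amino acids, then the `TAA` stop codon ends the protein and the trailing `GCT` is never read.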
Notice that the machine language program is nowhere near as easy to understand as the C program. If I gave you a machine language program with 100 million lines of code, you’d have a heck of a time trying to figure out how it worked. Poking around and changing single instructions, then seeing what happens when you change them, probably wouldn’t be a good start. Unfortunately, as I understand things, that is exactly how biologists are proceeding. They are stepping through the genetic code piece by piece, using experiments to try to figure out what different sequences of instructions do.
There is, I think, a better approach. The key to understanding this approach is a simple fact about computer programming: it’s easier (and more fun) to write your own program than it is to read someone else’s program and figure out what it’s doing. Instead of trying to reverse engineer existing DNA code, which evolved over billions of years and is therefore probably extremely convoluted and hard to follow, we’d be better off trying to ‘write’ our own organisms. Biologists could start by writing DNA ‘programs’ to code simple proteins. After getting good at this, they could invent a ‘high-level language’ for DNA programming (AKA GeneticC) which could be used to speed up the development process. The next step would be to write a single-celled organism, and then more complicated organisms.
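As a very rough sketch of what the first step of a ‘GeneticC compiler’ might look like, here is a toy function that turns a protein, written as one-letter amino acid codes, into a DNA string. Everything about it is hypothetical: the names `codon_for` and `compile_protein` are mine, only six amino acids are supported, and picking one fixed codon per amino acid ignores the fact that most amino acids have several synonymous codons.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One codon per amino acid, chosen arbitrarily from the standard genetic
   code. Returns NULL for amino acids this toy example does not support. */
static const char *codon_for(char amino_acid)
{
    switch (amino_acid) {
    case 'M': return "ATG";  /* methionine */
    case 'F': return "TTT";  /* phenylalanine */
    case 'G': return "GGC";  /* glycine */
    case 'A': return "GCT";  /* alanine */
    case 'K': return "AAA";  /* lysine */
    case 'W': return "TGG";  /* tryptophan */
    default:  return NULL;
    }
}

/* "Compiles" a protein into DNA: one codon per amino acid, followed by
   the stop codon TAA. Returns 0 on success, or -1 if the buffer is too
   small or the protein contains an unsupported amino acid. */
int compile_protein(const char *protein, char *dna, size_t capacity)
{
    assert(protein != NULL && dna != NULL);

    /* three letters per amino acid, three for the stop codon, one for '\0' */
    size_t needed = 3 * strlen(protein) + 3 + 1;
    if (capacity < needed) {
        return -1;
    }

    size_t out = 0;
    for (const char *p = protein; *p != '\0'; p++) {
        const char *codon = codon_for(*p);
        if (codon == NULL) {
            return -1;
        }
        memcpy(dna + out, codon, 3);
        out += 3;
    }
    memcpy(dna + out, "TAA", 3);
    out += 3;
    dna[out] = '\0';
    return 0;
}
```

Compiling the protein `"MKW"` this way gives `"ATGAAATGGTAA"`. A real high-level language would of course have to describe far more than a list of amino acids, but the idea is the same as with C: write something readable, and let a tool produce the low-level instructions.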
Interestingly enough, this process of building our own organisms could provide a lot of insight into the question of whether there is some being who designed us. Once we have a better understanding of genetic programming, we could look at the quality of the code that builds human beings. If it’s clean, neat, and straightforward, that would be strong evidence of a creator at work. If it’s messy, garbled, hacked together, redundant and nonsensical in parts, we could conclude that the code evolved over time. Either that, or god is a Perl scripter.