Archive for the ‘Programming’ Category

Simplicity vs Speed

Thursday, July 9th, 2009

This is a post about programming practices and design in general. Even if you’re not a programmer, you should read this and let me know what you think.

Consider these two C++ classes:

class SimpleFrobulotron
{
  public:
 
	SimpleFrobulotron(const string &name);
	string getName();
 
  private:
	string m_Name;
};  
 
class FastFrobulotron
{
  public:
 
	FastFrobulotron(const string &name);
	void getName(string * outName);
 
  private:
    string m_Name;
 
};

The main difference between the two designs is simple: The SimpleFrobulotron is easier to understand and use, but somewhat slower than the FastFrobulotron. For example:

void printFrobulotronNames()
{
	SimpleFrobulotron simple("tweedle dee");
	FastFrobulotron fast("tweedle dum");
 
	//display the simple frobulotron's name
	cout << simple.getName() << endl;
 
	//display the fast frobulotron's name
	string fastName;
	fast.getName(&fastName);
	cout << fastName << endl;
 
}

Is FastFrobulotron really that much faster than SimpleFrobulotron? A test on my machine shows that FastFrobulotron runs in 27% of the time that SimpleFrobulotron runs in. So it is clear that there’s a trade off here: Simplicity versus Speed. Code that squeezes every last bit of performance out of the machine is harder to understand. What side of this trade off do you favor, and why?

In this case, I think the answer is obvious: returning a string from a function is much slower than assigning it through a pointer. This is a simple enough optimization that most good programmers should be able to look at the code and understand what’s going on. Therefore it’s best to make use of it. Things can get much more complicated than this simple example, though.

Writing code that does what it’s supposed to do is something most programmers can do (unless you give them the halting problem.) Writing code that’s easy to understand, however, is not. Neither is writing code that is as efficient as possible. The problem is that these two goals (simplicity and efficiency) seem to be at odds with each other. How do you balance between the two?

Another Bad Design

Thursday, July 2nd, 2009

I use iTunes to manage the music on my iPhone. I went to add a bunch of music into my library, and a window popped up saying it was “Adding files.” There was a “stop” button, and a progress bar. This progress bar started off with a blue cursor on the left side, and that blue cursor gradually moved to the right. I presumed that when the blue cursor reached the right edge of the bar the work would be done.

I was wrong. Once it got to the right edge of the bar, the cursor just wrapped around and appeared at the left edge again. I understand the point of having some sort of animated graphic to indicate that work is being done, but why a progress bar? A progress bar should only be used if it shows the extent to which the work has been completed. To indicate … progress. A better design would be to have some kind of spinning wheels or gears, to make it clear that you have no idea how long you’re going to be waiting until you can listen to “I Wanna Sex You Up” by Color Me Badd. If, you know, you wanted to listen to that song.

What Does This Mean?

Monday, May 18th, 2009

Do you know what this means?

htons

No Cheating via Google! Click for the answer … (more…)

Removing SVN Bindings with Python

Saturday, May 16th, 2009

I had an SVN repository on my website that I was using to store a personal project of mine.  I wanted to move the files into a different SVN repository on another website. Doing this required that I first remove all of the .SVN folders in my project. It’s got a bunch of folders, so going through manually would have been a big pain. Besides, as a self-respecting geek, I’d definitely prefer spending twice as long writing a script to automate a boring process over just doing it manually. My script, which is based upon this script here, generates a batch file that you run to clean the directories out. I would have done it directly from python, but I kept getting ‘this directory could not be found’ exceptions, probably becuase of the spaces in my folder names.

The script is below, for your edification. It only works on windows, but it could easily be modified to run on a UNIX based system. It is presented without any warranty, express or implied, not even that of suitability for a specific purpose.

import os
import sys
 
def findFileGenerator(rootDirectory, acceptanceFunction):
  for aCurrentDirectoryItem in [ os.path.join(rootDirectory, x) for x in os.listdir(rootDirectory) ]:
    if os.path.isdir(aCurrentDirectoryItem):
		if acceptanceFunction(aCurrentDirectoryItem):
			yield aCurrentDirectoryItem
		for aSubdirectoryItem in findFileGenerator(aCurrentDirectoryItem, acceptanceFunction):
			yield aSubdirectoryItem
 
def acceptSVNChildren(theFileOrDir):
	return theFileOrDir.find('.svn') != -1
 
def removeSVN(rootdir):
	"""recursively removes the SVN bindings from the directory specified in the argument"""
 
	for dude in findFileGenerator(rootdir,acceptSVNChildren):
		if os.path.isdir(dude):
			print "rmdir /S /Q", '"'+dude+'"'
 
def main():
	"""Removes the SVN bindings from the commandline dir specified in the argument.
	If no argument is specified, the root directory is used."""
 
	dirToClean = os.getcwd()
	if(len(sys.argv) == 2):
		dirToClean = os.path.join(os.getcwd(),sys.argv[1])
	elif (len(sys.argv) &gt; 2):
		print "Usage: remove_svn.py &lt;dirToClean&gt;"
 
	removeSVN(dirToClean)
 
if __name__ == "__main__" : main()

Confessions of a Programmer

Friday, February 13th, 2009

I can’t talk about my job too much because everything we do is TOP SECRET, but I thought I’d make a few confessions / thoughts about programming.

  1. I absolutely love writing code. Probably moreso than is healthy.
  2. I spend more time than I should formatting my code so it looks pretty. I really enjoy reading code with syntax highlighting, lots of comments, and good indenting.
  3. I dislike the ternary  "<expression> ? <true-value> : <false-value>" operator. I think it obfuscates code and makes debugging more difficult. That said, I’m guilty of using it myself because I think it looks cool.
  4. I used to consider myself a ‘very good’ programmer, but I’ve learned a lot at my new job, and the more I learn, the more I realize there is to learn. Programming is like knowledge in general that way. I think this phenomenon (beginning programmers overrating their skills) is a great example of the Dunning-Kruger effect at work.
  5. I get a huge rush when I write a lot of code and it works on the first try, especially if the code involved pointer arithmetic or a nontrivial algorithm.
  6. I think programming is a great way to gain self confidence. There’s nothing like completing a project that sounded terrifying when it was first proposed to you.
  7. Python used to be, far and away, my favorite language, but I’m starting to think C# is really cool, largely because I like the .NET libraries and the Reflection capabilities.   If I do go back to using Python on a regular basis, I’ll probably switch to IronPython so I can still use . NET.   I still think Python is better for manipulating strings, largely because of the splicing syntax, but I rarely have to do that. I’ve never extensively used the pickle module in Python, but from what I remember, it was more difficult to use than .NET’s XML Serialization. Admittedly, the XML serialization syntax in .NET is kind of goofy for serializing recursively defined classes like Trees, and it can’t handle a class like a Graph without some carefull planning, but it really is cool to add a few attributes, feed your obejct to a BinaryWriter, and have the data easily persisted to the hard drive or sent across the wire.
  8. I hate the const keyword in C++ but I think it’s necessary for writing good code. The reason I dislike it is that it spreads like a virus: you add the const keyword to one function, and soon you have to add const to all the inputs to that function and then add const to the output, and const soon  your whole  const class, all of its const clients, and some other random const code you wrote is peppered with const keywords.  I like C#’s implementation of the out keyword much, much more.
  9. That is all for now. I have more code to write :)

On GPS Navigation Unit Mapping Algorithms

Wednesday, January 21st, 2009

I currently live in Chapel Hill, North Carolina. I grew up, however,  in Cincinnati, Ohio and a couple of times a year I make the drive back there.  The best route takes me through West Virginia on I-77 to Charleston, but after that it’s really up in the air as to what the best way to get to Cincinnati is. I’ve experimented with a number of routes, and after consulting my GPS unit, Google Maps, and AAA, I think I’ve come up with the best possible route.  For those interested, it is OH-32 to US-35 to US-34 to I-64 to I-77 to I-74 to US-52 to US-421 to I-40 to NC-54.

Why was I able to find a better route than my GPS unit, into which probably went thousands of man-hours of labor? My guess is that the algorithm used by the GPS unit is not theoretically optimal.  The reason for this is simple: the country is  huge.   There are about 6 million miles of paved highway in the US. Assume we have 10 intersections for every paved mile of road (a very conservation assumption): then we have 60 million intersections. A graph with 60 million vertices has at least 60 million edges, but probably closer to 120 million if it’s connected at all (60 million edges would just be a straight line, which is obviously wrong.)  A very good shortest path algorithm running on a graph with E edges and V vertices takes runs in time roughly proportional to E * log V.  Plugging in 120 million for E and 60 million for V, we get a number of computational operations on the order of 2 billion.
I have a Garmin c340, but I couldn’t find data on its microprocessor. Therefore well assume we are using a Tom-Tom 720, which has a 400 Mhz microprocessor inside. This means it undergoes one clock tick approximately (1 MHz = 1048576 Hz) 400 million times a second.  Each operation probably takes on the order of 100 machine instructions. If we assume perfect pipelining in the microprocessor and 0 cache misses ever,  we get 1 instruction per tick. That’s 200 billion instructions, at 400 Million instructions per second, for 500 seconds processing time on our GPS unit . 8 minutes would be a long time to wait for a route, and that’s with a number of conservative assumptions.  Once you take memory latency into effect, you’re probably looking at 16-20 minutes to find a route.

Your little GPS unit inside your probably employs a heuristic that favors taking interstates over back roads, among others.  It almost definitely also uses the fact that the graph of the roads embedded on the surface of a sphere.  It sure would be interesting to figure out what kind of algorithms they use inside that little thing.  Google Maps is an entirely different story- they have much faster processors and can probably find paths in parallel. What sort of parallel algorithms for finding shortest paths exist? Could you cache frequently used route portions? How would that work ? Just something to think about.

Credo

Thursday, August 14th, 2008

I believe in myself.  I believe that the mind has incredible power over the body, and that eating breakfast every day is good for you.  I believe in small miracles. I believe in the power of free markets to improve the human condition. I believe that human history is a story of progress, and that the future will be better than the past. I believe that family is the single most important thing there is on this planet. I believe in a thing called love, but I don’t really know what it is or how it works. I believe that peace will  come to the world, albeit gradually.  I believe low pocket pairs are rarely worth the trouble they can cause you.  I believe that strawberry jelly is superior to grape jelly, especially vis a vis peanut butter and jelly sandwiches. I believe a first kiss should be as romantic as possible. I believe in following big dreams, no matter how likely you are to obtain them.  I believe in opening the door for others. I believe that not all cultures are ‘equal’ in any real sense, and that the values a culture holds effect that culture’s economies and social freedoms.  I believe in taking responsibility for my actions, and ensuring that others do the same. I believe a man should take his fate into his own hands, whether or not he has the ability to do so. I believe that life is 10% what you’ve been given, and 90% what you make of what you’ve been given.  I believe in preventative maintenance.  I do believe Jones Soda to be absolutely delicious.  I believe in making small differences, especially when they don’t seem to matter at all.  I believe that, somehow, we are all one. I believe that sometimes, the only solution to a problem is violence. I believe that game theory is one of the most underutilized mathematical constructs. I believe that P will eventually be found to encompass all of NP, but that NP-Complete problems will still be effectively intractable. I believe python to be an amazingly useful language. I believe that computer programs are beautiful in of themselves, regardless of what they do.  I believe that the scientific method has been phenomenally successful  in helping us divine the nature of our world. I believe organized religions like modern Christianity have spread good works all over the world. I believe Islam is fundamentally incompatible with a Free, Democratic society. I believe in challenging common wisdom: Can you really boil a frog to death by gradually increasing the heat? I sincerely doubt it. I believe that saying ‘please’ and ‘thank you’ make a small difference in someone’s day. I believe that small differences can make a big difference. I believe that voting is irrational. I believe it’s OK to be irrational every now and them.  I believe the world was created by a phenomenally intelligent entity, as an act of supreme love.  And in the end, I believe that everything will work itself out.

On the Stability of Computer Programs

Saturday, February 9th, 2008

If something in your car breaks, most likely the car will still work, albeit with reduced functionality. If you randomly flip a bit in a computer program, however, the odds that the thing will run at all are next to zero. I tried this as a kid and realized quite quickly I couldn’t use notepad to edit executable files. My little mind simply concluded that computer programs are inherently fragile and subject to breaking. It amazed me, then that our home computer could run for years without suffering any real problems. How is this possible?

There are maybe 1000 parts of a car that absolutely have to function in order for the car to still be drivable. Computer programs have far more ‘parts’ than cars do – a simple 10 mb executable file contains roughly 1o million ‘parts’, all of which have to work properly in order for the executable to function correctly. If one of those parts breaks, the entire program stops working. Shouldn’t that mean that a computer program is far more likely to break than a car?

 

The reason computer programs are so robust is that the probability that a random bit will get flipped is basically zero. Error-Checking codes are put in place to ensure that, whether in transmission or storage, the probability of an error occurring is vanishingly small. In practice, therefore, the far more ‘fragile’ computer programs are more likely to keep on running than the far simpler, ’stronger’ cars. There are many 40 year old computer programs that still run today, because it’s cheaper to emulate the old hardware than it is to redevelop the software. How many 40 year old cars are still running on a day-to-day basis? It’s really quite amazing when you think about it.

 

 

Very True

Sunday, February 25th, 2007

I had breakfast with my friend Sara this morning.  We were talking about school when she said:

“Whenever you complain about math or physics, it seems like ultimately your complaint is that they are not computer science.”

What can I say? I guess I know what I like.

Another Victory

Sunday, February 18th, 2007

I brushed off the old PHP skills and wrote a script to fix the database, so now older posts will accurately display the correct number of comments.

Programming makes me happy. It’s interesting to note that my programming language of choice (Python), has most likely influenced the way I tend to go about solving problems. Compiling and running a python script is so quick, I got in the habit of making small changes, then running the program to verify that those changes worked. That’s really easy to do when compiling your program takes all of half a second, but when you’re writing a program that mixes C, C++, and LISP, and you need to compile the program, then install a new version of that program on a server using ssh, you tend to write larger chunks of code before you test them.

Speaking of writing complicated programs in a bunch of languages, I find myself reflecting on my time at my old job more positively now that I’ve put some distance between then and now (and that I’ve updated the resume.) I absolutely hated going in to work every day, I admit, but there was some cool technology in use there. I like the new job and feel like I’ve got a lot more of a future there, but I’d love to start my own company, and do a proper job of what those guy were doing.

Also, I put up a ‘projects‘ page, with a link to my first project. Check it out.