Introduction

Audience

I have targeted this page at "intermediate programmers": programmers who have written several 10,000-30,000 line programs on their own and may have worked on a couple of larger projects. They probably have fewer than 5 years of experience in the industry. This page is meant to help them learn how to code programs that are an order of magnitude bigger.

In my experience, there is a disconnect between coding a 20,000 line program and coding a 200,000 line program. Techniques that work on smaller programs tend to be inadequate or inapplicable to larger ones. This is particularly true when the larger programs also run longer, consume more memory, are more complex, or all of the above.

Larger programs need a different approach in all aspects of programming: design, analysis, and project management as well as coding. There is a large body of software engineering literature that addresses almost all of these aspects. However, very few books talk about how to code 200,000 line programs. This page directly addresses that gap.

Concerns

The biggest problem in coding is correctness. A universal truth is that you are going to have bugs in the code. The larger the program, the more time it takes to identify a bug. The most important thing I will focus on is reducing the time to isolate these bugs.

Another issue that changes is maintainability. A larger program is probably going to be deployed for a longer period of time. Over this time, it will be modified and enhanced. Also, bugs will be found and will need to be fixed. This will, almost certainly, be done by people other than the programmer who initially wrote the code. The techniques described here always keep maintainability of code in mind.

Most programmers at this stage have been writing programs in one, or perhaps two, environments. It is unlikely that they have had to port a substantial amount of code between two different operating systems. And, given the situation today, it is unlikely that they have come across environments where an int is not 32 bits (or perhaps 64 bits). These pages deal with the issue of portability, where pertinent.
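To make the int-width point concrete, here is a minimal sketch of how code can state its assumptions instead of silently relying on a 32-bit int. It assumes a C11 compiler and a platform that provides int32_t; the helper names int_width_bits and add_i32_checked are hypothetical and chosen only for illustration.

```c
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>   /* int32_t, INT32_MAX, INT32_MIN */

/* Fail the build, rather than misbehave at run time, if int is narrower
 * than the code assumes. 16 bits is the minimum the C standard allows. */
_Static_assert(sizeof(int) * CHAR_BIT >= 16, "int must be at least 16 bits");

/* Hypothetical helper: storage width of int, in bits, on this platform. */
static int int_width_bits(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}

/* Hypothetical helper: add two 32-bit values and report overflow instead of
 * invoking undefined behaviour. Using int32_t keeps the range identical on
 * every platform that provides the type.
 * Returns 1 and stores the sum in *out on success, 0 on overflow. */
static int add_i32_checked(int32_t a, int32_t b, int32_t *out)
{
    if ((b > 0 && a > INT32_MAX - b) || (b < 0 && a < INT32_MIN - b))
        return 0;
    *out = (int32_t)(a + b);
    return 1;
}
```

The static assertion turns a hidden assumption into a compile-time failure on an unusual target, and the fixed-width type keeps the arithmetic range from changing when the code is ported.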

I chose to focus on C mostly because it is the language that I use the most. My work has focused on system and embedded programming - things like compilers, operating systems, device drivers, etc. In these areas, the flexibility and efficiency of C are critical to delivering good performance. Later on, I shall get into some generalities about programming for performance.

Methodology

As currently written, the page consists of a series of tips that form a more or less continuous narrative. Certain features are introduced in a particular tip that may not make complete sense till later.

Also, the quality of the tips is fairly variable. They range from mundane suggestions that can be found in many places to ideas that I have never seen written down anywhere.

Disclaimer

Many of the techniques I show are fairly idiosyncratic, and are not common practice. But keep in mind that these techniques were developed to solve specific problems encountered while developing large programs. I have tried to explain how some of them came about. When looking at a specific idea, try to ask yourself how else the problem could have been solved.

Also, most of these ideas were developed while working on programs in the 100,000-500,000 line class. They may be overkill for small programs, although they seem to be useful even there. They may also be inadequate for larger programs. I don't know, since I haven't worked enough on programs in the million line range.

