The Common Gateway Interface (CGI) specification lets Web servers execute other programs and incorporate their output into the text, graphics, and audio sent to a Web browser. The server and the CGI program work together to enhance and customize the World Wide Web's capabilities.
By providing a standard interface, the CGI specification lets developers use a wide variety of programming tools. CGI programs work the magic behind processing forms, looking up records in a database, sending e-mail, building on-the-fly page counters, and dozens of other activities. Without CGI, a Web server can offer only static documents and links to other pages or servers. With CGI, the Web comes alive: it becomes interactive, informative, and useful. CGI can also be a lot of fun!
In this chapter, you'll learn about the fundamentals of CGI: how it originated, how it's used today, and how it will be used in the future.
Browsers and Web servers communicate by using the Hypertext Transfer Protocol (HTTP). Tim Berners-Lee at CERN developed the World Wide Web using HTTP and one other incredibly useful concept: the Uniform Resource Locator (URL). The URL is an addressing scheme that lets browsers know where to go, how to get there, and what to do after they reach the destination. Technically, a URL is a form of Universal Resource Identifier (URI) used to access an object using existing Internet protocols. Because this book deals only with existing protocols, it calls all URIs URLs and doesn't worry about the technical hair-splitting. URIs are defined by RFC 1630. If you're interested in reading more about URIs, you can get a copy of the specification from http://ds.internic.net/rfc/rfc1630.txt.
In a simplified overview, six things normally happen when you fire up your Web browser and visit a site on the World Wide Web:
Chapter 3, "Designing CGI Applications," looks at these steps in more detail. For now, you need to know how the server responds. You ask for a URL; the server gives you the requested document and then disconnects. If the document you get back has links to other documents (inline graphics, for instance), your browser goes through the whole routine again. Each time you contact the server, it's as if you'd never been there before, and each request yields a single document. This is what's known as a stateless connection.
Fortunately, most browsers keep a local copy, called a cache, of recently accessed documents. When the browser notices that it's about to refetch something already in the cache, it just supplies the information from the cache rather than contact the server again. This alleviates a great deal of network traffic.
Because the server doesn't remember you between visits, the HTTP 1.0 protocol is called stateless. This means that the server doesn't know the state of your browser: whether this is the first request you've ever made or whether this is the hundredth request for information making up the same visual page. Each GET or POST in HTTP 1.0 must carry all the information necessary to service the request. This makes distributing resources easy but places the burden of maintaining state information on the CGI application.
A "shopping cart" script is a good example of needing state information. When you pick an item and place it in your virtual cart, you need to remember that it's there so that when you get to the virtual check-out counter, you know what to pay for. The server can't remember this for you, and you certainly don't want the user to have to retype the information each time he or she sees a new page. Your program must track all the variables itself and figure out, each time it's called, whether it's been called before, whether this is part of an ongoing transaction, and what to do next. Most programs do this by shoveling hidden fields into their output so when your browser calls again, the hidden information from the last call is available. In this way, it figures out the state you're supposed to have and pretends you've been there all along. From the user's point of view, it all has to happen behind the scenes.
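The hidden-field trick described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field name item, the script URL /cgi-bin/cart.py, and the two helper functions are hypothetical, not part of any real shopping cart package.

```python
import html
from urllib.parse import parse_qs


def render_cart_form(items):
    """Return an HTML form that carries the cart contents in hidden fields."""
    hidden = "\n".join(
        '<input type="hidden" name="item" value="{}">'.format(
            html.escape(item, quote=True)
        )
        for item in items
    )
    return (
        '<form method="POST" action="/cgi-bin/cart.py">\n'  # hypothetical URL
        + hidden
        + "\n"
        + '<input type="text" name="item">\n'
        + '<input type="submit" value="Add to cart">\n'
        + "</form>"
    )


def recover_cart(form_data):
    """Rebuild the cart from the URL-encoded form data the browser sent back."""
    return parse_qs(form_data).get("item", [])


# Simulate the data a browser would return after two earlier picks.
cart = recover_cart("item=widget&item=sprocket")
print(cart)
print(render_cart_form(cart))
```

Each time the script runs, it rebuilds the cart from whatever the browser sent and writes the whole cart back out as hidden fields, so the state travels with the page rather than living on the server.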
The Web has used HTTP 1.0 since 1990, but since then many proposals for revisions and extensions have been discussed. If you're interested in the technical specifications, stop by http://www.w3.org/hypertext/WWW/Protocols/ and read about what's coming down the road in the near future. Of particular interest to CGI programmers is the proposal for maintaining state information at the server. You can retrieve a text version of the proposal from http://www.ics.uci.edu/pub/ietf/http/draft-kristol-http-state-info-01.txt.
HTTP 1.1, when approved and in widespread use, will provide a great number of improvements in the state of the art. In the meantime, however, the art is stateless, and that's what your programs will have to remember.
This is fine for retrieving static text or displaying graphics, but what if you want dynamic information? What if you want a page counter or a quote-of-the-day? What if you want to fill out a guest book form rather than just retrieve a file? The next section can help you out.
Your Web browser doesn't know much about the documents it asks for. It just submits the URL and finds out what it's getting when the answer comes back. The server supplies certain codes, using the Multipurpose Internet Mail Extensions (MIME) specifications, to tell the browser what's what. This is how your browser knows to display a graphic but save a .ZIP file to disk. Most Web documents are Hypertext Markup Language (HTML): just plain text with embedded instructions for formatting and displaying.
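To get a feel for how file types map to MIME codes, you can ask Python's standard mimetypes module what it would report for a few file names. This is a rough sketch only; the file names are made up, and a real Web server keeps its own mapping table, which may differ from the one on your machine.

```python
import mimetypes


def content_type_for(filename):
    """Guess the MIME type a server might announce for this file name."""
    content_type, _encoding = mimetypes.guess_type(filename)
    # Fall back to a generic binary type when the extension is unknown.
    return content_type or "application/octet-stream"


for name in ("index.html", "logo.gif", "archive.zip"):  # made-up file names
    print(name, "->", content_type_for(name))
```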
In Chapter 3, "Designing CGI Applications," and Chapter 6, "Examples of Simple CGI Scripts," you'll see that the browser has to know a little bit about CGI, particularly when dealing with forms; however, most of the intelligence lives on the server, and that's what this book will concentrate on.
See "Integrating CGI into Your HTML Pages" for more information on how Web browsers interact with Web servers.
By itself, the server is only smart enough to send documents and to tell the browser what kind of documents they are. But the server also knows one key thing: How to launch other programs. When a server sees that a URL points to a file, it sends back the contents of that file. When the URL points to a program, however, the server fires up the program. The server then sends back the program's output as if it were a file.
What does this accomplish? Well, for one thing, a CGI program can read and write data files (a Web server can only read them) and produce different results each time you run it. This is how page counters work. Each time the page counter is called, it hunts up the previous count from information stored on the server, increments it by one, and creates a .GIF or .JPG on the fly as its output. The server sends the graphics data back to the browser just as if it were a real file living somewhere on the server.
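A bare-bones version of the page counter just described might look like the following Python sketch. It assumes the count lives in a small text file (the name count.txt is hypothetical) and returns plain text instead of building a .GIF, which keeps the example short. It also ignores the problem of two copies running at once.

```python
import os
import tempfile


def bump_counter(path):
    """Read the stored count, add one, write it back, and return the new value."""
    try:
        with open(path) as f:
            count = int(f.read().strip() or 0)
    except FileNotFoundError:
        count = 0  # first visit: no counter file yet
    count += 1
    with open(path, "w") as f:
        f.write(str(count))
    return count


def counter_response(path):
    """Build the text a counter script would write to its standard output."""
    return "Content-type: text/plain\n\nYou are visitor number {}\n".format(
        bump_counter(path)
    )


# Demonstrate with a throwaway directory so nothing permanent is written.
with tempfile.TemporaryDirectory() as workdir:
    counter_file = os.path.join(workdir, "count.txt")  # hypothetical file name
    print(counter_response(counter_file))
    print(counter_response(counter_file))
```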
NCSA Software Development maintains the CGI specification. You'll find the specification online at the World Wide Web Consortium: http://www.w3.org/hypertext/WWW/CGI/. This document goes into great detail, including history, rationales, and implications. If you don't already have a copy, download one and keep it handy. You won't need it to understand the examples in this book, but it will give you a wonderful overview of CGI and help you think through your own projects in the future.
The current version of the CGI specification is 1.1. The information you'll find at www.w3.org is composed of continually evolving specifications, proposals, examples, and discussions. You should keep this URL handy and check in from time to time to see what's new.
A CGI program isn't anything special by itself. That is, it doesn't do magic tricks or require a genius to create it. In fact, most CGI programs are fairly simple things, written in C or Perl (two popular programming languages).
CGI programs are often called scripts because the first CGI programs were written using UNIX shell scripts (bash or sh) and Perl. Perl is an interpreted language, somewhat like a DOS batch file, but much more powerful. When you execute a Perl program, the Perl instructions are interpreted and compiled into machine instructions right then. In this sense, a Perl program is a script for the interpreter to follow, much as Shakespeare's Hamlet is a script for actors to follow.
Other languages, like C, are compiled ahead of time, and the resultant executable isn't normally called a script. Compiled programs usually run faster but often are more complicated to program and certainly harder to modify.
In the CGI world, however, both interpreted and compiled programs are called scripts. That's the term this book will use from now on.
Before the server launches the script, it prepares a number of environment variables representing the current state of the server, who is asking for the information, and so on. The environment variables given to a script are exactly like normal environment variables, except that you can't set them from the command line. They're created on the fly and last only until that particular script is finished. Each script gets its own unique set of variables. In fact, a busy server often has many scripts executing at once, each with its own environment.
You'll learn about the specific environment variables in later chapters; for now, it's enough to know that they're present and contain important information that the script can retrieve.
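As a small preview, here is a Python sketch of a script that simply reports a few of these variables. The five names shown come from the CGI 1.1 specification; the helper function and the "(not set)" placeholder are just conveniences for this example.

```python
import os

# A handful of the variables defined by the CGI 1.1 specification.
CGI_VARIABLES = (
    "REQUEST_METHOD",
    "QUERY_STRING",
    "REMOTE_ADDR",
    "SERVER_NAME",
    "SERVER_SOFTWARE",
)


def describe_request(environ):
    """Collect the selected CGI variables, marking any the server didn't set."""
    return {name: environ.get(name, "(not set)") for name in CGI_VARIABLES}


if __name__ == "__main__":
    print("Content-type: text/plain\n")
    for name, value in describe_request(os.environ).items():
        print("{} = {}".format(name, value))
```

Run from a command line, the script shows "(not set)" for most entries, which is a handy reminder that the server creates these variables fresh for each request.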
See "Environment Variables: Information for the Taking," for a discussion of CGI environment variables.
Also, depending on how the script is invoked, the server may pass information another way, too. Although each server handles things a little differently, and although Windows servers often have other methods available, the CGI specification calls for the server to use its STDOUT (Standard Output), which the script reads as its own STDIN (Standard Input), to pass information to the script.
STDIN and STDOUT are mnemonics for Standard Input and Standard Output, two predefined stream/file handles. Each process inherits these two handles already open. Command-line programs that write to the screen usually do so by writing to STDOUT. If you redirect the input to a program, you're really redirecting STDIN. If you redirect the output of a program, you're really redirecting STDOUT. This mechanism is what allows pipes to work. If you do a directory listing and pipe the output to a sort program, you're redirecting the STDOUT of the directory program (DIR or LS) to the STDIN of the sort program.
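The directory-listing-into-sort idea can be reproduced in Python with the standard subprocess module. To keep the sketch portable, two tiny Python one-liners stand in for the directory and sort programs; both stand-ins are assumptions made for this example.

```python
import subprocess
import sys

# Stand-ins for "a program that lists things" and "a sort program".
PRODUCER = "print('pear'); print('apple'); print('mango')"
SORTER = "import sys; sys.stdout.writelines(sorted(sys.stdin))"


def pipe_through_sort():
    """Connect the producer's STDOUT to the sorter's STDIN and collect the result."""
    producer = subprocess.Popen(
        [sys.executable, "-c", PRODUCER], stdout=subprocess.PIPE
    )
    sorter = subprocess.Popen(
        [sys.executable, "-c", SORTER],
        stdin=producer.stdout,
        stdout=subprocess.PIPE,
    )
    producer.stdout.close()  # the sorter now holds the only read end of the pipe
    output, _ = sorter.communicate()
    producer.wait()
    return output.decode().split()


if __name__ == "__main__":
    print(pipe_through_sort())
```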
For Web servers, STDOUT is the feed leading to the script's STDIN. The script's STDOUT feeds back to the server's STDIN, making a complete route. From the script's point of view, STDIN is what comes from the server, and STDOUT is where it writes its output. Beyond that, the script doesn't need to worry about what's being redirected where. The server uses its STDOUT when invoking a CGI program with the POST method. For the GET method, the server doesn't use STDOUT. In both cases, however, the server expects the CGI script to return its information via the script's STDOUT.
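From the script's side, the difference between GET and POST might be handled as in this Python sketch. It assumes the server follows the CGI 1.1 conventions of putting GET data in the QUERY_STRING variable and announcing the size of POST data in CONTENT_LENGTH; the function name is hypothetical.

```python
import io
from urllib.parse import parse_qs


def read_form_data(environ, stdin):
    """Return the submitted fields, wherever the server chose to put them."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST: the data arrives on STDIN; read exactly CONTENT_LENGTH bytes.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("ascii")
    else:
        # GET: nothing arrives on STDIN; the data rides in an environment variable.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)


# Simulated requests; a real script would pass os.environ and sys.stdin.buffer.
print(read_form_data({"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Ada"}, io.BytesIO()))
body = b"name=Ada&lang=C"
print(read_form_data(
    {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(body))},
    io.BytesIO(body),
))
```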
This standard works well in the text-based UNIX environment where all processes have access to STDIN and STDOUT. In the Windows and Windows NT environments, however, STDIN and STDOUT are available only to non-graphical (console-mode) programs. To complicate matters further, NT creates a different sort of STDIN and STDOUT for 32-bit programs than it does for 16-bit programs. Because most Web servers are 32-bit services under NT, this means that CGI scripts have to be 32-bit console-mode programs. That leaves popular languages such as Visual Basic and Delphi out in the cold. One popular NT server, the freeware HTTPS from EMWAC, can talk only to CGI programs this way. Fortunately, there are several ways around this problem.
Some NT servers, notably Bob Denny's WebSite, use a proprietary technique using INI files to communicate with CGI programs. This technique, which may well become an open standard soon, is called CGI-WIN. A server supporting CGI-WIN writes its output to an INI file instead of STDOUT. Any program can then open the file, read it, and process the data. Unfortunately, using any proprietary solution like this one means your scripts will work only on that particular server.
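To show the general idea of reading request data from an INI file, here is a Python sketch that uses the standard configparser module. The section name and key names below are invented for illustration and are not taken from the CGI-WIN specification; check your server's documentation for the real layout.

```python
import configparser

# An invented INI layout, standing in for whatever the server really writes.
SAMPLE = """\
[Request]
Method=POST
Query String=
Content File=C:\\TEMP\\req0001.dat
Output File=C:\\TEMP\\out0001.dat
"""


def load_request(ini_text):
    """Parse the INI text and return the request settings as a dictionary."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    # configparser lowercases key names by default.
    return dict(parser["Request"])


request = load_request(SAMPLE)
print(request["method"])
print(request["output file"])
```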
For servers that don't support CGI-WIN, you can use a wrapper program. Wrappers do what their name implies: they wrap around the CGI program like a coat, protecting it from the unforgiving Web environment. Typically, these programs read STDIN for you and write the output to a pipe or file. Then they launch your program, which reads from the file. Your program writes its output to another file and terminates. The wrapper picks up your output from the file and sends it back to the server via STDOUT, deletes the temporary files, and terminates itself. From the server's point of view, the wrapper was the CGI program. For more information on wrappers, or to download one that works with the freeware EMWAC server, visit http://www.greyware.com/greyware/software/cgishell.htp.
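The wrapper routine described above can be sketched in Python as follows. This is not how any particular wrapper product works; the file names, the convention of passing both paths on the command line, and the stand-in child program are all assumptions made for the example.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for "your program": reads one file, writes its reply to another.
CHILD = (
    "import sys; data = open(sys.argv[1], 'rb').read(); "
    "open(sys.argv[2], 'wb').write(b'Content-type: text/plain\\n\\n' + data.upper())"
)


def run_wrapped(command, request_body):
    """Hand the request to a file-based program and return what it produced."""
    with tempfile.TemporaryDirectory() as workdir:
        in_path = os.path.join(workdir, "request.dat")    # hypothetical names
        out_path = os.path.join(workdir, "response.dat")
        with open(in_path, "wb") as f:
            f.write(request_body)  # what the wrapper read from STDIN
        subprocess.run(command + [in_path, out_path], check=True)
        with open(out_path, "rb") as f:
            return f.read()
    # Leaving the with block deletes the temporary files.


if __name__ == "__main__":
    reply = run_wrapped([sys.executable, "-c", CHILD], b"hello from the server")
    sys.stdout.write(reply.decode())
```

In a live wrapper, the request body would come from sys.stdin.buffer, and the reply would go straight back to the server on STDOUT.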
The script picks up the environment variables and reads STDIN as appropriate. It then does whatever it was designed to do and writes its output to STDOUT.
The MIME codes the server sends to the browser let the browser know what kind of file is about to come across the network. Because this information always precedes the file itself, it's usually called a header. The server can't send a header for information generated on the fly by a script because the script could send audio, graphics, plain text, HTML, or any one of hundreds of other types. Therefore, the script is responsible for sending the header. So, in addition to its own output, whatever that may be, the script must supply the header information. Failure to do so always means failure of the script because the browser won't understand the output.
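In practice, supplying the header means writing a Content-type line and a blank line before anything else. The following Python sketch shows the shape of it; the function name and the sample page are made up for this example.

```python
import sys


def build_response(body, content_type="text/html"):
    """Prefix the body with the header line and the blank line that ends it."""
    return "Content-type: {}\n\n{}".format(content_type, body)


if __name__ == "__main__":
    page = "<html><body><h1>Hello from a script</h1></body></html>\n"
    sys.stdout.write(build_response(page))
```

Leave out the blank line, and the output that follows is taken as more header lines rather than as the document itself.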
Here, then, are the broad steps of the CGI process, simplified for clarity:
It's a bit more complicated than a normal HTML retrieval, but hardly daunting, and that's all there is to how CGI works. Well, no; there's more, but that's the essential mechanism. The scripts become extensions to the server's repertoire of static files and open up the possibilities for real-time interactivity.
Just like any other file on a server, CGI scripts have to live somewhere. Depending on your server, CGI scripts may have to live all in one special directory. Other servers let you put scripts anywhere you want.
Typically, whether required by the server or not, Webmasters (a special case of the system administrator disease) put all the scripts in one place. This directory is usually part of the Web server's tree, often just one level beneath the Web server's root. By far the most common directory name is CGI-BIN, a tradition that got started by the earliest servers to support CGI: servers that (believe it or not) hard-coded the directory name. UNIX hacks will like the BIN part, but because the files are rarely named *.bin and often aren't in binary format anyway, the rest of the world rolls its eyes and shrugs. Today, servers usually let you specify the name of the directory and often support multiple CGI directories for multiple virtual servers (that is, one physical server that pretends to be many different ones, each with its own directory tree).
Suppose that your UNIX Web server is installed so that the fully qualified path name is /usr/bin/https/Webroot. The CGI-BIN directory would then be /usr/bin/https/Webroot/cgi-bin. That's where you, as Webmaster, put the files. From the Web server's point of view, /usr/bin/https/Webroot is the directory tree's root, so you'd refer to a file there called index.html with a URL of /index.html. A script called myscript.pl living in the CGI-BIN directory would be referred to as /cgi-bin/myscript.pl.
On a Windows or NT server, much the same thing happens. The server might be installed in c:\winnt35\system32\https, with a server root of d:\Webroot. You'd refer to the file default.htm in the server root as /default.htm, never minding that its real location is d:\Webroot\default.htm. If your CGI directory is d:\Webroot\scripts, you'd refer to a script called myscript.exe as /scripts/myscript.exe.
Although URL references always use forward slashes, even on Windows and NT machines, file paths are separated by backslashes here. On a UNIX machine, both types of references use forward slashes.
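The translation from URL to file path in the UNIX and NT examples above can be sketched in Python. The function below is a simplified illustration, not how any particular server does it; it uses the standard posixpath and ntpath modules so both flavors can be tried on any machine.

```python
import ntpath
import posixpath


def url_to_path(server_root, url_path, flavor=posixpath):
    """Map a URL path onto a file path beneath the server root."""
    parts = [p for p in url_path.split("/") if p not in ("", ".")]
    if ".." in parts:
        # Refuse anything that tries to climb out of the server's tree.
        raise ValueError("URL may not contain '..'")
    return flavor.join(server_root, *parts)


# The UNIX and NT examples from the text.
print(url_to_path("/usr/bin/https/Webroot", "/cgi-bin/myscript.pl"))
print(url_to_path("d:\\Webroot", "/scripts/myscript.exe", flavor=ntpath))
```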
For the sake of simplicity, assume that your server is configured to look for all CGI scripts in one spot and that you've named that spot CGI-BIN off the server root. If your server isn't configured that way, you might want to consider changing it. For one thing, in both UNIX and NT, you can control the security better if all executables are in one place (by giving the server process execute privileges only in that directory). Also, with most servers, you can specify that scripts may run only if they're found in the CGI-BIN directory. This lets you keep rogue users from executing anything they want from directories under their control.
CGI scripts, by their very nature, place an extra burden on the Web server. They're separate programs, which means the server process must spawn a new task for every CGI script that's executed. The server can't just launch your program and then sit around waiting for the response; chances are good that others are asking for URLs in the meantime. So the new task must operate asynchronously, and the server has to monitor the task to see when it's done.
The overhead of spawning a task and waiting for it to complete is usually minimal, but the task itself will use system resources (memory and disk) and also will consume processor time slices. Even so, any server that can't run two programs at a time isn't much of a server. But remember the other URLs being satisfied while your program is running? What if there are a dozen, or a hundred, of them, and what if most of them are also CGI scripts? A popular site can easily garner dozens of hits almost simultaneously. If the server tries to satisfy all of them, and each one takes up memory, disk, and processor time, you can quickly bog your server down so far that it becomes worthless.
There's also the matter of file contention. Not only are the various processes (CGI scripts, the server itself, plus whatever else you may be running) vying for processor time and memory, they may be trying to access the same files. For example, a guest book script may be displaying the guest book to three browsers while updating it with the input from a fourth. (There's nothing to keep the multiple scripts running from being the same script multiple times.) The mechanisms for ensuring a file is available (locking it while writing and releasing it when done) all take time: system OS time and simple computation time. Making a script foolproof this way also makes the script bigger and more complex, meaning longer load times and longer execution times.
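One simple way to get the lock-while-writing behavior is a separate lock file, as in the Python sketch below. This is only one of several possible techniques, chosen here because it works on both UNIX and NT; the .lock suffix, the retry counts, and the function name are assumptions for this example.

```python
import os
import tempfile
import time


def append_entry(guestbook_path, entry, retries=50, delay=0.1):
    """Append one line to the guest book, using a lock file to keep writers apart."""
    lock_path = guestbook_path + ".lock"  # hypothetical naming convention
    for _ in range(retries):
        try:
            # O_EXCL makes creation fail if another script already holds the lock.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            time.sleep(delay)  # someone else is writing; wait and try again
            continue
        try:
            with open(guestbook_path, "a") as book:
                book.write(entry.rstrip("\n") + "\n")
        finally:
            os.close(fd)
            os.remove(lock_path)  # release the lock
        return True
    return False  # gave up: the lock never became free


with tempfile.TemporaryDirectory() as workdir:
    book_path = os.path.join(workdir, "guestbook.txt")
    print(append_entry(book_path, "Ada was here"))
    print(append_entry(book_path, "So was Grace"))
```

Notice how much of the function is spent on waiting, retrying, and cleaning up. That is the extra time and complexity the paragraph above warns about.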
Does this mean you should shy away from running CGI scripts? Not at all. It just means you have to know your server's capacity, plan your site a bit, and monitor performance on an ongoing basis. No one can tell you to buy a certain amount of RAM or to allocate a specific amount of disk space. Those requirements will vary based on what server software you run, what CGI scripts you use, and what kind of traffic your server sees. However, following are some rules of thumb you can use as a starting point when planning your site.
The best present you can buy your NT machine is more memory. While NT Server will run with 12M of RAM, it isn't happy until it has 16M and doesn't shine until it has 32M. Adding RAM beyond that probably won't make much difference unless you're running a few very hungry applications (SQL Server comes to mind as a prime example). If you give your server 16M of RAM, a generous swap file, and a fast disk, it should be able to handle a dozen simultaneous CGI scripts without sweating or producing a noticeable delay in response. With 32M of RAM, your server will be able to do handstands in its spare time, or almost.
Of course, the choice of programming language will affect each variable greatly. A tight little C program hardly makes an impact, whereas a Visual Basic program, run from a wrapper and talking to a SQL Server back end, will gobble up as much memory as it can. Visual Basic and similar development environments are optimized for ease of programming and best runtime speed, not small code and quick loading. If your program loads seven DLLs, an OLE control, and an ODBC driver, you may notice a significant delay. Scripts written in a simpler programming environment, though, such as C or Perl, run just as fast on NT as they would on a UNIX system, and often much faster due to NT's multithreaded and preemptive scheduling architecture.
UNIX machines are usually content with significantly less RAM than Windows NT boxes, for a number of reasons. First, most of the programs, including the OS itself and all its drivers, are smaller. Second, it's unusual, if not downright impossible, to use an X Window program as a CGI script. This means that the resources required are far fewer. Maintenance and requisite system knowledge, however, are far greater. There are trade-offs in everything, and what UNIX gives you in small size and speed it more than makes up for with complexity. In particular, setting Web server permissions and getting CGI to work properly can be a nightmare for the UNIX novice. Even experienced system administrators often trip over the unnecessarily arcane configuration details. After the system is set up, though, adding new CGI scripts goes smoothly and seldom requires adding memory.
If you give your UNIX box 16M of RAM and a reasonably fast hard disk, it will be ecstatic and will run quickly and efficiently for any reasonable number of hits. Database queries will slow it down, just as they would if the program weren't CGI. Due to UNIX's multi-user architecture, the number of logged-on sessions (and what they're doing) can significantly affect performance. It's a good idea to let your Web server's primary job be servicing the Web rather than users. Of course, if you have capacity left over, there's no reason not to run other daemons, but it's best to choose processes that consume resources predictably so you can plan your site.
Of course, a large, popular site (say, one that receives several hits each minute) will require more RAM, just as on any platform. The more RAM you give your UNIX system, the better it can cache, and therefore the faster it can satisfy requests.
The tips, techniques, examples, and advice this book gives you will get you going immediately with your own scripts. You should be aware, however, that the CGI world is in a constant state of change, more so, perhaps, than most of the computer world. Fortunately, most servers will stay compatible with existing standards, so you won't have to worry about your scripts not working. Here's a peek at the brave new world coming your way.
Java comes from Sun Microsystems as an open specification designed for platform-independence. Java code is compiled by a special Java compiler to produce byte codes that can run on a Java Virtual Machine. Rather than produce and distribute executables as with normal CGI (or most programs), Java writers distribute instructions that are interpreted at runtime by the user's browser. The important difference here is that whereas CGI scripts execute on the server, a Java applet is executed by the client's browser. A browser equipped with a Java Virtual Machine is called a Java Browser. Netscape, among other browsers, supports Java.
If you're interested in reading the technical specifications, you'll find that http://java.sun.com/whitePaper/java-whitepaper-1.html has pages worth of mind-numbingly complete information.
Part VIII of this book explores some Java sites and points out some of the fascinating things programmers are doing. In the meantime, though, here are some highlights about Java itself:
Following the incredible popularity of the Internet and the unprecedented success of companies such as Netscape, Microsoft has entered the arena and has declared war. With their own Web server, their own browsers, and a plethora of back-end services (and don't forget unparalleled marketing muscle and name recognition), Microsoft is going to make an impact on the way people look at and use the Internet.
Along with some spectacular blunders, Microsoft has had its share of spectacular successes. One such success is Visual Basic (VB), the all-purpose, anyone-can-learn-it Windows programming language. VB was so successful that Microsoft made it the backbone of their office application suite. Visual Basic for Applications (VBA) has become the de facto standard scripting language for Windows. While not as powerful as some other options (Borland's Delphi in some regards, or C programs in general), VB nevertheless has two golden advantages: It's easy to learn, and it has widespread support from third-party vendors and users.
When Microsoft announced it was getting into the Web server business, no one was terribly surprised to learn that they intended to incorporate VB or that they wanted everyone else to incorporate VB, too. VBScript, a subset of VBA, is now in prerelease design, but thousands of developers are feverishly busy playing with it and getting ready to assault the Internet with their toys.
You can get the latest technical specifications from http://www.microsoft.com/intdev/inttech/vbscript.htm. VBScript, when it obtains Internet community approval and gets implemented widely, will remove many of the arcane aspects from CGI programming. No more fussing with C++ constructors or worrying about stray pointers. No concerns about a crash bringing the whole system down. No problems with compatibility. Distribution will be a snap because everyone will already have the DLLs or will be able to get them practically anywhere. Debugging can be done on the fly, with plain-English messages and help as far away as the F1 key. Code runs both server-side and client-side, whichever makes the most sense for your application. Versions of the runtimes will soon be available for Sun, HP, Digital, and IBM flavors of UNIX, and are already available to developers for Win95 and Windows NT. What's more, Microsoft is licensing VBScript for free to browser developers and application developers. They want VBScript to become a standard.
So where's the rub? All that, if true, sounds pretty good, even wonderful. Well, yes, it is, but VB applications of whatever flavor have a two-fold hidden cost: RAM and disk space. With each release, GUI-based products tend to become more powerful and more friendly, but also take up more disk space and more runtime memory. And don't forget that managing those resources in a GUI environment also racks up computing cycles, mandating a fast processor. Linux users with a 286 clone and 640K of RAM won't see the benefits of VBScript for a long, long time.
Although that doesn't include a large share of the paying market, it does, nevertheless, include a large percentage of Internet users. Historically, the Internet community has favored large, powerful servers rather than large, powerful desktops. In part, this is due to the prevalence of UNIX on those desktops. In a text-based environment where the most demanding thing you do all day is the occasional grep, processing power and RAM aren't constant worries. As much as early DOS machines were considered "loaded" if they had 640K RAM, UNIX boxes in use today often use that amount, or even less, for most applications. Usually, only high-end workstations for CAD-CAM or large LAN servers come equipped with substantial RAM and fast processors.
In the long run, of course, such an objection is moot. I'm hardly a Luddite myself; I have very powerful equipment available, and I use it all the time. Within a few years, worries about those with 286s will be ludicrous; prices keep falling while hardware becomes more powerful. Anyone using less than a Pentium or fast RISC chip in the year 2000 won't get anyone's sympathy. But my concern isn't for the long run. VBScript will be there, along with a host of other possibilities as yet undreamed, and we'll all have the microprocessor horsepower to use and love it. But in the meantime, developers need to keep current users in mind and try to keep from disenfranchising them. The Internet thrives on its egalitarianism. Just as a considerate Webmaster produces pages that can be read by Lynx or Netscape, developers using Microsoft's fancy (and fascinating) new tools must keep in mind that many visitors won't be able to see their work...for now.
VRML, or Virtual Reality Modeling Language, produces some spectacular effects. VRML gives you entire virtual worlds, or at least interactive, multiparticipant, real-time simulations thereof. Or rather, it will give you those things someday. Right now, the 1.0 specification can only give you beautiful 3-D images with properties such as light source direction, reactions to defined stimuli, levels of detail, and true polygonal rendering.
VRML isn't an extension to HTML but is modeled after it. Currently, VRML works with your Web browser. When you click a VRML link, your browser launches a viewer (helper application) to display the VRML object. Sun Microsystems and others are working on integrating VRML with Java to alleviate the awkwardness of this requirement.
The best primer on VRML I've found is at http://vrml.wired.com/vrml.tech/vrml10-3.html. When you visit, you'll find technical specifications, sample code, and links to other sites. Also of interest is a theoretical paper by David Raggett at Hewlett-Packard. You can find it at http://vrml.wired.com/concepts/raggett.html.
You'll also want to visit the VRML Repository at http://www.sdsc.edu/vrml. This well-maintained and fascinating site offers demos, links, and technical information you won't find elsewhere.
Objects in VRML are called nodes and have characteristics: perspective, lighting, rotation, scale, shape hints, and so on. The MIME type for VRML files is x-world/x-vrml; you'll need to find and download viewers for your platform and hand-configure your browser to understand that MIME type.
VRML objects aren't limited to graphics. Theoretically, VRML can be used to model anything: MIDI data, waveform audio data, textures, and even people, eventually.
Of particular interest in the area of VRML is the notion of location independence. That is, when you visit a virtual world, some bits of it may come from your own computer, some objects from a server in London, another chunk from NASA, and so forth. This already happens with normal Web surfing; sometimes the graphics for a page come from a different server than does the text, or only the page counter might be running on another server. While handy, this capability doesn't mean much for standard Web browsing. For processor-intensive applications such as Virtual Reality Modeling, however, this type of independence makes client-server computing sensible and practical. If your machine needs only the horsepower to interpret and display graphics primitives, while a hundred monster servers are busy calculating those primitives for you, it just might be possible to model aspects of reality in real-time.
Process Software has proposed a standard called ISAPI (Internet Server Application Programming Interface), which promises some real advantages over today's CGI practices.
You can read the proposal for yourself at http://www.microsoft.com/intdev/inttech/isapi.htm or contact Process Software directly at http://www.process.com.
In a nutshell, the proposal says that it doesn't make sense to spawn external CGI tasks the traditional way. The overhead is too high, the response time too slow, and coordinating the tasks burdens the Web server. Instead of using interpreted scripts or compiled executables, Process proposes using DLLs (dynamic link libraries). DLLs have a number of advantages:
Process Software has gone beyond proposing the specification; they've implemented it in Purveyor, their own server software. I've tried it, and they're right: CGI done through an ISAPI DLL performs much faster than CGI done the traditional way. There are even ISAPI wrappers: DLLs that let established CGI programs use the new interface.
My guess is that it won't be long before you see ISAPI implemented on all NT servers. Eventually, it will become available for UNIX-based servers, too.