Copyright ©1996, Que Corporation. All rights
reserved. No part of this book may be used or reproduced in any
form or by any means, or stored in a database or retrieval system
without prior written permission of the publisher except in the
case of brief quotations embodied in critical articles and reviews.
Making copies of any part of this book for any purpose other than
your own personal use is a violation of United States copyright
laws. For information, address Que Corporation, 201 West 103rd
Street, Indianapolis, IN 46290 or at support@mcp.com.
Notice: This material is excerpted from Special Edition Using CGI,
ISBN: 0-7897-0740-3. The electronic version of this material has
not been through the final proofreading stage that the book goes
through before being published in printed form. Some errors may
exist here that are corrected before the book is published. This
material is provided "as is" without any warranty of
any kind.
In a world where the objective of writing CGI is to publish it
immediately, it makes sense to give you a list of public pages
that illustrate some of what we've talked about so far in this
book. After all, knowing that something can be accomplished is
sometimes all you need to inspire you to do it for yourself.
I picked sites that are outstanding for one or more of several
reasons: Either the site demonstrates a superb and elegant use
of CGI, the site has good CGI reference materials, or the site
has CGI tools you can download or buy. In the search for excellence,
I'm not terribly concerned about whether the software you'll find
is freeware, shareware, or a commercial application. My only objective
is to show you how to do it right. There may be cheaper ways of
doing it right; there will certainly always be more expensive
ways. But the sites I offer are doing it right, and doing it right
now. There's no reason you can't be doing it, too.
Undoubtedly, every reader will be able to tell me that I missed
a great site, or that XYZ Corp. gives away as freeware what I
featured for $1,800 from ZYX Corp. Save your postage stamps, please.
I intentionally skipped some well-known and excellent sites, in
most cases because the sites are too busy to be useful. This
is the worst Catch-22 of the Web: If you do something very clever
or very popular, your server is likely to be overwhelmed by visitors.
WWW should stand for World Wide Web, not World Wide Wait. So I
tried to choose sites that offer a reasonable response time, and
are solid rather than fashionable. I also probably overlooked
some great sites just because the Web changes so fast.
That said, you'll look at the following areas:
- Programming tutorials and sample code
- CGI and SSI freeware and shareware
- Fun stuff: examples of things done right
- Indexing
- Connecting SQL databases
- Spiders, worms, crawlers, and robots
- CGI interactive games
- A brief case study: Internet Concepts, LLC
This Ever-Changing URL in Which We Live
URLs change. Sites come and go. Some last for years, others for
days or hours. Sometimes a popular site becomes temporarily unavailable
due to excess traffic. Sometimes a router fails between you and the
site. Sometimes the site's server goes down. Sometimes a site
just...disappears.
Any book that provides current information runs the risk of becoming
outdated. A book like this one, though, is almost certain to have
expired links: pointers to sites that have either moved or gone
on to the Great Bit Bucket in the Sky.
We made every effort to ensure that, as of the date of manuscript
preparation, the URLs provided throughout this book were correct
and working. By correct, I mean that the site's URL is
given accurately, and that the content available at that site
roughly correlates with what we said it did. There's no guarantee,
especially when we refer to subpages on a site, that the Webmaster
hasn't shuffled things around, or even decided to give up CGI information
in favor of spotlighting the latest interactive smut fiction.
By working, I mean that we tested the link and it seemed
both reliable and reasonably responsive.
I'll start off by examining a variety of online tutorials, many
of which include sample code. Some of them are meant to be tutorials;
others are just such good examples of programming, or such simple
code, that they become lead-by-example instruction sheets. I won't
bother to list too many, since the book you're holding is one
long CGI tutorial in itself. However, even a book as comprehensive
as this one can't cover everything, so here are pointers to some
fundamental or esoteric tutorials you might find useful.
- The Common Gateway Interface.
If your high-school teachers did their jobs, you'll know that
you must go back to the original sources when doing research.
You'll make your teachers and yourself happy by reading this tutorial
from NCSA. Starting at ground zero and working up to a library
of examples, this hypertext document gives the proper foundation
for further exploration.
- University of Utah's Introduction to CGI Programming.
This document contains an introductory tutorial on CGI programming,
including some example CGI programs. If you're already an accomplished
programmer, this tutorial will provide the basic information you
need to develop your own CGI programs. The example programs are
in UNIX Bourne shell language (sh) and Perl. They're
relatively simple programs and should be understandable to anyone
familiar with the UNIX environment and the C programming language.
- W4 Consultancy (http://sparkie.riv.net/w4/software/counter/).
Digital counter script for UNIX. From this page, you can download
a gzip of the counter and also access a FAQ about the counter.
If you haven't done any CGI work before, this might make a good
first project. I'm including this one here, rather than in the
programs section, because the code and documentation make an excellent
primer.
- Gates-o-Wisdom Software.
NCSA-based SSI page counter tips and tricks for UNIX. Also contains
a great tutorial on general SSI and CGI techniques for NCSA servers.
- Teleport CGI Scripts.
This page is a compendium of Perl and shell scripts for users
of Teleport Internet Services. However, you'll find when you look
at the individual script documentation sections that the source
code is usually included. The scripts you find here are short
and sweet, and give you a good idea of how to accomplish many
common tasks.
- WebSite CGI. If you use
WebSite, you won't find a better reference than Bob Denny's own
documentation (after all, he wrote the server). WebSite is one
of the most popular and successful NT servers. It attacks the
GUI problem directly, by providing built-in support to link the
Web server with VB, Delphi, or other Windows development environments.
This particular page provides jumping-off points for technical
papers, server self-tests, CGI programming, and related information.
- Developer's Corner,
explains the peculiarities and strengths of WebSite CGI, and gives
you step-by-step instructions for using WebSite's support for
VB, Delphi, and Perl.
- You can also visit Bob Denny directly at Bob Denny's,
where he'll entertain and enlighten you further.
- CGI Scripting with MacHTTP and AppleScript.
You may find it hard to reach this site, but once you get there,
you'll discover a wealth of information pertaining to scripting
for the Macintosh.
- CGI and AppleScript.
Here's a Dr. Dobb's Journal article by Cal Simone, founder of
MainEvent software. This is a wonderful tutorial by a gifted author
and programmer. In it, you'll learn the essentials of how AppleScript
interfaces with your Macintosh system, and how you can use it
to do CGI magic.
- Writing CGI Scripts for WebStar.
This site offers support for the Macintosh's WebStar server via
UserTalk in the Frontier environment. You'll find a good explanation
of how to use Frontier to create dynamic HTML on your WebStar
server. If that's your platform, this is your tutorial.
- The Amiga HTTP Common Gateway Interface.
Mike Meyer takes time out to explain the tips you need to run
CGI scripts on an Amiga Web server. He includes a number of useful
examples in a link at the bottom of the page.
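Since a page counter is suggested above as a good first project, here is a minimal sketch of the idea in Python. It is not the W4 Consultancy script or anyone else's code; the function names and the counter file are made up for illustration, and a real counter would also need file locking so that two simultaneous visitors can't clobber each other's update.

```python
import os
import tempfile


def bump_counter(path):
    """Read the count stored in `path`, add one, write it back, and return it."""
    try:
        with open(path, "r", encoding="ascii") as fh:
            count = int(fh.read().strip() or "0")
    except (FileNotFoundError, ValueError):
        count = 0  # first visit, or an unreadable file: start from zero
    count += 1
    # Write to a temporary file and rename it over the old one, so a crash
    # in the middle of the write never leaves a half-written count behind.
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="ascii") as fh:
        fh.write(str(count))
    os.replace(tmp, path)
    return count


def counter_response(path):
    """Build a complete CGI response: header, blank line, then the body."""
    count = bump_counter(path)
    return "Content-Type: text/plain\r\n\r\nYou are visitor number %d\n" % count


if __name__ == "__main__":
    # Hypothetical location for the counter file; adjust for your server.
    demo = os.path.join(tempfile.gettempdir(), "demo_counter.txt")
    print(counter_response(demo), end="")
```

The whole trick is in the last function: a CGI program prints a content-type header, a blank line, and then whatever it wants the browser to see.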
As you wander through the online world looking for samples of
scripts or tips on technique, you may run across some ready-to-run
scripts that do exactly what you want. In this section, I present
some sites that offer freeware or shareware CGI, SSI, and Java
scripts. You won't find anything wild or strange here: these are
workaday programs you can take home and put right to work doing
useful tasks.
Don't forget about the list of publicly available software libraries
at the end of Chapter 3, "Designing CGI Applications."
You'll find pointers to routines there that can save you hundreds
of development hours.
You've probably encountered many of these software offerings without
knowing it. If you've visited an NT server with a graphical counter,
for instance, chances are good that the site is using Kevin Athey's
creation, or at least the GD library component of it. For that
matter, you'll find that the GD library is used in most CGI scripts
that produce on-the-fly GIFs, regardless of platform. Likewise,
most of the programs in this section are proven products in widespread
use. Fill your coffee cup, clear some space on your hard disk,
and get ready to download.
Here are some of the best freeware and shareware tools available
to spice up your Web pages and make your site more powerful:
- Behold! Software.
Kevin Athey's collection of CGI programs for Windows NT and Windows
95 includes a list of sites using his software. Behold! Software
is a place where you can get free software for Windows 95 and
Windows NT. The emphasis of the site is CGI and Web utilities.
The two utilities available at the time of this writing were a
hit counter and a real-time clock.
- Examples of Perl CGI Scripts, with Source Code.
This no-nonsense page demonstrates six useful utilities, all built
with Perl: a clickable image map, a way to maintain state information,
how to generate a random number, how to hunt up names in a phone
book, a way to design a self-scoring questionnaire, and a client
pull demonstration.
- Each utility includes the source code, which usually has
a nice header explaining the script's function, but absolutely
no documentation thereafter. Fortunately, these scripts are short
and simple enough that you can probably figure out what's going
on.
- Mooncrow's CGI/Perl Source Page.
"Mooncrow" is Carl M. Evans, a long-time computer professional
with a BSEE, an MSEE, several commercial applications, and a textbook
to his credit.
- Also to his credit is Mooncrow's Aeyrie, which includes
Mooncrow's CGI/Perl Source Page. "When I decided to create
and run my own Web pages, I had trouble locating adequate resources
on the Internet concerning CGI/Perl programming," says Evans,
"so I created my own. While scripts can be written in a number
of languages, I prefer to use Perl 4 or Perl 5. It doesn't matter
what platform the server is being run on as long as the server
supports Perl 4 and/or Perl 5 compliant scripts."
- Evans ended up with probably the most complete reference
set of Perl programs available on the Internet. With over 50 links
to tutorials, sample programs, reference materials, and source
code, Evans provides a wonderful resource for anyone thinking
of using Perl for CGI scripting.
- gd 1.2 Library. If you're
planning to create on-the-fly GIFs, don't miss Thomas Boutell's
wonderful C library. You can incorporate this code directly into
your own programs to give them spontaneous GIF creation powers.
- Greyware Automation Products.
Greyware provides a good selection of freeware and shareware CGI
and SSI programs for NT and Win95. The SSI programs here are the
ones discussed in detail in Chapter 16, "Using Server-Side
Includes," and included on the CD-ROM.
- Greyware's CGIShell program is of particular interest to
anyone wanting to do CGI with Visual Basic, Delphi, or another
16-bit GUI development language on EMWAC. CGIShell comes with
a handful of fully functional demonstration programs with source
code, including a guest book written in VB4 that you can put to
work immediately. The online documentation often provides a good
explanation of what goes on behind the scenes.
- Windows NT Web Server Tools.
Jim Buyens has put together a great resource covering programs
that provide Server Extensions, Connectivity, DNS, Finger, Firewall,
FTP, Gopher, HTTP, Log Analysis, Mail, News, NFS, Perl, Publishing,
Search Engines, Software Suites, Telnet, TFTP, WAIS, and X-Windows
Clients. Oh, yes, there's also a category called Other Resources
for the things he couldn't fit into the existing groups.
- If you're running Windows NT, put this one in your bookmark
file. You'll find yourself coming back again and again.
- The Windows NT Application Center.
It's a long URL, but worth typing. This site is probably the most
comprehensive repository of NT software on the Internet. It has
a little bit of everything, and a lot of things you won't find
elsewhere. You'll see this site featured again in the "Examples
of Things Done Right" section.
Here's a collection of sites that demonstrate stylish, informative,
creative, or intriguing uses of CGI on the Web. You'll find plain
old CGI and SSI mixed in with Java, real-time audio, real-time
video, and others.
I'll start small, with a simple page counter, and work my way
toward the bizarre and fanciful. I picked sites that illustrate
technique and taste. If you don't find any ideas for programs
in this section, check your pulse; you may already be dead.
- Voyager, Publisher of Interactive Media (http://www.voyagerco.com/).
Tasteful and elegant presentation all-around. Pay particular attention
to the current date and quote-of-the-day, which are carefully
blended into the page's overall theme.
- The Amazing Fishcam!
No list of sites would be complete without including the one,
the original, the amazing Fishcam! This site is nothing more than
two cameras focused on a tank of fish. Nothing more? Well, as
the site explains in gleeful detail, there's a lot more. You can
look at the fish in low resolution or high, and if you're running
Netscape, you can visit the Continuously Refreshing Fish Cam, a
wonderful example of server push technology. Although the idea
of watching fish in near real-time isn't particularly exciting,
this site was one of the first to demonstrate the power of the
Web to provide electronic photos. Just in case you care about
the fish as well as the technology, this page happily refers you
to 12 other aquatic sites.
- The Amazing Parrot-Cam!
If fish aren't enough, here's Webster, the parrot, on a live
camera feed for your viewing pleasure. In addition to good camera-work,
this page has a nice explanation of how their camera is set up
and connected to the computer.
- Autopilot.
This site takes you on a whirlwind tour of the Internet. With
over 8,000 sites in its list of URLs to choose from, you often
find interesting and surprising places you never would have chosen
to visit otherwise. Autopilot relies on Netscape's client pull
function to whisk you from site to site every 12 seconds. This
is also a good demonstration of random URL generation.
- Background Generator.
This handy site lets you build an image file to use as a background.
It starts with some stock images, then takes you through a customization
phase where you can edit the colors until you get exactly what
you want. This UNIX magic comes to you via a program written by
dprust@isx.com.
- bsy's List of Internet-Accessible Machines.
This is the got-everything page for Internet gadgets. Want to
find a Coke machine that responds to a ping? Want to change the
track on a CD player at Georgia Tech? Do you care about Paul Haas's
current refrigerator contents? Want to play with a remote-control
model railway over the Internet? Are you craving some real-time
Internet Talk Radio from NRL? Or did you ever wonder how to find
the infamous Ghostwatcher home page? This site points you to all
the cool places for gadgets, machines, and goofy things on the
Internet. Great for helping you think of new ways to use the Web!
- Dr. Fellowbug's Laboratory of Fun & Horror.
Great examples of games and general interactivity...with a macabre
twist that's as much fun as the games themselves. No software
to download here, but hours of entertainment, and perhaps an idea
or two for the terminally twisted mind. The animated Hangman game
is particularly well done.
- The Electric Postcard.
This site uses CGI and e-mail in a clever way. It's one of those
"Gee, duh!" ideas that other people always seem to get
first. The Electric Postcard lets you choose from a variety of
amusing (or just plain strange) postcard stock, then lets you
personalize your postcard.
- The Windows NT Application Center.
You saw this site earlier in the section on freeware and shareware
programs. I'm listing it again here because it's the cleanest
example on the Web of interfacing a back-end database with a software
library. The site is well-indexed, carefully categorized, and
easy to use. Kudos to Beverly Hills Software for providing such
a well-designed and useful site.
- The Vertex Award (Nanimation of the Week).
While often almost too slow to be tolerated, this site is nevertheless
important enough that I included it for you...I think it's worth
the wait. A Nanimation is a Netscape animation. This page lists
the Vertex Award winners for best Nanimation on the Internet.
Even the introduction to the award lets you know you're in for
something special. The pages that win awards are spectacular.
- The Netscape Engineering Sign.
This CGI-machine interface lets you type a message to be displayed
in huge green letters on a sign in the Netscape office's engineering
pit. Let's hope they never put it out on a runway. "Land
here!" "No, over there!" "Yo mamma!"
- The Web in Pig Latin.
This one could easily win the award for most bizarre idea ever
to grace the Internet. In fact, it's won several awards: Business
Weekly's "As A Time Out" Site of the Week; The Stick's
Misc Surf Site; a Hot Site in Internet World; and "a site
that 'does stuff' by the Center for the Easily Amused." Basically,
you enter an URL on a form provided. The CGI program goes out,
fetches the page, and presents it to you in Pig Latin. Arly-nay,
ooday! I present it here because the CGI does more than create
HTML on the fly for you; it actually goes out and fetches a page,
playing a browser role, to generate the HTML.
- Talk to My Cat.
Well, why not? This site, says author Michael Witbrock, has a
speech synthesizer connected to the computer. You type in a sentence,
and the speech synthesizer says it aloud to Michael's cat. If
the cat happens to be around, that is. And awake. And listening.
Who knows? Who cares? Is this any different from talking to a
cat in person?
- WebChat Broadcasting System.
WBS, or WebChat Broadcasting System, is one of the cleanest examples
of real-time chatting using the Web. With hundreds of "channels"
(separate discussion areas) to choose from, WebChat offers something
for everyone. And it seems everyone has been there once or more.
WebChat boasts over 35 million hits per month. They'll also sell
you their software to run on your own server, or lease you space
on their server. There's also a freeware version of WebChat available
with limited features. You'll need a UNIX machine to run it, although
a port to NTPerl is under way.
- Xavier, the Web-Controlled Robot.
Xavier isn't a toy. Xavier has three on-board 486 computers, a
Sony videocam, and enough engineering guts to rebuild the atom
bomb from scratch. Well, maybe not, but he can tell Knock-Knock
jokes! Users can issue commands to Xavier and, by tapping into
this video eye, watch him carry those commands out. Xavier communicates
to the rest of the world with wireless Ethernet. What I want to
know is why Xavier gets to go wireless before I do?
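Autopilot's combination of client pull and random URL selection, described in the list above, is easy to picture in code. The following is only a sketch of the general technique, not Autopilot's actual program: the function name is invented, the example.com addresses are placeholders, and the 12-second delay is borrowed from the description above.

```python
import html
import random


def client_pull_page(urls, delay_seconds=12, rng=random):
    """Return an HTML page that sends the browser to a random URL after a delay."""
    target = rng.choice(urls)
    safe = html.escape(target, quote=True)  # keep & and quotes from breaking the tag
    return (
        "<html><head>\n"
        '<meta http-equiv="refresh" content="%d; url=%s">\n'
        "<title>Next stop</title></head>\n"
        '<body><p>Next stop in %d seconds: <a href="%s">%s</a></p></body></html>\n'
        % (delay_seconds, safe, delay_seconds, safe, safe)
    )


if __name__ == "__main__":
    # Placeholder list; a real tour would read thousands of URLs from a file.
    tour = ["http://example.com/one", "http://example.com/two"]
    print("Content-Type: text/html\r\n\r\n" + client_pull_page(tour), end="")
```

The browser does all the work here. The refresh tag tells it to wait and then load the chosen URL, so the server's only job is to pick the next destination.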
In this section, I'll point out sites that do indexing well. For
the sake of contrast and instruction, I'll include one that actually
makes the content harder to find than if it were buried at sea
in a locked cabinet. This kind of egregious irresponsibility is
rare, though, and I'm happy to provide you with several of the
best and brightest searchable sites on the Internet. I'll start
with examples of small sites, and work my way up to the behemoths
at Infoseek and Alta Vista.
- The UBC Faculty of Medicine Home Page.
A good example of a site (really a collection of pointers to sites)
done up with a static index. For this type of project, where full-text
indexing is either impossible or impractical, UBC demonstrates
how to do it manually. If you haven't visited this site before,
be sure to make a bookmark for it. The information presented here
is invaluable.
- Site-Index.pl.
Perl code for preparing your site to participate in the ALIWEB
master index and search engine. Useful even if you don't plan
to participate, since you can examine the Perl code to see what
kinds of information are used to create a site index.
- Technical Discussion of the Harvest System.
A thoughtful and complete overview of the problems inherent in
current indexing systems, along with the rationale behind the
new Harvest System's approach. For information on getting the
Harvest software, or to sample sites already using it, see Harvest's
main page at this site.
- Newsgroup-related Indexes.
This site contains a list of pointers to several other WAIS engines
maintaining full-text indices for a number of popular UseNet newsgroups.
If nothing else, you can visit these sites to see how efficient
WAIS can be. WAIS is often overlooked these days in favor of large
relational database back ends, but there's no reason not to use
WAIS for appropriate tasks. If you need a full-text search engine
to handle a reasonable amount of data, WAIS can do the job quickly
and efficiently.
- Greyware Site Index.
Here's an example of using WAIS to catalogue all the HTML on a
site. The WAIS catalogues are rebuilt daily and stored in one
directory. Static HTML documents in that directory let you select
the database, then execute the actual search using Boolean operators
and keywords. This site proves that WAIS is alive and well on
the NT platform. You can search over 11M of index in less than
a quarter-second, on average. The cataloguing itself takes about
15 minutes a day to run.
- Social Security Handbook 1995, from the United States Social
Security Administration.
I'm including this URL for a very specific reason: This is the
best example I've found of exactly the wrong way to index
a site. The material could easily be organized with a database
engine; even FreeWAIS could handle it without breathing hard. Instead,
the "index" is nothing more than a list of links: "Index
letter A," "Index letter B," and so on. When you
choose an index letter (roughly corresponding to the first word
of the subject, rather than the key idea of the subject), you'll
find a bunch of static links to documents by number. Yes, that's
right, by inscrutable SSA document number. Good luck ever finding
anything here. They'd have done much better by throwing everything
in one directory and using keyword retrieval. Study this page
carefully so you know how not to do it. If you're ever
tempted to organize your site this way, be prepared to deal with
angry e-mail from your bewildered and abused visitors.
- Infoseek Guide. Here's
an example to balance the Social Security Administration's abomination.
This search engine shows how it should be done. It's clean,
fast, easy to use, and remarkably useful. Infoseek's award-winning
engine not only brings you speedy results, but a great deal of
flexibility for advanced users. If you're writing your own search
engine from scratch, take a close look at Infoseek's specifications
and capabilities first. When you realize the size of the task
and the sanity Infoseek brings to it, you'll be even more impressed.
- Alta Vista. Another
example of how to do things right. Using some frighteningly powerful
DEC workstations and servers, Alta Vista brings you an incredibly
fast, incredibly large index of Internet sites and newsgroup contents.
The proprietary 64-bit search software was developed in-house
by Digital's research laboratory personnel. These guys aren't
fooling around. The indexer software can crunch a gigabyte of
text per hour. Scooter, their Web spider, which collects information,
can visit up to 2.5 million sites each day. Although the presentation
isn't as slick as Infoseek's, the search engine's breadth of knowledge
simply staggers the mind. This is a technology to watch.
- Indexes and Search Engines for Internet sources.
A useful list of search engines and indexes maintained by Jan
Wright. Jan's list will help you find the proper search engine
for your site.
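To see why keyword retrieval beats a hand-built list of links, it helps to look at how little code a basic full-text index needs. The sketch below is a toy inverted index with Boolean AND searching. It is not WAIS or any product named above, and the function names and sample documents are invented for illustration.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search_all(index, query):
    """Return the documents that contain every word in the query (Boolean AND)."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())  # keep only documents that also have this word
    return sorted(hits)


if __name__ == "__main__":
    # Made-up documents standing in for the pages of a site.
    pages = {
        "doc101.html": "Retirement benefits and survivors",
        "doc102.html": "Disability benefits",
        "doc103.html": "Medicare enrollment",
    }
    print(search_all(build_index(pages), "disability benefits"))
```

Even with inscrutable document numbers for file names, a visitor who types "disability benefits" lands on the right page. A production engine adds ranking, phrase matching, and an index stored on disk, but the core idea is the same.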
Many Web servers, especially recent entries into the field and
those designed for the NT platform, have database connectivity
built in to the server. Even those servers that don't talk to
databases directly (through ODBC, or Open Database Connectivity)
usually include a CGI module of some sort that does. While this
allows the advertisers to claim that the server comes packaged
with database functionality, often the level of database support
is only good enough to demonstrate connectivity, not build a real
application. In any case, older servers, especially in the UNIX
world, usually have no database support at all.
This section looks at a few third-party products designed from
the ground up to help you connect your Web server to a back-end
database. While many products are available, the ones I chose
are clear leaders in the field, either because of outstanding performance,
or general availability and widespread use.
- Cold Fusion. Cold Fusion
is a full set of connectivity tools to make your Web server work
seamlessly with your SQL back-end database server. Works with
O'Reilly WebSite, Netscape HTTPD, or Process Software's Purveyor.
Support for other platforms is coming soon.
- Users don't need to program in C, Perl, or any other programming
language. Cold Fusion provides the power automagically through
HTML, using high-level database commands and a general-purpose
CGI scripting language.
- Cold Fusion's heart is DBML.EXE, a CGI script tailored for
ODBC access to the back-end database of your choice. Cold Fusion
dynamically generates HTML pages containing the results of queries
or submissions, and lets you freely mix if-then-else conditional
processing and multiple SQL statements in with your regular HTML.
- W3-mSQL. W3-mSQL
is an interface package that lets you use mSQL (a freeware, lightweight
UNIX SQL engine) with your Web server. W3-mSQL is a CGI script
that works by interpreting enhanced HTML on the fly. Using a variation
on HTML comments to embed W3-mSQL commands, you connect to, query,
update, and close a back-end database entirely within your HTML.
- If you're planning to use mSQL on your UNIX machine and
don't want to write the interface code yourself, check out W3-mSQL.
See "MiniSQL (mSQL) and W3-mSQL" for more information on these two packages.
- mSQLJava Home Page.
This site offers a library of HotJava classes suitable for use
with an mSQL back-end database. The package is copyrighted by
Darryl Collins, but may be used, copied, and redistributed under
the terms described in the GNU General Public License. At this
site, you'll find links to FTP sites where you can download the
class library, links to pages with documentation, and links to
pages with sample programs and source code.
- mSQL (MiniSQL).
If all this talk about mSQL tools has you wondering about the
back-end database itself, here's the official source of information
and code. While the site is occasionally very slow to respond,
it's best to get the information straight from the horse's mouth.
- mSQL is a lightweight freeware SQL engine for UNIX machines.
It's fully ANSI-compatible, but implements only a subset of SQL
commands. For Web developers, this is ideal, since the subset
includes just about everything you'll need and discards the bits
you'll never use.
- Tango. Tango
Solutions, from Everyware, is a complete CGI package for the Macintosh
to connect HTML to their own back-end database, ButlerSQL. Development
is underway to allow Tango to talk to other SQL engines, but at
the moment only ButlerSQL is supported. There's no charge for
the ButlerSQL version of Tango; versions that connect to other
databases may have a fee eventually.
- On the Tango home page, you'll find links to demonstration
programs (some of them rather slick) for online shopping, conferencing,
and other useful ways to take advantage of Tango on your Macintosh
server. You'll also find a non-searchable FAQ page with links
to individual questions-and-answers (we have to wonder why Everyware
didn't store this information in a ButlerSQL database and let
the user search for keywords using Tango), and generic product
information. You may download the Tango software directly from
this page.
- Oracle World Wide Web Interface Kit Archive.
If you're using Oracle as your back-end database, look no further
than this page for your interface software. Oracle meticulously
provides information for interfacing most common Web servers with
their database product. They even examine cross-platform connectivity
issues and third-party products, and have complete working examples
of useful programs, including one that lets you do a keyword search
of NCSA's documentation.
- DB2 World Wide Web Connection, Version 1.
With typical IBM verbiage and charts, this page shows you how
to go about connecting your OS/2 or AIX Web server to a DB2 back-end
database. You'll find demos showing how DB2 WWW Connection V1
(that's the short name) can generate Netscape tables to hold query
results, and you can download the software directly.
- If your platform is OS/2 or AIX, and you're trying to talk
to a Big Blue database, this package is probably your best bet.
- WWW-DB Gateway List.
Here's a handy site maintained by KangChan Lee. Lee has gathered
in one place links to dozens of Web-to-database gateway programs,
methods, and tutorials.
- If you're using a back-end database other than the ones
I've already mentioned in this chapter, take a glance at Lee's
page. You'll probably find your database there, along with a helpful
link to available software for it.
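Whatever gateway you choose, the job it performs is the same: run a query against the back-end database and turn the result set into HTML. Here is a rough sketch of that job. It uses Python's built-in sqlite3 module as a stand-in for a real SQL server, which is my substitution and not something any of the products above do, and the table, columns, and function names are all invented.

```python
import html
import sqlite3


def query_to_html(conn, sql, params=()):
    """Run a query and return the result set as an HTML table."""
    cursor = conn.execute(sql, params)
    headers = [col[0] for col in cursor.description]
    rows = cursor.fetchall()
    if not rows:
        # The conditional branch: show a message instead of an empty table.
        return "<p>No matching records.</p>"
    out = ["<table>"]
    out.append("<tr>" + "".join("<th>%s</th>" % html.escape(h) for h in headers) + "</tr>")
    for row in rows:
        cells = "".join("<td>%s</td>" % html.escape(str(v)) for v in row)
        out.append("<tr>" + cells + "</tr>")
    out.append("</table>")
    return "\n".join(out)


def demo_connection():
    """Build a throwaway in-memory database with a hypothetical products table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO products VALUES (?, ?)",
        [("Widget", 9.5), ("Gadget <XL>", 24.0)],
    )
    return conn


if __name__ == "__main__":
    page = query_to_html(
        demo_connection(),
        "SELECT name, price FROM products WHERE price > ? ORDER BY name",
        (10,),
    )
    print("Content-Type: text/html\r\n\r\n" + page)
```

Two details matter in any version of this: pass user input to the database as parameters rather than pasting it into the SQL string, and escape every value before it goes into the page.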
If you're just looking for information from the Internet, use
one of the publicly available search engines. It's unlikely you'll
ever have the resources to duplicate the mighty Alta Vista, for
example, and even if you do, you'll need more help than this section
could possibly give. Besides, all the really good robot code has
commercial value, and hence isn't freeware.
On the other hand, if you want to build a small special-purpose
spider, worm, crawler, or robot, some code is available to help
you get started. More important than how to do it, however, is
how not to do it. That's why the first link I'll present is to
an article you must read if you're going to build a Web
automaton. Also be sure to check out Chapter 14, "Robots
and Web Crawlers," for more information about this topic.
- Ethical Web Agents.
This white paper by David Eichmann discusses the ethics of using
automata on the World Wide Web. If you don't want to be inundated
with angry letters from systems administrators, read this paper
carefully before you write the first line of code for your nifty
new robot.
- This article is highly informative, with hot links to references
and other papers pertinent to the subject. By reading this paper,
you'll arm yourself with all the knowledge necessary to build
a Web-safe robot.
- MOMSpider.
MOMSpider is a UNIX-based Perl 4 program. You may use or modify
this code, subject to the generous licensing restrictions from
the University of California, Irvine. If nothing else, you can
use the code as a jumping-off place when building your own automaton.
- (Checkbot).
Checkbot is a link-verification tool written in Perl, using libwww.
Written by Dimitri Tischenko and Hans de Graaff, this robot collects
links (starting from a given URL), and then validates them. This
is a handy tool to have around, although you'll probably want
to modify it for your particular needs.
- (WebCopy).
Victor Parada's WebCopy program takes a command-line argument
of a URL, then goes out and fetches the document. It can run
recursively, fetching all links referenced by that document. You
can download the code right from the site, and start using it
right away (you'll also need Perl, if you don't already have it).
By design, this program won't follow links across multiple servers;
this is to protect you from (a) endless recursion, and (b) retrieving
more than you bargained for.
- (WebWatch). WebWatch is a commercial
program for Win95, but you can download an evaluation copy. (The
evaluation copy has an expiry feature built in, and you don't
get to see the source code.) The documentation says the program
doesn't work on Windows NT now, but will soon.
- WebWatch is a personal-use spider that monitors your bookmarks,
updates lists of sites, checks for changed information, and so
on. You'll find step-by-step installation instructions and a short
FAQ. You may or may not find this product useful, but it certainly
demonstrates some smart thinking and slick marketing. You could
do worse than to build a robot with this kind of user interface
and intelligence.
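To make the "how not to do it" advice concrete, here is a minimal Python sketch of two habits a well-behaved robot needs: honoring a site's robots.txt rules, and staying on one server (as WebCopy does by design) so the crawl can't recurse forever. It is not taken from MOMSpider, Checkbot, or WebCopy. The robot name ExampleBot/0.1, the function names, and the links_by_page table (which stands in for real fetching and parsing so the sketch runs without a network) are all assumptions made for illustration.

```python
from urllib.parse import urljoin, urlparse
from urllib.robotparser import RobotFileParser

USER_AGENT = "ExampleBot/0.1"  # hypothetical robot name


def same_host(start_url, candidate_url):
    """True when candidate_url lives on the same server as start_url."""
    return urlparse(start_url).netloc.lower() == urlparse(candidate_url).netloc.lower()


def plan_crawl(start_url, links_by_page, robots_lines):
    """Return the URLs a polite, single-server robot would visit, in order.

    links_by_page maps a page URL to the hrefs found on that page. It is a
    stand-in for real fetching, so this function never touches the network.
    robots_lines is the site's robots.txt content, one line per list item.
    """
    rules = RobotFileParser()
    rules.parse(robots_lines)

    visited = []
    seen = {start_url}          # guards against endless recursion
    queue = [start_url]
    while queue:
        url = queue.pop(0)
        if not rules.can_fetch(USER_AGENT, url):
            continue            # the site asked robots to stay out
        visited.append(url)
        for href in links_by_page.get(url, []):
            absolute = urljoin(url, href)
            # Never wander off to another server, and never queue a page twice.
            if same_host(start_url, absolute) and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited
```

A real robot would also pause between requests and identify itself honestly in its User-Agent header; both are left out here to keep the sketch short.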
There are thousands of online games to choose from, if you're
of a mind to play games on the Internet. In this section, I've
selected a few that illustrate CGI techniques particularly well.
Some are incredibly complex, others very simple, yet all deal
with maintaining state information to provide the illusion of
interactivity.
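One common way to carry state from one request to the next is to write every earlier choice back into the page as hidden form fields, so the next submission hands them all back to the script. The Python sketch below shows the round trip under stated assumptions: the script path /cgi-bin/game, the field name step, and both function names are hypothetical, and this is not the code any of the games listed here actually runs.

```python
from html import escape
from urllib.parse import parse_qs


def render_step(state, next_step):
    """Build a form that carries every earlier choice along as hidden fields."""
    lines = ['<form method="post" action="/cgi-bin/game">']  # hypothetical script path
    for name, value in sorted(state.items()):
        lines.append(
            '<input type="hidden" name="%s" value="%s">'
            % (escape(name, quote=True), escape(value, quote=True))
        )
    # Tell the script which screen to show after this one is submitted.
    lines.append(
        '<input type="hidden" name="step" value="%s">' % escape(next_step, quote=True)
    )
    lines.append('<input type="submit" value="Continue">')
    lines.append("</form>")
    return "\n".join(lines)


def read_state(form_body):
    """Recover the state dictionary from a submitted, URL-encoded form body."""
    fields = parse_qs(form_body, keep_blank_values=True)
    # Keep the first value of each field; this sketch has no repeated names.
    return {name: values[0] for name, values in fields.items()}
```

Keep in mind that anyone can view and edit hidden fields, so a script should validate whatever comes back rather than trust it.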
- (Real Virtual, Incorporated).
Real Virtual does far more than Dungeons & Dragons on the
Web, but it does that bit exceptionally well. For the CGI student,
there's a lot to study here (plus, if you like fantasy role-playing
games, you can have a great time). Pay particular attention to
the way Real Virtual maintains state information as you move through
the setup screens. View the document source and notice all the
hidden fields containing your selections, plus information to
let the CGI program know what to do next.
- Real Virtual spent a lot of time and care developing this
project. From the user's point of view, the Fantasy Worlds adventure
looks a lot like a PC-based game, but with all the advantages
of being real-time and multi-player.
- (Netropolis). Netropolis
lets you become the CEO of a corporation located in or around
England. The goal is to win lots of cash and taunt the other players.
- Of special interest here is the slick use of image maps
to provide a sense of location, plus the integration of e-mail
into the game.
- If you like stomping on the business competition, you may
also enjoy the game.
- (S.P.Q.R.).
The first thing you should notice when you stop by S.P.Q.R. is
that the URL at the top of your browser changes to a long,
machine-generated address. This
vile concoction isn't something you'd want to type manually, but
it is there for a purpose. If you go to S.P.Q.R. with that URL, you'll
resume the game wherever you left off. S.P.Q.R. (from Time Warner
Electronic Publishing) generates a fake URL for you on the fly
when you walk through the front door. From then on, throughout
the game, that URL marks you as you, so the game can preserve
state information.
- The game itself is visually simple, but intriguing. You
wander through Rome collecting scrolls (which you can then read)
and keys (which you can use to unlock things). Your mission is
to save Rome from disaster. The game doesn't miss a beat when
it comes to maintaining state information, or matching graphical
output to what you've done.
- (QIN: Tomb of the Middle Kingdom).
This cool game from Pathfinder also uses an artificially munged
URL to keep track of who's who. The game is a visual version of
a text-based adventure game, with low-key but nevertheless impressive
graphics. This game lets you wander around a virtual 3D world
by clicking the view presented. It's all done with image maps,
so one of the inherent failings is that you can click anywhere, not
just areas that do something.
- This isn't the fault of the game design. It's a problem
with using image maps on things without clear boundaries. For
example, a toolbar or row of icons clearly has places to click.
The trivial case occurs when the user clicks right on a boundary
or on the background by mistake. In a game where you're clicking
areas of a 3D picture to govern motion, however, most clicks are
null. The trivial case becomes the few areas of the image that
actually do something. This can lead to lots and lots of clicking,
just to find out which areas of the image map are hotspots. Keep
this in mind when you're designing your own game.
- (The Barney Fun Page).
If you really hate Barney (the big purple dinosaur), you'll love
this page! Gerald Oskoboiny lets you get out all your angst against
the Purple One with a knife, a gun, an axe, an UZI, a shotgun,
a motorcycle, or a cannon. You select your weapon and fire away,
changing weapons as needed. Each time you shoot, the picture of
Barney changes to show the wounds, and you get a caption like,
"Barney has been grazed. You can do better than that,"
or "Barney has been slightly wounded," until, at last,
Barney dies.
- Gerald thoughtfully keeps on file morgue photos from the
last ten Barney-killings, so you can view the corpses and celebrate.
Now, if only the Purple One would stay dead...
- This site is sometimes slow (probably due to all the crazed
Barney hunters), but instructive for the would-be CGI programmer.
Although the subject matter is just plain silly, the site demonstrates
well how to make static drawings become interactive.
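S.P.Q.R. and QIN, described above, both identify each player by a generated URL. Here is a minimal Python sketch of that idea: issue a random token on entry, embed it in the path, and use it to look up saved state on every later request. The /game/<token>/ layout, the in-memory SESSIONS table, and the function names are assumptions for illustration only, not how either game is actually built.

```python
import secrets

SESSIONS = {}  # in-memory stand-in for whatever store a real game would use


def enter_game():
    """Create a new player record and return the path that identifies it."""
    token = secrets.token_hex(8)  # 16 hex characters, hard to guess
    SESSIONS[token] = {"room": "forum", "items": []}
    return "/game/%s/" % token  # hypothetical URL layout


def resume_game(path):
    """Look up the player's saved state from the token embedded in the path."""
    parts = [p for p in path.split("/") if p]
    if len(parts) < 2 or parts[0] != "game":
        return None
    # Unknown or mistyped tokens simply find no saved game.
    return SESSIONS.get(parts[1])
```

Because the token lives in the URL, a player who bookmarks the page can come back later and pick up where the game left off, which is the behavior described for S.P.Q.R.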
Internet Concepts, LLC, knows that content and presentation are
the two things that make one site stand out from another. They
have created several award-winning sites you may have already
encountered.
These sites not only are well-designed and visually appealing,
but they take an unusual approach to the development of site content:
They rely on the user to provide it.
Using a framework of hand-crafted CGI scripts written in Perl
5 and running on a Sun SPARCstation, Internet Concepts lets users
submit an entry on a fill-out form. A CGI script then processes
that entry, adding it to the database and making it immediately
available on the Web.
Stephan Spencer of Internet Concepts says, "Some consider
this real-time updating risky, but since December 1994 when we
first implemented this practice we have had no notable problems.
Nonetheless, we'll probably change this in the near future to
a policy of holding submissions in a 'pending' area until we have
reviewed them."
The database is based on Perl dbm. The script that processes
new entries requires the user to assign a password, too. The user
can then make changes to that entry later on. A "root"
or master password allows site supervisors to change individual
passwords, edit entries, or delete entries. Another script allows
browsing. It displays the database sorted by name, organization
name (if applicable), category/genre, and location. Most of these
sites are also keyword-searchable.
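The password scheme described above can be sketched in a few lines. Internet Concepts wrote its scripts in Perl 5 on top of Perl dbm; the version below uses Python's dbm module instead, purely as an illustration. The record layout, the function names, the ROOT_PASSWORD value, and the choice of a salted PBKDF2 hash are all my assumptions, not details of their implementation.

```python
import dbm
import hashlib
import hmac
import json
import os

ROOT_PASSWORD = "change-me"  # hypothetical master password; keep it out of source in practice


def _digest(password, salt):
    """Return a salted hash so the password itself is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000).hex()


def add_entry(db_path, key, fields, password):
    """Store a new entry along with a salted hash of its owner's password."""
    salt = os.urandom(16)
    record = {"fields": fields, "salt": salt.hex(), "hash": _digest(password, salt)}
    with dbm.open(db_path, "c") as db:
        db[key] = json.dumps(record)


def update_entry(db_path, key, fields, password):
    """Replace an entry's fields if the owner's or the root password matches."""
    with dbm.open(db_path, "c") as db:
        if key not in db:
            return False
        record = json.loads(db[key])
        owner_ok = hmac.compare_digest(
            _digest(password, bytes.fromhex(record["salt"])), record["hash"]
        )
        root_ok = hmac.compare_digest(password.encode(), ROOT_PASSWORD.encode())
        if not (owner_ok or root_ok):
            return False
        record["fields"] = fields
        db[key] = json.dumps(record)
        return True
```

A browsing script would open the same file read-only and sort the decoded records by whichever field the visitor asked for.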
InnSite even offers a geographical search interface that responds
to user clicks by zooming in indefinitely on a region, returning
images in real time from the (Xerox PARC Map Viewer).
Internet Concepts provides many of these sites as a public service
to the Internet community. They also, however, design and implement
many commercial sites. One of the most interesting is the Online
Catalogue at (Seton Identification Products).
This site offers the "Workplace Safety Home Page," and
a searchable online catalog of thousands of signs, labels, tags,
pipe markers, and other identification products. The site supports
online ordering of over 6,000 items.
If you're interested in finding out more about Internet Concepts,
their home page is at this site, or you
can send them e-mail at .
The wizards at Internet Concepts have used CGI to create their
enchantments. With what you've learned in this book, you can invoke
the magic of CGI, too.