by Paul Doyle
This book is about a clever little programming language called Perl and how you can use it to make the most of your World Wide Web server.
The book tells you what Perl is, how it works, and how to write Perl programs. Much of this material will be useful even if you never do any Web server work. The book also deals with some general Web server issues, such as security. But at heart, this book is about Perl programming applied to Web development.
The term intranet, which doesn't have the same currency as Internet, refers to a network along the lines of the Internet but internal to a corporation and usually protected from the Internet by a firewall. Web servers dominate on these so-called local Internets just as much as they do on the real thing, so this book is as relevant to them as it is to the global network.
Still, just so that there's no confusion, review the following list:
That list is a roundabout way of saying that a Web server and an HTTP server are not necessarily the same thing, and that this book is about Perl in the context of HTTP servers. We'll refer to Web servers frequently throughout the book, because the concept is often useful for dealing with the presentation of information on the Web. Also, Web is in the book's title because it reads better than Using Perl for HTTP Server Programming.
Estimating the number of people who use the Internet is notoriously difficult, but it's generally recognized that more than 20 million people now use the Internet on a regular basis. This figure includes people who have electronic mail or FTP access, as well as those who are fortunate enough to have the type of connection and equipment that allows them to use the Web.
Not all of the 20 million Internet users have access to the Web, but the number who have Web access is growing faster than the overall number who have Internet access. Recently, publicity about the Web has become so overwhelming that many people think of the Internet purely as being the Web. Also, for the first time, many people are purchasing PCs for the primary purpose of Web access.
The Web, in short, is in an upward growth spiral that shows no sign of leveling out before the end of the century.
One interesting fact is that people are astonishingly creative in thinking up uses for the system. Live share prices as a Web-based screen saver, political agitation and petition collection, merchandise sales via the Web, multimedia résumés... there's just no telling what people will get up to, given enough bandwidth.
Another interesting fact is that in spite of its scale (which suggests a homogenizing influence), the Web appears to act as an agent of diversity. Small companies, community groups, and schools are there along with the big corporations. The number of languages represented on the Web is growing, not declining. This broad spectrum of interests may be due, in part, to the increasing ease with which organizations can establish an effective Web presence.
Perhaps the most important trend, from the point of view of this book, is that the Web is becoming a much more dynamic place. Dynamic doesn't just mean that pages are now being replaced on a regular basis (although they are, which is a welcome change from the time when Web pages tended to be less recent than printed matter). The word also doesn't just mean that the people who produce the Web every day have a dynamic, creative demeanor (although many of them do, which is why we have such wonders as Robotman, at http://www.unitedmedia.com/comics/robotman). Dynamic means that much more of the information available on the Web is generated live when a user requests it. Databases are searched, files are counted, text is translated, and so on.
This trend is part of the excitement of using the Web now. An interesting page is all very well, but if the page is static, you probably won't visit it again except to see whether it has been updated. If, however, the content of the page depends on the passage of time, on your input, or on the input of other users, you are much more likely to come back.
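To make the idea of a page that is generated live a little more concrete, here is a minimal sketch of a CGI-style program. It assumes the usual CGI conventions (the query string arrives in the QUERY_STRING environment variable, and the response starts with a Content-type header followed by a blank line). The subroutine name dynamic_page and the name parameter are invented for this illustration.

```perl
use strict;
use warnings;

# Build a complete CGI response: a header line, a blank line, then HTML
# whose content depends on who is asking and when.
# dynamic_page is a hypothetical name used only for this sketch.
sub dynamic_page {
    my ($visitor, $now) = @_;
    $visitor = 'stranger' unless defined $visitor && length $visitor;

    # Escape characters that have special meaning in HTML, so that
    # user input is displayed rather than interpreted by the browser.
    for ($visitor) {
        s/&/&amp;/g;
        s/</&lt;/g;
        s/>/&gt;/g;
        s/"/&quot;/g;
    }

    my $stamp = scalar localtime $now;
    return "Content-type: text/html\n\n"
         . "<html><body>\n"
         . "<p>Hello, $visitor. This page was generated at $stamp.</p>\n"
         . "</body></html>\n";
}

# Pull an optional "name" parameter out of the query string and decode it.
my $query = defined $ENV{QUERY_STRING} ? $ENV{QUERY_STRING} : '';
my ($name) = $query =~ /(?:^|&)name=([^&]*)/;
if (defined $name) {
    $name =~ tr/+/ /;                              # "+" stands for a space
    $name =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;    # %XX stands for one byte
}

print dynamic_page($name, time);
```

Every request produces a slightly different page, because the time stamp and the visitor's name are filled in when the program runs rather than when the page was written.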
The trend is also a big part of the excitement of developing on the Web now. Web server management involves much more than writing pages of deathless HTML; a good deal of real-time programming goes on, too. This programming is real-time in the sense that the programs react to external events and produce output that is used there and then. You could also say that the programming is real-time because the pressure for rapid development and new features means that the code is often edited while it is in use.
Perl is the ideal development language for Web server work, for many reasons. Chapter 1, "Perl Overview," discusses the nature of Perl in much more detail; the following sections concentrate on the reasons why Perl suits Web server development.
High-level tasks such as file manipulation and text formatting are exactly the kind of tasks at which Perl excels. You can tell Perl to slurp in the contents of a file and display it on the user's browser with all newlines replaced by tabs, as follows:
while ( <INFILE> ) { s/\n/\t/; print; }
Don't worry about the details of that code example until you read Chapter 1, "Perl Overview." Just notice two things:
In a nutshell, the secret of rapid development is writing small amounts of powerful code without having to consider awkward issues of syntax at every step.
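For readers who want to try the newline-to-tab loop shown earlier, the following is one way to flesh it out into a complete program. This is a sketch, not the only way to do it: the subroutine name tabify_lines is invented here, and the sample input is read from an in-memory string so that the example does not depend on any particular file.

```perl
use strict;
use warnings;

# Read every line from an open filehandle and return the text with each
# newline replaced by a tab. tabify_lines is a hypothetical name.
sub tabify_lines {
    my ($in) = @_;
    my $out = '';
    while (my $line = <$in>) {
        $line =~ s/\n/\t/;    # same substitution as the one-line loop
        $out .= $line;
    }
    return $out;
}

# Demonstration: treat a string as if it were a file.
my $sample = "first line\nsecond line\n";
open my $demo, '<', \$sample or die "Cannot open sample text: $!";
print tabify_lines($demo), "\n";
close $demo;
```

To process a real file, open it by name instead (for example, open my $in, '<', $path) and pass the resulting filehandle to the subroutine.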
Perl is pithy; a little Perl code goes a long way. In terms of programming languages, that statement usually means that the code is difficult to read and painful to write. But although Larry Wall (the author of Perl) says that Perl is functional rather than elegant, most programmers quickly find that Perl code is very readable and that becoming fluent in writing it is not difficult. The fact that Perl is pithy rather than terse makes it especially appropriate for the high-level macro operations that are typically required in Web development.
As it happens, Perl is quite capable of handling some fairly low-level operations, too, such as handling operating-system signals and talking to network sockets. But for most Web programming purposes, that level of detail is just not needed.
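As a small taste of the low-level side, the following sketch installs a handler for an operating-system signal and then sends that signal to its own process. It assumes a UNIX-like system on which the USR1 signal exists; the subroutine name catch_usr1 is invented for this illustration.

```perl
use strict;
use warnings;

# Install a temporary handler for SIGUSR1, send the signal to this very
# process, and report whether the handler ran.
# Assumes a UNIX-like system; catch_usr1 is a hypothetical name.
sub catch_usr1 {
    my $caught = 0;
    local $SIG{USR1} = sub { $caught = 1 };

    kill 'USR1', $$;    # $$ is the process ID of the running program

    # Wait briefly (at most about one second) in case delivery is delayed.
    my $tries = 0;
    select(undef, undef, undef, 0.05) while !$caught && $tries++ < 20;

    return $caught;
}

print catch_usr1() ? "caught SIGUSR1\n" : "no signal seen\n";
```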
Because the compiler runs only one time, it can afford to take its time about generating the executable code. As a result, compilers tend to perform elaborate optimization on the program code, with the result that the executable code runs efficiently.
Compilers and interpreters each have relative advantages and disadvantages. Compiled code takes longer to prepare, but it runs fast, and your source stays secret. Interpreted code gets up and running quickly, but it isn't as fast as compiled code; in addition, you need to distribute the program source if you want to allow other people to run your programs.
Which of these categories describes Perl?
Perl is special in this regard: it's a compiler that thinks it's an interpreter. Perl compiles program code into executable code before running it, so an optimization stage occurs, and the executable code runs quickly. Perl doesn't write this code to a separate executable file, however; instead, it stores the code in memory and then executes it. Therefore, Perl combines the rapid development cycle of an interpreted language with the efficient execution of compiled code.
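One way to see the two phases for yourself is with a BEGIN block, which Perl runs as soon as it has compiled the block, before any ordinary statement in the program executes. The sketch below records the order in which things happen; the variable name @events is chosen only for this example.

```perl
use strict;
use warnings;

our @events;    # records what happened, in order

push @events, 'run: first statement';

# Although it appears in the middle of the file, this block runs during
# compilation, before either of the ordinary statements.
BEGIN { push @events, 'compile: BEGIN block' }

push @events, 'run: second statement';

print "$_\n" for @events;
```

The BEGIN entry is listed first, even though it comes second in the source, because the whole program is compiled before the run phase starts.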
The corresponding disadvantages of compilers and interpreters also apply to Perl. The need to compile the program each time it is run makes for slower startup than a purely compiled language provides, and developers are required to distribute source code to users. In practice, however, these disadvantages are not too limiting, for the following reasons:
In summary, Perl is compiled behind the scenes for rapid execution, but you can treat it as though it were interpreted. Tweaking your HTML is easy; just edit the code, and allow the users to run it. But is that good programming practice? Hey, that's one for the philosophers.
Because Perl code is truly compiled, it has no such thing as a run-time syntax error (unless you get into the realm of generating Perl code on the fly and then executing it). This fact is important when you consider that your server is your interface to the outside world; sudden script crashes caused by minor typos are not what you want people to see. Quick execution of a Perl script tells you whether all the syntax in the script is valid.
Of course, that's no guarantee that your code won't disgrace you for some other reason.
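The exception mentioned in the note, code that is generated on the fly, can be sketched with Perl's string form of eval, which compiles text at run time and leaves any error message in $@. The subroutine name syntax_error_in is invented for this example. For a whole script, running perl with the -c switch checks the syntax without running the main program.

```perl
use strict;
use warnings;

# Return undef if $source compiles cleanly, or Perl's error message if
# it does not. Wrapping the text in "sub { ... }" makes Perl compile it
# without running it. syntax_error_in is a hypothetical name.
sub syntax_error_in {
    my ($source) = @_;
    my $compiled = eval "sub { $source\n}";
    return defined $compiled ? undef : $@;
}

for my $snippet ('my $total = 1 + 2;', 'my $total = (1 + 2;') {
    my $error = syntax_error_in($snippet);
    print defined $error ? "rejected: $snippet\n" : "compiles: $snippet\n";
}
```

Only code that is built as a string and compiled while the program is running can fail in this way; everything else in the file has already been checked before the first statement executes.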
Perl's developer could have expanded the language to handle these tasks by adding more and more keywords and operators, making the language bigger. Instead, the core of the Perl language started small and became more refined as time went on. In some ways, the language actually contracted: the number of reserved words in Perl 5.0 is less than half the number in Perl 4.0.
This situation reflects an awareness that Perl's power lies in its unique combination of efficiency and flexibility. Perl itself has grown slowly and thoughtfully, usually in ways that allow for enhancements and extensions to be added rather than hard-wired in. This approach has been critical in the development of Perl's extensibility over time, as the following section explains.
The capability to use extensions such as these is a remarkable advance in the development of a fairly slick language, and it has helped to fuel the growth in Perl use. Perl developers can easily share their work with others, and the arrival of objects in Perl 5.0 makes structured design methodologies possible for Perl applications. The language has come of age without losing any of its flexibility or raw power.
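To give a flavor of what objects look like in Perl 5, here is a minimal sketch of a class. The package name HitCounter and its methods are invented for this illustration and are not taken from any module described in this book.

```perl
package HitCounter;    # hypothetical class, for illustration only

use strict;
use warnings;

# Constructor: an object is an ordinary hash reference that has been
# "blessed" into the package.
sub new {
    my ($class, %args) = @_;
    my $self = { count => $args{start} || 0 };
    return bless $self, $class;
}

# Record one more hit and return the new total.
sub hit {
    my ($self) = @_;
    return ++$self->{count};
}

# Report the current total.
sub count {
    my ($self) = @_;
    return $self->{count};
}

package main;

my $counter = HitCounter->new(start => 10);
$counter->hit for 1 .. 3;
print "Hits so far: ", $counter->count, "\n";
```

A package like this can be placed in its own file and shared, so other programs can use the counter without knowing how it stores its data.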
Appendix B, "Perl Web Reference," describes several Perl libraries and modules. Browse through the appendix to get the flavor of the modules that are available. Also, the CD-ROM that came with this book contains a collection of freely available modules, along with documentation that explains how to use them. For details, see Appendix C, "What's on the CD?"
Of particular interest to many Web server managers is the fact that Perl works well with standard UNIX DBM files. Also, support for proprietary databases is growing rapidly. These considerations are significant if you plan to allow users to query database material over the Web.
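The following sketch shows the general shape of working with a DBM file from Perl: a hash is tied to a file on disk, so that storing a value in the hash writes it to the file. It uses the SDBM_File module, chosen here only because it is distributed with Perl; other DBM flavors are used through a similar tie call. The subroutine name store_and_fetch and the file name visitors are invented for this example.

```perl
use strict;
use warnings;
use Fcntl;                     # supplies O_RDWR, O_CREAT, and O_RDONLY
use SDBM_File;                 # a DBM implementation distributed with Perl
use File::Temp qw(tempdir);

# Store one key/value pair in a DBM file, close the file, reopen it
# read-only, and return the value found under the key.
# store_and_fetch is a hypothetical name.
sub store_and_fetch {
    my ($db_path, $key, $value) = @_;
    my %db;

    tie(%db, 'SDBM_File', $db_path, O_RDWR | O_CREAT, 0666)
        or die "Cannot open DBM file $db_path: $!";
    $db{$key} = $value;        # written to the file on disk
    untie %db;

    tie(%db, 'SDBM_File', $db_path, O_RDONLY, 0666)
        or die "Cannot reopen DBM file $db_path: $!";
    my $found = $db{$key};     # read back from the file
    untie %db;

    return $found;
}

my $dir = tempdir(CLEANUP => 1);    # scratch directory, removed at exit
print store_and_fetch("$dir/visitors", 'alice', '42'), "\n";
```

Because the data lives in a file rather than in memory, a value stored by one run of a program is still there the next time any program ties the same file.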
Chapter 1, "Perl Overview," describes the Perl language, and introduces the syntax and constructs that will be used throughout the book.
Chapter 2, "Introduction to CGI," explains how Perl programs on a Web server talk to the outside world and how data is sent from a browser to a Perl program running on a server.
Chapter 3, "Advanced Form Processing and Data Storage," deals with the issues involved in accepting data from a user who is using HTML forms. The chapter also explains how to deal with the data when it arrives.
Chapter 4, "Advanced Page Output," is about using forms to manage server-based data.
Chapter 5, "Searching," explains how Perl can be used to facilitate database or flat-file searches on the server.
Chapter 6, "Using Dynamic Pages," describes some techniques that you can use to keep your pages current and make them respond to external events.
Chapter 7, "Dynamic and Interactive HTML Content in Perl and CGI," covers the issues involved in using Perl to translate documents to and from HTML on the fly.
Chapter 8, "Understanding Basic User Authentication," explains the security issues involved in allowing users to run programs on a Web server. In addition, the chapter describes how to manage server access rights.
Chapter 9, "Understanding CGI Security," shows how a CGI wrapper script can allow users on the server to make their own programs available on the server without compromising server security.
Chapter 10, "Site Administration," is about how Perl can be used both online and offline to help you with day-to-day server management tasks.
Chapter 11, "Database Interaction," is about the rapidly expanding field of Web-based databases and how Perl can be used to manage the interaction between the user and the database.
Chapter 12, "Database Application Using CGI," discusses the issues involved in running a database on a Web server inside your company but allowing external users to have limited access.
Chapter 13, "Special Variables," describes all the tokens that have special meaning in Perl and give it its unique flavor.
Chapter 14, "Operators," lists the Perl operators and describes how they operate.
Chapter 15, "Function List," is a detailed reference to Perl's built-in functions.
Chapter 16, "Subroutine Definition," explains how to use subroutines, libraries, and modules to organize your programs and to produce reusable code.
Appendix A, "Perl Acquisition and Installation," explains where to get Perl and how to install it on your UNIX or Windows NT machine.
Appendix B, "Perl Web Reference," describes many of the Perl modules, add-ons, and related tools that are available on the Internet and on the CD-ROM that comes with this book.
Throughout the book, we'll use snippets of Perl code and sometimes entire listings for illustration. All code listed in this manner is available on the CD-ROM.
This book (like life in general) is too short to describe how to do all the things in the list for all the HTTP servers and browsers that currently exist. For the sake of manageability, we'll concentrate on the Apache server and the Netscape browser, which were the most popular products of their kind at the time when this book was written. When significant differences exist between these products and other popular products, we'll draw your attention to that fact.
The majority of application programs written these days allow you to use either a mouse or a keyboard to operate the program. In steps that tell you how to perform a particular task, I always indicate the appropriate keystroke combinations that perform the action. Hot keys, or accelerators, are designated by an underline below the character that's the accelerator. If I were giving you instructions to open a file using Microsoft Word, for example, you would see:
From the File menu, choose Open.
To indicate a combination of keys to be pressed at the same time, the two keys are joined by a plus (+) character. If I were giving you instructions to paste a piece of text from the Clipboard into a Word document, you would see:
From the Edit menu, choose Paste, or press Ctrl+V.
The names of dialog boxes and windows, and the names of dialog-box and window options, are indicated by capitalizing the initial letters of the title. When you are saving a file in Microsoft Word, for example, the dialog box that you use to specify the file name is:
File Save
Any new terms and ideas are introduced in italic type, and messages that may appear on the screen are presented in a special font:
All source code and examples of code listings are presented in a monospace font.
Tips are used to indicate some cool trick or a neat way to organize your code. Watch out for tips, and use them in your day-to-day work, because they'll generally save you time or offer you a unique solution to an existing problem.
Notes provide extra information related to the topic that is being discussed in the body of the text.
Cautions are designed to alert you to dangerous actions or situations that could cause damage to your environment. You should pay particular attention to cautions so that you do not create a problem at your site.
If more information on a particular topic appears in another chapter, you see a cross-reference that indicates the chapter to look for. A right-facing triangle indicates that the reference is in a later chapter of the book. A left-facing triangle means that the reference is in an earlier chapter. Following is an example:
Enjoy the book!
Copyright ©1996, Que Corporation. All rights reserved. No part of this book may be used or reproduced in any form or by any means, or stored in a database or retrieval system without prior written permission of the publisher except in the case of brief quotations embodied in critical articles and reviews. Making copies of any part of this book for any purpose other than your own personal use is a violation of United States copyright laws. For information, address Que Corporation, 201 West 103rd Street, Indianapolis, IN 46290.
Notice: This material is from Special Edition, Using Perl for Web Programming, ISBN: 0-7897-0659-8. The electronic version of this material has not been through the final proofreading stage that the book goes through before being published in printed form. Some errors may exist here that are corrected before the book is published. This material is provided "as is" without any warranty of any kind.