Database-Backed Web Sites

by Gregory Jorgensen, PDXperts LLC

You can make a web site from a collection of HTML files, but what if you need hundreds or thousands of pages on your site? How can you collect and save information or orders from visitors to your site, and store that information so you can query and manipulate it later? How can you make changes to your product catalog and reflect those changes throughout your site immediately, without editing a lot of HTML? The answer: a database-backed web site.

This article explains how to generate HTML from a database, and how you can decide if you need a database-backed site.

Where web pages come from
The web uses the metaphor of pages: the web seems to consist of billions of individual pages, addressed by URLs and connected through links. Not all of that HTML can come from people typing into text editors. A lot web pages come from static HTML files--files created "by hand" or generated from a content management system or some program that can create HTML (like Microsoft Word). But a lot more pages come from software that generates HTML dynamically, usually in response to inputs from the person visiting the page. In other words, a lot of the web doesn't exist as HTML files; a program creates the page when someone asks for it.

When you enter a URL into your browser's location box, choose a bookmark, or click on a link, your browser sends a request to a server. The server interprets the request and responds with a stream of text (with embedded HTML tags). The browser only cares that it gets text and HTML from the server--it doesn't matter if the server sends the contents of a text file or if the server runs a program to generate the response.

A simple database-backed site
Suppose you want to make a site to show a catalog of a few hundred books. You want to organize the books into categories, allow searching by title, author, and keywords, and show a page for each book that includes a picture of the cover, publisher and copyright information, price, and description. Creating hundreds of separate HTML pages, one per book, and then creating the navigation links for the categories seems daunting enough. And how will you implement the search features? I'll use this example to demonstrate how a simple database-backed site works.

Instead of creating a pile of HTML files, you create a database of books. For a few hundred (or even a few thousand) books you can use Microsoft Excel (or any spreadsheet program) to set up a simple database--you don't necessarily have to learn a database program. Each row of the spreadsheet describes a unique book title, and the spreadsheet columns describe the attributes common to all books. Some likely attributes: ISBN, title, author name(s), keywords, description, price, and quantity on-hand.

Next, you enter your book information into the spreadsheet. When you get all of your books entered, you have a database of books good enough to generate a web site. Spreadsheets work great for data entry, but before you can use the database from your web site, you need to save the database in a common non-proprietary format. For spreadsheet data exporting a tab-delimited file works best--a simple text file with one line for each row of your spreadsheet, and invisible tab characters separating the columns.

How does the information in the database get turned into HTML? One common technique involves templates. A web designer or programmer creates templates for the different kinds of pages in the site, substituting placeholders for the real information. For example, "The Great Gatsby by F. Scott Fitzgerald" becomes "@@title@@ by @@author@@" in the template (I arbitrarily chose @@name@@ for the template placeholders; different template mechanisms use different notation). A simple template with formatting:

    <html>
        <head>
            <title>@@title@@ (ISBN @@isbn@@)</title>
        </head>
        <body>
            <h1>@@title@@</h1>
            <h2><i>by @@author@@</i></h2>
            <img src="covers/@@isbn@@.jpg">
            <p>@@description@@</p>
            <br>
            <p>ISBN @@isbn@@, $@@price@@ each, @@qty@@ in stock</p>
        </body>
    </html>

Now the tricky part: you need to write server-side code to put the database and template together (or find someone to write it for you). To show the product page for the book with ISBN X the process seems simple enough:

  1. get the desired ISBN number from the request
  2. read rows from tab-delimited books database until a book with ISBN=X found
  3. substitute the books attributes for the template placeholders
  4. send the resulting HTML to the browser that requested the page

The details of this process depend on the web server platform and the server-side tools and programming language, but the basic idea applies regardless of the technologies you use.

Graduating to a real database server
Using a spreadsheet works for small databases that don't change often, but if you have tens of thousands of books, or you need to track inventory and orders in real-time (as customers buy from your web site), you need a real database solution. Many web sites use the free open source MySQL database. MySQL runs on the server, so you can make changes to your database "live" rather than changing a spreadsheet, exporting to a tab-delimited file, and uploading to the server. MySQL also handles millions of rows (records) in a table and can perform operations like "search for ISBN=X" a lot faster than looking at every line in a text file. Most ISPs that host web sites offer database service, with either MySQL or commercial products such as Microsoft SQL Server and Oracle.

Going back to the spreadsheet database, you see that it doesn't help with inventory and orders. For example, you have to manually adjust the quantity on-hand for each book ordered, then export and upload the database with the correct quantities. With a database running on the server your server-side program can adjust inventory in real-time, as customers place orders, so customers don't place orders for sold-out books before you have a chance to adjust and export the spreadsheet. You can also collect customer information (name, address, etc.) and save that in the database, with each customer order.

Once you have your database on the server, you can use private web pages to administer the database. For example, you can have a password-protected section of your site where you can add and delete books, change prices, etc. You can make those changes any time, from anywhere that has an Internet connection. In addition, you can look at customer orders from any web browser.

Server-side programming tools
You can find many tools and technologies for server-side programming. Some of the more popular tools:

Programmers have used these tools to create many pre-fab site kits, especially for web store and e-commerce applications. You create some templates (or choose pre-made templates), pour in your database, and out comes a more-or-less complete web site. You can find both free and commercial site builder kits and pre-made stores; your ISP may offer one. Buying software and adapting to it usually costs less than writing your own, but you usually don't get exactly what you want. A consultant familiar with existing solutions and custom server-side programming can help you find and choose solutions for your particular application.

Reliability and security
The database backing your web site holds the main value of your site, so you must carefully plan and design the database. If you don't know how to design a database for your web site, find someone who does. Changing the database design after you finish the site, or after you take the site live, will cost a lot more than doing it right the first time. A good database designer and web programmer can also plan for growth, so your database scales with your business. And you can't neglect reliability: if the database breaks your whole site goes dead.

Databases sitting on web servers present attractive targets to thieves and crackers, especially if you store personal information or credit card numbers. Even one breach can cost you a lot of time, money, and good will. Get professional security advice before deploying a web site or database that collects and stores personal or sensitive information. See my article Introduction to Relational Databases in the April 2003 Computer Bits for more advice on database design and deployment.

Further Reading
Most books on web site design, e-commerce, and databases either emphasize business and management or specific tools and technologies. You can find hundreds of books on web development. I chose a few that rise above the rest.

Philip and Alex's Guide to Web Publishing, by Philip Greenspun

Database Backed Web Sites: The Thinking Person's Guide to Web Publishing, by Philip Greenspun

Inside Relational Databases, by Mark Whitehorn, Bill Marklyn

Web Monkey has several articles on database-backed web development:
http://hotwired.lycos.com/webmonkey/backend/databases/index.html

Developer Shed has some good articles and tutorials:
http://www.devshed.com/