Bulletin of the American Society for Information Science and Technology, Vol. 28, No. 1, October/November 2001

Special Section

Understanding Content Management

by Bob Boiko

Bob Boiko is president, Metatorial Services Inc. He can be reached at 309 NW 78th Street, Seattle, WA 98117; by e-mail at bob@metatorial.com; on the Web at www.metatorial.com; or by phone at 206/706-3078.

If you have previously heard about content management (CM) it is most likely because you are connected to a large Web development project. Today that is where most of the interest and activity are. When the Web moved past small informally designed sites and into large, rapidly changing sites, the need for strong management tools became pressing. Product companies moved in to address this need and called their offerings content management systems (CMS). If your only problem is to create and maintain a large website, you have reason enough to desire the strict structure and formal procedures of a CMS. Such a system helps you get and stay organized so that your site can grow and change quickly while maintaining high quality. The Web, however, is simply one of many outlets for information that organizations need to manage. And when the amount of information sharing between these outlets grows, the desire for an organized approach becomes an absolute need.

I have been giving talks and running seminars on CM for the last couple of years. I often ask my audiences what publications they are responsible for. It used to be that the very large majority of the responses were "website only." However, an ever-increasing number now say "multiple websites and print publications, and anything else we are able to create from the same information."

At the highest level, CM is the process behind matching what you have to what they want. You are an organization with information and functionality of value. They are a set of definable audiences who want that value. This definition and the processes behind it work as well in other outlets as on the Web. In other words, at first blush CM may seem like a way to create large websites, but upon closer examination, it is in fact an overall process for collecting, managing and publishing content to any outlet.

  • In collection, you either create information or acquire it from an existing source. Depending on the source, you may or may not have to convert the information to a master format (such as XML). Finally, you aggregate the information into your system by editing it, segmenting it into chunks (or components) and adding appropriate metadata.
  • In management, you create a repository that consists of database records and/or files containing content components and administrative data (data on the system's users, for example).
  • In publishing, you make the content available by extracting components out of the repository and constructing targeted publications such as websites, printable documents, and e-mail newsletters. The publications consist of appropriately arranged components, functionality, standard surrounding information and navigation.
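
The three stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Component and Repository classes, the collect and publish functions, and the sample metadata fields (title, audience) are all hypothetical names invented for the example, not part of any particular CMS product.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Component:
    """One chunk of content plus the metadata added during collection."""
    component_id: str
    body: str
    metadata: dict = field(default_factory=dict)


class Repository:
    """Management: stores components and lets publications select them."""

    def __init__(self):
        self._components = {}

    def add(self, component):
        self._components[component.component_id] = component

    def select(self, **criteria):
        # Return components whose metadata matches every given criterion.
        return [
            c for c in self._components.values()
            if all(c.metadata.get(k) == v for k, v in criteria.items())
        ]


def collect(component_id, raw_text, **metadata):
    """Collection: normalize whitespace in the source text and attach metadata."""
    return Component(component_id, " ".join(raw_text.split()), dict(metadata))


def publish_web(components):
    """Publishing: render the selected components as an HTML fragment."""
    return "\n".join(
        f"<h2>{escape(c.metadata['title'])}</h2>\n<p>{escape(c.body)}</p>"
        for c in components
    )


def publish_newsletter(components):
    """Publishing: render the same components as plain text for e-mail."""
    return "\n\n".join(
        f"{c.metadata['title'].upper()}\n{c.body}" for c in components
    )


repo = Repository()
repo.add(collect("n1", "  New   product  released. ", title="Launch", audience="customers"))
repo.add(collect("n2", "Office closed Friday.", title="Notice", audience="staff"))

# One selection from the repository feeds two different publications.
for_customers = repo.select(audience="customers")
web_page = publish_web(for_customers)
newsletter = publish_newsletter(for_customers)
```

The point of the sketch is that the content is stored once, independent of any outlet, and each publication is just a different rendering of the same selected components.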

If CM were only a process for creating large websites, it would be worth using for organizations that have them. For organizations with multiple publications (even if the publications are multiple websites) CM becomes a necessity. However, there is a deeper reason why CM is important. It is not just the common forms of information I call "content" that must be collected, managed and delivered: it is all business. When organizations begin doing business electronically they must answer the same questions that confront them when they begin delivering information electronically. Namely:

  • How do I break my business down into electronically deliverable parts?
  • How do I make sure that I know what parts I have and that these parts are the right ones for my staff, partners and customers?
  • How do I assure that the right part reaches the right person at the right time?

If CM is the process of collecting, managing and publishing content, then eBusiness is the process of collecting, managing and publishing parts of your business. Of course, much of business entails publishing information. So, with no stretch at all, you can see how CM might underlie some of eBusiness. But organizations do business by providing for actions as well as information. Actions in an electronic world take the form of computer functionality. Functionality, in turn, takes the form of objects and blocks of computer programming code, which are distributed in exactly the same way as run-of-the-mill content.

Systems for Managing Content

People use a very wide variety of tools and methods to process information and get it into publications. From paper note cards to massive database applications, organizations have found ways of organizing and managing information. My goal is not to say which ones of these systems are or are not true CMS. Rather, it is to describe the full range of possibilities available for constructing a system to handle most effectively the range of issues that may arise.

I'll begin with the simplest system I have seen that still has some of the earmarks of a full system, the "nominal" Web CMS. I'll then move up in complexity to describe other systems that embody more and more of the full picture: dynamic websites, full Web CMS, and enterprise CMS.

The Nominal Web CMS

When people started creating websites, they did so by typing HTML into plain text editors like Microsoft's Notepad. As time moved on the need for better tools grew, fueled by enterprising product companies, less technical users and the need to automate tedious tasks. At first, these new HTML authoring tools did little more than help us remember the arcane syntax of HTML. Later, they began to be true WYSIWYG environments for creating Web pages. Today, they have added just enough management tools to serve as nominal Web CMS for organizations with small sites and no additional publications.

Packages like Macromedia Dreamweaver and Microsoft FrontPage intend to be single tools that allow you not only to create pages, but also to share resources between pages and manage the layout and organization of your site. The following are the main facilities they offer for these management tasks:

  • Page templates, which allow you to create standard page layouts and apply them across pages. These templates allow you to share resources such as images and standard text blocks and to auto-generate navigation based on the pages that have been added to your site.
  • Basic status functions, which let you track where a page is in the development process.
  • Link managers and site outlines, which let you verify that your cross-references are valid.
  • Deployment managers, which let you upload the site you have created locally to the Web server where it will be publicly available.

Today's Web authoring tools, which certainly include many more products than just Dreamweaver and FrontPage, are taking their first halting steps into the realm of CM. I have no doubt that as time moves on they will expand their tools to cover more and more of the territory. Still, to really enter the world of CM, these tools will have to change some of the following fundamental characteristics:

  • They are small-scale by design.  They assume a small and loosely organized team producing pages one at a time. A CMS, on the other hand, assumes a large, very well organized team producing content that will be moved onto pages in bulk.
  • They are page-oriented. In a CMS context chunks (components) are the things to be managed, not pages. While it is convenient at a small size and with a single site to manage pages, at a larger size, and especially when you have to create more than just a single site, you need to get past managing pages to managing content components.
  • Finally, these tools assume you are creating one website. They have no ability to share the same content between pages of the same site, let alone between pages of two sites. They are not even close to being able to support any publication that is not HTML.

Page Templates. Microsoft FrontPage provides a page-templating function that illustrates a number of the basic ideas behind publication templates. Using the site creation wizard in FrontPage 2000, I produced the page shown in Figure 1 in a few clicks. The page has everything on it that you might see on a published page except the actual content. In fact, it is a WYSIWYG template for creating pages for a site. The template includes the following:

  • A banner and global navigation buttons at the top whose names are changed by the application as you change the name of the site or its pages.
  • A local navigation bar on the left whose button names and order are changed by the application as you add or modify pages in this section of the site.
  • A footer at the bottom with standard text and links that you would like to have appear on each page.

While this system is far from a full publication template system (you only get one template per site), it shows many of the basics. In particular, it shows how the system can allow the creator to specify page layout, page names and standard content blocks once and then have the system automatically update and fill in the right stuff in the right place. Most importantly, if the design of the template is changed, the system automatically propagates the change to all affected pages.
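
The idea of specifying layout once and letting the system fill in the rest can be sketched roughly as follows. This is not how FrontPage works internally; the page names, the layout string and the build_site function are assumptions made up for the illustration.

```python
from string import Template

# Hypothetical site: page names mapped to their body content.
pages = {"Home": "Welcome.", "Products": "Our catalog.", "Contact": "Write to us."}


def build_site(pages, layout, footer):
    """Render every page from one shared layout, navigation bar and footer."""
    nav = " | ".join(pages)  # navigation is derived from the current page list
    return {
        name: layout.substitute(title=name, nav=nav, body=body, footer=footer)
        for name, body in pages.items()
    }


# The layout is written once; $nav, $title, $body and $footer are filled in per page.
layout = Template("[$nav]\n$title\n$body\n-- $footer")
site = build_site(pages, layout, "Copyright Example Co.")

# Adding a page and rebuilding updates the navigation on every existing page.
pages["News"] = "Latest updates."
site = build_site(pages, layout, "Copyright Example Co.")
```

Changing the layout string and rebuilding would propagate the new design to all pages in the same way, because no page carries its own copy of the layout.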

Status Functions. Macromedia Dreamweaver provides a function called Design Notes that lets you specify and review page status and notes, as shown in Figure 2.

The box lets you choose a status for the page you are working on and type whatever notes you want. To help you encourage people to update the page status, the application can display this box each time the page is edited.

This facility is the mere hint of a workflow and routing system, but it shows the basic idea of tagging pages with management information that won't be published but will be used by authors and administrators for management. 
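
A minimal sketch of that idea, assuming a hypothetical PageRecord structure and an invented set of status values (neither is taken from Dreamweaver), might look like this. The status and notes travel with the page for management purposes, but only the HTML is ever published.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical status values; a real team would define its own.
STATUSES = ("draft", "needs review", "final")


@dataclass
class PageRecord:
    path: str
    html: str              # the only field that is ever published
    status: str = "draft"  # management data for authors and administrators
    notes: str = ""        # management data, never published


def set_status(record, status, notes=""):
    """Update the management information attached to a page."""
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    record.status = status
    record.notes = notes


def status_report(records):
    """Group page paths by status so an administrator can review progress."""
    report = defaultdict(list)
    for record in records:
        report[record.status].append(record.path)
    return dict(report)


def published_output(record):
    """Return what goes to the site: the HTML only, without status or notes."""
    return record.html


home = PageRecord("index.html", "<h1>Home</h1>")
about = PageRecord("about.html", "<h1>About</h1>")
set_status(about, "needs review", "Check the address with legal.")
report = status_report([home, about])
```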

Outlining and Organization Functions. Macromedia Dreamweaver provides a site outline and link checker that help you organize the outline and cross-references on your site as shown in Figure 3. A full CMS would provide a much more thorough set of tools for managing hierarchies and cross-references, as well as for managing indexes and sequences. However, the ability to see broken links and the directories where your files are stored is a start.
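
As a rough illustration of what a link checker does, the following sketch scans a small site for internal links that point at pages that do not exist. It assumes a flat site held in memory as a dictionary of file name to HTML; the broken_links function and the sample pages are hypothetical, and external links are simply skipped rather than tested.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def broken_links(site):
    """Return (page, target) pairs for internal links whose target is missing.

    `site` maps file names to HTML text; all files are assumed to sit in one directory.
    """
    broken = []
    for page, html in site.items():
        collector = LinkCollector()
        collector.feed(html)
        for target in collector.links:
            # Skip external links, mail links and same-page anchors.
            if "://" in target or target.startswith(("#", "mailto:")):
                continue
            if target.split("#")[0] not in site:
                broken.append((page, target))
    return broken


site = {
    "index.html": '<a href="about.html">About</a> <a href="news.html">News</a>',
    "about.html": '<a href="index.html#top">Home</a> <a href="http://example.com/">Partner</a>',
}
problems = broken_links(site)
```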

Deployment Tools. Microsoft FrontPage offers the function shown in Figure 4 for deploying your site to a production server. The function lets you communicate via FTP to a server on the Internet and deploy all of your pages or only those that have changed since the last time you deployed. This is the most basic form of deployment: file transfer. I will illustrate more advanced forms of deployment in the sections that follow.
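
Deploying only what has changed can be sketched as a comparison between the current files and a record kept from the last deployment. The approach below, which compares content hashes against a saved manifest, is an assumption chosen for the illustration and is not a description of how FrontPage detects changes; the plan_deployment function is hypothetical, and the actual FTP transfer is left out.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a hash that changes whenever the file content changes."""
    return hashlib.sha256(content).hexdigest()


def plan_deployment(local_files, last_manifest):
    """Decide which files need uploading.

    `local_files` maps paths to file content (bytes).
    `last_manifest` maps paths to the hash recorded at the previous deployment.
    Returns the sorted list of paths to upload and the manifest to save for next time.
    """
    new_manifest = {path: fingerprint(data) for path, data in local_files.items()}
    to_upload = sorted(
        path for path, digest in new_manifest.items()
        if last_manifest.get(path) != digest
    )
    return to_upload, new_manifest


site_v1 = {"index.html": b"<h1>Home</h1>", "about.html": b"<h1>About</h1>"}
first_upload, manifest = plan_deployment(site_v1, {})

# Only index.html is edited before the second deployment.
site_v2 = {**site_v1, "index.html": b"<h1>Home, updated</h1>"}
second_upload, manifest = plan_deployment(site_v2, manifest)
```

The sketch does not handle pages deleted locally; a fuller version would also compare the two manifests to find files that should be removed from the server.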

The Dynamic Website

A dynamic website is a system for producing Web pages "on the fly" as users request them.

There is a data source (a relational database or possibly an XML structure) on the Web server that is queried in response to a user's click on a link. The link activates a template page. The template page has regular HTML in it as well as programming scripts, objects and other programs that interpret the request, connect to the data source, retrieve the appropriate content and do whatever processing is needed to form an HTML page. When the template has created the appropriate HTML page, the Web server sends it back to the user's browser as shown in Figure 5.
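
The request cycle just described can be sketched with Python's standard library. The articles table, the page template and the handle_request function are assumptions invented for the example; a real dynamic site would run this kind of logic inside a template page on the Web server rather than as a standalone function.

```python
import sqlite3
from html import escape
from string import Template

# The data source: an in-memory relational database standing in for the server's database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE articles (slug TEXT PRIMARY KEY, title TEXT, body TEXT)")
db.execute(
    "INSERT INTO articles VALUES (?, ?, ?)",
    ("cm-intro", "What is CM?", "Collect, manage, publish."),
)

# The template page: regular HTML with placeholders filled in per request.
PAGE = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1><p>$body</p></body></html>"
)


def handle_request(slug):
    """Interpret the request, query the data source and build an HTML page."""
    row = db.execute(
        "SELECT title, body FROM articles WHERE slug = ?", (slug,)
    ).fetchone()
    if row is None:
        return 404, "<html><body><h1>Not found</h1></body></html>"
    title, body = row
    return 200, PAGE.substitute(title=escape(title), body=escape(body))


status, page_html = handle_request("cm-intro")
```

No HTML file for the article exists anywhere; the page is assembled only when handle_request is called.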

In a purely dynamic site there are no HTML files, only the ability to build them when someone wants them. This is in contrast to a static site in which all of the pages are pre-built and stored on a Web server as HTML files.

Given this definition of dynamic sites, it is easy to see why they are often confused with CMS. For one thing, CMS can be said to do the same thing. They too have databases or XML structures, retrieve appropriate content and return built pages. On the other hand, there are compelling reasons to distinguish the two. Why? Because you can have a dynamic site that really is not doing CM. In addition, a CMS can just as easily build a static site.

For example, suppose you have a large dynamic website that uses advanced scripts to put a user interface on an organization's financial system. The system responds to user requests and dishes out just the right HTML page in response. You would be hard pressed to call this a CMS; it is really a Web-based application.

You can also have a CMS that produces a static site. Suppose that I set up a very complex CMS that has millions of components, sophisticated workflow and produces 100 distinct publications. One of those publications is a website. When I hit the right button, out flows a static website of a million HTML files. I put those files on a Web server and I am done. When I want to change the site, I hit the button again and out flows a new static website that replaces the first. In this case, I have a robust CMS producing a static website as shown in Figure 6.
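
The "hit the button" step can be pictured as a build function that writes one HTML file per component and replaces whatever the previous build produced. This is a toy sketch under stated assumptions: the components dictionary, the page markup and the build_static_site function are hypothetical, and a temporary directory stands in for the Web server's document root.

```python
import tempfile
from html import escape
from pathlib import Path


def build_static_site(components, out_dir):
    """Write one HTML file per component and return the names of the files present."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Each build replaces the previous one, so remove the old pages first.
    for old_page in out.glob("*.html"):
        old_page.unlink()
    for slug, (title, body) in components.items():
        page = f"<html><body><h1>{escape(title)}</h1><p>{escape(body)}</p></body></html>"
        (out / f"{slug}.html").write_text(page, encoding="utf-8")
    return sorted(p.name for p in out.glob("*.html"))


# Hypothetical repository content: slug -> (title, body).
components = {
    "intro": ("What is CM?", "Collect, manage, publish."),
    "tools": ("Tools", "Authoring tools and beyond."),
}

with tempfile.TemporaryDirectory() as web_root:
    first_build = build_static_site(components, web_root)
    # The repository changes, and the button is hit again.
    del components["tools"]
    second_build = build_static_site(components, web_root)
    intro_html = (Path(web_root) / "intro.html").read_text(encoding="utf-8")
```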

So dynamic websites and CMS are not the same, and you do not have to produce a dynamic website from a CMS. In fact, if you could get away with only producing static websites from your CMS you would be better off. A static site is faster and much less prone to crashing than a dynamic one. But you rarely can get away with static sites, or at least not entirely static ones.

You need a dynamic site if you do not know beforehand what will be on a page. If you have to assess user input or some other factors in order to figure out what belongs on a Web page, you need a dynamic system. For any realistic degree of personalization, for interaction with data systems such as transactions or catalogs, or for "live" updates such as changing news stories or stock quotes, you need the processing power on the Web server that a dynamic system gives you.

While dynamic sites are not CMS, they illustrate a number of the qualities of a CMS:

  • The template pages in a dynamic site are very similar in approach to the templates in a CMS. Unlike the template system in a Web authoring tool, which is isolated from you and has a limited set of features, dynamic sites use generalized Web programming languages like JavaServer Pages (JSP) and Active Server Pages (ASP). These programming languages are unlimited and can create any sort of page layout and logic that you might want. The templating system in many commercial CMS products is no more than an enhanced version of the kinds of template files used on dynamic sites.
  • The data sources on a dynamic site are similar to, and sometimes the same as, those you would use in a CMS. In either a dynamic site or a CMS you are likely to use the same database or XML products, use the same programming techniques and write the same sorts of content selection, layout and navigation building code. A good CMS, however, will provide you with enhancements that make these tasks much easier. CMS data sources also tend to have more management information stored in them than the data sources for dynamic sites.

What is most missing from a dynamic site system is, first, the ability to create more than just a website and, second, the collection system that a CMS would include. Of course, there is nothing stopping the enterprising programmer from extending the website code to do these things, but then the programmer would not be creating a dynamic website, but rather creating a CMS!

The Full Web CMS

Sites do not have to be 100% static or 100% dynamic. In fact, the vast majority of large websites are some of each. Parts of your site can be HTML files and parts can be dished up dynamically out of a database. In addition, there may be a variety of databases that provide different parts of your site. In a full Web CMS there can be:

  • A CMS application behind the firewall that takes care of collecting content from contributors and managing your content's workflow and administration.
  • A repository behind the firewall that is a relational or XML data source. The repository holds all of your content, administrative data and any of the resources you need to build the site, such as graphics and style sheets.
  • A set of flat HTML files that are managed by the CMS and deployed by the CMS to the static part of the site.
  • A live data source (the CMS-generated database) on the Web server for the dynamic parts of the site. The CMS can deploy data and content from its repository to the CMS-generated database. In this way, even dynamic content can be managed behind the firewall and kept off the server if it is not ready to be seen publicly. In addition, the template pages that access the CMS-generated database can be pages that are created by the CMS.
  • Other data sources can be connected to the website that are not connected to the CMS. For example, a transaction database might be connected to the website, but not the CMS for conducting sales on the site. The other data sources can run completely independently of the CMS or the template pages that access these sources can be pages that are created by the CMS.

As you can see, you can quickly get a system that is pretty complex. The nice thing is that you can manage this complexity with the CMS. If you consider the various chunks of code in the template pages as chunks of functionality (which they are), you can treat them as just another kind of content to be collected, managed and published onto the right pages.

In general, your best bet is to drive as much of your site as feasible toward static pages. Even if content changes once a day, you are probably better off producing it as HTML files and posting the ones that have changed once a day to your server. Flat pages are tremendously faster and more reliable than dynamic pages. There should be compelling reasons to dynamically generate pages (for example, the content is changing minute by minute).

The Enterprise CMS

Most discussion of CM centers on creating a large website. While large websites are the primary use of CMS today, the potential for a CMS to help an organization goes far beyond the Web. Clearly, a CMS can do a lot more than produce a website. It can encompass your entire content creation and organization system. It can provide a content repository where information can be reviewed and worked on independently of any page it might land on, and most importantly, it can produce websites and any other publications you might care to make from the stored content.

If you look at your content collection, management and publishing needs from the perspective of the whole organization, you will quickly see that the potential for CM goes way beyond the Web. In my experience, this insight is dawning only slowly in organizations that are hard-pressed to get their sites organized, let alone revamp the information processing in their entire enterprise.

Many organizations have accepted (at least for the present) that their print publications will be quite separate from their Web publications, not that most would even call a website a publication. It follows that:

  • They are willing to spend extra time and money duplicating effort between a print team and a Web team.
  • They are willing to put up with the lack of synchronization between the two sources of information they produce.
  • They are willing to let one publication wait for content until the other goes to press. At first it was always the website that waited. More and more it is the print publication that has to wait for the content.

I think this reluctance to think globally is often driven by the entrenchment of print-based teams in their traditional tools and methodologies. For some print team members, the CMS attitude (away from a specific publication and toward creating good content) is a godsend that lets them reach new heights. For others it is no more than having to learn a lot of new programs and unlearn a lot of old habits.

Not surprisingly, I have seen much more desire to think globally from those who have to share content among websites, or between the Web and other digital devices (WAP phones, for example). These teams have already embraced the new authoring tools, content formats, database technologies and templating models that dynamic websites have taught them. It is not such a big stretch to consider going the final steps to a CMS.

Interestingly, the divide between publications on hand-held digital devices and the Web is much wider than the divide between print publications and the Web. Hand-helds can show only a couple of sentences of text at a time and very small pictures. They require a major rethinking of how you will deliver anything you are used to seeing either in print or on the Web.

While the Web still figures most prominently, there is nothing exclusive about the Web's demand for well-managed content. If your sole concern is the Web, don't worry, it won't be for long.

Summary

CMS have a simple purpose: to allow you to get content with the least hassle, know what content you have, and get it out to a variety of places automatically. The details are less simple:

  • Only the Web has really been covered by most systems in place today.
  • Web authoring tools are great for small sites, but lack the functionality or scalability to serve as real systems.
  • Today's most sophisticated websites go a long way toward CM, but are still much too narrowly focused to do the whole job.

While the technology matures, you are best served by spending your time and energy taking what steps you can to move toward a CM approach. This means first figuring out in detail what you would like to have happen and second seeing how far you can get with today's tools.

This article is taken in part from an upcoming book by Bob Boiko entitled The Content Management Bible (ISBN 0-7645-4862-X). The book will be published in the second half of 2001 by Hungry Minds Inc.

American Society for Information Science and Technology
8555 16th Street, Suite 850, Silver Spring, Maryland 20910, USA
Tel. 301-495-0900, Fax: 301-495-0810 | E-mail: asis@asis.org

Copyright 2001, American Society for Information Science and Technology