Enlightenment: Narrative Programming by Tom Radcliffe

The NP Idea

In Chapter 1 I laid out some of the things that I think an ideal light-weight OO analysis and design methodology should have. The present chapter introduces Narrative Programming, a methodology that delivers most of these things. In the next chapter there is an example-driven tutorial of Narrative Programming in action, including use of the tools that have been developed to support it.

This chapter focuses on the ideas behind Narrative Programming (NP), and the convergence of technologies that make it possible. There are three new technologies behind NP:

Object-oriented frameworks and Design Patterns
Use-case-driven specification
The eXtensible Markup Language (XML)

How these three technologies work together to produce a software development methodology that is supported by a small, simple and powerful set of tools is the story of this chapter. But first we have to consider a new way of looking at the software application.

How to Think About Applications

There are two fundamentally different ways to look at an application: from a user's perspective, and from a developer's perspective. Usability engineering is supposed to help developers get some idea of how things look to the user, and hopefully the user will never need to know how things look to the developer.

What I'm concerned with here is how developers should think about applications. Users want a metaphor for paper: they want something that is persistent, flexible and unobtrusive. The challenge for software engineers is to produce the effect of *smart* paper: paper that knows how to spell, or connect to databases, or draw and analyze circuit diagrams, or whatever.

Doing this is the central aim of software engineering, subject to the usual engineering constraints on time, money and people.

From the developer's point of view, an application is one or more modules -- executables or libraries -- that contain the machinery to create the hopefully seamless experience of the user. How this machinery is organized is what I am concerned with here. I don't propose to put forward a complete taxonomy of application structure, but there are a few major dimensions that we can differentiate applications along. The first is the coding style used: is it object-oriented, procedural, declarative, or some mixture? For object-oriented applications, which are my sole area of real interest, there are further ways of dividing up application structure, mostly in terms of how the interactions between objects are structured.

Many OO applications are based on the Model/View/Controller pattern, and many more on the somewhat weaker Document/View approach. OO applications can also be structured as state machines of one kind or another. But many applications, even with the best of OO intentions on the part of developers, wind up with very little structure before they have time to get old. One of the problems that OOA/D is supposed to address is how to maintain application structure in the face of the many forces that try to disrupt it, from unrealistic schedules to inadequate analysis and design to late changes in the spec and poor user response to early releases.

In the face of these pressures, it is clear that if developers do not have a pretty strong image in their heads of what the application structure ought to be, they will not be able to prevent the gradual encroachment of chaos. The more clearly they can envision and agree upon the basic application structure, the better they can resist the entropic legions that besiege the development process. Conceptual integrity is one of the most important predictors of application success, and conceptual integrity is expressed through clear application structure.

One of the aims of Narrative Programming is to give developers such a structure, something that is so intuitive and concrete that they can keep it clearly in mind to guide their development choices, yet still flexible enough that realistic applications can be built using it.

The central metaphor of Narrative Programming is: the application as structured document.

This should not be confused with any part of the document/view architecture, which does not impose any constraints on document structure. Structured documents are an idea that comes from the SGML community -- SGML (Standard Generalized Markup Language) is an ISO standard for tagging text in ways that describe the content. HTML is an example of a language defined using SGML. The idea of a structured document is that only certain types of content are allowed to appear in certain places, and the document as a whole forms a tree of strongly-typed elements. What elements are allowed to appear where, and what the attributes of those elements are, is described by a Document Type Definition, or DTD.

What all this means is discussed at length below.

Documents and Trees

Figure 2.1 shows an example of a structured document: it is a memo. The memo has a few elements that have to come in strict order: there is a TO and a FROM element, a single DATE element, a single SUBJECT element, and a single BODY element. Two views of the memo are shown: as a document, and as a tree. This is the key to structured documents: the document is a tree, with the "leaf" elements containing text.

========================================================
Figure 2.1:  A Structured Document
--------------------------------------------------------
 Memo Document                      Memo Tree
  <MEMO>                            MEMO
    <TO>                             |---TO
      You                            |    |---"You"
    </TO>                            |
    <FROM>                           |---FROM
      Me                             |    |---"Me"
    </FROM>                          |
    <DATE>                           |---DATE
      Today                          |    |---"Today"
    </DATE>                          |
    <SUBJECT>                        |---SUBJECT
      You and me                     |    |---"You and me"
    </SUBJECT>                       |
    <BODY>                           |---BODY
      Hey you, how about me?              |---"Hey you, how about me?"
    </BODY>
  </MEMO>
========================================================
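
To make the document-as-tree idea concrete, here is a minimal sketch that reads the memo of Figure 2.1 into a tree and walks its leaves. It uses Python's standard xml.etree.ElementTree module purely for illustration; the helper name leaves is my own invention and is not part of any NP tool.

```python
import xml.etree.ElementTree as ET

# The memo from Figure 2.1, written compactly.
MEMO = """
<MEMO>
  <TO>You</TO>
  <FROM>Me</FROM>
  <DATE>Today</DATE>
  <SUBJECT>You and me</SUBJECT>
  <BODY>Hey you, how about me?</BODY>
</MEMO>
"""


def leaves(element):
    """Return (tag, text) pairs for the children of element, in document order."""
    return [(child.tag, (child.text or "").strip()) for child in element]


root = ET.fromstring(MEMO)
memo_leaves = leaves(root)

for tag, text in memo_leaves:
    print(f"{root.tag} -> {tag}: {text!r}")
```

The parser hands back exactly the structure drawn on the right of the figure: one root node whose children are the five leaf elements, each carrying its text.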

SGML, the original standard language for defining document structure, is too complex to be used for most applications. SGML DTDs allow users to specify document structure, but they also deal with an overwhelming array of optional features that no one but a few gurus cares about or even understands, and that were mostly put into the standard due to limitations on software and computing resources that are no longer relevant. In response to the complexity of SGML, and the need for something more flexible than HTML for documents delivered over the Web, Jon Bosak of Sun Microsystems and a group of SGML experts developed a simplified version of SGML in a matter of months. This simplified language, the eXtensible Markup Language (XML), is the basis for the Narrative Programming technology described in this book.

I describe XML in more detail in an appendix, but for now there are a few things that have to be understood about it, and about document description languages in general. XML is a language for describing document types. The way it does this is to let the user define elements and constrain where they are allowed to exist in the document. A collection of these element definitions and constraints is a Document Type Definition (DTD). Say we have an element A that is allowed to contain elements of types B and C in any order. The allowed contents are called the "content model" for element A, which is expressed in an XML DTD as:

<!ELEMENT A (B|C)*>

This says that there is an element of type A, and that it can contain either B or C zero or more times. The vertical bar between elements in the content model means "choice" and the star following the parentheses means that what is inside can occur zero or more times. Element types in the content model can also be separated by commas, which indicates that the elements must occur in exactly the sequence shown. Also, the star can be replaced with a plus sign, meaning one or more times; a question mark, meaning zero or one times; or nothing, which means exactly once. The basic DTD syntax is shown in Tables 2.1 and 2.2.

========================================================
Table 2.1 Content Model Separators
-------------------------------------------------------- 
 Separator   Meaning     Example
-------------------------------------------------------- 
    |        choice      (B | C)
    ,        sequence    (B , C)
========================================================

========================================================
Table 2.2 Content Model Modifiers
-------------------------------------------------------- 
 Modifier    Meaning     Example
-------------------------------------------------------- 
    *        0 - n       (B)*
    +        1 - n       (B)+
    ?        0 - 1       (B)?
  (none)     1           (B)
========================================================
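
A content model reads a lot like a regular expression over child element names, and one way to get a feel for the notation is to translate it into one. The sketch below does this in Python for the simple models of Tables 2.1 and 2.2. The function names are my own, the code ignores #PCDATA, EMPTY and ANY, and it is not meant to suggest how a real XML parser validates documents.

```python
import re

# An element name in a content model (simplified; real XML names allow more).
NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def content_model_to_regex(model):
    """Translate a simple content model such as "(B|C)*" into a compiled regex.

    The regex matches a string in which every child name is followed by one
    space, so that a name like "B" can never match the start of "BC".
    """
    compact = "".join(model.split())  # drop all whitespace
    # Each name becomes a group matching "NAME ".
    tokens = NAME.sub(lambda m: "(?:" + re.escape(m.group(0)) + " )", compact)
    # A comma means sequence, which in a regex is plain concatenation.
    # |, *, +, ? and parentheses mean the same thing in both notations.
    return re.compile(tokens.replace(",", ""))


def children_match(model, child_names):
    """Return True if the list of child element names satisfies the model."""
    sequence = "".join(name + " " for name in child_names)
    return content_model_to_regex(model).fullmatch(sequence) is not None


print(children_match("(B|C)*", ["B", "C", "B"]))  # choice, zero or more times
print(children_match("(B , C)", ["C", "B"]))      # sequence, wrong order
```

The first call succeeds because any mixture of B and C satisfies the choice model; the second fails because a sequence fixes the order of the children.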

Once a DTD has been created that captures the desired structure of the document, there are two problems that remain: how do we actually indicate what the structure of a document is, and what good is it once we have?

The first question is answered by the XML specification: the elements defined in the DTD are identified in the document by "tags." A tag is the name of the element bounded by the characters "<" and ">". An element may have both start and end tags, or if it has no content of its own it may have a single tag that describes an empty leaf in the document tree. For example, given that the element A can contain empty elements of type B and C, a valid document might look like:

<A>
	<B/>
	<C/>
</A>

In this case, both B and C are empty elements (in the DTD, their content model is just the word EMPTY) so instead of having a start and end tag, they just have a single tag that is terminated with the characters "/>" instead of just ">". The element A, on the other hand, has both a start tag, which begins with "<", and an end tag, which begins with "</".

Based on these conventions, which are defined by the XML specification, a DTD describes a formal language for marking up documents. A DTD defines a bunch of tokens (the tags for each element) and a bunch of rules about where they can occur (the content model for each element.) Documents that are marked up correctly in this language are said to be valid documents relative to their DTD. And the key to the power of all this is that because a DTD defines a formal language, you can create a parser that will determine when a document is valid -- that is, when the document's structure conforms to constraints described by the DTD.

A validating XML parser is an application that reads a DTD, creates a parser for documents marked up with the tags defined in the DTD, and parses XML documents to see if they conform to the DTD. Unlike SGML, XML also has a weaker notion of correctness for a document, which is "well-formedness." A well-formed document is one that is a tree -- every start tag other than an empty-element tag is matched by a corresponding end tag at the same level in the tree. Examples of well-formed and non-well-formed documents are shown in Figure 2.2.

========================================================
Figure 2.2:  Well Formed and Ill Formed Documents
--------------------------------------------------------
Well Formed             Not Well Formed
--------------------------------------------------------
<A>                     <A>
    <D>                     <D>
    </D>                    </A>
</A>                    </D>
========================================================

Well-formedness is independent of the DTD, so a parser can determine if a document is well-formed even if it does not have access to the DTD.
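
Because no DTD is needed, a well-formedness check takes only a few lines with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module, which does not validate against a DTD, on the two documents of Figure 2.2. The function name is_well_formed is my own.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as a single well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # Raised for mismatched or unclosed tags, among other syntax errors.
        return False
    return True


WELL_FORMED = "<A><D></D></A>"
NOT_WELL_FORMED = "<A><D></A></D>"  # the tags overlap instead of nesting

print(is_well_formed(WELL_FORMED))
print(is_well_formed(NOT_WELL_FORMED))
```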

One aspect of XML elements that I have mentioned only in passing above, but which deserves more attention, is attributes: the DTD may include an attribute list specification for an element. There are a variety of types that these attributes can have, but in terms of the document they are all strings. In the DTD, attributes are defined by giving an ATTLIST declaration for an element:

<!ATTLIST A
attName CDATA "Default Value">

In the document itself, the attribute is given a value with a simple syntax in the start tag:

<A attName="New Value">

Empty elements can also have attributes (they would have no function if they couldn't.) People with HTML experience will be familiar with the empty IMG element, for instance, which has attributes that give the source for the image's data.
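
The point that attributes are all strings in the document is easy to see from code. In this small sketch, again using Python's xml.etree.ElementTree for illustration, the count attribute is a made-up addition to the example element A, and no DTD is read, so no default value is applied.

```python
import xml.etree.ElementTree as ET

# "count" is a hypothetical attribute added for this illustration.
element = ET.fromstring('<A attName="New Value" count="3"/>')

att_name = element.get("attName")
count_text = element.get("count")      # still a string: "3"
count = int(count_text)                # converting it is the application's job
missing = element.get("noSuchAttr")    # absent attributes come back as None

print(att_name, repr(count_text), count, missing)
```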

What This Has To Do With Applications

The way this converges with application architecture is not simple, but the first thing to observe is that it is possible to structure an application as a tree. Most OO applications are designed at least loosely around a set of container classes, with the structure of the application being expressed both in terms of inheritance hierarchy and containment. Recently, as the power of containment has become evident, there has been a move away from the heady euphoria of the early days of OO in which inheritance was considered preferable to containment in most cases. Now, most OO purists argue for strict type-inheritance, and we have plenty of good examples to justify this approach.

Structurally, trees are pretty generic -- they can range from things that have only one level that look a lot like arrays, to "sticks" in which each child has only one child, which look a lot like stacks or queues. And internally, many tree implementations keep each node's children in a linked list.

As an example of an application structured as a tree, consider an event-driven application in which events are passed to the top level node in the tree. If the top level node can't handle the event, it passes the event to its children, and so on. A node that handles the event may do so by creating a new child node, thus extending the tree. Or it may do so by executing a method that operates on other nodes in the tree. Every object in the application is either a node in the tree or a data member in such a node. A realistic application might have several such trees; for example, one for handling events, one for representing the file the user is working on, and one for representing application configuration information.
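
Here is a minimal sketch of the event-passing tree just described. Every name in it (Node, Drawing, dispatch, the event strings) is hypothetical and chosen for illustration only; it is not the interface of the NP framework.

```python
class Node:
    """One node of a hypothetical application tree."""

    def __init__(self, name, handles=()):
        self.name = name
        self.handles = set(handles)  # names of the events this node handles
        self.children = []

    def add_child(self, child):
        self.children.append(child)
        return child

    def dispatch(self, event):
        """Return the name of the node that handled event, or None."""
        if event in self.handles:
            return self.on_event(event)
        # This node can't handle the event, so offer it to each child in turn.
        for child in self.children:
            handled_by = child.dispatch(event)
            if handled_by is not None:
                return handled_by
        return None

    def on_event(self, event):
        return self.name


class Drawing(Node):
    """A node that handles an event by extending the tree."""

    def on_event(self, event):
        if event == "add-circle":
            self.add_child(Node("Circle"))
        return self.name


app = Node("App", handles={"quit"})
drawing = app.add_child(Drawing("Drawing", handles={"add-circle"}))

print(app.dispatch("quit"))
print(app.dispatch("add-circle"))
print([child.name for child in drawing.children])
```

The top-level node handles "quit" itself, while "add-circle" falls through to the drawing, which responds by growing a new child.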

An application that is structured as a tree is therefore extremely flexible as well as highly structured -- this is important because most developers want a certain amount of "elbow room", and any good software development methodology or application structure should provide it. Realistic applications require that developers have room to move, where they are not too limited by the structure they work in. Maintainable applications require that a certain amount of structure be imposed on the development environment. Narrative Programming is aimed at this happy medium.

The grand idea of Narrative Programming is that an application structured as a tree can be described by an XML DTD.

This has a lot of interesting consequences, not least of which is the ability to do automatic generation of code from the DTD for the classes that embody the application. And as I describe below, the amount of information the code generator has available from the DTD is so large that the generated code is a lot more than just an empty skeleton -- in particular it has important pieces of application functionality, like serialization, already written.

Before I jump into all the neat things you can do with the generated code, I need to make the grand idea a bit clearer. The best way to do this is with a simple example. Suppose we have a simple drawing editor. The root element of the application tree contains global state information about the application. The elements contained by the root element are things like the drawing we are currently working on, and the drawing itself contains a scene graph in the form of a tree of elements, the leaf nodes of which describe each part of the drawing. An example drawing and the corresponding application tree are shown in Figure 2.3.

In terms of user interaction, there is at any moment a single "active element" of the application tree, and events are forwarded from the UI to this element in the usual way -- how the Narrative Programming Framework (NPF) interacts with the UI is treated in detail in Chapter ???, with emphasis on the Qt framework. Because each element knows its place in the tree (via a pointer to its parent) and knows about the root of the tree, each element is well-placed to have enough information to behave intelligently. One of the most difficult aspects of OO programming is making sure that encapsulation and modularization do not hide information from places where you really need it.
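
The parent pointer is what lets an element find its way around. As a rough sketch, with the class name Element and the element names invented for a drawing editor rather than taken from the NPF:

```python
class Element:
    """A hypothetical tree element that knows its parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def root(self):
        """Follow parent pointers up to the top of the tree."""
        node = self
        while node.parent is not None:
            node = node.parent
        return node

    def path(self):
        """Return this element's place in the tree, e.g. "Editor/Drawing"."""
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return "/".join(reversed(names))


editor = Element("Editor")
drawing = Element("Drawing", parent=editor)
circle = Element("Circle", parent=drawing)

print(circle.path())
print(circle.root().name)
```

From the circle, both the drawing it belongs to and the global state held by the root are only a few pointer hops away, without any of them being global variables.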

The full NP analysis of a drawing editor is covered in detail in a later chapter, but it should be clear from this example that at least some interesting applications can be expressed as a tree structure, and any tree structure can be expressed as an XML DTD.

There are two questions that remain to be answered. Where do DTDs come from? And how do we generate code out of a DTD?

The short answer to the first question is use case analysis. The second question really doesn't have a short answer, but the general idea is that each element that is defined in the DTD corresponds to a class, and the attributes of the element correspond to simple member variables of that class. The content model of the element is dealt with via a generic tree structure that is described in Chapter ???.
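
To give a rough feel for the element-to-class mapping, the sketch below builds a class from an element name and its attribute defaults, and gives it serialization for free. This is an assumption-laden illustration in Python, not the NP code generator: make_element_class and to_xml are names I have made up, a plain list of children stands in for the generic tree structure, and content-model checking is left out entirely.

```python
import xml.etree.ElementTree as ET


def make_element_class(tag, attributes):
    """Build a class for one DTD element.

    attributes maps each attribute name to its default string value,
    as it would be given in an ATTLIST declaration.
    """

    def __init__(self, **values):
        unknown = set(values) - set(attributes)
        if unknown:
            raise TypeError(f"unknown attributes for {tag}: {sorted(unknown)}")
        # Each attribute becomes a simple member variable.
        for name, default in attributes.items():
            setattr(self, name, values.get(name, default))
        self.children = []  # stand-in for the generic tree structure

    def to_xml(self):
        """Serialize this object and its children as an XML element."""
        node = ET.Element(tag, {name: getattr(self, name) for name in attributes})
        for child in self.children:
            node.append(child.to_xml())
        return node

    return type(tag, (object,), {"__init__": __init__, "to_xml": to_xml})


# Element A with the attribute from the ATTLIST example; B has none.
A = make_element_class("A", {"attName": "Default Value"})
B = make_element_class("B", {})

a = A(attName="New Value")
a.children.append(B())
serialized = ET.tostring(a.to_xml(), encoding="unicode")
print(serialized)
```

Even this toy version shows why the generated code can be more than a skeleton: the element name and attribute list are all that is needed to write the objects back out as a document.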

Advance Warning About State

Most applications have two aspects to their structure: data, and state. The data are the stuff the application is working on, and the state is the current condition of the application: what features are inactivated, whether anything is being dragged, and so on. It should be pretty clear that the application data can be described as a structured document. It turns out that application state can be described as a (different) structured document as well -- this is because each state of an application typically has a few allowed states that can come after it, and so the allowed successors start to look a lot like the content model of an XML element.
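
One way to picture the analogy: if a state called Idle may be followed only by Dragging or Editing, that rule looks much like a content model of (Dragging | Editing) for an element called Idle. The sketch below checks a history of states against such rules. The state names and the function are invented for illustration and say nothing about how the NP state framework is actually built.

```python
# Allowed successors for each state of a hypothetical drawing editor.
ALLOWED_SUCCESSORS = {
    "Idle": {"Dragging", "Editing"},
    "Dragging": {"Idle"},
    "Editing": {"Idle"},
}


def is_valid_history(states, start="Idle"):
    """Return True if states begins at start and every step is allowed."""
    if not states or states[0] != start:
        return False
    return all(
        following in ALLOWED_SUCCESSORS.get(current, set())
        for current, following in zip(states, states[1:])
    )


print(is_valid_history(["Idle", "Dragging", "Idle", "Editing"]))
print(is_valid_history(["Idle", "Dragging", "Editing"]))
```

The second history is rejected because Editing is not among the allowed successors of Dragging, in the same way that a validating parser rejects a child element that the content model does not permit.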

I don't want to confuse the issue at this point, but keep in mind that there is even more to this stuff than is described in the next chapter. The NP state framework is described in detail in Chapter ???, as well as in the examples. It is possible to build powerful applications without using the state framework, but for many applications -- especially servers -- having the ability to explicitly represent state is a real plus.

Summary

This chapter has introduced the core idea of Narrative Programming: that software applications can be designed as trees, and trees can be viewed as structured documents, so applications can be viewed as structured documents and therefore described by XML DTDs.

This means that Narrative Programming is capable of producing well-structured applications, and the structured document metaphor is simple, concrete and powerful. It should therefore help developers maintain the conceptual integrity of the application in the face of real world pressures. NP also fulfills many of the constraints set out in Chapter 1 on an ideal software analysis, design and development methodology: it is based on a simple, intuitive metaphor that makes it easy to learn. It provides immediate benefits in the form of serializable classes. As we will see in the following chapters, which introduce the Narrative Programming Framework, it can be used to model a small part of an application at the price of adding in a few standard base classes, so you don't have to re-engineer your entire development organization to realize the benefits of NP.

It still remains to be seen how to turn all this theory into a set of concrete tools for application analysis, design and development. This will be the focus of the next chapters, which introduce the Narrative Programming Framework (NPF).