www.BrettDaniel.com

Reflective Class Search Pattern

One of the main projects in last semester's software engineering course was writing a design pattern. My pattern, which discusses ways of using reflection to search for classes that satisfy certain criteria without initializing explicit references to instances of the classes, is called Reflective Class Search.
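As a rough illustration of the idea (not taken from the pattern text itself), the sketch below searches a class hierarchy for classes that meet a criterion without ever creating an instance. The `Exporter` hierarchy and the `find_classes` helper are hypothetical names invented for this example.

```python
import inspect
from abc import ABC, abstractmethod


class Exporter(ABC):
    """Hypothetical base type; the search target is 'anything that exports'."""

    @abstractmethod
    def export(self, data: dict) -> str: ...


class TextExporter(Exporter):
    """Still abstract: it does not implement export()."""


class CsvExporter(TextExporter):
    def export(self, data: dict) -> str:
        return ",".join(f"{k}={v}" for k, v in data.items())


class JsonExporter(Exporter):
    def export(self, data: dict) -> str:
        return str(data)


def all_subclasses(base: type) -> set:
    """Walk the subclass graph reflectively; no instance is ever created."""
    found, stack = set(), [base]
    while stack:
        for sub in stack.pop().__subclasses__():
            if sub not in found:
                found.add(sub)
                stack.append(sub)
    return found


def find_classes(base: type, criterion) -> list:
    """Return the subclasses of `base` that satisfy `criterion`, sorted by name."""
    return sorted((c for c in all_subclasses(base) if criterion(c)),
                  key=lambda c: c.__name__)


# Criterion used here: the class must be concrete (instantiable).
concrete = find_classes(Exporter, lambda c: not inspect.isabstract(c))
names = [c.__name__ for c in concrete]
```

One limitation of this particular approach is that `__subclasses__()` only reports classes whose modules have already been imported, so a real search would also need some way to load candidate modules first.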

You can find earlier drafts of students' patterns, including my own, on the course wiki. They may not stay up for long now that the semester is over.

Object-Relational Mapping Patterns

For the past few weeks, the students have taken over lectures in the software engineering course. My lecture [Note: from a few weeks ago] covered four chapters on object-relational mapping from Patterns of Enterprise Application Architecture. Here are the slides for anyone who is interested. Unfortunately, the off-campus student portal is not open to the public, so I cannot link to the video of the lecture.

When you really think about them, all O/R mapping approaches seem like dirty hacks. It makes sense to map classes to tables, objects to rows, and entities to primary keys, but this simple (naive?) correspondence breaks as soon as one adds complex object relationships and inheritance. To get around the fundamental object-relational impedance mismatch, one can abandon an object-oriented domain model (usually not a good idea), use an object database management system (not very widespread), or use one of several inheritance mapping patterns (this is where it gets interesting).

The three patterns explained in PoEAA are the following:

Single Table Inheritance
Compress the entire hierarchy into a single table.
Class Table Inheritance
Have one table for each class in the hierarchy.
Concrete Table Inheritance
Have one table for each concrete class in the hierarchy.
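To make the difference between the three schemes concrete, here is a minimal sketch of the tables each one would produce. It assumes a hypothetical hierarchy with an abstract `Player` class and two concrete subclasses, `Footballer` and `Cricketer`; the column names are made up for illustration.

```python
import sqlite3

SCHEMAS = {
    # One table for the whole hierarchy, with a discriminator column.
    "single_table": """
        CREATE TABLE player (
            id INTEGER PRIMARY KEY,
            type TEXT NOT NULL,        -- 'footballer' or 'cricketer'
            name TEXT NOT NULL,
            club TEXT,                 -- footballer only, NULL otherwise
            batting_average REAL       -- cricketer only, NULL otherwise
        );
    """,
    # One table per class; subclass rows share the superclass row's key.
    "class_table": """
        CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE footballer (
            id INTEGER PRIMARY KEY REFERENCES player(id), club TEXT);
        CREATE TABLE cricketer (
            id INTEGER PRIMARY KEY REFERENCES player(id), batting_average REAL);
    """,
    # One table per concrete class; superclass columns are repeated in each.
    "concrete_table": """
        CREATE TABLE footballer (
            id INTEGER PRIMARY KEY, name TEXT NOT NULL, club TEXT);
        CREATE TABLE cricketer (
            id INTEGER PRIMARY KEY, name TEXT NOT NULL, batting_average REAL);
    """,
}


def table_names(schema_sql: str) -> list:
    """Create the schema in a throwaway in-memory database and list its tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(schema_sql)
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    conn.close()
    return [name for (name,) in rows]
```

Calling `table_names(SCHEMAS["class_table"])` shows three tables, one per class, while the concrete-table variant drops the `player` table and the single-table variant keeps only `player`.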

Within the application, inheritance mappers corresponding to individual domain objects convert the database representations into domain objects. I will not go into the inheritance mapper implementation or the relative advantages and disadvantages of the three mapping schemes here. The slides, book, and several websites discuss the ideas in much more detail. However, I would like to point out these interesting rules of thumb that came up during the class discussion:

  • Single table inheritance works well when subclasses have few fields.
  • Class table inheritance works well when both super- and subclasses have many fields.
  • Concrete table inheritance works well when superclasses have few fields.

I personally prefer to use class table inheritance since it is the easiest to modify and one does not need to pass raw database data up and down an inheritance mapper hierarchy. It is frustrating to see "authoritative" programming manuals advocate the use of data rows, data tables, data sets, and other database-specific representations for data transfer in database-driven applications. All of these representations are basically hash tables that are completely opaque to compile-time checks. When one changes a single database operation, it can silently break huge swaths of the application that depend on the result.

This is a huge problem in ASP.NET due to the widespread practice of binding web controls to a DataTable or DataSet and the difficulty in writing unit tests for web pages. The better alternative is to constrain the DataTable/Set/Row to a small number of inheritance mappers (ideally, just one) and pass around domain objects that are easy to verify with compile-time checks and unit tests.
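The same idea can be sketched outside ASP.NET. In the hypothetical Python example below, `CustomerMapper` is the only code that touches raw rows and column names; everything else receives a typed `Customer` object. Python has no compile step, so a static type checker and unit tests stand in for the compile-time checks described above. All names here are assumptions made for the illustration.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Customer:
    """Domain object passed around the application instead of a raw row."""
    id: int
    name: str
    email: str


class CustomerMapper:
    """The only place that knows the table layout and column order."""

    def __init__(self, conn: sqlite3.Connection):
        self._conn = conn

    def find(self, customer_id: int) -> Optional[Customer]:
        row = self._conn.execute(
            "SELECT id, name, email FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return None if row is None else Customer(*row)


def greeting(customer: Customer) -> str:
    # Depends on a declared attribute, not on row["name"] or row[1].
    return f"Hello, {customer.name}"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Ada', 'ada@example.com')")
mapper = CustomerMapper(conn)
```

If the query changes, only `CustomerMapper.find` needs to be revisited, and code like `greeting` can be unit tested with a plain `Customer` and no database at all.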

I find O/R mapping patterns—and design patterns in general—interesting because when done right they can be an elegant solution to what can easily become a nasty problem.

Software Shakespeare?

In yesterday's software engineering class, Professor Johnson, three other students, and I put on a play that illustrated several of the patterns described in chapter 14 of Domain-Driven Design. Not only was the play an interesting break from normal lectures, but it also illustrated especially well how the software implementation and business organization affect each other.

Professor Johnson played the software analyst who was trying to track down some problems in a large shipping application. Each student played a group leader responsible for a module in the system. I played the group leader responsible for the work order module. In each scene, Professor Johnson would "interview" the group leaders, asking how their modules worked and how they related to the rest of the system. The following lists the patterns described in the book and how they appeared in the play's software system.

Continuous Integration
In which all programmers within a software unit (or "bounded context", to use the book's terminology) work very closely to combine code and tests frequently during development. This pattern describes the approach used by all the teams in the play to some extent, but the team whose software had an extensive unit test suite and a well-defined bounded context exhibited continuous integration most strongly.
Shared Kernel
In which two or more bounded contexts rely on a core set of components. The two teams whose bounded contexts overlapped would have benefited greatly from following this pattern. Instead, they shared objects in an ad-hoc manner, which caused problems elsewhere in the system. Here, a change to the software organization would have probably prompted a corresponding change in the business organization.
Anticorruption Layer
In which a bounded context has an isolated layer that adapts an external bounded context to the domain model. Several teams exhibited this pattern. It naturally extends the Adapter and Façade design patterns from objects to architectures.
Customer/Supplier Development
In which a "downstream" bounded context in the customer role depends on the "upstream" bounded context in the supplier role and in which the supplier takes the customer's needs into account. In the play, one of the group leaders expressed disappointment that his team's change requests took a very long time for another group to implement. In this case, it would have been beneficial to formalize the business relationship to echo the customer/supplier relationship present in the software.
Conformist
In which one bounded context is completely dependent on and has little effect on another's implementation. The team whose software depended heavily on several external APIs exhibited this pattern.
Separate Ways
In which two bounded contexts are completely independent of one another. This pattern was not exhibited explicitly in the play, probably because it was exploring a highly connected application. However, one can easily see the parallels between a business unit and software module. Certainly two business units with no interaction whatsoever cannot build software systems that depend on each other.
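Of the patterns listed above, the anticorruption layer is the easiest to show in a few lines of code. The sketch below is a made-up example, not the system from the play: `ExternalShippingSystem` stands in for another team's bounded context, and `ShippingTranslator` is the isolated layer that converts its vocabulary and units into the local `WorkOrder` model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkOrder:
    """Domain model of the (hypothetical) work order context."""
    order_id: str
    weight_kg: float


class ExternalShippingSystem:
    """Stand-in for another bounded context with its own names and units."""

    def fetch(self, ref: str) -> dict:
        return {"REF_NO": ref, "WT_GRAMS": "1500"}


class ShippingTranslator:
    """Anticorruption layer: the only code that sees the external record format."""

    def __init__(self, external: ExternalShippingSystem):
        self._external = external

    def work_order(self, order_id: str) -> WorkOrder:
        record = self._external.fetch(order_id)
        return WorkOrder(
            order_id=record["REF_NO"],
            weight_kg=int(record["WT_GRAMS"]) / 1000,  # grams (as text) to kilograms
        )


order = ShippingTranslator(ExternalShippingSystem()).work_order("A-17")
```

If the external team renames a field or changes units, the change is absorbed in `ShippingTranslator` and the rest of the work order module keeps using `WorkOrder` unchanged.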

Like the extensive examples in the book, the play illustrated very well how software organization and business organization affect each other. The problems found in the software described in the play were largely due to problems present in the business organization. One team was making changes that conflicted with another team's assumptions about the domain model. Communication problems prevented the conflict from becoming explicit.

Domain-Driven Design

Note: I started writing this entry about a week and a half ago, but did not have a chance to finish and post it until today.

Last week I started reading Domain-Driven Design for my software engineering course. The book discusses common patterns that appear when designing and implementing domain models. Like other patterns books, it does not present any mind-blowing new ideas, but instead catalogs common problems and their widely accepted solutions.

Eric Evans, DDD's author, begins the first chapter by describing a very familiar process: he meets with the client's domain experts and, based on their description of the problem, starts drawing boxes and arrows on a whiteboard. After several meetings and many revisions, a domain model starts to emerge. Evans calls this "knowledge crunching" and defines it as follows:

[Domain modelers] take a torrent of information and probe for the relevant trickle. They try one organizing idea after another, searching for the simple view that makes sense of the mass. Many models are tried and rejected or transformed. Success comes in an emerging set of abstract concepts that makes sense of all the detail. This distillation is a rigorous expression of the particular knowledge that has been found most relevant.

I have drawn all sorts of box-and-line diagrams on dozens of whiteboards, but one particular knowledge crunching session sticks in my mind. I was meeting with a domain expert early one workday. I knew very little about the area we needed to discuss, and he knew only how he wanted the user interface of the application to behave. Neither of us had a domain model in mind. After some discussion, I started drawing the UI screens on the whiteboard. The links between the boxes became user actions. With a handful of screens on the board, we were able to "find the nouns" that helped define the functional units of the domain model.

Part two of the book (chapters four through six) discusses the organization of the domain model. It seems that every programmer has his or her pet idea of what a "layered architecture" contains, and Eric Evans is no different. He has the "infrastructure layer" at the bottom, followed by the "domain layer", "application layer", and user interface. Most descriptions leave out the application layer, but it makes sense when one considers it the code used to drive the UI. That is, layers like the .NET code-behind or the Java code used to drive JSP pages.

When the application layer is too thick (that is, when it contains logic that should belong in the domain layer), it can become what Evans calls a "Smart UI". This common anti-pattern is the UI analog of the transaction scripts mentioned in SAIP. I found it interesting that Evans emphasized that the Smart UI may be a valid design strategy for simple, one-off systems that do not need to scale. Evans seems to consider a Smart UI the exclusive opposite of a domain model-driven application. However, many real-world applications lie between the two extremes. It is possible, though obviously not ideal, to have a layered architecture in which a large chunk of domain logic has leaked into the application layer and UI. In this case, much of the development effort should focus on moving domain logic "down" into the appropriate domain objects.

The remainder of the section lists patterns used to organize the domain model. To me it read like a taxonomy of the types of objects that appear in a business system. I will not re-list all the patterns here, but there were some similarities to this short object taxonomy that I came across several months ago.

Architectural Mismatch

My software engineering class finished SAIP last week. In this post, I would like to explore some of the ideas presented in the second-to-last chapter entitled "Building Systems from Off-the-Shelf Components".

The chapter leaves the word "component" very loosely defined. In my mind, a component is a self-contained unit of reuse that provides functionality for a specific task. Contrast this to a general-purpose framework such as the Java Platform or the .NET Class Library that provides a great deal of fine-grained functionality in many areas. One would not include either of these as a "component" of a larger system, but they can be used to create components such as a PDF document creator, a custom network interface, or a GUI control, to name a few examples.

The chapter provides some very idealistic (read: unrealistic) methods for determining whether a component satisfies an application's requirements. The short answer to this question is "probably not". The book says, "components that were not developed internally for your system may not meet all of your requirements." (Emphasis theirs.) This echoes some of what I wrote in my previous post regarding why product line component reuse works well. I agree with Professor Johnson that the only way to know for sure if a component will work in a system is to actually use it in the system.

The chapter also very briefly discusses some strategies for combating this "architectural mismatch". The strategies the book lists are confusing and not particularly useful, so I will generalize the problem of architectural mismatch to the following list:

How to Write Good Software with Bad Components

Code for Replacement

As Professor Johnson said in class, "If you use a component… be prepared to get rid of it." The most common way to achieve this goal is to encapsulate a component in a custom interface. The component can be swapped out, but a well-designed interface can remain constant. Design Patterns calls this the "Adapter" pattern; SAIP calls it a "Wrapper".
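A minimal sketch of this kind of wrapper follows. `VendorSpell` is a hypothetical off-the-shelf component with an awkward API (bytes in, status code out); the application only ever depends on the small `Spellchecker` interface, so the vendor class could be replaced by writing a different adapter.

```python
from typing import Iterable, Protocol


class Spellchecker(Protocol):
    """The interface the application owns and codes against."""

    def is_correct(self, word: str) -> bool: ...


class VendorSpell:
    """Hypothetical third-party component: takes bytes, returns 0 or -1."""

    def __init__(self):
        self._words = {"adapter", "pattern"}

    def lookup(self, token: bytes) -> int:
        return 0 if token.decode("utf-8").lower() in self._words else -1


class VendorSpellAdapter:
    """Adapter (or wrapper): converts input and output at the boundary."""

    def __init__(self, component: VendorSpell):
        self._component = component

    def is_correct(self, word: str) -> bool:
        return self._component.lookup(word.encode("utf-8")) == 0


def count_errors(checker: Spellchecker, words: Iterable[str]) -> int:
    """Application code: knows nothing about VendorSpell."""
    return sum(not checker.is_correct(w) for w in words)


errors = count_errors(VendorSpellAdapter(VendorSpell()), ["Adapter", "patern"])
```

Swapping the component then means writing one new class with an `is_correct` method; `count_errors` and the rest of the application stay as they are.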

Convert Input and Output

The data a component expects is almost always different from what a target architecture uses. In this case it is common to write converters as standalone modules or in conjunction with an adapter.

Use Multiple Components

If a particular component is missing functionality, it may be possible to find another component that provides it. In this case, one may want to use the Façade pattern to provide a simple interface to the set of components.

Expect the Component to Break

It is usually very difficult to determine the failure points of an OTS component. This is especially true for closed-source components or those with spotty documentation. It is also very difficult to rigorously test a component without having written it from the ground up. For these reasons, it is often a good idea to expect a component to break and provide a general mechanism with which to recover from a component failure.
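One possible shape for such a general recovery mechanism is sketched below: retry the component a fixed number of times, log each failure, and then fall back to something simpler. The helper name `call_with_fallback` and the failing `flaky_render` function are invented for this example.

```python
import logging
from typing import Callable, TypeVar

T = TypeVar("T")
log = logging.getLogger("components")


def call_with_fallback(primary: Callable[[], T],
                       fallback: Callable[[], T],
                       retries: int = 2) -> T:
    """Try the component up to `retries` times, then use the fallback."""
    for attempt in range(1, retries + 1):
        try:
            return primary()
        except Exception as exc:
            # Deliberately broad: the component's failure modes are unknown.
            log.warning("component failed (attempt %d/%d): %s",
                        attempt, retries, exc)
    return fallback()


calls = []


def flaky_render() -> str:
    """Stand-in for an off-the-shelf component that always crashes."""
    calls.append(1)
    raise RuntimeError("vendor renderer crashed")


result = call_with_fallback(flaky_render, lambda: "plain-text report")
```

Here the caller still gets a usable (if degraded) result, and the log records how often the component failed.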

Allow Omission

It may be possible to omit a component in certain circumstances. In this case, it makes sense to provide a means of removing components through some type of plug-in architecture, configuration setting, or conditional build. By removing a component completely, it can no longer be a possible point of failure.
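As a small sketch of the configuration-setting variant, the example below builds a processing pipeline from whichever optional components the configuration enables. The `registry` contents and the `enabled` key are assumptions made for the illustration; real components would be registered the same way.

```python
from typing import Callable, Dict, List

Step = Callable[[str], str]


def build_pipeline(config: dict, registry: Dict[str, Step]) -> List[Step]:
    """Return the enabled steps in configured order; unknown names are skipped."""
    return [registry[name] for name in config.get("enabled", []) if name in registry]


def run(text: str, steps: List[Step]) -> str:
    for step in steps:
        text = step(text)
    return text


# Hypothetical optional components, each reduced to a simple function here.
registry: Dict[str, Step] = {"strip": str.strip, "upper": str.upper}

full = run("  hi ", build_pipeline({"enabled": ["strip", "upper"]}, registry))
reduced = run("  hi ", build_pipeline({"enabled": ["strip"]}, registry))
```

Removing `"upper"` from the configuration removes that component from the running system entirely, without touching the code that calls `run`.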

Rewrite the Component

Finally, if all other strategies fail, one may be left with no choice but to rewrite all or part of a component. This is often a valid choice since the true cost of using a component is always much, much higher than its price tag. A component always has high learning cost, a higher chance of failure, and high integration cost. This is why companies are often willing to expend many man-hours writing a component that can be bought for a few hundred dollars.

I wrote this list off the top of my head, so please feel free to add your thoughts in the comments. I hope to get a chance to read some of the books that Professor Johnson mentioned dealing specifically with architectural mismatch. Component-based development is a common and increasingly important aspect of software development. Since bad components are also unfortunately common, I think tactics for combating architectural mismatch are a vital part of a programmer’s toolbox.

Software Product Lines

Today in my software engineering class, Professor Johnson said something like, "the TA and I noticed that some of you are slowing down on your weblog posts." I have a backlog of stuff that I would like to write about, so I will take his statement as a hint to start posting some of it ASAP.

For this post, I would like to focus on SAIP's chapter 14 entitled "Software Product Lines". I found this chapter especially interesting because I have often told people that if and when I join industry, I envision myself in a chief architect role for a large system or suite of systems. Alternately, if I decide to stay in academia, I see myself researching or consulting on the design of such systems.

I find product lines interesting because their architecture has many of the same concerns as single system architecture, but with larger components and more emphasis on modularity, modifiability, and reuse. For example, if one designs a component for one product, it is usually beneficial to generalize the component for use across the entire product line. This requires very careful, well-planned development which would likely yield a better product overall.

Reuse is a valid design concern for any system, but it works especially well in a product line for several reasons. First, almost everything can be reused. The book mentions reusing requirements, software elements, analysis, testing, people, and many other architectural elements. Second, all of these architectural elements fall under the assumptions and constraints of the entire set of systems. The book echoes this idea nicely when it says, “Software product lines make re-use work by establishing a very strict context for it.” Contrast this to simple code reuse in which one uses a library or framework built elsewhere. The library may have made assumptions about its use that contradict the needs of the user or it may be too general or specific for the desired task.

I also find it interesting that a company’s success can be driven by the quality of the software used to drive its products. In this case, it benefits a company greatly to have a common software framework. When many of a company’s products rely on the same software, it changes the emphasis from quick, get-it-out-the-door coding to careful maintenance and incremental expansion of the common software product line.

Familiar Concepts

I like reading about software because often a writer will attach a descriptive name to a familiar concept or describe it in a new and interesting way. This week’s software engineering lectures and readings in Software Architecture in Practice brought up several such names and descriptions that stuck with me.

Architecture as Early Decisions

The early chapters in the book echoed the class discussion about the definition(s) of software architecture. Its main definition says the following:

The software architecture of a program or computing system is the structure of structures of the system, which comprise software elements, the externally visible properties of those elements, and the relationships between them.

This is as good a definition as any, but it did not ring as true in my mind as a passage that appeared several pages later in the book:

Software architecture represents a system’s earliest set of design decisions. These early decisions are the most difficult to get correct and the hardest to change later in the development process, and they have the most far-reaching effects.

The second definition emphasizes the thought process behind an architecture; the first describes any system whether or not any thought was put into building it. Granted, a bad architecture is still an architecture, but I prefer to focus on the good ones—those that made the right decisions early on.

Professor Johnson paraphrased this idea in class today when he said, “Architecture is the stuff you wished you did right in the beginning.”

Transaction Scripts

I am sure all programmers have encountered a particular style of programming that seems to willfully ignore object-oriented programming practices. Rather than encapsulate behavior in classes and split an application into well-defined modules or layers, a programmer will instead make many monolithic functions that probably contain SQL directly in the code. This style may work well on small applications—I am certainly guilty of using it—but it breaks down as common functionality is duplicated and bugs appear.

Tuesday’s lecture described these types of programs as “transaction scripts”. This term is much more descriptive than “procedural-style programs” and easier to say than “non-object-oriented programs”.
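To show the contrast, here is a minimal sketch of the same business rule written both ways. The `orders` table, the ten percent discount, and all names are hypothetical; amounts are stored in integer cents to keep the arithmetic exact.

```python
import sqlite3
from dataclasses import dataclass


def discount_script(conn: sqlite3.Connection, order_id: int) -> int:
    """Transaction-script style: query, business rule, and update in one procedure."""
    (total,) = conn.execute(
        "SELECT total_cents FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    if total > 10_000:  # the rule is buried between two SQL statements
        total -= total // 10
    conn.execute("UPDATE orders SET total_cents = ? WHERE id = ?", (total, order_id))
    return total


@dataclass
class Order:
    """Domain-model style: the rule lives on the object and needs no database."""
    id: int
    total_cents: int

    def apply_discount(self) -> None:
        if self.total_cents > 10_000:
            self.total_cents -= self.total_cents // 10


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total_cents INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 20_000), (2, 5_000)])

order = Order(id=1, total_cents=20_000)
order.apply_discount()
```

Both versions compute the same result, but the script has to be copied (along with its SQL) wherever the rule is needed, which is where the duplication mentioned above tends to come from.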

Software Business Cycle

A common theme in the early lectures and the first chapter of the book involved the interaction between the architecture of a system and the organization that develops the system. For example, an organization that implements a client-server architecture will probably have two teams: a client development team and a server development team. That is, the module divisions in an architecture will usually define the divisions in an organization.

Obviously, the influence can travel in the opposite direction as well. The people and expertise in an organization will affect the final architecture. For example, a group of PHP/MySQL experts will probably not design an architecture around .NET and Microsoft SQL Server (though a good architecture may not even specify the implementation language or database).

The book calls this interaction the “Software Business Cycle”. It is a good thing to keep in mind for when I eventually enter industry, and something that I have been insulated from in my previous programming jobs.

Unrelated

I found it interesting that with a quick search on Getty Images’ website I was able to find the original picture that the book publishers chose for the cover of the book. It seems like an odd, random choice. Why is a person standing alone in the entryway at night? Why did the publishers crop out the fountain, leaving what appears to be a strange discoloration in the sky?

Building Architecture vs. Software Architecture

Yesterday's software engineering class asked the question, "What is a software architect?" The class offered all sorts of answers, most involving some type of technical leadership. To me, the architect sets up the framework in which the other developers work. He or she makes the module- and application-spanning decisions that guide (and in some cases limit) the decisions of other developers.

The architect's main goal when making these decisions is to ensure the overall quality of the final product. However, "quality" can be defined using any number of criteria: maintainability, security, speed, flexibility, reliability, etc. Professor Johnson refers to these as the "-ity" words. All involve tradeoffs and may indeed be mutually exclusive. The architect must decide which of the particular "-ities" to focus on and control the tradeoffs based on the client, business, or product needs.

We also discussed some of the many differences between architects of buildings and architects of software. The point that stuck in my mind was this: many years ago, a building architect used to be the designer, engineer, and on-site technical lead of a project. Christopher Wren (who rebuilt the churches of London after the fire of 1666) and Washington Roebling (who directed the construction of the Brooklyn Bridge from his sickbed via his wife) both seem to meet these criteria. This description sounds very similar to how we describe a software architect today.

As time went by, the roles seemed to diverge. An "architect" became responsible for the aesthetic design of a building, the "engineer" became responsible for making the design work, and the on-site technical lead became any number of contractors and subcontractors. Of course there will always be a great deal of overlap and interaction between these roles, but the differences certainly exist.

Software architecture is still such a young field that I cannot help speculating that it will one day undergo a similar divergence. Professional software architects love trying to make programming “more like engineering”, but I think many changes need to occur before that can happen. To illustrate, a building architectural firm can easily send a design to an architectural engineering firm to prepare technical blueprints, and the engineering firm can easily send the blueprints to any number of contractors. (I realize this is a gross simplification; bear with me.) In software architecture, however, it is very difficult to create a transferable, adaptable “blueprint” of a software system for others to implement. Formal specifications, UML, design/architectural patterns, and other tools try to meet this need, but they still have a long way to go.

Despite this speculation, I believe that a software architect will always remain closer to code (or some other problem-solving abstraction) than a building architect is to a hammer. Certainly part of this belief is personal in that I really like code. However, the larger part is based on the observation that software architecture is fundamentally different from any other constructive profession:

  • The final product can be duplicated infinitely. This means that there is no need to “mail the blueprint to all the contractors”. One can just email the executable.
  • Code is the current “top” of many layers of abstractions. Software architecture can grow (and is growing) to accommodate new, more powerful layers of abstractions such as architectural patterns or large, interconnected modules.
  • The tools are constantly changing. Software architecture will use different tools, but these tools will still require a technical lead for the same reasons projects currently need a technical lead when writing code.

For these reasons, I believe that software architecture will evolve not toward greater specialization, as happened in building architecture, but toward greater generality to encompass new abstractions and tools.

Software Engineering Course

I am taking CS527, UIUC's graduate-level software engineering course, this semester. It is taught by Ralph Johnson, who co-wrote the classic Design Patterns: Elements of Reusable Object-Oriented Software.

Ralph Johnson on the cover of Design Patterns

One interesting feature of the course is that we must keep a journal of the presentations we attend and the books we read for the course. Professor Johnson recommends keeping the journal as a weblog. He mentioned eventually setting up an aggregator, but right now he is just keeping a list of the students' weblogs. I will be keeping my journal under this weblog's CS527 category (RSS, Atom). You can probably expect more software architecture-oriented posts in the near future.

Weblogs, a wiki, an excellent reading list, and a well-known expert for a teacher... I expect good things from this class.