This post is a rough “transcript” (with some changes and creative freedom) of a session I gave in the Citi Innovation Lab, TLV about how to effectively model a system.
A Communication Breakdown?
Building complex software systems is not an easy task, for a lot of reasons. All kinds of solutions have been invented to tackle the different issues. We have higher level programming languages, DB tools, agile project management methodologies and quite a bit more. One could argue that these problems still exist, and no complete solution has been found so far. That may be true, but in this post, I’d like to discuss a different problem in this context: communicating our designs.
One problem that seems to be overlooked or not addressed well enough, is the issue of communicating our designs and system architecture. By ourselves, experienced engineers are (usually) quite capable of coming up with often elegant solutions to complex problems. But the realities and dynamics of a software development organization, especially a geographically distributed one, often require us to communicate and reason about systems developed by others.
We – software engineers – tend to focus on solving the technical issues or designing the systems we’re building. This often leads to forgetting that software development, especially in the enterprise, is often, if not always, a team effort. Communicating our designs is therefore critical to our success, but is often viewed as a negligible activity at best, if not a complete waste of time.
The agile development movement, in all its variants, has done some good to bring the issues of cooperation and communication into the limelight. Still, I often find that communication of technical details – structure and behavior of systems, is poorly done.
Why is that?
A common interpretation of agile development methods I often encounter tends to spill the baby with the water. I hear about people/teams refusing to do “big up-front design”. That in itself is actually a good thing in my opinion. The problem starts when this translates to no design at all, and this immediately translates into not wanting to spend time on documenting your architecture properly, or how it’s communicated.
But as anyone who’s been in this industry for more than a day knows – there’s no replacement for thinking about your design and your system, and agile doesn’t mean we shouldn’t design our system. So I claim that the problem isn’t really with designing per-se, but rather in the motivation and methodology we use for “doing” our architecture – how we go about designing the system and conveying our thoughts. Most of us acknowledge the importance of thinking about a system, but we do not invest the time in preserving that knowledge and discussion. Communicating a design or system architecture, especially in written form, is often viewed as superfluous, given the working code and its accompanying tests. From my experience this is often the case because the actual communication and documentation of a design are done ineffectively.
This was also strengthened after hearing Simon Brown talk about a similar subject, one which resonated with me. An architecture document/artifact should have “just enough” up front design to understand the system and create a shared vision. An architecture document should augment the code, not repeat it; it should describe what the code doesn’t already describe. In other words – don’t document the code, but rather look for the added value. A good architecture/design document adds value to the project team by articulating the vision on which all team members need to align on. Of course, this is less apparent in small teams than in large ones, especially teams that need to cooperate on a larger project.
As a side note I would like to suggest that besides creating a shared understanding and vision, an architecture document also helps in preserving the knowledge and ramping-up people onto the team. I believe that anyone who has tried learning a new system just by looking at its code will empathize with this.
Since I believe the motivation to actually design the system and solve the problem is definitely there, I’m left with the feeling that people often view the task of documenting it and communicating it as unnecessary “bureaucracy”.
We therefore need a way to communicate and document our system’s architecture effectively. A way that will allow us to transfer knowledge, over time and space (geographies), but still do it efficiently – both for the writer and readers.
It needs to be a way that captures the essence of the system, without drowning the reader in details, or burden the writer with work that will prove to be a waste of time. Looking at it from a system analysis point of view, then reading the document is quite possibly the more prominent use case, compared to writing it; i.e. the document is going to be read a lot more than written/modified.
When we come to the question of modeling a system, with the purpose of the end result being readable by humans, we need to balance the amount of formalism we apply to the model. A rigorous modeling technique will probably result in a more accurate model, but not necessarily an easily understandable one. Rigorous documents tend to be complete and accurate, but exhausting to read and follow; thereby beating the purpose we’re trying to achieve. At the other end of the scale are free text documents, often in English and sometimes with some scribbled diagrams, which explain the structure or behavior of system, often inconsistently. These are hard to follow for different reasons: inaccurate language, inconsistent terminology and/or ad-hoc (=unfamiliar) modeling technique used.
Providing an easy to follow system description, and doing so efficiently, requires us to balance these two ends. We need to have a “just enough” formalism that provides a common language. It needs to be intuitive to write and read, with enough freedom to provide any details needed to get a complete picture, but without burdening the writers and readers with unnecessary details.
In this post, I try to give an overview and pointers to a method I found useful in the past (not my invention), and that I believe answers the criteria mentioned above. It is definitely not the only way and may not suit everyone’s taste (e.g. Simon Brown suggests something similar but slightly different); but regardless of the method used, creating a shared vision, and putting it to writing is something useful, when done effectively.
System != Software
Before going into the technicalities of describing a system effectively, I believe we need to make the distinction between a system and its software.
For the purposes of our discussion, we’ll define software as a computer-understandable description of a dynamic system; i.e. one way to code the structure and behavior of a system in a way that’s understandable by computers.
A (dynamic) system on the other hand is what emerges from the execution of software.
To understand the distinction, an analogy might help: consider the task of understanding the issue of global warming (the system) vs. understanding the structure of a book about global warming (the software).
- Understanding the book structure does not imply understanding global warming. Similarly, understanding the software structure doesn’t imply understanding the system.
- The book can be written in different languages, but it’s still describing global warming. Similarly, software can be implemented using different languages and tools/technologies, but it doesn’t (shouldn’t) change the emergent behavior of the system.
- Reading the content of the book implies understanding global warming. Similarly, the system is what emerges from execution of the software.
One point we need to keep in mind, and where this analogy breaks, is that understanding a book’s structure is considerably easier than understanding the software written for a given system.
So usually, when confronted with the need to document our system, we tend to focus on documenting the software, not the system. This leads to ineffective documentation/modeling (we’re documenting the wrong thing), eventually leading to frustration and missing knowledge.
This is further compounded by the fact that existing tools and frameworks for documentation of software (e.g. UML) tend to be complex and detailed, and with the tools emphasizing code generation, and not human communication; this is especially true for UML.
Modeling a System
When we model an existing system, or design a new one, we find several methods and tools that help us. A lot of these methods define all sorts of views of the system – describing different facets of its implementation. Most practitioners have surely met one or more different “types” of system views: logical, conceptual, deployment, implementation, high level, behavior, etc. These all provide some kind of information as to how the system is built, but there’s not a lot of clarity on the differences or roles of each such view. These are essentially different abstractions or facets of the given system being modeled. While any such abstraction can be justified in itself, it is the combination of these that produces an often unreadable end result.
So, as with any other type of technical document you write, the first rule of thumb is:
Rule of thumb #1: Tailor the content to the reader(s), and be explicit about it.
In other words – set expectations. Set the expectation early on – what you’re describing and what is the expected knowledge (and usually technical competency) of the reader.
Generally, in my experience, 3 main facets are the most important ones: the structure of the system – how it’s built, the behavior of the system – how the different component interact on given inputs/events, and the domain model used in the system. Each of these facets can be described in more or less detail, at different abstraction levels, and using different techniques, depending on the case. But these are usually the most important facets for a reader to understand the system and approach the code design itself, or reading the code.
Technical Architecture Modeling
One method I often find useful is that of Technical Architecture Modeling (TAM), itself a derivative of Fundamental Modeling Concepts (FMC). It is a formal method, but one which focuses on human comprehension. As such, it borrows from UML and FMC, to provide a level of formalism which seems to strike a good balance between readability and modeling efficiency. TAM uses a few diagram types, where the most useful are the component/block diagram used to depict a system’s structure or composition; the activity and sequence diagrams used to model a system/component’s behavior and the class diagram used to model a domain (value) model. In addition, other diagram types are also included, e.g. state charts and deployment
diagrams; but these are less useful in my experience. In addition, TAM also has some tool support in the form of Visio stencils that make it easier to integrate this into other documentation methods.
I briefly discuss how the most important facets of a system can be modeled with TAM, but the reader is encouraged to follow the links given above (or ask me) for further information and details.
Block Diagram: System Structure
A system’s structure, or composition, is described using a simple block diagram. At its simplest form, this diagram describes the different components that make up the system.
For example, describing a simple travel agency system, with a reservation and information system can look something like this (example taken from the FMC introduction):
This in itself already tells us some of the story: there’s a travel agency system, accessed by customers and other interested parties, with two subsystems: a reservation system and an information help desk system. The information is read and written to two separate data stores holding the customer data and reservations in one store, and the travel information (e.g. flight and hotel information) in the other. This data is fed into the system by external travel-related organizations (e.g. airlines, hotel chains), and reservations are forwarded to the same external systems.
This description is usually enough to provide at least a contextual high level information of the system. But the diagram above already tells us a bit more. It provides us some information about the access points to the data; about the different kinds of data flowing in the system, and what component is interacting with what other component (who knows who). Note that there is little to no technical information at this point.
The modeling language itself is pretty straightforward and simple as well: we have two main “entities”: actors and data stores.
Actors, designated by square rectangles, are any components that do something in the system (also humans). They are they active components of the system. Actors communicate with other actors through channels (lines with small circles on them), and the read/write from/to data stores (simple lines with arrow heads). Examples include services, functions and human operators of the system.
Data store, designated by round rectangles (/circles), are passive components. These are “places” where data is stored. Examples include database systems, files, and even memory arrays (or generally any data structure).
Armed with these definitions, we can already identify some useful patterns, and how to model them:
Read only access – actor A can only read from data store S:
Write only access – actor A can only write to data store S:
Two actors communicating on a request/response channel have their own unique symbol:
In this case, actor ‘B’ requests something from actor ‘A’ (the arrow on the ‘R’ symbol points to ‘A’), and ‘A’ answers back with data. So data flow actually happens in both ways. A classical example of this is a client browser asking for a web page from a web server.
A simple communication over a shared storage:
actors ‘A’ and ‘B’ both read and write from/to data store ‘S’. Effectively communicating over it.
There’s a bit more to this formalism, which you can explore in FMC/TAM website. But not really much more than what’s shown here. These simple primitives already provide a powerful expression mechanism to convey most of the ideas we need to communicate over our system on a daily basis.
Usually, when providing such a diagram, it’s good practice to accompany it with some text that provides some explanation on the different components and their roles. This shouldn’t be more than 1-2 paragraphs, but actually depends on the level of detail and system size.
This would generally help with two things: identifying redundant components, and describing the responsibility of each component clearly. Think of this text explanation as a way to validate your modeling, as displayed in the diagram.
Rule of thumb #2: If your explanation doesn’t include all the actors/stores depicted in the
diagram – you probably have redundant components.
The dynamic behavior of a system is of course no less important than its structure. The cooperation, interaction and data flow between components allow us to identify failure points, bottlenecks, decoupling problems etc. In this case, TAM adopts largely the UML practice of using sequence diagrams or activity diagrams, whose description is beyond the scope of this post.
One thing to keep in mind though, is that when modeling behavior in this case, you’re usually not modeling interaction between classes, but rather between components. So the formalism of “messages” sent between objects need not couple itself to code structure and class/method names. Remember: you generally don’t model the software (code), but rather system components. So you don’t need to model the exact method calls and object instances, as is generally the case with UML models.
One good way to validate the model at this point is to verify that the components mentioned in the activity diagram are mentioned in the system’s structure (in the block diagram); and that components that interact in the behavioral model actually have this interaction expressed in the structural model. A missing interaction (e.g. channel) in the structural view may mean that these two components have an interface that wasn’t expressed in the structural model, i.e. the structure diagram should be fixed; or it could mean that these two components shouldn’t interact, i.e. the behavioral model needs to be fixed.
This is the exact thought process that this modeling helps to achieve – modeling two different facets of the system and validating one with the other in iterations allows us to reason and validate our understanding of the system. The explicit diagrams are simply the visual method that helps us to visualize and capture those ideas efficiently. Of course, keep in mind that you validate the model at the appropriate level of abstraction – don’t validate a high level system structure with a sequence diagram describing implementation classes.
Rule of thumb #3: Every interaction modeled in the behavioral model (activity/sequence
diagrams) should be reflected in the structural model (block diagram), and vice versa.
Another often useful aspect of modeling a system is modeling the data processed by the system. It helps to reason about the algorithms, expected load and eventually the structure of the code. This is often the part that’s not covered by well known patterns and needs to be carefully tuned per application. It also helps in creating a shared vocabulary and terminology when discussing different aspects of the developed software.
A useful method in the case of domain modeling is UML class diagrams, which TAM also adopts. In this case as well, I often find a more scaled-down version the most useful, usually focused on the main entities, and their relationships (including cardinality). The useful notation of class diagrams can be leveraged to express these relationships quite succinctly.
Explicit modeling of the code itself is rarely useful in my opinion – the code will probably be refactored way faster than a model will be updated, and a reader who is able to read a detailed class diagram can also read the code it describes. One exception to this rule might be when your application deals with code constructs, in which case the code constructs themselves (e.g. interfaces) serve as the API to your system, and clients will need to write code that integrates with it, as a primary usage pattern of the system. An example for this is an extensible library of any sort (eclipse plugins are one prominent example, but there are more).
Another useful modeling facet in this context is to model the main concepts handled in the system. This is especially useful in very technical systems (oriented at developers), that introduce several new concepts, e.g. frameworks. In this case, a conceptual model can prove to be useful for establishing a shared understanding and terminology for anyone discussing the system.
Of course, at the end of the day, we need to remember that modeling a system in fact reflects a thought process we have when designing the system. The end product, in the form a document (or set of documents) represents our understanding of the system – its structure and behavior. But this is never a one-way process. It is almost always an iterative process that reflects our evolving understanding of the system.
So modeling a specific facet of the system should not be seen as a one-off activity. We often follow a dynamic where we model the structure of the system, but then try to model its behavior, only to realize the structure isn’t sufficient or leads to a suboptimal flow. This back and forth is actually a good thing – it helps us to solidify our understanding and converge on a widely understood and accepted picture of how the system should look, and how it should be constructed.
Refinements also happen on the axis of abstractions. Moving from a high level to a lower level of abstraction, we can provide more details on the system. We can refine as much as we find useful, up to the level of modeling the code (which, as stated above, is rarely useful in my opinion). Also when working on the details of a given view, it’s common to find improvement points and issues in the higher level description. So iterations can happen here as well.
As an example, consider the imaginary travel agency example quoted above. One possible refinement of the structural view could be something like this (also taken from the site above):
In this case, more detail is provided on the implementation of the information help subsystem and the ‘Travel Information’ data store. Although providing some more (useful) technical details, this is still a block diagram, describing the structure of the system. This level of detail refines the high level view shown earlier, and already provides more information and insight into how the system is built. For example, how the data stores are implemented and accessed, the way data is adapted and propagated in the system. The acute reader will note that the ‘Reservation System’ subsystem now interacts with the ‘HTTP Server’ component in the ‘Information help desk’ subsystem. This makes sense from a logical point of view – the reservation system accesses the travel information through the same channels used to provide information to other actors, but this information was missing from the first diagram (no channel between the two components).
One important rule of thumb is that as you go down the levels of abstraction, keep the names of actors presented in the higher level of abstraction. This allows readers to correlate the views more easily, identify the different actors, and reason about their place in the system. It provides a context for the more fine granular details. As the example above shows, the more detailed diagram still includes the actor and store names from the higher level diagram (‘Travel Information’, ‘Information help desk’, ‘Travel Agency’).
Rule of thumb #4: Be consistent about names when moving between different levels of abstraction. Enable correlations between the different views.
Communicating w/ Humans – Visualization is Key
With all this modeling activity going on, we have to keep in mind that our main goal, besides good design, is communicating this design to other humans, not machines. This is why, reluctant as we are to admit it (engineers…) – aesthetics matter.
In the context of enterprise systems, communicating the design effectively is as important to the quality of the resulting software as designing it properly. In some cases, it might be even more important – just consider the amount of time you sometime spend on integration of system vs. how much time you spend writing the software itself. So a good looking diagram is important, and we should be mindful about how we present it to the intended audience.
Following are some tips and pointers on what to look for when considering this aspect of communicating our designs. This is by no means an exhaustive list, but more based on experience (and some common sense). More pointers can be found in the links above, specifically in the visualization guide.
First, keep in mind node and visual arrangement of nodes and edges in your diagram immediately lends itself to how clear the diagram is to readers. Try to minimize intersection of edges, and align edges on horizontal and vertical axes.
Compare these two examples:
The arrangement on the left is definitely clearer than the one on the right. Note that generally speaking, the size of a node does not imply any specific meaning; it is just a visual convenience.
Similarly, this example:
shows how the re-arrangement of nodes allows for less intersection, without losing any meaning.
Colors can also be very useful in this case. One can use colors to help distinguish between different levels of containment:
In this case, the usage of colors helps to distinguish an otherwise confusing structure. Keep in mind that readers might want to print the document you create on a black and white printer (and color blind) – so use high contrast colors where possible.
Label styles are generally not very useful to convey meaning. Try to stick to a very specific font and be consistent with it. An exception might be a label that pertains to a different aspect, e.g. configuration files or code locations, which might be more easily distinguished when using a different font style.
Visuals have Semantics
One useful way to leverage colors and layout of a diagram is to stress specific semantics you might want to convey in your diagram. One might leverage colors to distinguish a set of components from other components, e.g. highlighting team responsibilities, or highlight specific implementation details. Note that when you use this kind of technique that it is not standard, so remember to include an explanation – a legend – of what the different colors mean. Also, too many colors might cause more clutter, eventually beating the purpose of clarity.
Another useful technique is to use layout of the nodes in the graph for conveying an understanding. For example, depicting the main data flow might be hinted in the block diagram by layouting the nodes from left to right, or top to down. This is not required, nor carries any specific meaning. But it is often useful to use, and provides hints as to how the system actually works.
As we’ve seen, “doing” architecture, while often perceived as a cumbersome and unnecessary activity isn’t hard to do when done effectively. We need to keep in mind the focus of this activity: communicating our designs and reasoning about them over longer periods of time.
Easing the collaboration around design is not just an issue of knowledge sharing (though that’s important as well), but it is a necessity when trying to build software across global teams, over long periods of time. How effectively we communicate our designs directly impacts how we collaborate, the quality of produced software, how we evolve it over time, and eventually the bottom line of deliveries.
I hope this (rather long) post has served to shed some light on the subject, and provide some insight, useful tips and encouraged people to invest some efforts into learning further.
Credit: The example and images presented in this post are taken from the FMC website and examples.