
Discussing Your Design with Scenaria

Motivation
As a software architect, I spend quite a bit of my time in design discussions. That’s an integral part of the job, for a good reason: as I see it, the design conversation is fundamental to the architect’s role in the organization.

Design discussions are hard, for various reasons. Sometimes the subject matter is complicated. Sometimes there’s a lot of uncertainty. Sometimes tradeoffs are hard to negotiate. These are all just examples, and it is all part of the job. More often than not, it’s the interesting part.

But another reason these discussions tend to be hard is misunderstandings, vagueness and lack of precision in how we express ourselves. Expressing your thoughts in a way that translates well into other people’s minds is not easy. This gets worse as the number of people involved increases, especially when the discussion is held in a language that most, if not all, participants do not speak natively.

From what I have observed, this is true both for face-to-face meetings (often conducted remotely these days) and for written communication. I try to be as precise as I can, but jumping from one discussion to another, under time pressure, I also often commit the sin of “winging it” when making an argument in some Slack thread or some design document comment.

I’ve argued in the past that diagrams do a much better job of explaining designs. I think this is true, and I often try to make extensive use of diagrams. But good diagrams also take time to create. Tools that use the “diagram as code” approach, e.g. PlantUML (but there are a bunch of others, see kroki), are in my experience a good way to create and share ideas. If you know the syntax, you can be fairly fast in “drawing” your design idea.

Still, I haven’t found a tool that allows me to conveniently express what I need to express in a design discussion. Creating a simple diagram is not the whole story. I often want to share an idea of the structure of the system – the cooperating components – but also of its behavior. It’s important to not just show the structure of the system and the interfaces between components, but also to highlight specific flows in different scenarios.

There are of course diagram types for that as well, e.g. sequence or activity diagrams. And there is a plethora of tools for creating those too. But the “designer experience” is lacking. It’s hard to move from one type of view to another while maintaining consistency. This is why whiteboard discussions are easier in that sense – we sit together, draw something on the board, and then point at it, waving our hands over the picture that everyone is looking at. Even if something is not precise in itself, we can compensate by pointing at specific points, emphasizing one point or another.

Emulating this interaction is not easy in this day and age of remote work. When a lot of the discussions are done remotely, and often asynchronously (for good reasons), there’s a greater need to be precise. And this is not easy to do at the “speed of thought”.

Building software tools is sort of a hobby for me, so I set out to try and address this.

Goals

What I’m missing is a tool that will allow me to:

  1. Quickly express my thoughts on the structure and behavior of a (sub)system – the involved components and interactions.
  2. Share this picture and relevant behavior easily with other people, allowing them to reason about it. Allowing us to conveniently discuss the ideas presented, and easily make corrections or suggest alternatives.

So essentially I’m looking to create a tool that allows me to describe a system easily (structure + behavior): a tool that efficiently creates a relevant diagram and allows me to visualize the behavior on this diagram.

Constraints and Boundary Conditions

Setting out to implement this kind of tool, as a proof of concept, I outlined for myself several constraints or boundary conditions I would like to maintain, both from a “product” point of view as well as from an engineering implementation point of view.

  1. The description should be text based, so we can easily share system descriptions and version them using existing versioning tools, namely git.
  2. The tool should be easy to ramp up to.
    1. Just load and start writing
    2. Easy syntax, hopefully intuitive.
  3. Designs should be easily shareable – a simple link that can be sent, and embedded in other places.
  4. There should not be any special requirements for software to use the tool.
    1. A simple modern browser should be enough.

Scenaria

Enter Scenaria (git repo). 

Scenaria is a language – a simple DSL, with an accompanying web tool. The tool includes a simple online editor, and a visualization area. You enter the description of the system in the editor, hit “Apply”, and the system is displayed in the visualization pane.

Scenaria Screenshot

The diagram itself is heavily inspired by technical architecture modeling. The textual DSL is inspired by PlantUML. You can play with the tool here, and see a more detailed explanation of the model and syntax here.

The discussion doesn’t stop with a purely static diagram. The tool also allows you to describe and visualize interactions between the different components. You can describe several flows, which you can then “play” on the drawn diagram. You can step through a scenario or simply play it from start to finish.

Once this is done, the application gives you a shareable link that you can send to colleagues (or keep).

As a diagramming tool, it’s pretty lacking. But remember that the purpose here is not to necessarily create beautiful diagrams (though that’s always a plus). It’s mainly about enabling a conversation, efficiently. So there’s a balance here between being expressive in the language, while not going down the route of adding a ton of visualization features which will distract from the main purpose of describing a system or a feature.

Scenaria is intended more as a communication tool, to be used easily in the discussions we have with our colleagues. It can serve as a basis for further analysis, as it provides a way to structure the description of a system – its structure and behavior. But the focus isn’t on a rigorous formal description from which working code can be derived. It’s not intended for code generation. It’s about having something to point at when discussing design, something that is easy to create and share, based on some system model.

An Example

An example scenario can be viewed here. This example shows the main components of the Scenaria app, with a simple flow showing the interaction between them when the code is parsed and shown on screen.

Looking at the code of the description, we start by enumerating the different actors cooperating in the process:

user 'Designer' as u;
agent 'App Page' as p;
agent 'Main App' as app;
agent 'Editor' as e;
agent 'Parser' as prsr;
agent 'Diagram Drawing' as dd;
agent 'ELK Lib' as elk;
agent 'Diagram Painting' as dp;
agent 'Diagram Controller' as dc;

Each component is described as an agent here, with the user (a “Designer”) as a separate actor.

We then define an annotation highlighting external libraries:

@External {
  color : 'lightgreen';
};

And annotate two agents to mark them as external libraries:

elk is @External;
e is @External;

Note that up to this point we haven’t defined any interactions or channels between the components.
Now we can turn to describe a flow – specifically what happens when the user writes some Scenaria code and hits the “Apply” button:

'Model Drawing' {
    u -('enter code')-> e
    u -('apply')->p
    p -('reset')-> app

    p -('get code')-> e
    p --('code')--< e

    p-('parseAndPresent')-> app
        app -('parse')-> prsr
        app --('model')--< prsr
        app -('layoutModel') -> dd
            dd -('layout') -> elk
            dd --('graph obj')--< elk
        app --('graph obj')--< dd

        app -('draw graph')-> dd
            dd -('draw actors, channels, edges')->dp
        app --('painter (dp)')--< dd

        app -('get svg elements')->dp
        app --('svg elements')--<dp
        
        app -('create and set svg elements')->dc


    p --('model')--< app

};

We give the scenario a name – “Model Drawing” – and describe the different calls between the cooperating actors. Indentation is not required; it is just added here for readability.

The interactions between the agents implicitly define channels between the components. So when the diagram is drawn, it is drawn with the relevant channels.

At this point the application allows you to run or step through the given scenario, where you will see the different messages and return values, as described in the text.
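To make the idea of implicitly defined channels concrete, here is a minimal Python sketch. It is not Scenaria's actual implementation: the `CALLS` list is a hand-copied subset of the calls in the scenario above, `implied_channels` is a hypothetical helper, and treating a channel as an undirected pair of agents is my assumption for the sake of illustration.

```python
# A subset of the calls from the 'Model Drawing' scenario, as (caller, callee) pairs.
CALLS = [
    ("u", "e"), ("u", "p"), ("p", "app"), ("p", "e"),
    ("app", "prsr"), ("app", "dd"), ("dd", "elk"),
    ("dd", "dp"), ("app", "dp"), ("app", "dc"),
]


def implied_channels(calls):
    """Derive one channel per pair of interacting agents.

    Assumption: a channel is undirected, so a call and its return value
    (or a call in the opposite direction) share the same channel.
    """
    return {frozenset(pair) for pair in calls}


channels = implied_channels(CALLS)
```

No channel is declared anywhere; the set of channels falls out of the described interactions, which is what keeps the textual description short.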


Next Steps

This is far from a complete tool, and I hope to continue working on it, as I try to embed it into my daily work and see what works and what doesn’t.

At this point, it’s basically a proof of concept, a sort of an early prototype.

Some directions and features I have in mind that I believe can help in promoting the goals I outlined above:

  1. Better diagramming: better layout, supporting component hierarchies.
  2. Diagram features: comments on the diagram (as part of steps?), titles, notes
  3. Scenario playback – allow for branches, parallel step execution, self calls.
  4. Versioning of diagrams – show an evolution of a system, milestones for development, etc.
  5. Integration with other tools:
    1. Wikis/markdown (a “design notebook”?)
    2. Slack and other discussion tools
    3. Tools and links to other modeling tools, showing different views of the same model.
  6. A view only mode – allow sharing only the diagram and allow playback of scenarios.
    1. Allow embedding of the SVG only into other tools, e.g. a widget in google docs.
  7. Better application UX (admittedly, I’m not much of a user interface designer).
  8. Team collaboration features beyond version control.

Contributions, feedback and discussions are of course always welcome.

Managing Access Control – A Crash Course

During my career, especially when involved in development of some kind of platform, I’ve run into the need to explore and make decisions on how to manage access control in all kinds of situations. I’ve done my research, but I still haven’t found a simple and clear explanation of access control models, specifically how to manage access control in a platform.

So my purpose here is to try and give a solid yet simple enough explanation of how access control models work. This is not intended necessarily to be a comprehensive view, nor a very precise and rigorous view. My aim here is to provide a structured conceptual model and facilitate further discussions and decisions on how to model the access control in applications/frameworks. I cover here mainly the declarative part of managing access control, implementation of an access control mechanism is beyond the scope of this post.

Let’s dive right in.

The Access Control Question

At the fundamental level, all access control models try to answer a simple question: for any given user, what actions can they perform on what objects at the time of the query?

To put it slightly more formally, given a set of users U, a set of objects O and a set of actions A (associated with the objects), what is the relation P of tuples <u,a,o> (u∈U, a∈A, o∈O)?

If a tuple exists in the relation, then we say that user u is allowed to do a with object o.

Some examples:

  1. <Joe, READ, ClientsTable> means that Joe is allowed to READ the Clients table.
  2. <Alice, VIEW, TenantMgmtScreen> means that the user named Alice is allowed to view the tenant management screen.
  3. <Alice, EXECUTE, DeleteUsers> means that the user named Alice is allowed to execute the function called ‘DeleteUsers’.
  4. <Bob, READ, Device[id=123]> means that the user named Bob can read the record with id 123 from the Device table.
  5. <*, VIEW, LoginScreen> means that any user can view the login screen (adopting ‘*’ as a notation for wildcard).

The only question an authorization service aims to answer is the existence of such tuples at the time of querying – whether the user in question can execute a given action on a given object.
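The model so far can be sketched in a few lines of Python. This is only an illustration of the relation P as an explicit set of tuples, using the examples above; the names `Grant`, `P` and `is_allowed` are mine, and a real authorization service would not store the relation this naively.

```python
from typing import NamedTuple

WILDCARD = "*"  # same wildcard notation as in the examples above


class Grant(NamedTuple):
    user: str
    action: str
    obj: str


# The relation P, written out explicitly as <user, action, object> tuples.
P = {
    Grant("Joe", "READ", "ClientsTable"),
    Grant("Alice", "VIEW", "TenantMgmtScreen"),
    Grant("Alice", "EXECUTE", "DeleteUsers"),
    Grant("Bob", "READ", "Device[id=123]"),
    Grant(WILDCARD, "VIEW", "LoginScreen"),
}


def is_allowed(user: str, action: str, obj: str) -> bool:
    """Answer the access control question: does <user, action, obj> exist in P?"""
    return Grant(user, action, obj) in P or Grant(WILDCARD, action, obj) in P
```

With this in place, `is_allowed("Joe", "READ", "ClientsTable")` holds, while `is_allowed("Joe", "EXECUTE", "DeleteUsers")` does not, since no such tuple exists in the relation.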

Note that users, actions and objects can be anything we choose them to be. Objects can of course be data objects (tables, records, documents), with common actions such as CREATE, READ, UPDATE, DELETE. But objects can also be functions to execute (stored procedures, RPCs, actions accessed over URLs, … ) or screens to view.

Terminology-wise, objects are sometimes referred to as resources and users are more generally referred to as security principals (so it’s not just individual persons). A tuple of <action, object> is referred to as a permission or privilege.

Managing Permissions

Now that we’ve established the main question to be answered, the next question is how to manage permissions effectively and efficiently. This is more than a question of convenience. An overly complicated permission management model might lead to confusion and to permissions being assigned where they should not be. It’s important to be able to reason about a given user’s permissions so we can make sure a user (usually a group of users) doesn’t get permissions they shouldn’t have.

A straightforward optimization is to assign a permission (an <action, object> tuple) to a set of users at once. So access control systems usually let us define arbitrary groups of users to allow this convenience. Still, the definition and assignment of a large set of permissions is cumbersome.

Roles

Another common and useful way to organize permission assignment is to group permissions and assign (groups of) users to these sets of permissions.

Remember that a permission is simply a tuple of <action,object>. A role is simply a set of such tuples.

For example, an Account Manager role (AccountManagerRole) might be defined as: 

  <View, Account Configuration Screen>,
  <Execute, UpdateAccount>,
  <View, Account Dashboard>
  …

Assuming we have such a role defined in the system, we can now assign users to it.

So <Bob, AccountManagerRole> is simply a shortcut for: 

  <Bob, View, Account Configuration Screen>,
  <Bob, Execute, UpdateAccount>,
  <Bob, View, Account Dashboard>
  …
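The shortcut described above can be sketched as a simple expansion step. This is a minimal illustration, assuming roles and assignments are held in plain dictionaries; `ROLES`, `ROLE_ASSIGNMENTS` and `expand` are hypothetical names, and only the three permissions listed explicitly above are included.

```python
# A role is a named set of <action, object> permissions.
ROLES = {
    "AccountManagerRole": {
        ("View", "Account Configuration Screen"),
        ("Execute", "UpdateAccount"),
        ("View", "Account Dashboard"),
    },
}

# Users are assigned to roles, not to individual permissions.
ROLE_ASSIGNMENTS = {"Bob": {"AccountManagerRole"}}


def expand(user: str) -> set[tuple[str, str, str]]:
    """Expand a user's role assignments into explicit <user, action, object> tuples."""
    return {
        (user, action, obj)
        for role in ROLE_ASSIGNMENTS.get(user, set())
        for action, obj in ROLES[role]
    }
```

Changing the definition of a role changes the expanded tuples of every user assigned to it, which is exactly the management convenience roles are meant to provide.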

Naturally, role definitions usually correlate with business-related roles. So a user that is assigned to a role is expected to fulfill some function in a business process, which the given permissions allow them to perform.

This is also true for operational roles over management applications/resources. We define a role of “administrator” or “operator” with the necessary privileges to accomplish the role. It’s important to have even operational functionality access-controlled with the same mechanism so definitions are assigned effectively. A common example is a “user administrator” role, which is allowed to create/update/delete user records (other principals) and assign privileges to them. This role definition is no different from any other; it’s simply defined over the “Roles” and “Users” objects.

Assignment of users/principals to roles, whether mechanized or manual, is usually a human-driven process. Assignment of (groups of) users to roles happens with some utility or GUI-based tool, but it is essentially human-driven and performed as needed, e.g. as part of some onboarding process.

A word of caution: I’ve seen in some discussions how roles get mixed up with user groups. In other words, roles are conceptualized as sets of users. While technically you can view a role as designating a set of users (the principals assigned to the role at a given point in time), this is actually a byproduct of the access control system, and not the original purpose and motivation for defining a role. A role should ideally represent a business function, with a set of privileges.

Data-Level Permissions

A common and useful extension to this permission management model is that of data-level permissions, often referred to as row-level permissions because of their roots in RDBMSs. In this extension we usually define permissions at a finer granularity of objects – individual data objects (rows/documents) defined in our application.

Note that this is not a different model than what we presented as the basic permissions model. We’re still assigning actions, usually READ/UPDATE/DELETE, to individual objects. The difference is in how the assignment happens – how we connect a user to the object in question.

Up until now, the user was assigned arbitrarily/manually to a set of permissions, probably through a role. With this extension we’re also configuring the authorization service to match a specific property of the user to a property of the accessed object.

For example, we might want to limit Charlie’s access only to Clients from his region. We can define the permission as:

   <READ,Clients[region=user.region]>

Where user here refers to the “current” user being matched (the principal trying to access the Clients table). This assumes of course that the user object in the system has such a property (region) defined.

This allows us to assign this permission to a group of users, while each user still sees only the set of data they should be allowed to access, without defining explicit roles for each such subset. It’s both easier to manage and more robust, since data with new property values (e.g. new regions) is automatically accessible only to the allowed users, without changing role assignments.
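As a minimal sketch of such a property match, assume the Clients table is a list of rows and the user object carries a `region` property. The `User` class, the sample rows and `readable_clients` are all hypothetical and only serve to illustrate the <READ, Clients[region=user.region]> permission.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    region: str


# Hypothetical rows of the Clients table.
CLIENTS = [
    {"id": 1, "name": "Initech", "region": "EMEA"},
    {"id": 2, "name": "Globex", "region": "US"},
    {"id": 3, "name": "Umbrella", "region": "US"},
]


def readable_clients(user: User) -> list[dict]:
    """Apply <READ, Clients[region=user.region]> for the current user."""
    return [row for row in CLIENTS if row["region"] == user.region]


charlie = User("Charlie", "US")
visible = readable_clients(charlie)  # only the rows whose region is "US"
```

The same permission definition serves every user; the result differs only because the property on the user side differs.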

The choice of properties to match can of course be anything that can be matched between the data object and the user. A common use case for such a mechanism is to separate the data between different business units or tenants, in the same data store. So a “BU”/“Tenant” property is defined for each data object (row), and users are assigned to business units/tenants, allowing this match to happen when examined.

Note that there’s nothing that prevents this definition from being part of a role definition. It is, however, often impractical/inefficient to enforce this kind of restriction at the same place where more general access control is enforced. This is simply because enforcement of data-level permissions requires intimate knowledge of the underlying data model to be queried, which usually doesn’t exist in a generic access control service. It’s also more efficient to “push” the data-level permissions to underlying query engines to reduce retrieved data volumes.

Extending the Accessible Object Set

It’s often useful/desired to extend the set of permitted data objects to a user, either individually or through a role. We do this by extending the set of matched values on the user’s side. In other words, the user will be able to access data objects (rows) that match a larger set of values.

The most straightforward way to achieve this is to explicitly specify more values in the permission definition. But this is often impractical from an operational perspective.

Another way is to rely on some other structure that exists between (sets of) users and leverage its definition to extend the set of matched property values for that user.

A common example of such a mechanism is the definition and assignment of users to business units, which often form a hierarchy derived from the organization’s business structure.

For example, assume ACME Corporation is structured as follows:

ACME Global
⤷ ACME US
   ⤷ ACME West Coast
   ⤷ ACME East Coast
⤷ ACME EMEA
   ⤷ ACME Middle East
   ⤷ ACME Africa
   ⤷ ACME EU
⤷ ACME APAC

Assume also that Charlie is assigned to ACME US, i.e. he has a property that says that his business unit is ACME US. We can then easily extend Charlie’s set of accessible objects by also matching on descendant business units (ACME West Coast, ACME East Coast), so objects that belong to these business units are also viewable.
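A minimal sketch of this extension, assuming the hierarchy is available as a parent-to-children mapping. The `CHILDREN` dictionary mirrors the ACME structure above; `accessible_units` and `can_view` are hypothetical helper names.

```python
# The ACME business unit hierarchy from the example, as parent -> children.
CHILDREN = {
    "ACME Global": ["ACME US", "ACME EMEA", "ACME APAC"],
    "ACME US": ["ACME West Coast", "ACME East Coast"],
    "ACME EMEA": ["ACME Middle East", "ACME Africa", "ACME EU"],
}


def accessible_units(unit: str) -> set[str]:
    """Return the unit itself plus all of its descendant business units."""
    units = {unit}
    for child in CHILDREN.get(unit, []):
        units |= accessible_units(child)
    return units


def can_view(user_unit: str, object_unit: str) -> bool:
    """A data object is viewable if its unit is in the user's extended set."""
    return object_unit in accessible_units(user_unit)
```

For Charlie, `accessible_units("ACME US")` contains ACME US, ACME West Coast and ACME East Coast, so the match now succeeds on three values instead of one without any change to his role assignment.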

It’s important to note that this is simply leveraging the assignment of users to business units and their hierarchy in order to extend the sets of accessible objects. But there’s nothing that prevents us from using any other existing relationship between users. For example, consider users that are mapped to an organization hierarchy (a Manager-Employees relationship). We can define that a user has access to all data objects whose “category” matches the user’s “expertise” value (whatever that means), plus all the “expertise” values of the users managed by that user. In this manner we allow a manager to automatically view their employees’ data, e.g. assigned tickets. Changes in the organizational structure or in ticket categories will automatically change the permissions assigned to said users.
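The manager example can be sketched the same way. The names, the reporting structure and the expertise values below are invented for illustration, and only direct reports are considered; a deeper organization tree would need a recursive walk like the business unit case.

```python
# Hypothetical organization data: manager -> direct reports, user -> expertise.
REPORTS = {"Dana": ["Eli", "Fay"]}
EXPERTISE = {"Dana": "billing", "Eli": "network", "Fay": "storage"}

# Hypothetical tickets, each tagged with a category.
TICKETS = [
    {"id": 1, "category": "network"},
    {"id": 2, "category": "billing"},
    {"id": 3, "category": "security"},
]


def matched_categories(user: str) -> set[str]:
    """The user's own expertise plus the expertise of the users they manage."""
    return {EXPERTISE[user]} | {EXPERTISE[r] for r in REPORTS.get(user, [])}


def visible_tickets(user: str) -> list[int]:
    """Ids of tickets whose category matches the user's extended value set."""
    allowed = matched_categories(user)
    return [t["id"] for t in TICKETS if t["category"] in allowed]
```

Moving Eli to another manager only requires changing `REPORTS`; the set of tickets each manager can see follows automatically.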


To summarize, we defined a simple (semi) formal model for thinking about the access control problem, and specifically how to manage it. Different systems and paradigms exist to ease the management of this kind of access control.

When considering how to manage access control, especially when creating platforms, it’s usually helpful to think of it in terms of how principals are assigned to <action,object> tuples effectively.

Hope this helps.


ADR Flow: A Tool for Managing Architecture Decisions

With the rise of the agile software development movement, the role of the architect has changed to some degree as well.

While exploring the full breadth of changes is an interesting (some might say tedious) discussion, one of the most prominent challenges I often face is finding an efficient way to preserve knowledge around design/architecture decisions that are made in a project.
It’s a challenge I faced when I joined a team already working on several different products, and me being the new guy (and the one who’s supposed to make design decisions!) – I had to spend considerable time researching, asking around, digging into code and old presentations, trying to figure out why the system was built the way it is. In some cases there was actually no clear answer.
It’s also a situation I expect people to face with any long-lived project/product development. Unless you maintain very good documentation of your design decisions,  it will be hard to trace back why things are built the way they are. I know I’ve faced similar situations in past projects.

Documenting decisions is tedious. Moreover, it’s often perceived as non-productive. It doesn’t add product value. But even that – convincing people to document design decisions – is the easy part in my experience. The harder task is to actually write something that is coherent, describes the context and the decision succinctly but completely.

One methodology I found, liked, and am currently experimenting with is that of Architecture Decision Records.
Refer to the link above for a complete overview, but in a nutshell: describe your architecture decision using a simple template, keep one file per decision, keep the files with the source code, and set up a minimal process around these decisions, to be reflected in the files. The coupling to the code, and the focus on specific decisions, is what I think will make this a more productive way to document our design decisions.
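To illustrate the “one file per decision, kept with the source code” idea, here is a minimal Python sketch. It does not show how ADR Flow or any other tool works: the `new_adr` helper, the `NNNN-slug.md` file naming and the section headings (a commonly used ADR layout) are all assumptions made for the example.

```python
import re
from pathlib import Path

# Assumed template: a commonly used ADR layout, not a mandated one.
TEMPLATE = """# {number}. {title}

## Status
Proposed

## Context

## Decision

## Consequences
"""


def new_adr(adr_dir: Path, title: str) -> Path:
    """Create the next numbered ADR file in adr_dir and return its path."""
    adr_dir.mkdir(parents=True, exist_ok=True)
    numbers = []
    for path in adr_dir.glob("*.md"):
        match = re.match(r"(\d+)-", path.name)
        if match:
            numbers.append(int(match.group(1)))
    number = max(numbers, default=0) + 1
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    path = adr_dir / f"{number:04d}-{slug}.md"
    path.write_text(TEMPLATE.format(number=number, title=title), encoding="utf-8")
    return path
```

Because each decision is a plain text file next to the code, it is versioned, reviewed and searched with the same tools as the code itself.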

For me, at this point in time, this is still in “POC mode”. I’ve introduced it to my team, and am currently pushing it, but we still need to see more value.

One thing I missed from the beginning is a way to easily set up and manage ADRs. I found this open source project, which seems useful. But it’s based solely on Linux shell scripts. Being mostly a Windows user in my day-to-day work, I missed a good Windows integration.

And thus the ADR Flow project was born.

The ADR Flow tool is meant to be a simple command line tool to aid in the process of creating and managing the ADR lifecycle. It’s meant to be simple enough so that it could be easily integrated into any workflow and/or standard tool.
It is not meant to hide the fact that ADRs are text files, or prevent editing them in whatever tools the user sees fit. Rather, it tries to relieve some of the maintenance overhead and make managing ADRs easy.

It’s a fun side project, also giving me the option to explore more possibilities for development with Node.js.

Comments, questions, feedback and of course contributions are more than welcome.

The Understated Architect Role


So what’s in a software architect?

This is a question I’ve seen addressed a few times, and have discussed quite a bit during my career. It’s a question I need to address if and when I search for a new job as a software development architect. And it’s also a question that seems to have quite a few possible answers.

But eventually, there does seem to be common ground when trying to discuss the work of a software development architect. One could discuss the soft skills, the technical skills or the leadership skills required from an architect. It’s all true, at least to some degree. The exact combination of skills required to be a good software development architect varies between different organizations, operating contexts (the organization structure, company culture and size, history with the team, etc.) and your definition of “good”; so I won’t waste your time in trying to define an exact grocery list of skills.

I do believe, however, that one important role of the architect is often overlooked or not emphasized enough. And it’s probably not something you’ll learn in any course or training. It is something that I often felt during my work, but realized fully only when another (more experienced) architect put it into words that resonated strongly with me.

Probably the most important job of an architect is to create and maintain a consistent understanding of the software system being developed. Or more precisely: creating a coherent and consistent picture of the developed system across all parties involved in the development project. To create an understanding between the development team and its stakeholders. To create an understanding within the team about the technical vision and direction for development. To create an understanding across teams on what is being developed and how systems interact.

This is probably the single most important job an architect has that is also unique to this role. Technical expertise is important, as are making design decisions and being able to weigh trade-offs properly. But the thing that truly separates the architect’s role from that of an expert software developer is the ability to convey a system’s structure and technical design, its capabilities, its constraints and the decisions taken when building it.

It’s really more than simply knowing the right words and using the right terminology. It’s about framing thoughts in a way that is consistent and understandable by the target audience. It’s about striking the balance between formalism and human comprehension. It’s about choosing the right methods to convey an idea. It’s about being precise and succinct, yet clear and understandable. It’s about being able to translate between different terminologies or “domains of thoughts”.

And herein lies the real challenge in being a software development architect, in my opinion. Technical mastery is important. But being able to convey an idea efficiently, to plant the right idea into the minds of people you’re communicating with is one thing you can’t learn on stackoverflow.com (which is a wonderful site, by the way).

And it’s a subtle act, often very delicate and hard to balance, simply because people come from different schools of thought and experiences (and often with agendas). But choosing the right terminology, explaining it properly, making assumptions explicit or drawing the right diagram goes a long way towards aligning all involved parties on a single vision and a coherent mental picture of a system. Creating this mental picture in everyone’s mind is no easy task.

I had a boss who once told me that whenever there’s a disagreement on a technical direction to take, you should prefer to be the one drawing the diagrams. This simple fact already gives you an advantage. If you’re the one holding the marker pen, you already have an edge.

This is true from an organizational politics point of view, but also an important point to keep in mind when you want to reach an agreement and create this common understanding and consistency – if you wield the pen, you wield the power as well as the responsibility. And when you’re an architect, your job is to wield the pen.

When people talk about being an architect, they usually talk about what to do with the pen – what kind of diagrams to draw, what documents to write, how much code, etc. But there’s an overarching, implicit goal that is assigned to the one holding the pen – to make sure everyone is aligned and has a consistent mental picture in their head when approaching their individual jobs. This is especially hard in the software business, where there’s no shortage of abstractions and intangible concepts to keep in mind, and where the different levels of abstraction and complexity a developer has to keep in their head are often mind-boggling. The ability to synthesize the important ideas and communicate them at the right place, at the right time, in a manner that will be adopted and accepted is quite a feat. Take someone who is good at doing that, add decent technical and analytical abilities into the mix and you’ve got yourself a good software architect right there.

 

Effective System Modeling

This post is a rough “transcript” (with some changes and creative freedom) of a session I gave in the Citi Innovation Lab, TLV about how to effectively model a system.

A Communication Breakdown?

Building complex software systems is not an easy task, for a lot of reasons. All kinds of solutions have been invented to tackle the different issues. We have higher level programming languages, DB tools, agile project management methodologies and quite a bit more. One could argue that these problems still exist, and no complete solution has been found so far. That may be true, but in this post, I’d like to discuss a different problem in this context: communicating our designs.

One problem that seems to be overlooked or not addressed well enough, is the issue of communicating our designs and system architecture. By ourselves, experienced engineers are (usually) quite capable of coming up with often elegant solutions to complex problems. But the realities and dynamics of a software development organization, especially a geographically distributed one, often require us to communicate and reason about systems developed by others.

We – software engineers – tend to focus on solving the technical issues or designing the systems we’re building. This often leads to forgetting that software development, especially in the enterprise, is often, if not always, a team effort. Communicating our designs is therefore critical to our success, but is often viewed as a negligible activity at best, if not a complete waste of time.

The agile development movement, in all its variants, has done some good to bring the issues of cooperation and communication into the limelight. Still, I often find that communication of technical details – structure and behavior of systems, is poorly done.

Why is that?

“Doing” Architecture

A common interpretation of agile development methods I often encounter tends to throw the baby out with the bathwater. I hear about people/teams refusing to do “big up-front design”. That in itself is actually a good thing in my opinion. The problem starts when this translates to no design at all, which in turn translates into not wanting to spend time on documenting your architecture properly, or on how it’s communicated.

But as anyone who’s been in this industry for more than a day knows – there’s no replacement for thinking about your design and your system, and agile doesn’t mean we shouldn’t design our system. So I claim that the problem isn’t really with designing per se, but rather with the motivation and methodology we use for “doing” our architecture – how we go about designing the system and conveying our thoughts. Most of us acknowledge the importance of thinking about a system, but we do not invest the time in preserving that knowledge and discussion. Communicating a design or system architecture, especially in written form, is often viewed as superfluous, given the working code and its accompanying tests. From my experience this is often the case because the actual communication and documentation of a design are done ineffectively.

This view was also strengthened after hearing Simon Brown talk about a similar subject, one which resonated with me. An architecture document/artifact should have “just enough” up front design to understand the system and create a shared vision. An architecture document should augment the code, not repeat it; it should describe what the code doesn’t already describe. In other words – don’t document the code, but rather look for the added value. A good architecture/design document adds value to the project team by articulating the vision on which all team members need to align. Of course, this is less apparent in small teams than in large ones, especially teams that need to cooperate on a larger project.

As a side note I would like to suggest that besides creating a shared understanding and vision, an architecture document also helps in preserving the knowledge and ramping-up people onto the team. I believe that anyone who has tried learning a new system just by looking at its code will empathize with this.

Since I believe the motivation to actually design the system and solve the problem is definitely there, I’m left with the feeling that people often view the task of documenting it and communicating it as unnecessary “bureaucracy”.
We therefore need a way to communicate and document our system’s architecture effectively. A way that will allow us to transfer knowledge, over time and space (geographies), but still do it efficiently – both for the writer and readers.
It needs to be a way that captures the essence of the system, without drowning the reader in details or burdening the writer with work that will prove to be a waste of time. From a system analysis point of view, reading the document is quite possibly the more prominent use case compared to writing it; i.e. the document is going to be read far more often than it is written or modified.

When we come to the question of modeling a system, with the purpose of the end result being readable by humans, we need to balance the amount of formalism we apply to the model. A rigorous modeling technique will probably result in a more accurate model, but not necessarily an easily understandable one. Rigorous documents tend to be complete and accurate, but exhausting to read and follow, thereby defeating the purpose we’re trying to achieve. At the other end of the scale are free text documents, often in English and sometimes with some scribbled diagrams, which explain the structure or behavior of the system, often inconsistently. These are hard to follow for different reasons: inaccurate language, inconsistent terminology and/or an ad-hoc (=unfamiliar) modeling technique.

Providing an easy to follow system description, and doing so efficiently, requires us to balance these two ends. We need to have a “just enough” formalism that provides a common language. It needs to be intuitive to write and read, with enough freedom to provide any details needed to get a complete picture, but without burdening the writers and readers with unnecessary details.
In this post, I try to give an overview of, and pointers to, a method I found useful in the past (not my invention), and that I believe meets the criteria mentioned above. It is definitely not the only way and may not suit everyone’s taste (e.g. Simon Brown suggests something similar but slightly different); but regardless of the method used, creating a shared vision and putting it in writing is useful, when done effectively.

System != Software

Before going into the technicalities of describing a system effectively, I believe we need to make the distinction between a system and its software.

For the purposes of our discussion, we’ll define software as a computer-understandable description of a dynamic system; i.e. one way to code the structure and behavior of a system in a way that’s understandable by computers.
A (dynamic) system on the other hand is what emerges from the execution of software.

To understand the distinction, an analogy might help: consider the task of understanding the issue of global warming (the system) vs. understanding the structure of a book about global warming (the software).

  • Understanding the book structure does not imply understanding global warming. Similarly, understanding the software structure doesn’t imply understanding the system.
  • The book can be written in different languages, but it’s still describing global warming. Similarly, software can be implemented using different languages and tools/technologies, but it doesn’t (shouldn’t) change the emergent behavior of the system.
  • Reading the content of the book implies understanding global warming. Similarly, the system is what emerges from execution of the software.

One point we need to keep in mind, and where this analogy breaks, is that understanding a book’s structure is considerably easier than understanding the software written for a given system.
So usually, when confronted with the need to document our system, we tend to focus on documenting the software, not the system. This leads to ineffective documentation/modeling (we’re documenting the wrong thing), eventually leading to frustration and missing knowledge.
This is further compounded by the fact that existing tools and frameworks for documenting software tend to be complex and detailed, with the tools emphasizing code generation rather than human communication; this is especially true for UML.

Modeling a System

When we model an existing system, or design a new one, we find several methods and tools that help us. A lot of these methods define all sorts of views of the system – describing different facets of its implementation. Most practitioners have surely met one or more different “types” of system views: logical, conceptual, deployment, implementation, high level, behavior, etc. These all provide some kind of information as to how the system is built, but there’s not a lot of clarity on the differences or roles of each such view. These are essentially different abstractions or facets of the given system being modeled. While any such abstraction can be justified in itself, it is the combination of these that produces an often unreadable end result.

So, as with any other type of technical document you write, the first rule of thumb is:

Rule of thumb #1: Tailor the content to the reader(s), and be explicit about it.

In other words – set expectations. Set the expectation early on – what you’re describing and what is the expected knowledge (and usually technical competency) of the reader.

Generally, in my experience, 3 main facets are the most important ones: the structure of the system – how it’s built, the behavior of the system – how the different components interact on given inputs/events, and the domain model used in the system. Each of these facets can be described in more or less detail, at different abstraction levels, and using different techniques, depending on the case. But these are usually the most important facets for a reader to understand the system and approach the code design itself, or the reading of the code.

Technical Architecture Modeling

One method I often find useful is that of Technical Architecture Modeling (TAM), itself a derivative of Fundamental Modeling Concepts (FMC). It is a formal method, but one which focuses on human comprehension. As such, it borrows from UML and FMC, to provide a level of formalism which seems to strike a good balance between readability and modeling efficiency. TAM uses a few diagram types, where the most useful are the component/block diagram used to depict a system’s structure or composition; the activity and sequence diagrams used to model a system/component’s behavior; and the class diagram used to model a domain (value) model. In addition, other diagram types are also included, e.g. state charts and deployment diagrams; but these are less useful in my experience. In addition, TAM also has some tool support in the form of Visio stencils that make it easier to integrate this into other documentation methods.

I briefly discuss how the most important facets of a system can be modeled with TAM, but the reader is encouraged to follow the links given above (or ask me) for further information and details.

Block Diagram: System Structure

A system’s structure, or composition, is described using a simple block diagram. At its simplest form, this diagram describes the different components that make up the system.
For example, describing a simple travel agency system, with a reservation and information system can look something like this (example taken from the FMC introduction):

Sample: Travel Agency System

This in itself already tells us some of the story: there’s a travel agency system, accessed by customers and other interested parties, with two subsystems: a reservation system and an information help desk system. The information is read and written to two separate data stores holding the customer data and reservations in one store, and the travel information (e.g. flight and hotel information) in the other. This data is fed into the system by external travel-related organizations (e.g. airlines, hotel chains), and reservations are forwarded to the same external systems.

This description is usually enough to provide at least contextual, high level information about the system. But the diagram above already tells us a bit more. It provides us some information about the access points to the data; about the different kinds of data flowing in the system, and what component is interacting with what other component (who knows who). Note that there is little to no technical information at this point.

The modeling language itself is pretty straightforward and simple as well: we have two main “entities”: actors and data stores.
Actors, designated by rectangles with square corners, are any components that do something in the system (including humans). They are the active components of the system. Actors communicate with other actors through channels (lines with small circles on them), and they read/write from/to data stores (simple lines with arrow heads). Examples include services, functions and human operators of the system.
Data stores, designated by rectangles with rounded corners (or circles), are passive components. These are “places” where data is stored. Examples include database systems, files, and even memory arrays (or generally any data structure).

Armed with these definitions, we can already identify some useful patterns, and how to model them:

Read only access – actor A can only read from data store S:
Read only access

 

Write only access – actor A can only write to data store S:
Write only access

 

Read/Write access:
Read/Write access

 

Two actors communicating on a request/response channel have their own unique symbol:
Request/response channel
In this case, actor ‘B’ requests something from actor ‘A’ (the arrow on the ‘R’ symbol points to  ‘A’), and ‘A’ answers back with data. So data flow actually happens in both ways. A classical example of this is a client browser asking for a web page from a web server.

 

A simple communication over a shared storage:
Communication over shared storage
Actors ‘A’ and ‘B’ both read and write from/to data store ‘S’, effectively communicating over it.
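To make these primitives a bit more concrete, here is a minimal sketch in Python. It is not part of FMC/TAM or any of its tools; the `BlockDiagram` class and its field names are made up purely for illustration. It only shows that the patterns above boil down to a handful of sets: actors, stores, read/write edges and request/response channels.

```python
from dataclasses import dataclass, field


@dataclass
class BlockDiagram:
    """A tiny, hypothetical in-memory form of an FMC/TAM-style block diagram."""
    actors: set = field(default_factory=set)    # active components
    stores: set = field(default_factory=set)    # passive components
    reads: set = field(default_factory=set)     # (actor, store) pairs
    writes: set = field(default_factory=set)    # (actor, store) pairs
    requests: set = field(default_factory=set)  # (requester, responder) pairs

    def access(self, actor, store):
        """Classify how an actor accesses a store."""
        can_read = (actor, store) in self.reads
        can_write = (actor, store) in self.writes
        if can_read and can_write:
            return "read/write"
        if can_read:
            return "read only"
        if can_write:
            return "write only"
        return "none"

    def shared_storage_pairs(self):
        """(writer, reader, store) triples: actors that can communicate
        because one writes a store that the other reads."""
        pairs = set()
        for writer, store in self.writes:
            for reader, read_store in self.reads:
                if store == read_store and writer != reader:
                    pairs.add((writer, reader, store))
        return pairs


# The last two patterns above: B requests something from A,
# and both A and B read and write the shared store S.
diagram = BlockDiagram(
    actors={"A", "B"},
    stores={"S"},
    reads={("A", "S"), ("B", "S")},
    writes={("A", "S"), ("B", "S")},
    requests={("B", "A")},
)
print(diagram.access("A", "S"))
print(sorted(diagram.shared_storage_pairs()))
```

Keeping the model this small is deliberate: the point of the notation is that a reader can hold all of it in their head.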

 

There’s a bit more to this formalism, which you can explore on the FMC/TAM website, but not much more than what’s shown here. These simple primitives already provide a powerful expression mechanism to convey most of the ideas we need to communicate about our system on a daily basis.

Usually, when providing such a diagram, it’s good practice to accompany it with some text that provides some explanation on the different components and their roles. This shouldn’t be more than 1-2 paragraphs, but actually depends on the level of detail and system size.

This would generally help with two things: identifying redundant components, and describing the responsibility of each component clearly. Think of this text explanation as a way to validate your modeling, as displayed in the diagram.

Rule of thumb #2: If your explanation doesn’t include all the actors/stores depicted in the diagram – you probably have redundant components.
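This rule of thumb can even be checked mechanically, at least roughly. The following is a minimal sketch in Python, assuming component names appear literally in the explanation text; the function name `unexplained_components` and the ‘Audit Log’ component are hypothetical, added only to show a component that the text never mentions.

```python
import re


def unexplained_components(components, explanation):
    """Return the components whose names never appear in the explanation text."""
    text = explanation.lower()
    missing = []
    for name in components:
        # Match the whole name, ignoring case, so "Cache" does not match "cached".
        pattern = r"\b" + re.escape(name.lower()) + r"\b"
        if not re.search(pattern, text):
            missing.append(name)
    return sorted(missing)


components = [
    "Reservation System",
    "Information Help Desk",
    "Travel Information",
    "Audit Log",  # hypothetical component, drawn but never explained
]
explanation = (
    "The reservation system books trips for customers. "
    "The information help desk answers queries using the travel information store."
)
print(unexplained_components(components, explanation))
```

A name that shows up here is a candidate for removal from the diagram, or a sign that the explanation is incomplete. A literal text match will of course miss synonyms, so treat the result as a hint rather than a verdict.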

Behavior Modeling

The dynamic behavior of a system is of course no less important than its structure. The cooperation, interaction and data flow between components allow us to identify failure points, bottlenecks, decoupling problems etc. In this case, TAM adopts largely the UML practice of using sequence diagrams or activity diagrams, whose description is beyond the scope of this post.

One thing to keep in mind though, is that when modeling behavior in this case, you’re usually not modeling interaction between classes, but rather between components. So the formalism of “messages” sent between objects need not couple itself to code structure and class/method names. Remember: you generally don’t model the software (code), but rather system components. So you don’t need to model the exact method calls and object instances, as is generally the case with UML models.

One good way to validate the model at this point is to verify that the components mentioned in the activity diagram are mentioned in the system’s structure (in the block diagram); and that components that interact in the behavioral model actually have this interaction expressed in the structural model. A missing interaction (e.g. channel) in the structural view may mean that these two components have an interface that wasn’t expressed in the structural model, i.e. the structure diagram should be fixed; or it could mean that these two components shouldn’t interact, i.e. the behavioral model needs to be fixed.

This is the exact thought process that this modeling helps to achieve – modeling two different facets of the system and validating one with the other in iterations allows us to reason and validate our understanding of the system. The explicit diagrams are simply the visual method that helps us to visualize and capture those ideas efficiently. Of course, keep in mind that you validate the model at the appropriate level of abstraction – don’t validate a high level system structure with a sequence diagram describing implementation classes.

Rule of thumb #3: Every interaction modeled in the behavioral model (activity/sequence diagrams) should be reflected in the structural model (block diagram), and vice versa.
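As a rough illustration of this cross-validation, here is a minimal sketch in Python. It assumes both models have been reduced to plain pairs of component names, which no TAM tool does for you as far as I know; the function `cross_check` and the sample pairs are hypothetical.

```python
def cross_check(structural_links, behavioral_interactions):
    """Compare the two models and report what each one is missing.

    Both arguments are iterables of (component, component) pairs.
    Direction is ignored: a channel between X and Y covers messages either way.
    """
    structure = {frozenset(pair) for pair in structural_links}
    behavior = {frozenset(pair) for pair in behavioral_interactions}
    return {
        "missing_in_structure": sorted(tuple(sorted(p)) for p in behavior - structure),
        "unused_in_behavior": sorted(tuple(sorted(p)) for p in structure - behavior),
    }


# Channels drawn in the block diagram (illustrative data).
structural_links = [
    ("Customer", "Reservation System"),
    ("Customer", "Information Help Desk"),
]
# Interactions that appear in a sequence diagram (illustrative data).
behavioral_interactions = [
    ("Customer", "Reservation System"),
    ("Reservation System", "Information Help Desk"),
]
result = cross_check(structural_links, behavioral_interactions)
print(result["missing_in_structure"])
print(result["unused_in_behavior"])
```

Neither list says which model is wrong. A pair in `missing_in_structure` may mean the block diagram lacks a channel, or that the two components should not be talking at all; that decision is still the designer’s.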

Domain Modeling

Another often useful aspect of modeling a system is modeling the data processed by the system. It helps to reason about the algorithms, expected load and eventually the structure of the code. This is often the part that’s not covered by well known patterns and needs to be carefully tuned per application. It also helps in creating a shared vocabulary and terminology when discussing different aspects of the developed software.

A useful method in the case of domain modeling is UML class diagrams, which TAM also adopts. In this case as well, I often find a more scaled-down version the most useful, usually focused on the main entities, and their relationships (including cardinality). The useful notation of class diagrams can be leveraged to express these relationships quite succinctly.
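For readers who think more easily in code than in class diagrams, a scaled-down domain model might look like the following minimal sketch in Python. The `Customer` and `Reservation` entities, their fields, and the one-to-many cardinality between them are assumptions made for illustration; they are loosely inspired by the travel agency example and are not taken from the FMC material.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Reservation:
    reference: str
    destination: str


@dataclass
class Customer:
    name: str
    # Cardinality: one customer holds zero or more reservations (1 -> 0..*).
    reservations: List[Reservation] = field(default_factory=list)

    def add_reservation(self, reference: str, destination: str) -> Reservation:
        """Create a reservation and attach it to this customer."""
        reservation = Reservation(reference, destination)
        self.reservations.append(reservation)
        return reservation


customer = Customer("Dana")
customer.add_reservation("R-1", "Lisbon")
customer.add_reservation("R-2", "Oslo")
print(len(customer.reservations))
```

Note that only the main entities and their relationship are captured; validation rules, persistence and the like are deliberately left out, in the same spirit as a scaled-down class diagram.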

Explicit modeling of the code itself is rarely useful in my opinion – the code will probably be refactored way faster than a model will be updated, and a reader who is able to read a detailed class diagram can also read the code it describes. One exception to this rule might be when your application deals with code constructs, in which case the code constructs themselves (e.g. interfaces) serve as the API to your system, and clients will need to write code that integrates with it, as a primary usage pattern of the system. An example for this is an extensible library of any sort (eclipse plugins are one prominent example, but there are more).

Another useful modeling facet in this context is to model the main concepts handled in the system. This is especially useful in very technical systems (oriented at developers), that introduce several new concepts, e.g. frameworks. In this case, a conceptual model can prove to be useful for establishing a shared understanding and terminology for anyone discussing the system.

Iterative Refinement

Of course, at the end of the day, we need to remember that modeling a system in fact reflects a thought process we have when designing the system. The end product, in the form of a document (or set of documents), represents our understanding of the system – its structure and behavior. But this is never a one-way process. It is almost always an iterative process that reflects our evolving understanding of the system.

So modeling a specific facet of the system should not be seen as a one-off activity. We often follow a dynamic where we model the structure of the system, but then try to model its behavior, only to realize the structure isn’t sufficient or leads to a suboptimal flow. This back and forth is actually a good thing – it helps us to solidify our understanding and converge on a widely understood and accepted picture of how the system should look, and how it should be constructed.

Refinements also happen on the axis of abstractions. Moving from a high level to a lower level of abstraction, we can provide more details on the system. We can refine as much as we find useful, up to the level of modeling the code (which, as stated above, is rarely useful in my opinion). Also when working on the details of a given view, it’s common to find improvement points and issues in the higher level description. So iterations can happen here as well.

As an example, consider the imaginary travel agency example quoted above. One possible refinement of the structural view could be something like this (also taken from the site above):

Example: travel agency system refined

In this case, more detail is provided on the implementation of the information help subsystem and the ‘Travel Information’ data store. Although providing some more (useful) technical details, this is still a block diagram, describing the structure of the system. This level of detail refines the high level view shown earlier, and already provides more information and insight into how the system is built: for example, how the data stores are implemented and accessed, and the way data is adapted and propagated in the system. The astute reader will note that the ‘Reservation System’ subsystem now interacts with the ‘HTTP Server’ component in the ‘Information help desk’ subsystem. This makes sense from a logical point of view – the reservation system accesses the travel information through the same channels used to provide information to other actors, but this information was missing from the first diagram (no channel between the two components).
One important rule of thumb is that as you go down the levels of abstraction, keep the names of actors presented in the higher level of abstraction. This allows readers to correlate the views more easily, identify the different actors, and reason about their place in the system. It provides a context for the more fine granular details. As the example above shows, the more detailed diagram still includes the actor and store names from the higher level diagram (‘Travel Information’, ‘Information help desk’, ‘Travel Agency’).

Rule of thumb #4: Be consistent about names when moving between different levels of abstraction. Enable correlations between the different views.
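A simple consistency check along these lines can be sketched in a few lines of Python. The function `compare_names` is hypothetical, and the name sets below are illustrative rather than a transcription of the two travel agency diagrams; the sketch assumes each view has been reduced to a set of actor/store names.

```python
def compare_names(high_level, refined):
    """Find high-level names that were dropped or respelled in the refined view.

    Returns (missing, respelled), where respelled holds (old, new) pairs
    that differ only in letter case or spacing.
    """
    def norm(name):
        return " ".join(name.lower().split())

    refined_by_norm = {norm(name): name for name in refined}
    missing, respelled = [], []
    for name in sorted(high_level):
        if name in refined:
            continue  # carried over unchanged
        match = refined_by_norm.get(norm(name))
        if match is None:
            missing.append(name)
        else:
            respelled.append((name, match))
    return missing, respelled


high_level = {"Travel Agency", "Information help desk",
              "Travel Information", "Reservation System"}
refined = {"Travel Agency", "Information Help Desk",
           "Travel Information", "HTTP Server"}
missing, respelled = compare_names(high_level, refined)
print(missing)
print(respelled)
```

A missing name is not necessarily an error, since a refined view may zoom into one subsystem only, but a respelled name almost always is.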

Communicating w/ Humans – Visualization is Key

With all this modeling activity going on, we have to keep in mind that our main goal, besides good design, is communicating this design to other humans, not machines. This is why, reluctant as we are to admit it (engineers…) – aesthetics matter.

In the context of enterprise systems, communicating the design effectively is as important to the quality of the resulting software as designing it properly. In some cases, it might be even more important – just consider the amount of time you sometimes spend on system integration vs. how much time you spend writing the software itself. So a good looking diagram is important, and we should be mindful about how we present it to the intended audience.

Following are some tips and pointers on what to look for when considering this aspect of communicating our designs. This is by no means an exhaustive list, but more based on experience (and some common sense). More pointers can be found in the links above, specifically in the visualization guide.

First, keep in mind that the visual arrangement of nodes and edges in your diagram directly affects how clear the diagram is to readers. Try to minimize intersection of edges, and align edges on horizontal and vertical axes.
Compare these two examples:

Aligning vertices

The arrangement on the left is definitely clearer than the one on the right. Note that generally speaking, the size of a node does not imply any specific meaning; it is just a visual convenience.

Similarly, this example:

Visual alignment

shows how the re-arrangement of nodes allows for less intersection, without losing any meaning.
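If you want to compare two candidate layouts a bit more objectively, counting edge intersections is easy to sketch. The following minimal Python example assumes edges are drawn as straight lines between node centers, which is a simplification (real diagrams often route edges along horizontal and vertical segments); the function names and coordinates are made up for illustration.

```python
from itertools import combinations


def _orientation(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1, -1, or 0 if collinear."""
    value = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (value > 0) - (value < 0)


def _properly_cross(a, b, c, d):
    """True if segment a-b and segment c-d cross at a single interior point."""
    return (_orientation(a, b, c) * _orientation(a, b, d) < 0
            and _orientation(c, d, a) * _orientation(c, d, b) < 0)


def count_crossings(positions, edges):
    """Count pairs of edges that cross, given node positions as (x, y)."""
    crossings = 0
    for (u1, v1), (u2, v2) in combinations(edges, 2):
        if {u1, v1} & {u2, v2}:
            continue  # edges sharing a node meet there by design
        if _properly_cross(positions[u1], positions[v1],
                           positions[u2], positions[v2]):
            crossings += 1
    return crossings


edges = [("A", "D"), ("B", "C")]
tangled = {"A": (0, 0), "B": (0, 1), "C": (1, 0), "D": (1, 1)}
tidy = {"A": (0, 0), "B": (0, 1), "C": (1, 1), "D": (1, 0)}  # C and D swapped
print(count_crossings(tangled, edges))
print(count_crossings(tidy, edges))
```

Both layouts describe exactly the same structure; only the second one draws it without the two edges crossing.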

Colors can also be very useful in this case. One can use colors to help distinguish between different levels of containment:

Using colors

In this case, the usage of colors helps to distinguish an otherwise confusing structure. Keep in mind that readers might print the document you create on a black and white printer, and that some readers are color blind – so use high contrast colors where possible.

Label styles are generally not very useful to convey meaning. Try to stick to a very specific font and be consistent with it. An exception might be a label that pertains to a different aspect, e.g. configuration files or code locations, which might be more easily distinguished when using a different font style.

Visuals have Semantics

One useful way to leverage colors and layout of a diagram is to stress specific semantics you might want to convey in your diagram. One might leverage colors to distinguish a set of components from other components, e.g. highlighting team responsibilities, or highlighting specific implementation details. Note that this kind of technique is not standard, so remember to include an explanation – a legend – of what the different colors mean. Also, too many colors might cause more clutter, eventually defeating the purpose of clarity.

Another useful technique is to use the layout of the nodes in the graph to convey an understanding. For example, the main data flow might be hinted at in the block diagram by laying out the nodes from left to right, or top to bottom. This is not required, nor does it carry any specific meaning in the notation. But it is often useful, and provides hints as to how the system actually works.

Summary

As we’ve seen, “doing” architecture, while often perceived as a cumbersome and unnecessary activity, isn’t hard when done effectively. We need to keep in mind the focus of this activity: communicating our designs and reasoning about them over longer periods of time.

Easing the collaboration around design is not just an issue of knowledge sharing (though that’s important as well), but it is a necessity when trying to build software across global teams, over long periods of time. How effectively we communicate our designs directly impacts how we collaborate, the quality of produced software, how we evolve it over time, and eventually the bottom line of deliveries.

I hope this (rather long) post has shed some light on the subject, provided some insight and useful tips, and encouraged people to invest some effort in learning further.


Credit: The example and images presented in this post are taken from the FMC website and examples.