
From Code Monkeys to Thought Partners: LLMs and the End of Software Engineering Busywork

When it comes to AI and programming, vibe coding is all the rage these days. I’ve tried it, to an extent, and commented about it at length. While a lot of people seem to believe it’s a game changer for SW development, among experienced SW engineers there’s a growing realization that it is not a panacea. In some cases I’ve even seen resentment or scorn at the idea that vibe coding is anything more than passing hype.

I personally don’t think it’s just hype. It might be more in the zeitgeist at the moment, but it won’t go away. I believe that simply because it’s not a new trend. Vibe coding, in my opinion, is nothing more than an evolution of low/no-code platforms. We’ve seen tools of this type since MS-Access and Visual Basic back in the 90s. It definitely has its niche, a viable one, but it’s not something that will eradicate the SW development profession.

I do think that AI will most definitely change how developers work and what programming looks like. But this still won’t make programmers obsolete.

This is because the actual challenges are elsewhere.

The Real Bottlenecks in Software Engineering

In fact, I think we’re only scratching the surface here. Partly because the technology and tooling are still evolving, but also because most people1 trying to improve software engineering seem to be looking at the wrong problem.

Anyone who’s been in this business professionally has realized at some point that code production is not the real bottleneck of software engineering productivity. It never was.

The real challenges in real world software development, especially at scale, are different. They revolve mainly around producing coherent software with many people who need to interact with one another:

  • Conquering complexity: understanding the business and translating it into working code. Understanding large code bases.
  • Communication overhead: the amount of coordination needed between different teams when aligning on design choices2. We often end up with knowledge silos.
  • Maintaining consistency: using the same tools, practices and patterns so operation and evolution will be easier. This is especially true in large organizations, and over time.
  • Analyzing the impact of changes: tracing decisions back to the code they affect isn’t easy.

A lot of the energy and money invested in day-to-day professional software development goes into managing this complexity and delivering software at a consistent (increasing?) pace, with acceptable quality. It’s no surprise there’s a whole ecosystem of methodologies, techniques and tools dedicated to alleviating some of these issues. Some are successful, some not so much.

Code generation isn’t really the hard part. It’s probably the easiest part of the story. Having a tool that does it slightly faster3 is helpful, but it doesn’t solve the hard challenges.
We should realize that code generation, however elaborate, is not the entire story. It’s also about understanding the user’s request, the constraints and the existing code.

The point here isn’t to dismiss the fantastic innovations in the technology. My point is rather that it’s applied to the least interesting problem. As great as the technology and tooling are – and they are great – simply generating code doesn’t solve the big challenges.

This leads me to wonder: is this it?
Is all the promise of AI, when it comes to my line of work, just typing the characters I tell it to, faster?
Don’t get me wrong, it’s nice to have someone else do the typing4, but this seems somewhat underwhelming. It certainly isn’t a game changer.

Intuitively, this doesn’t seem right. But to see why, we need to take a step back and consider LLMs again.

LLM Strengths Beyond Code Generation

Large Language Models, as the name implies, are pretty good at understanding, well – language. They’re really good at parsing and producing text, at “understanding” it. I’m avoiding the philosophical debate on the nature of understanding5, but I think it’s pretty clear at this point that when it comes to natural language understanding, LLMs provide a very clear advantage.

And this is where it gets interesting. Because when we look at the real world challenges listed above, most of them boil down to communication and understanding of language and semantics.

LLMs are good at:

  • Natural language understanding – identifying concepts in written text.
  • Information synthesis – connecting disparate sources.
  • Pattern recognition
  • Summarization
  • Structured data generation

And when you consider mechanizing these capabilities, like LLMs do, you should be able to see the doors this opens.

These capabilities map pretty well to the problems we have in large scale software engineering. Take, for example, pattern recognition. This should help with mastering complexity, especially when complexity is expressed in human language6.

Another example might be in addressing communication overhead. It can be greatly reduced when the communication artifacts are generated by agents armed with LLMs. Think about drafting decisions, specifications, summarizing notes and combining them into concrete design artifacts and project plans.
It’s also easier to maintain consistency in design and code, when you have a tireless machine that does the planning and produces the code based on examples and design artifacts it sees in the system.

It should also be easier to understand the impact of changes when you have a machine that traces and connects the decisions to concrete artifacts and components. A machine that checks changes in code isn’t new (you probably know it as “a compiler” or “static code analyzer”). But one that understands high level design documents and connects them eventually to the running code, with no extra metadata, is a novelty. Think about an agent that understands your logs and your ADRs, and uses them to find bottlenecks or brainstorm potential improvements.

And this is where it starts to get interesting.

It’s interesting because this is where mechanizing processes starts to pay off – when we address the scale of the process and volume of work. And we do it with little to no loss of quality.

If we can get LLMs to do a lot of the heavy lifting when it comes to identifying correlations, understanding concepts and communicating about it, with other humans and other LLMs, then scaling it is a matter of cost7. And if we manage this, we should be on the road to, I believe, an order of magnitude improvement.

So where does that leave us?

Augmenting SW Engineering Teams with LLMs

You have your existing artifacts – your meeting notes, design specifications, code base, language and framework documentation, past design decisions, API descriptors, data schemas, etc.
These are mostly written in English or some other known format.

Imagine a set of LLM-based software agents that connect to these artifacts, understand the concepts and patterns, make the connections and start operating on them. This has immediate potential to save human time by generating artifacts (not just code), but also to make a lot of the communication more consistent. It also has the potential to highlight inconsistencies that would otherwise go unnoticed.

Consider, for example, an ADR assistant that takes in a set of meeting notes, some product requirements document(s) and past decisions, automatically identifies the new decisions taken, and generates succinct, focused ADRs for the decisions reached.
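
To make this a bit more concrete, here is a minimal sketch of the orchestration around such an assistant. Everything in it is an assumption for illustration: the directory layout, the prompt wording and the ask_llm callable, which you would wire to whatever LLM client you actually use.

```python
from pathlib import Path
from typing import Callable

# Hypothetical prompt template; the section names and wording are assumptions.
ADR_PROMPT = """You are an architecture assistant.
Given the meeting notes, product requirements and existing ADRs below,
list any NEW decisions that are not already covered by an existing ADR,
and draft one succinct ADR (context, decision, consequences) per new decision.

## Meeting notes
{notes}

## Product requirements
{requirements}

## Existing ADRs
{adrs}
"""

def draft_new_adrs(notes_dir: Path, prd_file: Path, adr_dir: Path,
                   ask_llm: Callable[[str], str]) -> str:
    """Collect the textual artifacts and ask an LLM to draft candidate ADRs.

    The result is a draft for a human to review, not a final decision record.
    """
    notes = "\n\n".join(p.read_text() for p in sorted(notes_dir.glob("*.md")))
    requirements = prd_file.read_text()
    adrs = "\n\n".join(p.read_text() for p in sorted(adr_dir.glob("*.md")))
    prompt = ADR_PROMPT.format(notes=notes, requirements=requirements, adrs=adrs)
    return ask_llm(prompt)
```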

Another example would be an agent that can act as a sounding board for design thinking – you throw your ideas at it and allow it to access existing project and system context as well as industry standards and documentation. You then chat with it about where best practices are best applied, and where the risks are in given design alternatives. Design review suddenly becomes more streamlined when you can simply ask the LLM to bring up issues in the proposed design.

Imagine an agent that systematically builds a knowledge graph of your system as it grows. It does this in the background by scanning committed code and connecting it with higher level documentation and requirements (probably after another agent generated them). Understanding the impact of changes becomes easier when you can access such a semantic knowledge graph of your project. Connect it to a git tool and it can also understand code/documentation changes at a very granular level.
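
As a toy illustration of the kind of background linking such an agent could do, here is a sketch that ties a commit’s changed files to the design documents that mention them. It is deliberately naive: a real agent would use an LLM (and likely a graph store) to extract and match concepts rather than plain substring matching, and the docs directory layout is purely an assumption.

```python
import subprocess
from collections import defaultdict
from pathlib import Path

# The "knowledge graph" here is just an adjacency map from artifacts
# (code files, design docs) to the artifacts they appear to relate to.

def changed_files(commit: str) -> list[str]:
    """Return the file paths touched by a given git commit."""
    out = subprocess.run(
        ["git", "show", "--name-only", "--pretty=format:", commit],
        capture_output=True, text=True, check=True).stdout
    return [line for line in out.splitlines() if line.strip()]

def link_commit_to_docs(commit: str, docs_dir: Path) -> dict[str, set[str]]:
    """Link each changed code file to the markdown docs that mention it."""
    graph: dict[str, set[str]] = defaultdict(set)
    for code_file in changed_files(commit):
        for doc in docs_dir.glob("**/*.md"):
            if Path(code_file).stem in doc.read_text():  # crude concept match
                graph[code_file].add(str(doc))
                graph[str(doc)].add(code_file)
    return graph
```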

None of these examples eliminates the human in the loop; keeping one is actually a common pattern in agentic systems. I don’t think the human(s) can or should be eliminated from the loop. It’s about empowering human engineers to apply intuition and higher level reasoning, and letting the machine do the heavy lifting of producing and scanning text. And in this case we have a machine that can not only scan the text, but also, to a degree, understand the higher level concepts in it. Humans immediately benefit from this, simply because humans and machines now communicate in the same natural language, at scale.

We can also take it a step further: we don’t necessarily need a complicated or very structured API to allow these agents to communicate amongst themselves. Since LLMs understand text, a markdown file with some simple structure (headers, blocks) is a pretty good starting point for an LLM to infer concepts. Combine this with diagram-as-code artifacts and you have another win – LLMs understand these structures as well. All with the same artifacts understandable by humans. There’s no need for extra conversions8.
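
To show how little structure is actually needed, here is a minimal sketch that splits a markdown artifact into sections by its headers, so an agent (or a prompt builder) can pick out just the part it needs. The ADR-style headers in the example are only an assumed convention.

```python
import re

# Split a markdown artifact into (header, body) sections.
HEADER = re.compile(r"^(#{1,6})\s+(.*)$", re.MULTILINE)

def sections(markdown: str) -> dict[str, str]:
    parts: dict[str, str] = {}
    matches = list(HEADER.finditer(markdown))
    for i, m in enumerate(matches):
        start = m.end()
        end = matches[i + 1].start() if i + 1 < len(matches) else len(markdown)
        parts[m.group(2).strip()] = markdown[start:end].strip()
    return parts

adr = """# Use event sourcing for orders
## Context
Order state changes need a full audit trail.
## Decision
Persist order changes as immutable events.
"""
print(sections(adr)["Decision"])  # -> "Persist order changes as immutable events."
```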

So now we can have LLMs communicating with other LLMs, to produce more general automated workflows. Analyzing requirements, in the context of the existing system and past decisions, becomes easier. Identifying inconsistencies or missing/conflicting requirements can be done by connecting a “requirement analyzer” agent to the available knowledge graph produced and updated by another agent. What-if scenarios are easier to explore in design.

Such agents can also help with producing more viable plans for implementation, especially taking into consideration existing code bases. Leaning on (automatically updated) documentation can probably help with LLM context management – making it more accurate at a lower token cost.

Mechanizing Semantics

We should be careful here not to fall into the trap of assuming this is simple automation – a sort of more sophisticated robotic process automation, though that has its value as well.

I think it goes beyond that.
A lot of the work we do on a day to day basis is about bringing context and applying it to the problem or task at hand.

When I get a feature design to be reviewed, I read it, and start asking questions. I try to apply system thinking and first principle thinking. I bring in the context of the system and business I’m already aware of. I try to look at the problem from different angles, and ask a series of “what-if” questions on the design proposed. Sometimes it’s surfacing implicit, potentially harmful, assumptions. Sometimes it’s just connecting the dots with another team’s work. Sometimes it’s bringing up the time my system was hacked by a security consultant 15 years ago (true story). There’s a lot of experience that goes into that. But essentially it’s applying the same questions and thought processes to the concepts presented on paper and/or in code.

With LLMs’ ability to derive concepts, identify patterns in them, and with their vast embedded knowledge, I believe we can encode a lot of that experience into them – whether by fine tuning, clever prompting or context building. A lot of these thinking steps can be mechanized. It seems we have a machine that can derive semantics from natural language, and we have the potential to leverage this mechanization in the day to day of software production. It’s more than simple pattern identification. It’s about bridging the gap between human expression and formal methods (be it diagrams or code). That gap seems to be getting smaller by the day.

Let’s not forget that software development is usually a team effort. And when we have little automatic helpers that understand our language and make connections to existing systems, patterns and vocabulary, they’re also helping us communicate amongst ourselves. Remote work is prevalent, development teams are often geographically distributed, and they often communicate in a language that is native to no one on the team. Having something that summarizes your thoughts, verifies meeting notes against existing patterns and ultimately checks whether your components play nicely with the plans of other teams, all in perfect English, is a definite win.

This probably won’t be an easy thing to do, and there will be a lot of nuances (e.g. legacy vs. newer code, different styles of architecture, evolving non-functional requirements). But for the first time I feel this is a realistic goal, even if it’s not immediately achievable.

Are We Done?

This of course raises the question – where is the line? If we can encode our experience as developers and architects into the machine, are we really on the path to obsolescence?

My feeling is that no, we are not. At the end of the process, after all alternatives are weighed, assumptions are surfaced and trade-offs are considered, a decision needs to be taken.

At the level of code writing, this decision – what code to produce – can probably be taken by an LLM. This is a case where constraints are clearer and with correct context and understanding there’s a good chance of getting it right. The expected output is more easily verifiable.

But this isn’t true for more “strategic” design choices – things that go beyond code organization or localized algorithm performance, choices that involve human elements like skill sets and relationships, or contractual and business pressure. Ultimately, the decision involves a degree of intuition. I can’t say whether intuition can be built into LLMs; intuitively, I believe it can’t (pun intended). I highly doubt we can emulate it using LLMs, at least not in the foreseeable future.

So when all analysis is done, the decision maker is still a human (or a group of humans). A human who needs to consider the analysis, apply their experience, and decide on a course forward. If the LLM-based assistant is good enough, it can present a good summary and even recommendations, all produced automatically. But this analysis still needs to be understood and used by humans to reach a conclusion.

Are we there yet? No.
Are we close? Closer than ever probably, but still a way to go.

Can we think of a way to get there? Probably yes.

A Possible Roadmap

How can we realize this?

The answer seems to be, as always, to start simple, integrate and iterate; ad infinitum. In this case, however, the technology is still relatively young, and there’s a lot going on – everything from foundation models, relevant databases and coding tools to prompt engineering, MCPs and beyond. These are all being actively researched and developed, so trying to predict how this will evolve is even harder.

Still, if I had to guess how this will evolve in practice, here is how I think it will go – at least one possible path.

Foundational System Understanding

First, we’ll probably start with simple knowledge building. I expect we’ll first see AI agents that can read code and produce and consume design knowledge – how current systems operate. This is already happening and I expect it to improve, mainly because the task is well known and the tools already exist. We can verify results and fine tune the techniques.
Examples could be AI agents that produce detailed sequence diagrams of existing code and then identify components. Other AI agents can consume design documents/notes and meeting transcriptions, together with the descriptions already produced, to create an accurate record of the changed/enhanced design. Having these agents work continuously and consistently across a large system already provides value.

Connecting Static and Dynamic Knowledge

Given that AI agents have an understanding of the system structure, I can see other AI agents working on dynamic knowledge – analyzing logs, traces and other dynamic data to provide insights into how the system and its users actually behave, and how the system evolves (through source control). This is more than log and metric analysis. It’s about overlaying the available information on a larger knowledge graph of the system, connecting business behavior to the system’s implementation, including its evolution (i.e. git commits and Jira tickets).


Can we now examine the data and deduce insights about better UX design?
Can we provide insights into the decomposition of the system?

Enhanced Contextual Assistant and Design Support

At this point we should have everything needed to provide more proactive design support. I can see AI agents we can chat with that help us reason about our designs – where we can suggest a design alternative and ask the agent to assess it and find hidden complexities, in the context of the existing system. Combined with daily deployments and source control, we can probably expect some time estimates and detailed planning.

This is where I see the “design sounding board” agent coming into play. As well as agents preemptively telling me where expected designs might falter.

More importantly, it’s where AI agents start to make the connections to other teams’ work. Telling me where my designs or expected flow will collide with another team’s plans.
Imagine an AI agent that monitors design decisions, of all teams and domains, identifies the flows they refer to, and highlights potential mismatches between teams or suggests extra integration testing, if necessary, all before sprint planning starts. Impact analysis becomes much easier at this point, not because we can query the available data (though we could, and that’s nice as well), but because we have an AI agent looking at the available data, considering the change, and identifying on its own what the impact is.


There’s still a long way to go until this is realized. Implementing this vision requires taking into account data access issues, LLM and technology evolution, integration and costs. All the makings of a useful software project.
I also expect quite a bit can change, and new techniques/technologies might make this more achievable or completely unnecessary.

And who knows, I could also be completely hallucinating. I heard it’s fashionable these days.

Conclusion: The Real Promise of LLMs in Software Engineering

I’ve argued here that while vibe coding and code generation get most of the attention, they aren’t addressing the real bottlenecks in software development. The true potential of Large Language Models lies in their ability to understand and process natural language, connect disparate information sources, and mechanize semantic understanding at scale.

LLMs can transform software engineering by tackling the actual challenges we face daily: conquering complexity, reducing communication overhead, maintaining consistency, and analyzing the impact of changes. By creating AI agents that can understand requirements, generate documentation, connect design decisions to implementation, and serve as design thinking partners, we can achieve meaningful productivity improvements beyond simply typing code faster, as nifty as that is.

What makes this vision useful and practical is that it doesn’t eliminate humans from the loop. Rather, it augments our capabilities by handling the heavy lifting of information processing and connection-making, while leaving the intuitive, strategic decisions to experienced engineers. This partnership between human intuition and machine-powered semantic understanding represents a genuine step forward in how we build software.

Are we there yet? Not quite. But we’re closer than ever before, and the path forward is becoming clearer. 

Have you experienced any of these AI-powered workflows in your own development process? Do you see other applications for LLMs that could address the real bottlenecks in software engineering?


  1. At least most who publicly talk about it ↩︎
  2. ‘Just set up an API’ is easier said than done – agreeing on the API is the hard part ↩︎
  3. And this is a bit debatable when you consider non-functional requirements ↩︎
  4. I am getting older ↩︎
  5. Also because I don’t feel qualified to argue on it ↩︎
  6. Data mining has been around forever, but mostly works on structured data ↩︎
  7. Admittedly, not a negligible consideration ↩︎
  8. Though from a pure mechanistic point of view, this might not be the most efficient way ↩︎

Exploring Vibe Coding with AI: My Experiment

In my previous post I mentioned vibe coding as a current trend of coding with AI. But at that point I hadn’t actually tried it.

So I’ve decided to jump on the bandwagon and give it a try. Granted, I’m not the obvious target audience for this technique, but before passing judgment I had to see/feel it for myself.

It’s not the first time I’ve generated code using an LLM with some prompting. But this time I was more committed to trying out “the vibe”. To be clear, I did not intend to go all in with voice commands, transcription, and watching Netflix while the LLM worked. I did intend to review the code and keep in touch with the output at every point. I wanted to test the tool’s capabilities while still being very much aware of what was going on.

Below is an account of what happened, my thoughts and conclusions so far.
A general disclaimer is of course in place: I’m still exploring these tools, and it’s quite possible the process could be improved. My impressions, however, are very much shaped by my experience as a developer. My choice of tools and how I use them is therefore biased towards an experienced developer looking to increase productivity, not a non-coder looking to crank out one-off applications1.

The Setup

I set out to create a new simple tool for myself (actually to be used at work) – something I actually find useful, and not an obvious side project that’s been done a million times, and therefore (I hope) less likely to be in the LLM’s training data. It’s a project done from scratch, and I’m trying to do something I don’t have a lot of experience with. It is also meant to be fairly limited in scope.

The project itself is a “Knowledge Graph Visualizer”, essentially an in-browser viewer of a graph representing arbitrary concepts and their relationships. I intended this to be purely in-browser JS code. The main feature is a 3D rendering of the graph, allowing navigation through the concepts and their links. You can see the initial bare specification here.

To get a feel for the project, here’s a current screenshot:

KG-Viewer showing its own knowledge graph

With respect to tooling I went with Cursor (I use Cursor Pro), primarily using the Claude 3.7 Sonnet model. The initial code generation was actually done with Gemini 2.5 Pro, but I quickly ran out of credits there. So the bulk of the work was done with Cursor.

I did not use any special Cursor rules or MCP tools. This may have altered the experience to a degree (though I doubt it), so I will need to keep trying as I explore these tools and techniques.

Getting Into the Vibe

It actually started out fairly impressively. Given the initial spec, Gemini generated 6 files that provided the skeleton for a proof of concept. All of these files are still there. I did not look too deeply into the generated code. Instead, I initialized an empty simple project, launched Cursor, and copied the files there. With a few tweaks2, it worked. I had a working POC in about one hour of work, without ever having coded 3D renderings of graphs.

Magic!
I’ll be honest – I was impressed at first. I got a working Three.js implementation for drawing a graph described by some JSON schema. Given that I had never laid eyes on Three.js before, this was definitely faster than I would have gotten even to this simple POC.

I did peek at the code. I wasn’t overly impressed by it – there was a lot of unnecessary repetition, very long functions, and some weird design choices. For example, having a style.css holding all the style classes, but at the same time generating a new style and dynamically injecting it into the document.
But, adhering to my “vibe coder creed”, I did not touch the code, working only with prompts.

Then I started asking for more features.

Cursor/Claude, We Have a Problem

A POC is nice. But I actually need a working tool. So I started asking for more features.
Note, I did not just continue to spill out requests in the chat. I followed the common wisdom – using a new chat instance, laying out the feature specification and working step by step on planning and testing before implementation.

I wrote a simple file, which should allow me to trace the feature’s spec and implementation.
The general structure is simple:

- Feature Specification
- Plan
- Testing
- Implementation Log

I fill in only the Feature Specification, and let Cursor fill in the Plan (after my approval) and the Implementation Log as we proceed.

The plan was to have a working log of progress, used both as a record of the work and as context for future chat sessions.

I don’t intend to recreate here the entire chat session or all my prompts, as this is not intended to be a tutorial on LLM techniques. But it’s fair to say that the first feature (data retrieval) was implemented fairly easily, using only prompts.

Just One Small Change…

I was actually still pretty impressed at this point, so I simply asked for one tiny feature – showing node and link labels. I did it without creating an explicit “feature file”.

The code didn’t work. So I asked Cursor to fix it. And this quickly spiraled out of control. Cursor’s agent of course notified me on every request that it had definitely figured out the issue and now had the fix (!).
It didn’t.

I remained loyal to the “vibe coder creed” and did not try to debug or fix the code myself, instead deliberately going in cycles of prompting for fixes, accepting changes blindly, testing, and prompting again with new errors.

Somewhere along this cycle, the code changes made by the agent actually created a regression in the application’s code, resulting in the application not loading at all.

After roughly 3 hours, and a lot more grey hair, I did notice that the Cursor agent was going in circles – simply trying the same 3 solutions over and over, with no idea what was wrong, but still confidently hallucinating solutions (“Now I see the issue…”3).

This was so frustrating that at this point I simply took it upon myself to actually look at the code, which was a complete mess. I looked at the problematic code, consulted git diffs to restore basic functionality, and solved the actual issue with about 10 more minutes of Google search.

To be fair, from my very rudimentary Google search it seemed my request (link labels) wasn’t that easy to achieve in the first place (again, I’m no Three.js expert). I relaxed the requirement a bit and found a simple solution.
Still, the whole back-and-forth cycle of code changes, especially to unrelated code, was very much counter-productive. The vibes were all wrong. Getting back to working code took another 2-3 hours.

At this point I was thinking “oh well, you can’t win them all”. I wanted to turn to something simple. And looking at the state of the code, a simple cleanup should be easy enough, right?

Right? …

Now It’s Just Cleanup

Well … it depends.

I went back into “vibe coding” mode. This time, I defined very basic code cleanup procedures. I then asked Cursor’s agent (in a new session), to go through the source code and follow these steps to clean it up.

It actually did reasonably well for small files. The bigger files proved to be more challenging. Trying to clean them up ended up messing up the files completely. For some reason, the LLM agent removed functioning code and created functionality regressions. Trying to quickly fix them ended up causing more issues. It was clearly guessing at this point.

Given my battle scars with the previous feature request, I avoided this hallucination death spiral. Instead, I went through git history, found a working version, and restored the working code “by hand” – actually typing in code. I wasn’t a vibe coder anymore, but the application worked, the code was cleaner, and my blood pressure remained fairly low (I think).

The experience felt like trying to mentor a junior developer to code without creating regressions. The problem is that it’s a fast and confident junior developer with short-term memory loss, apparently so eager to please that it simply spews out code that looks remotely connected to the problem at hand, with little understanding of context – proving ignorant even of changes it itself made to the code.

Documentation for Man and Machine

At this point I decided to go back to basics, where LLMs truly shine – understanding and creating text. I asked it to create documentation for specific flows in the code (the init sequence, clicking on a legend item). Unsurprisingly, with a few simple prompts, the agent produced decent documentation for what I asked, including mermaid.js diagram code.

This is important not simply because it allowed me to document the project easily, which is nice. Creating textual documentation of specific flows also allowed me to provide better context for other chat sessions. And this is an important insight – textual descriptions of the code are useful for humans as well as for LLMs.

Other Features

At this point I turned to developing more features – loading data and “node focus”. In both cases I went back to providing feature files with specifications, and asking the agent to update the files with plans and implementation logs.

I was a bit more cautious now. I reviewed code more carefully and intervened where I felt the code wasn’t good. In some cases it was obvious the code wasn’t functionally correct, but instead of trying to “fight” with the agent, I accepted the code and went on to change it myself.

A repeating phrase in all my prompts at this point was:

Do minimal code changes. Change only what is needed and nothing more.

This, combined with being more cautious and careful, produced pretty good results. I managed to implement two features in a short time – probably a bit faster than it would have taken me to run through Three.js tutorials and do it myself.

Final Thoughts

So where does this leave me?

I have a working application. And if I had to learn Three.js from scratch myself, it would have taken me considerably longer to create. It’s working, and it’s useful. This is an important bottom line.

Small Application, Good Starting Point

The initial code, generated by the LLM (Gemini or Claude) does serve as a good starting point, especially in areas or frameworks that are unfamiliar to the developer.

But this is still a far cry from replacing developers. There are tool limitations, some of them, I expect, introduced by Cursor rather than the LLM. These limitations can cause havoc if the agent is left to proceed with no oversight.
And review is harder when there’s a ton of unorganized code4.

We can probably make it better with rules, better prompts, and combination of agents. And of course advances in LLM training.

This is a good starting point. But we need to remember this is a very small application, made from scratch. In the real world, a lot of use cases are not that simple at all. The more I read and think about it, the more this bears a striking resemblance to no-code/low-code tools. In those cases too, it’s easy to achieve quick results for simple use cases, but very hard to scale development when features creep in or the application needs to scale.

It’s not that low-code tools don’t have their place. They serve a very specific (viable) niche. But as experience shows, they haven’t replaced developers.

Could this be different?
What would it take to tackle more serious challenges, with “vibe coding”?

Context is King

It’s quite obvious that in the kingdom of tokens, amidst ramparts of code and winds of chat messages, there is only one king, and its name is Context5. As LLMs are limited in their context size, and a lot of it is taken up by wrapping tools (Cursor in this case), context for an LLM chat is expensive real estate.

So while context windows can get big, we’ll probably never have enough when we get to more complicated tasks and bigger code bases. There’s a preservation of complexity at play.

Accuracy and precision in the context play a crucial role in effectiveness. Context passed to LLMs needs to be information-dense. We should probably start considering how efficient the context we provide to LLMs is. I don’t know how to measure context efficiency yet, but I believe it will be important for staying effective as tasks become more complicated.

But there’s more than just the LLM and how to operate it.

You’re Only as Good as Your Tools, Also When Vibing

It’s quite clear that mistakes made by LLMs, and by humans, can be avoided or caught with the help of the right tools. Even in my small example described above, cooperation between the LLM agent and external tools (console logs, shell commands) resulted in better understanding and a more independent agent.

I suspect that having more tools, e.g. a relevant MCP server for documentation, can help significantly. I expect the integration of LLMs with tools to become more prominent and more necessary for creating more independent coding agents.

One often overlooked tool is the simple document explaining the context of the project, specific features and current tasks. When LLMs work seamlessly with Architecture Decision Records and diagram-as-code tools, I expect to see better results. The memory bank approach seems to be a step in that direction, though it’s hard to assess how effective it is.

I noticed in this exercise that supplying the LLM with the context of how a flow currently works (e.g. loading the data) allows it to identify the necessary changes more easily.

Diagram-as-code artifacts now play a role not just for human developers, but also as a way to encode context for the application. There’s a feedback loop here between the LLM generating documentation and using that documentation as input for further tasks.

Effective Vibing

The real question is about the effectiveness of the vibe coding approach: with what degree of agent independence can we achieve good results?

I’m not sure how to assess this. One approximation might be the rate of bugs relative to user chat messages and lines generated in a given vibe coding session. But there are obviously other parameters involved6.

It will be interesting to measure this over time, with more integrated tools, improved LLMs and possibly improved tools.

I’m not sure how this will evolve over time. I do think, however, that if LLMs with coding tools are reduced to a glorified low-code platform, it will be a miss for software engineering in general. The technology seems to be more powerful than that, since it has the potential to more easily bridge the gap between human language and rigorous computer programs – and to do it in both directions.

On to explore more.



  1. Not that there’s anything wrong with that ↩︎
  2. Yep, I asked Cursor to keep track of the changes at this point ↩︎
  3. A phrase which, I guess, is close to becoming a meme unto itself ↩︎
  4. But then again, not sure it’s a problem in the long run ↩︎
  5. Always looking for opportunities to paraphrase one of my favorite book series; couldn’t resist this one ↩︎
  6. And we should be careful of Goodhart’s law. ↩︎

AI and the Nature of Programming – Some Thoughts

So, AI.
It’s all the rage these days. And apparently for good reason.

Of course, my curiosity, along with a fair amount of FOMO1, leads me to experiment with and learn the technology. There’s no shortage of tutorials, tools and models. A true Cambrian explosion of technology.
This also aligns naturally with my long-time interest in development tools, development models, and software engineering in general. So there’s no shortage of motivation to dive into this from a developer’s perspective2.

And the debate is on, especially when it comes to development tools.

It’s no secret that tools powered by large language models (LLMs), like GitHub Copilot, Cursor, and Windsurf3, are becoming indispensable. Developers everywhere are adopting them as an essential part of their daily toolset. They offer the ability to generate code snippets, debug errors and refactor code with remarkable speed. This shift has sparked a fascinating debate about the role of AI in coding. Is it merely a productivity booster? Or does it represent a fundamental change in how we think about programming itself?

At its core, coding with AI promises to make software development faster and arguably more accessible. For simple, well-defined tasks, AI can produce functional code in seconds. This reduces the cognitive load on developers and allows them to focus on higher-level problem-solving. But software development in the wild, especially ongoing product development, becomes very complicated very quickly. As complexity grows, the limitations of AI-generated code become obvious. While LLMs can produce code quickly and easily, the quality of their output often depends on the developer’s ability to guide and refine it.

So while AI excels at speeding up simple tasks, there are still challenges with more complex ones, and there are implications for the ability to maintain code over time. But I cannot deny we’re apparently at the beginning of a new era. And this raises the question of whether traditional notions of “good code” still apply in an era where AI might take over the bulk of maintenance work.

And I ask myself (and you): can we imagine a future where AI no longer generates textual code? Instead, it operates on some other canonical representation of logic. Are we witnessing a shift in the very nature of programming?

Efficiency of AI in Coding

Before diving into the hard(er) questions, let’s take a step back.

One of the most compelling advantages of coding with AI is its ability to significantly speed up the development process. This is especially true for simple and focused tasks. AI-powered tools, like GitHub Copilot and ChatGPT, excel at generating boilerplate code, writing repetitive functions, and even suggesting entire algorithms based on natural language prompts. For example, a developer can describe a task like “create a function to sort a list of integers in Python,” and the AI will instantly produce a working implementation. This capability not only saves time. It also reduces the cognitive burden on developers4. Consequently, developers can focus on more complex and creative aspects of their work.
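
The output of such a prompt would typically look something like this – one possible implementation, not a canonical answer:

```python
def sort_integers(numbers: list[int], descending: bool = False) -> list[int]:
    """Return a new list with the integers sorted."""
    return sorted(numbers, reverse=descending)

print(sort_integers([5, 2, 9, 1]))        # [1, 2, 5, 9]
print(sort_integers([5, 2, 9, 1], True))  # [9, 5, 2, 1]
```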

The efficiency of AI in coding is particularly evident in tasks that are well-defined and require minimal context. Writing unit tests, implementing standard algorithms, or formatting data are all areas where AI can outperform human developers in terms of speed. AI tools can also quickly adapt to different programming languages and frameworks, making them versatile assistants for developers working in diverse environments. For instance, a developer switching from Python to JavaScript can rely on AI to generate syntactically correct code in the new language, reducing the learning curve and accelerating productivity. I often use LLMs to create simple scripts quickly instead of consulting documentation on some forgotten shell scripting syntax.

AI’s effectiveness in coding often depends on the developer’s ability to simplify tasks. The developer should break down larger, more complex tasks into smaller, manageable components. AI thrives on clarity and specificity; the more focused the task, the better the results. Yes, we have thinking models now, and they are becoming better every day. Still, they require supervision and accurate context. Contexts are large, and they’re not cheap.

At this point in time, developers still need to break down complicated tasks into more manageable sub tasks to be successful. This is often compared to a senior developer/tech lead detailing a list of tasks for a junior developer. I often find myself describing a feature to an LLM, asking for a list of tasks before coding, and then iterating over it together with the LLM. This works quite well in small focused applications. It becomes significantly more complicated with larger codebases.

While AI excels at handling simple and well-defined tasks, its performance tends to diminish as the complexity of the task increases. This is not necessarily a limitation of the AI itself but rather a reflection of the inherent challenges in translating high-level, ambiguous requirements into precise, functional code. For example, asking an AI to “build a recommendation system for an e-commerce platform” is a very complex task. In contrast, requesting a specific algorithm, like “implement a collaborative filtering model”, is simpler. The former requires a deep understanding of the problem domain, user behavior, and system architecture. These are areas where AI still struggles without significant human guidance.
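
To illustrate the difference in scope, the latter request is the kind of small, self-contained snippet an LLM handles well. A naive, purely illustrative user-based variant might look like this (the rating matrix is made up for the example):

```python
from math import sqrt

# Toy user-item rating matrix (rows: users, columns: items); 0 means "not rated".
RATINGS = [
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 0, 5, 4],
]

def cosine(u: list[int], v: list[int]) -> float:
    """Cosine similarity between two rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recommend(user: int, ratings: list[list[int]]) -> list[int]:
    """Rank unrated items for `user` by similarity-weighted ratings of other users."""
    scores = {}
    for item, rating in enumerate(ratings[user]):
        if rating == 0:  # only score items the user hasn't rated yet
            scores[item] = sum(cosine(ratings[user], other) * other[item]
                               for i, other in enumerate(ratings) if i != user)
    return sorted(scores, key=scores.get, reverse=True)

print(recommend(0, RATINGS))  # items ranked for user 0
```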

As it stands today, LLMs act as a force multiplier for developers, enabling them to achieve more in less time. The true potential is realized when developers approach AI as a collaborative tool rather than a fully autonomous coder.

The “hands-off” approach (aka “Vibe coding“), where developers rely heavily on AI to generate code with minimal oversight, often leads to mixed results. AI can produce code that appears correct at first glance. Yet, it can contain subtle bugs, inefficiencies, or design flaws that are not immediately obvious. This is just one case I came across, but there are a lot more of course.

It’s not just about speed

But it’s more than simple planning, prompt engineering and context building. AI can correct its own errors, autonomously.

One of the most impressive features of AI in coding is its ability to detect and fix errors. When an LLM generates code, it doesn’t always get everything right the first time. Syntax errors, compilation issues, or logical mistakes can creep in. Yet, modern AI tools are increasingly equipped to spot these problems and even suggest fixes. For instance, tools like Cursor’s “agent mode” can recognize compilation errors. These tools then automatically try to correct them. This creates a powerful feedback loop where the AI not only writes code but also improves it in real time.

But it’s important to note that there’s a collaboration here between AI and traditional tooling. Compilers make sure that the code is syntactically correct and can run, while LLMs help refine it. Together, they form a system where errors are caught early and often, leading to more reliable code. I have also had several cases where I asked the LLM to make sure all tests pass and there are no regressions. It ran all the tests and fixed the code based on the broken tests.
That is, without human intervention in that loop.

So AI, along with traditional tools (compilers, tests, linters) can be autonomous, at least to a degree.

It’s not just about correct code

As we all know, producing working code is only one (important) step when working as an engineer. It’s only the beginning of the journey. This is especially true when working on ongoing product development, and probably less so in time-scoped projects. In ongoing projects, development never really stops; it continues unless the product is discontinued. There are mountains of tools, methodologies and techniques dedicated to maintaining and evolving code over time and at scale. It is often a much tougher challenge than the initial code generation.

One of the biggest criticisms of AI-generated code is that it often lacks maintainability. Maintainable code is code that is easy to read, understand, and change over time. Humans value this because it makes collaboration and long-term project evolution easier. Yet, AI doesn’t always prioritize these qualities. For example, it might generate long, convoluted functions or introduce unnecessary methods that make the code harder to follow.

The reality is that code produced by an LLM, while often functional, may not always align with human standards of readability and maintainability.
I stopped counting the times I’ve had some LLM produce running, often functionally correct code that was horrible by almost every measure of clean and robust code. I dare say a lot of the code produced is the antithesis of clean code. And yes, we can use system prompts and rules to facilitate better code. However, it’s not there yet, at least not consistently. This issue is not necessarily a fault of AI itself. It reflects the difficulty in defining and agreeing on what constitutes “good code”.

Whether or not LLMs get to the point where they can produce more maintainable code is uncertain. I’m sure they can improve, and we haven’t seen the end of it yet. But I wonder if that is a goal we should be aiming for in the first place. We want “good” code because we are the ones who read it and work with it after the AI has created it.

But what if that wasn’t the case?

A code for the machine, by the machine

LLMs are good at understanding our text, and eventually acting on it – producing the text that answers our questions/instructions. And when it comes to code, they produce code, as text. But that text is for us – humans – to consume. So we review and judge it through this lens – code that we need to understand and work with.

We do it with the power of our understanding, but also with the tools that we’ve built to help us do it – compilers, linters, etc. It’s important to note that language compilers are tools for humans to interact with the machine. They are necessary when humans are instructing the machine (= writing code). The software development process, even with AI, requires them because the LLM is writing code for humans to understand. They also allow us to leverage existing investments in programming.

But when an LLM is generating code for another LLM to review, and when it iterates on the generated response, the code doesn’t need to be worked on by humans. Do we really need the code to be maintainable and clear for us?
Do we care about duplication of code? Meaningful variable and class names? Is encapsulation important?
LLMs can structure their output, and consume structured inputs. Assuming LLMs don’t hallucinate as much, I’m not sure type checking is that impactful either.

I think we should not care so much about these things.
At the current rate of development around LLMs, there’s no reason we shouldn’t get to a point where LLMs will be able to analyze an existing code base and modify/evolve it successfully without a human ever laying eyes on the code. It might require some fancy prompting or a combination of multiple LLM agents, but we’re not that far off.

Another force at play here, I believe, is that code can be made simpler and more straightforward if it doesn’t need to abstract away much of the underlying concepts. A lot of the existing abstractions are there because of humans. Take for example UI frameworks, SDKs, component frameworks and application servers. Most of the focus there is on abstracting concepts and letting humans operate at a higher level of understanding. These can be leveraged by LLMs, but they don’t necessarily have to be. Do I need an ORM framework when the LLM simply produces the right queries whenever it needs to?
Do I need middleware and abstractions over messaging when an AI agent can simply produce the code it needs, and replicate it whenever it needs to?

My point is, a lot of the (good) concepts and tools and frameworks we created in programming are good and useful under the assumption that humans are in the loop. Once you take humans out of the loop, are they needed? I’m not so sure.

The AI “Compiler”

Let’s take it a step further.
Programming languages are in essence a way for humans to connect with the machine. It has been this way since the early days of assembly language. And with time, the level of expression and ergonomics of programming languages have evolved to accommodate humans working with computers. This is great and important, because we humans need to program these damn computers. The easier it is for us, the more we can do.

But it’s different when it’s not a human instructing the machine. It’s an AI that understands the human, but then translates it to something else. And another AI works to evolve this output further. Does the output need to be understandable by humans?
What if LLMs understand the intent given by us, but then continue to iterate over the resulting program using some internal representation that’s more efficient?

Internal representations are nothing new in the world of compilers. Compiler developers build them to enable the various operations compilers perform: optimizations, type checking, tool support, and generating outputs.
Why can’t LLMs communicate over code using their own internal representation, resulting in much more efficient operation and lower costs?
This is not just for generating a one-time binary, but also for evolving and extending the program. As we observed above, software engineering is more than a simple generation of code.
It doesn’t have to be something fancy or with too many abstractions. It needs to allow another LLM/AI to keep working on it and evolve it whenever a new feature is specified or a bug is found/reported.
Do we really need AI to produce a beautiful programming language, mimicking some form of structured English, when there’s no English reader who’s going to read and work on it?

Why not have AI agents produce something like a textual/binary “Gibberlink” – an AI-oriented “bytecode” – when producing our programs?

Is the human to machine connection through a programming language necessary when we have a tool that understands our natural language well enough, and can then reason on its own outputs?

LLMs can already encode their output in structured formats (e.g. JSON) that are machine processable. Is it that big of a leap to assume they’d be able to simply continue communicating with themselves and get the job done based on our specifications, without involving us in the middle?
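
As a trivial sketch of what that machine-to-machine hand-off could look like: one agent emits a structured “change request” and another piece of code consumes it directly, with no human-readable prose in between. The schema here is entirely hypothetical.

```python
import json
from dataclasses import dataclass

# Hypothetical message schema one agent could emit for another to act on.
@dataclass
class ChangeRequest:
    target_component: str
    change_type: str        # e.g. "add_endpoint", "refactor", "fix"
    description: str
    affected_flows: list[str]

LLM_OUTPUT = """{
  "target_component": "order-service",
  "change_type": "add_endpoint",
  "description": "Expose GET /orders/{id}/history for the audit UI",
  "affected_flows": ["order-audit", "order-query"]
}"""

# The consuming agent validates and uses the structure, no human in between.
request = ChangeRequest(**json.loads(LLM_OUTPUT))
print(request.target_component, request.affected_flows)
```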

Vibe coding is apparently a thing now. I don’t believe it’s a sustainable trend5, mainly because it focuses on a specific point in the software life cycle – the point of generating code.
What if we can take it to the extreme? What if we remove the human from the coding process throughout the software life cycle?

I can’t really predict where this is going. At this point I don’t know the technology well enough to guesstimate, and I’m no oracle. But I do see this as one possible direction with a definite upside. And it’s definitely interesting to follow.

If programming is machines talking to machines, maintainability and evolution of code becomes a different game.

“Code” becomes a different game.

Is programming dead?

What would such a future hold for programming professionals?

Again, I’m not great at making prophecies. But the way I see it, and looking at history, I don’t belong to the pessimistic camp. So in my opinion – no, I don’t subscribe to the notion that programming is dead.

History has taught us an important lesson: creating software with more expressive power did not decrease the amount of software created. A higher level of abstraction did not lessen software production either. Quite the contrary. More tools, and the ability to work at higher levels of abstraction, meant that more software was created. Demand grew as well. It’s just that developers needed to adapt to the new tools and concepts. And we did6.

Demand for software still exists, and it doesn’t look like it’s receding. I believe that developers who will adapt to this new reality will still be in demand as well.

I expect LLMs will improve, even significantly, in the foreseeable future. But this doesn’t mean there’s no need for software developers. I expect software development tasks to become more complex. As developers free their time and minds from the gritty details of sorting algorithms, database schemas and authentication schemes, they will focus on bigger, more complicated tasks. So software development doesn’t become less complicated; we’re just capable of doing more stuff. Complexity simply moves to other (higher level?) places7.

Could it be that software architects will become the new software engineers?
Are all of us going to be “agent coders”?

I really don’t know, but I intend to stick around and find out.

Where do you think this is going?


  1. And, admittedly, fear of becoming irrelevant ↩︎
  2. And yes, AI was used when authoring this post, a bit. But no LLM was harmed, that I know of ↩︎
  3.  Originally I intended to add more examples, but realized that by the time I finish writing a list, at least 3 new tools will be announced. So… [insert your favorite AI dev tool here] ↩︎
  4. Give or take hallucinations ↩︎
  5. Remember no-code? ↩︎
  6. I’m old enough to have programmed in Java with no build tool, even before Ant. Classpath/JAR hell was a very real thing for me. ↩︎
  7. “The law of complexity preservation in software development”? ↩︎

Discussing Your Design with Scenaria

Motivation
As a software architect, I spend quite a bit of my time in design discussions. That’s an integral part of the job, and for good reason: as I see it, the design conversation is fundamental to this role and its place in the organization.

Design discussions are hard, for various reasons. Sometimes the subject matter is complicated. Sometimes there’s a lot of uncertainty. Sometimes tradeoffs are hard to negotiate. These are all just examples, and it is all part of the job. More often than not, it’s the interesting part.

But another reason these discussions tend to be hard is misunderstandings, vagueness and lack of precision in how we express ourselves. Expressing your thoughts in a way that translates well into other people’s minds is not easy. This gets worse as the number of people involved increases, especially when the discussion is in a language that most, if not all, participants don’t speak natively.

From what I observed, this is true both for face to face meetings (often conducted remotely these days), as well as in written communication. I try to be as precise as I can, but jumping from one discussion to another, under time pressure, I also often commit the sin of “winging it” when making an argument in some Slack thread or some design document comment.

I’ve argued in the past that diagrams do a much better job of explaining designs. I think this is true, and I often try to make extensive use of diagrams. But good diagrams also take time to create. Tools that use the “diagram as code” approach, e.g. PlantUML (but there are a bunch of others, see kroki), are in my experience a good way to create and share ideas. If you know the syntax, you can be fairly fast in “drawing” your design idea.

Still, I haven’t found a tool that allows me to conveniently express what I need to express in a design discussion. Simply creating a diagram is not the whole story. I often want to share an idea of the structure of the system – the cooperating components – but also of its behavior. It’s important not just to show the structure of the system and the interfaces between components, but also to highlight specific flows in different scenarios.

There are of course diagram types for that as well, e.g. sequence or activity diagrams, and there are a plethora of tools for creating those. But the “designer experience” is lacking. It’s hard to move from one type of view to another while maintaining consistency. This is why whiteboard discussions are easier in that sense – we sit together, draw something on the board, and then point at it, waving our hands over the picture that everyone is looking at. Even if something is not precise in itself, we can compensate by pointing at specific spots, emphasizing one point or another.

Emulating this interaction is not easy in this day and age of remote work. When a lot of the discussions are done remotely, and often asynchronously (for good reasons), there’s a greater need to be precise. And this is not easy to do at the “speed of thought”.

Building software tools is sort of a hobby for me, so I set out to try and address this.

Goals

What I’m missing is a tool that will allow me to:

  1. Quickly express my thoughts on the structure and behavior of a (sub)system – the involved components and interactions.
  2. Share this picture and relevant behavior easily with other people, allowing them to reason about it. Allowing us to conveniently discuss the ideas presented, and easily make corrections or suggest alternatives.

So essentially I’m looking to create a tool that allows me to describe a system easily (structure + behavior), a tool that efficiently creates the relevant diagram and lets me visualize the behavior on it.

Constraints and Boundary Conditions

Setting out to implement this kind of tool, as a proof of concept, I outlined for myself several constraints or boundary conditions I would like to maintain, both from a “product” point of view as well as from an engineering implementation point of view.

  1. The description should be text based, so we can easily share system descriptions as well as version them using existing versioning tools, namely git.
  2. The tool should be easy to ramp up to.
    1. Just load and start writing
    2. Easy syntax, hopefully intuitive.
  3. Designs should be easily shareable – a simple link that can be sent, and embedded in other places.
  4. There should be no special software requirements for using the tool.
    1. A simple modern browser should be enough.

Scenaria

Enter Scenaria (git repo). 

Scenaria is a language – a simple DSL, with an accompanying web tool. The tool includes a simple online editor, and a visualization area. You enter the description of the system in the editor, hit “Apply”, and the system is displayed in the visualization pane.

Scenaria Screenshot

The diagram itself is heavily inspired by technical architecture modeling. The textual DSL is inspired by PlantUML. You can play with the tool here, and see a more detailed explanation of the model and syntax here.

The discussion doesn’t stop with a purely static diagram. The tool also allows you to describe and visualize interactions between the different components. You can describe several flows, which you can then “play” on the drawn diagram. You can step through a scenario or simply play it from start to finish.

Once this is done, the application gives you a shareable link, which you can send to colleagues (or keep for yourself).

As a diagramming tool, it’s pretty lacking. But remember that the purpose here is not to necessarily create beautiful diagrams (though that’s always a plus). It’s mainly about enabling a conversation, efficiently. So there’s a balance here between being expressive in the language, while not going down the route of adding a ton of visualization features which will distract from the main purpose of describing a system or a feature.

Scenaria is intended primarily as a communication tool, to be used easily in the discussions we have with our colleagues. It can serve as a basis for further analysis, as it provides a way to structure the description of a system – its structure and behavior. But the focus isn’t on a rigorous formal description from which working code can be derived; it’s not intended for code generation. It’s about having something to point at when discussing design – something you can easily create and share, based on some system model.

An Example

An example scenario can be viewed here. This example shows the main components of the Scenaria app, with a simple flow showing the interaction between them when the code is parsed and shown on screen.

Looking at the code of the description, we start by enumerating the different actors cooperating in the process:

user 'Designer' as u;
agent 'App Page' as p;
agent 'Main App' as app;
agent 'Editor' as e;
agent 'Parser' as prsr;
agent 'Diagram Drawing' as dd;
agent 'ELK Lib' as elk;
agent 'Diagram Painting' as dp;
agent 'Diagram Controller' as dc;

Each component is described as an agent here, with the user (a “Designer”) as a separate actor.

We then define an annotation highlighting external libraries:

@External {
  color : 'lightgreen';
};

And annotate two agents to mark them as external libraries:

elk is @External;
e is @External;

Note that up to this point we haven’t defined any interactions or channels between the components.
Now we can turn to describe a flow – specifically what happens when the user writes some Scenaria code and hits the “Apply” button:

'Model Drawing' {
    u -('enter code')-> e
    u -('apply')->p
    p -('reset')-> app

    p -('get code')-> e
    p --('code')--< e

    p-('parseAndPresent')-> app
        app -('parse')-> prsr
        app --('model')--< prsr
        app -('layoutModel') -> dd
            dd -('layout') -> elk
            dd --('graph obj')--< elk
        app --('graph obj')--< dd

        app -('draw graph')-> dd
            dd -('draw actors, channels, edges')->dp
        app --('painter (dp)')--< dd

        app -('get svg elements')->dp
        app --('svg elements')--<dp
        
        app -('create and set svg elements')->dc


    p --('model')--< app

};

We give the scenario a name – “Model Drawing” – and describe the different calls between the cooperating actors. Indentation is not required; it’s just added here for readability.

The interactions between the agents implicitly define channels between the components. So when the diagram is drawn, it is drawn with the relevant channels.

At this point the application allows you to run or step through the given scenario, where you will see the different messages and return values, as described in the text.


Next Steps

This is far from a complete tool, and I hope to continue working on it, as I try to embed it into my daily work and see what works and what doesn’t.

At this point, it’s basically a proof of concept – a sort of early prototype.

Some directions and features I have in mind that I believe can help in promoting the goals I outlined above:

  1. Better diagramming: better layout, supporting component hierarchies.
  2. Diagram features: comments on the diagram (as part of steps?), titles, notes
  3. Scenario playback – allow for branches, parallel step execution, self calls.
  4. Versioning of diagrams – show an evolution of a system, milestones for development, etc.
  5. Integration with other tools:
    1. Wikis/markdown (a “design notebook”?)
    2. Slack and other discussion tools
    3. Tools and links to other modeling tools, showing different views of the same model.
  6. A view only mode – allow sharing only the diagram and allow playback of scenarios.
    1. Allow embedding of the SVG only into other tools, e.g. a widget in google docs.
  7. Better application UX (admittedly, I’m not much of a user interface designer).
  8. Team collaboration features beyond version control.

Contributions, feedback and discussions are of course always welcome.

One Song

(Written on 19.3.2023)

Somewhere in the first half of the 90s, as an 11th-grade teenager, I took part in a journey to Poland.

Setting aside the criticism of these journeys, one moment in particular has stayed with me to this day.

As fate would have it, on the very day our delegation visited Auschwitz, a delegation of cadets from Bahad 1, the IDF officers’ school, was visiting as well. And so, while we were wandering among the blocks of Auschwitz I, wrapped in or waving Israeli flags, I found myself a spectator to a half-surreal scene: right there, between the blocks, the Bahad 1 cadets raised the IDF flag and the flag of Israel, and in a small, modest ceremony sang “Hatikvah”.

And there I stood, half mesmerized, while 10 meters away one of our delegation’s escorts, a Holocaust survivor who accompanied us as a witness, wiped away a tear. That sight has never left me.

Later that same day we sang that same song above the blown-up crematorium at Birkenau. And the words that Yaakov, my friend in those days, said to me on the night I flew to Poland kept echoing in my head: “Remember that wherever you enter, you have the right to leave.”

Those words, together with those images, left their mark. That very evening I already knew that, as far as it depended on me, I would be a combat soldier in the IDF.

A few years later, at the beginning of 1996, on a spring evening on the parade ground at Latrun, I sang that same song, minutes after swearing “to devote all my strength, and even to sacrifice my life, for the defense of the homeland and the liberty of Israel”. Four months later I pinned a “tzafargol”, a fighter’s pin, on my chest. I sang that same song and remembered that promise from Poland.

27 years have passed since then.

And last Saturday, in Ness Ziona, I stood next to my son as he was wrapped in the flag of Israel. Together with his little brother and their mother, we sang that same song at the end of a demonstration.

And suddenly, those exact same words, which I had sung dozens of times before with great pride and a puffed-out chest, take on a very different meaning.

Because our freedom, even though we sang about it, was always there. We needed the journey to Poland to learn, and to remember, why it matters. It was always clear what we were fighting for, but there was never a moment’s doubt that we were free. We knew we had to stand guard, but always as free people. The justness of the cause was self-evident.
Maybe the road itself is full of bumps, and sometimes we stray from it. But there was no doubt about why we were there, and about what we were singing those words for.

And suddenly, it’s a little less clear.
Suddenly, when we sing about “being a free people in our land”, it’s no longer self-evident. And the song is the same song, the words the same words, the flag the same flag, the army (by and large) the same army.

Only the hope is different.

Adding Bots to a Simple Game

We have previously seen how we can create a fairly simple game in a short amount of time. In the closing post of the series, I mentioned that one possible direction for evolving the code was adding the option to “play against the computer”, i.e. to have an automated player, or even two.

This is not only interesting from the coding point of view. It also allows us to explore how to code heuristics and/or “AI” into the game, compare strategies, etc.

In this post, I will describe the necessary changes made in the code in order to incorporate these bots into the game. This is not meant to be in any way an authoritative source on how to write good AI for games, or how to implement it efficiently. It’s just one way to add this kind of feature.

If you want to see the end result, have a look at the deployed game instance, and follow the instructions for specifying automated players.

Enabling Automated Players

The first order of business is to facilitate the introduction of AI players (“bots”). So far, the code we built was responding to mouse clicks in the browser – in other words, it was driven by human interaction. We’d like to allow for moves to be decided by code, and for the game to invoke such code and proceed with the game, without any human interaction.

Keeping to the spirit of the original design, we’re still building the automated players as part of the Javascript, browser-based code. We don’t want to introduce another separate runtime component to the system.

So the next two commits (6a4c509, 5489462) introduce this facility:

const PLAYER = {
one : {
toString : () => "ONE"
, theOtherOne : () => PLAYER.two
, number : 1
, ai : () => PLAYER.one._aiPlayer
, _aiPlayer : None
}
, two : {
toString : () => "TWO"
, theOtherOne : () => PLAYER.one
, number : 2
, ai : () => PLAYER.two._aiPlayer
, _aiPlayer : None
}
}
class MancalaGame
{
constructor(gameSize,cnvsELID,_updatePlayerCallback,_showMsgCallback,requestedAIPlayers)
{
this._setupAIPlayersIfRequested(requestedAIPlayers);
}
_setupAIPlayersIfRequested(requestedAIPlayers)
{
dbg("Setting AI Players…");
PLAYER.one._aiPlayer = maybe(determineAIPlayer(requestedAIPlayers.p1))
PLAYER.two._aiPlayer = maybe(determineAIPlayer(requestedAIPlayers.p2))
dbg("AI Players: P1: " + PLAYER.one._aiPlayer + ", P2: " + PLAYER.two._aiPlayer);
function determineAIPlayer(requestedAI)
{
return requestedAI ? new SimpleAIPlayer() : null;
}
}
start()
{
this._makeNextMoveIfCurrentPlayerIsAI();
}
handleCellClick(boardCell)
{
this._makeNextMoveIfCurrentPlayerIsAI()
}
_makeNextMoveIfCurrentPlayerIsAI()
{
if (!this.gameDone)
{
this.player.ai().ifPresent(aiPlayer => {
let aiMove = aiPlayer.nextMove(this.board,this.player.number)
setTimeout(() => { this._makeMove(aiMove)}, 200) //artificial wait, so we can "see" the AI playing
})
}
}
}
(game.js)
const P2_PARAM_NAME = "p2";
const P1_PARAM_NAME = "p1";
function setup()
{
game = new main.MancalaGame(resolveGameSize(),'cnvsMain',updateCurrentPlayer,showMsg, resolveRequestedAI());
game.start();
}
function resolveRequestedAI()
{
let params = new URLSearchParams(window.location.search);
let p2 = params.has(P2_PARAM_NAME) ? params.get(P2_PARAM_NAME) : "";
let p1 = params.has(P1_PARAM_NAME) ? params.get(P1_PARAM_NAME) : "";
return { p1 : p1, p2 : p2}
}
(index.html)
class SimpleAIPlayer
{
    constructor()
    {
    }

    /**
     * Given a board and side to play, return the cell to play
     * @param {Board} board The current board to play
     * @param {number} side The side to play, either 1 or 2
     *
     * @returns The board cell to play
     */
    nextMove(board,side)
    {
        requires(board != null, "Board can't be null for calculating next move")
        requires(side == 1 || side == 2,"Side must be either 1 or 2")
        //Simple heuristic: choose the cell with the largest number of stones.
        var maxStoneCell = side == 1 ? 1 : board.totalCellCount()-1;
        switch (side)
        {
            case 1 : board.forAllPlayer1Cells(c => { if (board.stonesIn(c) > board.stonesIn(maxStoneCell)) maxStoneCell = c }); break;
            case 2 : board.forAllPlayer2Cells(c => { if (board.stonesIn(c) > board.stonesIn(maxStoneCell)) maxStoneCell = c }); break;
        }
        dbg("Playing cell: " + maxStoneCell + " with " + board.stonesIn(maxStoneCell) + " for player " + side);
        return maxStoneCell;
    }
}

module.exports = {
    SimpleAIPlayer
}
Mancala Bots

You can see that the setup of the AI players is done during the construction of the MancalaGame class (game.js, line 24). The requested AI player type is passed using URL parameters (index.html, lines 6,10-16), which are then parsed and used for initializing the bots. Note that the necessary player class is set up in the global PLAYER objects1. At this point we have only one type of player – SimpleAIPlayer.
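As a concrete example of how this looks in practice (derived from the parameter names in the snippets above): opening index.html?p2=1 makes player two a SimpleAIPlayer while player one stays human, and index.html?p1=1&p2=1 pits two bots against each other. At this commit any non-empty value will do, since determineAIPlayer only checks whether the parameter is truthy.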

The invocation of the AI player happens when responding to the player’s click (game.js, line 50). The _makeNextMoveIfCurrentPlayerIsAI function (game.js, lines 53-62) simply checks that the game isn’t done, and whether the current player is an AI one. If it is, we invoke the nextMove function, which is the main function in the AI player’s interface. The result of this function is simply the cell to play, which is then passed to _makeMove – the same function used for human players from the UI.

Another small addition is the start function for the MancalaGame class (game.js, lines 42-45), which simply tests whether player one (the current player at the beginning of the game) is an AI, and if so makes a move. This is used to kickstart a game where the first player is a bot. The second player being a bot, and further moves of the first player, are handled through the mechanism described above.

We Start Simple

The first implementation of a bot in this game (SimpleAIPlayer.js) is dead simple – it’s based on a very naïve heuristic: play the cell with the most stones in it. To be honest, it’s only here to serve as a straw man for testing the whole bot mechanism; there’s not a lot of wisdom in this heuristic. It can also serve as a benchmark for testing other strategies.

The implementation of the nextMove method – the only method in the bot interface – is pretty straightforward: based on the current player (the side parameter), simply search for the cell with the most stones on that side of the board (lines 21-26 in SimpleAIPlayer.js). Note that this strategy is deterministic – given a board state, it will always choose the same cell2. This is important when testing it, and also when pitting it against other bot strategies.
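To make the determinism concrete, here’s a minimal sketch (not code from the repo) that exercises SimpleAIPlayer against a hand-rolled fake board implementing just the parts of the Board interface the bot actually uses; the cell numbering (1-6 for player one, 8-13 for player two) is an assumption made for the sketch:

const { SimpleAIPlayer } = require('./SimpleAIPlayer.js') // path assumed

// A fake 14-cell board: only the queries SimpleAIPlayer calls are implemented.
const fakeBoard = {
    stones : [0, 4, 4, 9, 4, 4, 4, 0, 4, 4, 4, 4, 4, 4],
    totalCellCount() { return this.stones.length },
    stonesIn(c) { return this.stones[c] },
    forAllPlayer1Cells(f) { for (let c = 1; c <= 6; c++) f(c) },
    forAllPlayer2Cells(f) { for (let c = 8; c <= 13; c++) f(c) }
}

const bot = new SimpleAIPlayer()
console.log(bot.nextMove(fakeBoard, 1)) // prints 3, the player-one cell with the most stones (9)
console.log(bot.nextMove(fakeBoard, 1)) // prints 3 again: same board, same answer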

Mancala Bot Wars

The next commit adds another simple AI bot player – Random AI Player, which simply picks a cell to play at random. This in itself isn’t very interesting. What is interesting3 is that now we have two types of bots, and we can run experiments on what strategy is better – we just pit the two bots against each other.
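The commit has the actual implementation; the idea is roughly the following sketch (my own illustration, assuming the same Board interface used by SimpleAIPlayer – the real RandomAIPlayer may differ in its details): collect the current player’s non-empty cells and pick one of them at random.

class RandomAIPlayer
{
    nextMove(board,side)
    {
        // Gather the current player's cells that still have stones to play.
        let candidates = []
        let collect = c => { if (board.stonesIn(c) > 0) candidates.push(c) }
        if (side == 1) board.forAllPlayer1Cells(collect)
        else board.forAllPlayer2Cells(collect)
        // Pick one of them uniformly at random.
        return candidates[Math.floor(Math.random() * candidates.length)]
    }
}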

In order to do that, we’ll write some code that enables us to run the game with two bots and record the result. The next commit takes care of that:

program
    .version('0.0.1')
    .usage("Mancala Bot War: Run Mancala AI Experiment")
    .option('-p1, --player1 <AI 1 type>', "The bot to use for player 1")
    .option('-p2, --player2 <AI 2 type>', "The bot to use for player 2")
    .option('-r, --rounds <rnds>', "Number of games to play")
    .option('-o, --out <dir>', "output file")
    .parse(process.argv)

range(1,program.rounds*1).forEach(round => {
    dbg(`Round ${round}`)
    let game = new MancalaGame(14,'',_ => {},gameMsg,{p1 : program.player1, p2:program.player2},results => {
        dbg(`Round ${round} Results: ${results}`)
        let line = []
        line.push(results.player1StoneCount)
        line.push(results.player2StoneCount)
        line.push(results.winner || 0) //write the player who won, or 0 for a draw
        fs.appendFileSync(program.out,line.join(',') + "\n",{flag :'a'})
    })
    game.start();
})
(bots.js)
class MancalaGame
{
constructor(gameSize,cnvsELID,_updatePlayerCallback,_showMsgCallback,requestedAIPlayers,_gameOverCallback)
{
this.gameOverCallback = _gameOverCallback
if (cnvsELID)
{
this.boardUI = maybe(new BoardUI(cnvsELID,this.cellCount,this))
this.boardUI.ifPresent(bui => bui.initializeBoardDrawing());
}
else this.boardUI = None; //headless mode
}
_redraw()
{
this.boardUI.ifPresent(_ => _.drawBoardState(this.board,this));
}
_gameOver()
{
let player1StoneCount = this.board.player1StoneCount();
let player2StoneCount = this.board.player2StoneCount();
let results = { player1StoneCount : player1StoneCount, player2StoneCount : player2StoneCount}
switch (true)
{
case player1StoneCount > player2StoneCount : results.winner = 1; break;
case player2StoneCount > player1StoneCount : results.winner = 2; break;
default : results.isDraw = true; break;
}
this.gameOverCallback(results);
}
togglePlayer()
{
this.player = this.player.theOtherOne();
this.boardUI.ifPresent(_ => _.toggleHighlights(this.player.number));
return this.player;
}
}
(game.js)
<script>
function setup()
{
game = new main.MancalaGame(resolveGameSize(),'cnvsMain',updateCurrentPlayer,showMsg, resolveRequestedAI(),gameOver);
game.start();
}
function gameOver(results)
{
let a = ["Game Over","# Stones P1:" + results.player1StoneCount,"# Stones P2: " + results.player2StoneCount];
if (!results.isDraw)
a.push(`Player ${results.winner} Wins!`)
else
a.push('Draw!')
showMsg(a.join('<br/>'))
}
</script>
(index.html)
Mancala Bot Wars!

First, you’ll note the bots.js file, which is a simple script, invoked from the command line, that simply runs several rounds of the game and writes the results to a file. Every round creates a new game instance with the chosen player types, and calls game.start. It’s a script with some parameters, nothing fancy.
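For illustration, an invocation might look something like this (the actual bot-type values accepted depend on how the script resolves them, which isn’t shown in the snippet, so the values here are placeholders):

node bots.js --player1 <bot type> --player2 <bot type> --rounds 100 --out results.csv

Each round then appends a line of the form <p1 stones>,<p2 stones>,<winner or 0> to the output file, which makes it easy to tally wins afterwards.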

In order to support this, we need to refactor the MancalaGame class a bit. Specifically, we need it to be able to run headless. This means two things: the game needs to run without a UI, and delivering the results has to be decoupled from the UI as well.

This is what happens in the game.js file. In lines 3,6 you can see we add a callback to announce that the game is over, and transmit the result. This callback is used in the _gameOver method, in line 35. Note that the code for showing the game over message moved out from this class (as it should); look for it in the index.html file.

In addition, we enable the client of the MancalaGame class to pass a null value for the cnvsELID constructor parameter, which signals to us that there’s no canvas UI to be used for this game instance. The boardUI member then becomes optional (lines 8-13) and whenever we access the boardUI, we need to make sure it’s present (lines 18,41).

We now have a game that can be run without human intervention, and play out solely with bots. We can now start experimenting with different game parameters, and see what strategy plays out better.

Prime geeky goodness.


Later commits clean up the bot experiment script and add features. We also add other types of bot players, with different heuristics (e.g. greedy capture, minimax player) so we can compare more strategies.

Building a Simple Game – Part 6

Last time we implemented the ending of the game, and cleaned up the code a bit.

In this post we’ll look into an enhancement that allows us to change the size of the game, some more refactoring and bug fixes. At the end of this post we’ll be at the point where we have a playable game.

Altering the Game Size

One feature we’d like to add is the ability to play a variable-sized Mancala game. We treat the total number of cells as the “size” of the game. Since each player has the same number of cells, this needs to be an even number, and cells are split evenly between the players.

The next commit adds this feature:

class MancalaGame
{
constructor(gameSize,cnvsELID,_updatePlayerCallback,_showMsgCallback)
{
this.cellCount = gameSize;
this.board = new Board(this.cellCount);
}
_initializeBoardDrawing()
{
drawBoard(cnvs,this.cellCount);
}
}
(game.js)
<script>
const SIZE_PARAM_NAME = 'size';
const DEFAULT_GAME_SIZE = 14;
const GAME_SIZE_UPPER_BOUND = 28;
const GAME_SIZE_LOWER_BOUND = 6;
function setup()
{
game = new main.MancalaGame(resolveGameSize(),'cnvsMain',updateCurrentPlayer,showMsg);
}
function resolveGameSize()
{
let params = new URLSearchParams(window.location.search);
let size = (params.has(SIZE_PARAM_NAME) && !isNaN(params.get(SIZE_PARAM_NAME))) ?
params.get(SIZE_PARAM_NAME)*1 : DEFAULT_GAME_SIZE;
let ret = Math.max( //we bound the game size between the minimum and the upper bound
Math.min(size
,GAME_SIZE_UPPER_BOUND || DEFAULT_GAME_SIZE)
,GAME_SIZE_LOWER_BOUND);
if (ret % 2 != 0) ret -= 1; //we have to make sure it's divisible by 2.
console.log("Game size: " + ret);
return ret;
}
</script>
(index.html)
Adding option for variable game size

The implementation is pretty straightforward. Instead of having a constant (CELL_COUNT) in the controller module (game.js), we pass a parameter in the constructor (game.js, line 3), which is then fed to the Board data structure. The Board class already had this parameter, so it could already work with any size and did not assume a constant size.

What remains is having a way for the user to specify this. I decided to go with a URL parameter, which is easy enough to pass. So resolveGameSize in index.html (lines 15-27) takes care of reading the parameter from the URL query string and validating it. We then initialize the MancalaGame instance with this number (index.html, line 12).
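To make the validation concrete, a few examples derived from the code above: opening index.html?size=10 gives a 10-cell board; ?size=9 is within the bounds but gets rounded down to 8 to keep the size even; and ?size=100 is clamped to the upper bound of 28. A missing or non-numeric parameter falls back to the default of 14.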

Again, Some Cleanup

The next two commits are really about a small but important refactoring. We’re encapsulating the board drawing in a class that takes care of all the drawing business – essentially hiding how the board is drawn and how the user interacts with it. The changes in drawing.js are mostly a re-organization of different functions into class methods (the BoardUI class). The more interesting part is how it’s actually used:

const {BoardUI} = require("./drawing.js")
class MancalaGame
{
constructor (…)
{
this.boardUI = new BoardUI(cnvsELID,this.cellCount,this)
this.boardUI.initializeBoardDrawing();
}
_redraw()
{
this.boardUI.drawBoardState(this.board,this);
}
togglePlayer()
{
this.player = this.player.theOtherOne();
this.boardUI.toggleHighlights(this.player.number);
return this.player;
}
}
(game.js)
Encapsulating the drawing code

The controller now doesn’t maintain a pointer to the canvas instance. Instead, it works through the API provided by the BoardUI class, which is at a higher level of abstraction – initializeBoardDrawing, drawBoardState, toggleHighlights. The motivation here is really better encapsulation of the UI implementation.

The last commit for today takes care of one bug fix – making sure we skip the opponent’s mancala when making a move:

class Board
{
/**
* Retrieve the cell number for the home (Mancala) of the given player.
* @param {numer} player The number of the player whose Mancala we're seeking (1 or 2)
* @returns The cell number for the given player's Mancala ( either 0 or totalCellCount/2)
*/
homeOf(player)
{
requires(player == 1 || player == 2,"Player number can be either 1 or 2");
return player == 1 ? this.player1Home() : this.player2Home();
}
}
(board.js)
class MancalaGame
{
playCell(boardCell)
{
let targetCells = this._calculateTargetCellsForMove(boardCell);
}
_calculateTargetCellsForMove(fromCell)
{
let _ = this.board;
let stepCount = _.stonesIn(fromCell);
dbg("Playing " + stepCount + " stones from cell " + fromCell)
let targetCells = range(1,stepCount)
.map(steps => _.cellFrom(fromCell,steps,this.player.number))
.filter(c => c != _.homeOf(this.player.theOtherOne().number)) //remove, if applicable, the cell of the other player's mancala
while (targetCells.length < stepCount) //add any cells, until we reach a situation where we have enough holes to fill (per the stone count in the played cell)
{
let addedCell = _.cellFrom(targetCells[targetCells.length-1],1)
if (addedCell == _.homeOf(this.player.theOtherOne().number))
targetCells.push(_.cellFrom(addedCell,1))
else
targetCells.push(addedCell)
}
return targetCells;
}
}
(game.js)
Fixing a bug – skipping the opponents Mancala (home)

There’s nothing too fancy here. The gist of the fix is in lines 17-19 in the MancalaGame class4.

Incorporating this fix, however, prompted me to extract the calculation of the target cells to a different function (_calculateTargetCellsForMove), so the playCell function remains at the same level of abstraction, and is still readable5.

And We’re Done …

At this point, one should be able to build the code (npm run build) and point their browser to the resulting index.html. Look for it in the dist directory.

A working version of the game is available here.

Admittedly, it’s not much to look at in terms of UI design. But it’s a simple game we built in a few hours’ time, and it works. If you followed along so far, give yourself a pat on the back.

… But Wait, There’s More

The story of how to create a simple game is pretty much done. But there’s more that can be said and done here, more features to build, tweaks to do.

Concrete examples for more directions include better UI design, more features (undo/redo, saving game states, showing a log, etc.). I welcome any more suggestions, and of course pull requests.

If you look at the repo, you’ll see that I went in another direction for enhancement. I was more interested in incorporating bots (“AI”) into the game as a feature; so you could play against a bot, or even have two bots play against each other. Stay tuned for more on that front.

Building a Simple Game – Part 5

Last time we started really adding meat – the logic of the game rules. Today we’re going to look into some more logic, and how we wrap it up.

Ending a Game

It’s a game, but it does have to end at some point. This commit takes care of exactly that – identifying the condition that ends the game and stopping the processing of new moves:

class MancalaGame
{
constructor(cnvsELID,_updatePlayerCallback,_showMsgCallback)
{
this.gameDone = false;
}
handleCellClick(boardCell)
{ //todo: clean this up
if (this.gameDone)
{
dbg("Game over, get outta here");
return;
}
let currentPlayerHasNoMoreMoves = this.player1Playing(this.board.allPlayer1Cells(c => this.board.stonesIn(c) <= 0)) ||
this.player2Playing(this.board.allPlayer2Cells(c => this.board.stonesIn(c) <= 0))
if (currentPlayerHasNoMoreMoves)
this.gameOver();
this.canvas.ifPresent(cnvs => {
drawBoardState(cnvs,this.board,this)
})
}
gameOver()
{
let player1StoneCount = this.board.player1StoneCount();
let player2StoneCount = this.board.player2StoneCount();
let a = ["Game Over","# Stones P1:" + player1StoneCount,"# Stones P2: " + player2StoneCount];
switch (true)
{
case player1StoneCount > player2StoneCount : a.push("Player 1 Wins!"); break;
case player2StoneCount > player1StoneCount : a.push("Player 2 Wins!"); break;
default : a.push("Draw!"); break;
}
this.showMsg(a.join("<br/>"));
this.setGameOver();
}
setGameOver() { this.gameDone = true; }
}
(game.js)
Game over functionality

The implementation is pretty straightforward: we add a simple flag (gameDone) to the MancalaGame class (line 6) and consult it before processing any move (line 12). Of course, we have to take care of setting the flag properly, which happens at lines 18-21.

The gameOver function (lines 30-45) takes care of 3 main things6:

  1. Determining the winner (lines 36-41)
  2. Notifying the UI (lines 35,43)
  3. Setting the gameDone flag (line 44).

The implementation is pretty simple and straightforward, but admittedly, this function does a bit too much. We’ll take care of that right away.

Also, taking care of the message displayed to the user is not the ideal thing to do here, but in this case it’s not terrible, in my opinion.

Some Cleanup Is In Order

At this point, it has become quite clear that the handleCellClick function was becoming convoluted. It was quickly approaching the point of being unmanageable, doing too many things and operating at different levels of abstraction; e.g. encoding game rules while also taking care of UI messages. It was time for some cleaning.

The next commit 7 did exactly that:

class MancalaGame
{
constructor(cnvsELID,_updatePlayerCallback,_showMsgCallback)
{
this._initializeBoardDrawing();
}
_initializeBoardDrawing()
{
this.canvas.ifPresent(cnvs => {
})
}
handleCellClick(boardCell)
{
if (!this.gameDone)
{
this._resetGameMessagePanel();
if (this.isValidMove(boardCell))
this._makeMove(boardCell)
else
this.showMsg("Invalid Move")
}
else dbg("Game over, stop procrastinating")
}
_makeMove(boardCell)
{
let lastCell = this.playCell(boardCell);
this._togglePlayerOrExtraTurn(lastCell)
this._checkAndDeclareGameOverIfNecessary()
this._redraw();
}
_resetGameMessagePanel()
{
this.showMsg(" ")
}
_togglePlayerOrExtraTurn(lastCell)
{
let lastCellIsHomeOfCurrentPlayer = this.player1Playing(this.board.isPlayer1Home(lastCell)) ||
this.player2Playing(this.board.isPlayer2Home(lastCell))
if (!lastCellIsHomeOfCurrentPlayer)
this.updatePlayerCallback(this.togglePlayer());
else
this.showMsg("Extra Turn")
}
_checkAndDeclareGameOverIfNecessary()
{
let currentPlayerHasNoMoreMoves = this.player1Playing(this.board.allPlayer1Cells(c => this.board.stonesIn(c) <= 0)) ||
this.player2Playing(this.board.allPlayer2Cells(c => this.board.stonesIn(c) <= 0))
if (currentPlayerHasNoMoreMoves)
this._gameOver();
}
_redraw()
{
this.canvas.ifPresent(cnvs => {
drawBoardState(cnvs,this.board,this)
})
}
playCell(boardCell)
{
let _ = this.board;
let targetCells = range(1,_.stonesIn(boardCell)).map(steps => _.cellFrom(boardCell,steps))
let lastCell = targetCells[targetCells.length-1];
let lastCellWasEmpty = _.stonesIn(lastCell) == 0;
targetCells.forEach(c => _.addStoneTo(c));
this._checkAndCaptureIfNecessary(lastCell,lastCellWasEmpty);
_.setCellStoneCount(boardCell,0);
return lastCell;
}
_checkAndCaptureIfNecessary(lastCell,lastCellWasEmpty)
{
let _ = this.board;
let isLastCellAHomeCell = _.isPlayer1Home(lastCell) || _.isPlayer2Home(lastCell);
let lastCellBelongsToCurrentPlayer = this.player1Playing(_.isPlayer1Cell(lastCell)) ||
this.player2Playing(_.isPlayer2Cell(lastCell))
if (lastCellWasEmpty && !isLastCellAHomeCell && lastCellBelongsToCurrentPlayer)
{ //capture the stones from the other player
let acrossCell = _.totalCellCount() - lastCell;
dbg("Capturing stones from " + acrossCell + " to " + lastCell)
_.setCellStoneCount(lastCell,_.stonesIn(lastCell) + _.stonesIn(acrossCell));
_.setCellStoneCount(acrossCell,0);
}
}
}
(game.js)
Cleaning up

The gist of what we’re doing here is a simple “extract method” refactoring: taking a few lines of code, extracting them into a separate function/method, and invoking that function in the right location. The main motivation is breaking down a long function into more digestible pieces, making the code more readable. There’s not even a lot of code reuse going on here, which might be another motivation for extracting a piece of code into another function. The resulting code is, in my opinion, easier to follow and to troubleshoot in the future. Function logic is expressed more succinctly and at a more consistent level of abstraction.

Take for example the new handleCellClick function (lines 16-27 above). If you read it, its functionality can be summarized in one sentence: “reset the message panel, then assuming the game isn’t over and the move is valid, make the move”.

Similarly, the _makeMove function (lines 29-35 above), that is being called from the handleCellClick function, can be summarized in one sentence: “play the cell passed, then considering the last cell reached, decide if an extra turn is in place; check if we reached the end of the game and redraw the board.”8

This mental exercise of trying to describe a function’s implementation in a sentence or two9 is an important one when trying to assess the readability of the code, which from my experience is a crucial quality factor. I believe it’s hard to assess readability with objective criteria, but when writing and reading my code, this is how I try to assess it.

Bug Fixing and a UI Improvement

The next two commits, affectionately known as 38feec5 and 8879f31, take care of fixing a bug, and making a small (but significant) addition to the user interface:

_checkAndCaptureIfNecessary(lastCell,lastCellWasEmpty)
{
let _ = this.board;
let isLastCellAHomeCell = _.isPlayer1Home(lastCell) || _.isPlayer2Home(lastCell);
let lastCellBelongsToCurrentPlayer = this.player1Playing(_.isPlayer1Cell(lastCell)) ||
this.player2Playing(_.isPlayer2Cell(lastCell))
if (lastCellWasEmpty && !isLastCellAHomeCell && lastCellBelongsToCurrentPlayer)
{ //capture the stones from the other player
let acrossCell = _.totalCellCount() - lastCell;
let targetHome = this.player == PLAYER.one ? _.player1Home() : _.player2Home();
let totalCapturedStones = _.stonesIn(lastCell) + _.stonesIn(acrossCell);
dbg("Capturing stones from " + acrossCell + " and " + lastCell + " to " + targetHome + ". Total: " + totalCapturedStones )
_.setCellStoneCount(targetHome,_.stonesIn(targetHome) + totalCapturedStones);
_.setCellStoneCount(acrossCell,0);
_.setCellStoneCount(lastCell,0);
}
}
(game.js)
Properly Capturing Stones

The logic for capturing the stones is pretty straightforward. One thing to note is that the bug fix is localized in one place11 – a testament to good code structure; though, to be fair, this isn’t really a cross-cutting concern. Another thing to note is the calculation of the cell across from the last cell – acrossCell (line 9) – the simple calculation can be done because we rely on array indices. A better implementation would have deferred this to the Board class, and exposed a method named something like getAcrossCell(fromCell), so we can let this implementation detail remain in the Board class; this is an example of an abstraction leak (anyone up for a pull request to fix this?).
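For what it’s worth, a fix along those lines might look roughly like this (just a sketch, not code from the repo; getAcrossCell is the hypothetical method name suggested above):

class Board
{
    // ... existing methods ...

    // Hypothetical helper: the cell directly across the board from the given cell.
    // The array-index arithmetic stays inside Board, where it belongs.
    getAcrossCell(fromCell)
    {
        return this.totalCellCount() - fromCell;
    }
}

// In MancalaGame._checkAndCaptureIfNecessary, the calculation would then become:
// let acrossCell = _.getAcrossCell(lastCell);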

The second commit takes care of creating and toggling the highlighting of the current player:

function createHighlights(cellCount)
{
let w = (_boardWidthInCells -2)* CELL_SIZE;
let t = TOP_LEFT.y;
let l = TOP_LEFT.x + CELL_SIZE;
let boardHeightInCells = 3;
p1Highlight = new fabric.Line([l,t,l+w,t],{selectable:false,stroke:'red',strokeWidth:3})
p2Highlight = new fabric.Line([l,t+(boardHeightInCells*CELL_SIZE),l+w,t+(boardHeightInCells*CELL_SIZE)],{selectable:false,stroke:'red',strokeWidth:3})
}
function toggleHighlights(canvas,player)
{
requires(player == 1 || player == 2,"Invalid player when switching highlights")
switch (player)
{
case 1 :
canvas.add(p1Highlight);
canvas.remove(p2Highlight);
break;
case 2 :
canvas.add(p2Highlight);
canvas.remove(p1Highlight);
break;
}
}
(drawing.js)
_initializeBoardDrawing()
{
toggleHighlights(cnvs,this.player.number)
}
togglePlayer()
{
this.canvas.ifPresent( cnvs => {
toggleHighlights(cnvs,this.player.number);
})
}
(game.js)
Highlighting the Current Player

Technically, the highlight itself is simply drawing a red line on the border of the current player’s side of the board. We create 2 instances of fabric.Line in the drawing module, and then simply add and remove them from the canvas when necessary. Note that the toggleHighlights function in the drawing module (line 13) receives the player’s number and queries it directly. I don’t see this as a case of using magic numbers, since the numbers clearly map to the player objects they represent – 1 for player 1, and 2 for player 2. I preferred to avoid directly exposing the PLAYER objects from the game module (game.js).


We’re almost done. Next time we’re going to add a feature, cleanup a bit, and fix a bug; at which point we should have a working version of the game.

Building a Simple Game – Part 4

So after we hooked up and set up the skeleton of the game, it’s time to add some more meat – actually implementing the rules of the game.

Validating a Move

In our next commit12, we introduce the notion of validating a move:

handleCellClick(boardCell)
{
if (!this.isValidMove(boardCell))
this.showMsg("Invalid Move")
else
{
this.showMsg(" ")
this.board.playCell(boardCell);
this.updatePlayerCallback(this.togglePlayer());
this.canvas.ifPresent(cnvs => {
drawBoardState(cnvs,this.board,this)
})
}
}
isValidMove(boardCell)
{
let isValidPlayer1Move = this.player == PLAYER.one && this.board.isPlayer1Cell(boardCell);
let isValidPlayer2Move = this.player == PLAYER.two && this.board.isPlayer2Cell(boardCell);
return isValidPlayer1Move || isValidPlayer2Move;
}
(game.js)
Validating a move

The logic is simple: when we’re handling a cell click, i.e. a request to make a move, we first make sure the move is legal/valid. If it is, we make the move (line 8), and update the UI as before. If it’s not valid, we ask the UI to show some error message13. Note that some UI work – showing a message, displaying who the current player is – happens regardless of whether the canvas is there or not.

The validation function itself – isValidMove – is fairly straightforward at this point. It merely checks that the move is of a cell that belongs to the current player. What’s more important is that we have a specific place to validate a move. We can augment it with further rules later. Another design choice made here is that the isValidMove function returns a boolean result – a move is either valid or not, without any specification of the nature of the problem. A more robust mechanism would’ve returned some indication of the actual problem with the move; mostly for the purpose of showing the player a reason for rejecting the move.
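As an illustration of that last point (a sketch only, not what the commit actually does), such a mechanism could return a small result object instead of a plain boolean:

isValidMove(boardCell)
{
    let isValidPlayer1Move = this.player == PLAYER.one && this.board.isPlayer1Cell(boardCell);
    let isValidPlayer2Move = this.player == PLAYER.two && this.board.isPlayer2Cell(boardCell);
    if (isValidPlayer1Move || isValidPlayer2Move)
        return { valid : true };
    return { valid : false, reason : "That cell doesn't belong to the current player" };
}

// The caller would then have a concrete reason to show:
// let check = this.isValidMove(boardCell);
// if (!check.valid) this.showMsg(check.reason);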

Capturing Stones

Our next commit does two important things:

handleCellClick(boardCell)
{
if (!this.isValidMove(boardCell))
else
{
this.playCell(boardCell); //was: this.board.playCell(boardCell);
}
}
playCell(boardCell)
{
let _ = this.board;
let targetCells = range(1,_.stonesIn(boardCell)).map(steps => _.cellFrom(boardCell,steps))
let lastCell = targetCells[targetCells.length-1];
let isLastCellEmpty = _.stonesIn(lastCell) == 0;
let isLastCellAHomeCell = _.isPlayer1Home(lastCell) || _.isPlayer2Home(lastCell);
let lastCellBelongsToCurrentPlayer = (_.isPlayer1Cell(lastCell) && this.player == PLAYER.one) ||
(_.isPlayer2Cell(lastCell) && this.player == PLAYER.two)
targetCells.forEach(c => _.addStoneTo(c));
if (isLastCellEmpty && !isLastCellAHomeCell && lastCellBelongsToCurrentPlayer)
{ //get the stones from the other player
let acrossCell = _.totalCellCount() - lastCell;
_.setCellStoneCount(lastCell,_.stonesIn(lastCell) + _.stonesIn(acrossCell));
_.setCellStoneCount(acrossCell,0);
}
_.setCellStoneCount(boardCell,0);
}
(game.js)
Capturing stones + important refactoring

First, we fix an old “wrong”. We move the function that implements the game rule into the MancalaGame class. Remember that when we reviewed the code in the past, we took note of this issue; we’re now fixing it.

Next, we’re implementing another game rule – capturing stones. The logic isn’t very complicated; it’s basically encoding the game rule directly – see if we finished in an empty cell, and assuming it’s on the current player’s side, capture the stones from the opponent’s cell right across the board.

From a design point of view, we’re only doing here a change in the board’s state – mutating the local Board instance using its mutating methods (addStoneTo, setCellStoneCount). So we maintain the separation of concerns as we intended: isValidMove validates the move, playCell changes the board as necessary and drawBoardState updates the UI with the new state.

Extra Turn

Our next commit takes care of another rule – a player gets an extra turn if his move ended in his home (his Mancala):

handleCellClick(boardCell)
{
this.showMsg(" ")
if (!this.isValidMove(boardCell))
this.showMsg("Invalid Move")
else
{
let lastCell = this.playCell(boardCell);
let lastCellIsHomeOfCurrentPlayer = this.player1Playing(this.board.isPlayer1Home(lastCell)) ||
this.player2Playing(this.board.isPlayer2Home(lastCell))
if (!lastCellIsHomeOfCurrentPlayer)
this.updatePlayerCallback(this.togglePlayer());
else
this.showMsg("Extra Turn")
}
}
player1Playing(andAlso)
{
return this.player == PLAYER.one && (typeof(andAlso) == 'undefined' ? true : andAlso);
}
player2Playing(andAlso)
{
return this.player == PLAYER.two && (typeof(andAlso) == 'undefined' ? true : andAlso);
}
playCell(boardCell)
{
let lastCellBelongsToCurrentPlayer = this.player1Playing(_.isPlayer1Cell(lastCell)) ||
this.player2Playing(_.isPlayer2Cell(lastCell))
return lastCell;
}
(game.js)
Adding the rule for extra turn

The logic of the rule itself is in lines 9-14 above, and is pretty straightforward to follow. Note how we create here an interaction with the playCell function. playCell is primarily concerned with updating the board data structure (this.board), and returns an indication of the last cell played. The rest of the logic is in its calling function (handleCellClick), which checks for validity and tests for the extra turn. The game logic is still centralized in the same class, MancalaGame, but we’re also keeping a separation between the different functions, trying to maintain the cohesion of each function on its own.

Another small point to take note of here is a small refactoring, mainly for code readability. There are now at least 2 places checking for a specific condition only if a specific player is currently playing (lines 9,10 and 33,34 in the snippet above). My “DRY itch”14 came alive when seeing this, so the pattern is extracted into separate functions – player1Playing, player2Playing. We’ll see later how this pattern of checking a condition based on the current player repeats itself, so this will come in handy down the road as well.


Next we’re going to look into how we identify when the game is over, and implement it. We’ll also wrap up this basic version of the game, and look into what else can be done.

Building a Simple Game – Part 3

Continuing our journey of creating a game: after drawing the board, we should make sure the UI is actually responsive and that a user can actually play the game.

The first interesting bit here is translating a UI click (a mousedown event) into a change in the game state, and driving a redraw of the new state.

class Board
{
playCell(cell)
{
range(1,this.stonesIn(cell))
.map(steps => this.cellFrom(cell,steps))
.forEach(c => this.addStoneTo(c))
this.setCellStoneCount(cell,0);
}
/**
* Calculate a target cell, given a new cell, walking counter-clockwise a number of steps as given
* @param {number} cell The cell we're starting from
* @param {number} steps The number of steps to take from this cell, counter-clockwise, to reach the new cell
*/
cellFrom(cell,steps)
{
//walk backwards, and if we pass cell 0, add the number of cells again.
return cell - steps + (cell < steps ? this.totalCellCount() : 0);
}
addStoneTo(cell)
{
this.setCellStoneCount(cell,this.stonesIn(cell)+1)
}
}
(board.js)
function drawBoardState(cnvs,board,boardClickHandler)
{
function drawOrRemove(boardCell,stoneCount,drawFunc)
{
if (stoneCount > 0)
{
removeDrawingAt(boardCell);
drawFunc(boardCell,stoneCount);
uiObjAt(boardCell).ifPresent(uiObj => {uiObj.on('mousedown', _ => { boardClickHandler(boardCell); })})
}
else removeDrawingAt(boardCell);
}
}
(drawing.js)
var canvas = None;
function initGame(cnvsELID)
{
canvas = maybe(initCanvas(cnvsELID));
canvas.ifPresent(cnvs => {
initDrawingElements(board.totalCellCount());
drawBoard(cnvs,CELL_COUNT);
drawBoardState(cnvs,board,boardClickHandler);
})
}
function boardClickHandler(boardCell)
{
board.playCell(boardCell);
canvas.ifPresent(cnvs => {
drawBoardState(cnvs,board,boardClickHandler)
})
}
(game.js)
Connecting a UI event to actual state change

So what’s going on here15?

If you remember, we already added an event listener to the mousedown event, when creating the necessary UI objects. At this point, we’re making sure it’s actually responding. Since we want the concern of responding to events to be contained in the controller (the game.js module), we add a callback function to the drawBoardState function and make sure it’s invoked at the correct time – when the mousedown event is fired. Note that at the point we attach the event, we’re already aware of the cell we’re attaching it to (the boardCell parameter is part of the closure), so this saves us the need to translate a UI object (or coordinates) to a game object/concept – the cell.

Looking at the callback function implementation – boardClickHandler in game.js – we see a rather simple implementation. We ask the board module to “play” the cell, i.e. play the stones in that cell – make a move. This is a new mutating method in the Board class. Then, assuming the canvas is available, the handler asks the drawing module to draw the new state again.

Finally, looking at the addition to the Board class, we can see how it’s “taking” all the stones in the cell, adding one stone to each of the next cells counter-clockwise, as per the Mancala rules, and then emptying the played cell (line 11 in board.js above). The result of this method execution is a new state of the cells in the board data structure – a new count for some of the cells.
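As a quick worked example of the cellFrom arithmetic on the default 14-cell board: cellFrom(5, 3) returns 5 - 3 = 2 (no wrap-around, since 5 >= 3), while cellFrom(2, 5) passes cell 0 and returns 2 - 5 + 14 = 11.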

Note that all state changes in the board are expressed through one method – setCellStoneCount; this is intentional and will prove useful later. When coding, you don’t always foresee the future and where some choices might be useful. In this case, the benefit wasn’t immediately apparent. Still, I stuck to a simple principle of trying to express higher-level functionality (playing a cell) using lower-level functionality that’s already defined. This is basic structuring of the code, in this case following the DRY principle. At the very least it helps with readability, which is important by itself. In this case it also helps with encapsulation.

A Short Code Review

It’s important to acknowledge the fact that design choices are made as we write the code, not only upfront. It’s therefore worth stopping and reviewing what choices were made here.

First, if you look at the code, you can see that the Board module exposes the playCell method. This is in fact encoding a game rule – how a certain move is made in the game. We should be asking ourselves whether this is the right place for this kind of concern to be implemented. Admittedly, when writing the code I simply wrote it this way w/o giving it a second thought. This goes to show how sometimes such choices are made when we’re not paying attention. We’ll get back to this point later, when it’s actually fixed.

Second thing to note is that the canvas variable is in fact an Option type. Meaning, we could be running the game without a canvas to paint on. The code in the game module (the controller) is of course aware of this and explicitly asks whether the canvas is there, e.g. lines 6,16 in game.js. This is intentional – we would like to be able to run the game even when technically there’s no way to draw it (can you think why?). Of course, one has to ask whether the way to represent the fact that we’re running w/o a GUI is by managing the underlying drawing object (the canvas) in the controller…
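The maybe/None/ifPresent helpers used throughout these snippets aren’t shown here; the repo has its own implementation, but a minimal sketch of the kind of Option helper being assumed could look like this:

// Minimal Option-like helper: wraps a possibly-missing value.
const None = { ifPresent : _ => {}, toString : () => "None" };

function maybe(value)
{
    return (value === null || value === undefined)
        ? None
        : { ifPresent : f => f(value), toString : () => "Some(" + value + ")" };
}

// Usage, as in the controller:
// this.canvas = maybe(initCanvas(cnvsELID));
// this.canvas.ifPresent(cnvs => drawBoardState(cnvs, this.board, this));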

Third thing to note is that throughout the code, we’re referring to the notion of a board cell by implementing it as a simple integer. In most cases this is not evident; we’re just passing a “board cell” around, without specifying the type. It’s mostly evident in the board data structure itself (board.js), where we answer some queries by simply comparing numbers to the given cell. This is an intentional design choice made for simplification. I do not see a lot of benefit in defining a separate class representing a board cell. If I were using a more explicitly typed language, I would probably be aliasing the integer to a more readable type, e.g. in Scala: type BoardCell = Int. But this is not available nor needed in JS. As long as we keep that piece of knowledge (“a board cell is in fact a simple integer”) confined to one module – the board data structure – we should be fine with this choice.

A Bit of Refactoring and Introducing Players

Our journey continues with a commit that is mostly about refactoring the controller. This is where the controller module (game.js) is rewritten to expose a class – MancalaGame, with an explicit interface:

class MancalaGame
{
constructor(cnvsELID,_updatePlayerCallback)
{
requires(_updatePlayerCallback != null,"Must have a player update callback")
this.board = new Board(CELL_COUNT);
this.player = PLAYER.one;
this.updatePlayerCallback = _updatePlayerCallback;
this.updatePlayerCallback(this.player);
this.canvas = maybe(initCanvas(cnvsELID));
this.canvas.ifPresent(cnvs => {
})
}
handleCellClick(boardCell)
{
this.board.playCell(boardCell);
this.updatePlayerCallback(this.togglePlayer());
this.canvas.ifPresent(cnvs => {
drawBoardState(cnvs,this.board,this)
})
}
togglePlayer()
{
this.player = this.player.theOtherOne();
return this.player;
}
}
(game.js)
The new MancalaGame class

This is done mostly for readability and for encapsulating whatever is necessary to run the game, as more and more pieces of logic and data come up. Note specifically that the board data structure, the canvas and associated callbacks are members of this class.

Another notion we’re introducing is that of the current player. The game has a current player at any given time, and the controller (MancalaGame) keeps track of it. Since we’re working under the assumption that there are always exactly 2 players in this game, we can simply code them as constant objects:

const PLAYER = {
one : {
toString : () => "ONE"
, theOtherOne : () => PLAYER.two
}
, two : {
toString : () => "TWO"
, theOtherOne : () => PLAYER.one
}
}
(game.js)
Defining players

Note that while not explicitly exposed (from module.exports), these objects are in fact an API for this module, since they will be used by other modules. The togglePlayer method in MancalaGame actually exposes these objects. Also note that players are not represented by mere numbers (1,2), but as actual objects16. This is mainly because I want an explicit interface that encapsulates the current player, and the ability to send messages to that object in different scenarios. For example, I’d like to display the player name (“ONE”, “TWO”) in the UI, and also toggle the players easily. Encapsulating the player behavior in an object (vs. representing it as a simple number) is therefore more useful. We’ll see later where this comes in handy in more places.


So far, we’ve setup how the game is displayed, and the main mechanics of interaction. We have a skeleton of the game and the needed data structures to support implementing the logic. Next, we’ll start implementing more and more logic of the game, and see how this affects the code design.