Mistakes I made in the design of Tclssg (and how they can be fixed)

Edited for style after publication.

As the 1.0.0 release of Tclssg approaches, I want to take a look at the less fortunate design decisions I made for the project. The goal of this post is to point out the design problems present in Tclssg right now, trace where they came from, and outline how they could be mitigated and eventually solved without breaking backwards compatibility. It is also meant as a reference to help avoid the same mistakes in the future.

I started out developing Tclssg as I learned Tcl. From the beginning I intended it to be a learning experience. I wanted to expand my knowledge of everything that goes into a modern website besides the back end, from the semantic markup elements of HTML5 to responsive CSS frameworks to SEO. What a static site generator needs to do seemed straightforward, so as part of the exercise I also purposefully avoided studying other static site generators. I wanted to rediscover on my own the design principles and best practices that go into building one. One advantage of this approach is that you get to evaluate your decisions later through comparison and to invalidate the wrong conclusions with an understanding of why you reached them. I knew that in this area in particular there would be plenty of projects with which to compare what I made. Building the project was a better learning experience than I had expected. It is no surprise, however, that I did not make all the correct decisions with regard to Tclssg’s design. Once Tclssg was reasonably complete, I surveyed the competition and found places where its design compared unfavorably.

The first and perhaps biggest design error was one I found before that comparison: using nested dictionaries for data storage. I talk about why it was bad and what alternative I found in “Re: Data munging”, so I will not go into detail about it here. I think the chosen fix (using SQLite) is better than what other static site generators have gone with. The following problems, however, have not been fixed yet. Fixing them presents a different kind of challenge, because they touch parts of Tclssg that interface with the outside world.

Fairly early in development I introduced a distinction between documents, articles, and Markdown content. I consider this a mistake. Unlike the dictionary problem, the distinction does not hurt performance; in fact, it may have helped with optimization early on. Where it resembles the dictionary problem is in introducing unnecessary complexity into the program core. To explain what this distinction means we will take a brief look at how Tclssg processes its input.

When creating an HTML file for one’s latest blog post, Tclssg does approximately the following:

  1. It takes the Markdown content of the post and converts it to HTML.
  2. It puts that HTML into an article template.
  3. It puts the article template in a document template.

The rendered document is what gets written to the .html file. The article and the document template are both rendered according to the page settings (user-supplied metadata) for the current page. A document template can include multiple articles. The so-called article collection pages rely on this: the blog index, which lists all articles chronologically, and index pages for individual tags, each listing all articles that have a certain tag. The same process repeats with a different pair of templates to create the .xml files for the RSS feeds.

While Markdown is processed into HTML exactly once, offering something of an optimization, the article template is rerendered each time it is reused in a document template. This allows the article template to behave differently in collections than in its “home” document; e.g., it can truncate the post’s content and add a “read more” link when included in a collection.
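The three stages and the collection-aware article template can be sketched in a few lines of Tcl. This is only an illustration of the idea, not Tclssg's actual code: the procedure names (markdown-to-html, render-article, render-document), the settings keys (title, url), and the paragraph-wrapping stand-in for a Markdown converter are all made up for the example.

```tcl
# Hypothetical stand-in for a real Markdown converter: it only wraps
# paragraphs separated by blank lines in <p> tags.
proc markdown-to-html {markdown} {
    set html {}
    # Split on blank lines by first mapping them to a NUL separator.
    foreach para [split [string map [list "\n\n" \u0000] $markdown] \u0000] {
        append html "<p>[string trim $para]</p>"
    }
    return $html
}

# Stage 2: the article template. It is rendered again for every document
# that includes it, so it can behave differently inside a collection.
proc render-article {settings content inCollection} {
    if {$inCollection} {
        # Keep only the first paragraph and link to the full post.
        set end [string first </p> $content]
        if {$end >= 0} {
            set content [string range $content 0 [expr {$end + 3}]]
        }
        append content "<a href=\"[dict get $settings url]\">Read more</a>"
    }
    return "<article><h2>[dict get $settings title]</h2>$content</article>"
}

# Stage 3: the document template wraps one or more rendered articles.
proc render-document {title articles} {
    return "<html><head><title>$title</title></head><body>[join $articles {}]</body></html>"
}

# Stage 1 runs once per post; stages 2 and 3 run once per output page.
set settings [dict create title Hello url /blog/hello.html]
set content [markdown-to-html "First paragraph.\n\nSecond paragraph."]
set page  [render-document Hello [list [render-article $settings $content 0]]]
set index [render-document Blog  [list [render-article $settings $content 1]]]
```

Here $page contains both paragraphs, while $index contains only the first one followed by the “read more” link, even though the Markdown was converted only once.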

In practice, this approach works well enough for the default (website + blog) template and variations on it. However, it limits what Tclssg is good for and, more importantly, introduces extra complexity in its architecture. The HTML content generated out of Markdown input, the articles, and the documents are each a different kind of data to Tclssg despite their internal similarities.[1] What should be in their place is a general-purpose mechanism for recursively including parametrized templates in other templates. This is what other templating systems call “partials” or “partial templates”. The optimization of caching HTMLified Markdown could be replicated by introducing a special option to use when including the partial template. Unfortunately, this part of Tclssg’s architecture cannot be replaced without an overhaul that would break compatibility with all existing templates. Because I am no longer the only user of Tclssg (I was surprised by how early the first adopter found the project) and a 1.0.0 release is coming, this would be the wrong thing to do, especially since the users of the current system do not object to it.

With backwards compatibility in mind, the right approach if we chose to migrate Tclssg to a new templating mechanism would be to do so gradually. Tclssg already has a way to include templates in other templates, but it is not an adequate basis for a “partial” mechanism. Included templates are run in the current template interpreter’s context with its local variables passed to them. A better solution would be to use templates that get the source for those variables as an argument on each inclusion. If templates are made pure functions, their output can be trivially memoized, which would allow us to simplify the cache system. This mechanism should be designed to allow templates to include not just other templates but also Markdown files through a uniform interface. The default document templates that ship with Tclssg could be migrated to the new mechanism. The article system could be deprecated and then removed completely in version 2.0.
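To make the idea of pure, memoizable partials concrete, here is a minimal sketch. Nothing in it exists in Tclssg: the partial namespace, its define and include commands, and the renders counter are hypothetical names chosen for the example. Each template sees only the params dictionary it is given, so its output can be cached under the pair of template name and parameters.

```tcl
namespace eval partial {
    variable templates {}  ;# template name -> Tcl body that returns a string
    variable cache {}      ;# {name params} -> rendered output
    variable renders 0     ;# number of real renders, for demonstration only
}

# Register a template. Its body may refer only to the variable "params".
proc partial::define {name body} {
    variable templates
    dict set templates $name $body
}

# Render a template with the given parameters, reusing a cached result
# when the same template was already rendered with the same parameters.
proc partial::include {name params} {
    variable templates
    variable cache
    variable renders
    set key [list $name $params]
    if {![dict exists $cache $key]} {
        incr renders
        # [apply] runs the body as an anonymous procedure, so the template
        # cannot see the local variables of whoever included it.
        dict set cache $key [apply [list params [dict get $templates $name]] $params]
    }
    return [dict get $cache $key]
}

partial::define article {
    return "<article><h2>[dict get $params title]</h2>[dict get $params content]</article>"
}
partial::define document {
    set body {}
    foreach article [dict get $params articles] {
        append body [partial::include article $article]
    }
    return "<html><body>$body</body></html>"
}

set post  [dict create title Hello content <p>Hi.</p>]
set page  [partial::include document [dict create articles [list $post]]]
set index [partial::include document [dict create articles [list $post $post]]]
```

The article partial is rendered once and then served from the cache for the other two inclusions. The cache key is the string form of the parameters, so two dictionaries with the same entries in a different order would simply miss the cache rather than produce wrong output. Including Markdown files through the same interface would need a partial whose body calls the Markdown converter, which this sketch leaves out.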

Another mistake, although I am less certain about it being one, may have been using embedded Tcl code with some domain-specific procedures as the primary language for templating instead of a more specialized template DSL. The result is very flexible but verbose. Because of this verbosity, the generation of most discrete features of a template like the sidebar or the next/previous page links at the bottom is delegated to plain Tcl procedures that append to a string to build up the result. This by itself does not violate MV* separation, but it does seem to me to go against the spirit of using templates. I can see two solutions to this problem, which are not mutually exclusive. First, put to use the “template proc” mechanism, which creates Tcl procedures with template bodies. It is already implemented but has not been used in the default templates yet. Template procedures also provide a parametrized templating mechanism, which I mentioned as the solution to the previous problem. They could become the basis of a simpler templating system and at the same time allow for greater code reuse. Second, implement an expressive DSL for templates. The DSL could be distinguished from regular templates through the use of a different set of template tags. Existing templates could be migrated to the new DSL gradually with raw Tcl code templating left intact where it is more expressive.
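The difference between the two styles can be illustrated with the next/previous links. The sketch below is not how Tclssg implements template procedures; template-proc here is a hypothetical helper built on Tcl's subst command, and the procedure names and markup are invented for the example.

```tcl
# Style 1: a plain procedure that builds its result by appending to a string.
proc prev-next-appended {prev next} {
    set result {<nav class="pager">}
    if {$prev ne {}} {
        append result "<a href=\"$prev\">Previous</a>"
    }
    if {$next ne {}} {
        append result "<a href=\"$next\">Next</a>"
    }
    append result </nav>
    return $result
}

# Style 2: a procedure whose body is a template.
# Hypothetical helper: the template is expanded with [subst] inside the
# new procedure, so it can refer to the procedure's arguments.
proc template-proc {name params template} {
    proc $name $params [list subst $template]
}

# A small reusable piece that the template calls for each link.
proc link-if {url label} {
    if {$url eq {}} {
        return {}
    }
    return "<a href=\"$url\">$label</a>"
}

template-proc prev-next-templated {prev next} {<nav class="pager">[link-if $prev Previous][link-if $next Next]</nav>}

set a [prev-next-appended  /a.html /b.html]
set b [prev-next-templated /a.html /b.html]
```

Both procedures return the same markup, but in the second one the shape of the output is visible at a glance and the conditional logic lives in a helper that other templates can share.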

A common theme that comes to mind when thinking about the features mentioned above is the conflict between generality/flexibility and usability. The template architecture in use in Tclssg right now is usable but not very general. (An equally usable system could have been implemented on top of a more general template system.) On the other hand, the template language is general (it gives you access to a general-purpose programming language) but not as usable as some of the competition. You seem to pay a price for getting it wrong either way.

I would also like to mention a mistake in software engineering rather than software architecture and design. I gave the pre-release version of Tclssg the designation “1.0.0b” (with “b” as in “beta”) instead of going with “1.0.0a” (“alpha”) first or just continuing the “obviously unstable” 0.x.y line. I did it in part because I did not expect to introduce many major changes as a result of user feedback. I was wrong about that, and in retrospect I am glad I was. Tclssg has changed for the better thanks to its users’ feedback. I would like to take this occasion to thank everyone who gave such feedback and who contributed code and documentation to Tclssg.


  1. You can make a parallel to nominal typing here.