Mistakes I made in the design of Tclssg (and how they can be fixed)

As the 1.0.0 release of Tclssg approaches I want to take a look at the less fortunate design decisions I made for the project. The goal of this post is to point out the design problems currently present in Tclssg, trace where they came from and outline how they could be mitigated and eventually solved without breaking backwards compatibility. It is also meant to serve as a reference to help avoid making the same mistakes in the future.

I started out developing Tclssg as I learned Tcl. From the beginning I intended it to be a learning experience. I wanted to expand my knowledge of everything that goes into a modern website besides the back end, from the semantic markup elements of HTML5 to responsive CSS frameworks to SEO. What a static site generator needs to do seemed straightforward, so as part of the exercise I also purposefully avoided studying other static site generators. I wanted to be able to rediscover on my own the design principles and best practices that go into building one. An upside to this approach is that you get to evaluate your decisions later through comparison and invalidate the wrong conclusions — with an understanding of why they were reached. I knew that in this area in particular there would be plenty of projects with which to compare what I would make. Building the project ended up being a better learning experience than I had expected. It is no surprise, however, that I did not make all the correct decisions with regard to Tclssg's design. Once Tclssg was reasonably complete I surveyed the competition and found places where the design did not compare favorably.

The first and perhaps gravest design error, though, I discovered a little before that comparison. The error consisted of using nested dictionaries for data storage. Why it was bad and what alternative was found are described in Re: Data Munging, so I will not go into detail about it here. I think the chosen "fix" (using SQLite) compares favorably to what other static site generators have gone with. The following problems, however, have not yet been fixed. Fixing them presents a different kind of challenge because they touch parts of Tclssg that interface with the outside world.

Fairly early into development I introduced a distinction between documents, articles and Markdown content. I have since come to consider this a mistake. Unlike the dictionary problem, the distinction does not hurt performance; in fact, it may have helped with optimization early on. Where it resembles the dictionary problem is in introducing unnecessary complexity into the program's core. To explain just what this distinction means we will take a brief look at how Tclssg processes its input.

When creating an HTML file for one's latest blog post Tclssg does approximately the following: first, it takes the Markdown content of the post and converts it to HTML; second, it puts that HTML into an article template; finally, it puts the article template in a document template. The rendered document is what gets written to the .html file. The article and the document template are both rendered according to the page settings (user-supplied metadata) for the current page. A document template can include multiple articles instead of just one. This is used to create the so-called article collection pages: the blog index, which lists all articles chronologically, and index pages for individual tags, each listing all articles that have a certain tag. The same process is repeated with a different pair of article and document template to create the .xml files for the RSS feeds.
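The three steps above can be sketched roughly as follows. This is only an illustration, not Tclssg's actual code: the procedure names, the page settings keys and the stand-in Markdown converter are all made up for the example.

```tcl
# Hypothetical stand-in for a real Markdown converter: it just wraps the
# text in a paragraph so that the example is self-contained.
proc markdown-to-html {markdown} {
    return "<p>[string trim $markdown]</p>"
}

# Hypothetical article template: takes the converted content and the
# page settings (a dict) and returns the rendered article.
proc render-article {content settings} {
    set title [dict get $settings title]
    return "<article><h1>$title</h1>$content</article>"
}

# Hypothetical document template: wraps a list of rendered articles.
# A regular page passes one article; a collection page passes several.
proc render-document {articles settings} {
    set title [dict get $settings title]
    return "<html><head><title>$title</title></head><body>[join $articles {}]</body></html>"
}

# The whole pipeline for a single blog post.
proc render-page {markdown settings} {
    set content [markdown-to-html $markdown]         ;# step 1
    set article [render-article $content $settings]  ;# step 2
    return [render-document [list $article] $settings] ;# step 3
}
```

Calling `render-page "Hello" [dict create title Post]` returns the complete document as a string, which is what would be written to the .html file.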

While Markdown is processed into HTML exactly once, offering something of an optimization, the article template is rerendered each time it is reused in a document template. This is done to allow the article template to behave differently in collections than in its "home" document; e.g., it can abbreviate the post content with a "read more" link when included in a collection.
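To make the rerendering concrete, here is a minimal sketch of an article template that behaves differently inside a collection. Everything in it is an assumption made for illustration: the procedure name, the `url` setting and the `<!-- more -->` cut marker are not taken from Tclssg's real templates.

```tcl
# Hypothetical article template. `html` is the already converted Markdown
# (converted once, reused on every call); `collection` is true when the
# article is rendered as part of a collection page.
proc render-article-view {html settings collection} {
    set marker "<!-- more -->"  ;# assumed cut marker, for illustration only
    set cut [string first $marker $html]
    if {$collection && $cut >= 0} {
        # In a collection keep only the text before the marker and link
        # to the post's "home" document.
        set url [dict get $settings url]
        set html "[string range $html 0 [expr {$cut - 1}]]<a href=\"$url\">read more</a>"
    } else {
        # In the home document show everything and drop the marker.
        set html [string map [list $marker ""] $html]
    }
    return "<article>$html</article>"
}
```

The same `html` value is passed in both cases; only the template output changes, which is why the template has to be run again for every document that includes the article.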

The approach described in the previous two paragraphs does work in practice: it works well enough for the default (website + blog) template and variations on it. However, it limits what Tclssg is good for and, more importantly, introduces extra complexity into its architecture: the HTML content generated from Markdown input, the articles and the documents are each a different kind of data to Tclssg despite their internal similarities.[1] What should be used instead is a general-purpose mechanism for recursively including parametrized templates in other templates. This is what other templating systems call "partials" or "partial templates". The optimization of caching HTMLified Markdown could be replicated by introducing a special option to use when including the partial template. Unfortunately, this part of Tclssg's architecture cannot be replaced without an overhaul that would break compatibility with all existing templates. Because I am no longer the only user of Tclssg (I was surprised by how early the first adopter found the project) and it is on the verge of a 1.0.0 release, this would be the wrong thing to do, especially since the users of the current system do not object to it.
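A general-purpose partial mechanism could look something like the sketch below. It is a thought experiment, not a design Tclssg has adopted: `define-partial`, `include` and the parameter names are hypothetical. The point is that content, article and document are all the same kind of thing, and each gets its parameters explicitly as a dict.

```tcl
# Hypothetical registry of partial templates: name -> lambda for [apply].
set ::partials [dict create]

# Define a partial. Its body sees a single variable, `params`, which
# holds the dict passed by whoever includes it.
proc define-partial {name body} {
    dict set ::partials $name [list params $body]
}

# Include a partial by name with explicit parameters.
proc include {name {params {}}} {
    return [apply [dict get $::partials $name] $params]
}

define-partial content {
    return [dict get $params html]
}

define-partial article {
    set title [dict get $params title]
    set body [include content [dict create html [dict get $params html]]]
    return "<article><h1>$title</h1>$body</article>"
}

define-partial document {
    set out ""
    foreach post [dict get $params posts] {
        append out [include article $post]
    }
    return "<body>$out</body>"
}
```

With this in place a collection page is just `include document [dict create posts $posts]`, where `$posts` is a list of dicts with `title` and `html` keys, and a single-post page is the same call with a one-element list.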

With backwards compatibility in mind, the right approach, should we choose to migrate Tclssg to a new templating mechanism, would be to do so gradually. Tclssg already has a way to include templates in other templates, but it is not an adequate basis for a "partial" mechanism because the included templates are run in the current template interpreter's context with its local variables passed to them. A better solution would be to use templates that get the source for those variables as an argument on each inclusion. If templates are made pure functions, their output can be trivially memoized, which would allow us to further simplify the cache system. This mechanism should be designed to allow templates to include not just other templates but also Markdown files through a uniform interface. The default document templates that ship with Tclssg could be migrated to the new mechanism and the article system deprecated, followed by its complete removal in version 2.0.
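The memoization argument can be illustrated with a short sketch. It assumes a template is an ordinary procedure whose output depends only on its single `vars` argument; the names `article-template`, `render-memoized` and the call counter are invented for the example.

```tcl
# Cache of rendered output, keyed by the template name and its arguments.
set ::renderCache [dict create]
# Counts real renders, only to show that the cache is being hit.
set ::renderCalls 0

# Hypothetical pure template: it reads nothing but `vars`.
proc article-template {vars} {
    incr ::renderCalls
    return "<article><h1>[dict get $vars title]</h1>[dict get $vars content]</article>"
}

# Render through the cache. Because the template is pure, the pair
# (template name, vars) fully determines the output.
proc render-memoized {template vars} {
    set key [list $template $vars]
    if {![dict exists $::renderCache $key]} {
        dict set ::renderCache $key [$template $vars]
    }
    return [dict get $::renderCache $key]
}
```

Rendering the same article with the same `vars` twice runs `article-template` only once. One caveat of this naive version is that the key is compared as a string, so two dicts with the same contents in a different key order are cached separately.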

Another mistake, although I am less certain about it being one, may have been using embedded Tcl code with some domain-specific procedures as the primary language for templating instead of a more specialized template DSL. The result is very flexible but verbose. Because of this verbosity the generation of most discrete features of a template, like the sidebar or the next/previous page links at the bottom, is delegated to plain Tcl procedures that build up a string result. This by itself does not violate MV* separation but it does seem to me to go against the spirit of using templates. I can see two solutions to this problem, which are not mutually exclusive. First, put to use the "template proc" mechanism, which creates Tcl procedures with template bodies. It is already implemented but has not been used in the default templates yet. Template procedures also provide the parametrized templating mechanism mentioned as the solution to the previous problem. They could become the basis of a simpler templating system and at the same time allow for greater code reuse. Second, implement an expressive DSL for templates. The DSL could be distinguished from regular templates through the use of a different set of template tags. Existing templates could be migrated to the new DSL gradually where appropriate, with raw Tcl code templating left intact where it is more expressive.
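To show the idea of a procedure with a template body, here is a deliberately simplified sketch. It is not Tclssg's "template proc" implementation: it uses Tcl's built-in `subst` command in place of a real template engine with its own tags, and the names `template-proc` and `nav-links` are hypothetical.

```tcl
# Hypothetical helper: define a procedure whose body is a template.
# The template is expanded with [subst] inside the new procedure, so
# $name in the template refers to that procedure's arguments.
proc template-proc {name arguments template} {
    proc $name $arguments [list subst $template]
}

# The next/previous links written as a template instead of as code that
# appends to a string.
template-proc nav-links {prev next} {<nav><a href="$prev">Previous</a> <a href="$next">Next</a></nav>}
```

After this, `nav-links /a.html /c.html` is an ordinary Tcl command that returns the filled-in markup, so it can be called from other templates like any other procedure.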

A common theme that comes to mind when thinking about the features mentioned above is the conflict between generality/flexibility and usability. The template architecture in use in Tclssg right now is usable but not very general. (An equally usable system could have been implemented as a special case on top of a more general template system.) On the other hand, the template language is quite general (after all, it gives you access to a general-purpose programming language) but not as usable as some of the competition. You seem to pay a price for getting it wrong either way.

I would also like to mention a mistake in software engineering rather than software architecture and design. I gave the pre-release version of Tclssg the designation "1.0.0b" (with "b" as in "beta") instead of going with "1.0.0a" ("alpha") first or just continuing the "obviously unstable" 0.x.y line. I did so in part because I did not expect to introduce many major changes as a result of user feedback. I was wrong about that and in retrospect I am glad I was. Tclssg has changed thanks to user feedback, and for the better. I would like to take this occasion to thank everyone who gave such feedback and who contributed code and documentation to Tclssg.

  1. You can draw a parallel to nominal typing here.