A Fractal of Garbage

When I started this blog, I briefly mentioned that one of my motivations was wanting to move away from the static site generator Hugo, which I had used for my old blog until I took a hiatus from blogging for about a year when the meat puppet's life went to shit and she was homeless for a while. That, dear reader, is a whole post unto itself, though many have by now either gotten the rundown or heard what happened through hearsay. Over time, I'd found that my Hugo theme and templating had mysteriously started to break, and I couldn't be bothered to fix what was already a series of really shitty hacks on top of a blogging framework with bad abstractions. To keep myself occupied, I decided to finally get down to migrating all my old posts away from my old blog, because it's pointless and confusing to have three separate blogs when only one of them is active.

And therein begins the fractal of garbage that I am currently descending into, writing this post to take a break from what I'm trying to do. But let's start with talking shit about Hugo itself.

Go(lang) Fuck Yourself

As I mentioned in Recurrence, I had tried a while ago to package Hugo in GNU Guix when I was still using it. Credit where it's due: Guix and NixOS were the first distros that got me into packaging software for Linux, though I stopped using Linux a few years ago and have moved on to *BSD, because Linux also sucks. But I am glad to have gotten that experience, because at the time I had not yet managed to fully shed the naive free software consoomer mindset that to me defines what the Linux community has become. In 2018, I pretty much blindly adopted Hugo because I wanted to move away from the old iteration of Nyxus that ran on Wordpress, and someone recommended it to me. I had no idea what I was doing back then and didn't yet realize that just because something is "free software" doesn't mean it doesn't have a price.

When I tried packaging this software that I'd become reliant on to get work done, I was surprised to find that Hugo is actually quite difficult to package on a distro like Guix that takes the concept of reproducible software builds seriously.[1] If I go to the GitHub page for Hugo, I see that it's currently on release number… v0.122.0. I think when I last tried to package it in 2021 it was in the v0.8X.0 range, which is already a bad sign. Why even bother making this many stable releases? Just admit your software is never going to be stable, bro.

But nevertheless, I pressed on for a bit before giving up: even with Guix being a pretty good packaging system (though lacking a lot of the automated conveniences Nix has), I just couldn't be bothered to keep resolving endless circular dependencies. I now see why.

Let's try checking what the total transitive dependency graph is for Hugo:

n1x@katak:~/code/src/hugo % go mod graph | wc -l

It turns out that the raw output of go mod graph is so fucking massive that it literally fills my entire terminal history, so I will spare you the ugliness of pasting it here; the wc -l count comes out to almost three thousand lines, each one an edge in the module requirement graph. That graph includes many different versions of the same packages, because in the world of "move fast and break things", we have no time for stable APIs or completed software. And because nearly every programming language in existence is a giant pile of shit where writing useful software that does basic things is a Herculean task, whenever a problem is solved, no one dares attempt it again. The conventional wisdom becomes to pull in dependencies for everything, because it's the "UNIX way" or whatever. This creates a situation where most software would likely not work at all without adding even more stupid, ad hoc bad practices on top of this Jenga tower of garbage. And so you get a situation like Hugo's, where a single piece of software has literally thousands of transitive dependencies, in many cases with multiple versions of the same dependency being relied on by different libraries because everything is pinned to different versions.
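If you want to see the multiple-versions problem for yourself, here's roughly how you'd tally it from go mod graph output. A Python sketch for illustration only; the handful of sample edges below is made up, not Hugo's actual graph.

```python
# Count unique modules and multi-version modules in `go mod graph` output.
# Each line of that output is one edge: "requirer requiree", with versions
# attached after "@". The sample edges here are illustrative, not real.
from collections import defaultdict

sample = """\
github.com/gohugoio/hugo github.com/spf13/cobra@v1.7.0
github.com/spf13/cobra@v1.7.0 github.com/spf13/pflag@v1.0.5
github.com/gohugoio/hugo github.com/pelletier/go-toml/v2@v2.0.6
github.com/spf13/viper@v1.15.0 github.com/pelletier/go-toml/v2@v2.0.5
"""

def version_spread(graph_text):
    """Map each module path to the set of versions appearing anywhere
    in the requirement graph (either side of each edge)."""
    versions = defaultdict(set)
    for line in graph_text.splitlines():
        for node in line.split():
            path, _, ver = node.partition("@")
            if ver:
                versions[path].add(ver)
    return versions

spread = version_spread(sample)
duplicated = {m: vs for m, vs in spread.items() if len(vs) > 1}
print(len(spread), "versioned modules,", len(duplicated), "with multiple versions")
# 4 versioned modules, 1 with multiple versions
```

Run against a real Hugo checkout, the duplicated dict is where the pain lives: every module listed there gets built more than once.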

My dearest reader, this itself is already a completely batshit insane situation that only a few brave souls dare to ever speak out against. Nearly every other clever midwit who dumps their garbage onto Shithub either isn't aware that this is a problem, or at best, they have accepted it because they probably program for a living and just don't care about the state of the industry. I can't say I blame them, because nearly every tech company in existence only produces more garbage that is superfluous at best, and more often than not just actively makes the world worse. In fact, on that basis, I should be praising them for making worse software as a way of sabotaging the ϟϟiliKKKon Reich.

But in the context of what I'm talking about, the situation is somehow even worse, for a variety of reasons.

I won't pretend to have any familiarity with Golang beyond my sysadmin duties with things like packaging, because I value my time and what remaining sanity I have left, and have no interest in writing code in a language that is infamous for how badly designed it is. To quote the man himself, Rob Pike:[2]

The key point here is that our programmers are not researchers. They are, as a rule, very young, they come to us after their studies, perhaps they studied Java, C / C ++ or Python. They can’t understand an outstanding language, but at the same time we want them to create good software. That is why their language should be easy for them to understand and learn.

It should be familiar, roughly similar to C. Programmers working at Google start their careers early and are mostly familiar with procedural languages, in particular the C family. The demand for speedy productivity in a new programming language means that the language doesn’t have to be too radical.

Rob Pike is giving a marketing spin (probably with the barrel of a Google PMC's Beretta digging into his back) to something that Paul Graham talked about over two decades ago. A dumb person will read this quote and think, "ah yes, thank you Rob Pike and Google for treating us programmers like children who can't be trusted with good tools to solve problems, you know, as if we were professional engineers or something." But if you read this quote, you see what Stanislav is talking about in "Where Lisp Fails" not just recurring, but being sold to you as if it were a design decision meant to benefit the programmer. It is not. Most programming languages are gimped by design, either through an emergent process where they just happened to become popular and suddenly everyone was stuck with them (PHP is the exemplar of this), or because they were made that way on purpose. What sets Google apart from the rest of the tech industry isn't that they make the best tools for the job, but that they know how to play the game. They know they need to pull the wool over the eyes of junior developers by marketing a language to them that doesn't empower them to solve problems but rather makes solving problems more complicated and painful.

This is not without precedent, and it's hardly limited to Golang. This post is not really about Golang; like I said, I haven't even used it personally and don't care to. But this is endemic to most of the tech industry, as far as I can tell, and it's rooted in the legacy of UNIX and C, which also make solving problems harder than they need to be. There was once a time when having to manually twiddle bits and allocate memory mattered, because of hardware constraints, but that time has long since passed. Many programmers have commented on how we got to the moon with however many kilobytes of code, and yet today Electron apps eat up multiple gigabytes of RAM to run yet another shitty proprietary chat application. But none of them understand that it's not a mere historical accident that we got here. It's because the technologies that existed to actually reduce the amount of work the programmer had to do didn't win; what won was what was already popular, what was born out of hardware constraints, and that was C and UNIX.

But it's not just that we are stuck in a bad timeline where the bad software won. We could all very easily be using lisp or Smalltalk or Erlang or [insert other niche language here] for everything, finish our software, and then do something else. But if we all used technologies that existed to reduce work, it would mean that software wouldn't require an army of programmers to get anything done, which means that there would be a smaller pool of skilled artisans who would have much more bargaining power in the workplace. They could easily demand better conditions and higher pay, and not only that, but they would probably end up just getting work done and then have nothing left to do except sit around getting paid for nothing. We can't have that, no. The tech industry has a literal financial incentive to force the adoption of shitty technologies that are badly designed and create more busywork, because this ensures that programmers are more like assembly line workers: Easily replaced if they start to demand better conditions, unless they unionize. And thanks to the tech industry also being infamously full of crypto-fascist right-libertarian idiocy, that's not likely to happen!

I'd only be repeating a lot of "Worse is Better" by going into this more, but my point is that this isn't merely me being a smug lisp weenie complaining about how software doesn't meet my extremely specific and high standards. The point is that I am currently dealing with problems that have been artificially created by all of this (let's not forget that Golang comes from the Plan 9 crowd, who supposedly fixed UNIX's problems by making it even MOAR UNIXY). A fucking static site generator should not have almost 3000 transitive dependencies. But partly because of the idiotic and irresponsible practices of the tech industry, partly because of the bad technologies that everyone uses, partly because the tech industry is itself a predatory finance capital industry that should be killed with fire, and partly because of the mentality behind Stallmanist free software ideology (that a license is all you need to ensure your freedumbs are respected), my time is being wasted.

There is no resource more precious than time, and I genuinely resent everyone who is responsible for me having to deal with this. At the very least, I resent the fact that there is this software consoomer mindset in the Linux community that something being F/OSS is equivalent to it being good. There is a shitload of free software that is fucking garbage and that you shouldn't install on your machine if you value your time and sanity, and I am constantly having to avoid these things because I refuse to let myself be domesticated by soydevs whose elephantine software effectively takes away the user's freedom by being impractical for any single person to audit and change on their own.

But if only the problem were merely the particular implementations of software.

Death By a Thousand Cuts

I must reiterate that this is not really a post about Hugo itself, or Golang, or UNIX, or anything else I talked about in the previous section. While it was my own ignorance that led me to rely on an unstable and overcomplicated piece of software for blogging, I don't particularly care what the Hugo devs do. I am a strong advocate of the idea that you should write software for yourself first and foremost, and everyone else can go fuck themselves if they don't like it. At least when it comes to free software, this should be the mentality people approach it with. You do not owe anyone anything when you've already elected to release your work into the world for free, and I would not ask the Hugo devs to change anything they're doing. I would simply choose not to use their software, which is what I did. But the previous section was merely setting the stage for the real heart of the problem that has gotten me where I am now: the absolutely abysmal state of document markup formats.

Hugo uses Markdown as the markup format for writing posts, and if you're someone like me a few years ago who has no idea what they're doing, it seems like a revolutionary concept to blog using plaintext files instead of a whole PHP and MySQL stack like Wordpress or a proprietary WYSIWYG XML serialization format like Word Documents (or the free software equivalents). There are projects like Obsidian that use this as a selling point, even. Wow, you own your data because it's all just plaintext files, how cool!

The thing is that Markdown is also fucking garbage, and unlike anything I've mentioned so far, Markdown isn't even software. It's a completely ad hoc markup convention that everyone uses, with no real thought put into its design; it was just made by some guy years ago for his own purposes and posted on the internet. It comes from the exact same school of thought that led to PHP getting widely adopted: some guy makes something and releases it for free, a bunch of other people use it without thinking about it, and suddenly a whole industry is reliant on a thing that was never designed to be a standardized markup format.

The non-design of markdown is what led to the need for something like CommonMark, which actually puts thought into specifying in unambiguous terms how to write documents, but by that point the damage had already been done. There are all sorts of different implementations of markdown that all sort of resemble each other but each add their own bullshit on top, and there's no way of reliably parsing markdown documents that aren't explicitly written in conformance with CommonMark.

This, once again, leads me back to picking on Hugo, because what Hugo does is add its own flavor of markdown with this thing called "frontmatter" that attaches metadata to markdown files. Being a plaintext markup format inspired by usenet conventions for styling text, and mainly meant to be a more convenient way of writing HTML by hand, markdown has no support for metadata, which is kind of a problem if you're trying to build a blogging system around it. It's especially a problem if you care about keeping track of things like when a document was published, whether it's a draft, what the canonical title is, tags, and other stuff people usually care about when blogging. So they have shit like this that they added to it:

title: "Blog Without Organs"
date: 2018-06-23T23:30:34-07:00
type: "post"
categories: ["meta"]
tags: ["tech", "hugo", "design", "decentralization"]
draft: false

Did I mention that Hugo supports three different formats for the frontmatter, and that out of them, only one (JSON) is a relatively simple and unambiguous data serialization language? The other two are TOML and fucking YAML, the latter an infamously convoluted pile of shit in its own right; there isn't even a parser implemented in pure Common Lisp for pretty printed YAML, because it's that complicated. Data serialization formats could be considered a separate concern from document markup formats, but here the recursive descent into hell begins: in migrating my blog posts from Hugo to the solution I'm building for myself in Common Lisp, I now have to deal with two different poorly-specified formats that have been blindly adopted by programmers with no thought for the long-term consequences.
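Just figuring out which format you're looking at is work the frontmatter scheme pushes onto you: the convention is that TOML sits between +++ delimiter lines and YAML between --- lines. Here's a minimal sketch of that sniffing, in Python for illustration rather than my actual CL; it skips JSON frontmatter and every edge case.

```python
# Split a Hugo post into (format, frontmatter_text, body) based on the
# delimiter convention: "+++" fences TOML, "---" fences YAML. A sketch
# only: ignores JSON frontmatter and delimiter-like lines in the body.
def split_frontmatter(text):
    delims = {"+++": "toml", "---": "yaml"}
    for delim, fmt in delims.items():
        if text.startswith(delim + "\n"):
            head, _, body = text[len(delim) + 1:].partition("\n" + delim + "\n")
            return fmt, head, body
    return None, "", text

post = """---
title: "Blog Without Organs"
draft: false
---
The actual post body.
"""
fmt, meta, body = split_frontmatter(post)
print(fmt)  # yaml
```

And then, of course, you still need an actual TOML parser and an actual YAML parser for whatever comes back as meta.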

Once again, this is my own fault to some extent, but it actually gets worse. As I said above, markdown isn't really suited to something like blogging, in my opinion. It suffers from the same pseudo-simplicity as Golang, C, and UNIX, where the programmer is given a bare-minimum system that may or may not be consistent and correct, and that above all else must be easily implemented rather than easily used (i.e., have simple interfaces). It's easy enough to implement something that adheres to the vague idea we have of what "markdown" is, but if you want to do anything more than just write HTML by hand, then you have a problem and must fall back on the usual eunuchs philosophy of writing your own boutique parser for everything you do, because everything is strings and files. Again, it's simple to implement, but once you start to think bigger than the bare minimum, you have a lot of miserable toil to deal with.

This is why a few years ago I decided to move to org-mode instead of markdown for my notetaking and writing, because within the scope of GNU Emacs, org-mode is quite nice as a document system. It has all sorts of nice interfaces for things that documents can be used for, including blogging, and you can easily keep track of metadata not just in the files themselves but even in subtrees of an org-mode file, if you prefer to have one big org-mode file with a lot of different sections. This to me is just a better way of writing compared to markdown, or to something as massively complex and verbose as LaTeX, which is better suited to academic papers: while you're still using a plaintext format, and not relying on a whole stack of software like a database just to manage your own personal notes or writing, you can still treat your writing as data to some degree.

The problem, as is so often the case with GNUware, is that the price you pay for freedom is having to do everything the way GNU wants you to. If markdown has, at best, an ad hoc spec with a bunch of competing implementations, org-mode has exactly one implementation. I have attempted to write a parser for org-mode before, and am currently putting that on hold until I can be bothered to refactor my rather bad and difficult-to-maintain code, and I can tell you that org-mode doesn't just effectively lack a spec, being defined by its single implementation (god help you if you rely on org-mode features added by third-party packages). It's also a fucking massively complicated beast, because so much of its functionality relies on being run inside GNU Emacs, where being able to do things like run some specific function at a certain point in the document tree is what allows you to have an export dispatch in the first place, or to export a single subtree, for instance. Among other bits of funny trivia, org-mode uses its own timestamp format "inspired by" ISO 8601 for some goddamn reason instead of just using ISO 8601, which means that even parsing timestamps requires manual intervention.
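To show just how gratuitous this is, here's a sketch of the manual intervention in question: org timestamps like <2024-01-31 Wed 04:02> embed a locale-dependent day-of-week name and drop ISO 8601's "T" separator, so you get to write a custom parse for what should have been a solved problem. Python for illustration; repeaters, ranges, and other org timestamp extras are ignored.

```python
# Convert an org-mode timestamp like <2024-01-31 Wed 04:02> (or the
# inactive [...] form) into ISO 8601. Sketch only: no repeaters
# ("+1w"), no time ranges, no warning periods.
import re

ORG_TS = re.compile(
    r"[<\[](\d{4})-(\d{2})-(\d{2})(?:\s+\w+)?(?:\s+(\d{2}):(\d{2}))?[>\]]"
)

def org_to_iso8601(ts):
    m = ORG_TS.match(ts)
    if not m:
        raise ValueError(f"not an org timestamp: {ts!r}")
    y, mo, d, hh, mm = m.groups()
    return f"{y}-{mo}-{d}T{hh}:{mm}" if hh else f"{y}-{mo}-{d}"

print(org_to_iso8601("<2024-01-31 Wed 04:02>"))  # 2024-01-31T04:02
```

Note that the day name has to be tolerated and thrown away: it's redundant with the date, and it changes with Emacs's locale.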

So it turns out that because I was reliant on a piece of software that is constantly breaking itself for no clear reason and is itself incredibly complicated despite doing something as simple as generating static sites (it's literally a joke that everyone has written their own static site generator), if I don't want to continue to bend to the whims of the Hugo devs and fix my site whenever they change something, I have to migrate my posts that are in their own specific implementation of a document markup format that has no real specification. An implementation which uses three different data serialization formats for metadata which are infamously complicated in their own right. And then, I have to figure out how to get it into a different document markup format that literally has no spec.

Even going about this in the dumbest way I can think of, using the excellent Plump parser for Common Lisp to parse the generated HTML and then converting it to org-mode with Pandoc, I ran into problems, because it turns out I have to settle for converting the entire HTML page. If I select only the part of the HTML file that has the actual post in it, Pandoc can't figure out how to properly convert it to org-mode and just outputs a bunch of embedded HTML blocks (because yes, you can also embed HTML in org-mode for some reason). If I convert the whole page, then the header, footer, and all the other stuff I don't want gets converted into org-mode too, and I'd then need to figure out how to strip all of that out of the document to get what I actually want.
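The extraction half of that dumb pipeline is at least easy to sketch. This is Python with the stdlib html.parser, standing in for the Plump code I actually wrote, and the "post-content" class is a hypothetical placeholder for whatever a given theme actually emits.

```python
# Pull out just the inner HTML of the element holding the post body,
# instead of feeding the whole page (nav, header, footer) to pandoc.
# Stdlib-only sketch; "post-content" is a hypothetical class name, and
# unclosed void elements like <br> would need special-casing.
from html.parser import HTMLParser

class PostExtractor(HTMLParser):
    def __init__(self, target_class="post-content"):
        super().__init__()
        self.target_class = target_class
        self.depth = 0          # nesting depth inside the target element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1
            self.chunks.append(self.get_starttag_text())
        elif self.target_class in (dict(attrs).get("class") or "").split():
            self.depth = 1      # entered the post container itself

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1
            if self.depth:      # still inside the container: keep the tag
                self.chunks.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)

page = '<body><nav>junk</nav><div class="post-content"><p>Hi.</p></div></body>'
p = PostExtractor()
p.feed(page)
print("".join(p.chunks))  # <p>Hi.</p>
```

The joined chunks are what you'd hand to pandoc, sparing yourself the post-hoc stripping of navigation junk from the org output.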

The more time I spent fucking around with this, trying to automatically convert my old posts to a format I actually use (one that is already itself a huge compromise I hope to eventually replace), the more I found myself making one compromise after another, relying on other people's software. Software which is, I must add, very high quality: Pandoc is excellent, and Plump is also excellent, like pretty much everything Shinmera makes. And yet the deeper I fall into this artificially created hell of parsing all these different stupid serialization formats just to get my damn posts into something I can work with, the more it starts to feel like it would actually be less work to just transcribe it all by hand.

I cannot emphasize enough how this little microcosm of shit is to me an absolutely damning indictment of the state of software. Computers are supposed to be labor saving technologies, for fuck's sake!

At this point in my little adventure, I'm thinking about settling on just manually parsing the Hugo frontmatter in Common Lisp, and then parsing the body document using a CL CommonMark library, and then I guess I will just have to recurse through the AST and write my own goddamn emitter just to get everything into org-mode. This actually feels like the easiest thing for me to do right now short of implementing my own document markup format – which is something I absolutely plan to do so I can be rid of org-mode's nonsense and be less dependent on GNUware, but I just want this part to be done with first.
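To give a sense of what that emitter amounts to, here's a toy sketch in Python rather than CL. The dict-based node shape is a made-up stand-in for the node objects cl-cmark actually returns, and only a handful of node types are handled; the real thing has to cover lists, blockquotes, code blocks, links, and so on.

```python
# Recurse through a CommonMark-style AST and emit org-mode text.
# The dict node shape is a hypothetical stand-in for a real parser's
# node classes; unhandled node types fail loudly rather than silently.
def emit_org(node):
    t = node["type"]
    kids = "".join(emit_org(c) for c in node.get("children", []))
    if t == "document":
        return kids
    if t == "heading":
        return "*" * node["level"] + " " + kids + "\n\n"
    if t == "paragraph":
        return kids + "\n\n"
    if t == "emph":
        return f"/{kids}/"      # org italics
    if t == "strong":
        return f"*{kids}*"      # org bold
    if t == "code":
        return f"={node['literal']}="
    if t == "text":
        return node["literal"]
    raise NotImplementedError(f"no org emitter for node type {t!r}")

doc = {"type": "document", "children": [
    {"type": "heading", "level": 1, "children": [
        {"type": "text", "literal": "A Fractal of Garbage"}]},
    {"type": "paragraph", "children": [
        {"type": "text", "literal": "Markdown is "},
        {"type": "emph", "children": [{"type": "text", "literal": "fine"}]},
        {"type": "text", "literal": ", right?"}]},
]}
print(emit_org(doc))
```

Failing loudly on unknown node types is the one design choice worth copying: silently dropping nodes is how you lose half a post without noticing.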

But then, but then, I remember that some markdown flavors support footnotes and some don't. I make liberal use of footnotes because they're a convenient way to write asides, and it turns out that if I use the CommonMark spec to parse my markdown files, footnotes don't get parsed either! So now I have to figure out whether a raw text node is a footnote reference, and then find its definition later on in the fucking document, entirely separate from the context in which it was parsed.
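Sketching that in Python: the [^label] reference and "[^label]: …" definition lines are the common markdown footnote extension, and a strict CommonMark parser hands both back as plain text, so you end up regex-matching them out yourself. The sample text and labels below are made up; multi-line footnote bodies aren't handled.

```python
# Spot markdown-extension footnotes that a strict CommonMark parser
# leaves as plain text: inline [^label] references, and "[^label]: ..."
# definition lines. Sketch only; multi-line footnote bodies not handled.
import re

REF = re.compile(r"\[\^([^\]\s]+)\](?!:)")          # reference, not a definition
DEF = re.compile(r"^\[\^([^\]\s]+)\]:\s*(.*)$", re.MULTILINE)

text = """Footnotes are convenient.[^1] Really.[^aside]

[^1]: Said nobody migrating a blog.
[^aside]: A hypothetical example definition.
"""

refs = REF.findall(text)
defs = dict(DEF.findall(text))
print(refs)       # ['1', 'aside']
print(defs["1"])  # Said nobody migrating a blog.
```

The ugly part is exactly what the prose describes: references and definitions come back in separate places, so reattaching each reference to its body is entirely your problem.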

In the end, what I did was write some shitty Common Lisp code to walk the AST that cl-cmark returns and export everything, then manually fix the few things that were too much of a pain to automate, like transforming the footnotes and accounting for either TOML or YAML frontmatter. I may put it up somewhere on the off-chance it will be helpful for anyone to read, but not right now, because it's 4am and I've been doing this all day. I'm tired, and annoyed, and this and the other post I just put up are the only things I've written in months. I just don't fucking care. Everything is always fucking broken and everything is utter shit.

Please God Let it End

It really astounds me how little most programmers seem to care about standards. The non-ISO-8601 timestamp format in org-mode is one small, mostly harmless, but very illustrative example of this. Is there any reason to make it the default in org-mode, other than that the programmer probably thought it looked nicer? Probably not, but it's there anyway, and it's one of a million little settings in org-mode that you won't realize is going to bite you in the ass and create work for you down the line if you ever decide to do something other than what the developer originally intended. Once again, it's perfectly fine to make software for yourself, but if you're expecting, even encouraging, other people to use it, then you should be upfront about the decisions that deviate from standards and could create more work for someone down the line who wants to hack on your code. Standards exist for a reason, and ignoring them while still trying to get people to use your software is frankly antisocial behavior.

In the Common Lisp world, following the ANSI Common Lisp spec is taken very seriously. A lot of important things, like threads and sockets, are implementation-dependent, because CL is from a time before multithreaded processors and networking were the norm. But you know what we do? We create compatibility libraries across the different implementations so that users have a choice in which implementation they use. The design of CL itself means that if an implementation has its own interface for something that isn't in the spec, it will be namespaced, while the CL namespace is reserved for things defined by the spec.[3] CL programmers take seriously the idea that you should conform strictly to a standard and then give users additional functionality on top of it, clearly and explicitly defined as exactly that: optional, non-standard features.

Perhaps if history had gone differently this wouldn't be the case; as things stand, Common Lisp programmers are incentivized, to the best of their ability, to write portable software. CL is a small community and ecosystem compared to most programming languages, but it has a long history and a lot of different high-quality implementations that interoperate with each other quite well, and the survival of the language depends on being a good citizen in the CL world, on making sure that our history is not thrown in the trash by the same reckless "move fast and break things" mentality of an industry that doesn't give a fuck about us. CL even has built into the language the ability to declare features at read-time, which is part of what makes all its compatibility libraries work so well. Depending on whether one is running SBCL, Clozure CL, ECL, etc., and even depending on the operating system, blocks of code will either be read or completely ignored. This extreme emphasis on compatibility means you can write CL code that declares features for running on a Symbolics Lisp Machine, of all things. It's not just that CL programmers are forced into not being antisocial; the language had actual thought put into its design and was built to last, despite the nitpicking complaints of people who think it's not sufficiently pretty or that it suffers from "designed by committee" syndrome.

Yet when I have to go out into the big wide world outside of the ruins of a once great civilization, I am terrified and disgusted at what most programmers put up with. Even the harshest criticism of CL's committee design quirks pales in comparison to this barbaric hellscape where technologies with hardly any thought put into them, or that are explicitly designed to be miserable to use, are forced onto programmers by a pervasive herd mentality – and usually these technologies not only have no standard at all but are effectively owned by a company.

[1] For a fun time, check out this post about the absolutely nightmarish state of software packaging, which has been ongoing since 2015! https://dustycloud.org/blog/javascript-packaging-dystopia/

[2] Oh wait, let me just get around the paywall for this Medium article that links the quote I'm remembering. sigh

[3] Which, by the way, allows for projects like Coalton to exist, adding a whole statically-typed language within CL that can also interoperate with ANSI-standard CL.

Created: 2024-01-31 Wed 04:02

Last updated: 2024-03-05 Tue 22:57