Archiving the Archive

In my attempts to wrangle this site migration I found myself with ~1,200 text files representing my blog archive. None of them were ready to be published, due to the various frontmatter values set from the export, but more than that, I wanted to take this chance to “fix” a lot of the content.

This isn’t the first time I’ve wanted to do this, and I suspect I’ve tried to do a content cleaning about half a dozen times over the years; I’ve kept the same domain, but my site structure and style are on a two- or three-year rotation.

Transitioning from WordPress to Hugo is a bigger challenge, but also means I kinda have to go through all my content. This is intentional. If I wanted I could automate most of the cleanup, but I want to go through each bit of content.

My initial plan was to “clean” up each piece so my errors and dumb opinions from the past would be lost to everyone except the Wayback Machine. The perfect plan! What could go wrong?

Well, it turns out that the only thing more powerful than my perfectionism is my loss aversion, specifically the idea that if I don’t keep files in version control they don’t exist. So now you can see my history, warts and all.

null, the place I try to forget

Why null? Because while most of that content will eventually make it into posts, and some will make it into different sections, like docs or static pages, I know that some of it will not go anywhere. There are some posts that just don’t make sense, like when I imported an old StatusNet site, one-sided responses and all. I don’t want to erase my history, just hide it discreetly.

But that means I don’t want Hugo to generate pages from that content section, at least for now. So I made a couple of adjustments to the sitemap and theme.

First, for the index page, I narrowed the range of pages that get listed, replacing the first line below with the second:

  {{ range .Data.Pages }}
  {{ range where .Data.Pages "Section" "posts" }}

Then I did the exact same thing in sitemap.xml, overriding the default Hugo sitemap.xml template.
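
For reference, here is a rough sketch of what such an override might look like. It assumes the file lives at layouts/sitemap.xml (check where your Hugo version looks for it) and it is a trimmed-down approximation of the stock template, not a copy of my actual file:

  {{ printf "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\" ?>" | safeHTML }}
  <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    {{/* Only pages in the "posts" section reach the sitemap. */}}
    {{ range where .Data.Pages "Section" "posts" }}
    <url>
      <loc>{{ .Permalink }}</loc>
      {{ if not .Lastmod.IsZero }}<lastmod>{{ .Lastmod.Format "2006-01-02T15:04:05-07:00" | safeHTML }}</lastmod>{{ end }}
    </url>
    {{ end }}
  </urlset>

The only line that matters here is the range; everything else is whatever the default template already emits.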

These are temporary measures, as I expect to have other sections, so I’ll eventually change those ranges to merely exclude the null section. But it gets the job done for now. ^_^
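
When I get there, the exclusion will probably look something like this. It is a sketch, assuming the section directory is literally named null and relying on the comparison operator that Hugo’s where function accepts:

  {{/* List every page except those in the "null" section. */}}
  {{ range where .Data.Pages "Section" "!=" "null" }}

That way any new section shows up automatically, and only null stays hidden.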