interi

maiki asks about hosting on the Internet Archive

The Ask

What should I be thinking about if I use the Internet Archive as my primary hosting for images and videos?

  • Are there technical concerns I should know?
  • Is this normal usage for the archive? Is it, -gulp-, weird?
  • Are there alternatives I should be aware of?

The Background

As part of my migration I need to figure out where to put ~250MB of uploads, and of course where I will continue to upload non-text.

For many static sites, folks just commit uploads to their repo, but that doesn’t sit well with me. I just don’t like putting a bunch of binary data into a repo, and even Git Large File Storage doesn’t handle the storage in a way that I appreciate.

I would prefer some kind of blob storage, while also keeping my repo small enough that I don’t have to pay particular attention to my deployment scripts to keep them from copying over half a gig at each deploy (currently my site is a whopping 2.7MB).

I looked at Minio as a storage option, but it still requires provisioning scalable hardware to accommodate storage, and if I am going to produce more video, I will end up burning through that faster than I want to support it.

As a fallback I decided to modify my app site to handle assets: acting as the storage engine, optimizing images when I upload them, and providing shortcodes that I can just drop into Hugo. Basically, I was leaning on the asset hosting practices I know and love with WordPress…

But two things put a kink in my think.

The first was the recent Creative Commons interview with Jessamyn West (emphasis mine):

I think the missing sharing point is unless you’re a cultural heritage institution who can bulk upload a bunch of stuff because you’ve been in touch with the Wikipedia organization and you know how to do it, it’s not going work very well. The average person doesn’t know they can get free storage for their content on Wikipedia. I feel like we still could have sharing tools that work better than the ones we have now that facilitated accurate licensing, but for people affected by the digital divide, the fact that they don’t have access to computers or the web means they don’t have access to a lot of other tools.

And then I saw that The Command Line Podcast hosts their audio on the Internet Archive.

Whaaaaat?

I want to do that! That would solve so many of my problems, while also assisting me in spreading the cultural objects I create, with the spirit of sharing that I celebrate! As long as there isn’t some “gotcha”!

All my content is dedicated to the public commons, designated by Creative Commons Zero. There are no issues there, but I just have never heard of anyone really doing this, which means I don’t know of any pitfalls I may encounter.
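For what it’s worth, the linking side seems simple. Files in an Internet Archive item are reachable at `https://archive.org/download/<identifier>/<filename>`, so a site generator only needs the item identifier and the file name. Here is a minimal sketch in Python, assuming a hypothetical item identifier (`maiki-uploads`) and hypothetical helper names (`archive_url`, `image_tag`); it only builds URLs and markup, and doesn’t upload anything or say anything about rate limits or uptime.

```python
from html import escape
from urllib.parse import quote

# Public download endpoint for files stored inside an archive.org item.
ARCHIVE_DOWNLOAD_BASE = "https://archive.org/download"


def archive_url(identifier: str, filename: str) -> str:
    """Build a direct URL for one file inside an Internet Archive item.

    `identifier` is the item's unique name; `filename` may include
    subfolders, so "/" is left unescaped in it.
    """
    return f"{ARCHIVE_DOWNLOAD_BASE}/{quote(identifier, safe='')}/{quote(filename)}"


def image_tag(identifier: str, filename: str, alt: str) -> str:
    """Return an <img> element pointing at the archived file."""
    src = archive_url(identifier, filename)
    return f'<img src="{escape(src, quote=True)}" alt="{escape(alt, quote=True)}">'


if __name__ == "__main__":
    # Hypothetical item and file names, for illustration only.
    print(archive_url("maiki-uploads", "photos/garden walk.jpg"))
    print(image_tag("maiki-uploads", "photos/garden walk.jpg", "A walk in the garden"))
```

The same URL pattern could live in a Hugo shortcode instead; the point is that the repo would only hold identifiers and file names, not the binaries themselves.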

Also, is it weird to post pictures of my kid to Wikipedia or the Internet Archive? It feels weird. I just don’t know if it is my kind of weird…