Letting Information Go

There's a lot of information stored all over the Internet about me, about you, about everyone. At best, most of it is useless and can just go away; at worst, it is potentially harmful. Molly Lewis has a humorous take on this.

The place where this is most obvious is social media. I really liked this post on old tweets by Vicki Lai, which talks about the why and how of deleting tweets, and it applies to all social media. But all of this got me thinking about my blog.

Blog posts tend to be more thought out (or at least I try), and they seem to me to be part of the larger web. So just deleting them after a set amount of time doesn't feel the same as it does for tweets. If someone were writing about the Unity HUD, I would hope they'd reference my HUD 2.0 post, as I loved the direction it was going. I have other posts that are... less significant. The most interesting posts are the ones that other people link to, so what I'm going to do is stop linking to old blog posts. That way, posts that aren't linked to by other people will stop being indexed by search engines and effectively disappear from the Internet. I have no idea if this will actually work.

The policy I settled on was to show the latest five posts on my blog page and to have the archives point to posts from the last two years. This means I need to write five posts every two years (easy, right?) to keep it consistent. It turned out that implementing it in Jekyll was a little tricky, but this post on Jekyll date filtering helped me put it together.
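For reference, here is a minimal sketch of the kind of Liquid filtering involved; the exact template on this site may differ, and the two-year cutoff expressed in seconds is an assumption:

{% comment %} Front page: only the five most recent posts {% endcomment %}
{% for post in site.posts limit:5 %}
  <a href="{{ post.url }}">{{ post.title }}</a>
{% endfor %}

{% comment %} Archive: only posts from the last two years (63072000 seconds) {% endcomment %}
{% assign cutoff = 'now' | date: '%s' | minus: 63072000 %}
{% for post in site.posts %}
  {% assign post_seconds = post.date | date: '%s' | plus: 0 %}
  {% if post_seconds > cutoff %}
    <a href="{{ post.url }}">{{ post.title }}</a>
  {% endif %}
{% endfor %}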

I think my attitude toward data is a generational difference. For my generation, the idea that we could have hard drives big enough to keep historical data is exciting. Talking to younger people, I think they understand that it is a liability. Perhaps fixing my blog is just me trying to be young.


posted Oct 1, 2018 | permanent link

Jekyll and Mastodon

A while back I moved my website to Jekyll for all the static-y goodness that it provides. Recently I was looking to add Mastodon to my domain as well. Doing so with Jekyll isn't hard, but when I searched it seemed like something no one had written up. So, for your searchable pleasure, I'm writing it up.

I used Masto.host to put the Mastodon instance at social.gould.cx, but I wanted my Mastodon address to be @ted@gould.cx. That means the gould.cx domain needs to point at social.gould.cx, which you do with a .well-known/host-meta file that redirects webfinger requests to the Mastodon instance:

<?xml version='1.0' encoding='UTF-8'?>
<XRD xmlns='http://docs.oasis-open.org/ns/xri/xrd-1.0'>

<!-- Needed for Mastodon -->
<Link rel='lrdd'
 type='application/xrd+xml'
 template='https://social.gould.cx/.well-known/webfinger?resource={uri}' />

</XRD>

The issue is that Jekyll doesn't copy static files that are in hidden directories. This is good when you have a Git repository, since it keeps the .git directory from being copied. We can get around it by using Jekyll's YAML front matter to set the location of the file.

---
layout: null
permalink: /.well-known/host-meta
---
<?xml version='1.0' encoding='UTF-8'?>
<XRD xmlns='http://docs.oasis-open.org/ns/xri/xrd-1.0'>

<!-- Needed for Mastodon -->
<Link rel='lrdd'
 type='application/xrd+xml'
 template='https://social.gould.cx/.well-known/webfinger?resource={uri}' />

</XRD>

This file can then be placed anywhere, and Jekyll will put it in the right location on the static site. And you can follow me as @ted@gould.cx even though my Mastodon instance is social.gould.cx.
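A quick way to sanity-check the setup once the site has been rebuilt is to fetch both endpoints with curl (a sketch; it assumes the domain and account names above):

# Should return the host-meta XML, served from the main domain
curl https://gould.cx/.well-known/host-meta

# Should return a JSON description of the account from the Mastodon instance
curl 'https://social.gould.cx/.well-known/webfinger?resource=acct:ted@gould.cx'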


posted Jan 29, 2018 | permanent link

Net change

Recently the FCC voted to repeal the previously held rules on net neutrality. I think this is a bad decision by the FCC, but I don't think it will result in the amount of chaos that some people are suggesting. I thought I'd write about how I see the net changing, for better or worse, with these regulations removed.

If we think about how the Internet is today, basically everyone pays individually to access the network: both the groups that want to host information and the people who want to access those sites. Everyone pays a fee for 'their connection', and those fees fund the companies that create the backbone and connect it together. An Internet connection by itself has very little value; it is the definition of a "network effect": because everyone is on the Internet, it has value for you to connect there as well. Some services you connect to use a lot of your home Internet connection, and some of them charge different rates for it, but independent of how much they use or charge, your ISP isn't involved in any meaningful way. The key change here is that now your ISP will be associated with the services that you use.

Let's talk about a theoretical video streaming service. Before, they'd charge something like $10 a month for licensing and their hosting costs. Now they're going to end up paying an access fee to reach consumers' Internet connections, so their pricing is going to change: they end up charging $20 a month and giving $10 of that to their customers' ISPs. In the end consumers will pay just as much for their Internet connection, but it'll be bundled into the other services they're buying on the Internet. ISPs love this because suddenly they're not the ones charging too much; they're out of the billing here. They could even charge less (free?) for home Internet access, as it'd be subsidized by the services you use.

Better connections

I think that it is quite possible that this could result in better Internet connections for a large number of households. Today those households have mediocre connectivity, and they can complain about it, but for the most part ISPs don't care about a few individuals' complaints. What could change is that when a large company paying millions of dollars in access fees is complaining, they might start listening.

The ISPs are supporting the removal of Net Neutrality regulations to get money from the services on the Internet. I don't think they realize that with that money will come an obligation to perform to those services' requirements. Most of those services are more customer-focused than ISPs are, which is likely to cause a culture shock once those services carry weight with ISP management. I think it is likely ISPs will come to regret not supporting net neutrality.

Expensive hosting for independent and smaller providers

It is possible for large services on the Internet to negotiate contracts with large ISPs and make everything generally work out so that most consumers don't notice. There is then a reasonable question about how providers that are too small to negotiate a contract play in this environment. I think it is likely that the hosting providers will fill this gap with different plans that match a level of connectivity. You'll end up with more versions of that "small" instance, some with consumer bandwidth built into the cost and others without. There may also be mirroring services like CDNs that have group-negotiated rates with various ISPs. The end result is that hosting will get more expensive for small businesses.

The bundling of bandwidth is also likely to shake up the cloud hosting business. While folks like Amazon and Google have been able to dominate costs through massive datacenter buys, suddenly that isn’t the only factor. It seems likely the large ISPs will build public clouds of their own as they can compete by playing funny-money with the bandwidth charges.

Increased hosting costs will hurt large non-profits the most, folks like Wikipedia and The Internet Archive. They already have a large portion of their budget tied up in hosting, and increasing that is going to make their finances difficult. Ideally ISPs and other Internet companies would help by donating to these amazing projects, but that's probably too optimistic; we'll need individuals to make up the gap. These organizations could be the real victims of not having net neutrality.

Digital Divide

A potential gain would be that, if ISPs are getting most of their money from services, the actual connections could become very cheap. There would then be potential for more lower-income families to get access to the Internet as a whole. While this is possible, the likelihood is that it would only happen in regions that have customers the end services themselves want: it will help those who are near an affluent area, not everyone. It seems there is some potential for gain, but I don't believe it will end up being a large impact.

What can I do?

If you're a consumer, there's probably not a lot; you're along for the ride. You can contact your representatives, and if this is a world that you don't like the sound of, ask them to change it. Laws are a social contract for how our society works, so make sure they're a contract you want to be part of.

As a developer of a web service you can make sure that your deployment is able to work on multi-cloud type setups. You're probably going to end up going from multi-cloud to a whole-lotta-cloud as each has bandwidth deals your business is interested in. Also, make sure you can isolate which parts need the bandwidth and which don't as that may become more important moving forward.


posted Dec 19, 2017 | permanent link

Replacing Docker Hub and Github with Gitlab

I've been working on making the Inkscape CI performant on Gitlab, because if you aren't paying developers you want to make developing fun. I started with implementing ccache, which got us a 4x build time improvement. The next piece of low-hanging fruit seemed to be the installation of dependencies, which rarely change but were getting installed on each build and test run. The Gitlab CI runners use Docker, so I set out to turn those dependencies into a Docker layer.

The well-worn path for doing a Docker layer is to create a branch on Github and then add an automated build on Docker Hub. That leaves you with a Docker repository that has your layer in it. I did this for the Inkscape dependencies with this fairly simple Dockerfile:

FROM ubuntu:16.04
# Install Inkscape's build dependencies in a single layer
RUN apt-get update -yqq \
 && apt-get install -y -qq <long package list>

For Inkscape, though, we'd really rather not set up another service with its own accounts and permissions, which led me to Gitlab's Container Registry feature. I took the same Git branch and added a fairly generic .gitlab-ci.yml file that looks like this:

variables:
  # Image lives in the project's registry, named for the branch, tagged 'latest'
  IMAGE_TAG: ${CI_REGISTRY}/${CI_PROJECT_NAMESPACE}/${CI_PROJECT_NAME}/${CI_COMMIT_REF_SLUG}:latest

build:
  image: docker:latest
  services:
    - docker:dind    # Docker-in-Docker so the job can run docker commands
  stage: build
  script:
    # Log in to the registry with the credentials Gitlab CI provides
    - docker login -u ${CI_REGISTRY_USER} -p ${CI_REGISTRY_PASSWORD} ${CI_REGISTRY}
    - docker build --pull -t ${IMAGE_TAG} .
    - docker push ${IMAGE_TAG}

That tells the Gitlab CI system to build a Docker layer named after the Git branch and put it in the project's container registry, which is where you can see the results for Inkscape.

We then just need to change the CI configuration for the Inkscape builds so that it uses our new image:

image: registry.gitlab.com/inkscape/inkscape-ci-docker/master
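The same image can also be pulled down locally to reproduce CI problems; a sketch, assuming Docker is installed and, for a private project, that you have logged into the registry first:

docker login registry.gitlab.com
docker pull registry.gitlab.com/inkscape/inkscape-ci-docker/master
docker run --rm -it registry.gitlab.com/inkscape/inkscape-ci-docker/master bash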

Overall, the results were a savings of approximately one to two minutes per build. Not the drastic improvement I was hoping for, but that is likely because the builders are more I/O constrained than CPU constrained, so uncompressing the layer costs roughly the same as installing the packages. It still results in a 10% savings in total pipeline time. The bigger, unexpected benefit is that it cleaned up the CI build logs, so the first page starts with the actual Inkscape build instead of pages of dependency installation to scroll through (old vs. new).


posted Jun 15, 2017 | permanent link

ccache for Gitlab CI

When we migrated Inkscape to Gitlab we were excited about setting up the CI tools that they have. I was able to get the build going and Mc got the tests running. We're off! The problem is that the Inkscape build was taking about 80 minutes, plus the tests. That's really no fun for anyone, as it's a walk-away-from-the-computer amount of time.

Gitlab has a caching feature in their CI runners that allows you to move data from one build to another. While it can be tricky to manage a cache between builds, ccache will do it for you on C/C++ projects. This took the rebuild time for a branch down to a more reasonable 20 minutes.

I couldn't find any tutorial or example on this, so I thought I'd write up what I did to enable ccache for people who aren't as familiar with it. Starting out our .gitlab-ci.yml looked (simplified) like this:

image: ubuntu:16.04

before_script:
  - apt-get update -yqq 
  - apt-get install -y -qq # List truncated for web

inkscape:
  stage: build
  script:
    - mkdir -p build
    - cd build
    - cmake ..
    - make

First you need to add ccache to the list of packages you install and set up the environment for ccache in your before_script:

before_script:
  - apt-get update -yqq 
  - apt-get install -y -qq # List truncated for web
  # CCache Config
  - mkdir -p ccache
  - export CCACHE_BASEDIR=${PWD}      # rewrite absolute paths so cache hits work across checkouts
  - export CCACHE_DIR=${PWD}/ccache   # keep the cache inside the project so Gitlab can save it

You then need to tell the Gitlab CI infrastructure to save the ccache directory:

cache:
  paths:
    - ccache/

And lastly, tell your build system to run the compiler through ccache; for us with CMake that means setting the COMPILER_LAUNCHER variables:

inkscape:
  stage: build
  script:
    - mkdir -p build
    - cd build
    - cmake .. -DCMAKE_C_COMPILER_LAUNCHER=ccache -DCMAKE_CXX_COMPILER_LAUNCHER=ccache
    - make

Final simplified .gitlab-ci.yml file:

image: ubuntu:16.04

before_script:
  - apt-get update -yqq 
  - apt-get install -y -qq # List truncated for web
  # CCache Config
  - mkdir -p ccache
  - export CCACHE_BASEDIR=${PWD}
  - export CCACHE_DIR=${PWD}/ccache

cache:
  paths:
    - ccache/

inkscape:
  stage: build
  script:
    - mkdir -p build
    - cd build
    - cmake .. -DCMAKE_C_COMPILER_LAUNCHER=ccache -DCMAKE_CXX_COMPILER_LAUNCHER=ccache
    - make
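One optional addition, not part of the configuration above, is printing ccache's statistics at the end of the job so the log shows whether the cache is actually being hit; something like this at the end of the script block:

    - make
    # Print hit/miss statistics so cache effectiveness is visible in the job log
    - ccache -s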

If you'd like to see the full version at the time of writing, it is there. Also, assuming you are reading this in the future, you might be interested in the current Gitlab CI config for Inkscape.


posted Jun 10, 2017 | permanent link

All the older posts...