I posted the following to Twitter the other day:
So, if all goes according to plan, all of my Twitter history up to yesterday-ish will be deleted, and I will have setup some code (that I control) that will delete everything older than 7 days on an ongoing basis.
I used to believe that everything posted on the Internet should stay, forever. I’m not so sure that is true. Published for public consumption, forever? Beginning to doubt that for most normal humans…
In the near future (weeks hopefully), I’m going to start automatically hiding old photos, blog posts, everything except those that seem really worthwhile to keep up indefinitely. I’m working on the rules still, curious what people think are good rules.
I’m almost certain that corporate social media policies, especially for public facing employees, should strongly recommend services that do the same – either delete or tighten up permissions after a window of time on public posts (And anything on FB)…
So, in short – you’re only going to see ~7 days of old tweets on my Twitter account. This post is about how I’m setting that up.
The short term hack
Twitter makes this hard (though I think this is unintentional). Specifically, they make it hard to access anything more than the last 3200 tweets in your account via the API. So, getting your account down to just the last 7 days ends up requiring two bits of software:
- Find a way to delete tweets older than my most recent 3200.
- Set up a process that watches my Twitter feed regularly and deletes tweets older than 7 (or whatever) days.
Deleting all of my tweets
I decided I would delete all of my tweets to begin with. If Twitter offered a native “archive” or unpublish option, a la Instagram, I might not have deleted everything. But they don’t, so this was my only option to start with a clean slate.[1]
I found a small script someone wrote on GitHub, forked it, and then modified it quite significantly. The script and instructions are on my GitHub account. You’ll need to be comfortable at the command line if you want to use it. It’s rough, and I offer no guarantees that it will run smoothly for you. Also, keep in mind – it will delete all of your tweets, and there is no undo. Keep your backup archive safe, and make sure this is what you want: delete everything.
To get around the 3200 tweet API issue mentioned above, the script uses the tweets.js file that comes in the data backup from Twitter, so the good thing is that you’re basically forced to download the backup to use the utility. That file contains the IDs for all of your tweets (among other things), which is all we need to issue the delete command for each tweet.
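To make the idea concrete, here is a minimal Python sketch of pulling the IDs out of tweets.js. It is not the script from my GitHub account. It assumes the file is a JavaScript assignment followed by a JSON array, that each entry is either a tweet object or a wrapper with a "tweet" key, and that the ID lives in an "id_str" field; the function name extract_tweet_ids is made up for illustration. Check your own archive before relying on any of this.

```python
import json


def extract_tweet_ids(js_text):
    """Return the tweet IDs found in the text of an archive tweets.js file."""
    # Assumption: the file looks like `<some variable> = [ ...JSON array... ]`,
    # so the first "[" marks the start of the JSON data.
    start = js_text.index("[")
    # raw_decode parses one JSON value and ignores anything after it,
    # such as a trailing semicolon or newline.
    entries, _end = json.JSONDecoder().raw_decode(js_text, start)

    ids = []
    for entry in entries:
        # Assumption: some archives wrap each tweet as {"tweet": {...}},
        # others store the tweet object directly.
        tweet = entry.get("tweet", entry)
        ids.append(tweet["id_str"])
    return ids
```

Each ID returned this way could then be handed to whatever delete call your Twitter client library provides. That part is left out here because it depends on the library and on your API credentials.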
The ongoing culling of my older tweets
Again, I started with someone else’s code. I found a nice little project written in Go that leveraged AWS Lambda to run the little bot. I used this project as a chance to brush up on my CloudFormation skills, as well. My fork, with CloudFormation templates, is also on my GitHub account. There’s even a handy “Launch Stack” button if you want to set it up on your own AWS account.
The bot runs every few hours, looks for tweets in my account older than the interval I’ve configured, currently set at 7 days, and deletes them if it finds anything. It’s all pretty simple.
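The core of that check is small enough to sketch. The bot itself is written in Go; what follows is an illustrative Python version of just the age filter, not a port of its code. It assumes each tweet is a dict with "id_str" and a "created_at" string in the format Twitter's API has traditionally used (for example "Fri Mar 01 12:00:00 +0000 2019"); the function name expired_tweet_ids is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: timestamps look like "Fri Mar 01 12:00:00 +0000 2019".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def expired_tweet_ids(tweets, max_age=timedelta(days=7), now=None):
    """Return the IDs of tweets created more than max_age before now."""
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - max_age

    expired = []
    for tweet in tweets:
        created = datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT)
        # Both datetimes are timezone-aware, so the comparison is safe.
        if created < cutoff:
            expired.append(tweet["id_str"])
    return expired
```

A scheduled job would fetch the recent timeline, pass it through a filter like this, and issue a delete for each ID returned. Passing now explicitly makes the function easy to test without waiting a week.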
Making this a thing
As I started working through this, I started thinking about enabling this for the other social media services I use. I don’t know why everything, from Flickr to Pinboard, doesn’t offer ephemerality as a feature. If the feature is offered, it should be the default. As I mentioned at the start, I don’t believe we, as people, are prepared for a world with total recall of our every utterance. My thoughts on this are complicated[2], but suffice to say, I am going to build tools that allow me to manage my social media presence following these guidelines.
I mentioned this to a few folks, and got a few enthusiastic “I want that for my account!” comments. So, I’m going to spin this up as a side project and see what I can cobble together. If you take a look at the code I linked to above, it’s very simplistic – fine for a single account, but not the best for a real service.
The other aspect of this I’m working on is governance. I don’t want to do this as a business – that’s not a goal. What I do want is a service that has a strong privacy stance, that offers high trust to folks that use it. One of the reasons I didn’t use the public services that are out there is that their business model is unclear.[3]
I am hoping to use this as an experiment in a cooperative form of governance for an online service, one where any charges are transparently used to maintain the service, where the source code is available for people to review, and where users can have some sort of assurance that the code that is released is the code that the hosted service is actually running. These seem like interesting problems regardless of the service being offered.
Because naming things is easily the most important and most fun part of any project (seriously, I have so many domain names!), I’ve decided to call this the Time Fades Project. A placeholder page is all that’s over there, but stay tuned for more.
If you have any interest in this sort of governance topic, or in contributing to the service, or in what a good set of default rules are for these sorts of ephemeral behaviors (I expect this will need to be different for different social networks), please get in touch.
[1] I didn’t feel too bad about this, because I had an out. As part of this process, I had to download my official Twitter data archive, which has everything. On top of that, I use a bookmarking service called Pinboard that has a feature that copies all my tweets and makes them searchable, privately, just for me. (It does require the paid archive feature in order to get the full text of the tweet. Otherwise it only stores a truncated version of the text.)
[2] For example, I’m not in favor of the right-to-be-forgotten laws even as I want services to offer that capability on the individual service level…