While I’m not terribly concerned about the privacy implications of Facebook having access to the fairly insignificant drivel I usually post there, what really bothers me about it is the fact that it represents a whole chunk of my life wrapped up in a company database that I can’t get access to, except through the Facebook web interface (which leaves much to be desired, particularly when it comes to archived material).
It’s not that I need my Facebook data so much as that it’s like a diary: all sorts of conversations, notes, random remarks, and social interactions that might once have been written in a diary or in letters; the sort of thing that would be nice to be able to page through and reminisce when I’m old and backwards-looking. There’s a practical component too – I’d like to be able to search my messages so I can find random notes, or remember what I sent to X about Y (again, the Facebook interface sucks in this area, though I note that in the last few days they’ve upgraded the messages UI once again, so maybe that will improve things – hey, pigs may one day fly through spaaaaacccce).
Anyway, I noticed this afternoon that at some point Facebook added a ‘download your data’ function to the account settings page, so I had to try it. Here’s what you get:
- Copies of all your photos, in the resolution that Facebook stores them (so, if you uploaded higher res versions that Facebook scaled down, you get the scaled down version, not your original; about what I’d expect).
- Copies of all your videos, again scaled down
- An HTML page containing your profile
- A single HTML page listing your photo galleries, including cover images, posting details, and gallery-level comments by other users
- A single HTML page that seems intended to contain all of the photos of you. Mine’s empty, which is odd, because there are many photos tagged as containing me. Might be that all the photos of me are owned by other users; I’m not sure I normally tag myself
- HTML pages for each of your photo albums, including captions, image posting notes, and comments by other users
- A single HTML page containing all of your events. Seems only to include recent events, and doesn’t include anything more than the event overview – you can’t recover lists of attendees or discussions
- An HTML page containing your friends list. Names only, no photos, no other info.
- A single HTML page containing all of your notes, including comments by other users
- A single HTML page, apparently containing all your videos, including posting notes and comments by other users. I say apparently because I only had the one video up, but it looks like it’s meant to be a list page
- A single HTML page containing your whole wall. Contains all your wall posts, including all comments by other users. Doesn’t include everything posted to your wall, though; third party notifications seem largely absent, for example. Of course, they’re mostly junk and advertising, anyway, so I’m OK with that. Obviously, this page can be quite big
- A single HTML page containing all of your messages, organized into conversations (as in Gmail). This page can also be rather large
All of the HTML pages are stripped of Facebook’s look and feel, and contain no extraneous links (they only link to other files in the download set, such as photos). Internally, they’re fairly well structured, with tagged divs and spans for most of the key elements, so they should be fairly trivial to parse if one was so inclined.
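For the curious, here is a minimal sketch of what that parsing might look like, using only Python's standard library. It pulls out the text of every element carrying a given class. Note that the class names `message` and `author`, the sample markup, and the helper names `ClassTextExtractor` and `extract_by_class` are all placeholders invented for illustration; the real export uses its own markup, so check the files and swap in whatever names you actually find there.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; skipped so nesting depth stays balanced.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassTextExtractor(HTMLParser):
    """Collect the text inside every element that carries a given class."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.depth = 0      # > 0 while inside a matching element
        self.chunks = []    # text pieces of the element being read
        self.results = []   # one cleaned-up string per matching element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            # Already inside a match: just track nesting.
            self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.wanted_class in classes:
            self.depth = 1
            self.chunks = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            # Matching element closed: join its text and collapse whitespace.
            self.results.append(" ".join("".join(self.chunks).split()))

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_by_class(html, wanted_class):
    """Return the text of each element in `html` with class `wanted_class`."""
    parser = ClassTextExtractor(wanted_class)
    parser.feed(html)
    parser.close()
    return parser.results


# Made-up markup standing in for a page from the download.
sample = (
    '<div class="message"><span class="author">Alice</span>'
    '<br> Lunch on <b>Friday</b>?</div>'
    '<div class="message">Sure</div>'
)

# A crude search: keep only the messages that mention a keyword.
hits = [m for m in extract_by_class(sample, "message") if "lunch" in m.lower()]
print(hits)
```

Once the messages are in a plain Python list like this, searching them is a one-line filter, which would cover the "what did I send to X about Y" case. The depth counter assumes reasonably well-formed HTML; for messier pages a more forgiving parser would be the safer choice.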
While the pack doesn’t contain everything I’ve ever posted on Facebook, it’s pretty thorough. The main things I can’t recover are comments that I’ve posted on other people’s walls or items. It seems those only come out if that user does an export. It’d be nice if this was all a little more structured – some XML linking everything together, for example, and perhaps links back to the original content. But the main problem, for me, is solved – I can recover the bulk of what I’ve put on Facebook, which means I’m no longer locked in if (when) they turn out to be evil, nor do they have any monopoly over these artifacts of my existence. They’re all going into my document repository for storage.