An SQLite database of the WikiLeaks 9/11 pager data

(Updated database to v1.1 on 11/30/09 at 11:49 AM)

This download contains all alphanumeric messages from the WikiLeaks 9/11 dataset as an SQLite3 database:

911.wikileaks.org.sqlite3db_v1.1.zip, 14MB (uncompressed: 44MB)

There are three tables - textTable (main), emailTable and urlTable - with the following schemas:

  • textTable: timestamp DATETIME, service TEXT, senderID INTEGER, text TEXT, key INTEGER PRIMARY KEY
  • emailTable: address TEXT, domain TEXT, textKey INTEGER, key INTEGER PRIMARY KEY
  • urlTable: url TEXT, textKey INTEGER, key INTEGER PRIMARY KEY

Notes:
  • The textKey fields in emailTable and urlTable point to the primary key (key) of the corresponding entry in textTable.
  • In the senderID field, a string of all zeros stands in for the same number of question marks in the original message, because I wanted this field to be integer-typed.
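
To make the textKey relationship concrete, here is a minimal sketch in Python using the standard sqlite3 module. It builds a small in-memory database with the schemas listed above and joins emailTable back to textTable. The sample rows, the example.com addresses, and the helper names build_demo_db and messages_for_domain are all invented for illustration; none of them come from the real dataset.

```python
import sqlite3

# Table definitions copied from the schema list above.
SCHEMA = """
CREATE TABLE textTable (timestamp DATETIME, service TEXT, senderID INTEGER,
                        text TEXT, key INTEGER PRIMARY KEY);
CREATE TABLE emailTable (address TEXT, domain TEXT, textKey INTEGER,
                         key INTEGER PRIMARY KEY);
CREATE TABLE urlTable (url TEXT, textKey INTEGER, key INTEGER PRIMARY KEY);
"""


def build_demo_db():
    """Create an in-memory database with two made-up messages (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)

    insert_text = ("INSERT INTO textTable (timestamp, service, senderID, text) "
                   "VALUES (?, ?, ?, ?)")
    # SQLite assigns the INTEGER PRIMARY KEY automatically; lastrowid returns it.
    first_key = conn.execute(
        insert_text,
        ("2001-09-11 08:50:00", "ExampleService", 1234567,
         "please reply to ops@example.com"),
    ).lastrowid
    conn.execute(
        insert_text,
        ("2001-09-11 09:05:00", "ExampleService", 7654321, "call the office"),
    )

    # The emailTable row points back at the first message through textKey.
    conn.execute(
        "INSERT INTO emailTable (address, domain, textKey) VALUES (?, ?, ?)",
        ("ops@example.com", "example.com", first_key),
    )
    conn.commit()
    return conn


def messages_for_domain(conn, domain):
    """Return (timestamp, address, text) for messages mentioning an email domain."""
    query = """
        SELECT t.timestamp, e.address, t.text
        FROM emailTable AS e
        JOIN textTable AS t ON t.key = e.textKey
        WHERE e.domain = ?
        ORDER BY t.timestamp
    """
    return conn.execute(query, (domain,)).fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for row in messages_for_domain(demo, "example.com"):
        print(row)
```

To try the same query against the real data, replace the sqlite3.connect(":memory:") call with the path to the unzipped database file and skip the inserts. The same join pattern should work for urlTable, since it uses textKey in the same way.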

More to come, including the script used to generate this database (GitHub project), a more robust script and database that include email and URL lookup tables (added in the v1.1 update on 11/30/09), and hopefully some cool analysis of all this amazing data.

Lastly: it turns out that this Jeff Clark guy has already done a bunch of analysis along the lines of what I was thinking about doing, and it’s great stuff. In any case, hopefully this database will make things a tad easier for others wishing to do interesting analysis of their own.