Thousands piled on at the last minute to push Wikileaks over 1M Facebook Fans

Sometime a little after 5pm EST on Dec 7, 2010, Wikileaks gained its 1 millionth Facebook fan. That’s about 10 hours after my prediction, as the growth rate slowed through the night (duh) and didn’t return to the previous day’s pace during most of the day.

Here’s the chart, showing Facebook likes (fans) in blue and Twitter followers in red. The gaps in the lines are gaps in data sampling. I sampled every 5 minutes.

Notice the funky little blip in the blue line right around the 1 million mark? Here’s another chart that zooms in on that timeframe within the red circle:

A significant acceleration starts between 1705 and 1710 EST. During this period the average number of new likes per minute (LPM, it’s the new RPM) goes from 89 or so up to a peak of 516 at 1715 EST, and then decelerates back down to 83 LPM at 1800 EST, followed by a continuing decline from there onwards.

At first I thought the lift was tied solely to prime-time news on the East Coast of the US and Canada, i.e. people might have logged on to Facebook immediately after hearing news stories about Wikileaks. That probably happened, to some degree. But if that were the only factor, we should have expected a similar bump, or sustained high growth, through the 1800-1900 hour, which is also news-heavy. And that’s not the case, as you can plainly see.

My best guess is that a crowd of people were aware the Wikileaks Facebook page was almost at 1M fans, and they all piled on when it got close to push Wikileaks over the 1M mark. Probably some of them hoped to be the 1 millionth fan. (Is there a badge for that?)

In case you’re interested, below is the raw data for 1500 EST to 2000 EST. Because I’m sampling every 5 minutes, the last column is an average: the number of new likes since the previous sample, divided by 5.
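For anyone who wants to reproduce the likes-per-minute figures, here is a minimal Python sketch of the calculation. It is an illustration and not the script I used to collect the data: the function names `likes_per_minute` and `first_crossing` are made up for this example, and only a handful of rows from the raw data are included. One assumption worth flagging: it divides by the actual minutes elapsed between samples instead of a fixed 5, so an irregular gap (such as 19:05 to 19:11) would come out slightly lower than a fixed divide-by-5.

```python
from datetime import datetime

# A few rows copied from the raw data (timestamp in EST, cumulative likes).
SAMPLES = [
    ("2010-12-07 17:00", 992730),
    ("2010-12-07 17:05", 993175),
    ("2010-12-07 17:10", 994691),
    ("2010-12-07 17:15", 997270),
    ("2010-12-07 17:20", 999698),
    ("2010-12-07 17:25", 1001745),
]

TIME_FORMAT = "%Y-%m-%d %H:%M"


def likes_per_minute(samples):
    """Return (timestamp, average new likes per minute) for each sample after the first."""
    rates = []
    for (t0, n0), (t1, n1) in zip(samples, samples[1:]):
        start = datetime.strptime(t0, TIME_FORMAT)
        end = datetime.strptime(t1, TIME_FORMAT)
        minutes = (end - start).total_seconds() / 60
        rates.append((t1, round((n1 - n0) / minutes)))
    return rates


def first_crossing(samples, threshold):
    """Return the timestamp of the first sample at or above threshold, or None."""
    for timestamp, likes in samples:
        if likes >= threshold:
            return timestamp
    return None


rates = likes_per_minute(SAMPLES)
peak = max(rates, key=lambda rate: rate[1])

print(peak)                                # ('2010-12-07 17:15', 516)
print(first_crossing(SAMPLES, 1_000_000))  # 2010-12-07 17:25
```

With 5-minute sampling, the crossing time is only known to the nearest sample: the page passed 1M somewhere between the 17:20 and 17:25 readings.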

Why bother with all this analysis? Ethics and aims of Wikileaks aside, I just think this is an interesting phenomenon at the intersection of math, psychology, and social networks. It’s not often you get to watch a meme go viral, in realtime, with accurate metrics available for tracking it. Lucky us.

I’ll keep my samples running for a while, in case anything else interesting happens.

Date/Time (EST)    Facebook Likes    Avg. new likes/minute
2010-12-07 15:00   982510            86
2010-12-07 15:05   982962            90
2010-12-07 15:10   983389            85
2010-12-07 15:15   983855            93
2010-12-07 15:20   984282            85
2010-12-07 15:25   984725            89
2010-12-07 15:30   985115            78
2010-12-07 15:35   985505            78
2010-12-07 15:40   985940            87
2010-12-07 15:45   986320            76
2010-12-07 15:50   986728            82
2010-12-07 15:55   987136            82
2010-12-07 16:00   987557            84
2010-12-07 16:05   987979            84
2010-12-07 16:10   988429            90
2010-12-07 16:15   988856            85
2010-12-07 16:20   989309            91
2010-12-07 16:25   989751            88
2010-12-07 16:30   990164            83
2010-12-07 16:35   990584            84
2010-12-07 16:40   990998            83
2010-12-07 16:45   991404            81
2010-12-07 16:50   991845            88
2010-12-07 16:55   992301            91
2010-12-07 17:00   992730            86
2010-12-07 17:05   993175            89
2010-12-07 17:10   994691            303
2010-12-07 17:15   997270            516
2010-12-07 17:20   999698            486
2010-12-07 17:25   1001745           409
2010-12-07 17:30   1003400           331
2010-12-07 17:35   1004748           270
2010-12-07 17:40   1006032           257
2010-12-07 17:45   1006595           113
2010-12-07 17:50   1007818           245
2010-12-07 17:55   1008331           103
2010-12-07 18:00   1008748           83
2010-12-07 18:05   1009130           76
2010-12-07 18:10   1009483           71
2010-12-07 18:15   1009836           71
2010-12-07 18:20   1010182           69
2010-12-07 18:25   1010523           68
2010-12-07 18:30   1010863           68
2010-12-07 18:35   1011204           68
2010-12-07 18:40   1011495           58
2010-12-07 18:45   1011804           62
2010-12-07 18:50   1012134           66
2010-12-07 18:55   1012420           57
2010-12-07 19:00   1012703           57
2010-12-07 19:05   1012968           53
2010-12-07 19:11   1013228           52
2010-12-07 19:16   1013496           54
2010-12-07 19:21   1013766           54
2010-12-07 19:26   1013991           45
2010-12-07 19:31   1014236           49
2010-12-07 19:36   1014457           44
2010-12-07 19:41   1014672           43
2010-12-07 19:46   1014882           42
2010-12-07 19:51   1015105           45
2010-12-07 19:56   1015321           43

Open Data: Making Toronto a Better Place to Live

Several months back Toronto mayor David Miller announced the city would embark on an “Open Data” initiative, with first steps to show by this fall. Well, fall fast approaches, and the city’s Open Data website is still a blank slate. While we don’t know yet what Open Data will be, lots of people have notions of what it ought to be. Here’s mine:

Start with a Sandbox and some Dogfood

It’s been a difficult summer for the city. The strike took a lot of resources offline, including people who would otherwise have been helping formulate and deliver the first batch of data. So while fall is probably still doable, plans for a first release must surely have been scaled back. The first go-round will have to focus on low-hanging fruit: data that happens to be readily available, privacy-clean, politically non-threatening, and already in machine-readable format.

Let’s also recognize that the first release, like any version 1, wants to be a pilot / proof of concept, not a polished product. I imagine the city will publish some sample data feeds (the “dogfood”), encourage people to build a few apps that consume the data (“dogfooding”), and then evolve a repeatable process around that while putting together a viable longer-term plan. That would be just ducky.

Beyond the first release, getting the city into the business of publishing and consuming data is a huge challenge. Technology is the least of the difficulties. It’s a huge prioritization problem, for one… the city needs to develop clear tenets, guidelines, and processes for deciding which requests to bubble to the top of the stack. And there are many “soft” barriers to overcome, including union fears (must automation lead to job losses?), privacy concerns, liability risks, and — probably most difficult — the turf struggles that will surely arise from trying to pry data out of people’s hands.

But this is not a blog post about Fear, Uncertainty, and Doubt. It’s about a happy world where the city overcomes its inertia, rises to the challenge, and Does Great Things. So let’s consider the question of what the data itself should be.

A Framework for Data Selection

If I were running the show, my framework for data selection would look something like this:

  1. Solve real people’s problems: focus on data that real people are requesting in order to solve real-world problems. Ignore data that’s “looking for a problem to solve”, even if that data happens to be convenient to obtain and process. In other words, stay customer- and solution-driven, not expediency- and politics-driven.
  2. Satisfy the customers: those “real people” we want to satisfy break down into three groups: citizens, non-government organizations (both for-profit and not-for-profit), and government itself.
  3. Produce net benefit to society: the data’s benefits to society at large should outweigh the data processing costs. Benefit will be hard to measure in some cases. In other cases the benefit will be crystal clear in terms of dollars, e.g. money saved, time saved. Either way I say let’s measure, and get better at measuring, so that we can set goals and quantify our progress over time.
  4. Keep it clean: obviously the data must be OK to release from a privacy perspective, and it shouldn’t expose the city to unreasonable legal risk. That said, I would be perfectly happy with a license that exempted the city from all liability due to things like errors in the data, and I bet a lot of other people and companies would too. After all, we’ve signed a bunch of other licenses just like that for most other online data services we consume, including mission critical services like email and online document storage.
  5. Keep it fresh: the data can (and indeed, must) be refreshed periodically so that it doesn’t go stale. That implies an up-front commitment to continual publishing. Open Data isn’t a one-shot deal.
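To make the framework a bit more concrete, here is a rough Python sketch of the five criteria applied as a checklist to candidate datasets. Everything in it is hypothetical: the `Candidate` fields, the `unmet_criteria` function, the second dataset, and the dollar figures are invented for illustration, and a real process would need much richer measures of benefit and risk than a few numbers and booleans.

```python
from dataclasses import dataclass
from typing import Optional

# The three customer groups from criterion 2.
CUSTOMER_GROUPS = {"citizens", "organizations", "government"}


@dataclass
class Candidate:
    name: str
    requested_by: set            # customer groups asking for it (criteria 1 and 2)
    annual_benefit: float        # estimated dollars saved per year (criterion 3)
    annual_cost: float           # estimated processing cost per year (criterion 3)
    privacy_cleared: bool        # criterion 4
    refresh_days: Optional[int]  # criterion 5; None means no refresh commitment


def unmet_criteria(candidate):
    """Return the framework criteria this candidate fails (empty list = shortlist it)."""
    problems = []
    if not (candidate.requested_by & CUSTOMER_GROUPS):
        problems.append("no real requester")
    if candidate.annual_benefit <= candidate.annual_cost:
        problems.append("no net benefit")
    if not candidate.privacy_cleared:
        problems.append("not privacy-clean")
    if candidate.refresh_days is None:
        problems.append("no refresh commitment")
    return problems


# Made-up candidates with made-up numbers, purely for illustration.
candidates = [
    Candidate("road repair service records", {"citizens", "government"},
              120_000, 40_000, True, 7),
    Candidate("archived committee seating charts", set(),
              0, 5_000, True, None),
]

shortlist = [c.name for c in candidates if not unmet_criteria(c)]
print(shortlist)  # ['road repair service records']
```

Returning the list of failed criteria, instead of a bare yes or no, is deliberate: it tells whoever proposed the dataset exactly what would have to change for it to make the cut.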

Open Data = Data In + Data Out

Almost all the examples I’ve read about open data initiatives are “Data Out”, i.e. cities publishing municipal data such as budget and contract details, service records for road repair, traffic flow, and so on, for the general public to consume. This is useful and necessary stuff, but there’s another equally important category I’ll refer to as “Data In”.

Data In is about society at large publishing data which the city consumes. For example, citizens noting the location of major potholes and failed streetlights; community service organizations reporting on how many people they are reaching, and how effectively (an idea Jane Zhang at TechSoup Canada is passionate about); schools reporting student attendance numbers, and so on. There’s a massive amount of “scouting” that can be done by citizens on behalf of the city, in effect crowd-sourcing information to help the city operate more efficiently and decide where to focus its limited resources. Citizens are incented to do it because they want their tax dollars spent efficiently.
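As a toy illustration of the “scouting” idea, here is a short Python sketch that takes citizen reports and ranks locations by how many different people flagged them. The `Report` record, the `rank_hotspots` function, the intersections, and the reporter names are all assumptions made up for this example; a real Data In feed would need proper geocoding, spam handling, and privacy review.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    kind: str       # e.g. "pothole" or "streetlight"
    location: str   # free-text intersection, for simplicity
    reporter: str   # who filed the report


def rank_hotspots(reports, kind):
    """Rank locations by the number of distinct reporters who flagged this kind of problem."""
    seen = set()
    counts = Counter()
    for report in reports:
        if report.kind != kind:
            continue
        key = (report.location, report.reporter)
        if key in seen:
            continue  # the same person reporting the same spot counts once
        seen.add(key)
        counts[report.location] += 1
    return counts.most_common()


# Hypothetical reports, purely for illustration.
reports = [
    Report("pothole", "Queen St & Spadina Ave", "alice"),
    Report("pothole", "Queen St & Spadina Ave", "bob"),
    Report("pothole", "Queen St & Spadina Ave", "alice"),
    Report("pothole", "Bloor St & Bathurst St", "carol"),
    Report("streetlight", "Bloor St & Bathurst St", "dave"),
]

print(rank_hotspots(reports, "pothole"))
# [('Queen St & Spadina Ave', 2), ('Bloor St & Bathurst St', 1)]
```

Counting distinct reporters, not raw reports, is one simple way to keep a single enthusiastic citizen from pushing their own street to the top of the list.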

“Data In” is the reason I list government itself as one of the key Open Data customers. As part of the planning process the city should be asking each of its departments for their own wish lists of data that society at large could provide in order to help them do their jobs better. Furthermore, those departments should be dogfooding the exact same data services that we the public consume. This process — internal dogfooding, and being your own customer — has a powerful built-in bias towards self-correction and accountability. You can bet the quality of city-published data feeds will be high, for instance, if internal city processes depend on those same feeds.

More to come…

I’ll write more about Open Data in the coming months. I’m selfishly hoping the city will publish some data we find useful for 5 Blocks Out, if only to save us from transcoding information trapped in PDFs (what’s with disabling copy-paste in PDFs?), and from hearing “Sorry, you’ll have to file an Access to Information Request Form for that” when we call our friends at City Hall. We can do better. Much better. Onwards!
