Category Archives: Uncategorized

I swear, I didn’t work on Windows 7

In the silly-marketing-antics department, my buddy Scott forwarded this screenshot around yesterday. It’s from Steve Ballmer’s Windows 7 launch keynote slides.

Win7_Launch

In that main triangle of people in the center, the back three rows or so are lifted directly from a photo of the Live Search team. I’m in it (back row, 3rd from the left), so it must have been taken in late 2004 or early 2005.

Scott’s in it too, from many moons ago when he worked at Microsoft. In fact he’s in there at least twice.

Can’t wait for my ship-it award!

Join the Toronto Open Data Community

Toronto, we have lift-off!

The City is hosting an “Open Data Lab” on Nov 2 to kick off their community engagement on open data.

The Open Data Lab is an opportunity to explore the innovation possibilities of open civic data in Toronto. Join City subject matter and technology experts, community stakeholders and talented members of Toronto’s vibrant technology and design communities in an interactive and collaborative afternoon imagining commercial, social and civic applications of the City’s newly launched open data program.

Let’s get started.

Mint CEO on Startup Building

Mint.com CEO Aaron Patzer recently did a great presentation called “Everything you wanted to know about startup building, but were afraid to ask”. In it he chronicles the development of Mint.com from the germ of an idea all the way to exit. (Mint sold to Intuit for $170M.)

It’s not just an interesting story. He talks in fine detail about costs, salaries he paid himself and employees, equity, business model… all the things you’d need to know if you wanted to build your own company. This sort of detail is usually kept hush-hush.

The accompanying slides are here.

Every startup founder should watch this.

New York City is doing Open Data too

With an apps competition, no less.

The New York Times says:

Contestants will have access to more than 170 data sets supplied by over 30 city agencies, including weekly traffic updates, schedules of citywide events, property sales, restaurant inspections and mappable data around school and voting districts.

Exciting!

We ought to do something of similar spirit in Toronto, once people have had a chance to dogfood a version 1 “sandbox” release of open data.

See also: Open Data: What’s on Your Wish List?, Open Data: Making Toronto a Better Place to Live

5 Blocks Out has a Blog

Katrin and I have been working for a while now on a project called “5 Blocks Out”. We’ve just launched a blog for it, which you can find at blog.5blocksout.com.

The 5 Blocks Out Blog will mainly chronicle the development of the site. We’ll also post some city-related musings from time to time, similar in spirit to the posts Katrin has been writing on the Mukodu Blog.

Currently 5BlocksOut.com is in the early stages of development – but already a great deal of fun. So whether you are interested in hearing about early stage start-ups, adventures in the City or how to get the most out of the site, the blog should have something for you.

Send me a line if you would like to learn more. Or check out the new blog as it gains momentum.

Open Data: What’s on Your Wish List?

Fall is here, and I’m eagerly awaiting the sandbox and first release of City of Toronto “open data”. Since I work every day with data on 5 Blocks Out I’ve been thinking about what I would like to see opened up. In that spirit, here’s a short personal wish list.
First, some general “tenet” wishes:
- Publish all data in machine-readable open standard formats instead of just PDFs and unstructured text. JSON, XML, CSV, iCal, etc.
- Publish standard format street addresses, at least, for all location-based data. Better yet is a latitude/longitude pair.
- Publish time-based information such as events in a calendar format such as iCal.
- Any refreshable dataset needs unique durable IDs for every object in the data so readers can detect changes over time.
- Document the data at least enough for people to understand and use it productively.
Second, a few specific data sets that would be very useful:
Lists of places, including place name and location information. A foundational part of what we’re doing on 5 Blocks Out involves situating thousands of places on maps. We’re interested in all kinds of places… businesses, private organizations, and government facilities such as parks, community centres, and libraries. To do this we need trustworthy data sources with at least a name and location for each place. Ideally the location is already stated as a latitude/longitude, but street addresses also suffice as they can be geocoded into latitude/longitude using various free geocoding web services.
In addition to a place name and location we try to describe each place. For example, if it’s a business or a private organization, what sort of products and services does it provide? If it’s a government facility, what services does it offer to the public? How might one contact this place via phone, email, or fax? Is there a website URL available? And so on. This information is useful, but not essential.
Lastly, we look for unique identifiers, so that we can tell places apart and identify changes in place information over time. For instance, if a business moves from 123 Main Street to 245 Main Street, how do we know it’s the same business? Some data sources publish unique ID information that enables us to detect this sort of change. It turns out that without unique IDs to rely on, you need to come up with funky duplicate detection heuristics.
Here are examples of name-and-location information data sources the City of Toronto publishes today that we would love to have in easily machine-readable format:
- DineSafe lists restaurant names & addresses along with inspection data http://app.toronto.ca/food2/DineSafeMain
- Community centres: http://www.toronto.ca/parks/recreation_facilities/community-centres/index.htm
- Parks: http://www.toronto.ca/parks/parks_gardens.htm
Event data. Like many websites, we’re interested in publishing calendars of events happening throughout the city. We define “event” broadly to mean anything interesting that is time-based. An event might be a major street festival, a running race, a sporting event, a concert, or a city councillor doing a public consultation meeting. Jon Udell has done some great work on this with his ElmCity project; check it out if you’re publishing an event calendar already, or thinking about it. We endorse Jon’s idea of publishing in iCal and related formats.  We’d like to see this go a step further and have each event item include a location, as described above.
Again, the City of Toronto already publishes event data, just not in a machine-readable format. Here’s the Toronto Festivals and Events Calendar, for example http://wx.toronto.ca/festevents.nsf/.
TTC route, stop, and vehicle location information. This sort of data is obviously useful for building all kinds of apps that help people get around the city. Kieran and Kevin have done a great job reverse-engineering TTC route and stop timing data on http://myttc.ca. TTC should provide an official data stream to enable apps like theirs. The data is already in PDF format as route schedules.
There’s lots more on my list, but these are near the top.
What’s on your wish list?

Fall is here, and I’m eagerly awaiting the first release of City of Toronto “open data”. I’ve been thinking about what I’d like to see offered, both for data-hungry citizens in general and, more greedily, for accelerating our progress on 5 Blocks Out. In that spirit, here’s a short wish list.

First, some general “tenet” suggestions:

  • If you’re publishing it for humans, publish it for machines too. We need data in machine-readable open standard formats like JSON, XML, CSV, iCal, and so on. Not just PDF.
  • Publish standard format street addresses, at minimum, for all location-based data. Better yet is a latitude/longitude pair.
  • Publish time-based information such as events in a calendar format such as iCal.
  • Any refreshable dataset needs unique durable IDs for every object in the data set so that machine readers can detect changes over time.
  • Document the data at least enough for people to understand and use it productively. This sounds like a no-brainer, but apparently it has been a blocking issue in use of open data in other cities.

Second, here are a few specific data sets I would find useful for the work I’m doing:

1. Lists of places, including place name and location information.

A foundational part of what we do on 5 Blocks Out involves situating thousands of places on maps. We’re interested in all kinds of places, including businesses, private organizations, and government facilities such as parks, community centres, and libraries. To do this we need trustworthy data sources with at least a name and location for each place.  Ideally the location is already stated as a latitude/longitude, but street addresses also suffice as they can be geocoded into latitude/longitude using various free geocoding web services.

In addition to a place name and location we try to describe each place. For example, if it’s a business or a private organization, what sort of products and services does it provide? If it’s a government facility, what services does it offer to the public? How might one contact this place via phone, email, or fax? Is there a website URL available? And so on. This descriptive info is useful, but not essential.

Lastly, we look for unique identifiers, so that we can tell places apart and identify changes in place information over time. For instance, if a business moves from 123 Main Street to 245 Main Street, how do we know it’s the same business? Some data sources include unique ID information that enables us to detect this sort of change. It turns out that without unique IDs to rely on, you need to come up with funky duplicate detection heuristics.
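
To make the unique-ID point concrete, here is a minimal Ruby sketch of how a reader might compare two snapshots of a place list. Everything in it is an assumption made for illustration: the `Place` struct, its `id`, `name`, and `address` fields, the `diff_snapshots` helper, and the sample records are all hypothetical, not part of any City data set.

```ruby
# Hypothetical place record; the field names are illustrative only.
Place = Struct.new(:id, :name, :address)

# Compare two snapshots of a data set, keyed by each record's durable ID.
# Returns the IDs that were added, removed, or changed between snapshots.
def diff_snapshots(old_places, new_places)
  old_by_id = old_places.map { |place| [place.id, place] }.to_h
  new_by_id = new_places.map { |place| [place.id, place] }.to_h

  {
    :added   => new_by_id.keys - old_by_id.keys,
    :removed => old_by_id.keys - new_by_id.keys,
    # Same ID in both snapshots, but some field differs (e.g. a new address).
    :changed => (old_by_id.keys & new_by_id.keys).reject { |id| old_by_id[id] == new_by_id[id] }
  }
end

# Made-up sample data: one business moves, one closes, one opens.
last_month = [
  Place.new("B-1001", "Main Street Bakery", "123 Main Street"),
  Place.new("B-1002", "Corner Books", "9 Elm Avenue")
]
this_month = [
  Place.new("B-1001", "Main Street Bakery", "245 Main Street"),
  Place.new("B-1003", "Harbour Cafe", "77 Queens Quay")
]

puts diff_snapshots(last_month, this_month).inspect
```

With a stable ID, the move from 123 to 245 Main Street shows up as a change to one record. Without it, the same move looks like one business disappearing and another appearing, which is where the duplicate detection heuristics come in.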

Here are examples of name-and-location information data sources the City of Toronto publishes today that we would love to have in easily machine-readable format:

- DineSafe lists restaurant names & addresses along with inspection data

- Community centres

- Parks

2. Event data

We’re interested in publishing calendars of events happening throughout the city. We define “event” broadly to mean anything interesting that is time-based… everything from a major street festival, to a sporting event, to a city councillor doing a public consultation meeting. Jon Udell has done some great work on this with his ElmCity project; check it out if you’re publishing an event calendar already, or thinking about it. We endorse Jon’s idea of publishing in iCal and related formats.  We’d like to see this go a step further and have each event item include a location, as described above.
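
As a rough illustration of what an event with a location could look like, here is a small Ruby sketch that builds one iCalendar VEVENT carrying both a street address (LOCATION) and a latitude/longitude pair (GEO). The `ical_event` and `ical_escape` helpers, the event itself, its UID, and the coordinates are all made up for the example; a real feed would wrap events like this in a VCALENDAR block.

```ruby
# Escape backslashes, semicolons, and commas as iCalendar text values require.
def ical_escape(text)
  text.gsub(/[\\;,]/) { |char| "\\#{char}" }
end

# Build one VEVENT as a string. All arguments are illustrative; times are UTC.
def ical_event(uid:, summary:, starts_at:, ends_at:, address:, lat:, lon:)
  fmt = "%Y%m%dT%H%M%SZ"
  [
    "BEGIN:VEVENT",
    "UID:#{uid}",                            # durable ID, so readers can detect updates
    "DTSTAMP:#{starts_at.strftime(fmt)}",    # simplification: reuse the start time as the stamp
    "DTSTART:#{starts_at.strftime(fmt)}",
    "DTEND:#{ends_at.strftime(fmt)}",
    "SUMMARY:#{ical_escape(summary)}",
    "LOCATION:#{ical_escape(address)}",      # human-readable street address
    "GEO:#{lat};#{lon}",                     # latitude;longitude
    "END:VEVENT"
  ].join("\r\n")
end

# Hypothetical sample event.
puts ical_event(
  uid:       "event-0001@example.org",
  summary:   "Sample Street Festival",
  starts_at: Time.utc(2009, 11, 2, 18, 0, 0),
  ends_at:   Time.utc(2009, 11, 2, 21, 0, 0),
  address:   "100 Queen Street West, Toronto",
  lat:       43.6534,
  lon:       -79.3841
)
```

The point of carrying both LOCATION and GEO is that a consumer can put the event on a map without having to geocode the address first.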

Again, the City of Toronto already publishes some event data, but it’s not in an easily machine-readable format. Here’s the Toronto Festivals and Events Calendar, for example. (Yes, we could build a parser to consume this particular web page, but that would be missing the point.)

3. TTC route, stop, and vehicle location information.

This sort of data is obviously useful for building all kinds of apps that help people get around the city. Kieran and Kevin have done a great job reverse-engineering TTC route and stop timing data on MyTTC.ca. The TTC should provide an official data stream to enable apps like theirs. TTC already offers the data in PDF format as route schedules.

There’s lots more on my list, but these three buckets are near the top.

What’s on your wish list?

Spread the Word: City of Toronto Launches Urban Fellows Program

City of Toronto 175 Years

One of the reasons I love living in Toronto at this particular time is the growing energy going into making the city a truly great place to live. There’s an increasing interest amongst everyday citizens in civic issues: topics like housing, transit, streetscapes, art, outdoor life, pollution, and economic vitality are fast becoming part of everyone’s sphere of interest.[1] And just as importantly, there’s an increasing willingness and capacity to change things. Unlike many other cities I’ve visited, Toronto is a place where you can actually change the way the city works, and accomplish it in your lifetime. It’s a huge reason to live here.

If this line of thinking resonates with you, and you’ve been seeking ways to get more engaged within the city, there is a program you need to know about: The City of Toronto is launching the “Urban Fellows Program”, an initiative aimed at attracting new high caliber professionals to the Toronto Public Service.

As I understand it, it’s one half boot-camp, one half incubator for smart people who want to make the city better. Participants get “an intensive introduction to the governance, operations and administration of Canada’s largest city through a combination of full-time work experience and participation in a series of seminars, tours and workshops.”

The program is one year long, with two six-month rotations in city positions. They’re seeking Master’s- and Ph.D.-level experience, although that doesn’t seem to be an absolute requirement… I read it as, “we want whip-smart, well-educated people who are fired up about making the city better”. There are a limited number of positions. And it’s paid: the salary is almost $62K, some serious cash.

I love this concept, and I hope they net some really great thinkers. Applications are due May 30, and the first cohort starts this September. Please help spread the word.

[1] I readily admit to being biased by the people I surround myself with.

David Crow posted thoughts on StartupNorth about startup incubators and why we don’t have one in Toronto. As he points out, funding an incubator program is a big challenge. There aren’t enough Angels and VCs around willing to risk money on very early stage ventures here, and the ever-decreasing amounts of capital needed by tech startups look less and less attractive to investors with big chunks of money to manage. So if we’re to have a farm team, in Rick Segal’s words, how do we fund it?

I believe Toronto has both the financial and intellectual capital needed to do this.  Given that we’re having trouble getting bigger investors to fund this sort of effort, I wondered in reply whether micro-financing might be a viable alternative:

What if we tried micro-funding instead of the current approach? That might net enough investors to make it viable. We create a fund that pays for operating one session (or one year) of the program from start to finish. Price shares at, say, $5,000 apiece. Standardize the share terms so there’s no negotiation involved. Entrepreneurs offer up a fixed amount of equity in exchange for program participation. Investors share in the entrepreneurs’ risk and reward.

Who would buy? Well, at that price, I’d buy a share. I bet at least a few hundred other people would too. Wealthy investors (incl. some Angels) might purchase tens or hundreds of shares. Forward-thinking corps and a VC or two looking for higher-risk investments would buy in, and get good PR as a result. Maybe even the government buys some shares, or provides a tax incentive to others for buying. If the terms are suitable, even investors in other countries could participate.

Could we sell 2000 shares at that price? $5,000 x 2000 shares = $10M.

$10M could buy you an awfully big farm team, or even better, many cohorts of a small farm team.

Would you buy a share?

Rails 2.2.2: Noisy Translation Errors

[This is one of several posts on upgrading a Ruby on Rails app from Rails 1.x to 2.2.2.]

Rails 2.x added some nice support for internationalization. I’m using this to “Canadianize” the UI of my app by translating a few words from the EN-US locale to EN-CA. “Favorite” becomes “favourite”, “huh?” becomes “eh?”, “Coors Light” becomes “water”, and so on.

One gotcha I ran into is that translation errors don’t manifest themselves as exceptions. The default I18n implementation rescues translation errors and returns the failed translation keys as a string. This behavior is nice in the development environment, because you can see the strings showing up in your application’s UI. On the other hand it makes testing dangerous, because translation errors can easily go unnoticed, especially when running automated tests.

I prefer my tests to fail noisily. (Hmm… perhaps this should be the default behavior?) To that end, here’s a patch, also available on Pastie.

# The code below patches I18n to raise exceptions for all errors, including translation errors.
# The default (unpatched) behavior in Rails 2.2.2 rescues translation errors and returns the failed
# translation keys as a string. This behavior is undesirable in test, because it makes it too
# easy for translation errors to go unnoticed when running automated tests. Instead, we want to fail noisily.
#
# == Usage
# (within test_helper)
# require File.dirname(__FILE__) + '/i18n_patch'      

# Patch translation within views

module ActionView
  module Helpers
    module TranslationHelper
      def translate(key, options = {})
        options[:raise] = true
        I18n.translate(key, options)
      rescue I18n::MissingTranslationData => e
        raise e if RAILS_ENV == 'test'  # <<< this line is the patch. everything else in this method is original.
        keys = I18n.send(:normalize_translation_keys, e.locale, e.key, e.options[:scope])
        content_tag('span', keys.join(', '), :class => 'translation_missing')
      end
      alias :t :translate

      def localize(*args)
        I18n.localize *args
      end
      alias :l :localize
    end
  end
end

# Patch translation in models and controllers
module I18n
  class << self
    def raise_all_exceptions(*args)
      raise args.first
    end
  end
end

I18n.exception_handler = :raise_all_exceptions

A Recipe for Protecting Your Rails App Secret

I’ve spent some time over the last few weeks upgrading 5 Blocks Out to Rails 2.2.2. One of the things I’ve been pondering how to integrate is the new protect_from_forgery feature which aims to deter cross-site request forgery attacks.

By default, Rails 2.x creates a random forgery protection secret string when generating a new app, and hard-codes the secret into environment.rb. As with database passwords, this isn’t the sort of thing you want in your source code repository, especially if your code will be open source, or exposed in some other way to a lot of people over time. So, what to do?

I found two useful ideas on how to deal with this. Both rely on storing the secret in a file distinct from environment.rb. You store this file on your web server, and not in your source repository. This way, your secret key is as secure as your app server.

Here’s a scrap of code I cooked up to do this: http://pastie.org/369075. It looks for a file named config/session_secret.txt and tries to load the key directly from the text in the file. When running in environments such as production or staging it raises an error if it can’t find the file. When running in dev or test environments it silently falls back to using a hard-coded key. Since I use Capistrano to deploy, I’ve added an after-deploy task that links config/session_secret.txt to a central copy of the file.
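
For readers who can’t reach the Pastie, the logic described above could look roughly like the sketch below. This is a reconstruction from the description, not the original snippet: the `load_session_secret` helper, the `STRICT_ENVIRONMENTS` and `DEV_FALLBACK_SECRET` constants, and the fallback string are all assumed names and values.

```ruby
# Environments where a missing secret file should be a hard failure (assumed list).
STRICT_ENVIRONMENTS = %w[production staging].freeze

# Hard-coded fallback for dev and test only; never use this value on a real server.
DEV_FALLBACK_SECRET = ("dev-only-secret-" + "0" * 48).freeze

# Return the secret stored in the file at `path`.
# In strict environments, raise if the file is missing; elsewhere, fall back
# silently to the hard-coded development secret.
def load_session_secret(env, path)
  if File.exist?(path)
    secret = File.read(path).strip
    raise "Session secret file #{path} is empty" if secret.empty?
    secret
  elsif STRICT_ENVIRONMENTS.include?(env)
    raise "Missing session secret file: #{path}"
  else
    DEV_FALLBACK_SECRET
  end
end

# Assumed wiring inside a Rails 2.x environment.rb (shown as a comment because
# it only runs inside a Rails app; the session key name is a placeholder):
#
#   secret = load_session_secret(RAILS_ENV, File.join(RAILS_ROOT, "config", "session_secret.txt"))
#   config.action_controller.session = { :session_key => "_myapp_session", :secret => secret }
```

Failing loudly in production and staging is the important design choice here: a silent fallback on a live server would mean every deployment quietly shares the same publicly known key.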

This is simple and I think it should work pretty well for me. I hope someone else finds it helpful.