ActionController::RoutingError (undefined method ‘sub’ for nil:NilClass)

Or, “Man I’m glad Ruby on Rails is an open source framework”.

Ruby on Rails normally provides pretty good debugging information for dealing with errors, but I ran into an odd error today that had no debug info other than the exception itself. Here’s what I did to find and fix the problem.

Symptoms: “So, tell me, how long have you been feeling this way?” I had just added a new controller, model, views, and helpers to my app. Requesting a route to any of the new controller’s actions caused Rails to die with the exception “ActionController::RoutingError (undefined method `sub’ for nil:NilClass)”. No stack trace. No filename / line numbers. I knew one of the 20-odd new files I had just created must be causing the problem, but I didn’t know which one.

Diagnosis: “Breathe in. Breathe again. Again. OK, now cough. Ah, yes, now I see the problem.”: The only way I know of to diagnose these kind of problems is brute force. I added a debugger statement to the top of application_controller.rb and restarted the app. Then I requested a URL that was causing crashes, and from the breakpoint, stepped forward in the debugger (next, next, next…) until the app died. Studying the code that executed just before the app died didn’t yield an obvious result, but about 10 frames up the stack I could see a call to ActiveSupport::Inflector#constantize. I edited that method to print (puts) the parameters it was being called with, then restarted the server and tried again. Voila: the last parameter spit out by #constantize before the server died was the culprit.

#constantize PlaceLinksHelper for name PlaceLinksHelper

The Cure: “Take three of these and call me in the morning.” : It turns out I had created a helper module with the file name “places_helper.rb”, but within that file I mistakenly defined “module PlaceLinksHelper”. For some reason that shall remain a mystery to me, this caused things to fall-down-go-boom. The fix was simply to give the module the name Rails expected to find: “PlacesHelper”.

Upon Reflection… “I sure hope that doesn’t happen to me again. But it probably will.” I don’t know how you do this sort of diagnosis efficiently in a closed-source or compiled environment.  This is about the 5th time I’ve edited Rails source code to find and fix an oddball problem. I guess you could try attaching a debugger to binaries and then study the gobbledy-gook, but… no thanks. The only other thing I  could have done in a compiled/closed source setup is study each of my own files line by line. That would have worked, but it would have taken far more time.

Source code good.

The case of the phantom 404s

Sometimes tracking down 404 errors (HTTP “Resource not found”) on a web site is easy to do, but I ran into a tricky case yesterday that took me the better part of the afternoon to resolve. I want to share the steps I walked through in the hopes it saves someone else some time.

If you run a web site you’ll want to periodically verify that your site is being crawled properly by search engines. Both Google Webmaster tools and Bing Webmaster give you a handy “crawl errors” list which shows URLs the search engine fails to index, and the corresponding “origin” pages that link to the problem URLs.

This weekend I checked on Google Webmaster and was surprised to find a few hundred 404 errors in recent crawls of 5 Blocks Out. (There were no matching errors on Bing, and I’m still not sure why.)

Warning: from here on, this blog post gets ugly. Skip to “Lessons Learned” below if you just want the punchline.

All the URLs looked similar to this:

5blocksout.com/q5E3Q1Db7SpV45DINIfRr28pnGRt5G/ujfe3SPV36M=

In other words, each URL had a path consisting of a long string of alphanumeric characters, and ending with an equals sign.

We don’t support URLs like this on 5 Blocks Out, hence all the 404 errors. But where were they coming from?

The crawl errors report listed many different 5 Blocks Out web pages as the origins for these problem URLs. The next step was obvious, then: look at the source code for each origin page containing a problem URL, and then fix the problem. Well, it turned out that none of the origin pages actually contained any of these problem URLs. Googlebot had discovered these URLs at crawl time, but I couldn’t find any them myself. Phantom URLs.

I cleared my browser cache and tried again, with the same results. Then I tried wget, to more closely mimic what Googlebot would see. Again, same results: I couldn’t find any of these supposed 404 URLs on the origin pages.

My hypotheses at this point were not pretty:

  • A bug in the application is intermittently generating bad URLs.
  • The pages contain poorly formed HTML, which is causing Googlebot (but not Bing) to parse them incorrectly.
  • A virus or some sort of malware is corrupting the web pages.
  • A bug in our web server software or Rails is corrupting the pages.

Not an appealing list to investigate, especially without a way to reproduce the problem. The “bad HTML” one was easy to eliminate by running the page through the W3C Validator, so that left only gnarly possibilities.

After a while of casting about for clues I eventually returned to looking at the URLs themselves, hoping to find patterns in their structure, the crawl dates, the origin pages… anything. And a few things struck me:

1. The 404 errors all began around Nov 26, which is the date I upgraded our web server to the latest versions of nginx and Phusion Passenger. This later turned out to be a false lead, but it seemed useful at the time.

2. The strings of alphanumeric characters in these phantom URLs looked similar to the authentication tokens that Rails generates as a countermeasure against cross-site request forgery. Rails inserts these tokens into every page that contains an HTML form.

A little voice reminded me, “Hey smartypants, we have some javascript that appends authentication tokens to AJAX requests made from our web pages.” Now, that code shouldn’t have affected search engines, because it was specific to POST requests, and everyone knows that well-behaved search engine crawlers only do GET request. But still… new hypothesis: “Our javascript is mistakenly appending authentication tokens to Googlebot GET requests.”

How to prove it? Well, search engines have caches of crawled pages. I checked Google’s page cache and found a cached version of an origin page that supposedly contained a 404 URL. The HTML source showed… drum roll… no such URL. But the HTML did contain javascript with an authentication code that exactly matched the path portion of the 404 URL. Bingo!

From there, the fix was easy. I did indeed have a bug in my javascript. The code was screening out AJAX requests with a type of “GET”, but it didn’t check for the lowercase “get”, and so it appended auth tokens in cases where it shouldn’t have.

Why did this start happening all of a sudden, you might ask? I don’t know. Perhaps something changed in the way Googlebot issues its HTTP requests. Maybe we’d always had this problem, and Googlebot just hadn’t yet crawled pages that exhibited it. Maybe web fairies temporarily stopped sprinkling magic pixie dust on our site. Who knows.

Lessons learned:

  • Check webmaster tools and web server logs once in a while for unexplained crawl errors.
  • Remember that search engines allow you to search for cached versions of web pages. This can help with diagnosing crawl problems. (In fact, crawl error reports really ought to link directly to the cached versions of crawled pages.)
  • String comparisons don’t work very well if you don’t make them case-insensitive.
For reference, here’s the diff for the problem code (in our application.js file, bound to the ajaxSend event):
// Insert authenticity token into all non-GET AJAX requests
function insertAuthToken(elm, xhr, s) {
-    if (s.type == "GET") {
-        return;
-    }
+    if (s.type == 'GET' || s.type == 'get' || typeof(window.auth_token) == "undefined") return;

Ubuntu upgrade breaks RMagick on Ruby Enterprise Edition

I recently upgraded the server running 5 Blocks Out from Ubuntu version 8.0.4 to 10.04. Somewhere along the way the binding between the rmagick gem and the underlying imagemagick libraries broke, probably because Ubuntu upgraded imagemagick on my behalf. And as I discovered over coffee this morning, if your Rails app depends on rmagick, and rmagick can’t bind to imagemagick, you’re hosed. So… here’s the solution I found after some panicked schnoozling around the web.

First, the error: when trying to start Rails you get an error that looks like this:

libWand.so.10: cannot open shared object file: No such file or directory - /opt/ruby-enterprise-1.8.6-20090520/lib/ruby/gems/1.8/gems/rmagick-2.12.2/lib/RMagick2.so

As you can see, we’re running REE (Ruby Enterprise Edition). This complicates matters a little, because you need to ensure you install rmagick in the directory where REE expects it to be.

Here is the series of magic chants needed to get rmagick and imagemagick working together on Ubuntu 10.0.4 with REE (Ruby Enterprise Edition).

$ sudo apt-get install imagemagick
$ dpkg -l | grep imagemagick
ii  imagemagick                       7:6.5.7.8-1ubuntu1                                         image manipulation programs

You can see we picked up imagemagick v6.5.7.8.

Next, verify where your REE binaries are installed, and install rmagick, making sure it ends up in the REE gem directory:

$ which ruby
/opt/ruby-enterprise-1.8.6-20090520/bin/ruby
$ sudo /opt/ruby-enterprise-1.8.6-20090520/bin/gem install rmagick

This step can take a while. Once it completes, check to see that rmagick is bound properly to imagemagick:

$ irb -rubygems -r RMagick
>> puts Magick::Long_versionThis is RMagick 2.13.1 ($Date: 2009/12/20 02:33:33 $) Copyright (C) 2009 by Timothy P. Hunter
Built with ImageMagick 6.5.7-8 2009-11-26 Q16 http://www.imagemagick.org
Built for ruby 1.8.6
Web page: http://rmagick.rubyforge.org
Email: rmagick@rubyforge.org=> nil

If that works, start the Rails console and try the same thing. You should get the same results, and the fact that console runs is a good indicator your app server will probably run too.

Done!

References:

Follow

Get every new post delivered to your Inbox.

Join 348 other followers