Archive for July, 2007

Rails and Nested Singular Resources

Tuesday, July 31st, 2007

One of the reasons I love Rails is the built-in support for REST. If you’re not yet writing your Rails applications RESTfully, then you’re not really writing web applications.

I’ll detail an example that I just created which I thought illustrated REST support quite nicely. In Rails, you can model singular or plural Resources. A plural Resource might be Users, which means you’ll have lots of Users in your system. In contrast, you can create singular Resources when there is only one instance of that Resource in the system.

A good example of a singular resource is a User’s avatar icon. For instance, most Web 2.0 applications let you upload a tiny picture that represents you. In Rails speak, we say a User has one Avatar Icon. An Avatar Icon belongs to a User.

When creating the URI space for these models (User, AvatarIcon) you use map.resources and map.resource inside of your routes.rb file. We want to enforce that an AvatarIcon belongs to a User in the URI space, and thus in the modeling of the Resource. We can do that with the routes below:

  map.resources :users do |user|
    user.resource :avatar_icon
  end

Our URIs will look like this now:

  • /users - list all users
  • /users/1 - show a single user
  • /users/1/avatar_icon - a user’s avatar icon

You can see the difference between singular and plural even in the URIs. /users is plural to indicate that it identifies the collection of all users in the system. On the other hand, /users/1/avatar_icon is singular indicating that it identifies the single avatar icon for a single user.

Furthermore, when you use map.resources, Rails will create named routes for you. This makes referring to the URIs for the Resources much easier, as they are given logical, proper names. Now, here’s where I think it gets really cool. Refer to the routes example above, where we nested the avatar_icon resource inside the users resource. When we want to create the URI /users/1/avatar_icon, we can use the generated named route avatar_icon_path(@user).

Notice how we only need to specify the user in question when building the URI to a user’s avatar icon. There’s no ID for the avatar icon in the URI, so why specify that in the named route? When I saw that, I said, “Yes, that’s exactly how I would expect it to work.”
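To make the singular versus plural distinction concrete, here is a minimal plain-Ruby sketch of what those named routes return. These methods are hypothetical stand-ins that I wrote for illustration, not the helpers Rails actually generates, and the User struct is just a placeholder for a real model.

```ruby
# Hypothetical stand-ins for the generated named routes; plain Ruby, no Rails required.
User = Struct.new(:id)

# Plural resource: the collection of all users.
def users_path
  "/users"
end

# A single member of the plural resource needs an id.
def user_path(user)
  "/users/#{user.id}"
end

# Singular nested resource: the parent user is the only thing needed,
# because each user has exactly one avatar icon.
def avatar_icon_path(user)
  "#{user_path(user)}/avatar_icon"
end

user = User.new(1)
puts users_path
puts user_path(user)
puts avatar_icon_path(user)
```

The avatar icon helper never asks for an avatar icon ID, which mirrors the URI itself.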

Way to go, Rails! Making RESTful development a first class citizen for web applications.

Fixing Rails Pagination for SQL Server

Tuesday, July 31st, 2007

MS SQL Server certainly feels like the red-headed stepchild of the Rails connection adapters. The core developers aren’t interested in it, and I’d have to guess that most Rails developers deploy to MySQL or PostgreSQL.

A good example of why the SQL Server support needs more love from the Rails community is the pagination code in the connection adapter, which is horribly ugly. SQL Server 2000 doesn’t support a limit or offset, which makes pagination extremely difficult. For kicks, check out the SQL that the Rails connection adapter generates for a limit and offset query in SQL Server. There are enough subqueries and reverse sorts to make your head spin. Not to mention the awful, performance-killing select count(*) before every query.

SQL Server 2005 makes our life a little easier in that it added row_number() support. With this, it’s possible, though not straightforward, to perform pagination that doesn’t make you want to puke so much. Unfortunately, Rails hasn’t yet split its SQL Server adapter into separate SQL Server 2000 and SQL Server 2005 adapters. I strongly encourage this move, as there are many differences between the two.

If you are running SQL Server 2005, and you want to fix many pagination problems that plague the sqlserver_adapter.rb (just look inside the Rails Trac sometime, there’s a lot), have I got the monkey patch for you. We’ve been using this for a little while now, and it seems to do the trick. YMMV but it should hopefully give you an idea of what’s possible.

module ActiveRecord
  module ConnectionAdapters
    class SQLServerAdapter

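      # Rewrites a limit/offset query to use row_number() (SQL Server 2005 and later).
      def add_limit_offset!(sql, options)
        if options[:limit] and options[:offset]
          # With no explicit order, fall back to the id column of whatever follows FROM.
          options[:order] ||= sql.match('FROM (.*) ')[1] + '.id'
          # The ORDER BY moves inside the row_number() window below.
          sql.sub!(/ORDER BY.*$/i, '')
          sql.sub!(/SELECT/i,
                  "SELECT row_number() over( order by #{options[:order]} ) as row_num, \n")
          sql.replace("select top #{options[:limit]} * from (#{sql}) as tmp_table1 \n" +
                "where row_num > #{options[:offset]}")
        end
      end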
    end
  end
end

Not only were the built-in pagination queries terribly slow (because the adapter always executes a select count(*) before the query itself), but they also had problems when paginating with included models. This is something that ActiveScaffold does all the time, so if you are using that and SQL Server, you’ve no doubt felt the pain when you tried to sort a column.
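To see the shape of the query the patch produces without needing a database, here is a standalone sketch of the same string rewriting. The method name paginate_with_row_number and the sample users query are mine, made up for illustration. It assumes a simple single-table SELECT with an explicit order.

```ruby
# Standalone sketch of the row_number() rewrite; it only builds a SQL string.
# The method name and the sample query are hypothetical.
def paginate_with_row_number(sql, limit, offset, order)
  # Strip the trailing ORDER BY, since the ordering moves into the window function.
  inner = sql.sub(/\s*ORDER BY.*$/i, '')
  # Number every row according to the requested order.
  inner = inner.sub(/SELECT/i, "SELECT row_number() over( order by #{order} ) as row_num,")
  # Keep only the rows past the offset, and at most `limit` of them.
  "select top #{limit} * from (#{inner}) as tmp_table1 where row_num > #{offset}"
end

query = paginate_with_row_number(
  "SELECT * FROM users ORDER BY users.name", 10, 20, "users.name"
)
puts query
```

One query, no select count(*) up front, and no reverse sorts.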

QOTD

Tuesday, July 31st, 2007

From Bill de hÓra: Wag the dog:

As strange as might sound, businesses might get more value from computer systems once those systems stop being optimized around transient business requirements or features. You can customize things, but only above the infrastructure. That seems to be part of the pitch for something like EC2.

Bill is talking about systems-level architecture, but can this observation be applied to software-level services? Is this the pitch for SOA, in that business systems can be composed from existing smaller discrete services in the cloud?

OSCON Hadoop Presentation Downloads

Tuesday, July 31st, 2007

The Yahoo Developer Network blog has links to their recent OSCON Hadoop presentation, with downloads of PDFs and a video of the presentation.

Hadoop is an open source implementation of a framework for processing large sets of data. Yahoo is currently using it to process log files, among other things.

Scaling Web Applications

Thursday, July 26th, 2007

Sam Ruby, via Tim Bray, has collected a list of scaling web applications presentations and documents. As Tim said, this is “everything anybody knows” on the subject.

I’m interested in large scale data crunching as we build out our data warehouse. It’s tricky for us, as we have one machine to do all of our data crunching, so we are definitely constrained by I/O. To really solve this issue on a single machine, we need to be smart with our disks and spread the data out to ensure parallel reads.

As I read through these presentations and reports, I’m always trying to map them back down to one machine with maybe four disks and two dual-core processors.

Of course, I can just rent a Hadoop cluster.

Note to Amazon EC2: Install an EC2 instance on the DoD .mil network so we can use it, too!

Calculating Combinations The Cool Way

Monday, July 23rd, 2007

I recently had to calculate all possible combinations of a set. I needed to calculate combinations of 0..N size, where N is the size of the original set of things. Order inside of the resulting combinations did not matter to me, as I am treating the combinations as true sets.

For example, given the set [A,B,C], I needed to calculate the following combinations:

  • []
  • [A]
  • [B]
  • [C]
  • [A,B]
  • [A,C]
  • [B,C]
  • [A,B,C]

It dawned on me that a cool way to generate the combinations was to treat the sets (the original set and the resulting combination sets) as bit strings. If the bit corresponding to the member is on, I include the member in the combination.

To explain, I start with the set [A,B,C]. I create a number that has three bits, all on, one for each member of the set. I therefore have the binary number 111 matching [A,B,C]. 111 happens to be 7 in decimal, which is one less than the total number of combinations I require.

Starting with zero, I loop up to and including seven (for a total of eight iterations, once for each combo that I want). I convert each iteration count to a binary string, which tells me which bits are on for this combination.

For example, here’s the Ruby code:


original_set = [:A, :B, :C]
combinations = []

def create_combo(bit_string, original_set)
  combo = []
  bit_string.split(//).each_with_index do |bit, i|
    combo << original_set[i] if bit == "1"
  end
  combo
end

(2**original_set.size).times do |i|
  # Zero-pad to one bit per member so this works for a set of any size.
  bit_string = i.to_s(2).rjust(original_set.size, "0")
  combinations << create_combo(bit_string, original_set)
end

require 'pp'
pp combinations

This will print out:

[[], [:C], [:B], [:B, :C], [:A], [:A, :C], [:A, :B], [:A, :B, :C]]

Neat, huh?

I’m sure you can speed this up by checking each bit in the iteration count instead of first converting to a bit string.
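Here’s a minimal sketch of that idea, assuming Ruby’s Integer#[] to test individual bits of the iteration count. The method name combinations_by_bits is mine. The leftmost member maps to the highest bit so the ordering matches the output above.

```ruby
# Builds every combination by testing bits of the counter directly,
# with no intermediate bit string.
def combinations_by_bits(set)
  n = set.size
  (0...(2**n)).map do |mask|
    # Bit (n - 1 - i) of the mask decides whether set[i] is included,
    # so the first member corresponds to the highest bit.
    set.each_index.select { |i| mask[n - 1 - i] == 1 }.map { |i| set[i] }
  end
end

bit_combinations = combinations_by_bits([:A, :B, :C])
p bit_combinations
```

I haven’t benchmarked it, so whether it is actually faster is left as an exercise.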

Restful Account Activation

Tuesday, July 17th, 2007

Nearly every web application has the concept of users or accounts. While the concept of a user or account is quite universal, when you get to implementing, you find out that the business rules and implementation details vary widely from application to application. Luckily for your Ruby on Rails applications, there’s an excellent starting point for creating the basics of accounts and logins. The restful_authentication plugin for Rails provides generators for Users and the basic framework required, such as email activations, logins, passwords, controllers, and database migrations. The extra bonus is that the plugin uses REST to model the authentication.

In the REST web application world, all you see are objects, or nouns. The verbs in the system are limited to the HTTP methods such as GET, PUT, POST, and DELETE. REST advocates that you interact with the world (your nouns) with these four methods. It forces you to think, “How would I convert an Action into a Thing?”

With restful_authentication, the action is Login, but that concept is encapsulated into a Session object. When you log in, you are *creating* a new Session. When you log out, you are *deleting* that Session. (Here, the Session is a different concept from a web session.) The plugin nicely models login and logout actions into CREATE Session and DELETE Session.

We can apply this concept to account activation, which I had to do recently for one of our applications. Our business rules stated that an account can’t access the system until it has been activated by an administrator.

In the old pre-REST days, we might have modeled this with an action called activate, which might have set an activated bit on the account instance. Fair enough, and it would have worked fine. But this isn’t RESTful at all, as it would require a new verb in the system (activate).

Knowing that REST is all about the nouns, we noun-ify the concept of activation into a class called Activation. We say that an Account has one Activation, and if that activation relationship exists, the account is activated. If there is no activation instance for an account, then the account is deactivated. An administrator will CREATE an activation instance in order to activate an account. The administrator can later on DELETE the activation instance if they want to deactivate the account.

Another benefit of creating a first class Activation model is we can add properties such as when the account was activated and who activated it.
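As a rough illustration of the idea, here is a plain-Ruby sketch. These classes are hypothetical stand-ins, not our actual models and not ActiveRecord. In a real Rails application the Activation would be its own model and singular resource.

```ruby
# Hypothetical plain-Ruby stand-ins; a real app would use ActiveRecord models.
# The Activation carries the metadata: who activated the account, and when.
Activation = Struct.new(:activated_by, :activated_at)

class Account
  attr_reader :login, :activation

  def initialize(login)
    @login = login
    @activation = nil
  end

  # CREATE: an administrator activates the account by creating an Activation.
  def create_activation(admin_name, at = Time.now)
    @activation = Activation.new(admin_name, at)
  end

  # DELETE: removing the Activation deactivates the account.
  def destroy_activation
    @activation = nil
  end

  # There is no activated flag. The account is active exactly when an Activation exists.
  def activated?
    !@activation.nil?
  end
end

account = Account.new("alice")
puts account.activated?
account.create_activation("admin")
puts account.activated?
puts account.activation.activated_by
account.destroy_activation
puts account.activated?
```

The only operations are create and delete on a noun, and the noun has room for properties.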

In summary, I love working with REST because it forces me to think in nouns, which are classes. I find it easier to model the world with nouns than with verbs. Plus, the problem with verbs is you can’t say anything about them, so you lose the ability to add metadata to the events in the system.

links for 2007-07-15

Saturday, July 14th, 2007

links for 2007-07-09

Sunday, July 8th, 2007

links for 2007-07-03

Monday, July 2nd, 2007