Sunday, January 15, 2012

Mapping (Multiple) Legacy Databases With Datamapper

We're starting in on a new project at work and it comes with some unique requirements (don't they all?)

To get things going so the rest of the team can take over and run with it I spent a couple days figuring out Datamapper the ruby ORM and mapping a number of legacy data tables.

This application is unique in that it will run against 3 separate databases. Two of them come from an older version of the application that the client built several years ago, we'll call these databases db1 and db2. The third database is where any custom data will be held for the web interface itself. We'll call it web1.

The first trick to getting datamapper to work in a rails applications is knowing what to actually install. Datamapper is a general purpose ORM and can be used quite effectivley outside of rails, as such the documentation on datamapper.org doesn't really focus on rails. No problem, take a look at the dm-rails gem.

dm-rails overrides a number of ruby on rails default generators, patches rake db:migrate to work with datamapper upgrades and automigrates and even integrates well with rspec and cucumber. Also worth noting that it worked fine on the latest and greatest (as of this blog post) - Rails 3.1.3 with Ruby 1.9.3.

Part of setting up dm-rails is getting your database.yml configured correctly. The documentation on the github page is great but as an added example here is a version of my config (with secure info removed.)

db1_defaults: &db1_defaults
  adapter: mysql
  username: [USERNAME]
  password: [PASSWORD]
  host: localhost
  database: db1

db2_defaults: &db2_defaults
  adapter: mysql
  username: [USERNAME]
  password: [PASSWORD]
  host: localhost
  database: db2

web1_defaults: &web1_defaults
  adapter: mysql
  username: [USERNAME]
  password: [PASSWORD]
  host: localhost
  database: web1

development:
  database: web1
  <<: *web1_defaults
  repositories: 
    db1: 
      <<: *db1_defaults
    db2: 
      <<: *db2_defaults

test: &test
  database: web1_test
  <<: *web1_defaults
  repositories: 
    db1: 
      database: db1_test
      <<: *db1_defaults
    db2: 
      database: db2_test
      <<: *db2_defaults

Once your database is configured you should be able to run rake db:migrate and see success messages. At this point you have no models configured but you can see that it can connect correctly.

The next step is mapping your existing tables.

db1 and db2 have quite a bit of existing data, their own data formats and a somewhat inconsistant naming structure. This resulted in a lot of custom mapping and eventually I just decided to explicitly spell out every mapping for consistency (even if the datamapper conventions would have applied.)

Here's an example model file. Again I've removed business context by renaming some things. I'll call out some pitfalls after.

class Procedure
  include DataMapper::Resource

  def self.default_repository_name
    :db1
  end
  storage_names[:db1] = "procedures"

  property :id, Serial, :field => "id"
  property :name, String, :field => "procnam"
  property :length, Integer, :field => "avgminlen"

  validates_length_of :name, :max => 25
end

The first important thing to note is this method declaration.

  def self.default_repository_name
    :db1
  end


This is how you tell Datamapper which database this model's table is in. In this case Procedures come from the db1 database.

Next thing worth noting is:

  storage_names[:db1] = "procedures"

Which is how you set the table name. In this case the table was pluralized. If you find that all of your tables follow a naming standard (something like tblProcedure) there are ways to override the naming globally. In this application naming standards were fairly inconsistant so I ended up adding this line in every model even when it was technically unnecessary.

Also for each varchar type column I added a length validator. The length of the varchar varied wildly in my application from 1 char fields up to 50 varchar fields. The best way I could come up with to deal with that was to simply validate that it never overflowed. Would love suggestions on any other config flags I could add if need be so datamapper knows what it is structurally.

One last thing to note. The override to default_repository_name MUST BE BEFORE THE PROPERTY DECLARATIONS. Sorry for the all caps. I spent about 2 hours fighting datamapper before figuring that one out. I even made a SO post on it before i figured it out.



3 comments:

  1. You can declare the length of the property by using :length => 1..25 or whatever the allowable range is. It will setup the validates_length_of validation automatically for you too. In general I recommend every String property have a length specified, except maybe for code spikes or one-offs.

    Besides SO, you could also try posting questions to the DataMapper mailing list (http://groups.google.com/group/datamapper) or on the IRC channel (#datamapper on Freenode. I know some DM users are active on SO, but you're probably going to see a higher concentration of users on the mailing list or irc channels in comparison.

    ReplyDelete
  2. Thanks didn't know about the mailing list, I'll join that. I will also update those models to use length.

    ReplyDelete

  3. Cool design, a very interesting solution.
    As the exchange of information, I want to share with you useful and something similar resources such as cloud server best virtual data room

    ReplyDelete