OpenStruct crashed my Mongrel

I fell in love with Ruby's OpenStruct class a few months ago after reading "Jay Fields' Thoughts" and "ERR THE BLOG" posts on OpenStruct.

I immediately liked the concept. It was right in line with how easily you can get things done in Ruby. I carefully tucked the OpenStruct nugget into the back of my head for later use. It wasn't too long until I had the opportunity to implement a solution with it. Due to issues and limitations with ActiveRecord::Base (another time, another post) Tobi needed to use a more flexible and generic architecture for reporting (again, another post). Anyway, in order to get this done quickly I pulled out the OpenStruct nugget.

We ended up with a couple of really simple classes to encapsulate our data. Now we could simply extend DataRow and decorate it with the particulars of the specific report.

    # code snippet
    def query
      query = <<-EOS
        SELECT ...
        FROM ...
        INNER JOIN ...
        WHERE ...
        GROUP BY ...
        ...
      EOS
    end

    # code snippet
    def run
      rows = ActiveRecord::Base.connection.select_all(query)
      rows.collect { |row| Backorder::Row.new(row) }
    end

    # OpenStruct lives in the standard library's ostruct file
    require 'ostruct'

    class DataRow < OpenStruct

      def initialize(row)
        # Allows us to use an anonymous ActiveRecord or a hash
        attributes = case row
          when Hash then row
          else row.attributes
        end

        super(attributes)
      end

    end

    module Backorder

      class Row < DataRow

        def order_number
          order_id.to_i + 1000
        end

        def order_ids
          Array.postgres_to_ruby(orders)
        end

        def purchase_order_ids
          Array.postgres_to_ruby(purchase_orders)
        end

      end

      # ...

    end

The preceding code gave us a simple and generic way to use complex SQL to load from an arbitrary number of tables as well as temporary tables. Additionally, we did not have to pollute models with report-specific logic. The testing benefits were another major plus!

This solution solved one performance problem (Rails' n+1 queries) but introduced another one. Unfortunately we didn't catch that one until production :(

The DataRow extending OpenStruct worked exceptionally well with a small working set. However, in production, when real users hit some of the reports, the working sets became quite large and Mongrels started to crash. We have excellent monitoring in place, so we were able to recover quickly, and the crashes were localized to the back-end enterprise Rails application, so the bug was not too terrible. However, the reporting "fix" quickly became a serious issue, as getting stuck behind a slow Mongrel is an awful user experience. What made matters worse was that the Mongrel was crashing and not just taking forever. This meant we had to go in and clean up. All in all, it was a bad situation and a horrible headache.

Why were we crashing? What was going on? Hadn't we taken a report that would either time out or take 10 minutes due to the n+1 Rails issue down to a single query? What could it be? Of course the issue was obvious: we were running out of memory because we were creating so many objects. Still, why were we running out of memory? The object count wasn't that high after all.

Turns out that the problem was in ostruct.rb#new_ostruct_member. Every time we instantiated a DataRow, it created a new singleton class for that one instance and then defined the corresponding accessor methods on it. Of course it does, how did we think OpenStruct worked! Actually, we thought it would define the methods once on the DataRow class that was extending it. Anyway, this was the issue and it was killing us!

    def initialize(hash=nil)
      @table = {}
      if hash
        for k,v in hash
          @table[k.to_sym] = v
          new_ostruct_member(k)
        end
      end
    end

    def new_ostruct_member(name)
      name = name.to_sym
      unless self.respond_to?(name)
        # the singleton class (metaclass) of this one instance
        meta = class << self; self; end
        meta.send(:define_method, name) { @table[name] }
        meta.send(:define_method, :"#{name}=") { |x| @table[name] = x }
      end
    end
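
To make the cost concrete, here is a minimal sketch that mimics the per-instance approach from the ostruct.rb excerpt above. PerInstanceStruct and the sample columns are hypothetical stand-ins made up for illustration, not our production code and not the standard library itself. The sketch only shows that every row ends up carrying its own copy of each reader and writer.

```ruby
# Hypothetical stand-in that copies the per-instance technique shown above.
class PerInstanceStruct
  def initialize(hash = nil)
    @table = {}
    (hash || {}).each do |k, v|
      @table[k.to_sym] = v
      new_member(k)
    end
  end

  def new_member(name)
    name = name.to_sym
    return if respond_to?(name)
    # Defined on this object's singleton class, not on PerInstanceStruct.
    define_singleton_method(name) { @table[name] }
    define_singleton_method(:"#{name}=") { |x| @table[name] = x }
  end
end

# Three made-up rows with two columns each.
rows = Array.new(3) { |i| PerInstanceStruct.new("order_id" => i, "orders" => "{1,2}") }

# Every row owns a reader and a writer per column.
per_row = rows.map { |r| r.singleton_methods.size }  # => [4, 4, 4]

# Nothing was defined on the shared class.
shared = PerInstanceStruct.method_defined?(:order_id) # => false
```

With a handful of rows this is harmless. With a large result set, the number of method definitions grows with rows times columns, which matches what we saw in production.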


We needed to quickly code ourselves out of this mess. Here comes the new version of the DataRow and Backorder::Row classes.

    class DataRow

      def initialize(attributes)
        @attributes = attributes
      end

      # Defines one reader per attribute on the class itself, once
      def self.define_row_attribute_methods(row_attributes)
        self.class_eval do
          row_attributes.each do |attribute|
            attribute = attribute.to_s
            define_method(attribute) do
              @attributes.fetch(attribute, nil)
            end
          end
        end
      end

    end

    module Backorder

      class Row < DataRow

        define_row_attribute_methods(
          [:item_id, :order_id, :order_date, :customer_name, :product_name, :brand_name, :color_name,
          :size_name, :gender, :price, :backordered_quantity, :purchase_order_quantity, :received_quantity,
          :outstanding_quantity, :on_order_quantity, :inventory_quantity, :orders, :purchase_orders])

        def order_number
          order_id.to_i + 1000
        end

        def order_ids
          Array.postgres_to_ruby(orders)
        end

        def purchase_order_ids
          Array.postgres_to_ruby(purchase_orders)
        end

      end

    end


Basically we are defining the methods that we need at the class level, once, when the class is first loaded. DataRow's internal data structure is a hash, so we can easily instantiate it and we still have all of the functionality that we needed from OpenStruct. Because of the loosely coupled architecture, we were able to swap out the general DataRow and specific Row classes.
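
Here is a small sketch of that difference, using a trimmed copy of the DataRow idea. SampleRow and its two columns are hypothetical and exist only for illustration. It shows that the readers live on the class and that individual rows carry no methods of their own.

```ruby
# Trimmed copy of the class-level approach described above.
class DataRow
  def initialize(attributes)
    @attributes = attributes
  end

  def self.define_row_attribute_methods(row_attributes)
    row_attributes.each do |attribute|
      attribute = attribute.to_s
      define_method(attribute) { @attributes.fetch(attribute, nil) }
    end
  end
end

# Hypothetical report row with two made-up columns.
class SampleRow < DataRow
  define_row_attribute_methods([:order_id, :orders])

  def order_number
    order_id.to_i + 1000
  end
end

# String keys, as a database driver would typically hand them back.
rows = Array.new(3) { |i| SampleRow.new("order_id" => i.to_s, "orders" => "{1,2}") }

# The readers were defined once, on the class.
class_level = SampleRow.instance_methods(false)

# No row has singleton methods of its own.
per_row = rows.map { |r| r.singleton_methods.size }  # => [0, 0, 0]

first_order_number = rows.first.order_number         # => 1000
```

The trade-off is that the columns have to be listed up front, and there are no writers. For read-only report rows that was all we needed.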

Note: I still love the OpenStruct class. It is actually a beautiful class with some great Ruby code. It epitomizes why Ruby is superior to Java. I mean it uses recursion, meta-programming, aliasing, and method_missing! All of this in about 100 lines of code that is easy to read and understand.

Second Note: I tried to keep the code in context as much as possible. This is Tobi's real code, and it is covered by tests.

N'th Note: There is still much more we have done to DRY this up and simplify but it was not germane to how OpenStruct bit us in the ass.

References

http://www.tobi.com

http://errtheblog.com/posts/28-strut-your-structs
http://blog.jayfields.com/2006/09/ruby-stub-variations-openstruct.html
http://www.ruby-doc.org/stdlib/libdoc/ostruct/rdoc/classes/OpenStruct.html
