CouchFoo provides an ActiveRecord API interface to CouchDB
= CouchFoo
== Introduction
CouchDB (http://couchdb.apache.org/) works slightly differently to relational databases. First and foremost, it is a document-oriented database. That is, data is stored in documents, each of which has a unique id that is used to access and modify it. The contents of the documents are free from structure (or schema free) and bear no relation to one another (unless you encode that within the documents themselves). So in many ways documents are like records within a relational database except there are no tables to keep documents of the same type in.
CouchDB interfaces with the external world via a RESTful interface. This allows document creation, updating, deletion etc. The contents of a document are specified in JSON so it's possible to serialise objects within the database record efficiently as well as store all the normal types natively.
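As a rough illustration of the RESTful interface, the sketch below builds (but does not send) the HTTP request that would store a document. It uses only Ruby's standard library. The database name mydatabase, the document id address-1 and the attribute values are made up for the example, and the host assumes a local CouchDB on its default port.

```ruby
require "net/http"
require "json"
require "uri"

# Hypothetical database and document id, chosen only for illustration.
uri = URI("http://localhost:5984/mydatabase/address-1")

# The document body is plain JSON; nested values need no schema.
document = { "number" => 3, "street" => "My Street", "tags" => ["home", "primary"] }

# PUT /<database>/<id> stores a document under that id.
request = Net::HTTP::Put.new(uri)
request["Content-Type"] = "application/json"
request.body = JSON.generate(document)

puts "#{request.method} #{request.path}"
puts request.body

# Sending the request needs a running CouchDB, so it is left commented out:
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
```

CouchFoo (via CouchREST) issues requests of this kind for you, so this is only meant to show what travels over the wire.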
As a consequence of its free form structure there is no SQL to query the database. Instead you define (table-oriented) views that emit certain bits of data from the record and apply conditions, sorting etc to those views. For example if you were to emit the colour attribute you could find all documents with a certain colour. This is similar to indexed lookups on a relational table (both in terms of concept and performance).
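The colour example can be mimicked in plain Ruby to show the idea. This is a toy, in-memory stand-in for a view and not CouchDB or CouchFoo code: the documents, the map_colour lambda and the index array are all invented for the sketch, and a real view is written in JavaScript and stored in a B-tree rather than an array.

```ruby
# Toy documents; note they share no schema.
docs = [
  { "_id" => "a1", "type" => "car",  "colour" => "red"  },
  { "_id" => "b2", "type" => "bike", "colour" => "blue" },
  { "_id" => "c3", "type" => "car",  "colour" => "red"  },
  { "_id" => "d4", "type" => "note" } # no colour, so it emits nothing
]

# The "map" step: each document may emit zero or more [key, value] rows.
map_colour = lambda do |doc|
  doc.key?("colour") ? [[doc["colour"], doc["_id"]]] : []
end

# Building the view: collect every emitted row and keep the rows sorted by key.
index = docs.flat_map { |doc| map_colour.call(doc) }.sort

# Querying the view: select the rows whose key matches.
red_ids = index.select { |key, _id| key == "red" }.map { |_key, id| id }
p red_ids # => ["a1", "c3"]
```

Because the rows are kept sorted by key, a real view can answer such a lookup without scanning every document, which is why the comparison with an indexed column holds.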
CouchDB has been designed from the ground up to operate in a distributed way. It provides robust, incremental replication with bi-directional conflict detection and resolution. It's an excellent choice for unstructured data, large datasets that need to be sharded efficiently and situations where you wish to run local copies of the database.
== Overview
CouchFoo provides an ActiveRecord styled interface to CouchDB. The external API is nearly identical to ActiveRecord so it should be possible to migrate your applications quite easily. That said, there are a few minor differences due to the way CouchDB works. In particular:

* CouchDB is schema free so property definitions for the document are defined in the model (like DataMapper)
* :select, :joins, :having, :group, :from and :lock are not available on find or associations as they don't apply (locking is handled as conflict resolution at insertion time)
* :conditions can only accept a hash and not an array or SQL. For example :conditions => {:username => "Georgio1999"}
* :offset is less efficient in CouchDB - there's more on this in the rdoc
* :order is applied after results are retrieved from the database. Therefore :order cannot be used with :limit without a new option :use_key. This is explained fully in the quick start guide and CouchFoo#find documentation
* :include isn't implemented yet but the finders and associations still accept the option so you won't need to make any code changes
* By default results are ordered by document key. The key uses a UUID scheme so these don't auto-increment and are likely to come out in a different order to insertion. default_sort can be used on a model to sort by create date by default and overcome this
* validates_uniqueness_of has had the :case_sensitive option removed
* Because there's no SQL there are no SQL finder methods
* Timezones, aggregations and fixtures are not yet implemented
* The price of index updating is paid when next accessing the index rather than at the point of insertion. This can be more efficient or less depending on your application. It may make sense to use an external process to do the updating for you - see CouchFoo#find for more on this
* On that note, compacting of CouchDB is required to recover space from old versions of documents and keep performance fast. This can be kicked off in several ways (see quick start guide)
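To see why :order and :limit do not combine without an index on the sort attribute, here is a plain Ruby sketch. It is an in-memory illustration only: the Row struct, the keys and the created_at values are invented, and the array stands in for documents as the database would return them, sorted by key.

```ruby
# Rows as the database would return them: sorted by document key, not by date.
Row = Struct.new(:key, :created_at)

rows = [
  Row.new("0a", 3),
  Row.new("4f", 1),
  Row.new("9c", 4),
  Row.new("e2", 2)
]

# The limit is applied by the database first, the ordering in Ruby afterwards.
limited_then_sorted = rows.first(2).sort_by(&:created_at)

# What the caller usually wants: order everything, then take the first two.
sorted_then_limited = rows.sort_by(&:created_at).first(2)

p limited_then_sorted.map(&:key) # => ["4f", "0a"]
p sorted_then_limited.map(&:key) # => ["4f", "e2"]
```

The two results differ, which is the gap that keying the view on the sort attribute (the :use_key option mentioned above) is there to close, at the price of an extra index.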
It is recommended that you read the quick start and performance sections in the rdoc for a full overview of differences and points to be aware of when developing. The Changelog file shows the differences between gem versions and should be checked when upgrading.
== Getting started
To install CouchFoo simply type the following:

  sudo gem install couch_foo --source http://gemcutter.org

You will then need to set up the base class connection and logger:

  CouchFoo::Base.set_database(:host => "http://localhost:5984", :database => "mydatabase")
  CouchFoo::Base.logger = Rails.logger

If using with Rails you will need to create an initializer to do this (until proper integration is added). Also note that the version of the CouchREST gem you require depends on your version of CouchDB - CouchDB 0.9 requires CouchREST greater than 0.2 and CouchDB 0.8 requires CouchREST between 0.16 and 0.2. You will be warned on CouchFoo initialization if this is wrong.
== Examples of usage
Basic operations are the same as ActiveRecord:

  class Address < CouchFoo::Base
    property :number, Integer
    property :street, String
    property :postcode # Any generic type is fine as long as .to_json and class.from_json(json) can be called on it
  end

  address1 = Address.create(:number => 3, :street => "My Street", :postcode => "secret") # Create address
  address2 = Address.create(:number => 27, :street => "Another Street", :postcode => "secret")
  Address.all # = [address1, address2] or maybe [address2, address1] depending on key generation
  Address.first # = address1 or address2 depending on keys so probably isn't as expected
  Address.find_by_street("My Street") # = address1

As key generation is through a UUID scheme, the order can't be predicted. However you can order the results by default:

  class Address < CouchFoo::Base
    property :number, Integer
    property :street, String
    property :postcode # Any generic type is fine as long as .to_json can be called on it
    property :created_at, DateTime

    default_sort :created_at
  end

  Address.all # = [address1, address2]
  Address.first # = address1 or address2, sorting is applied after results
  Address.first(:use_key => :created_at) # = address1 but at the price of creating a new index

Conditions work slightly differently:

  Address.find(:all, :conditions => {:street => "My Street"}) # = address1, creates index on :street
  Address.find(:all, :conditions => {:created_at => "sometime"}) # Uses same index as :use_key => :created_at
  Address.find(:all, :use_key => :street, :startkey => 'p') # All streets from p in alphabet, reuses the index created 2 lines up

As well as providing support for people using relational databases, CouchFoo attempts to provide a library for those wanting to use CouchDB as a document-oriented database:

  class Document < CouchFoo::Base
    property :number, Integer
    property :street, String

    view :number_ordered, "function(doc) {emit([doc.number , doc.street], doc); }", nil, :descending => true
  end

  Document.number_ordered(:limit => 75) # Will get the last 75 documents in the database ordered by number, street attributes

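The ordering produced by a composite key such as [doc.number, doc.street] can be approximated in plain Ruby. This is only a sketch: the documents are invented, the limit is 2 rather than 75 to keep the output short, and Ruby's array comparison matches CouchDB's collation only for simple cases like this one where each position holds values of a single type.

```ruby
# Invented documents standing in for the database contents.
docs = [
  { "number" => 3,  "street" => "My Street" },
  { "number" => 27, "street" => "Another Street" },
  { "number" => 3,  "street" => "Another Street" },
  { "number" => 12, "street" => "My Street" }
]

# Equivalent of emit([doc.number, doc.street], doc): one [key, value] row per document.
rows = docs.map { |doc| [[doc["number"], doc["street"]], doc] }

# Rows are ordered by key: number first, street as the tie-breaker.
# Reversing the ascending order plays the role of :descending => true.
descending = rows.sort_by { |key, _doc| key }.reverse

# Equivalent of :limit => 2 on the descending view.
top_two = descending.first(2).map { |key, _doc| key }
p top_two # => [[27, "Another Street"], [12, "My Street"]]
```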
Associations work as expected but you must remember to add the properties required for an association (we'll make this automatic soon):

  class House < CouchFoo::Base
    has_many :windows
  end

  class Window < CouchFoo::Base
    property :house_id, String
    belongs_to :house
  end

== Credits
This gem was inspired by some excellent work on the CouchPotato, CouchREST, ActiveCouch and RelaxDB gems. Each offered its own benefits and its own challenges. After hacking with each I couldn't get a library I was happy with. So I started with ActiveRecord and modified it to work with CouchDB. Some areas required more work than others but a lot of features came for free once the base level of functionality had been achieved. Credit to DHH, the Rails core guys and the CouchDB gems that inspired this work.
== What's left to do?
Please feel free to fork and hit me with a request to merge back in. At the moment, the following areas need addressing: