There are two ways to install DbCharmer:
as a gem (recommended, and the only option for Rails 3.x)
as a Rails plugin (works with Rails 2.x only)
To install as a gem, add this to your Gemfile:
To install DbCharmer as a Rails plugin use the following command:
Notice: If you use DbCharmer in a non-Rails project, you
may need to set DbCharmer.env to a correct value before using any
of its connection management methods. The correct value is the name of a
first-level section in database.yml.
Easy ActiveRecord Connection Management
As a part of this plugin we’ve added the switch_connection_to method,
which accepts many different kinds of database connection specifications and
applies them to a model. We support:
Strings and symbols as the names of connection configuration blocks in database.yml
ActiveRecord models (the connection currently set up on that model is used)
Database connections (Model.connection)
Nil values to reset model to default connection
Sample code:
Sample database.yml configuration:
The switch_connection_to method has an optional second parameter,
should_exist, which is true by default. This parameter is used when
the method is called with a string or symbol connection name and there is
no such connection configuration in the database.yml file. If this
parameter is true, an exception is raised; otherwise the
error is ignored and no connection change happens.
This is really useful when, in development mode or in tests, you do not
want to create many different databases on your local machine and just want
to put all your tables in a single database.
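The should_exist behaviour can be modelled in plain Ruby. This is only a sketch of the lookup rule described above, not DbCharmer code: the lookup_connection helper, the UnknownConnectionError class and the sample YAML (including the assumption that named connections are nested under the environment section) are made up for illustration.

```ruby
require 'yaml'

# Hypothetical database.yml content. The nesting of the named "logs"
# connection under the environment section is an assumption.
DATABASE_YML = <<~YAML
  development:
    adapter: mysql
    database: app_development
    logs:
      adapter: mysql
      database: app_logs
YAML

class UnknownConnectionError < StandardError; end

# Returns the configuration hash for a named connection. When the name is
# missing, raises if should_exist is true and returns nil otherwise
# (nil meaning "do not switch, keep the current connection").
def lookup_connection(configs, env, name, should_exist = true)
  config = configs.fetch(env, {})[name.to_s]
  return config if config.is_a?(Hash)
  raise UnknownConnectionError, "no '#{name}' connection in '#{env}'" if should_exist
  nil
end

configs = YAML.safe_load(DATABASE_YML)
lookup_connection(configs, 'development', :logs)          # the "logs" hash
lookup_connection(configs, 'development', :stats, false)  # nil, no switch
```

With should_exist left at its default, the second lookup would raise instead of returning nil.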
Warning: Connection switching calls switch the connection
only for the class the method is called on. You can’t call the
switch_connection_to method to switch the connection for a base class
in some hierarchy (for example, you can’t switch the ActiveRecord::Base connection and
see all your models switched to the new connection; use the classic
establish_connection instead).
Multiple DB Migrations
In every application that works with many databases, there is a need for a
convenient schema migration mechanism.
All Rails users already have this mechanism: Rails migrations. So in
DbCharmer, we’ve made it possible to seamlessly use multiple
databases in Rails migrations.
There are two methods available in migrations to operate on more than one
database:
Global connection change method – used to switch the whole migration to a
non-default database.
Block-level connection change method – used to run only a part of a
migration on a non-default database.
Migration class example (global connection rewrite):
Migration class example (block-level connection rewrite):
Migration class example (global connection rewrite, multiple connections
with the same table):
Notice: both :connection and :connections can take an array of connections.
Default Migrations Connection
Starting with DbCharmer version 1.6.10 it is possible to call
ActiveRecord::Migration.db_magic and specify a default migration
connection that is used by all migrations that do not explicitly
switch connections. If you want to switch a migration back to the default
ActiveRecord connection, just use
db_magic :connection => :default.
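The precedence described in this section can be written down as a small plain-Ruby function. This is a sketch of the rule only; the migration_connection helper and its argument names are hypothetical and not part of DbCharmer.

```ruby
# explicit:       connection given to db_magic inside one migration, or nil
# global_default: connection given to ActiveRecord::Migration.db_magic, or nil
# Returns the connection name the migration would run on, with :default
# standing for the default ActiveRecord connection.
def migration_connection(explicit, global_default)
  chosen = explicit || global_default
  chosen.nil? ? :default : chosen
end

migration_connection(nil, nil)            # => :default
migration_connection(nil, :second_db)     # => :second_db
migration_connection(:default, :second_db) # => :default
```

The last call mirrors a migration that opts out of the global default with db_magic :connection => :default.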
Invalid Connection Names Handling
By default in all environments on_db and db_magic
statements fail if the specified connection does not exist in
database.yml. It is possible to make DbCharmer ignore such
situations in non-production environments so that Rails creates the
tables in your single database (especially useful in test databases).
This behaviour is controlled by the
DbCharmer.connections_should_exist configuration attribute, which
can be set from a Rails initializer.
Warning: if in the test environment you use separate connections and
master-slave support in DbCharmer, make sure you disable transactional
fixtures support in Rails. Without this change you’re going to see all
kinds of weird data visibility problems in your tests.
Using Models in Master-Slave Environments
Master-slave replication is the most popular scale-out technique in
medium-sized and large database-centric applications today. There are some
Rails plugins out there that help developers use slave servers in their
models, but none of them were flexible enough for us to start using them in
a huge application we work on.
So, after using the ActsAsReadonlyable plugin for some time, upon switching to
Rails 2.2 we decided to collect all of our master-slave code in one plugin
and release it. DbCharmer adds the following features to Rails models:
Auto-Switching all Reads to the Slave(s)
When you create a model, you can use the db_magic :slave => :blah
or db_magic :slaves => [ :foo, :bar ] commands in your model to
set up read redirection: all your find/count/exist/etc. methods
will then read data from your slave (or from a bunch of slaves in a round-robin
manner). Here is an example:
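To make the round-robin wording concrete, here is a plain-Ruby model of rotating reads across a list of slaves. It is a sketch only: the RoundRobinSlaves class is hypothetical and says nothing about how DbCharmer picks slaves internally.

```ruby
# Hands out slave connection names in a fixed rotation.
class RoundRobinSlaves
  def initialize(slaves)
    raise ArgumentError, 'at least one slave is required' if slaves.empty?
    @slaves = slaves
    @index = 0
  end

  # Returns the next slave and advances the rotation.
  def next_slave
    slave = @slaves[@index]
    @index = (@index + 1) % @slaves.size
    slave
  end
end

pool = RoundRobinSlaves.new([:foo, :bar])
3.times.map { pool.next_slave } # => [:foo, :bar, :foo]
```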
Default Connection Switching
If you have more than one master-slave cluster (or simply more than one
database) in your database environment, then you might want to change the
default database connection of some of your models. You could do that by
using db_magic :connection => :foo call from your models.
Example:
Sample model on a separate master-slave cluster (so, separate main
connection + a slave connection):
Per-Query Connection Management
Sometimes you have select queries that you know you want to run on the
master. This could happen, for example, when you have just added some data
and need to read it back but are not sure whether it has made it all the way to the
slave yet. For this situation and a few others there is a set of
methods we’ve added to ActiveRecord models:
on_master – this method could be used in two forms: block form
and proxy form. In the block form you could force connection switch for a
block of code:
In the proxy form this method could be used to force one query to be
performed on the master database server:
on_slave – this method is used to force a query to be run on a
slave even in situations when it’s been previously forced to use the
master. If there is more than one slave, one would be selected randomly.
This method has two forms as well: block and proxy.
on_db(connection) – this method is what makes two previous
methods possible. It is used to switch a model’s connection to some db
for a short block of code or even for one statement (two forms). It accepts
the same range of values as the switch_connection_to method does.
Example:
By default in development and test environments you can use non-existent
connections in your on_db calls and Rails will send all your
queries to a single default database. In production on_db won’t
accept non-existent names.
This behaviour is controlled by the
DbCharmer.connections_should_exist configuration attribute, which
can be set from a Rails initializer.
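The difference between the block form and the proxy form of these methods can be illustrated without ActiveRecord. The sketch below is a plain-Ruby model: FakeModel, OnDbProxy and the thread-local key are hypothetical stand-ins, and the count method only reports which connection would be used.

```ruby
# Stand-in for a model whose reads go to :slave by default.
class FakeModel
  DEFAULT_CONNECTION = :slave

  class << self
    def current_connection
      Thread.current[:fake_model_connection] || DEFAULT_CONNECTION
    end

    # Block form: switch for the duration of the block, then restore.
    # Proxy form (no block): return an object that switches for one call.
    def on_db(connection, &block)
      return OnDbProxy.new(self, connection) unless block

      previous = Thread.current[:fake_model_connection]
      Thread.current[:fake_model_connection] = connection
      begin
        yield
      ensure
        Thread.current[:fake_model_connection] = previous
      end
    end

    def on_master(&block)
      on_db(:master, &block)
    end

    # Pretend query that reports the connection it would run on.
    def count
      "count via #{current_connection}"
    end
  end
end

# Forwards a single call to the model inside a switched connection.
class OnDbProxy
  def initialize(model, connection)
    @model = model
    @connection = connection
  end

  def method_missing(name, *args, &block)
    @model.on_db(@connection) { @model.public_send(name, *args, &block) }
  end

  def respond_to_missing?(name, include_private = false)
    @model.respond_to?(name, include_private) || super
  end
end

FakeModel.count                         # => "count via slave"
FakeModel.on_master.count               # => "count via master" (proxy form)
FakeModel.on_master { FakeModel.count } # => "count via master" (block form)
```

In both forms the previous connection is restored afterwards, so a later plain FakeModel.count reads from the slave again.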
Forced Slave Reads
In some cases we could have models that are too important to be used in
default “send all reads to the slave” mode, but we still would like to
be able to switch them to this mode sometimes. For example, you could have
a User model, which you would like to keep from lagging with your
slaves because users do not like to see outdated information about their
accounts. But in some cases (like logged-out profile page views, etc) it
would be perfectly reasonable to switch all reads to the slave.
For this use case, starting with DbCharmer release 1.7.0 we have a
feature called forced slave reads. It consists of a few separate small
features that together make it really powerful:
:force_slave_reads => false option for
ActiveRecord’s db_magic method. This option can be
used to disable automated slave reads on your models so that you can call
on_slave or use other methods to enable slave reads when you need
them. Example:
ActionController.force_slave_reads class method. This
method could be used to enable per-controller (when called with no
arguments), or per-action (:only and :except params)
forced reads from slaves. This is really useful for actions in which you
know you could tolerate some slave lag so all your models with slaves
defined will send their reads to slaves. Example:
ActionController#force_slave_reads! instance method,
that could be used within your actions or in controller filters to
temporarily switch your models to forced slave reads mode. This method
could be useful for cases when the same actions could be called by
logged-in and anonymous users. Then you could authorize users in
before_filter and call force_slave_reads! method for
anonymous page views.
Notice: Before using this method you need to enable
ActionController support in DbCharmer. You need to call
DbCharmer.enable_controller_magic! method from your project
initialization code.
DbCharmer.force_slave_reads method, which can be used with a
block of Ruby code and enables forced slave reads mode until the end
of the block execution. This is a really powerful feature allowing fine
granularity in your control of forced slave reads mode. Example:
Notice: At this point the feature is considered beta and should be used with
caution. It is fully covered with tests, but there still could be
unexpected issues when used in real-world applications.
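How the per-model option and the block-scoped mode combine can be sketched in plain Ruby. Everything here is a hypothetical model of the rules described in this section (ForcedReads, ModelConfig and read_target are invented names), not DbCharmer internals.

```ruby
# Block-scoped "forced slave reads" flag, kept per thread.
module ForcedReads
  def self.forced?
    Thread.current[:forced_slave_reads] == true
  end

  # Enables the flag until the end of the block, then restores it.
  def self.force_slave_reads
    previous = Thread.current[:forced_slave_reads]
    Thread.current[:forced_slave_reads] = true
    yield
  ensure
    Thread.current[:forced_slave_reads] = previous
  end
end

# slaves: list of slave names; force_slave_reads: the per-model option.
ModelConfig = Struct.new(:slaves, :force_slave_reads)

# Decides where a read goes for a model with the given configuration.
def read_target(config)
  return :master if config.slaves.empty?
  return config.slaves.first if config.force_slave_reads || ForcedReads.forced?
  :master
end

user = ModelConfig.new([:slave01], false)
read_target(user)                                     # => :master
ForcedReads.force_slave_reads { read_target(user) }  # => :slave01
```

A model without any slaves keeps reading from the master even inside the block, which matches the statement that only models with slaves defined are affected.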
Associations Connection Management
ActiveRecord models can have associations with each other, and since
every model has its own database connection, it becomes pretty hard to
manage connections in chained calls like User.posts.count. With
class-only connection switching methods this call would look like the
following if we wanted to count posts on a separate database:
Clearly this is not the best way to write the code, so we’ve
implemented on_* methods on associations as well, so you can
do things like this:
Notice: Since ActiveRecord associations are implemented as proxies for
resulting objects/collections, it is possible to use our connection
switching methods even without chained methods:
Starting with DbCharmer release 1.4 it is possible to use prefix
notation for has_many and HABTM associations connection switching:
Named Scopes Support
To make it easier for DbCharmer users to use connection switching
methods with named scopes, we’ve added on_* method support to
the scopes as well. All of the following scope chains behave in exactly the
same way (the query is executed on the :foo database connection):
And now, add this feature to our associations support and here is what we
could do:
Bulk Connection Management
Sometimes you want to run code where a large number of tables may be used,
and you’d like them all to use an alternate database. You can now do
this:
Any model whose default database is :logs (e.g., db_magic
:connection => :logs) will now have its connection switched to
:big_logs_slave in that block. This is lower precedence than any
other DbCharmer method, so Model.on_db(:foo).find(...)
and such things will still use the database they specify, not the one that
model was remapped to.
You can specify any number of remappings at once, and you can also use
:master as a database name that matches any model that has not had
its connection set by DbCharmer at all.
Notice: DbCharmer works via alias_method_chain in
model classes. It is very careful to only patch the models it needs to.
However, if you use with_remapped_databases and remap the default
database (:master), then it has no choice but to patch all
subclasses of ActiveRecord::Base. This should not cause any serious
problems or any big performance impact, but it is worth noting.
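The precedence rules for remapping can be summarised in a few lines of plain Ruby. This is a sketch of the rules as stated above; the effective_connection helper and its arguments are hypothetical.

```ruby
# model_default: connection set on the model via db_magic, or nil when
#                DbCharmer never set one (treated as :master here)
# remaps:        the remapping hash, e.g. { :logs => :big_logs_slave }
# explicit:      connection requested with on_db for this query, or nil
def effective_connection(model_default, remaps, explicit = nil)
  return explicit if explicit # explicit switches always win
  key = model_default || :master
  remaps.fetch(key, key)      # remap if listed, otherwise unchanged
end

remaps = { :logs => :big_logs_slave, :master => :replica }
effective_connection(:logs, remaps)        # => :big_logs_slave
effective_connection(:logs, remaps, :foo)  # => :foo
effective_connection(nil, remaps)          # => :replica
effective_connection(:stats, remaps)       # => :stats
```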
Simple Sharding Support
Starting with release 1.6.0 of DbCharmer we have added support
for simple database sharding to our ActiveRecord extensions. Even though
this feature is tested in production, we do not recommend using it in your
applications without a complete understanding of how it works.
At this point we support four sharding methods:
range – a really simple sharding method that allows you to take a
table, slice it into a set of smaller tables with pre-defined ranges of
primary keys and then put those smaller tables on different
databases/servers. This could be useful for situations where you have a
huge table that is slowly growing and you just want to keep it simple and
split the table load across a few servers without building any complex
sharding schemes.
hash_map – a pretty simple sharding method that allows you to
take a table and slice it into a set of smaller tables by some key that has a
pre-defined set of values. For example, a list of US mailing addresses could
be sharded by state, where you’d be able to define which states are
stored in which databases/servers.
db_block_map – this is a really complex sharding method that
allows you to shard your table into a set of small fixed-size blocks that
are then assigned to a set of shards (databases/servers). Whenever you
need additional blocks, they are allocated automatically and
then balanced across the shards you have defined in your database. This
method could be used to scale out huge tables with hundreds of millions to
billions of rows and allows relatively easy re-sharding techniques to be
implemented on top.
db_block_group_map – really similar to the
db_block_map method with a single difference: this method allows
you to have a set of databases (table groups) on each server and every
group would be handled as a separate shard of data. This approach is
really useful for pre-sharding of your data before scaling your application
out. You can easily start with one server, having 10-20-50 separate
databases, and then move those databases to different servers as you see
your database outgrow one machine.
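The two simplest methods in this list, range and hash_map, boil down to a key lookup. The sketch below models that lookup in plain Ruby; the constants, the helper names and the use of a :default entry as the fallback shard are assumptions for illustration, not the gem's implementation.

```ruby
# Hypothetical range map: primary key ranges to connection names.
RANGES = {
  0...100   => :shard1,
  100...200 => :shard2,
  :default  => :shard3
}.freeze

# Hypothetical hash map: US state codes to connection names.
STATES = { 'CA' => :west, 'NY' => :east, :default => :other }.freeze

# Returns the shard whose range covers id, else the :default shard.
def shard_for_range(ranges, id)
  ranges.each do |range, shard|
    return shard if range.is_a?(Range) && range.cover?(id)
  end
  ranges.fetch(:default) { raise ArgumentError, "no shard for key #{id}" }
end

# Returns the shard mapped to key, else the :default shard.
def shard_for_hash_map(map, key)
  map.fetch(key) do
    map.fetch(:default) { raise ArgumentError, "no shard for key #{key.inspect}" }
  end
end

shard_for_range(RANGES, 42)      # => :shard1
shard_for_range(RANGES, 100)     # => :shard2
shard_for_range(RANGES, 5000)    # => :shard3
shard_for_hash_map(STATES, 'NY') # => :east
shard_for_hash_map(STATES, 'TX') # => :other
```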
How to enable sharding?
To enable sharding extensions you need to do a few things:
Create a Rails initializer (or run this code when you initialize your
script/application) with a set of sharded connections defined. Each
connection would have a name, sharding method and an optional set of
parameters to initialize the sharding method of your choice.
Specify the sharded connection you want to use in your models.
Specify the shard you want to use before doing any operations on your models.
For more details please check out the following documentation sections.
Sharded Connections
A sharded connection is a simple abstraction that allows you to specify all
sharding parameters for a cluster in one place and then use this
centralized configuration in your models. Here are a few examples of
sharded connection initialization calls:
Sample range-based sharded connection:
Sample hash map sharded connection:
Sample database block map sharded connection:
After your sharded connection is defined, you could use it in your models:
Switching connections in sharded models
Every time you need to perform an operation on a sharded model, you need to
specify on which shard you want to do it. We have a method for this that
will look familiar to people who use DbCharmer in
non-sharded environments, since it looks and works just like the per-query
connection management methods:
There is another method that can be used with the range and hash_map sharding
methods. It allows you to switch to the default shard:
And finally, there is a method that allows you to run your code on each
shard in the system (at this point the method is supported in db_block_map
and db_block_group_map methods only):
Defining your own sharding methods
DbCharmer allows users to define their own
sharding methods. You need to do a few things to implement your very own
sharding scheme:
Create a class with a name DbCharmer::Sharding::Method::YourOwnName
Implement at least a constructor initialize(config) and a
lookup instance method shard_for_key(key) that would return either
a connection name from database.yml file or just a hash of
connection parameters for rails connection adapters.
Register your sharded connection using the following call:
Use your sharded connection as any standard one.
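The steps above can be sketched as a small class. This is an illustrative example under stated assumptions: ModuloExample and its :shards parameter are hypothetical, the modules are declared here only so the snippet runs on its own, and it is assumed that the constructor receives the options given at registration time.

```ruby
module DbCharmer
  module Sharding
    module Method
      # Hypothetical sharding method: picks a shard by key modulo the
      # number of configured shards.
      class ModuloExample
        # config is assumed to be the options hash used at registration;
        # :shards is a made-up parameter listing connection names.
        def initialize(config)
          @shards = config.fetch(:shards)
          raise ArgumentError, 'config[:shards] must not be empty' if @shards.empty?
        end

        # Returns a connection name for the given key.
        def shard_for_key(key)
          @shards[Integer(key) % @shards.size]
        end
      end
    end
  end
end

sharder = DbCharmer::Sharding::Method::ModuloExample.new(:shards => [:s0, :s1, :s2])
sharder.shard_for_key(7)    # => :s1
sharder.shard_for_key('10') # => :s1
```

A modulo scheme like this is easy to follow but makes adding shards painful, since most keys move when the shard count changes; it is used here only because it fits in a few lines.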
Adding support for default shards in your custom sharding methods
If you want to be able to use on_default_shard method on your
custom-sharded models, you need to do two things:
Implement a support_default_shard? instance method on your
sharded class that returns true if you support default
shard specification and false otherwise.
Implement :default symbol support as a key in your shard_for_key method.
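Both requirements can be shown on a minimal map-based class. This is a sketch: HashMapWithDefault and its :map parameter are hypothetical, and in a real setup the class would live under the DbCharmer::Sharding::Method namespace.

```ruby
# Hypothetical map-based sharding method with optional default shard.
class HashMapWithDefault
  def initialize(config)
    @map = config.fetch(:map)
  end

  # True only when the map declares a :default entry.
  def support_default_shard?
    @map.key?(:default)
  end

  # Accepts regular keys as well as the :default symbol itself.
  def shard_for_key(key)
    return @map.fetch(key) if @map.key?(key)
    return @map[:default] if support_default_shard?
    raise ArgumentError, "no shard for key #{key.inspect}"
  end
end

sharder = HashMapWithDefault.new(:map => { 'US' => :us_users, :default => :other_users })
sharder.support_default_shard?  # => true
sharder.shard_for_key(:default) # => :other_users
sharder.shard_for_key('US')     # => :us_users
```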
Adding support for shards enumeration in your custom sharding methods
To add shards enumeration support to your custom-sharded models you need to
implement an instance method shard_connections on your class. This
method should return an array of sharding connection names or connection
configurations to be used to establish connections in a loop.
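A minimal sketch of an enumerable sharding class follows. StaticShards and its :shards parameter are hypothetical; the only part taken from the text above is the shard_connections method name and its return value, an array of connection names.

```ruby
# Hypothetical sharding method with a fixed list of shards.
class StaticShards
  def initialize(config)
    @shards = config.fetch(:shards)
  end

  def shard_for_key(key)
    @shards[Integer(key) % @shards.size]
  end

  # Every connection that a "run on each shard" loop should visit,
  # listed once even if a connection serves several slots.
  def shard_connections
    @shards.uniq
  end
end

sharder = StaticShards.new(:shards => [:s0, :s1, :s0])
sharder.shard_connections # => [:s0, :s1]
```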