erikgrinaker edited this page Nov 23, 2012 · 15 revisions

Deployment

This document describes deploying/installing Checkpoint on a single server in such a way that it can easily be scaled up to a larger cluster of servers. Installation and configuration are done manually here; in production, a configuration management system (such as Puppet or Chef) and a proper deployment system (such as Capistrano) should be used - setting those up is left as an exercise for the reader.

Overview

The following components are used:

  • Ruby: the programming language that Checkpoint is written in
  • PostgreSQL: database backend
  • Memcached: data cache
  • Unicorn: HTTP application server
  • Nginx: proxies requests to Unicorn, and serves static assets from the public/ directory
  • HAProxy: routes requests to the proper application, and optionally load-balances requests across multiple servers

Ruby

Checkpoint is written for Ruby 1.9, which should be installed as appropriate for your environment. You may want to patch it with David Celis' patchset, which includes significant boot-time and run-time performance improvements, as well as the copy-on-write friendly garbage collector backported from Ruby 2.0 for more efficient memory usage (especially when coupled with Unicorn preloading, as described below).

Checkpoint uses Bundler to manage its runtime dependencies, so the Bundler gem must be installed on the system:

$ gem install bundler

PostgreSQL

Install and configure PostgreSQL as appropriate for your environment. Make sure to do some basic performance tuning - in particular, examine the settings for shared_buffers, work_mem, effective_cache_size, effective_io_concurrency, and checkpoint_segments.
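As a rough illustration only - the numbers below are assumptions for a dedicated database machine with about 4 GB of RAM, not recommendations for your hardware - a postgresql.conf tuning fragment might look like:

```
shared_buffers = 1GB              # ~25% of RAM is a common starting point
work_mem = 16MB                   # per-sort/per-hash-table memory
effective_cache_size = 2GB        # ~50-75% of RAM; hints the query planner
effective_io_concurrency = 2      # concurrent disk operations (RAID spindles)
checkpoint_segments = 16          # larger values spread out checkpoint I/O
```

Remember to restart (or reload, where applicable) PostgreSQL after changing these settings.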

PostgreSQL can run on the same server as the application, or on a dedicated server. For redundancy it is recommended to use multiple database servers in a master/slave setup, although Checkpoint can currently only use a single server for queries (i.e. it cannot spread its workload across multiple database servers).

PostgreSQL uses the special template1 database as a template for subsequent databases. Make sure it uses UTF8 encoding and en_US.UTF-8 collation so that any databases you create will inherit the proper settings:

$ psql -l
                                 List of databases
      Name      |   Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
----------------+-----------+----------+-------------+-------------+------------------------
 postgres       | postgres  | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0      | postgres  | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres           +
                |           |          |             |             | postgres=CTc/postgres
 template1      | postgres  | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres           +
                |           |          |             |             | postgres=CTc/postgres
(3 rows)

If the settings of template1 are incorrect, the easiest way to fix it is to delete the entire PostgreSQL data directory and re-initialize it (the filesystem paths on your system may differ):

$ /etc/init.d/postgresql stop
$ rm -rf /var/lib/postgresql/9.2
$ initdb --pgdata /var/lib/postgresql/9.2 --encoding utf8 --locale en_US.UTF-8
$ /etc/init.d/postgresql start

Once PostgreSQL is up and running, you need to create a user and database for Checkpoint:

$ createuser --pwprompt checkpoint
$ createdb --owner checkpoint checkpoint_production

Finally, verify that the user and database are set up correctly:

$ psql --username checkpoint --password --command 'SELECT 1 AS ok' checkpoint_production
Password for user checkpoint: 
 ok 
----
  1
(1 row)

Memcached

Install and configure Memcached as appropriate for your environment. Make sure it uses a decent amount of memory by setting the -m option.

Checkpoint uses Dalli as a Memcache client, which supports hashing of keys to multiple servers, and also handles server outages properly. This means that you can set up a cluster of Memcache servers to provide a larger cache space, and only lose part of the cache in case of server downtime.
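To make the idea concrete, here is an illustrative sketch of key-to-server hashing. This is NOT Dalli's actual distribution algorithm (Dalli's is more sophisticated), and the server addresses are made up - it only demonstrates the principle that every client deterministically maps a key to one server, so the cache space grows with the cluster:

```ruby
# Illustrative only: hash each key to a stable integer, then pick a server.
# Every client computes the same mapping, so they agree on key placement.
require "zlib"

SERVERS = ["cache1.domain.com:11211", "cache2.domain.com:11211"]

def server_for(key)
  SERVERS[Zlib.crc32(key) % SERVERS.size]
end

puts server_for("session:1234")
```

Note that with simple modulo hashing like this, adding or removing a server remaps most keys; real clients use consistent hashing to limit that churn.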

Checkpoint

To fetch the source code, run:

$ git clone https://github.com/bengler/checkpoint.git /srv/checkpoint
$ cd /srv/checkpoint

Next, install the gem dependencies using Bundler (the --deployment flag tells Bundler to install the gems for this app separately, under vendor/bundle/):

$ bundle install --deployment

When gems are installed with Bundler, you must make sure to run any application commands via bundle exec, so that it can set up the necessary environment. You should also provide the RACK_ENV environment variable, set to production for production systems. App commands will typically be prefixed with RACK_ENV=production bundle exec.

Finally, set up the database connection info in the file config/database.yml (see config/database-example.yml for an example), and create the basic database structure by running:

$ RACK_ENV=production bundle exec rake db:migrate
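The config/database.yml referenced above is a standard ActiveRecord connection configuration keyed by environment. A minimal sketch - the host, username and password here are placeholders; see config/database-example.yml for the authoritative format:

```yaml
production:
  adapter: postgresql
  host: localhost
  database: checkpoint_production
  username: checkpoint
  password: secret
  encoding: utf8
```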

Unicorn

Unicorn is used to run the HTTP application itself. We will set it up to listen on a UNIX socket, and have Nginx proxy network requests to the socket, for better performance (see the next section for the Nginx configuration).

First, create a directory that Unicorn can store its files in:

$ mkdir /var/run/unicorn

Then, set up a basic configuration for Unicorn in config/unicorn.rb, to get it up and running:

working_directory "/srv/checkpoint"
listen "/var/run/unicorn/checkpoint.sock"
pid "/var/run/unicorn/checkpoint.pid"
worker_processes 4
timeout 60

You should now be able to start up the application:

$ RACK_ENV=production bundle exec unicorn -c config/unicorn.rb -E production

And make sure it works by doing a simple API request in a separate shell:

$ echo "GET /api/checkpoint/v1/identities/me" | socat - UNIX-CONNECT:/var/run/unicorn/checkpoint.sock,crnl
{}

To run it daemonized (in the background) add a -D switch:

$ RACK_ENV=production bundle exec unicorn -c config/unicorn.rb -E production -D

Application preloading

Normally, Unicorn will load the application separately in each of its worker processes. Unfortunately, this takes a long time if you're starting many workers, and uses a lot of memory because each process has a separate memory space. It is more efficient to have Unicorn load the application before spawning workers, so it can take advantage of Linux' copy-on-write memory management (especially if your Ruby is patched with a copy-on-write friendly garbage collector, see the Ruby section for more info). This can dramatically improve (re)start times and memory usage.
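The fork() semantics that preloading relies on can be demonstrated with a few lines of Ruby (the array here is just a stand-in for the loaded application; requires a Unix-like system with fork):

```ruby
# The child process inherits the parent's memory (here, BIG) without
# loading it again; Linux shares the pages copy-on-write until written to.
BIG = Array.new(100_000) { |i| i * 2 }  # stands in for the loaded app

pid = fork do
  # The child sees BIG immediately -- no reload needed.
  exit!(BIG.size == 100_000 ? 0 : 1)
end
Process.wait(pid)
puts $?.exitstatus  # 0 if the child inherited BIG intact
```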

Enabling preloading is as easy as setting preload_app true in the Unicorn config. However, a side-effect of preloading (due to the semantics of the fork() system call in UNIX) is that the workers will inherit and share any open file handles from the master, including database connections. This causes the application to break in unexpected ways, as the workers try to communicate with the database on the same channel. To fix this, we need to close the connections before Unicorn forks its workers, and re-establish them separately in each worker. This is accomplished by adding the following code to the Unicorn config:

preload_app true

before_fork do |server, worker|
  if defined?(ActiveRecord::Base) && ActiveRecord::Base.connected?
    ActiveRecord::Base.connection.disconnect!
  end
  if defined?(Dalli)
    ObjectSpace.each_object(Dalli::Client) do |client|
      client.close
    end
  end
end

after_fork do |server, worker|
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.establish_connection(
      YAML.load_file("config/database.yml")[ENV["RACK_ENV"]]
    )
  end
end

Graceful restarts

In production, it is advisable to set up Unicorn to restart gracefully, so that it will reload the application in a new Unicorn master process while the old one finishes its requests. This is done by sending the Unicorn master process a USR2 signal, which will start up a new, parallel master process, and then sending the old master a QUIT signal to have it exit after finishing its current requests.

This process can be streamlined by configuring Unicorn to automatically send the QUIT signal to any old master when it starts up - and as an added bonus, if you're using application preloading (see above), the old master will continue processing requests if the new Unicorn master fails to start for any reason. Note that Unicorn only honors a single before_fork hook, so if you enabled preloading above, merge this code into that before_fork block rather than defining a second one:

before_fork do |server, worker|
  oldpid = Integer(File.read("/var/run/unicorn/checkpoint.pid.oldbin")) rescue nil
  if oldpid && oldpid != Process.pid
    begin
      Process.kill("QUIT", oldpid)
    rescue Errno::ENOENT, Errno::ESRCH
    end
  end
end

With this in place, robust and graceful restarts of Unicorn can be performed simply by sending a USR2 signal to the master:

$ kill -s USR2 $(cat /var/run/unicorn/checkpoint.pid)

Nginx

Nginx is used to proxy HTTP requests to the Unicorn socket, as well as serve any static assets. Install it and configure it as appropriate in your environment, then add the following bare-bones configuration to handle Checkpoint requests:

http {
    upstream unicorn_checkpoint.domain.com {
        server unix:/var/run/unicorn/checkpoint.sock fail_timeout=0;
    }

    server {
        listen 8000;
        server_name checkpoint.domain.com;
        root /srv/checkpoint/public;

        location / {
            try_files $uri @unicorn;
        }

        location @unicorn {
            proxy_set_header Host $http_host;
            proxy_pass http://unicorn_checkpoint.domain.com;
        }
    }
}

You should now be able to run HTTP requests against Checkpoint:

$ curl http://checkpoint.domain.com:8000/api/checkpoint/v1/identities/me
{}

Performance optimization

Nginx performance optimization is a large topic, which we can't hope to cover here, but a few simple tricks will go a long way - try adding this to the http section of your Nginx config:

http {
    # TCP socket options
    tcp_nopush on;
    tcp_nodelay on;

    # GZIP compression
    gzip on;
    gzip_http_version 1.0;
    gzip_comp_level 2;
    gzip_min_length 1100;
    gzip_proxied any;
    gzip_static on;
    gzip_types text/plain text/css text/xml application/xml
        application/x-javascript text/javascript application/json;

    # Don't buffer proxied responses
    proxy_buffering off;

    # Use larger request buffers, to avoid hitting disk
    client_body_temp_path /tmp/nginx;
    client_body_buffer_size 64k;
}

HAProxy

HAProxy is used to route requests to the proper application. This means that we can route requests for /api/checkpoint/ to Checkpoint, regardless of which site we are on. It can also be used to load-balance requests across a cluster of web servers running Unicorn and Nginx.

Install and configure HAProxy as appropriate for your environment, and then add this bare-bones configuration for Checkpoint:

defaults
    mode http
    option httpclose

frontend http
    bind :80

    acl checkpoint-path path_beg /api/checkpoint/
    use_backend checkpoint if checkpoint-path

    acl checkpoint-domain hdr_reg(host) -i ^checkpoint\.domain\.com(:[0-9]+)?$
    use_backend checkpoint if checkpoint-domain

backend checkpoint
    balance roundrobin
    server web1 web1.domain.com:8000
    option forwardfor
    reqirep ^Host:\ (\S+).*$ Host:\ checkpoint.domain.com\r\nX-Forwarded-Host:\ \1

This will route any request to the specific domain checkpoint.domain.com, or to the path /api/checkpoint/ on any domain, to the Checkpoint application. The reqirep rule forces the Host: header to checkpoint.domain.com, so that Nginx knows which Unicorn application to use, while preserving the original hostname in the X-Forwarded-Host header. The forwardfor option adds an X-Forwarded-For header containing the IP address of the originating client.
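For illustration, the reqirep rewrite above is equivalent to the following Ruby substitution (the input hostname is just an example):

```ruby
# Rewrite the Host: header to checkpoint.domain.com while preserving the
# original hostname in an X-Forwarded-Host header, like the reqirep rule.
original  = "Host: www.example.com"
rewritten = original.sub(/\AHost:\ (\S+).*\z/,
                         "Host: checkpoint.domain.com\r\nX-Forwarded-Host: \\1")
puts rewritten
# Host: checkpoint.domain.com
# X-Forwarded-Host: www.example.com
```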