Friday, May 21, 2010

Deploying MongoDB with master-slave replication

This is a quick post so I don't forget how I set up replication between a MongoDB master server running at a data center and a MongoDB slave running in EC2. I won't go into how to use MongoDB (I'll have another one or more posts about that), I'll just talk about installing and setting up MongoDB with replication.

Setting up the master server

My master server runs Ubuntu 9.10 64-bit, but similar instructions apply to other flavors of Linux.

1) Download MongoDB binary tarball from the mongodb.org downloads page (the current production version is 1.4.2). The 64-bit version is recommended. Since MongoDB uses memory-mapped files, if you use the 32-bit version your database files will have a 2 GB size limit. So get the 64-bit version.

2) Unpack the tar.gz somewhere on your file system. I unpacked it in /usr/local on the master server and created a mongodb symlink:

# cd /usr/local
# tar xvfz mongodb-linux-x86_64-1.4.1.tar.gz
# ln -s mongodb-linux-x86_64-1.4.1 mongodb

3) Create a mongodb group and a mongodb user. Create a data directory on a partition with generous disk space amounts. Create a log directory. Set permissions on these 2 directories for user and group mongodb:

# groupadd mongodb
# useradd -g mongodb mongodb
# mkdir /data/mongodb
# mkdir /var/log/mongodb
# chown -R mongodb:mongodb /data/mongodb /var/log/mongodb

4) Create a startup script in /etc/init.d. I modified this script so it also handles logging the way I wanted. I posted my version of the script as this gist. In my case the script is called mongodb.master. I chkconfig-ed it on and started it:

# chkconfig mongodb.master on
# service mongodb.master start

5) Add location of MongoDB utilities to your PATH. If my case, added this line to .bashrc:

export PATH=$PATH::/usr/local/mongodb/bin

That's it, you should be in business now. If you type

# mongo

you should be connected to your MongoDB instance. You should also see a log file created in /var/log/mongodb/mongodb.log.

Also check out the very good administration-related documentation on mongodb.org.

Setting up the slave server

The installation steps 1) through 3) are the same on the slave server as on the master server.

Now let's assume you already created a database on your master server. If your database name is mydatabase, you should see files named mydatabase.0 through mydatabase.N and mydatabase.ns in your data directory. You can shut down the master server (via "service mongodb.master stop"), then simply copy these files over to the data directory of the slave server. Then start up the master server again.

At this point, in step 4 you're going to use a slightly different startup script. The main difference is in the way you start the mongod process:

DAEMON_OPTS="--slave --source $MASTER_SERVER --dbpath $DBPATH -logpath $LOGFILE --logappend run"

where MASTER_SERVER is the name or the IP address of your MongoDB master server. Note that you need port 27017 open on the master, in case you're running it behind a firewall.

Here is a gist with my slave startup script (which I call mongodb.slave).

Now when you run 'service mongodb.slave start' you should have your MongoDB slave database up and running, and sync-ing from the master. Check out the log file if you run into any issues.

Log rotation

The mongod daemon has a nifty feature: if you send it a SIGUSR1 signal via 'kill -USR1 PID', it will rotate its log file. The current mongodb.log will be timestamped, and a brand new mongodb.log will be started.

Monitoring and graphing

I use Nagios for monitoring and Munin for resource graphing/visualization. I found a good Nagios plugin for MongoDB at the Tag1 Consulting git repository: check_mongo. The 10gen CTO Eliot Horowitz maintains the MongoDB munin plugin on GitHub: mongo-munin.

Here are the Nagios commands I use on the master to monitor connectivity, connection counts and long operations:

/usr/local/nagios/libexec/check_mongo -H localhost -A connect
/usr/local/nagios/libexec/check_mongo -H localhost -A count
/usr/local/nagios/libexec/check_mongo -H localhost -A long

On the slave I also monitor slave lag:

/usr/local/nagios/libexec/check_mongo -H localhost -A slavelag

The munin plugin is easy to install. Just copy the files from github into /usr/share/munin/plugins and create symlinks to them in /etc/munin/plugins, then restart munin-node.

mongo_btree -> /usr/share/munin/plugins/mongo_btree
mongo_conn -> /usr/share/munin/plugins/mongo_conn
mongo_lock -> /usr/share/munin/plugins/mongo_lock
mongo_mem -> /usr/share/munin/plugins/mongo_mem
mongo_ops -> /usr/share/munin/plugins/mongo_ops

You should then see nice graphs for btree stats, current connections, write lock percentage, memory usage and mongod operations.

That's it for now. I'll talk more about how to actually use MongoDB with pymongo in a future post. But just to express an opinion, I think MongoDB rocks! It hits a very sweet spot between SQL and noSQL technologies. At Evite we currently use it for analytics and reporting, but I'm keeping a close eye on sharding becoming production-ready so that we can potentially use it for web-facing traffic too.

1 comment:

rama nugraha said...

Dear Grig,

I wanna ask about the monitoring and graphing sub for MongoDB.

I was try to install those check_mongodb.py for nagios. But i got a weird result when I try to use it manually.

When I use connect option, the connection is always "OK" to any Host IP I put, this is really weird(even those host doesn't have mongodb installed).

And when I use the another option, it's always failed to connect to MongoDb server. Do you have any clue for this? Cause I feel so confuse to find the answer.. :(

Thanks a lot..

Modifying EC2 security groups via AWS Lambda functions

One task that comes up again and again is adding, removing or updating source CIDR blocks in various security groups in an EC2 infrastructur...