MongoDB Map & Reduce with Date filter

We use MongoDB as the primary database at VersionEye, together with MongoID. A software package is a document in the “products” collection. Each product document has an embedded collection of “versions”. Assume we want to know how many versions/artifacts existed for a given language at a given point in time.

That is not a simple query in MongoDB. Queries of this kind can be handled with Map & Reduce, which lets you execute JavaScript at the database level. Here is the current solution:



border = until_date.at_midnight + 1.day

map = %Q{
  function() {
    if ( this.versions == null || this.versions.length == 0 ) return;

    that_day = new ISODate("#{border.iso8601}");
    for (var version in this.versions){
      created = this.versions[version].created_at
      if (created != null && created.getTime() < that_day.getTime()){
        emit( this.versions[version]._id, { count: 1 } );
      }
    }
  }
}

reduce = %Q{
  function(key, values) {
    var result = { count: 0 };
    values.forEach(function(value) {
      result.count += value.count;
    });
    return result; 
  }
}

Product.where(:language => language, :created_at.lt => border ).map_reduce(map, reduce).out(inline: true)

The tricky part was this line:

that_day = new ISODate("#{border.iso8601}");

The difficulty was finding out how to convert a Ruby date object into a JavaScript Date object. Interpolating the ISO 8601 string into ISODate does the job.

You also have to know that even though you are iterating over the versions collection, you cannot access the version object through “version”. A JavaScript for...in loop yields the keys (here the array indexes), not the elements. You have to access the element this way:

this.versions[version]

Otherwise it works fine 🙂
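
To make clear what the Map & Reduce job computes, here is a minimal plain Ruby sketch of the same counting logic. It is only an illustration and not the code running at VersionEye: the PRODUCTS array is a hypothetical in-memory stand-in for the “products” collection, and border_for and count_versions are helper names made up for this example. It also shows the ISO 8601 string that gets interpolated into the map function.

```ruby
require "time"

# Hypothetical in-memory stand-in for documents of the "products" collection.
PRODUCTS = [
  { language: "Ruby", versions: [
    { _id: "a1", created_at: Time.utc(2013, 1, 10) },
    { _id: "a2", created_at: Time.utc(2013, 3, 5) }
  ] },
  { language: "Ruby", versions: nil },
  { language: "Java", versions: [
    { _id: "b1", created_at: Time.utc(2012, 12, 1) }
  ] }
]

# Plain Ruby counterpart of `until_date.at_midnight + 1.day` (no ActiveSupport).
def border_for(until_date)
  Time.utc(until_date.year, until_date.month, until_date.day) + 24 * 60 * 60
end

# Count all versions of the given language created before the border.
def count_versions(products, language, until_date)
  border = border_for(until_date)
  products
    .select { |product| product[:language] == language }
    .flat_map { |product| product[:versions] || [] }
    .count { |version| version[:created_at] && version[:created_at] < border }
end

COUNT_RUBY = count_versions(PRODUCTS, "Ruby", Time.utc(2013, 1, 10))
puts COUNT_RUBY
# The string that would be interpolated into `new ISODate("...")`.
puts border_for(Time.utc(2013, 1, 10)).iso8601
```

On real data this would load every product into memory, which is exactly what the Map & Reduce version avoids.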

MongoDB could not restore backup because of “key too large to index” error

Recently I made a db dump on a MongoDB 2.4 server like this:

mongodump --db veye_dev

And I tried to restore it on a MongoDB 2.6 server like this:

mongorestore dump/veye_dev

Unfortunately at some point the restore process failed with this error message:

mongo error: "Btree::insert: key too large to index, failing

MongoDB 2.6 enforces the size limit for index keys more strictly than 2.4 did. Luckily I found the issue in MongoDB’s ticket system. The import can succeed if the mongod process is running with the parameter “failIndexKeyTooLong=false”. Just start it like this:

sudo mongod --setParameter failIndexKeyTooLong=false

Now run mongorestore again. That worked for me.

MongoID Lessons Learned

I am using MongoID to access MongoDB from a Ruby on Rails application. It is a good library. But there are some things I found out during the project that I want to share here. Nothing bad, just some behaviors you should know about.

Case Insensitive Search
There are different ways to write queries with MongoID. One pretty cool feature is that you can use a regex in your queries. If you want all users whose names start with “mike”, you can write something like this:

query = User.where(name: /^#{name}/)

That is very useful. You can also easily make the search case insensitive by adding an “i” to the end of the regex.

query = User.where(name: /^#{name}/i)

A very useful feature. But you should know that the case insensitive search was 50% slower than the default regex search. At least that is what I measured. Funny that such a small “i” can have such a big impact 🙂

One fast workaround is to store the names twice: once as entered and once lowercased. That way you can execute the fast query on the lowercased field and show the normal name in the UI.
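
The workaround can be sketched in plain Ruby like this. It is only a sketch under assumptions: SEARCH_USERS is a hypothetical in-memory stand-in for the users collection, name_downcase is a field name I made up for the lowercased copy, and prefix_regex and find_by_prefix are illustrative helpers. Regexp.escape is added so that characters like “.” or “*” in the search term are matched literally.

```ruby
# Hypothetical in-memory records; in the app these would be User documents
# with an extra lowercased field (called name_downcase here, an assumed name).
SEARCH_USERS = [
  { name: "Mike Boby", name_downcase: "mike boby" },
  { name: "MIKEY",     name_downcase: "mikey" },
  { name: "Hans",      name_downcase: "hans" }
]

# Build a case-sensitive prefix regex for the lowercased field.
# Regexp.escape keeps special characters in the input literal.
def prefix_regex(input)
  /^#{Regexp.escape(input.downcase)}/
end

# Match against the lowercased copy, return the records with the original name.
def find_by_prefix(users, input)
  regex = prefix_regex(input)
  users.select { |user| user[:name_downcase] =~ regex }
end

MATCHES = find_by_prefix(SEARCH_USERS, "MiKe").map { |user| user[:name] }
puts MATCHES.inspect
```

With MongoID the same regex could presumably be passed to where on the lowercased field, in the same way as in the queries above.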

Returning an Empty Query
Sometimes you want to build a criteria object dynamically. And sometimes it happens that you just return an empty criteria object. Something like this:

Mongoid::Criteria.new(User)

I don’t know why, but I assumed that this would return an empty result with 0 users. In fact it is a criteria without any constraints, so it returns all users from the database 🙂

This is the code which really returns 0 users:

Mongoid::Criteria.new(User, {_id: -1})

This is a criteria object with the constraint “_id = -1”. Because there is no object in the collection with an ID equal to -1, it returns 0 users.
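
Why an empty criteria matches everything becomes obvious in a small plain Ruby model of a query. This is not how MongoID is implemented, just a sketch: ALL_USERS is a hypothetical stand-in for the users collection and run_query is a made-up helper that keeps every document satisfying all conditions.

```ruby
# Hypothetical stand-in for the users collection.
ALL_USERS = [
  { _id: 1, name: "mike" },
  { _id: 2, name: "hans" }
]

# A document matches when it satisfies every condition.
# With no conditions at all, `all?` is true for every document.
def run_query(documents, conditions = {})
  documents.select do |document|
    conditions.all? { |field, value| document[field] == value }
  end
end

EVERYONE = run_query(ALL_USERS)              # no constraints
NOBODY   = run_query(ALL_USERS, { _id: -1 }) # constraint nobody satisfies
puts EVERYONE.size
puts NOBODY.size
```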

Failed to connect to primary node

If you try to connect to a MongoDB Replica Set via MongoID and you get this error message:

/opt/local/lib/ruby1.9/gems/1.9.1/gems/mongo-1.5.2/lib/mongo/repl_set_connection.rb:165:in `connect': Failed to connect to primary node. (Mongo::ConnectionFailure)
from /opt/local/lib/ruby1.9/gems/1.9.1/gems/mongo-1.5.2/lib/mongo/repl_set_connection.rb:500:in `setup'
from /opt/local/lib/ruby1.9/gems/1.9.1/gems/mongo-1.5.2/lib/mongo/repl_set_connection.rb:144:in `initialize'
from /opt/local/lib/ruby1.9/gems/1.9.1/gems/mongoid-2.3.4/lib/mongoid/config/replset_database.rb:24:in `new'
from /opt/local/lib/ruby1.9/gems/1.9.1/gems/mongoid-2.3.4/lib/mongoid/config/replset_database.rb:24:in `configure'
from /opt/local/lib/ruby1.9/gems/1.9.1/gems/mongoid-2.3.4/lib/mongoid/config.rb:316:in `configure_databases'
from /opt/local/lib/ruby1.9/gems/1.9.1/gems/mongoid-2.3.4/lib/mongoid/config.rb:119:in `from_hash'
from /opt/local/lib/ruby1.9/gems/1.9.1/gems/mongoid-2.3.4/lib/mongoid/config.rb:136:in `load!'
from /opt/local/lib/ruby1.9/gems/1.9.1/gems/mongoid-2.3.4/lib/mongoid.rb:147:in `load!'
from /Users/reiz/workspace/versioneye/versioneye/config/application.rb:33:in `class:Application'
from /Users/reiz/workspace/versioneye/versioneye/config/application.rb:18:in `<module:Versioneye>'
from /Users/reiz/workspace/versioneye/versioneye/config/application.rb:17:in `'
from /opt/local/lib/ruby1.9/gems/1.9.1/gems/railties-3.1.0/lib/rails/commands.rb:52:in `require'
from /opt/local/lib/ruby1.9/gems/1.9.1/gems/railties-3.1.0/lib/rails/commands.rb:52:in `block in '
from /opt/local/lib/ruby1.9/gems/1.9.1/gems/railties-3.1.0/lib/rails/commands.rb:49:in `tap'
from /opt/local/lib/ruby1.9/gems/1.9.1/gems/railties-3.1.0/lib/rails/commands.rb:49:in `'
from script/rails in `require'
from script/rails:6:in `'

Then you should check out this link: https://github.com/mongoid/mongoid/issues/1783#issuecomment-4319677

Durran Jordan helped me to solve the problem. Thanks for that!

Rails + MongoDB ReplicaSet Configuration

This article shows how to configure a Ruby on Rails app for a MongoDB ReplicaSet. I described how to set up a MongoDB ReplicaSet here: MongoDB ReplicaSet Tutorial. And how to configure Rails to work together with MongoDB is described here: Rails + MongoDB Quickstart Tutorial.

If you want to connect to a ReplicaSet you just have to change your configuration a little bit. This is how your mongoid.yml file should look:

production:
  database: mydb_prod
  hosts:
    - - mongonode1:1222
    - - mongonode2:1222
    - - mongonode3:1222
  read: :secondary

“mongonode1”, “mongonode2” and “mongonode3” have to be mapped to real IP addresses, of course! Usually you do that in “/etc/hosts”.

You don’t have to define a PRIMARY! The driver knows all nodes and figures out which of them is the PRIMARY. I like this! For the same reason you don’t need a load balancer.

The property “read” can have 2 values: [“:secondary”, “:primary”]. “:secondary” means that read operations can also be routed to secondary nodes. “:primary” means that all read operations will be routed to the PRIMARY node.

MongoDB ReplicaSet Tutorial

MongoDB is a document-oriented database. One cool feature is that you can easily set up a MongoDB cluster, a Replica Set. A Replica Set should have at least 3 nodes, as shown in the diagram in the MongoDB documentation.

In a Replica Set there is always exactly one “PRIMARY” (master) node. All other nodes are “SECONDARY” nodes. Write operations always have to go to the PRIMARY. Read operations can also go to SECONDARY nodes.

If you want to start a mongod process as part of a Replica Set you can use these parameters:

./bin/mongod --port <PORT> --bind_ip <IP-ADDRESS> --replSet <REPL-NAME> --rest

for example:

./bin/mongod --port 1222 --bind_ip 192.168.0.101 --replSet myreplset --rest

You can do that on 3 different servers, or more if you want. Then log in to the server which should be the first PRIMARY. There you have to start the mongo shell and initiate the replica set.

rs.initiate()

This initiates the replica set. It is now running with 1 node, and you have to add the other nodes.

rs.add("192.168.0.102:1222")
rs.add("192.168.0.103:1222")

All right. Now the replica set is running with 3 nodes. Now you can open a browser and navigate to the HTTP interface of the PRIMARY node. It listens on the mongod port plus 1000, so on port 2222 in this example:

http://192.168.0.101:2222/_replSet

Because we started the mongod process with the parameter “--rest”, we now see a small web application showing the status of our replica set. Here we should see 3 members: 1 PRIMARY and 2 SECONDARY.

Now you can connect to the PRIMARY via the mongo shell console and add a document to the database. For example:

mongo --host 192.168.0.101 --port 1222 
> use myfirstdb
> db.users.save ( {name : "Hans"} )

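If you now connect to a SECONDARY you will see the document there. If the shell refuses the read with a “not master and slaveok=false” error, execute rs.slaveOk() first.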

mongo --host 192.168.0.102 --port 1222 
> use myfirstdb
> db.users.find()

If the PRIMARY goes down, there is an election in the replica set, and one of the SECONDARIES becomes the new PRIMARY. When the old PRIMARY is up and running again, it syncs with the other nodes and rejoins the replica set. With default settings it rejoins as a SECONDARY and only becomes PRIMARY again if it wins a later election, for example because it has a higher priority.

I have set up a replica set for a Ruby on Rails application and played around with it. I just randomly rebooted some of the nodes. But my Ruby on Rails application was still available and could deliver the data. That is pretty cool! 🙂

You can read more about Replica Set Configuration here: MongoDB Replica Set Configuration.

 

MongoID read_secondary deprecated

If you are using Rails + MongoID and you get this error message:

read_secondary options has now been deprecated and will be removed in driver v2.0. Use the :read option instead.

Then you just have to change your configuration. Just replace “read_secondary” with “read”.

Rails + MongoDB Tutorial. Quickstart.

This is a Quickstart Tutorial for Rails + MongoDB. Ruby on Rails and MongoDB, a document-based database, are a pretty good fit. It is very easy to integrate MongoDB with Rails. If you want to know how to install MongoDB, check out this article: Installing MongoDB on Mac OS X Lion. Or this one: Installing MongoDB on Linux.

I assume you are familiar with Ruby on Rails 🙂

There are several gems to interact with MongoDB. I used the MongoID gem. That worked pretty well for me. Just add this to your Gemfile:

gem 'bson_ext', '1.6.0'  
gem 'mongo', '1.6.0'
gem 'mongoid', '2.4.6'

Then run “bundle install” in your project to download the gems. And add the file mongoid.yml to your config folder:

development:
    host: localhost
    database: myapp_dev

test:
    host: localhost
    database: myapp_test

production:
    host: localhost
    database: myapp_prod

This is the file where you configure the access to the database. If your MongoDB instance is running locally you don’t need any username or password settings. Now you have to load the mongoid.yml file. You can do that by adding this line to your application.rb inside of the “class Application”.

Mongoid.load!("config/mongoid.yml")

Then just add a new model to your project. For example “User”. That could look like this:

class User
  include Mongoid::Document
  include Mongoid::Timestamps

  field :username, type: String
  field :firstname, type: String
  field :lastname, type: String
  field :email, type: String
  field :age, type: Integer
end

That’s it. You just have to include Mongoid::Document.

include Mongoid::Document

And define your fields! Now you can create a new user and save it like this:

user = User.new
user.username = "mike"
user.firstname = "Mike"
user.lastname = "Boby"
user.age = 44
user.save

Or make a query like that:

user = User.where( username: "mike")

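Note that where returns a criteria object, not a single user. Call first on it to load one document. Check out the Criteria API for MongoID. It is pretty straightforward and worked pretty well for me.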

MongoDB. Growing local.X files

I have a ReplicaSet with MongoDB 2.0.2 on two Debian Linux 6.0 servers. The ReplicaSet is working fine. But on one of the servers I noticed that there are a bunch of local.x files.

  • local.0
  • local.1
  • local.22.

Each of them is 2.1 GB. The actual databases/collections are less than 1 GB. I am just wondering why all these files are there, taking more than 40 GB of space.

On a dedicated server with 800 GB this is not a big deal, but on a small VServer with a 60 GB disk this could be a huge problem!

If somebody has the same problem and a solution for it, please let me know!

mongodb replicasets with read errors

If you have a ReplicaSet setup with MongoDB, you try to read from a slave node and you get this exception:

{ "$err" : "not master and slaveok=false", "code" : 13435 }

or this here:

uncaught exception: count failed: { "errmsg" : "not master", "ok" : 0 }

Then the problem is that you are not allowed to read from the slave. By default MongoDB disables reads from slaves. You can solve that by executing this on the slave in the mongo shell:

rs.slaveOk()

Or just reconfigure your mongo driver.

MongoDB Import / Export

If you want to export a whole database on MongoDB, this is the tool you are looking for:

mongodump --host <HOST> --port <PORT> --db <DATABASE> --username <USERNAME> --password <PASSWORD>

That will export all your collections into a separate folder. Each collection is exported to a single bson file. After you have executed the command you will have a “dump” folder. Under the dump folder there is a folder with the name of the DATABASE, and under the database folder are the collections in separate bson files.

If you want to import the files into another MongoDB on another server, this is the tool you are looking for:

mongorestore --host <HOST> --port <PORT> --username <USERNAME> --password <PASSWORD> dump/<DATABASENAME>

That worked fine for me 🙂
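
If you run both commands from a script, it helps to assemble the arguments in one place so that a flag cannot be duplicated or forgotten by accident. Here is a minimal Ruby sketch. The helper names connection_flags, dump_command and restore_command are made up for this example, the values in SETTINGS are placeholders, and only the flags shown above are used.

```ruby
require "shellwords"

# Assemble the connection flags shared by mongodump and mongorestore.
def connection_flags(host:, port:, username:, password:)
  ["--host", host, "--port", port.to_s,
   "--username", username, "--password", password]
end

# Argument list for exporting one database into the "dump" folder.
def dump_command(db:, **connection)
  ["mongodump", *connection_flags(**connection), "--db", db]
end

# Argument list for importing the exported database folder again.
def restore_command(db:, **connection)
  ["mongorestore", *connection_flags(**connection), File.join("dump", db)]
end

# Placeholder values for illustration only.
SETTINGS = { host: "localhost", port: 27017, username: "admin",
             password: "secret", db: "veye_dev" }

puts Shellwords.join(dump_command(**SETTINGS))
puts Shellwords.join(restore_command(**SETTINGS))
```

The sketch only prints the command lines. To actually execute one of them you could pass the array to system, for example system(*dump_command(**SETTINGS)).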

 

MongoDB Indexing

I am using Ruby and MongoDB in a project, and today I got this exception:

Mongo::OperationFailure (too much data for sort() with no index):

The problem really was that the searched field was not indexed. You can add an index like this:

db.products.ensureIndex({name:1});

That means the attribute “name” of the documents in the “products” collection gets indexed in ascending order. That improved my search performance enormously. But if the result set is too big, you will still get the same error.

I just solved the problem by limiting the search results. If you are using MongoID for Ruby you can add this to your query:

.limit(300)

That means you will just get the first 300 search results. In my case that is perfect.

Useful MongoDB commands

I already showed how to install MongoDB on Mac OS X here:
https://robert-reiz.com/2011/08/11/installing-mongodb-on-mac-ox-x-lion/

If everything is installed correctly, you can start the server process as root with this command:

mongod

and the client without root rights with this command:

mongo

By default you are connected to the “test” database. You can show all databases with this command:

show dbs

You can switch to an existing DB or create a new DB with this command:

use mynewdb

With this command you can show all “collections” inside of the DB:

show collections

You can make an insert with this command:

db.users.save( { name : "myname" } )

If the collection “users” does not exist it will be created with the first call. With this command you can see all entries in the collection “users”:

db.users.find()

With count() you can see how many elements are in the collection.

db.users.count()

And with “remove” you can remove all elements from a collection.

db.users.remove()

And with drop you can drop the entire collection.

db.users.drop()

Installing MongoDB on Mac OS X Lion

Today I tried to install mongodb via MacPorts on Mac OS X Lion. I got this error message:

Error: Target org.macports.build returned: shell command failed (see log for details)
Log for mongodb is at: /opt/local/var/macports/logs/_opt_local_var_macports_sources_rsync.macports.org_release_ports_databases_mongodb/mongodb/main.log
Error: Status 1 encountered during processing.
To report a bug, see <http://guide.macports.org/#project.tickets>

Maybe it is not the best idea to install MongoDB via MacPorts. I just downloaded the compiled binaries from the MongoDB homepage: http://www.mongodb.org/downloads. Just unzip the archive and that’s it.

Now you have to create a directory.

mkdir /data
cd /data
mkdir db

MongoDB expects that this directory exists and puts all necessary DB files into this directory.

Navigate in the command line to the unpacked directory and execute this command as root to start the MongoDB server:

./bin/mongod

And the server is running 🙂
In another shell you can start the client. The client can run without root rights.

./bin/mongo

Now you can start to work:

> db.foo.save( { a : 1 } )
> db.foo.find()

Installing MongoDB on YDL

MongoDB is a pretty good document database. It is NOT an object database. Anyway. If you want to learn more about MongoDB check out this site: http://www.mongodb.org

As far as I know MongoDB is not available via the yum package manager of Yellow Dog Linux (YDL). But it is pretty easy to install. Just download the Linux binaries and unpack them.

cd /opt/
sudo wget http://downloads.mongodb.org/linux/mongodb-linux-x86_64-1.8.1.tgz
sudo tar -xzf mongodb-linux-x86_64-1.8.1.tgz

For convenience you can create a short symlink:

sudo ln -s mongodb-linux-x86_64-1.8.1 mongodb

Now you can start all mongo binaries under “/opt/mongodb/bin”. For convenience you can create more symlinks.

cd /usr/bin
sudo ln -s /opt/mongodb/bin/mongod
sudo ln -s /opt/mongodb/bin/mongo

MongoDB will save all the database stuff in the directory “/data/db”. If this directory does not exist you will get an error. Just create the directory.

sudo mkdir /data
sudo mkdir /data/db

Now you can start the MongoDB server

sudo mongod

Your MongoDB server is up and running. In a different shell you can use the mongo client to connect to the server.

mongo

Now you are in the Mongo Shell. By default the mongo client connects to the database “test”. Try this:

db.foo.save({"b":"2"})
db.foo.find()

Now you should see something like this:

{ "_id" : ObjectId("4dd975e60994b04312264239"), "b" : "2" }

Congratulations! You can exit the shell with this command:

exit