Inject SSH pub key to Vagrant image

Usually, when you create a Vagrant VM, Vagrant sets it up with its insecure key pair, whose private key is located at ~/.vagrant.d/insecure_private_key. In Ansible you can reference that key to get a passwordless login to the VM. Since Vagrant 1.8.5 this no longer works, for security reasons. That’s why I now use this shell provisioner with a bit of Ruby code to inject my public SSH key into the VM:

config.vm.provision "shell" do |s|
  ssh_pub_key = File.readlines("#{Dir.home}/.ssh/").first.strip
  s.inline = <<-SHELL
    echo #{ssh_pub_key} >> /home/ubuntu/.ssh/authorized_keys
    echo #{ssh_pub_key} >> /root/.ssh/authorized_keys
    apt-get -y install python-simplejson
  SHELL
end

The 2nd line is pure Ruby code. It reads the public SSH key from the .ssh directory in the home directory and stores the content in the ssh_pub_key variable. The first two lines of the inline shell script append the SSH key to the authorized keys of the users ubuntu and root.

With that, the VM is built with my own public SSH key and I can log in to the VM via SSH without entering a password. That also makes it super easy to manage the VM later with Ansible.
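
The key-reading step can be made a little more defensive. The following is only a minimal sketch and not part of the original Vagrantfile: the helper names read_public_key and authorized_keys_script are assumptions for illustration. It fails early with a clear message when the key file is missing or empty, and it escapes the key for the shell so that spaces and the comment part of the key line are passed through literally.

```ruby
require "shellwords"

# Hypothetical helper: returns the first line of a public key file.
def read_public_key(path)
  raise ArgumentError, "public key not found: #{path}" unless File.file?(path)

  key = File.readlines(path).first.to_s.strip
  raise ArgumentError, "public key is empty: #{path}" if key.empty?

  key
end

# Hypothetical helper: builds one shell line per home directory that
# appends the key to that user's authorized_keys file.
def authorized_keys_script(key, home_dirs)
  home_dirs.map do |home|
    "echo #{Shellwords.escape(key)} >> #{home}/.ssh/authorized_keys"
  end.join("\n")
end
```

Inside the provisioner block, the result of authorized_keys_script could be assigned to s.inline, with the path to your own public key file passed to read_public_key.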

MongoDB Map & Reduce with Date filter

We are using MongoDB as the primary DB at VersionEye, together with Mongoid. Each software package is a document in the “products” collection. Each product document has an embedded collection of “versions”. Assume we want to know how many versions/artifacts existed for a given language at a given point in time.

That is not a simple query in MongoDB. This kind of query can be handled with Map & Reduce, which lets you execute JavaScript at the DB level. Here is the current solution:

border = until_date.at_midnight +

map = %Q{
  function() {
    if ( this.versions == null || this.versions.length == 0 ) return;

    that_day = new ISODate("#{border.iso8601}");
    for (var version in this.versions){
      created = this.versions[version].created_at
      if (created != null && created.getTime() < that_day.getTime()){
        emit( this.versions[version]._id, { count: 1 } );
      }
    }
  }
}

reduce = %Q{
  function(key, values) {
    var result = { count: 0 };
    values.forEach(function(value) {
      result.count += value.count;
    });
    return result;
  }
}

Product.where(:language => language, => border ).map_reduce(map, reduce).out(inline: true)

The tricky part was this line:

that_day = new ISODate("#{border.iso8601}");

The challenge was finding out how to convert a Ruby Date object into a JavaScript Date object.
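
To make the conversion concrete, here is a small sketch in plain Ruby. It only builds the JavaScript expression as a string and does not talk to MongoDB; the method name iso_date_literal is an assumption for illustration. Time#iso8601 from Ruby's standard library produces an ISO 8601 string, which is the format passed to ISODate in the map function above.

```ruby
require "time"

# Hypothetical helper: renders a Ruby Time as a JavaScript ISODate expression.
# getutc returns a UTC copy, so the caller's object is not modified.
def iso_date_literal(time)
  %Q{new ISODate("#{time.getutc.iso8601}")}
end

border = Time.utc(2014, 5, 22)
puts iso_date_literal(border)
```

Converting to UTC first keeps the generated string independent of the time zone of the machine that runs the Ruby code.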

Also, you have to know that even though you are iterating over the versions collection, you cannot access the version object through “version”, because a JavaScript for...in loop yields the key and not the element. You have to access it this way:

this.versions[version]

Otherwise it works fine 🙂
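
If Map & Reduce is new to you, the following plain-Ruby sketch mimics what the two JavaScript functions do, without MongoDB. The sample data and the method names map_versions and reduce_counts are made up for illustration. The map step emits one { count: 1 } per version created before the border, keyed by the version id, and the reduce step sums the counts per key.

```ruby
# Made-up sample data: products with embedded versions.
PRODUCTS = [
  { name: "a", versions: [
    { id: "v1", created_at: Time.utc(2014, 1, 10) },
    { id: "v2", created_at: Time.utc(2014, 6, 1) }
  ] },
  { name: "b", versions: [
    { id: "v3", created_at: Time.utc(2013, 12, 24) },
    { id: "v4", created_at: nil }
  ] },
  { name: "c", versions: nil }
].freeze

# Map step: emit [key, value] pairs for versions created before the border.
def map_versions(product, border)
  (product[:versions] || []).each_with_object([]) do |version, emitted|
    created = version[:created_at]
    emitted << [version[:id], { count: 1 }] if created && created < border
  end
end

# Reduce step: sum up all values emitted for one key.
def reduce_counts(_key, values)
  values.reduce({ count: 0 }) do |result, value|
    { count: result[:count] + value[:count] }
  end
end

border  = Time.utc(2014, 5, 22)
emitted = PRODUCTS.flat_map { |product| map_versions(product, border) }
results = emitted.group_by(&:first).map do |key, pairs|
  [key, reduce_counts(key, pairs.map(&:last))]
end.to_h

puts results.size # number of versions that existed before the border
```

Because the key is the version id, every emitted key is unique in this sketch, so the number of result entries equals the number of versions that existed before the border.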

PDFKit – invalid byte sequence in US-ASCII

I’m using PDFKit at VersionEye to generate the PDF invoices. It’s a really awesome project. The idea behind PDFKit is that you write the documents in HTML and CSS and then convert them to PDF. That works really well. Generating a PDF works like this:

kit = PDFKit.new(html, :footer_html => footer_file, :page_size => 'A4')

The first parameter “html” is the HTML as a string. In addition, you can pass the path to a separate HTML file as the footer. And of course you can choose the page size, in this case DIN A4.

That all worked really well, but sometimes I got an

invalid byte sequence in US-ASCII Exception

I found out that there was a special character in the HTML. That can happen if you fill the HTML template with usernames, for example, and one of the names contains non-ASCII characters such as French accents or Chinese characters. But luckily there is a solution for that: you can enforce UTF-8 encoding for the string.

This line fixed it for me.

html = html.force_encoding(Encoding::UTF_8)
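
To see what force_encoding does, here is a small self-contained sketch with a made-up sample string. Note that force_encoding does not convert any bytes. It only changes the encoding label of the string, so it helps when the bytes are in fact UTF-8 but the string is labelled as US-ASCII.

```ruby
# The bytes of "René" in UTF-8, but wrongly labelled as US-ASCII.
html = "<p>Ren\xC3\xA9</p>".dup.force_encoding(Encoding::US_ASCII)
valid_before = html.valid_encoding?

# Relabel the same bytes as UTF-8.
html = html.force_encoding(Encoding::UTF_8)
valid_after = html.valid_encoding?

puts "valid before: #{valid_before}, valid after: #{valid_after}"
```

If the bytes are not valid UTF-8 either, relabelling alone will not help; in that case String#scrub can replace the invalid sequences.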

daemon script

This shell script runs forever and checks whether the Rails worker is running. If it is not, the script starts it again:

while :
do
  if ps ax | grep -v grep | grep 'rails worker' > /dev/null
  then
      echo "service running, everything is fine"
      sleep 5
  else
      echo "service is not running. Lets start again"
      cd $APP_ROOT
      $BUNDLE exec unicorn_rails -D -c $CONF
      echo "restarted on $(cat /rails/pids/"
      sleep 15
  fi
done
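
The check at the heart of the loop can also be written in Ruby, which makes it easy to test. This is only a sketch under assumptions: the method name process_running? and the sample ps output are made up, and in a real watchdog the first argument would be the output of running ps ax.

```ruby
# Returns true if any line of the given `ps ax` output mentions the pattern.
# Lines that belong to grep itself are ignored, like `grep -v grep` does.
def process_running?(ps_output, pattern)
  ps_output.each_line.any? do |line|
    line.include?(pattern) && !line.include?("grep")
  end
end

sample = <<~PS
    PID TTY      STAT   TIME COMMAND
   1201 ?        Sl     0:12 rails worker[0]
   1350 pts/0    S+     0:00 grep rails worker
PS

puts process_running?(sample, "rails worker")
```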

Geek2Geek – Centralized Logging

Last week it happened again. Geek2Geek!


This time we came together at Flyeralarm in Berlin to talk about centralized logging. That is an interesting topic for all companies which have to scale. As soon as you have more than one server you need to think about how you collect and analyze your log files in a distributed system. There are a couple of good solutions out there for this problem.

Jilles van Gurp gave the first talk, about the ELK stack. ELK stands for Elasticsearch, Logstash and Kibana. All three products belong to the Elasticsearch company and work together smoothly. Jilles showed us how they use the ELK stack at Linko to build the LinkoApp.


Jilles gave us a short intro to the technology on a couple of slides before he switched to the live demo. It was very interesting to listen to his real-world experiences with the ELK stack.


The lesson from the past couple of months is that it is easy to set up, but you should be careful with the Elasticsearch cluster. Don’t shut it down all at once 😉

After the first presentation the Pizza arrived and we took a little break with Pizza & Beer.


Lennart is THE guy behind Graylog2. He started the project a couple of years ago at Jimdo. The very first version was implemented in Ruby. Graylog2 is a complete rewrite in Java. Lennart is also co-founder of Torch, the company behind Graylog2.

Lennart gave a short intro about the history, intention and philosophy behind Graylog2.


I was impressed by how much he knows about the other logging solutions, such as Logstash/Kibana and Splunk. He was not afraid to talk about feature comparisons and the pros & cons of the different solutions.


Graylog2 is built for enterprise usage. It is optimized for speed and high-volume data. The interesting thing is that you can use it together with Elasticsearch and Logstash.

Many thanks to Jilles and Lennart for the great talks. Both solutions are very interesting. If you still read logs on the server with “less” you should definitely check out these two great solutions!

By the way, I also tried to organize a Splunk talk, but unfortunately I couldn’t find any Splunkies willing to give a talk about Splunk at Geek2Geek.

Many thanks to Flyeralarm for sponsoring Location, Pizza and Beer! You guys are awesome!

By the way, Flyeralarm just opened a new branch in Berlin. They have a really nice office. This is their meeting room, for example.

[Screenshot: Flyeralarm meeting room]

And they are currently looking for experienced PHP developers. If you are interested you should contact Thomas.

Deployment with Capistrano 3

Capistrano is a Ruby-based deployment tool which executes commands in parallel on multiple remote machines via the SSH protocol. With Capistrano you can deploy your Rails application on N servers with one single command from your dev machine. You don’t even need to log in to your servers via SSH. This command rolls out your application on N servers:

cap production deploy

And if something goes wrong you can easily roll back to the last stable deployment, just like this:

cap production deploy:rollback

Capistrano is pretty cool. I already used the previous version 2.x. I have been using the new version 3.x in production for a couple of months and it is super stable.

If you are deploying your Rails application to dedicated servers or instances on AWS, then Capistrano is the way to go!

Before you start with Capistrano, you have to set up SSH authentication with keys instead of passwords. That way you can log in to your server with a simple “ssh user@server” without a password. That is possible if your public SSH key is on the server, so that the server “knows” you.

First of all you need to add the Gem to your Gemfile.

gem 'capistrano'

And if you are using Rails and Bundler you want to add these two lines as well.

gem 'capistrano-rails' , '~> 1.1.1'
gem 'capistrano-bundler', '~> 1.1.2'

Now you have to run Bundler to install the packages.

bundle install

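As the next step you have to capify your Rails project. In Capistrano 3 the “capify .” command from version 2 has been replaced by “cap install”, and the STAGES variable lists the environments to generate. Just run:

cap install STAGES=production,staging,test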

That will create some files in your project.

[add] writing './Capfile'
[add] writing './config/deploy.rb'
[add] writing './config/deploy/production.rb'
[add] writing './config/deploy/staging.rb'
[add] writing './config/deploy/test.rb'
[done] capified!

In the Capfile you can require some Capistrano packages. For a Rails app it looks like this.

require 'capistrano/setup'
require 'capistrano/deploy'
require 'capistrano/bundler'
require 'capistrano/rails'
require 'capistrano/rails/assets'
require 'capistrano/rails/migrations'

# Loads custom tasks from `lib/capistrano/tasks' if you have any defined.
Dir.glob('lib/capistrano/tasks/*.cap').each { |r| import r }

In Capistrano 3 most of the magic happens in the deploy.rb file, which is the central configuration file for Capistrano. In general it fetches the current code from your Git server, runs bundler, rake db:migrate, precompiles your assets and starts/restarts the ruby app server.

Here is my deploy.rb with some additional comments.

# Force rake through bundle exec
SSHKit.config.command_map[:rake] = "bundle exec rake"

# Force rails through bundle exec
SSHKit.config.command_map[:rails] = "bundle exec rails"

set :migration_role, 'app' # Defaults to 'db'
set :assets_roles, [:app] # Defaults to [:web]

# The name of your application
set :application, 'myapp'

# Configuration for the source control management system
set :scm , :git
set :repo_url, ''
set :branch , "master"

# Forwards the local SSH agent, so that the keys of the
# local user are used for the Git authentication.
set :ssh_options, {:forward_agent => true}

# User on remote server
set :user , "ubuntu"

# Application root directory on remote server
set :deploy_to , '/var/www/myapp'

# Shared directories over different deployments
set :linked_dirs, %w(pids log)

# Configuring capistrano log output
set :format , :pretty
set :log_level, :info # :debug :error :info

# Keeps the last 5 deployments on the server for rollback scenarios
set :keep_releases, 5

namespace :deploy do

  desc 'Start application'
  task :start do
    on roles(:app), in: :sequence, wait: 5 do
      execute "/etc/init.d/ start"
    end
  end

  desc 'Stop application'
  task :stop do
    on roles(:app), in: :sequence, wait: 5 do
      execute "/etc/init.d/ stop"
    end
  end

  desc 'Restart application'
  task :restart do
    on roles(:app), in: :sequence, wait: 5 do
      execute "/etc/init.d/ restart"
    end
  end

  after :finishing, 'deploy:restart'
  after :finishing, 'deploy:cleanup'

end


The script for starting and stopping Unicorn can be found here:

In Capistrano you have different environments, for example “test”, “staging” and “production”. You can define as many as you want. Each environment has its own configuration file under “config/deploy/”, for example “config/deploy/production.rb”, which might look like this:

set :stage, :production

# Setting RAILS_ENV environment variable on server
set :rails_env, :production

set :normalize_asset_timestamps, %{public/images public/javascripts public/stylesheets}

role :app, %w{ubuntu@myapp_server}

set :ssh_options, {
   forward_agent: true # , auth_methods: %w(password)
}

The most important line is the one with the role. In Capistrano you can define different roles and assign them to different servers, so that some deployment commands are only executed on specific servers. You can read more about that in the official documentation. For this article I keep it simple and go ahead with only one role and one server.

On the remote server(s) you have to create the application root directory. If your application has the name “myapp” it would look like this:


Make sure that the user you defined in the deploy.rb file has full read and write access to these directories. For each deployment Capistrano creates a separate directory in the “releases” directory, named with the timestamp of the deployment. The “current” directory, “/var/www/myapp/current”, is a symbolic link to the latest deployment in “/var/www/myapp/releases”.
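
The directory layout is easier to picture with a small simulation. The following sketch is not how Capistrano is implemented. It only reproduces the described layout in a temporary directory with Ruby's standard library, and the method name deploy_release and the timestamps are made up for illustration.

```ruby
require "fileutils"
require "tmpdir"

# Creates releases/<timestamp>, points the "current" symlink at it and
# removes all but the newest `keep` releases.
def deploy_release(deploy_to, timestamp, keep: 5)
  releases = File.join(deploy_to, "releases")
  release  = File.join(releases, timestamp)
  FileUtils.mkdir_p(release)

  # Replace the symlink instead of linking into the old target directory.
  current = File.join(deploy_to, "current")
  File.unlink(current) if File.symlink?(current)
  File.symlink(release, current)

  # Timestamps sort chronologically, so the oldest releases come first.
  sorted = Dir.children(releases).sort
  sorted.first([sorted.size - keep, 0].max).each do |name|
    FileUtils.rm_rf(File.join(releases, name))
  end

  release
end

Dir.mktmpdir do |deploy_to|
  %w[20140101120000 20140102120000 20140103120000].each do |timestamp|
    deploy_release(deploy_to, timestamp, keep: 2)
  end

  puts Dir.children(File.join(deploy_to, "releases")).sort.inspect
  puts File.basename(File.readlink(File.join(deploy_to, "current")))
end
```

The keep parameter plays the role of the keep_releases setting: after three simulated deployments with keep: 2, only the two newest release directories remain and “current” points at the newest one.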

Now you can deploy with:

cap production deploy

If you have done everything right, the deployment will run through and deploy your application.

This command shows you all possible Capistrano tasks:

cap -T

If you don’t deploy on Heroku or CloudControl, then Capistrano is a big help. It makes life much easier 🙂

Let me know if you have questions. Either in the comments or on Twitter.