Thursday, July 25, 2013

Server side modules with Architect

After my last notes on a client side modular architecture using require.js, I'll talk about server side modules using Architect.


In my opinion, every application (not only the bigger ones) needs to be divided into interdependent modules. This helps a lot when testing and enhancing your application.
In node.js applications you will see and use this pattern very often:

//import a module
var Database = require('./db');

// initialize a module using some options 
var db = new Database('http://my.db.connection');

// initializing a module injecting a dependency 
var usermanager = require('./usermanager')(db);

But what if you want to use this module in other applications? In that case you should, of course, make an independent npm package for each module. And, in the main application, you would initialize each package in the proper order. This is repetitive and tedious; furthermore, when the number of packages starts to grow, initializing each module in the proper order can become an issue.

From modules to plugins

Architect uses a simple declarative syntax to describe the interconnections between packages. It also starts the system in the correct order, initializing each package just once.
I recommend reading the documentation on github. It's very clear and detailed.
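To give the flavor of it, here is a minimal sketch of an Architect-style plugin: a setup function that receives options, the resolved imports, and a register callback. The names (usermanager, fakeDb) are mine, just for illustration; the simulation at the bottom only mimics what Architect does when wiring plugins together.

```javascript
// Hypothetical Architect-style plugin: setup(options, imports, register).
// It consumes a "db" service and provides a "usermanager" service.
function setup(options, imports, register) {
    var db = imports.db; // the injected dependency
    register(null, {
        usermanager: {
            getUser: function (id) { return db.get(id); }
        }
    });
}

// Simulate what Architect does: call setup with the resolved imports.
var fakeDb = { get: function (id) { return { id: id, name: 'alice' }; } };
setup({}, { db: fakeDb }, function (err, provided) {
    console.log(provided.usermanager.getUser(1).name); // prints alice
});
```

The real library also reads a "plugins" section from each package.json to learn what a plugin provides and consumes; see the Architect docs for the exact format.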

Express.js and Architect

I have set up a simple express.js/architect boilerplate to show how to use Architect to make an extensible express.js application.

I hope this will be useful ...

A friend of mine suggested using a git subtree for each package. Nice idea!

Thursday, July 18, 2013

Client side modules with require.js

In this post I will show you how to use Require.js to split a project into simple and manageable modules.


I really didn't want to dive into the differences between these two module systems, but I think it's very important to be clear about which module system we are talking about.

CommonJS is the module system used by node.js. You define a module like this (foobar.js):

module.exports = {
    foo: 'bar'
};
And load a module doing this:

var foobar = require('./foobar');
console.log(foobar.foo); // prints bar

You can use CommonJS modules in the browser using browserify.
If you usually work with Javascript in the browser you will notice two issues:
  • the exports object would be overwritten every time by different modules
  • it uses a synchronous approach
As a matter of fact, it cannot work in the browser without a build step (and this is browserify's job).

AMD instead is designed from the ground up to work in the browser.

That said, I am not advocating either of these systems. They are both very useful, even though they take different approaches.

I started by explaining CommonJS because, unfortunately, both systems use a function called "require".
Now that you can't be fooled by this anymore, let's go on.

What is an AMD module

An AMD module must be contained in a single file and is wrapped in a call to the global function define.
"define" takes two parameters: an array of dependencies and a factory function containing the actual module code.

define(['module1', 'module2'], function(module1, module2) {
    'use strict';

    var namespace = {};
    return namespace;
});
In this example I have defined a module called "module3" (the name comes from its file name, module3.js). This module needs module1 and module2 to run.
The value returned by the factory function is what another module gets when it requires module3.
module1 and module2 are resolved by loading module1.js and module2.js (both AMD modules) via AJAX.
The job of require.js is basically to resolve the dependency tree and to make sure that every module is run just once.
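To make that concrete, here is a toy resolver, not real require.js, just a sketch of the idea: it resolves the dependency tree and caches each module so its factory runs only once. (In real AMD the module name comes from the file name; here I pass it explicitly.)

```javascript
// Minimal sketch of what an AMD loader does.
var registry = {};  // name -> { deps, factory }
var cache = {};     // name -> exported value

function define(name, deps, factory) {
    registry[name] = { deps: deps, factory: factory };
}

function resolve(name) {
    if (name in cache) return cache[name];       // run just once
    var mod = registry[name];
    var args = mod.deps.map(resolve);            // resolve the tree first
    cache[name] = mod.factory.apply(null, args); // the return value is the module
    return cache[name];
}

var runs = 0;
define('module1', [], function () { runs++; return { one: 1 }; });
define('module2', ['module1'], function (m1) { return { two: m1.one + 1 }; });
define('module3', ['module1', 'module2'], function (m1, m2) {
    return { three: m1.one + m2.two };
});

console.log(resolve('module3').three); // prints 3
console.log(runs);                     // prints 1 - module1 ran only once
```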

A module has a pair of interesting properties:

  • It is loaded the first time it is required by another module.
  • Not a single variable is added to the global namespace. For this reason you can even use different versions of the same library if you need to.

Bootstrap and configuration

After defining your modules you will need to bootstrap the application (configure and load the first module). To do this, add to your page a script tag with require.js and the URL of the bootstrap script (main.js):

<script data-main="/js/main" src="js/vendor/require.js"></script>

Once loaded, require.js will fetch the "main.js" bootstrap using AJAX. From this point on, every script will be loaded asynchronously, and the DOMContentLoaded event (the jQuery ready event) will be fired independently of script loading.

Main.js is made of two parts. The first one is the require.js configuration:

require.config({
  baseUrl: "js/",
  paths: {
    jquery: 'vendor/jquery-1.9.1',
    underscore: 'vendor/underscore',
    backbone: 'vendor/backbone'
  },
  shim: {
      underscore: {
          exports: "_"
      },
      backbone: {
          deps: ['underscore', 'jquery'],
          exports: 'Backbone'
      },
      'jquery.bootstrap': {
          deps: ['jquery'],
          exports: '$'
      }
  }
});
Here are the most important parameters:
baseUrl: this is the path used to resolve modules. So if you require "module1", it will be loaded from "js/module1.js".

paths is very useful to define scripts that live in some other path. When you require "jquery", the script "js/vendor/jquery-1.9.1.js" will be loaded.


Until now I have assumed that every script is an AMD module, but this is not always true.
The require.js shim option allows you to automatically wrap a non-AMD script into an AMD define.

"deps" are dependencies to be injected and "exports" is the value to be returned:

For example, your backbone.js will behave as if it were wrapped like this:

define(['underscore', 'jquery'], function (_, $){

    // ...actual backbone.js source...

    return Backbone;
});


The trick also works when you have to add attributes to an existing object (like defining a jQuery plugin):

define(['jquery'], function ($){

    $.fn.myJqueryPlugin = function (){
        // ...plugin code...
    };

    return $;
});

This is the case with Twitter Bootstrap.


The second part of main.js is the actual bootstrap. It loads and executes the first module:

require([
  // Load our app module and pass it to our definition function
  'app'
], function(App){
  // The "app" dependency is passed in as "App"
  App.initialize();
});

The app module then starts a chain of loading that will pull in the entire application's scripts.

The useful text plugin

The text plugin allows you to add text files as dependencies and this is very useful for client side templating.
Just add "text" to the paths config:

  paths: {
    text: 'vendor/text'
  }

and you can load your templates:

define(['underscore', "text!templates/template.html"], function (_, template_html){
    var template = _.template(template_html);
    // ...
});

You can also write your own plugin as described here.

Script optimization

Using a lot of modules has an obvious drawback: the time spent loading those modules in the browser.
Require.js comes with a useful tool for optimizing scripts: r.js. It analyzes, concatenates and minifies all the dependencies into one single script.

I warmly recommend using a build step to perform this operation automatically.

I use grunt and a grunt plugin to automate everything.

Installing grunt and the require.js optimizer plugin

Grunt is structured in two different modules, "grunt-cli" and "grunt", plus a series of plugins. "grunt-cli" can be installed globally:

npm install -g grunt-cli

grunt and the plugins should be installed locally, pinning a release version. This allows each project to use different grunt versions and plugins.

npm install grunt --save-dev

npm install grunt-contrib-requirejs --save-dev

The --save-dev option adds the modules to package.json under the "devDependencies" key, using the latest release:

  "devDependencies": {
    "grunt": "~0.4.1",
    "grunt-contrib-requirejs": "~0.4.1"
  }

You can also do this manually and launch "npm install".

Grunt configuration

Grunt needs a configuration file called Gruntfile.js. This is an example from a past project of mine:

module.exports = function(grunt) {

  // Project configuration.
  grunt.initConfig({
    pkg: grunt.file.readJSON('package.json'),
    requirejs: {
      compile: {
        options: {
          mainConfigFile: "static/js/main.js",
          baseUrl: "static/js",
          name: "main",
          paths: {
            'socketio': 'empty:',
            'backboneio': 'empty:'
          },
          out: "static/js/main-built.js"
        }
      }
    }
  });

  // Load the plugin that provides the "requirejs" task.
  grunt.loadNpmTasks('grunt-contrib-requirejs');

  // Default task(s).
  grunt.registerTask('default', ['requirejs']);
};


In the "requirejs" configuration I have:
mainConfigFile: the path of the bootstrap script
baseUrl: the same as configured in main.js
name: the name of the main script
paths: in this section I list a pair of scripts that should not be included in the optimized bundle ('empty:'). These two files are served directly by my node module. You can do the same for scripts served through a CDN or an external service.
out: the output file

"registerTask" allows me to launch the whole optimization step with the plain "grunt" command (no options).

At the end of the task I can load my new optimized script using:

<script data-main="/js/main-built" src="js/vendor/require.js"></script>
I think this is all. Stay tuned for the next!

Edit: If you liked this you'll probably be interested in this other blog post on require.js edge cases.

Monday, July 1, 2013

Writing a real time single page application - server side

In the first part I highlighted how to take care of the frontend of a real time web app. Now I'll explain how to approach the backend. As usual I will not dive into details; follow the links if you are looking for in-depth explanations.

Server side

I wrote the server side of my application using node.js and a very lightweight framework called express.js.
The most important feature of this framework is the middleware system.
This is a middleware:

function middleware(req, res, next){
    // ...
}

A middleware is a sort of Russian doll: you can put a middleware inside another middleware (via the "next" argument).

A middleware can basically:
  • call the next middleware in the chain ( next() )
  • call the output function ( res() )
  • change the input (req)
  • overwrite the output function (res)
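These capabilities can be sketched with a minimal, express-free chain runner. runChain, addUser and hello are hypothetical names I made up to show the mechanics, not express internals:

```javascript
// Run a list of middleware functions: each one can stop the chain
// by calling res(), or hand off to the next one via next().
function runChain(middlewares, req, res) {
    var i = 0;
    function next(err) {
        if (err) return res('Error: ' + err.message);
        var mw = middlewares[i++];
        if (mw) mw(req, res, next);
    }
    next();
}

// a middleware that changes the input
function addUser(req, res, next) {
    req.user = 'alice';
    next();
}
// a middleware that calls the output function
function hello(req, res, next) {
    res('hello ' + req.user);
}

runChain([addUser, hello], { url: '/' }, function (out) {
    console.log(out); // prints hello alice
});
```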

You can use a middleware for:
  • authenticate and get user information
  • route to a specific middleware using the URL (req.url) and the method (req.method)
  • add a specific header to the HTTP response
  • etc.

Using middleware is very common because it allows you to build simple, reusable components.

Express.js already provides a lot of middleware, but I used passportjs to get a complete solution for authentication.
In this example I will store users in couchdb using nano:

var config = require('./config'), // I used a config.json to store configuration parameters
    db_url = config.db_protocol + "://" + config.db_user + ":" + config.db_password + "@" + config.db_url,
    nano = require('nano')(db_url),
    userdb = nano.db.use(config.db_name), // the couchdb database holding the users (db_name assumed to be in config.json)
    setupAuth = require('./auth'),
    MemoryStore = express.session.MemoryStore,
    sessionStore = new MemoryStore(), // passport will store the user id in this session
    passport = setupAuth(userdb);

var app = express(),
    server = http.createServer(app);
// configure Express
app.configure(function() {
    app.set('views', __dirname + '/views');
    app.set('view engine', 'ejs');
    app.use(express.session({ store: sessionStore, secret: config.session_secret, key: config.session_key }));
    // Initialize Passport!  Also use passport.session() middleware, to support
    // persistent login sessions (recommended).
    app.use(passport.initialize());
    app.use(passport.session());
    app.use(express.static(__dirname + '/' + config.static_dir));
});
var ensureAuthenticated = function(req, res, next) {
    if (req.isAuthenticated()) {
        return next();
    }
    res.redirect('/login');
};

// I check if a user is authenticated before accessing this URL
app.get('/', ensureAuthenticated, function(req, res){
    res.render('index', { user: req.user });
});
// login form
app.get('/login', function(req, res){
    res.render('login', { user: req.user });
});

// login submission
app.post('/login',
    passport.authenticate('local', { failureRedirect: '/login', failureFlash: true }),
    function(req, res) {
        res.redirect('/');
    });

app.get('/logout', function(req, res){
    req.logout();
    res.redirect('/');
});

The auth module contains functions used by passport:

var passport = require('passport'),
    LocalStrategy = require('passport-local').Strategy,
    crypto = require('crypto');

module.exports = function (userdb){
    var getUserFromId = function(id, done) {
        userdb.get(id, { revs_info: false }, function(err, body) {
            if (!err){
                return done(null, body);
            }
            else {
                return done(null, false, { message: 'Invalid credentials' });
            }
        });
    };

    passport.getUserFromId = getUserFromId;

    passport.serializeUser(function(user, done) {
        done(null, user._id);
    });
    // restore the user from the id stored in the session
    passport.deserializeUser(getUserFromId);

    passport.use(new LocalStrategy(
        function(username, password, done) {
            var shasum = crypto.createHash('sha1').update(password),
                key = [username, shasum.digest('hex')];

            // this view's keys are [username, password]
            // the password is hashed of course
            userdb.view("user", "login", { keys: [key] }, function (err, body){
                if (!err) {
                    if (body.rows.length){
                        // logged in !!!
                        return done(null, body.rows[0].value);
                    }
                    else {
                        return done(null, false, { message: 'Invalid credentials' });
                    }
                }
                else {
                    return done(null, false, { message: 'Db error, try later' });
                }
            });
        }
    ));

    return passport;
};

This example is explained in the passport documentation.
If you look carefully, you will notice that the getUserFromId function is called every time we need the complete user object from the database (couchdb in this case).
This is not optimal; it's better to cache users for some time. I used this nice memoization module, memoizee:

    var memoize = require('memoizee');
    // cache this function for optimal performance (2 minutes)
    getUserFromId = memoize(getUserFromId, { maxAge: 120000, async: true });

At this point, in the server, I will define a backend (as explained in the backbone.io documentation):

var items_backend = backboneio.createBackend();

A backend is very similar to an express.js middleware:

var backendMiddleware1 = function(req, res, next) {
    // ...
    next();
};

Once the backend is defined, I connect it:

var io = backboneio.listen(server, {items: items_backend});

The io object returned by the listen method is a object.

When a websocket is connected for the first time, it performs a basic handshake. This phase can be used to perform the passport authentication:

var cookie = require('cookie'),
    cookiesig = require('cookie-signature');

// authorization
io.set('authorization', function (data, accept) {
    if (data.headers.cookie) {
        data.cookie = cookie.parse(data.headers.cookie);

        // cookies are signed for better security:
        // s:name:signature
        // s: is a prefix for signed cookies
        // name is the cookie name
        // signature is an hmac of the value
        // this way the client cannot change the cookie value
        // without invalidating the cookie
        if (data.cookie[session_key].indexOf('s:') === 0){
            data.sessionID = cookiesig.unsign(data.cookie[session_key].slice(2), session_secret);
        } else {
            data.sessionID = data.cookie[session_key];
        }
        // (literally) get the session data from the session store
        sessionStore.get(data.sessionID, function (err, session) {
            if (err || !session) {
                // if we cannot grab a session, turn down the connection
                accept('Cannot get sessionid', false);
            } else {
                // save the session data and accept the connection
                data.session = session;
                if ("passport" in session && "user" in session.passport){
                    passport.getUserFromId(session.passport.user, function (err, user, message){
                        if (err || !user){
                            accept('Cannot find user', false);
                        } else {
                            try {
                                data.user = user;
                                accept(null, true);
                            } catch (e){
                                accept('Error: ' + e.toString(), false);
                            }
                        }
                    });
                } else {
                    accept('Session does not contain userid', false);
                }
            }
        });
    } else {
        return accept('No cookie transmitted.', false);
    }
});
The tricky part here is extracting the session from the (signed) cookie. The passport and sessionStore objects are the same ones defined before for normal authentication.
The backend authentication middleware can then get the user through the req.socket.handshake object:

var authMiddleware = function(req, res, next) {
    var user = req.socket.handshake.user;

    if (!user){
        next(new Error('Unauthorized'));
    } else {
        req.user = user;
        next();
    }
};

Backend events and channels

When a backend changes something, it automatically broadcasts the change to every connected node (and triggers the events I talked about before).
You often need to notify only a subset of clients; for this reason you can define channels. Every change will be notified only to the clients connected to a certain channel.
The channel can be defined client side, but I added a useful feature to define the channel server side, during the handshake.
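The channel idea boils down to a per-channel broadcast, something like this toy pub/sub (my own sketch, not backbone.io's implementation):

```javascript
// channel name -> array of client callbacks
var channels = {};

function subscribe(channel, client) {
    (channels[channel] = channels[channel] || []).push(client);
}

// notify only the clients subscribed to that channel
function broadcast(channel, message) {
    (channels[channel] || []).forEach(function (client) { client(message); });
}

var received = [];
subscribe('room1', function (msg) { received.push('a:' + msg); });
subscribe('room2', function (msg) { received.push('b:' + msg); });

broadcast('room1', 'update');
console.log(received); // prints [ 'a:update' ] - room2 was not notified
```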

There is also another case: you may need to detect when couchdb has been changed by another application.
In this case I recommend using my fork, because it supports channels.

Database and flow control with promises

The last piece of the application is talking to the database. In my application I used couchdb, but which database you use really isn't important.
In the first paragraph I underlined that a single resource operation could cause many operations in the backend. For this reason it is very important to use a smarter way to control the flow. I chose promises.

Promises are a standard pattern for managing asynchronous tasks. I used the Q library.
The advantage of promises is avoiding the "pyramid of doom" of nested callbacks:

step1(function (value1) {
    step2(value1, function(value2) {
        step3(value2, function(value3) {
            step4(value3, function(value4) {
                // Do something with value4
            });
        });
    });
});
And transforming it into something more manageable (each step returns a promise):

Q.fcall(step1)
.then(step2)
.then(step3)
.then(step4)
.then(function (value4) {
    // Do something with value4
}, function (error) {
    // Handle any error from step1 through step4
})
.done();
With promises, managing errors is very easy.
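A quick illustration of that error propagation, using standard ES6 promises here since the then/rejection semantics match Q's (the step names are made up): a rejection anywhere in the chain skips the remaining steps and lands in the next rejection handler.

```javascript
function step1() { return Promise.resolve(1); }
function step2(v) { return Promise.reject(new Error('boom at ' + v)); }
function step3(v) { return Promise.resolve(v + 1); } // never runs

step1()
    .then(step2)
    .then(step3)
    .then(function (value) {
        console.log('ok ' + value);
    }, function (err) {
        console.log(err.message); // prints boom at 1
    });
```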
This is an example using nano:

var getUser = function (id) {
    var deferred = Q.defer();

    userdb.get(id, {}, function(err, body) {
        if (err) {
            deferred.reject(new Error('Not found'));
        } else {
            deferred.resolve(body);
        }
    });
    return deferred.promise;
};

var getGroup = function (user) {
    var deferred = Q.defer();

    userdb.get(user.groupid, {}, function(err, body) {
        if (err) {
            deferred.reject(new Error('Not found'));
        } else {
            deferred.resolve(body);
        }
    });
    return deferred.promise;
};

function getGroupFromUserId(id, callback){
    getUser(id)
    .then(getGroup)
    .then(function (group){
        callback(null, group);
    }, callback);
}

backbone.io has some ready-to-use backends for databases, and it is quite easy to write your own following the examples (I added the couchdb backend).

The end?

This ends this whirlwind tour. I hope someone will find this useful.