Friday, January 30, 2015

urlwalker.js, a context aware router for express

Urlwalker.js is a router middleware for Connect/Express.js.

It is not a substitute of the default Express.js router but it works together with the latter, trying to get an object from a fragment of URL (It literally walks it segment by segment, hence the name). You can then use this object as model in the function called by the default Express.js router.
This process is called URL traversal. This concept is not by any means original: I took the inspiration from other web frameworks such as Zope and Pyramid.

Confused ? Let's make a step behind

URL, model and view

Using REST principles it seems to be natural mapping a URLs to a hierarchy of objects:

  • http://www.example.com/roald_dahl/the_chocolate_factory

This URL represents a relation between two objects: the author (roald_dahl) and one of his books (the_chocolate_factory). The last is the model used by the function. Let's put this thing together using express.js:
app.get("/:author/:book", function (req, res){
    // getting the book object
    // doing something with the object
    // return the result
});
The original "Expressjs" way to get the model is to do it directly inside the function (like the previous example) or (better) using app.param. But it is not flexible enough for managing a deeply arbitrary nested structure.
Furthermore I believe it can be useful to split the URL in two different parts. The first part is for getting an object and the second one to transform the object:

  • http://www.example.com/roald_dahl/the_chocolate_factory/index.json
  • http://www.example.com/roald_dahl/the_chocolate_factory/index.html

Both of these URLS point to the same object but return a different representations of that object.
Urlwalker.js follows this convention.

How to use it


The middleware is initialized with a function and a "root" object.

var traversal = require('urlwalkerjs').traversal;
var traversal_middleware = traversal(function (obj, segment, cb){
    return cb({ ... new obj ... })
    // or
    return cb(); // end of traversing
},root_object);

Then you can use it as a normal middleware and add the regular routing:

app.use(traversal_middleware);

app.get('index.json', function(req, res) {
  res.send(req.context);
});

The routing process starts with an object. I call it the "root object" and it is the second argument passed to the middleware. It can be anything, even undefined.
The function (the first argument of the middleware) is invoked for any URL segment. The first time is invoked with the first segment and the root object. It returns an object. The second time is called with the second segment and the object returned previously. The process is repeated until it can't find a match. Then it returns the last object in "req.context" and pass the control to the next middleware.
For this URL:

  • http://www.example.com/roald_dahl/the_chocolate_factory/index.json

The function is invoked twice:

  • from the root object and the segment "roald_dahl" I get an author object
  • from the author object and "the_chocolate_factory" I get a book object

Then the express.js function is called with the book object inside req.context.
For clarifying the process I have added an example here.

An example with occamsrazor.js


Defining this function with such a complex behaviour can be difficult and not very flexible.
For this reason you can use occamsrazor.js for adding dinamically new behaviours to the function (see example 2).
So it becomes:

var getObject = occamsrazor();
var has_authors = occamsrazor.validator().has("authors");
var has_books = occamsrazor.validator().has("books");

getObject.add(null, function (obj, id, cb){
    return cb(); // this will match if no one else match
});

getObject.add(has_authors, function (obj, id, cb){
    return cb(obj.authors[id]);
});

getObject.add(has_books, function (obj, id, cb){
    return cb(obj.books[id]);
});

var traversal_middleware = traversal(getObject, data);
app.use(traversal_middleware);
app.get('index.json', function(req, res) {
  res.send(req.context);
});

At the beginning it might seem a bit cumbersome until you realize you can easily extend the behaviour so easily:

var has_year = occamsrazor.validator().has("year");

getObject.add(has_year, function (obj, id, cb){
    return cb(obj.year[id]);
});

Plugin all the things

But why stops here? why can't we get the view with a similar mechanism  (example 3) ? Let's replace the Express.js routing completely with this:

...
var view = require('urlwalkerjs').view;
var getView = occamsrazor();

var view_middleware = view(getView);

getView.add(null, function (url, method, context, req, res, next){
    next(); // this will match if no one else match
});

getView.add(["/index", "GET", has_books], function (url, method, context, req, res, next){
  res.send('this is the author name: ' + req.context.name);
});

getView.add(["/index", "GET", has_authors], function (url, method, context, req, res, next){
  res.send('these are the authors available: ' + Object.keys(req.context.authors));
});

getView.add(["/index", "GET", has_title], function (url, method, context, req, res, next){
  res.send('Book: ' + req.context.title + " - " + req.context.year);
});

app.use(view_middleware);

A plugin architecture is very helpful, even though you don't need plugins at all. It allows you to apply the open/close principle and extend your application safely.