Express router

One of the reasons for creating RESTinio was the desire to have an express-like request router.

Since v0.2.1 RESTinio has had a router based on an idea borrowed from express, a JavaScript framework. In v0.4 many improvements were made to make the express router more capable and practical.

In general, one can implement a custom router. It is just a special request handler that receives requests from RESTinio like any other handler, selects some kind of endpoint to do the final handling, and possibly adds some data to the original request. The selection rules are up to the router's author.
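
To make the idea of a custom router concrete, here is a minimal, library-independent sketch. The types request, status, endpoint and toy_router are hypothetical stand-ins invented for this illustration and are not part of RESTinio; the selection rule (exact match on method and target) is likewise just an assumption.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration only; not RESTinio types.
struct request
{
  std::string method;
  std::string target;
};

enum class status { accepted, rejected };

using endpoint = std::function< status( const request & ) >;

// A router is itself a request handler: it is callable with a request
// and forwards it to one of the endpoints it aggregates.
class toy_router
{
public:
  void add( std::string method, std::string target, endpoint h )
  {
    m_routes[ { std::move( method ), std::move( target ) } ] = std::move( h );
  }

  status operator()( const request & req ) const
  {
    const auto it = m_routes.find( { req.method, req.target } );
    // Selection rule of this sketch: exact match on method and target.
    return it != m_routes.end() ? it->second( req ) : status::rejected;
  }

private:
  std::map< std::pair< std::string, std::string >, endpoint > m_routes;
};
```

Because toy_router has operator(), an instance can be used wherever a plain handler callable is expected, which is the property the express router relies on as well.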

The express router acts as a request handler (that is, it is a function object that can be called as a request handler). It aggregates several endpoint handlers and picks one or none of them to handle the request. The choice of the handler to execute depends on the request target and the HTTP method. If the router finds no handler matching the request, the request is considered unmatched. It is possible to set a handler for unmatched requests; otherwise the router rejects the request and RESTinio takes care of it.

There is a difference between an ordinary RESTinio request handler and one that is used with the express router and bound to a concrete endpoint. The signature of a handler that can be put into the express router has an additional parameter: a container with parameters extracted from the URI (request target).

The express router is defined by the express_router_t class. Its implementation is inspired by express-router. It allows defining a route path with embedded parameters that become available to handlers. For example, the following code sets a handler with 2 parameters:

using router_t = restinio::router::express_router_t<>;
auto router = std::make_unique< router_t >();

router->http_get(
  R"(/article/:article_id/:page(\d+))",
  []( auto req, auto params ){
    const auto article_id = restinio::cast_to<std::uint64_t>( params[ "article_id" ] );
    const auto page = restinio::cast_to<short>( params[ "page" ] );
    // ...
  } );

Note that an express handler receives 2 parameters: not only the request handle but also a route_params_t instance that holds the parameters of the request:

using express_request_handler_t =
    std::function<request_handling_status_t(request_handle_t,route_params_t)>;

A route path defines a set of named and indexed parameters. A named parameter starts with :, followed by a non-empty parameter name (only A-Za-z0-9_ are allowed). After the parameter name, it is possible to set a capture regex enclosed in parentheses (actually a subset of regex: none of the group types are allowed). Indexed parameters are simply a capture regex in parentheses.
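
The rules above can be illustrated with a small translation from a route path to a regular expression. This is only a sketch under stated assumptions, not how RESTinio implements it: compiled_route, compile_route and route_matches are hypothetical names, a parameter without an explicit capture regex is assumed to match one path segment, and options such as an optional trailing slash are ignored.

```cpp
#include <cctype>
#include <cstddef>
#include <regex>
#include <stdexcept>
#include <string>
#include <vector>

struct compiled_route
{
  std::string pattern;               // Regex text, one capture group per parameter.
  std::vector< std::string > names;  // Parameter names; "" marks an indexed parameter.
};

// Translates a simplified route path into a regex pattern.
// Assumption: capture regexes contain no nested parentheses.
inline compiled_route compile_route( const std::string & path )
{
  static const std::string special = "\\^$.|?*+[]{}()";
  compiled_route result;
  std::size_t i = 0;
  while( i < path.size() )
  {
    const char c = path[ i ];
    if( c != ':' && c != '(' )
    {
      // Literal character: escape it if it is special in a regex.
      if( special.find( c ) != std::string::npos )
        result.pattern += '\\';
      result.pattern += c;
      ++i;
      continue;
    }

    std::string name;
    if( c == ':' )
    {
      ++i; // Skip ':' and read the name (A-Za-z0-9_).
      while( i < path.size() &&
          ( std::isalnum( static_cast< unsigned char >( path[ i ] ) ) ||
            path[ i ] == '_' ) )
        name += path[ i++ ];
      if( name.empty() )
        throw std::invalid_argument{ "empty parameter name" };
    }

    std::string capture = "[^/]+"; // Default: one path segment.
    if( i < path.size() && path[ i ] == '(' )
    {
      const auto close = path.find( ')', i );
      if( close == std::string::npos )
        throw std::invalid_argument{ "unterminated capture regex" };
      capture = path.substr( i + 1, close - i - 1 );
      i = close + 1;
    }
    result.pattern += '(' + capture + ')';
    result.names.push_back( name );
  }
  return result;
}

// Convenience check: does the whole target match the compiled route?
inline bool route_matches( const compiled_route & r, const std::string & target )
{
  return std::regex_match( target, std::regex( r.pattern ) );
}
```

In this sketch every parameter, named or indexed, becomes one capture group, and the names vector records which group belongs to which name.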

Let’s show how it works using an example. First, let’s assume that the variable router is a pointer to an express router. Here is how to add a request handler with a single parameter:

router->http_get( "/single/:param", []( auto req, auto params ){
  return
    init_resp( req->create_response() )
      .set_body(
        fmt::format(
          "GET request with single parameter: '{}'",
          params[ "param" ] ) )
      .done();
} );

The following requests will be routed to that handler:

  • http://localhost/single/123 => param="123"
  • http://localhost/single/parameter/ => param="parameter"
  • http://localhost/single/another-param => param="another-param"

But the following will not:

  • http://localhost/single/123/and-more
  • http://localhost/single/
  • http://localhost/single-param/123

A helper function init_resp sets values for ‘Server’, ‘Date’ and ‘Content-Type’ header fields and returns response builder.

Let’s use more parameters and assign a capture regex for them:

// POST request with several parameters.
router->http_post( R"(/many/:year(\d{4}).:month(\d{2}).:day(\d{2}))",
  []( auto req, auto params ){
    return
      init_resp( req->create_response() )
        .set_body(
          fmt::format(
            "POST request with many parameters:\n"
            "year: {}\nmonth: {}\nday: {}\nbody: {}",
            params[ "year" ],
            params[ "month" ],
            params[ "day" ],
            req->body() ) )
        .done();
  } );

The following requests will be routed to that handler:

  • http://localhost/many/2017.01.01 => year="2017", month="01", day="01"
  • http://localhost/many/2018.06.03 => year="2018", month="06", day="03"
  • http://localhost/many/2017.12.22 => year="2017", month="12", day="22"

But the following will not:

  • http://localhost/many/2o17.01.01
  • http://localhost/many/2018.06.03/events
  • http://localhost/many/17.12.22

Using indexed parameters is practically the same, just omit the parameter names:

// GET request with indexed parameters.
router->http_get( R"(/indexed/([a-z]+)-(\d+)/(one|two|three))",
  []( auto req, auto params ){
    return
      init_resp( req->create_response() )
        .set_body(
          fmt::format(
            "GET request with indexed parameters:\n"
            "#0: '{}'\n#1: {}\n#2: '{}'",
            params[ 0 ],
            params[ 1 ],
            params[ 2 ] ) )
        .done();
  } );

The following requests will be routed to that handler:

  • http://localhost/indexed/xyz-007/one => #0="xyz", #1="007", #2="one"
  • http://localhost/indexed/ABCDE-2017/two => #0="ABCDE", #1="2017", #2="two"
  • http://localhost/indexed/sobjectizer-5/three => #0="sobjectizer", #1="5", #2="three"

But the following will not:

  • http://localhost/indexed/xyz-007/zero
  • http://localhost/indexed/173-xyz/one
  • http://localhost/indexed/ABCDE-2017/one/two/three

See full example.

For details on route_params_t and express_router_t see express.hpp.

Route parameters

Route parameters are represented by the restinio::router::route_params_t class. It holds named and indexed parameters. All parameter values are stored as string_view objects referring to a buffer with a copy of the request target string. The keys of named parameters are also string_view objects, referring to a shared buffer-string provided by the route matcher entry.

Values stored in route_params_t objects can be accessed with operator[], which takes a string_view for a named parameter and a std::size_t for an indexed one.

Note: when a parameter value is obtained as a string_view, a copy of the internal string_view object is returned, so it refers to data located in the buffer owned by the route_params_t instance. Such a string_view is valid only during the lifetime of the given parameters object. A route_params_t instance can be moved, and all string_view objects referring to the buffer owned by the route params remain valid during the lifetime of the newly created object.
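
The lifetime rule can be illustrated with a simplified, hypothetical analogue of a parameters container (toy_params is not a RESTinio class, and its layout is an assumption made for this sketch). The buffer lives on the heap behind a std::unique_ptr, so moving the object transfers ownership without relocating the characters, and views into the buffer keep pointing at valid data.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical simplified analogue of a parameters container.
class toy_params
{
public:
  // Copies the request target into a heap buffer owned by this object.
  explicit toy_params( std::string_view target )
    : m_buffer( new char[ target.size() ] )
  {
    std::copy( target.begin(), target.end(), m_buffer.get() );
  }

  // Registers a value located at [pos, pos + len) of the owned buffer.
  void add( std::size_t pos, std::size_t len )
  {
    m_values.emplace_back( m_buffer.get() + pos, len );
  }

  // Returns a copy of the stored view; it still refers to the owned buffer.
  std::string_view operator[]( std::size_t i ) const
  {
    return m_values.at( i );
  }

private:
  std::unique_ptr< char[] > m_buffer;
  std::vector< std::string_view > m_values;
};
```

A short usage example: after toy_params q{ std::move( p ) }, a view obtained earlier from p refers to memory now owned by q, so it stays valid for as long as q lives. Holding the buffer in a std::string instead would not give this guarantee, because moving a short string may copy its characters to a new location.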

Casting parameters

Each parameter is represented by a string_view, but often that is just a textual representation of, for example, a numeric value. For that purpose RESTinio provides a helper function: restinio::cast_to<Value_Type>(string_view_t s).

For example:

router->http_get( R"(/:id(\d+)/:tag([a-z0-9]+)/:year(\d{4}))",
  []( auto req, auto params ){
    const auto id = restinio::cast_to<std::uint64_t>( params[ "id" ] );
    const auto tag = restinio::cast_to<std::string>( params[ "tag" ] );
    const auto year = restinio::cast_to<short>( params[ "year" ] );
    // ...
  } );

Parameters can be cast to any type that supports conversion. RESTinio supports conversion for the following types:

  • 8-,16-,32-,64-bit signed/unsigned integers;
  • float/double;
  • std::string.

Custom conversions can be added in one of the following two ways:

  1. Define an appropriate read_value function in the same namespace as your custom type (ADL will be applied):

    namespace my_ns
    {
      class my_type_t
      {
        // ...
      };
    
      void read_value( my_type_t & v, const char * data, std::size_t size )
      {
        // Set the value of v.
      }
    } /* namespace my_ns */
    
  2. Define an appropriate read_value function in restinio::utils namespace:

    namespace restinio
    {
      namespace utils
      {
        void read_value( my_ns::my_type_t & v, const char * data, std::size_t size )
        {
          // Set the value of v.
        }
      } /* namespace utils */
    } /* namespace restinio */
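
To show how the first approach (ADL) can work, here is a self-contained sketch. The namespace sketch and its cast_to are simplified stand-ins written for this illustration, not RESTinio's implementation, and percent_t is a hypothetical user type. The generic function makes an unqualified call to read_value, so an overload declared next to the user type is found through argument-dependent lookup.

```cpp
#include <charconv>
#include <cstddef>
#include <stdexcept>
#include <string_view>
#include <system_error>

namespace sketch
{
  // Conversion for int, standing in for the library-provided conversions.
  inline void read_value( int & v, const char * data, std::size_t size )
  {
    const auto r = std::from_chars( data, data + size, v );
    if( r.ec != std::errc{} || r.ptr != data + size )
      throw std::invalid_argument{ "not an integer" };
  }

  // Generic entry point: the unqualified call lets ADL find read_value
  // overloads declared in the namespace of a user-defined type.
  template< typename Value_Type >
  Value_Type cast_to( std::string_view s )
  {
    Value_Type result{};
    read_value( result, s.data(), s.size() );
    return result;
  }
} /* namespace sketch */

namespace my_ns
{
  // Hypothetical user type.
  struct percent_t
  {
    int value{ 0 };
  };

  // Custom conversion: accepts strings like "42%".
  inline void read_value( percent_t & v, const char * data, std::size_t size )
  {
    if( size < 2 || data[ size - 1 ] != '%' )
      throw std::invalid_argument{ "not a percentage" };
    v.value = sketch::cast_to< int >( std::string_view{ data, size - 1 } );
  }
} /* namespace my_ns */
```

With these definitions sketch::cast_to< my_ns::percent_t >( "42%" ) picks my_ns::read_value without any registration step; the overload only has to be visible where the template is instantiated.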
    

Note on string view

RESTinio relies on std::string_view or std::experimental::string_view if one of them is available. Otherwise RESTinio uses its own string_view class.

Non matched request handler

For the case when an express-router defined with certain routes finds no matching route, it is possible to set a special handler that catches all non-matched requests.

For example:

router->non_matched_request_handler(
  []( auto req ){
    return
      req->create_response( 404, "Not found")
        .connection_close()
        .done();
  } );

Regex engines

For route matching the express-router relies on regex. express_router_t is defined as a template class:

template < typename Regex_Engine = std_regex_engine_t>
class express_router_t
{
  // ...
};

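The template argument Regex_Engine defines the regex engine implementation. RESTinio comes with the following predefined regex engines:

  • std_regex_engine_t, based on std::regex (the default);
  • pcre_regex_engine_t, based on PCRE;
  • pcre2_regex_engine_t, based on PCRE2;
  • boost_regex_engine_t, based on Boost regex.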
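
The general technique of selecting an engine through a template argument can be sketched as follows. Everything here is hypothetical: std_engine, toy_matcher and the member names compile and matches are assumptions of this sketch and do not describe the interface RESTinio expects from its regex engines.

```cpp
#include <regex>
#include <string>

// Hypothetical engine policy built on std::regex.
struct std_engine
{
  using compiled_regex_t = std::regex;

  static compiled_regex_t compile( const std::string & pattern )
  {
    return std::regex( pattern, std::regex::ECMAScript );
  }

  static bool matches( const std::string & target, const compiled_regex_t & re )
  {
    return std::regex_match( target, re );
  }
};

// The matcher talks to the engine only through the policy's interface,
// so swapping the engine means changing the template argument.
template< typename Regex_Engine = std_engine >
class toy_matcher
{
public:
  explicit toy_matcher( const std::string & pattern )
    : m_regex( Regex_Engine::compile( pattern ) )
  {}

  bool operator()( const std::string & target ) const
  {
    return Regex_Engine::matches( target, m_regex );
  }

private:
  typename Regex_Engine::compiled_regex_t m_regex;
};
```

An engine backed by another regex library would provide the same three members, and toy_matcher< other_engine > would then use it without further changes.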

For example, to use PCRE-based engine:

#include <restinio/all.hpp>
#include <restinio/router/pcre_regex_engine.hpp>

using my_router = restinio::router::express_router_t<
      restinio::router::pcre_regex_engine_t<> >;

And for PCRE2-based engine:

#include <restinio/all.hpp>
#include <restinio/router/pcre2_regex_engine.hpp>

using my_router = restinio::router::express_router_t<
      restinio::router::pcre2_regex_engine_t<> >;

And for Boost regex engine:

#include <restinio/all.hpp>
#include <restinio/router/boost_regex_engine.hpp>

using my_router = restinio::router::express_router_t<
      restinio::router::boost_regex_engine_t >;

Tests and benchmarks for the PCRE engines and Boost regex are built only if the build system (cmake or mxx_ru) considers them available.

Performance

Performance of routing depends on at least the following things:

  • total number of routes;
  • distribution of routes and the order in which routes are added to router;
  • complexity of the regexes used for matching routes.
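
As a rough illustration of why the number and order of routes matter, assume that routes are tried one by one in the order they were added (an assumption of this sketch, not a statement about RESTinio internals). The hypothetical helper attempts_until_match counts how many regexes are evaluated for a given target.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Counts regex evaluations performed by a sequential scan over the routes.
// The scan stops at the first matching route.
inline std::size_t attempts_until_match(
  const std::vector< std::regex > & routes,
  const std::string & target )
{
  std::size_t attempts = 0;
  for( const auto & re : routes )
  {
    ++attempts;
    if( std::regex_match( target, re ) )
      break;
  }
  return attempts;
}
```

Under this assumption a target that matches the first route costs one evaluation, while a target that matches the last route, or none at all, costs as many evaluations as there are routes. That is why placing frequently requested routes first can help.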

It is hard to say what the penalty is in each particular case. But a general picture can be derived from benchmarks, and RESTinio contains such benchmarks for the supported regex engines.

For the standard regex engine there is express_router_bench. There are corresponding benchmarks for PCRE, for PCRE2, and for Boost regex.

# See usage:
$ _test.router.express_router_bench -h

# Sample: run server on port 8080, using 4 threads matching routes given in a file cmp_routes.txt.
$ _test.router.express_router_bench -p 8080 -n 4 -r test/router/express_router_bench/cmp_routes.txt

A file that defines routes must contain lines in the following format:

(HTTP METHOD: GET|POST|...) (route path)

For example:

GET /users/:id(\d+)
GET /users/:id(\d+)/visits
POST /users/:id(\d+)
POST /users/new
GET /locations/:id(\d+)
GET /locations/:id(\d+)/avg
POST /locations/:id(\d+)
POST /locations/new
GET /visits/:id(\d+)
POST /visits/:id(\d+)
POST /visits/new
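
A line in this format can be split into its two fields with plain stream extraction. The sketch below is only an illustration: route_line and parse_route_line are hypothetical names, and it assumes that the method and the route path are separated by whitespace and that the path itself contains no spaces.

```cpp
#include <optional>
#include <sstream>
#include <string>

struct route_line
{
  std::string method; // For example "GET" or "POST".
  std::string path;   // For example "/users/:id(\d+)".
};

// Parses "METHOD path"; returns std::nullopt for blank or incomplete lines.
inline std::optional< route_line > parse_route_line( const std::string & line )
{
  std::istringstream in( line );
  route_line r;
  if( in >> r.method >> r.path )
    return r;
  return std::nullopt;
}
```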

Measurements can be done with your own tools, or simply with the wrk tool:

# Testing with cmp_routes.txt
./wrk --latency -t 4 -c 256 -d 10 -s cmp_routes.lua http://127.0.0.1:8080/

where cmp_routes.lua is:

request = function()
  local e = math.random(1, 100)

  if e < 86 then
    wrk.method = "GET"

    if e < 21 then
      path = "/users/" .. math.random(1, 10000 )
    elseif e < 41 then
      path = "/locations/" .. math.random(1, 100000 )
    elseif e < 51 then
      path = "/visits/" .. math.random(1, 10000 )
    elseif e < 61 then
      path = "/users/" .. math.random(1, 10000 ) .. "/visits"
    else
      path = "/locations/" .. math.random(1, 10000 ) .. "/avg"
    end

  else
    wrk.method = "POST"
    wrk.body = "{}"
    wrk.headers["Content-Type"] = "application/json"

    if e < 89 then
      path = "/users/" .. math.random(1, 10000 )
    elseif e < 93 then
      path = "/locations/" .. math.random(1, 100000 )
    elseif e < 95 then
      path = "/visits/" .. math.random(1, 100000 )
    elseif e < 96 then
      path = "/users/new"
    elseif e < 98 then
      path = "/visits/new"
    else
      path = "/locations/new"
    end
  end

  return wrk.format(nil, path)
end

This script defines the distribution of the generated requests across methods and routes.

Benchmarks

Of course, the express-router costs something in terms of performance. The question is: for a given set of routes, what is the penalty of using the express router compared to carefully hardcoded routing for the same routes?

For that purpose we implemented a hardcoded router, cmp_router_bench, with a route parser for the following routes:

GET /users/:id(\d+)
GET /users/:id(\d+)/visits
POST /users/:id(\d+)
POST /users/new
GET /locations/:id(\d+)
GET /locations/:id(\d+)/avg
POST /locations/:id(\d+)
POST /locations/new
GET /visits/:id(\d+)
POST /visits/:id(\d+)
POST /visits/new

We compared it with the express_router_bench described in the previous section. wrk was used to generate load with the following parameters: ./wrk -t 4 -c 256 -d 30 -s cmp_routes.lua http://127.0.0.1:8080/

The results are the following (2017.12.19); percentages are relative to the hardcoded result for the same number of threads:

# of threads  hardcoded  express-router (std)  express-router (PCRE)  express-router (PCRE2)
1             166831.47  123851.03 (74.24%)    148038.03 (88.74%)     148825.81 (89.21%)
2             258360.2   201816.12 (78.11%)    240238.03 (92.99%)     228153.96 (88.31%)
3             293823.26  233815.11 (79.58%)    270818.23 (92.17%)     262148.93 (89.22%)
4             330288.48  259893.3 (78.69%)     306312.71 (92.74%)     293151.75 (88.76%)
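
The percentages in parentheses are consistent with each result being divided by the hardcoded result for the same thread count. A one-line helper (percent_of is a name chosen for this illustration) makes the arithmetic explicit:

```cpp
// Share of a baseline value, in percent.
constexpr double percent_of( double value, double baseline )
{
  return value / baseline * 100.0;
}
```

For example, percent_of( 123851.03, 166831.47 ) is approximately 74.24, which is the figure shown for the std engine with one thread.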

Benchmark environment:

  • CPU: 8x Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz;
  • Memory: 16343MB;
  • Operating System: Ubuntu 16.04.2 LTS;
  • Compiler: gcc version 7.1.0 (Ubuntu 7.1.0-5ubuntu2~16.04).