Express router
One of the reasons for creating RESTinio was the ability to have an express-like request handler router.
Since v0.2.1 RESTinio has had a router based on an idea borrowed from express, a JavaScript framework. In v0.4 many improvements were made to make the express router better and more practical.
In general, one can implement a custom router. It is just a special request handler that receives requests from RESTinio like a usual handler, then selects some kind of endpoint to do the final handling, possibly adding some data to the original request. Selection rules are up to the router's author.
Express router acts as a request handler (it means it is a function-object that can be called as a request handler). It aggregates several endpoint-handlers and picks one or none of them to handle the request. The choice of the handler to execute depends on request target and HTTP method. If the router finds no handler matching the request then request is considered unmatched. It is possible to set a handler for unmatched requests, otherwise, router rejects the request and RESTinio takes care of it.
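The behaviour described above can be illustrated with a minimal sketch. It is not RESTinio code: the names mini_router_t, request_t and status_t are hypothetical stand-ins, and std::regex is used directly for matching. It only shows the idea of a function-object that picks one handler by HTTP method and request target and falls back to an optional handler for unmatched requests.

```cpp
#include <cassert>
#include <functional>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; these are not RESTinio types.
enum class status_t { accepted, rejected };

struct request_t {
    std::string method;
    std::string target;
};

class mini_router_t {
public:
    using handler_t = std::function<status_t(const request_t &, const std::smatch &)>;
    using fallback_t = std::function<status_t(const request_t &)>;

    // Registers an endpoint handler for a method and a target pattern.
    void add(std::string method, const std::string & pattern, handler_t h) {
        routes_.push_back({std::move(method), std::regex(pattern), std::move(h)});
    }

    // Sets the handler for requests that match no route.
    void non_matched(fallback_t h) { fallback_ = std::move(h); }

    // The router is itself callable like an ordinary request handler.
    status_t operator()(const request_t & req) const {
        std::smatch m;
        for (const auto & r : routes_) {
            if (r.method == req.method && std::regex_match(req.target, m, r.re))
                return r.handler(req, m);
        }
        // No route matched: use the fallback if there is one, else reject.
        return fallback_ ? fallback_(req) : status_t::rejected;
    }

private:
    struct route_t {
        std::string method;
        std::regex re;
        handler_t handler;
    };
    std::vector<route_t> routes_;
    fallback_t fallback_;
};
```

In this sketch routes are tried in the order they were added, and the first one whose method and pattern both match handles the request.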
There is a difference between ordinary restinio request handler and the one that is used with express router and is bound to a concrete endpoint. The signature of a handler that can be put in express router has an additional parameter – a container with parameters extracted from URI (request target).
Express router is defined by the express_router_t class.
Its implementation is inspired by express-router.
It allows defining a route path with injected parameters that become available
to handlers. For example, the following code sets a handler with 2 parameters:
using router_t = restinio::router::express_router_t<>;
auto router = std::make_unique< router_t >();
router->http_get(
R"(/article/:article_id/:page(\d+))",
[]( auto req, auto params ){
const auto article_id = restinio::cast_to<std::uint64_t>( params[ "article_id" ] );
const auto page = restinio::cast_to<short>( params[ "page" ] );
// ...
} );
Note that an express handler receives 2 parameters: not only the request handle but also a route_params_t instance that holds the parameters of the request:
using express_request_handler_t =
std::function<request_handling_status_t(request_handle_t,route_params_t)>;
A route path defines a set of named and indexed parameters. A named parameter starts with : followed by a non-empty parameter name (only A-Za-z0-9_ are allowed). After the parameter name, it is possible to set a capture regex enclosed in parentheses (actually a subset of regex: none of the group types are allowed). Indexed parameters are simply a capture regex in parentheses.
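To make the route path syntax more concrete, here is a rough sketch of how such a path could be turned into a regular expression. This is an illustration only and not how RESTinio implements it: compile_route and compiled_route_t are hypothetical names, a named parameter without a capture regex is assumed to match one path segment ([^/]+), and details such as tolerance for a trailing slash or case-insensitive matching are deliberately left out.

```cpp
#include <cassert>
#include <cctype>
#include <regex>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical result type: a compiled regex plus one name per capture group
// (an empty name marks an indexed parameter).
struct compiled_route_t {
    std::regex re;
    std::vector<std::string> names;
};

compiled_route_t compile_route(const std::string & path) {
    static const std::string special = ".+*?^$[]{}|\\)";
    std::string pattern;
    std::vector<std::string> names;
    std::size_t i = 0;
    while (i < path.size()) {
        const char c = path[i];
        if (c == ':' || c == '(') {
            std::string name;
            if (c == ':') {
                ++i; // skip ':' and read the name (A-Za-z0-9_ only)
                while (i < path.size() &&
                       (std::isalnum(static_cast<unsigned char>(path[i])) || path[i] == '_'))
                    name += path[i++];
                if (name.empty()) throw std::invalid_argument("empty parameter name");
            }
            std::string capture = "[^/]+"; // assumed default: one path segment
            if (i < path.size() && path[i] == '(') {
                const auto close = path.find(')', i);
                if (close == std::string::npos) throw std::invalid_argument("unclosed capture");
                capture = path.substr(i + 1, close - i - 1);
                i = close + 1;
            }
            pattern += "(" + capture + ")";
            names.push_back(name);
        } else {
            // Literal character: escape it if it is special in a regex.
            if (special.find(c) != std::string::npos) pattern += '\\';
            pattern += c;
            ++i;
        }
    }
    return {std::regex(pattern), names};
}
```

Because the source states that group types are not allowed inside a capture regex, the sketch simply takes everything up to the first closing parenthesis as the capture.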
Let’s show how it works using an example. First, let’s assume that the variable router is a pointer to an express router. This is how we add a request handler with a single parameter:
router->http_get( "/single/:param", []( auto req, auto params ){
return
init_resp( req->create_response() )
.set_body(
fmt::format(
"GET request with single parameter: '{}'",
params[ "param" ] ) )
.done();
} );
The following requests will be routed to that handler:
http://localhost/single/123 => param="123"
http://localhost/single/parameter/ => param="parameter"
http://localhost/single/another-param => param="another-param"
But the following will not:
http://localhost/single/123/and-more
http://localhost/single/
http://localhost/single-param/123
A helper function init_resp sets values for the ‘Server’, ‘Date’ and ‘Content-Type’ header fields and returns a response builder.
Let’s use more parameters and assign a capture regex for them:
// POST request with several parameters.
router->http_post( R"(/many/:year(\d{4}).:month(\d{2}).:day(\d{2}))",
[]( auto req, auto params ){
return
init_resp( req->create_response() )
.set_body(
fmt::format(
"POST request with many parameters:\n"
"year: {}\nmonth: {}\nday: {}\nbody: {}",
params[ "year" ],
params[ "month" ],
params[ "day" ],
req->body() ) )
.done();
} );
The following requests will be routed to that handler:
http://localhost/many/2017.01.01 => year="2017", month="01", day="01"
http://localhost/many/2018.06.03 => year="2018", month="06", day="03"
http://localhost/many/2017.12.22 => year="2017", month="12", day="22"
But the following will not:
http://localhost/many/2o17.01.01
http://localhost/many/2018.06.03/events
http://localhost/many/17.12.22
Using indexed parameters is practically the same, just omit the parameter names:
// GET request with indexed parameters.
router->http_get( R"(/indexed/([a-z]+)-(\d+)/(one|two|three))",
[]( auto req, auto params ){
return
init_resp( req->create_response() )
.set_body(
fmt::format(
"GET request with indexed parameters:\n"
"#0: '{}'\n#1: {}\n#2: '{}'",
params[ 0 ],
params[ 1 ],
params[ 2 ] ) )
.done();
} );
The following requests will be routed to that handler:
http://localhost/indexed/xyz-007/one => #0="xyz", #1="007", #2="one"
http://localhost/indexed/ABCDE-2017/two => #0="ABCDE", #1="2017", #2="two"
http://localhost/indexed/sobjectizer-5/three => #0="sobjectizer", #1="5", #2="three"
But the following will not:
http://localhost/indexed/xyz-007/zero
http://localhost/indexed/173-xyz/one
http://localhost/indexed/ABCDE-2017/one/two/three
See full example.
For details on route_params_t and express_router_t see express.hpp.
Route parameters
Route parameters are represented by the restinio::router::route_params_t class. It holds named and indexed parameters. All parameter values are stored as string_view objects referring to a buffer with a copy of the request target string. Key values for named parameters are also string_view objects, referring to a shared buffer-string provided by the route matcher entry.
Values stored in route_params_t objects can be accessed with operator[] receiving a string_view as its argument for a named parameter and a std::size_t for an indexed parameter.
Note: when getting a parameter value as a string_view, a copy of the internal string_view object is returned, thus it refers to data located in the buffer owned by the route_params_t instance. Such a string_view is valid only during the lifetime of the given parameters object. A route_params_t instance can be moved, and all string_view objects referring to the buffer owned by the route params remain valid during the lifetime of the newly created object.
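The lifetime note can be illustrated with a small sketch. It assumes, for the sake of the example, that the buffer is heap-allocated and owned through a std::unique_ptr<char[]>; params_sketch_t is a hypothetical simplified type, not the real route_params_t. Moving such an object transfers ownership of the same heap block, so views into it keep pointing at valid data for as long as the new owner lives.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string_view>
#include <utility>

// Hypothetical simplified analogue of a parameters object.
struct params_sketch_t {
    std::unique_ptr<char[]> buffer; // owns a copy of the request target
    std::string_view value;         // refers to a slice of that buffer

    params_sketch_t(std::string_view target, std::size_t pos, std::size_t len)
        : buffer(new char[target.size()])
    {
        std::memcpy(buffer.get(), target.data(), target.size());
        value = std::string_view(buffer.get() + pos, len);
    }
};

// Moves the parameters object and returns the new owner.
inline params_sketch_t take_over(params_sketch_t && src) {
    return params_sketch_t(std::move(src));
}
```

A string_view obtained before the move still refers to the same bytes afterwards, but it becomes dangling as soon as the object that currently owns the buffer is destroyed.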
Casting parameters
Each parameter is represented with a string_view, but often it is just a textual representation of, for example, a numeric type. For that purpose RESTinio contains a helper function: restinio::cast_to<Value_Type>(string_view_t s).
For example:
router->http_get( R"(/:id(\d+)/:tag([a-z0-9]+)/:year(\d{4}))",
[]( auto req, auto params ){
const auto id = restinio::cast_to<std::uint64_t>( params[ "id" ] );
const auto tag = restinio::cast_to<std::string>( params[ "tag" ] );
const auto year = restinio::cast_to<short>( params[ "year" ] );
// ...
} );
Parameters can be cast to any type that supports conversion. RESTinio supports conversion for the following types:
- 8-,16-,32-,64-bit signed/unsigned integers;
- float/double;
- std::string.
Custom conversions can be added in one of the two following ways:
Define an appropriate read_value function in the same namespace as your custom type (ADL will be applied):
namespace my_ns
{
    class my_type_t
    {
        // ...
    };
    void read_value( my_type_t & v, const char * data, std::size_t size )
    {
        // Set a value of v.
    }
} /* namespace my_ns */
Define an appropriate read_value function in restinio::utils namespace:
namespace restinio
{
namespace utils
{
    void read_value( my_ns::my_type_t & v, const char * data, std::size_t size )
    {
        // Set a value of v.
    }
} /* namespace utils */
} /* namespace restinio */
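The ADL-based extension point can be tried out in isolation with the following sketch. It does not use RESTinio: the namespace sketch and its cast_to are hypothetical re-creations of the idea, and point_t with its "12x34" text format is invented purely for the example. The key point is that cast_to calls read_value unqualified, so an overload living next to the custom type is found by argument-dependent lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

namespace sketch {

// A conversion provided by the "library" side of this sketch.
inline void read_value(std::string & v, const char * data, std::size_t size) {
    v.assign(data, size);
}

// Hypothetical analogue of a cast_to helper.
template <typename Value_Type>
Value_Type cast_to(std::string_view s) {
    Value_Type result{};
    // Unqualified call: overloads in the namespace of Value_Type
    // are found through ADL at instantiation time.
    read_value(result, s.data(), s.size());
    return result;
}

} // namespace sketch

namespace my_ns {

// Invented type for the example; its text form is assumed to be "<x>x<y>".
struct point_t {
    int x{0};
    int y{0};
};

inline void read_value(point_t & v, const char * data, std::size_t size) {
    const std::string s(data, size);
    const auto sep = s.find('x');
    if (sep == std::string::npos) throw std::invalid_argument("expected <x>x<y>");
    v.x = std::stoi(s.substr(0, sep));
    v.y = std::stoi(s.substr(sep + 1));
}

} // namespace my_ns
```

With these definitions, sketch::cast_to<my_ns::point_t>("12x34") picks my_ns::read_value even though it is declared after the template.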
Note on string view
RESTinio relies on std::string_view or std::experimental::string_view if one of them is available. Otherwise RESTinio uses its own string_view class.
Non matched request handler
For cases when an express-router with certain routes defined finds no matching route, it is possible to set a special handler that catches all non-matched requests.
For example:
router->non_matched_request_handler(
[]( auto req ){
return
req->create_response( 404, "Not found")
.connection_close()
.done();
} );
Regex engines
For route matching, express-router relies on regex. express_router_t is defined as a template class:
template < typename Regex_Engine = std_regex_engine_t>
class express_router_t
{
// ...
};
The template argument Regex_Engine defines the regex engine implementation.
RESTinio comes with the following predefined regex engines:
- Based on regex provided by STL (default), see std_regex_engine.hpp.
- Based on PCRE, see pcre_regex_engine.hpp.
- Based on PCRE2, see pcre2_regex_engine.hpp.
- Based on Boost regex, see boost_regex_engine.hpp.
For example, to use PCRE-based engine:
#include <restinio/all.hpp>
#include <restinio/router/pcre_regex_engine.hpp>
using my_router = restinio::router::express_router_t<
restinio::router::pcre_regex_engine_t<> >;
And for PCRE2-based engine:
#include <restinio/all.hpp>
#include <restinio/router/pcre2_regex_engine.hpp>
using my_router = restinio::router::express_router_t<
restinio::router::pcre2_regex_engine_t<> >;
And for Boost regex engine:
#include <restinio/all.hpp>
#include <restinio/router/boost_regex_engine.hpp>
using my_router = restinio::router::express_router_t<
restinio::router::boost_regex_engine_t >;
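The idea of selecting a regex engine through a template argument can be sketched as a policy-based design. The engine interface below (a compiled_t type plus static compile and matches functions) is an assumption made for this illustration; the requirements RESTinio places on a real Regex_Engine are different and are defined in its headers. All names ending in _sketch_t are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical engine built on std::regex.
struct std_engine_sketch_t {
    using compiled_t = std::regex;
    static compiled_t compile(const std::string & pattern) { return std::regex(pattern); }
    static bool matches(const compiled_t & re, const std::string & target) {
        return std::regex_match(target, re);
    }
};

// A trivial alternative engine: exact string comparison, no regex at all.
struct exact_engine_sketch_t {
    using compiled_t = std::string;
    static compiled_t compile(const std::string & pattern) { return pattern; }
    static bool matches(const compiled_t & p, const std::string & target) {
        return p == target;
    }
};

// The route delegates compiling and matching to the engine type.
template <typename Regex_Engine = std_engine_sketch_t>
class single_route_sketch_t {
public:
    explicit single_route_sketch_t(const std::string & pattern)
        : compiled_(Regex_Engine::compile(pattern)) {}

    bool matches(const std::string & target) const {
        return Regex_Engine::matches(compiled_, target);
    }

private:
    typename Regex_Engine::compiled_t compiled_;
};
```

Swapping the engine changes only the template argument at the point where the route (or router) type is named; the calling code stays the same.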
Tests and benchmarks for the PCRE engines and Boost regex are built if the build system (cmake or mxx_ru) considers them available.
Performance
Performance of routing depends on at least the following things:
- total number of routes;
- distribution of routes and the order in which routes are added to router;
- complexity of regexes used for matching routes.
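The effect of route count and ordering can be shown with a small sketch. It assumes a router that tries routes one by one in the order they were added and stops at the first match; attempts_until_match is a hypothetical helper that counts how many regexes had to be evaluated for a given target.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Returns how many regexes were tried before one matched
// (or the total number of routes if none matched).
std::size_t attempts_until_match(const std::vector<std::regex> & routes,
                                 const std::string & target) {
    std::size_t attempts = 0;
    for (const auto & re : routes) {
        ++attempts;
        if (std::regex_match(target, re)) break;
    }
    return attempts;
}
```

Under this assumption, a frequently requested route placed last in a list of N routes costs N regex evaluations per request, while the same route placed first costs one. An unmatched request always costs N.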
It is hard to say what the penalty is in each particular case, but a general picture can be derived from benchmarks, and RESTinio contains such benchmarks for the supported regex engines.
For the standard regex engine there is express_router_bench. There are similar benchmarks for PCRE, for PCRE2 and for Boost regex.
# See usage:
$ _test.router.express_router_bench -h
# Sample: run server on port 8080, using 4 threads matching routes given in a file cmp_routes.txt.
$ _test.router.express_router_bench -p 8080 -n 4 -r test/router/express_router_bench/cmp_routes.txt
A file that defines routes must contain lines in the following format:
(HTTP METHOD: GET|POST|...) (route path)
For example:
GET /users/:id(\d+)
GET /users/:id(\d+)/visits
POST /users/:id(\d+)
POST /users/new
GET /locations/:id(\d+)
GET /locations/:id(\d+)/avg
POST /locations/:id(\d+)
POST /locations/new
GET /visits/:id(\d+)
POST /visits/:id(\d+)
POST /visits/new
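A file in this format is straightforward to read. The sketch below shows one possible way to parse it; it is not the parser used by the benchmark itself, and route_line_t and parse_routes are hypothetical names. It assumes the method and the route path are separated by whitespace and skips lines that do not contain both fields.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One line of the routes file: HTTP method and route path.
struct route_line_t {
    std::string method;
    std::string path;
};

std::vector<route_line_t> parse_routes(std::istream & in) {
    std::vector<route_line_t> result;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        route_line_t r;
        // Keep the line only if both fields are present.
        if (fields >> r.method >> r.path) result.push_back(std::move(r));
    }
    return result;
}
```

The function accepts any std::istream, so it can be fed from a std::ifstream opened on a routes file or from a std::istringstream in a test.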
Measurements can be done with your own tools, or simply with the wrk tool:
# Testing with cmp_routes.txt
./wrk --latency -t 4 -c 256 -d 10 -s cmp_routes.lua http://127.0.0.1:8080/
where cmp_routes.lua is:
request = function()
local e = math.random(1, 100)
if e < 86 then
wrk.method = "GET"
if e < 21 then
path = "/users/" .. math.random(1, 10000 )
elseif e < 41 then
path = "/locations/" .. math.random(1, 100000 )
elseif e < 51 then
path = "/visits/" .. math.random(1, 10000 )
elseif e < 61 then
path = "/users/" .. math.random(1, 10000 ) .. "/visits"
else
path = "/locations/" .. math.random(1, 10000 ) .. "/avg"
end
else
wrk.method = "POST"
wrk.body = "{}"
wrk.headers["Content-Type"] = "application/json"
if e < 89 then
path = "/users/" .. math.random(1, 10000 )
elseif e < 93 then
path = "/locations/" .. math.random(1, 100000 )
elseif e < 95 then
path = "/visits/" .. math.random(1, 100000 )
elseif e < 96 then
path = "/users/new"
elseif e < 98 then
path = "/visits/new"
else
path = "/locations/new"
end
end
return wrk.format(nil, path)
end
This script roughly sets the distribution of generated requests.
Benchmarks
Of course, express-router costs something in terms of performance. The question is: for a given set of routes, what is the penalty of using express router compared to well-written hardcoded routing for the same routes?
For that purpose we implemented hardcoded routing, cmp_router_bench, which has a route parser for the following routes:
GET /users/:id(\d+)
GET /users/:id(\d+)/visits
POST /users/:id(\d+)
POST /users/new
GET /locations/:id(\d+)
GET /locations/:id(\d+)/avg
POST /locations/:id(\d+)
POST /locations/new
GET /visits/:id(\d+)
POST /visits/:id(\d+)
POST /visits/new
And we compare it with the express_router_bench described in the previous section.
Wrk was used for generating load with the following params: ./wrk -t 4 -c 256 -d 30 -s cmp_routes.lua http://127.0.0.1:8080/
The results are the following (2017.12.19):
# of threads | hardcoded | express-router (std) | express-router (PCRE) | express-router (PCRE2) |
---|---|---|---|---|
1 | 166831.47 | 123851.03 (74.24%) | 148038.03 (88.74%) | 148825.81 (89.21%) |
2 | 258360.2 | 201816.12 (78.11%) | 240238.03 (92.99%) | 228153.96 (88.31%) |
3 | 293823.26 | 233815.11 (79.58%) | 270818.23 (92.17%) | 262148.93 (89.22%) |
4 | 330288.48 | 259893.3 (78.69%) | 306312.71 (92.74%) | 293151.75 (88.76%) |
Benchmark environment:
- CPU: 8x Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz;
- Memory: 16343MB;
- Operating System: Ubuntu 16.04.2 LTS.
- Compiler: gcc version 7.1.0 (Ubuntu 7.1.0-5ubuntu2~16.04)
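The percentages in the results table appear to be each express-router figure divided by the hardcoded figure in the same row. That reading is an inference from the numbers and is not stated explicitly above. The short sketch below reproduces the calculation; relative_percent is a hypothetical helper name.

```cpp
#include <cassert>
#include <cmath>

// Ratio of a measured figure to the baseline, as a percentage
// rounded to two decimal places.
double relative_percent(double measured, double baseline) {
    return std::round(measured / baseline * 10000.0) / 100.0;
}
```

For the single-thread row, relative_percent(123851.03, 166831.47) gives 74.24, and for the four-thread PCRE column, relative_percent(306312.71, 330288.48) gives 92.74, both matching the table.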