RESTinio context entities running on asio::io_context

RESTinio runs its logic on asio::io_context. RESTinio works with asio through callbacks, which means that some context is always passed from one callback to another. There are two main entities whose contexts are passed between callbacks:

  • acceptor – receives new connections and creates connection objects that perform session logic;
  • connection – does TCP I/O operations and HTTP parsing, and calls the request handler.


There is a single instance of acceptor and as many connections as needed.

The acceptor’s life cycle is simple:

  1. Start listening for new connection.
  2. Receive new TCP-connection.
  3. Create connection handler object and start running it.
  4. Go back to step 1.

When the server is closed, this cycle stops.
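
The cycle above can be modelled without any networking. The following is only a sketch and not RESTinio code: toy_io_context, toy_connection and toy_acceptor are hypothetical names, a plain queue of callbacks stands in for asio::io_context, and a list of integer client ids stands in for incoming TCP connections.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for asio::io_context: a FIFO queue of callbacks.
class toy_io_context {
public:
    void post(std::function<void()> callback) {
        queue_.push_back(std::move(callback));
    }

    // Runs queued callbacks until none are left.
    void run() {
        while (!queue_.empty()) {
            std::function<void()> callback = std::move(queue_.front());
            queue_.pop_front();
            callback();
        }
    }

private:
    std::deque<std::function<void()>> queue_;
};

// Hypothetical connection object; a real one would own a socket.
struct toy_connection {
    int client_id;
};

// Hypothetical acceptor following steps 1-4 of the list above.
class toy_acceptor : public std::enable_shared_from_this<toy_acceptor> {
public:
    toy_acceptor(toy_io_context& ctx, std::deque<int> pending_clients)
        : ctx_(ctx), pending_clients_(std::move(pending_clients)) {}

    // Step 1: schedule an accept. The captured shared_ptr is the context
    // that travels from one callback to the next.
    void start_accept() {
        if (closed_) return;  // A closed server breaks the cycle.
        ctx_.post([self = shared_from_this()] { self->on_accept(); });
    }

    void close() { closed_ = true; }

    std::size_t connection_count() const { return connections_.size(); }

private:
    void on_accept() {
        // In this model the cycle also ends when no client is waiting;
        // a real acceptor would keep waiting for the next one.
        if (closed_ || pending_clients_.empty()) return;
        const int client_id = pending_clients_.front();  // Step 2.
        pending_clients_.pop_front();
        connections_.push_back(
            std::make_shared<toy_connection>(toy_connection{client_id}));  // Step 3.
        start_accept();  // Step 4.
    }

    toy_io_context& ctx_;
    std::deque<int> pending_clients_;
    std::vector<std::shared_ptr<toy_connection>> connections_;
    bool closed_{false};
};
```

To drive the model, create the acceptor with std::make_shared, call start_accept() once and then call run() on the toy context. Every accepted client re-arms the next accept from inside the callback, which is the pattern the list describes.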

To set custom options for acceptor use server_settings_t::acceptor_options_setter().

By default RESTinio accepts connections one by one, so a large number of clients initiating connections simultaneously might be a problem even when asio::io_context runs on a pool of threads. There are two options for tuning RESTinio for such cases.

  • Increase the number of concurrent accepts. By default RESTinio initiates only a single accept operation, but when a server runs on N threads, up to N accepts can be handled concurrently. See server_settings_t::concurrent_accepts_count().
  • Create connection objects separately. After accepting a new connection on a socket, RESTinio creates an internal connection wrapper object. The creation of that object can be done separately (in another callback posted on asio), so the allocations and initialization it involves happen in a context independent of the acceptor’s. This makes the on-accept callback run faster, so more connections can be accepted in the same time interval. See server_settings_t::separate_accept_and_create_connect().

Example of using acceptor options:

// using traits_t = ...
restinio::http_server_t< traits_t >
  server{
    restinio::create_child_io_context( 4 ),
    restinio::server_settings_t< traits_t >{}
      .port( port )
      .buffer_size( 1024 )
      .max_pipelined_requests( 4 )
      .request_handler( db )
      // Using acceptor options:
      .acceptor_options_setter(
        []( auto & options ) {
          options.set_option( asio::ip::tcp::acceptor::reuse_address( true ) );
        } )
      .concurrent_accepts_count( 4 )
      .separate_accept_and_create_connect( true ) };

Post-bind hook for the acceptor

Since v.0.6.11 there is an optional post-bind hook. If it is set, RESTinio calls it right after a successful call to bind() for the acceptor returns.

A reference to asio::ip::tcp::acceptor (or boost::asio::ip::tcp::acceptor) is passed to that post-bind hook. This reference can be used for gathering some acceptor-related information or for setting some application-specific options to bound acceptor (if those options can’t be set via acceptor_options_setter as shown above).

For example, this code snippet shows how to run RESTinio server on a random port (the actual port number is assigned by the Operating System during the bind()):

std::promise<unsigned short> port_promise; // For getting the port.
auto server = restinio::run_async(
      restinio::own_io_context(),
      restinio::server_settings_t<>{}
         .port(0u) // Zero means that port will be assigned by the OS.
         .acceptor_post_bind_hook(
            [&port_promise](asio::ip::tcp::acceptor & acceptor) {
               // Gathering the actual port number.
               port_promise.set_value(acceptor.local_endpoint().port());
            })
         .request_handler(...),
      4u);
// Now we can safely get the actual port number from the promise.
const auto actual_port = port_promise.get_future().get();
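
The hand-off through std::promise is what makes reading the port safe: get() on the future returns only after the hook has called set_value(), even though the hook runs on another thread. The sketch below isolates that pattern using the standard library only. fake_bind and start_and_get_port are hypothetical helpers, and the "server" is just a thread that calls the hook with a port number supplied by the caller.

```cpp
#include <functional>
#include <future>
#include <thread>
#include <utility>

// Hypothetical stand-in for server start-up: "binds" on another thread and
// then calls the hook with the port that the OS would have assigned.
inline std::thread fake_bind(
    unsigned short os_assigned_port,
    std::function<void(unsigned short)> post_bind_hook) {
    return std::thread{
        [os_assigned_port, hook = std::move(post_bind_hook)] {
            hook(os_assigned_port);
        }};
}

// Starts the fake server and blocks until the hook has reported the port.
inline unsigned short start_and_get_port(unsigned short os_assigned_port) {
    std::promise<unsigned short> port_promise;
    std::future<unsigned short> port_future = port_promise.get_future();

    std::thread server_thread = fake_bind(
        os_assigned_port,
        [&port_promise](unsigned short port) { port_promise.set_value(port); });

    // get() returns only after set_value() was called inside the hook.
    const unsigned short actual_port = port_future.get();
    server_thread.join();
    return actual_port;
}
```

The promise is captured by reference, so it has to outlive the hook call. In this sketch that is guaranteed because the thread is joined before the promise goes out of scope.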

Limiting the number of parallel connections

Since v.0.6.12 RESTinio allows limiting the number of parallel connections (before v.0.6.12 there was no such limit and RESTinio accepted all incoming connections regardless of how many there were). The limit is optional and has to be set manually. When the limit is set and the number of parallel connections reaches it, RESTinio stops calling accept(). Calls to accept() resume when the number of connections drops below the limit.

To set up a limit on parallel connections a user has to do two things:

  1. Set the static constexpr member use_connection_count_limiter of the server’s traits to true:

    struct my_traits : public restinio::default_traits_t {
       static constexpr bool use_connection_count_limiter = true;
    };

  2. Call max_parallel_connections for server_settings_t:

    restinio::server_settings_t<my_traits> settings;
    settings.max_parallel_connections(1000u); // 1000 is only an example value.
    auto server = restinio::run_async(
          restinio::own_io_context(), std::move(settings), 4u);

Please note that use_connection_count_limiter is set to false by default in both restinio::default_traits_t and restinio::default_single_thread_traits_t.

Connection count limiter and multi-threading

Additional care should be taken if RESTinio is used in a multi-threaded application and the actual processing of an incoming request is delegated to some worker thread.

When an incoming connection dies it informs RESTinio server about this event. RESTinio server checks the number of connections and can initiate a new accept. But this checking should be performed under a mutex.

If RESTinio server is launched with multi-threading traits (e.g. restinio::default_traits_t or user-defined traits derived from restinio::default_traits_t) then an appropriate mutex is used automatically and checking of the connection number is performed safely and correctly.

But if the RESTinio server is launched with single-threading traits (e.g. restinio::default_single_thread_traits_t or user-defined traits derived from restinio::default_single_thread_traits_t), then there is no mutex, and handling the death of a connection can lead to data races and some form of data corruption.

This means that multi-threaded traits must be used if the processing of incoming requests is delegated to worker threads, especially when the connection count limiter is used.
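
The role of the mutex can be illustrated with a small model. This is a sketch under stated assumptions, not RESTinio’s actual limiter: connection_limiter, null_mutex and churn_connections are hypothetical names, and the choice of mutex type imitates the difference between multi-threaded and single-threaded traits.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical no-op lock, imitating single-threaded traits.
struct null_mutex {
    void lock() noexcept {}
    void unlock() noexcept {}
};

// Hypothetical limiter: counts live connections and tells the acceptor
// whether accept() may be called.
template <typename Mutex>
class connection_limiter {
public:
    explicit connection_limiter(std::size_t max_parallel_connections)
        : max_(max_parallel_connections) {}

    // Called after a successful accept.
    // Returns true if the acceptor may go on accepting.
    bool on_connection_opened() {
        std::lock_guard<Mutex> lock{mutex_};
        ++active_;
        return active_ < max_;
    }

    // Called when a connection dies, possibly on a worker thread.
    // Returns true if accepting was paused and should now resume.
    bool on_connection_closed() {
        std::lock_guard<Mutex> lock{mutex_};
        assert(active_ > 0);
        const bool was_at_limit = active_ >= max_;
        --active_;
        return was_at_limit;
    }

    std::size_t active() {
        std::lock_guard<Mutex> lock{mutex_};
        return active_;
    }

private:
    Mutex mutex_;
    std::size_t active_{0};
    const std::size_t max_;
};

// Opens and closes `rounds` connections on each of `thread_count` threads
// and returns the final counter value. With std::mutex no update is lost,
// so the result is 0. Using null_mutex here would be a data race.
inline std::size_t churn_connections(std::size_t thread_count,
                                     std::size_t rounds) {
    connection_limiter<std::mutex> limiter{thread_count + 1};
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < thread_count; ++t) {
        workers.emplace_back([&limiter, rounds] {
            for (std::size_t i = 0; i < rounds; ++i) {
                limiter.on_connection_opened();
                limiter.on_connection_closed();
            }
        });
    }
    for (auto& worker : workers) worker.join();
    return limiter.active();
}
```

With null_mutex the same class is correct only while every call happens on one thread, which mirrors the warning about single-threading traits.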


The connection’s life cycle is more complicated and cannot be expressed linearly. A connection pursues two logical objectives simultaneously. The first is receiving requests and passing them to a handler (the read part); the second is streaming the resulting responses back to the client (the write part). This logical separation comes from HTTP pipelining support and the various response building strategies.

Without error handling and timeout control, the Read part looks like this:

  1. Start reading from a socket.
  2. Receive a portion of data from a socket and parse HTTP request out of it.
  3. If HTTP message parsing is incomplete then go back to step 1.
  4. If HTTP message parsing is complete pass request and connection to request handler.
  5. If request handler rejects the request, then push not-implemented response (status 501) to outgoing queue and stop reading from a socket.
  6. If the request was accepted and the number of requests in process is less than max_pipelined_requests then go back to step 1.
  7. Stop reading from the socket until awakened by the write part.
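
The decisions in the list above can be sketched as a tiny model. Everything here is hypothetical and much simpler than a real HTTP parser: toy_parser treats a request as complete once the blank line that ends the header block has arrived, and next_read_action encodes steps 5-7.

```cpp
#include <cstddef>
#include <string>

// What the read part does after a complete request was handed to the handler.
enum class read_action {
    keep_reading,      // Step 6: room for more pipelined requests.
    reject_and_stop,   // Step 5: respond with 501 and stop reading.
    pause_until_write  // Step 7: wait until the write part wakes us.
};

// Hypothetical incremental "parser": a request counts as complete once the
// blank line that ends the header block has been seen. Bodies are ignored.
class toy_parser {
public:
    // Steps 2-3: returns true when a complete message is buffered,
    // false when more data has to be read from the socket.
    bool feed(const std::string& chunk) {
        buffer_ += chunk;
        return buffer_.find("\r\n\r\n") != std::string::npos;
    }

private:
    std::string buffer_;
};

// Steps 5-7: decide how the read part continues.
inline read_action next_read_action(bool handler_accepted,
                                    std::size_t requests_in_process,
                                    std::size_t max_pipelined_requests) {
    if (!handler_accepted) return read_action::reject_and_stop;
    if (requests_in_process < max_pipelined_requests) {
        return read_action::keep_reading;
    }
    return read_action::pause_until_write;
}
```

For example, with max_pipelined_requests set to 4, an accepted fourth request in process yields pause_until_write, while an accepted third one yields keep_reading.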

And the Write part looks like this:

  1. Wait for response pieces initiated from user domain either directly inside of handler call or from another context where response actually is being built.
  2. Push response data to outgoing queue with consideration of associated response position (multiple requests can be in process and response for a given request cannot be written to the socket before writing all previous responses to it).
  3. Try to extract the next write group that is ready to be sent.
  4. If there is no ready data available then go back to step 1.
  5. Start the cycle of streaming the data from the given write group to the peer.
  6. Wait for all write operations for the current write group to complete. If more response pieces arrive while a write operation is running, they are simply queued (steps 1-2, without going further).
  7. After all the data from the current write group is sent, invoke the notificators, if any.
  8. If last committed response was marked to close connection then the connection is closed and destroyed.
  9. If room for more pipelined requests has become available again, wake the read part.
  10. Go back to step 3.
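
The ordering rule from step 2 (a response cannot be written before all previous responses) is the core of the write part. The sketch below models only that rule. response_coordinator is a hypothetical class, not RESTinio’s internal type, and strings stand in for write groups.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of the outgoing queue for pipelined requests.
class response_coordinator {
public:
    // Registers a new request in process and returns its position.
    std::size_t add_request() {
        slots_.emplace_back();
        return first_id_ + slots_.size() - 1;
    }

    // Step 2: stores a response piece for request `id`.
    // `final_piece` marks the response as complete.
    void push(std::size_t id, std::string piece, bool final_piece) {
        assert(id >= first_id_ && id - first_id_ < slots_.size());
        slot& target = slots_[id - first_id_];
        target.pieces.push_back(std::move(piece));
        target.complete = target.complete || final_piece;
    }

    // Step 3: extracts the data that is ready to be written. Only the oldest
    // request may produce output; later responses wait for their turn.
    std::vector<std::string> pop_ready() {
        std::vector<std::string> ready;
        while (!slots_.empty()) {
            slot& head = slots_.front();
            for (std::string& piece : head.pieces) {
                ready.push_back(std::move(piece));
            }
            head.pieces.clear();
            if (!head.complete) break;  // The oldest response is still being built.
            slots_.pop_front();         // Done: the next request becomes the head.
            ++first_id_;
        }
        return ready;
    }

    std::size_t requests_in_process() const { return slots_.size(); }

private:
    struct slot {
        std::vector<std::string> pieces;
        bool complete{false};
    };

    std::deque<slot> slots_;
    std::size_t first_id_{0};
};
```

If the response for the second request is pushed first, pop_ready() returns nothing. Once the first response is completed, both are released in request order.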

Of course, implementation has error checks. Also, implementation controls timeouts of operations that are spread in time:

  • reading the request: from starting reading bytes from socket up to parsing a complete HTTP-message;
  • handling the request: from passing request data and connection handle to the request handler up to getting the response to be written to a socket;
  • writing response to socket (sendfile operation might set own timeout, overriding the one provided in settings).

When handling a request there are two possible cases:

  • response is created inside the request handler’s call;
  • request handler delegates handling job to another context via some kind of async API.

The first case is trivial and the response simply begins to be written.

The possibility of the second case is the key reason RESTinio was created. Because request data and the connection handle are wrapped in shared pointers, they can be moved to another context, so it is possible to write handlers that interact with an async API. When the response data is ready, the response can be built and sent using the request handle. After response building is complete, the connection handle posts the necessary job to run on the host asio::io_context. This way one can handle requests asynchronously without blocking worker threads.
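
The hand-off described above can be sketched with the standard library alone. This is a model under stated assumptions, not RESTinio’s API: toy_job_queue stands in for the host asio::io_context, toy_request for the request handle, and handle_async for a user’s request handler that delegates work to another thread.

```cpp
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical thread-safe job queue standing in for the host io_context.
class toy_job_queue {
public:
    void post(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock{mutex_};
            jobs_.push(std::move(job));
        }
        ready_.notify_one();
    }

    // Blocks until a job is available, then runs it on the calling thread.
    void run_one() {
        std::function<void()> job;
        {
            std::unique_lock<std::mutex> lock{mutex_};
            ready_.wait(lock, [this] { return !jobs_.empty(); });
            job = std::move(jobs_.front());
            jobs_.pop();
        }
        job();
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()>> jobs_;
};

// Hypothetical request handle, always owned by a shared_ptr.
class toy_request : public std::enable_shared_from_this<toy_request> {
public:
    toy_request(toy_job_queue& io, std::string target)
        : io_(io), target_(std::move(target)) {}

    const std::string& target() const { return target_; }

    // May be called from any thread: the write itself is posted back to the
    // host context, and the captured shared_ptr keeps the request alive.
    void respond(std::string body) {
        io_.post([self = shared_from_this(), body = std::move(body)] {
            self->written_ = body;  // Stands in for writing to the socket.
        });
    }

    const std::string& written() const { return written_; }

private:
    toy_job_queue& io_;
    const std::string target_;
    std::string written_;
};

// Request handler: returns at once, the work continues on another thread.
inline std::thread handle_async(std::shared_ptr<toy_request> req) {
    return std::thread{[req = std::move(req)] {
        // Pretend this is a slow database or RPC call.
        req->respond("hello, " + req->target());
    }};
}

// Serves one request: the handler delegates, the "io thread" does the write.
inline std::string serve_one(const std::string& target) {
    toy_job_queue io;
    auto req = std::make_shared<toy_request>(io, target);
    std::thread worker = handle_async(req);
    io.run_one();
    worker.join();
    return req->written();
}
```

The thread that calls run_one() never waits for the slow call itself. It only picks up the posted write job once the worker has produced the response.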

To set custom options for a connection’s socket use server_settings_t::socket_options_setter():

// using traits_t = ...
restinio::http_server_t< traits_t >
  server{
    restinio::create_child_io_context( 4 ),
    restinio::server_settings_t< traits_t >{}
      .port( port )
      .buffer_size( 1024 )
      .max_pipelined_requests( 4 )
      .request_handler( db )
      // Using custom socket options:
      .socket_options_setter(
        []( auto & options ){
          options.set_option( asio::ip::tcp::no_delay{ true } );
        } ) };