Handler: HTTP reverse proxy

This handler is one of the most requested features of Cherokee. It dispatches inbound network traffic to a set of servers and presents a single interface to the requesters. This is particularly useful for load balancing a cluster of web servers at a much higher level of the network stack than the generic balancer allows.

All connections coming from the Internet and addressed to one of the web servers are routed through the proxy, which can either deal with the request itself or pass it (with or without modifications) to one of the web servers behind it.

The reverse proxy can do more than load balancing. It can rewrite headers, and it can try to establish keep-alive connections with every system it talks to. Even if none of the clients requesting content from the publicly available reverse proxy support keep-alive, the connections between the proxy and the servers in the local pool can still be kept alive, which greatly improves performance.
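
The idea of keeping back-end connections alive independently of the clients can be illustrated with a small sketch. This is not Cherokee's implementation: the class `IdleConnectionPool`, its methods and the `connect` callback are hypothetical names chosen for illustration, and the limit of 16 simply mirrors the handler's documented default for reused connections.

```python
from collections import defaultdict, deque


class IdleConnectionPool:
    """Keeps a bounded number of idle back-end connections per server.

    Hypothetical sketch: the names and structure are assumptions,
    not taken from Cherokee's source code.
    """

    def __init__(self, max_idle_per_server=16):
        self.max_idle = max_idle_per_server
        self._idle = defaultdict(deque)  # server -> idle connections

    def acquire(self, server, connect):
        """Reuse an idle connection to `server`, or open one with `connect`."""
        idle = self._idle[server]
        if idle:
            return idle.pop()
        return connect(server)

    def release(self, server, conn):
        """Hand a connection back; return True if it was kept for reuse."""
        idle = self._idle[server]
        if len(idle) < self.max_idle:
            idle.append(conn)
            return True
        return False  # pool is full: the caller should close the connection
```

A client that does not support keep-alive still benefits here, because the proxy returns the back-end connection to the pool once the response has been relayed.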

The task of the reverse proxy can be summarized in the following steps.

  • Phase 1: The proxy receives a request, adds the necessary HTTP headers and rewrites the existing ones according to the specified rules. It then dispatches the request to one of the machines in the pool of specified information sources.

  • Phase 2: Once the server that received the request sends back its response, the reverse proxy deletes the unnecessary HTTP headers from it and sends the response back to the requesting client.
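
The two phases above can be sketched as plain functions operating on header dictionaries. This is a minimal illustration under stated assumptions: the function names are hypothetical, headers are modeled as simple `dict` objects, and the round-robin choice of server is only one possible balancing strategy.

```python
import itertools


def prepare_request(headers, added_headers):
    """Phase 1: copy the client headers, then add or overwrite the configured ones."""
    outgoing = dict(headers)
    outgoing.update(added_headers)
    return outgoing


def make_picker(sources):
    """Return a function that hands out servers in round-robin order (an assumption)."""
    cycle = itertools.cycle(sources)
    return lambda: next(cycle)


def filter_response(headers, hidden_headers):
    """Phase 2: drop the headers that must not reach the client (case-insensitive)."""
    hidden = {name.lower() for name in hidden_headers}
    return {k: v for k, v in headers.items() if k.lower() not in hidden}


# Hypothetical usage: two back-end servers and one hidden header.
pick = make_picker(["10.0.0.1:8080", "10.0.0.2:8080"])
request = prepare_request({"Host": "example.com"}, {"X-Forwarded-Proto": "http"})
target = pick()
response = filter_response({"Server": "backend", "Content-Type": "text/html"}, ["server"])
```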

To use the HTTP reverse proxy handler, you only have to specify a few parameters. First, define a series of information sources. These are the servers that will ultimately handle the requests.

Then you will have to specify:

  • Reuse connections: the maximum number of connections per server to be kept open with keep-alive. If not specified, the default value of 16 is used.

  • Header additions: to add specific HTTP headers.

  • URL rewriting rules: regular expressions used to modify URLs before the requests are relayed.

  • Hidden returned headers: to eliminate specific HTTP headers.

  • Balancer: the type of load balancing strategy to be used.

  • Information sources: the previously defined sources to assign to this handler, that is, the servers that make up the cluster of web servers.
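
As an illustration of the URL rewriting rules listed above, the following sketch applies regular-expression rules to a request path before it is relayed. The function `rewrite_url`, the rule format (pairs of pattern and substitution) and the "first matching rule wins" behavior are assumptions made for this example and do not describe Cherokee's actual rule semantics.

```python
import re


def rewrite_url(url, rules):
    """Apply the first rule whose pattern matches; return the URL unchanged otherwise.

    `rules` is a list of (pattern, substitution) pairs. Stopping at the
    first match is an assumption of this sketch.
    """
    for pattern, substitution in rules:
        if re.search(pattern, url):
            return re.sub(pattern, substitution, url, count=1)
    return url


# Hypothetical rule: expose a back-end application under a different prefix.
rules = [(r"^/blog/(.*)$", r"/wordpress/\1")]
rewritten = rewrite_url("/blog/post/1", rules)
untouched = rewrite_url("/images/logo.png", rules)
```

Here `rewritten` is `/wordpress/post/1`, while `untouched` keeps its original value because no rule matches it.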
