23. LVS: Cluster friendly versions of applications that need to maintain state

23.1. Rewriting your application/service

Complicated websites can be hard to run under LVS (e.g. websites with servlets). The problem is that the website has been written with the assumption that all functions on the website will always be together on the same machine. This was a reasonable assumption not so long ago, but now that customers want high availability (and possibly high throughput), it is no longer valid. The assumption breaks if the client writes to the server (see Section 27.28) or if the server maintains state information about the client (see Section 13.9.1). Often people setting up an LVS hope that LVS can look inside the packets for content and direct each packet to an appropriate realserver. LVS can't do this. In all cases, the simplest, most robust solution is to rewrite the application to run in a high availability environment. Administratively, this sort of proposal is not well received by management, who don't like to hear that their expensive web application was not properly written. Management will likely be more receptive to canned solutions from glib salespeople, who will tell them that an L7 loadbalancer is a simple solution (it is, but it is also slow and expensive).

Roberto Nibali ratz (at) tac (dot) ch 19 Apr 2001

LVS is a Layer4 load balancer and can't do content based (L7) load balancing.

You shouldn't try to solve this problem by changing the TCP Layer to provide a solution which should be handled by the Application Layer. You should never touch/tweak TCP settings outside the boundaries given in the various RFCs and their implementations.

If your application passes a cookie to the client, these are the general approaches:

  • buy an L7 load balancer (and don't use LVS).
  • Set a very high persistency timeout and hope it is longer than the time a client will take to come back after he has found his credit card, looked at other sites, or had a cup of coffee.

    This is not a good solution.

    • An increased persistency timeout increases the number of connection entries the director must hold at once, which increases the memory required for the connection table. With a persistency timeout of 30min and clients connecting at 500 connections/s, you would need a memory pool of at least 30*60*128*500/(1024*1024) = 109 MBytes (the factor of 128 is the size in bytes of one connection entry). With the standard timeout of 300 seconds, you'd only need 109/6 = 18 MBytes.
    • Long persistency times are incompatible with the DoS defense strategies employed by secure_tcp.
  • Have a 2-tier architecture where the application runs directly on the webserver itself, possibly helped by a database. This does not solve the problem of cookie storage, however: you have to deal with the replication problem. Imagine the following setup:

                           ---->  Web1/App -->
                         /                    \
      Clients  ----> director ->  Web2/App ---> DB Server
                         \                    /
                           ---->  Web3/App -->
    

    Cookies are generated and stored locally on each WebX server. But if you have a persistency timeout of 300s (default LVS setting) and the client has his cup of coffee while entering his Visa number, he will get a new server when he returns. This new server would then ask the client to reauthenticate. There are solutions to this, e.g.:

    • NFS export a dedicated cookie directory over the back-interfaces. Cookies are quickly distributed among the servers.
    • the application is written to handle cookie replication and propagation between the WebX servers. You have at least 299 seconds to replicate the cookie to all the webservers, which should be enough even for distributing it over a serial line and doing a crosscheck :)

      This does not work (well) for geographically distributed webservers.

  • 3-Tier architecture

                           -->  Web1 --
                         /              \
      Clients  ----> LVS ---->  Web2 ----> Application Server <---> DB Server
                         \              /
                           -->  Web3 -->
    

    The cookies are generated by the application server and stored either there or on the database server. When a request comes in, the LVS assigns the request to, e.g., Web1 and sets the persistency timeout. Web1 does a short message exchange with the application server, which generates the sessionID as a cookie and stores it. The webserver sends the cookie back to the client and now we are safe. Again, this whole procedure has t_pers_timeout (normally 300 seconds) in which to complete. Let's assume the client times out (he has gone for a cup of coffee). When he comes back, a Layer4 load balancer will normally forward him to a new server (say Web2). The CGI script on Web2 does the same as happened originally on Web1: it tries to generate a cookie as sessionID. But the application server will tell the script that there is already a cookie for this client and will pass it to Web2. In this way we have unlimited persistency based on cookies but limited persistency for TCP.

    Advantages:

    • set your own persistency timeout values
    • TCP state timeout values are not changed.
    • table lookup is faster
    • it's cheaper than buying an L7 load balancer

    Disadvantages:

    • more complex setup, more hardware
    • you have to write some software
  • If a separate database is running on each webserver, use replication to copy the cookie between servers. (You have 300 secs to do this). This was also mentioned by Ted Pavlic in connection with databases.
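
The memory estimate in the persistency-timeout item above is easy to redo for your own traffic. The following is a minimal Python sketch, not part of LVS; the function name is made up for illustration, and the 128 bytes per connection entry is simply the figure used in the calculation above.

```python
def conn_table_mbytes(timeout_s, conns_per_s, entry_bytes=128):
    """Estimate the director's connection-table memory in MBytes.

    entry_bytes=128 is the per-entry size used in the calculation above;
    treat it as an assumption and check it against your own kernel.
    """
    # Every new connection leaves an entry that lives for the whole timeout.
    entries = timeout_s * conns_per_s
    return entries * entry_bytes / (1024 * 1024)


if __name__ == "__main__":
    # Compare the standard 300 s timeout with a 30 min timeout at 500 conn/s.
    for timeout_s in (300, 30 * 60):
        mbytes = conn_table_mbytes(timeout_s, 500)
        print(f"{timeout_s:5d} s timeout: {mbytes:6.1f} MBytes")
```

The result scales linearly with both the timeout and the connection rate, so doubling either doubles the memory the director needs.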

23.2. Session Data, maintaining state in a cluster, from Andreas Koenig

Andreas J. Koenig andreas.koenig (at) anima (dot) de 2001-06-26

  • What are sessions?

    When an application written on top of a stateless protocol like HTTP has a need of stateful transactions, it typically writes some data to disk between requests and retrieves these data again on the subsequent request. This mechanism is known as session handling. The session data typically get written to files or databases. Each followup-request sends some sort of token to the server so that the application can retrieve the correct file or correct record in the database.

  • The old-fashioned way to identify sessions

    At the time when every webserver was a single PC, the session token identified a filename or a record in a database and everything was OK. When an application that relies on this mechanism is ported to a cluster environment, it stops working unless one degrades the cluster with a mechanism called persistence. Persistence is a quick and dirty way to get the old-fashioned token to work. It's not a very clever way though.

  • Why persistence is bad

    Persistence counteracts two purposes of a cluster: easy maintenance by taking single machines out at any time, and optimized balancing between the members of the cluster. On top of that, persistence consumes memory on the load balancers.

  • How to do it better

    Recall that there is a token being sent back and forth anyway, which identifies a filename or a database record. Extend this token to point unambiguously to the machine where the session data were created, and install a session server on each host that delivers session data within the cluster to any of the peers on request. From that moment you can run your application truly distributed, and you can take single machines out for maintenance at any time: you set their weight to 0 and wait for maybe an hour or three, depending on how long you want your sessions to last. You get better balancing, and you save memory on the balancer. Note that, unlike with a dedicated session server, you do not create a single point of failure with this method.
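
As an illustration of the extended token, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `<host>-<random>` token layout, the host names, the peer addresses and the function names. How the session server on each host actually delivers the data is left out.

```python
import secrets

# Hypothetical peer table: host id -> address of that host's session server.
# Host ids must not contain "-", since it separates the two token fields.
PEERS = {"rs1": "192.168.1.11:7000", "rs2": "192.168.1.12:7000"}


def new_token(host_id):
    """Create a session token of the form '<host_id>-<random hex>'."""
    return f"{host_id}-{secrets.token_hex(16)}"


def origin_of(token):
    """Split a token into (host_id, session_key); reject malformed tokens."""
    host_id, sep, key = token.partition("-")
    if not sep or not key or host_id not in PEERS:
        raise ValueError("malformed or unknown session token")
    return host_id, key


def locate(token, local_host_id):
    """Return 'local' if this host created the session, else the peer to ask."""
    host_id, _key = origin_of(token)
    return "local" if host_id == local_host_id else PEERS[host_id]


if __name__ == "__main__":
    token = new_token("rs1")
    print(locate(token, "rs1"))  # the creating host reads its own session store
    print(locate(token, "rs2"))  # any other host asks rs1's session server
```

Because any realserver can work out from the token alone which peer holds the data, the director is free to send each request wherever the scheduler chooses.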

23.3. Maintaining state with persistence

You can set up persistence in several ways:

  • Use port 0 (i.e. all ports) with the persistency feature (read the ipvsadm man page). All ports are persistent. A client, after connecting to a particular realserver for one service, will (within the timeout period) be connected to the same realserver for all services. This will allow intruders to forward packets for any port to the realservers, so you will need to write filter rules that block all ports but the ones that you want serviced by the realservers.
  • In practice only one or a small number of ports (e.g. http/https, smtp/pop) will ever be used in a persistent manner, and you can set persistence for a particular port (e.g. https) while other services are not persistent. The client will (within the timeout period) be sent to the same realserver for the persistent port, while being serviced by all realservers for the other LVS'ed ports.
  • For sophisticated setups (e.g. shopping carts where the client who has been filling his cart on :http needs to give his credit card details on :https), you should use persistent fwmarks with the fwmark persistent patch. fwmarks and persistent fwmarks scale well with large numbers of services and (once you understand fwmarks) make it easy to set up shopping cart LVSs.

    Note
    Shopping cart applications have to maintain state. Usually state is maintained by sending the customer a cookie. These are intrusive and a security risk (I turn them off on my browser). If you're going to use cookies in your application, you should at least test that the client accepts them, otherwise the client will not be able to accumulate objects in their shopping cart. We encourage you to rewrite the application (see rewriting your e-commerce application) so that state is maintained on the realserver(s) in a way that is available to all realservers (e.g. on a replicated database) (see Section 23.2). You have the time of the persistence timeout to make this information available to the other realservers.

    Having told you that you can setup a shopping cart with persistent fwmarks, please read why you don't want persistence for your e-commerce site: why you should rewrite your application.

One of the problems with persistence is removing a service (e.g. you just want it removed or the realserver has crashed). Even after the weight has been set to 0 with ipvsadm, the service is still in the ipvsadm table and will stay there till the end of the client's timeout period. If the realserver has crashed, the client's connection will hang. You would like to have preserved the client's state information in your database, and to give the client a new realserver. This problem has now been addressed with the LVS sysctl (see sysctl and bringing down persistent services). For older material on the topic see realserver crash on sticky connection.

The following examples use telnet and http. You aren't likely to want to make these persistent in practice. They are used because the clients are simple to use in tests. You'll probably only want to make ftp or https persistent, but not much else.

Set up persistence on the VIP, with the default persistence timeout (the default value varies a bit with ipvs versions, but it's about 10mins) and no port specified (all ports made persistent). Telnet'ing to the VIP from one machine, you will always connect to the same realserver.

director:/etc/lvs# ipvsadm -A -t lvs -p
director:/etc/lvs# ipvsadm -a -t lvs -r rs1 -g
director:/etc/lvs# ipvsadm -a -t lvs -r rs2 -g
director:/etc/lvs# ipvsadm
IP Virtual Server version 0.9.4 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  lvs.mack.net:0 wlc persistent 360
  -> RS2.mack.net:0               Route   1      0          0
  -> RS1.mack.net:0               Route   1      0          0

Here's setting up with a specified persistence timeout (here 600secs), setting persistence granularity (the -M option) to a netmask of /24, and round robin scheduling.

Note
If you make the timeout > 15mins (900 sec), you'll also need to change the tcpip idle timeout.

director:/etc/lvs# ipvsadm -A -t lvs -p 600 -s rr -M 255.255.255.0
director:/etc/lvs# ipvsadm -a -t lvs -r rs1 -g
director:/etc/lvs# ipvsadm -a -t lvs -r rs2 -g
director:/etc/lvs# ipvsadm
IP Virtual Server version 0.9.4 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  lvs.mack.net:0 rr persistent 600 mask 255.255.255.0
  -> RS2.mack.net:0               Route   1      0          0
  -> RS1.mack.net:0               Route   1      0          0

Note: only a timeout value can follow "-p". Thus you can have any of

director:/etc/lvs# ipvsadm -A -t VIP -p
director:/etc/lvs# ipvsadm -A -t VIP -p 600
director:/etc/lvs# ipvsadm -A -t VIP -s rr -p

but you can't have

director:/etc/lvs# ipvsadm -A -t VIP -p -s rr

You can set up persistence by port:

director:/etc/lvs# ipvsadm -A -t lvs:80 -p
director:/etc/lvs# ipvsadm -a -t lvs:80 -r rs1 -g
director:/etc/lvs# ipvsadm -a -t lvs:80 -r rs2 -g
director:/etc/lvs# ipvsadm
IP Virtual Server version 0.9.4 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  lvs.mack.net:http wlc persistent 360
  -> RS2.mack.net:http            Route   1      0          0
  -> RS1.mack.net:http            Route   1      0          0

23.4. How others maintain state

We spend a lot of time telling people not to use cookies to maintain state. I thought I should do a reality check, to see what people are using that's working.

Malcolm Turnbull malcolm (at) loadbalancer (dot) org 12 May 2004

I've been involved with a mid-sized ecommerce company for about 4 years and we've had very few problems using cookies for state (stored in a single backend db). It's fast and easy. If the odd customer doesn't have cookies turned on then they are no great loss.

Putting the sessionid in the URL i.e. GET is ugly and slightly less secure. I guess you could POST it on every page but would that be slower than cookie? (I think so)

Note
Joe: Having session data in the URL allows the user (or man in the middle) to manipulate it. This is not secure.

Joe 11 May 2004 14:23:02 -0400

With the security hazards of cookies, I have them turned off. Usually the application (e.g. mailman) runs a cookie test on me and tells me to turn on cookies. I've been trying to register for the Ottawa Linux Symposium for about 2 months, and they've been having trouble with their registration software. Finally they ask me if I've got cookies turned on. I say "of course not". They don't do a cookie test and there's no notice on the web page that I need cookies turned on.

Jacob Coby jcoby (at) listingbook (dot) com 11 May 2004

We aren't an ecommerce site, but we do require some sort of login/authentication to use our site.

We haven't worried about cookies either, at least, until the past 4 months or so. It seems that the latest and greatest anti-virus software and pop-up blockers disable cookies (among other things that they have no business doing). Rumor has it that new versions of IE will disable cookies by default. A good portion of IE6 users won't accept cookies unless your site publishes a P3P of some sort. We get approx. 4000 people/day (9000 logins/day), and we were getting up to 10 cookie-related problems a day to the helpdesk. I'd estimate that there were probably 2-10x more people who had problems but never reported them. In 3 years of requiring cookies, we had only one nasty email about our requirement to have cookies enabled.

At any rate, we now use a different system to authenticate a user. We pass in a sid per page, and use cookies, IP address, browser ident, and other metrics to authenticate the user. Sensitive areas of the site (such as those requiring a credit card) also use SSL. All session data is stored in a single database, as a serialized PHP array. There can be up to 1/2 MB of session data, and part of the session data persists between logins, so it doesn't make sense for us to put session data in the cookie or to store it on the webservers.

The sid + (cookie, IP, browser ident) is only used to authenticate the user. The session data itself stores all sorts of things, such as temporary user prefs, some of the things the user looked at, a bit of caching to cut down on subsequent db queries, things like that. Only part of that session data persists between logins, but it has to be stored somewhere between pages.

Our situation is a little different from your average e-commerce store. We can't just identify a shopping cart + items by a unique sid. We need the session data to act as a ram drive of sorts for data that needs to be quickly accessed, multiple times per page.

All of that temp data is stored in an array, and serialize()'d. PHP's serialize is pretty compact, but it still expands out. For example, a simple int value of the login timestamp looks like:

s:8:"login_ts";i:1084802982;

The 1/2 MB is the MAXIMUM we allow to be stored. Typical is more in the 3-10 kB range. Average size over 49627 rows of session data is 3134 bytes right now.

Joe: What's going to happen to your session data when IE6 disallows cookies?

It'll still work. The sid cookie is only one of several hints we can use to authenticate the user.

Malcolm Turnbull malcolm (at) loadbalancer (dot) org 17 May 2004

If your page is formatted correctly with a PICS-Label, IE6 will accept the cookie by default.

Note
For reference info about PICS Labels see PICS Label Distribution Label Syntax and Communications Protocols v1.1 (http://www.w3.org/TR/REC-PICS-labels-961031). For a sample HOWTO see How To Label Your Pages with PICS Meta Tag (http://256.com/gray/docs/pics/). These labels appear to be part of the so far futile effort to filter webpages for children. Here's a webpage by the people fronting this effort: Information for webmasters.

They want you to rate your own website (people with obnoxious content are always honest, right?). If the ICRA expect this approach to succeed, then why do we have spam? The politicians are no help of course. One of them has a bill to stop Google from inserting advertisements into their Gmail service, since this requires reading people's (private) e-mail. This bill would also stop programs from filtering web content. We have a long way to go.

POST is marginally slower than GET if you look at the HTTP spec. There is an additional request header per variable. GET is only *very slightly* less secure. POST and cookies are of equal security levels, and they're all trivial to send using command line tools.

Joe

Until recently I'd thought that putting the session data into the URL (rather than a cookie) was the way to go, till someone pointed out that the user could manipulate the URL. In that case, could the session id be put in a long enough string in the URL such that any attempt to alter it would result in an invalid string?

Jacob

There is an upper limit on the GET string IE will send. Somewhere around 1 or 2 kb. When it hits the limit, if you use javascript to submit the form, it'll error out. If you just use a

<input type="submit">

it'll just not work. For sites that will never hit that limit, passing in the session data would work. However, there should still be checks to authenticate the user, mostly to prevent problems when they share links with friends. One solution to the user modifying the string is to pass in a public key:

Kpu = public key
Kpr = private key
md5 = standard MD5 sum algo
session = your serialized session data

Kpu = md5(md5(session + Kpr) + md5(Kpr))
(or some variation; see the HTTP RFC for an example with the HTTP DIGEST authentication)

Then you can authenticate that the session data hasn't been modified by checking your computed Kpu against the Kpu that was passed in from the GET/POST. If they match, the session data probably hasn't been modified. If they don't, there is a very good chance the data was either corrupted in transit or corrupted by the user.

It's only as strong as the Kpr used and the collision resistance of the md5 algorithm.
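
A minimal Python sketch of this check follows. It implements the formula as written, with the assumption (not stated in the original) that the inner digests are concatenated as hex strings; the function names and the sample key are made up for illustration. Despite the name Kpu, the value is a keyed digest rather than a public key in the asymmetric-cryptography sense. For new code, the standard library's hmac module with a SHA-2 digest would be a more usual choice than a hand-built MD5 construction.

```python
import hashlib
import hmac


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hex string."""
    return hashlib.md5(data).hexdigest()


def sign(session: bytes, kpr: bytes) -> str:
    """Compute Kpu = md5(md5(session + Kpr) + md5(Kpr)).

    Assumption: the two inner digests are joined as hex strings.
    """
    inner = md5_hex(session + kpr) + md5_hex(kpr)
    return md5_hex(inner.encode("ascii"))


def verify(session: bytes, kpr: bytes, kpu: str) -> bool:
    """Recompute Kpu on the server and compare it with the value sent back."""
    expected = sign(session, kpr).encode("ascii")
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected, kpu.encode("utf-8"))


if __name__ == "__main__":
    kpr = b"server-side secret"  # hypothetical key, never sent to the client
    session = b's:8:"login_ts";i:1084802982;'
    kpu = sign(session, kpr)
    print(verify(session, kpr, kpu))                 # unmodified session data
    print(verify(session + b"tampered", kpr, kpu))   # altered session data
```

The server sends the session data and Kpu to the client and keeps only Kpr; on the next request it recomputes the digest and rejects the session if the two values differ.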

Horms 21 May 2004

Is there anything to stop a cookie or post from being manipulated by an end user? Sure, it might be marginally more difficult, as you would probably need some (command line) tool rather than just changing the URL directly in your browser. But it should still be trivial. I rarely write web pages. But if I were to do something like this I would make the string in the URL a hash (md5, sha1 or something like that), which should make guessing a valid string rather difficult. I would do the same in a cookie or a post. I would imagine something like this is pretty common practice.
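
A sketch of the unguessable-identifier idea in Python, under stated assumptions: instead of hashing anything, it draws 160 random bits from the standard library's secrets module, which gives an id the same length as a sha1 hex digest. The in-memory dictionary stands in for whatever shared database the realservers would really use, and the function names are hypothetical.

```python
import secrets

# Hypothetical in-memory store; in a cluster this would be a shared database.
SESSIONS = {}


def create_session(data):
    """Store data under a fresh random id and return the id."""
    sid = secrets.token_hex(20)  # 160 random bits as 40 hex characters
    SESSIONS[sid] = data
    return sid


def load_session(sid):
    """Return the stored data, or None for an unknown or altered id."""
    return SESSIONS.get(sid)


if __name__ == "__main__":
    sid = create_session({"cart": ["item-1"]})
    print(load_session(sid))
    # Changing even one character gives an id that matches no session.
    altered = ("0" if sid[0] != "0" else "1") + sid[1:]
    print(load_session(altered))
```

The id carries no session data at all, so whether it travels in the URL, a cookie or a POST, an altered value simply fails the lookup.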

nick garratt nick-lvs (at) wordwork (dot) co (dot) za 12 May 2004

This discussion happens every so often on the list and, as always, I feel the need to mention the msession session service which we have been using very reliably for years from Mohawk Software (http://www.mohawksoft.com/devel/msession.html). It's light-weight, fast and depending on the scripting language you use (we're using php4) it is very easy to implement.