Acl-operators are the other half of the acl system. For each
connection the appropriate acl-operators are checked (in the order
that they appear in the file). You have met the http_access
and icp_access operators before, but they aren't the only
Squid acl-operators. All acl-operator lines have the same format;
although the format below mentions http_access specifically,
the layout applies to all the other acl-operators too.
http_access allow|deny [!]aclname [& [!]aclname2 ... ]
Let's work through the fields from left to right. The first word is
http_access, the actual acl-operator.
The allow and deny words come next. If you want to deny access to a specific class of users, you can change the customary allow to deny in the acl-operator line. We have seen where a deny line is useful before, with the final deny of all IP ranges in previous examples.
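As a reminder of that pattern, here is a minimal sketch of an allow line followed by a final deny. The acl names myNet and all, and the 10.0.0.0/255.255.255.0 range, are assumptions chosen to match the other examples in this chapter:

```
acl myNet src 10.0.0.0/255.255.255.0
acl all src 0.0.0.0/0.0.0.0

# Allow the local range, then deny everyone else
http_access allow myNet
http_access deny all
```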
Let's say that you wanted to deny Internet access to a specific list
of IP addresses during the day. Since acls can only have one type per
acl, you could not create an acl line that matches an IP address
during specific times. By combining more than one acl per acl-operator
line, though, you get the same effect. Consider the following acls:
acl dialup src 10.0.0.0/255.255.255.0
acl work time 08:00-17:00
If you could create an acl-operator that was matched when both the
dialup and work acls were true, clients in the dialup range
could only connect during the right times. This is where the
aclname2 in the above acl-operator definition comes in. When you
specify more than one acl per acl-operator line, all of the acls have to be
matched for the acl-operator to be true. The acl-operator function
ANDs the results from each acl check together to see if it is to return
true or false.
You could thus deny the dialup range cache access during working hours with the following acl rules:
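A minimal sketch of such a rule set, using the dialup and work acls from the text. It assumes that the acl names on an acl-operator line are simply separated by whitespace, with the AND being implicit:

```
acl dialup src 10.0.0.0/255.255.255.0
acl work time 08:00-17:00

# Matched only when the client is in the dialup range AND
# the request arrives between 08:00 and 17:00
http_access deny dialup work
```

A request from the dialup range at 20:00 matches dialup but not work, so this line is not matched and checking moves on to the next http_access line.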
You can also invert an acl's result value by using an exclamation mark (the traditional NOT value from many programming languages) before the appropriate acl. In the following example I have reduced Example 6-4 into one http_access line, taking advantage of the implicit inversion of the last rule to deny access to all clients.
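The single-line form can be sketched as follows. This is a reconstruction from the description in the text rather than the original listing; the myNet range of 10.0.0.0/255.255.255.0 is an assumption based on the 10.0.0.0 addresses discussed here:

```
acl myNet src 10.0.0.0/255.255.255.0
acl all src 0.0.0.0/0.0.0.0

# Deny any client that matches 'all' and does NOT match 'myNet'
http_access deny all !myNet
```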
Since the above example is quite complicated, let's cover it in more detail:
In the above example an IP from the outside world will match the 'all'
acl, but not the 'myNet' acl; the IP will thus match the http_access line.
Consider the binary logic for a request coming in from the outside
world, where the IP is not defined in the myNet acl.
Deny http access if ((true) & (!false))
If you consider the relevant matching of an IP in the 10.0.0.0 range,
the myNet value is true, and the binary representation is as follows:
Deny http access if ((true) & (!true))
A 10.0.0.0 range IP will thus not match the only
http_access line in the squid config file. Remembering that
Squid will default to using the inverse of the last match in the file,
accesses will be allowed from the myNet IP range.
You have encountered only the http_access and icp_access acl-operators so far. Other acl-operators are:
no_cache
ident_lookup_access
miss_access
always_direct, never_direct
snmp_access (covered in the next section of this chapter)
delay_classes (covered in the next section of this chapter)
broken_posts
The no_cache acl-operator is used to ensure freshness of
objects in the cache. The default Squid config file includes an
example no_cache line that ejects the results of cgi programs
from the cache. If you want to ensure that cgi pages are not cached,
you must un-comment the following lines from squid.conf:
acl QUERY urlpath_regex cgi-bin \?
no_cache deny QUERY
The first line uses a regular expression match to find urls that have
cgi-bin or ? in the path (since we are using the
urlpath_regex acl type, a site with a name like
cgi-bin.qualica.com will not be matched). The no_cache
acl-operator is then used to eject matching objects from the cache.
Earlier we discussed using the ident protocol to control cache access. To reduce network overhead, Squid does an ident lookup only when it needs to. If you are using ident to do access control, Squid will do an ident lookup for every request, and you don't have to worry about the ident_lookup_access acl-operator.
Many administrators would like to log the ident value for connections without actually using it for access control. Squid used to have a simple on/off switch for ident lookups, but this incurred extra overhead for the cases where the ident lookup wasn't useful (where, for example, the connection is from a desktop PC).
Let's consider some examples. Assume that you have one Unix server (at IP address 10.0.0.3), and all remaining IPs in the 10.0.0.0/255.255.255.0 range are desktop PCs. You don't want to log the ident value from PCs, but you do want to record it when the connection is from the Unix machine. Here is an example acl set that does this:
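A minimal sketch of such an acl set. The acl names unixServer and all are hypothetical; the addresses come from the scenario above:

```
acl unixServer src 10.0.0.3/255.255.255.255
acl all src 0.0.0.0/0.0.0.0

# Do an ident lookup only for the Unix server
ident_lookup_access allow unixServer
# Skip the lookup for every other client (the desktop PCs)
ident_lookup_access deny all
```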
If a system cracker is attempting to attack your cache, it can be useful to have their ident value logged. The following example gets Squid not to do ident lookups for machines that are allowed access, but if a request comes from a disallowed IP range, an ident lookup is done and inserted into the log.
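A sketch of this arrangement, assuming the allowed clients are those in a myNet acl covering 10.0.0.0/255.255.255.0 (the acl names and the range are assumptions carried over from the earlier examples):

```
acl myNet src 10.0.0.0/255.255.255.0
acl all src 0.0.0.0/0.0.0.0

# No ident lookup for clients that are allowed to use the cache
ident_lookup_access deny myNet
# Everyone else gets an ident lookup, so the value reaches the log
ident_lookup_access allow all

http_access allow myNet
http_access deny all
```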
The ICP protocol is used by many caches to find out if objects are in another cache's on-disk store. If you are peering with other organisations' caches, you may wish them to treat you as a sibling, where they only get data that you already have stored on disk. If an unscrupulous cache-admin were to change their cache_peer line to read parent instead of sibling, they could get you to retrieve objects on their behalf.
To stop this from happening, you can create an acl that contains the peering caches, and use the miss_access acl-operator to ensure that only hits are served to these caches. In response to all other requests, an access-denied message is sent (so if a sibling complains that they almost always get error messages, it's likely that they think that you should be their parent, and you think that they should be treating you as a sibling).
When looking at the following example it is important to realise that http_access lines are checked before any miss_access lines. If the request is denied by the http_access lines, an error page is returned and the connection closed, so miss_access lines are never checked. This means that the last miss_access line in the example doesn't allow random IP ranges to access your cache; it only allows through ranges that have already passed the http_access test. This is simpler than having one miss_access line for each http_access line in the file, and it reduces CPU usage too, since only two acls are checked instead of six.
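A minimal sketch of such a configuration. The siblings acl name and its 10.1.0.0/255.255.255.0 range are hypothetical, as is the myNet range:

```
acl myNet src 10.0.0.0/255.255.255.0
acl siblings src 10.1.0.0/255.255.255.0
acl all src 0.0.0.0/0.0.0.0

# Checked first: only local clients and known siblings get any further
http_access allow myNet
http_access allow siblings
http_access deny all

# Checked only for requests that passed the http_access lines above
miss_access deny siblings
miss_access allow all
```

With this layout a sibling can fetch objects that are already in the cache, but a request that would be a miss is refused, while local clients are served both hits and misses.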
The always_direct and never_direct operators help you make controlled decisions about which servers to connect to directly, and which to connect to through a parent cache/proxy. I previously discussed this set of options briefly in Chapter Three, during the Basic Installation phase.
These tags are covered in detail in the following chapter, in the Peer Selection section.
Some servers incorrectly handle POST data, requiring an extra Carriage-Return (CR) and Line-Feed (LF) after a POST request. Since obeying the HTTP specification will make Squid incompatible with these servers, the broken_posts option lets Squid be non-compliant when talking to a specific set of servers. This option should be very rarely used. The url_regex acl type should be used for specifying the broken server.
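As a rough sketch, a broken_posts rule might look like the following. The acl name brokenServer and the host broken.example.com are purely hypothetical placeholders for the server that needs the extra CRLF:

```
# Match requests to the one server known to mishandle POST data
acl brokenServer url_regex ^http://broken\.example\.com/

# Send the extra CR and LF only when talking to that server
broken_posts allow brokenServer
```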