Transparent caches watch for TCP connections destined for port 80.
The cache server intercepts these packets, converts them to a
standard TCP stream and passes them to Squid. When Squid sends reply
data to the client, the operating system fakes the source address of
the packets, so that the client believes it is connected to the
server that it originally sent the request to.

You can't simply plug a transparent cache into the network and get it
to transparently cache pages. The cache server needs to be in a
position where it can fake the reply packets without the real server
interrupting the conversation and confusing things. The cache server
therefore needs to be the gateway to the outside world.

Let's look at the simplest transparent cache setup. The client machine
(10.0.0.50) treats the cache server's internal interface (10.0.0.1)
as its default gateway. This way, all packets arrive at the cache
server before they reach the rest of the Internet. The filter looks
for packets destined for port 80 and passes them to Squid, but lets
all other packets through to the routing layer, which forwards them
to the router's IP (172.31.0.2).

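The filter decision described above can be sketched as a small model. This is only an illustration of the logic, not how the kernel filter is actually configured: the Packet class, the classify function and the destination address 192.0.2.10 are hypothetical names chosen for the example, while the client and router addresses are the ones used in the text.

```python
from dataclasses import dataclass

HTTP_PORT = 80
SQUID = "squid"        # label meaning "hand the packet to the local Squid"
ROUTER = "172.31.0.2"  # the router's IP from the example setup


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified view of a packet arriving from a client."""
    src_ip: str
    dst_ip: str
    dst_port: int
    protocol: str = "tcp"


def classify(packet: Packet) -> str:
    """Return where the cache server sends a packet from the internal network.

    TCP packets destined for port 80 are diverted to Squid; everything
    else goes to the routing layer and on to the router.
    """
    if packet.protocol == "tcp" and packet.dst_port == HTTP_PORT:
        return SQUID
    return ROUTER


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address standing in for a remote server.
    web = Packet(src_ip="10.0.0.50", dst_ip="192.0.2.10", dst_port=80)
    ssh = Packet(src_ip="10.0.0.50", dst_ip="192.0.2.10", dst_port=22)
    print(classify(web))  # squid
    print(classify(ssh))  # 172.31.0.2
```

Note that the decision depends only on the protocol and destination port, not on the destination address, which is why the client needs no proxy settings.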
Once the connection is established, Squid needs to communicate with
the client. Squid doesn't do any strange packet assembly: that's left
to the transparency layer. When Squid sends reply data to the client,
the kernel automatically changes the packet's source address, so it
appears to the client that the cache server is just routing the
replies from the outside world. When Squid connects to the remote
server, however, the connection comes from the external interface of
the cache server (172.31.0.1 in the example). This is where IP-based
authentication breaks: the remote server sees the request coming from
the cache rather than from the client's real address (10.0.0.50).

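To make the two views of the conversation concrete, here is a minimal sketch that models which source address each side sees. It is a toy model under stated assumptions, not Squid or kernel code: the function names, the Connection type and the remote server address 192.0.2.10 are hypothetical, and ip_auth_allows stands in for whatever address check the remote server might perform.

```python
from typing import NamedTuple, Set

CLIENT = "10.0.0.50"           # client address from the example
CACHE_EXTERNAL = "172.31.0.1"  # cache server's external interface
ORIGIN = "192.0.2.10"          # hypothetical remote web server


class Connection(NamedTuple):
    src_ip: str
    dst_ip: str


def reply_seen_by_client(requested_server: str) -> Connection:
    """Reply packets as the client sees them.

    The kernel rewrites the source address, so the reply appears to
    come from the server the client originally asked for.
    """
    return Connection(src_ip=requested_server, dst_ip=CLIENT)


def request_seen_by_server(requested_server: str) -> Connection:
    """The request as the remote server sees it.

    Squid opens its own connection, so the source is the cache
    server's external interface, not the client.
    """
    return Connection(src_ip=CACHE_EXTERNAL, dst_ip=requested_server)


def ip_auth_allows(conn: Connection, allowed: Set[str]) -> bool:
    """Hypothetical address-based check performed by the remote server."""
    return conn.src_ip in allowed


if __name__ == "__main__":
    print(reply_seen_by_client(ORIGIN).src_ip)    # 192.0.2.10
    print(request_seen_by_server(ORIGIN).src_ip)  # 172.31.0.1
    # A server that only trusts the client's address rejects the cache.
    print(ip_auth_allows(request_seen_by_server(ORIGIN), {CLIENT}))  # False
```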
Effectively, we need to get four things right to get transparency
right: