Squid Anti-Ad Server blocker "how to"
The Squid proxy server (Squid Web Proxy Cache) can read a list of host name patterns from a text file and deny client requests to any host that matches. This makes it well suited to blocking ad servers for your internal clients. Your clients will not be bothered with ads, you will save bandwidth, and you won't have to worry as much about the user who clicks on every shiny object put in front of them.
Getting Started
The following three lines need to be added to your squid.conf file. Squid checks http_access rules in order and stops at the first one that matches, so place the deny line above any "http_access allow" line that would otherwise let the request through. We assume your squid.conf file is in /etc/squid/ and that you will put your list of ad servers, called ad_block.txt, in the same directory.
The first line below is a comment reminding you where the list comes from. The second line defines an ACL named "ads" of type dstdom_regex, which matches the destination host name of each request against the regular expressions in "/etc/squid/ad_block.txt". Squid reads that file when the daemon loads or when you reconfigure it with "squid -k reconfigure". The last line instructs squid to deny clients access to any host that matches the ACL.
## disable ads ( http://pgl.yoyo.org/adservers/ )
acl ads dstdom_regex "/etc/squid/ad_block.txt"
http_access deny ads
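Because squid stops at the first http_access rule that matches, it can help to check where the deny line ended up. The following is a rough sketch, not part of the original setup: the function name deny_ads_precedes_allow is made up for this example, and "before the first allow rule" is a deliberately conservative test rather than squid's exact logic.

```shell
#!/bin/sh
# Hypothetical helper (not part of squid): report where the
# "http_access deny ads" line sits relative to the first
# "http_access allow" rule in a squid.conf-style file.
#   ok       -> the deny line comes first
#   too-late -> an allow rule appears before the deny line
#   missing  -> neither kind of line was found
deny_ads_precedes_allow() {
    awk '
        $1 == "http_access" && $2 == "deny" && $3 == "ads" { print "ok"; found = 1; exit }
        $1 == "http_access" && $2 == "allow"               { print "too-late"; found = 1; exit }
        END { if (!found) print "missing" }
    ' "$1"
}
```

Call it as "deny_ads_precedes_allow /etc/squid/squid.conf". An answer of "too-late" does not always mean ads get through, since the earlier allow rule may not match your clients, but it is a hint to look at the rule order.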
Fetching the list of ad servers
Now we need to fetch the list of ad servers and format the downloaded file into a list squid can recognize. The following script does this for us. It first uses wget to download the ad server list from pgl.yoyo.org and save it to /tmp/temp_ad_file. grep then keeps only the lines containing the string "(^|", which drops the HTML around the list, and the output is saved to /etc/squid/ad_block.txt. Lastly, squid is reconfigured so the new ad server list is loaded, and the temporary file is deleted from /tmp.
#!/bin/sh
#### Calomel.org ad_servers_newlist.sh
## get new ad server list
## (the URL is quoted so the shell does not act on the "?" and ";" in it)
/usr/local/bin/wget -O /tmp/temp_ad_file \
  "http://pgl.yoyo.org/adservers/serverlist.php?hostformat=squid-dstdom-regex;showintro=0"
## keep only the regex lines, dropping the html around the list
grep -F '(^|' /tmp/temp_ad_file > /etc/squid/ad_block.txt
## refresh squid
/usr/local/sbin/squid -k reconfigure
## rm temp file
rm -f /tmp/temp_ad_file
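One weakness of writing straight into ad_block.txt is that a failed download leaves you with an empty list. As a sketch of one way around that, the function below filters the download into a scratch file first and only replaces the real list when at least one pattern line was found. The name filter_ad_list and the 644 file mode are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: filter_ad_list SRC DEST
# Keep only the lines of SRC containing the literal string "(^|" and
# replace DEST with them, but only if at least one such line exists.
filter_ad_list() {
    src=$1
    dest=$2
    # Scratch file next to DEST so the final mv is a rename on one filesystem.
    tmp=$(mktemp "${dest}.XXXXXX") || return 1
    if grep -F '(^|' "$src" > "$tmp"; then
        chmod 644 "$tmp"    # assumed mode; mktemp creates the file as 600
        mv "$tmp" "$dest"
    else
        rm -f "$tmp"
        echo "no ad server patterns found in $src; keeping $dest" >&2
        return 1
    fi
}
```

In the script it would stand in for the grep line, for example "filter_ad_list /tmp/temp_ad_file /etc/squid/ad_block.txt && /usr/local/sbin/squid -k reconfigure", so squid is only reconfigured when a fresh list was actually installed.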
Automating with cron
Lastly, you may want to set up a cron job to fetch the latest list every few days. The cron job simply calls the ad_servers_newlist.sh script. The site the list comes from (pgl.yoyo.org) updates it every few days on average, so a regular cron job keeps your copy current. Below is a cron job line that fetches the ad server list at 5:35am (0535) on every third day of the week. Because the "*/3" sits in the day-of-the-week field, that means Sunday, Wednesday and Saturday.
#minute (0-59)
#| hour (0-23)
#| | day of the month (1-31)
#| | | month of the year (1-12 or Jan-Dec)
#| | | | day of the week (0-6 with 0=Sun or Sun-Sat)
#| | | | | commands
#| | | | | |
#### refresh squid's anti-ad server list
35 5 * * */3 /scripts_dir/ad_servers_newlist.sh >> /dev/null 2>&1
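To see which days a step value like "*/3" actually selects, you can expand it by hand. The small function below is only an illustration of how cron steps through a field's range. The name expand_step is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: expand_step FIRST LAST STEP
# Print the values a cron "*/STEP" selects in a field whose
# range runs from FIRST to LAST.
expand_step() {
    i=$1
    out=
    while [ "$i" -le "$2" ]; do
        out="${out:+$out }$i"
        i=$((i + $3))
    done
    echo "$out"
}

# Day-of-the-week field (0-6): Sunday, Wednesday, Saturday.
expand_step 0 6 3     # prints: 0 3 6
```

The same expansion for the day-of-the-month field, "expand_step 1 31 3", starts at 1 and ends at 31, so putting "*/3" in that field instead would also restart the count on the 1st of each month.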
Questions?
Where an ad was supposed to be, it now says "error". What's wrong?
Nothing. What you are seeing is the squid error page. When the browser requests a host that matches a pattern in the ad_block.txt file, squid serves an error page saying that access to it has been denied. If the ad box were large enough, you would be able to see the entire squid error page in the ad space.
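If you want to confirm that a particular host is on the list, you can test it against ad_block.txt from the shell. This is a sketch that uses grep's extended regular expressions as a stand-in for squid's own matching, so treat the answer as a close approximation. The function name host_is_blocked and the host ads.example.com are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: host_is_blocked HOST LIST
# Succeeds (exit status 0) when HOST matches any pattern in LIST,
# a file with one regular expression per line such as ad_block.txt.
host_is_blocked() {
    printf '%s\n' "$1" | grep -E -q -f "$2"
}
```

For example, "host_is_blocked ads.example.com /etc/squid/ad_block.txt && echo blocked" prints "blocked" only when one of the patterns in the list matches that host name.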