www.fabiankeil.de/sourcecode/pft/

Privoxy-Filter-Test

Privoxy-Filter-Test makes creating and testing Privoxy filters easier.

Usually you have to edit Privoxy's filter file by hand and reload the whole page after each change. You also have to make sure that the browser actually reloads the page, instead of revalidating the page and reusing a cached copy. If your filter didn't work as expected, you have to spend time figuring out which changes were made.

With Privoxy-Filter-Test you still have to write the filter yourself, but you can rest assured that the filtered page isn't cached, and you don't have to guess which changes your filter caused or whether it matched at all.

How Privoxy-Filter-Test works

Privoxy-Filter-Test is controlled through a web interface. You either fill in the filter and the address of the document to filter, or a short sample text to filter. You can also load filters directly from Privoxy's own filter files or simply tell Privoxy-Filter-Test to apply all the filters Privoxy would use as well.

Privoxy-Filter-Test will first save the document locally and then request it again from itself with Privoxy's filters enabled. If there are differences between the original file and the filtered version, they are shown in a way similar to the output of diff -u.
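The comparison step can be illustrated with a minimal Python sketch. It is not Privoxy-Filter-Test's own code; it only assumes that both versions of the document are available as strings. The function name filter_diff and the labels "original" and "filtered" are made up for this example.

```python
import difflib


def filter_diff(original: str, filtered: str) -> str:
    """Return a unified diff of the two versions, or "" if they are identical."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        filtered.splitlines(keepends=True),
        fromfile="original",   # hypothetical label for the unfiltered copy
        tofile="filtered",     # hypothetical label for the filtered copy
    )
    return "".join(diff)


if __name__ == "__main__":
    before = "<p>Hello</p>\n<p>ad banner</p>\n"
    after = "<p>Hello</p>\n<p>[removed]</p>\n"
    # An empty diff means the filter did not match at all.
    print(filter_diff(before, after) or "The filter did not change anything.")
```

An empty result is as informative as a diff: it tells you the filter never matched.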

What Privoxy-Filter-Test needs

Privoxy-Filter-Test requires the Perl modules:

which are available through FreeBSD's ports collection, or CPAN.

Additionally, a local Privoxy installation is required. By default Privoxy-Filter-Test uses Privoxy for all requests. To make sure the original file is fetched unmodified, you need a Privoxy version above 3.0.3.

Privoxy configuration

Of course Privoxy needs to be aware of the additional files. The important parts of my config file look like this:

actionsfile standard.action                    # Internal purpose, recommended
actionsfile default.action                     # Main actions file
actionsfile user-agent.action                  # Random Firefox User-Agent
actionsfile fk.action                          # My own customizations
actionsfile test.action                        # internal tests that may end up destroying the file
actionsfile regression-tests.action            # Vanilla regression tests
actionsfile regression-tests-requests.action   # Regression tests requiring client requests
actionsfile privoxy-filter-test.action         # Privoxy-Filter-Test's actionfile
filterfile default.filter                      # Privoxy's official filters
filterfile fk.filter                           # My own filters
filterfile privoxy-filter-test.filter          # Privoxy-Filter-Test's filters
forward-socks4a   /              tor-jail:9050 .
forward           10.0.0.1       .

Add the privoxy-filter-test.action and privoxy-filter-test.filter lines to your Privoxy config file. The last forward line is only needed if you use Privoxy with Tor or another proxy.

privoxy-filter-test.filter will be created by Privoxy-Filter-Test; you just have to make sure it has sufficient permissions to do so.

You have to create privoxy-filter-test.action yourself. The required settings are:

# Enable filtering for the delivery URL
{+filter{privoxy-filter-test} \
 +force-text-mode \
}
localhost/\?deliver
127.0.0.1/\?deliver

# Use a direct connection to the web interface
{+forward-override{forward .}}
localhost/
127.0.0.1/

However, you probably want to edit the content with Privoxy's web interface and disable all the other filters. Otherwise you can't be sure whether changes were made by the filter you created with Privoxy-Filter-Test or by one of the other filters.

Usage

To get to the web interface, start Privoxy-Filter-Test and point your browser to its listening address:

fk@TP51 ~ $ pft.pl --working-dir /home/fk/privoxy/pft --privoxy-dir /home/fk/privoxy \
    --privoxy-config-file /home/fk/privoxy/config --web-interface-ip 10.0.0.1 \
    --web-interface-port 80 --local-server "10.0.0.1/?deliver=" --privoxy 10.0.0.1:8118
Privoxy-Filter-Test 0.6 is awaiting your input at: 10.0.0.1:80/

If you don't use any parameters, default values are used, but as I run Privoxy inside a FreeBSD jail with the configuration directory nullfs-mounted from my home directory, the defaults don't work for me.

Example

This is Privoxy-Filter-Test 0.6 in action. It may not be pretty, but it works (don't mind the multiple vertical scroll bars; the image is a collage created from multiple screenshots).

[Screenshot: Privoxy-Filter-Test's web interface viewed with Firefox, filtering www.freebsd.org]

Filter to apply means just the substitution commands, no leading FILTER: line. Document to fetch means the document's address on the web.
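Because only the substitution commands are entered, a FILTER: header line has to be put in front of them before Privoxy can load the result from privoxy-filter-test.filter. The following Python sketch shows one way this could be done. It is an assumption about the general approach, not the tool's actual code, and the function name build_filter_file and the description text "Filter under test" are invented for the example.

```python
def build_filter_file(commands: str, name: str = "privoxy-filter-test") -> str:
    """Prefix bare substitution commands with a FILTER: header line."""
    # Drop empty lines, but keep each command exactly as it was entered.
    jobs = [line for line in commands.splitlines() if line.strip()]
    # The description after the filter name is a hypothetical placeholder.
    header = f"FILTER: {name} Filter under test"
    return "\n".join([header, *jobs]) + "\n"


if __name__ == "__main__":
    # Two example substitution commands in Privoxy's sed-like syntax.
    print(build_filter_file("s@<blink>@<b>@ig\ns@</blink>@</b>@ig"), end="")
```

The filter name matches the one referenced by +filter{privoxy-filter-test} in the action file, which is what ties the generated file to the delivery URL.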

The Fetch (again) button causes Privoxy-Filter-Test to download a fresh copy, whereas Use local copy reuses a local copy. You hit Fetch (again) for the first attempt, and Use local copy for the following retries, until the filter fits your needs (or you give up).

Behind the scenes

From Privoxy's perspective it looks like this.

The initial request arrives:

16:59:10 28415340 Header: POST http://127.0.0.1/ HTTP/1.1
16:59:10 28415340 Header: Tagger 'client-ip-address' added tag 'IP-ADDRESS: 10.0.0.1'. No action bits update necessary.
16:59:10 28415340 Header: Tagger 'http-method' added tag 'POST'. Action bits updated accordingly.
16:59:10 28415340 Header: Tagger 'allow-post' added tag 'ALLOWED-POST'. Action bits updated accordingly.
16:59:10 28415340 Header: scan: Host: 127.0.0.1
16:59:10 28415340 Header: scan: User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.9.0.3) Gecko/2008100521 Firefox/3.0.3
16:59:10 28415340 Header: Tagger 'user-agent' added tag 'User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.9.0.3) Gecko/2008100521 Firefox/3.0.3'. No action bits update necessary.
16:59:10 28415340 Header: scan: Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
16:59:10 28415340 Header: scan: Accept-Language: en-EN
16:59:10 28415340 Header: scan: Accept-Encoding: gzip,deflate
16:59:10 28415340 Header: scan: Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
16:59:10 28415340 Header: scan: Connection: close
16:59:10 28415340 Header: scan: Proxy-Connection: close
16:59:10 28415340 Header: scan: Referer: http://127.0.0.1/
16:59:10 28415340 Header: Tagger 'referer' added tag 'Referer: http://127.0.0.1/'. No action bits update necessary.
16:59:10 28415340 Header: scan: Content-Type: multipart/form-data; boundary=---------------------------18000921343988772021632671727
16:59:10 28415340 Header: scan: Content-Length: 1301
16:59:10 28415340 Header: Modified: User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; sk-SK; rv:1.9.0.1) Gecko/20080817 Firefox/3.0.1
16:59:10 28415340 Header: Replaced: 'Connection: close' with 'Connection: keep-alive'
16:59:10 28415340 Header: crumble crunched: Proxy-Connection: close!
16:59:10 28415340 Header: Accept-Language header crunched and replaced with: Accept-Language: sk-sk
16:59:10 28415340 Header: New HTTP Request-Line: POST / HTTP/1.1
16:59:10 28415340 Redirect: Decoding / if necessary.
16:59:10 28415340 Redirect: Checking / for redirects.
16:59:10 28415340 Request: 127.0.0.1/
16:59:10 28415340 Connect: to 127.0.0.1
16:59:10 28415340 Connect: No reusable socket for 127.0.0.1:80 found. Opening a new one.

Privoxy-Filter-Test looks up which filters would apply to the URL:

16:59:10 28415700 Header: GET http://config.privoxy.org/show-url-info?url=http://www.freebsd.org/ HTTP/1.0
16:59:10 28415700 Header: Tagger 'client-ip-address' added tag 'IP-ADDRESS: 10.0.0.1'. No action bits update necessary.
16:59:10 28415700 Header: Tagger 'http-method' added tag 'GET'. No action bits update necessary.
16:59:10 28415700 Header: scan: User-Agent: Privoxy-Filter-Test 0.6
16:59:10 28415700 Header: Tagger 'user-agent' added tag 'User-Agent: Privoxy-Filter-Test 0.6'. No action bits update necessary.
16:59:10 28415700 Header: scan: Connection: close
16:59:10 28415700 Header: scan: Accept: */*
16:59:10 28415700 Header: scan: Host: config.privoxy.org
16:59:10 28415700 Header: Modified: User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; sk-SK; rv:1.9.0.1) Gecko/20080817 Firefox/3.0.1
16:59:10 28415700 Header: Replaced: 'Connection: close' with 'Connection: keep-alive'
16:59:10 28415700 Header: New HTTP Request-Line: GET http://config.privoxy.org/show-url-info?url=http://www.freebsd.org/ HTTP/1.0
16:59:10 28415700 Connect: Overriding forwarding settings based on 'forward-socks5 10.0.0.2:2222 .'
16:59:10 28415700 Request: config.privoxy.org/show-url-info?url=http://www.freebsd.org/ (CGI Call)
10.0.0.1 - - [09/Nov/2008:16:59:10 +0100] "GET http://config.privoxy.org/show-url-info?url=http://www.freebsd.org/ HTTP/1.0" 200 21889

And then requests the URL while asking Privoxy not to apply any filters:

16:59:10 28415e80 Header: GET http://www.freebsd.org/ HTTP/1.0
16:59:10 28415e80 Header: Tagger 'client-ip-address' added tag 'IP-ADDRESS: 10.0.0.1'. No action bits update necessary.
16:59:10 28415e80 Header: Tagger 'http-method' added tag 'GET'. No action bits update necessary.
16:59:10 28415e80 Header: scan: User-Agent: Privoxy-Filter-Test 0.6
16:59:10 28415e80 Header: Tagger 'user-agent' added tag 'User-Agent: Privoxy-Filter-Test 0.6'. No action bits update necessary.
16:59:10 28415e80 Header: scan: Connection: close
16:59:10 28415e80 Header: scan: Accept: */*
16:59:10 28415e80 Header: scan: X-Filter: No
16:59:10 28415e80 Header: scan: Cache-Control: no-cache
16:59:10 28415e80 Header: scan: Host: www.freebsd.org
16:59:10 28415e80 Header: Modified: User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; sk-SK; rv:1.9.0.1) Gecko/20080817 Firefox/3.0.1
16:59:10 28415e80 Header: Replaced: 'Connection: close' with 'Connection: keep-alive'
16:59:10 28415e80 Header: Accepted the client's request to fetch without filtering.
16:59:10 28415e80 Header: Crunching X-Filter: No
16:59:10 28415e80 Connect: Overriding forwarding settings based on 'forward-socks5 10.0.0.2:2222 .'
16:59:10 28415e80 Header: New HTTP Request-Line: GET / HTTP/1.0
16:59:10 28415e80 Redirect: Decoding / if necessary.
16:59:10 28415e80 Redirect: Checking / for redirects.
16:59:10 28415e80 Request: www.freebsd.org/
16:59:10 28415e80 Connect: to www.freebsd.org
16:59:10 28415e80 Connect: No reusable socket for www.freebsd.org:80 found. Opening a new one.
16:59:11 28415e80 Header: scan: HTTP/1.0 200 OK
16:59:11 28415e80 Header: scan: Connection: keep-alive
16:59:11 28415e80 Header: scan: Content-Type: text/html
16:59:11 28415e80 Header: Tagger 'content-type' added tag 'text/html'. No action bits update necessary.
16:59:11 28415e80 Header: scan: Accept-Ranges: bytes
16:59:11 28415e80 Header: scan: ETag: "420795487"
16:59:11 28415e80 Header: scan: Last-Modified: Tue, 04 Nov 2008 21:55:10 GMT
16:59:11 28415e80 Header: scan: Content-Length: 19201
16:59:11 28415e80 Header: scan: Date: Sun, 09 Nov 2008 15:59:11 GMT
16:59:11 28415e80 Header: scan: Server: httpd/1.4.x LaHonda
16:59:11 28415e80 Header: Replaced: 'Connection: keep-alive' with 'Connection: close'
16:59:11 28415e80 Header: Randomizing: Last-Modified: Tue, 04 Nov 2008 21:55:10 GMT
16:59:11 28415e80 Header: Randomized: Last-Modified: Sat, 08 Nov 2008 14:32:57 GMT (added 3 days 16 hours 37 minutes 47 seconds)
16:59:11 28415e80 Connect: Done reading from server. Expected content length: 19201. Actual content length: 19201. Most recently received: 377.
10.0.0.1 - - [09/Nov/2008:16:59:11 +0100] "GET http://www.freebsd.org/ HTTP/1.0" 200 19201
16:59:11 28415e80 Connect: Overriding forwarding settings based on 'forward-socks5 10.0.0.2:2222 .'
16:59:11 28415e80 Connect: Remembering socket 6 for www.freebsd.org:80 in slot 0.
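The request in the log above carries two headers worth noting: X-Filter: No asks Privoxy to skip filtering, and Cache-Control: no-cache asks for a fresh copy. A rough Python equivalent is sketched below. The header values and the proxy address 10.0.0.1:8118 are taken from the log and the usage example; the function names are made up, and whether Privoxy honours X-Filter depends on its configuration (to my knowledge the enable-remote-http-toggle option).

```python
import urllib.request


def build_unfiltered_request(url: str) -> urllib.request.Request:
    """Build a request that asks Privoxy to deliver the document unfiltered."""
    return urllib.request.Request(url, headers={
        "User-Agent": "Privoxy-Filter-Test 0.6",
        "X-Filter": "No",             # ask Privoxy not to apply filters
        "Cache-Control": "no-cache",  # ask for a fresh copy
    })


def fetch_unfiltered(url: str, proxy: str = "10.0.0.1:8118") -> bytes:
    """Fetch url through the given Privoxy instance (address is an assumption)."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": f"http://{proxy}"})
    )
    with opener.open(build_unfiltered_request(url), timeout=30) as response:
        return response.read()
```

Calling fetch_unfiltered("http://www.freebsd.org/") would need a running Privoxy at the given address, so only the request construction can be checked offline.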

After saving the file, Privoxy-Filter-Test requests it again, but this time from itself:

16:59:11 28415480 Header: Destination extracted from "Host:" header. New request URL: http://10.0.0.1/?deliver=original-5428697-file-fetched.html
16:59:11 28415480 Header: GET /?deliver=original-5428697-file-fetched.html HTTP/1.0
16:59:11 28415480 Header: Tagger 'client-ip-address' added tag 'IP-ADDRESS: 10.0.0.1'. No action bits update necessary.
16:59:11 28415480 Header: Tagger 'http-method' added tag 'GET'. No action bits update necessary.
16:59:11 28415480 Header: scan: User-Agent: Privoxy-Filter-Test 0.6
16:59:11 28415480 Header: Tagger 'user-agent' added tag 'User-Agent: Privoxy-Filter-Test 0.6'. No action bits update necessary.
16:59:11 28415480 Header: scan: Connection: close
16:59:11 28415480 Header: scan: Accept: */*
16:59:11 28415480 Header: scan: Host: 10.0.0.1
16:59:11 28415480 Header: Replaced: 'Connection: close' with 'Connection: keep-alive'
16:59:11 28415480 Connect: Overriding forwarding settings based on 'forward .'
16:59:11 28415480 Header: New HTTP Request-Line: GET /?deliver=original-5428697-file-fetched.html HTTP/1.0
16:59:11 28415480 Redirect: Decoding /?deliver=original-5428697-file-fetched.html if necessary.
16:59:11 28415480 Redirect: Checking /?deliver=original-5428697-file-fetched.html for redirects.
16:59:11 28415480 Request: 10.0.0.1/?deliver=original-5428697-file-fetched.html
16:59:11 28415480 Connect: to 10.0.0.1
16:59:11 28415480 Connect: No reusable socket for 10.0.0.1:80 found. Opening a new one.
16:59:11 28415480 Header: scan: HTTP/1.1 200 Come and get some
16:59:11 28415480 Header: scan: Content-Encoding: utf-8
16:59:11 28415480 Header: scan: Content-Type: text/plain
16:59:11 28415480 Header: Tagger 'content-type' added tag 'text/plain'. No action bits update necessary.
16:59:11 28415480 Header: scan: Connection: Close
16:59:11 28415480 Header: Text mode enabled by force. Take cover!

The content of all the matching filters has been copied into the privoxy-filter-test filter and is now applied:

16:59:12 28415480 Re-Filter: filtering 10.0.0.1/?deliver=original-5428697-file-fetched.html (size 19201) with 'privoxy-filter-test' produced 2 hits (new size 21031) (+1830)
10.0.0.1 - - [09/Nov/2008:16:59:12 +0100] "GET /?deliver=original-5428697-file-fetched.html HTTP/1.0" 200 21031
16:59:12 28415480 Connect: Socket 7 already forgotten or never remembered.
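The ?deliver= requests in the log are answered by Privoxy-Filter-Test itself, which hands back the copy it saved earlier. How it maps the parameter to a file is not shown on this page. The Python sketch below is therefore only an illustration of the idea, with an invented function name and the assumption that saved files live directly in the working directory.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def resolve_deliver(request_target: str, working_dir: str) -> Optional[Path]:
    """Map '/?deliver=NAME' to a file inside working_dir, or None if invalid."""
    names = parse_qs(urlsplit(request_target).query).get("deliver", [])
    if len(names) != 1:
        return None
    name = names[0]
    # Accept plain file names only, so a request cannot escape working_dir.
    if name in (".", "..") or "/" in name or "\\" in name:
        return None
    return Path(working_dir) / name


if __name__ == "__main__":
    print(resolve_deliver("/?deliver=original-5428697-file-fetched.html",
                          "/home/fk/privoxy/pft"))
```

Rejecting anything that is not a plain file name is a design choice made for this sketch: the delivery URL is reachable through the proxy, so it should not expose files outside the working directory.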

The filtered file is saved as well, the difference to the unfiltered version is computed, and finally the initial request can be answered:

16:59:12 28415340 Header: scan: HTTP/1.1 200 Come and get some
16:59:12 28415340 Header: scan: Server: Privoxy-Filter-Test 0.6
16:59:12 28415340 Header: scan: Content-Encoding: utf-8
16:59:12 28415340 Header: scan: Content-Type: text/html
16:59:12 28415340 Header: Tagger 'content-type' added tag 'text/html'. No action bits update necessary.
16:59:12 28415340 Header: scan: X-Documentation: You wish
16:59:12 28415340 Header: scan: Connection: Close
16:59:12 28415340 Header: Text mode is already enabled.
10.0.0.1 - - [09/Nov/2008:16:59:12 +0100] "POST http://127.0.0.1/ HTTP/1.1" 200 34802
16:59:12 28415340 Connect: Socket 4 already forgotten or never remembered.

On the fly filter testing

[Screenshot: pft's web interface viewed with Firefox, filtering supplied text]

In this screenshot (Privoxy-Filter-Test version 0.2) the Use this text button was used. It comes in handy if you want to quickly test some filters without fetching a whole document first.

The Reload defaults button resets the form to Privoxy-Filter-Test's defaults; the Reset button uses the values from the last request instead.

Download

At the moment pft version 0.6 works if the remote server doesn't misbehave and you don't put in invalid data.

The error handling and the code itself have room for improvement, but as it already [wf]orks for me, it could take a while.