Tuning squid and videocache for youtube
How youtube cache (at googlevideo) works
Add the following extensions to your FireFox:
Live HTTP Headers, https://addons.mozilla.org/firefox/addon/3829
CacheViewer, https://addons.mozilla.org/firefox/addon/2489
Ensure that your FireFox doesn't use your (squid) proxy server.
Open Live HTTP Headers, in order to capture the headers.
Testing will be easier if you clear your FireFox cache first.
Watch any video at www.youtube.com. It is better to choose a small video, so you will spend less time testing.
With Live HTTP Headers you will see that when you fetch a video at youtube you receive a 303 HTTP response redirecting you to a new URL.
    HTTP/1.x 303 See Other
    Date: Sat, 28 Feb 2009 18:29:20 GMT
    Server: Apache
    X-Content-Type-Options: nosniff
    Expires: Tue, 27 Apr 1971 19:44:06 EST
    X-YouTube-MID: ZARg7-aAGvgIhj98y1Y9zFBt-qQ8BYvKA_FElHA1RfN3uICq-TYJCQ
    Cache-Control: no-cache
    Location: http://v2.cache.googlevideo.com/videoplayback?id=986fef5cdbeb7e25&itag=34&ip=00.00.000.000&signature=1BAD9AFC21CC55D09A6DE0F3526E78B7889D945F.4A46A515C1A1647A1D48F44220C0866F88B30E13&sver=2&expire=1235867360&key=yt1&ipbits=0
    Keep-Alive: timeout=300
    Connection: Keep-Alive
    Transfer-Encoding: chunked
    Content-Type: text/html; charset=utf-8
    ----------------------------------------------------------
    http://v2.cache.googlevideo.com/videoplayback?id=986fef5cdbeb7e25&itag=34&ip=00.00.000.000&signature=1BAD9AFC21CC55D09A6DE0F3526E78B7889D945F.4A46A515C1A1647A1D48F44220C0866F88B30E13&sver=2&expire=1235867360&key=yt1&ipbits=0

    GET /videoplayback?id=986fef5cdbeb7e25&itag=34&ip=00.00.000.000&signature=1BAD9AFC21CC55D09A6DE0F3526E78B7889D945F.4A46A515C1A1647A1D48F44220C0866F88B30E13&sver=2&expire=1235867360&key=yt1&ipbits=0 HTTP/1.1
    Host: v2.cache.googlevideo.com
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; ca; rv:1.9.0.6) Gecko/2009011913 Firefox/3.0.6
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: es-us
    Accept-Encoding: gzip,deflate
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
    Keep-Alive: 300
    Connection: keep-alive

    HTTP/1.x 200 OK
    content-disposition: attachment; filename=video.flv
    Last-Modified: Sat, 27 Dec 2008 15:27:28 GMT
    Content-Type: video/x-flv
    Content-Length: 2179290
    Expires: Sat, 28 Feb 2009 19:29:20 GMT
    Cache-Control: public,max-age=3600
    Connection: close
    Date: Sat, 28 Feb 2009 18:29:20 GMT
    Server: gvs 1.0
This new URL:
It is located at googlevideo.com or at an IP owned by youtube.com or google.com.
It has an id for the cached video (at googlevideo.com). This id is not the 'static' id for the video at www.youtube.com. This 'temporary' id is 16 characters long. The 'static' id is 11 characters long.
It has your public IP (changed to 00.00.000.000 in the above example).
It has a signature, different each time you fetch the video.
The content expires after one hour.
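The parts of the redirect URL listed above can be picked apart with a few lines of Python. This is only a sketch for inspecting the captured Location header; the function name describe_redirect is mine, and reading the expire parameter as a Unix timestamp is an assumption, not something the capture confirms.

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit, parse_qs

# Location header copied from the capture above.
LOCATION = (
    "http://v2.cache.googlevideo.com/videoplayback?id=986fef5cdbeb7e25&itag=34"
    "&ip=00.00.000.000"
    "&signature=1BAD9AFC21CC55D09A6DE0F3526E78B7889D945F."
    "4A46A515C1A1647A1D48F44220C0866F88B30E13"
    "&sver=2&expire=1235867360&key=yt1&ipbits=0"
)


def describe_redirect(url):
    """Return the host and the query parameters of a redirect URL."""
    parts = urlsplit(url)
    # parse_qs returns a list per key; every key appears once in this URL.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return parts.hostname, params


host, params = describe_redirect(LOCATION)
# Assumption: 'expire' is a Unix timestamp (seconds since 1970, UTC).
expire_at = datetime.fromtimestamp(int(params["expire"]), tz=timezone.utc)

print("host      :", host)
print("temp id   :", params["id"], "(%d characters)" % len(params["id"]))
print("client ip :", params["ip"])
print("expire    :", expire_at.isoformat())
```

Under that assumption the expire parameter falls six hours after the Date header of the 303 response. The one-hour lifetime mentioned above comes from the Expires and Cache-Control headers of the 200 response, not from this parameter.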
You can see the content cached by FireFox using CacheViewer. If you sort your cache by MIME Type and look for video/x-flv you will find your video. You can even save the object as a flv file!
If you refresh the video page with FireFox you will see (with Live HTTP Headers) that you are redirected to a new URL each time. And with CacheViewer you will see one cached video for each refresh!
Yes, www.youtube.com eats your bandwidth and your hard disk! Each time you ask for a video you get a new download! Of course you can also see this in the FlashPlayer downloading/playing bar, but I wanted to be sure by looking at the HTTP dialog. The only way to avoid downloading the video again is to use the Replay feature of FlashPlayer.
If FireFox caches the video many times, does squid too?
If the youtube cache at googlevideo uses a unique URL for each fetch of a video and the content expires after one hour, it doesn't make any sense to cache it with squid.
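A toy model may make the point clearer. The sketch below assumes a cache that is keyed on the full URL, which is how a plain proxy cache behaves unless it rewrites URLs. The first URL uses the signature from the capture above; the second signature is made up and only stands in for what a page refresh would return.

```python
BASE = "http://v2.cache.googlevideo.com/videoplayback?id=986fef5cdbeb7e25&itag=34"

# Signature taken from the capture above.
first = BASE + ("&signature=1BAD9AFC21CC55D09A6DE0F3526E78B7889D945F."
                "4A46A515C1A1647A1D48F44220C0866F88B30E13")
# Hypothetical signature, invented for this example only.
second = BASE + ("&signature=0000000000000000000000000000000000000000."
                 "0000000000000000000000000000000000000000")

cache = {}  # toy cache: full URL -> stored object


def fetch(url):
    """Return 'HIT' if this exact URL was stored before, else store it and return 'MISS'."""
    if url in cache:
        return "HIT"
    cache[url] = "video bytes"
    return "MISS"


# Two page views of the same video, then a replay of the first URL.
results = [fetch(first), fetch(second), fetch(first)]
print(results)
print("objects stored:", len(cache))
```

Only a request for exactly the same URL hits the cache. Since each page view gets a different URL, every view is a miss and adds one more copy of the video to the cache.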
There are a lot of web pages about squid explaining how to cache these contents.
But I recommend that you simply make sure squid isn't caching these contents. Choose one of these two solutions for your squid.conf (please see the NOTE at wiki.squid-cache.org/ConfigExamples/DynamicContent):
Solution 1:

    acl QUERY urlpath_regex cgi-bin \?
    cache deny QUERY

Solution 2:

    acl GOOGLEVIDEO urlpath_regex /videoplayback\?id= /get_video\?origin=
    cache deny GOOGLEVIDEO
    acl YOUTUBE urlpath_regex /get_video\?video_id=
    cache deny YOUTUBE
    refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
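Before restarting squid, it can help to check which URLs the GOOGLEVIDEO and YOUTUBE patterns would catch. The sketch below is only an approximation: it assumes that urlpath_regex is applied to the URL path including the query string, and it uses Python's re module in place of squid's regex library. The function name is_denied and the sample URLs marked hypothetical are mine.

```python
import re
from urllib.parse import urlsplit

# Patterns copied from the GOOGLEVIDEO and YOUTUBE acl lines.
DENY_PATTERNS = [
    r"/videoplayback\?id=",
    r"/get_video\?origin=",
    r"/get_video\?video_id=",
]


def is_denied(url):
    """True if the URL path (with its query string) matches a deny pattern."""
    parts = urlsplit(url)
    urlpath = parts.path + ("?" + parts.query if parts.query else "")
    return any(re.search(pattern, urlpath) for pattern in DENY_PATTERNS)


samples = [
    "http://v2.cache.googlevideo.com/videoplayback?id=986fef5cdbeb7e25&itag=34",
    "http://tc.cache.googlevideo.com/get_video?origin=example",  # hypothetical query
    "http://www.youtube.com/get_video?video_id=xxxxxxxxxxx",     # hypothetical id
    "http://www.youtube.com/watch?v=xxxxxxxxxxx",                # hypothetical id
]
for url in samples:
    print("deny " if is_denied(url) else "allow", url)
```

The watch page itself is not matched by these three patterns; only the video downloads are.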
Change your settings (if needed), restart squid and configure your FireFox to use your (squid) proxy server.
Looking at your store.log:
    perl -pe 's/^\d+\.\d+/localtime $&/e;' /usr/local/squid/logs/store.log
you should see a RELEASE -1 FFFFFFFF line for your video:
    Sat Feb 28 20:33:59 2009 RELEASE -1 FFFFFFFF 82E0348C925D182F5022C41DF5046C27 200 1235849627 1186629713 1235853227 video/flv 440391/440391 GET http://tc.cache.googlevideo.com/get_video?
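The same line can be split into fields to make the check less error-prone. This is a sketch, assuming the usual store.log column order (action, directory number, file number, key, status, Date, Last-Modified, Expires, type, sizes, method, URL); the field names are my own labels. A directory number of -1 with file number FFFFFFFF is generally read as "this object was not written to the disk cache".

```python
# Log line copied from above, after the perl filter converted the timestamp.
LINE = ("Sat Feb 28 20:33:59 2009 RELEASE -1 FFFFFFFF "
        "82E0348C925D182F5022C41DF5046C27 200 1235849627 1186629713 "
        "1235853227 video/flv 440391/440391 GET "
        "http://tc.cache.googlevideo.com/get_video?")

# My labels for the columns that follow the timestamp.
FIELDS = ["action", "dir", "file", "key", "status", "date", "lastmod",
          "expires", "type", "sizes", "method", "url"]


def parse_store_line(line):
    """Split one converted store.log line into a dict of named fields."""
    tokens = line.split()
    # The first five tokens are the human-readable timestamp.
    entry = dict(zip(FIELDS, tokens[5:]))
    entry["when"] = " ".join(tokens[:5])
    return entry


entry = parse_store_line(LINE)
not_stored = entry["action"] == "RELEASE" and entry["file"] == "FFFFFFFF"
lifetime = int(entry["expires"]) - int(entry["date"])
print(entry["when"], "| not stored on disk:", not_stored,
      "| lifetime in seconds:", lifetime)
```

For this line the Expires and Date columns differ by 3600 seconds, which matches the one-hour expiry seen in the HTTP headers.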
I used squidpurge to clean my squid cache and I saved 30 GByte!
cleaning.sh

    #!/bin/sh
    purge -p localhost:3128 -P 1 -sf cleaning.txt

cleaning.txt

    /videoplayback\?id=
    /get_video\?origin=
    /get_video\?video_id=
Don't use squidpurge during production hours. It is slow and time consuming, see www.wa.apana.org.au/~dean/squidpurge/README.
squid must be running, but preferably without the videocache redirector. If videocache is running, the redirectors see the PURGE requests and videocache may try to download many videos from expired URLs.
Don't cache yourself if you use videocache
If you have the (squid) proxy server running with the videocache redirector and the local webserver on the same machine, it doesn't make sense to cache with squid the contents already cached by videocache.
Depending on your cache policy (discussed above) you may need something like this in your squid.conf:
    acl CACHE_HOST dstdomain proxy.domain
    cache deny CACHE_HOST

or

    acl CACHE_HOST_IP dst 192.168.0.3
    cache deny CACHE_HOST_IP
assuming you are using proxy.domain or 192.168.0.3 in videocache.conf for your cache_host:

    cache_host = proxy.domain

or

    cache_host = 192.168.0.3
Don't cache the video twice with videocache
Don't use google caching with videocache. Use only youtube caching. Put this in your videocache.conf:
    enable_youtube_cache = 1
    enable_google_cache = 0
This way you will have only two downloads for each new youtube video: one for the workstation and a second one for videocache caching.
Otherwise, you will have a third download because videocache will also cache the 'temporary' video at googlevideo.com.
If you look at /var/spool/videocache/youtube you should have only files whose names are 11 characters long. These are the 'static' video ids at www.youtube.com.
If you have files with a 'temporary' id (16 characters long) you are doing unnecessary double caching for www.youtube.com.
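The check on the cache directory can be automated. The sketch below groups file names by the length of the id; the function name classify_cache_dir is mine, and treating the part of the file name before the first dot as the id is an assumption in case the files carry an extension. The demo runs on a temporary directory with made-up file names.

```python
from pathlib import Path
import tempfile

STATIC_ID_LEN = 11      # 'static' id at www.youtube.com
TEMPORARY_ID_LEN = 16   # 'temporary' id at googlevideo.com


def classify_cache_dir(directory):
    """Group the file names in a cache directory by the length of the id."""
    groups = {"static": [], "temporary": [], "other": []}
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        # Assumption: the id is the file name up to the first dot, if any.
        video_id = path.name.split(".", 1)[0]
        if len(video_id) == STATIC_ID_LEN:
            groups["static"].append(path.name)
        elif len(video_id) == TEMPORARY_ID_LEN:
            groups["temporary"].append(path.name)
        else:
            groups["other"].append(path.name)
    return groups


# Demo on a throwaway directory with made-up file names. On a real proxy,
# pass "/var/spool/videocache/youtube" instead.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("abcdefghijk", "986fef5cdbeb7e25"):
        (Path(tmp) / name).touch()
    demo = classify_cache_dir(tmp)

print(demo)
```

Any name that ends up in the "temporary" group points to the double caching described above.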
I used videocache 1.8 & 1.9 to test this ...
What to do with video.google.com and localised sites
If a user goes to googlevideo.com, they are permanently redirected to video.google.com (301 HTTP response).
In my case video.google.com and its localised sites (video.google.cat, video.google.es, video.google.fr, ...) are not much used. Moreover, the majority of videos at video.google.com today are www.youtube.com videos.
And, in my opinion, there is a bug in videocache 1.8 & 1.9 that prevents caching for video.google.com. Please see cachevideos.com/forum/post/httpvideogoogleesvideoplaydocid-933124738927337625
For more information please go to the videocache website, cachevideos.com.