When Apache is doing content negotiation, it follows an exhaustive, well-thought-out algorithm for selecting the best variant. A normal web browser will send something like this in its request headers:
Accept: application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,*/*;q=0.5
If, like me, you want to serve up multiple content types using negotiation, but show HTML by default for normal users, then step 1 of the Apache algorithm will already do the right thing for you. Unless you also serve an XHTML variant, HTML is the type the browser rates highest (q=0.9) among the ones you offer, so Apache will pick that.
Unfortunately Google (and other search engines) send only this:
Accept: */*
in which case every variant ties at step 1 and the other rules in the Apache algorithm come into play. In my case, the plain text variant of my resources was being selected because of step 8 in the algorithm (pick the variant with the smallest content length), so Apache was serving plain text to Google and HTML to everyone else.
Not only is this against Google’s rules (it would be considered “cloaking”), but it’s also really bad for rankings!
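To make the tie concrete, here is a rough Python sketch of the two steps that matter here. It is a simplified model, not Apache's actual code: the variant sizes are made-up numbers, the function names are hypothetical, and steps 2 to 7 (language, charset, encoding) are assumed not to distinguish the variants.

```python
# Hypothetical variants of one resource: media type -> content length in bytes.
VARIANTS = {
    "text/html": 5200,
    "application/atom+xml": 3100,
    "text/plain": 1800,
}

def pick_variant(client_q, variants):
    """Simplified negotiation: step 1 (highest q), then step 8 (smallest size)."""
    # Step 1: keep only the variants with the highest client quality value.
    # Every variant is assumed to have the default qs of 1.0 here.
    best = max(client_q(media_type) for media_type in variants)
    tied = [t for t in variants if client_q(t) == best]
    # Step 8: among the remaining variants, choose the smallest content length.
    return min(tied, key=lambda t: variants[t])

# Browser header from above; anything unlisted falls back to */*;q=0.5.
BROWSER_Q = {"application/xhtml+xml": 1.0, "text/html": 0.9, "text/plain": 0.8}

def browser_q(media_type):
    return BROWSER_Q.get(media_type, 0.5)

def crawler_q(media_type):
    # Accept: */* rates every type equally.
    return 1.0

print(pick_variant(browser_q, VARIANTS))  # HTML wins on q alone
print(pick_variant(crawler_q, VARIANTS))  # three-way tie, smallest file wins
```

With these assumed sizes the browser gets HTML, while the crawler's three-way tie is broken in favour of the plain text file, simply because it is the smallest.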
The solution is to configure “quality of source” (qs) multipliers, such that your preferred content type comes out with the highest quality score in cases where the client sends Accept: */*. That’s easy if you’re using type map files, but who wants to create a type map for every resource on their site? MultiViews gives you automatic type mapping, but the docs don’t explain how to set the qs multiplier for MultiViews.
Luckily, it is possible, just not obvious. I eventually found a thread on the Apache users mailing list from 2002 which explains: you can use the AddType directive to redefine content types with a qs parameter, which Apache will then apply in its negotiation algorithm. Your config would look something like this:
<Directory "/var/www/localhost/htdocs">
    Options +MultiViews
    AddType application/atom+xml;qs=0.5 .atom
    AddType text/plain;qs=0.1 .txt
    # all other content types will have a default qs multiplier of 1.0
    ...
</Directory>
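As a sanity check on those numbers, the arithmetic of step 1 (client q multiplied by server qs) can be sketched in Python. This is only an illustration under simplifying assumptions: the helper names are hypothetical, media-type parameters other than q are ignored, and so is Apache's special-case handling of wildcards that carry no explicit q value.

```python
def parse_accept(header):
    """Parse an Accept header into a list of (media_range, q) pairs."""
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        q = 1.0  # q defaults to 1.0 when not given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        ranges.append((fields[0], q))
    return ranges

def client_q(media_type, ranges):
    """Return the q of the most specific matching range, or 0.0 if none match."""
    major = media_type.split("/")[0]
    best_rank, best_q = -1, 0.0
    for media_range, q in ranges:
        if media_range == media_type:
            rank = 2          # exact match
        elif media_range == major + "/*":
            rank = 1          # type wildcard
        elif media_range == "*/*":
            rank = 0          # full wildcard
        else:
            continue
        if rank > best_rank:
            best_rank, best_q = rank, q
    return best_q

def score_variants(header, qs_by_type):
    """Step 1 of the negotiation: client q multiplied by server qs."""
    ranges = parse_accept(header)
    return {t: client_q(t, ranges) * qs for t, qs in qs_by_type.items()}

# qs values matching the config above; text/html keeps the default of 1.0.
QS = {"text/html": 1.0, "application/atom+xml": 0.5, "text/plain": 0.1}

BROWSER = "application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,*/*;q=0.5"
CRAWLER = "*/*"

for header in (BROWSER, CRAWLER):
    scores = score_variants(header, QS)
    print(header, "->", max(scores, key=scores.get))
```

Under this model HTML comes out on top for both headers, so the negotiation is settled at step 1 and never reaches the content-length tie-breaker.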
Unfortunately Apache will pass the qs parameter on to clients in its response headers. As mentioned in the mailing list thread, this may or may not adhere to the spec, and it certainly is ugly, but in my testing it doesn’t seem to cause any harm.