It's been half a year since the previous post here, and I feel we could use some fresh thought on CleanURLs.
First of all, I agree that to search engines, clean (human-friendly) URLs may not matter at all. But I think they matter a lot to humans, and here are the reasons I believe so:
1. To selectively quote from http://en.wikipedia.org?article=25725 (just kidding, from http://en.wikipedia.org/wiki/Clean_URL):
Wikipedia wrote: clean URLs can also greatly improve usability and accessibility. Removing unnecessary parts, simplifies URLs and makes them easier to type and remember. [...] when planning the structure of clean URLs, webmasters often take this opportunity to include relevant keywords in the URL and remove irrelevant words from it. So common words like "the", "and", "an", "a", etc. are often stripped out to further trim down the URL while descriptive keywords are added to increase user-friendliness and improve search engine ranking.[1] This includes replacing hard-to-remember numerical IDs with the name of the resource it refers to. And, because not all resources have URL-friendly names due to the character set restrictions on web URLs or length, it is common practice to generate a slug that is truncated to a certain length and has any invalid characters replaced with human-readable characters. This also eliminates ugly and hard to remember URL-encoded strings (e.g. Peanut%20M%26Ms becomes Peanut_MMs).
Similarly, it is common practice to replace cryptic variable names and parameters with friendly names or to simply do away with them altogether. Shorter URLs that don't contain any esoteric abbreviations or complex syntax that is alien to the average user are less intimidating and contribute to overall usability.
Another aspect of clean URLs is that they do not contain implementation details of the underlying web application.[4] For example, many URLs include the filename of a server-side script, such as "example.php", "example.asp" or "cgi-bin". Such details are irrelevant to the user and do not serve to identify the content, and make it harder to change the implementation of the server at a later date. For example, if a script at "example.php" is rewritten in Python, the URL will have to change, or rewrite rules will need to be used to allow the old URL to redirect to the new one.
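To make the slug idea from the quote concrete, here is a minimal sketch in Python. The names (slugify, STOP_WORDS, max_length) and the choice of hyphens as separators are my own assumptions for illustration; the quote's own example uses an underscore, and a real forum would pick its own rules.

```python
import re
import unicodedata

# Assumed stop-word list, taken from the words the quote mentions.
STOP_WORDS = {"the", "and", "an", "a"}


def slugify(title: str, max_length: int = 50) -> str:
    """Turn a topic title into a short, typeable URL fragment."""
    # Fold accented letters to plain ASCII where possible, drop the rest.
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
        .lower()
    )
    # Drop apostrophes and ampersands so "M&Ms" becomes "mms", not "m-ms".
    ascii_text = re.sub(r"['&]", "", ascii_text)
    # Anything that is not a letter or digit separates words.
    words = [w for w in re.split(r"[^a-z0-9]+", ascii_text) if w]
    # Strip stop words, unless that would leave nothing at all.
    kept = [w for w in words if w not in STOP_WORDS] or words
    slug = "-".join(kept)
    if len(slug) > max_length:
        cut = slug[:max_length]
        # If the cut landed mid-word, back up to the previous hyphen.
        if slug[max_length] != "-" and "-" in cut:
            cut = cut.rsplit("-", 1)[0]
        slug = cut
    return slug


print(slugify("Peanut M&Ms"))            # peanut-mms
print(slugify("The Cats and the Dogs"))  # cats-dogs
```

The truncation step only ever cuts at a word boundary when one is available, so a shortened slug still reads as whole words.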
2. Let's think of URLs outside the realm of a web page, and URLs to our forum being used in places we have no control over, in any non-clickable medium:
- I've had to work in support, and often I had to ask people to read out to me over the phone what the URL of an error page was. Or they told me "I'm at http://blablabla ID = a63ab7f5ee0540698fcc51e5274bf0e2 and I see this problem".
- Paper books that reference online resources have to print their URL, and the reader has to type it in. (For the love of god, please don't say that the book author should use tinyurl.com and pick a pronounceable URL)
- At conferences or presentations, you may want to write a URL on the whiteboard (or in a slide) and have attendees type it in.
- If you're troubleshooting a problem on a computer that doesn't have a connection to another computer, and you want to "paste" a URL to it, you'd have to type that URL in. Often, this URL would point to a forum that explains how to troubleshoot that problem.
- Printouts can have a stylesheet for printing that outputs the URL next to the link title. If a reader later wants to follow the URL, they'll have to type it in.
- On various devices (e.g. phones), entering characters like the '=' from query strings is far more cumbersome than typing them on a full keyboard.
Even online, a human-friendly URL can be used directly, without cumbersome markup. If someone asks "How do I configure X to work with Y", I can quickly reply with "See http://example.com/thread-12345/configu ... ork-with-y", and the URL will look friendly and legit.
In short, there are well-established accessibility reasons why clean URLs are a clear winner over cryptic URLs for humans, and these reasons have been studied extensively by professional user experience designers.
3. The only rational argument I've seen against the idea of clean URLs is that they would leak information about private boards: when someone follows an outbound link, the Referer header would carry the topic title to the external site. However, this can easily be solved with referer hiding, which is a trivial fix and can be implemented even without a redirection page.
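As one possible way to do referer hiding without a redirection page, here is a minimal sketch in Python. It assumes the forum runs as a WSGI application and relies on the standard Referrer-Policy response header, which current browsers honor; the wrapper name hide_referrer is hypothetical, and this is not necessarily the mechanism any particular forum software uses.

```python
def hide_referrer(app):
    """Wrap a WSGI app so every response asks browsers not to send a Referer."""

    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Drop any policy the app already set, then add the strictest one.
            headers = [(k, v) for k, v in headers if k.lower() != "referrer-policy"]
            headers.append(("Referrer-Policy", "no-referrer"))
            return start_response(status, headers, exc_info)

        return app(environ, patched_start_response)

    return wrapped
```

With this header in place, a browser that follows an outbound link from a private board sends no Referer at all, so the topic title in the URL never reaches the external site. The same effect is available per link with rel="noreferrer" on the anchor tag.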
4. Of course, there are arguments against implementing clean URLs, which is a different topic:
- "Clean URLs may require more database lookups" - perhaps not, if we keep one ID in the URL (as most dynamic content sites do): http://example.com/forum-name/thread-12 ... s-and-dogs
- "What if a topic gets renamed?" - there are two solutions. If we keep an ID in the URL, then anything following the ID is there just for the convenience of humans. This is the easiest solution. If we remove the ID, then when a topic gets renamed (I've seen estimates of that happening rarely, maybe in 5% of the cases), the old URL could be set up to redirect to the new URL, using a table of redirects.
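Both options from the list above can be sketched in a few lines of Python. Everything here is an assumption for illustration: the /thread-<id>/<slug> and /topic/<slug> path layouts, the in-memory dictionaries standing in for database tables, and the function names. It is a sketch of the idea, not how any existing forum software does it.

```python
import re

# Hypothetical stand-ins for database tables.
TOPICS = {12345: "cats-and-dogs"}          # topic id -> current slug
SLUG_TO_TOPIC = {"cats-and-dogs": 12345}   # current slug -> topic id
REDIRECTS = {"cats-vs-dogs": "cats-and-dogs"}  # old slug -> new slug

# Option 1: the ID is authoritative, whatever follows it is decoration.
THREAD_PATH = re.compile(r"^/thread-(\d+)(?:/[^/]*)?/?$")


def resolve_with_id(path):
    """Return (status, topic_id_or_location) using a single lookup by ID."""
    match = THREAD_PATH.match(path)
    if match is None:
        return 404, None
    topic_id = int(match.group(1))
    current_slug = TOPICS.get(topic_id)
    if current_slug is None:
        return 404, None
    canonical = f"/thread-{topic_id}/{current_slug}"
    if path != canonical:
        # Stale or missing slug, e.g. after a rename: send the reader
        # to the current address instead of failing.
        return 301, canonical
    return 200, topic_id


# Option 2: no ID in the URL, so renames are recorded in a redirect table.
def resolve_without_id(path):
    """Return (status, topic_id_or_location) for paths like /topic/<slug>."""
    if not path.startswith("/topic/"):
        return 404, None
    slug = path[len("/topic/"):].strip("/")
    if slug in SLUG_TO_TOPIC:
        return 200, SLUG_TO_TOPIC[slug]
    if slug in REDIRECTS:
        return 301, f"/topic/{REDIRECTS[slug]}"
    return 404, None
```

In the first variant a rename costs nothing extra, since the slug is never used for the lookup. The second variant needs one extra row per rename, which stays small if renames really are as rare as estimated.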
What I want to say is that human-friendly URLs may be somewhat difficult to implement, but implementing them is the right thing to do.
A useful continuation of this discussion is to figure out HOW exactly to implement clean URLs.