uima-dev mailing list archives

From Richard Eckart de Castilho <...@apache.org>
Subject Avoid indexing of old UIMA documentation
Date Wed, 06 Apr 2016 20:43:13 GMT

Hi all,

I believe some time back we were talking about a strategy to keep search engines from pointing
to ancient versions of the UIMA documentation.

I have read a bit on rel="canonical" and robots.txt.

1) per webpage - Apparently, one can place a `link rel="canonical"` element in the head of any
HTML page. Search engines seeing this element should then not index the page, because it is
considered a duplicate of whatever other page the link points to.

2) via http header/htaccess - Since we probably don't want to patch up all our JavaDoc files,
the information about a canonical source can also be sent in the HTTP header, e.g. via a suitable
htaccess file.

I guess the idea would be that for any old documentation page, we would want it to point to
its latest version as its canonical source. I mean for every page, not only for the index
page. This seems a bit tedious.
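
To make the per-page mapping concrete, here is a minimal Python sketch. It assumes that versioned folders look like `/d/<product>-<version>/...` and that the latest copy lives in `/d/<product>-current/...`; the folder pattern, the `SITE` constant and all function names are illustrative and not taken from the actual site setup. It computes the canonical URL for an old page and formats it both as the HTML element from 1) and as the `Link` HTTP header from 2).

```python
import re
from html import escape

# Assumed layout: /d/<product>-<version>/<page>, with the latest docs
# copied to /d/<product>-current/<page>. Both are illustrative guesses.
SITE = "https://uima.apache.org"
VERSIONED = re.compile(
    r"^/d/(?P<product>[A-Za-z][A-Za-z-]*?)-(?P<version>\d[^/]*)/(?P<rest>.*)$"
)


def canonical_url(path):
    """Return the URL of the latest version of an old page, or None
    if the path is not inside a versioned documentation folder."""
    match = VERSIONED.match(path)
    if match is None:
        return None
    return f"{SITE}/d/{match['product']}-current/{match['rest']}"


def link_element(url):
    """HTML element for option 1 (goes into the page's <head>)."""
    return f'<link rel="canonical" href="{escape(url, quote=True)}">'


def link_header(url):
    """HTTP header line for option 2 (sent by the web server)."""
    return f'Link: <{url}>; rel="canonical"'
```

The sketch only shows what each old page would have to declare. It also assumes that a page with the same relative path exists in the current version, which will not hold for pages that were renamed or removed between releases.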

My suggestion would be an alternative that exploits the website folder structure and uses
robots.txt.

We disallow indexing of the "d" folder on the UIMA website.
We place all the "*-current" folders (svn copies of the latest documentation versions) under
a dedicated folder (e.g. "d/current") and allow indexing of that folder.

That way, the outdated versions of the documentation should be hidden from the search engines
while the respective latest versions are still indexed.
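
As a rough check of this idea, the following Python sketch feeds a possible robots.txt to the standard library's `urllib.robotparser` and asks which paths a generic crawler may fetch. The rules and the example folder names are assumptions for illustration; the real file would have to match the final folder layout.

```python
from urllib.robotparser import RobotFileParser

# Proposed rules. "Allow" is listed before "Disallow" so that parsers
# using first-match semantics (like Python's) and parsers using
# longest-match semantics reach the same answer.
ROBOTS_TXT = """\
User-agent: *
Allow: /d/current/
Disallow: /d/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(path):
    """True if a generic crawler is allowed to fetch the given path."""
    return parser.can_fetch("*", path)


# Folder names below are made up for illustration.
for path in ("/d/current/uimaj-current/index.html",
             "/d/uimaj-2.4.0/index.html",
             "/index.html"):
    print(path, may_crawl(path))
```

Note that robots.txt controls crawling. Whether an already known URL under "d" disappears from the results is still up to each search engine.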

Opinions? Does anybody have experience with SEO?

Cheers,

-- Richard

