uima-dev mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: Avoid indexing of old UIMA documentation
Date Thu, 07 Apr 2016 20:36:56 GMT
Hi,

This sounds like a good idea to me :-)

There's possibly one small issue with changing the folder structure.  The DocBook
setup has a way to link between books (olinks); these links require that the
books keep their positions relative to one another in the file tree.  As long as
that's not changed, I think there will be no problem.

If anyone's curious, the relevant bits of config info are in the
uima-docbook-olink project, in the various "site.xml" files.  You can see refs
to the famous "d" folder there.  There may be a dependency on the "books" being
just one directory layer under d/, so adding an extra layer might break things
(but I'm not sure...).
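
To illustrate the concern, here is a minimal sketch of how a relative link
between two books changes when only one of them moves a level deeper.  The
folder names and the helper `relative_link` are assumptions made for this
example; the actual olink configuration may resolve targets differently.

```python
import posixpath


def relative_link(from_book: str, to_book: str) -> str:
    """Return the relative path from one book folder to another."""
    return posixpath.relpath(to_book, start=from_book)


# Assumed layout today: both books sit directly under d/.
print(relative_link("d/uimaj-current", "d/uimaj-as-current"))

# Assumed layout if only the first book moves under d/current/.
print(relative_link("d/current/uimaj-current", "d/uimaj-as-current"))
```

If both books moved together into the same new folder, the relative link
between them would stay the same; it only changes when their positions
relative to each other change.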

Maybe there's a way to do this without introducing a new level in the directory?
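
One possibility, sketched here under assumptions, would be to list each
"*-current" folder explicitly in robots.txt and disallow the rest of d/.  The
folder names below are guesses for illustration, and the check uses Python's
standard `urllib.robotparser`, which applies the first matching rule; real
search engines may resolve Allow/Disallow conflicts differently (for example
by preferring the most specific rule), so this is a rough check only.

```python
from urllib.robotparser import RobotFileParser

# Assumed folder names; the real "*-current" folders may differ.
ROBOTS_TXT = """\
User-agent: *
Allow: /d/uimaj-current/
Allow: /d/uimaj-as-current/
Disallow: /d/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
for path in (
    "/d/uimaj-current/index.html",  # latest docs: should be crawlable
    "/d/uimaj-2.4.0/index.html",    # old docs (assumed name): should be blocked
    "/index.html",                  # rest of the site: unaffected
):
    print(path, parser.can_fetch("*", path))
```

The Allow lines come first because the standard-library parser stops at the
first rule that matches.  With the "d/current" variant from the original
proposal, a single `Allow: /d/current/` line would replace the per-folder
Allow lines.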

-Marshall

On 4/6/2016 4:43 PM, Richard Eckart de Castilho wrote:
> Hi all,
>
> I believe some time back we were talking about a strategy to avoid search engines pointing
> to ancient versions of the UIMA documentation.
>
> I have read a bit on rel="canonical" and robots.txt.
>
> 1) per webpage - Apparently, one can place a `link rel="canonical"` element on any HTML
> page. Search engines seeing this tag will then not index this page because it is considered
> to be a duplicate of whatever other page the link points to.
>
> 2) via HTTP header/htaccess - Since we probably don't want to patch up all our JavaDoc
> files, the information about a canonical source can also be sent in the HTTP header, e.g.
> via a suitable htaccess file.
>
> I guess the idea would be that for any old documentation page, we would want it to point
> to its latest version as its canonical source. I mean for every page, not only for the index
> page. This seems a bit tedious.
>
> My suggestion would be an alternative that exploits the website folder structure and
> uses robots.txt.
>
> We disallow indexing of the "d" folder on the UIMA website.
> We place all the "*-current" folders (svn copies of the latest documentation versions)
> under a dedicated folder (e.g. "d/current") and allow indexing that.
>
> In that way, the outdated versions of the documentation should be hidden from the search
> engines and the respective latest versions should be indexed.
>
> Opinions? Does anybody have experience with SEO?
>
> Cheers,
>
> -- Richard
>
>
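
For reference, option 2 in the quoted message (canonical information sent in
the HTTP header) could be sketched roughly as follows.  This is only an
illustration: the versioned folder pattern ("/d/<project>-<version>/"), the
site URL default, and the helper name `canonical_link_header` are all
assumptions, and the sketch assumes the same page path exists under the
matching "*-current" folder.

```python
import re
from typing import Optional

# Assumed naming: versioned folders look like "/d/<project>-<version>/...".
VERSIONED = re.compile(r"^/d/(?P<project>[a-z-]+?)-\d+(?:\.\d+)*(?P<rest>/.*)$")


def canonical_link_header(
    path: str, site: str = "https://uima.apache.org"
) -> Optional[str]:
    """Build a Link header value pointing an old page at its current copy.

    Returns None for paths that are not under a versioned folder.
    """
    match = VERSIONED.match(path)
    if match is None:
        return None
    target = f"{site}/d/{match['project']}-current{match['rest']}"
    return f'<{target}>; rel="canonical"'


print(canonical_link_header("/d/uimaj-2.4.0/apidocs/index.html"))
print(canonical_link_header("/d/uimaj-current/apidocs/index.html"))
```

The returned string is what would go after `Link:` in the response header.
A path that is already under a "*-current" folder yields None, so no header
would be added for it.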

