manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: FW: Store file size in Solr
Date Wed, 27 May 2015 12:30:48 GMT
Hi Vigi,

Are you looking for the document length, or the extracted content length?

In any case, the binary length of the document is available to the output
connector for indexing, but none of our output connectors makes use of it at
this time.  In addition, if you want the *original* binary document length,
before any Tika processing, then support will need to be added to the various
repository connectors, since that information is currently not retained.

Thanks,
Karl


On Wed, May 27, 2015 at 8:17 AM, Virgiliu R <gosuvigi@hotmail.com> wrote:

> Hello,
>
> I am currently using ManifoldCF to index a lot of documents from a Windows
> file share into Solr using an AD authority connector. Everything seems to
> be working almost fine and I am currently very pleased with the tool.
>
> There is something that I would like to implement but could not figure out
> how to do it. I would like to be able to store the file size of the
> documents into Solr so that it could be displayed in the search results or
> it could even be used for searching later on. The only problem is that I
> was not able to find a way to do that. I was thinking of using some sort of
> transformation connection, and I even tried a Tika transformation, but I
> don't know exactly how to do it.
>
> Could you please give me some hints on what I would have to do to achieve
> this?
>
> Thanks,
> vigi
>
