tika-dev mailing list archives

From Dave Meikle <loo...@gmail.com>
Subject Re: How can I let Tika know the resource name?
Date Wed, 15 Aug 2012 09:09:11 GMT
Hi, 

On 13 Aug 2012, at 12:31, 122jxgcn <ywpark90@gmail.com> wrote:

> Hello,
> 
> I'm using Solr's ExtractingRequestHandler to let Tika know the name of the
> file when indexing.
> I'm currently sending HTTP request something like
> 
> /update/extract?stream.file=#{filepath}&literal.id=#{filepath}&resource.name=#{resource_name}&commit=true
> 
> Will setting the resource.name variable let Tika know the name of the file 
> so that it can determine Metadata of the file properly?
> (for example resource_name = "file.custom" then in Tika, 
> Metadata.RESOURCE_NAME_KEY becomes "file.custom")
> I'm not sure how can I test this so I'm confused.
> 
> Thank you.


If you pass the resource name as you are doing, it will be fed into Tika and used as a hint
for MIME type detection. I assume MIME type detection is what you were after.

The best way to see what is happening with the metadata, and whether the value you passed
has made it through, is to look at the results of a request in Extract Only mode [1].
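As a rough sketch of such a check, the snippet below builds an Extract Only request URL using the same stream.file and resource.name parameters as in your request, plus extractOnly=true. The base URL, the file path, and the helper name build_extract_only_url are assumptions made up for illustration; adjust them to your own setup.

```python
from urllib.parse import urlencode


def build_extract_only_url(base_url, file_path, resource_name):
    """Build an Extract Only request URL for Solr's ExtractingRequestHandler.

    base_url, file_path and resource_name are placeholders for illustration.
    """
    params = {
        "stream.file": file_path,        # file that Solr reads from its local disk
        "resource.name": resource_name,  # file name hint passed on to Tika
        "extractOnly": "true",           # return the extraction result, do not index
    }
    # urlencode percent-encodes the values, so paths with slashes are safe
    return base_url.rstrip("/") + "/update/extract?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical local Solr instance and file path
    print(build_extract_only_url(
        "http://localhost:8983/solr", "/tmp/file.custom", "file.custom"))
```

Fetching that URL (in a browser or with curl) should return the extracted content and metadata without indexing anything, so you can check whether the resource name and the detected content type look as expected.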

Cheers,
Dave

[1] http://wiki.apache.org/solr/ExtractingRequestHandler#Extract_Only 