lucene-solr-user mailing list archives

From Susheel Kumar <susheel2...@gmail.com>
Subject Re: Problem with the Content Field during Solr Indexing
Date Mon, 02 Nov 2015 13:11:24 GMT
Hi Shruti,

If you are looking to index images to make them searchable (image search),
you will have to look at LIRE (Lucene Image Retrieval),
http://www.lire-project.net/, and you can follow the LIRE Solr plugin at
https://bitbucket.org/dermotte/liresolr.

Thanks,
Susheel

On Sat, Oct 31, 2015 at 9:46 PM, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com>
wrote:

> Hi Shruti,
>
> From what I understand, the /update/extract handler is for indexing
> rich-text documents and does not support ".png" files.
>
> It only supports the following file formats: pdf, doc, docx, ppt, pptx,
> xls, xlsx, odt, odp, ods, ott, otp, ots, rtf, htm, html, txt, log.
> If you use the default post.jar, I believe the other formats will get
> filtered out.
>
> When I tried to index a ".png" file in my custom handler, it just indexed
> "<p> <p>" in the content.
>
> Regards,
> Edwin
>
>
>
> On 31 October 2015 at 09:35, Shruti Mundra <mundra@usc.edu> wrote:
>
> > Hi Edwin,
> >
> > The file extension of the image file is ".png", and we are following this
> > URL for indexing:
> > "http://blog.thedigitalgroup.com/vijaym/wp-content/uploads/sites/11/2015/07/SolrImageExtract.png"
> >
> > Thanks and Regards,
> > Shruti Mundra
> >
> > On Thu, Oct 29, 2015 at 8:33 PM, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com>
> > wrote:
> >
> > > The "\n" is a newline character, as decoded by Solr from the indexed
> > > document.
> > >
> > > What is the file extension of your image file, and which method are you
> > > using to do the indexing?
> > >
> > > Regards,
> > > Edwin
> > >
> > >
> > > On 30 October 2015 at 04:38, Shruti Mundra <mundra@usc.edu> wrote:
> > >
> > > > Hi,
> > > >
> > > > When I try to index an image file directly into Solr, the content
> > > > attribute consists of a trail of "\n"s and not the data.
> > > > We are able to get the metadata for that image.
> > > >
> > > > Can anyone help us with how to get the content along with the
> > > > metadata?
> > > >
> > > > Thanks!
> > > >
> > > > - Shruti Mundra
> > > >
> > >
> >
>
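The extension list in Edwin's message can be used as a quick pre-check before sending files for indexing. The following is only a rough sketch: it takes his list as given (it is his reading of what gets through, not something verified here), and the names SUPPORTED_EXTENSIONS, is_supported and split_by_support are made up for illustration. It does not talk to Solr at all; it only sorts file names into those that match the list and those that would probably be skipped.

```python
from pathlib import Path

# Extensions copied from Edwin's message in this thread (assumption: this is
# the set that is not filtered out; it is not checked against Solr itself).
SUPPORTED_EXTENSIONS = {
    "pdf", "doc", "docx", "ppt", "pptx", "xls", "xlsx",
    "odt", "odp", "ods", "ott", "otp", "ots",
    "rtf", "htm", "html", "txt", "log",
}


def is_supported(filename: str) -> bool:
    """Return True if the file's extension appears in the list above."""
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in SUPPORTED_EXTENSIONS


def split_by_support(filenames):
    """Split file names into (supported, skipped), keeping input order."""
    supported = [name for name in filenames if is_supported(name)]
    skipped = [name for name in filenames if not is_supported(name)]
    return supported, skipped
```

With this check, a ".png" file such as the one discussed here ends up in the skipped list, which is the case where something like LIRE would be needed instead.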
