incubator-droids-dev mailing list archives

From Thorsten Scherler <thorsten.scherler....@juntadeandalucia.es>
Subject Re: Customizable Solr Handle
Date Fri, 25 Sep 2009 10:55:41 GMT
On Thu, 2009-09-24 at 18:25 +0200, Bertil Chapuis wrote:
> Has anyone already tried the jaxen package called saxpath? It may be a
> good solution for handling the content, especially if the web pages or
> documents are very long.
> 

Yes, with this solution it should work fine.
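
To illustrate the indexing point from the quoted review further down, here is a minimal sketch of how XPath positions are 1-based. It uses the JDK's javax.xml.xpath API rather than saxpath, and the class name XPathIndexSketch and the sample XML are assumptions made up for this example, not part of Droids or the DROIDS-62 patch.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of the Droids API.
final class XPathIndexSketch {

    // Evaluates an XPath expression against an XML string and returns
    // the string value of the first matching node ("" if nothing matches).
    static String evaluate(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Invalid XPath or XML input", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document.
        String xml = "<doc><title>first</title><title>second</title></doc>";
        // XPath positions start at 1, so [1] selects the first title.
        System.out.println(evaluate(xml, "/doc/title[1]"));
        // [0] is a valid expression but never matches, so the result is empty.
        System.out.println(evaluate(xml, "/doc/title[0]").isEmpty());
    }
}
```

A handler configured with expressions like these would select the first element with [1], whereas a zero-based convention such as element[0] would silently match nothing in a standard XPath engine.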

salu2

> Best regards,
> 
> Bertil
> 
> On Thu, 2009-09-24 at 11:19 +0200, Thorsten Scherler wrote:
> > On Wed, 2009-09-09 at 10:38 +0200, Bertil Chapuis wrote:
> > > Hello,
> > > 
> > > My name is Bertil Chapuis. I am using Droids for a personal project and
> > > I am trying to create a more customizable Solr handler. 
> > > 
> > > I posted a ticket with my code (DROIDS-62). However, I am looking for a
> > > way to filter the handler's execution. I'd like to handle the documents
> > > only if their URI or content matches specific conditions.
> > > 
> > > For example, the document is handled only if its URI matches the
> > > following regex:
> > > 
> > > http://www.awebsite.com/document-[0-9]*.htm
> > > 
> > > What's the best way to do that? 
> > 
> > I had a chance to test this patch, but in the end I could not use it for
> > my use case. The problem I have with it is that it limits access to the
> > different elements in the tree too much. It is not generic because,
> > instead of using XPath expressions (the standard approach for such a
> > use case), it uses "standard regexp". 
> > 
> > Further, having a strong XML background myself, it struck me as odd to
> > have element[0], which in XPath would have been element[1].
> > 
> > IMO, if you can add XPath support to this component, it would really rock
> > for many use cases, since we would have a generic parser solution for
> > extracting information. The way it is now, it will serve very few use
> > cases.
> > 
> > salu2
> > 
> > > Is it delegated to the handler's
> > > implementation or is there a standard way?
> > > 
> > > Best regards,
> > > 
> > > Bertil Chapuis
> > > 
> > > 
> 
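
For the URI filtering asked about in the original question, a minimal sketch of a regex check that a handler could run before processing a document might look like the following. The class UriFilterSketch and its accepts method are hypothetical names for this example and are not part of the Droids API. The pattern is copied from the message as written.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the Droids API.
final class UriFilterSketch {

    // Pattern copied from the message. Note that the unescaped dots match
    // any character and [0-9]* also accepts zero digits; a stricter
    // variant would escape the dots and use [0-9]+.
    private static final Pattern ACCEPTED =
            Pattern.compile("http://www.awebsite.com/document-[0-9]*.htm");

    // Returns true only if the whole URI matches the pattern.
    static boolean accepts(String uri) {
        return ACCEPTED.matcher(uri).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts("http://www.awebsite.com/document-42.htm"));
        System.out.println(accepts("http://www.awebsite.com/index.htm"));
    }
}
```

Whether such a check belongs inside the handler or in a separate filtering step is the open question of this thread; the sketch only shows the matching itself.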
-- 
Thorsten Scherler <thorsten.at.apache.org>
Open Source Java <consulting, training and solutions>

Sociedad Andaluza para el Desarrollo de la Sociedad 
de la Información, S.A.U. (SADESI)




