lucene-solr-user mailing list archives

From Jon Baer <jonb...@gmail.com>
Subject Re: DIH - Example of using $nextUrl and $hasMore
Date Mon, 02 Feb 2009 17:31:04 GMT
Yes, I think what Jared mentions in the JIRA issue is what I had in mind
when it was recommended to always return true for $hasMore ...

"The transformer must know somehow when $hasMore should be true. If the
transformer always give $hasMore a value "true", will there be infinite
requests made or will it stop on the first empty request? Using the
EnumeratedEntityTransformer, a user can specify from the config xml when
$hasMore should be true using the chunkSize attribute. This solves a general
case of "request N rows at a time until no more are available". I agree, a
combination of 'rowsFetchedCount' and a HasMoreUntilEmptyTransformer would
also make this doable from the configuration"

This makes sense.

- Jon
  [ Show » <https://issues.apache.org/jira/browse/SOLR-994> ]
 Jared Flatow <https://issues.apache.org/jira/secure/ViewProfile.jspa?name=jflatow> - 28/Jan/09 09:16 PM
The transformer must know somehow when $hasMore should be true. If
the transformer always give $hasMore a value "true", will there be infinite
requests made or will it stop on the first empty request? Using the
EnumeratedEntityTransformer, a user can specify from the config xml when
$hasMore should be true using the chunkSize attribute. This solves a general
case of "request N rows at a time until no more are available". I agree, a
combination of 'rowsFetchedCount' and a HasMoreUntilEmptyTransformer would
also make this doable from the configuration.
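
The "request N rows at a time until no more are available" case quoted above can be sketched roughly as follows. This is only an illustration of the stopping rule, not the EnumeratedEntityTransformer from the SOLR-994 patch: the class name, method names, and the rule "ask again only if the last chunk came back full" are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from DIH or the SOLR-994 patch.
public class ChunkPaging {

    /** Assumed rule: another request is worthwhile only if the last chunk came back full. */
    static boolean hasMore(int rowsFetched, int chunkSize) {
        return rowsFetched == chunkSize;
    }

    /** Simulates fetching totalRows in chunks and returns the start offset of each request made. */
    static List<Integer> requestOffsets(int totalRows, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<Integer> offsets = new ArrayList<>();
        int start = 0;
        boolean more = true;
        while (more) {
            offsets.add(start);
            // Rows the simulated source can still return for this request.
            int fetched = Math.max(0, Math.min(chunkSize, totalRows - start));
            more = hasMore(fetched, chunkSize);
            start += fetched;
        }
        return offsets;
    }

    public static void main(String[] args) {
        System.out.println(requestOffsets(25, 10));
    }
}
```

With this rule, a total that is an exact multiple of the chunk size costs one extra, empty request before the loop stops, which is the "stop on the first empty request" behaviour asked about above.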

On Mon, Feb 2, 2009 at 11:53 AM, Shalin Shekhar Mangar <
shalinmangar@gmail.com> wrote:

> On Mon, Feb 2, 2009 at 9:20 PM, Jon Baer <jonbaer@gmail.com> wrote:
>
> > Hi,
> >
> > Sorry I know this exists ...
> >
> > "If an API supports chunking (when the dataset is too large), multiple calls
> > need to be made to complete the process. XPathEntityProcessor supports this
> > with a transformer. If the transformer returns a row which contains a field
> > *$hasMore* with the value "true", the processor makes another request with
> > the same url template (the actual value is recomputed before invoking). A
> > transformer can pass a totally new url too for the next call by returning a
> > row which contains a field *$nextUrl* whose value must be the complete url
> > for the next call."
> >
> > But is there a true example of its use somewhere?  I'm trying to figure out
> > how to set this up properly if I know before import that I have 56 "pages"
> > to index.  (And how to set it up if pages need to be determined by something
> > in the feed, etc).
> >
>
> No, there is no example (yet). You'll put the url with variables for the
> corresponding 'start' and 'count' parameters and a custom transformer can
> specify if another request needs to be made. I know it's not much to go on.
> I'll try to write some documentation on the wiki.
>
> SOLR-994 might be interesting to you. I haven't been able to look at the
> patch though.
>
>  https://issues.apache.org/jira/browse/SOLR-994
> --
> Regards,
> Shalin Shekhar Mangar.
>
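
For the "56 pages known up front" case asked about above, a transformer along these lines might work. It is a minimal sketch under several assumptions: the feed row is assumed to carry its own page number in a field called "page", the URL template is a placeholder, and the class is written as a plain Java class so it compiles without Solr on the classpath (in a real setup it would presumably extend DIH's Transformer class instead). Only the field names $hasMore and $nextUrl and the value "true" come from the description quoted in the thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; class name, "page" field and URL are assumptions.
public class PagedFeedTransformer {

    // Known before the import starts, as in the question.
    static final int TOTAL_PAGES = 56;

    // Placeholder URL; $nextUrl must be the complete url for the next call.
    static final String URL_TEMPLATE = "http://example.com/feed?page=%d";

    public Object transformRow(Map<String, Object> row) {
        Object pageValue = row.get("page"); // assumed field supplied by the feed
        if (pageValue == null) {
            return row; // nothing to go on, so request no further pages
        }
        int page = Integer.parseInt(pageValue.toString());
        if (page < TOTAL_PAGES) {
            // Both fields are set here; per the quoted description either
            // one alone may be enough to trigger another request.
            row.put("$hasMore", "true");
            row.put("$nextUrl", String.format(URL_TEMPLATE, page + 1));
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("page", "55");
        System.out.println(new PagedFeedTransformer().transformRow(row));
    }
}
```

If the page count has to come from the feed itself, the same shape should apply with TOTAL_PAGES replaced by a value read from the row.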
