ignite-dev mailing list archives

From Ivan Pavlukhin <vololo...@gmail.com>
Subject Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)
Date Thu, 28 Nov 2019 13:22:52 GMT
Folks, Yuriy,

I suppose that we are going to proceed with

>>>
Reducing on Ignite

The obvious point of distributed response reduction is the class
GridCacheDistributedQueryFuture.
However, @Ivan Pavlukhin mentioned a class with similar functionality:
ReduceIndexSorted.
What I see here is that it is tangled with H2-related classes
(org.h2.result.Row) and might not be unifiable with TextQuery reduction.
>>

From my side there is no strict opinion that we should unify
reduction. Having a separate reduction implementation for text queries
sounds like a reasonable option to me as well.
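
To make the idea concrete, here is a minimal sketch of what a separate
reducer for text queries could look like, assuming each node returns its
hits already ordered by descending score and ships the score with every
entry. All names in it (TextQueryScoreMergeSketch, ScoredEntry,
mergeTopByScore) are hypothetical and only for illustration; this is not
existing Ignite code and it does not touch GridCacheDistributedQueryFuture.
It also assumes that scores from different nodes are directly comparable,
which may not hold exactly, since each node scores against its own local
index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch of a score-based reducer; none of these types exist in Ignite. */
class TextQueryScoreMergeSketch {
    /** A cache key plus the relevance score reported by the node (assumed to be shipped with the response). */
    static final class ScoredEntry {
        final String key;
        final float score;

        ScoredEntry(String key, float score) {
            this.key = key;
            this.score = score;
        }

        @Override public String toString() {
            return key + ":" + score;
        }
    }

    /** Cursor over one node's response: the current head and the rest of that node's stream. */
    private static final class NodeCursor {
        ScoredEntry head;
        final Iterator<ScoredEntry> rest;

        NodeCursor(Iterator<ScoredEntry> it) {
            rest = it;
            head = it.next();
        }
    }

    /**
     * K-way merge of per-node results, each assumed sorted by descending score.
     * Stops as soon as 'limit' entries have been produced.
     */
    static List<ScoredEntry> mergeTopByScore(List<List<ScoredEntry>> perNode, int limit) {
        // Max-heap keyed on the score of each cursor's current head.
        PriorityQueue<NodeCursor> heap = new PriorityQueue<>(
            Math.max(1, perNode.size()),
            (a, b) -> Float.compare(b.head.score, a.head.score));

        for (List<ScoredEntry> nodeRes : perNode) {
            if (!nodeRes.isEmpty())
                heap.add(new NodeCursor(nodeRes.iterator()));
        }

        List<ScoredEntry> merged = new ArrayList<>();

        while (merged.size() < limit && !heap.isEmpty()) {
            NodeCursor cur = heap.poll();

            merged.add(cur.head);

            // Advance the cursor and put it back if the node has more entries.
            if (cur.rest.hasNext()) {
                cur.head = cur.rest.next();
                heap.add(cur);
            }
        }

        return merged;
    }

    public static void main(String[] args) {
        List<ScoredEntry> node1 = Arrays.asList(new ScoredEntry("a", 0.9f), new ScoredEntry("b", 0.4f));
        List<ScoredEntry> node2 = Arrays.asList(new ScoredEntry("c", 0.7f), new ScoredEntry("d", 0.1f));

        System.out.println(mergeTopByScore(Arrays.asList(node1, node2), 3));
    }
}
```

Since the merge never takes more than 'limit' entries in total, no node
needs to send more than 'limit' entries either, so the per-node limit and
the final limit can be the same number.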

Are there still any open questions?

On Wed, 27 Nov 2019 at 02:27, Denis Magda <dmagda@apache.org> wrote:
>
> I don't see anything wrong if Yuriy is willing to carry on and keep
> enhancing our full-text search support that lacks basic capabilities.
>
> The basics should be available. If anybody needs an advanced feature, they
> can introduce Solr or Elasticsearch into the final architecture of the app.
>
> Folks, who among us can help Yuriy with the questions he asked? Most likely
> the SQL experts are the best candidates here.
>
>
> -
> Denis
>
>
> On Tue, Nov 26, 2019 at 8:52 AM Ivan Pavlukhin <vololo100@gmail.com> wrote:
>
> > Folks,
> >
> > IEP is an Ignite-specific thing. In fact, I suppose that we are
> > already doing it in ASF way by having this dev-list discussion =)
> >
> > As for me, implementing the "limit" feature for text queries is not big
> > enough to warrant an IEP. But we might need to create one for the next
> > features.
> >
> > On Tue, 26 Nov 2019 at 15:06, Ilya Kasnacheev <ilya.kasnacheev@gmail.com> wrote:
> > >
> > > Hello!
> > >
> > > ASF way should probably start with an IEP :)
> > >
> > > Regards,
> > > --
> > > Ilya Kasnacheev
> > >
> > >
> > > On Tue, 26 Nov 2019 at 14:12, Zhenya Stanilovsky <arzamas123@mail.ru.invalid> wrote:
> > >
> > > >
> > > > OK, let's forget Solr and go the ASF way. If Yuriy proves this
> > > > functionality is helpful and submits a PR, why not?
> > > >
> > > > >Tuesday, 26 November 2019, 14:06 +03:00, from Ilya Kasnacheev <ilya.kasnacheev@gmail.com>:
> > > > >
> > > > >Hello!
> > > > >
> > > > >The problem here is that Solr is a multi-year effort by a lot of people.
> > > > >We can't match that.
> > > > >
> > > > >Maybe we could integrate with Solr/Solr Cloud instead, by feeding our cache
> > > > >information into their storage for indexing and relying on their own
> > > > >mechanisms for distributed IR sorting?
> > > > >
> > > > >Regards,
> > > > >--
> > > > >Ilya Kasnacheev
> > > > >
> > > > >
> > > > >On Tue, 26 Nov 2019 at 13:59, Zhenya Stanilovsky <arzamas123@mail.ru.invalid> wrote:
> > > > >
> > > > >>
> > > > >> Ilya Kasnacheev, what is the problem with Solr-like functionality in Ignite?
> > > > >>
> > > > >> thanks !
> > > > >>
> > > > >> >Tuesday, 26 November 2019, 13:50 +03:00, from Ilya Kasnacheev <ilya.kasnacheev@gmail.com>:
> > > > >> >
> > > > >> >Hello!
> > > > >> >
> > > > >> >I have a hunch that we are trying to build Apache Solr (or Solr Cloud)
> > > > >> >into Apache Ignite. I think that's a lot of effort that is not very
> > > > >> >justified.
> > > > >> >
> > > > >> >I don't think we should try to implement sorting in Apache Ignite, because
> > > > >> >it is a lot of work, and a lot of code in our code base which we don't
> > > > >> >really want.
> > > > >> >
> > > > >> >Regards,
> > > > >> >--
> > > > >> >Ilya Kasnacheev
> > > > >> >
> > > > >> >
> > > > >> >On Fri, 22 Nov 2019 at 20:59, Yuriy Shuliga <shuliga@gmail.com> wrote:
> > > > >> >
> > > > >> >> Dear Igniters,
> > > > >> >>
> > > > >> >> The first part of the TextQuery improvement, a result limit, has been
> > > > >> >> developed and merged.
> > > > >> >> Now we have to develop the most important functionality here: proper
> > > > >> >> sorting of Lucene index responses and correct reduction of them for
> > > > >> >> distributed queries.
> > > > >> >>
> > > > >> >> *There are two Lucene-based aspects*
> > > > >> >>
> > > > >> >> 1. When no sort fields are used, the documents in the response are
> > > > >> >> still ordered by relevance.
> > > > >> >> This is actually the ScoreDoc.score value.
> > > > >> >> In order to reduce the distributed results correctly, the score should
> > > > >> >> be passed with the response.
> > > > >> >>
> > > > >> >> 2. When sorting by conventional fields, Lucene should have these
> > > > >> >> fields properly indexed, and the corresponding Sort object should be
> > > > >> >> applied to Lucene's search call.
> > > > >> >> In order to mark those fields, a new annotation like '@SortField' may
> > > > >> >> be introduced.
> > > > >> >>
> > > > >> >> *Reducing on Ignite*
> > > > >> >>
> > > > >> >> The obvious point of distributed response reduction is the class
> > > > >> >> GridCacheDistributedQueryFuture.
> > > > >> >> However, @Ivan Pavlukhin mentioned a class with similar functionality:
> > > > >> >> ReduceIndexSorted.
> > > > >> >> What I see here is that it is tangled with H2-related classes
> > > > >> >> (org.h2.result.Row) and might not be unifiable with TextQuery reduction.
> > > > >> >>
> > > > >> >> I still need support here.
> > > > >> >>
> > > > >> >> Overall, the goal of this letter is to initiate a discussion on the
> > > > >> >> TextQuery sorting implementation and come closer to ticket creation.
> > > > >> >>
> > > > >> >> BR,
> > > > >> >> Yuriy Shuliha
> > > > >> >>
> > > > >> >> On Tue, 22 Oct 2019 at 13:31, Andrey Mashenkov <andrey.mashenkov@gmail.com> wrote:
> > > > >> >>
> > > > >> >> > Hi Dmitry, Yuriy.
> > > > >> >> >
> > > > >> >> > I've found that GridCacheQueryFutureAdapter has a newly added
> > > > >> >> > AtomicInteger 'total' field and a 'limit' field declared as a
> > > > >> >> > primitive int.
> > > > >> >> >
> > > > >> >> > Both fields are used inside a synchronized block only.
> > > > >> >> > So, we can make both private and downgrade the AtomicInteger to a
> > > > >> >> > primitive int.
> > > > >> >> >
> > > > >> >> > Most likely, these fields can be replaced with one field.
> > > > >> >> >
> > > > >> >> >
> > > > >> >> >
> > > > >> >> > On Mon, Oct 21, 2019 at 10:01 PM Dmitriy Pavlov <dpavlov@apache.org> wrote:
> > > > >> >> >
> > > > >> >> > > Hi Andrey,
> > > > >> >> > >
> > > > >> >> > > I've checked this ticket's comments, and there is a TC Bot visa
> > > > >> >> > > (with no blockers).
> > > > >> >> > >
> > > > >> >> > > Do you have any concerns related to this patch?
> > > > >> >> > >
> > > > >> >> > > Sincerely,
> > > > >> >> > > Dmitriy Pavlov
> > > > >> >> > >
> > > > >> >> > > On Thu, 17 Oct 2019 at 16:43, Yuriy Shuliga <shuliga@gmail.com> wrote:
> > > > >> >> > >
> > > > >> >> > >> Andrey,
> > > > >> >> > >>
> > > > >> >> > >> Per your request, I created ticket
> > > > >> >> > >> https://issues.apache.org/jira/browse/IGNITE-12291 linked to
> > > > >> >> > >> https://issues.apache.org/jira/projects/IGNITE/issues/IGNITE-12189
> > > > >> >> > >>
> > > > >> >> > >> Could you please proceed with the PR merge?
> > > > >> >> > >>
> > > > >> >> > >> BR,
> > > > >> >> > >> Yuriy Shuliha
> > > > >> >> > >>
> > > > >> >> > >> On Wed, 9 Oct 2019 at 12:52, Andrey Mashenkov <andrey.mashenkov@gmail.com> wrote:
> > > > >> >> > >>
> > > > >> >> > >> > Hi Yuri,
> > > > >> >> > >> >
> > > > >> >> > >> > To get access to TC Bot you should register as a TeamCity user [1],
> > > > >> >> > >> > if you haven't done so already.
> > > > >> >> > >> > Then you will be able to authorize on the Ignite TC Bot page with
> > > > >> >> > >> > the same credentials.
> > > > >> >> > >> >
> > > > >> >> > >> > [1]  https://ci.ignite.apache.org/registerUser.html
> > > > >> >> > >> >
> > > > >> >> > >> > On Fri, Oct 4, 2019 at 3:10 PM Yuriy Shuliga <shuliga@gmail.com> wrote:
> > > > >> >> > >> >
> > > > >> >> > >> >> Andrew,
> > > > >> >> > >> >>
> > > > >> >> > >> >> I have corrected the PR according to your notes. Please review.
> > > > >> >> > >> >> What will be the next steps in order to merge it?
> > > > >> >> > >> >>
> > > > >> >> > >> >> Y.
> > > > >> >> > >> >>
> > > > >> >> > >> >> On Thu, 3 Oct 2019 at 17:47, Andrey Mashenkov <andrey.mashenkov@gmail.com> wrote:
> > > > >> >> > >> >>
> > > > >> >> > >> >> > Yuri,
> > > > >> >> > >> >> >
> > > > >> >> > >> >> > I'm done with the review.
> > > > >> >> > >> >> > Nothing criminal found, just a trivial compatibility bug.
> > > > >> >> > >> >> >
> > > > >> >> > >> >> > On Thu, Oct 3, 2019 at 3:54 PM Yuriy Shuliga <shuliga@gmail.com> wrote:
> > > > >> >> > >> >> >
> > > > >> >> > >> >> > > Denis,
> > > > >> >> > >> >> > >
> > > > >> >> > >> >> > > Thank you for your attention to this.
> > > > >> >> > >> >> > > As for now, the https://issues.apache.org/jira/browse/IGNITE-12189
> > > > >> >> > >> >> > > ticket is still pending review.
> > > > >> >> > >> >> > > Do we have a chance to move it forward somehow?
> > > > >> >> > >> >> > >
> > > > >> >> > >> >> > > BR,
> > > > >> >> > >> >> > > Yuriy Shuliha
> > > > >> >> > >> >> > >
> > > > >> >> > >> >> > > On Mon, 30 Sep 2019 at 23:35, Denis Magda <dmagda@apache.org> wrote:
> > > > >> >> > >> >> > >
> > > > >> >> > >> >> > > > Yuriy,
> > > > >> >> > >> >> > > >
> > > > >> >> > >> >> > > > I've seen you open a pull request with the first changes:
> > > > >> >> > >> >> > > > https://issues.apache.org/jira/browse/IGNITE-12189
> > > > >> >> > >> >> > > >
> > > > >> >> > >> >> > > > Alex Scherbakov and Ivan, are you the right guys to do the review?
> > > > >> >> > >> >> > > >
> > > > >> >> > >> >> > > > -
> > > > >> >> > >> >> > > > Denis
> > > > >> >> > >> >> > > >
> > > > >> >> > >> >> > > >
> > > > >> >> > >> >> > > > On Fri, Sep 27, 2019 at 8:48 AM Ivan Pavlukhin <vololo100@gmail.com> wrote:
> > > > >> >> > >> >> > > >
> > > > >> >> > >> >> > > > > Yuriy,
> > > > >> >> > >> >> > > > >
> > > > >> >> > >> >> > > > > Thank you for providing details! Quite interesting.
> > > > >> >> > >> >> > > > >
> > > > >> >> > >> >> > > > > Yes, we already have support for distributed limit and for
> > > > >> >> > >> >> > > > > merging sorted subresults for SQL queries. E.g. ReduceIndexSorted
> > > > >> >> > >> >> > > > > and MergeStreamIterator are used for merging sorted streams.
> > > > >> >> > >> >> > > > >
> > > > >> >> > >> >> > > > > Could you please also clarify about score/relevance? Is it
> > > > >> >> > >> >> > > > > provided by the Lucene engine for each query result? I am
> > > > >> >> > >> >> > > > > thinking about how to do a sorted merge properly in this case.
> > > > >> >> > >> >> > > > >
> > > > >> >> > >> >> > > > > On Wed, 25 Sep 2019 at 18:56, Yuriy Shuliga <shuliga@gmail.com> wrote:
> > > > >> >> > >> >> > > > > >
> > > > >> >> > >> >> > > > > > Ivan,
> > > > >> >> > >> >> > > > > >
> > > > >> >> > >> >> > > > > > Thank you for the interesting question!
> > > > >> >> > >> >> > > > > >
> > > > >> >> > >> >> > > > > > Text searches (or full-text searches) are mostly human-oriented,
> > > > >> >> > >> >> > > > > > and the point of the user's interest is the topmost part of the
> > > > >> >> > >> >> > > > > > response. The user can then read it, evaluate it, and use the
> > > > >> >> > >> >> > > > > > given records for further purposes.
> > > > >> >> > >> >> > > > > >
> > > > >> >> > >> >> > > > > > In our case in particular, we use Ignite for operations with
> > > > >> >> > >> >> > > > > > financial data, and there is a lot of text such as asset names,
> > > > >> >> > >> >> > > > > > financial instruments, companies, etc.
> > > > >> >> > >> >> > > > > > In order to operate on this quickly and reliably, users are used
> > > > >> >> > >> >> > > > > > to working with text search, type-ahead completions, and
> > > > >> >> > >> >> > > > > > suggestions.
> > > > >> >> > >> >> > > > > >
> > > > >> >> > >> >> > > > > > For this purpose we are indexing particular string data in
> > > > >> >> > >> >> > > > > > separate caches.
> > > > >> >> > >> >> > > > > >
> > > > >> >> > >> >> > > > > > Sorting capabilities and response size limits are very important
> > > > >> >> > >> >> > > > > > there, as our API has to provide the most relevant information
> > > > >> >> > >> >> > > > > > in a view of limited size.
> > > > >> >> > >> >> > > > > >
> > > > >> >> > >> >> > > > > > Now let me comment from the Ignite/Lucene perspective.
> > > > >> >> > >> >> > > > > > Ignite queries Lucene, and Lucene returns *TopDocs.scoreDocs*
> > > > >> >> > >> >> > > > > > already sorted by *score* (relevance), so the most relevant
> > > > >> >> > >> >> > > > > > documents are at the top.
> > > > >> >> > >> >> > > > > > Currently, the responses to distributed queries from different
> > > > >> >> > >> >> > > > > > nodes are merged into the final query cursor queue in an
> > > > >> >> > >> >> > > > > > arbitrary way.
> > > > >> >> > >> >> > > > > > So in fact the score order is already ruined here. Also, Ignite
> > > > >> >> > >> >> > > > > > requests all possible documents from Lucene, which is redundant
> > > > >> >> > >> >> > > > > > and not good for performance.
> > > > >> >> > >> >> > > > > >
> > > > >> >> > >> >> > > > > > I'm implementing the *limit* parameter as part of *TextQuery*,
> > > > >> >> > >> >> > > > > > and I have to note that we still have to add sorting to text
> > > > >> >> > >> >> > > > > > query processing in order to get usable results.
> > > > >> >> > >> >> > > > > >
> > > > >> >> > >> >> > > > > > The *limit* parameter itself should resolve part of the issues
> > > > >> >> > >> >> > > > > > above, but sorting by document score, at least, should
> > > > >> >> > >> >> > > > > > definitely be implemented along with limit.
> > > > >> >> > >> >> > > > > >
> > > > >> >> > >> >> > > > > > This is a pretty short commentary. If you still have any
> > > > >> >> > >> >> > > > > > questions, please ask, do not hesitate :)
> > > > >> >> > >> >> > > > > >
> > > > >> >> > >> >> > > > > > BR,
> > > > >> >> > >> >> > > > > > Yuriy Shuliha
> > > > >> >> > >> >> > > > > >
> > > > >> >> > >> >> > > > > > On Thu, 19 Sep 2019 at 11:38, Ivan Pavlukhin <vololo100@gmail.com> wrote:
> > > > >> >> > >> >> > > > > >
> > > > >> >> > >> >> > > > > > > Yuriy,
> > > > >> >> > >> >> > > > > > >
> > > > >> >> > >> >> > > > > > > Greatly appreciate your interest.
> > > > >> >> > >> >> > > > > > >
> > > > >> >> > >> >> > > > > > > Could you please elaborate a little bit about sorting? What
> > > > >> >> > >> >> > > > > > > tasks does it help to solve, and how? It would be great if
> > > > >> >> > >> >> > > > > > > you could provide an example.
> > > > >> >> > >> >> > > > > > >
> > > > >> >> > >> >> > > > > > > On Wed, 18 Sep 2019 at 09:39, Alexei Scherbakov <alexey.scherbakoff@gmail.com> wrote:
> > > > >> >> > >> >> > > > > > > >
> > > > >> >> > >> >> > > > > > > > Denis,
> > > > >> >> > >> >> > > > > > > >
> > > > >> >> > >> >> > > > > > > > I like the idea of throwing an exception when text queries
> > > > >> >> > >> >> > > > > > > > are enabled on persistent caches.
> > > > >> >> > >> >> > > > > > > >
> > > > >> >> > >> >> > > > > > > > Also I'm fine with the proposed limit for unsorted searches.
> > > > >> >> > >> >> > > > > > > >
> > > > >> >> > >> >> > > > > > > > Yury, please proceed with ticket creation.
> > > > >> >> > >> >> > > > > > > >
> > > > >> >> > >> >> > > > > > > > On Tue, 17 Sep 2019 at 22:06, Denis Magda <dmagda@apache.org> wrote:
> > > > >> >> > >> >> > > > > > > >
> > > > >> >> > >> >> > > > > > > > > Igniters,
> > > > >> >> > >> >> > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > I see nothing wrong with Yury's proposal regarding the
> > > > >> >> > >> >> > > > > > > > > full-text search API evolution as long as Yury is ready
> > > > >> >> > >> >> > > > > > > > > to push it forward.
> > > > >> >> > >> >> > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > As for the in-memory mode only, it makes total sense for
> > > > >> >> > >> >> > > > > > > > > in-memory data grid deployments where Ignite caches data
> > > > >> >> > >> >> > > > > > > > > of an underlying DB like Postgres.
> > > > >> >> > >> >> > > > > > > > > As part of the changes, I would simply throw an exception
> > > > >> >> > >> >> > > > > > > > > (by default) if one attempts to use text indices with
> > > > >> >> > >> >> > > > > > > > > native persistence enabled.
> > > > >> >> > >> >> > > > > > > > > If the person is ready to live with that limitation, then
> > > > >> >> > >> >> > > > > > > > > an explicit configuration change is needed to get around
> > > > >> >> > >> >> > > > > > > > > the exception.
> > > > >> >> > >> >> > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > Thoughts?
> > > > >> >> > >> >> > > > > > > > >
> > > > >> >> > >> >> > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > -
> > > > >> >> > >> >> > > > > > > > > Denis
> > > > >> >> > >> >> > > > > > > > >
> > > > >> >> > >> >> > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > On Tue, Sep 17, 2019 at 7:44 AM Yuriy Shuliga <shuliga@gmail.com> wrote:
> > > > >> >> > >> >> > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > Hello to all again,
> > > > >> >> > >> >> > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > Thank you for the important comments and notes given below!
> > > > >> >> > >> >> > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > Let me answer and continue the discussion.
> > > > >> >> > >> >> > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > (I) Overall needs in Lucene indexing
> > > > >> >> > >> >> > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > Alexei has referred to
> > > > >> >> > >> >> > > > > > > > > > https://issues.apache.org/jira/browse/IGNITE-5371, where
> > > > >> >> > >> >> > > > > > > > > > the absence of index persistence was declared an obstacle
> > > > >> >> > >> >> > > > > > > > > > to further development.
> > > > >> >> > >> >> > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > a) This ticket is already closed as not valid.
> > > > >> >> > >> >> > > > > > > > > > b) There are definite needs (in our project as well) for
> > > > >> >> > >> >> > > > > > > > > > just in-memory indexing of selected data.
> > > > >> >> > >> >> > > > > > > > > > We intend to use search capabilities for fetching a
> > > > >> >> > >> >> > > > > > > > > > limited number of records to be used in type-ahead
> > > > >> >> > >> >> > > > > > > > > > search / suggestions.
> > > > >> >> > >> >> > > > > > > > > > Not all of the data will be indexed, and there is no need
> > > > >> >> > >> >> > > > > > > > > > for the Lucene index to be persistent. I hope this is a
> > > > >> >> > >> >> > > > > > > > > > common pattern of text-search usage.
> > > > >> >> > >> >> > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > (II) Necessary fixes in the current implementation.
> > > > >> >> > >> >> > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > a) Implementation of a correct *limit* (*offset* seems
> > > > >> >> > >> >> > > > > > > > > > not to be required in text-search tasks for now).
> > > > >> >> > >> >> > > > > > > > > > I have investigated the data flow for distributed text
> > > > >> >> > >> >> > > > > > > > > > queries. It was a simple test prefix query, like
> > > > >> >> > >> >> > > > > > > > > > 'name'*='ene*'*
> > > > >> >> > >> >> > > > > > > > > > For now, each server node returns all response records to
> > > > >> >> > >> >> > > > > > > > > > the client node, and that may amount to thousands or
> > > > >> >> > >> >> > > > > > > > > > hundreds of thousands of records, even if we need only
> > > > >> >> > >> >> > > > > > > > > > the first 10-100. Again, all the results are added to the
> > > > >> >> > >> >> > > > > > > > > > queue in GridCacheQueryFutureAdapter in arbitrary order,
> > > > >> >> > >> >> > > > > > > > > > page by page.
> > > > >> >> > >> >> > > > > > > > > > I did not find any means here to deliver a deterministic
> > > > >> >> > >> >> > > > > > > > > > result.
> > > > >> >> > >> >> > > > > > > > > > So implementing limit as part of the query and
> > > > >> >> > >> >> > > > > > > > > > (GridCacheQueryRequest) will not change the nature of the
> > > > >> >> > >> >> > > > > > > > > > response, but will limit the load on nodes and networking.
> > > > >> >> > >> >> > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > Can we consider opening a ticket for this?
> > > > >> >> > >> >> > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > (III) Further extension of Lucene API exposure in Ignite
> > > > >> >> > >> >> > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > a) Sorting
> > > > >> >> > >> >> > > > > > > > > > The solution for this could be:
> > > > >> >> > >> >> > > > > > > > > > - Make entities comparable
> > > > >> >> > >> >> > > > > > > > > > - Add a custom comparator to the entity
> > > > >> >> > >> >> > > > > > > > > > - Add annotations to mark sorted fields for Lucene indexing
> > > > >> >> > >> >> > > > > > > > > > - Use comparators when merging responses or reducing to
> > > > >> >> > >> >> > > > > > > > > > the desired limit on the client node.
> > > > >> >> > >> >> > > > > > > > > > This will require the full result set to be loaded into
> > > > >> >> > >> >> > > > > > > > > > memory, though it can be used for relatively small limits.
> > > > >> >> > >> >> > > > > > > > > > BR,
> > > > >> >> > >> >> > > > > > > > > > Yuriy Shuliha
> > > > >> >> > >> >> > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > On Fri, 30 Aug 2019 at 10:37, Alexei Scherbakov <alexey.scherbakoff@gmail.com> wrote:
> > > > >> >> > >> >> > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > Yuriy,
> > > > >> >> > >> >> > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > Note that one of the major blockers for text queries is
> > > > >> >> > >> >> > > > > > > > > > > [1], which makes Lucene indexes unusable with
> > > > >> >> > >> >> > > > > > > > > > > persistence and is the main reason for discontinuation.
> > > > >> >> > >> >> > > > > > > > > > > It should probably be addressed first to make text
> > > > >> >> > >> >> > > > > > > > > > > queries a valid product feature.
> > > > >> >> > >> >> > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > Distributed sorting and advanced querying is indeed not
> > > > >> >> > >> >> > > > > > > > > > > a trivial task.
> > > > >> >> > >> >> > > > > > > > > > > Some kind of merging must be implemented on the
> > > > >> >> > >> >> > > > > > > > > > > query-originating node.
> > > > >> >> > >> >> > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-5371
> > > > >> >> > >> >> > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > On Thu, 29 Aug 2019 at 23:38, Denis Magda <dmagda@apache.org> wrote:
> > > > >> >> > >> >> > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > Yuriy,
> > > > >> >> > >> >> > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > If you are ready to take over the full-text search
> > > > >> >> > >> >> > > > > > > > > > > > indexes, then please go ahead. The primary reasons
> > > > >> >> > >> >> > > > > > > > > > > > why the community wants to discontinue them first
> > > > >> >> > >> >> > > > > > > > > > > > (and, probably, resurrect them later) are the
> > > > >> >> > >> >> > > > > > > > > > > > limitations listed by Andrey and minimal support
> > > > >> >> > >> >> > > > > > > > > > > > from the community end.
> > > > >> >> > >> >> > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > -
> > > > >> >> > >> >> > > > > > > > > > > > Denis
> > > > >> >> > >> >> > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > On Thu, Aug 29, 2019 at 1:29 PM Andrey Mashenkov <andrey.mashenkov@gmail.com> wrote:
> > > > >> >> > >> >> > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > > Hi Yuriy,
> > > > >> >> > >> >> > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > > Unfortunately, there is a plan to discontinue
> > > > >> >> > >> >> > > > > > > > > > > > > TextQueries in Ignite [1].
> > > > >> >> > >> >> > > > > > > > > > > > > The motivation here is that text indexes are not
> > > > >> >> > >> >> > > > > > > > > > > > > persistent, not transactional, and can't be used
> > > > >> >> > >> >> > > > > > > > > > > > > together with SQL or inside SQL,
> > > > >> >> > >> >> > > > > > > > > > > > > and there is a lack of interest from the
> > > > >> >> > >> >> > > > > > > > > > > > > community side.
> > > > >> >> > >> >> > > > > > > > > > > > > You are welcome to take on these issues and make
> > > > >> >> > >> >> > > > > > > > > > > > > TextQueries great.
> > > > >> >> > >> >> > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > > 1. PageSize can't be used to limit the result set.
> > > > >> >> > >> >> > > > > > > > > > > > > Query results are returned from the data node to
> > > > >> >> > >> >> > > > > > > > > > > > > the client-side cursor page by page, and this
> > > > >> >> > >> >> > > > > > > > > > > > > parameter is designed to control the page size.
> > > > >> >> > >> >> > > > > > > > > > > > > It is assumed that the query executes lazily on
> > > > >> >> > >> >> > > > > > > > > > > > > the server side, and the full result set is not
> > > > >> >> > >> >> > > > > > > > > > > > > expected to be loaded into memory on the server
> > > > >> >> > >> >> > > > > > > > > > > > > side at once, but by pages.
> > > > >> >> > >> >> > > > > > > > > > > > > Do you mean you found that Lucene loads the entire
> > > > >> >> > >> >> > > > > > > > > > > > > result set into memory before the first page is
> > > > >> >> > >> >> > > > > > > > > > > > > sent to the client?
> > > > >> >> > >> >> > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > > > > I think a new parameter should be added to
> > > > >> >> > >> >> > > > > > > > > > > > > > > limit the result. The best solution is to
> > > > >> >> > >> >> > > > > > > > > > > > > > > use query language commands for this, e.g.
> > > > >> >> > >> >> > > > > > > > > > > > > > > "LIMIT/OFFSET" in SQL.
> > > > >> >> > >> >> > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > > > > This task doesn't look trivial. A query is a
> > > > >> >> > >> >> > > > > > > > > > > > > > > distributed operation: the same user query
> > > > >> >> > >> >> > > > > > > > > > > > > > > is executed on the data nodes, and then the
> > > > >> >> > >> >> > > > > > > > > > > > > > > results from all nodes must be correctly
> > > > >> >> > >> >> > > > > > > > > > > > > > > merged before being returned via the client
> > > > >> >> > >> >> > > > > > > > > > > > > > > cursor.
> > > > >> >> > >> >> > > > > > > > > > > > > > > So LIMIT should be applied on every node and
> > > > >> >> > >> >> > > > > > > > > > > > > > > then again in the merge phase.
> > > > >> >> > >> >> > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > > > > Also, this may be non-obvious: limiting
> > > > >> >> > >> >> > > > > > > > > > > > > > > results makes no sense without sorting, as
> > > > >> >> > >> >> > > > > > > > > > > > > > > there is no guarantee that every subsequent
> > > > >> >> > >> >> > > > > > > > > > > > > > > query run will return the same data, because
> > > > >> >> > >> >> > > > > > > > > > > > > > > of page reordering.
> > > > >> >> > >> >> > > > > > > > > > > > > > > Basically, the merge phase receives results
> > > > >> >> > >> >> > > > > > > > > > > > > > > from data nodes asynchronously, and messages
> > > > >> >> > >> >> > > > > > > > > > > > > > > from different nodes can't be ordered.
> > > > >> >> > >> >> > > > > > > > > > > > >
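The two-phase LIMIT described above (apply the limit on every node, then again while merging) can be illustrated with a small, self-contained sketch. It is not Ignite code: the class `TextQueryReduceSketch`, the `Hit` type and the `reduce` method are hypothetical names, and the sketch assumes every data node returns its hits already sorted by score in descending order and already cut to the limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Illustration only: merges per-node top hits into a global top-N. */
public class TextQueryReduceSketch {
    /** A scored hit as a data node might report it (hypothetical type). */
    public static final class Hit {
        public final String key;
        public final float score;

        public Hit(String key, float score) {
            this.key = key;
            this.score = score;
        }
    }

    /** Read position inside one node's (already sorted) hit list. */
    private static final class Cursor {
        final List<Hit> hits;
        int pos;

        Cursor(List<Hit> hits) {
            this.hits = hits;
        }

        Hit head() {
            return hits.get(pos);
        }
    }

    /**
     * K-way merge of per-node results.
     * Assumption: each inner list is sorted by score, highest first.
     */
    public static List<Hit> reduce(List<List<Hit>> perNode, int limit) {
        // Max-heap ordered by the score of each cursor's current head.
        PriorityQueue<Cursor> heap = new PriorityQueue<>(
            (a, b) -> Float.compare(b.head().score, a.head().score));

        for (List<Hit> hits : perNode) {
            if (!hits.isEmpty())
                heap.add(new Cursor(hits));
        }

        List<Hit> merged = new ArrayList<>();

        // Second application of the limit: stop as soon as N hits are merged.
        while (merged.size() < limit && !heap.isEmpty()) {
            Cursor c = heap.poll();
            merged.add(c.head());

            if (++c.pos < c.hits.size())
                heap.add(c); // Re-queue the cursor with its next hit as head.
        }

        return merged;
    }
}
```

Because each node has already applied the limit, the reducer never holds more than (number of nodes x limit) hits, and the result does not depend on the order in which node responses arrive.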
> > > > >> >> > >> >> > > > > > > > > > > > > > > 2.
> > > > >> >> > >> >> > > > > > > > > > > > > > > a. The "tokenize" param name (for
> > > > >> >> > >> >> > > > > > > > > > > > > > > @QueryTextField) looks more verbose,
> > > > >> >> > >> >> > > > > > > > > > > > > > > doesn't it?
> > > > >> >> > >> >> > > > > > > > > > > > > > > b, c. What about distributed queries? How
> > > > >> >> > >> >> > > > > > > > > > > > > > > will partial results from nodes be merged?
> > > > >> >> > >> >> > > > > > > > > > > > > > > Does Lucene allow configuring a comparator
> > > > >> >> > >> >> > > > > > > > > > > > > > > for data sorting?
> > > > >> >> > >> >> > > > > > > > > > > > > > > Which comparator should Ignite choose to
> > > > >> >> > >> >> > > > > > > > > > > > > > > sort results in the merge phase?
> > > > >> >> > >> >> > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > > > > 3. For now, the Lucene engine is not
> > > > >> >> > >> >> > > > > > > > > > > > > > > configurable at all. E.g. it is impossible
> > > > >> >> > >> >> > > > > > > > > > > > > > > to configure the Tokenizer.
> > > > >> >> > >> >> > > > > > > > > > > > > > > I'd think about possible ways to configure
> > > > >> >> > >> >> > > > > > > > > > > > > > > the engine first, and only then go further
> > > > >> >> > >> >> > > > > > > > > > > > > > > to discuss/implement complex features that
> > > > >> >> > >> >> > > > > > > > > > > > > > > may depend on the engine config.
> > > > >> >> > >> >> > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > > > > On Thu, Aug 29, 2019 at 8:17 PM Yuriy Shuliga
> > > > >> >> > >> >> > > > > > > > > > > > > > > <shuliga@gmail.com> wrote:
> > > > >> >> > >> >> > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > > > Dear community,
> > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > > > > > By starting this chain I'd like to open a
> > > > >> >> > >> >> > > > > > > > > > > > > > > > discussion that would lead to contributions
> > > > >> >> > >> >> > > > > > > > > > > > > > > > in the subj. area.
> > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > > > > > Ignite has indexing capabilities backed by
> > > > >> >> > >> >> > > > > > > > > > > > > > > > different mechanisms, including Lucene.
> > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > > > > > Currently, Lucene 7.5.0 is used (last year's
> > > > >> >> > >> >> > > > > > > > > > > > > > > > release).
> > > > >> >> > >> >> > > > > > > > > > > > > > > > This is a widespread and mature technology
> > > > >> >> > >> >> > > > > > > > > > > > > > > > that covers the text search area and beyond
> > > > >> >> > >> >> > > > > > > > > > > > > > > > (e.g. spatial data indexing).
> > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > > > > > My goal is to *expose more Lucene
> > > > >> >> > >> >> > > > > > > > > > > > > > > > functionality to Ignite indexing and query
> > > > >> >> > >> >> > > > > > > > > > > > > > > > mechanisms for text data*.
> > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > > > > > It's quite a simple request at the current
> > > > >> >> > >> >> > > > > > > > > > > > > > > > stage. It comes from our project's needs
> > > > >> >> > >> >> > > > > > > > > > > > > > > > but, I believe, will be useful for a lot
> > > > >> >> > >> >> > > > > > > > > > > > > > > > more people.
> > > > >> >> > >> >> > > > > > > > > > > > > > > > Let's walk through the items and vote on or
> > > > >> >> > >> >> > > > > > > > > > > > > > > > discuss Jira tickets for them.
> > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > > > > > 1. [trivial] Use dataQuery.getPageSize() to
> > > > >> >> > >> >> > > > > > > > > > > > > > > > limit search response items inside
> > > > >> >> > >> >> > > > > > > > > > > > > > > > GridLuceneIndex.query(). Currently it calls
> > > > >> >> > >> >> > > > > > > > > > > > > > > > IndexSearcher.search(query,
> > > > >> >> > >> >> > > > > > > > > > > > > > > > *Integer.MAX_VALUE*), so basically all
> > > > >> >> > >> >> > > > > > > > > > > > > > > > scored matches will be returned, which we
> > > > >> >> > >> >> > > > > > > > > > > > > > > > do not need in most cases.
> > > > >> >> > >> >> > > > > > > > > > > > > >
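To illustrate why a bounded hit count matters for item 1, here is a minimal sketch of top-N collection with a min-heap. It does not use Lucene, and `TopNCollectorSketch`, `ScoredDoc`, `collect` and `topDocs` are hypothetical names chosen for the example; the only point is that memory stays proportional to the limit rather than to the number of matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Illustration only: keeps the N best-scored documents seen so far. */
public class TopNCollectorSketch {
    /** Document id plus relevance score (hypothetical type). */
    public static final class ScoredDoc {
        public final int doc;
        public final float score;

        public ScoredDoc(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    private final int limit;

    /** Min-heap: the weakest of the retained hits is always on top. */
    private final PriorityQueue<ScoredDoc> heap =
        new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));

    public TopNCollectorSketch(int limit) {
        if (limit <= 0)
            throw new IllegalArgumentException("limit must be positive");

        this.limit = limit;
    }

    /** Offers one match; it is kept only if it belongs to the top N. */
    public void collect(int doc, float score) {
        if (heap.size() < limit)
            heap.add(new ScoredDoc(doc, score));
        else if (score > heap.peek().score) {
            heap.poll(); // Evict the current weakest hit.
            heap.add(new ScoredDoc(doc, score));
        }
    }

    /** Returns the retained hits, best score first. */
    public List<ScoredDoc> topDocs() {
        List<ScoredDoc> out = new ArrayList<>(heap);
        out.sort((a, b) -> Float.compare(b.score, a.score));
        return out;
    }
}
```

With a limit equal to the page size, the collector holds at most that many entries no matter how many documents match, whereas an effectively unbounded limit such as Integer.MAX_VALUE retains every match.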
> > > > >> >> > >> >> > > > > > > > > > > > > > > > 2. [simple] Add sorting. Then a more capable
> > > > >> >> > >> >> > > > > > > > > > > > > > > > search call can be executed:
> > > > >> >> > >> >> > > > > > > > > > > > > > > > *IndexSearcher.search(query, count, sort)*
> > > > >> >> > >> >> > > > > > > > > > > > > > > > Implementation steps:
> > > > >> >> > >> >> > > > > > > > > > > > > > > > a) Introduce a boolean *sortField* parameter
> > > > >> >> > >> >> > > > > > > > > > > > > > > > in the *@QueryTextField* annotation. If
> > > > >> >> > >> >> > > > > > > > > > > > > > > > *true*, the field will be indexed but not
> > > > >> >> > >> >> > > > > > > > > > > > > > > > tokenized. Number types are preferred here.
> > > > >> >> > >> >> > > > > > > > > > > > > > > > b) Add a *sort* collection to the
> > > > >> >> > >> >> > > > > > > > > > > > > > > > *TextQuery* constructor. It should define
> > > > >> >> > >> >> > > > > > > > > > > > > > > > the desired sort fields used for querying.
> > > > >> >> > >> >> > > > > > > > > > > > > > > > c) Implement Lucene sort usage in
> > > > >> >> > >> >> > > > > > > > > > > > > > > > GridLuceneIndex.query().
> > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > > > > > 3. [moderate] Build complex queries with
> > > > >> >> > >> >> > > > > > > > > > > > > > > > *TextQuery*, including terms/queries
> > > > >> >> > >> >> > > > > > > > > > > > > > > > boosting.
> > > > >> >> > >> >> > > > > > > > > > > > > > > > *This section is for voting only, as it
> > > > >> >> > >> >> > > > > > > > > > > > > > > > requires more detailed work. It should be
> > > > >> >> > >> >> > > > > > > > > > > > > > > > extended if the community is interested in
> > > > >> >> > >> >> > > > > > > > > > > > > > > > it.*
> > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > > > > > Looking forward to your comments!
> > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > > > BR,
> > > > >> >> > >> >> > > > > > > > > > > > > > Yuriy Shuliha
> > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > > > --
> > > > >> >> > >> >> > > > > > > > > > > > > Best regards,
> > > > >> >> > >> >> > > > > > > > > > > > > Andrey V. Mashenkov
> > > > >> >> > >> >> > > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > --
> > > > >> >> > >> >> > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > > > Best regards,
> > > > >> >> > >> >> > > > > > > > > > > Alexei Scherbakov
> > > > >> >> > >> >> > > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > > >
> > > > >> >> > >> >> > > > > > > > >
> > > > >> >> > >> >> > > > > > >
> > > > >> >> > >> >> > > > > > >
> > > > >> >> > >> >> > > > > > >
> > > > >> >> > >> >> > > > > > > --
> > > > >> >> > >> >> > > > > > > Best regards,
> > > > >> >> > >> >> > > > > > > Ivan Pavlukhin
> > > > >> >> > >> >> > > > > > >
> > > > >> >> > >> >> > > > >
> > > > >> >> > >> >> > > > >
> > > > >> >> > >> >> > > > >
> > > > >> >> > >> >> > > > > --
> > > > >> >> > >> >> > > > > Best regards,
> > > > >> >> > >> >> > > > > Ivan Pavlukhin
> > > > >> >> > >> >> > > > >
> > > > >> >> > >> >> > > >
> > > > >> >> > >> >> > >
> > > > >> >> > >> >> >
> > > > >> >> > >> >> >
> > > > >> >> > >> >> > --
> > > > >> >> > >> >> > Best regards,
> > > > >> >> > >> >> > Andrey V. Mashenkov
> > > > >> >> > >> >> >
> > > > >> >> > >> >>
> > > > >> >> > >> >
> > > > >> >> > >> >
> > > > >> >> > >> > --
> > > > >> >> > >> > Best regards,
> > > > >> >> > >> > Andrey V. Mashenkov
> > > > >> >> > >> >
> > > > >> >> > >>
> > > > >> >> > >
> > > > >> >> >
> > > > >> >> > --
> > > > >> >> > Best regards,
> > > > >> >> > Andrey V. Mashenkov
> > > > >> >> >
> > > > >> >>
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >
> > > >
> > > >
> > > >
> > > >
> >
> >
> >
> > --
> > Best regards,
> > Ivan Pavlukhin
> >
> >



-- 
Best regards,
Ivan Pavlukhin

