# ignite-dev mailing list archives

From Yuriy Shuliga <shul...@gmail.com>
Subject Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextField)
Date Thu, 28 Nov 2019 13:54:18 GMT
Nice to hear, Ivan

It is good practice to present an extension of existing functionality
properly, as we expect of Text Queries.
Let's make it work correctly first.

I'm OK to prepare a ticket for adding reduction of sorted responses to
GridCacheDistributedQueryFuture or nearby.
Also, the TextQuery response entity will be extended to carry Lucene's
'docScore' per record.
No open questions are left then.
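
A minimal sketch of the kind of reduction meant here, assuming each node
returns its records already sorted by descending score. The names
ScoreMergeSketch, ScoredEntry and mergeByScore are hypothetical and are not
part of the Ignite code base; a real change would live in or near
GridCacheDistributedQueryFuture and use Ignite's own response types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch, not Ignite code: k-way merge of per-node results by score. */
final class ScoreMergeSketch {
    /** Hypothetical response record: cache key plus the Lucene score reported by the node. */
    static final class ScoredEntry {
        final String key;
        final float score;

        ScoredEntry(String key, float score) {
            this.key = key;
            this.score = score;
        }
    }

    /** Read position inside one node's response, assumed sorted by descending score. */
    private static final class Cursor {
        private final List<ScoredEntry> rows;
        private int pos;

        Cursor(List<ScoredEntry> rows) {
            this.rows = rows;
        }

        ScoredEntry head() {
            return rows.get(pos);
        }

        /** Moves to the next row; returns false when this node's response is exhausted. */
        boolean advance() {
            return ++pos < rows.size();
        }
    }

    /** Merges per-node responses into at most {@code limit} entries, best score first. */
    static List<ScoredEntry> mergeByScore(List<List<ScoredEntry>> perNode, int limit) {
        // Max-heap keyed on the score at the head of each node's response.
        PriorityQueue<Cursor> heap = new PriorityQueue<>(
            Comparator.comparingDouble((Cursor c) -> c.head().score).reversed());

        for (List<ScoredEntry> rows : perNode) {
            if (!rows.isEmpty())
                heap.add(new Cursor(rows));
        }

        List<ScoredEntry> merged = new ArrayList<>();

        while (!heap.isEmpty() && merged.size() < limit) {
            Cursor best = heap.poll();
            merged.add(best.head());

            // Re-insert the cursor so that its next row competes with the other nodes.
            if (best.advance())
                heap.add(best);
        }

        return merged;
    }

    public static void main(String[] args) {
        List<ScoredEntry> nodeA = Arrays.asList(new ScoredEntry("a1", 0.9f), new ScoredEntry("a2", 0.4f));
        List<ScoredEntry> nodeB = Arrays.asList(new ScoredEntry("b1", 0.7f), new ScoredEntry("b2", 0.6f));

        for (ScoredEntry e : mergeByScore(Arrays.asList(nodeA, nodeB), 3))
            System.out.println(e.key + " " + e.score);
    }
}
```

With the two sample node responses in main, the keys come out as a1, b1, b2,
that is, in descending score order regardless of the node they came from.
Stopping at the limit means the reducer does not have to drain every node's
full response.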

BR,
Yuriy Shuliha

On Thu, 28 Nov 2019 at 15:23, Ivan Pavlukhin <vololo100@gmail.com> wrote:

> Folks, Yuriy,
>
> I suppose that we are going to proceed with
>
> >>>
> Reducing on Ignite
>
> The obvious point of distributed response reduction is the class
> GridCacheDistributedQueryFuture.
> However, @Ivan Pavlukhin mentioned a class with similar functionality:
> ReduceIndexSorted.
> What I see here is that it is tangled with H2-related classes
> (org.h2.result.Row) and might not be unified with TextQuery reduction.
> >>
>
> I have no strong opinion that we should unify
> reduction. Having a separate reduction implementation for text queries
> sounds like a reasonable option to me as well.
>
> Are there still any open questions?
>
> On Wed, 27 Nov 2019 at 02:27, Denis Magda <dmagda@apache.org> wrote:
> >
> > I don't see anything wrong if Yuriy is willing to carry on and keep
> > enhancing our full-text search support that lacks basic capabilities.
> >
> > The basics should be available. If anybody needs an advanced feature, they
> > can introduce Solr or Elasticsearch into the final architecture of the app.
> >
> > Folks, who of us can help Yuriy with the questions asked? Most likely the SQL
> > experts are the best candidates here.
> >
> >
> > -
> > Denis
> >
> >
> > On Tue, Nov 26, 2019 at 8:52 AM Ivan Pavlukhin <vololo100@gmail.com> wrote:
> >
> > > Folks,
> > >
> > > An IEP is an Ignite-specific thing. In fact, I suppose that we are
> > > already doing it the ASF way by having this dev-list discussion =)
> > >
> > > As for me, implementing the "limit" feature for text queries is not big
> > > enough to warrant an IEP. But we might need to create one for the next features.
> > >
> > > On Tue, 26 Nov 2019 at 15:06, Ilya Kasnacheev <ilya.kasnacheev@gmail.com> wrote:
> > > >
> > > > Hello!
> > > >
> > > > The ASF way should probably start with an IEP :)
> > > >
> > > > Regards,
> > > > --
> > > > Ilya Kasnacheev
> > > >
> > > >
> > > > On Tue, 26 Nov 2019 at 14:12, Zhenya Stanilovsky <arzamas123@mail.ru.invalid> wrote:
> > > >
> > > > >
> > > > > OK, let's forget Solr and go the ASF way. If Yuriy proves this
> > > > > functionality is helpful and opens a PR for it, why not?
> > > > >
> > > > > Isn't it?
> > > > >
> > > > > >On Tuesday, 26 November 2019, 14:06 +03:00, Ilya Kasnacheev <ilya.kasnacheev@gmail.com> wrote:
> > > > > >
> > > > > >Hello!
> > > > > >
> > > > > >The problem here is that Solr is a multi-year effort by a lot of people. We
> > > > > >can't match that.
> > > > > >
> > > > > >Maybe we could integrate with Solr/Solr Cloud instead, by feeding our cache
> > > > > >information into their storage for indexing and relying on their own
> > > > > >mechanisms for distributed IR sorting?
> > > > > >
> > > > > >Regards,
> > > > > >--
> > > > > >Ilya Kasnacheev
> > > > > >
> > > > > >
> > > > > >On Tue, 26 Nov 2019 at 13:59, Zhenya Stanilovsky <arzamas123@mail.ru.invalid> wrote:
> > > > > >
> > > > > >>
> > > > > >> Ilya Kasnacheev, what is the problem with Solr-like functionality in Ignite?
> > > > > >>
> > > > > >> Thanks!
> > > > > >>
> > > > > >> >On Tuesday, 26 November 2019, 13:50 +03:00, Ilya Kasnacheev <ilya.kasnacheev@gmail.com> wrote:
> > > > > >> >
> > > > > >> >Hello!
> > > > > >> >
> > > > > >> >I have a hunch that we are trying to build Apache Solr (or Solr Cloud) into
> > > > > >> >Apache Ignite. I think that's a lot of effort that is not very justified.
> > > > > >> >
> > > > > >> >I don't think we should try to implement sorting in Apache Ignite, because
> > > > > >> >it is a lot of work, and a lot of code in our code base which we don't
> > > > > >> >really want.
> > > > > >> >
> > > > > >> >Regards,
> > > > > >> >--
> > > > > >> >Ilya Kasnacheev
> > > > > >> >
> > > > > >> >
> > > > > >> >On Fri, 22 Nov 2019 at 20:59, Yuriy Shuliga <shuliga@gmail.com> wrote:
> > > > > >> >
> > > > > >> >> Dear Igniters,
> > > > > >> >>
> > > > > >> >> The first part of the TextQuery improvement, a result limit, was developed
> > > > > >> >> and merged.
> > > > > >> >> Now we have to develop the most important functionality here: proper sorting
> > > > > >> >> of Lucene index responses and correct reduction of them for distributed
> > > > > >> >> queries.
> > > > > >> >>
> > > > > >> >> *There are two Lucene-based aspects*
> > > > > >> >>
> > > > > >> >> 1. When no sorting fields are used, the documents in the response are
> > > > > >> >> still ordered by relevance.
> > > > > >> >> Actually, this is the ScoreDoc.score value.
> > > > > >> >> In order to reduce the distributed results correctly, the score should
> > > > > >> >> be passed with the response.
> > > > > >> >>
> > > > > >> >> 2. When sorting by conventional fields, Lucene should have these
> > > > > >> >> fields properly indexed, and the
> > > > > >> >> corresponding Sort object should be applied to Lucene's search call.
> > > > > >> >> In order to mark those fields, a new annotation like '@SortField' may
> > > > > >> >> be introduced.
> > > > > >> >>
> > > > > >> >> *Reducing on Ignite*
> > > > > >> >>
> > > > > >> >> The obvious point of distributed response reduction is the class
> > > > > >> >> GridCacheDistributedQueryFuture.
> > > > > >> >> However, @Ivan Pavlukhin mentioned a class with similar functionality:
> > > > > >> >> ReduceIndexSorted.
> > > > > >> >> What I see here is that it is tangled with H2-related classes
> > > > > >> >> (org.h2.result.Row) and might not be unified with TextQuery reduction.
> > > > > >> >>
> > > > > >> >> I still need support here.
> > > > > >> >>
> > > > > >> >> Overall, the goal of this letter is to initiate a discussion on the TextQuery
> > > > > >> >> sorting implementation and come closer to ticket creation.
> > > > > >> >>
> > > > > >> >> BR,
> > > > > >> >> Yuriy Shuliha
> > > > > >> >>
> > > > > >> >> On Tue, 22 Oct 2019 at 13:31, Andrey Mashenkov <andrey.mashenkov@gmail.com> wrote:
> > > > > >> >>
> > > > > >> >> > Hi Dmitry, Yuriy.
> > > > > >> >> >
> > > > > >> >> > I've found that GridCacheQueryFutureAdapter has a newly added AtomicInteger
> > > > > >> >> > 'total' field and a 'limit' field as a primitive int.
> > > > > >> >> >
> > > > > >> >> > Both fields are used inside a synchronized block only.
> > > > > >> >> > So, we can make both private and downgrade the AtomicInteger to a primitive int.
> > > > > >> >> >
> > > > > >> >> > Most likely, these fields can be replaced with one field.
> > > > > >> >> >
> > > > > >> >> >
> > > > > >> >> >
> > > > > >> >> > On Mon, Oct 21, 2019 at 10:01 PM Dmitriy Pavlov <dpavlov@apache.org> wrote:
> > > > > >> >> >
> > > > > >> >> > > Hi Andrey,
> > > > > >> >> > >
> > > > > >> >> > > I've checked this ticket's comments, and there is a TC Bot visa (with no
> > > > > >> >> > > blockers).
> > > > > >> >> > >
> > > > > >> >> > > Do you have any concerns related to this patch?
> > > > > >> >> > >
> > > > > >> >> > > Sincerely,
> > > > > >> >> > > Dmitriy Pavlov
> > > > > >> >> > >
> > > > > >> >> > > On Thu, 17 Oct 2019 at 16:43, Yuriy Shuliga <shuliga@gmail.com> wrote:
> > > > > >> >> > >
> > > > > >> >> > >> Andrey,
> > > > > >> >> > >>
> > > > > >> >> > >> Per your request, I created ticket
> > > > > >> >> > >> https://issues.apache.org/jira/browse/IGNITE-12291 to
> > > > > >> >> > >> https://issues.apache.org/jira/projects/IGNITE/issues/IGNITE-12189
> > > > > >> >> > >>
> > > > > >> >> > >> Could you please proceed with the PR merge?
> > > > > >> >> > >>
> > > > > >> >> > >> BR,
> > > > > >> >> > >> Yuriy Shuliha
> > > > > >> >> > >>
> > > > > >> >> > >> On Wed, 9 Oct 2019 at 12:52, Andrey Mashenkov <andrey.mashenkov@gmail.com> wrote:
> > > > > >> >> > >>
> > > > > >> >> > >> > Hi Yuri,
> > > > > >> >> > >> >
> > > > > >> >> > >> > To get access to TC Bot you should register as a TeamCity user [1], if you
> > > > > >> >> > >> > didn't do this already.
> > > > > >> >> > >> > Then you will be able to authorize on the Ignite TC Bot page with the same
> > > > > >> >> > >> > credentials.
> > > > > >> >> > >> >
> > > > > >> >> > >> > [1]  https://ci.ignite.apache.org/registerUser.html
> > > > > >> >> > >> >
> > > > > >> >> > >> > On Fri, Oct 4, 2019 at 3:10 PM Yuriy Shuliga <shuliga@gmail.com> wrote:
> > > > > >> >> > >> >
> > > > > >> >> > >> >> Andrew,
> > > > > >> >> > >> >>
> > > > > >> >> > >> >> I have corrected the PR according to your notes. Please review.
> > > > > >> >> > >> >> What are the next steps to get it merged?
> > > > > >> >> > >> >>
> > > > > >> >> > >> >> Y.
> > > > > >> >> > >> >>
> > > > > >> >> > >> >> On Thu, 3 Oct 2019 at 17:47, Andrey Mashenkov <andrey.mashenkov@gmail.com> wrote:
> > > > > >> >> > >> >>
> > > > > >> >> > >> >> > Yuri,
> > > > > >> >> > >> >> >
> > > > > >> >> > >> >> > I'm done with the review.
> > > > > >> >> > >> >> > No crime found, only a trivial compatibility bug.
> > > > > >> >> > >> >> >
> > > > > >> >> > >> >> > On Thu, Oct 3, 2019 at 3:54 PM Yuriy Shuliga <shuliga@gmail.com> wrote:
> > > > > >> >> > >> >> >
> > > > > >> >> > >> >> > > Denis,
> > > > > >> >> > >> >> > >
> > > > > >> >> > >> >> > > Thank you for your attention to this.
> > > > > >> >> > >> >> > > As of now, the https://issues.apache.org/jira/browse/IGNITE-12189 ticket
> > > > > >> >> > >> >> > > is still pending review.
> > > > > >> >> > >> >> > > Do we have a chance to move it forward somehow?
> > > > > >> >> > >> >> > >
> > > > > >> >> > >> >> > > BR,
> > > > > >> >> > >> >> > > Yuriy Shuliha
> > > > > >> >> > >> >> > >
> > > > > >> >> > >> >> > > On Mon, 30 Sep 2019 at 23:35, Denis Magda <dmagda@apache.org> wrote:
> > > > > >> >> > >> >> > >
> > > > > >> >> > >> >> > > > Yuriy,
> > > > > >> >> > >> >> > > >
> > > > > >> >> > >> >> > > > I've seen you opening a pull request with the first changes:
> > > > > >> >> > >> >> > > > https://issues.apache.org/jira/browse/IGNITE-12189
> > > > > >> >> > >> >> > > >
> > > > > >> >> > >> >> > > > Alex Scherbakov and Ivan, are you the right guys to do the review?
> > > > > >> >> > >> >> > > >
> > > > > >> >> > >> >> > > > -
> > > > > >> >> > >> >> > > > Denis
> > > > > >> >> > >> >> > > >
> > > > > >> >> > >> >> > > >
> > > > > >> >> > >> >> > > > On Fri, Sep 27, 2019 at 8:48 AM Павлухин Иван <vololo100@gmail.com> wrote:
> > > > > >> >> > >> >> > > >
> > > > > >> >> > >> >> > > > > Yuriy,
> > > > > >> >> > >> >> > > > >
> > > > > >> >> > >> >> > > > > Thank you for providing details! Quite interesting.
> > > > > >> >> > >> >> > > > >
> > > > > >> >> > >> >> > > > > Yes, we already have support for distributed limit and merging sorted
> > > > > >> >> > >> >> > > > > subresults for SQL queries. E.g. ReduceIndexSorted and
> > > > > >> >> > >> >> > > > > MergeStreamIterator are used for merging sorted streams.
> > > > > >> >> > >> >> > > > >
> > > > > >> >> > >> >> > > > > Could you please also clarify about score/relevance? Is it provided by
> > > > > >> >> > >> >> > > > > the Lucene engine for each query result? I am thinking about how to do
> > > > > >> >> > >> >> > > > > a sorted merge properly in this case.
> > > > > >> >> > >> >> > > > >
> > > > > >> >> > >> >> > > > > On Wed, 25 Sep 2019 at 18:56, Yuriy Shuliga <shuliga@gmail.com> wrote:
> > > > > >> >> > >> >> > > > > >
> > > > > >> >> > >> >> > > > > > Ivan,
> > > > > >> >> > >> >> > > > > >
> > > > > >> >> > >> >> > > > > > Thank you for the interesting question!
> > > > > >> >> > >> >> > > > > >
> > > > > >> >> > >> >> > > > > > Text searches (or full-text searches) are mostly human-oriented, and the
> > > > > >> >> > >> >> > > > > > point of the user's interest is the topmost part of the response.
> > > > > >> >> > >> >> > > > > > The user can then read it, evaluate it and use the given records for
> > > > > >> >> > >> >> > > > > > further purposes.
> > > > > >> >> > >> >> > > > > >
> > > > > >> >> > >> >> > > > > > Particularly in our case, we use Ignite for operations with financial
> > > > > >> >> > >> >> > > > > > data, and there is a lot of text such as asset names, financial
> > > > > >> >> > >> >> > > > > > instruments, companies, etc.
> > > > > >> >> > >> >> > > > > > In order to operate with this quickly and reliably, users are used to
> > > > > >> >> > >> >> > > > > > working with text search, type-ahead completions and suggestions.
> > > > > >> >> > >> >> > > > > >
> > > > > >> >> > >> >> > > > > > For this purpose we are indexing particular string data in separate
> > > > > >> >> > >> >> > > > > > caches.
> > > > > >> >> > >> >> > > > > >
> > > > > >> >> > >> >> > > > > > Sorting capabilities and response size limitations are very important
> > > > > >> >> > >> >> > > > > > there, as our API has to provide the most relevant information in a view
> > > > > >> >> > >> >> > > > > > of limited size.
> > > > > >> >> > >> >> > > > > >
> > > > > >> >> > >> >> > > > > > Now let me comment from the Ignite/Lucene perspective.
> > > > > >> >> > >> >> > > > > > Actually, Ignite queries Lucene, which returns *TopDocs.scoreDocs* already
> > > > > >> >> > >> >> > > > > > sorted by *score* (relevance). So the most relevant documents are on top.
> > > > > >> >> > >> >> > > > > > Currently, distributed query responses from different nodes are merged
> > > > > >> >> > >> >> > > > > > into the final query cursor queue in an arbitrary way.
> > > > > >> >> > >> >> > > > > > So in fact we already have the score order ruined here. Also, Ignite
> > > > > >> >> > >> >> > > > > > requests all possible documents from Lucene, which is redundant and not
> > > > > >> >> > >> >> > > > > > good for performance.
> > > > > >> >> > >> >> > > > > >
> > > > > >> >> > >> >> > > > > > I'm implementing the *limit* parameter as part of *TextQuery* and have to
> > > > > >> >> > >> >> > > > > > note that we still have to add sorting to text query processing in
> > > > > >> >> > >> >> > > > > > order to get usable results.
> > > > > >> >> > >> >> > > > > >
> > > > > >> >> > >> >> > > > > > The *limit* parameter itself should alleviate part of the issues above,
> > > > > >> >> > >> >> > > > > > but sorting by document score, at least, should definitely be implemented
> > > > > >> >> > >> >> > > > > > along with the limit.
> > > > > >> >> > >> >> > > > > >
> > > > > >> >> > >> >> > > > > > This is a pretty short commentary. If you still have any questions,
> > > > > >> >> > >> >> > > > > > please do not hesitate to ask.
> > > > > >> >> > >> >> > > > > >
> > > > > >> >> > >> >> > > > > > BR,
> > > > > >> >> > >> >> > > > > > Yuriy Shuliha
> > > > > >> >> > >> >> > > > > >
> > > > > >> >> > >> >> > > > > > On Thu, 19 Sep 2019 at 11:38, Павлухин Иван <vololo100@gmail.com> wrote:
> > > > > >> >> > >> >> > > > > >
> > > > > >> >> > >> >> > > > > > > Yuriy,
> > > > > >> >> > >> >> > > > > > >
> > > > > >> >> > >> >> > > > > > > Greatly appreciate your interest.
> > > > > >> >> > >> >> > > > > > >
> > > > > >> >> > >> >> > > > > > > Could you please elaborate a little bit on sorting? What tasks does
> > > > > >> >> > >> >> > > > > > > it help to solve and how? It would be great to provide an example.
> > > > > >> >> > >> >> > > > > > >
> > > > > >> >> > >> >> > > > > > > On Wed, 18 Sep 2019 at 09:39, Alexei Scherbakov <alexey.scherbakoff@gmail.com> wrote:
> > > > > >> >> > >> >> > > > > > > >
> > > > > >> >> > >> >> > > > > > > > Denis,
> > > > > >> >> > >> >> > > > > > > >
> > > > > >> >> > >> >> > > > > > > > I like the idea of throwing an exception for enabled text queries on
> > > > > >> >> > >> >> > > > > > > > persistent caches.
> > > > > >> >> > >> >> > > > > > > >
> > > > > >> >> > >> >> > > > > > > > Also I'm fine with the proposed limit for unsorted searches.
> > > > > >> >> > >> >> > > > > > > >
> > > > > >> >> > >> >> > > > > > > > Yury, please proceed with ticket creation.
> > > > > >> >> > >> >> > > > > > > >
> > > > > >> >> > >> >> > > > > > > > On Tue, 17 Sep 2019 at 22:06, Denis Magda <dmagda@apache.org> wrote:
> > > > > >> >> > >> >> > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > Igniters,
> > > > > >> >> > >> >> > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > I see nothing wrong with Yury's proposal in regards to full-text search
> > > > > >> >> > >> >> > > > > > > > > API evolution as long as Yury is ready to push it forward.
> > > > > >> >> > >> >> > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > As for the in-memory mode only, it makes total sense for in-memory data
> > > > > >> >> > >> >> > > > > > > > > grid deployments when Ignite caches data of an underlying DB like
> > > > > >> >> > >> >> > > > > > > > > Postgres.
> > > > > >> >> > >> >> > > > > > > > > As part of the changes, I would simply throw an exception (by default)
> > > > > >> >> > >> >> > > > > > > > > if one attempts to use text indices with native persistence enabled.
> > > > > >> >> > >> >> > > > > > > > > If the person is ready to live with that limitation, then an explicit
> > > > > >> >> > >> >> > > > > > > > > configuration change is needed to get around the exception.
> > > > > >> >> > >> >> > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > Thoughts?
> > > > > >> >> > >> >> > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > -
> > > > > >> >> > >> >> > > > > > > > > Denis
> > > > > >> >> > >> >> > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > On Tue, Sep 17, 2019 at 7:44 AM Yuriy Shuliga <shuliga@gmail.com> wrote:
> > > > > >> >> > >> >> > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > Hello to all again,
> > > > > >> >> > >> >> > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > Thank you for the important comments and notes given below!
> > > > > >> >> > >> >> > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > Let me answer and continue the
> > > discussion.
> > > > > >> >> > >> >> > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > (I) Overall needs in Lucene indexing
> > > > > >> >> > >> >> > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > Alexei has referred to
> > > > > >> >> > >> >> > > > > > > > > > https://issues.apache.org/jira/browse/IGNITE-5371, where the
> > > > > >> >> > >> >> > > > > > > > > > absence of index persistence was declared an obstacle to further
> > > > > >> >> > >> >> > > > > > > > > > development.
> > > > > >> >> > >> >> > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > a) This ticket is already closed as not valid.
> > > > > >> >> > >> >> > > > > > > > > > b) There are definite needs (in our project as well) for just in-memory
> > > > > >> >> > >> >> > > > > > > > > > indexing of selected data.
> > > > > >> >> > >> >> > > > > > > > > > We intend to use search capabilities for fetching a limited amount of
> > > > > >> >> > >> >> > > > > > > > > > records that should be used in type-ahead search / suggestions.
> > > > > >> >> > >> >> > > > > > > > > > Not all of the data will be indexed, and there is no need for the Lucene
> > > > > >> >> > >> >> > > > > > > > > > index to be persistent. I hope this is a common pattern of text-search
> > > > > >> >> > >> >> > > > > > > > > > usage.
> > > > > >> >> > >> >> > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > (II) Necessary fixes in the current implementation.
> > > > > >> >> > >> >> > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > a) Implementation of a correct *limit* (*offset* seems not to be
> > > > > >> >> > >> >> > > > > > > > > > required in text-search tasks for now)
> > > > > >> >> > >> >> > > > > > > > > > I have investigated the data flow for distributed text queries. It was
> > > > > >> >> > >> >> > > > > > > > > > a simple test prefix query, like 'name'*='ene*'*
> > > > > >> >> > >> >> > > > > > > > > > For now, each server node returns all response records to the client
> > > > > >> >> > >> >> > > > > > > > > > node, and the response may contain thousands or hundreds of thousands
> > > > > >> >> > >> >> > > > > > > > > > of records, even if we need only the first 10-100. Again, all the
> > > > > >> >> > >> >> > > > > > > > > > results are added to the queue in arbitrary order by pages.
> > > > > >> >> > >> >> > > > > > > > > > I did not find here any means to deliver a deterministic result.
> > > > > >> >> > >> >> > > > > > > > > > So implementing limit as part of the query (and GridCacheQueryRequest)
> > > > > >> >> > >> >> > > > > > > > > > will not change the nature of the response but will limit the load on
> > > > > >> >> > >> >> > > > > > > > > > nodes and networking.
> > > > > >> >> > >> >> > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > Can we consider opening a ticket for this?
> > > > > >> >> > >> >> > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > (III) Further extension of Lucene API exposure to Ignite
> > > > > >> >> > >> >> > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > a) Sorting
> > > > > >> >> > >> >> > > > > > > > > > The solution for this could be:
> > > > > >> >> > >> >> > > > > > > > > > - Make entities comparable
> > > > > >> >> > >> >> > > > > > > > > > - Add a custom comparator to the entity
> > > > > >> >> > >> >> > > > > > > > > > - Add annotations to mark sorted fields for Lucene indexing
> > > > > >> >> > >> >> > > > > > > > > > - Use comparators when merging responses or reducing to the desired
> > > > > >> >> > >> >> > > > > > > > > > limit on the client node.
> > > > > >> >> > >> >> > > > > > > > > > This will require the full result set to be loaded into memory, though
> > > > > >> >> > >> >> > > > > > > > > > it can be used for relatively small limits.
> > > > > >> >> > >> >> > > > > > > > > > BR,
> > > > > >> >> > >> >> > > > > > > > > > Yuriy Shuliha
> > > > > >> >> > >> >> > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > On Fri, 30 Aug 2019 at 10:37, Alexei Scherbakov <alexey.scherbakoff@gmail.com> wrote:
> > > > > >> >> > >> >> > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > Yuriy,
> > > > > >> >> > >> >> > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > Note that one of the major blockers for text queries is [1], which makes
> > > > > >> >> > >> >> > > > > > > > > > > Lucene indexes unusable with persistence and is the main reason for
> > > > > >> >> > >> >> > > > > > > > > > > discontinuation.
> > > > > >> >> > >> >> > > > > > > > > > > Probably it should be addressed first to make text queries a valid
> > > > > >> >> > >> >> > > > > > > > > > > product feature.
> > > > > >> >> > >> >> > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > Distributed sorting and advanced querying is indeed not a trivial task.
> > > > > >> >> > >> >> > > > > > > > > > > Some kind of merging must be implemented on the query-originating node.
> > > > > >> >> > >> >> > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-5371
> > > > > >> >> > >> >> > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > On Thu, 29 Aug 2019 at 23:38, Denis Magda <dmagda@apache.org> wrote:
> > > > > >> >> > >> >> > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > Yuriy,
> > > > > >> >> > >> >> > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > If you are ready to take over the full-text search indexes then please
> > > > > >> >> > >> >> > > > > > > > > > > > go ahead. The primary reason why the community wants to discontinue
> > > > > >> >> > >> >> > > > > > > > > > > > them first (and, probably, resurrect later) are the limitations listed
> > > > > >> >> > >> >> > > > > > > > > > > > by Andrey and minimal support from the community end.
> > > > > >> >> > >> >> > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > -
> > > > > >> >> > >> >> > > > > > > > > > > > Denis
> > > > > >> >> > >> >> > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > On Thu, Aug 29, 2019 at 1:29 PM Andrey Mashenkov <andrey.mashenkov@gmail.com> wrote:
> > > > > >> >> > >> >> > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > Hi Yuriy,
> > > > > >> >> > >> >> > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > Unfortunately, there is a plan to discontinue TextQueries in
> > > > > >> >> > >> >> > > > > > > > > > > > > Ignite [1]. The motivation is that text indexes are not
> > > > > >> >> > >> >> > > > > > > > > > > > > persistent, not transactional, and can't be used together with
> > > > > >> >> > >> >> > > > > > > > > > > > > SQL or inside SQL, and there is a lack of interest from the
> > > > > >> >> > >> >> > > > > > > > > > > > > community side.
> > > > > >> >> > >> >> > > > > > > > > > > > > You are welcome to take on these issues and make TextQueries
> > > > > >> >> > >> >> > > > > > > > > > > > > great.
> > > > > >> >> > >> >> > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > 1. PageSize can't be used to limit the result set.
> > > > > >> >> > >> >> > > > > > > > > > > > > Query results are returned from a data node to the client-side
> > > > > >> >> > >> >> > > > > > > > > > > > > cursor page by page, and this parameter is designed to control
> > > > > >> >> > >> >> > > > > > > > > > > > > the page size. The query is supposed to execute lazily on the
> > > > > >> >> > >> >> > > > > > > > > > > > > server side, and the full result set is not expected to be
> > > > > >> >> > >> >> > > > > > > > > > > > > loaded into memory on the server side at once, but page by page.
> > > > > >> >> > >> >> > > > > > > > > > > > > Do you mean you found that Lucene loads the entire result set
> > > > > >> >> > >> >> > > > > > > > > > > > > into memory before the first page is sent to the client?
> > > > > >> >> > >> >> > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > I'd think a new parameter should be added to limit the result.
> > > > > >> >> > >> >> > > > > > > > > > > > > The best solution is to use query language commands for this,
> > > > > >> >> > >> >> > > > > > > > > > > > > e.g. "LIMIT/OFFSET" in SQL.
> > > > > >> >> > >> >> > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > This task doesn't look trivial. A query is a distributed
> > > > > >> >> > >> >> > > > > > > > > > > > > operation: the same user query will be executed on the data
> > > > > >> >> > >> >> > > > > > > > > > > > > nodes, and then the results from all nodes should be correctly
> > > > > >> >> > >> >> > > > > > > > > > > > > merged before being returned via the client cursor.
> > > > > >> >> > >> >> > > > > > > > > > > > > So, LIMIT should be applied on every node and then again in the
> > > > > >> >> > >> >> > > > > > > > > > > > > merge phase.
> > > > > >> >> > >> >> > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > Also, this may be non-obvious: limiting results makes no sense
> > > > > >> >> > >> >> > > > > > > > > > > > > without sorting, as there is no guarantee that every subsequent
> > > > > >> >> > >> >> > > > > > > > > > > > > query run will return the same data, because of page reordering.
> > > > > >> >> > >> >> > > > > > > > > > > > > Basically, the merge phase receives results from data nodes
> > > > > >> >> > >> >> > > > > > > > > > > > > asynchronously, and messages from different nodes can't be
> > > > > >> >> > >> >> > > > > > > > > > > > > ordered.
> > > > > >> >> > >> >> > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > 2.
> > > > > >> >> > >> >> > > > > > > > > > > > > a. The "tokenize" param name (for @QueryTextField) looks more
> > > > > >> >> > >> >> > > > > > > > > > > > > verbose, doesn't it?
> > > > > >> >> > >> >> > > > > > > > > > > > > b,c. What about distributed queries? How will partial results
> > > > > >> >> > >> >> > > > > > > > > > > > > from nodes be merged?
> > > > > >> >> > >> >> > > > > > > > > > > > > Does Lucene allow configuring a comparator for data sorting?
> > > > > >> >> > >> >> > > > > > > > > > > > > What comparator should Ignite choose to sort the result in the
> > > > > >> >> > >> >> > > > > > > > > > > > > merge phase?
> > > > > >> >> > >> >> > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > 3. For now the Lucene engine is not configurable at all. E.g. it
> > > > > >> >> > >> >> > > > > > > > > > > > > is impossible to configure the Tokenizer.
> > > > > >> >> > >> >> > > > > > > > > > > > > I'd think about possible ways to configure the engine first, and
> > > > > >> >> > >> >> > > > > > > > > > > > > only then go further to discuss/implement complex features that
> > > > > >> >> > >> >> > > > > > > > > > > > > may depend on the engine config.
> > > > > >> >> > >> >> > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > On Thu, Aug 29, 2019 at 8:17 PM Yuriy Shuliga
> > > > > >> >> > >> >> > > > > > > > > > > > > <shuliga@gmail.com> wrote:
> > > > > >> >> > >> >> > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > > Dear community,
> > > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > > By starting this thread I'd like to open a discussion that
> > > > > >> >> > >> >> > > > > > > > > > > > > > would lead to contributions in the subject area.
> > > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > > Ignite has indexing capabilities, backed up by different
> > > > > >> >> > >> >> > > > > > > > > > > > > > mechanisms, including Lucene.
> > > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > > Currently, Lucene 7.5.0 is used (last year's release).
> > > > > >> >> > >> >> > > > > > > > > > > > > > This is a widespread and mature technology that covers the
> > > > > >> >> > >> >> > > > > > > > > > > > > > text search area and beyond (e.g. spatial data indexing).
> > > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > > My goal is to *expose more Lucene functionality to Ignite
> > > > > >> >> > >> >> > > > > > > > > > > > > > indexing and query mechanisms for text data*.
> > > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > > It's quite a simple request at the current stage. It comes
> > > > > >> >> > >> >> > > > > > > > > > > > > > from our project's needs but, I believe, will be useful for a
> > > > > >> >> > >> >> > > > > > > > > > > > > > lot more people.
> > > > > >> >> > >> >> > > > > > > > > > > > > > Let's walk through the items and vote on or discuss Jira
> > > > > >> >> > >> >> > > > > > > > > > > > > > tickets for them.
> > > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > > 1.[trivial] Use dataQuery.getPageSize() to limit search
> > > > > >> >> > >> >> > > > > > > > > > > > > > response items inside GridLuceneIndex.query(). Currently it
> > > > > >> >> > >> >> > > > > > > > > > > > > > calls IndexSearcher.search(query, *Integer.MAX_VALUE*), so
> > > > > >> >> > >> >> > > > > > > > > > > > > > basically all scored matches will be returned, which we do
> > > > > >> >> > >> >> > > > > > > > > > > > > > not need in most cases.
> > > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > > 2.[simple] Add sorting. Then a more capable search call can
> > > > > >> >> > >> >> > > > > > > > > > > > > > be executed: *IndexSearcher.search(query, count, sort)*
> > > > > >> >> > >> >> > > > > > > > > > > > > > Implementation steps:
> > > > > >> >> > >> >> > > > > > > > > > > > > > a) Introduce a boolean *sortField* parameter in the
> > > > > >> >> > >> >> > > > > > > > > > > > > > *@QueryTextField* annotation. If *true*, the field will be
> > > > > >> >> > >> >> > > > > > > > > > > > > > indexed but not tokenized. Number types are preferred here.
> > > > > >> >> > >> >> > > > > > > > > > > > > > b) Add a *sort* collection to the *TextQuery* constructor. It
> > > > > >> >> > >> >> > > > > > > > > > > > > > should define the desired sort fields used for querying.
> > > > > >> >> > >> >> > > > > > > > > > > > > > c) Implement Lucene sort usage in GridLuceneIndex.query().
> > > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > > 3.[moderate] Build complex queries with *TextQuery*,
> > > > > >> >> > >> >> > > > > > > > > > > > > > including term/query boosting.
> > > > > >> >> > >> >> > > > > > > > > > > > > > *This section is for voting only, as it requires more
> > > > > >> >> > >> >> > > > > > > > > > > > > > detailed work. It should be extended if the community is
> > > > > >> >> > >> >> > > > > > > > > > > > > > interested in it.*
> > > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > > Looking forward to your comments!
> > > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > > BR,
> > > > > >> >> > >> >> > > > > > > > > > > > > > Yuriy Shuliha
> > > > > >> >> > >> >> > > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > > > --
> > > > > >> >> > >> >> > > > > > > > > > > > > Best regards,
> > > > > >> >> > >> >> > > > > > > > > > > > > Andrey V. Mashenkov
> > > > > >> >> > >> >> > > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > --
> > > > > >> >> > >> >> > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > > > Best regards,
> > > > > >> >> > >> >> > > > > > > > > > > Alexei Scherbakov
> > > > > >> >> > >> >> > > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > > >
> > > > > >> >> > >> >> > > > > > > > >
> > > > > >> >> > >> >> > > > > > >
> > > > > >> >> > >> >> > > > > > >
> > > > > >> >> > >> >> > > > > > >
> > > > > >> >> > >> >> > > > > > > --
> > > > > >> >> > >> >> > > > > > > Best regards,
> > > > > >> >> > >> >> > > > > > > Ivan Pavlukhin
> > > > > >> >> > >> >> > > > > > >
> > > > > >> >> > >> >> > > > >
> > > > > >> >> > >> >> > > > >
> > > > > >> >> > >> >> > > > >
> > > > > >> >> > >> >> > > > > --
> > > > > >> >> > >> >> > > > > Best regards,
> > > > > >> >> > >> >> > > > > Ivan Pavlukhin
> > > > > >> >> > >> >> > > > >
> > > > > >> >> > >> >> > > >
> > > > > >> >> > >> >> > >
> > > > > >> >> > >> >> >
> > > > > >> >> > >> >> >
> > > > > >> >> > >> >> > --
> > > > > >> >> > >> >> > Best regards,
> > > > > >> >> > >> >> > Andrey V. Mashenkov
> > > > > >> >> > >> >> >
> > > > > >> >> > >> >>
> > > > > >> >> > >> >
> > > > > >> >> > >> >
> > > > > >> >> > >> > --
> > > > > >> >> > >> > Best regards,
> > > > > >> >> > >> > Andrey V. Mashenkov
> > > > > >> >> > >> >
> > > > > >> >> > >>
> > > > > >> >> > >
> > > > > >> >> >
> > > > > >> >> > --
> > > > > >> >> > Best regards,
> > > > > >> >> > Andrey V. Mashenkov
> > > > > >> >> >
> > > > > >> >>
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > >
> > >
> > >
> > > --
> > > Best regards,
> > > Ivan Pavlukhin
> > >
> > >
>
>
>
> --
> Best regards,
> Ivan Pavlukhin
>
>
