calcite-dev mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: MATCH_RECOGNIZE
Date Thu, 02 Aug 2018 13:27:14 GMT
Hi Julian,

It would be great to use the same test suite.

We have quite a few tests in Flink but they are not super well organized.
I would love to have more structure for at least some of the tests.

I had a quick look at how Calcite runs its Quidem tests.
I'm not sure whether this is a format that we could easily adapt to, but
maybe it's possible to put a test data set, queries, and results in a more
portable format.
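
To make the idea of a portable format concrete, here is a minimal sketch in
Java. Everything in it is an assumption for illustration only: the section
markers ([data], [query], [expect]), the class name PortableTestCase, and the
idea of plugging in an engine as a function are hypothetical and are not part
of Quidem, Calcite, or Flink.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical engine-neutral test case: input rows, one query, expected rows. */
class PortableTestCase {
    public final List<String> inputRows;    // CSV lines; the first line is the header
    public final String sql;                // query text, joined into a single line
    public final List<String> expectedRows; // CSV lines in the expected order

    PortableTestCase(List<String> inputRows, String sql, List<String> expectedRows) {
        this.inputRows = inputRows;
        this.sql = sql;
        this.expectedRows = expectedRows;
    }

    /** Parses a text with [data], [query] and [expect] sections (assumed markers). */
    static PortableTestCase parse(String text) {
        List<String> data = new ArrayList<>();
        List<String> query = new ArrayList<>();
        List<String> expect = new ArrayList<>();
        List<String> current = null; // section currently being filled
        for (String line : text.split("\n")) {
            String t = line.trim();
            if (t.isEmpty()) {
                continue; // blank lines carry no meaning in this sketch
            }
            if (t.equals("[data]")) {
                current = data;
            } else if (t.equals("[query]")) {
                current = query;
            } else if (t.equals("[expect]")) {
                current = expect;
            } else if (current != null) {
                current.add(t);
            }
        }
        return new PortableTestCase(data, String.join(" ", query), expect);
    }

    /** Checks the case against any engine exposed as (test case -> result rows). */
    boolean passes(Function<PortableTestCase, List<String>> engine) {
        return expectedRows.equals(engine.apply(this));
    }
}
```

Each project would then only need a small adapter that loads inputRows, runs
sql, and returns the result rows as CSV lines; the test files themselves
could be shared unchanged.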

Best, Fabian





2018-07-31 19:54 GMT+02:00 Julian Hyde <jhyde@apache.org>:

> I’m delighted that Flink is getting full SQL support for MATCH_RECOGNIZE.
>
> Sounds like it might be challenging to share the implementation, but could
> we perhaps share the test suite? (I.e. a set of SQL queries and their
> expected results.)
>
> I added a simple test in
> https://github.com/julianhyde/calcite/commit/ee460847643ec17544f310088affd99be4028bb6
> that could be extended.
>
> Julian
>
>
> > On Jul 31, 2018, at 8:07 AM, Fabian Hueske <fhueske@gmail.com> wrote:
> >
> > Hi everyone,
> >
> > I'd like to share the plans for MATCH_RECOGNIZE support in Flink.
> >
> > Flink has featured a so-called CEP library for quite some time [1]. The
> > CEP library is a popular feature and frequently used.
> > In a nutshell, the library provides a domain-specific API to define event
> > patterns. The patterns are translated into a state machine and evaluated
> > in a streaming program.
> >
> > Even before we learned about MATCH_RECOGNIZE, Till (another Flink
> > committer) and I gave a few talks about unifying SQL and CEP [2].
> > Hence, we were quite excited when we learned about MATCH_RECOGNIZE and
> > even more when it was added to Calcite.
> > Shortly after that, we got a PR [3] which translated the parsed
> > MATCH_RECOGNIZE clause into patterns of our CEP library.
> > However, we never really got to the point of merging that contribution,
> > mainly because there were some inconsistencies in the semantics of
> > MATCH_RECOGNIZE and Flink's CEP library.
> >
> > Recently, a Flink committer picked up this feature again, validated the
> > semantics, and made a few corrections [4].
> > The CEP library is now ready to support a subset of the MATCH_RECOGNIZE
> > features.
> > Unfortunately, MATCH_RECOGNIZE support won't make it into the upcoming
> > 1.6.0 release, but the plans are to add it for the 1.7.0 release.
> >
> > Regarding the idea of sharing parts of the evaluation logic:
> > Flink has runtime support for a subset of the MATCH_RECOGNIZE clause.
> > Unfortunately, I am not familiar with the internals of Flink's CEP
> > library and don't know how portable it is.
> >
> > Best, Fabian
> >
> > [1] https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/libs/cep.html
> > [2] https://www.slideshare.net/tillrohrmann/streaming-analytics-cep-two-sides-of-the-same-coin
> > [3] https://github.com/apache/flink/pull/4502
> > [4] https://issues.apache.org/jira/browse/FLINK-9593
> >
> > 2018-07-23 21:03 GMT+02:00 Sergey Nuyanzin <snuyanzin@gmail.com>:
> >
> >> Looks exciting.
> >> If possible, I would like to take part in it; however, I'm not sure
> >> about this week (I could start in August).
> >>
> >> On Mon, Jul 23, 2018 at 9:10 PM, Michael Mior <mmior@apache.org> wrote:
> >>
> >>> This does sound like my idea of fun, but unfortunately I won't have
> >>> the time to contribute in the near future. I'll keep this on my radar
> >>> though. I also shared this message with all the students in our
> >>> research group and I wouldn't be surprised if there was someone
> >>> willing to jump in. Thanks for keeping this moving Julian!
> >>>
> >>> --
> >>> Michael Mior
> >>> mmior@apache.org
> >>> On Mon, Jul 23, 2018 at 13:54, Julian Hyde <jhyde@apache.org> wrote:
> >>>>
> >>>> For quite a while we have had partial support for MATCH_RECOGNIZE. We
> >>>> support it in the parser and validator, but there is no runtime
> >>>> implementation. It’s a shame, because MATCH_RECOGNIZE is an incredibly
> >>>> powerful SQL feature for both traditional SQL (it’s in Oracle 12c) and
> >>>> for continuous query (aka complex event processing - CEP).
> >>>>
> >>>> I figure it’s time to change that. My plan is to implement it
> >>>> incrementally, getting simple queries working to start with, then allow
> >>>> people to add more complex queries.
> >>>>
> >>>> In a dev branch [1], I’ve added a method Enumerables.match [2]. The idea
> >>>> is that you supply an Enumerable of input data, a finite state machine
> >>>> to figure out when a sequence of rows makes a match (represented by a
> >>>> transition function: (state, row) -> state), and a function to convert a
> >>>> matched set of rows to a set of output rows. The match method is fairly
> >>>> straightforward, and I almost have it finished.
> >>>>
> >>>> The complexity is in generating the finite state machine, emitter
> >>>> function, and so forth.
> >>>>
> >>>> Can someone help me with this task? If your idea of fun is implementing
> >>>> database algorithms, this is about as much fun as it gets. You learned
> >>>> about finite state machines in college - this is your chance to actually
> >>>> write one!
> >>>>
> >>>> This might be a good joint project with the Flink community. I know
> >>>> Flink are thinking of implementing CEP, and the algorithm we write here
> >>>> could be shared with Flink (for use via Flink SQL or via the Flink API).
> >>>>
> >>>> Julian
> >>>>
> >>>> [1] https://github.com/julianhyde/calcite/commits/1935-match-recognize
> >>>>
> >>>> [2] https://github.com/julianhyde/calcite/commit/4dfaf1bbee718aa6694a8ce67d829c32d04c7e87#diff-8a97a64204db631471c563df7551f408R73
> >>>
> >>
> >>
> >>
> >> --
> >> Best regards,
> >> Sergey
>
>
