flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Robert Metzger <rmetz...@apache.org>
Subject Re: Apache Flink Operator State as Query Cache
Date Fri, 13 Nov 2015 08:54:47 GMT
Hi Welly,
Flink 0.10.0 is out, its just not announced yet.
Its available on maven central and the global mirrors are currently syncing
it. This mirror for example has the update already:
http://apache.mirror.digionline.de/flink/flink-0.10.0/

On Fri, Nov 13, 2015 at 9:50 AM, Welly Tambunan <if05041@gmail.com> wrote:

> Hi Aljoscha,
>
> Thanks for this one. Looking forward for 0.10 release version.
>
> Cheers
>
> On Thu, Nov 12, 2015 at 5:34 PM, Aljoscha Krettek <aljoscha@apache.org>
> wrote:
>
>> Hi,
>> I don’t know yet when the operator state will be transitioned to managed
>> memory but it could happen for 1.0 (which will come after 0.10). The good
>> thing is that the interfaces won’t change, so state can be used as it is
>> now.
>>
>> For 0.10, the release vote is winding down right now, so you can expect
>> the release to happen today or tomorrow. I think the streaming is
>> production ready now, we expect to mostly to hardening and some
>> infrastructure changes (for example annotations that specify API stability)
>> for the 1.0 release.
>>
>> Let us know if you need more information.
>>
>> Cheers,
>> Aljoscha
>> > On 12 Nov 2015, at 02:42, Welly Tambunan <if05041@gmail.com> wrote:
>> >
>> > Hi Stephan,
>> >
>> > >Storing the model in OperatorState is a good idea, if you can. On the
>> roadmap is to migrate the operator state to managed memory as well, so that
>> should take care of the GC issues.
>> > Is this using off the heap memory ? Which version we expect this one to
>> be available ?
>> >
>> > Another question is when will the release version of 0.10 will be out ?
>> We would love to upgrade to that one when it's available. That version will
>> be a production ready streaming right ?
>> >
>> >
>> >
>> >
>> >
>> > On Wed, Nov 11, 2015 at 4:49 PM, Stephan Ewen <sewen@apache.org> wrote:
>> > Hi!
>> >
>> > In general, if you can keep state in Flink, you get better
>> throughput/latency/consistency and have one less system to worry about
>> (external k/v store). State outside means that the Flink processes can be
>> slimmer and need fewer resources and as such recover a bit faster. There
>> are use cases for that as well.
>> >
>> > Storing the model in OperatorState is a good idea, if you can. On the
>> roadmap is to migrate the operator state to managed memory as well, so that
>> should take care of the GC issues.
>> >
>> > We are just adding functionality to make the Key/Value operator state
>> usable in CoMap/CoFlatMap as well (currently it only works in windows and
>> in Map/FlatMap/Filter functions over the KeyedStream).
>> > Until the, you should be able to use a simple Java HashMap and use the
>> "Checkpointed" interface to get it persistent.
>> >
>> > Greetings,
>> > Stephan
>> >
>> >
>> > On Sun, Nov 8, 2015 at 10:11 AM, Welly Tambunan <if05041@gmail.com>
>> wrote:
>> > Thanks for the answer.
>> >
>> > Currently the approach that i'm using right now is creating a
>> base/marker interface to stream different type of message to the same
>> operator. Not sure about the performance hit about this compare to the
>> CoFlatMap function.
>> >
>> > Basically this one is providing query cache, so i'm thinking instead of
>> using in memory cache like redis, ignite etc, i can just use operator state
>> for this one.
>> >
>> > I just want to gauge do i need to use memory cache or operator state
>> would be just fine.
>> >
>> > However i'm concern about the Gen 2 Garbage Collection for caching our
>> own state without using operator state. Is there any clarification on that
>> one ?
>> >
>> >
>> >
>> > On Sat, Nov 7, 2015 at 12:38 AM, Anwar Rizal <anrizal05@gmail.com>
>> wrote:
>> >
>> > Let me understand your case better here. You have a stream of model and
>> stream of data. To process the data, you will need a way to access your
>> model from the subsequent stream operations (map, filter, flatmap, ..).
>> > I'm not sure in which case Operator State is a good choice, but I think
>> you can also live without.
>> >
>> > val modelStream = .... // get the model stream
>> > val dataStream   =
>> >
>> > modelStream.broadcast.connect(dataStream). coFlatMap(  ) Then you can
>> keep the latest model in a CoFlatMapRichFunction, not necessarily as
>> Operator State, although maybe OperatorState is a good choice too.
>> >
>> > Does it make sense to you ?
>> >
>> > Anwar
>> >
>> > On Fri, Nov 6, 2015 at 10:21 AM, Welly Tambunan <if05041@gmail.com>
>> wrote:
>> > Hi All,
>> >
>> > We have a high density data that required a downsample. However this
>> downsample model is very flexible based on the client device and user
>> interaction. So it will be wasteful to precompute and store to db.
>> >
>> > So we want to use Apache Flink to do downsampling and cache the result
>> for subsequent query.
>> >
>> > We are considering using Flink Operator state for that one.
>> >
>> > Is that the right approach to use that for memory cache ? Or if that
>> preferable using memory cache like redis etc.
>> >
>> > Any comments will be appreciated.
>> >
>> >
>> > Cheers
>> > --
>> > Welly Tambunan
>> > Triplelands
>> >
>> > http://weltam.wordpress.com
>> > http://www.triplelands.com
>> >
>> >
>> >
>> >
>> > --
>> > Welly Tambunan
>> > Triplelands
>> >
>> > http://weltam.wordpress.com
>> > http://www.triplelands.com
>> >
>> >
>> >
>> >
>> > --
>> > Welly Tambunan
>> > Triplelands
>> >
>> > http://weltam.wordpress.com
>> > http://www.triplelands.com
>>
>>
>
>
> --
> Welly Tambunan
> Triplelands
>
> http://weltam.wordpress.com
> http://www.triplelands.com <http://www.triplelands.com/blog/>
>

Mime
View raw message