asterixdb-dev mailing list archives

From Mike Carey <dtab...@gmail.com>
Subject Re: Feeds UDF
Date Wed, 09 Dec 2015 07:46:12 GMT
(I am still completely not seeing a problem here.)

On 12/8/15 10:20 PM, abdullah alamoudi wrote:
> The plan is to mostly use Upsert in the future since we can do some
> optimizations with it that we can't do with an insert.
> We should also support deletes as well and probably allow a mix of the
> three operations within the same feed. This is a work in progress right now
> but before I go far, I am stabilizing some other parts of the feeds.
>
> Cheers,
> Abdullah.
>
>
> Amoudi, Abdullah.
>
> On Tue, Dec 8, 2015 at 10:11 PM, Ildar Absalyamov <ildar.absalyamov@gmail.com> wrote:
>
>> Abdullah,
>>
>> OK, now I see what problems it will cause.
>> Kinda related question: could the feed implement “upsert” semantics, that
>> you’ve been working on, instead of “insert” semantics?
>>
>>> On Dec 8, 2015, at 21:52, abdullah alamoudi <bamousaa@gmail.com> wrote:
>>>
>>> I think that we probably should restrict feed applied functions somehow
>>> (needs further thoughts and discussions) and I know for sure that we don't.
>>> As for the case you present, I would imagine that it could be allowed
>>> theoretically but I think everyone sees why it should be disallowed.
>>>
>>> One thing to keep in mind is that we introduce a materialize if the dataset
>>> was part of an insert pipeline. Now think about how this would work with a
>>> continuous feed. One choice would be that the feed will materialize all
>>> records to be inserted and once the feed stops, it would start inserting
>>> them but I still think we should not allow it.
>>>
>>> My 2c,
>>> Any opposing argument?
>>>
>>>
>>> Amoudi, Abdullah.
>>>
>>> On Tue, Dec 8, 2015 at 6:28 PM, Ildar Absalyamov <ildar.absalyamov@gmail.com> wrote:
>>>> Hi All,
>>>>
>>>> As a part of feed ingestion we do allow preprocessing incoming data with
>>>> AQL UDFs.
>>>> I was wondering if we somehow restrict the kind of UDFs that could be
>>>> used? Do we allow joins in these UDFs? Especially joins with the same
>>>> dataset, which is used for intake. Ex:
>>>>
>>>> create type TweetType as open {
>>>>   id: string,
>>>>   username : string,
>>>>   location : string,
>>>>   text : string,
>>>>   timestamp : string
>>>> }
>>>> create dataset Tweets(TweetType)
>>>> primary key id;
>>>> create function feed_processor($x) {
>>>>   for $y in dataset Tweets
>>>>   // self-join with Tweets dataset on some predicate($x, $y)
>>>>   return $y
>>>> }
>>>> create feed TweetFeed
>>>> apply function feed_processor;
>>>>
>>>> The query above fails at runtime, but I was wondering if it
>>>> could work at all, even theoretically.
>>>>
>>>> Best regards,
>>>> Ildar
>>>>
>>>>
>> Best regards,
>> Ildar
>>
>>

