kudu-user mailing list archives

From Mike Percy <mpe...@apache.org>
Subject Re: Spark Streaming + Kudu
Date Mon, 05 Mar 2018 21:18:48 GMT
Hi Ravi, are you using AUTO_FLUSH_BACKGROUND
<https://kudu.apache.org/releases/1.6.0/apidocs/org/apache/kudu/client/SessionConfiguration.FlushMode.html>?
You mention that you are trying to use getPendingErrors()
<https://kudu.apache.org/releases/1.6.0/apidocs/org/apache/kudu/client/KuduSession.html#getPendingErrors-->
but it sounds like it's not working for you. Can you be more specific about
what you expect and what you are observing?
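
To frame the question, here is a minimal sketch of the pattern I have in mind.
It deliberately does not use the Kudu client: BufferingSession is a
hypothetical in-memory stand-in, and its method names only mirror
KuduSession's apply(), flush() and getPendingErrors(). The assumption it
illustrates is that with background flushing, apply() returns before the
write is attempted, so failures only show up when you drain the pending
errors after a flush, and the caller has to re-queue those rows itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class PendingErrorsSketch {

    /** Hypothetical stand-in for a buffering session; NOT the Kudu API. */
    static final class BufferingSession {
        private final Predicate<String> serverAccepts; // simulates the server side
        private final List<String> buffer = new ArrayList<>();
        private final List<String> pendingErrors = new ArrayList<>();

        BufferingSession(Predicate<String> serverAccepts) {
            this.serverAccepts = serverAccepts;
        }

        /** Buffers the row and returns immediately; no failure is visible here. */
        void apply(String row) {
            buffer.add(row);
        }

        /** Sends buffered rows; rejected rows are recorded as pending errors. */
        void flush() {
            for (String row : buffer) {
                if (!serverAccepts.test(row)) {
                    pendingErrors.add(row);
                }
            }
            buffer.clear();
        }

        /** Returns the accumulated errors and clears them. */
        List<String> getPendingErrors() {
            List<String> drained = new ArrayList<>(pendingErrors);
            pendingErrors.clear();
            return drained;
        }
    }

    /** Applies every row, flushes once, and returns the rows that failed. */
    static List<String> writeBatch(BufferingSession session, List<String> rows) {
        for (String row : rows) {
            session.apply(row);
        }
        session.flush();                   // errors only become visible after this
        return session.getPendingErrors(); // caller re-queues these rather than dropping them
    }

    public static void main(String[] args) {
        // Simulated server that rejects any row whose key starts with "bad".
        BufferingSession session = new BufferingSession(row -> !row.startsWith("bad"));
        List<String> failed = writeBatch(session, Arrays.asList("a", "bad-1", "b", "bad-2"));
        System.out.println("rows to retry: " + failed);
    }
}
```

If what you observe differs from this (for example, the pending errors are
empty even though rows never arrive), that difference is the detail that
would help narrow things down.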

Thanks,
Mike



On Mon, Feb 26, 2018 at 8:04 PM, Ravi Kanth <ravikanth.4b0@gmail.com> wrote:

> Thanks, Clifford. We are running Kudu 1.4. To date we haven't seen
> any issues in production and we are not losing tablet servers. But, as part
> of testing, I have to generate a few unforeseen cases to analyse the
> application performance. One of them is bringing down a tablet server
> or master server intentionally, during which I observed the loss of records.
> I just wanted to test cases outside the happy path here. Once again, thanks for
> taking the time to respond to me.
>
> - Ravi
>
> On 26 February 2018 at 19:58, Clifford Resnick <cresnick@mediamath.com>
> wrote:
>
>> I'll have to get back to you on the code bits, but I'm pretty sure we're
>> doing simple sync batching. We're not in production yet, but after some
>> months of development I haven't seen any failures, even when pushing load
>> doing multiple years' backfill. I think the real question is why are you
>> losing tablet servers? The only instability we ever had with Kudu was when
>> it had that weird ntp sync issue that was fixed I think for 1.6. What
>> version are you running?
>>
>> Anyway, I would think that infinite loop should be catchable somewhere.
>> Our pipeline is set to fail/retry with Flink snapshots. I imagine there is
>> something similar with Spark. Sorry I can't be of more help!
>>
>>
>>
>> On Feb 26, 2018 9:10 PM, Ravi Kanth <ravikanth.4b0@gmail.com> wrote:
>>
>> Cliff,
>>
>> Thanks for the response. Well, I do agree that it's simple and seamless.
>> In my case, I am able to upsert ~25000 events/sec into Kudu. But I am
>> facing a problem when any of the Kudu tablet or master servers is down. I
>> am not able to get hold of the exception from the client. The client goes
>> into an infinite loop trying to connect to Kudu. Meanwhile, I am losing my
>> records. I tried handling the errors through getPendingErrors(), but it
>> still doesn't help. I am using AsyncKuduClient to establish the connection and
>> retrieving the syncClient from the Async to open the session and table. Any
>> help?
>>
>> Thanks,
>> Ravi
>>
>> On 26 February 2018 at 18:00, Cliff Resnick <cresny@gmail.com> wrote:
>>
>> While I can't speak for Spark, we do use the client API from Flink
>> streaming and it's simple and seamless. It's especially nice if you require
>> an Upsert semantic.
>>
>> On Feb 26, 2018 7:51 PM, "Ravi Kanth" <ravikanth.4b0@gmail.com> wrote:
>>
>> Hi,
>>
>> Is anyone using Spark Streaming to ingest data into Kudu with the Kudu
>> Client API rather than the traditional KuduContext API? I am stuck
>> at a point and can't find a solution.
>>
>> Thanks,
>> Ravi
>>
>>
>>
>>
>
