spark-user mailing list archives

From Cody Koeninger <c...@koeninger.org>
Subject Re: Improving performance of a kafka spark streaming app
Date Tue, 21 Jun 2016 20:32:10 GMT
Take HBase out of the equation and just measure what your read
performance is by doing something like

createDirectStream(...).foreach(_.println)

not take() or print()
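
For example, a minimal read-only benchmark along those lines might look like the sketch below. It is only a sketch against the Spark 1.5 Java direct-stream API, counting records per batch instead of printing each one; the topic name here is a placeholder, and the broker list is just an example:

import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;

import kafka.serializer.DefaultDecoder;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

public class KafkaReadBenchmark {
  public static void main(String[] args) throws Exception {
    SparkConf conf = new SparkConf().setAppName("Kafka read benchmark");
    JavaStreamingContext jssc = new JavaStreamingContext(conf, new Duration(1000));

    Map<String, String> kafkaParams = new HashMap<String, String>();
    kafkaParams.put("metadata.broker.list", "broker1:9092,broker2:9092");

    // Direct stream: one Spark partition per Kafka partition, no receiver.
    JavaPairInputDStream<byte[], byte[]> stream = KafkaUtils.createDirectStream(
        jssc, byte[].class, byte[].class,
        DefaultDecoder.class, DefaultDecoder.class,
        kafkaParams,
        new HashSet<String>(Collections.singletonList("your_topic")));

    // count() forces every record in the batch to be read from Kafka, but does
    // no HBase work, so the processing time shown in the streaming UI reflects
    // pure read throughput.
    stream.foreachRDD(new Function<JavaPairRDD<byte[], byte[]>, Void>() {
      @Override
      public Void call(JavaPairRDD<byte[], byte[]> rdd) {
        System.out.println("records in batch: " + rdd.count());
        return null;
      }
    });

    jssc.start();
    jssc.awaitTermination();
  }
}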

On Tue, Jun 21, 2016 at 3:19 PM, Colin Kincaid Williams <discord@uw.edu> wrote:
> @Cody I was able to bring my processing time down to a second by
> setting maxRatePerPartition as discussed. My bad that I didn't
> recognize it as the cause of my scheduling delay.
>
> Since then I've tried experimenting with a larger streaming batch
> duration. I've been trying to get some noticeable improvement
> inserting messages from Kafka -> Hbase using the above application.
> I'm currently getting around 3500 inserts / second on a 9 node hbase
> cluster. So far I haven't been able to get much more throughput, so
> I'm looking for advice here on how I should tune Kafka and Spark for
> this job.
>
> I can create a kafka topic with as many partitions as I want. I can
> set the Duration and maxRatePerPartition. I have 1.7 billion messages
> that I can insert rather quickly into the Kafka queue, and I'd like to
> get them into Hbase as quickly as possible.
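>
> (Rough arithmetic, assuming the direct stream: each batch can hold at most
> #KafkaPartitions x maxRatePerPartition x batchSeconds records, so consumption
> is capped at #KafkaPartitions x maxRatePerPartition records per second no
> matter how fast Hbase can absorb the writes. For example, 6 partitions x 500
> x 1 s = 3000 records per batch, the figure from the streaming UI earlier in
> this thread; and at the ~3500 inserts / second above, 1.7 billion messages
> would take roughly 1.7e9 / 3500 ≈ 486,000 s, or about 5.6 days.)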
>
> I'm looking for advice regarding # Kafka Topic Partitions / Streaming
> Duration / maxRatePerPartition / any other spark settings or code
> changes that I should make to try to get a better consumption rate.
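>
> For what it's worth, one code-side pattern that is often suggested for this
> kind of job is to write each partition's records through a single buffered
> HBase connection, so puts are batched rather than sent one at a time. The
> sketch below is only an illustration of that pattern, not the code from the
> gist; it assumes an HBase 1.x-style client API (the 0.98 client bundled with
> CDH 5.3 differs slightly), non-null Kafka keys, and made-up table, column
> family and qualifier names:
>
> import java.util.Iterator;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.TableName;
> import org.apache.hadoop.hbase.client.BufferedMutator;
> import org.apache.hadoop.hbase.client.Connection;
> import org.apache.hadoop.hbase.client.ConnectionFactory;
> import org.apache.hadoop.hbase.client.Put;
> import org.apache.hadoop.hbase.util.Bytes;
> import org.apache.spark.api.java.JavaPairRDD;
> import org.apache.spark.api.java.function.Function;
> import org.apache.spark.api.java.function.VoidFunction;
> import org.apache.spark.streaming.api.java.JavaPairDStream;
> import scala.Tuple2;
>
> public final class HBaseSink {
>   public static void attach(JavaPairDStream<byte[], byte[]> stream) {
>     stream.foreachRDD(new Function<JavaPairRDD<byte[], byte[]>, Void>() {
>       @Override
>       public Void call(JavaPairRDD<byte[], byte[]> rdd) {
>         rdd.foreachPartition(new VoidFunction<Iterator<Tuple2<byte[], byte[]>>>() {
>           @Override
>           public void call(Iterator<Tuple2<byte[], byte[]>> records) throws Exception {
>             // One connection and one buffered mutator per partition per batch;
>             // the mutator batches puts and flushes them when closed.
>             Configuration conf = HBaseConfiguration.create();
>             try (Connection conn = ConnectionFactory.createConnection(conf);
>                  BufferedMutator mutator =
>                      conn.getBufferedMutator(TableName.valueOf("ToTable"))) {
>               while (records.hasNext()) {
>                 Tuple2<byte[], byte[]> record = records.next();
>                 Put put = new Put(record._1());  // Kafka key as the row key
>                 put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("v"), record._2());
>                 mutator.mutate(put);
>               }
>             }
>           }
>         });
>         return null;
>       }
>     });
>   }
> }
>
> Whether a change like this helps depends on where the time is actually going,
> which is why measuring the Kafka read path on its own, as suggested above, is
> worth doing first.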
>
> Thanks for all the help so far, this is the first Spark application I
> have written.
>
> On Mon, Jun 20, 2016 at 12:32 PM, Colin Kincaid Williams <discord@uw.edu> wrote:
>> I'll try dropping maxRatePerPartition to 400, or maybe even lower.
>> However, even at application startup I have this large scheduling
>> delay. I will report my progress later on.
>>
>> On Mon, Jun 20, 2016 at 2:12 PM, Cody Koeninger <cody@koeninger.org> wrote:
>>> If your batch time is 1 second and your average processing time is
>>> 1.16 seconds, you're always going to be falling behind.  That would
>>> explain why you've built up an hour of scheduling delay after eight
>>> hours of running.
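>>>
>>> (Roughly: each 1 s batch falls about 0.16 s further behind, and eight hours
>>> is about 28,800 batches, so the backlog grows to roughly 28,800 x 0.16 ≈
>>> 4,600 s ≈ 1.3 hours, which lines up with the 1.2 h delays in the UI.)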
>>>
>>> On Sat, Jun 18, 2016 at 4:40 PM, Colin Kincaid Williams <discord@uw.edu> wrote:
>>>> Hi Mich again,
>>>>
>>>> Regarding batch window, etc. I have provided the sources, but I'm not
>>>> currently calling the window function. Did you see the program source?
>>>> It's only 100 lines.
>>>>
>>>> https://gist.github.com/drocsid/b0efa4ff6ff4a7c3c8bb56767d0b6877
>>>>
>>>> So I would expect I'm using the defaults, other than what is shown in
>>>> the configuration.
>>>>
>>>> For example:
>>>>
>>>> In the launcher configuration I set --conf
>>>> spark.streaming.kafka.maxRatePerPartition=500 \ and I believe that allows
>>>> 500 messages per partition for the batch duration set in the application:
>>>>
>>>> JavaStreamingContext jssc = new JavaStreamingContext(jsc, new Duration(1000));
>>>>
>>>>
>>>> Then with the --num-executors 6 \ submit flag, and the
>>>> spark.streaming.kafka.maxRatePerPartition=500 I think that's how we
>>>> arrive at the 3000 events per batch in the UI, pasted above.
>>>>
>>>> Feel free to correct me if I'm wrong.
>>>>
>>>> Then are you suggesting that I set the window?
>>>>
>>>> Maybe following this as a reference:
>>>>
>>>> https://databricks.gitbooks.io/databricks-spark-reference-applications/content/logs_analyzer/chapter1/windows.html
>>>>
>>>> On Sat, Jun 18, 2016 at 8:08 PM, Mich Talebzadeh
>>>> <mich.talebzadeh@gmail.com> wrote:
>>>>> Ok
>>>>>
>>>>> What is the set up for these please?
>>>>>
>>>>> batch window
>>>>> window length
>>>>> sliding interval
>>>>>
>>>>> And also, in each batch window, how much data do you get in (no. of
>>>>> messages in the topic, whatever)?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Dr Mich Talebzadeh
>>>>>
>>>>>
>>>>>
>>>>> LinkedIn
>>>>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>>>
>>>>>
>>>>>
>>>>> http://talebzadehmich.wordpress.com
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 18 June 2016 at 21:01, Mich Talebzadeh <mich.talebzadeh@gmail.com> wrote:
>>>>>>
>>>>>> I believe you have an issue with performance?
>>>>>>
>>>>>> Have you checked the Spark GUI (default port 4040) for details, including
>>>>>> shuffles etc.?
>>>>>>
>>>>>> HTH
>>>>>>
>>>>>> Dr Mich Talebzadeh
>>>>>>
>>>>>>
>>>>>>
>>>>>> LinkedIn
>>>>>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>>>>
>>>>>>
>>>>>>
>>>>>> http://talebzadehmich.wordpress.com
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 18 June 2016 at 20:59, Colin Kincaid Williams <discord@uw.edu> wrote:
>>>>>>>
>>>>>>> There are 25 nodes in the spark cluster.
>>>>>>>
>>>>>>> On Sat, Jun 18, 2016 at 7:53 PM, Mich Talebzadeh
>>>>>>> <mich.talebzadeh@gmail.com> wrote:
>>>>>>> > how many nodes are in your cluster?
>>>>>>> >
>>>>>>> > --num-executors 6 \
>>>>>>> >  --driver-memory 4G \
>>>>>>> >  --executor-memory 2G \
>>>>>>> >  --total-executor-cores 12 \
>>>>>>> >
>>>>>>> >
>>>>>>> > Dr Mich Talebzadeh
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> > LinkedIn
>>>>>>> >
>>>>>>> > https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> > http://talebzadehmich.wordpress.com
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> > On 18 June 2016 at 20:40, Colin Kincaid Williams <discord@uw.edu>
>>>>>>> > wrote:
>>>>>>> >>
>>>>>>> >> I updated my app to Spark 1.5.2 streaming so that it consumes from
>>>>>>> >> Kafka using the direct api and inserts content into an hbase cluster,
>>>>>>> >> as described in this thread. I was away from this project for a while
>>>>>>> >> due to events in my family.
>>>>>>> >>
>>>>>>> >> Currently my scheduling delay is high, but the processing time is
>>>>>>> >> stable around a second. I changed my setup to use 6 kafka partitions
>>>>>>> >> on a set of smaller kafka brokers, with fewer disks. I've included
>>>>>>> >> some details below, including the script I use to launch the
>>>>>>> >> application. I'm using a Spark on Hbase library, whose version is
>>>>>>> >> relevant to my Hbase cluster. Is it apparent there is something wrong
>>>>>>> >> with my launch method that could be causing the delay, related to the
>>>>>>> >> included jars?
>>>>>>> >>
>>>>>>> >> Or is there something wrong with the very simple approach I'm taking
>>>>>>> >> for the application?
>>>>>>> >>
>>>>>>> >> Any advice is appreciated.
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> The application:
>>>>>>> >>
>>>>>>> >> https://gist.github.com/drocsid/b0efa4ff6ff4a7c3c8bb56767d0b6877
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> From the streaming UI I get something like:
>>>>>>> >>
>>>>>>> >> Table: Completed Batches (last 1000 out of 27136)
>>>>>>> >>
>>>>>>> >> Batch Time          | Input Size  | Scheduling Delay | Processing Time | Total Delay | Output Ops: Succeeded/Total
>>>>>>> >> 2016/06/18 11:21:32 | 3000 events | 1.2 h            | 1 s             | 1.2 h       | 1/1
>>>>>>> >> 2016/06/18 11:21:31 | 3000 events | 1.2 h            | 1 s             | 1.2 h       | 1/1
>>>>>>> >> 2016/06/18 11:21:30 | 3000 events | 1.2 h            | 1 s             | 1.2 h       | 1/1
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> Here's how I'm launching the spark application.
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> #!/usr/bin/env bash
>>>>>>> >>
>>>>>>> >> export SPARK_CONF_DIR=/home/colin.williams/spark
>>>>>>> >> export HADOOP_CONF_DIR=/etc/hadoop/conf
>>>>>>> >> export HADOOP_CLASSPATH=/home/colin.williams/hbase/conf/:/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib/*:/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib/hbase-protocol-0.98.6-cdh5.3.0.jar
>>>>>>> >>
>>>>>>> >> /opt/spark-1.5.2-bin-hadoop2.4/bin/spark-submit \
>>>>>>> >> --class com.example.KafkaToHbase \
>>>>>>> >> --master spark://spark_master:7077 \
>>>>>>> >> --deploy-mode client \
>>>>>>> >> --num-executors 6 \
>>>>>>> >> --driver-memory 4G \
>>>>>>> >> --executor-memory 2G \
>>>>>>> >> --total-executor-cores 12 \
>>>>>>> >> --jars /home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/zookeeper/zookeeper-3.4.5-cdh5.3.0.jar,/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib/guava-12.0.1.jar,/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib/protobuf-java-2.5.0.jar,/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/hbase-protocol.jar,/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/hbase-client.jar,/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/hbase-common.jar,/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/hbase-hadoop2-compat.jar,/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/hbase-hadoop-compat.jar,/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/hbase-server.jar,/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib/htrace-core.jar \
>>>>>>> >> --conf spark.app.name="Kafka To Hbase" \
>>>>>>> >> --conf spark.eventLog.dir="hdfs:///user/spark/applicationHistory" \
>>>>>>> >> --conf spark.eventLog.enabled=false \
>>>>>>> >> --conf spark.eventLog.overwrite=true \
>>>>>>> >> --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
>>>>>>> >> --conf spark.streaming.backpressure.enabled=false \
>>>>>>> >> --conf spark.streaming.kafka.maxRatePerPartition=500 \
>>>>>>> >> --driver-class-path /home/colin.williams/kafka-hbase.jar \
>>>>>>> >> --driver-java-options -Dspark.executor.extraClassPath=/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib/* \
>>>>>>> >> /home/colin.williams/kafka-hbase.jar "FromTable" "ToTable" "broker1:9092,broker2:9092"
>>>>>>> >>
>>>>>>> >> On Tue, May 3, 2016 at 8:20 PM, Colin Kincaid Williams <discord@uw.edu> wrote:
>>>>>>> >> > Thanks Cody, I can see that the partitions are well distributed...
>>>>>>> >> > Then I'm in the process of using the direct api.
>>>>>>> >> >
>>>>>>> >> > On Tue, May 3, 2016 at 6:51 PM, Cody Koeninger <cody@koeninger.org> wrote:
>>>>>>> >> >> 60 partitions in and of itself shouldn't be a big performance issue
>>>>>>> >> >> (as long as producers are distributing across partitions evenly).
>>>>>>> >> >>
>>>>>>> >> >> On Tue, May 3, 2016 at 1:44 PM, Colin Kincaid Williams <discord@uw.edu> wrote:
>>>>>>> >> >>> Thanks again Cody. Regarding the details: 66 kafka partitions on 3
>>>>>>> >> >>> kafka servers, likely 8 core systems with 10 disks each. Maybe the
>>>>>>> >> >>> issue with the receiver was the large number of partitions. I had
>>>>>>> >> >>> miscounted the disks, and so 11*3*2 is how I decided to partition my
>>>>>>> >> >>> topic on insertion (by my own, unjustified reasoning, on a first
>>>>>>> >> >>> attempt). This worked well enough for me; I put 1.7 billion entries
>>>>>>> >> >>> into Kafka on a map reduce job in 5 and a half hours.
>>>>>>> >> >>>
>>>>>>> >> >>> I was concerned about using spark 1.5.2 because I'm currently putting
>>>>>>> >> >>> my data into a CDH 5.3 HDFS cluster, using hbase-spark .98 library
>>>>>>> >> >>> jars built for spark 1.2 on CDH 5.3. But after debugging quite a bit
>>>>>>> >> >>> yesterday, I tried building against 1.5.2. So far it's running without
>>>>>>> >> >>> issue on a Spark 1.5.2 cluster. I'm not sure there was too much
>>>>>>> >> >>> improvement using the same code, but I'll see how the direct api
>>>>>>> >> >>> handles it. In the end I can reduce the number of partitions in Kafka
>>>>>>> >> >>> if it causes big performance issues.
>>>>>>> >> >>>
>>>>>>> >> >>> On Tue, May 3, 2016 at 4:08 AM, Cody Koeninger <cody@koeninger.org> wrote:
>>>>>>> >> >>>> print() isn't really the best way to benchmark things, since it calls
>>>>>>> >> >>>> take(10) under the covers, but 380 records / second for a single
>>>>>>> >> >>>> receiver doesn't sound right in any case.
>>>>>>> >> >>>>
>>>>>>> >> >>>> Am I understanding correctly that you're trying to process a large
>>>>>>> >> >>>> number of already-existing kafka messages, not keep up with an
>>>>>>> >> >>>> incoming stream?  Can you give any details (e.g. hardware, number of
>>>>>>> >> >>>> topicpartitions, etc)?
>>>>>>> >> >>>>
>>>>>>> >> >>>> Really though, I'd try to start with spark 1.6 and direct streams, or
>>>>>>> >> >>>> even just kafkacat, as a baseline.
>>>>>>> >> >>>>
>>>>>>> >> >>>>
>>>>>>> >> >>>>
>>>>>>> >> >>>> On Mon, May 2, 2016 at 7:01 PM, Colin Kincaid Williams <discord@uw.edu> wrote:
>>>>>>> >> >>>>> Hello again. I searched for "backport kafka" in the list archives but
>>>>>>> >> >>>>> couldn't find anything but a post from Spark 0.7.2. I was going to
>>>>>>> >> >>>>> use accumulators to make a counter, but then saw the Receiver
>>>>>>> >> >>>>> Statistics on the Streaming tab. Then I removed all other
>>>>>>> >> >>>>> "functionality" except:
>>>>>>> >> >>>>>
>>>>>>> >> >>>>>     JavaPairReceiverInputDStream<byte[], byte[]> dstream = KafkaUtils
>>>>>>> >> >>>>>       // createStream(JavaStreamingContext jssc, Class<K> keyTypeClass,
>>>>>>> >> >>>>>       //   Class<V> valueTypeClass, Class<U> keyDecoderClass,
>>>>>>> >> >>>>>       //   Class<T> valueDecoderClass, java.util.Map<String,String> kafkaParams,
>>>>>>> >> >>>>>       //   java.util.Map<String,Integer> topics, StorageLevel storageLevel)
>>>>>>> >> >>>>>       .createStream(jssc, byte[].class, byte[].class,
>>>>>>> >> >>>>>           kafka.serializer.DefaultDecoder.class,
>>>>>>> >> >>>>>           kafka.serializer.DefaultDecoder.class,
>>>>>>> >> >>>>>           kafkaParamsMap, topicMap,
>>>>>>> >> >>>>>           StorageLevel.MEMORY_AND_DISK_SER());
>>>>>>> >> >>>>>
>>>>>>> >> >>>>>     dstream.print();
>>>>>>> >> >>>>>
>>>>>>> >> >>>>> Then in the Receiver Stats for the single receiver, I'm seeing around
>>>>>>> >> >>>>> 380 records / second. Then to get anywhere near my 10% mentioned
>>>>>>> >> >>>>> above, I'd need to run around 21 receivers, assuming 380 records /
>>>>>>> >> >>>>> second, just using the print output. This seems awfully high to me,
>>>>>>> >> >>>>> considering that I wrote 80000+ records a second to Kafka from a
>>>>>>> >> >>>>> mapreduce job, and that my bottleneck was likely Hbase. Again using
>>>>>>> >> >>>>> the 380 estimate, I would need 200+ receivers to reach a similar
>>>>>>> >> >>>>> amount of reads.
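>>>>>>> >> >>>>>
>>>>>>> >> >>>>> (The arithmetic behind those figures: 10% of the 80000+ / second write
>>>>>>> >> >>>>> rate is ~8,000 / second, and 8,000 / 380 ≈ 21 receivers; matching the
>>>>>>> >> >>>>> full 80,000 / second would take 80,000 / 380 ≈ 210 receivers.)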
>>>>>>> >> >>>>>
>>>>>>> >> >>>>> Even given the issues with the 1.2 receivers, is this the expected
>>>>>>> >> >>>>> way to use the Kafka streaming API, or am I doing something terribly
>>>>>>> >> >>>>> wrong?
>>>>>>> >> >>>>>
>>>>>>> >> >>>>> My application looks like
>>>>>>> >> >>>>> https://gist.github.com/drocsid/b0efa4ff6ff4a7c3c8bb56767d0b6877
>>>>>>> >> >>>>>
>>>>>>> >> >>>>> On Mon, May 2, 2016 at 6:09 PM, Cody Koeninger <cody@koeninger.org> wrote:
>>>>>>> >> >>>>>> Have you tested for read throughput (without writing to hbase, just
>>>>>>> >> >>>>>> deserialize)?
>>>>>>> >> >>>>>>
>>>>>>> >> >>>>>> Are you limited to using spark 1.2, or is upgrading possible?  The
>>>>>>> >> >>>>>> kafka direct stream is available starting with 1.3.  If you're stuck
>>>>>>> >> >>>>>> on 1.2, I believe there have been some attempts to backport it, search
>>>>>>> >> >>>>>> the mailing list archives.
>>>>>>> >> >>>>>>
>>>>>>> >> >>>>>> On Mon, May 2, 2016 at 12:54 PM, Colin Kincaid Williams <discord@uw.edu> wrote:
>>>>>>> >> >>>>>>> I've written an application to get content from a kafka topic with
>>>>>>> >> >>>>>>> 1.7 billion entries, get the protobuf serialized entries, and insert
>>>>>>> >> >>>>>>> into hbase. Currently the environment that I'm running in is Spark 1.2.
>>>>>>> >> >>>>>>>
>>>>>>> >> >>>>>>> With 8 executors and 2 cores, and 2 jobs, I'm only getting between
>>>>>>> >> >>>>>>> 0-2500 writes / second. This will take much too long to consume the
>>>>>>> >> >>>>>>> entries.
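>>>>>>> >> >>>>>>>
>>>>>>> >> >>>>>>> (For scale: at 2,500 writes / second, 1.7 billion entries would take
>>>>>>> >> >>>>>>> about 1.7e9 / 2,500 = 680,000 s, roughly 8 days; even at the 10%
>>>>>>> >> >>>>>>> target of ~8,000 / second it would still be about 2.5 days.)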
>>>>>>> >> >>>>>>>
>>>>>>> >> >>>>>>> I currently believe that the spark kafka receiver is the bottleneck.
>>>>>>> >> >>>>>>> I've tried both 1.2 receivers, with the WAL and without, and didn't
>>>>>>> >> >>>>>>> notice any large performance difference. I've tried many different
>>>>>>> >> >>>>>>> spark configuration options, but can't seem to get better performance.
>>>>>>> >> >>>>>>>
>>>>>>> >> >>>>>>> I saw 80000 requests / second inserting these records into kafka
>>>>>>> >> >>>>>>> using yarn / hbase / protobuf / kafka in a bulk fashion.
>>>>>>> >> >>>>>>>
>>>>>>> >> >>>>>>> While hbase inserts might not deliver the same throughput, I'd like
>>>>>>> >> >>>>>>> to at least get 10%.
>>>>>>> >> >>>>>>>
>>>>>>> >> >>>>>>> My application looks like
>>>>>>> >> >>>>>>> https://gist.github.com/drocsid/b0efa4ff6ff4a7c3c8bb56767d0b6877
>>>>>>> >> >>>>>>>
>>>>>>> >> >>>>>>> This is my first spark application. I'd appreciate any assistance.
>>>>>>> >> >>>>>>>
>>>>>>> >> >>>>>>>
>>>>>>> >> >>>>>>>
>>>>>>> >> >>>>
>>>>>>> >> >>>>
>>>>>>> >>
>>>>>>> >
>>>>>>
>>>>>>
>>>>>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org

