spark-user mailing list archives

From upkar_kohli <upkar.ko...@gmail.com>
Subject Re: Structured streaming (Kafka) error with Union
Date Fri, 25 Aug 2017 11:03:57 GMT
Hi Tathagata,

The error changes after adding dataFrame.explain(True) just before the
Kafka write. The full log is attached as "spark_log.txt".

Code is attached as kafka_ml_union.py

The code is executed as:
spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.2.0 kafka_ml_working.py

I open a Kafka console producer side by side:
kafka-console-producer --broker-list localhost:9092 --topic spark-stream
--property "parse.key=true" --property "key.separator=:"

eg.

jatin@Sandbox.RHEL:/home/jatin#kafka-console-producer --broker-list
localhost:9092 --topic spark-stream --property "parse.key=true" --property "key.separator=:"
abc:def
abc:spark


The consumer is run as below:
kafka-console-consumer --bootstrap-server localhost:9092 --topic topic_tgt
--property print.key=true --property key.separator=":" --from-beginning

Output:

8589934592:def:lr:0.0
8589934592:spark:lr:1.0
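Each consumer line above is the message key, the key.separator, and then the message value produced by the script's CONCAT expression. A plain-Python sketch of that formatting (a hypothetical helper, not part of the attached script; it assumes the union script concatenates the text, a model tag such as "lr", and the prediction):

```python
def render_record(key, text, model, prediction):
    # Message value as built by the script's CONCAT expression:
    # text:model:prediction
    value = "{}:{}:{}".format(text, model, prediction)
    # kafka-console-consumer with print.key=true prints
    # key, then key.separator (":"), then the value.
    return "{}:{}".format(key, value)

print(render_record("8589934592", "def", "lr", 0.0))   # 8589934592:def:lr:0.0
print(render_record("8589934592", "spark", "lr", 1.0)) # 8589934592:spark:lr:1.0
```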


Regards,
Upkar


On Fri, Aug 25, 2017 at 12:42 PM, Tathagata Das <tathagata.das1565@gmail.com> wrote:

> Can you give the query with the union?
>
> Better still, can you do `dataFrame.explain(true)` and paste the result here
> (where `dataFrame` is the same one on which you are calling writeStream)?
>
> TD
>
> On Tue, Aug 22, 2017 at 10:04 PM, upkar_kohli <upkar.kohli@gmail.com>
> wrote:
>
>> Hi,
>>
>> After the change, the code has started intermittently giving the error
>> below (only with the union):
>>
>> This is after running on a standalone node (Spark 2.2, the latest version).
>>
>> Maybe it is related to https://issues.apache.org/jira/browse/SPARK-19185 ?
>>
>> 17/08/23 09:52:38 ERROR StreamExecution: Query [id = 98c3ce45-9bf8-4d99-b115-b95b78cc06d1, runId = 9c0e333b-a846-460c-b744-0acf2c1248cf] terminated with error
>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 23.0 failed 1 times, most recent failure: Lost task 0.0 in stage 23.0 (TID 39, localhost, executor driver): java.util.ConcurrentModificationException: KafkaConsumer is not safe for multi-threaded access
>>         at org.apache.kafka.clients.consumer.KafkaConsumer.acquire(KafkaConsumer.java:1431)
>>         at org.apache.kafka.clients.consumer.KafkaConsumer.seek(KafkaConsumer.java:1132)
>>         at org.apache.spark.sql.kafka010.CachedKafkaConsumer.seek(CachedKafkaConsumer.scala:294)
>>         at org.apache.spark.sql.kafka010.CachedKafkaConsumer.org$apache$spark$sql$kafka010$CachedKafkaConsumer$$fetchData(CachedKafkaConsumer.scala:205)
>>         at org.apache.spark.sql.kafka010.CachedKafkaConsumer$$anonfun$get$1.apply(CachedKafkaConsumer.scala:117)
>>         at org.apache.spark.sql.kafka010.CachedKafkaConsumer$$anonfun$get$1.apply(CachedKafkaConsumer.scala:106)
>>         at org.apache.spark.util.UninterruptibleThread.runUninterruptibly(UninterruptibleThread.scala:85)
>>         at org.apache.spark.sql.kafka010.CachedKafkaConsumer.runUninterruptiblyIfPossible(CachedKafkaConsumer.scala:68)
>>         at org.apache.spark.sql.kafka010.CachedKafkaConsumer.get(CachedKafkaConsumer.scala:106)
>>         at org.apache.spark.sql.kafka010.KafkaSourceRDD$$anon$1.getNext(KafkaSourceRDD.scala:157)
>>         at org.apache.spark.sql.kafka010.KafkaSourceRDD$$anon$1.getNext(KafkaSourceRDD.scala:148)
>>         at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
>>         at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>>         at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>>         at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
>>
>>
>>
>> The configuration below didn't help for a similar issue:
>>
>> http://apache-spark-developers-list.1001551.n3.nabble.com/Streaming-ConcurrentModificationExceptions-when-Windowing-td20548.html
>>
>> Running with:
>>
>> spark-submit --conf spark.executor.cores=1 --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.2.0 kafka_ml.py
>>
>>
>> Regards,
>> Upkar
>>
>> On Wed, Aug 23, 2017 at 8:57 AM, <upkar.kohli@gmail.com> wrote:
>>
>>> Thanks. This was the issue
>>>
>>> Regards,
>>> Upkar
>>>
>>> Sent from my iPhone
>>>
>>> On 23-Aug-2017, at 01:29, Shixiong(Ryan) Zhu <shixiong@databricks.com>
>>> wrote:
>>>
>>> My hunch is you were using the same checkpoint location for
>>> "selected_lr" and "selected_all". When you tried to use "selected_all",
>>> it was using the checkpoint generated by "selected_lr", which caused the
>>> query to fail.
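[The fix described above can be sketched as follows: give each streaming query its own checkpointLocation, since reusing one directory makes the second query replay the first query's state. The directory names are hypothetical; the writeStream calls are shown as comments because starting real queries needs a live Spark and Kafka setup.]

```python
# Hypothetical checkpoint directories -- the key point is only that the
# two queries must not share one.
CKPT_LR = "/tmp/ckpt/selected_lr"
CKPT_ALL = "/tmp/ckpt/selected_all"

def kafka_sink_options(checkpoint_dir, topic="topic_tgt"):
    # In the real script this corresponds to:
    #   df.writeStream.format("kafka") \
    #       .option("checkpointLocation", checkpoint_dir) \
    #       .option("kafka.bootstrap.servers", "localhost:9092") \
    #       .option("topic", topic).start()
    return {
        "checkpointLocation": checkpoint_dir,
        "kafka.bootstrap.servers": "localhost:9092",
        "topic": topic,
    }

opts_lr = kafka_sink_options(CKPT_LR)    # options for the lr-only query
opts_all = kafka_sink_options(CKPT_ALL)  # options for the union query
assert opts_lr["checkpointLocation"] != opts_all["checkpointLocation"]
```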
>>>
>>> On Tue, Aug 22, 2017 at 2:12 AM, upkar_kohli <upkar.kohli@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am doing a POC on ML using structured streaming.
>>>>
>>>> The code below runs fine if I run the streaming script using
>>>> selected_lr.selectExpr (just logistic regression), but fails when I run
>>>> selected_all.selectExpr (which is a union of the naive Bayes and
>>>> logistic regression pipelines).
>>>>
>>>> Could you please help with what I am doing wrong?
>>>>
>>>> Regards,
>>>> Upkar
>>>>
>>>> ******************************************************************************
>>>> Code Starts ------------------------------------------------------------------
>>>>
>>>> from pyspark.ml import Pipeline
>>>> from pyspark.ml.classification import LogisticRegression, NaiveBayes
>>>> from pyspark.ml.feature import HashingTF, Tokenizer, StopWordsRemover, CountVectorizer, IDF
>>>> from pyspark import SparkConf, SparkContext
>>>> from pyspark.sql import SparkSession
>>>> from pyspark.sql.functions import concat, col, lit
>>>>
>>>> if __name__ == "__main__":
>>>>     spark = SparkSession\
>>>>         .builder\
>>>>         .appName("KafkaSentiment")\
>>>>         .getOrCreate()
>>>>
>>>>     # Prepare training documents from a list of (id, text, label) tuples.
>>>>     training = spark.createDataFrame([
>>>>         (0, "a b c d e spark", 1.0),
>>>>         (1, "b d", 0.0),
>>>>         (2, "spark f g h", 1.0),
>>>>         (3, "hadoop mapreduce", 0.0)
>>>>     ], ["id", "text", "label"])
>>>>
>>>>     # Configure two ML pipelines, each with five stages:
>>>>     # tokenizer, remover, hashingTF, idf, and a classifier (lr or nb).
>>>>     tokenizer = Tokenizer(inputCol="text", outputCol="words")
>>>>     remover = StopWordsRemover(inputCol=tokenizer.getOutputCol(), outputCol="relevant_words")
>>>>     hashingTF = HashingTF(inputCol=remover.getOutputCol(), outputCol="rawFeatures")
>>>>     idf = IDF(inputCol="rawFeatures", outputCol="features")
>>>>     lr = LogisticRegression(maxIter=10, regParam=0.001)
>>>>     nb = NaiveBayes(smoothing=1.0, modelType="multinomial")
>>>>     pipeline_lr = Pipeline(stages=[tokenizer, remover, hashingTF, idf, lr])
>>>>     pipeline_nb = Pipeline(stages=[tokenizer, remover, hashingTF, idf, nb])
>>>>
>>>>     # Fit the pipelines to the training documents.
>>>>     model_lr = pipeline_lr.fit(training)
>>>>     model_nb = pipeline_nb.fit(training)
>>>>
>>>>     # Read the test documents as a stream from Kafka.
>>>>     testing_data = spark.readStream.format("kafka")\
>>>>         .option("kafka.bootstrap.servers", "localhost:9092")\
>>>>         .option("subscribe", "spark-stream")\
>>>>         .load()
>>>>     test_ml = testing_data.selectExpr("1 as id", "CAST(value AS STRING) as text")
>>>>
>>>>     # Make predictions on the test documents and select columns of interest.
>>>>     prediction_lr = model_lr.transform(test_ml)
>>>>     prediction_nb = model_nb.transform(test_ml)
>>>>     selected_lr = prediction_lr.select("id", "text", "prediction")
>>>>     selected_nb = prediction_nb.select("id", "text", "prediction")
>>>>     print("printing schema")
>>>>     selected_lr.printSchema()
>>>>     selected_nb.printSchema()
>>>>     test_ml.printSchema()
>>>>
>>>>     query = test_ml.writeStream.outputMode("append").format("console").start()
>>>>     selected_all = selected_lr.union(selected_nb)
>>>>     selected_all.printSchema()
>>>>     kafka_write = selected_all.selectExpr(
>>>>             "CAST(id AS STRING) as key",
>>>>             "CONCAT(CAST(text AS STRING), ':', CAST(prediction AS STRING)) as value")\
>>>>         .writeStream.format("kafka")\
>>>>         .option("checkpointLocation", "/tmp/kaf_tgt")\
>>>>         .option("kafka.bootstrap.servers", "localhost:9092")\
>>>>         .option("topic", "topic_tgt")\
>>>>         .start()
>>>>
>>>>     kafka_write.awaitTermination()
>>>>
>>>>     query.awaitTermination()
>>>>
>>>> Code Ends --------------------------------------------------------------------
>>>> ******************************************************************************
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> jatin@Sandbox.RHEL:/home/jatin/projects/app#spark-submit --packages
>>>> org.apache.spark:spark-sql-kafka-0-10_2.11:2.2.0 kafka_ml.py
>>>> Ivy Default Cache set to: /home/jatin/.ivy2/cache
>>>> The jars for the packages stored in: /home/jatin/.ivy2/jars
>>>> :: loading settings :: url = jar:file:/home/jatin/projects/
>>>> spark/spark-2.2.0-bin-hadoop2.7/jars/ivy-2.4.0.jar!/org/apac
>>>> he/ivy/core/settings/ivysettings.xml
>>>> org.apache.spark#spark-sql-kafka-0-10_2.11 added as a dependency
>>>> :: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
>>>>         confs: [default]
>>>>         found org.apache.spark#spark-sql-kafka-0-10_2.11;2.2.0 in
>>>> central
>>>>         found org.apache.kafka#kafka-clients;0.10.0.1 in central
>>>>         found net.jpountz.lz4#lz4;1.3.0 in central
>>>>         found org.xerial.snappy#snappy-java;1.1.2.6 in central
>>>>         found org.slf4j#slf4j-api;1.7.16 in central
>>>>         found org.apache.spark#spark-tags_2.11;2.2.0 in central
>>>>         found org.spark-project.spark#unused;1.0.0 in central
>>>> :: resolution report :: resolve 9475ms :: artifacts dl 23ms
>>>>         :: modules in use:
>>>>         net.jpountz.lz4#lz4;1.3.0 from central in [default]
>>>>         org.apache.kafka#kafka-clients;0.10.0.1 from central in
>>>> [default]
>>>>         org.apache.spark#spark-sql-kafka-0-10_2.11;2.2.0 from central
>>>> in [default]
>>>>         org.apache.spark#spark-tags_2.11;2.2.0 from central in
>>>> [default]
>>>>         org.slf4j#slf4j-api;1.7.16 from central in [default]
>>>>         org.spark-project.spark#unused;1.0.0 from central in [default]
>>>>         org.xerial.snappy#snappy-java;1.1.2.6 from central in [default]
>>>>         ------------------------------------------------------------
>>>> ---------
>>>>         |                  |            modules            ||
>>>> artifacts   |
>>>>         |       conf       | number| search|dwnlded|evicted||
>>>> number|dwnlded|
>>>>         ------------------------------------------------------------
>>>> ---------
>>>>         |      default     |   7   |   2   |   2   |   0   ||   7   |
>>>> 0   |
>>>>         ------------------------------------------------------------
>>>> ---------
>>>> :: retrieving :: org.apache.spark#spark-submit-parent
>>>>         confs: [default]
>>>>         0 artifacts copied, 7 already retrieved (0kB/15ms)
>>>> Using Spark's default log4j profile: org/apache/spark/log4j-default
>>>> s.properties
>>>> 17/08/22 14:20:15 INFO SparkContext: Running Spark version 2.2.0
>>>> 17/08/22 14:20:15 WARN NativeCodeLoader: Unable to load native-hadoop
>>>> library for your platform... using builtin-java classes where applicable
>>>> 17/08/22 14:20:16 INFO SparkContext: Submitted application:
>>>> KafkaSentiment
>>>> 17/08/22 14:20:16 INFO SecurityManager: Changing view acls to: jatin
>>>> 17/08/22 14:20:16 INFO SecurityManager: Changing modify acls to: jatin
>>>> 17/08/22 14:20:16 INFO SecurityManager: Changing view acls groups to:
>>>> 17/08/22 14:20:16 INFO SecurityManager: Changing modify acls groups to:
>>>> 17/08/22 14:20:16 INFO SecurityManager: SecurityManager: authentication
>>>> disabled; ui acls disabled; users  with view permissions: Set(jatin);
>>>> groups with view permissions: Set(); users  with modify permissions:
>>>> Set(jatin); groups with modify permissions: Set()
>>>> 17/08/22 14:20:16 INFO Utils: Successfully started service
>>>> 'sparkDriver' on port 41554.
>>>> 17/08/22 14:20:16 INFO SparkEnv: Registering MapOutputTracker
>>>> 17/08/22 14:20:17 INFO SparkEnv: Registering BlockManagerMaster
>>>> 17/08/22 14:20:17 INFO BlockManagerMasterEndpoint: Using
>>>> org.apache.spark.storage.DefaultTopologyMapper for getting topology
>>>> information
>>>> 17/08/22 14:20:17 INFO BlockManagerMasterEndpoint:
>>>> BlockManagerMasterEndpoint up
>>>> 17/08/22 14:20:17 INFO DiskBlockManager: Created local directory at
>>>> /tmp/blockmgr-41217e8a-3b1f-498c-a749-8c51564e0ebe
>>>> 17/08/22 14:20:17 INFO MemoryStore: MemoryStore started with capacity
>>>> 366.3 MB
>>>> 17/08/22 14:20:17 INFO SparkEnv: Registering OutputCommitCoordinator
>>>> 17/08/22 14:20:17 INFO Utils: Successfully started service 'SparkUI' on
>>>> port 4040.
>>>> 17/08/22 14:20:17 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started
>>>> at http://192.168.25.187:4040
>>>> 17/08/22 14:20:17 INFO SparkContext: Added JAR
>>>> file:/home/jatin/.ivy2/jars/org.apache.spark_spark-sql-kafka-0-10_2.11-2.2.0.jar
>>>> at spark://192.168.25.187:41554/jars/org.apache.spark_spark-sql
>>>> -kafka-0-10_2.11-2.2.0.jar with timestamp 1503391817919
>>>> 17/08/22 14:20:17 INFO SparkContext: Added JAR
>>>> file:/home/jatin/.ivy2/jars/org.apache.kafka_kafka-clients-0.10.0.1.jar
>>>> at spark://192.168.25.187:41554/jars/org.apache.kafka_kafka-cli
>>>> ents-0.10.0.1.jar with timestamp 1503391817924
>>>> 17/08/22 14:20:17 INFO SparkContext: Added JAR
>>>> file:/home/jatin/.ivy2/jars/org.apache.spark_spark-tags_2.11-2.2.0.jar
>>>> at spark://192.168.25.187:41554/jars/org.apache.spark_spark-tag
>>>> s_2.11-2.2.0.jar with timestamp 1503391817924
>>>> 17/08/22 14:20:17 INFO SparkContext: Added JAR
>>>> file:/home/jatin/.ivy2/jars/org.spark-project.spark_unused-1.0.0.jar
>>>> at spark://192.168.25.187:41554/jars/org.spark-project.spark_un
>>>> used-1.0.0.jar with timestamp 1503391817924
>>>> 17/08/22 14:20:17 INFO SparkContext: Added JAR
>>>> file:/home/jatin/.ivy2/jars/net.jpountz.lz4_lz4-1.3.0.jar at spark://
>>>> 192.168.25.187:41554/jars/net.jpountz.lz4_lz4-1.3.0.jar with timestamp
>>>> 1503391817924
>>>> 17/08/22 14:20:17 INFO SparkContext: Added JAR
>>>> file:/home/jatin/.ivy2/jars/org.xerial.snappy_snappy-java-1.1.2.6.jar
>>>> at spark://192.168.25.187:41554/jars/org.xerial.snappy_snappy-j
>>>> ava-1.1.2.6.jar with timestamp 1503391817924
>>>> 17/08/22 14:20:17 INFO SparkContext: Added JAR
>>>> file:/home/jatin/.ivy2/jars/org.slf4j_slf4j-api-1.7.16.jar at spark://
>>>> 192.168.25.187:41554/jars/org.slf4j_slf4j-api-1.7.16.jar with
>>>> timestamp 1503391817925
>>>> 17/08/22 14:20:18 INFO SparkContext: Added file
>>>> file:/home/jatin/projects/app/kafka_ml.py at
>>>> file:/home/jatin/projects/app/kafka_ml.py with timestamp 1503391818468
>>>> 17/08/22 14:20:18 INFO Utils: Copying /home/jatin/projects/app/kafka_ml.py
>>>> to /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/kafka_ml.py
>>>> 17/08/22 14:20:18 INFO SparkContext: Added file
>>>> file:/home/jatin/.ivy2/jars/org.apache.spark_spark-sql-kafka-0-10_2.11-2.2.0.jar
>>>> at file:/home/jatin/.ivy2/jars/org.apache.spark_spark-sql-kafka-0-10_2.11-2.2.0.jar
>>>> with timestamp 1503391818514
>>>> 17/08/22 14:20:18 INFO Utils: Copying /home/jatin/.ivy2/jars/org.apa
>>>> che.spark_spark-sql-kafka-0-10_2.11-2.2.0.jar to
>>>> /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/org.apache.spark_spark-sq
>>>> l-kafka-0-10_2.11-2.2.0.jar
>>>> 17/08/22 14:20:18 INFO SparkContext: Added file
>>>> file:/home/jatin/.ivy2/jars/org.apache.kafka_kafka-clients-0.10.0.1.jar
>>>> at file:/home/jatin/.ivy2/jars/org.apache.kafka_kafka-clients-0.10.0.1.jar
>>>> with timestamp 1503391818538
>>>> 17/08/22 14:20:18 INFO Utils: Copying /home/jatin/.ivy2/jars/org.apa
>>>> che.kafka_kafka-clients-0.10.0.1.jar to /tmp/spark-9474eda2-128d-4b00-
>>>> 8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7ea20c8
>>>> 2f1b/org.apache.kafka_kafka-clients-0.10.0.1.jar
>>>> 17/08/22 14:20:18 INFO SparkContext: Added file
>>>> file:/home/jatin/.ivy2/jars/org.apache.spark_spark-tags_2.11-2.2.0.jar
>>>> at file:/home/jatin/.ivy2/jars/org.apache.spark_spark-tags_2.11-2.2.0.jar
>>>> with timestamp 1503391818567
>>>> 17/08/22 14:20:18 INFO Utils: Copying /home/jatin/.ivy2/jars/org.apa
>>>> che.spark_spark-tags_2.11-2.2.0.jar to /tmp/spark-9474eda2-128d-4b00-
>>>> 8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7ea20c8
>>>> 2f1b/org.apache.spark_spark-tags_2.11-2.2.0.jar
>>>> 17/08/22 14:20:18 INFO SparkContext: Added file
>>>> file:/home/jatin/.ivy2/jars/org.spark-project.spark_unused-1.0.0.jar
>>>> at file:/home/jatin/.ivy2/jars/org.spark-project.spark_unused-1.0.0.jar
>>>> with timestamp 1503391818573
>>>> 17/08/22 14:20:18 INFO Utils: Copying /home/jatin/.ivy2/jars/org.spa
>>>> rk-project.spark_unused-1.0.0.jar to /tmp/spark-9474eda2-128d-4b00-
>>>> 8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7ea20c8
>>>> 2f1b/org.spark-project.spark_unused-1.0.0.jar
>>>> 17/08/22 14:20:18 INFO SparkContext: Added file
>>>> file:/home/jatin/.ivy2/jars/net.jpountz.lz4_lz4-1.3.0.jar at
>>>> file:/home/jatin/.ivy2/jars/net.jpountz.lz4_lz4-1.3.0.jar with
>>>> timestamp 1503391818581
>>>> 17/08/22 14:20:18 INFO Utils: Copying /home/jatin/.ivy2/jars/net.jpountz.lz4_lz4-1.3.0.jar
>>>> to /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/net.jpountz.lz4_lz4-1.3.0.jar
>>>> 17/08/22 14:20:18 INFO SparkContext: Added file
>>>> file:/home/jatin/.ivy2/jars/org.xerial.snappy_snappy-java-1.1.2.6.jar
>>>> at file:/home/jatin/.ivy2/jars/org.xerial.snappy_snappy-java-1.1.2.6.jar
>>>> with timestamp 1503391818602
>>>> 17/08/22 14:20:18 INFO Utils: Copying /home/jatin/.ivy2/jars/org.xer
>>>> ial.snappy_snappy-java-1.1.2.6.jar to /tmp/spark-9474eda2-128d-4b00-
>>>> 8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7ea20c8
>>>> 2f1b/org.xerial.snappy_snappy-java-1.1.2.6.jar
>>>> 17/08/22 14:20:18 INFO SparkContext: Added file
>>>> file:/home/jatin/.ivy2/jars/org.slf4j_slf4j-api-1.7.16.jar at
>>>> file:/home/jatin/.ivy2/jars/org.slf4j_slf4j-api-1.7.16.jar with
>>>> timestamp 1503391818631
>>>> 17/08/22 14:20:18 INFO Utils: Copying /home/jatin/.ivy2/jars/org.slf4j_slf4j-api-1.7.16.jar
>>>> to /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/org.slf4j_slf4j-api-1.7.16.jar
>>>> 17/08/22 14:20:18 INFO Executor: Starting executor ID driver on host
>>>> localhost
>>>> 17/08/22 14:20:18 INFO Utils: Successfully started service
>>>> 'org.apache.spark.network.netty.NettyBlockTransferService' on port
>>>> 46647.
>>>> 17/08/22 14:20:18 INFO NettyBlockTransferService: Server created on
>>>> 192.168.25.187:46647
>>>> 17/08/22 14:20:18 INFO BlockManager: Using
>>>> org.apache.spark.storage.RandomBlockReplicationPolicy for block
>>>> replication policy
>>>> 17/08/22 14:20:18 INFO BlockManagerMaster: Registering BlockManager
>>>> BlockManagerId(driver, 192.168.25.187, 46647, None)
>>>> 17/08/22 14:20:18 INFO BlockManagerMasterEndpoint: Registering block
>>>> manager 192.168.25.187:46647 with 366.3 MB RAM, BlockManagerId(driver,
>>>> 192.168.25.187, 46647, None)
>>>> 17/08/22 14:20:18 INFO BlockManagerMaster: Registered BlockManager
>>>> BlockManagerId(driver, 192.168.25.187, 46647, None)
>>>> 17/08/22 14:20:18 INFO BlockManager: Initialized BlockManager:
>>>> BlockManagerId(driver, 192.168.25.187, 46647, None)
>>>> 17/08/22 14:20:19 INFO SharedState: Setting
>>>> hive.metastore.warehouse.dir ('null') to the value of
>>>> spark.sql.warehouse.dir ('file:/home/jatin/projects/ap
>>>> p/spark-warehouse/').
>>>> 17/08/22 14:20:19 INFO SharedState: Warehouse path is
>>>> 'file:/home/jatin/projects/app/spark-warehouse/'.
>>>> 17/08/22 14:20:20 INFO StateStoreCoordinatorRef: Registered
>>>> StateStoreCoordinator endpoint
>>>> 17/08/22 14:20:26 INFO CodeGenerator: Code generated in 504.736602 ms
>>>> 17/08/22 14:20:27 INFO SparkContext: Starting job: treeAggregate at
>>>> IDF.scala:54
>>>> 17/08/22 14:20:27 INFO DAGScheduler: Got job 0 (treeAggregate at
>>>> IDF.scala:54) with 2 output partitions
>>>> 17/08/22 14:20:27 INFO DAGScheduler: Final stage: ResultStage 0
>>>> (treeAggregate at IDF.scala:54)
>>>> 17/08/22 14:20:27 INFO DAGScheduler: Parents of final stage: List()
>>>> 17/08/22 14:20:27 INFO DAGScheduler: Missing parents: List()
>>>> 17/08/22 14:20:27 INFO DAGScheduler: Submitting ResultStage 0
>>>> (MapPartitionsRDD[10] at treeAggregate at IDF.scala:54), which has no
>>>> missing parents
>>>> 17/08/22 14:20:27 INFO MemoryStore: Block broadcast_0 stored as values
>>>> in memory (estimated size 26.0 KB, free 366.3 MB)
>>>> 17/08/22 14:20:27 INFO MemoryStore: Block broadcast_0_piece0 stored as
>>>> bytes in memory (estimated size 11.5 KB, free 366.3 MB)
>>>> 17/08/22 14:20:27 INFO BlockManagerInfo: Added broadcast_0_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 11.5 KB, free: 366.3 MB)
>>>> 17/08/22 14:20:27 INFO SparkContext: Created broadcast 0 from broadcast
>>>> at DAGScheduler.scala:1006
>>>> 17/08/22 14:20:27 INFO DAGScheduler: Submitting 2 missing tasks from
>>>> ResultStage 0 (MapPartitionsRDD[10] at treeAggregate at IDF.scala:54)
>>>> (first 15 tasks are for partitions Vector(0, 1))
>>>> 17/08/22 14:20:27 INFO TaskSchedulerImpl: Adding task set 0.0 with 2
>>>> tasks
>>>> 17/08/22 14:20:27 INFO TaskSetManager: Starting task 0.0 in stage 0.0
>>>> (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 4883 bytes)
>>>> 17/08/22 14:20:27 INFO TaskSetManager: Starting task 1.0 in stage 0.0
>>>> (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 4892 bytes)
>>>> 17/08/22 14:20:28 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
>>>> 17/08/22 14:20:28 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
>>>> 17/08/22 14:20:28 INFO Executor: Fetching file:/home/jatin/.ivy2/jars/or
>>>> g.apache.kafka_kafka-clients-0.10.0.1.jar with timestamp 1503391818538
>>>> 17/08/22 14:20:28 INFO Utils: /home/jatin/.ivy2/jars/org.apa
>>>> che.kafka_kafka-clients-0.10.0.1.jar has been previously copied to
>>>> /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/org.apache.kafka_kafka-cl
>>>> ients-0.10.0.1.jar
>>>> 17/08/22 14:20:28 INFO Executor: Fetching file:/home/jatin/.ivy2/jars/or
>>>> g.spark-project.spark_unused-1.0.0.jar with timestamp 1503391818573
>>>> 17/08/22 14:20:28 INFO Utils: /home/jatin/.ivy2/jars/org.spa
>>>> rk-project.spark_unused-1.0.0.jar has been previously copied to
>>>> /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/org.spark-project.spark_u
>>>> nused-1.0.0.jar
>>>> 17/08/22 14:20:28 INFO Executor: Fetching file:/home/jatin/.ivy2/jars/or
>>>> g.apache.spark_spark-sql-kafka-0-10_2.11-2.2.0.jar with timestamp
>>>> 1503391818514
>>>> 17/08/22 14:20:28 INFO Utils: /home/jatin/.ivy2/jars/org.apa
>>>> che.spark_spark-sql-kafka-0-10_2.11-2.2.0.jar has been previously
>>>> copied to /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/org.apache.spark_spark-sq
>>>> l-kafka-0-10_2.11-2.2.0.jar
>>>> 17/08/22 14:20:28 INFO Executor: Fetching file:/home/jatin/projects/app/kafka_ml.py
>>>> with timestamp 1503391818468
>>>> 17/08/22 14:20:28 INFO Utils: /home/jatin/projects/app/kafka_ml.py has
>>>> been previously copied to /tmp/spark-9474eda2-128d-4b00-
>>>> 8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7ea20c8
>>>> 2f1b/kafka_ml.py
>>>> 17/08/22 14:20:28 INFO Executor: Fetching file:/home/jatin/.ivy2/jars/org.slf4j_slf4j-api-1.7.16.jar
>>>> with timestamp 1503391818631
>>>> 17/08/22 14:20:28 INFO Utils: /home/jatin/.ivy2/jars/org.slf4j_slf4j-api-1.7.16.jar
>>>> has been previously copied to /tmp/spark-9474eda2-128d-4b00-
>>>> 8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7ea20c8
>>>> 2f1b/org.slf4j_slf4j-api-1.7.16.jar
>>>> 17/08/22 14:20:28 INFO Executor: Fetching file:/home/jatin/.ivy2/jars/net.jpountz.lz4_lz4-1.3.0.jar
>>>> with timestamp 1503391818581
>>>> 17/08/22 14:20:28 INFO Utils: /home/jatin/.ivy2/jars/net.jpountz.lz4_lz4-1.3.0.jar
>>>> has been previously copied to /tmp/spark-9474eda2-128d-4b00-
>>>> 8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7ea20c8
>>>> 2f1b/net.jpountz.lz4_lz4-1.3.0.jar
>>>> 17/08/22 14:20:28 INFO Executor: Fetching file:/home/jatin/.ivy2/jars/or
>>>> g.xerial.snappy_snappy-java-1.1.2.6.jar with timestamp 1503391818602
>>>> 17/08/22 14:20:28 INFO Utils: /home/jatin/.ivy2/jars/org.xer
>>>> ial.snappy_snappy-java-1.1.2.6.jar has been previously copied to
>>>> /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/org.xerial.snappy_snappy-
>>>> java-1.1.2.6.jar
>>>> 17/08/22 14:20:28 INFO Executor: Fetching file:/home/jatin/.ivy2/jars/or
>>>> g.apache.spark_spark-tags_2.11-2.2.0.jar with timestamp 1503391818567
>>>> 17/08/22 14:20:28 INFO Utils: /home/jatin/.ivy2/jars/org.apa
>>>> che.spark_spark-tags_2.11-2.2.0.jar has been previously copied to
>>>> /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/org.apache.spark_spark-ta
>>>> gs_2.11-2.2.0.jar
>>>> 17/08/22 14:20:28 INFO Executor: Fetching spark://
>>>> 192.168.25.187:41554/jars/org.slf4j_slf4j-api-1.7.16.jar with
>>>> timestamp 1503391817925
>>>> 17/08/22 14:20:28 INFO TransportClientFactory: Successfully created
>>>> connection to /192.168.25.187:41554 after 95 ms (0 ms spent in
>>>> bootstraps)
>>>> 17/08/22 14:20:28 INFO Utils: Fetching spark://192.168.25.187:41554/j
>>>> ars/org.slf4j_slf4j-api-1.7.16.jar to /tmp/spark-9474eda2-128d-4b00-
>>>> 8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7ea20c8
>>>> 2f1b/fetchFileTemp9222131133573862634.tmp
>>>> 17/08/22 14:20:28 INFO Utils: /tmp/spark-9474eda2-128d-4b00-
>>>> 8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7ea20c8
>>>> 2f1b/fetchFileTemp9222131133573862634.tmp has been previously copied
>>>> to /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/org.slf4j_slf4j-api-1.7.16.jar
>>>> 17/08/22 14:20:28 INFO Executor: Adding file:/tmp/spark-9474eda2-128d-
>>>> 4b00-8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7e
>>>> a20c82f1b/org.slf4j_slf4j-api-1.7.16.jar to class loader
>>>> 17/08/22 14:20:28 INFO Executor: Fetching spark://
>>>> 192.168.25.187:41554/jars/org.apache.spark_spark-sql
>>>> -kafka-0-10_2.11-2.2.0.jar with timestamp 1503391817919
>>>> 17/08/22 14:20:28 INFO Utils: Fetching spark://192.168.25.187:41554/j
>>>> ars/org.apache.spark_spark-sql-kafka-0-10_2.11-2.2.0.jar to
>>>> /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/fetchFileTemp3539601253503251859.tmp
>>>> 17/08/22 14:20:28 INFO Utils: /tmp/spark-9474eda2-128d-4b00-
>>>> 8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7ea20c8
>>>> 2f1b/fetchFileTemp3539601253503251859.tmp has been previously copied
>>>> to /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/org.apache.spark_spark-sq
>>>> l-kafka-0-10_2.11-2.2.0.jar
>>>> 17/08/22 14:20:28 INFO Executor: Adding file:/tmp/spark-9474eda2-128d-
>>>> 4b00-8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7e
>>>> a20c82f1b/org.apache.spark_spark-sql-kafka-0-10_2.11-2.2.0.jar to
>>>> class loader
>>>> 17/08/22 14:20:28 INFO Executor: Fetching spark://
>>>> 192.168.25.187:41554/jars/org.apache.kafka_kafka-clients-0.10.0.1.jar
>>>> with timestamp 1503391817924
>>>> 17/08/22 14:20:28 INFO Utils: Fetching spark://192.168.25.187:41554/j
>>>> ars/org.apache.kafka_kafka-clients-0.10.0.1.jar to
>>>> /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/fetchFileTemp6110236546923979105.tmp
>>>> 17/08/22 14:20:28 INFO Utils: /tmp/spark-9474eda2-128d-4b00-
>>>> 8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7ea20c8
>>>> 2f1b/fetchFileTemp6110236546923979105.tmp has been previously copied
>>>> to /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/org.apache.kafka_kafka-cl
>>>> ients-0.10.0.1.jar
>>>> 17/08/22 14:20:28 INFO Executor: Adding file:/tmp/spark-9474eda2-128d-
>>>> 4b00-8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7e
>>>> a20c82f1b/org.apache.kafka_kafka-clients-0.10.0.1.jar to class loader
>>>> 17/08/22 14:20:28 INFO Executor: Fetching spark://
>>>> 192.168.25.187:41554/jars/net.jpountz.lz4_lz4-1.3.0.jar with timestamp
>>>> 1503391817924
>>>> 17/08/22 14:20:28 INFO Utils: Fetching spark://192.168.25.187:41554/j
>>>> ars/net.jpountz.lz4_lz4-1.3.0.jar to /tmp/spark-9474eda2-128d-4b00-
>>>> 8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7ea20c8
>>>> 2f1b/fetchFileTemp7615793935390489572.tmp
>>>> 17/08/22 14:20:28 INFO Utils: /tmp/spark-9474eda2-128d-4b00-
>>>> 8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7ea20c8
>>>> 2f1b/fetchFileTemp7615793935390489572.tmp has been previously copied
>>>> to /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/net.jpountz.lz4_lz4-1.3.0.jar
>>>> 17/08/22 14:20:28 INFO Executor: Adding file:/tmp/spark-9474eda2-128d-
>>>> 4b00-8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7e
>>>> a20c82f1b/net.jpountz.lz4_lz4-1.3.0.jar to class loader
>>>> 17/08/22 14:20:28 INFO Executor: Fetching spark://
>>>> 192.168.25.187:41554/jars/org.xerial.snappy_snappy-java-1.1.2.6.jar
>>>> with timestamp 1503391817924
>>>> 17/08/22 14:20:28 INFO Utils: Fetching spark://192.168.25.187:41554/j
>>>> ars/org.xerial.snappy_snappy-java-1.1.2.6.jar to
>>>> /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/fetchFileTemp4037226777938816116.tmp
>>>> 17/08/22 14:20:28 INFO Utils: /tmp/spark-9474eda2-128d-4b00-
>>>> 8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7ea20c8
>>>> 2f1b/fetchFileTemp4037226777938816116.tmp has been previously copied
>>>> to /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/org.xerial.snappy_snappy-
>>>> java-1.1.2.6.jar
>>>> 17/08/22 14:20:28 INFO Executor: Adding file:/tmp/spark-9474eda2-128d-
>>>> 4b00-8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7e
>>>> a20c82f1b/org.xerial.snappy_snappy-java-1.1.2.6.jar to class loader
>>>> 17/08/22 14:20:28 INFO Executor: Fetching spark://
>>>> 192.168.25.187:41554/jars/org.spark-project.spark_unused-1.0.0.jar
>>>> with timestamp 1503391817924
>>>> 17/08/22 14:20:28 INFO Utils: Fetching spark://192.168.25.187:41554/j
>>>> ars/org.spark-project.spark_unused-1.0.0.jar to
>>>> /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/fetchFileTemp5870103019955438186.tmp
>>>> 17/08/22 14:20:28 INFO Utils: /tmp/spark-9474eda2-128d-4b00-
>>>> 8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7ea20c8
>>>> 2f1b/fetchFileTemp5870103019955438186.tmp has been previously copied
>>>> to /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/org.spark-project.spark_u
>>>> nused-1.0.0.jar
>>>> 17/08/22 14:20:28 INFO Executor: Adding file:/tmp/spark-9474eda2-128d-
>>>> 4b00-8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7e
>>>> a20c82f1b/org.spark-project.spark_unused-1.0.0.jar to class loader
>>>> 17/08/22 14:20:28 INFO Executor: Fetching spark://
>>>> 192.168.25.187:41554/jars/org.apache.spark_spark-tags_2.11-2.2.0.jar
>>>> with timestamp 1503391817924
>>>> 17/08/22 14:20:28 INFO Utils: Fetching spark://192.168.25.187:41554/j
>>>> ars/org.apache.spark_spark-tags_2.11-2.2.0.jar to
>>>> /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/fetchFileTemp344232108189163563.tmp
>>>> 17/08/22 14:20:28 INFO Utils: /tmp/spark-9474eda2-128d-4b00-
>>>> 8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7ea20c8
>>>> 2f1b/fetchFileTemp344232108189163563.tmp has been previously copied to
>>>> /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/userFiles-33
>>>> d4eaeb-d577-4250-a223-d7ea20c82f1b/org.apache.spark_spark-ta
>>>> gs_2.11-2.2.0.jar
>>>> 17/08/22 14:20:28 INFO Executor: Adding file:/tmp/spark-9474eda2-128d-
>>>> 4b00-8f99-dc02012a8bf7/userFiles-33d4eaeb-d577-4250-a223-d7e
>>>> a20c82f1b/org.apache.spark_spark-tags_2.11-2.2.0.jar to class loader
>>>> 17/08/22 14:20:29 INFO CodeGenerator: Code generated in 32.736524 ms
>>>> 17/08/22 14:20:29 INFO CodeGenerator: Code generated in 25.33048 ms
>>>> 17/08/22 14:20:31 INFO PythonRunner: Times: total = 436, boot = 401,
>>>> init = 33, finish = 2
>>>> 17/08/22 14:20:31 INFO PythonRunner: Times: total = 439, boot = 406,
>>>> init = 32, finish = 1
>>>> 17/08/22 14:20:31 INFO MemoryStore: Block taskresult_1 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 364.3 MB)
>>>> 17/08/22 14:20:31 INFO BlockManagerInfo: Added taskresult_1 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 364.3 MB)
>>>> 17/08/22 14:20:31 INFO Executor: Finished task 1.0 in stage 0.0 (TID
>>>> 1). 2109385 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:31 INFO MemoryStore: Block taskresult_0 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 362.2 MB)
>>>> 17/08/22 14:20:31 INFO BlockManagerInfo: Added taskresult_0 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 362.3 MB)
>>>> 17/08/22 14:20:31 INFO Executor: Finished task 0.0 in stage 0.0 (TID
>>>> 0). 2109385 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:31 INFO TransportClientFactory: Successfully created
>>>> connection to /192.168.25.187:46647 after 2 ms (0 ms spent in
>>>> bootstraps)
>>>> 17/08/22 14:20:31 INFO TaskSetManager: Finished task 0.0 in stage 0.0
>>>> (TID 0) in 3689 ms on localhost (executor driver) (1/2)
>>>> 17/08/22 14:20:31 INFO TaskSetManager: Finished task 1.0 in stage 0.0
>>>> (TID 1) in 3682 ms on localhost (executor driver) (2/2)
>>>> 17/08/22 14:20:31 INFO BlockManagerInfo: Removed taskresult_0 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 364.3 MB)
>>>> 17/08/22 14:20:31 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose
>>>> tasks have all completed, from pool
>>>> 17/08/22 14:20:31 INFO BlockManagerInfo: Removed taskresult_1 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 366.3 MB)
>>>> 17/08/22 14:20:31 INFO DAGScheduler: ResultStage 0 (treeAggregate at
>>>> IDF.scala:54) finished in 3.776 s
>>>> 17/08/22 14:20:31 INFO DAGScheduler: Job 0 finished: treeAggregate at
>>>> IDF.scala:54, took 4.418658 s
>>>> 17/08/22 14:20:32 INFO CodeGenerator: Code generated in 173.848512 ms
>>>> 17/08/22 14:20:32 INFO CodeGenerator: Code generated in 84.504745 ms
>>>> 17/08/22 14:20:32 INFO Instrumentation: LogisticRegression-LogisticReg
>>>> ression_4647a03eff9e40a729dc-1694037276-1: training: numPartitions=2
>>>> storageLevel=StorageLevel(disk, memory, deserialized, 1 replicas)
>>>> 17/08/22 14:20:32 INFO Instrumentation: LogisticRegression-LogisticReg
>>>> ression_4647a03eff9e40a729dc-1694037276-1:
>>>> {"elasticNetParam":0.0,"fitIntercept":true,"standardization"
>>>> :true,"threshold":0.5,"regParam":0.001,"tol":1.0E-6,"maxIter":10}
>>>> 17/08/22 14:20:32 INFO SparkContext: Starting job: treeAggregate at
>>>> LogisticRegression.scala:517
>>>> 17/08/22 14:20:32 INFO DAGScheduler: Got job 1 (treeAggregate at
>>>> LogisticRegression.scala:517) with 2 output partitions
>>>> 17/08/22 14:20:32 INFO DAGScheduler: Final stage: ResultStage 1
>>>> (treeAggregate at LogisticRegression.scala:517)
>>>> 17/08/22 14:20:32 INFO DAGScheduler: Parents of final stage: List()
>>>> 17/08/22 14:20:32 INFO DAGScheduler: Missing parents: List()
>>>> 17/08/22 14:20:32 INFO DAGScheduler: Submitting ResultStage 1
>>>> (MapPartitionsRDD[20] at treeAggregate at LogisticRegression.scala:517),
>>>> which has no missing parents
>>>> 17/08/22 14:20:32 INFO MemoryStore: Block broadcast_1 stored as values
>>>> in memory (estimated size 2.0 MB, free 364.2 MB)
>>>> 17/08/22 14:20:32 INFO MemoryStore: Block broadcast_1_piece0 stored as
>>>> bytes in memory (estimated size 24.3 KB, free 364.2 MB)
>>>> 17/08/22 14:20:32 INFO BlockManagerInfo: Added broadcast_1_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 24.3 KB, free: 366.3 MB)
>>>> 17/08/22 14:20:33 INFO SparkContext: Created broadcast 1 from broadcast
>>>> at DAGScheduler.scala:1006
>>>> 17/08/22 14:20:33 INFO DAGScheduler: Submitting 2 missing tasks from
>>>> ResultStage 1 (MapPartitionsRDD[20] at treeAggregate at
>>>> LogisticRegression.scala:517) (first 15 tasks are for partitions Vector(0,
>>>> 1))
>>>> 17/08/22 14:20:33 INFO TaskSchedulerImpl: Adding task set 1.0 with 2
>>>> tasks
>>>> 17/08/22 14:20:33 INFO TaskSetManager: Starting task 0.0 in stage 1.0
>>>> (TID 2, localhost, executor driver, partition 0, PROCESS_LOCAL, 4883 bytes)
>>>> 17/08/22 14:20:33 INFO TaskSetManager: Starting task 1.0 in stage 1.0
>>>> (TID 3, localhost, executor driver, partition 1, PROCESS_LOCAL, 4892 bytes)
>>>> 17/08/22 14:20:33 INFO Executor: Running task 0.0 in stage 1.0 (TID 2)
>>>> 17/08/22 14:20:33 INFO Executor: Running task 1.0 in stage 1.0 (TID 3)
>>>> 17/08/22 14:20:33 INFO CodeGenerator: Code generated in 42.221867 ms
>>>> 17/08/22 14:20:33 INFO PythonRunner: Times: total = 53, boot = -3759,
>>>> init = 3812, finish = 0
>>>> 17/08/22 14:20:33 INFO MemoryStore: Block rdd_19_1 stored as values in
>>>> memory (estimated size 272.0 B, free 364.2 MB)
>>>> 17/08/22 14:20:33 INFO PythonRunner: Times: total = 47, boot = -3782,
>>>> init = 3829, finish = 0
>>>> 17/08/22 14:20:33 INFO MemoryStore: Block rdd_19_0 stored as values in
>>>> memory (estimated size 288.0 B, free 364.2 MB)
>>>> 17/08/22 14:20:33 INFO BlockManagerInfo: Added rdd_19_1 in memory on
>>>> 192.168.25.187:46647 (size: 272.0 B, free: 366.3 MB)
>>>> 17/08/22 14:20:33 INFO BlockManagerInfo: Added rdd_19_0 in memory on
>>>> 192.168.25.187:46647 (size: 288.0 B, free: 366.3 MB)
>>>> 17/08/22 14:20:33 INFO ContextCleaner: Cleaned accumulator 1
>>>> 17/08/22 14:20:33 INFO ContextCleaner: Cleaned accumulator 2
>>>> 17/08/22 14:20:33 INFO BlockManagerInfo: Removed broadcast_0_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 11.5 KB, free: 366.3 MB)
>>>> 17/08/22 14:20:33 INFO MemoryStore: Block taskresult_2 stored as bytes
>>>> in memory (estimated size 16.1 MB, free 348.2 MB)
>>>> 17/08/22 14:20:33 INFO BlockManagerInfo: Added taskresult_2 in memory
>>>> on 192.168.25.187:46647 (size: 16.1 MB, free: 350.2 MB)
>>>> 17/08/22 14:20:33 INFO Executor: Finished task 0.0 in stage 1.0 (TID
>>>> 2). 16862200 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:33 INFO MemoryStore: Block taskresult_3 stored as bytes
>>>> in memory (estimated size 16.1 MB, free 332.1 MB)
>>>> 17/08/22 14:20:33 INFO BlockManagerInfo: Added taskresult_3 in memory
>>>> on 192.168.25.187:46647 (size: 16.1 MB, free: 334.1 MB)
>>>> 17/08/22 14:20:33 INFO Executor: Finished task 1.0 in stage 1.0 (TID
>>>> 3). 16862200 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:34 INFO BlockManagerInfo: Removed taskresult_2 on
>>>> 192.168.25.187:46647 in memory (size: 16.1 MB, free: 350.2 MB)
>>>> 17/08/22 14:20:34 INFO TaskSetManager: Finished task 0.0 in stage 1.0
>>>> (TID 2) in 1056 ms on localhost (executor driver) (1/2)
>>>> 17/08/22 14:20:34 INFO BlockManagerInfo: Removed taskresult_3 on
>>>> 192.168.25.187:46647 in memory (size: 16.1 MB, free: 366.3 MB)
>>>> 17/08/22 14:20:34 INFO TaskSetManager: Finished task 1.0 in stage 1.0
>>>> (TID 3) in 1094 ms on localhost (executor driver) (2/2)
>>>> 17/08/22 14:20:34 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose
>>>> tasks have all completed, from pool
>>>> 17/08/22 14:20:34 INFO DAGScheduler: ResultStage 1 (treeAggregate at
>>>> LogisticRegression.scala:517) finished in 1.100 s
>>>> 17/08/22 14:20:34 INFO DAGScheduler: Job 1 finished: treeAggregate at
>>>> LogisticRegression.scala:517, took 1.217516 s
>>>> 17/08/22 14:20:34 INFO Instrumentation: LogisticRegression-LogisticReg
>>>> ression_4647a03eff9e40a729dc-1694037276-1: {"numClasses":2}
>>>> 17/08/22 14:20:34 INFO Instrumentation: LogisticRegression-LogisticReg
>>>> ression_4647a03eff9e40a729dc-1694037276-1: {"numFeatures":262144}
>>>> 17/08/22 14:20:34 INFO MemoryStore: Block broadcast_2 stored as values
>>>> in memory (estimated size 2.0 MB, free 362.2 MB)
>>>> 17/08/22 14:20:34 INFO MemoryStore: Block broadcast_2_piece0 stored as
>>>> bytes in memory (estimated size 10.2 KB, free 362.2 MB)
>>>> 17/08/22 14:20:34 INFO BlockManagerInfo: Added broadcast_2_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 10.2 KB, free: 366.3 MB)
>>>> 17/08/22 14:20:34 INFO SparkContext: Created broadcast 2 from broadcast
>>>> at LogisticRegression.scala:600
>>>> 17/08/22 14:20:34 INFO MemoryStore: Block broadcast_3 stored as values
>>>> in memory (estimated size 2.0 MB, free 360.2 MB)
>>>> 17/08/22 14:20:34 INFO MemoryStore: Block broadcast_3_piece0 stored as
>>>> bytes in memory (estimated size 10.1 KB, free 360.2 MB)
>>>> 17/08/22 14:20:34 INFO BlockManagerInfo: Added broadcast_3_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 10.1 KB, free: 366.3 MB)
>>>> 17/08/22 14:20:34 INFO SparkContext: Created broadcast 3 from broadcast
>>>> at LogisticRegression.scala:1879
>>>> 17/08/22 14:20:34 INFO SparkContext: Starting job: treeAggregate at
>>>> LogisticRegression.scala:1892
>>>> 17/08/22 14:20:34 INFO DAGScheduler: Got job 2 (treeAggregate at
>>>> LogisticRegression.scala:1892) with 2 output partitions
>>>> 17/08/22 14:20:34 INFO DAGScheduler: Final stage: ResultStage 2
>>>> (treeAggregate at LogisticRegression.scala:1892)
>>>> 17/08/22 14:20:34 INFO DAGScheduler: Parents of final stage: List()
>>>> 17/08/22 14:20:34 INFO DAGScheduler: Missing parents: List()
>>>> 17/08/22 14:20:34 INFO DAGScheduler: Submitting ResultStage 2
>>>> (MapPartitionsRDD[21] at treeAggregate at LogisticRegression.scala:1892),
>>>> which has no missing parents
>>>> 17/08/22 14:20:34 INFO MemoryStore: Block broadcast_4 stored as values
>>>> in memory (estimated size 2.0 MB, free 358.2 MB)
>>>> 17/08/22 14:20:34 INFO MemoryStore: Block broadcast_4_piece0 stored as
>>>> bytes in memory (estimated size 24.6 KB, free 358.2 MB)
>>>> 17/08/22 14:20:34 INFO BlockManagerInfo: Added broadcast_4_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 24.6 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:34 INFO SparkContext: Created broadcast 4 from broadcast
>>>> at DAGScheduler.scala:1006
>>>> 17/08/22 14:20:34 INFO DAGScheduler: Submitting 2 missing tasks from
>>>> ResultStage 2 (MapPartitionsRDD[21] at treeAggregate at
>>>> LogisticRegression.scala:1892) (first 15 tasks are for partitions Vector(0,
>>>> 1))
>>>> 17/08/22 14:20:34 INFO TaskSchedulerImpl: Adding task set 2.0 with 2
>>>> tasks
>>>> 17/08/22 14:20:34 INFO TaskSetManager: Starting task 0.0 in stage 2.0
>>>> (TID 4, localhost, executor driver, partition 0, PROCESS_LOCAL, 4883 bytes)
>>>> 17/08/22 14:20:34 INFO TaskSetManager: Starting task 1.0 in stage 2.0
>>>> (TID 5, localhost, executor driver, partition 1, PROCESS_LOCAL, 4892 bytes)
>>>> 17/08/22 14:20:34 INFO Executor: Running task 0.0 in stage 2.0 (TID 4)
>>>> 17/08/22 14:20:34 INFO Executor: Running task 1.0 in stage 2.0 (TID 5)
>>>> 17/08/22 14:20:34 INFO BlockManager: Found block rdd_19_0 locally
>>>> 17/08/22 14:20:34 INFO BlockManager: Found block rdd_19_1 locally
>>>> 17/08/22 14:20:34 INFO MemoryStore: Block taskresult_5 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 356.2 MB)
>>>> 17/08/22 14:20:34 INFO BlockManagerInfo: Added taskresult_5 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:34 INFO MemoryStore: Block taskresult_4 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 354.1 MB)
>>>> 17/08/22 14:20:34 INFO Executor: Finished task 1.0 in stage 2.0 (TID
>>>> 5). 2110568 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:34 INFO BlockManagerInfo: Added taskresult_4 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 362.2 MB)
>>>> 17/08/22 14:20:34 INFO Executor: Finished task 0.0 in stage 2.0 (TID
>>>> 4). 2110568 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:34 INFO BlockManagerInfo: Removed taskresult_5 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:34 INFO TaskSetManager: Finished task 1.0 in stage 2.0
>>>> (TID 5) in 124 ms on localhost (executor driver) (1/2)
>>>> 17/08/22 14:20:34 INFO TaskSetManager: Finished task 0.0 in stage 2.0
>>>> (TID 4) in 147 ms on localhost (executor driver) (2/2)
>>>> 17/08/22 14:20:34 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose
>>>> tasks have all completed, from pool
>>>> 17/08/22 14:20:34 INFO BlockManagerInfo: Removed taskresult_4 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 366.2 MB)
>>>> 17/08/22 14:20:34 INFO DAGScheduler: ResultStage 2 (treeAggregate at
>>>> LogisticRegression.scala:1892) finished in 0.151 s
>>>> 17/08/22 14:20:34 INFO DAGScheduler: Job 2 finished: treeAggregate at
>>>> LogisticRegression.scala:1892, took 0.210300 s
>>>> 17/08/22 14:20:34 WARN BLAS: Failed to load implementation from:
>>>> com.github.fommil.netlib.NativeSystemBLAS
>>>> 17/08/22 14:20:34 WARN BLAS: Failed to load implementation from:
>>>> com.github.fommil.netlib.NativeRefBLAS
>>>> 17/08/22 14:20:34 INFO TorrentBroadcast: Destroying Broadcast(3) (from
>>>> destroy at LogisticRegression.scala:1933)
>>>> 17/08/22 14:20:34 INFO BlockManagerInfo: Removed broadcast_3_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 10.1 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:34 INFO MemoryStore: Block broadcast_5 stored as values
>>>> in memory (estimated size 2.0 MB, free 358.2 MB)
>>>> 17/08/22 14:20:34 INFO MemoryStore: Block broadcast_5_piece0 stored as
>>>> bytes in memory (estimated size 10.2 KB, free 358.2 MB)
>>>> 17/08/22 14:20:34 INFO BlockManagerInfo: Added broadcast_5_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 10.2 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:34 INFO SparkContext: Created broadcast 5 from broadcast
>>>> at LogisticRegression.scala:1879
>>>> 17/08/22 14:20:34 INFO SparkContext: Starting job: treeAggregate at
>>>> LogisticRegression.scala:1892
>>>> 17/08/22 14:20:34 INFO DAGScheduler: Got job 3 (treeAggregate at
>>>> LogisticRegression.scala:1892) with 2 output partitions
>>>> 17/08/22 14:20:34 INFO DAGScheduler: Final stage: ResultStage 3
>>>> (treeAggregate at LogisticRegression.scala:1892)
>>>> 17/08/22 14:20:34 INFO DAGScheduler: Parents of final stage: List()
>>>> 17/08/22 14:20:34 INFO DAGScheduler: Missing parents: List()
>>>> 17/08/22 14:20:34 INFO DAGScheduler: Submitting ResultStage 3
>>>> (MapPartitionsRDD[22] at treeAggregate at LogisticRegression.scala:1892),
>>>> which has no missing parents
>>>> 17/08/22 14:20:34 INFO MemoryStore: Block broadcast_6 stored as values
>>>> in memory (estimated size 2.0 MB, free 356.1 MB)
>>>> 17/08/22 14:20:34 INFO MemoryStore: Block broadcast_6_piece0 stored as
>>>> bytes in memory (estimated size 24.6 KB, free 356.1 MB)
>>>> 17/08/22 14:20:35 INFO BlockManagerInfo: Added broadcast_6_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 24.6 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:35 INFO SparkContext: Created broadcast 6 from broadcast
>>>> at DAGScheduler.scala:1006
>>>> 17/08/22 14:20:35 INFO DAGScheduler: Submitting 2 missing tasks from
>>>> ResultStage 3 (MapPartitionsRDD[22] at treeAggregate at
>>>> LogisticRegression.scala:1892) (first 15 tasks are for partitions Vector(0,
>>>> 1))
>>>> 17/08/22 14:20:35 INFO TaskSchedulerImpl: Adding task set 3.0 with 2
>>>> tasks
>>>> 17/08/22 14:20:35 INFO TaskSetManager: Starting task 0.0 in stage 3.0
>>>> (TID 6, localhost, executor driver, partition 0, PROCESS_LOCAL, 4883 bytes)
>>>> 17/08/22 14:20:35 INFO TaskSetManager: Starting task 1.0 in stage 3.0
>>>> (TID 7, localhost, executor driver, partition 1, PROCESS_LOCAL, 4892 bytes)
>>>> 17/08/22 14:20:35 INFO Executor: Running task 0.0 in stage 3.0 (TID 6)
>>>> 17/08/22 14:20:35 INFO Executor: Running task 1.0 in stage 3.0 (TID 7)
>>>> 17/08/22 14:20:35 INFO BlockManager: Found block rdd_19_0 locally
>>>> 17/08/22 14:20:35 INFO BlockManager: Found block rdd_19_1 locally
>>>> 17/08/22 14:20:35 INFO MemoryStore: Block taskresult_6 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 354.1 MB)
>>>> 17/08/22 14:20:35 INFO BlockManagerInfo: Added taskresult_6 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:35 INFO MemoryStore: Block taskresult_7 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 352.1 MB)
>>>> 17/08/22 14:20:35 INFO BlockManagerInfo: Added taskresult_7 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 362.2 MB)
>>>> 17/08/22 14:20:35 INFO Executor: Finished task 0.0 in stage 3.0 (TID
>>>> 6). 2110568 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:35 INFO Executor: Finished task 1.0 in stage 3.0 (TID
>>>> 7). 2110568 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:35 INFO BlockManagerInfo: Removed taskresult_7 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:35 INFO TaskSetManager: Finished task 1.0 in stage 3.0
>>>> (TID 7) in 177 ms on localhost (executor driver) (1/2)
>>>> 17/08/22 14:20:35 INFO BlockManagerInfo: Removed broadcast_4_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 24.6 KB, free: 364.2 MB)
>>>> 17/08/22 14:20:35 INFO TaskSetManager: Finished task 0.0 in stage 3.0
>>>> (TID 6) in 223 ms on localhost (executor driver) (2/2)
>>>> 17/08/22 14:20:35 INFO TaskSchedulerImpl: Removed TaskSet 3.0, whose
>>>> tasks have all completed, from pool
>>>> 17/08/22 14:20:35 INFO DAGScheduler: ResultStage 3 (treeAggregate at
>>>> LogisticRegression.scala:1892) finished in 0.225 s
>>>> 17/08/22 14:20:35 INFO DAGScheduler: Job 3 finished: treeAggregate at
>>>> LogisticRegression.scala:1892, took 0.255439 s
>>>> 17/08/22 14:20:35 INFO BlockManagerInfo: Removed taskresult_6 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 366.2 MB)
>>>> 17/08/22 14:20:35 INFO TorrentBroadcast: Destroying Broadcast(5) (from
>>>> destroy at LogisticRegression.scala:1933)
>>>> 17/08/22 14:20:35 INFO BlockManagerInfo: Removed broadcast_5_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 10.2 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:35 INFO LBFGS: Step Size: 1.000
>>>> 17/08/22 14:20:35 INFO LBFGS: Val and Grad Norm: 0.317022 (rel: 0.543)
>>>> 0.358320
>>>> 17/08/22 14:20:35 INFO MemoryStore: Block broadcast_7 stored as values
>>>> in memory (estimated size 2.0 MB, free 358.2 MB)
>>>> 17/08/22 14:20:35 INFO MemoryStore: Block broadcast_7_piece0 stored as
>>>> bytes in memory (estimated size 10.3 KB, free 358.2 MB)
>>>> 17/08/22 14:20:35 INFO BlockManagerInfo: Added broadcast_7_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 10.3 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:35 INFO SparkContext: Created broadcast 7 from broadcast
>>>> at LogisticRegression.scala:1879
>>>> 17/08/22 14:20:35 INFO SparkContext: Starting job: treeAggregate at
>>>> LogisticRegression.scala:1892
>>>> 17/08/22 14:20:35 INFO DAGScheduler: Got job 4 (treeAggregate at
>>>> LogisticRegression.scala:1892) with 2 output partitions
>>>> 17/08/22 14:20:35 INFO DAGScheduler: Final stage: ResultStage 4
>>>> (treeAggregate at LogisticRegression.scala:1892)
>>>> 17/08/22 14:20:35 INFO DAGScheduler: Parents of final stage: List()
>>>> 17/08/22 14:20:35 INFO DAGScheduler: Missing parents: List()
>>>> 17/08/22 14:20:35 INFO DAGScheduler: Submitting ResultStage 4
>>>> (MapPartitionsRDD[23] at treeAggregate at LogisticRegression.scala:1892),
>>>> which has no missing parents
>>>> 17/08/22 14:20:35 INFO MemoryStore: Block broadcast_8 stored as values
>>>> in memory (estimated size 2.0 MB, free 356.1 MB)
>>>> 17/08/22 14:20:35 INFO MemoryStore: Block broadcast_8_piece0 stored as
>>>> bytes in memory (estimated size 24.6 KB, free 356.1 MB)
>>>> 17/08/22 14:20:35 INFO BlockManagerInfo: Added broadcast_8_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 24.6 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:35 INFO SparkContext: Created broadcast 8 from broadcast
>>>> at DAGScheduler.scala:1006
>>>> 17/08/22 14:20:35 INFO DAGScheduler: Submitting 2 missing tasks from
>>>> ResultStage 4 (MapPartitionsRDD[23] at treeAggregate at
>>>> LogisticRegression.scala:1892) (first 15 tasks are for partitions Vector(0,
>>>> 1))
>>>> 17/08/22 14:20:35 INFO TaskSchedulerImpl: Adding task set 4.0 with 2
>>>> tasks
>>>> 17/08/22 14:20:35 INFO TaskSetManager: Starting task 0.0 in stage 4.0
>>>> (TID 8, localhost, executor driver, partition 0, PROCESS_LOCAL, 4883 bytes)
>>>> 17/08/22 14:20:35 INFO TaskSetManager: Starting task 1.0 in stage 4.0
>>>> (TID 9, localhost, executor driver, partition 1, PROCESS_LOCAL, 4892 bytes)
>>>> 17/08/22 14:20:35 INFO Executor: Running task 0.0 in stage 4.0 (TID 8)
>>>> 17/08/22 14:20:35 INFO Executor: Running task 1.0 in stage 4.0 (TID 9)
>>>> 17/08/22 14:20:35 INFO BlockManager: Found block rdd_19_0 locally
>>>> 17/08/22 14:20:35 INFO BlockManager: Found block rdd_19_1 locally
>>>> 17/08/22 14:20:35 INFO MemoryStore: Block taskresult_9 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 354.1 MB)
>>>> 17/08/22 14:20:35 INFO MemoryStore: Block taskresult_8 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 352.1 MB)
>>>> 17/08/22 14:20:35 INFO BlockManagerInfo: Added taskresult_9 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:35 INFO Executor: Finished task 1.0 in stage 4.0 (TID
>>>> 9). 2110568 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:35 INFO BlockManagerInfo: Removed taskresult_9 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 366.2 MB)
>>>> 17/08/22 14:20:35 INFO TaskSetManager: Finished task 1.0 in stage 4.0
>>>> (TID 9) in 103 ms on localhost (executor driver) (1/2)
>>>> 17/08/22 14:20:35 INFO BlockManagerInfo: Added taskresult_8 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:35 INFO Executor: Finished task 0.0 in stage 4.0 (TID
>>>> 8). 2110568 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:35 INFO BlockManagerInfo: Removed taskresult_8 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 366.2 MB)
>>>> 17/08/22 14:20:35 INFO TaskSetManager: Finished task 0.0 in stage 4.0
>>>> (TID 8) in 136 ms on localhost (executor driver) (2/2)
>>>> 17/08/22 14:20:35 INFO TaskSchedulerImpl: Removed TaskSet 4.0, whose
>>>> tasks have all completed, from pool
>>>> 17/08/22 14:20:35 INFO DAGScheduler: ResultStage 4 (treeAggregate at
>>>> LogisticRegression.scala:1892) finished in 0.140 s
>>>> 17/08/22 14:20:35 INFO DAGScheduler: Job 4 finished: treeAggregate at
>>>> LogisticRegression.scala:1892, took 0.182699 s
>>>> 17/08/22 14:20:35 INFO TorrentBroadcast: Destroying Broadcast(7) (from
>>>> destroy at LogisticRegression.scala:1933)
>>>> 17/08/22 14:20:35 INFO BlockManagerInfo: Removed broadcast_7_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 10.3 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:35 INFO LBFGS: Step Size: 1.000
>>>> 17/08/22 14:20:35 INFO LBFGS: Val and Grad Norm: 0.145089 (rel: 0.542)
>>>> 0.164387
>>>> 17/08/22 14:20:35 INFO MemoryStore: Block broadcast_9 stored as values
>>>> in memory (estimated size 2.0 MB, free 356.1 MB)
>>>> 17/08/22 14:20:35 INFO MemoryStore: Block broadcast_9_piece0 stored as
>>>> bytes in memory (estimated size 10.3 KB, free 356.1 MB)
>>>> 17/08/22 14:20:35 INFO BlockManagerInfo: Added broadcast_9_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 10.3 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:35 INFO SparkContext: Created broadcast 9 from broadcast
>>>> at LogisticRegression.scala:1879
>>>> 17/08/22 14:20:35 INFO SparkContext: Starting job: treeAggregate at
>>>> LogisticRegression.scala:1892
>>>> 17/08/22 14:20:35 INFO DAGScheduler: Got job 5 (treeAggregate at
>>>> LogisticRegression.scala:1892) with 2 output partitions
>>>> 17/08/22 14:20:35 INFO DAGScheduler: Final stage: ResultStage 5
>>>> (treeAggregate at LogisticRegression.scala:1892)
>>>> 17/08/22 14:20:35 INFO DAGScheduler: Parents of final stage: List()
>>>> 17/08/22 14:20:35 INFO DAGScheduler: Missing parents: List()
>>>> 17/08/22 14:20:35 INFO DAGScheduler: Submitting ResultStage 5
>>>> (MapPartitionsRDD[24] at treeAggregate at LogisticRegression.scala:1892),
>>>> which has no missing parents
>>>> 17/08/22 14:20:35 INFO MemoryStore: Block broadcast_10 stored as values
>>>> in memory (estimated size 2.0 MB, free 354.1 MB)
>>>> 17/08/22 14:20:35 INFO MemoryStore: Block broadcast_10_piece0 stored as
>>>> bytes in memory (estimated size 24.6 KB, free 354.1 MB)
>>>> 17/08/22 14:20:35 INFO BlockManagerInfo: Added broadcast_10_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 24.6 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:35 INFO SparkContext: Created broadcast 10 from
>>>> broadcast at DAGScheduler.scala:1006
>>>> 17/08/22 14:20:35 INFO DAGScheduler: Submitting 2 missing tasks from
>>>> ResultStage 5 (MapPartitionsRDD[24] at treeAggregate at
>>>> LogisticRegression.scala:1892) (first 15 tasks are for partitions Vector(0,
>>>> 1))
>>>> 17/08/22 14:20:35 INFO TaskSchedulerImpl: Adding task set 5.0 with 2
>>>> tasks
>>>> 17/08/22 14:20:35 INFO TaskSetManager: Starting task 0.0 in stage 5.0
>>>> (TID 10, localhost, executor driver, partition 0, PROCESS_LOCAL, 4883 bytes)
>>>> 17/08/22 14:20:35 INFO TaskSetManager: Starting task 1.0 in stage 5.0
>>>> (TID 11, localhost, executor driver, partition 1, PROCESS_LOCAL, 4892 bytes)
>>>> 17/08/22 14:20:35 INFO Executor: Running task 0.0 in stage 5.0 (TID 10)
>>>> 17/08/22 14:20:35 INFO BlockManager: Found block rdd_19_0 locally
>>>> 17/08/22 14:20:35 INFO Executor: Running task 1.0 in stage 5.0 (TID 11)
>>>> 17/08/22 14:20:35 INFO BlockManager: Found block rdd_19_1 locally
>>>> 17/08/22 14:20:35 INFO MemoryStore: Block taskresult_10 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 352.0 MB)
>>>> 17/08/22 14:20:35 INFO MemoryStore: Block taskresult_11 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 350.0 MB)
>>>> 17/08/22 14:20:35 INFO BlockManagerInfo: Added taskresult_10 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:35 INFO BlockManagerInfo: Added taskresult_11 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 362.2 MB)
>>>> 17/08/22 14:20:35 INFO Executor: Finished task 1.0 in stage 5.0 (TID
>>>> 11). 2110611 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:35 INFO BlockManagerInfo: Removed broadcast_6_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 24.6 KB, free: 362.2 MB)
>>>> 17/08/22 14:20:35 INFO Executor: Finished task 0.0 in stage 5.0 (TID
>>>> 10). 2110611 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:36 INFO TaskSetManager: Finished task 1.0 in stage 5.0
>>>> (TID 11) in 197 ms on localhost (executor driver) (1/2)
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Removed taskresult_11 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:36 INFO TaskSetManager: Finished task 0.0 in stage 5.0
>>>> (TID 10) in 217 ms on localhost (executor driver) (2/2)
>>>> 17/08/22 14:20:36 INFO TaskSchedulerImpl: Removed TaskSet 5.0, whose
>>>> tasks have all completed, from pool
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Removed taskresult_10 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 366.2 MB)
>>>> 17/08/22 14:20:36 INFO DAGScheduler: ResultStage 5 (treeAggregate at
>>>> LogisticRegression.scala:1892) finished in 0.236 s
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Removed broadcast_8_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 24.6 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:36 INFO DAGScheduler: Job 5 finished: treeAggregate at
>>>> LogisticRegression.scala:1892, took 0.282317 s
>>>> 17/08/22 14:20:36 INFO TorrentBroadcast: Destroying Broadcast(9) (from
>>>> destroy at LogisticRegression.scala:1933)
>>>> 17/08/22 14:20:36 INFO LBFGS: Step Size: 1.000
>>>> 17/08/22 14:20:36 INFO LBFGS: Val and Grad Norm: 0.0635321 (rel: 0.562)
>>>> 0.0740437
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Removed broadcast_9_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 10.3 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:36 INFO MemoryStore: Block broadcast_11 stored as values
>>>> in memory (estimated size 2.0 MB, free 358.2 MB)
>>>> 17/08/22 14:20:36 INFO MemoryStore: Block broadcast_11_piece0 stored as
>>>> bytes in memory (estimated size 10.3 KB, free 358.2 MB)
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Added broadcast_11_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 10.3 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:36 INFO SparkContext: Created broadcast 11 from
>>>> broadcast at LogisticRegression.scala:1879
>>>> 17/08/22 14:20:36 INFO SparkContext: Starting job: treeAggregate at
>>>> LogisticRegression.scala:1892
>>>> 17/08/22 14:20:36 INFO DAGScheduler: Got job 6 (treeAggregate at
>>>> LogisticRegression.scala:1892) with 2 output partitions
>>>> 17/08/22 14:20:36 INFO DAGScheduler: Final stage: ResultStage 6
>>>> (treeAggregate at LogisticRegression.scala:1892)
>>>> 17/08/22 14:20:36 INFO DAGScheduler: Parents of final stage: List()
>>>> 17/08/22 14:20:36 INFO DAGScheduler: Missing parents: List()
>>>> 17/08/22 14:20:36 INFO DAGScheduler: Submitting ResultStage 6
>>>> (MapPartitionsRDD[25] at treeAggregate at LogisticRegression.scala:1892),
>>>> which has no missing parents
>>>> 17/08/22 14:20:36 INFO MemoryStore: Block broadcast_12 stored as values
>>>> in memory (estimated size 2.0 MB, free 356.1 MB)
>>>> 17/08/22 14:20:36 INFO MemoryStore: Block broadcast_12_piece0 stored as
>>>> bytes in memory (estimated size 24.6 KB, free 356.1 MB)
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Added broadcast_12_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 24.6 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:36 INFO SparkContext: Created broadcast 12 from
>>>> broadcast at DAGScheduler.scala:1006
>>>> 17/08/22 14:20:36 INFO DAGScheduler: Submitting 2 missing tasks from
>>>> ResultStage 6 (MapPartitionsRDD[25] at treeAggregate at
>>>> LogisticRegression.scala:1892) (first 15 tasks are for partitions Vector(0,
>>>> 1))
>>>> 17/08/22 14:20:36 INFO TaskSchedulerImpl: Adding task set 6.0 with 2
>>>> tasks
>>>> 17/08/22 14:20:36 INFO TaskSetManager: Starting task 0.0 in stage 6.0
>>>> (TID 12, localhost, executor driver, partition 0, PROCESS_LOCAL, 4883 bytes)
>>>> 17/08/22 14:20:36 INFO TaskSetManager: Starting task 1.0 in stage 6.0
>>>> (TID 13, localhost, executor driver, partition 1, PROCESS_LOCAL, 4892 bytes)
>>>> 17/08/22 14:20:36 INFO Executor: Running task 0.0 in stage 6.0 (TID 12)
>>>> 17/08/22 14:20:36 INFO Executor: Running task 1.0 in stage 6.0 (TID 13)
>>>> 17/08/22 14:20:36 INFO BlockManager: Found block rdd_19_0 locally
>>>> 17/08/22 14:20:36 INFO BlockManager: Found block rdd_19_1 locally
>>>> 17/08/22 14:20:36 INFO MemoryStore: Block taskresult_13 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 354.1 MB)
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Added taskresult_13 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:36 INFO MemoryStore: Block taskresult_12 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 352.1 MB)
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Added taskresult_12 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 362.2 MB)
>>>> 17/08/22 14:20:36 INFO Executor: Finished task 0.0 in stage 6.0 (TID
>>>> 12). 2110568 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:36 INFO TaskSetManager: Finished task 0.0 in stage 6.0
>>>> (TID 12) in 79 ms on localhost (executor driver) (1/2)
>>>> 17/08/22 14:20:36 INFO Executor: Finished task 1.0 in stage 6.0 (TID
>>>> 13). 2110568 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Removed taskresult_12 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:36 INFO TaskSetManager: Finished task 1.0 in stage 6.0
>>>> (TID 13) in 102 ms on localhost (executor driver) (2/2)
>>>> 17/08/22 14:20:36 INFO TaskSchedulerImpl: Removed TaskSet 6.0, whose
>>>> tasks have all completed, from pool
>>>> 17/08/22 14:20:36 INFO DAGScheduler: ResultStage 6 (treeAggregate at
>>>> LogisticRegression.scala:1892) finished in 0.106 s
>>>> 17/08/22 14:20:36 INFO DAGScheduler: Job 6 finished: treeAggregate at
>>>> LogisticRegression.scala:1892, took 0.143539 s
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Removed taskresult_13 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 366.2 MB)
>>>> 17/08/22 14:20:36 INFO TorrentBroadcast: Destroying Broadcast(11) (from
>>>> destroy at LogisticRegression.scala:1933)
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Removed broadcast_11_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 10.3 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:36 INFO LBFGS: Step Size: 1.000
>>>> 17/08/22 14:20:36 INFO LBFGS: Val and Grad Norm: 0.0334526 (rel: 0.473)
>>>> 0.0350590
>>>> 17/08/22 14:20:36 INFO MemoryStore: Block broadcast_13 stored as values
>>>> in memory (estimated size 2.0 MB, free 356.1 MB)
>>>> 17/08/22 14:20:36 INFO MemoryStore: Block broadcast_13_piece0 stored as
>>>> bytes in memory (estimated size 10.3 KB, free 356.1 MB)
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Added broadcast_13_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 10.3 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:36 INFO SparkContext: Created broadcast 13 from
>>>> broadcast at LogisticRegression.scala:1879
>>>> 17/08/22 14:20:36 INFO SparkContext: Starting job: treeAggregate at
>>>> LogisticRegression.scala:1892
>>>> 17/08/22 14:20:36 INFO DAGScheduler: Got job 7 (treeAggregate at
>>>> LogisticRegression.scala:1892) with 2 output partitions
>>>> 17/08/22 14:20:36 INFO DAGScheduler: Final stage: ResultStage 7
>>>> (treeAggregate at LogisticRegression.scala:1892)
>>>> 17/08/22 14:20:36 INFO DAGScheduler: Parents of final stage: List()
>>>> 17/08/22 14:20:36 INFO DAGScheduler: Missing parents: List()
>>>> 17/08/22 14:20:36 INFO DAGScheduler: Submitting ResultStage 7
>>>> (MapPartitionsRDD[26] at treeAggregate at LogisticRegression.scala:1892),
>>>> which has no missing parents
>>>> 17/08/22 14:20:36 INFO MemoryStore: Block broadcast_14 stored as values
>>>> in memory (estimated size 2.0 MB, free 354.1 MB)
>>>> 17/08/22 14:20:36 INFO MemoryStore: Block broadcast_14_piece0 stored as
>>>> bytes in memory (estimated size 24.6 KB, free 354.1 MB)
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Added broadcast_14_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 24.6 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:36 INFO SparkContext: Created broadcast 14 from
>>>> broadcast at DAGScheduler.scala:1006
>>>> 17/08/22 14:20:36 INFO DAGScheduler: Submitting 2 missing tasks from
>>>> ResultStage 7 (MapPartitionsRDD[26] at treeAggregate at
>>>> LogisticRegression.scala:1892) (first 15 tasks are for partitions Vector(0,
>>>> 1))
>>>> 17/08/22 14:20:36 INFO TaskSchedulerImpl: Adding task set 7.0 with 2
>>>> tasks
>>>> 17/08/22 14:20:36 INFO TaskSetManager: Starting task 0.0 in stage 7.0
>>>> (TID 14, localhost, executor driver, partition 0, PROCESS_LOCAL, 4883 bytes)
>>>> 17/08/22 14:20:36 INFO TaskSetManager: Starting task 1.0 in stage 7.0
>>>> (TID 15, localhost, executor driver, partition 1, PROCESS_LOCAL, 4892 bytes)
>>>> 17/08/22 14:20:36 INFO Executor: Running task 0.0 in stage 7.0 (TID 14)
>>>> 17/08/22 14:20:36 INFO Executor: Running task 1.0 in stage 7.0 (TID 15)
>>>> 17/08/22 14:20:36 INFO BlockManager: Found block rdd_19_0 locally
>>>> 17/08/22 14:20:36 INFO BlockManager: Found block rdd_19_1 locally
>>>> 17/08/22 14:20:36 INFO MemoryStore: Block taskresult_15 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 352.0 MB)
>>>> 17/08/22 14:20:36 INFO MemoryStore: Block taskresult_14 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 350.0 MB)
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Added taskresult_15 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:36 INFO Executor: Finished task 1.0 in stage 7.0 (TID
>>>> 15). 2110568 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Added taskresult_14 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 362.2 MB)
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Removed broadcast_10_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 24.6 KB, free: 362.2 MB)
>>>> 17/08/22 14:20:36 INFO Executor: Finished task 0.0 in stage 7.0 (TID
>>>> 14). 2110611 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Removed taskresult_15 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:36 INFO TaskSetManager: Finished task 1.0 in stage 7.0
>>>> (TID 15) in 387 ms on localhost (executor driver) (1/2)
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Removed broadcast_1_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 24.3 KB, free: 364.2 MB)
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Removed taskresult_14 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 366.2 MB)
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Removed broadcast_12_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 24.6 KB, free: 366.3 MB)
>>>> 17/08/22 14:20:36 INFO TaskSetManager: Finished task 0.0 in stage 7.0
>>>> (TID 14) in 416 ms on localhost (executor driver) (2/2)
>>>> 17/08/22 14:20:36 INFO TaskSchedulerImpl: Removed TaskSet 7.0, whose
>>>> tasks have all completed, from pool
>>>> 17/08/22 14:20:36 INFO DAGScheduler: ResultStage 7 (treeAggregate at
>>>> LogisticRegression.scala:1892) finished in 0.418 s
>>>> 17/08/22 14:20:36 INFO DAGScheduler: Job 7 finished: treeAggregate at
>>>> LogisticRegression.scala:1892, took 0.459555 s
>>>> 17/08/22 14:20:36 INFO TorrentBroadcast: Destroying Broadcast(13) (from
>>>> destroy at LogisticRegression.scala:1933)
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Removed broadcast_13_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 10.3 KB, free: 366.3 MB)
>>>> 17/08/22 14:20:36 INFO LBFGS: Step Size: 1.000
>>>> 17/08/22 14:20:36 INFO LBFGS: Val and Grad Norm: 0.0200470 (rel: 0.401)
>>>> 0.0164153
>>>> 17/08/22 14:20:36 INFO MemoryStore: Block broadcast_15 stored as values
>>>> in memory (estimated size 2.0 MB, free 360.2 MB)
>>>> 17/08/22 14:20:36 INFO MemoryStore: Block broadcast_15_piece0 stored as
>>>> bytes in memory (estimated size 10.3 KB, free 360.2 MB)
>>>> 17/08/22 14:20:36 INFO BlockManagerInfo: Added broadcast_15_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 10.3 KB, free: 366.3 MB)
>>>> 17/08/22 14:20:36 INFO SparkContext: Created broadcast 15 from
>>>> broadcast at LogisticRegression.scala:1879
>>>> 17/08/22 14:20:37 INFO SparkContext: Starting job: treeAggregate at
>>>> LogisticRegression.scala:1892
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Got job 8 (treeAggregate at
>>>> LogisticRegression.scala:1892) with 2 output partitions
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Final stage: ResultStage 8
>>>> (treeAggregate at LogisticRegression.scala:1892)
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Parents of final stage: List()
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Missing parents: List()
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Submitting ResultStage 8
>>>> (MapPartitionsRDD[27] at treeAggregate at LogisticRegression.scala:1892),
>>>> which has no missing parents
>>>> 17/08/22 14:20:37 INFO MemoryStore: Block broadcast_16 stored as values
>>>> in memory (estimated size 2.0 MB, free 358.2 MB)
>>>> 17/08/22 14:20:37 INFO MemoryStore: Block broadcast_16_piece0 stored as
>>>> bytes in memory (estimated size 24.6 KB, free 358.2 MB)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Added broadcast_16_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 24.6 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:37 INFO SparkContext: Created broadcast 16 from
>>>> broadcast at DAGScheduler.scala:1006
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Submitting 2 missing tasks from
>>>> ResultStage 8 (MapPartitionsRDD[27] at treeAggregate at
>>>> LogisticRegression.scala:1892) (first 15 tasks are for partitions Vector(0,
>>>> 1))
>>>> 17/08/22 14:20:37 INFO TaskSchedulerImpl: Adding task set 8.0 with 2
>>>> tasks
>>>> 17/08/22 14:20:37 INFO TaskSetManager: Starting task 0.0 in stage 8.0
>>>> (TID 16, localhost, executor driver, partition 0, PROCESS_LOCAL, 4883 bytes)
>>>> 17/08/22 14:20:37 INFO TaskSetManager: Starting task 1.0 in stage 8.0
>>>> (TID 17, localhost, executor driver, partition 1, PROCESS_LOCAL, 4892 bytes)
>>>> 17/08/22 14:20:37 INFO Executor: Running task 0.0 in stage 8.0 (TID 16)
>>>> 17/08/22 14:20:37 INFO Executor: Running task 1.0 in stage 8.0 (TID 17)
>>>> 17/08/22 14:20:37 INFO BlockManager: Found block rdd_19_0 locally
>>>> 17/08/22 14:20:37 INFO BlockManager: Found block rdd_19_1 locally
>>>> 17/08/22 14:20:37 INFO MemoryStore: Block taskresult_16 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 356.2 MB)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Added taskresult_16 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:37 INFO Executor: Finished task 0.0 in stage 8.0 (TID
>>>> 16). 2110568 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:37 INFO MemoryStore: Block taskresult_17 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 354.1 MB)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Added taskresult_17 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 362.2 MB)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Removed taskresult_16 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:37 INFO Executor: Finished task 1.0 in stage 8.0 (TID
>>>> 17). 2110568 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:37 INFO TaskSetManager: Finished task 0.0 in stage 8.0
>>>> (TID 16) in 67 ms on localhost (executor driver) (1/2)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Removed taskresult_17 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 366.2 MB)
>>>> 17/08/22 14:20:37 INFO TaskSetManager: Finished task 1.0 in stage 8.0
>>>> (TID 17) in 110 ms on localhost (executor driver) (2/2)
>>>> 17/08/22 14:20:37 INFO TaskSchedulerImpl: Removed TaskSet 8.0, whose
>>>> tasks have all completed, from pool
>>>> 17/08/22 14:20:37 INFO DAGScheduler: ResultStage 8 (treeAggregate at
>>>> LogisticRegression.scala:1892) finished in 0.118 s
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Job 8 finished: treeAggregate at
>>>> LogisticRegression.scala:1892, took 0.149447 s
>>>> 17/08/22 14:20:37 INFO TorrentBroadcast: Destroying Broadcast(15) (from
>>>> destroy at LogisticRegression.scala:1933)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Removed broadcast_15_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 10.3 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:37 INFO LBFGS: Step Size: 1.000
>>>> 17/08/22 14:20:37 INFO LBFGS: Val and Grad Norm: 0.0146788 (rel: 0.268)
>>>> 0.00726759
>>>> 17/08/22 14:20:37 INFO MemoryStore: Block broadcast_17 stored as values
>>>> in memory (estimated size 2.0 MB, free 358.2 MB)
>>>> 17/08/22 14:20:37 INFO MemoryStore: Block broadcast_17_piece0 stored as
>>>> bytes in memory (estimated size 10.3 KB, free 358.2 MB)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Added broadcast_17_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 10.3 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:37 INFO SparkContext: Created broadcast 17 from
>>>> broadcast at LogisticRegression.scala:1879
>>>> 17/08/22 14:20:37 INFO SparkContext: Starting job: treeAggregate at
>>>> LogisticRegression.scala:1892
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Got job 9 (treeAggregate at
>>>> LogisticRegression.scala:1892) with 2 output partitions
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Final stage: ResultStage 9
>>>> (treeAggregate at LogisticRegression.scala:1892)
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Parents of final stage: List()
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Missing parents: List()
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Submitting ResultStage 9
>>>> (MapPartitionsRDD[28] at treeAggregate at LogisticRegression.scala:1892),
>>>> which has no missing parents
>>>> 17/08/22 14:20:37 INFO MemoryStore: Block broadcast_18 stored as values
>>>> in memory (estimated size 2.0 MB, free 356.1 MB)
>>>> 17/08/22 14:20:37 INFO MemoryStore: Block broadcast_18_piece0 stored as
>>>> bytes in memory (estimated size 24.6 KB, free 356.1 MB)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Added broadcast_18_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 24.6 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:37 INFO SparkContext: Created broadcast 18 from
>>>> broadcast at DAGScheduler.scala:1006
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Submitting 2 missing tasks from
>>>> ResultStage 9 (MapPartitionsRDD[28] at treeAggregate at
>>>> LogisticRegression.scala:1892) (first 15 tasks are for partitions Vector(0,
>>>> 1))
>>>> 17/08/22 14:20:37 INFO TaskSchedulerImpl: Adding task set 9.0 with 2
>>>> tasks
>>>> 17/08/22 14:20:37 INFO TaskSetManager: Starting task 0.0 in stage 9.0
>>>> (TID 18, localhost, executor driver, partition 0, PROCESS_LOCAL, 4883 bytes)
>>>> 17/08/22 14:20:37 INFO TaskSetManager: Starting task 1.0 in stage 9.0
>>>> (TID 19, localhost, executor driver, partition 1, PROCESS_LOCAL, 4892 bytes)
>>>> 17/08/22 14:20:37 INFO Executor: Running task 0.0 in stage 9.0 (TID 18)
>>>> 17/08/22 14:20:37 INFO Executor: Running task 1.0 in stage 9.0 (TID 19)
>>>> 17/08/22 14:20:37 INFO BlockManager: Found block rdd_19_1 locally
>>>> 17/08/22 14:20:37 INFO MemoryStore: Block taskresult_19 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 354.1 MB)
>>>> 17/08/22 14:20:37 INFO BlockManager: Found block rdd_19_0 locally
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Added taskresult_19 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Removed broadcast_16_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 24.6 KB, free: 364.2 MB)
>>>> 17/08/22 14:20:37 INFO Executor: Finished task 1.0 in stage 9.0 (TID
>>>> 19). 2110568 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:37 INFO MemoryStore: Block taskresult_18 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 354.1 MB)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Added taskresult_18 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 362.2 MB)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Removed taskresult_19 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:37 INFO TaskSetManager: Finished task 1.0 in stage 9.0
>>>> (TID 19) in 197 ms on localhost (executor driver) (1/2)
>>>> 17/08/22 14:20:37 INFO Executor: Finished task 0.0 in stage 9.0 (TID
>>>> 18). 2110611 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Removed taskresult_18 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 366.2 MB)
>>>> 17/08/22 14:20:37 INFO TaskSetManager: Finished task 0.0 in stage 9.0
>>>> (TID 18) in 233 ms on localhost (executor driver) (2/2)
>>>> 17/08/22 14:20:37 INFO TaskSchedulerImpl: Removed TaskSet 9.0, whose
>>>> tasks have all completed, from pool
>>>> 17/08/22 14:20:37 INFO DAGScheduler: ResultStage 9 (treeAggregate at
>>>> LogisticRegression.scala:1892) finished in 0.236 s
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Job 9 finished: treeAggregate at
>>>> LogisticRegression.scala:1892, took 0.274679 s
>>>> 17/08/22 14:20:37 INFO TorrentBroadcast: Destroying Broadcast(17) (from
>>>> destroy at LogisticRegression.scala:1933)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Removed broadcast_17_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 10.3 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:37 INFO LBFGS: Step Size: 1.000
>>>> 17/08/22 14:20:37 INFO LBFGS: Val and Grad Norm: 0.0127666 (rel: 0.130)
>>>> 0.00298415
>>>> 17/08/22 14:20:37 INFO MemoryStore: Block broadcast_19 stored as values
>>>> in memory (estimated size 2.0 MB, free 358.2 MB)
>>>> 17/08/22 14:20:37 INFO MemoryStore: Block broadcast_19_piece0 stored as
>>>> bytes in memory (estimated size 10.3 KB, free 358.2 MB)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Added broadcast_19_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 10.3 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:37 INFO SparkContext: Created broadcast 19 from
>>>> broadcast at LogisticRegression.scala:1879
>>>> 17/08/22 14:20:37 INFO SparkContext: Starting job: treeAggregate at
>>>> LogisticRegression.scala:1892
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Got job 10 (treeAggregate at
>>>> LogisticRegression.scala:1892) with 2 output partitions
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Final stage: ResultStage 10
>>>> (treeAggregate at LogisticRegression.scala:1892)
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Parents of final stage: List()
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Missing parents: List()
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Submitting ResultStage 10
>>>> (MapPartitionsRDD[29] at treeAggregate at LogisticRegression.scala:1892),
>>>> which has no missing parents
>>>> 17/08/22 14:20:37 INFO MemoryStore: Block broadcast_20 stored as values
>>>> in memory (estimated size 2.0 MB, free 356.1 MB)
>>>> 17/08/22 14:20:37 INFO MemoryStore: Block broadcast_20_piece0 stored as
>>>> bytes in memory (estimated size 24.6 KB, free 356.1 MB)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Added broadcast_20_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 24.6 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:37 INFO SparkContext: Created broadcast 20 from
>>>> broadcast at DAGScheduler.scala:1006
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Submitting 2 missing tasks from
>>>> ResultStage 10 (MapPartitionsRDD[29] at treeAggregate at
>>>> LogisticRegression.scala:1892) (first 15 tasks are for partitions Vector(0,
>>>> 1))
>>>> 17/08/22 14:20:37 INFO TaskSchedulerImpl: Adding task set 10.0 with 2
>>>> tasks
>>>> 17/08/22 14:20:37 INFO TaskSetManager: Starting task 0.0 in stage 10.0
>>>> (TID 20, localhost, executor driver, partition 0, PROCESS_LOCAL, 4883 bytes)
>>>> 17/08/22 14:20:37 INFO TaskSetManager: Starting task 1.0 in stage 10.0
>>>> (TID 21, localhost, executor driver, partition 1, PROCESS_LOCAL, 4892 bytes)
>>>> 17/08/22 14:20:37 INFO Executor: Running task 0.0 in stage 10.0 (TID 20)
>>>> 17/08/22 14:20:37 INFO Executor: Running task 1.0 in stage 10.0 (TID 21)
>>>> 17/08/22 14:20:37 INFO BlockManager: Found block rdd_19_1 locally
>>>> 17/08/22 14:20:37 INFO MemoryStore: Block taskresult_21 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 354.1 MB)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Added taskresult_21 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:37 INFO BlockManager: Found block rdd_19_0 locally
>>>> 17/08/22 14:20:37 INFO MemoryStore: Block taskresult_20 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 352.1 MB)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Added taskresult_20 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 362.2 MB)
>>>> 17/08/22 14:20:37 INFO Executor: Finished task 1.0 in stage 10.0 (TID
>>>> 21). 2110568 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:37 INFO Executor: Finished task 0.0 in stage 10.0 (TID
>>>> 20). 2110568 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Removed taskresult_21 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:37 INFO TaskSetManager: Finished task 1.0 in stage 10.0
>>>> (TID 21) in 95 ms on localhost (executor driver) (1/2)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Removed taskresult_20 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 366.2 MB)
>>>> 17/08/22 14:20:37 INFO TaskSetManager: Finished task 0.0 in stage 10.0
>>>> (TID 20) in 115 ms on localhost (executor driver) (2/2)
>>>> 17/08/22 14:20:37 INFO TaskSchedulerImpl: Removed TaskSet 10.0, whose
>>>> tasks have all completed, from pool
>>>> 17/08/22 14:20:37 INFO DAGScheduler: ResultStage 10 (treeAggregate at
>>>> LogisticRegression.scala:1892) finished in 0.118 s
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Job 10 finished: treeAggregate at
>>>> LogisticRegression.scala:1892, took 0.143837 s
>>>> 17/08/22 14:20:37 INFO TorrentBroadcast: Destroying Broadcast(19) (from
>>>> destroy at LogisticRegression.scala:1933)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Removed broadcast_19_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 10.3 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:37 INFO LBFGS: Step Size: 1.000
>>>> 17/08/22 14:20:37 INFO LBFGS: Val and Grad Norm: 0.0122267 (rel:
>>>> 0.0423) 0.00119244
>>>> 17/08/22 14:20:37 INFO MemoryStore: Block broadcast_21 stored as values
>>>> in memory (estimated size 2.0 MB, free 356.1 MB)
>>>> 17/08/22 14:20:37 INFO MemoryStore: Block broadcast_21_piece0 stored as
>>>> bytes in memory (estimated size 10.3 KB, free 356.1 MB)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Added broadcast_21_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 10.3 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:37 INFO SparkContext: Created broadcast 21 from
>>>> broadcast at LogisticRegression.scala:1879
>>>> 17/08/22 14:20:37 INFO SparkContext: Starting job: treeAggregate at
>>>> LogisticRegression.scala:1892
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Got job 11 (treeAggregate at
>>>> LogisticRegression.scala:1892) with 2 output partitions
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Final stage: ResultStage 11
>>>> (treeAggregate at LogisticRegression.scala:1892)
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Parents of final stage: List()
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Missing parents: List()
>>>> 17/08/22 14:20:37 INFO DAGScheduler: Submitting ResultStage 11
>>>> (MapPartitionsRDD[30] at treeAggregate at LogisticRegression.scala:1892),
>>>> which has no missing parents
>>>> 17/08/22 14:20:37 INFO MemoryStore: Block broadcast_22 stored as values
>>>> in memory (estimated size 2.0 MB, free 354.1 MB)
>>>> 17/08/22 14:20:37 INFO MemoryStore: Block broadcast_22_piece0 stored as
>>>> bytes in memory (estimated size 24.6 KB, free 354.1 MB)
>>>> 17/08/22 14:20:37 INFO BlockManagerInfo: Added broadcast_22_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 24.6 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:38 INFO SparkContext: Created broadcast 22 from
>>>> broadcast at DAGScheduler.scala:1006
>>>> 17/08/22 14:20:38 INFO DAGScheduler: Submitting 2 missing tasks from
>>>> ResultStage 11 (MapPartitionsRDD[30] at treeAggregate at
>>>> LogisticRegression.scala:1892) (first 15 tasks are for partitions Vector(0,
>>>> 1))
>>>> 17/08/22 14:20:38 INFO TaskSchedulerImpl: Adding task set 11.0 with 2
>>>> tasks
>>>> 17/08/22 14:20:38 INFO TaskSetManager: Starting task 0.0 in stage 11.0
>>>> (TID 22, localhost, executor driver, partition 0, PROCESS_LOCAL, 4883 bytes)
>>>> 17/08/22 14:20:38 INFO TaskSetManager: Starting task 1.0 in stage 11.0
>>>> (TID 23, localhost, executor driver, partition 1, PROCESS_LOCAL, 4892 bytes)
>>>> 17/08/22 14:20:38 INFO Executor: Running task 0.0 in stage 11.0 (TID 22)
>>>> 17/08/22 14:20:38 INFO Executor: Running task 1.0 in stage 11.0 (TID 23)
>>>> 17/08/22 14:20:38 INFO BlockManager: Found block rdd_19_1 locally
>>>> 17/08/22 14:20:38 INFO BlockManager: Found block rdd_19_0 locally
>>>> 17/08/22 14:20:38 INFO MemoryStore: Block taskresult_23 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 352.0 MB)
>>>> 17/08/22 14:20:38 INFO BlockManagerInfo: Added taskresult_23 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:38 INFO Executor: Finished task 1.0 in stage 11.0 (TID
>>>> 23). 2110568 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:38 INFO TaskSetManager: Finished task 1.0 in stage 11.0
>>>> (TID 23) in 85 ms on localhost (executor driver) (1/2)
>>>> 17/08/22 14:20:38 INFO BlockManagerInfo: Removed broadcast_18_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 24.6 KB, free: 364.2 MB)
>>>> 17/08/22 14:20:38 INFO BlockManagerInfo: Removed taskresult_23 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 366.2 MB)
>>>> 17/08/22 14:20:38 INFO BlockManagerInfo: Removed broadcast_20_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 24.6 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:38 INFO MemoryStore: Block taskresult_22 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 356.2 MB)
>>>> 17/08/22 14:20:38 INFO BlockManagerInfo: Added taskresult_22 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:38 INFO Executor: Finished task 0.0 in stage 11.0 (TID
>>>> 22). 2110611 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:38 INFO TaskSetManager: Finished task 0.0 in stage 11.0
>>>> (TID 22) in 118 ms on localhost (executor driver) (2/2)
>>>> 17/08/22 14:20:38 INFO TaskSchedulerImpl: Removed TaskSet 11.0, whose
>>>> tasks have all completed, from pool
>>>> 17/08/22 14:20:38 INFO DAGScheduler: ResultStage 11 (treeAggregate at
>>>> LogisticRegression.scala:1892) finished in 0.122 s
>>>> 17/08/22 14:20:38 INFO BlockManagerInfo: Removed taskresult_22 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 366.2 MB)
>>>> 17/08/22 14:20:38 INFO DAGScheduler: Job 11 finished: treeAggregate at
>>>> LogisticRegression.scala:1892, took 0.143423 s
>>>> 17/08/22 14:20:38 INFO TorrentBroadcast: Destroying Broadcast(21) (from
>>>> destroy at LogisticRegression.scala:1933)
>>>> 17/08/22 14:20:38 INFO BlockManagerInfo: Removed broadcast_21_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 10.3 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:38 INFO LBFGS: Step Size: 1.000
>>>> 17/08/22 14:20:38 INFO LBFGS: Val and Grad Norm: 0.0120434 (rel:
>>>> 0.0150) 0.000857606
>>>> 17/08/22 14:20:38 INFO MemoryStore: Block broadcast_23 stored as values
>>>> in memory (estimated size 2.0 MB, free 358.2 MB)
>>>> 17/08/22 14:20:38 INFO MemoryStore: Block broadcast_23_piece0 stored as
>>>> bytes in memory (estimated size 10.3 KB, free 358.2 MB)
>>>> 17/08/22 14:20:38 INFO BlockManagerInfo: Added broadcast_23_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 10.3 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:38 INFO SparkContext: Created broadcast 23 from
>>>> broadcast at LogisticRegression.scala:1879
>>>> 17/08/22 14:20:38 INFO SparkContext: Starting job: treeAggregate at
>>>> LogisticRegression.scala:1892
>>>> 17/08/22 14:20:38 INFO DAGScheduler: Got job 12 (treeAggregate at
>>>> LogisticRegression.scala:1892) with 2 output partitions
>>>> 17/08/22 14:20:38 INFO DAGScheduler: Final stage: ResultStage 12
>>>> (treeAggregate at LogisticRegression.scala:1892)
>>>> 17/08/22 14:20:38 INFO DAGScheduler: Parents of final stage: List()
>>>> 17/08/22 14:20:38 INFO DAGScheduler: Missing parents: List()
>>>> 17/08/22 14:20:38 INFO DAGScheduler: Submitting ResultStage 12
>>>> (MapPartitionsRDD[31] at treeAggregate at LogisticRegression.scala:1892),
>>>> which has no missing parents
>>>> 17/08/22 14:20:38 INFO MemoryStore: Block broadcast_24 stored as values
>>>> in memory (estimated size 2.0 MB, free 356.1 MB)
>>>> 17/08/22 14:20:38 INFO MemoryStore: Block broadcast_24_piece0 stored as
>>>> bytes in memory (estimated size 24.6 KB, free 356.1 MB)
>>>> 17/08/22 14:20:38 INFO BlockManagerInfo: Added broadcast_24_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 24.6 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:38 INFO SparkContext: Created broadcast 24 from
>>>> broadcast at DAGScheduler.scala:1006
>>>> 17/08/22 14:20:38 INFO DAGScheduler: Submitting 2 missing tasks from
>>>> ResultStage 12 (MapPartitionsRDD[31] at treeAggregate at
>>>> LogisticRegression.scala:1892) (first 15 tasks are for partitions Vector(0,
>>>> 1))
>>>> 17/08/22 14:20:38 INFO TaskSchedulerImpl: Adding task set 12.0 with 2
>>>> tasks
>>>> 17/08/22 14:20:38 INFO TaskSetManager: Starting task 0.0 in stage 12.0
>>>> (TID 24, localhost, executor driver, partition 0, PROCESS_LOCAL, 4883 bytes)
>>>> 17/08/22 14:20:38 INFO TaskSetManager: Starting task 1.0 in stage 12.0
>>>> (TID 25, localhost, executor driver, partition 1, PROCESS_LOCAL, 4892 bytes)
>>>> 17/08/22 14:20:38 INFO Executor: Running task 0.0 in stage 12.0 (TID 24)
>>>> 17/08/22 14:20:38 INFO Executor: Running task 1.0 in stage 12.0 (TID 25)
>>>> 17/08/22 14:20:38 INFO BlockManager: Found block rdd_19_0 locally
>>>> 17/08/22 14:20:38 INFO MemoryStore: Block taskresult_24 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 354.1 MB)
>>>> 17/08/22 14:20:38 INFO BlockManagerInfo: Added taskresult_24 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:38 INFO Executor: Finished task 0.0 in stage 12.0 (TID
>>>> 24). 2110568 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:38 INFO BlockManager: Found block rdd_19_1 locally
>>>> 17/08/22 14:20:38 INFO MemoryStore: Block taskresult_25 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 352.1 MB)
>>>> 17/08/22 14:20:38 INFO TaskSetManager: Finished task 0.0 in stage 12.0
>>>> (TID 24) in 57 ms on localhost (executor driver) (1/2)
>>>> 17/08/22 14:20:38 INFO BlockManagerInfo: Added taskresult_25 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 362.2 MB)
>>>> 17/08/22 14:20:38 INFO Executor: Finished task 1.0 in stage 12.0 (TID
>>>> 25). 2110568 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:38 INFO BlockManagerInfo: Removed taskresult_24 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 364.2 MB)
>>>> 17/08/22 14:20:38 INFO TaskSetManager: Finished task 1.0 in stage 12.0
>>>> (TID 25) in 94 ms on localhost (executor driver) (2/2)
>>>> 17/08/22 14:20:38 INFO TaskSchedulerImpl: Removed TaskSet 12.0, whose
>>>> tasks have all completed, from pool
>>>> 17/08/22 14:20:38 INFO BlockManagerInfo: Removed taskresult_25 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 366.2 MB)
>>>> 17/08/22 14:20:38 INFO DAGScheduler: ResultStage 12 (treeAggregate at
>>>> LogisticRegression.scala:1892) finished in 0.098 s
>>>> 17/08/22 14:20:38 INFO DAGScheduler: Job 12 finished: treeAggregate at
>>>> LogisticRegression.scala:1892, took 0.125965 s
>>>> 17/08/22 14:20:38 INFO TorrentBroadcast: Destroying Broadcast(23) (from
>>>> destroy at LogisticRegression.scala:1933)
>>>> 17/08/22 14:20:38 INFO BlockManagerInfo: Removed broadcast_23_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 10.3 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:38 INFO LBFGS: Step Size: 1.000
>>>> 17/08/22 14:20:38 INFO LBFGS: Val and Grad Norm: 0.0118476 (rel:
>>>> 0.0163) 0.00102213
>>>> 17/08/22 14:20:38 INFO LBFGS: Converged because max iterations reached
>>>> 17/08/22 14:20:38 INFO TorrentBroadcast: Destroying Broadcast(2) (from
>>>> destroy at LogisticRegression.scala:796)
>>>> 17/08/22 14:20:38 INFO BlockManagerInfo: Removed broadcast_2_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 10.2 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:38 INFO MapPartitionsRDD: Removing RDD 19 from
>>>> persistence list
>>>> 17/08/22 14:20:38 INFO BlockManager: Removing RDD 19
>>>> 17/08/22 14:20:38 INFO CodeGenerator: Code generated in 69.50928 ms
>>>> 17/08/22 14:20:38 INFO BlockManagerInfo: Removed broadcast_24_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 24.6 KB, free: 366.3 MB)
>>>> 17/08/22 14:20:39 INFO BlockManagerInfo: Removed broadcast_22_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 24.6 KB, free: 366.3 MB)
>>>> 17/08/22 14:20:39 INFO Instrumentation: LogisticRegression-LogisticReg
>>>> ression_4647a03eff9e40a729dc-1694037276-1: training finished
>>>> 17/08/22 14:20:39 INFO SparkContext: Starting job: treeAggregate at
>>>> IDF.scala:54
>>>> 17/08/22 14:20:39 INFO DAGScheduler: Got job 13 (treeAggregate at
>>>> IDF.scala:54) with 2 output partitions
>>>> 17/08/22 14:20:39 INFO DAGScheduler: Final stage: ResultStage 13
>>>> (treeAggregate at IDF.scala:54)
>>>> 17/08/22 14:20:39 INFO DAGScheduler: Parents of final stage: List()
>>>> 17/08/22 14:20:39 INFO DAGScheduler: Missing parents: List()
>>>> 17/08/22 14:20:39 INFO DAGScheduler: Submitting ResultStage 13
>>>> (MapPartitionsRDD[42] at treeAggregate at IDF.scala:54), which has no
>>>> missing parents
>>>> 17/08/22 14:20:39 INFO MemoryStore: Block broadcast_25 stored as values
>>>> in memory (estimated size 26.0 KB, free 364.2 MB)
>>>> 17/08/22 14:20:39 INFO MemoryStore: Block broadcast_25_piece0 stored as
>>>> bytes in memory (estimated size 11.6 KB, free 364.2 MB)
>>>> 17/08/22 14:20:39 INFO BlockManagerInfo: Added broadcast_25_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 11.6 KB, free: 366.3 MB)
>>>> 17/08/22 14:20:39 INFO SparkContext: Created broadcast 25 from
>>>> broadcast at DAGScheduler.scala:1006
>>>> 17/08/22 14:20:39 INFO DAGScheduler: Submitting 2 missing tasks from
>>>> ResultStage 13 (MapPartitionsRDD[42] at treeAggregate at IDF.scala:54)
>>>> (first 15 tasks are for partitions Vector(0, 1))
>>>> 17/08/22 14:20:39 INFO TaskSchedulerImpl: Adding task set 13.0 with 2
>>>> tasks
>>>> 17/08/22 14:20:39 INFO TaskSetManager: Starting task 0.0 in stage 13.0
>>>> (TID 26, localhost, executor driver, partition 0, PROCESS_LOCAL, 4883 bytes)
>>>> 17/08/22 14:20:39 INFO TaskSetManager: Starting task 1.0 in stage 13.0
>>>> (TID 27, localhost, executor driver, partition 1, PROCESS_LOCAL, 4892 bytes)
>>>> 17/08/22 14:20:39 INFO Executor: Running task 0.0 in stage 13.0 (TID 26)
>>>> 17/08/22 14:20:39 INFO Executor: Running task 1.0 in stage 13.0 (TID 27)
>>>> 17/08/22 14:20:39 INFO PythonRunner: Times: total = 52, boot = -6281,
>>>> init = 6333, finish = 0
>>>> 17/08/22 14:20:39 INFO PythonRunner: Times: total = 58, boot = -6294,
>>>> init = 6351, finish = 1
>>>> 17/08/22 14:20:39 INFO MemoryStore: Block taskresult_26 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 362.2 MB)
>>>> 17/08/22 14:20:39 INFO BlockManagerInfo: Added taskresult_26 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 364.3 MB)
>>>> 17/08/22 14:20:39 INFO MemoryStore: Block taskresult_27 stored as bytes
>>>> in memory (estimated size 2.0 MB, free 360.2 MB)
>>>> 17/08/22 14:20:39 INFO BlockManagerInfo: Added taskresult_27 in memory
>>>> on 192.168.25.187:46647 (size: 2.0 MB, free: 362.2 MB)
>>>> 17/08/22 14:20:39 INFO Executor: Finished task 1.0 in stage 13.0 (TID
>>>> 27). 2109342 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:39 INFO Executor: Finished task 0.0 in stage 13.0 (TID
>>>> 26). 2109342 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:39 INFO TaskSetManager: Finished task 0.0 in stage 13.0
>>>> (TID 26) in 156 ms on localhost (executor driver) (1/2)
>>>> 17/08/22 14:20:39 INFO BlockManagerInfo: Removed taskresult_26 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 364.3 MB)
>>>> 17/08/22 14:20:39 INFO BlockManagerInfo: Removed taskresult_27 on
>>>> 192.168.25.187:46647 in memory (size: 2.0 MB, free: 366.3 MB)
>>>> 17/08/22 14:20:39 INFO TaskSetManager: Finished task 1.0 in stage 13.0
>>>> (TID 27) in 179 ms on localhost (executor driver) (2/2)
>>>> 17/08/22 14:20:39 INFO TaskSchedulerImpl: Removed TaskSet 13.0, whose
>>>> tasks have all completed, from pool
>>>> 17/08/22 14:20:39 INFO DAGScheduler: ResultStage 13 (treeAggregate at
>>>> IDF.scala:54) finished in 0.185 s
>>>> 17/08/22 14:20:39 INFO DAGScheduler: Job 13 finished: treeAggregate at
>>>> IDF.scala:54, took 0.204303 s
>>>> 17/08/22 14:20:39 INFO Instrumentation: NaiveBayes-NaiveBayes_4836ba16aa3f054bc95c-1042799920-2:
>>>> training: numPartitions=2 storageLevel=StorageLevel(1 replicas)
>>>> 17/08/22 14:20:39 INFO Instrumentation: NaiveBayes-NaiveBayes_4836ba16aa3f054bc95c-1042799920-2:
>>>> {"smoothing":1.0,"featuresCol":"features","modelType":"multinomial","labelCol":"label","predictionCol":"prediction","rawPredictionCol":"rawPrediction","probabilityCol":"probability"}
>>>> 17/08/22 14:20:39 INFO CodeGenerator: Code generated in 49.860121 ms
>>>> 17/08/22 14:20:40 INFO SparkContext: Starting job: head at
>>>> NaiveBayes.scala:154
>>>> 17/08/22 14:20:40 INFO DAGScheduler: Got job 14 (head at
>>>> NaiveBayes.scala:154) with 1 output partitions
>>>> 17/08/22 14:20:40 INFO DAGScheduler: Final stage: ResultStage 14 (head
>>>> at NaiveBayes.scala:154)
>>>> 17/08/22 14:20:40 INFO DAGScheduler: Parents of final stage: List()
>>>> 17/08/22 14:20:40 INFO DAGScheduler: Missing parents: List()
>>>> 17/08/22 14:20:40 INFO DAGScheduler: Submitting ResultStage 14
>>>> (MapPartitionsRDD[49] at head at NaiveBayes.scala:154), which has no
>>>> missing parents
>>>> 17/08/22 14:20:40 INFO MemoryStore: Block broadcast_26 stored as values
>>>> in memory (estimated size 2.0 MB, free 362.2 MB)
>>>> 17/08/22 14:20:40 INFO MemoryStore: Block broadcast_26_piece0 stored as
>>>> bytes in memory (estimated size 22.0 KB, free 362.2 MB)
>>>> 17/08/22 14:20:40 INFO BlockManagerInfo: Added broadcast_26_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 22.0 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:40 INFO SparkContext: Created broadcast 26 from
>>>> broadcast at DAGScheduler.scala:1006
>>>> 17/08/22 14:20:40 INFO DAGScheduler: Submitting 1 missing tasks from
>>>> ResultStage 14 (MapPartitionsRDD[49] at head at NaiveBayes.scala:154)
>>>> (first 15 tasks are for partitions Vector(0))
>>>> 17/08/22 14:20:40 INFO TaskSchedulerImpl: Adding task set 14.0 with 1
>>>> tasks
>>>> 17/08/22 14:20:40 INFO TaskSetManager: Starting task 0.0 in stage 14.0
>>>> (TID 28, localhost, executor driver, partition 0, PROCESS_LOCAL, 4883 bytes)
>>>> 17/08/22 14:20:40 INFO Executor: Running task 0.0 in stage 14.0 (TID 28)
>>>> 17/08/22 14:20:40 INFO PythonRunner: Times: total = 42, boot = -541,
>>>> init = 583, finish = 0
>>>> 17/08/22 14:20:40 INFO Executor: Finished task 0.0 in stage 14.0 (TID
>>>> 28). 1713 bytes result sent to driver
>>>> 17/08/22 14:20:40 INFO TaskSetManager: Finished task 0.0 in stage 14.0
>>>> (TID 28) in 78 ms on localhost (executor driver) (1/1)
>>>> 17/08/22 14:20:40 INFO TaskSchedulerImpl: Removed TaskSet 14.0, whose
>>>> tasks have all completed, from pool
>>>> 17/08/22 14:20:40 INFO DAGScheduler: ResultStage 14 (head at
>>>> NaiveBayes.scala:154) finished in 0.081 s
>>>> 17/08/22 14:20:40 INFO DAGScheduler: Job 14 finished: head at
>>>> NaiveBayes.scala:154, took 0.129266 s
>>>> 17/08/22 14:20:40 INFO Instrumentation: NaiveBayes-NaiveBayes_4836ba16aa3f054bc95c-1042799920-2:
>>>> {"numFeatures":262144}
>>>> 17/08/22 14:20:40 INFO SparkContext: Starting job: collect at
>>>> NaiveBayes.scala:174
>>>> 17/08/22 14:20:40 INFO DAGScheduler: Registering RDD 54 (map at
>>>> NaiveBayes.scala:162)
>>>> 17/08/22 14:20:40 INFO DAGScheduler: Got job 15 (collect at
>>>> NaiveBayes.scala:174) with 2 output partitions
>>>> 17/08/22 14:20:40 INFO DAGScheduler: Final stage: ResultStage 16
>>>> (collect at NaiveBayes.scala:174)
>>>> 17/08/22 14:20:40 INFO DAGScheduler: Parents of final stage:
>>>> List(ShuffleMapStage 15)
>>>> 17/08/22 14:20:40 INFO DAGScheduler: Missing parents:
>>>> List(ShuffleMapStage 15)
>>>> 17/08/22 14:20:40 INFO DAGScheduler: Submitting ShuffleMapStage 15
>>>> (MapPartitionsRDD[54] at map at NaiveBayes.scala:162), which has no missing
>>>> parents
>>>> 17/08/22 14:20:40 INFO BlockManagerInfo: Removed broadcast_26_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 22.0 KB, free: 366.3 MB)
>>>> 17/08/22 14:20:40 INFO MemoryStore: Block broadcast_27 stored as values
>>>> in memory (estimated size 6.0 MB, free 358.2 MB)
>>>> 17/08/22 14:20:40 INFO ContextCleaner: Cleaned accumulator 373
>>>> 17/08/22 14:20:40 INFO ContextCleaner: Cleaned accumulator 374
>>>> 17/08/22 14:20:40 INFO BlockManagerInfo: Removed broadcast_25_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 11.6 KB, free: 366.3 MB)
>>>> 17/08/22 14:20:40 INFO MemoryStore: Block broadcast_27_piece0 stored as
>>>> bytes in memory (estimated size 45.0 KB, free 358.2 MB)
>>>> 17/08/22 14:20:40 INFO BlockManagerInfo: Added broadcast_27_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 45.0 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:40 INFO ContextCleaner: Cleaned accumulator 346
>>>> 17/08/22 14:20:40 INFO ContextCleaner: Cleaned accumulator 345
>>>> 17/08/22 14:20:40 INFO SparkContext: Created broadcast 27 from
>>>> broadcast at DAGScheduler.scala:1006
>>>> 17/08/22 14:20:40 INFO DAGScheduler: Submitting 2 missing tasks from
>>>> ShuffleMapStage 15 (MapPartitionsRDD[54] at map at NaiveBayes.scala:162)
>>>> (first 15 tasks are for partitions Vector(0, 1))
>>>> 17/08/22 14:20:40 INFO TaskSchedulerImpl: Adding task set 15.0 with 2
>>>> tasks
>>>> 17/08/22 14:20:40 INFO TaskSetManager: Starting task 0.0 in stage 15.0
>>>> (TID 29, localhost, executor driver, partition 0, PROCESS_LOCAL, 4872 bytes)
>>>> 17/08/22 14:20:40 INFO TaskSetManager: Starting task 1.0 in stage 15.0
>>>> (TID 30, localhost, executor driver, partition 1, PROCESS_LOCAL, 4881 bytes)
>>>> 17/08/22 14:20:40 INFO Executor: Running task 1.0 in stage 15.0 (TID 30)
>>>> 17/08/22 14:20:40 INFO Executor: Running task 0.0 in stage 15.0 (TID 29)
>>>> 17/08/22 14:20:40 INFO PythonRunner: Times: total = 55, boot = -340,
>>>> init = 394, finish = 1
>>>> 17/08/22 14:20:40 INFO PythonRunner: Times: total = 47, boot = -962,
>>>> init = 1009, finish = 0
>>>> 17/08/22 14:20:40 INFO Executor: Finished task 1.0 in stage 15.0 (TID
>>>> 30). 1867 bytes result sent to driver
>>>> 17/08/22 14:20:40 INFO Executor: Finished task 0.0 in stage 15.0 (TID
>>>> 29). 1867 bytes result sent to driver
>>>> 17/08/22 14:20:40 INFO TaskSetManager: Finished task 1.0 in stage 15.0
>>>> (TID 30) in 220 ms on localhost (executor driver) (1/2)
>>>> 17/08/22 14:20:40 INFO TaskSetManager: Finished task 0.0 in stage 15.0
>>>> (TID 29) in 224 ms on localhost (executor driver) (2/2)
>>>> 17/08/22 14:20:40 INFO TaskSchedulerImpl: Removed TaskSet 15.0, whose
>>>> tasks have all completed, from pool
>>>> 17/08/22 14:20:40 INFO DAGScheduler: ShuffleMapStage 15 (map at
>>>> NaiveBayes.scala:162) finished in 0.227 s
>>>> 17/08/22 14:20:40 INFO DAGScheduler: looking for newly runnable stages
>>>> 17/08/22 14:20:40 INFO DAGScheduler: running: Set()
>>>> 17/08/22 14:20:40 INFO DAGScheduler: waiting: Set(ResultStage 16)
>>>> 17/08/22 14:20:40 INFO DAGScheduler: failed: Set()
>>>> 17/08/22 14:20:40 INFO DAGScheduler: Submitting ResultStage 16
>>>> (ShuffledRDD[55] at aggregateByKey at NaiveBayes.scala:163), which has no
>>>> missing parents
>>>> 17/08/22 14:20:40 INFO MemoryStore: Block broadcast_28 stored as values
>>>> in memory (estimated size 6.0 MB, free 352.1 MB)
>>>> 17/08/22 14:20:40 INFO MemoryStore: Block broadcast_28_piece0 stored as
>>>> bytes in memory (estimated size 45.3 KB, free 352.1 MB)
>>>> 17/08/22 14:20:40 INFO BlockManagerInfo: Added broadcast_28_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 45.3 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:40 INFO SparkContext: Created broadcast 28 from
>>>> broadcast at DAGScheduler.scala:1006
>>>> 17/08/22 14:20:40 INFO DAGScheduler: Submitting 2 missing tasks from
>>>> ResultStage 16 (ShuffledRDD[55] at aggregateByKey at NaiveBayes.scala:163)
>>>> (first 15 tasks are for partitions Vector(0, 1))
>>>> 17/08/22 14:20:40 INFO TaskSchedulerImpl: Adding task set 16.0 with 2
>>>> tasks
>>>> 17/08/22 14:20:40 INFO TaskSetManager: Starting task 1.0 in stage 16.0
>>>> (TID 31, localhost, executor driver, partition 1, PROCESS_LOCAL, 4621 bytes)
>>>> 17/08/22 14:20:40 INFO TaskSetManager: Starting task 0.0 in stage 16.0
>>>> (TID 32, localhost, executor driver, partition 0, ANY, 4621 bytes)
>>>> 17/08/22 14:20:40 INFO Executor: Running task 0.0 in stage 16.0 (TID 32)
>>>> 17/08/22 14:20:40 INFO Executor: Running task 1.0 in stage 16.0 (TID 31)
>>>> 17/08/22 14:20:40 INFO ShuffleBlockFetcherIterator: Getting 2 non-empty
>>>> blocks out of 2 blocks
>>>> 17/08/22 14:20:40 INFO ShuffleBlockFetcherIterator: Started 0 remote
>>>> fetches in 13 ms
>>>> 17/08/22 14:20:40 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty
>>>> blocks out of 2 blocks
>>>> 17/08/22 14:20:40 INFO ShuffleBlockFetcherIterator: Started 0 remote
>>>> fetches in 19 ms
>>>> 17/08/22 14:20:40 INFO Executor: Finished task 1.0 in stage 16.0 (TID
>>>> 31). 1888 bytes result sent to driver
>>>> 17/08/22 14:20:40 INFO TaskSetManager: Finished task 1.0 in stage 16.0
>>>> (TID 31) in 99 ms on localhost (executor driver) (1/2)
>>>> 17/08/22 14:20:40 INFO MemoryStore: Block taskresult_32 stored as bytes
>>>> in memory (estimated size 4.0 MB, free 348.1 MB)
>>>> 17/08/22 14:20:40 INFO BlockManagerInfo: Added taskresult_32 in memory
>>>> on 192.168.25.187:46647 (size: 4.0 MB, free: 362.2 MB)
>>>> 17/08/22 14:20:40 INFO Executor: Finished task 0.0 in stage 16.0 (TID
>>>> 32). 4217031 bytes result sent via BlockManager)
>>>> 17/08/22 14:20:40 INFO TaskSetManager: Finished task 0.0 in stage 16.0
>>>> (TID 32) in 217 ms on localhost (executor driver) (2/2)
>>>> 17/08/22 14:20:40 INFO TaskSchedulerImpl: Removed TaskSet 16.0, whose
>>>> tasks have all completed, from pool
>>>> 17/08/22 14:20:40 INFO DAGScheduler: ResultStage 16 (collect at
>>>> NaiveBayes.scala:174) finished in 0.225 s
>>>> 17/08/22 14:20:40 INFO DAGScheduler: Job 15 finished: collect at
>>>> NaiveBayes.scala:174, took 0.589796 s
>>>> 17/08/22 14:20:40 INFO BlockManagerInfo: Removed taskresult_32 on
>>>> 192.168.25.187:46647 in memory (size: 4.0 MB, free: 366.2 MB)
>>>> 17/08/22 14:20:40 INFO Instrumentation: NaiveBayes-NaiveBayes_4836ba16aa3f054bc95c-1042799920-2:
>>>> {"numClasses":2}
>>>> 17/08/22 14:20:41 INFO Instrumentation: NaiveBayes-NaiveBayes_4836ba16aa3f054bc95c-1042799920-2:
>>>> training finished
>>>> 17/08/22 14:20:41 INFO SparkSqlParser: Parsing command: 1 as id
>>>> 17/08/22 14:20:41 INFO SparkSqlParser: Parsing command: CAST(value AS
>>>> STRING) as text
>>>> 17/08/22 14:20:42 INFO BlockManagerInfo: Removed broadcast_27_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 45.0 KB, free: 366.2 MB)
>>>> 17/08/22 14:20:42 INFO ContextCleaner: Cleaned accumulator 372
>>>> 17/08/22 14:20:42 INFO BlockManagerInfo: Removed broadcast_28_piece0 on
>>>> 192.168.25.187:46647 in memory (size: 45.3 KB, free: 366.3 MB)
>>>> 17/08/22 14:20:42 INFO ContextCleaner: Cleaned accumulator 371
>>>> 17/08/22 14:20:42 INFO ContextCleaner: Cleaned accumulator 400
>>>> 17/08/22 14:20:42 INFO ContextCleaner: Cleaned accumulator 399
>>>> 17/08/22 14:20:42 INFO ContextCleaner: Cleaned shuffle 0
>>>> printing schema
>>>> root
>>>>  |-- id: integer (nullable = false)
>>>>  |-- text: string (nullable = true)
>>>>  |-- prediction: double (nullable = true)
>>>>
>>>> root
>>>>  |-- id: integer (nullable = false)
>>>>  |-- text: string (nullable = true)
>>>>  |-- prediction: double (nullable = true)
>>>>
>>>> root
>>>>  |-- id: integer (nullable = false)
>>>>  |-- text: string (nullable = true)
>>>>
>>>> 17/08/22 14:20:42 INFO StreamExecution: Starting [id =
>>>> db779c29-4790-4b60-b035-f5701f6cc8dd, runId =
>>>> 1996e5c6-aac3-45b2-871e-c2ba22c2da92]. Use
>>>> /tmp/temporary-5d7c55ec-4704-48ce-bf49-bb900f85a284 to store the query
>>>> checkpoint.
>>>> root
>>>>  |-- id: integer (nullable = false)
>>>>  |-- text: string (nullable = true)
>>>>  |-- prediction: double (nullable = true)
>>>>
>>>> 17/08/22 14:20:42 INFO ConsumerConfig: ConsumerConfig values:
>>>>         metric.reporters = []
>>>>         metadata.max.age.ms = 300000
>>>>         partition.assignment.strategy = [org.apache.kafka.clients.consumer.RangeAssignor]
>>>>         reconnect.backoff.ms = 50
>>>>         sasl.kerberos.ticket.renew.window.factor = 0.8
>>>>         max.partition.fetch.bytes = 1048576
>>>>         bootstrap.servers = [localhost:9092]
>>>>         ssl.keystore.type = JKS
>>>>         enable.auto.commit = false
>>>>         sasl.mechanism = GSSAPI
>>>>         interceptor.classes = null
>>>>         exclude.internal.topics = true
>>>>         ssl.truststore.password = null
>>>>         client.id =
>>>>         ssl.endpoint.identification.algorithm = null
>>>>         max.poll.records = 1
>>>>         check.crcs = true
>>>>         request.timeout.ms = 40000
>>>>         heartbeat.interval.ms = 3000
>>>>         auto.commit.interval.ms = 5000
>>>>         receive.buffer.bytes = 65536
>>>>         ssl.truststore.type = JKS
>>>>         ssl.truststore.location = null
>>>>         ssl.keystore.password = null
>>>>         fetch.min.bytes = 1
>>>>         send.buffer.bytes = 131072
>>>>         value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
>>>>         group.id = spark-kafka-source-49941407-b415-4f5b-a584-5524a84e9066-1732634835-driver-0
>>>>         retry.backoff.ms = 100
>>>>         sasl.kerberos.kinit.cmd = /usr/bin/kinit
>>>>         sasl.kerberos.service.name = null
>>>>         sasl.kerberos.ticket.renew.jitter = 0.05
>>>>         ssl.trustmanager.algorithm = PKIX
>>>>         ssl.key.password = null
>>>>         fetch.max.wait.ms = 500
>>>>         sasl.kerberos.min.time.before.relogin = 60000
>>>>         connections.max.idle.ms = 540000
>>>>         session.timeout.ms = 30000
>>>>         metrics.num.samples = 2
>>>>         key.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
>>>>         ssl.protocol = TLS
>>>>         ssl.provider = null
>>>>         ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
>>>>         ssl.keystore.location = null
>>>>         ssl.cipher.suites = null
>>>>         security.protocol = PLAINTEXT
>>>>         ssl.keymanager.algorithm = SunX509
>>>>         metrics.sample.window.ms = 30000
>>>>         auto.offset.reset = earliest
>>>>
>>>> 17/08/22 14:20:42 INFO SparkSqlParser: Parsing command: CAST(id AS
>>>> STRING) as key
>>>> 17/08/22 14:20:42 INFO SparkSqlParser: Parsing command:
>>>> CONCAT(CAST(text AS STRING),':', CAST(prediction AS STRING)) as value
>>>> 17/08/22 14:20:42 INFO ConsumerConfig: ConsumerConfig values:
>>>>         metric.reporters = []
>>>>         metadata.max.age.ms = 300000
>>>>         partition.assignment.strategy = [org.apache.kafka.clients.consumer.RangeAssignor]
>>>>         reconnect.backoff.ms = 50
>>>>         sasl.kerberos.ticket.renew.window.factor = 0.8
>>>>         max.partition.fetch.bytes = 1048576
>>>>         bootstrap.servers = [localhost:9092]
>>>>         ssl.keystore.type = JKS
>>>>         enable.auto.commit = false
>>>>         sasl.mechanism = GSSAPI
>>>>         interceptor.classes = null
>>>>         exclude.internal.topics = true
>>>>         ssl.truststore.password = null
>>>>         client.id = consumer-1
>>>>         ssl.endpoint.identification.algorithm = null
>>>>         max.poll.records = 1
>>>>         check.crcs = true
>>>>         request.timeout.ms = 40000
>>>>         heartbeat.interval.ms = 3000
>>>>         auto.commit.interval.ms = 5000
>>>>         receive.buffer.bytes = 65536
>>>>         ssl.truststore.type = JKS
>>>>         ssl.truststore.location = null
>>>>         ssl.keystore.password = null
>>>>         fetch.min.bytes = 1
>>>>         send.buffer.bytes = 131072
>>>>         value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
>>>>         group.id = spark-kafka-source-49941407-b415-4f5b-a584-5524a84e9066-1732634835-driver-0
>>>>         retry.backoff.ms = 100
>>>>         sasl.kerberos.kinit.cmd = /usr/bin/kinit
>>>>         sasl.kerberos.service.name = null
>>>>         sasl.kerberos.ticket.renew.jitter = 0.05
>>>>         ssl.trustmanager.algorithm = PKIX
>>>>         ssl.key.password = null
>>>>         fetch.max.wait.ms = 500
>>>>         sasl.kerberos.min.time.before.relogin = 60000
>>>>         connections.max.idle.ms = 540000
>>>>         session.timeout.ms = 30000
>>>>         metrics.num.samples = 2
>>>>         key.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
>>>>         ssl.protocol = TLS
>>>>         ssl.provider = null
>>>>         ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
>>>>         ssl.keystore.location = null
>>>>         ssl.cipher.suites = null
>>>>         security.protocol = PLAINTEXT
>>>>         ssl.keymanager.algorithm = SunX509
>>>>         metrics.sample.window.ms = 30000
>>>>         auto.offset.reset = earliest
>>>>
>>>> 17/08/22 14:20:42 INFO StreamExecution: Starting [id =
>>>> b4288c77-f554-4c98-bc63-5539db7d1869, runId =
>>>> 3efa5ffc-230f-4819-a7d2-47ae0e6a8a43]. Use /tmp/kaf_tgt to store the
>>>> query checkpoint.
>>>> 17/08/22 14:20:42 INFO ConsumerConfig: ConsumerConfig values:
>>>>         metric.reporters = []
>>>>         metadata.max.age.ms = 300000
>>>>         partition.assignment.strategy = [org.apache.kafka.clients.consumer.RangeAssignor]
>>>>         reconnect.backoff.ms = 50
>>>>         sasl.kerberos.ticket.renew.window.factor = 0.8
>>>>         max.partition.fetch.bytes = 1048576
>>>>         bootstrap.servers = [localhost:9092]
>>>>         ssl.keystore.type = JKS
>>>>         enable.auto.commit = false
>>>>         sasl.mechanism = GSSAPI
>>>>         interceptor.classes = null
>>>>         exclude.internal.topics = true
>>>>         ssl.truststore.password = null
>>>>         client.id =
>>>>         ssl.endpoint.identification.algorithm = null
>>>>         max.poll.records = 1
>>>>         check.crcs = true
>>>>         request.timeout.ms = 40000
>>>>         heartbeat.interval.ms = 3000
>>>>         auto.commit.interval.ms = 5000
>>>>         receive.buffer.bytes = 65536
>>>>         ssl.truststore.type = JKS
>>>>         ssl.truststore.location = null
>>>>         ssl.keystore.password = null
>>>>         fetch.min.bytes = 1
>>>>         send.buffer.bytes = 131072
>>>>         value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
>>>>         group.id = spark-kafka-source-607b6303-bda0-4e93-b6fd-d46d4a4d2e1f--19863691-driver-0
>>>>         retry.backoff.ms = 100
>>>>         sasl.kerberos.kinit.cmd = /usr/bin/kinit
>>>>         sasl.kerberos.service.name = null
>>>>         sasl.kerberos.ticket.renew.jitter = 0.05
>>>>         ssl.trustmanager.algorithm = PKIX
>>>>         ssl.key.password = null
>>>>         fetch.max.wait.ms = 500
>>>>         sasl.kerberos.min.time.before.relogin = 60000
>>>>         connections.max.idle.ms = 540000
>>>>         session.timeout.ms = 30000
>>>>         metrics.num.samples = 2
>>>>         key.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
>>>>         ssl.protocol = TLS
>>>>         ssl.provider = null
>>>>         ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
>>>>         ssl.keystore.location = null
>>>>         ssl.cipher.suites = null
>>>>         security.protocol = PLAINTEXT
>>>>         ssl.keymanager.algorithm = SunX509
>>>>         metrics.sample.window.ms = 30000
>>>>         auto.offset.reset = earliest
>>>>
>>>> 17/08/22 14:20:42 INFO ConsumerConfig: ConsumerConfig values:
>>>>         metric.reporters = []
>>>>         metadata.max.age.ms = 300000
>>>>         partition.assignment.strategy = [org.apache.kafka.clients.consumer.RangeAssignor]
>>>>         reconnect.backoff.ms = 50
>>>>         sasl.kerberos.ticket.renew.window.factor = 0.8
>>>>         max.partition.fetch.bytes = 1048576
>>>>         bootstrap.servers = [localhost:9092]
>>>>         ssl.keystore.type = JKS
>>>>         enable.auto.commit = false
>>>>         sasl.mechanism = GSSAPI
>>>>         interceptor.classes = null
>>>>         exclude.internal.topics = true
>>>>         ssl.truststore.password = null
>>>>         client.id = consumer-2
>>>>         ssl.endpoint.identification.algorithm = null
>>>>         max.poll.records = 1
>>>>         check.crcs = true
>>>>         request.timeout.ms = 40000
>>>>         heartbeat.interval.ms = 3000
>>>>         auto.commit.interval.ms = 5000
>>>>         receive.buffer.bytes = 65536
>>>>         ssl.truststore.type = JKS
>>>>         ssl.truststore.location = null
>>>>         ssl.keystore.password = null
>>>>         fetch.min.bytes = 1
>>>>         send.buffer.bytes = 131072
>>>>         value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
>>>>         group.id = spark-kafka-source-607b6303-bda0-4e93-b6fd-d46d4a4d2e1f--19863691-driver-0
>>>>         retry.backoff.ms = 100
>>>>         sasl.kerberos.kinit.cmd = /usr/bin/kinit
>>>>         sasl.kerberos.service.name = null
>>>>         sasl.kerberos.ticket.renew.jitter = 0.05
>>>>         ssl.trustmanager.algorithm = PKIX
>>>>         ssl.key.password = null
>>>>         fetch.max.wait.ms = 500
>>>>         sasl.kerberos.min.time.before.relogin = 60000
>>>>         connections.max.idle.ms = 540000
>>>>         session.timeout.ms = 30000
>>>>         metrics.num.samples = 2
>>>>         key.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
>>>>         ssl.protocol = TLS
>>>>         ssl.provider = null
>>>>         ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
>>>>         ssl.keystore.location = null
>>>>         ssl.cipher.suites = null
>>>>         security.protocol = PLAINTEXT
>>>>         ssl.keymanager.algorithm = SunX509
>>>>         metrics.sample.window.ms = 30000
>>>>         auto.offset.reset = earliest
>>>>
>>>> 17/08/22 14:20:42 INFO AppInfoParser: Kafka version : 0.10.0.1
>>>> 17/08/22 14:20:42 INFO AppInfoParser: Kafka commitId : a7a17cdec9eaa6c5
>>>> 17/08/22 14:20:42 INFO AppInfoParser: Kafka version : 0.10.0.1
>>>> 17/08/22 14:20:42 INFO AppInfoParser: Kafka commitId : a7a17cdec9eaa6c5
>>>> 17/08/22 14:20:42 INFO StreamExecution: Starting new streaming query.
>>>> 17/08/22 14:20:43 ERROR StreamExecution: Query [id =
>>>> b4288c77-f554-4c98-bc63-5539db7d1869, runId =
>>>> 3efa5ffc-230f-4819-a7d2-47ae0e6a8a43] terminated with error
>>>> java.lang.AssertionError: assertion failed
>>>>         at scala.Predef$.assert(Predef.scala:156)
>>>>         at org.apache.spark.sql.execution.streaming.OffsetSeq.toStreamProgress(OffsetSeq.scala:38)
>>>>         at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$populateStartOffsets(StreamExecution.scala:429)
>>>>         at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$apply$mcZ$sp$1.apply$mcV$sp(StreamExecution.scala:297)
>>>>         at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$apply$mcZ$sp$1.apply(StreamExecution.scala:294)
>>>>         at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$apply$mcZ$sp$1.apply(StreamExecution.scala:294)
>>>>         at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:279)
>>>>         at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
>>>>         at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1.apply$mcZ$sp(StreamExecution.scala:294)
>>>>         at org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:56)
>>>>         at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches(StreamExecution.scala:290)
>>>>         at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:206)
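For readers following the trace: the assertion at OffsetSeq.scala:38 fires during restart when the number of offsets recovered from the checkpoint does not match the number of sources in the restarted query's plan. A union adds a second Kafka source, so a checkpoint directory written by an earlier single-source run (such as the reused /tmp/kaf_tgt above) no longer lines up. A minimal pure-Python sketch of that counting check (this models only the length assertion, not Spark's actual implementation):

```python
def to_stream_progress(checkpoint_offsets, sources):
    """Model of OffsetSeq.toStreamProgress: pair each recovered
    offset with its source; counts must match or the query aborts."""
    assert len(checkpoint_offsets) == len(sources), "assertion failed"
    return dict(zip(sources, checkpoint_offsets))

# Checkpoint written by a single-source query: one offset entry.
old_checkpoint = [{"spark-stream": {"0": 15}}]

# After adding a union, the restarted plan has TWO Kafka sources
# (hypothetical source labels for illustration):
sources = ["KafkaSource[spark-stream] #1", "KafkaSource[spark-stream] #2"]

try:
    to_stream_progress(old_checkpoint, sources)
except AssertionError as e:
    print("AssertionError:", e)  # same failure mode as the log above
```

Under this reading, clearing the stale checkpoint directory (or using a fresh one for the changed query) is what lets the restarted union query start cleanly.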
>>>> 17/08/22 14:20:43 INFO AbstractCoordinator: Discovered coordinator
>>>> Sandbox.RHEL:9092 (id: 2147483647 rack: null) for group
>>>> spark-kafka-source-49941407-b415-4f5b-a584-5524a84e9066-1732634835-driver-0.
>>>> 17/08/22 14:20:43 INFO ConsumerCoordinator: Revoking previously
>>>> assigned partitions [] for group spark-kafka-source-49941407-b415-4f5b-a584-5524a84e9066-1732634835-driver-0
>>>> 17/08/22 14:20:43 INFO AbstractCoordinator: (Re-)joining group
>>>> spark-kafka-source-49941407-b415-4f5b-a584-5524a84e9066-1732634835-driver-0
>>>> 17/08/22 14:20:43 INFO AbstractCoordinator: Successfully joined group
>>>> spark-kafka-source-49941407-b415-4f5b-a584-5524a84e9066-1732634835-driver-0
>>>> with generation 1
>>>> 17/08/22 14:20:43 INFO ConsumerCoordinator: Setting newly assigned
>>>> partitions [spark-stream-0] for group spark-kafka-source-49941407-b415-4f5b-a584-5524a84e9066-1732634835-driver-0
>>>> 17/08/22 14:20:43 INFO KafkaSource: Initial offsets:
>>>> {"spark-stream":{"0":15}}
>>>> 17/08/22 14:20:44 INFO StreamExecution: Committed offsets for batch 0.
>>>> Metadata OffsetSeqMetadata(0,1503391843742,Map(spark.sql.shuffle.partitions
>>>> -> 200))
>>>> 17/08/22 14:20:44 INFO KafkaSource: GetBatch called with start = None,
>>>> end = {"spark-stream":{"0":15}}
>>>> 17/08/22 14:20:44 INFO KafkaSource: Partitions added: Map()
>>>> 17/08/22 14:20:44 INFO KafkaSource: GetBatch generating RDD of offset
>>>> range: KafkaSourceRDDOffsetRange(spark-stream-0,15,15,None)
>>>> -------------------------------------------
>>>> Batch: 0
>>>> -------------------------------------------
>>>> 17/08/22 14:20:44 INFO CodeGenerator: Code generated in 26.485704 ms
>>>> 17/08/22 14:20:44 INFO SparkContext: Starting job: start at
>>>> NativeMethodAccessorImpl.java:0
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Got job 16 (start at
>>>> NativeMethodAccessorImpl.java:0) with 1 output partitions
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Final stage: ResultStage 17 (start
>>>> at NativeMethodAccessorImpl.java:0)
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Parents of final stage: List()
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Missing parents: List()
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Submitting ResultStage 17
>>>> (MapPartitionsRDD[60] at start at NativeMethodAccessorImpl.java:0),
>>>> which has no missing parents
>>>> 17/08/22 14:20:44 INFO MemoryStore: Block broadcast_29 stored as values
>>>> in memory (estimated size 8.9 KB, free 364.2 MB)
>>>> 17/08/22 14:20:44 INFO MemoryStore: Block broadcast_29_piece0 stored as
>>>> bytes in memory (estimated size 4.8 KB, free 364.2 MB)
>>>> 17/08/22 14:20:44 INFO BlockManagerInfo: Added broadcast_29_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 4.8 KB, free: 366.3 MB)
>>>> 17/08/22 14:20:44 INFO SparkContext: Created broadcast 29 from
>>>> broadcast at DAGScheduler.scala:1006
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Submitting 1 missing tasks from
>>>> ResultStage 17 (MapPartitionsRDD[60] at start at
>>>> NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions
>>>> Vector(0))
>>>> 17/08/22 14:20:44 INFO TaskSchedulerImpl: Adding task set 17.0 with 1
>>>> tasks
>>>> 17/08/22 14:20:44 INFO TaskSetManager: Starting task 0.0 in stage 17.0
>>>> (TID 33, localhost, executor driver, partition 0, PROCESS_LOCAL, 5007 bytes)
>>>> 17/08/22 14:20:44 INFO Executor: Running task 0.0 in stage 17.0 (TID 33)
>>>> 17/08/22 14:20:44 INFO ConsumerConfig: ConsumerConfig values:
>>>>         metric.reporters = []
>>>>         metadata.max.age.ms = 300000
>>>>         partition.assignment.strategy = [org.apache.kafka.clients.consumer.RangeAssignor]
>>>>         reconnect.backoff.ms = 50
>>>>         sasl.kerberos.ticket.renew.window.factor = 0.8
>>>>         max.partition.fetch.bytes = 1048576
>>>>         bootstrap.servers = [localhost:9092]
>>>>         ssl.keystore.type = JKS
>>>>         enable.auto.commit = false
>>>>         sasl.mechanism = GSSAPI
>>>>         interceptor.classes = null
>>>>         exclude.internal.topics = true
>>>>         ssl.truststore.password = null
>>>>         client.id =
>>>>         ssl.endpoint.identification.algorithm = null
>>>>         max.poll.records = 2147483647
>>>>         check.crcs = true
>>>>         request.timeout.ms = 40000
>>>>         heartbeat.interval.ms = 3000
>>>>         auto.commit.interval.ms = 5000
>>>>         receive.buffer.bytes = 65536
>>>>         ssl.truststore.type = JKS
>>>>         ssl.truststore.location = null
>>>>         ssl.keystore.password = null
>>>>         fetch.min.bytes = 1
>>>>         send.buffer.bytes = 131072
>>>>         value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
>>>>         group.id = spark-kafka-source-49941407-b415-4f5b-a584-5524a84e9066-1732634835-executor
>>>>         retry.backoff.ms = 100
>>>>         sasl.kerberos.kinit.cmd = /usr/bin/kinit
>>>>         sasl.kerberos.service.name = null
>>>>         sasl.kerberos.ticket.renew.jitter = 0.05
>>>>         ssl.trustmanager.algorithm = PKIX
>>>>         ssl.key.password = null
>>>>         fetch.max.wait.ms = 500
>>>>         sasl.kerberos.min.time.before.relogin = 60000
>>>>         connections.max.idle.ms = 540000
>>>>         session.timeout.ms = 30000
>>>>         metrics.num.samples = 2
>>>>         key.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
>>>>         ssl.protocol = TLS
>>>>         ssl.provider = null
>>>>         ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
>>>>         ssl.keystore.location = null
>>>>         ssl.cipher.suites = null
>>>>         security.protocol = PLAINTEXT
>>>>         ssl.keymanager.algorithm = SunX509
>>>>         metrics.sample.window.ms = 30000
>>>>         auto.offset.reset = none
>>>>
>>>> 17/08/22 14:20:44 INFO ConsumerConfig: ConsumerConfig values:
>>>>         metric.reporters = []
>>>>         metadata.max.age.ms = 300000
>>>>         partition.assignment.strategy = [org.apache.kafka.clients.consumer.RangeAssignor]
>>>>         reconnect.backoff.ms = 50
>>>>         sasl.kerberos.ticket.renew.window.factor = 0.8
>>>>         max.partition.fetch.bytes = 1048576
>>>>         bootstrap.servers = [localhost:9092]
>>>>         ssl.keystore.type = JKS
>>>>         enable.auto.commit = false
>>>>         sasl.mechanism = GSSAPI
>>>>         interceptor.classes = null
>>>>         exclude.internal.topics = true
>>>>         ssl.truststore.password = null
>>>>         client.id = consumer-3
>>>>         ssl.endpoint.identification.algorithm = null
>>>>         max.poll.records = 2147483647
>>>>         check.crcs = true
>>>>         request.timeout.ms = 40000
>>>>         heartbeat.interval.ms = 3000
>>>>         auto.commit.interval.ms = 5000
>>>>         receive.buffer.bytes = 65536
>>>>         ssl.truststore.type = JKS
>>>>         ssl.truststore.location = null
>>>>         ssl.keystore.password = null
>>>>         fetch.min.bytes = 1
>>>>         send.buffer.bytes = 131072
>>>>         value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
>>>>         group.id = spark-kafka-source-49941407-b415-4f5b-a584-5524a84e9066-1732634835-executor
>>>>         retry.backoff.ms = 100
>>>>         sasl.kerberos.kinit.cmd = /usr/bin/kinit
>>>>         sasl.kerberos.service.name = null
>>>>         sasl.kerberos.ticket.renew.jitter = 0.05
>>>>         ssl.trustmanager.algorithm = PKIX
>>>>         ssl.key.password = null
>>>>         fetch.max.wait.ms = 500
>>>>         sasl.kerberos.min.time.before.relogin = 60000
>>>>         connections.max.idle.ms = 540000
>>>>         session.timeout.ms = 30000
>>>>         metrics.num.samples = 2
>>>>         key.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
>>>>         ssl.protocol = TLS
>>>>         ssl.provider = null
>>>>         ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
>>>>         ssl.keystore.location = null
>>>>         ssl.cipher.suites = null
>>>>         security.protocol = PLAINTEXT
>>>>         ssl.keymanager.algorithm = SunX509
>>>>         metrics.sample.window.ms = 30000
>>>>         auto.offset.reset = none
>>>>
>>>> 17/08/22 14:20:44 INFO AppInfoParser: Kafka version : 0.10.0.1
>>>> 17/08/22 14:20:44 INFO AppInfoParser: Kafka commitId : a7a17cdec9eaa6c5
>>>> 17/08/22 14:20:44 INFO KafkaSourceRDD: Beginning offset 15 is the same
>>>> as ending offset skipping spark-stream 0
>>>> 17/08/22 14:20:44 INFO CodeGenerator: Code generated in 27.911741 ms
>>>> 17/08/22 14:20:44 INFO Executor: Finished task 0.0 in stage 17.0 (TID
>>>> 33). 1117 bytes result sent to driver
>>>> 17/08/22 14:20:44 INFO TaskSetManager: Finished task 0.0 in stage 17.0
>>>> (TID 33) in 69 ms on localhost (executor driver) (1/1)
>>>> 17/08/22 14:20:44 INFO TaskSchedulerImpl: Removed TaskSet 17.0, whose
>>>> tasks have all completed, from pool
>>>> 17/08/22 14:20:44 INFO DAGScheduler: ResultStage 17 (start at
>>>> NativeMethodAccessorImpl.java:0) finished in 0.071 s
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Job 16 finished: start at
>>>> NativeMethodAccessorImpl.java:0, took 0.087629 s
>>>> 17/08/22 14:20:44 INFO SparkContext: Starting job: start at
>>>> NativeMethodAccessorImpl.java:0
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Got job 17 (start at
>>>> NativeMethodAccessorImpl.java:0) with 1 output partitions
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Final stage: ResultStage 18 (start
>>>> at NativeMethodAccessorImpl.java:0)
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Parents of final stage: List()
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Missing parents: List()
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Submitting ResultStage 18
>>>> (MapPartitionsRDD[64] at start at NativeMethodAccessorImpl.java:0),
>>>> which has no missing parents
>>>> 17/08/22 14:20:44 INFO MemoryStore: Block broadcast_30 stored as values
>>>> in memory (estimated size 8.1 KB, free 364.2 MB)
>>>> 17/08/22 14:20:44 INFO MemoryStore: Block broadcast_30_piece0 stored as
>>>> bytes in memory (estimated size 4.4 KB, free 364.2 MB)
>>>> 17/08/22 14:20:44 INFO BlockManagerInfo: Added broadcast_30_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 4.4 KB, free: 366.3 MB)
>>>> 17/08/22 14:20:44 INFO SparkContext: Created broadcast 30 from
>>>> broadcast at DAGScheduler.scala:1006
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Submitting 1 missing tasks from
>>>> ResultStage 18 (MapPartitionsRDD[64] at start at
>>>> NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions
>>>> Vector(0))
>>>> 17/08/22 14:20:44 INFO TaskSchedulerImpl: Adding task set 18.0 with 1
>>>> tasks
>>>> 17/08/22 14:20:44 INFO TaskSetManager: Starting task 0.0 in stage 18.0
>>>> (TID 34, localhost, executor driver, partition 0, PROCESS_LOCAL, 4835 bytes)
>>>> 17/08/22 14:20:44 INFO Executor: Running task 0.0 in stage 18.0 (TID 34)
>>>> 17/08/22 14:20:44 INFO CodeGenerator: Code generated in 25.878504 ms
>>>> 17/08/22 14:20:44 INFO Executor: Finished task 0.0 in stage 18.0 (TID
>>>> 34). 999 bytes result sent to driver
>>>> 17/08/22 14:20:44 INFO TaskSetManager: Finished task 0.0 in stage 18.0
>>>> (TID 34) in 47 ms on localhost (executor driver) (1/1)
>>>> 17/08/22 14:20:44 INFO TaskSchedulerImpl: Removed TaskSet 18.0, whose
>>>> tasks have all completed, from pool
>>>> 17/08/22 14:20:44 INFO DAGScheduler: ResultStage 18 (start at
>>>> NativeMethodAccessorImpl.java:0) finished in 0.047 s
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Job 17 finished: start at
>>>> NativeMethodAccessorImpl.java:0, took 0.057313 s
>>>> 17/08/22 14:20:44 INFO SparkContext: Starting job: start at
>>>> NativeMethodAccessorImpl.java:0
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Got job 18 (start at
>>>> NativeMethodAccessorImpl.java:0) with 1 output partitions
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Final stage: ResultStage 19 (start
>>>> at NativeMethodAccessorImpl.java:0)
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Parents of final stage: List()
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Missing parents: List()
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Submitting ResultStage 19
>>>> (MapPartitionsRDD[64] at start at NativeMethodAccessorImpl.java:0),
>>>> which has no missing parents
>>>> 17/08/22 14:20:44 INFO MemoryStore: Block broadcast_31 stored as values
>>>> in memory (estimated size 8.1 KB, free 364.2 MB)
>>>> 17/08/22 14:20:44 INFO MemoryStore: Block broadcast_31_piece0 stored as
>>>> bytes in memory (estimated size 4.4 KB, free 364.2 MB)
>>>> 17/08/22 14:20:44 INFO BlockManagerInfo: Added broadcast_31_piece0 in
>>>> memory on 192.168.25.187:46647 (size: 4.4 KB, free: 366.3 MB)
>>>> 17/08/22 14:20:44 INFO SparkContext: Created broadcast 31 from
>>>> broadcast at DAGScheduler.scala:1006
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Submitting 1 missing tasks from
>>>> ResultStage 19 (MapPartitionsRDD[64] at start at
>>>> NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions
>>>> Vector(1))
>>>> 17/08/22 14:20:44 INFO TaskSchedulerImpl: Adding task set 19.0 with 1
>>>> tasks
>>>> 17/08/22 14:20:44 INFO TaskSetManager: Starting task 0.0 in stage 19.0
>>>> (TID 35, localhost, executor driver, partition 1, PROCESS_LOCAL, 4835 bytes)
>>>> 17/08/22 14:20:44 INFO Executor: Running task 0.0 in stage 19.0 (TID 35)
>>>> 17/08/22 14:20:44 INFO Executor: Finished task 0.0 in stage 19.0 (TID
>>>> 35). 999 bytes result sent to driver
>>>> 17/08/22 14:20:44 INFO TaskSetManager: Finished task 0.0 in stage 19.0
>>>> (TID 35) in 15 ms on localhost (executor driver) (1/1)
>>>> 17/08/22 14:20:44 INFO TaskSchedulerImpl: Removed TaskSet 19.0, whose
>>>> tasks have all completed, from pool
>>>> 17/08/22 14:20:44 INFO DAGScheduler: ResultStage 19 (start at
>>>> NativeMethodAccessorImpl.java:0) finished in 0.015 s
>>>> 17/08/22 14:20:44 INFO DAGScheduler: Job 18 finished: start at
>>>> NativeMethodAccessorImpl.java:0, took 0.024716 s
>>>> +---+----+
>>>> | id|text|
>>>> +---+----+
>>>> +---+----+
>>>>
>>>> 17/08/22 14:20:45 INFO StreamExecution: Streaming query made progress: {
>>>>   "id" : "db779c29-4790-4b60-b035-f5701f6cc8dd",
>>>>   "runId" : "1996e5c6-aac3-45b2-871e-c2ba22c2da92",
>>>>   "name" : null,
>>>>   "timestamp" : "2017-08-22T08:50:42.983Z",
>>>>   "numInputRows" : 0,
>>>>   "processedRowsPerSecond" : 0.0,
>>>>   "durationMs" : {
>>>>     "addBatch" : 462,
>>>>     "getBatch" : 85,
>>>>     "getOffset" : 631,
>>>>     "queryPlanning" : 201,
>>>>     "triggerExecution" : 1891,
>>>>     "walCommit" : 301
>>>>   },
>>>>   "stateOperators" : [ ],
>>>>   "sources" : [ {
>>>>     "description" : "KafkaSource[Subscribe[spark-stream]]",
>>>>     "startOffset" : null,
>>>>     "endOffset" : {
>>>>       "spark-stream" : {
>>>>         "0" : 15
>>>>       }
>>>>     },
>>>>     "numInputRows" : 0,
>>>>     "processedRowsPerSecond" : 0.0
>>>>   } ],
>>>>   "sink" : {
>>>>     "description" : "org.apache.spark.sql.execution.streaming.ConsoleSink@68e8a192"
>>>>   }
>>>> }
>>>> 17/08/22 14:20:45 INFO StreamExecution: Streaming query made progress: {
>>>>   "id" : "db779c29-4790-4b60-b035-f5701f6cc8dd",
>>>>   "runId" : "1996e5c6-aac3-45b2-871e-c2ba22c2da92",
>>>>   "name" : null,
>>>>   "timestamp" : "2017-08-22T08:50:45.199Z",
>>>>   "numInputRows" : 0,
>>>>   "inputRowsPerSecond" : 0.0,
>>>>   "processedRowsPerSecond" : 0.0,
>>>>   "durationMs" : {
>>>>     "getOffset" : 3,
>>>>     "triggerExecution" : 3
>>>>   },
>>>>   "stateOperators" : [ ],
>>>>   "sources" : [ {
>>>>     "description" : "KafkaSource[Subscribe[spark-stream]]",
>>>>     "startOffset" : {
>>>>       "spark-stream" : {
>>>>         "0" : 15
>>>>       }
>>>>     },
>>>>     "endOffset" : {
>>>>       "spark-stream" : {
>>>>         "0" : 15
>>>>       }
>>>>     },
>>>>     "numInputRows" : 0,
>>>>     "inputRowsPerSecond" : 0.0,
>>>>     "processedRowsPerSecond" : 0.0
>>>>   } ],
>>>>   "sink" : {
>>>>     "description" : "org.apache.spark.sql.execution.streaming.ConsoleSink@68e8a192"
>>>>   }
>>>> }
>>>> Traceback (most recent call last):
>>>>   File "/home/jatin/projects/app/kafka_ml.py", line 67, in <module>
>>>>     kafka_write.awaitTermination()
>>>>   File "/home/jatin/projects/spark/spark-2.2.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/streaming.py", line 106, in awaitTermination
>>>>   File "/home/jatin/projects/spark/spark-2.2.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
>>>>   File "/home/jatin/projects/spark/spark-2.2.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/utils.py", line 75, in deco
>>>> pyspark.sql.utils.StreamingQueryException: u'assertion failed\n===
>>>> Streaming Query ===\nIdentifier: [id = b4288c77-f554-4c98-bc63-5539db7d1869,
>>>> runId = 3efa5ffc-230f-4819-a7d2-47ae0e6a8a43]\nCurrent Committed
>>>> Offsets: {}\nCurrent Available Offsets: {}\n\nCurrent State: ACTIVE\nThread
>>>> State: RUNNABLE\n\nLogical Plan:\nProject [cast(id#174 as string) AS
>>>> key#326, concat(cast(text#175 as string), :, cast(prediction#224 as
>>>> string)) AS value#327]\n+- Union\n   :- Project [id#174, text#175,
>>>> prediction#224]\n   :  +- Project [id#174, text#175, words#179,
>>>> relevant_words#184, rawFeatures#190, features#197, rawPrediction#205,
>>>> probability#214, UDF(rawPrediction#205) AS prediction#224]\n   :     +-
>>>> Project [id#174, text#175, words#179, relevant_words#184, rawFeatures#190,
>>>> features#197, rawPrediction#205, UDF(rawPrediction#205) AS
>>>> probability#214]\n   :        +- Project [id#174, text#175, words#179,
>>>> relevant_words#184, rawFeatures#190, features#197, UDF(features#197) AS
>>>> rawPrediction#205]\n   :           +- Project [id#174, text#175, words#179,
>>>> relevant_words#184, rawFeatures#190, UDF(rawFeatures#190) AS
>>>> features#197]\n   :              +- Project [id#174, text#175, words#179,
>>>> relevant_words#184, UDF(relevant_words#184) AS rawFeatures#190]\n   :
>>>>           +- Project [id#174, text#175, words#179, UDF(words#179) AS
>>>> relevant_words#184]\n   :                    +- Project [id#174, text#175,
>>>> UDF(text#175) AS words#179]\n   :                       +- Project [1 AS
>>>> id#174, cast(value#160 as string) AS text#175]\n   :
>>>>    +- StreamingExecutionRelation KafkaSource[Subscribe[spark-stream]],
>>>> [key#159, value#160, topic#161, partition#162, offset#163L, timestamp#164,
>>>> timestampType#165]\n   +- Project [id#174, text#175, prediction#290]\n
>>>>  +- Project [id#174, text#175, words#245, relevant_words#250,
>>>> rawFeatures#256, features#263, rawPrediction#271, probability#280,
>>>> UDF(rawPrediction#271) AS prediction#290]\n         +- Project [id#174,
>>>> text#175, words#245, relevant_words#250, rawFeatures#256, features#263,
>>>> rawPrediction#271, UDF(rawPrediction#271) AS probability#280]\n
>>>>  +- Project [id#174, text#175, words#245, relevant_words#250,
>>>> rawFeatures#256, features#263, UDF(features#263) AS rawPrediction#271]\n
>>>>             +- Project [id#174, text#175, words#245, relevant_words#250,
>>>> rawFeatures#256, UDF(rawFeatures#256) AS features#263]\n
>>>>  +- Project [id#174, text#175, words#245, relevant_words#250,
>>>> UDF(relevant_words#250) AS rawFeatures#256]\n                     +-
>>>> Project [id#174, text#175, words#245, UDF(words#245) AS
>>>> relevant_words#250]\n                        +- Project [id#174, text#175,
>>>> UDF(text#175) AS words#245]\n                           +- Project [1 AS
>>>> id#174, cast(value#160 as string) AS text#175]\n
>>>>    +- StreamingExecutionRelation KafkaSource[Subscribe[spark-stream]],
>>>> [key#159, value#160, topic#161, partition#162, offset#163L, timestamp#164,
>>>> timestampType#165]\n'
>>>> 17/08/22 14:20:45 INFO SparkContext: Invoking stop() from shutdown hook
>>>> 17/08/22 14:20:45 INFO SparkUI: Stopped Spark web UI at
>>>> http://192.168.25.187:4040
>>>> 17/08/22 14:20:45 INFO MapOutputTrackerMasterEndpoint:
>>>> MapOutputTrackerMasterEndpoint stopped!
>>>> 17/08/22 14:20:45 INFO MemoryStore: MemoryStore cleared
>>>> 17/08/22 14:20:45 INFO BlockManager: BlockManager stopped
>>>> 17/08/22 14:20:45 INFO BlockManagerMaster: BlockManagerMaster stopped
>>>> 17/08/22 14:20:45 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:
>>>> OutputCommitCoordinator stopped!
>>>> 17/08/22 14:20:45 INFO SparkContext: Successfully stopped SparkContext
>>>> 17/08/22 14:20:45 INFO ShutdownHookManager: Shutdown hook called
>>>> 17/08/22 14:20:45 INFO ShutdownHookManager: Deleting directory
>>>> /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7/pyspark-146b573e-66f7-41af-bf89-dc2b5a426c75
>>>> 17/08/22 14:20:45 INFO ShutdownHookManager: Deleting directory
>>>> /tmp/spark-9474eda2-128d-4b00-8f99-dc02012a8bf7
>>>> 17/08/22 14:20:45 INFO ShutdownHookManager: Deleting directory
>>>> /tmp/temporary-5d7c55ec-4704-48ce-bf49-bb900f85a284
>>>> jatin@Sandbox.RHEL:/home/jatin/projects/app#vi kafka_ml.py