spark-user mailing list archives

From: Mich Talebzadeh <mich.talebza...@gmail.com>
Subject: Re: Spark on minikube hanging when doing simple rdd.collect()
Date: Sun, 11 Jul 2021 16:33:19 GMT
 Hi

I have now resolved this issue. I gather this is an inherent issue with
this setup.

Most simplistic examples for creating a minikube cluster and showing that it works rely on the built-in example jar that calculates the value of pi and returns it. To be fair, calculating pi is simple maths and does not exercise much of what Spark offers (to the best of my knowledge).

Once we get past that point, we can see the real issue here:

kubectl get pod -n spark
NAME                               READY   STATUS              RESTARTS   AGE
spark-master                       0/1     Completed           0          80m
testme-b7a8f97a714d9351-exec-164   1/1     Running             0          8s
testme-b7a8f97a714d9351-exec-165   0/1     ContainerCreating   0          3s


kubectl logs -n spark testme-b7a8f97a714d9351-exec-165
Error from server (NotFound): pods "testme-b7a8f97a714d9351-exec-165" not
found


These executor pods (testme-xxxx) are created and dropped immediately because they cannot communicate with the driver (spark-master): the driver is not advertising an internal IP address that the executors can reach. So Kubernetes creates them, they fail straight away, disappear and are replaced with new executors (containers). That is the whole idea of Kubernetes, which is built for scalability and high availability.

The warning below is also misleading, because the issue is not a shortage of resources but the executors not being able to communicate with the driver (spark-master):
scala> b.collect
2021-07-04 13:14:45,848 WARN scheduler.TaskSchedulerImpl: Initial job has
not accepted any resources; check your cluster UI to ensure that workers
are registered and have sufficient resources
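
When that warning appears, a quick check I would suggest (my own addition, not part of the original session) is to ask the running context, from a PySpark session, what address the driver is actually advertising to the executors:

# My own sanity check, not from the original thread: print the master URL and
# the address the driver is binding/advertising. An unreachable driver host is
# what produces the "Initial job has not accepted any resources" warning here.
print(sc.master)
print(sc.getConf().get("spark.driver.host", "not set"))
print(sc.getConf().get("spark.driver.bindAddress", "not set"))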

Spark on Kubernetes is built on the concept of the Spark standalone resource manager, and within the cluster these pods need internal IP addresses so that the executors can communicate with the driver.

For example:

kubectl get deployments
NAME           READY   UP-TO-DATE   AVAILABLE   AGE
spark-master   1/1     1            1           36m
spark-worker   2/2     2            2           35m

kubectl get pods -o wide
NAME                            READY   STATUS    RESTARTS   AGE   IP           NODE       NOMINATED NODE   READINESS GATES
spark-master-5f887dc8c-cm9wx    1/1     Running   0          36m   172.17.0.3   minikube   <none>           <none>
spark-worker-7fd4cb4745-lsjd5   1/1     Running   0          35m   172.17.0.4   minikube   <none>           <none>
spark-worker-7fd4cb4745-s5wnx   1/1     Running   0          35m   172.17.0.5   minikube   <none>           <none>

We immediately notice the IP address of the spark-master pod. That is crucial: we need to pass the spark.driver.bindAddress and spark.driver.host conf parameters with that IP address.
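
As a side note, the same two settings can also be applied programmatically when building the SparkSession inside the application. The sketch below is my own, not part of the original setup, and it assumes the pod's IP is exposed to the container through an environment variable (MY_POD_IP is a hypothetical name, e.g. injected via the Kubernetes Downward API):

import os
from pyspark.sql import SparkSession

# Sketch only: MY_POD_IP is assumed to be injected into the pod; 172.17.0.3 is
# just the master pod IP from the listing above, used as an illustrative fallback.
driver_ip = os.environ.get("MY_POD_IP", "172.17.0.3")

spark = (SparkSession.builder
         .appName("testme")
         .config("spark.driver.bindAddress", driver_ip)  # address the driver binds to inside the pod
         .config("spark.driver.host", driver_ip)         # address advertised to the executors
         .getOrCreate())

On the command line, the master pod name and IP can be captured first: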

MASTER_POD_NAME=`kubectl get pods -o wide |grep master|awk '{print $1}'`
MASTER_POD_IP=`kubectl get pods -o wide |grep master|awk '{print $6}'`

Let us open a PySpark shell inside minikube

kubectl exec $MASTER_POD_NAME -it -- pyspark \
  --conf spark.driver.bindAddress=$MASTER_POD_IP \
  --conf spark.driver.host=$MASTER_POD_IP

Simple rdd creation


>>> r = range(1,5)
>>> sc.parallelize(r).collect()
[1, 2, 3, 4]

>>> sc.stop()


Do the same with the Scala shell:


kubectl exec $MASTER_POD_NAME -it -- spark-shell \
  --conf spark.driver.bindAddress=$MASTER_POD_IP \
  --conf spark.driver.host=$MASTER_POD_IP


Using Spark's default log4j profile:
org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use
setLogLevel(newLevel).
Spark context Web UI available at http://172.17.0.3:4040
Spark context available as 'sc' (master = spark://spark-master:7077, app id
= app-20210711151121-0006).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.0.0
      /_/

Using Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_111)
Type in expressions to have them evaluated.
Type :help for more information.


scala> val r = 1 to 5
r: scala.collection.immutable.Range.Inclusive = Range 1 to 5
scala> sc.parallelize(r).collect
res1: Array[Int] = Array(1, 2, 3, 4, 5)
scala> sc.stop


So both work


One can also send a Python module ${APPLICATION} to be executed by minikube. In order for Spark to read the zipped and Python files, they need to be accessible from within Kubernetes. In this case I am using HDFS, but one can use any Hadoop Compatible File System (HCFS), such as cloud storage buckets, say Google's gs://kubernetes/codes etc. You can also use https mounts as long as HOST_PORT (an IP address, not a name) is accessible from Kubernetes:


kubectl exec $MASTER_POD_NAME -it \
  -- spark-submit \
  --deploy-mode client \
  --conf spark.driver.bindAddress=$MASTER_POD_IP \
  --conf spark.driver.host=$MASTER_POD_IP \
  --py-files hdfs://$HDFS_HOST:$HDFS_PORT/minikube/codes/DSBQ.zip \
  hdfs://$HDFS_HOST:$HDFS_PORT/minikube/codes/${APPLICATION}
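
If shipping the package at submit time is not convenient, a rough alternative (my own sketch, not something from this setup) is to add the zip at runtime from inside the application; the HDFS host and port below are placeholders for the same $HDFS_HOST and $HDFS_PORT values used above:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("testme").getOrCreate()

# Ship the zipped package to the executors and make it importable on the
# driver as well; the path mirrors the --py-files location used above.
spark.sparkContext.addPyFile("hdfs://<HDFS_HOST>:<HDFS_PORT>/minikube/codes/DSBQ.zip")

from src.config import config, oracle_url  # now resolvable from the shipped zip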


Minikube is an interesting concept that can be used for R & D and testing
Spark jobs.


If you go into minikube, it is basically a Docker host. It is worth looking inside:


 minikube ssh
docker@minikube:~$ pwd
/home/docker
docker@minikube:~$ cd /var/log
docker@minikube:/var/log$ ls
alternatives.log  btmp  containers  lastlog  pods  private  wtmp
docker@minikube:/var/log$ cd pods
docker@minikube:/var/log/pods$ ls -ltr
total 0
drwxr-xr-x. 3 root root 18 Jul 11 14:10 kube-system_etcd-minikube_c1a0ab3ae8afe3f264628fb5ac801630
drwxr-xr-x. 3 root root 28 Jul 11 14:10 kube-system_kube-apiserver-minikube_d756b4285ec6b3c077bc8a5298c850d6
drwxr-xr-x. 3 root root 28 Jul 11 14:10 kube-system_kube-scheduler-minikube_2c61c7fcb3d9f98a5db74e3e2e62c1d2
drwxr-xr-x. 3 root root 37 Jul 11 14:10 kube-system_kube-controller-manager-minikube_39e5e83b5e9a56cb25a1c4768a9555fc
drwxr-xr-x. 3 root root 24 Jul 11 14:10 kube-system_kube-proxy-nkn7f_3d211f55-a41c-4192-9a21-d3d00eccfdee
drwxr-xr-x. 3 root root 33 Jul 11 14:10 kube-system_storage-provisioner_f829f26e-333a-4dfd-9278-5b2b8aa374da
drwxr-xr-x. 3 root root 21 Jul 11 14:10 kube-system_coredns-5c98db65d4-5mcm6_1c7bce66-87fe-44f8-a05b-674e25f6d66a
drwxr-xr-x. 3 root root 26 Jul 11 14:19 default_spark-master-5f887dc8c-cm9wx_9d360035-9312-4a48-8004-f664f35f7cb9
drwxr-xr-x. 3 root root 26 Jul 11 14:20 default_spark-worker-7fd4cb4745-s5wnx_e7d9c89a-2af2-4a56-9cd3-caded5460daa
drwxr-xr-x. 3 root root 26 Jul 11 14:20 default_spark-worker-7fd4cb4745-lsjd5_97e40b05-8ffb-4b99-b80e-c72de75726d8
docker@minikube:/var/log/pods$ ls kube-system_storage-provisioner_f829f26e-333a-4dfd-9278-5b2b8aa374da
storage-provisioner
docker@minikube:/var/log/pods$ cd kube-system_storage-provisioner_f829f26e-333a-4dfd-9278-5b2b8aa374da/storage-provisioner
docker@minikube:/var/log/pods/kube-system_storage-provisioner_f829f26e-333a-4dfd-9278-5b2b8aa374da/storage-provisioner$ ls
0.log
docker@minikube:/var/log/pods/kube-system_storage-provisioner_f829f26e-333a-4dfd-9278-5b2b8aa374da/storage-provisioner$ sudo head 0.log
{"log":"I0711 14:10:38.940236       1 storage_provisioner.go:116]
Initializing the minikube storage
provisioner...\n","stream":"stderr","time":"2021-07-11T14:10:38.940304644Z"}
{"log":"I0711 14:10:38.945145       1 storage_provisioner.go:141] Storage
provisioner initialized, now starting
service!\n","stream":"stderr","time":"2021-07-11T14:10:38.945205586Z"}
{"log":"I0711 14:10:38.945184       1 leaderelection.go:243] attempting to
acquire leader lease
kube-system/k8s.io-minikube-hostpath...\n","stream":"stderr","time":"2021-07-11T14:10:38.945221054Z"}
{"log":"I0711 14:10:38.989215       1 leaderelection.go:253] successfully
acquired lease
kube-system/k8s.io-minikube-hostpath\n","stream":"stderr","time":"2021-07-11T14:10:38.989259991Z"}
{"log":"I0711 14:10:38.989296       1 controller.go:835] Starting
provisioner controller
k8s.io/minikube-hostpath_minikube_9f3c5904-58de-4630-aea8-f0f9c83b50b6
!\n","stream":"stderr","time":"2021-07-11T14:10:38.989335131Z"}
{"log":"I0711 14:10:38.989274       1 event.go:282]
Event(v1.ObjectReference{Kind:\"Endpoints\", Namespace:\"kube-system\",
Name:\"k8s.io-minikube-hostpath\",
UID:\"06b4f4f8-f4bc-4fd6-9165-2870dd2f6dec\", APIVersion:\"v1\",
ResourceVersion:\"378\", FieldPath:\"\"}): type: 'Normal' reason:
'LeaderElection' minikube_9f3c5904-58de-4630-aea8-f0f9c83b50b6 became
leader\n","stream":"stderr","time":"2021-07-11T14:10:38.98934145Z"}
{"log":"I0711 14:10:39.090061       1 controller.go:884] Started
provisioner controller
k8s.io/minikube-hostpath_minikube_9f3c5904-58de-4630-aea8-f0f9c83b50b6
!\n","stream":"stderr","time":"2021-07-11T14:10:39.090138148Z"}

So all the pod logfiles are here
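
Since each pod's 0.log is one JSON object per line (as in the head output above), the messages are easy to pull out; a small sketch of my own, run after copying a log file out of the VM:

import json

# Each line is a JSON object with "log", "stream" and "time" fields.
with open("0.log") as f:
    for line in f:
        entry = json.loads(line)
        print(entry["time"], entry["stream"], entry["log"].rstrip())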



HTH



   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Sun, 4 Jul 2021 at 13:19, Mich Talebzadeh <mich.talebzadeh@gmail.com>
wrote:

> Further to this I tried to create the same through spark shell connecting
> to k8s on --master k8s://<KS_SERVER>:8443
>
>
> scala> val r = 1 to 10
> r: scala.collection.immutable.Range.Inclusive = Range 1 to 10
>
> scala> val b = sc.parallelize(r)
> b: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize
> at <console>:26
>
> scala> b.collect
> 2021-07-04 13:14:45,848 WARN scheduler.TaskSchedulerImpl: Initial job has
> not accepted any resources; check your cluster UI to ensure that workers
> are registered and have sufficient resources
>
>
>
>
>
>
> On Sun, 4 Jul 2021 at 12:56, Mich Talebzadeh <mich.talebzadeh@gmail.com>
> wrote:
>
>> Hi,
>>
>> I have a minikube with spark running.
>>
>> when the py file is pretty simple and does something like a print() it works
>>
>> from src.config import config, oracle_url
>> from pyspark.sql import functions as F
>> from pyspark.sql.functions import col, round
>> from pyspark.sql.window import Window
>> from sparkutils import sparkstuff as s
>> from othermisc import usedFunctions as uf
>> import locale
>> locale.setlocale(locale.LC_ALL, 'en_GB')
>>
>> def main():
>>     appName = "testme"
>>     spark_session = s.spark_session(appName)
>>     spark_context = s.sparkcontext()
>>     spark_context.setLogLevel("ERROR")
>>     print(spark_session)
>>     print(spark_context)
>>     print(f"""\n Printing a line from {appName}""")
>>
>> if __name__ == "__main__":
>>   main()
>>
>> It comes back with
>>
>>
>> <pyspark.sql.session.SparkSession object at 0x7f9aebbdff28>
>> <SparkContext master=k8s://https://192.168.49.2:8443 appName=testme>
>>
>>  Printing a line from testme
>>
>>
>> However, if I create a simple range as below
>>
>>
>>     spark_session = s.spark_session(appName)
>>     spark_context = s.sparkcontext()
>>     spark_context.setLogLevel("ERROR")
>>     print(spark_session)
>>     print(spark_context)
>>     rdd = spark_context.parallelize([1,2,3,4,5,6,7,8,9,10])
>>
>>     print(rdd)
>>     rdd.collect()
>>
>>     print(f"""\n Printing a line from {appName}""")
>>
>> It never gets to collect()  --> rdd.collect()
>>
>>
>>
>> <pyspark.sql.session.SparkSession object at 0x7f431ee92f28>
>> <SparkContext master=k8s://https://192.168.49.2:8443 appName=testme>
>> ParallelCollectionRDD[0] at readRDDFromFile at PythonRDD.scala:274
>>
>>
>> and hangs
>>
>>
>> Examining the pods from another session I see:
>>
>>
>> kubectl get pod -n spark
>> NAME                              READY   STATUS              RESTARTS   AGE
>> spark-master                      0/1     Completed           0          69m
>> testme-b7a8f97a714d9351-exec-33   1/1     Running             0          9s
>> testme-b7a8f97a714d9351-exec-34   0/1     ContainerCreating   0          9s
>>
>>
>> It keeps failing and recreating testme pods
>>
>>
>> kubectl get pod -n spark
>> NAME                              READY   STATUS              RESTARTS   AGE
>> spark-master                      0/1     Completed           0          69m
>> testme-b7a8f97a714d9351-exec-33   1/1     Running             0          9s
>> testme-b7a8f97a714d9351-exec-34   0/1     ContainerCreating   0          9s
>>
>> kubectl get pod -n spark
>> NAME                              READY   STATUS              RESTARTS   AGE
>> spark-master                      0/1     Completed           0          70m
>> testme-b7a8f97a714d9351-exec-49   0/1     ContainerCreating   0          2s
>> testme-b7a8f97a714d9351-exec-50   0/1     ContainerCreating   0          1s
>>
>> kubectl get pod -n spark
>> NAME                              READY   STATUS        RESTARTS   AGE
>> spark-master                      0/1     Completed     0          71m
>> testme-b7a8f97a714d9351-exec-49   0/1     Terminating   0          8s
>> testme-b7a8f97a714d9351-exec-50   1/1     Running       0          7s
>> testme-b7a8f97a714d9351-exec-51   0/1     Pending       0          0s
>>
>> kubectl get pod -n spark
>> NAME                              READY   STATUS              RESTARTS   AGE
>> spark-master                      0/1     Completed           0          71m
>> testme-b7a8f97a714d9351-exec-51   0/1     ContainerCreating   0          3s
>> testme-b7a8f97a714d9351-exec-52   0/1     ContainerCreating   0          2s
>>
>>
>> So it seems that pods testme.xxx keep failing and being created and I
>> cannot get the logs!
>>
>>
>> kubectl get pod -n spark
>> NAME                               READY   STATUS              RESTARTS   AGE
>> spark-master                       0/1     Completed           0          80m
>> testme-b7a8f97a714d9351-exec-164   1/1     Running             0          8s
>> testme-b7a8f97a714d9351-exec-165   0/1     ContainerCreating   0          3s
>> kubectl logs -n spark testme-b7a8f97a714d9351-exec-165
>> Error from server (NotFound): pods "testme-b7a8f97a714d9351-exec-165" not found
>>
>>
>> I have created this minikube with 8192 MB of memory and 3 cpus
>>
>>
>> Spark GUI showing rdd collection not starting
>>
>>
>> [image: image.png]
>>
>>
>>
>> Appreciate any advice as Google search did not show much.
>>
>>
>>
>>
>>
>>
>
