spark-user mailing list archives

From Oleg Ruchovets <oruchov...@gmail.com>
Subject pass configuration parameters to PySpark job
Date Mon, 18 May 2015 14:26:26 GMT
Hi,
   I am looking for a way to pass configuration parameters to a Spark job.
In general I have quite a simple PySpark job.

    from pyspark import SparkContext

    def process_model(k, vc):
        # ... do something ...
        pass

    sc = SparkContext(appName="TAD")
    lines = sc.textFile(input_job_files)
    result = lines.map(doSplit).groupByKey() \
                  .map(lambda (k, vc): process_model(k, vc))

Question:
    What if I need to pass additional metadata, parameters, etc. to the
process_model function?

   I tried something like

    param = 'param1'
    result = lines.map(doSplit).groupByKey() \
                  .map(lambda (param, k, vc): process_model(param1, k, vc))

but the job stops working, and it also does not look like an elegant solution.
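
For comparison, here is a minimal sketch of the kind of thing I am after,
assuming the extra parameter can simply be captured by the lambda's closure
(doSplit and process_model below are just placeholders for my real functions,
and input_job_files is defined as above):

    from pyspark import SparkContext

    def doSplit(line):
        # placeholder: turn each input line into a (key, value) pair
        key, value = line.split("\t", 1)
        return (key, value)

    def process_model(param, k, vc):
        # placeholder: use the extra parameter together with the key and its values
        return (k, param, len(list(vc)))

    sc = SparkContext(appName="TAD")
    param = 'param1'
    lines = sc.textFile(input_job_files)
    result = lines.map(doSplit).groupByKey() \
                  .map(lambda (k, vc): process_model(param, k, vc))

Is capturing the parameter in the closure like this the recommended way, or
is there something better?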
Is there a way to access the SparkContext from my custom functions?
I found the setLocalProperty/getLocalProperty methods, but I could not find an
example of how to use them for my requirement (from inside my function).
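
The only usage I could figure out so far is on the driver side, roughly like
this (the key and value are just strings I made up, and I am not sure these
properties are visible from inside a mapped function at all):

    sc.setLocalProperty("model.param", "param1")    # set a thread-local property on the driver
    current = sc.getLocalProperty("model.param")    # read it back, also on the driver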

It would be great to have a short example of how to pass parameters.

Thanks
Oleg.
