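sqlContext.read.json() expects a path to a JSON file, not a parsed Python object. json.loads() returns a dict, which Py4J hands to the JVM as a java.util.HashMap, hence the "Method json([class java.util.HashMap]) does not exist" error.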
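
One way around this, as a minimal sketch only: write the parsed API response to a file with one JSON document per line (the layout read.json() reads), then pass that file's path to sqlContext.read.json(). The helper names write_json_lines and load_dataframe are made up for illustration, and the sketch assumes the temporary file is readable by Spark, which holds in "local" mode.

```python
import json
import os
import tempfile


def write_json_lines(records, path):
    """Write one JSON document per line to `path` and return the path."""
    # Treat a single dict (e.g. the result of json.loads) as one record.
    if isinstance(records, dict):
        records = [records]
    with open(path, "w") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")
    return path


def load_dataframe(sql_context, records):
    """Hypothetical helper: dump `records` to a temp file and read it by path.

    `sql_context` is assumed to be an existing pyspark.sql.SQLContext.
    The temp file is left in place because Spark may read it lazily.
    """
    fd, path = tempfile.mkstemp(suffix=".json")
    os.close(fd)
    write_json_lines(records, path)
    return sql_context.read.json(path)
```

With the variables from the quoted code, usage would look like dataframe = load_dataframe(sqlContext, json.loads(response.text)). On a real cluster the file would need to live somewhere every executor can reach, such as HDFS, rather than in a local temp directory.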


On Tue, Sep 29, 2015 at 7:23 AM, Fernando Paladini <fnpaladini@gmail.com> wrote:
Hello guys,

I'm very new to Spark and I'm having some trouble reading JSON into a DataFrame in PySpark.

I'm getting a JSON object from an API response and I would like to store it in Spark as a DataFrame (I've read that DataFrame is better than RDD; is that accurate?). From what I've read in the documentation, I just need to call the method sqlContext.read.json to do what I want.

The following is the code from my test application:
json_object = json.loads(response.text)
sc = SparkContext("local", appName="JSON to RDD")
sqlContext = SQLContext(sc)
dataframe = sqlContext.read.json(json_object)

The problem is that when I run "spark-submit myExample.py" I get the following error:
15/09/29 01:18:54 INFO BlockManagerMasterEndpoint: Registering block manager localhost:48634 with 530.0 MB RAM, BlockManagerId(driver, localhost, 48634)
15/09/29 01:18:54 INFO BlockManagerMaster: Registered BlockManager
Traceback (most recent call last):
  File "/home/paladini/ufxc/lisha/learning/spark-api-kairos/test1.py", line 35, in <module>
    dataframe = sqlContext.read.json(json_object)
  File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 144, in json
  File "/opt/spark/python/lib/py4j-", line 538, in __call__
  File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 36, in deco
  File "/opt/spark/python/lib/py4j-", line 304, in get_return_value
py4j.protocol.Py4JError: An error occurred while calling o21.json. Trace:
py4j.Py4JException: Method json([class java.util.HashMap]) does not exist
    at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:333)
    at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:342)
    at py4j.Gateway.invoke(Gateway.java:252)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:207)
    at java.lang.Thread.run(Thread.java:745)

What am I doing wrong?
Check out this gist to see the JSON I'm trying to load.