spark-user mailing list archives

From Stuart Horsman
Subject Re: SparkContext UI
Date Thu, 30 Oct 2014 23:50:40 GMT
Sorry, I was too quick to pull the trigger on my original email.  I should have
added that I've tried using persist() and cache(), but no joy.

I'm doing this:

data = sc.textFile("somedata")



but I still can't see anything in the storage?
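
One likely explanation, offered as a sketch rather than a confirmed diagnosis: cache() and persist() are lazy. They only mark the RDD for storage, and nothing is actually stored until an action (count(), collect(), and so on) forces the RDD to be computed. The helper below is hypothetical (the name cache_and_materialize is not part of Spark) and assumes `sc` is the SparkContext that the pyspark shell already provides:

```python
def cache_and_materialize(sc, path):
    """Load a text file, mark it for caching, and force it to be computed.

    `sc` is assumed to be an existing SparkContext (the pyspark shell
    creates one for you). This helper is illustrative, not a Spark API.
    """
    data = sc.textFile(path)  # lazy: nothing is read yet
    data.cache()              # lazy: only marks the RDD for in-memory storage
    n = data.count()          # action: reads the file and populates the cache
    return data, n
```

In the pyspark shell this would be called as `data, n = cache_and_materialize(sc, "somedata")`. After the count() finishes, refreshing localhost:4040/storage should list the RDD, provided it fits in the available storage memory.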

On 31 October 2014 10:42, Sameer Farooqui wrote:

> Hey Stuart,
> The RDD won't show up under the Storage tab in the UI until it's been
> cached. Basically Spark doesn't know what the RDD will look like until it's
> cached, b/c up until then the RDD is just on disk (external to Spark). If
> you launch some transformations + an action on an RDD that is purely on
> disk, then Spark will read it from disk, compute against it and then write
> the results back to disk or show you the results at the scala/python
> shells. But when you run Spark workloads against purely on disk files, the
> RDD won't show up in Spark's Storage UI. Hope that makes sense...
> - Sameer
> On Thu, Oct 30, 2014 at 4:30 PM, Stuart Horsman wrote:
>> Hi All,
>> When I load an RDD with:
>> data = sc.textFile("somefile")
>> I don't see the resulting RDD in the SparkContext gui on localhost:4040
>> in /storage.
>> Is there something special I need to do to allow me to view this?  I
>> tried both the scala and python shells but got the same result.
>> Thanks
>> Stuart
