spark-user mailing list archives

From Deep Pradhan <pradhandeep1...@gmail.com>
Subject Re: Spark and Scala
Date Sat, 13 Sep 2014 06:15:14 GMT
Is it always true that whenever we apply operations on an RDD, we get
another RDD?
Or does it depend on the return type of the operation?
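
For reference, the distinction being asked about can be sketched as follows. This is only an illustrative sketch, not code from the thread: the object name, the app name, the `local[*]` master and the sample data are all assumptions. It shows that transformations such as `map` and `filter` return another RDD, while actions such as `reduce`, `count` and `collect` return ordinary Scala values.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object RddReturnTypes {
  def main(args: Array[String]): Unit = {
    // App name and local master are placeholder choices for this sketch.
    val conf = new SparkConf().setAppName("rdd-return-types").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val numbers: RDD[Int] = sc.parallelize(1 to 10)

      // Transformations: each one returns a new RDD.
      val doubled: RDD[Int] = numbers.map(_ * 2)
      val multiplesOfFour: RDD[Int] = doubled.filter(_ % 4 == 0)

      // Actions: each one returns an ordinary Scala value to the driver.
      val total: Int = doubled.reduce(_ + _)
      val howMany: Long = doubled.count()
      val local: Array[Int] = multiplesOfFour.collect()

      println(s"total=$total, count=$howMany, local=${local.mkString(",")}")
    } finally {
      sc.stop()
    }
  }
}
```

So the result type depends on the operation: only the values typed `RDD[Int]` above have RDD methods such as `unpersist`.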

On Sat, Sep 13, 2014 at 9:45 AM, Soumya Simanta <soumya.simanta@gmail.com>
wrote:

>
> An RDD is a fault-tolerant distributed structure. It is the primary
> abstraction in Spark.
>
> I would strongly suggest that you have a look at the following to get a
> basic idea.
>
> http://www.cs.berkeley.edu/~pwendell/strataconf/api/core/spark/RDD.html
> http://spark.apache.org/docs/latest/quick-start.html#basics
>
> https://www.usenix.org/conference/nsdi12/technical-sessions/presentation/zaharia
>
> On Sat, Sep 13, 2014 at 12:06 AM, Deep Pradhan <pradhandeep1991@gmail.com>
> wrote:
>
>> Take for example this:
>> I have declared one queue, *val queue = Queue.empty[Int]*, which is a
>> pure Scala line in the program. I actually want the queue to be an RDD, but
>> there are no direct methods to create an RDD which is a queue, right? What
>> do you say on this?
>> Does there exist something like: *Create an RDD which is a queue*?
>>
>> On Sat, Sep 13, 2014 at 8:43 AM, Hari Shreedharan <
>> hshreedharan@cloudera.com> wrote:
>>
>>> No, Scala primitives remain primitives. Unless you create an RDD using
>>> one of the many methods, you will not be able to access any of the RDD
>>> methods. There is no automatic porting. Spark is an application as far as
>>> Scala is concerned: there is no extra compilation (except, of course, the
>>> usual Scala and JIT compilation).
>>>
>>> On Fri, Sep 12, 2014 at 8:04 PM, Deep Pradhan <
>>> pradhandeep1991@gmail.com> wrote:
>>>
>>>> I know that unpersist is a method on RDD.
>>>> But my confusion is that, when we port our Scala programs to Spark,
>>>> doesn't everything change to RDDs?
>>>>
>>>> On Fri, Sep 12, 2014 at 10:16 PM, Nicholas Chammas <
>>>> nicholas.chammas@gmail.com> wrote:
>>>>
>>>>> unpersist is a method on RDDs. RDDs are abstractions introduced by
>>>>> Spark.
>>>>>
>>>>> An Int is just a Scala Int. You can't call unpersist on Int in Scala,
>>>>> and that doesn't change in Spark.
>>>>>
>>>>> On Fri, Sep 12, 2014 at 12:33 PM, Deep Pradhan <
>>>>> pradhandeep1991@gmail.com> wrote:
>>>>>
>>>>>> There is one thing that I am confused about.
>>>>>> Spark has code that has been implemented in Scala. Now, can we run
>>>>>> any Scala code on the Spark framework? What will be the difference in
>>>>>> the execution of the Scala code in normal systems and on Spark?
>>>>>> The reason for my question is the following:
>>>>>> I had a variable
>>>>>> *val temp = <some operations>*
>>>>>> This temp was being created inside the loop. To manually remove it
>>>>>> from the cache, I was calling *temp.unpersist()* every time the loop
>>>>>> ends. This was returning an error saying that *value unpersist is
>>>>>> not a method of Int*, which means that temp is an Int.
>>>>>> Can someone explain to me why I was not able to call *unpersist* on
>>>>>> *temp*?
>>>>>>
>>>>>> Thank You
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
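
The two situations discussed in the quoted thread (turning a Scala queue into an RDD, and calling `unpersist` inside a loop) can be sketched roughly as below. This is a hedged illustration rather than anyone's actual program: the object name, the data and the loop body are assumptions, and `sc.parallelize` only copies the queue's current contents into an RDD. The resulting RDD is not itself a queue and has no enqueue or dequeue operations.

```scala
import scala.collection.mutable.Queue
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object QueueToRdd {
  def main(args: Array[String]): Unit = {
    // App name and local master are placeholder choices for this sketch.
    val conf = new SparkConf().setAppName("queue-to-rdd").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // A plain Scala queue: it lives on the driver and has no RDD methods.
      val queue = Queue.empty[Int]
      queue ++= Seq(1, 2, 3)

      // Explicitly create an RDD from a snapshot of the queue's contents.
      val queueRdd: RDD[Int] = sc.parallelize(queue.toList)

      for (i <- 1 to 3) {
        // temp is an RDD here because map is a transformation.
        val temp: RDD[Int] = queueRdd.map(_ + i).cache()

        // reduce is an action, so sum is a plain Int.
        val sum: Int = temp.reduce(_ + _)
        println(s"iteration $i: sum=$sum")

        // Compiles because temp is an RDD.
        temp.unpersist()
        // sum.unpersist() would not compile: unpersist is not a member of Int.
      }
    } finally {
      sc.stop()
    }
  }
}
```

If `temp` in the original loop was the result of an action such as `reduce` or `count`, it would be a plain value like `sum` above, which would be consistent with the reported error.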
