spark-dev mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: enum-like types in Spark
Date Fri, 06 Mar 2015 09:04:34 GMT
This has some disadvantages for Java, I think. You can't switch on an
object defined like this, but you can with an enum. And although the
Scala compiler understands that the set of values is fixed because of
'sealed' and so can warn about missing cases, the Java compiler won't
know this and can't do the same.
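[To make the switch point concrete, here is a minimal self-contained Java sketch; the enum and all names in it are illustrative, not Spark's actual API:]

```java
public class SwitchDemo {
    // With a real Java enum, the compiler knows the closed set of values...
    enum Mode { MEMORY_ONLY, DISK_ONLY }

    static String describe(Mode m) {
        // ...so the value can be used directly in a switch, and javac can
        // warn about missing cases. A Scala `sealed` hierarchy compiles to
        // ordinary classes, so this switch would not compile against its
        // case objects.
        switch (m) {
            case MEMORY_ONLY: return "memory";
            case DISK_ONLY:   return "disk";
            default:          return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Mode.MEMORY_ONLY)); // prints "memory"
    }
}
```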

On Fri, Mar 6, 2015 at 3:58 AM, Xiangrui Meng <mengxr@gmail.com> wrote:
> For #4, my previous proposal may confuse the IDEs with the additional
> types generated by the case objects, and their toString output contains
> the underscore. The following works better:
>
> sealed abstract class StorageLevel
>
> object StorageLevel {
>   final val MemoryOnly: StorageLevel = {
>     case object MemoryOnly extends StorageLevel
>     MemoryOnly
>   }
>
>   final val DiskOnly: StorageLevel = {
>     case object DiskOnly extends StorageLevel
>     DiskOnly
>   }
> }
>
> MemoryOnly and DiskOnly can be used in pattern matching. If people are
> okay with this approach, I can add it to the code style guide.
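[As a rough sketch of how this idiom looks to Java callers, here is a plain-Java analogue; the names and the translation details are illustrative assumptions, not Spark's actual generated bytecode:]

```java
public class EnumLikeSketch {
    // Plain-Java analogue of the Scala pattern: an abstract type whose
    // only instances are fixed singleton constants.
    static abstract class StorageLevel {
        private StorageLevel() {}  // closed to outside subclassing
        static final StorageLevel MemoryOnly = new StorageLevel() {};
        static final StorageLevel DiskOnly  = new StorageLevel() {};
    }

    public static void main(String[] args) {
        StorageLevel level = StorageLevel.MemoryOnly;
        // Each value is a singleton, so reference equality is reliable,
        // even though Java's switch cannot be used on it.
        System.out.println(level == StorageLevel.MemoryOnly); // prints "true"
    }
}
```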
>
> Imran, this is not just for internal APIs, which are relatively more
> flexible. We should use the same approach for public enum-like types
> from now on.
>
> Best,
> Xiangrui
>
> On Thu, Mar 5, 2015 at 1:08 PM, Imran Rashid <irashid@cloudera.com> wrote:
>> I have a very strong dislike for #1 (Scala Enumerations). I'm OK with #4
>> (with Xiangrui's final suggestion, especially making it sealed and
>> available in Java), but I really think #2, Java enums, are the best option.
>>
>> Java enums actually have some very real advantages over the other
>> approaches -- you get values(), valueOf(), EnumSet, and EnumMap.  There has
>> been endless debate in the Scala community about the problems with the
>> approaches available in Scala.  Very smart, level-headed Scala gurus have
>> complained about their shortcomings (Rex Kerr's name comes to mind, though
>> I'm not positive about that); there have been numerous well-thought-out
>> proposals to give Scala a better enum.  But the powers-that-be in Scala
>> always reject them.  IIRC the explanation for rejecting them is basically
>> that (a) enums aren't important enough to justify a new special feature,
>> Scala's got bigger things to work on, and (b) if you really need a good
>> enum, just use Java's.
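[A small self-contained Java example of the advantages listed above; the enum itself is hypothetical:]

```java
import java.util.EnumMap;
import java.util.EnumSet;

public class EnumApiDemo {
    enum Level { MEMORY_ONLY, DISK_ONLY, OFF_HEAP }

    public static void main(String[] args) {
        // values() and valueOf() are generated for free by the compiler.
        System.out.println(Level.values().length);      // prints "3"
        System.out.println(Level.valueOf("DISK_ONLY")); // prints "DISK_ONLY"

        // EnumSet and EnumMap are compact collections specialized for enums.
        EnumSet<Level> persistent = EnumSet.of(Level.DISK_ONLY, Level.OFF_HEAP);
        EnumMap<Level, String> label = new EnumMap<>(Level.class);
        label.put(Level.MEMORY_ONLY, "memory only");

        System.out.println(persistent.contains(Level.MEMORY_ONLY)); // prints "false"
        System.out.println(label.get(Level.MEMORY_ONLY));           // prints "memory only"
    }
}
```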
>>
>> I doubt it really matters that much for Spark internals, which is why I
>> think #4 is fine.  But I figured I'd give my spiel, because every developer
>> loves language wars :)
>>
>> Imran
>>
>>
>>
>> On Thu, Mar 5, 2015 at 1:35 AM, Xiangrui Meng <mengxr@gmail.com> wrote:
>>
>>> `case object` inside an `object` doesn't show up in Java. This is the
>>> minimal code I found to make everything show up correctly in both
>>> Scala and Java:
>>>
>>> sealed abstract class StorageLevel // cannot be a trait
>>>
>>> object StorageLevel {
>>>   private[this] case object _MemoryOnly extends StorageLevel
>>>   final val MemoryOnly: StorageLevel = _MemoryOnly
>>>
>>>   private[this] case object _DiskOnly extends StorageLevel
>>>   final val DiskOnly: StorageLevel = _DiskOnly
>>> }
>>>
>>> On Wed, Mar 4, 2015 at 8:10 PM, Patrick Wendell <pwendell@gmail.com>
>>> wrote:
>>> > I like #4 as well and agree with Aaron's suggestion.
>>> >
>>> > - Patrick
>>> >
>>> > On Wed, Mar 4, 2015 at 6:07 PM, Aaron Davidson <ilikerps@gmail.com> wrote:
>>> >> I'm cool with #4 as well, but make sure we dictate that the values
>>> >> should be defined within an object with the same name as the
>>> >> enumeration (like we do for StorageLevel). Otherwise we may pollute a
>>> >> higher namespace.
>>> >>
>>> >> e.g. we SHOULD do:
>>> >>
>>> >> trait StorageLevel
>>> >> object StorageLevel {
>>> >>   case object MemoryOnly extends StorageLevel
>>> >>   case object DiskOnly extends StorageLevel
>>> >> }
>>> >>
>>> >> On Wed, Mar 4, 2015 at 5:37 PM, Michael Armbrust <michael@databricks.com> wrote:
>>> >>
>>> >>> #4 with a preference for CamelCaseEnums
>>> >>>
>>> >>> On Wed, Mar 4, 2015 at 5:29 PM, Joseph Bradley <joseph@databricks.com>
>>> >>> wrote:
>>> >>>
>>> >>> > another vote for #4
>>> >>> > People are already used to adding "()" in Java.
>>> >>> >
>>> >>> >
>>> >>> > On Wed, Mar 4, 2015 at 5:14 PM, Stephen Boesch <javadba@gmail.com> wrote:
>>> >>> >
>>> >>> > > #4 but with MemoryOnly (more scala-like)
>>> >>> > >
>>> >>> > > http://docs.scala-lang.org/style/naming-conventions.html
>>> >>> > >
>>> >>> > > Constants, Values, Variables and Methods
>>> >>> > >
>>> >>> > > Constant names should be in upper camel case. That is, if the
>>> >>> > > member is final, immutable and it belongs to a package object or
>>> >>> > > an object, it may be considered a constant (similar to Java's
>>> >>> > > static final members):
>>> >>> > >
>>> >>> > >
>>> >>> > >    object Container {
>>> >>> > >      val MyConstant = ...
>>> >>> > >    }
>>> >>> > >
>>> >>> > >
>>> >>> > > 2015-03-04 17:11 GMT-08:00 Xiangrui Meng <mengxr@gmail.com>:
>>> >>> > >
>>> >>> > > > Hi all,
>>> >>> > > >
>>> >>> > > > There are many places where we use enum-like types in Spark,
>>> >>> > > > but in different ways. Every approach has both pros and cons.
>>> >>> > > > I wonder whether there should be an "official" approach for
>>> >>> > > > enum-like types in Spark.
>>> >>> > > >
>>> >>> > > > 1. Scala's Enumeration (e.g., SchedulingMode, WorkerState, etc.)
>>> >>> > > >
>>> >>> > > > * All types show up as Enumeration.Value in Java.
>>> >>> > > >
>>> >>> > > > http://spark.apache.org/docs/latest/api/java/org/apache/spark/scheduler/SchedulingMode.html
>>> >>> > > >
>>> >>> > > > 2. Java's Enum (e.g., SaveMode, IOMode)
>>> >>> > > >
>>> >>> > > > * Implementation must be in a Java file.
>>> >>> > > > * Values don't show up in the ScalaDoc:
>>> >>> > > >
>>> >>> > > > http://spark.apache.org/docs/latest/api/scala/#org.apache.spark.network.util.IOMode
>>> >>> > > >
>>> >>> > > > 3. Static fields in Java (e.g., TripletFields)
>>> >>> > > >
>>> >>> > > > * Implementation must be in a Java file.
>>> >>> > > > * Doesn't need "()" in Java code.
>>> >>> > > > * Values don't show up in the ScalaDoc:
>>> >>> > > >
>>> >>> > > > http://spark.apache.org/docs/latest/api/scala/#org.apache.spark.graphx.TripletFields
>>> >>> > > >
>>> >>> > > > 4. Objects in Scala. (e.g., StorageLevel)
>>> >>> > > >
>>> >>> > > > * Needs "()" in Java code.
>>> >>> > > > * Values show up in both ScalaDoc and JavaDoc:
>>> >>> > > >
>>> >>> > > > http://spark.apache.org/docs/latest/api/scala/#org.apache.spark.storage.StorageLevel$
>>> >>> > > >
>>> >>> > > > http://spark.apache.org/docs/latest/api/java/org/apache/spark/storage/StorageLevel.html
>>> >>> > > >
>>> >>> > > > It would be great if we had an "official" approach for this
>>> >>> > > > as well as a naming convention for enum-like values
>>> >>> > > > ("MEMORY_ONLY" or "MemoryOnly"). Personally, I like 4) with
>>> >>> > > > "MEMORY_ONLY". Any thoughts?
>>> >>> > > >
>>> >>> > > > Best,
>>> >>> > > > Xiangrui
>>> >>> > > >
>>> >>> > > >
>>> >>> > > > ---------------------------------------------------------------------
>>> >>> > > > To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>> >>> > > > For additional commands, e-mail: dev-help@spark.apache.org
>>> >>> > > >
>>> >>> > > >
>>> >>> > >
>>> >>> >
>>> >>>
>>>
>
