spark-issues mailing list archives

From István Gansperger (JIRA) <j...@apache.org>
Subject [jira] [Updated] (SPARK-23796) There's no API to change state RDD's name
Date Mon, 26 Mar 2018 17:07:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-23796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

István Gansperger updated SPARK-23796:
--------------------------------------
    Description: 
I use a few {{mapWithState}} stream operations in my application and at some point it became
a minor inconvenience that I could not figure out how to set the state RDD's name or serialization
level. Searching around didn't really help and I have not come across any issues regarding
this (pardon my inability to find it if there is one). It could be useful to see how much memory
each state uses if the user has multiple such transformations.

I have used some ugly reflection-based code to be able to set the name of the state RDD and
also the serialization level. I understand that the latter may be intentionally limited, but
I haven't come across any issues caused by this apart from slightly degraded performance in
exchange for a bit less memory usage. Are these limitations in place intentionally or is it
just an oversight? Having some extra methods for these on {{StateSpec}} could be useful in
my opinion.

  was:
I use a few {{mapWithState}} stream operations in my application and at some point it became
a minor inconvenience that I could not figure out how to set the state RDD's name or serialization
level. Searching around didn't really help and I have not come across any issues regarding
this (pardon my inability to find it if there is one). It could be useful to see how much memory
each state uses if the user has multiple such transformations.

I have used some ugly reflection-based code to be able to set the name of the state RDD and
also the serialization level. I understand that the latter may be intentionally limited, but
I haven't come across any issues caused by this apart from sightly degraded performance in
exchange for a bit less memory usage. Are these limitations in place intentionally or is it
just an oversight? Having some extra methods for these on {{StateSpec}} could be useful in
my opinion.


> There's no API to change state RDD's name
> -----------------------------------------
>
>                 Key: SPARK-23796
>                 URL: https://issues.apache.org/jira/browse/SPARK-23796
>             Project: Spark
>          Issue Type: Question
>          Components: Spark Core
>    Affects Versions: 2.3.0
>            Reporter: István Gansperger
>            Priority: Minor
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
