spark-dev mailing list archives

From Jules Damji <dmat...@comcast.net>
Subject Re: More publicly documenting the options under spark.sql.*
Date Fri, 17 Jan 2020 03:25:18 GMT
It’s one thing to get the names/values of the configurations via spark.sql("set -v"),
but quite another to understand what each one achieves and when and why you’d want to use it.
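
For example, a rough sketch from the shell (I believe the output columns
are key/value/meaning, but check your version):

  val confs = spark.sql("SET -v")   // all non-internal SQL configurations
  confs.select("key", "value", "meaning").show(3, truncate = false)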


A webpage with a table and a description of each would be a huge benefit.

Cheers 
Jules 

Sent from my iPhone
Pardon the dumb thumb typos :)

> On Jan 16, 2020, at 11:04 AM, Shixiong(Ryan) Zhu <shixiong@databricks.com> wrote:
> 
> 
> "spark.sql("set -v")" returns a Dataset that has all non-internal SQL configurations.
Should be pretty easy to automatically generate a SQL configuration page.
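> 
> A rough, untested sketch of what I mean (assuming the key/value/meaning
> columns that "set -v" returns today, and a markdown page as the target):
> 
>   // collect the non-internal confs and render them as a markdown table
>   val rows = spark.sql("SET -v").collect()
>   val table = rows.map { r =>
>     s"| ${r.getString(0)} | ${r.getString(1)} | ${r.getString(2)} |"
>   }.mkString("| Property | Default | Meaning |\n| --- | --- | --- |\n",
>     "\n", "\n")
>   java.nio.file.Files.write(
>     java.nio.file.Paths.get("sql-configs.md"),  // hypothetical output file
>     table.getBytes("UTF-8"))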
> Best Regards,
> 
> Ryan
> 
> 
>> On Wed, Jan 15, 2020 at 5:47 AM Hyukjin Kwon <gurwls223@gmail.com> wrote:
>> I think automatically creating a configuration page isn't a bad idea,
>> since any configuration not created via .internal() in SQLConf is
>> deprecated before it is removed anyway.
>> 
>> I already tried this kind of automatic generation from the code for the
>> SQL built-in functions page, and I'm pretty sure we can do a similar
>> thing for configurations as well.
>> 
>> We could perhaps mimic what Hadoop does: https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/core-default.xml
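>> 
>> i.e. per-property entries along these lines (one entry from that file,
>> if I remember it right):
>> 
>>   <property>
>>     <name>hadoop.common.configuration.version</name>
>>     <value>0.23.0</value>
>>     <description>version of this configuration file</description>
>>   </property>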
>> 
>>> On Wed, 15 Jan 2020, 10:46 Sean Owen, <srowen@gmail.com> wrote:
>>> Some of it is intentionally undocumented, as far as I know: an
>>> experimental option that may change, a legacy option, or a safety-valve
>>> flag. Certainly anything that's marked as an internal conf. (That does
>>> raise the question of who such a conf is for, if you have to read the
>>> source to find it.)
>>> 
>>> I don't know if we need to overhaul the conf system, but there may
>>> indeed be some confs that could legitimately be documented. I don't
>>> know which.
>>> 
>>> On Tue, Jan 14, 2020 at 7:32 PM Nicholas Chammas
>>> <nicholas.chammas@gmail.com> wrote:
>>> >
>>> > I filed SPARK-30510 thinking that we had forgotten to document an
>>> > option, but it turns out that there's a whole bunch of stuff under
>>> > SQLConf.scala that has no public documentation under
>>> > http://spark.apache.org/docs.
>>> >
>>> > Would it be appropriate to somehow automatically generate a
>>> > documentation page from SQLConf.scala, as Hyukjin suggested on that
>>> > ticket?
>>> >
>>> > Another thought that comes to mind is moving the config definitions
>>> > out of Scala and into a data format like YAML or JSON, and then
>>> > sourcing that both for SQLConf and for whatever documentation page we
>>> > want to generate. What do you think of that idea?
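>>> >
>>> > Roughly, I'm picturing entries like this (illustrative only; the
>>> > field names are made up, and the doc text is paraphrased from
>>> > SQLConf):
>>> >
>>> >   - key: spark.sql.shuffle.partitions
>>> >     default: 200
>>> >     doc: >
>>> >       The default number of partitions to use when shuffling data
>>> >       for joins or aggregations.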
>>> >
>>> > Nick
>>> >
>>> 
