spark-dev mailing list archives

From Xiangrui Meng <men...@gmail.com>
Subject Re: [VOTE] Release Apache Spark 1.3.0 (RC3)
Date Mon, 09 Mar 2015 22:30:41 GMT
Krishna, I tested your linear regression example. For linear
regression, we changed the objective function from 1/n * \|Ax -
b\|_2^2 to 1/(2n) * \|Ax - b\|_2^2 to be consistent with common least
squares formulations. This halves the gradient, so you can reproduce
the old result by multiplying the step size by 2. The change makes no
difference if both runs reach convergence (and neither blows up).
However, your example uses a very small step size and did not converge
in 100 iterations, so in this case the step size matters. I will put a
note in the migration guide. Thanks! -Xiangrui
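
To illustrate the step-size point, here is a minimal sketch in plain
Python. It uses fixed-step gradient descent on a tiny made-up dataset,
not MLlib's actual optimizer or Krishna's example. The function name
gradient_descent, the data, and the step sizes are all assumptions
chosen for illustration.

```python
def gradient_descent(A, b, step, iters, scale):
    """Minimize scale * ||A x - b||_2^2 with fixed-step gradient descent.

    Illustrative only: this is not how MLlib implements its optimizer.
    """
    n, d = len(A), len(A[0])
    x = [0.0] * d
    for _ in range(iters):
        # Residual r = A x - b
        r = [sum(A[i][j] * x[j] for j in range(d)) - b[i] for i in range(n)]
        # Gradient of scale * ||r||^2 is 2 * scale * A^T r
        g = [2.0 * scale * sum(A[i][j] * r[i] for i in range(n))
             for j in range(d)]
        x = [x[j] - step * g[j] for j in range(d)]
    return x


# Hypothetical toy data (not from the thread).
A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
b = [1.0, 2.0, 3.0]
n = len(A)

# Old objective: 1/n * ||Ax - b||^2
old = gradient_descent(A, b, step=0.001, iters=100, scale=1.0 / n)
# New objective, same step size: each update is half as large.
new_same_step = gradient_descent(A, b, step=0.001, iters=100,
                                 scale=1.0 / (2 * n))
# New objective, doubled step size: matches the old iterates.
new_double_step = gradient_descent(A, b, step=0.002, iters=100,
                                   scale=1.0 / (2 * n))

print("old objective, step 0.001:", old)
print("new objective, step 0.001:", new_same_step)
print("new objective, step 0.002:", new_double_step)
```

With only 100 iterations and a small step, neither run has converged,
so the first two results differ, while the first and third agree.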

On Mon, Mar 9, 2015 at 1:38 PM, Sean Owen <sowen@cloudera.com> wrote:
> I'm +1 as I have not heard of anyone else seeing the Hive test
> failure, which is likely a test issue rather than a code issue anyway,
> and not a blocker.
>
> On Fri, Mar 6, 2015 at 9:36 PM, Sean Owen <sowen@cloudera.com> wrote:
>> The problem is small, especially if the essential doc changes really
>> do follow just a couple of days behind the final release. But then
>> why the rush if they're essential? Wait a couple of days, finish
>> them, make the release.
>>
>> Answer is, I think these changes aren't actually essential given the
>> comment from tdas, so: just mark these Critical? (although ... they do
>> say they're changes for the 1.3 release, so kind of funny to get to
>> them for 1.3.x or 1.4, but that's not important now.)
>>
>> I thought that Blocker really meant Blocker in this project, as I've
>> been encouraged to use it to mean "don't release without this." I
>> think we should use it that way. Just thinking of it as "extra
>> Critical" doesn't add anything. I don't think Documentation should be
>> special-cased as less important, and I don't think there's confusion
>> if Blocker means what it says, so I'd 'fix' that way.
>>
>> If nobody sees the Hive failure I observed, and if we can just zap
>> those "Blockers" one way or the other, +1
>>
>>
>> On Fri, Mar 6, 2015 at 9:17 PM, Patrick Wendell <pwendell@gmail.com> wrote:
>>> Sean,
>>>
>>> The docs are distributed and consumed in a fundamentally different way
>>> than Spark code itself. So we've always considered the "deadline" for
>>> doc changes to be when the release is finally posted.
>>>
>>> If there are small inconsistencies with the docs present in the source
>>> code for that release tag, IMO that doesn't matter much since we don't
>>> even distribute the docs with Spark's binary releases and virtually no
>>> one builds and hosts the docs on their own (that I am aware of, at
>>> least). Perhaps we can recommend if people want to build the doc
>>> sources that they should always grab the head of the most recent
>>> release branch, to set expectations accordingly.
>>>
>>> In the past we haven't considered it worth holding up the release
>>> process for the purpose of the docs. It just doesn't make sense since
>>> they are consumed "as a service". If we decide to change this
>>> convention, it would mean shipping our releases later, since we
>>> couldn't pipeline the doc finalization with voting.
>>>
>>> - Patrick
>>>
>>> On Fri, Mar 6, 2015 at 11:02 AM, Sean Owen <sowen@cloudera.com> wrote:
>>>> Given the title and tagging, it sounds like there could be some
>>>> must-have doc changes to go with what is being released as 1.3. It can
>>>> be finished later, and published later, but then the docs source
>>>> shipped with the release doesn't match the site, and until then, 1.3
>>>> is released without some "must-have" docs for 1.3 on the site.
>>>>
>>>> The real question to me is: are there any further, absolutely
>>>> essential doc changes that need to accompany 1.3 or not?
>>>>
>>>> If not, just resolve these. If there are, then it seems like the
>>>> release has to block on them. If there are some docs that should have
>>>> gone in for 1.3, but didn't, but aren't essential, well I suppose it
>>>> bears thinking about how to not slip as much work, but it doesn't
>>>> block.
>>>>
>>>> I think Documentation issues certainly can be a blocker and shouldn't
>>>> be specially ignored.
>>>>
>>>>
>>>> BTW the UISeleniumSuite issue is a real failure, but I do not think it
>>>> is serious: http://issues.apache.org/jira/browse/SPARK-6205  It isn't
>>>> a regression from 1.2.x, but only affects tests, and only affects a
>>>> subset of build profiles.
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Mar 6, 2015 at 6:43 PM, Patrick Wendell <pwendell@gmail.com> wrote:
>>>>> Hey Sean,
>>>>>
>>>>>> SPARK-5310 Update SQL programming guide for 1.3
>>>>>> SPARK-5183 Document data source API
>>>>>> SPARK-6128 Update Spark Streaming Guide for Spark 1.3
>>>>>
>>>>> For these, the issue is that they are documentation JIRAs, which
>>>>> don't need to be timed exactly with the release vote, since we can
>>>>> update the documentation on the website whenever we want. In the past
>>>>> I've just mentally filtered these out when considering RCs. I see a
>>>>> few options here:
>>>>>
>>>>> 1. We downgrade such issues away from Blocker (clearer, but we risk
>>>>> losing them in the fray if they really are things we want to have
>>>>> before the release is posted).
>>>>> 2. We provide a filter to the community that excludes 'Documentation'
>>>>> issues and shows all other blockers for 1.3. We can put this on the
>>>>> wiki, for instance.
>>>>>
>>>>> Which do you prefer?
>>>>>
>>>>> - Patrick
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> For additional commands, e-mail: dev-help@spark.apache.org
>


