spark-user mailing list archives

From Xiangrui Meng <m...@databricks.com>
Subject Re: Custom Spark Error on Hadoop Cluster
Date Mon, 18 Jul 2016 13:41:20 GMT
Glad to hear. Could you please share your solution on the user mailing
list? -Xiangrui

On Mon, Jul 18, 2016 at 2:26 AM Alger Remirata <abremirata21@gmail.com>
wrote:

> Hi Xiangrui,
>
> We have now solved the problem. Thanks for all the tips you've given.
>
> Best Regards,
>
> Alger
>
> On Thu, Jul 14, 2016 at 2:43 AM, Alger Remirata <abremirata21@gmail.com>
> wrote:
>
>> We are using Cloudera Manager with the standalone cluster manager.
>>
>> On Thu, Jul 14, 2016 at 2:20 AM, Alger Remirata <abremirata21@gmail.com>
>> wrote:
>>
>>> It looks like a lot of people have already posted about
>>> ClassNotFoundException in cluster mode for version 1.5.1:
>>>
>>> https://www.mail-archive.com/user@spark.apache.org/msg43089.html
>>>
>>>
>>>
>>> On Thu, Jul 14, 2016 at 12:45 AM, Alger Remirata <abremirata21@gmail.com
>>> > wrote:
>>>
>>>> Hi Xiangrui,
>>>>
>>>> I checked all the nodes of the cluster. The job works locally on each
>>>> node, but there's an error when deploying to the cluster itself. I
>>>> don't understand why it works locally on each individual node but
>>>> fails with the error mentioned when deployed to the Hadoop cluster.
>>>>
>>>> Thanks,
>>>>
>>>> Alger
>>>>
>>>> On Wed, Jul 13, 2016 at 4:38 AM, Alger Remirata <abremirata21@gmail.com
>>>> > wrote:
>>>>
>>>>> Since we're using mvn to build, it looks like mvn didn't add the
>>>>> class. Is there something to add to pom.xml so that the new class
>>>>> can be recognized?
>>>>>
>>>>> On Wed, Jul 13, 2016 at 4:21 AM, Alger Remirata <
>>>>> abremirata21@gmail.com> wrote:
>>>>>
>>>>>> Thanks for the reply; however, I couldn't locate the MLlib jar.
>>>>>> What I have is a fat 'spark-assembly-1.5.1-hadoop2.6.0.jar'.
>>>>>>
>>>>>> There's an error when I copy user@spark.apache.org: the message is
>>>>>> suddenly not sent when I do that.
>>>>>>
>>>>>> On Wed, Jul 13, 2016 at 4:13 AM, Alger Remirata <
>>>>>> abremirata21@gmail.com> wrote:
>>>>>>
>>>>>>> Thanks for the reply; however, I couldn't locate the MLlib jar.
>>>>>>> What I have is a fat 'spark-assembly-1.5.1-hadoop2.6.0.jar'.
>>>>>>>
>>>>>>> On Tue, Jul 12, 2016 at 3:23 AM, Xiangrui Meng <meng@databricks.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> (+user@spark. Please copy user@ so other people can see and
>>>>>>>> help.)
>>>>>>>>
>>>>>>>> The error message means you have an MLlib jar on the classpath,
>>>>>>>> but it didn't contain ALS$StandardNNLSSolver. So either the
>>>>>>>> modified jar was not deployed to the workers, or an unmodified
>>>>>>>> MLlib jar sits in front of the modified one on the classpath.
>>>>>>>> You can check the worker logs to see the classpath used to launch
>>>>>>>> the worker, and then check the MLlib jars on that classpath.
>>>>>>>> -Xiangrui
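
[The advice above to "see the classpath used in launching the worker" can be sketched as follows on Linux. The worker class name is the standard one for Spark standalone mode; the pid lookup is illustrative.]

```shell
# /proc/<pid>/cmdline holds the exact command line a process was launched
# with, NUL-separated. On a worker node you would first find the pid:
#   pgrep -f org.apache.spark.deploy.worker.Worker
# Demonstrated here on this shell's own pid ($$) so the sketch is runnable:
tr '\0' '\n' < /proc/$$/cmdline
```

[On a real worker, look for the `-cp` argument in that output: any unmodified Spark/MLlib jar listed ahead of the rebuilt assembly can shadow the new class.]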
>>>>>>>>
>>>>>>>> On Sun, Jul 10, 2016 at 10:18 PM Alger Remirata <
>>>>>>>> abremirata21@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Xiangrui,
>>>>>>>>>
>>>>>>>>> We have the modified jars deployed on both master and slave
>>>>>>>>> nodes.
>>>>>>>>>
>>>>>>>>> What do you mean by this line? "1. The unmodified Spark jars
>>>>>>>>> were not on the classpath (already existed on the cluster or
>>>>>>>>> pulled in by other packages)."
>>>>>>>>>
>>>>>>>>> How would I check that the unmodified Spark jars are not on the
>>>>>>>>> classpath? We replaced the entire contents of the SPARK_HOME
>>>>>>>>> directory; the newly built customized Spark is the current
>>>>>>>>> contents of SPARK_HOME.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Alger
>>>>>>>>>
>>>>>>>>> On Fri, Jul 8, 2016 at 1:32 PM, Xiangrui Meng <meng@databricks.com
>>>>>>>>> > wrote:
>>>>>>>>>
>>>>>>>>>> This seems like a deployment or dependency issue. Please check
>>>>>>>>>> the following:
>>>>>>>>>> 1. The unmodified Spark jars were not on the classpath (already
>>>>>>>>>> existed on the cluster or pulled in by other packages).
>>>>>>>>>> 2. The modified jars were indeed deployed to both master and
>>>>>>>>>> slave nodes.
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 5, 2016 at 12:29 PM Alger Remirata <
>>>>>>>>>> abremirata21@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> First of all, we'd like to thank you for developing Spark. It
>>>>>>>>>>> helps us a lot with our data science tasks.
>>>>>>>>>>>
>>>>>>>>>>> I have a question. We have built a customized Spark using the
>>>>>>>>>>> following command:
>>>>>>>>>>> mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive
>>>>>>>>>>> -Phive-thriftserver -DskipTests clean package
>>>>>>>>>>>
>>>>>>>>>>> In the custom Spark we built, we added a new Scala file (a
>>>>>>>>>>> solver called StandardNNLS); however, we get an error saying:
>>>>>>>>>>>
>>>>>>>>>>> Name: org.apache.spark.SparkException
>>>>>>>>>>> Message: Job aborted due to stage failure: Task 21 in stage
>>>>>>>>>>> 34.0 failed 4 times, most recent failure: Lost task 21.3 in
>>>>>>>>>>> stage 34.0 (TID 2547, 192.168.60.115):
>>>>>>>>>>> java.lang.ClassNotFoundException:
>>>>>>>>>>> org.apache.spark.ml.recommendation.ALS$StandardNNLSSolver
>>>>>>>>>>>
>>>>>>>>>>> StandardNNLSSolver is defined in another Scala file called
>>>>>>>>>>> StandardNNLS.scala, as we replaced the original NNLS solver
>>>>>>>>>>> with StandardNNLS.
>>>>>>>>>>> Do you have any idea about the error? Is there a config file
>>>>>>>>>>> we need to edit to add the classpath? Even if we insert the
>>>>>>>>>>> added code into ALS.scala instead of creating another file
>>>>>>>>>>> like StandardNNLS.scala, the inserted code is not recognized;
>>>>>>>>>>> it still fails with the ClassNotFoundException.
>>>>>>>>>>>
>>>>>>>>>>> However, when we run this on our local machine and not on the
>>>>>>>>>>> Hadoop cluster, it works. We don't know whether the error is
>>>>>>>>>>> because we are using mvn to build the custom Spark or whether
>>>>>>>>>>> it has something to do with communicating with the Hadoop
>>>>>>>>>>> cluster.
>>>>>>>>>>>
>>>>>>>>>>> We would like to ask for your ideas on how to solve this
>>>>>>>>>>> problem. We could create another package that does not depend
>>>>>>>>>>> on Apache Spark, but that approach is very slow. As of now, we
>>>>>>>>>>> are still learning Scala and Spark. Using Apache Spark's
>>>>>>>>>>> utilities makes the code faster; however, if we make another
>>>>>>>>>>> package that does not depend on Apache Spark, we have to
>>>>>>>>>>> recode the utilities that are private in Apache Spark. So it
>>>>>>>>>>> is better to use Apache Spark and insert the code we need.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Alger
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
