spark-user mailing list archives

From "Evan R. Sparks" <evan.spa...@gmail.com>
Subject Re: Spark and Stanford CoreNLP
Date Mon, 24 Nov 2014 17:12:19 GMT
This is probably not the right venue for general questions on CoreNLP - the
project website (http://nlp.stanford.edu/software/corenlp.shtml) provides
documentation and links to mailing lists and Stack Overflow topics.

On Mon, Nov 24, 2014 at 9:08 AM, Madabhattula Rajesh Kumar
<mrajaforu@gmail.com> wrote:

> Hello,
>
> I'm new to Stanford CoreNLP. Could anyone share good training material
> and examples (Java or Scala) on NLP?
>
> Regards,
> Rajesh
>
> On Mon, Nov 24, 2014 at 9:38 PM, Ian O'Connell <ian@ianoconnell.com>
> wrote:
>
>>
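>> import edu.stanford.nlp.pipeline.StanfordCoreNLP
>> object MyCoreNLP {
>>   // Built lazily on first use in each JVM; never serialized with a task.
>>   @transient lazy val coreNLP = new StanfordCoreNLP()
>> }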
>>
>> and then refer to it from your map/reduce/mapPartitions closures; that
>> should be fine (presuming it's thread-safe). It will only be initialized
>> once per classloader per JVM.
>>
>> On Mon, Nov 24, 2014 at 7:58 AM, Evan Sparks <evan.sparks@gmail.com>
>> wrote:
>>
>>> We have gotten this to work, but it requires instantiating the CoreNLP
>>> object on the worker side. Because of the initialization time it makes a
>>> lot of sense to do this inside of a .mapPartitions instead of a .map, for
>>> example.
>>>
>>> As an aside, if you're using it from Scala, have a look at sistanlp,
>>> which provides a nicer, Scala-friendly interface to CoreNLP.
>>>
>>>
>>> > On Nov 24, 2014, at 7:46 AM, tvas <theodoros.vasiloudis@gmail.com> wrote:
>>> >
>>> > Hello,
>>> >
>>> > I was wondering if anyone has gotten the Stanford CoreNLP Java library to
>>> > work with Spark.
>>> >
>>> > My attempts to use the parser/annotator fail because of task serialization
>>> > errors, since the class StanfordCoreNLP cannot be serialized.
>>> >
>>> > I've tried the remedies of registering StanfordCoreNLP through Kryo, as well
>>> > as using chill.MeatLocker, but these still produce serialization errors.
>>> > Passing the StanfordCoreNLP object as transient leads to a
>>> > NullPointerException instead.
>>> >
>>> > Has anybody managed to get this to work?
>>> >
>>> > Regards,
>>> > Theodore
>>> >
>>> >
>>> >
>>> > --
>>> > View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-and-Stanford-CoreNLP-tp19654.html
>>> > Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>> >
>>> > ---------------------------------------------------------------------
>>> > To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
>>> > For additional commands, e-mail: user-help@spark.apache.org
>>> >
>>>
>>
>
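
The worker-side approach described in the quoted messages (build the pipeline inside .mapPartitions so that no StanfordCoreNLP instance is captured by a serialized closure) might look roughly like the sketch below. The names SentenceSplitter, newPipeline and sentenceCounts are hypothetical, and the annotator list "tokenize, ssplit" is only an assumption chosen to keep the example small; it also assumes Spark and CoreNLP are on the classpath of both driver and executors.

```scala
import java.util.Properties

import edu.stanford.nlp.ling.CoreAnnotations
import edu.stanford.nlp.pipeline.{Annotation, StanfordCoreNLP}
import org.apache.spark.rdd.RDD

// Hypothetical helper object, not taken from the thread.
object SentenceSplitter {

  // Builds a fresh pipeline. This is only ever called on a worker,
  // so the non-serializable StanfordCoreNLP object never has to be
  // shipped from the driver.
  def newPipeline(): StanfordCoreNLP = {
    val props = new Properties()
    // Assumed annotators; a real job would likely list more.
    props.setProperty("annotators", "tokenize, ssplit")
    new StanfordCoreNLP(props)
  }

  // Pairs each input text with the number of sentences CoreNLP finds in it.
  def sentenceCounts(docs: RDD[String]): RDD[(String, Int)] =
    docs.mapPartitions { iter =>
      // One pipeline per partition, so the initialization cost is paid
      // once per partition rather than once per record.
      val pipeline = newPipeline()
      iter.map { text =>
        val annotation = new Annotation(text)
        pipeline.annotate(annotation)
        val sentences =
          annotation.get(classOf[CoreAnnotations.SentencesAnnotation])
        (text, sentences.size())
      }
    }
}
```

The lazy-val holder object suggested earlier in the thread is a variation on the same idea: referencing MyCoreNLP.coreNLP inside the closure instead of calling newPipeline() would share one pipeline per JVM rather than one per partition, provided the pipeline is safe to use from several task threads at once.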
