spark-user mailing list archives

From Marius Soutier <mps....@gmail.com>
Subject Re: Python vs Scala performance
Date Wed, 22 Oct 2014 21:30:49 GMT
Yeah, we’re using Python 2.7.3.

On 22.10.2014, at 20:06, Nicholas Chammas <nicholas.chammas@gmail.com> wrote:

> On Wed, Oct 22, 2014 at 11:34 AM, Eustache DIEMERT <eustache@diemert.fr> wrote:
>
>> Wild guess maybe, but do you decode the JSON records in Python? It could be much slower,
>> as the default lib is quite slow.
> 
> 
> Oh yeah, this is a good place to look. Also, just upgrading to Python 2.7 may be enough of a
> performance improvement, because they merged the fast JSON deserialization from simplejson
> into the standard library. So you may not need to use an external library like ujson, though
> that may help too.
> 
> Nick
> 
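One way to check whether JSON decoding is the bottleneck discussed above is to time it outside Spark. The following is only a rough sketch: the sample records, the helper names `decode_all` and `time_decoder`, and the record count are made up for illustration, and ujson is used only if it happens to be installed.

```python
import json
import timeit

try:
    import ujson  # optional third-party decoder; may not be installed
except ImportError:
    ujson = None

# Hypothetical sample data: 10,000 small JSON records, one per line.
records = [
    json.dumps({"id": i, "name": "user%d" % i, "tags": ["a", "b"]})
    for i in range(10000)
]


def decode_all(loads, lines):
    """Decode every line with the given loads function."""
    return [loads(line) for line in lines]


def time_decoder(loads, lines, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` runs."""
    return min(
        timeit.repeat(lambda: decode_all(loads, lines), number=1, repeat=repeat)
    )


# Always measure the standard library; add ujson only when available.
decoders = {"json": json.loads}
if ujson is not None:
    decoders["ujson"] = ujson.loads

timings = {name: time_decoder(loads, records) for name, loads in decoders.items()}
for name, seconds in sorted(timings.items()):
    print("%-6s %.4f s for %d records" % (name, seconds, len(records)))
```

Running the same script under each interpreter on the cluster should show whether the standard `json` module is fast enough for the job, or whether swapping in another decoder is worth trying. The numbers depend heavily on record size and shape, so real input lines would be a better test than the synthetic ones here.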

