spark-user mailing list archives

From Christophe Préaud <christophe.pre...@kelkoo.com>
Subject Re: DataFrame support for hadoop glob patterns
Date Wed, 09 Mar 2016 16:14:42 GMT
Hi,

Unless I've misunderstood what you want to achieve, you could use:
sqlContext.read.json(sc.textFile("/mnt/views-p/base/2016/01/*/*-xyz.json"))
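
For anyone who wants to try this outside a shell session, here is a minimal self-contained sketch of the same idea. It assumes the Spark 1.x API (SQLContext, and read.json accepting an RDD[String]) and input files with one JSON object per line. The object name GlobJsonExample, the helper readJsonGlob, and the local[*] master are made up for illustration and are not from this thread.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}

// Hypothetical wrapper, for illustration only.
object GlobJsonExample {

  // Let sc.textFile expand the glob, then parse the resulting lines as JSON.
  // Assumes each line of every matched file is one complete JSON object.
  def readJsonGlob(sqlContext: SQLContext, pattern: String): DataFrame = {
    val lines = sqlContext.sparkContext.textFile(pattern)
    sqlContext.read.json(lines)
  }

  def main(args: Array[String]): Unit = {
    // The default pattern is the one from the original question.
    val pattern = args.headOption.getOrElse("/mnt/views-p/base/2016/01/*/*-xyz.json")

    // local[*] is an assumption for a quick local run; use your own master.
    val conf = new SparkConf().setAppName("glob-json-example").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val df = readJsonGlob(new SQLContext(sc), pattern)
      df.printSchema()
    } finally {
      sc.stop()
    }
  }
}
```

The schema is still inferred from the data, as with the path-based read.json; the only difference is that the glob is resolved by textFile rather than by the DataFrame reader.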

Regards,
Christophe.

On 09/03/16 15:24, Ted Yu wrote:
Hadoop glob patterns don't support multi-level wildcards.

Thanks

On Mar 9, 2016, at 6:15 AM, Koert Kuipers <koert@tresata.com> wrote:

If it's based on HadoopFsRelation, shouldn't it support it? HadoopFsRelation handles globs.

On Wed, Mar 9, 2016 at 8:56 AM, Ted Yu <yuzhihong@gmail.com> wrote:
This is currently not supported.

On Mar 9, 2016, at 4:38 AM, Jakub Liska <liska.jakub@gmail.com> wrote:

Hey,

Is something like this possible?

sqlContext.read.json("/mnt/views-p/base/2016/01/*/*-xyz.json")

I switched to DataFrames because my source files changed from TSV to JSON,
but now I'm not able to load the files as I did before. I get this error if I try that:

https://github.com/apache/spark/pull/9142#issuecomment-194248531


