spark-user mailing list archives

From Jerry OELoo
Subject Spark return key value pair
Date Wed, 19 Aug 2015 11:10:07 GMT
I want to parse a file and return key-value pairs with PySpark, but the
result is strange to me.
test.sql is a big file in which each line holds a username and a
password separated by '#'. I use mapper2 below to map the data. In my
understanding, each i in words.take(10) should be a tuple, but instead
each i is a single username or a single password. I don't understand
why. Thanks for your help.

def mapper2(line):

    words = line.split('#')
    return (words[0].strip(), words[1].strip())

def main2(sc):

    lines = sc.textFile("hdfs://master:9000/spark/test.sql")
    words = lines.flatMap(mapper2)

    for i in words.take(10):
        msg = i + ":" + "\n"
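
A possible explanation, sketched in plain Python rather than on a cluster: flatMap treats whatever the function returns as a sequence and emits its items one by one, so a returned (username, password) tuple would be split into two separate strings, whereas map keeps one tuple per line. The sample lines and the list-based stand-ins for map and flatMap below are assumptions for illustration only, not Spark's implementation.

```python
from itertools import chain


def mapper2(line):
    # Same mapper as in the message: split on '#' and strip both parts
    words = line.split('#')
    return (words[0].strip(), words[1].strip())


# Hypothetical sample lines standing in for the contents of test.sql
lines = ["alice # secret1", "bob # secret2"]

# Local stand-in for RDD.map: one output element (a tuple) per input line
mapped = [mapper2(line) for line in lines]

# Local stand-in for RDD.flatMap: each returned tuple is iterated,
# and its items are emitted individually
flat_mapped = list(chain.from_iterable(mapper2(line) for line in lines))

print(mapped)       # [('alice', 'secret1'), ('bob', 'secret2')]
print(flat_mapped)  # ['alice', 'secret1', 'bob', 'secret2']
```

If this is the cause, using lines.map(mapper2) instead of lines.flatMap(mapper2) should make each i a tuple. The message would then have to be built from i[0] and i[1], since a tuple cannot be concatenated with a string.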

Rejoice,I Desire!
