spark-user mailing list archives

From Mich Talebzadeh <>
Subject Using Lambda function to generate random data in PySpark throws not defined error
Date Fri, 11 Dec 2020 15:08:00 GMT

This used to work, but it no longer does.

I have a file that has these functions:

import random
import string
import math

def randomString(length):
    letters = string.ascii_letters
    result_str = ''.join(random.choice(letters) for i in range(length))
    return result_str

def clustered(x,numRows):
    return math.floor(x -1)/numRows

def scattered(x,numRows):
    return abs((x -1 % numRows))* 1.0

def randomised(seed,numRows):
    return abs(random.randint(0, numRows) % numRows) * 1.0

def padString(x,chars,length):
    n = int(math.log10(x) + 1)
    result_str = ''.join(random.choice(chars) for i in range(length-n)) + str(x)
    return result_str

def padSingleChar(chars,length):
    result_str = ''.join(chars for i in range(length))
    return result_str

def println(lst):
    for ll in lst:

Now in the main().py module I import this file as follows:

import UsedFunctions as uf

Then I try the following

import UsedFunctions as uf

 numRows = 100000   ## do in increment of 100K rows
 rdd = sc.parallelize(Range). \
           map(lambda x: (x, uf.clustered(x, numRows), \
                             uf.scattered(x,10000), \
                             uf.randomised(x,10000), \
                             uf.randomString(50), \
                             uf.padString(x," ",50), \

The problem is that it now throws an error for numRows, as shown below:

  File "C:/Users/admin/PycharmProjects/pythonProject2/pilot/src/", line 101, in <lambda>
    map(lambda x: (x, uf.clustered(x, numRows), \
NameError: name 'numRows' is not defined

I don't know why this error is occurring.

I would appreciate any ideas.
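
For reference, here is a minimal sketch of one way to rule out a scoping problem. It assumes the lambda is created somewhere that numRows is not visible when the lambda is eventually called; the snippet above does not show where numRows is assigned relative to the lambda, so this is only a guess at the cause. The names build_rows and build_rows_partial are hypothetical, and plain Python map stands in for rdd.map so the sketch runs without Spark.

```python
import math
from functools import partial


def clustered(x, numRows):
    # Same body as clustered() in the UsedFunctions file above.
    return math.floor(x - 1) / numRows


def build_rows(values, numRows):
    # Hypothetical helper. numRows is bound as a default argument, so the
    # lambda carries its own copy of the value instead of looking the name
    # up in an enclosing or global scope at call time.
    return list(map(lambda x, numRows=numRows: (x, clustered(x, numRows)), values))


def build_rows_partial(values, numRows):
    # Hypothetical helper. functools.partial gives the same kind of binding
    # without a lambda.
    return list(map(partial(clustered, numRows=numRows), values))


rows = build_rows(range(1, 4), 100000)
print(rows)
```

A lambda of this shape is still called with a single argument, so it could be passed to rdd.map in the same way. If the NameError persists with the value bound like this, the cause is probably elsewhere, for example in how the module containing the lambda is structured.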



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.
