spark-user mailing list archives

From Mich Talebzadeh <mich.talebza...@gmail.com>
Subject Re: UDF on lpad
Date Thu, 25 Aug 2016 17:39:46 GMT
Thanks Mike.

Can one turn the first example into a generic UDF that produces output
like the one below, where ten "0" characters are padded to the left of 123?

  import scala.util.Random

  def padString(id: Int, chars: String, length: Int): String =
    (0 until length).map(_ => chars(Random.nextInt(chars.length))).mkString + id.toString

scala> padString(123, "0", 10)
res6: String = 0000000000123
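
One possible shape for such a generic helper, as a minimal sketch only. The names PadSketch, zeroPad and prepend are made up for illustration and are not part of Spark or the Scala library. zeroPad builds the format string from the first example at run time so the width becomes a parameter, and prepend always adds a fixed number of pad strings, as padString does when given a single pad character.

```scala
object PadSketch {
  // Pad an integer with zeros up to `width` total characters.
  // Builds e.g. "%010d" at run time instead of hard-coding it in an f"" string.
  def zeroPad(n: Int, width: Int): String =
    s"%0${width}d".format(n)

  // Always prepend `count` copies of `pad`, whatever the length of the value.
  // No Random needed: the result is deterministic.
  def prepend(value: Any, pad: String, count: Int): String =
    (pad * math.max(count, 0)) + value.toString
}
```

With these definitions PadSketch.zeroPad(123, 10) gives 0000000123 and PadSketch.prepend(123, "0", 10) gives 0000000000123. Either function should be wrappable with udf in the same way as SubstrUDF, although that part has not been tried here.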

Cheers

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 25 August 2016 at 17:29, Mike Metzger <mike@flexiblecreations.com> wrote:

> Are you trying to always add x numbers of digits / characters or are you
> trying to pad to a specific length?  If the latter, try using format
> strings:
>
> // Pad an integer to 10 total characters with 0's
> val c = 123
> f"$c%010d"
>
> // Result is 0000000123
>
>
> // Pad to 10 total characters with 0's
> val c = 123.87
> f"$c%010.2f"
>
> // Result is 0000123.87
>
>
> You can also do inline operations on the values before formatting.  I've
> used this specifically to pad for hex digits from strings.
>
> val d = "100"
> val hexstring = f"0x${d.toInt}%08X"
>
> // hexstring is 0x00000064
>
>
> Thanks
>
> Mike
>
> On Thu, Aug 25, 2016 at 9:27 AM, Mich Talebzadeh <
> mich.talebzadeh@gmail.com> wrote:
>
>> Ok I tried this
>>
>> def padString(s: String, chars: String, length: Int): String =
>>   (0 until length).map(_ => chars(Random.nextInt(chars.length))).mkString + s
>>
>> padString: (s: String, chars: String, length: Int)String
>>
>> It can be used as below, for example to left pad the figure 12345.87
>> with 10 "0"s:
>>
>> padString("12345.87", "0", 10)
>> res79: String = 000000000012345.87
>>
>> Any better way?
>>
>> Thanks
>>
>> Dr Mich Talebzadeh
>>
>> On 25 August 2016 at 12:06, Mich Talebzadeh <mich.talebzadeh@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> This UDF on substring works
>>>
>>> scala> val SubstrUDF = udf { (s: String, start: Int, end: Int) => s.substring(start, end) }
>>> SubstrUDF: org.apache.spark.sql.expressions.UserDefinedFunction =
>>> UserDefinedFunction(<function3>,StringType,Some(List(StringType,
>>> IntegerType, IntegerType)))
>>>
>>> I want something similar to this
>>>
>>> scala> sql("""select lpad("str", 10, "0")""").show
>>> +----------------+
>>> |lpad(str, 10, 0)|
>>> +----------------+
>>> |      0000000str|
>>> +----------------+
>>>
>>> scala> val SubstrUDF = udf { (s: String, len: Int, chars: String) => lpad(s, len, chars) }
>>> <console>:40: error: type mismatch;
>>>  found   : String
>>>  required: org.apache.spark.sql.Column
>>>        val SubstrUDF = udf { (s: String, len: Int, chars: String) => lpad(s, len, chars) }
>>>
>>>
>>> Any ideas?
>>>
>>> Thanks
>>>
>>> Dr Mich Talebzadeh
>>>
>>
>>
>
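
On the type mismatch in the original question quoted above: the lpad in org.apache.spark.sql.functions takes a Column as its first argument and returns a Column, so it cannot be called on the plain String that a UDF body receives. If a UDF is still wanted, the body has to be ordinary Scala. The following is a minimal sketch of a function that mimics the SQL lpad result shown in the thread, including shortening the input when it is longer than the target length. The names LpadSketch and lpadLike are hypothetical, and returning the input unchanged for an empty pad string is an assumption of this sketch rather than documented Spark behaviour.

```scala
object LpadSketch {
  // Plain-Scala approximation of SQL lpad(str, len, pad).
  def lpadLike(s: String, len: Int, pad: String): String =
    if (len <= 0) ""                               // nothing fits in a non-positive width
    else if (s.length >= len) s.substring(0, len)  // too long: shorten to len characters
    else if (pad.isEmpty) s                        // assumption: no pad available, return as-is
    else {
      val needed = len - s.length
      // Repeat the pad string often enough, then cut it to the exact size.
      val fill = (pad * (needed / pad.length + 1)).take(needed)
      fill + s
    }
}
```

LpadSketch.lpadLike("str", 10, "0") gives 0000000str, the same as the sql("""select lpad("str", 10, "0")""") output above. When the data is already in a DataFrame column, calling the built-in lpad on that column directly is probably simpler than any UDF.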
