spark-user mailing list archives

From "BASAK, ANANDA" <ab9...@att.com>
Subject RE: Date and decimal datatype not working
Date Tue, 24 Mar 2015 00:35:45 GMT
Thanks. This worked well per your suggestions. I had to run the following (note that the pipe must be passed as a Char or escaped, because String.split treats a String argument as a regex, and a bare "|" splits on every character):

val TABLE_A = sc.textFile("/Myhome/SPARK/files/table_a_file.txt")
  .map(_.split('|'))
  .map(p => ROW_A(p(0).trim.toLong, p(1), p(2).trim.toInt, p(3),
    BigDecimal(p(4)), BigDecimal(p(5)), BigDecimal(p(6))))

Now I am stuck at another step. I have run a SQL query that selects all the fields, with a WHERE clause filtering TSTAMP by a date range and an ORDER BY TSTAMP clause. That runs fine.

Then I am trying to store the output in a CSV file using the saveAsTextFile("filename") function, but it is giving an error. Can you please help me with the proper syntax to store the output in a CSV file?
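A minimal sketch of one approach in Spark 1.2 (the query, date range, and output path below are assumptions, not taken from this thread): a SchemaRDD is an RDD[Row], and Row extends Seq[Any], so each row can be formatted into a delimited line before saving.

*******************************************************
// Hypothetical reconstruction of the query described above
val RESULT = sqlContext.sql(
  "SELECT * FROM TABLE_A WHERE TSTAMP >= 20150101 AND TSTAMP <= 20150331 ORDER BY TSTAMP")

// Calling saveAsTextFile on the SchemaRDD itself writes Row.toString
// (e.g. "[1,abc,2]"), so format each Row into a delimited line first
RESULT
  .map(row => row.mkString("|"))   // use "," for a true CSV
  .saveAsTextFile("/Myhome/SPARK/files/table_a_out")
*******************************************************

Note that saveAsTextFile writes a directory of part-NNNNN files rather than a single file; calling coalesce(1) before saving forces a single part file, at the cost of parallelism.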


Thanks & Regards
-----------------------
Ananda Basak
Ph: 425-213-7092

From: BASAK, ANANDA
Sent: Tuesday, March 17, 2015 3:08 PM
To: Yin Huai
Cc: user@spark.apache.org
Subject: RE: Date and decimal datatype not working

OK, thanks for the suggestions. Let me try them and I will confirm.

Regards
Ananda

From: Yin Huai [mailto:yhuai@databricks.com]
Sent: Tuesday, March 17, 2015 3:04 PM
To: BASAK, ANANDA
Cc: user@spark.apache.org
Subject: Re: Date and decimal datatype not working

p(0) is a String, so you need to explicitly convert it to a Long, e.g. p(0).trim.toLong. You also need to do this for p(2). For the BigDecimal values, you need to create BigDecimal objects from your String values.
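For illustration, a minimal sketch of those conversions on one parsed line (p is the Array[String] from splitting a pipe-delimited line; field positions follow the ROW_A schema below):

ROW_A(p(0).trim.toLong,  // String -> Long
      p(1),
      p(2).trim.toInt,   // String -> Int
      p(3),
      BigDecimal(p(4)),  // String -> scala.math.BigDecimal
      BigDecimal(p(5)),
      BigDecimal(p(6)))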

On Tue, Mar 17, 2015 at 5:55 PM, BASAK, ANANDA <ab9902@att.com> wrote:
Hi All,
I am very new to the Spark world; I just started some test coding last week. I am using spark-1.2.1-bin-hadoop2.4 and coding in Scala.
I am having issues using the Date and decimal data types. Below is the code I am running at the Scala prompt. I am trying to define a table and point it to a flat file containing raw data (pipe-delimited format). Once that is done, I will run some SQL queries and write the output to another flat file in pipe-delimited format.

*******************************************************
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.createSchemaRDD


// Define row and table
case class ROW_A(
  TSTAMP:    Long,
  USIDAN:    String,
  SECNT:     Int,
  SECT:      String,
  BLOCK_NUM: BigDecimal,
  BLOCK_DEN: BigDecimal,
  BLOCK_PCT: BigDecimal)

val TABLE_A = sc.textFile("/Myhome/SPARK/files/table_a_file.txt")
  .map(_.split('|'))
  .map(p => ROW_A(p(0), p(1), p(2), p(3), p(4), p(5), p(6)))

TABLE_A.registerTempTable("TABLE_A")

***************************************************

The second-to-last command gives an error like the following:
<console>:17: error: type mismatch;
found   : String
required: Long

It looks like the contents of my flat file are always treated as String, never as Date or decimal. How can I make Spark read them as Date or decimal types?
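For the Date part of the question, one possible approach (a sketch only; the java.sql.Timestamp mapping and the "yyyy-mm-dd hh:mm:ss" input format are assumptions, not from this thread) is to declare the field as java.sql.Timestamp, which Spark SQL's case-class reflection should map to a timestamp column:

import java.sql.Timestamp

case class EVENT(TS: Timestamp, NAME: String)

// Timestamp.valueOf expects "yyyy-[m]m-[d]d hh:mm:ss[.f...]"
val events = sc.textFile("/Myhome/SPARK/files/events.txt")   // hypothetical path
  .map(_.split('|'))
  .map(p => EVENT(Timestamp.valueOf(p(0).trim), p(1)))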

Regards
Ananda
