spark-user mailing list archives

From Soheil Pourbafrani <soheil.i...@gmail.com>
Subject Create Hive table from CSV file
Date Tue, 12 Feb 2019 10:45:38 GMT
Hi,

Using the following code, I start a Thrift Server that exposes a Hive table
created from a CSV file. I expect the first line of the file to be treated
as a header, but when I select data from the table, the CSV header comes
back as a data row. It seems the line
"TBLPROPERTIES(skip.header.line.count = 1)" has no effect. Is there any way
to do this using Spark SQL?

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

def main(args: Array[String]): Unit = {
    val conf = new SparkConf
    conf
      .set("hive.server2.thrift.port", "10000")
      .set("spark.sql.hive.thriftServer.singleSession", "true")
      .set("spark.sql.warehouse.dir", "/metadata/hive")
      .set("spark.sql.catalogImplementation", "hive")
      .set("skip.header.line.count", "1")
      .setMaster("local[*]")
      .setAppName("ThriftServer")
    val sc = new SparkContext(conf)
    val spark = SparkSession.builder()
      .config(conf)
      .enableHiveSupport()
      .getOrCreate()

    // Hive text-format table over the CSV files; the header line is
    // expected to be skipped via the table property on the last line.
    spark.sql(
      "CREATE TABLE IF NOT EXISTS freq_back (" +
        "id int," +
        "time_stamp bigint," +
        "time_quality string ) " +
        "ROW FORMAT DELIMITED " +
        "FIELDS TERMINATED BY ',' " +
        "STORED AS TEXTFILE " +
        "LOCATION 'hdfs://DB_BackUp/freq' " +
        "TBLPROPERTIES(skip.header.line.count = 1)"
    )

    // Expose the tables of this session over the Thrift (JDBC/ODBC) server.
    HiveThriftServer2.startWithContext(spark.sqlContext)
}
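
One direction that might be worth trying, sketched here as an assumption
rather than a confirmed fix, is to declare the table through Spark's own CSV
data source instead of the Hive text format, since that reader has a
"header" option. The table name freq_back_csv and the object name
CsvHeaderSketch are hypothetical; the path and columns are the ones from the
code above.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical standalone sketch; not the original application.
object CsvHeaderSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("CsvHeaderSketch")
      .enableHiveSupport()
      .getOrCreate()

    // Data source table read by Spark's CSV reader. The option
    // header 'true' asks the reader to treat the first line as a
    // header instead of returning it as a data row.
    spark.sql(
      """CREATE TABLE IF NOT EXISTS freq_back_csv (
        |  id INT,
        |  time_stamp BIGINT,
        |  time_quality STRING
        |)
        |USING csv
        |OPTIONS (path 'hdfs://DB_BackUp/freq', header 'true')
        |""".stripMargin)

    // Quick check that the header no longer shows up as a row.
    spark.sql("SELECT * FROM freq_back_csv LIMIT 5").show()
  }
}
```

If this behaves as hoped, the same CREATE TABLE statement could replace the
Hive-format one before the call to HiveThriftServer2.startWithContext.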
