spark-user mailing list archives

From Dhaval Modi <dhavalmod...@gmail.com>
Subject Re: Convert a line of String into column
Date Sat, 05 Oct 2019 08:30:54 GMT
Hi,

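First, convert "lines" to a DataFrame. You will get a single column holding
the original string, one row per line.

Next, use the split function on this column to convert it to an array of
strings.

After this, select each element of the array by index (for example with
getItem) to get one column per element. Note that the explode function would
instead produce one row per element, which is what the current code already
does.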
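
A minimal sketch of the split-then-select approach, reusing the names from the quoted code. It assumes every line has exactly six whitespace-separated fields; the object name LineToColumns, the numCols value, and the col1..col6 output names are illustrative assumptions, not something from the original question. It indexes into the array rather than using explode, since explode yields one row per element.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, split}

// Hypothetical wrapper object, for illustration only
object LineToColumns {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder
      .master("local")
      .appName("MyApp")
      .getOrCreate()

    // textFile yields a Dataset[String] whose single column is named "value"
    val lines = spark.readStream.textFile("/tmp/data/")

    // Assumption: each line carries exactly six fields
    val numCols = 6

    // Split each line on runs of whitespace into an array column
    val parts = split(col("value"), "\\s+")

    // Pick array element i and expose it as column col(i + 1)
    val table = lines.select(
      (0 until numCols).map(i => parts.getItem(i).as(s"col${i + 1}")): _*
    )

    val query = table
      .writeStream
      .outputMode("append")
      .format("console")
      .start()
    query.awaitTermination()
  }
}
```

If the number of fields varies from line to line, the fixed numCols would need to be replaced by something that matches the actual data.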

On Wed 2 Oct, 2019, 03:18, <hamishberridge@tutanota.com> wrote:

> I want to convert a line of String to a table. For instance, I want to
> convert following line
>
> <column1> <column2> <column3> ... <column6> # this is a line in a text
> file, separated by white space
>
> to table
>
> +----+----+----+...+----+
> |col1|col2|col3|...|col6|
> +----+----+----+...+----+
> |val1|val2|val3|...|val6|
> +----+----+----+...+----+
> .....
>
> The code looks as below
>
>     import org.apache.spark.sql.functions._
>     import org.apache.spark.sql.SparkSession
>
>     val spark = SparkSession
>       .builder
>       .master("local")
>       .appName("MyApp")
>       .getOrCreate()
>
>     import spark.implicits._
>
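>     // textFile already returns a Dataset[String] with one column named "value"
>     val lines = spark.readStream.textFile("/tmp/data/")
>
>     // flatMap emits one row per word, so the result still has a single column
>     val words = lines.flatMap(_.split(" "))
>     words.printSchema()
>
>     val query = words
>       .writeStream
>       .outputMode("append")
>       .format("console")
>       .start()
>     query.awaitTermination()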
>
> But in fact this code only turns the line into a single column
>
> +-------+
> |  value|
> +-------+
> |col1...|
> |col2...|
> |col3...|
> |    ...|
> |   col6|
> +-------+
>
> How can I achieve the effect I want?
>
> Thanks.