spark-user mailing list archives

From Jean Georges Perrin <>
Subject Parsing XML
Date Tue, 04 Oct 2016 21:35:54 GMT
Spark 2.0.0
XML parser 0.4.0


I am trying to create a new column in my data frame, based on the value of a sub-element. I
have done that several times with JSON, but have not been successful with XML.

(I know a world with fewer formats would be easier :) )

Here is the code:
df.withColumn("FulfillmentOption1", df.col("//FulfillmentOption[1]/text()"));

And here is the error:
Exception in thread "main" org.apache.spark.sql.AnalysisException: Cannot resolve column name
"//FulfillmentOption[1]/text()" among (x, xx, xxx, xxxx, a, b, FulfillmentOption, c, d, e,
f, g);
    at org.apache.spark.sql.Dataset$$anonfun$resolve$1.apply(Dataset.scala:220)
    at org.apache.spark.sql.Dataset$$anonfun$resolve$1.apply(Dataset.scala:220)

The XPath is valid...
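
As a sanity check of the expression itself, here is a minimal sketch that evaluates it with the JDK's javax.xml.xpath API, outside of Spark. The sample XML, the class name XPathCheck and the method firstFulfillmentOption are made up for illustration, since the real document is not shown here. It only shows that the string is valid XPath; Dataset.col() resolves column names rather than XPath expressions, which is consistent with the error above.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathCheck {
    // Hypothetical sample document (assumption): the real XML layout is not shown.
    static final String SAMPLE_XML =
        "<Item>"
        + "<FulfillmentOption>Ship</FulfillmentOption>"
        + "<FulfillmentOption>Pickup</FulfillmentOption>"
        + "</Item>";

    // Evaluates the same XPath used in the withColumn() call against an XML string
    // and returns the string value of the first matching text node.
    static String firstFulfillmentOption(String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(
                "//FulfillmentOption[1]/text()",
                new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            // Wrap the checked exception so callers do not have to declare it.
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstFulfillmentOption(SAMPLE_XML));
    }
}
```

With the made-up sample above, the expression selects the text of the first FulfillmentOption element, so the problem appears to be where the expression is used and not the expression itself.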


