spark-user mailing list archives

From "VON RUEDEN, Jonathan" <>
Subject XML
Date Fri, 15 Jul 2016 12:44:17 GMT
Hi everyone,

I want to read an XML file with multiple attributes per tag and need some help. I can
read and process the sample files, but I can't find a solution for my own XML.
Here's the file structure:
<?xml version="1.0" encoding="UTF-8"?>
<report format="1.0">
   <creationTime millis="1468158875331" readable="2016-07-10 13:54:35 +0000" />
   <project artifactid="fin.ap.balances.display" gitUrl="ssh://" groupid="" parentArtifactId="" parentVersion="1.12.2" version="4.0.7-SNAPSHOT">
      <check columnNumber="0" context="4.0.6" errorType="PREVIOUS_PROJECT_VERSION" filePath="/hompath/path/path" lineNumber="0" message="Reporting :: Previous version checked for compatibility&#xA;For details, see:" severity="Info" />
      <check columnNumber="0" context="Directories in '/src/main/webapp': [WEB-INF, model, view, util, css, img, i18n]" errorType="PROJECT_OLD_STRUCTURE" filePath="/hpathpathpath/ath/webapp" lineNumber="0" message="Reporting :: Using old project structure&#xA;For details, see:" severity="Info" />

--> Is there any way to have com.databricks.spark.xml write all the attributes into
one cell as a string, so that I can come up with my own way of splitting and transforming
this into a table? Or do you know another way I can read in such a file?
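
To make the target table concrete, here is a minimal sketch in plain Python using only the standard library. It is not the com.databricks.spark.xml API, just an illustration of the flattening: one row per <check> element, one column per attribute, plus two columns copied from the enclosing <project>. The sample string is a shortened, hypothetical version of the report above (closing tags are assumed so that it parses), and the function name and the "project_" column prefix are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened sample modeled on the report structure above.
# Closing tags are assumed here so that the snippet parses on its own.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<report format="1.0">
  <creationTime millis="1468158875331" readable="2016-07-10 13:54:35 +0000" />
  <project artifactid="fin.ap.balances.display" version="4.0.7-SNAPSHOT">
    <check columnNumber="0" context="4.0.6" errorType="PREVIOUS_PROJECT_VERSION" lineNumber="0" severity="Info" />
    <check columnNumber="0" context="Directories" errorType="PROJECT_OLD_STRUCTURE" lineNumber="0" severity="Info" />
  </project>
</report>
"""


def checks_to_rows(xml_bytes):
    """Return one dict per <check> element, with its attributes as columns."""
    root = ET.fromstring(xml_bytes)
    rows = []
    for project in root.iter("project"):
        for check in project.iter("check"):
            # Carry over some project-level attributes under a prefix so they
            # cannot collide with attribute names on <check>.
            row = {
                "project_artifactid": project.get("artifactid"),
                "project_version": project.get("version"),
            }
            # Every attribute of <check> becomes its own column.
            row.update(check.attrib)
            rows.append(row)
    return rows


rows = checks_to_rows(SAMPLE.encode("utf-8"))
for row in rows:
    print(row)
```

Each resulting dict corresponds to one table row, so the list could be handed to whatever builds the final table.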
Thanks much,


Jonathan von Rüden
Enterprise Analytics

SAP France | Paris
Mobile: +33 68 221-2425
