spark-user mailing list archives

From "VON RUEDEN, Jonathan" <jonathan.von.rue...@sap.com>
Subject XML
Date Fri, 15 Jul 2016 12:44:17 GMT
Hi everyone,

I want to read an XML file with multiple attributes per tag and need some help. I can
read and process the sample files, but I can't find a solution for my own XML.
Here is the file structure:
<?xml version="1.0" encoding="UTF-8"?>
<report format="1.0">
   <creationTime millis="1468158875331" readable="2016-07-10 13:54:35 +0000" />
   <project artifactid="fin.ap.balances.display" gitUrl="ssh://git.wdf.sap.corp:2/path/path/path"
groupid="com.sap.prod.prod" parentArtifactId="name.name" parentVersion="1.12.2" version="4.0.7-SNAPSHOT">
      <check columnNumber="0" context="4.0.6" errorType="PREVIOUS_PROJECT_VERSION" filePath="/hompath/path/path"
lineNumber="0" message="Reporting :: Previous version checked for compatibility&#xA;For
details, see: https://githudoc.doc.doc.doc.docm.md" severity="Info" />
      <check columnNumber="0" context="Directories in '/src/main/webapp': [WEB-INF, model,
view, util, css, img, i18n]" errorType="PROJECT_OLD_STRUCTURE" filePath="/hpathpathpath/ath/webapp"
lineNumber="0" message="Reporting :: Using old project structure&#xA;For details, see:
https://github.wdf.sap.coath.oath/pathpath.nmd" severity="Info" />
   </project>
</report>


--> Is there any way to have com.databricks.spark.xml write all the attributes into
one cell as a string, so that I can come up with my own way of splitting and transforming
them into a table? Does anyone know how I can read in such a file?
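
To illustrate the "split and transform into a table" idea, here is a minimal sketch that uses only Python's standard library, not com.databricks.spark.xml. It assumes each <check> element should become one row carrying its own attributes plus those of its parent <project>. The names SAMPLE and flatten_checks and the "project_" column prefix are made up for this illustration, and the XML is an abbreviated copy of the structure above.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the report structure from the message (assumption:
# only a few of the attributes are kept to keep the example short).
SAMPLE = """<report format="1.0">
   <creationTime millis="1468158875331" readable="2016-07-10 13:54:35 +0000" />
   <project artifactid="fin.ap.balances.display" version="4.0.7-SNAPSHOT">
      <check columnNumber="0" errorType="PREVIOUS_PROJECT_VERSION" lineNumber="0" severity="Info" />
      <check columnNumber="0" errorType="PROJECT_OLD_STRUCTURE" lineNumber="0" severity="Info" />
   </project>
</report>
"""


def flatten_checks(xml_text):
    """Return one dict per <check>, merged with its parent <project> attributes."""
    root = ET.fromstring(xml_text)
    rows = []
    for project in root.iter("project"):
        # Prefix project-level attributes so they cannot collide with
        # attribute names used on <check>.
        project_attrs = {"project_" + k: v for k, v in project.attrib.items()}
        for check in project.iter("check"):
            row = dict(project_attrs)
            row.update(check.attrib)
            rows.append(row)
    return rows


rows = flatten_checks(SAMPLE)
for row in rows:
    print(row)
```

Each resulting dict is one table row, so a list like this could then be turned into a DataFrame or written out as CSV. Whether this scales to the real files depends on their size, since the sketch loads the whole document into memory.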
thanks much,
best,
jonathan



Jonathan von Rüden
Enterprise Analytics

SAP France | Paris
Mobile: +33 68 221-2425
Email: Jonathan.von.rueden@sap.com

