orc-commits mailing list archives

From omal...@apache.org
Subject orc git commit: Push pull request #108 (advanced writing example) to site.
Date Tue, 25 Apr 2017 21:23:54 GMT
Repository: orc
Updated Branches:
  refs/heads/asf-site 20d4e29b9 -> 4fdca69c5


Push pull request #108 (advanced writing example) to site.

Signed-off-by: Owen O'Malley <omalley@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/orc/repo
Commit: http://git-wip-us.apache.org/repos/asf/orc/commit/4fdca69c
Tree: http://git-wip-us.apache.org/repos/asf/orc/tree/4fdca69c
Diff: http://git-wip-us.apache.org/repos/asf/orc/diff/4fdca69c

Branch: refs/heads/asf-site
Commit: 4fdca69c5aee3e6ebd01359b8e630befdaea3ce2
Parents: 20d4e29
Author: Owen O'Malley <omalley@apache.org>
Authored: Tue Apr 25 14:23:01 2017 -0700
Committer: Owen O'Malley <omalley@apache.org>
Committed: Tue Apr 25 14:23:01 2017 -0700

----------------------------------------------------------------------
 docs/core-java.html | 65 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/orc/blob/4fdca69c/docs/core-java.html
----------------------------------------------------------------------
diff --git a/docs/core-java.html b/docs/core-java.html
index bc13b0c..0e46208 100644
--- a/docs/core-java.html
+++ b/docs/core-java.html
@@ -1318,6 +1318,7 @@ keys and values.</p>
 
 <h2 id="writing-orc-files">Writing ORC Files</h2>
 
+<h3 id="simple-example">Simple Example</h3>
 <p>To write an ORC file, you need to define the schema and use the
 <a href="/api/orc-core/index.html?org/apache/orc/OrcFile.html">OrcFile</a>
 class to create a
@@ -1355,6 +1356,70 @@ if (batch.size != 0) {
 writer.close();
 </code></pre>
 
+<h3 id="advanced-example">Advanced Example</h3>
+
+<p>The following example writes an ORC file with two integer
+columns and a map column. Each row’s map has 5 elements with keys
+ranging from “row &lt;r&gt;.0” to “row &lt;r&gt;.4”, where &lt;r&gt; is the row number.</p>
+
+<pre><code class="language-java">Path testFilePath = new Path("advanced-example.orc");
+Configuration conf = new Configuration();
+
+TypeDescription schema =
+    TypeDescription.fromString("struct&lt;first:int," +
+                               "second:int,third:map&lt;string,int&gt;&gt;");
+
+Writer writer =
+    OrcFile.createWriter(testFilePath,
+        OrcFile.writerOptions(conf).setSchema(schema));
+
+VectorizedRowBatch batch = schema.createRowBatch();
+LongColumnVector first = (LongColumnVector) batch.cols[0];
+LongColumnVector second = (LongColumnVector) batch.cols[1];
+
+// Define the map. You also need to cast the key and value vectors
+MapColumnVector map = (MapColumnVector) batch.cols[2];
+BytesColumnVector mapKey = (BytesColumnVector) map.keys;
+LongColumnVector mapValue = (LongColumnVector) map.values;
+
+// Each map has 5 elements
+final int MAP_SIZE = 5;
+final int BATCH_SIZE = batch.getMaxSize();
+
+// Ensure the map is big enough
+mapKey.ensureSize(BATCH_SIZE * MAP_SIZE, false);
+mapValue.ensureSize(BATCH_SIZE * MAP_SIZE, false);
+
+// Add 1500 rows to the file
+for (int r = 0; r &lt; 1500; ++r) {
+  int row = batch.size++;
+
+  first.vector[row] = r;
+  second.vector[row] = r * 3;
+
+  map.offsets[row] = map.childCount;
+  map.lengths[row] = MAP_SIZE;
+  map.childCount += MAP_SIZE;
+
+  for (int mapElem = (int) map.offsets[row];
+       mapElem &lt; map.offsets[row] + MAP_SIZE; ++mapElem) {
+    String key = "row " + r + "." + (mapElem - map.offsets[row]);
+    mapKey.setVal(mapElem, key.getBytes(StandardCharsets.UTF_8));
+    mapValue.vector[mapElem] = mapElem;
+  }
+  if (row == BATCH_SIZE - 1) {
+    writer.addRowBatch(batch);
+    batch.reset();
+  }
+}
+if (batch.size != 0) {
+  writer.addRowBatch(batch);
+  batch.reset();
+}
+writer.close();
+
+</code></pre>
+
 <h2 id="reading-orc-files">Reading ORC Files</h2>
 
 <p>To read ORC files, use the

