hadoop-common-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-11540) Raw Reed-Solomon coder using Intel ISA-L library
Date Thu, 07 Apr 2016 20:18:25 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-11540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15230976#comment-15230976 ]

Colin Patrick McCabe commented on HADOOP-11540:
-----------------------------------------------

Thanks, [~drankye].

{code}
+  /**
+   * Convert an output bytes array buffer to direct ByteBuffer.
+   * @param output
+   * @return direct ByteBuffer
+   */
+  protected ByteBuffer convertOutputBuffer(byte[] output, int len) {
+    ByteBuffer directBuffer = ByteBuffer.allocateDirect(len);
+    return directBuffer;
+  }
{code}
Is it intentional that the "output" parameter is ignored here?
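
To make the question concrete, here is a minimal sketch of the one case where ignoring "output" would be harmless: the direct buffer is only scratch space, and the caller copies the result back into the array afterwards. The {{copyBack}} helper and the class name are hypothetical and not part of the patch; the loop in {{main}} merely stands in for the native coder.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class OutputBufferSketch {
  // Mirrors the method under review: the byte[] is never read, only len is used.
  public static ByteBuffer convertOutputBuffer(byte[] output, int len) {
    return ByteBuffer.allocateDirect(len);
  }

  // Hypothetical helper (an assumption, not in the patch): copies the first
  // len bytes of the direct buffer back into the caller's array.
  public static void copyBack(ByteBuffer direct, byte[] output, int offset, int len) {
    ByteBuffer view = direct.duplicate(); // independent position and limit
    view.position(0);
    view.limit(len);
    view.get(output, offset, len);
  }

  public static void main(String[] args) {
    byte[] output = new byte[4];
    ByteBuffer direct = convertOutputBuffer(output, output.length);
    // Stand-in for the native coder writing its result into the direct buffer.
    for (int i = 0; i < output.length; i++) {
      direct.put(i, (byte) (i + 1));
    }
    copyBack(direct, output, 0, output.length);
    System.out.println(Arrays.toString(output)); // prints [1, 2, 3, 4]
  }
}
```

If that is the intent, the parameter could be dropped from the signature, or the Javadoc could say that the array contents are not copied.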

bq. For initOutputs and resetBuffer, good catch! About this I initially thought as you suggested,
instead of having initOutputs, just letting concrete coders to override resetBuffer, which
would be most flexible. Then I realized for Java coders, a default behavior can be provided
and used; for native coders, we can avoid having it because at the beginning of the encode()
call the native coder can memset the output buffers directly. If instead the native coder
has to provide resetBuffer, then a JNI function has to be added, which will be called some
times to reset output buffers. Considering the overhead in both implementation and extra JNI
calls, I used the initOutputs() approach.

Thanks for the explanation.  Why not just have the encode() function zero the output buffers in
every case?  I don't see how the pure Java code benefits from doing this differently, and it is
much simpler to understand if all the coders do it the same way.
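
A rough sketch of what I mean, using made-up names ({{SketchEncoder}}, {{XorEncoder}}, {{doEncode}}) rather than the classes in the patch: the base encode() clears the outputs once, and every concrete coder, Java or native, starts from zeroed buffers. The XOR coder is only a toy stand-in for the real RS math.

```java
import java.util.Arrays;

public class ZeroingEncodeSketch {
  /** Hypothetical stand-in for a raw encoder base class; not the Hadoop API. */
  public abstract static class SketchEncoder {
    // Zero every output buffer up front, then delegate to the concrete coder.
    public final void encode(byte[][] inputs, byte[][] outputs) {
      for (byte[] out : outputs) {
        Arrays.fill(out, (byte) 0);
      }
      doEncode(inputs, outputs);
    }

    protected abstract void doEncode(byte[][] inputs, byte[][] outputs);
  }

  /** Toy XOR parity coder: accumulates into outputs[0], so it needs zeroed outputs. */
  public static class XorEncoder extends SketchEncoder {
    @Override
    protected void doEncode(byte[][] inputs, byte[][] outputs) {
      for (byte[] in : inputs) {
        for (int i = 0; i < outputs[0].length; i++) {
          outputs[0][i] ^= in[i];
        }
      }
    }
  }

  public static void main(String[] args) {
    byte[][] inputs = {{1, 2, 3}, {4, 5, 6}};
    byte[][] outputs = {{9, 9, 9}}; // stale data left over from an earlier call
    new XorEncoder().encode(inputs, outputs);
    System.out.println(Arrays.toString(outputs[0])); // prints [5, 7, 5]
  }
}
```

With this shape there is no separate initOutputs() or resetBuffer() hook to override, and a native coder would do the same zeroing at the top of its own encode path.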

{code}
void setCoder(JNIEnv* env, jobject thiz, IsalCoder* pCoder) {
  jclass clazz = (*env)->GetObjectClass(env, thiz);
  jfieldID fid = (*env)->GetFieldID(env, clazz, "nativeCoder", "J");
  (*env)->SetLongField(env, thiz, fid, (jlong) pCoder);
}
{code}
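All of these JNI functions can fail.  For example, GetFieldID returns NULL and leaves an exception
pending if the field lookup fails.  You need to check for these failures and handle them.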

{{isAllowingChangeInputs}} and {{isAllowingVerboseDump}} should be renamed to {{allowChangeInputs}}
and {{allowVerboseDump}} for clarity.

> Raw Reed-Solomon coder using Intel ISA-L library
> ------------------------------------------------
>
>                 Key: HADOOP-11540
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11540
>             Project: Hadoop Common
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7285
>            Reporter: Zhe Zhang
>            Assignee: Kai Zheng
>         Attachments: HADOOP-11540-initial.patch, HADOOP-11540-v1.patch, HADOOP-11540-v10.patch,
HADOOP-11540-v2.patch, HADOOP-11540-v4.patch, HADOOP-11540-v5.patch, HADOOP-11540-v6.patch,
HADOOP-11540-v7.patch, HADOOP-11540-v8.patch, HADOOP-11540-v9.patch, HADOOP-11540-with-11996-codes.patch,
Native Erasure Coder Performance - Intel ISAL-v1.pdf
>
>
> This is to provide RS codec implementation using Intel ISA-L library for encoding and
decoding.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
