hadoop-common-issues mailing list archives

From "Anu Engineer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15204) Add Configuration API for parsing storage sizes
Date Fri, 02 Feb 2018 03:01:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16349709#comment-16349709 ]

Anu Engineer commented on HADOOP-15204:

[~chris.douglas] I have attached patch v2 that takes care of the comments and checkstyle issues.
{quote}I find the API intuitive, but that is not universal (e.g., HDFS-9847). Explaining it
has taken more cycles than I expected, and perhaps more than a good API should.{quote}
Thank you for the time and comments. Personally, I find the _getTimeDuration_ API extremely intuitive,
hence I imitated it for this API. As for others, you have done the heavy lifting of educating
the crowd; I will just ride on your coattails.
{quote}TERRABYTES is misspelled.{quote}
Thanks for catching that; fixed.
{quote}Is long insufficient as a return type for getStorageSize? I appreciate future-proofing,
but for Configuration values, that's what, ~8 petabytes?{quote}
I started with long; the real issue was returning rounded numbers for large storage units.
Rounding causes a significant loss when we convert from _x bytes_ to _y exabytes_. Hence I
went with the principle of least surprise and decided to return double.
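
To make the rounding loss concrete, here is a minimal sketch and not the code from the patch. The class name, the helper names, and the 4-digit scale are assumptions for illustration, and 1 EB is taken to be 2^60 bytes.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class StorageRoundingSketch {
    // Assumption: binary units, so one exabyte is 2^60 bytes.
    static final BigDecimal BYTES_PER_EB = BigDecimal.valueOf(1L << 60);

    // Integral return type: the fractional part is rounded away.
    public static long bytesToExabytesAsLong(long bytes) {
        return BigDecimal.valueOf(bytes)
            .divide(BYTES_PER_EB, 0, RoundingMode.HALF_UP)
            .longValueExact();
    }

    // Double return type: keeps (here) four fractional digits.
    public static double bytesToExabytesAsDouble(long bytes) {
        return BigDecimal.valueOf(bytes)
            .divide(BYTES_PER_EB, 4, RoundingMode.HALF_UP)
            .doubleValue();
    }

    public static void main(String[] args) {
        long quarterExabyte = 1L << 58;
        System.out.println(bytesToExabytesAsLong(quarterExabyte));
        System.out.println(bytesToExabytesAsDouble(quarterExabyte));
    }
}
```

With a long return type, a quarter of an exabyte collapses to zero, while the double keeps the fraction.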
{quote}Why ROUND_UP of the options? Just curious.{quote}
I was using RoundingMode.HALF_UP in divide and now I do that for multiply too, just to be

The reason for using [HALF_UP|https://docs.oracle.com/javase/8/docs/api/java/math/BigDecimal.html#ROUND_HALF_UP]
is that it is probably the least surprising result for most users. From the Doc: {{Note that
this is the rounding mode that most of us were taught in grade school.}}
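
For comparison, a small sketch of how HALF_UP differs from a couple of the other rounding modes in java.math. The class and method names are hypothetical and exist only for this illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class HalfUpSketch {
    // Rounds a decimal string to a whole number using the given mode.
    public static BigDecimal round(String value, RoundingMode mode) {
        return new BigDecimal(value).setScale(0, mode);
    }

    public static void main(String[] args) {
        for (RoundingMode mode : new RoundingMode[] {
                RoundingMode.HALF_UP, RoundingMode.HALF_EVEN, RoundingMode.UP}) {
            System.out.println(mode + ": 2.5 -> " + round("2.5", mode)
                + ", 2.1 -> " + round("2.1", mode));
        }
    }
}
```

HALF_UP rounds a tie such as 2.5 away from zero and leaves 2.1 at 2, whereas UP pushes any nonzero fraction to the next whole number and HALF_EVEN sends the tie to the even neighbour.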
{quote}Storage units are more likely to be exact powers{quote}
This is the curse of writing a unit as a library; we need to be cognizant of that single use
case which will break us. Hence I have used BigDecimal to be safe and correct, and return doubles.
It yields values that people expect.


> Add Configuration API for parsing storage sizes
> -----------------------------------------------
>                 Key: HADOOP-15204
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15204
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 3.1.0
>            Reporter: Anu Engineer
>            Assignee: Anu Engineer
>            Priority: Minor
>             Fix For: 3.1.0
>         Attachments: HADOOP-15204.001.patch, HADOOP-15204.002.patch
> Hadoop has a lot of configurations that specify memory and disk size. This JIRA proposes
> to add an API like {{Configuration.getStorageSize}} which will allow users
> to specify units like KB, MB, GB etc. This JIRA is inspired by HADOOP-8608 and Ozone.
> Adding {{getTimeDuration}} support was a great improvement for the Ozone code base; this JIRA
> hopes to do the same thing for configs that deal with disk and memory usage.
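
As a rough illustration of the idea in the description, the sketch below parses a value such as "10GB" and converts it to a requested unit. It is not the patch's implementation: the class, the enum, the binary unit sizes (1 KB = 1024 bytes), and the 4-digit scale are all assumptions made for this example.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Locale;

public class StorageSizeSketch {
    // Hypothetical unit table; B is listed last so that "MB" is not matched as "B".
    public enum Unit {
        EB(60), PB(50), TB(40), GB(30), MB(20), KB(10), B(0);

        final BigDecimal bytes;

        Unit(int shift) {
            this.bytes = BigDecimal.valueOf(1L << shift);
        }
    }

    // Parses "<number><unit>" and returns the size expressed in the target unit.
    public static double parse(String value, Unit target) {
        String s = value.trim().toUpperCase(Locale.ROOT);
        for (Unit u : Unit.values()) {
            if (s.endsWith(u.name())) {
                String number = s.substring(0, s.length() - u.name().length()).trim();
                return new BigDecimal(number)
                    .multiply(u.bytes)
                    .divide(target.bytes, 4, RoundingMode.HALF_UP)
                    .doubleValue();
            }
        }
        throw new IllegalArgumentException("No storage unit found in: " + value);
    }

    public static void main(String[] args) {
        System.out.println(parse("10GB", Unit.MB));
        System.out.println(parse("512 MB", Unit.GB));
    }
}
```

Returning a double from a BigDecimal computation mirrors the approach discussed in the comment above: the intermediate arithmetic is exact, and fractional results such as half a gigabyte survive the conversion.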

