hadoop-common-issues mailing list archives

From "Anu Engineer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15204) Add Configuration API for parsing storage sizes
Date Fri, 02 Feb 2018 03:01:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16349709#comment-16349709 ]

Anu Engineer commented on HADOOP-15204:
---------------------------------------

[~chris.douglas] I have attached patch v2 that takes care of the comments and checkstyle issues.
{quote}I find the API intuitive, but that is not universal (e.g., HDFS-9847). Explaining it
has taken more cycles than I expected, and perhaps more than a good API should.
{quote}
Thank you for the time and comments. Personally, I find the _getTimeDuration_ API extremely
intuitive, which is why this API imitates it. As for others, you have done the heavy lifting
of educating the crowd; I will just ride on your coattails.
{quote}TERRABYTES is misspelled.
{quote}
Thanks for catching that; fixed.
{quote}Is long insufficient as a return type for getStorageSize? I appreciate future-proofing,
but for Configuration values, that's what, ~8 petabytes?
{quote}
I started with long; the real issue was returning rounded numbers for large storage units.
Rounding causes a significant loss when we convert from _x bytes_ to _y exabytes_. Hence I
followed the principle of least surprise and decided to return double.
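
To illustrate the rounding loss, here is a minimal sketch and not the code from the patch. The class name, the helper names, and the choice of binary units (1 EB = 2^60 bytes) are assumptions made for the example.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper, for illustration only; not part of the patch.
class StorageSizePrecision {
    // Assumption: binary units, so 1 EB = 2^60 bytes.
    static final long BYTES_PER_EB = 1L << 60;

    // Integer division silently drops the fractional part.
    static long toExabytesAsLong(long bytes) {
        return bytes / BYTES_PER_EB;
    }

    // BigDecimal division keeps the fraction; the result is returned as a double.
    static double toExabytesAsDouble(long bytes) {
        return BigDecimal.valueOf(bytes)
            .divide(BigDecimal.valueOf(BYTES_PER_EB), 4, RoundingMode.HALF_UP)
            .doubleValue();
    }

    public static void main(String[] args) {
        long bytes = BYTES_PER_EB + BYTES_PER_EB / 2; // 1.5 EB
        System.out.println(toExabytesAsLong(bytes));   // prints 1
        System.out.println(toExabytesAsDouble(bytes)); // prints 1.5
    }
}
```

With a long return type the caller sees 1 EB and has no way to tell that a third of the value was lost; the double keeps it.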
{quote}Why ROUND_UP of the options? Just curious.
{quote}
I was using RoundingMode.HALF_UP in divide, and now I use it for multiply too, just to be
consistent.

The reason for using [HALF_UP|https://docs.oracle.com/javase/8/docs/api/java/math/BigDecimal.html#ROUND_HALF_UP]
is that it is probably the least surprising result for most users. From the Doc: {{Note that
this is the rounding mode that most of us were taught in grade school.}}
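
A small sketch of how HALF_UP compares with two other standard rounding modes. The class and method names are made up for the example; only {{java.math.BigDecimal}} and {{java.math.RoundingMode}} are real APIs.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical demo class, for illustration only.
class RoundingModeDemo {
    // Rounds a decimal string to a whole number using the given mode.
    static BigDecimal round(String value, RoundingMode mode) {
        return new BigDecimal(value).setScale(0, mode);
    }

    public static void main(String[] args) {
        System.out.println(round("2.5", RoundingMode.HALF_UP));   // prints 3
        System.out.println(round("2.5", RoundingMode.HALF_EVEN)); // prints 2
        System.out.println(round("2.4", RoundingMode.HALF_UP));   // prints 2
        System.out.println(round("2.4", RoundingMode.UP));        // prints 3
    }
}
```

HALF_UP only moves away from zero on an exact tie or beyond, whereas UP does so for any non-zero fraction, which is the difference the question about ROUND_UP was getting at.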
{quote}Storage units are more likely to be exact powers
{quote}
This is the curse of writing unit conversion as a library; we need to be cognizant of the
single use case that will break us. Hence I have used BigDecimal to be safe and correct, and
return doubles. It yields values that people expect.

 

> Add Configuration API for parsing storage sizes
> -----------------------------------------------
>
>                 Key: HADOOP-15204
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15204
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 3.1.0
>            Reporter: Anu Engineer
>            Assignee: Anu Engineer
>            Priority: Minor
>             Fix For: 3.1.0
>
>         Attachments: HADOOP-15204.001.patch, HADOOP-15204.002.patch
>
>
> Hadoop has a lot of configurations that specify memory and disk size. This JIRA proposes
> to add an API like {{Configuration.getStorageSize}} which will allow users to specify
> units like KB, MB, GB etc. This JIRA is inspired by HADOOP-8608 and Ozone. Adding
> {{getTimeDuration}} support was a great improvement for the Ozone code base; this JIRA
> hopes to do the same thing for configs that deal with disk and memory usage.
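
As a rough sketch of what parsing a storage size with a unit suffix could look like (this is not the patch's implementation): the class {{StorageSizeSketch}}, its {{Unit}} enum, the {{parse}} method, the binary unit multipliers, and the decision to reject values without a suffix are all assumptions made for illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Locale;

// Hypothetical stand-in for illustration; not the Hadoop Configuration API.
class StorageSizeSketch {
    // Assumption: binary units, each step a factor of 1024.
    enum Unit {
        B(0), KB(10), MB(20), GB(30), TB(40), PB(50), EB(60);

        final BigDecimal bytes;

        Unit(int shift) {
            this.bytes = BigDecimal.valueOf(1L << shift);
        }
    }

    // Parses values such as "10 GB" or "512mb" and converts them to the target unit.
    static double parse(String value, Unit target) {
        String s = value.trim().toUpperCase(Locale.ROOT);
        Unit[] units = Unit.values();
        // Check longer suffixes first so that "KB" is not mistaken for "B".
        for (int i = units.length - 1; i >= 0; i--) {
            String suffix = units[i].name();
            if (s.endsWith(suffix)) {
                String number = s.substring(0, s.length() - suffix.length()).trim();
                return new BigDecimal(number)
                    .multiply(units[i].bytes)
                    .divide(target.bytes, 4, RoundingMode.HALF_UP)
                    .doubleValue();
            }
        }
        // Sketch-only choice: a value without a unit suffix is rejected.
        throw new IllegalArgumentException("No storage unit in: " + value);
    }

    public static void main(String[] args) {
        System.out.println(parse("10 GB", Unit.MB));   // prints 10240.0
        System.out.println(parse("1536 KB", Unit.MB)); // prints 1.5
    }
}
```

The caller names the unit it wants back, in the same spirit as {{getTimeDuration}}, so a config file can say "10 GB" while the code asks for megabytes.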



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

