hadoop-common-issues mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [hadoop] steveloughran commented on a change in pull request #1890: HADOOP-16854 Fix to prevent OutOfMemoryException and Make the threadpool and bytebuffer pool common across all AbfsOutputStream instances
Date Wed, 18 Mar 2020 14:10:16 GMT
steveloughran commented on a change in pull request #1890: HADOOP-16854 Fix to prevent OutOfMemoryException
and Make the threadpool and bytebuffer pool common across all AbfsOutputStream instances
URL: https://github.com/apache/hadoop/pull/1890#discussion_r394372894
 
 

 ##########
 File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsOutputStream.java
 ##########
 @@ -63,20 +70,53 @@
   private final int bufferSize;
   private byte[] buffer;
   private int bufferIndex;
-  private final int maxConcurrentRequestCount;
+
+  private static int maxConcurrentRequestcount;
+  private static int maxBufferCount;
 
   private ConcurrentLinkedDeque<WriteOperation> writeOperations;
-  private final ThreadPoolExecutor threadExecutor;
-  private final ExecutorCompletionService<Void> completionService;
+  private static final Object INIT_LOCK = new Object();
+  private static ThreadPoolExecutor threadExecutor;
+  private static ExecutorCompletionService<Void> completionService;
+
+  private static final int ONE_MB = 1024 * 1024;
+  private static final int HUNDRED_MB = 100 * ONE_MB;
+  private static final int MIN_MEMORY_THRESHOLD = HUNDRED_MB;
 
   /**
    * Queue storing buffers with the size of the Azure block ready for
    * reuse. The pool allows reusing the blocks instead of allocating new
    * blocks. After the data is sent to the service, the buffer is returned
    * back to the queue
    */
-  private final ElasticByteBufferPool byteBufferPool
-          = new ElasticByteBufferPool();
+  private static final ElasticByteBufferPool BYTE_BUFFER_POOL
+      = new ElasticByteBufferPool();
+  private static AtomicInteger buffersToBeReturned = new AtomicInteger(0);
+
+  static {
+    if (threadExecutor == null) {
+      synchronized (INIT_LOCK) {
+        if (threadExecutor == null) {
+          int availableProcessors = Runtime.getRuntime().availableProcessors();
+          maxConcurrentRequestcount = 4 * availableProcessors;
+          maxBufferCount = maxConcurrentRequestcount + availableProcessors + 1;
 
 Review comment:
   I don't like the hard-coded assumptions about the number of CPUs and the amount of
memory that can be used for buffering.
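
One way to address this concern is to treat the CPU-derived values in the diff as defaults that a deployment can override. The sketch below is not from the pull request: the class name `PoolSizing` and both configuration keys are hypothetical, and a plain `Map<String, String>` stands in for a Hadoop `Configuration` so the example stays self-contained. The defaults mirror the arithmetic in the diff (`4 * availableProcessors` requests, plus `availableProcessors + 1` extra buffers). A memory threshold could presumably be handled the same way.

```java
import java.util.Map;

public class PoolSizing {
    // Hypothetical configuration keys, invented for this sketch only.
    static final String MAX_REQUESTS_KEY = "example.abfs.write.max.concurrent.requests";
    static final String MAX_BUFFERS_KEY = "example.abfs.write.max.buffers";

    final int maxConcurrentRequestCount;
    final int maxBufferCount;

    PoolSizing(Map<String, String> conf, int availableProcessors) {
        // Default taken from the diff: 4 * number of processors.
        int defaultRequests = 4 * availableProcessors;
        this.maxConcurrentRequestCount =
            positiveInt(conf, MAX_REQUESTS_KEY, defaultRequests);

        // Default taken from the diff: requests + processors + 1,
        // computed from the (possibly overridden) request count.
        int defaultBuffers = maxConcurrentRequestCount + availableProcessors + 1;
        this.maxBufferCount = positiveInt(conf, MAX_BUFFERS_KEY, defaultBuffers);
    }

    // Returns the configured value if it parses as a positive int,
    // otherwise falls back to the supplied default.
    static int positiveInt(Map<String, String> conf, String key, int fallback) {
        String raw = conf.get(key);
        if (raw == null) {
            return fallback;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : fallback;
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();

        // No overrides: sizes follow the CPU-derived defaults.
        PoolSizing defaults = new PoolSizing(Map.of(), cpus);
        System.out.println("default requests=" + defaults.maxConcurrentRequestCount
            + " buffers=" + defaults.maxBufferCount);

        // Operator caps concurrency explicitly; the buffer default follows it.
        PoolSizing capped = new PoolSizing(Map.of(MAX_REQUESTS_KEY, "8"), cpus);
        System.out.println("capped requests=" + capped.maxConcurrentRequestCount
            + " buffers=" + capped.maxBufferCount);
    }
}
```

With this shape the static initializer would no longer need to guess: the first stream to be created could size the shared pools from its configuration, and hosts where the CPU count is a poor proxy for available memory could set the limits directly.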
