flink-user mailing list archives

From Jeroen Steggink | knowsy <jer...@knowsy.nl>
Subject Jars uploaded to taskmanager are deleted but not free'ed by OS
Date Wed, 18 Apr 2018 13:44:48 GMT
Hi,

I'm having trouble running the Flink TaskManager in a Docker 
container (OpenShift). The container's internal storage is filling up, 
probably because the deleted jar files in the blob store are still in 
use, so the OS does not free the space they occupy.

We are using Apache Beam to start an Apache Flink process, so the jars 
are sent to Apache Flink every time we fire a batch.

I enabled debug logging, but I can't find anything showing these 
deletes. Does anyone have an idea why the space is not freed? I checked 
the open file descriptors, and the deleted files are indeed the jars 
from the blob store:

208875129    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/142
-> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_90964be94a2f4471844a00284e44fb32/blob_p-5202910b36af8c12548df97a7e4a057b77786217-ffa3f85003b1f124cd1cccdb0f72a8e0\
(deleted)

208875130    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/143
-> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_b7c00268b488411a8f6e1af984bcdcc2/blob_p-5202910b36af8c12548df97a7e4a057b77786217-8bab07adb34d4ce8de20846ec72059ce\
(deleted)

208875131    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/144
-> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_46183ac02f1dcd3543f8e481f59948b5/blob_p-5202910b36af8c12548df97a7e4a057b77786217-ac6bc86d8932e7d631416d9bafab4ab4\
(deleted)

208875132    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/145
-> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_717bf3f4b3f80700c1cc44d6076c2aca/blob_p-5202910b36af8c12548df97a7e4a057b77786217-780dd2383dee11a2361ac20a5da7bbb8\
(deleted)

208875133    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/146
-> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_22e67caac65c9c4e537caa3b072b8cc3/blob_p-5202910b36af8c12548df97a7e4a057b77786217-e0b523663672c641b368e5d1440b0b70\
(deleted)

208875134    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/147
-> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_3afe5b02ccb95b3494a1acd8677c66f0/blob_p-5202910b36af8c12548df97a7e4a057b77786217-9a8cd48c09a4b518adf0309a0255b339\
(deleted)

208875135    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/148
-> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_cb024c561531905e81c9768ec62a2fe0/blob_p-5202910b36af8c12548df97a7e4a057b77786217-0addc83aaf9a2f781528ad035fd79cc8\
(deleted)

208875136    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/149
-> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_d3dc0b0608d71ffa77575771f088e80e/blob_p-5202910b36af8c12548df97a7e4a057b77786217-c9015b012ec4b249f32872471a31a500\
(deleted)

208875137    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/150
-> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_1b4cdb127bb2c345e1b099e3e446bf58/blob_p-5202910b36af8c12548df97a7e4a057b77786217-ac4457b393b7ff0565c47c1e38786005\
(deleted)

208875138    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/151
-> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_8c23503c614a88e8c8f7a54a31e41886/blob_p-5202910b36af8c12548df97a7e4a057b77786217-d096b3ef150bf7e8e98224e0b8f17292\
(deleted)

208875139    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/152
-> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_e7c8132da483bd14e5abfe9390adeeb1/blob_p-5202910b36af8c12548df97a7e4a057b77786217-f370d8dcad0cb36581f9a5f1568e1487\
(deleted)

208875140    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/153
-> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_cbee9f15b0c6adba0f5ddb67b587b607/blob_p-5202910b36af8c12548df97a7e4a057b77786217-9ae77c3419d77adab8f44258ca4290c5\
(deleted)

208875141    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/154
-> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_29c5a145ae231be4c0d53717625c3938/blob_p-5202910b36af8c12548df97a7e4a057b77786217-76bb4d83f962a887d41effb2646bd63d\
(deleted)
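
For readers unfamiliar with the "(deleted)" marker in the listing above: on 
Linux, deleting a file only removes its name, and the data stays on disk until 
every open handle to it is closed. The following is a minimal sketch of that 
behaviour, not Flink code. The class and method names are made up for 
illustration, and it assumes a POSIX-style filesystem and Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class OpenDeletedFileDemo {

    /**
     * Writes a temp file, opens it, deletes it, and then reads through the
     * still-open handle. Returns the number of bytes readable after the delete.
     * (Hypothetical helper, for illustration only.)
     */
    static int readAfterDelete(byte[] payload) throws IOException {
        Path file = Files.createTempFile("blob_demo", ".jar");
        Files.write(file, payload);
        try (InputStream in = Files.newInputStream(file)) {
            // Removes the directory entry; the open stream keeps the data alive.
            Files.delete(file);
            if (Files.exists(file)) {
                throw new IllegalStateException("expected the path to be gone");
            }
            // The content is still readable, so the disk space is still in use.
            return in.readAllBytes().length;
        } // Only here, when the stream is closed, can the OS reclaim the space.
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAfterDelete(new byte[4096]));
    }
}
```

If the same thing happens with a stream or class loader that is never closed, 
the space stays allocated for the lifetime of the process, which would match 
the symptoms described here.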



There are several places in the code where the boolean returned by the 
file delete is not checked, so we have no idea whether the file was 
deleted successfully. Maybe these calls could be changed to something like 
java.nio.file.Files.delete, which throws an IOException when something 
goes wrong. This is not a solution, but it would make it more 
transparent when things go wrong.
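
To illustrate the difference between the two delete APIs, here is a small 
sketch. It is not taken from the Flink code base. The class and helper names 
are hypothetical, and a non-empty directory is used only because it is an easy 
way to make a delete fail on any platform.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeleteDemo {

    /** Legacy API: a failure is only visible in the boolean return value. */
    static boolean deleteLegacy(Path path) {
        return path.toFile().delete();
    }

    /**
     * NIO API: a failure raises an exception that says what went wrong.
     * Returns "deleted" on success, otherwise the exception's class name.
     */
    static String deleteNio(Path path) {
        try {
            Files.delete(path);
            return "deleted";
        } catch (IOException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("delete-demo");
        Path child = Files.createFile(dir.resolve("blob.jar"));

        // Both calls fail because the directory is not empty.
        System.out.println("File.delete returned: " + deleteLegacy(dir));
        System.out.println("Files.delete result:  " + deleteNio(dir));

        // Clean up in the right order.
        Files.delete(child);
        Files.delete(dir);
    }
}
```

With File.delete the caller has to check the result explicitly, and an 
ignored false is silent. With Files.delete the failure cannot be ignored by 
accident, and the exception type gives a hint about the cause.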

Thanks,
Jeroen

