tajo-dev mailing list archives

From Jihoon Son <jihoon...@apache.org>
Subject Re: Tajo storage layer
Date Sat, 01 Feb 2014 15:29:03 GMT
Hi, Min

The operation of StorageManagerV2 is as follows. The ScanScheduler
coordinates read requests for each disk. That is, when it receives a number
of read requests, it first finds the DiskFileScanScheduler that currently
has the fewest read requests assigned, and then assigns a read request to
it. This process is repeated for the remaining read requests. Each
DiskFileScanScheduler creates FileScanRunners for every request assigned to
it. A FileScanRunner simply reads data using a fixed-size buffer. You can
see the related issue at https://issues.apache.org/jira/browse/TAJO-178,
which should help you understand.
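
To make the assignment step concrete, here is a minimal sketch of the
"pick the scheduler with the fewest requests" loop described above. It is
not the actual Tajo code: the class LeastLoadedAssignment, the DiskQueue
type, and the assign method are hypothetical names used only for
illustration, requests are plain strings, and no threads or file reads are
involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LeastLoadedAssignment {

    /** Hypothetical stand-in for a per-disk scheduler; it only tracks its pending requests. */
    static final class DiskQueue {
        final int diskId;
        final List<String> pending = new ArrayList<>();

        DiskQueue(int diskId) {
            this.diskId = diskId;
        }
    }

    /** Assigns each request to the queue that currently holds the fewest requests. */
    static void assign(List<DiskQueue> queues, List<String> requests) {
        for (String request : requests) {
            DiskQueue target = queues.get(0);
            for (DiskQueue q : queues) {
                // Strict comparison: on a tie, the earliest queue in the list wins.
                if (q.pending.size() < target.pending.size()) {
                    target = q;
                }
            }
            target.pending.add(request);
        }
    }

    public static void main(String[] args) {
        List<DiskQueue> queues = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            queues.add(new DiskQueue(i));
        }
        assign(queues, Arrays.asList("r0", "r1", "r2", "r3", "r4"));
        for (DiskQueue q : queues) {
            System.out.println("disk " + q.diskId + " -> " + q.pending);
        }
    }
}
```

With three queues and five requests, the queue sizes never differ by more
than one. In StorageManagerV2 each queue would additionally hand its
requests to FileScanRunners, which this sketch leaves out.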

Although StorageManagerV2 was designed to improve read performance by
scheduling disk scans, its performance did not meet our expectations. As
you said, its thread model is too complex, which might degrade
performance. So StorageManager is mainly used instead of StorageManagerV2
(StorageManager is the default).


2014-02-01 Min Zhou <coderplay@gmail.com>:

> Hi all,
> Seems the thread model of tajo storage layer is quite complex.
> Each call of StorageManagerFactory.getStorageManager(TajoConf)  creates
> one instance of StorageManagerV2,  which creates a scan scheduler thread
> and several disk file scan schedulers threads.  Why those threads are
> needed? What's their function?  How do those threads work with file
> scanners?
> Regards,
> Min
> --
> My research interests are distributed systems, parallel computing and
> bytecode based virtual machine.
> My profile:
> http://www.linkedin.com/in/coderplay
> My blog:
> http://coderplay.javaeye.com
