The "Streaming_JA" page has been changed by yutuki.
http://wiki.apache.org/cassandra/Streaming_JA
--------------------------------------------------
New page:
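When data has to be moved between the nodes that make up a Cassandra cluster, the transfer proceeds as follows.
1. The receiving node sends the sending node the ranges of data it needs.
1. The sending node copies the required SSTable files for streaming, according to the range information it received. This step is called "Anti-Compaction" because it is the reverse of "Compaction", which produces a single SSTable from multiple SSTables.
1. The sending node first sends the receiving node a list of the data it is going to send, and then starts transferring the actual data.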
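The three steps above can be pictured with a small toy model. This is only a sketch under stated assumptions, not Cassandra's implementation: SSTables are modeled as plain dictionaries of integer tokens, `TokenRange`, `anti_compact` and `stream` are hypothetical names, the `(left, right]` range semantics are assumed, and the way filtered rows are grouped into output tables is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenRange:
    """Toy range (left, right] over integer tokens (assumed semantics)."""
    left: int
    right: int

    def contains(self, token: int) -> bool:
        return self.left < token <= self.right


def anti_compact(sstables, ranges):
    """Step 2: copy only the rows that fall inside the requested ranges.

    Each SSTable is modeled as a dict {token: value}. One output table is
    produced per input table that has at least one matching row.
    """
    out = []
    for table in sstables:
        subset = {t: v for t, v in sorted(table.items())
                  if any(r.contains(t) for r in ranges)}
        if subset:
            out.append(subset)
    return out


def stream(source_sstables, requested_ranges):
    """Run the three steps and return (manifest, received_tables)."""
    # Step 1: the destination sends the ranges it needs (the argument here).
    # Step 2: the source anti-compacts its SSTables down to those ranges.
    to_send = anti_compact(source_sstables, requested_ranges)
    # Step 3: the source announces what it will send, then sends the data.
    manifest = [f"stream-{i}: {len(t)} rows" for i, t in enumerate(to_send)]
    received = [dict(t) for t in to_send]
    return manifest, received


source = [{5: "a", 15: "b", 25: "c"}, {12: "d", 40: "e"}]
manifest, received = stream(source, [TokenRange(10, 20)])
print(manifest)
print(received)
```

In this model the destination never sees rows outside the range it asked for, which is the point of the anti-compaction step.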
In 0.6, the status of streaming on both source and destination nodes can be monitored through
the `org.apache.cassandra.streaming.StreamingService` MBean. Its `Status` attribute
gives a quick indication of what a node is doing with respect to streaming.
Step 2 (anti-compaction) is what takes the most time on most systems. The destination is idle
during this stage; to monitor anti-compaction progress, check the `Compaction` MBean on the
source.
Once step 3 begins the actual data transfer, the sending node reports a status of
`"Waiting for transfer to $some_node to complete."` and the receiving node reports
`"Receiving stream"` while it receives stream data. The `StreamDestinations` attribute lists
the hosts the current node is sending stream data to, and `StreamSources` lists the hosts it
is receiving stream data from.
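If the `Status` value is collected by a script, the two status strings quoted above are enough to tell which role a node is playing. The following is a minimal sketch; `classify_status` is a hypothetical helper, the host `10.0.0.2` is made up, and any status text other than the two quoted strings is simply reported as `"other"` because the source does not list further values.

```python
import re

# Pattern for the sender-side status string quoted in the text.
_SENDING = re.compile(r"^Waiting for transfer to (?P<node>\S+) to complete\.?$")
# Prefix of the receiver-side status string quoted in the text.
_RECEIVING = "Receiving stream"


def classify_status(status: str):
    """Return (role, peer) for a Status string; peer is None if unknown."""
    text = status.strip()
    match = _SENDING.match(text)
    if match:
        return ("sending", match.group("node"))
    if text.startswith(_RECEIVING):
        return ("receiving", None)
    return ("other", None)


print(classify_status("Waiting for transfer to 10.0.0.2 to complete."))
print(classify_status("Receiving stream"))
```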
The operations `getOutgoingFiles(host)` and `getIncomingFiles(host)` each return a list of
strings describing the status of the individual files being streamed to or from a given host.
Each string follows this format: `[path to file] [bytes sent/received]/[file size]`.
If you think that streaming is taking too long on your cluster, first check `StreamSources`
or `StreamDestinations` to find out which hosts are streaming files. Then pass those hosts to
`getOutgoingFiles()` or `getIncomingFiles()` to check the status of individual files on the
problematic source and destination nodes. Streaming is conducted in 32 MB chunks, so refresh
the file status after a few seconds to see whether the sent/received values change. If they
do not change, or change more slowly than you'd like, something is wrong. Keep in mind that a
source node can only stream a single file at a time, but a destination node can receive
several files simultaneously.
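The refresh-and-compare check described above can be scripted once the file status strings have been collected. The sketch below assumes that the square brackets in the format are placeholders rather than literal characters, that the byte counts are plain integers, and that "32MB" means 32 * 1024 * 1024 bytes. `parse_file_status`, `stalled_files` and `chunks_remaining` are hypothetical helpers, and the file paths are made up.

```python
from typing import NamedTuple

CHUNK_BYTES = 32 * 1024 * 1024  # assumed meaning of the 32 MB chunk size


class FileProgress(NamedTuple):
    path: str
    done: int   # bytes sent or received so far
    total: int  # file size in bytes


def parse_file_status(line: str) -> FileProgress:
    """Parse '<path> <bytes done>/<file size>'; the path may contain spaces."""
    path, counts = line.strip().rsplit(" ", 1)
    done, total = counts.split("/")
    return FileProgress(path, int(done), int(total))


def stalled_files(before, after):
    """Return paths whose byte count did not advance between two snapshots.

    Files that are already complete in the later snapshot are not reported.
    """
    earlier = {p.path: p.done for p in map(parse_file_status, before)}
    stalled = []
    for p in map(parse_file_status, after):
        if p.done < p.total and earlier.get(p.path) == p.done:
            stalled.append(p.path)
    return stalled


def chunks_remaining(p: FileProgress) -> int:
    """Number of chunks still to transfer, rounding up."""
    return -(-(p.total - p.done) // CHUNK_BYTES)


first = ["/data/Standard1-5-Data.db 33554432/134217728",
         "/data/Standard1-7-Data.db 0/67108864"]
later = ["/data/Standard1-5-Data.db 67108864/134217728",
         "/data/Standard1-7-Data.db 0/67108864"]
print(stalled_files(first, later))
print(chunks_remaining(parse_file_status(later[0])))
```

Because a source node streams one file at a time, files that are still waiting for their turn on the source will also show no progress; on that side only the file currently being sent is a meaningful signal.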