cassandra-commits mailing list archives

From "Shaurya Gupta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10273) Reduce number of data directory scans during startup
Date Fri, 01 Mar 2019 10:24:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16781546#comment-16781546 ]

Shaurya Gupta commented on CASSANDRA-10273:
-------------------------------------------

Hi,

Could you please assign this issue to me? The current assignee may no longer be interested,
as there has been no activity on this issue for a long time.

I have uploaded a patch that merges the 3rd and 4th scans. Please have a look. Since I am just
starting with the Cassandra code base, it could be very wrong, so please point out any
mistakes. To verify the patch, I started the Cassandra daemon and ran a few commands.


For merging the 1st and 2nd scans: it appears that the 1st visits each file in the data
directory and checks that it has a supported version, while the 2nd visits each keyspace
directory and removes all the files that are no longer required.
scrubDataDirectories takes as input the cfm obtained from the keyspace name, so merging the
two would mean working out the keyspace from the data directory path, which looks a bit
"dirty".
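
A rough sketch of what deriving the keyspace from the path would involve. This is not
Cassandra code: the class and method names are hypothetical, it uses only java.nio.file,
and it assumes a layout of <data dir>/<keyspace>/<table dir>/<file>.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;
import java.util.stream.Stream;

// Hypothetical helper, for illustration only; not part of Cassandra.
final class SinglePassScan {

    // Infers the keyspace from the first path element below the data root.
    // Assumes <dataRoot>/<keyspace>/<table dir>/<file>; returns empty for
    // files outside the root or not nested at least that deep.
    static Optional<String> keyspaceOf(Path dataRoot, Path file) {
        if (!file.startsWith(dataRoot)) {
            return Optional.empty();
        }
        Path relative = dataRoot.relativize(file);
        if (relative.getNameCount() < 3) {
            return Optional.empty();
        }
        return Optional.of(relative.getName(0).toString());
    }

    // Walks the data root once and groups regular files by inferred keyspace,
    // so that per-file and per-keyspace checks could share a single scan.
    static Map<String, List<Path>> groupByKeyspace(Path dataRoot) throws IOException {
        Map<String, List<Path>> byKeyspace = new TreeMap<>();
        try (Stream<Path> files = Files.walk(dataRoot)) {
            files.filter(Files::isRegularFile).forEach(file ->
                keyspaceOf(dataRoot, file).ifPresent(keyspace ->
                    byKeyspace.computeIfAbsent(keyspace, k -> new ArrayList<>()).add(file)));
        }
        return byKeyspace;
    }
}
```

Even in this form, the keyspace name is recovered purely from directory naming, and any
special handling for the system keyspace would still have to be added by the caller.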

Merging these two would in any case require handling the system keyspace separately.

If the above approach to merging the 1st and 2nd scans looks fine, I can go ahead with it.
However, I think we can skip merging them for the reasons given above. Please point out if
I am missing something.

Thanks

[~snazy], [~jjirsa]

> Reduce number of data directory scans during startup
> ----------------------------------------------------
>
>                 Key: CASSANDRA-10273
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10273
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Startup and Shutdown
>            Reporter: Robert Stupp
>            Assignee: Giampaolo
>            Priority: Minor
>              Labels: lhf
>         Attachments: patch_CASSANDRA-10273_trunk
>
>
> ATM we scan each data directory four times. We could easily reduce that to at least two,
> maybe to one.
> # pre-flight (startup tests) scrub
> # pre-flight (startup tests) sstable min version
> # {{ColumnFamilyStore.createColumnFamilyStore}}
> # {{ColumnFamilyStore.<init>}} (if {{loadSSTables==true}})
> First two pre-flight tests could be combined to one and 3+4 could also be combined, as
> both appear at pretty related code paths.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org

