cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Piotr Kołaczkowski (JIRA) <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-4049) Add generic way of adding SSTable components required custom compaction strategy
Date Tue, 11 Sep 2012 09:18:07 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452835#comment-13452835
] 

Piotr Kołaczkowski edited comment on CASSANDRA-4049 at 9/11/12 8:17 PM:
------------------------------------------------------------------------

Sorry for late reply, been busy fixing things in dse.

{quote}
{noformat}
catch (Exception e)
        {
            if (!"snapshots".equals(name) && !"backups".equals(name)
                    && !name.contains(".json"))
                logger.warn("Invalid file '{}' in data directory {}.", name, dir);
            return null;
        }
{noformat}
What was the reasoning behind this? Not saying it's wrong to remove it, but I want to make
sure we understand what it was supposed to do, before deciding we don't need it.
{quote}

--This check was there before. Actually because this check was firing out warnings, we created
this ticket ;)--
Oh, just noticed, you are referring to removed code. [~slebresne] said he could live without
them. But if you insist, I can think of bringing them back. Don't know how to do it yet, but
if it is important, I can try.

On the other hand, I left the warning about missing components (which is IMHO much more important
than some spurious components):
{noformat}
if (!new File(descriptor.filenameFor(component)).exists())
                    logger.error("Missing component: " + descriptor.filenameFor(component));
{noformat}

-----------------
 
{noformat}
catch (IOException e)
        {
            Set<Component> components = Sets.newHashSetWithExpectedSize(Component.TYPES.size());
            for (Component.Type componentType : Component.TYPES)
            {
                Component component = new Component(componentType);
                if (new File(desc.filenameFor(component)).exists())
                    components.add(component);
            }

            saveTOC(desc, components);
            return components;
        }
{noformat}

This one is for backwards compatibility. If we find an SSTable without a TOC component (from
previous version of C*), we just do what we always did - loop through all C* components. 

--------------------
{quote}
Use FileUtils.closeQuietly{quote}

Oh, yeah. I was looking for IOUtils.closeQuietly, and couldn't find it. Thanks, that is what
I needed!

{quote}
But probably simpler to just use Guava's Files.readLines{quote}

Ok, I fix it.

--------------------
{quote}
Do we not know what components are necessary at construction time? Would strongly prefer an
immutable set. Adding extra parameters to SSTW to do this is fine.{quote}

We do, but there is a chicken-and-egg problem here. CompactionStrategy knows. But CompactionStrategy
needs a set of SSTables to be created. And to create SSTable readers you need to know the
components. That is why I decided for a TOC component, that allows for reading the list of
components at SSTable construction time.

The workflow of creating a new SSTable is currently as follows:

1. memtable is flushed to disk, with C* only components
2. compaction strategy is notified that a new sstable was created and gets an SSTableReader
(with only default components)
3. compaction strategy adds its custom components to it; in order to do it, it has to read
some components of SSTable (e.g. access the index or data file)

In order to make SSTableReader immutable, we had to ask currently installed compaction strategy
for custom component list somewhere in the middle of this process, before creating SSTableReader.
That is slightly more complex than what we have now (have to change the CS interface), but
retaining full immutability is probably worth it.

{noformat}
public synchronized void addCustomComponent(Component component){noformat}

You are right that synchronized is wrong here. 

Thanks for great suggestions [~jbellis]. I look into improving my patch as soon as I'm done
with the tickets I've got in the waiting queue for DSE 3.0.
                
      was (Author: pkolaczk):
    Sorry for late reply, been busy fixing things in dse.

{quote}
{noformat}
catch (Exception e)
        {
            if (!"snapshots".equals(name) && !"backups".equals(name)
                    && !name.contains(".json"))
                logger.warn("Invalid file '{}' in data directory {}.", name, dir);
            return null;
        }
{noformat}
What was the reasoning behind this? Not saying it's wrong to remove it, but I want to make
sure we understand what it was supposed to do, before deciding we don't need it.
{quote}

--This check was there before. Actually because this check was firing out warnings, we created
this ticket ;)--
Oh, just noticed, you are referring to removed code. [~slebresne] said he could live without
them. But if you insist, I can think of bringing them back. Don't know how to do it yet, but
if it is important, I can try.


{noformat}
if (!new File(descriptor.filenameFor(component)).exists())
                    logger.error("Missing component: " + descriptor.filenameFor(component));
{noformat}

-----------------
 
{noformat}
catch (IOException e)
        {
            Set<Component> components = Sets.newHashSetWithExpectedSize(Component.TYPES.size());
            for (Component.Type componentType : Component.TYPES)
            {
                Component component = new Component(componentType);
                if (new File(desc.filenameFor(component)).exists())
                    components.add(component);
            }

            saveTOC(desc, components);
            return components;
        }
{noformat}

This one is for backwards compatibility. If we find an SSTable without a TOC component (from
previous version of C*), we just do what we always did - loop through all C* components. 

--------------------
{quote}
Use FileUtils.closeQuietly{quote}

Oh, yeah. I was looking for IOUtils.closeQuietly, and couldn't find it. Thanks, that is what
I needed!

{quote}
But probably simpler to just use Guava's Files.readLines{quote}

Ok, I fix it.

--------------------
{quote}
Do we not know what components are necessary at construction time? Would strongly prefer an
immutable set. Adding extra parameters to SSTW to do this is fine.{quote}

We do, but there is a chicken-and-egg problem here. CompactionStrategy knows. But CompactionStrategy
needs a set of SSTables to be created. And to create SSTable readers you need to know the
components. That is why I decided for a TOC component, that allows for reading the list of
components at SSTable construction time.

The workflow of creating a new SSTable is currently as follows:

1. memtable is flushed to disk, with C* only components
2. compaction strategy is notified that a new sstable was created and gets an SSTableReader
(with only default components)
3. compaction strategy adds its custom components to it; in order to do it, it has to read
some components of SSTable (e.g. access the index or data file)

In order to make SSTableReader immutable, we had to ask currently installed compaction strategy
for custom component list somewhere in the middle of this process, before creating SSTableReader.
That is slightly more complex than what we have now (have to change the CS interface), but
retaining full immutability is probably worth it.

{noformat}
public synchronized void addCustomComponent(Component component){noformat}

You are right that synchronized is wrong here. 

Thanks for great suggestions [~jbellis]. I look into improving my patch as soon as I'm done
with the tickets I've got in the waiting queue for DSE 3.0.
                  
> Add generic way of adding SSTable components required custom compaction strategy
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4049
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4049
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Piotr Kołaczkowski
>            Assignee: Piotr Kołaczkowski
>            Priority: Minor
>              Labels: compaction
>             Fix For: 1.1.6
>
>         Attachments: pluggable_custom_components-1.1.4.patch
>
>
> CFS compaction strategy coming up in the next DSE release needs to store some important
information in Tombstones.db and RemovedKeys.db files, one per sstable. However, currently
Cassandra issues warnings when these files are found in the data directory. Additionally,
when switched to SizeTieredCompactionStrategy, the files are left in the data directory after
compaction.
> The attached patch adds new components to the Component class so Cassandra knows about
those files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message