drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paul Rogers <par0...@yahoo.com.INVALID>
Subject Re: [DISCUSS] ExecConstants class refactoring
Date Mon, 02 Mar 2020 19:30:11 GMT
Hi Igor,

You know the old joke about how you eat an elephant? One bite at a time.

Yes, trying to refactor java-exec in one go would be difficult, as hard as replacing value
vectors in one go. Instead, we might do it in baby steps.

For example, as we get the new "SPI" mechanism to work for storage plugins, we have the option
to pull the existing plugins into separate modules. Interestingly, these modules must be below
java-exec (which becomes the integration point for the runtime modules. This means that the
module to define the SPI must be even lower. This then requires that it be defined in terms
of interfaces, some of which are implemented in java-exec.

The same process can be applied to operators. If we want an operator module, they must implement
interfaces and accept interfaces, with the implementations residing elsewhere.

We've been gradually creating such interfaces, there are more to go. Each time we create one,
we break apart a bit more of the tight coupling you mentioned.

Why would we gradually do such work? Primarily so Drill can be extensible. We want storage
plugins to be easily built outside of Drill and be somewhat cross-version compatible. The
same argument could be made for specialized operators (specified in SQL as, say, some kind
of table function.)

Further, I've seen over and over that tight coupling makes it ever harder to safely change
anything, so changes get more expensive and the project stagnates. Drill might be different,
but I'd not bet on it. So, we need modularity to allow the project to continue to flourish.

I agree that is nothing we'd do any time soon. But, often if we know where we're going, we
can slowly get our ducks in a line so we can eventually do it.

- Paul


    On Monday, March 2, 2020, 11:11:21 AM PST, Igor Guzenko <ihor.huzenko.igs@gmail.com>
 Hello Paul,

Thanks for quick response, I've been thinking a lot of times about the
necessity to split java-exec into modules. Although it would be really cool
to do it, at the moment it is almost impossible to do it correctly. Too
many things are tightly coupled together and without good preparation,
splitting will be a huge pain. By preparation I mean the long iterative
process to collect related components into packages (by feature) and
clearly define interaction APIs between the packages. Another long-standing
issue which most probably will never be fixed... Huh, but at least we're
moving towards the goal :)

Kind regards,

On Mon, Mar 2, 2020 at 8:01 PM Paul Rogers <par0328@yahoo.com.invalid>

> Hi Igor,
> Great idea; I've been noticing that file has gotten excessively large.
> I wonder if we can split the file by topic instead of by the (often odd)
> naming hierarchy which has evolved over the years. For example, one file
> for internal server config options (thread counts, RPC stuff.) Another for
> things related to local config (local file systems, etc.) Another related
> to core operators (sorts, hash joins, etc.) Picking the right split will
> require a bit of thought and sone experimentation.
> There are three kinds of constants:
> * Config variables (from drill-override.conf)
> * System/session options
> * Other random constants
> One could argue that we should keep the three kinds together for each
> topic. (That is, all the sort-related stuff in one place.) Whether that
> means three well-known files in one place, three nested interfaces within a
> single class could be debated.
> One thing we probably should do is to separate out the string name of a
> system/session property from the implementation of its validator. It used
> to be that people would use the validator to access the option value. Most
> newer code in the last several years uses the typed access methods with a
> string key. So, we can move the validators into a OptionDefinitions
> class/interface separate from the key definitions.
> Most names are for the benefit of us: the poor humans who have to
> understand them. The compiler would be happy with inline constant values.
> Most names tend to be short to be easier to remember. For example, it is
> easier to understand CLIENT_RPC_THREADS than
> "drill.exec.rpc.user.client.threads".
> Most costants are in ExecConstants. Some (but not all) constants for the
> planner live in PlannerSettings. Oddly, some planner settings are in
> ExecConstants. We might want to consolidate planner-related constants into
> a single location.
> One final thing to keep in mind is that the "java-exec" project has become
> overly large. At some point, it might make sense to split it into
> components, such as planner, server, exec engine, etc. So, if our constants
> are grouped by functional units (rather than by name), it might make it
> easier to reshuffle modules if we ever choose to do so.
> Thanks,
> - Paul
>    On Monday, March 2, 2020, 8:20:32 AM PST, Igor Guzenko <
> ihor.huzenko.igs@gmail.com> wrote:
>  Hello Drillers,
> I would like to refactor the ExecConstants class in order to group our
> property keys by related sections. Neglecting some naming conventions for
> the case, we can utilize Java interfaces with fields (which are constants
> by default), and create something like:
> public interface ExecOption {
>  interface http {
>    interface session {
>      interface memory {
>        String maximum = "drill.exec.http.session.memory.maximum";
>        String reservation = "drill.exec.http.session.memory.reservation";
>      }
>    }
>    interface jetty {
>      interface server {
>        String acceptors = "drill.exec.http.jetty.server.acceptors";
>        String selectors = "drill.exec.http.jetty.server.selectors";
>      }
>    }
>  }
> }
> So then with the help of Java imports, we can write client code like:
> config.getLong(http.session.memory.reservation),
> config.getLong(http.session.memory.maximum)
> As a bonus, IDE's autocomplete and class structure features will allow an
> exploration of available choices fairly easily, since all related options
> will be related to specific nested interfaces instead of one top-level
> class.
> Any thoughts and suggestions are highly appreciated :)
> Thanks,
> Igor
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message