falcon-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ajay Yadava" <ajayn...@gmail.com>
Subject Re: Review Request 39588: State Store for instances scheduled by Falcon Native Scheduler
Date Thu, 05 Nov 2015 10:38:13 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39588/#review105205
-----------------------------------------------------------



common/src/main/resources/startup.properties (line 261)
<https://reviews.apache.org/r/39588/#comment163684>

    I think we should store all these properties in a separate file to avoid bloating of startup.properties.
It will also help in avoiding the issue of credentials in startup.properties.



scheduler/src/main/java/org/apache/falcon/state/StateService.java (line 139)
<https://reviews.apache.org/r/39588/#comment163626>

    Why is it being overridden again after being fetched from statestore just 2 lines before.



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/BeanMapperUtil.java (line 73)
<https://reviews.apache.org/r/39588/#comment163627>

    Can we use EntityState instead of instance, please?



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/BeanMapperUtil.java (line 90)
<https://reviews.apache.org/r/39588/#comment163678>

    Let's not overload the word "instance" please. It is very confusing.



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/BeanMapperUtil.java (line 144)
<https://reviews.apache.org/r/39588/#comment163628>

    Didn't check the implementation but if an instance is created, shouldn't it always be
in some state?



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/BeanMapperUtil.java (line 165)
<https://reviews.apache.org/r/39588/#comment163632>

    This will make database language dependent, as you will not be able to read it in any
other language except Java.



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/BeanMapperUtil.java (line 167)
<https://reviews.apache.org/r/39588/#comment163633>

    Any reason for not storing predicates in a simple readable format, like in their own table/column?
It would be immensely useful to read predicate states.



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/BeanMapperUtil.java (line 169)
<https://reviews.apache.org/r/39588/#comment163630>

    You should use IOUtils.closeQuietly() from commons.io.



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/BeanMapperUtil.java (line 172)
<https://reviews.apache.org/r/39588/#comment163629>

    It is not required to close it as it will be automatically closed when you call close
on out.



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/BeanMapperUtil.java (line 213)
<https://reviews.apache.org/r/39588/#comment163637>

    Use IOUtils.



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/BeanMapperUtil.java (line 216)
<https://reviews.apache.org/r/39588/#comment163636>

    Redundant.



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/BeanMapperUtil.java (line 278)
<https://reviews.apache.org/r/39588/#comment163634>

    use IOUtils



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/BeanMapperUtil.java (line 281)
<https://reviews.apache.org/r/39588/#comment163635>

    Redundant.



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/EntityBean.java (line 39)
<https://reviews.apache.org/r/39588/#comment163679>

    nit: GET_ENTITIES_FOR_TYPE



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/EntityBean.java (line 44)
<https://reviews.apache.org/r/39588/#comment163638>

    It's not a good convention to have table name in all caps or plural(makes it difficult
in certain scenarios to guess/spell correctly)



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/EntityBean.java (line 65)
<https://reviews.apache.org/r/39588/#comment163639>

    Do you need the prefix current? There is no other similar column to be disambiguated.



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/InstanceBean.java (line 41)
<https://reviews.apache.org/r/39588/#comment163646>

    Why like and not =? Can't this be done with an equal on entity_id?



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/InstanceBean.java (line 43)
<https://reviews.apache.org/r/39588/#comment163647>

    Same as above for pattern matching on id.



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/InstanceBean.java (line 44)
<https://reviews.apache.org/r/39588/#comment163648>

    Same as above on pattern matching



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/InstanceBean.java (line 46)
<https://reviews.apache.org/r/39588/#comment163645>

    Why actualStartTime and actualEndTime? They are not deterministic so we always query on
the basis of nominal time.



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/InstanceBean.java (line 47)
<https://reviews.apache.org/r/39588/#comment163640>

    There is no limit, so how is it ensuring that you only get the last instance?



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/InstanceBean.java (line 48)
<https://reviews.apache.org/r/39588/#comment163641>

    DELETE_TABLE



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/JDBCStateStore.java (line 42)
<https://reviews.apache.org/r/39588/#comment163651>

    This class should be divided into a class per table so that it becomes more manageable.



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/JDBCStateStore.java (line 53)
<https://reviews.apache.org/r/39588/#comment163649>

    I understand that you might need clear method for tests but it is not useful, rather risky,
in production scenarios. So can we throw an exception here?



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/JDBCStateStore.java (line 87)
<https://reviews.apache.org/r/39588/#comment163653>

    I guess empty is returned when there are no results, so when is null returned?



scheduler/src/main/java/org/apache/falcon/state/store/jdbc/JDBCStateStore.java (line 151)
<https://reviews.apache.org/r/39588/#comment163650>

    Same as above for "clear"



scheduler/src/main/java/org/apache/falcon/state/store/service/FalconJPAService.java (line
58)
<https://reviews.apache.org/r/39588/#comment163682>

    Can you please document the role of this field?



scheduler/src/main/java/org/apache/falcon/state/store/service/FalconJPAService.java (line
59)
<https://reviews.apache.org/r/39588/#comment163681>

    Make it final please.



scheduler/src/main/java/org/apache/falcon/state/store/service/FalconJPAService.java (line
82)
<https://reviews.apache.org/r/39588/#comment163683>

    .getSimpleName() will be more readable.



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 44)
<https://reviews.apache.org/r/39588/#comment163662>

    How are we ensuring that this is not run by anyone else except admin?



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 52)
<https://reviews.apache.org/r/39588/#comment163660>

    What is the purpose of this field?



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 53)
<https://reviews.apache.org/r/39588/#comment163659>

    Any reasons for this limitation? Most people will use MySQL or postgres in production
environments.



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 66)
<https://reviews.apache.org/r/39588/#comment163661>

    Can we rephrase it? It sounds like it is interactive. You should highlight that it won't
ask for confirmation.



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 81)
<https://reviews.apache.org/r/39588/#comment163663>

    Falcon DB is very confusing. It should be either state store version or underlying db
version.



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 129)
<https://reviews.apache.org/r/39588/#comment163687>

    redundant blank lines?



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 137)
<https://reviews.apache.org/r/39588/#comment163664>

    "Falcon DB" is ambiguous.



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 141)
<https://reviews.apache.org/r/39588/#comment163665>

    We should document what will happen in case of rollbacks.



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 151)
<https://reviews.apache.org/r/39588/#comment163666>

    Why double spacing everywhere?



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 179)
<https://reviews.apache.org/r/39588/#comment163703>

    More descriptive message please.



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 186)
<https://reviews.apache.org/r/39588/#comment163667>

    Please avoid using Falcon DB for table names or database.



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 205)
<https://reviews.apache.org/r/39588/#comment163702>

    More descriptive message please.



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 216)
<https://reviews.apache.org/r/39588/#comment163668>

    Credentials shouldn't be stored in startup.properties



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 297)
<https://reviews.apache.org/r/39588/#comment163697>

    A more descriptive message on what operation got completed please. It will be clearer
in cases like calling one function from another.



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 301)
<https://reviews.apache.org/r/39588/#comment163690>

    There should be a function called falconPropsTableExists() and it should return true or
false depending on whether the table exists.



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 301)
<https://reviews.apache.org/r/39588/#comment163706>

    There should only be a method called falconPropsTableExists() which returns true or false.



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 307)
<https://reviews.apache.org/r/39588/#comment163689>

    I think this function shouldn't throw exception on the basis of an argument. Caller functions
should decide how they want to handle the scenario and throw exception if required.
    
    If you need to do it many times then create another function called assertFalconPropsTableExists()
which should throw an error.



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 307)
<https://reviews.apache.org/r/39588/#comment163707>

    Functions shouldn't throw exceptions on the basis of arguments. Let the caller decide
how he wants to handle it.
    
    If you require same exception throwing behavior then create another function called assertFalconPropsTableExists
which calls this function and throws error if it doesn't exist.



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 308)
<https://reviews.apache.org/r/39588/#comment163688>

    nit: Checking



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 329)
<https://reviews.apache.org/r/39588/#comment163698>

    More descriptive message please.



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 349)
<https://reviews.apache.org/r/39588/#comment163700>

    Not printing the error or re-throwing the exception?



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 360)
<https://reviews.apache.org/r/39588/#comment163691>

    nit: Validating



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 363)
<https://reviews.apache.org/r/39588/#comment163692>

    nit: DB Connection validated successfully.



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 363)
<https://reviews.apache.org/r/39588/#comment163699>

    More descriptive message please.



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 403)
<https://reviews.apache.org/r/39588/#comment163693>

    nit: a more descriptive message on what action got completed will be more user friendly.



scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java (line 403)
<https://reviews.apache.org/r/39588/#comment163701>

    More descriptive message please.



scheduler/src/main/resources/META-INF/persistence.xml (line 1)
<https://reviews.apache.org/r/39588/#comment163695>

    It will be useful to at least put in comments the MySQL / Postgres configurations wherever
they are different.



scheduler/src/main/resources/falcon-buildinfo.properties (line 1)
<https://reviews.apache.org/r/39588/#comment163704>

    Is another copy of this file required in scheduler module?



scheduler/src/main/resources/falcon-buildinfo.properties (line 20)
<https://reviews.apache.org/r/39588/#comment163694>

    Is it required to have this copy in scheduler as well?



scheduler/src/test/java/org/apache/falcon/state/service/store/TestJDBCStateStore.java (line
127)
<https://reviews.apache.org/r/39588/#comment163696>

    Not the best way to do it. Break this into several test cases. Put expected exception
and message on each of them. You can extract common code to a private function.



scheduler/src/test/resources/startup.properties (line 1)
<https://reviews.apache.org/r/39588/#comment163657>

    Having startup.properties in each module seems verbose to me. Can't we leverage common
startup.properties somehow?



src/bin/falcon-db.sh (line 14)
<https://reviews.apache.org/r/39588/#comment163658>

    Sample usage/help documentation?



src/bin/falcon-db.sh (line 16)
<https://reviews.apache.org/r/39588/#comment163656>

    Although not a hard requirement but it will be much better to have a python script instead
of a bash script. Reason for this is windows compatibility. We already have a JIRA to convert
existing scripts to python so we shouldn't add more tasks to it as much as possible.



src/conf/startup.properties (line 61)
<https://reviews.apache.org/r/39588/#comment163654>

    Shouldn't all scheduler code be inside a package path containing "scheduler"?


- Ajay Yadava


On Nov. 3, 2015, 5:45 p.m., pavan kumar kolamuri wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39588/
> -----------------------------------------------------------
> 
> (Updated Nov. 3, 2015, 5:45 p.m.)
> 
> 
> Review request for Falcon.
> 
> 
> Bugs: https://issues.apache.org/jira/browse/FALCON-1234
>     https://issues.apache.org/jira/browse/https://issues.apache.org/jira/browse/FALCON-1234
> 
> 
> Repository: falcon-git
> 
> 
> Description
> -------
> 
> Persistent State Store for Falcon Native Scheduler
> 
> 
> Diffs
> -----
> 
>   checkstyle/src/main/resources/falcon/checkstyle.xml 2130e73 
>   checkstyle/src/main/resources/falcon/findbugs-exclude.xml 0a7580d 
>   common/src/main/resources/startup.properties cc5212a 
>   pom.xml 6f2c480 
>   scheduler/pom.xml 20a91d2 
>   scheduler/src/main/java/org/apache/falcon/execution/ExecutionInstance.java 3869ff2

>   scheduler/src/main/java/org/apache/falcon/execution/FalconExecutionService.java b959320

>   scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutionInstance.java 19089c4

>   scheduler/src/main/java/org/apache/falcon/predicate/Predicate.java fb4ce82 
>   scheduler/src/main/java/org/apache/falcon/state/ID.java 420c856 
>   scheduler/src/main/java/org/apache/falcon/state/StateService.java 81357a4 
>   scheduler/src/main/java/org/apache/falcon/state/store/EntityStateStore.java 4aa6fdb

>   scheduler/src/main/java/org/apache/falcon/state/store/InMemoryStateStore.java 3822860

>   scheduler/src/main/java/org/apache/falcon/state/store/InstanceStateStore.java d6a4b49

>   scheduler/src/main/java/org/apache/falcon/state/store/StateStore.java f595c26 
>   scheduler/src/main/java/org/apache/falcon/state/store/jdbc/BeanMapperUtil.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/state/store/jdbc/EntityBean.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/state/store/jdbc/InstanceBean.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/state/store/jdbc/JDBCStateStore.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/state/store/service/FalconJPAService.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/tools/FalconStateStoreDBCLI.java PRE-CREATION

>   scheduler/src/main/resources/META-INF/persistence.xml PRE-CREATION 
>   scheduler/src/main/resources/falcon-buildinfo.properties PRE-CREATION 
>   scheduler/src/test/java/org/apache/falcon/execution/FalconExecutionServiceTest.java
b2f9e59 
>   scheduler/src/test/java/org/apache/falcon/notification/service/SchedulerServiceTest.java
b4a0f35 
>   scheduler/src/test/java/org/apache/falcon/state/AbstractSchedulerTestBase.java PRE-CREATION

>   scheduler/src/test/java/org/apache/falcon/state/EntityStateServiceTest.java 2f32b43

>   scheduler/src/test/java/org/apache/falcon/state/InstanceStateServiceTest.java d27ac7e

>   scheduler/src/test/java/org/apache/falcon/state/service/TestFalconJPAService.java PRE-CREATION

>   scheduler/src/test/java/org/apache/falcon/state/service/store/TestJDBCStateStore.java
PRE-CREATION 
>   scheduler/src/test/java/org/apache/falcon/tools/TestFalconStateStoreDBCLI.java PRE-CREATION

>   scheduler/src/test/resources/startup.properties PRE-CREATION 
>   src/bin/falcon-db.sh PRE-CREATION 
>   src/conf/startup.properties ce6e91f 
>   src/main/assemblies/distributed-package.xml 794eaef 
>   src/main/assemblies/standalone-package.xml fcff8d7 
>   unit/src/main/resources/startup.properties fe6f430 
> 
> Diff: https://reviews.apache.org/r/39588/diff/
> 
> 
> Testing
> -------
> 
> I have written unit tests. I will also test externally by setting up everything
> 
> 
> Thanks,
> 
> pavan kumar kolamuri
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message