storm-user mailing list archives

From "Paul.Milliken@baesystems.com" <Paul.Milli...@baesystems.com>
Subject RE: Local Mode issues and undocumented behaviour
Date Tue, 21 Feb 2017 17:19:55 GMT
Hi Petr,

I encountered an issue similar to your first point a while ago. See STORM-2038
(https://issues.apache.org/jira/browse/STORM-2038) for further discussion.

Regards,
Paul

-----Original Message-----
From: Petr Janeček [mailto:JanecekPetr@seznam.cz]
Sent: 20 February 2017 10:02
To: user@storm.apache.org
Subject: Local Mode issues and undocumented behaviour

Hello,

You might notice that the email below is a simplified repost; the last one got no attention,
possibly because it was in the wrong thread. Sorry about that. Any answer from a reliable
source would help - even "nobody knows anymore" or "not sure, please file a Jira with a minimal
reproducible test".

We're using Storm heavily and are trying to get things tested locally as much as possible.
While doing so, we accumulated a few questions:


1. Since 1.0.3, the Local Cluster on Windows needs the ability to create symlinks, which it did
not need before. Both `LocalCluster.submitTopology()` and
`Testing.withSimulatedTimeLocalCluster() ... Testing.completeTopology()` do this, and it's a major
pain for local development where our IDEs constantly run tests in local mode.

    We read <link mangled by corporate email filter - removed - PM> (which 404s for
1.0.1 and 1.0.2, by the way), and running our IDEs as Administrator fixes the issue.

    It might be a good idea to add this change to the release notes - "Running in Local
Mode now requires the symlink creation permission, too." Introducing new major features in
.build versions is unfortunate :(. Is there any configuration to revert to the old behaviour,
please?
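
    To check up front whether a given process has the symlink permission described above, a
small probe along these lines may help. This is only a sketch using the standard `java.nio.file`
API; the class name `SymlinkProbe` and its method are hypothetical and are not part of Storm.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Storm: probes whether the current
// process is allowed to create symbolic links.
final class SymlinkProbe {

    /** Returns true if a symlink could be created in a fresh temporary directory. */
    static boolean canCreateSymlinks() {
        Path dir = null;
        Path target = null;
        Path link = null;
        try {
            dir = Files.createTempDirectory("symlink-probe");
            target = Files.createFile(dir.resolve("target.txt"));
            link = dir.resolve("link.txt");
            Files.createSymbolicLink(link, target);
            return Files.isSymbolicLink(link);
        } catch (IOException | UnsupportedOperationException | SecurityException e) {
            // Missing privilege or an unsupported file system ends up here.
            return false;
        } finally {
            // Remove the link first, then its target, then the directory.
            deleteQuietly(link);
            deleteQuietly(target);
            deleteQuietly(dir);
        }
    }

    private static void deleteQuietly(Path path) {
        if (path == null) {
            return;
        }
        try {
            Files.deleteIfExists(path);
        } catch (IOException ignored) {
            // Best-effort cleanup of probe files only.
        }
    }

    public static void main(String[] args) {
        System.out.println("Symlink creation allowed: " + canCreateSymlinks());
    }
}
```

    A test suite could call such a probe once and skip local-mode tests when it returns false,
instead of failing deep inside the cluster startup.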


2. Since 1.0.3, every time we run any test on Local Cluster, there is an extra directory being
created in the root of our project in IDE: ./logs/workers-artifacts/topologytest-random-uuid,
and it contains a single file, "worker.yaml".

    Is there anything we can do to move this output somewhere else, preferably the ./target
directory? I went through the release notes and did not find anything related.


3. How does `Testing.completeTopology()` know the topology is completed? We're not acking
any tuples, so I'd expect the method to return once all tuples have internally timed out (or
the `CompleteTopologyParam` timeout has passed). However, the method returns much sooner (sooner
than we'd say our topology is "completed"), implying a more clever strategy. Is this deterministic?
Are there any knobs to turn?


4. Does local mode not honor `conf.registerSerialization()`? This seems strange, but if we're
sending an instance of the `OurData` class in local mode, the serialization fails with
`NotSerializableException`, like this:

        java.io.NotSerializableException: com.our.company.data.OurData
                at org.apache.storm.utils.Utils.javaSerialize(Utils.java:236)
                at org.apache.storm.thrift$serialize_component_object.invoke(thrift.clj:172)
                at org.apache.storm.testing$complete_topology.doInvoke(testing.clj:514)
                at clojure.lang.RestFn.invoke(RestFn.java:1124)
                at org.apache.storm.testing4j$_completeTopology.invoke(testing4j.clj:63)
                at org.apache.storm.Testing.completeTopology(Unknown Source)

     ...even though we have used `conf.registerSerialization(OurData.class, OurDataSerializer.class);`

    Slapping `Serializable` on the class fixes the issue, but obviously that's not the solution
- we don't want to change our class because of local testing, and we definitely *want* local
testing to use the same serialization mechanism as production. I'm very sure there's a lot
of people using this functionality, so we probably just overlooked something? We even tried:

        conf.put(Config.TOPOLOGY_TESTING_ALWAYS_TRY_SERIALIZE, true);  // Thanks for fixing this in 1.0.3, by the way!
        conf.put(Config.TOPOLOGY_FALL_BACK_ON_JAVA_SERIALIZATION, false);

    By the way, as far as I know, Kryo can serialize nonserializable classes, too, by using
the `FieldSerializer`. Is there any hidden option to enable this by default instead of Java
serialization? Do you have any plans on using this instead of Java serialization?
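
    The stack trace above passes through `Utils.javaSerialize`, which suggests the failure
comes from plain Java serialization rather than from Kryo. The sketch below reproduces that
kind of failure with the JDK alone. `PlainData` and `Holder` are hypothetical stand-ins (for
`OurData` and for any serializable object that carries it as a field), and nothing here is
Storm's actual code path.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class JavaSerializationDemo {

    /** Stand-in for OurData: a plain class that does not implement Serializable. */
    static final class PlainData {
        final String value;

        PlainData(String value) {
            this.value = value;
        }
    }

    /** Hypothetical serializable object that carries arbitrary data as a field. */
    static final class Holder implements Serializable {
        private static final long serialVersionUID = 1L;
        final Object payload;

        Holder(Object payload) {
            this.payload = payload;
        }
    }

    /**
     * Tries Java serialization on obj. Returns the message of the
     * NotSerializableException if one is thrown, or null on success.
     */
    static String failingClass(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A String payload is Serializable, so the holder serializes fine.
        System.out.println(failingClass(new Holder("text")));
        // A PlainData payload is not, so the whole object graph is rejected.
        System.out.println(failingClass(new Holder(new PlainData("x"))));
    }
}
```

    The point of the sketch is that a serializable wrapper does not help: Java serialization
walks the whole object graph, so one non-serializable field is enough to make it fail,
regardless of any Kryo registration.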


Thank you in advance for any responses, we've been scratching our heads lately.
Petr Janeček