ignite-dev mailing list archives

From Valentin Kulichenko <valentin.kuliche...@gmail.com>
Subject Re: usage analytics
Date Tue, 03 Nov 2020 12:10:51 GMT
Makes sense to me. I would love to know which components/APIs are used more
than others. Obviously, we should make sure everything is anonymous and we
don't collect any private user data, but I believe this is already
guaranteed by Google Analytics.

-Val

On Tue, Nov 3, 2020 at 3:59 AM Alexey Goncharuk <alexey.goncharuk@gmail.com>
wrote:

> Folks,
>
> I want to bump up this discussion and slightly change the format suggested
> by Nikita. I do not think it is correct to gather any information related to
> the user environment. However, can we collect just the fact of some of the
> Ignite APIs/subsystems being used with no user information whatsoever?
> Having started thinking about Ignite 3.0 I realized that we lack even some
> very basic knowledge on the impact of changing one or another feature or
> API.
>
> To my knowledge, the Ignite website already uses Google Analytics, which is
> available to the community. The Google Analytics platform already has
> tooling to track app screen hits in a completely anonymous way, so we can
> use this tool to track Ignite components usage (once per node startup)
> sending solely component name and a unique environment hash - no IP
> addresses, no jdk/os/other information. The information will be available
> in the same toolkit we are already using to analyze the website and
> optimize our docs.
>
> WDYT?
>
> Wed, Jul 19, 2017 at 01:15, <dsetrakyan@apache.org>:
>
> > I would try to ping legal again and see if they respond. If not, I think
> > we will need to come up with a simpler approach, that does not require
> > legal approval.
> >
> > D.
> >
> > On Jul 18, 2017, at 2:23 PM, Nikita Ivanov <nivanov30@gmail.com> wrote:
> > >Igniters,
> > >Just a quick update. I haven't gotten a response from ASF Legal on this
> > >thread and I frankly don't know how to proceed here. What's the process
> > >to arrive at a decision point here?
> > >
> > >Thanks!
> > >--
> > >Nikita Ivanov
> > >
> > >
> > >On Mon, Jul 10, 2017 at 3:11 PM, Konstantin Boudnik <cos@apache.org>
> > >wrote:
> > >
> > >> On Sat, Jul 08, 2017 at 11:04AM, Nikita Ivanov wrote:
> > >> > Cos,
> > >> > Based on my experience having it off by default negates the entire
> > >> > purpose... We need a statistically meaningful data set to make any
> > >> > inferences from it. Moreover, if we are going to ask folks to turn it
> > >> > on, it will significantly skew the resulting data set anyway and will
> > >> > not show the full picture. I think "on" by default is the better option
> > >> > if we are to collect usage stats to begin with.
> > >>
> > >> yes, sure. But having this "on" by default is likely to expose us to
> > >> another shit-storm down the road. An interesting dilemma to have indeed.
> > >> In my experience, whenever I install something like a browser or an
> > >> operating system, it would ask if I want to make the particular piece of
> > >> software better by sending back some anonymized stats. Basically, I am
> > >> given a way to explicitly opt out if I wish.
> > >>
> > >> Turning the feature "on" by default is like saying: "we'll be collecting
> > >> some stats, but if you don't want that you can go here and there and
> > >> disable the collection. Oh, and by the way - you need to go and figure
> > >> out the exact steps to disable it."
> > >>
> > >> > Also, I want to reiterate to avoid misunderstanding: there is no
> > >> > proposal, nor will there be a technical way, to attribute collected
> > >> > data back to a certain company. That's not what this is all about. We
> > >> > should only be interested in aggregated stats (community size, geo
> > >> > information, language information, components usage).
> > >>
> > >> Yes, I think it is clear, but never hurts to re-iterate.
> > >>
> > >> Cos
> > >>
> > >> > Thoughts?
> > >> >
> > >> > --
> > >> > Nikita Ivanov
> > >> > Founder & CTO
> > >> > GridGain Systems
> > >> >
> > >> > On Fri, Jul 7, 2017 at 8:17 PM, Konstantin Boudnik <cos@apache.org>
> > >> wrote:
> > >> >
> > >> > > Actually, that should be OFF by default. It sounds like this will
> > >> > > reduce the amount of data collected, but it would address the
> > >> > > concerns of companies like Roman's. I know for sure that a few of my
> > >> > > clients would sue my ass out of existence if I gave them a platform
> > >> > > collecting their data-center info.
> > >> > >
> > >> > > Let's have it, set it off by default and document an easy way to
> > >> > > turn it off. Then start making rounds asking our user base to share
> > >> > > _some_ of the stats with the community, so we can track the growth of
> > >> > > the install base, etc.
> > >> > >
> > >> > > Cos
> > >> > >
> > >> > > On Thu, Jul 06, 2017 at 08:20AM, Nikita Ivanov wrote:
> > >> > > > The idea so far is to have a single system property in configuration
> > >> > > > that turns this off completely. I envision that this will be
> > >> > > > prominently featured on the Ignite website so that everyone who
> > >> > > > would like to disable it can do it in seconds.
> > >> > > >
> > >> > > > Thoughts?
> > >> > > >
> > >> > > > --
> > >> > > > Nikita Ivanov
> > >> > > > Founder & CTO
> > >> > > > GridGain Systems
> > >> > > >
> > >> > > > On Wed, Jul 5, 2017 at 9:27 PM, Roman Shtykh <rshtykh@yahoo.com> wrote:
> > >> > > >
> > >> > > > > Nikita,
> > >> > > > >
> > >> > > > > Sending and storing (somewhere the company cannot securely handle)
> > >> > > > > any information (OS version, IP addresses, etc.) that can be used
> > >> > > > > to compromise the services would be unacceptable.
> > >> > > > > Turning it off might be ok (possibly through the cluster settings,
> > >> > > > > not via a globally-accessible site), but the fact that there's a
> > >> > > > > risk some information can leak outside (for any reason, starting
> > >> > > > > from a human mistake) is scary.
> > >> > > > >
> > >> > > > > -- Roman
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > > On Thursday, July 6, 2017 12:38 PM, Nikita Ivanov <nivanov@gridgain.com> wrote:
> > >> > > > >
> > >> > > > >
> > >> > > > > Roman,
> > >> > > > > Thanks for the feedback. What are those questions specifically?
> > >> > > > > Are IP addresses and OS what is causing it?
> > >> > > > >
> > >> > > > > Thanks!
> > >> > > > >
> > >> > > > > --
> > >> > > > > Nikita Ivanov
> > >> > > > > Founder & CTO
> > >> > > > > GridGain Systems
> > >> > > > >
> > >> > > > > On Wed, Jul 5, 2017 at 6:15 PM, Roman Shtykh <rshtykh@yahoo.com.invalid> wrote:
> > >> > > > >
> > >> > > > > Nikita,
> > >> > > > >
> > >> > > > > While this will help improve Ignite, it will prevent its adoption
> > >> > > > > by many projects -- sending and retaining IP addresses, OS
> > >> > > > > versions, etc. raises tons of questions when considering whether
> > >> > > > > to use Ignite. Even if it can be opted out of.
> > >> > > > > -- Roman
> > >> > > > >
> > >> > > > >
> > >> > > > > On Thursday, July 6, 2017 5:38 AM, Nikita Ivanov <nivanov30@gmail.com> wrote:
> > >> > > > >
> > >> > > > >
> > >> > > > > Igniters,
> > >> > > > > I would like to kick off the discussion on the idea of collecting
> > >> > > > > Ignite usage statistics. The basic idea behind this is to better
> > >> > > > > understand general and anonymous Ignite usage information to
> > >> > > > > better calibrate community efforts in developing new features,
> > >> > > > > improving existing ones, delivering better documentation - and in
> > >> > > > > every other way to make our project a better software solution.
> > >> > > > >
> > >> > > > > Although such instrumentation is standard practice in commercially
> > >> > > > > developed software, for an ASF project this could be a sensitive
> > >> > > > > issue. Therefore I would like to initiate a full community
> > >> > > > > discussion on how best to implement such a practice for the
> > >> > > > > benefit of the project while ensuring the privacy protection of
> > >> > > > > Ignite users.
> > >> > > > >
> > >> > > > > To ignite (pun intended) the discussion I'll outline below some of
> > >> > > > > the basic thoughts that I have on this subject. They are here only
> > >> > > > > to give an idea of what such instrumentation may potentially look
> > >> > > > > like so that we can discuss the merits of this idea in a tangible
> > >> > > > > context.
> > >> > > > >
> > >> > > > > Overview
> > >> > > > > -------------
> > >> > > > > Upon start and every hour thereafter each Ignite node will
> > >> > > > > collect, encrypt and send usage statistics over HTTPS to the
> > >> > > > > ASF-hosted server. That server will accept such HTTPS packets,
> > >> > > > > decrypt them and store them in a time-series DB. A web interface
> > >> > > > > will be provided to view the usage information.
> > >> > > > >
> > >> > > > > Opt-In or Opt-out
> > >> > > > > -------------------------
> > >> > > > > Opt-out. Ignite website will offer simple instructions (system
> > >> > > > > property) on how to disable this instrumentation.
> > >> > > > >
> > >> > > > > Code, Infra, Access
> > >> > > > > ---------------------------
> > >> > > > > Ignite instrumentation will be part of the Ignite code base. The
> > >> > > > > collection server will be a separate module in the Ignite code
> > >> > > > > base (released separately from Ignite). The collection server will
> > >> > > > > be hosted by ASF Infra.
> > >> > > > >
> > >> > > > > Usage statistics will be publicly accessible by anyone in the
> > >> > > > > community.
> > >> > > > >
> > >> > > > > Private, Personal Data
> > >> > > > > ------------------------------
> > >> > > > > No private or personal data will ever be transferred. No emails,
> > >> > > > > usernames, company names, grid names, etc.
> > >> > > > >
> > >> > > > > Data Retention
> > >> > > > > --------------------
> > >> > > > > All data will be retained for 1 year and deleted permanently thereafter.
> > >> > > > >
> > >> > > > > Usage Data
> > >> > > > > ----------------
> > >> > > > > The following data will be collected in each packet sent to the
> > >> > > > > collection server:
> > >> > > > > - GRID_SIZE (to align our testing environment with the more
> > >> > > > >   frequent cluster sizes)
> > >> > > > > - IP_ADDR (for general geo-tracking as well as to know what
> > >> > > > >   documentation language should be a priority)
> > >> > > > > - SES_ID (to track continuous uptime vs. re-starts)
> > >> > > > > - USERNAME_TYPE (privileged username vs. standard, to track
> > >> > > > >   production vs. dev/testing usage; note - this is not an actual
> > >> > > > >   username)
> > >> > > > > - OS_NAME
> > >> > > > > - OS_VER
> > >> > > > > - OS_ARCH
> > >> > > > > - JAVA_VER
> > >> > > > > - JAVA_VENDOR
> > >> > > > > - COMP_SQL (whether or not this feature was used)
> > >> > > > > - COMP_COMPUTE (whether or not this feature was used)
> > >> > > > > - COMP_DATAGRID (whether or not this feature was used)
> > >> > > > > - COMP_STREAMING (whether or not this feature was used)
> > >> > > > > - COMP_IGFS (whether or not this feature was used)
> > >> > > > > - COMP_SERVICE (whether or not this feature was used)
> > >> > > > > - COMP_PERSISTENCE (whether or not this feature was used)
> > >> > > > >
> > >> > > > > Please let's discuss this idea. Everyone's comments and suggestions
> > >> > > > > are *extremely* welcome.
> > >> > > > >
> > >> > > > > Thanks,
> > >> > > > > Nikita Ivanov.
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > >
> > >>
> > >>
> >
>
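
For illustration only, here is a minimal sketch of the payload Alexey describes above (a component name plus an environment hash, once per node startup), together with the single opt-out system property Nikita mentions. Everything named here is an assumption: the class UsagePing, the property name IGNITE_USAGE_STATS_DISABLED, the key=value payload format, and deriving the hash from a locally generated random UUID are not part of Ignite or of either proposal. The sketch only builds the payload; it does not send anything.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

/** Illustrative only: none of these names exist in Apache Ignite. */
final class UsagePing {
    /** Hypothetical opt-out switch, e.g. -DIGNITE_USAGE_STATS_DISABLED=true. */
    static final String OPT_OUT_PROP = "IGNITE_USAGE_STATS_DISABLED";

    /** True unless the (hypothetical) opt-out system property is set to "true". */
    static boolean enabled() {
        return !Boolean.getBoolean(OPT_OUT_PROP);
    }

    /**
     * Assumed scheme: SHA-256 over a random, locally generated ID, so the
     * value carries no host name, IP address, or user information.
     */
    static String environmentHash(UUID localId) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(localId.toString().getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest)
                hex.append(String.format("%02x", b));
            return hex.toString();
        }
        catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required by the Java platform", e);
        }
    }

    /** Builds the whole payload: component name and environment hash, nothing else. */
    static String payload(String component, String envHash) {
        return "component=" + component + "&env=" + envHash;
    }
}
```

A node would check UsagePing.enabled() once at startup and, if it returns true, build one payload per component in use (for example COMP_SQL from the list above). Keeping the payload to these two fields is what leaves out the JDK, OS, and IP details that raised objections earlier in the thread.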
