freemarker-dev mailing list archives

From Christoph Rüger <c.rue...@synesty.com>
Subject Re: Lambda Expressions - filter list without <#list> directive
Date Tue, 02 Jul 2019 19:08:03 GMT
Good point. It seems you are not the first to stumble on that one.
I quickly searched around and found:

Similar question on SO:
https://stackoverflow.com/questions/45939202/filter-naming-convention
JavaScript: filter:
https://developer.mozilla.org/de/docs/Web/JavaScript/Reference/Global_Objects/Array/filter
Spark SQL -> "where" is an alias for "filter":
https://stackoverflow.com/a/33887122/135535
<https://stackoverflow.com/questions/33885979/difference-between-filter-and-where-in-scala-spark-sql>
-> search for "filter" or "where" on
https://spark.apache.org/docs/1.5.2/api/scala/index.html#org.apache.spark.sql.DataFrame
R Statistics Language : filter
https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html#filter-rows-with-filter

Python: filter https://www.geeksforgeeks.org/filter-in-python/
Ruby: they use select:
https://www.codementor.io/tips/8247613177/how-to-filter-arrays-of-data-in-ruby
Kotlin: filter:
https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/filter.html

These languages rank near the top of the Stack Overflow survey:
https://insights.stackoverflow.com/survey/2019#technology-_-programming-scripting-and-markup-languages

I agree that "where" reads pretty nicely. I like it. But "filter" seems to be
found in multiple common languages supporting lambda-like syntax.
Python and R are especially common in the data science / statistics
community, which is a different target group than e.g. Java programmers.
Also, web developers these days write lots of JavaScript to build "html"
websites / templates - and JavaScript also uses "filter".

My vote would still go for "filter", because I think we are working on
lists of objects and objects are closer to "programming" than to "sql".
Maybe the "where" alias would be a compromise - but it might also be confusing
to have both.
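
To make the convention concrete, here is a minimal sketch in Python (one of
the languages listed above), not FreeMarker code. The user records and field
names are made up for illustration. It shows that "filter" keeps the elements
for which the predicate is true, so dropping them needs a negated predicate:

```python
# Hypothetical user records, invented for this illustration.
users = [
    {"name": "ann", "inactive": False},
    {"name": "bob", "inactive": True},
    {"name": "cleo", "inactive": False},
]

# Python's built-in filter() KEEPS the elements for which the predicate is true.
inactive_users = list(filter(lambda user: user["inactive"], users))

# To drop the inactive users instead, the predicate has to be negated.
active_users = list(filter(lambda user: not user["inactive"], users))

inactive_names = [user["name"] for user in inactive_users]
active_names = [user["name"] for user in active_users]
print(inactive_names)
print(active_names)
```

So with the "filter" naming, a predicate on user.inactive selects the
inactive users; it does not remove them.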

What do others think?

Thanks
Christoph







On Tue, 2 Jul 2019 at 20:27, Daniel Dekany <ddekany@apache.org> wrote:

> I wonder if "filter" is a good name. For Java 8 programmers it's a
> given, but otherwise I find it confusing, as it's not clear whether you
> specify what to filter out or what to keep. Worse, I believe that in
> everyday English "foo filter" or "filters foo" means removing foo-s because
> you don't want them, which is just the opposite of the meaning in
> Java. So I think "where", which is familiar to many from SQL (to
> most Java programmers as well, but also to non-Java programmers),
> would be better. Consider:
>
>   users?filter(user -> user.inactive)
>
> VS
>
>   users?where(user -> user.inactive)
>
> The first can be easily misunderstood as removing the inactive users,
> while the meaning of the second is obvious.
>
>
> Tuesday, July 2, 2019, 2:57:52 PM, Christoph Rüger wrote:
>
> > Thanks for the heads up. Very nice. We will run our test suite to see if
> > those tests are still green.
> >
> > On Mon, 1 Jul 2019 at 09:30, Daniel Dekany <ddekany@freemail.hu> wrote:
> >
> >> Since then I have also made a change that ensures that if the lambda
> >> argument is null (which in FTL is the same as if the variable isn't
> >> there at all), then it will not fall back to find an identically named
> >> variable in higher variable scopes. This is important when doing
> >> things like:
> >>
> >>   <#-- filters out null-s -->
> >>   myList?filter(it -> it??)
> >>
> >> because if some day someone adds a variable called "it" to the
> >> data-model, then suddenly the above won't filter out the null-s.
> >>
> >> The same thing was always an issue with #list loop variables as well,
> >> also with #nested arguments. So I have added a configuration setting
> >> called "fallbackOnNullLoopVariable", which is by default true
> >> (unfortunate historical baggage... but we can't break backward
> >> compatibility). If you set it to false, then this will print "N/A" at
> >> null list items, rather than "Clashing variable in higher scope":
> >>
> >> <#assign it = "Clashing variable in higher scope">
> >> <#list myList as it>
> >>   ${it!'N/A'}
> >> </#list>
> >>
> >> These changes are pushed and deployed to the Apache snapshot Maven
> >> repo in both branches.
> >>
> >>
> >> So, apart from documentation, the local lambda feature is about ready,
> >> or so I hope. I'm worried about rough edges though, so I think I will add
> >> lambda support to some more builtins (?seq_contains, ?sort_by), and
> >> explore some more use cases... If you have use cases of your own that you
> >> actually keep running into, or that you want to be in 2.3.29, tell me.
> >>
> >>
> >> Monday, June 24, 2019, 1:59:21 AM, Daniel Dekany wrote:
> >>
> >> > Well, I'm not exactly fast nowadays either... Anyway, I have pushed
> >> > and deployed to the snapshot repo the changes I was talking about
> >> > recently. That is, ?map or ?filter won't make a sequence out of an
> >> > enumerable non-sequence (typically an Iterator) anymore. The concern
> >> > was that if hugeResultSet is an Iterator because it's
> >> > huge, then someone might write:
> >> >
> >> >   <#assign transformed = hugeResultSet?map(it -> something(it))>
> >> >   <#list transformed as it>
> >> >
> >> > instead of just
> >> >
> >> >   <#list hugeResultSet?map(it -> something(it)) as it>
> >> >
> >> > and thus consuming a lot of memory without realizing it. So now if
> >> > hugeResultSet wasn't already a sequence (List-like), the assignment
> >> > will be an error, since we can't safely store a lazily transformed
> >> > collection (lambdas will break), and we can't condense it down to a
> >> > sequence (List-like thing) automatically either, as that might
> >> > consume too much memory. If hugeResultSet was a sequence, then it's
> >> > not an error, as we assume that keeping all of it in memory is fine,
> >> > as the original was stored there as well (in practice, most of the
> >> > time... in principle we can't know).
> >> >
> >> > Now if the user feels confident about it, they can still write:
> >> >
> >> >   <#assign transformed = hugeResultSet?map(it -> something(it))?sequence>
> >> >
> >> > Similarly, hugeResultSet?map(it -> something(it))[index] will be an
> >> > error, as [index] is for sequences only, and ?map will not change a
> >> > non-sequence to a sequence anymore. Similarly, if the user feels
> >> > confident about it, they can write hugeResultSet?map(it ->
> >> > something(it))?sequence[index].
> >> >
> >> > An interesting consequence of these is that ?sequence is now a bit
> >> > smarter than before. For example, if you write myIterator?sequence[n], it
> >> > will not fetch the elements into an in-memory sequence; it just skips n
> >> > elements from myIterator, and returns the nth one. Similarly,
> >> > myIterator?sequence?size won't store the elements in memory, it just
> >> > counts them.
> >> >
> >> > As an interesting note, these two are also identically efficient:
> >> >
> >> >   <#assign seq = hugeResultSet?filter(it -> something(it))?sequence>
> >> >   <#assign seq = hugeResultSet?sequence?filter(it -> something(it))>
> >> >
> >> > In both cases the actual conversion to a sequence (in-memory list)
> >> > happens only just before assigning the value to seq. Once again,
> >> > ?sequence now just means "it's OK to treat this as a sequence, however
> >> > inefficient it is", and not "convert it to sequence right now".
> >> >
> >> >
> >> > Friday, June 7, 2019, 10:38:50 AM, Christoph Rüger wrote:
> >> >
> >> >> These optimisations sound great. I will try to run some tests within the
> >> >> next weeks. A bit busy lately.
> >> >> Thanks
> >> >> Christoph
> >> >>
> >> >> On Wed, 29 May 2019 at 23:55, Daniel Dekany <ddekany@apache.org> wrote:
> >> >>
> >> >>> Tuesday, April 2, 2019, 12:10:16 PM, Christoph Rüger wrote:
> >> >>>
> >> >>> [snip]
> >> >>> >> Well, if you fear users jumping on ?filter/?map outside #list for no
> >> >>> >> good enough reason, there can be some option to handle that. But I
> >> >>> >> don't think restricting the usage to #list is a good compromise as the
> >> >>> >> default.
> >> >>> >
> >> >>> > I agree. Just keep as it is.
> >> >>> >
> >> >>> >> >> I'm not sure how efficiently could a configuration setting catch these
> >> >>> >> >> cases, or if it should be addressed on that level.
> >> >>> >> >
> >> >>> >> > Maybe let's postpone configurability discussion a bit until the above is
> >> >>> >> > more clear.
> >> >>> >>
> >> >>> >> In the light of the above, I think we can start thinking about that
> >> >>> >> now.
> >> >>> >
> >> >>> > On that note on configurability: Would it be possible to programmatically
> >> >>> > influence the Collection (Sequence) which is created under the hood?
> >> >>> > E.g. by specifying a Factory? I ask because we are using something like
> >> >>> > this (
> >> >>> > https://dzone.com/articles/a-filebasedcollection-in-java-for-big-collections
> >> >>> > ) in other places for large collections. I know it is very specific, but just
> >> >>> > wanted to bring it up.
> >> >>> [snip]
> >> >>>
> >> >>> I think a good approach would be to ban the *implicit* collection of
> >> >>> the result, when the filtered/mapped source is an Iterator, or other
> >> >>> similar stream-like object that's often used for enumerating a huge
> >> >>> number of elements. So for example, let's say you have this:
> >> >>>
> >> >>>   <#assign xs2 = xs?filter(f)>
> >> >>>
> >> >>> If xs is List-like, then this will work. Since the xs List fits into
> >> >>> memory (although a List can be backed by disk, that's rather
> >> >>> rare), hopefully it's not the kind of data amount that can't fit into
> >> >>> memory again (as xs2). On the other hand, if xs is an
> >> >>> Iterator-like object, then the above statement fails, with the hint
> >> >>> that xs?filter(f)?sequence would work, but might consume a lot of
> >> >>> memory.
> >> >>>
> >> >>> This is also consistent with how xs[i] works in the existing
> >> >>> FreeMarker versions. That only works if xs is List-like (an FTL
> >> >>> sequence). While xs[i] would be trivial to implement even if xs is
> >> >>> Iterator-like, we don't do that as it's not efficient for a high i,
> >> >>> and so the template author is probably not meant to do that. If he
> >> >>> knows what he's doing though, he can write xs?sequence[i]. Yes, that's
> >> >>> very inefficient if you only use [] once on that sequence, but you see
> >> >>> the logic. map/filter breaks it, as xs?filter(f)[i] works even if xs
> >> >>> is an Iterator, because filter/map currently always returns a
> >> >>> sequence. If xs is Iterator-like, then I want filter/map to return an
> >> >>> Iterator-like as well, so then [] will fail on it.
> >> >>>
> >> >>> As a side note, I will make ?sequence smarter too, so that
> >> >>> xs?sequence[i] won't actually build a sequence if xs is Iterator-like.
> >> >>> It just has to skip the first i elements after all. (The ?sequence is
> >> >>> still required there. It basically says: "I know what I'm doing, treat
> >> >>> this as a sequence.")
> >> >>>
> >> >>> --
> >> >>> Thanks,
> >> >>>  Daniel Dekany
> >> >>>
> >> >>>
> >> >>
> >> >
> >>
> >> --
> >> Thanks,
> >>  Daniel Dekany
> >>
> >>
> >
>
> --
> Thanks,
>  Daniel Dekany
>
>
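
The lazy-versus-sequence behaviour described in the quoted messages above can
be sketched by analogy in Python. This is only an illustration of the idea,
not FreeMarker's implementation; huge_result_set and nth are hypothetical
names, and the doubling lambda stands in for something(it):

```python
from itertools import islice


def huge_result_set():
    """Hypothetical stand-in for an Iterator-backed data source."""
    for n in range(1_000_000):
        yield n


# map() over an iterator stays lazy: nothing is computed or stored yet.
transformed = map(lambda it: it * 2, huge_result_set())

# Indexing the lazy result fails, comparable to [index] on a non-sequence.
try:
    transformed[3]
    indexing_failed = False
except TypeError:
    indexing_failed = True


def nth(iterable, n):
    """Skip n elements and return the next one, without storing anything."""
    return next(islice(iterable, n, None))


# Comparable to myIterator?sequence[n]: elements are skipped, not collected.
fourth = nth(map(lambda it: it * 2, huge_result_set()), 3)

# Comparable to myIterator?sequence?size: elements are counted, not collected.
count = sum(1 for _ in huge_result_set())

# Explicit opt-in to an in-memory list (here limited to the first 5 elements).
small = list(islice(map(lambda it: it * 2, huge_result_set()), 5))
```

The point of the analogy is that building an in-memory list is something the
author has to ask for explicitly, while indexed access and counting can be
served by walking the iterator.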
