freemarker-dev mailing list archives

From Pete Helgren <p...@valadd.com>
Subject Re: Lambda Expressions - filter list without <#list> directive
Date Tue, 02 Jul 2019 19:29:05 GMT
As a more casual Java programmer, the "where" option is much clearer to 
me. I spend more time using FM syntax than changing the Java underneath, 
so from a "fading memory" standpoint, "where" would lead to fewer "What 
the....?" moments,  for me at least.

Pete Helgren
www.petesworkshop.com
GIAC Secure Software Programmer-Java
Twitter - Sys_i_Geek  IBM_i_Geek

On 7/2/2019 2:08 PM, Christoph Rüger wrote:
> Good point. It seems you are not the first to stumble over that one.
> I quickly searched around and found:
>
> Similar question on SO:
> https://stackoverflow.com/questions/45939202/filter-naming-convention
> Javascript: filter :
> https://developer.mozilla.org/de/docs/Web/JavaScript/Reference/Global_Objects/Array/filter
> Spark SQL -> "where" is an alias for "filter":
> https://stackoverflow.com/a/33887122/135535
> <https://stackoverflow.com/questions/33885979/difference-between-filter-and-where-in-scala-spark-sql>
> -> search for "filter" or "where" on
> https://spark.apache.org/docs/1.5.2/api/scala/index.html#org.apache.spark.sql.DataFrame
> R Statistics Language : filter
> https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html#filter-rows-with-filter
>
> Python: filter https://www.geeksforgeeks.org/filter-in-python/
> Ruby: they use select:
> https://www.codementor.io/tips/8247613177/how-to-filter-arrays-of-data-in-ruby
> Kotlin: filter:
> https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/filter.html
>
> These languages rank near the top of the Stack Overflow survey:
> https://insights.stackoverflow.com/survey/2019#technology-_-programming-scripting-and-markup-languages
>
> I agree that "where" reads pretty nicely. I like it. But "filter" seems to be
> found in multiple common languages supporting lambda-like syntax.
> Python and R are especially common in the data science / statistics
> community, which is a different target group than e.g. Java programmers.
> Also, web developers these days write lots of JavaScript to build "html"
> websites / templates - and JavaScript also uses "filter".
>
> My vote would still go for "filter", because I think we are working on
> lists of objects and objects are closer to "programming" than to "sql".
> Maybe the "where" alias would be a compromise - but it might also be
> confusing to have both.
>
> What do others think?
>
> Thanks
> Christoph
>
> On Tue, 2 July 2019 at 20:27, Daniel Dekany <ddekany@apache.org> wrote:
>> I wonder if "filter" is a good name. For Java 8 programmers it's
>> given, but otherwise I find it confusing, as it's not clear if you
>> specify what to filter out, or what to keep. Worse, I believe in everyday
>> English "foo filter" or "filters foo" means removing foo-s because
>> you don't want them, which is just the opposite of the meaning in
>> Java. So I think "where", which is familiar for many from SQL (for
>> most Java programmers as well, but also for non-Java programmers),
>> would be better. Consider:
>>
>>    users?filter(user -> user.inactive)
>>
>> VS
>>
>>    users?where(user -> user.inactive)
>>
>> The first can be easily misunderstood as removing the inactive users,
>> while the meaning of the second is obvious.
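
For comparison, Java 8's `Stream.filter` keeps the elements for which the predicate is true, which is the reading the thread is debating. A minimal sketch, assuming a hypothetical `User` class and sample names invented only for this illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class FilterKeepsMatches {
    // Hypothetical user type, invented for this illustration.
    public static final class User {
        public final String name;
        public final boolean inactive;

        public User(String name, boolean inactive) {
            this.name = name;
            this.inactive = inactive;
        }
    }

    // Returns the names of the users for which the predicate is true.
    public static List<String> inactiveNames(List<User> users) {
        return users.stream()
                .filter(user -> user.inactive) // keeps inactive users, does not remove them
                .map(user -> user.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User("ann", false), new User("bob", true), new User("cy", true));
        System.out.println(inactiveNames(users)); // prints [bob, cy]
    }
}
```

Read as "filter out the inactive users", the same call would be expected to return only `ann`, which is the ambiguity the "where" name avoids.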
>>
>>
>> Tuesday, July 2, 2019, 2:57:52 PM, Christoph Rüger wrote:
>>
>>> Thanks for the heads up. Very nice. We will run our test suite to see
>>> whether those tests are still green.
>>>
>>> On Mon, 1 July 2019 at 09:30, Daniel Dekany <ddekany@freemail.hu> wrote:
>>>> Since then I have also made a change that ensures that if the lambda
>>>> argument is null (which in FTL is the same as if the variable isn't
>>>> there at all), then it will not fall back to find an identically named
>>>> variable in higher variable scopes. This is important when doing
>>>> things like:
>>>>
>>>>    <#-- filters out null-s -->
>>>>    myList?filter(it -> it??)
>>>>
>>>> because if some day someone adds a variable called "it" to the
>>>> data-model, then suddenly the above won't filter out the null-s.
>>>>
>>>> The same thing was always an issue with #list loop variables as well,
>>>> also with #nested arguments. So I have added a configuration setting
>>>> called "fallbackOnNullLoopVariable", which is by default true
>>>> (unfortunate historical baggage... but we can't break backward
>>>> compatibility). If you set it to false, then this will print "N/A" at
>>>> null list items, rather than "Clashing variable in higher scope":
>>>>
>>>> <#assign it = "Clashing variable in higher scope">
>>>> <#list myList as it>
>>>>    ${it!'N/A'}
>>>> </#list>
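
The lookup rule described above can be pictured with a toy model in Java. This is only a sketch under stated assumptions, not FreeMarker's implementation: the `resolve` method, the two maps standing in for the loop scope and the data-model, and the `fallbackOnNull` parameter are all hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class NullLoopVariableLookup {
    // Toy model: a loop-variable scope in front of a data-model (hypothetical, not FreeMarker code).
    public static Object resolve(String name, Map<String, Object> loopScope,
                                 Map<String, Object> dataModel, boolean fallbackOnNull) {
        if (loopScope.containsKey(name)) {
            Object value = loopScope.get(name);
            if (value != null || !fallbackOnNull) {
                return value; // with fallback disabled, a null loop variable stays null
            }
        }
        return dataModel.get(name); // higher scope
    }

    public static void main(String[] args) {
        Map<String, Object> dataModel = new HashMap<>();
        dataModel.put("it", "Clashing variable in higher scope");
        Map<String, Object> loopScope = new HashMap<>();
        loopScope.put("it", null); // the current list item is null

        System.out.println(resolve("it", loopScope, dataModel, true));  // prints Clashing variable in higher scope
        System.out.println(resolve("it", loopScope, dataModel, false)); // prints null
    }
}
```

With the fallback disabled the result is null, so a default such as `!'N/A'` in the template would then apply.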
>>>>
>>>> These changes are pushed and deployed to the Apache snapshot Maven
>>>> repo in both branches.
>>>>
>>>>
>>>> So, apart from documentation, the local lambda feature is about ready,
>>>> or so I hope. I'm worried about rough edges though, so I think I will add
>>>> lambda support to some more builtins (?seq_contains, ?sort_by), and
>>>> explore some more use cases... If you have your own that you actually
>>>> keep running into, or want to see in 2.3.29, let me know.
>>>>
>>>>
>>>> Monday, June 24, 2019, 1:59:21 AM, Daniel Dekany wrote:
>>>>
>>>>> Well, I'm not exactly fast nowadays either... Anyway, I have pushed
>>>>> and deployed to the snapshot repo the changes I was talking about
>>>>> recently. That is, ?map or ?filter won't make a sequence out of an
>>>>> enumerable non-sequence (typically an Iterator) anymore. The concern
>>>>> was that if hugeResultSet is an Iterator because it's huge, then
>>>>> someone might write:
>>>>>
>>>>>    <#assign transformed = hugeResultSet?map(it -> something(it))>
>>>>>    <#list transformed as it>
>>>>>
>>>>> instead of just
>>>>>
>>>>>    <#list hugeResultSet?map(it -> something(it)) as it>
>>>>>
>>>>> and thus consuming a lot of memory without realizing it. So now if
>>>>> hugeResultSet wasn't already a sequence (List-like), the assignment
>>>>> will be an error, since we can't safely store a lazily transformed
>>>>> collection (lambdas will break), and we can't condense it down to a
>>>>> sequence (List-like thing) automatically either, as that might
>>>>> consume too much memory. If hugeResultSet was a sequence, then it's
>>>>> not an error, as we assume that keeping all of it in memory is fine,
>>>>> as the original was stored there as well (in practice, most of the
>>>>> time... in principle we can't know).
>>>>>
>>>>> Now if the user feels confident about it, they can still write:
>>>>>
>>>>>    <#assign transformed = hugeResultSet?map(it -> something(it))?sequence>
>>>>>
>>>>> Similarly, hugeResultSet?map(it -> something(it))[index] will be an
>>>>> error, as [index] is for sequences only, and ?map will not change a
>>>>> non-sequence to a sequence anymore. Similarly, if the user feels
>>>>> confident about it, they can write hugeResultSet?map(it ->
>>>>> something(it))?sequence[index].
>>>>>
>>>>> An interesting consequence of these is that ?sequence is now a bit
>>>>> smarter than before. Like if you write myIterator?sequence[n], it will
>>>>> not fetch the elements into an in-memory sequence, it just skips n
>>>>> elements from myIterator, and returns the nth one. Similarly,
>>>>> myIterator?sequence?size won't store the elements in memory, it just
>>>>> counts them.
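
The skip-and-count behaviour described above can be sketched in plain Java. The helper names `nth` and `count` are hypothetical and the code only illustrates the idea of reading from an Iterator without building a list; it does not show how FreeMarker implements `?sequence`.

```java
import java.util.Arrays;
import java.util.Iterator;

public class IteratorAccess {
    // Returns the element at zero-based index n by skipping n elements; nothing is stored.
    public static <T> T nth(Iterator<T> it, int n) {
        for (int i = 0; i < n; i++) {
            it.next(); // throws NoSuchElementException if the iterator is too short
        }
        return it.next();
    }

    // Counts the remaining elements without keeping any of them.
    public static <T> int count(Iterator<T> it) {
        int size = 0;
        while (it.hasNext()) {
            it.next();
            size++;
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println(nth(Arrays.asList("a", "b", "c", "d").iterator(), 2)); // prints c
        System.out.println(count(Arrays.asList("a", "b", "c", "d").iterator()));  // prints 4
    }
}
```

Both helpers consume the iterator, so each call needs a fresh one; memory use stays constant regardless of how many elements are skipped or counted.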
>>>>>
>>>>> As an interesting note, these two are also identically efficient:
>>>>>
>>>>>    <#assign seq = hugeResultSet?filter(it -> something(it))?sequence>
>>>>>    <#assign seq = hugeResultSet?sequence?filter(it -> something(it))>
>>>>>
>>>>> In both cases the actual conversion to a sequence (in-memory list)
>>>>> happens only just before assigning the value to seq. Once again,
>>>>> ?sequence now just means "it's OK to treat this as a sequence, however
>>>>> inefficient it is", and not "convert it to sequence right now".
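
One way to picture "filter lazily, convert only at the end" is a wrapping Iterator in Java. This is a sketch under assumptions: the `filter` and `toList` helpers are hypothetical names, and nothing here is taken from FreeMarker's source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

public class LazyFilter {
    // Wraps source so that elements are tested only as the consumer pulls them.
    public static <T> Iterator<T> filter(Iterator<T> source, Predicate<? super T> keep) {
        return new Iterator<T>() {
            private T next;
            private boolean ready;

            @Override
            public boolean hasNext() {
                while (!ready && source.hasNext()) {
                    T candidate = source.next();
                    if (keep.test(candidate)) {
                        next = candidate;
                        ready = true;
                    }
                }
                return ready;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                ready = false;
                T result = next;
                next = null;
                return result;
            }
        };
    }

    // The only step that materializes: copies whatever the iterator yields into a list.
    public static <T> List<T> toList(Iterator<T> it) {
        List<T> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        Iterator<Integer> evens = filter(Arrays.asList(1, 2, 3, 4, 5, 6).iterator(), n -> n % 2 == 0);
        System.out.println(toList(evens)); // prints [2, 4, 6]
    }
}
```

Until `toList` runs, only one pending element is held at a time, which is why the order of the "treat as sequence" marker and the filter need not change the cost.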
>>>>>
>>>>>
>>>>> Friday, June 7, 2019, 10:38:50 AM, Christoph Rüger wrote:
>>>>>
>>>>>> These optimisations sound great. I will try to run some tests within the
>>>>>> next few weeks. A bit busy lately.
>>>>>> Thanks
>>>>>> Christoph
>>>>>>
>>>>>> On Wed, 29 May 2019 at 23:55, Daniel Dekany <ddekany@apache.org> wrote:
>>>>>>> Tuesday, April 2, 2019, 12:10:16 PM, Christoph Rüger wrote:
>>>>>>>
>>>>>>> [snip]
>>>>>>>>> Well, if you fear users jumping on ?filter/?map outside #list for no
>>>>>>>>> good enough reason, there can be some option to handle that. But I
>>>>>>>>> don't think restricting the usage to #list is a good compromise as the
>>>>>>>>> default.
>>>>>>>> I agree. Just keep as it is.
>>>>>>>>
>>>>>>>>>>> I'm not sure how efficiently could a configuration setting catch these
>>>>>>>>>>> cases, or if it should be addressed on that level.
>>>>>>>>>> Maybe let's postpone configurability discussion a bit until the above is
>>>>>>>>>> more clear.
>>>>>>>>> In the light of the above, I think we can start thinking about that
>>>>>>>>> now.
>>>>>>>> On that note on configurability: Would it be possible to programmatically
>>>>>>>> influence the Collection (Sequence) which is created under the hood?
>>>>>>>> E.g. by specifying a Factory? I ask because we are using something like
>>>>>>>> this
>>>>>>>> (https://dzone.com/articles/a-filebasedcollection-in-java-for-big-collections)
>>>>>>>> in other places for large collections. I know it is very specific, but just
>>>>>>>> wanted to bring it up.
>>>>>>> [snip]
>>>>>>>
>>>>>>> I think a good approach would be to ban the *implicit* collection of
>>>>>>> the result, when the filtered/mapped source is an Iterator, or other
>>>>>>> similar stream-like object that's often used for enumerating a huge
>>>>>>> number of elements. So for example, let's say you have this:
>>>>>>>
>>>>>>>    <#assign xs2 = xs?filter(f)>
>>>>>>>
>>>>>>> If xs is List-like, then this will work. Since the xs List fits into
>>>>>>> the memory (although a List can be backed by disk, that's rather
>>>>>>> rare), hopefully it's not the kind of data amount that can't fit into
>>>>>>> the memory again (as xs2). On the other hand, if xs is an
>>>>>>> Iterator-like object, then the above statement fails, with the hint
>>>>>>> that xs?filter(f)?sequence would work, but might consume a lot of
>>>>>>> memory.
>>>>>>>
>>>>>>> This is also consistent with how xs[i] works in the existing
>>>>>>> FreeMarker versions. That only works if xs is List-like (an FTL
>>>>>>> sequence). While xs[i] would be trivial to implement even if xs is
>>>>>>> Iterator-like, we don't do that as it's not efficient for a high i,
>>>>>>> and so the template author is probably not meant to do that. If he
>>>>>>> knows what he's doing though, he can write xs?sequence[i]. Yes, that's
>>>>>>> very inefficient if you only use [] once on that sequence, but you see
>>>>>>> the logic. map/filter breaks it, as xs?filter(f)[i] works even if xs
>>>>>>> is an Iterator, because filter/map currently always returns a
>>>>>>> sequence. If xs is Iterator-like, then I want filter/map to return an
>>>>>>> Iterator-like as well, so then [] will fail on it.
>>>>>>>
>>>>>>> As a side note, I will make ?sequence smarter too, so that
>>>>>>> xs?sequence[i] won't actually build a sequence if xs is Iterator-like.
>>>>>>> It just has to skip the first i elements after all. (The ?sequence is
>>>>>>> still required there. It basically says: "I know what I'm doing, treat
>>>>>>> this as a sequence.")
>>>>>>>
>>>>>>> --
>>>>>>> Thanks,
>>>>>>>   Daniel Dekany
>>>>>>>
>>>>>>>
>>>> --
>>>> Thanks,
>>>>   Daniel Dekany
>>>>
>>>>
>>> --
>>> Christoph Rüger, Managing Director
>>> Synesty <https://synesty.com/> - Connect and automate without
>>> programming - automation, interfaces, data feeds
>>>
>>> Xing: https://www.xing.com/profile/Christoph_Rueger2
>>> LinkedIn: http://www.linkedin.com/pub/christoph-rueger/a/685/198
>>>
>> --
>> Thanks,
>>   Daniel Dekany
>>
>>
