Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add comp, constantly and complement #1179

Merged
merged 7 commits into from
Dec 25, 2016
Merged

Add comp, constantly and complement #1179

merged 7 commits into from
Dec 25, 2016

Conversation

tuturto
Copy link
Contributor

@tuturto tuturto commented Dec 19, 2016

relates #1176

@gilch
Copy link
Member

gilch commented Dec 19, 2016

It looks like comp is applying functions in the opposite order of Clojure's. Hy's comp should work the same as Clojure's comp.

@Kodiologist
Copy link
Member

Kodiologist commented Dec 19, 2016

I agree that Clojure's order is the right one, since in mathematics, (g ∘ f)(x) usually means g(f(x)).

@pyx
Copy link
Contributor

pyx commented Dec 20, 2016

Agree with gilch and Kodiologist, ((comp f g) x) means (f (g x)).

Regarding tests, compose does obey three laws, provided function f, g and h are pure functions, id is identity function and .(dot) is compose function:

  1. left identity
    (id . f) x = f x
  2. right identity
    (f . id) x = f x
  3. associative
    ((f . g) . h) x = (f . (g . h)) x

This is how I test compose in hymn, even though it looks like overkill for such a simple function. :)

@tuturto
Copy link
Contributor Author

tuturto commented Dec 20, 2016

heh, that's rather embarrassing. I'll fix that. Good that it was spotted so soon.

Edit: actually, @pyx, can I just copy your implementation? It's concise and I don't think I can come up with anything like that. It's on a different license of course.

@pyx
Copy link
Contributor

pyx commented Dec 20, 2016

@tuturto, sure, my pleasure, if you find that useful, please do whatever you want, including re-licensing, I didn't put a no-evil clause on it. :D

@pyx
Copy link
Contributor

pyx commented Dec 20, 2016

@tuturto
In case you were talking about not only the tests but also the function itself, please notice that there is a design decision I made that might not be suitable for standard library: I did not catch the no-argument case, since the function was meant to be used only internally, an exception from reduce is enough for me.

I think there are two sensible options:

  1. comp can be called with any number of arguments, return identity function when no argument provided.
    EDIT: one if test for no-arugment case is enough, reduce can handle sequence with only one element.

  2. comp should be called with at least one argument, something like

(defn compose [f &rest fs]
  "function composition"
  (defn compose-2 [f g]
    "compose 2 functions"
    (fn [&rest args &kwargs kwargs]
      (f (apply g args kwargs))))
  (reduce compose-2 fs f))

should do

@tuturto
Copy link
Contributor Author

tuturto commented Dec 20, 2016

Thank you @pyx. I added the atleast-one-argument version and fixed docs while I was at it.

@gilch
Copy link
Member

gilch commented Dec 20, 2016

We should really follow Clojure's lead and choose option 1: return identity.

@tuturto
Copy link
Contributor Author

tuturto commented Dec 20, 2016

I find it bit counter-intuitive that comp without any parameters returns identity, but it doesn't really make sense to deviate from other lisps. I added that there too now.

@Kodiologist
Copy link
Member

Kodiologist commented Dec 20, 2016

The idea is that the identity function is the identity element for the composition operator. That is, (comp f identity) is equal to f. That's why it's the right choice for the nullary case.

@tuturto
Copy link
Contributor Author

tuturto commented Dec 20, 2016

oh, hm.. If you put it that way it makes sense.

@pyx
Copy link
Contributor

pyx commented Dec 20, 2016

May I suggest a minor improvement,
since we have if taking out the empty case, and reduce can handle sequence with a single element by returning that sole element, which is what we expect in this case. (comp f) == f

(reduce compose-2 (list (rest fs)) (first fs))

can be

(reduce compose-2 fs)

@gilch
Copy link
Member

gilch commented Dec 21, 2016

Function calls are kind of expensive in Python I think it would be more efficient to use the imperative style, rather than calling a new function for every pair. Maybe something like this:

def comp(*fs):
    if not fs:
        return identity
    fs = reversed(fs)
    def composed(*args,**kwargs):
        for f in fs:
            res = f(*args,**kwargs)
            for f in fs:
                res = f(res)
        return res
    return composed

@Kodiologist
Copy link
Member

@gilch I think a benchmark or example showing an improvement should be a minimum requirement for a performance-motivated change.

@gilch
Copy link
Member

gilch commented Dec 21, 2016

@Kodiologist, that's fair, but I'm starting to think Hy needs a timing macro. The timeit module is proving difficult to use in Hy.

Here's a Hy version:

=> (defn comp [&rest fs]
... (if (not fs) identity
...     (= 1 (len fs)) (first fs)
...     (do (setv fs (reversed fs))
...         (fn [&rest args &kwargs kwargs]
...           (for* [f fs] (setv res (apply f args kwargs))
...             (for* [f fs] (setv res (f res))))
...           res))))

@gilch
Copy link
Member

gilch commented Dec 21, 2016

(defn comp [&rest fs]
  (if (not fs) identity
      (= 1 (len fs)) (first fs)
      (do (setv fs (reversed fs))
          (fn [&rest args &kwargs kwargs]
            (for* [f fs] (setv res (apply f args kwargs))
              (for* [f fs] (setv res (f res)))) res))))

(import [time [time]])
(setv start (time))

(for [_ (range 100000)]
  ((comp inc inc inc inc) 1))

(print (- (time) start))
(print "iterative comp completed.")


(defn comp [&rest fs]
   "function composition"
   (defn compose-2 [f g]
     "compose 2 functions"
     (fn [&rest args &kwargs kwargs]
       (f (apply g args kwargs))))
   (if fs
     (reduce compose-2 fs)
     identity))

(setv start (time))

(for [_ (range 100000)]
  ((comp inc inc inc inc) 1))
(print (- (time) start))

(print "functional comp completed.")
0.6814742088317871
iterative comp completed.
1.6196417808532715
functional comp completed.

@pyx
Copy link
Contributor

pyx commented Dec 22, 2016

@gilch
IMHO, I do not understand the nested for loop, it works because in Hy, reversed, like in python3, returns a generator, so the first iteration on the outer loop fetch the first function, then all the work done inside inner loop, then the generator is exhausted, yielding the expected result (each function been called once and only once).

But, then this composed function can only be called once, because the function list (fs) is defined in outer scope of the function, try:

hy 0.11.0+353.gca6fd66 using CPython(default) 3.5.2 on Linux
=> (defn comp [&rest fs]
...   (if (not fs) identity
...       (= 1 (len fs)) (first fs)
...       (do (setv fs (reversed fs))
...           (fn [&rest args &kwargs kwargs]
...             (for* [f fs] (setv res (apply f args kwargs))
...               (for* [f fs] (setv res (f res)))) res))))
=> (def inc4 (comp inc inc inc inc))
=> (inc4 1)
5
=> (inc4 1)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "<input>", line 263, in _hy_anon_fn_1
UnboundLocalError: local variable 'res' referenced before assignment
=> 

Here is my take:

(defn comp [&rest fs]
  (if (not fs) identity
      (= 1 (len fs)) (first fs)
      (do (setv rfs (reversed fs)
                first-f (next rfs)
                fs (tuple rfs))
          (fn [&rest args &kwargs kwargs]
            (setv res (apply first-f args kwargs))
            (for* [f fs]
              (setv res (f res)))
            res))))

Notice:

  1. I used next instead of first, to make explicit the intent that we want to advance the generator once.
  2. I used tuple call to make the rest of sequence re-usable, I guess tuple should have small footprint and better performance.
  3. I named the first function first-f, a little bit ugly, but we need a different name or we will have the (in)famous UnboundLocalError exception, no nonlocal here, because we still support python2. Oh, lovely python scoping ... :)

@Kodiologist
Copy link
Member

Yeah, the nested for loop is needlessly weird. I'd write your version, @pyx, without the extra variables, like this:

(defn comp [&rest fs]
  (if (not fs) identity
      (= 1 (len fs)) (first fs)
      (fn [&rest args &kwargs kwargs]
        (setv res (apply (get fs -1) args kwargs))
        (for [f (reversed fs)]
          (setv res (f res)))
        res)))

This has similar performance to @gilch's version.

@pyx
Copy link
Contributor

pyx commented Dec 22, 2016

@Kodiologist
Generating reversed list each time on-the-fly (inside the inner function) is fine, as we usually do not feed too much into comp, so the performance lost should be acceptable, but this version will call the last (of the argument list) function twice:

hy 0.11.0+353.gca6fd66 using CPython(default) 3.5.2 on Linux
=> (defn comp [&rest fs]
...   (if (not fs) identity
...       (= 1 (len fs)) (first fs)
...       (fn [&rest args &kwargs kwargs]
...         (setv res (apply (get fs -1) args kwargs))
...         (for [f (reversed fs)]
...           (setv res (f res)))
...         res)))
=> (def inc4 (comp inc inc inc (fn [n] (print "hit!") (inc n))))
=> (inc4 1)
hit!
hit!
6

Edit: spelling mistake

@Kodiologist
Copy link
Member

Oh, right, whoops, I meant (cut fs -2 None -1), not (reversed fs).

@pyx
Copy link
Contributor

pyx commented Dec 22, 2016

I think the most common case of compose, might be that the composed fuction will be re-used a lot, think map, so on a second thought, we might better prepare the function list before entering into the scope of inner (composed) function.

And I wrote a naive time-it macro as mentioned by @gilch, I will open a new issue.

@pyx pyx mentioned this pull request Apr 29, 2022
@tuturto
Copy link
Contributor Author

tuturto commented Dec 22, 2016

Nice work and quite big difference in the speed already ❤️

@pyx
Copy link
Contributor

pyx commented Dec 22, 2016

Now that we have a proper (I hope so) timeit macro hylang/hyrule#39
And we have three implementations:

(defn comp-f [&rest fs]
   "function composition"
   (defn compose-2 [f g]
     "compose 2 functions"
     (fn [&rest args &kwargs kwargs]
       (f (apply g args kwargs))))
   (if fs
     (reduce compose-2 fs)
     identity))

(defn comp-i1 [&rest fs]
  (if (not fs) identity
      (= 1 (len fs)) (first fs)
      (fn [&rest args &kwargs kwargs]
        (setv res (apply (get fs -1) args kwargs))
        (for [f (cut fs -2 None -1)]
          (setv res (f res)))
        res)))

(defn comp-i2 [&rest fs]
  (if (not fs) identity
      (= 1 (len fs)) (first fs)
      (do (setv rfs (reversed fs)
                first-f (next rfs)
                fs (tuple rfs))
          (fn [&rest args &kwargs kwargs]
            (setv res (apply first-f args kwargs))
            (for* [f fs]
              (setv res (f res)))
            res))))

It's time to let the data speak:

=> (defn comp-f [&rest fs]
...    "function composition"
...    (defn compose-2 [f g]
...      "compose 2 functions"
...      (fn [&rest args &kwargs kwargs]
...        (f (apply g args kwargs))))
...    (if fs
...      (reduce compose-2 fs)
...      identity))

=> (defn comp-i1 [&rest fs]
...   (if (not fs) identity
...       (= 1 (len fs)) (first fs)
...       (fn [&rest args &kwargs kwargs]
...         (setv res (apply (get fs -1) args kwargs))
...         (for [f (cut fs -2 None -1)]
...           (setv res (f res)))
...         res)))

=> (defn comp-i2 [&rest fs]
...   (if (not fs) identity
...       (= 1 (len fs)) (first fs)
...       (do (setv rfs (reversed fs)
...                 first-f (next rfs)
...                 fs (tuple rfs))
...           (fn [&rest args &kwargs kwargs]
...             (setv res (apply first-f args kwargs))
...             (for* [f fs]
...               (setv res (f res)))
...             res))))

=> (print ((comp-f inc inc inc inc) 1))
5

=> (print ((comp-i1 inc inc inc inc) 1))
5

=> (print ((comp-i2 inc inc inc inc) 1))
5

=> (time-it ((comp-f inc inc inc inc) 1))
11.021359261001635

=> (time-it ((comp-i1 inc inc inc inc) 1))
8.513512686997274

=> (time-it ((comp-i2 inc inc inc inc) 1))
12.34987382899999

=> (time-it (inc4 1) (setv inc4 (comp-f inc inc inc inc)))
4.459002538002096

=> (time-it (inc4 1) (setv inc4 (comp-i1 inc inc inc inc)))
4.978545720001421

=> (time-it (inc4 1) (setv inc4 (comp-i2 inc inc inc inc)))
3.3503086150012678

Calling cut is actually slower than recurring 4 times. If we try 10 (on my machine), then it starts to gain:

=> (time-it (inc10 1) (setv inc10 (comp-f inc inc inc inc inc inc inc inc inc inc)))
11.87790252600098
=> (time-it (inc10 1) (setv inc10 (comp-i1 inc inc inc inc inc inc inc inc inc inc)))
7.043803205000586
=> (time-it (inc10 1) (setv inc10 (comp-i2 inc inc inc inc inc inc inc inc inc inc)))
5.294795802998124

@pyx
Copy link
Contributor

pyx commented Dec 22, 2016

Now that I think flip might not be a good idea, but may I propose other two handy functions?

pipe - the counterpart of comp

(defn pipe [&rest fs]
  "docstring goes here"
  (apply comp (reversed fs)))

in use

=> (def inc-str (pipe int inc str))
=> (inc-str "41")
'42'

juxt - I saw this one when I read about the "see also" part on clojure-docs, did not read the implementation though

(defn juxt [f &rest fs]
  "docstring goes here"
  (setv fs (cons f fs))
  (fn [&rest args &kwargs kwargs]
    (setv res [])
    (for* [f fs]
      (.append res (apply f args kwargs)))
    res))

I can imagine juxt will be super handy in data processing

=> ((juxt min max sum) (range 1 101))
[1, 100, 5050]

=> (setv [total difference product quotient] ((juxt + - * /) 24 3))
[27, 21, 72, 8.0]

=> (import [operator [itemgetter]])
=> ((juxt (itemgetter 'name) (itemgetter 'age)) {'name 'Joe 'age 42 'height 182})
['Joe', 42]

@Kodiologist
Copy link
Member

@pyx, this PR is for only the three functions named in the title. Those should be in their own PR if you want them.

@gilch
Copy link
Member

gilch commented Dec 23, 2016

@pyx you could make a new issue first for those.

From the timing data we see that the cut version doesn't perform as well. It can be constructed faster, because it defers some computation to execution. This is not a good trade-off in the common case of creating a composede function once, and then calling it many times in a map over a long iterable. It's better to precompute it once. So far, I like @pyx's implementation of comp best.

You could use cut instead of reversed when precomputing. I'm not sure which is faster. We could also test for more lengths in the if, and hard code the short cases while avoiding the loop altogether. I'm not sure it would help much, but the overhead of a loop and assignment is probably more significant when there are fewer functions.

@pyx
Copy link
Contributor

pyx commented Dec 23, 2016

@Kodiologist, fair enough, but usually pipe is defined in terms of comp to avoid duplicate code, so that PR will be depending on this one, in this case, it might be less management effort to go with comp.

juxt, I can draft a separate PR.

@pyx
Copy link
Contributor

pyx commented Dec 23, 2016

@gilch Since I am new here, sometimes I don't know which way should I go, e.g, I am not sure if creating a new issue regarding yet-to-be merged PR is a proper way of giving feedback.

That's why I did not draft a PR myself in the first place, I know it should come with proper documentation and test cases, to which the requirements I am not quite sure about. I will try, for juxt, anyway.

About cut vs reversed when precomputing, I think the invocation of comp is a onetime event in most cases (data processing/analysis, web app(?), etc.), the difference is acceptable, to say the least, so whichever with better readability should win, and I vote reversed for being straightforward, cut, IMHO, is not like [::-1], which is a well known pythonic idiom.

@pyx
Copy link
Contributor

pyx commented Dec 23, 2016

I have juxt implemented with documentation and test, should I create a PR now, or wait until this PR get merged first then rebase, cause it will create merge conflict[1] right now, both PRs touch the *exports* part.

[1] That's why I proposed pipe and juxt into this PR instead of a separate one.

@Kodiologist
Copy link
Member

Kodiologist commented Dec 23, 2016

@pyx Wait until this one is merged. Not because of a merge conflict, but because pipe depends on comp.

@Kodiologist
Copy link
Member

Kodiologist commented Dec 23, 2016

@gilch I sort of regret even bringing up the subject of benchmarks because any discussion about performance is speculation outside of profiling a real program. Premature optimization is the root of all evil. Anyway, in my opinion, any of the three versions is fine for now. Do you approve this PR?

@pyx
Copy link
Contributor

pyx commented Dec 23, 2016

@Kodiologist , okay, I will add pipe into the mix (just juxt right now) and send a PR later. Thanks.

@gilch
Copy link
Member

gilch commented Dec 24, 2016

@gilch Since I am new here, sometimes I don't know which way should I go, e.g, I am not sure if creating a new issue regarding yet-to-be merged PR is a proper way of giving feedback.

We don't know either. Don't worry too much about it, you're doing fine.

I would approve adding juxt to Hy. Clojure has it and I use it. I'd still suggest a new PR. I feel pipe needs more discussion. That's why I don't want it added to this PR. You may open a new issue for discussing that, but I'm not inclined to approve it since Clojure doesn't have it. It mostly duplicates the behavior of the threading macros (->,->>,as->). And comp, but in reverse. It is a good point that pipe can be implemented in terms of comp, but we don't have to do it that way, and it might be more efficient if we don't. It might actually be easier to implement pipe first and then comp in terms of pipe.

Merge conflicts in this file could probably be avoided if the exports list had only one element per line. The merge conflicts in this case are trivial to resolve though.

@gilch I sort of regret even bringing up the subject of benchmarks because any discussion about performance is speculation outside of profiling a real program. Premature optimization is the root of all evil. Anyway, in my opinion, any of the three versions is fine for now. Do you approve this PR?

I normally agree with that maxim, but I feel you're using it out of context. When developing an application, it's unlikely that any given function will be on the critical path, so it's best to profile and see what's actually the bottleneck. Optimizing other parts is just a waste of time and the optimized code can be harder to maintain too.

However, when developing the standard library, we're not just making one application, but the building blocks of every application written in Hy. Therefore it's much more likely the functions we write will be on the critical path of some Hy applications, especially for something as fundamental as function composition! So I do feel it's worth some effort not to be too inefficient here. That said, proving optimality is probably an undecidable problem in the general case, so we do have to give up eventually. But you do hit diminishing returns pretty quickly in practice for a function this short.

constantly and complement are pretty trivial, and appear to be correct. I'll approve those so far. I'll also approve comp if it's changed to @pyx's version.

@refi64
Copy link
Contributor

refi64 commented Dec 25, 2016

I'm not really seeing a use case for compliment... what's the difference between that and (comp not f)?

@Kodiologist
Copy link
Member

@kirbyfan64 Good thinking; comp not is shorter than complement, anyway.

@pyx
Copy link
Contributor

pyx commented Dec 25, 2016

@gilch this is my thought on pipe
When composing with, let's say, more than a handful functions, pipe is easier to work with, case study:
Assuming we are building a file processing pipeline, and have these functions:
read-file write-file parse-file rename transform-phase-1 transform-phase-2 transform-phase-3 , etc.
They are all of type

file -> file

We can build the pipeline like this:

(def process
  (pipe
    read-file
    parse-file
    transform-phase-1
    transform-phase-2
    transform-phase-3
    rename
    write-file))

It's just more natural to read in this case than comp.
If I am forced to have only one, I will go with pipe. But, there is only compose predefined (have a name) in maths, so many functional programming languages only define compose in their standard libraries, it's not because pipe is less useful, to reduce the surprise factor, so compose should stay...

If we can have both comp and pipe, I totally agree that define pipe imperatively, then define comp in terms of pipe looks better, especially in (I suppose) common scenarios like above, pipe gets used more. I might actually go as far as define both imperatively, cause these two functions' behaviours are so well known, and the logic is so trivial, that we do not expect them to be changed in the future, that means almost no maintenance burden.

@pyx
Copy link
Contributor

pyx commented Dec 25, 2016

@kirbyfan64 @gilch , I totally agree technically (compliment f) is just (comp not f), and in other static compiled languages, the compiler may even inline the function, but iirc, in Hy, not is not a normal function (pun not intended) but a special form, so, cannot be composed:

hy 0.11.0+357.g29b3702 using CPython(default) 3.5.2 on Linux
=> (def my-even? (comp not odd?))
Traceback (most recent call last):
  File "<input>", line 1, in <module>
NameError: name 'not' is not defined
=> not
Traceback (most recent call last):
  File "<input>", line 1, in <module>
NameError: name 'not' is not defined
=> (not True)
False

If I remember correctly, you have an issue or milestone regarding this and in, etc. But it does not work as of today.

BTW, happy holidays, everyone.

@Kodiologist
Copy link
Member

If I remember correctly, you have an issue or milestone regarding this and in, etc. But it does not work as of today.

Yes, #1103. I'm guessing it will be pretty straightforward.

@pyx
Copy link
Contributor

pyx commented Dec 25, 2016

PS. my juxt branch is here master...pyx:juxt , just waiting for this PR, then rebase, also, if you all agree to add pipe, I can add that too.

@refi64
Copy link
Contributor

refi64 commented Dec 25, 2016

not is not a normal function (pun not intended)

I wonder...not should probably be shadowed just like operators like +< , right?

@tuturto
Copy link
Contributor Author

tuturto commented Dec 25, 2016

I updated comp to @pyx version. Happy holidays people ❤️

@gilch gilch merged commit 71f30e8 into hylang:master Dec 25, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants