Commit e96d3fe

[misc] add performance comparison
1 parent 31fa90a commit e96d3fe

7 files changed

+392
-2
lines changed

README.md

+2 -2
@@ -60,10 +60,10 @@ In addition, `py4cl2-cffi/config` also exports the following useful symbols:
 - [x] numpy arrays to non-CL arrays
 - [x] arbitrary module import
 - [x] numpy floats
-- [ ] optimization: [PyCall.jl](https://github.com/JuliaPy/PyCall.jl) is just about 2x slower than native CPython, while we are 10-20x as slow!
+- [ ] optimization ~~[PyCall.jl](https://github.com/JuliaPy/PyCall.jl) is just about 2x slower than native CPython, while we are 10-20x as slow!~~ (See [./perf-compare/README.org](./perf-compare/README.org).)
 - [ ] unloading python libraries to allow reloading python without restarting lisp (?)
 - [ ] playing nice with dumping a lisp image
-- [ ] single threaded mode: some python libraries (including matplotlib) hate multithreaded environments
+- [x] single threaded mode: some python libraries (including matplotlib) hate multithreaded environments
 
 ... and much more ...
 

perf-compare/README.org

+199
@@ -0,0 +1,199 @@
+
+
+
+* Performance Comparison
+:PROPERTIES:
+:TOC: :include siblings :depth 1 :ignore this
+:CUSTOM_ID: performance-comparison
+:END:
+
+All the scripts below were run on an i7-8750H locked to 0.8GHz. All except burgled-batteries3 use python3.10. The Julia version is 1.10 and the SBCL version is 2.3.11.
+
+Burgled-batteries3 uses SBCL 1.5.4 and python3.6m.
+
+:CONTENTS:
+- [[#run-pypy][run-py.py]]
+- [[#run-bblisp][run-bb.lisp]]
+- [[#run-pycalljl][run-pycall.jl]]
+- [[#run-py4cl2-cffi][run-py4cl2-cffi]]
+- [[#run-py4cl][run-py4cl]]
+- [[#summary][Summary]]
+:END:
+
+* run-py.py
+:PROPERTIES:
+:CUSTOM_ID: run-pypy
+:END:
+
+#+begin_src sh
+python run-py.py
+#+end_src
+
+#+begin_src
+Evaluating performance of pystr_i through 1000000 calls...
+Calls per second: 1576057.6666891782
+
+Evaluating performance of pycall_str through 100000 calls...
+Calls per second: 1074461.2959969486
+#+end_src
+
+* run-bb.lisp
+:PROPERTIES:
+:CUSTOM_ID: run-bblisp
+:END:
+
+Install burgled-batteries3 by following the instructions [[https://github.com/digikar99/burgled-batteries3#installation][here]]. The burgled-batteries3 repository must be somewhere quicklisp can find. Usually, this is ~/path/to/quicklisp/local-projects/~.
+
+Finally, activate the burgled-batteries3 environment and run
+
+#+begin_src sh
+sbcl --no-userinit --load /path/to/quicklisp/setup.lisp --script run-bb.lisp
+#+end_src
+
+#+begin_src
+This is SBCL 1.5.4, an implementation of ANSI Common Lisp.
+More information about SBCL is available at <http://www.sbcl.org/>.
+
+SBCL is free software, provided as is, with absolutely no warranty.
+It is mostly in the public domain; some portions are provided under
+BSD-style licenses. See the CREDITS and COPYING files in the
+distribution for more information.
+
+
+Evaluating (DEFPYFUN "str" (OBJECT))
+
+Evaluating performance of
+ (LAMBDA (X) (DECLARE (OPTIMIZE SPEED)) (STR X))
+on the basis of 100000 runs...
+Calls per second: 9282.466
+
+Evaluating performance of
+ (LAMBDA (X) (DECLARE (OPTIMIZE SPEED)) (STR* X))
+on the basis of 100000 runs...
+Calls per second: 16339.869
+#+end_src
+
+* run-pycall.jl
+:PROPERTIES:
+:CUSTOM_ID: run-pycalljl
+:END:
+
+Install julia from [[https://julialang.org/downloads/][here]].
+
+Install PyCall.jl using its REPL:
+
+#+begin_src julia
+using Pkg
+Pkg.add("PyCall")
+#+end_src
+
+Finally, run the run-pycall.jl script
+
+#+begin_src sh
+/path/to/julia run-pycall.jl
+#+end_src
+
+#+begin_src
+Evaluating performance of pystr_i through 100000 calls...
+Calls per second: 239483.74837415223
+
+Evaluating performance of pycall_str through 10000 calls...
+Calls per second: 20160.842444870534
+#+end_src
+
+* run-py4cl2-cffi
+:PROPERTIES:
+:CUSTOM_ID: run-py4cl2-cffi
+:END:
+
+Make sure python3-config is accessible in the environment.
+
+The py4cl2-cffi repository must be somewhere quicklisp can find. Usually, this is ~/path/to/quicklisp/local-projects/~.
+
+#+begin_src sh
+sbcl --no-userinit --load /path/to/quicklisp/setup.lisp --eval '(ql:quickload "py4cl2-cffi")'
+#+end_src
+
+Finally, run the run-py4cl2-cffi.lisp script
+
+#+begin_src sh
+sbcl --no-userinit --load /path/to/quicklisp/setup.lisp --script run-py4cl2-cffi.lisp
+#+end_src
+
+#+begin_src
+This is SBCL 2.3.11, an implementation of ANSI Common Lisp.
+More information about SBCL is available at <http://www.sbcl.org/>.
+
+SBCL is free software, provided as is, with absolutely no warranty.
+It is mostly in the public domain; some portions are provided under
+BSD-style licenses. See the CREDITS and COPYING files in the
+distribution for more information.
+
+Evaluating performance of
+ (LAMBDA (X) (DECLARE (OPTIMIZE SPEED)) (PYSTR X))
+on the basis of 100000 runs...
+Calls per second: 210080.5
+
+Evaluating performance of
+ (LAMBDA (X) (DECLARE (OPTIMIZE SPEED)) (PYCALL "str" X))
+on the basis of 100000 runs...
+Calls per second: 55555.367
+#+end_src
+
+* run-py4cl
+:PROPERTIES:
+:CUSTOM_ID: run-py4cl
+:END:
+
+Make sure python3-config is accessible in the environment.
+
+The py4cl repository must be somewhere quicklisp can find. Usually, this is ~/path/to/quicklisp/local-projects/~.
+
+#+begin_src sh
+sbcl --no-userinit --load /path/to/quicklisp/setup.lisp --eval '(ql:quickload "py4cl")'
+#+end_src
+
+Finally, run the run-py4cl.lisp script
+
+#+begin_src sh
+sbcl --no-userinit --load /path/to/quicklisp/setup.lisp --script run-py4cl.lisp
+#+end_src
+
+#+begin_src
+This is SBCL 2.3.11, an implementation of ANSI Common Lisp.
+More information about SBCL is available at <http://www.sbcl.org/>.
+
+SBCL is free software, provided as is, with absolutely no warranty.
+It is mostly in the public domain; some portions are provided under
+BSD-style licenses. See the CREDITS and COPYING files in the
+distribution for more information.
+
+Evaluating performance of
+ (LAMBDA (X) (DECLARE (OPTIMIZE SPEED)) (PYTHON-CALL "str" X))
+on the basis of 10000 runs...
+Calls per second: 3576.2117
+
+Evaluating performance of
+ (LAMBDA (X)
+   (DECLARE (OPTIMIZE SPEED))
+   (REMOTE-OBJECTS
+     (PYTHON-CALL "str" X)))
+on the basis of 10000 runs...
+Calls per second: 3857.6765
+#+end_src
+
+* Summary
+:PROPERTIES:
+:CUSTOM_ID: summary
+:END:
+
+The table below summarizes the approximate number of calls per second each library reaches through either ~PyObject_Call~ or ~PyObject_Str~. A dash indicates either that no such facility is available or that I could not find how to use it.
+
+| Library \ How      | PyObject_Call | PyObject_Str |
+| <l>                |           <r> |          <r> |
+|--------------------+---------------+--------------|
+| Python             |       1000000 |      1600000 |
+| burgled-batteries3 |         16500 |            - |
+| PyCall.jl          |         20000 |       240000 |
+| py4cl2-cffi        |         55000 |       210000 |
+| py4cl              |          4000 |            - |
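
The README.md change above strikes out the old "10-20x as slow" estimate in favour of this table. As a rough cross-check, the sketch below turns the table's rounded figures into slowdown factors relative to native Python. The names `results` and `slowdown` are illustrative and not part of the commit; the numbers are copied from the summary table.

```python
# Calls per second, copied from the summary table above.
# None marks a cell the table leaves as "-".
results = {
    "Python": (1_000_000, 1_600_000),
    "burgled-batteries3": (16_500, None),
    "PyCall.jl": (20_000, 240_000),
    "py4cl2-cffi": (55_000, 210_000),
    "py4cl": (4_000, None),
}


def slowdown(native, measured):
    """How many times slower `measured` is than `native` (None if unmeasured)."""
    return None if measured is None else native / measured


def fmt(factor):
    """Render a slowdown factor, keeping the table's dash for missing cells."""
    return "-" if factor is None else f"{factor:.1f}x"


native_call, native_str = results["Python"]
slowdowns = {
    lib: (slowdown(native_call, call), slowdown(native_str, as_str))
    for lib, (call, as_str) in results.items()
}

for lib, (call, as_str) in slowdowns.items():
    print(f"{lib:20s} PyObject_Call: {fmt(call):>7s}  PyObject_Str: {fmt(as_str):>7s}")
```

On these figures py4cl2-cffi comes out roughly 18x slower than native Python through PyObject_Call and roughly 7.6x slower through PyObject_Str, while PyCall.jl comes out at 50x and roughly 6.7x respectively.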

perf-compare/run-bb.lisp

+45
@@ -0,0 +1,45 @@
+(asdf:load-system "burgled-batteries3")
+
+(in-package #:burgled-batteries3)
+
+(defun call-n-times (n function)
+  (let ((start-time (get-internal-real-time)))
+    (dotimes (i n)
+      (funcall function i))
+    (/ (- (get-internal-real-time) start-time)
+       internal-time-units-per-second)))
+
+(defun calls-per-second (n fn)
+  (let* ((total-time (call-n-times n fn))
+         (per-call-time (/ total-time n)))
+    (/ 1.0 per-call-time)))
+
+(startup-python)
+
+(terpri)
+(terpri)
+
+(defmacro print-and-eval (form)
+  (format t "Evaluating ~S~%" form)
+  form)
+
+(print-and-eval (defpyfun "str" (object)))
+
+(defmacro print-and-eval-perf (n lambda-form)
+  (terpri)
+  (format t "Evaluating performance of~% ~S~%on the basis of ~D runs..."
+          lambda-form n)
+  (force-output)
+  `(format t "~%Calls per second: ~D~%" (calls-per-second ,n ,lambda-form)))
+
+(print-and-eval-perf
+ 100000
+ (lambda (x)
+   (declare (optimize speed))
+   (str x)))
+
+(print-and-eval-perf
+ 100000
+ (lambda (x)
+   (declare (optimize speed))
+   (str* x)))

perf-compare/run-py.py

+25
@@ -0,0 +1,25 @@
+import time
+
+def call_n_times(n, fn):
+    start_time = time.process_time()
+    for i in range(n): fn(i)
+    return time.process_time() - start_time
+
+def calls_per_second (n, fn):
+    total_time = call_n_times(n, fn)
+    per_call_time = total_time / n
+    return 1/per_call_time
+
+def pystr(i):
+    return str(i)
+
+def pycall_str(i):
+    return globals()['__builtins__'].__dict__['str'](i)
+
+calls_per_second(1, pystr)
+print("Evaluating performance of pystr_i through 1000000 calls...")
+print("Calls per second: ", calls_per_second(1000000, pystr), "\n")
+
+calls_per_second(1, pycall_str)
+print("Evaluating performance of pycall_str through 100000 calls...")
+print("Calls per second: ", calls_per_second(100000, pycall_str),)
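
Note that run-py.py measures with `time.process_time()`, which counts CPU time of the current process, whereas the Lisp scripts in this commit measure with `get-internal-real-time`, which is wall-clock time. A minimal sketch of a wall-clock variant follows, assuming `time.perf_counter()` is an acceptable stand-in; the name `calls_per_second_wall` is hypothetical and not part of the commit.

```python
import time


def calls_per_second_wall(n, fn):
    """Return how many calls of fn per second were achieved over n calls,
    measured by wall-clock time rather than CPU time."""
    fn(0)  # one untimed warm-up call, mirroring the warm-up in run-py.py
    start = time.perf_counter()
    for i in range(n):
        fn(i)
    elapsed = time.perf_counter() - start
    return n / elapsed


def pystr(i):
    return str(i)


rate = calls_per_second_wall(100_000, pystr)
print("Calls per second:", rate)
```

Whether the two clocks give noticeably different figures on a given machine is not something the commit measures, so this is only a way to put the Python number on the same basis as the Lisp ones.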

perf-compare/run-py4cl.lisp

+40
@@ -0,0 +1,40 @@
+(asdf:load-system "py4cl")
+
+(defpackage :py4cl-user
+  (:use :cl :py4cl))
+
+(in-package :py4cl-user)
+
+(defun call-n-times (n function)
+  (let ((start-time (get-internal-real-time)))
+    (dotimes (i n)
+      (funcall function i))
+    (/ (- (get-internal-real-time) start-time)
+       internal-time-units-per-second)))
+
+(defun calls-per-second (n fn)
+  (let* ((total-time (call-n-times n fn))
+         (per-call-time (/ total-time n)))
+    (/ 1.0 per-call-time)))
+
+(defmacro print-and-eval-perf (n lambda-form)
+  (terpri)
+  (format t "Evaluating performance of~% ~S~%on the basis of ~D runs..."
+          lambda-form n)
+  (force-output)
+  `(format t "~%Calls per second: ~D~%" (calls-per-second ,n ,lambda-form)))
+
+(python-start)
+
+(print-and-eval-perf
+ 10000
+ (lambda (x)
+   (declare (optimize speed))
+   (python-call "str" x)))
+
+(print-and-eval-perf
+ 10000
+ (lambda (x)
+   (declare (optimize speed))
+   (remote-objects
+     (python-call "str" x))))

perf-compare/run-py4cl2-cffi.lisp

+50
@@ -0,0 +1,50 @@
+(asdf:load-system "py4cl2-cffi")
+
+(defpackage :py4cl2-cffi-user
+  (:use :cl :py4cl2-cffi)
+  (:import-from #:py4cl2-cffi
+                #:pyforeign-funcall))
+
+(in-package :py4cl2-cffi-user)
+
+(defun call-n-times (n function)
+  (let ((start-time (get-internal-real-time)))
+    (dotimes (i n)
+      (funcall function i))
+    (/ (- (get-internal-real-time) start-time)
+       internal-time-units-per-second)))
+
+(defun calls-per-second (n fn)
+  (let* ((total-time (call-n-times n fn))
+         (per-call-time (/ total-time n)))
+    (/ 1.0 per-call-time)))
+
+(defun pystr (i)
+  (declare (optimize speed)
+           (type (signed-byte 64) i))
+  (with-pygc
+    (cffi:foreign-string-to-lisp
+     (pyforeign-funcall "PyObject_Str"
+                        :pointer (pythonize i)
+                        :pointer))))
+
+(defmacro print-and-eval-perf (n lambda-form)
+  (terpri)
+  (format t "Evaluating performance of~% ~S~%on the basis of ~D runs..."
+          lambda-form n)
+  (force-output)
+  `(format t "~%Calls per second: ~D~%" (calls-per-second ,n ,lambda-form)))
+
+(pystart)
+
+(print-and-eval-perf
+ 100000
+ (lambda (x)
+   (declare (optimize speed))
+   (pystr x)))
+
+(print-and-eval-perf
+ 100000
+ (lambda (x)
+   (declare (optimize speed))
+   (pycall "str" x)))
