Speed up the commons functions #2

Merged · 2 commits into avinassh:master · Jul 14, 2021

Conversation

@captn3m0 (Contributor) commented Jul 8, 2021

My 10M (sqlite3_opt_batched) run goes from 42s to 26s.

Switching to https://docs.python.org/3/library/os.html#os.urandom would
make this even faster, but I don't think that's the point of the
exercise, so I'm leaving it at this. I think it's still nice to have because
it reduces the time by a lot and makes it faster for me to benchmark.
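
For context, a minimal sketch of what the os.urandom route could look like; the function name and value range are illustrative assumptions, not code from this PR:

```python
import os

def random_ages(n):
    """Map n bytes of OS entropy to ages in [5, 99] with a single bulk read."""
    raw = os.urandom(n)  # one call for n bytes instead of n calls into `random`
    return [5 + b % 95 for b in raw]
```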

result for threaded_batched.py against pypy3 on my i7 for all 100M records:

real 111.251 user 104.187 sys 4.483 pcpu 97.68

I seem to be running out of memory though (process keeps getting OOM-killed, wondering if that can be fixed somehow).

@avinassh (Owner)

> I seem to be running out of memory though (process keeps getting OOM-killed, wondering if that can be fixed somehow).

Huh, I wonder why this is happening. Is this for 100M records, or were you trying with a bigger number? The way it works currently is that it builds the entire DB in memory and flushes it to disk at the end. However, it should not take a lot of memory (I am assuming you have 8GB+ of RAM).
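
As a rough illustration of that in-memory-then-flush pattern (the schema, filenames, and placeholder rows here are assumptions, not necessarily the repo's exact code):

```python
import sqlite3

# Build the database entirely in memory first...
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, area TEXT, age INTEGER, active INTEGER)")
mem.executemany(
    "INSERT INTO user (area, age, active) VALUES (?, ?, ?)",
    [("0123456789", 30, 1)] * 1000,  # placeholder rows for illustration
)
mem.commit()

# ...then flush the whole thing to disk in one pass at the end.
disk = sqlite3.connect("users.db")
mem.backup(disk)
disk.close()
mem.close()
```

With this shape, every row stays resident until the final backup, so peak RAM scales with the row count, which is one plausible source of OOM pressure at 100M rows.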

@captn3m0 (Contributor, Author)

16GB of total memory, but I do run earlyoom, so that might have had an impact. I was also mostly testing against a RAM disk (/dev/shm) when it kept failing.
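
For reference, targeting a tmpfs RAM disk is just a matter of the database path (the filename is assumed for illustration):

```python
import sqlite3

# /dev/shm is a tmpfs mount on most Linux systems, so this database file
# lives entirely in RAM and counts against the same memory earlyoom watches.
con = sqlite3.connect("/dev/shm/users.db")
```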

@avinassh merged commit 199007d into avinassh:master on Jul 14, 2021
@avinassh (Owner)

Thank you!

I am merging this now. I will post the new numbers and how much faster it has gotten on my machine.

@avinassh (Owner)

I ran the script today; for CPython, the running time dropped by almost half. That's crazy!

Before:

Sat Jul 17 21:39:53 IST 2021 [PYTHON] running sqlite3_opt_batched.py (100_000_000) inserts

real	7m19.090s
user	7m8.877s
sys	0m7.883s

Sat Jul 17 21:47:16 IST 2021 [PYPY] running sqlite3_opt_batched.py (100_000_000) inserts

real	2m30.555s
user	2m23.542s
sys	0m6.135s

After:

Sat Jul 17 21:50:29 IST 2021 [PYTHON] running sqlite3_opt_batched.py (100_000_000) inserts

real	3m29.370s
user	3m23.231s
sys	0m5.434s

Sat Jul 17 21:54:01 IST 2021 [PYPY] running sqlite3_opt_batched.py (100_000_000) inserts

real	2m6.333s
user	1m59.112s
sys	0m6.538s
