Speed up the commons functions #2

Merged · 2 commits into avinassh:master · Jul 14, 2021

Conversation

@captn3m0 (Contributor) commented Jul 8, 2021

My 10M (sqlite3_opt_batched) run goes from 42s to 26s.

Switching to https://docs.python.org/3/library/os.html#os.urandom would
make this even faster, but I don't think that's the point of the
exercise, so I'm leaving it at this. I think it's still nice to have because
it reduces the time by a lot and makes it faster for me to benchmark.
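
For context, a minimal sketch of what the os.urandom route could look like; the function name and value range are illustrative assumptions, not code from this PR:

```python
import os

def random_ages(n):
    """Map n bytes of OS entropy to ages in [5, 99] with a single bulk read."""
    raw = os.urandom(n)  # one call for n bytes instead of n calls into `random`
    return [5 + b % 95 for b in raw]
```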

result for threaded_batched.py against pypy3 on my i7 for all 100M records:

real 111.251 user 104.187 sys 4.483 pcpu 97.68

I seem to be running out of memory though (process keeps getting OOM-killed, wondering if that can be fixed somehow).

@avinassh (Owner)

> I seem to be running out of memory though (process keeps getting OOM-killed, wondering if that can be fixed somehow).

Huh, I wonder why this is happening. Is this for 100M records, or were you trying with a bigger number? The way it works currently is that it builds the entire DB in memory and flushes it to disk at the end. However, it should not take a lot of memory (I am assuming you have 8GB+ of RAM).
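
As a rough illustration of that in-memory-then-flush pattern (the schema, filenames, and placeholder rows here are assumptions, not necessarily the repo's exact code):

```python
import sqlite3

# Build the database entirely in memory first...
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, area TEXT, age INTEGER, active INTEGER)")
mem.executemany(
    "INSERT INTO user (area, age, active) VALUES (?, ?, ?)",
    [("0123456789", 30, 1)] * 1000,  # placeholder rows for illustration
)
mem.commit()

# ...then flush the whole thing to disk in one pass at the end.
disk = sqlite3.connect("users.db")
mem.backup(disk)
disk.close()
mem.close()
```

With this shape, every row stays resident until the final backup, so peak RAM scales with the row count, which is one plausible source of OOM pressure at 100M rows.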

@captn3m0 (Contributor, Author)

16GB of total memory, but I do run earlyoom, so that might have had an impact. I was also mostly testing against a RAM disk (/dev/shm) when it kept failing.
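
For reference, targeting a tmpfs RAM disk is just a matter of the database path (the filename is assumed for illustration):

```python
import sqlite3

# /dev/shm is a tmpfs mount on most Linux systems, so this database file
# lives entirely in RAM and counts against the same memory earlyoom watches.
con = sqlite3.connect("/dev/shm/users.db")
```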

@avinassh merged commit 199007d into avinassh:master on Jul 14, 2021
@avinassh (Owner)

Thank you!

I am merging this now. I will post the new numbers and how much faster it has gotten on my machine.

@avinassh (Owner)

I ran the script today; for CPython, the running time dropped by almost half. That's crazy!

Before:

Sat Jul 17 21:39:53 IST 2021 [PYTHON] running sqlite3_opt_batched.py (100_000_000) inserts

real	7m19.090s
user	7m8.877s
sys	0m7.883s

Sat Jul 17 21:47:16 IST 2021 [PYPY] running sqlite3_opt_batched.py (100_000_000) inserts

real	2m30.555s
user	2m23.542s
sys	0m6.135s

After:

Sat Jul 17 21:50:29 IST 2021 [PYTHON] running sqlite3_opt_batched.py (100_000_000) inserts

real	3m29.370s
user	3m23.231s
sys	0m5.434s

Sat Jul 17 21:54:01 IST 2021 [PYPY] running sqlite3_opt_batched.py (100_000_000) inserts

real	2m6.333s
user	1m59.112s
sys	0m6.538s
