Memory usage increased a lot! #2051

Closed
utoni opened this issue Jul 17, 2023 · 6 comments

Comments

@utoni
Collaborator

utoni commented Jul 17, 2023

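Since #2041, memory usage has increased heavily: in the nDPId test results below, total allocated memory grows from roughly 8 MB to roughly 32 MB per run.
This makes nDPI basically impossible to use on some systems.
It seems that the Aho-Corasick (AC) based string matching implementation is not very memory efficient.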

ookla.pcap                                      	[DIFF]
--- /home/toni/git/nDPId/test/results/caches_cfg/ookla.pcap.out	2023-07-16 23:12:50.613960380 +0200
+++ /tmp/nDPId-test-stdout/caches_cfg_ookla.pcap.out.new	2023-07-17 09:02:05.891399762 +0200
@@ -64,3 +64,3 @@
-~~ total memory allocated....: 7995315 bytes
-~~ total memory freed........: 7995315 bytes
-~~ total allocations/frees...: 148474/148474
+~~ total memory allocated....: 32271394 bytes
+~~ total memory freed........: 32271394 bytes
+~~ total allocations/frees...: 487788/487788
teams.pcap                                      	[DIFF]
--- /home/toni/git/nDPId/test/results/caches_cfg/teams.pcap.out	2023-07-16 23:12:50.777959228 +0200
+++ /tmp/nDPId-test-stdout/caches_cfg_teams.pcap.out.new	2023-07-17 09:02:06.303397242 +0200
@@ -689,3 +689,3 @@
-~~ total memory allocated....: 9092192 bytes
-~~ total memory freed........: 9092192 bytes
-~~ total allocations/frees...: 151087/151087
+~~ total memory allocated....: 33370119 bytes
+~~ total memory freed........: 33370119 bytes
+~~ total allocations/frees...: 490401/490401
1kxun.pcap                                      	[DIFF]
--- /home/toni/git/nDPId/test/results/default/1kxun.pcap.out	2023-07-16 23:12:51.021957512 +0200
+++ /tmp/nDPId-test-stdout/default_1kxun.pcap.out.new	2023-07-17 09:02:06.771394379 +0200
@@ -1293,3 +1293,3 @@
-~~ total memory allocated....: 8506082 bytes
-~~ total memory freed........: 8506082 bytes
-~~ total allocations/frees...: 152931/152931
+~~ total memory allocated....: 32786745 bytes
+~~ total memory freed........: 32786745 bytes
+~~ total allocations/frees...: 492245/492245
443-chrome.pcap                                 	[DIFF]
--- /home/toni/git/nDPId/test/results/default/443-chrome.pcap.out	2023-07-16 23:12:51.145956641 +0200
+++ /tmp/nDPId-test-stdout/default_443-chrome.pcap.out.new	2023-07-17 09:02:07.127392201 +0200
@@ -16,3 +16,3 @@
-~~ total memory allocated....: 7966176 bytes
-~~ total memory freed........: 7966176 bytes
-~~ total allocations/frees...: 148289/148289
+~~ total memory allocated....: 32242135 bytes
+~~ total memory freed........: 32242135 bytes
+~~ total allocations/frees...: 487603/487603
@IvanNardi
Collaborator

I also noticed something bad about #2041: startup times are significantly longer. You can easily see that when running the tests. This is the reason for 5811a56.

As a temporary workaround you can disable this list loading.

Not sure about the default configurations (for the library, for ndpiReader and for the tests), though...

@utoni
Collaborator Author

utoni commented Jul 17, 2023

As libnDPI's string matching automata grow in size, it may be worth researching more memory-efficient multiple string matching data structures / algorithms.
AFAICS, Aho-Corasick has decent performance but poor memory consumption compared to others.
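
To make the memory concern concrete, here is a minimal Aho-Corasick sketch in Python. It is not nDPI's implementation; the function names (`build_automaton`, `search`) and the sample patterns are made up for illustration. The point is only that the automaton needs one state per distinct pattern prefix, so the state count (and with it the memory) grows with the total length of the loaded host lists.

```python
from collections import deque

def build_automaton(patterns):
    """Build a minimal Aho-Corasick automaton; returns (goto, fail, out)."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link per state
    out = [[]]    # patterns that end at each state
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)
    # Breadth-first pass: compute failure links and merge outputs.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out

def search(text, automaton):
    """Return (start_index, pattern) for every match in text."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            hits.append((i - len(pat) + 1, pat))
    return hits

# Hypothetical sample patterns, not taken from any nDPI list.
automaton = build_automaton(["he", "she", "his", "hers"])
num_states = len(automaton[0])  # root plus one state per distinct prefix
```

Every state also carries its transition table, failure link and output list, so per-state overhead multiplies quickly once tens of thousands of hostnames are loaded.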

@utoni
Collaborator Author

utoni commented Aug 27, 2023

Recent changes made by @lucaderi to the host matching system reduced memory consumption a lot.

Commit 1f693c3f5a5dcd9d69dffb610b9a81bd33f95382 gives me the following results:

ookla.pcap                                      	[DIFF]
--- /home/toni/git/nDPId/test/results/caches_cfg/ookla.pcap.out	2023-07-27 18:33:48.426550187 +0200
+++ /tmp/nDPId-test-stdout/caches_cfg_ookla.pcap.out.new	2023-08-27 20:29:24.413104234 +0200
@@ -64,3 +64,3 @@
-~~ total memory allocated....: 7625096 bytes
-~~ total memory freed........: 7625096 bytes
-~~ total allocations/frees...: 142877/142877
+~~ total memory allocated....: 7798209 bytes
+~~ total memory freed........: 7798209 bytes
+~~ total allocations/frees...: 146558/146558
teams.pcap                                      	[DIFF]
--- /home/toni/git/nDPId/test/results/caches_cfg/teams.pcap.out	2023-07-27 18:33:48.430550165 +0200
+++ /tmp/nDPId-test-stdout/caches_cfg_teams.pcap.out.new	2023-08-27 20:29:24.585103369 +0200
@@ -689,3 +689,3 @@
-~~ total memory allocated....: 8723821 bytes
-~~ total memory freed........: 8723821 bytes
-~~ total allocations/frees...: 145490/145490
+~~ total memory allocated....: 8898782 bytes
+~~ total memory freed........: 8898782 bytes
+~~ total allocations/frees...: 149171/149171
1kxun.pcap                                      	[DIFF]
--- /home/toni/git/nDPId/test/results/default/1kxun.pcap.out	2023-08-24 22:02:01.700255074 +0200
+++ /tmp/nDPId-test-stdout/default_1kxun.pcap.out.new	2023-08-27 20:29:24.853102022 +0200
@@ -1293,3 +1293,3 @@
-~~ total memory allocated....: 8140447 bytes
-~~ total memory freed........: 8140447 bytes
-~~ total allocations/frees...: 147334/147334
+~~ total memory allocated....: 8318144 bytes
+~~ total memory freed........: 8318144 bytes
+~~ total allocations/frees...: 151015/151015
443-chrome.pcap                                 	[DIFF]
--- /home/toni/git/nDPId/test/results/default/443-chrome.pcap.out	2023-08-24 16:23:46.466642244 +0200
+++ /tmp/nDPId-test-stdout/default_443-chrome.pcap.out.new	2023-08-27 20:29:24.969101441 +0200
@@ -16,3 +16,3 @@
-~~ total memory allocated....: 7595837 bytes
-~~ total memory freed........: 7595837 bytes
-~~ total allocations/frees...: 142692/142692
+~~ total memory allocated....: 7768830 bytes
+~~ total memory freed........: 7768830 bytes
+~~ total allocations/frees...: 146373/146373

For example, ookla.pcap required ~24 MB more memory (!) after the gambling list change.
It requires only ~172 kB more after @lucaderi's changes to the host matching system.
This is a great improvement!
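
The arithmetic behind those two figures can be checked directly from the ookla.pcap byte totals quoted in the diffs in this thread. The sketch below assumes "mb" and "kb" mean decimal megabytes and kilobytes; note that the two diffs use baselines recorded on different dates, so the comparison is approximate.

```python
# Byte totals copied from the ookla.pcap diffs quoted in this thread.
before_list = 7_995_315    # baseline before the gambling list change
after_list = 32_271_394    # after the gambling list change
before_fix = 7_625_096     # later baseline
after_fix = 7_798_209      # at commit 1f693c3f

regression = after_list - before_list   # extra bytes caused by the list
residual = after_fix - before_fix       # extra bytes left after the rework
growth_factor = after_list / before_list

print(f"regression: {regression} bytes ({regression / 1e6:.1f} MB)")
print(f"residual:   {residual} bytes ({residual / 1e3:.1f} kB)")
print(f"growth:     {growth_factor:.2f}x")
```

The regression works out to 24,276,079 bytes and the residual to 173,113 bytes, which matches the rounded figures above.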

@lucaderi
Member

I hope 36abf06 fixes the issue you reported. We need to watch the code for a while before eventually discontinuing Aho-Corasick.

Closing this ticket.

@mmanoj
Contributor

mmanoj commented Nov 9, 2023

Aho-Corasick

Please have a look at this; it seems promising:
https://github.com/dongyx/libaca

@utoni
Collaborator Author

utoni commented Nov 9, 2023

@mmanoj
I came across that one during my research into a better AC algorithm (better in terms of size-to-performance ratio). I remember testing it, but I did not spend more time understanding the algorithm behind it or why it should be better than the AC algorithm we currently use in production. I should catch up on that.
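
A size-versus-speed comparison of this kind could be sketched roughly as follows. This is a hypothetical Python harness, not a benchmark of libaca or of nDPI: `measure`, the two candidate matchers (a plain set and a nested-dict trie) and the sample hostnames are all assumptions made for illustration, and Python numbers say nothing about the C implementations. It only shows the shape of the measurement: peak bytes allocated while building, plus time spent on lookups.

```python
import time
import tracemalloc

def measure(build, lookup, patterns, probes):
    """Return (peak_bytes, seconds, hits) for one candidate matcher."""
    tracemalloc.start()
    matcher = build(patterns)
    _, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
    tracemalloc.stop()
    start = time.perf_counter()
    hits = sum(1 for p in probes if lookup(matcher, p))
    elapsed = time.perf_counter() - start
    return peak, elapsed, hits

def build_set(patterns):
    return set(patterns)

def lookup_set(matcher, host):
    return host in matcher

def build_trie(patterns):
    root = {}
    for pat in patterns:
        node = root
        for ch in pat:
            node = node.setdefault(ch, {})
        node[""] = True  # end-of-pattern marker
    return root

def lookup_trie(root, host):
    node = root
    for ch in host:
        node = node.get(ch)
        if node is None:
            return False
    return "" in node

# Hypothetical hostnames; exact matching only, to keep the sketch small.
patterns = [f"host{i}.example.com" for i in range(2000)]
probes = patterns[::2] + [f"miss{i}.example.org" for i in range(1000)]

set_peak, set_time, set_hits = measure(build_set, lookup_set, patterns, probes)
trie_peak, trie_time, trie_hits = measure(build_trie, lookup_trie, patterns, probes)
print(f"set:  {set_peak} bytes peak, {set_time:.4f} s")
print(f"trie: {trie_peak} bytes peak, {trie_time:.4f} s")
```

Both candidates must report the same number of hits before their memory and timing figures are worth comparing.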
