Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory leak? #419

Closed
StevenFredette opened this issue Aug 28, 2020 · 14 comments
Closed

Memory leak? #419

StevenFredette opened this issue Aug 28, 2020 · 14 comments

Comments

@StevenFredette
Copy link

StevenFredette commented Aug 28, 2020

So not sure if i'm going crazy or pvpgn is using a lot more cpu over time.

I edited the LUA script. handle_game.lua

function handle_game_create(game)
channel_send_message("War2BNE", tostring(game.mapname) .. " hosted by " .. tostring(game.owner), message_type_info)
end

Works great to notify channel of game hosted but after 12 hours or so the cpu sadly went from 3% to 95% had to terminate the process and reopen to make server stable again.

I cannot use the latest development build with email because it has my cpu at 90% default.

@RElesgoe
Copy link
Member

Try the latest commit (c2167e1)

@StevenFredette
Copy link
Author

StevenFredette commented Aug 29, 2020

I'm still learning github. So I would need to recompile do you have a compiled plain link?

@RElesgoe
Copy link
Member

You can either compile or download from https://pvpgn.pro/download.html

@StevenFredette
Copy link
Author

I am now on the latest dev version.

The memory leak might have been fixed (Prior Updates). I need about 12-24 hours to tell.

This definitely runs a lot better on the CPU. Thank YOU!

If the memory leak on the LUA script is gone. I will close tomorrow. Thanks again!

@StevenFredette
Copy link
Author

StevenFredette commented Aug 29, 2020

Sadly,

The leaks are still happening with the most current updated development.

image

About 8 hours idle with 12 bots, the CPU went from 3% to 90%. If there is something I can do please let me know. The only difference from default should be the Lua script I mentioned above.

EDIT: hmm... not sure if Lua was causing this now. I removed all 12 bot's on the same IP. Be back in 12 hours with an update.

@StevenFredette
Copy link
Author

The CPU stayed at 0 near most of the night. The RAM only increased by .5 MB. (The leak did not occur while the bots were offline.)

The name & version of the bot.
image

Bot configuration:

  1. 14 accounts running (pvpgn handles multiple connections per IP.)
  2. Spoof ping to 1ms
  3. Same Channel
  4. W2BN Client

I'm not sure if it's the packets going back and forth between the server or server code for concurrent connections is leaking. Hopefully, this update will help track down whatever is causing that CPU to rocket overtime when those bots connect.

I had about 15 players for 5-6 hours play many games with no issues last night as well. These players are from all over the world.

@HarpyWar
Copy link
Member

Try run bots on different machine than pvpgn. Probably undefined udp packets cause this behavior because of the same udp port.
Set full logging loglevels = fatal,error,warn,info,debug,trace and look which actions from there may cause cpu load.

@StevenFredette
Copy link
Author

StevenFredette commented Aug 30, 2020

Try run bots on different machine than pvpgn. Probably undefined udp packets cause this behavior because of the same udp port.
Set full logging loglevels = fatal,error,warn,info,debug,trace and look which actions from there may cause cpu load.

Yes, the bot's (my desktop), pvpgn (my VPS). I enabled more verbose debugging and the CPU is already at 5 CPU% for pvpgn. There is not much coming in the debug console.

There is a bad echo timing when they connect.

Aug 30 09:29:30 [info ] pvpgn::bnetd::conn_destroy: [628] closed bnet connection
Aug 30 09:29:30 [debug] pvpgn::bnetd::ipbanlist_check: lastcheck: 1598779563, now: 1598779770, now-lc: 207.
Aug 30 09:29:30 [debug] pvpgn::bnetd::ipbanlist_check: checking 50.90.128.16
Aug 30 09:29:31 [debug] pvpgn::bnetd::ipbanlist_check: address 50.90.128.16 does not match exact 76.119.98.17
Aug 30 09:29:31 [debug] pvpgn::bnetd::ipbanlist_check: address 50.90.128.16 does not match exact 76.119.98.17
Aug 30 09:29:31 [info ] pvpgn::bnetd::sd_accept: [628] accepted connection from 50.90.128.16:61078 on 0.0.0.0:6112
Aug 30 09:29:31 [debug] pvpgn::bnetd::conn_create: [628][616] sessionkey=0x1598800988 sessionnum=0x00000014
Aug 30 09:29:31 [debug] pvpgn::bnetd::sd_accept: [628] client connected to a bnet listening address
Aug 30 09:29:31 [info ] pvpgn::bnetd::handle_init_packet: [628] client initiated bnet connection
Aug 30 09:29:31 [trace] pvpgn::bnetd::conn_shutdown: [628] connection already closed
Aug 30 09:29:31 [debug] pvpgn::bnetd::conn_set_class: added latency check timer
Aug 30 09:29:31 [debug] pvpgn::bnetd::_client_auth_info: [628] AUTH_INFO packet { protocol=0x0, platform=IX86, product=W2BN, versionid=0x4f, language=SUne, localip=0x1dca8c0, tzbias=012c, locale=1033, language=1033, country=USA.United States }
Aug 30 09:29:31 [warn ] pvpgn::bnetd::conn_set_gamelang: got UNKNOWN gamelang
Aug 30 09:29:31 [debug] pvpgn::bnetd::_client_auth_info: [628] selected "ver-IX86-1.mpq" "A=3845581634 B=880823580 C=1363937103 4 A=A-S B=B-C C=C-A A=A-B"
Aug 30 09:29:31 [warn ] pvpgn::bnetd::_client_echoreply: [628] bad timing in echo reply: now=480442 then=0
Aug 30 09:29:31 [info ] pvpgn::bnetd::_client_authreq109: [628] CLIENT_AUTHREQ_109 ticks=0x5dc056da, verstr=2.0.2.0 exeinfo="WarCraft II BNE.exe 08/27/20 03:22:17 712704" versionid=0x0000004f gameversion=0x02000200 checksum=0xb52bad87
Aug 30 09:29:31 [info ] pvpgn::bnetd::_client_authreq109: [628] client matches versiontag "W2BN_202A"
Aug 30 09:29:31 [info ] pvpgn::bnetd::_client_authreq109: [628] no upgrade is available
Aug 30 09:29:31 [info ] pvpgn::bnetd::_client_loginreq2: [628] "Admin" logged in (correct password)
Aug 30 09:29:32 [error] pvpgn::bnetd::handle_bnet_packet: [628] unknown (logged in) bnet packet type 0x14ff, len 8
Aug 30 09:29:32 [error] pvpgn::bnetd::handle_bnet_packet: [628] unknown (logged in) bnet packet type 0x2dff, len 4
Aug 30 09:29:32 [info ] pvpgn::bnetd::conn_set_channel: [628] joined channel "The Void"
Aug 30 09:29:32 [debug] pvpgn::bnetd::class_topic::class_topiclist::get: returning nullptr
Aug 30 09:29:34 [info ] pvpgn::bnetd::conn_set_channel: [628] joined channel "War2USA"

I do have these which seem common with or without a leak. <-- could be wrong
Aug 30 09:22:32 [debug] pvpgn::bnetd::handle_udp_packet: [616] got udpping unknown1=350854750
Aug 30 09:23:00 [debug] pvpgn::bnetd::handle_udp_packet: [616] got udpping unknown1=1572973296
Aug 30 09:24:02 [debug] pvpgn::bnetd::handle_udp_packet: [616] got udpping unknown1=350944747
Aug 30 09:24:30 [debug] pvpgn::bnetd::handle_udp_packet: [616] got udpping unknown1=1573063296
Aug 30 09:25:32 [debug] pvpgn::bnetd::handle_udp_packet: [616] got udpping unknown1=351034744
Aug 30 09:26:00 [debug] pvpgn::bnetd::handle_udp_packet: [616] got udpping unknown1=1573153296

EDIT: The server is near 10% already. The UDP packets are the only thing coming in the logs.
Aug 30 09:36:30 [debug] pvpgn::bnetd::handle_udp_packet: [616] got udpping unknown1=1573783296
Aug 30 09:37:32 [debug] pvpgn::bnetd::handle_udp_packet: [616] got udpping unknown1=351754751
Aug 30 09:38:00 [debug] pvpgn::bnetd::handle_udp_packet: [616] got udpping unknown1=1573873296
Aug 30 09:39:02 [debug] pvpgn::bnetd::handle_udp_packet: [616] got udpping unknown1=351844748
Aug 30 09:39:30 [debug] pvpgn::bnetd::handle_udp_packet: [616] got udpping unknown1=1573963296
Aug 30 09:40:32 [debug] pvpgn::bnetd::handle_udp_packet: [616] got udpping unknown1=351934745
Aug 30 09:41:00 [debug] pvpgn::bnetd::handle_udp_packet: [616] got udpping unknown1=1574053296
Aug 30 09:42:02 [debug] pvpgn::bnetd::handle_udp_packet: [616] got udpping unknown1=352024742

I closed the bot the server instantly went to 0% CPU and I'm still getting those UDP packets. Remember the server and bot are from different networks and hosts.
Aug 30 09:42:46 [info ] pvpgn::bnetd::conn_destroy: [628] "Admin" logged out
Aug 30 09:42:46 [info ] pvpgn::bnetd::conn_destroy: [628] closed bnet connection
Aug 30 09:42:47 [debug] pvpgn::bnetd::sd_tcpinput: [632] read returned -1 (closing connection)
Aug 30 09:42:47 [info ] pvpgn::bnetd::conn_destroy: [632] "PG-13" logged out
Aug 30 09:42:47 [info ] pvpgn::bnetd::conn_destroy: [632] closed bnet connection
Aug 30 09:42:48 [debug] pvpgn::bnetd::sd_tcpinput: [636] read returned -1 (closing connection)
Aug 30 09:42:48 [info ] pvpgn::bnetd::conn_destroy: [636] "Owned" logged out
Aug 30 09:42:48 [info ] pvpgn::bnetd::conn_destroy: [636] closed bnet connection
Aug 30 09:42:49 [debug] pvpgn::bnetd::sd_tcpinput: [640] read returned -1 (closing connection)
Aug 30 09:42:49 [info ] pvpgn::bnetd::conn_destroy: [640] "Love" logged out
Aug 30 09:42:49 [info ] pvpgn::bnetd::conn_destroy: [640] closed bnet connection
Aug 30 09:42:50 [debug] pvpgn::bnetd::sd_tcpinput: [644] read returned -1 (closing connection)
Aug 30 09:42:50 [info ] pvpgn::bnetd::conn_destroy: [644] "Win" logged out
Aug 30 09:42:50 [info ] pvpgn::bnetd::conn_destroy: [644] closed bnet connection
Aug 30 09:42:51 [debug] pvpgn::bnetd::sd_tcpinput: [648] read returned -1 (closing connection)
Aug 30 09:42:51 [info ] pvpgn::bnetd::conn_destroy: [648] "Bot" logged out
Aug 30 09:42:51 [info ] pvpgn::bnetd::conn_destroy: [648] closed bnet connection
Aug 30 09:42:52 [debug] pvpgn::bnetd::sd_tcpinput: [652] read returned -1 (closing connection)
Aug 30 09:42:52 [info ] pvpgn::bnetd::conn_destroy: [652] "Mod" logged out
Aug 30 09:42:52 [info ] pvpgn::bnetd::conn_destroy: [652] closed bnet connection
Aug 30 09:42:53 [debug] pvpgn::bnetd::sd_tcpinput: [656] read returned -1 (closing connection)
Aug 30 09:42:53 [info ] pvpgn::bnetd::conn_destroy: [656] "YouTube" logged out
Aug 30 09:42:53 [info ] pvpgn::bnetd::conn_destroy: [656] closed bnet connection
Aug 30 09:42:54 [debug] pvpgn::bnetd::sd_tcpinput: [660] read returned -1 (closing connection)
Aug 30 09:42:54 [info ] pvpgn::bnetd::conn_destroy: [660] "Twitch" logged out
Aug 30 09:42:54 [info ] pvpgn::bnetd::conn_destroy: [660] closed bnet connection
Aug 30 09:42:56 [debug] pvpgn::bnetd::sd_tcpinput: [664] read returned -1 (closing connection)
Aug 30 09:42:56 [info ] pvpgn::bnetd::conn_destroy: [664] "Discord" logged out
Aug 30 09:42:56 [info ] pvpgn::bnetd::conn_destroy: [664] closed bnet connection
Aug 30 09:42:57 [debug] pvpgn::bnetd::sd_tcpinput: [668] read returned -1 (closing connection)
Aug 30 09:42:57 [info ] pvpgn::bnetd::conn_destroy: [668] "Insta-Win" logged out
Aug 30 09:42:57 [info ] pvpgn::bnetd::conn_destroy: [668] closed bnet connection
Aug 30 09:42:59 [debug] pvpgn::bnetd::sd_tcpinput: [676] read returned -1 (closing connection)
Aug 30 09:42:59 [info ] pvpgn::bnetd::conn_destroy: [676] "Thomas" logged out
Aug 30 09:42:59 [info ] pvpgn::bnetd::conn_destroy: [676] closed bnet connection
Aug 30 09:43:32 [debug] pvpgn::bnetd::handle_udp_packet: [616] got udpping unknown1=352114755
Aug 30 09:44:00 [debug] pvpgn::bnetd::handle_udp_packet: [616] got udpping unknown1=1574233312
Aug 30 09:45:02 [debug] pvpgn::bnetd::handle_udp_packet: [616] got udpping unknown1=352204752
Aug 30 09:45:30 [debug] pvpgn::bnetd::handle_udp_packet: [616] got udpping unknown1=1574323312
Aug 30 09:46:32 [debug] pvpgn::bnetd::handle_udp_packet: [616] got udpping unknown1=352294749
Aug 30 09:46:59 [debug] pvpgn::bnetd::handle_udp_packet: [616] got udpping unknown1=1574413312

@HarpyWar
Copy link
Member

Did you try master branch to make sure it was not caused by last changes? Or even this one https://github.com/pvpgn/pvpgn-server/releases/tag/1.99.7.2.1
First thoughts there is a "leak" somewhere in the code which increases a queue or like that, and every new iteration of that array takes more time.

Immediately after you stop bots cpu load on pvpgn server becomes normal, Is that correct?

@StevenFredette
Copy link
Author

StevenFredette commented Aug 30, 2020

Did you try master branch to make sure it was not caused by last changes? Or even this one https://github.com/pvpgn/pvpgn-server/releases/tag/1.99.7.2.1
First thoughts there is a "leak" somewhere in the code which increases a queue or like that, and every new iteration of that array takes more time.

Immediately after you stop bots cpu load on pvpgn server becomes normal, Is that correct?

It definitely wasn't caused by the latest build. I actualy havent updated since Finalize 1.99.7.2.1 release #362 is the version were I found the original leak. Now I'm on your latest dev build. Still here also.

Yes, once the bot got closed. The server CPU instantly went to 0%.

@HarpyWar
Copy link
Member

HarpyWar commented Aug 30, 2020

The robust way (but may be not so fast in your case) is to try different commits to find where the issue was started.
Firstly try with disabled Lua to exclude it from suspicion (dont every commit, just once to check it works fine or not).

Brief tutor how to do it:

  1. Install Visual C++ as it described in readme of the Magic Builder repo in "Requirements" and download Magic Builder itself.
  2. Open pvpgn commits history https://github.com/pvpgn/pvpgn-server/commits/master
  3. Select any commit and click on <> to browse it's code.
  4. Download the repo ZIP of this point and unpack into source directory which is in Magic Builder root path.
  5. Run build_pvpgn.bat and follow instructions. After build copy release\pvpgnconsole.exe and replace on your server to check it.
  6. Repeat from step 2 if required.
    (Hint: after first build you can run rebuild_pvpgn.bat to start build immediately with the parameters from the lastest build)

Don't try every commit, first find which works fine. It can be far away from the current commit, so pass several pages and start from there. When you find it - follow backward to find the commit which contain the issue.

@RElesgoe
Copy link
Member

If you only run 1 bot, how long would it take for CPU usage to go up?

@cen1
Copy link
Collaborator

cen1 commented Aug 30, 2020

Our SVN branch we use in production is using 100% CPU since we moved from freebsd to debian for unknown reasons. So it might be that there is some kind of resource leak even since 1.99. Luckily it does not seem to affect bnetd working normally in any way. I have not yet deep dived into the problem since we plan to move on from 1.99.

@StevenFredette
Copy link
Author

StevenFredette commented Aug 31, 2020

The robust way (but may be not so fast in your case) is to try different commits to find where the issue was started.
Firstly try with disabled Lua to exclude it from suspicion (dont every commit, just once to check it works fine or not).

Brief tutor how to do it:

  1. Install Visual C++ as it described in readme of the Magic Builder repo in "Requirements" and download Magic Builder itself.
  2. Open pvpgn commits history https://github.com/pvpgn/pvpgn-server/commits/master
  3. Select any commit and click on <> to browse it's code.
  4. Download the repo ZIP of this point and unpack into source directory which is in Magic Builder root path.
  5. Run build_pvpgn.bat and follow instructions. After build copy release\pvpgnconsole.exe and replace on your server to check it.
  6. Repeat from step 2 if required.
    (Hint: after first build you can run rebuild_pvpgn.bat to start build immediately with the parameters from the lastest build)

Don't try every commit, first find which works fine. It can be far away from the current commit, so pass several pages and start from there. When you find it - follow backward to find the commit which contain the issue.

I would have to prepare for this bounty. I think that would take a great amount of time. It would have to occur next weekend.

If you only run 1 bot, how long would it take for CPU usage to go up?

One bot connected and instantly seen CPU activity. Sits at 0% for 30 seconds then hops to .1% back to 0% hops to .2% back to 0% hops backs to .3%. Just in a matter of minutes.

EDIT: If someone is abusing this bot and bug against you're server. You can defend yourself with the /alert command. It causes the bot to error out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants