Repeated server crashes due to segmentation fault


(Fenris) #1

I could use any help here in locating the error. I have a few ideas, but I’ll save those for the end and describe the problem first. If anyone has any thoughts, please share them.

OK, here goes: I’m running a small Red Hat Linux based ET server. The hardware is a P733 with only 128 MB RAM, and the Internet connection is a switched 10 Mbps full-duplex Ethernet link. In general I have no problem with this setup, but the Linux GUI is a little slow (due to lack of memory, I assume). ET, however, is not, since it runs as a dedicated server in a command shell.

The problem is that the server keeps crashing with a “segmentation fault” error. I’m kind of new to Linux (I usually use Windows, but I thought this would be a nice way of learning the benefits of Linux while running the server), so I’m unsure how to troubleshoot that. A friend has told me to look for “core” files indicating crash dumps. I have found none.
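One common reason for finding no core files is that the shell's core file size limit is 0, which many distributions use as the default. The sketch below is not from the thread: it shows how the limit could be checked and raised in the shell that launches the server. The helper name `enable_core_dumps` is hypothetical, and whether this is the cause here is an assumption.

```shell
#!/bin/sh
# Sketch only: report the core-size limit and try to lift it for this shell.
# enable_core_dumps is a hypothetical helper name, not part of ET.
enable_core_dumps() {
    # A soft limit of 0 means no core file is written on a crash.
    echo "core limit before: $(ulimit -c)"
    # Try to raise the soft limit; this fails if the hard limit is lower.
    if ulimit -c unlimited 2>/dev/null; then
        echo "core limit after: $(ulimit -c)"
    else
        echo "could not raise the core limit (hard limit too low?)"
    fi
}
```

The function would have to be called in the same shell (or launch script) that starts the server, before the server command, because the limit is inherited by child processes. A core file, if one is produced, normally lands in the working directory of the crashed process.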

The server is set up for 20+8 players (i.e. 20 open slots and 8 private). It used to be quite full, but there have been fewer players lately; I can only assume it’s due to the frequent crashes. The server runs an 8-map campaign of which 4 maps are custom, and at the moment it crashes about every third day. If I change the campaign to just 2 custom maps out of 8, it stays alive for about a week to 10 days. If I lower the number of players to 20+4, it stays alive for 13-14 days.
It also depends on which maps are in rotation: Stonehenge_KOTH, for example, caused it to crash much more often than Temple(1/2/3).

However, it is only ET that crashes and not Linux itself, so whenever this happens I only need to kill the command shell where the ET server is running (it is completely locked up and cannot be closed normally), then open a new command shell and start the ET server again. No reboot of Linux is needed, which leads me to believe the error lies in ET rather than Linux (correct me if you think I’m wrong).

The segmentation fault occurs at different frequencies, but it happens regardless of whether I’m running plain ET, ETPro or ShrubET. I regret that I only have the information from ETPro to show you; the other two do not give any details, they just say “Segmentation Fault” and that’s basically it. ETPro gives a bit more detail, which I hope could help someone point me in the right direction. Here’s what it says:

<lots of previous normal game entries cut away>…
ClientBegin: 8
ET PB Server : New Client Connection (slot #9) <snip>
ET PB Server : Player GUID Computed <snip>
------ Unrecoverable Error ------
This information may be due to a bug in etpro
Information to be used in a bug report is being generated
------ Cut Here ------
Version : etpro 2.0.5
Platform : Linux
Signal : Segmentation Fault (11)
Stack Trace : 1 entries
/usr/local/games/enemy-territory/etpro/qagame.mp.i386.so (etpro_DumpBackTrace+0x33)[0x4637b457]
------ Cut Here ------
Trying to clean up …
------ Recursive Crasch ------
Exiting hard.
/usr/local/bin/et : line 5 : 1988 aborted .et.x86 $*
[homedir]

So far I have two theories. My primary suspect is lack of memory, although the server runs fine right up to the crash.
Secondarily, I suspect a fault in ET, but here I’m at a loss as to what to do.
I do not believe the mods are at fault, nor the maps, although the maps seem to contribute; I suspect that is due to their size rather than their construction (?). My guess would be a memory leak, in which case adding more memory won’t really help, but against that speaks the fact that others don’t seem to have this problem (or do they?).

I appreciate all help
Thanks in advance


(McAfee) #2

First thing you have to test is a “Stock” installation of ET. By that I mean no custom maps, and no mods.

Running the standard 3-map campaign could be a good idea, but I don’t think you have to go to that extent. I strongly suggest not running long campaigns until you solve the issue. Limit yourself to 6-map cycles or less.

Also, there have been reports of too many pk3 files causing problems.

As for the RAM, try testing it with a utility. Microsoft has a RAM tester. It works from a boot disk, so after creating it (in Windows), Linux shouldn’t be a problem. http://oca.microsoft.com/en/windiag.asp

There are other RAM utilities out there which are worth a try. Buying more RAM is a good idea in general (performance-wise), but I don’t think low RAM is the cause of your stability problems.

Don’t assume mods and custom maps don’t cause problems without testing.


(Fenris) #3

The reason I assume the mods or maps are not the main problem is that this happens regardless of which mod (or no mod) I run, and also regardless of which (custom) maps are on. It is just a question of how long until the crash.

Thanks for the thought about stripping down to a minimum. I’ve thought of that as well, of course, but I would personally find a 3-map standard campaign quite boring to play. I could of course let the server idle empty for a few days (well OK, some players would still come, I assume), but I’d really rather find a solution for the custom setup, since that is what the server will be running. Seeing how long it can run with 3 maps and no players could provide more info, but I don’t think it will isolate the problem. I guess I could go for the “complete” campaign with all 6 maps for a while.

I sure hope it’s not a too many PK3 files issue, since the server in total has 5 PK3 files in addition to the ones that come with ET by default (i.e. 4 custom maps and my campaign file)

As far as I’ve found out, segmentation faults are caused by, for example, the program writing to parts of memory it isn’t allowed to, or by faulty hardware, so I will try the memory diagnostic tool. Thanks for the link.


(Doodie) #4

If you want a non-MS memory test solution, try Memtest86
http://www.memtest86.com

Available in Linux and DOS formats, usually included with most Linux distributions


(Fenris) #5

Thanks, though I still think the problem is ET/config related rather than malfunctioning hardware.


(forty) #6

You are probably running out of memory. The dedicated servers we run seem to eat up memory like candy, especially if you are running X at the same time.
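To check whether memory really runs out before a crash, the server's resident memory could be sampled at intervals and logged. The following is a minimal sketch, assuming a Linux `/proc` filesystem; the helper names `rss_kb_from_status` and `log_server_memory` and the log file name are hypothetical.

```shell
#!/bin/sh
# Sketch only: sample a process's resident memory from /proc/<pid>/status.
# Both function names are hypothetical helpers, not part of ET.

# rss_kb_from_status FILE: print the VmRSS value (in kB) found in FILE.
rss_kb_from_status() {
    awk '/^VmRSS:/ { print $2 }' "$1"
}

# log_server_memory PID LOGFILE: append one timestamped sample to LOGFILE.
log_server_memory() {
    pid=$1
    logfile=$2
    rss=$(rss_kb_from_status "/proc/$pid/status")
    echo "$(date '+%Y-%m-%d %H:%M:%S') rss_kb=$rss" >> "$logfile"
}
```

A loop such as `while sleep 60; do log_server_memory "$(pidof et.x86)" et-mem.log; done` would take one sample a minute (the process name is taken from the crash log above and may differ on other installs). Values that climb steadily until the crash would support the memory theory; flat values would argue against it.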

If your server left a core file around, you can run ‘gdb etded.x86 core’ in your et directory and do a backtrace (bt inside of gdb) and it will tell you exactly where the program failed.
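The gdb step above can also be run non-interactively so the backtrace ends up in a file that is easy to post. This is a rough sketch: `print_backtrace` is a hypothetical wrapper name, and the binary and core file names are whatever exists on your install.

```shell
#!/bin/sh
# Sketch only: print a backtrace from a core file using gdb's batch mode.
# print_backtrace is a hypothetical wrapper, not part of ET or gdb.
print_backtrace() {
    binary=$1
    core=$2
    # Without a core file there is nothing to inspect.
    if [ ! -f "$core" ]; then
        echo "no core file at $core" >&2
        return 1
    fi
    # -batch makes gdb exit after running its commands; -ex bt runs "bt".
    gdb -batch -ex bt "$binary" "$core"
}
```

For example, `print_backtrace ./etded.x86 core > backtrace.txt` run in the ET directory. On some systems the core file is named `core.<pid>` rather than `core`, so the second argument may need adjusting.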


(Fenris) #7

Thanks, I’ve thought as much myself.
Slight correction to the above: I actually run two custom campaigns. They contain the same maps, just in a different order in campaign A vs. campaign B. Any idea whether it would help and consume less memory to just repeat the same campaign instead?


(Azarael) #8

It may be possible that part of the executable etc got corrupted somehow. Have you tried redownloading and installing?


(bani) #9

128 MB is way too low. That’s your most likely culprit.