Chunk access not respecting save-off

Discussion in 'Bukkit Help' started by EdGruberman, Mar 11, 2011.

Thread Status:
Not open for further replies.
  1. Offline

    EdGruberman

    I've been having a problem where chunks seem to be "forgotten and randomly regenerated" until a server restart.

    This has been happening since 493, but I'm on 522 right now and still getting the errors. I've associated it with backups: it doesn't happen on every backup (I run one every hour), but it only ever manifests while a backup is running, never at any other time. It happens about once or twice a day.

    In the console I get a large list of errors starting with the following and then repeating with a lot more null pointer errors (three of them are at the end of the small sample at the end of this post).

    I believe the critical point of discussion is this:
    I perform a save-all, wait for confirmation it saved, then issue a save-off command, wait 5 seconds, then start the backup (with 7zip if that matters), wait for the backup to finish, then issue a save-on command. The null pointer errors persist past the save-on command (along with the chunks appearing to be "reset" in-game). When I restart the server, everything acts like it never happened and the chunks are back to what they were before the problem occurred.

    Is it that Bukkit is not respecting the save-off command? Or is it that, since there is no confirmation of the save-off command completing, it might still be finishing a save to clear the cache without my knowing? That's why I added a 5 second pause, but in this latest error you can see the problem occurs almost 15 seconds after the save-off command.

    HALP! :)

     
  2. Offline

    TnT

    It seems like your save-off and save-on are working, but there is something holding your region files open. Have you tried this without any plugins? I wonder if one of them is keeping the region file open.
     
  3. Offline

    EdGruberman

    Oh, good suggestion. I didn't even think of a plug-in being able to override the core Bukkit on such a basic function.

    Here is what I'm running, I'll start disabling them one by one over the next week to see if I can isolate. Might be hard since it's intermittent, but usually happens at least once per day.

    dev-Bukkit #451 - Bukkit API
    dev-CraftBukkit #527 - Bukkit Server
    rTriggers v0.6_7 - Death messages
    BorderGuard Lite (Square) v2.1 - 2200 block limit from (X:0,Z:0)
    MapMarkers v0.3.1 - Live player positions on website map image
    ChatStamp v1.03 - Date/time stamp for chat
    WorldGuard 4.0-alpha1 - Region protection, lava/fire/tnt permissions
    WorldEdit 4.0-beta8 - Required by WorldGuard (Everything disabled)

    I wonder if @sk89q would have any commentary on WG/WE being a culprit here. The others seem so basic as to not worry about.
     
  4. Offline

    TnT

    Why do you have the dev-Bukkit in there? That is only needed if you're doing plugin programming and should be kept independent of your CraftBukkit server even if you are developing.
     
  5. Offline

    EdGruberman

    So the Bukkit API is included within CraftBukkit itself? I don't need to have it present on the production server running CraftBukkit? Maybe it's not really doing anything then. I have it in the same directory as the CraftBukkit.jar, but don't call it anywhere specifically. To be honest, I just thought it needed to be present. :(
     
  6. Offline

    TnT

    Nope, not needed at all if you just want to run the server. It's probably not doing anything, but remove it to be sure (and don't bother putting it back).
     
  7. Offline

    EdGruberman

    dev-Bukkit removed! Thank you for setting me straight on that confusion.

    I also upgraded to WorldEdit 4.1 which I just noticed is out. Apparently @sk89q made changes to improve the new McRegion chunk file format. Perhaps that might be my winning ticket here. I'll sit on this to see. If not, I'll start by removing WG to see, then WE, then the rest one by one until the problem is no more. See you in a week! :)
    Well, upgrading to WE 4.1 (and 4.2 now) is giving me hope. I haven't seen this problem once over the weekend. I'm figuring I'll jinx myself by posting this, but at least then I'll know sooner, LOL.

    Here's to hoping it was WE 4.0-beta8!

    On a design note, I'm still kind of amazed a plugin can override access to the world files for such a core function as save-off. I would have thought the Bukkit API would keep tight control on such low-level abilities...
    No lies, like I said I literally jinxed myself. No sooner did I finish posting this did I go check on my server to see it experience this problem again. I swear... it's like computers KNOW sometimes...

    Anyways, I've now disabled WG and WE to see how that flies.
    I upgraded to CraftBukkit #541 but it exhibited the same problem again last night. I've now removed BorderGuard and Markers.

    I'm now running this:
    dev-CraftBukkit #541 - Bukkit Server
    rTriggers v0.6_7 - Death messages
    ChatStamp v1.03 - Date/time stamp for chat

    And again. I've had some active players lately, so I think it's happening a bit more frequently the past few days.

    I'm now down to only CraftBukkit #541...

    So if it still happens, where can I go with this? Submit a bug? Since it's a bit intermittent, I'm not sure how to isolate it. And why only me? No one else has this problem? Does no one else back up their server with save-off? I'm confused.

    EDIT by Moderator: merged posts, please use the edit button instead of double posting.
     
    Last edited by a moderator: May 11, 2016
  8. Offline

    TnT

    How are you handling the backups? Is it some plugin you are using for it? A script? Manual task?
     
  9. Offline

    EdGruberman

    It's a custom wrapper script that issues commands through StdIn and monitors StdErr for log output/confirmation.

    It issues a save-all, waits for confirmation it saved, then issues a save-off command, waits 5 seconds, then starts the backup (7zip), waits for the backup to finish, then issues a save-on command. I've compared it against a few other general backup scripts and I can't see that it does anything procedurally different.
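
    For reference, the sequence described above can be sketched roughly as follows. This is only an illustration, not the actual wrapper: run_backup, send, wait_for, and archive are hypothetical names, and the "Save complete" confirmation text is an assumption about what the server logs.

```python
import time
from typing import Callable


def run_backup(
    send: Callable[[str], None],
    wait_for: Callable[[str], None],
    archive: Callable[[], None],
    pause_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Flush the world, pause saving, archive, then always resume saving."""
    send("save-all")
    # Assumed confirmation text; match whatever your server build actually logs.
    wait_for("Save complete")
    send("save-off")
    try:
        # save-off gives no confirmation, so a fixed pause is the only guard here.
        sleep(pause_seconds)
        archive()
    finally:
        # Re-enable saving even if the archive step raised.
        send("save-on")
```

    Here send would write a line to the server's StdIn and wait_for would block until matching text shows up on StdErr. Putting save-on in a finally block means a failed backup cannot leave saving switched off.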
     
  10. Offline

    TnT

    Has it been updated to work with the recent console changes that were implemented? Have you posted on the thread for that tool - maybe someone using it has seen the same problem?
     
  11. Offline

    EdGruberman

    The wrapper script definitely compensates for the new console. The server.log indicates the commands are being issued in sequence as expected. Clearly the "save-off" command is in effect for at least 5 seconds before the backup starts. It is after the backup starts that a region file access is attempted through the Bukkit API. If the backup happens to hold a lock on the region file Bukkit is trying to access at that point, that is when the problem occurs.

    My fundamental concern is why is the Bukkit API allowing region file access when a "save-off" command is in effect?
     
  12. Offline

    TnT

    Might be that the backup has started before the API had a chance to finish the save-off command. If it truly is an API bug - the dev of that wrapper should be able to prove it and post a bug report (hence suggesting to post on their thread).

    If it's easy enough for you, you may want to try a different backup method (preferably a plugin) so you can test its use of save-off <backup> save-on commands - see if it gives you a better result.
     
  13. Offline

    EdGruberman

    @TnT Unfortunately, I'm the dev of the wrapper. lol

    "try a different backup method": This got me thinking... Instead of continuing to blame Bukkit, maybe I could do something different. Upon looking closer at 7-zip, I realized I'm using the 32bit command line version on a 64bit OS. Perhaps something odd was going on there. So now I'm using the 64bit version of 7-zip. Also, this caused me to look closer at the zip process output which indicated it was having trouble compressing the region files that were in use. This means my backups were incomplete anyways... I found the -ssw switch on 7-zip and that solved that concern at least. (I also now check the exit code of the 7z.exe process so I detect any problems there also, duh.)

    During all this time I had the crazy idea in my head that Bukkit was allowing write access to the region files, not sure why. But upon rethinking it all, I realized Bukkit just wanted to READ the files because it still has to supply render data to clients even when save-off is in effect. If 7-zip was locking the files so Bukkit couldn't read them, that could be the problem! I'm hoping the 64bit and -ssw switch prevents this problem. I've re-enabled all my plugins to re-test in a full environment again.
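
    A rough sketch of the 7-Zip step with the -ssw switch and the exit-code check mentioned above. The function names and the "7z" executable name are placeholders (the post uses 7z.exe), and the exit-code meanings are taken from the 7-Zip command-line documentation.

```python
import subprocess
from typing import List

# Exit codes documented for the 7-Zip command line.
SEVEN_ZIP_EXIT_CODES = {
    0: "no error",
    1: "warning (for example, some files were locked and skipped)",
    2: "fatal error",
    7: "command line error",
    8: "not enough memory",
    255: "user stopped the process",
}


def build_7z_command(archive_path: str, world_dir: str, exe: str = "7z") -> List[str]:
    # "a" adds to an archive; -ssw also compresses files open for writing.
    return [exe, "a", "-ssw", archive_path, world_dir]


def check_7z_exit(code: int) -> None:
    # Treat warnings as failures too: a skipped region file means an incomplete backup.
    if code != 0:
        reason = SEVEN_ZIP_EXIT_CODES.get(code, "unknown exit code")
        raise RuntimeError(f"7-Zip exited with {code}: {reason}")


def archive_world(archive_path: str, world_dir: str, exe: str = "7z") -> None:
    # Run 7-Zip and fail loudly on anything other than a clean exit.
    result = subprocess.run(build_7z_command(archive_path, world_dir, exe))
    check_7z_exit(result.returncode)
```

    Failing on exit code 1 as well as the fatal codes is deliberate here, since a warning is what 7-Zip reports when it had to skip a locked file.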

    Thank you for being patient with me TnT and helping me to bounce ideas/concerns off of. My time is usually scattered quite a bit and it's hard to focus a lot of time on all this. The ability to talk it out and track it all here helps a lot!
     