Okay, so the site was just down for a bit. Here's what happened:
1. The underlying operating system had some security and functionality upgrades. Probably 90+% of the time there are really no issues, but for whatever reason this one caused a kernel panic (super rare) that required a hard reboot. These aren't very nice, because the operating system doesn't shut things down cleanly when you're forced to do this, but there's basically no graceful way around a kernel panic. It's the closest thing to a heart attack that can happen to a server.
2. When the box rebooted, it took its sweet time because it hadn't been rebooted in 259 days (pretty good), so it checked the file system. That took a while, and it's an automatic check that just has to run.
3. After it booted, the MySQL database server started making rumblings that it was having problems, but then started behaving. Except that while the MySQL server itself was behaving, one of the forum tables wasn't, which is why MySQL didn't throw up any flags.
4. Randy let me know that the database still sucked, so I dug in a bit. While we do very frequent database backups, restoring from backups is a tricky thing because the database is fluid as people post things and change things. Even a day-old backup wouldn't really match the rest of the forum structure, so a restore could puke even with database backups that frequent, and even with our frequent full fileset backups. Hazards of the game I guess.
5. I was able to shut down the MySQL server and do a specific repair on the one table that was corrupt, and it worked, so we're back online. Of course, let me know if you still have issues.
6. It's sort of an interesting thing, mostly because we have a freakin ton of posts on this forum, and it's become pretty large. That's a "Good Thing".
But now I feel we have to migrate to an active, real-time database cluster that keeps multiple up-to-the-minute copies of the database, so that if anything happens, we can just point our forum at the hot spare data mirror. Part of the growth process I guess. I don't think Randy or I ever thought it would grow this much, but here we are. We'll let you know what we come up with.
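
For anyone curious what "point our forum at the hot spare" could look like, here's a rough sketch in Python. To be clear, this isn't what we run and nothing has been decided yet. The host names and the health check are made up for illustration, and a real setup would also have to keep the copies in sync, which this doesn't touch.

```python
from typing import Callable, Optional, Sequence


def pick_database_host(
    hosts: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first host that passes the health check, or None.

    Hosts are tried in order, so the primary goes first and the
    hot spare(s) follow it.
    """
    for host in hosts:
        try:
            if is_healthy(host):
                return host
        except Exception:
            # A health check that blows up counts as "not healthy".
            continue
    return None


# Hypothetical host names, primary first, hot spare second.
HOSTS = ["db-primary.example", "db-spare.example"]


def fake_health_check(host: str) -> bool:
    """Stand-in check that pretends the primary is down."""
    return host != "db-primary.example"


if __name__ == "__main__":
    print(pick_database_host(HOSTS, fake_health_check))
```

The health check is passed in as a function so the same selection logic works whether "healthy" means the server answers at all or that a specific table can actually be read. Given what bit us this time, the second kind of check is probably the one that matters.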