January 20, 2009, 12:20 AM   #46
tyme
Staff
 
Join Date: October 13, 2001
Posts: 3,133
XFS apparently wasn't the problem. The remaining suspects are the kernel software RAID driver, LVM2, and the kernel's softlockup detection/reboot code itself. If the kernel were incorrectly detecting a softlockup and then trying and failing to reboot (which seems unlikely), that could also explain the symptoms.

Secondhand reports from the colo staff were that, a week ago Sunday, sysrq+t indicated the system was stuck in software RAID code. However, neither on Friday nor today have I been able to get the sysrq hotkeys to work after the crashes. All I get is a message about a detected softlockup, and then one saying it's rebooting in 6 seconds... which it obviously didn't.

For now I've disabled the kernel panic and reboot after a softlockup. Since it wasn't successfully rebooting anyway, there's no sense in having that turned on.
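
For anyone who wants to check the same thing on their own box, here is a rough sketch. It assumes the settings involved are the kernel.softlockup_panic and kernel.panic sysctls; the post above doesn't name them, and whether they exist depends on kernel version and configuration. The function names are made up for illustration.

```python
from pathlib import Path


def read_softlockup_settings(proc_sys_kernel="/proc/sys/kernel"):
    """Read the two sysctls as ints; a value is None if the file is missing or unreadable.

    Assumption: softlockup_panic and panic are the relevant knobs.
    The directory is a parameter so the function can be pointed at a test directory.
    """
    base = Path(proc_sys_kernel)
    settings = {}
    for name in ("softlockup_panic", "panic"):
        try:
            settings[name] = int((base / name).read_text().strip())
        except (OSError, ValueError):
            settings[name] = None
    return settings


def reboots_after_softlockup(settings):
    """True only if a softlockup triggers a panic AND the panic timeout is nonzero."""
    return bool(settings.get("softlockup_panic")) and bool(settings.get("panic"))


if __name__ == "__main__":
    current = read_softlockup_settings()
    print(current, "reboot after softlockup:", reboots_after_softlockup(current))
```

This only reads the values; changing them is a separate step (and needs root), so it's safe to run on a production machine.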

The attachment upload issues and usercp problems were due to bad permissions on the temporary directories after switching those directories from XFS to ext3; both sets of permissions are now fixed.
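
A quick way to sanity-check a directory's permissions after a filesystem switch like this is sketched below. The post doesn't say which mode was wrong or what the correct one is, so this only reports what is there; the function name is hypothetical, and the right mode for a given temp directory depends on which user the web server runs as.

```python
import os
import stat


def describe_dir_permissions(path):
    """Report the permission bits of a directory without changing anything."""
    st = os.stat(path)
    mode = stat.S_IMODE(st.st_mode)  # permission bits only, file type stripped
    return {
        "is_dir": stat.S_ISDIR(st.st_mode),
        "mode": oct(mode),
        "world_writable": bool(mode & stat.S_IWOTH),
        "sticky": bool(mode & stat.S_ISVTX),
        "owner_uid": st.st_uid,
    }


if __name__ == "__main__":
    # /tmp is just an example path, not one of the forum's directories
    print(describe_dir_permissions("/tmp"))
```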

Quote:
Originally Posted by peetzakilla
FiringLine needs to move to Mac!
FreeBSD is a possibility. It's a trade-off between the risk of future downtime due to crashes and the certainty of immediate downtime to switch OSes.
__________________
OOOOOOOO.OOOOO...OOO......OOOOOOO.OOOOO (8ob5o3b3o6b7ob5o!)
“Compleχity is a symptom of confusion, not a cause.” —Jeff Hawkins
“The egg hatched...” “...the egg hatched... and a hundred baby spiders came out...”