
Feb 1, 2016: I am experiencing some really nasty stability problems with a few newly launched Unix servers. It all started with a power outage a few days back. When the power came back, I booted up the servers and ran zpool scrub on the arrays. Later, when I went to check the status of the scrub, I could not ssh into the box. I power cycled the machine and was able to ssh back in. About 20 minutes later, ssh disconnected and the machine stopped responding.

Until now, all servers had been stable with no crashes whatsoever.
The machine is a Xeon rack server with 64GB of RAM and an 8-disk ZFS array.
I have also read that upgrading to 10.2 stable fixes a lot of issues with ZFS, but I am not sure whether it will fix this problem.

Next, I plugged a monitor into the box to watch what was going on. Like before, around the 10 minute mark I saw a kernel panic. The message stated that it could not write the crash dump to the disk.

Then I power cycled the machine and, once it came back up, rebooted it. The reboot took a very long time waiting for processes to shut down. It got to the syncing of the nodes, spat out a bunch of 4s, 2s and 1s, and then said it had timed out. Next, the screen said that it was emptying the buffer, and a bunch of numbers were written to the screen. After about 5 minutes of this, I got another kernel panic.

I tried the above again and got the same result. Please suggest a solution.
