Over the past several days, I’ve had weird problems with my main PC, which is where I create most of the content for OCModShop. Yes, I have all of the assets guarded on a separate RAIDed box, so don’t worry about that.
I first noticed the problem when moving MP3s over to a new Synology 2TB NAS I’m testing (review coming soon). Some of the Explorer windows would just become unresponsive, and I’d have to terminate the process. As days went on, the windows would completely lock, and I couldn’t run much of anything else, including Task Manager. Sometimes when I left the room for an hour or so, I’d come back to a POST screen (the computer had obviously rebooted), and it couldn’t find the primary hard drive. As soon as I cycled the power, the hard drive was found. Weird.
This is very odd behavior, so I thought it could be a virus. I ran several scans with several programs and even rolled back using System Restore, but the problem persisted. I even reinstalled Vista, and although the problem seemed to have been alleviated at first, it came back.
So the problem appeared to be more hardware-related. We’ve had a heatwave in Seattle for the past few days (we broke the all-time record at 107 degrees in some places), so I put a 20″ box fan on my Lian Li V1000 case and opened up all the side panels. I started removing hardware, bit by bit. The problem is hard to reproduce, so I had to wait and see how stable the system was before I moved on to other suspect hardware. Sometimes removing hardware (extra RAM, sound card, USB devices) seemed to do the trick, but then the problem would crop up again later in the day.
Then I looked at something that should have been one of the first things I checked… the Event Log. I noticed that Windows logged a “bad block” error on Hard Drive 0 right before an “unexpected system reboot” message.
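If you’d rather not eyeball the log, here’s a rough Python sketch of the same check: look for a bad block entry that shows up shortly before an unexpected-shutdown entry. It assumes you’ve already gotten the events out of Event Viewer somehow (that part is left out), and that the entries are the usual ones: Event ID 7 from source “disk” for a bad block and Event ID 6008 from source “EventLog” for an unexpected shutdown. The `Event` class, the function name, the 30-minute window, and the sample entries (dates and message text included) are all made up for illustration, not pulled from my actual log.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed event signatures: ID 7 from "disk" (bad block) and
# ID 6008 from "EventLog" (previous shutdown was unexpected).
BAD_BLOCK = ("disk", 7)
UNEXPECTED_SHUTDOWN = ("eventlog", 6008)


@dataclass
class Event:
    """Hypothetical, simplified view of one System log entry."""
    time: datetime
    source: str
    event_id: int
    message: str


def disk_errors_before_reboots(events, window=timedelta(minutes=30)):
    """Pair each unexpected-shutdown entry with the most recent bad block
    entry logged no more than `window` before it."""
    pairs = []
    last_bad = None
    for event in sorted(events, key=lambda e: e.time):
        key = (event.source.lower(), event.event_id)
        if key == BAD_BLOCK:
            last_bad = event
        elif key == UNEXPECTED_SHUTDOWN and last_bad is not None:
            if event.time - last_bad.time <= window:
                pairs.append((last_bad, event))
            last_bad = None  # each bad block entry explains at most one reboot
    return pairs


# Invented sample entries, for illustration only.
sample = [
    Event(datetime(2009, 1, 1, 14, 2), "disk", 7, "bad block on Harddisk0"),
    Event(datetime(2009, 1, 1, 14, 10), "EventLog", 6008, "unexpected shutdown"),
    Event(datetime(2009, 1, 1, 16, 0), "Service Control Manager", 7036, "unrelated"),
    Event(datetime(2009, 1, 2, 9, 0), "EventLog", 6008, "unexpected shutdown"),
]

for bad, reboot in disk_errors_before_reboots(sample):
    print(f"{bad.time}: {bad.message} -> {reboot.time}: {reboot.message}")
```

A match like this doesn’t prove the drive is the culprit, but if the same pairing keeps turning up, the disk (or its controller) jumps to the top of the suspect list.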
So, it appears that my primary hard drive is going wonky. That would certainly explain why the drive disappears from POST after a soft crash. I ran CHKDSK, and it hung on a particular sector. Of course I had to run it several times, and each time it hung on the same sector.
It sucks that I have to replace a relatively new 500GB hard drive, but since it’s a Seagate it has a 5-year warranty. It’s just a pain to go through the RMA process. It is possible that my disk controller is wonky, so I’ll run a scan on the drive using the iStar Hard Drive Dock (reviewed here) to see if the drive is truly effed.
I recently had another hard drive do a very similar thing in the same system, but it wasn’t a system drive so it was harder to diagnose. Run a CHKDSK on it and BAM… no more drive letter after it hits a certain sector. I hope it’s just coincidence rather than a bad SATA controller.
So, I have to reinstall (again), and then try to pull off what data I can… and the drive is nearly full so I hope I can recover everything.
Hard drives fail, kids. If you really, really, really don’t want to go through recovery scenarios, then get a pair of cheap drives and run them in RAID 1 for your boot drive. Luckily my motherboard has an “EZ RAID” feature.