Thursday, February 3, 2011

Samsung 2 TB HD204UI Firmware Bug

The Samsung 2 TB SATA drive, model HD204UI, firmware version 1AQ10001, has a nasty bug that causes data corruption if an Identify command is issued while the drive is busy writing with queued requests via NCQ. Smartctl, hdparm, and any number of other utilities can issue an Identify request to the drive. I noticed it because I've got smartd (part of the smartmontools package) running in the background to periodically do SMART health checks of the drives on my servers.

Samsung appears to be handling this incredibly badly, yet the drive still receives very high reviews.

1. This error is easy to reproduce. I'm really surprised it wasn't caught earlier.

2. New drives as of Dec 2010 are still shipping with the broken firmware.

3. The updated firmware doesn't have a new version number. Let me say that again: they didn't bother to update the version number so that you could tell whether you need to do a firmware update. This also means you can't easily tell if you've successfully updated the drive's firmware. Seriously, as far as I can tell, there is no way to tell a good drive from a bad one.

4. Finding the firmware update on their site is difficult. It is listed in an FAQ entry as "Q. Patch tool for Identify fail during NCQ write command.(Model: F4EG". That's clear as mud, isn't it? At least the answer to the question in the FAQ is a little bit clearer: "If identify commmand is issued from host during NCQ write command in the condition of PC, write condition is unstable. So It can make the loss of written data."

5. The firmware update is only found on that FAQ page, yet there is no version number, no date, or any other way of knowing if you are downloading the most up-to-date or correct firmware update.

6. The update is in the form of a DOS .exe file. You need a system that can boot DOS to update the firmware.
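Since the firmware string can't separate patched drives from unpatched ones, about the best a script can do is flag drives that match the affected model and firmware string so they can be followed up by hand. Here is a rough Python sketch of that idea. It assumes the text comes from `smartctl -i` and relies on the "Device Model" and "Firmware Version" fields; the function names and the sample text are made up for illustration, and the sample is not a captured log.

```python
AFFECTED_MODEL = "HD204UI"
AFFECTED_FIRMWARE = "1AQ10001"


def parse_smartctl_info(text):
    """Return (model, firmware) from `smartctl -i` style text output."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only; values may contain colons themselves.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields.get("Device Model", ""), fields.get("Firmware Version", "")


def possibly_affected(text):
    """True if the drive matches the affected model and firmware string.

    A match only means "possibly affected": the patched firmware reports
    the same version string, so this cannot confirm a fix either way.
    """
    model, firmware = parse_smartctl_info(text)
    return AFFECTED_MODEL in model and firmware == AFFECTED_FIRMWARE


# Illustrative sample text (hypothetical, not a real log).
SAMPLE = """Device Model:     SAMSUNG HD204UI
Firmware Version: 1AQ10001
"""

if __name__ == "__main__":
    print(possibly_affected(SAMPLE))
```

Keep in mind that querying the drive is itself an Identify request, so capture the output while the drive is idle rather than in the middle of heavy writes.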

By a fortunate accident, I was able to spot the problem during the extended drive burn-in test that I do, which has saved me several times. I wrote my own Perl script to do the rough equivalent of a surface scan, repeatedly writing patterns across the whole disk and reading them back. There is some belief that there isn't any value in running your own surface scans any longer, since the drive manufacturer has already done it. My experience is that infant mortality rates on dirt cheap drives seem to be climbing. Testing doesn't appear to be improving.
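For anyone who wants to try something similar, here is a minimal Python sketch of the write-patterns-then-read-back idea. It is not my Perl script, just an illustration: the function names, the block size, and the pattern bytes are all arbitrary choices. Each block is stamped with its own index, so a block that lands in the wrong place shows up as a mismatch too. Pointing this at a real device destroys everything on it, and on a real device the read-back may be served from the OS cache unless you take steps to bypass or drop it.

```python
import os

BLOCK_SIZE = 1024 * 1024  # 1 MiB per write; an arbitrary choice


def make_block(pattern, block_index, size=BLOCK_SIZE):
    """Build one block: an 8-byte block index, then the repeated pattern byte."""
    header = block_index.to_bytes(8, "big")
    return header + bytes([pattern]) * (size - len(header))


def write_pass(path, pattern, num_blocks):
    """Write num_blocks pattern blocks from the start of path, then sync."""
    with open(path, "r+b") as f:
        for i in range(num_blocks):
            f.write(make_block(pattern, i))
        f.flush()
        os.fsync(f.fileno())


def verify_pass(path, pattern, num_blocks):
    """Read the blocks back and return the indexes that do not match."""
    bad = []
    with open(path, "rb") as f:
        for i in range(num_blocks):
            if f.read(BLOCK_SIZE) != make_block(pattern, i):
                bad.append(i)
    return bad


def burn_in(path, num_blocks, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Run one write/verify pass per pattern; return {pattern: [bad blocks]}."""
    failures = {}
    for pattern in patterns:
        write_pass(path, pattern, num_blocks)
        bad = verify_pass(path, pattern, num_blocks)
        if bad:
            failures[pattern] = bad
    return failures
```

Calling `burn_in(path, num_blocks)` on a scratch file or a disposable device returns an empty dict when every block reads back as written. To chase this particular bug you would also need something issuing Identify requests while the writes are in flight, which is what smartd was doing on my servers.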

Given Samsung's mishandling of this, I'm still considering returning this drive. I can't see having any faith in Samsung's storage products given the above. To me the worst part is that they couldn't be bothered to update the version number of the fixed firmware, which shows, IMnsHO, a complete lack of rigor in engineering as well as a complete disregard for customers' time.

1 comment:

As they were cheap, I bought one of these. I was aware of the problems before I bought it; I like a challenge. I may be in luck though: the date on mine (I assume it's the date, as the prefix is in Chinese, lol) is 2011.05. So I'm crossing my fingers and just shoving it in. But I'm gonna do some testing with the drive burn tool you mentioned, simply because that Q&A they've got does not say that if your drive is dated 2011.01 or later you do not need to update the FW. You'd think Samsung would be more explicit in the detail they provide so you don't waste over an hour reading up on the drive. Now for my plug (lol) If you want a hard drive you can buy of my website