What is the meaning of CLARiiON soft and hard SCSI and media errors with extended status codes?
- CLARiiON FC-series
- CLARiiON CX-series
- CLARiiON CX3-series
Description of events:
- 6A0/820 - Soft Media Error. A bad or marginal data sector has been detected. The sector was successfully read.
- 920 - Hard Media Error. A bad or marginal data sector has been detected. The data sector could not be read and the Storage Processor (SP) needed to regenerate the data using RAID reconstruction.
- 801 - A SCSI operation failed and needed to be retried. The error indicates that the retry succeeded.
- 901 - A SCSI operation failed and needed to be retried. The error indicates that the retry attempts failed.
- 803 - Recommend Disk Replacement. The drive's predictive failure analysis has determined that the drive is likely to fail soon.
Note: 801 and 901 events are not necessarily disk-related. Soft SCSI errors can also indicate a bad LCC cable, or an LCC that is not handling backend loop noise correctly. Look at the extended status (described in the Fix statement) for the affected drive to determine the cause of the event.
- If 6A0/820 or 920 is received - A determination must be made on how to handle the errors.
- If 801 or 901 is received - See extended status codes below.
- If 803 is received - Replace the drive.
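As a rough illustration, the handling guidance above can be written as a small lookup. This is a minimal Python sketch, not part of any CLARiiON or Navisphere tooling. The names EVENT_ACTIONS and recommended_action are hypothetical, and the action strings only paraphrase the list above.

```python
# Hypothetical lookup built from the event list in this article.
# Keys are event codes as written above; values paraphrase the guidance.
EVENT_ACTIONS = {
    "6A0": "decide how to handle the media error",
    "820": "decide how to handle the media error",
    "920": "decide how to handle the media error",
    "801": "check the extended status code",
    "901": "check the extended status code",
    "803": "replace the drive",
}


def recommended_action(event_code: str) -> str:
    """Return the article's guidance for an event code, or a fallback."""
    code = event_code.strip().upper()
    return EVENT_ACTIONS.get(code, "unknown event code")


if __name__ == "__main__":
    for code in ("6a0", "801", "803", "123"):
        print(code, "->", recommended_action(code))
```

Any code that is not in the list falls through to "unknown event code" rather than guessing at an action.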
Extended status codes:
These extended status codes indicate a problem with the drive that is reporting the error. For status code 0x09, the drive should be replaced. For the others, the drive should be considered for replacement only if the errors recur frequently.
- 0x1C - Bad Sense Key
- 0x02 - The drive detected a parity error.
- 0x04 - A timeout occurred during a remap operation.
- 0x05 - A bad block was detected and remapped on the drive.
- 0x08 - The drive is not ready.
- 0x09 - The drive reported a hardware error.
- 0x0D - Bad Command Descriptor Block (CDB). This can occur when LCCs/BCCs have mismatched FRUmon code (newer commands are being sent, but are not understood by the older code). It is sometimes seen during a FLARE upgrade that involves updating the FRUmon code.
- 0x11 - This is a benign soft underrun error seen on ATA drives with FLARE Release 14 (02.07) and later. The next version of ATA firmware (1.67) will fix the problem. This is scheduled to be incorporated into FLARE Release 19, scheduled to be available mid-July 2005.
- 0x19 - Predictive Failure Analysis fault threshold reached.
- 0x21 - The drive encountered an error reading data, but was able to recover by re-reading the sector. The drive transferred valid data to the SP.
- 0x22 - The drive encountered an error reading data, but was able to recover using its internal ECC mechanisms. The drive transferred valid data to the SP.
- 0x3A - Following a corrected error, the drive remapped the marginal sector.
- 0x3B - The drive was unable to remap a marginal sector.
- 0x3C - The drive attempted to remap a marginal sector and failed.
- 0x3D - The drive was unable to remap a bad sector.
- 0x3E - The drive attempted to remap a bad sector and failed.
These extended status codes indicate a problem with the backend loop. The problem may be with the reporting drive, another drive on the loop, an LCC or cable, or the SP. If the errors are confined to one drive, that drive is likely (but not necessarily) the cause of the problem.
- 0x06 - Command timeout. The SP sent a command to the drive, and the drive did not respond in time.
- 0x07 - Select timeout. The SP sent an 'I want to talk to this drive now' message, and the drive did not respond in time.
- 0x2A - Bad transfer count. The drive transferred a different amount of data than was requested.
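The two groups of extended status codes above can be sketched as a simple classifier. This is a minimal Python example under stated assumptions: the function name classify_extended_status and the returned strings are hypothetical, and the grouping follows this article's lists exactly (so 0x0D and 0x11 stay in the drive group even though their descriptions point to FRUmon code and ATA firmware).

```python
# Codes the article attributes to the reporting drive.
DRIVE_CODES = {
    0x1C, 0x02, 0x04, 0x05, 0x08, 0x09, 0x0D, 0x11, 0x19,
    0x21, 0x22, 0x3A, 0x3B, 0x3C, 0x3D, 0x3E,
}
# Codes the article attributes to the backend loop.
LOOP_CODES = {0x06, 0x07, 0x2A}


def classify_extended_status(status: int) -> str:
    """Map an extended status code to the article's guidance (hypothetical helper)."""
    if status == 0x09:
        # The article singles out the hardware error as a replace-now case.
        return "drive: replace"
    if status in DRIVE_CODES:
        return "drive: consider replacement only if errors recur frequently"
    if status in LOOP_CODES:
        return "backend loop: check drive, other drives on the loop, LCC, cable, SP"
    return "unknown extended status"


if __name__ == "__main__":
    # int(text, 16) accepts an optional "0x" prefix, so logged values parse directly.
    for text in ("0x09", "0x21", "0x2A", "0x7F"):
        print(text, "->", classify_extended_status(int(text, 16)))
```

Keeping the two groups as separate sets mirrors the article's distinction: a drive-group code points at the reporting drive, while a loop-group code only narrows the fault to somewhere on the backend loop.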