The LTO2 drive wants to be cleaned repeatedly

tapedrive

A backup job that had run normally for a long time began failing intermittently around Thanksgiving and kept getting worse as time went on. Now it's failing daily. I think my tape drive may be damaged, but I am not completely sure.

The backup software (ArcServe) always threw media errors, and while my tapes were not brand new, they have seen fifteen or fewer uses/erasings and were stored appropriately.

The tape drive (Quantum LTO2 Half Height) is on its latest firmware revision, and the SCSI HBA and tape drive are both on the latest Windows drivers. The drivers haven't changed since the issues started; driver updates were one of the first places I looked, and everything was already current.

I use a tool called xTalk to run diagnostic tests against the tape drive, and the "drive health" test says that it is okay. I ran all the individual diagnostic tests, and all of those complete too, with the exception of the Full Tape Backup. The full tape backup fails somewhere around 70-80% into the test. It throws a media error (the same media error ArcServe throws). My daily ArcServe backup also fails in network share 6 of 6, which, given the backup set, is about 70-80% through the job. The failure at 70-80% into the full tape backup test and the failure at 70-80% into writing an ArcServe job correlate all too well, but I do not know if I can put any weight on what I'm seeing there. I'm inexperienced with these things.

It is important to note that in two months of on-and-off testing, the tape drive repeatedly required cleaning before it would run my tests. Those dates were 12/16/11, 12/20/11, 1/9/12, and 1/23/12. On 1/9/12 I reflashed the tape drive with the same firmware it was already on and ran three back-to-back cleanings as recommended by Quantum. In my experience a functional drive never asks for that many cleanings. My cleaning tape is old. It was at my company before I was. But it has not seen anywhere close to the 50 cleanings it says it can do. I assume the tape still cleans okay, because when the drive tells me I must clean it, the tape goes in and the cleaning message disappears for a few days.
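To put numbers on how often the drive has been asking for a cleaning, here is a rough sketch that works out the gaps between those requests. It assumes the two January dates fall in 2012 (which matches the rest of the timeline), and the names `cleaning_requests` and `days_between` are just mine for illustration.

```python
from datetime import date

# Dates on which the drive demanded a cleaning, taken from the post.
# Assumption: the two January dates are in 2012.
cleaning_requests = [
    date(2011, 12, 16),
    date(2011, 12, 20),
    date(2012, 1, 9),
    date(2012, 1, 23),
]


def days_between(dates):
    """Return the number of days between each consecutive pair of dates."""
    ordered = sorted(dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


gaps = days_between(cleaning_requests)
print(gaps)  # [4, 20, 14]
```

So the drive has asked to be cleaned again after 4, 20, and 14 days, which is a lot more often than I would expect from a healthy drive.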

The thing that has me puzzled is that a short tape write and a medium length tape write are both perfectly fine. Something about the full tape, and near the end of the tape, is what freaks my drive out.

So, I ordered fresh tapes and am currently attempting a full tape backup on a fresh tape.

In my backup logs I see some network errors too. So I am not sure if I am fighting an issue where a backup share is lost over the network, or if I am battling bad hardware, or maybe both.

I was hoping someone who has been there before may be able to put together some pieces that I'm not seeing here. If my drive is damaged, then I do not mind sending it in and paying to get it fixed. But when it only fails on one type of test and the device health test says the unit is fine, it is difficult to justify throwing the tape drive in the mail.

Hopefully someone out there can assist. Thanks for reading my post and have a good day.

edit #1: I can restore from a tape that was incomplete due to a media error. I can merge a very old tape back into the database and restore files from it too. And even my Arcserve log shows me e6918 "your tape drive needs cleaning". So that's a couple more "normal" tape drive things that my tape drive can still do. Makes me think less that it's the tape drive being damaged but the persistent cleanings say otherwise I feel.

edit #2: 1/27/12 – I took a brand new LTO2 tape out of the shrinkwrap on 1/25/12 and ran a full tape backup test on it and had no problems. I took another fresh LTO2 tape on 1/25/12 and made a full ArcServe backup and had no problems. This morning I received my new LTO2 cleaning tape and I ran it through three times. After the first cleaning I examined the window on the tape cleaner to discover it was not completely covered in grime, so I ran it the other two times for good measure. I then opened up my xTalk testing software and had no "clean tape drive now" warnings, so I decided to take a tape that had previously thrown errors, erase it, and try a full tape backup. I've written the whole tape's contents and am in the process of verifying the write operations. If that comes back good, then I'll have done something that last week would have failed during the write operation. So that may be a good sign that my issue is on the way to being resolved. Can't believe a little bit of dirt can affect so much.

Best Answer

First try replacing your cleaning tape (order a couple).
The "50 cleanings" figure for a cleaning tape is kind of like "every 10000 miles" for oil changes in a car: Under ideal conditions that's fine, but it's not a guarantee of performance. If you're using the cleaning tape on grungy old heads in a beat-up drive it won't last anywhere near that long: I've seen cleaning tapes come out brown after ONE cleaning cycle, and I certainly wouldn't reuse them for 49 more.
Your tape may look clean, but hold it up next to a shiny new tape and you may notice a surprising difference.

Also note that the "CLEAN ME" message going away just indicates that a cleaning cycle was done -- The fact that the cleaning message (and presumably the media errors) come back after "a few days" instead of months makes me suspect that your cleaning tape isn't doing the job (the drive is picking up errors, which triggers it to demand another cleaning).

If running a fresh cleaning tape and using new tapes for the backup doesn't make the problem go away you may need to have the drive serviced/replaced.


I don't think your network errors are related: a media error is typically thrown by the tape drive, and would be independent of any network problems. You may want to open a separate question about those errors (please be as specific as possible if you do: "network error" is one of those meaningless phrases that can be anything from "I couldn't open a connection to the backup client" to "I'm using iSCSI to talk to the tape drive and it ain't working!").
