Tuesday, December 7, 2010

A Story of the Deleted Transaction Log

I was lying in bed one Saturday night and my phone rang. It was one of our team leads on the other end. The conversation began something like this. "Hey, this client ran out of disk space so we deleted the transaction log for the database as it was over 100GB". Well I wasn't quite asleep but I certainly wasn't awake and my initial thought was this guy thinks he is having a little fun with me. Before my mind could fully envision the animated balled up pieces of paper dematerializing as they were thrown towards the recycle bin I heard the funniest thing of the night come out of the leads mouth, "Do you think that is related to the application not functioning?"
After rubbing my eyes and placing the jack back under the bed that I often use to get my jaw back into place, I asked if the LDF file was in the Recycle Bin. I am sure you know that the answer to that question was "No", mainly because the file is too big for most bins and secondly because nothing is that easy. Obviously the next question was "When was the last backup?" Wouldn't you know there wasn't one, go figure!
To fill you in on a little detail, the company I work for develops software. Our support model includes support of the applications, and we often get involved with database issues, but we do not own or maintain databases. The client had called the support line and stated they were out of space and asked if we could help them. This particular client does not have a DBA on staff, so our technicians were trying to help them. Certainly the wrong route was taken but at this point it was too late to look back. I am the sole DBA on the support side of the company, and since this was now a DBA issue that was caused by our technicians, I had to come up with a solution. The solution I came up with was the best for this particular issue based upon all the factors, specifically no backups.
After getting over the ridiculousness of the issue I began researching how to resolve the problem. I ran into a few hints in BOL and then other hints on the Internet. After all the research, I determined the best issue would be to use the DBCC CHECKDB with the REPAIR_ALLOW_DATA_LOSS option. BOL stated this option could repair a transaction log. Well what could be more broken about a transaction log than not having one at all?
Once I determined I would go down this path the first problem was that the db would not go online, not even in emergency mode. So what I did was stop the SQL Server service and then I renamed the MDF file. Let's say the database name was DbLive and I renamed the MDF from DbLive.mdf to DbBad.mdf. I then started the SQL Server service and created a new DbLive db with a corresponding DbLive_log.ldf file. I then stopped the SQL service again and deleted the DbLive.mdf file. I then renamed DbBad.mdf back to DbLive.mdf. I then started the SQL service again.
You can imagine that these steps didn't fix the problem. The db did not come on line, but at least the db now had a transaction log, albeit one from a different database. But as far as SQL server was concerned the log was there now. It was just in a jacked-up state (yes that is a technical term).
The next thing I did was run ALTER DATABASE DbLive SET EMERGENCY. This put the db in emergency mode and allowed SQL server to work with the db at an emergency level. Next I ran "sp_dboption 'DbLive', 'single user', 'true' ". This set the db to single user mode preparing it for the DBCC that I would run next. Finally I ran "DBCC CHECKDB ('DbLive', REPAIR_ALLOW_DATA_LOSS)". After about an hour the db was ONLINE!
I took the db out of single user mode by running "sp_dboption 'DbLive', 'single user', 'false'" I then ran a few queries against the database and all was fine. I am not totally sure what data loss there was, if any. Since the MDF was intact and the SQL service was taken off line and then the LDF file was deleted all data should have been committed to the MDF. So I believe there was no data loss. This happened several months ago and there has not been any reported db or application anomalies so I must assume all data was present.
Oh and yes, I slapped a maintenance plan on that system and sent the backups to an obscure folder. If they run out of space again it will probably be due to having too many backups! What I meant to say was that I explained to the client how important backups are.

/*Stop SQL service, rename DbLive.mdf to DbBad.mdf.started SQL service, created fake DbLive db (with log etc)Stopped SQL serviceDeleted DbLIve.mdf Renamed DbBad.MDF to DbLive.MDFStarted SQL service.Ran following script:*/ALTER DATABASE DbLive SET EMERGENCYsp_dboption 'DbLive', 'single user', 'true' DBCC CHECKDB ('DbLive', REPAIR_ALLOW_DATA_LOSS)sp_dboption 'DbLive', 'single user', 'false' 

No comments:

Post a Comment