Sunday, June 05, 2005

XFS undelete HOWTO:

How to undelete a file on a Linux XFS filesystem:

"I moved to the next possibility and clicked ‘undeploy’.
Suddenly- all my application and
data files were deleted by tomcat."

Background (How I accidentally deleted all my data):

At the end of a four-week project to build a shipping portal website, I decided to try out the Jakarta-Tomcat web admin interface to start and stop the application, instead of sticking to the command-line startup and shutdown scripts I was familiar with.

I logged into the admin console (jakarta-tomcat-5.5.9) and proceeded to restart the application by clicking ‘start’ and then ‘stop’. This did not show the changes I was expecting, so I needed to try something else. With each button click, a generic window popped up to ask “Are you sure?” Am I sure about what? I wondered. I moved to the next possibility and clicked ‘undeploy’. Suddenly, all my application and data files were deleted by Tomcat.

Reminder: to Tomcat, ‘undeploy’ means you want all your code and project files deleted.

I ran df -k, praying that this filesystem was your friendly ext3, but instead it read xfs. Dread.


Q: Does the filesystem have an undelete function?

There is no undelete in XFS; in fact, once you delete something, the
chances are the space it used to occupy is the first thing reused.
Undelete is really something you have to design in from the start.
Getting anything back after an accidental rm -rf is near to impossible.

This called for extreme measures. I had to bring the files back.

The Undelete:

They say it is near to impossible to recover files in XFS, but I did it, and here’s how:

df -k /usr/local/jakarta-tomcat/webapps

Filesystem 1K-blocks Used Available Use% Mounted on
/dev/hda7 26245376 13740048 12505328 53% /usr/local

I knew it was important not to disturb the filesystem where the files had been deleted, but I had production data on part of that disk, so I could not unmount it. Also, I did not have 26 GB of free space on the machine to make a safe copy. This meant that I had to work fast and avoid all file creation on the hda7 filesystem.

I used the time command on every search I ran, because when grepping through 26 GB of raw data it is very important to manage the amount of time spent searching.

First I ran some tests on 1 GB of raw XFS data to find the fastest way to search for strings. I searched for '$WGET' because it was a variable I remembered was in the lost file.

time dd if=/dev/hda7 bs=1024 count=1000000 | strings | cut -c1-50 | grep '$WGET'

$WGET --post-data="prefix1=$PREFIX&number1=$NUMBER
$WGET --post-data="prefix1=$PREFIX&number1=$NUMBER
1000000+0 records in
1000000+0 records out
real 4m35.243s
user 1m24.390s
sys 0m20.810s

Now the better way.

time dd if=/dev/hda7 bs=1024 count=1000000 | grep -a '$WGET' | strings

1000000+0 records in
1000000+0 records out
$WGET --post-data="prefix1=$PREFIX&number1=$NUMBER&DoIt=Do+it"
$WGET --post-data="prefix1=$PREFIX&number1=$NUMBER&DoIt=Do+it"
real 0m55.836s
user 0m4.350s
sys 0m18.310s

Both commands searched one million 1024-byte blocks of data (1,000,000 x 1024 ≈ 1 GB), but moving the strings command to after the grep, and taking out the cut, is about five times faster. This is important because I needed to search 26 times this amount of data.
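The ordering effect is easy to check on a small synthetic image: both pipelines recover exactly the same line, but the fast one runs the expensive string extraction only on the few blocks grep has already matched. A minimal sketch (the demo_* file names are invented for this example, not from my system):

```shell
# Synthetic "raw partition": binary junk surrounding one known line.
{
  head -c 4096 /dev/urandom
  printf '\n$WGET --post-data="prefix1=$PREFIX&number1=$NUMBER"\n'
  head -c 4096 /dev/urandom
} > demo_raw.img

# Slow ordering: stringify every byte of the image, then filter.
strings demo_raw.img | grep '$WGET' > demo_slow.out

# Fast ordering: let grep -a discard non-matching data first, then
# strip the remaining binary noise with strings.
grep -a '$WGET' demo_raw.img | strings > demo_fast.out

# Same result either way; only the per-byte work differs.
diff demo_slow.out demo_fast.out && cat demo_fast.out
```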

Now I began the search using a unique keyword that I knew was present in the deleted file.

time grep -a -B100 -A100 '$WGET' /dev/hda7 | strings

fo< x=")&" i="n?&s%p1%d%;m">[99H

#045 24630815
usage () {
echo "Usage:"
echo '-p 3 digit prefix'

WGET='wget -q --user-agent="Internet Explorer 5.5" --wait=1 --timeout=10 --tries=2 -O -'
#begin the web get

$WGET --post-data="prefix1=$PREFIX&number1=$NUMBER&DoIt=Do+it" | grep -A60 "Details" | grep -E "::' | sed 's:::' | sed 's:::g' | tr '<' ',' | sed 's/>/,_/g' | awk -F, '{print $3}' | grep -v _Time | tr -d '_' | paste -d= 5.template -

#script done

fo< ~b~Ht s3Q+ k&s@ vAJ,%

real 19m57.031s
user 0m35.090s
sys 4m19.860s

As you can see, in 20 minutes I found my deleted file stashed in 26 GB of binary junk!

It was not practical to attempt recovery of all my files this way, so I just recovered the most valuable programs. The other JSPs I rewrote, based on HTML I was able to recover from the browser cache of another computer and on the most recent backups I had.
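If you can pin down a unique marker in a lost file, grep can also tell you where on the partition it lives: -b prefixes each match with its byte offset, and dd can then carve a raw window starting at that offset. A sketch on a synthetic image (on the real system the input would be the partition device, e.g. /dev/hda7; the marker text and demo_* file names here are invented):

```shell
# Synthetic image standing in for the raw partition.
{
  head -c 8192 /dev/urandom
  printf '\nUNIQUE_MARKER start of the lost script\necho hello from the lost script\n'
  head -c 8192 /dev/urandom
} > demo_part.img

# -a: treat binary data as text; -b: print the byte offset of each match.
offset=$(grep -ab 'UNIQUE_MARKER' demo_part.img | head -n1 | cut -d: -f1)

# Carve a 4 KB window starting at the match; widen count= as needed.
dd if=demo_part.img bs=1 skip="$offset" count=4096 2>/dev/null \
  | strings > demo_carved.txt

grep 'echo hello' demo_carved.txt
```

Remember to write the carved output to a different filesystem than the one you are recovering from.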


Winnetou said...

Hello. Thanks for that HOWTO. I've got a similar problem. I made a backup of my network communicator archive (plain text files) by copying it to another HDD. Then I had to reinstall my Debian. I rebooted the machine and ran the installer, ran the partitioning tool on the 1st HDD, and mounted the 2nd HDD. After installation I wanted to copy the archive to my home directory, but the 2nd HDD was clean! :( The installer and partitioning tool erased my HDD :| :( So another reboot from a rescue CD, and nothing. Even Ontrack Easy Recovery Pro didn't help. It was able to recover my previous NTFS partition and data, but not the current XFS. I tried your HOWTO and ran as root:

Etch-x86:/home/winnetou# time dd if=/dev/hdb1 bs=1024 count=30000000 | grep -a winnetou

system reply was:

29896933+1 records in
29896933+1 records out
copied 30614459904 bytes (31 GB), 1748,18 sec, 17,5 MB/s

And now I've got a question: where (if anywhere) is the copied data saved? And if it isn't saved, how can I recover it? I've tried everything. I'm not an engineer, just starting with Linux (intermediate user). Could you help me, please? It's very important data for me.

No new data was saved on this HDD, the computer is still running, and the HDD is unmounted.

Thanks a lot and sorry for my terrible English ;)

Unknown said...

If you want to copy the data, then you need to give dd an output file argument with of=.
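On the machine from the comment above, that would look something like dd if=/dev/hdb1 of=/mnt/otherdisk/hdb1.img bs=1024, where the of= path is a placeholder and must sit on a different disk than the one being read. The shape of the command, demonstrated on a small scratch file instead of a real device:

```shell
# On the real system this would be, e.g. (placeholder paths; the image
# must land on a DIFFERENT disk than the one being recovered):
#   dd if=/dev/hdb1 of=/mnt/otherdisk/hdb1.img bs=1024
head -c 65536 /dev/urandom > demo_src.bin
dd if=demo_src.bin of=demo_copy.img bs=1024 2>/dev/null

# The image is a byte-for-byte copy and can be searched offline.
cmp demo_src.bin demo_copy.img && echo "image is byte-identical"
```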

In my case, I just needed to recover a few files, not the whole hard drive. I used grep to find the location of the raw data for the files I lost.

If you are unable to boot a Debian rescue disk and then mount the partition, then maybe you can at least recover a few of your most important files with the method I outlined. Even if the inodes are deleted, you can still get data off of your disk.

Ed said...

Wow, thanks, I never thought of a little dd and strings.

AirOnSkin said...

Hey man, thanks for that post. I just recovered some important notes, three days of work. You really helped me a lot! :) Greetings from Switzerland.

Romy said...

Hey, I know this post is like 3 years old, but since you're the top Google result for "xfs undelete", I thought I'd leave 2 comments.

1) The pipe character, "|", didn't come through in your post, so all your commands got smashed together.

2) When you were comparing the performance of dd, did you take the page cache into consideration? Even if you run the exact same command back to back, the 2nd run will be significantly faster, because most of the data (depending on RAM) will be cached.

Aside from that, thanks for sharing :)

Unknown said...

Hi Romy,

I did watch for caching when benchmarking the 1 million record search. I think that, because the input came from a pipe, the cache was not used. I also monitored system load to make sure that results were not skewed by another process.

I just fixed the missing pipes typo in this post after 3 years haha!

Oriental Transport said...

Hi Doug Watson,

I appreciate that you shared your fantastic experience on the blog... but I know nothing about Linux. Can the recovery commands be run in a command prompt? Does this have anything to do with the cookies and cache of Internet Explorer? I want to delete all my cache and cookies, because they are making my browser load very slowly... maybe after I accidentally deleted my application from the Tomcat server it's kept somewhere like cache and cookies... I don't know. Please advise what I should do, since I'm using Windows XP and don't know anything about Linux...


Unknown said...

I realize your post is really old, but I just wanted to stop in and say thanks; I worked all day revising a paper that is due shortly, only to have Firefox (!?!) turn the files into empty, 0 KB placeholders when I tried emailing them. Your post helped me recover nearly all of the raw text from the hard drive. Not only did you save me a headache, you probably saved me a letter grade!

Anonymous said...

Have you ever tried a Windows utility like this:

In my situation it recovered nearly 95% of the files with its "undelete", and nearly 80% kept their real names.

Warning: it is commercial...

s0ya said...

Would you know how to recover an entire drive? I have a 1 TB drive with ALL data missing, and another 1 TB drive that I can copy the files to... your help would be appreciated.

Anonymous said...

How would you recover binary files by filename? For example, I have images I would like to recover with names IMG_11*.JPG

Can your method be used?

Anonymous said...

Can't believe I never thought of this! Simple and straight-forward - thanks for sharing!

Ash said...

Quite easy solution:

Pranav said...

Wow... that was amazing!
I was able to recover a file from my system quickly using your technique... Thank you!

Unknown said...

Cool, I'm glad that worked for you and you got your file back! I just noticed that I wrote this post 10 years ago, too. I'm glad it is still helpful :)