backup

This post will probably be pretty short. I wanted to document this process so I could script it in the future and reuse it beyond the one case that made it necessary. I’ve been put in a position where I need to create backups and modify core system files on some high-volume servers. While we could do the deed in production (who doesn’t like breaking prod?), it’s not the best idea given the revenue that would be lost should downtime occur. This has led to a bit more discovery of the power of ssh, dd, pv, and qemu. Let’s get started.


Note: Names, IPs, ports, etc have been changed because reasons.


I need teh remote codez, plz

One of the bigger hurdles is how the servers are configured. They’re locked down pretty tight, including good UAC practices and some rather… immature, albeit secure, SSH configurations. This presented a bit of an issue initially, as the key files are passphrase protected and the sudo accounts aren’t exactly relaxed. I understand the reasoning for this, but it hasn’t made these efforts any easier. Anyways, we craft the initial ssh command like so:


> ssh -p 4321 sudoUser@192.168.100.12 -i keyfile.pem
Enter passphrase for key 'keyfile.pem':
Last login: Mon Mar  2 18:10:27 2020 from 10.0.11.2
sudoUser@prodserver1:~$


That gives us the basic access. But how could you possibly pull a raw image of this disk via this tunnel? You certainly cannot run dd in this session and pipe the output back through the line; that’s not how this tunnel is configured. Why would we use dd instead of generating a tarball? Well, there are a few reasons, the biggest being that I want a near-identical copy of the entire system. Running sudo tar -cvzf prodserver1.tgz /* won’t give us an exact replica; it only copies the files on the filesystem, not the partition table, boot sector, or anything else on the disk. I want all the bytes, not just the data, so we use dd instead. Craft the dd command like so:


> dd if=/dev/xvda of=/home/tj0/prodserver1.img


The nice thing about dd is that the images are raw binary data. This makes the tool useful for all kinds of things, including generating bootable USB media (with some additional parameters). Anyways, we now have our tunnel, and we have our method of collecting all the data. Why’d I mention pv, you ask? Well, dd doesn’t say much; it’s a utility of few words. Essentially you’ll be sitting there, waiting for something, anything. You could run ls -alh | grep prodserver1.img and see what the file size is, but that’s manual, and it doesn’t provide the satisfaction of progress bars. pv gives us that feedback.
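Before pointing this at a real disk, it can help to rehearse the dd | pv | dd shape on a small scratch file. The sketch below is only that, a sketch: copy_with_progress is a hypothetical helper name, the block size is an arbitrary choice, and it quietly falls back to a plain dd copy when pv isn’t installed.

```shell
#!/bin/sh
# Rehearse the dd | pv | dd pipeline locally on a scratch file.
# copy_with_progress is a hypothetical helper, not part of dd or pv.
copy_with_progress() {
    src=$1
    dst=$2
    if command -v pv >/dev/null 2>&1; then
        # pv passes the bytes through untouched and reports throughput on stderr
        dd if="$src" bs=1048576 2>/dev/null | pv | dd of="$dst" bs=1048576 2>/dev/null
    else
        # no pv installed: same copy, just without the feedback
        dd if="$src" of="$dst" bs=1048576 2>/dev/null
    fi
}
```

Something like copy_with_progress scratch.img copy.img followed by cmp scratch.img copy.img should confirm the pipeline moves bytes without changing them.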


Crafting the query

At this point, we have pretty much all we need so let’s throw it all together and get that image:


> ssh -p 4321 -tt sudoUser@192.168.100.12 -i keyfile.pem "sudo dd if=/dev/xvda " | pv | dd of=/home/tj0/prodserver1.img
Enter passphrase for key 'keyfile.pem': <enter keyfile password>
<enter sudo password>
0.00 B 0:00:06 [0.00 B/s] [<=>                                          ]
9.63GiB 0:24:52 [6.05MiB/s] [<=>                                          ]


You’ll see pv dump some output. Don’t let this scare you. In my case, I had to enter the sudo password as well; otherwise sudo (whose prompt you cannot see, because it goes down the pipe instead of to your screen) will sit there. Waiting. Leaving you with a 0B file, and questioning if you are ever going to be a great terminal wizard. Just enter the password. You might notice that I added -tt to the ssh command. This is needed because sudo, not dd, wants a terminal to prompt for the password on, and -tt forces ssh to allocate a pseudo-tty for the remote command. One caveat: a pseudo-tty is not a clean binary channel (it can translate newlines, and the sudo prompt itself travels down the same stream), so sanity-check the image before trusting it. RTFM for exact technical details.
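A stream pulled over ssh can come up short or pick up extra bytes without anything complaining, so one cheap sanity check is to compare the size of the local image with the size of the source device. This is a minimal sketch: image_size_matches is a hypothetical helper, and the expected byte count is assumed to come from running something like sudo blockdev --getsize64 /dev/xvda on the server.

```shell
#!/bin/sh
# image_size_matches IMAGE EXPECTED_BYTES
# Succeeds only when the image file is exactly EXPECTED_BYTES long.
# (Hypothetical helper; EXPECTED_BYTES is whatever the server reports
# for the source device.)
image_size_matches() {
    image=$1
    expected_bytes=$2
    # wc -c pads its output with spaces on some systems, so strip them
    actual_bytes=$(wc -c < "$image" | tr -d ' ')
    [ "$actual_bytes" = "$expected_bytes" ]
}
```

A matching size doesn’t prove the image is good, but a mismatch is a quick way to know it’s bad. The same byte count can also be handed to pv with -s, which lets it show a percentage and an ETA instead of the bouncing indicator.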


Getting it VirtualBox ready


This step is interim, and possibly won’t be necessary for you. Given I’m doing this on a Mac and I use VirtualBox in other projects (and Vagrant, ftw), this is a simple method of achieving the end goal:


> VBoxManage convertfromraw prodserver1.img vmdkprodserver1.vmdk --format VMDK


Make sure to run VBoxManage convertfromraw and check out the additional options. There’s a VHD option, which is the format needed for Azure according to this document.
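If the same image needs to end up in more than one format, the conversion can be wrapped in a tiny function. Treat this as a sketch under assumptions: convert_image and the DRY_RUN switch are hypothetical names of mine, and the output file is named by swapping the .img extension, which is my own convention rather than the naming used above.

```shell
#!/bin/sh
# convert_image RAW_IMAGE FORMAT
# Wraps VBoxManage convertfromraw for a given target format (VMDK, VHD, VDI).
# Set DRY_RUN=1 to print the command instead of running it.
# (convert_image and DRY_RUN are hypothetical, for illustration only.)
convert_image() {
    raw=$1
    format=$2
    # lower-case the format name to use as the file extension
    ext=$(printf '%s' "$format" | tr 'A-Z' 'a-z')
    out="${raw%.img}.$ext"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "VBoxManage convertfromraw $raw $out --format $format"
    else
        VBoxManage convertfromraw "$raw" "$out" --format "$format"
    fi
}
```

Running DRY_RUN=1 convert_image prodserver1.img VHD first shows exactly what would be executed before committing to a long conversion.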


That’s it for now


YMMV. This was a rather hackish, albeit necessary, process that is being applied across several servers in an effort to keep several teams from winding up in a really, really crappy situation. Ping me if you need guidance or have input on better ways of doing this.