When I returned to JPL after working at Google for a year I was tasked with evaluating a Google Search Appliance. We ultimately decided not to keep it, and so we had to erase the disks, which now contained sensitive data. The appliance had a "self-destruct" feature that supposedly erased all the data, but there was no way to verify it. After lengthy negotiations with Google (some people just have a hard time grasping the idea that just because a file has been deleted doesn't mean the data is actually gone) we eventually got them to agree to let us open the enclosure and take out the disks. Forensic analysis revealed that they had not in fact been erased.
"...The data will be 256-bit encrypted on the host [running the Snowball client?] and stored on the appliance in encrypted form. The appliance can be hosted on a private subnet with limited network access."
So I assume the data is encrypted asymmetrically.
"...ship it back to us for ingestion. We’ll decrypt the data [using the private key specified in the job,] and copy it to the S3 bucket(s) that you specified when you made your request[/job]. Then we’ll sanitize the appliance in accordance with NIST Special Publication 800-88 (Guidelines for Media Sanitization)."
There are a few different types of sanitisation (clear/purge/destroy), and Amazon doesn't specify which type. I assume they would go with "clear", and maybe in a few select places (I'd hope storage media) "purge".
"Clear" is scary though: for network devices it is only a "full manufacturer’s reset to reset the router or switch back to its factory default settings", and for HDDs it is "Overwrite media by using organizationally approved and validated overwriting technologies/methods/tools. The Clear procedure should consist of at least one pass of writes with a fixed data value, such as all zeros. Multiple passes or more complex values may optionally be used".
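To make the "one pass of writes with a fixed data value" idea concrete, here is a minimal sketch of a single-pass zero overwrite plus a read-back check. It is only an illustration under loud assumptions: the function names `clear_file` and `is_cleared` are mine, it works on an ordinary file rather than a raw device, and it says nothing about how Amazon actually sanitizes the appliance. Overwriting through a filesystem also does not guarantee that the physical blocks are overwritten (journaling, SSD wear leveling), which is exactly why independent verification matters.

```python
import os


def clear_file(path: str, chunk_size: int = 1 << 20) -> None:
    """Overwrite every byte of the file at `path` with zeros, in place (one pass)."""
    remaining = os.path.getsize(path)
    zeros = bytes(chunk_size)  # a buffer of chunk_size zero bytes
    with open(path, "r+b") as f:
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(zeros[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the writes to the device


def is_cleared(path: str, chunk_size: int = 1 << 20) -> bool:
    """Read the file back and report whether every byte is zero."""
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            if chunk.count(0) != len(chunk):
                return False
    return True
```

The read-back step is the part that was missing in the appliance story: a "clear" you cannot check yourself is just a claim.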
So what vector do you want to protect against? Accidental data egress shouldn't happen, as the data is encrypted. However, there are more interesting vectors, such as getting hold of the public key and injecting your own data into another company's buckets...
But that's the whole point: Can you trust that the box does what Amazon says it does? Because the Google box did not do what Google said it did, but if I hadn't been very insistent about it (to the point of having a number of people think I was being a total dick), we would never have known.
Caveat emptor.