Question
Recently a company called Bitcasa demonstrated a cloud storage product. They indicated that they would use "Convergent Encryption" to secure your data and de-duplicate it, essentially keeping one copy of the same file across users.
I found a university paper on Secure Data Deduplication that mentions "Convergent Encryption", and I am assuming this is what they are referring to.
My question is: what are the security implications of using this technology, and is it truly secure? In particular, can the key be recovered, given that keys are generated from the chunk data itself rather than being different for each user?
Reference: interview in the Washington Post
Explanation / Answer
If it's implemented properly, it is as secure as any other form of encryption in preventing those who don't know the data from obtaining it from the encrypted data. However, it does have one fundamental limitation that, so far as we know, is inherent in the technology: anyone who has the same file you have can potentially prove that you have that file.
The general way such algorithms work is as follows:
+ The object to be encrypted is validated to ensure it is suitable for this type of encryption. This generally means, at a minimum, that the file is sufficiently long. (There is no point in encrypting, say, 3 bytes this way. Someone could trivially encrypt every 3-byte combination to create a reversing table.)
+ Some kind of hash of the unencrypted (plaintext) data is created. Usually a specialized function just for this purpose is used, not a generic one like SHA-1. (For example, HMAC-SHA1 can be used with a specially-selected HMAC key not used for any other purpose.)
+ This hash is called the 'key'. The data is encrypted with the key (using any symmetric encryption function such as AES-CBC).
+ The encrypted data is then hashed (a standard hash function can be used for this purpose). This hash is called the 'locator'.
+ The client sends the locator to the server to store the data. If the server already has the data, it can increment the reference count if desired. If the server does not, the client uploads it. The client need not send the key to the server. (The server can validate the locator without knowing the key simply by checking the hash of the encrypted data.)
+ A client who needs access to this data stores the key and the locator. They send the locator to the server so the server can look up the data for them, then they decrypt it with the key. This function is 100% deterministic, so any clients encrypting the same data will generate the same key, locator, and encrypted data.
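The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Bitcasa's implementation: the names `KEY_DERIVATION_KEY`, `MIN_LENGTH`, `Server`, and `put` are hypothetical, HMAC-SHA256 is used for key derivation in place of the HMAC-SHA1 example, and the cipher is a simple HMAC-based keystream standing in for AES-CBC so that the sketch runs with only the standard library. Do not use this stand-in cipher for real data.

```python
import hashlib
import hmac

# Assumption: a fixed, application-wide HMAC key used only for key derivation.
KEY_DERIVATION_KEY = b"convergent-encryption-demo-v1"
MIN_LENGTH = 32  # assumed minimum size; very short inputs are trivially enumerable


def derive_key(plaintext: bytes) -> bytes:
    """Validate the input, then hash the plaintext to obtain the 'key'."""
    if len(plaintext) < MIN_LENGTH:
        raise ValueError("input too short for convergent encryption")
    return hmac.new(KEY_DERIVATION_KEY, plaintext, hashlib.sha256).digest()


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Stand-in cipher (NOT AES): XOR data with an HMAC-SHA256 counter keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hmac.new(key, counter, hashlib.sha256).digest()
        out.extend(a ^ b for a, b in zip(data[offset:offset + 32], block))
    return bytes(out)


def encrypt(plaintext: bytes):
    """Return (key, locator, ciphertext); fully deterministic in the plaintext."""
    key = derive_key(plaintext)
    ciphertext = _keystream_xor(key, plaintext)
    locator = hashlib.sha256(ciphertext).digest()  # hash of the encrypted data
    return key, locator, ciphertext


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR with the same keystream reverses the encryption."""
    return _keystream_xor(key, ciphertext)


class Server:
    """Toy deduplicating store: it sees locators and ciphertext, never keys."""

    def __init__(self):
        self.blobs = {}      # locator -> ciphertext
        self.refcounts = {}  # locator -> number of clients referencing it

    def has(self, locator: bytes) -> bool:
        return locator in self.blobs

    def add_reference(self, locator: bytes) -> None:
        self.refcounts[locator] += 1

    def upload(self, locator: bytes, ciphertext: bytes) -> None:
        # The server validates the locator without knowing the key.
        if hashlib.sha256(ciphertext).digest() != locator:
            raise ValueError("ciphertext does not match locator")
        self.blobs[locator] = ciphertext
        self.refcounts[locator] = 1

    def fetch(self, locator: bytes) -> bytes:
        return self.blobs[locator]


def put(server: Server, plaintext: bytes):
    """Client side: upload only if the server lacks the data; keep (key, locator)."""
    key, locator, ciphertext = encrypt(plaintext)
    if server.has(locator):
        server.add_reference(locator)
    else:
        server.upload(locator, ciphertext)
    return key, locator


if __name__ == "__main__":
    server = Server()
    document = b"The same document, held independently by two different users."
    alice_key, alice_locator = put(server, document)
    bob_key, bob_locator = put(server, document)
    print("same key and locator:", (alice_key, alice_locator) == (bob_key, bob_locator))
    print("stored copies:", len(server.blobs))
    print("round trip ok:", decrypt(bob_key, server.fetch(alice_locator)) == document)
    # The limitation: anyone holding the same file can compute its locator
    # and ask whether the server already stores it.
    _, probe_locator, _ = encrypt(document)
    print("file is on the server:", server.has(probe_locator))
```

The last two lines of the demo illustrate the limitation described earlier: because the locator depends only on the file contents, a party who already has the file can compute the locator and check whether the server holds it. A real implementation would use a vetted cipher such as AES with a fixed or data-derived IV to preserve the determinism that de-duplication relies on.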