Question
I have a bunch of network data and I would like to determine whether it is 1) encrypted or 2) compressed. I doubt it is both, but the potential exists. Assuming the traditional compression headers have been stripped (which prevents me from simply running the file utility on the data), how would you determine whether the data I am looking at is compressed or encrypted, if it had to be one?
The reason I would think that the compression headers are stripped is because it would be unnecessary information considering both the client and the server know the exact method to compress and decompress the data.
Explanation / Answer
If you can get a sufficiently large batch of the "subject" data, you could analyze it statistically. Good encryption, over a large enough data set, should show high entropy and a near-uniform byte distribution. In particular, each of the 256 possible byte values (00h to FFh) should show up about 0.39% of the time (1/256). HxDen will let you paste your hex values in, save them as a binary file, and get a general statistical count. The Diehard utilities will let you evaluate the distribution, including the distribution of repeats, and other qualities of randomness. (None of the samples linked in your question is large enough for a good analysis.)
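The byte-count check described above can also be scripted. The following is a minimal sketch, not a definitive tool: the function name `byte_stats` is hypothetical, and `os.urandom` is used only as a stand-in for well-encrypted data, since no real capture is available here.

```python
import math
import os
from collections import Counter


def byte_stats(data: bytes) -> tuple[float, list[float]]:
    """Return (Shannon entropy in bits per byte, frequency of each byte value in percent)."""
    if not data:
        raise ValueError("need at least one byte of data")
    counts = Counter(data)  # iterating over bytes yields ints 0..255
    total = len(data)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    percentages = [100.0 * counts.get(value, 0) / total for value in range(256)]
    return entropy, percentages


if __name__ == "__main__":
    # Assumption: random bytes stand in for a large sample of good ciphertext.
    sample = os.urandom(1 << 20)
    entropy, percentages = byte_stats(sample)
    print(f"entropy: {entropy:.4f} bits/byte (maximum is 8)")
    print(f"least/most frequent byte value: {min(percentages):.4f}% / {max(percentages):.4f}%")
```

For a large, well-encrypted sample the entropy should sit very close to 8 bits per byte and every percentage should be near 0.39%. Small samples will deviate noticeably from both figures by chance alone.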
If your data stream incorporates blocks of data (encrypted or otherwise) that are sprinkled with HMAC or other verification/authentication blocks, these statistics will not hold cleanly, even if the data is encrypted.
Compressed data will, in most cases, show a somewhat less uniform byte distribution than encrypted data, often with a sawtooth shape when plotted, if my memory serves.
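One way to put a number on "less uniform" is a chi-square statistic of the byte histogram against a flat distribution. This is a standard test swapped in here for illustration, not something the answer itself prescribes. The sketch below makes several assumptions: the function name `chi_square_uniform` is hypothetical, the pseudo-text is invented sample input, raw DEFLATE stands in for "compression with headers stripped", and `os.urandom` stands in for ciphertext.

```python
import os
import random
import zlib


def chi_square_uniform(data: bytes) -> float:
    """Chi-square statistic of the byte histogram against a uniform distribution."""
    if not data:
        raise ValueError("need at least one byte of data")
    counts = [0] * 256
    for value in data:
        counts[value] += 1
    expected = len(data) / 256  # count each byte value would have if perfectly uniform
    return sum((observed - expected) ** 2 / expected for observed in counts)


if __name__ == "__main__":
    # Hypothetical stand-in for captured plaintext traffic: repetitive pseudo-text.
    rng = random.Random(0)
    words = [b"GET", b"POST", b"host", b"index", b"200", b"OK", b"cookie", b"\r\n"]
    plain = b" ".join(rng.choice(words) for _ in range(400_000))

    # Negative wbits selects raw DEFLATE: no zlib header and no trailing checksum.
    compressor = zlib.compressobj(level=6, wbits=-15)
    compressed = compressor.compress(plain) + compressor.flush()

    # Assumption: random bytes of the same length stand in for good ciphertext.
    random_like = os.urandom(len(compressed))

    print(f"compressed : {len(compressed)} bytes, chi-square = {chi_square_uniform(compressed):.1f}")
    print(f"random-like: {len(random_like)} bytes, chi-square = {chi_square_uniform(random_like):.1f}")
```

With 256 possible byte values there are 255 degrees of freedom, so uniformly random data should give a statistic of roughly 255. A value far above that on a large sample suggests structure, which would be more consistent with compression than with good encryption.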
It's pretty hard to provide better help without some additional details.