Question

In this part, you are required to extract the text message hidden. The LSB (least-significant bit) of each colour component (R, G, and B) of the pixels of the image potentially contains a bit of the message. The task is to pick these bits to form the hidden message.

Strings

Strings (i.e., texts) are represented digitally using two different formats. The first one is the C-format (used by the C language) where the string consists only of the characters and is terminated by a null character (this is typically a byte with value 0). The second one is the Pascal-format (used by the Pascal language) where the string consists of its length (an integer value) and the characters.
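To make the two layouts concrete, here is a minimal Python sketch. The helper names are illustrative only, and the 32-bit little-endian length prefix is an assumption chosen to mirror the 32-bit length used in this question; other sizes and byte orders are equally possible.

```python
import struct

def to_c_string(text: str) -> bytes:
    """C-format: the characters followed by a terminating null byte."""
    return text.encode("ascii") + b"\x00"

def to_pascal_string(text: str) -> bytes:
    """Pascal-format: a length field followed by the characters.

    Assumption: the length is a 32-bit little-endian unsigned integer.
    """
    data = text.encode("ascii")
    return struct.pack("<I", len(data)) + data

def from_pascal_string(raw: bytes) -> str:
    """Read the length first, then exactly that many characters."""
    (length,) = struct.unpack_from("<I", raw, 0)
    return raw[4:4 + length].decode("ascii")
```

With the length stored up front, a reader knows how many bytes to fetch before touching the characters, which is the property the extraction task relies on.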

As an aside, think about the pros and cons of each of these two formats.

Message Extraction

The text message hidden in the image employs the Pascal-format. It uses a 32-bit integer to keep the length of the text, and then uses as many bytes as specified by the length to keep the characters.

The message is hidden using LSB Steganography and the pixels for hiding the message are chosen sequentially from Pixel (0,0). The pixels are accessed in a row-major fashion (i.e. accessed row by row rather than column by column).
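The sequential, row-major order can be sketched as a mapping from the index of a hidden bit to the pixel and colour component that carries it. This is only an illustration: the function name `locate_bit` is hypothetical, the channel order R, G, B within a pixel is an assumption, and the image width of 100 used in the example is made up.

```python
CHANNELS = ("R", "G", "B")

def locate_bit(bit_index: int, width: int):
    """Map the k-th hidden bit to (x, y, channel) in row-major order.

    Assumes one hidden bit per colour component, taken in R, G, B order.
    """
    pixel_index, channel = divmod(bit_index, 3)  # three bits per pixel
    y, x = divmod(pixel_index, width)            # walk each row left to right
    return x, y, CHANNELS[channel]

# Example with a hypothetical image width of 100 pixels:
first = locate_bit(0, width=100)              # (0, 0, "R")
last_length_bit = locate_bit(31, width=100)   # (10, 0, "G")
```

Under these assumptions the 32 length bits occupy Pixels (0,0) to (9,0) completely, plus the R and G components of Pixel (10,0).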

You will first need to extract the 32 bits that give the length of the text. Once this is known, you extract the next 8N bits, where N is the text length just extracted. You then assemble those 8N bits into N bytes to form the text message.

Note that the bits (for both the length and the characters) may need to be reversed to give sensible values.

(0,0)#013499 (1,0)#01329A (2,0)#00329A (3,0)#003298 (4,0)#00329A (5,0)#00349A (6,0)#00329A (7,0)#003498 (8,0)#00349A (9,0)#003498 (10,0)#003298 (11,0)#00339A (12,0)#013399 (13,0)#003499 (14,0)#003299 (15,0)#013398 (16,0)#013298 (17,0)#013499 (18,0)#01349A (19,0)#00349A (20,0)#013399 (21,0)#00329A (22,0)#01339A (23,0)#013398 (24,0)#01349A (25,0)#013499 (26,0)#013499 (27,0)#013498 (28,0)#003399 (29,0)#00339A (30,0)#00329A (31,0)#013398 (32,0)#003299 (33,0)#003399 (34,0)#013499 (35,0)#003299 (36,0)#003399 (37,0)#003399 (38,0)#01339A (39,0)#01339A (40,0)#003399 (41,0)#013299 (42,0)#013499 (43,0)#01329A (44,0)#013399 (45,0)#003399
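The extraction steps described above can be sketched in Python using the pixel values listed. This is a sketch under stated assumptions, not a definitive solution: it assumes the bits are taken in R, G, B order within each pixel, that "reversing" means the first extracted bit of each group is the least significant one (for both the 32-bit length and each 8-bit character), and that the characters are ASCII. The function names are illustrative.

```python
# Pixel colours copied from the listing above (row 0, x = 0..45).
PIXELS_HEX = """
013499 01329A 00329A 003298 00329A 00349A 00329A 003498 00349A 003498
003298 00339A 013399 003499 003299 013398 013298 013499 01349A 00349A
013399 00329A 01339A 013398 01349A 013499 013499 013498 003399 00339A
00329A 013398 003299 003399 013499 003299 003399 003399 01339A 01339A
003399 013299 013499 01329A 013399 003399
""".split()

def lsb_stream(pixels_hex):
    """Yield the LSB of R, then G, then B for each pixel, in order."""
    for px in pixels_hex:
        for i in (0, 2, 4):
            yield int(px[i:i + 2], 16) & 1

def bits_to_int_lsb_first(bits):
    """Interpret a bit list with the first bit as the least significant."""
    return sum(b << i for i, b in enumerate(bits))

def extract_message(pixels_hex):
    bits = list(lsb_stream(pixels_hex))
    # First 32 bits: the Pascal-style length field.
    length = bits_to_int_lsb_first(bits[:32])
    # Next 8 * length bits: the characters.
    body = bits[32:32 + 8 * length]
    if len(body) < 8 * length:
        raise ValueError("not enough pixels for the declared length")
    chars = [bits_to_int_lsb_first(body[i:i + 8])
             for i in range(0, len(body), 8)]
    return bytes(chars).decode("ascii")

message = extract_message(PIXELS_HEX)
```

On the 46 pixels listed, this reading gives a length of 13 and the 13-character ASCII string `triplications`, which suggests the assumptions match how the message was embedded. Reading the same bits most-significant first gives an implausibly large length, which is the "reversing" hint in the question.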

Explanation / Answer

LSB insertion modifies the LSBs of each color in 24-bit images, or the LSBs of the 8-bit value for 8-bit images.

Example:

The letter 'A' has an ASCII code of 65 (decimal), which is 01000001 as an 8-bit binary value.

Storing an 'A' in a 24-bit image takes three consecutive pixels (nine colour bytes, of which eight carry one message bit each):

Let's say that the pixels before the insertion are:

10000000.10100100.10110101, 10110101.11110011.10110111, 11100111.10110011.00110011

Then, writing the message bits most-significant bit first, their values after the insertion of an 'A' will be:

10000000.10100101.10110100, 10110100.11110010.10110110, 11100110.10110011.00110011

(The second to seventh bytes are the ones modified by the transformation. The first and eighth already had the required LSB, and the ninth is not used.)

The same example for an 8-bit image would have needed 8 pixels:

10000000, 10100100, 10110101, 10110101, 11110011, 10110111, 11100111, 10110011

Then their values after the insertion of an 'A' would have been:

10000000, 10100101, 10110100, 10110100, 11110010, 10110110, 11100110, 10110011

(Again, the second to seventh values are the ones modified by the transformation.)
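The insertion step itself can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and writing the message bits most-significant bit first is one possible convention (the image in the question appears to use the opposite order).

```python
def embed_byte(cover, value):
    """Return a copy of `cover` whose first 8 bytes carry `value` in their LSBs.

    Assumption: the bits of `value` are written most-significant bit first.
    """
    if len(cover) < 8:
        raise ValueError("need at least 8 cover bytes per message byte")
    out = list(cover)
    for i in range(8):
        bit = (value >> (7 - i)) & 1
        out[i] = (out[i] & 0xFE) | bit  # clear the LSB, then set it to `bit`
    return out

def extract_byte(stego):
    """Rebuild the byte from the LSBs of the first 8 bytes, MSB first."""
    value = 0
    for b in stego[:8]:
        value = (value << 1) | (b & 1)
    return value

# The eight cover bytes from the example, then 'A' embedded into them.
cover = [0b10000000, 0b10100100, 0b10110101, 0b10110101,
         0b11110011, 0b10110111, 0b11100111, 0b10110011]
stego = embed_byte(cover, ord("A"))
```

Each cover byte changes by at most 1, because only its lowest bit is ever rewritten.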

From these examples we can infer that 1-LSB insertion changes the LSB of a cover byte with a probability of about 50% (half the time the LSB already has the required value) and never touches the other seven bits, thus adding very little noise to the original picture.

For 24-bit images the modification can sometimes be extended to the second or even the third LSB without becoming visible. 8-bit images, by contrast, have a much more limited set of colors to choose from, so usually only the LSB can be changed without the modification being detectable.

Data Rate

The most basic form of LSB insertion for 24-bit pictures inserts 3 bits/pixel. Since every pixel is 24 bits, we can hide

3 hidden_bits/pixel / 24 data_bits/pixel = 1/8 hidden_bits/data_bits

So for this case we hide 1 bit of the embedded message for every 8 bits of the cover image.

If we pushed the insertion to include the second LSBs, the formula would change to:

6 hidden_bits/pixel / 24 data_bits/pixel = 2/8 hidden_bits/data_bits

And we would hide 2 bits of the embedded message for every 8 bits of the cover image. Adding a third-bit insertion, we would get:

9 hidden_bits/pixel / 24 data_bits/pixel = 3/8 hidden_bits/data_bits

This gives a data rate of 3 embedded bits for every 8 bits of the image.

The data rate for insertion in 8-bit images is analogous to the 1 LSB insertion in 24-bit images, or 1 embedded bit every 8 cover bits.

We can see the problem in another light, and ask how many cover bytes are needed to send an embedded byte.

For 1-LSB insertion in 24-bit images or in 8-bit images this value is 8/1 = 8 bytes; for 2-LSB insertion in 24-bit pictures it is 8/2 = 4 bytes; for 3-LSB insertion it is 8/3, or about 2.67 bytes.
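The same arithmetic can be written as a short Python sketch: the number of cover bytes per embedded byte is 8 divided by the number of LSBs used in each cover byte. The function names and the 500x300 example size are illustrative assumptions.

```python
def cover_bytes_per_message_byte(lsbs_per_byte: int) -> float:
    """Cover bytes needed to carry one message byte (8 message bits)."""
    return 8 / lsbs_per_byte

def capacity_bytes(width: int, height: int,
                   bytes_per_pixel: int = 3, lsbs_per_byte: int = 1) -> int:
    """Whole message bytes that fit in an uncompressed image."""
    hidden_bits = width * height * bytes_per_pixel * lsbs_per_byte
    return hidden_bits // 8

ratios = [cover_bytes_per_message_byte(k) for k in (1, 2, 3)]
example_capacity = capacity_bytes(500, 300)  # 24-bit image, 1 LSB per byte
```

For a 500x300 24-bit image with 1-LSB insertion, this gives 450,000 hidden bits, or 56,250 message bytes.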

Robustness

LSB insertion is vulnerable to many transformations, even the most harmless and common ones.

Lossy compression, e.g. JPEG, is very likely to destroy it completely. The problem is that the "holes" in the Human Visual System that LSB insertion tries to exploit - little sensitivity to added noise - are the same that lossy compression algorithms rely on to be able to reduce the data rate of images.

Geometrical transformations, moving the pixels around and especially displacing them from the original grid, are likely to destroy the embedded message, and the only one that could allow recovery is a simple translation.

Any other kind of picture transformation, like blurring or other effects, usually will destroy the hidden data.

All in all, LSB insertion is not a robust technique for data hiding.

Ease of detection/extraction

In theory, LSB insertion leaves no distinctive mark other than a slight increase in background noise.

On the other hand, it is very easy to extract the LSBs, even with a simple program, and then check whether they mean anything.

Suitability for steganography or watermarking

First of all, since it is so vulnerable even to simple processing, LSB insertion is almost useless for digital watermarking, where it must survive malicious attempts to destroy it as well as normal transformations such as compression/decompression or conversion to analog (printing or display) and back to digital (scanning).

Its comparatively high data rate makes it a good candidate for steganography, where robustness is not such an important constraint.

Problems and possible solutions

Having stated that LSB insertion is good for steganography, we can try to improve on one of its major drawbacks: the ease of extraction. We don't want a malicious attacker to be able to read everything we are sending.

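This is usually accomplished with two complementary techniques: encrypting the message before it is embedded, and scrambling the positions of the embedded bits with a keyed pseudo-random function.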

In this way, the message is protected by two different keys, gaining much more confidentiality than before.

This approach also protects the integrity of the message, since it becomes much more difficult (we could say at least computationally infeasible) to counterfeit it.
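As one illustration of a key-protected embedding, the sketch below scatters the message bits over cover positions chosen by a seeded pseudo-random generator, so the same key is needed to find them again. The function names are hypothetical, and Python's `random` module is used only to keep the sketch short: it is not cryptographically secure, and a real system would need a keyed cryptographic generator and would normally encrypt the message first.

```python
import random

def scatter_positions(key: int, cover_len: int, n_bits: int):
    """Pick n_bits distinct cover positions, reproducible from the key."""
    rng = random.Random(key)
    return rng.sample(range(cover_len), n_bits)

def embed_scattered(cover: bytes, bits, key: int) -> bytes:
    """Write each bit into the LSB of a key-selected cover byte."""
    out = bytearray(cover)
    positions = scatter_positions(key, len(cover), len(bits))
    for pos, bit in zip(positions, bits):
        out[pos] = (out[pos] & 0xFE) | bit
    return bytes(out)

def extract_scattered(stego: bytes, n_bits: int, key: int):
    """Read the LSBs back in the same key-selected order."""
    positions = scatter_positions(key, len(stego), n_bits)
    return [stego[pos] & 1 for pos in positions]

cover = bytes(range(64))
bits = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_scattered(cover, bits, key=1234)
recovered = extract_scattered(stego, len(bits), key=1234)
```

Without the key, an attacker who dumps the LSBs in row-major order sees the message bits mixed in with untouched cover bits.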

However, since the goal is not merely an encrypted and scrambled message, we must return to the purpose of keeping the communication itself hidden.

The two most important issues here are the choice of the cover image and the choice of the image format.

First of all, the cover image must look innocuous, so it must be chosen from a set of subjects that the source and the receiver have a plausible reason to exchange.

It must also have widely varying colors: it must be "noisy", so that the added noise is masked by the noise already present. Wide solid-color areas greatly magnify any small amount of noise added to them.

Second, there is a problem with the file size, which involves the choice of the format. Unusually big files exchanged between two peers are likely to arouse suspicion.

Let's calculate, for instance, what the size would be for a 500x300 image (150,000 pixels), quite common for pictures on the Internet, with the different color representations:
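As a rough check, the raw sizes can be computed with a short Python sketch. It counts only uncompressed pixel data plus an optional palette, ignores file headers, and assumes 3 bytes per palette entry; the function name is illustrative.

```python
def raw_image_bytes(width: int, height: int, bits_per_pixel: int,
                    palette_entries: int = 0) -> int:
    """Uncompressed pixel data plus palette, ignoring file headers."""
    pixel_bytes = width * height * bits_per_pixel // 8
    palette_bytes = palette_entries * 3  # assumption: 3 bytes (RGB) per entry
    return pixel_bytes + palette_bytes

size_24bit = raw_image_bytes(500, 300, 24)                      # 450,000 bytes
size_8bit = raw_image_bytes(500, 300, 8, palette_entries=256)   # 150,768 bytes
```

Under these assumptions the 24-bit version is about three times the size of the 8-bit one, at roughly 440 KB against 147 KB.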

Looking at the size, we can see that a 24-bit uncompressed picture has a rather unusual size: it would be strange for the sender not to have compressed it, since compression is widely used and would not have degraded the image quality much.

To address this problem, a modification of the JPEG algorithm has been studied that inserts the LSBs in some of the lossless stages, or that steers the rounding of the DCT coefficients used to compress the image so that they encode the bits.

Since we need small image files, we should resort to 8-bit images if we want to communicate using LSB insertion, because their size is more likely to be considered normal.

The problem with 256-color images is that they use an indexed palette, so changing an LSB switches a pixel from one palette position to an adjacent one. If adjacent palette entries hold contrasting colors, a pixel in the image can change color abruptly and the hidden message becomes visible.
