Suppose I had a message in Chinese (or another non-phonetic language) and I want

Question

Suppose I had a message in Chinese (or another non-phonetic language) and I wanted to encipher it.

Some of the simplest encryption schemes in English are substitution ciphers, but such ciphers don't seem to be a viable option for a language such as Chinese (it's more similar to a code...). What would be a simple cipher for such an alphabet? (Let's assume that we don't want to transliterate it to English first.)

Second, suppose that someone had used a substitution cipher on such a language (in the code sense where every word stood for another).

Would frequency analysis still work?
Are non-phonetic languages more or less secure when used with substitution ciphers?
Assuming frequency analysis is not an option, how would one break it?

Explanation / Answer

To design a cipher, one first has to decide on the alphabet.

This is a bit problematic for a language like Chinese, since it is not really clear how many (and which) characters should be included. The number of characters people know varies greatly, and you don't want your encrypted message to be undecryptable just because the ciphertext contains a character the recipient doesn't know. One possibility would be to restrict the ciphertext alphabet to a subset "known to everyone" (while still allowing a larger alphabet on the plaintext side).

Assuming we have decided on an alphabet, we should also decide on an encoding of it, i.e. an ordering of the characters or a mapping to numbers. This lets the actual encryption use a simple mathematical formula or algorithm instead of a "translation table" as the key (which would be quite large for a large alphabet).
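As a rough illustration of the idea of ordering the characters and then encrypting with a formula, here is a minimal sketch of a shift cipher. The 16-character `ALPHABET`, the function names, and the choice to pass unknown characters through unchanged are all assumptions made for this example; a real alphabet would be far larger.

```python
# Hypothetical toy alphabet: an ordered list of 16 common characters.
# The position in this list is the "encoding" of each character.
ALPHABET = list("的一是不了人我在有他这中大来上国")
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def shift_encrypt(plaintext, key):
    """Replace each known character by the one `key` positions later (mod n)."""
    n = len(ALPHABET)
    return "".join(
        # Assumption: characters outside the alphabet are left unchanged.
        ALPHABET[(INDEX[ch] + key) % n] if ch in INDEX else ch
        for ch in plaintext
    )


def shift_decrypt(ciphertext, key):
    """Undo shift_encrypt by shifting in the opposite direction."""
    return shift_encrypt(ciphertext, -key)


message = "我是中国人"
secret = shift_encrypt(message, 5)
print(secret, shift_decrypt(secret, 5))
```

The key here is a single number rather than a table with one entry per character, which is the saving the paragraph above describes.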

Every monoalphabetic substitution cipher can be written as one "big table", so an attack that breaks the big-table cipher also breaks every other monoalphabetic cipher (these are in fact only shorter ways of writing the big table).
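To make the "big table" point concrete, the sketch below expands a formula-based cipher into its full substitution table. The affine formula, the toy `ALPHABET`, and the key values 5 and 3 are assumptions chosen for illustration, not something the answer prescribes.

```python
from math import gcd

# Hypothetical toy alphabet of 16 ordered characters.
ALPHABET = list("的一是不了人我在有他这中大来上国")


def affine_table(a, b):
    """Expand the formula y = (a*x + b) mod n into an explicit table."""
    n = len(ALPHABET)
    if gcd(a, n) != 1:
        # Without this condition two characters would share one image.
        raise ValueError("a must be coprime to the alphabet size")
    return {ALPHABET[x]: ALPHABET[(a * x + b) % n] for x in range(n)}


table = affine_table(5, 3)                        # the short key (5, 3) ...
inverse = {c: p for p, c in table.items()}        # ... as a full table and its inverse

ciphertext = "".join(table[ch] for ch in "我是中国人")
print(ciphertext, "".join(inverse[ch] for ch in ciphertext))
```

An attacker who can recover an arbitrary table does not need to know that it came from a two-number key.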

So, how do you break a monoalphabetic cipher? Just like always. In a language like Chinese, some words (= characters) are more frequent than others, and so are some two-word and three-word combinations, just as in European phonetic alphabets. The problem is only that there are many more distinct characters, so you need a lot more text to get statistically significant samples. You then try the most probable words first, fill in the rest, and see whether the result starts to make sense. Characters that occur only once or twice probably can't be decoded at all if several different candidates would make sense in their place.
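The first step of such an attack could be sketched as follows. The function names and the idea of pairing ranked ciphertext symbols with a reference ranking are assumptions for illustration; the reference ranking would have to come from a large plaintext corpus, which is not included here.

```python
from collections import Counter


def char_frequencies(text, top=5):
    """Most common single symbols, ignoring whitespace only."""
    return Counter(ch for ch in text if not ch.isspace()).most_common(top)


def bigram_frequencies(text, top=5):
    """Most common pairs of adjacent symbols."""
    chars = [ch for ch in text if not ch.isspace()]
    return Counter(zip(chars, chars[1:])).most_common(top)


def initial_guess(ciphertext, reference_ranking):
    """Pair the i-th most common ciphertext symbol with the i-th most
    common symbol of a reference ranking. This is only a starting point
    that has to be corrected by hand or by further statistics."""
    ranked = [ch for ch, _ in char_frequencies(ciphertext, top=len(reference_ranking))]
    return dict(zip(ranked, reference_ranking))


# Usage with placeholder symbols standing in for ciphertext characters.
print(initial_guess("aaabbc", ["的", "一", "是"]))
```

With thousands of distinct symbols, most counts in a short ciphertext are 0 or 1, so the ranking is only meaningful for the handful of most frequent symbols unless a lot of text is available.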

The reason that modern substitution ciphers on really large alphabets (like AES) are comparatively secure even in ECB mode (in other modes they are not pure substitution ciphers) is that an attacker does not get enough blocks to find many repeated ones, and that the whole substitution table simply cannot be written down, for reasons of space and time.
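A back-of-the-envelope calculation shows the difference in scale. The AES block size of 128 bits is standard; the 5000-character alphabet and the 3 bytes per character are assumptions picked only for comparison.

```python
# AES treats each 128-bit block as one "letter" of its alphabet.
BLOCK_BITS = 128
BLOCK_BYTES = BLOCK_BITS // 8

aes_entries = 2 ** BLOCK_BITS                 # one table row per possible block
aes_table_bytes = aes_entries * BLOCK_BYTES   # each row stores one output block

# Assumption: a hypothetical 5000-character alphabet, 3 bytes per
# character (the UTF-8 size of common Chinese characters).
HYPOTHETICAL_CHARSET = 5000
BYTES_PER_CHAR = 3
charset_table_bytes = HYPOTHETICAL_CHARSET * BYTES_PER_CHAR

print(f"AES table:       {aes_table_bytes:.3e} bytes")
print(f"Character table: {charset_table_bytes} bytes")
```

Under these assumptions the character table fits in a few kilobytes, while the AES table is far beyond anything that could be stored.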
