


Question

I did check out the previous postings and didn't find what I was looking for; apologies if this is a repetition.

I am working on an analytics engagement for a customer, and he needs to share access to his customer sales transactions for me to do my work. The idea was to create an encryption routine whereby details such as SSN and ZIP code would be anonymized so that I wouldn't have access to PII. When I present the findings, he needs to be able to recover the original data from the garbled text I work with.

From a definition perspective, I understand masking/obfuscation would hide the details and possibly impact referential integrity. I read up on tokenization and believe it is quite a complex model. I am probably missing a nuance here...

I really don't care about the processing time for encrypt/decrypt; that is not relevant and processing cost can be ignored. I just need a way for an SSN such as 123-45-6789 to be encrypted consistently to, say, A467YuGHT, so I can work unimpeded and the customer is comfortable that no data he has shared with me exposes his customers' PII. When I submit a report stating that A467YuGHT is a potential churn, he deciphers it back to 123-45-6789 [there are too many such PII fields for me to create alternate identifiers].

I was thinking of a private key (which the customer retains)/public key pair to do this. Am I missing something here? Is there any open source tool that does this?

Explanation / Answer

I understand your issues, but one of the fundamental maxims of cryptography is that you shouldn't reinvent the wheel and create your own encryption algorithm, primarily because it will be nowhere near as secure as an established one.

Is this data going to be accessible over the Internet or transmitted over unencrypted channels? If the answer to either of those is yes, the risk rises sharply. If you rely on your own home-baked obfuscation, all it takes is one determined attacker working out the scheme and they will have all of the SSNs.

How could they find it? What if your customer leaves something lying around that explains how to decrypt? What if your email gets hacked? What if your server isn't as secure as you thought it was?

You really need a solution where, if any of those things happen, an adversary still couldn't decrypt the data, and really that only comes with established public-key infrastructure. Sure, there are ways to defeat that too, but in practice they are a lot more difficult than compromising a home-grown encryption algorithm.
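To make the "consistent token, reversible only by the customer" requirement from the question concrete, here is a minimal sketch built only on an established primitive (HMAC-SHA256 from the Python standard library). Note that this is keyed pseudonymization with a lookup table, not public-key encryption; it is swapped in here because it gives the same token for the same input every time. The class name `Pseudonymizer`, the 16-character token length, and the in-memory dictionary are all assumptions for illustration. The customer would run this on his side and keep both the key and the reverse table. The key matters: SSNs have only about a billion possible values, so an unkeyed hash could be reversed by simply trying them all.

```python
import hashlib
import hmac
import secrets


class Pseudonymizer:
    """Hypothetical helper: maps PII values to stable tokens.

    Intended to be run by the data owner, who keeps the key and the
    reverse table private. The analyst only ever sees the tokens.
    """

    def __init__(self, key: bytes):
        self._key = key
        self._reverse = {}  # token -> original value; never shared

    def tokenize(self, value: str) -> str:
        # Keyed hash: same key + same value always yields the same token.
        digest = hmac.new(self._key, value.encode("utf-8"), hashlib.sha256).hexdigest()
        token = digest[:16]  # truncated for readability (assumed length)
        # Record the mapping; refuse to continue on the (unlikely) collision.
        existing = self._reverse.setdefault(token, value)
        if existing != value:
            raise ValueError("token collision; use a longer token")
        return token

    def reveal(self, token: str) -> str:
        # Only possible where the reverse table lives, i.e. with the data owner.
        return self._reverse[token]


key = secrets.token_bytes(32)  # random 256-bit key, retained by the customer
p = Pseudonymizer(key)
token = p.tokenize("123-45-6789")
```

Because the tokens are deterministic, joins and group-bys on the tokenized column still work, which is what keeps referential integrity intact. The same property means equal values stay visibly equal, so for low-variety fields such as ZIP codes the frequency pattern is still there for anyone to see.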
