Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

We are using MySQL at the company I work for, and we build both client-facing an

ID: 639692 • Letter: W

Question

We are using MySQL at the company I work for, and we build both client-facing and internal applications using Ruby on Rails.

When I started working here, I ran into a problem what I had never encountered before; the database on the production server is set to Latin-1, meaning that the MySQL gem throws an exception whenever there is user input where the user copies & pastes UTF-8 characters.

My boss calls these "bad characters" since most of them are non-printable characters, and says that we need to strip them out. I've found a few ways to do this, but eventually we've ended up in a circumstance where a UTF-8 character was needed. Plus it's a bit of a hassle, especially since it seems like the only solution I ever read about for this issue is to just set the database to UTF-8 (makes sense to me).

The only argument that I've heard for sticking with Latin-1 is that allowing non-printable UTF-8 characters can mess up text/full-text searches in MySQL. Is this really true?

Are there other reasons one should use Latin-1 over UTF-8? It's my understanding that it is superior and becoming more ubiquitous.

Explanation / Answer

Unicode is certainly difficult, and the UTF-8 encoding has a couple of inconvenient properties. However, UTF-8 has become the de-facto standard encoding on the web, surpassing ASCII, Latin-1, UCS-2 and UTF-16. Just use UTF-8 everywhere.

The most important reason why you should support Unicode is that you shouldn't make unnecessary assumptions about user input. I have no idea what your domain is, but things like Hebrew usernames, a blog post about China, a comment with Emoji, or simply well styled text

Hire Me For All Your Tutoring Needs
Integrity-first tutoring: clear explanations, guidance, and feedback.
Drop an Email at
drjack9650@gmail.com
Chat Now And Get Quote