During our November 26, 2025, Office Hours livestream, a viewer brought up an excellent question. What does the alphabet soup of UTF-8, UTF8MB3, UTF8MB4, etc. mean? It really does look like a jumble of technical jargon at first glance, but it’s actually straightforward to understand.
These settings determine how your database stores text and which characters it can hold. If you have ever encountered them in phpMyAdmin or app configs and wondered what they mean, then look no further. We’ll break it all down in this post.
The main question
During the livestream, a viewer asked Nathan this question.
UTF8 is currently an alias for the character set UTF8MB3, but will be an alias for UTF8MB4 in the future. Please consider using UTF8MB4 in order to be unambiguous. Are you familiar with it? Best practices?
As Nathan aptly put it, it’s alphabet soup. It means nothing to the everyday person. However, it’s a crucial piece of how computers, and therefore databases, understand and store text.
What is UTF-8?
Before we dive deeper into the topic, we must first understand what UTF-8 means. The dash is essential.
It’s a standardized way of representing letters, numbers, punctuation, and symbols inside a computer. It’s a character encoding, a way to store characters defined by Unicode. That matters because computers don’t store letters as letters.
For example, “Hello” isn’t stored as H-E-L-L-O. A computer doesn’t understand letters the same way we do. However, it understands numbers. Because of that, it needs a character set like Unicode to give each character a number, and a character encoding like UTF-8 to store those numbers as bytes.
The character set determines which numbers correspond to which characters. Or, to put a more visual perspective on the explanation, it’s like a giant grid of every character in the Unicode standard. From letters to numbers, punctuation, symbols, and even emojis, each character has a numerical value, called a code point. UTF-8 then turns each code point into one to four bytes.
So, from the computer’s point of view, “Hello” looks like this in Unicode: U+0048 U+0065 U+006C U+006C U+006F.
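To see those numbers for yourself, here is a minimal Python sketch. It is only an illustration and not tied to any database. It prints the Unicode code points for “Hello” and then the bytes UTF-8 uses to store them.

```python
# Show the Unicode code points for "Hello", then the bytes UTF-8 stores for them.
text = "Hello"

# ord() returns the code point, the number Unicode assigns to a character.
code_points = [f"U+{ord(ch):04X}" for ch in text]

# encode() applies the UTF-8 encoding and returns the raw bytes.
utf8_bytes = text.encode("utf-8")

print(" ".join(code_points))  # U+0048 U+0065 U+006C U+006C U+006F
print(utf8_bytes.hex(" "))    # 48 65 6c 6c 6f
```

Each of these letters fits in a single byte, which is why plain English text takes up the same space in UTF-8 as it does in older encodings.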
What is UTF8MB4 then?
On the other hand, you have UTF8MB4, without the dash. You have most likely seen this spelled with lowercase letters in phpMyAdmin, but for clarity, we’ll continue using uppercase throughout this post.
That’s important because there is a significant difference between UTF-8 and UTF8MB4. UTF-8 is the character encoding standard, defined by Unicode, and supports 1-4 bytes per character. It also supports all Unicode characters, including emojis and rare scripts. It’s the standard used across browsers, files, APIs, operating systems, and so on.
However, UTF8MB4 is not the latest version of UTF-8. We understand why it may seem that way. Instead, UTF8MB4 is MySQL’s latest version of its implementation of the standard.
In simpler terms, when MySQL first added UTF-8 support to its databases, it didn’t go with the full version, so to speak. Instead, MySQL implemented only the 1-3 byte portion of UTF-8, leaving out the 4-byte range. That original implementation is what MySQL now calls UTF8MB3, and it is what the UTF8 alias currently points to. It cannot store any character that needs four bytes, which includes emojis and some rarer scripts.
That is, until UTF8MB4 came along, and MySQL databases could finally store up to 4 bytes per character and fully utilize the Unicode standard.
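The byte counts are easy to check. The Python sketch below is an illustration only, and the sample characters are our own picks. It measures how many bytes UTF-8 needs for a few characters and notes which ones fit within a 3-byte limit.

```python
def utf8_length(ch: str) -> int:
    """Return how many bytes UTF-8 needs to store a single character."""
    return len(ch.encode("utf-8"))


# Sample characters (chosen for illustration): A, e with acute accent,
# the euro sign, and a grinning-face emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    size = utf8_length(ch)
    # A 3-byte limit covers sizes 1 to 3; a 4-byte character needs UTF8MB4.
    fits = "UTF8MB3 and UTF8MB4" if size <= 3 else "UTF8MB4 only"
    print(f"U+{ord(ch):04X} needs {size} byte(s): {fits}")
```

Only the emoji crosses the 3-byte line, which is exactly the kind of character the older implementation cannot hold.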
Why does using UTF8MB4 matter?
After all that technical explanation, the reason UTF8MB4 is important is very simple. Unless your MySQL database uses it, your website may not be able to store or show certain characters. Have you ever been on a site and encountered empty boxes or rows of question marks where content should be? That’s often because the site’s database can’t recognize certain characters, and therefore can’t store or serve them. It’s like when you are missing an emoji pack on your phone.
But why should you bother if your site is purely in English and doesn’t support emojis, for example? Because sooner or later, someone is going to paste a symbol that needs four bytes, and it will appear as corrupted text instead. That looks neither professional nor pretty.
Furthermore, UTF8MB4 covers every character in Unicode, so it supports all languages, is future-proof, and works great with many of the most popular modern frameworks (WordPress, Laravel, Drupal, etc.).
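If you are curious whether a piece of text contains such a symbol, a rough Python sketch like this one can flag it. The function name and the sample strings are hypothetical. It relies on the fact that every character above U+FFFF takes four bytes in UTF-8.

```python
def four_byte_characters(text: str) -> list:
    """Return the characters in text that UTF-8 encodes with four bytes."""
    # Code points above U+FFFF are the ones that need four bytes.
    return [ch for ch in text if ord(ch) > 0xFFFF]


# Hypothetical user input: an ordinary sentence, and one with a thumbs-up emoji.
plain = "Plain English, caf\u00e9, 5\u20ac"
pasted = "Great job \U0001F44D"

print(four_byte_characters(plain))   # empty list: safe with a 3-byte limit
print(four_byte_characters(pasted))  # one entry: the thumbs-up emoji
```

Accented letters and the euro sign pass, so a site can run for a long time before the first four-byte character shows up.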
The alphabet soup is telling you how to improve your site
In the end, switching to UTF8MB4 in your database is a good idea. You can check whether your database uses UTF8MB4 by going to phpMyAdmin (or whichever management tool you use) and looking at the table’s collation. If it starts with utf8mb4 (for example, utf8mb4_unicode_ci), then you are all set.
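If you would rather check many tables at once, the same test can be scripted. The Python sketch below makes several assumptions: you run the query yourself with whichever MySQL client library you use, the %s placeholder style may differ in your library, and the table names in the sample rows are made up.

```python
# Query that lists each table's collation from MySQL's information_schema.
# Assumption: you execute it with your own client and pass the database name.
COLLATION_QUERY = (
    "SELECT TABLE_NAME, TABLE_COLLATION "
    "FROM information_schema.TABLES "
    "WHERE TABLE_SCHEMA = %s"
)


def uses_utf8mb4(collation) -> bool:
    """Return True when a collation name belongs to the utf8mb4 character set."""
    # Collation names begin with their character set, e.g. utf8mb4_unicode_ci.
    # Views report no collation (None), so treat that as "not utf8mb4".
    return collation is not None and collation.lower().startswith("utf8mb4")


# Hypothetical rows, shaped like the result of COLLATION_QUERY.
rows = [
    ("wp_posts", "utf8mb4_unicode_ci"),
    ("legacy_orders", "utf8mb3_general_ci"),
    ("old_notes", "latin1_swedish_ci"),
]

for table_name, collation in rows:
    status = "ok" if uses_utf8mb4(collation) else "needs attention"
    print(f"{table_name}: {collation} ({status})")
```

Any table that is flagged is a candidate for conversion, after a backup.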
Depending on your setup, you might need to do things differently, but upgrading will only benefit your website. Of course, it is a change to your database, so always back it up beforehand.
Switching ensures your site can store and display all modern characters, avoids database errors related to them, and handles all manner of scripts without breaking. It’s a small thing with a huge impact.
And if you have a similar question, or any question regarding running an agency, a tricky client issue, or hosting, register for Office Hours and have it answered live.


