In this entry of the series we explore using VARCHAR and CHAR data types in your database and give some pointers on which type is best to use and when.
Overview
Ever find yourself building a database only to start questioning what data types you should use for a specific column? In this entry of the MySQL data types series, we’ll explore the various ways you can save strings and text to a database to help demystify the options you have as a developer, starting with VARCHAR
and CHAR
.
VARCHAR vs CHAR
VARCHAR
is probably the most widely used data type for strings. It stores a string of variable length up to a maximum of 65,535 characters. When creating a VARCHAR
field, you’ll be able to specify the maxmimum number of characters the field will accept using the VARCHAR(n)
format, where n
is the maximum number of characters to be stored. Due to the fact that is is variable length, it will only allocate enough disk space to store the contents of the string, not the full length of the contents passed in.
VARCHAR
also allocates a little bit of extra space with each value stored. Depending on the space required to store the data, 1 or 2 bytes of overhead will be allocated. If the space required is less than 255 bytes, a 1 byte prefix will be added, otherwise a 2 byte prefix will be used. The exact space required to store a value depends on the character set used (more on that in a bit).
CHAR
is another method to store strings, but it has a maximum length of 255 and is fixed length. As with VARCHAR
, you may optionally set the maximum number of characters in a CHAR
field with the CHAR(n)
format. If not specified, n
defaults to 1. The values stored in a CHAR
column are right-padded with empty spaces, so it will always store n
characters regardless of the string being saved. In certain circ*mstances, this can actually increase performance of the database.
Factoring in the charset
While most programming languages use characters from the English language, humans across the world write and read using different types of characters. This can be something simple like Ñ
in Spanish, or something very different like データベース
in Japanese. To address this, MySQL has different character sets (or charsets) to address the symbols used in different languages. Character sets affect the way text is stored in the database, but also affect the amount of storage space allocated when saving data.
For example, when using the default charset of utf8mb4
, MySQL will allocate 4 bytes per character stored. Factoring this in, along with a maximum row size of 65,535 bytes across ALL columns, you'd realistically only be able to create a VARCHAR
column with a maximum length 16,383 characters due to the storage requirements for each character.
Visualizing the differences
When saving data to a CHAR
field, one side effect is that any trailing spaces in the string are effectively lost when the value is saved. In fact, MySQL will not even return trailing spaces when you query data from a CHAR
column because it has to assume that the extra spaces are just padding.
To demonstrate this, let’s create a table with two columns, one VARCHAR(20)
and one CHAR(20)
. We’ll then insert some data with five spaces at the end of it to see how it’s stored.
SQL
CREATE TABLE strings(
id INT PRIMARY KEY AUTO_INCREMENT,
variable VARCHAR(20),
fixed CHAR(20)
);
INSERT INTO strings (variable, fixed) VALUES ("Drifter ", "Drifter ");
Now if I run the following SELECT
statement, it appears that the returned data is the same.
SQL
SELECT * FROM strings;
However, if I use the CHAR_LENGTH
function to calculate the number of characters used in each field, you’ll notice the data stored in the VARCHAR
field (respresented with varchar_data_length
) is 12, which considers the 5 extra space characters at the end, whereas the CHAR
field only shows 7. This is because MySQL is storing the whitespace at the end of the VARCHAR
value, but it assumes the extra space at the end of the CHAR
value is the padding that was appended based on the data type.
SQL
SELECT CHAR_LENGTH(variable) AS varchar_data_length, CHAR_LENGTH(fixed) AS char_data_length FROM strings;
As stated earlier, VARCHAR
values have an extra overhead when the data is written to disk as well. This means that if you are storing the string “Spider” which is 6 characters in length, and you are storing it in both a VARCHAR(6)
and a CHAR(6)
column, the VARCHAR
value will use 25 bytes (4 bytes per character using the utf8mb4
character set plus 1 byte of overhead) of disk space, whereas the CHAR
value will use 24 bytes.
However, if you store “Eido” in those same columns, the VARCHAR
will only use 5 bytes and the CHAR
will still use 6 bytes. Since the CHAR
data type is fixed-length, it is right-padded with 2 empty spaces, for a total of 6.
Value | VARCHAR(6) Stored value | VARCHAR(6) Space used | CHAR(6) Stored value | CHAR(6) Space used |
---|---|---|---|---|
"Spider" | "Spider" | 25 bytes | "Spider" | 24 bytes |
"Eido" | "Eido" | 17 bytes | "Eido" | 24 bytes |
"Eido" | "Eido" | 25 bytes | "Eido" | 24 bytes |
When to use VARCHAR vs CHAR
Now that you understand the differences between VARCHAR
and CHAR
, here are a few tips on deciding which data type fits your application best:
Use VARCHAR
if:
- You need to store a string with more than 255 characters.
- You find yourself in a rare scenario where you do need to preserve trailing spaces.
Use CHAR
if:
You are at or below 255 characters, and you always know the length of the string.
A fixed-length serial number would be a good example of when
CHAR
is useful.
Want a powerful MySQL database that doesn’t skimp on developer experience?
Further learning
If you'd like to learn more about data types in MySQL, we have an article on the INT data type and one on the JSON data type that you may find useful.
We also have short videos on the following data types:
- Integers
- Decimals
- Strings
- Binary Strings
- Long Strings
- Enums
- Dates
- JSON
Share
Next post
Debugging database errors with InsightsAs a seasoned database professional with extensive expertise in MySQL and database design, let's delve into the concepts discussed in the article on using VARCHAR and CHAR data types.
1. VARCHAR vs CHAR:
Overview:
- VARCHAR: Variable-length data type widely used for strings, accommodating up to 65,535 characters. It allocates space based on the actual content length, with additional overhead (1 or 2 bytes).
- CHAR: Fixed-length data type limited to 255 characters. Stores a constant number of characters, padding with spaces if necessary. May enhance performance in certain scenarios.
Factoring in the Charset:
- Charset: MySQL supports different character sets (charsets) to address language-specific symbols. The charset affects storage space; for instance, utf8mb4 allocates 4 bytes per character. Consideration of charset is crucial for optimizing storage.
2. Visualizing Differences:
- Trailing Spaces: CHAR columns right-pad with spaces, causing trailing spaces to be lost when stored. This is due to the fixed-length nature of CHAR.
- Demonstration: The article demonstrates the difference in data storage between VARCHAR and CHAR by creating a table with two columns, inserting data with trailing spaces, and showcasing the impact on data length.
Example SQL Queries:
-- Creating the table and inserting data
CREATE TABLE strings(id INT PRIMARY KEY AUTO_INCREMENT,variable VARCHAR(20),fixed CHAR(20));
INSERT INTO strings (variable, fixed) VALUES ("Drifter ", "Drifter ");
-- Selecting data and calculating data length
SELECT * FROM strings;
SELECT CHAR_LENGTH(variable) AS varchar_data_length, CHAR_LENGTH(fixed) AS char_data_length FROM strings;
Visual Representation of Storage:
- The article provides a visual representation of storage differences between VARCHAR and CHAR using examples with the strings "Spider" and "Eido." It illustrates how storage varies based on the actual content length.
3. When to Use VARCHAR vs CHAR:
- Use VARCHAR if:
- Storing strings with more than 255 characters.
- Needing to preserve trailing spaces, which is a unique requirement.
- Use CHAR if:
- Operating within or below 255 characters, and the exact length is always known.
- Examples include fixed-length serial numbers where CHAR's constant length is beneficial.
Conclusion:
The article provides valuable insights into the nuances of VARCHAR and CHAR data types, offering practical tips for choosing the appropriate type based on specific use cases. Understanding these distinctions is crucial for efficient database design and optimization.
Further Learning:
- The article suggests additional reading on the INT and JSON data types in MySQL for a comprehensive understanding of MySQL data types.
- Short videos on various data types, including Integers, Decimals, Strings, Binary Strings, Long Strings, Enums, Dates, and JSON, are recommended for deeper exploration.
For individuals seeking a robust MySQL database with an excellent developer experience, the article suggests trying PlanetScale.
In conclusion, the provided information reflects a deep understanding of MySQL data types, emphasizing practical considerations for developers and database designers.