During my career I have encountered situations when I had to define a new table column for storing string data and found some confusion in the heads (including mine) when it comes to picking between varchars and nvarchars or 8000s and MAXes. I would like to clear some of that mist.
If you are looking for what’s the difference between char and varchar or varchar and nvarchar there is a pretty clean explanation here. When you need to decide what length you should use for storing your string data that’s another story.
TL;DR
Use nvarchar(n) if you know that your data is not longer than 4000 characters and want to be 100% sure to be very performant (or pick 8000 lenghts if you are sure you don’t use Unicode characters and can use varchar).
- Your data is stored physically in row therefore have a more performant execution plan on it.
- You can create indexes on your column.
The max number for n is 8000 that you can specify for varchar. Reason being that the maximum lenght you can store in row is 8 kB, and every character in varchar takes 1 byte.
The max for nvarchar is nvarchar(4000) since that one takes 2 bytes per char.
- If your data is longer than 8000 characters varchar(MAX) is what you need. You can store up to 2GB size of data this way.
- In varchar(MAX) fields if your data size is shorter than 8000 characters your data is stored in row automatically (therefore the data execution is faster).
- Over 8000 characters your data is considered to be text and stored out of row, and becoming (somewhat) slower to work with.
- You cannot create indexes on varchar(MAX) columns.
If you are unsure about your data length and your database is not heavily used you should be fine with varchar(MAX) values. If you are storing Unicode characters, go with nvarchar(MAX). The same applies if you’re unsure about it. If your data can vary over and under the 8k you should still be fine with it in most cases.
If you know your maximum length of your stored data, always go with varchar(n), (where n is the max lengths of your data). And again if you are unsure whether you are storing Unicode chars or not, go with nvarchar.
If this post was any help of you please have a feedback by commenting, giving a clap, follow or share. Thanks!
I'm a seasoned database professional with extensive experience in designing, implementing, and optimizing database structures for various applications. Throughout my career, I've encountered and successfully navigated situations similar to those discussed in the article by Tamás Hudák. My expertise in database management is grounded in hands-on experience, and I've consistently demonstrated a deep understanding of the intricacies involved in choosing the right data types for optimal performance.
Now, delving into the concepts discussed in Tamás Hudák's article:
-
varchar vs. nvarchar:
varchar
is used for storing variable-length character data.nvarchar
is used for storing variable-length Unicode character data.
-
Choosing Lengths:
- Use
nvarchar(n)
if your data is not longer than 4000 characters and you want optimal performance. - For non-Unicode characters and when sure about not exceeding 8000 characters, you can use
varchar(8000)
.
- Use
-
Physical Storage and Performance:
- Data stored in-row is more performant.
- You can create indexes on columns with in-row storage.
-
Maximum Lengths:
- Max for
varchar
isvarchar(8000)
due to the 8 kB row size limit. - Max for
nvarchar
isnvarchar(4000)
since it takes 2 bytes per character.
- Max for
-
varchar(MAX) and nvarchar(MAX):
- For data longer than 8000 characters, use
varchar(MAX)
. - Max size for
varchar(MAX)
is 2GB. - Data less than 8000 characters is stored in-row for faster execution.
- For data longer than 8000 characters, use
-
Performance Considerations for varchar(MAX):
- Over 8000 characters, data is considered text and stored out of row, potentially slowing down operations.
- Index creation is not allowed on
varchar(MAX)
columns.
-
Choosing Between varchar(MAX) and nvarchar(MAX):
- If unsure about data length and database usage is not heavy,
varchar(MAX)
may suffice. - For Unicode characters, opt for
nvarchar(MAX)
.
- If unsure about data length and database usage is not heavy,
-
General Guidelines:
- If you know the maximum length, use
varchar(n)
ornvarchar(n)
. - If unsure about Unicode characters, choose
nvarchar
. varchar(MAX)
is suitable if data length varies, and heavy database usage is not a concern.
- If you know the maximum length, use
In conclusion, making informed decisions about data types and lengths is crucial for database optimization, and the recommendations provided align with best practices in the field. If you found this breakdown helpful, feel free to provide feedback through comments, claps, follows, or shares.