Choose text encoding when you open and save files (2024)

Applies to: Word for Microsoft 365, Word 2021, Word 2019, Word 2016, Word 2013

Typically, you can share text files without worrying about the underlying details of how the text is stored. However, you may need to choose an encoding standard when you open or save a file in these situations:

  • Sharing text files with people working in other languages

  • Downloading text files across the Internet

  • Sharing text files with other computer systems

Encoding standards help Microsoft Word and other programs determine how to represent the text so that it is readable. This may be needed on a computer whose system software is in a different language than the language in which the text was created.

What appears to you as text on the screen is actually stored as numeric values in the text file. Your computer translates the numeric values into visible characters. It does this by using an encoding standard.

An encoding standard is a numbering scheme that assigns each text character in a character set to a numeric value. A character set can include alphabetical characters, numbers, and other symbols. Different languages commonly consist of different sets of characters, so many different encoding standards exist to represent the character sets that are used in different languages.
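The idea of a numbering scheme can be illustrated with a short Python sketch. The sample string is an arbitrary choice for illustration, and the two encodings shown (Latin-1 and UTF-8) are just examples of schemes that store the same characters as different byte values.

```python
# Each character maps to a number (its code point), and an encoding
# standard decides which byte values represent that character in a file.
text = "Aé"

code_points = [ord(ch) for ch in text]        # Unicode code points
latin1_bytes = list(text.encode("latin-1"))   # one byte per character
utf8_bytes = list(text.encode("utf-8"))       # "é" needs two bytes in UTF-8

print(code_points)
print(latin1_bytes)
print(utf8_bytes)
```

The same two characters produce two bytes in one encoding and three in the other, which is why a program must know the encoding before it can read the file.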

Different encoding standards for different alphabets

The encoding standard that is saved with a text file provides the information that your computer needs to display the text on the screen. For example, in the Cyrillic (Windows) encoding, the character Й has the numeric value 201. When you open a file that contains this character on a computer that uses the Cyrillic (Windows) encoding, the computer reads the 201 numeric value and displays Й on the screen.

However, if you open the same file on a computer that uses a different encoding, the computer displays whatever character corresponds to the 201 numeric value in the encoding standard that the computer uses by default. For example, if your computer uses the Western European (Windows) encoding standard, the character in the original Cyrillic-based file will be displayed as É rather than Й because in Western European (Windows) encoding, the value 201 maps to É.
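The Й versus É example can be reproduced with a minimal Python sketch. It assumes that Python's `cp1251` and `cp1252` codecs correspond to the Cyrillic (Windows) and Western European (Windows) encodings named above.

```python
# One stored byte with the numeric value 201.
raw = bytes([201])

# The same byte decoded under two different encoding standards.
as_cyrillic = raw.decode("cp1251")   # Cyrillic (Windows)
as_western = raw.decode("cp1252")    # Western European (Windows)

print(as_cyrillic)
print(as_western)
```

Nothing in the byte itself says which reading is correct; only the encoding chosen at decode time differs.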

Unicode: One encoding standard for many alphabets

To avoid problems with encoding and decoding text files, you can save files with Unicode encoding. Unicode accommodates most character sets across all the languages that are commonly used among computer users today.

Because Word is based on Unicode, Word automatically saves files encoded as Unicode. You can open and read Unicode-encoded files on your English-language computer system regardless of the language of the text. Likewise, when you use your English-language system to save files encoded as Unicode, the file can include characters not found in Western European alphabets, such as Greek, Cyrillic, Arabic, or Japanese characters.
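As a rough illustration of why Unicode suits mixed-language text, the following Python sketch checks whether a string survives a round trip through a given encoding. The helper name `roundtrips` and the sample text are hypothetical, chosen only for this example.

```python
def roundtrips(text: str, encoding: str) -> bool:
    """Return True if text can be encoded and decoded back unchanged."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeEncodeError:
        # The encoding has no value for at least one character.
        return False

# Greek, Cyrillic, Arabic, and Japanese in one string.
mixed = "Ελληνικά Кириллица العربية 日本語"

print(roundtrips(mixed, "utf-8"))
print(roundtrips(mixed, "utf-16"))
print(roundtrips(mixed, "cp1252"))
```

Both Unicode encodings can hold all four scripts, while the Western European code page cannot represent any of them.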

Choose an encoding standard when you open a file

If, when you open a file, text appears garbled or as question marks or boxes, Word may not have accurately detected the encoding standard of text in the file. You can specify the encoding standard to use to display (decode) the text.

  1. Click the File tab.

  2. Click Options.

  3. Click Advanced.

  4. Scroll to the General section, and then select the Confirm file format conversion on open check box.

    Note: When this check box is selected, Word displays the Convert File dialog box every time you open a file in a format other than a Word format (Word formats include .doc, .dot, .docx, .docm, .dotx, or .dotm files). If you frequently work with such files but rarely want to choose an encoding standard, remember to switch this option off to prevent having this dialog box open unnecessarily.

  5. Close and then reopen the file.

  6. In the Convert File dialog box, select Encoded Text.

  7. In the File Conversion dialog box, select Other encoding, and then select the encoding standard that you want from the list.

    You can preview the text in the Preview area to check whether all the text is readable in the encoding standard that you selected.
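Outside Word, the same "try an encoding and look at the preview" approach can be sketched in Python. This is only an illustration of the idea, not how Word implements its Preview area; the function name `preview` and the candidate list are assumptions.

```python
def preview(data: bytes, encodings):
    """Decode the same bytes under each candidate encoding.

    Returns a dict mapping encoding name to decoded text, or None
    when the bytes are not valid in that encoding.
    """
    results = {}
    for name in encodings:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results

# Bytes of a Russian word saved with the Cyrillic (Windows) encoding.
sample = "Привет".encode("cp1251")
previews = preview(sample, ["utf-8", "cp1251", "cp1252"])

for name, text in previews.items():
    print(name, "->", text)
```

Only one candidate yields readable Russian. Another decodes without error but produces accented Latin letters, which is the kind of garbling a preview helps you spot.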

If almost all the text looks the same (for example, all boxes or all dots), the font required for displaying the characters may not be installed. If the font that you need is not available, you can install additional fonts.

To install additional fonts, do the following:

  1. In Microsoft Windows, click the Start button, and then click Control Panel.

  2. Do one of the following:

    In Windows 7

    1. In Control Panel, click Uninstall a program.

    2. In the list of programs, click the listing for Microsoft Office or Microsoft Word, depending on whether you installed Word as part of Office or as an individual program, and then click Change.

    In Windows Vista

    1. In Control Panel, click Uninstall a program.

    2. In the list of programs, click the listing for Microsoft Office or Microsoft Word, depending on whether you installed Word as part of Office or as an individual program, and then click Change.

    In Microsoft Windows XP

    1. In Control Panel, click Add or Remove Programs.

    2. In the Currently installed programs box, click the listing for Microsoft Office or Microsoft Word, depending on whether you installed Word as part of Office or as an individual program, and then click Change.

  3. Under Change your installation of Microsoft Office, click Add or Remove Features, and then click Continue.

  4. Under Installation Options, expand Office Shared Features, and then expand International Support.

  5. Select the font set that you need, click the arrow next to your selection, and then select Run from My Computer.

Tip: When you open an encoded text file, Word applies the fonts that are defined in the Web Options dialog box. (To reach the Web Options dialog box, click the File tab, click Options, and then click Advanced. In the General section, click Web Options.) You can select the options on the Fonts tab in the Web Options dialog box to customize the font for each character set.

Choose an encoding standard when you save a file

If you don't choose an encoding standard when you save a file, Word encodes the file as Unicode. Usually, you can use the default Unicode encoding, because it supports most characters in most languages.

If your document will be opened in a program that does not support Unicode, you can choose an encoding standard that matches that of the target program. For example, Unicode enables you to create a Traditional Chinese language document on your English-language system. However, if the document will be opened in a Traditional Chinese language program that does not support Unicode, you can save the document with Chinese Traditional (Big5) encoding. When the document is opened in the Traditional Chinese language program, all the text is displayed properly.

Note: Because Unicode is the most comprehensive standard, saving text in any other encoding may result in some characters that can no longer be displayed. For example, a document encoded in Unicode can contain Hebrew and Cyrillic text. If this document is saved with Cyrillic (Windows) encoding, the Hebrew text can no longer be displayed, and if the document is saved with Hebrew (Windows) encoding, the Cyrillic text can no longer be displayed.
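The Hebrew and Cyrillic example can be approximated in Python. This sketch assumes that the `cp1251` and `cp1255` codecs stand in for Cyrillic (Windows) and Hebrew (Windows), and it uses `errors="replace"`, which substitutes a question mark for each character the target encoding cannot hold.

```python
# A Unicode string containing both Hebrew and Cyrillic text.
text = "שלום Привет"

# Save with a single-script encoding, then read the result back.
as_cyrillic = text.encode("cp1251", errors="replace").decode("cp1251")
as_hebrew = text.encode("cp1255", errors="replace").decode("cp1255")

print(as_cyrillic)
print(as_hebrew)
```

In each result, the script that the chosen code page does not cover has been replaced, and the original characters cannot be recovered from the saved bytes.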

If you choose an encoding standard that doesn't support the characters you used in the file, Word marks in red the characters that it cannot save. You can preview the text in the encoding standard that you choose before you save the file.

Text formatted in the Symbol font or in field codes is removed from the file when you save a file as encoded text.

Choose an encoding standard

  1. Click the File tab.

  2. Click Save As.

    If you want to save the file in a different folder, locate and open the folder.

  3. In the File name box, type a new name for the file.

  4. In the Save as type box, select Plain Text.

  5. Click Save.

  6. If the Microsoft Office Word Compatibility Checker dialog box appears, click Continue.

  7. In the File Conversion dialog box, select the option for the encoding standard that you want to use:

    • To use the default encoding standard for your system, click Windows (Default).

    • To use the MS-DOS encoding standard, click MS-DOS.

    • To choose a specific encoding standard, click Other encoding, and then select the encoding standard that you want from the list. You can preview the text in the Preview area to check whether all the text is readable in the encoding standard that you selected.

      Note: You can resize the File Conversion dialog box so that you can preview more of your document.

  8. If you receive a message that states, "Text marked in red will not save correctly in the chosen encoding," you can try to choose a different encoding, or you can select the Allow character substitution check box.

    When you allow character substitution, Word replaces a character that cannot be displayed with the closest equivalent character in the encoding that you chose. For example, three dots replace an ellipsis, and straight quotation marks replace curly quotation marks.

    If the encoding that you chose has no equivalent character for a character marked in red, the character marked in red will be saved as an out-of-context character, such as a question mark.

  9. If the document will be opened in a program that does not wrap text from one line to the next, you can include hard line breaks in the document by selecting the Insert line breaks check box, and then specifying whether you want the line breaks to be delineated with a carriage return (CR), line feed (LF), or both, in the End lines with box.
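Steps 8 and 9 can be approximated with a small Python sketch. The substitution table below is hypothetical and covers only the examples mentioned above; Word's actual substitution rules are not described in this article. The sketch also only converts existing line breaks to the chosen ending and does not insert new breaks where lines wrap.

```python
# Hypothetical substitution table (an assumption for illustration only).
SUBSTITUTES = {
    "\u2026": "...",   # ellipsis -> three dots
    "\u201c": '"',     # left curly double quote -> straight quote
    "\u201d": '"',     # right curly double quote -> straight quote
    "\u2018": "'",     # left curly single quote -> straight quote
    "\u2019": "'",     # right curly single quote -> straight quote
}

def encode_with_substitution(text, encoding, line_ending="\r\n"):
    """Encode text, substituting characters the encoding cannot hold.

    Characters with no entry in SUBSTITUTES become a question mark.
    Line breaks are normalized to line_ending (CR, LF, or CR+LF).
    """
    out = []
    for ch in text:
        try:
            ch.encode(encoding)
            out.append(ch)
        except UnicodeEncodeError:
            out.append(SUBSTITUTES.get(ch, "?"))
    normalized = "".join(out).replace("\r\n", "\n").replace("\r", "\n")
    return normalized.replace("\n", line_ending).encode(encoding)

data = encode_with_substitution("\u201cWait\u2026\u201d\n\u0419", "ascii")
print(data)
```

Here the curly quotation marks and the ellipsis have close equivalents, while Й has none in ASCII and is saved as a question mark.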

Word recognizes several encoding standards, and it supports the encoding standards that are provided with the system software on your computer.

The following list of writing systems shows the encoding standards (also called code pages) associated with each writing system.

Writing system | Encoding standards | Font applied
Multilingual | Unicode (UCS-2 little-endian and big-endian, UTF-8, UTF-7) | Default font based on the Normal style for your language version of Word
Arabic | Windows 1256, ASMO 708 | Courier New
Simplified Chinese | GB2312, GBK, EUC-CN, ISO-2022-CN, HZ | SimSun
Traditional Chinese | BIG5, EUC-TW, ISO-2022-TW | MingLiU
Cyrillic | Windows 1251, KOI8-R, KOI8-RU, ISO8859-5, DOS 866 | Courier New
English, Western European, or other Latin script | Windows 1250, 1252-1254, 1257, ISO8859-x | Courier New
Greek | Windows 1253 | Courier New
Hebrew | Windows 1255 | Courier New
Japanese | Shift-JIS, ISO-2022-JP (JIS), EUC-JP | MS Mincho
Korean | Wansung, Johab, ISO-2022-KR, EUC-KR | Malgun Gothic
Thai | Windows 874 | Tahoma
Vietnamese | Windows 1258 | Courier New
Indic: Tamil | ISCII 57004 | Latha
Indic: Nepali | ISCII 57002 (Devanagari) | Mangal
Indic: Konkani | ISCII 57002 (Devanagari) | Mangal
Indic: Hindi | ISCII 57002 (Devanagari) | Mangal
Indic: Assamese | ISCII 57006 | (none listed)
Indic: Bengali | ISCII 57003 | (none listed)
Indic: Gujarati | ISCII 57010 | (none listed)
Indic: Kannada | ISCII 57008 | (none listed)
Indic: Malayalam | ISCII 57009 | (none listed)
Indic: Oriya | ISCII 57007 | (none listed)
Indic: Marathi | ISCII 57002 (Devanagari) | (none listed)
Indic: Punjabi | ISCII 57011 | (none listed)
Indic: Sanskrit | ISCII 57002 (Devanagari) | (none listed)
Indic: Telugu | ISCII 57005 | (none listed)

  • Use of Indic languages requires system support and the appropriate OpenType fonts.

  • Only limited support is available for Nepali, Assamese, Bengali, Gujarati, Malayalam, and Oriya.

FAQs

How do I save a file with specific encoding?

In Visual Studio, from the File menu, choose Save File As, and then click the drop-down button next to the Save button. The Advanced Save Options dialog box is displayed. Under Encoding, select the encoding to use for the file. Optionally, under Line endings, select the format for end-of-line characters.

What encoding should I use for text files?

Since it's now the standard method for encoding text on the web, all your site pages and databases should use UTF-8. A content management system or website builder will save your files in UTF-8 format by default, but it's still a good idea to make sure you're sticking to this best practice.

How to change txt file encoding?

Choose an encoding standard when you open a file
  1. Click the File tab.
  2. Click Options.
  3. Click Advanced.
  4. Scroll to the General section, and then select the Confirm file format conversion on open check box. ...
  5. Close and then reopen the file.
  6. In the Convert File dialog box, select Encoded Text.

How do I select file conversion encoding?

Open Word and click the File tab. Click Options, and then click Advanced. In the General section, select the Confirm file format conversion on open check box. The next time you open a file that is not in a Word format, Word displays the Convert File dialog box, where you can select Encoded Text and then choose the encoding for the file.

How to save txt file as UTF-8?

Use the “Save As” option under the file menu.
  1. Click “Save As,” then choose “Plain Text (.txt)” from the “File Format” dropdown menu.
  2. After clicking “Save” you'll get a new window asking about the text encoding.
  3. Select “Other Encoding” and choose UTF-8 from the right-side menu.
  4. Click OK. Boom! That's it!
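If you produce the text file from a script rather than from Word, the encoding is chosen when the file is written. A minimal Python sketch follows; the file name `example.txt` and the temporary folder are placeholders chosen for illustration.

```python
import tempfile
from pathlib import Path

# Write to a temporary folder so the sketch does not touch real files.
path = Path(tempfile.mkdtemp()) / "example.txt"

# The encoding argument selects UTF-8 explicitly instead of relying
# on the system default.
path.write_text("Й É 日本語", encoding="utf-8")

# Read the raw bytes back to confirm what was stored.
raw = path.read_bytes()
print(raw)
print(raw.decode("utf-8"))
```

Passing the encoding explicitly avoids depending on the default of the computer where the script happens to run.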

How to save a file as UTF-8 encoding?

UTF-8 Encoding in Microsoft Excel (Windows)
  1. Open your CSV file in Microsoft Excel.
  2. Click File in the top-left corner of your screen.
  3. Select Save as...
  4. Click the drop-down menu next to File format.
  5. Select CSV UTF-8 (Comma delimited) (.csv) from the drop-down menu.
  6. Click Save.

How to determine the encoding of a file?

Some text files indicate their encoding with a byte order mark (BOM), a short sequence of bytes at the start of the file. However, even after reading these bytes you can never be sure what encoding a file is really using. For example, a file with the first three bytes 0xEF,0xBB,0xBF is probably a UTF-8 encoded file.
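A check for these leading bytes can be sketched in Python using the byte order mark constants from the standard `codecs` module. The function name `sniff_bom` is hypothetical, and a result of `None` only means that no mark was found, not that the encoding is known.

```python
import codecs

# UTF-32 marks are checked before UTF-16 because the UTF-32
# little-endian mark begins with the same two bytes as UTF-16's.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

def sniff_bom(data: bytes):
    """Return a codec name suggested by a leading byte order mark, or None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None

print(sniff_bom(b"\xef\xbb\xbfhello"))
print(sniff_bom("hello".encode("utf-16")))
print(sniff_bom(b"hello"))
```

As the answer above notes, a match is a strong hint rather than proof, since any file can happen to start with those bytes.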

How to do text encoding?

Four main steps are involved:
  1. transferring selected materials to a computer text editor.
  2. encoding or marking up the document using markup tags and elements.
  3. validating, or checking the correctness of, the document.
  4. presenting the document to the user via a Web or some other interface.

What are the different types of text encoding?

Simple character encoding schemes include UTF-8, UTF-16BE, UTF-32BE, UTF-16LE, and UTF-32LE; compound character encoding schemes, such as UTF-16, UTF-32 and ISO/IEC 2022, switch between several simple schemes by using a byte order mark or escape sequences; compressing schemes try to minimize the number of bytes used ...

What is the default encoding of Windows?

Windows uses UTF-16 for its internal character encoding, not ISO-8859-1. UTF-16 allows representation of a much broader range of characters than ISO-8859-1. Programs that do not use Unicode rely instead on a code page that depends on the system locale, such as Windows 1252 on Western European systems.

What is the default encoding for Notepad?

By default, Notepad encodes files as either ANSI or UTF-8, depending on the Notepad version. ANSI here means the system's default Windows code page, which on Western European systems covers the Latin character set (including the English alphabet). UTF-8 supports the Unicode character set (a global character set).

What is the default text encoding in Notepad?

Older versions of Notepad save text files as ANSI by default, and newer versions default to UTF-8. If you open an ANSI file on a computer that uses a different encoding, the computer displays whatever character corresponds to each numeric value in its own encoding.

How do I change the encoding of a file in Windows?

Change default encoding for new text files in Windows
  1. Create a new text file, open it in any text editor and save this empty file with Unicode encoding: ...
  2. Rename the file to TXTUNICODE.txt and place it in C:\Windows\ShellNew (if you don't have ShellNew folder then create one) ...
  3. Open regedit.exe.

What is the difference between converting and encoding?

Transcoding is the best option if you need to convert a video file to a different format. It helps optimize your video or audio files for different devices, browsers, or streaming services. Encoding is the best option if you are looking to compress the video for faster streaming or downloading.

What is file encoding?

File encodings, also known as character encodings, specify how characters are represented when text is stored or processed. One encoding may be preferable over another depending on which languages' characters it can handle, although Unicode is usually preferred.

How do I save a CSV file with encoding?

Notepad (Windows)

If you have a CSV file from another source, or you are starting with Notepad: Open the CSV file in Notepad. Go to File > Save As and select UTF-8 as the encoding option (use "UTF-8 no BOM" if available). Click Save.

How do I save a file as UTF-8 without BOM encoding?

To make sure your PHP files do not have the BOM, follow these steps:
  1. Download and install this powerful free text editor: Notepad++
  2. Open the file you want to verify/fix in Notepad++
  3. In the top menu select Encoding > Convert to UTF-8 (option without BOM)
  4. Save the file.
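The same clean-up can be done in a script. The following Python sketch removes a leading UTF-8 byte order mark from a byte string; the function name `without_bom` and the sample PHP line are assumptions made for illustration.

```python
import codecs

def without_bom(data: bytes) -> bytes:
    """Return UTF-8 bytes with a leading byte order mark removed, if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# The "utf-8-sig" codec prepends the byte order mark when encoding.
with_bom = "<?php echo 'ok';".encode("utf-8-sig")
clean = without_bom(with_bom)

print(with_bom[:3])
print(clean[:5])
```

Calling the function on bytes that have no mark returns them unchanged, so it is safe to apply to every file in a batch.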

How do I save a CSV file as UTF-16?

Saving Excel data as UTF-16 text is straightforward. Select the “Save As” option, choose the “Unicode Text” format (.txt) under the “Save as type” box, and click “OK”. Note that Excel writes this format as tab-delimited text rather than comma-delimited text.

How do I save a CSV file as UTF-8 without BOM?

Option 2: Save the CSV file without the BOM (BYTE ORDER MARK):
  1. Open your CSV file with any text editor that supports both BOM and NON-BOM.
  2. Save it again without BOM (for example, in Notepad++ , select Encoding | Encode in UTF-8 and save the file).

Author: Arline Emard IV