UTF-8 file is detected as ANSI.

To ensure UTF-8 detection under all circumstances, please use one of the following strategies:

Go to Options / Preferences / Files / General and check "Open ANSI files as UTF-8 without BOM". This is the preferred method.
For most file types, you can simply avoid UTF-8 without BOM and save in the regular UTF-8 format (with BOM) instead. For some server-side script types, such as PHP, however, the BOM can prevent code from executing correctly.
If you absolutely need to use UTF-8 without BOM, but do not want to change your settings for all files, please include the following HTML code to force a file to be opened as UTF-8 without BOM:
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
This HTML code will work even if inserted as a comment or invisible element in a non-HTML file, such as a PHP file.
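
To illustrate why the meta tag works even inside a comment, here is a minimal Python sketch of how a tool might look for a charset declaration in the raw bytes of a file. It is not the editor's actual detection logic, which is not documented here. The function name `declared_charset`, the regular expression, and the `scan_limit` parameter are all assumptions made for this example.

```python
import re

# Hypothetical pattern: matches charset=utf-8, charset="utf-8", charset='utf-8', etc.
CHARSET_RE = re.compile(rb'charset\s*=\s*["\']?\s*([A-Za-z0-9_\-]+)', re.IGNORECASE)


def declared_charset(data: bytes, scan_limit: int = 4096):
    """Return the charset name declared near the start of the file, or None.

    The search runs on raw bytes, so it does not matter whether the
    declaration sits in real HTML markup or inside a comment.
    """
    match = CHARSET_RE.search(data[:scan_limit])
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()


php_source = (
    b"<?php\n"
    b'// <meta http-equiv="content-type" content="text/html; charset=utf-8" />\n'
    b'echo "hi";\n'
)
print(declared_charset(php_source))
print(declared_charset(b"plain text with no declaration"))
```

Because the search is a plain byte scan, the declaration is found regardless of the surrounding syntax, which is consistent with the behavior described above.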

Technical explanation of how UTF-8 is detected

Files encoded in UTF-8 with BOM begin with a BOM (Byte Order Mark), an invisible sequence of bytes that indicates that the file contains UTF-8 text. Files encoded in UTF-8 without BOM do not contain this indicator, so the only way to detect them is to analyze the contents of the file.
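
The BOM check itself is simple. The following Python sketch shows the idea, using the standard `codecs.BOM_UTF8` constant (the three bytes EF BB BF). The helper name `has_utf8_bom` is an assumption made for this example and is not part of the editor.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 BOM (bytes EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


# "utf-8-sig" is Python's name for UTF-8 with a leading BOM.
with_bom = "héllo".encode("utf-8-sig")
without_bom = "héllo".encode("utf-8")

print(has_utf8_bom(with_bom))     # True
print(has_utf8_bom(without_bom))  # False
```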

According to the UTF-8 specification, plain ASCII characters are stored exactly as in ANSI text, while each language-specific (non-ASCII) character is stored as a sequence of two to four special (human-unreadable) bytes, which are displayed as a single readable character when the file is opened in a compatible editor. This means that a UTF-8 file with no language-specific characters actually IS an ANSI file, and there is no way to tell otherwise unless the file contains a BOM or some other valid indication, such as an HTML meta encoding tag.
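
The content analysis described above can be sketched in Python as follows. This is a simplified heuristic under stated assumptions, not the editor's real algorithm: the function name `classify` and its return labels are invented for this example, and Windows-1252 (`cp1252`) stands in for "ANSI".

```python
def classify(data: bytes) -> str:
    """Guess the encoding of raw file contents (simplified heuristic)."""
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8 with BOM"
    try:
        # Pure ASCII is byte-identical in ANSI and UTF-8, so it cannot be told apart.
        data.decode("ascii")
        return "ascii only (ambiguous)"
    except UnicodeDecodeError:
        pass
    try:
        # Non-ASCII bytes that form valid UTF-8 sequences are unlikely to be accidental.
        data.decode("utf-8")
        return "probably utf-8 without BOM"
    except UnicodeDecodeError:
        return "ansi"


print(classify("hello".encode("utf-8")))   # no language-specific characters
print(classify("héllo".encode("utf-8")))   # é is stored as two bytes
print(classify("héllo".encode("cp1252")))  # é is stored as one byte
print(len("é".encode("utf-8")), len("€".encode("utf-8")))  # bytes per character
```

The first call shows the ambiguous case: with no language-specific characters present, the bytes are the same in both encodings, so only a BOM or an explicit declaration can settle the question.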
