Thursday, December 15, 2011

Difference between UTF-8 and UTF-8 without BOM?

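The UTF-8 BOM (byte order mark) is a sequence of three bytes (EF BB BF) at the start of a file that allows the reader to identify the file as a UTF-8 file.
In UTF-16 and UTF-32, the BOM is used to signal the endianness of the encoding, but since UTF-8 is defined as a sequence of single bytes and has no byte-order ambiguity, the BOM is unnecessary there and serves only as a signature.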
According to the Unicode standard, the BOM for UTF-8 files is not recommended:

2.6 Encoding Schemes

Use of a BOM is neither required nor recommended for UTF-8, but may be encountered in contexts where UTF-8 data is converted from other encoding forms that use a BOM or where the BOM is used as a UTF-8 signature. See the "Byte Order Mark" subsection in Section 16.8, Specials, for more information.
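
To make the difference concrete, here is a minimal Python sketch of how the two variants differ at the byte level. It assumes Python 3 and uses the standard "utf-8" and "utf-8-sig" codecs; the helper names has_utf8_bom and strip_utf8_bom are made up for this illustration and are not part of any library.

```python
import codecs

# The UTF-8 BOM as defined in the standard library: b'\xef\xbb\xbf'
BOM = codecs.BOM_UTF8


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string starts with the UTF-8 BOM (hypothetical helper)."""
    return data.startswith(BOM)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM, if one is present (hypothetical helper)."""
    return data[len(BOM):] if has_utf8_bom(data) else data


text = "hello"

# "utf-8-sig" writes the BOM when encoding; plain "utf-8" does not.
with_bom = text.encode("utf-8-sig")
without_bom = text.encode("utf-8")

# The only difference between the two is the three-byte prefix.
prefix_hex = with_bom[:3].hex()
same_payload = strip_utf8_bom(with_bom) == without_bom

# Decoding BOM-prefixed bytes with plain "utf-8" keeps the BOM as U+FEFF,
# while "utf-8-sig" drops it (and also accepts input that has no BOM).
decoded_plain = with_bom.decode("utf-8")
decoded_sig = with_bom.decode("utf-8-sig")
decoded_sig_no_bom = without_bom.decode("utf-8-sig")
```

A reader that does not expect the BOM sees an extra U+FEFF character at the start of the text, which is the usual source of trouble with BOM-prefixed files. Decoding with "utf-8-sig" is one way to accept both variants.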
