UTF-8 specifies that a character encoded in 1 byte/octet starts with a 0 bit. 0x7F is 01111111, so a reader knows it is only 1 byte/octet long. Check the UTF-8 specification for the full details, but essentially you are moving from the end of the 1-byte range (U+007F) to the start of the 2-byte range (U+0080, which is encoded as the two bytes 0xC2 0x80).

UTF-16 represents text as 2-byte/octet code units (characters outside the Basic Multilingual Plane take two of them) and may start with a Byte Order Mark, which essentially tells the reader which octet is the most significant (the endianness).

I don't understand why you want to send your own custom 1-byte words, but what you are really looking for is any SBCS (Single Byte Character Set) that has a character for each of the bytes you specify.

Before the UTF encodings came along, text was typically SBCS, which meant that whatever code page you selected was coded using 8 bits per character. UTF-8 and UTF-16 are MBCS (Multi-Byte Character Sets), which means that when you encode a character, you may get more than a single byte. The problem arose when 256 characters were not enough, and they had to make code pages like IBM273 (IBM EBCDIC Germany) and ISO-8859-1 (ANSI Latin 1 Western European) to interpret what "0x2C" meant.
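If it helps to see this in practice, here is a minimal Python sketch of the points above. The helper names `utf8_length` and `decode_single_byte` are made up for illustration. It assumes Python 3.4 or later, where the codec names `cp273` and `latin-1` correspond to IBM273 and ISO-8859-1.

```python
def utf8_length(code_point: int) -> int:
    """Return how many bytes UTF-8 uses for a single code point."""
    return len(chr(code_point).encode("utf-8"))


def decode_single_byte(value: int, code_page: str) -> str:
    """Interpret one byte under a single-byte code page (hypothetical helper)."""
    return bytes([value]).decode(code_page)


# End of the 1-byte range versus start of the 2-byte range in UTF-8.
print(utf8_length(0x7F))             # 1
print(utf8_length(0x80))             # 2
print("\x80".encode("utf-8").hex())  # c280

# UTF-16: the generic codec writes a Byte Order Mark first, in the
# platform's native byte order; the -le / -be variants fix the order
# explicitly and write no BOM.
print("A".encode("utf-16").hex())
print("A".encode("utf-16-le").hex())  # 4100
print("A".encode("utf-16-be").hex())  # 0041

# The same byte means different things under different code pages.
print(repr(decode_single_byte(0x2C, "latin-1")))  # ','
print(repr(decode_single_byte(0x2C, "cp273")))    # not a comma in EBCDIC
```

If all you need is a lossless round trip for arbitrary single bytes, `latin-1` is one convenient choice, since it maps every byte value 0x00 to 0xFF to the code point with the same number.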