ANSI Character Set - NETWORK ENCYCLOPEDIA


What is the ANSI Character Set?

The ANSI character set, also known as a Windows code page, is an 8-bit character set used by Microsoft Windows 95 and Windows 98 that can represent up to 256 characters (numbered 0 through 255).

The ASCII (American Standard Code for Information Interchange) character set is a subset of the ANSI (American National Standards Institute) character set; its characters numbered 32 through 126 each represent a displayable character. Some ANSI character codes cannot be displayed by Windows 95 or Windows 98 applications and are generally shown as solid blocks on the output device.
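The subset relationship can be checked with a short Python sketch. It assumes that "ANSI" means Windows code page 1252 (the Western European code page, exposed in Python as the "cp1252" codec); other code pages would need a different codec name. The helper undefined_codes is a hypothetical name introduced only for this illustration, and the codes it reports are those the codec leaves unassigned, which is a related but not identical notion to "cannot be displayed".

```python
# Assumption: "ANSI" is taken to mean Windows code page 1252 ("cp1252").

# Byte values 32..126 are the displayable ASCII characters.
printable = bytes(range(32, 127))

# Decoding the same bytes as ASCII and as cp1252 gives identical text,
# which is what "ASCII is a subset of ANSI" means in practice.
same_text = printable.decode("ascii") == printable.decode("cp1252")


def undefined_codes(codec="cp1252"):
    """Return the byte values (0..255) that the codec cannot decode."""
    missing = []
    for code in range(256):
        try:
            bytes([code]).decode(codec)
        except UnicodeDecodeError:
            missing.append(code)
    return missing


if __name__ == "__main__":
    print("printable ASCII decodes identically:", same_text)
    print("unassigned cp1252 codes:", [hex(c) for c in undefined_codes()])
```

Running the script lists the handful of byte values that code page 1252 leaves unassigned; how an application renders such codes depends on the application and the font.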

ANSI CHARACTER SET TABLE

ANSI uses a single byte to represent a character, in contrast to the Unicode encoding supported by Windows NT, which uses 2 bytes to represent a character. For example, the ANSI character «A» is represented in hexadecimal notation by the single byte 41h, whereas the Unicode character «A» is represented by the two-byte string {41h, 00h} (low-order byte first). The 256-character limit of ANSI supports only a few international characters, such as accented French and German vowels, while a 2-byte code allows 65,536 distinct values, enough for the characters of most modern alphabets. Later versions of Unicode define more than 65,536 characters; UTF-16 represents the additional ones as a pair of 2-byte units.
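The byte values above can be reproduced with a minimal Python sketch. It assumes code page 1252 ("cp1252") as the ANSI code page and little-endian UTF-16 ("utf-16-le") as the 2-byte Unicode form used by Windows NT; the function name encode_hex is hypothetical.

```python
# Assumptions: ANSI = code page 1252; Unicode on Windows NT = UTF-16, little-endian.

def encode_hex(text, codec):
    """Encode text with the given codec and return the bytes as hex pairs."""
    return " ".join(f"{b:02X}" for b in text.encode(codec))


ansi_a = encode_hex("A", "cp1252")        # one byte per character
unicode_a = encode_hex("A", "utf-16-le")  # two bytes per character, low byte first

# An accented vowel still fits in a single ANSI byte.
ansi_e_acute = encode_hex("é", "cp1252")

# A character outside the 256-entry code page cannot be encoded at all.
try:
    "Ω".encode("cp1252")
    omega_fits = True
except UnicodeEncodeError:
    omega_fits = False

if __name__ == "__main__":
    print("ANSI    'A':", ansi_a)
    print("Unicode 'A':", unicode_a)
    print("ANSI    'é':", ansi_e_acute)
    print("Ω representable in cp1252:", omega_fits)
```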

History of ANSI Character Set (Windows Code Page)

Initially, computer systems and system programming languages did not distinguish between characters and bytes, which later caused much confusion. Microsoft software and systems prior to the Windows NT line are examples: they use the OEM and ANSI code pages, which do not make the distinction.

Since the late 1990s, software and systems have increasingly adopted more direct encodings of Unicode, in particular UTF-8 and UTF-16; this trend has been reinforced by the widespread adoption of XML, which provides a better mechanism for labeling the encoding used. Recent Microsoft products and application program interfaces use Unicode internally, but many applications and APIs continue to use the default encoding of the computer’s locale when reading and writing text data to files or standard output. Therefore, although Unicode is the accepted standard, backward compatibility with the older Windows code pages remains.
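The practical consequence of locale-dependent defaults can be sketched in Python. The example below assumes code page 1252 as the legacy encoding; the helper roundtrip is a hypothetical name, and the temporary file is created only for the demonstration. It shows why stating the encoding explicitly is safer than relying on the locale default.

```python
import locale
import os
import tempfile

# The encoding the runtime would pick for text files when none is given.
# Its value depends on the machine's locale settings.
default_encoding = locale.getpreferredencoding(False)


def roundtrip(text, write_encoding, read_encoding):
    """Write text with one encoding, read it back with another."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w", encoding=write_encoding) as f:
            f.write(text)
        with open(path, "r", encoding=read_encoding) as f:
            return f.read()
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("locale default encoding:", default_encoding)
    # Matching encodings preserve the text.
    print(roundtrip("café", "utf-8", "utf-8"))
    # A UTF-8 file read as code page 1252 comes back garbled: "cafÃ©".
    print(roundtrip("café", "utf-8", "cp1252"))
```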

The euro sign was introduced after many of the ANSI and OEM code pages had been defined, so several code pages were revised to include it.
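As an illustration, the following Python sketch asks a few codecs where (if anywhere) they place the euro sign. The selection of code pages is an assumption made for this example: 1252 and 1251 stand in for revised ANSI code pages, 437 for an OEM code page that was never revised, and 858 for the euro-enabled variant of OEM code page 850. The function name euro_byte is hypothetical.

```python
def euro_byte(codec):
    """Return the byte(s) encoding the euro sign, or None if the codec lacks it."""
    try:
        return "€".encode(codec)
    except UnicodeEncodeError:
        return None


# Illustrative selection of ANSI and OEM code pages (an assumption of this sketch).
code_pages = ["cp1252", "cp1251", "cp437", "cp858"]
euro_positions = {cp: euro_byte(cp) for cp in code_pages}

if __name__ == "__main__":
    for cp, encoded in euro_positions.items():
        shown = encoded.hex().upper() + "h" if encoded else "not available"
        print(f"{cp}: {shown}")
```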

Since Windows 10 version 1803, Windows machines can be configured to use UTF-8 as the “ANSI” and OEM code page.
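Whether a given machine is configured this way can be checked with a small sketch such as the one below. It assumes Python on Windows and calls the Win32 function GetACP through ctypes; UTF-8 is identified by code page number 65001. The function name active_ansi_code_page is hypothetical, and on other platforms the function simply returns None.

```python
import sys

CP_UTF8 = 65001  # Windows code page identifier for UTF-8


def active_ansi_code_page():
    """Return the active ANSI code page number on Windows, else None."""
    if sys.platform != "win32":
        return None  # the notion of an ANSI code page is Windows-specific
    import ctypes
    return ctypes.windll.kernel32.GetACP()


if __name__ == "__main__":
    acp = active_ansi_code_page()
    if acp is None:
        print("Not running on Windows; no ANSI code page to report.")
    elif acp == CP_UTF8:
        print("UTF-8 is the active ANSI code page.")
    else:
        print(f"Active ANSI code page: {acp}")
```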