ANSI, terminal, encodings

Table of contents:

Encoding in general

ASCII chars 0-127

https://en.wikipedia.org/wiki/ASCII

ASCII chars 0-127 (decimal), 00-7F (hex):

   | 0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F  
---+----------------------------------------------------------------
0x | NUL SOH STX ETX EOT ENQ ACK BEL BS  HT  LF  VT  FF  CR  SO  SI   
1x | DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM  SUB ESC FS  GS  RS  US   
2x | SP  !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /  
3x | 0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?  
4x | @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O  
5x | P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _  
6x | `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o  
7x | p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~   DEL  

Code pages

8-bit code pages can encode chars 0-255 (decimal), 00-FF (hex). In general:

Examples:
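As a minimal sketch of what "the same byte, different code page" means, the snippet below decodes one byte value above 127 with several code pages. Python is used only for illustration; the helper name `decode_with` and the choice of byte `0xE9` are assumptions made for this example, not something from the notes above.

```python
# Decode a single byte with several 8-bit code pages.
# decode_with is a hypothetical helper name used only in this sketch.

def decode_with(raw, codecs):
    """Return {codec name: decoded text} for the given raw bytes."""
    return {codec: raw.decode(codec) for codec in codecs}


raw = bytes([0xE9])  # one byte, outside the 0-127 range
result = decode_with(raw, ("cp1252", "latin-1", "cp437", "cp866"))

for codec, text in result.items():
    # Print the code point too, so the output is readable in any console.
    print(f"{codec:8} -> U+{ord(text):04X} {text!r}")
```

The byte never changes; only the table used to interpret it does. Which of these tables a given Windows console actually uses depends on its active code page.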

OEM vs CP:

Unicode

UTF-8, UTF-16, and UTF-32 all encode the same set of characters (the Unicode code points). They differ only in how they turn each character into bytes.
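A short sketch of that difference, assuming Python's built-in codecs: the same characters are encoded three ways and decode back to the same text, but the byte counts differ. The helper name `encoded_sizes` and the sample characters are chosen only for this illustration.

```python
# Same characters, three Unicode encodings, different byte lengths.
# The "-le" variants are used so that no byte order mark is added.
CODECS = ("utf-8", "utf-16-le", "utf-32-le")


def encoded_sizes(text):
    """Return {codec: number of bytes needed to encode text}."""
    return {codec: len(text.encode(codec)) for codec in CODECS}


sample = "A\u00e9\u20ac"  # 'A', e with acute accent, euro sign

for ch in sample:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))

# Every encoding round-trips to the same string.
round_trips = all(sample.encode(c).decode(c) == sample for c in CODECS)
print("round trip ok:", round_trips)
```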

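ASCII chars (0-127) have the same byte values in UTF-8 and in the common 8-bit code pages, because those encodings are ASCII-compatible. This is not true for UTF-16 and UTF-32, which use 2 and 4 bytes for each ASCII char.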
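One way to check this claim is to encode a pure ASCII string with several encodings and compare the bytes. This is only a sketch using Python's built-in codecs; the helper name `same_as_ascii` and the list of codecs are assumptions picked for the example.

```python
# Compare how an ASCII-only string is encoded by different encodings.
ascii_text = "Hello, cmd!"
reference = ascii_text.encode("ascii")


def same_as_ascii(codec):
    """True if codec produces exactly the ASCII bytes for ascii_text."""
    return ascii_text.encode(codec) == reference


for codec in ("utf-8", "cp1252", "cp437", "cp866", "utf-16-le", "utf-32-le"):
    print(f"{codec:10} same bytes as ASCII: {same_as_ascii(codec)}")
```

The 8-bit code pages and UTF-8 report identical bytes, while UTF-16 and UTF-32 do not, because they pad every character to a wider code unit.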

Encoding on Windows

How encoding works in Windows cmd

A CLI app writes a sequence of bytes (like 11001100-01010101-…) to the Windows cmd stdout/stderr. The author of the CLI app can have any encoding in mind. Unless the app specifically inspects the cmd settings itself, cmd interprets these bytes using the currently active code page.

If [author’s encoding] matches [currently active code page], everything is rendered as expected (of course, only if the current font supports all the chars). If not, you’ll get gibberish output for all chars outside the ASCII (0-127) range.
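The mismatch can be simulated without a real console. In the sketch below, the "app" encodes text with the author's encoding and the "console" decodes the same bytes with its own code page. The function name `render_as` and the sample message are hypothetical; this models the behavior described above and does not call any Windows API.

```python
# Simulate an app writing bytes and a console interpreting them.
# render_as is a hypothetical helper; no real console is involved.

def render_as(text, author_codec, console_codec):
    """Encode text as the app author would, decode as the console would."""
    raw = text.encode(author_codec)
    # errors="replace" keeps the sketch from raising on undecodable bytes.
    return raw.decode(console_codec, errors="replace")


message = "caf\u00e9 ok"  # one non-ASCII char, the rest is ASCII

matched = render_as(message, "utf-8", "utf-8")
mismatched = render_as(message, "utf-8", "cp437")

print("matched   :", ascii(matched))
print("mismatched:", ascii(mismatched))
```

In the mismatched case the ASCII part ("caf" and " ok") survives unchanged, and only the non-ASCII char turns into two unrelated characters, which matches the gibberish described above.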

New line in Windows cmd