ASCII
ASCII (the American Standard Code for Information Interchange) is a character encoding standard first developed for teleprinters and then adapted for use with computers. Newer encoding standards have been developed using ASCII as a foundation. The Arduino family supports ASCII character encoding through the char variable type. Standard ASCII codes run from 0 to 127, so every one of them fits in a single-byte signed integer.
The ASCII code set starts with a sequence of 32 codes known as the control characters. You are unlikely to use many of these with an Arduino although NUL and LF are going to be used frequently. The “printable” characters that run from (decimal) 32 to 126 are the key values. The final character (DEL) might again be thought of as a control character.
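The split between control and printable codes described above can be expressed as two small range checks. This is only a sketch: the function names isAsciiControl and isAsciiPrintable are made up for illustration and are not part of the Arduino core.

```cpp
// Hypothetical helpers (illustrative names, not Arduino core functions).

// Control characters: codes 0 to 31, plus DEL at 127.
constexpr bool isAsciiControl(int code) {
  return (code >= 0 && code <= 31) || code == 127;
}

// Printable characters: codes 32 (space) to 126 (tilde).
constexpr bool isAsciiPrintable(int code) {
  return code >= 32 && code <= 126;
}
```

In a sketch you might guard output with something like if (isAsciiPrintable(c)) Serial.print(c); the Arduino core also provides its own character tests, such as isControl() and isPrintable(), which you may prefer in real code.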
Dec | Hex | char | Description |
0 | 00 | NUL | null |
1 | 01 | SOH | start of header |
2 | 02 | STX | start of text |
3 | 03 | ETX | end of text |
4 | 04 | EOT | end of transmission |
5 | 05 | ENQ | enquiry |
6 | 06 | ACK | acknowledge |
7 | 07 | BEL | bell |
8 | 08 | BS | backspace |
9 | 09 | HT | horizontal tab |
10 | 0A | LF | line feed |
11 | 0B | VT | vertical tab |
12 | 0C | FF | form feed |
13 | 0D | CR | carriage return |
14 | 0E | SO | shift out |
15 | 0F | SI | shift in |
16 | 10 | DLE | data link escape |
17 | 11 | DC1 | device control 1 |
18 | 12 | DC2 | device control 2 |
19 | 13 | DC3 | device control 3 |
20 | 14 | DC4 | device control 4 |
21 | 15 | NAK | negative acknowledge |
22 | 16 | SYN | synchronise |
23 | 17 | ETB | end of block |
24 | 18 | CAN | cancel |
25 | 19 | EM | end of medium |
26 | 1A | SUB | substitute |
27 | 1B | ESC | escape |
28 | 1C | FS | file separator |
29 | 1D | GS | group separator |
30 | 1E | RS | record separator |
31 | 1F | US | unit separator |
32 | 20 | SP | space |
33 | 21 | ! | exclamation mark |
34 | 22 | " | double quotes |
35 | 23 | # | hash |
36 | 24 | $ | dollar symbol |
37 | 25 | % | percent |
38 | 26 | & | ampersand |
39 | 27 | ' | single quote |
40 | 28 | ( | left bracket |
41 | 29 | ) | right bracket |
42 | 2A | * | asterisk |
43 | 2B | + | plus sign |
44 | 2C | , | comma |
45 | 2D | - | minus or hyphen |
46 | 2E | . | full stop |
47 | 2F | / | forward slash |
48 | 30 | 0 | zero digit |
49 | 31 | 1 | |
50 | 32 | 2 | |
51 | 33 | 3 | |
52 | 34 | 4 | |
53 | 35 | 5 | |
54 | 36 | 6 | |
55 | 37 | 7 | |
56 | 38 | 8 | |
57 | 39 | 9 | |
58 | 3A | : | colon |
59 | 3B | ; | semi-colon |
60 | 3C | < | less than |
61 | 3D | = | equals sign |
62 | 3E | > | greater than |
63 | 3F | ? | question mark |
64 | 40 | @ | at sign |
65 | 41 | A | uppercase A |
66 | 42 | B | |
67 | 43 | C | |
68 | 44 | D | |
69 | 45 | E | |
70 | 46 | F | |
71 | 47 | G | |
72 | 48 | H | |
73 | 49 | I | |
74 | 4A | J | |
75 | 4B | K | |
76 | 4C | L | |
77 | 4D | M | |
78 | 4E | N | |
79 | 4F | O | |
80 | 50 | P | |
81 | 51 | Q | |
82 | 52 | R | |
83 | 53 | S | |
84 | 54 | T | |
85 | 55 | U | |
86 | 56 | V | |
87 | 57 | W | |
88 | 58 | X | |
89 | 59 | Y | |
90 | 5A | Z | |
91 | 5B | [ | left square bracket |
92 | 5C | \ | backslash |
93 | 5D | ] | right square bracket |
94 | 5E | ^ | circumflex |
95 | 5F | _ | underscore |
96 | 60 | ` | grave accent |
97 | 61 | a | lower case a |
98 | 62 | b | |
99 | 63 | c | |
100 | 64 | d | |
101 | 65 | e | |
102 | 66 | f | |
103 | 67 | g | |
104 | 68 | h | |
105 | 69 | i | |
106 | 6A | j | |
107 | 6B | k | |
108 | 6C | l | |
109 | 6D | m | |
110 | 6E | n | |
111 | 6F | o | |
112 | 70 | p | |
113 | 71 | q | |
114 | 72 | r | |
115 | 73 | s | |
116 | 74 | t | |
117 | 75 | u | |
118 | 76 | v | |
119 | 77 | w | |
120 | 78 | x | |
121 | 79 | y | |
122 | 7A | z | |
123 | 7B | { | left curly brace |
124 | 7C | | | vertical bar |
125 | 7D | } | right curly brace |
126 | 7E | ~ | tilde |
127 | 7F | DEL | delete |
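The layout of the table makes some common conversions simple arithmetic: the digits sit in one run starting at 48, and each lower case letter is exactly 32 (hex 20) above its upper case partner. The helpers below are a minimal sketch of that idea; their names are invented for this example and they only handle the plain ASCII letters and digits.

```cpp
// Hypothetical helpers (illustrative names only).

// '0'..'9' occupy codes 48..57, so subtracting '0' gives the digit's value.
// The caller is assumed to pass a digit character.
constexpr int digitValue(char c) {
  return c - '0';
}

// 'a'..'z' are 32 above 'A'..'Z'; anything else is returned unchanged.
constexpr char toUpperAscii(char c) {
  return (c >= 'a' && c <= 'z') ? static_cast<char>(c - 32) : c;
}

// The reverse shift, again leaving non-letters alone.
constexpr char toLowerAscii(char c) {
  return (c >= 'A' && c <= 'Z') ? static_cast<char>(c + 32) : c;
}
```

For example, digitValue('7') gives 7 and toUpperAscii('q') gives 'Q', while a symbol such as '#' passes through toUpperAscii untouched.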
Great, you might think, but where (say) is the UK pound sign or the Euro symbol? Some accented characters might be nice as well. At one time, computers sold in the UK used the ASCII code for the # symbol for the £ sign, but that hardly solved the problem. Even that temporary solution got more complicated when symbols like # began to be used within popular programming languages (like C or even BASIC dialects).
You will certainly recall that the char data type is a signed integer so there are 128 additional potential values in a single byte of storage. What happens if we try sending those values to the Serial Monitor?
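void setup() {
  Serial.begin(115200);
  // Walk through the upper half of the byte range (128 to 255).
  for (int b = 128; b < 256; b++) {
    Serial.print((char)b);       // the character itself
    Serial.print(" ");
    Serial.print((char)b, DEC);  // the same byte as a signed char value
    Serial.print(" ");
    Serial.println(b, DEC);      // the unsigned code
  }
}

void loop() {
  // Nothing to repeat; all the output is produced once in setup().
}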
We find the “extended” (8 bit) ASCII codes hiding away in the negative char values. So, the £ sign is ASCII 163 or (char)-93 and the Euro currency sign is at 128 or (char)-128.
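The relationship between an extended code and its negative char value is just a shift of 256. The two functions below are a sketch of that conversion with invented names; the cast through unsigned char is used so the result does not depend on whether a particular board treats char as signed.

```cpp
// Hypothetical helpers (illustrative names only).

// A byte code in the range 0..255 viewed as a signed value (-128..127).
constexpr int signedCode(int code) {
  return code < 128 ? code : code - 256;
}

// A char viewed as its unsigned code (0..255).
constexpr int unsignedCode(char c) {
  return static_cast<unsigned char>(c);
}
```

With these, signedCode(163) gives -93 and signedCode(128) gives -128, matching the values quoted for the £ and Euro signs.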
For a final exercise, why not adapt the program above that displays the negative char characters to display a three column table of those values that you could insert into this appendix? In my experience when mentoring new programmers, the sort of loops required can prove tricky when first encountered and this adds to the value of the task. One solution can be found on the next page.
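const char line[] = "+---+------+-----+";
const char gap[] = " ";

void setup() {
  Serial.begin(115200);
  // Top border for each of the three columns.
  Serial.print(line);
  Serial.print(gap);
  Serial.print(line);
  Serial.print(gap);
  Serial.println(line);
  // 43 rows; each row shows codes b, b + 43 and b + 86.
  for (int b = 128; b < 171; b++) {
    for (int i = 0; i < 87; i += 43) {
      if ((b + i) < 256) {
        showValue(b + i);
        Serial.print(gap);
      }
    }
    Serial.println();
  }
}

void loop() {
  // Nothing to repeat; the table is printed once in setup().
}

// Print one entry: the character, its signed char value and its unsigned code.
void showValue(int b) {
  Serial.print("| ");
  Serial.print((char)b);
  // Pad the shorter signed values so the column stays aligned
  // (padding widths assumed from the value ranges).
  switch (b) {
    case 128 ... 156:  // -128 to -100: four characters wide
      Serial.print(" | ");
      break;
    case 157 ... 246:  // -99 to -10: three characters wide
      Serial.print(" |  ");
      break;
    default:           // -9 to -1: two characters wide
      Serial.print(" |   ");
      break;
  }
  Serial.print((char)b, DEC);
  Serial.print(" | ");
  Serial.print(b, DEC);
  Serial.print("|");
}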
Your output (like mine) may not look terribly well aligned in the Serial Monitor window, but once it is copied and pasted into another program that uses a monospaced font (such as Notepad on Windows), the layout should square up nicely, ready for printing.