Chapter 3: An Introduction To C

In Chapter 1, we wrote a couple of short C programs and ran them on an Arduino. We concentrated on the results and not on how the results were achieved, but you will find that you have already acquired some fundamental C knowledge. Now all we need to do is build upon those foundations while exploring the C language. First, a short history.

The C programming language was developed by Dennis Ritchie between 1969 and 1973 at Bell Labs. There, C was used to rewrite the Unix operating system, which was originally written in assembler. Nowadays, the best-known Unix-like system is Linux, and Linux probably runs something like 65% of the world’s servers as well as innumerable devices and microcomputers such as the Raspberry Pi and the Arduino Yún. There is a lot of C code about.

C was designed to provide low-level access to memory and to provide code elements that map efficiently to processor instructions with a minimum of additional software support. The result was a programming language ideally suited to being ported to a wide range of hardware platforms, ready and waiting for the microprocessor revolution that ultimately provided us with our Arduinos. The language gained a wide audience following the publication of the book “The C Programming Language” written by Brian Kernighan and Dennis Ritchie. That book’s first chapter started (more or less) with the program “Hello World”.

The C language was standardised by the American National Standards Institute to give us what is known as ANSI C. Many other languages have since based their structure on C and this means that programmers with experience of languages such as Java, JavaScript, C#, Swift, Python and many others generally find themselves quickly at home when programming for the Arduino.

ANSI C has 32 keywords although some later standards have made the odd addition. The grammar of the language is constructed from those keywords with a sprinkling of operators and punctuation – as you have already seen in Chapter 1. Among programmers generally, C has a reputation for being a difficult language to master. This is partly because it is terse and partly because there is no inbuilt frippery designed to ease the task of programming a lot of common functions. This shortfall is generally alleviated by the existence of a wide range of libraries that can be imported into a program. However, it must be said that C will allow you to make mistakes on a grand scale – if you are not careful. Many languages have runtimes that check code statements as they are executed and can trap a wide range of errors while programs are running. The C compiler can only spot errors in your syntax when it compiles your code, so more subtle errors that only show up during execution can lead to some excitingly unexpected results. C programmers are a cautious breed who anticipate errors and strive to eliminate them before they can arise. I think that philosophically, this fits rather well with most Arduino projects and their mix of electronics and code.

Please don’t be put off. C programming is fun and the Arduino is a safe and rewarding playground to work within. Arduinos are the perfect platform for learning C and the ins and outs of programming generally.

This chapter is a long one as it is intended to introduce the entire C language. It will help if you take the time to type the demonstration code into your Arduino IDE and run the programs on your hardware. To quote from K&R’s C book, “The only way to learn a new programming language is by writing programs in it”. Ultimately, you need to know everything in this chapter but the only way to turn that knowledge into a practical skill is to apply each of the language elements in programs (ideally ones you invent for yourself). With that experience, you will find that you can “read” program code just like you might read a book. Learning from the code itself, rather than just from any accompanying explanation, will be a “breakthrough” moment.

We will take the different language components and structures in chunks. C is an imperative language and uses statements to specify actions. A common statement form is an “expression statement” where variables can be assigned new values. So, let’s start with the elements called variables.

Variables

A variable is given a name chosen by the programmer that is then associated with a location in memory used to store a value. As the name implies, the value stored at that memory location can normally be changed (or varied) during program execution.

Before a variable can be used in your code it has to be declared. We have already met a char variable in Chapter 1 but some other types can be declared like this:

int myInt; // myInt can hold an integer value like -32 or 130
float myFloat; // myFloat can hold real numbers like 3.456 or -98.4

C is what is known as a typed language where each variable is assigned a type when it is declared or defined. The code above declares two variables. One is an int type and the other is a float type. Each variable type has a fixed size that decides just how many bytes of memory are needed to store any value given to the variable. All variable types hold binary numbers in one configuration or another. Even alphabetic characters are stored in char variables as numbers.

I am sure you remember each binary digit on a computer is represented by a bit which can be thought of as being set to 0 or 1. The earliest days of computing saw bits being grouped together in a number of configurations but this eventually settled down into groups of 8 bits, called a byte. 8 binary bits, by themselves, can represent an integer value in the range -128 to +127 or a positive only range of 0 to 255.

An int variable on an 8 bit Arduino, such as a Uno, takes up 2 bytes (16 bits) of memory. The value is stored as a binary number using the rightmost 15 bits. The leftmost (16th) bit is used as a sign bit. If all of the non-sign bits of an int variable were set to a 1 then that int variable would be holding a value of 32767 which is the maximum for this type. The binary representation would look like 0111 1111 1111 1111 which is the sum of 2 to the power of 0 plus 2 to the power of 1 plus 2 to the power of 2 and so on until 2 to the power of 14. If your program were to then add 1 to the value stored in that variable then the result would be -32768. That might be just a bit surprising.

Integers do not break when an arithmetic operation takes them beyond their range but roll over (or overflow). In this instance, rolling over also sets the bit that is the negative flag. Negative integer values are stored using a format known as “Two’s Complement”. To store a negative number, every bit of the equivalent positive value is inverted (1 becomes 0 and 0 becomes 1) and then 1 is added to the result, which leaves the highest bit set as the negative flag. It might surprise you to learn that this simplifies arithmetic operations within the processor. The web site supporting this book has a lot more detail on the Two’s Complement format and why it is used. For now, you can skip the detailed explanation and just accept that it works.
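If you would like to see the roll-over and the “invert and add 1” rule in action, the following is a minimal sketch rather than anything definitive. It is written in standard C using the fixed-width types from stdint.h so that it behaves the same on a desktop computer as on a Uno. The function names addWrap16 and negate16 are my own inventions for this illustration, and the conversion back to a signed value assumes a two's complement machine (which every Arduino is).

```c
#include <stdint.h>

/* Add two 16 bit signed values and let the result roll over.
   The sum is formed on unsigned values because unsigned roll-over
   is well defined in C. Converting the result back to int16_t
   assumes a two's complement machine. */
int16_t addWrap16(int16_t a, int16_t b) {
  uint16_t sum = (uint16_t)((uint16_t)a + (uint16_t)b);
  return (int16_t)sum;
}

/* Build the two's complement bit pattern for minus value:
   invert every bit and then add 1. */
uint16_t negate16(uint16_t value) {
  return (uint16_t)(~value + 1u);
}
```

Calling addWrap16(32767, 1) gives -32768, and negate16(5) gives the bit pattern 0xFFFB, which reads as -5 when treated as a signed 16 bit value.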

Unsigned integer variables (those that only hold positive values) can use the highest bit as a normal binary digit as they don’t need a sign bit. This is why a 16 bit unsigned int variable on an ATmega based Arduino can hold a value as high as 65,535.

Variable Types

Below is a list of the variable types with their associated sizes and the range of numbers they can store. You may need to refer back to this from time to time.

Type | Bytes (ATmega boards) | Bytes (Due & Zero) | Range | Comments
boolean or bool | 1 | 1 | 0 or 1 | 1 equates to true and zero to false
byte | 1 | 1 | 0 to 255 | Strictly not a C++ type but synonymous with an unsigned char
char | 1 | 1 | -128 to 127 | Normally used to store an ASCII character using its number code
unsigned char | 1 | 1 | 0 to 255 | The byte type name is preferred on Arduino but an unsigned char is the same thing
double | 4 | 8 | | Acts as a float on 8 bit boards
float | 4 | 4 | -3.4028235E+38 to 3.4028235E+38 | But see the following section on floating point numbers
int | 2 | 4 | -32,768 to 32,767 or -2,147,483,648 to 2,147,483,647 | The Due and Zero boards are 32 bit and support 4 byte integers
unsigned int | 2 | 4 | 0 to 65,535 or 0 to 4,294,967,295 |
long | 4 | 4 | -2,147,483,648 to 2,147,483,647 | For an 8 byte long on a 32 bit board use type int64_t
unsigned long | 4 | 4 | 0 to 4,294,967,295 |
short | 2 | 2 | -32,768 to 32,767 |
string | | | | Implemented as an array of character values (see section below)
String | | | | The String class exposes lots of additional functionality
word | 2 | 4 | 0 to 65,535 or 0 to 4,294,967,295 | This is often defined as the amount of data that a computer can process in one operation. It would normally be the size of a processor’s general registers. Not true for an 8 bit Arduino though
array | | | | A sequence (or list) of a value type referenced by a single variable name
void | | | | See later
pointer* | | | | See later
struct | | | | See later
union | | | | See later


The char variable

One of the nice things about the char variable is that you can treat it like an integer and indeed that is how it was portrayed in the table above. C char variables are most often used to store text characters using the ASCII code for each character. The full list of ASCII codes can be found in appendix 2.

We can use integer arithmetic to manipulate characters held in char variables. If you have an upper case letter then it can be converted to lower case by adding 32 to the variable. If you needed to know where in the alphabet a letter stored in a char was positioned then you could subtract 65 from an upper-case letter character or 97 from a lower-case letter character. This would return an index position of 0 for ‘a’ or ‘A’ and 25 for ‘z’ or ‘Z’. C programmers always count from zero rather than one as, in most instances, that simplifies program code and makes addressing those things called arrays much easier.
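The arithmetic just described can be sketched as a pair of small functions. This is only an illustration in standard C and assumes the ASCII character set; the names toLowerAscii and alphabetIndex are made up for the example.

```c
/* Convert an upper-case ASCII letter to lower case by adding 32.
   Any other character is returned unchanged. */
char toLowerAscii(char c) {
  if (c >= 'A' && c <= 'Z') {
    return (char)(c + 32);
  }
  return c;
}

/* Return the zero based position of a letter in the alphabet
   (0 for 'a' or 'A', 25 for 'z' or 'Z'), or -1 if c is not a letter. */
int alphabetIndex(char c) {
  if (c >= 'A' && c <= 'Z') {
    return c - 65;   /* 65 is the ASCII code for 'A' */
  }
  if (c >= 'a' && c <= 'z') {
    return c - 97;   /* 97 is the ASCII code for 'a' */
  }
  return -1;
}
```

The range checks are my addition. Adding 32 to a character that is not an upper-case letter would produce some unrelated character, so it is safer to test first.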

Float and Double variables

Variables with the type float or double are known as floating point numbers. These can store fractional values with a very large range but sometimes with low precision; a float may hold as few as 6 significant decimal digits. Floating point values are very useful but need to be treated with caution. Floating point arithmetic is relatively slow. Readers over a certain age may recall that early PCs introduced separate and specialised floating-point processors to speed up floating point calculations. Even today, use is often made of any dedicated graphics processor attached to a PC as these are optimised for floating point calculations and are often pressed into service by programs that need just such a resource.

There is also a floating-point precision issue. This affects divisions in particular and is exacerbated by the difficulty of representing some fractional values accurately in binary. Also, very large or small values are inevitably held with lower precision. The following short program helps illustrate this issue.

void setup() {
  Serial.begin(115200);
  Serial.println("Hello floating point");
  float x = 0.3;
  float y = 2.0;
  Serial.println(x / y, 7);
  Serial.println(y / x, 7);
  Serial.println(y * x, 7);
}

void loop() {} // every sketch needs a loop() function, even an empty one

When run, this displays the following on the Serial Monitor:

Hello floating point
0.1500000
6.6666665
0.6000000

The program uses yet another variant of the Serial.println() method. This one works with floating point numbers only and allows you to specify the number of decimal places to be used when converting the value to a text representation to be sent to the serial interface. The default by the way is 2 decimal places if you do not specify a value.

Take a look at the middle calculation result for y / x. If you check on a pocket calculator you will see that the result in decimal is 6.66666666666666 with sixes recurring forever. So, you might expect to see the last digit displayed shown as 6 or rounded up to a 7. Indeed, if I had left the number of decimal places used at the default, the display would have shown 6.67. What we see here though is a loss of precision in the fractional value. Now losing one or two ten millionths is not generally a big issue but it can be if you then go on to compare a floating point arithmetic calculation result against some specific value. You might have to make an allowance so that a near miss can be treated as a match.
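One way of making that allowance is to compare the difference between two values against a small tolerance instead of testing for exact equality. The following is a minimal sketch in standard C. The function name nearlyEqual is hypothetical, and the right size of tolerance is an assumption that depends entirely on the values your program works with.

```c
/* Return 1 (true) when a and b differ by no more than tolerance,
   otherwise 0 (false). */
int nearlyEqual(float a, float b, float tolerance) {
  float difference = a - b;
  if (difference < 0.0f) {
    difference = -difference;   /* take the absolute value */
  }
  return difference <= tolerance;
}
```

A test such as nearlyEqual(y / x, 6.6666667, 0.00001) would then treat the result displayed above as a match, where a direct comparison using == might not.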

Probably the most expensive result of a floating point arithmetical error was the loss of the Ariane 5 rocket and payload on its maiden flight. The crucial software fault was an integer overflow, which occurred when a 64 bit floating point value was converted to a 16 bit signed integer that was too small to hold it. This would not have occurred if more cautious floating point processing had been implemented. The cost in 1996 money was around £238 million ($370 million).

If you tried running the little program above then you will have spotted the lines that both declared a float variable and then went on to assign a value. So we had better deal with that next.

Declaring and Defining Variables

You declare a variable like this:

int myInt;

and the program would reserve memory for this int type variable and associate that part of memory with the chosen name, myInt.

There are some rules for variable names. They can’t be the same as C keywords as that would just create confusion. They must start with a letter or an underscore. Otherwise they can contain any mix of upper or lower case letters, digits and underscores. Variable names are case sensitive so myInt is a different variable to MyInt. There is no set limit on the length of a C variable name in the language specification but it is best to keep things within sensible bounds. The shortest valid variable name is a single letter. It is a good idea to use meaningful variable names and there is a convention that the first letter is lower case and that any subsequent “words” in the name start with an upper case letter.

Typical variable names might be:

x, var1, _VAR, aShortPhrase or thisIsGettingABitLongMaybe

We can set the value of a variable using the assignment operator ( = ) thus:

myInt = 5;

We could have set an initial value for myInt when it was first declared – this is known as defining a variable.

int myInt; // declaring myInt
myInt = 5; // using the assignment operator “=” to assign a value
int myOtherInt = 7; // this is defining a new variable which means declaring it
// and assigning it a value in the same statement

The short code examples used so far have often included fixed values to be assigned to a variable. These are known as constants.

Defining constants (literals)

Constants are fixed values set in your C code statements. These can be integers, floating point values, characters or strings (often then called literals). Here are some examples where constants are used to define a variable:

int m = 5; // the 5 itself is known as an integer constant.
char c = 'a'; // a char constant is delimited by single quotes
char ca[] = "This is a string constant or literal"; // a string constant is delimited
               // by double quotes
float f = 5.67; // The decimal point creates a floating point numeric constant

Integer constants can be a decimal (base 10), octal (base 8), hexadecimal (base 16) or binary (base 2) value. An unsigned constant has a U suffix and a long constant an L suffix. On the Arduino, a binary constant of up to 8 bits can be written with a prefix B. As that list can be a bit overwhelming, a few examples might help.

int x = 5; // constant 5 in decimal notation
byte b = 7; // constant 7 in decimal notation
int y = 0164; // Octal format has a leading zero
int z = 0xFD; // Hexadecimal format has a leading 0x (zero x)
int bb = B1100011; // binary format usually for byte type
unsigned int u = 45U; // Unsigned suffix (u can be lower case)
long lx = 3765L; // Long suffix (L can be lower case)
unsigned long uy = 781296UL; // Format for an unsigned long

A note of caution: the statement int y = 0164; above sets the int variable to a decimal value of 116 using octal notation. Accidentally including a leading zero when you intend to set a decimal value could have unintended consequences. The variable y was not defined as having an initial value of decimal 164.
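The arithmetic behind that warning can be checked with a few constants. This is a small sketch in standard C and the variable names are invented for the illustration. The Arduino B prefix is left out because it is not part of standard C.

```c
/* The same value written in three notations. */
const int decimalValue = 116;   /* plain decimal */
const int octalValue   = 0164;  /* leading zero means octal: 1*64 + 6*8 + 4 = 116 */
const int hexValue     = 0x74;  /* 0x prefix means hexadecimal: 7*16 + 4 = 116 */

/* A stray leading zero changes the value that is stored. */
const int intended = 164;       /* decimal 164 */
const int mistyped = 0164;      /* actually decimal 116 */
```

All of decimalValue, octalValue and hexValue hold the same number, while intended and mistyped differ by 48.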

Floating point constants must usually include a decimal point and may optionally include an exponent. You may not have run into exponents since school maths.

float f = 0.0045; // the leading zero can be omitted
// The compiler will NOT confuse a leading zero here with octal notation.
double d = 5.0; // The decimal point marks this as a floating point constant
float e = 2.67e3; // evaluates to 2.67 times 10 to the power of 3 which equals 2,670
double dd = 89e-4; // evaluates to 89 times 10 to the minus 4 which equals 0.0089

C char constants can be an explicit character or may be defined as a number using decimal, octal or hexadecimal notation.

char a = 'b'; // explicit
char b = 56; // decimal ASCII value (b = '8')
char c = '\xde'; // hexadecimal notation (c = -34) note the single quote marks
char d = '\354'; // octal format (up to three octal digits after the backslash, d = -20)

The range of options for defining constant values can seem confusing but that wide range means that you have options and that usually means you can pick the most convenient option for any given situation.

Escape sequences

Some explicit character constants present difficulties when it comes to typing them in from a keyboard. You could, of course, use the decimal value but it can be hard to remember later what such a number means. C therefore has some defined sequences for some specific characters. These are known as escape sequences. The C “escape” character is the backslash which you will have already noticed being used to signal hexadecimal and octal constants above. Some are not strictly applicable to most Arduino implementations but the table is intended to be complete for the C language.

Escape Sequence | ASCII Value | The character represented
\a | 7 | Bell (Alert), not implemented on the Arduino
\b | 8 | Backspace
\f | 12 | Formfeed
\n | 10 | Newline, commonly terminates a line of text
\r | 13 | Carriage return
\t | 9 | Horizontal tab
\v | 11 | Vertical tab
\\ | 92 | Backslash (as otherwise the backslash itself would have become difficult to specify)
\' | 39 | Single quotation mark
\" | 34 | Double quotation mark
\? | 63 | Question mark (to avoid trigraphs when repeated)
\0 | 0 | Added to make it clear that this is how you can specify a null (or empty) char value
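It is worth remembering that each escape sequence is stored as a single character, however many keys you pressed to type it. The following short sketch in standard C illustrates the point. The names used are my own and are not part of any library.

```c
#include <string.h>

/* Each escape sequence is stored as a single character. */
const char tab = '\t';        /* ASCII 9  */
const char newline = '\n';    /* ASCII 10 */
const char backslash = '\\';  /* ASCII 92 */
const char quote = '\'';      /* ASCII 39 */

/* A string holding: a, tab, b, double quote, newline.
   That is 5 characters, although 8 were typed between the quotes. */
const char sample[] = "a\tb\"\n";

/* Count the characters in sample, not including the terminating null. */
int sampleLength(void) {
  return (int)strlen(sample);
}
```

The sample array itself has 6 elements as the compiler adds a terminating null character after the 5 that were specified.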

Arduino built-in constants

The Arduino programming environment includes two useful constants for truthiness. They are true and false and may be used in expressions to improve readability. They are both written in lower case. The constant false is defined as zero and true is defined as 1. It should be noted that C treats any non-zero value as true.

Other constants (all written in upper-case) are HIGH, LOW, INPUT, INPUT_PULLUP and OUTPUT. These are all detailed later in the book, see chapter 6.

There is also LED_BUILTIN which is a constant defined as the pin number connected to the on-board LED. This is free to vary between Arduino models but is normally pin 13.

String Constants

String constants (literals) are defined within double quotation marks.

Most strings are stored in a simple char array. Arrays are dealt with properly a little further into this chapter but as you were probably wondering about string constants they needed introducing here. Just think of arrays as lists for now. Strings can be defined as follows.

char mText[] = "this is a string";

The code above would create a char array (list of characters) with 17 elements – one for each character in the string and one null “termination” character (char code zero). See the upcoming section on arrays for some alternate ways of defining a char array.

If you need to define a long string constant it can assist program readability if you break it up over more than one line of code.

char nText[] = "this is the start of a long line that continues "
          "on the next line in the Arduino IDE window";

The final semicolon character (;) of the statement marks the end and the intervening whitespace is ignored (but note the two sets of double quotation marks).
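The element count and the joining of adjacent string constants can both be confirmed with sizeof and the standard strlen() function. This is a minimal sketch in standard C. The array named joined and the two helper functions are invented for the illustration.

```c
#include <string.h>

char mText[] = "this is a string";

/* Two adjacent literals are joined by the compiler into one string. */
char joined[] = "split over "
                "two lines";

/* Number of array elements, including the terminating null. */
int mTextElements(void) {
  return (int)sizeof(mText);
}

/* Number of visible characters, excluding the terminating null. */
int mTextLength(void) {
  return (int)strlen(mText);
}
```

Here mTextElements() returns 17 and mTextLength() returns 16. The joined array holds exactly the same characters as if "split over two lines" had been typed as one constant.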

If we want to use the Arduino String class then the terminology (at least) changes a bit. We create an instance of the String class using what is known as a constructor. There is a lot to come later in the book to explain classes and constructors.

String myString = "Hello String Class";

We can also pass other kinds of value to the String class constructor to create String objects. Here are just a couple of examples.

String myString = String('a'); // converts a char to a String
String myOtherString = String(myString + " new string"); // concatenates one
                    // String object with a string literal

When a constant value used in a program statement is assigned to a variable the term “constant” has no further implication. The value stored in that variable may be altered by subsequent program statements unless the const keyword is used.

The const keyword

The C language has a keyword const that acts to qualify a variable definition making it read only. Where the const modifier is used then the variable concerned must be defined. Unlike an unmodified variable, you can’t declare it on one line of code and then give it a value on another. The following is a valid const variable definition.

const float PI_APPROX = 3.14159; // the Arduino core already defines the name PI

There is a convention, common to many programming languages, that numeric constants in particular are given uppercase variable names. This can aid program readability as it is clear that a value is a constant. This usage is found within the Arduino environment when setting digital pins (OUTPUT, INPUT, HIGH, LOW). It is also commonly used with the #define pre-processor directive.

#define

#define is a C coding construct called a macro that allows the programmer to give a name to a constant value or even a function. The C compilation process substitutes the actual value for the constant name in the program code and so the #define statement itself has no memory overhead as no program variable is created. If a program included the following:

#define MOTOR_PIN 3

and subsequent program lines said something like:

digitalWrite(MOTOR_PIN, HIGH); //or
digitalWrite(MOTOR_PIN, LOW);

then the pre-processor would replace every reference to MOTOR_PIN with the value 3 before the final compilation and the subsequent program upload to the Arduino.

The #define statement is different to a regular variable definition. There is no statement terminating semicolon and no assignment operator. If you enter either, it becomes part of the substituted text and the compiler will usually object at the point where the name is used.

Another use for the #define statement is to effectively move numeric constants to a single location in the code. Earlier, when discussing floating point variable types there was a demo code snippet that used the Serial.println() method to output some float values correct to 7 decimal places. I could have used a #define to define a value of 7 and used that in the subsequent Serial.println() code statements. Then, if I had wanted to change the number of decimal places displayed to (say) 5, I would only have had to change one line of the program.

The official Arduino documentation specifically states that defining a const variable is preferred to using the #define pre-processor directive. C++ programmers (and we must remember that the Arduino compiler is a C++ compiler) would generally agree with that sentiment. There are some potential traps for the unwary when using #define to apply a function (this book has some examples later) but I believe that the #define approach for selected constants is effective and useful for microprocessor development.
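To give a flavour of one such trap, here is a short sketch in standard C. The two macros are hypothetical and exist only for this illustration. Because the pre-processor substitutes text rather than values, an argument such as 1 + 2 is pasted in as it stands and the usual rules of operator precedence then apply to the result.

```c
/* Hypothetical macros for illustration only. */
#define SQUARE_UNSAFE(x) x * x          /* text substitution, no brackets */
#define SQUARE_SAFE(x) ((x) * (x))      /* brackets keep the argument together */

/* 1 + 2 is pasted in as text, giving 1 + 2 * 1 + 2, which is 5. */
int unsafeResult(void) {
  return SQUARE_UNSAFE(1 + 2);
}

/* Expands to ((1 + 2) * (1 + 2)), which is 9. */
int safeResult(void) {
  return SQUARE_SAFE(1 + 2);
}
```

Wrapping both the argument and the whole expression in brackets is the usual defence, although it does not protect against every misuse of a function-like macro.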

enum

The last kind of constant is the enumeration constant which takes the form of a list of integer values. Unless specified, the compiler assigns the value 0 to the first item in the list, 1 to the next and so on. An enum statement looks like this:

enum weekdays {sun, mon, tue, wed, thu, fri, sat};

where sat would end up with the value 6. If a value is set for at least one element in the set then unspecified values continue to increment from the last specified one.

enum months {jan = 1, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec};

feb would have the value 2.

The values assigned to an enum set are fixed at compile time and are substituted into the code statements that use them, in much the same way as constants created using #define.

The enum set values can be used anywhere in code like any other constant.

int day = wed; // assigns enum value of wed (3) to int variable day
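The numbering rules can be checked with a few lines of standard C. This is only a sketch and daysUntilSaturday is a made-up helper that shows enumeration constants taking part in ordinary integer arithmetic.

```c
enum weekdays {sun, mon, tue, wed, thu, fri, sat};
enum months {jan = 1, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec};

/* Enumeration constants can be used wherever an int is expected.
   Returns the number of days from today until Saturday. */
int daysUntilSaturday(int today) {
  return sat - today;
}
```

With these definitions sun is 0, sat is 6 and dec is 12, so daysUntilSaturday(wed) returns 3.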

Enumeration elements do not have to be assigned unique values but element names must always be unique within a program or library. The enum names themselves (weekdays and months used above) are optional and usually play no part in how an enum is used. Later in the book you will see that an enum name can be used as a pseudo type name when declaring a struct where a “slot” needs to be reserved for an enum value within the struct (see later) but even this is not a required usage. You might also use an enum name when defining a function argument that is intended to hold an enum value but we have not got to functions yet.

Later in the chapter on sort algorithms you will see an example of an enum name being used as an argument type but most of that is yet to come.

enum Sort_Order {ASC, DESC};
void shellSort(int array[], int arrayLength, Sort_Order so) {
  // function code ...
}

Void Type

Void could be taken to mean “no value” or perhaps more properly an absence of a value.

Many functions return a specified data type. Those where no value is returned are defined as having the void type. A function that takes no arguments can be said to have an argument set of void.

There are also void pointers but we have got ahead of ourselves. That is existential enough for this early stage and we will deal with them later.

Arrays

An array is a set or list of variables of a single type that is referenced by a single variable name and with individual elements of the set being addressed by an index number.

int myInts[7];

declares an array intended to contain 7 int values with index numbers ranging from 0 to 6. C array indexes are said to be zero based. The following statement

int myOtherInts[] = {2, 4, 6, 8, 10, 12, 14};

would define another 7 element array of int values. This time the compiler counts the number of values supplied as constants to create the correct size of array in memory and also assigns the relevant values to each array element.

When a char array is defined from a string constant then the compiler will add a null termination char. When a char array is defined from a list of char constants no terminator is added, so you must supply one yourself if the array is going to be used as a string.

char myString[] = "program"; // creates an 8 element char array
char myString2[] = {'p', 'r', 'o', 'g', 'r', 'a', 'm'}; // a 7 element array
                                                 // with no null terminator
char myString3[8] = {'p', 'r', 'o', 'g', 'r', 'a', 'm', '\0'}; 
                  // an 8 element array, with the null terminator explicit
char myString4[12] = "program"; // creates a 12 element char array and
                          // populates the first 8 elements (the rest are set to zero)
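If you are ever unsure how many elements the compiler has created, the sizeof operator will tell you. The following is a minimal sketch in standard C. The helper function nullsInMyString4 is invented for the illustration and simply counts the elements that hold a null character.

```c
char myString[] = "program";                              /* 8 elements */
char myString2[] = {'p', 'r', 'o', 'g', 'r', 'a', 'm'};   /* 7 elements, no null */
char myString4[12] = "program";                           /* 12 elements */

/* Count how many elements of myString4 hold the null character. */
int nullsInMyString4(void) {
  int count = 0;
  for (unsigned int i = 0; i < sizeof(myString4); i++) {
    if (myString4[i] == '\0') {
      count++;
    }
  }
  return count;
}
```

For these char arrays sizeof gives 8, 7 and 12 respectively, and nullsInMyString4() returns 5 because every element after the 7 letters is set to zero.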

You can assign a value to an array element using an index value.

myOtherInts[3] = 9;

and read values from an array in the same way.

x = myOtherInts[5];

Some programming languages do what is called “bounds checking” to make sure that you do not use an invalid array index but C does not. Thus, I could also write

int myInts[] = {2, 4, 6, 8, 10, 12, 14};
x = myInts[8];

without triggering an error even though the myInts array has only 7 elements and I am trying to read the 9th. Something will be assigned to x. Whatever ends up in x will probably be the bits found one int-sized slot beyond the end of the memory allocated to the myInts array (that is 2 bytes beyond the end on an 8 bit board or 4 bytes on a 32 bit board). That could be another variable or just some junk value. C programmers have to check their own bounds.
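Checking your own bounds need not be complicated. One possible approach, sketched here in standard C, is to route array reads through a small function that refuses any index outside the array. The names readChecked and MY_INTS_LENGTH are hypothetical, and returning a fallback value is just one of several reasonable ways to respond to a bad index.

```c
#include <stddef.h>

int myInts[] = {2, 4, 6, 8, 10, 12, 14};

/* The number of elements, worked out by the compiler. */
#define MY_INTS_LENGTH (sizeof(myInts) / sizeof(myInts[0]))

/* Read arr[index] only when index is inside the array.
   length is the number of elements in arr; fallback is the value
   returned when the index is out of bounds. */
int readChecked(const int *arr, size_t length, size_t index, int fallback) {
  if (index >= length) {
    return fallback;
  }
  return arr[index];
}
```

With this in place readChecked(myInts, MY_INTS_LENGTH, 5, -1) returns 12 while readChecked(myInts, MY_INTS_LENGTH, 8, -1) returns the fallback value of -1 instead of junk.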

Try this fun test:

void setup() {
  Serial.begin(115200);
  char pad[] = "just some text";
  char test[] = "program";
  Serial.println(test);
  test[-2] = 'z';
  Serial.println(pad);
}

void loop() {} // every sketch needs a loop() function, even an empty one

The Serial Monitor output in my test run was:

program
just some texz

The ‘z’ char assigned to test[-2] overwrote the last non null char in the char array named pad[]. This was because the memory assigned to the array named test[] immediately followed the memory used by the array named pad[].

OK now can you work out what happened when I changed the code to set test[-1] to z instead?

Top marks if you realised that the second line of output was “just some textzprogram”. This was because this time the terminating null character in pad[] was overwritten so the Serial.println() method did not know where it finished until it got to the end of test[].

There is not just a danger of reading junk; you might also inadvertently overwrite other data variables. While it is unlikely that you would address an array like test[] using a constant index that was deliberately out of bounds, you might arrive at an invalid index via a faulty calculation.

There is another array “gotcha” that you should be aware of although this one is particular to char arrays. The following code snippet illustrates it.

void setup() {
  Serial.begin(115200);
  char test[12] = "program";
  Serial.println(test);
  test[9] = 'x';
  test[10] = 'y';
  test[11] = '\0';
  Serial.println(test);
}

void loop() {} // every sketch needs a loop() function, even an empty one

The Serial Monitor output is:

program
program

Can you spot why?

When the array test was defined it was created with 12 elements. However, when the string “program” was used to initialize the array values, element 7 was set to a null char (as were the unused elements after it). Later, elements 9, 10 and 11 were set to values but the Serial.println() method was still presented with a string terminator at element 7 and the elements beyond it were ignored even though some of them were not null.
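The same effect can be measured with the standard strlen() function, which counts characters up to the first null just as Serial.println() does when printing. This is a minimal sketch in standard C and both function names are invented for the illustration.

```c
#include <string.h>

/* Repeat the experiment and report how long the string appears
   to be afterwards. The null stored at element 7 still marks the
   end, whatever is written after it. */
int lengthAfterWrites(void) {
  char test[12] = "program";
  test[9] = 'x';
  test[10] = 'y';
  test[11] = '\0';
  return (int)strlen(test);
}

/* Overwriting elements 7 and 8 as well removes the early
   terminator, so the whole array becomes one 11 character string. */
int lengthAfterFillingGap(void) {
  char test[12] = "program";
  test[7] = '-';
  test[8] = '-';
  test[9] = 'x';
  test[10] = 'y';
  test[11] = '\0';
  return (int)strlen(test);
}
```

The first function returns 7 and the second returns 11. If you extend a string held in a char array, every element between the old end and the new end has to be filled.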

Variable Scope

Variables can be declared so that they are accessible from all parts of a program. Such variables are considered to have global scope. Alternatively, some variables might only be accessible within a single function or even just a part of the code (a code block) within a function. The latter are described as having a local scope.

Global variables are declared or defined outside of any function. Typically this is before the setup() function in a program. Global variables are assigned a memory location and exist there for the duration of a program run.

Local variables are declared within a function or code block. They only exist in memory for the time that the function is active unless the static variable modifier is used. Normally declared local variables are assigned to a memory location and (when defined) initialised each time the function or code block within a function is executed. Static variables continue to exist and retain any value for the duration of a program run. Static variables are stored in memory alongside global variables but can only be accessed from the function or code block where they are declared.

Here is a very contrived code example with some variable definitions.

int mInts[] = {1, 2, 3, 4, 5, 6}; // has global scope
void setup() {
  int nInts[] = {9, 7, 6, 5, 3}; // has scope confined to setup()
  for(int i=0; i< 5; i++) 
  {
    // the int variable i only has scope within this code block
    if(nInts[i] <= 5) 
    {
      int nVal = nInts[i] * 2;
      // nVal only has scope within this inner code block
      nInts[i] = nVal;
    }
  }
}

The int array mInts[] is clearly outside the setup() function (and the loop() function) and therefore has global scope.

The nInts[] int array has local scope anywhere within the setup() function but cannot be accessed from any other function. The variable named i has a local scope confined to the for statement and the code block between the { curly brace that follows it and the matching closing } curly brace just before the function end. The variable nVal has a local scope but only within the confines of the curly brace pair following the if statement.

We will explore the for() and if() statements shortly.

The static variable modifier

Static variables have already been mentioned but not fully explained. Static variables have local scope within a function or code block. Unlike regular variables, they are created and initialised only once, rather than each time the code block is executed, and they are stored in the global variable space. Static variables therefore retain their value between function calls.

The declaration format is:

static char lastChar;
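A call counter is probably the simplest demonstration of a static variable at work. The following sketch in standard C compares two otherwise identical functions. Both function names are made up for this illustration.

```c
/* Count how many times the function has been called.
   callCount is created once and keeps its value between calls. */
int countCalls(void) {
  static int callCount = 0;
  callCount++;
  return callCount;
}

/* The same code with an ordinary local variable for comparison.
   callCount is set to 0 again on every call. */
int countCallsNotStatic(void) {
  int callCount = 0;
  callCount++;
  return callCount;
}
```

Three successive calls to countCalls() return 1, 2 and then 3, while countCallsNotStatic() returns 1 every time.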

The volatile variable modifier

The volatile modifier is discussed and used in the sections of this book that deal with interrupts. This modifier acts as an instruction to the compiler. It is used when the compiler needs to know that a given variable may be accessed from functions responding to a hardware interrupt and from other areas of code. The compiler will then read the variable value from SRAM memory every time it is used rather than relying on a copy held in a register. Even with this provision, volatile variables may not always return the “current” value and this is demonstrated with a short program in the section on interrupts.

The declaration format is:

volatile int myInt;

The register variable modifier

The register modifier mostly exists for historical reasons. In early C versions, this modifier was an instruction to the compiler that indicated that the variable should be stored in a register for efficiency reasons. The optimising C++ compiler used by the Arduino IDE is considered to be better able to make these decisions than you or me. You can use this modifier, but it is quite likely to be politely ignored by a compiler that will treat it as a suggestion only.