Consider a program where part of the user interface is a system of drop-down and pop-up menus in which the user makes choices by pressing a key or combination of keys, similar to the Alt-F-X combination in the DOS Help, Edit, and QBasic programs. Good design dictates that we not force the user to use only upper case or only lower case characters; however, for efficiency of code and speed of operation, we don't want to make 2 tests every time a key is entered. The solution is to convert the user's key to upper case immediately; then, wherever the program needs to make a branching decision based on the key, it needs only one test.
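The "convert once, test once" idea can be sketched in C roughly as follows. This is only an illustration: the function name menu_action, the enum, and the F/E/X key assignments are hypothetical and are not part of the GETKEY programs.

```c
#include <ctype.h>   /* toupper() */

/* Hypothetical menu actions, invented for this illustration. */
enum action { ACT_NONE, ACT_FILE, ACT_EDIT, ACT_EXIT };

/* Convert the key to upper case once, then make a single test per
 * menu choice instead of testing both 'f' and 'F', 'e' and 'E', etc.
 */
enum action menu_action( int key )
{
    key = toupper( (unsigned char) key );   /* normalize exactly once */

    switch ( key ) {
    case 'F': return ACT_FILE;
    case 'E': return ACT_EDIT;
    case 'X': return ACT_EXIT;
    default:  return ACT_NONE;              /* not a menu key */
    }
}
```

With this arrangement a caller can pass either 'x' or 'X' and get ACT_EXIT back, and the switch needs only one case label per choice.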
Here is about the shortest and simplest program I can think of that incorporates most of the principal elements of a program: executable statements, control statements (conditional and iterative), data manipulation statements, and input and output statements. It reads a character from the keyboard, converts it to upper case (if it was entered as lower case), and displays it, looping until the 'q' or 'Q' character is entered. The line that displays the character is a testing tool just to show that the input and logic work up to this point; it would be replaced in a "real" program with code that makes decisions based on the user's key.
The machine language version of the program is 30 bytes long; it
consists of 15 2-byte pairs of opcode and operand. The Intel instruction
set is not always so symmetrical; opcodes can be 2 bytes, and operands
are more typically 2 bytes, but can be 4 or more bytes long. Recall that
4 binary digits can be represented by 1 hexadecimal digit, so each
4-digit hex number represents 2 bytes or 16 bits. For example, the first
opcode-operand pair is B400H; in binary it would be 1011 0100 0000 0000.
The machine instructions, grouped as opcode and operand and represented
as hexadecimal numbers, are:
B400 CD16 3C61 7206 3C7A 7702 24DF 88C2
B402 CD21 3C51 75E8 B000 B44C CD21
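The hex-to-binary correspondence for any of the pairs above can be checked with a small C sketch like the one below. The function name word_to_binary is an assumption made for this illustration; it simply prints one binary group of four digits per hexadecimal digit.

```c
/* Write the 16 bits of `word` into `out` as four groups of four
 * binary digits separated by spaces, one group per hex digit.
 * `out` must hold 20 chars: 16 digits + 3 spaces + terminating NUL.
 */
void word_to_binary( unsigned int word, char out[20] )
{
    int bit;
    int pos = 0;

    for ( bit = 15; bit >= 0; bit-- ) {
        out[pos++] = ( ( word >> bit ) & 1u ) ? '1' : '0';
        if ( bit % 4 == 0 && bit != 0 )
            out[pos++] = ' ';        /* gap between hex digits */
    }
    out[pos] = '\0';
}
```

Calling word_to_binary( 0xB400, buf ) fills buf with the string 1011 0100 0000 0000, where B = 1011, 4 = 0100, and each 0 = 0000.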
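In order to get the binary codes entered and saved to disk as an executable program we need to enter them in a "hex editor." I will use the DOS utility DEBUG. While DEBUG is about as intuitive and user friendly as a chain saw, it is also available on every DOS and 32-bit Windows operating system. The process is this: start DEBUG, enter the bytes starting at offset 100H with the E (Enter) command, name the output file with the N (Name) command, set the CX register to the program length (1EH, or 30 bytes) with R CX, write the file to disk with W (Write), and leave DEBUG with Q (Quit).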
It is also possible to write a "script" of the DEBUG commands to a text
file and redirect input to DEBUG from the script file. The DOS command
to feed the script file GETKEY1.SCR shown below to DEBUG is:
DEBUG < GETKEY1.SCR
E 100 B4 00 CD 16 3C 61 72 06
E 108 3C 7A 77 02 24 DF 88 C2
E 110 B4 02 CD 21 3C 51 75 E8
E 118 B0 00 B4 4C CD 21
N GETKEY1.COM
R CX
1E
W
Q
Binary HEX codes entered in DEBUG and saved to file GETKEY1.COM
Binary HEX codes redirected to DEBUG from script file GETKEY1.SCR
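For readers without access to DEBUG, the same 30 bytes can be written to a file from a short C program. This is a substitute technique, not the method used in this article; the function name write_getkey1 is hypothetical, and the byte values are copied from the GETKEY1.SCR script above.

```c
#include <stdio.h>

/* The 30 (1EH) machine code bytes of GETKEY1, copied from the script. */
static const unsigned char getkey1[] = {
    0xB4, 0x00, 0xCD, 0x16, 0x3C, 0x61, 0x72, 0x06,
    0x3C, 0x7A, 0x77, 0x02, 0x24, 0xDF, 0x88, 0xC2,
    0xB4, 0x02, 0xCD, 0x21, 0x3C, 0x51, 0x75, 0xE8,
    0xB0, 0x00, 0xB4, 0x4C, 0xCD, 0x21
};

/* Write the bytes to `path` in binary mode.
 * Returns the number of bytes written, or -1 on error.
 */
long write_getkey1( const char *path )
{
    FILE *fp = fopen( path, "wb" );   /* "b": no newline translation */
    size_t n;

    if ( fp == NULL )
        return -1;
    n = fwrite( getkey1, 1, sizeof getkey1, fp );
    if ( fclose( fp ) != 0 )
        return -1;
    return (long) n;
}
```

A call such as write_getkey1( "GETKEY1.COM" ) should return 30, matching the 1E length given to DEBUG with R CX. The resulting file is still a DOS program and needs DOS (or a DOS emulator) to run.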
The next step in the evolution of computer languages after machine language was assembly language, in which the machine opcodes are represented as somewhat more "English-like" words called mnemonics. A program called an assembler translates the mnemonic instructions into binary machine code. Shown below is the assembly language version of the above machine language program. While the instructions could be entered directly into DEBUG, shown here is the script file GETKEY2.SCR that will be redirected to DEBUG.
A 100
MOV AH,00 ; BIOS service 00H Read Keyboard Character
INT 16 ; call BIOS
CMP AL,61 ; is char less than 'a'?
JB 010E ; yes, display the char
CMP AL,7A ; is char greater than 'z'?
JA 010E ; yes, display the char
AND AL,DF ; char is a-z, clear bit 5 to convert to upper case
; display the char; in a "real" program the char
; would be used to, say, check for a menu choice
MOV DL,AL ; DL = char to display
MOV AH,02 ; DOS service 02H Display output
INT 21 ; call DOS
CMP AL,51 ; was char 'Q' (Quit)?
JNE 0100 ; no, loop and get another character
MOV AL,00 ; yes, AL = return code
MOV AH,4C ; AH = DOS service 4CH Terminate with return code
INT 21 ; call DOS
; leave blank line to end DEBUG Assembly mode

N getkey2.com
R CX
1E
W
Q

Assembly language instructions redirected to DEBUG from script file
GETKEY2.SCR
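The two range tests and the AND AL,DF instruction in the listing above depend on the ASCII layout, where each lower case letter differs from its upper case partner only in bit 5 (value 20H). A minimal C sketch of the same logic, assuming ASCII input and using the hypothetical name to_upper_ascii, might look like this:

```c
/* Mirror of the assembly logic: CMP/JB, CMP/JA, then AND AL,DF.
 * Assumes the character set is ASCII.
 */
int to_upper_ascii( int ch )
{
    if ( ch < 0x61 )        /* less than 'a': pass through (JB)    */
        return ch;
    if ( ch > 0x7A )        /* greater than 'z': pass through (JA) */
        return ch;
    return ch & 0xDF;       /* clear bit 5: 'a' (61H) -> 'A' (41H) */
}
```

The range tests matter because clearing bit 5 unconditionally would also alter characters outside a-z that happen to have that bit set.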
It is instructive to "disassemble" the program in DEBUG because we can see the binary opcodes and operands side by side with the assembly instructions. Note that each assembly language mnemonic instruction represents one machine instruction. The far left column holds the memory address "offset" of each instruction in the program, the next column holds the machine language codes, and the columns on the right hold the assembly language instructions. (DEBUG displays the JNE we entered as JNZ; both mnemonics assemble to the same opcode, 75H.)
0100 B400 MOV AH,00
0102 CD16 INT 16
0104 3C61 CMP AL,61
0106 7206 JB 010E
0108 3C7A CMP AL,7A
010A 7702 JA 010E
010C 24DF AND AL,DF
010E 88C2 MOV DL,AL
0110 B402 MOV AH,02
0112 CD21 INT 21
0114 3C51 CMP AL,51
0116 75E8 JNZ 0100
0118 B000 MOV AL,00
011A B44C MOV AH,4C
011C CD21 INT 21
Disassembly of GETKEY2.COM using DEBUG's U (Unassemble) command
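The disassembly also shows how the jump operands relate to the addresses in the mnemonics: the operand byte of a short jump is a signed displacement measured from the address of the next instruction. The following C sketch (the function name short_jump_target is made up for this illustration) reproduces that arithmetic:

```c
/* Target of an 8086 short jump: the address of the next instruction
 * plus the operand byte interpreted as a signed 8-bit displacement.
 */
unsigned int short_jump_target( unsigned int addr, unsigned char disp )
{
    int offset = ( disp < 0x80 ) ? (int) disp : (int) disp - 0x100;
    unsigned int next = addr + 2;    /* a short jump is 2 bytes long */

    return ( next + (unsigned int) offset ) & 0xFFFFu;  /* 16-bit wrap */
}
```

For the instruction 75E8 at offset 0116, the displacement E8H is -24 decimal, so the target is 0118H - 18H = 0100H, the top of the loop. Likewise 7206 at 0106 gives 0108H + 6 = 010EH.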
The next step after assembly language resulted in various high-level languages where the instructions are more like English statements. A program called a compiler translates the high-level statements into binary machine code, often with an intermediate step of first translating the statements into assembly language, then calling an assembler to make the final translation to machine code. An interpreter, by contrast, carries out the statements itself at run time rather than producing a machine code file.
Here is a C programming language program that is very nearly functionally identical to the above 2 programs in machine language and assembly language:
/* GETKEY3.C  Read keyboard character without echo, convert
 *            to upper case and display to Standard Output
 */
#include <stdio.h>   /* putchar() */
#include <conio.h>   /* getch() */
#include <ctype.h>   /* toupper(), islower() */

/* all C/C++ programs have a main() function that returns an integer
 * value to the operating system, 0 is the usual "success" code
 */
int main()
{
    int ch;                       /* keyboard character */

    do {                          /* loop at least once */
        ch = getch();             /* read char from keyboard */
        if ( islower( ch ) )      /* if char is lower case */
            ch = toupper( ch );   /* convert to upper case */
        /* display the character; a "real" program would now use the
         * upper case character to make, say, menu choices
         */
        putchar( ch );
    } while ( ch != 'Q' );        /* loop until char = 'Q' */

    return 0;                     /* return success code to DOS */
}

Compiling GETKEY3.C
Notice how very little "direct manipulation" of the input character is
done in the high-level version. We call library functions to
handle all the details of determining if the character was lower case in
the first place ( islower( ch ) ), converting
it to upper case ( toupper( ch ) ), and displaying it ( putchar( ch ) ).
Notice also that in the C version we declare a data type and
reserve a memory location to store the character ( int ch; ), whereas
the machine and assembly language versions simply keep it in the AL register.
GETKEY doesn't do much; it just displays the character you entered as upper case if you entered a lower case key. Any other keys, like numbers or punctuation, should get passed straight through without change. Here are some runs with GETKEY1 (the machine language version), GETKEY2 (the assembly language version), and GETKEY3 (the C language version) using the following characters as input: @ABYZ[`abyz{09q.
GETKEY1 Input: @ABYZ[`abyz{09q
GETKEY1 Output: @ABYZ[`ABYZ{09Q
GETKEY2 Input: @ABYZ[`abyz{09q
GETKEY2 Output: @ABYZ[`ABYZ{09Q
GETKEY3 Input: @ABYZ[`abyz{09q
GETKEY3 Output: @ABYZ[`ABYZ{09Q
Testing machine, assembly, and C language versions of GETKEY
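The runs above can also be checked without a keyboard by pushing the test string through the same islower() / toupper() logic that GETKEY3 uses. The sketch below is an approximation for checking purposes only: the name getkey_filter is hypothetical, it works on a string rather than on getch() input, and it does not model the quit test.

```c
#include <ctype.h>    /* islower(), toupper() */
#include <stddef.h>   /* size_t */

/* Convert each character of `in` the way GETKEY3 does and store the
 * result in `out`, which must be at least as large as `in`
 * (including the terminating NUL).
 */
void getkey_filter( const char *in, char *out )
{
    size_t i;

    for ( i = 0; in[i] != '\0'; i++ ) {
        int ch = (unsigned char) in[i];
        if ( islower( ch ) )          /* if char is lower case */
            ch = toupper( ch );       /* convert to upper case */
        out[i] = (char) ch;
    }
    out[i] = '\0';
}
```

Given the input @ABYZ[`abyz{09q, the output buffer should hold @ABYZ[`ABYZ{09Q, the same result shown for all three versions.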
Extra Credit: See if you can figure out why I chose certain characters to test. (Hint: look at an ASCII character chart and at the tests in the assembly language version.)