Reviewing C
I have learned some C in the past: I went through "The C Programming Language". I am now planning to read some textbooks about operating systems and the hardware/software interface of computers, so I want to go back over it.
C is a general-purpose programming language which features economy of expression, modern control flow and data structures, and a rich set of operators. C is not a "very high level" language, nor a "big" one, and is not specialized to any particular area of application. But its absence of restrictions and its generality make it more convenient and effective for many tasks than supposedly more powerful languages.
- C has been closely associated with the UNIX system where it was developed, since both the system and most of the programs that run on it are written in C. The language, however, is not tied to any one operating system or machine.
- C provides a variety of data types. The fundamental types are characters, integers, and floating point numbers of several sizes. In addition, there is a hierarchy of derived data types created with pointers, arrays, structures, and unions. Expressions are formed from operators and operands; any expression, including an assignment or a function call, can be a statement. Pointers provide for machine-independent address arithmetic.
- C provides the fundamental control-flow constructions required for well-structured programs: statement grouping, decision making (if-else), selecting one of a set of possible cases (switch), looping with the termination test at the top (while, for) or at the bottom (do), and early loop exit (break).
- Functions may return values of basic types, structures, unions, or pointers. Any function may be called recursively. Local variables are typically "automatic", or created anew with each invocation. Function definitions may not be nested, but the functions of a program may exist in separate source files that are compiled separately. Variables may be internal to a function, external but known only within a single source file, or visible to the entire program.
- A preprocessing step performs macro substitution on program text, inclusion of other source files, and conditional compilation.
A Tutorial Introduction
- A C program consists of functions and variables. A function may contain statements that specify the computing operations to be done, and variables that store values used during the computation.
- Normally, you are at liberty to give functions whatever names you like, but "main" is special: your program begins executing at the beginning of main. This means that every program must have a main somewhere.
- A sequence of characters in double quotes, like "hello, world\n", is called a character string or string constant.
- Comments are written between /* and */.
- All variables must be declared before they are used. A declaration announces the properties of variables; it consists of a type name and a list of variables:
// Variable declaration
int fahr, celsius;
int lower, upper, step;
// assignment statements
upper = 300;
lower = 0;
step = 20;
fahr = lower;
- C basic data types: int, float, char, short, long, double.
- Integer division truncates in C.
- printf is a general-purpose output formatting function.
- If an arithmetic operator has one floating-point operand and one integer operand, the integer will be converted to floating point before the operation is done.
- A #define line defines a symbolic name or symbolic constant to be a particular string of characters. Symbolic constant names are conventionally written in upper case so they can be readily distinguished from lower case variable names.
#define name replacement text
#define LOWER 0
#define UPPER 300
- Text input or output, regardless of where it originates or where it goes to, is dealt with as streams of characters. A text stream is a sequence of characters divided into lines; each line consists of zero or more characters followed by a newline character.
- The getchar() and putchar() functions are for getting and printing characters.
- A line with just a ; on it is called a null statement, and is sometimes used because C requires that a for statement have a body.
- A character written between single quotes represents an integer value equal to the numerical value of the character in the machine's character set. This is called a character constant.
- NOTE: To signal end of file (EOF) from the Linux command line, press CTRL+d.
- Expressions connected by && and || are evaluated left to right, and it is guaranteed that evaluation will stop as soon as the truth or falsehood is known.
int power(int base, int n)
{
int i, p;
p = 1;
for (i = 1; i <= n; ++i)
p = p*base;
return p;
}
- In C, all function arguments are passed "by value". This means that the function is given the values of its arguments in temporary variables rather than the originals. This leads to some different properties than are seen with "call by reference" languages in which the called routine has access to the original argument, not a local copy.
- If you want to modify the value of a variable, you can always pass in the address of the variable.
- The story is different for arrays: when the name of an array is used as an argument, the value passed to the function is the location or address of the beginning of the array. There is no copying of array elements.
- When a string constant like "hello\n" appears in a C program, it is stored as an array of characters containing the characters of the string and terminated with a '\0' to mark the end.
- Each local variable in a function comes into existence only when the function is called, and disappears when the function is exited. This is why such variables are usually known as automatic variables or local variables. Variables can also be defined globally, outside of any function.
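- If an external variable is defined in the source file before the functions that use it, those functions do not need an extern declaration for it. The usual practice is therefore to define all external variables at the beginning of the source file.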
Types, Operators, and Expressions
Variables and constants are the basic data objects manipulated in a program. Declarations list the variables to be used, and state what type they have and perhaps what their initial values are. Operators specify what is to be done to them. Expressions combine variables and constants to produce new values. The type of an object determines the set of values it can have and what operations can be performed on it.
- The basic data types in C:
  - char: a single byte, capable of holding one character in the local character set
  - int: an integer, typically reflecting the natural size of integers on the host machine
  - float: single-precision floating point
  - double: double-precision floating point
- Qualifiers:
  - short
  - long
  - signed
  - unsigned
- chars are just small integers, so char variables and constants are identical to ints in arithmetic expressions.
- Function definitions can appear in any order, and in one source file or several. A parameter is used to describe a variable named in the parenthesized list in a function definition, and argument is used to describe the value used in a call of the function.
- A long constant is written with a terminal l (ell) or L, as in 123456789L.
- An integer can be written in octal with a leading 0 (zero) and in hexadecimal with a leading 0x or 0X.
- Character constants such as 'c' are integers, and participate in numeric operations just as any other integers.
- A constant expression is an expression that involves only constants. Such expressions can be evaluated during compilation rather than at run time:
#define MAXLINE 1000
char line[MAXLINE+1];
- A string constant or string literal is a sequence of zero or more characters surrounded by double quotes, as in "I am a string".
- The internal representation of a string is an array of characters terminated by '\0', so its size is one more than the number of characters in the string.
- The standard library function strlen(s) returns the number of characters in s, excluding the terminating '\0'.
- An enumeration is a list of constant integer values:
enum boolean { NO, YES };
enum months { JAN = 1, FEB, MAR, APR ...
enum escapes { BELL = '\a', BACKSPACE = '\b', TAB = '\t' ...
- A declaration specifies a type and contains a list of one or more variables of that type:
int lower, upper, step;
char c, line[1000];
- A variable may also be initialized in its declaration:
char esc = '\\';
int i = 0;
int limit = MAXLINE+1;
float eps = 1.0e-5;
- External and static variables are initialized to zero by default. Automatic variables for which there is no explicit initializer have undefined (i.e., garbage) values.
- The qualifier const can be applied to the declaration of any variable to specify that its value will not be changed:
const double e = 2.71828182;
const char msg[] = "warning: ";
// The const declaration can be used with array arguments to indicate that the function does not change that array:
int strlen(const char[]);
- The binary arithmetic operators are +, -, *, /, and the modulus operator %.
- Binary in this case means that the operator takes two operands.
- The relational operators are >, >=, <, and <=.
- The equality operators are == and !=.
- The logical operators are && and ||. Expressions connected by these logical operators are evaluated left to right.
- By definition, the numeric value of a relational or logical expression is 1 if the relation is true, and 0 if the relation is false.
- In general, the only automatic type conversions are those that convert a "narrower" operand into a "wider" one without losing information, such as converting an integer to floating point in an expression like float + int.
- Specify signed or unsigned if non-character data is to be stored in char variables.
- Some type conversion rules (when no operand is unsigned):
  - If either operand is long double, convert the other to long double.
  - Otherwise, if either operand is double, convert the other to double.
  - Otherwise, if either operand is float, convert the other to float.
  - Otherwise, convert char and short to int.
  - Then, if either operand is long, convert the other to long.
- Conversions take place across assignments; the value of the right side is converted to the type of the left, which is the type of the result.
- Explicit type conversions can be forced ("coerced") in any expression, with a unary operator called a cast:
// (type-name) expression
sqrt((double) n) // casting n to be double
- The ++ and -- operators increment and decrement values respectively, and may be used as prefix or postfix operators. Used as a prefix (++n), they modify the variable before its value is used; used as a postfix (n++), they modify it after its value is used.
- Bitwise Operators: & (AND), | (inclusive OR), ^ (exclusive OR), << (left shift), >> (right shift), and ~ (one's complement).
- Ternary Operator:
z = (a > b) ? a : b; /* z = max(a, b) */
Control Flow
- The control-flow statements of a language specify the order in which computations are performed.
- In C, the semicolon is a statement terminator, rather than a separator as it is in languages like Pascal.
- Braces { and } are used to group declarations and statements together into a compound statement or block so that they are syntactically equivalent to a single statement.
- Switch Statement:
#include <stdio.h>
int main(void) /* count digits, white space, others */
{
    int c, i, nwhite, nother, ndigit[10];

    nwhite = nother = 0;
    for (i = 0; i < 10; i++)
        ndigit[i] = 0;
    while ((c = getchar()) != EOF) {
        switch (c) {
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
            ndigit[c-'0']++;
            break;
        case ' ': case '\n': case '\t':
            nwhite++;
            break;
        default:
            nother++;
            break;
        }
    } /* end of the input loop */
    printf("digits =");
    for (i = 0; i < 10; i++) {
        printf(" %d", ndigit[i]);
    }
    printf(", white space = %d, other = %d\n", nwhite, nother);
    return 0;
}
- Most of the control flow in C is similar to that of other languages. C also has a do ... while loop and goto statements, both shown below.
// do ... while
void itoa(int n, char s[])
{
int i, sign;
if ((sign=n) < 0) n = -n; // record sign
i = 0;
do { // generate digits in reverse order
s[i++] = n % 10 + '0';
} while ((n /= 10) > 0);
if (sign < 0) s[i++] = '-';
s[i] = '\0';
reverse(s);
}
// goto snippet
for (i = 0; i < n; i++)
for (j = 0; j < m; j++)
if (a[i] == b[j])
goto found;
...
found:
...
Functions and Program Structure
- Functions break large computing tasks into smaller ones, and enable people to build on what others have done instead of starting over from scratch
- If the return type of a function is omitted, int is assumed. (This is the rule in the C described by the book; C99 and later require an explicit return type.)
- Functions themselves are always external, because C does not allow functions to be defined inside other functions.
- The scope of a name is the part of the program within which the name can be used.
- A declaration of a variable announces the properties of a variable, and a definition causes storage to be set aside.
- The static declaration applied to an external variable or function limits the scope of that object to the rest of the source file being compiled.
- The static declaration can also be applied to internal variables. Internal static variables are local to a particular function just as automatic variables are, but unlike automatics, they remain in existence rather than coming and going each time the function is activated.
- The register declaration advises the compiler that the variable in question will be heavily used.
- In the absence of explicit initialization, external and static variables are guaranteed to be initialized to zero; automatic and register variables have undefined (i.e. garbage) initial values.
- An array may be initialized by following its declaration with a list of initializers in braces and separated by commas. When the size of the array is omitted, the compiler will compute the length by counting the initializers. If there are fewer initializers for an array than the number specified, the missing elements will be zero for external, static, and automatic variables. It is an error to have too many initializers. Character arrays are a special case of initialization; a string may be used instead of the braces and comma notation.
- C provides certain language facilities by means of a preprocessor, which is conceptually a separate first step in compilation. The two most frequently used features are #include, to include the contents of a file during compilation, and #define, to replace a token by an arbitrary sequence of characters.
- File inclusion makes it easy to handle collections of #defines and declarations. Any line of the form
#include "filename" // search for the filename starting at the current file
// or
#include <filename> // search for filename using implementation defined rule
is replaced by the contents of the file filename.
- Macro Substitution
#define name replacement text
- Subsequent occurrences of the token name are replaced by the replacement text.
#define max(A,B) ((A) > (B) ? (A) : (B))
- A macro definition can continue over multiple lines as long as each line to be continued ends with a \.
Pointers and Arrays
- A pointer is a variable that contains the address of a variable. Pointers are much used in C, partly because they are sometimes the only way to express a computation, and partly because they usually lead to more compact and efficient code than can be obtained in other ways.
- A typical machine has an array of consecutively numbered or addressed memory cells that may be manipulated individually or in contiguous groups. One common situation is that any byte can be a char, a pair of one-byte cells can be treated as a short integer, and four adjacent bytes form a long. A pointer is a group of cells (often two or four) that can hold an address. So if c is a char and p is a pointer that points to it, then p holds the address of the cell in which c is stored.
- The unary operator & gives the address of an object, so the statement p = &c assigns the address of c to the variable p, and p is said to "point to" c. The unary operator * is the indirection or dereferencing operator; when applied to a pointer, it accesses the object the pointer points to.
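#include <stdio.h>
/* WRONG: x and y are copies, so the caller's variables are unchanged */
void swap(int x, int y) {
    int temp;
    temp = x;
    x = y;
    y = temp;
}
/* Correct: the caller passes addresses and the values are exchanged through them */
void swapBetter(int *px, int *py) {
    int temp;
    temp = *px;
    *px = *py;
    *py = temp;
}
int main(void) {
    int a = 2;
    int b = 3;
    swap(a, b);         /* no effect on a and b */
    swapBetter(&a, &b); /* exchanges a and b */
    printf("%d = a = 3, %d = b = 2\n", a, b);
    return 0;
}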
In C, there is a strong relationship between pointers and arrays, strong enough that pointers and arrays should be discussed simultaneously. Any operation that can be achieved by array subscripting can also be done with pointers.
- When an array is passed to a function, what is passed is the location of the initial element.
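#include <stdio.h>
/* book-style strlen; in real code pick another name to avoid clashing with <string.h> */
int strlen(char *s) {
    int n;
    for (n = 0; *s != '\0'; s++)
        n++;
    return n;
}
int main(void) {
    printf("%d\n", strlen("Hello")); // 5
    return 0;
}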
- As formal parameters in a function definition, char s[] and char *s are equivalent; people prefer the latter because it says more explicitly that the parameter is a pointer.
- If p is a pointer to some element of an array, then p++ increments p to point to the next element, and p += i increments it to point i elements beyond where it currently does.
- C provides rectangular multi-dimensional arrays, although in practice they are much less used than arrays of pointers.
Structures
- A structure is a collection of one or more variables, possibly of different types, grouped together under a single name for convenient handling. Structures help to organize complicated data, particularly in large programs, because they permit a group of related variables to be treated as a unit instead of as separate entities.
- The keyword struct introduces a structure declaration, which is a list of declarations enclosed in braces. An optional name called a structure tag may follow the word struct. The tag names this kind of structure and can be used subsequently as a shorthand for the part of the declaration in braces.
// Creating Structs
struct point {
int x;
int y;
};
// Initializing structs
struct point maxpt = { 320, 200 };
// Reference
printf("x = %d, y = %d\n",maxpt.x,maxpt.y);
// nested structs
struct rect {
struct point p1;
struct point p2;
};
// declaring and initializing a struct
struct rect screen = { { 0, 0 }, { 320, 200 } };
printf("p1.x = %d", screen.p1.x);
- The only legal operations on a structure are copying it or assigning to it as a unit, taking its address with &, and accessing its members.
- If a large structure is to be passed to a function, it is generally more efficient to pass a pointer than to copy the whole structure. Structure pointers are just like pointers to ordinary variables.
- Pointers to structures are so frequently used that an alternative notation is provided as a shorthand. If p is a pointer to a structure, then p->member-of-structure refers to a particular member.
struct point origin = { 0, 0 };
struct point *pp = &origin; // pp must point at a struct before it is used
printf("origin is (%d,%d)\n", pp->x, pp->y);
The structure operators . and ->, together with () for function calls and [] for subscripts, are at the top of the precedence hierarchy and thus bind very tightly.
// Given declaration
struct {
int len;
char *str;
} *p;
// then
++p->len; // increments len, not p
- C provides a facility called typedef for creating new data type names. For example, the declaration
typedef int Length;
makes the name Length a synonym for int.
- A union is a variable that may hold (at different times) objects of different types and sizes, with the compiler keeping track of size and alignment requirements. Unions provide a way to manipulate different kinds of data in a single area of storage, without embedding any machine dependent information in the program.
- Members of a union are accessed in the same way as members of a struct, with . or ->.
union u_tag {
int ival;
float fval;
char *sval;
} u;
- This is good enough for now. I am going to move on to something else.