Tru64 UNIX
Compaq C Language Reference Manual



1.5 Operators

An operator is a token that specifies an operation on at least one operand, and yields some result (a value, designator, side effect, or some combination). Operands are expressions or constants (a form of expression). Operators with one operand are unary operators, and operators with two operands are binary operators. For example:


x = -b;          /*   Unary minus operator   */ 
y = a - c;       /*   Binary minus operator  */ 

Operators with three operands are called ternary operators.
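As a minimal sketch of the three operator arities described above, the following function uses one operator of each kind. The function and variable names are hypothetical and chosen only for illustration.

```c
/* Hypothetical helper: uses one unary, one binary, and one ternary operator. */
int operator_arity_demo(int a, int b)
{
    int negated = -a;                /* unary minus: one operand       */
    int difference = negated - b;    /* binary minus: two operands     */

    /* conditional operator ?: takes three operands */
    return (difference < 0) ? -difference : difference;
}
```

Calling operator_arity_demo(3, 4) negates 3, subtracts 4 to get -7, and then selects the negated value, so the result is 7.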

All operators are ranked by precedence, a ranking system determining which operators are evaluated before others in a statement. See Chapter 6 for information on what each operator does and for the rules of operator precedence.

Some operators in C are composed of more than one character, while others are single characters. The single-character operators in C are:


!  %  ^  &  *  -  +  =  ~  |  .  <  >  /  ?  :  ,  [  ]  (  )  # 

The multiple-character operators in C are:


++    --    ->    <<     >>     <=    >=    ==    !=    *=    /= 
%=    +=    -=    <<=    >>=    &=    ^=    |=    ##    &&    || 

The # and ## operators can only be used in preprocessor macro definitions. See Chapter 8 for more information on predefined macros and preprocessor directives.
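The following sketch shows one plausible use of each preprocessor operator inside a macro definition. The macro names STRINGIZE and PASTE and the function names are assumptions made for this example; they are not predefined by the compiler.

```c
/* Hypothetical macros illustrating the two preprocessor operators. */
#define STRINGIZE(arg)  #arg       /* # converts the argument to a string literal */
#define PASTE(a, b)     a ## b     /* ## joins two tokens into a single token     */

int paste_demo(void)
{
    int counter1 = 10;
    int counter2 = 32;

    /* PASTE(counter, 1) becomes the identifier counter1, and so on */
    return PASTE(counter, 1) + PASTE(counter, 2);
}

const char *stringize_demo(void)
{
    return STRINGIZE(x + y);       /* expands to the string literal "x + y" */
}
```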

The sizeof operator determines the size, in bytes, of a data type or of the type of an expression. See Chapter 6 for more information on the sizeof operator.
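A common use of sizeof is to compute the number of elements in an array. The sketch below assumes a hypothetical local array named values; it applies sizeof once to an expression and once to a parenthesized type name.

```c
#include <stddef.h>

/* Hypothetical helper: returns the number of elements in a local array. */
size_t element_count_demo(void)
{
    int values[8];

    /* sizeof values is the size of the whole array;
       sizeof(int) is the size of one element       */
    return sizeof values / sizeof(int);
}
```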

The old form for compound assignment operators (=+, =-, =*, =/, =%, =<<, =>>, =&, =^, and =|) is not part of the ANSI C standard. Programs that rely on these operators are unsupported and produce unpredictable results. For example:


x =-3; 

This construction assigns the value -3 to x; it does not assign the value x - 3 to x.

The error-checking compiler option provides a warning message when the old form of compound assignment operators is encountered.
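The difference between the old form and the supported compound assignment can be sketched as follows, assuming a compiler that follows the ANSI rules. The function names are hypothetical.

```c
/* Old form: the tokens are = and -3, so this is a plain assignment. */
int old_form_demo(int x)
{
    x =-3;          /* parsed as x = (-3) */
    return x;
}

/* Supported form: -= is a single compound assignment operator. */
int compound_demo(int x)
{
    x -= 3;         /* equivalent to x = x - 3 */
    return x;
}
```

With an argument of 10, the first function returns -3 and the second returns 7.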

1.6 Punctuators

Some characters in C are used as punctuators, which have their own syntactic and semantic significance. Punctuators are not operators or identifiers. Table 1-4 lists the C punctuators.

Table 1-4 Punctuators

Punctuator  Use                                                                  Example
< >         Header name                                                          <limits.h>
[ ]         Array delimiter                                                      char a[7];
{ }         Initializer list, function body, or compound statement delimiter     char x[4] = {'H', 'i', '!', '\0' };
( )         Function parameter list delimiter; also used in expression grouping  int f (x,y)
*           Pointer declaration                                                  int *x;
,           Argument list separator                                              char x[4] = { 'H', 'i', '!', '\0'};
:           Statement label                                                      labela: if (x == 0) x += 1;
=           Declaration initializer                                              char x[4] = { "Hi!" };
;           Statement end                                                        x += 1;
...         Variable-length argument list                                        int f ( int y, ...)
#           Preprocessor directive                                               #include <limits.h>
' '         Character constant                                                   char x = 'x';
" "         String literal or header name                                        char x[] = "Hi!";

The following punctuators must be used in pairs:

< >
[ ]
( )
' '
" "
{ }

Some characters can be used either as a punctuator or as an operator, or as part of an operator. The context of the occurrence specifies the meaning. Punctuators usually delineate a specific type of C construct, as shown in Table 1-4.
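The following sketch illustrates how context decides whether a character is a punctuator or an operator. All names are hypothetical and serve only to show each role.

```c
/* Hypothetical helper: the same characters in punctuator and operator roles. */
int context_demo(void)
{
    int value = 6, factor = 7;          /* ',' separates declarators; '=' introduces an initializer */
    int *pointer = &value;              /* '*' here is the pointer declaration punctuator           */
    int product = *pointer * factor;    /* first '*' is indirection, second '*' is multiplication   */

    return (value = 1, product);        /* here '=' and ',' are both operators                      */
}
```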

1.7 String Literals

Strings are sequences of zero or more characters. String literals are character strings surrounded by quotation marks. String literals can include any valid character, including white-space characters and character escape sequences. Once stored as a string literal, modification of the string leads to undefined results.

In the following example, ABC is the string literal. It is assigned to a character array where each character in the string literal is stored as one array element. Storing a string literal in a character array lets you modify the characters of the array.


char x[] = "ABC"; 

String literals are typically stored as arrays of type char (or of type wchar_t if prefaced with an L), and have static storage duration.
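The distinction between a string literal and an array initialized from one can be sketched as follows. The function name is hypothetical; the commented-out line shows the kind of modification that leads to undefined results.

```c
/* Hypothetical helper: modifies an array copy, leaves the literal alone. */
char modify_array_demo(void)
{
    char x[] = "ABC";         /* x holds its own copy: 'A', 'B', 'C', '\0'        */
    const char *p = "ABC";    /* p points at the literal; treat it as read-only   */

    x[0] = 'X';               /* well defined: x is an ordinary character array   */
    /* ((char *)p)[0] = 'X';     would modify the literal: undefined results      */

    return (p[0] == 'A') ? x[0] : '\0';
}
```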

The following declaration declares a character array to hold the string "Hello!":


char s[] = "Hello!"; 

The character array s is initialized with the characters specified in the double quotation marks and terminated with a null character (\0). The null character marks the end of each string, and the compiler appends it automatically to the end of the string literal. Adjacent string literals are automatically concatenated (with a single null character added at the end), which reduces the need for the line continuation character (the backslash at the end of a line).

Following are some valid string literals:


""            /*  Here's a string with only the null character */ 
 
"You can have many characters in a string." 
 
"\"You can mix characters and escape sequences.\"\n" 
 
"Long lines of text can be continued on the next line \
by using the backslash character at the end of a line." 
 
"Or, long lines of text can be continued by using " 
"ANSI's concatenation of adjacent string literals." 
 
"\'\n"        /*  Only escape sequences are in this string    */ 

To determine the length of a given string literal (not including the null character), use the strlen function. See Chapter 9 for more information on other library routines available for string manipulation.
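As a brief sketch, the functions below contrast strlen, which excludes the null character, with sizeof applied to an initialized array, which includes it. The function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Length of a literal formed from two adjacent string literals. */
size_t literal_length_demo(void)
{
    /* the compiler joins the pieces into "Hello!" with one null character */
    return strlen("Hel" "lo!");
}

/* Storage occupied by an array initialized from the same literal. */
size_t literal_storage_demo(void)
{
    char s[] = "Hello!";
    return sizeof s;          /* six characters plus the null character */
}
```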

1.8 Constants

There are four categories of constants in C:

Integer constants
Floating-point constants
Character constants
Enumeration constants

The following sections describe these constants.

The value of any constant must be within the range of representable values for the specified type. Regardless of its type, a constant is a literal or symbolic value that does not change. A constant is also an rvalue, as defined in Section 2.14.

1.8.1 Integer Constants

Integer constants are used to represent whole numbers. An integer constant can be specified in decimal, octal, or hexadecimal radix, and can optionally include a prefix that specifies its radix and a suffix that specifies its type. An integer constant cannot include a period or an exponent part.

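Follow these rules when specifying an integer constant:

To specify a decimal integer constant, use a sequence of decimal digits in which the first digit is not 0 (for example, 59).
To specify an octal integer constant, start the sequence with 0 and follow it with digits in the range 0 to 7 (for example, 073).
To specify a hexadecimal integer constant, start the sequence with 0x or 0X and follow it with hexadecimal digits: 0 to 9, a to f, or A to F (for example, 0x3B).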

Without explicit specification, the type of an integer constant defaults to the smallest possible type that can hold the constant's value, unless the value is suffixed with an L, l, U, or u. The following list describes the type assignment of integer constants:

For example, the constant 59 is assigned the int data type by default, but the constant 59L is assigned the long data type. 59UL is typed as unsigned long int.

Integer constant values are always nonnegative; a preceding minus sign is interpreted as a unary operator, not as part of the constant. If the value exceeds the largest representable integer value (causing an overflow), the compiler issues a warning message and uses the greatest representable value for the integer type. Unsuffixed integer constants can have different types, because without explicit specification the constant is represented in the smallest possible integer type.
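The radix prefixes and type suffixes can be observed with a short sketch such as the following. The function names are hypothetical, and sizeof is used here only as a convenient way to inspect the type that the compiler assigns.

```c
/* The same value written in decimal, octal, and hexadecimal radix. */
int radix_demo(void)
{
    return (59 == 073) && (59 == 0x3B);
}

/* A suffix selects the type of the constant. */
int suffix_demo(void)
{
    return sizeof(59)   == sizeof(int)
        && sizeof(59L)  == sizeof(long)
        && sizeof(59UL) == sizeof(unsigned long);
}

/* -59 is the unary minus operator applied to the constant 59. */
int minus_demo(void)
{
    return -59 + 59;
}
```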

1.8.2 Floating-Point Constants

A floating-point constant has a fractional or exponential part. Floating-point constants are always interpreted in decimal radix (base 10). An optional suffix can be appended to show the constant's type. Floating-point constants can be expressed with decimal point notation, signed exponent notation, or both. A decimal point without a preceding or following digit is not allowed (for example, .E1 is illegal). Table 1-5 shows examples of valid notational options.

The significand part of the floating-point constant (the whole number part, the decimal point, and the fractional part) may be followed by an exponent part, such as 32.45E2 . The exponent part (in the previous example, E2 ) indicates the power of 10 by which the significand part is to be scaled. The precise value after scaling is dependent on your platform. The determining algorithm is described in your platform-specific Compaq C documentation.

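The default type of a floating-point constant is double, unless:

The constant is suffixed with f or F, in which case its type is float.
The constant is suffixed with l or L, in which case its type is long double.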

Floating-point constant values must be nonnegative; a preceding minus sign is interpreted as a unary operator, not as part of the constant.

Table 1-5 Floating-Point Notation

Notation  Value     Type
.0        0.000000  double
0.        0.000000  double
2.        2.000000  double
2.5       2.500000  double
2e1       20.00000  double
2E1       20.00000  double
2.E+1     20.00000  double
2e+1      20.00000  double
2e-1      0.200000  double
2.5e4     25000.00  double
2.5E+4    25000.00  double
2.5F      2.500000  float
2.5L      2.500000  long double
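A few of the notations in Table 1-5 can be checked with a sketch like the one below. The function names are hypothetical. The comparisons use values that are exactly representable in binary floating point, so they are expected to hold on typical implementations.

```c
/* Different notations for the same double values. */
int notation_demo(void)
{
    return (2e1 == 20.0) && (2.E+1 == 20.0)
        && (2.5e4 == 25000.0) && (.0 == 0.);
}

/* The suffix determines the type of the constant. */
int float_suffix_demo(void)
{
    return sizeof(2.5)  == sizeof(double)
        && sizeof(2.5F) == sizeof(float)
        && sizeof(2.5L) == sizeof(long double);
}
```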

1.8.3 Character Constants

A character constant is any character from the source character set enclosed in apostrophes. Character constants are represented by objects of type int. For example:


char alpha = 'A'; 

Characters such as the new-line character, single quotation marks, double quotation marks, and backslash can be included in a character constant by using escape sequences as described in Section 1.8.3.3. All valid characters can also be included in a constant by using numeric escape sequences, as described in Section 1.8.3.4.

The value of a character constant containing a single character is the numeric value of the character in the current character set. Character constants containing multiple characters within the single quotation marks have a value determined by the compiler. The value of a character constant represented by an octal or hexadecimal escape sequence is the same as the octal or hexadecimal value of the escape sequence. The value of a wide character constant (discussed in Section 1.8.3.1) is determined by the mbtowc library function.

There is a limit of four characters for any one character constant. Enclosing more than four characters in single quotation marks (such as 'ABCDE') generates an overflow warning.

Note that the byte ordering of character constants is platform specific.
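The type and value of a character constant can be illustrated with a short sketch. The function names are hypothetical, and the second function assumes an ASCII execution character set, in which 'A' has the value 65.

```c
/* In C, a character constant has type int, not char. */
int char_type_demo(void)
{
    return sizeof('A') == sizeof(int);
}

/* Assumes ASCII: 'A' is 65, which is 101 in octal and 41 in hexadecimal. */
int char_escape_demo(void)
{
    return ('A' == 65) && ('\101' == 'A') && ('\x41' == 'A');
}
```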

1.8.3.1 Wide Characters

C provides for an extended character set through the use of wide characters. Wide characters are characters too large to fit in the char type. The wchar_t type is typically used to represent a character constant in a character set requiring more than 256 possible characters, because 8 bits can represent only 256 different values.

A character constant in the extended character set is written using a preceding L, and is called a wide-character constant. Wide-character constants have an integer type, wchar_t, defined in the <stddef.h> header file. Wide-character constants can be represented with octal or hexadecimal character escape sequences, just like normal character escape sequences, but with the preceding L.

Strings composed of wide characters can also be formed. The compiler allocates storage as if the string were an array of type wchar_t, and appends a wide null character (\0) to the end of the string. The array is just long enough to hold the characters in the string and the wide null character, and is initialized with the specified characters.

The following examples show valid wide-character constants and string literals:


wchar_t wc = L'A'; 
wchar_t wmc = L'ABCD'; 
wchar_t *wstring = L"Hello!"; 
wchar_t *x = L"Wide"; 
wchar_t z[] = L"wide string"; 

Compaq C stores wchar_t objects in 32 bits of storage, as unsigned long objects on OpenVMS systems or as unsigned int objects on Tru64 UNIX systems. The null character at the end of a wide-character string is 32 bits long.
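The storage rules for wide string literals can be sketched as follows, assuming a C library that provides wcslen in the <wchar.h> header. The function names are hypothetical.

```c
#include <stddef.h>
#include <wchar.h>

/* wcslen counts wide characters, not bytes, and excludes the wide null. */
size_t wide_length_demo(void)
{
    return wcslen(L"Hello!");
}

/* The array is just long enough for the characters and the wide null. */
size_t wide_storage_demo(void)
{
    wchar_t z[] = L"wide string";     /* 11 characters plus the wide null */
    return sizeof z / sizeof z[0];
}
```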

1.8.3.2 Multibyte Characters

Some programmers requiring an extended character set have used shift-dependent encoding schemes to represent the non-ASCII characters in the normal char size of 8 bits. This encoding results in multibyte characters. ANSI C supports these encoding schemes, in addition to providing the wide-character type wchar_t.

In accordance with the ANSI standard, Compaq C recognizes multibyte characters in the following contexts:

For proper input and output of the multibyte character encodings, and to prevent conflicts with existing string processing routines, note the following rules governing the use of multibyte characters:

Transforming multibyte characters to wide-character constants and wide string literals eases the programmer's problems when dealing with shift-state encoding. There are several C library functions available for transforming multibyte characters to wide characters and back. See Chapter 9 for more information.
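As one possible illustration of such a transformation, the sketch below uses the standard mbstowcs and wcstombs functions from <stdlib.h> to convert a string to wide characters and back. The helper names and buffer sizes are assumptions, and the example relies on the default "C" locale, in which each character of "Hi!" occupies a single byte.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: converts a multibyte string to wide characters.
   Returns the number of wide characters stored, or (size_t)-1 on error. */
size_t to_wide_demo(const char *multibyte, wchar_t *wide, size_t capacity)
{
    return mbstowcs(wide, multibyte, capacity);
}

/* Converts "Hi!" to wide characters and back, then checks the result. */
int round_trip_demo(void)
{
    wchar_t wide[8];
    char back[8];

    if (to_wide_demo("Hi!", wide, 8) != 3)
        return 0;
    if (wcstombs(back, wide, sizeof back) != 3)
        return 0;

    return back[0] == 'H' && back[1] == 'i'
        && back[2] == '!' && back[3] == '\0';
}
```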

