C_traps - C Traps and Pitfalls* Andrew Koenig AT&T Bell...

C Traps and Pitfalls*

Andrew Koenig
Murray Hill, New Jersey 07974

ABSTRACT

The C language is like a carving knife: simple, sharp, and extremely useful in skilled hands. Like any sharp tool, C can injure people who don’t know how to handle it. This paper shows some of the ways C can injure the unwary, and how to avoid injury.

0. Introduction

The C language and its typical implementations are designed to be used easily by experts. The language is terse and expressive. There are few restrictions to keep the user from blundering. A user who has blundered is often rewarded by an effect that is not obviously related to the cause.

In this paper, we will look at some of these unexpected rewards. Because they are unexpected, it may well be impossible to classify them completely. Nevertheless, we have made a rough effort to do so by looking at what has to happen in order to run a C program. We assume the reader has at least a passing acquaintance with the C language.

Section 1 looks at problems that occur while the program is being broken into tokens. Section 2 follows the program as the compiler groups its tokens into declarations, expressions, and statements. Section 3 recognizes that a C program is often made out of several parts that are compiled separately and bound together. Section 4 deals with misconceptions of meaning: things that happen while the program is actually running. Section 5 examines the relationship between our programs and the library routines they use. In section 6 we note that the program we write is not really the program we run; the preprocessor has gotten at it first. Finally, section 7 discusses portability problems: reasons a program might run on one implementation and not another.

1. Lexical Pitfalls

The first part of a compiler is usually called a lexical analyzer. This looks at the sequence of characters that make up the program and breaks them up into tokens.
A token is a sequence of one or more characters that have a (relatively) uniform meaning in the language being compiled. In C, for instance, the token -> has a meaning that is quite distinct from that of either of the characters that make it up, and that is independent of the context in which the -> appears.

For another example, consider the statement:

    if (x > big) big = x;

Each non-blank character in this statement is a separate token, except for the keyword if and the two instances of the identifier big.

In fact, C programs are broken into tokens twice. First the preprocessor reads the program. It must tokenize the program so that it can find the identifiers, some of which may represent macros. It must then replace each macro invocation by the result of evaluating that macro. Finally, the result of the macro replacement is reassembled into a character stream which is given to the compiler proper. The compiler then breaks the stream into tokens a second time.
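To make the idea of a token concrete, here is a minimal sketch of a tokenizer for a tiny subset of C. It is not taken from the paper: the function name tokenize, the limits MAX_TOKENS and MAX_TOKEN_LEN, and the decision to recognize -> as the only multi-character operator are all assumptions made for illustration. A real C lexical analyzer knows many more operators, as well as string literals, character constants, and comments.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_TOKENS    32   /* illustrative limits, not from the paper */
#define MAX_TOKEN_LEN 16

/*
 * Hypothetical helper: split src into tokens and store up to max_tokens
 * of them as NUL-terminated strings. Returns the number of tokens stored.
 */
size_t tokenize(const char *src, char tokens[][MAX_TOKEN_LEN], size_t max_tokens)
{
    size_t count = 0;
    const char *p = src;

    assert(src != NULL && tokens != NULL);

    while (*p != '\0' && count < max_tokens) {
        size_t len, copy;

        if (isspace((unsigned char)*p)) {
            p++;                        /* blanks only separate tokens */
            continue;
        }
        if (isalnum((unsigned char)*p) || *p == '_') {
            /* identifier, keyword, or number: take the longest run */
            len = 1;
            while (isalnum((unsigned char)p[len]) || p[len] == '_')
                len++;
        } else if (p[0] == '-' && p[1] == '>') {
            len = 2;                    /* -> is a single token */
        } else {
            len = 1;                    /* any other character stands alone */
        }

        /* Store at most MAX_TOKEN_LEN - 1 characters, but skip the whole token. */
        copy = len < MAX_TOKEN_LEN ? len : MAX_TOKEN_LEN - 1;
        memcpy(tokens[count], p, copy);
        tokens[count][copy] = '\0';
        count++;
        p += len;
    }
    return count;
}
```

Called on the statement if (x > big) big = x; with an array declared as char t[MAX_TOKENS][MAX_TOKEN_LEN], this sketch stores ten tokens: if, (, x, >, big, ), big, =, x, and ;. Called on p->next, it stores three: p, ->, and next.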

