Translation

powered by

Computing (FOLDOC) dictionary

context clash

grammar When a parser cannot tell which alternative

production of a syntax applies by looking at the next

input token ("lexeme").

E.g. given syntax

C -@# A | b c

A -@# d | b e

If you're parsing non-terminal C and the next token is 'b',

you don't know whether it's the first or second alternative of

C since they both can start with b.

To discover whether a grammar has a context clash:

For each non-terminal, N, with multiple alternatives, look at

the first symbol of each alternative's right-hand side, call

it s. If s is the empty string, then find the set FOLLOWER(N)

otherwise find the set FIRST*(s). If any of the sets for N's

alternatives intersect then there will be a context clash when

parsing N. If the next input symbol is one of those in the

intersection of two sets then you won't know which of the

alternatives applies.

FIRST(s) is the set of symbols with which s can start,

including s itself. If s is a non-terminal then FIRST(s) also

includes the first symbol of each alternative right-hand side

of s. The '*' in FIRST*(s) means the "transitive closure"

of FIRST which means keep applying FIRST to each element of

the result until the result doesn't change. I.e. start with

just the set R = s, then for each non-terminal x in R, add

FIRST(x) to R. Keep doing this until nothing new is added.

(We are really only interested in the terminals in FIRST*(s)

but some definitions include the non-terminals).

FOLLOWER(N) is the set of symbols which can come after N in a

sentence. Find each occurrence of N on the right-hand side of

a rule, e.g.

M -@# ... | ... N ... | ...

If there is a symbol s immediately following N then add

FIRST*(s) to the result (again, we're only interested in the

terminal symbols in FIRST*(s)) if there is no symbol after N

in the alternative then add FOLLOWER(M) to the result (i.e. if

N can be the last symbol in an M then anything that can follow

M can also follow N).

If a grammar can generate the same sentence in multiple

different ways (with different parse tress) then it is

ambiguous. An ambiguity must start with a context clash (but

not all context clashes imply ambiguity). The context clash

occurs when trying to parse the first token of the phrase with

multiple parses - you will not be able to tell which

alternative to take. To see if a context clash is also a case

of ambiguity you would need to follow the alternatives

involved in each context clash to see if they can generate the

same complete sequence of tokens.