Generative phonology

Important: You must have installed the phonetic font "Charis SIL", or have tested your installation, to make sure that the phonetic characters display properly.

Jonathan Harrington

Recommended reading

Clark,  Yallop & Fletcher  (2007). Chapter 5.

Generative phonology and underlying representations

We have considered so far two different levels of abstraction in representing the sound structure of a language: a phonetic level of representation which includes aspects of pronunciation which are at least shared by a community/dialect group, and which above all includes the variability due to context effects; and a phonemic level of representation which is formed from a finite number of phonemic units and which factors out the contextual influences.

It has been a particular hallmark of a branch of phonology known as Generative Phonology, which came to prominence with Chomsky & Halle's (1968) Sound Pattern of English, to consider a more abstract representation, which we will call an underlying representation, that allows phonological rules and principles to be stated more transparently and economically.

Their emphasis in the Sound Pattern of English is to eliminate redundancy from phonological analyses. We already do this to a certain extent, of course, in representing words using the phonemic rather than a phonetic representation: that is, there are some aspects of pronunciation that are redundant (e.g. aspiration of oral stops in English) and so we factor out this redundancy and subsequently fill it in by rule. We therefore of course also necessarily end up with a considerably more abstract sound representation of the word (e.g. /pɪn/ rather than [pʰɪn]) i.e. one which is one stage further removed than a phonetic transcription from the actual details of the production of speech and how the vocal organs are coordinated.

In the Sound Pattern of English, one of the main aims is to factor out many more redundancies from the words' phonological representations and to fill in these redundancies by rule. This in turn results in a representation which is a good deal more abstract than the phonemic forms we have been considering. Furthermore, these highly abstract representations are presumed to form part of the talker's knowledge of the language: they might be how the words are encoded in the mental lexicon.

As an example, consider some data reported by Mohanan (1992) and discussed in Kenstowicz (1994). In Singapore English, as in many dialects of English, talkers simplify word-final consonant clusters by deleting the final stop. For example, words like 'task', 'lift', 'list' are produced as /tɑs/, /lɪf/, /lɪs/, i.e. without the final /k/ or /t/. Now consider the pluralisation rule (in all standard accents of English) in which an /əz/ suffix is added to words that end in /s/. For example, the plurals of 'class' and 'dress' are /klɑsəz/ and /drɛsəz/ in standard forms of English. The question is: how do those Singapore English talkers who regularly reduce consonant clusters form the plurals of words like 'task' and 'list'? Since their singular productions end in /s/ (/tɑs/, /lɪs/) and since the pluralisation rule requires an /əz/, they should produce /tɑsəz/ and /lɪsəz/ (analogous to 'classes' and 'dresses') if it is the phonemic /s/ which is responsible for plurals in /əz/.

But these Singapore English talkers do no such thing: their plural forms of 'tasks' and 'lists' are the same as their singular forms, i.e. /tɑs/ and /lɪs/, even though they do produce the plural forms /klɑsəz/ and /drɛsəz/ for 'classes' and 'dresses'. How can we explain this?

The Generative Phonology solution would be to propose that the underlying forms of 'task' and 'list' in this dialect of Singapore English are /tɑsk/ and /lɪst/ i.e. with a full consonantal cluster that is not pronounced. Or from another point of view, these talkers' mental representations of these words -- that is, the way in which these words are stored in their mental lexicons -- include phonological forms (final /sk/, final /st/) that are never pronounced (by them). Generative Phonologists would then model the actual pronunciations of the singular and plurals in terms of phonological rules, as follows:

                            'task'    'tasks'
underlying representation   tɑsk      tɑsk+s
cluster simplification      tɑs       tɑs+s
output (phonetic form)      tɑs       tɑs

Therefore, because Singapore English talkers 'know' that 'task' really has a final /k/ (even though they do not pronounce it), they do not add the plural suffix /əz/ (which is reserved for words like 'dance' which have a final /s/), but the plural suffix /s/, analogously to words that end in a final voiceless non-sibilant consonant ('tap', 'block' etc. which form plurals by adding /s/ thus /tæps/,/blɒks/ etc.).
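The argument above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the cited analyses: the function names and segment classes are hypothetical, the /z/ suffix for other final segments is not discussed in the text, and the merging of s+s into a single [s] is inferred from the output row of the table. The sketch contrasts choosing the plural suffix from the underlying form with choosing it from the pronounced singular.

```python
# Illustrative sketch only: names and segment classes are assumptions.
VOWELS = set("ɑɪɛæɒə")
STOPS = set("ptkbdɡ")
SIBILANTS = set("szʃʒ")
VOICELESS_NON_SIBILANTS = set("ptkfθ")


def plural_suffix(final_segment):
    """Choose the plural suffix from the final segment of a form."""
    if final_segment in SIBILANTS:
        return "əz"
    if final_segment in VOICELESS_NON_SIBILANTS:
        return "s"
    return "z"  # assumption: this case is not discussed in the text


def simplify_cluster(stem):
    """Delete a stem-final stop that follows another consonant."""
    if len(stem) >= 2 and stem[-1] in STOPS and stem[-2] not in VOWELS:
        return stem[:-1]
    return stem


def surface(stem, suffix=""):
    """Apply cluster simplification, then attach the suffix."""
    stem = simplify_cluster(stem)
    # Assumption: s+s surfaces as one [s], as in the table's output row.
    if suffix == "s" and stem.endswith("s"):
        suffix = ""
    return stem + suffix


def generative_plural(underlying):
    """Suffix chosen from the UNDERLYING final segment."""
    return surface(underlying, plural_suffix(underlying[-1]))


def phonemic_plural(underlying):
    """Rejected alternative: suffix chosen from the pronounced singular."""
    singular = surface(underlying)
    return singular + plural_suffix(singular[-1])


for ur in ["tɑsk", "lɪst", "klɑs", "drɛs", "tæp", "blɒk"]:
    print(ur, surface(ur), generative_plural(ur), phonemic_plural(ur))
```

For 'task' and 'list' the two functions disagree: only the version that consults the underlying final stop gives a plural identical to the singular, which is the pattern reported for these talkers.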

The significance of this example (and there are many others like it) lies not so much in the details but rather in the idea that the way in which the pronunciation of words is stored in the mental lexicon may be a good deal more abstract than the phonemic representation of the actual pronunciations would suggest. If we accept this premise, then we must also accept the idea that there are phonological rules that link these often highly abstract underlying forms to the phonetic forms (or 'surface' forms, to use terminology from Generative Phonology) because otherwise we cannot explain how underlying forms are related to pronunciation (this is exactly parallel to our earlier phonemic/phonetic distinction: once we represent words phonemically, we have to have rules that fill in the redundant or predictable aspects of pronunciation like aspiration; the difference in the Generative Phonology model is that the underlying forms that are being proposed are more abstract than phonemic forms -- resulting in many more rules to explain the predictable and redundant aspects of pronunciation -- and they lay much greater emphasis on the claim that these underlying forms are in some sense 'psychologically real' i.e. part of the talker's linguistic competence).

Examples of how underlying forms are reconstructed

Generative phonologists often use evidence from morphological alternations, which are a very productive source of sound variation, to reconstruct underlying forms. In English, underlying forms have been reconstructed from alternations such as opaque/opacity, electric/electricity, serene/serenity, digress/digression, confuse/confusion etc. Again, the argument is that these alternations include predictable aspects of pronunciation that can be succinctly expressed by applying phonological rules to abstract underlying forms. We will take a first example from Indonesian (discussed in Halle & Clements, 1983) to illustrate how underlying forms are reconstructed and the way in which phonological rules apply to derive the 'surface' phonetic forms.

In Indonesian, there is an alternation between a simple and a prefixed form. By studying these alternations, we have to try to come up with a plausible underlying form for the prefix and an appropriate set of rules. Here is the data:

Simple form   Prefixed form   Gloss
lempar məlempar 'throw'
rasa mərasa 'feel'
wakil məwakil 'represent'
jakin məjakin 'convince'
masak məmasak 'cook'
nikah mənikah 'marry'
ŋaco məŋaco 'chat'
ɲaɲi məɲaɲi 'sing'
hituŋ məŋhituŋ 'count'
ɡambar məŋɡambar 'draw a picture'
kirim məŋirim 'send'
dəŋar məndəŋar 'hear'
tulis mənulis 'write'
bantu məmbantu 'help'
pukul məmukul 'hit'
d͡ʒahit məɲd͡ʒahit 'sew'
t͡ʃatat məɲt͡ʃatat 'note down'
ambil məŋambil 'take'
isi məŋisi 'fill up'
undaŋ məŋundaŋ 'invite'

We can see that there is some variation in the phonetic shape of the prefix. For example, we have [mə] in the first eight forms, [məɲ] for 'sew' and 'note down', [məm] for 'help' and 'hit', [məŋ] for many other forms and so on. We can also note that the phonetic shape of the simple form is usually copied in the prefixed form (e.g. [lempar] and [məlempar]), but not always (e.g. [pukul] but [məmukul]). Our task is to see if we can explain this variation in terms of phonological rules and come up with a single underlying representation for the prefix to reflect the fact that we are dealing with the same prefix in all cases, even though its phonetic shape does vary.

As with the problems in complementary distribution, we must look for some pattern that might explain the variation in the shape of the prefix. The most striking feature seems to be that it is affected by the place of articulation of the following consonant. For example:

Initial consonant of the stem   Final consonant of the prefix   Example
bilabial                        bilabial                        məmbantu
alveolar                        alveolar                        məndəŋar
palato-alveolar                 palatal                         məɲd͡ʒahit
velar                           velar                           məŋɡambar

This is a rule of anticipatory assimilation which occurs in many languages: a final consonant changes to the place of articulation of a following consonant. This happens in English as well in some contexts. For example, at faster rates of speech, we can get [siːm meɪbl] for 'seen Mable', [siːŋ kærən] for 'seen Karen' etc.

We need at least one more rule to explain why the first consonant of the simple form is deleted when the prefix is added in 'send', 'write' and 'hit'. Do the first consonants of these words have anything in common? Yes, they are voiceless stops. Perhaps, then, when the simple form begins with a voiceless stop, that stop is deleted in the prefixed form. To verify this assumption, we have to look at all words to make sure that none of them begin with a voiceless stop that is not deleted when the prefix is added. An inspection of this data shows that only 'send', 'write' and 'hit' begin with voiceless stops and in all three cases these are deleted in the prefixed form. We can therefore tentatively propose a second rule, a deletion rule, that will delete voiceless stops in the prefixed form.

The other two prefix shapes that are not accounted for are /mə/ and /məŋ/. We get /mə/ + the simple form in the first eight words (down to 'sing'). The fact that we have already demonstrated an assimilation rule in which the prefix is affected by the following consonant might lead us to look for some pattern in the following consonant to explain the occurrence of [mə]. The initial consonants of these words are [l r w j m n ŋ ɲ]. Do these have anything in common? They are all non-syllabic sonorant sounds i.e. [-syll, +son]. Is it therefore the case that we get [mə] whenever the simple form begins with a non-syllabic sonorant? An inspection of all the other forms shows that this is indeed the case. Finally, leaving aside 'send' and 'draw a picture', whose prefix can be explained by the anticipatory assimilation rule given earlier, we get [məŋ] in 'count' and in the last three words, 'take', 'fill up', and 'invite'. These do not seem to have much in common as far as the initial sound of the simple form is concerned. However, the last three all begin with a vowel in the simple form ([ambil], [isi], [undaŋ]), so we can at least propose that [məŋ] occurs when the simple form begins with a vowel (and this is consistent with the rest of the data).

We have suggested various rules in the preceding analysis. We now have to try to reconstruct an underlying form that the rules will apply to. There will always be more than one possible way of reconstructing an underlying form (we have no way by looking at phonetic data to know what underlying form might be in a talker's head); so when there is more than one possibility, we have to argue for one over the other. The way we might do this for the present data is as follows. (But we do not want to stray from the idea that there will be ONE underlying form for the different shapes of the prefix to express the idea that we are basically dealing with the same prefix, at least in terms of morphology).

One possibility is to pick one of the possible shapes of the prefix as the underlying form. Let's pick /məm/. In this case, we would be proposing that the underlying prefixed forms all begin with /məm/ and that we derive the surface forms by rule. For example, for the first word we would have UR (underlying representation) /məmlempar/, and a deletion rule (see above) which would discard the /m/ of the prefix to derive [məlempar]. There would also have to be an assimilation rule to change our proposed UR /məmɡambar/ into [məŋɡambar]. However, when we come to 'count', and the last three words 'take', 'fill up', and 'invite' that all begin with a vowel, we have a problem. There is certainly nothing to stop us writing a rule that will change /məmisi/ into [məŋisi], but this rule has no phonetic basis to it at all (and is therefore not well motivated). Specifically, why should a bilabial /m/ change into a velar [ŋ] before a high front vowel? And above all, why should it change into a velar [ŋ] before /h/ ('count') and before three vowels of different height and backness (the last three words) that seem to share no phonetic features? We have no answer and therefore we should look for a different solution.

What about if we propose /mə/ as the UR of the prefix? In this case, the UR and surface forms of the first eight words would be the same (no rules needed) and we would have to insert a final nasal in all of the other cases. But we quickly see that we run into the same problem as before. Although we could propose a well motivated rule that would insert a nasal consonant at the same place of articulation as the initial consonant in the simple form (i.e. insert /m/ in the proposed UR /məbantu/ to derive the phonetic form [məmbantu] etc.), we have no answer to why /ŋ/ should be inserted in the proposed UR /məisi/ to derive [məŋisi].

Let's try a different tack. The UR of words should consist of segments that cannot be explained or reduced further by phonological rules. For example, there's probably no reason to suppose that the UR of the English word 'pick' is anything other than /pɪk/. We can't explain the /ɪ/ vowel by the presence of the neighbouring consonants (whereas we can explain the presence of the /m/ in 'impatient' by the presence of the following bilabial /p/) and there's no reason to suppose we can exploit any redundancy in the UR of 'pick' by looking at the morphological relatives ('pick', 'picked', 'picking' etc.). So since we can't explain away any of the three units in /pɪk/ by rule, they must form part of the word's underlying representation in the (mental) lexicon.

Similarly, since we can't account for the existence of /ŋ/ in some words, let's propose that it forms part of the UR of the prefix. In other words, if we choose a UR of /məŋ/ as the prefix, we have no explaining to do (no rules to write) for 'count', 'take', 'fill up' or 'invite': their URs and phonetic forms are the same.

Now let's recap on the rules that we need to derive the other surface forms.

  1. an assimilation rule: assimilate the final nasal consonant of the prefix to the place of articulation of the following consonant
  2. deletion rule A: delete voiceless stops after the final nasal consonant of the prefix
  3. deletion rule B: delete the final nasal consonant of the prefix before a non-syllabic sonorant.

Let's see if these rules generate the appropriate surface phonetic representations of the prefixed forms. Here are some examples.

UR       məŋlempar
rule 3   məlempar

UR       məŋŋaco
rule 3   məŋaco

UR       məŋɡambar
rule 1   no change (the assimilation rule applies vacuously)

UR       məŋtulis
rule 1   məntulis
rule 2   mənulis

We can see from the last example that the derivation of the surface phonetic form may require the application of more than one rule, and also that the rules have to be ordered. Consider what the result would be if we applied rule 2 first:

UR: məŋtulis
rule 2 məŋulis
rule 1 (cannot apply -- the voiceless stop has been deleted)

i.e. the incorrect phonetic form [məŋulis] is derived.
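The three rules and their ordering can be checked mechanically against the whole data set. The following Python sketch is one possible encoding, not a definitive implementation: the function names, the table of homorganic nasals and the treatment of affricates as single segments joined by a tie bar are all assumptions made for illustration. It applies the rules in the order 1-2-3 to the proposed UR /məŋ/ + simple form.

```python
TIE = "\u0361"  # tie bar joining the two halves of an affricate
DZH, TSH = "d" + TIE + "ʒ", "t" + TIE + "ʃ"

# Assumed lookup: the nasal homorganic with each stem-initial consonant.
NASAL_FOR = {
    "p": "m", "b": "m", "m": "m",
    "t": "n", "d": "n", "n": "n", "l": "n", "r": "n",
    TSH: "ɲ", DZH: "ɲ", "ɲ": "ɲ",
    "k": "ŋ", "ɡ": "ŋ", "ŋ": "ŋ",
}
VOICELESS_STOPS = {"p", "t", "k"}
SONORANT_CONSONANTS = {"l", "r", "w", "j", "m", "n", "ŋ", "ɲ"}


def segments(word):
    """Split a transcription into segments, keeping affricates whole."""
    segs, i = [], 0
    while i < len(word):
        if i + 2 < len(word) and word[i + 1] == TIE:
            segs.append(word[i:i + 3])
            i += 3
        else:
            segs.append(word[i])
            i += 1
    return segs


def assimilate(prefix, stem):      # rule 1
    return prefix[:-1] + NASAL_FOR.get(stem[0], prefix[-1]), stem


def delete_stop(prefix, stem):     # rule 2 (deletion rule A)
    return (prefix, stem[1:]) if stem[0] in VOICELESS_STOPS else (prefix, stem)


def delete_nasal(prefix, stem):    # rule 3 (deletion rule B)
    if stem[0] in SONORANT_CONSONANTS:
        return prefix[:-1], stem
    return prefix, stem


def derive(simple, order=(assimilate, delete_stop, delete_nasal)):
    """Derive the prefixed surface form from UR /məŋ/ + simple form."""
    prefix, stem = "məŋ", segments(simple)
    for rule in order:
        prefix, stem = rule(prefix, stem)
    return prefix + "".join(stem)


DATA = [
    ("lempar", "məlempar"), ("rasa", "mərasa"), ("wakil", "məwakil"),
    ("jakin", "məjakin"), ("masak", "məmasak"), ("nikah", "mənikah"),
    ("ŋaco", "məŋaco"), ("ɲaɲi", "məɲaɲi"), ("hituŋ", "məŋhituŋ"),
    ("ɡambar", "məŋɡambar"), ("kirim", "məŋirim"), ("dəŋar", "məndəŋar"),
    ("tulis", "mənulis"), ("bantu", "məmbantu"), ("pukul", "məmukul"),
    (DZH + "ahit", "məɲ" + DZH + "ahit"), (TSH + "atat", "məɲ" + TSH + "atat"),
    ("ambil", "məŋambil"), ("isi", "məŋisi"), ("undaŋ", "məŋundaŋ"),
]

mismatches = [(s, p, derive(s)) for s, p in DATA if derive(s) != p]
print("mismatches:", mismatches)
print(derive("tulis", order=(delete_stop, assimilate, delete_nasal)))
```

Passing a different `order` reproduces the misordering discussed above: with rule 2 before rule 1, 'write' comes out with the unassimilated velar nasal.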

Finding patterns of alternation

Setting up an underlying form from phonetic data requires some practice in formulating phonological rules. This section provides an example of developing a set of phonological rules without explicitly setting up any underlying forms.


The data below is from Tagalog and shows an alternation between a simple form and a suffixed form that occurs before the suffix [an] (i.e. the suffixed form of 'open' is [buksan]). What phonological rules would you need to convert the simple form into the suffixed form?

Simple form   Suffixed form   Gloss
bukas buks 'open'
kapit kapt 'embrace'
posod pusd 'tuft'
bata bath 'suffer'
polo pulh 'ask for trifles'
baniɡ baŋɡ 'mat'
tanim tamn 'plant'
talab tabl 'penetrate'
opos ups 'stop'
bili bilh 'buy'
dipa diph 'open'
damit damt 'clothe'
ɡanap ɡamp 'fulfil'
atip apt 'thatching'
laman lamn 'fill'

From the first three forms, and indeed many others, we can see that the vowel in the second syllable is deleted in the suffixed form. For example:

bukas + an becomes buksan

Secondly, there seems to be an alternation between [o] in the simple form and [u] in the suffixed form. This seems to be entirely 'context-free', i.e. [o] changes obligatorily to [u] regardless of the context (or at least that's all we can say from the information that we have). So applying the deletion rule above and then this rule to [opos], we get:

opos   simple form
ops    deletion rule
ups    vowel raising rule (i.e. raise the phonetic height of [o])

Thirdly, the alternation of [baniɡ] and [baŋɡ] suggests an assimilation rule. Is it generally the case that [n] assimilates to the place of articulation of the following consonant (after vowel deletion has taken place)? This does happen in [ɡanap] which becomes [ɡamp] (i.e. [p] is bilabial so [n] assimilates to the same place of articulation and becomes a bilabial as well), but it doesn't happen in [tanim] which alternates with [tamn]. It would seem from this limited data that the assimilation rule must be highly context-sensitive: i.e. [n] only assimilates when the following sound is an oral stop (so such a rule will apply to [baniɡ] and [ɡanap], because, after vowel deletion, [n] precedes [ɡ] or [p] which are both oral stops, but it won't apply to [tanim] because [m] is a nasal stop).

Fourthly, some of the consonants in some forms undergo metathesis in which they swap places after vowel deletion (this is in fact quite common in morphological alternations in many languages). For example, we have:

atip simple form
atp vowel deletion
apt metathesis of the last two consonants

A closer inspection shows that metathesis occurs whenever an alveolar consonant immediately precedes a bilabial consonant. For example:

talab   simple form
talb    vowel deletion: [l] is alveolar, [b] is bilabial
tabl    metathesis

tanim   simple form
tanm    vowel deletion: [n] is alveolar, [m] is bilabial
tamn    metathesis

Finally, we need a rule to deal with alternations such as [bili]/[bilh] in which the suffixed form has an [h]. Those forms that do have an [h] in the suffix all end in a vowel in the simple form ([bata],[polo],[bili],[dipa]). One possibility is that [h] is inserted when the suffix is added (this is a kind of liaison rule, similar in nature to the insertion of [t] in the interrogative French 'a-t-elle?' that is derived from 'elle a' meaning 'she has'). Specifically, we can propose:

bili simple form
bilih h-insertion
bilh vowel deletion (see above)

We therefore have five rules that apply when the suffix [an] is added:

  1. delete the vowel in the final syllable of the simple form
  2. [o] becomes [u]
  3. [n] assimilates to the place of articulation of a following oral stop
  4. when an alveolar precedes a bilabial they metathesise
  5. insert a final [h] if the simple form ends in a vowel

Clearly, rule (1) must apply before both rule (3) and rule (4): if we did not delete the vowel first (rule 1), there would be no context for either rule (3) or rule (4) to apply. For example, this is the outcome of ordering rule (3) before rule (1):

ɡanap   simple form: rule (3) doesn't apply because [n] and [p] aren't adjacent
ɡanp    by rule (1): this is not the correct suffixed form

We must also apply the assimilation rule (3) before the metathesis rule (4). Otherwise, we would again derive an incorrect result:

ɡanap simple form
ɡanp rule 1
ɡapn rule 4 metathesis

i.e. we would have [ɡapnan]. Finally, rule (5) must apply before rule (1) to derive the correct suffixed forms with [h]. If rule (1) applied first, we would have:

bili simple form
bil by rule 1

and then no context for the [h]-insertion rule to apply (because [h] is only inserted after a vowel), thus incorrectly deriving the suffixed form [bilan].

So in summary, the order of the rules is 5-1-3-4; rule (2) being a context-insensitive rule is free to apply at any stage in the derivation of the suffixed forms.
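The ordering 5-1-3-4 can be tested against all fifteen forms with a short Python sketch. It is an illustration under assumptions rather than a definitive implementation: the function names and segment classes are hypothetical, rule (1) is encoded as deletion of the last vowel of the form, and rule (2) is applied at the end since the text leaves its position free. The output is the stem as listed in the table, i.e. without the suffix [an].

```python
VOWELS = set("aiou")
ALVEOLARS = set("tdnls")
BILABIALS = set("pbm")
# Assumed lookup: what [n] becomes before each oral stop (rule 3).
NASAL_BEFORE = {"p": "m", "b": "m", "t": "n", "d": "n", "k": "ŋ", "ɡ": "ŋ"}


def insert_h(w):             # rule 5
    return w + "h" if w[-1] in VOWELS else w


def delete_last_vowel(w):    # rule 1
    for i in range(len(w) - 1, -1, -1):
        if w[i] in VOWELS:
            return w[:i] + w[i + 1:]
    return w


def assimilate_n(w):         # rule 3
    out = list(w)
    for i in range(len(out) - 1):
        if out[i] == "n" and out[i + 1] in NASAL_BEFORE:
            out[i] = NASAL_BEFORE[out[i + 1]]
    return "".join(out)


def metathesise(w):          # rule 4
    for i in range(len(w) - 1):
        if w[i] in ALVEOLARS and w[i + 1] in BILABIALS:
            return w[:i] + w[i + 1] + w[i] + w[i + 2:]
    return w


def raise_o(w):              # rule 2 (context-free)
    return w.replace("o", "u")


DEFAULT_ORDER = (insert_h, delete_last_vowel, assimilate_n, metathesise)


def derive(simple, order=DEFAULT_ORDER):
    """Derive the suffixed stem (without [an]) from the simple form."""
    form = simple
    for rule in order:
        form = rule(form)
    return raise_o(form)


DATA = [
    ("bukas", "buks"), ("kapit", "kapt"), ("posod", "pusd"),
    ("bata", "bath"), ("polo", "pulh"), ("baniɡ", "baŋɡ"),
    ("tanim", "tamn"), ("talab", "tabl"), ("opos", "ups"),
    ("bili", "bilh"), ("dipa", "diph"), ("damit", "damt"),
    ("ɡanap", "ɡamp"), ("atip", "apt"), ("laman", "lamn"),
]

mismatches = [(s, x, derive(s)) for s, x in DATA if derive(s) != x]
print("mismatches:", mismatches)

# The two misorderings discussed in the text:
print(derive("ɡanap", (insert_h, delete_last_vowel, metathesise, assimilate_n)))
print(derive("bili", (delete_last_vowel, insert_h, assimilate_n, metathesise)))
```

Reordering the tuple passed as `order` is enough to reproduce the incorrect stems behind [ɡapnan] and [bilan].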

Underlying forms and phonological rules


The following shows phonetic data of alternations in Zoque (a language of Mexico) between the plural form, the 1st person singular possessive ('my') form of the plural, and the 3rd person singular possessive ('his') form of the plural. (a) Set up an underlying form for each word and for the affixes 'my' and 'his', and (b) describe the phonological rules that are necessary to link the underlying forms to the phonetic data.

Plural form   'my' form   'his' form   Gloss
pama          mbama       pjama        'clothes' ('the clothes' / 'my clothes' / 'his clothes')
tatah         ndatah      t͡ʃatah       'father'
tuwi          nduwi       t͡ʃuwi        'dog'
kaju          ŋɡaju       kjaju        'horse'
t͡sin          nd͡zin       t͡ʃin         'pine'
mok           mok         mjok         'corn'
ʔatsi         ʔatsi       ʔjatsi       'brother'

The way to begin such problems is firstly to look across the rows (at the paradigmatic variations of each word) and make a few mental notes of the alternations. For example, in the first row we note that [pama] is prefixed by [m] in 'my' and infixed with [j] for 'his'. When you have looked carefully across the rows, compare the rows (words) with each other to see if you can find evidence of a similar kind of alternation that you can explain by rule. For example: does a similar kind of prefixing and infixing occur for the other forms? This is so for some of the words (e.g. [kaju], [ŋɡaju], [kjaju]), but the phonetic form of the 'my' prefix changes from word to word (and does not appear in 'corn' or 'brother'). Also, the [j] does not appear in the 'his' forms of 'father', 'dog', 'pine' and the initial consonants of these words have turned into a palato-alveolar affricate.

Let's begin more systematically with the 'my' forms. The fact that there is an alternation between [m], [n], [ŋ] before a following consonant at the same place of articulation suggests an assimilation rule. There seems to be no basis from the data for choosing either /m/, /n/, or /ŋ/ as the UR of the 'my' prefix. So let's choose /n/ on the grounds that it is prone to assimilation in many languages (e.g. 'seen Mable' and 'seen Karen' in which /n/ assimilates to /m/ and /ŋ/ respectively at faster rates of speech in English).

The assimilation rule applies appropriately to the first five forms (vacuously in fact to 'father', 'dog', 'pine' in which /n/ 'changes to' [n]), but we have to deal with 'corn' and 'brother' in which the prefix seems to have disappeared. We can handle these cases with a deletion rule. Firstly, we can say that when two nasals at the same place of articulation follow each other, one of them is deleted (this is appropriate because perceptually there is unlikely to be a lot of difference between [mmok] and [mok]). In other words, by the assimilation rule we get from /nmok/ to /mmok/ and then one of the nasals is deleted to produce the phonetic form [mok]. Secondly, since it's impossible to assimilate the /n/ to a glottal place of articulation ('brother'), we shall say that /n/ is deleted before glottal stops. The rules we have so far, then are:

  1. assimilate /n/ to the place of articulation of a following consonant
  2. delete one of two homorganic nasal consonants (homorganic = at the same place of articulation)
  3. delete /n/ when it precedes a glottal stop

We still have to deal with the alternations of [p] and [b] ('clothes'), [t] and [d] ('father', 'dog'), [k] and [ɡ] ('horse'), and [t͡s] and [d͡z] ('pine'). This is clearly an alternation in the voicing of the stop or affricate. The question is: which one do the words have as part of their UR? The answer is that it must be the voiceless one, and for two reasons. Firstly, the voiceless stop/affricate also appears in the 'his' forms and secondly it would be an odd rule indeed if voiced sounds became voiceless after the attachment of the voiced /n/ prefix in the 'my' forms! So we need another rule:

  4. voiceless consonants become voiced after /n/

Focusing now on the 'his' forms, the data suggests that 'his' is /j/ in Zoque and that this prefix metathesises with the first consonant. In other words, we are proposing that /j + pama/ becomes /pjama/ (analogously 'horse', 'corn' and 'brother'). We still have to account for the 'his' forms with /t͡ʃ/ in 'father', 'dog', 'pine'. This is more than likely another assimilation rule in which /tj/ turns into an affricate (another common occurrence: consider in English the alternation between [dɪdjʉ] and [dɪd͡ʒʉ] 'did you' or [tjʉn] and [t͡ʃʉn] for 'tune' etc.).

So in summary, the URs for the words are exactly as represented in the plural forms; the UR for 'my' is /n/ and the UR for 'his' is /j/. We then have 6 rules:

  1. assimilate /n/ to the place of articulation of a following consonant
  2. delete one of two homorganic nasal consonants (homorganic = at the same place of articulation)
  3. delete /n/ when it precedes a glottal stop
  4. voiceless consonants become voiced after /n/
  5. metathesise /j/ and a following consonant
  6. alveolars assimilate to palato-alveolars before /j/ (and delete the /j/)

For example, the derivation of 'his pine' is as follows:

/j t͡sin/ UR
t͡sjin rule 5
[t͡ʃin] rule 6
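The six rules can be checked against the full paradigm with a short Python sketch. It is an illustration under assumptions, not a definitive implementation: the function names are hypothetical, the rules are simply applied in the order 1 to 6 (the ordering of rules 1 to 4 is not argued for above), the prefix rules are restricted to the first two segments of the form, and rule 4 is applied after any prefix nasal because assimilation has already changed /n/ by that point.

```python
TIE = "\u0361"  # tie bar joining the two halves of an affricate
TS, DZ, TSH = "t" + TIE + "s", "d" + TIE + "z", "t" + TIE + "ʃ"

NASAL_FOR = {"p": "m", "m": "m", "t": "n", TS: "n", "k": "ŋ"}
VOICED = {"p": "b", "t": "d", "k": "ɡ", TS: DZ}
NASALS = {"m", "n", "ŋ"}
VOWELS = set("aiou")


def segments(word):
    """Split a transcription into segments, keeping affricates whole."""
    segs, i = [], 0
    while i < len(word):
        if i + 2 < len(word) and word[i + 1] == TIE:
            segs.append(word[i:i + 3])
            i += 3
        else:
            segs.append(word[i])
            i += 1
    return segs


def assimilate(s):             # rule 1
    if len(s) > 1 and s[0] == "n" and s[1] in NASAL_FOR:
        return [NASAL_FOR[s[1]]] + s[1:]
    return s


def degeminate(s):             # rule 2
    return s[1:] if len(s) > 1 and s[0] in NASALS and s[0] == s[1] else s


def delete_before_glottal(s):  # rule 3
    return s[1:] if len(s) > 1 and s[0] == "n" and s[1] == "ʔ" else s


def voice(s):                  # rule 4
    if len(s) > 1 and s[0] in NASALS and s[1] in VOICED:
        return [s[0], VOICED[s[1]]] + s[2:]
    return s


def metathesise(s):            # rule 5
    if len(s) > 1 and s[0] == "j" and s[1] not in VOWELS:
        return [s[1], "j"] + s[2:]
    return s


def palatalise(s):             # rule 6
    if len(s) > 1 and s[0] in ("t", TS) and s[1] == "j":
        return [TSH] + s[2:]
    return s


RULES = (assimilate, degeminate, delete_before_glottal,
         voice, metathesise, palatalise)


def derive(prefix, stem):
    """Derive a surface form from UR prefix + stem ('' for no prefix)."""
    segs = segments(prefix + stem)
    for rule in RULES:
        segs = rule(segs)
    return "".join(segs)


DATA = [  # (plural form, 'my' form, 'his' form)
    ("pama", "mbama", "pjama"), ("tatah", "ndatah", TSH + "atah"),
    ("tuwi", "nduwi", TSH + "uwi"), ("kaju", "ŋɡaju", "kjaju"),
    (TS + "in", "n" + DZ + "in", TSH + "in"), ("mok", "mok", "mjok"),
    ("ʔatsi", "ʔatsi", "ʔjatsi"),
]

for stem, my, his in DATA:
    print(stem, derive("n", stem) == my, derive("j", stem) == his)
```

Each row prints the stem followed by whether the derived 'my' and 'his' forms match the data.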


  • Chomsky, N. and Halle, M. (1968) The Sound Pattern of English. New York: Harper & Row.
  • Halle, M. and Clements, G. (1983) Problem Book in Phonology. Cambridge MA: MIT Press.
  • Kenstowicz, M. (1994) Phonology in Generative Grammar. Cambridge MA: Blackwell.
  • Mohanan, K.P. (1992) "Describing the phonology of non-native varieties of a language", World Englishes, 11, 111-128.

Content owner: Department of Linguistics Last updated: 12 Mar 2020 12:17pm
