Sentence Production I. Introduction

(Introduction to) Language History and Use – Psycholinguistics

Sentence Production

Rachael-Anne Knight


This lecture covers how a spoken sentence is produced from the formation of an idea in the speaker’s mind to the moment before it is articulated. We will discuss the processes involved and the methods by which these can be examined.

A.Why is sentence production interesting?

The storage space of the brain is finite. This means that it cannot store the infinite number of sentences that we might ever need to produce. It follows that we must construct sentences from smaller parts or units before we are able to say them. The main issues therefore concern the processes by which units are selected and then combined in a particular order.

B.Processes of Speech Production (after Levelt 1989)

Figure 1 The Processes of Speech Production

The three main areas of speech production are:


1.Conceptualisation

The speaker must decide on the message to be conveyed. Very little is known about this stage. The end point is a stage at which the message itself has been decided but it has no linguistic form. It is also called the preverbal message or the message level of representation. This stage is often represented by a thought bubble.


2.Formulation

The speaker must convert their message into a linguistic form. This stage involves:

  • Lexicalisation – selecting the appropriate word

  • Syntactic planning – putting the words in the right order and adding grammatical elements.

3.Articulation / Execution

The speaker must plan the motor movements needed to convey the message.

C.Where does our evidence come from?

It’s hard to study speech production as it’s very difficult to get inside someone’s head as they plan a sentence.

1.Normal speech

  • Speech errors

  • Dysfluencies

2.‘Lab speech’

  • Speech errors

  • Dysfluencies

3.Non-normal speech

  • Aphasic speech

II.Speech Errors (Slips of the tongue)

These are the types of errors that are relatively common in normal speech production. Errors are categorised by the mechanism and the unit involved in the error.


A.Deletions

A unit is missed out from the intended target.

1. The chimney catches fire → The chimney catch fire (affix deletion)

2. Background lighting → Backgound lighting (phoneme deletion)


B.Perseverations

A unit occurs both in the right place and later in the utterance.

3. A phonological rule → A phonological fool (phoneme /f/ perseveration)


C.Anticipations

A unit occurs both in the right place and earlier in the utterance.

4. A reading list → A leading list (phoneme /l/ anticipation)


D.Exchanges

Two units are swapped over.

5. Do you feel really bad → Do you reel feally bad (phoneme/onset exchange)

6. Guess whose name came to mind → Guess whose mind came to name (word exchange)

7. I sampled some randomly → I randomed some samply (morpheme exchange)


E.Blends

Two units are combined.

8. The children / young of today → The chung of today (word blend)

9. Miss you very much / a great deal → Miss you a very much (phrase blend)


F.Substitutions

A word is substituted for a different word.

10. Give me a spoon → Give me a fork

11. I think they are equivalent → I think they are equivocal

12. Get me the catalogue → Get me the calendar

G.Cognitive Intrusions

Units from outside the message level are inserted into the utterance.

13. I’ve read all my library books → I’ve eaten all my library books (produced when the speaker was hungry)

14. Get out of the car → Get out of the clark (produced when the speaker was looking at a shop called Clark’s)
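
The mechanisms described in this section can be sketched as simple operations on a sequence of units. This is only an illustration, not a model of production: the function names are made up for this sketch, units are plain strings, and the spelling chunks in the second example stand in loosely for phonemes.

```python
# Illustrative sketch: each error mechanism as an operation on a list of units.
# Units are plain strings; positions are list indices (both are assumptions).

def delete(units, i):
    """Deletion: the unit at position i is missed out."""
    return units[:i] + units[i + 1:]

def perseverate(units, src, dst):
    """Perseveration: the unit at src also appears LATER, replacing the unit at dst."""
    if not src < dst:
        raise ValueError("perseveration copies a unit to a later position")
    out = list(units)
    out[dst] = units[src]
    return out

def anticipate(units, src, dst):
    """Anticipation: the unit at src also appears EARLIER, replacing the unit at dst."""
    if not dst < src:
        raise ValueError("anticipation copies a unit to an earlier position")
    out = list(units)
    out[dst] = units[src]
    return out

def exchange(units, i, j):
    """Exchange: the units at positions i and j swap places."""
    out = list(units)
    out[i], out[j] = out[j], out[i]
    return out

# Word exchange (example 6): 'name' and 'mind' swap.
words = "guess whose name came to mind".split()
print(" ".join(exchange(words, 2, 5)))

# Onset anticipation (example 4), using spelling chunks in place of phonemes.
chunks = ["r", "eading", " ", "l", "ist"]
print("".join(anticipate(chunks, 3, 0)))
```

The same operation applies at different levels depending on what counts as a unit (word, morpheme, onset, phoneme), which is why errors are categorised by both mechanism and unit.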

III.Hesitation analysis

We pause frequently while we speak. These pauses may be periods of silence (unfilled pauses), or they may contain repetitions or items such as ‘umm’ or ‘I mean’ (filled pauses).

A.Pauses before words

  • These pauses seem to be related to retrieving individual words.

  • They occur more frequently and are longer before words that are less predictable.

  • During such pauses people often make appropriate hand gestures that describe the word they are about to say.

  • Such pauses are sometimes described as difficulties in microplanning.

1.Tip-of-the-tongue state (TOT)

This state is an extreme version of a microplanning pause. The speaker knows they know what the word is (they have a ‘feeling of knowing’) and can provide semantic information about it but cannot remember the exact phonological form. Speakers may know some information about the phonological form (such as first sound or number of syllables) or produce interlopers (near phonological neighbours).

B.Pauses for sentence planning

  • These pauses seem to be related to planning the syntactic and semantic content of speech.

  • There are fluent and hesitant phases of production.

  • There are more and longer pauses in the hesitant phases.

  • There are more of these pauses if the task is difficult or there is a high cognitive load.

  • These pauses are sometimes described as difficulties in macroplanning.

IV.Syntactic Planning

When we speak we must put our words in a certain order and add grammatical elements to our utterance.

A.What evidence must models of syntactic planning account for?

Look at the famous error below:

15. A weekend for ˈmaniacs → A maniac for ˈweekends

Three things to note:

a. The position of stress was unchanged: the primary stress and the nuclear accent remained on the final word. This suggests that prosody is generated independently of the words themselves.
b. The plural morpheme has stayed at the end of the utterance rather than moving with the stem ‘maniac’. We say the morpheme has been stranded. This suggests that content words and function words (including affixes) are accessed and processed separately.
Another indication that content and function words are processed differently is that they show different patterns of exchange errors. Word exchanges aren’t constrained by distance whereas sound exchanges are. Furthermore, words tend to exchange with others of the same syntactic class.
c. The plural morpheme is pronounced as /z/. This is appropriate for the phonological environment of ‘weekend’ rather than ‘maniac’. We say it shows phonological accommodation to the environment. This suggests that the phonological form of function words is specified after that of content words.

B.Garrett’s model of syntactic planning

Garrett proposed a model of syntactic planning based primarily on data from speech errors like the ones above.

1.Main features of Garrett’s model

  • Processing is serial, that is to say that information can only flow one way.

  • There are two main stages, functional and positional.

  • Content and function words are selected at different stages.

Message Level

Form abstract semantic specification and assign syntactic functions

Functional Level

Subject = Verb = Object =

Generate syntactic frame

(Det) N1 V [+PAST] (Det) N2 [+PLURAL]

Retrieve phonological forms of content words

// // //

Slot phonological forms into syntactic frame

Positional Level

(Det) // // [+PAST] (Det) // [+PLURAL]

Specify phonological forms of function words and affixes

Sound Level

// // // // //

Articulatory Instructions
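
The serial flow above can be illustrated with a minimal sketch, applied to error 15. Everything here is an assumption made for illustration: the frame layout, the role names `head` and `modifier`, the tiny `FINAL_SOUND` lexicon, and the `stem+/affix/` output notation are not part of Garrett's model. The point is only that if stems are slotted into a frame that already carries the plural feature, and the affix is spelled out afterwards, then an exchange at the slotting step produces both stranding and accommodation.

```python
# Hypothetical mini-lexicon: stem -> final sound (broad transcription).
FINAL_SOUND = {"maniac": "k", "weekend": "d"}
VOICELESS = {"p", "t", "k", "f", "θ"}
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}

# Syntactic frame: function words and the PLURAL feature belong to the frame,
# not to the content words that will fill its slots.
FRAME = [
    {"function": "a"},
    {"role": "head"},
    {"function": "for"},
    {"role": "modifier", "affix": "PLURAL"},
]

def plural_allomorph(stem):
    """Choose the plural form from the final sound of whichever stem is in the slot."""
    final = FINAL_SOUND[stem]
    if final in SIBILANTS:
        return "ɪz"
    return "s" if final in VOICELESS else "z"

def slot_content_words(roles, slip=False):
    """Positional level: insert stems into the frame.

    With slip=True the two stems are exchanged; the frame itself, including
    the PLURAL feature, stays where it is (stranding).
    """
    if slip:
        roles = {"head": roles["modifier"], "modifier": roles["head"]}
    filled = []
    for slot in FRAME:
        if "role" in slot:
            filled.append({"stem": roles[slot["role"]], "affix": slot.get("affix")})
        else:
            filled.append(dict(slot))
    return filled

def specify_function_forms(filled):
    """Sound level: spell out function words and affixes after stems are in place."""
    out = []
    for item in filled:
        if "function" in item:
            out.append(item["function"])
        elif item["affix"] == "PLURAL":
            out.append(f"{item['stem']}+/{plural_allomorph(item['stem'])}/")
        else:
            out.append(item["stem"])
    return " ".join(out)

def produce(roles, slip=False):
    """Run the stages strictly in order; nothing flows backwards."""
    return specify_function_forms(slot_content_words(roles, slip))

roles = {"head": "weekend", "modifier": "maniac"}  # output of the functional level
print(produce(roles))             # a weekend for maniac+/s/
print(produce(roles, slip=True))  # a maniac for weekend+/z/
```

Because the affix is specified only after the stems are slotted in, the plural surfaces as /z/ after ‘weekend’ even though it was intended for ‘maniac’, where it would have been /s/.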

2.How well does Garrett’s model account for the speech error evidence?

a)Different processing of content and function words

  • Content words are processed at the functional stage whereas function words are not selected until the positional stage.

  • Words can exchange over large distances because they are retrieved before their position is established. Sounds are not specified until after the positional level, which constrains the distance of their exchanges.

  • The tendency for words to exchange with others of the same class can be attributed to errors made when slotting the forms into the syntactic frame.

b)Late phonological specification of function words

  • The phonological form of function words is specified after that of content words.

c)Blends and Cognitive Intrusions

  • Because the model is serial and modular it can’t explain the existence of phrase blends such as example 9 or cognitive intrusions such as examples 13 and 14.


V.Lexicalisation

Lexicalisation is the process of turning the semantic representation of words into the phonological specification. In Garrett’s model this process is not specified in detail. We know that we retrieve the phonological forms of content words between the functional and positional levels but not exactly how this happens.

A.What speech evidence must a model of lexicalisation account for?

1.Speech Errors

There are distinct types of substitution errors:

  • Semantic Substitutions

10. Give me a spoon → Give me a fork

  • Phonologically Related Substitutions (Malapropisms)

11. I think they are equivalent → I think they are equivocal

  • Mixed Errors

12. Get me the catalogue → Get me the calendar

Mixed errors occur more often than would be predicted by chance.

2.Hesitations and TOTs

Speakers can have access to semantic information without having access to the phonological specification. For example, they can make appropriate hand gestures during microplanning pauses and may find themselves in a tip-of-the-tongue state. A model must explain how this happens, and also how some small parts of the phonological information can nevertheless be available.

B.One-stage or two?

Do we go directly from the semantic representation to the phonological representation or is there an intervening level? The lemma representation has been posited as an intervening stage. A lemma is a representation containing syntactic and semantic but not phonological information.

Figure 2 One stage of lexicalisation or two?

1.How well does a two-stage model explain the evidence?

a)Speech errors

  • Semantic substitutions come from selecting the wrong lemma; phonologically related substitutions occur when selecting the phonological representation.

  • This model does not explain why mixed errors occur so frequently.

  • This model doesn’t explain why word blends can occur.

b)TOTs and Gestures during Hesitation

  • These can be explained if the lemma has been accessed but the phonological representation has not.

  • The model can’t explain how some but not all phonological information can be available to the speaker. This is due to the autonomous nature of the model.
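
The two-stage account can be sketched as two separate lookups. This is a toy illustration under stated assumptions, not an implementation of any published model: the `Lemma` class, the feature sets, and the `FORMS` table are invented here, and the form for ‘fork’ is deliberately left out to simulate a retrieval failure.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Lemma:
    name: str
    category: str          # syntactic information
    features: frozenset    # semantic information (no phonology stored here)

# Hypothetical lemma store.
LEMMAS = [
    Lemma("spoon", "noun", frozenset({"cutlery", "for-liquids"})),
    Lemma("fork", "noun", frozenset({"cutlery", "has-prongs"})),
]

# Hypothetical store of phonological forms; 'fork' is missing on purpose.
FORMS = {"spoon": "/spuːn/"}

def select_lemma(concept):
    """Stage 1: choose the lemma sharing the most semantic features with the concept."""
    return max(LEMMAS, key=lambda lemma: len(lemma.features & concept))

def retrieve_form(lemma):
    """Stage 2: look up the phonological form; None models a retrieval failure."""
    return FORMS.get(lemma.name)

def lexicalise(concept):
    """Run the two stages in order and report the outcome."""
    lemma = select_lemma(concept)
    form = retrieve_form(lemma)
    state = "spoken" if form is not None else "tip-of-the-tongue"
    return lemma, form, state

lemma, form, state = lexicalise({"cutlery", "has-prongs"})
print(lemma.name, lemma.category, form, state)
```

In the failing case the speaker still has the lemma, so syntactic and semantic information is available while the form is not, which matches the tip-of-the-tongue description. The sketch also shows the limitation noted above: stage 2 returns either a whole form or nothing, so partial phonological information has no way to surface.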

VI.Modifications to models of syntactic planning and lexicalisation

Modular, autonomous models such as the ones above can explain many aspects of planning and lexicalisation. Some of the evidence, however, supports a more interactive approach in which different levels of information interact and in which information from outside the model can be used. Much recent work has concentrated on developing such interactive models, either by modifying existing autonomous models or by designing completely new ones.

Affix – A morpheme that cannot exist on its own

Phoneme – A sound of the language

Onset – The initial consonant or cluster of a syllable

Morpheme – The smallest unit of meaning

Nuclear Accent – The final prominence giving pitch movement in an utterance

Primary Stress – The main stress in an utterance

Prosody – Properties of duration, pitch, and loudness

Content Word – The type of open-class word that conveys most of the meaning of the utterance

Function Word – The type of closed-class word that does the grammatical work of the language

