Psychology of Auditory Perception

Hearing is an essential component of communication. The process of hearing has two facets that together make up a complete cycle:

1. The behaviour of the mechanical apparatus, which forms ‘Sound Reception’.

2. The neurological processing of the information received. This requires central processes such as memory, attention, auditory segregation, auditory scene analysis, and localization, which together result in ‘Sound Perception’.


THE EAR MECHANISM (Sound Reception)

For humans, perception begins with sound waves entering the outer ear and setting the delicate tympanic membrane (the eardrum), which separates the outer ear from the middle ear, into vibration. This movement is transferred to three tiny, interconnected bones in the middle ear, which push against the end of the fluid-filled cochlea, setting up a wave that displaces a flexible structure called the basilar membrane.

Auditory receptors called inner hair cells, which reside on the basilar membrane, are displaced by this wave and translate the mechanical energy into a neural code that travels along the auditory nerve.

Psychology of Speech Perception

The mechanical process described so far is only the beginning of our perception of sounds. Understanding and interpreting sound is what constitutes perception.

How does sound perception, that is, the understanding of sound, happen?

Sound perception makes use of the following cues.

1. Pitch is how we perceive the frequency of a sound, i.e., the number of vibrations of the stimulus per second. The higher the number of vibrations, the higher the pitch we hear; the lower the number of vibrations, the lower the pitch.

2. Loudness is how we perceive the intensity of a sound wave. An intensity of about 30 dB is perceived as quiet, and intensities of 70 dB and above as loud to very loud.

3. Timbre is the quality of a sound. It is how we distinguish sounds of the same pitch and loudness, e.g., a flute vs. a mandolin.

4. Duration is a characteristic that helps us distinguish short sounds from long ones and, to an extent, the direction of low-frequency sounds.
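To put the decibel figures in the loudness cue into perspective, here is a minimal Python sketch of the standard decibel formula for sound intensity level. The reference intensity of 1e-12 W/m^2 and the function names are assumptions made for illustration, and the sketch describes the physical intensity level only, not how loud a listener actually judges a sound to be.

```python
import math

# Assumed reference intensity (the conventional threshold of hearing), in W/m^2.
REFERENCE_INTENSITY = 1e-12


def intensity_level_db(intensity):
    """Convert a sound intensity in W/m^2 to an intensity level in decibels."""
    return 10.0 * math.log10(intensity / REFERENCE_INTENSITY)


def intensity_ratio(level_a_db, level_b_db):
    """Return how many times more intense a sound at level_a_db is than one at level_b_db."""
    return 10.0 ** ((level_a_db - level_b_db) / 10.0)


quiet_db = 30.0  # perceived as quiet
loud_db = 70.0   # perceived as loud

# Each 10 dB step multiplies the physical intensity by 10.
ratio = intensity_ratio(loud_db, quiet_db)
print(f"A {loud_db:.0f} dB sound carries {ratio:.0f} times the intensity of a {quiet_db:.0f} dB sound")
```

Because the scale is logarithmic, the 40 dB gap between a quiet and a loud sound corresponds to a very large difference in physical intensity.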

The sound wave reaching the ear includes the superimposed effects of multiple sound events. The cochlea breaks this wave down into its frequency components, which leaves the listener with perceptual tasks such as segregation, auditory scene analysis, localization, and categorization in order to fully understand the sounds.
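As a rough analogy for this breakdown into frequency components, the following Python sketch mixes two pure tones and recovers their frequencies with a plain discrete Fourier transform. This is not a model of the cochlea; the sample rate, the tone frequencies, and all function names are assumptions chosen for illustration.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed)
NUM_SAMPLES = 800   # 0.1 s of signal, which gives 10 Hz per frequency bin


def mix_tones(frequencies_hz):
    """Superimpose pure sine tones, as happens when several sources sound at once."""
    return [
        sum(math.sin(2 * math.pi * f * n / SAMPLE_RATE) for f in frequencies_hz)
        for n in range(NUM_SAMPLES)
    ]


def magnitude_spectrum(signal):
    """Naive discrete Fourier transform: magnitude of each bin below the Nyquist frequency."""
    total = len(signal)
    spectrum = []
    for k in range(total // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / total) for n, x in enumerate(signal))
        spectrum.append(abs(acc))
    return spectrum


def strongest_frequencies(signal, count):
    """Return the frequencies (Hz) of the `count` strongest bins, in ascending order."""
    spectrum = magnitude_spectrum(signal)
    bin_width = SAMPLE_RATE / len(signal)
    top_bins = sorted(range(len(spectrum)), key=lambda k: spectrum[k], reverse=True)[:count]
    return sorted(k * bin_width for k in top_bins)


mixture = mix_tones([200, 500])
components = strongest_frequencies(mixture, 2)
print(components)
```

The mixture is a single waveform, yet the two tones reappear as separate peaks in the spectrum. The listener's later tasks start from this kind of component-level description.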


Normally, we perform these tasks with ease. Have you ever wondered how we can so readily listen to someone we are in conversation with, even when we are surrounded by many other conversations?

What is remarkable about this ability is not just that we can segregate the parts

of the signal specific to our interlocutor, but that we can shift our attention to

another talker if our current conversation becomes uninteresting. This process is

affected by our attention, context, and knowledge.

Acoustic components arising from the same source tend to be similar across time, to be harmonically related (their frequencies being integer multiples of a common fundamental), to begin and end together, and to continue without abrupt discontinuities. Listeners tend to segregate complex sounds using these principles of similarity, harmonicity, contemporaneity, and good continuation.
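The harmonicity principle can be illustrated with a small Python sketch that assigns frequency components to sources according to whether they are near-integer multiples of a candidate fundamental. The fundamentals, the tolerance, and the function names are hypothetical, and real auditory grouping combines harmonicity with the other principles rather than applying a single rule like this.

```python
def is_harmonic_of(frequency_hz, fundamental_hz, tolerance=0.02):
    """True if frequency_hz is close to an integer multiple of fundamental_hz."""
    ratio = frequency_hz / fundamental_hz
    nearest = round(ratio)
    return nearest >= 1 and abs(ratio - nearest) <= tolerance


def group_by_fundamental(partials_hz, fundamentals_hz):
    """Assign each partial to the first candidate fundamental it is a harmonic of."""
    groups = {f0: [] for f0 in fundamentals_hz}
    for partial in partials_hz:
        for f0 in fundamentals_hz:
            if is_harmonic_of(partial, f0):
                groups[f0].append(partial)
                break  # a partial that fits no fundamental is left ungrouped
    return groups


# Components of two simultaneous sounds, interleaved as they would arrive at the ear.
partials = [200, 310, 400, 620, 600, 930]
groups = group_by_fundamental(partials, [200, 310])
print(groups)
```

Here the components at 200, 400, and 600 Hz are grouped as one source and those at 310, 620, and 930 Hz as another, even though all six arrive mixed together.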


We can perceive the direction of a sound source with some accuracy. Left-right location is determined by perceiving the difference in arrival time, or the difference in phase, of the sound at each ear. There is also an intensity difference between the ears for the same sound, which enhances our knowledge of the location of the sound source.
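The arrival-time cue can be sketched numerically. The Python example below uses a simplified path-length model in which the extra distance to the far ear is the ear separation multiplied by the sine of the source angle. The 0.2 m ear separation, the 343 m/s speed of sound, and the function names are assumptions for illustration, and the model ignores how sound bends around the head.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees Celsius (assumed)
EAR_SEPARATION = 0.2    # metres; an assumed round figure, not a measured value


def interaural_time_difference(azimuth_deg):
    """Arrival-time difference in seconds for a distant source.

    0 degrees is straight ahead and 90 degrees is directly to the right.
    A positive result means the sound reaches the right ear first.
    """
    return EAR_SEPARATION * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND


def azimuth_from_itd(itd_seconds):
    """Invert the model: estimate the left-right angle from a time difference."""
    ratio = itd_seconds * SPEED_OF_SOUND / EAR_SEPARATION
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding beyond +/-1
    return math.degrees(math.asin(ratio))


itd_side = interaural_time_difference(90.0)
recovered = azimuth_from_itd(interaural_time_difference(30.0))
print(f"Source at the side: {itd_side * 1e6:.0f} microseconds")
print(f"Angle recovered from the time difference: {recovered:.1f} degrees")
```

Under these assumptions, the largest possible difference is well under a millisecond, which gives a sense of how fine the timing comparison between the two ears has to be.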


Although this process depends on the physical characteristics of sound, such as frequency, intensity, and duration, it also depends on visual perception. This is the basis of the ventriloquist effect, in which we perceive the speech as coming from the visibly moving dummy’s mouth instead of from its true source, the ventriloquist’s mouth.


Categorization is done by weighing the importance of each sound in a particular language. These cue weights are learned through persistent language exposure, starting from the critical period of language development.

E.g., the difficulties in producing and perceiving speech sounds in a non-native language (such as the difficulties Japanese speakers have with English ‘l’ and ‘r’) appear to be due to mismatches in the cue-weighting strategies of the unlearnt language.


We now know that the speech we hear undergoes many processes before we decode what is being said. Thus, a person who has trouble understanding sounds despite normal hearing sensitivity could have a problem in any of the processes discussed above.

This calls for a complete evaluation for Central Auditory Processing Disorders (CAPD), an important topic that we will discuss in our upcoming blogs.
