Rounding of the glottal source waveform for breathy voice. In non-breathy phonation in modal register, airflow rises gradually to a peak during the opening phase, but typically falls more abruptly during the closing phase (Figure 1). In breathy voice, however, the closing phase of the glottal source function is more gradual, producing a more rounded source signal. This can be seen at the top of Figure 2, which shows a more rounded (i.e., more sinusoidal) source waveform for the breathy voice. Once again, these breathiness-related differences in the degree of rounding of the glottal signal are more easily observed in the spectrum than in the time domain – in this case, by measuring the relative amplitude of the first harmonic (H1). Here is the reasoning: For a perfect sinusoid, all of the energy in the signal is at H1, with no spread of energy into the higher frequencies. (One definition of a sinusoid is that changes over time are as smooth as they can possibly be. As we discussed in the section on basic acoustics, the sinusoid is the extension over time of motion around a circle, with a circle being the smoothest shape possible.) As the source waveform becomes more abrupt (i.e., more like an impulse), the spread of energy to higher frequencies increases – meaning that the relative amplitude of H1 will decrease. The bottom line is that we would generally expect to see higher-amplitude first harmonics for more breathy voices and weaker first harmonics for less breathy voices. Compare the first harmonic amplitudes for the breathy and non-breathy voices in Figures 2 and 3. Which of the spectra show stronger first harmonics – the breathy or the non-breathy?
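The reasoning about H1 can be checked numerically. The following is a minimal Python sketch, not part of the lab software. It builds one period of a sinusoid and one period of a ramp that rises gradually and falls abruptly (a crude, assumed stand-in for a non-breathy glottal pulse), then reports what fraction of the energy in the first 20 harmonics sits at H1. The waveform shapes, the 20-harmonic cutoff, and all function names are illustrative assumptions.

```python
import cmath
import math


def harmonic_amplitudes(period, n_harmonics):
    """Amplitude at harmonics 1..n_harmonics of one waveform period (direct DFT)."""
    n = len(period)
    amps = []
    for k in range(1, n_harmonics + 1):
        x_k = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(period))
        amps.append(abs(x_k) / n)
    return amps


def h1_energy_fraction(period, n_harmonics=20):
    """Share of the energy in harmonics 1..n_harmonics that lies at H1."""
    energies = [a * a for a in harmonic_amplitudes(period, n_harmonics)]
    return energies[0] / sum(energies)


N = 200  # samples per period (arbitrary choice for this sketch)

# Perfectly rounded source: a single sinusoid.
sinusoid = [math.sin(2 * math.pi * i / N) for i in range(N)]

# Abrupt source: gradual rise, then an instantaneous fall at the period boundary.
ramp = [i / N for i in range(N)]

sin_frac = h1_energy_fraction(sinusoid)
ramp_frac = h1_energy_fraction(ramp)

print(f"sinusoid: {sin_frac:.3f} of harmonic energy at H1")
print(f"ramp:     {ramp_frac:.3f} of harmonic energy at H1")
```

The sinusoid puts essentially all of its energy at H1, while the ramp puts only roughly 60% there, with the rest spread across the higher harmonics. Real glottal waveforms fall between these two extremes.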
The Lab
The lab uses ten sustained [a] vowels out of 25 voice samples that were used many years ago in a study of breathy vocal quality in dysphonic speakers (Hillenbrand & Houde, 1996). These voice samples, in turn, were drawn from a large database of recordings that were made at Massachusetts Eye and Ear Infirmary by Robert Hillman. The samples that we picked were intended to represent a fairly broad range of breathiness percepts from clear phonation to very breathy voice.
Open SpeechTool/Ztool, then use the File menu to open ‘br01.wav’ (‘c:\ztool\br01.wav’ – on the LRC machines, it’s ‘r:\ztool\br01.wav’).
Play the signal as many times as you wish and rate how breathy the voice is on a scale of 1 to 5, with 5 being the most breathy. Record your rating for this signal. So, the row of data for this signal will look like this:
br01.wav 3 (or whatever)
Toward the end of the row of buttons at the top, you will see one called ‘CPP’. Clicking this button runs a program that estimates how periodic the signal is by measuring the degree of harmonic organization in its narrow band spectrum. The very last number that the program reports is called “Mean CPPS”. Larger CPPS values indicate a higher degree of harmonic organization – i.e., most of the energy is at harmonically related frequencies, indicating a more periodic signal. Consequently, small values of CPPS should be associated with less periodic (breathier) voices. Write this number down in the same row as your breathiness rating. Your row of data should now look something like this:
br01.wav 3 0.71
Do the same thing for the remaining nine signals. In your table of results, you should have your breathiness rating (1-5) and a CPPS value for each test signal.
Below is a table of breathiness ratings for each signal.
br01.wav 8.22
br02.wav 5.21
br03.wav 2.12
br04.wav 2.86
br05.wav 5.33
br06.wav 1.51
br07.wav 7.94
br08.wav 3.42
br09.wav 4.07
br10.wav 4.87
These are very much like the breathiness ratings (BR) that you made, except that these ratings are averages from a panel of 21 listeners doing pretty much what you did. (These values vary from ~1.5 to ~8.2 instead of 1-5, but this doesn’t matter.) Copy these breathiness ratings into a new column of the table you created. So, each row in your table will have, in this order: (1) the name of the signal, (2) your breathiness rating, (3) the CPPS value from Ztool, and (4) the average rating from the 21-listener panel. Use Word to create a file called ‘brXXXX.txt’ (where ‘XXXX’ is the last 4 digits of your WIN – or any random string of numbers, e.g., ‘br1598.txt’) in the ztool folder with all of these numbers in it (filename, your BR, CPPS, and panel BR – for all 10 signals). The 1st line should look something like this:
br01.wav 3 0.71 8.22
All 10 lines need to be in exactly this format; e.g., you want “0.71” NOT “0.71 dB”.
In Word, set the font to Courier and use the space bar only, not the Tab key. SAVE YOUR FILE AS PLAIN TEXT (File>Save as>Choose plain text, using the name ‘brXXXX.txt’; e.g., ‘br4598.txt’). (If Word asks you about “text encoding”, just leave it at the Windows default setting.)
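If Python happens to be available, the format rules above can be checked before running the correlation step. This is a minimal sketch under stated assumptions: the exact parsing rules of the correlation program are not documented in this handout, so the checks below (four space-separated fields, plain numbers in columns 2 to 4, no Tabs, ten rows) are inferred from the format described here, and the function names are hypothetical.

```python
def check_row(line):
    """Return a list of problems found in one data row (empty list if it looks OK)."""
    problems = []
    if "\t" in line:
        problems.append("contains a Tab; use spaces only")
    fields = line.split()
    if len(fields) != 4:
        problems.append(f"expected 4 fields, found {len(fields)}")
        return problems
    # Columns 2-4 must be plain numbers, e.g. "0.71" and not "0.71dB".
    for label, value in zip(("your BR", "CPPS", "panel BR"), fields[1:]):
        try:
            float(value)
        except ValueError:
            problems.append(f"{label} is not a plain number: {value!r}")
    return problems


def check_file_text(text, expected_rows=10):
    """Check every row of the file contents; also flag extra or missing lines."""
    lines = text.splitlines()
    problems = []
    if len(lines) != expected_rows:
        problems.append(f"expected {expected_rows} lines, found {len(lines)}")
    for number, line in enumerate(lines, start=1):
        problems.extend(f"line {number}: {p}" for p in check_row(line))
    return problems


# Example rows: the first follows the required format, the second does not.
print(check_row("br01.wav 3 0.71 8.22"))
print(check_row("br01.wav 3 0.71 dB 8.22"))
```

To check a saved file, read it with `open('brXXXX.txt').read()` (substituting your own file name) and pass the result to `check_file_text`.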
The last step is to measure correlations between: (1) your BR and the panel BR (columns 2 and 4), (2) your BR and CPPS (columns 2 and 3), and (3) the panel BR and CPPS (columns 3 and 4). A correlation is a measure of the strength of the relationship between two sets of numbers (see footnote 1). The easiest way to measure a correlation happens to be the most arcane, but it’s not that bad:
Hold down the Windows key (the one with the flag-looking thing on it) and hit ‘R’.
Type ‘cmd’ into the text box that pops up.
Put your cursor in the black window that appears and type:
c:
cd c:\ztool
(LRC people: Use r: instead of c:)
Let’s assume you want to measure the correlation between your BR (col 2) and CPPS (col 3). Type this arcane thing:
.\tcor brXXXX.txt 2 brXXXX.txt 3 (measure the correlation between col 2 and col 3)
(e.g.: .\tcor br1760.txt 2 br1760.txt 3)
Notes: 1. The weird ‘.\’ thing has to be there. It needs to be a backslash (‘\’) and not a forward slash (‘/’). 2. If you get an error from tcor, take a close look at the format of your data file. All 10 lines need to be in exactly this format, with no extra lines:
br01.wav 3 0.71 8.22
‘tcor’ will type out a bunch of stuff; the only numbers you need are the values for ‘r’ and ‘rsq’ (r², aka variance explained); e.g.:
r: -0.92022
rsq: 0.84680
Do the same thing for the two other correlations that you need; e.g.,
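.\tcor brXXXX.txt 2 brXXXX.txt 4 (measure the correlation between col 2 and col 4)
.\tcor brXXXX.txt 3 brXXXX.txt 4 (measure the correlation between col 3 and col 4)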
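For readers curious about what the ‘r’ and ‘rsq’ values mean, here is a minimal Python sketch of a Pearson correlation. It assumes that tcor reports the ordinary Pearson r (the handout does not say how tcor computes its result), and the two columns of numbers are made up purely for illustration; they are not lab data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length columns of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length columns with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up numbers for illustration only: as the rating goes up,
# the measure goes down, so r comes out negative.
ratings = [1, 2, 3, 4, 5]
measure = [9.0, 7.5, 6.5, 4.0, 3.5]

r = pearson_r(ratings, measure)
rsq = r * r  # proportion of variance explained

# Stretching and shifting one column (e.g., a 1-5 scale vs. a wider scale)
# leaves the correlation unchanged.
rescaled = [1.5 + 1.7 * v for v in ratings]
r_rescaled = pearson_r(rescaled, measure)

print(f"r: {r:.5f}")
print(f"rsq: {rsq:.5f}")
print(f"r after rescaling the ratings: {r_rescaled:.5f}")
```

Two points are worth noticing. First, rsq is simply r multiplied by itself, so an r of 0.8 corresponds to an rsq of 0.64. Second, the rescaling check shows why it does not matter that the panel ratings run from about 1.5 to 8.2 while yours run from 1 to 5.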
Results:
                                                  r          r²
correlation between your BR and the panel BR   _______   _______
correlation between your BR and CPPS           _______   _______
correlation between the panel BR and CPPS      _______   _______
Questions:
How well do your breathiness ratings agree with the panel ratings? Note that the more important measure of the strength of a relationship is rsq (r²) rather than r: for example, an r value of 0.8 is not 80% of perfect (it corresponds to an rsq of only 0.64), but an rsq value of 0.8 is 80% of perfect.
Answer here:
How well do the CPPS measures predict your breathiness ratings?
Answer here:
How well do the CPPS measures predict the panel breathiness ratings?
Answer here:
Why is the correlation between breathiness ratings and CPPS negative? (If you’re not sure, see footnote 1).
Answer here:
What do you make of all this? For example, is there any advantage to using this periodicity measure in place of your own subjective estimate?
Answer here:
Look at the figures on the last two pages of this document and read the description at the top of the page. Pick the two spectra that seem to show the most harmonic organization, and the two spectra that seem to show the least harmonic organization. (These are subjective judgments, so there are no right and wrong answers. The signal you pick for the most harmonic organization should be very clean looking, with most of its energy at harmonically related frequencies; vice versa for the signal with the least harmonic organization.) Record your results below, along with the panel breathiness rating and the CPPS value for each signal:
Signal with the most harmonic organization
File name (e.g., br09)    Panel BR    CPPS
________________          ______      ____
Signal with the second most harmonic organization
File name                 Panel BR    CPPS
________________          ______      ____
Signal with the least harmonic organization
File name                 Panel BR    CPPS
________________          ______      ____
Signal with the second least harmonic organization
File name                 Panel BR    CPPS
________________          ______      ____
Last question: Do the voices that you judged to have the most harmonic organization tend to be among the signals with: (a) the lowest breathiness ratings and/or (b) the largest CPPS values?
Answer here:
REFERENCES
Aronson, A.E. (1971). Early motor unit disease masquerading as psychogenic breathy dysphonia: A clinical case presentation. Journal of Speech and Hearing Disorders, 36, 115-124.
Aronson, A.E. (1990). Clinical voice disorders (3rd ed). New York: Thieme.
Boone, D.R., and McFarlane, S.C. (1988). The voice and voice therapy (4th ed). Englewood Cliffs, NJ: Prentice Hall.
Colton, R.A., and Casper, J.K. (1990). Understanding voice problems: A physiological perspective for diagnosis and treatment. Baltimore: Williams and Wilkins.
Hillenbrand, J.M., and Houde, R.A. (1996). Acoustic characteristics of breathy vocal quality: Dysphonic voices and continuous speech. Journal of Speech and Hearing Research, 39, 311-321.
Hollien, H. (1987). "Old voices": What do we really know about them? Journal of Voice, 1, 2-17.
Klatt, D.H., and Klatt, L.C. (1990). Analysis, synthesis, and perception of voice quality variations among female and male talkers. Journal of the Acoustical Society of America, 87, 820-857.
McKay, I. (1987). Phonetics: The science of speech production (2nd ed). Boston: College Hill.
Ryan, W.J., and Burk, K.W. (1974). Perceptual and acoustic correlates of aging in the speech of males. Journal of Communication Disorders, 1, 181-192.
Södersten, M., Hertegård, S., and Hammarberg, B. (1995). Glottal closure, transglottal airflow, and voice quality in healthy middle-aged women. Journal of Voice, 9, 182-197.
Narrow Band Spectra of the Test Signals
The figures below are narrow band amplitude spectra of the ten test signals. Notice that the spectra vary quite a bit in the degree of harmonic organization, which reflects how periodic the signal is. For example, for br06 nearly all of the energy is at harmonic frequencies (whole number multiples of f0). The same is true of br03, though to a somewhat lesser extent. The spectra of some of the other signals, however, show all kinds of energy at non-harmonic frequencies; e.g., br01, br05, br07, and br10.
The CPPS algorithm attempts to measure these variations in harmonic organization, with large CPPS values reflecting a high degree of harmonic organization (i.e., high periodicity). The assumption is that signals with large CPPS values tend to be less breathy – and vice versa.
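To make the idea of measuring harmonic organization more concrete, here is a rough Python sketch of a single-frame cepstral peak prominence calculation. It assumes that CPPS refers to a smoothed cepstral peak prominence of the kind described in Hillenbrand & Houde (1996). It is not the Ztool implementation: it skips the smoothing step entirely, and the sample rate, frame length, test signals, noise level, and search ranges are all arbitrary choices made for illustration.

```python
import cmath
import math
import random


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def to_db(values, floor=1e-12):
    """Magnitude in dB, with a small floor to avoid log(0)."""
    return [20.0 * math.log10(abs(v) + floor) for v in values]


def cepstral_peak_prominence(frame, fs, f0_min=60.0, f0_max=300.0):
    """Height (dB) of the cepstral peak above a straight line fit to the cepstrum."""
    n = len(frame)
    # Hann window, then the log spectrum, then the spectrum of the log spectrum.
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, s in enumerate(frame)]
    log_spectrum = to_db(fft(windowed))
    cepstrum = to_db(fft(log_spectrum))

    # Least-squares line through the cepstrum from 1 ms up to half the frame.
    qs = list(range(int(0.001 * fs), n // 2))
    ys = [cepstrum[q] for q in qs]
    mean_q, mean_y = sum(qs) / len(qs), sum(ys) / len(ys)
    slope = (sum((q - mean_q) * (y - mean_y) for q, y in zip(qs, ys))
             / sum((q - mean_q) ** 2 for q in qs))
    intercept = mean_y - slope * mean_q

    # Look for the peak only where a plausible pitch period could fall.
    q_lo, q_hi = int(fs / f0_max), int(fs / f0_min)
    peak_q = max(range(q_lo, q_hi + 1), key=lambda q: cepstrum[q])
    return cepstrum[peak_q] - (slope * peak_q + intercept)


FS, N, F0 = 10000, 1024, 125.0  # illustrative values, not Ztool settings
random.seed(0)

# Strongly periodic test signal: harmonics of F0 with amplitudes falling as 1/k.
clean = [sum(math.sin(2 * math.pi * k * F0 * i / FS) / k for k in range(1, 40))
         for i in range(N)]
# Same signal with added noise, a crude stand-in for a breathier voice.
noisy = [s + random.gauss(0.0, 2.0) for s in clean]

cpp_clean = cepstral_peak_prominence(clean, FS)
cpp_noisy = cepstral_peak_prominence(noisy, FS)
print(f"clean: {cpp_clean:.1f} dB   noisy: {cpp_noisy:.1f} dB")
```

The clean signal, whose energy sits almost entirely at harmonic frequencies, produces a much more prominent cepstral peak than the noisy one. The numbers from this sketch are not comparable to the CPPS values reported by Ztool; only the direction of the difference carries over.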