Table 2.1: Summary of training corpora for cross-accent experiments. Here BJ, SH and GD denote the Beijing, Shanghai and Guangdong accents, respectively.
Table 2.2: Summary of test corpora for cross-accent experiments. PPc shown here is the character perplexity of the test corpora according to the LM of 54K.Dic and BG=TG=300,000.
Table 2.3: Character error rate for cross-accent experiments.
Table 3.4: Different feature pruning methods (the number in each cell is the number of dimensions finally kept to represent the speaker).
Table 3.5: Distribution of speakers in corpora.
Table 3.6: Gender classification errors based on different speaker representation methods (the results are based on the PCA projection; the totals for EW and SH are 500 and 480, respectively).
Table 3.7: Different selections of supporting regression classes.
Table 3.8: Gender classification errors of EW based on different supporting regression classes (the relative size of the feature vector length is indicated as Parameters).
Table 4.9: Front nasal and back nasal mapping pairs of accented speakers in terms of the standard phone set.
Table 4.10: Syllable mapping pairs of accented speakers in terms of the standard phone set.
Table 4.11: Performance of PDA (37 transformation pairs used in PDA).
Table 4.12: Performance of MLLR with different adaptation sentences.
Table 4.13: Performance of MLLR combined with PDA.
Table 4.14: Syllable error rate with different baseline models or different adaptation technologies (BES denotes a larger training set including 1500 speakers from both Beijing and Shanghai).
Table 4.15: Character error rate with different baseline models or different adaptation technologies (BES denotes a larger training set including 1500 speakers from both Beijing and Shanghai).
Table 5.16: Speaker distribution of the corpus.
Table 5.17: Gender identification error rate (relative error reduction is calculated with the 8-component GMM as the baseline).
Table 5.18: Gender identification error rate (relative error reduction is calculated with 1 utterance as the baseline).
Table 5.19: Inter-gender accent identification results.
Table 5.20: Accent identification confusion matrices (covering four accents: Beijing, Shanghai, Guangdong and Taiwan).
It is well known that state-of-the-art speech recognition (SR) systems, even in the domain of large-vocabulary continuous speech recognition, have improved greatly in recent decades. Several commercial systems are available off the shelf, such as IBM's ViaVoice, Microsoft's SAPI and Philips' FreeSpeech.