Papers - Tatsuya Kitamura (北村 達也)
-
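Observation of consonant articulation using improved sensors of the NDI Wave speech observation system (発話観測システムNDI Waveの改良型センサを用いた子音構音の観測)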
北村 達也, 能田 由紀子, 波多野 博顕, 吐師 道子, 西谷 実
音声言語医学 (The Japan Journal of Logopedics and Phoniatrics) 55 ( 1 ) 59 - 59 January 2014
-
Comparison of vocal tract transfer functions calculated using one-dimensional and three-dimensional acoustic simulation methods
Hironori Takemoto, Parham Mokhtari, Tatsuya Kitamura
15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4 408 - 412 2014
Co-authored
Publisher: ISCA-INT SPEECH COMMUNICATION ASSOC
Acoustic characteristics of the vocal tract have been investigated extensively in the literature using a one-dimensional (1D) acoustic simulation method. Because the 1D method assumes plane wave propagation only, it is recognized to be valid only in the low frequency region (below about 4 or 5 kHz). Recently, a three-dimensional (3D) acoustic simulation method was developed, to obtain more precise acoustic characteristics of the vocal tract. In the present study, from a male's vocal tract shapes, transfer functions were calculated using the 1D and 3D methods and compared with each other to evaluate the valid frequency range of the 1D method. As a result, when acoustic effects of the piriform fossae were considered in the 1D method, the transfer functions agreed with each other up to 7 kHz (ignoring small dips). The 3D method showed that a deep dip was generated at around 8 kHz by the transverse resonance mode in the pharynx. Above this dip frequency, the transfer functions disagreed with each other. Thus, the 1D method is valid up to 7 kHz for this subject. Because this subject has a relatively large vocal tract, in general the upper limit of the valid frequency range could exceed 8 kHz.
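As a rough illustration of the plane-wave assumption behind a 1D model, the sketch below computes the resonances of a uniform tube closed at the glottis and open at the lips, F_n = (2n - 1) c / (4 L). This is a textbook idealization, not the simulation method used in the paper; the tube length (17.5 cm), the sound speed (350 m/s), and the function name are assumptions chosen only for illustration.

```python
def uniform_tube_resonances(length_m, n_resonances=4, sound_speed=350.0):
    """Quarter-wavelength resonances (Hz) of a closed-open uniform tube.

    Assumes plane-wave propagation only, i.e. the same simplification
    a 1D vocal tract model relies on. All default values are hypothetical.
    """
    if length_m <= 0:
        raise ValueError("length_m must be positive")
    return [(2 * n - 1) * sound_speed / (4.0 * length_m)
            for n in range(1, n_resonances + 1)]


if __name__ == "__main__":
    # Hypothetical 17.5 cm tube, roughly an adult male vocal tract length.
    for i, freq in enumerate(uniform_tube_resonances(0.175), start=1):
        print(f"R{i}: {freq:.0f} Hz")
```

Transverse modes such as the one reported near 8 kHz cannot appear in a model of this kind, which is why a 3D method is needed to evaluate the higher frequency range.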
-
Vocal tract length estimation based on vowels using a database consisting of 385 speakers and a database with MRI-based vocal tract shape information
Hideki Kawahara, Tatsuya Kitamura, Hironori Takemoto, Ryuichi Nisimura, Toshio Irino
15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4 870 - 874 2014
Co-authored
Publisher: ISCA-INT SPEECH COMMUNICATION ASSOC
A highly reproducible method for estimating vocal tract length (VTL) and a text-independent VTL estimation method are proposed, based on a Japanese vowel database spoken by 385 male and female speakers aged 6 to 56 and another vowel database with MRI-based vocal tract shape information. The proposed methods are based on an interference-free power spectral representation and systematic suppression of biasing factors. MRI data are used to calibrate the VTL estimates so that they are expressed in physically meaningful units. These databases are normalized based on the estimated VTL information to provide a reference template, which is used to implement the text-independent VTL estimation method. A prototype system for text-independent estimation of VTL is implemented in MATLAB and runs faster than real time on a PC.
-
Acoustic interaction between the right and left piriform fossae in generating spectral dips (peer-reviewed)
Hironori Takemoto, Seiji Adachi, Parham Mokhtari, Tatsuya Kitamura
Journal of the Acoustical Society of America 134 ( 4 ) 2955 - 2964 October 2013
-
Naturalness on Japanese pronunciation before and after shadowing training and prosody modified stimuli
Rongna A, Ryoko Hayashi, Tatsuya Kitamura
Proceedings of Interspeech 2013 Satellite workshop on Speech and Language Technology in Education 143 - 146 August 2013
Co-authored
-
Timing differences in articulation between voiced and voiceless stop consonants: An analysis of cine-MRI data
Masako Fujimoto, Tatsuya Kitamura, Hiroaki Hatano, Ichiro Fujimoto
14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5 955 - 958 2013
Co-authored
Publisher: ISCA-INT SPEECH COMMUNICATION ASSOC
Laryngeal and supralaryngeal articulators coordinately work to produce speech sounds. In order to study differences in supralaryngeal manifestations of voiced and voiceless consonants, we compared the tongue movement during a minimal pair /agise/ and /akise/ using the fast scanning techniques of MRI movies. The result showed that the tongue displacement starts earlier in /k/ than in /g/ for many of the speakers of Tokyo Japanese. This agrees with our previous findings using other dialect speakers. These results suggest that many Japanese actively differentiate supralaryngeal articulation according to the voicing of the consonants, raising the tongue earlier in voiceless ones. This movement is presumably to ensure the voicelessness of the consonant. The present study also supplies evidence for the usefulness of a constructive approach for physical modeling.
-
Differences in articulatory movement between voiced and voiceless stop consonants (peer-reviewed)
Ryosuke O. Tachibana, Tatsuya Kitamura, Masako Fujimoto
Acoustical Science and Technology 33 ( 6 ) 391 - 393 November 2012
-
Measurement of vibration velocity pattern of facial surface during phonation using scanning vibrometer
Tatsuya Kitamura
Acoustical Science and Technology 33 ( 2 ) 126 - 128 March 2012
-
A Method for Predicting Stressed Words in Teaching Materials for English Jazz Chants (peer-reviewed)
NAGATA Ryo, FUNAKOSHI Kotaro, KITAMURA Tatsuya, NAKANO Mikio
IEICE Trans Inf Syst (Inst Electron Inf Commun Eng) E95.D ( 11 ) 2658 - 2663 2012
-
Correlation between vocal tract length, body height, formant frequencies, and pitch frequency for the five Japanese vowels uttered by fifteen male speakers
Hiroaki Hatano, Tatsuya Kitamura, Hironori Takemoto, Parham Mokhtari, Kiyoshi Honda, Shinobu Masaki
13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3 402 - 405 2012
Co-authored
Publisher: ISCA-INT SPEECH COMMUNICATION ASSOC
We conducted quantitative analyses of a magnetic resonance imaging (MRI) database to examine the correlation between physical measures (vocal tract length and body height) and acoustic parameters (pitch and formant frequencies) of vowels. The vocal tract length was measured from MRI data for the five Japanese vowels produced by fifteen male Japanese speakers between the ages of 24 and 55. The acoustic features were computed from vowel sounds recorded during scan. The vocal tract length showed a weak positive correlation with the speakers' age (correlation coefficient r = 0.51) but not with the speaker body height (r = 0.08). There were only weaker correlations between the vocal tract length and the first four formant frequencies except that F1 and F2 of the vowel /e/ show negative correlations with the vocal tract length (F1: r = -0.65, F2: r = -0.56). The result suggests that the vocal tract length is one of the dominant factors causing individual differences in the formant frequencies for the vowel /e/, produced by not forming a strong constriction. Furthermore, the pitch frequency was negatively correlated with the body height (r = -0.61).
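For readers unfamiliar with the statistic quoted above, the following minimal sketch shows how a Pearson correlation coefficient r can be computed from paired measurements. The function name and the sample values are hypothetical and are not taken from the MRI database described in the abstract.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical values: vocal tract length (cm) and F1 of a vowel (Hz).
    vtl_cm = [16.2, 16.8, 17.1, 17.5, 18.0]
    f1_hz = [520.0, 505.0, 498.0, 470.0, 465.0]
    print(f"r = {pearson_r(vtl_cm, f1_hz):.2f}")
```

A negative r, as reported for F1 and F2 of /e/, indicates that the formant frequency tends to fall as the vocal tract length increases.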
-
Simulation of the coupling between vocal-fold vibration and time-varying vocal tract
Yosuke Tanabe, Parham Mokhtari, Hironori Takemoto, Tatsuya Kitamura
Journal of the Acoustical Society of America 130 ( 4 ) 2441 October 2011
Co-authored
-
Study of perceptual factors for speaker identification focusing on perceptual similarity of speaker characteristics
Tsuyoshi Izumida, Tatsuya Kitamura
Acoustical Science and Technology 32 ( 5 ) 216 - 219 September 2011
-
Dental imaging using a magnetic resonance visible mouthpiece for measurement of vocal tract shape and dimension
Tatsuya Kitamura, Hironori Nishimoto, Ichiro Fujimoto, Yasuhiro Shimada
Acoustical Science and Technology 32 ( 5 ) 224 - 227 September 2011
Co-authored
-
Acoustic analysis of the vocal tract during vowel production by finite-difference time-domain method (peer-reviewed)
Hironori Takemoto, Parham Mokhtari, Tatsuya Kitamura
Journal of the Acoustical Society of America 128 ( 6 ) 3724 - 3738 December 2010
-
Visualisation of hypopharyngeal cavities and vocal-tract acoustic modelling
Kiyoshi Honda, Tatsuya Kitamura, Hironori Takemoto, et al.
Computer Methods in Biomechanics and Biomedical Engineering 13 ( 4 ) 443 - 453 July 2010
-
Yasuhiro Hamada, Tatsuya Kitamura, Masato Akagi
Journal of Signal Processing 14 ( 4 ) 265 - 268 July 2010
Co-authored
-
Similarity of effects of emotions on the speech organ configuration with and without speaking
Tatsuya Kitamura
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2 909 - 912 2010
Single-authored
Publisher: ISCA-INT SPEECH COMMUNICATION ASSOC
In this work we propose and verify a hypothesis on emotional speech production: emotions induce physical and physiological changes in the whole body including changes in the configuration and physical/mechanical properties of the speech organs, regardless of whether or not the person is speaking, and as a side effect, this changes the voice quality. To verify this hypothesis, we measured the configuration of the speech organs of professional actors simulating four emotions (neutral, hot anger, joy, and sadness) with and without speaking by magnetic resonance imaging. The results clearly showed that emotions affect the speech organ configuration, and the same tendency of changes in the speech organ configuration was found regardless of whether or not the person was speaking. We also measured electromagnetic articulography data while a participant watched a relaxation or horror movie, and the result implies that emotional changes can deform the speech organ configuration even if the participant does not speak. These results support our hypothesis.
-
Transfer functions of solid vocal-tract models constructed from ATR MRI database of Japanese vowel production
Tatsuya Kitamura, Hironori Takemoto, Seiji Adachi, Kiyoshi Honda
Acoustical Science and Technology 30 ( 4 ) 288 - 296 April 2009