Faculty and Researcher Details - Tatsuya Kitamura (北村　達也)

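A preliminary survey of the vocabulary used in the audio of educational TV programs for elementary school students

Other links: https://dblp.uni-trier.de/db/journals/corr/corr2111.html#abs-2111-03629

淺井優介, 北村達也, 川村よし子

甲南大学紀要知能情報学編 13 ( 1 ) 67 - 75 July 2020

Co-authored

Publisher: 甲南大学

To provide basic data for creating teaching materials for children who need Japanese-language education, we transcribed the audio of NHK E-Tele educational programs for elementary school students and built a prototype vocabulary list. Using speech recognition, we transcribed 17 programs for the lower grades (510 minutes in total) and 19 programs for the upper grades (570 minutes in total), sorted the words that appeared in descending order of a usefulness index, and compiled the vocabulary list. The resulting list contained easier words than a vocabulary list built in previous work from a written-language corpus, which demonstrates the effectiveness of the proposed method.

DOI： 10.14990/00003648

Other links: http://doi.org/10.14990/00003648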

System Integration for Component-Based Manzai Robots with Improved Scalability

Tomohiro Umetani, Satoshi Aoki, Tatsuya Kitamura, Akiyo Nadamoto

Journal of Robotics and Mechatronics 32 ( 2 ) 459 - 468 April 2020

Publisher: Fuji Technology Press Ltd.

This paper describes system developments for integrating control systems of Manzai robot duos that automatically generate Manzai scripts from Internet articles based on given keywords, as well as improvements in the scalability of the integrated control system. Component-based Manzai robots controlled by RT-Middleware have been developed. However, conventional Manzai robot systems, the control systems of which are individually developed, experience some difficulties in interface integration and system maintenance as well as in scalability. In this study, we built a Manzai robot system excellent in reusability, maintainability and scalability by separating the common part from the hardware-dependent part by using the RT components of RT-Middleware. We also verify the reusability and scalability of the hardware-constrained component groups by implementing the Manzai robot control system into ready-made robots with different types of mechanism. We proved the effectiveness of the developed Manzai robot control system on its implementation results.

DOI： 10.20965/jrm.2020.p0459

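An investigation of the conditions under which delayed F0 fall (ososagari) occurs in word utterances by Tokyo dialect speakers (peer-reviewed)

北村達也, 天川雄太, 波多野博顕

音声研究 23 ( 0 ) 165 - 173 December 2019

Co-authored

Publisher: 日本音声学会

Ososagari (delayed fall) is a phenomenon in which the fall in fundamental frequency occurs on the mora following the accent nucleus. In this study, we investigated the conditions under which ososagari occurs, using 230 words read aloud by 48 speakers of the Tokyo dialect (21 male, 27 female). The results showed that ososagari (1) occurs more readily in female than in male speakers, (2) appears more readily in head-high (atamadaka) words than in mid-high (nakadaka) words, (3) appears more readily in words with more morae, and (4) appears more readily in words that have an open vowel in the mora following the accented mora.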

DOI： 10.24467/onseikenkyu.23.0_165

Further observations on a principal components analysis of head-related transfer functions (peer-reviewed)

Parham Mokhtari, Hiroaki Kato, Hironori Takemoto, Ryouichi Nishimura, Seigo Enomoto, Seiji Adachi, Tatsuya Kitamura

Scientific Reports 9 7477 May 2019

Co-authored

DOI： 10.1038/s41598-019-43967-0

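A questionnaire survey of university and graduate students on self-awareness of difficulty in speaking (peer-reviewed)

北村達也, 能田由紀子, 吐師道子, 竹本浩典

日本音響学会誌 75 ( 3 ) 118 - 124 March 2019

Co-authored

We conducted a questionnaire survey on self-awareness of everyday difficulty in speaking among undergraduate and graduate students at 15 universities in Japan whose native language is Japanese. The survey was administered as a paper questionnaire. Of the respondents, we analyzed the 1,831 who reported no speech or hearing problems. Of these, 31.0% answered "yes" or "somewhat yes" when asked whether they feel that their pronunciation does not go well in everyday conversation. By sex, 35.5% of men and 24.4% of women gave these answers, and those who were aware of difficulty in speaking tended to feel that they are often asked to repeat what they have said.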

DOI： 10.20697/jasj.75.3_118

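An analysis of the usage of a Japanese-language education support system that determines the lesson in which each textbook word first appears

北村達也

甲南大学紀要知能情報学編 11 ( 2 ) 209 - 215 February 2019

Single-authored

Publisher: 甲南大学

We developed a system that automatically determines the lesson in which each word in a Japanese-language textbook first appears (its first-appearance lesson) and surveyed how the system is used. In the period from April 1 to July 31, 2018, the system received more than 10,000 accesses, about 90% of which came from within Japan. In a questionnaire survey of 100 users, about 60% were Japanese-language teachers by profession, and about 80% answered that the availability of such a system influences their choice of textbook.

DOI： 10.14990/00003307

Other links: http://doi.org/10.14990/00003307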
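
The first-appearance lookup described in this abstract can be illustrated with a minimal sketch. It assumes the textbook is already segmented into one word list per lesson; the function name `first_appearance_lessons` and the sample data are hypothetical and are not taken from the published system, which would also need morphological analysis to split Japanese text into words.

```python
from typing import Dict, List


def first_appearance_lessons(lessons: List[List[str]]) -> Dict[str, int]:
    """Map each word to the 1-based number of the lesson where it first appears."""
    first_seen: Dict[str, int] = {}
    for lesson_number, words in enumerate(lessons, start=1):
        for word in words:
            # setdefault keeps the earliest lesson number recorded for a word.
            first_seen.setdefault(word, lesson_number)
    return first_seen


# Hypothetical three-lesson textbook, already split into words.
textbook = [
    ["わたし", "学生", "です"],
    ["これ", "本", "です"],
    ["学生", "本", "読みます"],
]
lookup = first_appearance_lessons(textbook)
print(lookup["です"])      # 1
print(lookup["読みます"])  # 3
```

A word that is absent from the textbook raises `KeyError` here; a real tool would more likely report it as "not in this textbook".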

Morphological characteristics of male and female hypopharynx: A magnetic resonance imaging-based study (peer-reviewed)

Ju Zhang, Kiyoshi Honda, Jianguo Wei, Tatsuya Kitamura

Journal of the Acoustical Society of America 145 734 February 2019

Co-authored

DOI： 10.1121/1.5089220

Development of a biofeedback system using skin vibration during tube phonation and an examination of its effects (peer-reviewed)

川村直子, 北村達也, 城本修

音声言語医学 59 ( 4 ) 334 - 341 September 2018

Co-authored

DOI： 10.5112/jjlp.59.334

Audio-Visual Teaching Aid for Instructing English Stress Timings

Tatsuya Kitamura, Ryo Nagata, Kotaro Funakoshi

甲南大学紀要知能情報学編 11 ( 1 ) 1 - 17 July 2018

Co-authored

Publisher: 甲南大学

This study proposed and evaluated an audio-visual teaching aid for teaching the rhythm of spoken English. The teaching aid conveys the stress timing of English through the movement of a circular marker on a PC screen. Native Japanese participants practiced English sentences with and without the teaching aid, and their speech was recorded before and after the practice. Analyses of the recordings showed that the teaching aid could improve the learning of English stress timing.

DOI： 10.14990/00003196

Other links: http://doi.org/10.14990/00003196

Replacement of sensor cables for reducing effects on articulation in the Northern Digital Incorporated's Wave electromagnetic articulography system (peer-reviewed)

Tatsuya Kitamura, Yukiko Nota, Michiko Hashi, Hiroaki Hatano

JASA Express Letters 143 ( 3 ) EL154 - EL159 March 2018

Co-authored

DOI： 10.1121/1.5025167


Construction of a self-contained, component-driven desktop robot environment

梅谷智弘, 清瀬大貴, 榊原洋之, 青木哲, 北村達也

計測自動制御学会論文誌 54 ( 1 ) 126 - 128 2018

Co-authored

DOI： 10.9746/sicetr.54.126

Scalable Component-Based Manzai Robots as Automated Funny Content Generators

Tomohiro Umetani, Satoshi Aoki, Kazuhiro Akiyama, Ryo Mashimo, Tatsuya Kitamura, Akiyo Nadamoto

Journal of Robotics and Mechatronics 28 ( 6 ) 862 - 869 December 2016

Co-authored

DOI： 10.20965/jrm.2016.p0862

Implicit Communication Robots based on Automatic Scenario Generation using Web Intelligence

MASHIMO Ryo, KITAMURA Tatsuya, UMETANI Tomohiro, NADAMOTO Akiyo

International Journal of Web Information Systems 12 ( 3 ) 312 - 335 September 2016

Co-authored

DOI： 10.1108/IJWIS-04-2016-0017

A text editor with a word-classification function based on word lists (peer-reviewed)

北村達也

日本語学 ( 8 ) 80 - 87 August 2016

Single-authored

Manzai robot system with scalability based on distributed software components

Tomohiro Umetani, Satoshi Aoki, Kazuhiro Akiyama, Ryo Mashimo, Tatsuya Kitamura, Akiyo Nadamoto

2015 International Symposium on Micro-NanoMechatronics and Human Science, MHS 2015 March 2016

Co-authored

Publisher: Institute of Electrical and Electronics Engineers Inc.

This paper describes a manzai robot system with scalability that is developed based on the distributed software components. Manzai is a Japanese traditional stand-up comedy that is usually performed by two comedians. The manzai robots generate their manzai scripts based on web news articles related to keywords given by audiences and the searching results on WWW automatically, and then the robots perform the manzai scripts. Each robot is controlled by distributed RT components executed on the Raspberry Pi controller. The RT components control the manzai robots synchronously. The paper focuses on the scalability of the manzai robot system. Experimental results show the feasibility of manzai performance robots with scalability of the functions of the robot systems.

DOI： 10.1109/MHS.2015.7438343

Observation of the relationship between articulatory movement and palate shape using a magnetic sensor system

北村達也, 能田由紀子, 吐師道子, 波多野博顕, 梅谷智弘

音声言語医学 57 ( 1 ) 52 - 52 January 2016

Co-authored

Publisher: 日本音声言語医学会

Human-Robots Implicit Communication based on Dialogue between Robots using Automatic Generation of Funny Scenarios from Web

Ryo Mashimo, Tomohiro Umetani, Tatsuya Kitamura, Akiyo Nadamoto

ELEVENTH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN ROBOT INTERACTION (HRI'16) 327 - 334 2016

Co-authored

Publisher: ASSOC COMPUTING MACHINERY

Numerous studies have examined communication robots that communicate with people, but it is difficult for robots to communicate with people smoothly. We call the communication style based on dialogue between robots "human-robot implicit communication". We propose Manzai robots whose interaction style is human-robot implicit communication based on a scenario generated automatically from web news. The generated Manzai scenario consists of snappy patter and misunderstanding dialogue based on four kinds of gaps in the structure of funny points. Our aim is for people to feel familiarity through smooth human-robot communication that uses dialogue between robots based on a Manzai scenario. We conducted three kinds of experiments to assess (1) the effectiveness of automatically creating Manzai scenarios for the robots, (2) the effectiveness of the Manzai robots as a medium, and (3) the effectiveness of types of familiarity for the Manzai robots. Based on the results, we measured the familiarity and smoothness of communication of our Manzai robots.

Automatic generation of Japanese traditional funny scenario from web content based on web intelligence

Ryo Mashimo, Tomohiro Umetani, Tatsuya Kitamura, Akiyo Nadamoto

17th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2015 - Proceedings December 2015

Co-authored

Publisher: Association for Computing Machinery, Inc

Today there is much information and knowledge on the internet, and many studies have examined the extraction of many kinds of knowledge from the internet. In addition, numerous studies have examined entertainment robots that communicate with people, but it is difficult for robots to communicate smoothly with people. We specifically examine communication between robots based on dialogue. Here, we create a dialogue-based scenario for the robots to undertake automatically, but it is difficult because the dialogue requires knowledge of many kinds. We consider the use of the knowledge from the web and create scenarios automatically. As described herein, we propose a system that generates dialogue scenarios automatically from web news articles in real time. We used the Manzai metaphor, which is Japanese traditional humorous comedy in our system. Our generated Manzai scenario consists of snappy patter and a misunderstanding dialogue based on the gap of our structure of funny points. We create communication robots to amuse people with our generated humorous robot dialogue scenarios.

DOI： 10.1145/2837185.2837232

Articulatory dynamics of young adults who are aware of difficulty in speaking: the alveolar flap

立川渉, 小澤由嗣, 吐師道子, 北村達也, 能田由紀子

音声研究 19 ( 3 ) 50 - 56 December 2015

Co-authored

Publisher: 日本音声学会

DOI： 10.24467/onseikenkyu.19.3_50

Crucial Prosodic Features in Japanese Learners' Pronunciation: Evidence from Naturalness Judgments of Synthetic Speech (peer-reviewed)

Rongna A, Ryoko Hayashi, Tatsuya Kitamura

音声研究 19 ( 3 ) 37 - 42 December 2015

Co-authored

DOI： 10.24467/onseikenkyu.19.3_37

Non-contact measurement of facial surface vibration patterns during singing by scanning laser Doppler vibrometer

Tatsuya Kitamura, Keisuke Ohtani

Frontiers in Psychology, section Performance Science 6 November 2015

Co-authored

DOI： 10.3389/fpsyg.2015.01682

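Measurement of perceptual inter-speaker similarity for sentence utterances in the ATR speech database (peer-reviewed)

北村達也, 中間隆正, 大村宙, 川元広樹

日本音響学会誌 71 ( 10 ) 516 - 525 October 2015

Co-authored

We evaluated the similarity of speaker individuality for sentence utterances by 20 male and 20 female speakers from the Kanto region in Set C of the ATR speech database. Utterances from two speakers of the same sex were paired, all speaker combinations were presented to participants, and the similarity of each pair was rated on a five-point scale. The same experiment was then repeated with a different group of participants to confirm the reproducibility of the results. From the results, we obtained perceptual inter-speaker similarities and placed the speakers on a plane using non-metric multidimensional scaling. The features most highly correlated with the resulting speaker configuration were mean F0, speaker age, and total pause duration for male speakers, and mean F0, utterance duration, and speaker age for female speakers.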

DOI： 10.20697/jasj.71.10_516

Improvement of five-degree-of-freedom sensors for Northern Digital Incorporated's Wave speech research system

Tatsuya Kitamura, Yukiko Nota, Michiko Hashi, Hiroaki Hatano

Acoustical Science and Technology 36 ( 4 ) 347 - 350 2015

Co-authored

DOI： 10.1250/ast.36.347

Manzai Robots: Entertainment Robots Based on Auto-Created Manzai Scripts from Web News Articles

UMETANI Tomohiro, MASHIMO Ryo, NADAMOTO Akiyo, KITAMURA Tatsuya, NAKAYAMA Hirotaka

J Robot Mechatron 26 ( 5 ) 662 - 664 October 2014

Co-authored

DOI： 10.20965/jrm.2014.p0662

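Verification of reproducibility in measuring skin vibration during singing with a scanning laser Doppler vibrometer (peer-reviewed)

北村達也

音声言語医学 55 ( 2 ) 167 - 172 April 2014

Single-authored

In this study, we measured the vibration of the facial skin during singing several times with a scanning laser Doppler vibrometer and evaluated the differences between measurements. A laser Doppler vibrometer is a system that shines laser light on an object and measures the object's vibration velocity and displacement from the Doppler effect that the vibration produces in the reflected light. A scanning vibrometer can additionally scan a set of pre-specified measurement points automatically and measure the vibration at each. Three participants with vocal training took part. The experiment was carried out in a sitting position, and the head was fixed by resting the forehead against the frame of a chin rest. Each participant sang the vowel /a/ continuously at a comfortable pitch, and the skin vibration velocity during the sung interval was measured. Comparing the three measurements, the mean squared error was 4.0 dB or less. In addition, 2.4% of all measurement points deviated by 6 dB from the median of the three measurements, and many of these lay in areas that the laser light could not easily strike perpendicularly.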

DOI： 10.5112/jjlp.55.167

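Acoustic and articulatory characteristics of English reduced vowels produced by native speakers of Japanese and English: an analysis based on X-ray microbeam databases

波多野博顕, 北村達也

日本音響学会誌 70 ( 3 ) 106 - 113 March 2014

Co-authored

Publisher: 一般社団法人日本音響学会

To describe the acoustic and articulatory characteristics of the English reduced vowel /〓/ and to clarify how they differ between native speakers of Japanese and English, we carried out a quantitative analysis of the English vowels /〓,〓,〓,〓/. The analysis used X-ray microbeam databases: 16 English speakers were selected from the "X-ray microbeam speech production database" and 9 Japanese speakers from its Japanese version. Each vowel was extracted from word utterances, and its duration, first and second formant frequencies, and tongue pellet positions were measured. The results are summarized as follows. 1) For both speaker groups, the duration of /〓/ is shorter than that of /〓,〓/. 2) For /〓/, native English speakers centralize the tongue in both the vertical and front-back directions, whereas native Japanese speakers do so only in the vertical direction. 3) Only native English speakers show articulatory assimilation of /〓/ to the following consonant, which is attributable to the phonological vowel categories of Japanese and English.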

DOI： 10.20697/jasj.70.3_106

Observation of consonant articulation using improved sensors for the NDI Wave speech observation system

北村達也, 能田由紀子, 波多野博顕, 吐師道子, 西谷実

音声言語医学 55 ( 1 ) 59 - 59 January 2014

Co-authored

Publisher: 日本音声言語医学会

Comparison of vocal tract transfer functions calculated using one-dimensional and three-dimensional acoustic simulation methods

Hironori Takemoto, Parham Mokhtari, Tatsuya Kitamura

15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4 408 - 412 2014

Co-authored

Publisher: ISCA-INT SPEECH COMMUNICATION ASSOC

Acoustic characteristics of the vocal tract have been investigated extensively in the literature using a one-dimensional (1D) acoustic simulation method. Because the 1D method assumes plane wave propagation only, it is recognized to be valid only in the low frequency region (below about 4 or 5 kHz). Recently, a three-dimensional (3D) acoustic simulation method was developed to obtain more precise acoustic characteristics of the vocal tract. In the present study, transfer functions were calculated from a male speaker's vocal tract shapes using the 1D and 3D methods and compared with each other to evaluate the valid frequency range of the 1D method. As a result, when acoustic effects of the piriform fossae were considered in the 1D method, the transfer functions agreed with each other up to 7 kHz (ignoring small dips). The 3D method showed that a deep dip was generated at around 8 kHz by the transverse resonance mode in the pharynx. Above this dip frequency, the transfer functions disagreed with each other. Thus, the 1D method is valid up to 7 kHz for this subject. Because this subject has a relatively large vocal tract, in general the upper limit of the valid frequency range could exceed 8 kHz.

Vocal tract length estimation based on vowels using a database consisting of 385 speakers and a database with MRI-based vocal tract shape information

Hideki Kawahara, Tatsuya Kitamura, Hironori Takemoto, Ryuichi Nisimura, Toshio Irino

15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4 870 - 874 2014

Co-authored

Publisher: ISCA-INT SPEECH COMMUNICATION ASSOC

A highly reproducible method for estimating vocal tract length (VTL) and a text-independent VTL estimation method are proposed, based on a Japanese vowel database spoken by 385 male and female speakers aged 6 to 56 and another vowel database with MRI-based vocal tract shape information. The proposed methods are based on an interference-free power spectral representation and systematic suppression of biasing factors. The MRI data are used to calibrate the VTL estimates so that they are expressed in physically meaningful units. These databases are normalized based on the estimated VTL information to provide a reference template, which is used to implement a text-independent VTL estimation method. A prototype system for text-independent VTL estimation is implemented in MATLAB and runs faster than real time on a PC.

Acoustic interaction between the right and left piriform fossae in generating spectral dips (peer-reviewed)

Hironori Takemoto, Seiji Adachi, Parham Mokhtari, Tatsuya Kitamura

Journal of the Acoustical Society of America 134 ( 4 ) 2955 - 2964 October 2013

Co-authored

DOI： 10.1121/1.4818744

Effects of prosody conversion of Japanese learners' speech on naturalness ratings

阿栄娜, 林良子, 北村達也

日本音響学会2013年秋季研究発表会講演論文集 425 - 426 September 2013

Co-authored

Naturalness on Japanese pronunciation before and after shadowing training and prosody modified stimuli

Rongna A, Ryoko Hayashi, Tatsuya Kitamura

Proceedings of Interspeech 2013 Satellite workshop on Speech and Language Technology in Education 143 - 146 August 2013

Co-authored

Construction and trial operation of a text and difficulty assessment system for learners of Japanese

川村よし子, 北村達也

Journal CAJLE 14 18 - 30 July 2013

Co-authored

Timing differences in articulation between voiced and voiceless stop consonants: An analysis of cine-MRI data

Masako Fujimoto, Tatsuya Kitamura, Hiroaki Hatano, Ichiro Fujimoto

14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5 955 - 958 2013

Co-authored

Publisher: ISCA-INT SPEECH COMMUNICATION ASSOC

Laryngeal and supralaryngeal articulators coordinately work to produce speech sounds. In order to study differences in supralaryngeal manifestations of voiced and voiceless consonants, we compared the tongue movement during a minimal pair /agise/ and /akise/ using the fast scanning techniques of MRI movies. The result showed that the tongue displacement starts earlier in /k/ than in /g/ for many of the speakers of Tokyo Japanese. This agrees with our previous findings using other dialect speakers. These results suggest that many Japanese actively differentiate supralaryngeal articulation according to the voicing of the consonants, raising the tongue earlier in voiceless ones. This movement is presumably to ensure the voicelessness of the consonant. The present study also supplies evidence for the usefulness of a constructive approach for physical modeling.

Differences in articulatory movement between voiced and voiceless stop consonants (peer-reviewed)

Ryosuke O. Tachibana, Tatsuya Kitamura, Masako Fujimoto

Acoustical Science and Technology 33 ( 6 ) 391 - 393 November 2012

Co-authored

DOI： 10.1250/ast.33.391

Measurement of vibration velocity pattern of facial surface during phonation using scanning vibrometer

Tatsuya Kitamura

Acoustical Science and Technology 33 ( 2 ) 126 - 128 March 2012

Single-authored

DOI： 10.1250/ast.33.126

A Method for Predicting Stressed Words in Teaching Materials for English Jazz Chants (peer-reviewed)

NAGATA Ryo, FUNAKOSHI Kotaro, KITAMURA Tatsuya, NAKANO Mikio

IEICE Trans Inf Syst (Inst Electron Inf Commun Eng) E95.D ( 11 ) 2658 - 2663 2012

Co-authored

DOI： 10.1587/transinf.E95.D.2658

Correlation between vocal tract length, body height, formant frequencies, and pitch frequency for the five Japanese vowels uttered by fifteen male speakers

Hiroaki Hatano, Tatsuya Kitamura, Hironori Takemoto, Parham Mokhtari, Kiyoshi Honda, Shinobu Masaki

13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3 402 - 405 2012

Co-authored

Publisher: ISCA-INT SPEECH COMMUNICATION ASSOC

We conducted quantitative analyses of a magnetic resonance imaging (MRI) database to examine the correlation between physical measures (vocal tract length and body height) and acoustic parameters (pitch and formant frequencies) of vowels. The vocal tract length was measured from MRI data for the five Japanese vowels produced by fifteen male Japanese speakers between the ages of 24 and 55. The acoustic features were computed from vowel sounds recorded during scan. The vocal tract length showed a weak positive correlation with the speakers' age (correlation coefficient r = 0.51) but not with the speaker body height (r = 0.08). There were only weaker correlations between the vocal tract length and the first four formant frequencies except that F1 and F2 of the vowel /e/ show negative correlations with the vocal tract length (F1: r = -0.65, F2: r = -0.56). The result suggests that the vocal tract length is one of the dominant factors causing individual differences in the formant frequencies for the vowel /e/, produced by not forming a strong constriction. Furthermore, the pitch frequency was negatively correlated with the body height (r = -0.61).
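
The correlation coefficients r quoted in this abstract are presumably Pearson product-moment coefficients, although the abstract does not name the statistic. As a minimal sketch under that assumption, the computation looks like the following; the speaker values are invented for illustration and are not data from the paper.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations about the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: vocal tract length (cm) and F1 (Hz) for five speakers.
vocal_tract_length_cm = [16.2, 16.8, 17.1, 17.5, 18.0]
f1_hz = [520.0, 505.0, 498.0, 480.0, 470.0]
r = pearson_r(vocal_tract_length_cm, f1_hz)
print(f"r = {r:.2f}")  # negative: longer tracts pair with lower F1 in this toy data
```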

Simulation of the coupling between vocal-fold vibration and time-varying vocal tract

Yosuke Tanabe, Parham Mokhtari, Hironori Takemoto, Tatsuya Kitamura

Journal of the Acoustical Society of America 130 ( 4 ) 2441 October 2011

Co-authored

Study of perceptual factors for speaker identification focusing on perceptual similarity of speaker characteristics

Tsuyoshi Izumida, Tatsuya Kitamura

Acoustical Science and Technology 32 ( 5 ) 216 - 219 September 2011

Co-authored

DOI： 10.1250/ast.32.216

Dental imaging using a magnetic resonance visible mouthpiece for measurement of vocal tract shape and dimension

Tatsuya Kitamura, Hironori Nishimoto, Ichiro Fujimoto, Yasuhiro Shimada

Acoustical Science and Technology 32 ( 5 ) 224 - 227 September 2011

Co-authored

Acoustic analysis of the vocal tract during vowel production by finite-difference time-domain method (peer-reviewed)

Hironori Takemoto, Parham Mokhtari, Tatsuya Kitamura

Journal of the Acoustical Society of America 128 ( 6 ) 3724 - 3738 December 2010

Co-authored

DOI： 10.1121/1.3502470

Visualisation of hypopharyngeal cavities and vocal-tract acoustic modelling

Kiyoshi Honda, Tatsuya Kitamura, Hironori Takemoto, et al.

Computer Methods in Biomechanics and Biomedical Engineering 13 ( 4 ) 443 - 453 July 2010

Co-authored

DOI： 10.1080/10255842.2010.490528

A study of brain activities elicited by synthesized emotional voices controlled with prosodic features (peer-reviewed)

Yasuhiro Hamada, Tatsuya Kitamura, Masato Akagi

Journal of Signal Processing 14 ( 4 ) 265 - 268 July 2010

Co-authored

Similarity of effects of emotions on the speech organ configuration with and without speaking

Tatsuya Kitamura

11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2 909 - 912 2010

Single-authored

Publisher: ISCA-INST SPEECH COMMUNICATION ASSOC

In this work we propose and verify a hypothesis on emotional speech production: emotions induce physical and physiological changes in the whole body including changes in the configuration and physical/mechanical properties of the speech organs, regardless of whether or not the person is speaking, and as a side effect, this changes the voice quality. To verify this hypothesis, we measured the configuration of the speech organs of professional actors simulating four emotions (neutral, hot anger, joy, and sadness) with and without speaking by magnetic resonance imaging. The results clearly showed that emotions affect the speech organ configuration, and the same tendency of changes in the speech organ configuration was found regardless of whether or not the person was speaking. We also measured electromagnetic articulography data while a participant watched a relaxation or horror movie, and the result implies that emotional changes can deform the speech organ configuration even if the participant does not speak. These results support our hypothesis.

Transfer functions of solid vocal-tract models constructed from ATR MRI database of Japanese vowel production

Tatsuya Kitamura, Hironori Takemoto, Seiji Adachi, Kiyoshi Honda

Acoustical Science and Technology 30 ( 4 ) 288 - 296 April 2009

Co-authored

DOI： 10.1250/ast.30.288

Resonance characteristics of hypopharyngeal cavities

Kiyoshi Honda, Tatsuya Kitamura, Hironori Takemoto, Parham Mokhtari, Seiji Adachi

Journal of the Acoustical Society of America 123 ( 5 ) 3731 July 2008

Co-authored

MRI-Based Study on Morphological and Acoustic Properties of Mandarin Sustained Vowels

WANG Gaowu, KITAMURA Tatsuya, LU Xugang, DANG Jianwu, KONG Jiangping

J Signal Process 12 ( 4 ) 311 - 314 July 2008

Co-authored

Deformation of the hypopharyngeal cavities due to F0 changes and its acoustic effects (peer-reviewed)

Hironori Takemoto, Tatsuya Kitamura, Kiyoshi Honda, Shinobu Masaki

Acoustical Science and Technology April 2008

Co-authored

Single-matrix formulation of a time domain acoustic model of the vocal tract with side branches

MOKHTARI Parham, TAKEMOTO Hironori, KITAMURA Tatsuya

Speech Communication 50 ( 3 ) 179 - 190 March 2008

Co-authored

DOI： 10.1016/j.specom.2007.08.001

Effects of acoustic modification on perception of speaker characteristics for sustained vowels

Tatsuya Kitamura, Takeshi Saitou

Acoustical Science and Technology June 2007

Co-authored

Vocal tract length perturbation and its application to male-female vocal tract shape conversion

Seiji Adachi, Hironori Takemoto, Tatsuya Kitamura, Parham Mokhtari, Kiyoshi Honda

Journal of the Acoustical Society of America 121 ( 6 ) 3874 - 3885 June 2007

Co-authored

DOI： 10.1121/1.2730743

Principal components of vocal tract area functions and inversion of speech by linear regression of cepstrum coefficient

Parham Mokhtari, Hironori Takemoto, Tatsuya Kitamura, Kiyoshi Honda

Journal of Phonetics January 2007

Co-authored

A bone-conduction system for auditory stimulation in MRI (peer-reviewed)

Yukiko Nota, Tatsuya Kitamura, Hironori Takemoto, Hiroyuki Hirata, Kiyoshi Honda, Yasuhiro Shimada, Ichiro Fujimoto, Yuko Syakudo, Shinobu Masaki

Acoustical Science and Technology January 2007

Co-authored

Principal components of vocal-tract area functions and inversion of vowels by linear regression of cepstrum coefficients

Parham Mokhtari, Tatsuya Kitamura, Hironori Takemoto, Kiyoshi Honda

JOURNAL OF PHONETICS 35 ( 1 ) 20 - 39 January 2007

Co-authored

Publisher: ACADEMIC PRESS LTD- ELSEVIER SCIENCE LTD

This paper addresses the following two hypotheses: (i) vocal-tract area functions of Japanese vowels can be accurately represented by a linear combination of only a few principal components which, furthermore, are similar to those reported in the literature for different languages; and (ii) the principal components' weights can be predicted and area functions thereby accurately estimated from acoustics by linear regression of cepstrum parameters. To test these hypotheses, synchronized acoustic and vocal-tract 3D MRI data were recorded from an adult male Japanese speaker for both sustained and dynamic vowel utterances. The first two principal components explained covariations in vocal-tract shape and length accounting for 94-97% of the total variance, and indeed provided a cross-linguistic validation of the two underlying components of vowel production emergent from the literature. Multiple linear regression models were then evaluated for their accuracy in reconstructing the area functions of the dynamic utterance by predicting the first two PC coefficients, using either carefully measured formants or cepstral coefficients defined in various frequency bands. The best formant-based regression model required all four formants, with a mean adjusted correlation of 0.93 and mean absolute errors of 0.187 cm² in area and 0.131 cm in vocal-tract length. The best cepstrum-based regression model prescribed 24 cepstral coefficients defined in the frequency band 0-4 kHz, with a mean adjusted correlation of 0.92 and mean absolute errors of 0.102 cm² in area and 0.082 cm in vocal-tract length. These results suggest that vowel production features, properly constrained by PCA modeling, can be mapped with sufficient accuracy from easily measured cepstrum parameters. More work is required to reduce the dependence on MRI data, to extend the applicability of these methods to different voice qualities and different speakers, and to select a smaller subset of acoustic parameters for more robust, real-time inversion. © 2006 Elsevier Ltd. All rights reserved.

DOI： 10.1016/j.wocn.2006.01.001

An MRI-based time-domain speech synthesis system (peer-reviewed)

Tatsuya Kitamura, Hironori Takemoto, Parham Mokhtari, Toshio Hirai

Journal of the Acoustical Society of America 120 ( 5 ) 3037 December 2006

Co-authored

Changes in vocal tract resonance during a pitch cycle (peer-reviewed)

Tatsuya Kitamura, Seiji Adachi

Journal of the Acoustical Society of America 120 ( 5 ) 3351 December 2006

Co-authored

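Measurement of noise during MRI scanning using an optical microphone

北村達也, 正木信夫, 島田育廣, 藤本一郎, 赤土裕子, 本多清志

日本音響学会誌 62 ( 5 ) 379 - 382 May 2006

Co-authored

As MRI scanners become more widespread and operate at higher magnetic field strengths, the noise produced during scanning has become a problem that can no longer be ignored. However, ordinary equipment containing magnetic materials cannot be used to measure noise inside the scanner. Moreover, even with measurement equipment that can be used in a strong magnetic field, noise caused by the RF pulses (electromagnetic waves) may contaminate the measurement if any part of the equipment contains a conductor. In this study, we solved this problem by using an optical microphone made entirely of non-conductive materials and measured the noise of four imaging sequences on a 1.5 T MRI scanner: spin-echo T1-weighted, fast spin-echo T2-weighted, RF-FAST, and single-shot EPI. The sound pressure levels of these sequences were 112.6 dB, 112.5 dB, 107.8 dB, and 110.6 dB, respectively.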

DOI： 10.20697/jasj.62.5_379

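A study on the effectiveness of estimating vocal tract transfer characteristics with the finite element method

西本博則, 赤木正人, 北村達也, 鈴木規子

日本音響学会誌 62 ( 4 ) 306 - 315 April 2006

Co-authored

To investigate the accuracy with which the finite element method and a vocal tract equivalent-circuit model estimate vocal tract transfer characteristics, we compared the peaks of transfer characteristics estimated from vocal tract models obtained by MR measurement with the formants of real speech. For a healthy subject, both methods gave accurate estimates. For a subject with a complex, left-right asymmetric vocal tract shape, the finite element method yielded the same number of peaks as speech formants, with no large difference between the formant frequencies and the peak frequencies of the transfer characteristics, whereas with the equivalent-circuit model the number of transfer-characteristic peaks did not match the number of speech formants. These results show that, for estimating the transfer characteristics of a complex, left-right asymmetric vocal tract, the equivalent-circuit model is unsuitable and the finite element method is effective.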

DOI： 10.20697/jasj.62.4_306

Acoustic roles of the laryngeal cavity in vocal tract resonance (peer-reviewed)

Hironori Takemoto, Seiji Adachi, Tatsuya Kitamura, Parham Mokhtari, Kiyoshi Honda

Journal of the Acoustical Society of America April 2006

Co-authored

Cyclicity of laryngeal cavity resonance due to vocal fold vibration (peer-reviewed)

Tatsuya Kitamura, Hironori Takemoto, Seiji Adachi, Parham Mokhtari, Kiyoshi Honda

Journal of the Acoustical Society of America 120 ( 4 ) 2239 - 2249 April 2006

Co-authored

DOI： 10.1121/1.2335428

Difference in vocal tract shape between upright and supine postures: Observations by an open-type MRI scanner

KITAMURA Tatsuya, TAKEMOTO Hironori, HONDA Kiyoshi, SHIMADA Yasuhiro, FUJIMOTO Ichiro, SYAKUDO Yuko, MASAKI Shinobu, KURODA Kagayaki, OKU-UCHI Noboru, SENDA Michio

Acoustical Science and Technology 26 ( 5 ) 465 - 468 September 2005

Co-authored

DOI： 10.1250/ast.26.465

Individual variation of the hypopharyngeal cavities and its acoustic effects

Tatsuya Kitamura, Kiyoshi Honda, Hironori Takemoto

Acoustical Science and Technology 26 ( 1 ) 16 - 26 January 2005

Co-authored

A method of tooth superimposition on MRI data for accurate measurement of vocal tract shape and dimensions (peer-reviewed)

Hironori Takemoto, Tatsuya Kitamura, Hironori Nishimoto, Kiyoshi Honda

Acoustical Science and Technology 25 ( 6 ) 468 - 474 November 2004

Co-authored

Exploring human speech production mechanisms by MRI

Kiyoshi Honda, Hironori Takemoto, Tatsuya Kitamura, Satoru Fujita, Sayoko Takano

IEICE Transactions on Information and Systems E87-D ( 5 ) 1050 - 1058 May 2004

Co-authored

Building a bank of reading-comprehension materials using the Internet

川村よし子, 北村達也

世界の日本語教育. 日本語教育事情報告編 6 241 - 255 2001

Co-authored

Development of a dictionary tool for Japanese-language education using the EDR Electronic Dictionary (peer-reviewed)

川村よし子, 北村達也, 保原麗

日本教育工学雑誌 24 7 - 12 August 2000

Co-authored

Development and evaluation of a Japanese reading support system with a learning-history management function

北村達也, 川村よし子, 内山潤, 寺朱美, 奥村学

日本教育工学雑誌 23 ( 3 ) 127 - 133 1999

Co-authored

Publisher: 日本教育工学会

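Spectral envelope components that contribute to speaker identification for isolated vowels

北村達也, 赤木正人

日本音響学会誌 53 ( 3 ) 185 - 191 March 1997

Co-authored

Publisher: 一般社団法人日本音響学会

We examined the frequency bands of the spectral envelope of isolated vowels in which speaker individuality appears prominently, and the components within those bands that contribute to speaker identification. Through listening experiments with stimuli in which specific bands of the spectral envelope were modified, we obtained a quantitative relationship between spectral envelope modification and the perception of individuality. The results showed the following. (1) Individuality appears across the whole spectral envelope but more so in the higher frequencies. (2) Peaks in the spectral envelope are more important than dips for speaker identification. (3) Regardless of the phoneme, individuality is likely to appear prominently in the band above the peak located near 20 ERB rate (1,740 Hz) of the spectral envelope, and speaker conversion is possible using this band. (4) Individuality is preserved even when the peaks in this band are approximated by triangles.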

DOI： 10.20697/jasj.53.3_185

Speaker individualities in speech spectral envelopes

Tatsuya Kitamura, Masato Akagi

Journal of the Acoustical Society of Japan (E) 16 ( 5 ) 283 - 289 September 1995

Co-authored

DOI： 10.1250/ast.16.283

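A study of Viterbi best-first search in word speech recognition with discrete-distribution HMMs

好田正紀, 北村達也

電子情報通信学会論文誌. D-II, 情報・システム, II-情報処理 77 ( 7 ) 1187 - 1197 1994

Co-authored

Treating HMM-based speech recognition as a graph search problem, previous work has used beam search techniques to study pruning based only on the score up to the current node, as well as pruning that also takes into account an estimated score beyond the current node obtained from recognition with a simpler model, as in forward-backward search. Using best-first search techniques, previous work has also studied practical search methods that do not necessarily insist on strict A* search, such as stack decoding, and methods that speed up the search for N-best candidates, such as tree-trellis search. In this paper, we use best-first search techniques to speed up recognition with the HMM Viterbi algorithm, and we propose one method that sets the estimated score from maximum path scores and another that sets it using simple phoneme HMMs. We show that, when the estimated score is set appropriately, Viterbi best-first search reduces the computation for path expansion, which accounts for the major part of recognition processing, to 1% or less without lowering the recognition rate, so the reduction in computation is very large. The estimated score based on simple phoneme HMMs is accurate because it takes the temporal order into account, but setting it requires a large amount of computation. Considering both the computation for path expansion and that for setting the estimated score, the estimated score based on the within-word maximum path score is the best. This estimated score satisfies the condition for A* search, so the optimal solution is also guaranteed.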
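
For orientation, the sketch below shows the standard, exhaustive Viterbi recursion for a discrete-distribution HMM in the log domain. This is the baseline computation that the paper's best-first search is designed to speed up; it is not the best-first method itself, and the two-state model, its probabilities, and the names `viterbi` and `ln` are hypothetical choices made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def ln(p: float) -> float:
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs: Sequence[int],
            log_init: Sequence[float],
            log_trans: Sequence[Sequence[float]],
            log_emit: Sequence[Sequence[float]]) -> Tuple[float, List[int]]:
    """Return (best log score, best state path) for a discrete-output HMM."""
    n_states = len(log_init)
    score = [log_init[s] + log_emit[s][obs[0]] for s in range(n_states)]
    backpointers: List[List[int]] = []
    for symbol in obs[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            # Expand every predecessor; best-first search avoids most of this work.
            best_prev = max(range(n_states),
                            key=lambda p: score[p] + log_trans[p][s])
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + log_emit[s][symbol])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    last = max(range(n_states), key=lambda s: score[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return score[last], path


# Hypothetical two-state left-to-right model with two output symbols.
init = [ln(1.0), ln(0.0)]
trans = [[ln(0.6), ln(0.4)], [ln(0.0), ln(1.0)]]
emit = [[ln(0.9), ln(0.1)], [ln(0.2), ln(0.8)]]
best_score, best_path = viterbi([0, 0, 1, 1], init, trans, emit)
print(best_path)  # [0, 0, 1, 1]
```

The inner loop visits every state pair at every frame, which is the path-expansion cost that the abstract reports reducing to 1% or less.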