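For robots and virtual agents to converse naturally with users, building common ground is essential. Few prior studies have examined how common ground is built in human-human dialogue, and most of them target text chat. This study therefore investigated how the dialogue modality (audio, video) and the social relationship between speakers (strangers, acquaintances) affect the construction of common ground. Specifically, we collected and analyzed dialogues in which common ground is built under these conditions. The results showed that extending the modality and an acquaintance relationship both accelerate the construction of common ground. An analysis of verbal behavior revealed that extending the modality makes it easier to convey empathic intent, while an acquaintance relationship increases the utterances needed to advance the task. This suggests that, in designing robots and virtual agents, it is important to reproduce nonverbal behavior that conveys empathy and the utterance behavior seen between acquaintances.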
Empirical Evaluation of Healthcare Communication Robot Encouraging Self-Disclosure of Chronic Pain, Airi Shimada・Kazunori Takashio, The 2024 IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), August 2025
Self-disclosure of pain is essential for communicating pain, a subjective sensation, to a third party. However, many elderly people, especially those with chronic pain, are hesitant to communicate their pain. As a result, many patients do not receive appropriate treatment at the right time. It is important to detect small discomforts in daily life and not to overlook the occurrence of pain or changes in it. The ultimate goal of this study is to create a robot for people with chronic pain that notices the user's discomfort through many modalities in daily interactions and tells the recorded information to a hospital or family if necessary. In this study, we conducted fieldwork at Nichinan Hospital and, based on the findings, propose and verify a dialogue system that encourages self-disclosure of pain. In this paper, we implemented a system that detects discomfort based on the user's utterances about pain and the action of rubbing, and asks detailed questions about the pain. We conducted a demonstration experiment with patients of Nichinan Hospital, and the content of the dialogue was evaluated by a physical therapist. The proposed method received significantly higher ratings for the naturalness of the conversation, the ease of use of the system, and the length of the conversation. The physical therapist's evaluation suggested that the ability of the dialogue system to "notice" the user's discomfort or unusualness had a positive effect on facilitating pain communication and encouraging self-disclosure. The results suggest that it is possible to realize a dialogue system that facilitates self-disclosure of pain by users who suffer from chronic pain.
Impact Analysis of Switching Pause Synchronization for Spoken Dialogue Systems, Yosuke Ujigawa・Kazunori Takashio, The 2025 IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), August 2025
Each individual has a unique mental tempo (referred to as personal tempo), and the alignment of this tempo plays a crucial role in facilitating smooth interactions with spoken dialogue systems. This study focuses on the “switching pause,” a key component of conversational tempo that is established during interaction. Using a dialogue corpus, we analyzed the impact of switching pauses on dialogue and the process of synchronization. Through the analysis of synchronization between pairs, we examined dialogues with high similarity in switching pauses to elucidate the impact of this synchronization on goal achievement and cooperativity in dialogue. Furthermore, we conducted a time-series analysis within pairs to investigate the synchronization process and proposed a method for determining switching pauses for implementation in dialogue systems. These findings contribute significantly to the investigation of individual differences among users and the identification of personal factors that enable effective dialogue with dialogue systems.
Exploring the Impact of Modalities on Building Common Ground Using the Collaborative Scene Reconstruction Task, Yosuke Ujigawa・Asuka Shiotani・Masato Takizawa・Eisuke Midorikawa・Ryuichiro Higashinaka・Kazunori Takashio, IWSDS2025, May 2025
To deepen our understanding of verbal and non-verbal modalities in establishing common ground, this study introduces a novel “collaborative scene reconstruction task.” In this task, pairs of participants, each provided with distinct image sets derived from the same video, work together to reconstruct the sequence of the original video. The level of agreement between the participants on the image order, quantified using Kendall’s rank correlation coefficient, serves as a measure of common ground construction. This approach enables the analysis of how various modalities contribute to the construction of common ground. A corpus comprising 40 dialogues from 20 participants was collected and analyzed. The findings suggest that specific gestures play a significant role in fostering common ground, offering valuable insights for the development of dialogue systems that leverage multimodal information to enhance users’ construction of common ground.
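As a rough illustration of the agreement measure described above, the following sketch computes Kendall's rank correlation (tau-a, which assumes no ties) between the image orders proposed by two participants. The function name, the image identifiers, and the example orders are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def kendall_tau(order_a, order_b):
    """Kendall's tau-a between two orderings of the same unique items."""
    if len(order_a) != len(order_b) or set(order_a) != set(order_b):
        raise ValueError("both orders must contain the same items")
    if len(set(order_a)) != len(order_a):
        raise ValueError("items must be unique (tau-a assumes no ties)")
    if len(order_a) < 2:
        raise ValueError("need at least two items")

    # Position of each item in participant B's order.
    pos_b = {item: i for i, item in enumerate(order_b)}

    concordant = 0
    discordant = 0
    # combinations() yields (x, y) with x placed before y by participant A.
    for x, y in combinations(order_a, 2):
        if pos_b[x] < pos_b[y]:
            concordant += 1  # B agrees that x comes before y
        else:
            discordant += 1  # B places y before x

    n_pairs = len(order_a) * (len(order_a) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical image orders from two participants.
participant_a = ["img1", "img2", "img3", "img4"]
participant_b = ["img1", "img3", "img2", "img4"]
agreement = kendall_tau(participant_a, participant_b)
```

A value of 1 means the two orders are identical and -1 means one is the exact reverse of the other, so values closer to 1 would indicate stronger agreement on the reconstructed sequence.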
Transparent Barriers: Natural Language Access Control Policies for XR-Enhanced Everyday Objects, Kentaro Taninaka・Rahul Jain・Jingyu Shi・Kazunori Takashio・Karthik Ramani, CHI2025, April 2025
Extended Reality (XR)-enabled headsets, which overlay digital content onto the physical world, are gradually finding their way into our daily lives. This integration raises significant concerns about privacy and access control, especially in shared spaces where XR applications interact with everyday objects. Such issues remain subtle in the absence of widespread XR applications, and studies in shared spaces are required for smooth progress. This study evaluated a prototype system facilitating natural language policy creation for flexible, context-aware access control of personal objects. We assessed its usability, focusing on balancing precision and user effort in creating access control policies. Qualitative interviews and task-based interactions provided insights into users’ preferences and behaviors, informing future design directions. Findings revealed diverse user needs for controlling access to personal items in various situations, emphasizing the need for flexible, user-friendly access control in XR-enhanced shared spaces that respects boundaries and considers social contexts.
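Analysis of the Impact of Switching Pause Synchronization on Dialogue and Construction of a Decision Model, Yosuke Ujigawa・Kazunori Takashio, HAI Symposium 2025, February 2025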
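Each person has a unique mental tempo (personal tempo), and matching this tempo plays an important role in smooth dialogue with a system. This study focuses on tempo synchronization in dialogue, in particular the relationship between switching pauses and speech rate, and analyzes the role these play in effective communication. Using dynamic time warping, we clarified the impact of switching pause synchronization on dialogue. We further clarified the correlation between speech rate and switching pauses and built an utterance-timing decision model for spoken dialogue systems. These results contribute to the development of more natural and adaptive dialogue systems.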
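Dynamic time warping (DTW) aligns two sequences that may differ in length or local speed. As a minimal sketch, assuming each speaker's switching pauses are available as a list of durations in seconds, the classic DTW distance could be computed as below. The function name and the sample values are hypothetical, and the study's actual analysis pipeline is not described in the abstract.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Classic DTW distance with absolute difference as the local cost."""
    if not seq_a or not seq_b:
        raise ValueError("both sequences must be non-empty")

    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = minimal accumulated cost of aligning seq_a[:i] with seq_b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(seq_a[i - 1] - seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # seq_b[j-1] is matched again
                cost[i][j - 1],      # seq_a[i-1] is matched again
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# Hypothetical switching pauses (seconds) for the two speakers of one pair.
pauses_speaker_a = [0.5, 0.75, 1.0]
pauses_speaker_b = [0.5, 0.75, 0.75, 1.0]
distance = dtw_distance(pauses_speaker_a, pauses_speaker_b)
```

A smaller distance means the two pause sequences follow a more similar course over the dialogue, which is one possible way to operationalize synchronization.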
Face Robot Performing Interaction with Emphasis on Eye Blink Entrainment, Iimori, Masato・Furuya, Yuki・Takashio, Kazunori (Keio Univ), 2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), August 2023
Eyes play a significant role in human-human interaction, and blinking is particularly important as it can indicate a pause in the conversation and even lead to eye blink entrainment. However, most communication robots cannot reproduce eye blink movements due to cost constraints. Thus, our aim is to create a low-cost robot that can physically reproduce eye blink movements and induce eye blink entrainment. In this paper, we describe the implementation of the robot and evaluate the subjective impression of the robot’s eye blink movements. Our results suggest that the robot’s blinking behavior at pauses in the conversation facilitated the participants’ understanding of the robot’s speech. Our findings also suggest that simulating eye blink entrainment can increase the participant’s affinity and acceptance towards the robot in certain cases, but that affinity may be adversely affected if the blinking is not well designed.
Synchronization of Speech Rate to User’s Personal Tempo in Dialogue Systems and Its Effects, Yosuke Ujigawa・Kazunori Takashio (Keio Univ), 2024 IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), August 2024
Every individual lives daily life at their own unique tempo, called personal tempo. Tempo is also highly important in dialogue, and it is thought that matching tempo with a conversational partner leads to smoother communication and a higher level of comprehension. Spoken-dialogue systems are used in many situations, and personalizing dialogue on the basis of the user’s tempo is expected to make the system easier to speak to and to make people want to speak. Previous research has focused on methods for encouraging users to change their tempo to match that of their dialogue partner. However, a conversation at a tempo different from the user’s own can be stressful and burdensome for the user during the process of tuning in. Therefore, we define personal tempo as speech speed, which is the number of moras divided by the duration of speech, and propose a speech-speed control method for spoken-dialogue systems. We implemented our method in a spoken-dialogue system that synchronizes its speech with the user. We verified the effectiveness of the proposed method by analyzing its impact on the comprehension of speech and user impressions of the spoken-dialogue system. The results indicate that significant differences were obtained with the proposed method between impression and comprehension of the speech content.
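The definition above (speech speed = number of moras / duration of speech) can be sketched as follows. The mora counter is a simplification that assumes the utterance is already given as katakana only and treats each small kana (ャ, ュ, ョ and similar) as part of the preceding mora, while ッ, ン and ー each count as one mora. The function names and the ratio helper are hypothetical illustrations, not the paper's implementation.

```python
# Small kana that merge with the preceding kana into a single mora (e.g. キャ).
SMALL_KANA = set("ァィゥェォャュョヮ")


def count_moras(katakana):
    """Count moras in a katakana string (assumes no punctuation or kanji)."""
    return sum(
        1 for ch in katakana if ch not in SMALL_KANA and not ch.isspace()
    )


def speech_speed(katakana, duration_sec):
    """Speech speed in moras per second."""
    if duration_sec <= 0:
        raise ValueError("duration_sec must be positive")
    return count_moras(katakana) / duration_sec


def rate_ratio(user_speed, system_default_speed):
    """Factor by which a system could scale its default speed to match the user."""
    if system_default_speed <= 0:
        raise ValueError("system_default_speed must be positive")
    return user_speed / system_default_speed


# Hypothetical utterance ("kyou wa ii tenki desu") lasting 2 seconds.
user_speed = speech_speed("キョウワイイテンキデス", 2.0)
# Hypothetical system default of 8 moras per second.
scale = rate_ratio(user_speed, 8.0)
```

In a real system the katakana reading and the utterance duration would come from speech recognition and voice activity detection, and the resulting ratio would be passed to whatever rate parameter the speech synthesizer exposes.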
Promoting Speech While Living Alone with a Plant × AR Agent, 戸沢実・Kazunori Takashio (Keio Univ), IEICE Technical Report, vol. 124, no. 143, August 2024
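Since the pandemic, the decline in face-to-face communication and the rise in loneliness have become problems. Houseplants, displayed in rooms as a green amenity, are the most familiar plants and ones to which people readily become attached. Using Mixed Reality technology, this work contributes to reducing loneliness by promoting trust building and self-dialogue through interaction with plants. The agent encourages anthropomorphization and promotes self-care by increasing utterances through self-dialogue. Research that treats a houseplant as a conversational partner is expected to elicit positive emotions while preserving privacy.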
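Proposal of a FACS Generation Model Based on Utterance Content Using a Language Model, 小橋龍人・Yosuke Ujigawa・Kazunori Takashio (Keio Univ), IEICE Technical Report, vol. 124, no. 143, August 2024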
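This study proposes a model that generates facial expressions from utterance text. Previous studies have proposed methods for generating facial animation from speech audio, whereas this study focuses on generating facial expressions directly from text. The output uses Action Units (AUs) based on FACS, and the model is composed of a Transformer encoder only, without a decoder, with the aim of reducing computational cost and improving the extensibility of the model. In addition, training uses a sliding-window scheme, and generation proceeds token by token, which enables generation along the time series. For training, we collected videos published on the web and built a dataset by applying facial expression detection and transcription.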
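One way to picture the sliding-window scheme mentioned above is to cut a token sequence into fixed-size overlapping windows and pair each window with the AU vector aligned to its last token. This is only a sketch under assumptions: the window size, the stride, the one-AU-vector-per-token alignment, and all names are hypothetical, since the abstract does not specify them.

```python
def sliding_windows(tokens, au_vectors, window=4, stride=1):
    """Build (token window, target AU vector) training pairs.

    Assumes au_vectors[i] holds the AU intensities aligned with tokens[i].
    """
    if len(tokens) != len(au_vectors):
        raise ValueError("tokens and au_vectors must have the same length")
    if window < 1 or stride < 1:
        raise ValueError("window and stride must be positive")

    samples = []
    for start in range(0, len(tokens) - window + 1, stride):
        end = start + window
        # The target is the AU vector of the last token in the window,
        # so sliding the window forward yields one output per token.
        samples.append((tokens[start:end], au_vectors[end - 1]))
    return samples


# Hypothetical tokens with a single AU intensity per token.
tokens = ["a", "b", "c", "d", "e"]
au_vectors = [[0.0], [0.1], [0.2], [0.3], [0.4]]
pairs = sliding_windows(tokens, au_vectors, window=3)
```

Because each window produces one target, moving the window one token at a time gives a sequence of AU vectors ordered along the utterance.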