
Publications
Comparison of Crowdsourced and Remote Subjective User Studies: A Case Study of Investigative Child Interviews
Crowdsourced and remote user studies have recently gained popularity as alternatives to traditional laboratory studies. However, they are subject to unreliability, and it is challenging to ensure that valid results are collected, especially when conducting user studies with experts. Experts are a scarce resource, usually having busy schedules and heavy workloads, and are not necessarily geographically close. They are therefore often unwilling to participate in studies which require physical attendance. In this paper, we compare three alternative methods: crowdsourced user study with non-experts, remote user study with non-experts, and remote user study with domain experts, for a use case involving investigative child interview training. We present the results from three subjective studies about the perception of AI-generated child avatars, which are developed using various technologies such as dialogue models, a game engine, text-to-speech and speech-to-text components. The study was conducted with three different user groups, and our results indicate the importance of using best practice measures for ensuring the collection of reliable results in crowdsourced settings as compared to remote studies, and highlight the difference between the perspectives of domain experts and non-experts.
Multimodal Virtual Avatars for Investigative Interviews with Children
In this article, we present our ongoing work in the field of training police officers who conduct interviews with abused children. The objectives in this context are to protect vulnerable children from abuse, facilitate prosecution of offenders, and ensure that innocent adults are not accused of criminal acts. There is therefore a need for more data that can be used for improved interviewer training to equip police with the skills to conduct high-quality interviews. To support this important task, we propose to research a training program that utilizes different system components and multimodal data from the field of artificial intelligence such as chatbots, generation of visual content, text-to-speech, and speech-to-text. This program will be able to generate an almost unlimited amount of interview and training data. The goal of combining all these different technologies and data types is to create an immersive and interactive child avatar that responds in a realistic way to support the training of police interviewers, and that can also produce synthetic data of interview situations that can be used to solve different problems in the same domain.
Towards an AI-driven talking avatar in virtual reality for investigative interviews of children
Artificial intelligence (AI) and gaming systems have advanced to the stage where the current models and technologies can be used to address real-world problems. The development of such systems comes with different challenges, most of them related to system performance, complexity, and user testing. Using a virtual reality (VR) environment, we have designed and developed a game-like system that aims to mimic an abused child and can assist police and child protection service (CPS) personnel in training for interviews of maltreated children. Current research in this area points to the poor quality of conducted interviews, and emphasises the need for better training methods. Information obtained in these interviews is the core piece of evidence in the prosecution process. We utilised advanced dialogue models, talking visual avatars, and VR to build a virtual child avatar that can interact with users. We discuss our proposed architecture and the performance of the developed child avatar prototype, and we present the results from the user study conducted with CPS personnel. The user study investigates the users' perceived quality of experience (QoE) and their learning effects. Our study confirms that such a gaming system can increase the knowledge and skills of the users. We also benchmark and discuss the system performance aspects of the child avatar. Our results show that the proposed prototype works well in practice and is well received by the interview experts.
An overview of mock interviews as a training tool for interviewers of children
Mock (simulated) interviews can be used as a safe context for trainee interviewers to learn and practice questioning skills. When mock interviews are designed to reflect the body of scientific evidence on how questioning skills are best learned, research has demonstrated that interviewers acquire relevant and enduring skills. Despite the importance of this exercise in learning interview skill and its prevalence as a learning tool in other fields such as medicine and allied health, there has been relatively little discussion about mock interviews from an educational perspective in investigative interview training. This paper addresses that gap by providing the first comprehensive overview of the way mock interviews have been used in training interviewers of children. We describe the research that supports their utility, and the various ways they can be implemented in training: providing insight to learners; allowing opportunities for practice, feedback, and discussion; and as a standardized way to assess skill change over time. The paper also includes an overview of the cutting-edge use of avatars in mock interviews to enhance efficiency, provide unique learning experiences, and ultimately reduce training costs. We explain why avatars may be particularly useful in basic training, freeing up human trainers to facilitate mock interviews around advanced topics and discussion.
Synthesizing a Talking Child Avatar to Train Interviewers Working with Maltreated Children
When responding to allegations of child sexual, physical, and psychological abuse, Child Protection Service (CPS) workers and police personnel need to elicit detailed and accurate accounts of the abuse to assist in decision-making and prosecution. Current research emphasizes the importance of the interviewer’s ability to follow empirically based guidelines. In doing so, it is essential to implement economical and scientific training courses for interviewers. Due to recent advances in artificial intelligence, we propose to generate a realistic and interactive child avatar, aiming to mimic a child. Our ongoing research involves the integration and interaction of different components with each other, including how to handle the language, auditory, emotional, and visual components of the avatar. This paper presents three subjective studies that investigate and compare various state-of-the-art methods for implementing multiple aspects of the child avatar. The first user study evaluates the whole system and shows that the system is well received by the expert and highlights the importance of its realism. The second user study investigates the emotional component and how it can be integrated with video and audio, and the third user study investigates realism in the auditory and visual components of the avatar created by different methods. The insights and feedback from these studies have contributed to the refined and improved architecture of the child avatar system which we present here.
Is More Realistic Better? A Comparison of Game Engine and GAN-based Avatars for Investigative Interviews of Children
The success of investigative interviews with maltreated children is often defined by the interviewer's ability to elicit a reliable and coherent account of the alleged incident from the child. Research shows that a child avatar mimicking a maltreated child can improve interviewers' performance in conducting these interviews. The realism of such a child avatar is considered one of the most critical factors. Based on this, the current study aims to generate realistic child avatars in real time that utilize multimodal data and different components from artificial intelligence. This paper discusses the subjective findings of a study of two types of child avatar videos: animated avatars created using the Unity game engine and photorealistic talking-head avatars created using generative adversarial networks (GANs). The results show that although the state-of-the-art GAN-generated avatars are significantly more realistic, they do not necessarily create a better experience, as most of the participants prefer talking to animated avatars.