Studies on childhood dysphonia have revealed considerable rates of voice disorders in children aged 4 to 12. The sustained vowel exercise is widely used in the vocal (re)education process. However, this exercise can become tedious after a short period of practice. Here, we propose a novel dynamic difficulty adjustment model to be used in a serious game for the sustained vowel exercise, to motivate children to practice this exercise often. The model automatically adapts the difficulty of the challenges in response to the child’s performance. The model is not exclusive to this game and can be used in other games for dysphonia treatment. To measure the child’s performance, the model uses parameters that are relevant to the therapy. The proposed model is based on the flow model, so that it balances the difficulty of the challenges with the child’s skills.
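As a rough illustration of the idea of balancing challenge against skill, the sketch below raises or lowers a target phonation time according to the child's recent success rate. It is not the model from the paper: the class name, the thresholds, the step size, and the use of success rate as the only performance parameter are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class DifficultyAdjuster:
    """Hypothetical flow-style difficulty controller (illustrative only)."""

    target_seconds: float = 3.0  # assumed starting target for sustained phonation
    step: float = 0.5            # assumed adjustment step, in seconds
    min_target: float = 1.0      # assumed lower bound for the target
    max_target: float = 15.0     # assumed upper bound for the target
    low: float = 0.5             # success rate below this: challenge too hard
    high: float = 0.8            # success rate above this: challenge too easy

    def update(self, attempts: list) -> float:
        """Adjust the target from a batch of True/False outcomes and return it."""
        if not attempts:
            return self.target_seconds
        rate = sum(attempts) / len(attempts)
        if rate > self.high:
            # The child succeeds almost always: increase the challenge.
            self.target_seconds = min(self.max_target, self.target_seconds + self.step)
        elif rate < self.low:
            # The child fails most attempts: decrease the challenge.
            self.target_seconds = max(self.min_target, self.target_seconds - self.step)
        return self.target_seconds


adjuster = DifficultyAdjuster()
print(adjuster.update([True, True, True, True, True]))
```

Keeping the success rate inside a band, rather than chasing a single value, is one simple way to avoid oscillating between difficulty levels after every attempt.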
The distortion of sibilant sounds is a common type of speech sound disorder in European Portuguese-speaking children. Speech and language pathologists (SLPs) use different types of speech production tasks to assess these distortions. One of these tasks consists of the sustained production of isolated sibilants. Using these sound productions, SLPs usually rely on auditory perceptual evaluation to assess the sibilant distortions. Here we propose an isolated sibilant machine learning model to help SLPs assess these distortions. Our model uses Mel frequency cepstral coefficients of the isolated sibilant phones from 145 children, and was trained using support vector machines. The analysis of the false negatives detected by the model can give insight into whether the child has a sibilant production distortion. We were able to confirm that there is a relation between the model's classification results and the distortion assessment of professional SLPs. Approximately 66% of the distortion cases identified by the model are confirmed by an SLP as having some sort of distortion or are perceived as being the production of a different sound.
The distortion of sibilant sounds is a common type of speech sound disorder (SSD) in Portuguese-speaking children. Speech and language pathologists (SLPs) frequently use the isolated sibilants exercise to assess and treat this type of speech error. While technological solutions like serious games can help SLPs motivate children to do the exercises repeatedly, there is a lack of such games for this specific exercise. Another important aspect is that, given the usual small number of therapy sessions per week, children are not improving at their maximum rate, which is only achieved with more intensive therapy. We propose a serious game for mobile platforms that allows children to practice their isolated sibilants exercises at home to correct sibilant distortions. This will allow children to practice their exercises more frequently, which can lead to faster improvements. The game, which uses an automatic speech recognition (ASR) system to classify the child’s sibilant productions, is controlled by the child’s voice in real time and gives the child immediate visual feedback about their sibilant productions. To keep the computation on the mobile platform as simple as possible, the game has a client-server architecture, in which the external server runs the ASR system. We trained the ASR system using raw Mel frequency cepstral coefficients and support vector machines, and achieved a test accuracy above 91%.
By combining visual feedback and motivational elements, a computer-based speech therapy system can offer new approaches with various advantages over traditional speech therapy techniques. Through visual feedback and the adaptation of traditional speech sound exercises, it is possible to create an engaging environment with motivation-focused elements. These elements can be used in an interactive environment that motivates the therapy attendee towards better performance. Here we present an interactive gamified environment for speech therapy that combines visual feedback and motivational components. The results from a survey and a usability study suggest that children can show more interest in speech therapy sessions when the proposed environment is used.
3D video is introducing great changes in many health-related areas. The realism of such information provides health professionals with strong evidence analysis tools to facilitate clinical decision processes. Speech and language therapy aims to help subjects correct several disorders. The assessment of the patient by the speech and language therapist (SLT) requires several visual and audio analysis procedures that can interfere with the patient’s production of speech. In this context, the main contribution of this paper is a 3D video system to improve health information management processes in speech and language therapy. The 3D video retrieval and management system supports multimodal health records and provides SLTs with tools to support their work in many ways: (i) it allows SLTs to easily maintain a database of patients’ orofacial and speech exercises; (ii) it supports three-dimensional orofacial measurement and analysis in a non-intrusive way; and (iii) it supports searching patient speech exercises by similar facial characteristics, using facial image analysis techniques. The second contribution is a dataset with 3D videos of patients performing orofacial speech exercises. The whole system was evaluated successfully in a user study involving 22 SLTs. The user study illustrated the importance of retrieval by similar orofacial speech exercise.
The aim of this study is to verify whether the measurements resulting from facial anthropometric assessment in adults, using a caliper, are reproducible and repeatable. Methods: Four adults underwent direct facial anthropometric assessment (eight measurements) using a caliper. The assessment took place at two time points, 42 days apart, with nine examiners at the first time point and 16 at the second. Inter-examiner reliability (reproducibility) was determined with Cronbach's alpha, and intra-examiner reliability (repeatability) with Spearman's rho correlation coefficient. Results: Inter-examiner reliability (reproducibility) is reasonable (α=0.7-0.8) for 78% and 93.7% of the measurements at the first and second time points, respectively. Intra-examiner reliability (repeatability) is not statistically significant for any measurement except the middle third of the face (rs=0.83, p<0.05). Conclusion: Facial anthropometry with a digital caliper is a technique with reasonable reproducibility, but the repeatability of its use was not robust in the present study.
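For readers unfamiliar with Cronbach's alpha ("Alfa de Cronbach"), the sketch below computes it from a table of ratings using the standard formula alpha = k/(k-1) * (1 - sum of per-examiner variances / variance of the per-subject totals), where k is the number of examiners. The function name and the toy data are assumptions for illustration; they are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for ratings[i][j] = value given by examiner j to subject i.

    Uses sample variances (n - 1 denominator) throughout.
    """
    n_subjects = len(ratings)
    k = len(ratings[0])
    if n_subjects < 2 or k < 2:
        raise ValueError("need at least two subjects and two examiners")
    # Variance of each examiner's column across subjects.
    item_vars = [variance([row[j] for row in ratings]) for j in range(k)]
    # Variance of the per-subject sums across examiners.
    total_var = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data (invented): three subjects measured by two examiners.
toy = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(toy), 3))
```

When all examiners give identical values for every subject, the per-subject totals carry all the variance and alpha reaches 1.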
Speech is the main form of human communication. Thus, it is important to detect and treat speech sound disorders as early as possible during childhood. When children need to attend speech therapy, it is critical to keep them motivated to do the therapy exercises. Software systems for speech therapy can be a useful tool to keep the child interested in continuing to practice the therapy exercises. Several software systems have been developed to assist speech and language therapists during therapy sessions. However, most software focuses on articulation disorders, while voice disorders have been mostly neglected. Here we propose a voice-controlled serious computer game for the sustained vowel exercise, which is commonly used in speech therapy to treat voice disorders. The main novelty of this application is the combination of real-time speech processing with the gamification of the speech therapy exercises and the parameterization of the difficulty level.
Traditional speech therapy approaches for speech sound disorders have much to gain from computer-based therapy systems. In this paper, we propose a robust phoneme recognition solution for an interactive environment for speech therapy. With speech recognition techniques, the motivational elements of computer-based therapy systems can be automated to obtain an interactive environment that motivates the therapy attendee towards better performance. The contribution of this paper is a robust phoneme recognizer that controls the feedback provided to the patient during a speech therapy session. We compare the results of hierarchical and flat classification, with naive Bayes, support vector machines and kernel density estimation, on linear predictive coding coefficients and Mel-frequency cepstral coefficients.
Speech therapy is essential to help children with speech sound disorders. While some computer tools for speech therapy have been proposed, most focus on articulation disorders. Another important aspect of speech therapy is voice quality, but little research has addressed it. As a contribution to fill this gap, we propose a robust scoring model for voice exercises often used in speech therapy sessions, namely the sustained vowel and the increasing/decreasing pitch variation exercises. The models are learned with a support vector machine and double cross validation, and achieve accuracies from approximately 73.98% to 85.93% while showing a low rate of false negatives. The learned models allow classifying the children's answers on the exercises, thus providing them with real-time feedback on their performance.
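Assuming that "double cross validation" here refers to what is often called nested cross-validation, the sketch below shows the general structure: an inner loop selects a hyperparameter using only the outer training data, and the outer loop estimates accuracy on data never used for that selection. The function names and the caller-supplied `fit_score` callback are hypothetical; no classifier is included, and this is not the authors' pipeline.

```python
def k_fold(indices, k):
    """Yield k (train, test) index pairs with near-equal, disjoint test folds."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, folds[i]


def nested_cv(n_samples, outer_k, inner_k, params, fit_score):
    """Return one held-out score per outer fold.

    fit_score(train_idx, test_idx, param) -> float is supplied by the caller
    and is expected to train on train_idx and evaluate on test_idx.
    """
    scores = []
    for outer_train, outer_test in k_fold(list(range(n_samples)), outer_k):

        def inner_mean(param):
            # Average inner-fold score, computed only on outer-training data.
            vals = [fit_score(tr, te, param) for tr, te in k_fold(outer_train, inner_k)]
            return sum(vals) / len(vals)

        best = max(params, key=inner_mean)
        # Evaluate the selected parameter on the untouched outer test fold.
        scores.append(fit_score(outer_train, outer_test, best))
    return scores
```

A caller would plug in a `fit_score` that trains, for example, a support vector machine with the given parameter and returns its accuracy on the test indices.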
Thermal imaging is a type of imaging that uses thermographic cameras to detect radiation in the infrared range of the electromagnetic spectrum. Thermal images are particularly well suited for face detection and recognition because of their low sensitivity to illumination changes, skin color, beards and other artifacts. In this paper, we take a fresh look at the problem of face analysis in the thermal domain. We consider several thermal image descriptors and assess their performance in two popular tasks: face recognition and facial expression recognition. The results show that face recognition can reach accuracy levels of 91% with Local Binary Patterns. Also, despite the difficulty of facial expression detection, our experiments reveal that Haar-based features (FCTH - Fuzzy Color and Texture Histogram) offer the best results for some facial expressions.
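The Local Binary Patterns (LBP) descriptor mentioned in the thermal-imaging abstract above can be illustrated with the basic 8-neighbour variant: each pixel is compared with its eight neighbours, the comparisons form an 8-bit code, and the histogram of codes describes the image. The sketch below is a minimal pure-Python version; the neighbour ordering and the `>=` comparison are one common convention, and the paper may use a different variant.

```python
def lbp_code(img, r, c):
    """8-neighbour LBP code for the interior pixel (r, c) of a 2D list."""
    center = img[r][c]
    # Neighbours in clockwise order, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # Set the bit when the neighbour is at least as bright as the center.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

Because each code depends only on whether neighbours are brighter or darker than the center, the descriptor is unchanged by a uniform brightness shift of the whole image.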
This paper proposes a novel approach to include biofeedback in speech and language therapy by providing the patient with visual self-monitoring of his/her performance combined with a reward mechanism in an entertainment environment. We propose a toolset that includes an in-session interactive environment to be used during the therapy sessions. This in-session environment provides instantaneous biofeedback and assists the therapist during the session with rewards for the patient’s good performance. It also allows making audiovisual recordings and annotations of the session for later analysis. The toolset also provides an off-line multimedia application for post-session analysis, where the session's audiovisual recordings can be examined through browsing, searching, and visualization techniques to plan the next session.
- M. Lopes, J. Magalhães, S. Cavaco, A voice therapy serious game with difficulty level adaptation, ACM WomENcourage, 2017.
- BioVisualSpeech - serious games for speech therapy sessions and intensive training, CMU-Portugal Symposium, 2017.
- BioVisualSpeech - NovaSpeech, CMU-Portugal Symposium, 2017.
- Carla Viegas, Multimodal Analysis of the Interaction between Motor Speech Disorders and Expressed Emotions Using Machine Learning Techniques, CMU-Portugal Symposium, 2017.
- Carla Viegas, BioVisualSpeech - a multimodal framework to support speech therapy, Innovation Research Lab Exhibition, Medical Valley Center Erlangen, July 2016.
- Vanessa Lopes, A Computer-Based Therapy Game with a Dynamic Difficulty Adjustment Model for Childhood Dysphonia, M.Sc. Dissertation, FCT.UNL, 2018.
- Inês Mestre, Sibilantes e motricidade orofacial em crianças portuguesas dos 5;00 aos 9;11 anos de idade: Estudo preliminar, M.Sc. Dissertation, ESSA, 2017.
- Susana Miguel, Protocolo de avaliação da motricidade orofacial revisto: Aplicabilidade, sensibilidade e fidedignidade, M.Sc. Dissertation, ESSA, 2017.
- Ivo Anjos, Serious mobile game with sibilant consonant exercises for speech therapy, M.Sc. Dissertation, FCT.UNL, 2017.
- Marta Lopes, Jogo Sério da Vogal Sustentada com Dificuldade Adaptativa para Terapia da Fala, M.Sc. Dissertation, FCT.UNL, 2017.
- Mariana Diogo, Robust scoring and serious games of voice exercises for speech therapy, M.Sc. Dissertation, FCT.UNL, 2016.
- Ricardo Carrapiço, A 3D Video Decision Support Tool for Speech and Language Therapy, M.Sc. Dissertation, FCT.UNL, 2016.
- Cátia Pedrosa, Contributo para o estudo da fidedignidade de duas técnicas de antropometria facial: Paquímetro e fotogrametria, M.Sc. Dissertation, ESSA, 2016.
- Ana Filipa Raimundo, Protocolo de avaliação da motricidade orofacial: Revisão e caraterísticas psicométricas, M.Sc. Dissertation, ESSA, 2016.
- André Grossinho, Visual Speech an interactive platform for speech therapy, M.Sc. Dissertation, FCT.UNL, 2015.
- Hugo Cardoso, A speech therapy game, (report about serious games for childhood apraxia of speech), IST.
- Carla Viegas, Multimodal Analysis of the Interaction between Motor Speech Disorders and Expressed Emotions Using Machine Learning Techniques (Ph.D. proposal), FCT.UNL.
- Pedro Ferreira, Automatic sound analysis to improve speech and language therapy, (report on the analysis of diadochokinetics, master dissertation proposal), FCT.UNL.
- Ivo Anjos, Serious mobile games with fricative consonant exercises for speech therapy, (master dissertation proposal), FCT.UNL.