3D video is introducing great changes in many health related ar- eas. The realism of such information provides health professionals with strong evidence analysis tools to facilitate clinical decision processes. Speech and language therapy aims to help subjects in correcting several disorders. The assessment of the patient by the speech and language therapist (SLT), requires several visual and au- dio analysis procedures that can interfere with the patient’s produc- tion of speech. In this context, the main contribution of this paper is a 3D video system to improve health information management processes in speech and language therapy. The 3D video retrieval and management system supports multimodal health records and provides the SLTs with tools to support their work in many ways: (i) it allows SLTs to easily maintain a database of patients’ orofa- cial and speech exercises; (ii) supports three-dimensional orofacial measurement and analysis in a non-intrusive way; and (iii) search patient speech-exercises by similar facial characteristics, using fa- cial image analysis techniques. The second contribution is a dataset with 3D videos of patients performing orofacial speech exercises. The whole system was evaluated successfully in a user study in- volving 22 SLTs. The user study illustrated the importance of the retrieval by similar orofacial speech exercise.
Speech is the main form of human communication. Thus it is important to detect and treat speech sound disorders as early as possible during childhood. When children need to attend speech therapy it is critical to keep them motivated on doing the therapy exercises. Software systems for speech therapy can be a useful tool to keep the child interested in keep practicing the therapy exercises. Several software systems have been developed to assist speech and language therapists during the therapy sessions. However most software focus on articulation disorders while voice disorders have been mostly neglected. Here we propose a voice-controlled serious computer game for the sustained vowel exercise, which is an exercise commonly used in speech therapy to treat voice disorders. The main novelty of this application is the combination of real time speech processing, with the gamification of the speech therapy exercises and the parameterization of the difficulty level.
Traditional speech therapy approaches for speech sound disorders have a lot of advantages to gain from computer-based therapy systems. In this paper, we propose a robust phoneme recognition solution for an interactive environment for speech therapy. With speech recognition techniques the motivation elements of computer-based therapy systems can be automated in order to get an interactive environment that motivates the therapy attendee towards better performances. The contribution of this paper is a robust phoneme recognition to control the feedback provided to the patient during a speech therapy session. We compare the results of hierarchical and flat classification, with naive Bayes, support vector machines and kernel density estimation on linear predictive coding coefficients and Mel-frequency cepstral coefficients.
Speech therapy is essential to help children with speech sound disorders. While some computer tools for speech ther- apy have been proposed, most focus on articulation disorders. Another important aspect of speech therapy is voice quality but not much research has been developed on this issue. As a contribution to fill this gap, we propose a robust scoring model for voice exercises often used in speech ther- apy sessions, namely the sustained vowel and the increas- ing/decreasing pitch variation exercises. The models are learned with a support vector machine and double cross vali- dation, and obtained approximately from 73.98% to 85.93% accuracies while showing a low rate of false negatives. The learned models allow classifying the children’s answers on the exercises, thus providing them with real-time feedback on their performance.
Thermal imaging is a type of imaging that uses thermographic cameras to detect radiation in the infrared range of the electromagnetic spectrum. Thermal images are particularly well suited for face detection and recognition because of the low sensitivity to illumination changes, color skins, beards and other artifacts. In this paper, we take a fresh look at the problem of face analysis in the thermal domain. We consider several thermal image descriptors and assess their performance in two popular tasks: face recognition and facial expression recognition. The results have shown that face recognition can reach accuracy levels of 91% with Localized Binary Patterns. Also, despite the difficulty of facial expression detection, our experiments have revealed that Haar based features (FCTH - Fuzzy Color and Texture Histogram) offers the best results for some facial expressions
This paper proposes a novel approach to include biofeedback in speech and language therapy by providing the patient with a visual self-monitoring of his/her performance combined with a reward mechanism in an entertainment environment. We propose a toolset that includes an in-session interactive environment to be used during the therapy sessions. This insession environment provides instantaneous biofeedback and assists the therapist during the session with rewards for the patient’s good performance. It also allows to make audiovisual recordings and annotations of the session for later analysis. The toolset also provides an off-line multimedia application for post-session analysis where the session audio-visual recordings can be examined through browsing, searching, and visualization techniques to plan the future session.