MedPath

Video Assisted Speech Technology to Enhance Motor Planning for Speech

Not Applicable
Completed
Conditions
Apraxia of Speech
Autism Spectrum Disorder
Interventions
Behavioral: Video Assisted Speech Therapy (VAST)
Registration Number
NCT04764539
Lead Sponsor
iTherapy, LLC
Brief Summary

Nearly 3.5 million Americans are diagnosed with Autism Spectrum Disorder (ASD), a communication disorder that limits skills in language acquisition, sensory integration, and behavior. This lack of functional language ability limits conversation to its most basic parts, making daily tasks difficult for minimally verbal to non-verbal individuals. iTherapy is developing the VAST platform, a personalized educational experience for students with ASD built on a virtual reality-based video-modeling program that stimulates engagement and speech production practice. The goal is to give those with ASD an opportunity to enhance their quality of life by increasing their speech abilities, enabling them to build social networks and handle the events of daily life.

Detailed Description

Autism Spectrum Disorder (ASD) is a neurodevelopmental communication disorder that affects over 3.5 million Americans and results in functional language and behavioral delays. These delays vary with the severity of ASD symptoms but often result in limited speech and increased communication challenges. Alongside linguistic acquisition, oral motor coordination is a crucial part of speech production.

Current clinical techniques have shown varying degrees of efficacy in improving functional language proficiency. Most techniques follow a drill-like procedure, where the child is made to repeat various sounds and phrases until they are retained. However, such a process can require more than twenty therapy sessions to show improvement, and that improvement may be limited to one aspect of speech. This significantly limits the linguistic and social skills a student will acquire. To improve the efficacy of these therapy sessions, new technology must be developed to provide the most effective educational experience.

Video-assisted speech technology (VAST) is a method in which the individual watches a close-up video of a model's mouth and speaks simultaneously with it. Rather than present the individual with a static photograph of the initial phoneme, the entire sequence of oral movements can be presented sequentially via video-recorded segments of the orofacial area producing connected speech. This approach combines best practices, video modeling, and literacy with auditory cues to provide unprecedented support for the development of vocabulary, word combinations, and communication.

In this SBIR Phase I proposal, iTherapy will develop a personalized educational experience for students with ASD by creating a virtual reality (VR) based VAST program to stimulate engagement and speech production practice. VR offers several benefits as a therapy technique: overcoming sensory difficulties, more effectively generalizing information, employing visual learning, and providing individualized treatment. As a user moves through the stages of the program, they will be immersed in a proactive environment where they will engross themselves with continuous content.

Rather than present the individual with a static photograph of the initial phoneme, the entire sequence of oral movements can be presented sequentially via VR-modelled segments of the orofacial area producing connected speech. This approach combines best practices, video modeling, music therapy, and literacy with auditory cues to provide unprecedented support for the development of vocabulary, word combinations, and communication. The innovation will be a video series of a realistic VR mouth, which will require the use of an app on a tablet or a smartphone, VR goggles, and bone conduction headphones.

Recruitment & Eligibility

Status
COMPLETED
Sex
All
Target Recruitment
6
Inclusion Criteria
  • Nonverbal to minimally verbal children (0-5 words)
  • Diagnosis of Autism Spectrum Disorder
Exclusion Criteria
  • History of seizures (participants using VR goggles must have no history of seizures)

Study & Design

Study Type
INTERVENTIONAL
Study Design
PARALLEL
Arm & Interventions

Group: Stimuli administered via 2D format on an iPad Pro
Intervention: Video Assisted Speech Therapy (VAST)
Description: Participants were given the Video-Assisted Speech Therapy (VAST) video-modeling stimuli in a 2D format (iPad Pro). Three children with ASD, between the ages of 4 and 8, participated in a 14-session study that utilized the tablet-based VAST application. Sessions were held twice a week, each lasting approximately 15 minutes (+/- 5 minutes).

Group: Stimuli administered in 3D format via VR goggles and bone conduction headphones
Intervention: Video Assisted Speech Therapy (VAST)
Description: Participants were given the Video-Assisted Speech Therapy (VAST) video-modeling stimuli in a VR format paired with a custom 3D-printed VR headset. Three children with ASD, between the ages of 4 and 8, participated in a 14-session study that utilized a 3D VR-integrated VAST prototype with bone conduction audio. Sessions were held twice a week, each lasting approximately 15 minutes (+/- 5 minutes).
Primary Outcome Measures
Name: Change in Articulation Accuracy
Time: Seven weeks. Each subject participated in the study twice a week over a 7-week period for a total of 14 sessions. The first and last sessions (session #1 and session #14) were reserved for pre-test and post-test language sample collection and assessment.
Method: Change in % of correct phonemes in each attempted stimulus

Name: Change in Mean Length of Utterance (MLU)
Time: Seven weeks. Each subject participated in the study twice a week over a 7-week period for a total of 14 sessions. The first and last sessions (session #1 and session #14) were reserved for pre-test and post-test language sample collection and assessment.
Method: Participants (aged 4 to 8 years) were given a pre- and post-test 15-minute language sample. MLU was calculated for both tests, and the gain from pre-test to post-test was compared.

NOTE: This measure is calculated based on a change in the number of morphemes per utterance during pre-test and post-test language samples. During a five-minute period, two licensed speech-language pathologists (SLPs) observed a parent interacting and talking with their child. Both SLPs transcribed the subjects' speech and calculated a mean length of utterance (MLU) for each subject. MLU was calculated by determining how many bound and free morphemes were included within every spoken utterance produced by a subject. The total number of morphemes produced within the 5-minute period was then divided by the total number of utterances, which produced the MLU for each subject. This procedure was used for determining MLU in both the pre- and post-testing procedures.
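
The MLU arithmetic described in the note can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `mean_length_of_utterance` and the morpheme counts are hypothetical and are not taken from the study data, and the morpheme counting itself (done by the SLPs) is assumed to have already happened.

```python
def mean_length_of_utterance(morphemes_per_utterance):
    """Return total morphemes divided by the number of utterances.

    `morphemes_per_utterance` holds one count (bound + free morphemes)
    per transcribed utterance. Hypothetical helper, not study code.
    """
    if not morphemes_per_utterance:
        raise ValueError("at least one utterance is required")
    return sum(morphemes_per_utterance) / len(morphemes_per_utterance)


# Made-up counts for illustration only (not study data).
pre_counts = [1, 1, 2, 1]    # 5 morphemes over 4 utterances
post_counts = [2, 1, 3, 2]   # 8 morphemes over 4 utterances

pre_mlu = mean_length_of_utterance(pre_counts)    # 1.25
post_mlu = mean_length_of_utterance(post_counts)  # 2.0
mlu_gain = post_mlu - pre_mlu                     # 0.75
```

A positive `mlu_gain` corresponds to longer utterances, on average, at post-test than at pre-test.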

Name: Change in Percentage of Correctly Transcribed Words Using Automatic Speech Recognition
Time: Seven weeks. Each subject participated in the study twice a week over a 7-week period for a total of 14 sessions. The first and last sessions (session #1 and session #14) were reserved for pre-test and post-test language sample collection and assessment.
Method: 15-minute pre- and post-testing was performed using speech recognition software and transcribed by a licensed speech pathologist. Differences pre- and post-intervention were compared across groups and within groups.

NOTE: During our assessment, we used Google's native closed captioning function (a tool which uses machine learning to recognize and transcribe speech) and a third-party app, Tactiq Pins, which allows users to keep a transcript of all speaker utterances during a call. We compared our video to the Tactiq Pin transcripts in order to measure any change in the amount of accurately transcribed spoken words between pre-test and post-test language samples. Specific transcription results for each group can be found in the data tables provided.
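
The record does not state how a word was scored as "accurately transcribed", so the following Python sketch is an assumption rather than the study's procedure. It aligns a clinician's reference transcript with an automatic transcript using the standard-library `difflib.SequenceMatcher` and reports the percentage of reference words that were matched. The function name `percent_words_correct` and the sample sentences are hypothetical, and no captioning or Tactiq interface is called.

```python
from difflib import SequenceMatcher


def percent_words_correct(reference, asr_output):
    """Percentage of reference words that the ASR transcript matched.

    Assumed scoring rule: case-insensitive, in-order word alignment.
    """
    ref_words = reference.lower().split()
    asr_words = asr_output.lower().split()
    if not ref_words:
        raise ValueError("reference transcript must contain words")
    matcher = SequenceMatcher(a=ref_words, b=asr_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * matched / len(ref_words)


# Made-up transcripts for illustration only (not study data).
reference = "i want more juice"
pre_score = percent_words_correct(reference, "i one more shoes")    # 50.0
post_score = percent_words_correct(reference, "i want more shoes")  # 75.0
change = post_score - pre_score                                     # 25.0
```

Other scoring rules, such as word error rate, would penalize inserted words as well. This sketch only counts reference words that were recovered.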

Secondary Outcome Measures
Name: Change in Type-Token Ratios
Time: Seven weeks. Each subject participated in the study twice a week over a 7-week period for a total of 14 sessions. The first and last sessions (session #1 and session #14) were reserved for pre-test and post-test language sample collection and assessment.
Method: A type-token ratio is the number of unique words (types) divided by the total number of words (tokens) in a given segment of language.
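
As a minimal sketch of the standard type-token ratio (unique words divided by total words), assuming simple whitespace tokenization and case-insensitive matching; the function name `type_token_ratio` and the sample utterance are hypothetical and not study data:

```python
def type_token_ratio(words):
    """Return unique words (types) divided by total words (tokens)."""
    tokens = [word.lower() for word in words]
    if not tokens:
        raise ValueError("at least one word is required")
    return len(set(tokens)) / len(tokens)


# Made-up sample: 3 types ("more", "juice", "please") over 5 tokens.
sample = "more juice more juice please".split()
ttr = type_token_ratio(sample)  # 0.6
```

A ratio closer to 1 indicates more varied vocabulary within the sample.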

Name: Parent Perceptions of Communication Changes Resulting From Study Participation
Time: Seven weeks. Each subject participated in the study twice a week over a 7-week period for a total of 14 sessions. The first and last sessions (session #1 and session #14) were reserved for pre-test and post-test language sample collection and assessment.
Method: Parent observations: perceptions of changes in their children's motor-speech, behavioral, and social communication skills after having participated in the study.

Scale title: Net Positive Changes Score
Maximum possible value: 18
Minimum possible value: -2
Higher score is better.

Name: Increase in Response Rate to Treatment Stimuli
Time: Seven weeks. Each subject participated in the study twice a week over a 7-week period for a total of 14 sessions. The first and last sessions (session #1 and session #14) were reserved for pre-test and post-test language sample collection and assessment.
Method: The change in response rate measures any significant differences, between the iPad Pro and VR goggles groups, in how often children responded to pre- and post-testing stimuli after having received treatment. A response is defined as a verbal or non-verbal reaction (e.g., eye contact, gestures, vocalizations) to the stimuli presented during the therapy sessions. Higher response rates indicate better engagement and responsiveness to the treatment. The change in response rate is calculated as the value at the post-test time point minus the value at the pre-test time point, with positive numbers representing increases and negative numbers representing decreases in response rate.
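
The post-minus-pre calculation can be illustrated with a short Python sketch. The record does not define the denominator of the response rate, so treating it as responses divided by stimuli presented is an assumption; the function names and the counts below are hypothetical and not study data.

```python
def response_rate(responses, stimuli_presented):
    """Assumed definition: fraction of presented stimuli that drew a response."""
    if stimuli_presented <= 0:
        raise ValueError("stimuli_presented must be positive")
    return responses / stimuli_presented


def change_in_response_rate(pre_rate, post_rate):
    """Post-test value minus pre-test value; positive means an increase."""
    return post_rate - pre_rate


# Made-up counts for illustration only.
pre_rate = response_rate(2, 8)    # 0.25
post_rate = response_rate(4, 8)   # 0.5
change = change_in_response_rate(pre_rate, post_rate)  # 0.25
```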

Trial Locations

Locations (1)

All research was conducted via tele-research due to COVID-19

🇺🇸

Vallejo, California, United States
