The DesPho-APaDy Project:
Developing an acoustic-phonetic characterization of dysarthric speech in French.
C. Fougeron1, L. Crevier-Buchman1, C. Fredouille2, A. Ghio3, C. Meunier3, C. Chevrie-Muller4,
N. Audibert2, J.-F. Bonastre2, A. Colazo Simon1, C. Delooze3, D. Duez3, C. Gendrot1, T. Legou3,
N. Levèque1, C. Pillot-Loiseau1, S. Pinto3, G. Pouchoulin2, D. Robert3, J. Vaissiere1, F. Viallet3,
C. Vincent1.
1 Lab. de Phonétique et Phonologie, UMR 7018 CNRS-Paris3/Sorbonne Nouvelle, Paris, France
2 University of Avignon, CERI/LIA, Avignon, France
3 Lab. Parole et Langage, UMR 6057 CNRS Aix-Marseille Univ., Aix-en-Provence, France
4 Lab. MoDyCo, UMR 7114, CNRS-Université Paris 10, Paris, France
E-mail: cecile.fougeron@univ-paris3.fr, corinne.fredouille@univ-avignon.fr, alain.ghio@lpl-aix.fr

Abstract
This paper presents the rationale, objectives and advances of an ongoing project (the DesPho-APaDy project, funded by the French National Agency of Research) which aims to provide a systematic and quantified description of French dysarthric speech, over a large population of patients and three dysarthria types (related to Parkinson's disease, Amyotrophic Lateral Sclerosis, and a pure cerebellar alteration). First, the two French corpora of dysarthric patients from which the speech data have been selected for analysis are described. Second, the paper discusses the need for a structured computerized platform to store, organize and make accessible (for selected and protected usage) dysarthric speech corpora and the associated clinical information on patients, which are mostly scattered across different locations (labs, hospitals, …). The design of both a computer database and a multi-field query interface is proposed for the clinical context. Finally, advances of the project are presented: the selection of the population used for the dysarthria analysis, the preprocessing of the speech files, their orthographic transcription and their automatic alignment.

1. Introduction
Dysarthria refers to neurologically-based speech disturbances. It results from damage to the central and/or peripheral nervous system that impairs the transmission of neural messages to the muscles involved in speech production. Dysarthria is therefore the expression of a deficit in the motor execution of speech movements, and thus a motoric speech disorder. Strength, speed, range, rigidity, coordination and precision of speech gestures can be altered at any level of the speech production system (respiratory, phonatory, supralaryngeal).

Dysarthria is one of the most frequent disorders of verbal communication associated with damage to the nervous system. Indeed, it can appear in the clinical profile of a large number of neurological disorders, including cerebellar diseases, stroke, Parkinson's disease, Amyotrophic Lateral Sclerosis (ALS), multiple sclerosis, cerebral palsy, and traumatic brain injury (see e.g. Duffy, […]).

The clinical manifestation of dysarthria and the characteristics of the patients' speech depend on its cause and the disease associated with it. Therefore, a classification of dysarthria as a unitary condition is inaccurate, and dysarthria should rather be considered as a label for a group of disorders (Peacher, 1950; Grewel, 1957; Darley et al., 1969a). Several classification schemes have been proposed in the literature to characterize different groups of dysarthrias. They are either based on salient auditory-perceptual features (phonatory, articulatory, prosodic…) that are used to characterize specific articulatory or kinematic behaviors (e.g. ataxic, hypokinetic dysarthrias - Darley et al., 1969a; Darley et al., 1969b; Darley et al., 1975) or based on etiological and/or neuroanatomical criteria (localization of lesion site) (see Grewel, 1957; Auzou et al., 2007 for a review). Although the main features that differentiate 'typical' patients affected by different dysarthria types have been identified, the study of dysarthrias needs more comprehensive phonetic descriptions to overcome the great diversity observed in […]

In the following section, we present the rationale and the main objectives of our ongoing research project on the acoustic-phonetic characteristics of the speech of dysarthric French patients. Section 3 describes two dysarthric speech corpora (with a focus on the Claude Chevrie-Muller corpus) and the design of a multi-field query computer interface developed to facilitate the management and storage of the recordings. Section 4 presents the advances of the project with a description of the selection procedure of the patients to be analyzed, and the method developed for the pre-processing of the speech files. Finally, section 5 concludes this paper by discussing some theoretical issues related to this long-[…]

2. Rationale and Objectives of the Project: Characterizing Dysarthric Speech

2.1. Challenges

One major challenge to overcome when trying to characterize dysarthric speech is that dysarthrias are complex disorders. All dysarthrias stem from defined neuropathological conditions with a deficit in the spatio-temporal execution of speech movements. However, muscular weakness, spasticity, coordination disorder, involuntary movements, or altered muscle tonus will have varied consequences on the articulatory movements (articulatory target undershoot or overshoot, reduced control of movement amplitude and speed or over time, uncoordinated speech gestures…). Moreover, all dysarthrias involve disturbances, to varying degrees, affecting different levels of speech production: respiratory, laryngeal, velopharyngeal (resonance), and articulatory (oro-facial) (Auzou et al., 2007; Kent et al., 1998). Thus dysarthria not only refers to a deficit in articulation per se, but encompasses disturbances in the control of voice quality, speech rhythm, loudness, segmental articulation, pitch, fluency, etc.

A second challenge stems from the vast amount of inter- and intra-speaker variability. As mentioned above, different types of dysarthria sharing common features have to be considered. While these types can be defined by shared features (reduced pitch modulation, speech rate perturbation, impaired coordination, nasal resonance…), they are not well defined by a distinctive and exclusive set of features. Individual speaker idiosyncrasies, differences in the severity of the disease, speaker-specific impairments and compensatory strategies are among the different sources of variability that have to be taken into account.

Given these challenges, the search for relevant and stable criteria in order to describe dysarthric speech patterns needs to include multiple deviant speech dimensions, both at the segmental and the suprasegmental levels, and to be applied to a large population of patients for intra- and inter-group comparison as well as longitudinal […]

2.2. Limitations

Even though associations between deviant acoustic-phonetic dimensions and certain types of dysarthria have been made in clinical practice and in the clinical literature, descriptions of dysarthria are often based on perceptual assessments as done in the precursory studies of Darley et al. (1969; 1975). It is true that perceptual analysis is still considered as the "gold standard" and a patient is declared dysarthric because he is perceived dysarthric (Duffy, 2005). However, instrumental analysis […] complementary information for the assessment and to objectively quantify descriptions of the speech patterns (Collins, 1984; Kent et al., 1999). A review of acoustic studies of dysarthric speech is available in (Kent et al., 1999). It reports that "the great majority (of studies) focuses on a small set of measures and typically a very small number of subjects". We can add that most studies focus on a single subsystem (laryngeal, velopharyngeal, labial articulation…) and are based on ad hoc speech production tasks (sustained vowel, isolated sentences, diadochokinesis…). In the review by Murdoch et al. (1998) of 17 acoustic studies, most studies were based on word and sentence reading, one study looked at read texts, and only two studies used spontaneous speech. Acoustic analysis of continuous speech is thus scarce except in the case of prosodic studies as in (Schlenk et al., 1993; Viallet et al., 2002; Mori et al., 2004; Duez, 2006). Finally, very few comparisons between existing studies have been made, and there is no overall characterization of dysarthric speech patterns. This lack of a comprehensive phonetic description of dysarthric speech patterns can be partly explained by the following […]

(a) Dysarthric speech can be very impaired and information in the speech signal is difficult to obtain and analyze. Consequently, studies are often restricted to a limited set of acoustic measures, and attention is usually focused on a few specific impaired aspects of the speech production system. Since studies have not all concentrated on the same acoustic cues and on the same patient population, comparisons are rare. As a further consequence, studies are usually restricted to small cohorts of dysarthric speakers and limited to a small […]

(b) The absence of a comprehensive picture of dysarthric speech features can also be explained by the fact that the majority of studies are limited to the analysis of one type of dysarthria, or the comparison of at most two types of dysarthria. Although the acoustic features of the major types of dysarthria have been fairly well documented, most of the acoustic studies have focused on dysarthrias associated with Parkinson's disease or […]

Furthermore, these studies cover a restricted language area: while significant progress has been made on the description of English dysarthric patients, fewer studies were carried out on French dysarthric speakers (though see Monfrais-Pfauwadel, 1995; Robert et al., 1999; Baudelle et al., 2003; Gentil et al., 2003; Viallet et al., […]).

Finally, various studies reported in the literature rely on automatic methods drawn from automatic speech processing. Devoted to speech disorders (for instance Gu, 2005; Maier, 2007; Su, 2008; Middag, 2009), the large majority of these methods aim to provide an objective assessment of speech quality in order to cope with the well-known drawbacks of perceptual assessment, such as its subjectivity. Focused on objective assessment, they do not concentrate on using automatic approaches to characterize dysarthric speech for a better understanding, as proposed in a very few studies such as (Teston et al., 1995; […]).

2.3. Characterizing Dysarthric Speech: Objectives of the Project

2.3.1. A Comprehensive Acoustic-Phonetic Description of Dysarthric Speech

The main objective of this project is to provide a systematic, quantified acoustic description of the speech patterns of French dysarthric speakers. Three major types of dysarthria are examined and a relatively large cohort of patients is included in each type (see 4).

A standardized procedure for the acoustic-phonetic characterization of a patient's production is proposed. The originality of our approach comes from the combination of methods and analysis procedures drawing upon both phonetics and speech engineering. Thus, the procedure will involve both manual analysis (by human experts) of the acoustic-phonetic properties of the productions and automatic acoustic analysis of speech signals. A continuous back and forth between these two techniques should gain from the potential of both […]

A large set of acoustic-phonetic dimensions will be investigated to capture the scope of acoustic variations associated with dysarthria and to identify relevant, reliable and robust criteria to characterize patients' speech. Spectral and temporal cues, segmental and suprasegmental criteria, infra- and supraglottal dimensions will be examined via a set of pre-defined measurements that will be used to screen all the selected patients. The relevance of the criteria will be evaluated […]
• differentiate dysarthric productions from non-[…]
• distinguish different (sub-)types of dysarthric […]
• monitor the evolution of dysarthria in a longitudinal […]

The feasibility and the originality of this project emerge from the collaboration of a team of researchers, all speech specialists but with complementary expertise in phonetics, clinical practice and speech engineering. These partners are located in Paris (Laboratoire de Phonétique et Phonologie - LPP), Aix-en-Provence (Laboratoire Parole et Langage - LPL), and Avignon (Laboratoire Informatique Avignon - LIA).

2.3.3. Development of a Multiple-Field Query Database of Dysarthric Speech

Research on disordered speech is confronted with the difficulty of getting appropriate and sufficiently large quantities of speech data, homogeneous in quality, and sufficiently documented by clinical information on the patients (diagnosis, medical follow-up, medication, symptoms…). Therefore, the second aim of this project (and a preliminary step for our acoustic description) is to design and create a computer database where digitized dysarthric speech corpora and associated patients' clinical information can be stored, organized and made accessible (for selected and protected usage) through multiple-field queries. The development of this database is motivated by the fact that dysarthric speech recordings are currently scattered across different locations in France, in different formats, and often without the required indexing or clinical documentation. Consequently, their access and handling are difficult, despite the strong demand to use them. Moreover, the development of this database is also motivated by the need to preserve a large speech corpus of French dysarthric speakers recorded from 1967, the CCM database (see section 3.1.1), that […]

While this computer database is designed to manage any clinical content related to speech and voice disorders, it will first be designed with the corpora involved in this […]

3. Corpora of French Dysarthric Patients

In the context of our project, the two corpora described below provide us with a large sample of speech data from French dysarthric speakers that can be used for comparisons between speakers, between groups of speakers and in some cases for longitudinal evaluations.

3.1. The CCM Corpus

Over the past 30 years, Dr Claude Chevrie-Muller (henceforth CCM) and her team recorded, at the 'Laboratoire d'étude de la voix et de la parole' (INSERM U3), the patients that were sent to her by different neurologists for the assessment of disordered speech and its relation with neurological pathology. This extensive work has given birth to a unique, highly valuable historical corpus of neurological speech disorders in French, known as "Pathologie de la voix et de la parole en neurologie" or "CCM corpus".

This corpus contains about 1000 hours of disordered speech, produced by approximately 5000 patients (adults and children), mainly suffering from dysphonia and dysarthria, but also anarthria, aphasia, stuttering, psychiatric disorders and so on. In the population of adult dysarthric speakers, 860 patients were classified according to their neurological diagnosis. Four main types of dysarthria are represented. They include three main groups of neurological syndromes and a group of mixed symptoms:
(1) Disorders related to an impairment of the extrapyramidal system. These disorders are characterized by a modification of initiation and offset of muscle tonus control with rigidity, hypokinesia and hypertonia. This group is represented by Parkinson's disease and related Parkinson's syndromes as well as choreic disorders.
(2) Disorders related to an impairment of the pyramidal system (principal motor tract) and responsible for paralytic dysarthria. These can be associated with a pseudo bulbar syndrome with a bilateral spastic component or a bulbar syndrome such as in […]
(3) Disorders related to an impairment of the cerebellar system, which is characterized by an alteration of the ongoing temporal-spatial control of the movement. These can be seen in diseases such as Multiple Sclerosis, […]
(4) A group of mixed dysarthrias related to more diffuse pathologies such as vascular disease, brain injury, etc.
A large variety of speech materials is available in this corpus, as listed in Table 1. Over the past few years, the protocol has evolved and for the oldest recordings some speech tasks were not recorded: all the items marked with a '*' in Table 1 are present in all recordings, and it is only after 1980 that the other items were included in the protocol. The production of the whole protocol lasts […]

All the recordings were done in a sound booth with a table-top microphone. Audio and electroglottographic signals were recorded on the two channels of Revox tapes, with indexing in a notebook. Each recording has been analyzed by the CCM team according to specific perceptual and acoustic features. For example, speech rate, word length compared to normative data, segmental description (vowel and consonant realization) and other prosodic variations were reported in the final assessment, as well as the oro-pharyngo-laryngeal and praxis clinical examination. The CCM corpus thus contains three types of […]
• personal patient information (civil status, tape number, number of recordings, some patients being recorded 4-5 times for longitudinal analysis) and final assessment of the patient's recording were stored as […]
• medical follow-up (diagnosis, treatments, surgery reports) was stored in patient's charts that consist of […]
• audio and electroglottographic (EGG) recordings […]

Recordings, notebooks and patient's charts containing all available clinical information are now stored in the Voice and Speech medical lab associated with the Laboratoire de Phonétique et Phonologie (Paris).

Furthermore, a control population of 80 healthy male and female speakers was recorded with the same protocol. In order to continue this activity, Dr L. Crevier-Buchman and her colleagues still record the neurological patients coming to the Voice and Speech Lab of the European Hospital Georges Pompidou (Paris). Recordings are now made on DAT tapes, following the same protocol but with a head-mounted microphone to avoid variability in intensity due to patients' movements. EGG is no longer […]

It is worth noting that there is a huge loss of data in the CCM corpus. Because of the large inter-patient variability, clinical information about the speaker needs to be updated (score on international scales, precise treatment information, medical states: with/without medication, stimulation, etc.). In fact, our experience shows that these requirements are rarely satisfied in a retrospective study, especially when using old data. This is why we have decided to complete our database with other sources of […]

3.2. The Aix-Neurology-Hospital corpus (ANH)

For the past fifteen years, at the instigation of F. Viallet, the department of neurology of Aix-en-Provence Hospital has recorded dysarthric speakers regularly. These patients are recorded with the EVA workstation (Teston et al., 1999) and clinical data are recorded simultaneously on a spreadsheet. Currently, the Aix-Neurology-Hospital (ANH) corpus contains 990 patients (average age = 67.7) and 160 control speakers (average age = 62) with sound, aerodynamic recordings and clinical data (diagnosis, regular and contextual medication, clinical motor evaluation…). The population of patients is mainly composed of Parkinson's disease (601) and Parkinsonian syndromes (98).

[…]
(1) the recording of physical (SPL intensity) and physiological signals (oral airflow, estimated sub-glottal pressure, nasal airflow) in addition to the […]
(2) the multiple speech tasks: sustained vowels, maximal phonation time, airway interrupted sentences to estimate sub-glottal pressure, special sentences to estimate velar leakage, text reading with several speed instructions, spontaneous description of a picture, diadochokinesis and so on. The recorded tasks can vary from one patient to another. For example, estimated sub-glottal pressure is now systematically recorded in Parkinsonian hypophonia (Sarr, 2009). On the other hand, velar leakage is mainly recorded for paralytic dysarthria as proposed […]
(3) the multiple clinical contexts of the recording sessions: 601 Parkinson patients recorded with/without dopa, with/without deep brain stimulator, which represents 1616 sessions of recordings;
(4) the collection of a comprehensive set of information on the speaker (date and birthplace, mother tongue, profession…), and the clinical conditions (date of appearance of the disease, localization of the symptoms, medicament dosage, characteristics of possible electrophysiological stimulator, scores of the clinical examinations like UPDRS…). Such precision is necessary for clinical studies (e.g. effect of the therapies on speech production) but also at the linguistic level (search for phonetic-acoustic characterization of homogeneous groups of dysarthric […]

All the data and information are computerized. This is our main source of Parkinson patients.

4. Advances of the Project

4.1. Getting the audio files

The recordings of the CCM corpus are still on an analog medium (Revox tapes) and, to ensure their safeguarding, need to be urgently digitized. This task is very time consuming. First, each Revox tape contains several patients, and it appeared that digitizing a whole tape at once was a quicker solution than searching for a specific dysarthric patient and digitizing it. Second, during each recording session the speed of the tape was changed according to the speech task (and the need to record EGG). Thus, "real-time" auditory control of the recording has to be done in order to stop the tape at each change of speed and set the playing speed accordingly. Third, many tapes are in bad condition, and several recordings are of bad quality (mainly due to speaker movements relative to the table-top microphone). Thus adjustments have to be made in order to ensure reasonable audio quality in the output files. To date, 180 […] 94 additional patients recorded on DAT tapes were also digitally captured as wav files. Then, all these recordings were segmented per patient and per speech task. The same procedure is applied to the control population. Then the files are renamed for anonymous storage.

In order to get a sufficient amount of speech to be analyzed acoustically, we have chosen to work first on the text reading speech task. It provides more than 1 minute of speech, identical for all patients and with segmental, prosodic and fluency variations as well as information on temporal features such as pauses, group phrasing and reading speed throughout the text.

4.2. Design of a Database and Multi-Field Query Interface

As mentioned in 2.3.2, the main interest in pooling and organizing clinical resources is to make this information durable, and to allow some exchange and increasing enrichment via an accessible and shared computerized […]

If the concepts around databases (DB) are familiar to computer scientists, this can be very different for non-specialists. It is common to find that a collection of audio recordings or data is called a database. However, a database differs from a collection of recordings/data by a consistent structure and organization based on a model, shareable by a group of people and stored on a digital medium, allowing data selection according to precise criteria. In the literature, these aspects are handled by a DataBase Management System (DBMS), which is responsible for (a) supporting the concepts defined by the data model, (b) enforcing the consistency rules related to the data, (c) making the sharing of data between several users transparent while ensuring the confidentiality of some parts of the data, (d) answering user queries with a high performance level, and finally, (e) providing different data access languages according to […]

In this project, a working group has been dedicated to this data structuring task in order to be able to provide users (clinicians, therapists, speech scientists) with a straightforward multi-field query interface capable of responding to their data access needs. It is worth noting that data include here audio and articulatory recordings but also all the information related to them. This information includes patients' information, such as personal and clinical data (diagnosis, medical follow-up, medication, symptoms…), recording protocol information (type of speech, number of sessions, medication state of the patient, …), material used for the recordings, etc. All this information is necessary for a controlled analysis of the speech data. Before designing this multi-field query interface, the working group chose a relational model to structure data, considered as one of the simplest and most refined models for databases. Its simplicity stems from its tabular but efficient organization, which makes it possible to define a set of objects, their attributes (characteristics) and the relations between objects. This results in an intuitive architecture, efficient in terms of computation, access and storage, and easily understandable by non-specialists. In this context, a functional analysis has been carried out in order to define a set of objects, attributes and relations related to the clinical environment. This analysis was refined afterward by confronting the relational data model with empirical and "real" clinical data issued from the disordered speech […]

Finally, the working group is now designing and developing the multi-field query interface, necessary for the data access. This interface is composed of 3 blocks to enter the criteria of the query:
(1) Basic sociolinguistic information (gender, languages, birthplace, address restricted to region);
(2) Clinical information: diagnostics, symptoms, risk […] the age of the speaker at the recording time, clinical context (ex: ON, OFF, pre-op, post-[…]
[…] speech tasks (reading, sustained vowels, […] linguistic content ([reading] "La chèvre de M. Seguin", [sustained vowels] /a/, [diadocho-[…] studies: the data used by a specific study (ex: ANR, JEP2010, a250, master 2010 Weisz…)

If the query is validated, a tabulated text file is provided including all information chosen by the user. This information can be different from the one used to select the data. For instance, it may be interesting to know the profession of the speaker without it being a query criterion. In a second step, the user can refine the selection, in an Excel spreadsheet for instance, and can select speakers with Parkinson's disease without Deep Brain Stimulation and recorded after more than 12 hours of L-dopa withdrawal. When this local selection is done, the user provides a list of target data, which are distributed by a secured automaton. For the time being, as a matter of confidentiality, these operations are not available through […]
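
The relational organization and the two-step selection described in Section 4.2 can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 and csv modules and is not the project's actual database: the three tables, every column name, the field whitelist `FIELDS`, the helper names `run_query` and `refine`, and the sample rows are all hypothetical, chosen only to mirror the objects, attributes and relations mentioned in the text.

```python
import csv
import io
import sqlite3

# Hypothetical schema: every table and column name is an illustrative assumption.
SCHEMA = """
CREATE TABLE speaker (
    speaker_id INTEGER PRIMARY KEY,
    gender TEXT, profession TEXT, diagnosis TEXT
);
CREATE TABLE session (
    session_id INTEGER PRIMARY KEY,
    speaker_id INTEGER NOT NULL REFERENCES speaker(speaker_id),
    age_at_recording INTEGER,
    dbs INTEGER,               -- 1 if a deep brain stimulator is present
    ldopa_withdrawal_h REAL    -- hours since the last L-dopa intake
);
CREATE TABLE recording (
    recording_id INTEGER PRIMARY KEY,
    session_id INTEGER NOT NULL REFERENCES session(session_id),
    speech_task TEXT, study TEXT
);
"""

# Whitelist of queryable fields, mapped to qualified column names.
# Unknown field names raise KeyError instead of reaching the SQL string.
FIELDS = {
    "gender": "speaker.gender",
    "profession": "speaker.profession",
    "diagnosis": "speaker.diagnosis",
    "age_at_recording": "session.age_at_recording",
    "dbs": "session.dbs",
    "ldopa_withdrawal_h": "session.ldopa_withdrawal_h",
    "speech_task": "recording.speech_task",
    "study": "recording.study",
}


def run_query(conn, criteria, output_fields):
    """Return tab-separated text for the recordings matching all criteria.

    Output fields need not be query criteria (e.g. the speaker's profession).
    """
    select = ", ".join(FIELDS[name] for name in output_fields)
    where = " AND ".join(f"{FIELDS[name]} = ?" for name in criteria) or "1"
    sql = (
        f"SELECT {select} FROM recording "
        "JOIN session ON session.session_id = recording.session_id "
        "JOIN speaker ON speaker.speaker_id = session.speaker_id "
        f"WHERE {where} ORDER BY recording.recording_id"
    )
    rows = conn.execute(sql, list(criteria.values())).fetchall()
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(output_fields)
    writer.writerows(rows)
    return out.getvalue()


def refine(tsv_text):
    """Local second pass: no stimulator, more than 12 h of L-dopa withdrawal."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        row for row in reader
        if row["dbs"] == "0"
        and row["ldopa_withdrawal_h"]
        and float(row["ldopa_withdrawal_h"]) > 12
    ]


def demo():
    """Build a tiny in-memory database (invented rows) and run both steps."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO speaker VALUES (?, ?, ?, ?)",
        [(1, "F", "teacher", "Parkinson"), (2, "M", "farmer", "Parkinson")],
    )
    conn.executemany(
        "INSERT INTO session VALUES (?, ?, ?, ?, ?)",
        [(10, 1, 67, 0, 14.0), (11, 2, 59, 1, 14.0)],
    )
    conn.executemany(
        "INSERT INTO recording VALUES (?, ?, ?, ?)",
        [(100, 10, "reading", "demo"), (101, 11, "reading", "demo"),
         (102, 10, "sustained vowels", "demo")],
    )
    tsv = run_query(
        conn,
        {"diagnosis": "Parkinson", "speech_task": "reading"},
        ["profession", "dbs", "ldopa_withdrawal_h"],
    )
    return tsv, refine(tsv)
```

Calling `demo()` returns the tabulated text for the two matching reading recordings and, after the local refinement, only the row of the speaker without a stimulator. Keeping the refinement outside the database mirrors the two-step workflow of the text; it could equally be expressed as additional query criteria.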
4.3. Selection of Patients for the Acoustic Study

In order to include a sufficient number of patients and dysarthria types in our prospective acoustic study, we focused on neurophysiologic alterations of the three main neurological systems: the extrapyramidal system represented by Parkinsonian dysarthria, the cerebellar system represented by ataxic dysarthria and the pyramidal system represented by ALS dysarthria. For each of these three types of dysarthria, the selection was based on i) the clinical file and information on the disease, the certainty of the diagnosis, the ongoing treatment, ii) the severity of the dysarthria (we are only working on moderate dysarthrias with relatively intelligible speech). The selection includes:
[…]
• 30 patients with a pure cerebellar alteration
• 30 patients with Parkinson's disease selected in the ANH corpus. All had been off L-dopa for 12 hours; 15 read the text of the ANH protocol ('La chèvre') and 15 read both the text of the ANH protocol and that of the CCM protocol ('Tic tac').

The recordings of these selected patients are being evaluated perceptually by 3 expert judges. Voice quality, articulation, prosody, intelligibility, naturalness of speech, and severity are rated on a perceptual scale.

4.4. Pre-Processing of the Audio Files

In order to be able to perform the manual and automatic acoustic analysis on the selection of patients described in the previous section, a pre-processing of the audio files is considered necessary, relying on an automatic text-constrained phonetic alignment. This pre-processing is based on different resources (see below) including an orthographic transcription of the speech production to analyze. Due to the specific nature of the audio files and the quality level of the phonetic alignment expected for the acoustic analysis, individual orthographic transcriptions of each audio file are necessary, as they make it possible to take into account the possible divergences of the speech production (due to difficulties for the patient to speak, disfluencies, …) compared to the expected ones related to the reading tasks (i.e. the texts of "La chèvre" […]

4.4.1. Orthographic transcriptions

Each audio file was listened to and manually transcribed following a set of common transcription rules, especially designed for this clinical context. These rules tend to provide a compromise between the quality level of the phonetic alignment expected and the speech disorders due to dysarthria. The following list provides the main […]
• Rule 1: a deletion is the lack of an entire word or of one or more syllables (e.g. the lack of phoneme [R] in the word "pauvre" will not be considered as a deletion);
• Rule 2: a substitution is the replacement of at least three successive phonemes by another sequence of phonemes in a word, or the replacement of […]
• Rule 3: an insertion is any addition of segments of at least one syllable compared to the original text (e.g. repetition of an entire word or of syllable(s) in the word, hesitations and filled pauses);
• Rule 4: all the speech produced by another speaker (speech therapists for instance) during the recording is transcribed but annotated as external production. The same rule is applied for external noise.

Since rules 2, 3 and 4 denote divergences between the speech production and the expected text to read, the SAMPA alphabet was used to provide a phonetic transcription of added phoneme sequences. Specific tags are added in the transcription to signal these different cases (e.g. for a substitution: [su=expected_word] pronounced_word_in_sampa [su]). Finally, a notebook with other remarks about each audio file was also elaborated […]

4.4.2. Automatic Text-Constrained Alignment

A text-constrained alignment provides the phoneme time-boundaries of a sequence of words expected in a speech signal¹. When this alignment is performed by a machine, the automatic system requires as input resources both an orthographic transcription related to the speech production and a text-restricted lexicon of expected words associated with their phonological variants.

¹ As opposed to a non-text-constrained alignment, which has to determine the sequences of phonemes as well as their boundaries.

Here, the phonetic alignment is performed by an automatic system developed at the LIA laboratory. This system is based on a Viterbi decoding algorithm coupled with a set of 38 French phonemes (in addition to the input resources reported above). Each phoneme model relies on a three-state HMM, initially trained on French speech corpora produced by non-dysarthric speakers. Since the latter have no connection with the dysarthric corpora, classical unsupervised adaptation techniques are applied iteratively on phoneme models for the automatic phonetic alignment to enhance and refine phoneme […]

To deal with the individual orthographic transcriptions (and potential divergences in terms of words pronounced) and the different rules (notably the substitutions and deletions), it is worth noting that the text-restricted lexicon used by the automatic alignment system is dynamically updated for each audio file in order to take new entries (SAMPA-based words or phoneme sequences) pronounced by the speaker into […]

4.4.3. Quality of the Automatic Phonetic Alignment

A subset of productions was selected for a first evaluation of the automatic phonetic alignment. The subset is gender-balanced and includes different degrees of dysarthria severity (2 control speakers, 2 speakers with moderate dysarthria and 2 with severe dysarthria). The
automatic alignment of the productions was compared to a manual correction of phonetic labels and boundaries performed by 2 phoneticians. For a given phoneme segmented manually and automatically, the comparison is based on the time shift between the midpoints of the two segments. Following Adda-Decker et al. (2008), the agreement between the automatic and manual alignments is defined according to a minimum time lag threshold set to 20 ms².

² 20 ms corresponds to 2 frames in the automatic […]

The comparison of the alignments showed a shift above 20 ms for 17% of segments for the control speakers, 24% for the moderately dysarthric patients, and 56% for the heavily dysarthric patients (Audibert et al., 2010).
In order to enhance the quality of the automatic alignment, already quite satisfactory for most speakers (control and moderate), the system was tuned by combining the information of 2 different sets of acoustic models. This optimization improves the overall performance, notably for heavily dysarthric patients (15% for control speakers, 23% for moderately dysarthric patients, and 44% for heavily dysarthric patients). It should be noted that the altered productions of the latter set of patients were also hard to segment for the human experts.
A comparison between manual and automatic alignments was also done in terms of their consequences on specific acoustic measurements: segment duration, formant frequency, fricative center of gravity (Fougeron et al., 2010). While temporal measurements extracted from automatic alignment have to be interpreted with caution, spectral measurements (whether local, in the middle of a vowel, or global, over the fricative duration) are comparable with those extracted from a manual alignment. These first results are encouraging regarding the possibility of using automatic alignment for some of the acoustic dimensions to be analyzed in our project.

Conclusion and Issues
The understanding of dysarthric speech patterns has evident implications for clinical research on speech disorders, but also for contemporary issues in Speech Sciences.
Recent developments in phonetics and phonology show a trend away from observing the language system towards observing the user of the system. From this perspective, disordered speech is a challenging and promising test case. Basic tenets of our project rely on the assumption that our understanding of speech production proceeds with advances in the study of both normal and disordered speech and that a good model has to unify knowledge from both populations. In that respect, observing the types and range of variation linked to a motoric deficit, such as in dysarthria, is of the utmost interest for a comprehensive model of speech variation. Indeed, it raises challenging issues related to the factors governing variation in speech production in general. Models of variation in phonetics need input from disordered speech patterns, whereas the definition of disordered speech needs references from normal variation. A better understanding of the variation that characterizes dysarthric speech as deviant could thus provide insights into the blurred boundary between normal and pathological speech patterns. In return, dysarthric productions, and their variations, may inform us about normal speaker adaptation to different speech situations. While some progress has been made on the characterization of the acoustic-phonetic properties of dysarthric speech, our knowledge is still limited. The study of disordered speech is at the crossroads between different sub-disciplines of Speech Sciences, and multidisciplinary collaborations, such as the one proposed here, promise progress in this area.

Acknowledgments
This project is funded by grant ANR BLAN08-0125 of the French National Research Agency. We deeply thank Pierre Clément, Aurélie Nuremberg, and Olavo Panseri, who are also collaborating on this project.

References
Adda-Decker, M., Gendrot, C., Nguyen, N. (2008). Contributions du traitement automatique de la parole à l’étude des voyelles orales du français. Traitement Automatique des Langues, 49, n°3, pp. 13--46.
Audibert, N., Fougeron, C., Fredouille, C., Meunier, C., Panseri, O. (2010). Evaluation d’un alignement automatique sur la parole dysarthrique. 28èmes JEP, Mons, Belgium.
Auzou, P., Ozsancak, C., Pinto, S., Rolland, V. (2007). Les […]
Baudelle, E., Vaissière, J., Renard, J. L., Roubeau, B., Chevrie-Muller, C. (2003). Caractéristiques vocaliques intrinsèques et co-intrinsèques dans les dysarthries cérébelleuses et parkinsoniennes. Folia Phoniatrica et Logopedica, 55, pp. […]
Collins, M. (1984). Integrating perceptual and instrumental procedures in dysarthria assessment. […] Communication Disorders, 5, pp. 159--170.
Darley, F. L., Aronson, A. E., Brown, J. R. (1969). Clusters of Deviant Speech Dimensions in the Dysarthrias. Journal of Speech and Hearing Research, 12, pp. 462--496.
Darley, F. L., Aronson, A. E., Brown, J. R. (1969). Differential diagnostic patterns of dysarthria. Journal of Speech and […]
Darley, F. L., Aronson, A. E., Brown, J. R. (1975). Motor Speech Disorders. Philadelphia: W.B. Saunders.
Duez, D. (2006). Syllable structure, syllable duration and final lengthening in Parkinsonian French speech. Journal of Multilingual Communication Disorders, 4, 1, pp. 45--57.
Duffy, J. R. (1995). Motor Speech Disorders: Substrates, differential diagnostics and management. St Louis: Mosby-[…]
Duffy, J. R. (2005). Motor speech disorders: substrates, differential diagnosis and management. Mosby-Yearbook. St. Louis.
Fougeron, C., Audibert, N., Fredouille, C., Meunier, C., Gendrot, C., Panseri, O. (2010). Comparaison d’analyses phonétiques de parole dysarthrique basées sur un alignement
manuel et un alignement automatique. 28èmes JEP, Mons, Belgium.
Gentil, M., Pinto, S., Pollak, P., Benabid, A. L. (2003). Effect of bilateral stimulation of the subthalamic nucleus on Parkinsonian dysarthria. Brain and Language, 85, pp. 190--[…]
Grewel, F. (1957). Classification of dysarthrias. Acta Psychiatrica Neurologica Scandinavica, 32, pp. 325--337.
Gu, L., Harris, J. G., Rahul, S., Sapienza, C. (2005). Disordered speech evaluation using objective quality measures. In proc. of ICASSP'05, Philadelphia, US.
Kent, R. D., Kent, J. F., Duffy, J. R., Weismer, G. (1998). The dysarthrias: Speech-voice profiles, related dysfunctions, and neuropathologies. Journal of Medical Speech-Language […]
Kent, R. D., Weismer, G., Kent, J. F., Vorperian, H. K., Duffy, J. R. (1999). Acoustic studies of dysarthric speech: Methods, progress, and potential. The Journal of Communication […]
Maier, A., Schuster, M., Batliner, A., Nöth, E., Nkenke, E. (2007). Automatic scoring of the intelligibility in patients with cancer of the oral cavity. In proc. of Interspeech'07, Antwerpen, Belgium.
McNeil, M. R. (1997). Clinical management of sensorimotor speech disorders. New York: Thieme.
Middag, C., Martens, J.-P., Van Nuffelen, G., De Bodt, M. (2009). Automated intelligibility assessment of pathological speech using phonological features. EURASIP Journal on Advances in Signal Processing, v. 2009.
Monfrais-Pfauwadel, M. C. (1995). Les disfluences autres que celles du bégaiement. Revue de laryngologie, d'otologie et de rhinologie, 116(4), pp. 267--270.
Mori, H., Kobayashi, Y., Kasuya, H., Hirose, H., Kobayashi, N. (2004). Prosodic and Segmental Evaluation of Dysarthric Speech. Proc. Speech Prosody, Nara, Japan, 4 p.
Murdoch, B. (1998). Dysarthria - A Physiological Approach to Assessment and Treatment. Nelson Thornes Ltd.
Peacher, W. G. (1950). The etiology and differential diagnosis of dysarthria. Journal of Speech and Hearing Disorders, 15: […]
Pinto, S., Gentil, M., Krack, P., Sauleau, P., Fraix, V., Benabid, A.-L., Pollak, P. (2005). Changes induced by levodopa and subthalamic nucleus stimulation on Parkinsonian speech. Movement Disorders, vol. 20, no. 11, pp. 1507--1515.
Robert, D., Sangla, I., Azulay, J.P., Giovanni, A., Cannoni, M., Pouget, J. (1995). Diagnostic et suivi de l’insuffisance vélaire dans les formes bulbaires des maladies du motoneurone. Actes du congrès sur le Voile Pathologique, Société Française de phoniatrie, Lyon, pp. 63--74.
Robert, D., Pouget, J., Giovanni, A., Azulay, J.P., Triglia, J.M. (1999). Quantitative Voice Analysis in the Assessment of Bulbar Involvement in Amyotrophic Lateral Sclerosis. Acta Otolaryngol, 119, pp. 724--731.
Sarr, M., Pinto, S., Jankowski, L., Purson, A., Ghio, A., Espesser, R., Teston, B., Viallet, F. (2009). L-dopa and STN stimulation effects on pneumophonic coordination in Parkinsonian dysarthria: intra-oral pressure measurements. International Congress of Parkinson's Disease and Movement Disorders, Movement Disorders, vol. 24, no. S1, pp. S342.
Schlenck, K.-J., Bettrich, R., Willmes, K. (1993). Aspects of disturbed prosody in dysarthria. Clinical Linguistics, Phonetics, […]
Su, H. Y., Wu, C. H., Tsai, P. J. (2008). Automatic assessment of articulation disorders using confident unit-based model adaptation. In proc. of ICASSP, Las Vegas, US.
Teston, B., Ghio, A., Galindo, B. (1999). A multisensor data acquisition and processing system for speech production investigation. In proc. of ICPHS'99, pp. 2251--2254.
Teston, B., Galindo, A. (1995). A Diagnostic and Rehabilitation Aid Workstation for Speech and Voice Pathologies. In proc. of Eurospeech'95, Madrid, Spain.
Viallet, F., Jankowski, L., Purson, A., Teston, B. (2004). Dopa effects on laryngeal dysfunction in Parkinson’s disease: An acoustic and aerodynamic study. International Congress of Parkinson’s Disease and Movement Disorders, Movement Disorders, vol. 19, Suppl. 9, pp. S237.
Vijayalakshmi, P., Reddy, M. R., O'Shaughnessy, D. (2006). Assessment of articulatory sub-systems of dysarthric speech using an isolated-style phoneme recognition system. In proc. […]

• the production of automatic series (counting from 1 to […])   *
• two readings of a sentence and its repetitions (“C’est une affaire intéressante, qu’en pensez-vous? Il […]
• the reading of two lists of words (Bonjour, Femme, Chasseur, Légat, Exploit, Gargarisme, Voleur, Banane, Coupe, Coupe-papier, Spectacle, Un match de boxe, Jaser, Magique) ; (Bonjour, Jaser, Légat, Banane, Voleur, Coupe-papier, Justice, Zèbre, Magique, Exploit, […])   *
• the production of sustained vowels (/a/, /e/, /i/, /o/)
• the reading of a text (a fairy tale of 170 words, ‘Le […]
• a story telling based on a picture support ("La chute dans la boue", based on a test evaluating […]
• spontaneous speech (narrating the day's activities)
• syllable repetition (CV, VC or VCV with V= [a] and C= [p, t, k, S, s, f, b, d, g, Z, z, v, l, R, m, n, j])
Table 1 : Speech material recorded in the CCM database. A ‘*’ in the second column indicates whether the material is available for all recordings.
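
As an illustration of the agreement measure described in Section 4.4.3, the following minimal sketch computes the proportion of segments whose midpoint differs by more than 20 ms between a manual and an automatic alignment. It is not the project's evaluation code: the names `Segment` and `shift_rate` are hypothetical, times are assumed to be in seconds, and both alignments are assumed to contain the same phonemes in the same order so that segments can be paired one to one.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Segment:
    """One aligned phoneme (hypothetical structure, times in seconds)."""
    label: str
    start: float
    end: float

    @property
    def midpoint(self) -> float:
        return (self.start + self.end) / 2.0


def shift_rate(manual: Sequence[Segment],
               automatic: Sequence[Segment],
               threshold: float = 0.020) -> float:
    """Proportion of segments whose midpoints differ by more than `threshold`.

    Assumption: the two sequences are paired by position, i.e. the i-th
    manual segment corresponds to the i-th automatic segment.
    """
    if len(manual) != len(automatic):
        raise ValueError("alignments must contain the same number of segments")
    if not manual:
        raise ValueError("alignments must not be empty")
    shifted = sum(
        1 for m, a in zip(manual, automatic)
        if abs(a.midpoint - m.midpoint) > threshold
    )
    return shifted / len(manual)


# Invented toy data (not taken from the corpora): four phonemes.
manual = [
    Segment("p", 0.00, 0.08),
    Segment("a", 0.08, 0.20),
    Segment("R", 0.20, 0.26),
    Segment("i", 0.26, 0.40),
]
automatic = [
    Segment("p", 0.00, 0.09),
    Segment("a", 0.09, 0.21),
    Segment("R", 0.21, 0.33),
    Segment("i", 0.33, 0.40),
]

# Midpoint shifts are 5, 10, 40 and 35 ms, so two of four exceed 20 ms.
print(f"{shift_rate(manual, automatic):.0%} of segments shifted above 20 ms")
```

Pairing by position is the simplest choice for a text-constrained alignment; if the manual correction also changes phonetic labels, the pairing step would need to be adapted.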

Source: http://despho-apady.univ-avignon.fr/documents/FOUGERON-2010-LREC.pdf
