Some activities on Non-linear Speech Processing at ENST/CNRS-LTCI Gérard CHOLLET ENST/CNRS-LTCI 46 rue Barrault 75634 PARIS cedex 13.

Transcript of the presentation:

1 Some activities on Non-linear Speech Processing at ENST/CNRS-LTCI. Gérard CHOLLET, ENST/CNRS-LTCI, 46 rue Barrault, 75634 PARIS cedex 13

2 Outline: What is ENST/CNRS-LTCI? Research and application topics related to COST-277: speech production and perception; speech analysis and synthesis; speech coding (the SYMPATEX project); automatic speech recognition (the SIROCCO project); speaker characterisation and verification. Perspectives within COST-277.

3 Our affiliations: ENST: Ecole Nationale Supérieure des Télécommunications; CNRS: Centre National de la Recherche Scientifique; LTCI: Laboratoire de Traitement et Communication de l'Information

4 What is ENST? The Ecole Nationale Supérieure des Télécommunications, ranked among the Grandes Ecoles d'Ingénieurs. It graduates 250 state-certified engineers each year and is part of the Groupement des Ecoles de Télécommunications (GET).

5 GET: Groupement des Ecoles de Télécommunications. ENST-Paris; ENST-Bretagne in Brest; Institut National des Télécommunications in Evry; EURECOM in Sophia-Antipolis; ENIC (Ecole Nouvelle d'Ingénieurs en Télécoms) in Lille; Internet school in Marseille.

6 Speech Production and Perception: parametric vocal tract model (Shinji Maeda); non-linear production model using Distinctive Regions and Modes (René Carré); quantal nature of speech (R. Carré and S. Maeda); perceptual filter (Nicolas Moreau); auditory prosthesis (Alain Goyé and Jacques Prado).

7 Speech analysis and synthesis: time-frequency representations, wavelets; time-dependent spectral models (Yves Grenier); HNM (Harmonics + Noise Model) (Olivier Cappé, Eric Moulines, Maurice Charbit); Glottal Excited LPC.

8 Time-dependent Spectral Models: Temporal Decomposition (B. Atal, 1983); vectorial autoregressive models with detection of model ruptures (A. DeLima, Y. Grenier); segmental parameterisation using a time-dependent polynomial expansion (Y. Grenier).

9 Temporal Decomposition
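
The slide gives only the title, so as a reminder of the idea, here is the usual way temporal decomposition is written. This is a sketch of the standard formulation, not a transcription of the slide; the symbols $y_i$, $a_{ik}$, $\phi_k$, $P$, $K$ and $N$ are notation assumed here.

```latex
% Each of the P spectral parameter tracks y_i(n), over N frames, is approximated
% by a weighted sum of K time-localized, overlapping interpolation functions.
\hat{y}_i(n) = \sum_{k=1}^{K} a_{ik}\,\phi_k(n),
\qquad 1 \le i \le P,\quad 1 \le n \le N
% In matrix form: Y (P x N) is approximated by A (P x K) times \Phi (K x N).
Y \approx A\,\Phi
```

The columns of $A$ act as target vectors and each $\phi_k(n)$ describes how strongly target $k$ contributes at frame $n$.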

10 HNM: Harmonics + Noise Model. Block diagram (labels translated from French): input signal; detection of pitch and energy; voicing decision (voiced / unvoiced); estimation of the harmonics; estimation of the harmonic envelope; AR estimation of the residual; AR estimation; H+N (harmonics + noise) parameters. Spectrum sketches with axes f and A.
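
For readers unfamiliar with the model, a common way to write a harmonics-plus-noise decomposition of one voiced frame is sketched below. The notation ($s$, $h$, $v$, $A_k$, $\varphi_k$, $f_0$, $f_s$, $a_i$, $\sigma$, $e$) is assumed here and does not come from the slide; the AR form of the noise part is chosen to match the "AR estimation" boxes of the diagram.

```latex
% Signal = harmonic part + noise part (m is the sample index, f_s the sampling rate)
s[m] = h[m] + v[m]
% Harmonic part: K sinusoids at multiples of the pitch f_0
h[m] = \sum_{k=1}^{K} A_k \cos\!\left( 2\pi k \frac{f_0}{f_s}\, m + \varphi_k \right)
% Noise part: white noise e[m] shaped by an order-p autoregressive filter
v[m] = \sum_{i=1}^{p} a_i\, v[m-i] + \sigma\, e[m]
```

Under this reading, an unvoiced frame has no harmonic part and is described by the AR coefficients and gain alone.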

11 ALISP: Automatic Language Independent Speech Processing. Automatic discovery of segmental units for speech coding, synthesis, recognition, language identification and speaker verification.

12 Speech Coding by indexing. SYMPATEX: SYstème de Messagerie unifiée avec présentation vocale des messages (PArole et TEXte), i.e. a unified messaging system with voice presentation of messages (speech and text). Thomson-CSF, ELAN TTS, Irius GET, ESIEE

13 Coding principle. Block diagram (labels translated from French): speech; spectral analysis; prosodic analysis; HMM recognition; dictionary of HMM models of the ALISP units; representatives A1 … A8 of HMM A; determination of the synthesis units; selection of the synthesis unit by DTW; prosody coding. Transmitted: ALISP unit index; synthesis unit index; pitch, energy, duration.
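
The "selection of the synthesis unit by DTW" step can be illustrated with a minimal sketch: among the stored representatives of the recognised unit, pick the one with the lowest dynamic time warping cost to the incoming segment. The function names, the Euclidean local distance and the step pattern are assumptions made for illustration; the slide does not specify them.

```python
import math

def dtw_distance(a, b):
    """Classic DTW cost between two sequences of feature vectors.

    Assumptions: Euclidean local distance, steps (1,0), (0,1), (1,1),
    no path constraint and no length normalisation.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def choose_representative(segment, representatives):
    """Return (index, cost) of the representative closest to the segment."""
    distances = [dtw_distance(segment, r) for r in representatives]
    best = min(range(len(representatives)), key=distances.__getitem__)
    return best, distances[best]

# Toy one-dimensional "spectral" sequences (hypothetical data).
segment = [(0.0,), (1.0,), (2.0,)]
representatives = [
    [(5.0,), (5.0,)],
    [(0.0,), (1.0,), (1.0,), (2.0,)],
    [(2.0,), (1.0,), (0.0,)],
]
best_index, best_cost = choose_representative(segment, representatives)
print(best_index, best_cost)
```

In the toy data the second representative is a time-stretched copy of the segment, so it is the one selected; only its index would need to be transmitted.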

14 Decoding. Block diagram (labels translated from French): ALISP index; number of the synthesis representative; prosody parameters; representatives A1 … A8; selection of the synthesis unit; synthesis by concatenation; synthetic speech.
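
A rough sketch of the decoder side, under the assumption that both ends share the same dictionary of representatives: each received triple selects one stored unit, and the units are concatenated. The data layout and the name `decode` are hypothetical, and the prosody parameters are carried but not applied here, since the slide does not say how they modify the units.

```python
def decode(frames, unit_dictionary):
    """Concatenate the stored representative chosen by each received frame.

    frames: iterable of (alisp_index, representative_number, prosody) triples.
    unit_dictionary: maps an ALISP unit index to its list of representatives,
    each representative being a list of samples (hypothetical layout).
    """
    samples = []
    for alisp_index, representative_number, _prosody in frames:
        # Prosody (pitch, energy, duration) is ignored in this sketch.
        samples.extend(unit_dictionary[alisp_index][representative_number])
    return samples

# Hypothetical two-unit dictionary with short dummy waveforms.
unit_dictionary = {
    "A": [[0.1, 0.2], [0.3, 0.4]],
    "B": [[0.5], [0.6, 0.7]],
}
received = [("A", 1, None), ("B", 0, None)]
synthetic = decode(received, unit_dictionary)
print(synthetic)
```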

15 Automatic Speech Recognition: recognition of proper names and spellings; keyword spotting, noise robustness, adaptation; Large Vocabulary Speech Recognition (SIROCCO); Markov Random Fields, Bayesian Networks and Graphical Models.

16 Markov Random Fields, Bayesian Networks and Graphical Models: speech modelling with a state-constrained Markov Random Field over frequency bands (Guillaume Gravier and Marc Sigelle); comparative framework to study MRF, Bayesian Networks and Graphical Models.

17 Speaker Verification. Typology of approaches (EAGLES Handbook): text dependent (public password, private password, customized password); text prompted; text independent. Incremental enrolment. Evaluation.

18 Speaker Verification (text independent): the ELISA consortium (ENST, LIA, IRISA, ...); NIST evaluations.

19 Support Vector Machines and Speaker Verification. A hybrid GMM-SVM system is proposed. An SVM scoring model is trained on development data to classify true-target speaker accesses and impostor accesses, using a new feature representation based on GMMs. Diagram: modelling (GMM), scoring (SVM).

20 SVM principles. Diagram: X in the input space is mapped to Φ(X) in the feature space; separating hyperplane H, with the optimal hyperplane H_o; output Class(X).
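
The quantities named on this slide fit the textbook formulation of a support vector machine, sketched below. The symbols $\Phi$ (feature map), $w$, $b$, and the training pairs $(X_i, y_i)$ with $y_i \in \{-1, +1\}$ are notation assumed here; the slide itself shows only the diagram.

```latex
% A separating hyperplane H in feature space and the resulting decision
H:\; w \cdot \Phi(X) + b = 0,
\qquad
\mathrm{Class}(X) = \operatorname{sign}\bigl( w \cdot \Phi(X) + b \bigr)
% The optimal hyperplane H_o is the separating hyperplane of maximum margin 2/||w||
H_o:\; \min_{w,\,b}\ \tfrac{1}{2}\,\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \bigl( w \cdot \Phi(X_i) + b \bigr) \ge 1 \ \ \text{for all } i
```

This is the hard-margin case for separable data; practical systems usually add slack variables and a kernel in place of an explicit $\Phi$.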

21 Results

22 Voice technology in Majordome. Server-side background tasks: continuous speech recognition applied to voice messages upon reception; detection of the sender's name and subject. User interaction: speaker identification and verification; speech recognition (receiving user commands through voice interaction); text-to-speech synthesis (reading text summaries, e-mails or faxes).

23 Perspectives within COST-277: text-book on Speech Processing; evaluation of parametric representations of speech for diverse applications; fundamental work on voice transformations with applications in coding, synthesis, recognition and speaker characterisation; fundamental work on noise robustness with applications in coding, recognition and speaker verification.

