Published by Anne Marteau. Modified more than 11 years ago.
1
Some activities on Non-linear Speech Processing at ENST/CNRS-LTCI
Gérard CHOLLET ENST/CNRS-LTCI 46 rue Barrault PARIS cedex 13
2
Outline
What is ENST/CNRS-LTCI?
Research and application topics related to COST-277:
  Speech production and perception
  Speech analysis and synthesis
  Speech coding: the SYMPATEX project
  Automatic speech recognition: the SIROCCO project
  Speaker characterisation and verification
Perspectives within COST-277
3
Our affiliations
ENST: Ecole Nationale Supérieure des Télécommunications
CNRS: Centre National de la Recherche Scientifique
LTCI: Laboratoire de Traitement et Communication de l’Information
4
What is ENST? Ecole Nationale Supérieure des Télécommunications
Ranked among the ‘Grandes Ecoles d'Ingénieurs’.
250 state-certified engineers graduate each year.
Part of the ‘Groupement des Ecoles de Télécommunications’.
5
GET: Groupement des Ecoles de Télécommunications
ENST-Paris
ENST-Bretagne in Brest
Institut National des Télécommunications in Evry
EURECOM in Sophia-Antipolis
ENIC (Ecole Nouvelle d’Ingénieurs en Télécoms) in Lille
Internet school in Marseille
6
Speech Production and Perception
Parametric Vocal Tract model (Shinji Maeda)
Non-linear Production model using Distinctive Regions and Modes (René Carré)
Quantal nature of speech (R. Carré and S. Maeda)
Perceptual filter (Nicolas Moreau)
Auditory prosthesis (Alain Goyé and Jacques Prado)
7
Speech analysis and synthesis
Time-Frequency representations, Wavelets
Time-dependent spectral models (Yves Grenier)
HNM (Harmonics + Noise Model) (Olivier Cappé, Eric Moulines, Maurice Charbit)
Glottal Excited LPC
8
Time-dependent Spectral Models
Temporal Decomposition (B. Atal, 1983)
Vectorial Autoregressive models with detection of model ruptures (A. DeLima, Y. Grenier)
Segmental parameterisation using a time-dependent polynomial expansion (Y. Grenier)
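As an illustration of the last item, the idea of a time-dependent polynomial expansion can be sketched as an autoregressive model whose coefficients are polynomials in time, fitted by least squares over a segment. This is only a minimal sketch under assumptions that are not in the slides: the function name `fit_tvar`, the plain power basis, the sign convention x[t] = sum_i a_i(t) x[t-i] + e[t], and the synthetic test signal are all hypothetical choices made for the example.

```python
import numpy as np

def fit_tvar(x, order, degree):
    """Least-squares fit of a time-varying AR model (illustrative sketch).

    Assumed model: x[t] = sum_i a_i(t) * x[t-i] + e[t],
    with a_i(t) = sum_j c[i, j] * u(t)**j and u(t) = t / len(x).
    """
    n = len(x)
    u = np.arange(n) / n
    basis = np.vander(u, degree + 1, increasing=True)  # shape (n, degree + 1)
    rows = []
    for t in range(order, n):
        past = x[t - order:t][::-1]  # x[t-1], ..., x[t-order]
        # Each regressor is a past sample weighted by one basis function.
        rows.append(np.outer(past, basis[t]).ravel())
    coef, *_ = np.linalg.lstsq(np.asarray(rows), x[order:], rcond=None)
    c = coef.reshape(order, degree + 1)
    return c, basis @ c.T  # expansion coefficients, trajectories a_i(t)

# Synthetic first-order example whose coefficient drifts linearly in time.
rng = np.random.default_rng(0)
n = 4000
true_a = 0.5 + 0.3 * np.arange(n) / n
x = np.zeros(n)
for t in range(1, n):
    x[t] = true_a[t] * x[t - 1] + rng.standard_normal()

c, a_hat = fit_tvar(x, order=1, degree=1)
mean_abs_error = float(np.mean(np.abs(a_hat[:, 0] - true_a)))
```

The whole segment is described by `order * (degree + 1)` numbers instead of one coefficient set per frame, which is the appeal of a segmental parameterisation.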
9
Temporal Decomposition
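For reference, temporal decomposition is usually written as an approximation of the sequence of spectral parameter vectors by a small number of target vectors weighted by overlapping, time-localised interpolation functions. The symbols below (y, a_k, phi_k, K, N) are notation chosen for this note, not taken from the slide:

```latex
\hat{\mathbf{y}}(n) = \sum_{k=1}^{K} \mathbf{a}_k \, \phi_k(n), \qquad 1 \le n \le N
```

Here y(n) is the parameter vector of frame n, a_k is the k-th target vector and phi_k(n) is its interpolation function, which is non-zero only over a limited span of frames.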
10
HNM: Harmonics + Noise Model
[Block diagram, labels translated from French] Input signal; pitch and energy detection; voicing decision (voiced / unvoiced); harmonic estimation; harmonic envelope estimation; AR estimation of the residual. Output: H+N parameters (f, A, AR).
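The harmonic estimation step of such a model can be sketched as a least-squares fit of sinusoids at multiples of the pitch, with whatever remains treated as the noise part. This is a minimal sketch, assuming the pitch `f0` of the frame is already known; the function name `harmonic_plus_noise` and the synthetic frame are hypothetical, and the AR modelling of the residual shown on the slide is not included.

```python
import numpy as np

def harmonic_plus_noise(frame, f0, fs, n_harm):
    """Split one voiced frame into a harmonic part and a residual (sketch)."""
    t = np.arange(len(frame)) / fs
    k = np.arange(1, n_harm + 1)
    phase = 2 * np.pi * f0 * np.outer(t, k)         # shape (len(frame), n_harm)
    B = np.hstack([np.cos(phase), np.sin(phase)])   # cosine and sine regressors
    w, *_ = np.linalg.lstsq(B, frame, rcond=None)
    harmonic = B @ w
    amps = np.hypot(w[:n_harm], w[n_harm:])         # amplitude of each harmonic
    return amps, harmonic, frame - harmonic

# Synthetic voiced frame: two harmonics of 100 Hz plus a little noise.
fs, f0 = 8000, 100.0
t = np.arange(400) / fs
rng = np.random.default_rng(0)
frame = (np.cos(2 * np.pi * f0 * t)
         + 0.5 * np.sin(2 * np.pi * 2 * f0 * t)
         + 0.01 * rng.standard_normal(len(t)))

amps, harmonic, noise = harmonic_plus_noise(frame, f0, fs, n_harm=3)
```

In a fuller analysis the residual `noise` would then be described by an AR model and its energy, and unvoiced frames would skip the harmonic fit altogether.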
11
ALISP: Automatic Language Independent Speech Processing
Automatic discovery of segmental units for speech coding, synthesis, recognition, language identification and speaker verification.
12
Speech Coding by indexing
SYMPATEX: SYstème de Messagerie unifiée avec présentation vocale des messages (PArole et TEXte), i.e. a unified messaging system with voice presentation of messages (speech and text)
Thomson-CSF, ELAN TTS, Irius GET, ESIEE
13
Coding principle
[Block diagram, labels translated from French] Speech; spectral analysis; prosodic analysis; HMM recognition; dictionary of HMM models of the ALISP units; determination of the synthesis units (for HMM A: representatives A1 … A8); choice of synthesis unit by DTW; prosody coding. Output: ALISP unit index, synthesis unit index, pitch, energy, timing.
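The choice of a synthesis unit by DTW can be sketched as follows: the incoming segment is compared with each stored representative of the recognised ALISP unit, and the index of the closest one is kept. This is a minimal sketch with a textbook DTW recursion and Euclidean frame distance; the function names, the toy two-dimensional feature vectors and the use of three representatives instead of the eight on the slide are assumptions made for the example.

```python
import numpy as np

def dtw_distance(a, b):
    """Cumulative DTW cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])  # local frame distance
            cost[i, j] = d + min(cost[i - 1, j],      # stretch b
                                 cost[i, j - 1],      # stretch a
                                 cost[i - 1, j - 1])  # advance both
    return float(cost[n, m])

def choose_synthesis_unit(segment, representatives):
    """Return the index of the representative closest to the segment."""
    distances = [dtw_distance(segment, r) for r in representatives]
    return int(np.argmin(distances)), distances

# Toy representatives of one unit (hypothetical 2-D feature vectors).
representatives = [
    np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]),
    np.array([[5.0, 5.0], [4.0, 4.0], [3.0, 3.0]]),
    np.array([[0.0, 1.0], [0.0, 2.0], [0.0, 3.0]]),
]
# A time-stretched version of the first representative.
segment = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [2.0, 2.0]])

index, distances = choose_synthesis_unit(segment, representatives)
```

Only `index` needs to be transmitted alongside the ALISP unit index and the prosody parameters, since the decoder holds the same set of representatives.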
14
Decoding
[Block diagram, labels translated from French] Inputs: ALISP index, synthesis representative number, prosody parameters. Choice of synthesis unit among representatives A1 … A8; concatenative synthesis. Output: synthetic speech.
15
Automatic Speech Recognition
Recognition of proper names and spellings
Keyword spotting, noise robustness, adaptation
Large Vocabulary Speech Recognition (SIROCCO)
Markov Random Fields, Bayesian Networks and Graphical Models
16
Markov Random Fields Bayesian Networks and Graphical Models
Speech modelling with state-constrained Markov Random Fields over frequency bands (Guillaume Gravier and Marc Sigelle)
Comparative framework to study MRF, Bayesian Networks and Graphical Models
17
Speaker Verification: Typology of approaches (EAGLES Handbook)
Text dependent
Public password
Private password
Customized password
Text prompted
Text independent
Incremental enrolment
Evaluation
18
Speaker Verification (text independent)
The ELISA consortium: ENST, LIA, IRISA, ...
NIST evaluations
19
Support Vector Machines and Speaker Verification
A hybrid GMM-SVM system is proposed: an SVM scoring model is trained on development data to separate true-target speaker accesses from impostor accesses, using a new feature representation based on GMMs.
[Diagram labels] Modeling: GMM. Scoring: SVM.
20
SVM principles
[Figure labels] Input space: X. Feature space: y(X). A separating hyperplane H and the optimal hyperplane Ho; Class(X).
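The figure's labels correspond to the standard SVM formulation, which can be written out as follows. This reading assumes that y(X) denotes the image of X in the feature space; the weight vector w, the bias b and the training labels c_i in {-1, +1} are notation introduced here, not taken from the slide.

```latex
\begin{aligned}
H &: \; \mathbf{w} \cdot y(X) + b = 0, \qquad
\mathrm{Class}(X) = \operatorname{sign}\bigl(\mathbf{w} \cdot y(X) + b\bigr) \\
H_o &: \; \min_{\mathbf{w},\, b} \; \tfrac{1}{2}\,\lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad c_i \bigl(\mathbf{w} \cdot y(X_i) + b\bigr) \ge 1 \;\; \text{for all } i
\end{aligned}
```

Among all hyperplanes H that separate the two classes, the optimal one Ho is the one with the largest margin, which equals 2 / ||w|| under these constraints.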
21
Results
22
Voice technology in Majordome
Server-side background tasks:
  Continuous speech recognition applied to voice messages upon reception
  Detection of sender’s name and subject
User interaction:
  Speaker identification and verification
  Speech recognition (receiving user commands through voice interaction)
  Text-to-speech synthesis (reading text summaries, e-mails or faxes)
23
Perspectives within COST-277
Textbook on Speech Processing
Evaluation of parametric representations of speech for diverse applications
Fundamental work on voice transformations with applications in coding, synthesis, recognition and speaker characterisation
Fundamental work on noise robustness with applications in coding, recognition and speaker verification