Fusion. Gérard CHOLLET, chollet@tsi.enst.fr, GET-ENST/CNRS-LTCI, 46 rue Barrault, 75634 PARIS cedex 13, http://www.tsi.enst.fr/~chollet.


1 Fusion
Gérard CHOLLET, chollet@tsi.enst.fr, GET-ENST/CNRS-LTCI, 46 rue Barrault, PARIS cedex 13

2 Outline
Motivations, applications. Pattern recognition. Multi-sensor. Signal enhancement. Parameters. Scores. Decisions. Conclusions. Perspectives.

3 Introduction
Pattern recognition. Why fuse? What to fuse? Signals from various sensors, parameters measured on those signals, scores computed by classifiers, decisions made by classifiers. How to fuse?

4 Pattern recognition

5 Signal fusion
Identical? Number of sensors. Types of sensors. Number of sources. Examples: microphone arrays, stereo vision, seismograph.

6 Parameter fusion
From a single sensor. From several sensors. Multi-stream models. Examples: speech recognition, Bayesian networks.
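
The slide does not say how parameters from several sensors are combined. One simple option, shown here as a sketch only, is to concatenate time-aligned feature vectors into a single vector per frame. The function name and the audio/video example are hypothetical, and the frames are assumed to be already synchronised.

```python
def concatenate_features(stream_a, stream_b):
    """Join two time-aligned feature streams frame by frame.

    Assumption: both streams have one vector per frame and the same frame rate.
    """
    if len(stream_a) != len(stream_b):
        raise ValueError("streams must contain the same number of frames")
    return [tuple(a) + tuple(b) for a, b in zip(stream_a, stream_b)]


# Hypothetical example: 2 audio parameters and 1 visual parameter per frame.
audio = [(0.1, 0.2), (0.3, 0.4)]
video = [(5.0,), (6.0,)]
joint = concatenate_features(audio, video)
```

Multi-stream models, also listed on the slide, instead keep the streams separate and combine them inside the model.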

7 Score fusion
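
The transcript keeps only the title of this slide. As a minimal sketch of one common scheme (not necessarily the one presented), each classifier score can be mapped to a common range and the results combined by a weighted sum. The bounds, weights and threshold below are hypothetical values that would normally be estimated on development data.

```python
def min_max_normalize(score, lo, hi):
    """Map a raw score to [0, 1] using bounds estimated on development data."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clipped = min(max(score, lo), hi)
    return (clipped - lo) / (hi - lo)


def fuse_scores(scores, bounds, weights):
    """Weighted average of min-max normalized scores, one score per classifier."""
    if not (len(scores) == len(bounds) == len(weights)):
        raise ValueError("scores, bounds and weights must have the same length")
    normalized = [min_max_normalize(s, lo, hi) for s, (lo, hi) in zip(scores, bounds)]
    return sum(w * n for w, n in zip(weights, normalized)) / sum(weights)


# Hypothetical example: a speech score and a face score on different scales.
fused = fuse_scores(scores=[2.0, 0.75],
                    bounds=[(-4.0, 4.0), (0.0, 1.0)],
                    weights=[1.0, 1.0])
accept = fused >= 0.5  # the 0.5 threshold is an assumption
```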

8 Decision fusion
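
Only the title of this slide survives in the transcript. A simple illustration of fusing accept/reject decisions, given here as an assumption and not as the presenter's method, is a majority vote over the individual classifiers.

```python
def majority_vote(decisions):
    """Accept only if strictly more than half of the classifiers accept."""
    if not decisions:
        raise ValueError("at least one decision is required")
    accepts = sum(1 for d in decisions if d)
    return accepts * 2 > len(decisions)


# Hypothetical example: three classifiers, two of which accept the claim.
claim_accepted = majority_vote([True, True, False])
```

Stricter or looser rules follow the same pattern: `all(decisions)` accepts only on unanimity, and `any(decisions)` accepts as soon as one classifier does.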

9 Vector Quantization (VQ)
SOONG, ROSENBERG 1987. Diagram labels: codebook of speaker 1, codebook of speaker 2, ..., codebook of speaker n; “Bonjour” from test speaker Y; codebook of speaker X; best quantization.
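
The idea in the diagram can be sketched as follows: the test utterance is quantized with each speaker's codebook, and the speaker whose codebook gives the lowest average distortion is selected. This is a minimal sketch assuming squared Euclidean distance and tiny hand-made codebooks; all names and values are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def average_distortion(frames, codebook):
    """Mean distance from each frame to its nearest codeword."""
    return sum(min(squared_distance(f, c) for c in codebook) for f in frames) / len(frames)


def identify_speaker(frames, codebooks):
    """Return the speaker whose codebook quantizes the frames best."""
    return min(codebooks, key=lambda name: average_distortion(frames, codebooks[name]))


# Hypothetical two-dimensional codebooks with two codewords each.
codebooks = {
    "speaker_1": [(0.0, 0.0), (1.0, 1.0)],
    "speaker_2": [(5.0, 5.0), (6.0, 6.0)],
}
test_frames = [(0.9, 1.1), (0.1, -0.1)]
best = identify_speaker(test_frames, codebooks)
```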

10 Hidden Markov Models (HMM)
ROSENBERG 1990, TSENG 1992. Diagram labels: “Bonjour” from test speaker Y; “Bonjour” from speaker X; “Bonjour” from speaker 1, speaker 2, ..., speaker n; best path.
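
The "best path" label suggests scoring the test utterance against each speaker's HMM along its most likely state sequence. A standard way to find that sequence is the Viterbi algorithm. The sketch below works in the log domain on precomputed emission log-probabilities; the toy two-state left-to-right model and its numbers are hypothetical.

```python
import math


def safe_log(p):
    """log(p), with log(0) represented as negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(log_init, log_trans, log_emit):
    """Return (best log score, best state path).

    log_init[s]     : log P(first state = s)
    log_trans[r][s] : log P(next state = s | current state = r)
    log_emit[t][s]  : log P(observation at time t | state = s)
    """
    n_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    backpointers = []
    for t in range(1, len(log_emit)):
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states), key=lambda r: score[r] + log_trans[r][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s] + log_emit[t][s])
        score = new_score
        backpointers.append(pointers)
    last = max(range(n_states), key=lambda s: score[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return score[last], path


# Hypothetical model: state 0 may stay or move to state 1; state 1 is absorbing.
log_init = [safe_log(p) for p in (1.0, 0.0)]
log_trans = [[safe_log(p) for p in row] for row in ((0.5, 0.5), (0.0, 1.0))]
log_emit = [[safe_log(p) for p in row] for row in ((0.9, 0.1), (0.8, 0.2), (0.1, 0.9))]
best_score, best_path = viterbi(log_init, log_trans, log_emit)
```

Under this reading, the speaker whose model yields the highest best-path score would be selected.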

11 Ergodic HMM
PORITZ 1982, SAVIC 1990. Diagram labels: HMM of speaker 1, ..., HMM of speaker n; “Bonjour” from test speaker Y; HMM of speaker X; best path.

12 Gaussian Mixture Models (GMM)
REYNOLDS 1995

13 HMM structure depends on the application

14 Gaussian Mixture Model
Parametric representation of the probability distribution of observations:
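
The formula that followed this sentence is missing from the transcript. For reference, the standard Gaussian mixture density is given below. The symbols (M components, weights w_i, means \mu_i, covariances \Sigma_i, dimension D, model \lambda) are the usual notation and are an assumption about what the slide showed.

```latex
p(x \mid \lambda) = \sum_{i=1}^{M} w_i \, \mathcal{N}(x;\, \mu_i, \Sigma_i),
\qquad \sum_{i=1}^{M} w_i = 1, \quad w_i \ge 0,

\mathcal{N}(x;\, \mu_i, \Sigma_i) =
\frac{1}{(2\pi)^{D/2} \, |\Sigma_i|^{1/2}}
\exp\!\left( -\tfrac{1}{2} (x - \mu_i)^{\top} \Sigma_i^{-1} (x - \mu_i) \right).
```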

15 Gaussian Mixture Models
8 Gaussians per mixture
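
To make the model concrete, here is a minimal sketch of how the log-likelihood of a feature vector under a GMM can be computed. It assumes diagonal covariances, which is a common simplification and not stated on the slide, and it uses two components where the slide mentions eight. All parameter values are hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))


def gmm_log_likelihood(x, weights, means, variances):
    """log p(x) under a GMM, computed with the log-sum-exp trick for stability."""
    logs = [math.log(w) + log_gaussian_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)
    return top + math.log(sum(math.exp(value - top) for value in logs))


def utterance_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood of an utterance."""
    return sum(gmm_log_likelihood(f, weights, means, variances) for f in frames) / len(frames)


# Hypothetical 2-component model over 2-dimensional features.
weights = [0.5, 0.5]
means = [(0.0, 0.0), (3.0, 3.0)]
variances = [(1.0, 1.0), (1.0, 1.0)]
near = gmm_log_likelihood((0.1, -0.1), weights, means, variances)
far = gmm_log_likelihood((10.0, 10.0), weights, means, variances)
```

A vector close to one of the component means (`near`) scores higher than one far from both (`far`).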

16 Support Vector Machines and Speaker Verification
A hybrid GMM-SVM system is proposed. An SVM scoring model is trained on development data to classify true-target speaker accesses and impostor accesses, using a new feature representation based on GMMs. Modeling: GMM. Scoring: SVM.

17 SVM principles
Diagram labels: X in the input space is mapped to y(X) in the feature space. Separating hyperplanes H, with the optimal hyperplane Ho, give Class(X).
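
For reference, the usual formulation behind this picture is the following. The mapping y(X) comes from the slide; the weight vector w, the bias b, and the training labels c_k in {-1, +1} are standard notation introduced here as assumptions. A hyperplane H separates the classes, and the optimal hyperplane Ho is the one with the largest margin.

```latex
\text{Class}(X) = \operatorname{sign}\!\big( w^{\top} y(X) + b \big),

H_o:\quad \min_{w,\, b} \ \tfrac{1}{2} \lVert w \rVert^{2}
\quad \text{subject to} \quad
c_k \big( w^{\top} y(X_k) + b \big) \ge 1 \quad \text{for all training pairs } (X_k, c_k).
```

Minimizing the norm of w under these constraints maximizes the margin 2 / ||w|| between the two classes.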

18 Results

19 Combining Speech Recognition and Speaker Verification.
Speaker-independent phone HMMs. Selection of segments or segment classes that are speaker-specific. Preliminary evaluations are performed on the NIST extended data set (one hour of training data per speaker). Some developments were done during a six-week workshop (SuperSID) in summer 2002.

20 SuperSID experiments

21 GMM with cepstral features

22 Selection of nasals in words in -ing
being everything getting anything thing something things going

23 Fusion

24 Fusion results

25 Audio-Visual Identity Verification
A person speaking in front of a camera offers two modalities for identity verification (speech and face). The sequence of face images and the synchronisation of speech and lip movements could be exploited. Imposture is much more difficult than with single modalities. Many PCs, PDAs and mobile phones are equipped with a camera. Audio-Visual Identity Verification will offer non-intrusive security for e-commerce, e-banking, …

26 Examples of Speaking Faces
Sequence of digits (PIN code) Free text

27 Fusion of Speech and Face
(from thesis of Conrad Sanderson, aug. 2002)

28 An illustration
Insecure network. Distant server: access to private data, secured transactions. Acquisition of biometric signals for each modality. Scores are computed for each modality. Fusion of scores and decision.

29 Conclusions and Perspectives
Speech is often the only usable biometric modality (over the telephone network). Interactive Voice Servers may use both text-dependent and text-independent approaches for improved verification accuracy. Evaluation campaigns and research workshops are efficient means to stimulate progress. Most PCs, PDAs and mobile phones will be equipped with cameras. Audio-Visual Identity Verification should find applications in e-Banking, e-Commerce, …

