Presentation on: "Fusion", Gérard CHOLLET (firstname.lastname@example.org), GET-ENST/CNRS-LTCI, 46 rue Barrault, 75634 PARIS cedex 13, http://www.tsi.enst.fr/~chollet. Transcript of the presentation:
1 Fusion. Gérard CHOLLET, GET-ENST/CNRS-LTCI, 46 rue Barrault, PARIS cedex 13
2 Outline. Motivations, Applications; Pattern recognition; Multi-sensor; Signal enhancement; Parameters; Scores; Decisions; Conclusions; Perspectives
3 Introduction. Pattern recognition. Why fuse? What to fuse? Signals from various sensors; parameters measured on these signals; scores computed by classifiers; decisions made by classifiers. How to fuse?
14 Gaussian Mixture Model. Parametric representation of the probability distribution of observations.
15 Gaussian Mixture Models: 8 Gaussians per mixture
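The formula from the GMM slides is not preserved in this transcript. As a minimal sketch, assuming the standard form of a Gaussian mixture, p(x) = sum over i of w_i N(x; mu_i, Sigma_i), the log-likelihood of one observation could be computed as below. The function name `gmm_log_likelihood` and the use of diagonal covariances are assumptions made for illustration, not details taken from the slides; the number of components is simply the length of `weights` (for example 8, as on slide 15).

```python
import math

def gmm_log_likelihood(x, weights, means, variances):
    """Log-density of one observation x under a diagonal-covariance GMM.

    x         : list of floats (one feature vector)
    weights   : mixture weights, assumed positive and summing to 1
    means     : one mean vector per Gaussian
    variances : one vector of per-dimension variances per Gaussian
                (diagonal covariance is an assumption of this sketch)
    """
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        # Log of a diagonal Gaussian density, summed over dimensions.
        log_gauss = 0.0
        for xi, mi, vi in zip(x, mu, var):
            log_gauss += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_terms.append(math.log(w) + log_gauss)
    # Log-sum-exp over the components, for numerical stability.
    m = max(log_terms)
    return m + math.log(sum(math.exp(t - m) for t in log_terms))
```

In speaker verification, a score of this kind is typically averaged over all frames of an utterance; that step is left out here to keep the sketch short.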
16 Support Vector Machines and Speaker Verification. A hybrid GMM-SVM system is proposed. The SVM scoring model is trained on development data to classify true-target speaker accesses and impostor accesses, using a new feature representation based on GMMs. Modeling: GMM. Scoring: SVM.
17 SVM principles. Separating hyperplanes H, with the optimal hyperplane Ho. [Figure labels: X, y(X), input space, feature space, H, Ho, Class(X)]
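The figure for this slide is lost, so the following is only a sketch of the idea its labels suggest: an input X is mapped to y(X) in a feature space, and Class(X) is given by the side of a separating hyperplane on which y(X) falls. The feature map, the weights `w` and the bias `b` are hypothetical and hand-picked; they define one separating hyperplane H, not the optimal (maximum-margin) hyperplane Ho that SVM training would find.

```python
def feature_map(x):
    """Hypothetical y(X): map a 1-D input to the 2-D feature space (x, x^2)."""
    return [x, x * x]

def svm_class(x, w, b):
    """Return +1 or -1 according to the side of the hyperplane w.y(X) + b = 0."""
    y = feature_map(x)
    score = sum(wi * yi for wi, yi in zip(w, y)) + b
    return 1 if score >= 0 else -1

# Inputs with |x| > 1 versus |x| < 1 cannot be split by a single threshold
# in the input space, but the hyperplane x^2 = 1 separates them in feature space.
w, b = [0.0, 1.0], -1.0
labels = [svm_class(x, w, b) for x in (-2.0, -0.5, 0.5, 2.0)]
```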
19 Combining Speech Recognition and Speaker Verification. Speaker-independent phone HMMs. Selection of segments or segment classes which are speaker specific. Preliminary evaluations are performed on the NIST extended data set (one hour of training data per speaker). Some developments were done during a six-week workshop (SuperSID) in summer 2002.
25 Audio-Visual Identity Verification. A person speaking in front of a camera offers two modalities for identity verification (speech and face). The sequence of face images and the synchronisation of speech and lip movements could be exploited. Imposture is much more difficult than with single modalities. Many PCs, PDAs and mobile phones are equipped with a camera. Audio-Visual Identity Verification will offer non-intrusive security for e-commerce, e-banking, …
26 Examples of Speaking Faces. Sequence of digits (PIN code). Free text.
27 Fusion of Speech and Face (from the thesis of Conrad Sanderson, Aug. 2002)
28 An illustration. Insecure network. Distant server: access to private data, secured transactions. Acquisition of biometric signals for each modality. Scores are computed for each modality. Fusion of scores and decision.
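The last two steps of the illustration (per-modality scores, then fusion and decision) can be sketched as follows. The slides do not say which fusion rule is used; a weighted sum of the speech and face scores followed by a threshold is one common choice and is assumed here. The function names, the weight and the threshold are hypothetical, and the scores are assumed to be already normalised to a comparable range.

```python
def fuse_scores(speech_score, face_score, speech_weight=0.5):
    """Weighted-sum fusion of two modality scores (an assumed rule)."""
    return speech_weight * speech_score + (1.0 - speech_weight) * face_score

def decide(fused_score, threshold=0.0):
    """Accept the identity claim if the fused score reaches the threshold."""
    return fused_score >= threshold

# Hypothetical scores: positive values favour the claimed identity.
accepted = decide(fuse_scores(1.0, -0.5))
rejected = decide(fuse_scores(-1.0, 0.2, speech_weight=0.7))
```

In practice the weight and threshold would be tuned on development data rather than fixed by hand.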
29 Conclusions and Perspectives. Speech is often the only usable biometric modality (over the telephone network). Interactive Voice Servers may use both text-dependent and text-independent approaches for improved verification accuracy. Evaluation campaigns and research workshops are efficient means to stimulate progress. Most PCs, PDAs and mobile phones will be equipped with cameras. Audio-Visual Identity Verification should find applications in e-banking, e-commerce, …