Thematic Alignment of Static Documents with Meeting Dialogs
Dalila Mekhaldi, Diva Group, Department of Computer Science, University of Fribourg


Outline
Introduction
Thematic Alignment
One-best Alignment
Multiple Alignments
  o Meeting Thematic Segmentation
Alignments Grouping
Conclusion & Perspectives

Introduction
In document-centric meetings (lectures, teleconferences, press reviews, etc.), static documents are present. They should be integrated into a common multimedia archive, which requires building links between the documents and the other media.

Document Alignments
There are several ways to link static documents with other meeting data:
Document/image alignment.
Document/speech alignment.

Document/Speech Alignment
Document/speech alignment links static data (documents) to temporal data (audio) and enriches the documents with temporal indexes and thematic links. It helps in:
Building document-based browsing interfaces.
Improving document search and retrieval.

Document/Speech Alignment
Three alignment categories:
Thematic: lexical similarity between document and speech parts.
Quotation of a document part.
Reference to a document part.
The text is decomposed into segments at two levels:
Document: logical structure (logical blocks) and syntactic structure (sentences).
Speech transcript: turns and utterances.

Thematic Alignment: similarity-based matching
Speech transcript: "… So, uh.. Tuesday 15 July, yesterday, uh.. the parliamentary commission of inquiry in France delivered a report, uh.. on the management of companies.. public companies. Uh.. Very critical of the management of France Telecom and EDF, their acquisition policies were carried out without the human resources,... …"
Document: "Made public on Tuesday 15 July, the report of the parliamentary commission of inquiry on the management of public companies, chaired by Philippe Douste-Blazy, secretary general of the UMP. Very critical of the management of France Telecom and EDF - their acquisition policies were carried out without the human, technical and financial resources being adapted accordingly.. …"

Thematic Alignment: similarity-based matching
Similarities are computed between the speech transcript segments (S1, S2, S3, …) and the document segments (S'1, S'2, …). Each segment is represented as a vector of weighted terms: S1 -> V1 = {t1, t2, ..}; S'1 -> V'1 = {t'1, t'2, ..}.
a. Stop-word removal and stemming.
b. Similarity metrics between units:
$\mathrm{Jaccard}(V_1, V'_1) = \dfrac{|V_1 \cap V'_1|}{|V_1 \cup V'_1|}$
$\mathrm{Dice}(V_1, V'_1) = \dfrac{2\,|V_1 \cap V'_1|}{|V_1| + |V'_1|}$
$\mathrm{Cosine}(V_1, V'_1) = \dfrac{|V_1 \cdot V'_1|}{|V_1|\,|V'_1|}$
Two strategies: one-best and multiple alignments.
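The three metrics and the two strategies can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the tiny stop-word list, the whitespace tokenizer (no stemming is applied), the raw term counts used as weights, and the function names. In particular, selecting multiple alignments with a similarity threshold is only one plausible reading of the slide, not the authors' stated procedure.

```python
import math
from collections import Counter

# Hypothetical stop-word list; a real system would use a fuller list and a stemmer.
STOP_WORDS = {"the", "of", "on", "and", "a", "uh"}

def term_vector(text):
    """Lower-case, split on whitespace, drop stop words, count the terms."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    return Counter(t for t in tokens if t and t not in STOP_WORDS)

def jaccard(v1, v2):
    a, b = set(v1), set(v2)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def dice(v1, v2):
    a, b = set(v1), set(v2)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0

def cosine(v1, v2):
    dot = sum(w * v2[t] for t, w in v1.items())  # Counter returns 0 for missing terms
    norm = (math.sqrt(sum(w * w for w in v1.values()))
            * math.sqrt(sum(w * w for w in v2.values())))
    return dot / norm if norm else 0.0

def one_best(sources, targets, metric):
    """For each source unit, keep only its most similar target unit."""
    links = []
    for i, src in enumerate(sources):
        scores = [metric(src, tgt) for tgt in targets]
        if scores and max(scores) > 0:
            links.append((i, scores.index(max(scores))))
    return links

def multiple(sources, targets, metric, threshold):
    """Keep every pair whose similarity reaches the (assumed) threshold."""
    return [(i, j)
            for i, src in enumerate(sources)
            for j, tgt in enumerate(targets)
            if metric(src, tgt) >= threshold]

utterances = [term_vector("the report on the management of public companies"),
              term_vector("very critical of France Telecom and EDF")]
sentences = [term_vector("report of the commission on the management of public companies"),
             term_vector("critical of the management of France Telecom and EDF")]

best_links = one_best(utterances, sentences, jaccard)
all_links = multiple(utterances, sentences, jaccard, threshold=0.1)
```

With these toy inputs, the first utterance shares one term with the second sentence, so the thresholded strategy keeps one more pair than the one-best strategy.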

One-best Alignment Evaluation
Manual ground truth for 8 meetings.
N1: alignments to be found (manual).
N2: alignments found (automatic).
N3: correct alignments found (automatic).
Precision: N3 / N2
Recall: N3 / N1
[Tables: precision and recall of the Cosine, Dice and Jaccard metrics for the sent/utt, utt/sent and turn/logic alignments; the values were not preserved.]
Improve the similarity metrics with a semantic dictionary.
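The two ratios can be computed as in the sketch below. The representation of an alignment as a (document index, speech index) pair and the sample pairs are assumptions made for illustration; they are not data from the evaluation.

```python
def precision_recall(manual, automatic):
    """Return (precision, recall) from manual and automatic alignment pairs."""
    manual, automatic = set(manual), set(automatic)
    n1 = len(manual)              # alignments to be found (manual)
    n2 = len(automatic)           # alignments found (automatic)
    n3 = len(manual & automatic)  # correct alignments found
    precision = n3 / n2 if n2 else 0.0
    recall = n3 / n1 if n1 else 0.0
    return precision, recall

# Hypothetical alignment pairs: (document unit index, speech unit index).
manual_pairs = {(0, 0), (1, 2), (2, 2), (3, 4)}
automatic_pairs = {(0, 0), (1, 1), (2, 2)}
precision, recall = precision_recall(manual_pairs, automatic_pairs)
```

Here two of the three automatic pairs are correct, and two of the four manual pairs are recovered.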

Multiple Alignments
[Figure: pipeline linking doc/speech thematic alignment, multiple alignments (A1 to A5), meeting thematic segmentation and evaluation.]
Speech transcript: "… So, uh.. Tuesday 15 July, yesterday, uh.. the parliamentary commission of inquiry in France delivered a report, uh.. on the management of companies.. public companies. Uh.. basically, it says that the French model of public companies no longer meets the new re.. international and European requirements. …"

Meeting Thematic Segmentation
The thematic alignment (e.g. sentences/utterances) is represented as a graph: the nodes are document sentences and speech utterances, the arcs express alignability, and the arc weights carry the similarity value. The most connected sub-graphs correspond to thematic regions.
[Figure: alignment graph between document sentences and speech utterances, with node size in the legend; example labels (84, 9), (91, 8) and weight 0.25.]

Meeting Thematic Segmentation
a. Bi-graph representation of the multiple alignment pairs (document sentences and speech utterances).
b. Extraction of the densest regions (using clustering): the meeting themes A1 to A5.
c. Segment extraction (cluster projection): segments S1 to S5.
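Steps a to c can be sketched as follows. This is a simplified illustration, not the method used in the presentation: connected components stand in for the clustering step, the alignment pairs are invented, and the function names are hypothetical.

```python
from collections import defaultdict

def thematic_regions(pairs):
    """Group (sentence, utterance) alignment pairs into connected sub-graphs.

    Connected components are used here as a stand-in for the clustering
    step that extracts the densest regions.
    """
    graph = defaultdict(set)
    for s, u in pairs:
        graph[("doc", s)].add(("speech", u))
        graph[("speech", u)].add(("doc", s))
    seen, regions = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        regions.append(component)
    return regions

def project(regions, side):
    """Project each region onto one axis as a (first, last) index span."""
    spans = []
    for region in regions:
        indexes = [i for s, i in region if s == side]
        spans.append((min(indexes), max(indexes)))
    return sorted(spans)

# Hypothetical multiple-alignment pairs: (sentence index, utterance index).
pairs = [(0, 0), (0, 1), (1, 1), (4, 5), (5, 5), (5, 6)]
regions = thematic_regions(pairs)
speech_segments = project(regions, "speech")
document_segments = project(regions, "doc")
```

Each region yields one segment on the speech axis and one on the document axis, which is the sense in which the same clusters segment both modalities.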

Thematic Segmentation Evaluation
Manual ground truth for 22 meetings.
1. Speech: 2 main sets.
Stereotyped: 2.7 utterances/turn (ratio > 2).
Non-stereotyped: 1.3 utterances/turn (ratio <= 2).
2. Documents: 2 main sets.
Mono-document.
Multi-document.
Comparison with 2 mono-modal methods: TextTiling and a baseline.
Speech baseline: turn-based segmentation.
Document baseline: reflexive alignment/clustering.
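The split of the speech set follows from the utterances-per-turn ratio, as in this small sketch. The representation of a meeting as a list of turns, each a list of utterances, and the sample meetings are assumptions made for illustration.

```python
def utterances_per_turn(turns):
    """Average number of utterances per speaker turn."""
    return sum(len(turn) for turn in turns) / len(turns)

def is_stereotyped(turns, threshold=2.0):
    """A meeting counts as stereotyped when the ratio exceeds the threshold."""
    return utterances_per_turn(turns) > threshold

# Hypothetical meetings: each turn is a list of utterances.
long_turns = [["u1", "u2", "u3"], ["u4", "u5"], ["u6", "u7", "u8"]]
short_turns = [["u1"], ["u2", "u3"], ["u4"]]
```

The first sample averages 8/3 utterances per turn and the second 4/3, so they fall on opposite sides of the ratio of 2.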

Thematic Segmentation Evaluation
Pk (Beeferman) metric: 0 for a perfect segmentation.
[Figure: Pk per meeting for the bi-modal, TextTiling and baseline methods, in two panels (a. Speech, b. Documents). One panel groups the meetings by stereotyped/non-stereotyped within the mono-document and multi-document sets; the other groups them by stereotyped/non-stereotyped only. The values were not preserved.]
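For readers unfamiliar with Pk, a minimal sketch follows. It assumes each segmentation is given as one segment label per unit (utterance or sentence), with a distinct label per segment, and it uses the common convention of setting the window size k to half the average reference segment length; the sample segmentations are invented.

```python
def pk(reference, hypothesis, k=None):
    """Pk segmentation error: 0.0 means the hypothesis matches the reference.

    reference and hypothesis are equal-length lists holding one segment
    label per unit, with a distinct label for each segment.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("segmentations must cover the same units")
    n = len(reference)
    if k is None:
        # Half of the average reference segment length.
        k = max(1, round(n / (2 * len(set(reference)))))
    windows = n - k
    if windows <= 0:
        raise ValueError("window size k must be smaller than the sequence")
    errors = sum(
        (reference[i] == reference[i + k]) != (hypothesis[i] == hypothesis[i + k])
        for i in range(windows)
    )
    return errors / windows

# Hypothetical segmentations of 8 units into 2 segments.
reference = [0, 0, 0, 0, 1, 1, 1, 1]
shifted = [0, 0, 0, 0, 0, 0, 1, 1]  # boundary placed two units too late
perfect_score = pk(reference, reference)
shifted_score = pk(reference, shifted)
```

The window slides over the sequence and counts the positions where the two segmentations disagree on whether both ends of the window lie in the same segment.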

Analysis
Our bi-modal method outperforms standard mono-modal methods:
It bridges the gaps between the documents and the speech transcript, and detects the similar segments.
Documents greatly help in structuring meetings.
It is more precise in computing the number of segments.
[Figure: clusters A1 to A5 and the corresponding segments S1 to S5.]

Alignments Grouping
1. Implementation of a framework that:
a. Combines the various levels to correct false alignment pairs, e.g. (sentences x utterances) & (logical blocks x turns).
b. Combines the 3 alignment categories (thematic, quotations and references) to improve the document/speech alignment.
[Figure: speech turns T1, T2 and utterances U1, U2, U3 aligned with document logical blocks L1, L2 and sentences S1, S2.]
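One way to read step a is as a consistency check between levels: a sentence/utterance pair is kept only if the logical block and the turn that contain them are also aligned. The sketch below illustrates that reading only. The filtering rule, the parent mappings and the sample pairs are all assumptions; the slide does not specify how the levels are combined.

```python
def filter_by_parent_level(fine_pairs, coarse_pairs, sentence_block, utterance_turn):
    """Keep (sentence, utterance) pairs whose (block, turn) parents are aligned."""
    coarse = set(coarse_pairs)
    return [
        (s, u)
        for s, u in fine_pairs
        if (sentence_block[s], utterance_turn[u]) in coarse
    ]

# Hypothetical structure: which block holds each sentence, which turn each utterance.
sentence_block = {"S1": "L1", "S2": "L2"}
utterance_turn = {"U1": "T1", "U2": "T1", "U3": "T2"}

coarse_pairs = [("L1", "T1"), ("L2", "T2")]              # logical blocks x turns
fine_pairs = [("S1", "U1"), ("S1", "U3"), ("S2", "U3")]  # sentences x utterances

kept = filter_by_parent_level(fine_pairs, coarse_pairs, sentence_block, utterance_turn)
```

In this toy case the pair (S1, U3) is dropped because block L1 is not aligned with turn T2.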

Alignments Grouping
2. A visualization tool that:
Highlights the alignment categories (thematic, quotations, references).
Represents the various structures of the documents and the speech as layers.

Conclusion
Thematic alignment of documents with meeting dialogs is a solution for integrating static documents into multimedia archives: conferences, lectures, etc.

Perspectives
Automatic transcription of the speech.
Generalize the alignment to other:
document types with little text (e.g. slides, agendas);
kinds of meetings where documents are discussed irregularly (e.g. conferences).