Statistical Machine Translation: Translation without Understanding
Colin Cherry



Who is this guy?
• One of Dr. Lin’s PhD students
• Did my Master’s degree at U of A
• Research Area: Machine Translation
• Home town: Halifax, Nova Scotia
• Please ask questions!

Machine Translation
• Translation is easy for (bilingual) people
• Process:
  • Read the text in English
  • Understand it
  • Write it down in French
• Hard for computers
• The human process is invisible, intangible

One approach: Babelfish
• A rule-based approach to machine translation
• A 30-year-old feat in Software Engineering
• Programming knowledge in by hand is difficult and expensive

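Alternate Approach: Statistics
• What if we had a model for P(F|E)?
• We could use Bayes’ rule:
\[
\arg\max_{E} P(E \mid F) = \arg\max_{E} \frac{P(F \mid E)\,P(E)}{P(F)} = \arg\max_{E} P(F \mid E)\,P(E)
\]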

Why Bayes’ rule at all?
• Why not model P(E|F) directly?
• P(F|E)P(E) decomposition allows us to be sloppy
• P(E) worries about good English
• P(F|E) worries about French that matches English
• The two can be trained independently

Crime Scene Analogy
• F is a crime scene. E is a person who may have committed the crime
• P(E|F) - look at the scene - who did it?
• P(E) - who had a motive? (Profiler)
• P(F|E) - could they have done it? (CSI - transportation, access to weapons, alibi)
• Some people might have great motives, but no means - you need both!

On voit Jon à la télévision
Columns: candidate translation | good English? P(E) | good match to French? P(F|E)
• Jon appeared in TV.
• Appeared on Jon TV.
• In Jon appeared TV.
• Jon is happy today.
• Jon appeared on TV.
• TV appeared on Jon.
• TV in Jon appeared.
• Jon was not happy.
Table borrowed from Jason Eisner
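
The two questions in the table can be turned into a toy ranking. The sketch below is only an illustration: every probability in it is invented for this example (none comes from the slides or from a trained model), and `lm_prob`, `tm_prob`, and `best_translation` are hypothetical names.

```python
# Toy noisy-channel ranking: pick the English candidate E that maximizes
# P(E) * P(F|E) for the French sentence "On voit Jon à la télévision".
# All numbers below are invented for illustration only.

# P(E): how plausible is the candidate as an English sentence?
lm_prob = {
    "Jon appeared in TV.": 1e-4,
    "In Jon appeared TV.": 1e-6,
    "Jon is happy today.": 1e-3,
    "Jon appeared on TV.": 1e-3,
}

# P(F|E): how well does the French sentence match the candidate?
tm_prob = {
    "Jon appeared in TV.": 1e-2,
    "In Jon appeared TV.": 1e-2,
    "Jon is happy today.": 1e-8,
    "Jon appeared on TV.": 1e-2,
}


def best_translation(lm_prob, tm_prob):
    """Return the candidate with the highest product P(E) * P(F|E)."""
    return max(lm_prob, key=lambda e: lm_prob[e] * tm_prob[e])


print(best_translation(lm_prob, tm_prob))
```

A candidate needs a good score on both factors to win: fluent but unrelated English is penalized by P(F|E), and a word-for-word match in broken English is penalized by P(E).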

Where will we get P(F|E)?
Books in English + same books, in French -> Machine Learning Magic -> P(F|E) model
• We call collections stored in two languages parallel corpora or parallel texts
• Want to update your system? Just add more text!

Our Inspiration:
• The Canadian Parliamentary Debates!
• Stored electronically in both French and English and available over the Internet

Problem:
• How are we going to generalize from examples of translations?
• I’ll spend the rest of this lecture telling you:
  • What makes a useful P(F|E)
  • How to obtain the statistics needed for P(F|E) from parallel texts

Strategy: Generative Story
• When modeling P(X|Y):
  • Assume you start with Y
  • Decompose the creation of X from Y into some number of operations
  • Track statistics of individual operations
  • For a new example X,Y: P(X|Y) can be calculated based on the probability of the operations needed to get X from Y

What if…?
The quick fox jumps over the lazy dog
Le renard rapide saute par - dessus le chien paresseux

New Information
• Call this new info a word alignment (A)
• With A, we can make a good story
The quick fox jumps over the lazy dog
Le renard rapide saute par - dessus le chien paresseux

P(F,A|E) Story
null The quick fox jumps over the lazy dog
f1 f2 f3 … f10
Le renard rapide saute par - dessus le chien paresseux
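
One way to read the story in code, as a minimal sketch: it assumes P(F,A|E) is scored simply as the product of word-translation probabilities along the alignment, which is how the worked example later in the slides computes it, and it ignores any length or position terms. The function name `joint_prob` and the probability values are hypothetical.

```python
from fractions import Fraction


def joint_prob(e_sent, f_sent, alignment, t):
    """Multiply t[(f, e)] over every aligned (French, English) word pair.

    alignment[j] is the index in e_sent of the English word that generates
    French word f_sent[j]; index 0 is the special null word.
    """
    p = Fraction(1)
    for f, i in zip(f_sent, alignment):
        p *= t[(f, e_sent[i])]
    return p


# A shortened version of the slide's sentence pair.
e_sent = ["null", "The", "quick", "fox"]
f_sent = ["Le", "renard", "rapide"]
alignment = [1, 3, 2]  # Le<-The, renard<-fox, rapide<-quick

# Hypothetical translation probabilities, invented for illustration.
t = {
    ("Le", "The"): Fraction(1, 2),
    ("renard", "fox"): Fraction(1, 4),
    ("rapide", "quick"): Fraction(1, 2),
}

print(joint_prob(e_sent, f_sent, alignment, t))  # 1/16
```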

Getting P_t(f|e)
• We need numbers for P_t(f|e)
• Example: P_t(le|the)
• Count lines in a large collection of aligned text
[Figure: the aligned pair "null The quick fox jumps over the lazy dog" / "Le renard rapide saute par - dessus le chien paresseux", shown six times]
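
Counting lines can be sketched as follows, assuming the alignment links are already given. The two-sentence corpus and the name `estimate_t` are hypothetical; the estimate is just the relative frequency of each link among all links leaving the same English word.

```python
from collections import Counter


def estimate_t(aligned_corpus):
    """Estimate P_t(f|e) as count(f linked to e) / count(all links from e)."""
    link_counts = Counter()
    e_totals = Counter()
    for e_sent, f_sent, links in aligned_corpus:
        for j, i in links:  # French position j is linked to English position i
            link_counts[(f_sent[j], e_sent[i])] += 1
            e_totals[e_sent[i]] += 1
    return {(f, e): c / e_totals[e] for (f, e), c in link_counts.items()}


# Hypothetical hand-aligned corpus, invented for illustration.
aligned_corpus = [
    (["the", "dog"], ["le", "chien"], [(0, 0), (1, 1)]),
    (["the", "house"], ["la", "maison"], [(0, 0), (1, 1)]),
]

t = estimate_t(aligned_corpus)
print(t[("le", "the")])  # 0.5
```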

Where do we get the lines?
• That sure looked like a lot of monkeys…
• Remember POS tagging w/ HMMs:
  • You didn’t need a tagged corpus to train a tagger
• We’ll get alignments out of unaligned text by treating the alignment as a hidden variable
• Generalization of ideas in HMM training: called EM

English: In the beginning God created the heavens and the earth.
Vietnamese: Ban dâu Dúc Chúa Tròi dung nên tròi dât.
English: God called the expanse heaven.
Vietnamese: Dúc Chúa Tròi dat tên khoang không la tròi.
English: … you are this day like the stars of heaven in number.
Vietnamese: … các nguoi dông nhu sao trên tròi.
Where’s “heaven” in Vietnamese?
Example borrowed from Jason Eisner

EM: Expectation Maximization
• Assume a probability distribution (weights) over hidden events
• Take counts of events based on this distribution
• Use counts to estimate new parameters
• Use parameters to re-weight examples
• Rinse and repeat

Alignment Hypotheses
[Figure: eight candidate word alignments between "null I like milk" and "Je aime le lait"]

Weighted Alignments
• What we’ll do is:
  • Consider every possible alignment
  • Give each alignment a weight - indicating how good it is
  • Count weighted alignments as normal

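Good grief! We forgot about P(F|E)!
• No worries, a little more stats gets us what we need:
\[
P(F \mid E) = \sum_{A} P(F, A \mid E), \qquad P(A \mid F, E) = \frac{P(F, A \mid E)}{\sum_{A'} P(F, A' \mid E)}
\]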

Big Example: Corpus
Pair 1: fast car / voiture rapide
Pair 2: fast / rapide

Possible Alignments
1a: fast-voiture, car-rapide
1b: fast-rapide, car-voiture
2: fast-rapide

Parameters
P(voiture|fast) = 1/2    P(rapide|fast) = 1/2    P(voiture|car) = 1/2    P(rapide|car) = 1/2

Weight Calculations
Alignment    P(A,F|E)            P(A|F,E)
1a           1/2 * 1/2 = 1/4     1/4 / 2/4 = 1/2
1b           1/2 * 1/2 = 1/4     1/4 / 2/4 = 1/2
2            1/2                 1/2 / 1/2 = 1

Count Lines
Weights: 1a = 1/2, 1b = 1/2, 2 = 1
#(voiture,fast) = 1/2    #(rapide,fast) = 1/2 + 1 = 3/2    #(voiture,car) = 1/2    #(rapide,car) = 1/2
Normalize:
P(voiture|fast) = 1/4    P(rapide|fast) = 3/4    P(voiture|car) = 1/2    P(rapide|car) = 1/2

Weight Calculations (with the new parameters)
Alignment    P(A,F|E)            P(A|F,E)
1a           1/4 * 1/2 = 1/8     1/8 / 4/8 = 1/4
1b           1/2 * 3/4 = 3/8     3/8 / 4/8 = 3/4
2            3/4                 3/4 / 3/4 = 1

Count Lines
Weights: 1a = 1/4, 1b = 3/4, 2 = 1
#(voiture,fast) = 1/4    #(rapide,fast) = 3/4 + 1 = 7/4    #(voiture,car) = 3/4    #(rapide,car) = 1/4
Normalize:
P(voiture|fast) = 1/8    P(rapide|fast) = 7/8    P(voiture|car) = 3/4    P(rapide|car) = 1/4

After many iterations:
Weights: 1a ~0, 1b ~1, 2 = 1
P(voiture|fast) ~0    P(rapide|fast) ~1    P(voiture|car) ~1    P(rapide|car) ~0
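
The arithmetic of this example can be replayed with a short script. This is a sketch under the same simplifications the example uses: no null word, sentence pairs of equal length, and only one-to-one alignments (enumerated as permutations). The names `init_uniform` and `em_step` are hypothetical, and exact fractions are used so the values can be compared with the slides.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import permutations

corpus = [
    (["fast", "car"], ["voiture", "rapide"]),
    (["fast"], ["rapide"]),
]


def init_uniform(corpus):
    """Start with P(f|e) uniform over the French words seen alongside e."""
    cooc = defaultdict(set)
    for e_sent, f_sent in corpus:
        for e in e_sent:
            cooc[e].update(f_sent)
    return {(f, e): Fraction(1, len(fs)) for e, fs in cooc.items() for f in fs}


def em_step(corpus, t):
    """One EM iteration: weight every alignment, count links, normalize."""
    counts = defaultdict(Fraction)
    for e_sent, f_sent in corpus:
        # Each permutation pairs f_perm[i] with e_sent[i] (one-to-one alignment).
        aligns = list(permutations(f_sent))
        joint = []
        for f_perm in aligns:
            p = Fraction(1)
            for e, f in zip(e_sent, f_perm):
                p *= t[(f, e)]
            joint.append(p)  # P(A,F|E) for this alignment
        total = sum(joint)  # sum over alignments, used to normalize
        for f_perm, p in zip(aligns, joint):
            weight = p / total  # P(A|F,E)
            for e, f in zip(e_sent, f_perm):
                counts[(f, e)] += weight  # fractional link count
    totals = defaultdict(Fraction)
    for (f, e), c in counts.items():
        totals[e] += c
    return {(f, e): c / totals[e] for (f, e), c in counts.items()}


t0 = init_uniform(corpus)
t1 = em_step(corpus, t0)
t2 = em_step(corpus, t1)
print(t1[("rapide", "fast")], t2[("rapide", "fast")])  # 3/4 7/8
```

Calling `em_step` repeatedly pushes P(rapide|fast) and P(voiture|car) toward 1, in line with the last slide of the example.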

Seems too easy?
• What if you have no 1-word sentence?
• Words in shorter sentences will get more weight - fewer possible alignments
• Weight is additive throughout the corpus: if a word e shows up frequently with some other word f, P(f|e) will go up

Some things I skipped
• Enumerating all possible alignments:
  • Very easy with this model: The independence assumptions save us
• Model could be a lot better:
  • Word positions
  • Multiple f’s generated by the same e
  • Can actually use an HMM!
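
Why the independence assumptions help can be sketched in one line of algebra. This assumes that each French word f_j picks its English word (or null, index 0) independently of the others, as in IBM Model 1; the slides do not name the model, and the symbols l (English length), m (French length), and a_j (the English position chosen for f_j) are introduced here for illustration. Constant factors are hidden behind the proportionality sign.

```latex
\[
P(F \mid E) \;\propto\; \sum_{a_1=0}^{l} \cdots \sum_{a_m=0}^{l} \; \prod_{j=1}^{m} P_t(f_j \mid e_{a_j})
\;=\; \prod_{j=1}^{m} \; \sum_{i=0}^{l} P_t(f_j \mid e_i)
\]
```

The left-hand side has (l+1)^m terms, while the right-hand side needs only m(l+1) table lookups. The factorization does not apply when alignments are restricted to be one-to-one, as in the small fast car example.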

The Final Product
• Now we have a model for P(F|E)
• Test it by aligning a corpus!
  • i.e., find argmax_A P(A|F,E)
• Use it for translation:
  • Combine with favorite model for P(E)
  • Search space of English sentences for one that maximizes P(E)P(F|E) for a given F
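
Aligning with a trained table can be sketched as below, assuming each French word chooses its English word independently, so that the best alignment is found one word at a time. The name `best_alignment` is hypothetical, and the table rounds the "~0" and "~1" values from the end of the worked example to 0.0 and 1.0.

```python
def best_alignment(e_sent, f_sent, t):
    """For each French word, return the index of the English word with the
    highest t(f|e); pairs missing from the table count as probability 0."""
    return [
        max(range(len(e_sent)), key=lambda i: t.get((f, e_sent[i]), 0.0))
        for f in f_sent
    ]


# Converged values from the worked example, rounded for illustration.
t = {("rapide", "fast"): 1.0, ("voiture", "car"): 1.0}

print(best_alignment(["fast", "car"], ["voiture", "rapide"], t))  # [1, 0]
```

The output says that voiture aligns to car (index 1) and rapide aligns to fast (index 0).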

Questions?