nltk.IBMModel3

class nltk.IBMModel3(sentence_aligned_corpus, iterations, probability_tables=None)[source]

Translation model that considers how a word can be aligned to multiple words in another language.

>>> from nltk.translate import AlignedSent, IBMModel3
>>> bitext = []
>>> bitext.append(AlignedSent(['klein', 'ist', 'das', 'haus'], ['the', 'house', 'is', 'small']))
>>> bitext.append(AlignedSent(['das', 'haus', 'war', 'ja', 'groß'], ['the', 'house', 'was', 'big']))
>>> bitext.append(AlignedSent(['das', 'buch', 'ist', 'ja', 'klein'], ['the', 'book', 'is', 'small']))
>>> bitext.append(AlignedSent(['ein', 'haus', 'ist', 'klein'], ['a', 'house', 'is', 'small']))
>>> bitext.append(AlignedSent(['das', 'haus'], ['the', 'house']))
>>> bitext.append(AlignedSent(['das', 'buch'], ['the', 'book']))
>>> bitext.append(AlignedSent(['ein', 'buch'], ['a', 'book']))
>>> bitext.append(AlignedSent(['ich', 'fasse', 'das', 'buch', 'zusammen'], ['i', 'summarize', 'the', 'book']))
>>> bitext.append(AlignedSent(['fasse', 'zusammen'], ['summarize']))
>>> ibm3 = IBMModel3(bitext, 5)
>>> print(round(ibm3.translation_table['buch']['book'], 3))
1.0
>>> print(round(ibm3.translation_table['das']['book'], 3))
0.0
>>> print(round(ibm3.translation_table['ja'][None], 3))
1.0
>>> print(round(ibm3.distortion_table[1][1][2][2], 3))
1.0
>>> print(round(ibm3.distortion_table[1][2][2][2], 3))
0.0
>>> print(round(ibm3.distortion_table[2][2][4][5], 3))
0.75
>>> print(round(ibm3.fertility_table[2]['summarize'], 3))
1.0
>>> print(round(ibm3.fertility_table[1]['book'], 3))
1.0
>>> print(round(ibm3.p1, 3))
0.054
>>> test_sentence = bitext[2]
>>> test_sentence.words
['das', 'buch', 'ist', 'ja', 'klein']
>>> test_sentence.mots
['the', 'book', 'is', 'small']
>>> test_sentence.alignment
Alignment([(0, 0), (1, 1), (2, 2), (3, None), (4, 3)])
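The translation_table values above come from EM training over the parallel corpus. The following is a minimal, self-contained sketch of the underlying idea, using a simplified IBM Model 1-style E-step and M-step for lexical translation probabilities only (no distortion or fertility); it is an illustration, not NLTK's implementation:

```python
from collections import defaultdict

# Toy parallel corpus: (source sentence, target sentence) pairs
corpus = [
    (['das', 'haus'], ['the', 'house']),
    (['das', 'buch'], ['the', 'book']),
    (['ein', 'buch'], ['a', 'book']),
]

src_vocab = {w for s, _ in corpus for w in s}
trg_vocab = {w for _, t_ in corpus for w in t_}

# Uniform initialization of t(f | e)
t = {f: {e: 1.0 / len(trg_vocab) for e in trg_vocab} for f in src_vocab}

for _ in range(10):  # EM iterations
    count = defaultdict(lambda: defaultdict(float))
    total = defaultdict(float)
    for src, trg in corpus:
        for f in src:
            norm = sum(t[f][e] for e in trg)
            for e in trg:
                c = t[f][e] / norm          # E-step: expected count
                count[f][e] += c
                total[e] += c
    for f in src_vocab:                     # M-step: re-estimate t(f | e)
        for e in trg_vocab:
            if total[e] > 0:
                t[f][e] = count[f][e] / total[e]

# t['buch']['book'] approaches 1.0 as EM concentrates probability mass
print(round(t['buch']['book'], 3))
```

Model 3 trains the same kind of lexical table, but jointly with the distortion, fertility, and NULL-generation models shown in the doctest above.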

Methods

__init__(sentence_aligned_corpus, iterations) Train on sentence_aligned_corpus and create a lexical translation model, a distortion model, a fertility model, and a model for generating NULL-aligned words.
best_model2_alignment(sentence_pair[, ...]) Finds the best alignment according to IBM Model 2
hillclimb(alignment_info[, j_pegged]) Starting from the alignment in alignment_info, look at neighboring alignments iteratively for the best one.
init_vocab(sentence_aligned_corpus)
maximize_distortion_probabilities(counts)
maximize_fertility_probabilities(counts)
maximize_lexical_translation_probabilities(counts)
maximize_null_generation_probabilities(counts)
neighboring(alignment_info[, j_pegged]) Determine the neighbors of alignment_info, obtained by moving or swapping one alignment point.
prob_of_alignments(alignments)
prob_t_a_given_s(alignment_info) Probability of target sentence and an alignment given the source sentence.
reset_probabilities()
sample(sentence_pair) Sample the most probable alignments from the entire alignment space.
set_uniform_probabilities(...)
train(parallel_corpus)
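hillclimb and neighboring together implement a local search over alignments: starting from a seed alignment, repeatedly move to the best-scoring neighbor until no neighbor improves. A toy sketch of that search, with a hypothetical lexical-probability product standing in for prob_t_a_given_s (this is an illustration of the search strategy, not NLTK's code):

```python
def score(alignment, t, src, trg):
    # Crude stand-in for prob_t_a_given_s: product of lexical probabilities
    p = 1.0
    for j, i in enumerate(alignment):
        p *= t.get((src[j], trg[i]), 0.01)
    return p

def neighbors(alignment, trg_len):
    # "Move" neighbors: re-align one source word to a different target word
    for j in range(len(alignment)):
        for i in range(trg_len):
            if i != alignment[j]:
                yield alignment[:j] + (i,) + alignment[j + 1:]
    # "Swap" neighbors: exchange the alignment points of two source words
    for j1 in range(len(alignment)):
        for j2 in range(j1 + 1, len(alignment)):
            a = list(alignment)
            a[j1], a[j2] = a[j2], a[j1]
            yield tuple(a)

def hillclimb(alignment, t, src, trg):
    best = alignment
    while True:
        candidate = max(neighbors(best, len(trg)),
                        key=lambda a: score(a, t, src, trg))
        if score(candidate, t, src, trg) <= score(best, t, src, trg):
            return best  # local optimum reached
        best = candidate

# Made-up lexical probabilities for illustration
t = {('das', 'the'): 0.9, ('haus', 'house'): 0.9,
     ('das', 'house'): 0.05, ('haus', 'the'): 0.05}
src, trg = ['das', 'haus'], ['the', 'house']
print(hillclimb((1, 0), t, src, trg))  # climbs from the bad alignment to (0, 1)
```

In NLTK, sample() seeds this search with the best IBM Model 2 alignment (best_model2_alignment) before hill-climbing under the full Model 3 score.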

Attributes

MIN_PROB