nltk.IBMModel2

class nltk.IBMModel2(sentence_aligned_corpus, iterations, probability_tables=None)

Lexical translation model that considers word order.
>>> from nltk.translate import AlignedSent, IBMModel2
>>> bitext = []
>>> bitext.append(AlignedSent(['klein', 'ist', 'das', 'haus'], ['the', 'house', 'is', 'small']))
>>> bitext.append(AlignedSent(['das', 'haus', 'ist', 'ja', 'groß'], ['the', 'house', 'is', 'big']))
>>> bitext.append(AlignedSent(['das', 'buch', 'ist', 'ja', 'klein'], ['the', 'book', 'is', 'small']))
>>> bitext.append(AlignedSent(['das', 'haus'], ['the', 'house']))
>>> bitext.append(AlignedSent(['das', 'buch'], ['the', 'book']))
>>> bitext.append(AlignedSent(['ein', 'buch'], ['a', 'book']))

>>> ibm2 = IBMModel2(bitext, 5)

>>> print(round(ibm2.translation_table['buch']['book'], 3))
1.0
>>> print(round(ibm2.translation_table['das']['book'], 3))
0.0
>>> print(round(ibm2.translation_table['buch'][None], 3))
0.0
>>> print(round(ibm2.translation_table['ja'][None], 3))
0.0

>>> print(ibm2.alignment_table[1][1][2][2])
0.938...
>>> print(round(ibm2.alignment_table[1][2][2][2], 3))
0.0
>>> print(round(ibm2.alignment_table[2][2][4][5], 3))
1.0

>>> test_sentence = bitext[2]
>>> test_sentence.words
['das', 'buch', 'ist', 'ja', 'klein']
>>> test_sentence.mots
['the', 'book', 'is', 'small']
>>> test_sentence.alignment
Alignment([(0, 0), (1, 1), (2, 2), (3, 2), (4, 3)])
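The alignment printed at the end of the example is a set of index pairs. The sketch below, which does not use NLTK, shows one way to read it: each pair is taken as (position in words, position in mots). The helper name aligned_word_pairs is hypothetical, and treating a second element of None as "unaligned" is an assumption, not something shown in the example above.

```python
# Sentence pair and alignment copied from the doctest output above.
words = ['das', 'buch', 'ist', 'ja', 'klein']
mots = ['the', 'book', 'is', 'small']
alignment = [(0, 0), (1, 1), (2, 2), (3, 2), (4, 3)]


def aligned_word_pairs(words, mots, alignment):
    """Turn (words index, mots index) pairs into (word, mot) pairs.

    Hypothetical helper for illustration only. A mots index of None is
    assumed to mean that the word is not aligned to any mot.
    """
    pairs = []
    for word_index, mot_index in sorted(alignment):
        mot = None if mot_index is None else mots[mot_index]
        pairs.append((words[word_index], mot))
    return pairs


for word, mot in aligned_word_pairs(words, mots, alignment):
    print(word, '->', mot)
```

Read this way, both 'ist' and 'ja' are aligned to 'is', which matches the two pairs (2, 2) and (3, 2) in the output.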
Methods
__init__(sentence_aligned_corpus, iterations) | Train on sentence_aligned_corpus and create a lexical translation model and an alignment model.
best_model2_alignment(sentence_pair[, ...]) | Finds the best alignment according to IBM Model 2.
hillclimb(alignment_info[, j_pegged]) | Starting from the alignment in alignment_info, look at ...
init_vocab(sentence_aligned_corpus) |
maximize_alignment_probabilities(counts) |
maximize_fertility_probabilities(counts) |
maximize_lexical_translation_probabilities(counts) |
maximize_null_generation_probabilities(counts) |
neighboring(alignment_info[, j_pegged]) | Determine the neighbors of alignment_info, obtained by ...
prob_alignment_point(i, j, src_sentence, ...) | Probability that position j in trg_sentence is aligned to ...
prob_all_alignments(src_sentence, trg_sentence) | Computes the probability of all possible word alignments, ...
prob_of_alignments(alignments) |
prob_t_a_given_s(alignment_info) | Probability of target sentence and an alignment given the ...
reset_probabilities() |
sample(sentence_pair) | Sample the most probable alignments from the entire alignment ...
set_uniform_probabilities(...) |
train(parallel_corpus) |
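
As a rough illustration of what prob_alignment_point and best_model2_alignment describe, the following sketch scores alignment points with a Model 2 style product of a lexical translation probability and an alignment probability, then picks the best source position for each target word. It is not NLTK's implementation. The function names, the table values, and the padded sentence layout are all hypothetical. The index order alignment_table[i][j][l][m] (source position i, target position j, source length l, target length m, with i = 0 standing for NULL) is an assumption inferred from the example above.

```python
from collections import defaultdict


def nested_table(depth, default=0.0):
    """Build a nested defaultdict with `depth` levels of keys (illustrative helper)."""
    if depth == 1:
        return defaultdict(lambda: default)
    return defaultdict(lambda: nested_table(depth - 1, default))


def alignment_point_score(i, j, src, trg, t_table, a_table):
    """Score for aligning target position j to source position i.

    Assumes 1-indexed sentences: src[0] is the NULL token (None) and
    trg[0] is an unused placeholder.
    """
    l = len(src) - 1  # source length, excluding NULL
    m = len(trg) - 1  # target length, excluding the placeholder
    return t_table[trg[j]][src[i]] * a_table[i][j][l][m]


def best_alignment(src, trg, t_table, a_table):
    """For each target position, choose the highest-scoring source position.

    Returns 0-indexed (target index, source index) pairs; a source index of
    None means the target word was aligned to NULL.
    """
    pairs = []
    for j in range(1, len(trg)):
        best_i = max(
            range(len(src)),
            key=lambda i: alignment_point_score(i, j, src, trg, t_table, a_table),
        )
        pairs.append((j - 1, None if best_i == 0 else best_i - 1))
    return pairs


# Hypothetical probability tables for a two-word sentence pair.
t_table = nested_table(2)  # t_table[target_word][source_word]
t_table['das']['the'], t_table['das']['book'] = 0.9, 0.1
t_table['buch']['the'], t_table['buch']['book'] = 0.1, 0.9

a_table = nested_table(4)  # a_table[i][j][l][m]
a_table[0][1][2][2], a_table[1][1][2][2], a_table[2][1][2][2] = 0.1, 0.8, 0.1
a_table[0][2][2][2], a_table[1][2][2][2], a_table[2][2][2][2] = 0.1, 0.1, 0.8

src = [None, 'the', 'book']      # NULL token first
trg = ['UNUSED', 'das', 'buch']  # placeholder first

print(best_alignment(src, trg, t_table, a_table))  # [(0, 0), (1, 1)]
```

Choosing each target word's source position independently is possible here because, in this sketch, the score of one alignment point does not depend on the others.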