for num, batch in enumerate(train_iter):
    # batch is a torchtext.data.batch.Batch object
    # holding 32 instances.
    # The data in batch have two fields: text and label.
    # The text field contains the word indices themselves [len(longest sentence) x 32]
    # and the length of each sentence [32] in this batch (remember include_lengths=True?).
    # The label field contains the prediction targets [32].
    print("batch:")
    print(batch, "\n")

    # Print the labels
    print("batch.label:")
    print(batch.label, "\n")

    # Print the first component of batch.text (the sequences of word indices)
    print("batch.text[0]:")
    print(batch.text[0], "\n")

    # Print the first element of batch.text[0], which holds the first word of
    # every sentence. You may notice that the results are all 2.
    # That's because we use <SOS> to mark the beginning of a sentence, and,
    # going back to TEXT.vocab.stoi, you can find that the index of <SOS> is 2.
    print("batch.text[0][0]:")
    print(batch.text[0][0], "\n")

    # Print the actual content of the fifth sentence in this batch.
    # Note that we use the lookup dictionary to map each word index back to its word.
    print("Fifth sentence: ")
    for i in range(batch.text[0].size()[0]):
        print(lookup[batch.text[0][i].tolist()[5]], end=" ")
    # You might find that torchtext automatically adds <PAD>s after <EOS> so that
    # all sentences in this batch have identical length.
    break
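
The snippet above assumes that train_iter, TEXT, and lookup were defined earlier. Below is a minimal sketch of one way that setup could look, using the legacy torchtext Field API (torchtext <= 0.8, or torchtext.legacy in later releases); the toy dataset, the field names, and the choice of lookup = TEXT.vocab.itos are illustrative assumptions, not taken from the original paste.

from torchtext import data

# include_lengths=True makes batch.text a (indices, lengths) tuple, as used above;
# init_token / eos_token insert the <SOS> / <EOS> markers the comments mention.
TEXT = data.Field(init_token="<SOS>", eos_token="<EOS>", include_lengths=True)
LABEL = data.Field(sequential=False)

# Hypothetical toy data standing in for the real training set.
fields = [("text", TEXT), ("label", LABEL)]
examples = [
    data.Example.fromlist(["a tiny example sentence", "pos"], fields),
    data.Example.fromlist(["another even shorter one", "neg"], fields),
]
train_data = data.Dataset(examples, fields)

# With the default specials (<unk>, <pad>) built first, <SOS> typically lands
# at index 2 in TEXT.vocab.stoi, matching the output described above.
TEXT.build_vocab(train_data)
LABEL.build_vocab(train_data)

train_iter = data.BucketIterator(
    train_data, batch_size=32, sort_key=lambda ex: len(ex.text)
)

# lookup maps a word index back to its string; TEXT.vocab.itos is a natural choice.
lookup = TEXT.vocab.itos

Note that this toy dataset yields a batch of only two sentences, so the column-5 indexing in the loop above would only work with a real dataset supplying at least six examples per batch.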