- Data : 10%
- Time taken : 9 mins
- batch_size = 128
- embedding_size = 128 # Dimension of the embedding vector.
- skip_window = 2 # How many words to consider left and right.
- num_skips = 2 # How many times to reuse an input to generate a label.
- num_sampled = 64 # Number of negative examples to sample.
- training num_steps = 100001
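
To illustrate what skip_window and num_skips control, here is a minimal sketch of skip-gram pair generation. It is not the code of tf-basic-1.py, which is not included in this paste. The function name `skipgram_pairs`, the `seed` argument, and the sample sentence are hypothetical, and as a simplification the sketch skips words that lack a full window on both sides.

```python
import random


def skipgram_pairs(tokens, skip_window=2, num_skips=2, seed=0):
    """Return (center, context) training pairs for a list of tokens.

    skip_window: how many words to consider left and right of the center.
    num_skips:   how many times each center word is reused as an input.
    """
    # A window holds 2 * skip_window context words, so no more can be drawn.
    assert num_skips <= 2 * skip_window
    rng = random.Random(seed)  # seeded only to make the sketch repeatable
    pairs = []
    # Simplification: only centers with a full window on both sides are used.
    for i in range(skip_window, len(tokens) - skip_window):
        # Indices of every context word inside the window around position i.
        context = [j for j in range(i - skip_window, i + skip_window + 1) if j != i]
        # Draw num_skips distinct context positions for this center word.
        for j in rng.sample(context, num_skips):
            pairs.append((tokens[i], tokens[j]))
    return pairs


tokens = "the quick brown fox jumps over the lazy dog".split()
pairs = skipgram_pairs(tokens)
for center, context in pairs:
    print(center, "->", context)
```

With skip_window = 2 and num_skips = 2, each center word yields two (input, label) pairs drawn from its four neighbours. If the script fills each batch this way, batch_size = 128 corresponds to 64 center words per training step.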
- ------------------------------------------------------------------------------------------------------
- /usr/bin/python3.6 /home/suthagar/PycharmProjects/wordembedding/tensorflow/tf-basic-1.py
- 31
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-2-output.txt
- Data size 128516
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-0-output.txt
- Data size 257642
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-1-output.txt
- Data size 386496
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-10-output.txt
- Data size 515056
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-11-output.txt
- Data size 643900
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-12-output.txt
- Data size 772888
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-13-output.txt
- Data size 902749
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-14-output.txt
- Data size 1032093
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-15-output.txt
- Data size 1160996
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-16-output.txt
- Data size 1290329
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-17-output.txt
- Data size 1419246
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-18-output.txt
- Data size 1547816
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-19-output.txt
- Data size 1677604
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-20-output.txt
- Data size 1806376
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-21-output.txt
- Data size 1934628
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-22-output.txt
- Data size 2064353
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-23-output.txt
- Data size 2193694
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-24-output.txt
- Data size 2323519
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-25-output.txt
- Data size 2452664
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-26-output.txt
- Data size 2580696
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-27-output.txt
- Data size 2709547
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-28-output.txt
- Data size 2839134
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-29-output.txt
- Data size 2967339
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-3-output.txt
- Data size 3097342
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-30-output.txt
- Data size 3173173
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-4-output.txt
- Data size 3302455
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-5-output.txt
- Data size 3431952
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-6-output.txt
- Data size 3561358
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-7-output.txt
- Data size 3690530
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-8-output.txt
- Data size 3819956
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00001-of-00100/news.en-00001-of-00100-out-9-output.txt
- Data size 3948945
- 31
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-2-output.txt
- Data size 4077445
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-0-output.txt
- Data size 4205863
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-1-output.txt
- Data size 4334906
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-10-output.txt
- Data size 4463802
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-11-output.txt
- Data size 4594170
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-12-output.txt
- Data size 4723159
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-13-output.txt
- Data size 4851463
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-14-output.txt
- Data size 4980526
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-15-output.txt
- Data size 5109599
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-16-output.txt
- Data size 5237380
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-17-output.txt
- Data size 5365879
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-18-output.txt
- Data size 5495692
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-19-output.txt
- Data size 5625190
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-20-output.txt
- Data size 5753760
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-21-output.txt
- Data size 5883541
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-22-output.txt
- Data size 6012679
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-23-output.txt
- Data size 6141511
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-24-output.txt
- Data size 6270866
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-25-output.txt
- Data size 6400908
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-26-output.txt
- Data size 6529719
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-27-output.txt
- Data size 6659897
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-28-output.txt
- Data size 6789040
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-29-output.txt
- Data size 6917674
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-3-output.txt
- Data size 7046961
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-30-output.txt
- Data size 7136254
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-4-output.txt
- Data size 7265195
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-5-output.txt
- Data size 7393928
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-6-output.txt
- Data size 7523897
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-7-output.txt
- Data size 7652253
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-8-output.txt
- Data size 7781490
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00002-of-00100/news.en-00002-of-00100-out-9-output.txt
- Data size 7910130
- 31
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-2-output.txt
- Data size 8038646
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-0-output.txt
- Data size 8167772
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-1-output.txt
- Data size 8296626
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-10-output.txt
- Data size 8425186
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-11-output.txt
- Data size 8554030
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-12-output.txt
- Data size 8683018
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-13-output.txt
- Data size 8812879
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-14-output.txt
- Data size 8942223
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-15-output.txt
- Data size 9071126
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-16-output.txt
- Data size 9200459
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-17-output.txt
- Data size 9329376
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-18-output.txt
- Data size 9457946
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-19-output.txt
- Data size 9587734
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-20-output.txt
- Data size 9716506
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-21-output.txt
- Data size 9844758
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-22-output.txt
- Data size 9974483
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-23-output.txt
- Data size 10103824
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-24-output.txt
- Data size 10233649
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-25-output.txt
- Data size 10362794
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-26-output.txt
- Data size 10490826
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-27-output.txt
- Data size 10619677
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-28-output.txt
- Data size 10749264
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-29-output.txt
- Data size 10877469
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-3-output.txt
- Data size 11007472
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-30-output.txt
- Data size 11083303
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-4-output.txt
- Data size 11212585
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-5-output.txt
- Data size 11342082
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-6-output.txt
- Data size 11471488
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-7-output.txt
- Data size 11600660
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-8-output.txt
- Data size 11730086
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00003-of-00100/news.en-00003-of-00100-out-9-output.txt
- Data size 11859075
- 31
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-2-output.txt
- Data size 11989071
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-0-output.txt
- Data size 12118812
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-1-output.txt
- Data size 12248449
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-10-output.txt
- Data size 12377701
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-11-output.txt
- Data size 12507389
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-12-output.txt
- Data size 12636343
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-13-output.txt
- Data size 12765008
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-14-output.txt
- Data size 12893362
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-15-output.txt
- Data size 13022361
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-16-output.txt
- Data size 13151575
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-17-output.txt
- Data size 13281469
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-18-output.txt
- Data size 13411025
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-19-output.txt
- Data size 13539980
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-20-output.txt
- Data size 13669319
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-21-output.txt
- Data size 13799528
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-22-output.txt
- Data size 13928314
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-23-output.txt
- Data size 14057778
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-24-output.txt
- Data size 14186868
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-25-output.txt
- Data size 14316248
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-26-output.txt
- Data size 14445112
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-27-output.txt
- Data size 14573723
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-28-output.txt
- Data size 14702089
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-29-output.txt
- Data size 14830842
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-3-output.txt
- Data size 14960442
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-30-output.txt
- Data size 15043117
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-4-output.txt
- Data size 15170178
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-5-output.txt
- Data size 15298572
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-6-output.txt
- Data size 15427872
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-7-output.txt
- Data size 15556447
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-8-output.txt
- Data size 15684962
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00004-of-00100/news.en-00004-of-00100-out-9-output.txt
- Data size 15814026
- 31
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-2-output.txt
- Data size 15942599
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-0-output.txt
- Data size 16071651
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-1-output.txt
- Data size 16200802
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-10-output.txt
- Data size 16329653
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-11-output.txt
- Data size 16458086
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-12-output.txt
- Data size 16587360
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-13-output.txt
- Data size 16715806
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-14-output.txt
- Data size 16845354
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-15-output.txt
- Data size 16974268
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-16-output.txt
- Data size 17102748
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-17-output.txt
- Data size 17230940
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-18-output.txt
- Data size 17358654
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-19-output.txt
- Data size 17488138
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-20-output.txt
- Data size 17616049
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-21-output.txt
- Data size 17744559
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-22-output.txt
- Data size 17873425
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-23-output.txt
- Data size 18002521
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-24-output.txt
- Data size 18131879
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-25-output.txt
- Data size 18260774
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-26-output.txt
- Data size 18390433
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-27-output.txt
- Data size 18520568
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-28-output.txt
- Data size 18648865
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-29-output.txt
- Data size 18778412
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-3-output.txt
- Data size 18907842
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-30-output.txt
- Data size 18982106
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-4-output.txt
- Data size 19111546
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-5-output.txt
- Data size 19240618
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-6-output.txt
- Data size 19370246
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-7-output.txt
- Data size 19499727
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-8-output.txt
- Data size 19629967
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00005-of-00100/news.en-00005-of-00100-out-9-output.txt
- Data size 19758905
- 31
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-2-output.txt
- Data size 19888689
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-0-output.txt
- Data size 20017844
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-1-output.txt
- Data size 20147028
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-10-output.txt
- Data size 20276372
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-11-output.txt
- Data size 20405762
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-12-output.txt
- Data size 20534821
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-13-output.txt
- Data size 20664771
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-14-output.txt
- Data size 20793963
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-15-output.txt
- Data size 20924249
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-16-output.txt
- Data size 21053983
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-17-output.txt
- Data size 21180626
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-18-output.txt
- Data size 21308700
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-19-output.txt
- Data size 21436046
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-20-output.txt
- Data size 21565510
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-21-output.txt
- Data size 21695257
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-22-output.txt
- Data size 21823971
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-23-output.txt
- Data size 21952660
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-24-output.txt
- Data size 22081132
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-25-output.txt
- Data size 22210309
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-26-output.txt
- Data size 22340291
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-27-output.txt
- Data size 22468172
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-28-output.txt
- Data size 22597593
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-29-output.txt
- Data size 22726774
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-3-output.txt
- Data size 22856659
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-30-output.txt
- Data size 22926549
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-4-output.txt
- Data size 23054433
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-5-output.txt
- Data size 23183470
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-6-output.txt
- Data size 23311500
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-7-output.txt
- Data size 23439513
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-8-output.txt
- Data size 23569024
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00006-of-00100/news.en-00006-of-00100-out-9-output.txt
- Data size 23698447
- 31
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-2-output.txt
- Data size 23827259
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-0-output.txt
- Data size 23956141
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-1-output.txt
- Data size 24085753
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-10-output.txt
- Data size 24214584
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-11-output.txt
- Data size 24343812
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-12-output.txt
- Data size 24472452
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-13-output.txt
- Data size 24601478
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-14-output.txt
- Data size 24729800
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-15-output.txt
- Data size 24858628
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-16-output.txt
- Data size 24988401
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-17-output.txt
- Data size 25117433
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-18-output.txt
- Data size 25246974
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-19-output.txt
- Data size 25376880
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-20-output.txt
- Data size 25505710
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-21-output.txt
- Data size 25636093
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-22-output.txt
- Data size 25764367
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-23-output.txt
- Data size 25894770
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-24-output.txt
- Data size 26023237
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-25-output.txt
- Data size 26152387
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-26-output.txt
- Data size 26281474
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-27-output.txt
- Data size 26411349
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-28-output.txt
- Data size 26539480
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-29-output.txt
- Data size 26669031
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-3-output.txt
- Data size 26797967
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-30-output.txt
- Data size 26882981
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-4-output.txt
- Data size 27011557
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-5-output.txt
- Data size 27141214
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-6-output.txt
- Data size 27269621
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-7-output.txt
- Data size 27399979
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-8-output.txt
- Data size 27528023
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00007-of-00100/news.en-00007-of-00100-out-9-output.txt
- Data size 27656510
- 31
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-2-output.txt
- Data size 27786651
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-0-output.txt
- Data size 27915536
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-1-output.txt
- Data size 28045392
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-10-output.txt
- Data size 28174031
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-11-output.txt
- Data size 28303345
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-12-output.txt
- Data size 28433809
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-13-output.txt
- Data size 28563480
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-14-output.txt
- Data size 28693791
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-15-output.txt
- Data size 28822938
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-16-output.txt
- Data size 28951792
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-17-output.txt
- Data size 29079751
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-18-output.txt
- Data size 29208861
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-19-output.txt
- Data size 29336184
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-20-output.txt
- Data size 29465143
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-21-output.txt
- Data size 29594464
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-22-output.txt
- Data size 29723928
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-23-output.txt
- Data size 29853983
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-24-output.txt
- Data size 29982303
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-25-output.txt
- Data size 30110676
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-26-output.txt
- Data size 30239494
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-27-output.txt
- Data size 30368856
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-28-output.txt
- Data size 30498087
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-29-output.txt
- Data size 30627186
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-3-output.txt
- Data size 30756337
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-30-output.txt
- Data size 30846535
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-4-output.txt
- Data size 30974613
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-5-output.txt
- Data size 31104882
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-6-output.txt
- Data size 31233016
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-7-output.txt
- Data size 31361489
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-8-output.txt
- Data size 31491064
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00008-of-00100/news.en-00008-of-00100-out-9-output.txt
- Data size 31620253
- 31
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-2-output.txt
- Data size 31749021
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-0-output.txt
- Data size 31877554
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-1-output.txt
- Data size 32006579
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-10-output.txt
- Data size 32134943
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-11-output.txt
- Data size 32263697
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-12-output.txt
- Data size 32392094
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-13-output.txt
- Data size 32520776
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-14-output.txt
- Data size 32648946
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-15-output.txt
- Data size 32777232
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-16-output.txt
- Data size 32906543
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-17-output.txt
- Data size 33034524
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-18-output.txt
- Data size 33161865
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-19-output.txt
- Data size 33291150
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-20-output.txt
- Data size 33420425
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-21-output.txt
- Data size 33549104
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-22-output.txt
- Data size 33678340
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-23-output.txt
- Data size 33807206
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-24-output.txt
- Data size 33935612
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-25-output.txt
- Data size 34063347
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-26-output.txt
- Data size 34192784
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-27-output.txt
- Data size 34321947
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-28-output.txt
- Data size 34451372
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-29-output.txt
- Data size 34580332
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-3-output.txt
- Data size 34710265
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-30-output.txt
- Data size 34786171
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-4-output.txt
- Data size 34915962
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-5-output.txt
- Data size 35046789
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-6-output.txt
- Data size 35175649
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-7-output.txt
- Data size 35304020
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-8-output.txt
- Data size 35432899
- Processing file : /media/suthagar/Data/Corpus/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/tmp/pre-processed-final-files/news.en-00009-of-00100/news.en-00009-of-00100-out-9-output.txt
- Data size 35562054
- Most common words (+UNK) [['UNK', 1882574], ('The_DT', 477451), ('say_V', 378571), ('I_PRP', 181851), ('year_N', 144290)]
- Sample data [480, 3443, 7056, 282, 1040, 21802, 10833, 8265, 0, 41225] ['McCain_N', 'campaign_V', 'Nashville_N', 'Saturday_N', 'night_N.', 'Travelers_N', 'Neville_N', 'Catherine_N', 'UNK', 'trekked_N']
- 3443 campaign_V -> 480 McCain_N
- 3443 campaign_V -> 7056 Nashville_N
- 7056 Nashville_N -> 282 Saturday_N
- 7056 Nashville_N -> 3443 campaign_V
- 282 Saturday_N -> 7056 Nashville_N
- 282 Saturday_N -> 1040 night_N.
- 1040 night_N. -> 282 Saturday_N
- 1040 night_N. -> 21802 Travelers_N
- WARNING:tensorflow:From /home/suthagar/PycharmProjects/wordembedding/tensorflow/tf-basic-1.py:204: calling reduce_sum (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
- Instructions for updating:
- keep_dims is deprecated, use keepdims instead
- 2018-03-29 23:22:47.326391: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
- 2018-03-29 23:22:48.083983: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:895] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
- 2018-03-29 23:22:48.084598: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
- name: GeForce GT 740M major: 3 minor: 5 memoryClockRate(GHz): 1.0325
- pciBusID: 0000:01:00.0
- totalMemory: 1.96GiB freeMemory: 1.23GiB
- 2018-03-29 23:22:48.084624: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GT 740M, pci bus id: 0000:01:00.0, compute capability: 3.5)
- Initialized
- Average loss at step 0 : 275.34576416015625
- Nearest to even_R: inconvenient_N, feces_N, flaw_N, Heels_N, inadvertent_N, windy_N, receive_V., Transportation_N,
- Nearest to new_A: slows_N., tributary_N, Florida_N., privatise_N, NAO_N, Subaru_N, Southern_A, Samak_N,
- Nearest to woman_N: chemist_N, Mir_Hossein_Mousavi_TGRAM, day_N., dissemination_N, vandalise_V, quarrel_N, sprout_V, Shatner_N,
- Nearest to three_CD: Rashard_N, midsized_V, Aquino_N, creature_N, refusal_N, MF_N, Treasurer_N, fraudulent_N.,
- Nearest to group_N: Monegan_N, possibility_N., captive_N., shipbuilding_N, checked_N, Noonan_N, Development_N, speedboat_N,
- Nearest to US_N: proverbial_N, misunderstand_V, Trott_N, diary_A., Cultural_A, Graceland_N, bn_N, lowfat_N,
- Nearest to state_N: clenched_N, anyone_N, creditor_N., catastrophic_A., PlayStation_N, direction_N., infertility_N, nook_N,
- Nearest to But_CC: antipiracy_N, HRT_N, preach_V, analogue_N, institution_N., Archaeology_N, rheumatoid_N, selfdefence_N,
- Nearest to many_A: dabble_V, Mitt_N, That_DT., governorship_N, cyberspace_N, detergent_N, Va_N, Grainger_N,
- Nearest to Mr_N: Canadiens_N, Spotlight_N, disorganize_V, Harris_N., government_N, Crowley_N, treason_N, Area_N.,
- Nearest to next_A: investor_N., ale_N, runup_N, restrict_V., Introduced_V, forthcoming_V., nonbeliever_N, French_A.,
- Nearest to also_R: vicepresidential_N, Rahm_N, nonrecurring_V, climb_N., APS_N, yearolds_N., quo_N, flow_N.,
- Nearest to part_N: OK_N., semiautomatic_A, pear_N, Potters_N, UAE_N, Seasonal_A, measure_N., erectile_N,
- Nearest to go_V: differentiate_N, manhunt_N, attend_V., assure_V., transgression_N, usable_A, enthusiasm_N, Sending_V,
- Nearest to year_N.: together_R., Pilgrims_N, CDMA_N, entrant_N., Sheppard_N, ABS_N, superior_N, prohibit_N,
- Nearest to back_R: pricefixing_V, trace_V, Britt_N, polished_N, bulky_N, nosedive_V, Didn_N, Stade_de_France_TGRAM,
- Average loss at step 2000 : 126.08367293930054
- Average loss at step 4000 : 58.06661507368088
- Average loss at step 6000 : 36.36220999312401
- Average loss at step 8000 : 25.69445154762268
- Average loss at step 10000 : 19.276764664173125
- Nearest to even_R: flaw_N, Democrat_N, Albert_N, ingredient_N, credit_N, Transportation_N, defense_N, receive_V.,
- Nearest to new_A: Southern_A, local_N, northern_A, Iran_N., Florida_N., sought_N, withholding_N, especially_R,
- Nearest to woman_N: day_N., dissemination_N, favour_N, Bahrain_N, turn_V, pas_N, transport_N, pressure_N,
- Nearest to three_CD: Brady_N, Constitution_N, pack_N, widespread_A, creature_N, absorbed_N, press_V, arrest_V,
- Nearest to group_N: possibility_N., Development_N, checked_N, Monegan_N, captive_N., slam_N, advise_V, dress_V,
- Nearest to US_N: bn_N, thorough_N, threaten_V, UNK, force_N, fix_V, do_V., America_N.,
- Nearest to state_N: anyone_N, Secretary_Tim_Geithner_TGRAM, ensure_V, ship_N, report_V, Dutch_N, guilty_A, operating_N,
- Nearest to But_CC: excite_V, institution_N., Gross_N, count_N, sign_V, Norway_R, Connecticut_N, surround_V,
- Nearest to many_A: date_V, Mitt_N, index_N, Va_N, law_N., come_V., marketplace_N, These_DT,
- Nearest to Mr_N: government_N, everyone_N, UNK, Canadiens_N, withdraw_N, Crowley_N, voter_N, hurt_N,
- Nearest to next_A: investor_N., Sean_N, runup_N, They_PRP, TB_N, regular_A, run_V, brightly_R,
- Nearest to also_R: see_V, two_CD, UNK, subpoena_N, Neighbors_N, UBS_N, spoke_N, Steele_N,
- Nearest to part_N: measure_N., UAE_N, semiautomatic_A, high_A., phase_N, really_R, America_N, explore_V,
- Nearest to go_V: euro_N., fry_V, enthusiasm_N, On_IN, slow_V., press_N, enjoys_N, two_CD,
- Nearest to year_N.: together_R., famous_A, marked_N, nonalcoholic_A, enjoy_V, credible_A, celebrity_N, yuan_N,
- Nearest to back_R: gun_N, trace_V, Mark_N, White_N, Writers_N, suggestion_N, While_IN, Guardian_A,
- Average loss at step 12000 : 15.084676003694534
- Average loss at step 14000 : 12.499318482160568
- Average loss at step 16000 : 10.57048930168152
- Average loss at step 18000 : 9.099439950704575
- Average loss at step 20000 : 8.241738182783127
- Nearest to even_R: flaw_N, ingredient_N, unheard_A, Democrat_N, financier_N, say_V, Albert_N, youth_N.,
- Nearest to new_A: The_DT, local_N, Southern_A, Kline_V, UNK, Iran_N., say_V, Florida_N.,
- Nearest to woman_N: chemist_N, Bahrain_N, upscale_A, dissemination_N, day_N., turn_V, sprout_V, favour_N,
- Nearest to three_CD: two_CD, Constitution_N, Brady_N, absorbed_N, ushered_A, Bolshoi_N, pack_N, narcotic_N,
- Nearest to group_N: Monegan_N, possibility_N., captive_N., slam_N, Noonan_N, Jews_N., checked_N, advise_V,
- Nearest to US_N: UNK, bn_N, thorough_N, force_N, Ware_N, local_N, ample_N, The_DT,
- Nearest to state_N: clenched_N, anyone_N, Secretary_Tim_Geithner_TGRAM, alien_N, report_V, ensure_V, improperly_R, Dutch_N,
- Nearest to But_CC: UNK, The_DT, say_V, would_MD, excite_V, Connecticut_N, mecca_N, Gross_N,
- Nearest to many_A: date_V, law_N., defeat_N., detergent_N, Mitt_N, Va_N, These_DT, Mutual_A,
- Nearest to Mr_N: UNK, government_N, say_V, Canadiens_N, Spotlight_N, Oncology_N, voter_N, defend_V,
- Nearest to next_A: investor_N., Sean_N, two_CD, runup_N, They_PRP, brightly_R, regular_A, humiliate_V,
- Nearest to also_R: UNK, say_V, subpoena_N, The_DT, Neighbors_N, see_V, two_CD, festival_N.,
- Nearest to part_N: semiautomatic_A, UAE_N, measure_N., high_A., really_R, OK_N., important_A., phase_N,
- Nearest to go_V: enthusiasm_N, two_CD, make_V, The_DT, fry_V, I_PRP, euro_N., hold_V,
- Nearest to year_N.: together_R., credible_A, superior_N, nonalcoholic_A, Pilgrims_N, yuan_N, Confederate_N, famous_A,
- Nearest to back_R: pricefixing_V, trace_V, gun_N, Britt_N, midweek_N, batter_N, Mark_N, Writers_N,
- Average loss at step 22000 : 7.513536925077438
- Average loss at step 24000 : 7.122500022649765
- Average loss at step 26000 : 6.842666110515594
- Average loss at step 28000 : 6.489062655568123
- Average loss at step 30000 : 6.18164588189125
- Nearest to even_R: flaw_N, go_V, make_V, unheard_A, youth_N., Democrat_N, also_R, financier_N,
- Nearest to new_A: local_N, Kline_V, UNK, The_DT, Iran_N., plan_N, Southern_A, say_V,
- Nearest to woman_N: vandalise_V, upscale_A, turn_V, chemist_N, Bahrain_N, day_N., dissemination_N, sprout_V,
- Nearest to three_CD: two_CD, year_N, absorbed_N, postmodern_N, ushered_A, Faith_N, Constitution_N, narcotic_N,
- Nearest to group_N: Monegan_N, possibility_N., slam_N, Noonan_N, Jews_N., scarce_N, advise_V, presenter_N,
- Nearest to US_N: bn_N, thorough_N, force_N, Ware_N, local_N, UNK, say_V, The_DT,
- Nearest to state_N: clenched_N, anyone_N, Secretary_Tim_Geithner_TGRAM, gondola_N, ensure_V, acceleration_N, alien_N, Gov_Arnold_Schwarzenegger_TGRAM,
- Nearest to But_CC: The_DT, UNK, It_PRP, would_MD, say_V, Connecticut_N, excite_V, sign_V,
- Nearest to many_A: Lunar_N, law_N., defeat_N., detergent_N, date_V, Mitt_N, These_DT, That_DT.,
- Nearest to Mr_N: UNK, say_V, government_N, He_PRP, pamphlet_N, Oncology_N, Spotlight_N, Crowley_N,
- Nearest to next_A: two_CD, investor_N., last_A, Sean_N, runup_N, They_PRP, run_V, slept_N,
- Nearest to also_R: say_V, UNK, subpoena_N, He_PRP, say_V., see_V, two_CD, festival_N.,
- Nearest to part_N: semiautomatic_A, UAE_N, pear_N, really_R, one_CD, Guard_N., OK_N., measure_N.,
- Nearest to go_V: I_PRP, make_V, get_V, two_CD, even_R, UNK, like_IN, SAFC_N,
- Nearest to year_N.: together_R., credible_A, Confederate_N, Pilgrims_N, nonalcoholic_A, superior_N, yuan_N, famous_A,
- Nearest to back_R: pricefixing_V, gun_N, trace_V, Britt_N, midweek_N, batter_N, Mark_N, Stade_de_France_TGRAM,
- Average loss at step 32000 : 6.049046798706055
- Average loss at step 34000 : 5.990114219665528
- Average loss at step 36000 : 5.786115090847016
- Average loss at step 38000 : 5.6813979418277745
- Average loss at step 40000 : 5.603810606241226
- Nearest to even_R: go_V, flaw_N, make_V, youth_N., also_R, ingredient_N, unheard_A, redundancy_N.,
- Nearest to new_A: UNK, Kline_V, local_N, plan_N, would_MD, The_DT, Iran_N., sought_N,
- Nearest to woman_N: vandalise_V, upscale_A, turn_V, Bahrain_N, enigma_N, alias_N, Robin_van_Persie_TGRAM, dream_N.,
- Nearest to three_CD: two_CD, year_N, last_A, next_A, absorbed_N, postmodern_N, time_N, ushered_A,
- Nearest to group_N: Monegan_N, possibility_N., company_N, slam_N, also_R, presenter_N, scarce_N, captive_N.,
- Nearest to US_N: force_N, say_V, bn_N, Ware_N, thorough_N, local_N, assault_N., collect_N,
- Nearest to state_N: clenched_N, anyone_N, gondola_N, ensure_V, alien_N, acceleration_N, Secretary_Tim_Geithner_TGRAM, Gov_Arnold_Schwarzenegger_TGRAM,
- Nearest to But_CC: The_DT, It_PRP, would_MD, I_PRP, UNK, He_PRP, In_IN, say_V,
- Nearest to many_A: Lunar_N, detergent_N, defeat_N., would_MD, law_N., date_V, These_DT, Mitt_N,
- Nearest to Mr_N: UNK, He_PRP, also_R, say_V, government_N, appellate_N, pamphlet_N, contention_N,
- Nearest to next_A: two_CD, last_A, investor_N., three_CD, Sean_N, first_R, five_CD, runup_N,
- Nearest to also_R: say_V, UNK, say_V., He_PRP, subpoena_N, OTCBB_N, one_CD, WAM_N,
- Nearest to part_N: pear_N, semiautomatic_A, one_CD, UAE_N, really_R, Guard_N., OK_N., exhilarate_V,
- Nearest to go_V: get_V, I_PRP, make_V, like_IN, even_R, one_CD, long_R, SAFC_N,
- Nearest to year_N.: year_N, together_R., Confederate_N, credible_A, nonalcoholic_A, Pilgrims_N, superior_N, month_N,
- Nearest to back_R: pricefixing_V, gun_N, trace_V, Britt_N, midweek_N, Mark_N, batter_N, redshirted_V,
- Average loss at step 42000 : 5.543679899215698
- Average loss at step 44000 : 5.472415296077728
- Average loss at step 46000 : 5.4293562195301055
- Average loss at step 48000 : 5.3920544502735135
- Average loss at step 50000 : 5.359578889608383
- Nearest to even_R: go_V, flaw_N, make_V, also_R, youth_N., I_PRP, one_CD, redundancy_N.,
- Nearest to new_A: plan_N, local_N, would_MD, UNK, Kline_V, sought_N, Iran_N., service_N,
- Nearest to woman_N: vandalise_V, Bahrain_N, upscale_A, alias_N, turn_V, enigma_N, Umberger_N, Burgundy_N,
- Nearest to three_CD: two_CD, year_N, last_A, four_CD, one_CD, next_A, Faith_N, time_N,
- Nearest to group_N: Monegan_N, possibility_N., company_N, captive_N., also_R, slam_N, batter_N, timetable_N.,
- Nearest to US_N: force_N, Ware_N, assault_N., bn_N, collect_N, say_V, thorough_N, local_N,
- Nearest to state_N: clenched_N, ensure_V, gondola_N, anyone_N, acceleration_N, alien_N, Secretary_Tim_Geithner_TGRAM, possible_A,
- Nearest to But_CC: It_PRP, The_DT, He_PRP, In_IN, would_MD, I_PRP, UNK, And_CC,
- Nearest to many_A: Lunar_N, would_MD, detergent_N, defeat_N., get_V, These_DT, And_CC, law_N.,
- Nearest to Mr_N: UNK, He_PRP, Ms_N, also_R, But_CC, appellate_N, say_V, defend_V,
- Nearest to next_A: last_A, two_CD, investor_N., first_R, three_CD, five_CD, four_CD, take_V,
- Nearest to also_R: say_V, UNK, say_V., He_PRP, OTCBB_N, one_CD, subpoena_N, see_V,
- Nearest to part_N: pear_N, one_CD, semiautomatic_A, UAE_N, Guard_N., gas_N, really_R, exhilarate_V,
- Nearest to go_V: get_V, make_V, I_PRP, one_CD, even_R, like_IN, come_V, But_CC,
- Nearest to year_N.: year_N, Confederate_N, month_N, together_R., credible_A, superior_N, nonalcoholic_A, Pilgrims_N,
- Nearest to back_R: pricefixing_V, gun_N, Mark_N, Britt_N, trace_V, midweek_N, redshirted_V, batter_N,
- Average loss at step 52000 : 5.3201841595172885
- Average loss at step 54000 : 5.28888946723938
- Average loss at step 56000 : 5.280606016874313
- Average loss at step 58000 : 5.269149966955185
- Average loss at step 60000 : 5.2402068099975585
- Nearest to even_R: go_V, make_V, also_R, flaw_N, one_CD, I_PRP, But_CC, Heels_N,
- Nearest to new_A: plan_N, UNK, local_N, Kline_V, would_MD, sought_N, inalienable_A, also_R,
- Nearest to woman_N: vandalise_V, upscale_A, alias_N, Bahrain_N, people_N, turn_V, Burgundy_N, Robin_van_Persie_TGRAM,
- Nearest to three_CD: two_CD, four_CD, last_A, one_CD, six_CD, year_N, time_N, next_A,
- Nearest to group_N: Monegan_N, company_N, possibility_N., captive_N., also_R, timetable_N., government_N, member_N,
- Nearest to US_N: force_N, Ware_N, assault_N., collect_N, improves_N., bn_N, local_N, thorough_N,
- Nearest to state_N: ensure_V, gondola_N, clenched_N, acceleration_N, anyone_N, primary_N., government_N, alien_N,
- Nearest to But_CC: The_DT, It_PRP, He_PRP, In_IN, I_PRP, And_CC, would_MD, UNK,
- Nearest to many_A: get_V, Lunar_N, would_MD, detergent_N, defeat_N., These_DT, enough_R, And_CC,
- Nearest to Mr_N: He_PRP, UNK, Ms_N, also_R, But_CC, appellate_N, defend_V, contention_N,
- Nearest to next_A: last_A, two_CD, investor_N., first_R, three_CD, five_CD, take_V, Sean_N,
- Nearest to also_R: say_V, UNK, say_V., OTCBB_N, one_CD, He_PRP, government_N, add_V,
- Nearest to part_N: pear_N, one_CD, UAE_N, semiautomatic_A, city_N, really_R, Ozawa_N, Guard_N.,
- Nearest to go_V: get_V, I_PRP, make_V, come_V, like_IN, one_CD, even_R, But_CC,
- Nearest to year_N.: year_N, month_N, Confederate_N, together_R., day_N, credible_A, superior_N, theatre_N.,
- Nearest to back_R: pricefixing_V, gun_N, Britt_N, redshirted_V, trace_V, Mark_N, Guardian_A, batter_N,
- Average loss at step 62000 : 5.213798620462418
- Average loss at step 64000 : 5.21542168712616
- Average loss at step 66000 : 5.208661090612411
- Average loss at step 68000 : 5.197546833992004
- Average loss at step 70000 : 5.172291265249252
- Nearest to even_R: go_V, make_V, still_R, also_R, But_CC, I_PRP, get_V, flaw_N,
- Nearest to new_A: plan_N, would_MD, UNK, also_R, sought_N, Kline_V, local_N, inalienable_A,
- Nearest to woman_N: people_N, vandalise_V, alias_N, Bahrain_N, upscale_A, turn_V, Burgundy_N, Umberger_N,
- Nearest to three_CD: two_CD, four_CD, six_CD, one_CD, last_A, year_N, seven_CD, next_A,
- Nearest to group_N: company_N, also_R, Monegan_N, possibility_N., captive_N., government_N, member_N, timetable_N.,
- Nearest to US_N: force_N, Ware_N, collect_N, improves_N., assault_N., bn_N, government_N, local_N,
- Nearest to state_N: ensure_V, government_N, gondola_N, primary_N., clenched_N, acceleration_N, anyone_N, crumple_V,
- Nearest to But_CC: It_PRP, The_DT, He_PRP, In_IN, And_CC, I_PRP, We_PRP, That_DT,
- Nearest to many_A: get_V, enough_R, Lunar_N, one_CD, defeat_N., would_MD, people_N, And_CC,
- Nearest to Mr_N: He_PRP, UNK, Ms_N, But_CC, also_R, appellate_N, Lansing_V, say_V,
- Nearest to next_A: last_A, two_CD, three_CD, investor_N., first_R, five_CD, take_V, Sean_N,
- Nearest to also_R: say_V, UNK, say_V., government_N, OTCBB_N, add_V, one_CD, company_N,
- Nearest to part_N: pear_N, one_CD, UAE_N, America_N, UNK, semiautomatic_A, include_V, city_N,
- Nearest to go_V: get_V, come_V, make_V, I_PRP, even_R, see_V, one_CD, like_IN,
- Nearest to year_N.: year_N, month_N, Confederate_N, day_N, together_R., week_N, credible_A, theatre_N.,
- Nearest to back_R: pricefixing_V, redshirted_V, Britt_N, gun_N, bigot_N, go_V, trace_V, Mark_N,
- Average loss at step 72000 : 5.162850735664367
- Average loss at step 74000 : 5.160452203989029
- Average loss at step 76000 : 5.1545262966156
- Average loss at step 78000 : 5.135925965070724
- Average loss at step 80000 : 5.12542846083641
- Nearest to even_R: go_V, make_V, still_R, one_CD, get_V, come_V, really_R, But_CC,
- Nearest to new_A: plan_N, would_MD, also_R, UNK, sought_N, Kline_V, inalienable_A, Samak_N,
- Nearest to woman_N: people_N, man_N, alias_N, vandalise_V, Bahrain_N, upscale_A, Burgundy_N, Robin_van_Persie_TGRAM,
- Nearest to three_CD: two_CD, four_CD, six_CD, one_CD, last_A, seven_CD, next_A, first_R,
- Nearest to group_N: company_N, also_R, Monegan_N, possibility_N., government_N, captive_N., member_N, Xbox_N.,
- Nearest to US_N: force_N, Ware_N, collect_N, improves_N., assault_N., bn_N, top_N, consumer_N.,
- Nearest to state_N: government_N, ensure_V, gondola_N, primary_N., authority_N, acceleration_N, clenched_N, crumple_V,
- Nearest to But_CC: It_PRP, The_DT, He_PRP, And_CC, In_IN, That_DT, I_PRP, We_PRP,
- Nearest to many_A: one_CD, get_V, people_N, would_MD, need_N, Lunar_N, enough_R, keep_V,
- Nearest to Mr_N: He_PRP, Ms_N, also_R, UNK, But_CC, appellate_N, cleft_N, Lansing_V,
- Nearest to next_A: last_A, two_CD, three_CD, first_R, investor_N., five_CD, take_V, four_CD,
- Nearest to also_R: say_V, UNK, say_V., one_CD, add_V, government_N, OTCBB_N, Mr_N,
- Nearest to part_N: one_CD, pear_N, UAE_N, include_V, America_N, city_N, many_A, semiautomatic_A,
- Nearest to go_V: get_V, come_V, make_V, I_PRP, one_CD, see_V, even_R, like_IN,
- Nearest to year_N.: year_N, month_N, day_N, week_N, month_N., Confederate_N, together_R., theatre_N.,
- Nearest to back_R: pricefixing_V, go_V, away_R, redshirted_V, gun_N, Britt_N, bigot_N, trace_V,
- Average loss at step 82000 : 5.12895014333725
- Average loss at step 84000 : 5.11982547044754
- Average loss at step 86000 : 5.120021118879318
- Average loss at step 88000 : 5.109159954071045
- Average loss at step 90000 : 5.107910221815109
- Nearest to even_R: go_V, still_R, come_V, one_CD, get_V, make_V, also_R, need_N,
- Nearest to new_A: plan_N, UNK, would_MD, sought_N, also_R, company_N, development_N, inalienable_A,
- Nearest to woman_N: people_N, man_N, upscale_A, alias_N, Bahrain_N, turn_V, vandalise_V, Burgundy_N,
- Nearest to three_CD: two_CD, four_CD, six_CD, one_CD, five_CD, last_A, seven_CD, eight_CD,
- Nearest to group_N: company_N, Monegan_N, also_R, government_N, possibility_N., leader_N, Xbox_N., captive_N.,
- Nearest to US_N: force_N, collect_N, Ware_N, improves_N., assault_N., bn_N, consumer_N., local_N,
- Nearest to state_N: government_N, ensure_V, gondola_N, primary_N., authority_N, acceleration_N, Spanishlanguage_N, Thursday_N,
- Nearest to But_CC: It_PRP, The_DT, He_PRP, And_CC, That_DT, In_IN, We_PRP, They_PRP,
- Nearest to many_A: one_CD, get_V, people_N, keep_V, need_N, life_N, enough_R, may_MD,
- Nearest to Mr_N: He_PRP, Ms_N, But_CC, also_R, UNK, appellate_N, cleft_N, Lansing_V,
- Nearest to next_A: last_A, two_CD, three_CD, first_R, five_CD, investor_N., take_V, four_CD,
- Nearest to also_R: UNK, say_V, say_V., government_N, add_V, company_N, Eng_N, OTCBB_N,
- Nearest to part_N: one_CD, pear_N, many_A, include_V, UAE_N, city_N, America_N, semiautomatic_A,
- Nearest to go_V: get_V, come_V, I_PRP, see_V, make_V, one_CD, even_R, well_R,
- Nearest to year_N.: year_N, month_N, day_N, week_N, month_N., Confederate_N, say_V., together_R.,
- Nearest to back_R: pricefixing_V, go_V, away_R, redshirted_V, bigot_N, gun_N, home_N, overtly_R,
- Average loss at step 92000 : 5.101416747093201
- Average loss at step 94000 : 5.08804456782341
- Average loss at step 96000 : 5.087678480386734
- Average loss at step 98000 : 5.088137490510941
- Average loss at step 100000 : 5.083970499515534
- Nearest to even_R: go_V, still_R, come_V, one_CD, make_V, get_V, also_R, need_N,
- Nearest to new_A: plan_N, would_MD, sought_N, company_N, well_R, inalienable_A, also_R, development_N,
- Nearest to woman_N: people_N, man_N, men_N, child_N, upscale_A, turn_V, Bahrain_N, Burgundy_N,
- Nearest to three_CD: two_CD, four_CD, six_CD, five_CD, one_CD, last_A, seven_CD, eight_CD,
- Nearest to group_N: company_N, Monegan_N, government_N, also_R, leader_N, possibility_N., captive_N., party_N,
- Nearest to US_N: force_N, collect_N, improves_N., Ware_N, assault_N., bn_N, consumer_N., dispatch_V,
- Nearest to state_N: government_N, ensure_V, authority_N, gondola_N, primary_N., party_N, official_N, Siddle_N,
- Nearest to But_CC: It_PRP, The_DT, He_PRP, And_CC, In_IN, That_DT, We_PRP, They_PRP,
- Nearest to many_A: one_CD, get_V, people_N, need_N, keep_V, may_MD, life_N, still_R,
- Nearest to Mr_N: He_PRP, Ms_N, also_R, But_CC, UNK, appellate_N, cleft_N, contention_N,
- Nearest to next_A: last_A, two_CD, three_CD, first_R, five_CD, four_CD, investor_N., later_R,
- Nearest to also_R: say_V, UNK, say_V., add_V, government_N, Mr_N, company_N, Eng_N,
- Nearest to part_N: one_CD, many_A, pear_N, include_V, city_N, UAE_N, America_N, somewhat_R.,
- Nearest to go_V: get_V, come_V, see_V, I_PRP, make_V, well_R, even_R, still_R,
- Nearest to year_N.: year_N, month_N, week_N, month_N., day_N, Confederate_N, say_V., week_N.,
- Nearest to back_R: pricefixing_V, go_V, away_R, home_N, redshirted_V, get_V, bigot_N, gun_N,
- plot saved
- Process finished with exit code 0