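from keras.preprocessing.sequence import pad_sequences
from keras.utils import to_categorical


def create_sequences(tokenizer, max_length, descriptions, photos):
    """Build (photo, input sequence, next word) training samples from every description.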

    X1,     X2 (text sequence),                         y (word)
    photo   startseq,                                   little
    photo   startseq, little,                           girl
    photo   startseq, little, girl,                     running
    photo   startseq, little, girl, running,            in
    photo   startseq, little, girl, running, in,        field
    photo   startseq, little, girl, running, in, field, endseq

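    :param tokenizer: fitted Keras tokenizer used to encode each description.
    :param max_length: length to which every input sequence is padded.
    :param descriptions: dict mapping an image identifier to its list of descriptions.
    :param photos: dict mapping an image identifier to its photo feature array.
    :return: None; the sample counts are printed and the array return is commented out.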
    """
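    X1, X2, y = [], [], []
    # Assumption: vocab_size was a module-level global in the original; derive it
    # here instead. Keras word indices start at 1, so add 1 for the padding index 0.
    vocab_size = len(tokenizer.word_index) + 1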
    # Walk through each image identifier.
    for desc_key, desc_list in descriptions.items():
        # Walk through each description for the image.
        for desc in desc_list:
            # Encode the sequence.
            seq = tokenizer.texts_to_sequences([desc])[0]
            # Split one sequence into multiple X,Y pairs.
            for i in range(1, len(seq)):
                # Split into input and output pair.
                in_seq, out_seq = seq[:i], seq[i]
                # Pad input sequence.
                in_seq = pad_sequences([in_seq], maxlen=max_length)[0]
                # One-hot encode the output word.
                out_seq = to_categorical([out_seq], num_classes=vocab_size)[0]
                # Store.
                X1.append(photos[desc_key][0])
                X2.append(in_seq)
                y.append(out_seq)
    # Report the number of samples and the type of one photo feature.
    print('%d %d %d' % (len(X1), len(X2), len(y)))
    print(type(X1[0]))
    # return array(X1), array(X2), array(y)

Dataset: 6000 train images.
Descriptions: train=6000
Vocabulary Size: 7579
Photos: train=6000
Description Length: 34
Preparing text sequences for training.
306404 306404 306404
<type 'numpy.ndarray'>
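
The expansion shown in the docstring table can be reproduced on a single caption without Keras. The following is a minimal sketch, not the original pipeline: the toy `word_index` and the helper `split_caption` are hypothetical names introduced for illustration, and plain NumPy stands in for `pad_sequences` (using its default left padding) and `to_categorical`.

```python
import numpy as np

# Hypothetical toy vocabulary (not the real tokenizer's); index 0 is kept for padding.
word_index = {"startseq": 1, "little": 2, "girl": 3, "running": 4,
              "in": 5, "field": 6, "endseq": 7}
vocab_size = len(word_index) + 1


def split_caption(seq, max_length, vocab_size):
    """Expand one encoded caption into (padded prefix, one-hot next word) pairs."""
    inputs, targets = [], []
    for i in range(1, len(seq)):
        prefix, next_word = seq[:i], seq[i]
        # Keep at most the last max_length tokens, then left-pad with zeros.
        prefix = prefix[-max_length:]
        padded = [0] * (max_length - len(prefix)) + prefix
        # One-hot vector with a 1 at the index of the word to predict.
        one_hot = np.zeros(vocab_size, dtype="float32")
        one_hot[next_word] = 1.0
        inputs.append(padded)
        targets.append(one_hot)
    return np.array(inputs), np.array(targets)


caption = "startseq little girl running in field endseq"
seq = [word_index[word] for word in caption.split()]
X2, y = split_caption(seq, max_length=10, vocab_size=vocab_size)
print(X2.shape, y.shape)  # (6, 10) (6, 8)
```

A caption of n tokens yields n - 1 samples, so the sample count grows with the total number of caption tokens rather than with the number of images (306404 rows for 6000 images in the run above).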