Clustering images of cats and dogs with k-means and then building a binary classification CNN on the Keras Cats vs Dogs dataset involves several steps. Note that k-means is not typically used for image classification: it is an unsupervised learning method with no notion of "cat" or "dog." It groups images by pixel-level similarity, which may not align with semantic categories.

For educational purposes, let's proceed with a simplified example: we use k-means to cluster images into two groups (which might roughly correspond to cats and dogs), then build a binary classification CNN with Keras.
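
To make "pixel-level similarity" concrete, here is a minimal sketch that uses synthetic arrays rather than real photos. The 8x8 "images" and the helper name `pixel_distance` are illustrative assumptions, not part of the dataset or of any library. It suggests why Euclidean distance between flattened images, the quantity k-means works with, can be driven by overall brightness more than by the shape in the picture.

```python
import numpy as np

def pixel_distance(a, b):
    """Euclidean distance between two images after flattening them to vectors."""
    return float(np.linalg.norm(a.ravel() - b.ravel()))

# Hypothetical 8x8 grayscale "images" with values in [0, 1].
square = np.zeros((8, 8))
square[3:5, 3:5] = 1.0               # bright 2x2 square on a dark background

bright_square = square * 0.5 + 0.5   # same square, background lifted to 0.5

bar = np.zeros((8, 8))
bar[0, :4] = 1.0                     # different shape, same dark background

same_shape = pixel_distance(square, bright_square)   # differs only in brightness
same_background = pixel_distance(square, bar)        # differs only in shape

print(f"same shape, brighter background: {same_shape:.3f}")
print(f"different shape, same background: {same_background:.3f}")
```

In this toy case the brighter copy of the same shape ends up farther away than a different shape on the same background, so a pixel-based clustering would tend to group by lighting rather than by content.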
### Step 1: Load the Dataset

First, load the dataset. Cats vs Dogs is not bundled with `tf.keras.datasets`, so we download a filtered copy as a zip archive and read it from disk with `ImageDataGenerator.flow_from_directory`.
```python
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import os
import numpy as np
from sklearn.cluster import KMeans
import shutil

# Download and extract the dataset.
# The leading "!" runs a shell command, so these two lines need a Jupyter/Colab notebook.
!wget https://storage.googleapis.com/tensorflow-1-public/course2/cats_and_dogs_filtered.zip -O /tmp/cats_and_dogs_filtered.zip
!unzip -q /tmp/cats_and_dogs_filtered.zip -d /tmp

base_dir = '/tmp/cats_and_dogs_filtered'
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')

# Image data generators for the training and validation datasets
# (rescale maps pixel values from [0, 255] to [0, 1])
train_datagen = ImageDataGenerator(rescale=1./255)
val_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary'
)

validation_generator = val_datagen.flow_from_directory(
    validation_dir,
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary'
)
```
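
Before training, it can help to confirm that the extracted folders look the way `flow_from_directory` expects: one subfolder per class, with the images inside. The sketch below is a hedged sanity check using only the standard library. The helper name `count_images_per_class` and the extension list are assumptions made for illustration, and the function only looks at files directly inside each class folder.

```python
import os

# Assumed set of image extensions; adjust it to match your files.
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png', '.bmp')

def count_images_per_class(root_dir):
    """Return {class_folder_name: image_count} for root_dir/<class_name>/<images>."""
    counts = {}
    for class_name in sorted(os.listdir(root_dir)):
        class_path = os.path.join(root_dir, class_name)
        if not os.path.isdir(class_path):
            continue  # skip stray files at the top level
        counts[class_name] = sum(
            1 for name in os.listdir(class_path)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts
```

Calling `count_images_per_class(train_dir)` should list one entry per class folder, and roughly equal counts indicate a balanced dataset. The mapping from folder name to numeric label is available as `train_generator.class_indices`.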
### Step 2: Flatten Images and Apply K-Means

For simplicity, we flatten each image into a vector and apply k-means clustering. Raw pixel vectors are a poor basis for separating semantic categories such as cats and dogs, but the approach is useful for educational purposes.
```python
# The training generator shuffles its batches, so batch order does not match
# train_generator.filepaths. Sample file paths directly to keep images and paths aligned.
rng = np.random.default_rng(42)
sample_paths = rng.choice(train_generator.filepaths, size=200, replace=False)  # subset for simplicity

# Load each image, scale it to [0, 1], and flatten it into one row per image
flattened_images = np.stack([
    tf.keras.utils.img_to_array(
        tf.keras.utils.load_img(path, target_size=(150, 150))
    ).flatten() / 255.0
    for path in sample_paths
])

# Apply k-means clustering (n_init is set explicitly because its default differs across scikit-learn versions)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
labels = kmeans.fit_predict(flattened_images)

# Create directories for clustered images
cluster_dirs = ['/tmp/cluster_0', '/tmp/cluster_1']
for cluster_dir in cluster_dirs:
    os.makedirs(cluster_dir, exist_ok=True)

# Copy each image into the folder of its assigned cluster
for src_image_path, label in zip(sample_paths, labels):
    filename = os.path.basename(src_image_path)
    shutil.copy(src_image_path, os.path.join(cluster_dirs[label], filename))

print(f"Clustered images into {cluster_dirs[0]} and {cluster_dirs[1]}")
```
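
Because cluster IDs are arbitrary, cluster 0 is not guaranteed to mean "cat". One common way to check how well the clusters line up with the real classes is purity: for each cluster, count its most frequent true class, then divide the total by the number of samples. The following is a minimal sketch; the function name `cluster_purity` and the toy labels are assumptions made for illustration.

```python
from collections import Counter

def cluster_purity(true_labels, cluster_labels):
    """Fraction of samples that belong to the majority true class of their cluster."""
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label sequences must have the same length")
    if len(true_labels) == 0:
        return 0.0
    per_cluster = {}
    for truth, cluster in zip(true_labels, cluster_labels):
        per_cluster.setdefault(cluster, Counter())[truth] += 1
    majority_total = sum(c.most_common(1)[0][1] for c in per_cluster.values())
    return majority_total / len(true_labels)

# Toy example with made-up labels: each cluster holds two of one class and one of the other.
toy_truth = ['cats', 'cats', 'dogs', 'dogs', 'dogs', 'cats']
toy_clusters = [0, 0, 1, 1, 0, 1]
toy_purity = cluster_purity(toy_truth, toy_clusters)
print(f"purity = {toy_purity:.3f}")
```

For the images clustered in this step, the true class of each file can be read from the name of its source folder (`cats` or `dogs`). With two balanced classes, a purity close to 0.5 means the clusters are no better than chance.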
### Step 3: Build a Binary Classification CNN

Now, let's build and train a binary classification CNN on the original labeled dataset.
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

# Define the model architecture
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(512, activation='relu'),
    Dropout(0.5),
    Dense(1, activation='sigmoid')
])

# Compile the model
model.compile(
    loss='binary_crossentropy',
    optimizer='adam',
    metrics=['accuracy']
)

# Train the model
history = model.fit(
    train_generator,
    steps_per_epoch=100,
    epochs=30,
    validation_data=validation_generator,
    validation_steps=50
)
```
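
To see how the layer stack above shrinks a 150x150 input, the arithmetic can be worked through by hand. The sketch below assumes the Keras defaults the model relies on: `Conv2D` with `'valid'` padding and stride 1, and `MaxPooling2D` with a stride equal to its pool size. The helper names are illustrative and are not Keras functions.

```python
def conv_output_size(size, kernel=3):
    """Spatial size after a 'valid' (unpadded), stride-1 convolution."""
    return size - kernel + 1

def pool_output_size(size, pool=2):
    """Spatial size after non-overlapping max pooling (rounds down)."""
    return size // pool

size = 150
sizes = []  # (spatial size, channels) after each Conv2D + MaxPooling2D pair
for filters in (32, 64, 128):
    size = pool_output_size(conv_output_size(size))
    sizes.append((size, filters))

final_size, final_filters = sizes[-1]
flattened_units = final_size * final_size * final_filters
dense_params = flattened_units * 512 + 512  # weights plus biases of Dense(512)

print(sizes)            # [(74, 32), (36, 64), (17, 128)]
print(flattened_units)  # 36992
print(dense_params)     # 18940416
```

Under these assumptions, most of the model's parameters sit in the first `Dense` layer, which is one reason `Dropout` is placed right after it. `model.summary()` reports the same shapes and counts for comparison.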
### Conclusion

In this example, we applied k-means clustering to a subset of the cat and dog images, which is not an ideal approach for such a task, and then built and trained a binary classification CNN with Keras on the labeled data.

For practical purposes, use labeled data with supervised learning methods such as CNNs or other deep learning models.