Untitled

a guest
Nov 12th, 2024
Building a k-means clustering step to automatically sort images of cats and dogs, and then training a binary classification CNN on the Keras Cats vs Dogs dataset, involves several steps. Note, however, that k-means is not typically used for image classification: it is an unsupervised method with no built-in notion of "cat" or "dog." Instead, it groups images by pixel-level similarity, which may not align with semantic categories.

For educational purposes, let's proceed with a simplified example: we use k-means to cluster images into two groups (which might roughly correspond to cats and dogs), then build a binary classification CNN using Keras.

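To make the point about pixel-level similarity concrete, here is a minimal sketch on synthetic data. Everything in it is an assumption for illustration: the toy 4x4 "images", the hypothetical class marker stored in one corner pixel, and the small `two_means` helper, which is a hand-written version of Lloyd's algorithm rather than scikit-learn's `KMeans`.

```python
import numpy as np

def two_means(X, n_iter=10):
    """Minimal Lloyd's algorithm for k=2, seeded with the darkest and brightest rows."""
    brightness = X.mean(axis=1)
    centers = X[[brightness.argmin(), brightness.argmax()]].astype(float)
    for _ in range(n_iter):
        # Assign each row to its nearest center (squared Euclidean distance)
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        # Move each center to the mean of the rows assigned to it
        for k in range(2):
            if np.any(assign == k):
                centers[k] = X[assign == k].mean(axis=0)
    return assign

# Toy 4x4 grayscale "images": the hypothetical class is encoded by a single
# corner pixel, while overall brightness is unrelated to the class.
rng = np.random.default_rng(0)
classes = np.array([0, 1, 0, 1, 0, 1, 0, 1])
is_bright = np.array([0, 0, 0, 0, 1, 1, 1, 1])
images = np.where(is_bright[:, None, None] == 1, 0.8, 0.2) + rng.uniform(-0.05, 0.05, (8, 4, 4))
images[:, 0, 0] = classes  # class marker in the top-left pixel

X = images.reshape(len(images), -1)  # flatten to shape (8, 16)
clusters = two_means(X)
print("clusters:  ", clusters)
print("brightness:", is_bright)
print("class:     ", classes)
```

On this toy data the cluster assignment matches the brightness split rather than the class marker, because fifteen pixels that differ a little outweigh one pixel that differs a lot. Real photos of cats and dogs can behave similarly, with lighting and background dominating the distance between flattened images.
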
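### Step 1: Load the Dataset

First, load the dataset. We download the filtered Cats vs Dogs archive with `wget`, unzip it, and read the images from disk with Keras's `ImageDataGenerator`. The `!` shell commands assume a notebook environment such as Jupyter or Colab.
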
```python
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import os
import numpy as np
from sklearn.cluster import KMeans
import shutil

# Download and extract the dataset
!wget https://storage.googleapis.com/tensorflow-1-public/course2/cats_and_dogs_filtered.zip -O /tmp/cats_and_dogs_filtered.zip
!unzip -q /tmp/cats_and_dogs_filtered.zip -d /tmp

base_dir = '/tmp/cats_and_dogs_filtered'
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')

# Image data generator for training and validation datasets
train_datagen = ImageDataGenerator(rescale=1./255)
val_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary'
)

validation_generator = val_datagen.flow_from_directory(
    validation_dir,
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary'
)
```

### Step 2: Flatten Images and Apply K-Means

For simplicity, we flatten each image into a vector and apply k-means clustering to the raw pixel values. This is not an ideal approach for grouping images by semantic content, but it is useful for educational purposes.

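As a quick sketch of what flattening does to the data, the snippet below reshapes a placeholder batch (all zeros, standing in for real images) at the 150x150 RGB size used in this example. The variable names are illustrative only.

```python
import numpy as np

# A placeholder batch of 4 RGB images at the 150x150 size used in this example
batch = np.zeros((4, 150, 150, 3), dtype=np.float32)

# Flatten each image into one row: height * width * channels features
flat = batch.reshape(len(batch), -1)
print(flat.shape)  # (4, 67500)
```

Each image becomes a 67,500-dimensional vector, and the spatial arrangement of pixels is no longer visible to k-means, which only compares the vectors position by position.
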
```python
# Sample a random subset of training files so both classes are represented.
# Loading by path keeps row i of the feature matrix aligned with subset_paths[i];
# train_generator shuffles its batches, so its batch order does not match
# train_generator.filepaths.
rng = np.random.default_rng(42)
subset_paths = rng.choice(train_generator.filepaths, size=200, replace=False)

# Load, resize, and rescale each image the same way the generator does
images = [
    tf.keras.utils.img_to_array(
        tf.keras.utils.load_img(path, target_size=(150, 150))
    ) / 255.0
    for path in subset_paths
]

# Flatten each image and stack them into a (200, 67500) numpy array
flattened_images = np.vstack([img.flatten() for img in images])

# Apply k-means clustering
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
labels = kmeans.fit_predict(flattened_images)

# Create directories for clustered images
cluster_dir_0 = '/tmp/cluster_0'
cluster_dir_1 = '/tmp/cluster_1'

os.makedirs(cluster_dir_0, exist_ok=True)
os.makedirs(cluster_dir_1, exist_ok=True)

# Copy each image to the folder of its assigned cluster
for src_image_path, label in zip(subset_paths, labels):
    filename = os.path.basename(src_image_path)
    if label == 0:
        dst_image_path = os.path.join(cluster_dir_0, filename)
    else:
        dst_image_path = os.path.join(cluster_dir_1, filename)
    shutil.copy(src_image_path, dst_image_path)

print(f"Clustered images into {cluster_dir_0} and {cluster_dir_1}")
```

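Since the two clusters only "might roughly correspond" to cats and dogs, it can help to measure how well they actually do. The sketch below is one possible check, not part of the original walkthrough: `cluster_agreement` is a hypothetical helper, and the example arrays are made-up values rather than results from the dataset.

```python
import numpy as np

def cluster_agreement(cluster_labels, true_labels):
    """Best-case accuracy of a 2-cluster assignment against binary true labels.

    Cluster ids are arbitrary, so both possible mappings are tried.
    """
    cluster_labels = np.asarray(cluster_labels)
    true_labels = np.asarray(true_labels)
    direct = np.mean(cluster_labels == true_labels)
    return float(max(direct, 1.0 - direct))

# Hypothetical values for illustration only (not results from the dataset)
example_clusters = [0, 0, 1, 1, 1, 0]
example_truth = [1, 1, 0, 0, 1, 0]
print(cluster_agreement(example_clusters, example_truth))
```

For the clustered files, the true labels could be derived from each source file's parent directory name, assuming the `train/cats` and `train/dogs` layout of the extracted archive. A score near 0.5 would suggest the clusters are unrelated to species, while a score near 1.0 would suggest they line up well.
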
### Step 3: Build a Binary Classification CNN

Now, let's build and train a binary classification CNN using the original dataset.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

# Define the model architecture: three conv/pool stages, then a dense classifier
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(512, activation='relu'),
    Dropout(0.5),
    Dense(1, activation='sigmoid')  # single probability output for the binary label
])

# Compile the model
model.compile(
    loss='binary_crossentropy',
    optimizer='adam',
    metrics=['accuracy']
)

# Train the model
history = model.fit(
    train_generator,
    steps_per_epoch=100,   # 2,000 training images / batch size 20
    epochs=30,
    validation_data=validation_generator,
    validation_steps=50    # 1,000 validation images / batch size 20
)
```

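To see how the image shrinks as it passes through the three convolution and pooling stages, the arithmetic can be worked through in plain Python. This is a sketch that assumes the Keras defaults used above (`'valid'` padding and stride 1 for `Conv2D`, pooling stride equal to the pool size); `conv_out` and `pool_out` are hypothetical helper names.

```python
def conv_out(size, kernel=3):
    """Output size of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of max pooling with stride equal to the pool size."""
    return size // pool

size = 150
for filters in (32, 64, 128):
    size = pool_out(conv_out(size))
    print(f"after Conv2D({filters}) + MaxPooling2D: {size}x{size}x{filters}")

flat_features = size * size * 128
dense_params = flat_features * 512 + 512  # weights + biases of Dense(512)
print("flattened features:", flat_features)
print("Dense(512) parameters:", dense_params)
```

The spatial size goes 150 → 74 → 36 → 17, so `Flatten()` produces 17 * 17 * 128 = 36,992 features, and the `Dense(512)` layer alone holds about 18.9 million parameters. Calling `model.summary()` on the model above is the usual way to confirm these numbers.
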
### Conclusion

In this example, we used k-means to cluster a subset of the cat and dog images (not the ideal approach for this task, since the clusters follow pixel similarity rather than species). We then built a binary classification CNN with Keras and trained it on the labeled images to distinguish cats from dogs.

For practical purposes, use labeled data with supervised learning methods such as CNNs or other deep learning models.