Classification made easy

Hello everyone,

I will show you the simplest way to train a TensorFlow/Keras classification model and then make a prediction with it.

And all of this on your local machine :)

This will include setting up Python because a broken Python is a great source of misery.

Please have a look at the chapters to save yourself some time!

This machine is completely clean to start with.

vlcsnap-2025-02-22-21h27m54s050.png

Let's fire up the command line and type 'python'.

vlcsnap-2025-02-22-21h28m07s393.png

That sends us to the Microsoft Store. Nice and all, but we like to choose our own version.

vlcsnap-2025-02-22-21h28m47s116.png

So we hit the Windows key, type 'alias', and open that settings menu.

vlcsnap-2025-02-22-21h29m04s390.png vlcsnap-2025-02-22-21h29m11s159.png

And there we disable that behavior.

vlcsnap-2025-02-22-21h29m42s103.png

Ok, that's better: no Python found!

vlcsnap-2025-02-22-21h30m13s427.png

Now we hop onto the world wide web and get us the latest Python version available.

The latest version is also the wrong version at the time I am writing this script, but that gives me the opportunity to show you how to fix that properly.

Because, let's face it, you do not have a clean Python 3.12 setup now, do you?

vlcsnap-2025-02-22-21h30m33s462.png

No need for a custom installation, but we do need the 'Add python to PATH' option.

Otherwise, we can't just type 'python' on the command line and expect it to know where it is.

vlcsnap-2025-02-22-21h37m51s235.png

And we have a python, excellent!

vlcsnap-2025-02-22-21h38m27s097.png

The first thing we should and will do is update 'pip', which is Python's package manager.
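
On Windows, the usual way to do that is to let Python run pip as a module:

python -m pip install --upgrade pip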

vlcsnap-2025-02-22-21h38m47s234.png

And now we install the TensorFlow package.
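
For reference, that is a single standard command:

pip install tensorflow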

No TensorFlow? But this is the latest Python there is!

Yes, and that is the problem.

Google has not completed their testing, or whatever it is they do, to release TensorFlow for the newest Python version.

So we need to go back to a previous version, which is 3.12.

But first, we need to do a proper cleanup of the old version.

vlcsnap-2025-02-22-21h40m21s892.png vlcsnap-2025-02-22-21h40m36s473.png

Let me show you how to do that.

vlcsnap-2025-02-22-21h40m48s626.png

That's gone; now let's remove the launcher as well.

vlcsnap-2025-02-22-21h41m09s670.png

A little more advanced: in Explorer, go to %appdata%, go one folder up, and enter 'Local'.

vlcsnap-2025-02-22-21h41m20s953.png vlcsnap-2025-02-22-21h41m28s752.png

There we have some leftover 'pip' files AND, in the 'Programs' folder, some remains of Python itself.

They all need to go.

vlcsnap-2025-02-22-21h46m35s743.png vlcsnap-2025-02-22-21h46m41s379.png vlcsnap-2025-02-22-21h47m11s456.png vlcsnap-2025-02-22-21h47m19s982.png

If for any reason you still have problems, open the system and user environment variables and remove any Python-related entries there as well.

Now the system is clean again. Great; that's a good skill to have when doing Python.

vlcsnap-2025-02-22-21h48m40s965.png

I open a command line to triple-check that Python is really gone.
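
If you want to be thorough, Windows' built-in 'where' command lists every 'python' executable still on the PATH:

where python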

vlcsnap-2025-02-22-21h48m23s802.png

Time to install the correct version; make sure to choose the 64-bit installer.

vlcsnap-2025-02-22-21h49m17s413.png

Python not found again? Yes, the PATH variable set by the installer is only loaded when a command line is opened.

vlcsnap-2025-02-22-21h49m28s249.png

So reopening the command line will fix this!
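
After reopening it, a quick version check should confirm we are on 3.12:

python --version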

vlcsnap-2025-02-22-21h50m08s572.png

Once again we upgrade pip.

vlcsnap-2025-02-22-21h50m18s675.png

And install TensorFlow, which is now available and installs as it should.

And done! Great, we are slowly getting where we want to be.

vlcsnap-2025-02-22-21h55m52s486.png

Now open Python and, at the prompt, type 'import tensorflow as tf'. That will try to load the module under the alias 'tf'.

And we get an error. Well, I do anyway because my system is clean. If you don't? Just skip ahead.

vlcsnap-2025-02-22-21h56m35s658.png

The error tells us we need to install the Microsoft Visual C++ Redistributable, which is actually a common thing for programs to need. Just follow the link and install it.

vlcsnap-2025-02-22-21h56m53s979.png

Yes, that's us, X64!

Let's try loading Tensorflow again.

vlcsnap-2025-02-22-22h00m58s309.png

And it is actually loaded! That's great.
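
An optional sanity check at the Python prompt prints the installed version:

import tensorflow as tf
print(tf.__version__)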

If you got this far, congrats, the hard part is over! Go get a coffee or something, the rest is 'almost' guaranteed to work.

I assume you already have some data you want to train on and I don't want to share mine so....

Let's get some from the internet!

vlcsnap-2025-02-22-22h03m06s469.png

kaggle.com is a nice site to find some datasets. Let's find some flowers, everyone likes flowers, right?

We got notebooks, topics, comments... and datasets! Yes, we need that.

vlcsnap-2025-02-22-22h03m46s277.png

I like this one! Been here before and all.

vlcsnap-2025-02-22-22h04m31s192.png

It's got 5 folders with images; now let's download.

vlcsnap-2025-02-22-22h05m33s034.png

And we need to log in. Not what the button said, but understandable.

Nothing was downloaded though; this is where the download button becomes a lie. That's a very thin line you're walking, website!

So just hit the download button again, and unpack it to the desktop.

vlcsnap-2025-02-22-22h06m17s899.png

If we look inside the folder, we can see there are 5 subdirectories, one for each type of flower.

vlcsnap-2025-02-22-22h06m27s988.png

Inside each of these are 1000 images of various resolutions.

That is enough for our model to get a good general idea of what each type of flower looks like.

vlcsnap-2025-02-22-22h13m02s568.png Prepare images

Ok, here we go, first python script.

This script is for checking our images folder for any files that can't be opened. That actually happens to me often.

Like a thumbnail database file that Windows creates after you view the images.

import os
import cv2

First we import 'cv2', which is the Python module for OpenCV (the Open Source Computer Vision Library), a popular library for working with images and more.

images = []

An empty list for all the image filenames.

for path, subdirs, files in os.walk("C:\\Users\\Krakkus\\Desktop\\flower_images"):
    for name in files:
        fn = os.path.join(path, name)
        images.append(fn)

Then a loop for plowing through all the files in the folder and adding the full path of each file to that list.

for fn in images:
    print(fn)
    image = cv2.imread(fn)
    if image is None:  # imread returns None instead of raising on a bad file
        print("Not readable as an image:", fn)

And then a second loop to print each file and read it from disk.

cv2.imread does not raise an error on a bad file; it just returns None, so we check for that and print the offending file.

# h, w, c = image.shape
#
# if h == 224 and w == 224:
#     continue
#
# image = cv2.resize(image, (224, 224))
# cv2.imwrite(fn, image)

The commented piece of code is there to resize the images to a fixed size, to fit the classification model's input, but that is no longer needed nowadays: the dataset loader we use later resizes images for us.

Let's run the code.

vlcsnap-2025-02-22-22h28m03s816.png vlcsnap-2025-02-22-22h28m51s689.png

And 'cv2' is not found. Let's install that from the command line!
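
The package that provides the 'cv2' module is called 'opencv-python' on PyPI:

pip install opencv-python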

Now re-run the code.

vlcsnap-2025-02-22-22h29m26s441.png

That's better, every file is now read from disk as an image.

vlcsnap-2025-02-22-22h30m02s106.png

And no problems found, good for us.

vlcsnap-2025-02-22-22h30m35s864.png Training the model

Second script: training the model, BAM! Didn't see that coming after all this slow stuff, did you?

vlcsnap-2025-02-22-22h33m13s598.png

I will even do you one better than that: starting the training first and then explaining the code!

You can see it starting epoch 1 of 15; epochs are training cycles.

There is also a progress bar, but the console messes it up.

Ok, the code!

import random
import tensorflow as tf

Right off the bat we import 'tensorflow' under the alias 'tf'.

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.applications import MobileNetV2  
from tensorflow.keras.callbacks import ReduceLROnPlateau

Then there are some 'tensorflow.keras' imports, for designing the network.

Please note there is also 'keras' as a separate package, almost identical, but we will use the one included INSIDE the tensorflow package.

data_dir = "C:\\Users\\Krakkus\\Desktop\\flower_images"

Again, the image folder.

batch_size = 128

A batch size, for when you are using a GPU; 128 is a good number for a 3060 GPU. For CPU it doesn't matter that much.

img_height = 224
img_width = 224

And then the image dimensions for the input of the classification model.

seedne = random.randrange(100, 800)

Here we store a random number to use as the seed for splitting the image dataset into a training part and a validation part.

The seed needs to be identical in both calls below, otherwise the two subsets would overlap.

train_ds = tf.keras.utils.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="training",
  seed=seedne,
  image_size=(img_height, img_width),
  batch_size=batch_size)

val_ds = tf.keras.utils.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="validation",
  seed=seedne,
  image_size=(img_height, img_width),
  batch_size=batch_size)

Then we have two blocks calling the same function to load the image dataset, using that random number as the seed.

The only difference is that in the first block we ask for the training subset and in the second we ask for the validation subset.

The validation split is the ratio between the sets: 0.2 means 20% of the images go to validation and 80% to training.

class_names = train_ds.class_names
print(class_names)

Printing out the class names, which are basically the names of the sub-directories in the image folder.

Just so you know, those are in alphabetical order, ALWAYS! Just remember that.
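
With this dataset, the printout would look something like this (all folder names except 'Lilly' are assumed here):

['Lilly', 'Lotus', 'Orchid', 'Sunflower', 'Tulip']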

num_classes = len(class_names)

We take note of the number of classes, for constructing the network.

data_augmentation = keras.Sequential(
  [
    layers.RandomFlip("horizontal",
                      input_shape=(img_height,
                                  img_width,
                                  3)),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
  ]
)

Data augmentation is a part of the input side of the model, used during training.

Each time a batch of images passes through, the images are randomly flipped, rotated, or zoomed. That adds variation, so the model doesn't memorize the exact images but learns in a more generic way.

base_model = MobileNetV2(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

Base model, very important.

You know those ready-made models that can classify 1000 different objects? This is one of those: MobileNetV2, pre-trained on ImageNet.

We cut off the head, which is those 1000 classes, and keep the body, which does all the work. That is what include_top=False means.
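
A quick way to see what is left after the decapitation is to print the output shape; the body now ends in a feature map instead of class scores:

print(base_model.output_shape)  # something like (None, 7, 7, 1280) for 224x224 input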

base_model.trainable = False

Here we state that we do not want the body to be retrained; only the rest of the model will learn.

If you have a very difficult classification task, you can set this to True, but training time will be much longer.

model = Sequential([
  data_augmentation,
  layers.Rescaling(1./255 , input_shape=(224, 224, 3)),
  base_model, 
  layers.GlobalAveragePooling2D(),
  layers.Dense(256, activation='relu'),
  layers.Dropout(0.5),
  layers.Dense(num_classes, activation='softmax')
])

Then we define the actual model, yes, this will be our own creation, how cool.

We have that augmentation input part, then a Rescaling layer that scales the pixel values from the 0-255 range down to 0-1. (The resizing of the images themselves was already handled by image_dataset_from_directory, which is why their original size no longer matters.)

Next, the base model, which stays frozen in this case; some people already did great work on that.

And then we add our new output layers. Just copy this; trust me, it works!

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),  # the last layer already applies softmax
              metrics=['accuracy'])

model.summary()

Then we compile the model to get it ready for training, with an initial learning rate (note the word 'initial'; it matters below). Since the last layer already applies softmax, the loss is told it receives probabilities, not logits: from_logits=False.

reduce_lr = ReduceLROnPlateau(monitor='val_loss',  # Metric to monitor
                              factor=0.5,          # Reduce LR by this factor (e.g., 0.1 = divide by 10)
                              patience=5,          # Number of epochs to wait before reducing LR
                              min_lr=1e-7)         # Minimum learning rate

This is one of those things to just copy. It decreases the learning rate whenever the validation loss stops improving, which helps the model fit better.

epochs=15

Set the number of training cycles here. If you are not retraining the entire model, 15 is a good number; otherwise, make it 150 to start with.

history = model.fit(
  train_ds,
  validation_data=val_ds,
  callbacks=[reduce_lr],
  epochs=epochs
)

And here we kick off the training, which you can see happening on the left screen.

model.save('my_model_224.keras')

The last line will explain itself, right?

vlcsnap-2025-02-22-22h40m44s133.png

Model trained, model saved, excellent!

vlcsnap-2025-02-22-22h40m47s594.png Prediction

New script, final script! Let's do some actual classification.

import os
import cv2
import random
import tensorflow as tf
import numpy as np

Almost the same imports as before; 'numpy' is new, but it was already installed alongside TensorFlow.

images = []
for path, subdirs, files in os.walk("C:\\Users\\Krakkus\\Desktop\\flower_images"):
    for name in files:
        fn = os.path.join(path, name)
        images.append(fn)

We do the same as in the first script, making a list of ALL the files.

fn = random.choice(images)
print(fn)

Then we choose a random item from that list, and print it!

org_img = cv2.imread(fn)
tf_image = cv2.resize(org_img, (224, 224))
tf_image = cv2.cvtColor(tf_image, cv2.COLOR_BGR2RGB)
tf_image = np.expand_dims(tf_image, axis=0)

That block is very important!

Training does not care much about the input format; the dataset loader handles that automatically.

For predictions, however, we need to resize the image, convert the color format (OpenCV loads images as BGR, while the model was trained on RGB), and add a batch dimension to the image array.

model = tf.keras.models.load_model('my_model_224.keras')

Load the model we trained before, also important, duh!

prediction = model.predict([tf_image])

And when we feed it an image, it needs to be an array of one image.

Why? Because the model can predict multiple images at once, but we only want one.
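
As a sketch, reusing the imports above and an assumed list 'some_files' of image paths, a whole batch could be classified in one call:

batch = np.concatenate([
    np.expand_dims(
        cv2.cvtColor(cv2.resize(cv2.imread(p), (224, 224)), cv2.COLOR_BGR2RGB),
        axis=0)
    for p in some_files], axis=0)   # shape: (N, 224, 224, 3)
predictions = model.predict(batch)  # one row of class probabilities per image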

print(prediction[0])

The same goes for the output: it returns an array of outputs.

But we will handle just the one.

For each prediction, it returns a list of probabilities, one per class.

predicted_class = np.argmax(prediction[0])
print(predicted_class)

We then find the highest value with 'argmax', which returns the index of that value.

That index translates into folder number x, because of the alphabetical order, as you might remember :)
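
As a small sketch, you can turn that index back into a name by listing the subdirectories in sorted order, which matches how Keras assigned the labels (this assumes the folder contains only the five class subdirectories):

class_names = sorted(os.listdir("C:\\Users\\Krakkus\\Desktop\\flower_images"))
print(class_names[predicted_class])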

So let's run it!

vlcsnap-2025-02-22-22h44m37s895.png

That's our image, right.

vlcsnap-2025-02-22-22h45m28s942.png

And that is the array of probabilities.

vlcsnap-2025-02-22-22h45m51s387.png

The first number is the highest.

vlcsnap-2025-02-22-22h46m24s120.png

The first folder is 'Lilly'.

vlcsnap-2025-02-22-22h46m52s654.png

Which is where it got the file from! Amazing, isn't it?

Thank you for watching!