Elena's AI Blog

TensorFlow: Evaluating the Saved Bird Species Prediction Model

02 May 2022 / 19 minutes to read

Elena Daehnhardt



In my previous post “TensorFlow: Transfer Learning (Fine-Tuning) in Image Classification”, I described building a convolutional neural network based on EfficientNetB0 (initially trained on the ImageNet dataset), which went through feature extraction and fine-tuning steps using the 400 Bird Species Dataset at Kaggle. This was an exciting experiment since the ImageNet dataset contains only 40 bird species, while the Kaggle dataset has 400 bird species. Despite such differences in the underlying data, the model trained so well that the final model reached 98.5% accuracy on the test set. In this blog post, I am going to load this model, saved in my deep learning repository, and evaluate its performance in detail to determine which birds are not well predicted.

Getting Data and Code

Using Helper Functions

I have shared my helpers.py Python script, which contains some useful functions for data preprocessing, model creation, and evaluation. You can use this file as you like, change it, and share your ideas with me :) I will discuss the parts of the code that are useful for analysing the fitted bird species prediction model.

# Getting helper functions
!wget https://raw.githubusercontent.com/edaehn/deep_learning_notebooks/main/helpers.py
--2022-05-02 10:47:37--  https://raw.githubusercontent.com/edaehn/deep_learning_notebooks/main/helpers.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)...,,, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)||:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 33925 (33K) [text/plain]
Saving to: 'helpers.py'

helpers.py          100%[===================>]  33.13K  --.-KB/s    in 0.002s  

2022-05-02 10:47:38 (14.4 MB/s) - ‘helpers.py’ saved [33925/33925]
# Import files library from google.colab
from google.colab import files

# Import all functions from the helpers.py
from helpers import *

Downloading the Birds Species Dataset from Kaggle

Before getting the dataset, you need to upload your kaggle.json into the Colab file system.

# Setup to download Kaggle datasets into a Colab instance
! pip install kaggle
! mkdir -p ~/.kaggle
! cp kaggle.json ~/.kaggle/
! chmod 600 ~/.kaggle/kaggle.json
Requirement already satisfied: kaggle in /usr/local/lib/python3.7/dist-packages (1.5.12)
Requirement already satisfied: python-slugify in /usr/local/lib/python3.7/dist-packages (from kaggle) (6.1.2)
Requirement already satisfied: six>=1.10 in /usr/local/lib/python3.7/dist-packages (from kaggle) (1.15.0)
Requirement already satisfied: tqdm in /usr/local/lib/python3.7/dist-packages (from kaggle) (4.64.0)
Requirement already satisfied: requests in /usr/local/lib/python3.7/dist-packages (from kaggle) (2.23.0)
Requirement already satisfied: urllib3 in /usr/local/lib/python3.7/dist-packages (from kaggle) (1.24.3)
Requirement already satisfied: python-dateutil in /usr/local/lib/python3.7/dist-packages (from kaggle) (2.8.2)
Requirement already satisfied: certifi in /usr/local/lib/python3.7/dist-packages (from kaggle) (2021.10.8)
Requirement already satisfied: text-unidecode>=1.3 in /usr/local/lib/python3.7/dist-packages (from python-slugify->kaggle) (1.3)
Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests->kaggle) (3.0.4)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests->kaggle) (2.10)

The “Requirement already satisfied” messages appear because I had already installed the kaggle library; on a fresh instance, these commands will install the kaggle package. Next, we can get the dataset directly from Kaggle.

! kaggle datasets download gpiosenka/100-bird-species/birds -p /content/sample_data/birds --unzip
Downloading 100-bird-species.zip to /content/sample_data/birds
100% 1.49G/1.49G [00:21<00:00, 60.5MB/s]
100% 1.49G/1.49G [00:21<00:00, 75.6MB/s]

Getting the Trained Model

I created a fine-tuned bird species prediction model in my previous set of experiments. This model is saved in my GitHub repository, and we reuse it here.

# Getting saved fine-tuned EfficientNetB0 model
!wget https://github.com/edaehn/deep_learning_notebooks/raw/main/models/model_4_bird_species_prediction.zip
--2022-05-02 10:48:38--  https://github.com/edaehn/deep_learning_notebooks/blob/main/models/model_4_bird_species_prediction.zip
Resolving github.com (github.com)...
Connecting to github.com (github.com)||:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘model_4_bird_species_prediction.zip’

model_4_bird_specie     [ <=>                ] 123.66K  --.-KB/s    in 0.08s   

2022-05-02 10:48:38 (1.59 MB/s) - ‘model_4_bird_species_prediction.zip’ saved [126623]

Let’s unzip the trained model. The model is unzipped into the “model_4” directory.

# Unzipping saved model
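
The unzip command itself is not shown above. As a minimal sketch, assuming the archive was saved under the name used in the wget step, Python's standard zipfile module can do the extraction. The function name unzip_archive is my own and is not taken from helpers.py. The is_zipfile check is there because a download from GitHub can end up saving an HTML page instead of the archive.

```python
import zipfile
from pathlib import Path


def unzip_archive(zip_path, target_dir="."):
    """Extract every member of zip_path into target_dir and return the member names."""
    # Fail early with a readable message if the file is not a real zip archive
    if not zipfile.is_zipfile(zip_path):
        raise ValueError(f"{zip_path} is not a valid zip archive")
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path, "r") as archive:
        archive.extractall(target)
        return archive.namelist()


# Usage in the notebook (hypothetical call, file name taken from the wget step):
# unzip_archive("model_4_bird_species_prediction.zip")
```

If the archive contains a top-level "model_4" folder, as described above, it will appear in the current working directory after the call.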

Checking that the Dataset is Loaded Correctly

The function “walk_directory” (helpers.py) shows the number of directories and files in the “sample_data/birds” directory.

# Define the directory wherein the dataset is stored
dataset_path = "sample_data/birds"

# Show file numbers in the directory "sample_data/birds"
There are 4 directories and '5' files in sample_data/birds.
There are 400 directories and '0' files in sample_data/birds/train.
There are 0 directories and '146' files in sample_data/birds/train/AFRICAN EMERALD CUCKOO.
There are 0 directories and '160' files in sample_data/birds/train/CANARY.
There are 0 directories and '197' files in sample_data/birds/train/RED BEARDED BEE EATER.
There are 0 directories and '154' files in sample_data/birds/train/SCARLET CROWNED FRUIT DOVE.
There are 0 directories and '201' files in sample_data/birds/train/VIOLET GREEN SWALLOW.
There are 0 directories and '130' files in sample_data/birds/train/GOULDIAN FINCH.
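
For readers who do not want to open helpers.py, a directory summary like the one above can be produced with os.walk from the standard library. This is only a sketch of the idea: count_directory_contents is a hypothetical name, and the real walk_directory function may differ in its details and output format.

```python
import os


def count_directory_contents(directory):
    """Return a list of (path, number_of_subdirectories, number_of_files) tuples."""
    summary = []
    # os.walk visits the top directory first, then every subdirectory below it
    for dirpath, dirnames, filenames in os.walk(directory):
        summary.append((dirpath, len(dirnames), len(filenames)))
    return summary


# Usage in the notebook (hypothetical, prints one line per directory):
# for path, n_dirs, n_files in count_directory_contents("sample_data/birds"):
#     print(f"There are {n_dirs} directories and {n_files} files in {path}.")
```

A quick look at these counts is a cheap way to confirm that the download was complete and that every species folder contains images.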

The function show_five_birds() draws five random birds from the dataset.

sample_data/birds/train/ALEXANDRINE PARAKEET
Image shape: (224, 224, 3)
sample_data/birds/train/CHESTNET BELLIED EUPHONIA
Image shape: (224, 224, 3)
sample_data/birds/train/ANDEAN SISKIN
Image shape: (224, 224, 3)
sample_data/birds/train/EMERALD TANAGER
Image shape: (224, 224, 3)
sample_data/birds/train/MOURNING DOVE
Image shape: (224, 224, 3)

Figure 1. Five Random Birds from the Training Dataset

Getting Training and Test Data

# Getting training and test datasets
train_data, test_data = get_image_data(dataset_path=dataset_path, IMG_SIZE = (224, 224))
Found 58388 files belonging to 400 classes.
Found 2000 files belonging to 400 classes.

Loading and Evaluating the Trained Model

With Keras’ load_model(), we load in the previously trained model.

# Load unzipped model
loaded_model = tf.keras.models.load_model("model_4")

Next, the loaded model is evaluated on the test dataset. It reached an accuracy of 0.9845 (about 98.5%), which is pretty good! However, it is always worth checking the samples that were predicted incorrectly. Knowing the wrong predictions can give us ideas on how to improve the model. For instance, we could add more images of the affected species to their respective training folders.

# Evaluate on the full test dataset
loaded_model.evaluate(test_data)
63/63 [==============================] - 20s 142ms/step - loss: 0.0537 - accuracy: 0.9845
[0.053718529641628265, 0.984499990940094]
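
A single accuracy number hides which species cause the errors. One simple way to look deeper is to compute the accuracy per class. The sketch below uses made-up labels for three classes rather than the real test set, and per_class_accuracy is a hypothetical helper that is not part of helpers.py.

```python
import numpy as np


def per_class_accuracy(y_true, y_pred, class_names):
    """Return {class_name: accuracy} for every class that occurs in y_true."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    result = {}
    for index, name in enumerate(class_names):
        mask = y_true == index  # rows whose true class is this one
        if mask.any():
            result[name] = float((y_pred[mask] == index).mean())
    return result


# Toy data (made up for illustration, not the bird test set)
toy_names = ["CANARY", "GOULDIAN FINCH", "MOURNING DOVE"]
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 0, 1, 2, 2, 2]

toy_scores = per_class_accuracy(toy_true, toy_pred, toy_names)
# Sort so that the worst predicted classes come first
worst_first = sorted(toy_scores.items(), key=lambda item: item[1])
print(worst_first)
```

With 2000 test images spread over 400 classes, there are on average only five test images per species, so per-class accuracy on this dataset moves in coarse steps and should be read with care.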

The Wrongest Bird Predictions

It is both interesting and helpful to find the test samples that were mispredicted. This can give us insights into how the model works and what could still be improved. My initial thought was that incorrectly predicted birds are possibly similar (for instance, in color or shape) to the bird species they are wrongly assigned to. Let’s check it out using the following steps:

  1. load the test dataset;
  2. use the model for predicting bird species probabilities;
  3. get the classes (bird species) corresponding with the highest prediction probabilities;
  4. create a Pandas dataframe storing the paths to the test bird images, their actual class labels, predicted class labels, and prediction probabilities;
  5. keep only the incorrectly predicted bird images in a new dataframe, and sort it in descending order of prediction probability;
  6. show images of test birds (left side) and images of their predictions (right side).

To realise these steps, I have created two functions (see helpers.py): show_wrongly_predicted_images(), which builds the Pandas DataFrames using the test dataset and the trained model, and show_one_wrongly_predicted(), which shows two bird species side by side for each test sample (the last step).
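
Before reading the full helper, the dataframe part of these steps (collecting predictions, keeping only the mistakes, sorting by confidence) can be tried on toy data without TensorFlow. The probabilities and labels below are made up for illustration, and the variable names with the toy_ prefix are my own.

```python
import numpy as np
import pandas as pd

# Made-up class names and prediction probabilities for four test images
toy_class_names = ["CANARY", "GOULDIAN FINCH", "MOURNING DOVE"]
toy_probabilities = np.array([
    [0.90, 0.05, 0.05],  # true class 0, predicted 0 (correct)
    [0.09, 0.91, 0.00],  # true class 0, predicted 1 (wrong and confident)
    [0.10, 0.30, 0.60],  # true class 2, predicted 2 (correct)
    [0.40, 0.05, 0.55],  # true class 1, predicted 2 (wrong, less confident)
])
toy_true_labels = np.array([0, 0, 2, 1])

# The predicted class is the one with the highest probability in each row
toy_predicted = toy_probabilities.argmax(axis=1)

toy_df = pd.DataFrame({
    "y_true": toy_true_labels,
    "y_predicted": toy_predicted,
    "prediction_confidence": toy_probabilities.max(axis=1),
    "true_classname": [toy_class_names[i] for i in toy_true_labels],
    "predicted_classname": [toy_class_names[i] for i in toy_predicted],
})
toy_df["correct_prediction"] = toy_df["y_true"] == toy_df["y_predicted"]

# Keep only the mistakes, most confident mistakes first
toy_top_wrong = toy_df[~toy_df["correct_prediction"]].sort_values(
    "prediction_confidence", ascending=False)
print(toy_top_wrong[["true_classname", "predicted_classname", "prediction_confidence"]])
```

The most confident mistakes are usually the most informative ones, since they show where the model is sure of itself and still wrong.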

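# Imports needed if you run these functions outside helpers.py
import os
import random

import matplotlib.pyplot as plt
import pandas as pd
import tensorflow as tf
from sklearn.metrics import accuracy_score


def show_wrongly_predicted_images(model, dataset_directory="sample_data/birds", top_wrong_predictions_number_to_show=False):

    # 1. Load the test dataset
    test_data = tf.keras.preprocessing.image_dataset_from_directory(
        directory=dataset_directory + "/test",
        image_size=(224, 224),
        shuffle=False)  # assumed: keep the file order so predictions line up with labels and paths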

    class_names = test_data.class_names

    # 2. Use model for predictions
    prediction_probabilities = model.predict(test_data, verbose=1)

    # Check the predictions we have got
    # print(f"Number of test rows: {len(test_data)}, \
            # number of predictions: {len(prediction_probabilities)}, \
            # shape of predcitions: {prediction_probabilities.shape}, \
            # the first prediction: {prediction_probabilities[0]}")

    # Getting indices of the predicted classes
    prediction_classes_index = prediction_probabilities.argmax(axis=1)

    # Get indices of our test_data BatchDataset
    test_labels = []
    for images, labels in test_data.unbatch():

    sklearn_accuracy = accuracy_score(y_true=test_labels,

    # 3. Finding where our model is most wrong

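    # Find all files in the test dataset
    filepaths = []
    for filepath in test_data.list_files(dataset_directory + "/test/*/*.jpg",
                                         shuffle=False):  # assumed: same order as the labels
        filepaths.append(filepath.numpy().decode("utf-8"))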

    # Create a dataframe
    predictions_df = pd.DataFrame({"images_path": filepaths,
                                   "y_true": test_labels,
                                   "y_predicted": prediction_classes_index,
                                   "prediction_confidence": prediction_probabilities.max(axis=1),
                                   "true_classname": [class_names[i] for i in test_labels],
                                   "predicted_classname": [class_names[i] for i in prediction_classes_index]})

    # See which birds predicted correctly/incorrectly
    predictions_df["correct_prediction"] = predictions_df["y_true"] == predictions_df["y_predicted"]

    # Sort the dataframe to find the most confidently wrong predictions
    top_wrong = predictions_df[~predictions_df["correct_prediction"]].sort_values("prediction_confidence",
                                                                                   ascending=False)

    # 4. Plot top top_wrong_predictions_number_to_show number of predictions

    # Materialise the zip as a list so that it can be sliced below
    top = list(zip(top_wrong["images_path"], top_wrong["true_classname"], top_wrong["predicted_classname"], top_wrong["prediction_confidence"]))
    print(f"Wrongly predicted {len(top_wrong)} out of {len(predictions_df)}")
    if top_wrong_predictions_number_to_show:
        top = top[:top_wrong_predictions_number_to_show]

    for filename1, label1, label2, prob in top:
        # Pick a random training image of the predicted species for comparison
        predicted_dir = os.path.join(dataset_directory, "train", label2)
        filename2 = os.path.join(predicted_dir, random.choice(os.listdir(predicted_dir)))
        # print(f"{filename1}: {filename2}")
        show_one_wrongly_predicted(filename1, filename2, label1, label2+f" (prob={prob:.2f})")

    return sklearn_accuracy

def show_one_wrongly_predicted(filename1, filename2, label1, label2):
    """
    Loads two images from their full-path filenames and shows them in one plot, each with a title
    corresponding to its class label.
    :param filename1: full-path filename of the first image, the test image we are predicting.
    :param filename2: full-path filename of the second image, an example of the predicted class.
    :param label1: true class label.
    :param label2: predicted class label.
    """
    img1 = tf.io.read_file(filename1)
    img1 = tf.image.decode_image(img1, channels=3)
    img2 = tf.io.read_file(filename2)
    img2 = tf.image.decode_image(img2, channels=3)
    figure, ax = plt.subplots(1, 2)
    # Assumed completion: draw each image under its class label
    ax[0].imshow(img1.numpy())
    ax[0].set_title(label1)
    ax[0].axis("off")
    ax[1].imshow(img2.numpy())
    ax[1].set_title(label2)
    ax[1].axis("off")
    plt.show()

# Show top wrongly predicted birds

Figure 2. Wrongly Predicted Bird Species

As we see from the images of wrongly predicted bird species, all of them are indeed alike in color and shape. Moreover, some species are so closely related that you need to be an ornithologist, or to research the species, to spot the small differences between the predicted and the actual bird. For instance, Avadavat was predicted as a Strawberry Finch with a probability of 91%. From the Wikipedia article about the Red Avadavat, we learn that this bird is also known as the strawberry finch, belongs to the family Estrildidae, originates from India, and “is popular as a cage bird”.

I would also like to ask you, my dear readers: please do not keep your pet birds in cages all the time. Birds need to be happy and to fly, even if only in a well-ventilated room or an appropriately sized aviary. Otherwise, birds get depressed, suffer psychological trauma, and can even develop a weakened heart due to obesity and lack of exercise. Do not imprison birds or other animals; we all deserve to be happy and free! In return, your pet bird will become a loving and cheerful friend.

Predicting a Bird Downloaded from Web

As a bonus section, I will try predicting a bird species with an image of a red avadavat downloaded from the BlogSpot website. Will it be well predicted?

!wget http://2.bp.blogspot.com/-EB4avRIsLQ8/Tv25pjMDi3I/AAAAAAAAB9s/Io8ybYRjjFM/s1600/Red+avadavat+Amandava+amandava.jpg
--2022-05-03 12:21:25--  http://2.bp.blogspot.com/-EB4avRIsLQ8/Tv25pjMDi3I/AAAAAAAAB9s/Io8ybYRjjFM/s1600/Red+avadavat+Amandava+amandava.jpg
Resolving 2.bp.blogspot.com (2.bp.blogspot.com)..., 2607:f8b0:4001:c14::84
Connecting to 2.bp.blogspot.com (2.bp.blogspot.com)||:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 98904 (97K) [image/jpeg]
Saving to: ‘Red+avadavat+Amandava+amandava.jpg’

Red+avadavat+Amanda 100%[===================>]  96.59K  --.-KB/s    in 0s      

2022-05-03 12:21:25 (195 MB/s) - ‘Red+avadavat+Amandava+amandava.jpg’ saved [98904/98904]
# File name as saved by wget above
filename = "Red+avadavat+Amandava+amandava.jpg"
predict_and_plot(loaded_model, filename, train_data.class_names,
                 known_label=False, rescale=False)
An Avadavat Prediction with probability=1

Figure 3. An Avadavat Prediction

As we see, the Avadavat bird image was assigned the correct species name with a prediction probability of 1. The function predict_and_plot() and all the code are available in my GitHub repository with deep learning experiments.
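
When a single prediction comes back with a probability close to 1, it can still be useful to look at the runner-up classes, especially for species that the model tends to confuse. The following sketch shows one way to list the top k classes from a probability vector. The function top_k_classes and the toy probabilities are my own illustration and are not part of helpers.py.

```python
import numpy as np


def top_k_classes(probabilities, class_names, k=3):
    """Return the k most probable (class_name, probability) pairs, best first."""
    probabilities = np.asarray(probabilities, dtype=float)
    # argsort is ascending, so reverse it and keep the first k indices
    best = np.argsort(probabilities)[::-1][:k]
    return [(class_names[i], float(probabilities[i])) for i in best]


# Toy probability vector for three classes (made up for illustration)
toy_names = ["AVADAVAT", "CANARY", "GOULDIAN FINCH"]
toy_probs = [0.97, 0.01, 0.02]
print(top_k_classes(toy_probs, toy_names, k=2))
```

With the real model, the probability vector would be one row of the array returned by the model's predict() call, and the class names would come from train_data.class_names.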


In this post, I have described the process of in-depth model evaluation. I reused the previously created EfficientNetB0 model, which was fine-tuned on the 400 Bird Species Kaggle dataset. As a result, I have found out which bird species are not well predicted. Thanks for reading, and good luck with coding!


1. TensorFlow Developer Certificate in 2022: Zero to Mastery

2. Birds 400 - Species Image Classification

3. Wikipedia: ImageNet

4. Wikipedia article about Red Avadavat

Did you like this post? Please let me know if you have any comments or suggestions.


About Elena

Elena, a PhD in Computer Science, simplifies AI concepts and helps you use machine learning.

Elena Daehnhardt. (2022) 'TensorFlow: Evaluating the Saved Bird Species Prediction Model', daehnhardt.com, 02 May 2022. Available at: https://daehnhardt.com/blog/2022/05/02/tf-reusing-and-evaluating-saved-models/