Machine Learning Series

Elena Daehnhardt


Image credit: Illustration created with Midjourney, prompt by the author.

Image prompt

“An illustration representing cloud computing”

Machine Learning: From Foundations to Fine-Tuning

This series is ordered from beginner concepts to advanced practice. Start at Part 1 and build up.

Series Progress

27 of 27 posts published

Learning Path

Part 1: Deep Learning vs Machine Learning 16 Oct, 2021 Artificial Intelligence (AI) is a field of computer science that provides methods and algorithms to mimic human intelligence, reasoning, and decision-making, and to produce insights that research and industry can use to build innovative products and services. Machine Learning (ML) is a subset of AI whose algorithms learn from data. In this post, we sort out the differences between AI and ML.
Part 1: Deep Learning with DataCamp and Twitter 13 Dec, 2021 Although I have some machine learning experience with scikit-learn, I have always been interested in Deep Learning. The plan is to learn the basic concepts and apply the algorithms to a real-life situation, which I have always liked.
Part 2: Tools and Data to Experiment with Machine Learning 19 Oct, 2021 The open-source Python library scikit-learn provides a comprehensive selection of machine learning techniques (regression, classification, clustering), feature selection, metrics, preprocessing, and other functionality. At this moment, scikit-learn lacks deep learning functionality; however, we can use TensorFlow with the Scikit Flow wrapper to create neural networks using the scikit-learn approach.
Part 3: TensorFlow on M1 5 Jan, 2022 TensorFlow is a free, open-source library for machine learning created by Google Brain. TensorFlow has excellent functionality for building deep neural networks. I have chosen TensorFlow because it is pretty robust, efficient, and can be used with Python. Since I like Jupyter Notebooks and Conda, I installed them on my system as well. In this post, I go through simple steps to install TensorFlow and the packages above on an M1 Mac running macOS Monterey.
Part 3: Feature preprocessing 29 Jan, 2022 Machine Learning algorithms often require data of a specific type. For instance, some algorithms accept only numerical data. In other cases, ML algorithms perform better or converge faster when we transform the data first. Since we do this step before training the model, we call it preprocessing.
Part 4: Tensors in TensorFlow 13 Jan, 2022 TensorFlow is a free, open-source library for machine learning created by Google Brain. TensorFlow has excellent functionality for building deep neural networks. I have chosen TensorFlow because it is pretty robust, efficient, and can be used with Python. In this post, I show with simple examples how to create, shuffle, and index tensors, and how to get information about them.
Part 4: TensorFlow: Regression Model 21 Jan, 2022 In this post, I describe regression modelling in TensorFlow. We predict a numerical value and adjust hyperparameters to improve model performance with a simple neural network. We generate a dataset, split it into training and testing sets, visualise the data and the resulting neural network, and evaluate the model on the testing set.
Part 5: TensorFlow: Global and Operation-level Seeds 15 Jan, 2022 When training Machine Learning models, we want to avoid any ordering biases in the data. In some cases, such as Cross-Validation experiments, it is essential to shuffle the data while ensuring that the shuffled order stays the same across runs or system restarts. We can use operation-level and global seeds to make results reproducible.
Part 5: TensorFlow: Evaluating the Regression Model 25 Jan, 2022 In this post, we evaluate four regression models using TensorFlow. We use the MAE (mean absolute error) and MSE (mean squared error) metrics to compare the Sequential models and find the best neural network architecture for the defined hyperparameters.
Part 6: TensorFlow: Multiclass Classification Model 6 Feb, 2022 In Machine Learning, classification is the problem of categorising input data into different classes. For instance, we can categorise email messages into two groups, spam or not spam. When we have two classes, as in this case, we talk about binary classification. When we have more than two classes, we talk about multiclass classification. In this post, I address the latter, multiclass classification, using the example of categorising clothing items into clothing types.
Part 7: TensorFlow: Convolutional Neural Networks for Image Classification 19 Feb, 2022 In this post, I demonstrate how to use a CNN for bird recognition with TensorFlow and the Kaggle 400 bird species dataset. We observe how the model works with the original and augmented images.
Part 8: TensorFlow: Transfer Learning (Feature Extraction) in Image Classification 3 Mar, 2022 Image classification is a complex task. However, we can approach the problem by reusing state-of-the-art pre-trained models. Reusing patterns that other models have already learned is called "Transfer Learning." This way, we can efficiently apply well-tested models, potentially leading to excellent performance.
Part 9: TensorFlow: Transfer Learning (Fine-Tuning) in Image Classification 6 Apr, 2022 We used a 400 bird species dataset to build bird species prediction models based on EfficientNetB0 from Keras. The baseline model already showed an excellent accuracy of 0.9845. Data augmentation did not improve accuracy, which dropped slightly to 0.9690. The model with the data augmentation layer was then partially unfrozen, retrained with a lower learning rate, and reached an accuracy of 0.9850.
Part 10: TensorFlow: Evaluating the Saved Bird Species Prediction Model 2 May, 2022 In this post, I describe the process of in-depth model evaluation. I reuse the previously created EfficientNetB0 model, which was fine-tuned on the 400 Bird Species Kaggle dataset. As a result, I find out which bird species are not predicted well.
Part 11: Data exploration and analysis with Python Pandas 20 Jan, 2023 Data Science has so many terms for its concepts and techniques that it is easy to get confused and hard to gain a clear understanding of all its components and steps. In this post, I fill the gap by explaining two essential components of data science, data analysis and data exploration. To clarify things, I show both approaches, compare them, and provide Python code using Pandas DataFrames and plotting.
Part 11: Machine Learning Tests using the Titanic dataset 10 Feb, 2023 In this post, we created and evaluated several machine-learning models using the Titanic dataset. We compared the performance of Logistic Regression, Decision Tree, and Random Forest models from Python's scikit-learn library with a Neural Network created with TensorFlow. The Random Forest performed best!
Part 12: Machine-Learning Process 30 Oct, 2023 The machine learning process involves a series of steps and activities designed to develop and deploy machine learning models that solve specific problems or make predictions. Put simply, in machine learning we create programs that take in data and produce the desired results. In this post, we briefly describe the stages of the machine-learning process.
Part 12: Decision Tree versus Random Forest, and Hyperparameter Optimisation 6 Nov, 2023 Decision trees, with their elegant simplicity and transparency, stand in stark contrast to the robust predictive power of Random Forest, an ensemble of trees. In this post, we compare the key distinctions, advantages, and trade-offs between these two approaches. We use scikit-learn to train and test both models, and perform hyperparameter optimisation to find parameters that improve the performance of each.
Part 13: TensorFlow: Romancing with TensorFlow and NLP 11 Jul, 2022 In this post, we create a simple poem generation model with the Keras Sequential API.
Part 13: Cross-Validation Techniques 13 Mar, 2025 Building a machine learning model is easy; proving it actually works on unseen data is the hard part. In this post, we cover cross-validation techniques—from traditional K-Fold to Stratified and Time-Series splits—using hands-on examples in scikit-learn.
Part 14: LoRA fine-tuning wins 16 Oct, 2025 You no longer need to retrain entire language models. LoRA allows you to teach new capabilities via tiny adapters. Here are the architectural code, deployment cheat sheets, and production pitfalls.
Part 17: Floating-point format and Mixed Precision in TensorFlow 19 May, 2022 When creating large Machine Learning models, we want to minimise the training time. TensorFlow supports mixed precision training, which can significantly improve performance because it combines lower-precision 16-bit operations (such as float16) with single-precision operations (for instance, using the float32 data type). Google TPUs and NVIDIA GPUs can perform operations on 16-bit data types much faster.
Part 18: Audio Signal Processing with Python's Librosa 5 Mar, 2023 In this post, I focus on audio signal processing and working with WAV files. I apply Python's Librosa library for extracting wave features commonly used in research and application tasks such as gender prediction, music genre prediction, and voice identification. To succeed in these complex tasks, we need a clear understanding of how WAV files can be analysed, which I cover in detail with handy Python code snippets.
Part 19: Bias-Variance Challenge 10 Nov, 2023 In machine learning, we usually start from a simple baseline model and progressively adjust its complexity until we reach the sweet spot with the best model performance. How can we do this? Let's go through the most essential machine learning concepts and the bias-variance challenge.
Part 20: How recommendation engines actually work (with Python code) 8 May, 2024 Ever wondered how Netflix or Spotify manages to guess what you want to watch or listen to next? The secret lies in recommendation algorithms. Here is a look at the math behind collaborative and content-based filtering, and how to implement them in Python.
Part 21: OpenAI's Model Show-off 19 Feb, 2024 OpenAI's GPT models are highly sophisticated machine learning models that are used in various fields such as natural language processing, coding assistance, and content creation. OpenAI's newest video-generating model, Sora, sets a new benchmark in video generation technology, which I quickly explore in this post.
Part 22: Apache-Licensed Summarizers 14 Nov, 2025 Looking for summarisation models you can safely use in your app? Here is a definitive guide to Apache-licensed transformer models, complete with selection matrices and production gotchas.
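
To give a flavour of the cross-validation techniques covered in the Part 13 post, here is a minimal sketch of plain K-Fold splitting in pure Python. It is not the code from the post, which uses scikit-learn; the function name `k_fold_indices` is hypothetical, and the sketch assumes no shuffling and no stratification.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for plain K-Fold without shuffling.

    Hypothetical helper for illustration only; not taken from the posts.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Each sample lands in the test set exactly once across the folds.
folds = list(k_fold_indices(n_samples=10, k=3))
for train_idx, test_idx in folds:
    print("test:", test_idx, "train:", train_idx)
```

Every sample is held out exactly once, so averaging a metric over the folds uses all the data for both training and testing. For time-ordered data this simple split would leak future information into training, which is why the post also looks at Time-Series splits.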