Introduction

In the previous post we learned to handle errors gracefully. Now let us talk about something that will save you enormous amounts of time: Python’s standard library.

The standard library is the collection of modules that ships with every Python installation. It covers file system operations, dates and times, data serialisation, mathematics, networking, testing, compression, and much more. The first instinct of many beginners — and honestly of many experienced developers — is to reach for a third-party package. Often the right answer is already in the box.

The Python community has a saying: batteries included. This post is about finding the batteries.

Modules and import

Before exploring specific modules, a quick note on how to use them. A module is a Python file — or a package of files — that you bring into your script with import:

import math
print(math.sqrt(144))  # 12.0

If you only need a few names from a module, from x import y lets you use them without the module prefix:

from math import sqrt, pi
print(sqrt(144))  # 12.0
print(pi)         # 3.141592653589793

You can alias a module to a shorter name with as:

import datetime as dt
today = dt.date.today()

That is all there is to it. The rest is knowing which modules exist and what they do.

pathlib — File Paths Done Right

Before pathlib arrived in Python 3.4, people assembled file paths by concatenating strings or chaining os.path calls, which is error-prone and hard to keep portable across operating systems. pathlib gives you a Path object that behaves like a path should:

from pathlib import Path

# Create a path object (this example uses a POSIX-style /tmp directory;
# pick another base directory on Windows)
data_dir = Path("/tmp/bird_data")
data_dir.mkdir(exist_ok=True)  # create directory if it does not exist

# Build paths with /
observations_file = data_dir / "observations.txt"
print(observations_file)        # /tmp/bird_data/observations.txt
print(observations_file.name)   # observations.txt
print(observations_file.suffix) # .txt
print(observations_file.parent) # /tmp/bird_data

# Write and read
observations_file.write_text("Eagle, Amsterdam, 2026-05-10\nPigeon, Den Haag, 2026-05-10\n")
content = observations_file.read_text()
print(content)

# Check existence
print(observations_file.exists())  # True

# List directory contents
for f in data_dir.iterdir():
    print(f)

/tmp/bird_data/observations.txt
observations.txt
.txt
/tmp/bird_data
Eagle, Amsterdam, 2026-05-10
Pigeon, Den Haag, 2026-05-10

True
/tmp/bird_data/observations.txt

pathlib also integrates beautifully with open(), json, and the error handling from the previous post. Whenever you are working with files, reach for Path first.
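
As a minimal sketch of that integration, the snippet below passes a Path to open(), finds the file again with glob(), and catches the error raised when a file is missing. The temporary directory and the file names are made up for illustration.

```python
import tempfile
from pathlib import Path

# Use a throwaway directory so the example runs on any operating system
base_dir = Path(tempfile.mkdtemp())
log_file = base_dir / "sightings.txt"

# A Path can be passed straight to open()
with open(log_file, "w", encoding="utf-8") as f:
    f.write("Heron, Amsterdam\n")

# glob() finds files that match a pattern
txt_files = sorted(p.name for p in base_dir.glob("*.txt"))
print(txt_files)  # ['sightings.txt']

# Reading a missing file raises FileNotFoundError, which you can catch
missing = base_dir / "no_such_file.txt"
try:
    missing.read_text(encoding="utf-8")
except FileNotFoundError:
    print(f"{missing.name} does not exist yet.")
```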

datetime — Dates and Times

Bird watching is nothing without timestamps. The datetime module handles dates, times, and the arithmetic between them:

from datetime import datetime, date, timedelta

# Today's date and current timestamp
today = date.today()
now = datetime.now()

print(f"Today: {today}")
print(f"Now:   {now.strftime('%Y-%m-%d %H:%M:%S')}")

# Date arithmetic
migration_start = date(2026, 3, 15)
migration_end   = date(2026, 5, 20)
duration = migration_end - migration_start
print(f"Migration lasted {duration.days} days.")

# A week from now
next_survey = today + timedelta(weeks=1)
print(f"Next survey: {next_survey}")

# Parsing a date string
observation_date = datetime.strptime("10/05/2026 08:30", "%d/%m/%Y %H:%M")
print(f"Observation: {observation_date.isoformat()}")

Today: 2026-05-10
Now:   2026-05-10 09:14:02
Migration lasted 66 days.
Next survey: 2026-05-17
Observation: 2026-05-10T08:30:00

strftime formats a datetime into a string; strptime parses a string back into a datetime. The format codes (%Y, %m, %d, and so on) follow the C standard — a bit cryptic at first but well documented and consistent.
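
The two functions are mirror images, which a short round trip makes visible. This is only a sketch: the format string is an arbitrary choice, and any string that does not match it raises ValueError.

```python
from datetime import datetime

stamp = datetime(2026, 5, 10, 8, 30)
fmt = "%d/%m/%Y %H:%M"

text = stamp.strftime(fmt)             # datetime -> string
parsed = datetime.strptime(text, fmt)  # string -> datetime

print(text)             # 10/05/2026 08:30
print(parsed == stamp)  # True

# A string that does not match the format raises ValueError
try:
    datetime.strptime("2026-05-10", fmt)
except ValueError as exc:
    print(f"Could not parse: {exc}")
```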

For time zones, look at zoneinfo (Python 3.9+) or the third-party python-dateutil library. Naive datetimes (without time zone information) are fine for local use but cause headaches in distributed systems or when users are in multiple time zones.
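
To see the difference between naive and aware datetimes, here is a small sketch. It assumes a fixed UTC+2 offset as a stand-in for a real time zone so that it runs without a time zone database; with zoneinfo you would pass a ZoneInfo object as tzinfo in the same way.

```python
from datetime import datetime, timedelta, timezone

naive = datetime(2026, 5, 10, 8, 30)  # no time zone attached
aware_utc = datetime(2026, 5, 10, 6, 30, tzinfo=timezone.utc)

# Assumption for this sketch: a fixed UTC+2 offset stands in for a real zone
utc_plus_2 = timezone(timedelta(hours=2))
aware_local = aware_utc.astimezone(utc_plus_2)

print(aware_local.isoformat())   # 2026-05-10T08:30:00+02:00
print(aware_local == aware_utc)  # True: same instant, different wall clock

# Ordering comparisons between naive and aware datetimes are rejected
try:
    naive < aware_utc
except TypeError as exc:
    print(f"Cannot compare: {exc}")
```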

json — Data Exchange

JSON is the lingua franca of data exchange on the web. Python’s json module reads and writes it with two main functions:

import json
from pathlib import Path

# Python objects → JSON string
flock = {
    "location": "Amsterdam",
    "date": "2026-05-10",
    "birds": [
        {"name": "Heron",  "count": 3, "notes": "feeding near canal"},
        {"name": "Coot",   "count": 12, "notes": None},
    ]
}

json_string = json.dumps(flock, indent=2)
print(json_string)

# JSON string → Python objects
loaded = json.loads(json_string)
print(loaded["birds"][0]["name"])  # Heron

# Write to file / read from file
path = Path("/tmp/flock.json")
path.write_text(json.dumps(flock, indent=2))

with open(path) as f:
    from_file = json.load(f)
print(f"Loaded {len(from_file['birds'])} bird records from file.")

{
  "location": "Amsterdam",
  "date": "2026-05-10",
  "birds": [
    {
      "name": "Heron",
      "count": 3,
      "notes": "feeding near canal"
    },
    {
      "name": "Coot",
      "count": 12,
      "notes": null
    }
  ]
}

Heron
Loaded 2 bird records from file.

json.dumps serialises to a string; json.dump writes directly to a file object. json.loads parses a string; json.load reads from a file object. The naming is consistent once you see the pattern.
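
The file-object variants look like this in practice. A minimal sketch, with a temporary directory and a made-up record so it can run anywhere:

```python
import json
import tempfile
from pathlib import Path

record = {"name": "Heron", "count": 3}
json_path = Path(tempfile.mkdtemp()) / "record.json"

# json.dump writes straight to an open file object
with open(json_path, "w", encoding="utf-8") as f:
    json.dump(record, f, indent=2)

# json.load reads straight from an open file object
with open(json_path, encoding="utf-8") as f:
    restored = json.load(f)

print(restored == record)  # True
```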

One limitation: json handles strings, numbers, lists, dicts, booleans, and None. Tuples are written as JSON arrays and come back as lists. It does not natively handle datetime objects, custom classes, or sets. For those you either write a custom encoder/decoder or use a library like pydantic.
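
One lightweight way to write such an encoder is the default parameter of json.dumps, which is called for every object the module cannot serialise on its own. The helper name encode_extra and the sample record are hypothetical; treat this as a sketch rather than a complete solution.

```python
import json
from datetime import date


def encode_extra(value):
    """Fallback that json.dumps calls for objects it cannot serialise."""
    if isinstance(value, date):
        return value.isoformat()
    if isinstance(value, set):
        return sorted(value)
    raise TypeError(f"Cannot serialise {type(value).__name__}")


sighting = {"species": "Coot", "date": date(2026, 5, 10), "tags": {"water", "common"}}

encoded = json.dumps(sighting, default=encode_extra)
print(encoded)
# {"species": "Coot", "date": "2026-05-10", "tags": ["common", "water"]}

# Decoding gives plain strings and lists back; convert them explicitly
decoded = json.loads(encoded)
decoded["date"] = date.fromisoformat(decoded["date"])
decoded["tags"] = set(decoded["tags"])
print(decoded == sighting)  # True
```

Note that the conversion back is manual: JSON itself carries no type information, so the reader has to know which fields hold dates or sets.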

random — Controlled Randomness

The random module is useful for simulations, sampling, testing, and generating example data:

import random

species = ["Eagle", "Pigeon", "Stork", "Swan", "Heron", "Coot", "Kingfisher"]

# Pick one at random
spotted = random.choice(species)
print(f"You spotted a {spotted}!")

# Pick several without replacement
survey_sample = random.sample(species, k=3)
print(f"Survey sample: {survey_sample}")

# Shuffle a list in place
random.shuffle(species)
print(f"Shuffled order: {species}")

# Random float between 0 and 1
confidence = random.random()
print(f"Identification confidence: {confidence:.2%}")

# Random integer
flock_size = random.randint(5, 50)
print(f"Flock size: {flock_size}")

# Reproducible results with a seed
random.seed(42)
print(random.choice(["Eagle", "Pigeon", "Stork"]))  # always the same with seed 42

You spotted a Heron!
Survey sample: ['Kingfisher', 'Swan', 'Eagle']
Shuffled order: ['Coot', 'Stork', 'Swan', ...]
Identification confidence: 73.42%
Flock size: 23
Stork

random.seed() is invaluable for reproducible experiments and tests — give the same seed and you get the same sequence every time.
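
If you would rather not touch the module-wide generator, random.Random gives you a separate generator with its own seed. The function draw_counts below is a hypothetical helper, shown only to illustrate that the same seed reproduces the same sequence.

```python
import random


def draw_counts(seed: int, n: int = 5) -> list[int]:
    """Return n pseudo-random flock sizes from a generator with its own seed."""
    rng = random.Random(seed)  # independent generator, global state untouched
    return [rng.randint(1, 20) for _ in range(n)]


first = draw_counts(seed=42)
second = draw_counts(seed=42)

print(first == second)  # True: same seed, same sequence
```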

One important note: random is not cryptographically secure. For passwords, tokens, or anything security-related, use the secrets module instead.
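
A brief sketch of what that looks like. The token lengths and the alphabet are arbitrary choices for illustration, not recommendations.

```python
import secrets
import string

# 16 random bytes rendered as 32 hexadecimal characters
token = secrets.token_hex(16)
print(len(token))  # 32

# URL-safe token, for example for a password-reset link
reset_token = secrets.token_urlsafe(32)

# A 12-character password drawn from letters and digits
alphabet = string.ascii_letters + string.digits
password = "".join(secrets.choice(alphabet) for _ in range(12))
print(len(password))  # 12
```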

dataclasses — Structured Data Without Ceremony

The dataclasses module (Python 3.7+) provides the @dataclass decorator, which generates boilerplate class code for you: __init__, __repr__, and __eq__ by default. It is the cleanest way to define a data-holding class without writing everything by hand:

from dataclasses import dataclass, field
from datetime import date

@dataclass
class BirdObservation:
    species: str
    location: str
    date: date
    count: int = 1
    notes: str = ""
    tags: list[str] = field(default_factory=list)

    def summary(self) -> str:
        tag_str = ", ".join(self.tags) if self.tags else "no tags"
        return (
            f"{self.date} | {self.location} | "
            f"{self.count}x {self.species} ({tag_str})"
        )


obs1 = BirdObservation(
    species="Heron",
    location="Amsterdam",
    date=date(2026, 5, 10),
    count=2,
    tags=["wading", "feeding"],
)

obs2 = BirdObservation(
    species="Kingfisher",
    location="Den Haag",
    date=date(2026, 5, 10),
)

print(obs1)
print(obs1.summary())
print(obs2.summary())
print(obs1 == obs2)   # False: the field values differ

BirdObservation(species='Heron', location='Amsterdam', date=datetime.date(2026, 5, 10), count=2, notes='', tags=['wading', 'feeding'])
2026-05-10 | Amsterdam | 2x Heron (wading, feeding)
2026-05-10 | Den Haag | 1x Kingfisher (no tags)
False

Notice field(default_factory=list) for the tags attribute. Remember the mutable default argument gotcha from the functions post? This is how you handle it in a dataclass — default_factory creates a fresh list for each instance.
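
A small sketch shows the effect. The Checklist class is hypothetical; the second half relies on the fact that dataclasses refuses a bare list default with a ValueError when the class is defined.

```python
from dataclasses import dataclass, field


@dataclass
class Checklist:
    observer: str
    species: list[str] = field(default_factory=list)


anna = Checklist("Anna")
ben = Checklist("Ben")
anna.species.append("Heron")

print(anna.species)  # ['Heron']
print(ben.species)   # []

# A bare mutable default is rejected when the class is defined
try:
    @dataclass
    class BrokenChecklist:
        species: list = []
except ValueError as exc:
    print(f"Rejected: {exc}")
```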

@dataclass(frozen=True) makes instances immutable (like a tuple with named fields). @dataclass(order=True) adds comparison methods so you can sort a list of observations. These small additions cover a large fraction of what you would otherwise write in a full class.
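
Both options can be combined. The Sighting class below is a hypothetical example; note that order=True compares fields in declaration order, so count is placed first here to sort by count.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True, order=True)
class Sighting:
    count: int    # compared first because it is declared first
    species: str


sightings = [Sighting(3, "Heron"), Sighting(12, "Coot"), Sighting(1, "Kingfisher")]

# order=True compares instances field by field, like tuples
ranked = sorted(sightings)
print([s.species for s in ranked])  # ['Kingfisher', 'Heron', 'Coot']

# frozen=True blocks attribute assignment after creation
try:
    sightings[0].count = 99
except FrozenInstanceError:
    print("Sighting is read-only.")

# Frozen dataclasses are hashable, so they work in sets and as dict keys
print(len({Sighting(3, "Heron"), Sighting(3, "Heron")}))  # 1
```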

Putting It Together: A Bird Survey Tool

Here is a small tool that combines pathlib, datetime, json, random, and dataclasses into something you could actually use:

import json
import random
from dataclasses import dataclass, asdict, field
from datetime import date
from pathlib import Path


@dataclass
class BirdObservation:
    species: str
    location: str
    date: str          # ISO format string for JSON compatibility
    count: int = 1
    tags: list[str] = field(default_factory=list)


def simulate_survey(location: str, n: int = 5) -> list[BirdObservation]:
    """Generate n random bird observations for testing."""
    species_pool = ["Heron", "Coot", "Kingfisher", "Pigeon", "Swan", "Moorhen"]
    today = date.today().isoformat()
    return [
        BirdObservation(
            species=random.choice(species_pool),
            location=location,
            date=today,
            count=random.randint(1, 20),
        )
        for _ in range(n)
    ]


def save_survey(observations: list[BirdObservation], filepath: str) -> Path:
    path = Path(filepath)
    path.parent.mkdir(parents=True, exist_ok=True)
    data = [asdict(obs) for obs in observations]
    path.write_text(json.dumps(data, indent=2))
    print(f"Saved {len(data)} observations to {path}.")
    return path


def load_survey(filepath: str) -> list[BirdObservation]:
    path = Path(filepath)
    if not path.exists():
        print(f"No survey file at {path}.")
        return []
    raw = json.loads(path.read_text())
    return [BirdObservation(**record) for record in raw]


# Run it
random.seed(7)
survey = simulate_survey("Westbroekpark, Den Haag", n=4)
path = save_survey(survey, "/tmp/surveys/may_survey.json")

loaded = load_survey("/tmp/surveys/may_survey.json")
print("\nLoaded observations:")
for obs in loaded:
    print(f"  {obs.date} | {obs.location} | {obs.count}x {obs.species}")

Saved 4 observations to /tmp/surveys/may_survey.json.

Loaded observations:
  2026-05-10 | Westbroekpark, Den Haag | 14x Pigeon
  2026-05-10 | Westbroekpark, Den Haag | 3x Swan
  2026-05-10 | Westbroekpark, Den Haag | 11x Coot
  2026-05-10 | Westbroekpark, Den Haag | 7x Heron

Fewer than sixty lines of readable code, no third-party dependencies, and a working data pipeline. That is the standard library at its best.

Conclusion

The standard library is one of Python’s great strengths. Before installing any package, it is worth a quick check of what Python already ships with. The library index in the official Python documentation is the definitive reference, and it is much more readable than its reputation suggests.

In the final post in this series we look at generators and iterators, the Python feature that lets you work with sequences that are too large to fit in memory, or that do not fully exist yet. Think of migrating birds that arrive one by one rather than all at once. It is one of my favourite Python topics.

Did you like this post? Please let me know if you have any comments or suggestions — your feedback always helps!

Python's Standard Library: Your Built-in Toolkit

📚 This post is part of the "Python Basics" series