Introduction
In the previous post we learned to handle errors gracefully. Now let us talk about something that will save you enormous amounts of time: Python's standard library.
The standard library is the collection of modules that ships with every Python installation. It covers file system operations, dates and times, data serialisation, mathematics, networking, testing, compression, and much more. The first instinct of many beginners, and honestly of many experienced developers, is to reach for a third-party package. Often the right answer is already in the box.
The Python community has a saying: batteries included. This post is about finding the batteries.
Modules and import
Before exploring specific modules, a quick note on how to use them. A module is a Python file (or a package of files) that you bring into your script with import:
import math
print(math.sqrt(144)) # 12.0
If you only need one thing from a module, from x import y keeps the namespace cleaner:
from math import sqrt, pi
print(sqrt(144)) # 12.0
print(pi) # 3.141592653589793
You can alias a module to a shorter name with as:
import datetime as dt
today = dt.date.today()
That is all there is to it. The rest is knowing which modules exist and what they do.
pathlib: File Paths Done Right
Before pathlib (Python 3.4+), people assembled file paths by concatenating strings, which is error-prone and not portable across operating systems. pathlib gives you a Path object that behaves the way a path should:
from pathlib import Path
# Create a path (Path itself works on Windows, macOS, and Linux; /tmp assumes a Unix-like system)
data_dir = Path("/tmp/bird_data")
data_dir.mkdir(exist_ok=True) # create directory if it does not exist
# Build paths with /
observations_file = data_dir / "observations.txt"
print(observations_file) # /tmp/bird_data/observations.txt
print(observations_file.name) # observations.txt
print(observations_file.suffix) # .txt
print(observations_file.parent) # /tmp/bird_data
# Write and read
observations_file.write_text("Eagle, Amsterdam, 2026-05-10\nPigeon, Den Haag, 2026-05-10\n")
content = observations_file.read_text()
print(content)
# Check existence
print(observations_file.exists()) # True
# List directory contents
for f in data_dir.iterdir():
    print(f)
/tmp/bird_data/observations.txt
observations.txt
.txt
/tmp/bird_data
Eagle, Amsterdam, 2026-05-10
Pigeon, Den Haag, 2026-05-10
True
/tmp/bird_data/observations.txt
pathlib also integrates beautifully with open(), json, and the error handling from the previous post. Whenever you are working with files, reach for Path first.
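As a small sketch of that integration, the snippet below opens files through Path.open() and finds them by pattern with glob(). The temporary directory and the file names are assumptions made for illustration, chosen so the example runs anywhere without touching /tmp:

```python
import tempfile
from pathlib import Path

# Work in a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "bird_data"
    data_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical file names, purely for illustration.
    (data_dir / "monday.txt").write_text("Heron\nCoot\n", encoding="utf-8")
    (data_dir / "tuesday.txt").write_text("Swan\n", encoding="utf-8")
    (data_dir / "notes.md").write_text("ignore me\n", encoding="utf-8")

    # glob() matches files by pattern; sorted() gives a stable order.
    txt_files = sorted(data_dir.glob("*.txt"))
    names = [p.name for p in txt_files]

    # A Path can be handed to open() or opened directly with .open().
    species = []
    for p in txt_files:
        with p.open(encoding="utf-8") as f:
            species.extend(line.strip() for line in f)

print(names)    # ['monday.txt', 'tuesday.txt']
print(species)  # ['Heron', 'Coot', 'Swan']
```

Passing encoding="utf-8" explicitly is a design choice here: it keeps the behaviour the same across platforms whose default encodings differ.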
datetime: Dates and Times
Bird watching is nothing without timestamps. The datetime module handles dates, times, and the arithmetic between them:
from datetime import datetime, date, timedelta
# Today's date and current timestamp
today = date.today()
now = datetime.now()
print(f"Today: {today}")
print(f"Now: {now.strftime('%Y-%m-%d %H:%M:%S')}")
# Date arithmetic
migration_start = date(2026, 3, 15)
migration_end = date(2026, 5, 20)
duration = migration_end - migration_start
print(f"Migration lasted {duration.days} days.")
# A week from now
next_survey = today + timedelta(weeks=1)
print(f"Next survey: {next_survey}")
# Parsing a date string
observation_date = datetime.strptime("10/05/2026 08:30", "%d/%m/%Y %H:%M")
print(f"Observation: {observation_date.isoformat()}")
Today: 2026-05-10
Now: 2026-05-10 09:14:02
Migration lasted 66 days.
Next survey: 2026-05-17
Observation: 2026-05-10T08:30:00
strftime formats a datetime into a string; strptime parses a string back into a datetime. The format codes (%Y, %m, %d, and so on) follow the C standard: a bit cryptic at first, but well documented and consistent.
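A quick way to get comfortable with the two functions is to round-trip a value through both using the same format string. This is only a sketch; the variable names are made up for the example:

```python
from datetime import datetime

fmt = "%d/%m/%Y %H:%M"  # day/month/year hour:minute
original = datetime(2026, 5, 10, 8, 30)

text = original.strftime(fmt)          # datetime -> str
parsed = datetime.strptime(text, fmt)  # str -> datetime

print(text)                # 10/05/2026 08:30
print(parsed == original)  # True

# A string that does not match the format raises ValueError.
try:
    datetime.strptime("2026-05-10", fmt)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True

print(mismatch_detected)   # True
```

Because strptime raises ValueError on a mismatch, it pairs naturally with the try/except patterns from the previous post when you parse user input.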
For time zones, look at zoneinfo (Python 3.9+) or the third-party python-dateutil library. Naive datetimes (without time zone information) are fine for local use but cause headaches in distributed systems or when users are in multiple time zones.
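To make the naive versus aware distinction concrete, here is a minimal sketch that uses only datetime.timezone. The fixed +02:00 offset is an assumption standing in for Amsterdam summer time; for named zones with daylight saving rules you would use zoneinfo instead:

```python
from datetime import datetime, timedelta, timezone

# A naive datetime carries no time zone information at all.
naive = datetime(2026, 5, 10, 8, 30)
print(naive.tzinfo)  # None

# An aware datetime knows its offset from UTC.
utc_time = datetime(2026, 5, 10, 6, 30, tzinfo=timezone.utc)

# Assumption: a fixed +02:00 offset, labelled "CEST" for readability.
cest = timezone(timedelta(hours=2), name="CEST")
local_time = utc_time.astimezone(cest)
print(local_time.isoformat())  # 2026-05-10T08:30:00+02:00

# Both aware values describe the same instant.
print(local_time == utc_time)  # True

# Ordering a naive value against an aware one is an error, not a guess.
try:
    naive < utc_time
    mixed_comparison_failed = False
except TypeError:
    mixed_comparison_failed = True

print(mixed_comparison_failed)  # True
```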
json: Data Exchange
JSON is the lingua franca of data exchange on the web. Python's json module reads and writes it with two main functions:
import json
from pathlib import Path
# Python objects -> JSON string
flock = {
    "location": "Amsterdam",
    "date": "2026-05-10",
    "birds": [
        {"name": "Heron", "count": 3, "notes": "feeding near canal"},
        {"name": "Coot", "count": 12, "notes": None},
    ]
}
json_string = json.dumps(flock, indent=2)
print(json_string)
# JSON string -> Python objects
loaded = json.loads(json_string)
print(loaded["birds"][0]["name"]) # Heron
# Write to file / read from file
path = Path("/tmp/flock.json")
path.write_text(json.dumps(flock, indent=2))
with open(path) as f:
    from_file = json.load(f)
print(f"Loaded {len(from_file['birds'])} bird records from file.")
{
  "location": "Amsterdam",
  "date": "2026-05-10",
  "birds": [
    {
      "name": "Heron",
      "count": 3,
      "notes": "feeding near canal"
    },
    {
      "name": "Coot",
      "count": 12,
      "notes": null
    }
  ]
}
Heron
Loaded 2 bird records from file.
json.dumps serialises to a string; json.dump writes directly to a file object. json.loads parses a string; json.load reads from a file object. The naming is consistent once you see the pattern.
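The file-object pair can be tried without touching the disk. In this sketch io.StringIO stands in for a real file (an assumption made to keep the example self-contained):

```python
import io
import json

record = {"name": "Coot", "count": 12, "notes": None}

# json.dump writes straight into anything with a .write() method.
buffer = io.StringIO()
json.dump(record, buffer)

# Rewind, then json.load reads from anything with a .read() method.
buffer.seek(0)
restored = json.load(buffer)

print(restored == record)  # True
```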
One limitation: json handles strings, numbers, lists, dicts, booleans, and None. It does not natively handle datetime objects, custom classes, or sets. For those you either write a custom encoder/decoder or use a library like pydantic.
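For simple cases, the custom encoder can be a single function passed as the default argument of json.dumps. The sketch below is one possible approach, not the only one; encode_extra and the record fields are hypothetical names chosen for the example:

```python
import json
from datetime import date


def encode_extra(obj):
    """Fallback called by json.dumps for types it cannot serialise itself."""
    if isinstance(obj, date):
        return obj.isoformat()  # dates become ISO 8601 strings
    if isinstance(obj, set):
        return sorted(obj)      # sets become sorted lists
    raise TypeError(f"Cannot serialise {type(obj).__name__}")


record = {
    "species": "Heron",
    "seen_on": date(2026, 5, 10),
    "tags": {"wading", "feeding"},
}

encoded = json.dumps(record, default=encode_extra)
print(encoded)
# {"species": "Heron", "seen_on": "2026-05-10", "tags": ["feeding", "wading"]}

# Decoding is manual: JSON has no way to say "this string was a date".
decoded = json.loads(encoded)
decoded["seen_on"] = date.fromisoformat(decoded["seen_on"])
decoded["tags"] = set(decoded["tags"])
print(decoded == record)  # True
```

The decoding half shows the real cost of this approach: you have to know which fields to convert back, which is where a schema library starts to pay off.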
random: Controlled Randomness
The random module is useful for simulations, sampling, testing, and generating example data:
import random
species = ["Eagle", "Pigeon", "Stork", "Swan", "Heron", "Coot", "Kingfisher"]
# Pick one at random
spotted = random.choice(species)
print(f"You spotted a {spotted}!")
# Pick several without replacement
survey_sample = random.sample(species, k=3)
print(f"Survey sample: {survey_sample}")
# Shuffle a list in place
random.shuffle(species)
print(f"Shuffled order: {species}")
# Random float between 0 and 1
confidence = random.random()
print(f"Identification confidence: {confidence:.2%}")
# Random integer
flock_size = random.randint(5, 50)
print(f"Flock size: {flock_size}")
# Reproducible results with a seed
random.seed(42)
print(random.choice(["Eagle", "Pigeon", "Stork"])) # always the same with seed 42
You spotted a Heron!
Survey sample: ['Kingfisher', 'Swan', 'Eagle']
Shuffled order: ['Coot', 'Stork', 'Swan', ...]
Identification confidence: 73.42%
Flock size: 23
Stork
random.seed() is invaluable for reproducible experiments and tests: give the same seed and you get the same sequence every time.
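If you would rather not reseed the shared module-level generator, random.Random gives you independent generators, each with its own seed. A short sketch, with rng_a and rng_b as illustrative names:

```python
import random

species = ["Eagle", "Pigeon", "Stork", "Swan", "Heron", "Coot", "Kingfisher"]

# Two independent generators started from the same seed.
rng_a = random.Random(42)
rng_b = random.Random(42)

run_a = [rng_a.choice(species) for _ in range(5)]
run_b = [rng_b.choice(species) for _ in range(5)]

# Same seed, same sequence, regardless of what the global generator does.
print(run_a == run_b)  # True
```

Keeping the generator in a variable also makes it easy to pass into functions, so a test can control the randomness without affecting other code.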
One important note: random is not cryptographically secure. For passwords, tokens, or anything security-related, use the secrets module instead.
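For comparison, here is a minimal sketch of the secrets equivalents. The token length, password length, and alphabet are arbitrary choices for the example, not recommendations:

```python
import secrets
import string

# 16 random bytes rendered as 32 hexadecimal characters.
token = secrets.token_hex(16)

# secrets.choice mirrors random.choice but draws from the OS's secure source.
alphabet = string.ascii_letters + string.digits
password = "".join(secrets.choice(alphabet) for _ in range(12))

print(len(token))     # 32
print(len(password))  # 12
```

Note that secrets has no seed function: unpredictability is the whole point, so the output differs on every run.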
dataclasses: Structured Data Without Ceremony
The dataclasses module (Python 3.7+) provides the @dataclass decorator, which generates boilerplate class code for you: __init__, __repr__, and __eq__ by default. It is the cleanest way to define a data-holding class without writing everything by hand:
from dataclasses import dataclass, field
from datetime import date
@dataclass
class BirdObservation:
    species: str
    location: str
    date: date
    count: int = 1
    notes: str = ""
    tags: list[str] = field(default_factory=list)
    def summary(self) -> str:
        tag_str = ", ".join(self.tags) if self.tags else "no tags"
        return (
            f"{self.date} | {self.location} | "
            f"{self.count}x {self.species} ({tag_str})"
        )
obs1 = BirdObservation(
    species="Heron",
    location="Amsterdam",
    date=date(2026, 5, 10),
    count=2,
    tags=["wading", "feeding"],
)
obs2 = BirdObservation(
    species="Kingfisher",
    location="Den Haag",
    date=date(2026, 5, 10),
)
print(obs1)
print(obs1.summary())
print(obs2.summary())
print(obs1 == obs2) # False: the field values differ
BirdObservation(species='Heron', location='Amsterdam', date=datetime.date(2026, 5, 10), count=2, notes='', tags=['wading', 'feeding'])
2026-05-10 | Amsterdam | 2x Heron (wading, feeding)
2026-05-10 | Den Haag | 1x Kingfisher (no tags)
False
Notice field(default_factory=list) for the tags attribute. Remember the mutable default argument gotcha from the functions post? This is how you handle it in a dataclass: default_factory creates a fresh list for each instance.
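The effect is easy to verify. In this sketch, Checklist and Broken are hypothetical classes made up for the demonstration:

```python
from dataclasses import dataclass, field


@dataclass
class Checklist:  # hypothetical class, for illustration only
    observer: str
    species: list = field(default_factory=list)


a = Checklist("Anna")
b = Checklist("Bram")
a.species.append("Heron")

print(a.species)  # ['Heron']
print(b.species)  # [] (b has its own list)

# dataclass refuses a bare mutable default instead of silently sharing it.
try:
    @dataclass
    class Broken:
        species: list = []
    rejected = False
except ValueError:
    rejected = True

print(rejected)  # True
```

So a dataclass does not merely offer a fix for the gotcha; it stops you from writing the buggy version with a list, dict, or set default in the first place.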
@dataclass(frozen=True) makes instances immutable (like a tuple with named fields). @dataclass(order=True) adds comparison methods so you can sort a list of observations. These small additions cover a large fraction of what you would otherwise write in a full class.
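Both options can be combined. The sketch below uses a hypothetical Sighting class; with order=True, instances compare field by field in the order the fields are declared, so count is placed first to sort by it:

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True, order=True)
class Sighting:  # hypothetical class, for illustration only
    count: int    # compared first
    species: str  # tie-breaker


sightings = [Sighting(12, "Coot"), Sighting(3, "Heron"), Sighting(7, "Swan")]
ordered = sorted(sightings)
print([s.species for s in ordered])  # ['Heron', 'Swan', 'Coot']

# Assigning to a field of a frozen instance raises FrozenInstanceError.
try:
    ordered[0].count = 99
    blocked = False
except FrozenInstanceError:
    blocked = True

print(blocked)  # True

# Frozen dataclasses are hashable, so equal instances collapse in a set.
unique = {Sighting(3, "Heron"), Sighting(3, "Heron")}
print(len(unique))  # 1
```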
Putting It Together: A Bird Survey Tool
Here is a small tool that combines pathlib, datetime, json, random, and dataclasses into something you could actually use:
import json
import random
from dataclasses import dataclass, asdict, field
from datetime import date
from pathlib import Path
@dataclass
class BirdObservation:
    species: str
    location: str
    date: str # ISO format string for JSON compatibility
    count: int = 1
    tags: list[str] = field(default_factory=list)
def simulate_survey(location: str, n: int = 5) -> list[BirdObservation]:
    """Generate n random bird observations for testing."""
    species_pool = ["Heron", "Coot", "Kingfisher", "Pigeon", "Swan", "Moorhen"]
    today = date.today().isoformat()
    return [
        BirdObservation(
            species=random.choice(species_pool),
            location=location,
            date=today,
            count=random.randint(1, 20),
        )
        for _ in range(n)
    ]
def save_survey(observations: list[BirdObservation], filepath: str) -> Path:
    path = Path(filepath)
    path.parent.mkdir(parents=True, exist_ok=True)
    data = [asdict(obs) for obs in observations]
    path.write_text(json.dumps(data, indent=2))
    print(f"Saved {len(data)} observations to {path}.")
    return path
def load_survey(filepath: str) -> list[BirdObservation]:
    path = Path(filepath)
    if not path.exists():
        print(f"No survey file at {path}.")
        return []
    raw = json.loads(path.read_text())
    return [BirdObservation(**record) for record in raw]
# Run it
random.seed(7)
survey = simulate_survey("Westbroekpark, Den Haag", n=4)
path = save_survey(survey, "/tmp/surveys/may_survey.json")
loaded = load_survey("/tmp/surveys/may_survey.json")
print("\nLoaded observations:")
for obs in loaded:
    print(f" {obs.date} | {obs.location} | {obs.count}x {obs.species}")
Saved 4 observations to /tmp/surveys/may_survey.json.
Loaded observations:
2026-05-10 | Westbroekpark, Den Haag | 14x Pigeon
2026-05-10 | Westbroekpark, Den Haag | 3x Swan
2026-05-10 | Westbroekpark, Den Haag | 11x Coot
2026-05-10 | Westbroekpark, Den Haag | 7x Heron
About fifty lines of readable code, no third-party dependencies, and a working data pipeline. That is the standard library at its best.
Conclusion
The standard library is one of Python's great strengths. Before installing any package, it is worth a quick check of what Python already ships with. The official documentation index is the definitive reference, and it is much more readable than its reputation suggests.
In the final post in this series we look at generators and iterators, the Python feature that lets you work with sequences that are too large to fit in memory, or that do not fully exist yet. Think of migrating birds that arrive one by one, rather than all at once. It is one of my favourite Python topics.
Did you like this post? Please let me know if you have any comments or suggestions; your feedback always helps!