Ultimate Python Cheatsheet + Gotchas (v3.12‑ready)
Copy‑paste recipes, production gotchas, performance tips, typing patterns, and refactoring playbook — in one bookmarkable page.
Everyday Snippets You’ll Use Weekly
# Safe file read (UTF‑8, small files)
from pathlib import Path
data = Path("input.txt").read_text(encoding="utf-8")
# JSON load/save with UTF‑8 and pretty output
import json, sys
obj = json.loads(sys.stdin.read())
Path("out.json").write_text(json.dumps(obj, ensure_ascii=False, indent=2), encoding="utf-8")
# Timeout HTTP GET with retries (requests)
import requests, time
for attempt in range(3):
    try:
        r = requests.get("https://api.example.com", timeout=5)
        r.raise_for_status()
        break
    except requests.RequestException:
        if attempt == 2:
            raise
        time.sleep(0.7)
# Datetime: timezone‑aware now and parsing ISO‑8601
from datetime import datetime, timezone
now = datetime.now(timezone.utc)
dt = datetime.fromisoformat("2024-09-17T10:00:00+00:00")
# List of dicts to CSV (no pandas)
import csv
rows = [{"name": "Ada", "score": 99}, {"name": "Linus", "score": 95}]
with open("scores.csv", "w", newline="", encoding="utf-8") as f:
    w = csv.DictWriter(f, fieldnames=list(rows[0]))  # header taken from the first row's keys
    w.writeheader()
    w.writerows(rows)
# Top‑N frequency
from collections import Counter
top3 = Counter(["a","b","a","c","a","b"]).most_common(3)
# Safe temp directory (removed automatically, even if the work raises)
import tempfile
with tempfile.TemporaryDirectory() as tmp:
    ...  # do work inside tmp
# CLI with argparse
import argparse
p = argparse.ArgumentParser()
p.add_argument("--limit", type=int, default=10)
args = p.parse_args()
# Cached pure function (lru_cache; functools.cache is the unbounded shorthand in 3.9+)
from functools import lru_cache
@lru_cache(maxsize=1024)
def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)
# Concurrency: CPU vs I/O
# CPU‑bound: multiprocessing; I/O‑bound: asyncio/threads
Tip: Prefer pathlib over os.path for cleaner, cross‑platform file handling.
Top 12 Python Gotchas (and the Safe Fix)
1) Mutable default arguments
# Bad: the default list is created once and shared across calls
def add(item, bucket=[]):
    bucket.append(item)
    return bucket
# Good
def add(item, bucket=None):
    bucket = [] if bucket is None else bucket
    bucket.append(item)
    return bucket
2) datetime naive vs aware
from datetime import datetime, timezone
now = datetime.now(timezone.utc) # aware
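To see why this matters, here is a minimal sketch (variable names are illustrative): ordering a naive datetime against an aware one raises TypeError. The fix shown assumes the naive value was recorded in UTC; if it was local time, attach the matching zone instead.

```python
from datetime import datetime, timezone

naive = datetime(2024, 9, 17, 10, 0, 0)  # no tzinfo attached
aware = datetime(2024, 9, 17, 10, 0, 0, tzinfo=timezone.utc)

try:
    naive < aware
    mixed_comparison_failed = False
except TypeError:
    # Ordering comparisons between naive and aware datetimes are rejected.
    mixed_comparison_failed = True

# Assumption: the naive value was recorded in UTC, so we label it as such.
fixed = naive.replace(tzinfo=timezone.utc)
same_instant = fixed == aware
```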
3) Shadowing builtins
# Avoid naming vars: list, dict, id, type, file, input
items = []  # good: does not shadow the builtin list
4) json.dumps escapes non-ASCII by default
json.dumps(obj, ensure_ascii=False) # preserve Unicode
5) Joining paths manually
from pathlib import Path
p = Path("data") / "file.txt"
6) Floating point surprises
from decimal import Decimal
Decimal("0.1")+Decimal("0.2") == Decimal("0.3") # True
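A short sketch of the surprise itself and two common ways around it. When exact decimal arithmetic is not required, math.isclose is usually enough; when it is, build Decimal values from strings rather than floats.

```python
import math
from decimal import Decimal

# Binary floats cannot represent 0.1 or 0.2 exactly, so the sum is slightly off.
float_equal = (0.1 + 0.2 == 0.3)

# Tolerance-based comparison (default relative tolerance is 1e-09).
close_enough = math.isclose(0.1 + 0.2, 0.3)

# Decimal(0.1) inherits the float's representation error; Decimal("0.1") does not.
from_float_differs = Decimal(0.1) != Decimal("0.1")
```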
7) Logging vs print
import logging
logging.basicConfig(level=logging.INFO)
log = logging.getLogger("app")
log.info("Started")
8) Exception swallowing
# Bad: except Exception: pass
try:
    ...
except Exception as e:
    raise RuntimeError("context") from e
9) Regex backtracking bombs
import re
# Prefer precise patterns without nested quantifiers; atomic groups need Python 3.11+.
# The stdlib re module has no timeout; the third-party regex package accepts timeout=.
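A minimal sketch of the problem, using an illustrative pattern pair: nested quantifiers let the engine split a run of characters in exponentially many ways when the overall match fails. The rewrite accepts the same strings with a single quantifier. The samples are kept short on purpose so the risky pattern stays fast here.

```python
import re

# Nested quantifiers: (a+)+ can partition a run of "a" in exponentially many
# ways, all of which get explored when the overall match fails.
risky = re.compile(r"^(a+)+$")
# Same accepted strings, one quantifier, linear-time failure.
safe = re.compile(r"^a+$")

samples = ["a", "aaaa", "", "aab", "b"]
agree = all(bool(risky.match(s)) == bool(safe.match(s)) for s in samples)

# A long almost-matching input is the dangerous case for `risky`;
# `safe` rejects it after a single linear pass of backtracking.
hostile = "a" * 5000 + "!"
safe_rejects = safe.match(hostile) is None
```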
10) In‑place list modify while iterating
# Use list comprehension to filter
items = [x for x in items if predicate(x)]
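A small sketch of what goes wrong, with an illustrative is_even predicate: removing during iteration shifts later elements left, so the loop skips some of them. Slice assignment is one way to filter while keeping the same list object for other references.

```python
def is_even(n: int) -> bool:
    return n % 2 == 0

# Buggy: each removal shifts the remaining items, so the iterator skips one.
buggy = [2, 4, 6, 1]
for x in buggy:
    if is_even(x):
        buggy.remove(x)
# The 4 survives because it slid into the slot the loop had already visited.

# Safe: build the filtered list first, then replace the contents in place.
items = [2, 4, 6, 1]
items[:] = [x for x in items if not is_even(x)]
```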
11) Global state in tests
# Reset with fixtures; avoid singletons in app code
12) Async mixed with blocking I/O
# Use async clients or run_in_executor for blocking calls
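One way to sketch this on Python 3.9+ is asyncio.to_thread, a convenience wrapper around running a callable in the default executor. blocking_fetch below is a hypothetical stand-in for any blocking library call.

```python
import asyncio
import time

def blocking_fetch(delay: float) -> str:
    # Stand-in for a blocking call (e.g. a sync HTTP or DB client).
    time.sleep(delay)
    return "done"

async def main() -> list[str]:
    # Each blocking call runs in a worker thread, so the event loop stays free
    # and the two calls overlap instead of running back to back.
    return await asyncio.gather(
        asyncio.to_thread(blocking_fetch, 0.05),
        asyncio.to_thread(blocking_fetch, 0.05),
    )

results = asyncio.run(main())
```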
Lint for these early: ruff/flake8 + mypy + pytest in pre‑commit. This catches many of these bugs before runtime.
Pragmatic Typing Recipes (3.12)
from typing import TypedDict, Literal, Iterable, Iterator, Self
from dataclasses import dataclass
class UserTD(TypedDict):
    id: int
    name: str
    role: Literal["admin", "editor", "viewer"]
@dataclass(slots=True, frozen=True)
class Point:
    x: float
    y: float
    def move(self, dx: float, dy: float) -> Self:
        # type(self) keeps the Self return type accurate for subclasses
        return type(self)(self.x + dx, self.y + dy)
def batched(it: Iterable[int], size: int) -> Iterator[list[int]]:
    # Note: Python 3.12 also ships itertools.batched, which yields tuples
    bucket: list[int] = []
    for x in it:
        bucket.append(x)
        if len(bucket) == size:
            yield bucket
            bucket = []
    if bucket:
        yield bucket
Need | Use | Notes |
---|---|---|
Immutable DTO | dataclass(frozen=True, slots=True) | Hashable, memory‑efficient |
JSON‑like dict | TypedDict | Structural, optional keys via NotRequired |
Enum‑like strings | Literal[...] | Great for configs and guards |
Return this type | Self | Fluent APIs |
Performance Cheats (Fast Wins)
- OK Prefer list/dict/set comprehensions over loops.
- OK Use join for string concat: "".join(chunks).
- OK Move constants to module level; enable slots for dataclasses.
- WARN Avoid quadratic patterns: nested loops on large inputs.
- WARN Beware pandas row‑by‑row loops; vectorize or use apply sparingly.
- DANGER Don’t share mutable defaults across calls or globals in threads.
# Timing micro‑benchmarks
from time import perf_counter
t0 = perf_counter()
# ... code ...
print(f"{perf_counter()-t0:.6f}s")
Profile before optimizing. Use: cProfile, line_profiler, scalene. For I/O, batch operations and reuse sessions.
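A minimal cProfile sketch using only the standard library. slow_sum is a hypothetical workload; the report is captured into a string here so it can be logged or inspected rather than printed.

```python
import cProfile
import io
import pstats

def slow_sum(n: int) -> int:
    # Deliberately naive loop so it shows up in the profile.
    total = 0
    for i in range(n):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Sort by cumulative time and keep the top 5 entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
```

For a whole script, running it with python -m cProfile -s cumulative gives a similar report without code changes.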
Refactoring Patterns (Before → After)
Before: God function
def process(items):
    results = []
    for i in items:
        if i % 2 == 0:
            results.append(i*i)
    results.sort()
    return results[:10]
After: Pipeline + SRP
from itertools import islice
def square_evens(items):
    return (i*i for i in items if i % 2 == 0)
def top_n(values, n):
    # sorts the incoming values, then keeps the first n
    return list(islice(sorted(values), n))
out = top_n(square_evens(items), 10)
Aim for small, composable functions with nouns (data) separated from verbs (behavior). Test behavior, not internals.
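As a possible variation on the last pipeline step (a swapped-in technique, not part of the refactor above): heapq.nsmallest returns the n smallest values without sorting the whole input, which can help when n is small relative to the data. The input list is made up for illustration.

```python
from heapq import nsmallest

def square_evens(items):
    # Lazily yield the square of each even number.
    return (i * i for i in items if i % 2 == 0)

# Evens are 4, 8, 2, 6, so the squares are 16, 64, 4, 36.
out = nsmallest(3, square_evens([5, 4, 8, 2, 7, 6]))
```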
Testing Patterns (pytest)
# pyproject.toml (core bits)
[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "-q -ra"
# Example test with parameterization and tmp_path
import json, pytest
@pytest.mark.parametrize("a,b,ans", [(1,2,3),(10,-1,9)])
def test_add(a,b,ans):
    assert a+b == ans
def test_write(tmp_path):
    p = tmp_path / "out.json"
    p.write_text(json.dumps({"ok":1}), encoding="utf-8")
    assert p.exists()
- Use tmp_path for FS tests; never write to repo.
- Patch network with responses or pytest-httpx.
- Mark slow/integration with @pytest.mark.slow.
Production Security Checklist
- Never log secrets; mask tokens and use structured logging.
- Pin dependencies with hashes (pip-tools or uv); scan for known vulnerabilities with pip-audit.
- Validate and bound all untrusted input; add timeouts everywhere (HTTP, DB, subprocess).
- Use keyring/ENV for secrets; avoid committing .env files.
- For SSH/Paramiko: prefer key auth, verify host keys, set timeouts, and least privilege.
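For the first item in the checklist, one possible approach is a logging.Filter that redacts secrets before a handler writes them. The regex and the key names it looks for (token, password) are assumptions for illustration; adapt them to whatever your application actually logs.

```python
import io
import logging
import re

# Hypothetical pattern: matches "token=<value>" or "password=<value>" pairs.
SECRET_RE = re.compile(r"(?i)\b(token|password)=([^\s&]+)")

class RedactSecrets(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # Format first so secrets passed as %-style args are covered too.
        record.msg = SECRET_RE.sub(r"\1=***", record.getMessage())
        record.args = ()
        return True  # keep the record, now redacted

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(RedactSecrets())

log = logging.getLogger("redaction_demo")
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False  # keep the demo output confined to our handler

log.info("login ok token=%s user=%s", "abc123", "ada")
output = stream.getvalue()
```

Attaching the filter to the handler rather than the logger means records propagated from child loggers are redacted as well.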
Clipboard Candy (Handy One‑Liners)
# Human bytes
def h(n: float) -> str:
    for u in "BKMGTP":
        if n < 1024:
            return f"{n:.1f}{u}"
        n /= 1024
    return f"{n:.1f}E"  # fall through to the largest unit instead of returning None
# Flatten 2‑D
flat = [x for row in grid for x in row]
# Chunk list n
chunks = [items[i:i+n] for i in range(0,len(items),n)]
# Safe int
def to_int(s, default=None):
    try:
        return int(s)
    except (ValueError, TypeError):
        return default
Practical Answers
When should I choose asyncio vs threads vs processes?
Use asyncio for high‑concurrency I/O (HTTP, sockets), threads for simple I/O interop with blocking libs, and processes for CPU‑bound work (numerical loops, image/video processing).
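A rough sketch of the threads-versus-processes split using concurrent.futures. fake_download and count_primes are hypothetical stand-ins for blocking I/O and CPU-heavy work; the worker counts are arbitrary.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def fake_download(n: int) -> int:
    time.sleep(0.05)  # stand-in for blocking network I/O
    return n * 2

def count_primes(limit: int) -> int:
    # Deliberately CPU-heavy pure-Python loop (trial division).
    return sum(
        1
        for n in range(2, limit)
        if all(n % d for d in range(2, int(n ** 0.5) + 1))
    )

def run_io_jobs(jobs: list[int]) -> list[int]:
    # Threads overlap the waiting time of blocking calls.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(fake_download, jobs))

def run_cpu_jobs(limits: list[int]) -> list[int]:
    # Processes sidestep the GIL for CPU-bound work. Call this from under
    # `if __name__ == "__main__":` so workers can import the module safely.
    with ProcessPoolExecutor() as pool:
        return list(pool.map(count_primes, limits))
```

Both executors share the same map/submit interface, so switching between them while measuring is cheap.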
What’s the fastest way to read large CSVs without pandas?
Use csv with DictReader, predeclare fieldnames, and process in streaming fashion; for extreme speed, consider polars or pyarrow.
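A minimal streaming sketch with csv.DictReader: rows are yielded one at a time, so memory use stays flat regardless of file size. The column names and the high_scores helper are illustrative, and an in-memory buffer stands in for a real file.

```python
import csv
import io
from collections.abc import Iterator
from typing import TextIO

def high_scores(source: TextIO, minimum: int) -> Iterator[dict[str, str]]:
    # DictReader reads lazily; only one row is held in memory at a time.
    reader = csv.DictReader(source)
    for row in reader:
        if int(row["score"]) >= minimum:
            yield row

# Stand-in for open("scores.csv", newline="", encoding="utf-8").
sample = io.StringIO("name,score\nAda,99\nLinus,95\nGuido,70\n")
names = [row["name"] for row in high_scores(sample, 90)]
```

If the file has no header row, pass fieldnames=[...] to DictReader so the first data line is not consumed as the header.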
How do I structure a small CLI project?
Use a src/ layout and type hints; build the CLI with argparse or typer; add ruff, mypy, and pytest; and declare console scripts as entry points in pyproject.toml.
© 2025 Pythoneo · This page is designed to be highly linkable: evergreen, concise, practical, and safe to bookmark.