Ultimate Python Cheatsheet + Gotchas (v3.12‑ready)
Copy‑paste recipes, production gotchas, performance tips, typing patterns, and refactoring playbook — in one bookmarkable page.
Everyday Snippets You’ll Use Weekly
# Safe file read (UTF‑8, small files)
from pathlib import Path
data = Path("input.txt").read_text(encoding="utf-8")
# JSON load/save with UTF‑8 and pretty output
import json, sys
obj = json.loads(sys.stdin.read())
Path("out.json").write_text(json.dumps(obj, ensure_ascii=False, indent=2), encoding="utf-8")
# Timeout HTTP GET with retries (requests)
import requests, time
for attempt in range(3):
try:
r = requests.get("https://api.example.com", timeout=5)
r.raise_for_status()
break
except requests.RequestException:
if attempt == 2: raise
time.sleep(0.7)
# Datetime: timezone‑aware now and parsing ISO‑8601
from datetime import datetime, timezone
now = datetime.now(timezone.utc)
dt = datetime.fromisoformat("2024-09-17T10:00:00+00:00")
# Dict of lists to CSV (no pandas)
import csv
rows = [{"name":"Ada","score":99},{"name":"Linus","score":95}]
with open("scores.csv","w",newline="",encoding="utf-8") as f:
w = csv.DictWriter(f, fieldnames=rows.keys()); w.writeheader(); w.writerows(rows)
# Top‑N frequency
from collections import Counter
top3 = Counter(["a","b","a","c","a","b"]).most_common(3)
# Safe temp directory
import tempfile, shutil
tmp = tempfile.mkdtemp()
# ... do work ...
shutil.rmtree(tmp)
# CLI with argparse
import argparse
p = argparse.ArgumentParser()
p.add_argument("--limit", type=int, default=10)
args = p.parse_args()
# Cached pure function (Python 3.9+)
from functools import lru_cache
@lru_cache(maxsize=1024)
def fib(n:int)->int: return n if n<2 else fib(n-1)+fib(n-2)
# Concurrency: CPU vs I/O
# CPU‑bound: multiprocessing; I/O‑bound: asyncio/threads
Tip: Prefer pathlib over os.path for cleaner, cross‑platform file handling.
Top 12 Python Gotchas (and the Safe Fix)
1) Mutable default arguments
# Bad
def add(item, bucket=[]): bucket.append(item); return bucket
# Good
def add(item, bucket=None):
bucket = [] if bucket is None else bucket
bucket.append(item); return bucket
2) datetime naive vs aware
from datetime import datetime, timezone
now = datetime.now(timezone.utc) # aware
3) Shadowing builtins
# Avoid naming vars: list, dict, id, type, file, input
items = # good
4) json dumps non‑UTF‑8
json.dumps(obj, ensure_ascii=False) # preserve Unicode
5) Joining paths manually
from pathlib import Path
p = Path("data") / "file.txt"
6) Floating point surprises
from decimal import Decimal
Decimal("0.1")+Decimal("0.2") == Decimal("0.3") # True
7) Logging vs print
import logging
logging.basicConfig(level=logging.INFO)
log = logging.getLogger("app")
log.info("Started")
8) Exception swallowing
# Bad: except Exception: pass
try:
...
except Exception as e:
raise RuntimeError("context") from e
9) Regex backtracking bombs
import re
# Prefer atomic & precise patterns; add timeouts with re.compile(..., flags)
10) In‑place list modify while iterating
# Use list comprehension to filter
items = [x for x in items if predicate(x)]
11) Global state in tests
# Reset with fixtures; avoid singletons in app code
12) Async mixed with blocking I/O
# Use async clients or run_in_executor for blocking calls
Lint for these early: ruff/flake8 + mypy + pytest in pre‑commit. Catch 80% of bugs before runtime.
Pragmatic Typing Recipes (3.12)
from typing import TypedDict, Literal, Iterable, Iterator, Self
from dataclasses import dataclass
class UserTD(TypedDict):
id: int
name: str
role: Literal["admin","editor","viewer"]
@dataclass(slots=True, frozen=True)
class Point:
x: float; y: float
def move(self, dx: float, dy: float) -> Self:
return Point(self.x+dx, self.y+dy)
def batched(it: Iterable[int], size: int) -> Iterator[list[int]]:
bucket: list[int] = []
for x in it:
bucket.append(x)
if len(bucket) == size:
yield bucket; bucket = []
if bucket: yield bucket
| Need | Use | Notes |
|---|---|---|
| Immutable DTO | dataclass(frozen=True, slots=True) | Hashable, memory‑efficient |
| JSON‑like dict | TypedDict | Structural, optional keys via NotRequired |
| Enum‑like strings | Literal[...] | Great for configs and guards |
| Return this type | Self | Fluent APIs |
Performance Cheats (Fast Wins)
- OK Prefer list/dict/set comprehensions over loops.
- OK Use join for string concat:
"".join(chunks). - OK Move constants to module level; enable
slotsfor dataclasses. - WARN Avoid quadratic patterns: nested loops on large inputs.
- WARN Beware pandas row‑by‑row loops; vectorize or use
applysparingly. - DANGER Don’t share mutable defaults across calls or globals in threads.
# Timing micro‑benchmarks
from time import perf_counter
t0 = perf_counter()
# ... code ...
print(f"{perf_counter()-t0:.6f}s")
Profile before optimizing. Use:
cProfile, line_profiler, scalene. For I/O, batch operations and reuse sessions.
Refactoring Patterns (Before → After)
Before: God function
def process(items):
results = []
for i in items:
if i % 2 == 0:
results.append(i*i)
results.sort()
return results[:10]
After: Pipeline + SRP
def square_evens(items): return (i*i for i in items if i % 2 == 0)
def top_n(sorted_iter, n):
from itertools import islice
return list(islice(sorted(sorted_iter), n))
out = top_n(square_evens(items), 10)
Aim for small, composable functions with nouns (data) separated from verbs (behavior). Test behavior, not internals.
Testing Patterns (pytest)
# pyproject.toml (core bits)
[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "-q -ra"
# Example test with parameterization and tmp_path
import json, pytest
@pytest.mark.parametrize("a,b,ans", [(1,2,3),(10,-1,9)])
def test_add(a,b,ans):
assert a+b == ans
def test_write(tmp_path):
p = tmp_path / "out.json"
p.write_text(json.dumps({"ok":1}), encoding="utf-8")
assert p.exists()
- Use
tmp_pathfor FS tests; never write to repo. - Patch network with
responsesorpytest-httpx. - Mark slow/integration with
@pytest.mark.slow.
Production Security Checklist
- Never log secrets; mask tokens and set structured logs.
- Pin dependencies with hashes; scan via
pip-audit/uv/pip-tools. - Validate and bound all untrusted input; add timeouts everywhere (HTTP, DB, subprocess).
- Use keyring/ENV for secrets; avoid committing .env files.
- For SSH/Paramiko: prefer key auth, verify host keys, set timeouts, and least privilege.
Clipboard Candy (Handy One‑Liners)
# Human bytes
def h(n:int)->str:
for u in "BKMGTPE":
if n<1024: return f"{n:.1f}{u}"
n/=1024
# Flatten 2‑D
flat = [x for row in grid for x in row]
# Chunk list n
chunks = [items[i:i+n] for i in range(0,len(items),n)]
# Safe int
def to_int(s, default=None):
try: return int(s)
except (ValueError,TypeError): return default
Practical Answers
When should I choose asyncio vs threads vs processes?
Use asyncio for high‑concurrency I/O (HTTP, sockets), threads for simple I/O interop with blocking libs, and processes for CPU‑bound work (numerical loops, image/video processing).
What’s the fastest way to read large CSVs without pandas?
Use csv with DictReader, predeclare fieldnames, and process in streaming fashion; for extreme speed, consider polars or pyarrow.
How do I structure a small CLI project?
Use src/ layout, type hints, argparse or typer, add ruff, mypy, pytest, and publish scripts in pyproject.toml with entry points.
If this page helped, consider linking to it from “Resources,” “Style Guides,” or “Cheatsheets” sections. Sharing encourages more free content like this.
© 2025 Pythoneo · This page is designed to be highly linkable: evergreen, concise, practical, and safe to bookmark.
