Seaborn is a high‑level statistical visualization library built on matplotlib, designed to turn tidy data into clear, publication‑quality charts with minimal code. Install Seaborn, pick a plot function such as scatterplot, histplot, or catplot, pass a pandas DataFrame and column names, then apply a theme with set_theme() and a color palette for consistent, attractive output.
In short: install Seaborn (pip install seaborn), load a dataset into a pandas DataFrame, call a Seaborn function (e.g., sns.scatterplot(data=df, x="col1", y="col2", hue="group")), and polish the figure with sns.set_theme(style="whitegrid") and a palette. For small multiples and conditioning by categories, use sns.catplot(..., col="category") or FacetGrid.
What this guide covers
- Installing Seaborn and setting a global theme to get great defaults out of the box.
- Core plot families: relational, distribution, categorical, regression, and multi‑plot grids, with copy‑paste snippets.
- Color palettes, figure styling, and multi‑panel layouts that communicate statistical structure clearly.
Prerequisites
- Python 3.8+ with pandas and matplotlib; install everything via pip or conda.
- A tidy (long‑form) DataFrame with well‑named columns for dataset‑oriented plotting.
Step‑by‑step: From install to polished charts
1) Install and set the theme
# Install (choose one)
pip install seaborn
# or
conda install seaborn
# Basic usage
import seaborn as sns
import pandas as pd
sns.set_theme(style="whitegrid", context="notebook", palette="deep") # great defaults
Seaborn provides opinionated defaults through set_theme and integrates with matplotlib rcParams for cohesive styling.
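To make the link between set_theme and rcParams concrete, here is a minimal sketch. The "Agg" backend is an assumption so that the snippet runs without a display, and the variable names are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import seaborn as sns

# set_theme() writes Seaborn's style into matplotlib's global rcParams.
sns.set_theme(style="whitegrid", context="notebook", palette="deep")
grid_enabled = plt.rcParams["axes.grid"]  # the whitegrid style turns the grid on

# Individual rcParams can be overridden in the same call through `rc`.
sns.set_theme(style="whitegrid", rc={"figure.figsize": (8, 4)})
figsize = tuple(plt.rcParams["figure.figsize"])

# axes_style() used as a context manager applies a style temporarily.
with sns.axes_style("dark"):
    fig, ax = plt.subplots()
    grid_inside = plt.rcParams["axes.grid"]  # the dark style has no grid
plt.close(fig)

grid_after = plt.rcParams["axes.grid"]  # restored to the whitegrid setting
```

Because the theme is global, every figure created afterwards inherits it until set_theme is called again.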
2) Load a dataset
# Any pandas DataFrame works; Seaborn also ships sample datasets
df = sns.load_dataset("tips") # total_bill, tip, sex, smoker, day, time, size
df.head()
The dataset‑oriented API lets you specify columns by name, avoiding manual array wrangling and enabling concise, declarative plots.
3) Relational plots: trends and groups
ax = sns.scatterplot(data=df, x="total_bill", y="tip", hue="time", style="smoker", size="size")
ax.set(title="Tips vs. Bill", xlabel="Total bill ($)", ylabel="Tip ($)")
Relational functions like scatterplot and lineplot map variables to visual encodings (color, size, style) for multivariate patterns with minimal code.
4) Distributions: univariate and bivariate
# Univariate
sns.histplot(data=df, x="total_bill", bins=30, kde=True)
# Bivariate density
sns.jointplot(data=df, x="total_bill", y="tip", kind="kde", fill=True)
Use histplot, kdeplot, and jointplot to summarize distributions and relationships; KDEs provide smooth estimates when appropriate.
5) Categorical comparisons with uncertainty
# Compare group means with 95% confidence intervals
# (errorbar= replaces the deprecated ci= argument in Seaborn 0.12+)
sns.catplot(data=df, x="day", y="tip", hue="smoker", kind="bar", errorbar=("ci", 95), height=4, aspect=1.4)
# Alternatives: boxen/violin/strip/swarm for different tradeoffs
catplot unifies categorical charts and returns a FacetGrid for small multiples by row/col, which is handy for split‑by analyses.
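The returned FacetGrid can be used to relabel and retitle every panel at once. The following is a minimal sketch on synthetic data; the column names mimic the tips dataset but the values are random.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(2)
df = pd.DataFrame({
    "day": rng.choice(["Thu", "Fri"], size=120),
    "time": rng.choice(["Lunch", "Dinner"], size=120),
    "tip": rng.gamma(shape=2.0, scale=1.5, size=120),
})

# One box-plot panel per level of "time"; catplot returns a FacetGrid.
g = sns.catplot(data=df, x="day", y="tip", col="time", kind="box",
                height=3.5, aspect=1.1)

# Grid-level helpers apply to all panels.
g.set_axis_labels("Day", "Tip ($)")
g.set_titles("{col_name}")
```

Swapping kind="box" for "violin", "strip", or "swarm" keeps the same layout and changes only the representation.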
6) Regression and statistical relationships
# Visualize linear fit with confidence interval
sns.lmplot(data=df, x="total_bill", y="tip", hue="smoker", height=4, aspect=1.3, ci=95)
Seaborn’s regression plots add trend lines and intervals, emphasizing statistical structure without manual model code for common cases.
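Seaborn draws the fit but does not hand back its coefficients, so when the numbers matter they have to be computed separately. The sketch below pairs regplot with numpy's polyfit on synthetic data generated from y = 2x + 1 plus noise; using polyfit here is an illustrative choice, not something the plotting call does for you.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, 200)
y = 2.0 * x + 1.0 + rng.normal(0, 1.0, 200)  # known slope 2, intercept 1
df = pd.DataFrame({"x": x, "y": y})

fig, ax = plt.subplots(figsize=(6, 4))
# Axes-level regression plot: scatter, fitted line, and a 95% interval band.
sns.regplot(data=df, x="x", y="y", ci=95, scatter_kws={"alpha": 0.4}, ax=ax)

# Fit the same straight line explicitly to obtain the coefficients.
slope, intercept = np.polyfit(df["x"], df["y"], deg=1)
ax.set_title(f"fit: y = {slope:.2f}x + {intercept:.2f}")
```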
7) Multi‑plot grids and conditioning
g = sns.FacetGrid(df, col="day", row="time", hue="smoker", margin_titles=True)
g.map_dataframe(sns.scatterplot, x="total_bill", y="tip")
g.add_legend()
Facet grids make it easy to compare conditional subsets side by side, a core design principle behind Seaborn’s dataset‑oriented API.
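When there is only one conditioning variable with many levels, a figure-level function with col_wrap is often a shorter route to the same kind of grid. A sketch on synthetic data (values are random, column names borrowed from the tips example):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(4)
days = ["Thu", "Fri", "Sat", "Sun"]
df = pd.DataFrame({
    "day": np.repeat(days, 40),
    "total_bill": rng.uniform(5, 50, 160),
})
df["tip"] = 0.15 * df["total_bill"] + rng.normal(0, 1, 160)

# relplot builds the FacetGrid itself; col_wrap=2 folds four panels into 2 x 2.
g = sns.relplot(data=df, x="total_bill", y="tip", col="day", col_wrap=2,
                height=3, kind="scatter")
g.set_titles("{col_name}")

panel_titles = {ax.get_title() for ax in g.axes.flat}
```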
Color, palettes, and figure styling
# Choose a palette type that matches the data
sns.set_theme(style="whitegrid")
sequential = sns.color_palette("viridis") # sequential
diverging = sns.color_palette("coolwarm") # diverging
qualitative = sns.color_palette("tab10") # qualitative
Use qualitative palettes for categories, sequential for ordered magnitudes, and diverging for deviations around a midpoint; keep accessibility in mind.
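The sketch below shows what color_palette returns and how a palette reaches a plot. The "colorblind" palette is used as one accessibility-minded option; the DataFrame is synthetic and its column names are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
from matplotlib.colors import Colormap
import numpy as np
import pandas as pd
import seaborn as sns

# A palette is a list-like sequence of RGB tuples.
qualitative = sns.color_palette("colorblind")
sequential = sns.color_palette("viridis", n_colors=5)

# as_cmap=True returns a continuous matplotlib colormap instead.
cmap = sns.color_palette("viridis", as_cmap=True)

rng = np.random.default_rng(5)
df = pd.DataFrame({
    "x": rng.normal(size=90),
    "y": rng.normal(size=90),
    "group": np.repeat(["a", "b", "c"], 30),
})

fig, ax = plt.subplots(figsize=(5, 4))
# Pass a palette by name (or as a list of colors) to any function that takes hue.
sns.scatterplot(data=df, x="x", y="y", hue="group", palette="colorblind", ax=ax)
```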
Seaborn vs. matplotlib: when to use which?
- Use Seaborn when plotting tidy DataFrames and statistical patterns with defaults that “just work”.
- Drop to matplotlib for fine‑grained control of spines, ticks, and advanced annotations; Seaborn returns axes/grids you can customize.
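A small sketch of that hand-off: Seaborn draws the plot, then plain matplotlib calls adjust spines, ticks, and annotations on the returned Axes. The data and the annotation text are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator
import pandas as pd
import seaborn as sns

df = pd.DataFrame({
    "total_bill": [12.5, 18.0, 24.3, 31.9, 40.2, 48.7],
    "tip": [2.0, 3.1, 3.5, 4.8, 9.0, 6.2],
})

fig, ax = plt.subplots(figsize=(6, 4))
sns.scatterplot(data=df, x="total_bill", y="tip", ax=ax)

# Seaborn helper: remove the top and right spines.
sns.despine(ax=ax)

# Pure matplotlib: tick spacing and an annotation on the largest tip.
ax.xaxis.set_major_locator(MultipleLocator(10))
top = df.loc[df["tip"].idxmax()]
ax.annotate("largest tip", xy=(top["total_bill"], top["tip"]),
            xytext=(10, -15), textcoords="offset points",
            arrowprops={"arrowstyle": "->"})
```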
Common pitfalls and quick fixes
- Mismatched column names or wide‑form data: reshape to long form with pd.melt for dataset‑oriented functions.
- Overplotting in dense scatterplots: switch to hexbin via jointplot(kind="hex") or add transparency with alpha.
- Theme clashes: call sns.set_theme() once at the top to standardize look and feel across figures.
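The wide-to-long reshape from the first pitfall can be sketched as follows. The table is hypothetical: one row per city and one column per quarter, which pd.melt turns into one row per observation.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Wide form: each quarter is its own column.
wide = pd.DataFrame({
    "city": ["A", "B", "C"],
    "q1": [10, 14, 9],
    "q2": [12, 13, 11],
    "q3": [15, 16, 10],
})

# Long form: one row per (city, quarter) pair, ready for x=/y=/hue=.
long = pd.melt(wide, id_vars="city", var_name="quarter", value_name="revenue")

fig, ax = plt.subplots(figsize=(6, 4))
sns.barplot(data=long, x="quarter", y="revenue", hue="city", ax=ax)
```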
FAQ
Question | Answer |
---|---|
How do I change figure size globally? | Use sns.set_theme(), then adjust matplotlib rcParams (e.g., plt.rcParams["figure.figsize"] = (8, 4)), or pass height/aspect to Seaborn’s figure‑level functions like catplot. |
Can Seaborn handle missing values? | Most Seaborn functions skip rows with NaN in the plotted variables; consider an explicit dropna() or imputation before plotting so that what is shown is clear and axes stay consistent. |
How do I export publication‑quality images? | Use plt.savefig("fig.png", dpi=300, bbox_inches="tight") or vector formats like SVG/PDF for print; Seaborn builds on matplotlib’s exporters. |
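
The last two answers can be combined into one short sketch: drop incomplete rows explicitly, then export the figure as both raster and vector output. Writing to in-memory buffers is an assumption made here so the snippet leaves no files behind; a file path such as "fig.png" works the same way.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

df = pd.DataFrame({
    "x": [1.0, 2.0, np.nan, 4.0],
    "y": [2.0, np.nan, 3.0, 5.0],
})

# Make the handling of missing values explicit before plotting.
clean = df.dropna(subset=["x", "y"])

fig, ax = plt.subplots(figsize=(8, 4))
sns.scatterplot(data=clean, x="x", y="y", ax=ax)

# Raster export at print resolution.
png = io.BytesIO()
fig.savefig(png, format="png", dpi=300, bbox_inches="tight")

# Vector export that scales without loss.
svg = io.BytesIO()
fig.savefig(svg, format="svg", bbox_inches="tight")
```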