Seaborn is a library built on top of matplotlib that specializes in statistical visualization. It integrates seamlessly with pandas, which makes it easier for beginners to get started. Compared with matplotlib, Seaborn has a more concise syntax; the relationship between the two is similar to the relationship between numpy and pandas.

2.1 Installation:

1) Linux

sudo pip install seaborn

2) Windows

pip install seaborn

2.2 Quick start

import seaborn as sns

sns.set(style="ticks")

from matplotlib import pyplot

# Load dataset

tips = sns.load_dataset("tips")

# Draw the plot

sns.boxplot(x="day", y="total_bill", hue="sex", data=tips, palette="PRGn")

sns.despine(offset=10, trim=True)

# Save and display the figure

pyplot.savefig("GroupedBoxplots.png")

pyplot.show()

2.3 Common seaborn methods

1, Univariate analysis plots

1) Central tendency of a distribution: shows how closely the data cluster around their central value

import numpy as np

x = np.random.normal(size=100)

# kde=False turns off the kernel density estimate; rug=True draws a small tick on the x axis for each observation (a rug plot)

sns.distplot(x, kde=True)

2, A scatter plot is the best way to observe the joint distribution of two variables

1) Fit a probability density function directly

sns.jointplot(x="x", y="y", data=df, kind="kde")

2) A hex plot shows the distribution of the points more intuitively

Hex plot (for large amounts of data)

It is best drawn in black and white

When there is a large amount of data, use a hex plot: the darker a cell, the more points fall in it

mean, cov = [0, 1], [(1, .5), (.5, 1)]

data = np.random.multivariate_normal(mean, cov, 200)

df = pd.DataFrame(data, columns=["x", "y"])

x, y = np.random.multivariate_normal(mean, cov, 1000).T

with sns.axes_style("ticks"):

    sns.jointplot(x=x, y=y, kind="hex")

3, Pairwise plots of multiple variables

# Iris dataset

iris = sns.load_dataset("iris")

sns.pairplot(iris)

4, Plot types in Seaborn

1, Box plot

import matplotlib.pyplot as plt

import numpy as np

A box plot shows the median Q2, the lower quartile Q1, the upper quartile Q3, and the outliers

IQR = Q3 - Q1

A value below Q1 - 1.5 IQR or above Q3 + 1.5 IQR is treated as an outlier
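The outlier rule can be checked by hand with numpy. This is a small worked sketch; the sample values are invented for illustration, and np.percentile uses its default linear interpolation.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 50], dtype=float)

# Lower and upper quartiles
q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1

# Fences at 1.5 IQR beyond the quartiles
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

outliers = values[(values < lower) | (values > upper)]
print(q1, q3, iqr)    # 3.25 7.75 4.5
print(lower, upper)   # -3.5 14.5
print(outliers)       # [50.]
```

Only 50 lies outside the fences, so it is the single point a box plot would draw beyond the whiskers for this sample.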

tang_data = [np.random.normal(0, std, 100) for std in range(1,4)]

fig = plt.figure(figsize=(8,6))

plt.boxplot(tang_data, vert=True, notch=True)

plt.xticks([x+1 for x in range(len(tang_data))], ["x1", "x2", "x3"])

plt.xlabel("x")

plt.title("box plot")

import seaborn as sns

import numpy as np

import matplotlib.pyplot as plt

import pandas as pd

2, Single feature histogram

1)distplot

x = np.random.normal(size=100)

sns.distplot(x, kde=False, bins=20)

2) countplot (count plot)

As the name implies, countplot draws counts. It can be thought of as a histogram for a categorical variable, or as a way to compare the number of observations in each category: a barplot that uses a counting function.

seaborn.countplot(x=None, y=None, hue=None, data=None, order=None,
                  hue_order=None, orient=None, color=None, palette=None, saturation=0.75,
                  ax=None, **kwargs)

x, y, hue: names of variables in data or vector data, optional

data: DataFrame, array, or list of arrays, optional

order, hue_order: lists of strings, optional # order in which the categories are drawn

orient: "v" | "h", optional # vertical or horizontal orientation

ax: matplotlib Axes, optional # the Axes (subplot) to draw on; the basics of drawing are described in the next section
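A minimal sketch of countplot on a tiny hand-made table, to show that each bar height is simply the number of rows in that category. The DataFrame `toy` and its `day` values are invented for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import pandas as pd
import seaborn as sns

toy = pd.DataFrame({"day": ["Thur", "Fri", "Fri", "Sat", "Sat", "Sat"]})

# One bar per category; the height is the number of rows in that category
ax = sns.countplot(x="day", data=toy)

bar_heights = sorted(p.get_height() for p in ax.patches if p.get_height() > 0)
expected = sorted(toy["day"].value_counts().tolist())
print(bar_heights, expected)
```

The sorted bar heights match the result of pandas value_counts on the same column.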

3, Use a scatter plot to analyze the relationship between two features

mean, cov = [0,1], [(1, .5), (.5,1)]

data = np.random.multivariate_normal(mean, cov, 200)

df = pd.DataFrame(data, columns=["X1", "X2"])

sns.jointplot(x="X1", y="X2", data=df)

# kind="hex" draws hexagonal bins

data = np.random.multivariate_normal(mean, cov, 2000).T

with sns.axes_style("white"):

    sns.jointplot(x=data[0], y=data[1], kind="hex", color="k")

4, Look at the pairwise relationships between variables

iris = sns.load_dataset("iris")

sns.pairplot(iris)

5, Bar plot

titanic = sns.load_dataset("titanic")

sns.barplot(x="sex", y="survived", data=titanic, hue="class")

Point plot: emphasizes how the values change between categories rather than their central tendency

sns.pointplot(x="sex", y="survived", data=titanic, hue="class")

sns.pointplot(x="class", y="survived", data=titanic, hue="sex",
              palette={"male":"g","female":"m"}, markers=["^", "o"], linestyles=["-","--"])
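Both barplot and pointplot summarize each category with an estimate, which is the mean by default, and add error bars around it. A small sketch that compares the bar heights with the group means computed by pandas; the table `toy` and its column names are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import pandas as pd
import seaborn as sns

toy = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b"],
    "value": [1.0, 2.0, 3.0, 4.0, 6.0],
})

# Mean of "value" within each group, computed directly
group_means = toy.groupby("group")["value"].mean()

# barplot draws one bar per group at the default estimate (the mean)
ax = sns.barplot(x="group", y="value", data=toy)
bar_heights = sorted(p.get_height() for p in ax.patches if p.get_height() > 0)

print(group_means.tolist())
print(bar_heights)
```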

tips = sns.load_dataset("tips", data_home=".")

# jitter adds a small random offset so that points do not overlap

sns.stripplot(x="day", y="total_bill", data=tips, jitter=True)

sns.swarmplot(x="day", y="total_bill", data=tips)

sns.swarmplot(x="day", y="total_bill", data=tips, hue="sex")

sns.swarmplot(x="day", y="total_bill", data=tips, hue="time")

6, Box plot

sns.boxplot(x="day", y="total_bill", data=tips, hue="time")

7, Violin plot

sns.violinplot(x="day", y="total_bill", data=tips, hue="sex", split=True)

sns.violinplot(x="day", y="total_bill", data=tips, inner=None, split=True)

sns.swarmplot(x="day", y="total_bill", data=tips, color="k", alpha=1.0)

8, Heat map: the color shows the magnitude of each value and how the values change

uniform_data = np.random.rand(3,3)

sns.heatmap(uniform_data)

sns.heatmap(uniform_data, vmin=0.2, vmax=0.5)

normal_data = np.random.randn(3,3)

sns.heatmap(normal_data, center=0)

flights = sns.load_dataset("flights")

data = flights.pivot(index="month", columns="year", values="passengers")

sns.heatmap(data)

sns.heatmap(data, annot=True, fmt="d", linewidths=.5, cbar=False,
            cmap="YlGnBu")
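heatmap expects a wide table with one row per y value and one column per x value, which is why the long-form flights data is pivoted first. A minimal sketch of that reshaping on a made-up table (the numbers are invented; only the column names mirror the flights dataset):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import pandas as pd
import seaborn as sns

# Long form: one row per (month, year) observation
long_df = pd.DataFrame({
    "month": [1, 2, 3, 1, 2, 3],
    "year": [2020, 2020, 2020, 2021, 2021, 2021],
    "passengers": [10, 12, 15, 11, 14, 18],
})

# Wide form: months become rows, years become columns
wide = long_df.pivot(index="month", columns="year", values="passengers")
print(wide.shape)  # (3, 2)

# annot=True writes each value into its cell; fmt="d" formats it as an integer
ax = sns.heatmap(wide, annot=True, fmt="d")
```

Keyword arguments are used in pivot because recent pandas versions no longer accept these arguments positionally.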

9, Set the overall style of the plots

def sin_plot(flip=1):

    # Draw six sine curves with shifted phase and decreasing amplitude

    x = np.linspace(0, 14, 100)

    for i in range(1,7):

        plt.plot(x, np.sin(x+i*.5)*(7-i)*flip)

sin_plot()

10, There are five built-in themes: darkgrid, whitegrid, dark, white, and ticks

sns.set_style("darkgrid")

data = np.random.normal(size=(20,6)) + np.arange(6) / 2

sns.boxplot(data=data)

11, Each subplot can have its own style: one style applies inside the with block and another outside it

with sns.axes_style("whitegrid"):

    plt.subplot(211)

    sin_plot()

plt.subplot(212)

sin_plot(-1)
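The temporary nature of the style can be checked without drawing anything, because calling sns.axes_style() with no argument returns the settings currently in effect. A small sketch, with "white" chosen as the outer style purely for illustration:

```python
import seaborn as sns

sns.set_style("white")  # outer style: no grid

outside_before = sns.axes_style()["axes.grid"]

with sns.axes_style("whitegrid"):
    # Inside the block the whitegrid settings are active
    inside = sns.axes_style()["axes.grid"]

# On exit the previous settings are restored
outside_after = sns.axes_style()["axes.grid"]

print(outside_before, inside, outside_after)
```

Axes created inside the with block keep the grid, while axes created afterwards use the outer style again.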

12, Plotting context: scale the plot elements for paper, notebook, talk, or poster

sns.set_context("paper")

plt.figure(figsize=(8,6))

sin_plot()

sns.set_context("talk")

plt.figure(figsize=(8,6))

sin_plot()

sns.set_context("poster")

plt.figure(figsize=(8,6))

sin_plot()

sns.set_context("notebook", font_scale=3.5, rc={"lines.linewidth": 4.5})

plt.figure(figsize=(8,6))

sin_plot()
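The effect of each context can be inspected with sns.plotting_context, which returns the parameter dictionary that set_context would apply. A brief sketch that compares the base font size across the four contexts and shows what font_scale and rc change:

```python
import seaborn as sns

# Base font size for each named context
sizes = {
    name: sns.plotting_context(name)["font.size"]
    for name in ("paper", "notebook", "talk", "poster")
}
print(sizes)

# font_scale multiplies the font-related sizes of the chosen context
scaled = sns.plotting_context("notebook", font_scale=2)["font.size"]

# rc overrides individual parameters such as the line width
custom = sns.plotting_context("notebook", rc={"lines.linewidth": 4.5})
print(scaled, custom["lines.linewidth"])
```

The sizes grow from paper to poster, which is why the same sin_plot figure looks progressively bolder in the calls above.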
