Plot.ly for scientific visualization
- Developed by a Montreal-based company https://plotly.com
- Open-source scientific plotting library for Python, R, MATLAB, Perl, Julia
- Front end uses JavaScript/HTML/CSS and D3.js visualization library, with Plotly JavaScript library on top
- Supports over 40 unique chart types
Links
- Online gallery with code examples by category
- Plotly Express tutorial, also plotly.express
- Plotly Community Forum
- Keyword index
- Getting started guide
- Plotly vs matplotlib (with video)
- Displaying Figures (including offscreen into a file)
- Saving static images (PNG, PDF, etc)
- Plotly and IPython / Jupyter notebook with additional plotting examples
Installation
- You will need Python 3, along with some Python package manager
- Use your favourite installer:
$ pip install plotly
$ uv pip install plotly
$ conda install -c conda-forge plotly
- Other recommended libraries to install for today’s session: jupyter, numpy, pandas, networkx, scikit-image, kaleido
Displaying Plotly figures
With Plotly, you can: 1. work inside a Python shell, 2. save your script into a *.py file and then run it, or 3. run code inside a Jupyter Notebook (start a notebook with jupyter notebook or, even better, jupyter lab).
Plotly supports a number of renderers, and it will attempt to choose an appropriate renderer automatically (in my experience, not very successfully). You can examine the selected default renderer with:
import plotly.io as pio
pio.renderers   # show default and available renderers
You can override the default by setting it manually inside your session or inside your code, e.g.
pio.renderers.default = 'browser'    # open each plot in a new browser tab
pio.renderers.default = 'notebook'   # plot inside a Jupyter notebook
If you want this setting to persist across sessions (rather than setting it manually or in the code), you can create a file ~/.plotly_startup.py with the following:
try:
    import plotly.io as pio
    pio.renderers.default = "browser"
except ImportError:
    pass
and set export PYTHONSTARTUP=~/.plotly_startup.py in your ~/.bashrc file.
Let’s create a simple line plot:
import plotly.graph_objs as go
from numpy import linspace, sin
x = linspace(0.01,1,100)
y = sin(1/x)
line = go.Scatter(x=x, y=y, mode='lines+markers', name='sin(1/x)')
fig = go.Figure([line])
fig.show()   # should open in your browser
fig.write_image("/Users/razoumov/tmp/lines.png", scale=2)   # static, supports svg, png, jpg/jpeg, webp, pdf
fig.write_html("/Users/razoumov/tmp/lines.html")   # interactive
fig.write_json("/Users/razoumov/tmp/2007.json")    # for further editing
In general, use fig.show() when working outside of a Jupyter Notebook, or when you want to save your plot to a file. If you want to display plots inline inside a Jupyter notebook, set pio.renderers.default = 'notebook' and use the command go.Figure([line]), which should display plots right inside the notebook without the need for fig.show().
You can find more details at https://plotly.com/python/renderers .
Plotly Express (data exploration) library
Normally, in this workshop I would teach plotly.graph_objs (Graph Objects), which is the standard module in Plotly.py; you saw its example in the previous section.
Plotly Express is a higher-level interface to Plotly.py that sits on top of Graph Objects and provides 30+ functions for creating different types of figures in a single function call. It works with NumPy arrays, Xarrays, Pandas dataframes, basic Python iterables, etc.
Here is one way to create a line plot from above in Plotly Express, using just NumPy arrays:
import plotly.express as px
from numpy import linspace, sin
x = linspace(0.01,1,100)
y = sin(1/x)
fig = px.line(x=x, y=y, markers=True)
fig.show()
You can also feed a dataframe into the plotting function:
import plotly.express as px
from numpy import linspace, sin
import pandas as pd
x = linspace(0.01,1,100)
df = pd.DataFrame({'col1': x, 'col2': sin(1/x)})
fig = px.line(df, x='col1', y='col2', markers=True)
fig.show()
Or you can feed a dictionary:
import plotly.express as px
from numpy import linspace, sin
x = linspace(0.01,1,100)
d = {'key1': x, 'key2': sin(1/x)}
fig = px.line(d, x='key1', y='key2', markers=True)
fig.show()
To see Plotly Express really shine, we should play with a slightly larger dataset containing several variables. The module px.data comes with several datasets included. Let’s take a look at the Gapminder data that contains one row per country per year.
import plotly.express as px
df = px.data.gapminder().query("year==2007")
df

px.line(df, x="gdpPercap", y="lifeExp", markers=True)   # this should be familiar
# 1. replace df with df.sort_values(by='gdpPercap')
# 2. add log_x=True
# 3. change line to scatter, remove markers=True
# 4. don't actually need to sort now, with no markers
# 5. add hover_name="country"
# 6. add size="pop"
# 7. add size_max=60
# 8. add color="continent" - can now turn continents off/on

px.strip(df, x="lifeExp")   # single-axis scatter plot
# 1. add hover_name="country"
# 2. add color="continent"
# 3. change strip to histogram
# 4. can turn continents off/on in the legend
# 5. add marginal="rug" to show countries in a rug plot
# 6. add y="pop" to switch from country count to population along the vertical axis
# 7. add facet_col="continent" to break continents into facet columns

px.bar(df, color="lifeExp", x="pop", y="continent", hover_name="country")

px.sunburst(df, color="lifeExp", values="pop", path=["continent", "country"],
            hover_name="country", height=800)

px.treemap(df, color="lifeExp", values="pop", path=["continent", "country"],
           hover_name="country", height=500)

px.choropleth(df, color="lifeExp", locations="iso_alpha", hover_name="country", height=580)
Here is a ternary plot example with Montreal elections data (58 electoral districts, 3 candidates):
df = px.data.election()
px.scatter_ternary(df, a="Joly", b="Coderre", c="Bergeron", color="winner",
                   size="total", hover_name="district", size_max=15,
                   color_discrete_map={"Joly": "blue", "Bergeron": "green", "Coderre": "red"})
Plotting via Graph Objects
While Plotly Express is excellent for quick data exploration, it has some limitations: it supports fewer plot types and does not allow combining different plot types directly in a single figure. Plotly Graph Objects, on the other hand, is a lower-level library that offers greater functionality and customization with layouts. In this section, we’ll explore Graph Objects, starting with 2D plots.
Graph Objects supports many plot types:
import plotly.graph_objs as go
dir(go)
['AngularAxis', 'Annotation', 'Annotations', 'Bar', 'Barpolar', 'Box', 'Candlestick', 'Carpet', 'Choropleth', 'Choroplethmap', 'Choroplethmapbox', 'ColorBar', 'Cone', 'Contour', 'Contourcarpet', 'Contours', 'Data', 'Densitymap', 'Densitymapbox', 'ErrorX', 'ErrorY', 'ErrorZ', 'Figure', 'FigureWidget', 'Font', 'Frame', 'Frames', 'Funnel', 'Funnelarea', 'Heatmap', 'Histogram', 'Histogram2d', 'Histogram2dContour', 'Histogram2dcontour', 'Icicle', 'Image', 'Indicator', 'Isosurface', 'Layout', 'Legend', 'Line', 'Margin', 'Marker', 'Mesh3d', 'Ohlc', 'Parcats', 'Parcoords', 'Pie', 'RadialAxis', 'Sankey', 'Scatter', 'Scatter3d', 'Scattercarpet', 'Scattergeo', 'Scattergl', 'Scattermap', 'Scattermapbox', 'Scatterpolar', 'Scatterpolargl', 'Scattersmith', 'Scatterternary', 'Scene', 'Splom', 'Stream', 'Streamtube', 'Sunburst', 'Surface', 'Table', 'Trace', 'Treemap', 'Violin', 'Volume', 'Waterfall', 'XAxis', 'XBins', 'YAxis', 'YBins', 'ZAxis', 'bar', 'barpolar', 'box', 'candlestick', 'carpet', 'choropleth', 'choroplethmap', 'choroplethmapbox', 'cone', 'contour', 'contourcarpet', 'densitymap', 'densitymapbox', 'funnel', 'funnelarea', 'heatmap', 'histogram', 'histogram2d', 'histogram2dcontour', 'icicle', 'image', 'indicator', 'isosurface', 'layout', 'mesh3d', 'ohlc', 'parcats', 'parcoords', 'pie', 'sankey', 'scatter', 'scatter3d', 'scattercarpet', 'scattergeo', 'scattergl', 'scattermap', 'scattermapbox', 'scatterpolar', 'scatterpolargl', 'scattersmith', 'scatterternary', 'splom', 'streamtube', 'sunburst', 'surface', 'table', 'treemap', 'violin', 'volume', 'waterfall']
Scatter plots
We already saw an example of a Scatter plot with Graph Objects:
import plotly.graph_objs as go
from numpy import linspace, sin
x = linspace(0.01,1,100)
y = sin(1/x)
line = go.Scatter(x=x, y=y, mode='lines+markers', name='sin(1/x)')
go.Figure([line])
Let’s print the dataset line:
type(line)
print(line)
It is a plotly object that behaves like a Python dictionary, with all elements clearly identified (plot type, x numpy array, y numpy array, line type, legend line name). So, go.Scatter simply creates a dictionary-like object with the corresponding type element. This variable/dataset line completely describes our plot! Then we create a list of such objects and pass it to the plotting routine.
Pass a list of two objects to the plotting routine with data = [line1,line2]. Let the second dataset line2 contain another mathematical function. The idea is to have multiple objects in the plot.
Hovering over each data point will reveal their coordinates. Use the toolbar at the top. Double-clicking on the plot will reset it.
Add a bunch of dots to the plot with dots = go.Scatter(x=[.2,.4,.6,.8], y=[2,1.5,2,1.2]). What is the default scatter mode?
Change line colour and width by adding the dictionary line=dict(color=('rgb(10,205,24)'),width=4) to dots.
Create a scatter plot of 300 random filled blue circles inside a unit square. Their random opacity must anti-correlate with their size (bigger circles should be more transparent) – see the plot below.
Bar plots
Let’s try a Bar plot, constructing data directly in one line from the dictionary:
import plotly.graph_objs as go
bar = go.Bar(x=['Vancouver', 'Calgary', 'Toronto', 'Montreal', 'Halifax'],
             y=[2463431, 1392609, 5928040, 4098927, 403131])
fig = go.Figure(data=[bar])
fig.show()
Let’s plot inner city population vs. greater metro area for each city:
import plotly.graph_objs as go
cities = ['Vancouver', 'Calgary', 'Toronto', 'Montreal', 'Halifax']
proper = [662_248, 1_306_784, 2_794_356, 1_762_949, 439_819]
metro = [3_108_926, 1_688_000, 6_491_000, 4_615_154, 530_167]
bar1 = go.Bar(x=cities, y=proper, name='inner city')
bar2 = go.Bar(x=cities, y=metro, name='greater area')
fig = go.Figure(data=[bar1,bar2])
fig.show()
Let’s now do a stacked plot, with outer city population on top of inner city population:
outside = [m-p for p,m in zip(proper,metro)]   # need to subtract
bar1 = go.Bar(x=cities, y=proper, name='inner city')
bar2 = go.Bar(x=cities, y=outside, name='outer city')
fig = go.Figure(data=[bar1,bar2], layout=go.Layout(barmode='stack'))   # new element!
fig.show()
What else can we modify in the layout?
help(go.Layout)
There are many attributes!
Heatmaps
- go.Barpolar() for plotting wind rose charts (go.Area() in older Plotly versions; it is not in the dir(go) listing above)
- go.Box() for basic box plots
Let’s plot a heatmap of monthly temperatures at the South Pole:
import plotly.graph_objs as go
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec', 'Year']
recordHigh = [-14.4,-20.6,-26.7,-27.8,-25.1,-28.8,-33.9,-32.8,-29.3,-25.1,-18.9,-12.3,-12.3]
averageHigh = [-26.0,-37.9,-49.6,-53.0,-53.6,-54.5,-55.2,-54.9,-54.4,-48.4,-36.2,-26.3,-45.8]
dailyMean = [-28.4,-40.9,-53.7,-57.8,-58.0,-58.9,-59.8,-59.7,-59.1,-51.6,-38.2,-28.0,-49.5]
averageLow = [-29.6,-43.1,-56.8,-60.9,-61.5,-62.8,-63.4,-63.2,-61.7,-54.3,-40.1,-29.1,-52.2]
recordLow = [-41.1,-58.9,-71.1,-75.0,-78.3,-82.8,-80.6,-79.3,-79.4,-72.0,-55.0,-41.1,-82.8]
data = [recordHigh, averageHigh, dailyMean, averageLow, recordLow]
yticks = ['record high', 'aver.high', 'daily mean', 'aver.low', 'record low']
heatmap = go.Heatmap(z=data, x=months, y=yticks)
fig = go.Figure([heatmap])
fig.show()
Try a few different colourmaps, e.g. ‘Viridis’, ‘Jet’, ‘Rainbow’. What colourmaps are available?
Contour maps
Pretend that our heatmap is defined over a 2D domain and plot the same temperature data as a contour map. Remove the Year data (last column) and use go.Contour to plot the 2D contour map.
Download data
Open a terminal window inside Jupyter (New | Terminal) and run these commands:
wget https://tinyurl.com/pvzip -O paraview.zip
unzip paraview.zip
mv data/*.{csv,nc} .
Geographical scatterplot
Go back to your Python Jupyter Notebook. Now let’s do a scatterplot on top of a geographical map:
import plotly.graph_objs as go
import pandas as pd
from math import log10
df = pd.read_csv('cities.csv')   # lists name,pop,lat,lon for 254 Canadian cities and towns
df['text'] = df['name'] + '<br>Population ' + \
             (df['pop']/1e6).astype(str) +' million'   # add new column for mouse-over

largest, smallest = df['pop'].max(), df['pop'].min()
def normalize(x):
    return log10(x/smallest)/log10(largest/smallest)   # x scaled into [0,1]
df['logsize'] = round(df['pop'].apply(normalize)*255)   # new column
cities = go.Scattergeo(
    lon = df['lon'], lat = df['lat'], text = df['text'],
    marker = dict(
        size = df['pop']/5000,
        color = df['logsize'],
        colorscale = 'Viridis',
        showscale = True,   # show the colourbar
        line = dict(width=0.5, color='rgb(40,40,40)'),
        sizemode = 'area'))
layout = go.Layout(title = 'City populations',
    showlegend = False,   # do not show legend for first plot
    geo = dict(
        scope = 'north america',
        resolution = 50,   # base layer resolution of km/mm
        lonaxis = dict(range=[-130,-55]), lataxis = dict(range=[44,70]),   # plot range
        showland = True, landcolor = 'rgb(217,217,217)',
        showrivers = True, rivercolor = 'rgb(153,204,255)',
        showlakes = True, lakecolor = 'rgb(153,204,255)',
        subunitwidth = 1, subunitcolor = "rgb(255,255,255)",     # province border
        countrywidth = 2, countrycolor = "rgb(255,255,255)"))    # country border
fig = go.Figure(data=[cities], layout=layout)
fig.show()
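The logarithmic scaling used for the marker colours can be checked in isolation. This sketch wraps the same formula in a hypothetical helper, make_normalizer, and applies it to placeholder populations that are not taken from cities.csv.

```python
from math import log10

def make_normalizer(smallest, largest):
    """Return a function mapping [smallest, largest] onto [0, 1] on a log scale."""
    def normalize(x):
        return log10(x / smallest) / log10(largest / smallest)
    return normalize

# Placeholder populations spanning three orders of magnitude
pops = [1_000, 10_000, 100_000, 1_000_000]
normalize = make_normalizer(min(pops), max(pops))
scaled = [normalize(p) for p in pops]
print(scaled)   # evenly spaced values between 0 and 1
```

On a log scale each factor of 10 in population adds the same increment, which keeps small towns distinguishable from each other in the colour range.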
Modify the code to display only the 10 largest cities.
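One approach is to filter the dataframe before building the trace. The sketch below uses pandas nlargest() on a made-up table (town names and populations are placeholders); with the real data you would apply the same call to the dataframe read from cities.csv.

```python
import pandas as pd

# Made-up stand-in for cities.csv with the same column names
df = pd.DataFrame({'name': [f'town{i}' for i in range(25)],
                   'pop': [1000 * (i + 1) for i in range(25)]})

top10 = df.nlargest(10, 'pop')   # the 10 rows with the largest population, sorted descending
print(top10)
```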
Recall how we combined several scatter plots in one figure before. You can combine several plots on top of a single map – let’s combine scattergeo + choropleth:
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('cities.csv')
df['text'] = df['name'] + '<br>Population ' + \
             (df['pop']/1e6).astype(str)+' million'   # add new column for mouse-over
cities = go.Scattergeo(lon = df['lon'],
                       lat = df['lat'],
                       text = df['text'],
                       marker = dict(
                           size = df['pop']/5000,
                           color = "lightblue",
                           line = dict(width=0.5, color='rgb(40,40,40)'),
                           sizemode = 'area'))
gdp = pd.read_csv('gdp.csv')   # read name, gdp, code for 222 countries
c1 = [0,"rgb(5, 10, 172)"]     # define colourbar from top (0) to bottom (1)
c2, c3 = [0.35,"rgb(40, 60, 190)"], [0.5,"rgb(70, 100, 245)"]
c4, c5 = [0.6,"rgb(90, 120, 245)"], [0.7,"rgb(106, 137, 247)"]
c6 = [1,"rgb(220, 220, 220)"]
countries = go.Choropleth(locations = gdp['CODE'],
                          z = gdp['GDP (BILLIONS)'],
                          text = gdp['COUNTRY'],
                          colorscale = [c1,c2,c3,c4,c5,c6],
                          autocolorscale = False,
                          reversescale = True,
                          marker = dict(line = dict(color='rgb(180,180,180)',width = 0.5)),
                          zmin = 0,
                          colorbar = dict(tickprefix = '$',title = 'GDP<br>Billions US$'))
layout = go.Layout(hovermode = "x", showlegend = False)   # do not show legend for first plot
fig = go.Figure(data=[cities,countries], layout=layout)
fig.show()
3D Topographic elevation
Let’s plot some tabulated topographic elevation data:
import plotly.graph_objs as go
import pandas as pd
table = pd.read_csv('mt_bruno_elevation.csv')
surface = go.Surface(z=table.values)   # use 2D numpy array format
layout = go.Layout(title='Mt Bruno Elevation',
                   width=1200, height=1200,                # image size
                   margin=dict(l=65, r=10, b=65, t=90))    # margins around the plot
fig = go.Figure([surface], layout=layout)
fig.show()
Plot a 2D function f(x,y) = (1−y) sin(πx) + y sin^2(2πx), where x,y ∈ [0,1] on a 100^2 grid.
Elevated 2D functions
Let’s define a different colourmap by adding colorscale='Viridis' inside go.Surface(). This is our current code:
import plotly.graph_objs as go
from numpy import *
n = 100   # plot resolution
x = linspace(0,1,n)
y = linspace(0,1,n)
Y, X = meshgrid(x, y)   # meshgrid() returns two 2D arrays storing x/y respectively at each mesh point
F = (1-Y)*sin(pi*X) + Y*(sin(2*pi*X))**2   # array operation
data = go.Surface(z=F, colorscale='Viridis')
layout = go.Layout(width=1000, height=1000, scene=go.layout.Scene(zaxis=go.layout.scene.ZAxis(range=[-1,2])))
fig = go.Figure(data=[data], layout=layout)
fig.show()
Lighting control
Let’s change the default light in the room by adding lighting=dict(ambient=0.1) inside go.Surface(). Now our plot is much darker!
- ambient controls the light in the room (default = 0.8)
- roughness controls amount of light scattered (default = 0.5)
- diffuse controls the reflection angle width (default = 0.8)
- fresnel controls light washout (default = 0.2)
- specular induces bright spots (default = 0.05)
Let’s try lighting=dict(ambient=0.1,specular=0.3): now we have lots of specular light!
3D parametric plots
In the Plotly documentation you can find quite a lot of different 3D plot types. Here is something visually very different, but it still uses go.Surface(x,y,z):
import plotly.graph_objs as go
from numpy import pi, sin, cos, mgrid
dphi, dtheta = pi/250, pi/250   # 0.72 degrees
[phi, theta] = mgrid[0:pi+dphi*1.5:dphi, 0:2*pi+dtheta*1.5:dtheta]
# define two 2D grids: both phi and theta are (252,502) numpy arrays
r = sin(4*phi)**3 + cos(2*phi)**3 + sin(6*theta)**2 + cos(6*theta)**4
x = r*sin(phi)*cos(theta)   # x is also (252,502)
y = r*cos(phi)              # y is also (252,502)
z = r*sin(phi)*sin(theta)   # z is also (252,502)
surface = go.Surface(x=x, y=y, z=z, colorscale='Viridis')
layout = go.Layout(title='parametric plot')
fig = go.Figure(data=[surface], layout=layout)
fig.show()
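The grid shapes quoted in the comments can be verified with numpy alone. The sketch below rebuilds only the two angular grids; the extra 1.5 steps in each upper bound make the ranges run slightly past pi and 2*pi, so the surface closes on itself.

```python
from numpy import pi, mgrid

dphi, dtheta = pi/250, pi/250
phi, theta = mgrid[0:pi+dphi*1.5:dphi, 0:2*pi+dtheta*1.5:dtheta]

print(phi.shape, theta.shape)                 # both grids share one 2D shape
print(phi.max() >= pi, theta.max() >= 2*pi)   # the ranges cover the full angles
```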
3D scatter plots
Let’s take a look at a 3D scatter plot using the country index data from http://www.prosperity.com for 142 countries:
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('legatum2015.csv')
spheres = go.Scatter3d(x=df.economy,
                       y=df.entrepreneurshipOpportunity,
                       z=df.governance,
                       text=df.country,
                       mode='markers',
                       marker=dict(
                           sizemode = 'diameter',
                           sizeref = 0.3,   # max(safetySecurity+5.5) / 32
                           size = df.safetySecurity+5.5,
                           color = df.education,
                           colorscale = 'Viridis',
                           colorbar = dict(title = 'Education'),
                           line = dict(color='rgb(140, 140, 170)')))   # sphere edge
layout = go.Layout(height=900, width=900,
                   title='Each sphere is a country sized by safetySecurity',
                   scene = dict(xaxis=dict(title='economy'),
                                yaxis=dict(title='entrepreneurshipOpportunity'),
                                zaxis=dict(title='governance')))
fig = go.Figure(data=[spheres], layout=layout)
fig.show()
3D graphs
We can plot 3D graphs. Consider a Dorogovtsev-Goltsev-Mendes graph: in each subsequent generation, every edge from the previous generation yields a new node, and the new graph can be made by connecting together three previous-generation graphs.
import plotly.graph_objs as go
import networkx as nx
import sys
generation = 5
H = nx.dorogovtsev_goltsev_mendes_graph(generation)
print(H.number_of_nodes(), 'nodes and', H.number_of_edges(), 'edges')
# Force Atlas 2 graph layout from https://github.com/tpoisot/nxfa2.git
pos = nx.spectral_layout(H,scale=1,dim=3)
Xn = [pos[i][0] for i in pos]   # x-coordinates of all nodes
Yn = [pos[i][1] for i in pos]   # y-coordinates of all nodes
Zn = [pos[i][2] for i in pos]   # z-coordinates of all nodes
Xe, Ye, Ze = [], [], []
for edge in H.edges():
    Xe += [pos[edge[0]][0], pos[edge[1]][0], None]   # x-coordinates of all edge ends
    Ye += [pos[edge[0]][1], pos[edge[1]][1], None]   # y-coordinates of all edge ends
    Ze += [pos[edge[0]][2], pos[edge[1]][2], None]   # z-coordinates of all edge ends

degree = [deg[1] for deg in H.degree()]   # list of degrees of all nodes
labels = [str(i) for i in range(H.number_of_nodes())]
edges = go.Scatter3d(x=Xe, y=Ye, z=Ze,
                     mode='lines',
                     marker=dict(size=12,line=dict(color='rgba(217, 217, 217, 0.14)',width=0.5)),
                     hoverinfo='none')
nodes = go.Scatter3d(x=Xn, y=Yn, z=Zn,
                     mode='markers',
                     marker=dict(sizemode = 'area',
                                 sizeref = 0.01, size=degree,
                                 color=degree, colorscale='Viridis',
                                 line=dict(color='rgb(50,50,50)', width=0.5)),
                     text=labels, hoverinfo='text')

axis = dict(showline=False, zeroline=False, showgrid=False, showticklabels=False, title='')
layout = go.Layout(
    title = str(generation) + "-generation Dorogovtsev-Goltsev-Mendes graph",
    width=1000, height=1000,
    showlegend=False,
    scene=dict(xaxis=go.layout.scene.XAxis(axis),
               yaxis=go.layout.scene.YAxis(axis),
               zaxis=go.layout.scene.ZAxis(axis)),
    margin=go.layout.Margin(t=100))
fig = go.Figure(data=[edges,nodes], layout=layout)
fig.show()
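The construction rule described at the start of this section (every edge from the previous generation yields a new node) can be sketched in plain Python without networkx. The helper dgm_edges is hypothetical, and it assumes that generation 0 is a single edge between two nodes.

```python
def dgm_edges(generation):
    """Build the edge list generation by generation (assumes generation 0 is one edge)."""
    edges = [(0, 1)]
    n_nodes = 2
    for _ in range(generation):
        new_edges = []
        for u, v in edges:            # every existing edge ...
            w = n_nodes               # ... yields one new node ...
            n_nodes += 1
            new_edges += [(u, w), (v, w)]   # ... connected to both ends of that edge
        edges += new_edges
    return n_nodes, edges

for g in range(6):
    n, e = dgm_edges(g)
    print(g, n, len(e))   # generation, number of nodes, number of edges
```

Under this rule the number of edges triples with every generation, which is why the graph grows so quickly.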
3D functions
Let’s create an isosurface of a decoCube function at f=0.03. Isosurfaces are returned as a list of polygons, and for plotting polygons in plotly we need to use plotly.figure_factory.create_trisurf() which replaces plotly.graph_objs.Figure():
from plotly import figure_factory as FF
from numpy import mgrid
from skimage import measure
X,Y,Z = mgrid[-1.2:1.2:30j, -1.2:1.2:30j, -1.2:1.2:30j]   # three 30^3 grids, each side [-1.2,1.2] in 30 steps
F = ((X*X+Y*Y-0.64)**2 + (Z*Z-1)**2) * \
    ((Y*Y+Z*Z-0.64)**2 + (X*X-1)**2) * \
    ((Z*Z+X*X-0.64)**2 + (Y*Y-1)**2)
vertices, triangles, normals, values = measure.marching_cubes(F, 0.03)   # create an isosurface
x,y,z = zip(*vertices)   # zip(*...) is opposite of zip(...): unzips a list of tuples
fig = FF.create_trisurf(x=x, y=y, z=z, plot_edges=False,
                        simplices=triangles, title="Isosurface", height=1200, width=1200)
fig.show()
Try switching plot_edges=False to plot_edges=True and you’ll see individual polygons!
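The zip(*...) idiom used to split the vertex list into coordinates may be unfamiliar, so here is a tiny standalone sketch. The three vertices are placeholder coordinates, not output from marching_cubes.

```python
# Placeholder vertex list: one (x, y, z) tuple per vertex
vertices = [(0.0, 1.0, 2.0), (3.0, 4.0, 5.0), (6.0, 7.0, 8.0)]

x, y, z = zip(*vertices)   # unzip: one tuple per coordinate
print(x, y, z)

rezipped = list(zip(x, y, z))   # zipping again restores the per-vertex tuples
print(rezipped == vertices)
```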
Dash library for making interactive web applications
Plotly Dash library is a framework for making interactive data applications.
- Dash Python User Guide https://dash.plotly.com
- https://dash.gallery/Portal has ~100 app examples
- can create a dropdown to select data to plot
- can enter a value into a box to select or interpolate data to plot
- selection in one plot shows in the other plot
- mix and match these into a single web app
- can create different tabs inside the app, with the render switching between them
- can make entire website with user guides, plots, code examples, etc.