Bokeh Review
I wanted to review Bokeh before implementing the survey response functionality.
Review of Bokeh Notes
I am reviewing my Bokeh notes because I need to learn more about the library before implementing the survey results section.
Bokeh is a Python library for creating interactive visualizations for modern web browsers. It helps you build beautiful graphics, ranging from simple plots to complex dashboards with streaming datasets. With Bokeh, you can create JavaScript-powered visualizations without writing any JavaScript yourself.
My Needs
- A histogram with a slider that can be used to change the size of bins.
- A plot to show dimensionality-reduced data points where the user can ask questions about sections of the chart.
- Different kinds of plots to use for different kinds of graphs in a text-to-sql scenario.
First Steps
Creating a Line Chart
The basic idea of Bokeh is a two-step process: First, you select from Bokeh's building blocks to create your visualization. Second, you customize these building blocks to fit your needs. Bokeh combines a Python library for defining the content and interactive functionalities of your visualization with a JavaScript library called BokehJS that works in the background to display your interactive visualizations in a web browser.
Based on your Python code, Bokeh automatically generates all the necessary JavaScript and HTML code for you. In its default setting, Bokeh automatically loads any additional JavaScript code from Bokeh's CDN.
from bokeh.plotting import figure, show
from bokeh.io import output_notebook, export_png
output_notebook()
# prepare some data
x = [1, 2, 3, 4, 5]
y1 = [6, 7, 2, 4, 5]
y2 = [2, 3, 4, 5, 6]
y3 = [4, 5, 5, 7, 2]
# create a new plot with a title and axis labels
p = figure(title="Multiple glyphs example", x_axis_label="x", y_axis_label="y")
# add multiple renderers
p.line(x, y1, legend_label="Temp.", color="blue", line_width=3)
p.vbar(x=x, top=y2, legend_label="Rate", color="red", width=0.5, bottom=0)
p.scatter(x, y3, legend_label="Objects", color="yellow", size=16)
# show the results
show(p)
Adding and Customizing Renderers
There are many different ways you can customize rendering.
from bokeh.plotting import figure, show
# prepare some data
x = [1,2,3,4,5]
y = [4,5,5,7,2]
p = figure(title="Glyphs properties example",x_axis_label='x',y_axis_label='y')
# add circle renderer with additional arguments
scatter = p.scatter(
x,
y,
marker="circle",
size=80,
legend_label="Objects",
fill_color="red",
fill_alpha=0.5,
line_color="blue",
)
# show the results
show(p)
glyph = scatter.glyph
glyph.fill_color = "blue"
show(p)
Adding Legends, Text, and Annotations
Bokeh automatically adds a legend to your plot if you include the legend_label argument when calling the renderer function. You can use the Legend object to customize the legend. You can also customize the title.
Annotations are visual elements that add to your plot and make it easier to read. For more information on the various kinds of annotations, see Annotations in the user guide.
from bokeh.plotting import figure, show
# prepare some data
x = [1, 2, 3, 4, 5]
y1 = [4, 5, 5, 7, 2]
y2 = [2, 3, 4, 5, 6]
# create a new plot
p = figure(title="Legend example")
# add circle renderer with legend_label arguments
line = p.line(x, y1, legend_label="Temp.", line_color="blue", line_width=2)
circle = p.scatter(
x,
y2,
marker="circle",
size=80,
legend_label="Objects",
fill_color="red",
fill_alpha=0.5,
line_color="blue",
)
# display legend in top left corner (default is top right corner)
p.legend.location = "top_left"
# add a title to your legend
p.legend.title = "Observations"
# change appearance of legend text
p.legend.label_text_font = "times"
p.legend.label_text_font_style = "italic"
p.legend.label_text_color = "navy"
# change border and background of legend
p.legend.border_line_width = 3
p.legend.border_line_color = "navy"
p.legend.border_line_alpha = 0.8
p.legend.background_fill_color = "navy"
p.legend.background_fill_alpha = 0.2
# add line renderer with a legend (uses y1, since no plain y is defined in this snippet)
p.line(x, y1, legend_label="Temp.", line_width=2)
# change headline location to the left
p.title_location = "left"
# change headline text
p.title.text = "Changing headline text example"
# style the headline
p.title.text_font_size = "25px"
p.title.align = "right"
p.title.background_fill_color = "darkgrey"
p.title.text_color = "white"
# show the results
show(p)
Customizing Your Plot
With Bokeh's themes, you can quickly change the appearance of your plot. Themes are a set of pre-defined design parameters such as colors, fonts, or line styles. Bokeh comes with built-in themes, and you can also define your own custom themes. To use one of the built-in themes, assign the name of the theme you want to use to the theme property of your document:
from bokeh.io import curdoc
from bokeh.plotting import figure, show
# prepare some data
x = [1, 2, 3, 4, 5]
y = [4, 5, 5, 7, 2]
# apply theme to current document
curdoc().theme = "dark_minimal"
# create a plot
p = figure(sizing_mode="stretch_width", max_width=500, height=250)
# add a renderer
p.line(x, y)
# show the results
show(p)
Bokeh's Plot objects have various attributes that influence the way your plot looks. To set the size of your plot, use the attributes width and height when calling the figure() function. Similar to changing the design of an existing glyph, you can change a plot's attributes at any time after its creation.
To make your plot automatically adjust to your browser or screen size, use the attribute sizing_mode:
from bokeh.plotting import figure, show
# prepare some data
x = [1, 2, 3, 4, 5]
y = [4, 5, 5, 7, 2]
# create a new plot with responsive width
p = figure(
title="Plot responsive sizing example",
sizing_mode="stretch_width",
height=250,
x_axis_label="x",
y_axis_label="y",
)
# add scatter renderer
p.scatter(x, y, fill_color="red", size=15)
# show the results
show(p)
You can set various attributes to change the way the axes in your plot work and look. Options for customizing the appearance of your plot include:
- setting labels for your axes
- styling the numbers displayed with your axes
- defining colors and other layout properties for the axes themselves
from bokeh.plotting import figure, show
# prepare some data
x = [1, 2, 3, 4, 5]
y = [4, 5, 5, 7, 2]
# create a plot
p = figure(
title="Customized axes example",
sizing_mode="stretch_width",
max_width=500,
height=350,
)
# add a renderer
p.scatter(x, y, size=10)
# change some things about the x-axis
p.xaxis.axis_label = "Temp"
p.xaxis.axis_line_width = 3
p.xaxis.axis_line_color = "red"
# change some things about the y-axis
p.yaxis.axis_label = "Pressure"
p.yaxis.major_label_text_color = "orange"
p.yaxis.major_label_orientation = "vertical"
# change things on all axes
p.axis.minor_tick_in = -3
p.axis.minor_tick_out = 6
# show the results
show(p)
When drawing the axes for your plot, Bokeh automatically determines the range each axis needs to cover in order to display all your values. To define the range for your axes manually, pass the x_range or y_range argument when you call the figure() function, or set the x_range and y_range properties of your Plot object:
from bokeh.plotting import figure, show
# prepare some data
x = [1, 2, 3, 4, 5]
y = [4, 5, 5, 7, 2]
# create a new plot with responsive width
p = figure(
y_range=(0, 25),
title="Axis range example",
sizing_mode="stretch_width",
max_width=500,
height=250,
)
# add scatter renderer with additional arguments
p.scatter(x, y, size=8)
# show the results
show(p)
from bokeh.models import NumeralTickFormatter
from bokeh.plotting import figure, show
# prepare some data
x = [1, 2, 3, 4, 5]
y = [4, 5, 5, 7, 2]
# create new plot
p = figure(
title="Tick formatter example",
sizing_mode="stretch_width",
max_width=500,
height=250,
)
# format axes ticks
p.yaxis[0].formatter = NumeralTickFormatter(format="$0.00")
# add renderers
p.scatter(x, y, size=8)
p.line(x, y, color="navy", line_width=1)
# show the results
show(p)
from bokeh.plotting import figure, show
# prepare some data
x = [0.1, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
y0 = [i**2 for i in x]
y1 = [10**i for i in x]
y2 = [10**(i**2) for i in x]
# create a new plot with a logarithmic axis type
p = figure(
title="Logarithmic axis example",
sizing_mode="stretch_width",
height=300,
max_width=500,
y_axis_type="log",
y_range=[0.001, 10 ** 11],
x_axis_label="sections",
y_axis_label="particles",
)
# add some renderers
p.line(x, x, legend_label="y=x")
p.scatter(x, x, legend_label="y=x", fill_color="white", size=8)
p.line(x, y0, legend_label="y=x^2", line_width=3)
p.line(x, y1, legend_label="y=10^x", line_color="red")
p.scatter(x, y1, legend_label="y=10^x", fill_color="red", line_color="red", size=6)
p.line(x, y2, legend_label="y=10^x^2", line_color="orange", line_dash="4 4")
# show the results
show(p)
Set the x_axis_type or y_axis_type to datetime to display date or time information on an axis. Bokeh then creates a DatetimeAxis.
To format the ticks of a DatetimeAxis, use the DatetimeTickFormatter.
import random
from datetime import datetime, timedelta
from bokeh.models import DatetimeTickFormatter, NumeralTickFormatter
from bokeh.plotting import figure, show
# generate list of dates (today's date in subsequent weeks)
dates = [(datetime.now() + timedelta(day * 7)) for day in range(0, 26)]
# generate 26 random data points (one per date)
y = random.sample(range(0, 100), 26)
# create new plot
p = figure(
title="datetime axis example",
x_axis_type="datetime",
sizing_mode="stretch_width",
max_width=500,
height=250,
)
# add renderers
p.scatter(dates, y, size=8)
p.line(dates, y, color="navy", line_width=1)
# format axes ticks
p.yaxis[0].formatter = NumeralTickFormatter(format="$0.00")
p.xaxis[0].formatter = DatetimeTickFormatter(months="%b %Y")
# show the results
show(p)
To change the appearance of the grid, set the various properties of the xgrid, ygrid, and grid attributes of your Plot object.
You can customize the toolbar_location of your figure.
You can add tooltips to your figure with the HoverTool object.
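The three points above (grid styling, toolbar placement, and tooltips) can be combined in one small sketch. The data values and the particular colors are my own placeholders, not anything prescribed by Bokeh.

```python
from bokeh.models import HoverTool
from bokeh.plotting import figure

# placeholder data, chosen only for illustration
x = [1, 2, 3, 4, 5]
y = [4, 5, 5, 7, 2]

# put the toolbar below the plot instead of on the right
p = figure(title="Grid, toolbar, and tooltip example", height=250,
           toolbar_location="below")
p.scatter(x, y, size=10)

# style the grid lines through the xgrid and ygrid attributes
p.xgrid.grid_line_color = "red"
p.ygrid.grid_line_alpha = 0.8
p.ygrid.grid_line_dash = [6, 4]

# show the data coordinates of a point when hovering over it
p.add_tools(HoverTool(tooltips=[("x", "@x"), ("y", "@y")]))
```

Passing p to show() renders the result, as in the other examples.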
Vectorizing Colors
To change colors depending on values in a variable, pass a variable containing color information to the fill_color attribute:
import random
from bokeh.plotting import figure, show
# generate some data (0-25 for x, random values for y)
x = list(range(0, 26))
y = random.sample(range(0, 100), 26)
# generate list of rgb hex colors in relation to y
colors = [f"#{255:02x}{int((value * 255) / 100):02x}{255:02x}" for value in y]
# create new plot
p = figure(
title="Vectorized colors example",
sizing_mode="stretch_width",
max_width=500,
height=250,
)
# add line and scatter renderers
p.line(x, y, line_color="blue", line_width=1)
p.scatter(x, y, fill_color=colors, line_color="blue", size=15)
# show the results
show(p)
import numpy as np
from bokeh.plotting import figure, show
# generate some data
N = 1000
x = np.random.random(size=N) * 100
y = np.random.random(size=N) * 100
# generate radii and colors based on data
radii = y / 100 * 2
colors = [f"#{255:02x}{int((value * 255) / 100):02x}{255:02x}" for value in y]
# create a new plot with a specific size
p = figure(
title="Vectorized colors and radii example",
sizing_mode="stretch_width",
max_width=500,
height=250,
)
# add circle renderer
p.circle(
x,
y,
radius=radii,
fill_color=colors,
fill_alpha=0.6,
line_color="lightgrey",
)
# show the results
show(p)
Bokeh comes with dozens of pre-defined color palettes that you can use to map colors to your data.
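As a rough sketch of palette-based color mapping, the snippet below uses the Turbo256 palette together with linear_cmap from bokeh.transform. The y values are deterministic stand-in data I made up so the example is reproducible.

```python
from bokeh.palettes import Turbo256
from bokeh.plotting import figure
from bokeh.transform import linear_cmap

# stand-in data (assumption: any numeric column would do)
x = list(range(0, 26))
y = [(value * 37) % 100 for value in x]

# map each y value linearly onto the 256 colors of the palette
mapper = linear_cmap(field_name="y", palette=Turbo256, low=min(y), high=max(y))

p = figure(title="Palette example", height=250)
p.scatter(x, y, color=mapper, size=10)
```

The mapping is evaluated in the browser, so only the palette and the low/high bounds are sent along with the data.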
You can export plots to HTML or PNG files, or display them inline in a Jupyter Notebook.
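One way to get standalone HTML, sketched here with file_html from bokeh.embed, is to render the plot to a string and decide yourself where to write it. The title text is a placeholder.

```python
from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN

p = figure(title="Export example", height=250)
p.line([1, 2, 3, 4, 5], [4, 5, 5, 7, 2])

# render a complete HTML page that loads BokehJS from Bokeh's CDN
html = file_html(p, resources=CDN, title="Export example")
```

The resulting string can be written to a file or embedded in a web response. For writing directly to disk, output_file() plus save() should work as well. Note that export_png() needs Selenium and a browser driver installed.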
The ColumnDataSource is Bokeh's own data structure. So far, you have used data structures like Python lists and NumPy arrays to pass data to Bokeh. Bokeh has automatically converted these lists into ColumnDataSource objects for you.
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource
# create dict as basis for ColumnDataSource
data = {'x_values': [1, 2, 3, 4, 5],
'y_values': [6, 7, 2, 3, 6]}
# create ColumnDataSource based on dict
source = ColumnDataSource(data=data)
# create a plot and renderer with ColumnDataSource data
p = figure(height=250)
p.scatter(x='x_values', y='y_values', size=20, source=source)
show(p)
You can use a pandas DataFrame and pass it into a ColumnDataSource to create a data source. Bokeh comes with various filtering methods. Use these filters if you want to create a specific subset of the data contained in your ColumnDataSource. In Bokeh, these filtered subsets are called “views”. Views are represented by Bokeh’s CDSView class.
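A minimal sketch of the DataFrame route, reusing the column names from the dict example above. My understanding is that the DataFrame index is carried over as an extra column, which the print call lets you check.

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

df = pd.DataFrame({"x_values": [1, 2, 3, 4, 5],
                   "y_values": [6, 7, 2, 3, 6]})

# each DataFrame column becomes a column of the ColumnDataSource
source = ColumnDataSource(df)
print(source.column_names)

p = figure(height=250)
p.scatter(x="x_values", y="y_values", size=20, source=source)
```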
Using Widgets
Widgets are additional visual elements that you can include in your visualization. Use widgets to display additional information or to interactively control elements of your Bokeh document.
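As a small illustration, the sketch below links a Slider widget to the size of a scatter glyph with js_link, so no Bokeh server is needed. The slider bounds and the data are arbitrary choices of mine.

```python
from bokeh.layouts import column
from bokeh.models import Slider
from bokeh.plotting import figure

p = figure(height=250, x_range=(0, 6), y_range=(0, 8))
points = p.scatter([1, 2, 3, 4, 5], [4, 5, 5, 7, 2], size=20)

# a slider whose value drives the glyph size in the browser
slider = Slider(title="Glyph size", start=1, end=40, step=1, value=20)
slider.js_link("value", points.glyph, "size")

# stack the widget above the plot
layout = column(slider, p)
```

Passing layout (rather than p) to show() displays the slider and the plot together.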
You can use Bokeh Server to build complex dashboards and interactive applications.
User Guide
Introduction
bokeh.plotting is Bokeh's primary interface. This general-purpose interface is similar to plotting interfaces of libraries such as Matplotlib or Matlab. The interface lets you focus on relating glyphs to data. It automatically assembles plots with default elements such as axes, grids, and tools for you. The figure() function is at the core of the bokeh.plotting interface. This function creates a figure model that includes methods for adding different kinds of glyphs to a plot. Calling the figure() function is all it takes to create a basic plot object. To add data renderers to your plot object, call a glyph method such as figure.circle(). With Bokeh's low-level bokeh.models interface, you have complete control over how Bokeh creates all elements of your visualization.
Behind the scenes, Bokeh consists of two libraries:
- BokehJS, the JavaScript library
  - BokehJS runs in the browser. This library handles rendering and user interactions. It takes a collection of declarative JSON objects as its input and uses them as instructions on how to handle various aspects of your visualization in a browser.
- Bokeh, the Python library
  - The Python library generates the JSON objects that BokehJS uses to render your visualization in a browser.
Basic Plotting
Creating a ColumnDataSource yourself gives you access to more advanced options. Creating your own ColumnDataSource allows you to share data between multiple plots and widgets. If you use a single ColumnDataSource together with multiple renderers, those renderers also share information about data you select with a select tool from Bokeh's toolbar. Think of a ColumnDataSource as a collection of sequences of data that each have their own, unique column name.
To create a basic ColumnDataSource object, you need a Python dictionary to pass to the object's data parameter:
- Bokeh uses the dictionary's keys as column names
- The dictionary's values are used as the data values for your ColumnDataSource
The data you pass as part of your dict can be any non-string ordered sequences of values.
data = {'x_values': [1, 2, 3, 4, 5],
'y_values': [6, 7, 2, 3, 6]}
source = ColumnDataSource(data=data)
To use a ColumnDataSource with a renderer function, you need to pass at least these three arguments:
- x: the name of the ColumnDataSource's column that contains the data for the x values of your plot
- y: the name of the ColumnDataSource's column that contains the data for the y values of your plot
- source: the name of the ColumnDataSource that contains the columns you just referenced for the x and y arguments
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource
from bokeh.io import show, output_notebook
output_notebook()
# create a Python dict as the basis of your ColumnDataSource
data = {'x_values': [1, 2, 3, 4, 5],
'y_values': [6, 7, 2, 3, 6]}
# create a ColumnDataSource by passing the dict
source = ColumnDataSource(data=data)
# create a plot using the ColumnDataSource's two columns
p = figure()
p.scatter(x='x_values', y='y_values', source=source)
show(p)
Bokeh uses a concept called a "view" to select subsets of data. Views are represented by Bokeh's CDSView class. When you use a view, you can use one or more filters to select specific data points without changing the underlying data. To plot with a filtered subset of data, pass a CDSView to the view argument of any renderer method on a Bokeh plot. A CDSView has one property, filter:
- filter is an instance of Filter model.
There are IndexFilters, BooleanFilters, GroupFilters, and CustomJSFilters.
!pip install bokeh_sampledata
from bokeh.layouts import gridplot
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure, show
from bokeh.sampledata.penguins import data
from bokeh.transform import factor_cmap
from bokeh.io import output_notebook
output_notebook()
SPECIES = sorted(data.species.unique())
TOOLS = "box_select,lasso_select,help"
source = ColumnDataSource(data)
left = figure(width=300, height=400, title=None, tools=TOOLS,
background_fill_color="#fafafa")
left.scatter("bill_length_mm", "body_mass_g", source=source,
color=factor_cmap('species', 'Category10_3', SPECIES))
right = figure(width=300, height=400, title=None, tools=TOOLS,
background_fill_color="#fafafa", y_axis_location="right")
right.scatter("bill_depth_mm", "body_mass_g", source=source,
color=factor_cmap('species', 'Category10_3', SPECIES))
show(gridplot([[left, right]]))
Setting Ranges
By default, Bokeh attempts to automatically set the data bounds of plots to fit snugly around the data. If you want to set the plot's range explicitly, set the x_range and/or y_range properties using a Range1d object that lets you set the start and end points of the range you want.
The figure() function can also accept (start,end) tuples as values for the x_range or y_range parameters.
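Both ways of setting an explicit range can be sketched side by side. The bounds here are arbitrary.

```python
from bokeh.models import Range1d
from bokeh.plotting import figure

# x range given as a (start, end) tuple when creating the figure
p = figure(width=400, height=250, x_range=(0, 20))

# y range set afterwards with a Range1d object
p.y_range = Range1d(0, 15)

p.scatter([1, 2, 3, 4, 5], [4, 5, 5, 7, 2], size=10)
```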
To create a categorical axis, specify a FactorRange for one of the plot's ranges or a list of factors to be converted to one.
For time series, or any data that involves dates or time, you may want to use axes with labels suitable for different date and time scales. The figure() function accepts x_axis_type and y_axis_type as arguments. To specify a datetime axis, pass "datetime" as the value of either of these parameters.
You can also use log axes and mercator axes.
from bokeh.plotting import figure, show
factors = ["a", "b", "c", "d", "e", "f", "g", "h"]
x = [50, 40, 65, 10, 25, 37, 80, 60]
p = figure(y_range=factors)
p.scatter(x, factors, size=15, fill_color="orange", line_color="green", line_width=3)
show(p)
import pandas as pd
from bokeh.plotting import figure, show
from bokeh.sampledata.stocks import AAPL
df = pd.DataFrame(AAPL)
df['date'] = pd.to_datetime(df['date'])
print(df.head())
# create a new plot with a datetime axis type
p = figure(width=800, height=250, x_axis_type="datetime")
p.line(df['date'], df['close'], color='navy', alpha=0.5)
show(p)
Bar Charts
In addition to plotting numerical data on continuous ranges, you can also use Bokeh to plot categorical data on categorical ranges. Basic categorical ranges are represented in Bokeh as sequences of strings.
Bars
One of the most common ways to handle categorical data is to present it in a bar chart. Bar charts have one categorical axis and one continuous axis. Bar charts are useful when there is one value to plot for each category. The values associated with each category are represented by drawing a bar for that category. The length of this bar along the continuous axis corresponds to the value for that category. Bar charts may also be stacked or grouped together according to hierarchical sub-categories. To create a basic bar chart, use the hbar() or vbar() glyph methods. To assign the categories to the x-axis, pass the list of categories as the x_range argument to figure().
from bokeh.plotting import figure, show
fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
counts = [5, 3, 4, 2, 4, 6]
p = figure(x_range=fruits, height=350, title="Fruit Counts",
toolbar_location=None, tools="")
p.vbar(x=fruits, top=counts, width=0.9)
p.xgrid.grid_line_color = None
p.y_range.start = 0
show(p)
from bokeh.plotting import figure, show
fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
counts = [5, 3, 4, 2, 4, 6]
# sorting the bars means sorting the range factors
sorted_fruits = sorted(fruits, key=lambda x: counts[fruits.index(x)])
p = figure(x_range=sorted_fruits, height=350, title="Fruit Counts",
toolbar_location=None, tools="")
p.vbar(x=fruits, top=counts, width=0.9)
p.xgrid.grid_line_color = None
p.y_range.start = 0
show(p)
from bokeh.models import ColumnDataSource
from bokeh.palettes import Bright6
from bokeh.plotting import figure, show
fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
counts = [5, 3, 4, 2, 4, 6]
source = ColumnDataSource(data=dict(fruits=fruits, counts=counts, color=Bright6))
p = figure(x_range=fruits, y_range=(0,9), height=350, title="Fruit Counts",
toolbar_location=None, tools="")
p.vbar(x='fruits', top='counts', width=0.9, color='color', legend_field="fruits", source=source)
p.xgrid.grid_line_color = None
p.legend.orientation = "horizontal"
p.legend.location = "top_center"
show(p)
from bokeh.models import ColumnDataSource
from bokeh.palettes import Bright6
from bokeh.plotting import figure, show
from bokeh.transform import factor_cmap
fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
counts = [5, 3, 4, 2, 4, 6]
source = ColumnDataSource(data=dict(fruits=fruits, counts=counts))
p = figure(x_range=fruits, height=350, toolbar_location=None, title="Fruit Counts")
p.vbar(x='fruits', top='counts', width=0.9, source=source, legend_field="fruits",
line_color='white', fill_color=factor_cmap('fruits', palette=Bright6, factors=fruits))
p.xgrid.grid_line_color = None
p.y_range.start = 0
p.y_range.end = 9
p.legend.orientation = "horizontal"
p.legend.location = "top_center"
show(p)
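The examples above cover single bars per category. For the stacked case mentioned earlier, a sketch with vbar_stack could look like the following. The per-year counts and the colors are invented for illustration.

```python
from bokeh.plotting import figure

fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
years = ["2015", "2016", "2017"]

# invented counts: one list per year, one entry per fruit
data = {'fruits': fruits,
        '2015': [2, 1, 4, 3, 2, 4],
        '2016': [5, 3, 4, 2, 4, 6],
        '2017': [3, 2, 4, 4, 5, 3]}
colors = ["#c9d9d3", "#718dbf", "#e84d60"]

p = figure(x_range=fruits, height=350, title="Fruit Counts by Year",
           toolbar_location=None, tools="")

# one stacked segment per year, each with its own color and legend entry
renderers = p.vbar_stack(years, x='fruits', width=0.9, color=colors,
                         source=data, legend_label=years)

p.y_range.start = 0
p.xgrid.grid_line_color = None
p.legend.orientation = "horizontal"
p.legend.location = "top_left"
```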
To style the visual attributes of Bokeh plots, you need to know what the available properties are. Many objects have three groups of properties in common:
- line properties: line color, width, etc.
- fill properties: fill color, alpha, etc.
- text properties: font styles, colors, etc.
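A short sketch that touches one property from each of the three groups. The specific colors and sizes are placeholders.

```python
from bokeh.plotting import figure

p = figure(height=250, title="Visual properties example")

# line and fill properties on a glyph
p.scatter([1, 2, 3], [4, 5, 6], size=30,
          line_color="navy", line_width=3, line_dash="dashed",
          fill_color="orange", fill_alpha=0.4)

# text properties on the title and the x-axis label
p.title.text_color = "olive"
p.title.text_font_style = "italic"
p.xaxis.axis_label = "x"
p.xaxis.axis_label_text_font_size = "14pt"
```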
You can use LaTeX notation.
Themes in Bokeh are defined in YAML or JSON files. To create your own theme files, follow the format defined in bokeh.themes.Theme.
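Instead of a file, the same structure can be passed to Theme as a Python dict through its json argument. This is a sketch under that assumption. The attribute choices are mine, and the keys under "attrs" are Bokeh model class names.

```python
from bokeh.io import curdoc
from bokeh.themes import Theme

# a small custom theme: defaults are grouped by model class name
theme = Theme(json={
    "attrs": {
        "Plot": {"background_fill_color": "#fafafa"},
        "Axis": {"axis_line_color": "navy",
                 "major_label_text_font_size": "10pt"},
        "Grid": {"grid_line_dash": [6, 4], "grid_line_alpha": 0.5},
        "Title": {"text_color": "navy"},
    }
})

# apply the theme to the current document
curdoc().theme = theme
```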
Bokeh can be used to render image data, contour plots, hex tiles, categorical plots, hierarchical data, geographical data, network graphs, timeseries plots, and statistical plots.
Interaction
Bokeh comes with a number of interactive tools that you can use to report information, to change plot parameters such as zoom level or range extents, or to add, edit, or delete glyphs. Tools can be grouped into 4 basic categories:
- Gestures
  - These tools respond to single gestures, such as a pan movement. The types of gesture tools are:
    1. Pan/Drag tools
    2. Click/Tap tools
    3. Scroll/Pinch tools
- Actions
  - These are immediate or modal operations that are only activated when their button in the toolbar is pressed, such as the ResetTool or ExamineTool.
- Inspectors
  - These are passive tools that report information or annotate plots in some way, such as the HoverTool or CrosshairTool.
- Edit Tools
  - These are sophisticated multi-gesture tools that can add, delete, or modify glyphs on a plot.
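Tools from these categories can be requested by name when creating a figure, or added later as objects. The selection below is only an example.

```python
from bokeh.models import BoxSelectTool
from bokeh.plotting import figure

# name the tools in a comma-separated string...
p = figure(height=250, tools="pan,wheel_zoom,box_zoom,reset,hover",
           toolbar_location="above")
p.scatter([1, 2, 3, 4, 5], [4, 5, 5, 7, 2], size=10)

# ...or add configured tool objects afterwards
p.add_tools(BoxSelectTool(dimensions="width"))
```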
For more information on running JavaScript callbacks whenever the state of an object changes, see the js_on_change callback triggers.
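A minimal sketch of a js_on_change callback: a slider changes the exponent of a curve, and the JavaScript snippet recomputes the y column in the browser. The data, the slider bounds, and the variable names are my own choices.

```python
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, CustomJS, Slider
from bokeh.plotting import figure

x = [i * 0.005 for i in range(200)]
source = ColumnDataSource(data=dict(x=x, y=list(x)))

p = figure(width=400, height=300)
p.line("x", "y", source=source, line_width=3)

# JavaScript that runs in the browser whenever the slider value changes
callback = CustomJS(args=dict(source=source), code="""
    const f = cb_obj.value
    const x = source.data.x
    const y = Array.from(x, (v) => Math.pow(v, f))
    source.data = { x, y }
""")

slider = Slider(start=0.1, end=4, value=1, step=0.1, title="power")
slider.js_on_change("value", callback)

layout = column(slider, p)
```

Because the callback is plain JavaScript, this works in standalone HTML output without a Bokeh server.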