Chapter 4. Visualization with Matplotlib

We’ll now take an in-depth look at the Matplotlib tool for visualization in Python. Matplotlib is a multiplatform data visualization library built on NumPy arrays, and designed to work with the broader SciPy stack. It was conceived by John Hunter in 2002, originally as a patch to IPython for enabling interactive MATLAB-style plotting via gnuplot from the IPython command line. IPython’s creator, Fernando Perez, was at the time scrambling to finish his PhD, and let John know he wouldn’t have time to review the patch for several months. John took this as a cue to set out on his own, and the Matplotlib package was born, with version 0.1 released in 2003. It received an early boost when it was adopted as the plotting package of choice of the Space Telescope Science Institute (the folks behind the Hubble Telescope), which financially supported Matplotlib’s development and greatly expanded its capabilities.

One of Matplotlib’s most important features is its ability to play well with many operating systems and graphics backends. Matplotlib supports dozens of backends and output types, which means you can count on it to work regardless of which operating system you are using or which output format you wish. This cross-platform, everything-to-everyone approach has been one of the great strengths of Matplotlib. It has led to a large userbase, which in turn has led to an active developer base and Matplotlib’s powerful tools and ubiquity within the scientific Python world.

In recent years, however, the interface and style of Matplotlib have begun to show their age. Newer tools like ggplot and ggvis in the R language, along with web visualization toolkits based on D3js and HTML5 canvas, often make Matplotlib feel clunky and old-fashioned. Still, I’m of the opinion that we cannot ignore Matplotlib’s strength as a well-tested, cross-platform graphics engine. Recent Matplotlib versions make it relatively easy to set new global plotting styles (see “Customizing Matplotlib: Configurations and Stylesheets”), and people have been developing new packages that build on its powerful internals to drive Matplotlib via cleaner, more modern APIs—for example, Seaborn (discussed in “Visualization with Seaborn”), ggplot, HoloViews, Altair, and even Pandas itself can be used as wrappers around Matplotlib’s API. Even with wrappers like these, it is still often useful to dive into Matplotlib’s syntax to adjust the final plot output. For this reason, I believe that Matplotlib itself will remain a vital piece of the data visualization stack, even if new tools mean the community gradually moves away from using the Matplotlib API directly.

Before we dive into the details of creating visualizations with Matplotlib, there are a few useful things you should know about using the package.

Importing matplotlib

Just as we use the np shorthand for NumPy and the pd shorthand for Pandas, we will use some standard shorthands for Matplotlib imports:
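
The import listing itself is not reproduced here; a minimal sketch, assuming the widely used mpl and plt aliases, would be:

```python
import matplotlib as mpl          # core package: configuration, colors, and so on
import matplotlib.pyplot as plt   # the pyplot interface used for most plotting
```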

The plt interface is what we will use most often, as we’ll see throughout this chapter.

Setting Styles

We will use the plt.style directive to choose appropriate aesthetic styles for our figures. Here we will set the classic style, which ensures that the plots we create use the classic Matplotlib style:
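
As a minimal sketch, assuming the stylesheet is selected by its registered name 'classic', setting the style could look like this:

```python
import matplotlib.pyplot as plt

# Apply the built-in "classic" stylesheet to all figures created afterwards.
plt.style.use('classic')
```

The names of the other installed stylesheets are listed in plt.style.available.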

Throughout this section, we will adjust this style as needed. Note that the stylesheets used here are supported as of Matplotlib version 1.5; if you are using an earlier version of Matplotlib, only the default style is available. For more information on stylesheets, see “Customizing Matplotlib: Configurations and Stylesheets”.

show() or No show()? How to Display Your Plots

A visualization you can’t see won’t be of much use, but just how you view your Matplotlib plots depends on the context. The best use of Matplotlib differs depending on how you are using it; roughly, the three applicable contexts are using Matplotlib in a script, in an IPython terminal, or in an IPython notebook.

Plotting from a script

If you are using Matplotlib from within a script, the function plt.show() is your friend. plt.show() starts an event loop, looks for all currently active figure objects, and opens one or more interactive windows that display your figure or figures.

So, for example, you may have a file called myplot.py containing the following:
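
A script of this kind might look like the sketch below. The sine and cosine curves are illustrative assumptions, chosen only to have something to draw:

```python
# ------- file: myplot.py ------
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

fig = plt.figure()
plt.plot(x, np.sin(x))
plt.plot(x, np.cos(x))

# Display all open figures; with an interactive backend this opens a window.
plt.show()
```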

You can then run this script from the command-line prompt, which will result in a window opening with your figure displayed:

$ python myplot.py

The plt.show() command does a lot under the hood, as it must interact with your system’s interactive graphical backend. The details of this operation can vary greatly from system to system and even installation to installation, but Matplotlib does its best to hide all these details from you.

One thing to be aware of: the plt.show() command should be used only once per Python session, and is most often seen at the very end of the script. Multiple show() commands can lead to unpredictable backend-dependent behavior, and should mostly be avoided.

Plotting from an IPython shell

It can be very convenient to use Matplotlib interactively within an IPython shell (see Chapter 1). IPython is built to work well with Matplotlib if you specify Matplotlib mode. To enable this mode, you can use the %matplotlib magic command after starting ipython.

At this point, any plt plot command will cause a figure window to open, and further commands can be run to update the plot. Some changes (such as modifying properties of lines that are already drawn) will not draw automatically; to force an update, use plt.draw(). Using plt.show() in Matplotlib mode is not required.

Plotting from an IPython notebook

The IPython notebook is a browser-based interactive data analysis tool that can combine narrative, code, graphics, HTML elements, and much more into a single executable document (see Chapter 1).

Plotting interactively within an IPython notebook can be done with the %matplotlib command, and works in a similar way to the IPython shell. In the IPython notebook, you also have the option of embedding graphics directly in the notebook, with two possible options:

  • %matplotlib notebook will lead to interactive plots embedded within the notebook

  • %matplotlib inline will lead to static images of your plot embedded in the notebook

For this book, we will generally opt for %matplotlib inline.

After you run this command (it needs to be done only once per kernel/session), any cell within the notebook that creates a plot will embed a PNG image of the resulting graphic (Figure 4-1):
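
A minimal sketch of such a cell follows, assuming a sine curve drawn with a solid line and a cosine curve drawn with a dashed line as the example data:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)

fig = plt.figure()
plt.plot(x, np.sin(x), '-')    # solid line
plt.plot(x, np.cos(x), '--')   # dashed line
```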

Figure 4-1. Basic plotting example

Saving Figures to File

One nice feature of Matplotlib is the ability to save figures in a wide variety of formats. You can save a figure using the savefig() command. For example, to save the previous figure as a PNG file, you can run this:
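
The sketch below rebuilds a simple two-line figure (an assumed stand-in for the previous figure) and writes it to my_figure.png in the current working directory:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)
fig = plt.figure()
plt.plot(x, np.sin(x), '-')
plt.plot(x, np.cos(x), '--')

# The output format (PNG here) is taken from the filename extension.
fig.savefig('my_figure.png')
```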

We now have a file called my_figure.png in the current working directory:

-rw-r--r-- 1 jakevdp staff 16K Aug 11 10:59 my_figure.png

To confirm that it contains what we think it contains, let’s use the IPython Image object to display the contents of this file (Figure 4-2):

Figure 4-2. PNG rendering of the basic plot

In savefig(), the file format is inferred from the extension of the given filename. Depending on what backends you have installed, many different file formats are available. You can find the list of supported file types for your system by using the following method of the figure’s canvas object:
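
As a short sketch, the query goes through the canvas attached to the figure; the exact set of entries depends on the installation:

```python
import matplotlib.pyplot as plt

fig = plt.figure()

# Maps each supported file extension to a human-readable description.
filetypes = fig.canvas.get_supported_filetypes()
print(sorted(filetypes))
```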

Out[8]: {'eps': 'Encapsulated Postscript', 'jpeg': 'Joint Photographic Experts Group', 'jpg': 'Joint Photographic Experts Group', 'pdf': 'Portable Document Format', 'pgf': 'PGF code for LaTeX', 'png': 'Portable Network Graphics', 'ps': 'Postscript', 'raw': 'Raw RGBA bitmap', 'rgba': 'Raw RGBA bitmap', 'svg': 'Scalable Vector Graphics', 'svgz': 'Scalable Vector Graphics', 'tif': 'Tagged Image File Format', 'tiff': 'Tagged Image File Format'}

Note that when saving your figure, it’s not necessary to use plt.show() or related commands discussed earlier.

Two Interfaces for the Price of One

A potentially confusing feature of Matplotlib is its dual interfaces: a convenient MATLAB-style state-based interface, and a more powerful object-oriented interface. We’ll quickly highlight the differences between the two here.

MATLAB-style interface

Matplotlib was originally written as a Python alternative for MATLAB users, and much of its syntax reflects that fact. The MATLAB-style tools are contained in the pyplot (plt) interface. For example, the following code will probably look quite familiar to MATLAB users (Figure 4-3):
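
A minimal sketch of the stateful style, assuming a two-panel figure with a sine curve above a cosine curve:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)

plt.figure()  # create a plot figure; it becomes the "current" figure

# Create the first of two panels and make it the current axes.
plt.subplot(2, 1, 1)  # (rows, columns, panel number)
plt.plot(x, np.sin(x))

# Create the second panel and make it the current axes.
plt.subplot(2, 1, 2)
plt.plot(x, np.cos(x))
```

Each plt.plot call lands on whichever axes was created most recently, which is exactly the implicit state described next.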

Figure 4-3. Subplots using the MATLAB-style interface

It’s important to note that this interface is stateful: it keeps track of the “current” figure and axes, which are where all plt commands are applied. You can get a reference to these using the plt.gcf() (get current figure) and plt.gca() (get current axes) routines.

While this stateful interface is fast and convenient for simple plots, it is easy to run into problems. For example, once the second panel is created, how can we go back and add something to the first? This is possible within the MATLAB-style interface, but a bit clunky. Fortunately, there is a better way.

Object-oriented interface

The object-oriented interface is available for these more complicated situations, and for when you want more control over your figure. Rather than depending on some notion of an “active” figure or axes, in the object-oriented interface the plotting functions are methods of explicit Figure and Axes objects. To re-create the previous plot using this style of plotting, you might do the following (Figure 4-4):
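
The same two-panel figure can be sketched with explicit objects; the sine and cosine data are again an illustrative assumption:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)

# Create a figure and a grid of two axes; ax is an array of Axes objects.
fig, ax = plt.subplots(2)

# Call plot() on the appropriate Axes object instead of relying on state.
ax[0].plot(x, np.sin(x))
ax[1].plot(x, np.cos(x))
```

Because each panel is addressed by name, going back to add something to the first panel is just another call on ax[0].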

Figure 4-4. Subplots using the object-oriented interface

For simpler plots, the choice of which style to use is largely a matter of preference, but the object-oriented approach can become a necessity as plots become more complicated. Throughout this chapter, we will switch between the MATLAB-style and object-oriented interfaces, depending on what is most convenient. In most cases, the difference is as small as switching plt.plot() to ax.plot(), but there are a few gotchas that we will highlight as they come up in the following sections.

Simple Line Plots

Perhaps the simplest of all plots is the visualization of a single function y = f(x). Here we will take a first look at creating a simple plot of this type. As with all the following sections, we’ll start by setting up the notebook for plotting and importing the functions we will use:

For all Matplotlib plots, we start by creating a figure and an axes. In their simplest form, a figure and axes can be created as follows (Figure 4-5):

Figure 4-5. An empty gridded axes

In Matplotlib, the figure (an instance of the class plt.Figure) can be thought of as a single container that contains all the objects representing axes, graphics, text, and labels. The axes (an instance of the class plt.Axes) is what we see above: a bounding box with ticks and labels, which will eventually contain the plot elements that make up our visualization. Throughout this book, we’ll commonly use the variable name fig to refer to a figure instance, and ax to refer to an axes instance or group of axes instances.

Once we have created an axes, we can use the ax.plot function to plot some data. Let’s start with a simple sinusoid (Figure 4-6):
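
A minimal sketch, assuming 1,000 evenly spaced sample points between 0 and 10:

```python
import numpy as np
import matplotlib.pyplot as plt

fig = plt.figure()   # the container for everything in the plot
ax = plt.axes()      # a single axes inside that figure

x = np.linspace(0, 10, 1000)
ax.plot(x, np.sin(x))
```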

Figure 4-6. A simple sinusoid

Alternatively, we can use the pyplot interface and let the figure and axes be created for us in the background (Figure 4-7; see “Two Interfaces for the Price of One” for a discussion of these two interfaces):

Figure 4-7. A simple sinusoid via the pyplot interface

If we want to create a single figure with multiple lines, we can simply call the plot function multiple times (Figure 4-8):

Figure 4-8. Over-plotting multiple lines

That’s all there is to plotting simple functions in Matplotlib! We’ll now dive into some more details about how to control the appearance of the axes and lines.

Adjusting the Plot: Line Colors and Styles

The first adjustment you might wish to make to a plot is to control the line colors and styles. The plt.plot() function takes additional arguments that can be used to specify these. To adjust the color, you can use the color keyword, which accepts a string argument representing virtually any imaginable color. The color can be specified in a variety of ways (Figure 4-9):
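
The sketch below shows several accepted color specifications; the shifted sine curves are an assumption used only to keep the lines apart:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 1000)
fig = plt.figure()

plt.plot(x, np.sin(x - 0), color='blue')           # color by name
plt.plot(x, np.sin(x - 1), color='g')              # short color code (rgbcmyk)
plt.plot(x, np.sin(x - 2), color='0.75')           # grayscale between 0 and 1
plt.plot(x, np.sin(x - 3), color='#FFDD44')        # hex code (RRGGBB, 00 to FF)
plt.plot(x, np.sin(x - 4), color=(1.0, 0.2, 0.3))  # RGB tuple, values 0 to 1
plt.plot(x, np.sin(x - 5), color='chartreuse')     # HTML/CSS color name
```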

Figure 4-9. Controlling the color of plot elements

If no color is specified, Matplotlib will automatically cycle through a set of default colors for multiple lines.

Similarly, you can adjust the line style using the linestyle keyword (Figure 4-10):

Figure 4-10. Example of various line styles

If you would like to be extremely terse, these linestyle and color codes can be combined into a single nonkeyword argument to the plt.plot() function (Figure 4-11):
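
As a sketch, the first four lines below use the linestyle keyword and the last four use the combined shorthand; the straight-line data is an illustrative assumption:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 1000)
fig = plt.figure()

# Line style given by name through the linestyle keyword.
plt.plot(x, x + 0, linestyle='solid')
plt.plot(x, x + 1, linestyle='dashed')
plt.plot(x, x + 2, linestyle='dashdot')
plt.plot(x, x + 3, linestyle='dotted')

# Line style and color combined in one format string.
plt.plot(x, x + 4, '-g')   # solid green
plt.plot(x, x + 5, '--c')  # dashed cyan
plt.plot(x, x + 6, '-.k')  # dashdot black
plt.plot(x, x + 7, ':r')   # dotted red
```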

Figure 4-11. Controlling colors and styles with the shorthand syntax

These single-character color codes reflect the standard abbreviations in the RGB (Red/Green/Blue) and CMYK (Cyan/Magenta/Yellow/blacK) color systems, commonly used for digital color graphics.

There are many other keyword arguments that can be used to fine-tune the appearance of the plot; for more details, I’d suggest viewing the docstring of the plt.plot() function using IPython’s help tools (see “Help and Documentation in IPython”).

Adjusting the Plot: Axes Limits

Matplotlib does a decent job of choosing default axes limits for your plot, but sometimes it’s nice to have finer control. The most basic way to adjust axis limits is to use the plt.xlim() and plt.ylim() methods (Figure 4-12):
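
A minimal sketch; the particular limits are arbitrary values chosen for illustration:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 1000)
fig = plt.figure()
plt.plot(x, np.sin(x))

# Set the visible range of each axis explicitly.
plt.xlim(-1, 11)
plt.ylim(-1.5, 1.5)
```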

Figure 4-12. Example of setting axis limits

If for some reason you’d like either axis to be displayed in reverse, you can simply reverse the order of the arguments (Figure 4-13):

Figure 4-13. Example of reversing the y-axis

A useful related method is plt.axis() (note here the potential confusion between axes with an e, and axis with an i). The plt.axis() method allows you to set the x and y limits with a single call, by passing a list that specifies [xmin, xmax, ymin, ymax] (Figure 4-14):

Figure 4-14. Setting the axis limits with plt.axis

The plt.axis() method goes even beyond this, allowing you to do things like automatically tighten the bounds around the current plot (Figure 4-15):

Figure 4-15. Example of a “tight” layout

It allows even higher-level specifications, such as ensuring an equal aspect ratio so that on your screen, one unit in x is equal to one unit in y (Figure 4-16):

Figure 4-16. Example of an “equal” layout, with units matched to the output resolution

For more information on axis limits and the other capabilities of the plt.axis method, refer to the plt.axis docstring.

Labeling Plots

As the last piece of this section, we’ll briefly look at the labeling of plots: titles, axis labels, and simple legends.

Titles and axis labels are the simplest such labels—there are methods that can be used to quickly set them (Figure 4-17):

Figure 4-17. Examples of axis labels and title

You can adjust the position, size, and style of these labels using optional arguments to the function. For more information, see the Matplotlib documentation and the docstrings of each of these functions.

When multiple lines are being shown within a single axes, it can be useful to create a plot legend that labels each line type. Again, Matplotlib has a built-in way of quickly creating such a legend. It is done via the (you guessed it) plt.legend() method. Though there are several valid ways of using this, I find it easiest to specify the label of each line using the label keyword of the plot function (Figure 4-18):
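
The sketch below combines a title, axis labels, and a legend built from per-line labels; the label strings are illustrative assumptions:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 1000)
fig = plt.figure()

# The label keyword attaches the legend text to each line.
plt.plot(x, np.sin(x), '-g', label='sin(x)')
plt.plot(x, np.cos(x), ':b', label='cos(x)')

plt.title("Sine and Cosine")
plt.xlabel("x")
plt.ylabel("value")

# The legend picks up every labeled line automatically.
legend = plt.legend()
```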

Figure 4-18. Plot legend example

As you can see, the plt.legend() function keeps track of the line style and color, and matches these with the correct label. More information on specifying and formatting plot legends can be found in the plt.legend docstring; additionally, we will cover some more advanced legend options in “Customizing Plot Legends”.

Simple Scatter Plots

Another commonly used plot type is the simple scatter plot, a close cousin of the line plot. Instead of points being joined by line segments, here the points are represented individually with a dot, circle, or other shape. We’ll start by setting up the notebook for plotting and importing the functions we will use:

Scatter Plots with plt.plot

In the previous section, we looked at plt.plot/ax.plot to produce line plots. It turns out that this same function can produce scatter plots as well (Figure 4-20):
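
A minimal sketch, assuming 30 sample points on a sine curve and the 'o' marker code:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 30)
y = np.sin(x)

fig = plt.figure()
# A marker code with no line code draws the points without joining them.
plt.plot(x, y, 'o', color='black')
```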

Figure 4-20. Scatter plot example

The third argument in the function call is a character that represents the type of symbol used for the plotting. Just as you can specify options such as '-' and '--' to control the line style, the marker style has its own set of short string codes. The full list of available symbols can be seen in the documentation of plt.plot, or in Matplotlib’s online documentation. Most of the possibilities are fairly intuitive, and we’ll show a number of the more common ones here (Figure 4-21):

Figure 4-21. Demonstration of point numbers

For even more possibilities, these character codes can be used together with line and color codes to plot points along with a line connecting them (Figure 4-22):

Figure 4-22. Combining line and point markers

Additional keyword arguments to plt.plot specify a wide range of properties of the lines and markers (Figure 4-23):

Figure 4-23. Customizing line and point numbers

This type of flexibility in the plt.plot function allows for a wide variety of possible visualization options. For a full description of the options available, refer to the plt.plot documentation.

Scatter Plots with plt.scatter

A second, more powerful method of creating scatter plots is the plt.scatter function, which can be used very similarly to the plt.plot function (Figure 4-24):

Figure 4-24. A simple scatter plot

The primary difference of plt.scatter from plt.plot is that it can be used to create scatter plots where the properties of each individual point (size, face color, edge color, etc.) can be individually controlled or mapped to data.

Let’s show this by creating a random scatter plot with points of many colors and sizes. In order to better see the overlapping results, we’ll also use the alpha keyword to adjust the transparency level (Figure 4-25):
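
A sketch with randomly generated data; the seed, point count, size scale, and colormap are all illustrative assumptions:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.RandomState(0)
x = rng.randn(100)
y = rng.randn(100)
colors = rng.rand(100)        # one scalar per point, mapped through the colormap
sizes = 1000 * rng.rand(100)  # one marker area per point

fig = plt.figure()
points = plt.scatter(x, y, c=colors, s=sizes, alpha=0.3, cmap='viridis')
plt.colorbar()  # show the color scale
```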

Figure 4-25. Changing size, color, and transparency in scatter points

Notice that the color argument is automatically mapped to a color scale (shown here by the colorbar() command), and the size argument is given in points squared, that is, as a marker area rather than a width in pixels. In this way, the color and size of points can be used to convey information in the visualization, in order to illustrate multidimensional data.

For example, we might use the Iris data from Scikit-Learn, where each sample is one of three types of flowers that has had the size of its petals and sepals carefully measured (Figure 4-26):

Figure 4-26. Using point properties to encode features of the Iris data

We can see that this scatter plot has given us the ability to simultaneously explore four different dimensions of the data: the (x, y) location of each point corresponds to the sepal length and width, the size of the point is related to the petal width, and the color is related to the particular species of flower. Multicolor and multifeature scatter plots like this can be useful for both exploration and presentation of data.

plot Versus scatter: A Note on Efficiency

Aside from the different features available in plt.plot and plt.scatter, why might you choose to use one over the other? While it doesn’t matter as much for small amounts of data, as datasets get larger than a few thousand points, plt.plot can be noticeably more efficient than plt.scatter. The reason is that plt.scatter has the capability to render a different size and/or color for each point, so the renderer must do the extra work of constructing each point individually. In plt.plot, on the other hand, the points are always essentially clones of each other, so the work of determining the appearance of the points is done only once for the entire set of data. For large datasets, the difference between these two can lead to vastly different performance, and for this reason, plt.plot should be preferred over plt.scatter for large datasets.

For any scientific measurement, accurate accounting for errors is nearly as important as, if not more important than, accurate reporting of the number itself. For example, imagine that I am using some astrophysical observations to estimate the Hubble Constant, the local measurement of the expansion rate of the universe. I know that the current literature suggests a value of around 71 (km/s)/Mpc, and I measure a value of 74 (km/s)/Mpc with my method. Are the values consistent? The only correct answer, given this information, is this: there is no way to know.

Suppose I augment this information with reported uncertainties: the current literature suggests a value of around 71 ± 2.5 (km/s)/Mpc, and my method has measured a value of 74 ± 5 (km/s)/Mpc. Now are the values consistent? That is a question that can be quantitatively answered.

In visualization of data and results, showing these errors effectively can make a plot convey much more complete information.

Basic Errorbars

A basic errorbar can be created with a single Matplotlib function call (Figure 4-27):
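
A minimal sketch, assuming noisy samples of a sine curve with a constant uncertainty of 0.8 on every point:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.RandomState(0)
x = np.linspace(0, 10, 50)
dy = 0.8                                # assumed uncertainty on each y value
y = np.sin(x) + dy * rng.randn(50)

fig = plt.figure()
# fmt='.k' draws black points; yerr adds a vertical errorbar to each one.
container = plt.errorbar(x, y, yerr=dy, fmt='.k')
```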

Figure 4-27. An errorbar example

Here the fmt is a format code controlling the appearance of lines and points, and has the same syntax as the shorthand used in plt.plot, outlined in “Simple Line Plots” and “Simple Scatter Plots”.

In addition to these basic options, the errorbar function has many options to fine-tune the outputs. Using these additional options you can easily customize the aesthetics of your errorbar plot. I often find it helpful, especially in crowded plots, to make the errorbars lighter than the points themselves (Figure 4-28):

Figure 4-28. Customizing errorbars

In addition to these options, you can also specify horizontal errorbars (xerr), one-sided errorbars, and many other variants. For more information on the options available, refer to the docstring of plt.errorbar.

Continuous Errors

In some situations it is desirable to show errorbars on continuous quantities. Though Matplotlib does not have a built-in convenience routine for this type of application, it’s relatively easy to combine primitives like plt.plot and plt.fill_between for a useful result.

Here we’ll perform a simple Gaussian process regression (GPR), using the Scikit-Learn API (see “Introducing Scikit-Learn” for details). This is a method of fitting a very flexible nonparametric function to data with a continuous measure of the uncertainty. We won’t delve into the details of Gaussian process regression at this point, but will focus instead on how you might visualize such a continuous error measurement:

We now have arrays holding the x values, the fitted y values, and the estimated uncertainties, which together sample the continuous fit to our data. We could pass these to the plt.errorbar function as above, but we don’t really want to plot 1,000 points with 1,000 errorbars. Instead, we can use the plt.fill_between function with a light color to visualize this continuous error (Figure 4-29):

Figure 4-29. Representing continuous uncertainty with filled regions

Note what we’ve done here with the fill_between function: we pass an x value, then the lower y-bound, then the upper y-bound, and the result is that the area between these bounds is filled.
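
The call pattern can be sketched without running a regression at all. In the example below the curve and its uncertainty are made-up stand-ins for a model fit (the names xfit, yfit, and dyfit are hypothetical), not the output of a Gaussian process:

```python
import numpy as np
import matplotlib.pyplot as plt

xfit = np.linspace(0, 10, 1000)
yfit = xfit * np.sin(xfit)             # stand-in for the model prediction
dyfit = 0.5 + 0.2 * np.abs(xfit - 5)   # stand-in for the model uncertainty

fig = plt.figure()
plt.plot(xfit, yfit, '-', color='gray')

# Arguments: x values, lower y-bound, upper y-bound.
band = plt.fill_between(xfit, yfit - dyfit, yfit + dyfit,
                        color='gray', alpha=0.2)
```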

The resulting figure gives a very intuitive view into what the Gaussian process regression algorithm is doing: in regions near a measured data point, the model is strongly constrained and this is reflected in the small model errors. In regions far from a measured data point, the model is not strongly constrained, and the model errors increase.

For more information on the options available in plt.fill_between() (and the closely related plt.fill() function), see the function docstring or the Matplotlib documentation.

Finally, if this seems a bit too low level for your taste, refer to “Visualization with Seaborn”, where we discuss the Seaborn package, which has a more streamlined API for visualizing this type of continuous errorbar.

Sometimes it is useful to display three-dimensional data in two dimensions using contours or color-coded regions. There are three Matplotlib functions that can be helpful for this task: plt.contour for contour plots, plt.contourf for filled contour plots, and plt.imshow for showing images. This section looks at several examples of using these. We’ll start by setting up the notebook for plotting and importing the functions we will use:

Visualizing a Three-Dimensional Function

We’ll start by demonstrating a contour plot using a function z = f(x, y), using the following particular choice for f (we’ve seen this before in “Computation on Arrays: Broadcasting”, when we used it as a motivating example for array broadcasting):

A contour plot can be created with the plt.contour function. It takes three arguments: a grid of x values, a grid of y values, and a grid of z values. The x and y values represent positions on the plot, and the z values will be represented by the contour levels. Perhaps the most straightforward way to prepare such data is to use the np.meshgrid function, which builds two-dimensional grids from one-dimensional arrays:
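
A sketch of the whole pipeline follows. The function f and the grid sizes are illustrative assumptions; any function of two variables would serve:

```python
import numpy as np
import matplotlib.pyplot as plt

def f(x, y):
    # An arbitrary smooth function of two variables, used only as test data.
    return np.sin(x) ** 10 + np.cos(10 + y * x) * np.cos(x)

x = np.linspace(0, 5, 50)
y = np.linspace(0, 5, 40)

# meshgrid expands the 1D arrays into 2D grids of shape (len(y), len(x)).
X, Y = np.meshgrid(x, y)
Z = f(X, Y)

fig = plt.figure()
contours = plt.contour(X, Y, Z, colors='black')
```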

Now let’s look at this with a standard line-only contour plot (Figure 4-30):

Figure 4-30. Visualizing three-dimensional data with contours

Notice that by default when a single color is used, negative values are represented by dashed lines, and positive values by solid lines. Alternatively, you can color-code the lines by specifying a colormap with the cmap argument. Here, we’ll also specify that we want more lines to be drawn: 20 equally spaced intervals within the data range (Figure 4-31):

Figure 4-31. Visualizing three-dimensional data with colored contours

Here we chose the RdGy (short for Red-Gray) colormap, which is a good choice for centered data. Matplotlib has a wide range of colormaps available, which you can easily browse in IPython by doing a tab completion on the plt.cm module: plt.cm.<TAB>

Our plot is looking nicer, but the spaces between the lines may be a bit distracting. We can change this by switching to a filled contour plot using the plt.contourf() function (notice the f at the end), which uses largely the same syntax as plt.contour().

Additionally, we’ll add a plt.colorbar() command, which automatically creates an additional axis with labeled color information for the plot (Figure 4-32):

Figure 4-32. Visualizing three-dimensional data with filled contours

The colorbar makes it clear that the black regions are “peaks,” while the red regions are “valleys.”

One potential issue with this plot is that it is a bit “splotchy.” That is, the color steps are discrete rather than continuous, which is not always what is desired. You could remedy this by setting the number of contours to a very high number, but this results in a rather inefficient plot: Matplotlib must render a new polygon for each step in the level. A better way to handle this is to use the plt.imshow() function, which interprets a two-dimensional grid of data as an image.

Figure 4-33 shows the result of the following code:

There are a few potential gotchas with imshow(), however:

  • plt.imshow() doesn’t accept an x and y grid, so you must manually specify the extent [xmin, xmax, ymin, ymax] of the image on the plot.

  • plt.imshow() by default follows the standard image array definition where the origin is in the upper left, not in the lower left as in most contour plots. This must be changed when showing gridded data.

  • plt.imshow() will automatically adjust the axis aspect ratio to match the input data; you can change this by setting the axis aspect, for example with plt.axis('image'), to make x and y units match.
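
The first two points can be sketched as follows. The function f and the grid are illustrative assumptions; extent and origin are the imshow arguments that address the gotchas:

```python
import numpy as np
import matplotlib.pyplot as plt

def f(x, y):
    # An arbitrary smooth function of two variables, used only as test data.
    return np.sin(x) ** 10 + np.cos(10 + y * x) * np.cos(x)

x = np.linspace(0, 5, 50)
y = np.linspace(0, 5, 40)
X, Y = np.meshgrid(x, y)
Z = f(X, Y)

fig = plt.figure()
# extent gives the data coordinates of the image edges;
# origin='lower' puts row 0 of Z at the bottom, as in a contour plot.
image = plt.imshow(Z, extent=[0, 5, 0, 5], origin='lower', cmap='RdGy')
plt.colorbar()
```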

Figure 4-33. Representing three-dimensional data as an image

Finally, it can sometimes be useful to combine contour plots and image plots. For example, to create the effect shown in Figure 4-34, we’ll use a partially transparent background image (with transparency set via the alpha parameter) and over-plot contours with labels on the contours themselves (using the plt.clabel() function):

Figure 4-34. Labeled contours on top of an image

The combination of these three functions (plt.contour, plt.contourf, and plt.imshow) gives nearly limitless possibilities for displaying this sort of three-dimensional data within a two-dimensional plot. For more information on the options available in these functions, refer to their docstrings. If you are interested in three-dimensional visualizations of this type of data, see “Three-Dimensional Plotting in Matplotlib”.

A simple histogram can be a great first step in understanding a dataset. Earlier, we saw a preview of Matplotlib’s histogram function (see “Comparisons, Masks, and Boolean Logic”), which creates a basic histogram in one line, once the normal boilerplate imports are done (Figure 4-35):

Figure 4-35. A simple histogram

The hist() function has many options to tune both the calculation and the display; here’s an example of a more customized histogram (Figure 4-36):
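
A sketch of a customized histogram, assuming 1,000 normally distributed samples; the bin count, colors, and other settings are illustrative choices:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.RandomState(0)
data = rng.randn(1000)

fig = plt.figure()
# density=True rescales the bar heights so the histogram integrates to 1.
counts, edges, patches = plt.hist(data, bins=30, density=True, alpha=0.5,
                                  histtype='stepfilled', color='steelblue',
                                  edgecolor='none')
```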

Figure 4-36. A customized histogram

The plt.hist docstring has more information on other customization options available. I find this combination of histtype='stepfilled' along with some transparency alpha to be very useful when comparing histograms of several distributions (Figure 4-37):

Figure 4-37. Over-plotting multiple histograms

If you would like to simply compute the histogram (that is, count the number of points in a given bin) and not display it, the np.histogram() function is available:
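
A minimal sketch, assuming 1,000 random samples and five bins; the counts printed depend on the random data, so they will not match any particular listing:

```python
import numpy as np

rng = np.random.RandomState(0)
data = rng.randn(1000)

# Returns the count in each bin and the bin edges (one more edge than bins).
counts, bin_edges = np.histogram(data, bins=5)
print(counts)
```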

[ 12 190 468 301 29]

Two-Dimensional Histograms and Binnings

Just as we create histograms in one dimension by dividing the number line into bins, we can also create histograms in two dimensions by dividing points among two-dimensional bins. We’ll take a brief look at several ways to do this here. We’ll start by defining some data: an x and y array drawn from a multivariate Gaussian distribution:

plt.hist2d: Two-dimensional histogram

One straightforward way to plot a two-dimensional histogram is to use Matplotlib’s plt.hist2d function (Figure 4-38):
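
The sketch below generates its own data first. The mean, covariance, sample size, and bin count are illustrative assumptions:

```python
import numpy as np
import matplotlib.pyplot as plt

# Correlated two-dimensional Gaussian samples.
mean = [0, 0]
cov = [[1, 1], [1, 2]]
rng = np.random.RandomState(0)
x, y = rng.multivariate_normal(mean, cov, 10000).T

fig = plt.figure()
# Returns the 2D array of counts, the bin edges, and the image artist.
counts, xedges, yedges, image = plt.hist2d(x, y, bins=30, cmap='Blues')
cb = plt.colorbar()
cb.set_label('counts in bin')
```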

Figure 4-38. A two-dimensional histogram with plt.hist2d

Just as with plt.hist, plt.hist2d has a number of extra options to fine-tune the plot and the binning, which are nicely outlined in the function docstring. Further, just as plt.hist has a counterpart in np.histogram, plt.hist2d has a counterpart in np.histogram2d, which can be used as follows:

For the generalization of this histogram binning in dimensions higher than two, see the np.histogramdd function.

plt.hexbin: Hexagonal binnings

The two-dimensional histogram creates a tessellation of squares across the axes. Another natural shape for such a tessellation is the regular hexagon. For this purpose, Matplotlib provides the plt.hexbin routine, which represents a two-dimensional dataset binned within a grid of hexagons (Figure 4-39):

Figure 4-39. A two-dimensional histogram with plt.hexbin

plt.hexbin has a number of interesting options, including the ability to specify weights for each point, and to change the output in each bin to any NumPy aggregate (mean of weights, standard deviation of weights, etc.).

Kernel density estimation

Another common method of evaluating densities in multiple dimensions is kernel density estimation (KDE). This will be discussed more fully in “In-Depth: Kernel Density Estimation”, but for now we’ll simply mention that KDE can be thought of as a way to “smear out” the points in space and add up the result to obtain a smooth function. One extremely quick and simple KDE implementation exists in the scipy.stats package. Here is a quick example of using the KDE on this data (Figure 4-40):

Figure 4-40. A kernel density representation of a distribution

KDE has a smoothing length that effectively slides the knob between detail and smoothness (one example of the ubiquitous bias–variance trade-off). The literature on choosing an appropriate smoothing length is vast: gaussian_kde uses a rule of thumb to attempt to find a nearly optimal smoothing length for the input data.

Other KDE implementations are available within the SciPy ecosystem, each with its own various strengths and weaknesses; see, for example, sklearn.neighbors.KernelDensity and statsmodels.nonparametric.kernel_density.KDEMultivariate. For visualizations based on KDE, using Matplotlib tends to be overly verbose. The Seaborn library, discussed in “Visualization with Seaborn”, provides a much more terse API for creating KDE-based visualizations.

Customizing Plot Legends

Plot legends give meaning to a visualization, assigning labels to the various plot elements. We previously saw how to create a simple legend; here we’ll take a look at customizing the placement and aesthetics of the legend in Matplotlib.

The simplest legend can be created with the plt.legend() command, which automatically creates a legend for any labeled plot elements (Figure 4-41):

Figure 4-41. A default plot legend

But there are many ways we might want to customize such a legend. For example, we can specify the location and turn off the frame (Figure 4-42):
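
A minimal sketch, assuming two labeled curves; loc and frameon are the legend keywords that control placement and the surrounding frame:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 1000)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x), '-b', label='Sine')
ax.plot(x, np.cos(x), '--r', label='Cosine')

# Place the legend in the upper-left corner and draw it without a frame.
leg = ax.legend(loc='upper left', frameon=False)
```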

Figure 4-42. A customized plot legend

We can use the ncol keyword to specify the number of columns in the legend (Figure 4-43):

Figure 4-43. A two-column plot legend

We can use a rounded box (fancybox) or add a shadow, change the transparency (alpha value) of the frame, or change the padding around the text (Figure 4-44):

Figure 4-44. A fancybox plot legend
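The column and box options above can be combined in one call. A minimal sketch, again assuming two labeled stand-in curves; the particular values are illustrative:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 1000)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x), '-b', label='Sine')
ax.plot(x, np.cos(x), '--r', label='Cosine')

# Two columns, rounded (fancy) box, opaque frame, drop shadow, extra padding
leg = ax.legend(loc='lower center', ncol=2, frameon=True,
                fancybox=True, framealpha=1, shadow=True, borderpad=1)
labels = [t.get_text() for t in leg.get_texts()]
```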

For more information on available legend options, see the plt.legend docstring.

Choosing Elements for the Legend

As we’ve already seen, the legend includes all labeled elements by default. If this is not what is desired, we can fine-tune which elements and labels appear in the legend by using the objects returned by plot commands. The plt.plot() command is able to create multiple lines at once, and returns a list of created line instances. Passing any of these to plt.legend() will tell it which to identify, along with the labels we’d like to specify (Figure 4-45):

Figure 4-45. Customization of legend elements

I generally find in practice that it is clearer to use the first method, applying labels to the plot elements you’d like to show on the legend (Figure 4-46):

Figure 4-46. Alternative method of customizing legend elements

Notice that by default, the legend ignores all elements without a label attribute set.
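Both approaches can be sketched side by side. This is a minimal illustration, assuming four phase-shifted sine curves as stand-in data:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 1000)
# Four phase-shifted sine curves, one per column (stand-in data)
y = np.sin(x[:, np.newaxis] + np.pi * np.arange(0, 2, 0.5))

# Method 1: keep the returned line objects and pass handles plus labels explicitly
fig1, ax1 = plt.subplots()
lines = ax1.plot(x, y)  # one Line2D instance per column of y
leg1 = ax1.legend(lines[:2], ['first', 'second'])
labels1 = [t.get_text() for t in leg1.get_texts()]

# Method 2: label the elements when plotting; unlabeled ones are left out
fig2, ax2 = plt.subplots()
ax2.plot(x, y[:, 0], label='first')
ax2.plot(x, y[:, 1], label='second')
ax2.plot(x, y[:, 2:])  # no label, so not shown in the legend
leg2 = ax2.legend(framealpha=1, frameon=True)
labels2 = [t.get_text() for t in leg2.get_texts()]
```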

Legend for Size of Points

Sometimes the legend defaults are not sufficient for the given visualization. For example, perhaps you’re using the size of points to mark certain features of the data, and want to create a legend reflecting this. Here is an example where we’ll use the size of points to indicate populations of California cities. We’d like a legend that specifies the scale of the sizes of the points, and we’ll accomplish this by plotting some labeled data with no entries (Figure 4-47):

Figure 4-47. Location, geographic size, and population of California cities

The legend will always reference some object that is on the plot, so if we’d like to display a particular shape we need to plot it. In this case, the objects we want (gray circles) are not on the plot, so we fake them by plotting empty lists. Notice too that the legend only lists plot elements that have a label specified.

By plotting empty lists, we create labeled plot objects that are picked up by the legend, and now our legend tells us some useful information. This strategy can be useful for creating more sophisticated visualizations.
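The cities dataset and the original listing are not reproduced in this text. A minimal sketch of the empty-list trick, using randomly generated stand-in values (the coordinates, populations, and areas below are hypothetical, not real city data):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in data: 50 "cities" with random position, population, and area
rng = np.random.default_rng(1)
lon = rng.uniform(-124, -114, 50)
lat = rng.uniform(32, 42, 50)
population = rng.integers(50_000, 4_000_000, 50)
area = rng.uniform(10, 500, 50)

fig, ax = plt.subplots()
# Main scatter: color encodes population, marker size encodes area; no label given
sc = ax.scatter(lon, lat, c=np.log10(population), s=area,
                cmap='viridis', linewidth=0, alpha=0.5)
fig.colorbar(sc, ax=ax, label='log$_{10}$(population)')

# Legend entries: plot empty lists with the desired size and a label
for a in [100, 300, 500]:
    ax.scatter([], [], c='k', alpha=0.3, s=a, label=f'{a} km$^2$')
leg = ax.legend(scatterpoints=1, frameon=False, labelspacing=1, title='City Area')
labels = [t.get_text() for t in leg.get_texts()]
```

Only the three empty, labeled scatters appear in the legend; the main scatter carries no label and is skipped.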

Finally, note that for geographic data like this, it would be clearer if we could show state boundaries or other map-specific elements. For this, an excellent choice of tool is Matplotlib’s Basemap add-on toolkit, which we’ll explore in “Geographic Data with Basemap”.

Multiple Legends

Sometimes when designing a plot you’d like to add multiple legends to the same axes. Unfortunately, Matplotlib does not make this easy: via the standard interface, it is only possible to create a single legend for the entire plot. If you try to create a second legend using plt.legend() or ax.legend(), it will simply override the first one. We can work around this by creating a new legend artist from scratch, and then using the lower-level ax.add_artist() method to manually add the second artist to the plot (Figure 4-48):

Figure 4-48. A split plot legend

This is a peek into the low-level artist objects that compose any Matplotlib plot. If you examine the source code of ax.legend() (recall that you can do this within the IPython notebook using ax.legend??) you’ll see that the function simply consists of some logic to create a suitable Legend artist, which is then saved in the legend_ attribute and added to the figure when the plot is drawn.
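The workaround can be sketched as follows. This is a minimal illustration, assuming four stand-in curves split across two legends; the labels and locations are arbitrary:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.legend import Legend

fig, ax = plt.subplots()
x = np.linspace(0, 10, 1000)
styles = ['-', '--', '-.', ':']
lines = []
for i in range(4):
    lines += ax.plot(x, np.sin(x - i * np.pi / 2), styles[i], color='black')

# First legend: created through the standard interface
first = ax.legend(lines[:2], ['line A', 'line B'], loc='upper right', frameon=False)

# Second legend: build the Legend artist directly and add it by hand,
# so it does not replace the first one
second = Legend(ax, lines[2:], ['line C', 'line D'], loc='lower right', frameon=False)
ax.add_artist(second)
```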

Plot legends identify discrete labels of discrete points. For continuous labels based on the color of points, lines, or regions, a labeled colorbar can be a great tool. In Matplotlib, a colorbar is a separate axes that can provide a key for the meaning of colors in a plot. Because the book is printed in black and white, this section has an accompanying online appendix where you can view the figures in full color. We’ll start by setting up the notebook for plotting and importing the functions we will use:

As we have seen several times throughout this section, the simplest colorbar can be created with the plt.colorbar function (Figure 4-49):

Figure 4-49. A simple colorbar legend

We’ll now discuss a few ideas for customizing these colorbars and using them effectively in various situations.

Customizing Colorbars

We can specify the colormap using the cmap argument to the plotting function that is creating the visualization (Figure 4-50):

Figure 4-50. A grayscale colormap
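The listings for the colorbar figures are not reproduced in this text. A minimal sketch, assuming a smooth two-dimensional test image as stand-in data, of creating a colorbar and choosing the colormap by name:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Stand-in image data (an assumption): a smooth 2D pattern
x = np.linspace(0, 10, 1000)
I = np.sin(x) * np.cos(x[:, np.newaxis])

fig, ax = plt.subplots()
im = ax.imshow(I, cmap='gray')  # cmap selects the colormap by name
cbar = fig.colorbar(im, ax=ax)  # adds a separate axes holding the color key
```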

All the available colormaps are in the plt.cm namespace; using IPython’s tab-completion feature will give you a full list of built-in possibilities: plt.cm.<TAB>

But being able to choose a colormap is just the first step: more important is how to decide among the possibilities! The choice turns out to be much more subtle than you might initially expect.

Choosing the colormap

A full treatment of color choice within visualization is beyond the scope of this book, but for entertaining reading on this subject and others, see the article “Ten Simple Rules for Better Figures”. Matplotlib’s online documentation also has an interesting discussion of colormap choice.

Broadly, you should be aware of three different categories of colormaps:

Sequential colormaps

These consist of one continuous sequence of colors (e.g., binary or viridis).

Divergent colormaps

These usually contain two distinct colors, which show positive and negative deviations from a mean (e.g., RdBu or PuOr).

Qualitative colormaps

These mix colors with no particular sequence (e.g., rainbow or jet).

The jet colormap, which was the default in Matplotlib prior to version 2.0, is an example of a qualitative colormap. Its status as the default was quite unfortunate, because qualitative maps are often a poor choice for representing quantitative data. Among the problems is the fact that qualitative maps usually do not display any uniform progression in brightness as the scale increases.

We can see this by converting the colorbar into black and white (Figure 4-51):

Figure 4-51. The jet colormap and its uneven luminance scale

Notice the bright stripes in the grayscale image. Even in full color, this uneven brightness means that the eye will be drawn to certain portions of the color range, which will potentially emphasize unimportant parts of the dataset. It’s better to use a colormap such as viridis (the default as of Matplotlib 2.0), which is specifically constructed to have an even brightness variation across the range. Thus, it not only plays well with our color perception, but also will translate well to grayscale printing (Figure 4-52):

Figure 4-52. The viridis colormap and its even luminance scale
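The grayscale comparison can be sketched numerically. This is a minimal illustration rather than the original listing: the helper names are hypothetical, and the weights are one common approximation of perceived luminance (an assumption):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import LinearSegmentedColormap

def luminance(cmap_name):
    """Approximate perceived luminance of every color in a named colormap."""
    cmap = plt.get_cmap(cmap_name)
    colors = cmap(np.arange(cmap.N))  # RGBA rows, shape (N, 4)
    weights = [0.299, 0.587, 0.114]   # assumed RGB weights for perceived brightness
    return np.sqrt(np.dot(colors[:, :3] ** 2, weights))

def grayscale_cmap(cmap_name):
    """Return a gray version of a colormap with the same luminance profile."""
    cmap = plt.get_cmap(cmap_name)
    colors = cmap(np.arange(cmap.N))
    colors[:, :3] = luminance(cmap_name)[:, np.newaxis]  # set R = G = B = luminance
    return LinearSegmentedColormap.from_list(cmap.name + '_gray', colors, cmap.N)

lum_jet = luminance('jet')
lum_viridis = luminance('viridis')
```

Under this approximation, jet is brightest somewhere in the middle of its range and darker at both ends, while viridis ends brighter than it starts.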

If you favor rainbow schemes, another good option for continuous data is the cubehelix colormap (Figure 4-53):

Figure 4-53. The cubehelix colormap and its luminance

For other situations, such as showing positive and negative deviations from some mean, dual-color colorbars such as RdBu (short for Red-Blue) can be useful. However, as you can see in Figure 4-54, it’s important to note that the positive-negative information will be lost upon translation to grayscale!

Figure 4-54. The RdBu (Red-Blue) colormap and its luminance

We’ll see examples of using some of these color maps as we continue.

There are a large number of colormaps available in Matplotlib; to see a list of them, you can use IPython to explore the plt.cm submodule. For a more principled approach to colors in Python, you can refer to the tools and documentation within the Seaborn library (see “Visualization with Seaborn”).

Color limits and extensions

Matplotlib allows for a large range of colorbar customization. The colorbar itself is simply an instance of plt.Axes, so all of the axes and tick formatting tricks we’ve learned are applicable. The colorbar has some interesting flexibility; for example, we can narrow the color limits and indicate the out-of-bounds values with a triangular arrow at the top and bottom by setting the extend property. This might come in handy, for example, if you’re displaying an image that is subject to noise (Figure 4-55):

Figure 4-55. Specifying colormap extensions

Notice that in the left panel, the default color limits respond to the noisy pixels, and the range of the noise completely washes out the pattern we are interested in. In the right panel, we manually set the color limits, and add extensions to indicate values that are above or below those limits. The result is a much more useful visualization of our data.
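A minimal sketch of the two panels, assuming a smooth stand-in image with about 1% of its pixels replaced by noise (the noise level and seed are illustrative):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Stand-in image (an assumption): smooth pattern in [-1, 1] with ~1% noisy pixels
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 1000)
I = np.sin(x) * np.cos(x[:, np.newaxis])
speckles = rng.random(I.shape) < 0.01
I[speckles] = rng.normal(0, 3, np.count_nonzero(speckles))

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 3.5))

# Left panel: default color limits follow the noisy extremes
im1 = ax1.imshow(I, cmap='RdBu')
fig.colorbar(im1, ax=ax1)

# Right panel: fixed limits, with arrows marking out-of-range values
im2 = ax2.imshow(I, cmap='RdBu', vmin=-1, vmax=1)
cbar = fig.colorbar(im2, ax=ax2, extend='both')
```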

Discrete colorbars

Colormaps are by default continuous, but sometimes you’d like to represent discrete values. The easiest way to do this is to use the plt.cm.get_cmap() function (or plt.get_cmap(), which newer Matplotlib releases favor), and pass the name of a suitable colormap along with the number of desired bins (Figure 4-56):

Figure 4-56. A discretized colormap

The discrete version of a colormap can be used just like any other colormap.
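A minimal sketch of a discretized colormap, assuming the same kind of smooth stand-in image and using plt.get_cmap() to request six bins:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 200)
I = np.sin(x) * np.cos(x[:, np.newaxis])  # stand-in image data

# Ask for the 'Blues' colormap resampled to 6 discrete colors
cmap = plt.get_cmap('Blues', 6)

fig, ax = plt.subplots()
im = ax.imshow(I, cmap=cmap, vmin=-1, vmax=1)
cbar = fig.colorbar(im, ax=ax)

# Count how many distinct colors the map produces over its whole range
n_distinct = len(np.unique(cmap(np.linspace(0, 1, 256)), axis=0))
```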

Example: Handwritten Digits

For an example of where this might be useful, let’s look at an interesting visualization of some handwritten digits data. This data is included in Scikit-Learn, and consists of nearly 2,000 8×8 thumbnails showing various handwritten digits.

For now, let’s start by downloading the digits data and visualizing several of the example images with plt.imshow() (Figure 4-57):

Figure 4-57. Sample of handwritten digit data

Because each digit is defined by the hue of its 64 pixels, we can consider each digit to be a point lying in 64-dimensional space: each dimension represents the brightness of one pixel. But visualizing relationships in such high-dimensional spaces can be extremely difficult. One way to approach this is to use a dimensionality reduction technique such as manifold learning to reduce the dimensionality of the data while maintaining the relationships of interest. Dimensionality reduction is an example of unsupervised machine learning, and we will discuss it in more detail in “What Is Machine Learning?”.

Deferring the discussion of these details, let’s take a look at a two-dimensional manifold learning projection of this digits data (see “In-Depth: Manifold Learning” for details):

We’ll use our discrete colormap to view the results, setting the ticks and clim to improve the aesthetics of the resulting colorbar (Figure 4-58):

Figure 4-58. Manifold embedding of handwritten digit pixels

The projection also gives us some interesting insights on the relationships within the dataset: for example, the ranges of 5 and 3 nearly overlap in this projection, indicating that some handwritten fives and threes are difficult to distinguish, and therefore more likely to be confused by an automated classification algorithm. Other values, like 0 and 1, are more distantly separated, and therefore much less likely to be confused. This observation agrees with our intuition, because 5 and 3 look much more similar than do 0 and 1.


Pyplot running but not displaying graphs

After I uninstalled and reinstalled I get this now when I run it:
C:\Users\effa1\anaconda3\lib\site-packages\ UserWarning: mkl-service package failed to import, therefore Intel(R) MKL initialization ensuring its correct out-of-the box operation under condition when Gnu OpenMP had already been loaded by Python process is not assured. Please install mkl-service package, see GitHub - IntelPython/mkl-service: Python hooks for Intel(R) Math Kernel Library runtime control settin
from . import distributor_init
Traceback (most recent call last):
File "C:\Users\effa1\anaconda3\lib\site-packages\numpy\core\__init__.py", line 22, in
from . import multiarray
File “C:\Users\effa1\anaconda3\lib\site-packages\numpy\core\”, line 12, in
from . import overrides
File “C:\Users\effa1\anaconda3\lib\site-packages\numpy\core\”, line 7, in
from numpy.core._multiarray_umath import (
ImportError: DLL load failed while importing _multiarray_umath: The specified module could not be found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “matplot lib”, line 1, in
import matplotlib.pyplot as plt
File “C:\Users\effa1\anaconda3\lib\site-packages\”, line 107, in
from . import cbook, rcsetup
File “C:\Users\effa1\anaconda3\lib\site-packages\matplotlib\”, line 28, in
import numpy as np
File “C:\Users\effa1\anaconda3\lib\site-packages\”, line 145, in
from . import core
File “C:\Users\effa1\anaconda3\lib\site-packages\numpy\”, line 48, in
raise ImportError(msg)


Importing the numpy C-extensions failed. This error can happen for
many reasons, often due to issues with your setup or how NumPy was installed.

We have compiled some common reasons and troubleshooting tips at:

Please note and check the following:

  • The Python version is: Python3.8 from “C:\Users\effa1\anaconda3\python.exe”
  • The NumPy version is: “1.20.1”

and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.

Original error was: DLL load failed while importing _multiarray_umath: The specified module could not be found.


Matplotlib Plotting


Plotting x and y points

The plot() function is used to draw points (markers) in a diagram.

By default, the plot() function draws a line from point to point.

The function takes parameters for specifying points in the diagram.

Parameter 1 is an array containing the points on the x-axis.

Parameter 2 is an array containing the points on the y-axis.

If we need to plot a line from (1, 3) to (8, 10), we have to pass two arrays [1, 8] and [3, 10] to the plot function.


Draw a line in a diagram from position (1, 3) to position (8, 10):

import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([1, 8])
ypoints = np.array([3, 10])

plt.plot(xpoints, ypoints)
plt.show()

The x-axis is the horizontal axis.

The y-axis is the vertical axis.

Plotting Without Line

To plot only the markers, you can use shortcut string notation parameter 'o', which means 'rings'.


Draw two points in the diagram, one at position (1, 3) and one in position (8, 10):

import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([1, 8])
ypoints = np.array([3, 10])

plt.plot(xpoints, ypoints, 'o')
plt.show()

You will learn more about markers in the next chapter.

Multiple Points

You can plot as many points as you like; just make sure you have the same number of points on both axes.


Draw a line in a diagram from position (1, 3) to (2, 8) then to (6, 1) and finally to position (8, 10):

import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([1, 2, 6, 8])
ypoints = np.array([3, 8, 1, 10])

plt.plot(xpoints, ypoints)
plt.show()

Default X-Points

If we do not specify the points on the x-axis, they will get the default values 0, 1, 2, 3, etc., depending on the length of the y-points.

So, if we take the same example as above, and leave out the x-points, the diagram will look like this:


Plotting without x-points:

import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10, 5, 7])

plt.plot(ypoints)
plt.show()

The x-points in the example above are [0, 1, 2, 3, 4, 5].






What Are the “plt” and “ax” in Matplotlib Exactly?

Indeed, as the most popular and fundamental data visualisation library, Matplotlib can be confusing in some respects. It is common to see someone asking:

  • When should I use “axes”?
  • Why do some examples use “plt” while others use “ax”?
  • What’s the difference between them?

It is good that there are so many examples online showing people how to use Matplotlib to draw this kind of chart or that kind of chart, but I rarely see any tutorials mentioning “why”. This may leave people who have less programming experience, or who are switching from other languages like R, very confused.

In this article, I won’t teach you to draw any specific charts using Matplotlib, but I will try to explain something basic but important about Matplotlib: what the “plt” and “ax” that people usually use actually are.

To clarify, when I say “plt”, it doesn’t exist in the Matplotlib library. It is called “plt” because most Python programmers like to import Matplotlib and make an alias called “plt”, which I believe you should know, but just in case.

import matplotlib.pyplot as plt

Then, come back to our main topic. Let’s draw a simple chart for demonstration purposes.

import numpy as np

plt.plot(np.random.rand(20))
plt.title('test title')

As shown in the annotated screenshot above, when we draw a graph using plt.plot():

  1. A Figure object is generated (shown in green)
  2. An Axes object is generated implicitly with the plotted line chart (shown in red)
  3. All the elements of the plot such as the x and y-axis are rendered inside the Axes object (shown in blue)

Well, if we use some kind of metaphor here:

  • Figure is like a paper that you can draw anything you want on
  • We have to draw a chart in a “cell”, which is Axes in this context
  • If we’re drawing only one graph, we don’t have to draw a “cell” first, just simply draw on the paper anyway. So, we can use plt.plot() directly.

Of course, we can explicitly draw a “cell” on the “paper”, to tell Matplotlib that we’re gonna draw a chart inside this cell. Then, we have the following code.

fig, ax = plt.subplots()
ax.plot(np.random.rand(20))
ax.set_title('test title')

Exactly the same results. The only difference is that we explicitly draw the “cell” so that we can get the Figure and Axes objects.

Indeed, when we just want to plot one graph, it is not necessary to “draw” this cell. However, note that we have to do this when we want to draw multiple graphs in one plot. In other words, subplots.

n_rows = 2
n_cols = 2

fig, axes = plt.subplots(n_rows, n_cols)
for row_num in range(n_rows):
    for col_num in range(n_cols):
        ax = axes[row_num][col_num]
        ax.plot(np.random.rand(20))
        ax.set_title(f'Plot ({row_num+1}, {col_num+1})')

fig.suptitle('Main title')
fig.tight_layout()
plt.show()

In this code snippet, we first declare how many rows and columns we want to “draw”. 2 by 2 means that we want to draw 4 “cells”.

Then, in each cell, we plot a random line chart and assign a title based on its row number and column number. Please note that we’re using Axes instances.

After that, we define a “Main title” on the “paper”, which is the Figure instance. So, we have this supertitle that does not belong to any “cell”, but sits on the paper.

Finally, before calling the plt.show() method, we need to ask the “paper” (the Figure instance) to automatically leave enough padding between the cells by calling its tight_layout() method. Otherwise, the titles and tick labels of neighbouring cells can overlap.

Hopefully, you now have a better understanding of what the plt and ax that people use actually are.

Basically, plt is a common alias of matplotlib.pyplot used by most people. When we plot something using a pyplot function such as plt.plot(), we implicitly create a Figure instance and an Axes inside the Figure object. This is totally fine and very convenient when we just want to draw a single graph.

However, we can explicitly call plt.subplots() to get the Figure object and Axes object, in order to do more things with them. When we want to draw multiple subplots on a Figure, it is usually required to use this approach.
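The implicit objects can be inspected directly. The following is a minimal sketch, not from the original article, that uses plt.gcf() and plt.gca() to retrieve the Figure and Axes that plt.plot() created behind the scenes:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

plt.close('all')  # start from a clean pyplot state

# Implicit style: pyplot creates a Figure and an Axes for us
lines = plt.plot(np.random.rand(20))
plt.title('test title')
fig_implicit = plt.gcf()  # "get current figure"
ax_implicit = plt.gca()   # "get current axes"

# The plotted line lives in that Axes, and the Axes lives in that Figure
same_axes = lines[0].axes is ax_implicit
same_fig = ax_implicit.figure is fig_implicit

# Explicit style: we hold the Figure and Axes ourselves
fig, ax = plt.subplots()
ax.plot(np.random.rand(20))
ax.set_title('test title')
```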

Also, here are the official Matplotlib API references for the Figure and Axes classes. It is highly recommended to check them out and try some methods yourself to deepen your understanding.


How To Clear A Plot In Python

Matplotlib is a data visualization and graphical plotting library for Python. Matplotlib’s pyplot API is stateful, which means that it stores the state of objects until a method is encountered that will clear the current state.

This article focuses on how to clear a plot by clearing the current Axes and Figure state of a plot, without closing the plot window. There are two methods available for this purpose:

  • clf() | function: matplotlib.pyplot.clf(). Used to clear the current Figure’s state without closing it.
  • cla() | function: matplotlib.pyplot.cla(). Used to clear the current Axes state without closing it.

How to Clear a Pyplot Figure 

Figure is the top-level container object in a matplotlib plot. Figure includes everything visualized in a plot, including one or more Axes.

You can use the matplotlib.pyplot.clf() function to clear the current Figure’s state. The following example shows how to create two identical Figures simultaneously, and then apply the clf() function only to Figure 2:

import matplotlib.pyplot as plt

f1 = plt.figure()
plt.plot([1, 2, 3])
plt.title("Figure 1 not cleared with clf()")

f2 = plt.figure()
plt.plot([1, 2, 3])
# Clear Figure 2 with clf() function:
plt.clf()
plt.title("Figure 2 cleared with clf()")

Figure 1. A Figure not cleared with the clf() function.

Figure 2. A Figure with the same elements cleared with the clf() function.

How to Clear Pyplot Axes

Axes is a container class within the top-level Figure container. It is the data plotting area in which most of the elements in a plot are located, including Axis, Tick, Line2D, Text, etc., and it also sets the coordinates. An Axes has at least an X-Axis and a Y-Axis, and may have a Z-Axis.

The matplotlib.pyplot.cla() function clears the current Axes state without closing the Axes. The plotted elements are removed, but the Axes itself stays in the Figure, so it can be drawn on again by later commands in the same script.

The following example creates a Figure and then plots two Axes in two different subplots. Only the second Axes is cleared with the cla() function:

import matplotlib.pyplot as plt

fig, [ax, ax1] = plt.subplots(2, 1)
ax.plot([1, 2, 3, 4])
ax1.plot([1, 2, 3, 4])
# cla() function clears the 2nd Axes:
ax1.cla()
fig.suptitle('Cla() Example')

Figure 3. A Figure containing two Axes in different subplots. The first Axes is not cleared with the cla() function. The second Axes is cleared with cla().
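The effect of the two functions can also be checked without looking at the figure. A minimal sketch, assuming the same two-subplot layout as above, that counts what remains after cla() and then clf():

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display; omit in a notebook
import matplotlib.pyplot as plt

fig, (ax, ax1) = plt.subplots(2, 1)
ax.plot([1, 2, 3, 4])
ax1.plot([1, 2, 3, 4])

# cla() empties one Axes but leaves it (and its sibling) in the Figure
ax1.cla()
n_lines_top = len(ax.lines)      # 1: untouched
n_lines_bottom = len(ax1.lines)  # 0: cleared
n_axes_after_cla = len(fig.axes)  # 2: both Axes still exist

# clf() removes every Axes from the Figure but keeps the Figure open
fig.clf()
n_axes_after_clf = len(fig.axes)  # 0
still_open = plt.fignum_exists(fig.number)  # True
```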


How To Display A Plot In Python

Pythonistas typically use the Matplotlib plotting library to display numeric data in plots, graphs and charts in Python. A wide range of functionality is provided by matplotlib’s two APIs (Application Programming Interfaces):

  • Pyplot API, a collection of state-based functions that make matplotlib work like MATLAB.
  • OO (Object-Oriented) API, a collection of objects (such as Figure and Axes) that can be assembled with greater flexibility than pyplot.

The pyplot interface is easier to implement than the OO version and is more commonly used. For information about pyplot functions and terminology, refer to: What is Pyplot in Matplotlib

Display a plot in Python: Pyplot Examples

Matplotlib’s series of pyplot functions are used to visualize and decorate a plot.

How to Create a Simple Plot with the Plot() Function

The matplotlib.pyplot.plot() function provides a unified interface for creating different types of plots. 

The simplest example uses the plot() function to plot values as x,y coordinates in a data plot. In this case, plot() takes 2 parameters for specifying plot coordinates: 

  • Parameter for an array of X axis coordinates.
  • Parameter for an array of Y axis coordinates.

A line ranging from x=2, y=4 through x=8, y=9 is plotted by creating 2 arrays of (2,8) and (4,9):

import matplotlib.pyplot as plt
import numpy as np

# X axis parameter:
xaxis = np.array([2, 8])
# Y axis parameter:
yaxis = np.array([4, 9])
plt.plot(xaxis, yaxis)

Figure 1.  A simple plot created with the plot() function:


How to Customize Plot Appearance with Marker & Linestyle

marker and linestyle are matplotlib keywords that can be used to customize the appearance of data in a plot without modifying data values.

  • marker is an argument used to label each data value in a plot with a ‘marker‘.
  • linestyle is an argument used to customize the appearance of lines between data values, or else remove them altogether.

In this example, each data value is marked with a circle (“o”) and joined by a dashed line (“--”):

import matplotlib.pyplot as plt
import numpy as np

xaxis = np.array([2, 12, 3, 9])
# Mark each data value and customize the linestyle:
plt.plot(xaxis, marker="o", linestyle="--")

A partial list of string characters that are acceptable options for marker and linestyle:

  • "-" solid line style
  • "--" dashed line style
  • " " no line
  • "o" circle marker

Matplotlib Scatter Plot Example

Matplotlib also supports more advanced plots, such as scatter plots. In this case, the scatter() function is used to display data values as a collection of x,y coordinates represented by standalone dots.

In this example, 2 arrays of the same length (one array for X axis values and another array for Y axis values) are plotted. Each value is represented by a dot:

import matplotlib.pyplot as plt

# X axis values:
x = [2, 3, 7, 29, 8, 5, 13, 11, 22, 33]
# Y axis values:
y = [4, 7, 55, 43, 2, 4, 11, 22, 33, 44]
# Create scatter plot:
plt.scatter(x, y)

Matplotlib Example: Multiple Data Sets in One Plot

Matplotlib is highly flexible, and can accommodate multiple datasets in a single plot. In this example, we’ll plot two separate data sets, xdata1 and xdata2:

import matplotlib.pyplot as plt
import numpy as np

# Create random seed (np.random.seed requires a value below 2**32):
np.random.seed(54848499)
# Create random data:
xdata = np.random.random([2, 8])
# Create two datasets from the random floats:
xdata1 = xdata[0, :]
xdata2 = xdata[1, :]
# Sort the data in both datasets:
xdata1.sort()
xdata2.sort()
# Create y data points:
ydata1 = xdata1 ** 2
ydata2 = 1 - xdata2 ** 4
# Plot the data:
plt.plot(xdata1, ydata1)
plt.plot(xdata2, ydata2)
# Set x,y lower, upper limits:
plt.xlim([0, 1])
plt.ylim([0, 1])
plt.title("Multiple Datasets in One Plot")

Matplotlib Example: Subplots

You can also use matplotlib to create complex figures that contain more than one plot. In this example, multiple axes are enclosed in one figure and displayed in subplots:

import matplotlib.pyplot as plt
import numpy as np

# Create a Figure with 2 rows and 2 columns of subplots:
fig, ax = plt.subplots(2, 2)
x = np.linspace(0, 5, 100)
# Index the 2 x 2 array of Axes, one per subplot within 1 Figure:
ax[0, 0].plot(x, np.sin(x), 'g')  # row=0, column=0
ax[1, 0].plot(range(100), 'b')    # row=1, column=0
ax[0, 1].plot(x, np.cos(x), 'r')  # row=0, column=1
ax[1, 1].plot(x, np.tan(x), 'k')  # row=1, column=1

Figure 2. Multiple Axes in subplots displayed in one figure:

Matplotlib Example: Histogram Plot

A histogram is used to display frequency distributions in a bar graph.

In this example, we’ll combine matplotlib’s histogram and subplot capabilities by creating a histogram with five bars (bins). The area of each bar is proportional to the frequency of a random variable within that bin, and the width of each bar equals the class interval:

import matplotlib.pyplot as plt
import matplotlib.ticker as maticker
import numpy as np

# Create random variable:
data = np.random.normal(0, 3, 800)
# Create a Figure containing a single Axes:
fig, ax = plt.subplots()
# Weight each sample by 1/N so bar heights are fractions of the total:
weights = np.ones_like(data) / len(data)
# Create the histogram on the Axes:
ax.hist(data, bins=5, weights=weights)
ax.yaxis.set_major_formatter(maticker.PercentFormatter(xmax=1.0, decimals=1))
plt.title("Histogram Plot")

Matplotlib Example: Phase Spectrum Plot

A phase spectrum plot lets us visualize the frequency characteristics of a signal.

In this advanced example, we’ll plot a phase spectrum of two signals (represented as functions) that each have different frequencies:

import matplotlib.pyplot as plt
import numpy as np

# Seed the pseudo-random number generator:
np.random.seed(0)
# Sampling interval:
dt = 0.01
# Sampling frequency:
Fs = 1 / dt
# Generate noise:
t = np.arange(0, 10, dt)
res = np.random.randn(len(t))
r = np.exp(-t / 0.05)
# Convolve 2 signals (functions):
conv_res = np.convolve(res, r) * dt
conv_res = conv_res[:len(t)]
s = 0.5 * np.sin(1.5 * np.pi * t) + conv_res
# Create the plot:
fig, ax = plt.subplots()
ax.plot(t, s)
# Function plots phase spectrum:
ax.phase_spectrum(s, Fs=Fs)
plt.title("Phase Spectrum Plot")

Figure 3.   A Phase Spectrum of two signals with different frequencies is plotted in one figure:


Matplotlib Example: 3D Plot

Matplotlib can also handle 3D plots by allowing the use of a Z axis. We’ve already created a 2D scatter plot above, but in this example we’ll create a 3D scatter plot:

from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt

fig = plt.figure()
# Create 1 3D subplot:
# '111' is a MATLAB convention used in Matplotlib
# to create a grid with 1 row and 1 column.
# The first cell in the grid is the new Axes location.
ax = fig.add_subplot(111, projection='3d')
# Create x,y,z coordinates:
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [11, 4, 2, 5, 13, 4, 14, 2, 4, 8]
z = [2, 3, 4, 5, 5, 7, 9, 11, 19, 9]
# Create a 3D scatter plot with x,y,z orthogonal axis, and red "o" markers:
ax.scatter(x, y, z, c='red', marker="o")
# Create x,y,z axis labels:
ax.set_xlabel('X Axis')
ax.set_ylabel('Y Axis')
ax.set_zlabel('Z Axis')

How to Use a Matplotlib Backend

Matplotlib can target just about any output format you can think of. Most commonly, data scientists display plots in their Jupyter notebook, but you can also display plots within an application.  

In this example, matplotlib’s TkAgg backend uses Agg (Anti-Grain Geometry) for high-quality rendering, a FigureCanvasTkAgg canvas embeds the Figure in a Tkinter window, and the Tk mainloop() function displays the plot:

from tkinter import Tk
import matplotlib
matplotlib.use("TkAgg")
from matplotlib.figure import Figure
# Canvas class that embeds a Figure in a Tkinter widget:
from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg

root = Tk()
figure = Figure(figsize=(5, 4), dpi=100)
plot = figure.add_subplot(1, 1, 1)
x = [0.1, 0.2, 0.3, 0.4]
y = [-0.1, -0.2, -0.3, -0.4]
plot.plot(x, y, color="red", marker="o", linestyle="--")
canvas = FigureCanvasTkAgg(figure, root)
canvas.get_tk_widget().grid(row=0, column=0)
root.mainloop()

Figure 4. A plot embedded in a Tkinter window using the TkAgg backend:

Final Tip:  matplotlib script execution creates a text output in the Python console (not part of the UI plot display) that may include warning messages or be otherwise visually unappealing. To fix this, you can add a semicolon (;) at the end of the last line of code before displaying the plot. For example:

# pyplot scatter() function:
plt.scatter(x, y);
