When I have continuous data in three dimensions, my first visualization inclination is to generate a contour plot. While 3-D surface plots might be useful in some special cases, in general I think they should be avoided since they add a great deal of complexity to a visualization without adding much (if any) information beyond a 2-D contour plot.

While I usually use R/ggplot2 to generate my data visualizations, I found the support for good-looking, out-of-the-box contour plots to be a bit lacking. Of course, you can make anything look great with enough effort, but you can also waste an excessive amount of time fiddling with customizable tools. This isn’t to say Pythonic contour plots don’t come with their own set of frustrations, but hopefully this post will make the task easier for any of you going down this road. The most difficult part of using the Python/matplotlib implementation of contour plots is formatting your data. In this post, I’ll give you the code to get from a more traditional data structure to the format required to use Python’s ax.contour function.

Note: This post can be launched as a Notebook by clicking here.

To run the Python code in this post on your machine, you’ll need pandas, numpy, and matplotlib installed.

To begin, I’ll start with some dummy data that is in a standard “long” format, where each row corresponds to a single observation. In this case, my three dimensions are just x, y, and z, which map directly to the axes on which we wish to plot them.

Nota bene: For best results, make sure that there is a row for every combination of x and y coordinates in the plane of the range you want to plot. (Said differently, if $X$ is the set of points you want to plot on the $x$-axis and $Y$ is the set of points you want to plot on the $y$-axis, then your dataframe should contain a $z$-value for every point in the Cartesian product $X \times Y$.) If you know you’re going to be making a contour plot, you can plan ahead of time so your data-generating process results in this format. It’s not detrimental if your data don’t meet this requirement, but you may get unwanted blank spots in your plot if your data are missing any points in the plane.

Given data in this format, we can quickly convert it to the requisite structure for matplotlib using the code below.

CONTOUR PLOT CODE

    # Pivot the long data into a matrix of z-values (rows: y, columns: x)
    Z = contour_data.pivot_table(index='x', columns='y', values='z').T.values
    X_unique = np.sort(contour_data.x.unique())
    Y_unique = np.sort(contour_data.y.unique())
    # Repeat the axis values so that X and Y match the shape of Z
    X, Y = np.meshgrid(X_unique, Y_unique)

What’s going on here? Looking at the Z data first, I’ve merely used the pivot_table method from pandas to cast my data into a matrix format, where the columns/rows correspond to the values of Z for each of the points in the range of the $x$/$y$-axes. We can see the resulting data structure below:

    pd.DataFrame(Z).round(3)

This by itself is not terribly unintuitive, but the odd part about matplotlib’s contour method is that it also expects your X and Y data, when passed as matrices, to have the exact same shape as your Z data. This means that we need to duplicate our $x$ and $y$ values along different axes, so that each entry in Z has its corresponding $x$ and $y$ coordinates in the same entry of the X and Y matrices. Rather than explain in detail, it’s easier to just show you what meshgrid is doing.

First take note of the unique values in each of my x/y axes:

    X_unique, Y_unique

And now let’s display the matrices X and Y generated by np.meshgrid(X_unique, Y_unique):

    pd.DataFrame(X).round(3)

I’m not a huge fan of this formatting requirement since we have to duplicate a bunch of data, but hopefully I’ve helped you understand the basic process required to get here from a more standard “long” data format.

Now that my data is in the correct format for matplotlib to understand, I can generate my first pass at a contour plot:

    from IPython.display import set_matplotlib_formats
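For anyone who wants to try the whole long-to-grid process in one go, here is a minimal, self-contained sketch. The dummy surface $z = \sin(x)\cos(y)$, the grid sizes, and the styling choices are illustrative assumptions of mine, not the data or plot settings used elsewhere in this post. Only the pivot_table and np.meshgrid steps mirror the conversion described here.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical dummy data in "long" format: one row per (x, y) pair,
# covering every combination of x and y values (the full Cartesian product).
x_vals = np.linspace(-3, 3, 25)
y_vals = np.linspace(-2, 2, 15)
contour_data = pd.DataFrame(
    [(x, y, np.sin(x) * np.cos(y)) for x in x_vals for y in y_vals],
    columns=["x", "y", "z"],
)

# Pivot to a matrix of z-values; after the transpose,
# rows follow y and columns follow x.
Z = contour_data.pivot_table(index="x", columns="y", values="z").T.values

# Sorted unique axis values, repeated by meshgrid to match the shape of Z.
X_unique = np.sort(contour_data.x.unique())
Y_unique = np.sort(contour_data.y.unique())
X, Y = np.meshgrid(X_unique, Y_unique)

# A plain first-pass contour plot with labeled contour lines.
fig, ax = plt.subplots(figsize=(6, 4))
cs = ax.contour(X, Y, Z)
ax.clabel(cs, inline=True, fontsize=8)
ax.set_xlabel("x")
ax.set_ylabel("y")
```

In a notebook the figure should render when the cell finishes; in a script, you would add a call such as fig.savefig("contour.png") or plt.show(). Checking that X, Y, and Z all share one shape (here 15 rows by 25 columns) is a quick way to confirm the pivot and the meshgrid line up before plotting.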