Matplotlib features

matplotlib is a library that I have used extensively throughout my PhD. Coming from a world where TCanvas, TLegend and TH1F were the norm, I can tell you that this library is refreshingly straightforward. Don't get me wrong: it is by no means perfect, but it does offer features that make plots more enjoyable to read (though the fact that LaTeX fonts are not the default will always be a disappointment).


As these features are much easier to illustrate with concrete code, let's work through an example. The goal here is to plot two very simple sets of data, X and Y, in Python. A naive first approach would be:

     
import matplotlib.pyplot as plt
# create two sets of data  
plot_x = [0, 1, 2, 3, 4, 
          5, 6, 7, 8]
plot_y = [400, 100, 400, 300, 400, 
          500, 400, 700, 400]
# plot data
plt.plot(plot_x, plot_y)
# show the plot
plt.show()
    

By default, the curve takes its color from a predefined cycle (so additional curves get different colors), and the figure size and the font size are fixed. You end up with the following result:

figure with very basic features

Yes, it is great for quickly visualizing results.
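
For reference, the defaults at play here can be read from matplotlib's configuration. The snippet below is only a small sketch: it looks at `matplotlib.rcParamsDefault` (the built-in defaults, independent of any local style file) and the variable names are mine, not part of the example above.

```python
import matplotlib

# rcParamsDefault holds matplotlib's built-in defaults, regardless of any
# local matplotlibrc file or style sheet.
defaults = matplotlib.rcParamsDefault

# Colors assigned in turn to successive curves drawn on the same axes
color_cycle = defaults['axes.prop_cycle'].by_key()['color']
print('first colors:', color_cycle[:3])

# Default figure size (width, height) in inches and default font size in points
default_figsize = list(defaults['figure.figsize'])
default_font_size = defaults['font.size']
print('figure size :', default_figsize)
print('font size   :', default_font_size)
```

With a recent matplotlib, the first color of the cycle is the familiar blue of the first figure.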

However, especially when it comes to publishing results, it is worth making plots more attractive and easier to read: What are the quantities on the axes? What type of curve is it? Can the scale be changed? Can the text be more readable? Let's give one of the many possible answers to these questions.

Side note: as answering these questions often takes several iterations, it is best, when the results come from computationally intensive calculations, to separate the plotting phase from the computational one.

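One possible way to organize this separation is to write the results to disk once and reload them every time the plot is tweaked. The sketch below assumes the results fit in a small JSON file; `expensive_computation`, `save_results`, `load_results` and the file name are hypothetical names chosen for illustration.

```python
import json
import os
import tempfile


def expensive_computation():
    # Hypothetical stand-in for a long calculation: it simply returns
    # the data used in this post.
    plot_x = [0, 1, 2, 3, 4, 5, 6, 7, 8]
    plot_y = [400, 100, 400, 300, 400, 500, 400, 700, 400]
    return {'x': plot_x, 'y': plot_y}


def save_results(results, path):
    # Store the results so that the computation does not have to be rerun
    with open(path, 'w') as f:
        json.dump(results, f)


def load_results(path):
    # Read the stored results back for plotting
    with open(path) as f:
        return json.load(f)


# Computation phase: run once
results_path = os.path.join(tempfile.gettempdir(), 'results.json')
save_results(expensive_computation(), results_path)

# Plotting phase: can live in a separate script and be rerun at will
results = load_results(results_path)
plot_x, plot_y = results['x'], results['y']
print(plot_x, plot_y)
```

The reloaded `plot_x` and `plot_y` can then be passed to the plotting code, which can be adjusted as many times as needed without recomputing anything.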

First, let's look at the figure itself. Even if matplotlib can figure out a default figure size, it is also possible to set it yourself.

     
# create a figure
fig, ax = plt.subplots(figsize=(8, 6))
    
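As a quick check of what `figsize` means: the values are given in inches, and the size in pixels follows from the resolution (dpi). The sketch below sets `dpi=100` explicitly, which is an assumption made for the illustration, and uses the non-interactive Agg backend so that it also runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt

# figsize is (width, height) in inches; dpi is chosen here for illustration
fig, ax = plt.subplots(figsize=(8, 6), dpi=100)

# Size in pixels = size in inches * dots per inch
width_in, height_in = fig.get_size_inches()
width_px = int(round(width_in * fig.dpi))
height_px = int(round(height_in * fig.dpi))
print(f'{width_in} x {height_in} inches, {width_px} x {height_px} pixels')
```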

Then, for the plot itself, let's change the color and the style a bit, and add a label to the curve.

     
# plot data
plt.plot(plot_x, plot_y, linestyle='--', 
         marker='s', color='purple', 
         label='Curve 1')
    

Another way to avoid redundancy is to switch the axis to scientific notation, so that the common power of ten is factored out of the tick labels. Additionally, this factor, displayed on top of the plot, can be shifted to the left.

     
# set scientific notation and move it slightly
from matplotlib.ticker import ScalarFormatter
plt.gca().yaxis.set_major_formatter(ScalarFormatter(useMathText=True))
plt.ticklabel_format(style='sci', axis='y', 
                     scilimits=(0, 0))
ax.yaxis.get_offset_text().set_x(-0.12)
    
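To see what this notation does to the data of this post, the sketch below applies the same settings through the `ax` object and then reads the exponent that the formatter factored out. Reading the formatter's `orderOfMagnitude` attribute after drawing is my own way of inspecting the result, and the Agg backend is only there so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.ticker import ScalarFormatter

plot_x = [0, 1, 2, 3, 4, 5, 6, 7, 8]
plot_y = [400, 100, 400, 300, 400, 500, 400, 700, 400]

fig, ax = plt.subplots(figsize=(8, 6))
ax.plot(plot_x, plot_y)

# Same settings as above, written with the Axes methods
ax.yaxis.set_major_formatter(ScalarFormatter(useMathText=True))
ax.ticklabel_format(style='sci', axis='y', scilimits=(0, 0))

# The common factor is only computed once the figure is drawn
fig.canvas.draw()
exponent = ax.yaxis.get_major_formatter().orderOfMagnitude
print('common exponent:', exponent)
```

With values between 100 and 700, the ticks are labelled 1 to 7 and a single factor of ten to the power `exponent` is displayed next to the axis.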

Adding a grid makes it easier to read off the position of the points composing the curve.

     
# add a grid
from matplotlib.ticker import MultipleLocator
ax.yaxis.set_minor_locator(MultipleLocator(50))
ax.grid(which='major', axis='both')
ax.grid(which='minor', axis='both', 
        color='whitesmoke')
    

Labels on an axis don't necessarily have to be numbers. If the values correspond to certain categories, better make that clear. Depending on the length of the labels, it is often better to tilt them. And let's also shift the tick labels slightly away from the axes.

     
# add labels for x-axis and shift them
x_labels = ['A', 'B', 'C', 'D', 
            'E', 'F', 'G', 'H', 'I']
plt.xticks(plot_x, x_labels, rotation=30, 
           rotation_mode='anchor', ha='right')
for xs in ax.xaxis.get_majorticklabels():
    xs.set_y(-0.02)
for ys in ax.yaxis.get_majorticklabels():
    ys.set_x(-0.02)
    
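As a possible alternative to shifting each label by hand, the distance between the ticks and their labels can also be set in one call with `tick_params`. This is a sketch of a different technique than the one used above, not a strict equivalent, and the padding of 10 points is an arbitrary value picked for the illustration.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt

plot_x = [0, 1, 2, 3, 4, 5, 6, 7, 8]
x_labels = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I']

fig, ax = plt.subplots(figsize=(8, 6))

# Category labels on the x-axis, tilted by 30 degrees
ax.set_xticks(plot_x)
tick_labels = ax.set_xticklabels(x_labels, rotation=30,
                                 rotation_mode='anchor', ha='right')

# Distance between the tick marks and their labels, in points
ax.tick_params(axis='both', which='major', pad=10)
```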

As we changed the size of the plot, we might as well change the size of the font.

     
# set font sizes
font_size = 20
ax.yaxis.get_offset_text().set_fontsize(font_size)
plt.title('Another suggestion', 
          fontsize=font_size)
plt.xlabel('X-axis', fontsize=font_size)
plt.ylabel('Y-axis', fontsize=font_size)
plt.xticks(fontsize=font_size)
plt.yticks(fontsize=font_size)
    
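If passing `fontsize` to every call becomes tedious, another option is to set the sizes once through matplotlib's configuration. The sketch below uses `plt.rc_context` so that the change only applies inside the `with` block; the choice of keys is mine, and the short data lists are just placeholders.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt

font_size = 20
font_settings = {
    'axes.titlesize': font_size,
    'axes.labelsize': font_size,
    'xtick.labelsize': font_size,
    'ytick.labelsize': font_size,
    'legend.fontsize': font_size,
}

# The settings only apply to what is created inside this block
with plt.rc_context(font_settings):
    fig, ax = plt.subplots(figsize=(8, 6))
    ax.plot([0, 1, 2], [400, 100, 400], label='Curve 1')
    ax.set_title('Another suggestion')
    ax.set_xlabel('X-axis')
    ax.set_ylabel('Y-axis')
    ax.legend(loc='lower right')

title_size = ax.title.get_fontsize()
xlabel_size = ax.xaxis.label.get_fontsize()
print(title_size, xlabel_size)
```

The explicit `fontsize` arguments used above remain handy when only one element needs a different size.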

Finally, to make the legend visible without hiding the curve, let's put it on the lower right of the plot.

     
# add a legend
plt.legend(loc='lower right', fontsize=font_size, 
           handlelength=1, framealpha=1)
    

Mixing all these ingredients together, we obtain the following code:

     
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator, ScalarFormatter

# create two sets of data
plot_x = [0, 1, 2, 3, 4, 
          5, 6, 7, 8]
plot_y = [400, 100, 400, 300, 400, 
          500, 400, 700, 400]

# create a figure
fig, ax = plt.subplots(figsize=(8, 6))

# plot data
plt.plot(plot_x, plot_y, linestyle='--', 
         marker='s', color='purple', 
         label='Curve 1')

# set scientific notation and move it slightly
plt.gca().yaxis.set_major_formatter(ScalarFormatter(useMathText=True))
plt.ticklabel_format(style='sci', axis='y', 
                     scilimits=(0, 0))
ax.yaxis.get_offset_text().set_x(-0.12)

# add a grid
ax.yaxis.set_minor_locator(MultipleLocator(50))
ax.grid(which='major', axis='both')
ax.grid(which='minor', axis='both', 
        color='whitesmoke')

# add labels for x-axis and shift them
x_labels = ['A', 'B', 'C', 'D', 
            'E', 'F', 'G', 'H', 'I']
plt.xticks(plot_x, x_labels, rotation=30, 
           rotation_mode='anchor', ha='right')
for xs in ax.xaxis.get_majorticklabels():
    xs.set_y(-0.02)
for ys in ax.yaxis.get_majorticklabels():
    ys.set_x(-0.02)

# set font sizes
font_size = 20
ax.yaxis.get_offset_text().set_fontsize(font_size)
plt.title('Another suggestion', 
          fontsize=font_size)
plt.xlabel('X-axis', fontsize=font_size)
plt.ylabel('Y-axis', fontsize=font_size)
plt.xticks(fontsize=font_size)
plt.yticks(fontsize=font_size)

# add a legend
plt.legend(loc='lower right', fontsize=font_size, 
           handlelength=1, framealpha=1)

# show the plot
plt.show()
    

And this yields the following plot:

figure with improved features

Of course, one doesn't have to use all of the above, as the result might look a bit goofy depending on what has to be shown, but it nevertheless gives some hints about changes worth considering to make a plot more pleasant to read and more professional-looking overall.