Pearson Data Visualization Style Guide, v0.0.3

Use of Color

Chromatic: multiple colors

Qualitative: Categories

Qualitative data values are different categories or classes. This is often called nominal data, meaning that each value represents a named thing, rather than an ordered or numerical progression. For qualitative data, use a different hue for each item.

TODO: list common qualitative / categorical value type examples widely used in Pearson

Examples of categorical data: courses, learning topics, schools, etc.

TODO: ask if there is or should be consistent color-coding for different topic areas (e.g. math = orange, science = red, social studies = blue) and sub-colors within those categories. Could be good for symbology.

Do not use different hues to represent elements of the same category. This may confuse or distract people with some cognitive disabilities, by implying a pattern or categorization that doesn't exist, thus increasing their cognitive load and even frustration in trying to understand this non-existent pattern.

Misuse of multiple colors also distracts from the message of the data visualization, and increases visual clutter.

Criteria for qualitative color values

Color may be used to distinguish discrete items or related groups of items that have no intrinsic order, such as different courses of study, different schools or school systems within a district, or countries on a map (when not ranking them by another criterion). For this purpose, a color palette (or set of specific colors) should follow these perceptual guidelines: TODO: simplify this paragraph

  • Each color in the palette should look clearly distinct from the others, with a contrast ratio of at least 3:1, computed from relative luminance (per the WCAG 2.1 Success Criterion 1.4.11 Non-text Contrast)
  • Each color in the palette should be perceptually uniform, such as different hues with a saturation and lightness equivalent to the other colors in the palette
  • No single color in the palette should stand out as stronger or weaker relative to the other colors, including mixing chromatic and grayscale colors
  • Collectively, there should be no perceived ordering or ranking among the colors of the palette, which may give the impression of an apparent order among the represented items.
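
The 3:1 criterion above can be checked programmatically. The following is a minimal sketch of the relative luminance and contrast ratio formulas as defined in WCAG 2.1. The function names and the pairwise palette check are illustrative assumptions, not part of this guide.

```python
from itertools import combinations


def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.1 definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance (0.0 to 1.0) of a color written as '#RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(color_a: str, color_b: str) -> float:
    """WCAG contrast ratio between two colors, from 1.0 (identical) to 21.0."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


def low_contrast_pairs(palette, minimum=3.0):
    """Return every pair of palette colors whose contrast ratio is below `minimum`."""
    return [(a, b) for a, b in combinations(palette, 2)
            if contrast_ratio(a, b) < minimum]
```

An empty list from `low_contrast_pairs` means every pair in the candidate palette meets the threshold. Any pair it returns needs another means of distinction, such as labels or symbols.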

Note that the Pearson branding color palette is not ideal for use with data visualization, because it isn't perceptually uniform. But if it's going to be used for qualitative charts, it should be applied consistently.

Sequential

Sequential data values are items that have ordered, numerical values, such as performance metrics.

Examples of sequential data: test scores, grades, rankings, performance times, class size, etc.

TODO: describe perceptual uniformity

TODO: describe sequential colors as shown in modified palette above

TODO: list common sequential value type examples widely used in Pearson

Criteria for sequential color values

When color is used to represent an ordered sequence of data values, the color palette (or set of specific colors) should follow these perceptual guidelines:

  • The sequence of colors in the palette should clearly indicate which values are larger and which values are smaller (in general, darker colors are perceived as larger and lighter colors as smaller)
  • Each color in the palette should clearly and consistently represent how distant it is from other values
  • The color scale should be perceptually uniform, such as the same hue with different saturation or lightness, or multi-hued across a familiar set of colors (e.g. natural world colors)
  • The extremes of the color scale should be clearly labeled, by axis or legend, with at least one interval value labeled as well (e.g. the midpoint)
  • Both stepped and smooth gradations are acceptable, but the usage should be clear
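
As a rough illustration of a stepped, single-hue scale, the sketch below generates shades of one hue from light to dark using Python's standard `colorsys` module. The function name, the default lightness range, and the saturation value are assumptions chosen for the example. HLS lightness is only an approximation of perceived lightness, so the result is not guaranteed to be perceptually uniform.

```python
import colorsys


def sequential_palette(hue_degrees, steps=5, lightest=0.85, darkest=0.25,
                       saturation=0.6):
    """Return `steps` hex colors of one hue, ordered from light to dark.

    Lightness is spaced evenly in HLS space, which only approximates
    perceptual uniformity (assumption: good enough for a first draft).
    """
    if steps < 2:
        raise ValueError("a sequential palette needs at least two steps")
    colors = []
    for i in range(steps):
        lightness = lightest + (darkest - lightest) * i / (steps - 1)
        r, g, b = colorsys.hls_to_rgb(hue_degrees / 360, lightness, saturation)
        colors.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colors


# Five greens (hue 120 degrees), lightest first.
greens = sequential_palette(120)
```

Because darker colors are generally read as larger values, the last color in the returned list would be assigned to the largest value.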

TODO: show and explain RAG color palette

Five colors of same green hue, from light to dark

Diverging

Diverging data values represent the deviation of a value along a linear axis relative to a neutral midpoint. This midpoint may be zero, where the two poles are positive and negative, or some arbitrary average such as test scores, where the two poles may indicate how well or poorly a student performed. There may be an implied range of acceptable thresholds within the two poles, where the extremes are outside the acceptable range.

Examples of diverging data: individual test scores relative to the average, etc.

TODO: list common diverging value type examples widely used in Pearson

Criteria for diverging color values

When color is used to represent a diverging set of data values, the color palette (or set of specific colors) should follow these perceptual guidelines:

  • The midpoint should typically be a light color when the value is neutral, or a dark color when the midpoint is the implied ideal
  • Each side of the color scale should follow the criteria for sequential colors, with the difference in color from the midpoint growing strongest at each end (e.g. for light midpoints, the lighter colors meaning less different from the midpoint and the darker colors meaning more different from the midpoint)
  • Both sides should be balanced by using a shared uniform scale, so that values at the same distance from the midpoint are perceived as equivalent
  • Distinct colors should be used so the reader understands both the direction and the magnitude of the distance from the midpoint
Five colors interpolated between a red hue and a green hue
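
The balance criterion above (a shared uniform scale on both sides of the midpoint) can be sketched as follows. This is a minimal example under stated assumptions: the three default colors are hypothetical placeholders, not an approved palette, and plain RGB interpolation is used for brevity even though it is not perceptually uniform.

```python
def _hex_to_rgb(hex_color):
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def _mix(start, end, t):
    """Linear RGB blend from `start` (t = 0) to `end` (t = 1)."""
    a, b = _hex_to_rgb(start), _hex_to_rgb(end)
    return "#" + "".join(f"{round(x + (y - x) * t):02x}" for x, y in zip(a, b))


def diverging_color(value, midpoint, max_distance,
                    low="#b35806", mid="#f7f7f7", high="#542788"):
    """Map `value` to a color on a diverging scale centered on `midpoint`.

    The same `max_distance` is used on both sides, so two values that are
    equally far from the midpoint get equally strong colors. Values beyond
    `max_distance` are clamped to the end colors.
    """
    if max_distance <= 0:
        raise ValueError("max_distance must be positive")
    t = max(-1.0, min(1.0, (value - midpoint) / max_distance))
    end_color = high if t >= 0 else low
    return _mix(mid, end_color, abs(t))


# Hypothetical test scores relative to an average of 50, scale of +/- 30.
colors = [diverging_color(score, midpoint=50, max_distance=30)
          for score in (20, 35, 50, 65, 80)]
```

Using one `max_distance` for both directions, rather than scaling each side to its own extreme, is what keeps the two sides comparable.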

Mixed data values

Sometimes more than one data type is being represented in the same chart. For example, a bar chart might represent schools by the size of student population (the length of the bar), and also by region (the color of the bar). This is common, and can be used effectively. However, care must be taken to explain both signals: the sequential (the dependent variable axis labels) and the categorical (a color-coded legend), with a textual explanation in the title or caption.

Sequential palette for qualitative data

Normally, you should reserve a sequential palette for quantitative (not qualitative) values. However, when presenting qualitative data in a manner ordered by decreasing quantity value, it can be effective to use variations on the same hue for the categories. This also works well for people with decreased color vision and for printing in grayscale. Note: when combining darker and lighter colors, darker colors are typically associated with larger values, while lighter colors are associated with smaller values. Reversing or mixing the sequence of colors may cause confusion or misunderstanding in readers.
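
A minimal sketch of this ordering rule follows: categories are ranked by value, and the darkest shade goes to the largest. The function name, the category names, and the hex values are hypothetical, chosen only for illustration.

```python
def shades_by_rank(values, light_to_dark):
    """Assign single-hue shades to categories so larger values get darker shades.

    `values` maps category name to quantity; `light_to_dark` is a list of
    shades of one hue, ordered from lightest to darkest.
    """
    if len(values) > len(light_to_dark):
        raise ValueError("not enough shades for the number of categories")
    ranked = sorted(values, key=values.get, reverse=True)  # largest first
    dark_to_light = list(reversed(light_to_dark))
    return {name: dark_to_light[i] for i, name in enumerate(ranked)}


# Hypothetical enrollment counts and a hypothetical three-step green palette.
enrollment = {"Math": 120, "Science": 300, "History": 80}
greens = ["#c7e9c0", "#74c476", "#238b45"]
assignment = shades_by_rank(enrollment, greens)
```

Drawing the categories in the same ranked order keeps the shade sequence and the visual sequence in agreement.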

Nine colors of different distinct hues
Examples of Qualitative Color Palette
Bar chart with single color for same category
Do: single color for same category
Stacked bar chart with multiple colors for different categories
Do: multiple colors for different categories
Bar chart with multiple colors for same category
Don't: multiple colors for same category
Donut chart with multiple colors for different categories
Do: multiple colors for different categories
Donut chart with multiple shades of green
Do: multiple shades of same hue for different categories
Donut chart with multiple shades of green
Don't: misorder multiple shades of same hue for different categories

Achromatic: grayscale

Uses for grayscale colors include display in media or devices without chromatic colors (e.g. print or e-ink devices), and de-emphasis of data points to contrast with accentuated data points.

Print and Grayscale Displays

For small numbers of qualitative or sequential data, you can substitute grayscale colors on a white background for the chromatic colors. Normal human vision cannot reliably differentiate between more than a few shades of gray, so limit the palette to five colors, if possible.

Do not distinguish items by color alone, especially with the limited differentiation by achromatic colors. Consider using supplementary patterns or symbols where possible. Note that diverging data do not work well with a grayscale palette, so use of patterns or symbols may be necessary for visualizations of diverging data.

It is a good practice to assume that your data visualization may be printed on a grayscale printer, and to limit your use of color even for chromatic charts.

Five colors of grays, from light to dark

Note: For use in print, it is sometimes possible to use technologies such as SVG and CSS to specify special print colors, such as grayscale colors, that are selected specifically for grayscale print (rather than chromatic colors printed with only saturation and lightness, and no hue). Where possible, specify CSS print media rules for grayscale color palettes.

Grayscale for de-emphasis

Often, it is good to reduce visual clutter and draw emphasis to key sections or data points by using lighter or heavily desaturated achromatic or chromatic colors for all other elements of a design.

Because of the convention that light grays denote a de-emphasized element, you should not use achromatic colors in combination with chromatic colors as part of a qualitative or sequential palette. This may cause confusion to readers, including those with some cognitive disabilities, and it may provide insufficient contrast for some readers with color blindness.

Avoid similar shades of chromatic and grayscale colors

When using grayscale to de-emphasize elements, ensure that the grayscale and chromatic colors have distinct saturation and lightness (e.g. shades). Elements of the same or similar shades will be indistinguishable to people with decreased color vision, and when printed in grayscale.
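
One way to test a color pairing before publishing is to estimate how each color would look when reduced to gray. The sketch below uses the common Rec. 601 luma weights as an approximation; actual printers and displays may convert differently. The function names and the minimum difference of 64 gray levels are assumptions for illustration, not thresholds set by this guide.

```python
def gray_level(hex_color):
    """Approximate gray level (0 to 255) of '#RRGGBB' using Rec. 601 luma weights."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def distinct_in_grayscale(color_a, color_b, min_difference=64):
    """True if the two colors stay visibly different once hue is removed.

    `min_difference` is a hypothetical threshold on the 0 to 255 gray scale.
    """
    return abs(gray_level(color_a) - gray_level(color_b)) >= min_difference


# A mid blue against a light gray remains distinct ...
emphasis_ok = distinct_in_grayscale("#1f77b4", "#d9d9d9")
# ... but a green and a gray of similar shade do not.
emphasis_bad = distinct_in_grayscale("#2ca02c", "#707070")
```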

Bar chart with single blue bar and 3 light gray bars
Do: use a single color for emphasis, and a single distinct grayscale shade for de-emphasis
Bar chart with single green bar and 3 gray bars of the same shade
Don't: use similar shades of chromatic and grayscale colors to differentiate data points
Donut chart with multiple grayscale shades
Do: multiple grayscale shades for different categories
Donut chart with multiple grayscale shades
Do: use a single color for emphasis, and a single distinct grayscale shade for de-emphasis
Donut chart with multiple colors for different categories
Don't: mix chromatic and grayscale colors for different categories

TODO: more color examples

Color Semantics

TODO: describe the semantics or meanings of colors, such as for RAG status charts (e.g. don't use reds to mean positive or go conditions)

Axes and Labeling

For Cartesian (X/Y) charts, always display both X and Y axes, with clear labels for numerical axes.

Visual clutter

Visual clutter is disorganization in the collection of elements, or any unnecessary visual elements that don't directly contribute to the reader's ability to understand the information in a document or image. Visual clutter contributes to extraneous cognitive load, and may reduce the reader's task performance.

Clutter is the state in which excess items, or their representation or organization, lead to a degradation of performance at some task.

—Rosenholtz et al., Feature Congestion: A Measure of Display Clutter, 2005

Axis guide lines optional

Dependent axis (usually Y-axis) guide lines that extend the tick marks across the width of the chart may help some readers correlate datapoint position (e.g. bar height) with a specific value. This is especially true for younger readers who may draw an association with graph paper. Other readers may find such guide lines to be distracting visual clutter. Inclusion of guide lines is optional, and both approaches are acceptable.

Don't hide axis lines or labels

Omitting or hiding axis lines or labels to reduce visual clutter reduces usability for many readers, especially younger readers or those with some cognitive disabilities, and may cause misinterpretation of the data. Always include axis lines, and always include axis labels for numeric axes. Axis labels for categorical axes are optional, if the chart title includes a clear description of what's being measured (i.e. the category type).

Bar chart with both X and Y axis
Do: display both X and Y axes
Bar chart with both X and Y axis
Do: reduce visual clutter
Stacked bar chart with multiple colors for different categories
Don't: hide one of the axes
Bar chart with multiple colors for same category
Don't: omit the label for numerical axis

Avoid patterns

Use solid colors for fill and lines, and avoid patterns such as cross-hatches or dashed lines. Instead, use well-defined color combinations (including grayscale) that are distinguishable by people with decreased color vision, use distinct symbols where appropriate (e.g. as data points on line charts or scatter plots), or use different line thicknesses if necessary.

Note: Earlier accessibility advice sometimes encouraged the use of patterned fills or dashed line patterns in data visualizations. This practice often makes charts more difficult to read, and significantly increases cognitive load for some people with cognitive disabilities. In addition, patterns that cross one another can produce visual effects that distort the data representation.

TODO: provide good and bad examples

Patterns for semantics

An exception to this guidance is the limited use of patterned fills or strokes to indicate some qualitative exception, such as missing data, uncertain data, or future projected data. Reserve the use of patterns for this semantic data, rather than as a visual distinction.

Font styles and sizes

Consistent font family

The font family used in charts and diagrams should be consistent with the text on the containing page. Currently, this will typically be the sans serif typeface Open Sans, but may also be the serif typeface Playfair Display.

Consistent font styles

Limit the number of colors and styles used for labels in your chart or diagram. Avoid bold or italic text, unless it is intended to emphasize a particular feature. Text color should normally be distinct, contrasting, neutral grayscale colors, such as graphite gray, ink black, or chalk white.

Hierarchical font sizes

As with any document, the size of text within a chart or diagram should reflect the hierarchical level of the label. For example, the title of the chart should be largest, followed by the axis labels, followed by the axis tick labels. At each level of hierarchy, the different labels should have a uniform size. For example, the labels for the X and Y axes should be the same size, and the tick labels on both the X and Y axes should be the same (smaller) size.

Avoid shrinking longer labels to fit within the same hierarchical level. Instead, use a consistent font size for that level that fits all labels (or use abbreviations where appropriate).

Other labels outside a strict hierarchy, such as value labels, legend labels, or annotations, should match the closest appropriate font size (e.g. value labels should be the same size as axis tick labels).

Inconsistent font sizes can make it harder for the reader to construct a mental model of the structure of the chart.

Font display size

Because images have their own internal font size, and because images can be displayed at arbitrary sizes, care must be taken that the image display size maintains an appropriate minimum font size. The smallest font size as displayed in the chart image should match the minimum font size specified in the Pearson style guide.
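
The relationship between internal and displayed font size is a simple proportion, sketched below. The function names are illustrative, and the minimum font size is passed in as a parameter because its actual value comes from the Pearson style guide and is not stated here. The numbers in the usage lines are hypothetical.

```python
import math


def displayed_font_size(internal_font_px, intrinsic_width_px, display_width_px):
    """Font size as rendered when an image is scaled to `display_width_px`."""
    return internal_font_px * display_width_px / intrinsic_width_px


def minimum_display_width(smallest_internal_font_px, intrinsic_width_px,
                          minimum_font_px):
    """Smallest display width (whole pixels) that keeps the smallest text
    at or above `minimum_font_px`."""
    return math.ceil(intrinsic_width_px * minimum_font_px
                     / smallest_internal_font_px)


# A chart drawn 800px wide with 12px tick labels, shown at 400px wide,
# renders those labels at half size.
shown = displayed_font_size(12, 800, 400)

# Hypothetical minimum of 14px: the same chart needs at least this width.
needed = minimum_display_width(12, 800, 14)
```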

Bar chart displayed at a readable font size
Do: specify a display size for the chart image that makes the smallest text readable
Bar chart displayed at an unreadably small font size
Don't: specify a display size for the chart image that makes the text too small

Note: For the purpose of this document, illustrative examples do not obey this font-size guidance. This is purposeful, to emphasize the salient feature of the chart being described. Where the understandability and perceivability of the data is important, use appropriate font sizes.

Text descriptions and summaries

All images should have alt-text, that is, text that provides a brief, cogent summary of what the image displays. This is especially important for charts and diagrams.

At a minimum, the alt-text should contain:

  • the type of the data visualization, such as "line chart", "bar chart", "pie chart", "flow chart", "map", or "diagram"
  • the type of the data, typically what's described on the axis labels
  • the meaning of the data visualization, such as the overall trend, the most salient (highest, lowest, average) values, or the manner in which the chart reinforces the content of the containing article
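
The three minimum elements above can be assembled mechanically, which also makes it easy to reject charts where one is missing. The following is a minimal sketch; the function name, the sentence template, and the sample chart are assumptions for illustration, and a human author should still review the resulting text.

```python
def build_alt_text(chart_type, data_description, key_message):
    """Compose alt-text from chart type, data type, and meaning.

    Raises ValueError if any of the three required parts is empty.
    """
    fields = {
        "chart_type": chart_type,
        "data_description": data_description,
        "key_message": key_message,
    }
    missing = [name for name, text in fields.items() if not text.strip()]
    if missing:
        raise ValueError("missing alt-text parts: " + ", ".join(missing))
    chart = chart_type.strip()
    message = key_message.strip().rstrip(".")
    return f"{chart[0].upper()}{chart[1:]} of {data_description.strip()}. {message}."


# Hypothetical chart.
alt = build_alt_text(
    "bar chart",
    "average test score by school",
    "School C has the highest average score",
)
```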

Summaries

In addition to alt-text, for more complex charts or diagrams, a longer text description should be provided that describes each step or aspect of the visualization, and how it is connected to other steps or aspects.

Explorable data or alternate forms

If possible, the data itself should be in a structure, such as SVG or HTML, that can be navigated and explored by screen reader users.

Ideally, the chart should also include another view of the data, such as a data table, or a link to download the data in a spreadsheet format (such as CSV or Excel).
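
As one possible way to offer a downloadable view of the data, the sketch below serializes a chart's rows to CSV text using Python's standard `csv` module. The function name and the sample rows are hypothetical. How the resulting text is linked from the page is outside the scope of this example.

```python
import csv
import io


def chart_data_to_csv(header, rows):
    """Return the chart's data as CSV text, one line per data point."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical bar chart data.
csv_text = chart_data_to_csv(
    ["School", "Students"],
    [["School A", 420], ["School B", 310]],
)
```

Using the same column names as the chart's axis labels helps readers match the table to the visualization.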

Optionally, the chart may also include citations or sources for the data.

TODO: provide examples for alt text and summaries

Dashboards

  • colors should be consistently used across all different chart types

    TODO: show examples of dashboard color usage

Animation

TODO: describe the use and misuse of animation

Chart types

TODO: describe the most common chart types, what they should be used for, and layout guides for each

Bar Chart

Annotated bar chart
Bar chart annotated with different spacing and sizing rules

Data

  • Keep the data simple: i.e. x, y and 1 set of values. Unless processing a lot of complex data is a key learning point, do not add more information or layers on top of this; it is too much for people to process and will confuse them. If you need to convey more, break it down into other data formats. This means you may need two charts instead of one, for example.
  • As a rule of thumb, 7-10 data elements at most are best. Break it down wherever you can, and avoid scrolling or losing headings off the screen.
  • Try to ensure the information is displayed in different or additional ways to enable understanding wherever possible and appropriate (for example, data tables, a written description of a graph or map, or a couple of maps instead of one complex one).
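
Where a data set exceeds the rule of thumb above, it can be split across several charts. The sketch below divides a list of data elements into the fewest evenly sized groups that respect a maximum per chart. The function name and the default of 10 are assumptions taken from the upper end of the 7-10 range. Whether a split makes sense for a given data set remains an editorial decision.

```python
import math


def split_for_charts(items, max_per_chart=10):
    """Split `items` into the fewest groups of at most `max_per_chart`,
    keeping the group sizes as even as possible and preserving order."""
    if max_per_chart < 1:
        raise ValueError("max_per_chart must be at least 1")
    if not items:
        return []
    chart_count = math.ceil(len(items) / max_per_chart)
    base, extra = divmod(len(items), chart_count)
    groups, start = [], 0
    for i in range(chart_count):
        size = base + (1 if i < extra else 0)  # first `extra` groups get one more
        groups.append(items[start:start + size])
        start += size
    return groups


# 23 hypothetical data elements become three charts of 8, 8, and 7 bars.
groups = split_for_charts(list(range(23)))
```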

Layout

  • Give the item a title that appears before the data
  • Left align every text element if you can.
  • Try not to split text data and visual data. Data should be read and then viewed or vice versa rather than splitting the data either side of a graph for example.
  • Try to ensure data can be viewed in a table with lines or equivalent so that the eye can track the row of data more easily across the screen/page rather than having to try and match each piece across a row with white space between.
  • Keep good white space between lines and blocks of data to enable legibility.
  • Use at least a minimum 1pt keyline everywhere, 2pt for patterned lines if possible.
  • Make sure any keylines or data on top of graph paper contrast enough (i.e. 70%) and/or are thicker, so that they are distinguishable.

Color

  • Colour palette should have 1-5 colours ideally, 10 at most if really needed (e.g. for diverse map data). A finite number of colours consistently applied enables good colour recognition and decoding.
  • Avoid fatiguing / intense colours and combinations and vibrating colours like red on blue, bright pink on green, red on green, black on red, and green on purple.
  • Do not use red and green in the same document/chart or data visualisation if at all possible.
  • Use contrasting colours to allow for information to be clearly seen.
  • If you need to use tints of the same colour or show a colour range (for a heat map or temperature chart, for example), try to make sure there is a 40% difference between them.
  • If using a light colour, outline any data or elements that do not contrast well enough with the background, or choose a darker colour.

Pattern

  • Pattern should be avoided if at all possible. It is hard for people to view, causes cognitive overload very quickly, and is not easily discernible for a lot of people. This means we need to rely more on contrast and labelling to resolve colour blindness issues.
  • Where pattern is used, it should be minimised, contrast well, and not be made up of glaring colours or colours opposite each other on the colour wheel. Don’t use large patterns in small areas, or complex or clashing patterns and colours.
  • The space between pattern elements should not be the same as the pattern element. The same goes for patterned lines like dots and dashes. The space between dots or lines should not be the same size as the dot or line. They can actually be hard for a lot of people to see.
  • Two similar patterns or line types (e.g. two dotted or dashed lines or patterns) should not be used in the same document; it is too hard for people to tell them apart.
  • Keep colour choices and pattern choices, if using, consistent for the same types of data.

Labelling and keys

  • Keep labelling consistently placed and outside of but close to the data element.
  • Always label everything if functionality allows. Keys are not an alternative; they should be used as an additional tool where necessary.
  • Key items should be in a logical order and big enough to be legible (For example, 44px square minimum or 10mm square minimum for colours, patterns and symbols.)
  • If you have to use a key then make use of pattern and symbols. (This is especially critical if using a key with more than two elements of data.)

Further guidance

Tools and technology