Pearson Data Visualization Style Guide, v0.0.3
Use of Color
Chromatic: multiple colors
Qualitative: Categories
Qualitative data values are different categories or classes. This is often called nominal data, meaning that each value represents a named thing, rather than an ordered or numerical progression. For qualitative data, use a different hue for each item.
TODO: list common qualitative / categorical value type examples widely used in Pearson
Examples of categorical data: courses, learning topics, schools, etc.
TODO: ask if there is or should be consistent color-coding for different topic areas (e.g. math = orange, science = red, social studies = blue) and sub-colors within those categories. Could be good for symbology.
Do not use different hues to represent elements of the same category. This may confuse or distract people with some cognitive disabilities, by implying a pattern or categorization that doesn't exist, thus increasing their cognitive load and even frustration in trying to understand this non-existent pattern.
Misuse of multiple colors also distracts from the message of the data visualization, and increases visual clutter.
Criteria for qualitative color values
TODO: simplify this paragraph
Color may be used to distinguish discrete items or related groups of items that have no intrinsic order, such as different courses of study, different schools or school systems within a district, or countries on a map (when not ranking them by another criterion). For this purpose, a color palette (or set of specific colors) should follow these perceptual guidelines:
- Each color in the palette should look clearly distinct from the others, with a contrast ratio of at least 3:1 based on relative luminance (per WCAG 2.1 Success Criterion 1.4.11 Non-text Contrast)
- Each color in the palette should be perceptually uniform, such as different hues with a saturation and lightness equivalent to the other colors in the palette
- No single color in the palette should stand out as stronger or weaker relative to the other colors; this includes not mixing chromatic and grayscale colors in the same palette
- Collectively, there should be no perceived ordering or ranking among the colors of the palette, which may give the impression of an apparent order among the represented items.
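The 3:1 contrast criterion above can be checked programmatically. The following is a minimal Python sketch, assuming colors are given as '#RRGGBB' sRGB hex strings; the function names are hypothetical, and the formulas are the relative luminance and contrast ratio definitions from WCAG 2.1. It reports which pairs in a palette fall below a chosen minimum ratio.

```python
from itertools import combinations


def relative_luminance(hex_color: str) -> float:
    """WCAG 2.1 relative luminance of an sRGB color given as '#RRGGBB'."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Linearize each gamma-encoded sRGB channel (threshold as published in WCAG 2.1).
    r, g, b = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a: str, color_b: str) -> float:
    """WCAG contrast ratio between two colors, from 1.0 (identical) to 21.0."""
    lum_a = relative_luminance(color_a)
    lum_b = relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def low_contrast_pairs(palette, minimum=3.0):
    """Return (color_a, color_b, ratio) for every pair below the minimum ratio."""
    failures = []
    for color_a, color_b in combinations(palette, 2):
        ratio = contrast_ratio(color_a, color_b)
        if ratio < minimum:
            failures.append((color_a, color_b, round(ratio, 2)))
    return failures
```

Calling `low_contrast_pairs` on a candidate palette returns an empty list when every pair meets the minimum. Note that this only tests luminance contrast; it does not test whether hues are distinguishable to people with decreased color vision.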
Note that the Pearson branding color palette is not ideal for use with data visualization, because it isn't perceptually uniform. But if it is going to be used for qualitative charts, it should be applied consistently.
Sequential
Sequential data values are items that have ordered, numerical values, such as performance metrics.
Examples of sequential data: test scores, grades, rankings, performance times, class size, etc.
TODO: describe perceptual uniformity
TODO: describe sequential colors as shown in modified palette above
TODO: list common sequential value type examples widely used in Pearson
Criteria for sequential color values
When color is used to represent an ordered sequence of data values, the color palette (or set of specific colors) should follow these perceptual guidelines:
- The sequence of colors in the palette should clearly indicate which values are larger and which values are smaller (in general, darker colors are perceived as larger and lighter colors as smaller)
- Each color in the palette should clearly and consistently represent how distant it is from other values
- The color scale should be perceptually uniform throughout, such as the same hue with different saturation or lightness, or multi-hued across a familiar set of colors (e.g. natural world colors)
- The extremes of the color scale should be clearly labeled, by axis or legend, with at least one interval value labeled as well (e.g. the midpoint)
- Both stepped and smooth gradations are acceptable, but the usage should be clear
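As an illustration of a stepped, single-hue scale, here is a minimal Python sketch using the standard `colorsys` module. The function name and the default lightness and saturation values are assumptions chosen for illustration, not values from this guide. HLS lightness is only a rough approximation of perceived lightness, so a palette produced this way would still need to be checked by eye or in a perceptually uniform color space.

```python
import colorsys


def sequential_palette(hue_degrees: float, steps: int = 5,
                       lightest: float = 0.90, darkest: float = 0.30,
                       saturation: float = 0.60) -> list[str]:
    """Build a single-hue palette running from light (small) to dark (large)."""
    if steps < 2:
        raise ValueError("a sequential palette needs at least two steps")
    colors = []
    for i in range(steps):
        # Evenly spaced lightness, so each step is the same distance from the next.
        lightness = lightest + (darkest - lightest) * i / (steps - 1)
        r, g, b = colorsys.hls_to_rgb(hue_degrees / 360, lightness, saturation)
        colors.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colors
```

For example, `sequential_palette(210, steps=5)` returns five blue tints ordered from lightest to darkest, matching the convention that darker colors read as larger values.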
TODO: show and explain RAG color palette
Diverging
Diverging data values represent the deviation of a value along a linear axis relative to a neutral midpoint. This midpoint may be zero, where the two poles are positive and negative, or some arbitrary average, such as for test scores, where the two poles may indicate how well or poorly a student performed. There may be an implied range of acceptable thresholds within the two poles, where the extremes are outside the acceptable range.
Examples of diverging data: individual test scores relative to the average, etc.
TODO: list common diverging value type examples widely used in Pearson
Criteria for diverging color values
When color is used to represent a diverging set of data values, the color palette (or set of specific colors) should follow these perceptual guidelines:
- The midpoint should typically be a light color when the value is neutral, or a dark color when the midpoint is the implied ideal
- Each side of the color scale should follow the criteria for sequential colors, with the difference in color from the midpoint growing strongest at each end (e.g. for light midpoints, lighter colors mean a smaller difference from the midpoint and darker colors mean a larger difference)
- Both sides should be balanced by using a shared uniform scale, so that values at the same distance from the midpoint are perceived as equivalent
- Distinct colors should be used so the reader understands both the direction and the magnitude of the distance from the midpoint
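The criteria above can be sketched as a small mapping from a value to a color. This Python example is a minimal sketch under stated assumptions: the function names, the orange and blue hues, and the lightness and saturation defaults are all hypothetical choices for illustration. Both sides share one lightness ramp, so values at the same distance from the midpoint get the same lightness, and the hue alone signals direction.

```python
import colorsys


def diverging_hls(value: float, midpoint: float, max_distance: float,
                  low_hue: float = 30.0, high_hue: float = 210.0,
                  mid_lightness: float = 0.95, end_lightness: float = 0.35,
                  saturation: float = 0.65) -> tuple[float, float, float]:
    """Map a value to (hue in degrees, lightness, saturation) on a diverging scale."""
    if max_distance <= 0:
        raise ValueError("max_distance must be positive")
    # Normalized distance from the midpoint, clamped to the ends of the scale.
    distance = min(abs(value - midpoint) / max_distance, 1.0)
    # One shared ramp for both sides: light near the midpoint, dark at the poles.
    lightness = mid_lightness + (end_lightness - mid_lightness) * distance
    hue = high_hue if value >= midpoint else low_hue
    # The exact midpoint is rendered as a neutral gray so it implies neither pole.
    sat = saturation if distance > 0 else 0.0
    return hue, lightness, sat


def to_hex(hls: tuple[float, float, float]) -> str:
    """Convert (hue in degrees, lightness, saturation) to '#rrggbb'."""
    hue, lightness, sat = hls
    r, g, b = colorsys.hls_to_rgb(hue / 360, lightness, sat)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))
```

For test scores with an average of 50, `to_hex(diverging_hls(40, 50, 50))` and `to_hex(diverging_hls(60, 50, 50))` give equally light colors in two different hues. This sketch uses a light, neutral midpoint; a dark midpoint for an implied ideal would need different lightness defaults.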
Mixed data values
Sometimes more than one data type is being represented in the same chart. For example, a bar chart might represent schools by the size of student population (the length of the bar), and also by region (the color of the bar). This is common, and can be used effectively. However, care must be taken to explain both signals: the sequential (the dependent variable axis labels) and the categorical (a color-coded legend), with a textual explanation in the title or caption.
Sequential palette for qualitative data
Normally, you should reserve a sequential palette for quantitative (not qualitative) values. However, when presenting qualitative data in a manner ordered by decreasing quantity value, it can be effective to use variations on the same hue for the categories. This also works well for people with decreased color vision and for printing in grayscale. Note: when combining darker and lighter colors, darker colors are typically associated with larger values, while lighter colors are associated with smaller values. Reversing or mixing the sequence of colors may cause confusion or misunderstanding in readers.
Achromatic: grayscale
Uses for grayscale colors include display in media or devices without chromatic colors (e.g. print or e-ink devices), and de-emphasis of data points to contrast with accentuated data points.
Print and Grayscale Displays
For small numbers of qualitative or sequential data values, you can substitute grayscale colors on a white background for the chromatic colors. Normal human vision cannot reliably differentiate between more than a few shades of gray, so limit the palette to five colors, if possible.
Do not distinguish items by color alone, especially with the limited differentiation by achromatic colors. Consider using supplementary patterns or symbols where possible. Note that diverging data do not work well with a grayscale palette, so use of patterns or symbols may be necessary for visualizations of diverging data.
It is a good practice to assume that your data visualization may be printed on a grayscale printer, and to limit your use of color even for chromatic charts.
Note: For use in print, it is sometimes possible to use technologies such as SVG and CSS to specify special print colors, such as grayscale colors, that are selected specifically for grayscale print (rather than chromatic colors printed with only saturation and lightness, and no hue). Where possible, specify CSS print media rules for grayscale color palettes.
Grayscale for de-emphasis
Often, it is good to reduce visual clutter and draw emphasis to key sections or data points by using lighter or heavily desaturated achromatic or chromatic colors for all other elements of a design.
Because of the convention that light grays denote a de-emphasized element, you should not use achromatic colors in combination with chromatic colors as part of a qualitative or sequential palette. This may confuse readers, including those with some cognitive disabilities, and it may provide insufficient contrast for some readers with color blindness.
Avoid similar shades of chromatic and grayscale colors
When using grayscale to de-emphasize elements, ensure that the grayscale and chromatic colors have distinct saturation and lightness (e.g. shades). Elements of the same or similar shades will be indistinguishable to people with decreased color vision, and when printed in grayscale.
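One way to screen for this problem is to compare the lightness of each gray against each chromatic color. The following Python sketch is a rough check only: the function names and the 0.25 minimum lightness gap are assumptions for illustration, not thresholds from this guide, and HLS lightness is just a proxy for how a color appears when printed in grayscale.

```python
import colorsys


def hex_to_hls(hex_color: str) -> tuple[float, float, float]:
    """Convert '#RRGGBB' to (hue, lightness, saturation), each in the range 0..1."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4))
    return colorsys.rgb_to_hls(r, g, b)


def too_similar(gray_hex: str, chromatic_hex: str,
                min_lightness_gap: float = 0.25) -> bool:
    """True when a gray and a chromatic color are close enough in lightness
    that they may be hard to tell apart without hue."""
    gray_lightness = hex_to_hls(gray_hex)[1]
    chromatic_lightness = hex_to_hls(chromatic_hex)[1]
    return abs(gray_lightness - chromatic_lightness) < min_lightness_gap
```

A mid gray paired with a mid-lightness chromatic color is flagged, while a light gray paired with a dark chromatic color passes.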
TODO: more color examples
Color Semantics
TODO: describe the semantics or meanings of colors, such as for RAG status charts (e.g. don't use reds to mean positive or go conditions)
Axes and Labeling
For Cartesian (X/Y) charts, always display both X and Y axes, with clear labels for numerical axes.
Visual clutter
Visual clutter is disorganization in the collection of elements, or any unnecessary visual elements that don't directly contribute to the reader's ability to understand the information in a document or image. Visual clutter contributes to extraneous cognitive load, and may reduce the reader's task performance.
Clutter is the state in which excess items, or their representation or organization, lead to a degradation of performance at some task.
Axis guide lines optional
Dependent axis (usually Y-axis) guide lines that extend the visual tick mark across the width of the chart may help some readers correlate data point position (e.g. bar height) with a specific value. This is especially true for younger readers who may draw an association with graph paper. Other readers may find such guide lines to be distracting visual clutter. Inclusion of guide lines is optional, and both approaches are acceptable.
Don't hide axis lines or labels
Omitting or hiding axis lines or labels to reduce visual clutter reduces usability for many readers, especially younger readers or those with some cognitive disabilities, and may cause misinterpretation of the data. Always include axis lines, and always include axis labels for numeric axes. Axis labels for categorical axes are optional, if the chart title includes a clear description of what's being measured (i.e. the category type).
Avoid patterns
Use solid colors for fill and lines, and avoid patterns such as cross-hatches or dashed lines. Instead, use well-defined color combinations (including grayscale) that are distinguishable by people with decreased color vision, use distinct symbols where appropriate (e.g. as data points on line charts or scatter plots), or use different line thicknesses if necessary.
Note: Earlier accessibility advice sometimes encouraged the use of patterned fills or dashed line patterns in data visualizations. This practice often makes charts more difficult to read, and significantly increases cognitive load for some people with cognitive disabilities. In addition, patterns that cross one another can produce visual effects that distort the data representation.
TODO: provide good and bad examples
Patterns for semantics
An exception to this guidance is the limited use of patterned fills or strokes to indicate some qualitative exception, such as missing data, uncertain data, or future projected data. Reserve the use of patterns for this semantic data, rather than as a visual distinction.
Font styles and sizes
Consistent font family
The font family used in charts and diagrams should be consistent with the text on the containing page. Currently, this will typically be the sans serif typeface Open Sans, but may also be the serif typeface Playfair Display.
Consistent font styles
Limit the number of colors and styles used for labels in your chart or diagram. Avoid bold or italic text, unless it is intended to emphasize a particular feature. Text color should normally be distinct, contrasting, neutral grayscale colors, such as graphite gray, ink black, or chalk white.
Hierarchical font sizes
As with any document, the size of text within a chart or diagram should reflect the hierarchical level of the label. For example, the title of the chart should be largest, followed by the axis labels, followed by the axis tick labels. At each level of hierarchy, the different labels should have a uniform size. For example, the labels for the X and Y axes should be the same size, and the tick labels on both the X and Y axes should be the same (smaller) size.
Avoid shrinking longer labels to fit within the same hierarchical level. Instead, use a consistent font size for that level that fits all labels (or use abbreviations where appropriate).
Other labels outside a strict hierarchy, such as value labels, legend labels, or annotations, should match the closest appropriate font size (e.g. value labels should be the same size as axis tick labels).
Inconsistent font sizes can make it harder for the reader to construct a mental model of the structure of the chart.
Font display size
Because images have their own internal font size, and because images can be displayed at arbitrary sizes, care must be taken that the image display size maintains an appropriate minimum font size. The smallest font size as displayed in the chart image should match the minimum font size specified in the Pearson style guide.
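The relationship between internal and displayed font size is simple proportional scaling, sketched below in Python. The function names are hypothetical, and the minimum font size is passed in as a parameter because this document does not state the value from the Pearson style guide. The sketch assumes the image is scaled uniformly, so width alone determines the scale factor.

```python
def displayed_font_size(internal_font_px: float,
                        intrinsic_width_px: float,
                        displayed_width_px: float) -> float:
    """Font size as seen by the reader after the image is scaled to its display width."""
    return internal_font_px * displayed_width_px / intrinsic_width_px


def minimum_display_width(smallest_font_px: float,
                          intrinsic_width_px: float,
                          minimum_font_px: float) -> float:
    """Narrowest display width at which the smallest label still meets the minimum size."""
    return intrinsic_width_px * minimum_font_px / smallest_font_px
```

For example, a chart drawn 800 pixels wide with 12-pixel tick labels shows those labels at 6 pixels when displayed 400 pixels wide. If the required minimum were 9 pixels (an illustrative figure only), the same chart should not be displayed narrower than 600 pixels.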
Note: For the purpose of this document, illustrative examples do not obey this font-size guidance. This is purposeful, to emphasize the salient feature of the chart being described. Where the understandability and perceivability of the data is important, use appropriate font sizes.
Text descriptions and summaries
All images should have alt-text, that is, text that provides a brief, cogent summary of what the image displays. This is especially important for charts and diagrams.
At a minimum, the alt-text should contain:
- the type of the data visualization, such as "line chart", "bar chart", "pie chart", "flow chart", "map", or "diagram"
- the type of the data, typically what's described on the axis labels
- the meaning of the data visualization, such as the overall trend, the most salient (highest, lowest, average) values, or the manner in which the chart reinforces the content of the containing article
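The three minimum components above can be assembled mechanically, as in this minimal Python sketch. The function name, the sentence template, and the sample chart in the usage note are hypothetical; real alt-text should be reviewed by a person, since a template cannot judge what is most salient.

```python
def build_alt_text(chart_type: str, data_description: str, key_message: str) -> str:
    """Combine chart type, data type, and meaning into one brief alt-text string."""
    parts = [chart_type.strip(), data_description.strip(), key_message.strip()]
    if not all(parts):
        raise ValueError("chart type, data description, and key message are all required")
    chart_type, data_description, key_message = parts
    # Capitalize only the first letter, so acronyms later in the string are preserved.
    opening = chart_type[0].upper() + chart_type[1:]
    return f"{opening} of {data_description}. {key_message.rstrip('.')}."
```

For an invented example, `build_alt_text("bar chart", "average test score by school", "Scores rise steadily from Year 7 to Year 9")` produces "Bar chart of average test score by school. Scores rise steadily from Year 7 to Year 9."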
Summaries
In addition to alt-text, for more complex charts or diagrams, a longer text description should be provided that describes each step or aspect of the visualization, and how it is connected to other steps or aspects.
Explorable data or alternate forms
If possible, the data itself should be in a structure, such as SVG or HTML, that can be navigated and explored by screen reader users.
Ideally, the chart should also include another view of the data, such as a data table, or a link to download the data in a spreadsheet format (such as CSV or Excel).
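A downloadable CSV view of the chart data can be produced with Python's standard `csv` module, as in this minimal sketch. The function name and the sample school data in the usage note are hypothetical.

```python
import csv
import io


def chart_data_to_csv(headers, rows) -> str:
    """Serialize the data behind a chart as CSV text, header row first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(headers)
    writer.writerows(rows)
    return buffer.getvalue()
```

For example, `chart_data_to_csv(["School", "Students"], [["North", 420], ["South", 385]])` returns three lines of text: the header row followed by one row per school. The same headers and rows could also populate an HTML data table shown alongside the chart.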
Optionally, the chart may also include citations or sources for the data.
TODO: provide examples for alt text and summaries
Dashboards
- Colors should be used consistently across all chart types
TODO: show examples of dashboard color usage
Animation
TODO: describe the use and misuse of animation
Chart types
TODO: describe the most common chart types, what they should be used for, and layout guides for each
Bar Chart
Data
- Keep the data simple, i.e. x, y, and one set of values. Unless processing a lot of complex data is a key learning point, do not add more information or layers on top of this; it is too much for people to process and will confuse them. If you need to convey more, break it down into other data formats. For example, you may need two charts instead of one.
- As a rule of thumb, 7-10 data elements at most are best. Break the data down wherever you can, and avoid scrolling or losing headings off the screen.
- Try to ensure the information is displayed in different or additional ways to enable understanding, wherever possible and appropriate (for example, data tables, a written description of a graph or map, or a couple of maps instead of one complex one).
Layout
- Give the item a title that appears before the data
- Left align every text element if you can.
- Try not to split text data and visual data. Data should be read and then viewed or vice versa rather than splitting the data either side of a graph for example.
- Try to ensure data can be viewed in a table with lines or equivalent so that the eye can track the row of data more easily across the screen/page rather than having to try and match each piece across a row with white space between.
- Keep good white space between lines and blocks of data to enable legibility.
- Use at least a minimum 1pt keyline everywhere, 2pt for patterned lines if possible.
- Make sure any keylines or data on top of graph paper contrast enough (i.e. 70%) and/or are thicker, so that they are distinguishable.
Color
- The colour palette should ideally have 1-5 colours, 10 at most if really needed (e.g. for diverse map data).
- A finite number of colours, consistently applied, enables good colour recognition and decoding.
- Avoid fatiguing / intense colours and combinations and vibrating colours like red on blue, bright pink on green, red on green, black on red, and green on purple.
- Do not use red and green in the same document/chart or data visualisation if at all possible.
- Use contrasting colours to allow for information to be clearly seen.
- If you need to use tints of the same colour or show a colour range (for example, for a heat map or temperature chart), try to make sure there is a 40% difference between them.
- If using a light colour, outline any data or elements that do not contrast well enough with the background, or choose a darker colour.
Pattern
- Pattern should be avoided if at all possible. It is hard for people to view, causes cognitive overload very quickly, and is not easily discernible for a lot of people. This means we need to rely more on contrast and labelling to resolve colour blindness issues.
- Where pattern is used, it should be minimised and contrast well, and should not be made up of glaring colours or colours opposite each other on the colour wheel. Don't use large patterns in small areas, or complex or clashing patterns and colours.
- The space between pattern elements should not be the same as the pattern element. The same goes for patterned lines like dots and dashes: the space between dots or lines should not be the same size as the dot or line. Evenly spaced patterns can be hard for a lot of people to see.
- Two similar patterns or line types (e.g. two dotted or dashed lines or patterns) should not be used in the same document; they are too hard for people to tell apart.
- Keep colour choices and pattern choices, if using, consistent for the same types of data.
Labelling and keys
- Keep labelling consistently placed and outside of but close to the data element.
- Always label everything if functionality allows. Keys are not an alternative, they should be used as an additional tool where necessary.
- Key items should be in a logical order and big enough to be legible (For example, 44px square minimum or 10mm square minimum for colours, patterns and symbols.)
- If you have to use a key then make use of pattern and symbols. (This is especially critical if using a key with more than two elements of data.)
Further guidance
- Inclusive writing guidance: UK Schools Accessibility
- Alternative Text guidance: UK Schools Accessibility
- Inclusive visual design checklist: UK Schools Accessibility
- Microsoft accessible Office documents: UK Schools Accessibility
- The Art of Accessible Multimedia: 1hr US Accessibility for school assessment team webinar
- Universal Design – Color: US Accessibility for school assessment team
- Art Guidelines relating to Epilepsy: US Accessibility for school assessment team
- Pearson Online Color Palette: US Accessibility for school assessment team
Tools and technology
- Circle Graph in Illustrator: (pie chart) US Accessibility for school assessment team
- Grapher for Mac
- Graphing Calculator - Desmos
- Geometry Tool - Desmos
- Desmos accessibility