Color: The Limits of a ‘Standard’ Observer

January 13, 2022

Painting vision with a broad brush can be problematic

By Jess Baker, Tony Esposito, and Jason Livingston

Imagine you adjust two separate color-tunable LED fixtures to visually match in color appearance, but when you pull out your expensive, accurate spectroradiometer and measure the chromaticity of each, there is a significant difference that should be noticeable. How can that be?

Alternatively, you tune two LED fixtures to be a perfect chromaticity match, according to that meter. Then, you show them to everyone in your office. Chaos ensues when the team realizes they are not a visual match for most of the group despite what the meter says. In addition to being a headache for specifiers, discrepancies between color perception and the “official” measurement tools also present challenges when manufacturing light sources, often putting large investments on the line when staff cannot reach a visual consensus or reconcile what they see with the measured data for new products. This article explores the tension between the need for a reliable representation of “average” color vision and the inherent limitations of relying on a “standard” observer, and offers guidance to lighting practitioners on how to handle those limitations.

The color discrepancies described above occur because we use a single standard observer to represent a wide range of people with different vision as well as different viewing conditions, and because the most commonly used standard observer (CIE 1931 2-deg) has known inaccuracies. In theory, a standard observer, which is a set of three color-matching functions, represents the visual system of a “standard” person with “color normal” vision for a specified field of view. A good standard observer should be derived from, and represent, the average retinal sensitivities of people who have normal color vision, allowing it to be the basis of metrics characterizing average visual perception. A good standard observer allows us to compute and predict an acceptably accurate color match for most people, most of the time. This facilitates repeatable calculations and promotes commerce.
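For readers who want to see where the color-matching functions enter the calculation, the standard colorimetric relations are sketched below. Here S(λ) is the spectral power distribution of the light, x̄, ȳ and z̄ are the three color-matching functions of whichever standard observer is chosen, and k is a normalizing constant; the symbol names are conventional notation and do not come from this article.

```latex
\begin{aligned}
X &= k \int S(\lambda)\,\bar{x}(\lambda)\,d\lambda, &
Y &= k \int S(\lambda)\,\bar{y}(\lambda)\,d\lambda, &
Z &= k \int S(\lambda)\,\bar{z}(\lambda)\,d\lambda, \\
x &= \frac{X}{X+Y+Z}, &
y &= \frac{Y}{X+Y+Z}.
\end{aligned}
```

Because the spectrum is collapsed into just three numbers, two lights with different spectra can share the same chromaticity (x, y) under one set of color-matching functions and yet differ under another set.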

The CIE has previously introduced four standard observers (Figure 1). The 1931 2-deg and the 1964 10-deg standard colorimetric observers are both derived from color-matching experiments, roughly representing the average responses of the relatively small number of study participants. The CIE recommends using the 1964 observer when the field of view is larger than 4 deg, and in general the 1964 observer is probably more accurate because it is based on a larger number of subjects and a more sophisticated measurement system. The net result is that the problem mentioned at the outset of this article is usually reduced when the 1964 10-deg observer is used. Nevertheless, this version of the standard observer is rarely used in the industry. The latest standard observers, the CIE 2015 2-deg and 10-deg cone-fundamentals-based tristimulus functions, provide a new approach by focusing on direct measurements of cone sensitivities rather than results of color-matching experiments. Their improvement in matching accuracy, particularly compared to the highly flawed CIE 1931 2-deg standard observer, has been demonstrated, but the uptake by the industry has been minimal so far.

Figure 1. CIE Tristimulus functions.
Courtesy of Michael Royer

Though scientists have made strides in improving our representation of a “standard observer,” the 1931 2-deg colorimetric observer is still the dominant one used for making lighting calculations today, simply due to the inertia that must be overcome to change metrics. While it was good for its time, it contains errors, and many in the color science community are exploring alternative approaches that go beyond targeting the right average person and field of view.

EVEN IF THE LIGHTING INDUSTRY could exactly define the standard observer for a population, there would still be people for whom the standard is not representative. It is simply not possible to create a color match for all people simultaneously, even if all viewers have normal color vision. This is because the human eye varies from person to person, even within the range of what is considered normal, which we call inter-observer variability. A good standard observer is our best solution for lighting metrics that are applicable for most people, most of the time, but any departure from the average, such as low vision, abnormalities or defects, certain color deficiencies, or even just our eyes yellowing from age, means the standard observer will be less representative of the individual. This creates an inevitable deficiency in our overall foundation for vision metrics and excludes sensitive populations from most lighting guidance.
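The effect of differences between observers can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the Gaussian "color-matching functions" are toy curves rather than CIE data, the second observer is simply the first with its sensitivities shifted by 5 nm, and the three narrowband "LED" primaries and the flat target spectrum are made up. The sketch mixes the primaries so they match the broadband target exactly for observer A, then checks the same pair of lights with observer B.

```python
import math

WAVELENGTHS = range(380, 781)  # nm, 1 nm steps


def gaussian(center, width):
    """Bell-shaped curve sampled on WAVELENGTHS (used for toy CMFs and toy LEDs)."""
    return [math.exp(-0.5 * ((wl - center) / width) ** 2) for wl in WAVELENGTHS]


def toy_observer(shift_nm=0.0):
    """HYPOTHETICAL color-matching functions: NOT CIE data, illustration only."""
    return (gaussian(600 + shift_nm, 40),   # stand-in for x-bar
            gaussian(550 + shift_nm, 45),   # stand-in for y-bar
            gaussian(450 + shift_nm, 25))   # stand-in for z-bar


def tristimulus(spd, observer):
    """Discrete version of X, Y, Z: sum of spectrum times each matching function."""
    return [sum(s * c for s, c in zip(spd, cmf)) for cmf in observer]


def chromaticity(xyz):
    total = sum(xyz)
    return xyz[0] / total, xyz[1] / total


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(m)
    return [det3([[rhs[r] if c == col else m[r][c] for c in range(3)]
                  for r in range(3)]) / d
            for col in range(3)]


def mismatch(observer, target_spd, primaries, weights):
    """Chromaticity distance between the target and the primary mix for one observer."""
    mix = [sum(w * p[i] for w, p in zip(weights, primaries))
           for i in range(len(target_spd))]
    xa, ya = chromaticity(tristimulus(target_spd, observer))
    xb, yb = chromaticity(tristimulus(mix, observer))
    return math.hypot(xa - xb, ya - yb)


target = [1.0] * len(WAVELENGTHS)                      # made-up flat broadband light
primaries = [gaussian(450, 10), gaussian(530, 10), gaussian(620, 10)]  # made-up LEDs
observer_a = toy_observer(0.0)
observer_b = toy_observer(5.0)

# Choose channel weights so the mix has the same X, Y, Z as the target for observer A.
responses = [tristimulus(p, observer_a) for p in primaries]
matrix = [[responses[j][i] for j in range(3)] for i in range(3)]
weights = solve3(matrix, tristimulus(target, observer_a))

mismatch_a = mismatch(observer_a, target, primaries, weights)
mismatch_b = mismatch(observer_b, target, primaries, weights)
print(f"observer A chromaticity difference: {mismatch_a:.2e}")
print(f"observer B chromaticity difference: {mismatch_b:.2e}")
```

For observer A the difference is zero to within rounding, while for observer B the same two lights no longer share a chromaticity. The size of the difference in this toy model says nothing about real observers; the point is only that a match computed for one set of sensitivities does not carry over to another.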

In the past, practitioners might not have been aware of the limitations of basing colorimetry on a standard observer. But for current and future tunable and color-mixing systems, we need to know and understand those limitations in order to predict mismatches and avoid them, or to explain their root cause to our clients.

Applications that require a close color match are best served by assembling the largest group of viewers possible to review physical samples under the light sources being considered. Even though a person or group of people may not be average, their ability to evaluate the match in context, including the effects of field size, surround and illuminance, may provide critical information that the standard observer alone cannot. Similarly, specifiers may need to prioritize their project’s end users’ perception of color over their own. For example, to achieve a good color match among multiple color-mixing light sources for a renovation of a senior care facility, a young designer may want to create a mock-up for the residents. Generally, a flexible spectral specification will aid the designer in tuning their fixtures appropriately. Light sources with narrowband emission (e.g., “red,” “blue,” and “green” LEDs) tend to exacerbate a mismatch, but the more LED channels in a color-mixing system, the easier it will be to find agreement among a group of people when tuning (you may also want to pad in some additional commissioning time). In general, fewer mismatches can also be expected from lighting systems with similar spectral power distributions. More specifically, light sources built on more similar blue-pump LEDs are less likely to produce an unintended mismatch.

While the current foundation for vision and color-based metrics is outdated, efforts are underway to develop better standards and correct errors. To ensure color match, lighting practitioners should continue to do what they do best: consider the final users, view samples and push for more mock-ups. Just maybe don’t always trust your own eyes.