Cornell box
The Cornell box is a test scene designed to evaluate the accuracy of rendering software by comparing a rendered image with a photograph of a real-world model under the same lighting conditions. It has become a commonly used 3D test model in computer graphics research.[1]
The box was created by Cindy M. Goral, Kenneth E. Torrance, Donald P. Greenberg, and Bennett Battaile at the Cornell University Program of Computer Graphics as part of their research on radiosity and diffuse interreflection. Their findings were published in the paper Modeling the Interaction of Light Between Diffuse Surfaces, presented at SIGGRAPH '84.[2][3]
Reference model
A physical model of the Cornell box is constructed and photographed using calibrated equipment. The photographic image references shared via the Cornell University website[4] were captured using a liquid-cooled Photometrics PXL1300L CCD camera with 12-bit precision. Seven narrow-band filters are employed to obtain a coarse sampling across the visible spectrum. To enhance accuracy, dark current is subtracted from the images, and flat-field correction is applied to compensate for cosine fall-off and lens fall-off. The precise settings of the scene are measured, including the emission spectrum of the light source, the reflectance spectra of all surfaces, and the exact positions and dimensions of the objects, walls, light source, and camera.
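As a rough illustration of the kind of calibration described above (a minimal sketch, not the laboratory's actual pipeline), dark-current subtraction and flat-field correction for a single monochromatic exposure can be expressed as follows:

```python
import numpy as np

def calibrate(raw: np.ndarray, dark: np.ndarray, flat: np.ndarray) -> np.ndarray:
    """Hypothetical single-band CCD calibration: subtract the dark frame,
    then divide by a normalized flat field to compensate for cosine
    fall-off and lens fall-off."""
    signal = raw - dark          # remove the sensor's thermal (dark-current) signal
    gain = flat - dark           # flat field with its own dark current removed
    gain = gain / gain.mean()    # normalize so the correction preserves overall level
    return signal / gain
```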
A matching virtual 3D scene is created, and a digital image is rendered for comparison with the reference photograph. The comparison helps evaluate the accuracy of rendering algorithms, particularly in handling global illumination, radiosity, and light transport. The Cornell box is designed to demonstrate diffuse interreflection: light reflecting off the red and green walls subtly tints the adjacent white surfaces.
Scene configuration
The basic environment consists of:
- A single light source
- A green right wall
- A red left wall
- A white back wall, floor, and ceiling
Objects are commonly placed inside the box to study their interaction with light. The original configuration included two boxes, while subsequent versions introduced a reflective mirror sphere and a refractive glass sphere, commonly used in ray tracing research.[5] A minimal sketch of this configuration follows.
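As an illustrative sketch only (the labels below are placeholders, not the measured Cornell University data), the basic environment and its common contents can be summarized as a simple scene description:

```python
# Illustrative summary of the canonical Cornell box layout; the reflectance
# labels stand in for the measured spectra published by Cornell University.
cornell_box = {
    "light":      {"type": "area", "location": "center of ceiling"},
    "left_wall":  {"reflectance": "red"},
    "right_wall": {"reflectance": "green"},
    "back_wall":  {"reflectance": "white"},
    "floor":      {"reflectance": "white"},
    "ceiling":    {"reflectance": "white"},
    "objects":    ["short block (left)", "tall block (right)"],
}
```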
History
The original Cornell box
The original Cornell box was described by Cindy M. Goral, Kenneth E. Torrance, Donald P. Greenberg, and Bennett Battaile in their 1984 paper Modeling the Interaction of Light Between Diffuse Surfaces, presented at SIGGRAPH '84.[3]
In this initial version, the interior was painted in red, blue, and gray, and no occluding objects were placed inside the box. Rather than placing a light source inside, the box was illuminated indirectly using a set of lights and a white diffuse surface.
- The rendered image on the left, and the photographic reference on the right. The ray tracing program was written in C on a VAX-11/780 superminicomputer.
- Sketch of the scene configuration from the same paper, with the Cornell box referred to as a 'test cube'.
Hemi-cube form factors
This simulation of the Cornell box was carried out by Michael F. Cohen and Donald P. Greenberg for their 1985 paper The Hemi-Cube: A Radiosity Solution for Complex Environments, presented at SIGGRAPH '85.[6] The hemi-cube technique allowed form factors to be calculated with scan-conversion algorithms, which had hardware support at the time, and made it possible to compute shadows cast by occluding objects inside the scene.
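For context, the quantity the hemi-cube method approximates is the radiosity form factor between two surface patches, shown here in conventional notation (not the paper's exact formulation):

```latex
F_{ij} = \frac{1}{A_i} \int_{A_i} \int_{A_j}
         \frac{\cos\theta_i \, \cos\theta_j}{\pi r^2} \, V(x_i, x_j) \, dA_j \, dA_i
```

where A_i and A_j are the patch areas, θ_i and θ_j are the angles between the patch normals and the line connecting points x_i and x_j, r is the distance between those points, and V is a binary visibility term. The hemi-cube estimates this integral by scan-converting the scene onto the faces of a half-cube placed over the receiving patch.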
This version of the Cornell box was the first to feature objects placed inside: a short block on the left, a tall block on the right, and a light source in the center of the ceiling. This configuration matches the scene data[4] shared by Cornell University and discussed in the Scene data section. The layout became a standard reference and was widely reproduced, although in many rendered versions of the scene the arrangement of the blocks is mirrored.
Spherical harmonics
François X. Sillion, James Arvo, Stephen Westin, and Donald P. Greenberg made significant contributions to global illumination in computer graphics, particularly through the use of spherical harmonics.
In their paper A Global Illumination Solution for General Reflectance Distributions, presented at SIGGRAPH '91,[8] they demonstrated a method that extended light transport simulation to handle reflectance properties beyond ideal diffuse or specular surfaces. The approach used spherical harmonic decomposition to encode bidirectional reflectance distribution functions (BRDFs) and directional intensity distributions, allowing more accurate and efficient rendering of materials with complex reflectance characteristics. In this version of the Cornell box, the blocks are arranged in a flipped configuration, and the tall block features a mirror-like (aluminum) surface instead of a diffuse one.[8] This modification does not align with the scene data[4] shared by Cornell University and could contribute to the misconception described in the Scene data section.
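In general terms (conventional notation, not the paper's exact formulation), a directional function such as a BRDF slice is approximated by a truncated spherical harmonic expansion:

```latex
f(\theta, \varphi) \approx \sum_{l=0}^{L} \sum_{m=-l}^{l} c_{lm} \, Y_{lm}(\theta, \varphi),
\qquad
c_{lm} = \int_{\Omega} f(\omega) \, Y_{lm}(\omega) \, d\omega
```

where the Y_{lm} are the spherical harmonic basis functions and the coefficients c_{lm} compactly encode the directional distribution.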
The photographic and synthetic image references found in the SIGGRAPH '91 paper,[8] on a subpage of the Cornell University website named 'Cornell Box Comparison',[7] and in the officially provided reproducible data[4] are all distinct, even though they look nearly identical. Common misconceptions are discussed in the section below, but it is worth noting that while the synthetic image in this section differs entirely from the 'officially' provided synthetic reference,[4] the photographic reference closely resembles the 'official' photographic image discussed in the Photographic images section. Noticeable at first glance are the changed camera position and the addition of a square light-absorbing object, a so-called 'flag', often used in photography to achieve the opposite effect of a 'bounce card'. It was likely used to block distracting effects caused by the strong light source being directly visible to the camera. This absorbing object is not present in the officially provided photographic image.
Cornell University data and common misconceptions
Publicly released resources from Cornell University contain mixed data that has led to some misconceptions. Over the years, edits were made to the website,[9] but the changes are largely undocumented. The website was also created before web archiving services became widely available, so some updates may be neither accessible nor documented.
Synthetic image
A rendered image reference was uploaded by the Cornell University Program of Computer Graphics. The camera position and Cornell box configuration match the provided scene data,[4] but the short and tall blocks are mirrored horizontally. The synthetic image contains a shading artifact near the left corner of the short block, making it easily recognizable. The website provides no details on the ray tracing or rendering methods used, and no publicly available photograph of the original reference exists, aside from the photographic data discussed in the section below. As a result, a rendered image with artifacts, now over 30 years old, is still often used as a reference for recreating the scene, despite significant advancements in ray tracing software and hardware. Some examples found on this page were path traced with the unbiased Octane Render system, but deliberate inaccuracies had to be introduced to mimic the limitations of older software; without these adjustments, modern results would be too accurate and would deviate significantly from the available synthetic reference.
This creates something of a paradox: the Cornell box and its reference image played a role in research that paved the way for modern ray tracers, yet those same ray tracers must now be deliberately constrained when recreating the image, because they can produce results far more accurate and realistic than what was initially possible.
The synthetic Cornell box image reference was created at a time when rendering technology was still maturing. The limitations of the source data and the absence of the original photographic reference are key reasons why the Cornell box, in its currently available form, is no longer suitable for validating modern renderers. It is natural for test tools, assets, and references to become outdated once they have fulfilled their purpose or newer solutions appear. What is more paradoxical is that an old, already rendered image of an asset originally created to test and improve rendering is itself being recreated because the original reference is unavailable. Sharing the research data was a generous gesture from Cornell University, and today the scene is mostly recreated by people who find it interesting.
Over the years, advancements in software and hardware have enabled far more accurate reference gathering and measurement; today, even an average social media profile picture has a higher resolution than the provided synthetic reference. Despite this, the Cornell box remains an important historical artifact and an enduring icon. In many ways, it is the 'Hello, World!' of the ray tracing world.
Photographic images
Cornell University shared data that must be manually processed before being assembled into an image (e.g., through multispectral imaging). Few details are provided beyond the camera used, the seven narrow-band filters (capturing multiple wavelengths across the visible spectrum), and the applied post-processing (noise reduction and lens correction). The captured wavelengths are briefly noted in the filenames, and seven monochromatic datasets are available. The specific configuration and capture time are not listed, apart from a note that the data was taken 'in its current configuration', referring to the box's configuration at the time of capture. The data has been available on the website since at least 1998.
The 'final' processed image is not available for download, and no official or unofficial renditions have been released. To display it, the original data must first be interpreted and assembled. Because no examples are accessible online, an image was crafted specifically for this article and released on Wikimedia Commons, along with some additional context.
The images below were carefully recreated using the available data and are for illustrative purposes only. Different methods and techniques can be used, leading to a variety of possible results. Notably, the compiled image (the result of digitally processing the data) matches neither the synthetic image reference[𝓲 5] (e.g., the tall block has a highly specular surface) nor the scene data[𝓲 6] (e.g., the camera position has been offset to reduce a distracting reflection on the tall block).
Seven monochromatic datasets were captured at different visible-spectrum wavelengths. From the filenames, the captured wavelengths can be determined: 700 nm, 650 nm, 600 nm, 550 nm, 500 nm, 450 nm, and 400 nm. These datasets can be combined into red, green, and blue channels, which are then used to produce a polychromatic image. One method is to assign each wavelength to the closest corresponding color of the visible spectrum. The channels were assigned as follows (a processing sketch appears after the list):
- Red channel: 700 nm, 650 nm, 600 nm
- Green channel: 550 nm, 500 nm
- Blue channel: 450 nm, 400 nm
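As an illustration of one possible workflow (a minimal sketch; the filenames and plain averaging scheme are assumptions, not Cornell's published procedure), the seven bands could be combined into an RGB image as follows:

```python
import numpy as np

# Hypothetical filenames for the seven monochromatic captures.
bands = {wl: np.load(f"cornell_{wl}nm.npy") for wl in
         (700, 650, 600, 550, 500, 450, 400)}

# Average the bands assigned to each channel, per the list above.
red   = np.mean([bands[700], bands[650], bands[600]], axis=0)
green = np.mean([bands[550], bands[500]], axis=0)
blue  = np.mean([bands[450], bands[400]], axis=0)

# Stack into an H x W x 3 image and normalize to [0, 1] for display.
rgb = np.stack([red, green, blue], axis=-1)
rgb = rgb / rgb.max()
```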
No basic corrections were applied: no adjustments were made to contrast, saturation, white balance, and so on. No advanced corrections (such as lens correction for chromatic aberration) were performed. Minimal tonemapping was applied, and no pixels were cropped. Defective pixels and capture errors were retouched, though an unedited version is also available. The same principles apply to the synthetic renders, which were generated with a slightly offset camera, as the precise data for this particular shot is not provided; the recreated camera could not be calibrated exactly and is only approximately aligned with the short block.
Scene data
The original geometry data supplied by Cornell University clearly defines the positions of the objects, with the tall block on the right and the short block on the left, as shown in the Hemi-cube form factors section. While 3D applications and algorithms may interpret axes differently, causing potential confusion, this does not explain the specific misconception in this case. The original data is directly accessible and the correct configuration is known, yet it is unclear why a mirrored version (flipped along one axis) is more prominently used.
This change was first observed in the 'Spherical Harmonics' rendition of the box. Although the authors did not explicitly comment on this change, a plausible explanation could be that without the flipped layout and repositioned camera, the uneven reflection on the tall block would be too distracting. Additionally, if the green wall were visible in the reflection, it would not contrast as strongly as the red wall does on the white background.[𝓲 7] Repositioning the objects and camera could be a more efficient solution than recreating and repainting the entire box.
An accurate method for setting up the camera according to the original specifications is to set the focal length to 35 mm and the sensor size (film gate) to 25 mm, as illustrated below. Alternative setup methods, including the camera's position, rotation, and other relevant details, are provided in the description of the 3D Cornell box model featured in this section. The alternative coordinates that could be used to reproduce the repositioned camera were never published.
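A short sketch of what these optics imply: with the usual pinhole relation, a 35 mm focal length and a 25 mm film gate give a field of view of roughly 39 degrees.

```python
import math

focal_mm = 35.0  # focal length from the original specifications
gate_mm = 25.0   # sensor size (film gate)

# Standard pinhole relation: fov = 2 * atan(gate / (2 * focal))
fov_deg = 2 * math.degrees(math.atan(gate_mm / (2 * focal_mm)))
print(f"field of view ~ {fov_deg:.1f} degrees")  # ~ 39.3
```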
Historical context and applications
The Cornell box was developed in the early 1980s as part of research into radiosity, one of the first rendering techniques capable of simulating diffuse interreflection. This work laid the foundation for physically based rendering methods, played a pivotal role in validating global illumination algorithms, and inspired advances in ray tracing.
Later, the Cornell box was adapted for evaluating newer methods such as photon mapping, introduced by Henrik Wann Jensen in the 1990s,[10][11] which improved the simulation of caustics and indirect lighting. Modern applications extend to testing Monte Carlo path tracing, machine-learning-based rendering techniques, and other advanced approaches. It is also frequently used as a benchmark and remains an essential test scene in many rendering engines, such as Blender, Unreal Engine, and Arnold.[12]
Since its creation in the previous millennium, many companies have developed their own updated and higher-quality references for internal use, as advancements in technology have allowed for more detailed and accurate measurements. Although primarily used for computer graphics, variations of the Cornell box have also been employed in acoustics research to model sound reflections and validate simulation methods.[13]
See also
References
- ^ Niedenthal, Simon (2002-06-01). "Learning from the Cornell Box". Leonardo. 35 (3): 249–254. doi:10.1162/002409402760105235. ISSN 0024-094X. S2CID 57565464.
- ^ History of the Cornell Box
- ^ a b Cindy M. Goral, Kenneth E. Torrance, Donald P. Greenberg, and Bennett Battaile. Modeling the Interaction of Light Between Diffuse Surfaces Archived 2010-06-27 at the Wayback Machine. SIGGRAPH 1984.
- ^ a b c d e f Cornell Box Data
- ^ Jensen, Henrik Wann (2001). Realistic Image Synthesis Using Photon Mapping. A. K. Peters. ISBN 9781568811475.
- ^ a b Michael F. Cohen, Donald P. Greenberg. The Hemi-Cube: A Radiosity Solution for Complex Environments. SIGGRAPH 1985, Vol. 19, No. 3, pp. 31-40.
- ^ a b c Cornell Box Comparison subpage
- ^ a b c Sillion, François X. (1991). "A Global Illumination Solution for General Reflectance Distributions". SIGGRAPH ’91 Conference Proceedings. Las Vegas, United States: ACM: 187–196.
- ^ The Cornell Box website
- ^ Tamstorf, Rasmus; Jensen, Henrik Wann (1996). "Adaptive Sampling and Bias Estimation in Path Tracing" (PDF). Technical University of Denmark. Retrieved March 18, 2025.
- ^ Jensen, Henrik Wann; Christensen, Per (August 5, 2007). "High Quality Rendering using Ray Tracing and Photon Mapping" (PDF). SIGGRAPH 2007 Course 8. Pixar Animation Studios. Retrieved March 18, 2025.
- ^ Pharr, Matt (2016). Physically Based Rendering: From Theory to Implementation. Morgan Kaufmann. ISBN 9780123785800.
- ^ Tsingos, N.; Carlbom, I.; Elko, G.; Kubli, R.; Funkhouser, T. (2002-07-01). "Validating acoustical simulations in the Bell Labs Box" (PDF). IEEE Computer Graphics and Applications. 22 (4): 28–37. doi:10.1109/MCG.2002.1016696. ISSN 0272-1716.
Annotations (𝓲)
- ^ Recreation of the synthetic reference, as the original photographic reference doesn't exist. More info in the #Synthetic image section and the file description.
- ^ More info in the #Synthetic image section and the file description.
- ^ A render uploaded in 2003 by the original author of this page, User:SeeSchloss. More info in the file description.
- ^ Picture taken at 'The Science Behind Pixar Exhibition at the Telus World of Science in Edmonton'
- ^ Synthetic image reference provided by Cornell University, discussed in the Synthetic image section
- ^ Reference scene data provided by Cornell University, discussed in the Scene data section
- ^ Examples can be viewed in the #Photographic images section.