Wilkinson's Grammar of Graphics
The Grammar of Graphics (GoG) is a grammar-based system for representing graphics that places grammatical constraints on the composition of data and information visualizations. A graphical grammar differs from a graphics pipeline in that it focuses on semantic components such as scales and guides, statistical functions, coordinate systems, marks, and aesthetic attributes.[1][2] For example, a bar chart can be converted into a pie chart by specifying a polar coordinate system, without any other change to the graphical specification.[1][3]
The grammar of graphics concept was launched by Leland Wilkinson in 2001 (Wilkinson et al., 2001; Wilkinson, 2005), and graphical grammars have since been written in a variety of languages with various parameterisations and extensions.[1] The major implementations of graphical grammars are nViZn, created by a team at SPSS/IBM; Polaris, which focuses on multidimensional relational databases and was commercialised as Tableau; a revised Layered Grammar of Graphics by Hadley Wickham, implemented in ggplot2; and Vega-Lite, a visualisation grammar with added interactivity.[4] The grammar of graphics continues to evolve through alternative parameterisations, extensions, and new specifications.
Wilkinson's Grammar of Graphics
Theory
Wilkinson conceived the seven elements of a graphic to be:[3][1]
- Variables: mapping of objects to values represented in a graphic
- Algebra: operations to combine variables and specify dimensions of graphs
- Geometry: creation of geometric graphs from variables
- Aesthetics: sensory attributes
- Statistics: functions to change the appearance and representation of graphs
- Scales: represent variables on measured dimensions
- Coordinates: mapping to coordinate systems
With these, Wilkinson hypothesised that:[3][1][5]
- These seven constructs are orthogonal, and virtually all known statistical charts can be generated from them relatively parsimoniously.
- This computational system is not a taxonomy of charts; rather, it describes the meaning of what we do when we construct statistical graphics.
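As a rough illustration of the orthogonality claim, the following Python sketch models a chart specification as a record with one field per construct. The `ChartSpec` class, its field values, and the `describe` helper are hypothetical and are not taken from Wilkinson's notation or from any library. The sketch only shows that turning a bar chart into a pie chart can be expressed as a change to the coordinate field and nothing else.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class ChartSpec:
    """Hypothetical record with one field per grammar construct."""
    variables: tuple     # names of the data variables used
    algebra: str         # how the variables are combined
    geometry: str        # kind of geometric graph, e.g. "interval"
    aesthetics: dict     # sensory attributes mapped to variables
    statistics: str      # statistical function applied
    scale: str           # how variables are measured on a dimension
    coordinates: str     # coordinate system


def describe(spec: ChartSpec) -> str:
    """Name the chart that results from a geometry/coordinate pairing."""
    names = {
        ("interval", "cartesian"): "bar chart",
        ("interval", "polar"): "pie chart",
    }
    return names.get((spec.geometry, spec.coordinates), "other chart")


bar = ChartSpec(
    variables=("category", "share"),
    algebra="category*share",
    geometry="interval",
    aesthetics={"color": "category"},
    statistics="identity",
    scale="linear",
    coordinates="cartesian",
)

# Only the coordinate system is swapped; every other construct is reused.
pie = replace(bar, coordinates="polar")

changed = [f.name for f in fields(bar)
           if getattr(bar, f.name) != getattr(pie, f.name)]
print(describe(bar), "->", describe(pie), "| changed:", changed)
```

Because the constructs are kept in separate fields, the two specifications differ in exactly one place, which is the sense in which the grammar describes charts by composition rather than by a fixed catalogue of chart types.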
Implementations
Wilkinson wrote SYSTAT, a statistical software package, in the early 1980s. This program was noted for its comprehensive graphics,[6] including the first software implementation of the heatmap display now widely used among biologists. After his company grew to 50 employees, he sold it to SPSS in 1995. At SPSS, he assembled a team of graphics programmers who developed the nViZn platform[7] that produces the visualizations in SPSS, Clementine, and other analytics products.
While at Stanford, Tableau founders Hanrahan and Stolte, as well as Diane Tang, created the predecessor to Tableau, named Polaris.[8] Polaris was a data visualization software tool, built with the support of a United States Department of Energy defense program, the Accelerated Strategic Computing Initiative (ASCI).[9][10] The main differences between Wilkinson's system and Polaris are the use of SQL relational algebra for database services and using shelves instead of cross and nest operators.[8]
Wickham's Layered Grammar of Graphics
Theory
Hadley Wickham conceived an alternate parameterisation of the syntax Wilkinson had derived, creating a layered grammar of graphics which he implemented as ggplot2 for users of the R programming language.[11][12] This added a hierarchy of defaults based around the idea of building up a graphic from multiple layers.[13][14] Wickham conceived these elements to be:
- Defaults: consists of data and mapping
  - Data: dataset
  - Mapping: aesthetic mappings
- Layer: consists of data, mapping, geom, stat, and position
  - Data: dataset, or inherit from defaults
  - Mapping: aesthetic mappings, or inherit from defaults
  - Geom: geometric object
  - Stat: statistical transformation
  - Position: position adjustment
- Scale: mapping of data to aesthetic attributes
- Coord: mapping of data to the plane of the plot
- Facet: split up the data
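The hierarchy of defaults can be sketched in a few lines of Python. The `Plot` and `Layer` classes below are hypothetical and are not the ggplot2 API. The sketch assumes that a layer with no data of its own inherits the plot's dataset, and that a layer's mapping is merged over the default mapping. That merge rule is an assumption chosen for illustration, not something the list above specifies.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Layer:
    """One layer: geom, stat and position, plus optional data and mapping."""
    geom: str
    stat: str = "identity"
    position: str = "identity"
    data: Optional[list] = None      # None means "inherit from defaults"
    mapping: Optional[dict] = None   # None means "inherit from defaults"


@dataclass
class Plot:
    """Plot-level defaults plus layers, scales, coord and facet."""
    data: list
    mapping: dict
    layers: list = field(default_factory=list)
    scales: dict = field(default_factory=dict)
    coord: str = "cartesian"
    facet: Optional[str] = None

    def add(self, layer: Layer) -> "Plot":
        self.layers.append(layer)
        return self

    def resolved_layers(self) -> list:
        """Fill in each layer's data and mapping from the plot defaults."""
        resolved = []
        for layer in self.layers:
            resolved.append({
                "geom": layer.geom,
                "stat": layer.stat,
                "position": layer.position,
                "data": self.data if layer.data is None else layer.data,
                # Assumed rule: layer mappings override or extend defaults.
                "mapping": {**self.mapping, **(layer.mapping or {})},
            })
        return resolved


cars = [
    {"wt": 2.6, "mpg": 21.0, "cyl": 6},
    {"wt": 3.2, "mpg": 18.7, "cyl": 8},
]
plot = Plot(data=cars, mapping={"x": "wt", "y": "mpg"})
plot.add(Layer(geom="point"))
plot.add(Layer(geom="line", stat="smooth", mapping={"colour": "cyl"}))

for layer in plot.resolved_layers():
    print(layer["geom"], layer["stat"], layer["mapping"])
```

Both layers end up drawing from the same dataset and share the `x` and `y` mappings without restating them, which is the practical benefit of the defaults hierarchy.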
Reception
Wilkinson is generally positive about Wickham's parameterisation and its implementation in ggplot2,[4] praising its elegance and expressivity whilst claiming that his original Grammar of Graphics is capable of representing a wider range of statistical graphics.[5][15]
Implementations
ggplot2 is the first implementation of a layered grammar of graphics in R, and implementations in other programming languages have followed. These include the direct ports plotnine for Python, gramm for MATLAB, Lets-Plot for Kotlin, and gadfly for Julia. Projects inspired by elements of Wickham's grammar include Vega-Lite, which specifies plots in JSON and uses a JavaScript engine, and Seaborn for Python.
Vega-Lite: A Grammar of Interactive Graphics
Theory
Vega-Lite combines ideas from Wilkinson's Grammar of Graphics and Wickham's Layered Grammar of Graphics with a composition algebra for layered and multi-view displays and a grammar of interaction. The Vega-Lite specification is instantiated in JSON and rendered by the lower-level Vega.[16] The graphical grammar implemented by Vega-Lite is composed of the following:
- Unit: consists of data, transforms, mark-type and encoding
  - Data: relational table consisting of records (rows) and named attributes (columns)
  - Transforms: data transformations
  - Mark-type: geometric object for visual encoding
  - Encodings: mapping of data attributes to visual mark properties, where each encoding consists of:
    - Channel: e.g. colour, shape, size, or text
    - Field: data attribute
    - Data-type: e.g. nominal, ordinal, quantitative, or temporal
    - Value: use a literal instead of a data-type
    - Functions: e.g. binning, aggregation, and sorting
    - Scale: maps from data domain to visual range
    - Guide: axis or legend for visualising scale
- Composite Views: compose views from multiple unit specifications with operators:
  - Layer: charts plotted on top of each other
  - Hconcat/Vconcat: place views side-by-side
  - Facet: subset data to produce a trellis plot
  - Repeat: multiple plots similar to facet but with full data replication in each cell
- Interaction: selections identify the set of points a user is interested in manipulating, with components:
  - Selection: get the minimal number of backing points
    - Name: reference
    - Type: how many backing values are stored
    - Predicate: determine the set of selected points e.g. single, list, interval
    - Domain|Range: store data domain or visual range
    - Event: e.g. mouseover, mousedown, mouseup
    - Init: initialise with specific backing points
    - Transforms: e.g. project, toggle, translate, zoom, and nearest
    - Resolve: resolve selections to union or intersect
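Because a Vega-Lite specification is plain JSON, the unit and composition parts of the grammar can be sketched with ordinary Python dictionaries. The example below is a minimal sketch rather than a complete or authoritative specification: the data values and field names (`category`, `amount`) are invented for illustration, the property names follow Vega-Lite's published JSON schema as far as I am aware, and the interaction components are left out.

```python
import json

# Data: a relational table given inline as records with named attributes.
data = {"values": [
    {"category": "A", "amount": 28},
    {"category": "B", "amount": 55},
    {"category": "C", "amount": 43},
]}

# Unit view: a mark-type plus encodings (channel -> field and data-type).
bars = {
    "mark": "bar",
    "encoding": {
        "x": {"field": "category", "type": "nominal"},
        "y": {"field": "amount", "type": "quantitative"},
    },
}

# Second unit view: a rule whose y encoding applies an aggregation function.
mean_rule = {
    "mark": "rule",
    "encoding": {
        "y": {"aggregate": "mean", "field": "amount", "type": "quantitative"},
    },
}

# Composite view: the layer operator plots both units on top of each other,
# with the data declared once at the top level.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": data,
    "layer": [bars, mean_rule],
}

text = json.dumps(spec, indent=2)
print(text)
```

The printed JSON is the whole chart description. Swapping the `layer` key for `hconcat` would, under the same assumptions, place the two views side by side instead of overlaying them.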
Implementations
Whilst Vega-Lite is the sole implementation of this grammar specification that compiles to Vega, other tools generate JSON files that Vega-Lite can interpret.
Related projects
- ggplot2[2] is an R package for plotting.
- Tableau Software[8] (originally known as Polaris) is commercial software built using the Grammar of Graphics.
- nViZn,[17] built by Wilkinson.
- SYSTAT (statistics package),[18] built by Wilkinson.
- ggpy, a ggplot port for Python,[19] which has not been updated since 20 November 2016.
- plotnine[20] started as an effort to improve the scalability of ggplot for Python and is largely compatible with ggplot2 syntax.
- Plotly, for interactive, online ggplot2 graphs.[21]
- gramm, a plotting class for MATLAB inspired by ggplot2.[22]
- gadfly, a system for plotting and visualization written in Julia, based largely on ggplot2.[23]
- Chart::GGPlot, a ggplot2 port in Perl,[24] which has not been updated since 16 March 2023.
- The Lets-Plot for Python library includes a native backend and a Python API, which was mostly based on the ggplot2 package.[25]
- Lets-Plot Kotlin API is an open-source plotting library for statistical data implemented in the Kotlin programming language, built on the principles of layered graphics first described in Leland Wilkinson's work The Grammar of Graphics.[26]
- ggplotnim, a plotting library for the Nim programming language inspired by ggplot2.[27]
- Vega and Vega-Lite are plotting libraries that use JSON to specify plots.
- chart-parts, a React-friendly Grammar of Graphics,[28][29] which has not been updated since 10 Dec 2021.
- g2,[30] a JavaScript library.
References
1. Myatt, Glenn J.; Johnson, Wayne P. (2012). Making sense of data III: a practical guide to designing interactive data visualizations. Hoboken, N.J: Wiley. ISBN 978-0-470-53649-0.
2. Wickham, Hadley (2016). ggplot2: elegant graphics for data analysis. Use R!. Carson Sievert (2nd ed.). Cham: Springer international publishing. ISBN 978-3-319-24275-0.
3. Wilkinson, Leland; Wills, Graham (2011). The grammar of graphics. Statistics and computing (2. ed., softcover reprint of the hardcover 2. ed., 2005 ed.). New York, NY: Springer. ISBN 978-1-4419-2033-1.
4. Wilkinson, Leland (2011). "ggplot2: Elegant Graphics for Data Analysis by WICKHAM, H." Biometrics. 67 (2): 678–679. doi:10.1111/j.1541-0420.2011.01616.x.
5. "Episode #201: Leland Wilkinson". PolicyViz. Retrieved 2025-07-13.
6. Jarrett, Jeffrey (1992). "SYSTAT/SYGRAPH and Micro-TSP". Statistics and Computing. 2 (4): 231–236. doi:10.1007/BF01889683. S2CID 119499841.
7. Jones, Lacey; Symanzik, Jürgen. "Statistical Visualization of Environmental Data on the Web Using nViZn" (PDF). Archived from the original (PDF) on October 5, 2011.
8. Stolte, C.; Tang, D.; Hanrahan, P. (2002). "Polaris: a system for query, analysis, and visualization of multidimensional relational databases". IEEE Transactions on Visualization and Computer Graphics. 8 (1): 52–65. Bibcode:2002ITVCG...8...52S. doi:10.1109/2945.981851.
9. "Polaris: Database and Data Cube Visualization". graphics.stanford.edu. Retrieved 2023-04-30.
10. Stolte, C.; Tang, D.; Hanrahan, P. (January 2002). "Polaris: a system for query, analysis, and visualization of multidimensional relational databases". IEEE Transactions on Visualization and Computer Graphics. 8 (1): 52–65. Bibcode:2002ITVCG...8...52S. doi:10.1109/2945.981851. ISSN 1941-0506.
11. Wickham, Hadley (2025-03-22), hadley/ggplot1, retrieved 2025-07-13
12. "Create Elegant Data Visualisations Using the Grammar of Graphics". ggplot2.tidyverse.org. Retrieved 2025-07-13.
13. Wickham, Hadley Alexander (2008). Practical tools for exploring data and models (Doctor of Philosophy thesis). Ames: Iowa State University, Digital Repository. doi:10.31274/rtd-180813-16852.
14. Wickham, Hadley (2010). "A Layered Grammar of Graphics". Journal of Computational and Graphical Statistics. 19 (1): 3–28. doi:10.1198/jcgs.2009.07098. ISSN 1061-8600.
15. Chang, Winston (2018). R graphics cookbook: practical recipes for visualizing data (2nd ed.). Beijing ; Boston: O'Reilly. ISBN 978-1-4919-7860-3.
16. Satyanarayan, Arvind; Moritz, Dominik; Wongsuphasawat, Kanit; Heer, Jeffrey (2017). "Vega-Lite: A Grammar of Interactive Graphics". IEEE Transactions on Visualization and Computer Graphics. 23 (1): 341–350. Bibcode:2017ITVCG..23..341S. doi:10.1109/TVCG.2016.2599030. ISSN 1077-2626. PMID 27875150.
17. "nViZn page". www.cs.uic.edu. Archived from the original on 2022-01-20. Retrieved 2025-07-13.
18. "SYSTAT page". www.cs.uic.edu. Archived from the original on 2024-04-18. Retrieved 2025-07-13.
19. "yhat/ggpy: ggplot port for python". GitHub. yhat. Retrieved 2024-02-01.
20. "plotnine". Archived from the original on 2 August 2023. Retrieved 2 August 2023.
21. "Plotly graphing library for ggplot2 in ggplot2". Plotly Graphing Libraries. Plotly. Retrieved 2024-02-01.
22. "ggplot for Matlab". GitHub. Pierre Morel (@piermorel). Retrieved 11 December 2015.
23. "Gadfly.jl". Gadfly.jl. Retrieved 11 September 2018.
24. "Stephan Loyd/Chart-GGPlot-0.0001". MetaCPAN. Retrieved 30 March 2019.
25. "JetBrains/lets-plot". GitHub. JetBrains. Retrieved 3 April 2021.
26. "JetBrains/lets-plot-kotlin". GitHub. JetBrains. Retrieved 4 April 2021.
27. "ggplotnim". GitHub. Vindaar. Retrieved 1 August 2023.
28. microsoft/chart-parts, Microsoft, 2025-06-26, retrieved 2025-07-13
29. "Introduction to Chart Parts". microsoft.github.io. Retrieved 2025-07-29.
30. antvis/G2, AntV Visualization Team, 2025-07-15, retrieved 2025-07-16