reglScatterplot() was designed for the size of typical
single-cell and spatial datasets, but it can push well past that.

| Point count | Status | Notes |
|---|---|---|
| 1 - 500 000 | Flawless | Below the auto performance-mode threshold; full interactivity |
| 500 k - 5 M | Smooth | performanceMode kicks in automatically |
| 5 M - 20 M | Usable | Use pointSize = 1, opacity = 1, drop pointLabels |
| 20 M - 100 M | Standalone HTML reaches RAM ceiling | Tile-based architectures (e.g. deepscatter) start to win |
| > 100 M | Out of reach in-browser | Server-side rendering / WebGPU territory |

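
The tiers in the table above can be turned into a quick planning check before building a widget. The sketch below is only an illustration: `tier_for()` is a hypothetical helper, not part of the package, and its boundaries are copied from the table rather than read from the widget's own performance-mode logic.

```r
# Hypothetical helper: map a point count to the status column of the table.
# Boundaries are taken from the table; right = TRUE means 500 000 is still "Flawless".
tier_for <- function(n) {
  cut(n,
    breaks = c(0, 5e5, 5e6, 2e7, 1e8, Inf),
    labels = c(
      "Flawless", "Smooth", "Usable",
      "Standalone HTML reaches RAM ceiling",
      "Out of reach in-browser"
    ),
    right = TRUE
  )
}

as.character(tier_for(c(1e4, 1e6, 1e7, 5e7, 2e8)))
#> [1] "Flawless"
#> [2] "Smooth"
#> [3] "Usable"
#> [4] "Standalone HTML reaches RAM ceiling"
#> [5] "Out of reach in-browser"
```
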
To keep large datasets shippable inside a standalone
htmlwidget, every numeric channel is binary-encoded and
base64-wrapped before transit:

| Channel | Encoder | Precision | Bytes / point |
|---|---|---|---|
| X / Y (normalised) | .toBase64U16() | 1 / 32 767 | 2 |
| Continuous color z | .toBase64U16Unit() | 1 / 65 535 | 2 |
| Categorical color z | .toBase64U16Int() | exact (< 65 536) | 2 |
| Filter ranges | toBase64() (Float32) | full f32 | 4 |

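
To make the 16-bit scheme concrete, here is a minimal round-trip sketch for a channel already scaled to [0, 1], similar in spirit to the continuous colour row. It does not reproduce the package's internal encoders: `encode_unit_u16()` and `decode_unit_u16()` are hypothetical names, the little-endian byte order is an assumption, and `jsonlite` is used only for its `base64_enc()` / `base64_dec()` functions.

```r
library(jsonlite) # only for base64_enc() / base64_dec()

# Hypothetical encoder: quantise [0, 1] values to 0..65535 and base64-wrap them.
encode_unit_u16 <- function(v) {
  stopifnot(all(v >= 0 & v <= 1))
  q <- as.integer(round(v * 65535))
  # Split each value into two bytes, low byte first (assumed little-endian).
  bytes <- as.raw(c(rbind(q %% 256L, q %/% 256L)))
  base64_enc(bytes)
}

# Hypothetical decoder: reverse the steps above.
decode_unit_u16 <- function(b64) {
  bytes <- as.integer(base64_dec(b64))
  lo <- bytes[c(TRUE, FALSE)]
  hi <- bytes[c(FALSE, TRUE)]
  (lo + 256L * hi) / 65535
}

set.seed(1)
v <- runif(1000)
b64 <- encode_unit_u16(v)

# Worst-case round-trip error is half a quantisation step, i.e. 0.5 / 65535.
max_err <- max(abs(decode_unit_u16(b64) - v))
max_err <= 0.5 / 65535 + 1e-12
#> [1] TRUE
```

The encoded string carries 2 bytes per point before base64 expansion, which is where the "Bytes / point" column comes from.
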
At 10 M points the resulting HTML file is around 80 - 90 MB - large but finite. The same data with Float32 everywhere would be ~150 MB.
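
These figures can be sanity-checked with a little arithmetic. The sketch below assumes three channels per point (x, y and one colour value) and accounts only for the 4/3 expansion of base64; the JavaScript bundle and JSON wrapping come on top, so the results are lower bounds. `payload_mb()` is a hypothetical helper, and it divides by 1024^2 to match the benchmark's convention.

```r
# Hypothetical helper: size of the base64 text for n points.
payload_mb <- function(n, bytes_per_point) {
  raw_bytes <- n * bytes_per_point
  b64_bytes <- 4 * ceiling(raw_bytes / 3) # base64 emits 4 chars per 3 bytes
  b64_bytes / 1024 / 1024
}

n <- 1e7
u16 <- payload_mb(n, 2 + 2 + 2) # x, y, colour as 16-bit values
f32 <- payload_mb(n, 4 + 4 + 4) # the same channels as Float32

round(c(u16 = u16, f32 = f32), 1)
#>   u16   f32
#>  76.3 152.6
```
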
library(reglScatterplotR)

bench_sizes <- c(1e4, 1e5, 1e6, 5e6)
for (n in bench_sizes) {
  df <- data.frame(x = rnorm(n), y = rnorm(n), v = runif(n))
  t0 <- Sys.time()
  w <- reglScatterplot(df,
    x = "x", y = "y", colorBy = "v",
    height = 600
  )
  # Serialise the widget data; the reported build time includes this step.
  payload <- htmlwidgets:::toJSON(w$x)
  cat(sprintf(
    "n = %s : build = %.2fs, payload = %.1f MB\n",
    # scientific = FALSE keeps every size in the same "1,000,000" style
    format(n, big.mark = ",", scientific = FALSE),
    as.numeric(Sys.time() - t0, units = "secs"),
    nchar(payload) / 1024 / 1024
  ))
  rm(df, w, payload)
  gc(verbose = FALSE)
}
#> n = 10,000 : build = 0.02s, payload = 0.1 MB
#> n = 100,000 : build = 0.04s, payload = 0.8 MB
#> n = 1,000,000 : build = 0.43s, payload = 7.8 MB
#> n = 5,000,000 : build = 1.34s, payload = 39.2 MB

On a 2020-era laptop with an RTX 2060, 5 M points take ~1.5 s on the R side and another ~2 s for the browser to parse and upload to the GPU; pan/zoom then runs at 60 fps.

The widget honours an explicit pixel height verbatim. If
the value exceeds the height of the host window (small browser tab,
RStudio Viewer in a tiling WM, narrow Jupyter notebook column, etc.),
the bottom of the canvas is clipped by the host - not by
reglScatterplot.
# Bad in small viewports: a 500 px tall widget overflows a 450 px window.
reglScatterplot(df, x = "x", y = "y", height = 500)
# Good: fill whatever vertical space is available.
reglScatterplot(df, x = "x", y = "y", height = "100%")
# Also good: omit `height` entirely - the sizingPolicy fills the viewer pane.
reglScatterplot(df, x = "x", y = "y")

Knitting to HTML produces a full-page document where the widget can take as much height as you give it, so the same code that clips in the Viewer pane prints cleanly in a knitted report. This is purely a viewport effect.

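
For readers unfamiliar with the term, a sizing policy is an ordinary htmlwidgets object that tells each host how to size a widget when no explicit dimensions are given. The sketch below builds one with `htmlwidgets::sizingPolicy()` purely for illustration; the values are assumptions and not necessarily the ones reglScatterplot uses.

```r
library(htmlwidgets)

# Illustrative policy only; not copied from the package.
policy <- sizingPolicy(
  defaultHeight = "100%", # CSS units are accepted
  viewer.fill = TRUE,     # fill the RStudio Viewer pane
  browser.fill = TRUE     # fill the window when opened in a browser
)

policy$viewer$fill
#> [1] TRUE
```

With a fill policy in place, the host decides the height, which is why omitting `height` avoids the clipping described above.
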
When you really want to push past 5 M, every per-point byte counts. Suggested defaults for huge inputs:
reglScatterplot(huge_df,
  x = "x", y = "y",
  pointSize = 1,          # one pixel per point
  opacity = 1,            # no blending math
  showAxes = FALSE,       # drops the D3 axis layer
  showTooltip = FALSE,    # frees per-point hit-test work
  enableDownload = FALSE, # no html2canvas / jsPDF download
  pointLabels = NULL      # don't ship gene names
)

Things you might think help but don’t:

* Reducing the vmin / vmax clip range - affects the colour scale only, not memory.
* Setting legendPosition = "bottom-left" - cosmetic, no performance impact.

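
To see why dropping labels matters, compare their per-point cost with the 2 bytes of a 16-bit numeric channel. The estimate below is a rough sketch under stated assumptions: the labels are made up, and it assumes they would travel as a plain JSON string array (two quotes and a comma per element), which this document does not confirm.

```r
n <- 1e5
labels <- sprintf("GENE%06d", seq_len(n)) # hypothetical 10-character labels

# Assumed JSON cost: the characters themselves plus 2 quotes and 1 comma each.
label_bytes <- sum(nchar(labels, type = "bytes")) + 3 * n

label_bytes / n # bytes per point for labels alone
#> [1] 13
```

Under these assumptions the labels alone cost roughly twice as much per point as x, y and colour combined (6 bytes before base64).
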
reglScatterplot is one of three credible options for
high-volume scatter plots in R; ggplot2 appears below as a small-data baseline. They do different jobs:

| Package | Interactive? | Best at | Limit |
|---|---|---|---|
| reglScatterplot | Yes | 1 - 20 M points in HTML / Shiny | Browser RAM / VRAM |
| plotly (+ toWebGL()) | Yes | < 500 k points, broad feature set | JSON payload bloats past 1 M |
| scattermore | No (static) | Quickly rasterising 10 M+ to a PNG | No pan / zoom interactivity |
| ggplot2 | No (static) | Publication graphics, small data | Practical ceiling ~50 k pts |

The right choice depends on what you need:

* Static figures (publication graphics, or a quick PNG of 10 M+ points): ggplot2 or scattermore.
* Interactive exploration of 1 - 20 M points in HTML / Shiny: reglScatterplot.
* Fewer than 500 k points with a broad feature set: plotly.

For genuinely huge data (multi-modal CosMx slides, whole-atlas integrations beyond ~50 M cells), no in-browser library is the right answer today. The viable paths include:

* deepscatter (Apache Arrow + Parquet tiles). Requires a server or a static tile directory.

For now, reglScatterplot covers the typical single-cell,
spatial and fold-change use cases comfortably. If you find yourself
loading the same 10 M+ dataset repeatedly, the right next step is to
switch to a tile server, not a faster scatterplot.

sessionInfo()
#> R version 4.6.0 (2026-04-24)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] reglScatterplotR_0.99.3
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.39 R6_2.6.1 fastmap_1.2.0 xfun_0.58
#> [5] maketools_1.3.2 cachem_1.1.0 knitr_1.51 htmltools_0.5.9
#> [9] rmarkdown_2.31 buildtools_1.0.0 lifecycle_1.0.5 cli_3.6.6
#> [13] viridisLite_0.4.3 sass_0.4.10 jquerylib_0.1.4 compiler_4.6.0
#> [17] sys_3.4.3 tools_4.6.0 evaluate_1.0.5 bslib_0.11.0
#> [21] yaml_2.3.12 otel_0.2.0 htmlwidgets_1.6.4 jsonlite_2.0.0
#> [25] rlang_1.2.0 crosstalk_1.2.2