author Wilfred Hughes <me@wilfred.me.uk> 2023-08-04 17:19:27 -0700
committer Wilfred Hughes <me@wilfred.me.uk> 2023-08-04 17:19:27 -0700
commit 892d4fdb5887e4aa7ba6a76c34987057518548ec
tree c9e86a8065eb6bbb8568b59f1bfdbef04b72d973
parent c937f819a18cc37dec12cc870397f08d48f020dd
Ensure size_hint never exceeds graph_limit
If we have thousands of syntax nodes on both sides, we can end up
attempting to preallocate a very large hashmap.

In #542, a user hit an issue with two JSON files where the LHS had
33,000 syntax nodes and the RHS had 34,000 nodes, so we'd attempt to
preallocate a hashmap of capacity 1,122,000,000. This required
allocating 70,866,960,400 bytes (roughly 66 GiB).

Impose a sensible limit on the hashmap.

Fixes #542
-rw-r--r-- CHANGELOG.md | 5
-rw-r--r-- src/diff/dijkstra.rs | 7
2 files changed, 11 insertions, 1 deletion
diff --git a/CHANGELOG.md b/CHANGELOG.md
index f5cd92ca9..95d158d4c 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,11 @@ prominent.
Improved syntax hightling for Java built-in types.
+### Diffing
+
+Fixed an issue with runaway memory usage when the two files input
+files had a large number of differences.
+
## 0.49 (release 26th July 2023)
### Parsing
diff --git a/src/diff/dijkstra.rs b/src/diff/dijkstra.rs
index 891203282..c7f5d4ef7 100644
--- a/src/diff/dijkstra.rs
+++ b/src/diff/dijkstra.rs
@@ -205,7 +205,12 @@ pub fn mark_syntax<'a>(
// graph whose size is roughly quadratic. Use this as a size hint,
// so we don't spend too much time re-hashing and expanding the
// predecessors hashmap.
- let size_hint = lhs_node_count * rhs_node_count;
+ //
+ // Cap this number to the graph limit, so we don't try to allocate
+ // an absurdly large (i.e. greater than physical memory) hashmap
+ // when there is a large number of nodes. We'll never visit more
+ // than graph_limit nodes.
+ let size_hint = std::cmp::min(lhs_node_count * rhs_node_count, graph_limit);
let start = Vertex::new(lhs_syntax, rhs_syntax);
let vertex_arena = Bump::new();
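
The effect of the cap can be checked in isolation with the numbers from the commit message. The following is a minimal sketch, not the code in `mark_syntax`: `capped_size_hint` is a hypothetical helper, the `graph_limit` value of 3,000,000 is an illustrative assumption rather than something this commit states, and `saturating_mul` is an addition of this sketch (the patch uses a plain `*`).

```rust
use std::collections::HashMap;

/// Hypothetical helper mirroring the capped size hint from the patch.
/// `saturating_mul` is an assumption of this sketch; the patch itself
/// multiplies the two counts with a plain `*`.
fn capped_size_hint(lhs_node_count: usize, rhs_node_count: usize, graph_limit: usize) -> usize {
    std::cmp::min(lhs_node_count.saturating_mul(rhs_node_count), graph_limit)
}

fn main() {
    // Node counts reported in #542.
    let lhs_node_count: usize = 33_000;
    let rhs_node_count: usize = 34_000;

    // Illustrative limit only; the real value comes from the caller.
    let graph_limit: usize = 3_000_000;

    // Before the patch: the hint is the full product of the node counts.
    let uncapped = lhs_node_count * rhs_node_count;
    println!("uncapped size hint: {}", uncapped);

    // After the patch: the hint never exceeds graph_limit.
    let size_hint = capped_size_hint(lhs_node_count, rhs_node_count, graph_limit);
    println!("capped size hint:   {}", size_hint);

    // Preallocating with the capped hint is cheap. The map can still
    // grow past the hint if more entries are inserted later.
    let predecessors: HashMap<u64, u64> = HashMap::with_capacity(size_hint);
    println!("allocated capacity: {}", predecessors.capacity());
}
```

When the product of the node counts is below `graph_limit`, the helper returns the product unchanged, so small inputs keep the original quadratic hint. The saturating multiply only matters for inputs large enough to overflow `usize`, where a plain `*` would panic in debug builds or wrap in release builds.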