-
Notifications
You must be signed in to change notification settings - Fork 6.1k
Open
Labels
.Team/QueryingPriority:P2Average run of the mill bugAverage run of the mill bugType:BugProduct defectsProduct defects
Description
Describe the bug
(mapv (comp first vector)
[1 2]
(concat (range 1000)
(for [_ (range 10)]
(throw (ex-info "you should not be here" {})))))
[1 2]
performance-test=>
(perf/mapv (comp first vector)
[1 2]
(concat (range 1000)
(for [_ (range 10)]
(throw (ex-info "you should not be here" {})))))
Execution error (ExceptionInfo) at metabase.util.performance-test/eval2368416$iter$fn$fn (REPL:604).
you should not be hereThis is a succinct demo of the bug. And the following will be an infinite loop:
(mapv (comp first vector)
[1 2]
(range))
[1 2]if you use perf/mapv
Issue: it uses count to determine if it's under chunking limits at <= 32 for optimizations. But will try to count infinite sequences or large sequences.
A trivial adaptation to use bounded-count has to update a helper
#?(:clj
(defn- smallest-count
(^long [c1 c2] (min (count c1) (count c2)))
(^long [c1 c2 c3] (min (count c1) (count c2) (count c3)))
(^long [c1 c2 c3 c4] (min (count c1) (count c2) (count c3) (count c4)))))and then if you put a bound parameter to each of these you run into one more issue:
performance-test=> (:clj
(defn- smallest-count
(^long [bound c1 c2] (min (bounded-count bound c1) (bounded-count bound c2)))
(^long [bound c1 c2 c3] (min (bounded-count bound c1) (bounded-count bound c2) (bounded-count bound c3)))
(^long [bound c1 c2 c3 c4] (min (bounded-count bound c1) (bounded-count bound c2) (bounded-count bound c3) (bounded-count bound c4)))))
Syntax error (IllegalArgumentException) compiling fn* at (REPL:609:21).
fns taking primitives support only 4 or fewer argsAt this point i kinda backed away.
diff --git a/src/metabase/util/performance.cljc b/src/metabase/util/performance.cljc
index 7e837d40a97..a201d8ff7a2 100644
--- a/src/metabase/util/performance.cljc
+++ b/src/metabase/util/performance.cljc
@@ -193,22 +193,22 @@
#?(:clj
(defn- smallest-count
- (^long [c1 c2] (min (count c1) (count c2)))
- (^long [c1 c2 c3] (min (count c1) (count c2) (count c3)))
- (^long [c1 c2 c3 c4] (min (count c1) (count c2) (count c3) (count c4)))))
+ (^long [bound c1 c2] (min (bounded-count bound c1) (bounded-count bound c2)))
+ (^long [bound c1 c2 c3] (min (bounded-count bound c1) (bounded-count bound c2) (bounded-count bound c3)))
+ (^long [bound c1 c2 c3 c4] (min (bounded-count bound c1) (bounded-count bound c2) (bounded-count bound c3) (bounded-count bound c4)))))
(defn mapv
"Drop-in replacement for `clojure.core/mapv`.
Iterates multiple collections more efficiently and uses Java iterators under the hood (the CLJ version). CLJS
version is only optimized for a single collection arity."
([f coll1]
- (let [n (count coll1)]
+ (let [n (bounded-count 33 coll1)]
(cond (= n 0) []
(<= n 32) (small-persistent! (reduce apply-and-small-conj! (small-transient n f) coll1))
:else (persistent! (reduce #(conj! %1 (f %2)) (transient []) coll1)))))
([f coll1 coll2]
#?(:clj
- (let [n (smallest-count coll1 coll2)]
+ (let [n (smallest-count 33 coll1 coll2)]
(cond (= n 0) []
(<= n 32) (small-persistent! (reduce apply-and-small-conj! (small-transient n f) coll1 coll2))
:else (persistent! (reduce #(conj! %1 (f %2 %3)) (transient []) coll1 coll2))))
@@ -216,7 +216,7 @@
(core/mapv f coll1 coll2)))
([f coll1 coll2 coll3]
#?(:clj
- (let [n (smallest-count coll1 coll2 coll3)]
+ (let [n (smallest-count 33 coll1 coll2 coll3)]
(cond (= n 0) []
(<= n 32) (small-persistent! (reduce apply-and-small-conj! (small-transient n f) coll1 coll2 coll3))
:else (persistent! (reduce #(conj! %1 (f %2 %3 %4)) (transient []) coll1 coll2 coll3))))
@@ -224,7 +224,7 @@
(core/mapv f coll1 coll2 coll3)))
([f coll1 coll2 coll3 coll4]
#?(:clj
- (let [n (smallest-count coll1 coll2 coll3 coll4)]
+ (let [n (smallest-count 33 coll1 coll2 coll3 coll4)]
(cond (= n 0) []
(<= n 32) (small-persistent! (reduce apply-and-small-conj! (small-transient n f) coll1 coll2 coll3 coll4))
:else (persistent! (reduce #(conj! %1 (f %2 %3 %4 %5)) (transient []) coll1 coll2 coll3 coll4))))
diff --git a/test/metabase/util/performance_test.cljc b/test/metabase/util/performance_test.cljc
index f438f2e58fe..5c2df7dff00 100644
--- a/test/metabase/util/performance_test.cljc
+++ b/test/metabase/util/performance_test.cljc
@@ -27,6 +27,15 @@
(= r 2) vec)))))]
(is (= (apply concat inputs) (apply perf/concat inputs))))))
+(deftest mapv-test
+ (testing "Does not realize long or infinite sequences"
+ (is (= [1 2]
+ (perf/mapv (comp first vector)
+ [1 2]
+ (concat (range 1000)
+ (for [_ (range 10)]
+ (throw (ex-info "you should not be here" {})))))))))
+
(defn- mapv-via-run! [f coll]
(let [v (volatile! [])]
(perf/run! #(vswap! v conj (f %)) coll)To Reproduce
see above
Expected behavior
No response
Logs
No response
Information about your Metabase installation
masterSeverity
p2
Additional context
bad footgun
Metadata
Metadata
Assignees
Labels
.Team/QueryingPriority:P2Average run of the mill bugAverage run of the mill bugType:BugProduct defectsProduct defects