clojure - Best way to lazily collapse multiple contiguous items of a sequence into a single item -
clojure - Best way to lazily collapse multiple contiguous items of a sequence into a single item -
[note: title , text heavily edited create more clear i'm not particularly after strings, after general sequences, , lazy processing of same]
using character sequences / strings example, want turn string like
"\t a\r s\td \t \r \n f \r\n"
into
" s d f "
in more general terms, want turn contiguous whitespace (or other arbitray set of items) in sequence single item, , lazily.
i've come next partition-by/mapcat combo, wonder if there easier or otherwise improve ways (readability, performance, anything) accomplish same thing.
(defn is-wsp? [c] (if (#{\space \tab \newline \return} c) true)) (defn collapse-wsp [coll] (mapcat (fn [[first-elem :as s]] (if (is-wsp? first-elem) [\space] s)) (partition-by is-wsp? coll)))
in action:
=> (apply str (collapse-wsp "\t a\r s\td \t \r \n f \r\n")) " s d f "
update: used strings / character sequences / wsp, example, want generic function on sequences of type collapses arbitrary numbers of contiguous items, part of predefined set of items, single predefined item. i'm particularly interested know if there improve alternatives partition-by/mapcat, not much if can optimized 'string' special case.
update 2:
here's lazy version - 1 above isn't lazy, fear, apart doing redundant is-wsp? checks. generalized parameter names etc. doesn't replace string.whatever() phone call - it's arbitrary sequences.
(defn lazy-collapse ([coll is-collapsable-item? collapsed-item-representation] (lazy-collapse coll is-collapsable-item? collapsed-item-representation false)) ([coll is-collapsable-item? collapsed-item-representation in-collapsable-segment?] (let [step (fn [coll in-collapsable-segment?] (when-let [item (first coll)] (if (is-collapsable-item? item) (if in-collapsable-segment? (recur (rest coll) true) (cons collapsed-item-representation (lazy-collapse (rest coll) is-collapsable-item? collapsed-item-representation true))) (cons item (lazy-collapse (rest coll) is-collapsable-item? collapsed-item-representation false)))))] (lazy-seq (step coll in-collapsable-segment?)))))
this fast, lazy, i'd able express more concisely, quite lazy myself.
benchmarks of lazy collapsers far: whether code readable or not easy justice looking @ code, in order see how compare in terms of performance, here benchmarks. first check if function supposed do, , spit out how long takes to
create lazy seq 1m times create lazy seq , take first item 1m times create lazy seq , take sec item 1m times create lazy seq , take lastly item (i.e. realize lazy seq) 1m timestests 1 through 3 meant gauge laziness @ to the lowest degree little bit. ran test couple of times, , there no important variation in execution times.
user=> (map (fn [collapse] (println (class collapse) (str "|" (apply str (collapse test-str is-wsp? \space)) "|")) (time (dotimes [_ 1000000] (collapse test-str is-wsp? \space))) (time (dotimes [_ 1000000] (first (collapse test-str is-wsp? \space)))) (time (dotimes [_ 1000000] (second (collapse test-str is-wsp? \space)))) (time (dotimes [_ 1000000] (last (collapse test-str is-wsp? \space))))) [collapse-overthink collapse-smith collapse-normand lazy-collapse]) user$collapse_overthink | s d f | "elapsed time: 153.490591 msecs" "elapsed time: 3064.721629 msecs" "elapsed time: 4337.932487 msecs" "elapsed time: 24797.222682 msecs" user$collapse_smith | s d f | "elapsed time: 141.474904 msecs" "elapsed time: 812.998848 msecs" "elapsed time: 2112.331739 msecs" "elapsed time: 10750.224816 msecs" user$collapse_normand | s d f | "elapsed time: 314.978309 msecs" "elapsed time: 1423.779761 msecs" "elapsed time: 1669.660257 msecs" "elapsed time: 8074.759077 msecs" user$lazy_collapse | s d f | "elapsed time: 169.906088 msecs" "elapsed time: 638.030401 msecs" "elapsed time: 1195.445016 msecs" "elapsed time: 6050.945856 msecs"
bottom line far: nicest code slowest, ugliest code fastest. i'm pretty sure doesn't have way...
a lazy solution, written in recursive style:
(defn collapse [l p v] (cond (nil? (seq l)) nil (p (first l)) (lazy-seq (cons v (collapse (drop-while p l) p v))) :otherwise (lazy-seq (cons (first l) (collapse (rest l) p v)))))
l
list, p
predicate, , v
value replace subsequences match predicate with.
if you're after pure speed @ expense of readability, can this:
(defn collapse-normand2 ([l p v] (lazy-seq (collapse-normand2 (seq l) p v nil))) ([l p v _] (when l (lazy-seq (let [f (first l) r (next l)] (if (p f) (cons v (collapse-normand2 r p v nil nil)) (cons f (collapse-normand2 r p v nil))))))) ([l p v _ _] (when l (lazy-seq (let [f (first l) r (next l)] (if (p f) (collapse-normand2 r p v nil nil) (collapse-normand2 r p v nil)))))))
there's way create more readable. on 4 tests.
clojure
Comments
Post a Comment