Dealing with the surprising lack of ParList in scala.collections.parallel
So Scala 2.9 recently turned up in Debian testing, bringing the newfangled parallel collections with it.
Suppose I have some code equivalent to
def expensiveFunction(x: Int): Int = {...}
def process(s: List[Int]): List[Int] = s.map(expensiveFunction)
Now, from the teeny bit I'd gleaned about parallel collections before the docs turned up on my machine, I was expecting to parallelize this just by switching the List to a ParList ... but to my surprise, there isn't one! (Just ParVector, ParMap, ParSet ...).
As a workaround, this (or a one-line equivalent) seems to work well enough:
def process(s: List[Int]): List[Int] = {
  val ps = scala.collection.parallel.immutable.ParVector() ++ s
  val pr = ps.map(expensiveFunction)
  List() ++ pr
}
This yields an approximately 3x performance improvement in my test code, with massively higher CPU usage (quad core plus hyperthreading i7). But it seems kind of clunky.
My question is sort of an aggregate:
Why isn't there a ParList?
Given there isn't a ParList, is there a better pattern/idiom I should adopt so that I don't feel like one is missing?
Am I just "behind the times" using Lists a lot in my Scala programs (like all the Scala books I bought back in the 2.7 days taught me), and should I actually be making more use of Vectors? (I mean, in C++ land I'd generally need a pretty good reason to use std::list over std::vector.)
First, let me show you how to make a parallel version of that code:
def expensiveFunction(x: Int): Int = {...}
def process(s: List[Int]): Seq[Int] = s.par.map(expensiveFunction).seq
That will have Scala figure things out for you and, by the way, it uses ParVector. If you really want a List, call .toList instead of .seq.
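As a rough, runnable sketch of the .par approach: the body of expensiveFunction below is a hypothetical stand-in (the original is elided), and the object and method names are made up for illustration. It assumes Scala 2.9 through 2.12, where .par ships with the standard library; on 2.13 and later it needs the separate scala-parallel-collections module plus an import of scala.collection.parallel.CollectionConverters._.

```scala
object ParMapSketch {
  // Hypothetical stand-in for the elided expensiveFunction: a CPU-bound loop.
  def expensiveFunction(x: Int): Int = {
    var acc = x
    var i = 0
    while (i < 100000) { acc = (acc * 31 + i) % 1000003; i += 1 }
    acc
  }

  // Sequential baseline, as in the question.
  def processSequential(s: List[Int]): List[Int] = s.map(expensiveFunction)

  // Parallel version: convert with .par, map, then come back to a List.
  def processParallel(s: List[Int]): List[Int] =
    s.par.map(expensiveFunction).toList

  def main(args: Array[String]): Unit = {
    val input = (1 to 1000).toList
    // map on a parallel sequence keeps element order, so the results agree.
    assert(processParallel(input) == processSequential(input))
    println("parallel and sequential results match")
  }
}
```

Any speedup depends on how expensive the function really is and on the number of cores, so it is worth timing both versions on real data.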
As for the questions:
There isn't a ParList because a List is an intrinsically non-parallel data structure: any operation on it requires traversal.
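To make the traversal point concrete, here is a small sketch (the helper stepsTo and the object name are hypothetical, written only for this illustration). It counts how many cells have to be walked to reach a given index in a List, which is the same cost a parallel framework would pay just to split the list into chunks.

```scala
object ListTraversalSketch {
  // Walk the list one cell at a time and count the steps needed to reach index n.
  @annotation.tailrec
  def stepsTo[A](xs: List[A], n: Int, steps: Int = 0): Int =
    if (n == 0 || xs.isEmpty) steps
    else stepsTo(xs.tail, n - 1, steps + 1)

  def main(args: Array[String]): Unit = {
    val xs = List(10, 20, 30, 40)
    // Reaching the last element means passing every element before it.
    println(stepsTo(xs, 3))
    // Splitting in half also walks the first half before any work can start.
    val (left, right) = xs.splitAt(xs.length / 2)
    println(left)
    println(right)
  }
}
```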
You should code to traits instead of classes: Seq, ParSeq and GenSeq, for example. Even the performance characteristics of a List are guaranteed by LinearSeq.
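A minimal sketch of what coding to a trait can look like, assuming Scala 2.9 through 2.12 (GenSeq was removed in 2.13, where parallel collections also moved to a separate module). The function squares and the object name are hypothetical. The caller decides whether the work runs sequentially or in parallel; the function itself does not care.

```scala
import scala.collection.GenSeq

object TraitSketch {
  // Written against GenSeq, so it accepts List, Vector, and parallel sequences alike.
  def squares(xs: GenSeq[Int]): GenSeq[Int] = xs.map(x => x * x)

  def main(args: Array[String]): Unit = {
    val data = List(1, 2, 3, 4)
    println(squares(data).toList)                // sequential List
    println(squares(Vector(1, 2, 3, 4)).toList)  // sequential Vector
    println(squares(data.par).toList)            // parallel sequence
  }
}
```

The trade-off is that code written against GenSeq must not rely on side effects running in order, since the argument may be a parallel collection.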
Books written before Scala 2.8 did not have the new collections library in mind. In particular, the collections really didn't share a consistent and complete API. Now they do, and you'll gain much by taking advantage of it.
Furthermore, there wasn't a collection like Vector in Scala 2.7: an immutable collection with (near) constant indexed access.
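For readers coming from List, a short sketch of the Vector operations this refers to (the object and value names are made up for illustration). Indexed reads and functional updates are both effectively constant time, and the original Vector is never modified.

```scala
object VectorSketch {
  val v: Vector[Int] = Vector(10, 20, 30, 40)

  // Indexed read: no need to walk the preceding elements.
  val third: Int = v(2)

  // Functional update: returns a new Vector, leaving v untouched.
  val v2: Vector[Int] = v.updated(2, 99)

  def main(args: Array[String]): Unit = {
    println(third)
    println(v)
    println(v2)
  }
}
```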
list scala map parallel-processing scala-collections