Haskell iteratee: simple worked example of stripping trailing whitespace
I'm trying to understand how to use the iteratee library with Haskell. All of the articles I've seen so far seem to focus on building an intuition for how iteratees could be built, which is helpful, but now that I want to get down and actually use them, I feel a bit at sea. Looking at the source code for iteratees has been of limited value to me.
Let's say I have this function, which trims trailing whitespace from a line:
    import Data.ByteString.Char8
    import Data.Char (isSpace)

    rstrip :: ByteString -> ByteString
    rstrip = fst . spanEnd isSpace
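To make the behaviour of `rstrip` concrete before any iteratees get involved, here is a small standalone sketch. It uses only `bytestring` and `base`; the qualified import and the sample strings are my own choices, not part of the question.

```haskell
import qualified Data.ByteString.Char8 as B
import Data.Char (isSpace)

-- Same definition as in the question, with qualified names.
rstrip :: B.ByteString -> B.ByteString
rstrip = fst . B.spanEnd isSpace

main :: IO ()
main = do
  -- Trailing spaces and tabs are removed; leading and inner whitespace stays.
  print (rstrip (B.pack "  foo bar \t "))
  -- A line made only of whitespace becomes empty.
  print (rstrip (B.pack "   "))
```

`spanEnd` splits off the longest suffix that satisfies the predicate, so taking `fst` keeps everything before the trailing whitespace.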
What I'd like to do is: make this into an iteratee, read a file, and write it out somewhere else with the trailing whitespace stripped from each line. How would I go about structuring that with iteratees? I see there's an enumLinesBS function in Data.Iteratee.Char that I could plumb into this, but I don't know if I should use mapChunks or convStream, or how to repackage the function above into an iteratee.
If you just want code, it's this:

    procFile' iFile oFile = fileDriver (joinI $
       enumLinesBS ><> mapChunks (map rstrip) $
       I.mapM_ (B.appendFile oFile)
      ) iFile
Commentary:
This is a three-stage process: first you transform the raw stream into a stream of lines, then you apply your function to convert that stream of lines, and finally you consume the stream. Since rstrip is in the middle stage, it will be creating a stream transformer (an Enumeratee).
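For orientation, the same three stages can be written as ordinary function composition over a strict ByteString. This is only a comparison sketch and does not use the iteratee library: `stripLines` is a name I made up, it holds the whole input in memory rather than streaming it, and it joins the lines back together with newlines in the final stage.

```haskell
import qualified Data.ByteString.Char8 as B
import Data.Char (isSpace)

rstrip :: B.ByteString -> B.ByteString
rstrip = fst . B.spanEnd isSpace

-- Stage 1: split raw bytes into lines (B.lines drops the '\n').
-- Stage 2: transform each line.
-- Stage 3: consume the lines; here they are joined back with '\n'.
stripLines :: B.ByteString -> B.ByteString
stripLines = B.unlines . map rstrip . B.lines

main :: IO ()
main = B.putStr (stripLines (B.pack "first  \nsecond\t\nthird\n"))
```

The iteratee version has the same shape, except that each stage works on chunks as they arrive instead of on one complete value.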
You can use either mapChunks or convStream, but mapChunks is simpler. The difference is that mapChunks doesn't allow you to cross chunk boundaries, whereas convStream is more general. I prefer convStream because it doesn't expose any of the underlying implementation, but if mapChunks is sufficient the resulting code is usually shorter.
    rstripE :: Monad m => Enumeratee [ByteString] [ByteString] m a
    rstripE = mapChunks (map rstrip)
Note the extra map in rstripE. The outer stream (which is the input to rstrip) has type [ByteString], so we need to map rstrip onto it.
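For comparison, this is what it would look like if implemented with convStream:

    rstripE' :: Monad m => Enumeratee [ByteString] [ByteString] m a
    rstripE' = convStream $ do
      mLine <- I.peek
      -- the inner iteratee returns a chunk of the output stream, hence the lists
      maybe (return []) (\line -> I.drop 1 >> return [rstrip line]) mLine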
This is longer, and it's less efficient because it only applies the rstrip function to one line at a time, even though more lines may be available. It's possible to work on all of the currently available chunk, which is closer to the mapChunks version:

    rstripE'2 :: Monad m => Enumeratee [ByteString] [ByteString] m a
    rstripE'2 = convStream (liftM (map rstrip) getChunk)
Anyway, with the stripping enumeratee available, it's easily composed with the enumLinesBS enumeratee:

    enumStripLines :: Monad m => Enumeratee ByteString [ByteString] m a
    enumStripLines = enumLinesBS ><> rstripE
The composition operator ><> follows the same order as the arrow operator >>>. enumLinesBS splits the stream into lines, then rstripE strips them. Now you just need to add a consumer (which is a normal iteratee), and you're done:
    writer :: FilePath -> Iteratee [ByteString] IO ()
    writer fp = I.mapM_ (B.appendFile fp)

    processFile iFile oFile =
      enumFile defaultBufSize iFile (joinI $ enumStripLines $ writer oFile) >>= run
The fileDriver functions are shortcuts for simply enumerating over a file and running the resulting iteratee (unfortunately the argument order is switched from enumFile):

    procFile2 iFile oFile = fileDriver (joinI $ enumStripLines $ writer oFile) iFile
Addendum: here's a situation where you would need the extra power of convStream. Suppose you want to concatenate every 2 lines into one. You can't use mapChunks. Consider when the chunk holds a single element, [bytestring]. mapChunks doesn't provide any way to access the next chunk, so there's nothing else to concatenate with this. With convStream, however, it's simple:
    concatPairs = convStream $ do
      line1 <- I.head
      line2 <- I.head
      return $ line1 `B.append` line2
This looks even nicer in applicative style:

    convStream $ B.append <$> I.head <*> I.head
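The chunk-boundary problem can be seen in a plain-list model, with no iteratee machinery. This is only a sketch: `pairUp`, `perChunk`, `wholeStream` and the sample chunks are hypothetical names and data, and keeping an unpaired leftover line is my assumption rather than what the `I.head` version above does.

```haskell
import qualified Data.ByteString.Char8 as B

-- Join neighbouring lines two at a time; a leftover line is kept as-is
-- (an assumption of this sketch, not taken from the answer's code).
pairUp :: [B.ByteString] -> [B.ByteString]
pairUp (x:y:rest) = B.append x y : pairUp rest
pairUp rest       = rest

-- Hypothetical input: the same six lines delivered as three chunks.
chunks :: [[B.ByteString]]
chunks = map (map B.pack) [["a", "b", "c"], ["d"], ["e", "f"]]

-- Chunk-local view: each chunk is processed in isolation.
perChunk :: [B.ByteString]
perChunk = concatMap pairUp chunks

-- Stream-level view: chunk boundaries are invisible.
wholeStream :: [B.ByteString]
wholeStream = pairUp (concat chunks)

main :: IO ()
main = do
  print perChunk
  print wholeStream
```

The chunk-local version pairs "a" with "b" but leaves "c" and "d" stranded on either side of a boundary, while the stream-level version pairs them. That second behaviour is what you need convStream for.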
You can think of convStream as continually consuming a portion of the stream with the provided iteratee, then sending the transformed version to the inner consumer. Sometimes even this isn't general enough, since the same iteratee is called at each step. In that case, you can use unfoldConvStream to pass state between successive iterations.
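As a rough picture of what "passing state between successive iterations" buys you, here is a pure-list sketch that numbers lines across chunk boundaries with `mapAccumL` from `Data.List`. It is not the signature or implementation of unfoldConvStream, just the same idea of threading an accumulator from one step to the next; `numberChunk` and the sample chunks are invented for illustration.

```haskell
import qualified Data.ByteString.Char8 as B
import Data.List (mapAccumL)

-- Number the lines of one chunk, starting from the counter carried in.
-- Returns the updated counter together with the transformed chunk.
numberChunk :: Int -> [B.ByteString] -> (Int, [B.ByteString])
numberChunk n ls = (n + length ls, zipWith tag [n ..] ls)
  where
    tag i l = B.append (B.pack (show i ++ ": ")) l

-- Hypothetical input: three lines delivered as two chunks.
chunks :: [[B.ByteString]]
chunks = map (map B.pack) [["a", "b"], ["c"]]

main :: IO ()
main = do
  -- The counter left over from the first chunk seeds the second one.
  let (next, numbered) = mapAccumL numberChunk 1 chunks
  mapM_ B.putStrLn (concat numbered)
  print next
```

A stateless per-step conversion would have to restart the numbering for every chunk; carrying the counter along is what lets it continue across the boundary.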
convStream and unfoldConvStream also allow for monadic actions, since the stream processing iteratee is a monad transformer.