streaming-bytestring.cabal

name:                streaming-bytestring
version:             0.1.4.6
synopsis:            effectful byte steams, or: bytestring io done right.

description:         This is an implementation of effectful, memory-constrained 
                     bytestrings (byte streams) and functions for streaming 
                     bytestring manipulation, adequate for non-lazy-io.  
                     Some examples of the use of byte streams to implement simple 
                     shell progams can be found 
                     <https://gist.github.com/michaelt/6c6843e6dd8030e95d58 here>.
                     See also the illustrations of use with e.g. @attoparsec@,
                     @aeson@, @http-client@, @zlib@ etc. in the 
                     <https://hackage.haskell.org/package/streaming-utils streaming-utils> 
                     library.  Usage is as close as possible to that of @ByteString@ 
                     and lazy @ByteString@. 
                     .
                     A @ByteString IO ()@ is the most natural representation of
                     an effectful stream of bytes arising chunkwise from a handle.
                     Indeed, the implementation follows the
                     details of @Data.ByteString.Lazy@ and @Data.ByteString.Lazy.Char8@
                     in unrelenting detail, omitting only transparently non-streaming
                     operations like @reverse@. It is just a question of replacing 
                     the lazy bytestring type:
                     .
                     > data ByteString     = Empty   | Chunk Strict.ByteString ByteString
                     .
                     with the /minimal/ effectful variant:
                     .
                     > data ByteString m r = Empty r | Chunk Strict.ByteString (ByteString m r) | Go (m (ByteString m r))
                     .
                     (Constructors are necessarily hidden in internal modules in both the @Lazy@ and the @Streaming@.) 
                     .
                     That's it. As a lazy bytestring is implemented internally 
                     by a sort of list of strict bytestring chunks, a streaming bytestring is 
                     simply implemented as a /producer/ or /generator/ of strict bytestring chunks.
                     Most operations are defined by simply adding a line to what we find in
                     @Data.ByteString.Lazy@. The only possible simplification would 
                     involve specializing to @IO@, throughout - but this would e.g. block
                     the use of @ResourceT@ to manage handles and the like, and a number
                     of other convenient operations like @copy@, which permits one to 
                     apply two operations simultaneously over the length of the byte stream.
                     .
                     Something like this alteration of type is of course obvious and mechanical, once the idea of
                     an effectful bytestring type is contemplated and lazy io is rejected.
                     Indeed it seems that this is the proper expression of what was
                     intended by lazy bytestrings to begin with. The documentation, after all,
                     reads
                     .
                     * \"A key feature of lazy ByteStrings is the means to manipulate large or 
                       unbounded streams of data without requiring the entire sequence to be 
                       resident in memory. To take advantage of this you have to write your 
                       functions in a lazy streaming style, e.g. classic pipeline composition. 
                       The default I/O chunk size is 32k, which should be good in most circumstances.\"
                     .
                     ... which is very much the idea of this library: the default chunk size for
                     'hGetContents' and the like follows @Data.ByteString.Lazy@; operations
                     like @lines@ and @append@ and so on are tailored not to increase chunk size. 
                     .
                     The present library is thus if you like nothing but /lazy bytestring done right/. 
                     The authors of @Data.ByteString.Lazy@ must have supposed that 
                     the directly monadic formulation of such their type 
                     would necessarily make things slower. This appears to be a prejudice. 
                     For example, passing a large file of short lines through
                     this benchmark transformation
                     .
                     > Lazy.unlines      . map    (\bs -> "!"       <> Lazy.drop 5 bs)       . Lazy.lines
                     > Streaming.unlines . S.maps (\bs -> chunk "!" >> Streaming.drop 5 bs)  . Streaming.lines
                     .
                     gives pleasing results like these
                     .
                     > $  time ./benchlines lazy >> /dev/null
                     > real	0m2.097s
                     > ...
                     > $  time ./benchlines streaming >> /dev/null
                     > real	0m1.930s
                     .
                     For a more sophisticated operation like
                     .
                     > Lazy.intercalate "!\n"      . Lazy.lines
                     > Streaming.intercalate "!\n" . Streaming.lines
                     .
                     we get results like these:
                     .
                     > time ./benchlines lazy >> /dev/null
                     > real	0m1.250s
                     > ...
                     > time ./benchlines streaming >> /dev/null
                     > real	0m1.531s
                     . 
                     The pipes environment would express the latter as 
                     .
                     > Pipes.intercalates (Pipes.yield "!\n") . view Pipes.lines 
                     .
                     meaning almost exactly what we mean above, but with results like this
                     .
                     >  time ./benchlines pipes >> /dev/null
                     >  real	0m6.353s
                     .
                     The difference, however, /is emphatically not intrinsic to pipes/; 
                     it is just that 
                     this library depends the @streaming@ library, which is used in place 
                     of @free@ to express the 
                     <http://www.haskellforall.com/2013/09/perfect-streaming-using-pipes-bytestring.html "perfectly streaming">
                     splitting and iterated division or "chunking" of byte streams. 
                     .
                     These concepts belong to the ABCs of streaming; @lines@ is just
                     a textbook example, and it is of course handled correctly in 
                     @Data.ByteString.Lazy@.
                     But the concepts are /catastrophically mishandled/ in /all/ streaming io libraries 
                     other than pipes. Already the @enumerator@ and @iteratee@ libraries
                     were completely defeated by @lines@: 
                     see e.g. the @enumerator@ implementation of 
                     <http://hackage.haskell.org/package/enumerator-0.4.20/docs/Data-Enumerator-Text.html#v:splitWhen splitWhen and lines>.
                     This will concatenate strict text forever, if that's what is coming
                     in.  The rot spreads from there. 
                     It is just a fact that in all of the general streaming io 
                     frameworks other than pipes,it becomes torture to express elementary distinctions 
                     that are transparently and immediately contained in any 
                     idea of streaming whatsoever. 
                     .
                     Though, as was said above, we barely alter signatures in @Data.ByteString.Lazy@ 
                     more than is required by the types, the point of view that emerges 
                     is very much that of
                     @pipes-bytestring@ and @pipes-group@. In particular
                     we have these correspondences:
                     .
                     > Lazy.splitAt      :: Int -> ByteString              -> (ByteString, ByteString)
                     > Streaming.splitAt :: Int -> ByteString m r          -> ByteString m (ByteString m r)
                     > Pipes.splitAt     :: Int -> Producer ByteString m r -> Producer ByteString m (Producer ByteString m r)
                     .
                     and
                     .
                     > Lazy.lines      :: ByteString -> [ByteString]
                     > Streaming.lines :: ByteString m r -> Stream (ByteString m) m r
                     > Pipes.lines     :: Producer ByteString m r -> FreeT (Producer ByteString m) m r
                     .
                     where the @Stream@ type expresses the sequencing of @ByteString m _@ layers
                     with the usual \'free monad\' sequencing. 
                     .
                     Interoperation with @pipes-bytestring@ uses this isomorphism:
                     . 
                     > Streaming.ByteString.unfoldrChunks Pipes.next :: Monad m => Producer ByteString m r -> ByteString m r
                     > Pipes.unfoldr Streaming.ByteString.nextChunk  :: Monad m => ByteString m r -> Producer ByteString m r
                     .
                     Interoperation with @io-streams@ is thus:
                     .
                     > IOStreams.unfoldM Streaming.ByteString.unconsChunk :: ByteString IO () -> IO (InputStream ByteString)
                     > Streaming.ByteString.reread IOStreams.read         :: InputStream ByteString -> ByteString IO ()
                     .
                     and similarly for other rational streaming io libraries. 
                     .
                     Problems and questions about the library can be put as issues on 
                     the github page, or mailed to the 
                     <https://groups.google.com/forum/#!forum/haskell-pipes pipes list>.
                     .
                     A tutorial module is in the works; 
                     <https://gist.github.com/michaelt/6c6843e6dd8030e95d58 here>,
                     for the moment, 
                     is a sequence of simplified implementations of familiar shell utilities.  
                     The same programs are implemented at the end of the excellent
                       <http://hackage.haskell.org/package/io-streams-1.3.2.0/docs/System-IO-Streams-Tutorial.html io-streams tutorial>.
                       It is generally much simpler; in some case simpler than what
                       you would write with lazy bytestrings. 
                       <https://gist.github.com/michaelt/2dcea1ba32562c091357 Here>
                       is a simple GET request that returns a byte stream.
                       .

                    
license:             BSD3
license-file:        LICENSE
author:              michaelt
maintainer:          what_is_it_to_do_anything@yahoo.com
-- copyright:           
category:            Data, Pipes, Streaming
build-type:          Simple
extra-source-files:  README.md
cabal-version:       >=1.10
stability:           Experimental
homepage:            https://github.com/michaelt/streaming-bytestring
bug-reports:         https://github.com/michaelt/streaming-bytestring/issues
source-repository head
    type: git
    location: https://github.com/michaelt/streaming-bytestring


library
  exposed-modules:     Data.ByteString.Streaming
                       , Data.ByteString.Streaming.Char8
                       , Data.ByteString.Streaming.Internal

                       
  -- other-modules:       
  other-extensions:    CPP, BangPatterns, ForeignFunctionInterface, DeriveDataTypeable, Unsafe
  build-depends:       base  <5.0
                     , deepseq 
                     , bytestring
                     , mtl >=2.1 && <2.3
                     , mmorph >=1.0 && <1.2
                     , transformers >=0.3 && <0.6
                     , transformers-base
                     , streaming >=  0.1.4.0 && < 0.1.4.8
                     , resourcet
                     , exceptions
  if impl(ghc < 7.8) 
    build-depends:
                     bytestring < 0.10.4.0
                     , bytestring-builder
  else               
    build-depends:     
                     bytestring >= 0.10.4                     


  default-language:    Haskell2010
  ghc-options: -O2
  
test-suite test
  default-language:
    Haskell2010
  type:
    exitcode-stdio-1.0
  hs-source-dirs:
    tests
  main-is:
    test.hs
  build-depends:
      base >= 4 && < 5
    , transformers
    , tasty >= 0.11.0.4
    , tasty-smallcheck >= 0.8.1
    , smallcheck >= 1.1.1
    , streaming
    , streaming-bytestring
    , bytestring