Java 8 – Good Practice for Passing Streams Around in APIs for Lazy Operations

java lambda

In pre-Java 8 lambda-heavy libraries like Guava, the outputs use common Java Collections Framework interfaces, so it is easy to pass them around to external/internal APIs and still harness some lazy computation if the library method supports it (e.g. lazy filter() and transform()).

However, in Java 8 Streams, the call to get a Collection/Map is terminal (i.e. eager) and it will also allocate new data structures to hold the results.
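To make the eager/lazy distinction concrete, here is a minimal sketch. The class name `LazyStreamDemo` and the `inspected` list are made up for illustration; the list only records when the filter actually runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamDemo {
    // Hypothetical bookkeeping: records which elements the filter actually inspected.
    static final List<Integer> inspected = new ArrayList<>();

    // Intermediate operation only: this builds the pipeline but evaluates nothing.
    static Stream<Integer> evens(List<Integer> source) {
        return source.stream().filter(n -> {
            inspected.add(n);
            return n % 2 == 0;
        });
    }

    public static void main(String[] args) {
        Stream<Integer> pipeline = evens(Arrays.asList(1, 2, 3, 4));
        // The filter has not run yet, so nothing has been recorded.
        System.out.println("inspected before collect: " + inspected.size());

        // Terminal operation: evaluates the pipeline and allocates a new List.
        List<Integer> result = pipeline.collect(Collectors.toList());
        System.out.println(result + " after inspecting " + inspected);
    }
}
```

The allocation I am worried about happens only at the `collect` call; as long as a method hands back the `Stream` itself, no intermediate collection is built.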

For complicated computations with multiple stages and a strategy pattern in the middle, this causes a lot of unnecessary allocations for the intermediate results.

So, do people think it is good practice for internal APIs (i.e. the strategies in a strategy pattern) to take and return Streams, or should I just fall back to the lazy but not streamlined (pun intended, I guess) Guava APIs?

Edit:

My main concern with Stream is that it can only be consumed once, and passing something like a Supplier<Stream<X>> looks extremely cumbersome. It almost pushes you to just pass a Collection and then re-stream() it (paying the cost of eager evaluation at that point).
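For reference, this is roughly what the Supplier<Stream<X>> approach looks like. It is only a sketch: `ReusableStreamDemo`, `countAndSum` and `secondUseFails` are hypothetical names, and the two-pass consumer is invented to show why a single Stream is not enough.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Stream;

class ReusableStreamDemo {
    // Hypothetical consumer that needs two passes, so it asks for a fresh Stream each time.
    static long[] countAndSum(Supplier<Stream<Integer>> source) {
        long count = source.get().count();
        long sum = source.get().mapToLong(Integer::longValue).sum();
        return new long[] {count, sum};
    }

    // Returns true if a second terminal operation on the same Stream is rejected.
    static boolean secondUseFails(Stream<Integer> stream) {
        stream.count();
        try {
            stream.count();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6);
        // The lambda rebuilds the lazy pipeline on every get(); no filtered copy is stored.
        long[] result = countAndSum(() -> data.stream().filter(n -> n % 2 == 0));
        System.out.println("count=" + result[0] + ", sum=" + result[1]);
        System.out.println("second use fails: " + secondUseFails(data.stream()));
    }
}
```

The supplier avoids storing a filtered copy, but the filter runs again on every pass, so it trades allocation for recomputation.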

Best Answer

Laziness in Java 8 Streams works the same way it did for Iterables in Guava: you have to pass the Iterable along to stay lazy, and evaluation happens once you build a Collection from the Iterator. Both Streams and Iterators can only be consumed once.

So for your method interfaces, the more general choice (the one that permits laziness) is to use the Stream interface wherever you would have used Iterable before. As @Philipp says, this allows them to be used in Stream pipelines.
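As a rough sketch of what such an interface could look like for the strategy-pattern case in the question: the `Stage` type, the two sample strategies and the `run` helper below are all hypothetical names chosen for illustration, not an established API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamStrategyDemo {
    // Hypothetical strategy type: a stage that maps one lazy Stream to another.
    interface Stage extends UnaryOperator<Stream<String>> {}

    // Two sample strategies; each only appends an intermediate operation.
    static final Stage NON_BLANK = s -> s.filter(x -> !x.trim().isEmpty());
    static final Stage UPPER = s -> s.map(String::toUpperCase);

    // Chains the stages without materializing any intermediate collection.
    static Stream<String> run(Stream<String> input, List<Stage> stages) {
        Stream<String> current = input;
        for (Stage stage : stages) {
            current = stage.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        // Only the caller at the end of the chain decides when to collect.
        List<String> out = run(Stream.of("a", " ", "b"), Arrays.asList(NON_BLANK, UPPER))
                .collect(Collectors.toList());
        System.out.println(out);
    }
}
```

Each strategy adds to the pipeline and hands it on, so the single allocation happens wherever the final caller chooses a terminal operation.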

Hopefully, since Stream is now an official Java standard interface, more and more libraries and functions will be able to work efficiently on Streams directly.