Thursday, April 14, 2022

collect() Method And Collectors Class in Java Stream API

The collect method in the Java Stream API is used to perform a mutable reduction operation on the elements of a given stream.

A mutable reduction operation can be defined as an operation that accumulates input elements into a mutable result container, such as a Collection or StringBuilder, as it processes the elements in the stream. Put simply, collect is most often used to gather the elements of a stream into a collection.

Note that collect is a terminal operation.


collect method in Java stream API

There are two variants of collect() method in Java Stream API-

  1. <R,A> R collect(Collector<? super T,A,R> collector)- Performs a mutable reduction operation on the elements of this stream using a Collector.
    • R - the type of the result
    • A - the intermediate accumulation type of the Collector
    • T – Element type of the stream
    • collector – Instance of the Collector interface
  2. Another variation of the collect method in Java Stream API is shown below-
    <R> R collect(Supplier<R> supplier, BiConsumer<R,? super T> accumulator, BiConsumer<R,R> combiner)
    
    • supplier- a function that creates a new result container.
    • accumulator- function for adding an element into a result.
    • combiner- function for combining two partial results.
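
As a rough sketch of how the two variants relate, the snippet below concatenates strings once with the three-argument collect() (using a StringBuilder as the mutable container) and once with a ready-made Collector. The class name CollectVariantsDemo and its method names are only illustrative.

```java
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CollectVariantsDemo {
    // Variant 2: supplier, accumulator and combiner are passed directly.
    static String concatWithFunctions(Stream<String> words) {
        return words.collect(
                StringBuilder::new,      // supplier: creates an empty container
                StringBuilder::append,   // accumulator: adds one element
                StringBuilder::append)   // combiner: merges two partial containers
            .toString();
    }

    // Variant 1: the same reduction expressed through a Collector.
    static String concatWithCollector(Stream<String> words) {
        return words.collect(Collectors.joining());
    }

    public static void main(String[] args) {
        System.out.println(concatWithFunctions(Stream.of("a", "b", "c"))); // prints abc
        System.out.println(concatWithCollector(Stream.of("a", "b", "c"))); // prints abc
    }
}
```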

Collector interface

As you may have noticed, the argument of the first collect() method is an instance of the Collector interface, which is also part of the Java Stream API and was added in Java 8.

public interface Collector<T,A,R>

Collector interface specifies four functions that work together to accumulate entries into a mutable result container, and optionally perform a final transform on the result. These functions are-

  • supplier()- Creation of a new result container.
  • accumulator()- Incorporating a new data element into a result container.
  • combiner()- Combining two result containers into one.
  • finisher()- Performing an optional final transform on the container.
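
To see the four functions working together, here is a minimal sketch that builds a Collector with the static factory method Collector.of. It accumulates elements into an ArrayList and uses the finisher to wrap the result in an unmodifiable list. The class name and the helper toUnmodifiableListSketch are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

class FinisherCollectorDemo {
    // Hypothetical helper: a Collector assembled from the four functions.
    static <T> Collector<T, List<T>, List<T>> toUnmodifiableListSketch() {
        return Collector.of(
            () -> new ArrayList<T>(),                              // supplier
            (list, item) -> list.add(item),                        // accumulator
            (left, right) -> { left.addAll(right); return left; }, // combiner
            list -> Collections.unmodifiableList(list));           // finisher
    }

    public static void main(String[] args) {
        List<String> result = Stream.of("a", "b", "c")
                                    .collect(toUnmodifiableListSketch());
        System.out.println(result); // prints [a, b, c]
    }
}
```

Because of the finisher, the returned list rejects modification: calling result.add("d") throws UnsupportedOperationException.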

Collectors Class in Java Stream API

The Collectors class is a utility class that provides ready-made implementations of the Collector interface through static factory methods. These cover many useful reduction operations, such as accumulating elements into collections, summarizing elements according to various criteria, etc.

These are the methods you will generally use rather than implementing Collector interface yourself.

Java collect() method examples

Let's see some examples of collect() method where methods of Collectors class are used. For these examples Employee class will be used.

Employee class

class Employee {
 private String empId;
 private int age;
 private String name;
 private char sex;
 private int salary;
 Employee(String empId, int age, String name, char sex, int salary){
  this.empId = empId;
  this.age = age;
  this.name = name;
  this.sex = sex;
  this.salary = salary;
 }
 public String getEmpId() {
  return empId;
 }
 public void setEmpId(String empId) {
  this.empId = empId;
 }
 public int getAge() {
  return age;
 }
 public void setAge(int age) {
  this.age = age;
 }
 public String getName() {
  return name;
 }
 public void setName(String name) {
  this.name = name;
 }
 public char getSex() {
  return sex;
 }
 public void setSex(char sex) {
  this.sex = sex;
 }
 public int getSalary() {
  return salary;
 }
 public void setSalary(int salary) {
  this.salary = salary;
 } 
}
  1. If you want a list of the names of all the employees, you can use the toList method of the Collectors class.

    List<String> nameList = empList.stream()
                                   .map(Employee::getName)
                                   .collect(Collectors.toList());
    
  2. If you want to store the names in the set.
    Set<String> nameSet = empList.stream()
                                       .map(Employee::getName)
                                       .collect(Collectors.toSet());
    
  3. If you want to specify the collection yourself, for example to keep the names in sorted order by using a TreeSet.
    Set<String> nameSet = empList.stream()
                                       .map(Employee::getName)
                                       .collect(Collectors.toCollection(TreeSet::new));
    
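  4. If you want to store data in a Map so that empId is the key and name is the value. Note that this two-argument toMap throws an IllegalStateException if two employees share the same empId.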
    Map<String, String> nameMap = empList.stream()
                                    .collect(Collectors.toMap(Employee::getEmpId, Employee::getName));
    
  5. If you want all the names as a String, joined by comma
    String names = empList.stream()
                          .map(Employee::getName)
                          .collect(Collectors.joining(","));
    
  6. If you want total salary given to all the employees-
    int totalSalary = empList.stream()
                             .collect(Collectors.summingInt(Employee::getSalary));
    
  7. If you want to group the employees by gender-
    Map<Character, List<Employee>> empMap = empList.stream().collect(Collectors.groupingBy(Employee::getSex));
    
    There is also a groupingByConcurrent method, which is intended for use with parallel streams; see the example in Parallel Stream in Java Stream API.
  8. If the classifying function is a Predicate, i.e. a boolean-valued function, it is generally more efficient to use partitioningBy rather than groupingBy. The resulting map always contains both the true and the false keys. For example, if you want to partition employees by whether their salary is greater than or equal to 8000.
    Map<Boolean, List<Employee>> empMap = empList.stream().collect(Collectors.partitioningBy(e -> e.getSalary() >= 8000 ));
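
The snippets above assume that a list named empList already exists. The following sketch shows one way to try a few of them end to end. The sample employees are made up purely for illustration, the nested Employee class is a trimmed-down stand-in for the one shown earlier (age and setters omitted), and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CollectorsDemo {
    // Trimmed-down stand-in for the article's Employee class.
    static class Employee {
        private final String empId;
        private final String name;
        private final char sex;
        private final int salary;

        Employee(String empId, String name, char sex, int salary) {
            this.empId = empId;
            this.name = name;
            this.sex = sex;
            this.salary = salary;
        }
        String getEmpId() { return empId; }
        String getName() { return name; }
        char getSex() { return sex; }
        int getSalary() { return salary; }
    }

    // Made-up sample data, not taken from the article.
    static List<Employee> sampleEmployees() {
        return Arrays.asList(
            new Employee("E001", "Ram", 'M', 5000),
            new Employee("E002", "Sheila", 'F', 7000),
            new Employee("E003", "Mukesh", 'M', 8000),
            new Employee("E004", "Rani", 'F', 10000));
    }

    // Example 5: all names joined by a comma.
    static String joinedNames(List<Employee> empList) {
        return empList.stream()
                      .map(Employee::getName)
                      .collect(Collectors.joining(","));
    }

    // Example 6: total salary of all employees.
    static int totalSalary(List<Employee> empList) {
        return empList.stream()
                      .collect(Collectors.summingInt(Employee::getSalary));
    }

    // Example 7: employees grouped by gender.
    static Map<Character, List<Employee>> bySex(List<Employee> empList) {
        return empList.stream()
                      .collect(Collectors.groupingBy(Employee::getSex));
    }

    // Example 8: employees partitioned by salary >= 8000.
    static Map<Boolean, List<Employee>> bySalary(List<Employee> empList) {
        return empList.stream()
                      .collect(Collectors.partitioningBy(e -> e.getSalary() >= 8000));
    }

    public static void main(String[] args) {
        List<Employee> empList = sampleEmployees();
        System.out.println(joinedNames(empList)); // prints Ram,Sheila,Mukesh,Rani
        System.out.println(totalSalary(empList)); // prints 30000
        System.out.println(bySex(empList).get('F').size());     // prints 2
        System.out.println(bySalary(empList).get(true).size()); // prints 2
    }
}
```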
    

There are also methods like summarizingInt, summarizingDouble and summarizingLong that provide summary statistics; see an example in Java stream example.
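
As a small illustration of summarizingInt, the sketch below collects a stream of made-up salary figures into an IntSummaryStatistics object, which exposes the count, sum, minimum, maximum and average in one pass. The class and method names are illustrative only.

```java
import java.util.IntSummaryStatistics;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SummaryStatsDemo {
    // Collects the values into a single statistics object.
    static IntSummaryStatistics summarize(Stream<Integer> salaries) {
        return salaries.collect(Collectors.summarizingInt(Integer::intValue));
    }

    public static void main(String[] args) {
        // Made-up salary figures.
        IntSummaryStatistics stats = summarize(Stream.of(5000, 7000, 8000, 10000));
        System.out.println(stats.getCount());   // prints 4
        System.out.println(stats.getSum());     // prints 30000
        System.out.println(stats.getMin());     // prints 5000
        System.out.println(stats.getMax());     // prints 10000
        System.out.println(stats.getAverage()); // prints 7500.0
    }
}
```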

collect method with supplier, accumulator and combiner

You can also use the second variant of the collect() method if you want to provide your own supplier, accumulator and combiner methods.

For example, if you want to accumulate strings into an ArrayList

List<String> asList = Stream.of("a", "b", "c")
                            .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);

If you prefer lambda expressions to method references, you can write the same thing as-

asList = Stream.of("a", "b", "c")
               .collect(() -> new ArrayList<>(),
                        (alist, word) -> alist.add(word),
                        (alist1, alist2) -> alist1.addAll(alist2));

Here the first parameter creates a new result container (a new ArrayList), the second is a function that adds an element to the result, and the third combines two partial results.
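
The combiner matters mainly when the stream runs in parallel, where several partial containers may be filled independently and then merged. The sketch below, with illustrative class and method names, runs the same three-argument collect() on a parallel stream. Since the stream is ordered, the merged list keeps the encounter order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.IntStream;

class ParallelCollectDemo {
    // Collects the numbers 1..n from a parallel stream into a list.
    static List<Integer> collectParallel(int n) {
        return IntStream.rangeClosed(1, n)
                        .boxed()
                        .parallel()
                        .collect(ArrayList::new,     // one container per partial result
                                 ArrayList::add,     // fills a partial container
                                 ArrayList::addAll); // merges partial containers
    }

    public static void main(String[] args) {
        System.out.println(collectParallel(5)); // prints [1, 2, 3, 4, 5]
    }
}
```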

That's all for this topic Collecting in Java Stream API. If you have any doubt or any suggestions to make please drop a comment. Thanks!


1 comment:

  1. Great post. Previously, I was confused about the exact roles of the Supplier, Accumulator and Combiner and how to use them with the Stream.collect(Supplier, Accumulator, Combiner) method. Thank you. Could you please give an example of the collect() method that uses a finisher? The following example does not use one.

    asList = Stream.of("a", "b", "c").collect(() -> new ArrayList<>(), (alist, word) -> alist.add(word), (alist1, alist2) -> alist1.addAll(alist2));


    R collect(Supplier supplier, BiConsumer accumulator, BiConsumer combiner)
    supplier- a function that creates a new result container.
    accumulator- function for adding an element into a result.
    combiner- function for combining two partial result.



