Friday, May 1, 2020

Creating Tar File And GZipping Multiple Files in Java

GZIP can compress only a single file, so you can't GZIP multiple files directly. To compress multiple files you first archive them into a tar file and then compress that archive, which gives you a .tar.gz compressed file. In this post we'll see how to create a tar file in Java and gzip multiple files.
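
To make the single-stream limitation concrete, here is a minimal sketch that uses only the JDK classes GZIPOutputStream and GZIPInputStream. The class name GzipSingleStreamDemo and the helper methods gzip and gunzip are made up for this illustration and are not part of the program shown later in the post.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not part of the original program
class GzipSingleStreamDemo {

  // Compresses one byte stream; the API has no concept of adding entries
  static byte[] gzip(byte[] data) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPOutputStream gos = new GZIPOutputStream(bos)) {
      gos.write(data);
    }
    return bos.toByteArray();
  }

  // Restores the original byte stream from the compressed bytes
  static byte[] gunzip(byte[] compressed) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
      byte[] buffer = new byte[4096];
      int len;
      while ((len = gis.read(buffer)) != -1) {
        bos.write(buffer, 0, len);
      }
    }
    return bos.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] original = "content of abc.txt".getBytes(StandardCharsets.UTF_8);
    byte[] restored = gunzip(gzip(original));
    System.out.println(new String(restored, StandardCharsets.UTF_8));
  }
}
```

Since GZIPOutputStream only ever sees one stream of bytes, something else has to record where one file ends and the next begins. That is the job of the tar archive.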

Using Apache Commons Compress

Here is a Java program that creates a tar file using the Apache Commons Compress library. You can download the library from here: https://commons.apache.org/proper/commons-compress/download_compress.cgi

Make sure to add commons-compress-xxx.jar to your application’s classpath. I have used version commons-compress-1.13.

Steps to create tar files

Steps for creating tar files in Java are as follows:

  1. Create a FileOutputStream to the output (.tar.gz) file.
  2. Create a GZIPOutputStream which wraps the FileOutputStream object.
  3. Create a TarArchiveOutputStream which wraps the GZIPOutputStream object.
  4. Then read all the files and directories in the source folder.
  5. For a directory, create a TarArchiveEntry and add that entry to the TarArchiveOutputStream.
  6. For a file, create a TarArchiveEntry, add that entry to the TarArchiveOutputStream and also write the content of the file to the TarArchiveOutputStream.

Folder Structure used

Here is the folder structure used in this post to read the files. Test, Test1 and Test2 are directories and the files sit within those directories. Your Java code should walk through the whole folder structure, create a tar file with entries for all the directories and files, and then compress it.

Test
  abc.txt
  Test1
     test.txt
     test1.txt
  Test2
     xyz.txt
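
Before adding the archive library, the walk itself can be sketched with plain java.io.File. The example below only collects the entry names that a recursive walk would produce, relative to the top folder. The class EntryNameLister and its collect method are hypothetical names used for illustration. Using "/" as the separator and sorting the children are assumptions of this sketch, not something the program in this post does.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the original program
class EntryNameLister {

  // Collects the entry name for the given file and for everything below it
  static void collect(File file, String parent, List<String> names) {
    String entryName = parent + file.getName();
    if (file.isDirectory()) {
      // Directory entries are marked with a trailing "/"
      names.add(entryName + "/");
      File[] children = file.listFiles();
      // listFiles() returns null when the directory can't be read
      if (children != null) {
        // listFiles() gives no ordering guarantee, so sort for stable output
        Arrays.sort(children);
        for (File child : children) {
          collect(child, entryName + "/", names);
        }
      }
    } else {
      names.add(entryName);
    }
  }

  public static void main(String[] args) {
    // Folder to walk; defaults to the current directory
    File root = new File(args.length > 0 ? args[0] : ".");
    List<String> names = new ArrayList<>();
    collect(root, "", names);
    names.forEach(System.out::println);
  }
}
```

Run against the Test folder, this would list names such as Test/, Test/abc.txt and Test/Test1/test.txt. The tar program builds its entry names with the same parent-plus-name scheme.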

Creating tar file in Java example

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;
import org.apache.commons.compress.utils.IOUtils;

public class TarGZIPDemo {
 public static void main(String[] args) {
  String SOURCE_FOLDER = "/home/netjs/Documents/netjs/Test";
  TarGZIPDemo tGzipDemo = new TarGZIPDemo();
  tGzipDemo.createTarFile(SOURCE_FOLDER);

 }
  private void createTarFile(String sourceDir){
    File source = new File(sourceDir);
    // Using input name to create output name
    String outputFile = source.getAbsolutePath().concat(".tar.gz");
    // try-with-resources closes the streams even if an exception is thrown,
    // and avoids calling close() on a stream that was never created
    try (FileOutputStream fos = new FileOutputStream(outputFile);
         GZIPOutputStream gos = new GZIPOutputStream(new BufferedOutputStream(fos));
         TarArchiveOutputStream tarOs = new TarArchiveOutputStream(gos)) {
      // Allow entry names longer than 100 characters
      tarOs.setLongFileMode(TarArchiveOutputStream.LONGFILE_POSIX);
      addFilesToTarGZ(sourceDir, "", tarOs);
    } catch (IOException e) {
      e.printStackTrace();
    }
  }
 
 public void addFilesToTarGZ(String filePath, String parent, TarArchiveOutputStream tarArchive) throws IOException {
  File file = new File(filePath);
  // Create entry name relative to parent file path
  String entryName = parent + file.getName();
  // add tar ArchiveEntry
  tarArchive.putArchiveEntry(new TarArchiveEntry(file, entryName));
  if(file.isFile()){
   // try-with-resources closes the input stream even if the copy fails
   try (BufferedInputStream bis = new BufferedInputStream(new FileInputStream(file))) {
    // Write file content to archive
    IOUtils.copy(bis, tarArchive);
   }
   tarArchive.closeArchiveEntry();
  }else if(file.isDirectory()){
   // no need to copy any content since it is
   // a directory, just close the archive entry
   tarArchive.closeArchiveEntry();
   // listFiles() returns null if the directory can't be read
   File[] children = file.listFiles();
   if(children != null){
    for(File f : children){
     // recursively call the method for all the files and subdirectories;
     // tar entry names use "/" as the separator on every platform
     addFilesToTarGZ(f.getAbsolutePath(), entryName + "/", tarArchive);
    }
   }
  }
 }
}

On opening the created .tar.gz compressed file using an archive manager, you should see the Test directory along with its files and subdirectories.
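
If no archive manager is at hand, the entries can also be listed from Java. The sketch below assumes the same Apache Commons Compress jar is on the classpath and reads the archive back through GZIPInputStream and TarArchiveInputStream. The class TarGZIPLister and its listEntries method are hypothetical names introduced for illustration, and the archive path in main is an assumption derived from the source folder used in the example.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import org.apache.commons.compress.archivers.ArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;

// Hypothetical helper, not part of the original program
class TarGZIPLister {

  // Returns the names of all entries stored in the given .tar.gz file
  static List<String> listEntries(String tarGzPath) throws IOException {
    List<String> names = new ArrayList<>();
    // Unwrap in the reverse order of writing: file, then gzip, then tar
    try (TarArchiveInputStream tarIn = new TarArchiveInputStream(
        new GZIPInputStream(new BufferedInputStream(new FileInputStream(tarGzPath))))) {
      ArchiveEntry entry;
      // getNextEntry() returns null once the end of the archive is reached
      while ((entry = tarIn.getNextEntry()) != null) {
        names.add(entry.getName());
      }
    }
    return names;
  }

  public static void main(String[] args) throws IOException {
    // Assumed location: the source folder name with ".tar.gz" appended
    String archive = "/home/netjs/Documents/netjs/Test.tar.gz";
    for (String name : listEntries(archive)) {
      System.out.println(name);
    }
  }
}
```

Directory entries show up with a trailing "/" in their names, which makes it easy to tell them apart from file entries in the printed list.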

That's all for this topic, Creating Tar File And GZipping Multiple Files in Java. If you have any doubts or suggestions, please drop a comment. Thanks!
