How To Compress and Extract Zip Files with Python

By Adam McQuistan in Python, 04/16/2020

Introduction

In this How To article I demonstrate using Python to compress files into zip archives and to extract files contained in a zip archive. The venerable, batteries-included Python standard library provides the zipfile module, which exposes a well-designed API for working with zip archives in a cross-platform manner and will be the focus of this article.

Compressing Files Into A Zip Archive with Python's ZipFile Class

For this first section on compressing files into a zip archive, along with other parts in this tutorial, I will be working with the following directory structure and test files.

$ tree .
.
└── testdata
    ├── file01.txt
    ├── file02.txt
    ├── file03.txt
    ├── file04.txt
    ├── file05.txt
    ├── file06.txt
    ├── file07.txt
    ├── file08.txt
    ├── file09.txt
    └── file10.txt

The zipfile module provides a ZipFile class that I will primarily be working with in this article. One thing to mention right off the bat is that a ZipFile object is a context manager, which makes it a natural fit for Python's with statement. The with statement automatically closes the archive upon exiting the block and should be the preferred method of usage.
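To make the context manager point concrete, here is a minimal sketch comparing the with statement to a roughly equivalent try/finally form. The two function names are hypothetical, and I use writestr(...), which writes an entry from an in-memory string, only so the sketch needs no files on disk.

```python
from zipfile import ZipFile


def write_with_context_manager(archive_path, member_name, text):
    """Write one text entry; the archive is closed when the block exits."""
    with ZipFile(archive_path, mode='w') as zf:
        zf.writestr(member_name, text)


def write_with_explicit_close(archive_path, member_name, text):
    """Roughly equivalent long-hand form using try/finally."""
    zf = ZipFile(archive_path, mode='w')
    try:
        zf.writestr(member_name, text)
    finally:
        # close() finalizes the archive; skipping it can leave it unreadable
        zf.close()
```

Both functions should produce an archive with the same single entry. The with form is shorter and cannot forget the close() call.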

To start the discussion I will first demonstrate how to compress a single file into a zip archive.

# compress_single.py

from zipfile import ZipFile

if __name__ == '__main__':
    single_file = 'testdata/file01.txt'
    
    with ZipFile('file01.zip', mode='w') as zf:
        zf.write(single_file)

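Running the above program adds the single testdata/file01.txt file to a zip archive named file01.zip. It does so by constructing a ZipFile object in write mode and then calling the write(...) method on it, passing the path of the file to add. Note that ZipFile defaults to the ZIP_STORED method, which stores entries without shrinking them; pass compression=ZIP_DEFLATED to the constructor if you want the data actually compressed.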

$ python compress_single.py
$ ls -l
-rw-r--r--   1 adammcquistan  staff   192 Apr 15 22:45 compress.py
-rw-r--r--   1 adammcquistan  staff   156 Apr 15 22:48 file01.zip
drwxr-xr-x  12 adammcquistan  staff   384 Apr 15 22:26 testdata
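Since ZipFile stores entries uncompressed unless told otherwise, here is a minimal sketch of writing with DEFLATE compression and, optionally, a custom entry name. The compress_file function name is hypothetical. The compression argument to ZipFile and the arcname argument to write(...) are part of the zipfile module, and ZIP_DEFLATED relies on the zlib module being available.

```python
from zipfile import ZIP_DEFLATED, ZipFile


def compress_file(src_path, archive_path, arcname=None):
    """Write src_path into a new archive using DEFLATE compression.

    arcname sets the entry's name inside the archive; when it is None
    the entry is named after src_path.
    """
    with ZipFile(archive_path, mode='w', compression=ZIP_DEFLATED) as zf:
        zf.write(src_path, arcname=arcname)
```

For example, compress_file('testdata/file01.txt', 'file01.zip', arcname='file01.txt') would store the entry at the top level of the archive rather than under testdata/. For files as tiny as the test data the size savings will be negligible.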

Compressing multiple files into a single zip archive is a simple extension of the previous example: call the write(...) method on the ZipFile object once for each file to add to the archive, as shown below.

# compress_many.py

from zipfile import ZipFile

if __name__ == '__main__':
    input_files = [
        'testdata/file01.txt',
        'testdata/file02.txt',
        'testdata/file03.txt',
        'testdata/file04.txt',
        'testdata/file05.txt',
        'testdata/file06.txt',
        'testdata/file07.txt',
        'testdata/file08.txt',
        'testdata/file09.txt',
        'testdata/file10.txt',
    ]
    with ZipFile('files.zip', mode='w') as zf:
        for f in input_files:
            zf.write(f)

Running the script produces the files.zip archive, as expected.

$ ls -l
-rw-r--r--   1 adammcquistan  staff   514 Apr 15 23:12 compress_many.py
-rw-r--r--   1 adammcquistan  staff  1367 Apr 15 23:16 files.zip
drwxr-xr-x  12 adammcquistan  staff   384 Apr 15 22:26 testdata
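Typing out every path by hand gets tedious as the number of files grows. As a sketch of one alternative, the following gathers the input files with pathlib's glob(...) instead. The compress_matching_files function name and its pattern parameter are my own illustrative choices, not part of the zipfile module.

```python
from pathlib import Path
from zipfile import ZipFile


def compress_matching_files(src_dir, archive_path, pattern='*.txt'):
    """Add every file in src_dir matching pattern to a new zip archive.

    Entry names are kept relative to the parent of src_dir, so files in
    'testdata' are stored as 'testdata/<name>'. Returns the entry names.
    """
    src_dir = Path(src_dir)
    arcnames = []
    with ZipFile(archive_path, mode='w') as zf:
        # sorted() gives a stable, alphabetical entry order
        for path in sorted(src_dir.glob(pattern)):
            if not path.is_file():
                continue  # skip directories that happen to match
            arcname = path.relative_to(src_dir.parent).as_posix()
            zf.write(path, arcname=arcname)
            arcnames.append(arcname)
    return arcnames
```

Calling compress_matching_files('testdata', 'files.zip') from the directory shown earlier should yield the same ten entries as the explicit list.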

Inspecting ZipInfo Contents Of A Zip Archive

Oftentimes it's useful to programmatically peek into a zip archive and inspect its contents. To accomplish this the zipfile module provides the ZipInfo class, which represents an individual archive entry and exposes details such as the entry's name and uncompressed size. Once a ZipFile object is constructed, you can call the infolist() method on it, which returns a list of ZipInfo objects.

As an example, the following module named peek_zip.py queries the files.zip archive created in the last section and displays the name and size of each ZipInfo object representing the contents of the files.zip archive.

# peek_zip.py

from zipfile import ZipFile

if __name__ == '__main__':
    with ZipFile('files.zip') as zf:
        for zipinfo in zf.infolist():
            print(f"{zipinfo.filename} ({zipinfo.file_size}B)")

As you can see, running the program shows the expected output.

$ python peek_zip.py
testdata/file01.txt (20B)
testdata/file02.txt (21B)
testdata/file03.txt (20B)
testdata/file04.txt (20B)
testdata/file05.txt (20B)
testdata/file06.txt (20B)
testdata/file07.txt (22B)
testdata/file08.txt (21B)
testdata/file09.txt (21B)
testdata/file10.txt (20B)
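Peeking does not have to stop at metadata. The sketch below, with hypothetical function names, collects each entry's name along with its uncompressed and compressed sizes, and reads one entry's text straight from the archive without extracting it to disk. It assumes the entry is UTF-8 encoded text unless another encoding is given.

```python
from zipfile import ZipFile


def list_entries(archive_path):
    """Return (name, uncompressed size, compressed size) for each entry."""
    with ZipFile(archive_path) as zf:
        return [
            (info.filename, info.file_size, info.compress_size)
            for info in zf.infolist()
        ]


def read_entry_text(archive_path, member_name, encoding='utf-8'):
    """Read one entry into memory and decode it as text."""
    with ZipFile(archive_path) as zf:
        # read() returns the entry's contents as bytes
        return zf.read(member_name).decode(encoding)
```

For an archive written with the default ZIP_STORED method the two sizes will be equal, which is a quick way to check whether entries were actually compressed.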

Extracting Contents Of A Zip Archive with ZipFile

Extracting the contents of a zip archive is a fairly trivial task as well. To accomplish it, construct a ZipFile object, passing it the path to the archive you wish to extract along with 'r' for the mode parameter to indicate that you are reading from the archive. Then you can either extract individual entries with the extract(...) method or all contents with the extractall(...) method.

To extract an individual entry, supply the entry's name (or its ZipInfo object) and the path you want to extract it to. If you omit the path, the entry is extracted to the current working directory.

# extract_single.py

import os
from zipfile import ZipFile

if __name__ == '__main__':
    output_dir = 'extract_singles'
    
    # Create the output directory if it does not already exist
    os.makedirs(output_dir, exist_ok=True)
    
    with ZipFile('files.zip', mode='r') as zf:
        zf.extract('testdata/file01.txt', path=output_dir)

The result of running the program and listing the output directory is shown below.

$ python extract_single.py 
$ ls -l extract_singles/testdata 
-rw-r--r--  1 adammcquistan  staff  20 Apr 15 23:57 file01.txt
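One detail worth knowing is that extract(...) raises a KeyError when the named entry is not in the archive, and returns the path of the file it created when it succeeds. Here is a minimal sketch that handles both cases. The extract_if_present function name is hypothetical.

```python
from zipfile import ZipFile


def extract_if_present(archive_path, member_name, output_dir):
    """Extract one entry and return the path it was written to.

    Returns None when the archive has no entry with that name.
    """
    with ZipFile(archive_path, mode='r') as zf:
        try:
            # extract() returns the path of the file it created
            return zf.extract(member_name, path=output_dir)
        except KeyError:
            # Raised when member_name is not in the archive
            return None
```

Returning None is just one design choice; letting the KeyError propagate may be preferable when a missing entry signals a real problem.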

Similarly, you can use the extractall(...) method to extract the entire contents of a zip archive to a specified location as seen below.

# extract_all.py

import os
from zipfile import ZipFile

if __name__ == '__main__':
    output_dir = 'extract_all'
    
    # Create the output directory if it does not already exist
    os.makedirs(output_dir, exist_ok=True)
    
    with ZipFile('files.zip', mode='r') as zf:
        zf.extractall(output_dir)

Then for completeness here is the output.

$ python3 extract_all.py 
$ ls -l extract_all/testdata 
-rw-r--r--  1 adammcquistan  staff  20 Apr 16 00:05 file01.txt
-rw-r--r--  1 adammcquistan  staff  21 Apr 16 00:05 file02.txt
-rw-r--r--  1 adammcquistan  staff  20 Apr 16 00:05 file03.txt
-rw-r--r--  1 adammcquistan  staff  20 Apr 16 00:05 file04.txt
-rw-r--r--  1 adammcquistan  staff  20 Apr 16 00:05 file05.txt
-rw-r--r--  1 adammcquistan  staff  20 Apr 16 00:05 file06.txt
-rw-r--r--  1 adammcquistan  staff  22 Apr 16 00:05 file07.txt
-rw-r--r--  1 adammcquistan  staff  21 Apr 16 00:05 file08.txt
-rw-r--r--  1 adammcquistan  staff  21 Apr 16 00:05 file09.txt
-rw-r--r--  1 adammcquistan  staff  20 Apr 16 00:05 file10.txt
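The extractall(...) method also accepts a members argument for extracting only a subset of the archive. The sketch below uses it to extract entries whose names end with a given suffix. The extract_by_suffix function name and its suffix parameter are hypothetical, chosen for illustration.

```python
from zipfile import ZipFile


def extract_by_suffix(archive_path, output_dir, suffix='.txt'):
    """Extract only the entries whose names end with suffix.

    Returns the list of entry names that were extracted.
    """
    with ZipFile(archive_path, mode='r') as zf:
        # members must be a subset of the names returned by namelist()
        wanted = [name for name in zf.namelist() if name.endswith(suffix)]
        zf.extractall(path=output_dir, members=wanted)
    return wanted
```

With the files.zip archive from earlier, extract_by_suffix('files.zip', 'extract_all') should behave like the full extractall(...) call, since every entry ends in .txt.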

Conclusion

In this article I have discussed and provided several code samples demonstrating how to work with zip files using the Python programming language utilizing the zipfile module from the standard library.

As always, I thank you for reading and please feel free to ask questions or critique in the comments section below.

 
