Streaming image processing allows you to work with large or partially available image data without loading the entire file into memory. Pillow’s ImageFile.Parser
and incremental decode methods enable efficient, on-the-fly processing—ideal for large JPEGs, network streams, or real-time applications.
1. Understanding ImageFile.Parser
ImageFile.Parser incrementally builds an image from byte chunks. It's essential for streaming scenarios where data arrives in segments (e.g., network downloads or large file reads).
```python
from PIL import ImageFile

# Allow images with missing trailing data to load instead of raising an error
ImageFile.LOAD_TRUNCATED_IMAGES = True
```
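A minimal sketch of the Parser lifecycle: feed byte chunks, then close to obtain a regular Image. The in-memory PNG here stands in for streamed data so the example is self-contained:

```python
import io

from PIL import Image, ImageFile

# Build a small in-memory PNG to stand in for a streamed source
buf = io.BytesIO()
Image.new("RGB", (64, 32), "blue").save(buf, format="PNG")
data = buf.getvalue()

parser = ImageFile.Parser()
parser.feed(data[:100])   # first chunk: usually enough to parse the header
parser.feed(data[100:])   # remaining bytes
img = parser.close()      # finalize; returns an ordinary Image object
```

If a chunk is too short for the header, feed simply buffers it and retries on the next call, so the chunk boundaries never need to align with the file structure.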
2. Basic Streaming Decode Example
```python
from PIL import ImageFile

# Enable streaming of truncated images
ImageFile.LOAD_TRUNCATED_IMAGES = True

parser = ImageFile.Parser()
with open('large.jpg', 'rb') as f:
    while chunk := f.read(8192):
        parser.feed(chunk)

# Finalize and retrieve image
img = parser.close()
img.load()  # decode remaining data if needed
img.show()
```
This reads the file in 8 KB chunks, feeding each one to the parser and constructing the image without buffering the entire file in memory.
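The loop above can be wrapped in a small helper that accepts any readable binary stream (decode_stream is an illustrative name, not a Pillow API):

```python
import io

from PIL import Image, ImageFile

def decode_stream(fileobj, chunk_size=8192):
    """Decode an image from any readable binary stream, chunk by chunk."""
    parser = ImageFile.Parser()
    while chunk := fileobj.read(chunk_size):
        parser.feed(chunk)
    return parser.close()

# Example: decode from an in-memory stream
buf = io.BytesIO()
Image.new("L", (10, 10)).save(buf, format="PNG")
buf.seek(0)
img = decode_stream(buf, chunk_size=16)
```

Because the helper only calls read(), it works equally for open files, io.BytesIO buffers, or sockets wrapped with makefile('rb').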
3. Streaming from Network Sources
Process images directly from HTTP streams without saving locally:
```python
import requests
from PIL import ImageFile

ImageFile.LOAD_TRUNCATED_IMAGES = True

parser = ImageFile.Parser()
response = requests.get('https://example.com/large.jpg', stream=True)
for chunk in response.iter_content(chunk_size=8192):
    if chunk:
        parser.feed(chunk)

img = parser.close()
img.save('downloaded.jpg')
```
Use stream=True to iterate over incoming data and build the image incrementally.
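The pattern is not tied to requests: any iterable of byte chunks works. A self-contained sketch where a generator over an in-memory buffer stands in for Response.iter_content (fake_network_chunks is an illustrative name):

```python
import io

from PIL import Image, ImageFile

def fake_network_chunks(data, chunk_size=8192):
    """Yield byte chunks the way a streaming HTTP response would."""
    for i in range(0, len(data), chunk_size):
        yield data[i:i + chunk_size]

# Build sample image bytes to "stream"
buf = io.BytesIO()
Image.new("RGB", (40, 40), "green").save(buf, format="PNG")

parser = ImageFile.Parser()
for chunk in fake_network_chunks(buf.getvalue(), chunk_size=1024):
    parser.feed(chunk)
img = parser.close()
```

Swapping the generator for a real chunk source (HTTP response, message queue, pipe) requires no change to the parsing loop.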
4. Incremental Processing During Decode
Apply operations (e.g., resizing) as soon as partial decode yields scanlines:
```python
from PIL import Image, ImageFile

ImageFile.LOAD_TRUNCATED_IMAGES = True

class StreamingProcessor(ImageFile.Parser):
    def __init__(self, operation):
        super().__init__()
        self.op = operation
        self.processed = False

    def feed(self, data):
        super().feed(data)
        if self.image is None or self.processed:
            return  # header not parsed yet, or already handled
        try:
            # Enough data decoded: build a half-size preview
            img = self.image
            preview = img.resize((img.width // 2, img.height // 2))
            preview.show()  # or save intermediate results
            self.processed = True
        except OSError:
            pass  # pixel data still incomplete; retry on the next chunk

# Usage:
proc = StreamingProcessor(None)
with open('large.jpg', 'rb') as f:
    for chunk in iter(lambda: f.read(8192), b''):
        proc.feed(chunk)
img = proc.close()
```
Inherit from ImageFile.Parser to access self.image once the header has been parsed, enabling processing while data is still arriving.
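A lighter-weight variant of the same idea: if you only need to react when the header becomes readable, watch for self.image changing from None during feed (HeaderWatcher is an illustrative name, not a Pillow class):

```python
import io

from PIL import Image, ImageFile

class HeaderWatcher(ImageFile.Parser):
    """Parser that records how many bytes were needed to learn the size."""

    def __init__(self):
        super().__init__()
        self.bytes_fed = 0
        self.size_known_at = None

    def feed(self, data):
        super().feed(data)
        self.bytes_fed += len(data)
        if self.size_known_at is None and self.image is not None:
            self.size_known_at = self.bytes_fed  # header parsed at this point

# Feed a small in-memory PNG in 64-byte chunks
buf = io.BytesIO()
Image.new("RGB", (20, 20), "white").save(buf, format="PNG")
data = buf.getvalue()

watcher = HeaderWatcher()
for i in range(0, len(data), 64):
    watcher.feed(data[i:i + 64])
img = watcher.close()
```

For typical formats the size is known after the first few hundred bytes, long before the pixel data has arrived.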
5. Handling Partial Metadata & Headers
Extract EXIF or image size before full decode to adjust pipelines:
```python
from PIL import ImageFile

parser = ImageFile.Parser()
with open('large.jpg', 'rb') as f:
    # Feed small chunks until the header has been parsed
    while parser.image is None:
        header = f.read(1024)
        if not header:
            break
        parser.feed(header)

    # Access basic info
    width, height = parser.image.size
    print(f"Image dimensions: {width}x{height}")

    # Continue streaming the rest of the file
    while chunk := f.read(8192):
        parser.feed(chunk)

img = parser.close()
```
Check parser.image after the header has been decoded; a small initial chunk is usually enough to read basic metadata such as size and format.
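This can be packaged as a helper that reads just enough of a stream to report dimensions (peek_size is an illustrative name; the probe limits are arbitrary):

```python
import io

from PIL import Image, ImageFile

def peek_size(fileobj, probe_chunk=256, max_probe=65536):
    """Read small chunks until Pillow can parse the header; return (width, height)."""
    parser = ImageFile.Parser()
    read = 0
    while parser.image is None and read < max_probe:
        chunk = fileobj.read(probe_chunk)
        if not chunk:
            break
        parser.feed(chunk)
        read += len(chunk)
    if parser.image is None:
        raise ValueError("could not parse image header")
    # The parser is deliberately abandoned without close():
    # we only probed the header and never decoded pixel data.
    return parser.image.size

# Example with an in-memory PNG
buf = io.BytesIO()
Image.new("RGB", (123, 45), "red").save(buf, format="PNG")
buf.seek(0)
size = peek_size(buf)
```

Note that close() is skipped on purpose: calling it on a partially fed parser raises an error because the image data is incomplete.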
6. Resource Cleanup
Ensure parser and image objects are closed to free memory:
```python
parser = ImageFile.Parser()
# feed data...
img = parser.close()
img.close()
```
When decoding many images from one long-lived stream, create a fresh parser (parser = ImageFile.Parser()) for each image rather than reusing one; a parser that has already been fed data cannot be reset.
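In long-running code, a try/finally makes the cleanup unconditional even if decoding fails partway. A sketch using an in-memory buffer as the data source:

```python
import io

from PIL import Image, ImageFile

# Stand-in for a real data source
buf = io.BytesIO()
Image.new("RGB", (8, 8), "red").save(buf, format="PNG")
source = io.BytesIO(buf.getvalue())

parser = ImageFile.Parser()
img = None
try:
    while chunk := source.read(4096):
        parser.feed(chunk)
    img = parser.close()
    # ... work with img here ...
finally:
    if img is not None:
        img.close()  # release decoder resources and pixel buffers
```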
7. Summary Checklist
- Enable ImageFile.LOAD_TRUNCATED_IMAGES for robustness.
- Read data in chunks (e.g., 8 KB) and feed them to ImageFile.Parser.
- Stream network data with requests (stream=True) and iter_content().
- Extend the parser for incremental operations once self.image is available.
- Extract metadata early from the initial chunks.
- Close the parser and image; use a fresh parser for each image in long streams.
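Put together, the checklist amounts to a loop like the following (the in-memory buffer stands in for a large file or network stream so the sketch is self-contained):

```python
import io

from PIL import Image, ImageFile

ImageFile.LOAD_TRUNCATED_IMAGES = True  # robustness against short reads

# Stand-in for a large file or network stream
buf = io.BytesIO()
Image.new("RGB", (100, 50), "black").save(buf, format="PNG")
buf.seek(0)

parser = ImageFile.Parser()
dimensions = None
while chunk := buf.read(8192):
    parser.feed(chunk)
    if dimensions is None and parser.image is not None:
        dimensions = parser.image.size  # metadata available early

img = parser.close()
img.close()
```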