Architecture

This page describes how the downloader is structured internally and how a request flows through the project.

High-Level Flow

The normal flow is:

1. The CLI parses arguments and builds WebtoonDownloadOptions.
2. download_webtoon() wires together the client, fetcher, transformers, storage, and exporters.
3. WebtoonFetcher resolves the series and chapter list.
4. WebtoonDownloader orchestrates chapter tasks.
5. ChapterDownloader fetches chapter viewer pages and schedules page downloads.
6. HttpImageDownloader streams image bytes.
7. Storage writers persist the result as files, ZIPs, CBZs, or PDFs.

| Area | Responsibility |
| --- | --- |
| `webtoon_downloader/cmd` | CLI, option validation, progress UI |
| `webtoon_downloader/core/webtoon/client.py` | HTTP client, headers, retry transport, image streaming |
| `webtoon_downloader/core/webtoon/fetchers.py` | Series lookup and chapter enumeration |
| `webtoon_downloader/core/webtoon/downloaders/comic.py` | Top-level orchestration for a whole series |
| `webtoon_downloader/core/webtoon/downloaders/chapter.py` | Per-chapter workflow and page scheduling |
| `webtoon_downloader/core/downloaders/image.py` | Image streaming and transformer pipeline |
| `webtoon_downloader/storage` | Folder, ZIP, and PDF output writers |
| `webtoon_downloader/transformers` | Image format conversion and byte-stream mutation |
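
To make "byte-stream mutation" more concrete, the sketch below chains two transformers over an asynchronous stream of chunks. It is illustrative only: `source`, `drop_empty_chunks`, `join_chunks`, `apply_all`, and `collect` are hypothetical names, and the project's real transformer interface may look different.

```python
import asyncio
from typing import AsyncIterator, Callable, Iterable, List

ByteStream = AsyncIterator[bytes]
Transformer = Callable[[ByteStream], ByteStream]


async def source(parts: Iterable[bytes]) -> ByteStream:
    """Hypothetical stand-in for the chunks produced by an image download."""
    for part in parts:
        yield part


async def drop_empty_chunks(stream: ByteStream) -> ByteStream:
    """Pass chunks through unchanged, skipping empty ones."""
    async for chunk in stream:
        if chunk:
            yield chunk


async def join_chunks(stream: ByteStream) -> ByteStream:
    """Buffer the whole stream and yield it once, as a format converter might need."""
    buffer = bytearray()
    async for chunk in stream:
        buffer.extend(chunk)
    yield bytes(buffer)


def apply_all(stream: ByteStream, transformers: Iterable[Transformer]) -> ByteStream:
    """Wrap the stream in each transformer, in order."""
    for transformer in transformers:
        stream = transformer(stream)
    return stream


async def collect(stream: ByteStream) -> List[bytes]:
    """Drain a stream into a list of chunks."""
    return [chunk async for chunk in stream]
```

Because each transformer takes a stream and returns a stream, steps can be added or reordered without the downloader or the storage writer knowing about them.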

The CLI lives in webtoon_downloader/cmd/cli.py.

It is responsible for:

This layer should remain thin. Most business logic belongs in the core modules, not inside click handlers.

WebtoonHttpClient centralizes request behavior:

The image path uses stream_image() because image downloads have different requirements from metadata or page fetches.

WebtoonFetcher converts the series URL to the mobile domain, extracts the title_no, and uses the Webtoons API to retrieve episode metadata.
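
The URL handling described above can be sketched with the standard library. This is a minimal illustration, not the fetcher's actual code: the helper names are hypothetical, and the mobile host `m.webtoons.com` is an assumption.

```python
from urllib.parse import parse_qs, urlparse, urlunparse

# Assumption: the mobile site lives on this host; the real fetcher may differ.
MOBILE_HOST = "m.webtoons.com"


def to_mobile_url(series_url: str) -> str:
    """Return the same URL with its host swapped for the mobile domain."""
    parts = urlparse(series_url)
    return urlunparse(parts._replace(netloc=MOBILE_HOST))


def extract_title_no(series_url: str) -> int:
    """Read the title_no query parameter, raising ValueError if it is absent."""
    values = parse_qs(urlparse(series_url).query).get("title_no")
    if not values:
        raise ValueError(f"no title_no in URL: {series_url!r}")
    return int(values[0])
```

For example, a list-page URL ending in `?title_no=95` yields `95`, and the rewritten URL keeps its path and query string intact.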

The fetcher constructs normalized ChapterInfo records used throughout the rest of the pipeline.

WebtoonDownloader is the top-level coordinator for a download run.

It handles:

This is the right place to look when changing run-level behavior.
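
One way to picture run-level coordination is as a fan-out over chapters that collects failures instead of aborting on the first one. The sketch below is illustrative only: `download_chapter` and `download_series` are hypothetical names, and whether WebtoonDownloader uses `asyncio.gather` in this way is an assumption.

```python
import asyncio
from typing import Iterable, List, Tuple


async def download_chapter(number: int) -> str:
    """Hypothetical chapter task; chapter 2 fails to show error collection."""
    await asyncio.sleep(0)  # placeholder for real chapter work
    if number == 2:
        raise RuntimeError(f"chapter {number} failed")
    return f"chapter {number} ok"


async def download_series(
    chapter_numbers: Iterable[int],
) -> Tuple[List[str], List[BaseException]]:
    """Run every chapter task and split the outcomes into successes and failures."""
    results = await asyncio.gather(
        *(download_chapter(n) for n in chapter_numbers),
        return_exceptions=True,  # keep going when a single chapter fails
    )
    succeeded = [r for r in results if not isinstance(r, BaseException)]
    failed = [r for r in results if isinstance(r, BaseException)]
    return succeeded, failed
```

Collecting exceptions rather than raising immediately lets a run report every failed chapter in one pass.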

ChapterDownloader handles one chapter at a time:

It owns chapter-level concurrency through an internal semaphore.
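
A semaphore-based concurrency cap can be sketched as follows. `ChapterSketch` and its attributes are hypothetical stand-ins, not the real ChapterDownloader; the sleep is a placeholder for an actual page download.

```python
import asyncio
from typing import List


class ChapterSketch:
    """Hypothetical stand-in that caps concurrent page downloads with a semaphore."""

    def __init__(self, max_concurrent_pages: int) -> None:
        self._semaphore = asyncio.Semaphore(max_concurrent_pages)
        self.in_flight = 0
        self.peak_in_flight = 0

    async def _download_page(self, page_number: int) -> int:
        # Only max_concurrent_pages coroutines can be inside this block at once.
        async with self._semaphore:
            self.in_flight += 1
            self.peak_in_flight = max(self.peak_in_flight, self.in_flight)
            await asyncio.sleep(0.01)  # placeholder for the real image download
            self.in_flight -= 1
            return page_number

    async def run(self, page_count: int) -> List[int]:
        """Schedule every page at once; the semaphore throttles actual work."""
        tasks = [self._download_page(n) for n in range(page_count)]
        return await asyncio.gather(*tasks)


async def main() -> ChapterSketch:
    # Build the object inside the running loop so the semaphore binds to it.
    chapter = ChapterSketch(max_concurrent_pages=3)
    await chapter.run(10)
    return chapter
```

All page tasks are created up front, but the semaphore keeps the number of in-flight downloads at or below the configured limit.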

HttpImageDownloader is the leaf downloader:

This layer is where request failures become ImageDownloadError.
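
The translation from a transport failure to a domain error might look like the sketch below. The `ImageDownloadError` defined here is a simplified stand-in for the project's class, and `fetch_bytes` is a hypothetical transport call that always fails so the wrapping is visible.

```python
class ImageDownloadError(Exception):
    """Simplified stand-in; the project's own class may carry different fields."""

    def __init__(self, url: str) -> None:
        super().__init__(f"failed to download image: {url}")
        self.url = url


def fetch_bytes(url: str) -> bytes:
    """Hypothetical transport call that always fails, to exercise the wrapper."""
    raise ConnectionError("connection reset by peer")


def download_image(url: str) -> bytes:
    """Wrap low-level failures so callers only need to handle one error type."""
    try:
        return fetch_bytes(url)
    except OSError as exc:  # ConnectionError is a subclass of OSError
        # "from exc" keeps the original failure reachable as __cause__.
        raise ImageDownloadError(url) from exc
```

Chaining with `from` preserves the root cause, so upper layers can report a compact message without losing the underlying failure.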

The storage package defines asynchronous writers for different output targets:

All of them implement the same AioWriter protocol so the rest of the pipeline can stay storage-agnostic.
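
The value of a shared writer protocol is easier to see in a small example. The sketch below is hypothetical: `WriterSketch`, its `write` signature, and `InMemoryWriter` are assumptions for illustration and do not reproduce the real AioWriter definition.

```python
import asyncio
from typing import AsyncIterator, Dict, Protocol


class WriterSketch(Protocol):
    """Hypothetical protocol: consume a byte stream and store it under a name."""

    async def write(self, stream: AsyncIterator[bytes], name: str) -> int: ...


class InMemoryWriter:
    """Toy implementation that keeps files in a dict instead of on disk."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}

    async def write(self, stream: AsyncIterator[bytes], name: str) -> int:
        data = b"".join([chunk async for chunk in stream])
        self.files[name] = data
        return len(data)


async def chunks(*parts: bytes) -> AsyncIterator[bytes]:
    """Hypothetical stand-in for a stream of downloaded image bytes."""
    for part in parts:
        yield part


async def save_page(writer: WriterSketch, name: str) -> int:
    """Caller code depends only on the protocol, not on a concrete writer."""
    return await writer.write(chunks(b"abc", b"de"), name)
```

Because `save_page` is typed against the protocol, a folder, ZIP, or PDF writer could be swapped in without changing the calling code.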

Metadata export is handled separately from image storage:

This separation keeps archive writing and metadata writing from being tightly coupled.

The project uses layered exceptions so failures stay readable:

The CLI unwraps that chain to show compact summaries while still preserving root causes.
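
Unwrapping an exception chain into a compact summary can be sketched with `__cause__`. The helper names `exception_chain` and `summarize` are hypothetical, and the output format is an assumption rather than the CLI's actual rendering.

```python
from typing import List


def exception_chain(exc: BaseException) -> List[BaseException]:
    """Follow __cause__ links from the outermost error down to the root cause."""
    chain: List[BaseException] = []
    current = exc
    while current is not None:
        chain.append(current)
        current = current.__cause__
    return chain


def summarize(exc: BaseException) -> str:
    """Render the chain on one line, outermost error first."""
    return " <- ".join(f"{type(e).__name__}: {e}" for e in exception_chain(exc))
```

The last element of the chain is the root cause, which is usually the most useful detail to surface next to the high-level message.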

If you want to extend the project, the safest entry points are: