Architecture
This page describes how the downloader is structured internally and how a request flows through the project.
High-Level Flow
The normal flow is:
- The CLI parses arguments and builds WebtoonDownloadOptions.
- download_webtoon() wires together the client, fetcher, transformers, storage, and exporters.
- WebtoonFetcher resolves the series and chapter list.
- WebtoonDownloader orchestrates chapter tasks.
- ChapterDownloader fetches chapter viewer pages and schedules page downloads.
- HttpImageDownloader streams image bytes.
- Storage writers persist the result as files, ZIPs, CBZs, or PDFs.
Main Components
| Area | Responsibility |
|---|---|
| webtoon_downloader/cmd | CLI, option validation, progress UI |
| webtoon_downloader/core/webtoon/client.py | HTTP client, headers, retry transport, image streaming |
| webtoon_downloader/core/webtoon/fetchers.py | Series lookup and chapter enumeration |
| webtoon_downloader/core/webtoon/downloaders/comic.py | Top-level orchestration for a whole series |
| webtoon_downloader/core/webtoon/downloaders/chapter.py | Per-chapter workflow and page scheduling |
| webtoon_downloader/core/downloaders/image.py | Image streaming and transformer pipeline |
| webtoon_downloader/storage | Folder, ZIP, and PDF output writers |
| webtoon_downloader/transformers | Image format conversion and byte-stream mutation |
CLI Layer
The CLI lives in webtoon_downloader/cmd/cli.py.
It is responsible for:
- parsing options
- validating combinations such as --latest vs --start/--end
- setting up the event loop
- wiring progress callbacks
- presenting user-facing errors
This layer should remain thin. Most business logic belongs in the core modules, not inside click handlers.
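One way to keep the handler thin is to move the option-combination check into a plain function that the handler calls. The sketch below assumes --latest is a boolean flag and --start/--end are optional integers. The names validate_range_options and OptionError are hypothetical, and the start/end ordering check is an extra assumption, not something confirmed by the project.

```python
from typing import Optional


class OptionError(ValueError):
    """Hypothetical error type for invalid option combinations."""


def validate_range_options(
    latest: bool, start: Optional[int], end: Optional[int]
) -> None:
    """Raise OptionError when the options contradict each other."""
    # --latest selects a single chapter, so an explicit range makes no sense.
    if latest and (start is not None or end is not None):
        raise OptionError("--latest cannot be combined with --start/--end")
    # Assumed extra rule: a range must not be inverted.
    if start is not None and end is not None and start > end:
        raise OptionError("--start must not be greater than --end")
```

Because the function has no dependency on the CLI framework, it can be unit tested without invoking the command.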
HTTP Layer
WebtoonHttpClient centralizes request behavior:
- connection pooling
- request headers and user-agent selection
- mobile vs standard Webtoons domain handling
- retry transport selection
- explicit request and image-stream timeouts
The image path uses stream_image() because image downloads have different requirements from metadata or page fetches.
Fetching Chapters
WebtoonFetcher converts the series URL to the mobile domain, extracts the title_no, and uses the Webtoons API to retrieve episode metadata.
The fetcher constructs normalized ChapterInfo records used throughout the rest of the pipeline.
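The URL handling described above might look roughly like the following. This is a sketch under stated assumptions, not the fetcher's actual code: it assumes the mobile host is m.webtoons.com and that title_no appears as a query parameter. The helper names to_mobile_url and extract_title_no are hypothetical.

```python
from urllib.parse import parse_qs, urlparse, urlunparse

MOBILE_HOST = "m.webtoons.com"  # assumed mobile host


def to_mobile_url(series_url: str) -> str:
    """Return the same URL with the host swapped for the mobile host."""
    parts = urlparse(series_url)
    return urlunparse(parts._replace(netloc=MOBILE_HOST))


def extract_title_no(series_url: str) -> int:
    """Read the title_no query parameter (assumed location) as an int."""
    values = parse_qs(urlparse(series_url).query).get("title_no")
    if not values:
        raise ValueError(f"no title_no in URL: {series_url}")
    return int(values[0])
```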
Series Orchestration
WebtoonDownloader is the top-level coordinator for a download run.
It handles:
- resolving the chapter list
- determining the output directory
- preparing optional export data
- choosing the correct storage writer per chapter
- spawning chapter tasks
This is the right place to look when changing run-level behavior.
Chapter Orchestration
ChapterDownloader handles one chapter at a time:
- fetch the viewer page
- extract image URLs and chapter notes
- export metadata if enabled
- schedule page downloads
- write ComicInfo.xml for CBZ output when applicable
It owns chapter-level concurrency through an internal semaphore.
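Semaphore-bounded page scheduling can be illustrated with plain asyncio. The sketch below is not ChapterDownloader itself: download_pages, fetch_page, and max_concurrency are hypothetical names, and the real class may structure its tasks differently.

```python
import asyncio
from typing import Awaitable, Callable, List


async def download_pages(
    urls: List[str],
    fetch_page: Callable[[str], Awaitable[bytes]],
    max_concurrency: int = 4,
) -> List[bytes]:
    """Fetch every URL, allowing at most max_concurrency fetches at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(url: str) -> bytes:
        # Each task waits here until a slot is free.
        async with semaphore:
            return await fetch_page(url)

    # gather preserves input order, so results line up with urls.
    return await asyncio.gather(*(bounded(url) for url in urls))
```

All tasks are created up front, but only a limited number run their fetch at the same time, which keeps the pressure on the server predictable.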
Image Pipeline
HttpImageDownloader is the leaf downloader:
- opens the image stream
- applies optional transformers
- writes the byte stream to the selected storage backend
- reports per-page progress
This layer is where request failures become ImageDownloadError.
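The steps above can be sketched as a small function that chains transformers over an async byte stream and wraps any failure. Everything here is illustrative: the ImageDownloadError class is a stand-in for the project's error of the same name, and download_image, open_stream, and write are hypothetical callables rather than the real HttpImageDownloader interface.

```python
from typing import AsyncIterator, Awaitable, Callable, Sequence

# A transformer takes a byte stream and returns a new byte stream.
Transformer = Callable[[AsyncIterator[bytes]], AsyncIterator[bytes]]


class ImageDownloadError(Exception):
    """Stand-in for the project's error of the same name."""

    def __init__(self, url: str) -> None:
        super().__init__(f"failed to download image: {url}")
        self.url = url


async def download_image(
    url: str,
    open_stream: Callable[[str], AsyncIterator[bytes]],
    transformers: Sequence[Transformer],
    write: Callable[[AsyncIterator[bytes]], Awaitable[int]],
) -> int:
    """Stream one image through the transformers into the writer."""
    try:
        stream = open_stream(url)
        for transform in transformers:
            stream = transform(stream)  # each stage wraps the previous one
        return await write(stream)  # bytes are pulled lazily by the writer
    except Exception as exc:
        # Keep the root cause attached via exception chaining.
        raise ImageDownloadError(url) from exc
```

Since the stream is lazy, a network failure surfaces while the writer is consuming it, so the try block has to cover the write call as well as the stream setup.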
Storage Backends
The storage package defines asynchronous writers for different output targets:
- AioFolderWriter
- AioZipWriter
- AioPdfWriter
All of them implement the same AioWriter protocol so the rest of the pipeline can stay storage-agnostic.
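To show why a shared protocol keeps callers storage-agnostic, here is a minimal sketch using typing.Protocol. The write signature shown is an assumption about the shape of AioWriter, not its confirmed definition. InMemoryWriter and save_page are hypothetical and exist only for illustration.

```python
from typing import AsyncIterator, Dict, Protocol


class AioWriter(Protocol):
    """Assumed shape of the shared writer interface."""

    async def write(self, stream: AsyncIterator[bytes], item_name: str) -> int:
        ...


class InMemoryWriter:
    """Hypothetical backend; satisfies the protocol structurally."""

    def __init__(self) -> None:
        self.items: Dict[str, bytes] = {}

    async def write(self, stream: AsyncIterator[bytes], item_name: str) -> int:
        data = b"".join([chunk async for chunk in stream])
        self.items[item_name] = data
        return len(data)


async def save_page(
    writer: AioWriter, stream: AsyncIterator[bytes], name: str
) -> int:
    # The caller depends only on the protocol, never on a concrete backend.
    return await writer.write(stream, name)
```

A new backend written this way needs no inheritance from a base class. Matching the method signature is enough for type checkers to accept it wherever an AioWriter is expected.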
Export And Metadata
Metadata export is handled separately from image storage:
- text and JSON exports go through DataExporter
- CBZ metadata uses ComicInfo XML helpers in comicinfo.py
This separation keeps archive writing and metadata writing from being tightly coupled.
Error Model
The project uses layered exceptions so failures stay readable:
- WebtoonDownloadError
- ChapterDownloadError
- ImageDownloadError
- specialized fetch and rate-limit errors
The CLI unwraps that chain to show compact summaries while still preserving root causes.
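Unwrapping a layered exception chain can be done by following __cause__, which Python sets when an exception is raised with "raise ... from ...". The sketch below uses stand-in classes with the same names as the project's exceptions. The summarize helper is hypothetical, and the real CLI may format its output differently.

```python
from typing import List


class WebtoonDownloadError(Exception):
    """Stand-in for the run-level error."""


class ChapterDownloadError(Exception):
    """Stand-in for the chapter-level error."""


class ImageDownloadError(Exception):
    """Stand-in for the page-level error."""


def summarize(exc: BaseException) -> List[str]:
    """Return one compact line per layer, outermost first, root cause last."""
    lines: List[str] = []
    current = exc
    while current is not None:
        lines.append(f"{type(current).__name__}: {current}")
        current = current.__cause__  # set by "raise ... from ..."
    return lines
```

Each layer adds context (which series, which chapter, which page) while the original low-level error remains reachable at the end of the chain.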
Extension Points
If you want to extend the project, the safest entry points are:
- new storage writers under storage/
- new image transformers under transformers/
- exporter format improvements in exporter.py
- fetcher/parser updates when Webtoons changes upstream behavior