Async Batch Processing Pipelines for Indoor Mapping & Wayfinding

A single deployment cycle in indoor mapping rarely touches one floor plan. It touches a campus — hundreds of heterogeneous DWG, SVG, PDF, and IFC exports that all need parsing, vectorizing, and topology construction before any wayfinding engine can route over them. This page covers the async batch architecture that makes that volume tractable, and it sits inside the broader Automated Floor Plan Parsing & Vectorization section: ingestion feeds the format-specific parsers, the parsers feed vectorization, and vectorization feeds the routing graph. Get the orchestration wrong and the symptoms are always the same — a blocked event loop, runaway memory, and a job that dies two-thirds of the way through a building.

Problem Statement: Why Synchronous Processing Fails at Scale

The naive approach — loop over a directory, open each file, parse it, write the result — works for a demo and collapses in production. Three failure modes appear the moment volume climbs:

Event-loop starvation. Geometry work (polygonization, snapping, intersection testing) is CPU-bound. Run it inline inside an async def and it blocks the loop, so every other coroutine — health checks, progress reporting, queue draining — stalls until the kernel returns. From the outside this looks like random latency spikes and missed heartbeats.
Unbounded memory growth. Each multi-storey IFC export in a campus can expand to gigabytes once loaded. Read them all eagerly and the process is OOM-killed before the first export finishes vectorizing.
Cascading upstream failures. When the CAD/BIM source is a network-mounted store or an object bucket that throttles, a synchronous reader has no backpressure: it keeps issuing reads, exhausts file descriptors, and brings the whole batch down with one OSError.
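
The first failure mode is easy to reproduce. The sketch below is illustrative only: fake_geometry is a hypothetical stand-in that sleeps instead of doing real polygon work, and because time.sleep releases the GIL a worker thread is enough to show the effect here. Real CPU-bound geometry would need a process pool, as the rest of this page describes. It measures the longest gap between heartbeat ticks when the blocking call runs inline versus offloaded.

```python
import asyncio
import time


async def heartbeat(gaps: list[float], stop: asyncio.Event) -> None:
    """Record the wall-clock gap between consecutive 10 ms ticks."""
    last = time.perf_counter()
    while not stop.is_set():
        await asyncio.sleep(0.01)
        now = time.perf_counter()
        gaps.append(now - last)
        last = now


def fake_geometry(seconds: float) -> None:
    """Hypothetical stand-in for a blocking geometry call."""
    time.sleep(seconds)


async def max_heartbeat_gap(offload: bool) -> float:
    gaps: list[float] = []
    stop = asyncio.Event()
    beat = asyncio.create_task(heartbeat(gaps, stop))
    await asyncio.sleep(0.05)  # let the heartbeat settle
    if offload:
        # Runs in a worker thread; the loop keeps ticking.
        await asyncio.to_thread(fake_geometry, 0.3)
    else:
        # Runs on the loop thread; every other coroutine stalls.
        fake_geometry(0.3)
    await asyncio.sleep(0.05)
    stop.set()
    await beat
    return max(gaps)


inline_gap = asyncio.run(max_heartbeat_gap(offload=False))
offloaded_gap = asyncio.run(max_heartbeat_gap(offload=True))
print(f"inline: {inline_gap:.3f}s  offloaded: {offloaded_gap:.3f}s")
```

The inline run should show a gap of at least the full 0.3 s block, which is what a missed heartbeat looks like from the outside.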

A production-grade design decouples ingestion, parsing, vectorization, and topology construction into discrete, non-blocking stages, each with its own concurrency limit. The rest of this page builds that design step by step, then hands off to the deep-dive in Building async pipelines for batch floor plan processing for the ingestion-layer internals.

Prerequisites & Dependencies

Before implementing the pipeline, pin down these assumptions:

Python 3.11+ is recommended for asyncio.TaskGroup and improved cancellation semantics. The examples below use gather instead, so they also run on 3.10.
aiofiles for non-blocking file writes during export, and concurrent.futures.ProcessPoolExecutor (stdlib) for delegating CPU-bound geometry off the event loop.
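A geometry kernel: shapely for polygon work, ezdxf for DXF, and IfcOpenShell for IFC. ezdxf reads DXF only, so DWG sources need converting to DXF first (for example with the ODA File Converter). These run inside the process pool, never on the loop.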
A normalized intermediate format. Each parser must agree on output before it reaches vectorization. Rasterized PDFs and vector DWG files need different read strategies, which is why format normalization belongs upstream of the queue — see SVG/DWG Parsing Workflows for the per-format extraction routines that should run before a manifest is enqueued.
A coordinate assumption. Workers must know the source units and Y-axis convention; validated geometry is later projected into a consistent Indoor Coordinate Reference System, and multi-storey jobs depend on the Level Mapping & Z-Axis Logic rules to keep floor levels distinct.
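
As a rough illustration of the coordinate assumption, the sketch below converts a source point to metres with a Y-up axis. The function name and all of its parameters are hypothetical, not part of any library: units_to_m is the source-unit scale (0.001 for millimetres), sheet_height is the drawing height in source units, and y_down marks formats such as SVG whose Y axis points down.

```python
def to_metric_y_up(
    x: float,
    y: float,
    *,
    units_to_m: float,
    sheet_height: float,
    y_down: bool,
) -> tuple[float, float]:
    """Convert a source point to metres with the Y axis pointing up."""
    if y_down:
        # Mirror the point vertically within the sheet so Y grows upward.
        y = sheet_height - y
    return (x * units_to_m, y * units_to_m)


# An SVG point in millimetres on a 1000 mm tall sheet.
point = to_metric_y_up(1000.0, 200.0, units_to_m=0.001, sheet_height=1000.0, y_down=True)
print(point)
```

A real pipeline would carry these values in the manifest or a per-format profile so every worker applies the same convention.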

Architecture: How the Stages Decouple

The pipeline is a producer/consumer system with a bounded buffer between every stage. A coroutine-driven scanner produces lightweight task manifests; a pool of worker coroutines consumes them, each delegating the heavy geometry to a separate OS process and writing the result back asynchronously.

Two design choices carry the whole architecture. First, the buffer between scanner and workers is a bounded asyncio.Queue with maxsize = 2 * worker_count. When workers fall behind, the queue fills, the scanner’s await queue.put() blocks, and ingestion naturally throttles to match consumption — backpressure for free, no manual rate limiting. Second, every manifest is lightweight: a file path, source format, floor identifier, and a trace id for distributed tracing. The heavy bytes never sit in the queue; they are read only inside the worker that will process them, so queue depth bounds metadata, not megabytes.
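
The backpressure behaviour can be observed in isolation. This is a minimal sketch with a single deliberately slow consumer and integers standing in for manifests; the maxsize of 4 and the item count are arbitrary. The producer records the queue depth after every put, and the recorded peak should not exceed maxsize.

```python
import asyncio


async def producer(queue: asyncio.Queue, total: int, depths: list[int]) -> None:
    for i in range(total):
        await queue.put(i)  # suspends while the queue is full
        depths.append(queue.qsize())
    await queue.put(None)  # sentinel: nothing more to produce


async def consumer(queue: asyncio.Queue, done: list[int]) -> None:
    while True:
        item = await queue.get()
        if item is None:
            break
        await asyncio.sleep(0.001)  # deliberately slower than the producer
        done.append(item)


async def demo(maxsize: int = 4, total: int = 50) -> tuple[int, int]:
    queue: asyncio.Queue = asyncio.Queue(maxsize=maxsize)
    depths: list[int] = []
    done: list[int] = []
    await asyncio.gather(producer(queue, total, depths), consumer(queue, done))
    return max(depths), len(done)


max_depth, processed = asyncio.run(demo())
print(max_depth, processed)
```

No rate limiter appears anywhere in the sketch: the producer is paced purely by how fast the consumer frees slots.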

asyncio cannot parallelize CPU work — one interpreter, one GIL-bound loop thread. The bridge is loop.run_in_executor paired with a ProcessPoolExecutor: the worker coroutine awaits the executor, yielding the loop to other coroutines while a separate process chews through the geometry kernel. Concurrency is capped twice — an asyncio.Semaphore bounds how many tasks are in flight on the loop side, and the executor’s max_workers bounds how many processes run at once. Size the semaphore and the pool to the physical core count (or a tuned fraction in a memory-constrained container) so total resident memory stays under the cgroup allocation.
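
The loop-side cap can be checked without a process pool. In this sketch, which assumes nothing beyond the standard library, asyncio.sleep stands in for awaiting the executor, and a shared counter tracks how many tasks hold the semaphore at once. The limit of 3 and the task count are arbitrary.

```python
import asyncio


async def tracked_task(sem: asyncio.Semaphore, state: dict[str, int]) -> None:
    async with sem:
        state["in_flight"] += 1
        state["peak"] = max(state["peak"], state["in_flight"])
        await asyncio.sleep(0.005)  # stand-in for awaiting the executor
        state["in_flight"] -= 1


async def peak_in_flight(limit: int, tasks: int) -> int:
    sem = asyncio.Semaphore(limit)
    state = {"in_flight": 0, "peak": 0}
    await asyncio.gather(*(tracked_task(sem, state) for _ in range(tasks)))
    return state["peak"]


peak = asyncio.run(peak_in_flight(limit=3, tasks=20))
print(peak)
```

However many tasks are scheduled, the peak never exceeds the semaphore's limit, which is what keeps resident memory predictable.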

Step-by-Step Implementation

Step 1: Define the task manifest and the CPU-bound kernel

The manifest is the only thing that travels through the queue. The kernel is an ordinary synchronous function so it can be pickled and shipped to a worker process; it must not touch the event loop or any async object.

import logging
import time
import uuid
from concurrent.futures import ProcessPoolExecutor
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s | %(levelname)s | %(name)s | %(message)s",
)
logger = logging.getLogger("indoor_pipeline")


@dataclass
class TaskManifest:
    file_path: Path
    source_format: str
    floor_level: str
    trace_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    retry_count: int = 0


def process_geometry_kernel(file_path: str, source_format: str) -> dict[str, Any]:
    """Run heavy CPU-bound vectorization in a separate process.

    Replace the body with real shapely / ezdxf / IfcOpenShell routines;
    the contract is that it returns a JSON-serializable result and raises
    on unrecoverable geometry errors.
    """
    logger.info("kernel start | %s (%s)", file_path, source_format)
    try:
        time.sleep(0.5)  # stand-in for polygonization + topology validation
        return {
            "status": "success",
            "geometry_count": 142,
            "topology_valid": True,
            "floor_level": Path(file_path).stem.split("_")[0],
        }
    except (ValueError, MemoryError) as exc:
        # ValueError: degenerate / self-intersecting geometry.
        # MemoryError: oversized IFC storey loaded into the worker.
        logger.error("kernel failed | %s | %s", file_path, exc)
        raise

Step 2: Build the bounded queue, semaphore, and process pool

The pipeline object owns the shared resources. The queue depth is derived from worker count so the buffer scales with concurrency instead of a magic constant.

import asyncio


class AsyncBatchPipeline:
    def __init__(self, max_workers: int = 4) -> None:
        self.max_workers: int = max_workers
        self.task_queue: asyncio.Queue[TaskManifest | None] = asyncio.Queue(
            maxsize=max_workers * 2
        )
        self.semaphore: asyncio.Semaphore = asyncio.Semaphore(max_workers)
        self.executor: ProcessPoolExecutor = ProcessPoolExecutor(max_workers=max_workers)
        # get_running_loop() needs a running loop: construct the pipeline inside a coroutine.
        self.loop: asyncio.AbstractEventLoop = asyncio.get_running_loop()
        logger.info("pipeline ready | workers=%d queue_depth=%d", max_workers, max_workers * 2)

Step 3: Scan with backpressure

The scanner is a coroutine that walks the source directory and enqueues one manifest per supported file. await self.task_queue.put(...) is the backpressure point: when the queue is full it suspends until a worker frees a slot.

    async def ingest_scanner(self, scan_dir: Path) -> None:
        logger.info("scan start | %s", scan_dir)
        supported = {".dwg", ".svg", ".pdf", ".ifc"}
        try:
            for fp in scan_dir.rglob("*"):
                if fp.suffix.lower() not in supported:
                    continue
                manifest = TaskManifest(
                    file_path=fp,
                    source_format=fp.suffix.lower().lstrip("."),  # ".DWG" -> "dwg"
                    floor_level=fp.stem.split("_")[0],
                )
                await self.task_queue.put(manifest)  # blocks when queue is full
                logger.debug("enqueued | %s | qsize=%d", fp.name, self.task_queue.qsize())
        except OSError as exc:
            # Network-mounted source dropped mid-scan; stop producing cleanly.
            logger.error("scan aborted | %s | %s", scan_dir, exc)
        finally:
            for _ in range(self.max_workers):
                await self.task_queue.put(None)  # one sentinel per worker
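
One caveat: Path.rglob is synchronous, so on a slow network mount the walk itself can hold the loop. A possible variation, sketched here with hypothetical helper names, moves the walk into a thread with asyncio.to_thread and puts a timeout around the wait. It trades streaming for a fully materialized list of paths, which stays small because only paths are held.

```python
import asyncio
from pathlib import Path

SUPPORTED = {".dwg", ".svg", ".pdf", ".ifc"}


def list_floor_plans(scan_dir: Path) -> list[Path]:
    """Blocking directory walk; intended to run in a worker thread."""
    return sorted(
        fp
        for fp in scan_dir.rglob("*")
        if fp.is_file() and fp.suffix.lower() in SUPPORTED
    )


async def list_floor_plans_async(scan_dir: Path, timeout: float = 30.0) -> list[Path]:
    # The walk runs in a thread, so the loop stays responsive. The timeout
    # stops waiting for the result but cannot interrupt the thread itself.
    return await asyncio.wait_for(
        asyncio.to_thread(list_floor_plans, scan_dir), timeout=timeout
    )
```

The scanner could then iterate over the returned list and await task_queue.put(...) per path, so backpressure still applies at the enqueue step.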

Step 4: Drain, delegate, and retry in the workers

Each worker loops until it pulls a None sentinel. It acquires the semaphore, delegates the CPU work to the pool with run_in_executor, then writes the result. Failures are retried in place up to a cap rather than re-enqueued: by the time a retry fires the scanner has often queued the sentinels already, so a re-enqueued manifest would land behind them, never be picked up, and leave queue.join() waiting forever. Export errors (OSError) are handled the same way so a failed write does not kill the worker. task_done() always fires in finally so queue.join() can complete.

    async def worker(self, worker_id: int) -> None:
        while True:
            manifest = await self.task_queue.get()
            if manifest is None:
                self.task_queue.task_done()
                break
            try:
                # Retry in place; a re-enqueued manifest could end up behind the sentinels.
                while True:
                    try:
                        async with self.semaphore:
                            result = await self.loop.run_in_executor(
                                self.executor,
                                process_geometry_kernel,
                                str(manifest.file_path),
                                manifest.source_format,
                            )
                            await self._export_feature_collection(result, manifest)
                        logger.info("worker-%d done | %s", worker_id, manifest.file_path.name)
                        break
                    except (ValueError, MemoryError, OSError) as exc:
                        logger.error(
                            "worker-%d failed | %s | trace=%s | %s",
                            worker_id, manifest.file_path.name, manifest.trace_id, exc,
                        )
                        if manifest.retry_count >= 2:
                            break  # retry cap reached: record and skip
                        manifest.retry_count += 1
            finally:
                self.task_queue.task_done()  # exactly one call per get()

Step 5: Export as a GeoJSON FeatureCollection

Downstream stages expect the same envelope every other parser emits, so the worker wraps the kernel output in a GeoJSON FeatureCollection rather than a bespoke shape. The trace id and a stable topology_hash travel with it for cache and rebuild decisions.

import hashlib
import json

import aiofiles


    async def _export_feature_collection(
        self, geo_result: dict[str, Any], manifest: TaskManifest
    ) -> None:
        topology_hash = hashlib.sha256(
            json.dumps(geo_result, sort_keys=True).encode()
        ).hexdigest()[:16]
        feature_collection = {
            "type": "FeatureCollection",
            "metadata": {
                "floor_level": manifest.floor_level,
                "trace_id": manifest.trace_id,
                "topology_hash": topology_hash,
                "source_format": manifest.source_format,
            },
            "features": [],  # populated from geo_result geometry in production
        }
        out_path = Path(f"output_{manifest.trace_id}.geojson")
        try:
            async with aiofiles.open(out_path, "w") as fh:
                await fh.write(json.dumps(feature_collection))
        except OSError as exc:
            logger.error("export failed | %s | %s", out_path, exc)
            raise
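
The hash is stable because json.dumps(..., sort_keys=True) produces the same bytes regardless of key order. The standalone sketch below pulls the same recipe out into a helper (the function name is illustrative) and checks both properties that cache and rebuild decisions rely on.

```python
import hashlib
import json
from typing import Any


def topology_hash(geo_result: dict[str, Any]) -> str:
    """Same recipe as the export step: canonical JSON, SHA-256, first 16 hex chars."""
    canonical = json.dumps(geo_result, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()[:16]


a = {"status": "success", "geometry_count": 142, "topology_valid": True}
b = {"topology_valid": True, "geometry_count": 142, "status": "success"}
c = {**a, "geometry_count": 143}

same_content = topology_hash(a) == topology_hash(b)  # key order does not matter
changed_content = topology_hash(a) == topology_hash(c)  # content does
print(same_content, changed_content)
```

Note that the hash covers whatever the kernel returns, so any volatile field in the result (a timestamp, for instance) would change it on every run.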

Step 6: Orchestrate the lifecycle

The runner starts the workers, runs the scanner, waits for the queue to drain with join(), then shuts the executor down. Because the scanner emits one sentinel per worker, each worker exits on its own — no forced cancellation, no half-written exports.

    async def run(self, scan_dir: Path) -> None:
        workers = [asyncio.create_task(self.worker(i)) for i in range(self.max_workers)]
        try:
            await self.ingest_scanner(scan_dir)
            await self.task_queue.join()
        finally:
            await asyncio.gather(*workers, return_exceptions=True)
            self.executor.shutdown(wait=True)
            logger.info("pipeline complete")


async def main() -> None:
    scan_directory = Path("./floor_plans_input")
    scan_directory.mkdir(exist_ok=True)
    pipeline = AsyncBatchPipeline(max_workers=4)
    await pipeline.run(scan_directory)


if __name__ == "__main__":
    asyncio.run(main())

Edge Cases & Gotchas

| Symptom | Root cause | Resolution |
| --- | --- | --- |
| Slow-callback warnings in asyncio debug mode (`Executing ... took N seconds`), periodic latency spikes | CPU-bound geometry running on the loop thread | Route every `shapely`/`IfcOpenShell` call through `loop.run_in_executor` with a `ProcessPoolExecutor`; never call them directly inside an `async def`. |
| `OSError: [Errno 24] Too many open files` | Unbounded concurrent file handles | Lower `max_workers`, raise `ulimit -n`, and ensure every `aiofiles` context exits; the `Semaphore` caps concurrent I/O. |
| Workers idle, queue never fills | Scanner blocking on a slow network mount | Run the directory walk in a thread with a timeout guard, or pre-stage files locally; verify storage latency before blaming the pipeline. |
| `MemoryError` / OOM kill on IFC | A worker loads a full multi-storey model into RAM | Chunk the IFC by storey before dispatch so each task is one floor level; keep `queue_depth` tight. |
| `queue.join()` hangs forever | A `task_done()` call was missed, or a worker died on an uncaught exception and left its sentinel in the queue | Keep `task_done()` in a `finally` and catch expected errors inside the worker loop; one unbalanced item keeps the join counter above zero. |
| `CancelledError` corrupts an export | Cancellation fires mid-write | Wrap the critical export in `asyncio.shield()` and reconcile partial files on restart by `topology_hash`. |
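
The shielding advice for cancelled exports can be sketched as follows. write_export is a hypothetical stand-in for the aiofiles write, and the sleep durations are arbitrary. The outer task is cancelled mid-write, the shielded inner write still runs to completion, and the cancellation is re-raised afterwards.

```python
import asyncio


async def write_export(log: list[str]) -> None:
    """Hypothetical stand-in for the aiofiles write."""
    log.append("write started")
    await asyncio.sleep(0.05)
    log.append("write finished")


async def export_shielded(log: list[str]) -> None:
    inner = asyncio.ensure_future(write_export(log))
    try:
        await asyncio.shield(inner)
    except asyncio.CancelledError:
        await inner  # let the write complete, then honour the cancellation
        raise


async def demo() -> tuple[list[str], bool]:
    log: list[str] = []
    task = asyncio.create_task(export_shielded(log))
    await asyncio.sleep(0.01)  # cancel while the write is in progress
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return log, task.cancelled()


events, was_cancelled = asyncio.run(demo())
print(events, was_cancelled)
```

A second cancellation arriving during the except block would still interrupt the write, so reconciling partial files on restart remains necessary.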

Validation Output

Confirm correctness with an assertion harness, not by eyeballing logs. A clean run drains the queue to zero, leaves no orphaned futures, and produces one valid FeatureCollection per input file.

def assert_batch_complete(pipeline: AsyncBatchPipeline, expected: int, written: int) -> None:
    assert pipeline.task_queue.qsize() == 0, "queue not drained — workers exited early"
    assert written == expected, f"expected {expected} outputs, wrote {written}"
    logger.info("batch validated | %d/%d floor plans exported", written, expected)

Correct vs. incorrect output is easy to distinguish at the envelope level. A healthy export carries metadata and a parseable geometry list:

{ "type": "FeatureCollection",
  "metadata": { "floor_level": "level3", "topology_hash": "9f2c1ab7d4e05c83", "source_format": "dwg" },
  "features": [ /* ... */ ] }

A failed-but-silent run looks different. Either the output file is missing altogether (the kernel raised, the retry cap was reached, and the export was skipped), or the wrapper is emitted with "features": [] and a topology_hash that does not change across versions. Alert on missing exports and empty feature lists rather than on process exit codes.
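
An envelope-level check along these lines could back such an alert. It is a minimal sketch: the function name and the set of required metadata keys are assumptions taken from the sample envelope above, not a published schema.

```python
import json
from typing import Any

REQUIRED_METADATA = {"floor_level", "topology_hash", "source_format"}


def envelope_problems(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the export looks healthy."""
    try:
        doc: Any = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(doc, dict):
        return ["top level is not a JSON object"]
    problems: list[str] = []
    if doc.get("type") != "FeatureCollection":
        problems.append("type is not FeatureCollection")
    metadata = doc.get("metadata")
    present = set(metadata) if isinstance(metadata, dict) else set()
    missing = REQUIRED_METADATA - present
    if missing:
        problems.append(f"metadata missing: {sorted(missing)}")
    if not doc.get("features"):
        problems.append("features list is empty")
    return problems


meta = {"floor_level": "level3", "topology_hash": "9f2c1ab7d4e05c83", "source_format": "dwg"}
healthy = json.dumps({"type": "FeatureCollection", "metadata": meta, "features": [{"type": "Feature"}]})
silent = json.dumps({"type": "FeatureCollection", "metadata": meta, "features": []})
print(envelope_problems(healthy), envelope_problems(silent))
```

Running it over every file the batch was expected to produce also surfaces the missing-output case, since an absent file can be reported before parsing is attempted.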

Performance & Scale Notes

Throughput is gated by the slowest stage, not the loop. With CPU-bound kernels, end-to-end time is roughly (file_count / pool_workers) * per_file_seconds. Adding loop-side workers beyond the pool size only deepens the queue; scale the ProcessPoolExecutor to add real parallelism.
Memory scales with pool_workers × peak_file_RSS, not with batch size. This is why the bounded queue matters: it keeps thousands of pending floor plans as kilobyte manifests while only pool_workers heavy models are resident at once.
Profile before tuning. Measure per-file CPU and peak RSS, then set max_workers so workers × peak_RSS stays under the container limit with headroom. Watch executor task-duration percentiles; if P95 exceeds your budget, optimize the kernel before adding processes.
Stream completion events instead of reloading. Publishing each topology_hash to a broker (Redis Streams, RabbitMQ) lets navigation teams trigger incremental routing-graph rebuilds for just the changed floor levels. Those artifacts are then gated by CI Gating for Map Updates before they ship, and stale tiles are dropped by Cache Invalidation Strategies.
Use tracemalloc in staging to catch slow leaks in long-lived worker processes before they reach production batch windows.
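
The two sizing rules above reduce to simple arithmetic. The helpers below are a sketch with hypothetical names; the 20% headroom default and the example figures are illustrative, not measurements. The time estimate rounds the file count up to whole waves of pool_workers, a slight refinement of the rough formula.

```python
import math


def estimate_batch_seconds(file_count: int, pool_workers: int, per_file_seconds: float) -> float:
    """Rough wall-clock estimate: files are processed in waves of pool_workers."""
    return math.ceil(file_count / pool_workers) * per_file_seconds


def max_workers_for_memory(
    container_limit_bytes: int,
    peak_rss_bytes: int,
    cores: int,
    headroom: float = 0.2,
) -> int:
    """Largest worker count whose combined peak RSS fits under the limit."""
    usable = container_limit_bytes * (1.0 - headroom)
    by_memory = int(usable // peak_rss_bytes)
    return max(1, min(cores, by_memory))


GIB = 2**30
batch_seconds = estimate_batch_seconds(file_count=400, pool_workers=8, per_file_seconds=0.5)
workers = max_workers_for_memory(16 * GIB, int(2.5 * GIB), cores=8)
print(batch_seconds, workers)
```

With these example figures memory, not core count, is the binding constraint: eight cores are available but only five 2.5 GiB workers fit under a 16 GiB limit with headroom.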

Downstream, the deterministic envelope hands off cleanly to Wall & Door Detection Algorithms, which depend on clean, non-overlapping polygons, and to the contract defined in JSON Schema Design for Indoor Maps.

Frequently Asked Questions

Why a ProcessPoolExecutor instead of a ThreadPoolExecutor?

Floor plan vectorization is CPU-bound and runs under the GIL, so threads would serialize on it and give no real parallelism. A ProcessPoolExecutor runs each geometry kernel in its own interpreter, sidestepping the GIL. Use a thread pool only for the rare blocking call that releases the GIL (some C-extension I/O); for shapely, ezdxf, and IfcOpenShell geometry, processes are the correct choice.

How deep should the queue be?

maxsize = 2 * worker_count is a reliable default: deep enough that workers never starve between tasks, shallow enough that backpressure engages before memory bloats. Because manifests are lightweight, the queue bounds metadata, not file bytes — so you can keep it tight without sacrificing throughput.

How do I stop one corrupt floor plan from killing the whole batch?

Catch the kernel's known exceptions (ValueError for degenerate geometry, MemoryError for oversized models) inside the worker, log with the trace id, and retry up to a cap. Anything past the cap is recorded and skipped, not raised. The batch finishes and the failures surface in validation as missing or empty exports rather than as a crashed process.

Why chunk IFC files before dispatch instead of inside the worker?

Loading a full multi-storey IFC into a worker means peak RSS is the size of the entire building, multiplied by every concurrent process — the fast path to an OOM kill. Splitting the model by storey upstream makes each task exactly one floor level, so memory scales with floors-in-flight, not building size, and the Level Mapping & Z-Axis Logic rules keep the levels distinct.

This page is part of the Automated Floor Plan Parsing & Vectorization section of the Indoor Mapping & Wayfinding Automation reference.

Related

Building Async Pipelines for Batch Floor Plan Processing