Best Practices for Indoor POI Taxonomy

This page covers the design decision that determines whether an indoor POI taxonomy stays routable: how deep to make the hierarchy and what to express as attributes instead. It sits under POI Taxonomy & Classification within the broader Indoor Mapping Architecture & Standards reference. The classification pipeline enforces whatever vocabulary you give it; this page is about giving it a vocabulary that does not quietly fracture the routing graph.

Concept Definition

An indoor POI taxonomy is a controlled, finite vocabulary of point-of-interest types arranged as a strict tree, where every physical space resolves to exactly one leaf and every leaf maps to one routable entity. “Best practice” here is not about being expressive — it is about three measurable properties that keep the taxonomy compatible with distance-based pathfinding:

Bounded depth. A three-tier model — domain (Corporate, Healthcare, Transit), category (Circulation, Amenity, Clinical), type (Elevator, Restroom, Exam_Room) — is enough to drive routing weight, accessibility filtering, and search. Each extra tier multiplies the leaf space a pathfinder and a search index must reason over without adding routing information.
Single-leaf resolution. A POI belongs to one and only one classification path. Multi-parent membership makes an A*/Dijkstra search double-count a node’s edges or fabricate phantom corridors, because the same geometry resolves to two graph nodes.
Attributes, not tiers, for cross-cutting facts. Wheelchair access, badge requirements, gender-neutral status, and department ownership are tags on a leaf, never new branches. type: "Restroom" with attributes: {wheelchair_accessible: true, gender_neutral: false} stays one leaf; Accessible_Unisex_Restroom forks the tree and bloats the leaf space combinatorially.
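
The attributes-over-tiers rule can be illustrated with a minimal sketch. The `Poi` record, the `find_pois` filter, and the `leaf_variants` helper are hypothetical names introduced here for illustration; they are not part of the classification pipeline. The sketch assumes boolean attributes only and shows that two differently tagged restrooms share one leaf, while encoding n boolean facts in leaf names could produce up to 2**n leaves per type.

```python
from dataclasses import dataclass, field


@dataclass
class Poi:
    """Hypothetical POI record: one three-tier path plus attribute tags."""

    poi_id: str
    classification_path: tuple[str, str, str]  # (domain, category, type)
    attributes: dict[str, bool] = field(default_factory=dict)

    @property
    def leaf(self) -> str:
        # The leaf is always the last tier of the path.
        return self.classification_path[-1]


def find_pois(pois: list[Poi], leaf: str, **required: bool) -> list[Poi]:
    """Filter by leaf type first, then by attribute tags."""
    return [
        p
        for p in pois
        if p.leaf == leaf
        and all(p.attributes.get(key) == value for key, value in required.items())
    ]


def leaf_variants(boolean_attribute_count: int) -> int:
    """Upper bound on leaves per type if every boolean fact were baked into the name."""
    return 2**boolean_attribute_count


pois = [
    Poi("r1", ("Corporate", "Amenity", "Restroom"),
        {"wheelchair_accessible": True, "gender_neutral": False}),
    Poi("r2", ("Corporate", "Amenity", "Restroom"),
        {"wheelchair_accessible": False, "gender_neutral": True}),
]

# Both restrooms stay on the same leaf; the distinction lives in the tags.
accessible = find_pois(pois, "Restroom", wheelchair_accessible=True)
```

Search and accessibility filtering still distinguish the two restrooms, but the routing graph sees a single node type.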

The anti-pattern these rules prevent is over-classification: a taxonomy so granular that semantically identical spaces land on different leaves, so the routing graph attaches different traversal costs to the same kind of space and search relevance scatters. Over-classification rarely throws an exception — it surfaces as routing that is plausibly wrong.

Minimal Working Example

The technique that operationalizes these rules is a granularity audit: before a taxonomy is published, assert that no path exceeds the allowed depth and that no leaf has more than one parent. The self-contained routine below takes the parent/child edges of a candidate taxonomy and returns the two failure classes that break routing — paths that are too deep, and leaves that resolve under multiple parents.

import logging
from collections import defaultdict
from typing import Iterable

logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
logger = logging.getLogger("poi_taxonomy_audit")

def audit_taxonomy_shape(
    edges: Iterable[tuple[str, str]], max_depth: int = 3
) -> dict[str, list[str]]:
    """Flag taxonomy leaves that are too deep or have >1 parent.

    `edges` are (parent, child) pairs; `max_depth` is the allowed tier count.
    Returns the offending leaves keyed by failure class.
    """
    parents: dict[str, list[str]] = defaultdict(list)
    try:
        for parent, child in edges:
            parents[child].append(parent)
    except (TypeError, ValueError) as exc:  # malformed edge: not iterable, or wrong arity
        logger.error("Edge list is not (parent, child) pairs: %s", exc)
        raise

    multi_parent = [c for c, ps in parents.items() if len(ps) > 1]

    def depth(node: str, seen: frozenset[str] = frozenset()) -> int:
        if node in seen:  # cycle guard
            return max_depth + 1
        ps = parents.get(node)
        return 1 if not ps else 1 + max(depth(p, seen | {node}) for p in ps)

    too_deep = [c for c in parents if depth(c) > max_depth]
    issues = {"multi_parent": multi_parent, "too_deep": too_deep}
    logger.info("Audited %d nodes: %s", len(parents), {k: len(v) for k, v in issues.items()})
    return {k: v for k, v in issues.items() if v}

Run it on the edge export before classification; a non-empty result means the design is wrong, not the data, so fix the vocabulary rather than patching individual rows downstream.

Parameter & Threshold Reference

Parameter	Type	Default / Threshold	Notes
`max_depth`	`int`	`3`	Tier count: domain → category → type. `> 3` rarely adds routing information.
Parents per leaf	`int`	`1` (exact)	More than one parent forks the node in the routing graph.
Leaves per category	`int`	`≤ 12` recommended	A category exploding past ~12 leaves usually signals attributes masquerading as types.
`classification_path` length	`list[str]`	exactly `3`, no empty element	An empty trailing element passes a naive length check but breaks leaf lookup.
Cross-cutting facts	attribute tags	not tiers	Accessibility, badge access, hours → key/value payload on the leaf.
Leaf naming	`str`	`Snake_Case` noun, no modifiers	`Restroom`, not `Accessible_Unisex_Restroom`; modifiers belong in attributes.
`routing_weight`	`float`	`0.1`–`1.0`	Derived from the leaf type, not encoded in its name.
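
The leaves-per-category guideline in the table is not covered by the depth and parent checks, so it can be audited separately. The sketch below is one possible way to do that; `oversized_categories` is a hypothetical helper, and it assumes the same (parent, child) edge pairs as the audit routine, treating any node that never appears as a parent as a leaf.

```python
from collections import defaultdict
from typing import Iterable


def oversized_categories(
    edges: Iterable[tuple[str, str]], max_leaves: int = 12
) -> dict[str, int]:
    """Return categories whose direct leaf children exceed `max_leaves`.

    A leaf is any node that never appears as a parent in `edges`.
    """
    children: dict[str, set[str]] = defaultdict(set)
    for parent, child in edges:
        children[parent].add(child)

    # Snapshot the parent names so the leaf test never indexes the defaultdict.
    parent_nodes = set(children)
    leaf_counts = {
        parent: sum(1 for kid in kids if kid not in parent_nodes)
        for parent, kids in children.items()
    }
    return {parent: n for parent, n in leaf_counts.items() if n > max_leaves}


# Illustrative data: 13 restroom variants under one category.
edges = [("Corporate", "Amenity"), ("Corporate", "Circulation"), ("Circulation", "Elevator")]
edges += [("Amenity", f"Restroom_Variant_{i}") for i in range(13)]
print(oversized_categories(edges))  # {'Amenity': 13}
```

A category flagged this way is a candidate for review, not an automatic failure: the usual remedy is to check whether its leaves differ only by facts that belong in attributes.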

Common Errors & Fixes

AssertionError: multi_parent leaves = ['Elevator_Lobby']. The audit routine itself only returns the offending names, so this error comes from whatever assertion wraps its result. It means a space was placed under two categories (often Circulation and Amenity) to make it findable from both. The audit catches it, but the fix is design, not code: keep the single most routing-relevant parent (Circulation) and add a queryable tag (is_amenity_adjacent: true) so search still finds it without forking the graph node.
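
The single-parent fix can be sketched as a small resolution step. `resolve_single_parent` and the `priority` ordering are assumptions made for illustration, and the `is_<category>_adjacent` tag pattern is generalized from the one tag named above; a real taxonomy would choose its own ordering and tag names.

```python
def resolve_single_parent(
    parents: list[str], priority: list[str]
) -> tuple[str, dict[str, bool]]:
    """Keep the highest-priority parent and turn the dropped ones into tags.

    `priority` lists categories from most to least routing-relevant;
    categories missing from it rank last.
    """
    if not parents:
        raise ValueError("a leaf needs at least one parent")

    def rank(parent: str) -> int:
        return priority.index(parent) if parent in priority else len(priority)

    ranked = sorted(parents, key=rank)  # stable sort keeps input order on ties
    kept, dropped = ranked[0], ranked[1:]
    tags = {f"is_{name.lower()}_adjacent": True for name in dropped}
    return kept, tags


kept, tags = resolve_single_parent(
    ["Amenity", "Circulation"], priority=["Circulation", "Clinical", "Amenity"]
)
```

Here `kept` is the one parent that remains in the tree, and `tags` is merged into the leaf's attribute payload so search can still reach the space from the dropped category.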

Routing weights that disagree for identical spaces — two huddle rooms resolve to Conference_Small and Huddle, so the routing graph assigns them different traversal costs even though they are the same kind of space. The root cause is leaf granularity drifting per source. Collapse near-duplicate leaves and express the distinction as an attribute:

def collapse_leaves(leaf: str, alias_map: dict[str, str]) -> str:
    """Fold near-duplicate leaf labels onto one canonical type."""
    label = leaf.strip()  # normalize whitespace before the alias lookup
    canonical = alias_map.get(label, label)
    if canonical != label:  # log only real alias hits, not whitespace trims
        logger.info("Collapsed leaf %r -> %r", label, canonical)
    return canonical

# alias_map = {"Huddle": "Conference_Room", "Conference_Small": "Conference_Room", "Conf_Rm": "Conference_Room"}

KeyError on leaf lookup after a sync — a path degraded from ["Healthcare", "Clinical", "Exam_Room"] to ["Healthcare", "Exam_Room", ""] because an upstream column was dropped, and the empty trailing element slipped past a length-only check. Validate that no path element is empty, not merely that there are three of them, so the broken row is rejected at the boundary instead of crashing the leaf lookup later.
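
A boundary check along these lines rejects the degraded path before it reaches the leaf lookup. This is a minimal sketch, not the pipeline's validator: `validate_classification_path` is a hypothetical name, and it assumes paths arrive as plain lists of strings.

```python
def validate_classification_path(
    path: list[str], expected_depth: int = 3
) -> list[str]:
    """Return a list of problems; an empty list means the path is usable."""
    problems: list[str] = []
    if len(path) != expected_depth:
        problems.append(f"expected {expected_depth} elements, got {len(path)}")
    for index, element in enumerate(path):
        # A length-only check misses blank or whitespace-only tiers.
        if not isinstance(element, str) or not element.strip():
            problems.append(f"element {index} is empty or not a string")
    return problems


good = validate_classification_path(["Healthcare", "Clinical", "Exam_Room"])
bad = validate_classification_path(["Healthcare", "Exam_Room", ""])
```

The degraded path has the right length, so only the per-element check reports it.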

Integration Point

This design step sits upstream of everything that consumes a POI. The vocabulary you settle here is the controlled list the POI Taxonomy & Classification pipeline freezes into its pydantic validator, and the leaf types you keep shallow become the nodes the routing graph weights. Because every leaf binds to geometry, the taxonomy only behaves once each POI sits in a metric frame established by a consistent Indoor Coordinate Reference System. Classify before that, and the spatial within bind that gates routability passes or fails depending on which units the geometry happens to use. Downstream, the resilient Fallback Routing Architectures assume every POI shares one taxonomy when they prune restricted edges, and the published vocabulary travels in the same FeatureCollection envelope defined by JSON Schema Design for Indoor Maps so SDKs read one shape. Wire the granularity audit into CI Gating for Map Updates so an over-classified taxonomy fails the build before it reaches a published map.
