OSPF — Link-State Routing at Scale

OSPF

How OSPF builds a complete map of the network using link-state advertisements, elects designated routers, calculates shortest paths with Dijkstra's algorithm, and scales through hierarchical area design.

layer3ospflink-statelsaspfareadrbdrrfc2328

Overview

RIP shares routing tables. OSPF shares the network topology itself.

In OSPF, every router floods information about its own directly connected links throughout the network. Each router collects this information from all other routers, building a complete map of the network — the Link-State Database (LSDB). Once every router has the same LSDB, each independently runs Dijkstra’s Shortest Path First (SPF) algorithm to compute the optimal route to every destination. The result is consistent routing tables across all routers, computed from a shared understanding of the network.

This fundamental difference — sharing topology rather than routing tables — gives OSPF dramatic advantages over distance-vector protocols like RIP:

OSPF is defined in RFC 2328 (OSPFv2 for IPv4) and RFC 5340 (OSPFv3 for IPv6). It is the most widely deployed interior gateway protocol in enterprise and service provider networks.


OSPF Neighbor Relationships

Before exchanging topology information, OSPF routers must establish neighbor adjacencies. Routers discover each other by sending Hello packets out every OSPF-enabled interface every Hello interval (10 seconds by default on broadcast networks, 30 seconds on non-broadcast links). A neighbor is declared dead if no Hello is received within the Dead interval (typically 4× the Hello interval).

Hello packets are multicast to 224.0.0.5 (all OSPF routers) and carry critical parameters that must match for adjacency to form:

ParameterMust Match?
Area IDYes
AuthenticationYes
Hello intervalYes
Dead intervalYes
Subnet maskYes (on broadcast networks)
Stub area flagYes
MTUOn many platforms (must match or MSS adjust)
Router A
Router B
Hello
Router ID, area, timers — starts Discovery
Hello
Includes Router A's ID in neighbor list — 2-Way
DBD (Database Description)
List of LSAs in my LSDB
DBD (Database Description)
List of LSAs in my LSDB
LSR (Link-State Request)
Give me these LSAs I don't have
LSU (Link-State Update)
Here are the requested LSAs
LSAck
Acknowledged — adjacency Full

The adjacency process goes through several states:

When two routers reach Full state, their LSDBs are identical and SPF can run.


Every OSPF router generates Link-State Advertisements (LSAs) describing its own router ID, connected interfaces, neighbors, and costs. These LSAs are flooded throughout the OSPF domain (or area) and stored by every router in the LSDB.

The most important LSA types:

LSA TypeNameGenerated ByDescribes
Type 1Router LSAEvery routerAll of the router’s interfaces and their costs
Type 2Network LSADesignated Router (DR)All routers connected to a broadcast segment
Type 3Summary LSAArea Border Router (ABR)Routes from other areas (inter-area)
Type 4ASBR Summary LSAABRLocation of the ASBR
Type 5AS External LSAAS Boundary Router (ASBR)External routes (from BGP, static, etc.)
Type 7NSSA External LSAASBR in NSSAExternal routes in stub areas

Each LSA has a Sequence Number, Age, and Checksum. The sequence number increases each time the LSA is updated. When a router receives a newer version of an LSA (higher sequence number), it replaces the old version and refloods the update.


OSPF Metric — Cost

OSPF uses cost as its metric. The cost of a path is the sum of the costs of all links in the path from source to destination. The SPF algorithm finds the path with the minimum total cost.

The default cost formula: cost = reference bandwidth / interface bandwidth

With the default reference bandwidth of 100 Mbps:

Interface SpeedDefault Cost
10 Mbps10
100 Mbps1
1 Gbps1
10 Gbps1

Important problem: The default reference bandwidth was set when 100 Mbps was fast. With this formula, a 100 Mbps link and a 10 Gbps link have the same cost (1) — OSPF cannot distinguish between them. In any modern network, the reference bandwidth must be changed to at least 10,000 Mbps (10 Gbps) or higher so that different high-speed link types get different costs.

! Increase OSPF reference bandwidth:
router ospf 1
  auto-cost reference-bandwidth 10000

Cost can also be set manually on individual interfaces when automatic calculation does not produce the desired result.


Designated Router and Backup Designated Router

On a broadcast network (Ethernet), if every OSPF router formed a full adjacency with every other router on the segment, the number of adjacencies would grow as n(n−1)/2 — on a segment with 10 routers, that is 45 adjacencies. Each adjacency means LSDB synchronization, which is expensive.

OSPF solves this with the Designated Router (DR) and Backup Designated Router (BDR) election. On each broadcast segment, one router is elected DR and another is elected BDR. All other routers (called DROther) form full adjacencies only with the DR and BDR — not with each other. This reduces adjacencies from 45 to approximately 2n−3 for 10 routers.

The DR’s job is to generate the Type 2 Network LSA for the broadcast segment, describing all routers connected to it. The DR also receives and retransmits LSAs on behalf of the segment, using the multicast address 224.0.0.6 (all DR routers).

DR/BDR election rules:

  1. Router with highest OSPF interface priority wins (default priority 1; priority 0 = ineligible to be DR/BDR)
  2. Tie-broken by highest Router ID (typically the highest IP address, or a configured Router ID)

The DR/BDR election is non-preemptive. If a new router with a higher priority joins the segment, it does not take over as DR until the current DR fails. This avoids unnecessary topology changes.


OSPF Areas — Hierarchical Design

A single OSPF domain with thousands of routers would face two problems: the LSDB would be enormous (every router stores LSAs from every other router), and every topology change would trigger an SPF recalculation on every router.

OSPF areas solve this by dividing the network into logical groups. Within an area, full topology information is shared. Between areas, only summary (aggregated) route information crosses area boundaries.

All areas must connect to Area 0 (the backbone area). Area 0 is the hub through which all inter-area traffic flows. Non-backbone areas connect to Area 0 via Area Border Routers (ABRs) — routers that have interfaces in both Area 0 and a non-backbone area.

Area 1 ─── ABR ─── Area 0 ─── ABR ─── Area 2

                    ABR

                   Area 3

Within an area: All routers share the full topology (all Type 1 and Type 2 LSAs). SPF runs on the complete intra-area topology.

Between areas: ABRs summarize the routes from each area and advertise them into Area 0 as Type 3 Summary LSAs. Area 0 propagates these summaries to other areas. Routers in Area 1 do not know the internal topology of Area 2 — they only know the summarized routes.

This hierarchy dramatically limits the impact of topology changes: a link failure in Area 2 triggers SPF only in Area 2. Routers in Area 0 and Area 1 only need to update their routing tables (which change because the summary may change), not rerun a full SPF.


Special Area Types

Stub area: Does not accept Type 5 (External) LSAs. Instead, the ABR injects a default route into the area. Routers in the stub area use the default route for external destinations. Reduces LSDB size significantly in areas that do not need detailed external routing information.

Totally stubby area (Cisco proprietary): Does not accept Type 3, 4, or 5 LSAs — only intra-area routes and a default route. The most restrictive and smallest LSDB possible. Used for branches with only one exit point.

NSSA (Not-So-Stubby Area): A stub area that also needs to import external routes. The ASBR in an NSSA generates Type 7 LSAs (instead of Type 5). The ABR converts Type 7 to Type 5 when advertising to Area 0. This allows external routes to be injected from within the area while still keeping most external LSAs out.


SPF Calculation

When the LSDB is complete (all LSAs received), each router independently runs Dijkstra’s Shortest Path First algorithm to build a shortest-path tree from itself to every destination in the area.

The algorithm:

  1. Place the router itself at the root with cost 0
  2. Examine all neighbors — add them to the candidate list with their link costs
  3. Select the candidate with the lowest total cost, add it to the SPF tree
  4. Examine that router’s neighbors, updating costs if a shorter path is found
  5. Repeat until all nodes are in the SPF tree

The result is the shortest (lowest cost) path from the local router to every other router in the area. The next-hop of that shortest path becomes the next-hop in the routing table for each destination.

SPF is computationally expensive. A large area with thousands of routers means a large SPF computation. This is why OSPF limits the LSA flood scope with areas — SPF only runs per area, not for the entire network.


OSPF Authentication

OSPF supports authentication to prevent rogue routers from injecting false LSAs. Two modes:

Simple (cleartext) authentication: A shared password is included in the OSPF packet header. Any network capture reveals the password. Provides minimal security.

MD5 authentication: A keyed MD5 hash of the OSPF packet and a shared secret is appended to each packet. The shared secret is never transmitted. This prevents forged LSA injection as long as the shared secret is kept confidential.

OSPFv3 (for IPv6) drops its own authentication mechanism entirely and relies on IPsec for security, since IPv6 includes IPsec support natively.


Key Concepts

Every OSPF router has the same view

Within an area, every router’s LSDB contains exactly the same LSAs. There is no inconsistency, no matter how the information propagated. SPF running on the same database always produces the same tree. This is the core advantage over distance-vector: there is a shared ground truth.

OSPF convergence depends on timers and SPF scheduling

After a topology change, OSPF convergence involves: detecting the failure (Dead interval or BFD), flooding the LSA change, and running SPF. Modern implementations schedule SPF with exponential backoff to avoid thrashing the CPU when many changes occur simultaneously. BFD (Bidirectional Forwarding Detection) provides sub-second failure detection, bypassing the slow Dead interval.


References