  • iBGP vs eBGP: Same Protocol, Completely Different Rules

    BGP is BGP — until you realize you are running two fundamentally different versions of it depending on where your routers sit relative to an AS boundary. Miss that distinction and you will spend hours debugging why routes are not propagating the way you expect.


    The One Line That Explains Everything

    The difference comes down to one thing: Autonomous System numbers.

    • eBGP (External BGP): peers in different AS numbers
    • iBGP (Internal BGP): peers in the same AS number

    That is it. Same protocol (TCP port 179), same UPDATE messages, same path attributes — but the rules change dramatically based on which side of that boundary you are on. Think of it like the difference between talking to a colleague on your own team versus negotiating with another company. The language is the same; the trust model is completely different.
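
    A minimal sketch of that rule in Python, assuming a session is described only by its local and remote AS numbers. The function name is illustrative and does not come from any vendor tooling.

```python
def session_type(local_as: int, remote_as: int) -> str:
    """Classify a BGP session purely by comparing the two AS numbers."""
    return "iBGP" if local_as == remote_as else "eBGP"


# Example AS numbers from the private range (illustrative only).
print(session_type(65001, 64512))  # eBGP
print(session_type(65001, 65001))  # iBGP
```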


    eBGP: Talking to the Outside World

    eBGP is what most people picture when they think of BGP — the session between your edge router and your ISP, between your data center and a transit provider, or between two organizations exchanging routes across an administrative boundary.

    Key Behaviors

    • TTL = 1 by default — eBGP peers must be directly connected. Use ebgp-multihop when peering across intermediate hops (loopbacks, out-of-band peering links).
    • Next-hop changes to the advertising router — your ISP advertises a default route with their interface IP as the next-hop. You own the resolution from there.
    • Administrative distance of 20 (Cisco default): preferred over IGP-learned routes such as OSPF (110) and IS-IS (115), though connected and static routes still win.

    ! Cisco IOS: eBGP to upstream ISP
    router bgp 65001
     neighbor 203.0.113.1 remote-as 64512
     neighbor 203.0.113.1 description ATT_Transit
     neighbor 203.0.113.1 ebgp-multihop 2
     !
     address-family ipv4 unicast
      neighbor 203.0.113.1 activate
      neighbor 203.0.113.1 route-map FILTER-IN in
      neighbor 203.0.113.1 route-map ADVERTISE-OUT out

    When to Use eBGP

    • Internet peering — ISPs, CDNs, IXPs
    • Multi-homed connectivity to two or more transit providers
    • Data center interconnect (DCI) across separate AS domains
    • Any time you are crossing an organizational or administrative boundary

    iBGP: Getting Your Own Routers to Agree

    iBGP is where things get more nuanced — and where most engineers get tripped up. Your edge routers learn routes from ISPs via eBGP. But your internal core routers need to know about those routes too so they can make forwarding decisions. That is iBGP’s job: distributing externally-learned routes across your own AS.

    Key Behaviors

    • TTL = 255 — peers do not need to be directly connected. You almost always peer over loopbacks for stability. If a physical link flaps but the loopback is reachable via IGP, the BGP session stays up.
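    • Next-hop is NOT changed by default. This is the one that bites people. A route learned via eBGP and re-advertised over iBGP keeps the ISP address as the next-hop, so your internal routers must be able to resolve that address, either because the ISP-facing subnet is carried in the IGP (OSPF or IS-IS) or because the edge router sets next-hop-self. A next-hop that does not resolve leaves the route unusable, and traffic for that prefix has nowhere to go.
    • Administrative distance of 200 (Cisco default): less preferred than eBGP and than every IGP, so an IGP route to the same prefix wins.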
    • No re-advertisement between iBGP peers — a route learned from one iBGP peer will not be passed to another iBGP peer. This is the loop prevention mechanism, and it is exactly why full mesh or route reflectors are required.

    # Juniper: iBGP peering over loopbacks
    set protocols bgp group IBGP type internal
    set protocols bgp group IBGP local-address 10.0.0.1
    set protocols bgp group IBGP neighbor 10.0.0.2 description CORE-RTR-2
    set protocols bgp group IBGP neighbor 10.0.0.3 description CORE-RTR-3
    set protocols bgp group IBGP neighbor 10.0.0.4 description EDGE-RTR-2
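
    The next-hop rule above can be sketched in Python. This is a simplified model, assuming the IGP table is just a list of prefixes and that resolution is a plain longest-prefix match. The function name and the 203.0.113.0/30 link subnet are illustrative assumptions, not output from a real router.

```python
import ipaddress


def resolve_next_hop(next_hop, igp_prefixes):
    """Return the longest IGP prefix that covers next_hop, or None."""
    addr = ipaddress.ip_address(next_hop)
    covering = [
        net
        for net in map(ipaddress.ip_network, igp_prefixes)
        if addr in net
    ]
    # Longest prefix wins; None means the BGP route cannot be used.
    return max(covering, key=lambda net: net.prefixlen, default=None)


# The IGP carries only internal loopbacks: the ISP next-hop does not resolve.
igp = ["10.0.0.1/32", "10.0.0.2/32", "10.0.0.3/32"]
print(resolve_next_hop("203.0.113.1", igp))  # None

# Once the ISP-facing link subnet is in the IGP, the next-hop resolves.
igp.append("203.0.113.0/30")
print(resolve_next_hop("203.0.113.1", igp))  # 203.0.113.0/30
```

    Setting next-hop-self on the edge router reaches the same result from the other direction: the next-hop becomes an internal loopback that the IGP already knows.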

    The Full-Mesh Problem

    Here is the catch with iBGP: every router must peer with every other router in the AS. In a 5-router network that is 10 sessions. In a 20-router network that is 190. It scales as n(n-1)/2 and gets ugly fast.
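
    The growth is easy to check with a few lines of Python. This is only the n(n-1)/2 arithmetic from the paragraph above; the function name is illustrative.

```python
def full_mesh_sessions(n: int) -> int:
    """Number of iBGP sessions in a full mesh of n routers: n(n-1)/2."""
    return n * (n - 1) // 2


for routers in (5, 20, 50):
    print(routers, full_mesh_sessions(routers))
# 5 10
# 20 190
# 50 1225
```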

    The standard solution is route reflectors (RR). Designate one or more routers as a hub; all other routers (clients) peer only with the RR. The RR re-advertises routes between clients, which is the one deliberate exception to the no-re-advertisement rule. This is how most enterprise and service provider networks solve the scaling problem, with BGP confederations as the less common alternative.

    ! Cisco IOS: route reflector config
    router bgp 65001
     neighbor 10.0.0.2 remote-as 65001
     neighbor 10.0.0.2 description CORE-RTR-2
     ! Assumes Loopback0 holds this router's peering address
     neighbor 10.0.0.2 update-source Loopback0
     neighbor 10.0.0.2 route-reflector-client
     neighbor 10.0.0.3 remote-as 65001
     neighbor 10.0.0.3 description CORE-RTR-3
     neighbor 10.0.0.3 update-source Loopback0
     neighbor 10.0.0.3 route-reflector-client

    Put your RR on your most stable, most connected routers — not the edge. Edge routers go into maintenance windows, lose circuits, and get replaced. Your RR needs to be the last thing to fail.
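
    To see why the reflector matters, here is a small Python simulation of the iBGP advertisement rules. It is a sketch under simplifying assumptions: a single route, no best-path selection, and no cluster-list handling. The router names and the `ibgp_reach` function are hypothetical.

```python
from collections import deque


def ibgp_reach(sessions, origin, rr_clients=None):
    """Return the routers that learn a route injected at `origin`.

    sessions:   iterable of (a, b) iBGP sessions.
    rr_clients: dict mapping a route reflector to the set of its clients.
    """
    rr_clients = rr_clients or {}
    peers = {}
    for a, b in sessions:
        peers.setdefault(a, set()).add(b)
        peers.setdefault(b, set()).add(a)

    learned = {origin}
    # Each entry is (router, peer it learned the route from).
    # None means the route came from eBGP or was originated locally.
    queue = deque([(origin, None)])
    while queue:
        router, learned_from = queue.popleft()
        clients = rr_clients.get(router, set())
        for peer in peers.get(router, ()):
            if peer == learned_from or peer in learned:
                continue
            if learned_from is None:
                allowed = True   # eBGP-learned: advertise to every iBGP peer
            elif learned_from in clients:
                allowed = True   # RR: a client's route is reflected to everyone
            elif peer in clients:
                allowed = True   # RR: a non-client's route goes to clients only
            else:
                allowed = False  # plain iBGP speaker: never re-advertise
            if allowed:
                learned.add(peer)
                queue.append((peer, router))
    return learned


chain = [("R1", "R2"), ("R2", "R3")]

# No reflector: R2 will not pass R1's route on to R3.
print(sorted(ibgp_reach(chain, "R1")))  # ['R1', 'R2']

# R2 as reflector with R1 and R3 as clients: the route reaches R3.
print(sorted(ibgp_reach(chain, "R1", {"R2": {"R1", "R3"}})))  # ['R1', 'R2', 'R3']
```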


    Key Differences at a Glance

                       eBGP                              iBGP
    AS relationship    Different ASes                    Same AS
    Default TTL        1 (direct connect)                255 (loopback peering)
    Next-hop behavior  Changes to local router           Preserved (unchanged)
    Route propagation  Advertises freely                 Will not re-advertise to iBGP peers
    Admin distance     20                                200
    Scale challenge    Managed per-peer policy           Full mesh required (use RR)
    Typical use        ISP peering, DCI, org boundaries  Internal route distribution

    Design Decisions: Where It Gets Real

    Single-homed with one ISP?

    You probably do not need BGP at all. A static default route is simpler, more stable, and easier to troubleshoot. BGP to a single upstream is operational overhead without meaningful benefit.

    Multi-homed with two ISPs?

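    Now BGP earns its place. Run eBGP to both providers. Use LOCAL_PREF to control outbound path preference. To influence inbound traffic across two different providers, use AS_PATH prepending or provider-defined communities. MED is by default only compared between routes from the same neighboring AS, so it helps when you have multiple links to one ISP, and even then the ISP has to honor it, which is not guaranteed. iBGP between your edge routers lets them share what each learned so both can make consistent forwarding decisions.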
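
    A rough Python sketch of how LOCAL_PREF steers the outbound choice. It models only two steps of best-path selection (highest LOCAL_PREF, then shortest AS_PATH); real implementations evaluate more attributes, and Cisco checks weight first. The provider names, AS numbers, and the `best_path` function are hypothetical.

```python
def best_path(paths):
    """Pick a path by highest LOCAL_PREF, then by shortest AS_PATH."""
    return max(paths, key=lambda p: (p["local_pref"], -len(p["as_path"])))


paths = [
    {"via": "ISP-A", "local_pref": 200, "as_path": [64512, 64700, 64800]},
    {"via": "ISP-B", "local_pref": 100, "as_path": [64513, 64800]},
]

# LOCAL_PREF is compared first, so ISP-A wins despite the longer AS_PATH.
print(best_path(paths)["via"])  # ISP-A

# With equal LOCAL_PREF the shorter AS_PATH decides.
for p in paths:
    p["local_pref"] = 100
print(best_path(paths)["via"])  # ISP-B
```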

    Enterprise core with multiple egress points?

    Route reflectors are your friend. Place them on your core distribution layer, not the edge. Build redundancy into your RR design — a single RR is a single point of failure for your entire BGP topology.

    Data center fabric?

    BGP in the underlay is increasingly standard — especially with Arista, Cumulus, and SONiC in EVPN deployments. Running eBGP between spine and leaf (each leaf pair in its own private AS) sidesteps the full-mesh problem entirely and gives you clean failure isolation per rack. This is where iBGP scaling limitations pushed the industry toward a new architecture — and it is worth understanding that history before you design your next fabric.


    The Takeaway

    iBGP and eBGP are not two protocols — they are the same tool used for two completely different jobs. eBGP is how you talk to the world. iBGP is how you make sure the rest of your network knows what you learned from the world.

    Get the boundary right and BGP is remarkably predictable. Forget the next-hop behavior, skip route reflectors, or blur where your AS actually ends — and you will be staring at show bgp summary wondering why half your routers have black holes at 2am.

    — Mike



  • Tagged Layer 3 Interfaces vs Router-on-a-Stick: Two Sides of the Same Coin

    Both tagged Layer 3 interfaces and router-on-a-stick use 802.1Q VLAN tagging to multiplex multiple Layer 3 networks over a single physical link. The concepts are nearly identical—the main differences lie in the platform, scale, and typical use cases. Let’s break down what makes them similar and where they diverge.


    The Foundation: 802.1Q VLAN Tagging

    Both designs rely on 802.1Q trunking to carry multiple VLANs across a single physical interface. Each VLAN gets its own Layer 3 subinterface (or logical unit), allowing a single link to handle multiple routed networks simultaneously.

    Think of it like a single fiber optic cable carrying multiple wavelengths of light (DWDM). One physical medium, multiple logical channels.

    Router-on-a-Stick: The Classic Pattern

    How It Works

    Router-on-a-stick connects a router to a Layer 2 switch via a single 802.1Q trunk. The router creates multiple subinterfaces on one physical port, with each subinterface handling routing for a specific VLAN.

    Configuration Example (Cisco Router):

    interface GigabitEthernet0/0
     description Trunk to Layer 2 Switch
     no ip address
    
    interface GigabitEthernet0/0.10
     description VLAN 10 - Finance
     encapsulation dot1Q 10
     ip address 192.168.10.1 255.255.255.0
    
    interface GigabitEthernet0/0.20
     description VLAN 20 - Engineering
     encapsulation dot1Q 20
     ip address 192.168.20.1 255.255.255.0

    interface GigabitEthernet0/0.30
     description VLAN 30 - Guest
     encapsulation dot1Q 30
     ip address 192.168.30.1 255.255.255.0
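
    Because every subinterface follows the same pattern, the config is easy to generate. The Python sketch below renders the IOS-style lines shown above from a small VLAN table. The `render_subinterfaces` helper and its input format are assumptions for illustration, not part of any Cisco tooling.

```python
def render_subinterfaces(parent, vlans):
    """Build IOS-style config text from {vlan_id: (name, ip, mask)}."""
    lines = [f"interface {parent}", " no ip address"]
    for vlan_id, (name, ip, mask) in sorted(vlans.items()):
        lines += [
            f"interface {parent}.{vlan_id}",
            f" description VLAN {vlan_id} - {name}",
            f" encapsulation dot1Q {vlan_id}",
            f" ip address {ip} {mask}",
        ]
    return "\n".join(lines)


vlans = {
    10: ("Finance", "192.168.10.1", "255.255.255.0"),
    20: ("Engineering", "192.168.20.1", "255.255.255.0"),
    30: ("Guest", "192.168.30.1", "255.255.255.0"),
}
print(render_subinterfaces("GigabitEthernet0/0", vlans))
```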

    Primary Use Case

    Inter-VLAN routing in small to medium environments:

    • Branch offices with Layer 2 switches
    • Small campus networks
    • Budget-constrained deployments
    • Networks with light to moderate inter-VLAN traffic

    Tagged Layer 3 Interfaces: The Enterprise Pattern

    How It Works

    Tagged Layer 3 interfaces use the same 802.1Q subinterface concept, but typically on enterprise routers or Layer 3 switches connecting to other Layer 3 devices or provider networks. Rather than inter-VLAN routing for local users, these interfaces often carry:

    • Multiple customer connections (ISP/carrier use case)
    • Different VRFs or routing instances
    • Segregated services over shared infrastructure
    • WAN connections with multiple circuits

    Configuration Examples

    Juniper (Logical Units):

    set interfaces et-0/0/1 description "Carrier_Circuit_to_DMZ_Switch"
    set interfaces et-0/0/1 vlan-tagging
    
    set interfaces et-0/0/1 unit 200 description "ATT"
    set interfaces et-0/0/1 unit 200 vlan-id 200
    set interfaces et-0/0/1 unit 200 family inet address 10.23.59.1/30
    
    set interfaces et-0/0/1 unit 308 description "Zayo"
    set interfaces et-0/0/1 unit 308 vlan-id 308
    set interfaces et-0/0/1 unit 308 family inet address 10.23.58.1/30
    
    set interfaces et-0/0/1 unit 322 description "Lumen"
    set interfaces et-0/0/1 unit 322 vlan-id 322
    set interfaces et-0/0/1 unit 322 family inet address 10.23.57.1/30
    
    set interfaces et-0/0/1 unit 337 description "Verizon"
    set interfaces et-0/0/1 unit 337 vlan-id 337
    set interfaces et-0/0/1 unit 337 family inet address 10.23.56.1/30

    Arista (Subinterfaces with VRFs):

    interface Ethernet3
       description "Verizon"
       no switchport
    
    interface Ethernet3.3011
       description "Customer1"
       encapsulation dot1q vlan 3011
       vrf Cust1
       ip address 10.140.242.45/31
    
    interface Ethernet3.3012
       description "Customer2"
       encapsulation dot1q vlan 3012
       vrf Cust2
       ip address 10.140.242.49/31
    
    interface Ethernet3.3018
       description "Customer3"
       encapsulation dot1q vlan 3018
       vrf Customer3
       ip address 10.140.242.53/31

    Primary Use Cases

    Service multiplexing and network segregation:

    • Carrier/ISP networks serving multiple customers over shared infrastructure
    • Enterprise edge routers with multiple WAN circuits or partners
    • Data center interconnects (DCI) carrying multiple tenants
    • MPLS PE routers with VRF-segregated customers
    • DMZ/extranet environments with strict segmentation requirements

    Key Differences

    Feature             Router-on-a-Stick                 Tagged Layer 3 Interfaces
    Typical Platform    Small branch routers (ISR, etc.)  Enterprise routers (MX, ASR, 7xxx)
    Connected To        Layer 2 access switch             Layer 3 device, carrier, or upstream
    Primary Purpose     Inter-VLAN routing for end users  Service multiplexing, WAN aggregation
    Traffic Pattern     East-west (VLAN to VLAN)          North-south (external connections)
    VRF Usage           Rarely used                       Common (customer/service isolation)
    Scale               Typically 3-10 VLANs              Can support dozens to hundreds
    Port Speed          1G typical                        10G/40G/100G common
    Routing Complexity  Simple (default gateway role)     Complex (BGP, OSPF, policy routing)

    The Real Difference: Context and Scale

    Technically, both designs are doing the same thing: using 802.1Q tagging to create multiple Layer 3 interfaces on a single physical port. The distinctions come down to:

    1. Network Location

    • Router-on-a-stick: Access layer, connecting to end-user VLANs
    • Tagged L3 interfaces: Edge/core, connecting to WAN, partners, or other infrastructure

    2. Traffic Type

    • Router-on-a-stick: Internal traffic between VLANs (Finance ↔ Engineering)
    • Tagged L3 interfaces: External services, customers, or carriers (Bank of America, Wells Fargo, Verizon, AT&T)

    3. Isolation Requirements

    • Router-on-a-stick: Simple VLAN separation, shared routing table
    • Tagged L3 interfaces: Often use VRFs for strict routing isolation between customers/services

    4. Performance Expectations

    • Router-on-a-stick: Bandwidth bottleneck is an accepted trade-off for simplicity
    • Tagged L3 interfaces: High-speed links (10G+) with hardware-accelerated forwarding

    Real-World Example: Financial Services Edge Router

    In the Arista example above, a single 10G carrier-facing interface (Ethernet3) carries three completely isolated networks:

    • VLAN 3011: Dedicated Wells Fargo connection (VRF Cust1 in the config)
    • VLAN 3012: Shared FIX protocol link (VRF Cust2 in the config)
    • VLAN 3018: Extranet services (VRF Customer3 in the config)

    Each subinterface exists in a separate VRF, which gives each service its own routing table. Traffic from the Wells Fargo VRF cannot leak into the Extranet VRF unless you deliberately configure route leaking, even though they share the same physical wire.
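
    The isolation comes from each VRF having its own routing table. A toy Python model makes the point; it assumes longest-prefix match over per-VRF route lists and nothing else. The VRF names are taken from the Arista config above, while the `VrfRouter` class, the 172.16.x.0/24 prefixes, and the next-hop addresses are hypothetical.

```python
import ipaddress


class VrfRouter:
    """Toy router that keeps one independent routing table per VRF."""

    def __init__(self):
        self.tables = {}  # VRF name -> list of (network, next_hop)

    def add_route(self, vrf, prefix, next_hop):
        entry = (ipaddress.ip_network(prefix), next_hop)
        self.tables.setdefault(vrf, []).append(entry)

    def lookup(self, vrf, destination):
        """Longest-prefix match inside one VRF only; None if no route."""
        addr = ipaddress.ip_address(destination)
        matches = [(net, nh) for net, nh in self.tables.get(vrf, []) if addr in net]
        if not matches:
            return None
        return max(matches, key=lambda m: m[0].prefixlen)[1]


router = VrfRouter()
router.add_route("Cust1", "172.16.1.0/24", "10.140.242.44")
router.add_route("Customer3", "172.16.3.0/24", "10.140.242.52")

print(router.lookup("Cust1", "172.16.1.10"))  # 10.140.242.44
print(router.lookup("Cust1", "172.16.3.10"))  # None: that prefix lives in Customer3
```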

    This is service multiplexing—using 802.1Q to deliver multiple isolated services over shared infrastructure.

    When to Use Each Design

    Use Router-on-a-Stick When:

    • You need inter-VLAN routing in a small office or branch
    • You have Layer 2 switches and one router
    • Budget constraints prevent Layer 3 switching
    • Inter-VLAN traffic is moderate and predictable

    Use Tagged Layer 3 Interfaces When:

    • Connecting to carriers, partners, or WAN providers
    • You need strict traffic segregation (VRFs)
    • Multiplexing multiple customers or services over shared links
    • Building data center interconnects or MPLS PE infrastructure
    • Working with high-bandwidth circuits (10G+)

    Common Pitfalls and Considerations

    MTU and Fragmentation

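    802.1Q adds 4 bytes to the Ethernet frame, so a full-size tagged frame is 1522 bytes instead of 1518. Many platforms allow for the tag automatically and keep the Layer 3 MTU at 1500, but a device or carrier circuit that caps frames at 1518 bytes leaves only 1496 bytes for the IP packet on a tagged subinterface. Always verify that MTU settings match on both ends to avoid fragmentation and dropped packets.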
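
    The arithmetic behind that 4-byte difference, as a minimal Python sketch. It assumes standard Ethernet II framing (14-byte header, 4-byte FCS) and, for the last line, a device that caps frames at 1518 bytes. Whether a given platform raises its frame limit for the tag is something to check per platform; the function names are illustrative.

```python
ETH_HEADER = 14  # destination MAC + source MAC + EtherType
DOT1Q_TAG = 4    # 802.1Q tag inserted after the source MAC
FCS = 4          # frame check sequence


def frame_size(ip_mtu, tagged):
    """On-the-wire frame size for a full-size IP packet."""
    return ip_mtu + ETH_HEADER + (DOT1Q_TAG if tagged else 0) + FCS


def ip_mtu_for_frame_limit(max_frame, tagged):
    """Largest IP packet that fits when the frame size is capped."""
    return max_frame - ETH_HEADER - (DOT1Q_TAG if tagged else 0) - FCS


print(frame_size(1500, tagged=False))              # 1518
print(frame_size(1500, tagged=True))               # 1522
print(ip_mtu_for_frame_limit(1518, tagged=True))   # 1496
```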

    Native VLAN Considerations

    Some platforms allow a “native” (untagged) VLAN on trunk ports. Be explicit about whether you’re using this feature to avoid misconfigurations and potential security issues.

    Performance Monitoring

    Monitor each subinterface individually—don’t just look at the physical interface utilization. One busy subinterface can saturate the link and affect all others.

    QoS and Traffic Shaping

    When multiplexing critical services, implement QoS policies to ensure high-priority traffic (e.g., VoIP, financial transactions) isn’t starved by bulk data transfers.

    Conclusion

    Router-on-a-stick and tagged Layer 3 interfaces are fundamentally the same technology—802.1Q subinterfaces providing Layer 3 routing over a single physical link. The key differences are:

    • Router-on-a-stick: Small-scale inter-VLAN routing for local users
    • Tagged L3 interfaces: Enterprise-scale service multiplexing with VRF isolation

    Both have their place in modern networks. Understanding when and why to use each pattern is essential for designing efficient, scalable infrastructure—whether you’re building a branch office network or connecting to major financial institutions over carrier circuits.

