Network Working Group                                              X. Xu
Internet-Draft                                                    Huawei
Intended status: Standards Track                              P. Francis
Expires: August 15, 2009                                         MPI-SWS
                                                       February 11, 2009


                Simple Tunnel Endpoint Signaling in BGP
                         draft-xu-tunnel-00.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 15, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.

Abstract

   Virtual Aggregation (VA) is a mechanism for shrinking the size of the
   DFZ FIB in routers [I-D.francis-intra-va].  VA can result in longer
   paths and increased load on routers within the ISP that deploys VA.
   This document describes a mechanism that allows an AS that originates
   a route to associate a tunnel endpoint terminating at itself with the
   route.
   This allows routers in a remote AS to tunnel packets to the
   originating AS.  If transit ASes between the remote AS and the
   originating AS install the prefixes associated with tunnel endpoints
   in their FIBs, then tunneled packets that transit through them will
   take the shortest path.  This results in reduced load for the transit
   AS, and better performance for the customers at the source and
   destination.

Table of Contents

   1.  Introduction
     1.1.  Requirements notation
   2.  Document revisions
   3.  Syntax of the Tunnel Address Attribute
   4.  Usage of the Tunnel Address and TE-Encap Attributes
     4.1.  Originating AS
     4.2.  Non-Originating ASes
   5.  IANA Considerations
   6.  Security Considerations
   7.  Normative References
   Authors' Addresses

1.  Introduction

   Virtual Aggregation (VA) [I-D.francis-intra-va] is a mechanism for
   reducing FIB size for routers within the AS that deploys VA.  This is
   done through "FIB Suppression", where certain routers in the AS may
   not install routes to certain prefixes in their FIB.

   The downside of using VA is that packets addressed to suppressed
   prefixes transiting the AS may take a longer path than otherwise
   necessary.  For instance, imagine a packet traversing AS-path
   S-A-B-C-D, where ASes S and D are the service providers for their
   respective customers.  Further, assume that ASes A, C, and D are
   using VA, and that A and C are FIB-suppressing the prefix associated
   with the packet.
   In this case, when the packet transits A and C, there is a good
   chance that it will take an extra router hop within A and C.  This
   increases load for A and C, and degrades performance for S's and D's
   customers.

   The mechanism described in this draft allows D, for instance, to
   associate a tunnel endpoint address with the prefixes that it
   originates.  The tunnel endpoint address can be an anycasted address
   that terminates at some or all of D's routers.  If A and C
   FIB-install the route to the prefix associated with the tunnel
   endpoint address, then packets tunneled to the FIB-suppressed prefix
   will take the shortest path.  This draft describes a mechanism for
   advertising the tunnel endpoint address across ASes in BGP.

   This draft uses the Address Specific BGP Extended Communities
   Attribute for both IPv4 and IPv6 to carry the tunnel endpoint address
   ([RFC4360] and [I-D.ietf-l3vpn-v6-ext-communities] respectively).
   Where additional tunnel parameters must be signaled (i.e. for GRE or
   L2TP), this draft uses the Tunnel Encapsulation Attribute
   (TEncap-Attribute) defined in [I-D.ietf-softwire-encaps-safi] to
   encode these parameters.

1.1.  Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Document revisions

   This draft was previously released with file name
   draft-xu-idr-tunnel-00.txt.  The changes from that draft are as
   follows:

   1.  The need for a new sub-TLV definition in
       [I-D.ietf-softwire-encaps-safi] has been eliminated (in favor of
       the Address Specific BGP Extended Communities Attribute).

   2.  The need to carry the AS number of the originating AS separate
       from the AS-Path attribute has been eliminated.

   3.
       The requirement that the AS-path to the tunnel endpoint address
       and the AS-path to the destination prefix be the same was
       dropped.  As a result, however, legacy ASes may believe that
       packets take a different AS-path than the one they actually take.

   4.  The mechanism to avoid transient loops between providers of
       multi-homed sites has been made optional rather than required.

3.  Syntax of the Tunnel Address Attribute

   This draft defines a new type for the Address Specific BGP Extended
   Communities Attribute for both IPv4 and IPv6 to be used as the Tunnel
   Address Attribute.  The value of the high-order octet for the IPv4
   type field is 0x01 as defined in [RFC4360] and for the IPv6 type
   field it is 0x00 as defined in [I-D.ietf-l3vpn-v6-ext-communities].
   The attribute is transitive across ASes.  The value of the low-order
   octet for the type field (i.e. the Sub-Type) is (TBD by IANA) for
   IPv4 and (TBD by IANA) for IPv6.

   The Global Administrator field is set to the Tunnel Address.  This is
   the IP address of the tunnel endpoint.

   The Local Administrator field is set to zero and ignored upon
   receipt.

   If the Tunnel Encapsulation Attribute (TEncap-Attribute) defined in
   [I-D.ietf-softwire-encaps-safi] is not present, then the
   encapsulation type is assumed to be IP-in-IP.  If the encapsulation
   type is GRE or L2TP, then the TEncap-Attribute must be present.  It
   defines the parameters associated with the tunnel as specified in
   [I-D.ietf-softwire-encaps-safi].

4.  Usage of the Tunnel Address and TE-Encap Attributes

4.1.  Originating AS

   The "Originating AS" is defined here as the AS whose AS number is the
   first AS in the AS path.  Only the AS originating a route may include
   a Tunnel Address Attribute and optional TEncap-Attribute.  The
   TEncap-Attribute is included only if the tunnel type is something
   other than IP-in-IP (i.e. GRE or L2TP).
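
   As an illustration of the layout given in Section 3, the following
   Python sketch packs and unpacks the IPv4 form of the Tunnel Address
   Attribute as an 8-octet extended community.  It is a minimal sketch
   under stated assumptions, not a normative encoding: the function
   names are hypothetical, and because the Sub-Type is still to be
   assigned by IANA it is supplied by the caller rather than hard-coded.

```python
import ipaddress
import struct
from typing import Optional

# High-order type octet for the transitive IPv4 Address Specific
# Extended Community, as stated in Section 3 (from RFC 4360).
IPV4_ADDR_SPECIFIC_TYPE = 0x01


def encode_tunnel_address_v4(tunnel_addr: str, sub_type: int) -> bytes:
    """Pack an IPv4 tunnel endpoint into an 8-octet extended community.

    sub_type is a placeholder: the real value is TBD by IANA.
    """
    if not 0 <= sub_type <= 0xFF:
        raise ValueError("Sub-Type must fit in one octet")
    addr = ipaddress.IPv4Address(tunnel_addr)
    # Layout: type (1), Sub-Type (1), Global Administrator (4, the
    # tunnel address), Local Administrator (2, set to zero).
    return struct.pack("!BB4sH", IPV4_ADDR_SPECIFIC_TYPE, sub_type,
                       addr.packed, 0)


def decode_tunnel_address_v4(
    community: bytes, sub_type: int
) -> Optional[ipaddress.IPv4Address]:
    """Return the tunnel address, or None if this is another community."""
    if len(community) != 8:
        raise ValueError("extended community must be 8 octets")
    type_high, type_low, addr, _local_admin = struct.unpack(
        "!BB4sH", community)
    if type_high != IPV4_ADDR_SPECIFIC_TYPE or type_low != sub_type:
        return None
    # The Local Administrator field is ignored upon receipt.
    return ipaddress.IPv4Address(addr)


if __name__ == "__main__":
    PLACEHOLDER_SUB_TYPE = 0xAA  # hypothetical, NOT an IANA assignment
    wire = encode_tunnel_address_v4("192.0.2.1", PLACEHOLDER_SUB_TYPE)
    print(wire.hex())  # 01aac00002010000
    print(decode_tunnel_address_v4(wire, PLACEHOLDER_SUB_TYPE))
```

   The IPv6 form is not shown; it would use the 20-octet layout of
   [I-D.ietf-l3vpn-v6-ext-communities] with a 16-octet Global
   Administrator field.
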
   In the remainder of this draft, the Tunnel Address Attribute alone,
   or both together, are referred to as the "Tunnel Attributes".

   The Tunnel Attributes MUST NOT be added to externally received routes
   (i.e. via eBGP), except in the case where the sole AS number of the
   received route is a private AS number, and it is replaced by that of
   the receiving AS.

   The reachable NLRI in the update may be both IPv4 and IPv6.

   If a tunnel endpoint router receives a packet on the tunnel, and the
   only known route to the destination is via routes originated by other
   ASes (not including private ASes of customers), then the packet may
   be dropped.  This prevents transient loops whereby a multi-homed
   customer is unreachable by both of its provider ASes, but neither AS
   has yet heard the withdrawal from the other AS, and so both think
   that the other AS can reach the customer.  On the other hand, in the
   case where the customer is reachable via the other AS, a policy of
   dropping such packets causes unnecessary packet loss.

   The originating AS may of course aggregate the prefixes of customers
   reachable via multiple routers.  In this case there must be only one
   tunnel endpoint address for the aggregated prefix.  This in turn
   suggests that the tunnel endpoint address is common to all of the
   routers.  In other words, the tunnel endpoint address must be
   anycasted across the routers.  More generally, the tunnel endpoint
   address should be anycasted across all routers in the origin AS.

   Note that if different routers that originate a route for the same
   aggregated prefix use different tunnel endpoint addresses, the
   following problem can occur.  Imagine that there are two routers R1
   and R2 that are originating routes to the same prefix but use
   different tunnel endpoint addresses.  Now, assume that router R1
   crashes.  There is no way to withdraw the tunnel endpoint: R2 has no
   mechanism with which to do it.
   As a result, remote routers with packets destined for sites attached
   to R2 may nevertheless tunnel them to R1, causing them to be dropped.

   It is possible that different routers with the same tunnel endpoint
   address advertise different tunnel parameters or even tunnel types in
   their respective TEncap-Attributes.  This is allowed; however, all
   such routers must be able to accept packets on every advertised
   tunnel.

4.2.  Non-Originating ASes

   ASes that have deployed VA should FIB-install any routes containing a
   tunnel endpoint address.  This will prevent packets tunneled to
   tunnel endpoint addresses from taking any extra hops.

   When a router in a non-originating AS receives a route with an
   associated tunnel endpoint address, it must decide whether or not to
   use the tunnel.  The router always has the option of ignoring the
   tunnel (and will do so by default if it does not recognize the tunnel
   attributes).

   A router may choose to ignore tunnels where the AS_PATH to the tunnel
   endpoint address does not match the AS path to the reachable prefix.
   There are pros and cons to doing this.  On the plus side, doing this
   means that the AS-path taken by the packet is the same as the AS-path
   in the route to the destination prefix.  This in turn means that the
   AS-path that upstream legacy ASes see is the actual AS-path taken.
   On the minus side, this rule has the characteristic that, if a
   transit AS decides to use one AS path to some prefixes from an origin
   AS, and another AS path to other prefixes from the origin AS, then
   only one of these paths can have a valid tunnel endpoint address
   associated with it.  Packets transmitted via the other path cannot be
   tunneled.

   If routers in a non-originating AS combine routes from different
   received updates into a single update, and the tunnel attributes from
   the received updates are not identical, then the tunnel attributes
   must be excluded from the generated update.
   This prevents an error whereby a route is associated with the wrong
   tunnel.  Likewise, if routers in a non-originating AS receive an
   update with multiple different tunnel attributes, then they must
   ignore and drop all of the tunnel attributes.

   It is important to note that the behavior in the above paragraph must
   be followed by both legacy routers (i.e. those that do not recognize
   the tunnel attributes) and updated routers.  It is the authors'
   understanding that all routers today, when combining the routes from
   different received updates into a single update, will in fact drop
   any unrecognized attributes from the new update.  If there are
   routers that do not do this, however, then this draft will produce
   errors.  There is a fix to these errors that involves placing the
   originating AS number in the Tunnel Address Attribute, and indeed
   this was the approach taken by the original version of this draft.
   If it is determined that such legacy routers exist, then we can
   revert to the original draft.

5.  IANA Considerations

   IANA must issue a new Sub-Type for the Address Specific BGP Extended
   Communities Attribute.

6.  Security Considerations

   If downstream ASes choose to tunnel packets along an AS-path
   different from the AS-path to the destination prefix, then upstream
   ASes may not know the AS-path packets are taking.  This can violate a
   security policy whereby certain ASes must be avoided (see
   Section 4.2).

7.  Normative References

   [I-D.francis-intra-va]
              Francis, P., Xu, X., and H. Ballani, "FIB Suppression with
              Virtual Aggregation", draft-francis-intra-va-00 (work in
              progress), February 2009.

   [I-D.ietf-l3vpn-v6-ext-communities]
              Rekhter, Y., "IPv6 Address Specific BGP Extended
              Communities Attribute",
              draft-ietf-l3vpn-v6-ext-communities-01 (work in progress),
              December 2008.

   [I-D.ietf-softwire-encaps-safi]
              Mohapatra, P. and E.
              Rosen, "BGP Encapsulation SAFI and BGP Tunnel
              Encapsulation Attribute",
              draft-ietf-softwire-encaps-safi-03 (work in progress),
              June 2008.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4360]  Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
              Communities Attribute", RFC 4360, February 2006.

Authors' Addresses

   Xiaohu Xu
   Huawei Technologies
   No.3 Xinxi Rd., Shang-Di Information Industry Base, Hai-Dian District
   Beijing, Beijing  100085
   P.R.China

   Phone: +86 10 82836073
   Email: xuxh@huawei.com


   Paul Francis
   Max Planck Institute for Software Systems
   Gottlieb-Daimler-Strasse
   Kaiserslautern  67633
   Germany

   Phone: +49 631 930 39600
   Email: francis@mpi-sws.org