Thursday, 4 March 2010

The tunnel packing problem: what is it, and why does it matter?

I was looking for a reference to the tunnel packing problem and whilst the concept may have been defined in other contexts, I can’t find anywhere where it is actually defined for MPLS-TE, so here goes ... If you can find it defined elsewhere – let me know – otherwise, I hope it’s useful.

The Symptoms
To understand the tunnel packing problem and why it is important, you first need to understand how tunnel paths are determined in MPLS-TE.  With MPLS-TE, tunnel paths can be dynamically calculated online in a distributed fashion by the TE tunnel sources (known as tunnel “head-ends”) themselves – this is referred to as using dynamic path options – or can be determined by an offline centralised function (also known as a tunnel server or path computation element [PCE]) which then specifies the explicit tunnel path a head-end should use for a particular tunnel – this is referred to as using explicit path options.  With either approach, a constraint-based routing (CBR) algorithm uses a constraint-based shortest path first (CSPF) calculation to determine the path that a particular tunnel will take based upon a fit between the available network bandwidth resources (and optionally policy constraints) and the required bandwidth (and policies) for that tunnel.  This CSPF calculation is similar to a conventional IGP shortest path first (SPF) calculation, but also takes into account bandwidth and administrative constraints, pruning links from the topology if they advertised insufficient resources, i.e. not enough bandwidth for a tunnel, or if they violate tunnel policy constraints.  The shortest (i.e. lowest cost) path is then selected from the remaining topology.

With that background out of the way – let’s look at a specific example to explain the tunnel packing problem itself.  Consider the following network – they don’t get much simpler – with all links 10Gps and all with the same metrics (assume no affinities etc).

Now imagine that we want to set up three MPLS-TE tunnels from A to D in sequence; tunnel #1 for 1Gbps, tunnel #2 for 1Gbps, and tunnel #3 for 10Gbps.  As there are two equal cost paths with the same bandwidth available (A-B-D and A-C-D), tunnel #1 could take either path depending upon the particular path calculation implementation, i.e. different vendor’s implementations may make different choices in tie-breaker situations; let’s assume path A-B-D is chosen.  This leaves 9Gbps of available bandwidth on A-B and B-D, with 10Gbps still available on A-C and C-D.

With the first tunnel established, both paths still have sufficient bandwidth for tunnel #2, in which case the path that will be chosen will also depend upon the particular path calculation implementation.  Some implementations use a model called most fill, where if there is more than one path with sufficient bandwidth available, links with the least available bandwidth (i.e. those that are already most filled) are preferred.  Other implementations use a model called least fill, where links with the most available bandwidth (i.e. those that are already least filled) are preferred.  For this example, let’s first consider that least-fill is used, in which case path A-C-D would be chosen for tunnel #2, as this was the least-filled path.  This leaves 9Gbps of available bandwidth on A-B, B-D, A-C and C-D.  Now if we try and determine a path for the tunnel #3 (10Gbps), even though there is 18Gbps of available bandwidth in total across the two paths, there are no paths with 10Gbps bandwidth available and the tunnel cannot be established.

So what’s the solution?  What if we use most-fill instead?  If we did, assuming tunnel #1 takes path A-B-D, tunnel #2 would also take path A-B-D, and tunnel #3 would be able to take path A-C-D.

If we consider another example, however, where we try and set up four MPLS-TE tunnels from A to D in sequence; tunnel #1 for 5Gbps, tunnel #2 for 1Gbps, tunnel #3 for 5Gbps, and tunnel #4 for 6Gbps.  Let’s assume that tunnel #1 uses path A-B-D; this leaves 5Gbps of available bandwidth on A-B and B-D, with 10Gbps still available on A-C and C-D.  If most-fill is used, tunnel #2 will also use path A-B-D, leaving 4Gbps of available bandwidth on A-B and B-D.  This is not enough bandwidth for tunnel #3, which will use path A-C-D, leaving 5Gbps of available bandwidth on A-C and C-D.  Now if we try and determine a path for the tunnel #4, there are no paths with 6Gbps bandwidth available and the tunnel cannot be established.  In this case the same issue would occur if least-fill had been used.

If in this second example, however, tunnel #1 and tunnel #3 followed path A-B-D, tunnels #2 and #4 could use path A-C-D, i.e. all of the tunnels could be established.  If dynamic path options are used with head-ends calculating the tunnel paths, however, they would not be able to find this solution.  The reason why this is the case is because each head-end only has a myopic view and can only see the available (i.e. unallocated) bandwidth on the links in the network and its own tunnels (i.e. the tunnels for which it is the head-end), but does not have a view of the tunnels from other head-ends.

Hence the symptom of a poor solution to the tunnel packing problem is inefficient use of the available bandwidth.

The Tunnel Packing Problem Defined
We can consider the Tunnel Packing Problem to be a sub problem of the more general traffic engineering problem, which can be defined as a mathematical optimization problem; that is a computational problem in which the objective is to find the best of all possible solutions.  We can define the Tunnel Packing optimisation problem as: when given a fixed network topology and a fixed source-to-destination traffic demand matrix to be carried, to determine the routing of MPLS-TE tunnels that makes most effective use of capacity.  In order to solve this problem, however, it is important to define what is meant by the objective “most effective”.  The most common objective is to minimise the worst-case utilisation across all links in the network.  Given a particular link, the worst-case utilisation under a set of failure scenarios is defined to be the maximum utilisation for that link over all of the failure scenarios in the set. The overall network worst-case utilisation is a single number which summarizes how resilient the network is, under a given traffic engineering configuration, to network failure.

The Solution
So we’ve defined the problem, and highlighted the issue that it causes, but what is the solution?  The solution is not to use dynamic path options with head-end calculated tunnel paths but rather to use an offline tunnel server / path computation element which has a view of the entire network and of all of the TE tunnels that need to be set up and can determine the optimal routing of all tunnels.  This requires that explicit path options are used and that the tunnel server specifies the path that each tunnel should take.

Martin Horneffer, “IGP Tuning in an MPLS Network”,
NANOG 33, January 2005
But is this really an issue in actual networks and if so how effective is the solution?  In a study at Deutsche Telekom, with head-end calculated tunnel paths using dynamic path options, the network worst-case utilisation was over ~175%.  Whereas, with explicit path options for working case (primary) and failure case (secondary) tunnel paths, where a tunnel server (in this case Cariden MATE) had determined the tunnel paths, the network worst-case utilisation was ~95%, which was within a few percent of the theoretical optimum (which is never achievable in practise).  From this study, it is also interesting to note that IGP-metric based traffic engineering was also able to get within a few percent of the theoretical optimum.

Hence, if you are considering MPLS-TE for its traffic engineering capabilities – i.e. to make more effective use of the available capacity – rather than for its Fast Reroute (FRR) capabilities or simply as a transport protocol, they you should also consider the requirement for a tunnel server.

For more details and further reading see:

Monday, 4 January 2010

The Network Traffic Matrix: what is it and why does it matter?

The network traffic demand matrix is a measure, estimation or prediction of the aggregated traffic flows across a network.  A good understanding of the traffic matrix is absolutely fundamental to the efficient and effective planning, design, engineering and operation of any IP or MPLS network.


Why?  Well it’s only by knowing the traffic matrix and the network routing model that you can understand the impact on the network caused by different situations such as when traffic is rerouted in network failure cases, or if the network topology is changed, or if traffic demands grow over time.  An understanding of the traffic matrix is also required to analyse the benefits of and compare different approaches for traffic engineering.  Further, accurate knowledge of the traffic matrix is essential for accurate traffic trending and traffic growth predictions.

Traffic matrices can be specified to different levels of aggregation: by IP prefix, by router, by POP, or by AS; this may be per QOS class, per service and even per application.  Network capacity is increasingly being provisioned taking single network element failure cases into account; to understand traffic rerouting in failure cases an internal traffic matrix is needed which aggregates traffic at the router-to-router level.  An external traffic matrix – i.e. router to AS – provides additional context, which can be useful for managing peering connection capacity, and for understanding where network failures might impact in the external traffic matrix, i.e. due peering routing policies.

Given the critical importance of the traffic matrix, you might imagine that direct methods for easily and regularly measuring the network’s traffic matrix statistics are standardised and widely supported by vendors.  Unfortunately not ...

In practise the measurement of traffic matrix statistics has gone through something of an evolution.  Initially (timeframe: 2001) the focus was on direct measurement of the traffic matrix from the network: where MPLS is deployed this might be achieved with LSP accounting; where BGP is deployed, this might be achieved using Netflow version 9 and aggregating Netflow data by BGP next hop.  In many cases, however, it’s not possible to use methods such as these alone for direct measurement of the complete network traffic matrix.

The lack of a viable general approach for direct measurement of the complete network traffic matrix gave rise to approaches for estimating or deducing the traffic demand matrix based upon easily obtainable network data, i.e. link statistic data (timeframe:2003).  These approaches rely on mathematical methods, e.g. tomogravity, and have been shown to provide accurate results when used for failure case planning, but may be less accurate when it comes to working case predictions.

Fear not, however, as more recently, hybridised approaches have been developed which use regression to combine demand deduction based upon link statistics with direct measurements of demands where available.  The result is a practical solution which can be easily and regularly used to produce an accurate traffic matrix for both failure case and working case planning.  For more details – see the following references: