A step toward realistic evaluation of high-speed TCP protocols
Sangtae Ha+, Yusung Kim$, Long Le+, Injong Rhee+, and Lisong Xu*
+Department of Computer Science
North Carolina State University
Raleigh, NC 27699
*Department of Computer Science and Engineering
University of Nebraska, Lincoln
Lincoln, NE 68588-0115
$Department of Computer Science
Korea Advanced Institute of Science and Technology
Taejon, South Korea.
Congestion control is an important component of a transport protocol in a packet-switched shared network. The congestion control algorithm of the widely used transport protocol TCP is responsible for detecting and reacting to overloads in the Internet and has been the key to the Internet’s operational success. However, as link capacity grows and new Internet applications with high-bandwidth demand emerge, TCP performance is unsatisfactory, especially on high-speed and long distance networks. The main reason for this is the conservative behavior of TCP in adjusting its congestion window that governs the senders’ transmission rates.
A number of solutions have been proposed to remedy the aforementioned problem of TCP by changing the way in which TCP adapts its congestion window: BIC TCP, CUBIC, FAST, HSTCP, HTCP, STCP, TCP-Westwood, LTCP, and TCP-Africa. These new protocols promise to improve TCP performance on high-speed networks significantly and are hence usually called TCPs for high-speed networks.
While the design of TCPs for high-speed networks has received a considerable amount of interest, far less attention has been paid to thorough evaluations and comparisons of these protocols. For example, Internet measurement studies have shown complex behaviors and characteristics of Internet traffic [14, 15, 20]. Unfortunately, existing evaluation work did not capture these behaviors in its testing environments. Since congestion control algorithms are very sensitive to environmental variables such as background traffic and propagation delays, realistic performance evaluation of TCPs for high-speed networks requires creating realistic network environments where these protocols are likely to be used.
There are many factors constituting a network environment. The factors most frequently captured for creating a "realistic" testing environment are static end-to-end characteristics such as (1) the bottleneck bandwidth, (2) the round-trip times of the protocol flows being observed, (3) the network topology over the path that the protocol flows of interest travel through, and (4) the queue size at the bottleneck link. These factors are more or less static and do not change over the course of an experiment. What is missing from most existing evaluation work is consideration of (1) what the protocol flows of interest dynamically (i.e., in a time-varying manner) experience in the middle of the network path, namely the dynamic statistical properties of "background traffic" over the intermediate links, and (2) the impact of the protocol flows being tested on the statistical properties of that background traffic. Most of these dynamic characteristics cannot be measured at end points. Yet they can greatly influence the behaviors of the protocol flows being observed at the end points.
There are several reasons why background traffic is important in protocol testing. First, network environments without any randomness in packet arrivals and delays are highly susceptible to the phase effect, a commonly observed simulation artifact caused by extreme synchronization of the network flows on the end-to-end path. A good mix of background traffic with diverse arrival patterns and delays reduces the likelihood of the phase effect. Second, a high degree of statistical multiplexing is often assumed in protocol design. For instance, the inventors of HSTCP and STCP rely on statistical multiplexing for faster convergence, so criticizing these protocols for slow or no convergence in environments without background traffic, as done in prior work, is unwarranted. As today's Internet contains a high degree of statistical multiplexing, testing with no or little background traffic does not capture the actual intended behaviors of protocols in production networks. Third, as much as background traffic can influence the behavior of the protocol flows being observed, the statistical behaviors of this "passing-through" aggregate traffic can also be significantly altered by the nature of the protocol flows being tested. Measuring the statistical properties of the background traffic in the middle of the network enables us to study this impact, which is important from the perspective of fairness and backward compatibility of the protocols.
In this work, we have created an experimental network model that captures some of the complex characteristics of background traffic [14, 15, 20]. We use our network model to evaluate a large collection of recently proposed TCPs for high-speed networks: BIC, CUBIC, FAST, HSTCP, H-TCP, and STCP. While we do not claim that we have the most realistic experimental network model, we believe that our work is a step in the right direction toward improving experimental methodologies for evaluating network protocols. Since we make no claim about the realism of our background traffic mix, this work-in-progress report has the modest goal of simply contrasting protocol behaviors observed in two different environments, created with and without background traffic. It is well known that the presence of background traffic in a network experiment can change the behavior of the protocols being tested. However, few studies examine exactly how high-speed TCP flows are affected by background traffic, which aspects of background traffic change the behaviors of the TCP flows in what ways, and vice versa. Identifying these factors is important because testing environments that exhibit them are good candidates for more "intelligent" base testing cases and scenarios.
Our future work will evolve toward testing protocols with more realistic background traffic. We plan to use existing traffic generators such as Tmix and Harpoon, which rely on real network traces as seeds for generating synthetic network traffic, and to create a standard set of network testing environments where the network community can test and compare protocol behaviors.
Floyd proposes a framework for evaluating congestion control algorithms. The framework includes a number of metrics to be considered, such as throughput, packet loss rates, delays, and fairness, as well as a range of network environments. Along the same lines, Wei et al. propose that the networking community establish a TCP benchmark suite to facilitate comparative performance evaluations of TCP variants. The benchmark includes various scenarios for realistic performance evaluation, such as heavy-tailed file size distributions and ranges of propagation delays. These frameworks illustrate the need for realistic performance evaluation of new congestion control algorithms and accentuate the motivation for our work and for the existing evaluation work that we briefly review below.
Bullot et al. compare the performance of TCP New Reno with HSTCP, FAST, STCP, HSTCP-LP, H-TCP, and BIC on high-speed production networks. They report that TCP-SACK gives poor performance and that most TCP variants for high-speed networks deliver significant improvement over TCP-SACK. Although Bullot et al.'s results are very informative, their experiments are performed over a real production network path, so they do not have control over the background traffic on the network. Only UDP background traffic is added, and the impact of network environments created by various mixes of background traffic on protocol behaviors is not considered.
Li et al. perform experiments with STCP, HSTCP, BIC, FAST, and H-TCP in a lab network. They note that most protocols, especially FAST, STCP, HSTCP, and BIC, exhibit substantial unfairness in their experiments, and they highlight the good performance of H-TCP. They do not use any background traffic in their experiments.
3.1 Testbed Setup
The experimental network that we use to perform experiments with TCPs for high-speed networks is shown in Figure 1. At each edge of the network are four machines with identical hardware configurations. Two machines are used as TCP senders and run iperf to simulate high-performance applications that need to transmit a large amount of data to two other machines functioning as TCP receivers on the other side of the network. The TCP senders run a modified version of the Linux 2.6.13 kernel that includes the implementations of new congestion control algorithms for high-speed networks.
Figure 1: Experimental network setup.
As pointed out by Li et al., existing implementations of various congestion control algorithms often change parts of the TCP stack that are not directly related to the congestion control algorithm in order to improve their overall performance. To be fair to all congestion control algorithms, we run a modified version of the Linux 2.6.13 kernel that separates the implementation of congestion control algorithms from the standard TCP stack (with the exception of FAST, which has an implementation for the Linux 2.4 kernel because FAST is not yet publicly available for Linux 2.6). Further, we modified the SACK implementation to remove inefficiencies of the standard SACK implementation. Our improved SACK implementation is equally effective for all congestion control algorithms.
At the core of the network are two FreeBSD 5.2.1 machines running the dummynet software. These machines are carefully tuned to forward traffic at up to 600 Mbps consistently. Dummynet is used in the first router to control the bandwidth and buffer size of the bottleneck link in the network. The bandwidth of the bottleneck link is configured to be 400 Mbps. Unless mentioned otherwise, the buffer size of the bottleneck link is fixed at a maximum of 2 Mbytes. While various rules of thumb recommend that the buffer size be proportional to the bandwidth-delay product (BDP), we follow a more practical approach in which the buffer is fixed at a reasonable capacity rather than scaled with the BDP. We test the protocols with a router buffer smaller than the BDP because this is a likely trend in high-speed routers. This trend is in line with recent research results showing that the buffer size of a bottleneck link with a high degree of multiplexing of TCP connections can be much less than the bandwidth-delay product [16, 17].
TCP-SACK is used as the baseline TCP for comparison and fairness tests and also for generating background traffic.
3.2 Model for propagation delays
An extension of dummynet is used in the second router to assign per-flow delays to long-lived as well as short-lived background traffic flows. This configuration gives all packets of a flow the same amount of delay, randomly sampled from a distribution. We use a distribution of RTTs obtained from a measurement study.
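As a minimal sketch of this per-flow delay model (the RTT values below are hypothetical placeholders; the actual testbed samples from an empirical distribution taken from a measurement study):

```python
import random

# Hypothetical RTT samples (ms); the testbed uses an empirical
# distribution from a measurement study, not these values.
EMPIRICAL_RTTS_MS = [12, 25, 40, 40, 65, 90, 90, 120, 180, 250]

_flow_delay = {}

def delay_for(flow_id):
    """Every packet of a flow gets the same randomly sampled delay."""
    if flow_id not in _flow_delay:
        _flow_delay[flow_id] = random.choice(EMPIRICAL_RTTS_MS)
    return _flow_delay[flow_id]
```

The key property is that the delay is drawn once per flow, not per packet, so a flow sees a consistent RTT while the flow population as a whole reflects the measured distribution.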
3.3 Models for background traffic
Congestion control protocols for high-speed networks are unlikely to run alone in dedicated networks. Thus, we need to generate background traffic so that our results reflect real-world scenarios. Two types of flows are used to generate background traffic: long-lived and short-lived. The long-lived flows are generated by iperf and simulate regular long-lived TCP flows such as ftp transfers. The amount of traffic generated by these flows is controlled by the number of iperf connections; in all of our experiments, we fix the number of these connections for convenience, although this parameter needs to be varied for a more comprehensive study. Short-lived flows simulate web sessions and are generated using a lognormal (body) and Pareto (tail) distribution for their file sizes [12, 15, 20]. The inter-arrival time between two successive short-lived flows follows an exponential distribution and is used to control the amount of short-lived traffic [12, 20]. The chosen distributions of file sizes and inter-arrival times are taken from measured Internet traffic characteristics [12, 20]. Further, we also generate reverse traffic consisting of both short-lived and long-lived flows to achieve the effects of ACK compression and to reduce the phase effect. The maximum receiver window size of the long- and short-lived TCP flows used for background traffic is set to the default 64KB, because most TCP stacks used in the Internet have a default receiver buffer size of 64KB. Our background traffic, along with the model for propagation delays described above, allows us to create workloads containing some of the statistical properties observed in real-world scenarios for high-speed TCP.
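A minimal sketch of this short-lived traffic model follows; the distribution parameters here are hypothetical illustrations, whereas the paper takes its actual parameters from measured Internet traffic [12, 15, 20]:

```python
import random

# Hypothetical parameters -- the paper cites measured Internet
# characteristics [12, 15, 20] but does not list exact values.
BODY_MU, BODY_SIGMA = 9.0, 1.5   # lognormal body (log of bytes)
TAIL_ALPHA, TAIL_XM = 1.2, 1e5   # Pareto tail: shape, minimum size (bytes)
TAIL_PROB = 0.05                 # fraction of flows drawn from the tail
MEAN_INTERARRIVAL = 0.05         # seconds; controls the offered load

def flow_size():
    """Draw one web-flow size: lognormal body with a Pareto tail."""
    if random.random() < TAIL_PROB:
        return random.paretovariate(TAIL_ALPHA) * TAIL_XM
    return random.lognormvariate(BODY_MU, BODY_SIGMA)

def arrivals(duration):
    """Yield (start_time, size) pairs with exponential inter-arrivals."""
    t = 0.0
    while t < duration:
        t += random.expovariate(1.0 / MEAN_INTERARRIVAL)
        yield t, flow_size()
```

Shrinking `MEAN_INTERARRIVAL` raises the offered load, which is how the amount of short-lived traffic is controlled in the experiments.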
Each experiment runs for 1200 seconds. The long-lived and short-lived background flows start at time 0. When an experiment has two high-speed TCP flows, the first flow starts at 30 seconds and the second at 130 seconds. We take measurements after the first 135 seconds to eliminate the effect of the start-up phase. We perform a suite of experiments where the propagation delays for the two high-speed TCP flows are set to 16, 42, 82, 162, and 324 milliseconds to simulate different network scenarios. Note that while the propagation delays for the high-speed TCP flows are set to these values, the propagation delays for background traffic are randomly sampled from the propagation delay model described in Section 3.2. The actual delay that a packet experiences depends on the buffer size and the amount of traffic at the time of packet transmission. We simulate scenarios where high-speed TCP flows experience either the same or different propagation delays. For each of the chosen propagation delays, we perform experiments with and without background traffic to contrast the experimental results and protocol behaviors between these scenarios. The congestion control protocols for high-speed networks are evaluated based on the following properties: fairness, convergence time, RTT fairness, TCP friendliness, link utilization, stability of throughput, and packet loss rates.
We measure the utilization ratio of the bottleneck link capacity. In our current test setup, 400Mbps is the maximum link capacity, and about 95% utilization of this capacity appears to be the maximum observed at the bottleneck router with any protocol we tested. We conjecture that the 5% utilization loss is due to the software implementation of the router. We found that the utilization ratio is highly sensitive to the queue size of the bottleneck router, the characteristics of the background traffic, and the behavior of the protocols being tested.
Figure 2 below shows the results of experiments where we run one flow of a high-speed TCP protocol together with one flow of regular TCP-SACK (SACK). In each experimental run, both flows have the same RTT. We vary the RTT from 16ms to 324ms. The buffer size of the bottleneck router is fixed at 2 Mbytes. With a 324ms RTT, this buffer size amounts to roughly 12% of the bandwidth-delay product (BDP) of the network. All protocols drop their utilization below 65% in the 324ms RTT run. This is because the window size (cwnd) of the high-speed TCP flow being tested (except for TCP-SACK), and its variance, are too large to be accommodated by such a buffer. With a 324ms RTT, the maximum window size of the high-speed flows is larger than 10,000 packets (around 20 Mbytes).
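These figures are easy to verify with a quick arithmetic sketch:

```python
# Check of the buffer-to-BDP ratio quoted in the text.
link_bps = 400e6           # 400 Mbps bottleneck
rtt_s = 0.324              # 324 ms RTT
buffer_bytes = 2 * 2**20   # 2 Mbyte router buffer

bdp_bytes = link_bps * rtt_s / 8   # bandwidth-delay product = 16.2 MB
ratio = buffer_bytes / bdp_bytes   # about 0.13, i.e., roughly 12% of BDP

# With 1500-byte packets, a window covering the full BDP is
# above 10,000 packets, consistent with the text.
packets = bdp_bytes / 1500
```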
Figure 2: Link utilization ratio when the buffer size is limited to 2 Mbytes; one TCP-SACK flow and one TCP variant flow, 400Mbps bottleneck capacity. We vary the RTT from 16ms to 324ms. Each data point represents the average link utilization of one run.
To our surprise, when we add background traffic (both short- and long-lived flows) to the same experiment, the utilization ratio changes dramatically. Figure 3 shows the utilization results of the same experiment as in Figure 2 but with background traffic.
Figure 3: Link utilization. The same experiment as the above but with background traffic. (More information about this experiment)
The utilization ratio with a 324ms RTT improves to 90-95% for HSTCP, CUBIC, BIC, FAST, and TCP-SACK. This change in utilization confirms the recent work by Appenzeller et al. showing that high statistical multiplexing in the bottleneck link (i.e., high randomness) permits the use of small buffers. Note that there is enough background traffic to consume all the link capacity even without the high-speed TCP flows. These background flows are generated by TCP, and their transmission rates are elastic to the amount of traffic in the network, up to the limit imposed by the 64KB maximum receiver buffer size.
Even with background traffic, the utilization of HTCP, FAST, and STCP with RTTs of 162ms and 324ms is significantly lower than that of the other protocols. For HTCP, we conjecture that this problem is related to an inherent protocol behavior: HTCP ties its window reduction to the estimated buffer size of the network. For STCP, the cause is simply that it is too aggressive. For FAST, its behavior becomes less predictable in the presence of background traffic due to noise in RTT estimation. We examine these behaviors of HTCP, FAST, and STCP later. Figures 4 and 5 below show two different types of experiments, both with background traffic: (1) two flows of a TCP variant with the same RTT, and (2) two flows of a TCP variant where the RTT of one flow is set to 162ms and the RTT of the other flow is varied from 16ms to 162ms.
Figure 4: Link utilization with background traffic, two flows of a high speed TCP variant with the same RTT. (More information)
Figure 5: Link utilization with background traffic, two flows of a high-speed TCP variant. The RTT of one flow is set to 162ms and the RTT of the other flow is varied from 16ms to 162ms. (More information)
Both experiments in Figures 4 and 5 are conducted with two flows of the same TCP variant; the difference lies in the RTT values of the two flows. When the flows have the same RTT, their average transmission rates are approximately the same (if the protocol ensures intra-protocol fairness); when they have different RTTs, one flow tends to have a higher transmission rate than the other. We observe that the utilization of HTCP is generally lower than that of the other protocols. Note that the experiments inject enough background traffic to consume the entire network capacity even without the high-speed flows. Especially in the second experiment (Figure 5), HTCP shows lower utilization even with low RTTs. This suggests that HTCP may also have lower utilization when it competes with heterogeneous traffic of different RTTs.
Figure 6 shows the average utilization of each traffic type measured in the second experiment when one HTCP flow with an RTT of 162ms (red line) competes with another HTCP flow with an RTT of 42ms (green line), corresponding to the 42ms point in Figure 5 above. The high level of fluctuation in the total link utilization (light yellow line) is clearly visible.
Figure 6: The average utilization of each traffic type, measured at one-second intervals, when one HTCP flow with an RTT of 162ms competes with another HTCP flow with an RTT of 42ms (the 42ms point in Figure 5 above). Light green: total utilization; red: the HTCP flow with 162ms RTT; green: the HTCP flow with 42ms RTT; blue: long-lived background TCP flows; purple: short-lived TCP flows. (More information)
Figures 7 (a) and (b) show the throughput measured at the router for runs with CUBIC flows and STCP flows, respectively. The experimental setup is the same as in Figure 6. Both cases show much improved stability compared to that of HTCP.
Figure 7: The same metric in the same experimental environment as in Figure 6, except that the high-speed flows are replaced by CUBIC flows and STCP flows, respectively. The light yellow-green lines (the top lines) indicate the total utilization of the bottleneck link. Both achieve very good link utilization.
The different utilization ratios of the various high-speed protocols can be explained as follows. We found that protocols that increase their transmission rate slowly (i.e., with reduced increments) near the saturation point tend to have high utilization. CUBIC, BIC, and FAST have this feature (the poor performance of FAST is not related to this feature but, we conjecture, to its dependency on RTT). Smaller increments near the saturation point lengthen the congestion epoch and result in fewer self-induced losses. Furthermore, they reduce the chance of synchronized losses. When a high-rate flow with a long RTT overflows the buffer, the overflow may last for some time (at minimum, for the duration of one RTT). During this period, competing flows arriving at the same buffer may experience the same losses. Most protocols other than FAST, BIC, and CUBIC have a convex window growth function whose increments are largest at the time of packet loss. A large number of overflowed packets can prolong the overflow and cause severe loss synchronization. On the other hand, FAST, BIC, and CUBIC have a concave growth function, so the window growth rate at the time of loss is smallest, and they thus tend to experience less synchronized loss. Below we examine this behavior of the protocols in more detail.
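As an illustration of a concave growth function, CUBIC's window curve grows as a cubic of the time elapsed since the last loss event and flattens as it approaches the window size at which that loss occurred. A small sketch, using the illustrative constants from the CUBIC proposal (treat the numbers as examples, not measured values):

```python
C = 0.4          # CUBIC scaling constant (segments/sec^3)
BETA = 0.2       # multiplicative decrease factor at a loss
W_MAX = 1000.0   # window size (segments) at the last loss event

# Time for the window to climb back to W_MAX after the reduction.
K = (W_MAX * BETA / C) ** (1.0 / 3.0)

def cubic_window(t):
    """Congestion window t seconds after the last loss."""
    return C * (t - K) ** 3 + W_MAX

# Per-second increments shrink as the window approaches W_MAX:
# this is the "slow growth near the saturation point" in the text.
increments = [cubic_window(t + 1) - cubic_window(t) for t in range(int(K))]
assert all(a >= b for a, b in zip(increments, increments[1:]))
```

Because the increment is smallest exactly where buffer overflow is most likely, a loss spills far fewer excess packets into the queue than under a convex (e.g., exponential) growth rule.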
(a) one HSTCP flow with one TCP flow
(b) two HSTCP flows with the same RTT
Figure 8: Loss synchronization of HSTCP flows. (a) The same setup as in Figure 3 with 162ms RTT, and (b) the same setup as in Figure 5, but two HSTCP flows have the same RTT of 162ms.
In Figure 8, we find that whenever the dominating HSTCP flow has a loss, the other flows also have losses. This is visible from the timing of the fluctuations in the total utilization, which match those of the dominating flow (dark green line), and from the fact that the drop in total utilization is bigger than the rate reduction of the one HSTCP flow. The blue line is the total throughput of the long-lived background TCP flows.
(a) one STCP flow with one TCP flow
(b) two STCP flows with the same RTT
Figure 9: Loss synchronization of STCP flows. (a) The same setup as in Figure 3 with 162ms RTT, and (b) the same setup as in Figure 5, but two STCP flows have the same RTT of 162ms.
In Figure 9 (a), we observe that this problem is even worse with STCP. Just before a drop, STCP's increment is very large because it adopts an exponential growth function, overflowing the buffers. This affects both the other flows and STCP itself: it suffers multiple cascaded losses, and the background traffic (which does not even occupy a significant portion of the bandwidth) experiences packet losses at the same time (the blue line drops when the green line drops); the loss synchronization is severe. In Figure 9 (b), a similar phenomenon is observed. In this case, the two STCP flows take too much bandwidth. The synchronized reduction between STCP and the other flows is hard to see because the bandwidth occupied by the other flows is very small (STCP takes most of the bandwidth), but the drop in total utilization at each loss event is larger than one STCP flow's reduction.
(a) one BIC flow with one TCP flow
(b) two BIC flows with the same RTT
Figure 10: Loss synchronization of BIC flows. (a) The same setup as in Figure 3 with 162ms RTT, and (b) the same setup as in Figure 5, but two BIC TCP flows have the same RTT of 162ms.
BIC shows improved utilization in both cases of Figure 10. We still see some synchronization in Figure 10 (a), but it is not as severe as with HSTCP and STCP.
(a) one CUBIC flow with one TCP flow (more information)
(b) two CUBIC flows with the same RTT (more information)
Figure 11: Loss synchronization of CUBIC flows. The same setup as in Figure 10.
CUBIC shows a slight improvement over BIC. The amount of synchronization is very small, as is clearly visible from a comparison between Figures 10 and 11. The low utilization at the beginning of Figure 11 (a), before 150 seconds, is due to the aggressiveness of slow start in the joining TCP-SACK flow, which causes the high-speed flow to suffer several cascaded losses while TCP-SACK (after slow start) is not fast enough to consume the available bandwidth.
We measure the stability of a protocol by the coefficient of variation (CoV), a stability metric also used in prior work. For this metric, we take samples of the transmission rate of the flow being observed at periodic intervals; each sample is the arithmetic average of the transmission rate observed during that interval. In our experiments, we are interested in the stability (or instability) that high-speed protocol flows induce in the entire network traffic. Thus, we measure the total throughput observed at the bottleneck router instead of the transmission rate of an individual protocol flow. Below, we present the average CoV of the total throughput observed at the bottleneck router, measured over 10-second intervals (we also have results for other time intervals; more information is available from our experimental page linked below). We measure the CoV after the first 200 seconds in each 1200-second run.
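A minimal sketch of this stability metric, with the interval averaging described above:

```python
from statistics import mean, stdev

def cov(samples):
    """Coefficient of variation: standard deviation over mean."""
    return stdev(samples) / mean(samples)

def interval_averages(per_second_tput, interval=10):
    """Average per-second throughput over non-overlapping intervals."""
    return [mean(per_second_tput[i:i + interval])
            for i in range(0, len(per_second_tput), interval)]
```

A perfectly flat throughput trace gives a CoV of 0, while wide swings push the CoV toward the 0.1 level that the text treats as high instability; averaging over longer intervals smooths out small-time-scale fluctuation, which is why a flow can look jittery second-to-second yet score a low 10-second CoV.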
Figure 12 shows the average CoV when two flows of a high-speed TCP variant run with the same RTT. No background traffic is added. In our experiments, a CoV of 0.1 indicates high instability. In this graph, we can see that HTCP, STCP, and TCP-SACK cause high fluctuations in the usage of the bottleneck router capacity. To demonstrate the level of instability corresponding to a CoV of 0.1 or higher, we plot the router capacity utilization of the STCP and HTCP runs with a 324ms RTT, measured at one-second intervals, in Figure 13. The total utilization of these two runs is around 80%; CUBIC, BIC, and HSTCP achieve more than 90%. Information about the link utilization for this experiment can be found by clicking here.
Figure 12: Average CoV over 10-second intervals. Two flows of a high-speed TCP variant running with the same RTT, no background traffic, 400Mbps, 2MB buffer. (More information)
Figure 13: The utilization of STCP and HTCP flows measured at one second intervals. The light green line indicates the total utilization of the link capacity. 400Mbps, 324ms RTT, 2MB buffer.
Figure 14 shows the same metric for runs with the same setup as above but with background traffic. We observe that the CoVs of all the protocols decrease. Thus, we can conclude that as we add more background traffic, the stability of the protocols improves. However, HTCP still shows very high CoV values, and FAST shows gradually increasing CoV values as the RTT increases.
Figure 14: Average CoV over 10-second intervals. The same environment as above, but with background traffic. (More information)
Figure 15 shows the utilization measured at one-second intervals for various protocols after we add background traffic to the same experiment as in Figure 12. We observe that both STCP and HTCP improve quite a bit compared to the no-background-traffic case. FAST also fluctuates more with high RTTs once background traffic is present. Although STCP exhibits fluctuations at smaller time scales than FAST, its CoV is less than that of FAST with a 324ms RTT. This is because the CoV (shown in Figure 14) is averaged over 10-second intervals, implying that STCP has much less fluctuation at long time scales than FAST with a 324ms RTT.
Figure 15: The utilization of HTCP, STCP, CUBIC and FAST flows of RTT 324ms measured at one second intervals after we add background traffic. STCP has improved substantially while HTCP still shows significant fluctuations.
This instability of HTCP occurs not only in long-RTT networks but also when high-speed flows with different RTTs compete, as we showed earlier. Figure 16 shows the average CoV when two high-speed flows with different RTTs compete for the same bottleneck capacity (one RTT is fixed at 162ms and the other is varied from 16ms to 162ms). This phenomenon is also observed in prior work. Unlike in the same-RTT experiment, FAST shows much better stability here. This indicates that test cases must also include scenarios with more diverse RTTs (even among the high-speed TCP flows being observed).
Figure 16: Average CoV over 10-second intervals. Two flows of the same TCP variant with different RTTs (one flow with a fixed RTT of 162ms; the other flow's RTT is varied from 16ms to 162ms). 2MB buffer, with background traffic. (More information)
Above, we examined network capacity utilization and stability. In this section, we examine the packet loss observed at the bottleneck router as high-speed TCP flows compete for the bottleneck capacity. We measure the total packet loss at the bottleneck link and do not distinguish which flows experience the losses. Like the stability measurement, this loss metric also measures the impact of high-speed TCP flows on the background traffic.
Figure 17: The packet loss ratio observed at the bottleneck queue. No background traffic, 2MB buffer. The losses include the total losses observed at the queue regardless of flows. (More information)
The packet loss ratio (the total number of packets lost divided by the total number of packets sent at the router) of the various protocols is plotted in Figure 17. In this experiment, we do not add any background traffic, and only two flows of the same TCP variant with the same RTT run at the same time. We observe that HTCP has much higher loss rates than the other protocols in the runs with RTTs of 82ms and longer. STCP also shows high packet loss rates as the RTT increases; STCP is seeing the effect of small buffer sizes with high RTTs. FAST shows the lowest packet loss ratio among all the protocols.
As we add background traffic, the packet loss rates of most protocols slightly increase, but STCP has a reduced loss rate with a 324ms RTT. We conjecture that this is because the randomness present in the network improves the stability and robustness of STCP (and of all protocols). Figure 18 shows the results. Even with background traffic, HTCP still induces significantly more packet losses than the other protocols.
Figure 18: The packet loss ratio observed at the bottleneck queue. With background traffic, 2MB buffer. The losses include the total losses observed at the queue regardless of flows. (More information)
Figure 19 shows the packet loss ratio when two flows of the same TCP variant with different RTTs compete at a bottleneck link. Background traffic is added. The packet loss ratio of HTCP is among the highest. This behavior of HTCP is correlated with the CoV of HTCP shown earlier, implying that high instability induces more packet losses (and lower utilization) in the network, not only for its own flows but for all flows in the network.
Figure 19: The packet loss rate observed at the bottleneck queue. Two flows of the same TCP variant with different RTTs (one fixed at 162ms, the other varied). (More information)
We measure the fairness in sharing the bottleneck bandwidth among competing flows that have different RTTs. There are several notions of this "RTT fairness". One notion is equal bandwidth sharing, where two competing flows obtain the same share of the bottleneck bandwidth even if they have different RTTs. This property may not always be desirable, because long-RTT flows tend to use more resources than short-RTT flows, since they are likely to travel through more routers over a longer path. Another notion is to have bandwidth shares inversely proportional to the RTT ratio. This proportional fairness makes more sense in terms of the overall end-to-end resource usage. Although there is no commonly accepted notion of RTT fairness, it is clear that the bandwidth share ratio should stay within some reasonable bound so that no flow is starved simply because it travels a longer distance.
Note that RTT-fairness is highly correlated with the amount of randomness in packet losses (in other words, the degree of loss synchronization). Under more random environments, protocols tend to have better RTT-fairness.
Figure 20 shows the RTT fairness of various protocols without any background traffic. Two flows are tested; we fix the RTT of one flow at 162ms and vary the RTT of the other from 16ms to 162ms.
Figure 20: RTT-fairness tested with no background traffic (using Jain’s fairness index). 400 Mbps. Two flows of each TCP variant are tested: one flow has a fixed RTT of 162ms while the other has a variable RTT from 16ms to 162ms.
FAST has the best fairness index: it achieves equal sharing between the two FAST flows regardless of their RTTs. CUBIC’s RTT fairness is linearly proportional to the inverse of the RTT ratio (i.e., the short-RTT flow gets proportionally more bandwidth than the long-RTT flow), so its fairness index is slightly lower than FAST’s. As we indicated above, we question whether FAST’s equal-sharing property regardless of delays is desirable. HTCP and HSTCP have similar RTT fairness. BIC’s RTT fairness is lower than HSTCP’s but higher than STCP’s. This behavior is expected: BIC is known to follow the same RTT fairness as TCP-SACK under very large BDP networks, and in the current testing environment BIC’s RTT fairness is designed to fall between those of TCP and STCP. The argument is that in a network of this size there will be enough multiplexing that RTT unfairness would not be severe. We also found that HTCP allows the long-RTT flow to take more bandwidth. Figure 21 shows the bandwidth shares and RTTs of the two HTCP flows under test. We find this quite unusual and do not know why HTCP behaves this way; this behavior was not observed in the earlier version of HTCP.
(a) Bandwidth share of the two HTCP flows
(b) RTTs of the two flows
Figure 21: RTT fairness test of HTCP when one flow has 162ms RTT and the other has 22ms RTT (the delays shown include buffer delay as well). HTCP allows the longer-delay flow to use more link capacity.
Figure 22 shows the same metric as Figure 20 but with background traffic, where we expect more asynchrony in packet losses. We found that background traffic has the biggest impact on FAST, while most protocols improve their fairness compared to the cases without background traffic. As suggested earlier, BIC’s fairness improves substantially, close to that of TCP-SACK, as we add background traffic.
Figure 22: RTT fairness test with background traffic. In general, most protocols gain in fairness with background traffic (for comparison, see Figure 20). CUBIC achieves proportional RTT fairness (i.e., the bandwidth shares of the two flows are inverse-linearly proportional to the RTT ratio). FAST is affected the most by the background traffic.
To see the behavior of FAST in more detail, we examine the transmission rates of the two competing FAST flows in Figure 23, where one flow has 22ms RTT and the other 162ms RTT. Without background traffic, FAST achieves equal sharing; with background traffic, it does not. We conjecture this is because the second flow (red line) fails to estimate the minimum RTT of the link due to background traffic and thus fails to converge to the equal share.
(a) FAST with no background traffic
(b) FAST with background traffic
Figure 23: RTT fairness of FAST when two flows with different RTTs (one 22ms, the other 162ms) compete. Without background traffic, FAST achieves equal sharing; with background traffic, the long-RTT flow fails to take bandwidth away from the short-RTT flow.
We measure how TCP-friendly the high-speed protocols are by running experiments with one high-speed flow and one regular TCP flow with the same RTT over the same bottleneck link, with and without background traffic. We measure TCP friendliness by Jain’s fairness index computed over the throughput of the high-speed flow and the regular TCP flow. Jain’s fairness index is a normalized number between 0 and 1 (1 being the greatest fairness). Figure 24 shows the indices for the various high-speed TCP variants as we vary the round-trip time from 16 to 324 milliseconds with the bottleneck bandwidth set to 400Mbps. Figure 25 shows the results for a 100Mbps link.
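For reference, Jain's fairness index over n throughput values x_i is (sum of x_i)^2 / (n * sum of x_i^2); it equals 1 for perfectly equal shares and approaches 1/n as one flow dominates. A minimal sketch (the function name is ours):

```python
def jain_fairness(throughputs):
    """Jain's fairness index: (sum x_i)^2 / (n * sum x_i^2)."""
    n = len(throughputs)
    s = sum(throughputs)
    sq = sum(x * x for x in throughputs)
    return (s * s) / (n * sq)

jain_fairness([200.0, 200.0])  # equal shares -> 1.0
jain_fairness([300.0, 100.0])  # unequal shares -> 0.8
```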
Figure 24: TCP friendliness of various protocols with and without background traffic under 400 Mbps. One regular TCP-SACK flow and one high-speed TCP variant flow, both with the same RTT, run over a 400Mbps link.
Figure 25: TCP friendliness of various protocols with and without background traffic under 100 Mbps. One regular TCP-SACK flow and one high-speed TCP variant flow, both with the same RTT, run over a 100Mbps link.
From Figures 24 and 25, we observe that HTCP has the best TCP-friendliness in very low RTT networks (where TCP-friendliness matters most, since TCP-SACK has no significant performance problem there), with or without background traffic. However, as RTT increases beyond 16ms, HTCP’s fairness to TCP drops rapidly in both cases.
In general, all TCP variants except FAST improve their TCP friendliness when background traffic is added. There are two main reasons. First, background traffic takes bandwidth away from the high-speed TCP variants, so they become less aggressive as their average window sizes shrink. Second, background traffic increases the randomness in packet losses: it breaks loss synchronization and allows flows to adapt their transmission rates more asynchronously. This is why HSTCP (purple), CUBIC (red) and BIC (dark green) improve their TCP fairness indices considerably with background traffic. The phenomenon is even more pronounced in a smaller-bandwidth network, as seen in Figure 25-b (most notably with HSTCP).
We observe that FAST shows the best TCP friendliness in high RTT networks. This is not necessarily desirable because TCP-SACK is too conservative in high BDP networks. In addition, the TCP friendliness of FAST has been affected the most by the presence of background traffic.
(a) FAST and TCP without background traffic (Green is FAST, Red is TCP-SACK, and light green is the total utilization)
(b) FAST and TCP with background traffic (green - FAST, red - TCP-SACK, blue - long-lived TCP flows, purple - short-lived TCP flows)
Figure 26: Throughput of FAST and TCP-SACK with and without background traffic. 400Mbps, 82ms RTT.
A close examination of the FAST results in Figure 26 shows that FAST (green line) is very friendly to TCP-SACK (red line) and consumes only the bandwidth left unused by TCP-SACK and the background traffic (blue line). Interestingly, FAST actually consumes less bandwidth than TCP-SACK in most of our experiments; it consumes more only when the round-trip time increases above a certain value (note that TCP-SACK becomes less effective in high RTT networks). These behaviors occur when FAST competes with heterogeneous flows that use packet losses as the only congestion indication: loss-based protocols keep increasing their transmission rates as long as there is no loss, even when queuing delays are high, whereas FAST reduces its rate under such circumstances. Thus FAST becomes friendly to TCP flows, sometimes excessively so.
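The yielding behavior above follows from FAST's published window update, w ← min(2w, (1−γ)w + γ(baseRTT/RTT · w + α)): as queueing delay pushes RTT above baseRTT, the update pulls the window down. A simplified one-step sketch (parameter values are illustrative, not the ones used in the experiments):

```python
def fast_window_update(w, base_rtt, rtt, alpha=200.0, gamma=0.5):
    """One step of FAST TCP's periodic window update.
    The window shrinks when queueing delay (rtt > base_rtt) grows,
    which is why FAST yields to loss-based flows that keep the
    bottleneck queue full."""
    return min(2 * w, (1 - gamma) * w + gamma * (base_rtt / rtt * w + alpha))

# No queueing delay: the window grows by gamma * alpha packets.
w_empty_queue = fast_window_update(1000.0, 0.1, 0.1)
# Queueing delay doubles the RTT: the window backs off.
w_full_queue = fast_window_update(1000.0, 0.1, 0.2)
```

A loss-based competitor never sees this delay signal, so it keeps growing until the queue overflows, leaving FAST with whatever bandwidth remains.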
(a) STCP and TCP without background traffic (green - STCP, red - TCP-SACK, light green - total utilization)
(b) STCP and TCP with background traffic (green - STCP, red - TCP-SACK, blue - long-lived TCP flows, purple - short-lived TCP flows)
Figure 27: Throughput of STCP and TCP-SACK with and without background traffic. 400Mbps, 82ms RTT.
Experimental results for STCP (shown in Figure 27) indicate that STCP (dark green line) usually increases its congestion window too aggressively and causes the router queue to overflow frequently (consistent with the observations made in Sections 1 and 6). This aggressive behavior causes TCP-SACK (red line) to reduce its transmission rate dramatically. With background traffic, STCP still causes the router queue to overflow frequently, but to a lesser extent than without background traffic. In general, the throughput of the regular TCP-SACK flow decreases only slightly when background traffic is added, while STCP obtains considerably lower throughput with background traffic than without. This is why STCP obtains a slightly better fairness index when background traffic is added (see Figure 24).
Figure 28: Intra-protocol fairness of different protocols with and without background traffic. Two flows of a high speed TCP variant with the same RTT running over 400Mbps bottleneck link.
We measure the intra-protocol fairness of protocols by performing experiments with two flows of a high-speed protocol with the same RTT. The throughput of these two flows is used as input to compute Jain’s fairness index. These experiments are conducted with and without background traffic and the round-trip times are varied between 16 and 324 milliseconds.
Without background traffic, HSTCP, TCP-SACK and STCP show lower fairness indices than the other protocols. However, as we add background traffic, all the protocols except FAST show good fairness. A likely reason is that FAST is delay-based: background traffic introduces more dynamics and fluctuations in the bottleneck queue, making it harder for FAST flows to estimate their fair bandwidth shares from delay information. We also observe that HSTCP (violet line) obtains a higher fairness index in the presence of background traffic (as noted by its author, HSTCP relies on statistical multiplexing for faster convergence).
Figure 29: Throughput of two FAST flows with and without background traffic. 400Mbps. 82ms RTT
Figure 29 shows the throughput of two FAST flows (red and dark green lines) in experiments with and without background traffic. FAST shows very stable behavior without background traffic, and the two flows’ throughputs are very close. However, as we add background traffic, one FAST flow (green line) often dominates the other (red line), as shown in the figure. The contrasting behaviors of FAST with and without background traffic demonstrate again the importance of evaluating protocols in different scenarios, in particular with background traffic.
Figure 30: Throughput of two STCP flows with and without background traffic. 400Mbps, 82ms RTT.
Figure 30 shows the throughput of two STCP flows (red and dark green lines) in experiments with and without background traffic. When the two STCP flows ran without background traffic (Figure 30 (a)), they showed a great amount of unfairness (one STCP flow obtained about three times the throughput of the other). However, with background traffic, they achieved about the same throughput (Figure 30 (b)). We also observe that background traffic helps the two STCP flows converge faster and more completely.
Figure 31: Throughput of two HSTCP flows with and without background traffic. 400Mbps, 82ms RTT.
Figure 31 shows the throughput of two HSTCP flows (red and dark green lines) in experiments with and without background traffic. When the two HSTCP flows ran without background traffic (Figure 31 (a)), they showed a great amount of unfairness (one HSTCP flow obtained about twice the throughput of the other). However, with background traffic, they obtained approximately the same throughput. We also observe that background traffic helps the two HSTCP flows converge faster and more completely.
Figure 32: Throughput of two HTCP flows with and without background traffic. 400Mbps, 82ms RTT.
Figure 32 shows the throughput of two HTCP flows (red and dark green lines) in experiments with and without background traffic. In both cases, each flow obtained approximately half the link capacity. HTCP shows good fairness indices and convergence independent of background traffic.
Figure 33: Throughput of two BIC flows with and without background traffic. 400Mbps, 82ms RTT.
Figure 33 shows the throughput of two BIC flows (red and dark green lines) in experiments with and without background traffic. When the two BIC flows ran without background traffic, each obtained about the same throughput on average, but convergence was relatively slow. When background traffic is added, the two BIC flows converge much faster, and each still achieves approximately the same throughput. Similar behavior is observed with CUBIC below.
Figure 34: Throughput of two CUBIC flows with and without background traffic. 400Mbps, 82ms RTT.
Figure 35: Convergence time of different protocols with and without background traffic.
Figure 35 shows the convergence time of two high-speed protocol flows that are started at different times. The convergence time is defined as the elapsed time until the timed average throughput of the second flow reaches 80% of the first flow’s (recall from Section 2 that the two flows start at 30 and 130 seconds). The average throughput is sampled at one-second intervals. The convergence times in Figure 35 allow us to discuss quantitatively the dynamic behaviors of the protocols that we qualitatively pointed out above.
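This metric can be computed directly from the per-second throughput traces. A simplified sketch using instantaneous per-second averages (the function name is ours; the 130-second start time follows the experiment setup above):

```python
def convergence_time(first, second, threshold=0.8, start=130):
    """Given per-second average throughput samples for the two flows
    (lists indexed by seconds since the experiment began), return the
    elapsed time after the second flow starts until its throughput
    first reaches `threshold` of the first flow's, or None if it
    never converges within the trace."""
    for t in range(start, min(len(first), len(second))):
        if first[t] > 0 and second[t] >= threshold * first[t]:
            return t - start
    return None
```

For example, against a first flow holding 100 Mbps, a second flow that ramps up by 2 Mbps per second after starting at t=130s converges (reaches 80 Mbps) 40 seconds later.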
CUBIC and BIC show convergence times of up to 150 seconds when running without background traffic. Above 300ms RTT, their convergence times drop. This is because the small (2MB) buffer allows the second flow, when it enters slow start, to perturb the network enough to force the first flow to give up its bandwidth share quickly. The same effect is visible with STCP: because STCP is very aggressive, the second flow always forces the first flow down quickly, so STCP shows fairly short convergence times at high RTTs (although in most cases STCP does not converge beyond 80% without background traffic). HSTCP, on the other hand, converges very slowly, and in most cases cannot reach 80% at all. As seen in Figure 35 (b), HSTCP (purple) reduces its convergence time considerably with background traffic, and STCP (yellowish-green line) also shows improved convergence behavior. FAST (brown line), by contrast, shows a noticeably longer convergence time with background traffic, and at low RTTs FAST flows do not converge. The other protocols show rather short convergence times both with and without background traffic (note the change of scale on the y axis).
Figure 36: Fairness index of different protocols over different timescales.
While the convergence time sheds light on the dynamic behavior of the protocols, it does not give a complete view of their convergence behavior: it measures the time until the second high-speed flow reaches 80% of the first flow’s throughput, but says nothing about the flows’ dynamics afterward. Another metric for investigating dynamic behavior is the average fairness index over different time scales.
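One way to realize this metric (a sketch under our own assumptions; the paper does not specify the exact windowing) is to average Jain's index over non-overlapping windows of a given length, so that short windows expose transient unfairness that long windows smooth away:

```python
def fairness_over_timescale(x, y, window):
    """Average Jain's fairness index of two flows' per-second
    throughput traces, computed over non-overlapping windows of
    `window` seconds. Short windows expose transient unfairness;
    long windows only reflect long-term averages."""
    def jain(a, b):
        return (a + b) ** 2 / (2 * (a * a + b * b)) if (a or b) else 1.0
    vals = []
    for i in range(0, min(len(x), len(y)) - window + 1, window):
        a = sum(x[i:i + window]) / window
        b = sum(y[i:i + window]) / window
        vals.append(jain(a, b))
    return sum(vals) / len(vals)
```

For instance, two flows that alternate between 100 Mbps and 0 Mbps each second look perfectly fair at a 2-second timescale but maximally unfair at a 1-second timescale.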
In Figure 36, we show the average fairness index over different time scales for the protocols under evaluation, with a round-trip time of 82 milliseconds. Most protocols show equally good or better convergence behavior with background traffic. The main exception is FAST (brown line), which took longer to converge with background traffic than without. A minor exception is BIC (red line), which shows slightly worse convergence behavior with background traffic.
In this rather long paper, we summarized the results of an evaluation study of a collection of congestion control algorithms for high-speed networks, using metrics such as fairness, convergence time, packet loss rate, link utilization, RTT fairness, TCP friendliness, and stability of throughput. We focused on the impact of background traffic on the behavior of these protocols. It is known that background traffic may affect protocol behavior, but little is known about how. Our study sheds some light on this problem; further study will reveal more properties of protocol behavior under background traffic.
We do not declare a winner in our evaluation; we simply show contrasting experimental results and complex protocol behaviors when experiments are conducted with and without background traffic. Our results demonstrate that different conclusions can be drawn depending on whether background traffic is present. Evaluating a new protocol without background traffic can therefore be misleading, and a thorough evaluation needs to consider a variety of testing scenarios to make valid observations about a protocol’s behavior. While we do not claim that our models for traffic and propagation delays are the most realistic, we believe the steps we took are at least steps toward realistic evaluation of these protocols, and we hope that our work improves the methodology for evaluating congestion control protocols.
Further, we conclude that high-speed protocols have rather complex behaviors, and a thorough evaluation of these protocols needs to investigate all aspects of them. In light of our evaluation, there will probably be no “perfect” high-speed protocol that is a clear winner in all the different (and sometimes conflicting) aspects of protocol behavior.
 Lisong Xu, Khaled Harfoush, and Injong Rhee, “Binary Increase Congestion Control for Fast Long-Distance Networks”, INFOCOM 2004.
 Injong Rhee and Lisong Xu, “CUBIC: A New TCP-Friendly High-Speed TCP Variant”, PFLDnet 2005.
 Cheng Jin, David X. Wei and Steven H. Low, “FAST TCP: motivation, architecture, algorithms, performance”, INFOCOM 2004.
 Sally Floyd, “HighSpeed TCP for Large Congestion Windows”, IETF RFC 3649, December 2003.
 Douglas Leith and Robert Shorten, “H-TCP Protocol for High-Speed Long Distance Networks”, PFLDnet 2004.
 Tom Kelly, “Scalable TCP: Improving Performance on Highspeed Wide Area Networks”, Computer Communication Review, April 2003.
 Hadrien Bullot, R. Les Cottrell, and Richard Hughes-Jones, “Evaluation of Advanced TCP Stacks on Fast Long-Distance Production Networks”, PFLDnet 2004.
 Yee-Ting Li, Douglas Leith, and Robert Shorten, “Experimental Evaluation of TCP Protocols for High-Speed Networks”, Technical report, Hamilton Institute, 2005.
 C. Jin, D. X. Wei, S. H. Low, G. Buhrmaster, J. Bunn, D. H. Choe, R. L. A. Cottrell, J. C. Doyle, W. Feng, O. Martin, H. Newman, F. Paganini, S. Ravot, S. Singh, “FAST TCP: From Theory to Experiments”, IEEE Network, January/February 2005.
 Ren Wang, Kenshin Yamada, M. Yahya Sanadidi, and Mario Gerla, “TCP with sender-side intelligence to handle dynamic, large, leaky pipes”, IEEE Journal on Selected Areas in Communications, 23(2):235-248, 2005.
 Ryan King, Rudolf Riedi, Richard Baraniuk, “Evaluating and Improving TCP-Africa: an Adaptive and Fair Rapid Increase Rule for Scalable TCP”, PFLDnet 2005.
 David X. Wei, Pei Cao, and Steven H. Low, “Time for a TCP Benchmark Suite?”, Technical report, 08/2005, available at www.cs.caltech.edu/~weixl/research/technicals/benchmark/summary.ps.
 Sally Floyd, “Metrics for the Evaluation of Congestion Control Mechanisms”, August 2005, Internet draft, draft-irtf-tmrg-metrics-00.txt.
 Jay Aikat, Jasleen Kaur, F. Donelson Smith, and Kevin Jeffay, “Variability in TCP Roundtrip Times”, ACM IMC 2003.
 Paul Barford and Mark Crovella, “Generating Representative Web Workloads for Network and Server Performance Evaluation”, ACM SIGMETRICS 1998.
 G. Appenzeller, I. Keslassy, and N. Mckeown, “Sizing router buffers”, in Proceeding of ACM SIGCOMM’04.
 Dhiman Barman, Georgios Smaragdakis and Ibrahim Matta, “The Effect of Router Buffer Size on HighSpeed TCP Performance”, IEEE Globecom 2004.
 D. Chiu and R. Jain. “Analysis of the Increase/Decrease Algorithms for Congestion Avoidance in Computer Networks.” Journal of Computer Networks and ISDN, 17(1):1–14, 1989.
 E. Altman, K. Avrachenkov, B.J. Prabhu, “Fairness in MIMD congestion control algorithms”, in Proceedings of the IEEE INFOCOM, 2005.
 Sally Floyd and Vern Paxson, “Difficulties in Simulating the Internet”, ACM/IEEE Transactions on Networking, Vol.9, No.4, pp. 392-403, August 2001.
 Luigi Rizzo, “Dummynet: A simple approach to the evaluation of network protocols”, ACM Computer Communications Review, Vol. 27, No. 1, January 1997, pp. 31-41.
 Lixia Zhang, Scott Shenker, and David D. Clark, “Observations on the Dynamics of a Congestion Control Algorithm: the Effects of Two-Way Traffic”, SIGCOMM 1991.
 Joel Sommers, Paul Barford, and Hyungsuk Kim, “Harpoon: A Flow-Level Traffic Generator for Router and Network Tests”, extended abstract, ACM SIGMETRICS 2004.
 F. Hernández-Campos, F. D. Smith, and K. Jeffay, “Generating Realistic TCP Workloads”, in Proceedings of CMG 2004, December 2004.
 Sally Floyd and Eddie Kohler, “Internet Research Needs Better Models”, HotNets-I, October 2002.
 Sumitha Bhandarkar, Saurabh Jain, and A. L. Narasimha Reddy, “Improving TCP Performance in High Bandwidth High RTT Links Using Layered Congestion Control”, PFLDnet 2005.