
Identifying TCP Congestion Control Mechanisms Using Active Probing

SeyedShams Feyzabadi
Computer Science Department, Jacobs University Bremen
[email protected]

Transmission Control Protocol (TCP) carries most of the traffic on the Internet these days. There are several implementations of TCP, and the most important difference amongst them is their congestion control mechanism. One method for determining the type of a TCP is active probing. Active probing treats a TCP implementation as a black box: it sends different streams of data to the host under test and, from the responses received, infers which TCP version is implemented. In this paper we explain our approach, which uses active probing and is able to identify the TCP congestion control behavior of all types of mechanisms, such as Selective Acknowledgment, Explicit Congestion Notification, Reno, Tahoe, Cubic and BIC. As a result, the behavior of these TCP algorithms during their Congestion Avoidance period is compared.

Keywords: Transmission Control Protocol (TCP), Congestion Control, Active Probing

I. INTRODUCTION
Transmission Control Protocol (TCP) is widely used on the Internet, and most Internet traffic is carried by this protocol, so the performance of the Internet depends significantly on TCP's performance. One of the most important aspects of controlling traffic is dealing with congestion. There are several types of TCP, and each implements its own congestion control mechanism; this is the main difference amongst them. Handling traffic with different congestion control mechanisms results in different traffic shapes. Since several TCP connections might share the same link, the performance of each of them has a significant effect on the other connections. For example, if a Reno and a Cubic TCP (both explained later) run on the same link, they will not behave as usual: Cubic is a fast TCP and will increase its rate, while Reno cannot do the same and will decrease its rate. The sum of the traffic of these different TCPs makes up the traffic volume of the respective link. Clearly, we cannot easily understand the traffic shape on a link, because several different TCPs run on that link and affect each other. Some parties, such as ISPs and OS vendors, need to know about the traffic shape on their links, and knowing which TCP mechanisms are running would be very useful in order to deduce the overall traffic shape. Since they neither have access to the source code of the sending host nor permission to log into that host and check, they have to detect the TCP type remotely. In this case active probing is useful [1]. Active probing treats the remote host as a black box: it sends different streams of data to the remote host, the host responds to those streams, and from the responses it can be inferred which congestion control, and consequently which TCP type, is in use.

Actually, it is not possible to probe arbitrary remote hosts, because some of them may not respond to our requests for several reasons, such as lack of permission or the streams being blocked by a firewall. Hence, a specific class of hosts has to be chosen for evaluation. We chose web servers for two reasons. First, no special permission is required for working with a web server: whenever an HTTP GET request is sent to a web server, it will send the appropriate web page. The second reason is statistical: web servers are a huge source of Internet traffic nowadays, so it is very useful to have a tool capable of detecting their traffic shape and congestion control behavior. Indeed, this makes it possible to characterize the behavior of a significant portion of Internet traffic. In this paper we explain our approach for detecting the TCP congestion control mechanism running on remote web servers. Our software is developed in the C language and utilizes some features of the TCP Behavior Inference Tool (TBIT) in order to manipulate TCP packets [2].
The remainder of this paper is organized as follows. Section two explains different congestion control mechanisms that are in wide use. Section three describes other tools for identifying TCP congestion control remotely. We then describe the mechanism we use to distinguish the congestion control type running on a web server. Afterwards, test results obtained with that mechanism are presented. Finally, the work is concluded and future work is outlined.

II. SOME CONGESTION CONTROL MECHANISMS
There are several different mechanisms for controlling congestion. TCP congestion control generally consists of four parts: Slow Start, Congestion Avoidance, Fast Retransmit and Fast Recovery. The Slow Start and Congestion Avoidance algorithms must be used by TCP senders to control the amount of outstanding data being injected into the network.

For this reason, the sender starts with a low sending rate and doubles the amount of data in flight every round trip, since the window grows by one segment for each acknowledgment (ACK) received. After crossing a threshold it moves to the Congestion Avoidance phase [12]. A TCP receiver should send an immediate duplicate ACK whenever an out-of-order segment arrives. The purpose of this ACK is to inform the sender that a segment was received out of order and which sequence number is expected. After receiving three duplicate ACKs, the sender enters the Fast Retransmit and Fast Recovery stages and, depending on the TCP type in use, changes its transmission rate in order to control the congestion. The original definition of the TCP congestion control mechanism can be found in [5]. In the remainder of this section we briefly describe three different mechanisms implemented to control congestion within the Fast Retransmit and Fast Recovery stages.

A. Reno
TCP Reno is one of the earliest TCPs implemented after Tahoe. It contains the Slow Start, Congestion Avoidance, Fast Recovery and Fast Retransmit phases. After receiving a small number of duplicate ACKs for the same TCP segment, the sender infers that a packet has been lost and retransmits it without waiting for a retransmission timer to expire, leading to higher channel utilization and connection throughput. Fast Recovery operates on the assumption that each received duplicate ACK represents a single packet having left the pipe; thus, during Fast Recovery the TCP sender can make intelligent estimates of the amount of outstanding data. Fast Recovery is entered by a TCP sender after receiving an initial threshold of duplicate ACKs. This threshold, usually known as tcprexmtthresh, is generally set to three. Once the threshold of duplicate ACKs is received, the sender retransmits one packet and reduces its congestion window by one half. Instead of slow-starting, the Reno sender uses additional incoming duplicate ACKs to clock subsequent outgoing packets [3].
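As a rough illustration of the Reno rules just described, the following C sketch models the sender-side window bookkeeping in segments: slow start, congestion avoidance, and the reaction to three duplicate ACKs. The structure and variable names are ours and heavily simplified; real stacks keep the window in bytes and handle many more cases.

```c
#include <stdio.h>

/* Minimal sketch of Reno-style window bookkeeping (in segments).
 * An illustration of the rules described in the text, not code taken
 * from an actual TCP implementation. */
struct reno {
    double cwnd;      /* congestion window in segments */
    double ssthresh;  /* slow start threshold */
    int    dupacks;   /* consecutive duplicate ACKs seen */
};

static void on_new_ack(struct reno *r)
{
    r->dupacks = 0;
    if (r->cwnd < r->ssthresh)
        r->cwnd += 1.0;              /* slow start: +1 per ACK, i.e. doubling per RTT */
    else
        r->cwnd += 1.0 / r->cwnd;    /* congestion avoidance: +1 segment per RTT */
}

static void on_dup_ack(struct reno *r)
{
    if (++r->dupacks == 3) {         /* tcprexmtthresh reached */
        r->ssthresh = r->cwnd / 2.0; /* multiplicative decrease */
        r->cwnd = r->ssthresh;       /* retransmit lost segment, enter fast recovery */
    }
}

int main(void)
{
    struct reno r = { 2.0, 64.0, 0 };
    for (int ack = 0; ack < 200; ack++)
        on_new_ack(&r);
    printf("cwnd after 200 ACKs: %.1f segments\n", r.cwnd);
    on_dup_ack(&r); on_dup_ack(&r); on_dup_ack(&r);
    printf("cwnd after 3 dup ACKs: %.1f segments\n", r.cwnd);
    return 0;
}
```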

B. Binary Increase Congestion control (BIC)
This congestion control mechanism used to be the default TCP on Linux machines. It also contains the basic four parts, namely Slow Start, Congestion Avoidance, Fast Recovery and Fast Retransmit. Its behavior is similar to Reno in Slow Start, Fast Recovery and Fast Retransmit; the only difference is in Congestion Avoidance. As stated before, Reno increases its Congestion Window (CWND) by one segment after receiving all ACKs of the previous CWND. BIC acts differently in this area: it uses a binary search in order to reach the saturation point, which makes it fast and scalable. The detailed algorithm is as follows. After a packet loss, BIC reduces its CWND by a factor of β, whose default value is 0.2. The window size just before the reduction is recorded as Wmax, and the size after the reduction as Wmin. In the next step it computes the midpoint of these two values and sets the new CWND to that midpoint; if the midpoint is very far from Wmin, the increase is limited to a constant value called Smax. If no loss is detected, the current CWND becomes the new Wmin. This process continues until the increment is smaller than a constant Smin, at which point the CWND is set to Wmax. If still no loss is detected, it enters the max probing phase, in which the window increments are exactly symmetric to the previous phase. A detailed description of this algorithm can be found in [10]. A diagram of this algorithm during the Congestion Avoidance phase is shown in Figure 1.

Figure 1: BIC growth function (from [11])
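The following C sketch illustrates the binary-search growth just described, using the default β of 0.2 and assumed values for Smax and Smin. It models a single congestion-avoidance epoch between a loss and Wmax and omits the symmetric max-probing phase; the function names and constants are ours, not taken from the kernel implementation.

```c
#include <stdio.h>

/* Sketch of BIC's binary-search window growth between the reduced window
 * and Wmax (the window at the last loss). Simplified for illustration;
 * real kernels use scaled integer arithmetic and handle max probing. */
#define BIC_BETA  0.2   /* multiplicative decrease factor */
#define BIC_SMAX  16.0  /* assumed cap on the per-RTT increment */
#define BIC_SMIN  0.1   /* assumed minimum increment before jumping to Wmax */

static double bic_next_cwnd(double cwnd, double wmax)
{
    double target = (cwnd + wmax) / 2.0;  /* midpoint of the binary search */
    double inc = target - cwnd;
    if (inc > BIC_SMAX) inc = BIC_SMAX;   /* additive increase when far from Wmax */
    if (inc < BIC_SMIN) return wmax;      /* close enough: set CWND to Wmax */
    return cwnd + inc;
}

int main(void)
{
    double wmax = 64.0;
    double cwnd = wmax * (1.0 - BIC_BETA); /* window right after a loss */
    for (int rtt = 0; rtt < 15; rtt++) {
        printf("RTT %2d: cwnd = %6.2f segments\n", rtt, cwnd);
        cwnd = bic_next_cwnd(cwnd, wmax);
    }
    return 0;
}
```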

C. Cubic
BIC utilizes the bandwidth very well, but it is too aggressive in short-RTT networks. Cubic is the enhanced version of BIC and is used as the default Linux congestion control mechanism nowadays. As its name indicates, Cubic TCP uses a cubic function as its CWND growth function. At the same time it uses elapsed time instead of the RTT, so it behaves the same whether it is running in a short-RTT or a long-RTT network. The only difference between Cubic and BIC is the Congestion Avoidance phase. The detailed algorithm is as follows. After a packet loss, Cubic reduces the CWND by a multiplicative factor β, which is again 0.2 by default. The window size just before the reduction is recorded as Wmax. The CWND growth function then follows a cubic curve whose plateau is set at Wmax: it grows in a concave mode until it reaches the plateau and then switches to a convex mode. The window is therefore given by

W(t) = C (t − K)^3 + Wmax,   where   K = (Wmax β / C)^(1/3)

Here C is the Cubic constant, t is the time elapsed since the last window reduction, and K is the time it takes for the window to grow from its reduced value back to Wmax when no further loss is detected. The resulting behavior of this TCP algorithm during the Congestion Avoidance phase is shown in Figure 2.
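A minimal C sketch of this growth function is given below. The value 0.4 for the Cubic constant C is our assumption (a value commonly used in implementations, not stated in the text); β is the default 0.2, and Wmax is taken as 64 segments purely for illustration.

```c
#include <stdio.h>
#include <math.h>

/* Sketch of the CUBIC window growth function W(t) = C*(t - K)^3 + Wmax,
 * with K = cbrt(Wmax * beta / C). For illustration only; compile with -lm. */
#define CUBIC_C    0.4   /* assumed Cubic constant */
#define CUBIC_BETA 0.2   /* default multiplicative decrease factor */

static double cubic_cwnd(double t, double wmax)
{
    double k = cbrt(wmax * CUBIC_BETA / CUBIC_C); /* time needed to climb back to Wmax */
    return CUBIC_C * pow(t - k, 3.0) + wmax;
}

int main(void)
{
    double wmax = 64.0;
    for (double t = 0.0; t <= 10.0; t += 1.0)
        printf("t = %4.1f s  ->  W(t) = %7.2f segments\n", t, cubic_cwnd(t, wmax));
    return 0;
}
```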


Figure 2: CUBIC growth function (from [11])

III. RELATED WORK
The TCP Behavior Inference Tool (TBIT) is a tool that uses active probing to examine the TCP running on web servers. It can test several aspects of the running TCP, including the initial value of the congestion window, the congestion control algorithm, conformant congestion control, the response to Selective Acknowledgment (SACK), the response to Explicit Congestion Notification (ECN), and the TIME-WAIT duration.
The BSD Packet Filter (BPF) is a kernel agent that provides user-level packet capture. It captures specific packets and discards the others as early as possible. BPF contains two components: the network tap and the packet filter. The network tap collects copies of packets from the network device drivers and delivers them to the application; the filter decides whether a packet should be accepted and, if so, how much of it to copy to the listening port [4].
TBIT generally works as follows. Its process establishes and maintains a TCP connection with the remote host entirely at the user level. The TBIT process fabricates TCP packets and uses raw IP sockets to send them to the remote host. It also sets up a local firewall rule to prevent the kernel of the local machine from responding to packets arriving from the remote host. At the same time, a BPF device is used to deliver these packets to the TBIT process. This user-level TCP connection can then be manipulated to extract information about the remote TCP. The flow of data in TBIT is shown in Figure 3.


Figure 3: TBIT Architecture
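To make this architecture more concrete, the following C sketch shows how a user-level tool of this kind might obtain its two ingredients: a raw socket for sending hand-crafted TCP segments and a BPF-based capture channel for reading the server's replies. We use libpcap here, which compiles textual BPF filters, rather than TBIT's direct BPF device access; the interface name, filter string and port are illustrative assumptions, and the program needs root privileges and must be linked with -lpcap.

```c
#include <stdio.h>
#include <unistd.h>
#include <pcap.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Rough sketch of the two halves of a TBIT-like user-level TCP:
 * a raw socket for sending fabricated TCP segments, and a BPF-based
 * capture (via libpcap) for receiving the server's replies.
 * Error handling is reduced to a minimum. */
int main(void)
{
    /* 1. Raw socket: the tool builds the TCP header itself (crafting of the
     *    IP/TCP headers and the actual sendto() calls are omitted here). */
    int raw = socket(AF_INET, SOCK_RAW, IPPROTO_TCP);
    if (raw < 0) { perror("socket"); return 1; }

    /* 2. BPF capture: keep only TCP segments from port 80 so the probing
     *    process sees the web server's packets; the local kernel itself is
     *    silenced by a firewall rule, as described in the text. */
    char errbuf[PCAP_ERRBUF_SIZE];
    pcap_t *cap = pcap_open_live("eth0", 65535, 0, 100, errbuf);
    if (!cap) { fprintf(stderr, "pcap: %s\n", errbuf); close(raw); return 1; }

    struct bpf_program prog;
    if (pcap_compile(cap, &prog, "tcp and src port 80", 1, PCAP_NETMASK_UNKNOWN) == 0)
        pcap_setfilter(cap, &prog);

    /* ... fabricate a SYN, send it via the raw socket, read replies with
     *     pcap_next_ex(cap, ...), and drive the rest of the test ... */

    pcap_close(cap);
    close(raw);
    return 0;
}
```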

Our focus is on the congestion control identification part of TBIT. Generally there are two types of TCP mechanisms. The first type requires a specific agreement during the handshake of the two hosts: these mechanisms, like SACK and ECN, have to enable specific options in the SYN packet before they can be used, meaning that sender and receiver must agree on the mechanism before the TCP connection starts. The second type requires no such agreement during the handshake; these mechanisms depend only on the sender host, i.e. the web server sends all packets according to its own configuration. To identify the TCP running on a remote host we have to deal with both types. Detecting the first type is not hard, because of the agreement made in advance: whenever TBIT enables the corresponding options in its SYN packet, the server either agrees or does not. This type will be explained later in more detail. Identifying the second type is not so straightforward, and TBIT has to identify it in a more complicated way. Considering a client A and a server B, TBIT identifies the second type as follows:
• It sets up the firewall and BPF on A in order to capture the packets arriving from B.
• It establishes a connection between A and B.
• It sends a TCP SYN packet with the destination address of host B and a destination port of 80; the Maximum Segment Size (MSS) is set to 100 bytes.
• It requests the base web page.
• The remote server starts sending the base page in 100-byte packets.
• It creates ACKs according to the TCP protocol up to packet 13.
• It drops packet 13.
• It receives and acknowledges packets 14 and 15, but the acknowledgment is sent three times for packet 13, which makes the sender enter Fast Retransmission.
• It drops packet 16.
• It receives and acknowledges the remaining packets up to packet 25.
• It closes the connection after acknowledging packet 25.
According to the responses received from the remote host, TBIT can infer what type of TCP congestion control is running on that machine. NewReno is characterized by a Fast Retransmission of packet 13, no additional Fast Retransmissions or retransmission timeouts, and no unnecessary retransmission of packet 17. Reno is characterized by a Fast Retransmission of packet 13, a retransmission timeout for packet 16, and no unnecessary retransmission of packet 17. Tahoe is characterized by no retransmission timeout before the retransmission of packet 13, but an unnecessary retransmission of packet 17 [2]. Since TCP is a user-configurable transport protocol [8], the responses received from different hosts are not limited to these three types, so the test can also encounter other congestion control behaviors that are not widely used. This algorithm worked fine at the time of its implementation, but it cannot detect newer TCP versions such as Cubic and BIC. The reason is that it is based on Fast Retransmission: since most newer TCPs implement the same Fast Retransmission algorithm as NewReno, TBIT identifies all of them as NewReno.
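The decision rules quoted above can be summarized in a small classifier. The sketch below is our own C illustration of those rules, not TBIT's code; the observation flags would be filled in by analysing the captured packet trace, and the struct and function names are ours.

```c
#include <stdio.h>

/* Sketch of the decision rules for TBIT's congestion control test.
 * The observation flags are derived from the captured trace. */
struct observations {
    int fast_retransmit_13;   /* packet 13 retransmitted without a timeout */
    int timeout_for_16;       /* retransmission timeout while recovering packet 16 */
    int unneeded_retx_17;     /* packet 17 retransmitted although it was received */
};

static const char *classify(const struct observations *o)
{
    if (o->fast_retransmit_13 && !o->timeout_for_16 && !o->unneeded_retx_17)
        return "NewReno";
    if (o->fast_retransmit_13 && o->timeout_for_16 && !o->unneeded_retx_17)
        return "Reno";
    if (o->fast_retransmit_13 && o->unneeded_retx_17)
        return "Tahoe";
    return "unknown/other";
}

int main(void)
{
    struct observations o = { 1, 0, 0 };   /* example trace: looks like NewReno */
    printf("classified as: %s\n", classify(&o));
    return 0;
}
```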

IV. OUR APPROACH
In the previous section we introduced TBIT, explained its functionality and described the problems of working with it. We now explain our approach to identifying the TCP congestion control algorithm. Generally, we used TBIT as our base source code and architecture: we used TBIT's code to set up the firewall and then to configure the BPF device so that the captured packets are handed to our code. The start of our approach is exactly like TBIT, but the identification is different. As mentioned before, there are two TCP mechanism types. For the first type, which requires an agreement during the handshake, no new mechanism has been introduced since TBIT was implemented, so we simply reuse its approach; this is explained in Sections 4.1 and 4.2. Identifying the second type is not easy and is completely different from TBIT's approach. Our approach aims to identify all TCP mechanisms, even ones that have not been implemented yet: instead of checking only the response to Fast Retransmission, we try to count the CWND size. This approach is explained in Section 4.3.

4.1. Response to Selective Acknowledgment
The selective acknowledgment extension uses two TCP options. The first is an enabling option, SACK-Permitted, which may be sent in a SYN segment to indicate that the SACK option can be used once the connection is established. The other is the SACK option itself, which may be sent over an established connection once permission has been given by SACK-Permitted. Our algorithm for checking the response to SACK works as follows:
• It sets up the firewall and BPF as described before, and sends a request to the web server.
• Our software sends a SYN packet with the SACK-Permitted option enabled.
• If it receives a SYN/ACK with the SACK-Permitted option enabled, it assumes that the web server is SACK-enabled.
• It closes the connection.
Obviously, no data packets need to be transferred; everything is done during the handshake.
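As a rough illustration of the option handling behind this test, the following C sketch builds the SACK-Permitted option bytes for a SYN and scans a captured SYN/ACK's option block for the same option kind. The constants and helper names are ours and simplified; a real tool such as TBIT fabricates the complete TCP header as well.

```c
#include <stdio.h>
#include <stdint.h>

/* Sketch of the option handling behind the SACK test: the probe puts a
 * SACK-Permitted option (kind 4, length 2) into its SYN and then scans the
 * option bytes of the SYN/ACK for the same kind. Full TCP header parsing
 * is omitted; 'opts'/'optlen' would come from the captured SYN/ACK. */
#define TCPOPT_EOL       0
#define TCPOPT_NOP       1
#define TCPOPT_SACK_PERM 4

/* Option bytes appended to our SYN (padded to a 4-byte boundary). */
static const uint8_t syn_opts[] = { TCPOPT_SACK_PERM, 2, TCPOPT_NOP, TCPOPT_NOP };

/* Return 1 if the server's SYN/ACK carries SACK-Permitted. */
static int server_supports_sack(const uint8_t *opts, int optlen)
{
    int i = 0;
    while (i < optlen) {
        uint8_t kind = opts[i];
        if (kind == TCPOPT_EOL) break;
        if (kind == TCPOPT_NOP) { i++; continue; }
        if (i + 1 >= optlen || opts[i + 1] < 2) break;  /* malformed option */
        if (kind == TCPOPT_SACK_PERM) return 1;
        i += opts[i + 1];
    }
    return 0;
}

int main(void)
{
    /* Example SYN/ACK option block: MSS (kind 2, len 4), then SACK-Permitted. */
    const uint8_t synack_opts[] = { 2, 4, 0x05, 0xb4, TCPOPT_SACK_PERM, 2, 1, 1 };
    (void)syn_opts;
    printf("SACK-enabled server: %s\n",
           server_supports_sack(synack_opts, sizeof synack_opts) ? "yes" : "no");
    return 0;
}
```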

4.2. Response to ECN
Explicit Congestion Notification (ECN) is a mechanism that allows routers to mark TCP packets to indicate congestion, instead of dropping them, whenever possible. While ECN-capable routers are not yet widely deployed, the latest versions of the Linux operating system include full ECN support. Setting up an ECN-enabled TCP connection involves a handshake between the sender and the receiver; this process is described in detail in [13], and here we give only a brief description of the aspects of ECN we are interested in. An ECN-capable client sets the ECN-Echo and CWR (Congestion Window Reduced) flags in the header of the SYN packet; this is called an ECN-setup SYN. If the server is also ECN-capable, it responds by setting the ECN-Echo flag in the SYN/ACK; this is called an ECN-setup SYN/ACK. From that point onwards, all data packets exchanged between the two hosts, except for retransmitted packets, can have the ECN-Capable Transport (ECT) bit set in the IP header. If a router along the path wishes to mark such a packet as an indication of congestion, it does so by setting the Congestion Experienced (CE) bit in the IP header of the packet [2]. TBIT's algorithm for identifying this mechanism is very similar to the one used for SACK. Generally it works as follows:
• It sets up the firewall and BPF as described before, and sends a request to the web server.
• Our software sends a SYN packet with the ECN-Echo and CWR flags set.
• If it receives a SYN/ACK with the ECN-Echo flag set, it assumes that the web server is ECN-enabled.
• It closes the connection.

4.3. Detecting the CWND Size
As mentioned before, it is not possible to identify all TCP congestion control mechanisms by considering only their response to Fast Retransmission. We therefore decided to count the TCP packets of each CWND received by the client:
• It sets up the firewall and BPF as described before.
• The client sends an HTTP request to retrieve the index page of the web server.
• The server starts sending packets and waits for ACKs from the client.
• After the handshake, the client stores all received packets and does not send an ACK for any of them.
• It waits for a fixed time and then sends ACKs for all the packets. Before sending the ACKs, it counts the number of packets in its buffer and records the CWND as the difference between the previous buffer size and the new buffer size.
• After sending the ACK for packet 62, it does not send any more ACKs and ignores all the packets in its buffer.
• Instead, it sends two more ACKs for packet 62 to make the sender enter Fast Retransmission.
• The client then continues this procedure without any further packet drops, counting the CWND size and storing it in a text file.
• After receiving a FIN from the server, it closes the connection.
As is evident from this algorithm, our approach introduces intentional delays in order to receive all packets of a CWND and count them. This delay must not be so long that the server retransmits packets because of a timeout. At the moment we set it to one second, but it can be set to other values according to the network characteristics. The structure of our algorithm is shown in Figure 4.


Figure 4: Our approach for detecting CWND size

As shown in the figure, whenever the server receives an ACK from the client it updates its CWND; this is why we need a delay in order to measure the CWND size. The other important issue is how the ACKs for all buffered packets are sent. Sending all the ACKs back-to-back after the delay expires causes reordering problems at the server and also confuses its RTT estimation: the local network buffer can only emit a limited number of packets at once, so it introduces random gaps between the ACKs, which can in turn cause retransmission timeouts. To solve this problem we added a short fixed delay between two consecutive ACKs, so that the network buffer has enough time to send the ACKs onto the network without reordering or timeouts. The algorithm shown in Figure 4 can measure the CWND size, but following only this procedure is not enough to distinguish TCP mechanisms. When a server starts sending packets to a client, TCP works in Slow Start mode; after the CWND passes a threshold, it enters Congestion Avoidance. Since Slow Start is the same in most TCP versions, we can only distinguish TCP versions by their behavior during Congestion Avoidance. In practice the threshold is initially set to a very large value, and only after the first packet loss do the different TCP versions set their own values for it. Thus a packet loss is required to force the connection into Congestion Avoidance. We decided to drop packet 63 and not to acknowledge any packet received after it; instead we send three duplicate ACKs for packet 62. The CWND sizes before the drop are therefore 2, 4, 8, 16 and 32 packets. According to the Slow Start mechanism the next CWND would be 64 packets, but we do not acknowledge any of the packets sent in that CWND, so the server enters Fast Retransmission and resends the dropped packet.
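A skeleton of this probing loop, in the spirit of our implementation but with the capture and raw-socket parts replaced by stubs, is sketched below. The delay values and helper names are illustrative only; the stubs merely simulate a doubling window so that the sketch compiles and runs.

```c
#include <stdio.h>
#include <unistd.h>

/* Skeleton of the CWND-counting loop: hold back ACKs for a fixed delay,
 * count how many new segments arrived in that interval, record the count,
 * then release the ACKs with a small gap between them so the local buffer
 * does not reorder them. The two helpers are stubs standing in for the
 * BPF capture and raw-socket send used in the real tool. */
#define ROUND_DELAY_US  1000000  /* hold ACKs for one second per round */
#define ACK_GAP_US      2000     /* short fixed gap between consecutive ACKs */

static int collect_segments_for(long usec)   /* stub: pretend the window doubles */
{
    static int cwnd = 2;
    usleep(usec);
    int seen = cwnd;
    cwnd *= 2;
    return seen;
}

static void send_ack_for_segment(int seqno)  /* stub: would fabricate a raw ACK */
{
    (void)seqno;
}

int main(void)
{
    int next_seq = 1;
    for (int round = 0; round < 5; round++) {
        int cwnd = collect_segments_for(ROUND_DELAY_US);   /* hold ACKs, count arrivals */
        printf("round %d: cwnd = %d segments\n", round, cwnd);
        for (int i = 0; i < cwnd; i++) {                    /* release ACKs with a gap */
            send_ack_for_segment(next_seq++);
            usleep(ACK_GAP_US);
        }
    }
    return 0;
}
```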

There are still some minor problems.

4.3.1. Receive Window and Scale Factor
The Receive Window (RWIN) is the amount of data that a computer can accept without acknowledging the sender; in other words, it is the receive buffer size of the computer. If the sender has not received an acknowledgment for the first packet it sent, it will stop and wait, and if this wait exceeds a certain limit it may even retransmit. This is how TCP achieves reliable data transfer [14]. RWIN has a dedicated field in the TCP header and can be set during the handshake. The TCP header reserves 16 bits for RWIN, so it can only express a buffer size of up to 64 KB, which is much smaller than typical buffer sizes today. There is another TCP option, called the Window Scale factor, whose job is to extend RWIN to up to 4 GB. It is carried only in the SYN packet and must be set during the handshake. These two values are closely related to our approach: we have to store a number of packets at the client and send the ACKs only after a while, and during this period those packets are outstanding from the server's point of view. If the number of outstanding packets exceeds RWIN, the server will not send any more packets and the CWND will not change, which means our algorithm would not work and we could not identify the congestion control mechanism. To overcome this problem, these two values must be large enough with respect to the maximum segment size and the expected CWND, so during the handshake we set both parameters to very large values.

V. PRACTICES AND RESULTS
We tested our approach in two settings. The first experiments were done in the laboratory, where everything was under our control and the network traffic was very light. The second set was done in a real environment, over the Internet and against regular web sites, where the traffic was not under our control. In the following subsection we explain the laboratory experiments.

5.1. Lab Experiment
We used two different machines, one as the client and the other as the server. The operating system on both machines was Ubuntu 8.04. The bandwidth of 54 Mbps was more than sufficient in this situation, and no other network flow was running at the time of our experiments, so there was no other traffic on the link. We used a wireless network, which intrinsically has the possibility of packet loss, but we did not consider experiments with unexpected packet loss. Our web server was Apache Tomcat, and we gave it a very large index page, approximately 500 KB, in order to observe the behavior of its congestion control mechanism more clearly. Packet 63 was dropped intentionally, and the Maximum Segment Size was set to 100 bytes during the handshake. The individual experiments are described in the following parts of this paper.


Figure 5: Reno CWND size

5.1.1. Reno Test
In the first test we set the TCP running on the server to Reno and created a diagram of the CWND size over time. Our algorithm only writes the CWND sizes to a file, so we used a Java library to create a diagram from the output file; the result is shown in Figure 5. Let us match the observed behavior to what is described in the Reno specifications. One important point is that this diagram reflects the CWND size from the client's point of view, not from the server side, which is why it is not exactly what one might expect. We now explain it in more detail. The CWND starts at two packets and the connection is in Slow Start mode; after each RTT, one second in this case, the CWND doubles. After the fifth RTT the client has received a CWND of 32 packets. It sends 32 ACKs to the server, and the server increases its CWND to 64 packets, but the client does not acknowledge any of these packets. It only sends three duplicate ACKs for packet 62 in order to make the server enter the Fast Retransmit phase. The server immediately resends the dropped packet, namely packet 63. Since the client sent three ACKs for packet 62, the server assumes the client received packets 64 and 65; but the client discarded all of these packets, so the server's assumption is wrong and it has to go back to Slow Start. This happens in RTTs 6, 7 and 8. The important part is that the server sets its Slow Start threshold to half of its previous server-side CWND of 64, that is, 32 packets. The server goes through Slow Start again until the CWND reaches 32 packets and then increases the CWND by one packet per RTT. It continues this process until the FIN flag in the TCP header indicates that the last packet of the web page has been transferred. The size of the last CWND is smaller than expected because there is not enough data left at that point. The remaining information we can read off this diagram is the CWND sizes and the time it takes to retrieve the whole web page from the server. It is a very good example of Additive Increase Multiplicative Decrease (AIMD): it takes 77 RTTs to receive the FIN flag, and the last full CWND is around 92 packets. We will compare these values with the values obtained for the other TCP mechanisms in the following subsections.

Figure 6: BIC CWND size

5.1.2. BIC Test
We explained the BIC algorithm in section two of this paper and mentioned that it depends on some configurable variables. The test setup for this algorithm was exactly like the Reno test, with the only difference that we used the default values for these configurable variables. The kernel we used was 2.6.24-21, without any changes. The resulting diagram is shown in Figure 6. The Slow Start and Fast Retransmission parts are the same as for Reno; what happens after Fast Retransmission is as follows. The server goes back to Slow Start with its threshold set to 32 packets and then enters Congestion Avoidance with Wmax set to 64, because the last server-side CWND was 64. It uses a binary search to approach Wmax, so it increases the CWND very quickly when it is far from Wmax and reduces the growth as it approaches Wmax. The figure shows that the configurable parameters in our BIC implementation are very conservative around the saturation point: the connection stays a long time at a CWND of around 64 packets. After this long time without any loss, it increases the CWND again and speeds up. It retrieves the whole web page after 57 RTTs, and the last CWND size is around 300 packets. It is easy to see that BIC is faster than Reno and utilizes the network capacity better.

5.1.3. Cubic Test
As explained in section two, Cubic is another TCP congestion control algorithm intended for high-speed networks and is the default mechanism in current Linux kernels. We set the congestion control mechanism of our Linux web server machine to Cubic, used the same machine as in the previous tests, and did not change any TCP parameter of the same kernel. We expected a diagram similar to BIC, but faster. The diagram is shown in Figure 7.

Figure 7: Cubic CWND size

It is clearly shown in this diagram that all of our expectations were satisfied by Cubic. It is very similar to the BIC diagram, but less conservative and clearly faster. It sets the plateau of its cubic function at 64 packets, passes this point after the 15th RTT, and then speeds up its growth function. Since we set our Receive Window to 64 KB, its congestion window cannot go beyond this value: the diagram shows that the last two CWNDs are exactly 640 packets long, which, given the Maximum Segment Size of 100 bytes set in the SYN packet, is exactly what we expected. The remaining observations concern its speed and CWND sizes: it retrieves the whole web page after 27 RTTs, and the last CWND size could have been even more than 640 packets. The algorithm is about two times faster than BIC and around three times faster than Reno.

5.1.4. Windows XP TCP Test
Some implemented TCP congestion control mechanisms are not open source, so we cannot understand their behavior from the source code, and everything has to be inferred from active probing. One of these congestion controls runs on Windows XP, so we tested our approach against a Tomcat server running on Windows XP Service Pack 2. The response of this mechanism to our probing method is shown in Figure 8.

Figure 8: Windows XP CWND size

As shown in the figure, its behavior is very similar to Reno, with some differences. First of all, it does not go back to Slow Start after Fast Retransmission; instead, it continues sending with a CWND of 32 packets after that Fast Retransmission. The second point is that its growth function during Congestion Avoidance is slower than Reno's: even though it looks very similar to Reno, it does not increase its CWND by one packet after every RTT. Finally, its last CWND size is 80 packets and it retrieves the whole web page after 78 RTTs, which is consistent with its growth function being slower than Reno's.

5.1.5. SACK Test
As mentioned in Section 4.1, the SACK test is carried out entirely during the handshake. We tested this part in our lab environment, and the corresponding Linux host supported SACK. Since this part was done completely in the TBIT project and we only integrated it into our implementation to make the package complete, more details can be found in [2].

5.1.6. ECN Test
Just like the SACK test, we borrowed this feature from the TBIT project and tested it against our web server. Our web server apparently did not support ECN and did not respond to the ECN setup in our SYN packet.

5.2. Real Experiments
We tested our approach on some real web servers on the Internet and captured their responses. There are some differences between the lab and the real-world situations. The first one is the traffic: it is not under our control, and unexpected packet drops and delays might occur, which can disturb our tests. Whenever a test produced strange results because of this, we repeated it. The second problem is the size of the web servers' index pages: it is not easy to find an index page of 500 KB, as their sizes are mostly around 3 or 4 KB. So we changed our algorithm and dropped the 7th packet instead of the 63rd; Slow Start then ends when the CWND size is 4 packets from the client's point of view and, obviously, 8 packets on the server side. The third problem is that some servers seem to detect our intentional delay and do not respond to our test request. This is what happened when we tested our software with www.yahoo.com, www.youtube.com and www.google.de: these web sites did not send anything after the handshake. The best and closest server that we could test was www.jacobs-university.de; it is very close to us and we know a lot about it. Figure 9 shows the behavior of this web site in response to our request.


Figure 9: Jacobs University web site CWND size

It is obvious from the diagram that the TCP running on the Jacobs University web site is Cubic; its plateau at 8 packets is very clear, and it delivers the whole web page after 15 RTTs. We tested our approach with some other web servers and sometimes observed really strange behavior. For example, when we sent our request to www.bsag.de, it sent the whole web page in the first congestion window, which means its Initial Congestion Window is set to the whole page size. Some other web servers could not be tested because their home pages were smaller than what we can analyze; when we tried to decrease the Maximum Segment Size in order to obtain enough packets, they could not handle such low values. It seems that they have a lower bound on the Maximum Segment Size and will not send anything below it, e.g. the Jacobs University web site does not accept an MSS of less than 50 bytes. We also tested the SACK option in the real world, and it seems that most web servers support SACK. We tested it with www.google.com, www.jacobs-university.de, www.yahoo.de, exchange.jacobs-university.de and mail.yahoo.com, and they all support SACK; the only surprise was www.yahoo.com, which does not. There are more tests in the TBIT project, described in [2]; the only problem with them is that they were done around nine years ago, and most of these web servers have changed significantly since then. In the same way as for SACK, we tested the ECN approach with real web servers, and none of them supported ECN. More detailed tests are presented in [2]: the TBIT authors tested this approach with more than 20,000 web servers and only 22 of them supported it, which means that ECN is, in practice, an almost unused mechanism on the Internet.

VI. CONCLUSION AND FUTURE WORK

In this paper we introduced a new algorithm for identifying the TCP congestion control mechanism of a remote host using active probing. We explained how to count the Congestion Window size of a remote computer with a simple algorithm, and how to force a remote web server into its Congestion Avoidance mode.

We utilized some features from TBIT in order to capture packets, as well as two complete tests, namely SACK and ECN, in order to have a complete TCP identification package. We showed our results in both lab and real environments. This tool can detect all TCP mechanisms, implemented or yet to be implemented, as long as they are distinguishable by the evolution of their Congestion Window size. One of its drawbacks, however, is that it cannot name the TCP mechanism; it can only record the Congestion Window size over time. As future work, one could try to name the running TCP. This should be possible, but some points have to be taken into account; for example, the BIC and Cubic shapes during Congestion Avoidance are very similar to each other. Another possible extension would be detecting the configurable parameters of a running TCP, which is possible once the CWND sizes and the name of the TCP mechanism are known.

REFERENCES
[1] D. E. Comer and J. C. Lin, "Probing TCP Implementations", in USENIX Summer 1994 Conference.
[2] J. Padhye and S. Floyd, "On Inferring TCP Behavior", in Proceedings of ACM SIGCOMM 2001.
[3] K. Fall and S. Floyd, "Simulation-based Comparisons of Tahoe, Reno and SACK TCP", Computer Communication Review, 26(3), July 1996.
[4] S. McCanne and V. Jacobson, "The BSD Packet Filter: A New Architecture for User-Level Packet Capture", in Proceedings of the Winter USENIX Technical Conference, January 1993.
[5] M. Allman, V. Paxson and W. Stevens, "TCP Congestion Control", RFC 2581, April 1999.
[6] TBIT web site: http://www.icir.org/tbit
[7] S. Floyd and T. Henderson, "The NewReno Modification to TCP's Fast Recovery Algorithm", RFC 2582, April 1999.
[8] J. Postel, "Transmission Control Protocol", RFC 793, September 1981.
[9] V. Jacobson, "Congestion Avoidance and Control", Computer Communication Review, 18(4), August 1988.
[10] I. Rhee and L. Xu, "CUBIC: A New TCP-Friendly High-Speed TCP Variant", in Proceedings of the Third PFLDNet Workshop, France, February 2005.
[11] BIC and CUBIC official home page: http://netsrv.csc.ncsu.edu/twiki/bin/view/Main/BIC
[12] W. Stevens, "TCP Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery Algorithms", RFC 2001, January 1997.
[13] K. Ramakrishnan, S. Floyd and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, September 2001.
[14] www.wikipedia.org