
On Packet Concatenation with QoS support for Wireless Local Area Networks

Dzmitry Kliazovich and Fabrizio Granelli
DIT - University of Trento
Via Sommarive 14, I-38050 Trento (Italy)
E-mail: [klezovic,granelli]@dit.unitn.it

Abstract-Wireless networks are becoming increasingly popular in the world of telecommunications. The IEEE 802.11 standard provides reliable data delivery in wireless LANs. The cost of this reliability is the overhead related to data transmission at the link layer. In this paper, the application of a packet concatenation algorithm at the IP layer is proposed in order to reduce overhead and to increase fairness in medium access at the link layer. The performance of the proposed algorithm is evaluated through simulations as well as real experiments. Results show significant performance enhancements deriving from the use of the proposed packet concatenation algorithm in wireless LANs.

Keywords: Packet Concatenation, WLAN Optimization.

I. INTRODUCTION

Wireless networks are becoming increasingly popular in the world of telecommunications, especially for the provisioning of mobile access to network services. The availability of this new medium has brought new problems, which must be solved in order to optimize the performance of protocols originally developed for wired scenarios.

The goal of this paper is to discuss the performance problems of wireless LANs that arise in the presence of small packets, identifying the major points of possible performance optimization (Section II). An overview of existing approaches is presented in Section III, classified by their position in the ISO/OSI reference model. Section IV presents an approach for packet concatenation at the IP level and describes its advantages and drawbacks. A Quality of Service (QoS) scheme for different packet concatenation methodologies is proposed in Section V. The performance evaluation (Section VI) describes the simulation experiments and test-bed measurements, which show the theoretical and real benefits of the presented scheme. Finally, Section VII concludes the work and mentions directions for future research on the topic.

II. WIRELESS LAN PERFORMANCE ISSUES

Even if the influence of wireless links on packet delivery includes several aspects (limited bandwidth, increased latency, channel losses, mobility and others), in this paper the focus is on the overhead added to the transmitted data by the protocol stack implemented at the sender node.

The most widely used transport layer protocol is the Transmission Control Protocol (TCP) [1]. According to the statistics presented in [2], the majority (over 85%) of Internet traffic is TCP-based. The reliability of TCP is obtained through a positive acknowledgement scheme. Acknowledgements are small packets (40 bytes), so TCP traffic results in large numbers of small packets being transferred over the network.

More generally, the transmission of application data through the wireless channel requires its encapsulation by the lower layers (from transport down to physical) of the protocol stack of the sender node. The link and physical layer headers specified by the IEEE 802.11 standard [3] represent a relevant overhead compared with those of wired networks. Such headers introduce an overhead with respect to the application data payload actually transported by the packet. Since this overhead does not depend on the size of a packet, for small packets it can be several times greater than the application data payload. For an ordinary TCP ACK packet it is shown that “the total overhead is four times the payload!” [4].

The IEEE 802.11 standard [3] together with its extensions (a, b, g) specifies different rates for data transmission, ranging from 1 Mbps to 54 Mbps. However, these rates are the ones achieved at the physical layer of the wireless channel, and the maximum achievable throughput is far lower than the nominal rate (see [5] for a theoretical demonstration in the case of IEEE 802.11b).

Fig. 1 provides an overview of packet encapsulation for an application which employs TCP as the transport layer protocol. Most of the overhead is related to the PLCP preamble, which is used for the synchronization of the wireless receiver. According to the IEEE 802.11 standard, this preamble as well as the PLCP header is always transmitted at 1 Mbps: within the transmission of any data frame over the wireless channel, the PLCP preamble and header take 192 microseconds regardless of the maximum available bitrate on the channel.

0-7803-8938-7/05/$20.00 (C) 2005 IEEE


Fig. 1. Packet encapsulation of TCP segment over IEEE 802.11 [5].

Table I summarizes the analysis of the efficiency of pure TCP throughput under the hypotheses of no collisions, no fragmentation, no RTS/CTS and no bursts. Experiments for the 802.11b extension are presented in [5]. The analysis of maximum achievable TCP throughput is done for the TCP/IP datagram size which corresponds to the most common Maximum Transmission Unit (MTU) of 1500 bytes used in Ethernet LANs.

However, the size of the packets present in networks is far from being fixed at the MTU. More than half of the packets in the Internet are smaller than 100 bytes [2], which means that the relationship between performance and packet size must be taken into account. For that reason, the evaluation results of TCP throughput over an 802.11b wireless network versus packet size are presented in Fig. 2. IEEE 802.11b is chosen for the experiments because it is currently the most widespread extension and is supported by the majority of vendors. However, conceptually similar results hold for the other extensions (802.11a and 802.11g).

TABLE I. THROUGHPUT EFFICIENCY OF 802.11A,B.

Standard   Link speed, Mbps   TCP Throughput, Mbps   Efficiency, %
802.11b    1                  0.75                   74.9
802.11b    2                  1.41                   70.7
802.11b    5.5                3.38                   61.5
802.11b    11                 5.32                   48.4
802.11a    12                 9.2                    76.6
802.11a    24                 16.2                   67.5
802.11a    54                 26.57                  49.2

The analysis of Fig. 2 shows that the performance of 802.11b decreases dramatically as the packet size is reduced (left part of the graph). Thus, for packet sizes of less than 100 bytes the throughput is less than 10% of the available bandwidth.

[Figure 2: throughput (Mbps, 0 to 6) versus TCP/IP datagram size (bytes, 0 to 1500); curves: MAC 802.11b, Theoretical]

Fig. 2. Throughput of IEEE 802.11b versus packet size.
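
The dependence of throughput on packet size can be reproduced with a back-of-the-envelope airtime model. The sketch below is not the analysis of [5]: it assumes a single sender, no collisions, and the 802.11b timing values that the paper lists in Table III (slot 20 us, SIFS 10 us, DIFS 50 us, 192 us for PLCP preamble and header, 11 Mbps data rate, 1 Mbps basic rate). The 36 bytes of MAC/LLC overhead, the 14-byte link-level ACK sent at the basic rate, and the mean backoff of 15.5 slots are assumptions of this sketch rather than values given in the paper, and TCP acknowledgements travelling in the reverse direction are ignored. The function names are illustrative.

```python
# Rough per-frame airtime model for IEEE 802.11b (illustrative sketch only).
SLOT_US = 20.0          # slot time (Table III)
SIFS_US = 10.0          # short interframe space (Table III)
DIFS_US = 50.0          # distributed interframe space (Table III)
PLCP_US = 192.0         # PLCP preamble + header (Table III)
DATA_RATE_MBPS = 11.0   # data rate (Table III)
BASIC_RATE_MBPS = 1.0   # basic rate (Table III)

# Assumptions of this sketch, not stated in the paper:
MAC_OVERHEAD_BYTES = 36    # 24 B MAC header + 4 B FCS + 8 B LLC/SNAP
ACK_BYTES = 14             # link-level ACK frame
MEAN_BACKOFF_SLOTS = 15.5  # mean of a uniform draw from 0..31


def frame_airtime_us(ip_bytes):
    """Airtime in microseconds to deliver one IP datagram and its link-level ACK."""
    data_us = PLCP_US + (ip_bytes + MAC_OVERHEAD_BYTES) * 8 / DATA_RATE_MBPS
    ack_us = PLCP_US + ACK_BYTES * 8 / BASIC_RATE_MBPS
    backoff_us = MEAN_BACKOFF_SLOTS * SLOT_US
    return DIFS_US + backoff_us + data_us + SIFS_US + ack_us


def goodput_mbps(ip_bytes):
    """IP-level goodput in Mbps (bits per microsecond) for back-to-back frames."""
    return ip_bytes * 8 / frame_airtime_us(ip_bytes)


if __name__ == "__main__":
    for size in (40, 100, 500, 1000, 1500):
        share = 100 * goodput_mbps(size) / DATA_RATE_MBPS
        print(f"{size:5d} bytes: {goodput_mbps(size):5.2f} Mbps ({share:4.1f}% of link rate)")
```

Under these assumptions the fixed per-frame cost (interframe spaces, backoff, two PLCP preambles and the ACK) dominates for small datagrams, so a 100-byte datagram stays well below 1 Mbps on an 11 Mbps link, which is in line with the trend of Fig. 2.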

The main idea for performance optimization on the wireless channel is to improve the throughput of the system by enlarging the packet size. The red dotted line in Fig. 2 corresponds to the theoretically achievable value when outgoing packets are equal to the MTU in size.

III. CONCATENATION SCHEMES AND RELATED WORK

The aim of this section is to discuss the possibility of improving wireless LAN performance by means of concatenation of the data outgoing from the protocol stack before its actual transmission on the wireless medium.

Concatenation is the process of linking packets together in order to avoid the performance degradation of the network caused by the transmission of small packets. This principle can be applied at different layers of the protocol stack. Table II presents a possible classification of such schemes.

TABLE II. CLASSIFICATION OF CONCATENATION SOLUTIONS.

Position     Solution
Transport    Nagle algorithm and modifications
IP Layer     Packet Concatenation on IP (PAC-IP)
Link Layer   Packet Frame Grouping (PFG)
Link Layer   PAcket Concatenation (PAC)

A. Transport Layer Solutions

One of the first solutions in this area was introduced by Nagle in 1984 and is now known as the Nagle algorithm [6]. This algorithm aims at reducing the number of small packets generated by various applications (such as Telnet). The Nagle algorithm allows the TCP sender to collect more data from the application instead of immediately sending small packets. The collection is limited by the maximum size of the packet that can be collected, which corresponds to the Maximum Segment Size of the TCP connection, as well as by the time required for the concatenation process.

Nowadays, the Nagle algorithm is a standard requirement for TCP implementations. Further research showed that the Nagle algorithm does not perform well in some scenarios, for example when the Delayed-ACK option of TCP is used [12]. A precise investigation of such situations, as well as a comparison of different modifications of the Nagle algorithm, is presented in [7].

B. Link Layer Solutions

Link layer solutions are aware of the type of medium they are running on.

1) Packet Frame Grouping (PFG). PFG [4] was developed with the aim of improving multimedia performance over wireless LANs. However, PFG can be applied to any type of traffic. The key idea of the approach is to group small frames at the link level in order to share the header overhead within the whole group. Similar to the fragmentation technique specified in the standard [3], packet frame grouping separates outgoing data frames and their link level ACKs by a Short InterFrame Space (SIFS). The main advantages of PFG are: 1) latency in packet delivery is not increased; 2) no additional data copy is required; 3) it is not limited to packets sent to a particular host. An implementation of the approach requires only minor modifications to the link layer protocol.

2) PAcket Concatenation (PAC). PAC, another link layer approach, is described in [8]. The core idea of PAC is to concatenate MAC layer frames into a superframe.
The selection of packets for such concatenation is based on the next hop address, which must be the same for all frames. PAC is able to concatenate up to 9 MAC data frames into a superframe, the delivery of which is acknowledged by a new type of ACK, which supports selective acknowledgement of the subframes. PAC leaves latency unchanged and achieves better overhead reduction than the PFG approach. However, it has the following disadvantages: 1) it requires an additional data copy at the link level in order to concatenate frames for transmission; 2) it is limited to packets destined to a particular host (next hop); 3) it requires modifications of the MAC implementation, such as a Frame Scheduler (which requires computational resources) and the introduction of a new packet type for selective acknowledgement.

Summarizing, link layer solutions are designed for finer optimization, achieved by a concatenation scheme which is aware of the wireless medium characteristics. However, most of them modify the standardized link layer protocol. Such modifications require a large effort from the research community for standardization as well as from industry for the modification of the firmware of wireless devices. This is the main reason why such approaches are not implemented at the moment.

Nevertheless, solutions designed specifically for wireless links may mitigate the drawbacks of more universal higher layer approaches. For example, in most cases they do not introduce an additional delay in single packet delivery, since the concatenation process can be applied to packets which are already waiting in the transmission queue for the communication medium to become idle.

C. IP Layer Solutions

The previous two subsections reviewed existing concatenation approaches at the Transport and Link layers, respectively. This paper discusses the possibility of packet concatenation implemented at the IP layer. Theoretical and implementation details are presented in the following section.

IV. IP LEVEL PACKET CONCATENATION (PAC-IP)

Packet concatenation can be implemented at the IP layer in order to improve the performance of the wireless network.
The main idea is to concatenate IP packets (IP header + IP payload) into a single object (called a concatenated collection), which is treated as ordinary payload at the link layer. This concept is illustrated in Fig. 3. The concatenated collection is forwarded to the link layer for transmission on the wireless channel. As a result, the Link Layer (LL) and Physical (PHY) layer headers are added only once for the entire collection.

[Figure 3: at the IP layer, several IP packets (IP header, TCP header, application data), each of which would otherwise be sent with its own PHY header, LL header and FSC; at the link layer, a single frame (PHY header, LL header, payload, FSC) whose payload holds the collected IP packets, up to the Maximum Collection Size (MCS)]

Fig. 3. IP packets collection into a single LL packet.

At the receiver side, a concatenated collection can be easily separated into the original IP packets by using the collection size stored in the MAC header and the size of each IP packet (a field in the standard IP header). PAC-IP does not change standardized headers (neither at the link level nor at the IP level).

In order to implement the algorithm, a software module should be inserted into the protocol stack directly below IP and above the link layer, as shown in Fig. 4. It concatenates the incoming IP packets into a single outgoing packet collection. Concatenation requires a data copy: the data buffer of the first incoming packet is extended to the maximum collection size, and each subsequent incoming IP packet is appended to the collection by copying its data. This technique is well suited to the sk_buff representation of packets within the open protocol stack model of the Linux operating system.

The Maximum Collection Size (MCS) is limited to the MTU of the network interface (1500 bytes) in order to avoid further fragmentation by the link layer. From the point of view of overhead reduction, the transmission of bigger packets yields better performance. However, wireless channels suffer from problems such as collisions, hidden nodes, signal interference and so on. Experimental studies show that a trade-off can be achieved between the reduction of physical overhead obtained by enlarging the packet and the frame error rate [9]. As a result, the MCS can be dynamically adjusted to the optimal value by the algorithm proposed in [9].

The concatenation module is extended with an internal timer which specifies the Maximum Concatenation Time (MCT) for a single collection. If the timeout occurs, the concatenation process must be finished immediately and the collected packet must be forwarded for transmission, even if the MCS is not reached.

Similar to the Link Layer solutions, PAC-IP groups packets destined for the same destination, considering only the next hop in the data path.
There is a variety of application scenarios for wireless networks; the most widespread nowadays are: (1) wireless-cum-wired (where the wireless hop is the last hop of the network, between the base station and the wireless node); (2) multi-hop (where the route of the packet goes through several wireless links). In both cases grouping packets by their next destination address (next hop) provides relevant advantages, making the concatenation useful not only for the source node but also for other nodes where traffic is concentrated, e.g. when the base station delivers packets from different sources in the wired network to the same wireless node.

Moreover, the IEEE 802.11 MAC, through the specification of the medium contention algorithms, introduces packet-based fairness among the nodes of the system: each node has an equal opportunity to access the medium for a single packet transmission. However, the packet size is not taken into account. PAC-IP improves fairness among the nodes by providing equal opportunities in terms of the amount of data transmitted.
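
The concatenation and separation steps of PAC-IP described in this section can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes IPv4 packets held as byte strings, reads packet boundaries from the standard Total Length header field, keeps one collection per next hop, and takes timestamps from the caller instead of running a real timer. The names `Collector`, `split_collection` and `make_ip_packet`, the 5 ms default MCT, and the use of plain strings as next hop identifiers are all hypothetical.

```python
import struct

MCS = 1500  # Maximum Collection Size in bytes (the interface MTU, as in the text)


def make_ip_packet(payload_len):
    """Build a dummy IPv4 packet: 20-byte header with Total Length set, zero payload."""
    header = bytearray(20)
    header[0] = 0x45                                     # version 4, header length 5 words
    struct.pack_into("!H", header, 2, 20 + payload_len)  # Total Length field (bytes 2-3)
    return bytes(header) + bytes(payload_len)


def split_collection(collection):
    """Receiver side: recover the original IP packets from a concatenated collection."""
    packets, offset = [], 0
    while offset < len(collection):
        length = struct.unpack_from("!H", collection, offset + 2)[0]
        if length < 20 or offset + length > len(collection):
            raise ValueError("malformed collection")
        packets.append(bytes(collection[offset:offset + length]))
        offset += length
    return packets


class Collector:
    """Sender side: one collection per next hop, bounded by MCS and MCT."""

    def __init__(self, mcs=MCS, mct=0.005):
        self.mcs = mcs
        self.mct = mct      # Maximum Concatenation Time in seconds (assumed value)
        self.buffers = {}   # next_hop -> (bytearray, time the collection was started)

    def add(self, next_hop, packet, now):
        """Append a packet; return the collections that are ready for the link layer."""
        ready = []
        entry = self.buffers.get(next_hop)
        if entry is not None and len(entry[0]) + len(packet) > self.mcs:
            ready.append(bytes(entry[0]))   # packet does not fit: send what we have
            del self.buffers[next_hop]
            entry = None
        if entry is None:
            entry = (bytearray(), now)
            self.buffers[next_hop] = entry
        entry[0].extend(packet)             # the data copy mentioned in the text
        if len(entry[0]) >= self.mcs:
            ready.append(bytes(entry[0]))   # collection is full
            del self.buffers[next_hop]
        return ready

    def poll(self, now):
        """Flush every collection whose MCT timer has expired."""
        ready = []
        for hop in list(self.buffers):
            buf, started = self.buffers[hop]
            if now - started >= self.mct:
                ready.append(bytes(buf))
                del self.buffers[hop]
        return ready
```

In this sketch a driver would call `add` for every outgoing IP packet and `poll` periodically; for example, three 40-byte packets added for the same next hop leave `poll` as a single 120-byte collection once the MCT has elapsed, and `split_collection` returns the three original packets.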


[Figure 4 labels: IP layer; PAC-IP concatenation with per-destination collections Dest 1, Dest 2, ..., Dest N; Interface Queue; Link Layer]

Fig. 4. PAC-IP software module structure.

[Figure 5 labels: QoS Classifier; Packet Scheduler; PC 0, PC 1, PC 2, ..., PC N; concatenation modules CM 0, CM 1, CM 2, ..., CM N with timers MCT[0], MCT[1], MCT[2], ..., MCT[N]; Channel Estimator; output to the next layer]

Fig. 5. QoS-enabling module for Packet Concatenation schemes.

The main drawback of the described approach is the delay added to packet delivery by the concatenation procedure, which can be relevant in some cases.

V. QUALITY OF SERVICE (QOS) EXTENSION

The core idea of all methods for data concatenation is the same: to concatenate the data (byte-stream or packets) into units of a fixed size. This is exactly the opposite of the fragmentation idea implemented in the majority of network interfaces, which partition (fragment) the whole data unit into several packets of fixed size.

Most concatenation techniques introduce additional delay in packet delivery, namely the time required for the packet concatenation procedure. This means that, when the Maximum Concatenation Time (MCT) timeout expires, the first packet of the collection has its delivery delay increased by the MCT value, while the other packets of the collection have an additional delay between 0 (if a packet is added just before the timeout occurs) and the MCT value. Such a situation may not be suitable for applications which require delay guarantees.

In order to satisfy delay requirements, we propose to introduce Quality of Service (QoS) support in concatenation algorithms. The differentiation among traffic classes according to their delay requirements can be introduced in concatenation algorithms through an additional module, as depicted in Fig. 5.

The main idea for differentiation is to manage not just one concatenation process but several of them with different MCTs. The values of MCT[n] are allowed to vary from 0 to the maximum reasonably possible MCT (MCTmax).

The purpose of the QoS classifier is to classify each packet and then to forward it to the appropriate Concatenation Module (CM). Depending on the level of the protocol stack where the concatenation scheme is implemented, the QoS classifier can provide classification as well as specification of QoS requirements. As an example, in the QoS-enabled version of PAC-IP, an additional purpose of the QoS classifier is to specify a “non-concatenate” class for those TCP packets which can be considered control packets.
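
The classification step and the delay bound discussed above can be illustrated with a short sketch. Everything specific in it is an assumption made for illustration: the paper does not give MCT values, does not say which TCP packets count as control packets (SYN, FIN and RST are assumed here), and does not define how delay requirements are expressed (a hypothetical per-packet `delay_budget_ms` is used). The rule of picking the largest MCT that fits the delay budget is one possible policy, not the one prescribed by the paper.

```python
from dataclasses import dataclass

# Hypothetical per-class Maximum Concatenation Times in milliseconds.
# MCT_MS[0] = 0 is the "non-concatenate" class; the last entry plays the role of MCTmax.
MCT_MS = [0.0, 2.0, 5.0, 10.0]

# Assumed interpretation of "TCP control packets".
TCP_CONTROL_FLAGS = {"SYN", "FIN", "RST"}


@dataclass
class Packet:
    size: int
    is_tcp: bool = False
    tcp_flags: frozenset = frozenset()
    delay_budget_ms: float = float("inf")  # hypothetical QoS requirement


def classify(pkt):
    """Return the index n of the concatenation module CM n that should handle pkt."""
    if pkt.is_tcp and pkt.tcp_flags & TCP_CONTROL_FLAGS:
        return 0  # control packets are never concatenated
    best = 0
    for n, mct in enumerate(MCT_MS):
        if mct <= pkt.delay_budget_ms:
            best = n  # largest MCT that still respects the delay budget
    return best


def added_delay_ms(arrival_ms, collection_start_ms, mct_ms):
    """Delay added to a packet when its collection is flushed by the MCT timeout."""
    flush_ms = collection_start_ms + mct_ms
    return flush_ms - arrival_ms
```

With these values, a packet that opens a collection in the 5 ms class waits the full 5 ms if the timeout fires, while a packet arriving just before the timeout waits almost nothing, which matches the 0 to MCT range stated in the text.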

Another important part of the QoS-enabling module is the Channel Estimator, whose main purpose is to estimate the packet transmission time and to provide the estimation results to the CMs. In the simplest implementation, the estimation is a direct calculation of the transmission time from the available bitrate and the size of the collected packet (including link and physical layer overhead).

Each Concatenation Module (CM) within the model is designed to perform packet concatenation with a chosen MCT. MCT[0] is a special case designed for the “non-concatenate” traffic class, for which the concatenation time is equal to 0. This means that all the incoming packets are forwarded to the next layer without any concatenation with other packets.

However, the specified values of MCT can be dynamically adjusted by the Channel Estimator, which should always know which packet is currently being transmitted on the medium by the node as well as the estimated time at which that transmission will end. Relying on this information, the Channel Estimator can temporarily increase the default MCT of a particular module in order to extend the concatenation process while the medium is busy.

VI. PERFORMANCE EVALUATION

The performance of the proposed solution is analyzed by simulations with the ns-2 network simulator [10] as well as by measurements in real IEEE 802.11 networks.

A. Simulation results

Results are obtained in a grid topology where two static nodes are linked through a single TCP connection. One of them continuously sends data, while the other one only replies with TCP acknowledgements. IEEE 802.11b is chosen as the reference physical standard. The simulation parameters are summarized in Table III.

The throughput of the TCP connection is chosen as the main parameter for the performance analysis. Drawing on the considerations presented in the introduction of the paper, we decided to compare the proposed protocol with the standard TCP scheme presented in [1].

Fig. 6 presents the throughput comparison of PAC-IP against the IEEE 802.11 standard. The maximum collection size is set to 1500 bytes (the most common MTU in Ethernet), which leads to the concatenation of packets of less than 750 bytes, since at least two packets must fit into one collection. In simulations, the TCP source is in saturation state, i.e. it always has a packet to send.

TABLE III. SIMULATION PARAMETERS

Parameter Name             Value
Slot                       20 us
SIFS                       10 us
DIFS                       50 us
PLCP preamble + header     192 us
Data Rate                  11 Mbps
Basic Data Rate            1 Mbps
Propagation Model          two-ray ground

[Figure 6: throughput (Mbps, 0 to 7) versus TCP/IP datagram size (bytes, 0 to 1500); curves: MAC 802.11, PAC-IP simulations, PAC-IP on testbed]

Fig. 6. PAC-IP throughput (simulation and experimental) vs IEEE 802.11.

When packet concatenation is employed (left half of the graph), the resulting throughput is close to the throughput achieved by IEEE 802.11 when 1500-byte packets are transmitted. The only assumption made is that the TCP source can output enough data packets to fill the collection within the concatenation process.

B. Real-network simulations

The results achieved by the simulation show good agreement with the design principles of PAC-IP. However, in order to further investigate the behavior of the algorithm in real scenarios, a simple IEEE 802.11b testbed was set up, consisting of two computers: a fixed workstation and a mobile laptop. Both computers are equipped with wireless 802.11b Orinoco Silver cards. In order to support PAC-IP functionality, the Orinoco_cs wireless driver was modified. Results regarding TCP throughput are obtained with the Iperf (version 1.7.0) [11] performance measurement tool.

TCP throughput for the testbed experiments is presented in Fig. 6 by the dash-dotted line. There is a relevant difference with respect to the results obtained from the simulation: the TCP flow with small data packet sizes (left part of the curve) achieves less throughput. The reason is that, in the simulation, it was assumed that TCP could always fill the collection within the concatenation time, while practical implementations of TCP can output only a given number of packets (depending on the window size) and then wait for an acknowledgement from the receiver. However, even the reduced size of the collected packet brings a significant throughput improvement compared with IEEE 802.11. The single-flow environment is chosen for the experiments to underline this difference. By contrast, in a multi-flow environment the resulting curve better approximates the one obtained by the simulations.

Furthermore, the implementation of PAC-IP requires a data copy of the incoming packets for their concatenation. In order to evaluate the influence of the data copy process on the delivery time, all the packets (even those with a size greater than half of the MCS) are copied before releasing them to the link layer. As a result, the time required for the data copy process reduces the throughput in the case of large packets (right part of the graph) of a single TCP data flow. To avoid the delay introduced by data copy (releasing the resources of the host), PAC-IP could be implemented inside the firmware of the wireless device using techniques such as DMA.

VII. CONCLUSIONS

The paper highlights the problems related to the performance of wireless LANs due to small packets as well as to the overhead introduced at the link and physical layers by IEEE 802.11. After a classification of existing approaches, packet concatenation at the IP level (PAC-IP) is proposed as a possible solution designed to improve throughput, especially in the case of transmission of small packets over the wireless link.

The evaluation of PAC-IP is performed both through simulations and on an experimental IEEE 802.11b-enabled test-bed. The results show good agreement with the design aspects.

As a second contribution, QoS support based on delay differentiation is proposed for different concatenation schemes.

Ongoing activities deal with performance evaluation of the implementation of the QoS-enabling module within PAC-IP, through simulation and experiments using the test-bed presented in the paper.

REFERENCES

[1] J. Postel, Transmission Control Protocol, RFC 793, September 1981.
[2] G. Miller, K. Thompson and R. Wilder, “Wide-area Internet traffic patterns and characteristics”, IEEE Network, pp. 10-23, Nov./Dec. 1997.
[3] Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, IEEE 802.11 standard, 1997.
[4] J. Tourrilhes, “Packet Frame Grouping: Improving IP multimedia performance over CSMA/CA”, ICUPC, 1998.
[5] The Norwegian academic and research data network, http://www.uninett.no/wlan/throughput.html
[6] J. Nagle, “Congestion Control in IP/TCP Internetworks”, RFC 896, 1984.
[7] J. Mogul and G. Minshall, “Rethinking the TCP Nagle Algorithm”, Computer Communication Review, pp. 6-20, January 2001.
[8] K. Yeung, “802.11a Modeling and MAC Enhancements for High Speed Rate Adaptive Networks”, Technical Report UCLA, 2002.
[9] P. Lettieri and M. Srivastava, “Adaptive Frame Length Control for Improving Wireless Link Throughput, Range, and Energy Efficiency”, INFOCOM, Vol. 2, pp. 564-571, March 1998.
[10] NS-2 simulator tool home page, http://www.isi.edu/nsnam/ns/, 2000.
[11] Iperf performance measurement tool, http://dast.nlanr.net/Projects/Iperf
[12] D. Clark, “Window and acknowledgement strategy in TCP”, RFC 813, Internet Engineering Task Force, July 1982.
