Decision Support Systems 46 (2009) 594–603


An admission-control technique for delay reduction in proxy caching

Cuneyd C. Kaya a, Guoying Zhang b, Yong Tan c,⁎, Vijay S. Mookerjee a

a School of Management, University of Texas at Dallas, Richardson, TX 75080, United States
b Dillard College of Business Administration, Midwestern State University, Wichita Falls, TX 76308-2099, United States
c Michael G. Foster School of Business, University of Washington, Seattle, WA 98195-3200, United States

⁎ Corresponding author.
E-mail addresses: [email protected] (C.C. Kaya), [email protected] (G. Zhang), [email protected] (Y. Tan), [email protected] (V.S. Mookerjee).
doi:10.1016/j.dss.2008.10.004

Article history:
Received 5 April 2007
Received in revised form 17 October 2008
Accepted 27 October 2008
Available online 3 November 2008

Keywords:
Cache management
Proxy caching
Admission control
Screening
Delay reduction
Download traffic

Abstract

We evaluate an admission-control (screening) policy for proxy server caching that augments the LRU (Least Recently Used) algorithm. Our results are useful for operating a proxy server deployed by an Internet Service Provider or for an enterprise (forward) proxy server through which employees browse the Internet. The admission-control policy classifies documents as cacheable and non-cacheable based on loading times and then uses LRU to operate the cache. The mathematical analysis of the admission control approach is particularly challenging because it considers the dynamics of the caching policy (LRU) operating at the proxy server. Our results show a substantial reduction (around 50% in our numerical simulations) in user delay. The improvement can be even larger at high levels of proxy server capacity or when the user demand patterns are more random. An approximation technique provides near optimal results for large problem sizes, demonstrating that our approach can be used in real-world situations. We also show that the traffic downloaded by the proxy server does not change much (as compared to LRU) as a result of screening. A detailed simulation study on LRU and other caching algorithms validates the theoretical results and provides additional insights. Furthermore, we provide ways to estimate policy parameter values using real-world trace data. © 2008 Elsevier B.V. All rights reserved.

1. Introduction

The last few years have witnessed a tremendous growth in the amount of information available on the World Wide Web. In addition to personal computer access, there is also an increase in the variety of devices (such as mobile phones, Personal Digital Assistants, etc.) being used to retrieve information from the World Wide Web. As the proliferation of the Internet continues, it is important that users continue to get acceptable performance while using the Web. However, although network and server capacities have dramatically increased over the past several years, the emergence of bandwidth-hungry applications such as video and audio on demand and distributed games has made the demand for performance even greater. Previous studies show that the estimated monetary loss to e-commerce firms from slow response times exceeded $4 billion in 2001 [42], and in the 2000 Christmas shopping season alone, e-tailers' losses were estimated at $14 billion [17]. The amount of time it takes to view a web page has been shown to be a significant factor in determining the success of a site as well as the satisfaction of its users. Reducing delays can have substantial economic impacts.


For example, decreasing the loading time of an electronic retailer's home page by one second reduced the rate of visitors abandoning the page from 30% to 8% [42]. Web caching (among other technologies) addresses issues of capacity and performance and has become an integral part of the Web infrastructure [20,31]. When a user requests a document, it may have to be fetched from a remote server and loaded on the user's computer. While the document is being fetched, the user often waits without doing anything productive. Most users do not like to wait long for a document, and long response times often lead to user frustration and eventual termination of use [24,40]. Web caching can take place at different locations, e.g., at the user's browser, at a proxy server, etc. [27]. Browser caching is carried out by commercial software with built-in caching capabilities, while proxy caching is done by specialized caching algorithms run at proxy servers implemented at the Points of Presence (POP) of an Internet Service Provider (ISP) or at the edge of an enterprise network. This study deals with improving the caching performance of a proxy server placed at the edge of an (ISP or enterprise) network.

Proxy caching has two main purposes. First, it reduces user delay by caching popular documents at the proxy server. Second, proxy caching can reduce the traffic generated by the proxy server when downloading from documents' origin servers (or simply the download traffic by the proxy server, Dp) and lower the associated bandwidth costs. Proxy servers have non-caching benefits as well; e.g., they prevent users in enterprise networks from accessing sites, such as online games and


entertainment websites, that are unrelated to the business and would reduce productivity [31].

Fig. 1 shows the basic setting for our model. An ISP (or an enterprise network) implements a proxy server that keeps an up-to-date copy of popular web documents in its cache. When a user requests a cached document, the request can be served directly from the proxy cache. Otherwise, the proxy server downloads the document from the origin server using an outgoing communication link.

Fig. 1. Model setting.

The goal of this paper is to improve proxy caching performance (in terms of reducing end user delay). A secondary measure of performance is the amount of download traffic (Dp) by the proxy server. This measure is important because the amount of download traffic can affect the cost of operating the outgoing link. Our experiments show that the admission control (screening) technique proposed here can substantially reduce end user delay while holding the download traffic (Dp) at about the same level.

Most existing research on proxy caching uses the LRU policy;1 for example, the popular Squid proxy cache software uses a minor variation of LRU [15,30]. One limitation of the LRU algorithm is that it ignores the loading time (the time taken to fetch the document from the origin server) of documents. Intuitively speaking, the benefit from caching is greatest when a document with a high loading time is requested and can be supplied from the cache. Our approach improves LRU performance by screening out documents with relatively low loading times (i.e., deeming these documents non-cacheable).

1 The LRU (Least Recently Used) policy is a replacement policy based on the recency of a document's last access (or use). If needed, the least recently used document, among all the documents in the cache, is purged to make room for a newly used document.

1.1. Contributions of the study

First, we provide a precise way to screen documents. This is done by deriving an exact mathematical expression for performance (average delay per request) under screening. Using this delay expression, we find the optimal extent of screening so as to minimize delay. Second, we find an approximation for the delay expression and derive a closed form expression for the optimal (with respect to the approximation) extent of screening. The approximation technique is particularly useful when there are a large number of documents that can potentially be accessed by the user. Third, we conduct a variety of numerical experiments that show the significant improvement over basic LRU that can be achieved by screening, without a significant change in the download traffic (Dp). We also study the benefit of screening for other proxy caching algorithms such as the Latency estimation algorithm (LAT) [40] and Greedy Dual Size (GD-Size) [8]. Finally, we conduct a simulation study as well as an experiment with real web trace data to validate the theoretical results obtained in the paper.

1.2. Summary of results

The main result of this paper is that screening can offer substantial benefits (in terms of delay) over the basic LRU policy. These benefits are especially high if the proxy server's cache capacity is high, or if the user demand for documents is more random (unless the capacity is low). Another useful result is that the optimal screening threshold can be approximated extremely accurately by a simple formula using parameters of the problem that are typically easy to obtain. The approximation formula is particularly useful when there are thousands or even hundreds of thousands of documents that can potentially be requested by users. We also demonstrate that the delay-reducing benefits of screening do not come at the cost of additional download traffic on the outgoing link. Similar improvements are observed for other caching algorithms, such as LAT and GD-Size.

The paper is organized as follows. In Section 2 we review related literature on Web caching. In Section 3, we present and analyze the screening policy; a series of numerical experiments are conducted to study its properties. Section 4 describes a simulation study to validate the theoretical results. In Section 5, we propose methods to estimate parameters using real-world web trace data. Section 6 summarizes and concludes the paper.

2. Related work

The literature on Web caching is vast. Barish and Obraczke [6] discuss several caching architectures and deployment options. Caches can be deployed near the consumer (in the case of browser caching or proxy caching), near the content provider (in the case of web server caching), or at a point in the middle (in the case of proxy caching), depending on the network topology and conditions. In our study, the proxy server is placed at the edge of the network (ISP or enterprise network) that the user belongs to. Podlipnig and Böszörményi [30] provide a detailed survey of caching algorithms for the World Wide Web. Datta et al. [14] identify the problems on the web that cause delays and review various caching strategies to ease them. Scaling issues have also attracted attention. A scalable website is one that can serve enough requests even under high workloads. Menascé [25] describes a caching solution on the web server itself that provides some level of scaling. Challenger et al. [9] provide a scalable system for caching dynamic web data. Fagni et al. [16] propose a caching strategy that extracts from historical usage data the results of the most frequently submitted web queries and stores them in a static, read-only portion of the cache; the remaining entries of the cache are dynamically managed according to a given replacement policy. More recently, Chiang et al. [12] propose a periodic cache replacement policy to handle increasingly dynamic content on the Web. The heuristic approach in their study addresses the decision problems of how frequently a cache should be replaced and which dynamic fragments should be selected for the cache.

Our focus is on the Least Recently Used (LRU) algorithm, its variations, and its limitations. The LRU algorithm (and its variants) is one of the most popular methods used in proxy caching [4]. Mookerjee and Tan [27] derive mathematical expressions to estimate expected latency. Jelenkovic and Radovanovic [18] provide an explicit average-case analysis of LRU caching with statistically dependent request sequences; the surprising insensitivity of LRU caching performance demonstrates its robustness to changes in document popularity.
LRU's variants mainly consist of paying attention to document size, in addition to the recency-based priority given to a document by the


basic LRU policy. Dilley et al. [15] improve the replacement policy in Squid (a prominent proxy cache implementation) with the LFU-DA (Least Frequently Used with Dynamic Aging) and GDS-Hits (Greedy Dual Size-Hits) algorithms. The LFU-DA and LFU-Age policies introduce an aging factor into the basic LFU policy in order to reduce cache pollution [2]. Abrams et al. [1] enhance LRU by introducing the LRU-Threshold and LRU-MIN policies. LRU-Threshold is a size- and recency-based strategy that only caches documents no larger than a threshold size; within the cache, replacements occur based on the LRU policy. LRU-MIN tries to minimize the number of documents replaced by replacing larger documents before smaller ones. Murta et al. [28] divide the cache into several partitions with different sizes, and each partition is operated based on LRU. Documents of similar size are placed in the same partition. This partitioning strategy reduces size heterogeneity, and hence documents of very different sizes are not likely to compete with one another for space in the cache. Another improvement on LRU is Segmented LRU [3]. Here the cache is divided into two segments: an unprotected segment and a protected segment. Both segments use the basic LRU policy for replacement. Once a document in the cache is hit, it is moved to the protected segment. A document cannot be removed from the cache if it is in the protected segment; however, it may be moved to the unprotected segment and evicted from the cache from there, if necessary. Other improvements to LRU include using a usage frequency count to prioritize documents, for example, Generational Replacement [29] and LRU⁎ [10]; a document popularity metric, such as LRU-Hot [26]; and size-based priority, as in LRU-SP [11]. Kumar and Norris [22] propose a proxy caching mechanism that takes into consideration the aggregate user request pattern; both the historical request pattern and the current dynamic request pattern are incorporated in their optimization model. Kumar et al. [21] implement a prototype of an online analytical processing system for mobile devices; the prototype utilizes multi-layered caching techniques to improve performance. Reddy and Fletcher [32] improve the existing algorithms with an adaptive technique that uses document life histories to optimize cache performance. Using the age notion, they also suggest that LRU is capable of estimating the future demand on a document, although this may not be optimal. Juurlink [19] presents a replacement policy that predicts the time each page will be referenced again and evicts the page that has the largest predicted time of next reference.

Usually, client browsers cache some content in their default settings. This causes the same document to be cached by both the browser and the proxy server. Tan et al. [37] study the duplication effects on browser and proxy caching. They provide an exact expression for the optimal level of duplication between a set of browsers and a proxy server. The results show that the level of duplication should be controlled when the content being cached is volatile.

There is a limited amount of recent work on introducing document delay into the operation of the basic LRU policy. Most of this work is based on simulation and modifies the basic LRU policy. For example, Watson et al. [39] present an empirically derived model of Web user-access activity, which can be used to conduct model-driven simulation studies of cache performance. Scheuermann et al.
[34] provide a delay-conscious caching algorithm in which a profit metric is used to calculate the benefit of retrieving the document from the cache rather than from the original server. Shin et al. [36] introduce LRU-SLFR (LRU-based small latency first replacement), which combines the LRU policy with real network latency and access counts to achieve the best overall performance: it maintains the linked list of the LRU policy but groups documents, replacing small-latency documents first. Wooster and Abrams [40] explore a proxy caching algorithm that estimates the download time for a document based on a history of download times. The algorithm (LAT) chooses the document with the smallest download time for replacement, i.e., document popularity is not considered. Cao and Irani [8] provide improved performance with the GreedyDual-Size algorithm, where the size of the document is also considered in document purging decisions from the proxy caches.
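To illustrate the LAT idea just described (replacement driven by estimated download time rather than recency), here is a minimal sketch. The class name, the smoothing scheme, and the method signatures are our own illustration, not the algorithm of [40]:

```python
class LATCache:
    """Minimal sketch of a latency-estimation (LAT-style) cache: keep a smoothed
    estimate of each document's download time and evict the document with the
    smallest estimate; document popularity is deliberately ignored."""

    def __init__(self, capacity, smoothing=0.5):
        self.capacity = capacity      # number of documents the cache can hold
        self.smoothing = smoothing    # weight given to the newest measurement
        self.estimate = {}            # doc_id -> estimated download time
        self.cache = set()

    def record_download(self, doc_id, measured_time):
        # Update the download-time history with an exponential moving average.
        old = self.estimate.get(doc_id, measured_time)
        self.estimate[doc_id] = self.smoothing * measured_time + (1 - self.smoothing) * old

    def admit(self, doc_id):
        # record_download() is assumed to have been called for every fetched document.
        if doc_id in self.cache:
            return
        if len(self.cache) >= self.capacity:
            # Replace the document that is cheapest to fetch again.
            victim = min(self.cache, key=self.estimate.__getitem__)
            self.cache.remove(victim)
        self.cache.add(doc_id)
```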

Table 1
Consideration of loading time in caching

         LRU   LAT   GD-Size
Entry    No    No    No
Exit     No    Yes   Yes

The GreedyDual-Size algorithm is an extension of the DualSize algorithm proposed by [41], using Cost/Size as the parameter used in caching. Here, cost is defined as the cost of bringing the document to the server, and size is the size of the document. The cost can be defined depending on the goal of the algorithm; thus, the cost corresponds to the loading time of the document in our study. We test the screening technique over the LAT and the GD-Size algorithms in this paper, since both algorithms consider document loading times when purging documents from the cache.

Although not based on LRU, a screening algorithm for web caching to reduce disk usage on proxy servers has been studied. It has been suggested that, without screening, most caching algorithms maximize hit ratio; however, maximizing hit ratio does not necessarily maximize cache performance. Caching an object requires a disk access. Frequent disk access causes the proxy server to be overloaded, which in turn results in lower proxy performance [23]. Rizzo and Vicisano [33] provide a replacement policy (LRV) that is based on the relative value of each document. The document with the least relative value is purged from the cache. The relative value of the document is calculated using the document's age, number of previous accesses, and size. Bose and Cheng [7] build a queuing model to understand how various factors such as hit rate, arrival rate, and file sizes affect proxy server performance.

Table 1 presents three popular proxy caching algorithms in terms of whether they consider loading time in entry decisions (deciding whether a document should be admitted into the cache) and/or exit decisions (deciding which cached document to remove from the cache). As can be seen, none of the popular caching algorithms considers loading time for entry decisions, but both LAT and GD-Size consider loading time for exit decisions.2 Since the screening technique proposed here affects the entry decision, the technique is potentially useful for all three algorithms. One advantage of our screening approach is that it does not alter the existing policy implemented in the proxy server: the screening is done outside, and the caching technique remains the existing policy, except that it operates on a screened set of documents. This allows the vast number of LRU implementations to remain intact; only a front-end step is added to the existing implementation. In the next section we describe the screening policy and derive a delay expression for it.

3. Analysis of screening policy

The screening technique is implemented as follows. After a document is fetched from the origin server, its loading time (denoted by z) is estimated. If the loading time is below a pre-determined threshold z̄, i.e., z < z̄, the document is not cached; otherwise the document is cached, and its tenure in the cache is governed by the LRU policy. Note that if z̄ = 0, the proposed screening policy reduces to the basic LRU policy. In the following, the screening threshold z̄ is determined so as to minimize the expected end user delay.
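The screening front-end composes naturally with an unmodified LRU cache. A minimal sketch of this composition (our own illustration; `fetch_from_origin` is a hypothetical callback returning a document's content, size, and measured loading time):

```python
from collections import OrderedDict

class ScreenedLRUCache:
    """LRU caching with loading-time admission control: a document whose loading
    time z falls below the threshold z_bar is never admitted (z_bar = 0 gives plain LRU)."""

    def __init__(self, capacity, z_bar):
        self.capacity = capacity       # total size the cache can hold
        self.z_bar = z_bar             # screening threshold on loading time
        self.cache = OrderedDict()     # doc_id -> (content, size), in LRU order
        self.used = 0

    def request(self, doc_id, fetch_from_origin):
        if doc_id in self.cache:
            self.cache.move_to_end(doc_id)            # hit: refresh recency, no delay
            return self.cache[doc_id][0], 0.0
        content, size, z = fetch_from_origin(doc_id)  # miss: the user waits z
        if z >= self.z_bar:                           # entry decision: screen fast documents out
            self.cache[doc_id] = (content, size)
            self.used += size
            while self.used > self.capacity:          # exit decision: unchanged LRU eviction
                _, (_, evicted_size) = self.cache.popitem(last=False)
                self.used -= evicted_size
        return content, z
```

Only the entry decision changes; the eviction path is the untouched LRU rule, which is what allows the front-end to be bolted onto existing implementations.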

2 GD-Size is a member of a family of algorithms called GD⁎ that consider several attributes of a document when purging from the cache. In order to maximize hit ratio, the criterion H = 1/size is used in the algorithm. Since our objective is to minimize average loading time, we use H = loading time/size as the criterion.
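Footnote 2's choice of H = loading time / size can be read against a minimal Greedy Dual-Size sketch (our own, using lazy heap deletion; only the priority rule comes from [8] and the footnote):

```python
import heapq

class GDSizeCache:
    """Greedy Dual-Size sketch: each cached document carries priority
    inflation + H, with H = loading_time / size (footnote 2); the lowest-priority
    document is purged first, and its priority becomes the new inflation value."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.inflation = 0.0    # the aging value, often called L in [8]
        self.meta = {}          # doc_id -> (priority, size)
        self.heap = []          # (priority, doc_id); may contain stale entries
        self.used = 0

    def access(self, doc_id, size, loading_time):
        if doc_id in self.meta:
            self.used -= self.meta[doc_id][1]   # re-inserted below with a fresh priority
        else:
            while self.used + size > self.capacity and self.heap:
                priority, victim = heapq.heappop(self.heap)
                if victim in self.meta and self.meta[victim][0] == priority:
                    self.inflation = priority   # aging step of the Greedy Dual family
                    self.used -= self.meta.pop(victim)[1]
        pr = self.inflation + loading_time / size
        self.meta[doc_id] = (pr, size)
        self.used += size
        heapq.heappush(self.heap, (pr, doc_id))
```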


3.1. Assumptions

We begin with the following two assumptions:

1. Document loading times follow a heavy tailed distribution [5], such as the Pareto distribution with density function

$$ f(z) = \frac{a b^{a}}{z^{a+1}}, \qquad a > 0, \ \forall z \ge b. \tag{1} $$

In the Pareto distribution, a and b are the shape and the scale parameters, respectively. The distribution is heavy tailed and captures the possibility that there could be a significant number of documents with high loading times. Barford and Crovella [5] review the essentials of generating web workloads to test network and server performance, where the underlying distribution for loading times follows a power law distribution. We have also derived all the results in this section using a uniform loading time distribution. While the qualitative nature of the results is maintained with uniform loading times, we only report the results pertaining to the Pareto distribution since it is more realistic.

2. A document's age is defined as the amount of time elapsed since the last access to the document. The demand on a document with age x is a non-homogeneous Poisson process with instantaneous mean rate given by

$$ \theta(x) = \frac{1}{\alpha x + \beta}, \qquad \alpha, \beta > 0, \ \alpha < 1, \tag{2} $$

where α is the age-sensitivity parameter. As α increases, document demand is more age-sensitive and age is a stronger predictor of demand; conversely, as α decreases, document demand becomes less predictable and hence more random. The demand model in Eq. (2) reflects the fact that a document that is frequently accessed is likely to have low age (on average). By using a non-homogeneous process (i.e., allowing the mean θ(x) to change) we accommodate the fact that the popularity of a document could change over time.

3.2. Screening parameter

After some analysis, it is possible to show that the optimal extent of screening (z̄) is given by

$$ \bar{z} = b \left( \frac{n}{L^{*}} \right)^{1/a} \tag{3} $$

for a set of n documents that could potentially be accessed by the users of the proxy server. In Eq. (3), L⁎ denotes the optimum number of documents marked as cacheable in the set of n documents. The value of L⁎ is obtained by solving

$$ \frac{\partial}{\partial L} \left( L^{1-\frac{1}{a}} \left( 1 - \left( 1 - \frac{R}{L} \right)^{\frac{1}{1-\alpha}} \right) \right) = 0. \tag{4} $$

In Eq. (4), R is the cache capacity in terms of the average number of documents that can be accommodated in the cache. In the next subsection and Appendix B, we derive Eqs. (3) and (4).

3.3. Derivation

We first find the number of documents (L) that should make up the set of cacheable documents. When L documents are accepted into the cacheable set, the lowest (n − L) documents (in terms of loading time) are considered non-cacheable. Thus we can restate the problem as follows. Given n documents whose loading times and sizes are drawn from known distributions, find L⁎ such that the expected delay per access, W(L), is minimized.

Let us label the n documents as 1, 2, …, n. First, sort the documents in descending order of loading time: z(1) ≥ z(2) ≥ z(3) ≥ … ≥ z(n). Next, select the first L documents in this ordering as the cacheable set. The expected delay per access for a given value of L is composed of two terms: W(L) = W1(L) + W2(L). The first term, W1(L), is the expected delay when the user requests one of the documents in the cacheable set, while the delay associated with the non-cacheable set is W2(L). Note that the user experiences delay only when the request causes a cache miss, i.e., the request is made on one of the documents outside of the cache. The average loading time of a cacheable document is L⁻¹Σᵢ₌₁ᴸ z(i), and the probability of a cache miss is 1 − Σᵣ₌₁ᴸ pᵣqᵣ. The term pᵣ is the probability that there are r documents in the cache, and qᵣ is the probability that a document in the cache is requested, given that there are r documents in the cache. Thus the delay associated with cacheable documents is given by

$$ W_1(L) = \frac{L}{n} \left( \frac{1}{L} \sum_{i=1}^{L} z_{(i)} \right) \left( 1 - \sum_{r=1}^{L} p_r q_r \right). \tag{5} $$

The expression for qᵣ is derived in [27] as

$$ q_r = 1 - \frac{L-r}{L} \prod_{l=L-r+1}^{L} \frac{(1-\alpha)\,l}{(1-\alpha)\,l + \alpha}. \tag{6} $$

If the document sizes have mean μy and standard deviation σy (but can follow any general distribution), it can be shown that [27]

$$ p_r \approx \Phi\!\left( \frac{C - \mu_y r}{\sigma_y \sqrt{r(L-r)/(L-1)}} \right) - \Phi\!\left( \frac{C - \mu_y (r+1)}{\sigma_y \sqrt{(r+1)(L-r-1)/(L-1)}} \right), \tag{7} $$

where Φ(·) is the CDF of the standard Normal. Using order statistics arguments, we derive an expression for the average loading time of the first L documents (see Appendix A):

$$ \frac{1}{L} \sum_{i=1}^{L} z_{(i)} = \frac{ab}{a-1} \, \frac{\Gamma(n+1)\,\Gamma(L+1-1/a)}{\Gamma(n+1-1/a)\,\Gamma(L+1)}. \tag{8} $$

To evaluate W2(L) (the delay associated with non-cacheable documents) we note that the average loading time for a document in the non-cacheable set is (n − L)⁻¹Σᵢ₌ₗ₊₁ⁿ z(i). The delay associated with non-cacheable documents is therefore

$$ W_2(L) = \frac{n-L}{n} \left( \frac{1}{n-L} \sum_{i=L+1}^{n} z_{(i)} \right). \tag{9} $$

The total expected delay is W(L) = W1(L) + W2(L). To find the optimum L⁎ value, the following approximation can be used. For a cache capacity of C, the average number of documents that will fit in the cache can be approximated by R = C/μy.

Table 2
Approximation results (10,000 documents, a = 1.5 and high cache capacity)

α      L_approx   L⁎       Delay with L = L_approx   Delay with L = L⁎
0.15   3002       3002     19.8361                   19.8361
0.2    3015       3015     19.8242                   19.8242
0.3    3111       3111     19.6965                   19.6965
0.4    3339       3338     19.2990                   19.2990
0.5    3750       3749     18.4682                   18.4682
0.6    4448       4447     16.9740                   16.9740
0.7    5689       5688     14.3796                   14.3796
0.8    8257       8255     9.6063                    9.6063
0.9    10,000     10,000   1.7006                    1.7006
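Behind comparisons like Table 2 sits the exact delay of Eqs. (5) to (9), which transcribes directly into code. A sketch (our own transcription; the handling of degenerate variance endpoints in Eq. (7) is our guard, and Eq. (8) is evaluated through log-Gamma for numerical stability):

```python
import math
from statistics import NormalDist

_Phi = NormalDist().cdf

def q_r(r, L, alpha):
    # Eq. (6): probability a cached document is requested, given r documents in cache.
    prod = 1.0
    for l in range(L - r + 1, L + 1):
        prod *= (1 - alpha) * l / ((1 - alpha) * l + alpha)
    return 1.0 - (L - r) / L * prod

def _cdf_term(k, L, C, mu_y, sigma_y):
    # Phi at the Eq. (7) argument; zero or negative variance is resolved by sign.
    var = k * (L - k) / (L - 1) if L > 1 else 0.0
    if var <= 0:
        return 1.0 if C - mu_y * k >= 0 else 0.0
    return _Phi((C - mu_y * k) / (sigma_y * math.sqrt(var)))

def p_r(r, L, C, mu_y, sigma_y):
    # Eq. (7): probability that exactly r documents are in the cache.
    return _cdf_term(r, L, C, mu_y, sigma_y) - _cdf_term(r + 1, L, C, mu_y, sigma_y)

def mean_top_L(a, b, n, L):
    # Eq. (8): average loading time of the L largest of n Pareto(a, b) draws.
    lg = math.lgamma
    ratio = math.exp(lg(n + 1) - lg(n + 1 - 1 / a) + lg(L + 1 - 1 / a) - lg(L + 1))
    return a * b / (a - 1) * ratio

def W_exact(L, n, a, b, C, mu_y, sigma_y, alpha):
    z_bar_L = mean_top_L(a, b, n, L)
    hit = sum(p_r(r, L, C, mu_y, sigma_y) * q_r(r, L, alpha) for r in range(1, L + 1))
    W1 = (L / n) * z_bar_L * (1 - hit)               # Eq. (5)
    W2 = (n * a * b / (a - 1) - L * z_bar_L) / n     # Eq. (9), via the overall Pareto mean
    return W1 + W2
```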


Fig. 4. Minimum average expected delay per request.

Fig. 2. Effect of cache size on loading time threshold (z̄).

The total expected delay can be written as (see Appendix B):

$$ W = \frac{ab}{a-1} - \frac{ab}{a-1} \left( \frac{L}{n} \right)^{1-\frac{1}{a}} \left( 1 - \left( 1 - \frac{R}{L} \right)^{\frac{1}{1-\alpha}} \right). \tag{10} $$
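Eqs. (10), (4) and (3) translate into a few lines of code. A sketch (our own; we locate L⁎ by direct search over integer L rather than solving the first order condition symbolically):

```python
def W_approx(L, n, a, b, R, alpha):
    # Eq. (10); for L <= R every cacheable document fits, so the miss term vanishes.
    miss = (1 - R / L) ** (1 / (1 - alpha)) if L > R else 0.0
    return a * b / (a - 1) * (1 - (L / n) ** (1 - 1 / a) * (1 - miss))

def optimal_screening(n, a, b, R, alpha):
    # Minimize Eq. (10) over L, then convert L* into the threshold z_bar via Eq. (3).
    L_star = min(range(1, n + 1), key=lambda L: W_approx(L, n, a, b, R, alpha))
    z_bar = b * (n / L_star) ** (1 / a)
    return L_star, z_bar

# Example call for the Table 2 setting (n = 10,000, a = 1.5, b = 20, high capacity R = 1000):
# optimal_screening(10_000, 1.5, 20, 1000, 0.3)
```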

The minimization of Eq. (10) yields the first order condition in Eq. (4), which can be solved to obtain L⁎ and consequently the optimal screening parameter z̄. As can be seen from Table 2, the approximation works well in all the cases that were studied. In addition, the nature of the delay function is such that it is relatively flat around the optimal value. Thus small deviations from the optimal value do not cause a significant difference in the delay.

3.4. Numerical results

We study different levels of proxy cache capacity ranging from 1% to 30% of all the potential documents (n). Other parameters varied in these experiments are the age-sensitivity of the documents and the document size standard deviation. Table 3 shows the values used for the numerical study. From Eq. (3) one can easily estimate the effect of the scale and shape parameters b and a. Note that the age-sensitivity parameter α implicitly affects z̄, because L⁎ is obtained from Eq. (4). Fig. 2 shows that the loading time threshold z̄ varies significantly with the cache capacity; however, it is relatively insensitive to α.

Fig. 3. Improvement over LRU.

Fig. 3 shows the effects of the age-sensitivity parameter on the expected delay per access for different levels of capacity. A low capacity cache in our experiments stores 1% of the documents; hence, for our experiments, the cache capacity is 10 documents. At medium capacity, the cache can store 5% of the documents, while high and very high cache capacities correspond to 10% and 30% of the documents, respectively. Fig. 3 implies that as the cache capacity increases, the improvement (over LRU) increases for low levels of the age-sensitivity parameter. However, the improvement over LRU (across cache capacities) is roughly the same at high levels of age-sensitivity. The reason for this is that when age-sensitivity is high, there is less need for a large cache, i.e., the re-referencing behavior at high α ensures that a relatively small cache is sufficient to provide adequate response. For lower levels of cache capacity the improvement drops slightly as the request pattern becomes more age-sensitive. However, for higher cache capacities, the drop in improvement is more dramatic. Once again, the reason for this is that a large cache becomes less important as the request pattern becomes more age-sensitive. Fig. 4 shows the delay with respect to age-sensitivity. For each level of cache capacity, the delay (slightly) decreases as the age-sensitivity of demand increases. The intuition behind this is shown in Fig. 5, where it can be seen that the number of cacheable documents increases with age-sensitivity. As age-sensitivity increases, fewer documents are screened out (i.e., the cacheable set becomes larger) because at high age-sensitivity, once a document is requested, it can keep getting requested, and the penalty of not allowing this document in the cache can be very high. As can be seen from Fig. 5, at high age-sensitivity and very high levels of cache capacity, the screening policy becomes very close to the basic LRU policy. Another factor that could affect the performance of the screening policy is variation in document size. To study size variation effects, the mean document size is held at 10 units, and, to avoid negative sizes, we only consider σy = 1, 2 and 3, corresponding to low, medium and high variation, respectively. Fig. 6 shows that there is little effect of document size variation on the performance of the screening policy.

Fig. 5. L⁎ for different levels of age-sensitivity and cache capacity.


Fig. 7. Age-sensitivity and document size variation (medium C).

Fig. 6. Variation of document sizes and capacity (α = 0.3).

Fig. 7 shows a result similar to Fig. 6. One important aspect to note in this experiment is that the loading times and document sizes are uncorrelated. Because of this, the variation in document size does not have an impact — a finding consistent with the one in [27], where the performance of the LRU algorithm was found to be insensitive to document size variation.3

3 In Section 4, we conduct an experiment where the loading time of a document is directly proportional to its size (i.e., a correlation coefficient of 1) and find exactly the reverse result. There, size variation is relevant because it affects document delay and hence the performance of the caching algorithm.

4. Simulation validation

The simulation experiments in this section serve to validate the theoretical results of the previous section. First we simulate the basic LRU algorithm and demonstrate that the theoretical delay predicted for this algorithm matches the simulation results. Next we find the optimal theoretical delay for the screening approach and compare this to the simulation result. In addition, we use simulation as a means to obtain the expected delays for other policies such as LAT and GD-Size. Finally, the simulation results allow us to examine the performance of these policies when the second objective, namely, traffic downloaded, is adopted.

4.1. Document generation

Before starting the simulation, n (in this case one thousand) documents with loading times and sizes are generated. As stated earlier, loading times are drawn from a Pareto distribution using the following inverse distribution function

$$ z_i = b \left( \frac{1}{1-s} \right)^{\frac{1}{a}}, \tag{11} $$

where a and b are the shape and scale parameters of the Pareto distribution and s is a random number uniformly distributed between 0 and 1. The document sizes are drawn from a Normal distribution with a mean size of 10 and standard deviations of 1, 2 and 3. In order to test the effects of correlation between loading time and size, we fix the set of loading times (zi) and choose a (correlated) set of sizes (yi) as follows:

$$ y_i = \frac{z_i + u}{v}, \tag{12} $$

where v is the scaling constant that sets the mean of the sizes to a desired value (in this case 10) and u is a random number drawn from N(μy v, σy). (See Appendix C for how σy is chosen to obtain the desired correlation between loading time and size.)

4.2. Demand trace generation

The next step in the simulation is to generate a demand trace. We generate a demand trace in the following way.

Step 1. Start with n documents with random ages; set a small time interval h; choose demand parameters α, β, and demand trace size D; set k = 0.
Step 2. Generate a random number p ∈ [0, 1] for each document i.
Step 3. Calculate T(i) = (xi + β/α)(p⁻ᵅ − 1) for each i; this is the time to the next access for document i.
Step 4. Find Δt = min T(i), ∀i = 1, 2, …, n; let i be the minimizing document.
Step 5. Increase xj (j ≠ i) by Δt, and set xi = 0.
Step 6. k = k + 1; if k = D then stop; else if k < D then go to Step 2.
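A sketch of Sections 4.1 and 4.2 in code (our own rendering: the Step 3 formula is our reconstruction from Eq. (2), and drawing an independent p for each document in every round is our reading of Steps 2 and 3):

```python
import random

def pareto_loading_times(n, a, b, rng):
    # Eq. (11): inverse-transform sampling of Pareto(a, b) loading times.
    return [b * (1.0 / (1.0 - rng.random())) ** (1.0 / a) for _ in range(n)]

def demand_trace(n, alpha, beta, D, seed=0):
    """Steps 1-6 of Section 4.2: event-driven trace generation under the
    age-dependent demand rate theta(x) = 1/(alpha*x + beta) of Eq. (2)."""
    rng = random.Random(seed)
    ages = [rng.random() for _ in range(n)]       # Step 1: random initial ages
    trace = []
    for _ in range(D):                            # Step 6 loop over k
        # Steps 2-3: per-document next-access time, by inverting the cumulative rate.
        T = [(ages[i] + beta / alpha) * ((1.0 - rng.random()) ** (-alpha) - 1.0)
             for i in range(n)]
        i_min = min(range(n), key=T.__getitem__)  # Step 4: earliest next access wins
        dt = T[i_min]
        for j in range(n):                        # Step 5: advance ages, reset the hit document
            ages[j] = 0.0 if j == i_min else ages[j] + dt
        trace.append(i_min)
    return trace
```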

4.3. Simulation results

A trace of 10,000 hits was generated using the above procedure. Table 3 shows simulation results for documents with a size coefficient of variation of 0.1 (i.e., standard deviation of 1) and a cache capacity of 500 KB, corresponding to the ability to cache 5% of the documents. The first group of three columns shows the average delay when LRU is implemented, the delay when the screening method is used over LRU, and the benefit of screening for LRU. The screening parameter z̄ is found using Eq. (3). We also use this parameter to run screening on LAT and GD-Size, and find improvements. However, since z̄ in Eq. (3) is derived for LRU, it may not necessarily yield the best performance for LAT and GD-Size. Instead, Table 4 shows the best improvements due to screening observed in simulation experiments. Both LAT and GD-Size can be further improved with screening, to almost 7% and 49%, respectively.

Table 5 shows simulation results for different levels of document size variation and cache capacity. The benefits of screening are higher when the demand pattern is more random (relatively low α) and the cache capacity is high (able to accommodate 10% of the documents). This is consistent with our findings in Section 3 (see Fig. 3), thus

Table 3
Experimental parameter values

Number of documents (n)          10,000
Cache capacity                   500 units (e.g., KB, MB, GB, etc.) unless noted otherwise; Low: 100, Medium: 500, High: 1000, Very high: 3000
Loading time distribution        Pareto distribution with a = 1.5 and b = 20
Mean document size (μy)          10 units (e.g., KB, MB, GB, etc.)
S.D. of document sizes (σy)      1 unless noted otherwise; Low: 1, Medium: 2, High: 3
Age-sensitivity parameter (α)    0.3 unless noted otherwise; Low: 0.2, Medium: 0.4, High: 0.6


Table 4
Simulation results for different age sensitivities at medium cache capacity (500 KB) (CoV of document sizes: 0.3)

Alpha   LRU delays                            LAT delays                            GD-Size delays
        LRU      Screening   Benefit (%)      LAT      Screening   Benefit (%)      GD-Size   Screening   Benefit (%)
0.1     56.954   40.825      40               40.700   37.818      8                59.540    44.662      33
0.2     56.358   38.598      46               40.499   37.535      8                59.661    39.926      49
0.3     55.506   37.955      46               39.597   37.180      7                59.349    44.554      33
0.4     51.922   37.892      37               39.455   37.001      7                56.745    42.615      33
0.5     51.210   39.093      31               38.758   36.325      7                57.182    38.837      47

providing further validation of the theoretical results in Section 3 and the method of finding z̄.

We also consider the correlation between loading time and document size. The results of the experiments on a trace where there is relatively low correlation (0.3 and 0.4) between size and loading time are displayed in Table 6. These correlation levels are consistent with the characteristics of the real web trace we use in the next section, where the correlation coefficient between loading time and size is 0.37. Scheuermann et al. [34] suggest a similar level of correlation. The process of obtaining correlated size values given the loading time of the document is given in Appendix C.

4.4. Traffic implications

This section looks at the effects of the screening policy on the traffic generated on the outgoing link. To address this question, we consider some of the parameter settings used in Section 4.3 and use simulation to estimate the traffic downloaded by each caching technique with and without screening. Here the loading times are exogenously provided to the designer, and these times may or may not be correlated with size. An example of uncorrelated exogenous loading times is when the download traffic flows on different routes and the congestion parameters associated with these routes vary considerably across the routes. Thus a small document following a congested route may take more time to download than a large document that follows a less congested route. Exogenous loading times that are correlated with size could occur if the traffic flows on uniformly congested routes (or the same route with constant expected delay), and hence the delays are proportional to the document sizes. However, these loading times can be considered exogenous if the congestion parameters are not significantly influenced by the traffic generated by the users of the proxy server network (e.g., because the traffic contributed by the network users is small in proportion to the total traffic).

The results in Table 7 show that, except for some extreme cases (such as high age sensitivity), there is no significant change in the download traffic between screening and LRU. Thus the reduction in delay obtained by using screening does not come at the cost of additional download traffic.

5. Parameter estimation using real web-trace data

In this section, we propose ways to estimate parameter values from real web-trace data obtained from the server log files of the University of California at Berkeley Home IP Web Traces [38]. The last subsection describes the findings when our screening technique is simulated using the real web-trace data.

5.1. Estimation of demand parameters

A data point in the web trace consists of the following information: (tij, sij, zij), where tij is the time at which the jth hit on the ith document took place, and sij and zij are the size and delay of the ith document at the jth hit. These values can change over time, since the contents of the document as well as its loading time could change across hits. We define the age of the ith document just before the jth hit as

$$ x_{ij} = t_{ij} - t_{i(j-1)}, \qquad j > 1. \tag{13} $$

The instantaneous access rate to document i right before its jth hit can be estimated as

$$ \theta_{ij} = \frac{j}{t_{ij} - t_{i1}}. \tag{14} $$

According to Eq. (2), we can formulate the following regression model

$$ \theta^{-1} = \alpha \, x + \beta + \tilde{\epsilon} \tag{15} $$

to estimate the parameters α and β. Using the data from the Berkeley web trace [38], the above regression model provides an estimate of α = 0.43 and β = 2000. The R² is 0.79; this indicates that Eq. (15) provides a very good fit to the trace data.
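The estimation of α and β from a trace reduces to ordinary least squares on Eq. (15). A sketch (our own; it assumes the trace is available as a time-sorted list of (document id, timestamp) pairs):

```python
def estimate_alpha_beta(trace):
    """Estimate alpha, beta of Eq. (2) by OLS on Eq. (15): 1/theta = alpha*x + beta.
    `trace` is a list of (doc_id, timestamp) pairs sorted by timestamp (our format)."""
    first, last_seen, hits = {}, {}, {}
    xs, ys = [], []
    for doc, t in trace:
        if doc in last_seen:
            hits[doc] += 1
            x = t - last_seen[doc]                 # Eq. (13): age just before this hit
            theta = hits[doc] / (t - first[doc])   # Eq. (14): instantaneous rate estimate
            xs.append(x)
            ys.append(1.0 / theta)                 # Eq. (15) regresses 1/theta on age
        else:
            first[doc] = t
            hits[doc] = 1
        last_seen[doc] = t
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    alpha = sxy / sxx                              # OLS slope
    beta = my - alpha * mx                         # OLS intercept
    return alpha, beta
```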

5.2. Estimation of parameters for loading time distribution

To estimate the shape and scale parameters for the loading time, we first use the fact that Eq. (1) can be written as

$$ f(z) = \frac{a b^{a}}{(z+b)^{a+1}}, \qquad \forall z \ge 0. \tag{16} $$

Taking the maximum loading times for each document, and minimizing the sum of squared errors of the distribution function, the Berkeley web trace yields estimates of a = 1.26 and b = 6.89 for the shape and scale parameters, respectively. A plot of the distribution function displaying a good visual fit using the above parameters is given in Fig. 8. In this estimation, we used an unbiased and efficient bin size 3.49σN⁻¹ᐟ³, where σ is the standard deviation of the sample and N is the sample size [35].
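A sketch of this fitting procedure (our own: the bin width is the one cited from [35], but the coarse grid search is our dependency-free stand-in for whatever optimizer the authors used):

```python
import math

def fit_shifted_pareto(max_loading_times):
    """Fit a and b of Eq. (16) by minimizing the squared error between the
    binned empirical distribution function and F(z) = 1 - (b/(z+b))**a."""
    zs = sorted(max_loading_times)
    N = len(zs)
    mean = sum(zs) / N
    sigma = math.sqrt(sum((z - mean) ** 2 for z in zs) / (N - 1))
    width = 3.49 * sigma * N ** (-1.0 / 3.0)       # bin size used in Section 5.2
    edges = [k * width for k in range(1, int(zs[-1] / width) + 2)]
    empirical, j = [], 0
    for e in edges:                                # empirical CDF evaluated at bin edges
        while j < N and zs[j] <= e:
            j += 1
        empirical.append(j / N)
    def sse(a, b):
        return sum((F - (1.0 - (b / (e + b)) ** a)) ** 2
                   for e, F in zip(edges, empirical))
    grid = ((a100 / 100.0, b4 / 4.0)               # a in [1.01, 3.00], b in [1.0, 20.0]
            for a100 in range(101, 301) for b4 in range(4, 81))
    return min(grid, key=lambda ab: sse(*ab))
```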

Table 5
Simulation results for different levels of cache capacity and size variation

Alpha   Size S.D.   Cache capacity (%)   LRU delay                            LAT delay                            GD-Size delay
                                         LRU      Screening   Benefit (%)     LAT      Screening   Benefit (%)     GD-Size   Screening   Benefit (%)
0.2     1           1                    59.760   50.574      18              52.430   49.856      5               59.495    51.673      15
0.2     1           10                   52.505   33.210      58              38.647   33.388      16              59.661    39.926      49
0.2     3           1                    59.775   50.878      17              50.065   49.043      2               59.495    51.600      15
0.2     3           10                   52.632   32.736      61              38.647   32.112      20              59.661    39.926      49
0.5     1           1                    57.259   49.091      17              49.354   48.459      2               57.550    49.952      15
0.5     1           10                   43.563   32.349      35              36.680   32.043      14              57.125    39.308      45
0.5     3           1                    57.288   49.394      16              50.065   47.930      4               57.550    49.952      15
0.5     3           10                   43.563   33.119      32              36.680   30.277      21              57.125    39.308      45


Table 6
Effects of correlation between size and loading time (α = 0.3, C = 500)

ρ(y, z)   S.D. of doc sizes   Delay savings                        Traffic savings
                              LRU (%)   LAT (%)   GD-Size (%)      LRU (%)   LAT (%)   GD-Size (%)
0.3       Low (0.76)          37.60     0.28      20.19            −1.82     −0.11     −1.52
0.3       High (2.67)         32.15     0.31      18.09            −2.20     −0.07     −2.05
0.4       Low (0.61)          36.99     0.17      20.40            −2.14     −0.12     −1.45
0.4       High (2.88)         30.61     0.29      19.77            −2.05     −0.18     −1.05

Although our algorithm works with any size distribution, other work on real web traces suggests that document sizes also follow a Pareto distribution [13]. Once the values for these parameters are estimated, L⁎ can be found using Eq. (3), and hence the optimal screening parameter z̄. For the Berkeley web trace, the optimal screening parameter is z̄ ≈ 52 at the medium cache level (5%): any document with a loading time below 52 seconds is not admitted into the cache.

Fig. 8. Distribution function of loading times.

5.3. Size dependency

The documents accessed in the real trace are found to have high variation in their sizes. This phenomenon causes the cache to be filled inefficiently. In addition, we discover some dependency between the loading time and the size of the documents in the real web trace. When a large document with high delay is accessed and therefore admitted to the cache, the remaining cache space may not be enough for the next requested document, causing the policy to remove many documents in order to accommodate the new document. In order to avoid this problem, we screen the documents according to the ratio loading time/size. This smooths out the variation in document size while still capturing the variation in loading time. Table 8 shows the benefit of screening against no screening: the end users benefit from screening, with little change in bandwidth.

Table 8
Delay and bandwidth savings for loading times

Cache capacity (%)   Delay savings                        Traffic difference
                     LRU (%)   LAT (%)   GD-Size (%)      LRU (%)    LAT (%)   GD-Size (%)
1.0                  1.753     6.612     0.003            −10.525    −3.035    0.000
0.5                  3.074     5.502     0.005            −7.622     −2.671    −0.003
0.1                  4.583     0.129     0.246            −3.404     −0.066    −0.030
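The size-aware criterion of this subsection is a one-line change to the admission test in the screening sketch of Section 3: compare loading time per unit size against a threshold instead of raw loading time (threshold value hypothetical):

```python
def admit_by_ratio(loading_time, size, ratio_threshold):
    # Section 5.3 criterion: admit only documents that are slow relative to their
    # size, so one large slow document cannot crowd out many small ones.
    return loading_time / size >= ratio_threshold
```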

6. Summary and conclusions

This study proposed and evaluated a screening policy for proxy caching based upon the loading time of documents. In the first phase of this study, we expressed the average expected delay for a screening policy and found the optimum number of documents to consider as cacheable. We have presented a method to convert the screening threshold to a loading time threshold to be used in the screening policy. To validate the analytical results, we simulated the LRU policy and the screening policy and found a close match between the analytical values of delay and those obtained in the simulation experiment. An important result of the study is that screening can be especially beneficial when the document request pattern is more random (as opposed to one that strongly exhibits re-referencing) or when the cache capacity is relatively high. We also simulated the screening policy to augment the LAT and GD-Size policies, and these simulations provided similar insights. However, as expected, the LAT policy did not benefit as much as the other two algorithms, because it already uses loading time as the criterion when purging a document from the cache.

While running the simulations we kept track of the sizes of the documents and the bandwidth usage of the proxy server. The interest was to see if screening can reduce delays without significantly impacting the traffic on the outgoing link. Our experiments showed that screening can significantly reduce delays while holding the download traffic at (approximately) the same level. We also found that the documents accessed in the actual web trace have very high size variation and that the loading times of the documents depend on their sizes. This led us to extend the loading time criterion to loading time/size. The simulations showed a smaller benefit than in the case of the generated trace.

This study is part of our on-going interest in the operational and economic aspects of content delivery on the Internet. This study can be beneficial to companies in the business of creating content delivery products (such as edge servers and caching accelerators) as well as those involved in the business of providing content delivery services (such as content distributors and Internet Service Providers). For future research on this problem, we plan to study the interaction between the caching policy used at proxy servers and the placement of these servers, e.g., a way for proxy servers to cooperate with one another for the fast distribution of content.

Table 7
Delay and bandwidth savings for loading times

Alpha   ρ(y, z)   Cache      LRU delay     LRU bandwidth   LAT delay     LAT bandwidth   GD-Size delay   GD-Size bandwidth
                  capacity   savings (%)   savings (%)     savings (%)   savings (%)     savings (%)     savings (%)
0.1     0         500        46.17         −0.42           1.81          0.54            33.31           0.31
0.1     0         1000       63.62         −0.85           14.82         5.27            48.58           3.31
0.1     0.5       500        44.26         −0.38           1.81          0.54            33.24           1.81
0.1     0.5       1000       62.41         −0.76           −3.81         −1.99           45.85           3.95
0.1     1         500        0.08          0.08            1.81          0.54            0.00            0.00
0.1     1         1000       0.66          0.66            2.78          1.02            64.74           64.74
0.3     0         500        44.89         −2.76           1.90          0.51            33.21           0.48
0.3     0         1000       58.57         −5.29           14.66         5.36            50.27           2.80
0.3     0.5       500        41.66         −0.32           1.90          0.51            31.05           1.98
0.3     0.5       1000       62.41         −0.76           2.92          1.04            46.27           3.36
0.3     1         500        0.00          0.00            1.90          0.51            26.30           26.30
0.3     1         1000       −1.89         −1.89           2.92          1.04            0.00            0.00
0.5     0         500        30.21         −8.89           2.36          0.65            27.06           −2.57
0.5     0         1000       31.09         −17.19          14.47         5.61            45.32           0.38
0.5     0.5       500        30.23         −8.59           2.36          0.65            27.48           −0.41
0.5     0.5       1000       30.36         −16.79          2.65          0.86            42.55           2.59
0.5     1         500        −1.59         −1.59           2.36          0.65            6.63            6.63
0.5     1         1000       −5.16         −5.16           2.65          0.86            59.20           59.20


Appendix A. Proof for order statistics

If all z's are drawn from a distribution f(z), then, given a value of z, when drawn n times the probability of being the ith ranked is

$$ P(i\text{th} \mid z) = \binom{n-1}{i-1} (1-F(z))^{i-1} (F(z))^{n-i}. \tag{A1} $$

One can verify that

$$ \sum_{i=1}^{n} P(i\text{th} \mid z) = 1, \tag{A2} $$

and

$$ P(i\text{th}) = \int_{z \in Z} P(i\text{th} \mid z) f(z)\,dz = \binom{n-1}{i-1} \frac{(i-1)!\,(n-i)!}{n!} = \frac{1}{n}. \tag{A3} $$

So, given the ith rank, the conditional distribution of z is

$$ f(z \mid i\text{th}) = \frac{P(i\text{th} \mid z)\,f(z)}{P(i\text{th})} = \frac{\Gamma(n+1)}{\Gamma(n-i+1)\,\Gamma(i)} (1-F(z))^{i-1} (F(z))^{n-i} f(z). \tag{A4} $$

The conditional mean is

$$ E(z \mid i\text{th}) = \int_{z \in Z} z\, f(z \mid i\text{th})\,dz. \tag{A5} $$

If z follows a Pareto distribution with density f(z) = a bᵃ z⁻ᵃ⁻¹, where a > 1 and z > b,

$$ E(z \mid i\text{th}) = b \int_{0}^{1} \frac{\Gamma(n+1)}{\Gamma(n-i+1)\,\Gamma(i)} (1-z)^{i-1-1/a}\, z^{n-i}\,dz, \tag{A6} $$

which leads to

$$ E(z \mid i\text{th}) = b\, \frac{\Gamma(i-1/a)\,\Gamma(n+1)}{\Gamma(i)\,\Gamma(n+1-1/a)}. \tag{A7} $$

One can find the mean for the first L documents and the last n − L documents as follows:

$$ \frac{1}{L} \sum_{i=1}^{L} E(z \mid i\text{th}) = \frac{ab}{a-1}\, \frac{\Gamma(n+1)\,\Gamma(L+1-1/a)}{\Gamma(n+1-1/a)\,\Gamma(L+1)}, \tag{A8} $$

and, given that the expected mean over all n documents is ab/(a − 1), the expected mean for the last (n − L) documents is

$$ \frac{1}{n-L} \sum_{i=L+1}^{n} E(z \mid i\text{th}) = \frac{1}{n-L} \left( n\,\frac{ab}{a-1} - L\,\frac{ab}{a-1}\, \frac{\Gamma(n+1)\,\Gamma(L+1-1/a)}{\Gamma(n+1-1/a)\,\Gamma(L+1)} \right). \tag{A9} $$

Appendix B. Proof for approximation

Average delay per access can be written as

$$ W = \frac{n-L}{n} \left( \frac{1}{n-L} \left( n\,\frac{ab}{a-1} - L\,\frac{ab}{a-1}\, \frac{\Gamma(n+1)\,\Gamma(L+1-1/a)}{\Gamma(n+1-1/a)\,\Gamma(L+1)} \right) \right) + \frac{L}{n}\, \frac{ab}{a-1}\, \frac{\Gamma(n+1)\,\Gamma(L+1-1/a)}{\Gamma(n+1-1/a)\,\Gamma(L+1)} \left( 1 - \sum_{r=1}^{L} p_r q_r \right), \tag{B1} $$

where qᵣ and pᵣ are given by Eqs. (6) and (7). The first step of the approximation is to assume equal size documents. Let R = C/μy. Then we have

$$ W = \frac{ab}{a-1} - \frac{L}{n}\, \frac{ab}{a-1}\, \frac{\Gamma(n+1)\,\Gamma(L+1-1/a)}{\Gamma(n+1-1/a)\,\Gamma(L+1)} \left( 1 - \left( 1 - \frac{R}{L} \right)^{\frac{1}{1-\alpha}} \right), \tag{B2} $$

where we assume that R/L ≪ 1. Furthermore, since n is large and a > 1, we can use the approximate form of the Gamma function and get

$$ W = \frac{ab}{a-1} - \frac{ab}{a-1} \left( \frac{L}{n} \right)^{1-\frac{1}{a}} \left( 1 - \left( 1 - \frac{R}{L} \right)^{\frac{1}{1-\alpha}} \right). \tag{B3} $$

From Eq. (B3), L⁎ can be found by solving

$$ \frac{\partial}{\partial L} \left( L^{1-\frac{1}{a}} \left( 1 - \left( 1 - \frac{R}{L} \right)^{\frac{1}{1-\alpha}} \right) \right) = 0. \tag{B4} $$

In general we can write

$$ L^{*} = u(a, \alpha)\, R. \tag{B5} $$

Appendix C. Correlated size generation

Take three random variables Z, Y and U such that Y = Z + U, and assume that Z and U are independent. If the correlation between Y and Z is ρ(Y, Z), then

$$ \mathrm{Var}(U) = \mathrm{Var}(Z) \left( \frac{1}{\rho(Y, Z)^{2}} - 1 \right). $$

Proof: Assume that Y = Z + U, where Z and U are any two independent random variables with variances σ²_Z and σ²_U, respectively. Cov(Y, Z) = Cov(Z + U, Z) = Var(Z) + Cov(Z, U). From the independence assumption, Cov(Z, U) = 0, and therefore Cov(Y, Z) = Var(Z) = σ²_Z. Since ρ(Y, Z) = Cov(Y, Z)/(σ_Y σ_Z), it follows that ρ(Y, Z) = σ_Z/σ_Y. Furthermore, since Z and U are independent, σ²_Y = σ²_Z + σ²_U, so ρ(Y, Z) = σ_Z/√(σ²_Z + σ²_U), from which the stated variance of U follows.
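Appendix C gives exactly the variance needed to hit a target correlation. A sketch of the size generation of Eq. (12) built on it (our own; the centering of U so that the sizes average μy is our choice and is not spelled out in the text):

```python
import math
import random

def correlated_sizes(loading_times, rho, mu_y, v, seed=0):
    """Sizes y = (z + u)/v (Eq. (12)) with corr(y, z) = rho, using Appendix C:
    Var(U) = Var(Z) * (1/rho**2 - 1). Scaling by 1/v leaves the correlation intact."""
    rng = random.Random(seed)
    n = len(loading_times)
    mean_z = sum(loading_times) / n
    var_z = sum((z - mean_z) ** 2 for z in loading_times) / (n - 1)
    sigma_u = math.sqrt(var_z * (1.0 / rho ** 2 - 1.0))
    mu_u = mu_y * v - mean_z    # chosen so the sizes average mu_y (our assumption)
    return [(z + rng.gauss(mu_u, sigma_u)) / v for z in loading_times]
```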

References

[1] M. Abrams, C.R. Standridge, G. Abdulla, S. Williams, E.A. Fox, Caching proxies: limitations and potentials, Proceedings of the 4th International WWW Conference, Boston, MA, December 1995. [2] M.F. Arlitt, L. Cherkasova, J. Dilley, R.J. Friedrich, T.Y. Jin, Evaluating content management techniques for Web proxy caches, ACM SIGMETRICS Performance Evaluation Review 27 (4) (March 2000) 3–11. [3] M.F. Arlitt, R.J. Friedrich, T.Y. Jin, Performance evaluation of Web proxy cache replacement policies, Technical Report HPL-98-97 (R.1), Hewlett–Packard Company, Palo Alto, CA, 1999. [4] R. Ayani, Y.M. Teo, Y.S. Ng, Cache pollution in Web proxy servers, Proceedings of International Parallel and Distributed Processing Symposium, 2003, pp. 1–7. [5] P. Barford, M. Crovella, Generating representative Web workloads for network and server performance evaluation, Proceedings of the Joint International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS'98/PERFORMANCE'98), Madison, WI, June 1998, pp. 151–160. [6] G. Barish, K. Obraczke, World Wide Web caching: trends and techniques, IEEE Communications Magazine 38 (5) (May 2000) 178–184. [7] I. Bose, H.K. Cheng, Performance models of a firm's proxy cache server, Decision Support Systems 29 (2000) 47–57. [8] P. Cao, S. Irani, Cost-aware WWW proxy caching algorithms, Proceedings of the USENIX Symposium on Internet Technologies and Systems, Monterey, CA, December 1997.

[9] J. Challenger, A. Iyengar, P. Dantzig, A scalable system for consistently caching dynamic Web data, Proceedings of IEEE INFOCOM'99, vol. 1, IEEE Press, Piscataway, NJ, March 1999, p. 294. [10] C.Y. Chang, T. McGregor, G. Holmes, The LRU⁎ WWW proxy cache document replacement algorithm, Proceedings of the Asia Pacific Web Conference, 1999. [11] K. Cheng, Y. Kambayashi, LRU-SP: a size-adjusted and popularity-aware LRU replacement algorithm for Web caching, Proceedings of the 24th Annual International Computer Software and Applications Conference, 2000, p. 48. [12] I.R. Chiang, P.B. Goes, Z. Zhang, Periodic cache replacement policy for dynamic content at application server, Decision Support Systems 43 (2007) 336–348. [13] C. Cunha, A. Bestavros, M. Crovella, Characteristics of WWW client-based traces, Technical Report TR-95-010, Boston University, April 1995. [14] A. Datta, K. Dutta, H. Thomas, D. VanderMeer, World Wide Wait: a study of Internet scalability and cache-based approaches to alleviate it, Management Science 49 (10) (2003) 1425–1444. [15] J. Dilley, M. Arlitt, S. Perret, Enhancement and validation of the Squid cache replacement policy, Proceedings of the 4th International Web Caching Workshop, 1999. [16] T. Fagni, R. Perego, F. Silvestri, S. Orlando, Boosting the performance of web search engines: caching and prefetching query results by exploiting historical usage data, ACM Transactions on Information Systems 24 (1) (2006) 51–78. [17] J. Hahn, R.J. Kauffman, J. Park, Designing for ROI: toward a value-driven discipline for E-commerce system design, Proceedings of the 2002 Hawaii International Conference on System Sciences, Big Island, HI, January 2002, pp. 2663–2672. [18] P. Jelenkovic, A. Radovanovic, Asymptotic insensitivity of least-recently-used caching to statistical dependency, INFOCOM 2003, 22nd Annual Joint Conference of the IEEE Computer and Communications Societies, 2003, pp. 438–447. [19] B. Juurlink, Approximating the optimal replacement algorithm, Proceedings of the 1st Conference on Computing Frontiers, 2004, pp. 313–319. [20] B. Krishnamurthy, J. Rexford, Web Protocols and Practice: HTTP/1.1, Networking Protocols, Caching, and Traffic Measurement, Addison Wesley, 2001. [21] N. Kumar, A. Gangopadhyay, G. Karabatis, Supporting mobile decision making with association rules and multi-layered caching, Decision Support Systems 43 (2007) 16–30. [22] C. Kumar, J.B. Norris, A new approach for a proxy-level web caching mechanism, Decision Support Systems 46 (1) (2008) 52–60. [23] M. Kurcewicz, W. Sylwestrzak, A. Wierzbicki, A filtering algorithm for web caches, Computer Networks and ISDN Systems 30 (22–23) (November 1998) 2203–2209. [24] J. Mardesich, The Web is no shopper's paradise, Fortune (November 8, 1999) 188–198. [25] D. Menascé, Scaling Web sites through caching, IEEE Internet Computing 7 (4) (July/August 2003) 86–89. [26] J.M. Menaud, V. Issarny, M. Banatre, Improving effectiveness of Web caching, Recent Advances in Distributed Systems, Lecture Notes in Computer Science, vol. 1752, Springer-Verlag, Berlin, Germany, 2000, pp. 375–401. [27] V. Mookerjee, Y. Tan, Analysis of a least recently used cache management policy for Web browsers, Operations Research 50 (2) (2002) 345–357. [28] C.D. Murta, V.A.F. Almeida, W. Meira, Analyzing performance of partitioned caches for the WWW, Proceedings of the 3rd International WWW Caching Workshop, Manchester, England, June 1998. [29] N. Osawa, T. Yuba, K. Hakozaki, Generational replacement schemes for a WWW proxy server, High-Performance Computing and Networking (HPCN'97), Lecture Notes in Computer Science, vol. 1225, Springer-Verlag, Berlin, Germany, 1997, pp. 940–949. [30] S. Podlipnig, L. Böszörményi, A survey of web cache replacement strategies, ACM Computing Surveys 35 (4) (December 2003) 374–398. [31] M. Rabinovich, O. Spatscheck, Web Caching and Replication, 1st edition, Addison Wesley, 2002. [32] M. Reddy, G.P. Fletcher, Intelligent web caching using document life histories: a comparison with existing cache management, Proceedings of the 3rd International WWW Caching Workshop, Manchester, England, June 1998. [33] L. Rizzo, L. Vicisano, Replacement policies for a proxy cache, IEEE/ACM Transactions on Networking 8 (2) (2000) 158–170. [34] P. Scheuermann, J. Shim, R. Vingralek, A case for delay-conscious caching of Web documents, Computer Networks and ISDN Systems 29 (8–13) (September 1997) 997–1005. [35] D. Scott, On optimal and data-based histograms, Biometrika 66 (3) (1979) 605–610. [36] S.W. Shin, K.Y. Kim, J.S. Jang, LRU based small latency first replacement (SLFR) algorithm for the proxy cache, Proceedings of IEEE/WIC International Conference, 2003, pp. 499–502. [37] Y. Tan, Y. Ji, V.S. Mookerjee, Analyzing document-duplication effects on policies for browser and proxy caching, INFORMS Journal on Computing 18 (4) (2006) 506–522.


[38] UC Berkeley Home IP Web Traces, 2006, available at http://ita.ee.lbl.gov/html/ contrib/UCB.home-IP-HTTP.html. Last accessed on January 16, 2006. [39] E.F. Watson, Y. Shi, Y.S. Chen, A user-access model-driven approach to proxy cache performance analysis, Decision Support Systems 25 (1999) 309–338. [40] R.P. Wooster, M. Abrams, Proxy caching that estimates page load delays, Computer Networks and ISDN Systems 29 (1997) 977–986. [41] N. Young, The k-server dual and loose competitiveness for paging, Algorithmica 11 (6) (June 1994) 525–541. [42] M. Zari, H. Saiedian, M. Naeem, Understanding and reducing Web delays, IEEE Computer 34 (12) (December 2001) 30–37. Cüneyd C. Kaya received his MS in Information Technology and Management and his PhD in Management Science with Information Systems concentration from the University of Texas at Dallas. He has a BSc in Mathematics from Istanbul Technical University. His research interests are in economics of Web content distribution, performance evaluation of web proxy servers, and capacity management in content provision sites. He is also interested in information systems security and dynamics of web user communities. He is currently Manager of Decision Analysis at Blockbuster.

Guoying Zhang is an Assistant Professor of Management Information Systems at the Dillard College of Business Administration, Midwestern State University. She received her PhD in Information Systems from the University of Washington. Her research interests include information security, social networks, and economics of information systems. Yong Tan is an Associate Professor of Information Systems and Evert McCabe Faculty Fellow at the Michael G. Foster School of Business, University of Washington. His research interests include electronic commerce, social networks, software engineering, and economics of information systems. He has published in Operations Research, Management Science, Information Systems Research, INFORMS Journal on Computing, IEEE/ACM Transactions on Networking, IEEE Transactions on Software Engineering, IEEE Transactions on Knowledge and Data Engineering, IIE Transactions, and European Journal of Operational Research. He is an associate editor of Management Science and Information Systems Research. Vijay S. Mookerjee is the Charles and Nancy Davidson Distinguished Professor of Information Systems at the School of Management, University of Texas at Dallas. He holds a Ph.D. in Management, with a major in MIS, from Purdue University. His current research interests include optimal software development methodologies, storage and cache management, and the economic design of expert systems and machine learning systems. He has published in and has articles forthcoming in several archival Information Systems, Computer Science, and Operations Research journals. He serves on the editorial board of Management Science, Information Systems research, INFORMS Journal on Computing, Operations Research, Decision Support Systems, Information Technology and Management, and Journal of Database Management.