Chinese Journal of Electronics Vol.26, No.3, May 2017

Multiple Relay Selection Based on Game Theory in Cooperative Cognitive Radio Networks∗ WU Renyong, WANG Wenru and LI Renfa (The College of Information Science and Engineering, Hunan University, Changsha 410082, China) Abstract — Due to the inefficiency of traditional fixed spectrum allocation policies, the paradox of apparent spectrum scarcity occurs while most of the bands are underutilized. This has prompted proposals for Dynamic spectrum sharing (DSS), which explains why Cognitive radio network (CRN) has been widely accepted as a promising approach to settle inefficient usage of scarce available radio spectrum. As a subset of DSS, Dynamic spectrum leasing (DSL) strategy has been proposed based on game idea, where Primary user (PU) has an incentive to allow Cognitive users (CUs) to access its licensed spectrum for a fraction of time in exchange for revenue. This paper proposes an approach, named multiple relay selection based on Game theory (GTMRS), to optimize the utilities of PU and CUs as a whole, where a pricing-based spectrum leasing mechanism is applied. While the parameter price c is jointly determined by PU and CUs, all selected cognitive user’s optimal cooperative powers can be satisfied through a non-cooperative game among themselves. Numerical results show that more CUs are involved in the cooperation and both utilities of PU and CUs as a whole are improved, which means the whole system throughput is increased. Key words — Cognitive radio networks, Dynamic spectrum leasing, Cooperative transmission, Nash equilibrium.

I. Introduction

With the rapid growth of the mobile telecommunication industry, spectrum scarcity is becoming a severe problem that the whole industry has to face[1]. One feasible solution is to exploit the potential of the existing spectrum by improving spectrum efficiency. As a promising proposal, Cognitive radio networks (CRNs)[2] have been widely accepted as practical, real-time, highly focused applications of computational intelligence technology, where the task is to adapt radio-enabled information services to the specific needs of a specific user. CRNs generally work in three modes, i.e., interweave, underlay and overlay[3]. The earliest interweave idea was put forward by Mitola[4]: Cognitive users (CUs) should be intelligent enough to find spectrum holes through spectrum sensing and then access these holes for their own communications, so the interference to the Primary user (PU) does not need to be considered. In underlay mode, both PU and CUs are allowed to access the same spectrum band simultaneously, where a PU simply regards the CUs' transmission signals as a kind of interference and the total interference level is guaranteed to be less than a certain acceptable threshold[5]. In overlay mode, CUs utilize the knowledge of PU's codebook and message to cancel the interference caused by themselves[6]. As mentioned above, a CRN in interweave or underlay mode always works in detection and access phases to fulfill its transmission task, so spectrum efficiency and network throughput are governed by detection precision and the access procedure[2]. In view of this situation, cooperative strategies have been widely introduced into CRNs[7] to model users' behaviors. For example, in the detection stage, cooperative sensing in interweave CRNs can effectively improve sensing precision by combining multiple copies of detection information[8], while the transmission parameters for underlay CRNs can be optimized through cooperative feedback[9]; in the access stage, spectrum efficiency can be improved by cooperative interference coordination among multiple CUs[2]. However, here the cooperation looks somewhat passive because a PU always regards CUs' transmission as a kind of negative interference and has no incentive to cooperate with them. A cognitive radio network adopting this cooperative idea is called a Cooperative cognitive radio network (CCRN)[7,10−12]. In fact, as PU's transmission rate is increased with the help of these cooperative relays, the transmission time for a given amount of PU traffic can be reduced, which means CUs can obtain more opportunities to transmit their own data. It is a win-win situation for both sides.
However in reality, node locations and channel aspects will dramatically affect the total system performance, so the strategy of relay selection is another

∗ Manuscript Received Jan. 21, 2015; Accepted July 12, 2015. © 2017 Chinese Institute of Electronics. DOI:10.1049/cje.2017.03.019


fundamental issue for CCRNs. This paper designs a novel relay selection algorithm based on game theory to select the optimal set of CUs and allocate their optimal cooperative powers in Amplify-and-forward (AF) cooperative mode. Analysis and numerical results both show that more CUs are involved in the cooperation and that network throughput is largely improved. The main contributions are as follows:
1) In accordance with the pricing payment strategy, the relationship between cooperative power and access time of CUs is deduced by introducing an intermediate parameter, and then the utility functions for PU and CUs in the model are defined.
2) We prove that there is a unique Nash equilibrium in the non-cooperative game among CUs, so CUs' optimal cooperative powers can be obtained.
3) We propose an iterative relay selection algorithm based on the optimization of cooperative power to remove the relays that suppress the total equivalent SNR at the Primary receiver (PR), where a so-called modified channel harmonic mean factor is introduced to act as a virtual timer.

II. Related Works

According to the number of players involved in a game, relay selection algorithms can be divided into two categories: single and multiple. As a single-relay selection (SRS) scheme, Ref.[13] adopts optimal stopping theory and derives the corresponding optimal stopping rules to select a relay efficiently from a large number of cognitive users. In Ref.[14], three single-relay selection schemes are proposed based on transmission rate maximization, channel gain maximization, and harmonic mean minimization of channel gain, respectively. Compared with single-relay selection algorithms, Multiple-relay selection (MRS) can achieve better diversity gain and higher transmission rate. For a fixed amount of data, Ref.[15] seeks to select a set of cooperative relays that minimizes the total transmission time in wireless communications. In Ref.[16], several SNR-suboptimal multiple-relay selection schemes are proposed based on the relay ordering idea, whose complexity is linear in the number of relays. Using the same relay ordering idea, an SRS scheme is proposed in Ref.[17] to maximize the worse SNR value of the two end users, and an MRS scheme is developed for two-way relay networks. For CRNs consisting only of CR users[18,19], SNR maximization methods are presented in which the users work in underlay mode and need to control their interference. Ref.[18] presents a low-complexity interference-aware multiple relay selection scheme, whose basic idea is to maximize the SNR at the destination node of CUs under the constraint of acceptable interference to PU. However, this method hardly considers power control, so system throughput decreases as CUs' transmission power increases. To mitigate this fault, Ref.[19] adopts a simple power control strategy to select the relays which can maximize the SNR at the destination. However, the SNR is obtained by simply summing up all SNR components from different transmission paths, which severely affects the accuracy of the algorithm. In Ref.[20], a centralized algorithm is proposed to achieve optimal relay selection and dynamic spectrum access in interference-limited video-streaming single-hop ad hoc networks, and in Ref.[21], a distributed algorithm is designed to allocate relays and spectrum in interference-limited infrastructure-less networks based on variational inequality theory.

As a mathematical tool, game theory has been widely used to model or analyze network users' behaviors, especially in cooperative communication and cognitive radio networks. Ref.[22] summarizes the incentive mechanisms in cooperative communications and lists some challenging research issues. Considering the spectrum sharing feature of CCRNs, various schemes based on game theory have been proposed in Refs.[23]-[31]. In Ref.[23], the concept of a Stackelberg game is introduced such that a source node plays as a buyer and pays for the cooperative power of cognitive users, who play as sellers. Both buyer and sellers are selfish, so the source node aims to maximize its own benefit by selecting the CUs with appropriate location and channel state as relays and buying an optimal amount of cooperative power from each one, on condition that the relay nodes have set the price per unit of service. In Ref.[24], after addressing the problem of the conventional CCRN framework, a novel MIMO-CCRN system architecture is proposed, where a Stackelberg game is formulated to maximize the utilities of PU and SUs. In contrast, Ref.[25] introduces a novel spectrum leasing mechanism for CCRNs, where PU leases the licensed spectrum band to CUs in exchange for CUs' cooperation; in other words, CUs pay charges to PU in order to access the legacy spectrum.
However, in these schemes, the influence of the selection mechanism is not fully considered and all CUs' cooperative powers are fixed, although it has been proved that PU's utility can be maximized by adjusting the parameters of the algorithm. In Ref.[26], a relay selection and admission control algorithm based on a Stackelberg game (denoted as SGRS in this paper) is designed, where Decode-and-forward (DF) is adopted as the cooperative protocol. However, the influence of CUs' cooperative powers on PU's utility is still not fully considered, although PU can achieve its maximal utility in this situation by adjusting the spectrum price, which is the major motivation of this research.

III. System Model

Consider the system shown in Fig.1: a Primary user sends signals to its Primary receiver (PR), and in the same frequency band, N Cognitive users would like to exploit possible transmission opportunities from the PU to send their own signals to the Secondary receiver (SR). Assume that each CU's antenna can transmit or receive signals, and that all communication channels between CUs are modeled as independent complex Gaussian random variables that remain invariant within a slot but generally vary over slots (i.e., Rayleigh block-fading channels). The fading coefficients of the wireless channels between PU and PR, PU and CU$_i$, CU$_i$ and PR, and CU$_i$ and SR are denoted as $h_0$, $h_{0i}$, $h_{i0}$ and $h_i$, respectively. In addition, assume that before all transmissions, PU knows its channel fading coefficients $h_0$ and $h_{0i}$, and CU$_i$ knows its channel fading coefficients $h_i$ and $h_{i0}$. The common channel noise is denoted as $n_0$ with the same average power $N_0$ for all channels, and the AF protocol is applied to forward messages.

PU needs to find an appropriate relay set S from all CUs to cooperatively fulfill its transmission as shown in Fig.1, and CUs have their own transmission requirements, so each time slot has to be further divided into three phases (subslots). The first phase occupies the $1 - \alpha$ portion of the slot and is dedicated to the broadcast from PU to the selected cooperative CUs, while the other two phases, each occupying a $0.5\alpha$ portion, involve the cognitive users' transmission. The second phase is dedicated to PU's direct transmission and CUs' re-transmission to PR, and the third is occupied by the selected CUs, which access the spectrum band to transmit their own data in a Time-division multiple access (TDMA) manner. Here, the time parameter $\alpha$ ($0 \le \alpha \le 1$) is jointly decided by PU and CUs, which is different from Ref.[25].

Fig. 1. System model of a CCRN

Assume that all the selected relays CU$_i \in S$ directly receive signals from PU and then forward them to PR in an AF manner. Given the transmission power level $P_0$ of PU and the transmitted signal symbol $x_p$ at PU, the received signal at CU$_i$ in the first phase can be written as

$$y_{0i} = \sqrt{P_0}\,|h_{0i}|\,x_p + n_0$$

and its amplified signal $y_{AF}$ is

$$y_{AF} = \beta_1 y_{0i}$$

where $\beta_1 = \sqrt{\dfrac{P_i}{P_0|h_{0i}|^2 + N_0}}$ [18] is an amplification factor, $P_i$ is the cooperative transmission power level at CU$_i$ and $n_0$ denotes Gaussian white noise with zero mean and variance $N_0$. Obviously, $\beta_1$ must satisfy the physical limitation of CU's cooperative power level $P_i$. In the second phase, assume all selected relays transmit their respective amplified signals to the receiver at the same time, so PR receives a superposition of all signals[17], which can be given as

$$y_0 = \sum_{i\in S} |h_{i0}|\,y_{AF} + n_0 = \sum_{i\in S} \frac{\sqrt{P_0 P_i}\,|h_{i0}||h_{0i}|}{\sqrt{P_0|h_{0i}|^2 + N_0}}\,x_p + \Big(\sum_{i\in S} \frac{\sqrt{P_i}\,|h_{i0}|}{\sqrt{P_0|h_{0i}|^2 + N_0}} + 1\Big) n_0$$

In the third phase, all the selected relays transmit their own data to SR using the same spectrum band. Given the transmission power $P_s$, which is the same for all CUs, and the transmitted signal $x_i$, the corresponding received signal $y_i$ at SR can be written as

$$y_i = \sqrt{P_s}\,|h_i|\,x_i + n_0$$

Obviously, the achieved transmission rate for CU$_i$ in the third phase is

$$R_i = B \log_2\Big(1 + \frac{|h_i|^2 P_s}{N_0}\Big), \quad \forall\, \mathrm{CU}_i \in S$$

From the above formulations, the selected CUs acquire access time in the third phase by cooperating with PU to forward data in the second phase, so the equivalent cooperative signal-to-noise ratio (SNR), with $k$ the number of relays in S, can be calculated as

$$\gamma_s = \Big(\sum_{i=1}^{k} \frac{\sqrt{P_0 P_i}\,|h_{i0}||h_{0i}|}{\sqrt{P_0|h_{0i}|^2 + N_0}}\Big)^2 \times \frac{1}{N_0\Big(\sum_{i=1}^{k}\Big(\frac{\sqrt{P_i}\,|h_{i0}|}{\sqrt{P_0|h_{0i}|^2 + N_0}}\Big)^2 + 1\Big)} = \frac{P_0\Big(\sum_{i=1}^{k} \frac{\sqrt{P_i}\,|h_{i0}||h_{0i}|}{\sqrt{P_0|h_{0i}|^2 + N_0}}\Big)^2}{N_0\Big(\sum_{i=1}^{k}\Big(\frac{\sqrt{P_i}\,|h_{i0}|}{\sqrt{P_0|h_{0i}|^2 + N_0}}\Big)^2 + 1\Big)} \tag{1}$$

To simplify the above expression, first define a parameter $\Phi_i$ as

$$\Phi_i = \frac{\sqrt{P_i}\,|h_{i0}|}{\sqrt{P_0|h_{0i}|^2 + N_0}}$$

then Eq.(1) can be rewritten as

$$\gamma_s = \frac{P_0\Big(\sum_{i=1}^{k} \Phi_i |h_{0i}|\Big)^2}{N_0\Big(\Big(\sum_{i=1}^{k} \Phi_i^2\Big) + 1\Big)} \tag{2}$$

Therefore, the achievable transmission rate of PU over the cooperation link can be written as follows (here we do not need to consider the SNR component caused by the direct transmission from PU, because we only calculate the profit of the cooperation):

$$R_0 = B \log_2\Bigg(1 + \frac{P_0\Big(\sum_{i=1}^{k} \Phi_i |h_{0i}|\Big)^2}{N_0\Big(\Big(\sum_{i=1}^{k} \Phi_i^2\Big) + 1\Big)}\Bigg) \tag{3}$$

where B is the channel bandwidth, and without loss of generality, it is set to be one band unit in the following sections.
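As a numerical illustration of Eqs.(2) and (3), the following minimal Python sketch evaluates the equivalent cooperative SNR and PU's cooperative rate for a given relay set. The function names and the sample values are hypothetical. The signal term uses the prefactor $P_0$, which is what squaring the $\sqrt{P_0 P_i}$ amplitude of the superposed signal gives; this is an assumption of the sketch.

```python
import math


def cooperative_snr(p0, powers, h_0i, h_i0, n0):
    """Equivalent cooperative SNR at PR in the form of Eq. (2).

    p0     : PU transmit power P0
    powers : cooperative powers P_i of the selected relays
    h_0i   : channel coefficients PU -> CU_i
    h_i0   : channel coefficients CU_i -> PR
    n0     : noise power N0
    """
    # Phi_i = sqrt(P_i) |h_i0| / sqrt(P0 |h_0i|^2 + N0)
    phi = [math.sqrt(p) * abs(g) / math.sqrt(p0 * abs(h) ** 2 + n0)
           for p, h, g in zip(powers, h_0i, h_i0)]
    signal = p0 * sum(f * abs(h) for f, h in zip(phi, h_0i)) ** 2
    noise = n0 * (sum(f * f for f in phi) + 1.0)
    return signal / noise


def cooperative_rate(p0, powers, h_0i, h_i0, n0, bandwidth=1.0):
    """PU's achievable rate over the cooperative link, Eq. (3)."""
    snr = cooperative_snr(p0, powers, h_0i, h_i0, n0)
    return bandwidth * math.log2(1.0 + snr)


# Hypothetical single-relay example: P0 = 1, P_1 = 1, unit channels, N0 = 1.
snr = cooperative_snr(1.0, [1.0], [1.0], [1.0], 1.0)
rate = cooperative_rate(1.0, [1.0], [1.0], [1.0], 1.0)
```

With a single relay the sketch returns an SNR of 1/3, which is the value of $P_0 P_i |h_{0i}h_{i0}|^2 / ((P_i|h_{i0}|^2 + P_0|h_{0i}|^2 + N_0)N_0)$ for the same inputs.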

IV. Utility Function and Optimization Strategy

In cooperative cognitive radio networks, PU leases its idle spectrum to CUs for a fraction of time in exchange for the cooperative retransmission of CUs, so the problem of sharing scarce spectrum bands in CCRNs can be transformed into a game among multiple players in some sense.

1. Utility function

For a selected cooperative relay CU$_i$, the cost of occupying access time is the transmission power it consumes to cooperate with PU, so the relationship between access time and cooperative power[25,31] can be defined as

$$c\,t_i = P_i |h_{0i}|^2 |h_{i0}|^2 \tag{4}$$

where $c$ is the price per unit of spectrum access time, which is determined by PU and CUs, and $t_i$ denotes the access time of CU$_i$. This equation implies that CUs are granted access time by PU to transmit their own data in exchange for their cooperation with PU, and if a CU wants more access time, it has to consume more power to cooperate with PU. Conversely, a CU can obtain more access opportunity by raising its cooperative power level. For each CU, both the profit it can achieve and the power cost of the cooperative data transmission are considered. Therefore, the utility function of a CU consists of two parts, profit and cost, and is defined as

$$U_i = w_1 R_i t_i - w_2 P_i \Big(\frac{1}{2}\alpha\Big) \tag{5}$$

where $w_1$ is the equivalent profit per data unit transmitted to SR, and $w_2$ is the equivalent cost per power unit consumed in cooperation with PU. Both parameters are predefined. In this paper, assume that all CUs are selfish and rational, so each CU tries to optimize its utility in some way.

For PU, the goal is to maximize its transmission rate in the cooperation process, so its utility is defined as

$$U_0 = w_p R_0 \Big(1 - \frac{1}{2}\alpha\Big) \tag{6}$$

where $w_p$ is the profit per data unit transmitted to PR. PU's strategy is to select the optimal set of CUs as collaborators, such that the set not only lets the CUs as a whole achieve optimal utility, but also maximizes PU's own utility.

2. CU's optimization strategy and Nash equilibrium

Given the price $c$ and the selected relay set S, in accordance with the previous utility definitions, every CU needs to consume part of its power to help PU transmit data. For PU, an optimal cooperative power level in the selected relay set should be considered in order to maximize its utility, which is discussed in the next section. For a CU, its utility is also closely related to the cooperative power; in other words, an optimal cooperative power level can optimize the utility of the selected relay set as a whole. Meanwhile, all CUs of the cooperative relay set S compete with each other to maximize their own utilities by setting appropriate cooperative powers, which forms a non-cooperative game among the selected CUs. Without loss of generality, the game is denoted as $G = \{S, \{P_i\}, \{U_i\}\}$, where S is the candidate node set, $P_i$ is the cooperative power of CU$_i$ and $U_i$ is the utility of CU$_i$. Each CU$_i \in S$ must select its power within the strategy space $P = [P_i]_{i\in S}$ to maximize its utility $U_i(P_i, P_{-i})$. We first analyze whether a Nash equilibrium (NE) point exists and whether it is unique.

Theorem 1 A Nash equilibrium exists in the non-cooperative power game $G = \{S, \{P_i\}, \{U_i\}\}$.
Proof See Appendix A.

Meanwhile, from Eq.(16), we can obtain the best-response function by setting the first derivative of $U_i$ with respect to $P_i$ to 0:

$$\frac{dU_i}{dP_i} = \frac{(w_1 R_i - 2 w_2 P_i)\,|h_{0i}|^2 |h_{i0}|^2}{c} - \frac{w_2 \sum_{j\in S,\, j\neq i} P_j |h_{0j}|^2 |h_{j0}|^2}{c} = 0 \tag{7}$$

By solving Eq.(7), we obtain the optimal cooperative power, denoted as $P_i^*$:

$$r_i(P) = P_i^* = \frac{w_1 R_i}{2 w_2} - \frac{\sum_{j\in S,\, j\neq i} P_j |h_{0j}|^2 |h_{j0}|^2}{2\,|h_{0i}|^2 |h_{i0}|^2} \tag{8}$$

Theorem 2 The non-cooperative game has a unique equilibrium.
Proof See Appendix B.
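The step from Eq.(5) to Eqs.(7) and (8) can be made explicit. The short derivation below assumes, as the TDMA assignment of the third phase implies, that the access times of the selected CUs add up to the third-phase length, $\sum_{j\in S} t_j = \frac{1}{2}\alpha$. The shorthand $g_i$ is introduced here only for brevity and is not part of the original notation.

```latex
% Shorthand (an assumption of this note): g_i = |h_{0i}|^2 |h_{i0}|^2,
% so that Eq.(4) reads t_i = P_i g_i / c.
\begin{align*}
U_i &= w_1 R_i t_i - w_2 P_i \sum_{j\in S} t_j
     = \frac{w_1 R_i P_i g_i}{c}
       - \frac{w_2 P_i}{c}\Big(P_i g_i + \sum_{j\in S,\, j\neq i} P_j g_j\Big),\\
\frac{\partial U_i}{\partial P_i}
    &= \frac{(w_1 R_i - 2 w_2 P_i)\, g_i}{c}
       - \frac{w_2}{c}\sum_{j\in S,\, j\neq i} P_j g_j = 0
    \;\Longrightarrow\;
    P_i^{*} = \frac{w_1 R_i}{2 w_2}
              - \frac{\sum_{j\in S,\, j\neq i} P_j g_j}{2 g_i},\\
\frac{\partial^2 U_i}{\partial P_i^2} &= -\frac{2 w_2 g_i}{c} < 0 .
\end{align*}
```

Since $R_i$ depends on $P_s$ and not on $P_i$, the utility is a concave quadratic in $P_i$, so the stationary point is the maximizer for fixed powers of the other relays.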

Theorem 3 The unique equilibrium for the non-cooperative power game is given by

$$P_i^* = \frac{2\Delta_i (1 + k) - 2\sum_{j\in S} \Delta_j}{(1 + k)\,|h_{0i}|^2 |h_{i0}|^2} \tag{9}$$

where $\Delta_i = (w_1 / 2w_2) R_i |h_{0i}|^2 |h_{i0}|^2$.
Proof See Appendix C.

In summary, we have proved in this section that a unique Nash equilibrium exists for a selected relay set, which optimizes each CU's utility. In other words, once a relay set is known, each relay in the set can calculate its optimal cooperative power according to Eq.(9), and then obtain its optimal utility. Therefore, in the following section, we only need to discuss how to determine a relay set that maximizes the utility of PU.
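The closed form of Eq.(9) can be checked numerically against the best-response function of Eq.(8). The sketch below is only an illustration: the function names and the three-relay values are hypothetical, and `gains[i]` stands for the product $|h_{0i}|^2|h_{i0}|^2$.

```python
import math


def equilibrium_powers(deltas, gains):
    """Closed-form equilibrium powers of Eq. (9).

    deltas[i] : Delta_i = (w1 / (2 w2)) * R_i * |h_0i|^2 * |h_i0|^2
    gains[i]  : |h_0i|^2 * |h_i0|^2
    """
    k = len(deltas)
    total = sum(deltas)
    return [(2.0 * d * (1 + k) - 2.0 * total) / ((1 + k) * g)
            for d, g in zip(deltas, gains)]


def best_response(i, powers, deltas, gains):
    """Best response of CU_i from Eq. (8), using w1 R_i / (2 w2) = Delta_i / gains[i]."""
    others = sum(p * g for j, (p, g) in enumerate(zip(powers, gains)) if j != i)
    return deltas[i] / gains[i] - others / (2.0 * gains[i])


# Hypothetical values for three relays.
deltas = [0.9, 0.8, 0.7]
gains = [1.0, 2.0, 4.0]
p_star = equilibrium_powers(deltas, gains)

# At the equilibrium every relay's power equals its own best response.
is_fixed_point = all(
    math.isclose(best_response(i, p_star, deltas, gains), p_star[i], abs_tol=1e-12)
    for i in range(len(p_star))
)
```

For these inputs the equilibrium powers are 0.6, 0.2 and 0.05, and no relay can improve its utility by deviating unilaterally. Eq.(9) can also return a negative value for a relay whose $\Delta_i$ is far below the average; the sketch does not clip such values.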

V. PU's Maximized Utility and Relay Selection in CCRNs

On the one hand, according to the previous sections, we can calculate each CU's access time in the third phase from Eqs.(4) and (9):

$$t_i^* = \frac{2(1 + k)\Delta_i - 2\sum_{j\in S} \Delta_j}{c(1 + k)} \tag{10}$$

As previously stated, the third phase is assigned to the selected CUs for access to the licensed spectrum in a Time-division multiple access (TDMA) manner, therefore we can obtain the relationship between $c$ and $\alpha$ by summing up all access times:

$$\sum_{i\in S} t_i^* = \frac{2(1 + k)\sum_{i\in S} \Delta_i - 2k \sum_{i\in S} \Delta_i}{c(1 + k)} = \frac{2\sum_{i\in S} \Delta_i}{c(1 + k)} = \frac{1}{2}\alpha$$

that is

$$\alpha = \frac{4\sum_{i\in S} \Delta_i}{c(1 + k)} \tag{11}$$

From this equation, once the price $c$ is determined, the whole time slot assignment is also determined. Meanwhile, according to Eqs.(10) and (11), we can obtain

$$\frac{t_i^*}{\sum_{j\in S} t_j^*} = \frac{(1 + k)\Delta_i - \sum_{j\in S} \Delta_j}{\sum_{j\in S} \Delta_j}$$

For each selected relay, the ratio of access time in the third phase is determined, which means that the assigned access time for each selected CU can also be known once $c$ is given.

On the other hand, PU's objective is to maximize its utility by selecting the best relay set:

$$\max_{c>0} U_0 = \max\Big\{ w_p R_0 \Big(1 - \frac{1}{2}\alpha\Big) \Big\} = \max\Big\{ w_p R_0 \Big(1 - \sum_{i\in S} t_i^*\Big) \Big\} = \max\Big\{ w_p R_0 \Big(1 - \frac{2\sum_{i\in S} \Delta_i}{c(1 + k)}\Big) \Big\} \tag{12}$$

To resolve the above equation, we first consider the value of $c$. Given the utility of PU in the case of direct transmission as

$$U_{dir} = w_p B \log_2\Big(1 + \frac{P_0 |h_0|^2}{N_0}\Big)$$

the major goal of this paper is to ensure that the utility $U_0$ in the case of cooperative transmission exceeds $U_{dir}$, that is to say, $U_0 \ge U_{dir}$, so we have

$$U_0 = w_p R_0 \Big(1 - \frac{2\sum_{i\in S} \Delta_i}{c(1 + k)}\Big) \ge w_p \frac{P_0 |h_0|^2}{N_0} \tag{13}$$

By solving Eq.(13), we obtain the lower bound of price $c$:

$$c \ge \frac{2 R_0 N_0 \sum_{i\in S} \Delta_i}{(1 + k)(R_0 N_0 - B P_0 |h_0|^2)}$$

However, the price $c$ cannot stay large all the time. From Eq.(10), note that $t_i^*$ is inversely proportional to $c$. When $c$ is too large, CUs obtain too little time to access PU's spectrum while the cooperative power is unaffected, which yields an undesirable utility according to Eq.(5). So given the minimum required utility of a CU, $U_{smin}$ ($\ge 0$), we have

$$U_i = w_1 R_i t_i^* - w_2 P_i^* \Big(\frac{1}{2}\alpha\Big) \ge U_{smin}$$

Substituting Eqs.(9) and (10) into the above inequality, we obtain

$$c \le \frac{2 w_2 P_i^* \Big((1 + k)\Delta_i - \sum_{j\in S} \Delta_j\Big)}{(1 + k)\,U_{smin}}$$

An optimal price $c$ in this range can only be determined according to the relay set which maximizes the utilities of both PU and CUs. Different optimization strategies exist: for CUs, different optimal cooperative powers are calculated for different relay sets; for PU, just one suitable set is optimal for improving its transmission rate.
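The relations among the price $c$, the access times and the slot fraction $\alpha$ can be sketched in a few lines of Python. The function names and numbers below are hypothetical, and the lower bound follows the inequality of Eq.(13) as stated, which is only meaningful when $R_0 N_0 > B P_0 |h_0|^2$.

```python
import math


def access_times(deltas, price):
    """Access time of each selected CU, Eq. (10)."""
    k = len(deltas)
    total = sum(deltas)
    return [(2.0 * (1 + k) * d - 2.0 * total) / (price * (1 + k)) for d in deltas]


def alpha_from_price(deltas, price):
    """Slot fraction alpha fixed by the price c, Eq. (11)."""
    return 4.0 * sum(deltas) / (price * (1 + len(deltas)))


def price_lower_bound(deltas, r0, n0, p0, h0, bandwidth=1.0):
    """Lower bound on c obtained by solving Eq. (13) for c."""
    k = len(deltas)
    margin = r0 * n0 - bandwidth * p0 * abs(h0) ** 2
    if margin <= 0:
        # No positive price can satisfy Eq. (13) in this case.
        raise ValueError("Eq. (13) cannot be satisfied by any positive price")
    return 2.0 * r0 * n0 * sum(deltas) / ((1 + k) * margin)


# Hypothetical values for three relays and a price of 4.
deltas = [0.9, 0.8, 0.7]
times = access_times(deltas, 4.0)
alpha = alpha_from_price(deltas, 4.0)
c_min = price_lower_bound(deltas, r0=2.0, n0=1.0, p0=1.0, h0=1.0)
```

Here the access times are 0.15, 0.10 and 0.05, which sum to $\alpha/2 = 0.3$ as the TDMA assignment requires, and the lower bound on the price evaluates to 2.4.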

From Eq.(12), maximizing PU's utility is equivalent to improving its transmission rate. According to Eqs.(2) and (3), this requires maximizing the cooperative SNR $\gamma_s$ at the destination PR, so the problem can be transformed into finding a relay set which maximizes $\gamma_s$ at the destination. The optimization problem is

$$\max_{P_0, P_i}\ \gamma_s \quad \text{s.t.} \quad P_0 \le P_0^{\max}, \quad P_i = \min\{P_i^{\max}, P_i^*\},\ i \in S \tag{14}$$

Obviously, the optimal cooperative power $P_i^*$ affects $\gamma_s$ to a large extent. To solve this problem, we adopt an iterative strategy to select relays on the basis of the optimized cooperative power. Let $\gamma_i$ denote the SNR from CU$_i$:

$$\gamma_i = \frac{P_0 P_i |h_{0i} h_{i0}|^2}{(P_i |h_{i0}|^2 + P_0 |h_{0i}|^2 + N_0)\,N_0}$$

Obviously, the total cooperative SNR $\gamma_s$ in Eq.(2) is not a simple sum of all $\gamma_i$. Note that $\gamma_i$ is related not only to the channel coefficients $h_{0i}$ and $h_{i0}$, but also to the cooperative power level of CU$_i$. In each iteration round, the cooperative transmit power $P_i$ varies with the number of relays remaining in the iteration, so how to remove or update relay nodes in the iteration process is the key to solving the optimization problem. For ease of discussion, we first introduce a new harmonic mean factor that takes the transmitting power of PU and the cooperative power of CU$_i$ into account, which is entirely different from previous factors decided only by channel information:

$$H(h_{i0}, h_{0i}, P_i^*) = \Big(\frac{1}{P_0 |h_{0i}|^2} + \frac{1}{P_i^* |h_{i0}|^2}\Big)^{-1} = \frac{P_0 P_i^* |h_{0i} h_{i0}|^2}{P_0 |h_{0i}|^2 + P_i^* |h_{i0}|^2} \tag{15}$$

Note that Eq.(15) is closely related to the received SNR when only CU$_i$ cooperates with PU. The larger the factor is, the greater the received SNR is. Since the equivalent cooperative SNR $\gamma_s$ at PR combines the individual SNRs $\gamma_i$ in a complicated manner, we use this factor in place of the SNR in the following sections to simplify the discussion. The principle behind this definition is that the relays that contribute more to the SNR $\gamma_s$ are selected to relay data for PU while the relays that contribute little or even suppress the total SNR are removed. Therefore, $\gamma_s$ at PR is maximized. Based on this factor, a so-called timer is further introduced in the selection algorithm for each CU$_i$, which is proportional to the harmonic factor:

$$T_i = \lambda H(h_{i0}, h_{0i}, P_i^*)$$

where $\lambda$ is a constant. The virtual timer $T_i$ is only used to update the set of cooperating relays by removing, in each iteration round, the relay with the smallest harmonic factor until $\gamma_s$ reaches its optimal point; that is to say, as all timers count down, the first relay whose timer reaches zero is removed. From Eq.(15), we have learned that the harmonic mean factor is closely related to the cooperative power level of CU$_i$ and the channel coefficients, so a larger harmonic factor means better channel gain or higher cooperative power level and is more helpful to improve the total SNR. Otherwise, the cooperative power needs to be adjusted to improve the total SNR by removing some relays.

$$[T_i, ID] = \min\{T_i\}$$

where $[T_i, ID]$ represents the minimal timer and its corresponding node identification.

Algorithm 1 The GTMRS algorithm
each relay in S calculates its Δi;
PR calculates the sum Σ_{i∈S} Δi;
for k = relaynum : −1 : 1
    each relay in S calculates its power Pi∗ and PR calculates γs;
    if (γs > γ0) then
        CUi calculates its own timer Ti;
        find the minimum timer: [Ti, ID] = min{Ti};
        remove CUi (i = ID) from S, that is S = S \ {ID};
        γ0 = γs;
    else
        break
    end if
end for
for t = 0 : 1 : |S|
    each CUi (∈ S) calculates its utility Ui (i ∈ S)
    find the minimum utility: [Ui, ID] = min{Ui (i ∈ S)};
    if min{Ui} < Usmin then
        remove CUi (i = ID) from S, S = S \ {ID}, and inform PR;
        update cooperative power and utility until min{Ui} > Usmin;
    else
        break
    end if
end for
Output: S, γs, Ui (CUi ∈ S)
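The timer-based removal step of Algorithm 1 can be illustrated in isolation. The following sketch computes the harmonic mean factor of Eq.(15) for each relay and returns the relay whose virtual timer would expire first. The function names, the dictionary layout and the sample values are hypothetical, and a positive $\lambda$ is assumed so that the smallest timer corresponds to the smallest factor.

```python
def harmonic_factor(p0, p_i, h_0i, h_i0):
    """Modified harmonic mean factor H of Eq. (15); assumes a + b is nonzero."""
    a = p0 * abs(h_0i) ** 2   # P0 |h_0i|^2
    b = p_i * abs(h_i0) ** 2  # P_i^* |h_i0|^2
    return a * b / (a + b)


def relay_to_remove(p0, relays, lam=1.0):
    """Return the ID whose virtual timer T_i = lam * H is the smallest.

    relays maps a relay ID to a tuple (P_i^*, h_0i, h_i0).
    """
    timers = {rid: lam * harmonic_factor(p0, p, h, g)
              for rid, (p, h, g) in relays.items()}
    return min(timers, key=timers.get)


# Hypothetical relay set: ID -> (P_i^*, h_0i, h_i0)
relays = {1: (0.6, 1.0, 1.0), 2: (0.2, 1.0, 2.0), 3: (0.05, 1.0, 2.0)}
first_out = relay_to_remove(1.0, relays)
```

In this example relay 3 has the smallest factor (about 0.167 against 0.375 and 0.444), so it is the first candidate for removal. A full implementation would recompute the equilibrium powers and $\gamma_s$ after every removal, as the listing prescribes.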

In the process of seeking the best relay set to maximize the total SNR at PR, we have to adopt an iterative selection strategy. In every round, each relay node updates its cooperative power level, and then the total SNR is calculated and compared with the former value. This iterative algorithm is therefore a distributed process; its pseudo-code is shown in Algorithm 1, which is designed on the basis that a unique Nash equilibrium exists for a selected relay set. That is to say, if the cooperative powers in a relay set are not determined, the SNR at PR cannot be calculated to maximize PU's utility. Note that information is exchanged among CUs and PR to support actions such as calculating, deleting and updating, which is the basis of the algorithm. With the algorithm, an optimal relay set can be selected to achieve the maximum benefit of PU and CUs. Since the existence of a Nash equilibrium in the non-cooperative power game among the participating relay nodes has been proved, the selected CUs are ensured to reach their optimal utilities. The remaining task is to guarantee the utility maximization of PU, which amounts to maximizing the SNR at PR. However, the total SNR is influenced by the cooperative power level of each participating relay, and the cooperative power level is closely related to the number of participating relays, so the relays whose participation suppresses the SNR need to be removed. Therefore, we introduce the harmonic mean factor H, which is decided by the cooperative power level of relays and the channel gain between nodes. This approach resolves, to a certain extent, the relationship among the number of relay nodes, the cooperative power level and the SNR.

VI. Performance Analysis

The simulation scenario is shown in Fig.2. The coordinates of PU, PR and SR are set as (0, 150), (300, 150), and (250, 260) respectively, and there are N (= 10) CUs randomly distributed in the area between PU and PR. To ease the following discussion, all CUs are indexed with a number as shown in Fig.2.

Fig. 2. The simulation scenario

Assume that the channel gain between any two nodes is $h = 1/d^{\eta}$, where $d$ is the distance and $\eta$ (= 2) is the path loss exponent. In the simulation, one time slot is assumed to be 1 second. The transmission power of PU is initialized as $P_0 = 1$ W, and the transmission powers of CUs in the third phase are all set as $P_s = 0.8$ W, while their maximum transmit powers in the second phase are $P_i^{\max} = 1$ W. Also assume that the noise power spectral density is $N_0 = -100$ dBW/Hz. The revenue-weighted parameters for the primary user and the CUs are $w_p = 10$ and $w_1 = 10^3$, respectively, and the cost-weighted parameter for CUs is $w_2 = 500$. All CUs' minimum utilities are set to be $U_i^{\min} = 0.2$.
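The geometric part of this setup is easy to reproduce. The sketch below derives channel gains from node coordinates with the path loss model $h = 1/d^{\eta}$ and converts the noise level from dBW to linear scale. The helper names and the example CU position are hypothetical, since the random CU coordinates used in the paper are not listed.

```python
import math


def channel_gain(a, b, eta=2.0):
    """Distance-based channel gain h = 1 / d^eta between points a and b."""
    return 1.0 / math.dist(a, b) ** eta


def dbw_to_linear(level_dbw):
    """Convert a level in dBW to linear scale (watts)."""
    return 10.0 ** (level_dbw / 10.0)


# Node coordinates from the simulation scenario.
PU, PR, SR = (0.0, 150.0), (300.0, 150.0), (250.0, 260.0)

h0 = channel_gain(PU, PR)        # direct PU -> PR link, d = 300
n0 = dbw_to_linear(-100.0)       # noise level of -100 dBW in linear scale

cu = (150.0, 180.0)              # hypothetical CU position between PU and PR
h_0i = channel_gain(PU, cu)      # PU -> CU_i
h_i0 = channel_gain(cu, PR)      # CU_i -> PR
h_i = channel_gain(cu, SR)       # CU_i -> SR
```

With $\eta = 2$ the direct link gain is $1/300^2$, and a CU placed midway between PU and PR sees equal gains on both hops.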

Fig. 3. The influence of price c on utilities of PU and CUs

First we discuss how to set up the price c. Fig.3 shows that the price c per unit access time has fundamental impact on the utilities of PU and all selected CUs, where the left vertical axis represents utilities of the selected CUs while the right one denotes primary user’s utility. It is noted that, as price c increases from 0 to 2.5 × 10−6 , PU’s utility increases from 182 to 198 while the utility of CU10 decreases more quickly from 260 to 20. And after price c being larger than 0.3 × 10−6 , all utilities gradually become stable and different CUs’ utilities are drawn close. The possible reason is that, for CUs, price c and the access time are inversely proportional when the number of selected relays keeps unchangeable and their optimal cooperative powers even are not affected by the price c according to Eq.(9), therefore the revenue in the third phase will be reduced as price c increases, which leads to the reduction of CUs’ utilities according to Eq.(5). For PU, when c increases, the reduction of the third phase prolongs the other two phases, which will improve PU’s utility according to Eq.(12). Considering that price c is determined by PU, here we set c = 0.3 × 10−6 in this paper. Table 1 shows the relationship between channel status information Δi and harmonic factor Hi where the last row lists the final value of each CUi after the so-called noncooperative game. It is noted that the relays 2, 4, 5, 8, 9 have very small or even negative initial factors, so finally are eventually removed from the candidate set. The reasonable explanation to this phenomenon is that the whole throughput can be optimized by removing these relays because they only have minute or even negatively effect on the total SNR, which is partly proved by Fig.4. Fig.4 shows the optimal cooperative power, access time and utilities of the selected relays after the noncooperative process. In Table 1 and Fig.4, note that the

Multiple Relay Selection Based on Game Theory in Cooperative Cognitive Radio Networks

ID Δi (×10−7 ) Initial Hi (×10−3 ) Final Hi (×10−3 )

1 0.7837 0.0770 0.0787

Table 1. CUs’ channel status Δi and harmonic factor H 2 3 4 5 6 7 0.6596 0.8811 0.6493 0.6404 0.9202 0.7042 0.0318 0.0775 0.0224 –0.0120 0.0816 0.0711 0 0.1290 0 0 0.1438 0.0187

8 0.6390 –0.4010 0

9 0.6484 –0.0210 0

631

10 0.9008 0.0904 0.1745

Fig. 4. Optimal cooperative power, access time and utilities. (a) Optimized cooperative power; (b) Optimal access time; (c) Utilities

CUs with higher Δi and Hi will use larger power to cooperate and purchase more channel access time, which lead to more revenue in the cooperation process. To explain this, recall that, for a CU, higher Δi means better channel condition, which will increase the probability that PU will select it as relay node. Meanwhile, larger harmonic mean factor Hi indicates that it can help PU to transmit data with larger cooperative power and would contribute more SNR to the total SNR. In other words, a CU having preferred channel gain and higher cooperative power will obtain more access time than others according to Eq.(4). As a result of this, owing to having more access time and large channel gains, these CUs will obtain larger utilities. Fig.5 shows CUs’ access times and utilities in GTMRS and SGRS. It is obvious that in Fig.5(a), each selected relay has more access time in GTMRS than in SGRS and there are more CUs involved in cooperation with PU in GTMRS even though some of them only obtain minute

access time. This effect is reasonable because CR technology is introduced to allow more CUs to share the spectrum without affecting PU. Fig.5(b) shows PU's utilities with different relay numbers in GTMRS and SGRS; PU's utility in GTMRS is larger than in SGRS. An important reason is that GTMRS is based on the idea of maximizing the transmission rate, which is the key factor in improving PU's utility. Therefore, GTMRS enables more CUs to participate in the cooperation with PU and obtain more access time, even though their utilities decrease compared to SGRS. In return, PU gains a larger utility. Fig.6 shows the number of selected relays in GTMRS and SGRS; more CUs are selected in GTMRS than in SGRS. This result is one of the goals of cooperative cognitive radio networks, because more CUs participating in the cooperation with PU means that a Nash equilibrium point among CUs is combined with the maximization of PU's transmission rate.

Fig. 5. Performance comparison of GTMRS and SGRS. (a) Optimal access time; (b) Utilities
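As an illustration of the inverse relation between price and access time discussed in this section, the following Python sketch evaluates ti = Pi |h0i|² |hi0|² / c, obtained by rearranging the relation Pi = c ti / (|h0i|² |hi0|²) used in Appendix A, for a fixed cooperative power. The function name and all numerical values are hypothetical and are not taken from the paper's simulation setup.

```python
def access_time(power, gain_product, price):
    """Access time t_i = P_i * |h_0i|^2 * |h_i0|^2 / c for a fixed cooperative power."""
    if price <= 0:
        raise ValueError("price must be positive")
    return power * gain_product / price


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the 1/c scaling.
    power = 0.01          # cooperative power P_i
    gain_product = 1e-6   # |h_0i|^2 * |h_i0|^2
    for price in (0.1e-6, 0.3e-6, 1.0e-6):
        print(f"c = {price:.1e}  ->  t_i = {access_time(power, gain_product, price):.4f}")
```

With the power held fixed, doubling c halves the access time, which is the mechanism the text gives for the drop in CUs' revenue in the third phase as c increases.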

VII. Conclusions

This paper focuses on how to select a proper set of cognitive users to serve as cooperative relays for PU's transmission in an overlay model, and how CUs can obtain their

Fig. 6. Number of the selected CUs in GTMRS and SGRS

optimal cooperative powers and access opportunities (access times). In the proposed algorithm GTMRS, a cooperative spectrum-leasing mechanism is formulated based on a pricing payment game between PU and multiple CUs. PU allows CUs to access its spectrum for a portion of


time, and in return, CUs consume part of their own power to forward data for PU. In the sharing mechanism, each cognitive user's cooperative power level is positively correlated with its access time, which promotes cooperation between CUs and PU and also makes CUs more selfish. Therefore, there exists a non-cooperative game among the selected CUs, because the optimal cooperative power allocation among CUs means the maximal utility for PU and CUs as a whole. We prove that a unique Nash equilibrium point exists in the power game and that it is also the optimal point. Based on this result, an iterative GTMRS algorithm is designed to maximize the utilities of PU and CUs. Simulation results show that the primary user's utility is optimized, more CUs are allowed to participate in the cooperation, and longer access time is assigned to each CU. In future work, we will extend these ideas to situations with multiple PUs.

Appendix A

Proof of Theorem 1

Definition 1 A power vector $P^* = (P_1^*, P_2^*, \cdots, P_k^*)$ is a Nash equilibrium of game $G = \{S, \{\mathcal{P}_i\}, \{U_i\}\}$ if, for any relay $CU_i \in S$, $u_i(P_i^*, P_{-i}^*) \ge u_i(P_i, P_{-i}^*)$ for all $P_i \in \mathcal{P}_i$, where $u_i(P_i, P_{-i})$ is the resulting utility of $CU_i \in S$ given the other players' power selection result $P_{-i}$.

Proposition 1 A Nash equilibrium exists in game $G = \{S, \{\mathcal{P}_i\}, \{U_i\}\}$ if, for all $CU_i \in S$:
a) $\mathcal{P}_i$ is a non-empty, convex, and compact subset of some Euclidean space $\mathbb{R}^K$.
b) $U_i(P)$ is continuous in the power domain $P$ and concave in $P_i$.

Proof of Theorem 1 Obviously, the power strategy space $\mathcal{P} = [\mathcal{P}_i]_{i \in S}$ is a non-empty, convex and compact subset of the Euclidean space $\mathbb{R}^K$. From Eq.(5), $U_i(P)$ is continuous in $P$. Note that the total access time of all selected relays that have participated in the cooperation equals $0.5\alpha = t_i + \sum_{j \in S, j \neq i} t_j$, which is the duration of the third phase, and the cooperative power of relay $CU_i \in S$ can be rewritten as $P_i = \frac{c\,t_i}{|h_{0i}|^2 |h_{i0}|^2}$. So we can rewrite Eq.(5) as

$$U_i = \frac{P_i |h_{0i}|^2 |h_{i0}|^2 (w_1 R_i - w_2 P_i)}{c} - w_2 P_i \frac{\sum_{j \in S, j \neq i} P_j |h_{0j}|^2 |h_{j0}|^2}{c} \tag{A-1}$$


Therefore, a Nash equilibrium exists in the noncooperative power game.

Appendix B

Proof of Theorem 2

According to Theorem 1, we know that at least one Nash equilibrium, denoted $r(P)$, exists in the non-cooperative power game. First, the best-response power $P_i^*$ is positive, so $r(P) > 0$. Second, if $P$ and $P'$ are different power vectors with $P' \ge P$, which means $P_i' \ge P_i$, we have $dr_i(P)/dP_i = 0$ and $dr_j(P)/dP_i = -(1/2) < 0$ in accordance with Eq.(8). So for $\forall i, j \in S$ with $i \neq j$, both $r_i([P_1, P_2, \cdots, P_i', \cdots, P_k]) \le r_i([P_1, P_2, \cdots, P_i, \cdots, P_k])$ and $r_j([P_1, P_2, \cdots, P_i', \cdots, P_k]) \le r_j([P_1, P_2, \cdots, P_i, \cdots, P_k])$ are satisfied. Thus, the best-response power function is monotonic. Finally, the last property is satisfied because, for $\mu > 1$, we have

$$\mu r(P) - r(\mu P) = \mu\left(\frac{w_1 R_i}{2 w_2} - \frac{\sum_{j \in S, j \neq i} P_j |h_{0j}|^2 |h_{j0}|^2}{2 |h_{0i}|^2 |h_{i0}|^2}\right) - \left(\frac{w_1 R_i}{2 w_2} - \frac{\mu \sum_{j \in S, j \neq i} P_j |h_{0j}|^2 |h_{j0}|^2}{2 |h_{0i}|^2 |h_{i0}|^2}\right) = (\mu - 1)\frac{w_1 R_i}{2 w_2} > 0 \tag{A-2}$$

In conclusion, the best-response function $r(P)$ is a standard function; therefore the Nash equilibrium of the non-cooperative power selection game is unique.
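The fixed point of the best-response function can also be approached numerically. The Python sketch below is an illustration only, not the paper's GTMRS algorithm: it uses the equilibrium relation $2P_i^*|h_{0i}|^2|h_{i0}|^2 = 2\Delta_i - \sum_{j \in S, j \neq i} P_j^*|h_{0j}|^2|h_{j0}|^2$ stated in Appendix C, substitutes $x_i = P_i|h_{0i}|^2|h_{i0}|^2$ (a shorthand introduced here), and updates one relay at a time. The function names, the sequential update order, the zero starting point and the stopping tolerance are all assumptions of this sketch; the $\Delta_i$ values of relays 1, 3, 6, 7 and 10 from Table 1 serve only as sample inputs.

```python
def best_response_sweep(x, deltas):
    """Update every relay once, in order, using the latest values of the others.

    x[i] stands for P_i * |h_0i|^2 * |h_i0|^2 (a substitution made for this sketch).
    The update x_i = delta_i - 0.5 * sum_{j != i} x_j rearranges
    2*x_i = 2*delta_i - sum_{j != i} x_j.
    """
    for i in range(len(x)):
        others = sum(x) - x[i]
        x[i] = deltas[i] - 0.5 * others
    return x


def iterate_to_equilibrium(deltas, tol=1e-12, max_sweeps=10_000):
    """Repeat sequential sweeps from zero until every entry changes by less than tol."""
    x = [0.0] * len(deltas)
    for sweep in range(1, max_sweeps + 1):
        previous = list(x)
        best_response_sweep(x, deltas)
        if max(abs(a - b) for a, b in zip(x, previous)) < tol:
            return x, sweep
    raise RuntimeError("best-response iteration did not converge")


if __name__ == "__main__":
    # Delta_i of relays 1, 3, 6, 7 and 10 from Table 1, in units of 1e-7.
    deltas = [0.7837, 0.8811, 0.9202, 0.7042, 0.9008]
    x, sweeps = iterate_to_equilibrium(deltas)
    print("sweeps:", sweeps)
    print("P_i*|h_0i|^2|h_i0|^2 (x 1e-7):", [round(v, 4) for v in x])
```

The sequential order is a deliberate choice in this sketch: if all relays were updated simultaneously from the same old vector, the iteration would not be guaranteed to converge once three or more relays take part.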

Appendix C

Proof of Theorem 3

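According to Eq.(8) and Theorem 1, we have $2P_i^* |h_{0i}|^2 |h_{i0}|^2 = 2\Delta_i - \sum_{j \in S, j \neq i} P_j^* |h_{0j}|^2 |h_{j0}|^2$, so $P_i^* |h_{0i}|^2 |h_{i0}|^2 + \sum_{j \in S} P_j^* |h_{0j}|^2 |h_{j0}|^2 = 2\Delta_i$. Note that this result is equivalent to $k$ equations, where $i = 1, 2, \cdots, k$ and $k = |S|$:

$$\begin{cases}
P_1^* |h_{01}|^2 |h_{10}|^2 + \sum_{j \in S} P_j^* |h_{0j}|^2 |h_{j0}|^2 = 2\Delta_1 \\
P_2^* |h_{02}|^2 |h_{20}|^2 + \sum_{j \in S} P_j^* |h_{0j}|^2 |h_{j0}|^2 = 2\Delta_2 \\
\qquad \vdots \\
P_k^* |h_{0k}|^2 |h_{k0}|^2 + \sum_{j \in S} P_j^* |h_{0j}|^2 |h_{j0}|^2 = 2\Delta_k
\end{cases}$$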
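The $k$ equations above form a linear system that can be solved directly. The Python sketch below is an illustration under stated assumptions, not the paper's own procedure: it writes $x_i = P_i^*|h_{0i}|^2|h_{i0}|^2$ (a shorthand introduced here), sums all $k$ equations to obtain $(k+1)\sum_j x_j = 2\sum_i \Delta_i$, and then back-substitutes. The function names and the sample inputs are hypothetical.

```python
def solve_equilibrium_products(deltas):
    """Solve x_i + sum_j x_j = 2*delta_i for all i, with x_i = P_i* |h_0i|^2 |h_i0|^2."""
    k = len(deltas)
    # Summing the k equations gives (k + 1) * total = 2 * sum(deltas).
    total = 2.0 * sum(deltas) / (k + 1)
    return [2.0 * d - total for d in deltas]


def powers_from_products(products, gain_products):
    """Recover P_i* = x_i / (|h_0i|^2 |h_i0|^2) from the solved products."""
    return [x / g for x, g in zip(products, gain_products)]


if __name__ == "__main__":
    # Hypothetical inputs, chosen only for illustration.
    deltas = [0.9, 0.8, 0.7]
    gain_products = [2.0, 1.0, 0.5]   # |h_0i|^2 * |h_i0|^2 for each relay
    x = solve_equilibrium_products(deltas)
    print("x_i:", x)
    print("P_i*:", powers_from_products(x, gain_products))
```

A negative entry in the returned list would mean that the system has no solution with all powers positive for that set of relays.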

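Now take the second-order derivative of $U_i$ in Eq.(A-1) with respect to $P_i$ to prove its concavity.

$$\frac{dU_i}{dP_i} = \frac{(w_1 R_i - 2 w_2 P_i)|h_{0i}|^2 |h_{i0}|^2 - w_2 \sum_{j \in S, j \neq i} P_j |h_{0j}|^2 |h_{j0}|^2}{c}$$

$$\frac{d^2 U_i}{dP_i^2} = -\frac{2 w_2 |h_{0i}|^2 |h_{i0}|^2}{c}$$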