Dynamic Link Selection and Power Allocation with ... - IEEE Xplore

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2017.2723048, IEEE Access 1

Dynamic Link Selection and Power Allocation with Reliability Guarantees for Hybrid FSO/RF systems Yan Wu, Qinghai Yang, Daeyoung Park, and Kyung Sup Kwak

Abstract—In this paper, we study transmission strategies for hybrid free space optical (FSO)/radio frequency (RF) systems by jointly considering link selection, power allocation, and reliability guarantees. Specifically, under an explicit long-term average reliability requirement, the transmitter in the hybrid FSO/RF system makes decisions about which links should be selected and how much power should be allocated to the corresponding active link. The problem of minimizing the power consumption cost while guaranteeing packet success-probability requirements and peak and average power constraints is considered and formulated as a stochastic problem. Using the Lyapunov optimization techniques, we solve this problem and derive closed-form power allocation solutions for different link selection modes. Furthermore, we design a dynamic link selection and power allocation (DLSPA) algorithm that can arbitrarily push the consumed power approach to the optimum at the expense of a tradeoff over reliability queue occupancy. Simulation results verify the theoretical analysis and validate the performance superiority of our proposed scheme. Index Terms—Hybrid FSO/RF, reliability-guarantees, link selection, power control, Lyapunov optimization.

I. I NTRODUCTION The ever-growing demands for higher data rates along with enormous new wireless applications challenge the availability of radio frequency (RF) spectrum resources. Free space optical (FSO) systems that utilize the large licence-free optical spectrum are considered a promising solution to this challenge [1]. One advantage of the FSO system is that it causes no interference with existing RF communications systems, because the optical spectrum is a totally different part of the electromagnetic spectrum [2]. Furthermore, atmospheric conditions have different effects on FSO and RF links. For example, heavy rain is the main degrading factor in RF links, whereas it has no particular effect on FSO links. Conversely, FSO links are susceptible to fog, whereas RF link performance is not significantly affected by this factor. These properties promote the creation of hybrid FSO/RF communication systems, which have received much attention from both regulatory and academic bodies (see Borah et al. [3] and the references therein). *This research was supported in part by NSF China (61471287), 111 Project (B08038) and MSIP, Korea, under ITRC Program (IITP-2016-H8501-16-1019) supervised by IITP. Y. Wu and Q. Yang are with State Key Lab. of ISN, School of Telecom. Engineering, and also with Collaborative Innovation Center of Information Sensing and Understanding, Xidian University, No.2 Taibainan-lu, Xian, 710071, Shaanxi, China. (Email: [email protected]). D. Park and K.S. Kwak are with Department of Information and Communication Engineering, Inha University, #100 Inha-Ro, Nam-gu, Incheon, 22212, Korea. (Email:[email protected]; [email protected]).

Recently, link selection and power control schemes have been investigated extensively for the hybrid FSO/RF systems. First, for works focusing on link selection between FSO and RF, Usman et al. [4] proposed a hard switching protocol, and Zhang et al. [5] proposed a soft switching protocol, in which the more reliable link is selected for data transmission. However, hard switching may cause undesirably frequent hardware switching between FSO and RF links [6]. On the other hand, the soft switching requires both of FSO and RF links to be active even when a single FSO or RF link can support the required QoS, resulting in power waste. Secondly, for works focusing on power control in hybrid FSO/RF systems, Moradi et al. [7] applied a water-filling power adaptation scheme only on the FSO link. Letzepis et al. [8], applied power adaptation on both FSO and RF links of a hybrid FSO/RF system, assuming that both links are active all the time but are transmitting at different rates. Rakia et al. [9], proposed adaptive power control and adaptive combing of the RF and FSO signals, but they assumed that the FSO link is active all the time with constant transmitted power. Apart from these research inadequacies for the two individual issues, efforts are also lacking on designing a transmission strategy that jointly considers link selection, power allocation, and reliability guarantees in a hybrid FSO/RF system. Motivated by the aforementioned concerns, in this paper, we study an optimal transmission strategy for hybrid FSO/RF systems by jointly considering link selection, power allocation, and reliability guarantees. Specifically, under an explicit longterm average reliability requirement, the transmitter in a hybrid FSO/RF system makes decisions on which links should be chosen and how much power should be allocated to the links that are active at that time. The problem of minimizing the power consumption cost while guaranteeing packet successprobability requirements and peak and average power constraints is considered and formulated as a stochastic problem. Using the Lyapunov optimization techniques [10], [11], [12], we solve this problem and derive closed-form power allocation solutions for different link selection modes. Furthermore, we design a dynamic link selection and power allocation (DLSPA) algorithm that can arbitrarily push the consumed power approach to an optimal value at the expense of a tradeoff over reliability queue occupancy. Note that similar work considering link-layer transmission policies was studied in [13]. The main difference between our work and [13] is that we consider delay-limited scenario, in which the transmitter has to immediately send the packets

2169-3536 (c) 2017 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


RF Transmission

RF Link

RF Reception

Data In OFF

OFF FSO Link

FSO Transmission Transmitter

FSO Reception Receiver

Fig. 1: An illustration for a hybrid FSO/RF system.

to the receiver, or dropped; while authors in [13] considered delay-tolerant scenario, in which the transmitter is allowed to store the packets in its buffer and send them to the receiver when the quality of channel state is favorable. Thus, our work and [13] are complementary. In particular, the delaylimited scenario is justified for delay-sensitive traffic, such as mission critical applications or real-time voice and video streaming applications, which have strict delay constraints and rate requirements [14]. While delay-tolerant scenario can be applied for delay-insensitive traffic, such as file downloads and e-mail applications. The rest of this paper is organized as follows. In Section II, we present our system model and explain our objectives. In Section III, we design an optimal control algorithm by jointly considering link selection and power allocation. Simulation results are presented in Section IV, followed by conclusions in Section V. II. S YSTEM M ODEL We consider a hybrid FSO/RF system1 with one transmitter (Tx) sending delay-limited data to its receiver (Rx) over two parallel links, i.e., an FSO link and an RF link2 . We denote the link set as: I = {1, 2}, where i = 1 and i = 2 represent the FSO link and the RF link, respectively. The system is assumed to operate in slotted time and the time interval T is normalized to a unit without loss of generality. Let the channel coefficients for link i between Tx and Rx be hi (t) in slot t, and we assume hi (t) remains fixed in each slot, but potentially changes on slot boundaries. Besides, the bandwidth for link i is assumed to be Bi ∀i = 1, 2. An illustration for a hybrid FSO/RF system is shown in Fig. 1. A. FSO Link Model At slot t, if the FSO transmitter sends modulation optical signal X1 (t) with power P1 (t) to its receiver, then the re( )r/2 ceived electrical signal is Y1 (t) = h1 (t)R X1 (t) + n(t), where h1 (t) is the channel coefficient of the FSO link, R 1 In practice, FSO and RF links operate at different speeds, so synchronization issues need to be considered. For detailed discussions on timing synchronization of hybrid RF/FSO systems, please refer to Kumar and Borah [15]. 2 The RF unlicensed bands at 2.4 GHz, 5 GHz, and 60 GHz can be used for the implementation of hybrid FSO/RF systems.

is the effective photoelectric conversion ratio, and n(t) is the channels additive white Gaussian noise (AWGN) with normalized unit variance at the FSO receiver; r = 1 and r = 2 represent the mode of heterodyne detection and intensity modulation/direct detection (IM/DD), respectively3 [17], [18]. In particular, the instantaneous received irradiance is defined as h1 (t) = h01 (t)ha1 (t)hb1 (t), where h01 (t) is the path loss effect, while ha1 (t) and hb1 (t) denote the effects of pointing error and atmospheric turbulence, respectively [19], [20]. Therefore, the instantaneous electrical signal-to-noise ratio (SNR) for the FSO link can be written as ( )r γ1 (t) = h1 (t)RP1 (t) /N0 B1 , (1) where 0 ≤ P1 (t) ≤ P1max with P1max denoting the maximum allowable transmit power in the FSO link, which is mainly determined by eye safety regulations [2]. In this paper, we consider FSO link using heterodyne detection in the receiver side, i.e., only r = 1 is considered for Eq. (1). For r = 2 case, the results can be derived similarly. If continuous rate adaptation is used, from the classical Shannon’s theorem, the achievable rate over the FSO link is given by C1 (t) = B1 log2 (1 + γ1 (t)).

(2)

B. RF Link Model In slot t, if we use the RF link to transmit data, the received RF signal at the Rx is given by Y2 (t) = h2 (t)X2 (t) + n(t), where X2 (t) is the transmitted RF signal with power P2(t), the channel coefficient of RF link h2 (t) captures both channel fading and path loss, and n(t) is the AWGN with normalized unit variance at the RF receiver. Therefore, the instantaneous received SNR for the RF link can be writen as γ2 (t) = |h2 (t)|2 P2 (t)/N0 B2 ,

(3)

where 0 ≤ P2 (t) ≤ P2max is the transmitted power with P2max denotes the maximum allowable transmit power in the RF link, which is restricted by the hardware or standard regulation. If continuous rate adaptation is used, from the classical Shannon’s theorem, the achievable rate over the RF link is given by C2 (t) = B2 log2 (1 + γ2 (t)).

(4)

C. Control Decisions In each slot t, new packets arrive randomly according to a Bernoulli process A(t) of rate λ. We assume that A(t) is independently and identically distributed (i.i.d.) over slots, and the maximum number of arrivals is upper bounded by a constant Amax . Besides, suppose that each packet contains M bits of data and its delay constraint is restricted to 1 slot. Thus, a packet not transmitted within 1 slot is dropped 3 Note that Heterodyne detection is suitable for extracting information encoded as modulation of the phase and/or frequency of an optical signal, such as M-QAM, M-PSK and M-FSK. While direct detection (DD) is usually used to recover the intensity modulation (IM) signal, such as OOK, PPM and PAM.



and retransmission is not considered. Additionally, let η denote the minimum time-average reliability requirement for the transmitter, which represents the fraction of packets that were transmitted successfully. In each slot t, if the transmitter has a new packet to transmit, it makes the following decisions to transmit data to the Rx: link selection (which links are to be selected at each time) and power allocation (how much power is to be allocated to the links that are activated at the time). ω Let Dω (t) = [Lω i (t); P (t)] be the collective control action at slot t under policy ω, which comprises the choices of the link selection and power allocation for the transmitter. Here, Lω i (t) indicates whether link i is active in slot t or not. In our model, ω Lω i (t) ∈ L = {0, 1}, ∀ i, with the interpretation that Li (t) = 1 ω if link i is active and Li (t) = 0 otherwise. Furthermore, Pω (t) is the collection of Piω (t) ∀i ∈ I, with Piω (t) representing the allocated power for link i in slot t. Note that Piω (t) = 0 for all links i if link selection action Lω i (t) = 0. Denote V as the set of all valid control actions, i.e., V = {L} × {P}, where L and P are the sets of all valid link selection and power allocations, respectively. We define the “reliability” variable for the transmitter as follows:   1, if a packet transmitted ( ) succesfully in slot t, V H(t), Dω (t) = (5)  0, otherwise. ( ) where the outcome of V H(t), Dω (t) depends on the current channel state and control action. We further assume that the feedback channel is instantaneous and error-free, so that the result, i.e., ACK (Acknowledgement) or NACK (Negative Acknowledgement), of each packet transmission is determined perfectly transmitter. For clarity, we use V ω (t) in place ( by the ) ω of V H(t), D (t) in the rest of this paper. D. Power Consumption Cost Model The power consumption cost incurred by each action is composed of three types of cost, i.e., the transmission cost due to the FSO/RF link being activated to transmit data, the RF circuit operation cost owing to the RF link being in service, and the waking-up cost resulting from the FSO/RF link turning from the sleep state to the operating state. Specifically, at the beginning of each slot t, if only the FSO link is active (i.e., L1 (t) = 1, L2 (t) = 0), a power cost of P1 (t) is incurred, which accounts for the optical power used for emitting light as well as data transmission. If only the RF link is active (i.e., L1 (t) = 0, L2 (t) = 1), the power cost is P2tr (t) = P2 (t) + Pc , where P2 (t) is the transmitted power, and Pc corresponds to circuit power consumption for signal processing. If both of the links are activate (i.e., L1 (t) = 1, L2 (t) = 1), the power cost is P1 (t) + P2tr (t); otherwise (i.e., L1 (t) = 0, L2 (t) = 0), no power cost is incurred. Furthermore, when the transmitter goes back to operating mode from sleep ω mode (i.e., Lω i (t − 1) = 0, Li (t) = 1), it incurs the waking-up cost, including power consumption, denoted as υi , and time ω delay, denoted as τi . If Lω i (t − 1) = 1 and Li (t) = 0, it means

link i is disconnected. The disconnection of link i is assumed to be cost-free. E. Power Cost Minimization Our goal is to design a policy, ω, that minimizes the time-average power cost for the hybrid FSO/RF system while ensuring its service quality. We formulate this problem as a stochastic optimization problem: ∑ min : θ¯ = (¯ eω ¯ω ¯ω i +u i )+q i∈I

s. t. :

C1 : s¯ω ≥ ηλ avg C2 : e¯ω ∀i∈I i ≤ Pi ω C3 : 0 ≤ Pi (t) ≤ Pimax ∀ i ∈ I, ∀t C4 : Dω (t) ∈ V ∀t, (6)

where e¯z ¯z ¯z are the time-average power usage due i , u i and q to transmit power of link i, link i waking-up, and the RF link being active under policy z respectively. Based on the reference that the time average∑expectation of a t−1 random variable X(t) is defined as limt→∞ 1t τ =0 E{X(τ )}, the above time-average parameters are defined as: 1∑ E{Piz (τ )} t→∞ t τ =0

(7)

1∑ z + E{[Lz i (τ ) − Li (τ − 1)] υi } t→∞ t τ =0

(8)

1∑ E{I{Lz (τ )=1} Pc }. 2 t→∞ t τ =0

(9)

t−1

e¯z i = lim t−1

u ¯z i = lim and

t−1

q¯z = lim

Similarly, s¯z is the time-average reliability, which is defined as: 1∑ E{V z (τ )}. t→∞ t τ =0 t−1

s¯z = lim

(10)

In the above definitions, E{·} denotes the expectation with respect to the channel state and policy z, and I{ι} is the indicator function that returns 1, if ι is true ∑ and 0, otherwise. ¯ = limt→∞ 1 t−1 E{θ} with For notation clarity, we let θ τ =0 t ∑ z + θ= (Pi + [Lz i (τ ) − Li (τ − 1)] υi ) + I{Lz (t)=1} Pc . i∈I

2

If we define a two-tuple (H(t), Q(t)) as the system state for problem (6), where H(t) is the current channel state, and Q(t) is the current queue state (to be described in Sec. III). We note that the system state affects the control action, and the control decisions also affect the dynamics of this system. Therefore, optimization problem (6) belongs to the constrained Markov decision process (CMDP) class [16]. To solve problem (6) optimally, the conventional techniques are based on dynamic programming [16] that requires prior information about channel probabilities and arrival distributions as well as suffers from the curse of dimensionality, or the learning-based approaches



that are confronted with large convergence times. Inspired by the developed extension to the Lyapunov optimization techniques [10], in the following section, we solve problem (6) by designing a dynamic control algorithm that overcomes these challenges and can be implemented in an online fashion. III. O PTIMAL C ONTROL A LGORITHM In this section, we first introduce two virtual queues to tackle constraints (C1) and (C2) in problem (6). Then, we present a dynamic link selection and power allocation (DLSPA) algorithm that can be shown to achieve the optimal solutions for the stochastic optimization problem (6). In particular, we derive closed-form power allocation solutions for different link selection modes.

an upper bound to optimality, and stabilize all queues by minimizing an upper bound of the following item: ( ) ∆ Q(t) + SE{θ|Q(t)} (15) where S is a non-negative parameter to control the tradeoff between the value of objective in (6) and the virtual queues’ backlogs. Instead of minimizing (15) directly, our algorithm minimizes an upper bound of it, which has a similar effect. In addition, the upper bound of the drift-plus-penalty expression for (15) is as follows (see Appendix C for the derivation): Eq.(15) ≤ U − G(t)E{V (t)|Q(t)} + ∑ ∑ (Zi (t) + S)E{Pi (t)|Q(t)} − Zi (t)Piavg i∈I

+S A. Two Virtual Queues

i∈I

υi 1{Li (t−1)=0} E{Li (t)|Q(t)}+SPc E{L2 (t)|Q(t)},

i∈I

In order to tackle time-average reliability constraint C1 in (6), we introduce virtual “reliability queue” G(t) with arrival rate A(t) for the transmitter, which is updated as follows: G(t + 1) = [G(t) − V (t), 0]+ + ηA(t).

(11)

Similarly, to satisfy time-average power constraint C2 in (6), we introduce virtual power queue Zi (t), ∀i ∈ I, which is updated as follows: Zi (t + 1) = [Zi (t) −

∑

Piavg , 0]+

+ Pi (t).

where U = 21 (1 + η 2 λ2 +

B. Dynamic Algorithm Design via Drift-Plus-Penalty Minimization Method Next, we design the DLSPA algorithm to stabilize all virtual queues and solve optimization (6). Let Q(t) = [G(t), Xi (t)], ∀i ∈ I be the vector of all the queues in the system. Define our Lyapunov function as ∑ ( ) 1( ) L Q(t) = G2 (t) + Zi2 (t) . (13) 2 i∈I

i∈I

(Piavg )2 + (Pimax )2 ).

Hence, minimizing the right-hand side of Eq. (16) is equal to min :

∑

(Zi (t) + S)E{Pi (t)|Q(t)}

i∈I

+S

∑ i∈I

(12)

Note that these virtual queues are only introduced to facilitate the control algorithm to achieve the time-average constraints of problem (6). In particular, the time-average constraints can be turned into queueing stability problems under this approach. Therefore, if a policy ω satisfies constraint C1, then it means (11) is stabilized, which requires the service ∑t−1 rate to be no less than the input rate, i.e., s¯ω = lim 1t τ =0 E{V ω (τ )} ≥ t→∞ ∑t−1 lim 1t τ =0 E{ηA(τ )} = ηλ. Similarly, satisfying constraint t→∞ = C2 is ∑ equivalent to stabilizing (12), which yields e¯ω i t−1 1 lim t τ =0 E{Piω (τ )} ≤ Piavg , where we used definitions t→∞ (10) and (7).

The one-slot conditional Lyapunov drift is ( ) ( ) ( ) ∆ Q(t) = E{L Q(t + 1) − L Q(t) |Q(t)}.

(16) ∑

υi 1{Li (t−1)=0} E{Li (t)|Q(t)}

+SPc E{L2 (t)|Q(t)} − G(t)E{V (t)|Q(t)}

s. t. :

0 ≤ Piω (t) ≤ Pimax ∀ i ∈ I, ∀t, Dω (t) ∈ V ∀t.

(17)

To this end, the DLSPA algorithm operates as follows. In every time slot t, given the current queue state Q(t) and the current channel state H(t), the transmitter makes a control action D(t) that optimizes (17). Subsequently, the virtual queues are updated according to Eqs. (11) and (12). The sketch of our DLSPA algorithm for a hybrid FSO/RF system with reliability guarantees is presented in Algorithm 1. Furthermore, we give the performance of the DLSPA algorithm in Theorem 1. Theorem ( 1 (DLSPA) Algorithm performance): Suppose that H(t) = h1 (t), h2 (t) is i.i.d.4 over slots and all virtual queues are initialized to 0. Then, for any given control parameter S > 0, implementing the DLSPA algotithm can stabilize all virtual queues, i.e., the minimum reliability and time-average power constraints can be satisfied. Furthermore, the following performance bounds are achieved. (a) For some µ > 0, which depends on the slackness of the feasibility constraints, we guarantee the time-average reliability queue as follows: 1 t→∞ t

lim

∑t−1

τ =0 E{G(τ )} ≤

U +S

(∑

max +vi )+Pc i∈I (Pi

µ

)

. (18)

(14)

Following the drift-plus-penalty framework in Lyapunov optimization, we can make the objective function in (6) within

4 For a more general Markov-modulated H(t), a similar statement can be made by using the techniques of Neely [10]. Without loss of generality, we only consider the i.i.d. case in this paper.



Algorithm 1 Control Algorithm on the Transmitter Initialization: Set up virtual queues G and Zi , ∀ i ∈ I, and initialize their backlogs to 0; In every time slot t: 1: Observe the current queue states G(t), Zi (t), and the CSI H(t); 2: Solve optimization Eq. (17) to obtain optimal link selection and power allocation strategies Li (t), Pi (t); 3: Update link setup table with Li (t), ∀ i ∈ I, and makes the link setup decisions as follows: 4: for i ∈ I do 5: if Li (t − 1) = 0 and Li (t) = 1 then 6: Activate Link i; 7: else if Li (t − 1) = 1 and Li (t) = 0 then 8: Turn off Link i; 9: else 10: Keep Link i state unchanged; 11: end if 12: end for 13: Update virtual queues G(t) and Zi (t) according to Eqs. (11) and (12);

for the transmitter to choose: (1) only the FSO link is active, i.e., L1 (t) = 1, L2 (t) = 0; (2) only the RF link is active, i.e., L1 (t) = 0, L2 (t) = 1; (3) both the FSO and RF links are active, i.e., L1 (t) = 1, L2 (t) = 1; (4) sleeping, i.e., L1 (t) = 0, L2 (t) = 0. Whenever link selection mode k ∈ {1, 2, 3, 4} is chosen at slot t, we can compute the optimal cost of problem (17), denoted as gk (t) as well as the corresponding power allocation action to achieve this cost, denoted as Dkp (t). Then, the link selection mode k and the resulting power allocation action Dkp (t) associated with the minimum cost are regarded as the optimal actions. That is, (k ∗ , Dkp∗ (t)) = arg min gk (t), p (k,Dk (t))

where (k ∗ , Dkp∗ (t)) ∈ Dω (t) represent the optimal actions. Notice that the cost g4 (t) for the sleep mode is trivially zero. Below, we present the computations for each minimum cost gk (t) k ∈ {1, 2, 3, 4}, on the condition that each transmission is successful. The minimum cost g1 for mode k = 1 associated with a successful transmission Psuc (M ) = 1, i.e., V (t) = 1 is computed by solving the following optimization problem: min : P1 (t)

s. t. : (b) The time-average power consumption cost satisfies [ ∑t−1 ∑ θ¯= lim 1t (Pi (t)+[Li (τ )−Li (τ − 1)]+ υi )} τ =0 E{ t→∞

+

] ω E{I P } ≤ θ∗ + {L2 (τ )=1} c τ =0 i∈I

∑t−1

U S,

(19)

∑ where θ∗ = i∈I (e∗i + u∗i + q ∗ ) is the optimal∑ value of the objective in problem (6), and U = 12 (1 + η 2 λ2 + (Piavg )2 + i∈I

(Pimax )2 ). Proof : See Appendix A. Remark 1: Eq. (19) shows that θ¯ ≤ θ∗ + US , combined with θ¯ ≥ θ∗ from (6), we have θ∗ ≤ θ¯ ≤ θ∗ + US . Therefore, θ¯ can arbitrarily approach θ∗ when S is large enough to make US arbitrarily small. On the other hand, the corresponding timeaverage reliability queue backlog bounds increase linearly with S according to Eq. (18). In other words, the reduction in θ¯ results in increasing in backlogs G(t), which indicates a tradeoff between cost and reliability as [O(1/S), O(S)]. In the following sections, we study the solutions of optimization problem (17) in detail. In particular, we define successful transmission probability as the probability that total instantaneous mutual information is no smaller than the data target transmission rate M . It is given by Psuc (M ) = Pr(C ≥ M ), where C = Blog2 (1 + γ) is the total instantaneous mutual information, with γ as its received SNR and B as its channel bandwidth. C. Solutions to Optimization Problem (17) In every slot t, the transmitter determines the optimal control action according to the solutions of the optimization problem (17). As described in Sec. II, there are four link selection modes

(Z1 (t) + S)P1 (t) + Sυ1 1{L1 (t−1)=0} − G(t) h1 (t)RP1 (t) )≥M N0 B1 CC2 : 0 ≤ P1 (t) ≤ P1max . (20) CC1 : ξ1 B1 log2 (1 +

Here, ξ1 = 1−I{L1 (t−1)=0} τ1 is the effective transmission time when FSO link being active at time slot t, which is related to the state of FSO link at slot t − 1. Specifically, when FSO is sleeping at slot t−1, it takes time cost τ1 for FSO link to wake up, thus the effective transmission time should exclude this cost. Otherwise, the effective transmission time equals to 1. The constraint CC1 denotes that the mutual information should be no smaller than data rate M to guarantee that data transmission between Tx and Rx is successful. Problem (20) is a convex optimization problem with a monotonic increasing objective function and a convex constraint set. Obviously, if the feasible solution exists, then for minimum cost, the constraint CC1 must be met with equality as well as CC2 is satisfied. Otherwise, this transmission mode k = 1 is infeasible and we set g1 = +∞. As a result, the minimum cost for link selection mode k = 1 is derived as Eq. (21). In the case of g1 = +∞, link selection mode k = 1 will be discarded because sleep mode cost g4 (t) = 0 is strictly better, but comparing it with costs g2 (t) and g3 (t) is still necessary. Similarly, to obtain minimum cost g2 for mode k = 2 associated with a successful transmission Psuc (M ) = 1 or V ω (t) = 1, we solve the following convex problem: min : P2 (t)

s. t. :

(Z2 (t) + S)P2 (t) + Sυ2 1{L2 (t−1)=0} − G(t) + SPc

|h2 (t)|2 P2 (t) ) ≥ M, N0 B2 max 0 ≤ P2 (t) ≤ P2 . ξ2 B2 log2 (1 +

(22)

Similarly, ξ2 = 1 − I{L2 (t−1)=0} τ2 is the effective transmission time when RF link being active at time slot t. The minimum cost for mode k = 2 is given by Eq. (23).



{ g1 =

{ g2 =

(k=1)

(Z1 (t) + S)P1 +∞,

(k=2)

(Z2 (t) + S)P2 +∞,

(k=1)

+ Sυ1 1{L1 (t−1)=0} − G(t), if P1 = otherwise.

(k=2)

+ Sυ2 1{L2 (t−1)=0} − G(t) + SPc , if P2 = otherwise.

For L1 (t) = 1, L2 (t) = 1, we consider the Tx-Rx pair adopts coding over multiple parallel channels (CMPC) [6]. In this approach, by using the non-uniform codes over a set of parallel sub-channels the total available channel capacity C1 + C2 can be achieved. Then, we can compute minimum cost g3 for mode k = 3 associated with a successful transmission Psuc (M ) = 1 or V ω (t) = 1, by solving the following convex problem: min :

∑

Pi (t) i∈I

(Zi (t)+S)Pi (t)+S

∑ i∈I

s. t. :ξ1 B1 log2 (1 + h

1 (t)RP1 (t)

N0 B1

N0 B1 M/B1 −1 ) h1 (t)R (2

υi 1{Li (t−1)=0}−G(t)+SPc

) + ξ2 B2 log2 (1 +

0 ≤ Pi (t) ≤ Pimax .

|h2 (t)|2 P2 (t) ) N0 B2

≥ M,

(24)

By utilizing the Karush-Kuhn-Tucker (KKT) conditions, we have the optimal power allocation solution as follows (see Appendix B for details): [ ξi B i δ ∗ 1 ]Pimax Pi∗ (t) = − , (25) (Zi (t) + S)ln2 αi 0 [ ]M where X I denotes min[max(X, I), M ] and δ ∗ is the optimal Lagrange multiplier obtained by subgradient updating equation Eq. (42). Substituting this optimal solution into the objective function of (24), minimum cost g3 can thus be easily obtained. To this end, for Algorithm 1, the complexity is mainly dominated by step 2, i.e., solving optimization Eq. (17). Among four tranamission modes, the cost g4 (t) for the sleep mode is trivially zero, thus, we determine the link selection mode k and the resulting power allocation action Dkp (t) by computing Eqs. (21), (23) and (25). Besides, the subgradient method converges to the desired state after O( κ12 ) iterations [21], where κ is the maximum tolerance deviation from the optimal value. Hence, the total complexity of Algorithm 1 is of order O(1) + O(1) + O( κ22 ) = O(2 + κ22 ). IV. S IMULATION R ESULTS In this section, we present our simulation results. First, the performance bounds of the algorithm proven in Theorem 1 are illustrated. Secondly, we evaluate the performance of our proposed DLSPA scheme by comparing it with the FSO link always active and FSO&RF always active schemes. Since the algorithm does not depend on any distribution for the channel gain, without loss of generality, we model the channel amplitudes for RF and FSO links as i.i.d. Nakagamim and Gamma-Gamma variables, respectively. We normalized noise and bandwidth as N0 = 1 and B1 = B2 = 1, then the value of instantaneous SNR equals to the channel gain. Hence,

≤ P1max ,

N0 B2 M/B2 −1 ) |h2 (t)|2 (2

(21)

≤ P2max ,

(23)

for a Gamma-Gamma model, the probability density function (PDF) of FSO channel state h1 is given by [19]

[

fh1 (h1 ) =

ϕ2 h−1 3,0 ϕ2 αβ ( h1 ) ϕ2 +1 1 G ¯ 1 |ϕ2 ,α,β Γ(α)Γ(β) 1,3 ϕ2 +1 h

]

(26)

where ϕ is the ratio between equivalent beam radius at the FSO receiver aperture and the pointing error (jitter) standard ¯ 1 is the average channel gain deviation at the FSO receiver, h of FSO link associated with G[·] is the Meijer G-function as defined in [22], and α and β are the effective number of small-scale and large-scale eddies of the turbulent environment, respectively. For a Nakagami-m model, the probability density function (PDF) of RF channel state h2 is given by [18] fh2 (h2 ) = (

( mh2 ) m m hm−1 ) 2 exp − Ω Γ(m) Ω

(27)

∫∞ where Γ(m) =∫ 0 e−x xm−1 dx is the gamma function [22], ∞ and Γ(a, n) = n e−x xa−1 dx is the upper incomplete gamma Ω are the Gamma distribution’s shape and function, m and m scale parameters respectively. The channels are discretized into eight equal probability bins as in [26, pp. 63], and we choose ¯ 1 = 0dB, α = 2.064, β = 1.342 for Eq. (26), and ϕ = 1, h m = 1, Ω = 5dB for Eq. (27). The FSO and RF channel state spaces are H1 = {h11 = −15 dB, h21 = −2.4110 dB, h31 = 1.6008 dB, h41 = 4.3540 dB, h51 = 6.5959 dB, h61 = 8.6804 dB, h71 = 10.8437 dB, h81 = 13.3609 dB} and H2 = {h12 = −20 dB, h22 = −8.7434 dB, h32 = −6.3835 dB, h42 = −4.9381 dB, h52 = −3.8171 dB, h62 = −2.8437 dB, h72 = 8 −1.8112 dB, dB}, respectively. Every time slot ( h2 = −0.6903 ) t, H(t) = h1 (t), h2 (t) randomly takes values from the two spaces H1 and H2 . In addition, we set the minimum reliability constraint η = 0.98 and assume that each packet contains M = 5 bits. Besides, we set P1max = P2max = 15 W and P1ave = P2ave = 5 W. A. Cost Reliability Tradeoff We illustrate the cost reliability tradeoff proven in Theorem 1 using DLSPA algorithm and each data point of the following curves was run for 5000 time slots. In Fig. 2 (a), we plot average power consumption θ¯ versus control parameter S. First, we see that power consumption θ¯ eventually converges to the optimal value θ∗ when S is sufficiently large, which confirms the asymptotic optimality



0.7

0.55 λ=0.7 packets/slot λ=0.65 packets/slot λ=0.6 packets/slot

0.45

our proposed scheme FSO always active scheme FSO&RF always active scheme

0.6 Power consumption

Average power cosumption

0.5

0.4 0.35 0.3 0.25

0.5 0.4 0.3 0.2 0.1

0.2

0 0

5

10 15 Control parameter S

20

25

0

500

(a)

2500

180 λ=0.7 packets/slot λ=0.65 packets/slot λ=0.6 packets/slot

28 26

Reliability−guaratee efficiency

Average reliability queue occupancy

2000

(a)

30

24 22 20 18 16 14 12

1000 1500 Time (seconds)

0

5

10 15 Control parameter S

20

25

(b)

Fig. 2: (a) Average power consumption versus control parameter S. (b) Average reliability queue occupancy versus control parameter S.

stated in Theorem 1 (b). Meanwhile, in Fig. 2 (b), we observe that the time-average reliability queue backlogs grow approximately linearly with S, which verifies the time-average reliability bound stated in Theorem 1 (a). Figs. 2 (a) and 2 (b) depict the tradeoff between power consumption θ¯ and reliability occupancy by [O(1/S), O(S)]. Therefore, in order to achieve the desired power consumption and guarantee explicit reliability performance, the key step is to select a proper control parameter S. In particular, if the hybrid system is energy-limited, then we should choose a larger S. Whereas, if the hybrid system concerns more about reliability, then a smaller S is required. Additionally, in Figs. 2 (a) and (b), we observe that the DLSPA algorithm adjusts its optimal control action to the changes of packet arrival rate λ. With the variations of λ, for the corresponding selected link, transmit power varies as well, so as to keep network reliability occupancy within the predefined length (i.e., the target reliability is guaranteed). We also observe that a larger packet arrival rate λ leads to more power consumption and a longer reliability backlog with any fixed S. This is because a larger λ will consume more power to ensure the target reliability performance.

our proposed scheme FSO always active scheme FSO&RF always active scheme

160 140 120 100 80 60 40 20 0

0

500

1000 1500 Time (seconds)

2000

2500

(b)

Fig. 3: Performance comparison among our proposed scheme, FSO always active scheme and the FSO&RF always active scheme (S = 20). (a) Average power consumption against time T . (b) Average reliability-guarantee efficiency against time T .

B. Performance of the Proposed DLSPA Scheme In this subsection, we evaluate the performance of our proposed scheme by comparing it with the FSO link always active and FSO&RF always active schemes with control parameter S = 20. For the FSO link always active scheme, the FSO link is active all the time and RF is activated only when the quality of the FSO link falls below a predetermined threshold [9]. For the FSO&RF always active scheme, both FSO and RF links are active all the time but transmit at different rates [8]. Fig. 3 (a) shows the power consumption incurred in each time slot when each scheme is applied. As expected, our proposed scheme outperforms the other two schemes at all times. That is because by capitalizing on sleep mode that the transmitter powering off the unused components for a period of time, the proposed scheme can save much energy. The FSO link always active and FSO&RF always active schemes waste energy by keeping the unnecessary link active when a single FSO or RF link’s quality is good enough to support the required QoS requirement by itself. As a result, the FSO&RF always active scheme leads to the highest power consumption, as shown in Fig. 3 (a). To measure how much power consumption is needed for a



given reliability requirement, a metric called reliability guarantee efficiency specified by parameter ψ, is introduced. It is defined as reliability queue occupancy divided by power consumption. As shown in Fig. 3 (b), our proposed scheme achieves the highest reliability guarantee efficiency, in contrast to the other two schemes. The reason is that by permitting link selection and power allocation adaptively, the minimum power consumption is achieved, and the target reliability is guaranteed in our DLSPA scheme, which verifies its performance superiority again. Furthermore, based on the operating principle for different schemes, it can be derived that the complexity of FSO link always active scheme is of order O(1), while the complexity of FSO&RF always active scheme is of order O( κ22 ) with κ being the maximum tolerance deviation from the optimal value, conditioned that the subgradient method is used for rate control. Recalling that the complexity of our proposed scheme is of order O(2 + κ22 ), which implies that our proposed scheme achieves performance superiority at the cost of a slight computation complexity. V. C ONCLUSIONS In this paper, we developed a dynamic link selection and power allocation transmission algorithm for a hybrid FSO/RF system that minimizes the power consumption cost while guaranteeing the packet success-probability requirement and peak and average power constraints. By using Lyapunov optimization along with the notion of reliability queues, we design a dynamic link selection and power allocation (DLSPA) algorithm. This algorithm can push the consumed power very close to the optimal value, but with a tradeoff over reliability queue occupancy. Simulation results confirmed the theoretical analysis as well as the performance superiority of our proposed scheme.

feasible control action C ω (t) every slot purely as a function of the current channel state H(t), and guarantees:

E

{∑

A. Proof of Theorem 1 The proof of Theorem 1 is based on the results in Lemma Lemma 1: (Existence of an optimal stationary, randomized policy) Assuming that H(t) is independently and identically distributed (i.i.d.) ( )over slots, and that all queues are initialized at 0, i.e., L Q(0) = 0, there exists a stationary randomized policy ω that chooses feasible control action C ω (t) every slot purely as a function of the current channel state H(t), and that has the following properties for any constant µ > 0:

E

{∑

i∈I

(Piω

µ, E{Piω (τ )}

+ [Li (t) − Li (t −

1)]+ υiω )

}

= θ∗ , (29)

i∈I

∑ where θ∗ = i∈I (e∗i + u∗i ) + q ∗ is the optimal value of the objective in problem (6). Then, there exists a positive constant µ that satisfies E{Oω (τ )} ≥ ηλ + µ > ηλ, E{Piω (τ )}

< E{Piω (τ )} + µ ≤ Piavg .

(30)

Next, we prove the equation above holds for arbitrarily small value of µ. Define θ∗ (µ) as the minimum time-average power consumed by any stationary policy ω that satisfies Eq. (29), we further have ( µ ) ∗ µ ∗ θ (µmax ), (31) θ∗ ≤ θ∗ (µ) ≤ 1 − θ + µmax µmax where the fist inequality holds by optimality, while the second inequality is due to θ∗ (µ) being no larger than the average power consumption associated with the mixed strategy, which µ applies strategy θ∗ (µmax ) with probability µmax and strategy µ θ∗ with probability 1 − µmax . Here, µmax is the largest value of µ that guarantees Eq. (30). Note that θ∗ (µ) → θ∗ as µ → 0, which completes the proof. Thus, using Eq. (29), we can rewrite Eq. (16) as ∑ ( ) ∆ Q(t) + SE{ (Pi (t) + [Li (t) − Li (t − 1)]+ υi ) i∈I

+I{Lω2 (t)=1} Pc |Q(t)} ≤ U − G(t)µ −

+S

∑

∑

Zi (t)µ

i∈I

∗ i∈I (ei

+ u∗i ) + q ∗ .

(32)

E{L(Q(K))} − E{L(Q(0))} + µK [

1.

E{O (τ )} ≥ ηλ +

(Piω + [Li (t) − Li (t − 1)]+ υiω ) + L2 (t)Pcω

(a) As for Eq. (32), summing it over t ∈ {0, 1, · · · , K − 1} and then dividing the sum by µK, we have

A PPENDIX

ω

E{Oω (τ )} ≥ ηλ, E{Piω (τ )} ≤ Piavg ,

+µ≤

Piavg , }

+ L2 (t)Pcω

= θ∗ , (28)

∑ where θ∗ = i∈I (e∗i + u∗i ) + q ∗ is the optimal value of the objective in problem (6). Proof: Using standard results on CMDP [16], we claim that there exists a stationary, randomized policy ω, that chooses

S

∑K−1 t=0

E{

K(U +S

i∈I

∑ (Pi (t)+[Li (t)−Li (t−1)]+ υi )}+ K−1 t=0 E{I{Lω (t)=1} Pc } µK

∑K−1 ∗ ∗ ∗ t=0 i∈I (ei +ui )+q )−µ

K(U +S

E{

µK

∑

∑

Zi (t)}−µ

i∈I

∗ ∗ ∗ i∈I (ei +ui )+q )−µ µK

∑K−1 t=0

∑K−1 t=0

E{G(t)}

E{G(t)}

, (33)

where the last inequality is due to the fact that Zi (t) ≥ 0. Rearranging the terms in the above inequality and exploiting L(Q(K)) ≥ 0 and L(Q(0)) = 0, then taking a limit as K → ∞, we get

lim 1 K→∞ K ≤

U +S

∑K−1 τ =0

(∑

]

2

∑

≤

≤

∑

E{G(t)} ≤

max i∈I (Pi

µ

U +S(

+ vi ) + Pc

∑

∗ ∗ ∗ i∈I (ei +ui )+q )

µ

) .

(34)



(b) Similarly, summing Eq. (32) over t ∈ {0, 1, · · · , K − 1}, and then dividing the sum by SK we have E{L(Q(K))} − E{L(Q(0))} + SK [ ∑K−1

S

t=0

E{

∑

i∈I

K(U +S

≤

2

SK

∑

∑K−1 ∗ ∗ ∗ t=0 i∈I (ei +ui )+q )−µ

≤

]

∑ (Pi (t)+[Li (t)−Li (t−1)]+ υi )}+ K−1 t=0 E{I{Lω (t)=1} Pc }

K(U + S

∑

∗ i∈I (ei

+

u∗i )

SK

∑

E{

Zi (t)}−µ

∑K−1

i∈I

t=0

E{G(t)}

Pi∗ (t) =

∗

+q )

,

+

] E{I{Lω2 (t)=1} Pc } ≤ θ∗ + i∈I

∑K−1 t=0

where θ∗ =

∑

∗ i∈I (ei

U S,

(36)

+ u∗i + q ∗ ).

B. Solution of Eq. (24) using Karush-Kuhn-Tucker (KKT) conditions From Eq. (24), we get the Lagrangian function

∑ L3 (P1 , P2 , δ, θi , ρi ) = (Zi (t) + V )Pi (t) i∈I ∑ ∑ θi Pi (t) +V υi 1{Li (t−1)=0} − G(t) + V Pc −

+δ M − ξ1 B1 log2 (1 + −ξ2 B2 log2 (1 +

(41)

[ ]M where X I denotes min[max(X, I), M ]. Considering that the dual problem is always convex, it is guaranteed that the dual is always convex, it is guaranteed that the gradient-type scheme (e.g., subgradient) converges to the global optimum. Specifically, the subgradient updating equation for optimal Lagrange multiplier δ ∗ in Eq. (41) is given as [ ( h1 (t)RP1 (t) ) δ (n+1) = δ (n) + ϵ(n) M − Blog2 (1 + N0 B |h2 (t)|2 P2 (t) )]+ −Blog2 (1 + , (42) ) N0 B where the iterative index is denoted by n and ϵ(n) is sequence of positive step size designed properly [21]. Once δ ∗ obtained, optimal power allocation value is derived according to Eq. (41).

C. Derivation of (16)

h1 (t)RP1 (t) ) N0 B1

|h2 (t)|2 P2 (t) ) ) N0 B2

+

∑

ρi (Pi (t) − Pimax ),

Since the inequality [max{Q − R, 0} + A]2 ≤ Q2 + R2 + A2 + 2Q(A − R) always holds, we have

(37)

{ ( ) ∆ Θ(t) ≤ 12 E O2 (t) + η 2 A2 (t) + 2G(t)(ηA(t) − O(t))+

i∈I

where ρi , θi , ∀i = 1, 2 and δ are Lagrange multipliers corresponding to constraints in Eq. (24). Then the dual objective function problem is formulated as d(δ, θi , ρi ) = min L(P1 , P2 , δ, θi , ρi )

(38)

P1 ,P2

i∈I

max d(δ, θi , ρi )

s.t. θi ≥ 0, ρi ≥ 0.

≤ 21 (1 + η 2 λ2 +

δ,θi ,ρi

(39)

Since a strictly feasible point exists for Eq. (37), strong duality holds based on Slater’s condition and the KKT conditions are necessary and sufficient for optimality. Define two intermediate parameters for notation simplicity, α1 = hN10(t)R B1 and 2

, the KKT conditions are

∂L3 (P1 ,P2 ,δ,θ,ρ) ∂P2

= Z1 (t) + V − = Z2 (t) + V −

ξ 1 B1 δ ∗ α 1 (1+α1 P1∗ (t))ln2 ∗

ξ 2 B2 δ α 2 (1+α2 P2∗ (t))ln2

∑

i∈I

i∈I

(Piavg )2 + (Pimax )2 )+

} { ) ∑( Zi (t)(Pi (t) − Piavg ) |Θ(t) . (43) E G(t)(ηA(t) − O(t)) + i∈I

Let U = 12 (1+η 2 λ2 +

∑ i∈I

(Piavg )2 +(Pimax )2 ). For any given

control parameter S ≥ 0, we add the following “penalty” metric into both sides of the above inequality

θi∗ Pi∗ (t) = 0, ρ∗i (Pi∗ (t) − Pimax ) = 0, θi∗ , ρ∗i ≥ 0, ∂L3 (P1 ,P2 ,δ,θ,ρ) ∂P1

} ) (Piavg )2 + (Pi (t))2 + 2Zi (t)(Pi (t) − Piavg ) |Θ(t) i∈I } { ) ∑ ( avg 2 = 12 E O2 (t) + η 2 A2 (t) + (Pi ) + (Pi (t))2 + ∑(

{ } ) ∑( 1 2Zi (t)(Pi (t) − Piavg ) |Θ(t) 2 E 2G(t)(ηA(t) − O(t)) +

and dual problem is given as

|h2 (t)| N0 B2

ξi B i δ ∗ 1 ]Pimax , − (Zi (t) + S)ln2 αi 0

i∈I

i∈I

(

α2 =

[

(35)

SK where the last inequality is due to the fact that G(t) ≥ 0 and Zi (t) ≥ 0. Rearranging the terms in the above inequality and exploiting L(Q(K)) ≥ 0 and L(Q(0)) = 0, then taking a limit as K → ∞, we get [ ∑K−1 ∑ 1 lim K (Pi (t) + [Li (t) − Li (t − 1)]+ υi )} t=0 E{ K→∞

If δ ∗ < 0, then θi∗ −ρ∗i > 0, that is, Pi∗ (t) = Pimax . For δ ∗ ≥ 0, there are three cases: ∗ i Bi δ − α1i . 1) If θi∗ = ρ∗i , then Pi∗ (t) = (Ziξ(t)+S)ln2 2) If θi∗ > ρ∗i , then θi∗ > 0, and we have Pi∗ (t) = 0. 3) If θi∗ < ρ∗i , then ρ∗i > 0, and we have Pi∗ (t) = Pimax . To sum up, we have

− θ1∗ + ρ∗1 = 0, − θ2∗ + ρ∗2 = 0.

(40)

∑ SE{ (Pi (t) + [Li (t) − Li (t − 1)]+ υi ) + i∈I

I{Lω2 (t)=1} Pc |Q(t)}.

(44)



Then, we get ( ) ∑ ∆ Θ(t) + V E{ (Pi (t) + [Li (t) − Li (t − 1)]+ υi ) i∈I { +I{Lω2 (t)=1} Pc |Θ(t)} ≤ U + E G(t)(ηA(t) − O(t)) } ∑( avg ) + Zi (t)(Pi (t) − Pi ) |Θ(t) i∈I ∑ +V E{ (Pi (t) + [Li (t) − Li (t − 1)]+ υi ) + L2 (t)Pc |Θ(t)} i∈I ∑ = U − G(t)E{O(t)|Θ(t)} + (Zi (t) + V )E{Pi (t)|Θ(t)} i∈I ∑ ∑ − Zi (t)Piavg + V υi 1{Li (t−1)=0} E{Li (t)|Θ(t)} i∈I

i∈I

+V Pc E{L2 (t)|Θ(t)}.

[18] I. S. Ansari, F. Yilmaz and M.-S. Alouini, “Performance analysis of freespace optical links over m´ alaga (M) turbulence channels with pointing errors,” IEEE Trans. wireless commun., vol. 15, no. 1, pp. 91–102, Jan. 2016. [19] E. Soleimani-Nasab and M. Uysal, “Generalized performance analysis of mixed RF/FSO cooperative systems,” IEEE Trans. wireless commun., vol. 15, no. 1, pp. 714–727, Jan. 2016. [20] I. S. Ansari, F. Yilmaz and M.-S. Alouini, “On the performance of mixed RF/FSO variable gain dual-hop transmission systems with pointing errors” in Proc. IEEE VTC Fall, Las Vegas, NV, USA, Sept. 2013, pp. 1–5. [21] L. Vandenberghe, “Subgradient method,” Univ. California, Los Angeles, CA, USA, Tech. Rep., 2013. [Online]. Available: http://www.seas.ucla. edu/ vandenbe/236C/lectures/sgmethod.pdf [22] I. S. Gradshteyn and I.M. Ryzhik, Table of Integrals, Series and Products, 7th ed., A. Jeffrey, Ed. Amsterdam, The Netherlands: Elsevier, 2007. [23] N. Salodkar, “Online algorithms for delay constrained scheduling over a fading channel,” Doctor Thesis, 2008.

(45)

R EFERENCES [1] H. Willebrand and B. S. Ghuman, Free Space Optics: Enabling Optical Connectivity in Today Networks. Indianapolis, IN, USA: Sams, 2002. [2] M. Khalighi and M. Uysal, “Survey on free space optical communication: A communication theory perspective,” IEEE Commun. Surveys Tuts., vol. 16, no. 4, pp. 2231–2258, Nov. 2014. [3] D. K. Borah, A. C. Boukouvalas, C. C. Davis, and S. H. H. Yiannopoulos, “A review of communication-oriented opticalwireless systems,” EURASIP J. Wireless Commun. Netw., vol. 2012, no. 1, pp. 91:1–91:28, Mar. 2012. [4] M. Usman, H. Yang, and M.-S. Alouini, “Practical switching-based hybrid FSO/RF transmission and its performance analysis,” IEEE Photonics J., vol. 6, no. 5, pp. 1–13, Oct. 2014. [5] W. Zhang, S. Hranilovic, and C. Shi, “Soft-switching hybrid FSO/RF links using short-length raptor codes: Design and implementation,” IEEE J. Sel. Areas Commun., vol. 27, no. 9, pp. 1698–1708, Dec. 2009. [6] S. Vangala, H. Pishro-Nik, “Optimal hybrid RF-wireless optical communication for maximum efficiency and reliability,” 41st Annual Conference on CISS, pp. 684–689, March 2007. [7] H. Moradi, M. Falahpour, H. H. Refai, P. G. LoPresti, and M. Atiquzzaman, “On the capacity of hybrid FSO/RF links,” in Proc. IEEE Global Commun. Conf., Miami, FL, USA, Dec. 2010, pp. 1–5. [8] N. Letzepis, K. Nguyen, A. G. Fabregas, and W. Cowley, “Outage analysis of the hybrid free-space optical and radiofrequency channel,” IEEE J. Sel. Areas Commun., vol. 27, no. 9, pp. 1709–1719, Dec. 2009. [9] T. Rakia, H.-C. Yang, F. Gebali, and M.-S. Alouini “Power adaptation based on truncated channel inversion for hybrid FSO/RF transmission with adaptive combining,” IEEE Photon. J., vol. 7, no. 4, Aug. 2015. [10] M. J. Neely, Stochastic Network Optimization with Application to Communication and Queueing Systems. Morgan & Claypool., 2010. [11] R. Urgaonkar and M. J. Neely, “Delay-limited cooperative communication with reliability constraints in wireless networks,” IEEE Trans. Inf. Theory, vol. 60, no. 3, pp. 1869–1882, March 2014. [12] Y. Li, M. Sheng, Y. Shi, X. Ma, and W. Jiao, “Energy efficiency and delay tradeoff for time-varying and interference-free wireless networks,” IEEE Trans. Wireless commun., vol. 13, no. 11, pp. 5921–5931, Nov. 2014. [13] N. D. Chatzidiamantis, L. Georgiadis, H. G. Sandalidis, and G. K. Karagiannidis, “Throughput-optimal link-layer design in power constrained hybrid OW/RF systems,” IEEE J. Sel. Areas Commun., vol. 33, no. 9, pp. 1972–1984, Sep. 2015. [14] Z. Wang, Internet QoS: architecture and mechanism for quality of service, Morgan Kaufmann, 2001. [15] K. Kumar and D. Borah, “Quantize and encode relaying through FSO and hybrid FSO/RF links,” IEEE Trans. Veh. Technol., vol. 64, no. 6, pp. 2361–2374, Jun. 2015. [16] M. L. Puterman. Markov Decision Processes, John Wiley & Sons, 2005. [17] E. Zedini, I. S. Ansari, and M.-S. Alouini, “Unified performance analysis of mixed line of sight RF-FSO fixed gain dual-hop transmission systems,” Proc. IEEE Wireless Commun. Netw. Conf., New Orleans, LA, USA, pp. 46–51, Mar. 2015.