Exact equalities and thermodynamic relations for nonequilibrium ...

4 downloads 63 Views 702KB Size Report
Dec 25, 2014 - arXiv:1405.0697v2 [cond-mat.stat-mech] 25 Dec 2014. Exact equalities and thermodynamic relations for nonequilibrium steady states. Teruhisa ...
Exact equalities and thermodynamic relations for nonequilibrium steady states

arXiv:1405.0697v2 [cond-mat.stat-mech] 25 Dec 2014

Teruhisa S. Komatsu1 , Naoko Nakagawa2 , Shin-ichi Sasa3 , and Hal Tasaki4 Abstract We study thermodynamic operations which bring a nonequilibrium steady state (NESS) to another NESS in physical systems under nonequilibrium conditions. We model the system by a suitable Markov jump process, and treat thermodynamic operations as protocols according to which the external agent varies parameters of the Markov process. Then we prove, among other relations, a NESS version of the Jarzynski equality and the extended Clausius relation. The latter can be a starting point of thermodynamics for NESS. We also find that the corresponding nonequilibrium entropy has a microscopic representation in terms of symmetrized Shannon entropy in systems where the microscopic description of states involves “momenta”. All the results in the present paper are mathematically rigorous.

Contents 1 Introduction 1.1 Background and motivation . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 SST in a typical example . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Setup and definitions 2.1 Markov jump process . . . . . . . . . . . . . 2.2 Description in terms of paths . . . . . . . . 2.3 Entropy production, work, and time-reversal 2.4 Examples . . . . . . . . . . . . . . . . . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

3 Jarzynski-type equalities for NESS

2 2 5 10 10 12 13 17 20

4 Thermodynamic relations for NESS 23 4.1 Extended Clausius relation and entropy . . . . . . . . . . . . . . . . . . . 23 4.2 Higher order relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 4.3 Perturbative estimate of the error . . . . . . . . . . . . . . . . . . . . . . 26 5 Rigorous results about thermodynamic relations 1 2 3 4

Laboratory for Computational Molecular Design, RIKEN QBiC, Kobe 650-0047, Japan College of Science, Ibaraki University, Mito, Ibaraki 310-8512, Japan Department of Physics, Kyoto University, Kyoto, 606-8502, Japan Department of Physics, Gakushuin University, Mejiro, Toshima-ku, Tokyo 171-8588, Japan

1

27

6 Method based on the time reversal symmetry 32 6.1 Some definitions and basic symmetry . . . . . . . . . . . . . . . . . . . . 32 6.2 Proof of Theorem 3.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 6.3 Proof of Theorem 3.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 7 Method based on modified rate matrix 7.1 Modified rate matrix and its eigenvectors . . . . . . . . . 7.2 Splitting Lemma . . . . . . . . . . . . . . . . . . . . . . 7.3 Description of key quantities in terms of the eigenvectors 7.4 Thermodynamic relations . . . . . . . . . . . . . . . . . 7.5 Excess entropy production . . . . . . . . . . . . . . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

36 36 37 38 39 41

8 Representation of the stationary distribution and entropy

43

9 Proof of the extended Clausius inequality

45

10 Models with momenta and symmetrized 10.1 Setting and the main observation . . . . 10.2 Proof of Theorem 10.1 . . . . . . . . . . 10.3 An illustrative example . . . . . . . . . .

1 1.1

Shannon . . . . . . . . . . . . . . . . . .

entropy 47 . . . . . . . . . . . . 48 . . . . . . . . . . . . 50 . . . . . . . . . . . . 50

Introduction Background and motivation

General Properties of physical systems in thermal equilibrium are relatively well understood both from physical and mathematical points of view. Thermodynamics characterizes macroscopic properties of equilibrium states, and poses strong constraints on possible transitions between equilibrium sates, especially when an outside agent makes operation to the system. Statistical mechanics provides a probabilistic description of equilibrium states based on the microscopic mechanical description of the system. To develop similar universal theories for systems out of equilibrium has been a major remaining challenge in theoretical physics. Among various classes of nonequilibrium states, nonequilibrium steady states (which we shall abbreviated as NESS throughout the paper), which have no macroscopic changes but have nonvanishing flows, may be a promising ground for developing such theories. Looking back into the history of equilibrium physics, we find that the thermodynamic definition of entropy (and free energy) in terms of thermodynamic operation was a fundamental starting point. The notion and the properties of the entropy and the free energy were essential guides for the later development of equilibrium statistical mechanics. If we trust the analogy to the history, one possible route toward universal theories for NESS may be to start from a thermodynamics for NESS and pin down relevant thermodynamic functions. In [1], Oono and Paniconi made a proposal of an operational thermodynamics for NESS and coined the term “steady state thermodynamics” (SST). 2

We should point out however that the notion of entropy for NESS is much more subtle than one might imagine in the beginning. One reason is that the entropy in equilibrium physics plays several essentially different roles; it is a thermodynamic function whose derivatives correspond to physically observable quantities, it is a quantitative measure of which adiabatic process is possible and which is not, it is also a large deviation functional governing the fluctuation of physical quantities. There is a possibility that this “degeneracy” is an accident observed only in equilibrium states, and the degeneracy is immediately lifted when one goes out of equilibrium. If this is the case we shall encounter more than one “nonequilibrium entropies” each of which characterizing different physical aspect of a NESS. It will then be important to clarify which physics of NESS is represented by which extension of entropy. In our own attempt to develop SST and to extend the notion of entropy to NESS [2, 3], we have concentrated on the aspect which gave birth to the concept of entropy, namely, the Clausius relation in thermodynamics. We started from a microscopic description of a heat conducting NESS, and showed that a very natural generalization of the Clausius relation, in which the heat is replaced with its “renormalized” counterpart, is valid when the “order of nonequilibrium” ǫ is sufficiently small. This was a realization of the early phenomenological discussions by Landauer [4] and by Oono and Paniconi [1], and an extension of the similar result by Ruelle [5] for models with Gaussian thermostat. We also found that, in systems where microscopic states have no time-reversal symmetry (i.e., the microscopic description of states involves “momenta”), the microscopic representation of the entropy differs from the traditional Gibbs-Shannon form, and requires further symmetrization with respect to the time-reversal transformation. This approach to SST was later extended to quantum systems [6]. A geometric interpretation of the thermodynamic relations and corresponding exact relations were discussed in [7, 8] both for classical and quantum systems. Other schemes for SST, which are distinct from those in [2, 3], have been proposed [9, 10, 11, 13, 14]. See the end of section 1.2 for details. See also [15] for an earlier attempt at approaching SST from a phenomenological point of view, and [16, 17, 18] for discussions about the “zeroth law” in SST. Closely related problem of heat capacity in NESS is discussed in [19, 20]. See [21] for a unified treatment of thermodynamics in NESS and adiabatic pumping in equilibrium. Among other various promising attempt at discussing entropy (or a related quantity) in NESS, let us refer to the macroscopic fluctuation theory developed by Bertini, De Sole, Gabrielli, Jona-Lasinio, and Landim [22, 23, 24, 25], exact solution of the large deviation functional by Derrida, Lebowitz, and Speer [26, 27], the proposal of the additivity principle by Bodineau and Derrida [28], and the recent interesting proposal based on adiabatic accessibility by Lieb and Yngvason [29]. As for closely related approaches based on “fluctuation theorems” see, e.g., the recent review [30]. In the present paper, which is the first mathematical paper in our series of works on SST, we treat a general class of Markov jump processes that model nonequilibrium systems, and prove physically important relations including the NESS version of the Jarzynski equality and the extended Clausius relation. Although most of the results 3

have been announced before, they were all derived heuristically. We here present mathematically rigorous results for the first time. Let us describe the organization of the present paper5 . The following section 1.2 has been prepared for the readers who are not familiar with our approach (and other related approaches) to nonequilibrium thermodynamics. We shall motivate our study by briefly describing the standard Clausius relation in equilibrium thermodynamics in a simple setting, and explaining the difficulties one encounters when trying to extending it naively to NESS. We then describe our scheme of “renormalization” and introduce the extended Clausius relation for NESS. Finally we compare our approach with other proposals of thermodynamics for NESS. In section 2, we shall give almost complete definitions necessary in the present paper. In sections 2.1 and 2.2, we define the Markov jump process and the corresponding description in terms of paths. These definitions are standard. Of particular importance ˆ defined in (2.12). In are the notion of protocol α ˆ and the expectation value hf iαst→ section 2.3, we define essential “thermodynamic” quantities, namely, the entropy production Θ, its nonequilibrium part Ψ, and the work W . We also discuss the experimental measurability of these quantities. We shall discuss some examples in section 2.4. In section 3 we discuss the Jarzynski type equality (3.1) that holds for thermodynamic operations between two NESS. The equality is exact and rigorous, and will be the basis of our main result, namely the extended Clausius relation. Let us stress that our Jarzynski type equality (3.1) for NESS is distinct from existing exact equalities for general stochastic processes in that it contains only (almost) measurable thermodynamic type quantities. It is challenging to design experimental verification of the equality (3.1) with modern techniques in calorimetry. In section 4, we shall be heuristic, and describe thermodynamic relations that can be derived from the rigorous equality (3.1). In section 4.1, we start from the most important extended Clausius relation, and also discuss a different representation of the relation in terms of the excess entropy production. We further discuss higher order “thermodynamic” relations in section 4.2, and present a heuristic estimate of error terms in section 4.3. In section 5, which is a core of the present paper, we discuss rigorous versions of the thermodynamic relations without going into the proofs. After fixing the class of models for simplicity, we state Theorem 5.1 which allows us to identify (with a certain precision) our nonequilibrium entropy with the Shannon entropy. Then in Theorems 5.2 and 5.3, we state the extended Clausius relation and its higher order generalizations for the step protocol and the quasi-static protocol, respectively. The extended Clausius relation written in terms of the excess entropy production is stated in Theorem 5.4. Finally in Theorem 5.5, we state an inequality corresponding to the extended Clausius relation. Sections 6, 7, 8, and 9 are devoted to the proofs of the theorems. In section 6, we present arguments based on time-reversal symmetry to prove exact equalities discussed in section 3. The arguments are basically standard, but our Jarzyn5

This part can be read as a summary of the whole paper.

4

β

β(t)

β′

J(t)

Figure 1: Thermodynamic operation in an equilibrium state. The bigger gray box denotes the heat bath whose inverse temperature can be controlled, and the smaller box filled with dots denote the system that we are interested in. We start from the equilibrium state with the inverse temperature β, and slowly change the inverse temperature to β ′ . The inverse temperature of the bath and the heat flux from the bath to the system at time t are denoted as β(t) and J(t), respectively. Rτ The total entropy production in the bath − −τ dt β(t)J(t) plays an essential role in the Clausius relation (1.1).

ski type equality for NESS is proved by using a new statement which we call “splitting Lemma” (Lemma 6.1). In section 7 we present totally different approach based on the method of modified rate matrix. This method is used in section 7.2 to prove the splitting lemma, and in section 7.4 to justify the heuristic estimates in section 4.3 of the error in the thermodynamic relations. This completes the proof of the extended Clausius relation and the related relations stated in Theorems 5.2, 5.3, and 5.4. In section 8, we shall use the results from the previous sections to prove Theorem 8.1 about a useful and suggestive representation (first written down by two of us, T.S.K. and N.N.) of the probability distribution of NESS. Then this representation is used to prove Theorem 5.1 about the nonequilibrium entropy. In section 9, we prove Theorem 5.5 about the extended Clausius inequality for NESS. The proof makes use of the standard argument based on the relative entropy and a rigorous version of the linear response formula stated as Lemma 9.1. In the final section 10, we discuss a slightly different class of models which include “momenta”. We show that essentially all the results (except for the extended Clausius inequality) automatically extend to this situation if one replaces the Shannon entropy with a new quantity (10.7) called the symmetrized Shannon entropy. The symmetrized Shannon entropy has a very suggestive form and might be a key for further understanding of the essential properties of NESS in systems with momenta. In section 10.3, we discuss a simple toy model which illustrates the need of the symmetrized Shannon entropy.

1.2

SST in a typical example

Here we shall briefly discuss the essence of SST, i.e., the extended Clausius relation in the simplest example of heat conducting system. We also try to place our work in the broader context of thermodynamics and statistical mechanics by discussing relevant background. 5

Clausius relation for operation between equilibrium states: To motivate our extended Clausius relation, we first review the standard Clausius relation, which is the starting point of equilibrium thermodynamics. Consider a physical system (which can be basically anything) attached to a single heat bath whose temperature can be controlled. See Fig. 1. When the inverse temperature of the bath is fixed at β, the system settles to the equilibrium state corresponding to β after a sufficiently long time. Recall that a physical system in equilibrium exhibits no macroscopic changes, and has no macroscopic flows (of, e.g., matter or energy). We next consider a thermodynamic operation. We start from a situation where the inverse temperature of the bath is β and the system is in the corresponding equilibrium. Then we change the inverse temperature of the bath according to a protocol fixed in advance, i.e., a smooth function β(t) of time t ∈ [−τ, τ ] where β(−τ ) = β and β(τ ) = β ′ . We shall assume that τ is large and β(t) varies slowly. Let J(t) be the heat flux (the energy that flows within a unit time) from the bath to the system. Then the well known Clausius relation is Z τ ′ S(β ) − S(β) ≃ dt β(t)J(t), (1.1) −τ

where S(β) is the entropy of the system in the equilibrium state corresponding to β. (More precisely, the entropy is a function of the equilibrium state.) The relation (1.1) becomes an exact equality in the limit τ ↑ ∞ where β(t) varies infinitesimally slowly. The equality can be proved mathematically in various setting. We note that (−1) times the right-hand side of (1.1) is interpreted as the total entropy production in the heat bath. To see this, take a short time interval from t to t+∆t. The energy (heat) that flows into the bath during this interval is ∆Q = −∆t J(t). Then the corresponding increase (or production) of entropy in the bath is given by the standard relation ∆Sbath = β(t) ∆Q = −∆t β(t) J(t), which becomes the minus of (1.1) after integration over the whole process. Although the entropy S(β) was introduced above as a purely thermodynamic (or macroscopic) quantity, there is a neat expression in terms of microscopic probability distribution. If one represents the equilibrium state in terms of the canonical distribution ρβx = e−βHx /Z(β) (see section 2.1 for the notation), the same entropy is written as X S(β) = − ρβx log ρβx , (1.2) x

where the right-hand side is nothing but the Shannon entropy of the probability distribution ρβx . Extended Clausius relation for operation between NESS: In our approach to thermodynamics for NESS, we wish to focus on possible extensions of the Clausius relation (1.1) and the expression (1.2) of the entropy. The hope is that proper extensions might be a starting point of a full-fledged thermodynamics. To be specific, we focus on a NESS in a heat conducting system, which is a typical nonequilibrium setting. Consider a system which is attached to two large heat baths whose temperatures can be controlled. See Fig. 2. 6

β1

β2

β1 (t)

β1′

J1 (t)

J2 (t)

β2′

β2 (t)

Figure 2: Thermodynamic operation in a NESS (nonequilibrium steady state). There are two heat baths and the system of interest in between them. We fix the inverse temperatures of the baths to β1 and β2 , and assume that the system is in the corresponding NESS. Then we slowly change the inverse temperatures of the baths to β1′ and β2′ . The inverse temperatures of the baths at time t are denoted as β1 (t) and β2 (t), and the heat flux from the baths to the system as J1 (t) and J2 (t). R τ Now the total entropy production (in the baths) − −τ dt β1 (t)J1 (t) + Rτ −τ dt β2 (t)J2 (t) is proportional to the total time 2τ , and cannot appear in a thermodynamic relation as it is. In fact our extended Clausius relation (1.4) is written in terms of its “renormalized” counterpart.

Suppose first that the inverse temperatures of the baths are fixed at β1 and β2 , respectively. It is expected that, after a sufficiently long time, the system settles to a stationary state which has a steady temperature gradient and a constant heat current through it. Such a state, which has no macroscopically observable changes, but has a nonvanishing flow of energy, is a typical example of NESS. As in the case of equilibrium, we consider a thermodynamic operation to NESS. We start from the situation where the two heat baths have fixed inverse temperatures β1 and β2 , and the system is in the corresponding NESS. Then we change the inverse temperatures of the baths according to a fixed protocol, i.e., functions β1 (t) and β2 (t) of time t ∈ [−τ, τ ]. We write β1 = β1 (−τ ), β2 = β2 (−τ ), β1′ = β1 (τ ), and β2′ = β2 (τ ). We wish to ask whether a relation analogous to the Clausius relation (1.1) holds in this setting. Since there are two heat baths, we need to consider two heat currents separately. By Jk (t), where k = 1, 2, we denote the heat flux from the k-th bath to the system at time t. Then a naive analogue to (1.1) is Z τ ? X ′ ′ dt βk (t) Jk (t), (1.3) S(β1 , β2 ) − S(β1 , β2 ) ≃ k=1,2

−τ

where the minus of the right-hand side is the total entropy production in the two heat baths. Here S(β1 , β2 ) is a certain function of two inverse temperatures β1 , β2 , which should be called the nonequilibrium entropy. But it turns out that a relation like (1.3) can never be valid. This is most clearly seen by examining the right-hand side in the case where both β1 (t) = β1 and β2 (t) = β2 are 7

independent of t. Suppose that β1 < β2 . In the NESS characterized by the two inverse temperatures β1 and β2 , there is a steady heat flux Jst > 0 from the bath 1 to the bath 2 through the system. We thus have J1 (t) = Jst and J2 (t) = −Jst for any t ∈ [−τ, τ ]. The right-hand side of (1.3) is thus equal to 2τ (β1 − β2 )Jst , which is negative and grows proportionally with τ . The left-hand side, on the other hand, is vanishing since βk′ = βk . The relation (1.3) is clearly invalid. More generally, fix the initial inverse temperature βk and the final inverse temperature βk′ (for k = 1, 2), and take reference functions β˜k (s) of s ∈ [−1, 1] such that β˜k (−1) = βk and β˜k (1) = βk′ . For a given τ > 0, we choose our protocol as βk (t) = β˜k (t/τ ). Note that, when τ becomes large, the right-hand side of (1.3) diverges (roughly) proportionally to τ because there always is a heat current going through the system. On the other hand the left-hand side is independent of τ , because S(β1 , β2 ) should be a function of the two inverse temperatures. We again conclude that the relation (1.3) cannot be valid. Extended Clausius relation: We need to find a way to “renormalize” the divergence in the right-hand side of (1.3) to get a finite quantity. One strategy is to introduce the reverse operation as follows. See Fig. 3. We start from the situation where the two heat baths have fixed inverse temperatures β1′ and β2′ , and the system is in the corresponding NESS. Then we change the inverse temperatures of the baths according to the reverse protocol defined by the functions βk† (t) := βk (−t) for k = 1, 2. Again we denote by Jk† (t), where k = 1, 2, the heat flux from the k-th bath to the system at time t in this process. We expect Jk† (t) ≃ Jk (−t) when the operation is slow enough6 . But we don’t have the exact equality Jk† (t) = Jk (−t) in general since the currents at a given moment may depend on the history of the system. The subtle difference between Jk† (t) and Jk (−t) can be a key to understand the nature of NESS. † Since J the total entropy production in the baths R kτ(t) ≃ †Jk (−t), P † − k=1,2 −τ dt βk (t) Jk (t) for the reverse protocol should diverge as τ ↑ ∞ in the same manner as that in the original protocol, i.e., the minus R τ right-hand side P of the of (1.3). This observation suggests that their difference k=1,2 −τ dt βk (t) Jk (t) − Rτ P † † k=1,2 −τ dt βk (t) Jk (t) may be finite in the limit τ ↑ ∞, and may play a meaningful role. This is indeed the case, and we shall prove, for a class of models close to equilibrium, that the extended Clausius relation Z Z τ o 1 Xn τ ′ ′ S(β1 , β2 ) − S(β1 , β2 ) ≃ dt βk (t) Jk (t) − dt βk† (t) Jk† (t) , (1.4) 2 k=1,2 −τ −τ

holds in the limit τ ↑ ∞. See Theorem 5.3 for the precise statement. Note that the unwanted divergence is properly “renormalized” by considering the difference between the total entropy productions (in the baths) in the original and the time-reversed operations. The same relation can be written by using the notion of excess entropy production as in [2, 3]. See Theorem 5.4. 6

To be precise, this is true when there always is a nonvanishing temperature difference. When β1 (t) = β2 (t), the currents are very small and we rather have Jk† (t) ≃ −Jk (−t).

8

β1

β1 (t), β2 (t)

β1′

β2

β1† (t), β2† (t)

β2′

Figure 3: We consider the reverse protocol in which the inverse temperatures are varied from (β1′ , β2′ ) back to (β1 , β2 ) in exactly the reverse manner. Note that we are not proposing reverse time-evolutions of any kind, but considering the ordinary time-evolution under the reverse protocol. By taking the difference between the total entropy productions (in the baths) in the original and in the reverse protocols, we get a finite quantity which appears in the extended Clausius relation (1.4).

The nonequilibrium entropy in (1.4) satisfies S(β, β) = S(β), where S(β) is the equilibrium entropy. Likewise the extended Clausius relation (1.4) reduces to the original Clausius relation (1.1) if the temperatures of the baths are always identical with each other, i.e., β1 (t) = β2 (t) for any t ∈ [−τ, τ ]. We can say that our relation (1.4) is a natural extension of the original Clausius relation (1.1) to operations between NESS. We also stress that the right-hand side of (1.4) can be, in principle, measured experimentally; one needs to perform a pair of experiments for the original and the reverse protocols, and measure the heat currents from the two baths. There are however (at least) two serious drawbacks in our theory. First the extended Clausius relation (1.4) is an approximate relation which is meaningful only when the system is close to the equilibrium. Our theory says nothing about systems which are very far from equilibrium. Secondly the extended Clausius relation (1.4) holds only for protocols where the parameters are varied very slowly. In the equilibrium thermodynamics, the Clausius inequality is known to hold for processes which are not necessarily slow. We can also prove an inequality corresponding to (1.4) (only for models without “momenta”), but it contains an error term which is not perfectly under control. See Theorem 5.5 and section 10. Other schemes of “renormalization”: Let us note that the above procedure is certainly not the unique way of “renormalizing” the divergent entropy production. For the moment, at least three other schemes of renormalization are known. The scheme by Hatano and Sasa [9] developed for the overdamped Langevin system was the first realization of SST based on microscopic (or mesoscopic) dynamics. A very close, but slightly different, scheme based on macroscopic fluctuation theory was recently proposed by Bertini, Gabrielli, Jona-Lasinio, and Landim [10, 11]. The scheme due to Maes and Netocny [13] makes a full use of the large deviation analysis. See [13, 14] for discussions about the relations between different schemes. Unlike our scheme, all these three schemes lead to extended Clausius relations (or 9

analogous equivalent relations) which are exact for systems arbitrarily far away from equilibrium. Moreover, these equalities are accompanied by corresponding inequalities which are valid for general processes. These are clear advantages of the three schemes. On the other hand, the renormalization in these three schemes requires subtraction of rather involved quantities which are not directly observable in experiments. In this sense our scheme, which uses only directly measurable quantities, has an advantage. We also note that our scheme applies to a larger class of models than the others. Although the Maes-Netocny scheme is based on microscopic (or mesoscopic) dynamics, it does not apply to models with inertia (momenta) as it is. As for the Hatano-Sasa scheme it has been pointed out [31] that a consistent thermodynamic interpretation is impossible once the momentum degrees of freedom is introduced. See also [14]. Among the four, ours seems to be the only scheme which provides a consistent thermodynamic relation in models including momenta (although we lack inequalities). See section 10. For the moment we cannot say anything definite about which (or, even any) of the four schemes is most promising. We believe that further investigations from mathematical, theoretical, and experimental points of view are necessary.

2

Setup and definitions

Here we introduce general Markov jump processes that we study, and fix the notation. Quantities specific to our approach to nonequilibrium physics are introduced in section 2.3. We also describe typical examples in section 2.4.

2.1

Markov jump process

Let the state space S be a finite set. The elements x, y, . . . ∈ S are states (in a suitable mesoscopic description) of the system. The probability distribution is denoted in vector notation as p = (px )x∈S where px is the probability to find the system in a state x. We assume that there is a set of parameters α which characterizes the system. For concreteness we assume that α takes its values in a compact subset of Rn for some n > 1. Fix an arbitrary time scale τo > 0. During the time interval [−τo , τo ], an external agent performs an operation to the system by controlling α according to a protocol (i.e., a function of time t) α(t) (with t ∈ [−τo , τo ]) which is fixed in advance. The function α(t) need not be continuous. We write the initial and the final values of the parameters as α = α(−τo ) and α′ = α(τo ), respectively. We also take τ which is much larger than τo , and consider the time evolution of the system in the longer time interval [−τ, τ ]. We extend the protocol to the whole time interval by simply setting α(t) = α for t ∈ [−τ, −τo ] and α(t) = α′ for t ∈ [τo , τ ]. See Figure 4. We denote the whole protocol as α ˆ = (α(t))t∈[−τ,τ ] . A special protocol in which α(t) takes a constant value α throughout [−τ, τ ] is denoted as (α). We consider a Markov jump process characterized by a protocol α. ˆ

10

α′ α −τ

−τo

0

t τo

τ

Figure 4: The protocol α ˆ = (α(t))t∈[−τ,τ ] , which brings the model parameters ′ from α to α . The parameters α(t) stay constant in the initial and the final stages. α For given parameters α and x, y ∈ S such that x 6= y, let Rx→y ≥ 0 be the transition rate from the state x to y. Physically speaking transitions in our system is caused by interactions between the system and heat baths attached to it. We assume that α α Rx→y 6= 0 implies Ry→x 6= 0 for any x 6= y. We also assume that the whole state space α S is “connected” by nonvanishing Rx→y . More precisely, for any x, y ∈ S with x 6= y, one can take a sequence x0 , x1 , . . . , xn such that x0 = x, xn = y, and Rxαj−1 →xj 6= 0 for any j = 1, 2, . . . , n. We also define the escape rate at x ∈ S by X α λαx := Rx→y . (2.1) y∈S (y6=x)

The Markov jump process corresponding to the protocol α ˆ = (α(t))t∈[−τ,τ ] is defined by the master equation X dpx (t) α(t) = −λα(t) p (t) + py (t) Ry→x , x x dt y∈S

(2.2)

(y6=x)

for any x ∈ S and t ∈ [−τ, τ ], where px (t) is the probability to find the system in x at time t. The equation (2.2) is neatly rewritten in the vector notation as dp(t) = Rα(t) p(t), dt

(2.3)

where p(t) = (px (t))x∈S is regarded as a column vector. The transition rate matrix7 Rα α is defined by specifying its entries as (Rα )yx = Rx→y for x 6= y and (Rα )xx = −λαx . The formal solution of (2.3) is written as hZ t i α(s) p(t) = exp← ds R pinit , (2.4) −τ

7

The standard generator of a stochastic process is given by the transpose of R.

11

S x2 x0 x1 xn xn−1 −τ

t1

t2

tn τ

t

Figure 5: A schematic picture of a path x ˆ.

where pinit is the initial distribution given at −τ . The time-ordered exponential is defined by hZ t i α(s) exp← ds R −τ

:= lim exp N ↑∞

h (t + τ )Rα(sN−1 ) i N

exp

h (t + τ )Rα(sN−2 ) i N

· · · exp

h (t + τ )Rα(s0 ) i , N

(2.5)

with sj = {(t + τ )/N}j − τ . When α(t) is time-independent, (2.5) coincides with the usual exponential exp[(t + τ )Rα ]. It is a well known consequence of the Perron-Frobenius theorem that, for any parameter α, one has lim exp[s Rα ] pinit = ρα , (2.6) s↑∞

where pinit is an arbitrary initial probability distribution. Here ρα = (ραx )x∈S is the unique stationary probability distribution characterized by the condition Rα ρα = 0. It is also known that ραx > 0 for any x. Physically speaking ρα is the probability distribution for the nonequilibrium steady state (NESS) of the system with constant parameters α.

2.2

Description in terms of paths

It is sometimes more convenient to describe the Markov jump process in terms of a path (or a history) xˆ of the state. A path is naturally identified with a piecewise constant function xˆ = (x(t))t∈[−τ,τ ] , but we shall often specify it in terms of the history of jumps as xˆ = (n, (x0 , x1 , . . . , xn ), (t1 , t2 , . . . , tn )), (2.7) where n = 0, 1, 2, . . . is the total number of jumps, x0 , . . . , xn ∈ S (such that xj−1 6= xj for j = 1, . . . , n) are the states that the system has visited, and tj is the time at which the jump xj−1 → xj took place. They are ordered as −τ < t1 < t2 < . . . < tn < τ , and we often write t0 = −τ and tn+1 = τ . We also write x0 and xn as x(−τ ) and x(τ ), respectively. See Figure 5. 12

Then the weight (more precisely, the transition probability density) associated with a path xˆ is n n h Z tj+1 i Y Y α ˆ α(tj ) α(t) exp − (2.8) T [ˆ x] := Rxj−1 →xj dt λxj . tj

j=0

j=1

The weight is normalized so that Z

Dˆ x δx(−τ ),x T αˆ [ˆ x] = 1

(2.9)

for any initial state x ∈ S, where the “integral” over all the paths is defined by Z Z τ Z τ Z τ Z τ ∞ X X Dˆ x(· · · ) := dt1 dt2 dt3 · · · dtn (· · · ). (2.10) n=0 x0 ,...,xn ∈S (xj−1 6=xj )

−τ

t1

t2

tn−1

In this language, the general solution (2.4) is written as Z α ˆ px (t) = Dˆ x pinit x]. x(−τ ) δx(t),x T [ˆ

(2.11)

One way to see the equivalence of (2.4) and (2.11) is to write the matrix product explicitly (in terms of the sums over S) in (2.5). Let f [ˆ x] be an arbitrary function of xˆ. We define the expectation value of f [ˆ x] by Z α(−τ ) ˆ hf iαst→ := Dˆ x f [ˆ x] ρx(−τ ) T αˆ [ˆ x], (2.12) where the subscript “st →” indicates that the system starts from the steady state for parameter α(−τ ), and nothing is specified for the final condition. We stress that this is a physically natural expectation, which can be realized experimentally.

2.3

Entropy production, work, and time-reversal

Let us further specify our problem, and also introduce some important quantities. We assume that each state x ∈ S is associated with its energy Hxν ∈ R. Here ν is a parameter (or a set of parameters) that characterizes the Hamiltonian Hxν , and is a component of α. See (2.30) and (2.34) for examples. We assume that ν takes its value ′ in a compact subset of Rn for some n′ ≥ 1. α Entropy production: For any x, y ∈ S such that Rx→y 6= 0, we define the entropy 8 production in the heat baths associated with the transition x → y by α θx→y := log 8

α Rx→y , α Ry→x

(2.13)

Throughout the present paper, the entropy production always means the entropy production in the heat baths.

13

or, equivalently, by the “local detailed balance condition” α

α α Ry→x = e−θx→y Rx→y .

(2.14)

α α Clearly one has θx→y = −θy→x . Mathematically speaking (2.13) is a mere definition. α With this definition of θx→y , we can justifies the “detailed fluctuation theorem” (6.4), which will be a basis of the present work. To give the quantity a physical interpretation as entropy production, we need some preparations. A class of processes called equilibrium (stochastic) dynamics describe a system attached to heat baths with a single temperature and free from any non-conservative forces (see footnote 15). Such a system approaches the corresponding equilibrium state after (β,ν) a sufficiently long time9 . The transition rates Rx→y (where we have written α = (β, ν)) in an equilibrium dynamics satisfy the detailed balance condition ν

ν

(β,ν) (β,ν) e−βHx Rx→y = e−βHy Ry→x ,

(2.15)

for any x, y ∈ S. Here β is the single inverse temperature of the heat baths. It is well known (and easy to prove) that the condition (2.15) ensures that the corresponding ν (β,ν) stationary distribution is the canonical distribution ρx = e−βHx /Zν (β), where Zν (β) = P −βHxν is the normalization constant. x∈S e Under the detailed balance condition (2.15), the entropy production (2.13) becomes (β,ν) = β (Hxν − Hyν ) = −β qx→y , θx→y

(2.16)

where qx→y = Hyν − Hxν is the change in the energy of the system, which is equal to the heat transferred from the baths to the system. The final expression in (2.16) is nothing but the well-known formula for the change (or the production) of entropy in equilibrium thermodynamics. The main subject of the present work is non-equilibrium stochastic dynamics, for which the detailed balance condition (2.15) can never be satisfied for any choice of β α and Hxν . We nevertheless assume here that the entropy production θx→y is written as α θx→y = −βB qx→y ,

(2.17)

where βB is the inverse temperature of the single heat bath10 that is relevant to the transition x → y, and qx→y is the energy (heat) transferred from the bath to the system during the transition. The idea behind the identification (2.17) is that heat baths are always in equilibrium states so that we can use the relation from equilibrium thermodynamics for each transition, even when the system never settles to equilibrium11 . 9

The approach to equilibrium is indeed a nonequilibrium phenomenon that can be studied in the framework of equilibrium dynamics. 10 We assume here (and in what follows) that each transition is associated with only a single bath. See also section 2.4. 11 In a formulations based on a stochastic process (as in the present work), the relation (2.17) is nothing more than an interpretation. In more microscopic formulations based on mechanics, one may justify such relations. See, e.g., [43].

14

Throughout the present work we assume that the nonequilibrium system can be interpreted as a perturbation to an equilibrium system. As for the entropy production we write α α θx→y = ψx→y + β (Hxν − Hyν ), (2.18) where β > 0 is a certain reference inverse temperature, which may not be unique. The α quantity ψx→y should be called the nonequilibrium part of entropy production. It also α α satisfies ψx→y = −ψy→x . For a given path xˆ as in (2.7), we can define the total entropy production in xˆ as α ˆ

Θ [ˆ x] =

n X

j) θxα(t , j−1 →xj

(2.19)

j) ψxα(t . j−1 →xj

(2.20)

j=1

and its nonequilibrium part as Ψαˆ [ˆ x] =

n X j=1

For any subinterval [τ1 , τ2 ] ⊂ [−τ, τ ], we define partial entropy productions by Θ

[τ1 ,τ2 ],α ˆ

n X   j) , [ˆ x] = χ tj ∈ [τ1 , τ2 ] θxα(t j−1 →xj

Ψ[τ1 ,τ2 ],αˆ [ˆ x] =

j=1 n X j=1

  j) χ tj ∈ [τ1 , τ2 ] ψxα(t , j−1 →xj

(2.21) (2.22)

where χ[true] = 1 and χ[false] = 0. Work: Let us write the protocol for the parameter of the Hamiltonian as νˆ := (ν(t))t∈[−τ,τ ] , which is a component of the full protocol α ˆ . For a path xˆ, we define νˆ

W [ˆ x] :=

n X j=0

j+1 ) (Hxν(t j



j) Hxν(t ) j

=

n Z X j=0

tj+1

tj

 ν  dν(t) ∂Hx(t) dt , dt ∂ν ν=ν(t)

(2.23) ν(t

)

where the final expression is valid only when ν(t) is differentiable. Note that Hxj j+1 − ν(t ) Hxj j is the change in the energy of the system during the interval (tj , tj+1 ), in which the state of the system is always xj . Since this change in the energy is caused solely by the change in ν, we can identify it with the work done by the external agent who operates on the system. Therefore (2.23) is the total work done by the external agent to the system in the path xˆ. Note that, since ν(t) varies only for t ∈ [−τo , τo ], the summand in (2.23) vanishes if (tj , tj+1 ) ∩ [−τo , τo ] = ∅.

15

It is obvious from (2.18) that Θ, Ψ, and W are related with each other. In fact by summing up (2.18) for all the transitions in xˆ, we see Θαˆ [ˆ x] = Ψαˆ [ˆ x] + β

n X j=1

= Ψαˆ [ˆ x] + β

j) j) Hxν(t − Hxν(t j−1 j

n nX j=0



o  ν(tj ) ν(tn+1 ) ν(t0 ) j+1 ) Hxν(t − H − H + H xj xn x0 j

 ν(−τ ) ν(τ ) = Ψαˆ [ˆ x] + β W νˆ [ˆ x] + β Hx(−τ ) − Hx(τ ) .

(2.24)

Time-reversal: For a path xˆ as in (2.7), we define its time-reversal xˆ† as xˆ† = (n, (xn , xn−1 , . . . , x0 ), (−tn , −tn−1 , . . . , −t2 , −t1 )).

(2.25)

If we use the language of function and denote original path as xˆ = (x(t))t∈[−τ,τ ] , the time-reversed path is xˆ† = (x(−t))t∈[−τ,τ ] . Similarly for a protocol α ˆ = (α(t))t∈[−τ,τ ] and its component νˆ = (ν(t))t∈[−τ,τ ] , we define their time-reversal as α ˆ † = (α(−t))t∈[−τ,τ ] and νˆ = (ν(−t))t∈[−τ,τ ] , respectively. One easily finds that the total entropy production, its nonequilibrium part, and the work are antisymmetric with respect to the time-reversal, i.e., †

Θαˆ [ˆ x] = −Θαˆ [ˆ x† ],



Ψαˆ [ˆ x] = −Ψαˆ [ˆ x† ],



W νˆ [ˆ x] = −W νˆ [ˆ x† ].

(2.26)

Measurability of the quantities: In operational approaches to thermodynamics we believe it essential to distinguish between physical quantities which are experimentally measurable (at least in principle) and which are not. In what follows we assume that a path xˆ has been realized, and ask whether the quantities W νˆ [ˆ x], Θαˆ [ˆ x], and Ψαˆ [ˆ x] are measurable. This corresponds to the measurability of these quantities in a single experiment. As in most treatments of equilibrium thermodynamics, we assume that the total work W νˆ [ˆ x] is measurable. The work is a purely mechanical quantity, and the external agent can, in principle, always determine it by precisely measuring the (generalized) force and the displacement. We next argue that the total entropy production (in the baths) Θαˆ [ˆ x] is also measurable. Suppose that the system is in touch with n heat baths, where the inverse x] be the total amount of heat that flowed temperature of the j-th bath is βj . Let Qαjˆ [ˆ into the system from the j-th bath during the experiment, i.e., the sum of qx→y for every transition (which involve the j-th bath) in the path xˆ . We assume that the total heat Qαjˆ [ˆ x] can be measured for each j. This may not be a trivial assumption, but in principle we can think of carefully designed heat baths where heat flux can be monitored accurately12 . Since P the relation (2.17) means that αˆthe total entropy production is written as Θαˆ [ˆ x] = − nj=1 βj Qαjˆ [ˆ x], we conclude that Θ [ˆ x] is measurable. 12

Such measurements are indeed possible in modern calorimetry.

16

The measurability of the nonequilibrium part Ψαˆ [ˆ x] of the total entropy producα ˆ tion is more subtle. We argue that Ψ [ˆ x] is measurable or semi-measurable depending on the model. (See the next section for details of the models.) In the models for heat conduction, where the system exchanges energy only with heat baths, we find Pn α ˆ x] from (2.33). This means that Ψαˆ [ˆ x] is determined from Ψ [ˆ x] = − j=1 (βj − β)Qαjˆ [ˆ α ˆ the measurable total heat Qj [ˆ x]. In the models of systems driven by an external nonconservative force, on the other hand, the system exchanges energy with the external field as well as the heat baths. It then turns out (see (2.36)) that Ψαˆ [ˆ x] is identical to β times the total work done to the system by the external field. The work done by the external field may be measured in principle13 , but the measurement seems to be extremely difficult in general. We thus regard Ψαˆ [ˆ x] as a semi-measurable quantity in this case.

2.4

Examples

Although our theory applies to a large variety of physical models, it might be useful to have some concrete examples in mind. Here we define a standard class of equilibrium dynamics, and then describe two typical problems of nonequilibrium physics. Equilibrium dynamics: Before discussing nonequilibrium problems, let us discuss equilibrium dynamics, which will be the starting point. To define transition rates, it is convenient to first specify the Hamiltonian Hxν and the connectivity function c(x, y) such that c(x, y) = c(y, x) ≥ 0 for any x, y ∈ S with x 6= y. We assume that the state space S is connected via nonvanishing c(x, y), or more precisely, for any x, y ∈ S one can take a sequence x0 , x1 , . . . , xn such that x0 = x, xn = y, and c(xj−1 , xj ) 6= 0 for any j = 1, 2, . . . , n. We make no assumptions on the Hamiltonian Hxν except that it is real. Then the transition rates for the equilibrium dynamics at the inverse temperature β may be defined, for example, as ν

(β,ν) Rx→y = c(x, y) eβHx ,

or

ν

(2.27) ν

(β,ν) Rx→y = c(x, y) e(β/2)(Hx −Hy ) .

(2.28)

It is clear that both the definitions satisfy the necessary conditions for transition rates including the detailed balance condition (2.15). Suppose that one has Hxν = Hyν and c(x, y) 6= 0 for some x 6= y. Then the rate (2.28) (β,ν) (β,ν) becomes Rx→y = Ry→x = c(x, y), and is independent of the inverse temperature β. The abstract scheme discussed above applies to various concrete physical settings. Let us describe a system of particles on a lattice. Let the lattice Λ be a finite set whose 13

One strategy is to measure the back action from the system to the generator (such as a coil) of the field. In a colloidal system it may be possible to determine the work done by the field by precisely measuring the positions of charged particles.

17

x1 x2 x3

Figure 6: A configuration x = (x1 , x2 , x3 ) of three particles on the lattice.

elements are denoted as u, v . . . ∈ Λ. We denote by B the set of bonds on Λ. More precisely the element of B is a pair {u, v} = {v, u} with some u, v ∈ Λ such that u 6= v. We assume that Λ is connected via the bonds in B. The  simplest example is the onedimensional periodic lattice Λ = {1, 2, . . . , L} with B = {x, x + 1} x ∈ Λ , where we identify L + 1 with 1. We assume that there are N particles on the lattice, and let x denote a configuration of the particles on Λ. More precisely, we set x = (x1 , . . . , xN ), where xj ∈ Λ is the position of the j-th particle (j = 1, . . . , N). One may or may not impose the hard-core condition, i.e., xj 6= xk whenever j 6= k. See Figure 6. For any two configurations x = (x1 , . . . , xN ) and y = (y1 , . . . , yN ), we set c(x, y) = 1 if {xk , yk } ∈ B for some k and xj = yj for any j such that j 6= k, and c(x, y) = 0 otherwise. In other words, c(x, y) = 1 if and only if one can modify the configuration x into y by moving one particle along a bond in B. As for the Hamiltonian, the standard choice is Hx :=

N X

V1 (xj ) +

j=1

N X

V2 (xj , xk ),

(2.29)

j,k=1 (j>k)

where the single particle potential V1 (·) and the two-particle interaction potential V2 (·, ·) are arbitrary real valued functions on Λ and Λ × Λ, respectively. Heat conduction: Let us discuss an idealized model of heat conduction. We assume that the system interacts with n heat baths with different temperatures. We label the baths by the index j = 1, 2, . . . , n, and denote by βj the inverse temperature of the j-th bath. The set of parameters that characterizes the model is α = (β1 , . . . , βn , ν).

(2.30)

With any x, y ∈ S such that c(x, y) 6= 0 we associate a unique index j(x, y) = j(y, x) ∈ {1, . . . , n}, which indicates that the j(x, y)-th bath is relevant for the transition between x and y. Then we define the transition rate as (β

,ν)

j(x,y) α Rx→y = Rx→y ,

18

(2.31)

for any x, y ∈ S such that x 6= y, where the right-hand side is defined by (2.27) or (2.28). From the definition (2.13), one finds α θx→y = βj(x,y) (Hxν − Hyν ) = ∆βj(x,y) (Hxν − Hyν ) + β (Hxν − Hyν ),

(2.32)

where we have chosen the reference inverse temperature β (somewhat arbitrarily), and wrote ∆βj := βj − β. Comparing with (2.18), one finds α ψx→y = ∆βj(x,y) (Hxν − Hyν ),

(2.33)

which is indeed small when all the inverse temperatures β1 , . . . , βn are close to each other, and β is chosen properly14 . Driven system: We shall illustrate a system which is in contact with a single heat bath with the inverse temperature β, but is driven by a non-conservative external force. For each pair x, y ∈ S such that c(x, y) 6= 0, we define a quantity dx→y ∈ R which satisfies the antisymmetry dx→y = −dy→x . Physically, dx→y is interpreted as the displacement (in the direction of the non-conservative external force) of the particle associated with the transition x → y. In the simplest example of particles on the one-dimensional periodic lattice, we set dx→y = 1 if a particle jumps to the right in the transition x → y, and dx→y = −1 if a particle jumps to the left. We assume that the non-conservative15 external force f is applied to the system. The model is parameterized by α = (β, ν, f ). (2.34) We then define the transition rate by α (β,ν) Rx→y = eβf dx→y /2 Rx→y ,

(2.35)

for any x, y ∈ S such that x 6= y, where the right-hand side is defined by (2.27) or (2.28). From the definitions (2.13) and (2.18), one readily finds α ψx→y = βf dx→y .

(2.36)

This means that the nonequilibrium part of the total entropy production Ψαˆ [ˆ x] (see (2.20)) can be interpreted as the total work done by the non-conservative external force to the system (multiplied by β). 14 In most of realistic situations for heat conduction, only some small portions of the system is in touch with the heat baths. To model such a situation by using a system of particles on a lattice, we assume that the energy of the system changes only when a particle hops within one of the portions which are in touch with the baths. In other words, if an allowed transition x → y is such that a particle hops outside the portions, then one must have Hxν = Hyν . We further use the transition rule (2.28) so as to make the corresponding transition rate (which is indeed 1) independent of any inverse temperatures. 15 We define the force in this setting as fx→y = f dx→y . The force fx→y is said to be conservative if one can write fx→y = Ux − Uy for any x, y ∈ S with a suitable function (i.e., potential) Ux .

19

3

Jarzynski-type equalities for NESS

We start by presenting some exact equalities which are valid for general operations (i.e., protocols) to NESS. They are reminiscent of the Jarzynski equality (3.2), which holds for operations to equilibrium states [32, 30]. We note, however, that the derivation of these equalities for NESS is not as straightforward as that of the original Jarzynski equality [32]. One of the main difficulties is that we do not know the explicit form of the probability distribution ρα of NESS while the corresponding stationary distribution in the equilibrium case is the canonical distribution. Our main equality is the following. We here consider a general protocol α ˆ introduced in the beginning of section 2.1. See, in particular, Figure 4. As we have discussed at the end of section 2.3, we regard that the work W νˆ is measurable, and the nonequilibrium part Ψαˆ of the entropy production is measurable or semi-measurable depending on the model. Theorem 3.1 There exists a function (that we call the free energy) P F (α) of νthe parameters α which coincides with the equilibrium free energy −β −1 log x∈S e−βHx for an equilibrium system with α = (β, ν), and we have for any protocol α ˆ that

αˆ νˆ α ˆ exp[−(β W + Ψ )/2] 1 st→ F (α′ ) − F (α) = − lim log

(3.1) αˆ† . † † β τ ↑∞ ν ˆ α ˆ exp[−(β W + Ψ )/2] st→

This theorem will be proved in section 6.2, using the results from section 7. We recall that ν is a component of α (as in (2.30) and (2.34)), and likewise the protocol νˆ = (ν(t))t∈[−τ,τ ] is a component of the full protocol α ˆ = (α(t))t∈[−τ,τ ] . Note that we fix the operation time scale τo when taking the limit τ ↑ ∞. It means that we are treating an arbitrary operation, including very “wild” ones. The equality (3.1) expresses the difference of the (nonequilibrium) free energy in terms of the expectation values defined for nonequilibrium processes. In this sense it may be regarded as a nonequilibrium version of the Jarzynski equality

1 ν ˆ (β,ˆ ν) F (β, ν ′ ) − F (β, ν) = − log e−βW eq→ β

(3.2)

for equilibrium processes. A fundamental difference of our equality from the original equality is that we must consider the expectation values for both the original protocol ˆ† α ˆ and its time-reversal α ˆ † . Let us stress, however, that the expectation h· · · iαst→ is not † at all unphysical; one simply executes the operation according to the protocol α ˆ , and considers a natural time-evolution starting from the NESS corresponding to α′ . Another difference is that here the expectation values involve the nonequilibrium part Ψ of the entropy production as well as the work W . (β,ˆ ν) We recall that one can prove the minimum work principle hW νˆ ieq→ ≥ F (β, ν ′ ) − F (β, ν), which is a representation of the second law of thermodynamics (in the standard equilibrium thermodynamics), by simply applying the Jensen inequality to the Jarzynski 20

equality (3.2). Unfortunately our equality (3.1) does not lead directly to any inequalities since the right-hand side is the ratio of the two expectation values. Below in Theorem 3.2, we see another equality which better resembles the original Jarzynski equality (3.2). But our (3.1) is of considerable importance especially because it is intimately related to thermodynamic relations in NESS, as we shall discuss in section 4. We note that various exact equalities which are valid for general stochastic processes (including ours) have been derived from the “detailed fluctuation theorem” in, e.g., [33, 34, 35]. Many similar equalities can be derived in the same manner. We believe that our equality (3.1) is essentially different from these equalities. While the equalities derivable with the methods in [33, 34, 35] contain quantities like log ραx which depend explicitly on the unknown stationary distribution ρα , our equality only contains W and Ψ which are (semi-)measurable thermodynamic type quantities. The derivation of our new equality is based not only on the “detailed fluctuation theorem” but also on the new “splitting lemma” (Lemma 6.1) which allows us to treat a process whose initial distribution is ρα without using ρα explicitly. It may be inspiring to rewrite the quantity inside the limit in (3.1) as

αˆ ν ˆ ˆ exp[−Ψαˆ /2] st→ he−β W /2 iαmod log + log

αˆ† , † ˆ† he−β W νˆ /2 iαmod exp[−Ψαˆ† /2] st→

where we have defined the modified expectation by

αˆ f exp[−Ψαˆ /2] st→ α ˆ hf imod :=

αˆ . exp[−Ψαˆ /2] st→

(3.3)

(3.4)

We note that the extra weight exp[−Ψαˆ [ˆ x]/2] has an effect of canceling the nonequiα librium contribution ψx→y in the transition rates. This is in particular true for the example of driven system discussed in section 2.4. Compare the transition rate (2.35) α with the formula (2.36) for ψx→y . ˆ The physics described by the expectation h· · · iαmod is then expected to be close to that of equilibrium. But there still is considerable “nonequilibrium effect” coming from the escape rates λαx , which are untouched in the modification (3.4). Possible essential roles played by the escape rates in nonequilibrium states have been emphasized by Maes ˆ and his collaborators [36, 37, 38]. In the expectation h· · · iαmod , nonequilibrium flows are cancelled and we can focus on the effects from the escape rates. We still do not know if this interpretation leads us to any new insights. Remark 1: The equality (3.1) is one of the series of equalities which can be proved in the similar manner. A general form includes an arbitrary constant κ ∈ R, and is

αˆ νˆ α ˆ exp[−κ β W − Ψ /2] 1 st→ F (α′) − F (α) = − lim log

(3.5) αˆ† . † † β τ ↑∞ α ˆ ν ˆ exp[−(1 − κ) β W − Ψ /2] st→

21

Remark 2: We can also prove the following exact equalities which involve the total entropy production Θ, rather than its nonequilibrium part Ψ. See the end of section 6.2 ˜ for the proof. There exists a function S(α) of the parameters α, and we have for any protocol α ˆ that

αˆ exp[−Θαˆ /2] st→ ′ ˜ ) − S(α) ˜ S(α = lim log

(3.6) αˆ † . τ ↑∞ exp[−Θαˆ† /2] st→

˜ Although one may be tempted to identify S(α) as nonequilibrium entropy, this interpretation may not be adequate. For parameters α which correspond to equilibrium, one ˜ finds that S(α) is (similar to but) not the same as the equilibrium entropy. See (6.15). Another exact Jarzynski-type equality for NESS was derived by one of us (N.N.) in [39]. Since we can prove this equality with the same machinery as the previous one, we shall briefly discuss it here. For an arbitrary function f [ˆ x] of xˆ, we define its Ψ-modified expectation as ˆ hf iαΨ-mod



αˆ ′ f exp[−(Ψ[−τ,−τ /2],(α) + Ψ[τ /2,τ ],(α ) )/2] st→ :=

αˆ , exp[−(Ψ[−τ,−τ /2],(α) + Ψ[τ /2,τ ],(α′ ) )/2] st→

(3.7)

where Ψ for restricted time intervals are defined in (2.22). We have replaced the protocols by (α) and (α′) in order to emphasize that α(t) is constant in these intervals (we here assume τ ≥ 2τo ). This is similar to the modified expectation (3.4) introduced above, but now the modification factor presents only in the time intervals [−τ, −τ /2] and [τ /2, τ ]. One can say that the system is in the modified nonequilibrium at the beginning and the end of the history, while it is in the full-fledged nonequilibrium in the middle. This hybrid allows one to prove the following strong result. Theorem 3.2 Let F (α) be the nonequilibrium free energy introduced in Theorem 3.1. For any protocol α ˆ , one has

† α ν ˆ ˆ ′ ˆ† lim he−β W iαst→ = e−β{F (α )−F (α)} lim exp[−Ψ[−τ /2,τ /2],αˆ ] Ψ-mod .

τ ↑∞

τ ↑∞

(3.8)

This theorem will be proved in section 6.3, using the results from section 7. It is remarkable that the left-hand side of (3.8) only includes the standard mechanical ˆ work W νˆ and the physically natural average h· · · iαst→ . Although the right-hand side is a little more complicated, it was shown in [39] that it can also be measured experimentally at least when the “degree of nonequilibrium” is small enough. By using the Jensen inequality, one can show from (3.8) a “second law” ˆ lim hW νˆ iαst→ ≥ F (α′ ) − F (α) −

τ ↑∞

1 † α ˆ† lim log exp[−Ψ[−τ /2,τ /2],αˆ ] Ψ-mod . β τ ↑∞

22

(3.9)

4

Thermodynamic relations for NESS

We shall observe here that our main equality (3.1) can be used to generate thermodynamic relations associated with operations that bring a NESS to a different NESS. The simplest and the most important is the extended Clausius relation (4.5), (4.6), and (4.10), which was derived in our earlier works [2, 3]. We concentrate on heuristic arguments in the present section, and discuss corresponding rigorous results in section 5.

4.1

Extended Clausius relation and entropy

Extended Clausius relation: Here we shall concentrate on a situation where the system is close to equilibrium and the change in the parameters during the operation is small. It is then expected that the arguments of the two exponential functions in the α right-hand side of the equality (3.1) are small, because ψx→y vanishes in an equilibrium νˆ system, and W is small if the change of the Hamiltonian is small. By expanding in these quantities to the lowest order, we see that (3.1) yields F (α′ ) − F (α) ≃

o 1 n † † † hβ W νˆ + Ψαˆ iαˆ − hβ W νˆ + Ψαˆ iαˆ , 2β

(4.1)

ˆ where we have abbreviated h· · · iαst→ as h· · · iαˆ for simplicity. We also assumed τ is sufficiently large, and have omitted limτ ↑∞ . Although (4.1) may be interpreted as a thermodynamic relation, it is much better to rewrite it in terms of the total entropy production Θ. By substituting (2.24), this becomes † † hΘαˆ iαˆ − hΘαˆ iαˆ ′ ′ ′ β {F (α ) − F (α)} ≃ + β hH ν iαst − β hH ν iαst . (4.2) 2 For an arbitrary function gx on S, we have defined its expectation value in the steady state with α as X hgiαst := gx ραx . (4.3) x∈S

If one introduces the nonequilibrium entropy through the “familiar” relation  (4.4) S(α) := β hH ν iαst − F (α) ,

the relation (4.2) becomes





hΘαˆ iαˆ − hΘαˆ iαˆ S(α ) − S(α) ≃ − , 2 ′

(4.5)

which is the extended Clausius relation obtained in [2, 3]. This is essentially the same as (1.4) in the introduction. To be slightly more precise about the near equality, we can write the same relation as †



hΘαˆ iαˆ − hΘαˆ iαˆ + O(ǫ2 δ) + O(δ 2 ), S(α ) − S(α) = − 2 ′

23

(4.6)

where ǫ denotes the “degree of nonequilibrium”, and δ denotes the amount of change in the parameters (see section 5 for precise definitions). The O(δ 2) term can be omitted for a quasi-static (i.e., smooth and slow) protocol. In Theorems 5.2 and 5.3, we present corresponding rigorous estimates. Consider an equilibrium protocol α ˆ eq = (αeq (t))t∈[−τ,τ ] , where αeq (t) for any t ∈ [−τ, τ ] corresponds to an equilibrium system. In such a case the standard adiabatic theorem † † implies that hΘαˆeq iαˆeq ≃ −hΘαˆeq iαˆeq for a sufficiently slow and smooth process. Then the relation (4.5) becomes ′ S(αeq ) − S(αeq ) ≃ −hΘαˆeq iαˆeq ,

(4.7)

which is nothing but the standard Clausius relation (1.1). Since S(αeq ) coincides with the standard entropy, we find the the extended Clausius relation (4.5) is an extension of the standard Clausius relation. † † Note that, in a NESS, both hΘαˆ iαˆ and hΘαˆ iαˆ grow proportionally with the total time τ since there always is a nonvanishing entropy production. The extended Clausius relation (4.5) shows that their difference is a finite quantity independent of τ (provided that τ is long enough) and characterizes the effect of the operation. It is crucial that the † † quantities hΘαˆ iαˆ and hΘαˆ iαˆ can be measured by executing (at least) two experiments with the protocol α ˆ and the corresponding reverse protocol α ˆ†. Excess entropy production: The right-hand side of the extended Clausius relation (4.5) or (4.6) can also be written in terms of an interesting quantity called the excess α entropy production. We first define the entropy production rate σst in the NESS with the parameters α by 1 (α) α σst := hΘ(α) ist→ , (4.8) 2τ where (α) denotes the protocol in which α(t) = α for any t ∈ [−τ, τ ]. We shall prove in section 7.5 that the definition is independent of τ . This independence may be intuitively apparent since the system is always in the NESS with α, and Θ(α) is the total entropy production. For an arbitrary protocol α ˆ = (α(t))t∈[−τ,τ ] , we next define the corresponding housekeeping entropy production by Z τ α(t) α ˆ Σhk := dt σst . (4.9) −τ

The house-keeping entropy production is indeed the main contribution to the total ˆ entropy production hΘαˆ iαst→ , especially when τ is large and α(t) varies slowly. The α ˆ α ˆ α ˆ difference hΘ ist→ −Σhk is called the excess entropy production. It represents the intrinsic response of the system to the change of the parameters. By using the excess entropy production, the extended Clausius relation (4.6) is written as [2, 3] n o ˆ S(α′ ) − S(α) = − hΘαˆ iαˆ − Σαhk + O(ǫ2 δ), (4.10) 24

where we omitted the O(δ 2 ) term assuming a quasi-static protocol. We present a rigorous version of the relation in Theorem 5.4. Nonequilibrium entropy: Finally let us discuss the basic property of the entropy S(α). For any probability distribution p = (px )x∈S , the corresponding Shannon entropy is defined by X SSh [p] := − px log px . (4.11) x∈S

It is well-known that for an equilibrium parameter αeq , one has S(αeq ) = SSh [ραeq ], i.e., the Shannon entropy of the stationary distribution (which is the canonical distribution) is exactly equal to the thermodynamic entropy. This equality is no longer valid in nonequilibrium systems, but we can still show the near equality S(α) = SSh [ρα ] + O(ǫ3 ). The proof is based on the representation n

τ,(α) o τ,(α)

1 log ραx = β F (α) − β Hxα − lim Ψ(α) x→ − Ψ(α) st→x + O(ǫ3 ), 2 τ ↑∞

(4.12)

(4.13)

for the stationary distribution ρα = (ραx )x∈S , which was derived by Komatsu and Nakagawa [40]. See Theorems 5.1 and 8.1 for rigorous versions. Remark: Recall that we also have the equality (3.6), which directly deals with the entropy production Θ. Although one might suspect that (3.6) leads us immediately to the extended Clausius relation (4.5), it is indeed not the case. Unlike the nonequilibrium part Ψ, the total entropy production Θ does not vanish in the limit of equilibrium. The expansion in Θ is not justified even in a heuristic discussion.

4.2

Higher order relations

We continue to be heuristic, and discuss higher order contributions from the expansion of (3.1) that we considered above. Recall that for a general expectation h· · · i and random variables X1 , X2 , · · · , Xk , the corresponding cumulant is defined by Pk ∂ ∂ ∂ c u X i i hX1 X2 · · · Xk i := ··· logh e i=1 , (4.14) i ∂u1 ∂u2 ∂uk u1 =u2 =···=uk =0 which leads to the formal expansion

X

logh e i = where ch X k i := ch X · · X} i. | ·{z

∞ c X h Xk i k=1

k

25

k!

,

(4.15)

By applying (4.15) (formally) to the right-hand side of the equality (3.1), one finds ∞ ˆ c

ˆ† o 1 X (−1)k nc

νˆ α ˆ k α νˆ† α ˆ† k α F (α ) − F (α) = − . (β W + Ψ ) − (β W + Ψ ) β k=1 2k k! ′

(4.16)

This is an improvement, which contains infinitely many “nonlinear” terms, of the extended Clausius relation (4.5) or (4.6). But a thermodynamic relation with infinitely many terms may not be useful or enlightening. It may be useful to have truncated versions of the nonlinear relation which are valid in certain limited situations. Here we still concentrate on a situation where the system is close to equilibrium and the change in the parameters during the operation is small. As before we denote by ǫ the “degree of nonequilibrium”, and by δ the amount of change in the parameters. We shall define these quantities precisely later in section 5. The error estimate in (4.6) was derived heuristically (but not proved) in [2, 3]. With a similar argument, we can show for any n = 1, 2, . . . that αˆ

αˆ† † † (β W νˆ + Ψαˆ )2n − c (β W νˆ + Ψαˆ )2n = O(ǫ2n δ) + O(δ 2 ).

c

(4.17)

We shall describe the derivation of this estimates in the next section 4.3. By using (4.17), we can write the truncated version of the higher order relations, which is 2n−1 ˆ c

ˆ† o 1 X (−1)k nc

νˆ α ˆ k α νˆ† α ˆ† k α F (α ) − F (α) = − (β W + Ψ ) − (β W + Ψ ) β k=1 2k k! ′

+ O(ǫ2n δ) + O(δ 2).

(4.18)

See Theorems 5.2 and 5.3 for the corresponding rigorous estimates. Remark: In [3], we have derived (again non-rigorously) a “non-linear nonequilibrium thermodynamic relation”, which does not exactly fit into the above form. Although we can derive the equality in [3] within the present framework, we won’t discuss the derivation here16 .

4.3

Perturbative estimate of the error

Let us present a heuristic derivation of (4.17). We shall make use of formal perturbative estimates freely without being bothered by the validity of the perturbation calculation. The following estimate (as it is) can be mathematically justified only for systems with a fixed time τ and extremely small ǫ and δ. Since the convergence estimate is not uniform in τ (in the present section), we have no rigorous control of the the limit τ ↑ ∞, in which we are interested. Nevertheless this estimate will be used as a part of rigorous argument later in section 7.4. 16

A convenient derivation is to start from (3.5) with κ = 1, and apply the cumulant expansion as above.

26



Define ∆Φαˆ and ∆Φαˆ , which are of O(δ), by Ψαˆ + βW νˆ = Ψ(α) + ∆Φαˆ ,







Ψαˆ + βW νˆ = Ψ(α) + ∆Φαˆ .

(4.19)

Then the quantity to be estimated is written as

αˆ

αˆ† c (α) 2n αˆ c (α) 2n αˆ† † † c = (Ψ ) − (Ψ ) (β W νˆ + Ψαˆ )2n − c (β W νˆ + Ψαˆ )2n



αˆ† † α ˆ + O(δ 2). (4.20) + 2n c ∆Φαˆ (Ψ(α) )2n−1 − 2n c ∆Φαˆ (Ψ(α) )2n−1

αˆ

αˆ† To evaluate c (Ψ(α) )2n − c (Ψ(α) )2n , we expand around the same expectation taken in the constant protocol h· · · i(α) . To be precise, we note that one can write hf iαˆ = hf (1 + ∆Γαˆ )i(α) for any function f [ˆ x], where ∆Γ[ˆ x] = O(δ). Then the leading

(α) 2n (α) c (α) 2n (α) c contribution is (Ψ ) − (Ψ ) , which is obviously vanishing. The remain

† (α) der is c (Ψ(α) )2n (∆Γαˆ − ∆Γαˆ ) , which is of order Ψ2n O(δ) = O(ǫ2n δ). We have also noted Ψ = O(ǫ).

αˆ

αˆ† † is more delicate The evaluation of the terms c ∆Φαˆ (Ψ(α) )2n−1 −c ∆Φαˆ (Ψ(α) )2n−1 and essential. Note that a naive order counting would show that these terms are of O(ǫ2n−1 δ) + O(δ 2). This is of O(ǫ) worse than what we want. To make an optimal estimate we expand around the same expectation taken in the equilibrium process h· · · iαˆeq , where α ˆ eq is a protocol17 which always stays in equilibrium and close to α. ˆ To be precise we note that hf iαˆ = hf (1 + ∆Ωαˆ )iαˆeq for any f [ˆ x], where α ˆ ∆Ω [ˆ x] = O(ǫ). Let us also note that the expectation in equilibrium processes satisfies the symmetry relation e−βF (αeq ) h e−βW

ν ˆ /2



f iαˆeq = e−βF (αeq ) h e−βW

ν ˆ† /2



f † iαˆeq ,

(4.21)

where νˆ is a component of the equilibrium protocol α ˆ eq , and the time-reversed function † † † ′ f is defined by f [ˆ x ] = f [ˆ x]. This can be easily derived from (6.5). Since F (αeq )− † α ˆ eq † (ˆ αeq ) F (αeq ) = O(δ) and W = O(δ), we find that h f i = h f i + O(f δ). α ˆ † α ˆ† (α) † Now it is crucial for us that (∆Φ ) = −∆Φ and (Ψ ) = −Ψ(α) . Therefore

αˆ†

αˆeq † − c ∆Φαˆ (Ψ(α) )2n−1 eq = the leading order in the expansion is c ∆Φαˆ (Ψ(α) )2n−1 O(∆Φ Ψ2n−1 δ) = O(ǫ2n−1 δ 2 ), which can be absorbed into O(δ 2) in (4.17). The next contribution in the expansion is of order ∆Φ Ψ2n−1 ∆Ω = O(ǫ2n δ), which is the lowest order contribution.

5

Rigorous results about thermodynamic relations

Although our equality (3.1) is rigorous and works for essentially arbitrary protocol α, ˆ the corresponding thermodynamic relations (4.5), (4.6), (4.10), and (4.18) have been derived only heuristically. Here we present rigorous results which (at least partially) justify our claim. See [41] for a related work for the linear response theory. 17

In a system with parameters (2.34) with a nonequilibrium protocol (β, ν(t), f (t)), for example, we choose the equilibrium protocol as (β, ν(t), 0).

27

Models: In order to state rigorous results, we shall specify the class of models that we consider. Although our proof covers quite a general class of models, we shall be moderately concrete here. As in section 2.4, we fix a finite state space S and a corresponding connectivity function c(x, y). We also fix the reference inverse temperature β. We consider an arbitrary real Hamiltonian Hxν which depends smoothly on ν, and satisfies |Hxν | ≤ E¯ for any x ∈ S and ν (in the allowed range) with a constant E¯ > 0. κ We also introduce a nonequilibrium function ξx→y for any x, y ∈ S such that c(x, y) 6= 0, where κ is a parameter (or a set of parameters) which takes its value in a compact subset ′′ κ κ of Rn . We assume that ξx→y satisfies |ξx→y | ≤ 1 for any κ. We introduce the degree of nonequilibrium ǫ ≥ 0, and characterize the system by α := (β, ǫ, ν, κ).

(5.1)

We then define the corresponding transition rate by κ

α (β,ν) Rx→y = eǫ ξx→y Rx→y ,

(5.2)

(β,ν)

where the rate Rx→y for the equilibrium dynamics is defined by (2.27) or (2.28). Note that both the examples (2.31) and (2.35) are written in the form (5.2). Nonequilibrium entropy: We start from the characterization of our nonequilibrium entropy. The entropy S(α) is defined by “thermodynamic” relation (4.4) in terms of the free energy F (α), which will be defined later in (6.12). We have already noted that S(α) coincides with the standard entropy in equilibrium, and is close to the Shannon entropy (4.11) of the stationary distribution ρα as in (4.12). The following is the corresponding rigorous statement. Theorem 5.1 (Nonequilibrium entropy and the Shannon entropy) One has (5.3) S(α) − SSh [ρα ] ≤ A ǫ3 for any α, where A is a positive constant which depends only on the class of models.

This theorem is proved in section 8 by using Theorem 8.1, which is a rigorous version of the representation (4.13). Like the constant A above, all the constants in the following theorems depend only on the choice of the class of models. Extended Clausius relation for a step protocol: Let us discuss rigorous statements about the extended Clausius relation and the related higher order relations. As a first step, we treat the step protocol, which is defined by ( α = (β, ǫ, ν, κ) t ∈ [−τ, 0], (5.4) α(t) = α′ = (β, ǫ, ν ′ , κ′ ) t ∈ (0, τ ]. 28

α′ α −τ

2τ /(N + 1)

0

t τ

Figure 7: The monotone stepwise protocol defined by (5.8) and (5.9). There are N steps separated by the temporal width 2τ /(N + 1). We can prove the extended Clausius relation (5.11) in the limit where we take τ ↑ ∞ and then N ↑ ∞.

Theorem 5.2 (Extended Clausius relation for the step protocol) There are positive constants B, Cn , and Cn′ (with n = 1, 2, . . .), and we have the following. Define the amount of change characterizing the protocol (5.4) by δ = |ν − ν ′ | + Bǫ |κ − κ′ |.

(5.5)

Then for the step protocol (5.4) with any ǫ, ν, and κ, one has 2n−1 o kn

X †

(−1) c 1 ˆ ˆ c νˆ† α ˆ† k α νˆ α ˆ k α ′ (β W + Ψ ) st→ − (β W + Ψ ) st→ F (α ) − F (α) + lim k β τ ↑∞ 2 k! k=1

2n

≤ Cn ǫ δ +

Cn′

2

δ ,

(5.6)

for any positive integer n. In particular, by setting n = 1 (and C = βC1, C ′ = βC1′ ), one gets α ˆ α ˆ α ˆ† α ˆ† hΘ i − hΘ i st→ st→ (5.7) S(α′ ) − S(α) + lim ≤ Cǫ2 δ + C ′ δ 2 , τ ↑∞ 2

which is a rigorous version of the extended-Clausius relation (4.6).

This theorem is proved in section 7.4, where we make use of the exact relation (3.1) in Theorem 3.1 combined with a rigorous perturbative argument. Extended Clausius relation for a quasi-static protocol: Although the above Theorem 5.2 applies only to protocols which consist of a single step, it is quite natural to expect that corresponding relations for slowly varying protocols are also valid. Heuristically speaking, one may approximate a smooth slowly varying protocol by a sum of small step protocols, for which the relations (5.6) or (5.7) are valid. Unfortunately a rigorous control of such quasi-static limit is not easy (probably) for technical reasons. Instead, we shall here present a (much easier) result for monotone

29

protocols which can be obtained as limits of protocols that consists of many small steps18 . See Figure 7. Define19 α ˆ = (α(t))t∈[−τ,τ ] by α(t) = αj where j = 0, . . . , N, and

h for t ∈ −τ +

 2τ 2τ j, −τ + (j + 1) , N +1 N +1

(5.8)

 (N − j)ν + jν ′ (N − j)κ + jκ′  αj := β, ǫ, . , N N

(5.9)

Theorem 5.3 (Extended Clausius relation for a quasi-static protocol) Under the same assumptions as in Theorem 5.2, the following is valid for the protocol (5.8). One has, for n = 1, 2, . . . 2n−1 o kn

X †

1 (−1) ˆ ˆ c νˆ† α ˆ† k α c νˆ α ˆ k α ′ − (β W + Ψ ) (β W + Ψ ) lim lim F (α ) − F (α) + st→ st→ k N ↑∞ τ ↑∞ β 2 k! k=1

2n

≤ Cn ǫ δ,

(5.10)

and, in particular, α ˆ α ˆ α ˆ† α ˆ† hΘ i − hΘ i st→ st→ ≤ Cǫ2 δ, S(α′ ) − S(α) + lim lim N ↑∞ τ ↑∞ 2

(5.11)

where δ is defined by (5.5).

Proof: We simply replace δ in Theorem 5.2 by δ/N, and sum up the inequalities. Note that there are no terms proportional to δ 2 in the right-hand sides of (5.10) and (5.11), reflecting the quasi-static limit N ↑ ∞. As an important variation, consider the protocol (5.8) in which we keep the parameter ν for the Hamiltonian constant and only vary the parameter κ. Since one has δ = Bǫ |κ − κ′ | by setting ν = ν ′ in (5.5), the extended Clausius relation (5.11) becomes α ˆ α ˆ α ˆ† α ˆ† hΘ i − hΘ i st→ st→ (5.12) S(α′ ) − S(α) + lim lim ≤ C ′′ ǫ3 |κ − κ′ |. N ↑∞ τ ↑∞ 2 with C ′′ = BC. 18

For simplicity we only consider protocols in which the reference inverse temperature β is fixed. It is not difficult to treat protocols where β varies; one simply redefines β properly when one decomposes the whole protocol into a sum of step protocols. 19 To be consistent with the notation introduced in section 2.1, we are here taking τ0 = {(N −1)/(N + 1)}τ . In this case τ0 also diverges when we let τ ↑ ∞.

30

To see the implication of (5.12), take the simple but important example where κ κ takes its value in [0, 1], and ξx→y = κ ξx→y for some nonequilibrium function ξx→y . Then ′ by choosing κ = 1 and κ = 0 in (5.12), we get α ˆ α ˆ α ˆ† α ˆ† hΘ ist→ − hΘ ist→ (5.13) ≤ C ′′ ǫ3 , S(β, ν) − S(α) + lim lim N ↑∞ τ ↑∞ 2

where α = (β, ǫ, ν, κ). Note that the system with κ′ = 0 is nothing but the equilibrium system with parameters β and ν, and therefore S(β, ν) is the equilibrium entropy (see (5.2)). If we suppose that we know everything about equilibrium states, the extended Clausius relation (5.13) implies that one can determine the nonequilibrium entropy S(α) with the precision of O(ǫ2 ) only by (experimentally) measuring the entropy productions † ˆ† ˆ hΘαˆ iαst→ and hΘαˆ iαst→ . Extended Clausius relation in terms of excess entropy production: Finally we comment on the extended Clausius relation which is written in terms of the excess entropy production. As we have discussed in section 4.1, the house-keeping entropy ˆ α production Σαhk is defined by (4.9), where the entropy production rate σst is defined by (4.8). In the case of stepwise protocol (5.8), the house-keeping entropy production (4.9) becomes N X 2τ α ˆ Σαhk = σstj . (5.14) N + 1 j=0 ˆ ˆ Then we have the following where the excess entropy production hΘαˆ iαst→ − Σαhk plays a central role.

Theorem 5.4 (Extended Clausius relation in terms of excess entropy production) Under the same assumptions as in Theorem 5.2, we have ˆ α ˆ (5.15) ist→ ≤ Cǫ2 δ, S(α′) − S(α) + lim lim hΘαex N ↑∞ τ ↑∞

for the “quasi-static” protocol (5.8). We have defined the excess entropy production by ˆ ˆ Θαex := Θαˆ − Σαhk .

(5.16)

This theorem is proved in section 7.5. ˆ Note that the divergence (as τ ↑ ∞) of the total entropy production hΘαˆ iαst→ is α ˆ precisely canceled by that of the house-keeping entropy production Σhk . The inequality corresponding to the extended Clausius relation: As we have α ˆ eq ′ noted in section 1.2, the standard Clausius relation S(αeq )−S(αeq ) ≃ −hΘαˆeq ieq→ , which becomes an exact equality in the quasi-static limit, is associated with the inequality α ˆ eq ′ S(αeq ) − S(αeq ) ≥ −hΘαˆeq ieq→ , which is valid for any equilibrium protocol α ˆ eq . We shall 31

make some remarks regarding the inequality corresponding to our extended Clausius relation. We first note that there can be no inequalities corresponding to the extended Clausius relation expressed in the forms (4.5), (4.6), or (5.11). To see this suppose that † † the inequality S(α′) − S(α) & −(hΘαˆ iαˆ − hΘαˆ iαˆ )/2 is valid for general protocols α. ˆ But if one replaces α ˆ by its time-reversal α ˆ † , this inequality becomes S(α′ ) − S(α) . † † −(hΘαˆ iαˆ − hΘαˆ iαˆ )/2. This means that a general inequality is impossible (unless the equality is valid for any α). ˆ If we write the extended Clausius relation by using the excess entropy production as in (5.15), on the other hand, we can prove the following. Theorem 5.5 (Extended Clausius inequality) Let α ˆ be an arbitrary protocol as defined in section 2.1. We fix τ0 . Then we have Z τ n o X α ˆ α ˆ 2 ′ ˜ dt |p˙ x (t)| − 2Aǫ3 , (5.17) S(α ) − S(α) ≥ lim −hΘex ist→ − Cǫ τ ↑∞

−τ

x∈S

where p(t) = (px (t))x∈S is the probability distribution at time t, i.e., the solution of the master equation (2.2), (2.3) with the initial condition p(−τ ) = ρα , and C˜ is a constant which depends only on the class of models. A is the constant introduced in Theorem 5.1. We see that (5.17) has the precise form that one expects as the inequality corresponding to the extended Clausius relation (5.15). It should be noted, however, that R P 2 τ ˜ the error term Cǫ −τ dt x∈S |p˙ x (t)| depends explicitly on the solution of the master equation (2.2), (2.3). The theorem is proved in section 9, where we make use of the standard inequality for Markov processes.

6

Method based on the time reversal symmetry

We prove our theorems in sections 6, 7, 8, and 9. In this section we prove Theorems 3.1 and 3.2 by using the time reversal symmetry of the path probability. In the course we use some lemmas which are proved in section 7.

6.1

Some definitions and basic symmetry

In addition to the expectation (2.12), we define a new expectation Z α ˆ hf ix→ := Dˆ x f [ˆ x] δx(−τ ),x T αˆ [ˆ x],

(6.1)

in which the system starts from a specified initial state x ∈ S. It is also convenient to define the “unnormalized expectations” in which the final state is specified as Z ˆ [f ]αst→y :=

Dˆ x f [ˆ x] ραx(−τ ) δx(τ ),y T αˆ [ˆ x], 32

(6.2)

and ˆ [f ]αx→y

:=

Z

Dˆ x f [ˆ x] δx(−τ ),x δx(τ ),y T αˆ [ˆ x].

(6.3)

The corresponding normalized expectation will appear later in (8.1). Let us derive the well-known symmetry relation, which will be a basis of the present work. From the definitions (2.25) of the time-reversed path and (2.8) of the weight, one finds n n h Z tj+1 i Y Y α ˆ† † α(tj ) T [ˆ x]= Rxj →xj−1 exp − dt λα(t) xj j=1

tj

j=0

which differs from (2.8) only in the subscripts of the transition rates. By using the definition (2.14) of θα , we get =

n Y

j) Rxα(t j−1 →xj

j) exp[−θ α(t xj−1 →xj ]

j=0

j=1



n Y

 = exp −Θ [ˆ x] T αˆ [ˆ x], α ˆ

h Z exp −

tj+1 tj

dt λα(t) xj

i

(6.4)

where Θαˆ [ˆ x] is defined in (2.19). The relation (6.4), which is quite standard in nonequilibrium physics, is the path version of the local detailed balance (2.14), and is sometimes called the “detailed fluctuation theorem”.

6.2

Proof of Theorem 3.1

By using (2.24), one can rewrite (6.4) as ν(−τ )

ˆ [ˆ −βHx(−τ ) −(βW νˆ [ˆ x]+Ψα x])/2

e



ν(τ )



ˆ [ˆ −βHx(τ ) −(βW νˆ [ˆ x† ]+Ψα x† ])/2

T αˆ [ˆ x] = e



T αˆ [ˆ x† ],

(6.5)

where we made use of (2.26). By integrating over all path xˆ such that x(−τ ) = x and x(τ ) = y, and recalling the definition (6.3), we get αˆ αˆ† ν  ν ˆ α ˆ ν′  ν ˆ† α ˆ† e−βHx e−(βW +Ψ )/2 x→y = e−βHy e−(βW +Ψ )/2 y→x .

(6.6)

At this stage the following “splitting lemma”, which will be proved in section 7.2, [−τ /2,τ /2],α ˆ is essential. The lemma enables us to treat the expectation h . . . ist→ (where the initial distribution is chosen as the stationary distribution) without using the stationary distribution ρα explicitly. Technically speaking the use of the “splitting lemma” is one of the new points in the present work. Lemma 6.1 Let f [ˆ x] be an arbitrary nonnegative (and nonvanishing) function of path which depends only on x(t) with t ∈ [−τ /4, τ /4]. Then for arbitrary protocol α ˆ and x, y ∈ S, one has lim

τ ↑∞

Y (α) Y (α′ ) [ f e−Ψ [−τ,−τ /2],(α)

h e−Ψ(α) /2 ix→

α ˆ /2

ˆ ]αx→y

[−τ /2,τ /2],α ˆ

h f e−Ψαˆ /2 ist→ 33



[τ /2,τ ],(α′ )

[ e−Ψ(α ) /2 ]st→y

= 1,

(6.7)

where Y (α) is a certain positive function. We have defined the expectations in the limited time intervals (such as [−τ, −τ /2]]) by replacing the time interval [−τ, τ ] in (2.12), (6.1), (6.2), and (6.3) with the specified ones. We also see that the limit lim

τ ↑∞

h e−Ψ

(α) /2

[−τ,−τ /2],(α)

ix→

(6.8)

[τ /2,τ ],(α)

[ e−Ψ(α) /2 ]st→x

exists for any x ∈ S. With a slight abuse of notation, we have denoted here by Ψ the entropy production in [−τ,−τ /2],(α) each subinterval. Thus Ψ(α) in h· · · ix→ , for example, actually means Ψ[−τ,−τ /2],(α) . By applying the decomposition (6.7) to the equality (6.6), we have ν

lim

τ ↑∞

e−βHx h e−Ψ ν′

[−τ,−τ /2],(α)

(α) /2

ix→

[−τ,−τ /2],(α′ )



e−βHy h e−Ψ(α ) /2 iy→

h e−(βW

ν ˆ +Ψα ˆ )/2 †

h e−(βW νˆ

[−τ /2,τ /2],α ˆ

ist→

[ e−Ψ

ˆ† ˆ † )/2 [−τ /2,τ /2],α +Ψα ist→

(α′ ) /2

[τ /2,τ ],(α′ )

]st→y

[τ /2,τ ],(α)

[ e−Ψ(α) /2 ]st→x

= 1, (6.9)

which is an essential relation for us. Let us first set α ˆ = (α), a constant protocol. Then (6.9) becomes ν

lim

τ ↑∞

e−βHx h e−Ψ

(α) /2

ν

[−τ,−τ /2],(α)

[ e−Ψ

[−τ,−τ /2],(α)

[ e−Ψ(α) /2 ]st→x

ix→

e−βHy h e−Ψ(α) /2 iy→

(α) /2

[τ /2,τ ],(α)

]st→y

[τ /2,τ ],(α)

= 1.

(6.10)

Since the limit (6.8) exists, this implies an interesting relation −βHxν

e

lim

τ ↑∞

h e−Ψ

(α) /2

[−τ,−τ /2],(α)

ix→

−βHyν

[τ /2,τ ],(α)

[ e−Ψ(α) /2 ]st→x

=e

lim

τ ↑∞

h e−Ψ

(α) /2 ′

[−τ,−τ /2],(α)

iy→

[τ /2,τ ],(α)

[ e−Ψ(α ) /2 ]st→y

(6.11)

for any x, y ∈ S. In other words, the quantity equated in (6.11) is independent of x (or of y). Let us examine this quantity for α which corresponds to equilibrium, where we ν [τ /2,τ ],(α) have α = (β, ν), and Ψ(α) = 0. Since limτ ↑∞ [1]st→x = ρβ,ν = eβ{F (β,ν)−Hx } , where x F (β, ν) is the standard equilibrium free energy, we see that (6.11) is equal to e−βF (β,ν) . This motivates us to define for general α the corresponding free energy F (α) by −βF (α)

e

−βHxν

:= e

lim

τ ↑∞

h e−Ψ

(α) /2

[−τ,−τ /2],(α)

ix→

[τ /2,τ ],(α)

[ e−Ψ(α) /2 ]st→x

.

(6.12)

By considering a general protocol α, ˆ and combining (6.9) with (6.12), we get lim

τ ↑∞

e−βF (α) h e−(βW

e−βF (α′ ) h e−(βW νˆ

ν ˆ +Ψα ˆ )/2 †

[−τ /2,τ /2],α ˆ

ist→

ˆ† ˆ † )/2 [−τ /2,τ /2],α +Ψα ist→

= 1,

which (after replacing τ /2 with τ ) is the desired equality (3.1) in Theorem 3.1.

34

(6.13)

Remark: The equality (3.6) can be proved essentially in the same manner as (3.1). One simply starts from the symmetry relation (6.4) and repeats the same procedure. ˜ In the derivation, the quantity S(α) is naturally defined as [−τ,−τ /2],(α)

(α)

h e−Θ /2 ix→ ˜ S(α) := lim log . [τ /2,τ ],(α) τ ↑∞ [ e−Θ(α) /2 ]st→x

(6.14)

In an equilibrium state, this definition gives

βH/2 e eq ˜ S(α) = log Z(α) + log −βH/2 , e eq

(6.15)

which is different from the standard equilibrium entropy S(α) = log Z(α) + β hHieq .

6.3

Proof of Theorem 3.2

By using (2.24), we now rewrite (6.4) as ν(−τ )

ˆ [ˆ −βHx(−τ ) −βW νˆ [ˆ x]−Ψb,α x]/2

e



ν(τ )



ˆ [ˆ ˆ [ˆ −βHx(τ ) −Ψi,α x† ]−Ψb,α x† ]/2

T αˆ [ˆ x] = e



T αˆ [ˆ x† ],

(6.16)

where we have decomposed the entropy production as Ψαˆ [ˆ x] = Ψb,αˆ [ˆ x] + Ψi,αˆ [ˆ x], where b,α ˆ [−τ,−τ /4],α ˆ [τ /4,τ ],α ˆ i,α ˆ [−τ /4,τ /4],α ˆ Ψ [ˆ x] := Ψ [ˆ x] + Ψ [ˆ x] and Ψ [ˆ x] := Ψ [ˆ x]. Again we integrate over xˆ with x(−τ ) = x and x(τ ) = y to get ν

e−βHx [e−βW

ν ˆ −Ψb,α ˆ /2

ν′

ˆ ]αx→y = e−βHy [e−Ψ

i,α ˆ † −Ψb,α ˆ † /2



α ˆ ]y→x ,

(6.17)

By using the decomposition (6.7) as before, this implies ν

lim

τ ↑∞

e−βHx h e−Ψ ν′

(α) /2 ′

[−τ,−τ /2],(α)

ix→

[−τ,−τ /2],(α′ )

e−βHy h e−Ψ(α ) /2 iy→

h e−βW

ν ˆ −Ψb′ ,α ˆ /2 †

h e−Ψi,αˆ

[−τ /2,τ /2],α ˆ

ist→

[ e−Ψ

ˆ† ˆ † /2 [−τ /2,τ /2],α −Ψb′ ,α ist→

(α′ ) /2

[τ /2,τ ],(α′ )

]st→y

[τ /2,τ ],(α)

[ e−Ψ(α) /2 ]st→x

= 1,

(6.18) ′ where Ψb ,αˆ [ˆ x] := Ψ[−τ /2,−τ /4],αˆ [ˆ x] + Ψ[τ /4,τ /2],αˆ [ˆ x]. We can use the definition (6.12) of the free energy to rewrite this as lim

τ ↑∞

e−βF (α) h e−βW

ν ˆ −Ψb′ ,α ˆ /2



e−βF (α′ ) h e−Ψi,αˆ

[−τ /2,τ /2],α ˆ

ist→

ˆ† ˆ † /2 [−τ /2,τ /2],α −Ψb′ ,α ist→

= 1.

(6.19)

Now, by rewriting τ /2 as τ (which is allowed because we let τ ↑ ∞), this precisely becomes ν ˆ ˆ e−βF (α) he−β W iαΨ-mod lim = 1, (6.20) αˆ†

τ ↑∞ −βF (α′ ) e exp[−Ψ[−τ /2,τ /2],αˆ† ] Ψ-mod where we used Ψ-modified expectation defined in (3.7). Finally by using the relation (6.21) below, we get the desired (3.8) in Theorem 3.2 from (6.20).

Lemma 6.2 Let f [ˆ x] be a function which depends only on x(t) with t ∈ [−τo , τo ]. Then ˆ ˆ lim hf iαΨ-mod = lim hf iαst→

τ ↑∞

τ ↑∞

35

(6.21)

7

Method based on modified rate matrix

Here we prove the technically important splitting lemma (Lemma 6.1), and our main results about SST summarized in Theorems 5.2 and 5.4. The expressions of various quantities in terms of matrices obtained by modifying the rate matrix Rα play essential roles20 . The desired results then follow from the Perron-Frobenius theorem.

7.1

Modified rate matrix and its eigenvectors

In what follows we treat vectors whose components are indexed by x ∈ S. We use boldface symbols like v = (vx )x∈S for column vectors, and arrowedPsymbols like ~u = (ux )x∈S for row vectors. We use the standard notation where ~uv = x∈S ux vx denotes the scalar product, and v~u denotes the matrixPwhose xy-entry is vx uy (i.e., the Kronecker product). For a matrix A, we write ~uAv = x,y∈S ux (A)xy vy . We let ~1 := (1, 1, . . . , 1) the row vector whose components are all 1. Finally δ (x) and ~δ(x) denote the column and row vectors, respectively, whose x-component is 1 and all the other components are 0. Since Rα is a transition rate matrix, there is a unique positive vector ρα such that α ~1ρ = 1 and Rα ρα = 0. Here 0 is the Perron-Frobenius eigenvalue of Rα , and the corresponding left eigenvector is ~1. ˜ α by specifying its entries as We define a modified matrix R ˜ α )yx := Rα e−ψx→y /2 . (R yx α

(7.1)

˜ α is no longer a transition rate matrix, it still satisfies the conditions for the Although R ˜ α, Perron-Frobenius theorem. We denote by µα ∈ R the Perron-Frobenius eigenvalue of R α α α α ~ and by ϕ = (ϕx )x∈S and ξ = (ξx )x∈S the corresponding right and left eigenvectors, respectively, i.e., ˜ α ϕα = µα ϕα , ξ~α R ˜ α = µα ξ~α . R (7.2) We can assume ϕαx > 0 and ξxα > 0 for any x ∈ S. We normalize these vectors so that P P α α α ~1ϕα = 1 and ξ~α ϕα = 1 (more precisely, x∈S ϕx = 1 and x∈S ξx ϕx = 1). ˜ α )xy = (R ˜ α )yx eβ(Hyν −Hxν ) . From the eigenvalue From (2.14) and (2.18), we see that (R equation for ξ~α , we see X X ˜ α )xy = eβHyν ˜ α )yx e−βHxν ξ α , µα ξyα = ξxα (R (R (7.3) x x∈S

x∈S

ν ˜ α . We can therefore write which implies that (e−βHx ξxα )x∈S is the right eigenvector of R ν

ϕαx

e−βHx ξxα , = Z(α)

(7.4)

with a certain function Z(α) of the parameters α. In equilibrium system, it coincides with the canonical partition function since ξ~α = ~1. We note in passing that the relations 20

Such a technique is common, for example, in the large deviation theory [42]. Similar technique was used for steady state thermodynamics in [7].

36

P P ν −βHxν α ~1ϕα = 1 and ξ~α ϕα = 1 imply Z(α) = ξx and Z(α) = x∈S e−βHx (ξxα)2 , x∈S e respectively. A straightforward consequence of the Perron-Frobenius theorem is that there exist constants C, C ′ > 0 and γ > 0, and one has

t R˜ α

e ≤ C eµα t , (7.5) ∞ and

t R˜ α

α α

e − eµ t ϕα ξ~α ∞ ≤ C ′ eµ t e−γ t ,

(7.6)

for any t ≥ 0. We can assume that the constants C, C ′ , γ are independent of α. Here we used the matrix norm kAk∞ := maxx,y∈S |(A)xy |. Note that ϕα ξ~α is the (non-orthogonal) projection onto the Perron-Frobenius eigenvector ϕα .

7.2

Splitting Lemma

Let us prove Lemma 6.1, which played an essential role in the proof of the equalities. The key for the proof is the following bound. Lemma 7.1 Let Y (α) = ξ~α ρα . Then for any t1 , t2 ≥ 0, one has

Y (α) e(t1 +t2 )R˜ α − et1 R˜ α ρα~1 et2 R˜ α ≤ C ′′ eµα t (e−γ t1 + e−γ t2 ), ∞

(7.7)

for any α, where C ′′ > 0 is a constant.

Proof: From (7.5), one roughly finds that ˜α

α

e(t1 +t2 )R ≃ eµ and

˜α

˜α

α

et1 R ρα~1 et2 R ≃ eµ

t1

ϕα ξ~α ρα~1 eµ

α

(t1 +t2 )

ϕα ξ~α,

(7.8)

ϕα ξ~α = Y (α) eµ

α

t2

(t1 +t2 )

ϕα ξ~α .

(7.9)

It is a standard exercise to make this into a rigorous estimate by using (7.5) and (7.6). [−τ /4,τ /4],α ˆ ˜ the matrix whose entry (F) ˜ yx is equal to [ f e−Ψαˆ /2 ]x→y We denote by F . Then one can write

[ f e−Ψ

α ˆ /2



˜ α ˜ (3τ /4)R ˜ α (x) ˆ ]αx→y = ~δ(y) e(3τ /4)R F e δ ,

−Ψ(α) /2

he

h f e−Ψ

α ˆ /2

/2],(α) i[−τ,−τ = ~1 e x→

[−τ /2,τ /2],α ˆ

ist→

′ −Ψ(α ) /2

[e

˜α (τ /2)R

˜ α′ (τ /4)R

= ~1 e

[τ /2,τ ],(α′ )

]st→y

δ (x) ,

(7.11)

˜ e(τ /4)R˜ α ρα , F

(7.12)

˜ α′ (τ /2)R

= ~δ(y) e

(7.10)

ρα .

(7.13)

We apply (7.7) to (7.10) and split 3τ /4 into τ /2 and τ /4 as Y (α) Y (α′ ) [ f e−Ψ

α ˆ /2





′ ˜α ˜ α ˜ (τ /4)R ˜α α ˜α ˆ ]αx→y ≃ ~δ(y) e(τ /2)R ρα ~1 e(τ /4)R F e ρ ~1 e(τ /2)R δ (x)

37

(7.14)

where the right-hand side is equal to the product of (7.11), (7.12), and (7.13). This roughly shows that

α′ α F ∞ , (7.15) (numerator of (6.7))−(denominator of (6.7)) ≤ C˜ e(3/4)(µ +µ )τ e−γ τ /4 ˜

where C˜ is a constant. On the other hand, we can also show that

α′ α F ∞ (denominator of (6.7)) ≥ C˜ ′ e(3/4)(µ +µ )τ ˜

(7.16)

by noting that the Perron-Frobenius theorem ensures that, in the expression in the ˜ yx for any x, y has a uniformly nonvanishing right-hand side of (7.12), the entry (F) contribution. This proves (6.7), which is the main claim in Lemma 6.1. The existence of the limit (6.8) and the proof of Lemma 6.2 are easy.

7.3

Description of key quantities in terms of the eigenvectors

We shall examine how the two important quantities in our theory can be written in terms the language of the present section. Although these observations are not directly used in the proof (and thus can be omitted) they may shed light on mathematical structures behind our thermodynamic relations. We first examine the nonequilibrium free energy defined in (6.12). Clearly we can rewrite (6.12) as −βF (α)

e

−βHxν

=e

~1 e(τ /2)R˜ α δ x ν lim = e−βHx α ˜ τ ↑∞ ~ δx e(τ /2)R ρα

~1ϕα ξ~α δ x ~δx ϕα ξ~α ρα

where we used (7.6). By recalling Y (α) = ξ~α ρα , this becomes −βHxν

=e

Z(α) ξxα = , α ϕx Y (α) Y (α)

(7.17)

where we used (7.4). Thus the free energy is neatly expressed in terms of the quantities obtained from ϕα and ξ~α as F (α) =

1 log Y (α) − log Z(α) . β

(7.18)

Next we examine the right-hand side of our main equality (3.1) in the special case of the step protocol (5.4), which was treated in section 5. We of course know (rigorously) that it is equal to F (α′ ) − F (α), it might be interesting to analyze the right-hand side as it is. Let us define the work matrix Wνˆ by ( ′ Hxν − Hxν if x = y; (7.19) (Wνˆ )yx := 0 if x 6= y. 38

Then the right-hand side of (3.1) (for the step protocol) is rewritten as

αˆ ′ ~1 eτ R˜ α e−βWνˆ /2 eτ R˜ α ρα exp[−(β W νˆ + Ψαˆ )/2] st→ 1 1 − lim log

log αˆ† = − β τlim ↑∞ ~1 eτ R˜ α e−βWνˆ† /2 eτ R˜ α′ ρα′ β τ ↑∞ exp[−(β W νˆ† + Ψαˆ† )/2] st→

By using (7.6), this becomes



α α ′ ′ ν ˆ 1 e(µ +µ )τ ~1 ϕα ξ~α e−βW /2 ϕα ξ~αρα + (remainder) = − lim log α α′ † β τ ↑∞ e(µ +µ )τ ~1 ϕα ξ~α e−βWνˆ /2 ϕα′ ξ~α′ ρα′ + (remainder) ′ ν ˆ 1 Y (α) ξ~α e−βW /2 ϕα = − log † β Y (α′ ) ξ~αe−βWνˆ /2 ϕα′ ′ ν ˆ 1 ξ~α e−βW /2 ϕα 1 = {log Y (α′ ) − log Y (α)} − log . † β β ξ~α e−βWνˆ /2 ϕα′

(7.20)

Recalling (3.1) and (7.18), this implies ξ~α e−βW /2 ϕα 1 1 ′ − {log Z(α ) − log Z(α)} = − log . † β β ξ~α e−βWνˆ /2 ϕα′ ′

7.4

ν ˆ

(7.21)

Thermodynamic relations

We can now prove Theorem 5.2, which mathematically states our thermodynamic relations for NESS including the extended Clausius relation. Our strategy here is to justify the heuristic error estimate in the expansion (4.18) for the case of the step protocol (5.4), i.e., α ˆ with α(t) = α for t ∈ [−τ, 0], and α(t) = α′ for t ∈ (0, τ ]. A key observation for the proof has already been made in (7.20). This expression shows that, for the step protocol, the right-hand side of the equality (3.1) can be compactly represented in terms of the modified rate matrix and its eigenvectors, and converges exponentially as τ ↑ ∞. Since the modified matrix and its eigenvectors can be expanded in convergent power series of ǫ and δ, we can compare them with the heuristic power estimate in section 4.2. We shall make this idea more precise. Basic setup: To generate a power series corresponding to (4.18), we introduce a new expansion parameter ζ ∈ [−η, 1 + η] where η > 0 is a small fixed constant. Then we define

αˆ exp[−ζ(β W νˆ + Ψαˆ )/2] st→ 1 (7.22) Ξ(α, ˆ ζ, τ ) := − log

αˆ† . β exp[−ζ(β W νˆ† + Ψαˆ† )/2] st→

Note that limτ ↑∞ Ξ(α, ˆ 1, τ ) is nothing but the right-hand side of our main equality (3.1), and thus equal to F (α′ ) − F (α). We also find from the definition of cumulant (4.14) that  k  ˆ ˆ† o

∂ Ξ(α, ˆ ζ, τ ) (−1)k nc

νˆ α ˆ k α c νˆ† α ˆ† k α = − (β W + Ψ ) . (7.23) (β W + Ψ ) − st→ st→ ∂ζ k β 2k ζ=0 39

For each n = 1, 2, . . ., consider the (rigorous) expansion   2n+1   2n X 1 ∂ k Ξ(α, ˆ ζ, τ ) ∂ Ξ(α, ˆ ζ, τ ) 1 Ξ(α, ˆ 1, τ ) = + , k 2n+1 k! ∂ζ (2n + 1)! ∂ζ ˜n (ˆ ζ=0 ζ= ζ α,τ ) k=0

(7.24)

where ζ˜n (α, ˆ τ ) ∈ (0, 1), and rearrange it as Ξ(α, ˆ 1, τ ) =

2n−1 X k=0

  1 ∂ k Ξ(α, ˆ ζ, τ ) + R1 + R2 , k! ∂ζ k ζ=0

(7.25)

where we have set  2n  ∂ Ξ(α, ˆ ζ, τ ) 1 R1 : = (2n)! ∂ζ 2n ζ=0 n

ˆ† o

ˆ 1 c νˆ† α ˆ † 2n α c νˆ α ˆ 2n α = − 2n (β W + Ψ ) st→ − (β W + Ψ ) st→ , β 2 (2n)!  2n+1  ∂ Ξ(α, ˆ ζ, τ ) 1 . R2 := (2n + 1)! ∂ζ 2n+1 ζ=ζ˜n (ˆ α,τ )

(7.26) (7.27)

Recalling Ξ(α, ˆ 1, τ ) = F (α′) − F (α) and (7.23), we see that the expression (7.25) is nothing but the desired expansion (4.18) if we can show that R1 + R2 = O(ǫ2n δ) + O(δ 2 ). Perturbation: We now turn to a rigorous perturbative estimate of the remainder R1 + R2 in the expansion (7.25). ˜ α,ζ by Let us first define a new matrix R ( α if x = y; ˜ α,ζ )yx := Rxx (7.28) (R α /2 α −ζψx→y Ryx e if x 6= y, and denote by µα,ζ , ϕα,ζ , and ξ~α,ζ its Perron-Frobenius eigenvalue and the corresponding right and left eigenvectors, respectively. We use the same normalization as in section 7.1 for the eigenvectors. (β,ν) Let us fix the equilibrium transition rate Rx→y that appears in (5.2). Then, according (β,ν) α to the expression (5.2), the rate Rx→y can be regarded as a perturbation to Rx→y where ǫ α′ is the parameter of perturbation. Likewise the rate Rx→y after the step can be regarded (β,ν) as a perturbation to Rx→y where ǫ and δ are the parameters of the perturbation. The ˜ α,ζ and R ˜ α′ ,ζ for a fixed ζ. same is true for the matrices R We imagine that every quantity is obtained by a perturbation around the corresponding quantity with ǫ = δ = 0. Since the unperturbed matrix R(β,ν),ζ has a nonvanishing spectral gap below its Perron-Frobenius eigenvalue, one can express the perturbed eigen˜ α,ζ and R ˜ α′ ,ζ in convergent power series of ǫ and δ provided values and eigenvectors of R that ǫ and δ are small enough. Our task here is to rearrange these series and compare the result with the heuristic series (4.16). 40

By using the matrices (the work matrix Wνˆ is defined in (7.19)), the key quantity (7.22) can be rewritten as ′

~1 eτ R˜ α ,ζ e−ζβWνˆ /2 eτ R˜ α,ζ ρα,ζ 1 Ξ(α, ˆ ζ, τ ) = − log ~1 eτ R˜ α,ζ e−ζβWνˆ† /2 eτ R˜ α′ ,ζ ρα′ ,ζ β α,ζ α′ ,ζ ′ ′ ν ˆ e(µ +µ )τ ~1 ϕα ,ζ ξ~α ,ζ e−ζβW /2 ϕα,ζ ξ~α,ζ ρα + (remainder) 1 = − log α,ζ α′ ,ζ † β e(µ +µ )τ ~1 ϕα,ζ ξ~α,ζ e−ζβWνˆ /2 ϕα′ ,ζ ξ~α′ ,ζ ρα′ + (remainder) ′ ν ˆ 1 Y (α, ζ) ~ξα ,ζ e−ζβW /2 ϕα,ζ + D(α, ˆ ζ, τ ) = − log , † ν ˆ ′ β Y (α′ , ζ) ξ~α,ζ e−ζβW /2 ϕα ,ζ + D(α ˆ † , ζ, τ )

(7.29)

where we have used (7.6) (which of course holds for the matrices with ζ). The time dependent parts D(α, ˆ ζ, τ ), D(α ˆ † , ζ, τ ) decay exponentially in τ uniformly in ζ, and also has convergent series expansions in ǫ and δ for any τ . We are now ready to evaluate R1 and R2 defined by (7.26) and (7.27), respectively. P (τ ) We first observe from (7.29) that R1 can be expanded into a power series ℓ,m Cℓ,m ǫℓ δ m , (τ )

which converges uniformly in τ > 0. We wish to argue that Cℓ,1 = 0 for ℓ < 2n. For this purpose we recall the heuristic estimate (4.17) which is valid only for finite τ and sufficiently small ǫ and δ. Nevertheless the heuristic estimates shows rigorously that (τ ) Cℓ,1 = 0 for ℓ < 2n for sufficiently small τ . But, when we recall that (7.29) admits a (τ )

(τ )

nice expansion, that Cℓ,1 = 0 for a finite interval of τ implies that Cℓ,1 = 0 for any τ . The quantity R2 is much more complicated but can be treated in a similar manner. We first note that [∂ 2n+1 Ξ(α, ˆ ζ, τ )/∂ζ 2n+1 ]ζ=ζ˜n (ˆα,τ ) reduces to a kind of cumulant † † ˆ† c ˆ ˆ hh(β W νˆ +Ψαˆ )2n+1 iiαst→ −chh(β W νˆ +Ψαˆ )2n+1 iiαst→ , where chh· · · iiαst→ is defined by replacˆ )/2 α α ˆ −ζ˜n (ˆ α,τ )(β W νˆ +Ψα ˆ ing the expectation h· · · ist→ (in the standard cumulant) by h(· · · ) e ist→ . ν ˆ α ˆ ˜ −ζn (ˆ α,τ )(β W +Ψ )/2 Now one expands e in a power series, and make a heuristic order estimate as in section 4.3. A good news here is that one does not have to invoke the complicated estimate using the time-reversal symmetry; only naive expansion and power counting is enough. For example the dominant term can be evaluated as

αˆ

αˆ† † † (β W νˆ + Ψαˆ )2n+1 − (β W νˆ + Ψαˆ )2n+1

(α) (α) † † (α) = (Ψ(α) + ∆Φαˆ )2n+1 (1 + ∆Γαˆ ) − (Ψ + ∆Φαˆ )2n+1 (1 + ∆Γαˆ ) = O(ǫ2n δ) + O(ǫ2n−1 δ 2 ). (7.30) This heuristic estimate can be converted into a rigorous estimate as before by relying on the representation (7.29). This proves the desired Theorem 5.2.

7.5

Excess entropy production

Let us prove Theorem 5.4, which states that the extended Clausius relation can be written in terms of the excess entropy production. 41

We first define the entropy production rate as a function of x ∈ S by X α α σxα := θx→y Rx→y .

(7.31)

y∈S

Let α ˆ = (α(t))t∈[−τ,τ ] be an arbitrary protocol, and pαˆ (t) be the corresponding solution of the master equation (2.2) with the initial condition pαˆ (−τ ) = ρα(−τ ) . Then from the expression (2.8) of the path weight, the definition (2.12) of the path average, and the definition (2.19) of the total entropy production, one finds Z τ X α ˆ α ˆ hΘ ist→ = σxα(t) pαxˆ (t). (7.32) dt −τ

x∈S

P (α) This in particular implies hΘ(α) ist→ = 2τP x σxα ραx , which shows that the right-hand side α of (4.8) is independent of τ , and σst = x σxα ραx . For simplicity we first consider the single step protocol (5.4), i.e., α ˆ with α(t) = α ′ for t ∈ [−τ, 0], and α(t) = α for t ∈ (0, τ ]. Then we have Z τ X ′ α′ α ˆ α ˆ α hΘ ist→ = τ σst + σxα (etR ρα )x , dt (7.33) 0

x∈S

where we noted that the probability distribution at t = 0 is ρα . On the other hand, ′ α′ ′ noting that ρα = etR ρα , we have Z τ X ′ α′ ′ α′ τ σst = dt σxα (etR ρα )x . (7.34) 0

x∈S



α α ˆ ) in this case, we find = τ (σst + σst Since the house-keeping entropy production is Σαhk Z τ X ′ ′  α′ α ˆ α ˆ α ˆ hΘ ist→ − Σhk = dt σxα etR (ρα − ρα ) x . (7.35) 0

′ tRα

x∈S



Noting that e (ρα − ρα ) converges exponentially to zero as t ↑ ∞, we conclude that ˆ ˆ the excess entropy production hΘαˆ iαst→ − Σαhk has a finite τ ↑ ∞ limit. † For the reversed protocol α ˆ , we similarly have Z τ X  α ′ α ˆ† α ˆ† α ˆ† hΘ ist→ − Σhk = dt σxα etR (ρα − ρα ) x 0

=

Z

0

x∈S

τ

dt

X x∈S



α′



σxα etR (ρα − ρα ) ′



x



+ O(δ 2 ),

(7.36) ′

where the final estimate follows from σxα − σxα = O(δ), Rα − Rα = O(δ), and ρα − ρα = O(δ). This is easily made into a rigorous estimate by using the exponential convergence. By comparing (7.35) and (7.36), we find  † α ˆ† ˆ ˆ ˆ† hΘαˆ iαst→ − Σαhk = − hΘαˆ ist→ − Σαhk + O(δ 2). (7.37) 42

Thus the excess entropy production is (nearly) antisymmetric with respect to time ˆ reversal. Since the house-keeping entropy production is clearly symmetric, i.e., Σαhk = α ˆ† Σhk , one gets o 1 n αˆ αˆ † α ˆ ˆ ˆ† hΘαˆ iαst→ − Σαhk = hΘ ist→ − hΘαˆ ist→ + O(δ 2), (7.38) 2 which is a rigorous estimate for the single step protocol (5.4). It is clear that the same estimate applies to the N-step protocol α ˆ defined in (5.8), (5.9), and we get    o n o 1 n αˆ αˆ δ 2 α ˆ α ˆ α ˆ α ˆ† α ˆ† lim hΘ ist→ − Σhk = lim . (7.39) hΘ ist→ − hΘ ist→ + N × O τ ↑∞ 2 τ ↑∞ N By letting N ↑ ∞, we see that the two limits coincide. Given (5.12), this proves the desired (5.15).

8

Representation of the stationary distribution and entropy

Here we prove Theorem 5.1 for the nonequilibrium entropy. For this purpose we prove the representation (8.8) for the stationary distribution of NESS [40, 43], which is interesting in its own light. In the present section, we only consider a constant protocol (α), i.e., α(t) = α for the (α) whole time interval t ∈ [−τ, τ ]. We denote the corresponding expectation h· · · ix→ and (α) τ,(α) τ,(α) the unnormalized expectation [· · · ]st→x (see (6.1) and (6.3)) as h· · · ix→ and [· · · ]st→x , respectively, to emphasize the dependence on τ . We also define a new conditional expectation by τ,(α) [· · · ]st→x τ,(α) , (8.1) h· · · ist→x = ραx which is normalized. Let us define τ -dependent free energy Fxτ (α) by Fxτ (α)

−β Hxα

:= e

he−Ψ

(α) /2

τ,(α)

ix→

τ,(α)

[e−Ψ(α) /2 ]st→x

.

(8.2)

By comparing this with (6.12), we find that Fxτ (α) → F (α) as τ ↑ ∞. Note that the x dependence vanishes in the limit. By using (8.1) and (8.2), we can also write ραx

β Fxτ (α)−β Hxα

=e

he−Ψ

(α) /2

τ,(α)

ix→

τ,(α)

he−Ψ(α) /2 ist→x 43

.

(8.3)

Applying formal cumulant expansion to this expression, one has log ραx



Fxτ (α)

−β

Hxα

∞ X (−1)k nc (α) k τ,(α) c (α) k τ,(α) o + (Ψ ) x→ − (Ψ ) st→x . k! 2k

(8.4)

k=1

As in section 4.3, we fix a finite τ , and expand around the equilibrium constant protocol (αeq ) by using h· · · iτ,(α) = h· · · (1 + ∆Ω(α) )iτ,(αeq ) with ∆Ω(α) = O(ǫ). Then, by recalling Ψ(α) = O(ǫ) the term with k = 2 in (8.4) is evaluated as

(α) 2 τ,(α) c (α) 2 τ,(α) c (α) 2 τ,(αeq ) c (α) 2 τ,(αeq ) (Ψ ) x→ − (Ψ ) st→x = (Ψ ) x→ − (Ψ ) st→x +O(ǫ3 ) = O(ǫ3 ), (8.5)

c

τ,(αeq )

τ,(αeq ) where we noted that c (Ψ(α) )2 x→ = c (Ψ(α) )2 st→x by the time-reversal symmetry ′ and W = 0. (4.21). We noted that (Ψ(α) )† = −Ψ(α) , and that we here have αeq = αeq By substituting the error estimate (8.5) into the expansion (8.4), we get log ραx



Fxτ (α)

−β

Hxα

1 n (α) τ,(α) (α) τ,(α) o − Ψ − Ψ + O(ǫ3 ). x→ st→x 2

(8.6)

By formally letting τ ↑ ∞, we get a formal but very suggestive representation (4.13). It was first written down by two of us (T.S.K. and N.N.) in [40]. We now note that this representation can be turned into a rigorous estimate. The key observation is that the quantity in the right-hand side of (8.3) is written as he−Ψ

(α) /2

τ,(α)

ix→

τ,(α)

he−Ψ(α) /2 ist→x

= ραx

~1 eτ R˜ δ x ~δ(x) eτ R˜ ρα

,

(8.7)

where we used the same quantities as in section 7. Then by using the machinery in section 7, it is not hard to show that the above quantity admits a series expansion in ǫ which converges uniformly in τ , and each coefficient in the expansion converges as τ ↑ ∞. We then get the following theorem. Theorem 8.1 (Representation of the probability distribution for NESS) Consider a class of models introduced in the beginning of section 5. Then for any set of parameters α, the corresponding stationary distribution ρα = (ραx )x∈S satisfies  o  n

1 τ,(α) τ,(α) (8.8) log ραx − β F (α) − β Hxα − lim Ψ(α) x→ − Ψ(α) st→x ≤ C0 ǫ3 , 2 τ ↑∞ where C0 is a (model dependent) positive constant.

To prove Theorem 5.1, let us observe that, for any τ > 0, X n

τ,(α) τ,(α)

τ,(α) o

τ,(α)

ραx Ψ(α) x→ − Ψ(α) st→x = Ψ(α) st→ − Ψ(α) st→ = 0, x∈S

44

(8.9)

which is a direct consequence of the definition. Then from the representation (8.6) (with τ ↑ ∞), we see that X SSh [ρα ] = − ραx log ραx x∈S

= −β F (α) + β

X

ραx Hxα + O(ǫ3 ).

(8.10)

x∈S

By recalling the relation (4.4) between the nonequilibrium free energy and entropy, we find SSh [ρα ] = S(α) + O(ǫ3 ). (8.11) By using Theorem 8.1, this becomes a rigorous estimate, and we get the desired Theorem 5.1.

9

Proof of the extended Clausius inequality

Here we shall prove the extended Clausius inequality stated in theorem 5.5. For t ∈ [−τ, τ ], let p(t) = (px (t))x∈S be the solution of the master equation (2.2), (2.3) with the initial condition p(−τ ) = ρα . Then it is well-known that the “second law of thermodynamics” or the “H-theorem” d D[p(t)|ρα ] ≤0 (9.1) dt α=α(t) holds, where

D[p|q] :=

X

px log

x∈S

px qx

(9.2)

is the relative entropy (or the Kullback-Leibler divergence) of the two probability distributions p = (px )x∈S and q = (qx )x∈S . See, for example, [44], and also Appendix C of [15]. By substituting (9.2) into (9.1) and recalling the definition (4.11) of the Shannon entropy, we get X d − SSh [p(t)] − p˙ x (t) log ρα(t) ≤ 0. (9.3) x dt x∈S ′

By noting that limτ ↑∞ px (τ ) = ραx , this implies Z α′ α SSh [ρ ] − SSh [ρ ] ≥ − lim τ ↑∞

τ

−τ

dt

X

p˙x (t) log ρα(t) x .

(9.4)

x∈S

In fact this is precisely identical to the Clausius inequality written by Hatano and Sasa [9]. See also [14]. We need to rewrite (9.4) in terms of the excess entropy production to get (5.17). We can make us of the following representation of the stationary distribution ρα . 45

Lemma 9.1 (Linear response formula for stationary distribution) Consider a class of models introduced in the beginning of section 5. Then for any set of parameters α, the corresponding stationary distribution ρα = (ραx )x∈S satisfies n

(α) τ,(αeq ) o α α log ρx − β F (α) − β Hx − lim Ψ ≤ C˜ ′ ǫ2 , (9.5) x→ τ ↑∞ and

n

(α) τ,(α) o α ≤ C˜ ǫ2 , log ρx − −S(α) − lim Θex x→ τ ↑∞

(9.6)

˜ C˜ ′ are (model dependent) positive constants. where C,

Note that these representations have simpler forms but larger errors than the previous representation (4.13), (8.8). In fact (9.5) is a rigorous version of the linear response representation (see, e.g., [41, 43]), and (9.6) is its variant. Proof of Lemma 9.1: We shall give a heuristic argument, which can be turned into rigorous estimates. We start from the more accurate representation (4.13), (8.8), and first note that

(α) τ,(α) (α) τ,(α) (α) τ,(αeq ) (α) τ,(αeq ) + O(ǫ2 ) − Ψ = Ψ − Ψ Ψ st→x x→ st→x x→

τ,(αeq ) = 2 Ψ(α) x→ + O(ǫ2 ), (9.7)

where the first equality follows by expanding around the equilibrium protocol (as in section 4.3) and the second equality follows from the time-reversal symmetry (4.21). See the remark after (8.5). By substituting this into (4.13), we get the linear response representation

τ,(α) (9.8) log ραx = β F (α) − β Hxα − lim Ψ(α) x→ + O(ǫ2 ). τ ↑∞

We then note that (α)

(α)

(α) τ,(αeq ) (α) τ,(α) = Ψex x→ + O(ǫ2 ), Ψ x→

(9.9)

where Ψex := Ψ(α) − Σhk . It is crucial here that both the quantities have finite τ ↑ ∞ limits21 . We then use (2.24) (with W = 0) to get

(α) τ,(α) (α) τ,(α) (9.10) Ψex x→ = Θex x→ − β Hxν + βhH ν iαst .

By recalling the definition (4.4) of the nonequilibrium entropy, (9.8) further reduces to

τ,(α) 2 log ραx = −S(α) − lim Θ(α) (9.11) ex x→ + O(ǫ ). τ ↑∞

These heuristic observation can be made into rigorous estimates by using the machinery of section 7. In particular all the error terms can be bounded uniformly in τ.

τ,(αeq )

τ,(α) The difference between Ψ(α) x→ and Ψ(α) x→ is proportional to ǫ2 , but is also roughly proportional to τ . Thus the difference diverges as τ ↑ ∞. 21

46

Let us examine what happens when we substitute the representation (9.6) into the right-hand Rτ R τinequality P (9.4). Since P side of the main d dt x∈S p˙ x (t) S(α) = −τ dt dt { x∈S px (t)} S(α) = 0, we only need to investigate −τ

(α) τ,(α) the contribution of Θex x→ . We rewrite the integral as Z

τ

−τ

dt

X x∈S

XX

τ,(α(t))

(α(t)) τ,(α(t)) = − lim , (9.12) p˙ x (t) Θ(α) {p (t) − p (t + ∆t)} Θex x x ex x→ x→ ∆t↓0

t

x∈S

where t is summed over the integral multiples of ∆t within the interval [−τ, τ ], and (α(t)) denotes the protocol in which the parameters are fixed at α(t) (for the given t). Now note that X

(α) [t,t+∆t],(α(t))

(α(t)) τ,(α(t)) Θex p(t)→ := {px (t) − px (t + ∆t)} Θex (9.13) x→ x∈S

is precisely the excess entropy production in the time interval [t, t + ∆t], where the parameters are fixed at α(t) and the distribution at t is given by p(t). Then it follows that Z τ X X

τ,(α(t)) [t,t+∆t],(α(t)) ˆ α ˆ dt p˙ x (t) Θ(α) Θ(α) = − lim = −hΘαex ist→ . (9.14) ex x→ ex p(t)→ −τ

x∈S

∆t↓0

t

The existence of the limits can be proved by using the materials in of section 7. We finally use (5.3) to rewrite the Shannon entropy SSh [ρα ] in terms of our nonequilibrium entropy S(α).

10

Models with momenta and symmetrized Shannon entropy

We have been so far studying models in which state variables are symmetric with respect to time reversal. In the case of a system of N particles, our state x roughly corresponds to the collection (~r1 , . . . , ~rN ), where ~rj ∈ R3 is the position of the j-th particle. In a “less coarse grained” description of a particle system one also specifies the momenta of the particles. In this case the state x roughly corresponds to the collection (~r1 , . . . , ~rN , p~1 , . . . , ~pN ), where ~pj ∈ R3 denotes the momentum. By the time reversal, this state is mapped to a different state (~r1 , . . . , ~rN , −~p1 , . . . , −~pN ). We denote the corresponding state as x∗ . Here we deal with a Markov jump process which, in some sense, mimics the structure of such a system with momenta. We see that all but one of the results in the previous sections remain valid if we properly modify the definition of entropy as in (10.7). The only exception is the extended Clausius inequality (5.17) in theorem 5.5, which can never be valid in the present setting.

47

10.1

Setting and the main observation

Let us give a precise and abstract definition. We assume that to any state x ∈ S there corresponds a state x∗ ∈ S, and one has (x∗ )∗ = x. We assume that the Hamiltonian is time reversal symmetric in the sense that Hxν = Hxν∗ for any x ∈ S. With the above physical interpretation in mind, we should modify the detailed balance condition (2.15) for equilibrium dynamics as ν

ν

(β,ν)

(β,ν) e−βHx Rx→y = e−βHy∗ Ry∗ →x∗ , (β,ν)

and also assume that the escape rate (see (2.1)) λx (β,ν)

λx

(10.1) :=

(β,ν)

P

y∈S (y6=x)

(β,ν)

Rx→y satisfies

= λx ∗ . α As before let Rx→y denote the transition rate for a general model including a nonequiα librium one. We assume that Rx→y 6= 0 for x 6= y implies Ryα∗ →x∗ 6= 0. The connectivity α of S by nonvanishing Rx→y is again assumed. We still assume that the escape rate (2.1) 22 α has the symmetry λx = λαx∗ . Corresponding to (10.1), the definition (2.13) of the entropy production should be modified as α Rx→y α θx→y := log α . (10.2) Ry∗ →x∗ All the other definitions are exactly the same as before, and we can develop the theory in an almost parallel manner. One essential difference is that the fundamental time-reversal symmetry (6.4)   † T αˆ [ˆ x† ] = exp −Θαˆ [ˆ x] T αˆ [ˆ x] (10.3)

is valid as it is, but for a given path xˆ = (x(t))t∈[−τ,τ ] we define its time-reversal as xˆ† := ((x(−t))∗ )t∈[−τ,τ ] . This means that some relations that follows from (6.4) should be properly modified by putting ∗ on some variables. The basic identity (6.6), for example, now reads αˆ† αˆ ν′  ν ˆ† α ˆ† ν  ν ˆ α ˆ e−βHx e−(βW +Ψ )/2 x→y = e−βHy∗ e−(βW +Ψ )/2 y∗ →x∗ .

(10.4)

The most important change for us is that the definition (6.12) should be modified as −βF (α)

e

−βHxν

:= e

lim

τ ↑∞

h e−Ψ

(α) /2

[−τ,−τ /2],(α)

ix→

[τ /2,τ ],(α)

[ e−Ψ(α) /2 ]st→x∗

.

(10.5)

With this modification, the identity (6.13) is valid as it is, and so are the (thermodynamic) relations we have discussed in sections 3, 4, and 5, except for the extended Clausius inequality (5.17) of Theorem 5.5. Here the nonequilibrium entropy S(α) is defined by (4.4) with the newly defined F (α). 22

α It may be also reasonable to consider a model in which λα x 6= λx∗ . In such a model, one should α α α ˆ include the contribution from λx /λx∗ into the definition of Θ [ˆ x] so as to keep the symmetry (10.3) valid (see [45]). Then all the results in the present section remain valid.

48

As for the microscopic representation of the entropy S(α), we encounter a nontrivial and suggestive modification. In models with time reversal symmetry, we have shown that the nonequilibrium entropy S(α) coincides with the Shannon entropy of the stationary distribution (to the order O(ǫ2 )) as in (4.12) or (5.3) in Theorem 5.1. This is no longer valid in the present setting, and we have the following extension. Theorem 10.1 (Nonequilibrium entropy and the symmetrized Shannon entropy) Take the same setting as in Theorem 5.1 but in the present class of models. There is a constant A′ > 0, and one has (10.6) S(α) − Ssym [ρα ] ≤ A′ ǫ3

for any ǫ, ν, and κ. Here ρα is the stationary probability distribution corresponding to the parameter α. For any probability distribution p, we have defined the symmetrized Shannon entropy by X √ Ssym [p] = − px log px px∗ . (10.7) x∈S

This theorem is proved in the next section. Let us make a few remarks about the symmetrized Shannon entropy (10.7). It is apparent that if a probability distribution p has a time-reversal symmetry in the (β,ν) sense that px = px∗ , then we have Ssym [p] = SSh [p]. Since the equilibrium state ρx = ν e−βHx /Z(β) is symmetric, the Shannon and the symmetrized Shannon entropies coincide for the equilibrium state. Note that one may rewrite (10.7) in a suggestive form Ssym [p] = −

X px + px ∗ 2

x∈S

log



px px ∗ ,

(10.8)

in which both the arithmetic mean and the geometric mean appear. We also note that for a general probability distribution p that 1 Ssym [p] = SSh [p] + D[p|p∗ ] ≥ SSh [p], (10.9) 2 P where (p∗ )x = px∗ , and D[p|p′ ] := x∈S px log(px /p′x ) is the relative entropy, which is in general nonnegative. Recall that when we proved the extended Clausius inequality (Theorem 5.5) in section 9, the Shannon entropy played an essential role. This means that the proof does not extended to the present situation. Indeed we know that the the extended Clausius inequality can never be valid in models with “momenta” since there are models in which it is explicitly violated. We shall see such an example in section 10.3.

49

10.2

Proof of Theorem 10.1

We shall only present a heuristic argument, which can be made into a proof in the similar manner as in section 8. By following the derivation in section 8, we can show the representation for the stationary distribution for NESS in the present setting, which is n

τ,(α) o τ,(α)

1 (10.10) log ραx = βF (α) − βHxν − lim Ψ(α) x∗ → − Ψ(α) st→x + O(ǫ3 ). 2 τ ↑∞ We thus find

log

p 1 ραx ραx∗ = βF (α)−βHxν + hΨist→x −hΨix∗ → +hΨist→x∗ −hΨix→ +O(ǫ3 ), (10.11) 4

which means

1 X α ρx hΨist→x − hΨix∗ → + hΨist→x∗ − hΨix→ +O(ǫ3 ). 4 x∈S (10.12) α Since the left-hand side is S(α) − Ssym [ρ ], we shall show that the right-hand side is O(ǫ3 ). To do this, rewrite the right-hand side as β{hH ν iαst − F (α)} − Ssym [ρα ] =

1 X α ρx hΨist→x − hΨix∗ → + hΨist→x∗ − hΨix→ 4 x∈S 1 X α = ρx −hΨist→x − hΨix∗ → + hΨist→x∗ + hΨix→ 4 x∈S  1X α = (ρx − ραx∗ ) hΨix→ + hΨist→x∗ . 4 x∈S

(10.13)

 P To get the second line, we noted that (1/2) x ραx hΨist→x − hΨix→ = 0 for any τ , and subtracted this from the first line. We clearly have ραx − ραx∗ = O(ǫ). To bound the other term, we note that the corresponding equilibrium process satisfies (α )

eq (α) eq ) hΨ(α) i(α ist→x ∗ = 0, x→ + hΨ

(10.14)

which is an easy consequence of (properly rewritten version of) (6.5). Since Ψ = O(ǫ), we see that hΨix→ + hΨist→x∗ = O(ǫ2 ). This leads us to the desired (heuristic) estimate S(α) − Ssym [ρα ] = O(ǫ3 ).

10.3

An illustrative example

In order to demonstrate that one can never expect a general extended Clausius inequality, we here discuss an oversimplified model of heat conduction. The analysis also illustrates the role of the symmetrized Shannon entropy in the extended Clausius relation. 50

β2

β1

Figure 8: The transition rates for the simple model of heat conduction. The particle performs a back-and-forth stochastic motion between the two ends of the chain which are attached to distinct heat baths.

We consider a system of a single particle on a chain of length L whose left and right ends are attached to heat baths with the inverse temperatures β1 and β2 , respectively. The particle performs a back-and-forth motion between the two ends, carrying energy from one end to the other. The state of the model is specified as x = (j, v, k) ∈ S, where j ∈ {1, . . . , L} denotes the position, v ∈ {+, −} the velocity, and k ∈ {1, . . . , n} the internal degree of freedom of the particle. We define x∗ = (j, −v, k). The particle has the internal energy uk when it is in the state k. The model parameter is α = (β1 , β2 ). We define the transition rates as follows (Fig. 8). Let λ > 0. We first set (β ,β )

1 2 R(j,+,k)→(j+1,+,k) = λ,

(β ,β )

1 2 R(j,−,k)→(j−1,−,k) = λ,

(j = 1, . . . , L − 1),

(10.15)

(j = 2, . . . , L),

(10.16)

for any k, which describe the one-way-motion of the particle. Note that k does not change when the particle moves. At the ends of the chain we set (β ,β )

e−β1 uℓ , Z(β1 ) e−β2 uℓ =λ , Z(β2 )

1 2 R(0,−,k)→(0,+,ℓ) =λ

(β ,β )

1 2 R(L,+,k)→(L,−,ℓ)

(10.17) (10.18)

for any k and ℓ. These represent the processes the particle is bounced back and Pwhere (β1 ,β2 ) n −βuk . We set Rx→y = 0 for other thermalized at the ends. We defined Z(β) = k=1 e (β1 ,β2 ) combinations. Then the escape rate (2.1) is λx = λ for all x ∈ S. From the definition (10.2), we see that the only nonvanishing components of the (β1 ,β2 ) entropy production θx→y are (β ,β )

(β1 ,β2 ) θ(0,−,k)→(0,+,ℓ)

and

= log

1 2 R(0,−,k)→(0,+,ℓ)

(β ,β )

1 2 R(0,−,ℓ)→(0,+,k)

= β1 (uk − uℓ ),

(β ,β )

1 2 θ(L,+,k)→(L,−,ℓ) = β2 (uk − uℓ ).

Note that uk − uℓ is the energy transferred from the particle to the bath. 51

(10.19)

(10.20)

It must be clear that the model is almost trivial. The stationary distribution is readily obtained as   e−β1 uk 1 e−β2 uk (β1 ,β2 ) ρ(j,v,k) = δv,+ . (10.21) + δv,− 2L Z(β1) Z(β2 ) Here the position of the particle is distributed uniformly on the chain, and the internal degree of freedom has the inverse temperature β1 or β2 when the particle is moving to the right or to the left, respectively. Let the average (internal) energy at the inverse temperature β be n

1 X u(β) := uk e−βuk . Z(β) k=1

(10.22)

In the NESS, the average (internal) energy of the particle is u(β1 ) or u(β2 ) when it is moving to the right or to the left, respectively. The average heat (or energy) current from the bath to the particle at L = 1 is thus {u(β1 ) − u(β2 )}λ/(2L), where λ/(2L) is the rate by which the particle is bounced back at L = 1. By also considering the current at j = L, the entropy production rate (in the baths) in NESS (4.8) is found to (β ,β ) be σst 1 2 = −(β1 − β2 ){u(β1 ) − u(β2 )}λ/(2L) ≥ 0. From the stationary distribution (10.21), we can explicitly compute its Shannon and symmetrized Shannon entropies as β1 u(β1) + β2 u(β2 ) log Z(β1 ) + log Z(β2 ) + + log(2L), 2 2

(10.23)

β1 + β2 u(β1 ) + u(β2 ) log Z(β1 ) + log Z(β2 ) + + log(2L). 2 2 2

(10.24)

SSh [ρ(β1 ,β2 ) ] = and Ssym [ρ(β1 ,β2 ) ] =

Let us consider the step protocol (5.4) with α = (β1 , β2 ) and α′ = (β1′ , β2′ ). The excess entropy production in the step protocol is easily evaluated by comparing the entropy productions in this process and another process that starts at t = 0 with the stationary distribution corresponding to (β1′ , β2′ ). The two processes differ only in the distribution of the internal energy of the particle at t = 0, and they become identical after the particle is bounced back by one of the ends for the first time. From this observation one immediately finds that the excess entropy production is given by23  i 1h  ˆ α ˆ hΘαex ist→ = β1′ u(β2 ) − u(β2′ ) + β2′ u(β1 ) − u(β1′ ) . (10.25) 2 Let us start by confirming the validity of the extended Clausius relation (4.10) or (5.15). Writing βk′ = βk + ∆βk for k = 1, 2, and δ = max{|∆β1 |, |∆β2|}, we find from 23

One can also compute the expectation value of the entropy production rate explicitly as PL (β1′ ,β2′ ) (β ′ ,β ′ ) σ px (t) = σst 1 2 + [β1′ {u(β1 ) − u(β1′ )} + β2′ {u(β2 ) − u(β2′ )}](2L)−1 j=1 λj tj−1 e−λt for t ≥ 0. x x It decays exponentially to the steady value. P

52

(10.25) that 1 β2 u′ (β1 ) ∆β1 + β1 u′ (β2 ) ∆β2 + O(δ 2) 2 1 1 = − β u′(β) (∆β1 + ∆β2 ) + u′ (β) − βu′′ (β) (∆β1 − ∆β2 ) ǫ 2 4 2 2 + O(ǫ δ) + O(δ ), (10.26)

ˆ α ˆ hΘαex ist→ =−

where, we wrote u′ (β) = du(β)/dβ. In the second line, we set β = (β1 + β2 )/2 and ǫ = β1 − β2 . The expression (10.26) should be compared with the difference between the entropies (defined as S(β1 , β2 ) := Ssym [ρ(β1 ,β2 ) ]), which is S(β1′ , β2′ )

1h  − S(β1 , β2 ) = − u(β1 ) − u(β2 ) (∆β1 − ∆β2 ) 4  i + (β1 + β2 ) u′ (β1 ) ∆β1 + u′ (β2 ) ∆β2 + O(δ 2) 1 1 = β u′ (β) (∆β1 + ∆β2 ) − u′ (β) − βu′′ (β) (∆β1 − ∆β2 ) ǫ 2 4 2 2 + O(ǫ δ) + O(δ ). (10.27)

ˆ α ˆ We thus have24 S(β1′ , β2′ ) − S(β1 , β2 ) = −hΘαex ist→ + O(ǫ2 δ) + O(δ 2), which is the extended Clausius relation (4.10), (5.15). We note that the difference between the Shannon entropies ′



SSh [ρ(β1 ,β2) ] − SSh [ρ(β1 ,β2 ) ] =

1 β1 u′ (β1 ) ∆β1 + β2 u′ (β2 ) ∆β2 + O(δ 2 ), 2

(10.28)

ˆ α ˆ differs from −hΘαex ist→ by O(ǫ δ). To see that the corresponding inequality is impossible we go onto compute the O(δ 2 ) term explicitly to get

1 ′ u′′(β)  (∆β1 )2 − (∆β2 )2 ǫ u (β) ∆β1 ∆β2 + 2 8 2 3 + O(ǫ δ) + O(δ ). (10.29)

ˆ α ˆ S(β1′ , β2′ ) − S(β1 , β2 ) + hΘαex ist→ =−

Let us set, for simplicity, |∆β1 | = |∆β2|. Then the right-hand side becomes −(1/2) u′(β) ∆β1 ∆β2 + O(ǫ2 δ) + O(δ 3 ). We first note that the term −(1/2) u′(β) ∆β1 ∆β2 vanishes if we use the quasi-static protocol (which connects (β1 , β2 ) and (β1′ , β2′ )) as in (5.8), (5.9). This term therefore represents the effect of the sudden operation, i.e., the step protocol. Now recall that ˆ α ˆ the extended Clausius inequality (5.17) would imply S(β1′ , β2′ ) − S(β1 , β2 ) + hΘαex ist→ ≥ 2 3 ′ ′ O(ǫ δ) + O(δ ). But, since u (β) < 0, we have −(1/2) u (β) ∆β1 ∆β2 < 0 provided that ∆β1 ∆β2 < 0. This provides a concrete counterexample to the extended Clausius inequality. 24

The term O(ǫ2 δ) is nonvanishing.

53

It is a pleasure to thank Hisao Hayakawa, Masato Itami, Nobuyasu Ito, Chris Jarzynski, Gianni Jona-Lasinio, Joel Lebowitz, Christian Maes, Karel Netocny, Yoshi Oono, Glenn Paquette, Takahiro Sagawa, Keiji Saito, Herebert Spohn, and Akira Shimizu for valuable discussions. The present study was supported by KAKENHI Nos. 22340109, 23540435, and 25103002, by the JSPS Core-to-Core program “Non-equilibrium dynamics of soft-matter and information”, and partially by JSPS and Leading Research Organizations, namely NSERC, ANR, DFG, RFBR, RCUK and NSF as Partner Organizations under the G8 Research Councils Initiative for Multilateral Research Funding.

References [1] Y. Oono and M. Paniconi, Steady state thermodynamics, Prog. Theor. Phys. Suppl. 130, 29-44 (1998). [2] T. S. Komatsu, N. Nakagawa, S.-I. Sasa and H. Tasaki, Steady State Thermodynamics for Heat Conduction — Microscopic Derivation, Phys. Rev. Lett. 100, 230602 (2008). arXiv:0711.0246 [3] T. S. Komatsu, N. Nakagawa, S.-I. Sasa and H. Tasaki, Entropy and Nonlinear Nonequilibrium Thermodynamic Relation for Heat Conducting Steady States, J. Stat. Phys. 142, 127–153 (2011). arXiv:1009.0970 [4] R. Landauer, dQ = T dS far from equilibrium, Phys. Rev. A18, 255-266 (1978). [5] D. Ruelle, Extending the definition of entropy to nonequilibrium steady states, Proc. Natl. Acad. Sci. U.S.A. 100, 3054–3058 (2003). arXiv:cond-mat/0303156 [6] K. Saito and H. Tasaki, Extended Clausius Relation and Entropy for Nonequilibrium Steady States in Heat Conducting Quantum Systems, J. Stat. Phys. 145, 1275–1290 (2011). arXiv:1105.2168 [7] T. Sagawa and H. Hayakawa, Geometrical expression of excess entropy production, Phys. Rev. E 84, 051110 (2011). arXiv:1109.0796 [8] T. Yuge, T. Sagawa, A. Sugita, and H. Hayakawa, Geometrical Excess Entropy Production in Nonequilibrium Quantum Systems, J. Stat. Phys. 153, 412–441 (2013). arXiv:1305.5026 [9] T. Hatano and S.-I. Sasa, Steady-State Thermodynamics of Langevin Systems, Phys. Rev. Lett. 86, 3463 (2001). arXiv:cond-mat/0010405 54

[10] L. Bertini, D. Gabrielli, G. Jona-Lasinio, and C. Landim, Thermodynamic transformations of nonequilibrium states, J. Stat. Phys. 149, 773–802 (2012). arXiv:1206.2412 [11] L. Bertini, D. Gabrielli, G. Jona-Lasinio, and C. Landim, Clausius Inequality and Optimality of Quasistatic Transformations for Nonequilibrium Stationary States, Phys. Rev. Lett. 110, 020601 (2013). arXiv:1208.1872 [12] G. Jona-Lasinio, Thermodynamics of stationary states, J. Stat. Mech. P02004 (2014). [13] C. Maes and K. Netocny, A nonequilibrium extension of the Clausius heat theorem, J. Stat. Phys. 154, 188–203 (2014). arXiv:1206.3423 [14] S.-I. Sasa, Possible extended forms of thermodynamic entropy, J. Stat. Mech. P01004 (2014). arXiv:1309.7131 [15] S.-I. Sasa and Hal Tasaki, Steady State Thermodynamics, J. Stat. Phys. 125, 125– 224 (2006). arXiv:cond-mat/0411052 [16] P. Pradhan, C.P. Amann, and U. Seifert, Nonequilibrium Steady States in Contact: Approximate Thermodynamic Structure and Zeroth Law for Driven Lattice Gases, Phys. Rev. Lett. 105, 150601 (2010). arXiv:1002.4349 [17] P. Pradhan, C.P. Amann, and U. Seifert, Approximate thermodynamic structure for driven lattice gases in contact, Phys. Rev. E 84, 041104 (2011). arXiv:1107.5434 [18] R. Dickman and R. Motai, Inconsistencies in steady state thermodynamics, preprint (2014). arXiv:1401.1678 [19] E. Boksenbojm, C. Maes, K. Netocny, and J. Pesek, Heat capacity in nonequilibrium steady states, Euro Phys. Lett. 96, 40001 (2011). arXiv:1109.3054 [20] D. Mandal, Nonequilibrium heat capacity, Phys. Rev. E 88, 062135 (2013). arXiv:1311.7176 [21] N. Nakagawa, Universal exact expression for adiabatic pumping in terms of nonequilibrium steady states, preprint (2014). arXiv:1401.4242 55

[22] L. Bertini, A. De Sole, D. Gabrielli, G. Jona-Lasinio, and C. Landim, Fluctuations in Stationary Nonequilibrium States of Irreversible Processes, Phys. Rev. Lett. 87, 040601 (2001). arXiv:cond-mat/0104153 [23] L. Bertini, A. De Sole, D. Gabrielli, G. Jona-Lasinio, and C. Landim, Large deviation approach to non equilibrium processes in stochastic lattice gases, Bull. Braz. Math. Soc. 37, 611–643 (2006). arXiv:math/0602557 [24] L. Bertini, A. De Sole, D. Gabrielli, G. Jona-Lasinio, and C. Landim, Towards a Nonequilibrium Thermodynamics: A Self-Contained Macroscopic Description of Driven Diffusive Systems, J. Stat. Phys. 135, 857–872 (2009). arXiv:0807.4457 [25] G. Jona-Lasinio, From Fluctuations in Hydrodynamics to Nonequilibrium Thermodynamics, Prog. Theor. Phys. Suppl. 184, 262 (2010). arXiv:1003.4164 [26] B. Derrida, J. L. Lebowitz, and E. R. Speer, Free Energy Functional for Nonequilibrium Systems: An Exactly Solvable Case, Phys. Rev. Lett. 87, 150601 (2001). arXiv:cond-mat/0105110 [27] B. Derrida, J. L. Lebowitz, and E. R. Speer, Exact Large Deviation Functional of a Stationary Open Driven Diffusive System: The Asymmetric Exclusion Process, J. Stat. Phys. 110, 775–810 (2003). arXiv:cond-mat/0205353 [28] T. Bodineau and B. Derrida, Current Fluctuations in Nonequilibrium Diffusive Systems: An Additivity Principle, Phys. Rev. Lett. 92, 180601 (2004). arXiv:cond-mat/0402305 [29] E. H. Lieb and J. Yngvason, The entropy concept for non-equilibrium states, Proc. R. Soc. A 2013 469, 20130408 (2013). http://rspa.royalsocietypublishing.org/content/469/2158/20130408 [30] U. Seifert, Stochastic thermodynamics, fluctuation theorems, and molecular machines, Rep. Prog. Phys. 75, 126001 (2012). arXiv:1205.4176 [31] R. E. Spinny and I. J. Ford, Non-equilibrium thermodynamic systems with odd and even variables, Phys. Rev. Lett. 108, 170603 (2012). arXiv:1201.0904 [32] C. Jarzynski, Nonequilibrium Equality for Free Energy Differences, Phys. Rev. Lett. 78, 2690 (1997). arXiv:cond-mat/9610209 56

[33] G.E. Crooks, Entropy production fluctuation theorem and the nonequilibrium work relation for free energy differences, Phys. Rev. E 60, 2721 (1999). arXiv:cond-mat/9901352 [34] G.E. Crooks, Path-ensemble averages in systems driven far from equilibrium, Phys. Rev. E 61, 2361, (2000). arXiv:cond-mat/9908420 [35] U. Seifert, Entropy Production along a Stochastic Trajectory and an Integral Fluctuation Theorem, Phys. Rev. Lett. 95, 040602 (2005). arXiv:cond-mat/0503686 [36] C. Maes and K. Netocny, Time-Reversal and Entropy, J. Stat. Phys. 110, 269–310 (2003). arXiv:cond-mat/0202501 [37] M. Baiesi, C. Maes, and B. Wynants, Fluctuations and response of nonequilibrium states, Phys. Rev. Lett. 103, 010602 (2009). arXiv:0902.3955 [38] P. Baerts, U. Basu, C. Maes, and S. Safaverdi, The frenetic origin of negative differential response, Phys. Rev. E 88, 052109 (2013). arXiv:1308.5613 [39] N. Nakagawa, Work Relation and the Second Law of Thermodynamics in Nonequilibrium Steady States, Phys. Rev. E 85, 051115 (2012). arXiv:1109.1374 [40] T. S. Komatsu and N. Nakagawa, An expression for stationary distribution in nonequilibrium steady state, Phys. Rev. Lett. 100, 030601 (2008). arXiv:0708.3158 [41] C. Maes and K. Netocny, Rigorous meaning of McLennan ensembles, J. Math. Phys. 51, 015219 (2010). arXiv:0911.1032 [42] A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications, (Springer, New York, 1998). [43] T. S. Komatsu, N. Nakagawa, S.-I. Sasa and H. Tasaki, Representation of Nonequilibrium Steady States in Large Mechanical Systems, J. Stat. Phys. 134, 401–423 (2009). arXiv:0805.3023 [44] T. M. Cover and J. A. Thomas, Elements of Information Theory, (WileyInterscience, 2006)

57

[45] M. Itami and S.-I. Sasa, Nonequilibrium Statistical Mechanics for Adiabatic Piston Problem, in preparation.

58