PhD Thesis

Adaptive mesh refinement techniques for high order shock capturing schemes for hyperbolic systems of conservation laws

Antonio Baeza Manzanares

Advisor: Pep Mulet Mestre

Universitat de València

València, 2010.

ADAPTIVE MESH REFINEMENT TECHNIQUES FOR HIGH ORDER SHOCK CAPTURING SCHEMES FOR HYPERBOLIC SYSTEMS OF CONSERVATION LAWS

Dissertation presented by Antonio Baeza Manzanares, Llicenciat (graduate) in Mathematics; carried out at the Department of Applied Mathematics of the Universitat de València under the supervision of Pep Mulet Mestre, Professor Titular (tenured associate professor) of this department, with the aim of applying for the degree of Doctor in Mathematics.

València, 25 February 2010

Pep Mulet Mestre
Thesis advisor

Antonio Baeza Manzanares
Candidate for the degree of Doctor

DEPARTAMENT DE MATEMÀTICA APLICADA
FACULTAT DE MATEMÀTIQUES
UNIVERSITAT DE VALÈNCIA

Contents

Contents
Acknowledgments
Summary
Abstract

1 Introduction
  1.1 Motivation
    1.1.1 High resolution shock-capturing schemes
    1.1.2 Need of fine resolution computational grids
    1.1.3 AMR: spatial and temporal refinement
  1.2 Previous work
  1.3 Scope of the work
  1.4 Organization of the text

2 Fluid dynamics equations
  2.1 Hyperbolic conservation laws
  2.2 Properties of hyperbolic conservation laws
    2.2.1 Discontinuous solutions
    2.2.2 Weak solutions
    2.2.3 Rankine-Hugoniot conditions
    2.2.4 Characteristic structure of a system of conservation laws
  2.3 Model equations
    2.3.1 Scalar hyperbolic equations
    2.3.2 Linear hyperbolic systems
    2.3.3 Nonlinear hyperbolic systems

3 Numerical methods for fluid dynamics
  3.1 Discretization
  3.2 Norms and convergence
    3.2.1 Consistency
    3.2.2 Stability
    3.2.3 The Lax equivalence theorem
  3.3 Elementary methods
  3.4 Conservative methods
  3.5 High resolution conservative methods
    3.5.1 Semi-discrete methods
  3.6 Numerical methods for one-dimensional hyperbolic systems
  3.7 Implementation of artificial boundary conditions

4 Shu-Osher's finite difference with Donat-Marquina's flux splitting
  4.1 Shu-Osher's finite difference flux reconstruction
  4.2 Donat-Marquina's flux formula
  4.3 Reconstruction procedures
    4.3.1 ENO and WENO reconstruction for cell-average discretizations
    4.3.2 ENO and WENO reconstructions for point-value discretizations
  4.4 The complete integration algorithm

5 Adaptive mesh refinement
  5.1 Motivation
  5.2 Discretization and grid organization
  5.3 Integration
  5.4 Projection
  5.5 Adaptation
  5.6 Grid interpolation
    5.6.1 Grid interpolation and Runge-Kutta time integration

6 Implementation and parallelization of the algorithm
  6.1 Sequential implementation
    6.1.1 Hierarchical grid system
    6.1.2 The adaptation process
    6.1.3 Integration algorithm
    6.1.4 Flux projection
  6.2 Parallel implementation

7 Numerical experiments
  7.1 One-dimensional tests
    7.1.1 Linear advection equation
    7.1.2 Inviscid Burgers' equation
    7.1.3 The Euler equations of gas dynamics
    7.1.4 Two-component Euler equations in 1D
  7.2 Two-dimensional tests
    7.2.1 A Riemann problem for the Euler equations
    7.2.2 Double Mach reflection
    7.2.3 Shock-vortex interaction
    7.2.4 Shock-bubble interaction

8 Conclusions and further work
  8.1 Conclusions
  8.2 Further work

A A generic description of the AMR algorithm
  A.1 Grid system
  A.2 Grid adaptation
  A.3 Integration and projection
  A.4 Cartesian grids
    A.4.1 Uniform cell-refined Cartesian grid hierarchies
    A.4.2 Adaptation for cell-refined Cartesian grids
    A.4.3 Integration and Projection of Shu-Osher's finite-difference algorithm with Donat-Marquina's flux split on a Cartesian grid hierarchy

Bibliography

Acknowledgments

First of all, thanks to my advisor Pep Mulet, to Rosa Donat and to Paco Aràndiga, who, more than advisor and colleagues, have always treated me as true friends. Thanks also to the rest of the Departament de Matemàtica Aplicada, which welcomed me so warmly during the years I spent there.

Thanks also to Jacques Liandrat and Guillaume Chiavassa, and to the rest of the LATP, with whom I shared some months in Marseille that were very important to me. To Enrique Zuazua and Francisco Palacios, who gave me the opportunity to join their project in Madrid, to Fernando Monge for facilitating my stay at INTA, and to my colleagues, both at INTA and IMDEA and at the other institutions I was involved with during the two years I spent there. To Norbert Kroll, Markus Widhalm and the rest of my colleagues at DLR in Braunschweig, for making my stay there in 2008 both possible and pleasant. To Vicent Caselles, who also trusted me for his projects, and to the Grup d'Imatge of Barcelona Media.

Thanks to the reviewers of this thesis for their comments and suggestions.

I especially want to thank my parents, Antonio and Nieves, and my sister, Anna, for always being on my side, as well as my family, which is fortunately very large and whom I cannot name one by one, and Marta, for bringing joy and peace to my life every day. To Kanke, Juan, Anna and Milena, for the great friendship that unites us despite the distance that separates us. Thanks to all my friends; you are many and I cannot name you one by one.


Summary

Introduction

Recent years have seen a large increase in the processing power of computers, which has allowed researchers to advance in the simulation of physical problems that could not be tackled without their help. Computational Fluid Dynamics (CFD) is a discipline that makes massive use of numerical simulation to address problems such as vehicle design (cars, aircraft, submarines, etc.), traffic control, weather prediction and others (see, for example, [164, 126, 191, 3, 171, 53, 86, 133, 17, 129]). Numerical solutions of ever higher accuracy are sought, so increasingly powerful methods have been developed, to be applied on computational grids of ever higher resolution.

Despite the power of current computing systems, the computational cost of such an approach can be unaffordable. Although there is a debate on whether a high order method applied on a low resolution grid is preferable to a low order method applied on a high resolution grid, when both provide solutions of the same quality [58, 80, 87, 110], the problems posed nowadays require high order methods applied on high resolution grids. Examples of problems with these requirements are those involving Rayleigh-Taylor [168, 172] and Richtmyer-Meshkov [146, 122] instabilities, shock-vortex interactions [48] and turbulence [189].

High resolution grids are especially useful in regions where phenomena involving non-smooth structures, such as those just mentioned, appear. This idea has generated a multitude of methods based on using different resolutions in different regions, in order to reduce the computational cost of the algorithms. We can highlight methods based on non-uniform or unstructured grids [42, 73], penalization methods [27, 190] and meshless methods [20]. A drawback of these methodologies is that the Courant-Friedrichs-Lewy (CFL) condition, necessary to ensure the stability of the method, imposes an upper bound on the usable time step, and this bound decreases with the size of the smallest cell of the grid. Thus, using cells of different sizes does not reduce the number of time iterations needed to compute the solution.

The Adaptive Mesh Refinement (AMR) algorithm of Berger [21, 24, 22, 139] has as its most relevant distinguishing feature the twofold refinement in space and in time. The goal of the algorithm is not so much to reduce the number of cells on which it computes as to reduce the number of operations needed to update the solution between two time instants. Unlike the options mentioned above, AMR exploits the fact that cells of different size can be updated using different time steps: it creates overlapping grids of different resolutions that can be integrated separately, each with its own time step, in a coordinated way. This increases the total number of cells, because of the overlapping of the grids, but it reduces the total number of integrations to be performed and, as a consequence, the computational cost of the algorithm.

Since the first description of AMR in 1982 [26, 21], a considerable amount of research has been carried out around it.
Berger and Oliger [24] described the algorithm for hyperbolic conservation laws. By the end of the 1980s there were already implementations for two-dimensional problems [22, 10], while the first three-dimensional results appeared in the early 1990s [19]. AMR algorithms combined with high order solvers appeared at the end of the last century. The PPM (Piecewise Parabolic Method) [39] is commonly used in conjunction with AMR [29], and work is currently under way on algorithms based on AMR and methods of ENO (Essentially Non-Oscillatory) and WENO (Weighted Essentially Non-Oscillatory) type [106, 108, 15, 192].

In this work we develop a high resolution numerical method for hyperbolic systems of conservation laws, based on some of the most modern numerical techniques in computational fluid dynamics, and we address the main difficulties that arise when these techniques are used together. We also show how an algorithm of this kind can be implemented in parallel. Our algorithm is based on the combination of a High Resolution Shock Capturing (HRSC) method for the numerical solution of fluid equations [118] and an adaptive mesh refinement method [22]. The solver is built from Shu and Osher's finite-difference formulation [161], a fifth order WENO interpolator [81], Donat and Marquina's flux splitting [44] and a third order Runge-Kutta method [161]. Some important points that we have studied are: conservation of physical quantities between grids; refinement procedures; adaptation and generation of adaptive computational grids; implementation and parallelization of the algorithm; and a description of the AMR algorithm that is more general than the one commonly found in the scientific literature.

The thesis is structured in 8 chapters and one appendix. Chapter 2 introduces the basic concepts on the equations of fluid dynamics and some commonly used model equations. Chapter 3 establishes the general foundations of numerical methods for fluid equations. On these foundations, Chapter 4 builds a high order solver based on Shu and Osher's finite-difference formulation with Donat and Marquina's flux splitting. Chapter 5 describes an adaptive mesh refinement method and how it is combined with the algorithm described in Chapter 4. The implementation of the resulting algorithm is described in Chapter 6. In Chapter 7 we carry out a set of numerical experiments to assess the performance of the algorithm and to illustrate some of its properties. The conclusions of the work are presented in Chapter 8. Finally, Appendix A describes a generalization of the adaptive mesh refinement algorithm.
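The cost argument for time refinement made in this introduction can be illustrated with a back-of-the-envelope count of cell updates in one dimension. The sketch below is not taken from the thesis: the function name `cell_updates`, the two-level setup, the refinement factor `r` and the fraction `phi` of the domain that needs fine resolution are illustrative assumptions, and the count ignores the overhead of adaptation, interpolation and projection.

```python
def cell_updates(n_fine, steps_fine, r, phi):
    """Count cell updates needed to reach a fixed final time (1D, hypothetical model).

    n_fine     : number of cells of a uniform grid at the finest resolution
    steps_fine : number of time steps the CFL condition allows at that resolution
    r          : refinement factor between coarse and fine cells
    phi        : fraction of the domain that needs the fine resolution
    """
    # Uniform fine grid: every cell is updated at every (small) time step.
    uniform = n_fine * steps_fine

    # Non-uniform grid with one global time step: fewer cells, but the
    # smallest cell still dictates the time step for all of them.
    nonuniform = ((1 - phi) * n_fine / r + phi * n_fine) * steps_fine

    # Two-level AMR: a coarse grid over the whole domain advanced with a
    # time step r times larger, plus an overlapping fine patch advanced
    # with the small time step.
    amr = (n_fine / r) * (steps_fine / r) + phi * n_fine * steps_fine

    return uniform, nonuniform, amr


uniform, nonuniform, amr = cell_updates(n_fine=1000, steps_fine=1000, r=4, phi=0.1)
print(uniform, nonuniform, amr)
```

With these sample values the non-uniform grid has fewer cells than the AMR hierarchy (325 against 350), yet under this simplified model it needs about twice as many cell updates, which is the trade-off described above.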

Fluid dynamics equations

In physics, a conservation law states that a certain property of an isolated physical system does not change with time. Conservation laws are usually modeled by integral equations, but in practice one uses systems of partial differential equations (PDE) that are equivalent to the integral form when the solution is smooth. In Chapter 2 we introduce the fundamental concepts on conservation laws and their solutions. The main reference for that chapter is the book by Landau and Lifshitz [93]; other fundamental references are [16, 36, 41, 100, 91, 90, 119, 188].

Conservation laws have the form

$$\frac{\partial u}{\partial t} + \sum_{q=1}^{d} \frac{\partial f^q(u)}{\partial x_q} = 0, \qquad x \in \mathbb{R}^d, \quad t \in \mathbb{R}^+, \tag{1}$$

where $d$ is the number of space dimensions, $u : \mathbb{R}^d \times \mathbb{R}^+ \to \mathbb{R}^m$ is the solution of the conservation law, and $f^q : \mathbb{R}^m \to \mathbb{R}^m$ are the fluxes. If $m = 1$ the conservation law is called scalar. The problem usually posed is a Cauchy problem, that is, to find the solution of the equation at a time $T > 0$ knowing the value of the solution at time $t = 0$, so an initial condition is prescribed:

$$u(x, 0) = u_0(x), \qquad x \in \mathbb{R}^d.$$

System (1) can be written in the form

$$\frac{\partial u}{\partial t} + \sum_{q=1}^{d} A_q(u) \frac{\partial u}{\partial x_q} = 0, \qquad x \in \mathbb{R}^d, \quad t \in \mathbb{R}^+,$$

where the matrices $A_q(u) = \frac{\partial f^q}{\partial u}$ are the Jacobians of the system. System (1) is called hyperbolic if any combination

$$\sum_{q=1}^{d} \xi_q A_q, \qquad \xi_q \in \mathbb{R},$$

has $m$ real eigenvalues and $m$ linearly independent eigenvectors, that is, if each of these combinations is similar to a diagonal matrix. If the $m$ eigenvalues of $A_q(u)$ are distinct, the system is called strictly hyperbolic.

It is well known that solutions of hyperbolic systems can contain discontinuities, even if the initial data and the fluxes are smooth. To deal with this kind of solution, the concept of weak solution is introduced: a function $u(x, t)$ is a weak solution of (1) with initial data $u(x, 0)$ if

$$\int_{\mathbb{R}^+} \int_{\mathbb{R}^d} \left( u(x, t) \frac{\partial \phi}{\partial t}(x, t) + \sum_{q=1}^{d} f^q(u) \frac{\partial \phi}{\partial x_q}(x, t) \right) dx\, dt = - \int_{\mathbb{R}^d} \phi(x, 0)\, u(x, 0)\, dx \tag{2}$$

holds for every function $\phi \in C_0^1(\mathbb{R}^d \times \mathbb{R}^+)$, where $C_0^1(\mathbb{R}^d \times \mathbb{R}^+)$ is the space of continuously differentiable functions with compact support in $\mathbb{R}^d \times \mathbb{R}^+$.


The solutions of (2) are the solutions of the integral form of the equations. Moreover, solutions of (1) are weak solutions, and weak solutions are solutions of (1) whenever they are sufficiently smooth. Weak solutions are characterized as those functions that satisfy the differential form where they are smooth and the Rankine-Hugoniot conditions [143, 75] where they are not. The Rankine-Hugoniot conditions relate the values of the solution around a discontinuity to the speed at which the discontinuity propagates:

$$[f] \cdot n_\Sigma = s\, [u] \cdot n_\Sigma,$$

where $f = (f^1, \ldots, f^d)$ is a matrix containing the fluxes, $u$ is the solution, $s$ is the speed of the discontinuity and $n_\Sigma$ is the normal vector to the discontinuity. The notation $[\cdot]$ denotes the jump of a variable across the discontinuity.

It is also well known that more than one function may be a weak solution of an equation. Therefore, additional conditions are added to complete the definition of weak solution, so that the physically correct solution, known as the entropy solution, can be identified. Some of these conditions are due to Oleinik [130], Lax [98], Wendroff [187] and Liu [111].

The vast majority of numerical methods for hyperbolic systems exploit, in one way or another, the fact that the Jacobian matrices are diagonalizable. If $R_q$ is the matrix of right eigenvectors of $A_q(u)$, $R_q = [r_1^q, \ldots, r_m^q]$, and $\Lambda_q$ is the diagonal matrix formed by the corresponding eigenvalues, $\Lambda_q = \mathrm{diag}(\lambda_1^q, \ldots, \lambda_m^q)$, then $A_q(u)$ can be written as $A_q = R_q \Lambda_q R_q^{-1}$. The eigenvectors $r_p^q$ define vector fields, called characteristic fields, along which the behavior of the solutions can be interpreted. Two types of field are of particular importance. The first are the genuinely nonlinear fields, defined as those for which

$$\nabla \lambda_p(u) \cdot r_p(u) \neq 0, \qquad \forall u,$$

where $\nabla \lambda_p(u)$ is the gradient of $\lambda_p(u)$. The associated eigenvalue varies monotonically along integral curves of the characteristic field. Note that in linear systems $A_q$ does not depend on $u$, and therefore $\lambda_p$ is constant with respect to $u$. This implies that genuinely nonlinear fields cannot appear in linear systems; they are specific to nonlinear systems. If, on the contrary,

$$\nabla \lambda_p(u) \cdot r_p(u) = 0, \qquad \forall u, \tag{3}$$

then the field is called linearly degenerate, and it is characterized by the fact that the associated eigenvalue is constant along the integral curves of $r_p(u)$, as happens in linear systems, where $\nabla \lambda_p = 0$.

The eigenvalues of the Jacobian matrices determine some types of discontinuities that appear in hyperbolic systems. If $x = s(t)$ is a discontinuity separating two states $u_L(t)$ and $u_R(t)$, the discontinuity is said to be a shock associated with the $p$-th characteristic field if

$$\lambda_p(u_L) \geq s'(t) \geq \lambda_p(u_R). \tag{4}$$

If equalities hold in (4), $s(t)$ is said to be a contact discontinuity associated with the $p$-th field. Rarefaction waves, on the other hand, are characterized by the eigenvalue increasing along the corresponding characteristic field. These facts indicate that if two states lie on the same integral curve of a linearly degenerate field and are connected by a discontinuity, then, because of (3), the eigenvalues corresponding to these states are equal, and the discontinuity can only be a contact discontinuity. Genuinely nonlinear fields, in turn, can exhibit shock waves and rarefactions, depending on the monotonicity of the eigenvalue.

We now present some model equations and systems of hyperbolic conservation laws in which some of the features of the fluid equations described in this section can be observed. The simplest model is the scalar linear advection equation

$$u_t + a u_x = 0,$$

where $a$ is a constant. For any function $F : \mathbb{R} \to \mathbb{R}$, a solution of the equation is given by $u(x, t) = F(x - at)$, that is, the function $F(x)$ is transported at constant speed in time. The only type of discontinuity that can appear in solutions of this equation is the contact discontinuity. Among nonlinear models, one of the simplest is the inviscid Burgers equation

$$u_t + \left( \frac{u^2}{2} \right)_x = 0.$$

Although it resembles the linear advection equation when written in quasi-linear form, $u_t + u u_x = 0$, the type of solutions it admits is completely different, already exhibiting shock waves and rarefactions. Solutions of the Burgers equation cannot contain contact discontinuities.

The next step in our description of model equations is given by linear systems of conservation laws in one dimension. They have the form

$$u_t + A u_x = 0, \tag{5}$$

where $A$ is a square $m \times m$ matrix, $m$ being the number of equations. Since the system is hyperbolic, the matrix $A$ admits the decomposition $A = R \Lambda R^{-1}$, where $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_m)$, with $\lambda_p \in \mathbb{R}$, and $R = [r_1, \ldots, r_m]$, $r_p \in \mathbb{R}^m$. Changing to the characteristic variables, defined by $w = R^{-1} u$, the system can be written in the form $w_t + \Lambda w_x = 0$. This is a diagonal system, so each row is an advection equation, given by

$$\frac{\partial w_p}{\partial t} + \lambda_p \frac{\partial w_p}{\partial x} = 0. \tag{6}$$

If we consider initial data $u(x, 0) = u_0(x)$ for (5), then the solution of (6) is $w_p(x, t) = w_p^0(x - \lambda_p t)$, where $w_p^0$ is the $p$-th component of $w^0 = R^{-1} u_0$. The solution of (5) can therefore be written as $u = R \cdot w$, or in expanded form:

$$u(x, t) = \sum_{p=1}^{m} w_p(x - \lambda_p t, 0)\, r_p.$$
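The characteristic decomposition of the linear system (5) translates almost line by line into code. The following is a minimal sketch under stated assumptions: the function name `solve_linear_system`, the use of NumPy and the sample matrix (a first-order form of the wave equation) are illustrative choices, not part of the thesis.

```python
import numpy as np


def solve_linear_system(A, u0, x, t):
    """Evaluate the exact solution of u_t + A u_x = 0 at time t on the points x.

    A  : (m, m) matrix, assumed hyperbolic (real eigenvalues, full set of eigenvectors)
    u0 : callable returning the initial data as an array of shape (m, len(x))
    """
    lam, R = np.linalg.eig(A)            # A = R diag(lam) R^{-1}
    lam, R = np.real(lam), np.real(R)    # imaginary parts vanish for hyperbolic A
    R_inv = np.linalg.inv(R)
    u = np.zeros((A.shape[0], x.size))
    for p in range(A.shape[0]):
        # p-th characteristic variable of the initial data, shifted by lam_p * t
        w_p = R_inv[p] @ u0(x - lam[p] * t)
        # accumulate w_p(x - lam_p t, 0) r_p
        u += np.outer(R[:, p], w_p)
    return u


# Sample use: eigenvalues -1 and +1, so the initial pulse splits into two
# halves travelling in opposite directions.
A = np.array([[0.0, 1.0], [1.0, 0.0]])
pulse = lambda s: np.exp(-s**2)
u0 = lambda s: np.vstack((pulse(s), np.zeros_like(s)))
x = np.linspace(-3.0, 3.0, 61)
u = solve_linear_system(A, u0, x, t=0.7)
```

The result does not depend on the ordering, sign or normalization of the eigenvectors returned by `np.linalg.eig`, because each term combines a row of $R^{-1}$ with the matching column of $R$.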

The most important model of a nonlinear system of hyperbolic conservation laws is given by the Euler equations, which in one dimension can be written as

$$\begin{bmatrix} \rho \\ \rho v \\ E \end{bmatrix}_t + \begin{bmatrix} \rho v \\ \rho v^2 + p \\ v(E + p) \end{bmatrix}_x = 0, \tag{7}$$

where $\rho$ is the density, $v$ the velocity, $E$ the energy and $p$ the pressure. Chapter 2 gives a detailed description of these equations, as well as of the multi-component Euler equations. Here we simply mention that, of the three characteristic fields, two are genuinely nonlinear and one is linearly degenerate, and therefore shock waves, contact discontinuities and rarefaction waves can all be found in the solutions.
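The characteristic structure of (7) can be checked numerically. The sketch below assumes an ideal-gas closure $p = (\gamma - 1)(E - \tfrac{1}{2} \rho v^2)$ with $\gamma = 1.4$, which this summary does not state, and approximates the Jacobian by central finite differences rather than using a closed-form expression; the function names are hypothetical. For an ideal gas the three eigenvalues are the classical $v - c$, $v$, $v + c$, with sound speed $c = \sqrt{\gamma p / \rho}$.

```python
import numpy as np

GAMMA = 1.4  # assumed ideal-gas adiabatic exponent (not fixed in this summary)


def euler_flux(u):
    """Flux of (7) for u = (rho, rho*v, E), closed with an assumed ideal-gas law."""
    rho, mom, E = u
    v = mom / rho
    p = (GAMMA - 1.0) * (E - 0.5 * rho * v**2)
    return np.array([mom, mom * v + p, v * (E + p)])


def jacobian(flux, u, eps=1e-6):
    """Central finite-difference approximation of the Jacobian d(flux)/du."""
    m = u.size
    J = np.empty((m, m))
    for k in range(m):
        du = np.zeros(m)
        du[k] = eps
        J[:, k] = (flux(u + du) - flux(u - du)) / (2.0 * eps)
    return J


def characteristic_speeds(rho, v, p):
    """Sorted eigenvalues of the Euler Jacobian at the state (rho, v, p)."""
    E = p / (GAMMA - 1.0) + 0.5 * rho * v**2
    u = np.array([rho, rho * v, E])
    return np.sort(np.real(np.linalg.eigvals(jacobian(euler_flux, u))))


speeds = characteristic_speeds(rho=1.0, v=0.5, p=1.0)
c = np.sqrt(GAMMA * 1.0 / 1.0)
print(speeds, [0.5 - c, 0.5, 0.5 + c])
```

The middle eigenvalue $v$ belongs to the linearly degenerate field (contact discontinuities), while $v \pm c$ belong to the two genuinely nonlinear fields.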


Numerical methods for fluid dynamics

The equations of fluid dynamics are, in general, impossible to solve exactly, except in some very simple cases. Numerical methods aim precisely at the approximate computation of solutions, in the form of a set of discrete values, which is often sufficient in practice. In this chapter we describe the basic notions and results on numerical methods for fluid equations, focusing almost exclusively on the one-dimensional scalar case, with the aim of clarifying the ideas used to describe the particular numerical method employed in this work. All the concepts introduced in this chapter are explained in any textbook on computational fluid dynamics, such as the books by Toro [175], LeVeque [104, 105] and Hirsch [70, 71].

Consider the equation corresponding to the one-dimensional scalar case of (1),

$$u_t + f(u)_x = 0, \qquad (x, t) \in \mathbb{R} \times \mathbb{R}^+, \tag{8}$$

with initial conditions $u(x, 0) = u_0(x)$. Consider a grid obtained by discretizing an interval of the real line, which we take to be $I = [0, 1]$ for simplicity, given by the points $x_j = \left(j + \frac{1}{2}\right) \Delta x$, where $\Delta x = \frac{1}{N}$, with $N$ a positive integer, for $0 \leq j < N$. These points define subintervals $c_j = [x_{j-\frac{1}{2}}, x_{j+\frac{1}{2}}]$. The time variable is discretized on an interval $[0, T]$, $T > 0$, according to $t_n = n \Delta t$, $\Delta t = \frac{T}{M}$, with $M$ a positive integer. We denote by $\{U_j\}_{j=0}^{N-1}$ the pointwise approximation of the solution of (8) at the points $x_j$.

We consider numerical methods that are explicit in time, which obtain the numerical solution at a time $t_{n+1}$ from the already computed solutions at earlier times, and which can be written as

$$U^{n+1} = H(U^n, U^{n-1}, \ldots, U^{n-p+1}), \qquad p > 0. \tag{9}$$

In particular we consider one-step methods ($p = 1$), that is, $U^{n+1} = H(U^n)$.

The goal of numerical methods is to compute accurate approximations to the solution of (8). To measure the quality of these approximations we use discrete norms, defined on $\mathbb{R}^N$, where $N$ is the number of points of the discretization. We want the numerical method to be convergent: given a sequence of grids $\{G_k\}_{k=0}^{+\infty}$ with $\lim_{k \to +\infty} \Delta x_k = 0$, method (9) is convergent in the norm $\|\cdot\|$ if $\lim_{k \to +\infty} \|U_{G_k} - u_{G_k}\| = 0$, where $u_{G_k}$ are the point values of the solution of (8) on the grid $G_k$ and $U_{G_k}$ is the numerical solution computed on the same grid. To study the convergence of a method one commonly uses the concepts of consistency and stability, which together with the Lax theorem [102, 147, 167] can allow convergence to be proved. Assume that the ratio $\frac{\Delta t}{\Delta x}$ is bounded by a constant. A method is called consistent if the local truncation error, defined by $L^n_{\Delta t} = \frac{1}{\Delta t}\left( H(u^n) - u^{n+1} \right)$, tends to zero as $\Delta t$ tends to zero, assuming that $u$ is smooth. If $L^n_{\Delta t} = O(\Delta t^p)$ we say that the method has order of accuracy $p$. On the other hand, a method is stable if the error that results from applying the numerical method $n$ times to the initial data, given by $E^n = H^n(U^0) - u^n$, can be bounded by some quantity that tends to zero as $\Delta t$ and $\Delta x$ tend to zero. The Lax theorem lets us exploit these concepts, since it states that, for a consistent linear one-step method applied to a well-posed Cauchy problem, stability and convergence are equivalent. The analysis of stability and consistency is, in general, simpler than that of convergence. Consistency is typically proved by means of Taylor expansions, while stability can be analyzed using concepts such as total variation, monotonicity and contractivity of the method.

In this work we focus on conservative methods arising from a semi-discrete formulation. A method is said to be conservative if there exists a function $\hat{f} : \mathbb{R}^{p+q+1} \to \mathbb{R}$, called the numerical flux, such that

$$H(U^n)_j = U_j^n - \frac{\Delta t}{\Delta x} \left( \hat{f}(U^n_{j-p+1}, \ldots, U^n_{j+q}) - \hat{f}(U^n_{j-p}, \ldots, U^n_{j+q-1}) \right), \tag{10}$$

for certain positive integers $p$ and $q$. We use the notation $\hat{f}_{j+\frac{1}{2}} = \hat{f}(U^n_{j-p+1}, \ldots, U^n_{j+q})$. The function $\hat{f}$ is said to be consistent if $\hat{f}(U, \ldots, U) = f(U)$. The advantage of conservative methods comes from the fact that, when they converge, they necessarily converge to a weak solution of the problem, according to the Lax-Wendroff theorem [103].

One way to obtain a conservative method is to first discretize the term $f(u)_x$, through a formulation

$$f(u)_x \approx \frac{\hat{f}_{j+\frac{1}{2}} - \hat{f}_{j-\frac{1}{2}}}{\Delta x},$$

leaving the term $u_t$ undiscretized, so that a system of ordinary differential equations is obtained:

$$\frac{dU_j(t)}{dt} + \frac{\hat{f}_{j+\frac{1}{2}} - \hat{f}_{j-\frac{1}{2}}}{\Delta x} = 0, \qquad \forall j.$$
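The semi-discrete form above can be sketched in a few lines. The example below is only illustrative: it uses a first-order upwind flux for linear advection, $\hat{f}_{j+\frac{1}{2}} = a U_j$ with $a > 0$, as a placeholder numerical flux (not the high order flux developed in the thesis), and the function names and the treatment of the left boundary are assumptions.

```python
import numpy as np


def upwind_interface_fluxes(U, a=1.0):
    """Interface fluxes fhat_{-1/2}, ..., fhat_{N-1/2} for f(u) = a*u with a > 0.

    Placeholder first-order upwind flux fhat_{j+1/2} = a*U_j; the value U_{-1}
    needed at the left boundary is copied from U_0 (an assumption).
    """
    return a * np.concatenate(([U[0]], U))


def divergence(U, dx, interface_fluxes=upwind_interface_fluxes):
    """D(U)_j = (fhat_{j+1/2} - fhat_{j-1/2}) / dx, so that dU_j/dt = -D(U)_j."""
    f_hat = interface_fluxes(U)
    return (f_hat[1:] - f_hat[:-1]) / dx


N = 50
dx = 1.0 / N
x = (np.arange(N) + 0.5) * dx
U = np.exp(-100.0 * (x - 0.5) ** 2)
D = divergence(U, dx)

# Conservation: interior fluxes cancel in pairs, so the sum of dx * D(U)_j
# reduces to the difference between the two boundary fluxes.
print(dx * D.sum(), U[-1] - U[0])
```

Because each interface flux is shared by its two neighbouring cells, the discrete total of $U$ changes only through the boundary fluxes, which is the property that gives conservative methods their name.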

This system is solved by means of a suitable solver for ordinary differential equations. In this work we use a third order Runge-Kutta algorithm [160] that has the total variation diminishing (TVD) property and has the form

$$\begin{aligned} U^{(1)} &= U^n - \Delta t\, D(U^n), \\ U^{(2)} &= \frac{3}{4} U^n + \frac{1}{4} U^{(1)} - \frac{1}{4} \Delta t\, D(U^{(1)}), \\ U^{n+1} &= \frac{1}{3} U^n + \frac{2}{3} U^{(2)} - \frac{2}{3} \Delta t\, D(U^{(2)}), \end{aligned} \tag{11}$$

where $D(U)$ is given by $D(U)_j = \frac{\hat{f}_{j+\frac{1}{2}} - \hat{f}_{j-\frac{1}{2}}}{\Delta x}$.

High order numerical fluxes can be computed following the methodology of flux reconstruction with limiters. The idea is to solve, at each point $x_{j+\frac{1}{2}}$, a problem given by constant data on each side of the interface (a Riemann problem), as is done in Godunov's method [57]. High order is achieved by computing these states as high order interpolations of the data, and by introducing a limiter so that stability can be ensured. Examples of methods of this type are MUSCL [181], PPM [39], PHM [117], ENO [64] and WENO [113, 81], to name a few.

To conclude this section, we mention that, since the method is applied on a grid of finite size, and the computation of the numerical flux at nodes near the boundaries involves values that do not belong to the grid, it becomes necessary to specify the behavior of the solutions at the boundaries by imposing artificial boundary conditions. For this purpose we consider auxiliary nodes, with indices $\{-1, \ldots, -p\}$ (on the left) and $\{N, \ldots, N + q - 1\}$ (on the right), at which we specify the value of the solution. Some common boundary conditions are those of inflow and outflow type, where the fluid enters or leaves through the boundary; those of absorbing type, where the fluid is absorbed by the boundary; and those of reflecting type, where the fluid bounces back when it reaches the boundary. In the first case, one way of imposing the boundary conditions consists in setting the boundary values according to

$$U^n_{-k} = U^n_0, \quad 1 \leq k \leq p, \qquad U^n_{N+k-1} = U^n_{N-1}, \quad 1 \leq k \leq q.$$

In the second case, the aim is essentially to set the numerical divergence at the boundary to zero and, for conservative methods, this can be achieved by means of

$$U^n_{-k} = U^n_{k-1}, \quad 1 \leq k \leq p, \qquad U^n_{N+k-1} = U^n_{N-k}, \quad 1 \leq k \leq q. \tag{12}$$


In the third case, the formulas used depend on the problem. Typically, equations (12) are applied to all variables except the velocity, whose sign is changed to indicate that the fluid changes direction when it reaches the boundary. For example, for the one-dimensional Euler equations (7), we would set

$$\rho_{-k} = \rho_{k-1}, \qquad v_{-k} = -v_{k-1}, \qquad E_{-k} = E_{k-1}, \qquad 1 \leq k \leq p,$$

$$\rho_{N+k-1} = \rho_{N-k}, \qquad v_{N+k-1} = -v_{N-k}, \qquad E_{N+k-1} = E_{N-k}, \qquad 1 \leq k \leq q.$$
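The Runge-Kutta scheme (11) and the auxiliary-node formulas can be combined in a short sketch. The code below is an illustration under stated assumptions, not the thesis implementation: the function names are hypothetical, the spatial operator is a first-order upwind discretization of linear advection with $a > 0$ (so $p = 1$, $q = 0$), and only the first two boundary types are covered; the reflecting case would additionally flip the sign of the velocity component.

```python
import numpy as np


def fill_ghosts(U, p, q, kind="outflow"):
    """Return U padded with p auxiliary values on the left and q on the right.

    "outflow"   : U_{-k} = U_0,     U_{N+k-1} = U_{N-1}  (constant extension)
    "absorbing" : U_{-k} = U_{k-1}, U_{N+k-1} = U_{N-k}  (equation (12))
    """
    N = U.size
    if kind == "outflow":
        left, right = np.full(p, U[0]), np.full(q, U[-1])
    elif kind == "absorbing":
        left, right = U[:p][::-1], U[N - q:][::-1]
    else:
        raise ValueError(f"unknown boundary type: {kind}")
    return np.concatenate((left, U, right))


def advection_rhs(U, a, dx):
    """D(U) for f(u) = a*u, a > 0, with first-order upwind fluxes fhat_{j+1/2} = a*U_j."""
    f_hat = a * fill_ghosts(U, 1, 0, "outflow")   # fhat_{-1/2}, ..., fhat_{N-1/2}
    return (f_hat[1:] - f_hat[:-1]) / dx


def tvd_rk3_step(U, D, dt):
    """One step of the third order TVD Runge-Kutta scheme (11)."""
    U1 = U - dt * D(U)
    U2 = 0.75 * U + 0.25 * U1 - 0.25 * dt * D(U1)
    return U / 3.0 + 2.0 * U2 / 3.0 - 2.0 * dt * D(U2) / 3.0


# Advect a step profile with Courant number 0.5.
N = 100
dx = 1.0 / N
x = (np.arange(N) + 0.5) * dx
U = np.where(x < 0.3, 1.0, 0.0)
for _ in range(50):
    U = tvd_rk3_step(U, lambda V: advection_rhs(V, 1.0, dx), 0.5 * dx)
```

Each stage of (11) is a convex combination of forward Euler steps, which is why the scheme inherits the total variation bound of the forward Euler method under the same time step restriction.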

Shu-Osher finite-difference formulation with Donat-Marquina flux splitting

In this chapter we briefly describe the numerical method used to integrate the equations on each grid. Its building blocks are the finite-difference formulation of Shu and Osher [160, 161], the flux splitting of Donat and Marquina [44], the fifth-order WENO reconstruction [81] and a third-order TVD Runge-Kutta integrator [161]. The Shu-Osher finite-difference formulation is an alternative to methods based on finite-volume discretizations that simplifies the implementation of the numerical method, especially for problems in more than one space dimension. A comparative description of both options can be found in [159]. The original description of the method, given for scalar equations, proposes the extension to systems by applying the method to each local characteristic field, obtained, for example, from a linearization such as that of Roe's method. An alternative was proposed in [44], where two Jacobian matrices are computed at each cell interface. This is a more natural extension of the Shu-Osher methodology to systems, and it has been shown to give better results in some pathological cases. Moreover, it can be used together with any reconstruction. In this work we use the WENO reconstruction as described by Jiang and Shu [81]. The algorithm that results from combining all these techniques was used in [118] on a fixed grid for a multicomponent fluid problem. We next describe the Shu-Osher formulation for a one-dimensional scalar equation
\[ u_t + f(u)_x = 0. \tag{13} \]

The extension to more than one dimension is immediate, and the extension to systems is carried out later. Let $h(x)$ be a function (depending on the grid size $\Delta x$) that satisfies
\[ f(u(x)) = \frac{1}{\Delta x} \int_{x-\frac{\Delta x}{2}}^{x+\frac{\Delta x}{2}} h(\xi)\, d\xi. \tag{14} \]
Then
\[ f(u(x))_x = \frac{h\left(x+\frac{\Delta x}{2}\right) - h\left(x-\frac{\Delta x}{2}\right)}{\Delta x}, \]
so that equation (13) is equivalent to
\[ u_t + \frac{h\left(x+\frac{\Delta x}{2}\right) - h\left(x-\frac{\Delta x}{2}\right)}{\Delta x} = 0. \tag{15} \]

To obtain a conservative method we need to approximate the derivative $f(u(x))_x$ by an expression of the form (see (10))
\[ \frac{1}{\Delta x}\left(\hat f^n_{j+\frac12} - \hat f^n_{j-\frac12}\right). \]
Equation (15) suggests that the numerical flux $\hat f^n_{j+\frac12}$ should approximate the value of $h(x_{j+\frac12})$. In other words, if we can compute high-order approximations of $h(x_{j+\frac12})$, they can be used as high-order numerical fluxes for a conservative scheme. Note that, by (14), the known values $f(u(x))$ at the grid nodes are the cell averages of $h$, so $h(x_{j+\frac12})$ can be approximated from them by means of any reconstruction $R$ of point values from cell averages. In this work we use the WENO reconstruction of Jiang and Shu [81]. An essential point when computing the reconstruction is upwinding: the numerical method must take into account the direction in which the solution moves, which is given by the signs of the eigenvalues of the Jacobian matrix. For scalar equations this is simply the sign of $f'(u)$. We use the Roe indicator
\[ \kappa_{j+\frac12} = \frac{f(U_{j+1}) - f(U_j)}{U_{j+1} - U_j} \tag{16} \]
to determine this sign, and we compute suitably biased reconstructions: if $\kappa_{j+\frac12} > 0$ we compute the left-biased reconstruction $\hat f_{j+\frac12} = R(f(U_{j-s_1}), \dots, f(U_{j+s_2}), x_{j+\frac12})$, and if $\kappa_{j+\frac12} \le 0$ we compute $\hat f_{j+\frac12} = R(f(U_{j-s_1+1}), \dots, f(U_{j+s_2+1}), x_{j+\frac12})$.


It is well known that numerical fluxes computed in this way admit non-entropic solutions at sonic points, so the computation has to be modified at these points, which are those where $f'(u)$ changes sign. Shu and Osher use the local Lax-Friedrichs (LLF) algorithm at sonic points. This algorithm is obtained from a flux splitting $f(U_j) = f^+(U_j) + f^-(U_j)$, with $f^\pm(u) = \frac12 (f(u) \pm \beta u)$, where $\beta$ is a local value given by $\beta_{j+\frac12} = \max_{u \in [U_j, U_{j+1}]} |f'(u)|$, and by computing the reconstruction $\hat f_{j+\frac12}$ as the sum $\hat f_{j+\frac12} = \hat f^+_{j+\frac12} + \hat f^-_{j+\frac12}$ of two biased reconstructions computed from $f^+(u)$ and $f^-(u)$, respectively:
\[
\begin{aligned}
\hat f^+_{j+\frac12} &= R(f^+(U_{j-s_1}), \dots, f^+(U_{j+s_2})),\\
\hat f^-_{j+\frac12} &= R(f^-(U_{j-s_1+1}), \dots, f^-(U_{j+s_2+1})).
\end{aligned}
\]
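The effect of the LLF splitting can be checked on a small example. The sketch below assumes Burgers' flux $f(u) = u^2/2$ and an interval $[U_j, U_{j+1}] = [-1, 2]$ that contains a sonic point; the helper name `llf_split` is hypothetical. It illustrates that $f^+$ is non-decreasing and $f^-$ is non-increasing on the interval, which is what allows each part to be reconstructed with a fixed bias.

```python
import numpy as np

def llf_split(f, u, beta):
    """Local Lax-Friedrichs splitting f = f_plus + f_minus with f_pm = (f +- beta*u)/2."""
    return 0.5 * (f(u) + beta * u), 0.5 * (f(u) - beta * u)

burgers = lambda u: 0.5 * u**2          # assumed model flux, f'(u) = u

# States on both sides of an interface containing the sonic point u = 0.
U_left, U_right = -1.0, 2.0
beta = max(abs(U_left), abs(U_right))   # max |f'(u)| on [U_left, U_right] for Burgers

u = np.linspace(U_left, U_right, 31)
f_plus, f_minus = llf_split(burgers, u, beta)
```

Since $(f^\pm)'(u) = \frac12 (f'(u) \pm \beta)$ and $\beta \ge |f'(u)|$ on the interval, the sign of each derivative is fixed regardless of the sign of $f'(u)$.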

The final algorithm for computing the numerical fluxes reads as follows:

Algorithm 1.
    Define $\beta_{j+\frac12} = \max_{u \in [U_j, U_{j+1}]} |f'(u)|$
    Define $\kappa_{j+\frac12}$ by (16)
    if $\kappa_{j-\frac12} \cdot \kappa_{j+\frac12} > 0$
        if $\kappa_{j+\frac12} > 0$
            $\hat f_{j+\frac12} = R(f_{j-s_1}, \dots, f_{j+s_2}, x_{j+\frac12})$
        else
            $\hat f_{j+\frac12} = R(f_{j-s_1+1}, \dots, f_{j+s_2+1}, x_{j+\frac12})$
        end if
    else
        $\hat f^+_{j+\frac12} = R(\frac12 (f_{j-s_1} + \beta_{j+\frac12} U_{j-s_1}), \dots, \frac12 (f_{j+s_2} + \beta_{j+\frac12} U_{j+s_2}), x_{j+\frac12})$
        $\hat f^-_{j+\frac12} = R(\frac12 (f_{j-s_1+1} - \beta_{j+\frac12} U_{j-s_1+1}), \dots, \frac12 (f_{j+s_2+1} - \beta_{j+\frac12} U_{j+s_2+1}), x_{j+\frac12})$
        $\hat f_{j+\frac12} = \hat f^+_{j+\frac12} + \hat f^-_{j+\frac12}$
    end if

We extend the above formulation to systems by means of characteristic variables and fluxes, following the flux-splitting formula of Donat and Marquina [44], which is based on a double linearization of the Jacobian matrix at each interface. First, two biased approximations of the conserved variables $u$ are computed at each interface. We denote them by $U^L_{j+\frac12}$ and $U^R_{j+\frac12}$, and they are computed using biased stencils that contain the points $x_j$ and $x_{j+1}$, respectively. Note that what is computed are approximations to the point value of $u$ at $x_{j+\frac12}$ from point values of $u$ at the grid nodes. For this purpose we have used a version of the WENO algorithm of Jiang and Shu [81], modified so that point values are reconstructed from point values rather than from cell averages as in the original method. Second, the linearizations corresponding to the two Jacobians $f'(U^L_{j+\frac12})$ and $f'(U^R_{j+\frac12})$ are computed. We denote by $l^p(U^L_{j+\frac12})$ and $r^p(U^L_{j+\frac12})$ the left and right eigenvectors, respectively, of $f'(U^L_{j+\frac12})$, and by $l^p(U^R_{j+\frac12})$ and $r^p(U^R_{j+\frac12})$ the same objects corresponding to $f'(U^R_{j+\frac12})$. Next, two sets of characteristic variables and fluxes are computed at the points of a certain stencil, according to the changes of variable given by the two matrices of left eigenvectors:
\[
\begin{aligned}
W^L_{p,k} &= l^p(U^L_{j+\frac12}) \cdot U_k, & f^L_{p,k} &= l^p(U^L_{j+\frac12}) \cdot f(U_k), & &\text{for } j - s_1 \le k \le j + s_2,\\
W^R_{p,k} &= l^p(U^R_{j+\frac12}) \cdot U_k, & f^R_{p,k} &= l^p(U^R_{j+\frac12}) \cdot f(U_k), & &\text{for } j - s_1 + 1 \le k \le j + s_2 + 1.
\end{aligned}
\tag{17}
\]
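The change of variables in (17) can be sketched in Python for the one-dimensional Euler equations. This is only an illustration under stated assumptions: an ideal gas with $\gamma = 1.4$, conserved variables $U = (\rho, \rho u, E)$, eigenvectors obtained numerically with `numpy.linalg.eig` instead of closed-form expressions, and hypothetical function names. The rows of the inverse of the right-eigenvector matrix play the role of the left eigenvectors $l^p$.

```python
import numpy as np

GAMMA = 1.4  # assumed ideal-gas constant

def euler_flux(U, gamma=GAMMA):
    rho, m, E = U
    u = m / rho
    p = (gamma - 1) * (E - 0.5 * rho * u**2)
    return np.array([m, m * u + p, u * (E + p)])

def euler_jacobian(U, gamma=GAMMA):
    rho, m, E = U
    u = m / rho
    p = (gamma - 1) * (E - 0.5 * rho * u**2)
    H = (E + p) / rho                      # total specific enthalpy
    return np.array([
        [0.0, 1.0, 0.0],
        [0.5 * (gamma - 3) * u**2, (3 - gamma) * u, gamma - 1],
        [u * (0.5 * (gamma - 1) * u**2 - H), H - (gamma - 1) * u**2, gamma * u],
    ])

def eigen_system(U):
    """Eigenvalues (sorted), left eigenvectors (rows of L) and right eigenvectors (columns of R)."""
    lam, R = np.linalg.eig(euler_jacobian(U))
    order = np.argsort(lam.real)
    lam, R = lam.real[order], R.real[:, order]
    return lam, np.linalg.inv(R), R

def characteristic_data(U_interface, U_stencil):
    """Characteristic variables W_{p,k} and fluxes f_{p,k} of (17) for one linearization state."""
    _, L, R = eigen_system(U_interface)
    W = np.array([L @ Uk for Uk in U_stencil])
    F = np.array([L @ euler_flux(Uk) for Uk in U_stencil])
    return W, F, R
```

Calling `characteristic_data` once with $U^L_{j+\frac12}$ and once with $U^R_{j+\frac12}$ gives the two sets of characteristic data; multiplying by `R` maps characteristic quantities back to conserved ones.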

The numerical fluxes are now computed in a way similar to the Shu-Osher algorithm described for the scalar case, but using the characteristic variables corresponding to the left or the right linearization as appropriate. Which set of variables is used is determined by the sign of the eigenvalues of the two linearizations. The final algorithm for computing the numerical fluxes reads as follows:

Algorithm 2.
    if $\lambda^p(u)$ does not change sign in $[U_j, U_{j+1}]$
        if $\lambda^p(U_j) > 0$
            $\psi^L_{p,j} = R(f^L_{p,j-s_1}, \dots, f^L_{p,j+s_2}, x_{j+\frac12})$
            $\psi^R_{p,j} = 0$
        else
            $\psi^L_{p,j} = 0$
            $\psi^R_{p,j} = R(f^R_{p,j-s_1+1}, \dots, f^R_{p,j+s_2+1}, x_{j+\frac12})$
        end if
    else
        $\psi^L_{p,j} = R(\frac12 (f^L_{p,j-s_1} + \beta^p_{j+\frac12} W^L_{p,j-s_1}), \dots, \frac12 (f^L_{p,j+s_2} + \beta^p_{j+\frac12} W^L_{p,j+s_2}), x_{j+\frac12})$
        $\psi^R_{p,j} = R(\frac12 (f^R_{p,j-s_1+1} - \beta^p_{j+\frac12} W^R_{p,j-s_1+1}), \dots, \frac12 (f^R_{p,j+s_2+1} - \beta^p_{j+\frac12} W^R_{p,j+s_2+1}), x_{j+\frac12})$
    end if
    $\hat f_{j+\frac12} = \sum_p \left( \psi^L_{p,j}\, r^p(U^L_{j+\frac12}) + \psi^R_{p,j}\, r^p(U^R_{j+\frac12}) \right)$,

where $\beta^p_{j+\frac12} = \max_u |\lambda^p(u)|$, with $u$ ranging over a curve in phase space that connects $U_j$ and $U_{j+1}$. For the model equations considered in this work, the characteristic fields are either genuinely nonlinear or linearly degenerate, so the eigenvalues are, respectively, monotone or constant as $u$ varies. Sign changes of $\lambda^p(u)$ can therefore be detected simply by checking the sign of $\lambda^p(U_j) \cdot \lambda^p(U_{j+1})$, and $\beta^p_{j+\frac12}$ can be computed simply as $\beta^p_{j+\frac12} = \max\{|\lambda^p(U_j)|, |\lambda^p(U_{j+1})|\}$. Finally, we note that for scalar equations Algorithms 1 and 2 are equivalent. Once the numerical fluxes have been computed, we solve the equation
\[ u_t + \frac{\hat f_{j+\frac12} - \hat f_{j-\frac12}}{\Delta x} = 0 \]
that results from the semi-discrete formulation, by means of the third-order Runge-Kutta method given by equations (11). Finally, we note that the extension of the above algorithms to more than one space dimension can be carried out dimension by dimension in a simple way [161, 118].
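For the scalar case, the whole scheme (Algorithm 1 plus the Runge-Kutta integration) fits in a short Python sketch. Several choices below are assumptions made for illustration and are not taken from the thesis: periodic boundary conditions, the classical Jiang-Shu fifth-order weights with $\varepsilon = 10^{-6}$, $\beta_{j+\frac12}$ evaluated as the maximum of $|f'|$ at the two end states (valid for convex fluxes), a fallback to $f'(U_j)$ when $U_{j+1} = U_j$ in (16), and the Shu-Osher form of the third-order TVD Runge-Kutta method as the content of equations (11). All function names are hypothetical.

```python
import numpy as np

def weno5(v):
    """Jiang-Shu WENO5 value at the right edge of the middle cell from five cell averages."""
    a, b, c, d, e = v
    p0 = (2*a - 7*b + 11*c) / 6
    p1 = (-b + 5*c + 2*d) / 6
    p2 = (2*c + 5*d - e) / 6
    b0 = 13/12*(a - 2*b + c)**2 + 0.25*(a - 4*b + 3*c)**2
    b1 = 13/12*(b - 2*c + d)**2 + 0.25*(b - d)**2
    b2 = 13/12*(c - 2*d + e)**2 + 0.25*(3*c - 4*d + e)**2
    w0, w1, w2 = 0.1/(1e-6 + b0)**2, 0.6/(1e-6 + b1)**2, 0.3/(1e-6 + b2)**2
    return (w0*p0 + w1*p1 + w2*p2) / (w0 + w1 + w2)

def numerical_flux(U, f, df):
    """Algorithm 1 on a periodic grid; entry j approximates h(x_{j+1/2})."""
    at = lambda a, k: np.roll(a, -k)              # at(a, k)[j] = a[j+k]
    F, Ur = f(U), at(U, 1)
    dU = Ur - U
    kappa = np.where(dU != 0, (at(F, 1) - F) / np.where(dU != 0, dU, 1.0), df(U))
    beta = np.maximum(np.abs(df(U)), np.abs(df(Ur)))
    left, right = (-2, -1, 0, 1, 2), (3, 2, 1, 0, -1)   # right stencil is mirrored
    up_l = weno5([at(F, k) for k in left])
    up_r = weno5([at(F, k) for k in right])
    llf = (weno5([0.5*(at(F, k) + beta*at(U, k)) for k in left])
           + weno5([0.5*(at(F, k) - beta*at(U, k)) for k in right]))
    same_sign = np.roll(kappa, 1) * kappa > 0     # kappa_{j-1/2} * kappa_{j+1/2} > 0
    return np.where(same_sign, np.where(kappa > 0, up_l, up_r), llf)

def rk3_step(U, dt, L):
    """Third-order TVD Runge-Kutta step (Shu-Osher form) for U' = L(U)."""
    U1 = U + dt * L(U)
    U2 = 0.75 * U + 0.25 * (U1 + dt * L(U1))
    return U / 3 + 2 / 3 * (U2 + dt * L(U2))

def solve(U0, f, df, dx, T, cfl=0.5):
    """Integrate u_t + f(u)_x = 0 up to time T on a periodic grid."""
    L = lambda V: -(numerical_flux(V, f, df) - np.roll(numerical_flux(V, f, df), 1)) / dx
    U, t = U0.copy(), 0.0
    while t < T - 1e-14:
        dt = min(cfl * dx / np.max(np.abs(df(U))), T - t)
        U, t = rk3_step(U, dt, L), t + dt
    return U
```

A typical call is `solve(np.sin(2*np.pi*x), lambda u: 0.5*u**2, lambda u: u, dx, 0.3)` with `x` the cell centres of a uniform grid on $[0, 1]$. Because the update is written in terms of flux differences, the discrete sum of the solution is preserved up to rounding errors.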

Adaptive mesh refinement

The adaptive mesh refinement (AMR) used in this work is a general-purpose methodology for the efficient numerical integration of hyperbolic conservation laws. The algorithm was first described by Berger [21] and Berger and Oliger [24] for methods based on artificial viscosity, and later by Berger and Colella [22] for finite-volume methods. A simplified version was described by Quirk [139]. AMR tries to match the resolution of the computational grids to the needs of the numerical solution, using coarser grids where the solution is smooth and refining only where it exhibits non-smooth structures. Under favorable conditions the resulting algorithm requires only a fraction of the computing time needed to solve the problem on a uniform grid. This efficiency comes from both spatial and temporal refinement, and no special restrictions are imposed on the numerical method to be used, so the algorithm remains fairly generic. These goals are achieved by taking into account that solutions of hyperbolic conservation laws are typically composed of waves that move through regions where the solution is smooth. High-resolution shock-capturing methods aim to capture and resolve these waves, since high resolution is not needed in smooth regions. The main difficulty of AMR is therefore to identify the non-smooth regions, which are refined up to a suitable resolution level, and to ensure that the refinement follows the motion of the waves in time. We now explain the algorithm in more detail for the one-dimensional case. Given a bounded set $\Omega_1 \subset \mathbb{R}^d$, and owing to the hyperbolicity of the problem under consideration, its solution $u$ on $\Omega_1 \times [t_0, t_0 + \Delta t]$ depends only on the values of $u$ on a superset of $\Omega_1 \times \{t_0\}$, given by the domain of dependence of the equations, which is another bounded set.
Numerical methods mimic this feature and compute the solution on $\Omega_1 \times [t_0, t_0 + \Delta t]$ using data from another bounded set, determined by the numerical domain of dependence of the method. Let $\tilde\Omega_1 \times \{t_0\}$ be the numerical domain of dependence of the method considered, and assume that the CFL condition holds. Then, given an approximation of $u$ on a grid defined on $\tilde\Omega_1 \times \{t_0\}$, approximations to $u(x, t)$ for $(x, t) \in \Omega_1 \times [t_0, t_0 + \Delta t]$ can be computed with the numerical method. A new application of the same idea, in order to compute approximations of $u$ on a grid corresponding to $\Omega_1 \times [t_0 + \Delta t, t_0 + 2\Delta t]$, would require the approximations of $u$ on $\tilde\Omega_1 \times \{t_0 + \Delta t\}$, but approximations are only available on its subset $\Omega_1 \times \{t_0 + \Delta t\}$. The approximations on the band $\hat\Omega_1 := \tilde\Omega_1 \setminus \Omega_1$ have to be obtained by other means. If $u$ is smooth in $\hat\Omega_1$, these values can be accurately interpolated from the approximations of $u$ on a coarser grid defined on a domain $\tilde{\tilde\Omega}_1 \supseteq \tilde\Omega_1$.

AMR can be described as the recursive application of this idea to a hierarchy of grids of different resolutions, so that each grid contains the features of the solution that cannot be predicted from the information contained in coarser grids. The hierarchies considered in this work start from a grid $G^t_0$ defined on the whole domain $\Omega$ (we assume $\Omega = [0, 1]$ for simplicity) where we want to solve the equations. This grid is composed of $N_0$ cells of size $\Delta x_0 = \frac{1}{N_0}$. From it, a set of $L$ grids of increasing resolution can be built by subdividing the cells of the immediately coarser grid into several parts (we assume that each cell is divided into two parts, for simplicity). That is, the interval $[0, 1]$ is divided into $N_0, \dots, N_{L-1}$ subintervals (cells) of length $\Delta x_\ell = 1/N_\ell$, where $N_\ell = 2^\ell N_0$, $\ell = 0, \dots, L-1$. We denote the cell centres by $x^\ell_j = (j + \frac12)\Delta x_\ell$, $j = 0, \dots, N_\ell - 1$, $\ell = 0, \dots, L-1$, and the union of the cells indexed by elements of $G_\ell$ by $\Omega_\ell(G_\ell)$. Within our hierarchy, a grid corresponding to a given level $\ell$ is a subset of the $N_\ell$ cells of that level, and it can also be regarded as a subset of $\{0, \dots, N_\ell - 1\}$. Since the solution varies in time, so do the grids, and we denote by $G^{t_\ell}_\ell$ the grid corresponding to resolution level $\ell$ at time $t_\ell$. On each of these grids we consider a discrete function $u^{t_\ell}_{\ell,j} \approx u(x^\ell_j, t_\ell)$, where $j \in G^{t_\ell}_\ell$. The algorithm evolves these grids and their associated numerical solutions, starting with $t_\ell = 0$, $\ell = 0, \dots, L-1$, and ending with $t_\ell = T$, $\ell = 0, \dots, L-1$, where $T$ is the final time up to which we want to solve the equations. At intermediate times we require $t_\ell \ge t_{\ell'}$ if $\ell \le \ell'$.
We additionally enforce the following condition: for $\ell > 0$, $G_\ell = \{2i, 2i+1, \text{ for some } i \in G_{\ell-1}\}$, which in particular implies that the grids are nested. The fundamental building blocks of AMR are: integration, which consists of applying the algorithm described in the previous section to each cell of each grid; adaptation, by which a suitable refinement is applied to each part of the domain at all times; and projection, which enforces conservation between grids where they overlap. Regarding integration, a time step $\Delta t_0$ is first chosen so that the CFL condition
\[ \Delta t_0 \le \frac{\Delta x_0}{\max_u |f'(u)|} \]
holds on the grid $G^t_0$. The time steps for the remaining grids are taken as
\[ \Delta t_\ell = \frac{\Delta t_{\ell-1}}{2}, \qquad \ell = 1, \dots, L-1, \]
which implies that the CFL condition holds on every grid. One time step of a grid therefore corresponds to two steps of the immediately finer grid, so each grid can be integrated from time $t$ (initially the same for all grids) to time $t + \Delta t_0$ by performing $2^\ell$ iterations. All these iterations are performed sequentially according to the following criteria:
1. Each grid is integrated immediately after the grid of the next coarser resolution level.
2. If a grid $G^{t_\ell}_\ell$ ($\ell > 0$) has been integrated up to a certain time, it is not integrated again until all grids finer than it have been integrated up to the same time.
Once all grids have been integrated up to time $t + \Delta t_0$, the process is repeated for the next time step of the coarsest grid. Finally, note that in order to integrate a grid $G_\ell$ ($\ell > 0$) from time $t$ to time $t + \Delta t_\ell$, data from the band $\tilde\Omega_\ell(G_\ell) \times \{t\}$ must be supplied. These data are obtained by spatial interpolation from $\tilde\Omega_{\ell-1}(G_{\ell-1})$. On the other hand, to integrate the same grid from time $t + \Delta t_\ell$ to $t + 2\Delta t_\ell$, data on $\tilde\Omega_\ell(G_\ell) \times \{t + \Delta t_\ell\}$ are required, and in this case they are obtained from $G^t_{\ell-1}$ and $G^{t+2\Delta t_\ell}_{\ell-1}$, which have been integrated previously. Since the grid $G^t_\ell$ is integrated with a Runge-Kutta algorithm, defined by (11), data for this band must be supplied at each intermediate stage. We do this by observing that the data $U^{(1)}$ and $U^{(2)}$ that appear can be interpreted as approximations of the solution at times $t + \Delta t_\ell$ and $t + \frac{\Delta t_\ell}{2}$, respectively. Once $(u^{t_\ell + 2\Delta t_\ell}_\ell, G^{t_\ell}_\ell)$ has been computed, there exist data computed at different resolutions corresponding to the same set $\Omega_\ell(G^{t_\ell}_\ell)$.
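To make these data consistent, we project the finer grids onto the coarser ones, modifying the values $u^{t_\ell + 2\Delta t_\ell}_{\ell-1,i}$ that correspond to cells of $G^{t_\ell}_{\ell-1}$ whose indices $i$ satisfy $\{2i, 2(i-1), 2(i+1)\} \cap G^{t_\ell}_\ell \ne \emptyset$. To obtain the correction to be applied on the coarse grid, note that the relation
\[ u^{t_\ell + 2\Delta t_\ell}_{\ell,j} = u^{t_\ell}_{\ell,j} - \frac{\Delta t_\ell}{\Delta x_\ell} \left( \left( \hat f^{t_\ell}_{\ell,j+\frac12} + \hat f^{t_\ell + \Delta t_\ell}_{\ell,j+\frac12} \right) - \left( \hat f^{t_\ell}_{\ell,j-\frac12} + \hat f^{t_\ell + \Delta t_\ell}_{\ell,j-\frac12} \right) \right) \tag{18} \]
holds, which implies, taking $j = 2i, 2i+1$,
\[ \frac{u^{t_\ell + 2\Delta t_\ell}_{\ell,2i} + u^{t_\ell + 2\Delta t_\ell}_{\ell,2i+1}}{2} = \frac{u^{t_\ell}_{\ell,2i} + u^{t_\ell}_{\ell,2i+1}}{2} - \frac{\Delta t_{\ell-1}}{\Delta x_{\ell-1}} \left( \hat{\hat f}^{t_\ell}_{\ell-1,i+\frac12} - \hat{\hat f}^{t_\ell}_{\ell-1,i-\frac12} \right), \tag{19} \]
where we have defined
\[
\begin{aligned}
\hat{\hat f}^{t_\ell}_{\ell-1,i+\frac12} &= \frac{\hat f^{t_\ell}_{\ell,2i+\frac32} + \hat f^{t_\ell + \Delta t_\ell}_{\ell,2i+\frac32}}{2},\\
\hat{\hat f}^{t_\ell}_{\ell-1,i-\frac12} &= \frac{\hat f^{t_\ell}_{\ell,2i-\frac12} + \hat f^{t_\ell + \Delta t_\ell}_{\ell,2i-\frac12}}{2}.
\end{aligned}
\tag{20}
\]
Therefore, if we redefine the numerical flux as
\[ \hat f^{t_\ell}_{\ell-1,i\pm\frac12} = \hat{\hat f}^{t_\ell}_{\ell-1,i\pm\frac12}, \tag{21} \]
and assume that at time $t_\ell$ the relation
\[ u^{t_\ell}_{\ell-1,i} = \frac{u^{t_\ell}_{\ell,2i} + u^{t_\ell}_{\ell,2i+1}}{2} \tag{22} \]
holds, then, with the correction (21), the same relation holds at time $t_\ell + 2\Delta t_\ell = t_{\ell-1} + \Delta t_{\ell-1}$:
\[ u^{t_\ell + 2\Delta t_\ell}_{\ell-1,i} = \frac{u^{t_\ell + 2\Delta t_\ell}_{\ell,2i} + u^{t_\ell + 2\Delta t_\ell}_{\ell,2i+1}}{2}. \tag{23} \]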
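The argument leading from (18) to (23) can be checked numerically with a few lines of Python. The sketch assumes a fully refined one-dimensional grid with an even number of fine cells and uses arbitrary (random) fine fluxes, since the identity is purely algebraic; all names are hypothetical. Fine interface $k$ is located at $k\,\Delta x_\ell$, so the interfaces shared with the coarse grid are those with even index.

```python
import numpy as np

def advance_fine(u, flux_t, flux_t1, dt, dx):
    """Two fine steps written as in (18); flux_* hold values at the n+1 fine interfaces."""
    total = flux_t + flux_t1
    return u - dt / dx * (total[1:] - total[:-1])

def corrected_coarse_flux(flux_t, flux_t1):
    """Flux (20): average of the two fine fluxes at the interfaces shared with the coarse grid."""
    return 0.5 * (flux_t[::2] + flux_t1[::2])

def advance_coarse(u, flux, dt, dx):
    """One conservative coarse step with the given interface fluxes."""
    return u - dt / dx * (flux[1:] - flux[:-1])

def restrict(u_fine):
    """Average of the two children of each coarse cell, as in (22)."""
    return 0.5 * (u_fine[0::2] + u_fine[1::2])

rng = np.random.default_rng(0)
n, dt, dx = 8, 0.01, 0.125                       # fine grid: 8 cells
u_fine = rng.random(n)
f_t, f_t1 = rng.random(n + 1), rng.random(n + 1)

u_coarse = restrict(u_fine)                      # relation (22) at the initial time
new_fine = advance_fine(u_fine, f_t, f_t1, dt, dx)
new_coarse = advance_coarse(u_coarse, corrected_coarse_flux(f_t, f_t1), 2 * dt, 2 * dx)
consistent = np.allclose(new_coarse, restrict(new_fine))   # relation (23)
```

If the coarse grid were advanced with its own fluxes instead of the corrected ones, `consistent` would in general be false, which is the loss of conservation that the projection step avoids.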

The substitutions made are meaningful because the cell interfaces of the two levels $\ell$ and $\ell-1$ coincide, since $x^\ell_{2i+\frac32} = x^{\ell-1}_{i+\frac12}$ and $x^\ell_{2i-\frac12} = x^{\ell-1}_{i-\frac12}$. We apply this correction every time a grid has been integrated over one time step of the immediately coarser grid.

Finally, we explain the adaptation process, which is needed so that the grids can evolve in time as the numerical solutions do. The goal is to ensure that regions containing discontinuities or other non-smooth structures are refined up to the appropriate level. Our proposal is to include in the grid of a given level those cells whose numerical solution values cannot be accurately predicted from the previous level. More explicitly, if $x^\ell_j \in G^t_\ell$ and $I(u^t_{\ell-1}, x)$ is an interpolation operator acting on the data $u^t_{\ell-1} = \{u^t_{\ell-1,i}\}_{i \in G^t_{\ell-1}}$, then the cell defined by $x^\ell_j$ is included in the refined grid if
\[ \left| u^t_{\ell,j} - I(u^t_{\ell-1}, x^\ell_j) \right| > \tau_p > 0, \tag{24} \]
and we ensure that the resulting grid is formed by subdivision of coarse cells by flagging the appropriate cells. In addition, we also include a cell in the refined grid if the modulus of the discrete gradient of the solution, computed on the coarse grid, exceeds a given tolerance, so that the formation of shock waves from smooth data can be detected. For the discrete gradient we use the approximation
\[ \left| \frac{\partial u}{\partial x}(x^{\ell-1}_j, t) \right| \approx \frac{\max\left( \left| u^t_{\ell-1,j+1} - u^t_{\ell-1,j} \right|, \left| u^t_{\ell-1,j} - u^t_{\ell-1,j-1} \right| \right)}{\Delta x_\ell}. \tag{25} \]
Finally, we add the cells of a neighborhood at least one coarse cell wide. This allows us to adapt the grids after each iteration of the coarse grid, rather than after each iteration of the fine grid, because these extra cells cover the distance that a discontinuity can travel in one time step of the coarse grid, so that it cannot escape from the fine grid. We also point out that the adaptation process acts on the finest grids first, to ensure that $\Omega_\ell(G^t_\ell) \subseteq \Omega_{\ell-1}(G^t_{\ell-1})$ holds at all times. We also ensure the inclusion $\Omega_\ell(G^t_\ell) \supseteq \Omega_{\ell+1}(G^t_{\ell+1})$, so that the grid hierarchy satisfies the desired inclusions. These make it possible to compute the data needed on the bands defined around the flagged cells, discussed above, since we have $\tilde\Omega_\ell(G^t_\ell) \subseteq \tilde\Omega_{\ell-1}(G^t_{\ell-1})$. Once the new grid $\hat G^t_\ell$, which satisfies $\Omega_\ell(\hat G^t_\ell) \subseteq \Omega_{\ell-1}(G_{\ell-1})$, has been computed, we compute
\[ \hat u^t_{\ell,j} = \begin{cases} I(u^t_{\ell-1}, x^\ell_j) & \text{if } j \in \hat G^t_\ell \setminus G^t_\ell,\\ u^t_{\ell,j} & \text{if } j \in G^t_\ell, \end{cases} \tag{26} \]
that is, data are interpolated for the cells that are not in $G^t_\ell$. The refined grid is therefore defined by $(\hat G^t_\ell, \hat u^t_\ell)$.

The algorithm described in this section can be implemented by means of the pseudo-code in Fig. 1. The integration of the whole grid hierarchy over one time step $\Delta t_0$, as described above, is achieved by the call actualitzar(G, 0). The call projectar($G_\ell$) updates the solution on the grid $G_{\ell-1}$ according to the correction (21), and the call adaptar($G_{\ell+1}$) performs the adaptation process described.
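As an illustration of the flagging step of the adaptation process (criteria (24) and (25) followed by the safety buffer), a minimal Python sketch is given below. It makes several simplifying assumptions that are not taken from the thesis: fine data are available over the whole domain, the interpolation operator $I$ is piecewise linear with centred slopes, boundary values are extended by copying the end cells, and the names `flag_cells`, `tau_p` and `tau_g` are hypothetical. The result is a mask over coarse cells, so that the refined grid is automatically made of complete subdivisions of coarse cells.

```python
import numpy as np

def flag_cells(u_fine, u_coarse, dx_fine, tau_p, tau_g, buffer=1):
    """Boolean mask of coarse cells whose two children enter the refined grid."""
    n = len(u_coarse)
    padded = np.pad(u_coarse, 1, mode="edge")           # copies of the end cells

    # Assumed operator I: linear interpolation to the two children of each coarse cell.
    slope = 0.5 * (padded[2:] - padded[:-2])
    interp = np.empty(2 * n)
    interp[0::2] = u_coarse - 0.25 * slope
    interp[1::2] = u_coarse + 0.25 * slope

    # Criterion (24): interpolation error on either child exceeds tau_p.
    err = np.abs(u_fine - interp)
    by_error = (err[0::2] > tau_p) | (err[1::2] > tau_p)

    # Sensor (25): largest one-sided difference on the coarse grid, divided by dx_fine.
    jump = np.abs(np.diff(padded))
    grad = np.maximum(jump[1:], jump[:-1]) / dx_fine
    flagged = by_error | (grad > tau_g)

    # Safety buffer: also flag `buffer` coarse cells on each side of a flagged cell.
    out = flagged.copy()
    for s in range(1, buffer + 1):
        out[s:] |= flagged[:-s]
        out[:-s] |= flagged[s:]
    return out
```

For a step profile located between two coarse cells, both criteria flag the two cells adjacent to the jump, and the buffer adds one more coarse cell on each side.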

Function actualitzar(G: grid hierarchy, ℓ: integer)
    integrar($G_\ell$)
    if (ℓ < L − 1)
        for k = 1 to 2
            actualitzar(G, ℓ + 1)
        end for
        projectar($G_{\ell+1}$)
        adaptar($G_{\ell+1}$)
    end if
end function

Figure 1: A recursive formulation of the AMR algorithm.
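The order in which the recursion of Fig. 1 visits the grids can be traced with a short Python sketch. The function below only records the operations instead of performing them; the name `update` and the log format are hypothetical, and `update` plays the role of actualitzar.

```python
def update(level, n_levels, log):
    """Record the sequence of operations produced by the recursion of Fig. 1."""
    log.append(("integrate", level))
    if level < n_levels - 1:
        for _ in range(2):                  # two fine steps per coarse step
            update(level + 1, n_levels, log)
        log.append(("project", level + 1))  # correct level `level` from level + 1
        log.append(("adapt", level + 1))
    return log

trace = update(0, 3, [])
integrations = [lev for op, lev in trace if op == "integrate"]
```

With three levels, `integrations` is `[0, 1, 2, 2, 1, 2, 2]`: level $\ell$ is integrated $2^\ell$ times per coarse time step, each grid is integrated right after the next coarser one, and a grid is not integrated again until all finer grids have caught up with it.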

Implementation and parallelization of the algorithm

In this section we describe the practical implementation of the algorithm for the two-dimensional case, drawing a clear distinction between the sequential implementation, which is essentially an extension to two dimensions of the algorithms described in the two previous sections, and the parallel implementation. We consider the problem
\[
\begin{aligned}
& u_t(x, y, t) + f(u(x, y, t))_x + g(u(x, y, t))_y = 0, & (x, y, t) &\in \Omega \times [0, T],\\
& u(x, y, 0) = u_0(x, y), & (x, y) &\in \Omega,
\end{aligned}
\tag{27}
\]
where we have assumed $\Omega = [0, 1]^2$ for simplicity. In this case we have, for each refinement level $\ell$, a discretization
\[ x^\ell_{i,j} := (x^\ell_i, y^\ell_j) = \left( \left(i + \tfrac12\right) \Delta x_\ell, \left(j + \tfrac12\right) \Delta y_\ell \right), \qquad 0 \le i < N^x_\ell, \quad 0 \le j < N^y_\ell, \tag{28} \]
where, given positive integers $N^x_0$ and $N^y_0$, we have defined
\[ N^x_\ell = 2^\ell N^x_0, \qquad N^y_\ell = 2^\ell N^y_0, \qquad \Delta x_\ell = \frac{1}{N^x_\ell}, \qquad \Delta y_\ell = \frac{1}{N^y_\ell}, \qquad 1 \le \ell < L. \]
These points define cells
\[ c^\ell_{i,j} = \left[ x^\ell_i - \frac{\Delta x_\ell}{2}, x^\ell_i + \frac{\Delta x_\ell}{2} \right] \times \left[ y^\ell_j - \frac{\Delta y_\ell}{2}, y^\ell_j + \frac{\Delta y_\ell}{2} \right]. \tag{29} \]

We consider grid hierarchies in which each grid $G_\ell$, corresponding to refinement level $\ell$, is defined as a subset of the discretization of that level, organized as a set of rectangular pieces of grid that we call patches and denote by $\{G_{\ell,k}, 1 \le k \le K_\ell\}$. We build the grids and the grid hierarchies so that the following conditions hold:
• Each patch is a subset of $\{c^\ell_{i,j} : 0 \le i < N^x_\ell, 0 \le j < N^y_\ell\}$,
• $\Omega(G_{\ell,k})$ is a rectangle for all $k$, $1 \le k \le K_\ell$,
• $\mathring\Omega(G_{\ell,k_1}) \cap \mathring\Omega(G_{\ell,k_2}) = \emptyset$ if $k_1 \ne k_2$ (patches can only intersect each other at their boundaries),
• $\Omega(G_{\ell,k}) \subseteq \bigcup_{k=1}^{K_{\ell-1}} \Omega(G_{\ell-1,k})$ (the grid of a level is contained in the grid of the immediately preceding level),
• if $c^\ell_{i,j} \in G_{\ell,k}$ is such that $\mathring c^\ell_{i,j} \cap \mathring c^{\ell-1}_{p,q} \ne \emptyset$ for some $c^{\ell-1}_{p,q} \in G_{\ell-1}$, then $c^{\ell-1}_{p,q} \subseteq \Omega(G_{\ell,k})$ (grids are obtained by subdivision of cells of the immediately preceding grid).

We extend each of the patches of a grid with auxiliary cells around it, so that the patches can be integrated separately. The data for these auxiliary cells are obtained in a way analogous to the one-dimensional case, either by interpolating (in space, or in space and time) from data of the previous grid, or by copying data from other patches, since the auxiliary cells of a patch may correspond to interior cells of another one. The adaptation of the grid hierarchy is performed similarly to the one-dimensional case described in the previous section. We include in the refined grid the cells that satisfy a condition of the type
\[ \left| U^\ell_{i,j} - I(U^{\ell-1}, x^\ell_{i,j}) \right| > \tau_p > 0, \tag{30} \]
where $U^\ell_{i,j}$ is the numerical solution at node $(i, j)$ of the level-$\ell$ grid and $I$ is the tensor-product extension of a one-dimensional interpolation operator. We also include the cells that result from subdividing those cells of the immediately coarser grid where the gradient sensor
\[ \max\left( \frac{\max\left( \left| U^{\ell-1}_{i+1,j} - U^{\ell-1}_{i,j} \right|, \left| U^{\ell-1}_{i,j} - U^{\ell-1}_{i-1,j} \right| \right)}{\Delta x_\ell}, \frac{\max\left( \left| U^{\ell-1}_{i,j+1} - U^{\ell-1}_{i,j} \right|, \left| U^{\ell-1}_{i,j} - U^{\ell-1}_{i,j-1} \right| \right)}{\Delta y_\ell} \right) \tag{31} \]
exceeds a given tolerance. Finally, we add the cells corresponding to at least one cell of the previous grid around each flagged cell, to make sure that discontinuities do not leave the fine grid during one time step of the coarse grid. The flagged cells are then grouped into rectangular patches. The process consists of finding the smallest patch that contains all the flagged cells and comparing the percentage of flagged cells it contains with a preset tolerance. If the percentage is below the tolerance, the patch is divided into pieces and the process is repeated for each of the resulting patches. To ensure that the grids are nested, we refine the finer grids before the coarser ones. In this way, when the grid of a level $\ell$ is refined, the inclusion of the finer grid, which has already been refined, can be guaranteed simply by including in the refined level-$\ell$ grid the cells that correspond to cells of the level-$(\ell+1)$ grid. Finally, to provide the refined grid with a numerical solution, we proceed as in the one-dimensional case, copying data from the unrefined grid wherever both grids coincide and interpolating from the previous grid in the remaining cells. As already mentioned, each patch can be integrated separately. The organization of the integrations of the different patches and grids is similar to the one-dimensional case. We only mention that in the two-dimensional case the time step for the coarsest grid is defined so that a CFL condition holds, while the remaining time steps are defined as in the one-dimensional case.
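The grouping of flagged cells into rectangular patches described above can be sketched as a recursive procedure in Python. The text does not specify how a patch with too few flagged cells is divided; the sketch assumes bisection along its longest side, and the names `cluster` and `min_fill` are hypothetical. Patches are returned as half-open index ranges `(i0, i1, j0, j1)`.

```python
import numpy as np

def cluster(flags, min_fill=0.7, offset=(0, 0)):
    """Cover all True entries of the 2D boolean array `flags` with rectangles."""
    if not flags.any():
        return []
    rows = np.flatnonzero(flags.any(axis=1))
    cols = np.flatnonzero(flags.any(axis=0))
    i0, i1, j0, j1 = rows[0], rows[-1] + 1, cols[0], cols[-1] + 1
    box = flags[i0:i1, j0:j1]                  # smallest patch containing all flags
    oi, oj = int(offset[0] + i0), int(offset[1] + j0)
    ni, nj = box.shape
    if box.mean() >= min_fill:                 # fraction of flagged cells is acceptable
        return [(oi, oi + ni, oj, oj + nj)]
    if ni >= nj:                               # assumed rule: bisect the longest side
        h = ni // 2
        return (cluster(box[:h], min_fill, (oi, oj))
                + cluster(box[h:], min_fill, (oi + h, oj)))
    h = nj // 2
    return (cluster(box[:, :h], min_fill, (oi, oj))
            + cluster(box[:, h:], min_fill, (oi, oj + h)))
```

The recursion terminates because a bounding box that fails the test has at least two cells along its longest side, and a single flagged cell has fill ratio one. The resulting rectangles do not overlap, since each split produces disjoint halves.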
As for the projection, which closes the part devoted to the sequential implementation, note that with the discretization defined above the nodes where numerical fluxes are computed on two grids of different levels do not coincide (see Fig. 2), although they lie on coincident cell interfaces. To correct the numerical flux of the coarse grid from the flux of the fine grid, we compute interpolated values from the fine-grid values and use them for the correction of the coarse grid. By an analysis similar to that of the previous section, we observe that the integration of a node $x^\ell_{2i,2j} \in G_\ell$ from a time $t$ to a time $t + \Delta t_{\ell-1}$ can be written as
\[
\begin{aligned}
u^{t+2\Delta t_\ell}_{\ell,2i,2j} = u^t_{\ell,2i,2j} &- \frac{\Delta t_\ell}{\Delta x_\ell} \left( \hat f^{\mathrm{RK3},t}_{\ell,2i+\frac12,2j} + \hat f^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i+\frac12,2j} - \hat f^{\mathrm{RK3},t}_{\ell,2i-\frac12,2j} - \hat f^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i-\frac12,2j} \right)\\
&- \frac{\Delta t_\ell}{\Delta y_\ell} \left( \hat g^{\mathrm{RK3},t}_{\ell,2i,2j+\frac12} + \hat g^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i,2j+\frac12} - \hat g^{\mathrm{RK3},t}_{\ell,2i,2j-\frac12} - \hat g^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i,2j-\frac12} \right),
\end{aligned}
\]
with analogous expressions for $u^{t+2\Delta t_\ell}_{\ell,2i+1,2j}$, $u^{t+2\Delta t_\ell}_{\ell,2i,2j+1}$ and $u^{t+2\Delta t_\ell}_{\ell,2i+1,2j+1}$. If we now define, for $-1 \le i \le N^x_\ell$ and $0 \le j \le N^y_\ell$,
\[ \hat{\hat f}^t_{\ell-1,i+\frac12,j} = \frac{\hat f^{\mathrm{RK3},t}_{\ell,2i+\frac32,2j} + \hat f^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i+\frac32,2j} + \hat f^{\mathrm{RK3},t}_{\ell,2i+\frac32,2j+1} + \hat f^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i+\frac32,2j+1}}{4}, \tag{32} \]
and, for $0 \le i \le N^x_\ell$ and $-1 \le j \le N^y_\ell$,
\[ \hat{\hat g}^t_{\ell-1,i,j+\frac12} = \frac{\hat g^{\mathrm{RK3},t}_{\ell,2i,2j+\frac32} + \hat g^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i,2j+\frac32} + \hat g^{\mathrm{RK3},t}_{\ell,2i+1,2j+\frac32} + \hat g^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i+1,2j+\frac32}}{4}, \tag{33} \]
then we have
\[
\begin{aligned}
\frac{u^{t+2\Delta t_\ell}_{\ell,2i,2j} + u^{t+2\Delta t_\ell}_{\ell,2i+1,2j} + u^{t+2\Delta t_\ell}_{\ell,2i,2j+1} + u^{t+2\Delta t_\ell}_{\ell,2i+1,2j+1}}{4} ={}& \frac{u^t_{\ell,2i,2j} + u^t_{\ell,2i+1,2j} + u^t_{\ell,2i,2j+1} + u^t_{\ell,2i+1,2j+1}}{4}\\
&- \frac{\Delta t_\ell}{\Delta x_\ell} \left( \hat{\hat f}^t_{\ell-1,i+\frac12,j} - \hat{\hat f}^t_{\ell-1,i-\frac12,j} \right) - \frac{\Delta t_\ell}{\Delta y_\ell} \left( \hat{\hat g}^t_{\ell-1,i,j+\frac12} - \hat{\hat g}^t_{\ell-1,i,j-\frac12} \right).
\end{aligned}
\tag{34}
\]
Therefore, on the coarse grid we apply the corrections
\[
\begin{aligned}
\hat f^t_{\ell-1,i+\frac12,j} &= \hat{\hat f}^t_{\ell-1,i+\frac12,j}, & -1 \le i \le N^x_{\ell-1}, \quad & 0 \le j \le N^y_{\ell-1} - 1,\\
\hat g^t_{\ell-1,i,j+\frac12} &= \hat{\hat g}^t_{\ell-1,i,j+\frac12}, & 0 \le i \le N^x_{\ell-1} - 1, \quad & -1 \le j \le N^y_{\ell-1},
\end{aligned}
\tag{35}
\]
so that, if we assume that at time $t$ the relation
\[ u^t_{\ell-1,i,j} = \frac{u^t_{\ell,2i,2j} + u^t_{\ell,2i+1,2j} + u^t_{\ell,2i,2j+1} + u^t_{\ell,2i+1,2j+1}}{4} \tag{36} \]
holds, the same relation holds at the next time $t + \Delta t_{\ell-1}$.

Figure 2: Relative location of the nodes and of the points where numerical fluxes are computed for two grids of consecutive resolution levels (the coarse node $x^{\ell-1}_{i,j}$ and the fine nodes $x^\ell_{2i,2j}$, $x^\ell_{2i+1,2j}$, $x^\ell_{2i,2j+1}$, $x^\ell_{2i+1,2j+1}$). Coarse-grid nodes are shown as black circles and fine-grid nodes as black squares. The points where numerical fluxes are computed are shown as white circles for the coarse grid and white squares for the fine grid.

We now summarize the parallel implementation of the algorithm. The usual way of parallelizing AMR-based algorithms is to divide the domain into pieces and to assign to each processor the patches, of all levels, that correspond to each piece, exchanging data between processors when needed. The problem with this approach is that the existence of small regions with a high degree of refinement complicates the design of an efficient partitioning strategy, which ideally should:
• balance the workloads of the processors, and
• minimize data transfers between the different processors.
To achieve these goals, the most common practice is to use space-filling curves (SFC) [151]. These are continuous curves that pass through every point of the square $[0, 1]^2$, and they are built iteratively, so that at each step $k$ of the construction a bijective map can be established between two sets $\{1, \dots, K\}^2$ and $\{1, \dots, K^2\}$, where $K$ depends on the construction step and on the curve used. In this way, a sequential index can be assigned to the pieces into which the domain $[0, 1]^2$ has been divided. The advantage of SFCs is that they try to assign nearby indices to nearby pieces, so that if the pieces are assigned to processors following the order given by the SFC, nearby pieces tend to be assigned to the same processor. Since data exchange is only needed between adjacent pieces, this reduces the data transfer between processors [135, 31]. In this work we have used Peano-Hilbert curves [136, 69], which at the $k$-th step of the construction divide the domain $[0, 1]^2$ into $4^k$ pieces.

The division of the domain into pieces is carried out by looking at the coarsest grid and assigning to each cell a cost proportional to the number of integrations that have to be performed in order to integrate that cell, and all the cells of finer levels into which it has been subdivided, during one time step of the coarsest grid. Thus, if the cell has been refined up to a level $\ell \ge 0$, it is assigned a cost equal to
\[ \sum_{l=0}^{\ell} 2^{3l} = \frac{2^{3\ell+3} - 1}{7}. \]
Each piece is assigned a cost equal to the sum of the costs of its cells. The algorithm that we have used to distribute the computations among the available processors works as follows. First, the number $k_{\max}$ of divisions that can be performed in each dimension so that each resulting subdomain contains an integer number of cells of the coarsest grid is computed. Next, the number $k_{\min}$ is computed, which corresponds to a step of the construction of the Peano-Hilbert curve that produces a number of subdomains $4^k \ge P$, where $P$ is the number of available processors. The cost of each piece is computed for the division given by $k = k_{\min}$, and we check whether these pieces can be assigned to different processors in such a way that the difference between the total cost assigned to a processor and the mean cost (obtained by dividing the total cost by the number of processors) is smaller than a certain quantity $\tau$. If so, the division is accepted and the data are distributed. Otherwise, if $k \le k_{\max}$, $k$ is increased by one and the process is repeated until a satisfactory load distribution is achieved. If no such division can be found for a value $k \le k_{\max}$, the value of $\tau$ is increased and the process is repeated from the beginning. This process can be implemented as indicated in the pseudo-code of Fig. 3.
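The ingredients of this load-balancing strategy can be sketched in Python. The code below is a simplified illustration under several assumptions that are not taken from the thesis: the Hilbert index is computed with the standard bitwise algorithm, the refinement state is summarized by an integer array `levels` holding the finest level reached by each coarse cell, and pieces are assigned to processors by a greedy sweep along the curve. All function names are hypothetical.

```python
import numpy as np

def hilbert_index(n, x, y):
    """Position of piece (x, y) along the Hilbert curve on an n x n grid (n a power of two)."""
    d, s = 0, n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                            # rotate or reflect the quadrant
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d

def cell_cost(level):
    """Cost of a coarse cell refined up to `level`: sum of 2^(3l) for l = 0, ..., level."""
    return (2 ** (3 * level + 3) - 1) // 7

def piece_costs(levels, k):
    """Costs of the 4^k pieces of a square coarse grid, listed in Hilbert order."""
    n = 2 ** k
    m = levels.shape[0] // n                   # coarse cells per piece and direction
    cost = (2 ** (3 * levels + 3) - 1) // 7
    block = cost.reshape(n, m, n, m).sum(axis=(1, 3))
    order = sorted((hilbert_index(n, x, y), x, y) for x in range(n) for y in range(n))
    return [int(block[x, y]) for _, x, y in order]

def assign(costs, nproc):
    """Greedy assignment of consecutive pieces to processors (an assumed strategy)."""
    mean = sum(costs) / nproc
    loads, owner, p, acc = [0] * nproc, [], 0, 0
    for c in costs:
        if p < nproc - 1 and loads[p] > 0 and acc + c / 2 > (p + 1) * mean:
            p += 1                             # move on to the next processor
        owner.append(p)
        loads[p] += c
        acc += c
    return owner, loads

def acceptable(loads, tau):
    """Check that every processor load differs from the mean by less than tau."""
    mean = sum(loads) / len(loads)
    return max(abs(load - mean) for load in loads) < tau
```

A driver following the description above would call `piece_costs` and `assign` for $k = k_{\min}, \dots, k_{\max}$ until `acceptable` returns true, and increase $\tau$ if no value of $k$ succeeds.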

Function equilibra(g: grid hierarchy, P: integer, τ: real)
    Compute kmin and kmax
    do
        k = kmin
        do
            C = compute the Peano-Hilbert ordering for step k
            l = list of costs for C and g
            assign costs to processors according to l, P and τ
            k = k + 1
        while (the cost assignment is not acceptable and k ≤ kmax)
        increase τ
    while (the cost assignment is not acceptable)
end function

Figure 3: Pseudo-code for load balancing among processors.

Numerical experiments

In this chapter we analyze the performance of the numerical method with several examples in one and two dimensions. Quantitative studies, such as the analysis of numerical errors, have been carried out in the one-dimensional case, because isolated phenomena are easier to analyze there than in the two-dimensional case. Moreover, execution times are shorter, which allows extensive testing in a reasonable time. With the two-dimensional examples we have shown the behavior of the method in complex situations, with a more qualitative methodology. We have selected a set of problems whose solutions exhibit a wide variety of phenomena, with the aim of showing that the algorithm performs well in the widest possible range of situations. We have observed that, for the problems solved, both one-dimensional and two-dimensional, the method performs well and produces solutions of the same quality as those obtained on a fixed-resolution grid, provided that the parameters are set to suitable values. The performance of the algorithm depends on several factors, such as the complexity of the problem, the grid hierarchy used and the parameters, among others, and only experience can guide the choice of a suitable configuration. In the one-dimensional case we have solved problems for the linear advection equation, Burgers' equation, two configurations for the Euler equations (the Sod shock tube problem and the interaction of a shock wave with entropy waves) and a problem for the multicomponent Euler equations. These problems have been used to show how the error, understood as the difference between the solutions obtained with a fixed grid and with AMR, behaves with respect to the parameters, and what effect each of the parts of the process of flagging cells for refinement has. We have also shown the importance of the flux projection for the performance of the algorithm, and we have analyzed the computational cost of the algorithm, concluding that integration is by far the most expensive part, so that an estimate of the percentage of integrations performed by AMR relative to the fixed-grid algorithm can serve as an estimate of the percentage of time it will require.

In the two-dimensional case we have numerically solved several problems: a two-dimensional Riemann problem, where the initial states represent four shock waves; the problem known as double Mach reflection, in which a shock wave meets a ramp; the interaction of a shock wave with a vortex; and the interaction of a shock wave traveling through air with a helium bubble. In all cases the algorithm obtains solutions of the same quality as those obtained on a fixed grid or those obtained by other authors, at a lower computational cost.

Conclusions and future work

In this work we have described a numerical method for the solution of hyperbolic systems of conservation laws. The method results from combining a high order shock capturing method (built from the finite-difference formulation of Shu and Osher, a fifth order WENO interpolation method, the flux splitting of Donat and Marquina and a third order Runge-Kutta algorithm) with the AMR technique developed by Berger and collaborators. We show how all these techniques can be put together to build a highly efficient numerical method, and we describe the practical implementation of the algorithm in a sequential or parallel program. We have checked the behavior of the algorithm with several numerical tests in one and two dimensions, which show that the method is able to obtain solutions of the same quality as those obtained without adaptation, but at a much lower computational cost. The large number of tests performed provides a good understanding of the properties of the algorithm, which can be useful in practice to obtain information about the potential gains it can provide, as well as about its behavior with respect to the parameters on which it depends. With the help of the experiments we have explained different aspects of the algorithm, in particular:

• the behavior of the adaptive method relative to the same algorithm applied on a fixed-resolution grid, in terms of the difference between the respective solutions,

• the influence of the refinement process on the quality of the final result and on the performance obtained by the adaptive algorithm, and

• the importance, within the algorithm, of projecting fluxes from finer grids onto coarser grids.


Although the performance of the algorithm is satisfactory, we have detected several aspects, mainly related to the implementation, that could improve its efficiency in some cases. In particular, the overhead produced by AMR can be reduced by using fast search algorithms on the grid hierarchy, so that the cost of finding the connectivity between grids could be reduced. Grid connectivity is used in several processes within the algorithm, such as the computation of the numerical solution in the auxiliary cells of the patches, or the computation of the numerical solution of a grid after adaptation. Along the same lines, in some cases it could be useful to merge small patches into larger ones, whenever possible, and without reducing the percentage of flagged cells present in each patch. This would reduce the number of auxiliary cells required, and therefore the cost of the algorithm, in exchange for the cost of the patch-merging process. The refinement criterion based on interpolation errors that we have used could be investigated further with the aim of reducing its dependence on the parameter τp. The most obvious extension of the method described in this work is its implementation in three dimensions. Although the description of the algorithm in 3D follows easily from the two-dimensional case, and producing it might seem simple, the number of aspects and details that must be taken into account to write a code of this kind is so large that we are not currently considering developing it. Instead, we intend to extend the range of application of the method to more general two-dimensional problems; in particular, we are preparing its application to balance laws, where a source term appears. It is also interesting to apply it to problems where the characteristic decomposition of the Jacobian matrices is not fully available in analytic form, such as traffic flow problems (see [45]). Another interesting case would be the combination of penalization methods [25] with adaptive mesh refinement.


Abstract

The numerical simulation of physical phenomena represented by nonlinear hyperbolic systems of conservation laws presents specific difficulties that are not present in other kinds of systems of partial differential equations. These are mainly due to the presence of discontinuities in the solution. State of the art methods for the solution of such equations involve high resolution shock capturing schemes, which are able to produce sharp profiles at the discontinuities and high accuracy in smooth regions, together with some kind of grid adaptation, which reduces the computational cost by using finer grids near the discontinuities and coarser grids in smooth regions. The combination of both techniques presents intrinsic numerical and computational difficulties. In this work we present a method obtained by the combination of a high order shock capturing scheme, built from Shu-Osher's conservative formulation, a fifth order weighted essentially non-oscillatory (WENO) interpolatory technique, Donat-Marquina's flux-splitting method and a third order Runge-Kutta method, with the adaptive mesh refinement (AMR) technique of Berger and collaborators. We show how all these techniques can be merged together to build a highly efficient numerical method, and we show how to parallelize such an algorithm. We also present a description of the AMR algorithm that is much more general than the descriptions currently found in the scientific literature and that tries to get closer to the foundations of the algorithms that are described and implemented in practice. We test our implementation extensively to determine its extent of applicability and its relative benefits with respect to the non-adaptive algorithm.


1 Introduction

1.1 Motivation

The enormous growth of the processing power of modern computers allows scientists to go further in the simulation and analysis of physical problems that could not be tackled without their help. In particular, Computational Fluid Dynamics (CFD) is one of the fields that make extensive use of computer-aided numerical simulation. Theoretical results, hypotheses and intuitions have been tested against numerical results obtained with the aid of computers. Many software packages have been developed to help scientists perform numerical simulations of fluid phenomena, and a lot of knowledge about the fluid dynamics equations has been acquired in different contexts thanks to such software. CFD is nowadays applied to a wide and heterogeneous


range of fields, such as the design of moving vehicles like aircraft, submarines, cars or satellites, traffic flow control, weather prediction, biomedical sciences, topological design, micro and nano-device cooling systems, environmental sciences and others (see e.g. [164, 126, 191, 3, 171, 53, 86, 133, 17, 129]). The industry makes extensive use of in-house or professional software packages. It is important for industrial users to obtain results from their simulations as fast as possible and with the highest possible accuracy, in order to reduce their costs and production cycle times. The industry demands reliability, robustness and specificity in the tools used in the production processes. Cutting-edge CFD software packages, be they commercial like FLUENT [51], open-source like OpenFOAM [131] or in-house like the DLR-TAU code [156], implement many of the standard computational fluid dynamics numerical methods used nowadays by the industry. However, there is a big gap between the most modern numerical techniques developed in academic environments, such as universities and research centers, and their integration into engineering production processes. This is mainly due to the increasing complexity of such techniques, to the high costs involved in their implementation and integration with existing codes, and to the lack of confidence that the efficiency finally obtained with more complex technologies will repay the investment made in their development. Practical problems involve several different steps that make use of various technologies, usually developed in different contexts, and each step could benefit from the new developments in its own area.
For example, in the field of aeronautical optimal shape design [78, 68, 126], the goal is to define the shape of an object, say an airplane or a part of it, that travels through air or another fluid medium, in order to achieve a prescribed goal, like minimization of drag, maximization of lift or noise reduction. One approach is to start with an initial shape and deform or modify it through an iterative optimization process until a satisfactory solution is reached. The efficient solution of this problem involves, as leading processes, geometric shape definition, generation and deformation of computational grids, numerical solution of Partial Differential Equations (PDEs), geometric handling of surfaces and optimization. The efficient merging of technologies coming from these fields is an active field of research, as demonstrated by the fact that the European Framework Programmes are supporting several projects related to the application of academic research to industrial problems, see e.g. [88]. In the United States a considerable effort is also being made in the same direction, led by the


NPARC Alliance [173], which brings together efforts from partners such as Boeing, NASA and USAF. With much more modest objectives, in this work we develop a numerical method based on some of the state-of-the-art numerical techniques in the area of computational fluid dynamics, and we solve the main difficulties that appear when putting all those techniques together. We also show how such a method can be parallelized and implemented in a computer program. Our implementation of the method has been written from scratch using ANSI C [76] and the Message Passing Interface (MPI) standard [123] for parallelization.

1.1.1 High resolution shock-capturing schemes

There is a (still open) debate about the relative advantages of high order (high resolution) and low order numerical methods for the numerical solution of hyperbolic systems of equations. Typically, low order methods are faster and easier to implement, but provide less accurate solutions. High order methods compute better numerical approximations, but at a higher computational cost per cell. Whether it is more efficient to apply a high order method on a relatively coarse grid or a lower order method on a finer grid, so that both give solutions of the same accuracy, is a problem dependent issue. High order methods are widely regarded as advantageous over low order schemes, although opinions on both sides can be found in the literature, see e.g. [58, 80, 87, 110]. In particular, for time-dependent problems including flows with complicated structures, evidence shows that higher order methods outperform lower order ones [157, 95]. In real industrial applications high order methods are seldom used, mainly because of the implementation difficulties associated with those schemes. Obtaining a stable and robust code that incorporates high resolution technologies requires a large development effort, and the industry continues to use the classical, well established codes and algorithms. However, there is an increasing demand for high fidelity tools for CFD simulation. Research has revealed some phenomena intrinsic to fluid mechanics that would need impractically fine meshes to be resolved with low order methods. Examples are the Rayleigh-Taylor [168, 172] and Richtmyer-Meshkov [146, 122] instabilities, acoustic waves generated by shock-vortex interactions [48] and turbulence [189].


1.1.2 Need of fine resolution computational grids

The numerical solution of partial differential equations can be obtained by means of discrete approximations to the continuous equation, obtained through a discretization process. The way in which one passes from the continuous equations to the discrete, finite-dimensional problem is determined by the choices made for each element of the problem. Typically the computational domain is divided into cells, and the continuous equations are replaced by a discrete approximation at each cell. The discretization of the computational domain itself imposes a limit on the flow features that can be resolved. The numerical solution within a cell is often interpreted as an approximation to the average or point value of the true solution in that cell, which means that no method can resolve phenomena whose scale is smaller than the mesh size. The difference between numerical methods can be interpreted in terms of their relative ability to extract the information about the solution contained in a single computational cell. High order methods give better results than low order methods when applied to a given fixed grid, because they resolve the flow within a single cell better, but to properly resolve small scale features the grid size must be smaller than the scale of the phenomena to be resolved. To sum up, the optimal method would be a high order method applied on a very fine computational grid, but the storage and computing power required by such a method would make it, by far, unaffordable in a reasonable time with today's technology.

1.1.3 AMR: spatial and temporal refinement

Accurate approximations of the true solution of the equations can be obtained wherever the solution is smooth enough using a relatively coarse mesh and low order methods. Small-scale features are often related to parts of the numerical solution where discontinuities appear. Fine grids are particularly helpful only in the parts of the solution that have non-smooth structure, or where the solution is rapidly changing. This idea led researchers to develop a variety of techniques to reduce the computational cost of the overall algorithm, mainly based on


the use of non-uniform grids. These algorithms use a grid with cells of variable size, with smaller cells in some regions of interest and bigger cells in other regions where the solution is smooth. These grids are often difficult to manipulate in more than one space dimension, because the solution at a cell depends on the solution in some neighborhood around it. The use of cells of mixed size makes it difficult to compute the solution at the next time step, because of the variable number of neighbors with non-uniform sizes and relative locations. Even more complex is the use of unstructured grids. This approach is often used to model complex geometries. The lack of structure, added to the non-uniformity of the grid, makes these algorithms even harder to implement. Other methods, such as penalty methods [27, 190], allow the control points to move within a grid towards the region where high resolution is needed. These methods are efficient and robust when the small scale features are well separated and a sufficient number of grid points is used. For complex flows the grids can become badly behaved, because of high stretching, distortion, etc. One step further are meshless methods [20], where no discrete grid is used at all. A certain number of nodes is freely distributed in the computational domain. Node collapsing and bad behavior of the numerical methods on these structures are major problems for these methods, which are, on the other hand, very flexible. Another important drawback of the previous approaches is stability. The Courant-Friedrichs-Lewy (CFL) condition [40] is a necessary condition for stability, and imposes an upper limit on the size of the time step, which depends on the size of the smallest cell. The smaller the smallest cell in the grid is, the higher the number of time steps necessary to compute the solution.
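As a rough illustration of this restriction, the CFL condition for an explicit scheme in one space dimension is typically written in the form below. The symbols are notation assumed here for illustration and are not taken from the text: $C$ is a scheme-dependent constant of order one, $\lambda_k$ are the characteristic speeds, $\Delta x_j$ are the cell sizes and $T$ is the final time.

```latex
\[
  \Delta t \;\le\; C\,\frac{\min_j \Delta x_j}{\max_k |\lambda_k|},
  \qquad\text{so that}\qquad
  N_{\mathrm{steps}} \;\approx\; \frac{T}{\Delta t}
  \;\ge\; \frac{T\,\max_k |\lambda_k|}{C\,\min_j \Delta x_j}.
\]
```

Under this form, halving the smallest cell of a non-uniform grid doubles the number of time steps needed to reach $T$, and every cell of the grid pays for those extra steps if a single global time step is used.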
Adaptive Mesh Refinement (AMR) [21, 24, 22, 139] adds a new feature to this pool: temporal refinement. The goal of the AMR procedure is to perform as few cell updates as possible, rather than merely to reduce the number of cells. It exploits the fact that cells of different sizes can be advanced in time with different time steps, by distributing the cells among different grids of uniform grid size, each of which is integrated with its corresponding time step. The AMR approach is very general, and in practice a compromise between efficiency and complexity of the numerical algorithm has to be made. In particular, in the final form of the algorithm that will be described in this work the underlying idea is to use a hierarchical set of Cartesian, uniform meshes that live at different resolution levels. At the


coarsest level there is a set of coarse mesh patches covering the whole domain. Mesh patches at a given resolution level are obtained by subdividing groups of cells of the immediately coarser level according to a suitable refinement criterion. By repeating this subdivision procedure one can cover the regions of interest with mesh patches, so that the non-smooth structure of the solution can be resolved with the desired resolution. The grids at different resolution levels coexist, and some mesh connectivity information is needed to connect the solutions at different resolution levels. Given the connectivity information, each mesh patch can be viewed in isolation and can be integrated independently. The presence of discontinuities in a small part of the domain does not restrict the time step that can be used at the coarse grid. Note that, on the other hand, there is some redundancy in the solution, since grids that correspond to different resolutions can refer to the same spatial location. Although the computational saving is much more spectacular in two or three space dimensions, a simple example in one space dimension can clarify the power of the AMR approach: consider a computational domain composed of a coarse computational grid of 100 cells, whose grid size is ∆x. At a certain point of the simulation an interval of 20 coarse cells is refined. From these coarse cells at level 0 we obtain a mesh patch at refinement level 1 of 40 cells of size ∆x/2 each, by subdividing each coarse cell into 2 sub-cells of equal size. Suppose that the process is repeated for an interval of 10 cells of the patch at level 1, obtaining a patch of 20 fine cells at level 2. If the time step imposed by the CFL condition at level 0 is ∆t, then the time steps for levels 1 and 2 have to be taken, for stability, as ∆t/2 and ∆t/4 respectively, so the integration of the whole grid hierarchy from time t to time t + ∆t requires 100 cell updates for level 0, 40 × 2 = 80 for level 1 and 20 × 4 = 80 for level 2, for a total of 260 cell updates. To perform the simulation on an equivalent fixed grid of 400 points of size ∆x/4, a time step equal to ∆t/4 is required, so 400 × 4 = 1600 cell updates have to be computed to integrate the solution from time t to time t + ∆t. The AMR algorithm requires only 16.25% of that quantity. As the integration of the solution is, by far, the most expensive part of the algorithm, especially for high order schemes, the overhead produced by the transfer of information between grids is small compared with the reduction of computational effort provided by the spatial and temporal refinement of the AMR algorithm. Other related algorithms based on the idea of reducing the number of cell updates are the local defect correction method [61], local multi-grid methods [28] and the Fast Adaptive Composite Grid (FAC) method [120].
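The cell-update count of the one-dimensional example can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, not of the AMR algorithm itself; the function names `amr_cell_updates` and `uniform_cell_updates` are hypothetical, and a constant refinement factor of 2 between consecutive levels is assumed, as in the example.

```python
def amr_cell_updates(cells_per_level, refinement_factor=2):
    """Cell updates needed to advance the whole hierarchy by one coarse step.

    cells_per_level[l] is the number of cells at level l. Level l uses a
    time step refinement_factor**l times smaller than level 0, so it needs
    refinement_factor**l sub-steps per coarse step.
    """
    return sum(n * refinement_factor**level
               for level, n in enumerate(cells_per_level))


def uniform_cell_updates(coarse_cells, finest_level, refinement_factor=2):
    """Cell updates on a uniform grid with the resolution of the finest level."""
    ratio = refinement_factor**finest_level
    fine_cells = coarse_cells * ratio   # spatial refinement of the whole domain
    return fine_cells * ratio           # one sub-step per reduced time step


amr = amr_cell_updates([100, 40, 20])   # 100 + 40*2 + 20*4
fixed = uniform_cell_updates(100, 2)    # 400 cells, 4 sub-steps
print(amr, fixed, 100.0 * amr / fixed)  # 260 1600 16.25
```

Changing the list of cells per level shows how quickly the ratio degrades when the refined region covers a large part of the domain.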


1.2 Previous work

Since the first descriptions of the AMR algorithm by Oliger's group at Stanford University in 1982 [26, 21], a lot of development has been done on adaptive mesh refinement. The algorithm was established for hyperbolic equations by the paper of Berger and Oliger [24]. At the end of the 1980s some works implemented AMR for two-dimensional gas dynamics [22, 10]. Shortly after, three-dimensional simulations were presented by Bell, Berger, Saltzman and Welcome [19]. These algorithms proved their efficiency in the experiments performed in the aforementioned papers, but implementing a running AMR-based code in more than one space dimension was (and still is) a Herculean task. In 1991 Quirk [139] simplified several parts of the (two-dimensional) algorithm, making it easier to implement, although some of the simplifications seem to be too restrictive. Nevertheless, this simplified version encouraged researchers to consider incorporating AMR variations into their research and computer programs. Nowadays several software packages include AMR infrastructures and templates to help others in the task of building their own AMR programs [116, 185]. Because of the time-refinement features of the AMR algorithm, explicit schemes for time integration seem to be better suited to AMR, but implicit schemes have also been used [4]. Solvers with order of accuracy higher than two were combined with AMR in the last years of the last century. The Piecewise Parabolic Method (PPM) of Colella and Woodward [39] is nowadays widely used in AMR codes [29], and efforts to build AMR codes including Essentially Non-Oscillatory (ENO) and Weighted Essentially Non-Oscillatory (WENO) methods are currently being made [106, 108, 15, 192].

1.3 Scope of the work

In this work we develop a numerical method for fluid dynamics that incorporates some of the most advanced techniques present in the literature. We describe the main algorithms and techniques that form the building blocks of the algorithm, and we address the difficulties that appear when putting them all together into a single algorithm. Some


particular points of interest that have been investigated are: description of the AMR algorithm in wide generality, inter-grid conservation, refinement procedures, grid generation and handling, and implementation and parallelization of the algorithms. The building blocks of the algorithm are Shu-Osher's conservative formulation [160, 161], a fifth order WENO interpolatory technique [81], Donat-Marquina's flux-splitting method and a third order Runge-Kutta method [160], combined with the AMR technique for mesh refinement. We have tested the resulting algorithm with several numerical experiments in order to investigate its behavior in different scenarios, and to what extent the algorithm can be applied to a particular problem. We have implemented a parallel version of the algorithm using the ANSI C language and the MPICH implementation [127] of the MPI standard for parallelization.

1.4 Organization of the text

The text is organized as follows. In chapter 2 we introduce the basic concepts and ideas of fluid dynamics, focusing on the Euler equations, which are one of our test models in the validation of the algorithm. The basics of numerical methods for fluid dynamics are introduced in chapter 3. We have tried to write a reasonably self-contained text, so as to make it accessible to non-experts in the area. In chapters 4 and 5 we describe the basic building blocks of our algorithm. The flow solver is described in chapter 4. It consists mainly of three parts. In section 4.1 Shu-Osher's finite difference approach is described. Conceptually, it represents a simplification of the finite-volume framework, but the combination of algorithms based on it with the AMR algorithm presents particular difficulties. In section 4.2 the basic flow solver, based on Donat-Marquina's flux splitting, is described. It represents an extension to nonlinear systems of Shu-Osher's algorithm and has proven to be a robust and efficient algorithm. The weighted essentially non-oscillatory (WENO) reconstruction procedure of Jiang and Shu is described in section 4.3. This is the reconstruction used in Donat-Marquina's algorithm to achieve fifth order accuracy. Finally, the AMR algorithm used in our implementation is described in chapter 5. A more general AMR algorithm is described in appendix A. At the end of the appendix, the algorithm constructed in chapter 5 is described as


a particular case. In chapter 6 we describe our implementation of the algorithm in parallel using message passing. Chapter 7 is devoted to the numerical validation of the algorithm. We have run several test cases in one and two dimensions, for scalar equations and systems, in order to define the extent of applicability of the algorithm. Conclusions and future research lines to be followed from this work are pointed out in chapter 8. Preliminary work related to this thesis can be found in [13, 14, 15, 128].


2 Fluid dynamics equations

In this chapter we review some basic facts about hyperbolic conservation laws, focusing on fluid dynamics equations, and more precisely on the Euler equations of gas dynamics. We will review the structural properties of such equations and their solutions, with the aim of deriving information that has to be taken into account when building numerical methods for their solution. Other well known model equations used in the numerical experiments will also be presented here. There are many sources of information about hyperbolic conservation laws and fluid mechanics. The classic, essential book of Landau and Lifshitz [93] is the main reference in this field. More recent works, such as the books of Batchelor [16], Chorin and Marsden [36] and Dafermos [41], are good references. The work of Lax [100] is also essential reading. Another classical text is Lamb's book [91], whose first edition dates back to 1879. This book was written at the time when the exciting works


of Lord Rayleigh [168, collected in [169], pp. 200–207], Lord Kelvin [174] and Helmholtz [184], among others, appeared, and reflects what was, at that time, a new conception of fluid mechanics. Other interesting references are [90, 119, 188].

2.1 Hyperbolic conservation laws

In physics, a conservation law states that a particular measurable property of an isolated physical system does not change as the system evolves in time. Conservation laws are often modeled by means of integral equations, but in practice these are represented by systems of partial differential equations, which are equivalent to the integral formulation for smooth solutions. Hyperbolic conservation laws are a class of hyperbolic partial differential equations of special interest in fluid dynamics, since the most important models of fluid motion are represented by equations of this type. A leading characteristic of hyperbolic conservation laws is that they admit discontinuous solutions. This fact, together with the finite speed of propagation of information, is the main reason why numerical methods specific to hyperbolic conservation laws have to be developed. It is the goal of this work to develop and test a numerical method to solve hyperbolic systems of conservation laws using quite sophisticated technology. Why such elaborate methods need to be developed has been explained in chapter 1, and will be assessed throughout the rest of the work. A vast literature on numerical methods specially designed for hyperbolic conservation laws has been produced in recent years, see e.g. [175, 6, 105, 70, 71]. There are several reasons for this interest in the study of hyperbolic PDEs in general, and hyperbolic conservation laws in particular, separately from other types of PDEs. The main ones are the following:

1. Many engineering and industrial problems involve conservation of some quantities. In particular, the equations of fluid mechanics reduce to the Euler equations when the effects of viscosity and heat conduction are neglected. The Euler equations are one of the best known hyperbolic systems of conservation laws.


2. The structure and properties of the solutions of hyperbolic conservation laws are particular to them, and must be treated carefully in order to devise convergent numerical methods.

3. Some non-hyperbolic systems can be numerically solved by adding a trivial discretization of the non-hyperbolic terms to a hyperbolic solver.

Conservation laws are represented by a system of partial differential equations with a particular structure:

\[
  \frac{\partial u}{\partial t} + \sum_{q=1}^{d} \frac{\partial f^q(u)}{\partial x_q} = 0,
  \qquad x \in \mathbb{R}^d, \quad t \in \mathbb{R}^+, \tag{2.1}
\]

where $d$ is the number of spatial dimensions, $u : \mathbb{R}^d \times \mathbb{R}^+ \to \mathbb{R}^m$ is the solution of the conservation law, formed by the so-called conserved variables, and $f^q : \mathbb{R}^m \to \mathbb{R}^m$ are the flux functions. The number $m$ of conserved quantities is problem dependent. The particular case $m = 1$ is often referred to as a scalar conservation law. Conservation laws often come from an integral equation representing the conservation of a certain quantity, whose density is represented by $u$. Conservation means that the amount of mass contained in a given volume can only change due to the mass flux crossing the interfaces of the given volume. For a cube in $\mathbb{R}^d$:

\[
  \int_{c} \big(u(x, t_2) - u(x, t_1)\big)\,dx
  = \sum_{q=1}^{d} \int_{t_1}^{t_2} \left[
      \int_{\bar{c}_q^1} f^q\big(u(x_1, \dots, x_{q-1}, x_q^1, x_{q+1}, \dots, x_d, t)\big)\,d\bar{x}_q
    - \int_{\bar{c}_q^2} f^q\big(u(x_1, \dots, x_{q-1}, x_q^2, x_{q+1}, \dots, x_d, t)\big)\,d\bar{x}_q
    \right] dt, \tag{2.2}
\]

where $c = [x_1^1, x_1^2] \times \cdots \times [x_d^1, x_d^2] \subseteq \mathbb{R}^d$ is an arbitrary cell, $t_1 < t_2$ are arbitrary times, $\bar{x}_q = (x_1, \dots, x_{q-1}, x_{q+1}, \dots, x_d)$ and $\bar{c}_q^i = [x_1^1, x_1^2] \times \cdots \times [x_{q-1}^1, x_{q-1}^2] \times \{x_q^i\} \times [x_{q+1}^1, x_{q+1}^2] \times \cdots \times [x_d^1, x_d^2]$ is a $(d-1)$-dimensional cell interface, for $1 \le q \le d$ and $i = 1, 2$. The integral form is much more general than the differential form (2.1); the latter implies the former, but the converse is only true for


smooth functions. In practice the solution $u$ is not smooth in general, and only the integral form is valid in this case. Fortunately, a mixed formulation, where the differential form is used wherever $u$ and $f^q(u)$ are smooth and additional conditions are given for the zones where discontinuities appear, can still be used (see section 2.2). The problem is usually stated as a Cauchy problem, i.e., to find the state of the system after a certain time $t = T$, given the state at time $t = 0$. Equation (2.1) is thus augmented with initial conditions

\[
  u(x, 0) = u_0(x), \qquad x \in \mathbb{R}^d. \tag{2.3}
\]
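The meaning of the integral form can be illustrated numerically in one space dimension: if every cell is updated only through the fluxes at its two interfaces, the total amount of the conserved quantity cannot change on a periodic domain, even for discontinuous initial data. The sketch below assumes linear advection with flux $f(u) = a u$, $a > 0$, and a first order upwind flux; these choices and the name `advect_upwind` are illustrative assumptions, not the scheme developed in this work.

```python
def advect_upwind(u, a, dx, dt, steps):
    """Advance cell averages of u_t + (a u)_x = 0, a > 0, periodic boundaries.

    Each cell changes only through the difference of the fluxes at its two
    interfaces, which is the discrete counterpart of the integral form.
    """
    u = list(u)
    n = len(u)
    for _ in range(steps):
        # flux[i]: upwind flux through the left interface of cell i
        # (u[-1] wraps around, which implements the periodic boundary)
        flux = [a * u[i - 1] for i in range(n)]
        u = [u[i] - dt / dx * (flux[(i + 1) % n] - flux[i]) for i in range(n)]
    return u


n = 50
dx = 1.0 / n
a = 1.0
dt = 0.5 * dx / a  # time step chosen well inside the CFL limit
# discontinuous initial data u0: a square pulse
u0 = [1.0 if 0.25 <= (i + 0.5) * dx <= 0.5 else 0.0 for i in range(n)]
u1 = advect_upwind(u0, a, dx, dt, steps=40)

mass0 = dx * sum(u0)
mass1 = dx * sum(u1)
print(mass0, mass1)  # equal up to rounding errors
```

The interface fluxes cancel in pairs when the update is summed over all cells, which is why the total is preserved regardless of how smeared the pulse becomes.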

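Boundary conditions also have to be specified when considering a bounded domain in $\mathbb{R}^d$. System (2.1) can be written in quasi-linear form as:

\[
  \frac{\partial u}{\partial t} + \sum_{q=1}^{d} \frac{\partial f^q}{\partial u}\,\frac{\partial u}{\partial x_q} = 0,
  \qquad x \in \mathbb{R}^d, \quad t \in \mathbb{R}^+.
\]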
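As a one-line worked example of the quasi-linear form, consider the scalar case $d = m = 1$ with the flux $f(u) = u^2/2$ (the inviscid Burgers equation, chosen here only for illustration):

```latex
\[
  \frac{\partial u}{\partial t} + \frac{\partial}{\partial x}\Big(\frac{u^2}{2}\Big) = 0
  \quad\Longrightarrow\quad
  \frac{\partial f}{\partial u} = u
  \quad\Longrightarrow\quad
  \frac{\partial u}{\partial t} + u\,\frac{\partial u}{\partial x} = 0 .
\]
```

The two forms agree only where $u$ is differentiable, since the chain rule is used to pass from one to the other.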

The matrices

\[
  A_q \equiv A_q(u) \equiv \frac{\partial f^q}{\partial u}
\]

are called the Jacobian matrices of the system. System (2.1) is said to be hyperbolic if any combination

\[
  \sum_{q=1}^{d} \xi_q A_q, \qquad \xi_q \in \mathbb{R},
\]

has $m$ real eigenvalues and a complete set of eigenvectors (each eigenvalue is repeated as many times as its multiplicity indicates). If the eigenvalues of the Jacobian matrix are all distinct, the system is said to be strictly hyperbolic. For any $u$ the Jacobian matrices can thus be diagonalized as $A_q = R_q \Lambda_q R_q^{-1}$, where $\Lambda_q$ is a diagonal matrix whose entries are the matrix eigenvalues,

\[
  \Lambda_q = \mathrm{diag}(\lambda_1^q, \dots, \lambda_m^q), \tag{2.4}
\]

and $R_q$ is the matrix whose column vectors are the (corresponding) right eigenvectors of $A_q$,

\[
  R_q = [\, r_1^q \,|\, \cdots \,|\, r_m^q \,]. \tag{2.5}
\]
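The definition can be checked by hand for small systems. The following sketch tests the hyperbolicity requirement (real eigenvalues and a complete set of right eigenvectors) for $m = 2$ using the closed-form eigenvalues of a 2×2 matrix. The helper names `eigen_2x2` and `combination` and the sample matrices are hypothetical illustrations; they are not taken from the text.

```python
import math


def eigen_2x2(a):
    """Eigen-decomposition of a real 2x2 matrix given as nested tuples.

    Returns (eigenvalues, R) with the right eigenvectors as columns of R,
    or None if the eigenvalues are complex or the eigenvectors do not
    form a complete set (so the matrix cannot be written as R L R^-1).
    """
    (a11, a12), (a21, a22) = a
    tr = a11 + a22
    det = a11 * a22 - a12 * a21
    disc = tr * tr - 4.0 * det
    if disc < 0.0:
        return None                      # complex eigenvalues
    if disc == 0.0:                      # exact test: fine for a sketch only
        # a repeated eigenvalue is admissible only for multiples of I
        if a12 == 0.0 and a21 == 0.0:
            return (a11, a11), ((1.0, 0.0), (0.0, 1.0))
        return None                      # defective matrix
    root = math.sqrt(disc)
    lams = ((tr - root) / 2.0, (tr + root) / 2.0)
    cols = []
    for lam in lams:
        if a12 != 0.0:
            cols.append((a12, lam - a11))
        elif a21 != 0.0:
            cols.append((lam - a22, a21))
        else:                            # diagonal matrix, distinct entries
            first = abs(lam - a11) <= abs(lam - a22)
            cols.append((1.0, 0.0) if first else (0.0, 1.0))
    r = ((cols[0][0], cols[1][0]), (cols[0][1], cols[1][1]))
    return lams, r


def combination(xi, jacobians):
    """Return xi_1 A_1 + ... + xi_d A_d for a list of 2x2 matrices."""
    return tuple(
        tuple(sum(x * a[i][j] for x, a in zip(xi, jacobians)) for j in range(2))
        for i in range(2)
    )


wave = ((0.0, 4.0), (1.0, 0.0))       # real, distinct eigenvalues
rotation = ((0.0, 1.0), (-1.0, 0.0))  # complex eigenvalues
jordan = ((1.0, 1.0), (0.0, 1.0))     # real eigenvalue, one eigenvector only

print(eigen_2x2(wave)[0])             # (-2.0, 2.0)
print(eigen_2x2(rotation), eigen_2x2(jordan))  # None None
```

For a system in $d$ dimensions the same test would be applied to `combination(xi, [A_1, ..., A_d])` for the directions `xi` of interest; a numerical check over sampled directions is of course only an indication, not a proof.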


Hyperbolicity is necessary. As will be illustrated throughout the text, the solution of simple linear hyperbolic problems (e.g. Riemann problems) is composed of $m$ simple waves moving independently (see, in particular, section 2.3.2). For such solutions to exist the system has to be hyperbolic (see [105] for a simple proof). For nonlinear systems the above argument applies at least locally, so hyperbolicity is also necessary for nonlinear systems. For smooth solutions of 1D linear systems, well-posedness of the system in $C^\infty$ also requires hyperbolicity (see the works of Lax [97] and Mizohata [125]). For a nonlinear system in $\mathbb{R}^d$ the necessity of hyperbolicity can be seen by considering an initial value problem for the system written in quasi-linear form, with initial data that vary only in the direction given by $\xi = (\xi_1, \dots, \xi_d) \in \mathbb{R}^d$:

$$\begin{cases} \dfrac{\partial u}{\partial t} + \displaystyle\sum_{q=1}^{d} A_q(u) \dfrac{\partial u}{\partial x_q} = 0, \\[2mm] u(x, 0) = u_0(\xi \cdot x), \qquad \xi = (\xi_1, \dots, \xi_d) \in \mathbb{R}^d. \end{cases} \tag{2.6}$$

System (2.6) is well-posed in $C^\infty$ if and only if for any function $u_0 : \mathbb{R} \to \mathbb{R}^m$, $u_0 \in C^\infty$, the combination $\sum_{q=1}^{d} \xi_q A_q(u_0)$ is diagonalizable with real eigenvalues.
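The definition of hyperbolicity can be probed numerically at a fixed state. The following is a minimal sketch, assuming NumPy is available; the function name `is_hyperbolic`, the sample matrices and the tolerance are illustrative choices, not part of the original text. It checks that a given combination $\sum_q \xi_q A_q$ has real eigenvalues and that the computed eigenvectors span $\mathbb{R}^m$. The rank test is a floating-point heuristic, so borderline cases should be examined analytically.

```python
import numpy as np

def is_hyperbolic(jacobians, xi, tol=1e-8):
    """Check whether sum_q xi[q] * jacobians[q] is diagonalizable with real eigenvalues.

    jacobians: list of (m, m) arrays A_q evaluated at one fixed state u.
    xi:        list of real coefficients xi_q, one per Jacobian.
    """
    combo = sum(x * np.asarray(A, dtype=float) for x, A in zip(xi, jacobians))
    eigvals, eigvecs = np.linalg.eig(combo)
    real_spectrum = np.all(np.abs(np.imag(eigvals)) <= tol)
    # A complete set of eigenvectors means the eigenvector matrix has full rank.
    complete = np.linalg.matrix_rank(eigvecs, tol=tol) == combo.shape[0]
    return bool(real_spectrum and complete)

# Illustrative matrices (assumptions made for this example only).
A1 = np.array([[0.0, 1.0], [1.0, 0.0]])
A2 = np.array([[1.0, 0.0], [0.0, -1.0]])
rotation = np.array([[0.0, 1.0], [-1.0, 0.0]])  # purely imaginary eigenvalues
jordan = np.array([[1.0, 1.0], [0.0, 1.0]])     # real eigenvalue, single eigenvector

print(is_hyperbolic([A1, A2], [0.3, 0.7]))
print(is_hyperbolic([rotation], [1.0]))
print(is_hyperbolic([jordan], [1.0]))
```

A single call only tests one direction $\xi$ and one state $u$; the definition requires the property for every real combination, so in practice one would sample several directions and states or argue from the closed-form eigenstructure.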

2.2 Properties of hyperbolic conservation laws

In this section we study some important qualitative properties of hyperbolic equations and, more precisely, of hyperbolic systems of conservation laws. We mainly review the fact that these systems can develop discontinuous solutions, even if smooth initial data are provided. Discontinuous solutions lead to the concept of weak solution, introduced in section 2.2.2. We then briefly explore the spectral structure of such systems, and we show how it helps in understanding the nature of the equations, and hence in developing numerical methods for their approximate solution.

2.2.1 Discontinuous solutions

In section 2.2.4 we will show how to exploit the possibility of diagonalizing the Jacobian matrix in a more general context. In this section we only aim to show, by means of simple examples, that hyperbolic equations can develop discontinuities in their solutions, and how these discontinuities can be treated using the spectral information contained in the Jacobian matrices. Consider a Cauchy problem for a one-dimensional hyperbolic scalar equation of the form

$$\begin{cases} u_t + f(u)_x = 0, & x \in \mathbb{R}, \ t \in \mathbb{R}^+, \\ u(x, 0) = u_0(x), & x \in \mathbb{R}, \end{cases}$$

or in quasi-linear form:

$$\begin{cases} u_t + f'(u) u_x = 0, & x \in \mathbb{R}, \ t \in \mathbb{R}^+, \\ u(x, 0) = u_0(x), & x \in \mathbb{R}, \end{cases} \tag{2.7}$$

where we have used the notation $u_t = \frac{\partial u}{\partial t}$ and $f(u)_x = \frac{\partial f(u)}{\partial x}$. Let $x(t)$ be a parameterized curve in the $(x, t)$ plane verifying the ordinary differential equation

$$x'(t) = f'(u(x(t), t)). \tag{2.8}$$

For such a curve it holds

$$\frac{d}{dt} u(x(t), t) = u_t + u_x x'(t) = u_t + f'(u) u_x = 0,$$

i.e., the solution $u$ is constant along the curve $x(t)$ as time varies. Such a curve is called a characteristic curve of equation (2.7). As $u$ is constant along characteristics, by (2.8) so is $x'(t)$. The characteristic curves are therefore given by $x(t) = f'(u) t + C$, i.e., for scalar equations the characteristics are straight lines in the $(x, t)$ plane, travelled with speed $f'(u)$. The solution at a given point $(x, t_1)$, with $t_1 > 0$, can in principle be obtained from the initial data by tracing the characteristic that passes through the point back to time $t = 0$.

The derivation made so far assumes smooth solutions and fluxes, but in general this is not the case: as time evolves, characteristic curves can cross in the $(x, t)$ plane. At a point where two different characteristics cross, the solution would take two different values, given by the initial data at two different spatial locations. Since multi-valued solutions are not physically admissible in fluid dynamics, only one value is possible. The physically correct solution is often composed of a jump discontinuity, located at the crossing point, that propagates in time. This situation corresponds to the formation of a shock wave. Another possibility is that no characteristic passes through the given point $(x, t_1)$. In this case the solution at that point cannot be defined by


means of characteristics, and some information that was not present in the initial data has to be incorporated to build a feasible solution. In gas dynamics, this situation corresponds to the formation of an expansion wave, where the gas is being rarefied, and is therefore commonly called a rarefaction wave.

For linear equations the slope of the characteristics is constant, and thus they are parallel. In this case the formation of a shock or rarefaction wave is impossible. Another kind of wave, called a contact discontinuity, is typical of linear equations with jump discontinuities in the initial data, and is characterized by the propagation of the data with constant speed. The three kinds of phenomena described above (shocks, rarefactions and contacts) represent, in a simplified form, the main typical features of the solution of hyperbolic systems of conservation laws. All these basic and intuitive ideas are extended, more formally, to hyperbolic nonlinear systems in section 2.2.4.

As an example consider a Cauchy problem for the inviscid Burgers' equation (cf. section 2.3.1):

$$\begin{cases} u_t + \left( \dfrac{u^2}{2} \right)_x = 0, & x \in \mathbb{R}, \ t \in \mathbb{R}^+, \\ u(x, 0) = u_0(x), & x \in \mathbb{R}. \end{cases} \tag{2.9}$$

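For this equation we have $f'(u) = u$. For the following initial data,

$$u_0(x) = \begin{cases} 1 & \text{if } x < \frac{1}{3}, \\ \frac{3}{2}(1 - x) & \text{if } \frac{1}{3} \le x \le \frac{2}{3}, \\ \frac{1}{2} & \text{if } x > \frac{2}{3}, \end{cases}$$

shown in the bottom left plot of Fig. 2.1, the characteristics, in the $(x, t)$ plane, are straight lines $t = mx + n$, with slopes $m$ given by

$$m = \begin{cases} 1 & \text{if } x < \frac{1}{3}, \\ \dfrac{2}{3(1 - x)} & \text{if } \frac{1}{3} \le x \le \frac{2}{3}, \\ 2 & \text{if } x > \frac{2}{3}. \end{cases}$$

These lines will cross in finite time. For example the characteristics passing through $x = \frac{1}{3}$ and $x = \frac{2}{3}$ are respectively given by $x(t) = \frac{1}{3} + t$ and $x(t) = \frac{2}{3} + \frac{t}{2}$, and cross at time $t = \frac{2}{3}$. In fact at this time all characteristics starting from points in the region $\frac{1}{3} \le x \le \frac{2}{3}$ cross at the same point (top left plot of Fig. 2.1). If we consider the following initial data, depicted in the bottom right plot of Fig. 2.1:

$$u_0(x) = \begin{cases} 1 & \text{if } x > \frac{1}{2}, \\ \frac{1}{2} & \text{if } x \le \frac{1}{2}, \end{cases}$$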

Figure 2.1: Characteristics in the (x, t) plane (top) and corresponding initial data (bottom). Left part: characteristics collide at finite time. Right part: No characteristic arrives to the shadowed zone.

then for $x > \frac{1}{2}$ the characteristic passing through a point $x_0$ is given by $x(t) = x_0 + t$ and for $x \le \frac{1}{2}$ by $x(t) = x_0 + \frac{1}{2} t$. These characteristics are sketched in the top right plot of Fig. 2.1. In this case the characteristics do not cross and no characteristic passes through the points in the region

$$\frac{1}{2} < \frac{x - \frac{1}{2}}{t} < 1,$$

which is shadowed in the figure.

In order to illustrate the concept of contact discontinuity, consider the linear advection equation (cf. section 2.3.1)

$$\begin{cases} u_t + a u_x = 0, & x \in \mathbb{R}, \ t \in \mathbb{R}^+, \\ u(x, 0) = u_0(x), & x \in \mathbb{R}, \end{cases} \tag{2.10}$$

where $u_0(x)$ is any function. In this case the characteristics are given by $x(t) = at + C$, so they are parallel lines, as shown in Fig. 2.2. If a discontinuity is present in the initial data, it is simply advected with speed $a$. No discontinuities can form from smooth initial data.

Figure 2.2: Characteristics in the (x, t) plane corresponding to the case a = 1 in (2.10)
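The crossing of characteristics in the first Burgers example of this section can be checked with a few lines of code. The sketch below is illustrative only: the names `u0_example`, `characteristic_position` and `crossing_time` are hypothetical helpers introduced here, and the code relies solely on the fact that the characteristic starting at $x_0$ is the line $x(t) = x_0 + u_0(x_0) t$.

```python
def u0_example(x):
    """Initial data of the first Burgers example: 1, then a linear ramp, then 1/2."""
    if x < 1.0 / 3.0:
        return 1.0
    if x <= 2.0 / 3.0:
        return 1.5 * (1.0 - x)
    return 0.5

def characteristic_position(x0, t, u0=u0_example):
    """Position at time t of the characteristic starting at x0 (speed f'(u0) = u0)."""
    return x0 + u0(x0) * t

def crossing_time(a, b, u0=u0_example):
    """Time at which the characteristics from a < b meet, or None if they never do."""
    speed_a, speed_b = u0(a), u0(b)
    if speed_a <= speed_b:
        return None  # the left characteristic is not faster: no crossing for t > 0
    return (b - a) / (speed_a - speed_b)

t_cross = crossing_time(1.0 / 3.0, 2.0 / 3.0)
print(t_cross, characteristic_position(0.5, t_cross))
```

Evaluating `characteristic_position` at `t_cross` for several starting points in $[\frac{1}{3}, \frac{2}{3}]$ shows them arriving at the same location, in agreement with the top left plot of Fig. 2.1.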

2.2.2 Weak solutions

In order to be able to consider non-smooth solutions the classical concept of solution, i.e., a smooth function verifying (2.1), has to be relaxed. As the integral form of the equation is more general than the differential form, the latter being obtained from the former by means of smoothness assumptions that do not hold in general, the new concept of solution can be thought of as a solution of the integral form of the conservation law. Unfortunately the integral form is quite difficult to handle: one should prove that, for a given function, equation (2.2) holds for any choice of the control volume and of the time interval. An equivalent, more convenient, form of the integral equation is provided by the theory of distributions.

Definition 1. A function $u(x, t)$ is a weak solution of (2.1) with given initial data $u(x, 0)$ if

$$\int_{\mathbb{R}^+} \int_{\mathbb{R}^d} \left[ u(x, t) \frac{\partial \phi}{\partial t}(x, t) + \sum_{q=1}^{d} f^q(u) \frac{\partial \phi}{\partial x_q} \right] dx \, dt = - \int_{\mathbb{R}^d} \phi(x, 0) u(x, 0) \, dx \tag{2.11}$$

holds for all $\phi \in C_0^1(\mathbb{R}^d \times \mathbb{R}^+)$, where $C_0^1(\mathbb{R}^d \times \mathbb{R}^+)$ is the space of continuously differentiable functions with compact support in $\mathbb{R}^d \times \mathbb{R}^+$.

It is easy to check that (2.2) and (2.11) have the same solutions. Of course strong solutions are also weak solutions, and continuously differentiable weak solutions are strong solutions. Weak solutions are often not unique, and a procedure to identify the physically correct one is needed. This is usually done by means of additional conditions imposed on the solutions of the equation, which should be satisfied in a discrete sense by the numerical method. This point is outlined in section 2.2.3.


2.2.3 Rankine-Hugoniot conditions

By applying the integral form of the conservation law on a small volume surrounding an isolated discontinuity, one can obtain the Rankine-Hugoniot conditions [143, 75]. These conditions characterize weak solutions in terms of the movement of the discontinuity, and give information about the behavior of the conserved variables across discontinuities. The derivation of the Rankine-Hugoniot conditions can be found in several sources, e.g. [70, 71, 36]. For a general conservation law the conditions read:

$$[f] \cdot n_\Sigma = s [u] \cdot n_\Sigma, \tag{2.12}$$

where $f = (f^1, \dots, f^d)$ is a matrix containing the fluxes, $u$ is the solution, $s$ is the discontinuity velocity and $n_\Sigma$ is the vector normal to the discontinuity. The notation $[\cdot]$ indicates the jump of a variable across the discontinuity. Weak solutions necessarily satisfy the Rankine-Hugoniot conditions at discontinuities. In fact it can be shown that a function $u(x, t)$ is a weak solution of (2.1) if and only if equation (2.1) holds wherever $u$ is smooth and the Rankine-Hugoniot conditions are satisfied wherever $u$ is not smooth, see e.g. [36].

Since conservation laws can have more than one weak solution (some simple examples of this fact can be found e.g. in [104]), additional conditions have to be imposed on the equations in order to select the physically correct solution, known as the entropy solution. Several conditions were derived in the fifties and sixties, the best known being those due to Oleinik [130], Lax [98], Wendroff [187] and Liu [111]. Lax's E-condition is defined in (2.16) below.
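For a scalar equation in one space dimension, condition (2.12) reduces to $f(u_L) - f(u_R) = s (u_L - u_R)$, which determines the speed of the discontinuity from the two states it separates. The following sketch illustrates this special case; the function names are hypothetical and introduced only for this example.

```python
def burgers_flux(u):
    """Flux of the inviscid Burgers' equation, f(u) = u^2 / 2."""
    return 0.5 * u * u

def rankine_hugoniot_speed(flux, u_left, u_right):
    """Speed s of a jump between u_left and u_right for a scalar 1D conservation law."""
    if u_left == u_right:
        raise ValueError("no jump: the two states coincide")
    return (flux(u_left) - flux(u_right)) / (u_left - u_right)

# Jump between the two constant states of the first Burgers example in section 2.2.1.
print(rankine_hugoniot_speed(burgers_flux, 1.0, 0.5))

# For a linear flux f(u) = a*u every jump travels with speed a.
print(rankine_hugoniot_speed(lambda u: 2.0 * u, 3.0, -1.0))
```

For Burgers' flux the quotient simplifies to the average $(u_L + u_R)/2$ of the two states.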

2.2.4 Characteristic structure of a system of conservation laws

In section 2.2.1 we have illustrated the fact that the characteristics are an extremely useful tool for the computation of solutions of hyperbolic conservation laws, since an important part of the structure of the solution is given by the characteristic curves. Roughly speaking, characteristics are the curves in the $(x, t)$ plane that carry information. The fact that information propagates along characteristics is particular to hyperbolic systems and, to some extent, serves as a design tool for numerical methods and as a way to understand the behavior of the solutions. A brief summary of the main facts about the characteristic structure of hyperbolic systems, extending the ideas presented in section 2.2.1 for scalar equations, is presented in this section. More complete studies on the topic can be found in [165, 41]. For simplicity we will restrict the study to one-dimensional problems,

$$u_t + f(u)_x = 0. \tag{2.13}$$

Since system (2.13) is hyperbolic, we can decompose the Jacobian matrix $A = f'(u)$ as $A = R \Lambda R^{-1}$, with $\Lambda$ and $R$ as in (2.4) and (2.5). From the properties of the characteristic structure of $A$, the admissible types of discontinuities in the flow solution and their properties are described next. Each column vector $r_p$ of $R$ defines a vector field $r_p : \mathbb{R}^m \to \mathbb{R}^m$, $u \mapsto r_p(u)$, called the $p$-th characteristic field. For a constant-coefficient linear system of conservation laws the characteristic information suffices to completely solve the system, as will be explained in section 2.3.2. A nonlinear system cannot be solved by the same means, but an analysis analogous to the linear case gives qualitative information about the structure of the solution and allows one to tackle the numerical solution of the system in a more convenient way. The first concept to be introduced is that of characteristic curves:

Definition 2. Given a hyperbolic system of conservation laws $u_t + f(u)_x = 0$, let $\{\lambda_p(u)\}_{p=1}^{m}$ be the eigenvalues of the Jacobian matrix $f'(u)$. We say that a curve $x = x(t)$ is a characteristic curve of the system if it is a solution of the ordinary differential equation

$$\frac{dx}{dt} = \lambda_p(u(x, t))$$

for some $p$, $1 \le p \le m$.

Note that only strictly hyperbolic systems have $m$ different characteristic curves. Characteristic curves are often interpreted in the $(x, t)$ plane. A basic fact about characteristics is that, for linear systems, the solution of the system is constant along characteristics, and these are straight lines. For nonlinear systems characteristics are no longer straight lines, nor is the solution constant along them but, for small times, it is reasonable to suppose that the behavior of the nonlinear system can be mimicked by that of a linear system, coming from some suitable linearization (cf. sections 3.6 and 4.2).


Particular types of characteristic fields of interest are presented next. The first type is that of genuinely nonlinear fields. A characteristic field defined by an eigenvector $r_p(u)$ is called genuinely nonlinear if

$$\nabla \lambda_p(u) \cdot r_p(u) \neq 0, \qquad \forall u, \tag{2.14}$$

where $\nabla \lambda_p(u)$ is the gradient of $\lambda_p(u)$. Note that for a linear system $A$ does not depend on $u$ and therefore $\lambda_p$ is constant with respect to $u$, so genuinely nonlinear fields cannot appear in linear systems and are particular to nonlinear systems. Let $u(\alpha)$ be a parameterized curve that is an integral curve of a genuinely nonlinear vector field $r_p$, i.e., a curve in phase space such that

$$\frac{du}{d\alpha} = c(\alpha) r_p(u(\alpha)),$$

for some $c(\alpha) \neq 0$. Because of the definition of genuinely nonlinear field, $\lambda_p(u)$ varies monotonically as $u$ varies along an integral curve of $r_p(u)$:

$$\frac{d\lambda_p(u(\alpha))}{d\alpha} = \nabla \lambda_p(u(\alpha)) \cdot \frac{du(\alpha)}{d\alpha} = c(\alpha) \nabla \lambda_p(u(\alpha)) \cdot r_p(u(\alpha)) \neq 0.$$

Another interesting type of characteristic fields are linearly degenerate fields, for which

$$\nabla \lambda_p(u) \cdot r_p(u) = 0, \qquad \forall u. \tag{2.15}$$

In linearly degenerate fields $\lambda_p(u)$ therefore remains constant along integral curves of $r_p(u)$ as $u$ varies, due to (2.15). These fields are a generalization of the characteristic fields of a constant-coefficient linear system, where $\nabla \lambda_p = 0$. Roughly speaking, the behavior of the solution with respect to a linearly degenerate field is similar to that of a linear system, whereas a genuinely nonlinear field implies types of discontinuous solutions that can never appear in a linear system.

Existence theory for the Cauchy problem for systems where all the characteristic fields are either linearly degenerate or genuinely nonlinear was developed by Glimm [55], using the solution of the Riemann problem found by Lax [98]. For general systems, existence of weak entropy solutions is much more difficult to establish [112].

Several types of discontinuities can appear in the solution of a nonlinear system. The presence or absence of a specific type of discontinuity can be determined in certain cases from the characteristic structure of the Jacobian matrix. More precisely, discontinuities can often be associated with a single characteristic field, and certain types of discontinuities are associated with particular types of characteristic fields.
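Conditions (2.14) and (2.15) can be explored numerically. The sketch below is an illustration under stated assumptions: it uses the eigenvalues $v_x - c$, $v_x$, $v_x + c$ and the right eigenvectors of the one-dimensional Euler equations given in section 2.3.3, a polytropic gas with $\gamma = 1.4$, and a central finite difference along $r_p$ to approximate $\nabla \lambda_p \cdot r_p$. All function names and the sample state are hypothetical.

```python
import math

GAMMA = 1.4  # specific heat ratio, assumed (air)

def _v_c_H(u):
    """Velocity, sound speed and total enthalpy from conserved u = (rho, rho*v, E)."""
    rho, mom, E = u
    v = mom / rho
    p = (GAMMA - 1.0) * (E - 0.5 * rho * v * v)
    c = math.sqrt(GAMMA * p / rho)
    return v, c, (E + p) / rho

def eigenvalue(u, k):
    """k-th eigenvalue (k = 0, 1, 2) of the 1D Euler Jacobian."""
    v, c, _ = _v_c_H(u)
    return (v - c, v, v + c)[k]

def right_eigenvector(u, k):
    """k-th right eigenvector (k = 0, 1, 2) of the 1D Euler Jacobian."""
    v, c, H = _v_c_H(u)
    return ((1.0, v - c, H - v * c),
            (1.0, v, 0.5 * v * v),
            (1.0, v + c, H + v * c))[k]

def grad_lambda_dot_r(u, k, h=1e-6):
    """Central-difference estimate of grad(lambda_k)(u) . r_k(u)."""
    r = right_eigenvector(u, k)
    plus = [ui + h * ri for ui, ri in zip(u, r)]
    minus = [ui - h * ri for ui, ri in zip(u, r)]
    return (eigenvalue(plus, k) - eigenvalue(minus, k)) / (2.0 * h)

state = (1.0, 0.5, 2.625)  # rho = 1, v = 0.5 and p close to 1 for GAMMA = 1.4
for k in range(3):
    print(k + 1, grad_lambda_dot_r(state, k))
```

In this experiment the first and third fields give values clearly away from zero (genuinely nonlinear), while the second field gives a value at rounding level (linearly degenerate). A finite-difference check at one state is only an indication, not a proof that the property holds for all $u$.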


Let us introduce some definitions in order to start a brief study of the relationships between discontinuities and characteristic fields. A discontinuity defined by $x = s(t)$ that separates two states $u_L(t)$ and $u_R(t)$ is said to be a $p$-shock, or a shock wave associated with the $p$-th characteristic field, if

$$\lambda_p(u_L) \ge s'(t) \ge \lambda_p(u_R). \tag{2.16}$$

Condition (2.16) is called Lax's E-condition [99]. It is a particular case of the entropy conditions mentioned in section 2.2.3. We will only consider shocks where the following holds:

$$\lambda_j(u_{L,R}) > \lambda_p(u_L) \ge s'(t) \ge \lambda_p(u_R) > \lambda_i(u_{L,R}), \qquad j > p > i. \tag{2.17}$$

In general a shock wave is defined as a discontinuity verifying the Rankine-Hugoniot conditions (2.12), and can involve jumps in more than one conserved variable. Conditions (2.16) and (2.17) ensure that the shock is associated with a single characteristic field and that other characteristics do not interfere, in the sense that (2.16) cannot hold for two characteristic fields at the same time. This is often the case when $\lambda_p(u)$ is a simple eigenvalue. Shocks where the E-condition (2.16) is satisfied for more than one characteristic field simultaneously will not be considered here. A $p$-contact discontinuity is a special case of a $p$-shock wave, where (2.16) holds with equalities, i.e.

$$\lambda_p(u_L) = s'(t) = \lambda_p(u_R). \tag{2.18}$$

In gas dynamics, a contact discontinuity represents the separation of two zones with different density but in pressure equilibrium, whereas a shock wave represents a discontinuity arising from an abrupt pressure change, resulting in a compression of the medium. Rarefaction waves are typical of genuinely nonlinear fields. A rarefaction wave does not involve discontinuities in the conserved variables; in gas dynamics it represents the situation in which the fluid is expanding and there is a zone where the fluid is being rarefied. Rarefactions are characterized by the condition $\lambda_p(u_L) < \lambda_p(u_R)$.

Let us now analyze the relationships between the different types of waves introduced above and the different kinds of characteristic fields. In linearly degenerate fields corresponding to simple eigenvalues, if two states $u_L$ and $u_R$ lie on the same integral curve and form a jump discontinuity, then by (2.15) these two states propagate with the same velocity, i.e. $\lambda_p(u_L) = \lambda_p(u_R)$ holds, forcing (2.18) to hold. Therefore discontinuities associated with these fields can only be contact discontinuities. These are the only type of discontinuities that can appear in the solution of linear systems. On the other hand genuinely nonlinear fields can host both shocks and rarefaction waves, depending on the left and right states and the kind of monotonicity of the variation of $\lambda_p(u)$.
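The distinction between shocks and rarefactions in a genuinely nonlinear field can be illustrated with the scalar Burgers' equation, for which $\lambda(u) = f'(u) = u$. The following sketch evaluates the entropy solution of a Riemann problem with a jump at $x = 0$. It assumes the standard self-similar form of the rarefaction fan, $u = x/t$, which is not derived in the text above, and the function name is hypothetical.

```python
def burgers_riemann(u_left, u_right, x, t):
    """Entropy solution of Burgers' equation at (x, t), t > 0.

    Initial data: u_left for x < 0 and u_right for x > 0.
    """
    if t <= 0.0:
        raise ValueError("t must be positive")
    xi = x / t  # the solution depends on x and t only through x / t
    if u_left > u_right:
        # lambda(u_left) > s > lambda(u_right): Lax's condition (2.16) holds, shock.
        s = 0.5 * (u_left + u_right)  # Rankine-Hugoniot speed for f(u) = u^2 / 2
        return u_left if xi < s else u_right
    # lambda(u_left) <= lambda(u_right): rarefaction fan between the two states.
    if xi <= u_left:
        return u_left
    if xi >= u_right:
        return u_right
    return xi

print(burgers_riemann(1.0, 0.5, 0.7, 1.0))   # behind the shock
print(burgers_riemann(0.5, 1.0, 0.75, 1.0))  # inside the rarefaction fan
```

With `u_left = 0.5` and `u_right = 1.0` the fan fills exactly the wedge $\frac{1}{2} < x/t < 1$ that no characteristic reaches in the second example of section 2.2.1, up to the shift of the jump from $x = \frac{1}{2}$ to $x = 0$.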

2.3 Model equations

In this section we introduce the main model equations used in fluid dynamics, namely the advection equation, Burgers' equation, linear hyperbolic systems and the Euler equations, as a model of nonlinear system of conservation laws. The two-component Euler equations are also introduced. All these models will be used in chapter 7 for the validation of the algorithm.

2.3.1 Scalar hyperbolic equations

In this section we will consider the advection equation and Burgers' equation, which represent two of the most studied examples of hyperbolic scalar equations. These models present many of the features of hyperbolic systems.

Advection equation

The advection equation is the simplest model of a conservation law. In one space dimension it is written as:

$$u_t + a u_x = 0, \tag{2.19}$$

where $a$ is a constant. The advection equation governs the motion of a (conserved) quantity, with density $u$, in a fluid as it is advected with constant velocity $a$. Advection with space or time-dependent velocities will not be considered here.


For any function $F : \mathbb{R} \to \mathbb{R}$, a solution of (2.19) is given by

$$u(x, t) = F(x - at). \tag{2.20}$$

This solution represents the transport of a given perturbation described by $F$ through the flow at constant speed $a$ without changing shape. If an initial condition $u(x, 0) = u_0(x)$ is given, the solution of the corresponding Cauchy problem is $u(x, t) = u_0(x - at)$. Note that even if $F$ is not continuous, (2.20) is still a weak solution of (2.19), and such a situation is a simple case of a contact discontinuity propagating with constant velocity.
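As a small illustration of (2.20), the sketch below evaluates the solution of the Cauchy problem by shifting the initial data, and shows a jump being transported as a contact discontinuity. The helper names `advect` and `step` are hypothetical.

```python
def advect(u0, a, x, t):
    """Solution u(x, t) = u0(x - a*t) of u_t + a*u_x = 0 with initial data u0."""
    return u0(x - a * t)

def step(x):
    """Discontinuous initial data with a jump located at x = 0."""
    return 1.0 if x < 0.0 else 0.0

a = 2.0
# At time t = 1 the jump has moved from x = 0 to x = a*t = 2.
print(advect(step, a, 1.5, 1.0), advect(step, a, 2.5, 1.0))
```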

Inviscid Burgers' equation

The inviscid Burgers' equation is defined by:

$$u_t + \left( \frac{u^2}{2} \right)_x = 0. \tag{2.21}$$

This equation is the inviscid version of the viscous equation $u_t + \left( \frac{u^2}{2} \right)_x = \epsilon u_{xx}$, with $\epsilon > 0$, studied by Burgers [30]. When written in quasi-linear form, $u_t + u u_x = 0$, the equation is similar to the advection equation, but with the particularity that the speed of propagation, given by $f'(u) = u$, is no longer constant, but depends on the solution itself. Despite this resemblance, the behavior of the solution of this equation is completely different from that of the advection equation. Here $u$ is not simply advected as time evolves, but can also be compressed or rarefied. Shocks and rarefaction waves typically appear in the solution of this equation, see section 2.2.1.

2.3.2 Linear hyperbolic systems

Linear systems represent a generalization to several variables of the scalar advection equation (2.19). In this section we will study the main properties of linear hyperbolic systems, in particular the solution of the system through a change of variables. Most of the knowledge acquired from linear systems will be exported to nonlinear systems, since numerical methods for nonlinear systems are constructed under the assumption that nonlinear systems behave locally like linear systems. A linear hyperbolic system is a particular case of the PDE (2.1) where the flux function $f(u)$ depends linearly on $u$, i.e., it can be written as $f(u) = Au$, where $A$ is an $m \times m$ constant-coefficient matrix. The equation thus reads for this case:

$$u_t + A u_x = 0. \tag{2.22}$$

Hyperbolicity of the system means that the matrix $A$ has $m$ real eigenvalues $\lambda_1, \dots, \lambda_m$ and $m$ linearly independent (right) eigenvectors $r_1, \dots, r_m$. This is equivalent to saying that the matrix $A$ is diagonalizable with real eigenvalues, i.e., it can be expressed as $A = R \Lambda R^{-1}$, where $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_m)$, with $\lambda_p \in \mathbb{R}$, and $R = [r_1, \dots, r_m]$, $r_p \in \mathbb{R}^m$. The introduction of the change of basis given by the matrix $R$ produces a new linear system that is diagonal, and can hence be solved as $m$ (decoupled) advection equations, whose solution is known, see section 2.3.1. By applying the inverse change of basis to the solution of the diagonal system one obtains the general solution of the linear system. The variables $u$, when expressed in the basis given by $R$, are called the characteristic variables, as stated in the next definition.

Definition 3. Given a hyperbolic linear system, with matrix $A = R \Lambda R^{-1}$, the characteristic variables $w = [w_1, \dots, w_m]^T$ of the system are defined by $w = R^{-1} u$.

With these variables we can write equation (2.22) as:

$$R w_t + R \Lambda R^{-1} R w_x = 0 \ \Leftrightarrow \ R w_t + R \Lambda w_x = 0 \ \Leftrightarrow \ w_t + \Lambda w_x = 0. \tag{2.23}$$

The equation $w_t + \Lambda w_x = 0$ is called the characteristic form of the linear system (2.22), and is a diagonal linear system of PDEs that can be written in expanded form as:

$$\begin{pmatrix} w_1 \\ \vdots \\ w_m \end{pmatrix}_t + \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_m \end{pmatrix} \cdot \begin{pmatrix} w_1 \\ \vdots \\ w_m \end{pmatrix}_x = 0. \tag{2.24}$$


Each row in (2.24) reads

$$\frac{\partial w_p}{\partial t} + \lambda_p \frac{\partial w_p}{\partial x} = 0,$$

which is nothing but an advection equation with constant velocity $\lambda_p$. Given initial data $u(x, 0) = u_0(x)$ for (2.22), the solution $w$ of (2.24) is given by

$$w_p(x, t) = w_p^0(x - \lambda_p t),$$

where $w_p^0$ is the $p$-th component of $w^0 = R^{-1} u_0$. By taking the inverse change of basis the solution of the original linear system (2.22) is obtained as $u = R \cdot w$, or in expanded form:

$$u(x, t) = \sum_{p=1}^{m} w_p(x - \lambda_p t, 0) \, r_p.$$
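The solution procedure just described can be sketched in a few lines. The code below is illustrative only: the function `solve_linear_system` and the sample $2 \times 2$ system are assumptions made for this example, and the caller is expected to supply the eigenvalues, the matrix $R$ of right eigenvectors (stored as rows of a list, with eigenvectors as columns) and its inverse $L = R^{-1}$.

```python
def solve_linear_system(lam, R, L, u0, x, t):
    """Evaluate u(x, t) = sum_p w_p^0(x - lam[p]*t) r_p for u_t + A u_x = 0.

    lam: eigenvalues of A; R: right eigenvectors as columns; L: inverse of R.
    u0:  function returning the initial state (sequence of length m) at a point.
    """
    m = len(lam)
    u = [0.0] * m
    for p in range(m):
        # Initial state at the foot of the p-th characteristic through (x, t).
        u_foot = u0(x - lam[p] * t)
        # p-th characteristic variable: p-th row of L applied to that state.
        w_p = sum(L[p][k] * u_foot[k] for k in range(m))
        for k in range(m):
            u[k] += w_p * R[k][p]
    return u

# Sample system (assumption): A = [[0, 1], [1, 0]], eigenvalues -1 and +1.
lam = [-1.0, 1.0]
R = [[1.0, 1.0],
     [-1.0, 1.0]]
L = [[0.5, -0.5],
     [0.5, 0.5]]

def u0(x):
    """Jump in the first component located at x = 0."""
    return (1.0, 0.0) if x < 0.0 else (0.0, 0.0)

print(solve_linear_system(lam, R, L, u0, 0.5, 1.0))
```

The initial jump splits into two discontinuities, one travelling with each eigenvalue, each carrying the part of the jump that lies along the corresponding eigenvector.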

2.3.3 Nonlinear hyperbolic systems

Nonlinear hyperbolic systems merge together two already presented models, namely nonlinear scalar equations, exemplified by Burgers' equation (2.21), and linear systems, introduced in section 2.3.2. Within this section we introduce the basics of two model equations, the Euler equations and the two-component Euler equations, that will be used as test problems and reference models throughout the text.

Euler equations

The Euler equations model the dynamics of a Newtonian, ideal, inviscid fluid. They are derived from the conservation of mass, linear momentum and energy in the fluid as it moves, and represent a simplified model for the Navier-Stokes equations, which are the most complete model used up to now for the simulation of fluid dynamics. The two-dimensional Euler equations can be written as:

$$u_t + f(u)_x + g(u)_y = 0 \tag{2.25}$$

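with

$$u = \begin{pmatrix} \rho \\ \rho v_x \\ \rho v_y \\ E \end{pmatrix}, \qquad f(u) = \begin{pmatrix} \rho v_x \\ \rho v_x^2 + p \\ \rho v_x v_y \\ v_x (E + p) \end{pmatrix}, \qquad g(u) = \begin{pmatrix} \rho v_y \\ \rho v_y v_x \\ \rho v_y^2 + p \\ v_y (E + p) \end{pmatrix}, \tag{2.26}$$

where $\rho$ denotes density, $v_x$ and $v_y$ are the Cartesian components of the velocity vector $v$, $E$ is energy and $p$ is pressure. The one-dimensional version of the equations is obtained by retaining the first two terms in the left hand side of (2.25) and canceling out the third row of $u$ and $f(u)$, to get

$$\begin{pmatrix} \rho \\ \rho v_x \\ E \end{pmatrix}_t + \begin{pmatrix} \rho v_x \\ \rho v_x^2 + p \\ v_x (E + p) \end{pmatrix}_x = 0. \tag{2.27}$$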

The Euler equations in $d$ space dimensions compose a system of $d + 2$ equations. Note that in the definition of the equations we use $d + 3$ variables, namely density, $d$ velocity components, total energy and pressure. To close the system we need to specify an additional relation linking all these variables. This relation is called Equation Of State (EOS), and depends on the type of fluid under consideration. We will consider a special class of fluids called ideal fluids or ideal gases. An ideal gas can be defined as a gas in which all collisions between atoms or molecules are perfectly elastic and in which there are no intermolecular attractive forces. Ideal gases are characterized by three state variables: absolute pressure ($p$), density ($\rho$), and absolute temperature ($T$). The relationship between them may be deduced from kinetic theory and is called the Ideal Gas Law: $p = \rho R T$, where $R = 8.314472 \ \mathrm{J \cdot K^{-1} \cdot mol^{-1}}$ is the Universal Gas Constant. Total energy can be decomposed into kinetic energy plus internal energy as follows:

$$E = \underbrace{\tfrac{1}{2} \rho \|v\|_2^2}_{\text{kinetic energy}} + \underbrace{\rho e}_{\text{internal energy}}, \tag{2.28}$$

where $e$ denotes the specific internal energy. Kinetic energy is due to the advection of the flow, whereas internal energy is the result of other forms of energy. It is often assumed that internal energy is a known function of pressure and density, $e = e(p, \rho)$.


For an ideal gas, internal energy is a function of temperature alone, i.e., a function of $p / \rho$. A further simplification is to suppose that it is proportional to temperature,

$$e = c_v T, \tag{2.29}$$

where $c_v$ is called the specific heat at constant volume. Ideal gases where (2.29) holds are called polytropic gases. In general the term specific heat defines the amount of heat required to change a unit property of a substance by one degree in temperature. For polytropic gases, $c_v$ is the specific heat for internal energy (i.e., the increment of internal energy depends linearly on temperature by the factor $c_v$, when the gas volume is held fixed). If the volume is allowed to expand and the gas pressure is forced to remain constant, the increment in internal energy does not depend linearly on temperature anymore, because some energy is used to expand the volume. The quantity that depends linearly on $T$ is called specific enthalpy, denoted by $h$, and is defined by:

$$h = e + \frac{p}{\rho}. \tag{2.30}$$

For polytropic gases the following holds:

$$h = c_p T. \tag{2.31}$$

The constant $c_p$ is called specific heat at constant pressure. The quantity

$$\gamma = \frac{c_p}{c_v} \tag{2.32}$$

is called the specific heat ratio, and is just a number that depends on the gas. For air it takes the value $\gamma \approx 1.4$. Using equations (2.29) to (2.32), one has $c_p T = c_v T + p / \rho$, hence $(\gamma - 1) e = p / \rho$, so that the internal energy can be expressed in terms of pressure and density as

$$e = \frac{p}{\rho (\gamma - 1)}, \tag{2.33}$$

which is another form of the equation of state for a perfect gas. Substituting (2.33) into (2.28) gives

$$E = \frac{1}{2} \rho \|v\|_2^2 + \frac{p}{\gamma - 1}.$$
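The last relation is what allows one to move between the conserved variables and the pressure in practice. The following is a minimal sketch of both directions for a polytropic gas; the function names and the default value $\gamma = 1.4$ are assumptions made for illustration.

```python
GAMMA = 1.4  # specific heat ratio, assumed (air)

def total_energy(rho, velocity, p, gamma=GAMMA):
    """E = rho*|v|^2/2 + p/(gamma - 1) for a polytropic ideal gas."""
    kinetic = 0.5 * rho * sum(v * v for v in velocity)
    return kinetic + p / (gamma - 1.0)

def pressure(rho, momentum, E, gamma=GAMMA):
    """Invert the relation above: p = (gamma - 1)*(E - |rho*v|^2 / (2*rho))."""
    kinetic = 0.5 * sum(m * m for m in momentum) / rho
    return (gamma - 1.0) * (E - kinetic)

rho, velocity, p = 1.0, (0.5, 0.0), 1.0
E = total_energy(rho, velocity, p)
momentum = tuple(rho * v for v in velocity)
print(E, pressure(rho, momentum, E))
```

Working with momentum rather than velocity in `pressure` mirrors the fact that the unknowns of (2.26) are $\rho$, $\rho v$ and $E$.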

Some common gases and fluids can be considered, to a good approximation, polytropic. Examples are air, helium, carbon dioxide or even water. For real fluids the ratio of specific heats is not constant and also varies with temperature, but very often the variations are small enough to be neglected (e.g. $2 \cdot 10^{-3}$ for dry air between $0\,^\circ\mathrm{C}$ and $100\,^\circ\mathrm{C}$, or $2 \cdot 10^{-2}$ for water between $20\,^\circ\mathrm{C}$ and $200\,^\circ\mathrm{C}$), and thus these fluids are in practice considered polytropic ideal gases.

An essential physical quantity is entropy, denoted by $S = S(\rho, e)$. Entropy was first rigorously introduced by Rudolf Clausius in 1864 [37]. He actually defined the concept of change in entropy as the ratio between the heat transferred to a system and its absolute temperature. As the change in total internal energy can be expressed as the work done on the system, denoted by $dW$, plus the heat transmitted to the system, denoted by $dQ$, we can write $dQ = de - dW$. The amount of work done on the system can be taken as $dW = -p \, dv$, where $v = \frac{1}{\rho}$ is the specific volume, leading to $dQ = de + p \, dv$. The change in entropy is written as:

$$dS = \frac{dQ}{T} = \frac{de + p \, dv}{T}. \tag{2.34}$$

Integrating (2.34) one gets the following expression for the entropy, valid for a polytropic gas:

$$S = c_v \ln \left( \frac{p}{\rho^\gamma} \right) + C_1,$$

where $C_1$ is a constant, or equivalently

$$p = C_2 \, e^{S / c_v} \rho^\gamma, \tag{2.35}$$

with $C_2$ constant. The following equation holds for the entropy (we indicate the 1-D version):

$$\frac{\partial S}{\partial t} + v_x \frac{\partial S}{\partial x} = 0. \tag{2.36}$$

Note that

$$\frac{dS}{dt} = \frac{\partial S}{\partial t} + \frac{\partial S}{\partial x} \frac{dx}{dt},$$

therefore along particle paths $v_x = \frac{dx}{dt}$ it holds $\frac{dS}{dt} = 0$, i.e., entropy remains constant as particles move, as long as the flow solution is smooth. Along particle paths equation (2.35) reduces to the isentropic law

$$p = C_3 \rho^\gamma, \tag{2.37}$$


where $C_3$ is a constant that depends only on the initial entropy of the particle. Gases where entropy is constant everywhere are called isentropic gases. Such an assumption is often made when no shock waves are present in the gas. The quantity

$$c = \sqrt{ \left. \frac{\partial p}{\partial \rho} \right|_{S = \text{constant}} }$$

is called the local speed of sound in the gas. Clearly, for isentropic gases, the local speed of sound can be written as:

$$c = \sqrt{C_3 \gamma \rho^{\gamma - 1}} = \sqrt{\frac{\gamma C_3 \rho^\gamma}{\rho}} = \sqrt{\frac{\gamma p}{\rho}}. \tag{2.38}$$

Note that the system of the Euler equations is hyperbolic if and only if the quantity

$$\left. \frac{\partial p}{\partial \rho} \right|_{S = \text{constant}}$$

is positive, as holds, for example, for polytropic gases. A quantity related to the speed of sound is the Mach number, introduced by Ernst Mach, and simply defined as the ratio between the fluid velocity and the local speed of sound:

$$M = \frac{\|v\|_2}{c}.$$

The local speed of sound describes the speed of acoustic waves passing through the medium at rest and plays a central role in the quantitative behavior of the fluid motion. Flows are often classified as subsonic ($M < 1$) and supersonic ($M > 1$). A more adequate classification for practical purposes includes transonic flows, in which the Mach number lies in a range near one ($0.8 \le M \le 1.2$ is commonly taken) and both subsonic and supersonic regions exist, and hypersonic flows ($M > 5$).

Because our numerical method needs the full spectral decomposition of the Jacobian matrices (see chapter 4), we compute next the required expressions for the Euler equations. For the two-dimensional equations (2.25) the Jacobian matrix can be written as:

$$f'(u) = \begin{pmatrix} 0 & 1 & 0 & 0 \\ \frac{1}{2} (\gamma - 1)(v_x^2 + v_y^2) - v_x^2 & v_x (3 - \gamma) & v_y (1 - \gamma) & \gamma - 1 \\ -v_x v_y & v_y & v_x & 0 \\ v_x \left( \frac{1}{2} (\gamma - 1)(v_x^2 + v_y^2) - H \right) & H + (1 - \gamma) v_x^2 & (1 - \gamma) v_x v_y & \gamma v_x \end{pmatrix},$$


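with eigenvalues $\lambda_1 = v_x - c$, $\lambda_2 = \lambda_3 = v_x$, $\lambda_4 = v_x + c$, and the matrices $R$ and $L = R^{-1}$ given by:
\[
R = \begin{pmatrix}
1 & 1 & 0 & 1 \\
v_x - c & v_x & 0 & v_x + c \\
v_y & v_y & 1 & v_y \\
H - v_x c & \frac{v_x^2+v_y^2}{2} & v_y & H + v_x c
\end{pmatrix} \tag{2.39}
\]
and
\[
L = \begin{pmatrix}
\frac{(\gamma-1)(v_x^2+v_y^2)+2v_x c}{4c^2} & -\frac{v_x(\gamma-1)+c}{2c^2} & -\frac{v_y(\gamma-1)}{2c^2} & \frac{\gamma-1}{2c^2} \\
1-\frac{(\gamma-1)(v_x^2+v_y^2)}{2c^2} & \frac{v_x(\gamma-1)}{c^2} & \frac{v_y(\gamma-1)}{c^2} & -\frac{\gamma-1}{c^2} \\
-v_y & 0 & 1 & 0 \\
\frac{(\gamma-1)(v_x^2+v_y^2)-2v_x c}{4c^2} & -\frac{v_x(\gamma-1)-c}{2c^2} & -\frac{v_y(\gamma-1)}{2c^2} & \frac{\gamma-1}{2c^2}
\end{pmatrix}. \tag{2.40}
\]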

In (2.39), $H$ represents the total enthalpy, given by
\[
H = \frac{E+p}{\rho} = \frac{1}{2}\|v\|^2 + \frac{c^2}{\gamma-1} = \frac{1}{2}\|v\|^2 + h. \tag{2.41}
\]
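Since the numerical method relies on this spectral decomposition, it can be useful to check it numerically. The sketch below assembles $f'(u)$, $R$ and $L$ for the two-dimensional Euler equations from given values of $v_x$, $v_y$ and $c$, so that one can verify $LR = I$ and $R\Lambda L = f'(u)$. The function name, the default $\gamma = 1.4$ and the shorthands `b1` $= (\gamma-1)/(2c^2)$ and `b2` $= $ `b1`$(v_x^2+v_y^2)/2$ are assumptions introduced only to write the entries of (2.40) compactly.

```python
import numpy as np

def euler2d_eigensystem(vx, vy, c, gamma=1.4):
    """Return (A, lam, R, L) for the x-flux Jacobian of the 2D Euler equations."""
    g1 = gamma - 1.0
    q2 = vx**2 + vy**2
    H = 0.5 * q2 + c**2 / g1                 # total enthalpy, cf. (2.41)
    # Jacobian matrix f'(u)
    A = np.array([
        [0.0, 1.0, 0.0, 0.0],
        [0.5 * g1 * q2 - vx**2, (3.0 - gamma) * vx, -g1 * vy, g1],
        [-vx * vy, vy, vx, 0.0],
        [vx * (0.5 * g1 * q2 - H), H - g1 * vx**2, -g1 * vx * vy, gamma * vx],
    ])
    lam = np.array([vx - c, vx, vx, vx + c])  # eigenvalues
    # Right eigenvectors as columns, cf. (2.39)
    R = np.array([
        [1.0, 1.0, 0.0, 1.0],
        [vx - c, vx, 0.0, vx + c],
        [vy, vy, 1.0, vy],
        [H - vx * c, 0.5 * q2, vy, H + vx * c],
    ])
    # Left eigenvectors as rows, cf. (2.40), written with the shorthands b1, b2
    b1 = g1 / (2.0 * c**2)
    b2 = 0.5 * b1 * q2
    L = np.array([
        [b2 + vx / (2 * c), -b1 * vx - 1 / (2 * c), -b1 * vy, b1],
        [1.0 - 2.0 * b2, 2.0 * b1 * vx, 2.0 * b1 * vy, -2.0 * b1],
        [-vy, 0.0, 1.0, 0.0],
        [b2 - vx / (2 * c), -b1 * vx + 1 / (2 * c), -b1 * vy, b1],
    ])
    return A, lam, R, L
```

A typical check for an arbitrary state would be `np.allclose(L @ R, np.eye(4))` and `np.allclose(R @ np.diag(lam) @ L, A)`.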

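The eigenstructure of $g'(u)$ is obtained by interchanging the roles of $v_x$ and $v_y$, and the second and third components of each left and right eigenvector (see [71]). The one-dimensional versions of these matrices are obtained by removing the third row and column from the two-dimensional versions and setting $v_y = 0$:
\[
f'(u) = \begin{pmatrix}
0 & 1 & 0 \\
\frac{1}{2}(\gamma-3)v_x^2 & (3-\gamma)v_x & \gamma-1 \\
\frac{1}{2}(\gamma-2)v_x^3 - \frac{c^2 v_x}{\gamma-1} & \frac{3-2\gamma}{2}v_x^2 + \frac{c^2}{\gamma-1} & \gamma v_x
\end{pmatrix},
\]
\[
\Lambda = \begin{pmatrix}
v_x - c & 0 & 0 \\
0 & v_x & 0 \\
0 & 0 & v_x + c
\end{pmatrix},
\qquad
R = \begin{pmatrix}
1 & 1 & 1 \\
v_x - c & v_x & v_x + c \\
H - v_x c & \frac{1}{2}v_x^2 & H + v_x c
\end{pmatrix}.
\]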


The inverse of the matrix $R$ is the matrix whose rows are the left eigenvectors:
\[
L = \begin{pmatrix}
\frac{v_x(v_x^2 - v_x c - 2H)}{2c(v_x^2-2H)} & -\frac{v_x^2 - 2v_x c - 2H}{2c(v_x^2-2H)} & -\frac{1}{v_x^2-2H} \\
\frac{2(v_x^2-H)}{v_x^2-2H} & -\frac{2v_x}{v_x^2-2H} & \frac{2}{v_x^2-2H} \\
-\frac{v_x(v_x^2 + v_x c - 2H)}{2c(v_x^2-2H)} & \frac{v_x^2 + 2v_x c - 2H}{2c(v_x^2-2H)} & -\frac{1}{v_x^2-2H}
\end{pmatrix}.
\]
From the one-dimensional version of (2.41) one has
\[
v_x^2 - 2H = -\frac{2c^2}{\gamma-1},
\]

so that the eigenstructure of f ′ (u) depends only on two variables, c and vx . A similar analysis can be done on the two dimensional matrices R and L defined in (2.39) and (2.40) to show that in that case the eigenstructure of the system depends only on c, vx and vy .

Two-component Euler equations

An extension of the Euler equations, for a fluid composed of a mixture of two or more perfect gases in thermal equilibrium, consists of adding a new equation that models the conservation of one of the gases; together with the conservation of total mass, this implies the conservation of both gases. We add a new variable $\phi$ which represents the mass fraction of one of the gases. Consequently, the quantity $1 - \phi$ represents the mass fraction of the other gas. The resulting system is hyperbolic. We state in this section the spectral structure of the system in one and two dimensions, which will be needed in the numerical methods described in further sections. In one space dimension we write the equation of conservation of the mass of the first gas as:
\[
(\rho\phi)_t + (\rho\phi v_x)_x = 0,
\]
so that the new system of equations extending the Euler equations of gas dynamics becomes:

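\[
\begin{pmatrix} \rho \\ \rho v_x \\ E \\ \rho\phi \end{pmatrix}_t
+
\begin{pmatrix} \rho v_x \\ \rho v_x^2 + p \\ v_x(E+p) \\ \rho\phi v_x \end{pmatrix}_x
= 0. \tag{2.42}
\]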


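The Jacobian matrix for this system can be written as:
\[
f'(u) = \begin{pmatrix}
0 & 1 & 0 & 0 \\
\frac{\gamma-3}{2}v_x^2 - \phi\gamma'(\phi)e & (3-\gamma)v_x & \gamma-1 & \gamma'(\phi)e \\
\frac{\gamma-1}{2}v_x^3 - v_x H - v_x\phi\gamma'(\phi)e & H - (\gamma-1)v_x^2 & \gamma v_x & v_x\gamma'(\phi)e \\
-\phi v_x & \phi & 0 & v_x
\end{pmatrix},
\]
where $e$ is the specific internal energy. The ratio of specific heats $\gamma = \gamma(\phi)$ depends now on the composition of the mixture through the relation:
\[
\gamma = \frac{c_{p1}\phi + c_{p2}(1-\phi)}{c_{v1}\phi + c_{v2}(1-\phi)},
\]
where $c_{pi}$ and $c_{vi}$ are, respectively, the specific heats at constant pressure and at constant volume for the $i$-th gas component, $i = 1, 2$. The eigenvalues of the Jacobian matrix are $\lambda_1 = v_x - c$, $\lambda_2 = \lambda_3 = v_x$, $\lambda_4 = v_x + c$, the corresponding right eigenvectors are
\[
\begin{aligned}
r_1 &= [1,\; v_x - c,\; H - v_x c,\; \phi]^T, \\
r_2 &= \left[1,\; v_x,\; \tfrac{1}{2}v_x^2,\; \phi\right]^T, \\
r_3 &= \left[0,\; 0,\; -\frac{\gamma'(\phi)e}{\gamma-1},\; 1\right]^T, \\
r_4 &= [1,\; v_x + c,\; H + v_x c,\; \phi]^T,
\end{aligned}
\]
and the (normalized) left eigenvectors are:
\[
\begin{aligned}
l_1 &= \left[\beta_2 + \frac{v_x}{2c} - \phi\beta_3,\; -\beta_1 v_x - \frac{1}{2c},\; \beta_1,\; \beta_3\right], \\
l_2 &= [1 - 2\beta_2 + 2\phi\beta_3,\; 2\beta_1 v_x,\; -2\beta_1,\; -2\beta_3], \\
l_3 &= [-\phi,\; 0,\; 0,\; 1], \\
l_4 &= \left[\beta_2 - \frac{v_x}{2c} - \phi\beta_3,\; -\beta_1 v_x + \frac{1}{2c},\; \beta_1,\; \beta_3\right],
\end{aligned}
\]
where
\[
\beta_1 = \frac{\gamma-1}{2c^2}, \qquad
\beta_2 = \beta_1\frac{v_x^2}{2}, \qquad
\beta_3 = \frac{\gamma'(\phi)}{2\gamma(\gamma-1)}.
\]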


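The multi-component Euler equations in 2D read:
\[
\begin{pmatrix} \rho \\ \rho v_x \\ \rho v_y \\ E \\ \rho\phi \end{pmatrix}_t
+
\begin{pmatrix} \rho v_x \\ \rho v_x^2 + p \\ \rho v_x v_y \\ v_x(E+p) \\ \rho\phi v_x \end{pmatrix}_x
+
\begin{pmatrix} \rho v_y \\ \rho v_x v_y \\ \rho v_y^2 + p \\ v_y(E+p) \\ \rho\phi v_y \end{pmatrix}_y
= 0. \tag{2.43}
\]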

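The analysis made in the one-dimensional case for the Jacobian matrices can be repeated here for
\[
f(u) = \begin{pmatrix} \rho v_x \\ \rho v_x^2 + p \\ \rho v_x v_y \\ v_x(E+p) \\ \rho\phi v_x \end{pmatrix}
\qquad\text{and}\qquad
g(u) = \begin{pmatrix} \rho v_y \\ \rho v_x v_y \\ \rho v_y^2 + p \\ v_y(E+p) \\ \rho\phi v_y \end{pmatrix}.
\]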

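We simply state here the eigenstructure of $f'(u)$. By interchanging $v_x$ and $v_y$ and the second and third components of each left and right eigenvector, the eigenstructure of $g'(u)$ is obtained. See [118] and references therein for further details. The eigenvalues of $f'(u)$ are $\lambda_1 = v_x - c$, $\lambda_{2,3,4} = v_x$, $\lambda_5 = v_x + c$, and the corresponding right eigenvectors $r_i$ and left eigenvectors $l_i$, normalized so that $r_i \cdot l_j = \delta_{ij}$, are
\[
\begin{aligned}
r_1 &= \begin{bmatrix} 1 & v_x - c & v_y & H - v_x c & \phi \end{bmatrix}^T, \\
r_2 &= \begin{bmatrix} 1 & v_x & v_y & \frac{v_x^2+v_y^2}{2} & \phi \end{bmatrix}^T, \\
r_3 &= \begin{bmatrix} 0 & 0 & 1 & v_y & 0 \end{bmatrix}^T, \\
r_4 &= \begin{bmatrix} 0 & 0 & 0 & -\frac{\gamma'(\phi)e}{\gamma-1} & 1 \end{bmatrix}^T, \\
r_5 &= \begin{bmatrix} 1 & v_x + c & v_y & H + v_x c & \phi \end{bmatrix}^T, \\
l_1 &= \begin{bmatrix} \beta_2 + \frac{v_x}{2c} - \phi\beta_3 & -\beta_1 v_x - \frac{1}{2c} & -\beta_1 v_y & \beta_1 & \beta_3 \end{bmatrix}, \\
l_2 &= \begin{bmatrix} 1 - 2\beta_2 + 2\phi\beta_3 & 2\beta_1 v_x & 2\beta_1 v_y & -2\beta_1 & -2\beta_3 \end{bmatrix}, \\
l_3 &= \begin{bmatrix} -v_y & 0 & 1 & 0 & 0 \end{bmatrix}, \\
l_4 &= \begin{bmatrix} -\phi & 0 & 0 & 0 & 1 \end{bmatrix}, \\
l_5 &= \begin{bmatrix} \beta_2 - \frac{v_x}{2c} - \phi\beta_3 & -\beta_1 v_x + \frac{1}{2c} & -\beta_1 v_y & \beta_1 & \beta_3 \end{bmatrix},
\end{aligned} \tag{2.44}
\]
where
\[
\beta_1 = \frac{\gamma-1}{2c^2}, \qquad
\beta_2 = \beta_1\frac{v_x^2+v_y^2}{2}, \qquad
\beta_3 = \frac{\gamma'(\phi)}{2\gamma(\gamma-1)},
\]
and $H$ is the enthalpy, $H = \frac{c^2}{\gamma(\phi)-1} + \frac{1}{2}(v_x^2+v_y^2)$. As a final remark, it can be shown that for both the one- and the two-dimensional case, the left and right eigenvectors can be written in terms of $c$, $\phi$ and the velocity components only, i.e., one variable less than the number of conserved variables.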

3 Numerical methods for fluid dynamics

In chapter 2 we have presented the main features of partial differential equations related to fluid dynamics, and in particular hyperbolic conservation laws. Such equations are in general impossible to solve analytically, except in some trivial cases, like the linear advection equation presented in section 2.3.1. Numerical methods aim to obtain a discrete approximation of the true solution, which often suffices for practical applications. In this chapter we briefly review the main notions and results related to numerical methods for hyperbolic systems of conservation laws. We will focus our description on one-dimensional scalar equations, with some notions on the application to one-dimensional systems. The ideas introduced here will be exploited in further chapters, where we will move the discussion to finite-difference methods for nonlinear systems of conservation laws in more dimensions. We will concentrate on the main facts that are useful for the particular class of numerical methods concerned in this work; the review is by no means intended to be comprehensive. Most of the concepts introduced in this chapter are explained in more detail and in a wider context in any basic textbook on the numerical solution of hyperbolic PDEs, for example the books of LeVeque [104, 105], Toro [175] and Hirsch [70, 71].

3.1 Discretization

To numerically solve partial differential equations, the continuous equations are replaced by a discrete representation of them. This is achieved by first discretizing the domain of definition of the PDE by means of a grid, to be introduced below; then the PDE is discretized on the grid, and the resulting discrete, finite-dimensional problem is solved. There are two major discretization strategies used in Computational Fluid Dynamics:

• Point-value discretization, where the discrete values correspond to the pointwise values of the unknown variables at the nodes of the grid, and
• Cell-average discretization, where the discrete values represent the average value of the variables on the cells of the grid.

Consider a scalar conservation law in one space dimension
\[
u_t + f(u)_x = 0, \qquad (x,t) \in \mathbb{R}\times\mathbb{R}^+, \tag{3.1}
\]
where $u, f : \mathbb{R} \to \mathbb{R}$, with initial data given by
\[
u(x,0) = u_0(x), \qquad x \in \mathbb{R}.
\]

Let us introduce the aforementioned concepts for this case. Consider a discrete subset of points of $\mathbb{R}$, defined by nodes $\{x_j\}_{j\in\mathbb{Z}}$. We assume that the nodes are ordered, i.e., $x_j < x_{j+1}$ for all $j$, and from the points $\{x_j\}$ we define a set of cells by:
\[
c_j = \left[ x_j - \frac{x_j - x_{j-1}}{2},\; x_j + \frac{x_{j+1} - x_j}{2} \right]. \tag{3.2}
\]
A grid is defined, depending on the context, to be either the set $\{c_j\}_{j\in\mathbb{Z}}$ or the set $\{x_j\}_{j\in\mathbb{Z}}$.

Figure 3.1: Nodes and cells of grids defined on $\mathbb{R}$. A non-uniform grid (top) and a uniform grid (bottom).

For simplicity, we further assume that the grid is uniform, i.e., $x_j - x_{j-1} = \Delta x$ with $\Delta x$ a positive constant. This constant is called the mesh (or grid) size. For convenience we will use discrete grids indexed with the following convention:
\[
x_j = \left(j + \frac{1}{2}\right)\Delta x. \tag{3.3}
\]
Sometimes we will use non-integer indexes to indicate points that do not correspond to nodes. For example, $x_{j+\frac{1}{2}}$ represents the point $(j+1)\Delta x$. Given nodes $\{x_j\}_{j\in\mathbb{Z}}$, under the above assumptions, formula (3.2) reduces to
\[
c_j = \left[ x_j - \frac{\Delta x}{2},\; x_j + \frac{\Delta x}{2} \right] = [x_{j-\frac{1}{2}}, x_{j+\frac{1}{2}}] = [j\Delta x, (j+1)\Delta x],
\]

so that each cell is in this case a subinterval whose center is $x_j$. In Fig. 3.1 two discretizations of $\mathbb{R}$ are shown, for a uniform (bottom plot) and a non-uniform grid (top plot). The time variable is discretized by defining points in time $\{t^n\}_{n\in\mathbb{N}}$, with $t^n < t^{n+1}$. If $t^{n+1} - t^n$ is constant with respect to $n$, we denote it by $\Delta t$ and call it the time increment. We will denote by $U^n = \{U_j^n\}_{j\in\mathbb{Z}}$ the computed approximation to the exact solution $u(x_j, t^n)$ of (3.1). We can also interpret the numerical solution as an approximation to the cell-average of the exact solution, defined as:
\[
\bar{u}_j^n = \frac{1}{\Delta x}\int_{x_{j-\frac{1}{2}}}^{x_{j+\frac{1}{2}}} u(x, t^n)\,dx. \tag{3.4}
\]
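To make the two discretization strategies concrete, the following sketch builds a uniform grid with the indexing convention (3.3) and approximates cell averages by Gauss-Legendre quadrature, so that they can be compared with point values at the nodes. The function names, the quadrature order and the normalization of the average by the cell width are assumptions of this example.

```python
import numpy as np

def uniform_grid(n_cells, dx):
    """Nodes x_j = (j + 1/2) dx and cell edges j dx, for j = 0, ..., n_cells - 1."""
    nodes = (np.arange(n_cells) + 0.5) * dx
    edges = np.arange(n_cells + 1) * dx
    return nodes, edges

def cell_averages(u, edges, n_quad=4):
    """Approximate the average of u over each cell [edges[j], edges[j+1]]."""
    xi, w = np.polynomial.legendre.leggauss(n_quad)   # nodes and weights on [-1, 1]
    a, b = edges[:-1], edges[1:]
    mid, half = 0.5 * (a + b), 0.5 * (b - a)
    x = mid[:, None] + half[:, None] * xi[None, :]    # quadrature points per cell
    # (1 / (b - a)) * integral = 0.5 * sum_i w_i u(x_i), since the weights sum to 2
    return 0.5 * (u(x) * w[None, :]).sum(axis=1)
```

For $u(x) = x^2$ the cell average exceeds the point value at the cell center by exactly $\Delta x^2/12$, which illustrates that both discretizations agree only up to second order.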

In practical implementations the grid has to be restricted to a finite number of nodes or cells, or equivalently, the domain of definition of the equations has to be restricted to a bounded subset of $\mathbb{R}$ and a finite time interval. In particular we will consider the interval $I = [0, 1]$ and a fixed time $T > 0$. We then take positive numbers $N$ and $M$ and a set of nodes {xj }0≤j 0 in which we are interested to know the solution of the equations, and we assume that the ratio between the temporal and spatial step sizes, $\Delta t/\Delta x$, is constant, so it suffices to consider limits when one of the step sizes vanishes.

Definition 4. Let $\{G_k\}_{k=0}^{+\infty}$ be a sequence of grids with corresponding grid sizes $\Delta x_k$, verifying
\[
\lim_{k\to+\infty} \Delta x_k = 0. \tag{3.6}
\]

Given a (discrete) norm $\|\cdot\|$ we say that a sequence of numerical solutions $\{U_{G_k}\}_{k=0}^{+\infty}$, associated with the numerical scheme $H$ and the grids $\{G_k\}_k$, and corresponding to a fixed time $T$, converges to a function $u$ in the norm $\|\cdot\|$ if
\[
\lim_{k\to+\infty} \|U_{G_k} - u_{G_k}\| = 0,
\]
where $u_{G_k}$ represents the discretization of the function $u$ on the grid $G_k$.

A numerical method is said to be convergent if any sequence of numerical solutions obtained through it on a sequence of grids verifying (3.6) converges to the true solution of the equation. Often used norms are the discrete $L^p$ norms
\[
\|v\|_p = \left( \Delta x \sum_{j\in\mathbb{Z}} |v_j|^p \right)^{\frac{1}{p}},
\]
and the discrete $L^\infty$ norm
\[
\|v\|_\infty = \max_{j\in\mathbb{Z}} |v_j|.
\]
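In practice these norms are evaluated on finite grids, and convergence is usually assessed by comparing errors on successively refined grids. The sketch below is a hypothetical helper, not part of the method described in this work: `discrete_lp_norm` evaluates the norms above for a finite vector, and `observed_order` estimates the empirical order of convergence from the errors on two grids whose sizes differ by a given refinement ratio.

```python
import numpy as np

def discrete_lp_norm(v, dx, p):
    """Discrete L^p norm (dx * sum |v_j|^p)^(1/p); p = np.inf gives max |v_j|."""
    v = np.asarray(v, dtype=float)
    if np.isinf(p):
        return np.max(np.abs(v))
    return (dx * np.sum(np.abs(v) ** p)) ** (1.0 / p)

def observed_order(err_coarse, err_fine, ratio=2.0):
    """Empirical convergence order from errors on grids of size dx and dx / ratio."""
    return np.log(err_coarse / err_fine) / np.log(ratio)
```

For instance, if halving $\Delta x$ divides the error by four, `observed_order` returns a value close to 2.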

Note that the concept of convergence is strongly dependent on the norm. Sequences of numerical solutions (and hence numerical methods) can converge in one norm but not in another. In this work we will almost exclusively consider the L1 and L2 norms. It is in general very difficult to show that a given numerical method is convergent in a given norm. The way in which one usually studies convergence is through the concepts of consistency and stability, making use of the Lax equivalence theorem. We briefly describe these concepts in the next sections.

3.2.1 Consistency

Consistency deals with the investigation of how a numerical method behaves locally, i.e. in a single time step. To prove consistency it suffices to show that the error produced by a single application of the numerical method (the local truncation error, defined below) vanishes as $\Delta t$ approaches zero. This is often an easy task, which is accomplished by simply using Taylor series expansions.

Definition 5. Given a one-step numerical method $U^{n+1} = H_{\Delta t}(U^n)$, the local truncation error is defined as:
\[
L^n_{\Delta t} = \frac{1}{\Delta t}\left( H_{\Delta t}(u^n) - u^{n+1} \right), \tag{3.7}
\]
where $u^n$ is the discretization of the true solution of the PDE.

Definition 6. A one-step numerical method $U^{n+1} = H_{\Delta t}(U^n)$ is consistent in the norm $\|\cdot\|$ if
\[
\lim_{\Delta t\to 0} \|L_{\Delta t}(\cdot, t)\| = 0,
\]
provided $u$ is a smooth function satisfying the PDE. We say that the order of the method is $p$ if $L_{\Delta t}(\cdot, t) = O(\Delta t^p)$.


3.2. Norms and convergence

3.2.2 Stability

Consistency gives information about the accuracy of the numerical method in a single time step. We expect that the faster the local truncation error vanishes, the better the numerical approximations obtained through the numerical method are. This is true provided

• The method is convergent, and
• The solution $u$ is sufficiently smooth.

In this section we deal with the first of these requirements. Consider a numerical method of the form (3.5). If we perform $n$ steps of the method starting with given initial data $U^0$ we obtain an approximation $H^n(U^0)$ to the true solution $u^n$, with an error given by $E^n = H^n(U^0) - u^n$. If we advance the solution one step further we get
\[
\begin{aligned}
E^{n+1} &= H^{n+1}(U^0) - u^{n+1} = H(H^n(U^0)) - u^{n+1} = H(E^n + u^n) - u^{n+1} \\
&= \left(H(E^n + u^n) - H(u^n)\right) + H(u^n) - u^{n+1} \\
&= \left(H(E^n + u^n) - H(u^n)\right) + \Delta t L^n_{\Delta t}.
\end{aligned}
\]

Convergence means that the error vanishes in a given norm as $\Delta t$ does. For a consistent method the term $\|\Delta t L^n_{\Delta t}\|$ will vanish by definition. Stability is devoted to the study of the other term, which describes the effect that the errors have in the numerical method. Typically one wants to bound $\|E^n\|$ by some quantity that can be forced to vanish as $\Delta t$ vanishes, for example the discretization error of the initial data, given by $\|E^0\| = \|U^0 - u^0\|$.

Classical stability theory is applicable to linear methods using the 2-norm. It exploits the properties of the Fourier transform, in particular Parseval's identity, which relates the norm of the numerical solution to that of its Fourier transform. This theory is known as von Neumann stability analysis, and is widely used when studying stability of linear methods for partial differential equations (see e.g. [70, 167]). Unfortunately, efficient methods for hyperbolic conservation laws are nonlinear, and the theory can only be applied to the linearized equations.


Stability can be shown, for example, for methods that verify, for some $k \ge 0$:
\[
\|H(v) - H(w)\| \le (1 + k\Delta t)\|v - w\|, \tag{3.8}
\]
for a certain norm. If (3.8) can be stated with $k = 0$ the method is called contractive in the norm $\|\cdot\|$:
\[
\|H(v) - H(w)\| \le \|v - w\|.
\]
For linear operators the condition of being contractive reduces to the fact that the powers of the operator $H$ are uniformly bounded by a constant:
\[
\|H^n\| \le C, \qquad \forall n \in \mathbb{N}.
\]
For nonlinear methods it is very difficult to show that a method is contractive. A similar, but simpler argument is based on the total variation of the numerical solutions, defined as
\[
TV(U^n) = \sum_{j\in\mathbb{Z}} |U_j^n - U_{j-1}^n|.
\]

Boundedness of the total variation of the numerical solutions can be used to show stability by means of compactness arguments in normed functional spaces [147]. Total variation diminishing (TVD) (and more generally total variation bounded, TVB) methods exploit this idea. A method is said to be TVD if
\[
TV(U^{n+1}) \le TV(U^n), \tag{3.9}
\]
and TVB if there exists a constant $C$ independent of $n$ such that $TV(U^n) \le C$. Monotonicity is also useful for stability, since monotone methods are contractive. A method of the form (3.5) is monotone if
\[
\frac{\partial H}{\partial U_j^n} \ge 0, \qquad \forall j.
\]

A necessary condition for stability is the Courant-Friedrichs-Lewy (CFL) condition [40]. This condition is a requirement, for example, for TVD methods in order to be so. The CFL condition relies on the hyperbolic nature of the equations. On these equations the information is carried through characteristics, so the physical solution at a given point


(x, t) depends on the values of the initial data at a bounded domain, because no characteristic outside this domain can reach the point x at time t. More precisely, the domain of dependence of the point (x, t) is defined as the set of points corresponding to time t = 0 that completely determine the solution of problem (2.1) with initial data (2.3) at (x, t). By the above comments, the domain of dependence of a point is always a bounded set. Given a numerical method of the form (3.5), the numerical solution Ujn+1 depends on a finite number of components of U n , say {Uℓn }ℓ∈L , where L is a given finite set of indexes. By tracing back this idea, the numerical solution at a point (xj , tn ) for a given numerical method is fully determined by a finite set of points located at time t = 0. The convex hull of this set is called the numerical domain of dependence of the point (x, t). The CFL condition states that the numerical domain of dependence of any point has to contain its domain of dependence. This condition is quite intuitive, since it states that the numerical method has to be able to take into account the information coming from any point that is actually influencing the solution at the next time step. The CFL condition is often expressed as an upper bound in some ratio between ∆t and ∆x. As an example consider a Cauchy problem for the advection equation (2.19). From section 2.3.1 we know that the solution of this problem with initial data u(x, 0) = u0 (x) is given by u(x, t) = u0 (x − at). This means that the domain of dependence of the point (x, t) is the point x − at. In order to be able to compute the correct solution, the numerical method should contain this point in its numerical domain of dependence. Note that, because a is constant, the domain of dependence of any point (x, t) is always located at the left or the right of (x, t). 
More generally, this idea can be exported to nonlinear systems in the sense that the system can be locally linearized, and this linear system is equivalent to a decoupled system of advection equations, as explained in section 2.3.2. Therefore, to perform a single time step, the relevant information that has to be taken into account by the method comes from one direction, given by the local sign of the eigenvalues of the linearized equations. This idea of following the direction where the characteristic information comes from is called upwinding. Methods that decide the set of points that take part in the computation of the solution from one time step to the next are called upwind methods. High resolution numerical schemes for nonlinear systems can rarely be shown to have any of the above properties (monotonicity, contractiveness or total variation boundedness), so stability cannot be ensured for
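The interplay between upwinding, the CFL condition and the total variation can be illustrated on the linear advection equation $u_t + a u_x = 0$. The following sketch uses the classical first-order upwind scheme on a periodic grid; the scheme itself is standard, but the function names, the periodic boundary treatment and the safety factor `cfl` are choices made only for this example.

```python
import numpy as np

def total_variation(U):
    """Total variation sum_j |U_j - U_{j-1}| on a periodic grid."""
    return np.sum(np.abs(U - np.roll(U, 1)))

def cfl_time_step(a, dx, cfl=0.9):
    """Largest time step with |a| dt / dx <= cfl."""
    return cfl * dx / abs(a)

def upwind_step(U, a, dt, dx):
    """One step of first-order upwind for u_t + a u_x = 0 with periodic boundaries."""
    nu = a * dt / dx
    if a >= 0:
        # information travels to the right: use the left neighbor
        return U - nu * (U - np.roll(U, 1))
    # information travels to the left: use the right neighbor
    return U - nu * (np.roll(U, -1) - U)
```

With $|a|\Delta t/\Delta x \le 1$ each new value is a convex combination of two old values, so the total variation cannot grow; for a step profile and $|a|\Delta t/\Delta x = 1.5$ the stencil no longer contains the point $x - a\Delta t$ and the total variation increases after a single step.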


them. These methods are constructed, however, around the ideas behind TVD and monotone methods, and all of them have the CFL condition as a requirement.

3.2.3 The Lax equivalence theorem

The most important result relating the concepts of stability, consistency and convergence is the Lax equivalence theorem, reformulated from [102] (see e.g. [147, 167]). It can be stated as follows:

Theorem 1. For a consistent one-step linear scheme for a Cauchy problem of a well-posed linear PDE, stability is a necessary and sufficient condition for convergence.

It is in general easier to check consistency and stability than convergence. Checking consistency is often a trivial task, since the quantities that appear in the equation are replaced by discrete approximations of them, which produce consistent methods almost automatically. For conservative methods the approach is slightly different, but both consistency and the order of accuracy of the method are obtained from the procedure used in the reconstruction (see section 3.4 below). However, practical numerical methods are not linear, and the theorem cannot be applied directly. Moreover, stability is in general much harder to prove for nonlinear methods and equations.

3.3 Elementary methods

Most elementary numerical methods to solve (2.1) are based on the substitution of partial derivatives by consistent finite-difference approximations. One such method is the Lax-Friedrichs method, based on the following substitutions:
\[
\begin{aligned}
u_t(x,t) &\approx \frac{1}{\Delta t}\left( u(x, t+\Delta t) - \frac{u(x+\Delta x, t) + u(x-\Delta x, t)}{2} \right), \\
f(u)_x &\approx \frac{1}{2\Delta x}\left( f(u(x+\Delta x, t)) - f(u(x-\Delta x, t)) \right).
\end{aligned}
\]


With the notation introduced in section 3.1 the method can be written in the form
\[
U_j^{n+1} = \frac{U_{j+1}^n + U_{j-1}^n}{2} - \frac{\Delta t}{2\Delta x}\left( f(U_{j+1}^n) - f(U_{j-1}^n) \right). \tag{3.10}
\]
The time derivative has been substituted by a first order approximation, while the space derivative has been approximated by a second order finite-difference approximation. Therefore, the method is first order accurate in time and second order in space. Because of stability restrictions, expressed by the CFL condition, the method is globally first-order accurate. First order methods suffer from diffusion: the solution is strongly smeared near a discontinuity or a zone with high variation. This is due to the large amount of numerical dissipation added to the solution by the scheme, which is in fact approximating an advection-diffusion equation.

Elementary methods that are second (and higher) order accurate in both time and space can be obtained in several ways. Classical methods are the methods of Lax and Wendroff [96], Beam and Warming [186], MacCormack [115] or the Leapfrog method. These methods are not efficient in general when the solution is not smooth (in particular, none of them is monotone or TVD). The typical behavior of linear second order methods is that spurious oscillations are produced near the discontinuities, and these oscillations do not decrease as the grid size does. This is due in most cases to the lack of numerical dissipation in the solution, and in most cases they can be seen as schemes approximating dispersive equations without diffusive terms. However, these methods were the basis for the development of other second order methods, obtained by modifications of the former, some of which are still used in practice for particular applications (see e.g. [176], [79]).
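As an illustration, the Lax-Friedrichs update (3.10) takes only a few lines of code. The sketch below assumes periodic boundary conditions (implemented with `np.roll`) and a vectorized flux function `f`; both are choices made for the example.

```python
import numpy as np

def lax_friedrichs_step(U, f, dt, dx):
    """One Lax-Friedrichs step (3.10) for u_t + f(u)_x = 0 on a periodic grid."""
    Up = np.roll(U, -1)   # U_{j+1}
    Um = np.roll(U, 1)    # U_{j-1}
    return 0.5 * (Up + Um) - dt / (2.0 * dx) * (f(Up) - f(Um))
```

For the linear flux $f(u) = u$ and $\Delta t = \Delta x$ the update reduces to $U_j^{n+1} = U_{j-1}^n$, i.e., the exact shift of the data by one cell. For smaller time steps the averaging term introduces the numerical diffusion discussed above. On a periodic grid the sum of the values is preserved for any flux, since the flux differences telescope.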

3.4 Conservative methods

The methods described in section 3.3 are based on the assumption that the partial derivatives can be accurately approximated by finite differences. This is true at the points where the flow solution $u(x, t)$ is smooth with respect to $(x, t)$. In fact the Lax-Friedrichs and Lax-Wendroff methods will converge to the right solution as $\Delta x$ and $\Delta t$ tend to zero provided the solution is smooth for all $(x, t)$.


If some singularity is present in the flow solution, the argument is no longer valid. As stated in section 2.2, more than one weak solution can exist in this situation and there is no reason to suppose that the method will converge to the right one. Moreover, the method can converge to a function that is not a weak solution of the PDE. Simple illustrative examples can be found e.g. in [104]. Conservative methods ensure that convergence can only be to weak solutions. This result, known as the Lax-Wendroff theorem, will be stated at the end of the section. We start by introducing conservative methods and their main properties.

Definition 7. A numerical method $U^{n+1} = H(U^n)$ is said to be conservative if there exists a function $\hat{f} : \mathbb{R}^{p+q} \to \mathbb{R}$ such that
\[
H(U^n)_j = U_j^n - \frac{\Delta t}{\Delta x}\left( \hat{f}(U_{j-p+1}^n, \dots, U_{j+q}^n) - \hat{f}(U_{j-p}^n, \dots, U_{j+q-1}^n) \right), \tag{3.11}
\]

for some nonnegative integers $p$ and $q$. The function $\hat{f}$ is called the numerical flux function.

Conservative methods aim to reproduce at a discrete level the conservation of the physical variables in the continuous equations. In fact (3.11) can be seen as a discrete version of the integral form (2.2) of the PDE. As we will see, convergent solutions computed by conservative methods respect the integral form, in the sense that they are discrete approximations to weak solutions of the equations.

Definition 8. Given a conservative numerical method we say that its numerical flux function is consistent with the conservation law if $\hat{f}(U, \dots, U) = f(U)$.

In order to ensure some smoothness in the way in which $\hat{f}$ approaches a certain value $f(U)$ as its arguments tend to $U$, we will suppose in general that the flux function is locally Lipschitz continuous in each variable¹. Consistency of the numerical flux is a natural requirement. It is necessary to ensure that a discrete form of conservation, analogous to the conservation law, is provided by conservative methods and, in fact, it is equivalent to the consistency of the scheme itself. We are ready to introduce the main result concerning conservative methods.

¹ A function $f$ defined on a normed space $M$ is said to be locally Lipschitz continuous at a point $x \in M$ if there exist a constant $K$ and a neighborhood $N(x)$ of $x$ such that $\|f(y) - f(x)\| \le K\|y - x\|$, $\forall y \in N(x)$.


Theorem 2. (Lax-Wendroff, [103]) Let $\{G_k\}_{k=0}^{+\infty}$ be a sequence of grids with corresponding grid sizes $(\Delta x_k, \Delta t_k)$, verifying
\[
\lim_{k\to+\infty} \Delta x_k = 0, \qquad \lim_{k\to+\infty} \Delta t_k = 0.
\]
Let $\{U_{G_k}\}_{k=0}^{+\infty}$ be the sequence of numerical solutions obtained by a conservative numerical method, consistent with (2.1), on the grids $\{G_k\}_k$. If $U_{G_k}$ converges to a function $u(x, t)$, then $u$ is a weak solution of (2.1).

The definition of convergence stated in the theorem has received much attention. The original definition used in the work of Lax and Wendroff [103] has been relaxed and extended to more general grids, see e.g. [89, 47]. This theorem is the main reason why so much attention has been focused on conservative schemes. However, there is also interest in nonconservative schemes, see e.g. [83, 32]. See also [72, 109] for discussions on both methodologies.

3.5 High resolution conservative methods

One of the first methods that combine all the ingredients discussed up to now is Godunov's method [57]. It is based on the exact solution of a Cauchy problem located at each cell interface, assuming that the solution is constant at each side of the interface and takes the cell-average values of the numerical solution corresponding to the cells at the left and right of the interface as initial data, i.e., for a given time step $t^n$ it finds, for each $j$, the exact solution at time $t^{n+1}$ of (3.1) with initial data given by
\[
u(x, t^n) = \begin{cases}
U_j^n & \text{if } x_{j-\frac{1}{2}} < x \le x_{j+\frac{1}{2}}, \\
U_{j+1}^n & \text{if } x_{j+\frac{1}{2}} < x \le x_{j+\frac{3}{2}}.
\end{cases}
\]
This local problem is called a Riemann problem. Due to the finite speed of propagation of information along characteristics, solutions of Riemann problems corresponding to adjacent cell interfaces will not interact for short enough time. Once these Riemann problems are solved, the solution is averaged on each cell to pose a new Riemann problem


for the next time step. Godunov's method is explained in more detail in section 3.6. The idea of solving Riemann problems forward in time is at the basis of modern high-resolution shock-capturing methods. Godunov's method is, however, only first order accurate. The order of accuracy can be increased by solving initial-value problems where the piecewise constant approximation used in Godunov's method is replaced by a more accurate approximation, for example piecewise linear data. One could solve (3.1) with initial data of the form
\[
u(x, t^n) = \begin{cases}
U_j^n + \sigma_j (x - x_j) & \text{if } x_{j-\frac{1}{2}} < x \le x_{j+\frac{1}{2}}, \\
U_{j+1}^n + \sigma_{j+1} (x - x_{j+1}) & \text{if } x_{j+\frac{1}{2}} < x \le x_{j+\frac{3}{2}},
\end{cases} \tag{3.12}
\]
where $\sigma_j$ is a slope computed on the $j$-th cell from the data $U^n$. This gives a method that is second order accurate in space. Unfortunately, Cauchy problems with initial data that are not piecewise constant cannot be solved analytically in general, and stability properties such as TVD or monotonicity can no longer be ensured for arbitrary choices of the slopes $\sigma_j$. For some choices of the slopes $\sigma_j$ this method will show the same bad behavior as the second order methods already discussed, such as Lax-Wendroff and Beam-Warming. In fact, those methods can be obtained from particular choices of the slopes (see [105]). In order to control the total variation of the method, limiters are applied to the slopes. Almost any limiter is based on the minmod limiter due to Van Leer [181]:
\[
\sigma_j = \frac{1}{\Delta x}\,\mathrm{minmod}(U_{j+1}^n - U_j^n,\; U_j^n - U_{j-1}^n),
\]
where
\[
\mathrm{minmod}(u, v) = \begin{cases}
u & \text{if } |u| < |v| \text{ and } u \cdot v > 0, \\
v & \text{if } |v| \le |u| \text{ and } u \cdot v > 0, \\
0 & \text{if } u \cdot v \le 0.
\end{cases}
\]
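The minmod function and the limited slopes $\sigma_j$ translate directly into vectorized code. The sketch below follows the three-case definition above; the function names and the periodic treatment of the boundary cells are assumptions of this example.

```python
import numpy as np

def minmod(u, v):
    """Return the argument of smallest modulus if u and v share sign, else 0."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    smallest = np.where(np.abs(u) < np.abs(v), u, v)
    return np.where(u * v <= 0.0, 0.0, smallest)

def minmod_slopes(U, dx):
    """Limited slopes sigma_j = minmod(U_{j+1} - U_j, U_j - U_{j-1}) / dx (periodic)."""
    forward = np.roll(U, -1) - U
    backward = U - np.roll(U, 1)
    return minmod(forward, backward) / dx
```

At local extrema, where the forward and backward differences have opposite signs, the slope is set to zero, so the reconstruction (3.12) reduces there to the piecewise constant data of Godunov's method.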

A common practice to construct numerical methods with order of accuracy higher than one and suitable for nonlinear systems is to use piecewise constant initial data obtained by a higher order reconstruction at the cell interfaces. From the numerical solution at a given time step one reconstructs, by a certain interpolation or approximation procedure, two values $U^L_{j+\frac{1}{2}}$ and $U^R_{j+\frac{1}{2}}$ at each interface. Then the Riemann problem with initial data
\[
u(x, t^n) = \begin{cases}
U^L_{j+\frac{1}{2}} & \text{if } x_{j-\frac{1}{2}} < x \le x_{j+\frac{1}{2}}, \\
U^R_{j+\frac{1}{2}} & \text{if } x_{j+\frac{1}{2}} < x \le x_{j+\frac{3}{2}},
\end{cases}
\]
is solved. For example, using a piecewise linear reconstruction one has to solve a Riemann problem with initial data given by
\[
U^L_{j+\frac{1}{2}} = U_j^n + \sigma_j \frac{\Delta x}{2}, \qquad
U^R_{j+\frac{1}{2}} = U_{j+1}^n - \sigma_{j+1} \frac{\Delta x}{2}.
\]

TVD methods can be developed based on slope limiters. However it is known that order of accuracy higher than two cannot be achieved while ensuring the TVD or TVB properties [67]. To achieve higher order some techniques have been developed following similar ideas, but without strictly ensuring total variation boundedness. Examples are the essentially non-oscillatory (ENO) methods, introduced by Harten, Engquist, Osher and Chakravarthy in [64] and the weighted essentially non-oscillatory (WENO) methods [113, 81], that are explained in more detail in section 4.3.

3.5.1 Semi-discrete methods

Methods based on local reconstruction of the solution at the cell interfaces can increase the spatial accuracy of the solution by computing numerical flux functions that are better approximations of the true fluxes. In the study of the local truncation error, however, the first order time discretization
\[
u_t(x_j, t^n) = \frac{U_j^{n+1} - U_j^n}{\Delta t} + O(\Delta t),
\]
is already present in (3.11), giving a method that is only first order accurate in time. Semi-discrete methods address this drawback by applying a space discretization first, leaving the time derivative unchanged, which leads to a system of ordinary differential equations:
\[
\frac{dU_j(t)}{dt} + D(U_j(t)) = 0, \qquad \forall j,
\]
where $D(U_j(t))$ is some approximation of the spatial derivative $f(u)_x$ for a fixed time $t$. Typically, the spatial approximation is made by means of a conservative reconstruction of the numerical fluxes, giving a formulation
\[
\frac{dU_j(t)}{dt} + \frac{\hat{f}_{j+\frac{1}{2}} - \hat{f}_{j-\frac{1}{2}}}{\Delta x} = 0, \qquad \forall j, \tag{3.13}
\]


where $\hat{f}_{j+\frac{1}{2}} = \hat{f}(U_{j-p+1}(t), \dots, U_{j+q}(t))$. System (3.13) is then solved by means of an ODE solver. A class of TVD Runge-Kutta methods specially tailored to solve this kind of ODE systems was developed by Shu and Osher [160]. The general formulation of these methods is as follows:
\[
\begin{aligned}
U^{(0)} &= U^n, \\
U^{(i)} &= \sum_{k=0}^{i-1} \left( \alpha_{ik} U^{(k)} - \beta_{ik} \Delta t D(U^{(k)}) \right), \qquad 1 \le i \le \bar{r}, \\
U^{n+1} &= U^{(\bar{r})},
\end{aligned} \tag{3.14}
\]
where $\bar{r}$ depends on the order of accuracy of the particular Runge-Kutta scheme and $\alpha_{ik}$, $\beta_{ik}$ are coefficients that also depend on the method (see [160, 161] for details). In this work we will use the third order version:
\[
\begin{aligned}
U^{(1)} &= U^n - \Delta t D(U^n), \\
U^{(2)} &= \frac{3}{4} U^n + \frac{1}{4} U^{(1)} - \frac{1}{4} \Delta t D(U^{(1)}), \\
U^{n+1} &= \frac{1}{3} U^n + \frac{2}{3} U^{(2)} - \frac{2}{3} \Delta t D(U^{(2)}).
\end{aligned} \tag{3.15}
\]
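The three stages of (3.15) can be written generically for any spatial operator $D$. In the sketch below, `D` is assumed to be a callable that takes the array of unknowns and returns the discrete approximation of $f(u)_x$; the function name is hypothetical.

```python
import numpy as np

def tvd_rk3_step(U, D, dt):
    """One step of the third order TVD Runge-Kutta scheme (3.15) for dU/dt + D(U) = 0."""
    U1 = U - dt * D(U)
    U2 = 0.75 * U + 0.25 * U1 - 0.25 * dt * D(U1)
    return U / 3.0 + 2.0 / 3.0 * U2 - 2.0 / 3.0 * dt * D(U2)
```

A simple way to check the implementation is the linear case $D(U) = \lambda U$: with $z = \lambda\Delta t$, one step multiplies $U$ by $1 - z + z^2/2 - z^3/6$, the third order Taylor polynomial of $e^{-z}$.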

It has been proved [160] that the second and third order TVD Runge-Kutta methods are stable provided the forward Euler method, which is the first order case in (3.14), is stable, under the same CFL restriction as the forward Euler method. All these methods provide conservative schemes when used together with spatial operators that lead to ODEs of the form (3.13), since the composed time-advance step can be expressed in the conservative form (3.11). For example, given a numerical flux function $\hat{f}(U^n)$ arising from the spatial discretization, expanding (3.15) for each node $x_j$ we have
\[
\begin{aligned}
U_j^{n+1} = U_j^n - \frac{\Delta t}{\Delta x} \Big( &\frac{1}{6}\hat{f}_{j+\frac{1}{2}}(U^n) + \frac{1}{6}\hat{f}_{j+\frac{1}{2}}(U^{(1)}) + \frac{2}{3}\hat{f}_{j+\frac{1}{2}}(U^{(2)}) \\
- &\frac{1}{6}\hat{f}_{j-\frac{1}{2}}(U^n) - \frac{1}{6}\hat{f}_{j-\frac{1}{2}}(U^{(1)}) - \frac{2}{3}\hat{f}_{j-\frac{1}{2}}(U^{(2)}) \Big).
\end{aligned} \tag{3.16}
\]
Since $U^{(1)}$ and $U^{(2)}$ are obtained from $U^n$ we can write (3.16) in terms of a numerical flux function
\[
\hat{f}^{RK3}(U^n) = \frac{1}{6}\hat{f}(U^n) + \frac{1}{6}\hat{f}(U^{(1)}) + \frac{2}{3}\hat{f}(U^{(2)}) \tag{3.17}
\]
as
\[
U_j^{n+1} = U_j^n - \frac{\Delta t}{\Delta x}\left( \hat{f}^{RK3}_{j+\frac{1}{2}}(U^n) - \hat{f}^{RK3}_{j-\frac{1}{2}}(U^n) \right).
\]

Note that we have defined $\hat f$ as a function that depends on $p + q$ arguments (cf. Definition 7), which implies that $\hat f^{RK3}$ depends on $3(p + q) - 2$ arguments, because it involves three recursive applications of $\hat f$. This numerical flux function is consistent. Recall that for any $W = \{W_j\}$ we have

$$D(W)_j = D(W_{j-p}, \dots, W_{j+q}) = \frac{1}{\Delta x}\left(\hat f_{j+\frac12}(W) - \hat f_{j-\frac12}(W)\right),$$

therefore, because of the consistency of $\hat f$, we get that $D(w, \dots, w) = 0$ for any $w$. In particular, if $U^n$ is “locally constant” at $x_j$, i.e., $U^n_{j+k} = U^n_j$ for $k = -3p + 3, \dots, 3q$, then $U^{(1)}$ and $U^{(2)}$ are also locally constant at $j$:

$$\begin{aligned}
U_j^{(1)} &= U_j^n - \Delta t\,D(U^n)_j = U_j^n,\\
U_j^{(2)} &= \tfrac34 U_j^n + \tfrac14 U_j^{(1)} - \tfrac14 \Delta t\,D(U^{(1)})_j\\
&= \tfrac34 U_j^n + \tfrac14 U_j^n - \tfrac14 \Delta t\,D(U^n)_j = U_j^n.
\end{aligned}$$

This implies that if all the arguments of $\hat f^{RK3}$ take a common value $w$, by the consistency of $\hat f$:

$$\begin{aligned}
\hat f^{RK3}(w, \dots, w) &= \tfrac16 \hat f(w, \dots, w) + \tfrac16 \hat f(w^{(1)}, \dots, w^{(1)}) + \tfrac23 \hat f(w^{(2)}, \dots, w^{(2)})\\
&= \tfrac16 \hat f(w, \dots, w) + \tfrac16 \hat f(w, \dots, w) + \tfrac23 \hat f(w, \dots, w) = f(w),
\end{aligned}$$

where we have denoted by $w^{(1)}$ and $w^{(2)}$ the (constant) values obtained in the first and second stage of the Runge-Kutta algorithm for a vector taking the constant value $w$.

3.6 Numerical methods for one-dimensional hyperbolic systems

A common way to extend the methods described in section 3.5 to systems of equations is based on the application of scalar techniques to the characteristic variables instead of the conserved variables. The idea is to exploit the hyperbolic character of the system by approximately decoupling it into advection equations, as explained in section 2.3.2. We focus our description on Godunov's method [57] and the approximate Riemann solver of Roe [148], which lie at the basis of the numerical methods to be described in chapter 4.

Godunov's method is based on the solution of Riemann problems forward in time. Consider the linear system in 1D

$$\begin{aligned}
u_t + Au_x &= 0, && x \in \mathbb{R},\ t \in \mathbb{R}^+,\\
u(x, 0) &= u_0(x), && x \in \mathbb{R}.
\end{aligned}$$

Godunov's method solves the set of Riemann problems defined by

$$\begin{aligned}
&u_t + Au_x = 0, \qquad x \in \mathbb{R},\ t \in (t_n, t_{n+1}],\\
&u(x, t_n) = \begin{cases} U^L_{j+\frac12} & \text{if } x \le x_{j+\frac12},\\ U^R_{j+\frac12} & \text{if } x > x_{j+\frac12}, \end{cases}
\end{aligned}$$

for each interface $x_{j+\frac12}$, where $U^L_{j+\frac12}$ and $U^R_{j+\frac12}$ are reconstructions of the flow solution at the left and right of the interface. The numerical method involves essentially the same steps as for a scalar equation (cf. section 3.5):

1. Compute initial left and right states $U^L_{j+\frac12}$ and $U^R_{j+\frac12}$ for the Riemann problem. For the original first order method of Godunov take $U^L_{j+\frac12} = U^n_j$ and $U^R_{j+\frac12} = U^n_{j+1}$.

2. Compute the exact solution of the Riemann problem corresponding to time $t_{n+1}$ in each cell, with a time step $\Delta t$ sufficiently small so that waves emanating from a given cell interface do not interact with neighboring waves. The local Riemann solutions can then be glued together into a global solution $\tilde u$.

3. Compute the cell average of $\tilde u$ at each cell as the new numerical solution for time $t_{n+1}$.

Using the integral form (2.2) of the conservation law, and assuming that the exact solutions of the Riemann problems are known, Godunov's method can be written in conservation form with the numerical flux defined by

$$\hat f^n_{j+\frac12} = \frac{1}{\Delta t}\int_{t_n}^{t_{n+1}} f(\tilde u(x_{j+\frac12}, t))\,dt,$$

where $\tilde u$ denotes the solution of the corresponding Riemann problem. This formula is simplified using the fact that the solution of the Riemann problem is self-similar so that, for $t \in (t_n, t_{n+1})$, the solution is constant with respect to $t$ along the interface $x = x_{j+\frac12}$. This value depends only on the left and right states $U^L_{j+\frac12}$ and $U^R_{j+\frac12}$. If we denote $u^*(U^L_{j+\frac12}, U^R_{j+\frac12}) = \tilde u(x_{j+\frac12}, t)$ then the numerical flux for Godunov's method reduces to

$$\hat f^n_{j+\frac12} = f(\tilde u(x_{j+\frac12}, t_{n+1})).$$
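For a linear system with constant matrix $A$ the interface state $u^*$ can be written down explicitly: starting from the left state, one adds the jumps carried by the waves with negative speed. The following sketch illustrates this under the assumption that $A$ has real eigenvalues and a full set of eigenvectors; the function name `godunov_flux_linear` is hypothetical.

```python
import numpy as np

def godunov_flux_linear(A, UL, UR):
    """Godunov flux f(u*) = A u* for u_t + A u_x = 0 (A constant, hyperbolic).

    The jump UR - UL is decomposed in the basis of right eigenvectors;
    waves with negative speed lie to the left of the interface, so the
    interface state is UL plus the jumps across those waves."""
    lam, R = np.linalg.eig(A)             # columns of R are right eigenvectors
    alpha = np.linalg.solve(R, UR - UL)   # wave strengths: R @ alpha = UR - UL
    left_going = lam < 0
    u_star = UL + R[:, left_going] @ alpha[left_going]
    return A @ u_star

# Usage: a 2x2 system with characteristic speeds -1 and +1.
A = np.array([[0.0, 1.0], [1.0, 0.0]])
flux = godunov_flux_linear(A, np.array([1.0, 0.0]), np.array([0.0, 0.0]))
```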

2

Many methods have been constructed from the ideas of Godunov (see e.g. [66, 46]). The extension of Godunov's method to nonlinear systems requires, however, the solution of the Riemann problem for a nonlinear system. Although it is possible to solve some Riemann problems exactly, it is in general very expensive to find their solution. Approximate Riemann solvers exploit the fact that only some information about the solution of the Riemann problem is needed to update the solution of the PDE from time $t_n$ to time $t_{n+1}$. This is due to the fact that the full exact solution of the Riemann problem is not required, but only its cell-average, in steps 1 and 3 above, leading to a numerical flux that requires only the evaluation of the solution at the cell interface, which is, moreover, constant in time.

A typical approach to build an approximate Riemann solver starts with the quasi-linear form of the nonlinear problem

$$u_t + A(u)u_x = 0,$$

and simplifies the problem by replacing the Jacobian matrix $A(u)$ by a matrix $\hat A_j$ that is constant for each interface, so that each Riemann problem is posed for a linear system. The matrix $\hat A_j$ varies from one interface to another, and it is therefore a function of the left and right states $U^L_{j+\frac12}$ and $U^R_{j+\frac12}$. The Riemann problems

$$\begin{aligned}
&u_t + \hat A_j u_x = 0, \qquad \text{in } \mathbb{R} \times (t_n, t_{n+1}],\\
&u(x, t_n) = \begin{cases} U^L_{j+\frac12} & \text{if } x < x_{j+\frac12},\\ U^R_{j+\frac12} & \text{if } x > x_{j+\frac12}, \end{cases}
\end{aligned} \tag{3.18}$$

are then solved exactly for each $j$, and the solution at time $t_{n+1}$ is constructed as in the original Godunov method, following steps 1, 2 and 3 described above. The way the matrices $\hat A_j$ are constructed depends on the system to be solved. Roe [148] proposed a set of general rules to build the linearization and constructed a matrix for the Euler equations. The numerical flux of Roe's method can be written as

$$\hat f^n_{j+\frac12} = f(U^L_{j+\frac12}) + \sum_{\lambda^p_j < 0} \lambda^p_j\,\alpha^p_{j+\frac12}\,r^p_j, \tag{3.19}$$

or, equivalently, as

$$\hat f^n_{j+\frac12} = f(U^R_{j+\frac12}) - \sum_{\lambda^p_j > 0} \lambda^p_j\,\alpha^p_{j+\frac12}\,r^p_j, \tag{3.20}$$

where $\lambda^p_j$ and $r^p_j$ are the eigenvalues and right eigenvectors of $\hat A_j$, and $\alpha^p_{j+\frac12}$ are the coefficients of $U^R_{j+\frac12} - U^L_{j+\frac12}$ in the basis of eigenvectors. Taking the average of (3.19) and (3.20) we obtain the often used expression for the numerical flux

$$\hat f^n_{j+\frac12} = \frac12\left(f(U^L_{j+\frac12}) + f(U^R_{j+\frac12}) - \sum_p |\lambda^p_j|\,\alpha^p_{j+\frac12}\,r^p_j\right). \tag{3.21}$$

From expressions (3.19), (3.20) and (3.21) the upwinding character and the characteristic-based structure of the method become clear. Matrices similar to Roe's matrix for the Euler equations can be built for other systems; see [56, 175] for general reviews on approximate Riemann solvers, and [11, 49, 54, 162] for some particular examples.
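The equivalence of the one-sided and averaged forms of the Roe flux can be checked numerically. The sketch below is only illustrative: the function name `roe_fluxes` is hypothetical, the linearization matrix is supplied by the caller, and the three forms are assumed to coincide only when that matrix satisfies $f(U^R) - f(U^L) = \hat A\,(U^R - U^L)$, which holds trivially for the linear flux used in the usage example.

```python
import numpy as np

def roe_fluxes(f, A_hat, UL, UR):
    """Return the left-sided, right-sided and averaged forms of the Roe flux
    for a given linearization matrix A_hat with real eigenvalues."""
    lam, R = np.linalg.eig(A_hat)           # eigenvalues and right eigenvectors
    alpha = np.linalg.solve(R, UR - UL)     # alpha = L (UR - UL), with L = R^{-1}
    neg, pos = lam < 0, lam > 0
    left = f(UL) + R[:, neg] @ (lam[neg] * alpha[neg])
    right = f(UR) - R[:, pos] @ (lam[pos] * alpha[pos])
    central = 0.5 * (f(UL) + f(UR)) - 0.5 * R @ (np.abs(lam) * alpha)
    return left, right, central

# Usage with a linear flux f(u) = A u, for which A_hat = A is exact.
A = np.array([[1.0, 2.0], [2.0, 1.0]])      # eigenvalues -1 and 3
left, right, central = roe_fluxes(lambda u: A @ u, A,
                                  np.array([1.0, 0.0]), np.array([0.0, 2.0]))
```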

3.7 Implementation of artificial boundary conditions

In this section we describe the discrete implementation of boundary conditions. Recall that we have posed our problems in an unbounded spatial domain but, in practice, the computation has to be performed in a bounded subdomain. A different case is the initial-boundary value problem, where both initial and boundary conditions are specified by the problem. These boundary conditions are used to modify the flux calculation depending on whether a given point is near the boundary or not. In our case we transform the unbounded problem into a bounded problem by specifying the behavior of the flow at the boundary. Special formulas for the flux computation near the boundaries can be devised, but in practice it is more efficient to consider the ghost cell approach, already mentioned in section 3.1.

Note that the numerical fluxes in (3.11) for a single point $x_j$ depend on $p + q + 1$ values. For nodes at or near the boundary some of these values may be unavailable, since the corresponding nodes may not belong to the grid. In particular, if we consider a spatial grid of the interval $[0, 1]$ composed of $N$ nodes $\{x_j\}_{j=0}^{N-1}$, with $x_j = (j + \frac12)\Delta x$, $\Delta x = \frac1N$ (cf. section 3.1), then, for example, at the node $x_0$, the points $x_{-1}, \dots, x_{-p}$ are required to compute $\hat f^n_{j-\frac12}$, and the same applies to the node $x_{N-1}$ at the right boundary, where $x_N, \dots, x_{N+q-1}$ are required. In general, $p$ and $q$ ghost cells are required, respectively, at the left and right boundary, in order to update the numerical solution. As an example, consider Lax-Friedrichs' method (3.10). Its numerical flux is defined as

$$\hat f^n_{j+\frac12} = -\frac{\Delta x}{2\Delta t}\left(U^n_{j+1} - U^n_j\right) + \frac12\left(f(U^n_{j+1}) + f(U^n_j)\right).$$

In this case we have $p = q = 1$ and one ghost cell is required at each boundary.

The idea is to augment the computational grid with some extra cells at the boundary. These cells are filled with data using the solution at the interior grid before the integration. For the implementation of the WENO method described in section 4.3 we need three ghost cells at each boundary. Moreover, ghost cells will also be used in the update of cells that are not near the domain boundaries, but at the boundary of a mesh patch in the AMR algorithm (see chapter 5).

Particular cases of implementation of boundary conditions using the ghost cell technique that will be used in the validation section are inflow, outflow and reflecting boundary conditions. Inflow is the discrete version of Dirichlet boundary conditions, and outflow represents Neumann boundary conditions $\frac{\partial u}{\partial \vec n} = 0$. Qualitatively this represents the cases where the flow is entering (inflow) or exiting (outflow) the domain, and these are the most natural choice when solving in a finite domain a problem that is defined on the whole real line. In practice this case is implemented by copying the values of the boundary cells into the ghost cells. For the sample grid used in this section we would have:

$$U^n_{-k} = U^n_0, \quad 1 \le k \le p, \qquad U^n_{N+k-1} = U^n_{N-1}, \quad 1 \le k \le q.$$

Another possibility, known as absorbing boundary conditions, is to copy the values in the interior of the domain to the ghost cells with symmetry at the boundary, which essentially produces zero numerical divergence at the boundary. In this case we fix

$$U^n_{-k} = U^n_{k-1}, \quad 1 \le k \le p, \qquad U^n_{N+k-1} = U^n_{N-k}, \quad 1 \le k \le q.$$

When the boundary represents a solid wall where the flow rebounds, reflecting boundary conditions can be imposed. These conditions are typically applied to systems of equations and depend on the particular equation to be solved. As a general rule, the velocity is set to the value at the interior with its sign changed, to state that the flow will change its direction when it touches the solid wall. The rest of the variables are copied from the interior. The values are copied imposing symmetry at the boundary. As an example, for the Euler equations in 1D defined by (2.27), the conditions are as follows. At the left boundary we put

$$\rho_{-k} = \rho_{k-1}, \qquad u_{-k} = -u_{k-1}, \qquad E_{-k} = E_{k-1},$$

for $1 \le k \le p$. Similarly, at the right boundary the values

$$\rho_{N+k-1} = \rho_{N-k}, \qquad u_{N+k-1} = -u_{N-k}, \qquad E_{N+k-1} = E_{N-k},$$

are set for the ghost cells ($1 \le k \le q$). For multi-dimensional problems, the same rules are applied to each variable and boundary in a dimensional splitting fashion.
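The three ghost cell fillings described in this section differ only in which interior value is copied and with which sign. The following sketch applies them to a single scalar array; the function name `fill_ghost_cells` and the `kind` labels are hypothetical, and for a system such as the Euler equations it is assumed that the function is called once per variable, using the reflecting variant only for the velocity.

```python
import numpy as np

def fill_ghost_cells(U, p, q, kind="outflow"):
    """Return an array with p ghost cells on the left and q on the right.

    Interior value U_j is stored at index p + j of the result.
      "outflow"    : U_{-k} = U_0,      U_{N+k-1} = U_{N-1}
      "absorbing"  : U_{-k} = U_{k-1},  U_{N+k-1} = U_{N-k}
      "reflecting" : U_{-k} = -U_{k-1}, U_{N+k-1} = -U_{N-k}  (velocity-like variable)
    Requires p <= N and q <= N for the symmetric variants."""
    if kind not in ("outflow", "absorbing", "reflecting"):
        raise ValueError("unknown boundary condition: " + kind)
    U = np.asarray(U, dtype=float)
    N = len(U)
    G = np.empty(N + p + q)
    G[p:p + N] = U
    sign = -1.0 if kind == "reflecting" else 1.0
    for k in range(1, p + 1):                      # left ghost cells U_{-k}
        G[p - k] = U[0] if kind == "outflow" else sign * U[k - 1]
    for k in range(1, q + 1):                      # right ghost cells U_{N+k-1}
        G[p + N + k - 1] = U[N - 1] if kind == "outflow" else sign * U[N - k]
    return G

# Usage: four interior cells, two ghost cells on the left and three on the right.
U = [1.0, 2.0, 3.0, 4.0]
outflow = fill_ghost_cells(U, 2, 3, "outflow")
absorbing = fill_ghost_cells(U, 2, 3, "absorbing")
reflecting = fill_ghost_cells(U, 2, 3, "reflecting")
```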


4 Shu-Osher's finite difference with Donat-Marquina's flux splitting

Shu-Osher's conservative schemes [160, 161] constitute a conceptually new approach for the solution of hyperbolic systems that actually simplifies the implementation of some finite volume High Resolution Shock Capturing (HRSC) methods. Examples of such methods are the MUSCL methods of Van Leer [177, 178, 179, 180, 181], the PPM method of Colella and Woodward [39], the ENO methods of Harten, Engquist, Osher and Chakravarthy [64], Marquina's PHM method [117] and the WENO methods developed by Jiang and Shu [81] and Liu, Osher and Chan [113]. Most of these methods are based on discretizations of the equations on control volumes, following the cell-average approach. This formulation has proved to be effective in one dimensional simulations (see references above and the books of LeVeque [104, 105] and Toro [175] for an introductory explanation), but it is difficult to implement in more than one space dimension and with order of accuracy higher than two. The main reason is that the numerical flux is an integral over the control volume boundary, which has to be approximated by some quadrature formula. This imposes a complicated and computationally expensive connection between the cell-averages of the solution in the control volume and its reconstructed point-values at some nodal points on the control volume boundary. This combination of cell-averages and point-values becomes more complicated if the grids used in the discretization are non-uniform or unstructured, because of the additional technical complexity intrinsic to those kinds of grids.

To overcome these difficulties Shu and Osher developed a method that avoids the use of cell-averages by using a point-value discretization of the solution instead, where the computation of the numerical fluxes is done from the point-values of the equations' fluxes. A key point is that all the one-dimensional reconstruction procedures used in finite volume methods can be easily exported to Shu-Osher's approach, with the advantage that the computation of the numerical fluxes is performed in a dimensional splitting fashion. This significantly reduces the computational cost of the algorithm with respect to finite volume schemes of the same order. On the other hand, the main drawback of this approach is that to obtain a high order conservative finite difference scheme, the mesh size is required to be uniform or at least smoothly varying [121]. A very complete description comparing finite volume and finite difference schemes can be found in a paper by Shu [159].

The original description of the method is for scalar equations, with the extension to systems being based on the application of the methodology to the local characteristic fields of a linearized system coming, for example, from an intermediate Jacobian matrix corresponding to an approximate Riemann solver [161]. An alternative approach has been proposed by Donat and Marquina [44], where two Jacobians are computed at each cell interface. This is a more natural extension of the flux-split character of Shu-Osher's methodology to systems, and has proved to be more efficient than the use of a single Jacobian in some pathological cases. Moreover, it can be used in conjunction with any reconstruction procedure. Donat-Marquina's flux formulation is described in section 4.2. High order reconstructions of the flow variables and the numerical fluxes are obtained in this work by means of fifth order WENO reconstructions, described in section 4.3.

The overall algorithm that results from putting together these techniques has been described and tested by Marquina and Mulet in [118] on a fixed grid configuration, for a fluid flow composed of a mixture of two perfect gases in thermal equilibrium.

4.1 Shu-Osher's finite difference flux reconstruction

Let us describe the finite difference approach of Shu and Osher in one space dimension for a scalar equation. As mentioned above, generalizations to multi-dimensions will be straightforward because of the dimensional-splitting nature of the procedure. The extension to systems using Donat-Marquina's flux formulation will be discussed in section 4.2. The idea that enables Shu-Osher's approach comes from the following lemma:

Lemma 1. (Shu and Osher, [161]) If a function $h(x)$ satisfies

$$f(u(x)) = \frac{1}{\Delta x}\int_{x-\frac{\Delta x}{2}}^{x+\frac{\Delta x}{2}} h(\xi)\,d\xi, \tag{4.1}$$

then

$$f(u(x))_x = \frac{h\left(x + \frac{\Delta x}{2}\right) - h\left(x - \frac{\Delta x}{2}\right)}{\Delta x}.$$
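As a brief justification of Lemma 1, sketched here under the assumption that $h$ is continuous, differentiate (4.1) with respect to $x$ and apply the fundamental theorem of calculus to both integration limits:

$$f(u(x))_x = \frac{d}{dx}\left[\frac{1}{\Delta x}\int_{x-\frac{\Delta x}{2}}^{x+\frac{\Delta x}{2}} h(\xi)\,d\xi\right] = \frac{1}{\Delta x}\left[h\left(x + \frac{\Delta x}{2}\right) - h\left(x - \frac{\Delta x}{2}\right)\right].$$

The identity is exact, so no truncation error is introduced at this stage; the error of the resulting scheme comes only from the approximation of the values of $h$.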

In order to obtain a conservative scheme we aim to approximate the flux derivative, $f(u(x))_x$, by an expression of the form (see (3.11))

$$\frac{1}{\Delta x}\left(\hat f^n_{j+\frac12} - \hat f^n_{j-\frac12}\right).$$

The above lemma suggests that the numerical fluxes $\hat f^n_{j+\frac12}$ should approximate the values $h(x_{j+\frac12})$. In other words, if we can compute high order approximations to $h(x_{j+\frac12})$, they can play the role of high order numerical fluxes for a conservative scheme.

Shu-Osher's formulation can be derived from the cell-average form of the equations [121]. The point-values obtained through this formulation can be understood as estimates or deconvolutions of the cell-average values, obtained by the application of the inverse of the cell-average operator (3.4), with the advantage that no explicit knowledge of the cell-averages of the solution is needed in the process: the equation can be evolved from one time step to the following one using only the available nodal values.

Using the same notation as in (3.4), in what follows the cell-average of a function $h$ in the cell $[x_{j-\frac12}, x_{j+\frac12}]$ will be denoted by $\bar h_j$:

$$\bar h_j = \frac{1}{\Delta x}\int_{x_{j-\frac12}}^{x_{j+\frac12}} h(\xi)\,d\xi.$$

By the assumption (4.1), the problem has been translated into the classical cell-average framework of reconstructing the point-values of a function at the cell boundaries from its cell-averages. This automatically allows us to export all the reconstruction algorithms used in classical finite-volume methods to Shu-Osher's framework. An important point is that no knowledge of the function $h(x)$ (other than its cell-averages) is required. We consider local reconstruction procedures, where the value of $h_{j+\frac12} := h(x_{j+\frac12})$ is obtained using information from a finite number of nodes around $x_j$. This is the case for the class of methods we are interested in, namely conservative methods.

A generic local reconstruction procedure for $h(x)$ from its cell-averages $\{\bar h_{j-s_1}, \dots, \bar h_{j+s_2}\}$ defined on the interval $[x_{j-s_1-\frac12}, x_{j+s_2+\frac12}]$ will be denoted by $R(\bar h_{j-s_1}, \dots, \bar h_{j+s_2}, x)$, where $s_1$ and $s_2$ are nonnegative integers, and has to verify the following properties:

• Preservation of the cell-averages:

$$\frac{1}{\Delta x}\int_{x_{k-\frac12}}^{x_{k+\frac12}} R(\bar h_{j-s_1}, \dots, \bar h_{j+s_2}, x)\,dx = \bar h_k, \qquad k = j - s_1, \dots, j + s_2. \tag{4.2}$$

• Accuracy. Wherever $h$ is smooth:

$$R(\bar h_{j-s_1}, \dots, \bar h_{j+s_2}, x) = h(x) + O(\Delta x^r), \qquad x \in [x_{j-s_1-\frac12}, x_{j+s_2+\frac12}], \tag{4.3}$$

for some $r > 0$.

• The total variation of $R(\bar h_{j-s_1}, \dots, \bar h_{j+s_2}, x)$ is essentially bounded by the total variation of $h(x)$, i.e., for some $p > 0$:

$$TV(R(\bar h_{j-s_1}, \dots, \bar h_{j+s_2}, x)) \le C \cdot TV(h(x)) + O(\Delta x^p). \tag{4.4}$$

Here the total variation of a differentiable function $h(x)$ in an interval $I$ is given by $TV(h) = \int_I |h'(x)|\,dx$.

When computing the reconstruction, another essential point is upwinding. Roughly speaking, upwinding means that the numerical scheme has to take into account the directions in which the solutions are moving, given by the signs of the eigenvalues of the Jacobian matrix. For a scalar equation, the direction of the movement of the solution is locally given by the sign of $f'(u)$. Shu and Osher use the Roe speed

$$\kappa_{j+\frac12} = \frac{f(U_{j+1}) - f(U_j)}{U_{j+1} - U_j} \tag{4.5}$$

to determine its sign and then perform reconstructions biased towards the correct direction. The computation of the numerical fluxes with Shu-Osher's procedure, using a generic local reconstruction procedure $R(\bar h_{j-s_1}, \dots, \bar h_{j+s_2}, x)$, where $\bar h_j = f(U_j)$, can be summarized as follows:

Algorithm 1.

1. If $\kappa_{j+\frac12} > 0$, compute $\hat f_{j+\frac12}$ by

$$\hat f_{j+\frac12} = R(\bar h_{j-s_1}, \dots, \bar h_{j+s_2}, x_{j+\frac12}).$$

2. Else, compute $\hat f_{j+\frac12}$ by

$$\hat f_{j+\frac12} = R(\bar h_{j-s_1+1}, \dots, \bar h_{j+s_2+1}, x_{j+\frac12}).$$

The numerical fluxes $\hat f_{j+\frac12}$ computed through Algorithm 1 are based on the first order Roe fluxes. In fact, for the zeroth order reconstruction ($s_1 = s_2 = 0$), where $\hat f_{j+\frac12} = \bar h_j$ or $\hat f_{j+\frac12} = \bar h_{j+1}$, the numerical flux is the Roe flux. It is known that the Roe solver admits only shock waves or contact discontinuities as solutions of the Riemann problem, but not rarefaction waves. In particular, transonic rarefactions are wrongly substituted by entropy-violating expansion shocks. This situation is often avoided by using an entropy fix, commonly obtained by means of a local correction on the numerical fluxes where transonic rarefactions take place, in order to introduce some smoothing that “breaks” the shock into a rarefaction wave. Shu and Osher [160] use a local Lax-Friedrichs (LLF) version of the ENO algorithms, which can be generalized to other piecewise polynomial reconstructions as:

Algorithm 2. (Local Lax-Friedrichs)

1. Take $\beta_{j+\frac12} = \max_{u \in [u_j, u_{j+1}]} |f'(u)|$.

2. Compute $\hat f^+_{j+\frac12}$ with step 1 of Algorithm 1 using $\frac12(f(u) + \beta_{j+\frac12} u)$ instead of $f(u)$.

3. Compute $\hat f^-_{j+\frac12}$ with step 2 of Algorithm 1 using $\frac12(f(u) - \beta_{j+\frac12} u)$ instead of $f(u)$.

4. Take $\hat f_{j+\frac12} = \hat f^+_{j+\frac12} + \hat f^-_{j+\frac12}$.

Since $f'(u)$ changes sign through a transonic rarefaction, the entropy-fix version of Algorithm 1 consists of computing $\hat f_{j+\frac12}$ with Algorithm 1 if $f'(u)$ does not change sign between $U_j$ and $U_{j+1}$ and with Algorithm 2 otherwise. The final algorithm for a scalar equation can be stated as follows:

Algorithm 3. (Shu-Osher's algorithm for scalar equations)

Define $\beta_{j+\frac12} = \max_{u \in [U_j, U_{j+1}]} |f'(u)|$
Define $\kappa_{j+\frac12}$ by (4.5)
if $\kappa_{j-\frac12} \cdot \kappa_{j+\frac12} > 0$
    if $\kappa_{j+\frac12} > 0$
        $\hat f_{j+\frac12} = R(f_{j-s_1}, \dots, f_{j+s_2}, x_{j+\frac12})$
    else
        $\hat f_{j+\frac12} = R(f_{j-s_1+1}, \dots, f_{j+s_2+1}, x_{j+\frac12})$
    end
else
    $\hat f^+_{j+\frac12} = R(\frac12(f_{j-s_1} + \beta_{j+\frac12} U_{j-s_1}), \dots, \frac12(f_{j+s_2} + \beta_{j+\frac12} U_{j+s_2}), x_{j+\frac12})$
    $\hat f^-_{j+\frac12} = R(\frac12(f_{j-s_1+1} - \beta_{j+\frac12} U_{j-s_1+1}), \dots, \frac12(f_{j+s_2+1} - \beta_{j+\frac12} U_{j+s_2+1}), x_{j+\frac12})$
    $\hat f_{j+\frac12} = \hat f^+_{j+\frac12} + \hat f^-_{j+\frac12}$
end
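To make the branching of the entropy-fix algorithm concrete, the following sketch evaluates the flux for the zeroth order reconstruction ($s_1 = s_2 = 0$), for which $R$ simply returns its single argument. Several choices here are assumptions of the sketch and not part of the algorithm as stated: the function name `shu_osher_flux_first_order` is hypothetical, the sign change of $f'$ is detected from its values at $U_j$ and $U_{j+1}$, and $\beta_{j+\frac12}$ is taken as the maximum of $|f'|$ at the two end states, both of which are adequate only for convex or concave fluxes such as Burgers' flux.

```python
def shu_osher_flux_first_order(f, df, Uj, Ujp1):
    """Numerical flux at x_{j+1/2} for a scalar law, zeroth order reconstruction.

    f, df : flux function and its derivative.
    Uses the Roe flux where f' keeps its sign and the local
    Lax-Friedrichs splitting where f' changes sign."""
    fj, fjp1 = f(Uj), f(Ujp1)
    if df(Uj) * df(Ujp1) > 0:
        # Roe speed (4.5); falls back to f'(U_j) when both states coincide.
        kappa = (fjp1 - fj) / (Ujp1 - Uj) if Ujp1 != Uj else df(Uj)
        return fj if kappa > 0 else fjp1
    beta = max(abs(df(Uj)), abs(df(Ujp1)))
    f_plus = 0.5 * (fj + beta * Uj)         # biased to the left state
    f_minus = 0.5 * (fjp1 - beta * Ujp1)    # biased to the right state
    return f_plus + f_minus

# Usage with Burgers' flux f(u) = u^2 / 2.
burgers = lambda u: 0.5 * u * u
dburgers = lambda u: u
right_moving = shu_osher_flux_first_order(burgers, dburgers, 1.0, 2.0)
left_moving = shu_osher_flux_first_order(burgers, dburgers, -2.0, -1.0)
transonic = shu_osher_flux_first_order(burgers, dburgers, -1.0, 1.0)
```

In the transonic case the Roe speed is zero and the Roe flux would keep the initial discontinuity as a stationary expansion shock; the local Lax-Friedrichs branch adds the dissipation that lets the rarefaction open.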


4.2 Donat-Marquina's flux formula

A possible extension of the framework described in section 4.1 to nonlinear systems involves some average Jacobian matrix computed at some intermediate state between the left and right states as, for example, the Roe matrix [148]. These average matrices can be difficult to compute for systems of equations other than the Euler equations and, on the other hand, numerical methods based on Roe's and other average matrices suffer from several failures that have been reported in the literature, see e.g. [134, 140] and references therein. Donat-Marquina's flux formula [44] is a generalization of Shu-Osher's algorithm to nonlinear systems that alleviates some of those pathologies and avoids the use of an average matrix. Its leading philosophy is to compute the numerical fluxes at the interfaces using two sets of characteristic information for each interface, one coming from the left state and the other coming from the right state. This mimics, in the context of nonlinear systems, the flux-splitting behavior of the formulation made by Shu and Osher for scalar equations. As Donat-Marquina's flux formula reduces to Shu-Osher's for scalar equations, we will center the discussion on a nonlinear system in one space dimension.

Let us briefly describe the numerical method corresponding to Roe's linearization first. The method of Roe is a conservative scheme whose numerical flux is given by

$$\hat f_{j+\frac12} = \frac12\left(f(U^L_{j+\frac12}) + f(U^R_{j+\frac12}) - \sum_p |\lambda^p_j|\,\alpha^p_{j+\frac12}\,r^p_j\right), \tag{4.6}$$

where $\lambda^p_j$ are the eigenvalues of the Roe matrix $\hat A_j = \hat A(U^L_{j+\frac12}, U^R_{j+\frac12})$ computed at the cell interface $x_{j+\frac12}$, $r^p_j$ are the corresponding eigenvectors, $L_j$ is the matrix of left eigenvectors of $\hat A_j$ and $\alpha^p_{j+\frac12}$ are the components of $L_j(U^R_{j+\frac12} - U^L_{j+\frac12})$. The first order method corresponds to $U^L_{j+\frac12} = U_j$ and $U^R_{j+\frac12} = U_{j+1}$.

Higher order versions are obtained, for example, by using high order reconstructions of the numerical solutions at the interfaces, as explained in section 3.5.

Donat-Marquina's flux formula starts with the same approach. Left and right states are computed at each cell interface. Note that the left and right states are computed from point-values, i.e. we need to compute approximations to the point-values of $u$ at the interfaces $x_{j+\frac12}$ from point-values of $u$ at the cell centers. For nonnegative integers $t_1$ and $t_2$, we denote by $S(W_{j-t_1}, \dots, W_{j+t_2}, x)$ such an approximation function for a generic function $w$, whose point-values are given by $\{W_i\}$. The approximation at the cell interface is given by $S(W_{j-t_1}, \dots, W_{j+t_2}, x_{j+\frac12})$ and the operator $S$ has to verify conditions analogous to (4.2)–(4.4):

• Interpolation:

$$S(W_{j-t_1}, \dots, W_{j+t_2}, x_{j+k}) = W_{j+k}, \qquad -t_1 \le k \le t_2. \tag{4.7}$$

• Accuracy. Wherever $w$ is smooth:

$$S(W_{j-t_1}, \dots, W_{j+t_2}, x) = w(x) + O(\Delta x^r), \qquad x \in [x_{j-t_1}, x_{j+t_2}], \tag{4.8}$$

for some $r > 0$.

• The total variation of $S(W_{j-t_1}, \dots, W_{j+t_2}, x)$ is essentially bounded by the total variation of $w(x)$. For some $p > 0$:

$$TV(S(W_{j-t_1}, \dots, W_{j+t_2}, x)) \le C \cdot TV(w(x)) + O(\Delta x^p). \tag{4.9}$$

The algorithm starts by computing the sided reconstructions $U^L_{j+\frac12}$ and $U^R_{j+\frac12}$ by:

$$\begin{aligned}
U^L_{j+\frac12} &= S(U_{j-t_1}, \dots, U_{j+t_2}, x_{j+\frac12}),\\
U^R_{j+\frac12} &= S(U_{j-t_1+1}, \dots, U_{j+t_2+1}, x_{j+\frac12}).
\end{aligned} \tag{4.10}$$

Now, instead of building a Jacobian matrix corresponding to an intermediate state between $U^L_{j+\frac12}$ and $U^R_{j+\frac12}$, a double linearization, corresponding to the Jacobians $f'(U^L_{j+\frac12})$ and $f'(U^R_{j+\frac12})$, is computed. Let us denote by $l^p(U^L_{j+\frac12})$ and $r^p(U^L_{j+\frac12})$ the left and right eigenvectors of $f'(U^L_{j+\frac12})$ and by $l^p(U^R_{j+\frac12})$ and $r^p(U^R_{j+\frac12})$ the same objects for $f'(U^R_{j+\frac12})$. Two sets of characteristic fluxes and variables are then computed according to these two Jacobians by:

$$\begin{aligned}
W^L_{p,k} &= l^p(U^L_{j+\frac12}) \cdot U_k, & f^L_{p,k} &= l^p(U^L_{j+\frac12}) \cdot f(U_k), && \text{for } j - s_1 \le k \le j + s_2,\\
W^R_{p,k} &= l^p(U^R_{j+\frac12}) \cdot U_k, & f^R_{p,k} &= l^p(U^R_{j+\frac12}) \cdot f(U_k), && \text{for } j - s_1 + 1 \le k \le j + s_2 + 1.
\end{aligned} \tag{4.11}$$


In (4.11) $s_1$ and $s_2$ denote two nonnegative integers that are used in the numerical flux computation in Algorithm 4 below. The numerical fluxes are then computed in a way similar to Shu-Osher's algorithm. First we compute the upwind characteristic fluxes $\psi^L_{p,j}$ and $\psi^R_{p,j}$. These fluxes are biased towards their corresponding side and are computed according, respectively, to the characteristic information corresponding to $f'(U^L_{j+\frac12})$ and $f'(U^R_{j+\frac12})$. At points where the eigenvalues change sign, the entropy fix based on the local Lax-Friedrichs flux is applied, as in the scalar case. The final algorithm is as follows:

Algorithm 4.

if $\lambda^p(u)$ does not change sign in a path in phase space connecting $U_j$ and $U_{j+1}$
    if $\lambda^p(U_j) > 0$
        $\psi^L_{p,j} = R(f^L_{p,j-s_1}, \dots, f^L_{p,j+s_2}, x_{j+\frac12})$
        $\psi^R_{p,j} = 0$
    else
        $\psi^L_{p,j} = 0$
        $\psi^R_{p,j} = R(f^R_{p,j-s_1+1}, \dots, f^R_{p,j+s_2+1}, x_{j+\frac12})$
    end
else
    $\psi^L_{p,j} = R(\frac12(f^L_{p,j-s_1} + \beta^p_{j+\frac12} W^L_{p,j-s_1}), \dots, \frac12(f^L_{p,j+s_2} + \beta^p_{j+\frac12} W^L_{p,j+s_2}), x_{j+\frac12})$
    $\psi^R_{p,j} = R(\frac12(f^R_{p,j-s_1+1} - \beta^p_{j+\frac12} W^R_{p,j-s_1+1}), \dots, \frac12(f^R_{p,j+s_2+1} - \beta^p_{j+\frac12} W^R_{p,j+s_2+1}), x_{j+\frac12})$
end

where $\beta^p_{j+\frac12} = \max_u |\lambda^p(u)|$, and $u$ varies in any curve in phase space connecting $U_j$ and $U_{j+1}$. The numerical flux is finally defined as:

$$\hat f_{j+\frac12} = \sum_p \left(\psi^L_{p,j}\,r^p(U^L_{j+\frac12}) + \psi^R_{p,j}\,r^p(U^R_{j+\frac12})\right). \tag{4.12}$$

If the characteristic fields of the Jacobian matrices are linearly degenerate or genuinely nonlinear, the eigenvalues are, respectively, constant and monotone along integral curves of these fields (see section 2.2.4), and Algorithm 4 can be simplified to:

Algorithm 5.

if $\lambda^p(U_j) > 0$ and $\lambda^p(U_{j+1}) > 0$
    $\psi^L_{p,j} = R(f^L_{p,j-s_1}, \dots, f^L_{p,j+s_2}, x_{j+\frac12})$
    $\psi^R_{p,j} = 0$
else if $\lambda^p(U_j) < 0$ and $\lambda^p(U_{j+1}) < 0$
    $\psi^L_{p,j} = 0$
    $\psi^R_{p,j} = R(f^R_{p,j-s_1+1}, \dots, f^R_{p,j+s_2+1}, x_{j+\frac12})$
else
    $\psi^L_{p,j} = R(\frac12(f^L_{p,j-s_1} + \beta^p_{j+\frac12} W^L_{p,j-s_1}), \dots, \frac12(f^L_{p,j+s_2} + \beta^p_{j+\frac12} W^L_{p,j+s_2}), x_{j+\frac12})$
    $\psi^R_{p,j} = R(\frac12(f^R_{p,j-s_1+1} - \beta^p_{j+\frac12} W^R_{p,j-s_1+1}), \dots, \frac12(f^R_{p,j+s_2+1} - \beta^p_{j+\frac12} W^R_{p,j+s_2+1}), x_{j+\frac12})$
end

with $\beta^p_{j+\frac12} = \max\{|\lambda^p(U_j)|, |\lambda^p(U_{j+1})|\}$. Note that for scalar equations the above algorithms reduce to the entropy-fix version of Shu-Osher's algorithm (Algorithm 3 in section 4.1).
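The structure of the simplified algorithm can be illustrated with a first order sketch in which $s_1 = s_2 = t_1 = t_2 = 0$, so that $U^L_{j+\frac12} = U_j$, $U^R_{j+\frac12} = U_{j+1}$ and $R$ returns its single argument. The names `eigensystem` and `donat_marquina_flux` are hypothetical, and the sketch assumes that the Jacobian has real eigenvalues which can be matched between the two states by sorting them in increasing order; in practice analytic expressions of the eigenstructure would normally be used instead of a numerical eigensolver.

```python
import numpy as np

def eigensystem(J):
    """Eigenvalues (sorted increasingly), right eigenvectors (columns of R)
    and left eigenvectors (rows of L, normalized so that L @ R = I)."""
    lam, R = np.linalg.eig(J)
    order = np.argsort(lam.real)
    lam, R = lam.real[order], R.real[:, order]
    return lam, R, np.linalg.inv(R)

def donat_marquina_flux(f, jac, Uj, Ujp1):
    """First order flux at x_{j+1/2} using two linearizations, one per side."""
    lamL, RL, LL = eigensystem(jac(Uj))
    lamR, RR, LR = eigensystem(jac(Ujp1))
    flux = np.zeros(len(Uj))
    for p in range(len(Uj)):
        if lamL[p] > 0 and lamR[p] > 0:        # field moves to the right
            psiL, psiR = LL[p] @ f(Uj), 0.0
        elif lamL[p] < 0 and lamR[p] < 0:      # field moves to the left
            psiL, psiR = 0.0, LR[p] @ f(Ujp1)
        else:                                  # sign change: local Lax-Friedrichs
            beta = max(abs(lamL[p]), abs(lamR[p]))
            psiL = 0.5 * (LL[p] @ f(Uj) + beta * (LL[p] @ Uj))
            psiR = 0.5 * (LR[p] @ f(Ujp1) - beta * (LR[p] @ Ujp1))
        flux += psiL * RL[:, p] + psiR * RR[:, p]
    return flux

# Usage with a linear flux f(u) = A u, whose Jacobian is A everywhere.
A = np.array([[0.0, 1.0], [1.0, 0.0]])
flux = donat_marquina_flux(lambda u: A @ u, lambda u: A,
                           np.array([1.0, 0.0]), np.array([0.0, 0.0]))
```

For a linear system both linearizations coincide and the result is the upwind flux $A^+U_j + A^-U_{j+1}$, which gives a simple sanity check of the implementation.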

Remark 1. In the practical implementation it is more convenient to interpolate the variables the eigenstructure of the Jacobian matrix depends on, instead of the conserved variables at the cell interfaces, using the operator S. The reconstructions in (4.10) are applied to the suitable variables. For example, the eigenstructure of the Jacobian matrix of the Euler equations in 1D depends only on two variables, while the equations are composed by three conserved variables (see section 7.1.3).

4.3 Reconstruction procedures

In this section we describe the WENO interpolatory techniques which we will use as the reconstructions $R$ and $S$, defined in sections 4.1 and 4.2 respectively, used to compute the numerical fluxes $\hat f_{j+\frac12}$ at the cell boundaries. The WENO technique was developed by Liu, Osher and Chan in [113] and further improved by Jiang and Shu in [81]. Preliminary works dealing with the same ideas were produced by Shu [158] and Fatemi, Jerome and Osher [50]. The WENO algorithm is described in subsection 4.3.1, and constitutes an improvement of the ENO technique first introduced in [64]. It is less sensitive to perturbations, reduces the practical computational cost and increases the accuracy in regions where the solution is smooth.

This algorithm is directly used as the reconstruction procedure $R$. The algorithm can be modified to reconstruct point-values from point-values, rather than from cell-averages. In subsection 4.3.2 such a procedure is described. This reconstruction of point-values from point-values will be used within Donat-Marquina's algorithm to compute the biased reconstructions $U^L_{j+\frac12}$ and $U^R_{j+\frac12}$ in (4.10). In the multi-dimensional case the reconstructions are performed dimension-by-dimension, therefore the extension from one dimension to multi-dimensions is trivial.

4.3.1 ENO and WENO reconstruction for cell-average discretizations

ENO is based on the fact that Lagrange interpolation on a cell $c_j = [x_{j-\frac12}, x_{j+\frac12}]$ can be computed using different sets of points or cells (stencils) whose convex hulls contain the given cell. Among the possible stencils the ENO procedure tries to select the stencil that produces the smoothest Lagrange interpolant. If the stencils contain $r$ cells, in which the cell-averages $\bar h_j$ of a function $h(x)$ are known, and $h(x)$ is smooth in the stencil, then the reconstruction of the point-values of $h(x)$ obtained by the ENO algorithm is $O(\Delta x^r)$ accurate.

During the stencil selection procedure the ENO method considers $r$ possible stencils, which altogether contain $2r - 1$ cells. The selection procedure is computationally expensive, since it involves many conditional branches. Moreover, in regions of smoothness a stencil formed by these $2r - 1$ cells could be used, since the reconstructed functions are smooth regardless of the selected stencil, and we could increase the accuracy of the reconstruction up to $O(\Delta x^{2r-1})$.

The WENO reconstruction technique overcomes these drawbacks by using a convex combination of the interpolants corresponding to all the possible stencils considered, instead of selecting one of them. A weight, which depends on the smoothness of the function in the corresponding stencil, is assigned to each interpolant. These weights determine the contribution of each interpolant to the final approximation.

The generic reconstruction problem can be formulated as follows: given the cell-averages of a function $h(x)$:

$$\bar h_j = \frac{1}{\Delta x}\int_{x_{j-\frac12}}^{x_{j+\frac12}} h(\xi)\,d\xi$$


find, for each cell $c_j$, a polynomial $q^r_j(x)$ of degree at most $r - 1$ such that it is an $r$th order approximation to $h(x)$ inside the cell, provided the function $h(x)$ is smooth enough:

$$q^r_j(x) - h(x) = O(\Delta x^r), \qquad \forall x \in c_j. \tag{4.13}$$

A procedure to compute such a polynomial on a cell $c_j$ consists of choosing a stencil of $r$ cells formed with $s_1$ cells to the left and $s_2$ cells to the right surrounding the cell $c_j$, with $s_1 + s_2 = r - 1$ ($s_1, s_2 \ge 0$):

$$\bar S_j = \{c_{j-s_1}, \dots, c_{j+s_2}\}. \tag{4.14}$$

The (unique) polynomial qjr (x) of degree at most r − 1 that attains the same cell-averages as h(x) in the cells in S¯j is a solution of the proposed reconstruction problem, provided h(x) is smooth enough in the region covered by the stencil S¯j . In our context we will need approximations to the point-values h(xj+ 1 ), 2 that will be approximated by qjr (xj+ 1 ). To compute the polynomial qjr (x) 2 the “reconstruction via primitive function” procedure in [64] is applied. We briefly describe this approach next. Let h be a function and H a primitive of h: Z x h(ξ)dξ. H(x) = −∞

The point-values of the primitive function at the cell boundaries can be computed by: H(xj+ 1 ) = 2

Z

xj+ 1

2

h(ξ)dξ = ∆x

−∞

j X

¯j , h

(4.15)

i=−∞

and the following relation is trivial to check: H(xj+ 1 ) − H(xj− 1 ) 2

2

∆x

¯j . =h

(4.16)

Note that the left hand side of (4.16) is a first order divided difference of H. It is easy to check that all the higher order divided differences of H can be computed from the available data due to the relation: H[xj− 1 , xj+ 1 , . . . , xj+ℓ+ 1 ] = 2

2

2

1 ¯ h[uj , . . . , uj+ℓ ], ℓ+1

ℓ ≥ 0.

(4.17)


If we denote by $Q_j^r(x)$ the (unique) polynomial of degree at most $r$ that interpolates the primitive function $H(x)$ at the points that define the stencil $\bar S_j$ in (4.14), i.e., the points:

$$S_j = \{x_{j-s_1-\frac12}, \dots, x_{j+s_2+\frac12}\}, \qquad (4.18)$$

then $Q_j^r$ approximates $H(x)$ with order of accuracy $r+1$ provided $H$ is smooth in $S_j$. Its derivative

$$q_j^r(x) := \frac{dQ_j^r(x)}{dx}, \qquad (4.19)$$

interpolates the cell-averages of $h(x)$ in the stencil $\bar S_j$:

$$\frac{1}{\Delta x} \int_{x_{k-\frac12}}^{x_{k+\frac12}} q_j^r(x)\,dx = \bar h_k, \quad j - s_1 \leq k \leq j + s_2,$$

and from standard theory on interpolation, if $Q_j^r(x) - H(x) = O(\Delta x^{r+1})$, then $q_j^r(x) - h(x) = O(\Delta x^r)$. So the accuracy requirement (4.13) is also satisfied by the piecewise polynomial function $q^r(x)$ defined by

$$q^r(x) := q_j^r(x), \quad \forall x \in c_j.$$

The actual approximation of the function $h$ at the cell interfaces is computed by $h_{j+\frac12} \approx q^r(x_{j+\frac12})$.

To effectively compute the numerical approximation to $h(x_{j+\frac12})$ consider the Newton form of the polynomial that interpolates the primitive function $H(x)$ in the points in (4.18):

$$Q_j^r(x) = \sum_{\ell=0}^{r} H[x_{j-s_1-\frac12}, \dots, x_{j-s_1+\ell-\frac12}] \prod_{m=0}^{\ell-1} (x - x_{j-s_1+m-\frac12}).$$

Its derivative is given by:

$$q_j^r(x) = \sum_{\ell=1}^{r} H[x_{j-s_1-\frac12}, \dots, x_{j-s_1+\ell-\frac12}] \sum_{m=0}^{\ell-1} \prod_{\substack{n=0 \\ n \neq m}}^{\ell-1} (x - x_{j-s_1+n-\frac12}). \qquad (4.20)$$

Remark 2. As only first or higher order divided differences of $H$ are used in (4.20), but not the nodal values $H(x_{j+\frac12})$, the primitive function does not need to be explicitly computed and so the summation in (4.15) is never performed, because of (4.17).
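As an illustration of the reconstruction via primitive function, the following Python sketch computes the coefficients $c_i$ such that $q_j^r(x_{j+\frac12}) = \sum_i c_i \bar h_{j-s_1+i}$. It is only a sketch under stated assumptions: the function names are hypothetical, $\Delta x = 1$ is assumed, and the derivative of the interpolant of $H$ is evaluated through the Lagrange form rather than the Newton form (4.20), which gives the same polynomial. In agreement with Remark 2, the nodal values of $H$ are only used up to an additive constant.

```python
from fractions import Fraction


def lagrange_derivative(nodes, m, x):
    """Derivative at x of the m-th Lagrange basis polynomial on `nodes`."""
    total = Fraction(0)
    for k in range(len(nodes)):
        if k == m:
            continue
        term = Fraction(1) / (nodes[m] - nodes[k])
        for n in range(len(nodes)):
            if n != m and n != k:
                term *= Fraction(x - nodes[n]) / (nodes[m] - nodes[n])
        total += term
    return total


def interface_coefficients(r, s1):
    """Coefficients c_i with q(x_{j+1/2}) = sum_i c_i * hbar_{j-s1+i} (dx = 1).

    The stencil has s1 cells to the left of cell j and r - 1 - s1 to the right.
    """
    # Cell boundaries of the stencil, shifted so that the leftmost one is 0.
    nodes = [Fraction(m) for m in range(r + 1)]
    x_star = Fraction(s1 + 1)  # position of x_{j+1/2} in these units
    dL = [lagrange_derivative(nodes, m, x_star) for m in range(r + 1)]
    # H at boundary m equals the sum of the first m cell-averages (plus a
    # constant that the derivative removes), so hbar_i multiplies the sum
    # of dL[m] over m > i.
    return [sum(dL[i + 1:], Fraction(0)) for i in range(r)]


if __name__ == "__main__":
    # Centered stencil {c_{j-1}, c_j, c_{j+1}} for r = 3.
    print(interface_coefficients(3, 1))
```

For $r = 3$ and $s_1 = 1$ the sketch returns the familiar combination $-\frac16 \bar h_{j-1} + \frac56 \bar h_j + \frac13 \bar h_{j+1}$.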


The reconstruction via primitive function is a standard procedure to reconstruct point-values from cell-averages. A more complete description of this technique can be found in [9]. We describe next the ENO and WENO reconstruction procedures used in this work. Note that there are $r$ possible stencils of $r$ cells that contain the cell $c_j$:

$$\bar S_{j,k} = \{c_{j+k-r+1}, \dots, c_{j+k}\}, \quad 0 \leq k < r. \qquad (4.21)$$

The ENO algorithm selects one of the stencils in (4.21) using divided differences as smoothness indicators, choosing the stencil which produces the smallest divided differences, in an attempt to produce less oscillatory interpolants, see [64, 9] for further details. We denote by $q_{j,k}^r(x_{j+\frac12})$ the Lagrangian approximation to $h_{j+\frac12}$ built from the $k$th candidate stencil in (4.21), according to (4.20). This would be the approximation of the numerical flux computed by the ENO algorithm if the stencil $\bar S_{j,k}$ had been chosen in the stencil selection procedure.

The stencil selection procedure of the ENO algorithm is used everywhere, regardless of the smoothness of the function to be reconstructed. In regions where the function is smooth, if we had used the stencil formed by the $2r-1$ cells contained in any of the candidate stencils, i.e. the stencil

$$\hat S_j = \{c_{j-r+1}, \dots, c_{j+r-1}\},$$

we could have obtained a $(2r-1)$th order approximation of $h_{j+\frac12}$, denoted by $q_j^{2r-1}(x_{j+\frac12})$. Jiang and Shu [81] found that there exist coefficients $C_k^r$ such that the value of the polynomial $q_j^{2r-1}(x)$ at the point $x_{j+\frac12}$ can be expressed as a combination of the values of the ENO polynomials at the same point:

$$q_j^{2r-1}(x_{j+\frac12}) = \sum_{k=0}^{r-1} C_k^r\, q_{j,k}^r(x_{j+\frac12}), \qquad (4.22)$$

and the coefficients verify

$$\sum_{k=0}^{r-1} C_k^r = 1, \quad \forall r \geq 2.$$

These coefficients are called the optimal weights. Their values for $r = 2$ are:

$$C_1^2 = \frac{2}{3}, \quad C_2^2 = \frac{1}{3},$$

leading to a third order interpolation, and for $r = 3$:

$$C_1^3 = \frac{1}{10}, \quad C_2^3 = \frac{6}{10}, \quad C_3^3 = \frac{3}{10},$$

leading to fifth order accuracy.

The idea behind the WENO technique is to use a convex combination of the approximations $q_{j,k}^r(x_{j+\frac12})$ to compute a new approximation:

$$\hat h(x_{j+\frac12}) = \sum_{k=0}^{r-1} w_{j,k}^r\, q_{j,k}^r(x_{j+\frac12}). \qquad (4.23)$$

The key of the method is the computation of the weights $w_{j,k}^r$. Ideally, the weights should adapt automatically to the smoothness of the function. If the stencil $\bar S_{j,k}$ crosses a singularity, then $w_{j,k}^r$ should be essentially zero, canceling out the contribution of $q_{j,k}^r(x_{j+\frac12})$ to the reconstruction, and $h(x_{j+\frac12})$ is then built from the remaining stencils that do not cross singularities, so that $r$th order of accuracy is maintained. On the other hand, if all the candidate stencils are contained in regions where the function is smooth, then $w_{j,k}^r$ has to approximate the optimal weight $C_k^r$, and thus the convex sum in the right hand side of (4.23) will approximate $q^{2r-1}(x_{j+\frac12})$ with optimal order. Weights verifying the above requirements were defined in [113] through the formula:

$$w_k^r = \frac{\alpha_k^r}{\sum_{i=0}^{r-1} \alpha_i^r}, \quad 0 \leq k \leq r-1, \qquad (4.24)$$

where

$$\alpha_k^r = \frac{C_k^r}{(\epsilon + IS_{j,k})^p}. \qquad (4.25)$$

In (4.25), $\epsilon$ is a positive number used to prevent the denominator from becoming zero. $IS_{j,k}$ is a smoothness measurement of the function $h(x)$ in the stencil $\bar S_{j,k}$ and constitutes the key point to define the weights. Jiang and Shu gave a definition for $IS_{j,k}$ that achieves the desired goals for the weights:

$$IS_{j,k} = \sum_{i=1}^{r-1} \int_{x_{j-\frac12}}^{x_{j+\frac12}} \Delta x^{2i-1} \left( \frac{d^i}{dx^i}\, q_{j,k}^r(x) \right)^2 dx. \qquad (4.26)$$

With the algorithm described above, we construct the numerical fluxes of Donat-Marquina’s algorithm. We suppose, as in Shu-Osher’s lemma, that the characteristic fluxes $f_{p,k}^L$ and $f_{p,k}^R$, defined in (4.11), are cell-averages of an unknown function $h(x)$. The values of $h(x_{j+\frac12})$ are therefore interpreted as the characteristic numerical fluxes $\psi_{p,j}^L$ and $\psi_{p,j}^R$ in Algorithm 4 and Algorithm 5, used to define the final numerical flux $\hat f_{j+\frac12}$ by (4.12). Note that two different reconstructions $R$, based on different stencils, are used, respectively, for the computation of $\psi_{p,j}^L$ and $\psi_{p,j}^R$. We have used fifth order ($r = 3$) WENO interpolants for the definition of $R$ in our scheme.
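The weighting (4.23)-(4.25) can be sketched in Python for the case $r = 3$. This is a minimal sketch under stated assumptions, not the implementation used in this work: the function name is hypothetical, the candidate values and the closed-form smoothness indicators are the standard ones of Jiang and Shu for uniform grids (they are not written out in the text above), and $\epsilon = 10^{-6}$, $p = 2$ are common choices assumed here. The optimal weights are listed from the leftmost to the rightmost stencil.

```python
def weno5_interface_value(v, eps=1e-6, p=2):
    """Fifth order WENO approximation of h(x_{j+1/2}) from the cell-averages
    v = (v_{j-2}, v_{j-1}, v_j, v_{j+1}, v_{j+2}) on a uniform grid."""
    vm2, vm1, v0, vp1, vp2 = v

    # Third order candidate values q_{j,k}(x_{j+1/2}), k = 0, 1, 2.
    q = [
        (2 * vm2 - 7 * vm1 + 11 * v0) / 6,
        (-vm1 + 5 * v0 + 2 * vp1) / 6,
        (2 * v0 + 5 * vp1 - vp2) / 6,
    ]

    # Optimal (linear) weights, leftmost stencil first.
    C = [0.1, 0.6, 0.3]

    # Smoothness indicators: closed form of (4.26) for r = 3 (assumed standard).
    IS = [
        13 / 12 * (vm2 - 2 * vm1 + v0) ** 2 + 0.25 * (vm2 - 4 * vm1 + 3 * v0) ** 2,
        13 / 12 * (vm1 - 2 * v0 + vp1) ** 2 + 0.25 * (vm1 - vp1) ** 2,
        13 / 12 * (v0 - 2 * vp1 + vp2) ** 2 + 0.25 * (3 * v0 - 4 * vp1 + vp2) ** 2,
    ]

    # Nonlinear weights (4.24)-(4.25).
    alpha = [c / (eps + s) ** p for c, s in zip(C, IS)]
    total = sum(alpha)
    return sum(a / total * qk for a, qk in zip(alpha, q))


if __name__ == "__main__":
    # Smooth (linear) data and data with a jump between cells j and j+1.
    print(weno5_interface_value([0.0, 1.0, 2.0, 3.0, 4.0]))
    print(weno5_interface_value([0.0, 0.0, 0.0, 1.0, 1.0]))
```

For the data with a jump, the only stencil that does not cross the discontinuity is the leftmost one, so its weight dominates and the result stays close to the left state.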

4.3.2 ENO and WENO reconstructions for point-value discretizations

So far we have described the construction of the operator $R$ that we will use to compute the characteristic numerical fluxes $\psi_{p,j}^L$ and $\psi_{p,j}^R$ from the characteristic fluxes $f_{p,j}^L$ and $f_{p,j}^R$ in Algorithms 4 and 5. The characteristic fluxes are projections of the equation fluxes following the characteristic fields corresponding to the states $U_{j+\frac12}^L$ and $U_{j+\frac12}^R$.

In order to achieve a high order of accuracy, a reconstruction of the point-values of the solution at the cell interfaces from its point-values at the cell centers is needed. In this section we describe the reconstruction used to compute the values $U_{j+\frac12}^L$ and $U_{j+\frac12}^R$ in Donat-Marquina’s algorithm, which is a modified version of the WENO algorithm described in the previous section. We refer to [8, 18] for further details.

To build a WENO interpolant that approximates point-values from point-values we follow a setup similar to the one used to compute approximations to point-values from cell-averages. Each reconstruction is computed using all the possible stencils of $r$ points that contain the point $x_j$ (for $U_{j+\frac12}^L$) or $x_{j+1}$ (for $U_{j+\frac12}^R$).

Using the same notation as in section 4.3.1, for the reconstruction of $U_{j+\frac12}^L$, we find that, for $r = 3$, there exist coefficients $C_k^r$ such that

$$q_j^{2r-1}(x_{j+\frac12}) = C_0^r\, q_{j,0}^r(x_{j+\frac12}) + C_1^r\, q_{j,1}^r(x_{j+\frac12}) + C_2^r\, q_{j,2}^r(x_{j+\frac12}), \qquad (4.27)$$

where $q_{j,k}^r$ is the Lagrange interpolatory polynomial of the solution $\{U_k\}$ constructed from the stencil $S_{j,k}$:

$$S_{j,k} = \{x_{j-r+1+k}, \dots, x_{j+k}\}, \quad 0 \leq k < r.$$


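For $r = 3$ we have the three stencils:

$$S_{j,0} = \{x_{j-2}, x_{j-1}, x_j\}, \quad S_{j,1} = \{x_{j-1}, x_j, x_{j+1}\}, \quad S_{j,2} = \{x_j, x_{j+1}, x_{j+2}\},$$

and the corresponding reconstructions

$$\begin{aligned}
q_{j,0}^3 &= \frac{15}{8} U_j - \frac{5}{4} U_{j-1} + \frac{3}{8} U_{j-2}, \\
q_{j,1}^3 &= \frac{3}{8} U_{j+1} + \frac{3}{4} U_j - \frac{1}{8} U_{j-1}, \\
q_{j,2}^3 &= \frac{3}{8} U_j + \frac{3}{4} U_{j+1} - \frac{1}{8} U_{j+2}.
\end{aligned} \qquad (4.28)$$

The interpolated value at $x_{j+\frac12}$ that results from the stencil $S_j := \{x_{j-2}, x_{j-1}, x_j, x_{j+1}, x_{j+2}\}$ is given by

$$q_j^5 = \frac{3}{128} U_{j-2} - \frac{20}{128} U_{j-1} + \frac{90}{128} U_j + \frac{60}{128} U_{j+1} - \frac{5}{128} U_{j+2}.$$

Solving (4.27), one obtains the coefficients:

$$C_0^3 = \frac{1}{16}, \quad C_1^3 = \frac{10}{16}, \quad C_2^3 = \frac{5}{16}. \qquad (4.29)$$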

It can be shown that using the same weights as in the cell-average case, i.e., the ones defined by (4.24), fifth order accuracy is obtained in smooth regions using the combination (4.23) as an approximation of $h_{j+\frac12}$, whereas the weights approach zero for the reconstructions corresponding to stencils that contain discontinuities. Therefore, we take the approximation:

$$U^L(x_{j+\frac12}) = \sum_{k=0}^{r-1} w_{j,k}^3\, q_{j,k}^3(x_{j+\frac12}),$$

with $q_{j,k}^r$ defined by (4.28), and $w_j^r$ defined by (4.24) using the optimal weights (4.29).

A similar analysis can be performed for the computation of $U_{j+\frac12}^R$, using the stencils

$$S_{j,k} = \{x_{j-r+2+k}, \dots, x_{j+k+1}\}, \quad 0 \leq k < r.$$


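In this case, and for $r = 3$, the three interpolatory polynomials are given by

$$\begin{aligned}
q_{j,0}^3 &= \frac{3}{8} U_{j+1} + \frac{3}{4} U_j - \frac{1}{8} U_{j-1}, \\
q_{j,1}^3 &= \frac{3}{8} U_j + \frac{3}{4} U_{j+1} - \frac{1}{8} U_{j+2}, \\
q_{j,2}^3 &= \frac{15}{8} U_{j+1} - \frac{5}{4} U_{j+2} + \frac{3}{8} U_{j+3},
\end{aligned}$$

and the optimal weights, which mirror those in (4.29) because the stencils are reflected with respect to $x_{j+\frac12}$, are

$$C_0^3 = \frac{5}{16}, \quad C_1^3 = \frac{10}{16}, \quad C_2^3 = \frac{1}{16}.$$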
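The optimal weights for point-value interpolation can be checked with exact rational arithmetic. The Python sketch below is an illustration only: the function names are hypothetical, the nodes are taken as integers ($\Delta x = 1$, $x_j = 0$), and the weights are obtained by matching, node by node, the coefficients of the interpolant on the $2r-1$ nodes with those of the combination of the $r$ candidate interpolants.

```python
from fractions import Fraction


def lagrange_weights(nodes, x):
    """Coefficients a_i such that p(x) = sum_i a_i * U(nodes[i]), where p is
    the Lagrange interpolant through the given nodes."""
    weights = []
    for i, xi in enumerate(nodes):
        w = Fraction(1)
        for k, xk in enumerate(nodes):
            if k != i:
                w *= Fraction(x - xk) / (xi - xk)
        weights.append(w)
    return weights


def optimal_weights(r, x, first_node):
    """Linear weights C_0, ..., C_{r-1} combining the r candidate interpolants
    (r nodes each) into the interpolant on the union of 2r-1 nodes, at x.
    Candidate stencil k starts at node first_node + k."""
    big_nodes = [Fraction(first_node + i) for i in range(2 * r - 1)]
    big = lagrange_weights(big_nodes, x)
    small = [lagrange_weights(big_nodes[k:k + r], x) for k in range(r)]
    C = []
    for k in range(r):
        # Node k of the large stencil only appears in candidates 0..k, so its
        # coefficient determines C_k once C_0..C_{k-1} are known.
        known = sum(C[m] * small[m][k - m] for m in range(k))
        C.append((big[k] - known) / small[k][0])
    return C


if __name__ == "__main__":
    half = Fraction(1, 2)
    # Left-biased case: stencils start at x_{j-2}, evaluation at x_{j+1/2}.
    print(optimal_weights(3, half, -2))
```

With `first_node = -2` the sketch reproduces the three values in (4.29); the right-biased stencils correspond to `first_node = -1`.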

4.4 The complete integration algorithm

For the sake of completeness and clarity we summarize here the complete numerical algorithm that results from putting together the following ingredients:

• Shu-Osher’s finite-difference formulation for flux reconstruction, described in section 4.1.
• Donat-Marquina’s flux formula, described in section 4.2.
• A fifth order flux reconstruction based on the WENO algorithm, described in section 4.3.
• The third order Runge-Kutta algorithm (3.15) for time integration.

Consider a one-dimensional system:

$$u_t + f(u)_x = 0, \quad (x, t) \in \mathbb{R} \times [0, T], \qquad (4.30)$$

where $u : \mathbb{R} \to \mathbb{R}^m$, and $f : \mathbb{R}^m \to \mathbb{R}^m$, for $m > 1$, with initial data

$$u(x, 0) = u_0(x), \quad x \in \mathbb{R}. \qquad (4.31)$$

We assume a uniform discretization of the interval $[0, 1] \times [0, T]$, defined in the same way as in section 3.1, i.e.,

$$x_j = \left(j + \frac{1}{2}\right) \Delta x, \quad t_n = n \Delta t, \qquad (4.32)$$

where $\Delta x = \frac{1}{N}$ and $\Delta t = \frac{1}{M}$, for $N, M \in \mathbb{N} \setminus \{0\}$. The numerical solution of the problem defined by (4.30) and (4.31) consists of the following steps:


• Compute a discretization of the initial data by $U_j^0 = u_0(x_j)$.

• For $n = 0$ until $M-1$ compute $U^{n+1}$ from $U^n$ as follows:

1. From $\{U_j^n\}_{j=0}^{N-1}$ compute values $\{U_{-3}^n, U_{-2}^n, U_{-1}^n, U_N^n, U_{N+1}^n, U_{N+2}^n\}$, corresponding to the ghost nodes $\{x_{-3}, x_{-2}, x_{-1}, x_N, x_{N+1}, x_{N+2}\}$ using a procedure for artificial boundary conditions, as described in section 3.7.

2. From $\{U_j^n\}_{j=-3}^{N+2}$ compute two reconstructed values $U_{j+\frac12}^L$ and $U_{j+\frac12}^R$ for $j = -1, \dots, N$, according to (4.10). The reconstruction operator $S$ is built as described in section 4.3.2.

3. For each $j$ compute two sets of characteristic variables $W_{p,k}^L$ and $W_{p,k}^R$ and characteristic fluxes $f_{p,k}^L$ and $f_{p,k}^R$ using (4.11), for the value $s_1 = s_2 = 2$, and for $j = -1, \dots, N$.

4. For each $j = -1, \dots, N$ compute two sets of upwind characteristic fluxes $\psi_{j+\frac12}^L$ and $\psi_{j+\frac12}^R$ according to Algorithm 4 or Algorithm 5.

5. Compute the numerical fluxes $\hat f_{j+\frac12}^n(U^n)$ using (4.12), for $j = -1, \dots, N$.

6. Perform the first step of the Runge-Kutta algorithm (3.15), and compute $U_j^{(1)}$ according to

$$U_j^{(1)} = U_j^n - \frac{\Delta t}{\Delta x} \left( \hat f_{j+\frac12}(U^n) - \hat f_{j-\frac12}(U^n) \right),$$

for $j = 0, \dots, N-1$.

7. Repeat steps (1) to (5) replacing $U_j^n$ by $U_j^{(1)}$ and compute numerical fluxes $\hat f_{j+\frac12}^n(U^{(1)})$, for $j = -1, \dots, N$.

8. Perform the second step of the Runge-Kutta algorithm (3.15), and compute $U_j^{(2)}$ according to

$$U_j^{(2)} = \frac{3}{4} U_j^n + \frac{1}{4} U_j^{(1)} - \frac{1}{4} \frac{\Delta t}{\Delta x} \left( \hat f_{j+\frac12}(U^{(1)}) - \hat f_{j-\frac12}(U^{(1)}) \right),$$

for $j = 0, \dots, N-1$.

9. Repeat steps (1) to (5) replacing $U_j^n$ by $U_j^{(2)}$ and compute numerical fluxes $\hat f_{j+\frac12}^n(U^{(2)})$, for $j = -1, \dots, N$.

10. Perform the third step of the Runge-Kutta algorithm (3.15), and compute $U_j^{n+1}$ according to

$$U_j^{n+1} = \frac{1}{3} U_j^n + \frac{2}{3} U_j^{(2)} - \frac{2}{3} \frac{\Delta t}{\Delta x} \left( \hat f_{j+\frac12}(U^{(2)}) - \hat f_{j-\frac12}(U^{(2)}) \right),$$

for $j = 0, \dots, N-1$.
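The three Runge-Kutta stages of steps 6, 8 and 10 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and a first order upwind flux for linear advection with periodic boundaries stands in for the WENO and Donat-Marquina fluxes of steps 1 to 5, so that the sketch stays self-contained.

```python
def rk3_step(u, rhs, dt):
    """One step of the third order Runge-Kutta scheme, with the stage
    coefficients of steps 6, 8 and 10. `rhs(u)` returns the list of values
    -(F_{j+1/2} - F_{j-1/2}) / dx for the state u."""
    u1 = [a + dt * b for a, b in zip(u, rhs(u))]
    u2 = [0.75 * a + 0.25 * b + 0.25 * dt * c
          for a, b, c in zip(u, u1, rhs(u1))]
    return [a / 3 + 2 * b / 3 + 2 * dt * c / 3
            for a, b, c in zip(u, u2, rhs(u2))]


def upwind_rhs(u, dx, speed=1.0):
    """Conservative flux difference with first order upwind fluxes
    (speed > 0) and periodic boundaries. Stand-in for steps 1 to 5."""
    flux = [speed * uj for uj in u]          # F_{j+1/2} = speed * u_j
    # flux[j - 1] wraps around for j = 0, which gives periodicity.
    return [-(flux[j] - flux[j - 1]) / dx for j in range(len(u))]


if __name__ == "__main__":
    dx, dt = 0.25, 0.1
    u0 = [0.0, 1.0, 0.0, 0.0]
    u1 = rk3_step(u0, lambda w: upwind_rhs(w, dx), dt)
    # The conservative form preserves the sum of the values.
    print(sum(u0), sum(u1))
```

Since every stage is written as a flux difference, the sum of the nodal values is preserved up to rounding errors, which is the discrete counterpart of conservation.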

5 Adaptive mesh refinement

In this chapter we introduce the infrastructure needed to build an adaptive mesh refinement algorithm for the numerical solution of hyperbolic systems of conservation laws. We will focus on conservative numerical methods, in particular on the high order numerical scheme described in chapter 4. For the sake of simplicity, we describe here the AMR algorithm in a very simple form, using uniform discretizations of an interval of the real line. In chapter 6 we describe how these ideas extend to two dimensions and we show the details of our actual two-dimensional implementation of the algorithm. A wider description of adaptive mesh refinement in a more general setup has been postponed to Appendix A; the algorithms described in this chapter are particular cases of the general approach described there.

Mesh refinement in a wide variety of forms is nowadays present in almost any algorithm for the integration of hyperbolic equations. As already explained, the formation of discontinuities and small scale features in the solutions of such equations forces the use of very fine meshes, whose scale is smaller than the scale of the flow features to be resolved, in the numerical algorithms.

As stated in the introduction, most of the difficulties associated with the numerical solution of hyperbolic conservation laws come from the lack of smoothness of the solution in some regions of the computational domain. When this situation appears, numerical methods have to cope with weak solutions, which typically are piecewise smooth functions with corners or discontinuities. In complex problems, other features such as turbulence and vorticity can appear. High order methods are able to compute very accurate solutions on meshes coarser than the ones used with lower order methods, and in recent years there has been an active debate on the relative efficiency of low and high order methods. Nowadays, however, the increasing demands on CFD capabilities make clear that it is necessary to develop algorithms that combine high order methods with very fine meshes. The drawback of this approach is the prohibitive computational cost of such problems, even for massively parallel computers.

Any kind of adaptation that enables the use of high resolution only in a part of the computational domain will produce a benefit in terms of computational efficiency. This reduction in the overall cost can be achieved in several ways. Multiresolution algorithms [62, 63, 35, 38], for example, try to reduce the cost by using a high order nonlinear integrator where needed, and switching to a much cheaper interpolation elsewhere. Adaptive grid algorithms try to fit the grid resolution to the needs of the numerical solution by refining the grid only in regions where the solution has non-smooth structure, and holding a coarser grid where such a high resolution is not needed.
Around this idea several adaptive algorithms have been reported in the literature. Among the best known are moving mesh methods [65, 114, 170], multiresolution algorithms, h- and hp-adaptive methods [12] in the context of Galerkin methods, and approximations on non-uniform and unstructured meshes [42, 73]. Combinations of several of these techniques have also been addressed, as, for example, multiresolution algorithms on unstructured meshes [2]. Following a different approach, Berger and Oliger [21, 23, 24] developed the Adaptive Mesh Refinement (AMR) algorithm, with the aim of reducing the overall number of integrations by overlapping grids of different scales, constructing an approach whose main feature is the possibility of performing time refinement as well as space refinement. The algorithm was originally proposed for the implementation of artificial viscosity schemes, and later extended to more general finite volume schemes [22]. A simplified version was described by Quirk in [139]. Since then, a lot of effort has been devoted to the development of new algorithms around the AMR idea. Current applications of AMR cover areas as different as cosmology [1], magnetohydrodynamics (MHD) [52, 7], weather prediction [77] or plasma physics [182]. More and more complex models are being incorporated into AMR algorithms. As an example, the development and implementation of immersed boundary methods [124], with applications in fluid-structure interaction problems, in combination with the AMR technique is currently being investigated [149, 59].

The AMR algorithm is a two-fold adaptive method. It refines the grid by overlapping fine grids over coarse grids in order to obtain grid sizes small enough to resolve the small features of the solution with arbitrary accuracy. The way in which the grids are overlapped also allows refinement in time, in the sense that each grid is integrated with temporal steps adapted to its spatial grid size. The biggest time step allowable for a grid is essentially proportional to the size of its smallest cell, in order to ensure accuracy and stability. Therefore, coarser cells could be evolved with bigger time steps if they were considered in isolation, but, since every cell in the grid has to be evolved to the same time at each time iteration, the time step is fixed for every cell in the grid. The AMR algorithm allows fine cells to be evolved with smaller time steps than coarse cells, by grouping them into independent grids in such a way that all cells that belong to each grid have the same size. Each grid can overlap with coarser and finer grids. Even if the number of cells increases with respect to a grid containing cells of mixed size, one can expect that the total number of integrations required to evolve the solution for a given time step is reduced.
Under favorable circumstances the AMR algorithm requires only a part of the computational power (and hence time) needed by an equivalent fine grid of uniform size or a grid of non constant size. In particular, if the integration algorithm is computationally expensive, as is the case of high order methods, the gain due to the reduction of the overall number of integrations can be very important. The differentiating idea of the AMR algorithm is therefore the creation of a collection of grids, each of them containing cells of constant size, that can be, at least in part, considered in isolation as independent grids.

The AMR algorithm rests on the fact that the solution of a hyperbolic system of conservation laws is composed of waves moving through regions in which the solution is smooth. The main difficulty of HRSC schemes consists of catching and resolving these waves, since in the regions where the solution is smooth the differential form of the equations holds, and the solution can be computed with no major difficulties. The leading issue of AMR is to dynamically adapt the resolution of the grids to the requirements of the actual numerical solution, which changes in time. It is well known that the movement of the waves is governed by the eigenstructure of the Jacobian matrix of the system; in particular, the speed of the waves is controlled by its eigenvalues. This information can be used to design a strategy to decide the moment when a grid has to be refined, so that the AMR algorithm can ensure that, if the waves are initially covered by a fine grid, they will always be, by adapting the grids before the waves can move to a region that is not covered by the fine grid. The different parts involved in an AMR algorithm have to be interrelated and organized in such a way that the aforementioned property is satisfied.

In this chapter we motivate and construct the building blocks for the complete algorithm that results from putting together the integration algorithm described in chapter 4 and a version of the AMR algorithm.

5.1 Motivation

Given a bounded subset $\Omega_1 \subset \mathbb{R}^d$ and due to the hyperbolicity of system (2.1), its solution $u$ at $\Omega_1 \times [t_0, t_0 + \Delta t]$ only depends on the values of $u$ at a superset of $\Omega_1 \times \{t_0\}$ which is determined by the domain of dependence of the equations, which is also a bounded set. Numerical methods mimic this idea, and are designed in such a way that the numerical solution at $\Omega_1 \times [t_0, t_0 + \Delta t]$ provided by the scheme is determined using data from another bounded set, which is defined by the numerical domain of dependence of the method. These concepts have been introduced in section 3.2.2.

Let $\tilde\Omega_1 \times \{t_0\}$ be the numerical domain of dependence of a numerical scheme under consideration. Assume that the CFL condition is imposed, so that the numerical domain of dependence contains the domain of dependence of the equation. In this situation, given an approximation to $u$ on a computational mesh over $\tilde\Omega_1 \times \{t_0\}$, it is then possible to compute approximations to $u(x, t)$, for $(x, t) \in \Omega_1 \times [t_0, t_0 + \Delta t]$, by means of the numerical scheme. The weakness of this exposition is that further applications of this idea to compute an approximation of $u$ on a mesh over $\Omega_1 \times [t_0 + \Delta t, t_0 + 2\Delta t]$ would require the knowledge of approximations of $u$ on $\tilde\Omega_1 \times \{t_0 + \Delta t\}$, but we only have approximations of $u$ on its subset $\Omega_1 \times \{t_0 + \Delta t\}$. The approximated values of $u$ on the “surrounding band” $\hat\Omega_1 := \tilde\Omega_1 \setminus \Omega_1$ must be obtained by an auxiliary procedure. If $u$ is smooth on $\hat\Omega_1$, the values on the given mesh that lie in this band can be accurately interpolated from an approximation of $u$ on a coarse mesh defined on a domain $\tilde{\tilde\Omega}_1 \supseteq \tilde\Omega_1$. Fig. 5.1 illustrates this idea.

Figure 5.1: A region $\Omega_1$ and its surrounding band $\hat\Omega_1$. Data for $\hat\Omega_1$ can be obtained by interpolation from the coarser mesh.

The AMR algorithm can be described by the recursive application of this essential idea to an arbitrarily deep hierarchy of increasingly finer nested grids. At a given time, each grid conceptually contains the flow features that cannot be accurately “predicted” from lower resolution levels. The transient nature of the non-smooth flow features endows the grid hierarchy with a dynamical character. An important feature of the algorithm is the time step adaptation referred to as “local time stepping”, which consists in the evolution of each resolution level by its own time step, instead of using a global time step that would inevitably be small, as dictated by CFL restrictions on the finest mesh. Local time stepping means that the algorithm performs more time steps in a certain resolution level than in the immediately coarser level, and this is another key feature for improving the overall performance of the algorithm.

The algorithm, as described up to this point, only uses information on a coarse grid to predict the required values at the band surrounding a grid at the immediately finer level. Therefore, the approximated solutions at each resolution level would not be synchronized. This lack of synchronization between resolution levels would then be reflected in worse fine resolution predictions that would negatively affect the efficiency of the algorithm. This behavior can be corrected by means of a “projection” of information from fine to immediately coarser resolutions that, although optional, is a key element that determines the foundations of the algorithm.

In a finite volume setting, with the basic scheme evolving approximated cell-averages, it is reasonable to use cell-based mesh-refinement and to project approximated cell-averages at the fine cells covering a coarse cell by just averaging the values at the fine cells. When the basic scheme evolves accurate approximations to point values of the solution, which is the case in Shu-Osher’s finite-difference flux formulation, the most straightforward projection is to organize the grids so that the nodes of a coarse grid belong also to the immediately finer grid, so that the projection amounts to just copying the values at the fine grid to the corresponding position on the coarse grid. But this choice would entail a loss of information, because of the lack of conservation between meshes, and would render unclear the implementation of a conservative scheme. This is the reason for our choice of cell-based grids for our formulation based on point values, where the projection is based on flux projection rather than on solution projection. Fig. 5.2 shows two nested meshes with the nodes of the coarse and fine grids aligned so that the solution can be copied from fine to coarse (top plot) and another pair of meshes where the projection is achieved by means of fluxes thanks to the alignment of the cell boundaries (bottom plot).

Let us briefly describe a one-dimensional version of the algorithm for a scalar equation.
This description will provide an intuitive idea of how the algorithm works and will allow the reader to identify the main parts of the algorithm, to be described for a more general case in further sections.

Figure 5.2: A pair of grids suitable for solution projection (top) and a pair of grids suitable for flux projection (bottom).

5.2 Discretization and grid organization

Consider the problem

$$u_t + f(u)_x = 0, \quad x \in \mathbb{R},\ t > 0, \qquad u(x, 0) = u_0(x), \quad x \in \mathbb{R},$$

and assume that our computational domain is the interval $\Omega = [0, 1]$, and appropriate numerical boundary conditions are imposed where required (see section 3.7), i.e., assume that the integration algorithm can be applied on a fixed mesh defined on $\Omega$. For simplicity we consider an equally spaced mesh $G_0$ composed of $N_0$ cells of length $\Delta x_0 = \frac{1}{N_0}$. As already mentioned, meshes of higher and higher resolution are needed for the computation of small-scale flow features. A set of $L$ meshes with increasing level of refinement can be built from the mesh $G_0$ by considering meshes obtained by the subdivision of each cell in the immediately coarser level in two, i.e., the unit interval is divided into $N_0, \dots, N_{L-1}$ subintervals (cells) of length $\Delta x_\ell = 1/N_\ell$, with $N_\ell = 2^\ell N_0$, $\ell = 0, \dots, L-1$. The centers of those cells will be denoted by $x_j^\ell = (j + \frac12)\Delta x_\ell$, $j = 0, \dots, N_\ell - 1$, $\ell = 0, \dots, L-1$. An example illustrating this construction is depicted in Fig. 5.3.

Figure 5.3: Meshes with increasing resolution obtained by dyadic subdivision of a base mesh, with L = 4 and N0 = 2. The labelled cell centers are $x^0 = 1/4, 3/4$ on the coarse mesh and $x^1 = 1/8, 3/8, 5/8, 7/8$ on the next finer mesh.

A grid $G_\ell$ at resolution level $\ell > 0$ can be interpreted both as a subset of $\{0, \dots, N_\ell - 1\}$ or as the family of cells represented by them. The extent of the grid is the union of the cells indexed by elements of $G_\ell$ and is denoted by $\Omega_\ell(G_\ell)$. The AMR algorithm can be described by the time evolution of a grid hierarchy, which is nothing but a tuple of “grid functions”

$$\left( (u_\ell^{t_\ell}, G_\ell^{t_\ell}) \,/\, \ell = 0, \dots, L-1 \right),$$

where $G_\ell^{t_\ell} \subseteq \{0, \dots, N_\ell - 1\}$ and $u_\ell^{t_\ell} = (u_{\ell,j}^{t_\ell} \,/\, j \in G_\ell^{t_\ell})$, with $u_{\ell,j}^{t_\ell} \approx u(x_j^\ell, t_\ell)$, for some time $t_\ell$. This time evolution starts with $t_\ell = 0$, $\ell = 0, \dots, L-1$ and ends when $t_\ell = T$, $\ell = 0, \dots, L-1$. Meanwhile, $t_\ell \geq t_{\ell'}$ if $\ell \leq \ell'$. We assume the following conditions for the grid hierarchy (see Fig. 5.4 for examples):

• $G_0 = \{0, \dots, N_0 - 1\}$,
• For $\ell > 0$, $G_\ell = \{2i, 2i+1, \text{ for some } i \in G_{\ell-1}\}$.

In plain words, the above conditions mean that the coarsest grid covers the whole computational domain, and that the rest of the grids are obtained by dyadic subdivision of the immediately coarser grid. In particular, the second condition implies that the grids are nested, i.e. $\Omega_\ell(G_\ell) \subseteq \Omega_{\ell-1}(G_{\ell-1})$.

Figure 5.4: A sample grid hierarchy (left) and a set of grids that do not compose a conformal grid hierarchy (right). The grid at level 2 is not contained in the grid at level 1, and the grid at level 3 is not obtained from subdivision of cells of the grid at level 2.

The main building blocks of an AMR algorithm for the numerical solution of a hyperbolic equation are: integration, which amounts to the application of the algorithm described in chapter 4 to each cell in each grid; adaptation, in order to ensure that the adequate degree of refinement is applied to each part of the domain; and projection, which is a procedure to enforce conservation between the different grids in the zones where they overlap.
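The dyadic construction of the meshes and the two conditions imposed on a grid hierarchy can be illustrated with a short Python sketch. The function names and the representation of each grid as a set of cell indices are assumptions made for this illustration; the actual data structures of the implementation are described in chapter 6.

```python
from fractions import Fraction


def cell_centers(level, n0):
    """Centers x_j^l = (j + 1/2) * dx_l of the N_l = 2**level * n0 cells
    that subdivide the unit interval at the given level."""
    n = 2 ** level * n0
    dx = Fraction(1, n)
    return [(j + Fraction(1, 2)) * dx for j in range(n)]


def is_valid_hierarchy(grids, n0):
    """Check that G_0 covers the whole domain and that every finer grid is a
    union of pairs {2i, 2i+1} with i in the immediately coarser grid."""
    if set(grids[0]) != set(range(n0)):
        return False
    for coarse, fine in zip(grids, grids[1:]):
        coarse, fine = set(coarse), set(fine)
        for j in fine:
            parent = j // 2
            children = {2 * parent, 2 * parent + 1}
            if parent not in coarse or not children <= fine:
                return False
    return True


if __name__ == "__main__":
    print(cell_centers(0, 2), cell_centers(1, 2))
    print(is_valid_hierarchy([{0, 1}, {2, 3}, {4, 5, 6, 7}], 2))
    print(is_valid_hierarchy([{0, 1}, {2, 3}, {0, 1}], 2))
```

With `n0 = 2` the first two levels reproduce the cell centers $1/4, 3/4$ and $1/8, 3/8, 5/8, 7/8$, and the last call fails because the level 2 cells are not children of any cell in the level 1 grid.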

5.3 Integration

Let us describe the necessary steps for the time evolution of a grid hierarchy, that initially corresponds to a time $t$ (the same for all grids), up to a time $t + \Delta t_0$, where $\Delta t_0$ is a suitable time step for the integration of the grid $G_0$. In particular, $\Delta t_0$ is chosen such that

$$\Delta t_0 \leq \frac{\Delta x_0}{\max_u |f'(u)|}, \qquad (5.1)$$

and the time steps for the rest of the levels are recursively defined by

$$\Delta t_\ell = \frac{\Delta t_{\ell-1}}{2}, \quad \ell = 1, \dots, L-1. \qquad (5.2)$$

In this way the CFL condition for each grid is satisfied. Note that one time step of a grid G0 corresponds to two of the immediately finer grid, so that all grids can be integrated up to the same time t + ∆t0 . At that point, the process described here is repeated to integrate the equations from time t + ∆t0 up to time t + 2∆t0 , so that it suffices to describe the evolution of the grid hierarchy for a single time step of the coarse grid. The integration is organized in a sequential fashion, based on the following conditions: 1. The integration of a grid is performed after the integration of the immediately coarser grid. 2. After a grid Gℓ , with ℓ > 0, is integrated up to a time t + k∆tℓ , (k ∈ {1, . . . , 2ℓ }), it is not integrated again before the immediately finer grid has been integrated up to time t + k∆tℓ . The pseudo-code of an algorithm that performs the required sequence of integrations on a set of grids G := {G0 , . . . , GL−1 } in order to update the grid at a given refinement level ℓ and finer, from a given time t to time t + ∆tℓ is shown in Fig. 5.5. The call to integrate(Gℓ) performs the application of the numerical scheme to the grid Gℓ . An integration from time t to time t + ∆t0 of the sequence of grids G is achieved by the call update(G, 0). In Fig. 5.5 the gridlist data type stores a grid hierarchy. This data type is explained in detail in chapter 6. As an example, the integration process from time t to time t + ∆t0 for a grid hierarchy composed by three levels would be performed following the order indicated in Fig. 5.6. When all grids have been integrated up to the same time t + ∆t0 , the process is repeated up to the next coarse time step. Note that for the integration of a grid Gℓ , (ℓ > 0) from time t up to e ℓ (Gℓ ) × {t} is required. time t + ∆tℓ , data from the “surrounding band” Ω e ℓ−1 (Gℓ−1 ). On the This data is obtained from spatial interpolation from Ω other hand, for the integration of the same grid from time t + ∆tℓ up to


    Function update(G: gridlist, ℓ: integer)
        integrate(Gℓ)
        if (ℓ < L − 1)
            for k = 1 until 2
                update(G, ℓ + 1)
            end for
        end if
    end function

Figure 5.5: A recursive algorithm to integrate level $\ell$ and finer of a grid hierarchy $G = \{G_0, \dots, G_{L-1}\}$, for one time step of the grid $G_\ell$. A call to update($G$, 0) will integrate the whole grid hierarchy for one time step of the coarsest grid.

    Order of integration   Grid   Time step   Time
    1                      G0     ∆t0         t + ∆t0
    2                      G1     ∆t0/2       t + ∆t0/2
    3                      G2     ∆t0/4       t + ∆t0/4
    4                      G2     ∆t0/4       t + ∆t0/2
    5                      G1     ∆t0/2       t + ∆t0
    6                      G2     ∆t0/4       t + 3∆t0/4
    7                      G2     ∆t0/4       t + ∆t0

Figure 5.6: Sample sequence of integrations for a three-level grid hierarchy


time $t + 2\Delta t_\ell$, data is required at $\tilde\Omega_\ell(G_\ell) \times \{t + \Delta t_\ell\}$. In this case it is obtained by interpolation from $(u^{t}_{\ell-1}, G_{\ell-1})$ and $(u^{t+2\Delta t_\ell}_{\ell-1}, G_{\ell-1})$, which must have been computed in previous steps. This interpolation procedure is described in detail in section 5.6.
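The ordering rules of this section can be checked with a few lines of code. The following is a minimal sketch, not the thesis implementation: it mirrors the recursion of Fig. 5.5, but instead of applying a numerical scheme it only records which level is integrated, with which time step, and which time it reaches. The function names and the choice of measuring time in units of $\Delta t_0$ (with $t = 0$) are assumptions made for illustration.

```python
from fractions import Fraction


def update(level, num_levels, t, dt, log):
    """Record the integration of grid `level` from time t to t + dt,
    then recurse twice into the immediately finer level (Fig. 5.5)."""
    log.append((level, dt, t + dt))
    if level < num_levels - 1:
        for k in range(2):
            # The finer grid takes two half steps covering [t, t + dt].
            update(level + 1, num_levels, t + k * dt / 2, dt / 2, log)


def integration_order(num_levels):
    """Return a list of (level, time step, time reached), in units of dt0."""
    log = []
    update(0, num_levels, Fraction(0), Fraction(1), log)
    return log
```

For three levels, `integration_order(3)` visits the levels in the order 0, 1, 2, 2, 1, 2, 2 and reaches the times 1, 1/2, 1/4, 1/2, 1, 3/4, 1 (in units of $\Delta t_0$), which is the sequence listed in Fig. 5.6.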

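5.4 Projection

Once $(u_\ell^{t_\ell + 2\Delta t_\ell}, G_\ell^{t_\ell})$ has been computed, there is data overlaying $\Omega_\ell(G_\ell^{t_\ell})$ at different resolution levels. It is at this point that the projection of the data at the fine resolution level should be applied to modify the values $u_{\ell-1,j}^{t_\ell + 2\Delta t_\ell}$ of the immediately coarser grid function that correspond to cells overlaid by cells of $G_\ell^{t_\ell}$, and to cells adjacent to them as well, i.e., cells in $G_{\ell-1}^{t_\ell}$ such that their indices $i$ verify $\{2i, 2(i-1), 2(i+1)\} \cap G_\ell^{t_\ell} \neq \emptyset$. In order to explain how the values of the coarse grid are modified, let us analyze the detailed computation of $(u_\ell^{t_\ell + k\Delta t_\ell}, G_\ell^{t_\ell})$, for $k = 1, 2$. Let $i \in G_{\ell-1}^{t_\ell}$ be such that $2i, 2i+1 \in G_\ell^{t_\ell}$. The update of $G_\ell^{t_\ell}$ from $t_\ell$ to $t_\ell + 2\Delta t_\ell$ is performed by means of the following computations, for $j = 2i, 2i+1$:

\[
\begin{aligned}
u_{\ell,j}^{t_\ell + \Delta t_\ell} &= u_{\ell,j}^{t_\ell} - \frac{\Delta t_\ell}{\Delta x_\ell}\left(\hat f^{t_\ell}_{\ell,j+\frac12} - \hat f^{t_\ell}_{\ell,j-\frac12}\right), \\
u_{\ell,j}^{t_\ell + 2\Delta t_\ell} &= u_{\ell,j}^{t_\ell + \Delta t_\ell} - \frac{\Delta t_\ell}{\Delta x_\ell}\left(\hat f^{t_\ell + \Delta t_\ell}_{\ell,j+\frac12} - \hat f^{t_\ell + \Delta t_\ell}_{\ell,j-\frac12}\right).
\end{aligned}
\tag{5.3}
\]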

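From these expressions $u_{\ell,j}^{t_\ell + 2\Delta t_\ell}$ can be written as

\[
u_{\ell,j}^{t_\ell + 2\Delta t_\ell} = u_{\ell,j}^{t_\ell} - \frac{\Delta t_\ell}{\Delta x_\ell}\left(\left(\hat f^{t_\ell}_{\ell,j+\frac12} + \hat f^{t_\ell + \Delta t_\ell}_{\ell,j+\frac12}\right) - \left(\hat f^{t_\ell}_{\ell,j-\frac12} + \hat f^{t_\ell + \Delta t_\ell}_{\ell,j-\frac12}\right)\right).
\tag{5.4}
\]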

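From (5.4) we deduce:

\[
\frac{u_{\ell,2i}^{t_\ell + 2\Delta t_\ell} + u_{\ell,2i+1}^{t_\ell + 2\Delta t_\ell}}{2} = \frac{u_{\ell,2i}^{t_\ell} + u_{\ell,2i+1}^{t_\ell}}{2} - \frac{\Delta t_{\ell-1}}{\Delta x_{\ell-1}}\left(\hat{\hat f}^{t_\ell}_{\ell-1,i+\frac12} - \hat{\hat f}^{t_\ell}_{\ell-1,i-\frac12}\right),
\tag{5.5}
\]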

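where we have defined:

\[
\hat{\hat f}^{t_\ell}_{\ell-1,i+\frac12} = \frac{\hat f^{t_\ell}_{\ell,2i+\frac32} + \hat f^{t_\ell + \Delta t_\ell}_{\ell,2i+\frac32}}{2}, \qquad
\hat{\hat f}^{t_\ell}_{\ell-1,i-\frac12} = \frac{\hat f^{t_\ell}_{\ell,2i-\frac12} + \hat f^{t_\ell + \Delta t_\ell}_{\ell,2i-\frac12}}{2},
\tag{5.6}
\]

and used that $\frac{\Delta t_\ell}{\Delta x_\ell} = \frac{\Delta t_{\ell-1}}{\Delta x_{\ell-1}}$. Therefore, if we redefine accordingly the coarse numerical fluxes, i.e., we impose

\[
\hat f^{t_\ell}_{\ell-1,i\pm\frac12} = \hat{\hat f}^{t_\ell}_{\ell-1,i\pm\frac12},
\tag{5.7}
\]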

and we assume that at time $t_\ell$ the relation

\[
u_{\ell-1,i}^{t_\ell} = \frac{u_{\ell,2i}^{t_\ell} + u_{\ell,2i+1}^{t_\ell}}{2}
\tag{5.8}
\]

holds, then after performing the above correction we have

\[
u_{\ell-1,i}^{t_\ell + 2\Delta t_\ell} = \frac{u_{\ell,2i}^{t_\ell + 2\Delta t_\ell} + u_{\ell,2i+1}^{t_\ell + 2\Delta t_\ell}}{2},
\tag{5.9}
\]

i.e., the same relation holds for time $t_\ell + 2\Delta t_\ell = t_{\ell-1} + \Delta t_{\ell-1}$. Note that the substitutions in (5.7) make sense because the values $\hat{\hat f}^{t_\ell}_{\ell-1,i\pm\frac12}$ correspond to an approximation of the numerical fluxes of the equation at the cell interfaces of the grid $G_{\ell-1}$. This follows from the definition of $\hat{\hat f}^{t_\ell}_{\ell-1,i\pm\frac12}$ in (5.6) and the choice of the nodes $x_i^{\ell-1} = \left(i + \frac12\right)\Delta x_{\ell-1}$ as the cell centers. The values $\hat{\hat f}^{t_\ell}_{\ell-1,i\pm\frac12}$ correspond, respectively, to approximations of the numerical fluxes at the points

\[
x^{\ell}_{2i+\frac32} = (2i+2)\Delta x_\ell = (i+1)\Delta x_{\ell-1} = x^{\ell-1}_{i+\frac12}
\]

and

\[
x^{\ell}_{2i-\frac12} = 2i\,\Delta x_\ell = i\,\Delta x_{\ell-1} = x^{\ell-1}_{i-\frac12}.
\]

These points are the interfaces of the cell whose center is the point $x_i^{\ell-1}$. Note that the same relation does not hold if the nodes are taken as the cell boundary points, i.e., $x_i^\ell = i\Delta x_\ell$. We therefore conclude that, even for a discretization based on point values, a coherent projection can be based on a discrete form of conservation in the sense of cell averages.

In the adaptation procedure, described below, we have enforced that the adapted grids are obtained by dyadic subdivision of the underlying coarser grids. In other words, if a cell of a grid $G_\ell$, whose index can be $2i$ or $2i+1$ for some $i \in G_{\ell-1}$, is marked for refinement, then both cells $2i$ and $2i+1$ are forced to belong to the refined grid. In this way, the numerical fluxes required to build a


    Function update(G: gridlist, ℓ: integer)
        integrate(Gℓ)
        if (ℓ < L − 1)
            for k = 1 until 2
                update(G, ℓ + 1)
            end for
            project(Gℓ+1)
        end if
    end function

Figure 5.7: A recursive algorithm, that includes projection, to integrate level $\ell$ and finer of a grid hierarchy $G = \{G_0, \dots, G_{L-1}\}$, for one time step of the grid $G_\ell$.

conservative projection, in the sense of (5.8) – (5.9), are available in the finer grid. The indicated projection is performed each time a grid has been integrated one coarse time step. We can modify the algorithm in Fig. 5.5 to include the projection as indicated in Fig. 5.7. A call to project(Gℓ) updates the solution corresponding to the grid Gℓ−1 , according to the correction (5.7).
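The argument leading from (5.3) to (5.9) can be verified numerically. The sketch below makes several simplifying assumptions that are not part of the thesis: a periodic one-dimensional domain fully covered by the fine grid, the linear advection equation with a first-order upwind flux instead of the high order scheme of chapter 4, and forward Euler steps as in (5.3). The function names are hypothetical. It advances the fine grid two steps, builds the coarse fluxes by the time averages (5.6), updates the coarse grid with them as in (5.7), and returns both sides of (5.9).

```python
def upwind_fluxes(u, a=1.0):
    """Numerical flux at the right interface j + 1/2 of every cell (a > 0)."""
    return [a * uj for uj in u]


def step(u, fluxes, lam):
    """Conservative update u_j - lam * (f_{j+1/2} - f_{j-1/2}), periodic domain."""
    return [u[j] - lam * (fluxes[j] - fluxes[j - 1]) for j in range(len(u))]


def restrict(u_fine):
    """Average of the two fine cells 2i, 2i+1 inside each coarse cell i."""
    return [0.5 * (u_fine[2 * i] + u_fine[2 * i + 1]) for i in range(len(u_fine) // 2)]


def project_fluxes(f_t, f_t_dt):
    """Coarse fluxes (5.6): the coarse interface i + 1/2 coincides with the fine
    interface 2i + 3/2, which is the right interface of the fine cell 2i + 1."""
    return [0.5 * (f_t[2 * i + 1] + f_t_dt[2 * i + 1]) for i in range(len(f_t) // 2)]


def demo(n_coarse=8, lam=0.5):
    u_fine = [float((j * j) % 7) for j in range(2 * n_coarse)]  # arbitrary data
    u_coarse = restrict(u_fine)                 # relation (5.8) at time t
    f0 = upwind_fluxes(u_fine)
    u1 = step(u_fine, f0, lam)                  # first fine step of (5.3)
    f1 = upwind_fluxes(u1)
    u2 = step(u1, f1, lam)                      # second fine step of (5.3)
    corrected = project_fluxes(f0, f1)          # substitution (5.7)
    # dt/dx is the same on both levels, so the same lam is used.
    return step(u_coarse, corrected, lam), restrict(u2)
```

Both lists returned by `demo()` agree up to rounding, which is the discrete conservation property (5.9). In the actual algorithm only the coarse cells overlaid by, or adjacent to, the fine grid are modified.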

5.5 Adaptation

Another key step in the time evolution of the grid hierarchy is adaptation. The grids have to be adapted because the refinement must follow the motion of the flow features. Consequently, the grids corresponding to the various refinement levels have to be modified according to the current characteristics of the flow. We discuss next some procedures to decide which cells are to be included in a given resolution level, and we give a procedure to decide at which moments of the evolution the grids have to be adapted. Finally, we describe how the integration, projection and adaptation processes can be interleaved.

The main goal of the adaptation process is to ensure that discontinuities that are initially covered by a grid corresponding to a given resolution level continue to be covered by a grid with the same resolution, as long as the discontinuity persists. On the other hand, the adaptation should be able to catch newly generated discontinuities when they are


forming. These goals can be accomplished in many ways, which can be split into two major groups. On the one hand there are the criteria based on accuracy, where estimators of the local truncation error are used and a cell is refined when the estimated local truncation error exceeds a given tolerance. On the other hand there are the criteria based on the flow properties, where measures of the flow features are used instead, and a cell is refined if it is found to be near or within a region containing a feature that needs more resolution.

Accuracy-based criteria are, in general, more reliable: if a correct solution cannot be computed by the numerical method, then the zone is refined in order to gain accuracy. These criteria can be applied to very general problems regardless of their physical nature. Their main drawback is their relatively high computational cost. Within this group we can mention the method described in [22] by Berger and Colella, who proposed the use of estimators of the local truncation error based on Richardson extrapolation. The idea is that for a difference method $H$, which we assume for simplicity to be in the form (3.5), with temporal and spatial order $q$, the quantity $u(x, t + \Delta t) - H_{\Delta t}(u(x, t))$, which is, up to a constant, the local truncation error defined by (3.7), can be estimated by the formula

\[
u(x, t + \Delta t) - H_{\Delta t}(u(x, t)) \approx \frac{H^2_{\Delta t}(u(x, t)) - H_{2\Delta t}(u(x, t))}{2^{q+1} - 2}.
\tag{5.10}
\]

We recall from section 3.1 that $H^2_{\Delta t}(u(x, t))$ denotes two successive applications of the numerical method $H$ with a step $\Delta t$ (see page 42). On the other hand, $H_{2\Delta t}$ is the result of the numerical method applied on a grid of size $2\Delta x$ with a time step $2\Delta t$. The approximation in (5.10) is accurate up to order $O(\Delta x^{q+2})$ if the function is smooth. One can therefore identify non-smooth zones as those where the approximation is not valid, which in practice amounts to flagging those cells that verify a criterion like

\[
\frac{\left|H^2_{\Delta t}(u(x, t)) - H_{2\Delta t}(u(x, t))\right|}{2^{q+1} - 2} > \tau
\]

for a given tolerance $\tau$. Similar techniques have been used by many researchers in different contexts, see e.g. [163, 85]. A review of local truncation estimators can be found in [183].
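A minimal sketch of the flagging criterion based on (5.10) follows. It is only an illustration under stated assumptions: the method $H$ is taken to be the first-order upwind scheme for $u_t + u_x = 0$ on a periodic grid (so $q = 1$), the grid of size $2\Delta x$ is obtained by averaging pairs of cells, and the function names are hypothetical. None of these choices comes from [22] or from the scheme used in this work.

```python
import math

Q = 1  # order of the illustrative upwind scheme


def upwind(u, nu):
    """One step of first-order upwind for u_t + u_x = 0, periodic; nu = dt/dx."""
    return [u[j] - nu * (u[j] - u[j - 1]) for j in range(len(u))]


def coarsen(u):
    """Data on the grid of size 2*dx, obtained by averaging pairs of cells."""
    return [0.5 * (u[2 * i] + u[2 * i + 1]) for i in range(len(u) // 2)]


def error_indicator(u, nu):
    """|H_dt^2 u - H_2dt u| / (2^(q+1) - 2), evaluated on the coarsened grid."""
    fine_twice = coarsen(upwind(upwind(u, nu), nu))
    coarse_once = upwind(coarsen(u), nu)  # 2dt / 2dx equals dt / dx
    denom = 2 ** (Q + 1) - 2
    return [abs(a - b) / denom for a, b in zip(fine_twice, coarse_once)]


def flag(u, nu, tol):
    """Indices of the coarse cells whose indicator exceeds the tolerance."""
    return [i for i, e in enumerate(error_indicator(u, nu)) if e > tol]
```

With 200 cells, `nu = 0.5` and `tol = 1e-2`, a sampled sine wave produces no flagged cells, whereas a square pulse is flagged only in the coarse cells next to its two jumps. Note that the indicator requires two extra applications of the scheme on the fine grid and one on the coarse grid, which is the cost discussed in the text.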


The main drawback of this approach is that extra numerical solutions have to be computed for each point in order to estimate the local truncation error and then decide whether the cell is refined or not. In the problems and methods under consideration in this work, such an expense of computational resources can be unacceptable, since the flux computation (i.e., the application of the numerical method) is precisely the most time-consuming part of the algorithm.

On the other hand, flow-based refinement criteria can reduce the computational complexity of the process of marking cells for refinement. The idea is that one can use some knowledge of the qualitative behavior of the solutions of the equations in order to decide which cells need to be flagged for refinement. Typical flow characteristics that one might want to detect are contact discontinuities and shock waves (jump discontinuities) and rarefaction waves (jumps in the first derivative at the tail and the head of the rarefaction). Since they all constitute variations in the solution or in the gradient of the solution, it could be enough to use information coming from the first and second derivatives of the solution as an indicator of the presence of such discontinuities. This is the approach used, for example, by Quirk [139], who used approximations to the gradient based on finite differences as estimators. If the change in some seminorm of the gradient of the solution (e.g., the absolute value of the density component of the gradient for the Euler equations, or some norm of the gradient for general equations) between two adjacent cells is above a given tolerance, then both cells are flagged for refinement. More precisely, a cell is refined if it verifies a condition of the type

\[
\frac{|u(x + \Delta x, t) - u(x, t)|}{\Delta x} > \tau_g,
\tag{5.11}
\]

where $\tau_g$ is a given tolerance and $u$ is any quantity of interest. In other contexts the quantities used as indicators may vary. In [74], for example, a flagging procedure based on vorticity is used. Indicators based on different characteristics of the flow are used in [138, 193] for MHD problems. The flow-based indicator has to be tailored to the specific application, and it presents the problem of distinguishing the continuous features at a discrete level. Numerical methods introduce some smearing of the discontinuities, and it is a serious problem to tell apart, for example, a continuous function with a steep gradient from a jump discontinuity.

Regardless of the approach used, be it flow-based or accuracy-based, a thresholding criterion has to be supplied to actually decide whether a particular cell needs refinement or not. The choice of the thresholds is also


problem-dependent and has an important influence on the final result. If strict threshold values are chosen, more refinement will be needed, thus increasing the computational cost of the algorithm. If the threshold values are relaxed, then less refinement will be required, but the risk of leaving unrefined some zones that should be refined increases.

Combinations of sensors based on the local truncation error estimation with sensors based on flow features have also been considered [34, 150]. Other authors have studied more general estimators of the local truncation error based on the interpolation reconstruction of the operators [94, 107].

Our proposal, which is similar to the one in [35, 38], mainly consists in including in the grid those cells whose associated values cannot be accurately predicted by interpolation from the next coarser level, thus ensuring that values at cells not refined at some level can be accurately predicted from coarser values. More precisely, for our cell-centered approach, if $x_j^\ell = \left(j + \frac12\right)\Delta x_\ell$ is the center of a cell belonging to a grid $G_\ell^t$ and $I(u^t_{\ell-1}, x)$ is an interpolation operator defined on the data $u^t_{\ell-1} = \{u^t_{\ell-1,i}\}_{i \in G^t_{\ell-1}}$, then the cell defined by $x_j^\ell$ will be selected for refinement if the difference

\[
\left|u^t_{\ell,j} - I(u^t_{\ell-1}, x_j^\ell)\right|
\tag{5.12}
\]

is above a given tolerance $\tau_p$. Note that only the cells present in the current grid are considered for refinement. New cells are included only because of the addition of some extra cells around each marked cell. We also ensure that the refined grid is obtained by subdivision of coarse cells: if a cell $x^t_{\ell,j}$ is selected for refinement, then every cell that overlaps the same coarse cell as $x^t_{\ell,j}$ is also included in the refined grid. Further, we also include a cell in the refinement list if the modulus of the discrete gradient, computed in the coarser grid, exceeds some large threshold, so that shock formation can be detected from steepened data. For the discrete gradient we use the approximation:

\[
\left|\frac{\partial u}{\partial x}(x_j^{\ell-1}, t)\right| \approx \frac{\max\left(\left|u^t_{\ell-1,j+1} - u^t_{\ell-1,j}\right|, \left|u^t_{\ell-1,j} - u^t_{\ell-1,j-1}\right|\right)}{\Delta x_\ell}.
\tag{5.13}
\]

The last observation for this refinement procedure is that it should be performed from fine to coarse resolution levels to ensure that at every moment of the update process it holds that $\Omega_\ell(G^t_\ell) \subseteq \Omega_{\ell-1}(G^t_{\ell-1})$. We also enforce the inclusion $\Omega_\ell(G^t_\ell) \supseteq \Omega_{\ell+1}(G^t_{\ell+1})$, so that the whole sequence


of grids verifies the desired inclusions. Finally note that the process of computing data at the corresponding surrounding bands is possible because the grids are nested, and this implies that $\tilde\Omega_\ell(G^t_\ell) \subseteq \tilde\Omega_{\ell-1}(G^t_{\ell-1})$. See page 85 for the definition of the “surrounding band”.

Once the new grid $\hat G_\ell$, which verifies $\Omega_\ell(\hat G^t_\ell) \subseteq \Omega_{\ell-1}(G^t_{\ell-1})$, has been computed, one then sets

\[
\hat u^t_{\ell,j} =
\begin{cases}
I(u^t_{\ell-1}, x_j^\ell) & \text{if } j \in \hat G^t_\ell \setminus G^t_\ell, \\
u^t_{\ell,j} & \text{if } j \in G^t_\ell,
\end{cases}
\tag{5.14}
\]

i.e., the value at the $j$-th cell is interpolated from data at the next coarser level for cells not in $G^t_\ell$. The refined grid is therefore defined by $(\hat G^t_\ell, \hat u^t_\ell)$. Discrete boundary conditions are also applied if the grid overlaps the domain boundary.

To complete the description of the adaptation process, it remains to decide when a given grid has to be adapted in order to prevent discontinuities from moving from a fine to a coarse grid. As the flow evolves in time, a discontinuity can move from one cell to another during a time step. At this point we observe that the CFL condition, which is imposed for the numerical method to be stable, guarantees that, during a time interval $[t, t + \Delta t_\ell]$, the information about the properties of the fluid can move, at most, a distance $\Delta x_\ell$ around its initial location. This fact can be seen by looking at the CFL condition (cf. (5.1) and (5.2))

\[
\max_u |f'(u)| \le \frac{\Delta x_\ell}{\Delta t_\ell},
\]

which indicates that the maximum characteristic speed of the equation, given by $\max_u |f'(u)|$, is bounded by the “grid speed” $\frac{\Delta x_\ell}{\Delta t_\ell}$. The maximum distance that the information about the properties of the fluid can move during a time step $\Delta t_\ell$ is thus given by $\max_u |f'(u)| \cdot \Delta t_\ell \le \Delta x_\ell$. In Fig. 5.8 two lines with slopes $\frac{\Delta t_\ell}{\Delta x_\ell}$ and $-\frac{\Delta t_\ell}{\Delta x_\ell}$ are shown on a space-time grid of spatial size $\Delta x_\ell$ and temporal size $\Delta t_\ell$. These lines represent information propagating with speeds $\frac{\Delta x_\ell}{\Delta t_\ell}$ and $-\frac{\Delta x_\ell}{\Delta t_\ell}$, respectively, and determine the maximum distance that the information about the properties of the fluid can move during a time interval, so that data at a point $x$ can only affect the solution in the interval $[x - \Delta x_\ell, x + \Delta x_\ell]$ during the time interval $[t, t + \Delta t_\ell]$. This suggests that a band of width at least one cell has to be added to any grid in order to prevent discontinuities from escaping the grid during one time step.
Therefore, a strategy for the adaptation of the grids would consist in adapting each grid after each of its integrations, adding the required extra cells to the adapted grid. In practice, a band of width equal to one


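[Figure 5.8: space-time diagram over the interval $[x - \Delta x_\ell, x + \Delta x_\ell]$ (horizontal axis) and the times $t$ and $t + \Delta t_\ell$ (vertical axis), showing the lines with slopes $\frac{\Delta t_\ell}{\Delta x_\ell}$ and $-\frac{\Delta t_\ell}{\Delta x_\ell}$ and lines with slopes equal to the reciprocal of a characteristic speed.]

Figure 5.8: An illustration of the movement of the information of the characteristics of the fluid. The solid lines with slopes $\frac{\Delta t_\ell}{\Delta x_\ell}$ and $-\frac{\Delta t_\ell}{\Delta x_\ell}$ determine the maximum velocities. The lines corresponding to information propagating with characteristic speeds (dashed lines) necessarily fall in the region determined by them.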


    Function update(G: gridlist, ℓ: integer)
        integrate(Gℓ)
        if (ℓ < L − 1)
            for k = 1 until 2
                update(G, ℓ + 1)
            end for
            project(Gℓ+1)
            adapt(Gℓ+1)
        end if
    end function

Figure 5.9: A recursive algorithm, that includes projection, integration and adaptation, to update level $\ell$ and finer of a grid hierarchy $G = \{G_0, \dots, G_{L-1}\}$, for one time step of the grid $G_\ell$.

cell of the coarser grid (i.e., two cells of the grid Gℓ ) is added instead, allowing for the refinement of the grids after each integration of the coarser grid. Performing the refinement in this fashion has the advantage that, at the moment when the grid is to be refined, data from the coarse grid is available for the same time, thus enabling the procedure described above for the computation of the numerical solution for the adapted grid, as described in formula (5.14). The algorithm in Fig. 5.7, modified to include the adaptation process, is shown in Fig. 5.9, and the sequence of integrations, adaptations and projections for the example used in Fig. 5.6 is shown in Fig. 5.10. The algorithm in Fig. 5.9 is recursive, as is the nature of the sequence of integrations, adaptations and projections, and most implementations use recursion, even if it is not natively supported in some programming languages, like FORTRAN 77 [5] (see e.g. [139]). The algorithm can be, however, implemented in a sequential form as shown in Fig. 5.11.
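The refinement criteria (5.12) and (5.13), the dyadic subdivision, the extra band of one coarse cell and the fill rule (5.14) described in this section can be combined as in the following one-dimensional sketch. It is a hedged illustration for a single pair of levels, not the implementation of this work: the interpolation operator $I$ is assumed to be linear interpolation between coarse cell centers (which gives the weights 3/4 and 1/4 for the cell-centered geometry), indices are clamped at the domain boundary, the fine grid is stored as a dictionary, and all names are hypothetical. The nestedness with respect to other levels is not enforced here.

```python
def predict(u_coarse, j):
    """Linear interpolation of coarse cell-centre data at the fine centre x_j."""
    i = j // 2                               # coarse cell containing fine cell j
    nb = i - 1 if j % 2 == 0 else i + 1      # nearest coarse neighbour
    nb = min(max(nb, 0), len(u_coarse) - 1)  # clamp at the domain boundary
    return 0.75 * u_coarse[i] + 0.25 * u_coarse[nb]


def coarse_gradient(u_coarse, i, dx):
    """Discrete gradient (5.13) on the coarse grid; dx is the spacing used there."""
    left = u_coarse[max(i - 1, 0)]
    right = u_coarse[min(i + 1, len(u_coarse) - 1)]
    return max(abs(right - u_coarse[i]), abs(u_coarse[i] - left)) / dx


def adapt(u_coarse, u_fine, dx, tol_p, tol_g, buffer_coarse=1):
    """Return the sorted fine indices of the adapted grid.
    u_fine maps each fine index j of the current grid to its value."""
    n = len(u_coarse)
    # Criterion (5.12): mark the parent of every badly predicted fine cell.
    marked = {j // 2 for j, v in u_fine.items()
              if abs(v - predict(u_coarse, j)) > tol_p}
    # Criterion (5.13): mark coarse cells with a very steep gradient.
    marked |= {i for i in range(n) if coarse_gradient(u_coarse, i, dx) > tol_g}
    # Add the safety band of coarse cells around every marked cell.
    grown = {k for i in marked
             for k in range(i - buffer_coarse, i + buffer_coarse + 1)
             if 0 <= k < n}
    # Dyadic subdivision: both children of each selected coarse cell.
    return sorted(j for i in grown for j in (2 * i, 2 * i + 1))


def fill(u_coarse, u_fine, new_grid):
    """Rule (5.14): keep existing fine values, interpolate the new ones."""
    return {j: u_fine[j] if j in u_fine else predict(u_coarse, j) for j in new_grid}
```

For instance, if a jump has moved towards the right end of the current fine grid, `adapt` returns a grid shifted in the same direction, and `fill` supplies interpolated values for the newly created cells while the cells that are dropped simply disappear from the dictionary.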

5.6 Grid interpolation

In the description of the AMR algorithm we have highlighted the fact that, at some stages of the integration process, interpolation from coarse to fine grid is required. We describe here the interpolants used in such cases.

    Step   Grid      Action      Time step   Time
    1      G0        Integrate   ∆t0         t + ∆t0
    2      G1        Integrate   ∆t0/2       t + ∆t0/2
    3      G2        Integrate   ∆t0/4       t + ∆t0/4
    4      G2        Integrate   ∆t0/4       t + ∆t0/2
    5      G2 → G1   Project     −           t + ∆t0/2
    6      G2        Adapt       −           t + ∆t0/2
    7      G1        Integrate   ∆t0/2       t + ∆t0
    8      G2        Integrate   ∆t0/4       t + 3∆t0/4
    9      G2        Integrate   ∆t0/4       t + ∆t0
    10     G2 → G1   Project     −           t + ∆t0
    11     G2        Adapt       −           t + ∆t0
    12     G1 → G0   Project     −           t + ∆t0
    13     G1        Adapt       −           t + ∆t0

Figure 5.10: Sample sequence of integrations, projections and adaptations for a three-level grid hierarchy


    Function update(G)
        integrate(G0)
        ℓ = 1
        sℓ = 0, 1 ≤ ℓ ≤ L − 1
        while 1 ≤ ℓ < L do
            integrate(Gℓ)
            sℓ = sℓ + 1
            while ℓ = L − 1 and sℓ < 2 do
                integrate(Gℓ)
                sℓ = sℓ + 1
            end while
            if sℓ < 2 or ℓ < L − 1 then
                ℓ = ℓ + 1
            else
                while ℓ ≥ 1 and sℓ = 2 do
                    project(Gℓ)
                    adapt(Gℓ)
                    sℓ = 0
                    ℓ = ℓ − 1
                end while
            end if
        end while
    end function

Figure 5.11: An iterative algorithm to update a grid hierarchy. The counter sℓ stores the number of time steps taken by the grid Gℓ since its last projection; as in Fig. 5.9, project(Gℓ) updates the grid Gℓ−1.
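The equivalence between the recursive algorithm of Fig. 5.9 and the iterative algorithm of Fig. 5.11 can be checked by logging the operations instead of performing them. The following sketch is an assumption-laden illustration: the routines integrate, project and adapt are replaced by log entries, the names are hypothetical, and the iterative version projects and adapts the level $\ell$ that has just completed its two sub-steps, with an explicit guard $\ell \ge 1$ in the inner loop.

```python
def update_recursive(level, num_levels, log):
    """Recursive update of Fig. 5.9, recording each operation."""
    log.append(("integrate", level))
    if level < num_levels - 1:
        for _ in range(2):
            update_recursive(level + 1, num_levels, log)
        log.append(("project", level + 1))  # fine level level+1 onto level
        log.append(("adapt", level + 1))


def update_iterative(num_levels):
    """Iterative update following Fig. 5.11, recording each operation."""
    log = [("integrate", 0)]
    steps = [0] * num_levels  # steps[l]: sub-steps taken by grid l
    level = 1
    while 1 <= level < num_levels:
        log.append(("integrate", level))
        steps[level] += 1
        while level == num_levels - 1 and steps[level] < 2:
            log.append(("integrate", level))
            steps[level] += 1
        if steps[level] < 2 or level < num_levels - 1:
            level += 1
        else:
            while level >= 1 and steps[level] == 2:
                log.append(("project", level))
                log.append(("adapt", level))
                steps[level] = 0
                level -= 1
    return log
```

For a three-level hierarchy both versions produce the 13 operations listed in Fig. 5.10, in the same order.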


Let us recall the integration algorithm shown in Fig. 5.5 when applied to a grid hierarchy $G = \{G_0, G_1\}$ composed of two refinement levels. Initially both grids correspond to the same time $t$ and we assume that all the data required for the integration is available for all grids, including the ghost cells that compose their respective “surrounding bands”. The first step is to integrate the grid $G_0$ from time $t$ up to time $t + \Delta t_0$. Then the grid $G_1$ is integrated up to time $t + \Delta t_1 = t + \frac{\Delta t_0}{2}$. At this point we notice that data corresponding to the ghost cells that compose the “surrounding band” of $G_1$ is required to perform the next integration of the grid $G_1$. It can happen that the ghost cells overlap the boundary of the computational domain, and in that case a numerical solution for them is computed from the boundary conditions. Otherwise, in order to obtain a numerical solution for the ghost cells corresponding to time $t + \Delta t_1$ one has two possibilities:

• Perform an integration from time $t$ to time $t + \Delta t_1$ of the cells in $G_0$ required to compute data on the ghost cells of $G_1$.
• Interpolate in time the numerical solutions corresponding to the grid $G_0$ for times $t$ and $t + \Delta t_0$.

The first option should produce a more accurate approximation to the data that we aim to compute, but the second option is much cheaper in terms of computational cost. These two approaches give exactly the same result if the time integration procedure is the forward Euler method, but are different in general (and in the particular case of the third order Runge-Kutta method used in this work). The fact that the ghost cells are away from discontinuities allows us to use the second approach, i.e., time interpolation between the known solutions at times $t$ and $t + \Delta t_0$. Since we only know these two approximations in the grid $G_0$, the only choice is linear interpolation, which amounts to computing

\[
u^{t + \Delta t_1}_{0,i} = \frac{u^{t}_{0,i} + u^{t + \Delta t_0}_{0,i}}{2}
\]

for some cells $i \in G_0$. The actual nodes where this interpolation is performed depend on the spatial interpolation to be applied on them. This provides data at time $t + \Delta t_1$ for some nodes in the grid $G_0$. This data is next interpolated in space to obtain values for the ghost nodes of the grid $G_1$. In particular, if linear interpolation in space is used, data is required in the nodes of $G_0$ whose associated cells overlap ghost cells of $G_1$ and in one extra coarse node around them. If a third order Lagrange


interpolant is used instead, we need to perform time interpolation for the coarse cells that overlap ghost nodes of $G_1$ and for a “band” of two coarse nodes around them. An illustration of this process is depicted in Fig. 5.12, for third order interpolation in space.

We see from the above example that there are two situations where a grid needs data that may not be available for its refinement level. In the first case, a fine grid needs to fill its ghost nodes with data, and a coarse grid corresponding to the same time exists (Fig. 5.12(f)). In this case a spatial interpolation is applied. The second case corresponds to the situation where no coarse grid for the same time exists, and therefore interpolation between two coarse grids, corresponding to different times, has to be performed prior to the spatial interpolation (Fig. 5.12(c) - 5.12(d)).

A more general case is depicted in Fig. 5.13, where three grids are used. In that example space-time interpolation is required for the finest grid $G_2$ after steps (d) and (g), and for the intermediate grid $G_1$ after step (c). Interpolation in space only is required for the grid $G_2$ after steps (e) and (h) and for the grid $G_1$ after step (f).
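The two-stage procedure described above (interpolation in time on the coarse grid, followed by interpolation in space to the fine ghost nodes) can be sketched as follows. This is a simplified illustration that assumes linear interpolation in space rather than the third order Lagrange interpolant of Fig. 5.12, cell-centered nodes $x_j^\ell = (j + \frac12)\Delta x_\ell$ with a refinement factor of two, and ghost nodes whose coarse neighbours all lie inside the coarse grid. The function names and the parameter `theta` are hypothetical.

```python
def interpolate_time(u_old, u_new, theta):
    """Linear interpolation in time; theta = 0 gives u_old, theta = 1 gives u_new."""
    return [(1 - theta) * a + theta * b for a, b in zip(u_old, u_new)]


def interpolate_space(u_coarse, j):
    """Linear interpolation from coarse cell centres to the fine centre x_j."""
    i = j // 2                            # coarse cell containing fine cell j
    nb = i - 1 if j % 2 == 0 else i + 1   # extra coarse node on the near side
    return 0.75 * u_coarse[i] + 0.25 * u_coarse[nb]


def ghost_values(u_old, u_new, theta, ghost_indices):
    """Fill fine ghost nodes at time t + theta * dt0 from coarse data at t and t + dt0."""
    u_mid = interpolate_time(u_old, u_new, theta)
    return {j: interpolate_space(u_mid, j) for j in ghost_indices}
```

With `theta = 0.5` the time interpolation reduces to the average of the two coarse solutions, as in the formula above for $t + \Delta t_1$. When a coarse solution exists at the required time, `interpolate_space` alone is applied.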

5.6.1 Grid interpolation and Runge-Kutta time integration

Most semi-discrete methods use a multi-stage time integration method, such as the three-stage Runge-Kutta method (3.15) used in this work. Let us recall the algorithm, which reads as follows:

\[
\begin{aligned}
U^{(1)} &= U^n - \Delta t\, D(U^n), \\
U^{(2)} &= \frac34 U^n + \frac14 U^{(1)} - \frac14 \Delta t\, D(U^{(1)}), \\
U^{n+1} &= \frac13 U^n + \frac23 U^{(2)} - \frac23 \Delta t\, D(U^{(2)}).
\end{aligned}
\tag{5.15}
\]

Note that the values corresponding to a stage of the Runge-Kutta algorithm are obtained by means of a combination of the values corresponding to the previous stages and a spatial operator, which we have denoted by $D$, applied to the values of the previous stage. The operator $D$ is, in our case, the divided difference of the numerical fluxes corresponding to each cell, these fluxes being computed according to the algorithm described in chapter 4. The algorithm can thus be written, for a single node $x_j$, as

[Figure 5.12, panels (a) to (f): space-time diagrams with time levels $t$, $t + \Delta t_1$ and $t + \Delta t_0$.
(a) The coarse grid is integrated and exists at times $t$ and $t + \Delta t_0$.
(b) The fine grid is integrated up to time $t + \Delta t_1$.
(c) Linear interpolation in time is performed on some coarse nodes.
(d) Interpolation in space is performed to fill the ghost nodes of the fine grid with a numerical solution at time $t + \Delta t_1$.
(e) The fine grid is integrated up to time $t + \Delta t_0$.
(f) Interpolation in space is performed to fill the ghost nodes of the fine grid with a numerical solution at time $t + \Delta t_0$.]

Figure 5.12: Illustration of the data interpolation process. The coarse nodes are depicted with circles, and the fine nodes with squares. The X signs indicate ghost nodes of the fine grid that do not have a numerical solution.

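\[
\begin{aligned}
U_j^{(1)} &= U_j^n - \frac{\Delta t}{\Delta x}\left(\hat f(U^n)_{j+\frac12} - \hat f(U^n)_{j-\frac12}\right), \\
U_j^{(2)} &= \frac34 U_j^n + \frac14 U_j^{(1)} - \frac14 \frac{\Delta t}{\Delta x}\left(\hat f(U^{(1)})_{j+\frac12} - \hat f(U^{(1)})_{j-\frac12}\right), \\
U_j^{n+1} &= \frac13 U_j^n + \frac23 U_j^{(2)} - \frac23 \frac{\Delta t}{\Delta x}\left(\hat f(U^{(2)})_{j+\frac12} - \hat f(U^{(2)})_{j-\frac12}\right).
\end{aligned}
\tag{5.16}
\]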

During each intermediate step of the Runge-Kutta integration the numerical fluxes are computed, and therefore the ghost nodes need to be filled with data. The steps of the Runge-Kutta algorithm are sequentially applied to each grid separately, so we consider a single grid, which contains the numerical solution for a given time $t$. Assume that the ghost nodes contain a valid numerical solution, obtained by spatial interpolation from a coarser grid or from the boundary conditions. Let us analyze each step of the algorithm in order to provide a way to compute data for the ghost cells. We mainly need to identify each intermediate stage of the Runge-Kutta algorithm with a precise time, in order to be able to perform the space-time interpolation process described above.

Assume that $U^n$ is the numerical solution at a given time $t$. The intermediate result $U^{(1)}$ naturally corresponds to time $t + \Delta t$, since it is nothing but the value that results from an iteration of the forward Euler method with a time step $\Delta t$. The intermediate values $U^{(2)}$ can be interpreted as values that correspond to time $t + \frac{\Delta t}{2}$. This can be seen, for example, by writing

\[
U^{(2)} = \frac34 U^n + \frac14 U^*,
\tag{5.17}
\]

where $U^* = U^{(1)} - \Delta t\, D(U^{(1)})$. $U^*$ results from advancing $U^{(1)}$ in time a step $\Delta t$ using the forward Euler method, so $U^*$ can be interpreted as a solution that corresponds to time $t + 2\Delta t$. Then, (5.17) is the result of applying linear interpolation in time to the pairs $(t, U^n)$ and $(t + 2\Delta t, U^*)$, for the time $t + \frac{\Delta t}{2}$. The final value $U^{n+1}$ corresponds of course to time $t + \Delta t$. As a remark, note that the same procedure used for $U^{(2)}$, if applied to $U^{n+1}$, shows that it can be considered as an interpolated value at time $t + \Delta t$ using the pairs $(t, U^n)$ and $(t + \frac32 \Delta t, U^{(2)} - \Delta t\, D(U^{(2)}))$.
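One way to check the identification of the stages with precise times is to apply the scheme (5.15) to the trivial problem $dU/dt = 1$, that is, $D(U) = -1$, whose solution started from $U(t) = t$ is the time itself, so that each stage value can be read as the time it represents. The sketch below does this with exact rational arithmetic; the function name is hypothetical and the test problem is an assumption chosen only for illustration.

```python
from fractions import Fraction


def rk3_stage_times(t, dt):
    """Apply (5.15) to dU/dt = 1, i.e. D(U) = -1, starting from U = t.
    Returns the values of U^(1), U^(2) and U^(n+1), which are the stage times."""

    def D(U):
        return -1

    U0 = t
    U1 = U0 - dt * D(U0)
    U2 = Fraction(3, 4) * U0 + Fraction(1, 4) * U1 - Fraction(1, 4) * dt * D(U1)
    U3 = Fraction(1, 3) * U0 + Fraction(2, 3) * U2 - Fraction(2, 3) * dt * D(U2)
    return U1, U2, U3
```

With `t = Fraction(0)` and `dt = Fraction(1)` the three values are 1, 1/2 and 1, in agreement with the times $t + \Delta t$, $t + \frac{\Delta t}{2}$ and $t + \Delta t$ assigned to $U^{(1)}$, $U^{(2)}$ and $U^{n+1}$ in the text.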

[Figure 5.13, panels (a) Initial state, (b) Step 1, (c) Step 2, (d) Step 3, (e) Step 4, (f) Step 5, (g) Step 6, (h) Step 7. In each panel the horizontal axis shows the grids $G_0$, $G_1$, $G_2$ and the vertical axis shows the times $t$, $t + \frac{\Delta t_0}{4}$, $t + \frac{\Delta t_0}{2}$, $t + \frac{3\Delta t_0}{4}$ and $t + \Delta t_0$.]

Figure 5.13: A graphical representation of the sequence of integrations of Fig. 5.6.

6 Implementation and parallelization of the algorithm

In this chapter we describe the AMR algorithm in the form used in our 2D implementation. For the sake of clarity, we have followed up to now a very simple approach, based on one-dimensional examples with equally spaced points. The algorithm can, however, be applied to much more general grids, as described in appendix A. In this chapter we focus on our implementation of the algorithm in 2D, which assumes a particular organization of the grids that, in turn, imposes some requirements and allows some simplifications in some auxiliary procedures. We describe the data structures used in the program and the main routines implemented for the different parts of the algorithm in a generic form, using pseudocode, to make the explanation more understandable than the actual implementation, written in ANSI C, which contains a


lot of auxiliary code that is not essential for the comprehension of the algorithms. The chapter is divided into two sections, where we respectively explain the sequential and parallel implementation of the algorithm. Albeit our actual code is parallel (the sequential version is exactly the parallel code running on a single processor), we prefer to split the explanation in two parts, delaying the concerns of the parallelization to a separate section, and focus initially on what would be a pure sequential implementation. In section 6.1 we explain how we have implemented the major parts of the algorithm. In the actual, parallel code, each of these parts runs at each processor almost independently from the others, except for some data transfers between processors, and it will be more understandable to add these parts after the implementation of each building block has been described. Along with other parallelization issues, this is the goal of section 6.2.

6.1 Sequential implementation

Managing grids composed of isolated points is complicated. Processes such as flow integration, grid interpolation, projection of the solution, etc. would need to process information related to the grid to decide, for example, whether a point is surrounded by points of the same resolution or not, or where a point lies. To manage that kind of information, additional data structures containing it can be created and updated dynamically in the code, but this is impractical from the computational point of view, because substantial computational time and memory need to be used to process and dynamically update the required information for each point. On the other hand, the particular type of numerical method used in this work for flow integration, which is based on Shu-Osher’s approach, relies on a dimensional splitting approach, which is well suited for structured, Cartesian grids, but not for other kinds of grid.

For Cartesian grids, a common approach for Adaptive Mesh Refinement is to organize the grids into rectangular patches, rather than considering the nodes in isolation. This approach implies that some nodes that, in principle, do not belong to a grid are included in it in order to compose the rectangular patches, thus requiring more integrations, but the savings in computational time due to the lack of decision-making mechanisms in the code, and the simplification of the final implementation, justify this approach. The number of redundant nodes included in the patches can, on the other hand, be controlled by the user, allowing a balance between redundancy and efficiency. This approach, with slight variants, has been adopted in many implementations for many different applications, see e.g. [132, 137, 192, 193].

In the rest of the section we describe our actual implementation of the algorithm, focusing on the 2D version. We start by describing the organization of the grid hierarchies and associated numerical solutions, and we later analyze the implementation of the main processes involved in the algorithm, namely adaptation, projection and integration. Although our numerical method for the flow integration, and hence our AMR implementation, is based on nodal values rather than cell-average values, in our description we will mix cells and nodes in order to clarify the concepts explained. Some ideas regarding grid nestedness, flux projection or grid adaptation, for example, can be explained more clearly with this approach.

6.1.1 Hierarchical grid system

Our implementation of the AMR algorithm uses a hierarchical grid system composed of a set of Cartesian coarse mesh patches, which constitutes level 0 of the hierarchy and defines the computational domain. These patches can be refined locally by defining finer mesh patches that form level 1 of the hierarchy. The finer patches are obtained by subdividing groups of coarse cells that have been marked for refinement. This process can be repeated to obtain even finer patches at level 2, and grids of the desired resolution can be obtained by iterating it. We will use the term grid to refer to the set of mesh patches that belong to the same refinement level, while the terms mesh, patch and mesh patch will be used interchangeably to refer to a single rectangular patch. Let us consider the problem:

\[
\begin{aligned}
u_t(x,y,t) + f(u(x,y,t))_x + g(u(x,y,t))_y &= 0, & (x,y,t) &\in \Omega \times [0,T],\\
u(x,y,0) &= u_0(x,y), & (x,y) &\in \Omega,
\end{aligned}
\tag{6.1}
\]

where for simplicity we assume $\Omega = [0,1]^2$. In order to define a uniform Cartesian discretization of $\Omega$ we proceed similarly to the one-dimensional case, described in sections 3.1 and 5.2. We take positive integer numbers $N_0^x$ and $N_0^y$ and we define the grid sizes by $\Delta x_0 = 1/N_0^x$ and $\Delta y_0 = 1/N_0^y$. This defines a discretization given by the points

\[
(x_i^0, y_j^0) = \left( \left(i + \tfrac{1}{2}\right)\Delta x_0,\ \left(j + \tfrac{1}{2}\right)\Delta y_0 \right), \qquad 0 \le i < N_0^x,\quad 0 \le j < N_0^y.
\tag{6.2}
\]

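Each point $(x_i^0, y_j^0)$ defines a cell $c^0_{i,j}$ by

\[
c^0_{i,j} = \left[ x_i^0 - \frac{\Delta x_0}{2},\ x_i^0 + \frac{\Delta x_0}{2} \right] \times \left[ y_j^0 - \frac{\Delta y_0}{2},\ y_j^0 + \frac{\Delta y_0}{2} \right].
\tag{6.3}
\]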

The coarsest grid in our grid hierarchy, denoted by $G_0$, is defined as a set of $K_0$ mesh patches, denoted by $\{G_{0,k} : 1 \le k \le K_0\}$, where each patch is composed of a subset of $\{c^0_{i,j} : 0 \le i < N_0^x,\ 0 \le j < N_0^y\}$, with the following restrictions:

• The extent of each patch, defined by
\[
\Omega(G_{0,k}) = \bigcup_{c^0_{i,j} \in G_{0,k}} c^0_{i,j},
\]
is a rectangle.

• Two patches can overlap only on their boundaries, i.e., $\mathring{\Omega}(G_{0,k_1}) \cap \mathring{\Omega}(G_{0,k_2}) = \emptyset$ if $k_1 \ne k_2$, where $\mathring{\Omega}(G_{0,k})$ denotes the interior of the set $\Omega(G_{0,k})$.

• The union of all the coarse mesh patches covers the computational domain:
\[
\bigcup_{k=1}^{K_0} \Omega(G_{0,k}) = [0,1]^2.
\]

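In order to construct a grid hierarchy composed of $L$ levels of refinement we take integer numbers $r_\ell^x$ and $r_\ell^y$ such that $r_\ell^x, r_\ell^y \ge 2$ for $\ell = 0, \dots, L-2$ and we define a new discretization on $\Omega$ based on the points defined by

\[
(x_i^\ell, y_j^\ell) = \left( \left(i + \tfrac{1}{2}\right)\Delta x_\ell,\ \left(j + \tfrac{1}{2}\right)\Delta y_\ell \right), \qquad 0 \le i < N_\ell^x,\quad 0 \le j < N_\ell^y,
\tag{6.4}
\]

where, for $1 \le \ell \le L-1$, we have defined:

\[
N_\ell^x = r_{\ell-1}^x N_{\ell-1}^x = \Big( \prod_{m=0}^{\ell-1} r_m^x \Big) N_0^x, \qquad
N_\ell^y = r_{\ell-1}^y N_{\ell-1}^y = \Big( \prod_{m=0}^{\ell-1} r_m^y \Big) N_0^y,
\]
\[
\Delta x_\ell = \frac{\Delta x_{\ell-1}}{r_{\ell-1}^x} = \frac{\Delta x_0}{\prod_{m=0}^{\ell-1} r_m^x}, \qquad
\Delta y_\ell = \frac{\Delta y_{\ell-1}}{r_{\ell-1}^y} = \frac{\Delta y_0}{\prod_{m=0}^{\ell-1} r_m^y}.
\]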
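The recurrences above can be illustrated with a short C sketch for one coordinate direction on the unit interval. The function names `level_cells`, `level_spacing` and `node_coord` are hypothetical and are not taken from the actual implementation; the array `rf` is assumed to hold the refinement factors $r_0, r_1, \dots$ of that direction.

```c
#include <assert.h>

/* Hypothetical helper: number of cells of the uniform grid at level `level`
 * in one direction, N_l = r_{l-1} * N_{l-1}, starting from n0 = N_0. */
int level_cells(int n0, const int *rf, int level)
{
    assert(n0 > 0 && level >= 0);
    int n = n0;
    for (int m = 0; m < level; ++m) {
        assert(rf[m] >= 2);   /* refinement factors are at least 2 */
        n *= rf[m];
    }
    return n;
}

/* Mesh size at that level on the unit interval: dx_l = 1 / N_l. */
double level_spacing(int n0, const int *rf, int level)
{
    return 1.0 / (double)level_cells(n0, rf, level);
}

/* Coordinate of node i at that level: x_i = (i + 1/2) * dx_l, cf. (6.4). */
double node_coord(int i, int n0, const int *rf, int level)
{
    return ((double)i + 0.5) * level_spacing(n0, rf, level);
}
```

For instance, with $N_0 = 10$ and refinement factors 2 and 3, level 2 has 60 cells per direction and mesh size 1/60.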

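We will denote by $x^\ell_{i,j}$ the node $(x_i^\ell, y_j^\ell)$. From the nodes in (6.4) we define the corresponding cells analogously to (6.3):

\[
c^\ell_{i,j} = \left[ x_i^\ell - \frac{\Delta x_\ell}{2},\ x_i^\ell + \frac{\Delta x_\ell}{2} \right] \times \left[ y_j^\ell - \frac{\Delta y_\ell}{2},\ y_j^\ell + \frac{\Delta y_\ell}{2} \right].
\tag{6.5}
\]

A subgrid $G_\ell$, defined on $\Omega$ for the refinement level $\ell$, given by $\Delta x_\ell$ and $\Delta y_\ell$, is defined as a set of $K_\ell$ mesh patches, denoted by $\{G_{\ell,k} : 1 \le k \le K_\ell\}$, where:

• Each mesh patch is a subset of $\{c^\ell_{i,j} : 0 \le i < N_\ell^x,\ 0 \le j < N_\ell^y\}$.

• $\Omega(G_{\ell,k})$ is a rectangle for all $k = 1, \dots, K_\ell$.

• $\mathring{\Omega}(G_{\ell,k_1}) \cap \mathring{\Omega}(G_{\ell,k_2}) = \emptyset$ if $k_1 \ne k_2$ (the patches can only overlap at their boundaries).

• $\Omega(G_{\ell,k}) \subseteq \bigcup_{k'=1}^{K_{\ell-1}} \Omega(G_{\ell-1,k'})$ (the grid at level $\ell$ is contained in the immediately coarser grid).

• If $c^\ell_{i,j} \in G_{\ell,k}$ is such that $\mathring{c}^{\,\ell}_{i,j} \cap \mathring{c}^{\,\ell-1}_{p,q} \ne \emptyset$ for some $c^{\ell-1}_{p,q} \in G_{\ell-1}$, then $c^{\ell-1}_{p,q} \subseteq \Omega(G_{\ell,k})$ (the grids are obtained by subdivision of cells in the coarser grid).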

An example of such a construction is depicted in Fig. 6.1, for a grid hierarchy of three levels, with all refinement factors equal to 2.

Figure 6.1: A sample three-level AMR grid hierarchy (grids $G_0$, $G_1$ and $G_2$). All refinement factors are set to the value 2.

As indicated in sections 3.1 and 5.6, we augment each patch with a set of ghost cells that form a band surrounding the patch, which is required to integrate it. We define the pad of the patch as this set of cells. When required, we will explicitly indicate that a cell or node does not belong to the pad by referring to it as an interior cell or node. If $p$ is the width of the pad at each side of the patch, then the ghost cells have indices $i = -p, \dots, -1$ and $i = N^x_{\ell,k}, \dots, N^x_{\ell,k} + p - 1$ for the $x$ direction, and $j = -p, \dots, -1$ and $j = N^y_{\ell,k}, \dots, N^y_{\ell,k} + p - 1$ for the $y$ direction, where $N^x_{\ell,k}$ and $N^y_{\ell,k}$ denote the dimensions of the patch under consideration. A set of $2p(N^x_{\ell,k} + N^y_{\ell,k}) + 4p^2$ ghost cells is therefore assigned to the patch.

We can assign to each point $x^\ell_{i,j}$ a single global index $q$, defined by $q = jN_\ell^x + i$. Note that, from an index $q \in \{0, \dots, N_\ell^x \cdot N_\ell^y - 1\}$, the indices for each component can be recovered by $i = q \bmod N_\ell^x$ and $j = [q/N_\ell^x]$, where $[\cdot]$ indicates the integer part.

We define the position of a patch as the indices of the node located at the bottom-left corner of the patch. A patch is therefore fully determined by its position and dimension. If a point with global indices $(i,j)$ belongs to a single patch $G_{\ell,k}$, whose position is given by $(i_{\ell,k}, j_{\ell,k})$ and whose dimensions are $N^x_{\ell,k}$ and $N^y_{\ell,k}$, indices relative to the patch can also be assigned to the point $x^\ell_{i,j}$ by

\[
\tilde{i} = i - i_{\ell,k}, \qquad \tilde{j} = j - j_{\ell,k},
\]

and a single index, relative to the patch, can be assigned by $\tilde{q} = \tilde{i} + \tilde{j} N^x_{\ell,k}$.

In our implementation a grid hierarchy is stored using a structure named gridlist, which essentially contains:

• The number of refinement levels.

• The dimension of the coarsest (fixed) grid.

• The refinement factors.

• A list of pointers to patch lists, one per level.

A simplified version of our gridlist structure is shown in Fig. 6.2.

    typedef struct gridlist {
        int num_levels;  /**< Number of levels */
        int **dim;       /**< Dimension of each level for a fixed grid */
        int **rf;        /**< Refinement factors for each level and direction */
        patch **base;    /**< List of pointers to patch lists, one per level */
    } gridlist;

Figure 6.2: A code excerpt to define the gridlist structure.

The patch list for each level has been implemented using a linked list, each element being a structure containing:

• The dimension of the patch.

• The position of the patch, relative to a fixed grid.

• The pad in each direction.

• A vector to store the values of the numerical solution.

• A vector containing the physical fluxes.

• A vector containing the numerical fluxes. This data is necessary in order to perform the transfer of solution process.

A simplified version of our actual definition of the structure patch is shown in Fig. 6.3. The vectors containing the conserved variables and the physical and numerical fluxes store the data corresponding to the patch and to the nodes located in its surrounding band, defined by the pad. In order to deal with these two different kinds of data we define some variables that help in their management. In particular we define two pointers for the conserved variables, one of them pointing to the first value including the surrounding band and another pointing to the first value inside the patch (see Fig. 6.4).

    typedef struct patch {
        int dim[2];          /**< Dimension of the patch */
        int pos[2];          /**< Position of the patch */
        int pad[2];          /**< Pads of the patch */
        REAL *d0;            /**< Conserved variables, no shift */
        REAL *d;             /**< Conserved variables */
        REAL *hatf;          /**< Numerical fluxes */
        REAL *f;             /**< Equation fluxes */
        struct patch *next;  /**< Next patch in a patch list */
    } patch;

Figure 6.3: A code excerpt to define the patch structure. The data type REAL is a redefinition for float or double.

Figure 6.4: Sample patch with dim = {7, 7} and pad = {3, 3}. Dashed cells correspond to ghost nodes. The pointer d0 points to the first stored value, including the ghost band, and the pointer d points to the first interior value.
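The relation between the two pointers and the relative indices can be sketched as follows. This is only an illustration under stated assumptions: the reduced structure `mini_patch` and all function names are hypothetical, a single scalar unknown per node is stored, and a row-major layout whose row length includes the ghost band is assumed (the actual storage order of the implementation is not specified here).

```c
#include <assert.h>
#include <stdlib.h>

typedef double REAL;

/* Hypothetical, reduced version of a patch: one scalar unknown per node. */
typedef struct {
    int dim[2];   /* interior nodes in x and y */
    int pos[2];   /* global indices of the bottom-left interior node */
    int pad[2];   /* ghost band width in x and y */
    REAL *d0;     /* first stored value, ghost band included */
    REAL *d;      /* first interior value */
} mini_patch;

/* Row length of the stored array, ghost band included. */
int patch_stride(const mini_patch *p)
{
    return p->dim[0] + 2 * p->pad[0];
}

/* Number of ghost nodes: stored nodes minus interior nodes. */
int ghost_count(const mini_patch *p)
{
    int total = patch_stride(p) * (p->dim[1] + 2 * p->pad[1]);
    return total - p->dim[0] * p->dim[1];
}

/* Allocates zero-initialized storage; returns 0 on success, -1 on failure. */
int mini_patch_init(mini_patch *p, int nx, int ny, int px, int py,
                    int i0, int j0)
{
    p->dim[0] = nx; p->dim[1] = ny;
    p->pad[0] = px; p->pad[1] = py;
    p->pos[0] = i0; p->pos[1] = j0;
    size_t total = (size_t)(nx + 2 * px) * (size_t)(ny + 2 * py);
    p->d0 = calloc(total, sizeof(REAL));
    if (p->d0 == NULL)
        return -1;
    /* Skip py full rows and px nodes to reach the first interior node. */
    p->d = p->d0 + py * patch_stride(p) + px;
    return 0;
}

/* Address of the node with global indices (i, j); ghost nodes are allowed. */
REAL *patch_at(const mini_patch *p, int i, int j)
{
    int it = i - p->pos[0];   /* index relative to the patch */
    int jt = j - p->pos[1];
    assert(it >= -p->pad[0] && it < p->dim[0] + p->pad[0]);
    assert(jt >= -p->pad[1] && jt < p->dim[1] + p->pad[1]);
    return p->d + it + jt * patch_stride(p);
}

void mini_patch_free(mini_patch *p)
{
    free(p->d0);
    p->d0 = NULL;
    p->d = NULL;
}
```

With this layout, negative relative indices reach the ghost band through `d`, so interior loops and ghost-cell updates can share the same indexing. For the patch of Fig. 6.4 (dim = {7, 7}, pad = {3, 3}) the sketch stores 13 × 13 nodes, of which 120 are ghost nodes, in agreement with $2p(N^x_{\ell,k} + N^y_{\ell,k}) + 4p^2$.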


6.1.2 The adaptation process

Given a grid $G_\ell$, the adaptation process obtains a set of mesh patches that will compose the adapted grid $\tilde{G}_\ell$. This adaptation procedure is composed of three major processes: first, a flagging process decides which cells have to be included in the refined grid; second, a clustering procedure groups the selected cells into Cartesian patches; finally, the newly created mesh patches are filled with a solution. The only restriction is the nestedness property $G_\ell \subseteq G_{\ell-1}$.

Marking cells for refinement. For the selection of the cells that need refinement we use the approach described in section 5.5: we combine a criterion based on marking the cells that cannot be predicted with enough accuracy from coarse data with a gradient sensor that can detect the formation of shock waves from smooth data, allowing for the refinement of the grids before the shock forms. We use high thresholds in this sensor in order to avoid the detection of rapidly varying smooth data as discontinuities. The adaptation is forced to generate grids from subdivision of coarse cells. For the gradient sensor we use the natural 2D extension of (5.13): if $u^{\ell-1}_{i,j}$ denotes the numerical solution at the node with indices $(i,j)$ of the grid $G_{\ell-1}$, then we mark for refinement (flag) the cells that result from the subdivision of the cell $(i,j)$ if

\[
\max\left\{
\frac{\max\left\{ \left|u^{\ell-1}_{i+1,j} - u^{\ell-1}_{i,j}\right|,\ \left|u^{\ell-1}_{i,j} - u^{\ell-1}_{i-1,j}\right| \right\}}{\Delta x_\ell},\
\frac{\max\left\{ \left|u^{\ell-1}_{i,j+1} - u^{\ell-1}_{i,j}\right|,\ \left|u^{\ell-1}_{i,j} - u^{\ell-1}_{i,j-1}\right| \right\}}{\Delta y_\ell}
\right\}
\tag{6.6}
\]

is above a prescribed tolerance. For the identification of the cells that cannot be correctly predicted from the coarse grid we use again an extension of the procedure indicated in section 5.5: for each node $x^\ell_{i,j}$ of a patch we compute an approximate value interpolated in space using the operator $I$ given by the tensor product extension of the 1D interpolation explained in section 5.6, and we select it for refinement if

\[
\left| u^\ell_{i,j} - I(u^{\ell-1}, x^\ell_{i,j}) \right| > \tau_p,
\tag{6.7}
\]

with $\tau_p > 0$.
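A minimal C sketch of a gradient sensor of the form (6.6) for a scalar unknown is given next. It is an illustration, not the actual implementation: the name `flag_by_gradient`, the row-major layout `u[i + j*nx]` and the restriction to nodes that have all four neighbours inside the array (the actual code would rely on ghost cells) are assumptions.

```c
#include <assert.h>

static double absd(double v) { return v < 0.0 ? -v : v; }
static double max2(double a, double b) { return a > b ? a : b; }

/* Flags node (i, j) when the largest one-sided difference quotient in x or y
 * exceeds tol. u and flag hold nx*ny entries stored as [i + j*nx]. Nodes on
 * the border of the array are left unflagged. Returns the number of flags. */
int flag_by_gradient(const double *u, int nx, int ny, double dx, double dy,
                     double tol, unsigned char *flag)
{
    assert(nx >= 3 && ny >= 3 && dx > 0.0 && dy > 0.0);
    int count = 0;
    for (int k = 0; k < nx * ny; ++k)
        flag[k] = 0;
    for (int j = 1; j < ny - 1; ++j) {
        for (int i = 1; i < nx - 1; ++i) {
            double c = u[i + j * nx];
            double gx = max2(absd(u[i + 1 + j * nx] - c),
                             absd(c - u[i - 1 + j * nx])) / dx;
            double gy = max2(absd(u[i + (j + 1) * nx] - c),
                             absd(c - u[i + (j - 1) * nx])) / dy;
            if (max2(gx, gy) > tol) {
                flag[i + j * nx] = 1;
                ++count;
            }
        }
    }
    return count;
}
```

For a system of equations the absolute value would be replaced by a norm, or the sensor applied to a selected variable; that choice is not fixed by this sketch.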


Figure 6.5: An example of the addition of safety flags. The original marked cells are indicated with white circles, and the safety flags with black circles.

Safety flags. Once the coarse grid has been flagged we add a certain number of safety flags to ensure that the cells adjacent to a singularity are refined. The safety flags prevent singularities from escaping from the fine grid during one coarse time step. Fig. 6.5 shows an example of the addition of a band of one safety flag to an already flagged patch. Another example can be seen in Figs. 6.7(a) and 6.7(b). Another criterion for adding safety flags is dictated by the need to interpolate ghost cell values from relatively smooth regions: the length of the stencil of the interpolation operator must be less than twice the number of safety flags. In our case we use third order linear interpolation, and this imposes the addition of 2 safety flags. For analogous reasons, if the computation of the numerical flux depends on $2n$ values of the fine grid, then, in order to ensure that it is computed using non-interpolated data, the number of safety flags has to be greater than $n/2$. In the case of the method used in this work, described in chapter 4, we have $n = 3$, and thus the number of safety flags added should be at least 2. According to the criteria above, we add 2 safety flags in our implementation.
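The addition of a band of safety flags amounts to a dilation of the set of marked cells. The following C sketch shows one way to do it, assuming that the band is a square neighbourhood around each marked cell (consistent with the three induced cells of Fig. 6.6) and that flags falling outside the array are simply discarded; in the actual implementation such flags may fall in the pad. The names `add_safety_flags` and `count_flags` are hypothetical.

```c
#include <assert.h>

/* Marks in `out` every cell within `band` cells (in both directions) of a
 * cell marked in `flag`. Both arrays hold nx*ny entries stored as
 * [i + j*nx] and must not overlap. */
void add_safety_flags(const unsigned char *flag, int nx, int ny, int band,
                      unsigned char *out)
{
    assert(nx > 0 && ny > 0 && band >= 0);
    for (int k = 0; k < nx * ny; ++k)
        out[k] = 0;
    for (int j = 0; j < ny; ++j) {
        for (int i = 0; i < nx; ++i) {
            if (!flag[i + j * nx])
                continue;
            for (int jj = j - band; jj <= j + band; ++jj) {
                for (int ii = i - band; ii <= i + band; ++ii) {
                    /* Cells outside the array are discarded in this sketch. */
                    if (ii >= 0 && ii < nx && jj >= 0 && jj < ny)
                        out[ii + jj * nx] = 1;
                }
            }
        }
    }
}

/* Number of marked cells among the first n entries. */
int count_flags(const unsigned char *flag, int n)
{
    int cnt = 0;
    for (int k = 0; k < n; ++k)
        cnt += flag[k] != 0;
    return cnt;
}
```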


Figure 6.6: Synchronization of marked cells between patches of the same level. A marked cell in the patch at the left (solid light gray) induces three marked cells in the patch at the right due to the safety flags (dashed dark gray). A band of one safety flag is added (all dashed cells).

Synchronization of marked cells. The addition of safety flags to a flagged patch may produce marked cells that are located in the pad of the patch. If another patch is adjacent to the first one, or sufficiently near it (depending on the width of the pad and the number of safety flags added), some of the marked cells in the pad may correspond to interior cells of the other patch. In this case, a mechanism that ensures that those cells are also marked in the second patch is applied after every patch has been flagged, and before the procedure for grouping the marked cells into rectangular patches, described next. An example where this situation appears is shown in Fig. 6.6.

Grouping cells into rectangular patches. Once the desired coarse cells have been flagged for refinement, we group them into rectangular patches at the next refinement level. These patches contain every flagged cell and possibly some non-flagged cells. We process every patch independently from the others, grouping the cells that are marked and belong to that patch. A simple and effective clustering procedure consists of the following steps: first, the minimum Cartesian patch that contains all the flagged cells is found. If the ratio between the number of flagged cells and the total number of cells in the patch is above a prescribed tolerance, then the patch is accepted and the process ends for the current patch. Otherwise the patch is subdivided into smaller sub-patches and the same criterion is applied to each sub-patch. The procedure continues until a patch has been assigned to each marked cell. The procedure is then applied to the next patch, until every patch has been clustered. A new grid is then generated using the accepted rectangular patches.

For computational efficiency reasons we have implemented some procedures to avoid the formation of mesh patches that have a very high aspect ratio, or patches that are very small. More precisely, for each patch, we first check if it fulfills the requirement

\[
d_p := \frac{\text{Number of marked cells}}{\text{Total number of cells}} > t_c,
\tag{6.8}
\]

where $0 < t_c \le 1$ is a tolerance, or if the size (width · height) of the patch does not exceed a specified patch size $s_p$, in which case it is accepted "as is". Otherwise it is sent to the subroutine that subdivides the patch. This subroutine divides the patch along the directions whose length is bigger than a specified length $s_l$. As a result, the patch can be divided in four parts (if it can be divided in both directions), or in two (if the division can only be done in one direction). We sketch the process in the code shown in Fig. 6.8, and we show an example in Fig. 6.7.

In that example we start with a patch of size 16 × 16 that is to be refined. We fix $s_p = 4$, $s_l = 2$ and $t_c = 0.9$. Assume that the result of the flagging procedure is the one shown in Fig. 6.7(a), where the marked cells are indicated with a white circle. For each marked cell we mark a band of one cell around it, in order to cover the region where the information contained in the previously marked cells can move during one time step. The added cells are marked with black circles, as shown in Fig. 6.7(b). The result contains 85 marked cells out of the 256 cells of the patch, i.e., $d_p = 85/256 \approx 0.33 < t_c$, so it is not accepted. We divide it into four parts (Fig. 6.7(c)) and we crop each of these parts so that we only consider the minimum rectangle that contains marked cells (Fig. 6.7(d)). The cells that are discarded because of the cropping of the patches are shown in light gray. None of the four patches fulfills the requirement (6.8). Their respective values for $d_p$ are, from left to right and from top to bottom, $16/48 \approx 0.33$, $18/56 \approx 0.32$, $32/64 = 0.5$ and $19/48 \approx 0.40$. A new division is therefore performed as appears in Fig. 6.7(e). After cropping, some patches are accepted (indicated in dark gray) because they do not contain unmarked cells (Fig. 6.7(f)). The four sub-patches that are not accepted have values for $d_p$ (from left to right and from top to bottom) equal to $7/8 = 0.875$, $6/8 = 0.75$, $9/16 = 0.5625$ and $14/16 = 0.875$, which are below the specified threshold $t_c = 0.9$. Fig. 6.7(g) shows the result of dividing these sub-patches and cropping them. Note that the two smaller patches have been divided only in two parts because one of their dimensions is equal to 2, which is not bigger than $s_l$. The sub-patches in Fig. 6.7(g) are all accepted (some of them verify $d_p = 1 > t_c$, and the size of all of them is not bigger than $s_p = 4$). The final refined patch is shown in Fig. 6.7(h).
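The clustering steps described above can be sketched in C as follows. This is a simplified illustration and not the actual implementation: the types `box` and `clusterer`, the fixed-size output array and the function names are hypothetical, the crop is performed at the beginning of each call rather than by the caller, and the density test uses `>=`, which allows the tolerance $t_c = 1$.

```c
#include <assert.h>

#define MAX_BOXES 256

typedef struct { int i0, j0, i1, j1; } box;   /* inclusive cell ranges */

typedef struct {
    const unsigned char *flag;  /* flag[i + j*nx] != 0 for marked cells */
    int nx;                     /* row length of the flag array */
    double tc;                  /* density tolerance */
    int sp;                     /* patch size accepted as is */
    int sl;                     /* lengths above sl can be divided */
    box out[MAX_BOXES];         /* accepted patches */
    int nout;
} clusterer;

static int count_marked(const clusterer *c, box b)
{
    int cnt = 0;
    for (int j = b.j0; j <= b.j1; ++j)
        for (int i = b.i0; i <= b.i1; ++i)
            cnt += c->flag[i + j * c->nx] != 0;
    return cnt;
}

/* Shrinks b to the smallest box containing its marked cells.
 * Returns 0 when b contains no marked cell. */
static int crop(const clusterer *c, box *b)
{
    box r = { b->i1 + 1, b->j1 + 1, b->i0 - 1, b->j0 - 1 };
    for (int j = b->j0; j <= b->j1; ++j) {
        for (int i = b->i0; i <= b->i1; ++i) {
            if (!c->flag[i + j * c->nx])
                continue;
            if (i < r.i0) r.i0 = i;
            if (i > r.i1) r.i1 = i;
            if (j < r.j0) r.j0 = j;
            if (j > r.j1) r.j1 = j;
        }
    }
    if (r.i1 < r.i0)
        return 0;
    *b = r;
    return 1;
}

/* Recursively covers the marked cells of b with accepted boxes. */
void cluster(clusterer *c, box b)
{
    assert(c->sl >= 1);
    if (!crop(c, &b))
        return;
    int w = b.i1 - b.i0 + 1, h = b.j1 - b.j0 + 1;
    int size = w * h;
    double dp = (double)count_marked(c, b) / size;
    int split_x = w > c->sl, split_y = h > c->sl;
    if (dp >= c->tc || size <= c->sp || (!split_x && !split_y)) {
        assert(c->nout < MAX_BOXES);
        c->out[c->nout++] = b;
        return;
    }
    int im = split_x ? b.i0 + w / 2 - 1 : b.i1;  /* last column, left part */
    int jm = split_y ? b.j0 + h / 2 - 1 : b.j1;  /* last row, bottom part */
    box first = { b.i0, b.j0, im, jm };
    cluster(c, first);
    if (split_x) { box q = { im + 1, b.j0, b.i1, jm }; cluster(c, q); }
    if (split_y) { box q = { b.i0, jm + 1, im, b.j1 }; cluster(c, q); }
    if (split_x && split_y) {
        box q = { im + 1, jm + 1, b.i1, b.j1 };
        cluster(c, q);
    }
}
```

Since every division produces strictly smaller boxes and a box that cannot be divided in any direction is accepted, the recursion terminates.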

Ensuring nestedness. Due to the organization of the AMR algorithm it can happen that, at some point in the algorithm, several grids corresponding to different resolutions have to be adapted. More precisely, the fact that a grid which is not the finest grid has to be adapted implies that all grids that are finer than it have to be adapted as well (cf. Section 5.5). In this situation we perform the adaptation from the finest grid to the coarsest. This approach is motivated by the fact that some features that would need refinement at a certain resolution level might not be identified as such at coarser levels. Note that when a grid $G_\ell$ is adapted, the actual construction of the adapted grid $\tilde{G}_\ell$ is made using the coarser grid $G_{\ell-1}$ as well. If the adaptation is performed from coarse to fine, and the grid $G_{\ell-1}$ has been produced, in turn, by a previous adaptation, it can happen that it does not cover a region that needs to be included in the grid at level $\ell$, because the grid $G_{\ell-2}$, used to construct $G_{\ell-1}$, did not detect that the region needed adaptation, and therefore it was not included in $G_{\ell-1}$. Cells not belonging to $G_{\ell-1}$ are not considered for the construction of $G_\ell$ and, as a result, a loss of refinement can occur. The adaptation from fine to coarse prevents this, but requires an additional procedure to ensure that the grid hierarchy resulting from the adaptation of several levels is nested. This procedure amounts in practice to marking some cells to ensure that the adapted grid will contain the grid at the immediately finer level: if the grid $G_\ell$ to be adapted is not the finest grid, we mark the cells of $G_{\ell-1}$ that intersect the grid $G_{\ell+1}$, which was previously adapted.

Figure 6.7: An illustration of the clustering process. (a) Cells initially marked (white circles). (b) Cells marked because of the safety flags (black circles). (c) Division of the patch in four parts. (d) Resulting sub-patches after crop; discarded cells are marked in light gray. (e) New division of the sub-patches. (f) Resulting sub-patches after crop; accepted sub-patches are marked in dark gray. (g) New division of the unaccepted patches and crop. (h) Final result of the clustering.

    Function cluster(p: patch, g: gridlist)
        cnt = number of flagged cells in p
        if cnt ≠ 0 then
            size = size (width · height) of the patch p
            dp = cnt / size
            if dp ≥ tc or size ≤ sp or (width ≤ sl and height ≤ sl) then
                accept_patch(p, g)
            else
                [q, npatches] = divide(p)
                for i = 1 until npatches
                    q[i] = crop(q[i])
                    cluster(q[i], g)
                end for
            end if
        end if
    end function

Figure 6.8: A pseudo-code fragment corresponding to the cluster function.

Transfer of solution to the adapted grid. Each new fine grid obtained by the adaptation process needs to be filled with a numerical solution. This process has been described in section 5.5 (see page 98) for the 1D case, and essentially amounts to copying the solution from the grid that existed before adaptation to the adapted grid in the nodes where these grids overlap, and to computing interpolated values from a coarser grid where they do not overlap. Instead of checking, for each cell, whether it is contained in the unadapted grid, and then deciding if data has to be copied or interpolated from the coarser grid, in practice it is more efficient to perform both processes separately: first, as the nestedness property ensures that the new grid is wholly contained in the coarser grid, a numerical solution is interpolated from it. Second, the numerical solution is copied from the grid of the same level that existed before the adaptation process, in the regions in which both grids overlap, overwriting the interpolated solution. Finally, boundary conditions are applied wherever the patch boundary overlaps the domain boundary. A sketch of a function that performs these steps is shown in Fig. 6.9.

By merging together the procedures of marking cells for refinement, ensuring nestedness, clustering and transfer of solution to the adapted grid we obtain an adaptation algorithm with the structure shown in Fig. 6.10. A call to adapt($g_{initial}$, $g_{final}$, $l$) will produce, from the gridlist $g_{initial}$, another gridlist called $g_{final}$ which coincides with $g_{initial}$ at levels $0, \dots, l-1$ and has adapted grids for levels $l, \dots, L-1$. A numerical solution for the adapted grids is produced as well, using the transfer function.


    Function transfer(ginitial: gridlist, gfinal: gridlist, l: integer)
        for each patch p of level l in gfinal
            Search overlaps O of p with patches in gfinal of level l − 1
            for each overlap Oi in O
                interpolate_solution(Oi)
            end for
        end for
        for each patch p of level l in gfinal
            Search overlaps P of p with patches in ginitial of level l
            for each overlap Pi in P
                copy_solution(Pi)
            end for
        end for
        for each patch p of level l in gfinal
            if p intersects the domain boundary then
                put_boundary_conditions(p)
            end if
        end for
    end function

Figure 6.9: A code fragment corresponding to the transfer function.
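The overlap searches in the transfer function reduce to intersecting rectangles in index space. A minimal C sketch is given below, under the assumption that a patch is described by its position and dimension only and that all indices are nonnegative; the type `rect` and the functions `rect_overlap` and `coarsen` are hypothetical names. The function `coarsen` expresses a fine rectangle in the indices of the immediately coarser level, which is needed before it can be intersected with coarse patches.

```c
#include <assert.h>

typedef struct { int pos[2]; int dim[2]; } rect;

/* Intersection of two rectangles given in the same index space.
 * Returns 1 and fills `out` if they share at least one cell; returns 0
 * otherwise, in which case `out` must not be used. */
int rect_overlap(const rect *a, const rect *b, rect *out)
{
    for (int d = 0; d < 2; ++d) {
        int lo = a->pos[d] > b->pos[d] ? a->pos[d] : b->pos[d];
        int hi_a = a->pos[d] + a->dim[d];   /* one past the last index */
        int hi_b = b->pos[d] + b->dim[d];
        int hi = hi_a < hi_b ? hi_a : hi_b;
        if (hi <= lo)
            return 0;
        out->pos[d] = lo;
        out->dim[d] = hi - lo;
    }
    return 1;
}

/* Smallest rectangle of coarse cells covering a fine rectangle, for
 * refinement factors rx and ry between the two levels. */
rect coarsen(const rect *fine, int rx, int ry)
{
    assert(rx >= 1 && ry >= 1 && fine->pos[0] >= 0 && fine->pos[1] >= 0);
    const int r[2] = { rx, ry };
    rect c;
    for (int d = 0; d < 2; ++d) {
        int first = fine->pos[d] / r[d];
        int last = (fine->pos[d] + fine->dim[d] - 1) / r[d];
        c.pos[d] = first;
        c.dim[d] = last - first + 1;
    }
    return c;
}
```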

    Function adapt(ginitial: gridlist, gfinal: gridlist, l: integer)
        L = number of levels in the gridlists
        for ℓ = L − 1 until l
            for each patch p of level ℓ in ginitial
                flags = flag_patch(p)
                if ℓ < L − 1
                    flags = flags ∪ flag_from_finer(p)
                end if
                flags = flags ∪ put_safety_flags(p)
                crop(p, flags)
                cluster(p, gfinal)
            end for
            transfer(ginitial, gfinal, ℓ)
        end for
    end function

Figure 6.10: A code fragment corresponding to the adapt function.


6.1.3 Integration algorithm

The ghost cell approach allows the integration of each patch to be performed almost separately. When a grid $G_\ell$ is to be integrated, we call a procedure that sweeps over the patches in $G_\ell$ and integrates each of them. The integration of each patch essentially amounts to sequentially calling a function to compute the numerical fluxes corresponding to the current solution and functions to perform the time integration, in our case the third order Runge-Kutta algorithm (3.14). To complete the algorithm, a mechanism for the computation of a numerical solution for the ghost nodes is also required, and it is applied before each Runge-Kutta step. Note that this process amounts to computing an interpolated solution from the coarse to the fine grid, and is performed independently for each patch. In addition, a procedure for the synchronization of the intermediate solutions between the different patches of the grid is applied before each Runge-Kutta step, in order to ensure that overlapping nodes have the same solution. This synchronization is the only process within the integration algorithm that requires data interchange between patches. The structure of the integration algorithm is shown in Fig. 6.11.

The computation of the numerical divergence, represented by the function compute_numerical_divergence, is performed by considering separately the fluxes in the $x$ and $y$ directions, corresponding to the functions $f$ and $g$ in (6.1), which are given by formulas analogous to (4.12). The computation of the numerical fluxes is described in detail in chapter 4, and the complete integration algorithm that is used for the update of each patch is described in section 4.4. In 2D the operator $D$ that represents the numerical divergence is simply given by the sum of the 1D numerical divergences in each Cartesian direction:

\[
D(U)_{i,j} = \frac{\hat{f}_{i+\frac{1}{2},j}(U) - \hat{f}_{i-\frac{1}{2},j}(U)}{\Delta x} + \frac{\hat{g}_{i,j+\frac{1}{2}}(U) - \hat{g}_{i,j-\frac{1}{2}}(U)}{\Delta y}.
\]
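Once the numerical fluxes at the cell interfaces are available, the evaluation of $D$ is a simple difference. The C sketch below illustrates it for a scalar unknown; the function name `numerical_divergence` and the storage conventions for the interface fluxes (stated in the comments) are assumptions of this example, and the computation of the fluxes themselves (chapter 4) is not included.

```c
#include <assert.h>

/* Numerical divergence on an nx-by-ny block of nodes.
 *   fhat[i + j*(nx+1)], 0 <= i <= nx : x-flux at the interface i - 1/2 of row j
 *   ghat[i + j*nx],     0 <= j <= ny : y-flux at the interface j - 1/2 of column i
 *   div[i + j*nx]                    : result at node (i, j)                    */
void numerical_divergence(const double *fhat, const double *ghat,
                          int nx, int ny, double dx, double dy, double *div)
{
    assert(nx > 0 && ny > 0 && dx > 0.0 && dy > 0.0);
    for (int j = 0; j < ny; ++j) {
        for (int i = 0; i < nx; ++i) {
            double dfx = fhat[(i + 1) + j * (nx + 1)] - fhat[i + j * (nx + 1)];
            double dgy = ghat[i + (j + 1) * nx] - ghat[i + j * nx];
            div[i + j * nx] = dfx / dx + dgy / dy;
        }
    }
}
```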

After each step of the Runge-Kutta algorithm has been applied to a grid, we update the ghost cells as described in section 5.6, using a space-time interpolation algorithm applied to the coarser grids corresponding to the solutions before and after the Runge-Kutta step. The two patches are first interpolated in time, for a suitable time instant (that depends on the particular step of the Runge-Kutta algorithm), in some coarse cells. These are the cells needed to compute spatial interpolations in the required ghost cells of the fine grid. The data that results from the time interpolation is then interpolated in space using the tensor product extension of the 1D algorithm explained in section 5.6. This process is represented by the function interpolate_ghost_cells.

    Function integrate(g: gridlist, l: integer)
        for step = 1 until 3
            for each patch p of level l in g
                compute_numerical_divergence(p)
                update_total_fluxes(p, step)
                perform_Runge_Kutta_step(p, step)
                interpolate_ghost_cells(p, step)
            end for
            synchronize_level(g, l)
            impose_boundary_conditions(g, l)
        end for
    end function

Figure 6.11: A code fragment corresponding to the integrate function.

An example is depicted in Fig. 6.12. We show some cells where the interpolation is to be computed as squares in Fig. 6.12(a). The circles represent coarse nodes. We assume that the time interpolation between the coarse grids before and after a Runge-Kutta step, for the corresponding time instant, has already been performed. The interpolation in space is performed first in the direction of the $x$ coordinate, which allows approximations to be computed at the points that correspond to the fine cells in a 1D grid (cf. Fig. 5.12). The crosses in Fig. 6.12(b) correspond to those points. The same procedure, applied to the nodes indicated by crosses, in the direction of the $y$ coordinate produces approximations at the fine nodes of the 2D grid, as shown in Fig. 6.12(c).

Figure 6.12: An illustration of the 2D interpolation used for the update of the ghost cells. (a) A coarse grid (circles) and some fine nodes (squares) where interpolation in space has to be computed. (b) 1D interpolation in space in the x direction produces approximations in the points marked with crosses. (c) 1D interpolation in space in the y direction produces approximations in the points corresponding to the fine nodes (squares) of the 2D grid.

Once every patch has been updated for a Runge-Kutta step, we ensure that the data contained in the different patches is coherent. This operation is represented by the function synchronize_level in Fig. 6.11. On the one hand, we check if a ghost cell of one patch overlaps an internal cell of another patch, in which case we copy the solution from the internal cell to the ghost cell. Pseudocode for this function is shown in Fig. 6.13. On the other hand, we impose appropriate boundary conditions wherever needed (this process depends on the particular kind of boundary conditions, see section 3.7).

    Function synchronize_level(g: gridlist, l: integer)
        for each patch p of level l in g
            Search overlaps P of p with patches in g of level l
            for each overlap Pi in P
                copy_solution(Pi)
            end for
        end for
    end function

Figure 6.13: Pseudocode for the function synchronize_level.

The time steps for the different grid levels are computed in a way similar to the 1D case (cf. (5.1)-(5.2)). We define, for $1 \le \ell \le L-1$:

\[
\Delta t_\ell = \frac{\Delta t_{\ell-1}}{\max\{r^x_{\ell-1}, r^y_{\ell-1}\}},
\tag{6.9}
\]

and $\Delta t_0$ as

\[
\Delta t_0 = K\, \frac{\min\{\Delta x_0, \Delta y_0\}}{M},
\tag{6.10}
\]

where $0 < K < 1$ and $M$ is the maximum numerical characteristic speed of the equation (see appendix A, section A.4.3 for a description of the computation of this quantity).
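Formulas (6.9) and (6.10) translate directly into code. In the following C sketch the function name `level_time_steps` and its argument list are hypothetical; `cfl` plays the role of $K$ and `max_speed` the role of $M$, which is assumed to have been computed beforehand.

```c
#include <assert.h>

/* Fills dt[0 .. num_levels-1] with the time step of each level:
 *   dt[0] = K * min(dx0, dy0) / M,                          cf. (6.10)
 *   dt[l] = dt[l-1] / max(rfx[l-1], rfy[l-1]),  l >= 1,     cf. (6.9)   */
void level_time_steps(double dx0, double dy0, double cfl, double max_speed,
                      const int *rfx, const int *rfy, int num_levels,
                      double *dt)
{
    assert(num_levels >= 1 && max_speed > 0.0);
    assert(cfl > 0.0 && cfl < 1.0);
    double hmin = dx0 < dy0 ? dx0 : dy0;
    dt[0] = cfl * hmin / max_speed;
    for (int l = 1; l < num_levels; ++l) {
        int r = rfx[l - 1] > rfy[l - 1] ? rfx[l - 1] : rfy[l - 1];
        assert(r >= 2);
        dt[l] = dt[l - 1] / (double)r;
    }
}
```

Dividing by the largest of the two refinement factors keeps the ratio between the time step and the smallest mesh size from growing with the level.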

6.1.4 Flux projection

The need for a transfer of information from fine to coarse grids was motivated in section 5.1. This transfer of information can be performed in two ways: by transferring nodal values, just copying the solution from the fine to the coarse nodes that correspond to the same points, or by transferring numerical fluxes from fine to coarse cell interfaces, and updating the coarse solution according to the corrected fluxes. In this work we choose to project the numerical fluxes, in order to ensure conservation between meshes, which would be lost if projection of nodal values were used. The projection, for the case of numerical flux projection in 1D, has been described in detail in Sections 5.1 and 5.4. We describe here the extension to two dimensions.

During the integration process we store the numerical fluxes that will be used later for flux projection. The numerical fluxes are added for each Runge-Kutta step so that, at the end, the numerical fluxes corresponding to a time step (i.e., the three steps of Runge-Kutta) are available for their further projection onto the coarser grid. In the 1D case this corresponds to computing the values given by (3.17) for each coarse cell interface. This process is represented in Fig. 6.11 by the function update_total_fluxes. The 2D fluxes, equivalent to (3.17), are

\[
\begin{aligned}
\hat{f}^{RK3}(U^n) &= \frac{1}{6}\hat{f}(U^n) + \frac{1}{6}\hat{f}(U^{(1)}) + \frac{2}{3}\hat{f}(U^{(2)}),\\
\hat{g}^{RK3}(U^n) &= \frac{1}{6}\hat{g}(U^n) + \frac{1}{6}\hat{g}(U^{(1)}) + \frac{2}{3}\hat{g}(U^{(2)}).
\end{aligned}
\tag{6.11}
\]
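The accumulation in (6.11) can be sketched in C as follows. The function `accumulate_flux`, the zero-based stage counter and the flat array of interface fluxes are assumptions of this illustration; the same routine would be applied to the fluxes in both directions.

```c
#include <assert.h>

/* Weights of the three Runge-Kutta stages in (6.11). */
static const double rk3_weight[3] = { 1.0 / 6.0, 1.0 / 6.0, 2.0 / 3.0 };

/* Adds the weighted flux of stage `step` (0, 1 or 2) to the accumulated
 * flux. At the first stage the accumulator is overwritten, so it does not
 * need to be cleared between time steps. */
void accumulate_flux(double *total, const double *stage_flux, int n, int step)
{
    assert(step >= 0 && step < 3);
    for (int k = 0; k < n; ++k) {
        double contribution = rk3_weight[step] * stage_flux[k];
        if (step == 0)
            total[k] = contribution;
        else
            total[k] += contribution;
    }
}
```

Since the weights add up to one, a flux that is constant over the three stages is reproduced by the accumulated value.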

A first note is that within Shu-Osher's flux-splitting formulation the numerical fluxes correspond to nodal values located in the boundary of the cells whose centers are the nodes where the solution is computed. In 1D, the grid points can be organized so that a grid hierarchy can be built in which the cell interfaces of a grid coincide with cell interfaces of the finer grid. We have described this grid organization in Section 5.1 and an example was shown in Fig. 5.2. In our 2D discretization, the locations of the coarse numerical fluxes coincide with locations of fine numerical fluxes only if the refinement factor is an odd number. Otherwise, numerical fluxes for the fine grid are computed in nodes that belong to the boundary of the coarse cell, but do not coincide with the locations of the coarse numerical fluxes. An example of the distribution of the nodes and the locations of the numerical fluxes for refinement factors equal to 2 and 3 is shown in Fig. 6.14. If the refinement factors are set to an even number, then Lagrange interpolation in space is performed to compute an approximation to the coarse numerical flux in the corresponding point. If the refinement factor is equal to 2, the approximation is simply given by the average of the two fine values.

Figure 6.14: Relative location of the numerical fluxes for coarse and fine grid points for refinement factors of 2 (left) and 3 (right). Coarse nodes are indicated with solid circles and fine nodes with solid squares. The locations of the coarse fluxes are indicated with empty circles and the locations of the fine numerical fluxes with empty squares.

Let us explain in more detail the most common case, where the refinement factors are set to 2. Consider a coarse node $x^{\ell-1}_{i,j}$ given by

\[
x^{\ell-1}_{i,j} = \left( \left(i + \tfrac{1}{2}\right)\Delta x_{\ell-1},\ \left(j + \tfrac{1}{2}\right)\Delta y_{\ell-1} \right).
\]

The numerical fluxes required to update this node are located in the points $x^{\ell-1}_{i+\frac{1}{2},j}$ and $x^{\ell-1}_{i-\frac{1}{2},j}$ for the horizontal fluxes, and in the points $x^{\ell-1}_{i,j+\frac{1}{2}}$ and $x^{\ell-1}_{i,j-\frac{1}{2}}$ for the vertical fluxes. The nodes on the fine grid whose numerical fluxes are computed in the same cell interfaces are the points $x^\ell_{2i+1,2j}$, $x^\ell_{2i+1,2j+1}$, $x^\ell_{2i,2j}$ and $x^\ell_{2i,2j+1}$. The relative location of these points is depicted in Fig. 6.15, where circles indicate coarse points and squares fine points; solid objects indicate nodes and empty objects indicate points where the numerical flux is computed.

Figure 6.15: Relative location of the points involved in the flux projection at one cell (coarse node $x^{\ell-1}_{i,j}$ and fine nodes $x^\ell_{2i,2j}$, $x^\ell_{2i+1,2j}$, $x^\ell_{2i,2j+1}$ and $x^\ell_{2i+1,2j+1}$).

As in the 1D case described in Section 5.4, let us denote by $u^t_{\ell-1}$ the numerical solution corresponding to time $t$ in the grid $G^t_{\ell-1}$, where $u^t_{\ell-1,i,j} \approx u(x^{\ell-1}_{i,j}, t)$, and analogously by $u^t_\ell$ the numerical solution in the grid $G^t_\ell$. The computation of the solution $u^{t+2\Delta t_\ell}_{\ell,2i,2j}$ from $u^t_\ell$ is performed by means of two sequential integrations with time step $\Delta t_\ell$, analogous to (5.3), that can be summarized in the 2D version of (5.4) as:

\[
\begin{aligned}
u^{t+2\Delta t_\ell}_{\ell,2i,2j} = u^{t}_{\ell,2i,2j}
&- \frac{\Delta t_\ell}{\Delta x_\ell}\left( \hat{f}^{RK3,t}_{\ell,2i+\frac{1}{2},2j} + \hat{f}^{RK3,t+\Delta t_\ell}_{\ell,2i+\frac{1}{2},2j} - \hat{f}^{RK3,t}_{\ell,2i-\frac{1}{2},2j} - \hat{f}^{RK3,t+\Delta t_\ell}_{\ell,2i-\frac{1}{2},2j} \right)\\
&- \frac{\Delta t_\ell}{\Delta y_\ell}\left( \hat{g}^{RK3,t}_{\ell,2i,2j+\frac{1}{2}} + \hat{g}^{RK3,t+\Delta t_\ell}_{\ell,2i,2j+\frac{1}{2}} - \hat{g}^{RK3,t}_{\ell,2i,2j-\frac{1}{2}} - \hat{g}^{RK3,t+\Delta t_\ell}_{\ell,2i,2j-\frac{1}{2}} \right),
\end{aligned}
\]


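with analogous expressions for $u^{t+2\Delta t_\ell}_{\ell,2i+1,2j}$, $u^{t+2\Delta t_\ell}_{\ell,2i,2j+1}$ and $u^{t+2\Delta t_\ell}_{\ell,2i+1,2j+1}$, from which we deduce that
\[
\begin{aligned}
&\frac{u^{t+2\Delta t_\ell}_{\ell,2i,2j} + u^{t+2\Delta t_\ell}_{\ell,2i+1,2j} + u^{t+2\Delta t_\ell}_{\ell,2i,2j+1} + u^{t+2\Delta t_\ell}_{\ell,2i+1,2j+1}}{4}
= \frac{u^{t}_{\ell,2i,2j} + u^{t}_{\ell,2i+1,2j} + u^{t}_{\ell,2i,2j+1} + u^{t}_{\ell,2i+1,2j+1}}{4} \\
&\quad - \frac{\Delta t_\ell}{\Delta x_\ell}\Bigg( \frac{\hat f^{\mathrm{RK3},t}_{\ell,2i+\frac32,2j} + \hat f^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i+\frac32,2j} + \hat f^{\mathrm{RK3},t}_{\ell,2i+\frac32,2j+1} + \hat f^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i+\frac32,2j+1}}{4} \\
&\qquad\qquad - \frac{\hat f^{\mathrm{RK3},t}_{\ell,2i-\frac12,2j} + \hat f^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i-\frac12,2j} + \hat f^{\mathrm{RK3},t}_{\ell,2i-\frac12,2j+1} + \hat f^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i-\frac12,2j+1}}{4} \Bigg) \\
&\quad - \frac{\Delta t_\ell}{\Delta y_\ell}\Bigg( \frac{\hat g^{\mathrm{RK3},t}_{\ell,2i,2j+\frac32} + \hat g^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i,2j+\frac32} + \hat g^{\mathrm{RK3},t}_{\ell,2i+1,2j+\frac32} + \hat g^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i+1,2j+\frac32}}{4} \\
&\qquad\qquad - \frac{\hat g^{\mathrm{RK3},t}_{\ell,2i,2j-\frac12} + \hat g^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i,2j-\frac12} + \hat g^{\mathrm{RK3},t}_{\ell,2i+1,2j-\frac12} + \hat g^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i+1,2j-\frac12}}{4} \Bigg).
\end{aligned}
\tag{6.12}
\]
If we now define, for $-1 \le i \le N^{x}_{\ell}$ and $0 \le j \le N^{y}_{\ell}$,
\[
\hat{\hat f}^{t}_{\ell-1,i+\frac12,j} = \frac{\hat f^{\mathrm{RK3},t}_{\ell,2i+\frac32,2j} + \hat f^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i+\frac32,2j} + \hat f^{\mathrm{RK3},t}_{\ell,2i+\frac32,2j+1} + \hat f^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i+\frac32,2j+1}}{4},
\tag{6.13}
\]
and, for $0 \le i \le N^{x}_{\ell}$ and $-1 \le j \le N^{y}_{\ell}$,
\[
\hat{\hat g}^{t}_{\ell-1,i,j+\frac12} = \frac{\hat g^{\mathrm{RK3},t}_{\ell,2i,2j+\frac32} + \hat g^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i,2j+\frac32} + \hat g^{\mathrm{RK3},t}_{\ell,2i+1,2j+\frac32} + \hat g^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i+1,2j+\frac32}}{4},
\tag{6.14}
\]
then (6.12) reads
\[
\begin{aligned}
\frac{u^{t+2\Delta t_\ell}_{\ell,2i,2j} + u^{t+2\Delta t_\ell}_{\ell,2i+1,2j} + u^{t+2\Delta t_\ell}_{\ell,2i,2j+1} + u^{t+2\Delta t_\ell}_{\ell,2i+1,2j+1}}{4}
={}& \frac{u^{t}_{\ell,2i,2j} + u^{t}_{\ell,2i+1,2j} + u^{t}_{\ell,2i,2j+1} + u^{t}_{\ell,2i+1,2j+1}}{4} \\
&- \frac{\Delta t_\ell}{\Delta x_\ell}\left( \hat{\hat f}^{t}_{\ell-1,i+\frac12,j} - \hat{\hat f}^{t}_{\ell-1,i-\frac12,j} \right)
 - \frac{\Delta t_\ell}{\Delta y_\ell}\left( \hat{\hat g}^{t}_{\ell-1,i,j+\frac12} - \hat{\hat g}^{t}_{\ell-1,i,j-\frac12} \right).
\end{aligned}
\tag{6.15}
\]
If we assume that at time $t$ the relation
\[
u^{t}_{\ell-1,i,j} = \frac{u^{t}_{\ell,2i,2j} + u^{t}_{\ell,2i+1,2j} + u^{t}_{\ell,2i,2j+1} + u^{t}_{\ell,2i+1,2j+1}}{4}
\tag{6.16}
\]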


holds, then, using that the ratios $\Delta t_\ell/\Delta x_\ell$ and $\Delta t_\ell/\Delta y_\ell$ do not depend on the resolution level $\ell$, we deduce from (6.15) that the same relation holds at time $t + \Delta t_{\ell-1}$:
\[
u^{t+\Delta t_{\ell-1}}_{\ell-1,i,j} = \frac{u^{t+\Delta t_{\ell-1}}_{\ell,2i,2j} + u^{t+\Delta t_{\ell-1}}_{\ell,2i+1,2j} + u^{t+\Delta t_{\ell-1}}_{\ell,2i,2j+1} + u^{t+\Delta t_{\ell-1}}_{\ell,2i+1,2j+1}}{4},
\tag{6.17}
\]
provided that the coarse fluxes are corrected according to:
\[
\begin{aligned}
\hat f^{t}_{\ell-1,i+\frac12,j} &= \hat{\hat f}^{t}_{\ell-1,i+\frac12,j}, & -1 \le i \le N^{x}_{\ell-1},\quad & 0 \le j \le N^{y}_{\ell-1} - 1, \\
\hat g^{t}_{\ell-1,i,j+\frac12} &= \hat{\hat g}^{t}_{\ell-1,i,j+\frac12}, & 0 \le i \le N^{x}_{\ell-1} - 1,\quad & -1 \le j \le N^{y}_{\ell-1}.
\end{aligned}
\tag{6.18}
\]
Note that the values
\[
\frac{\hat f^{\mathrm{RK3},t}_{\ell,2i+\frac32,2j} + \hat f^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i+\frac32,2j}}{2}
\quad\text{and}\quad
\frac{\hat f^{\mathrm{RK3},t}_{\ell,2i+\frac32,2j+1} + \hat f^{\mathrm{RK3},t+\Delta t_\ell}_{\ell,2i+\frac32,2j+1}}{2}
\]
are, respectively, the numerical fluxes at the points $x^{\ell}_{2i+\frac32,2j}$ and $x^{\ell}_{2i+\frac32,2j+1}$, corresponding to a time step $2\Delta t_\ell = \Delta t_{\ell-1}$. Their average is exactly the right hand side of (6.13), and it is the linear interpolation of the numerical flux, for the same time step $\Delta t_{\ell-1}$, at the midpoint of the vertical segment joining the two points, which is the point
\[
x^{\ell}_{2i+\frac32,2j+\frac12} = \big( (2i+2)\Delta x_\ell,\ (2j+1)\Delta y_\ell \big) = \big( (i+1)\Delta x_{\ell-1},\ \left(j+\tfrac12\right)\Delta y_{\ell-1} \big) = x^{\ell-1}_{i+\frac12,j}.
\]
The same analysis can be performed for $\hat{\hat g}^{t}_{\ell-1,i,j+\frac12}$ in (6.14). The values $\hat{\hat f}^{t}_{\ell-1,i+\frac12,j}$ and $\hat{\hat g}^{t}_{\ell-1,i,j+\frac12}$ are therefore approximations to the coarse numerical fluxes at their respective locations.

A pseudocode for the projection algorithm is shown in Fig. 6.16. The subroutine update_coarse_fluxes performs the operations represented by (6.18). The final algorithm that updates a gridlist is the same as described in chapter 5, see Fig. 5.9 (the notations used in chapters 5 and 6 are slightly different).


Function project(g: gridlist, l: integer)
    if l > 0
        for each patch p in g of level l
            Search overlaps O of p with patches in g of level l − 1
            for each overlap O_i in O
                update_coarse_fluxes(O_i)
            end for
        end for
    end if
end function

Figure 6.16: A pseudocode fragment corresponding to the project function.
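The averaging in (6.13) and the conservation property (6.17) can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the function names, the storage layout (index m of a flux array denotes the left face of fine cell m) and the restriction to x-fluxes are assumptions made only for this illustration.

```python
def project_x_flux(f_t, f_t_dt, i, j):
    """Projected coarse x-flux at interface (i + 1/2, j), as in (6.13).

    f_t[m][n] and f_t_dt[m][n] hold the fine fluxes of the two fine sub-steps
    at the left face of fine cell (m, n). This layout is an assumption of the
    sketch, not taken from the text.
    """
    m = 2 * i + 2  # fine interface 2i + 3/2 is the left face of fine cell 2i + 2
    total = 0.0
    for n in (2 * j, 2 * j + 1):
        total += f_t[m][n] + f_t_dt[m][n]
    return total / 4.0


def fine_average_after_two_steps(u, f_t, f_t_dt, lam, i, j):
    """Average of the four fine cells covering coarse cell (i, j) after two
    fine sub-steps, using x-fluxes only (lam = dt_fine / dx_fine)."""
    total = 0.0
    for m in (2 * i, 2 * i + 1):
        for n in (2 * j, 2 * j + 1):
            flux_right = f_t[m + 1][n] + f_t_dt[m + 1][n]
            flux_left = f_t[m][n] + f_t_dt[m][n]
            total += u[m][n] - lam * (flux_right - flux_left)
    return total / 4.0
```

Updating the coarse cell with the projected fluxes at both of its vertical faces gives the same value as averaging the four updated fine cells, because the fluxes at the interior fine interface cancel in the sum.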

6.2 Parallel implementation

Having described the sequential implementation of our algorithm, we now focus on the parallelization of the code. Parallelization is important if one aims to tackle grand challenge problems that, even with the help of high order methods and adaptation, are impossible to solve on a single machine, due to both the spatial (data storage) and temporal (computational cost) requirements. Parallel implementations offer the potential to compute accurate solutions of such complex problems, at the cost of facing new challenges in resource allocation, data distribution, load balancing, and data communication and synchronization. See [43] for a good discussion on data partitioning and load balancing in a more general context.

The typical approach in parallel implementations of AMR applications splits the grid hierarchy into portions that are distributed among the available processors. Each processor acts on one or more portions separately, sharing information with other processors when required. To reduce communication, a first requirement is that if a patch is assigned to a certain processor, the finer patches that overlay it are assigned to the same processor. This leads to a partition strategy based on the coarsest grid, which is split into pieces that are assigned to the available processors, along with the parts of the finer grids contained in them. The dynamic nature of the grids present in AMR applications, which leads to deep hierarchies and small regions with high refinement, makes it


difficult to design a strategy for finding a suitable partition of the grid, which ideally should:
• balance the load between the processors, and
• minimize data transfers between processors.

In our algorithm, data transfers are needed each time the ghost nodes of the patches have to be filled with data. In particular, this includes communication after each step of the Runge-Kutta algorithm, after adaptation, and after the numerical solution is corrected using projected fluxes.

Several authors have addressed this problem in the last fifteen years. The simplest load balancing techniques do not consider data locality, and simply redistribute the work load among the available processors, using the data partition given by the adaptation algorithm [141]. Other authors further subdivide the domain if a good load balance is not achieved, and try to balance the loads by means, for example, of data exchanges between processors whose loads are higher than the average and processors whose loads are lower than the average [153, 92]. Dynamic programming techniques have also been used to find the distribution that optimizes the work load balance [145]. Partitioners based on graphs [154] (and more recently hypergraphs [33]), like ParMetis [84] or Zoltan [152], are often used in codes based on unstructured meshes or adaptive finite elements, but their higher computational cost severely limits their applicability to dynamic load balancing problems, where the workloads vary as time evolves.

Most of the partitioning techniques used in parallel AMR codes use space filling curves [151] to increase data locality [135, 31]. Space filling curves are defined as continuous, surjective functions from the unit interval $[0, 1]$ to the $d$-dimensional unit hypercube $[0, 1]^d$. In particular, a 2-dimensional space-filling curve is a continuous curve that passes through every point of the unit square $[0, 1]^2$. Extension to general hypercubes is trivial.

A space-filling curve is typically defined as the limit of a sequence of curves. If the target hypercube is partitioned uniformly into $N^d$ patches (obtained by making $N$ divisions along each of the $d$ spatial dimensions, where $N$ depends on the particular curve under consideration), then there is a curve in the sequence that visits each element of the partition in a particular order. Therefore, the intermediate curves, whose limit is the space filling curve under consideration, can be seen as bijections from $\{1, \dots, N\}^d$ to $\{1, \dots, N^d\}$, that assign an index


to each of the $N^d$ elements of the discretization. The interest of using space filling curves to index discretizations comes from the fact that these curves tend to assign close indices to close elements of the discretization, thus producing orderings with high data locality. If the assignment of patches to processors follows the ordering of the space filling curve, each processor will likely host patches that are neighbors of other patches hosted by the same processor. Because each patch in the grid hierarchy has to communicate only with its neighbors, data locality can reduce the total amount of data communication required by the parallel algorithm.

In this work we use the Peano-Hilbert space filling curves [136, 69]. These curves are defined iteratively as follows for the two-dimensional case (see Fig. 6.17): in the first stage we divide the unit square into four squares of equal size and number these squares sequentially clockwise, starting at the top right square and thus ending at the top left square. If we draw the line that joins the centers of the squares in the order indicated, we obtain the construction depicted in Fig. 6.17(a). For the second iteration, each of the four squares of the first iteration is divided in turn into four squares, and each of these four quarters is considered in isolation. In the two bottom quarters the construction of the previous iteration is repeated on a $\tfrac14$ scale. In the top left quarter we repeat the same process, but the figure of the previous iteration is rotated by an angle of $\tfrac{\pi}{2}$ counterclockwise and scaled. Finally, in the top right quarter we repeat the figure of the previous iteration rotated by $\tfrac{\pi}{2}$ clockwise and scaled. The result is depicted in Fig. 6.17(b) with black lines. These four constructions are then merged together. The constructions in the bottom quarters are connected to their horizontal and vertical neighbors by the shortest lines that join them.
These lines are depicted in gray in Fig. 6.17(b). We finally assign numbers to each cell following the same ordering as in the first step: starting at the top right cell and following the line, which ends at the top left cell. The curve corresponding to the third step is shown in Fig. 6.17(c), where the colors of the lines have the same meaning as in Fig. 6.17(b). The construction is the same as in the previous case, but acting on the figure of the second step. Recall that at the $k$-th step the unit square is divided into $4^k$ squares of equal size.

The application of this space-filling curve to load balancing of parallel programs divides the computational domain into $4^k$ subdomains, for a suitable value of $k$, and assumes that a weight can be assigned to each subdomain. The weight represents the computational effort or workload

Figure 6.17: Construction of the Peano-Hilbert space filling curve. (a) Iteration 1. (b) Iteration 2. (c) Iteration 3.

required to process that subdomain. The subdomains are then ordered according to the space filling curve, and the resulting list of weights is divided into a number of pieces equal to the number of available processors. This division determines which subdomains are assigned to each processor. The list is divided so that the total loads assigned to the processors are as balanced as possible. In our case, to partition the list of loads, which we denote by $l = \{l_i\}_{i=1}^{4^k}$, into $P$ parts, we proceed as follows: we start with the full list of loads and compute the position $p_1$ in the list whose accumulated load is closest to the average load, given by
\[
L_1 = \frac{1}{P} \sum_{i=1}^{4^k} l_i .
\]
A simple way of doing this is to look for the position $s_1$ in the list that satisfies
\[
\sum_{i=1}^{s_1} l_i < L_1 \quad\text{and}\quad \sum_{i=1}^{s_1+1} l_i \ge L_1 .
\]
The position $s_1$ is the one whose accumulated load is closest to the average among those that stay below it. Then we take
\[
p_1 = \begin{cases} s_1 & \text{if } L_1 - \sum_{i=1}^{s_1} l_i < \sum_{i=1}^{s_1+1} l_i - L_1, \\ s_1 + 1 & \text{otherwise.} \end{cases}
\]
The pieces assigned to the first processor are the ones corresponding to the labels $\{1, \dots, p_1\}$ in Hilbert's order.
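The Hilbert labels used above can be generated with the standard bitwise Hilbert-curve recursion. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, and the final re-orientation was chosen so that for k = 1 and k = 2 the ordering starts at the top right cell and ends at the top left one, as described for Fig. 6.17(a) and (b); it has not been checked against Fig. 6.17(c).

```python
def hilbert_cell(k, d):
    """Return (row, col) of the d-th cell (d = 0, 1, ...) visited by the
    order-k Hilbert curve on a 2**k x 2**k grid, rows counted from the top."""
    n = 1 << k
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate or reflect the sub-square built so far.
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    # Re-orient the standard curve (assumption of this sketch) so that it
    # starts at the top right cell and ends at the top left cell.
    return y, n - 1 - x


def hilbert_order(k):
    """List of (row, col) cells in the order visited by the order-k curve."""
    return [hilbert_cell(k, d) for d in range(4 ** k)]
```

Given a square array of weights, the ordered list of loads is then obtained as `[weights[r][c] for r, c in hilbert_order(k)]`.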


Figure 6.18: An example for the load balancing algorithm. (a) Division of the domain into patches. Numbers indicate weights assigned to each patch; by rows, from top to bottom, the weights are 7 4 3 10, 9 2 20 4, 12 17 14 8 and 3 9 20 18. (b) Load assignment. Patches of the same gray intensity are assigned to the same processor.
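The splitting rule of this section, applied first to the full list and then to the pieces that remain unassigned, can be sketched in Python as follows. This is a minimal sketch, not the code used in this work: the function name is hypothetical, and the guard that gives every processor at least one piece is an assumption added for robustness.

```python
def partition_loads(loads, num_procs):
    """Return the cut positions p_1, ..., p_P (1-based, p_P = len(loads))
    obtained by repeatedly cutting the ordered list of loads at the position
    whose accumulated load is closest to the running average."""
    cuts = []
    start = 0  # number of pieces already assigned
    for j in range(1, num_procs):
        remaining = loads[start:]
        target = sum(remaining) / (num_procs - j + 1)  # average load L_j
        acc = 0.0
        s = 0  # largest count of pieces whose accumulated load stays below L_j
        while s < len(remaining) and acc + remaining[s] < target:
            acc += remaining[s]
            s += 1
        # Take one more piece unless stopping at s is strictly closer to L_j.
        if s < len(remaining) and target - acc >= (acc + remaining[s]) - target:
            s += 1
        # Assumption of this sketch: every processor gets at least one piece
        # and enough pieces are left for the remaining processors.
        s = min(max(s, 1), len(remaining) - (num_procs - j))
        start += s
        cuts.append(start)
    cuts.append(len(loads))
    return cuts
```

For the list of weights l = {10, 3, 20, 4, 8, 18, 20, 14, 17, 9, 3, 12, 9, 2, 4, 7} and P = 4, the sketch returns the cuts 4, 7, 10 and 16, which correspond to processor loads 37, 46, 40 and 37.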

For the rest of the processors we repeat the same computation, but considering the average load of the pieces that have not been assigned so far:
\[
L_j = \frac{1}{P - j + 1} \sum_{i = p_{j-1}+1}^{4^k} l_i, \qquad j = 2, \dots, P,
\]
and we look for the position $p_j$ whose accumulated load, counting only the loads of the unassigned pieces, is closest to the average load $L_j$.

As an example, assume that the computational domain is $[0, 1]^2$ and that we have $P = 4$ processors. We divide the domain into 16 parts, as in Fig. 6.17(b). Consider the weight assignment of Fig. 6.18(a). If we order the subdomains according to the Hilbert curve corresponding to $k = 2$, we get the following list of weights: $l = \{10, 3, 20, 4, 8, 18, 20, 14, 17, 9, 3, 12, 9, 2, 4, 7\}$. The total load is 160, giving an average load of $L_1 = 40$. Therefore, we get $s_1 = 4$ and we take $p_1 = 4$, because $L_1 - \sum_{i=1}^{4} l_i = 40 - 37 = 3$ and $\sum_{i=1}^{5} l_i - L_1 = 45 - 40 = 5 > 3$. For the second processor we have $L_2 = 41$ and we therefore get $p_2 = 7$. Repeating the process we obtain $p_3 = 10$


and $p_4 = 16$. This gives the load assignment shown in Fig. 6.18(b), where the pieces of the domain that would be assigned to each processor are depicted in different grayscale levels. The loads assigned to the processors are, respectively, 37, 46, 40 and 37.

In the case of the AMR algorithm, we have grids and grid patches of several sizes, according to the different resolutions of each grid and to the result of the adaptation algorithm. Our approach starts from the dimensions of the coarsest grid: we compute the number $k_{\max}$ of divisions that can be performed in each direction so that each subdomain contains an integer number of coarse cells (i.e., we do not divide coarse cells). Then, we compute the minimum number $k_{\min}$ of divisions that have to be made in each direction in order to obtain a number of pieces greater than or equal to the number of processors, so that each processor receives at least one piece. Then we assign to each piece the load that corresponds to every cell whose extent is contained in the given piece. The computation of these loads is described below. We start with $k = k_{\min}$ and divide the domain into $4^{k_{\min}}$ pieces. The assignment of pieces to each processor is computed following the algorithm based on space-filling curves described above. We accept or reject the resulting load distribution using a simple check: if the difference between the load assigned to each processor, given by
\[
P_j = \sum_{i = p_{j-1}+1}^{p_j} l_i ,
\]
and the corresponding average load $L_j$ is smaller than a given threshold for every $j$, we keep that division and distribute the work according to it. Otherwise, if $k < k_{\max}$ we increase $k$ and repeat the process until a division that fulfills the requirement is achieved. If it is not possible to find a division that satisfies the required condition, we increase the threshold and repeat the load balancing procedure. A pseudocode algorithm for load balancing is shown in Fig. 6.19.

The load assigned to each piece is basically determined by the total number of integrations needed to integrate the cells of the grid hierarchy whose extent is included in the given piece. In the simplest case, the cost for the integration of a given patch $h_j$ during a coarse time step is given by:
\[
W_j = \sum_{\ell=0}^{L-1} \mathrm{Card}(C_{\ell,j})\, 2^{\ell},
\tag{6.19}
\]


Function balance(g: gridlist, P: integer, threshold: real)
    Compute k_min and k_max
    do
        k = k_min
        do
            C = compute_Hilbert_ordering(k)
            l = compute_cost_list(C, g)
            assign_costs_to_processors(l, P, threshold)
            k = k + 1
        while (cost assignment is not acceptable and k ≤ k_max)
        increase threshold
    while (cost assignment is not acceptable)
end function

Figure 6.19: A pseudocode for the balance function.

where all refinement factors of the grid hierarchy are assumed to be equal to 2 and $C_{\ell,j}$ is the set of cells in the grid $G_\ell$ that belong to the patch $h_j$. A penalization cost for the communication can also be considered. One can, for example, add a load defined by an increasing function of the boundary length of the logical patches defined by the cells in $C_{\ell,j}$, thus favoring pieces whose ratio between area and boundary length is as high as possible.

As an example, assume that we have 4 processors at our disposal and that, at some stage, the working grids of the AMR algorithm are as depicted in Fig. 6.20(a). The thicker lines indicate patch boundaries, and the fine lines indicate cell boundaries. We observe that the grid hierarchy is composed of three levels and that the coarsest grid has 8 × 8 cells, thus allowing for three possible divisions according to the Hilbert curve. Following (6.19), we assign a cost of 1 to a coarse cell, and a cost of $2^{\ell}$ to each cell at level $\ell$. A coarse cell corresponds to four cells of the next refinement level and to 16 cells of the finest level. If a coarse cell is overlaid by cells of the next refinement level only, it has a cost of $1 + 2 \cdot 4 = 9$, and if it is refined up to the finest level it has a cost of $1 + 2 \cdot 4 + 4 \cdot 16 = 73$. The costs for each patch are depicted in Fig. 6.20(b). We assume that the threshold for the difference between the assigned load and the average is set to 20.

The load balancing algorithm proceeds as follows: the domain is divided into four pieces, and the loads for each piece are computed, giving the values indicated in Fig. 6.20(c). The average load is $L_1 = 432$. For the


first processor the algorithm compares the choices $p_1 = 1$, with a load of 16, and $p_1 = 2$, with a load of 928 (the sum of the first two loads, 16 and 912, in Fig. 6.20(c)). Neither of them is admissible for the given threshold, and therefore a second subdivision is made, giving 16 pieces, as indicated in Fig. 6.20(d). This subdivision does not give a satisfactory balance either, and the next division, into 64 pieces, is made (Fig. 6.20(e)). In this case the algorithm is able to balance the load according to the given threshold, and produces the load assignment shown in Fig. 6.20(e), where the pieces assigned to each processor are painted in different gray levels. The loads assigned to the four target processors are, respectively, 417, 438, 437 and 436, with a maximum difference between the assigned load and the corresponding average of 15. The load assignment, along with the initial grid, is depicted in Fig. 6.20(f).

In some cases the parallel performance of the algorithm is not as good as would be desirable, even if the costs are balanced, because some processors hold cells belonging to levels with more refinement than other processors. A big patch of the coarsest level can have a load similar to that of a small patch that is overlaid by finer grids. As communication is needed after each step of the Runge-Kutta algorithm, the processors with fewer cells in the coarsest level will wait for the other processors to finish their computation and, conversely, after the coarsest level has been integrated, a processor with no fine grids is idle while the others integrate the fine grids. In practice we have observed that this phenomenon has a reduced impact, because the load of a piece grows exponentially with its refinement, and if a good load balance is achieved all processors are likely to get a piece of the refined region. Moreover, including a per-level load balancing term in the load computation leads to divisions into a large number of pieces, which produces higher transmission costs.
Especially in distributed systems, the time for data transfers is much higher than the computational costs, and thus the potential gain of the per-level balancing does not produce a speedup in the algorithm.
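The cost model (6.19) and the acceptance check of the balance procedure can be illustrated with a small Python sketch. The function names are hypothetical, and the sketch assumes a 2D hierarchy with all refinement factors equal to 2, as in the example of this section.

```python
def cell_cost(finest_level):
    """Cost of one coarse cell refined up to `finest_level` (0 = unrefined).

    In 2D with refinement factor 2, a coarse cell contains 4**l cells of
    level l, and each of them is integrated 2**l times per coarse time step,
    in agreement with (6.19).
    """
    return sum((2 ** l) * (4 ** l) for l in range(finest_level + 1))


def max_imbalance(proc_loads):
    """Largest |P_j - L_j| over the processors, where L_j is the average of
    the load that is still unassigned when processor j is served."""
    remaining = sum(proc_loads)
    worst = 0.0
    for j, load in enumerate(proc_loads):
        average = remaining / (len(proc_loads) - j)
        worst = max(worst, abs(load - average))
        remaining -= load
    return worst


def is_acceptable(proc_loads, threshold):
    """Acceptance check: every load is within `threshold` of its average."""
    return max_imbalance(proc_loads) < threshold
```

With these definitions, `cell_cost(1)` and `cell_cost(2)` give the costs 9 and 73 quoted in the example, and the assignment with loads 417, 438, 437 and 436 has a maximum deviation of 15, which is accepted for a threshold of 20.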


Figure 6.20: An example to illustrate the load balancing algorithm. (a) Initial grid. (b) Initial grid with weights. (c) Initial division and Hilbert curve (k = 1). (d) Hilbert curve for k = 2. (e) Hilbert curve and load assignment after load balancing for k = 3. (f) Grid assignment after load balancing.

7 Numerical experiments

In this chapter we analyze the performance of the numerical scheme for various one- and two-dimensional examples. Quantitative studies, such as the analysis of the numerical errors, have been made in the one-dimensional case only, because particular phenomena that are hard to isolate in 2D experiments can be analyzed there, and because the reduced execution times allow us to perform extensive testing in a reasonable time. In the two-dimensional examples we show the behavior of our scheme in more complex problems, with a more qualitative approach. We have selected several problems whose solutions exhibit a wide variety of phenomena, in order to show that the AMR algorithm performs well in the widest possible range of situations. The aim of this chapter is to give an insight into how the algorithm behaves in different situations, through a wide range of test problems. The performance of the AMR algorithm depends on the complexity of the problem, on the resolution of the grid hierarchy and on the parameters, among other factors, and only experience can guide the choice of a particular setup for a particular


problem.

Let us state some abbreviations and conventions used in this chapter. We denote by $\tau_g$ the tolerance used in the gradient sensor defined in equations (5.11) for the 1D case and (6.6) for the 2D case. The tolerance used by the cell marking procedure based on the interpolation error, defined respectively in equations (5.12) and (6.7), is denoted by $\tau_p$. The parameter $t_c$, which appears in equation (6.8), is the one used to group marked cells into rectangular patches. The CFL constant, denoted by $K$, is used to compute the time steps for each iteration, as in (6.10), so that the CFL condition is ensured with the approach described in sections 5.3, 6.1.3 and A.4.3. In all tests we have set all refinement factors equal to 2 for simplicity. We use the term error to refer to the difference between the solution computed by the AMR algorithm and the solution computed on a fixed grid with the same resolution as the finest grid in the grid hierarchy. We use the percentage of integrations that the AMR algorithm needs (with respect to the number of integrations on a fixed grid) as a measure of the performance of the algorithm, because the integration algorithm is, by far, the most time consuming process in the algorithm. The choice of this quantity is justified in section 7.1.4.
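As a rough illustration of the percentage of integrations, the following Python sketch counts cell integrations during one coarse time step for a frozen grid hierarchy. It is a simplification made for this illustration: it assumes refinement factor 2 in space and time, it assumes that a cell of level ℓ is integrated 2^ℓ times per coarse step (in the spirit of (6.19)), and the function name and arguments are hypothetical. The figures reported in this chapter are accumulated over the whole simulation, with hierarchies that change in time.

```python
def integration_percentage(cells_per_level, coarse_cells, dim=1):
    """Percentage of cell integrations per coarse time step of an AMR
    hierarchy relative to a fixed grid with the resolution of the finest level.

    cells_per_level[l] is the number of cells of level l (level 0 is the
    coarsest grid, which covers the whole domain with `coarse_cells` cells).
    """
    finest = len(cells_per_level) - 1
    # A cell of level l is integrated 2**l times per coarse time step.
    amr = sum(n * 2 ** l for l, n in enumerate(cells_per_level))
    # The fixed grid has coarse_cells * 2**(dim * finest) cells, each of them
    # integrated 2**finest times per coarse time step.
    fixed = coarse_cells * 2 ** (dim * finest) * 2 ** finest
    return 100.0 * amr / fixed
```

For instance, a 1D hierarchy of five levels over 50 coarse cells is compared with a fixed grid of 800 cells; if refinement is confined to a small part of the domain, the percentage is well below 100.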

7.1 One-dimensional tests

We present in this section some tests performed on well known one-dimensional test cases. Let us state some facts about the figures and results presented in this section. The AMR algorithm produces as output a grid hierarchy with an associated numerical solution for each node of each grid. Therefore there is some redundancy in the output data, because of the grid overlapping intrinsic to the algorithm. It is not useful to plot all these data together, and we use two kinds of visualizations for 1D data. In some figures we plot the numerical solution using a mixture of the solutions in the various grids, taking the finest grid available in the AMR grid hierarchy wherever several grids overlap, thus discarding the values in a grid if a finer grid exists for the same spatial location. The density of the plotted values gives an idea of the refinement used. In other cases we plot the solution on a uniform grid with the same resolution as the finest grid of the actual grid hierarchy. The solution at the points where the finest grid does not exist is


computed using interpolation from coarser grids. In the computation of errors with respect to a fixed grid, we always use the latter approach, in order to be able to compare both solutions point by point.

7.1.1 Linear advection equation

Our first test consists of the solution of the linear 1D advection equation
\[
\begin{cases}
u_t + c\,u_x = 0, & x \in [-1, 1],\ t \ge 0, \\
u(x, 0) = u_0(x), & x \in [-1, 1],
\end{cases}
\tag{7.1}
\]
where $c$ is a nonzero constant. The exact solution $u(x, t) = u_0(x - ct)$ can be easily obtained by the method of characteristics, and consists of the advection of the initial data $u_0(x)$ at speed $c$. An interesting test case is the advection of discontinuities. We have chosen the initial data
\[
u_0(x) = \begin{cases} 1 & \text{if } |x| < \tfrac15, \\ 0 & \text{otherwise.} \end{cases}
\tag{7.2}
\]
Figure 7.1 shows the numerical results obtained for $c = 1$ with the AMR algorithm, using five grid levels. The coarsest grid, at level 0, is a (fixed) grid of 50 points, and all refinement factors are equal to 2, giving a resolution equivalent to a fixed grid of 800 points. The solution has been evolved until time $t = 0.5$. Up to that time, the AMR algorithm performs 12.66% of the integrations made by the algorithm running on a fixed grid of 800 nodes (we will refer to this ratio as global efficiency). The values of the parameters used are $t_c = 0.7$, $\tau_p = 10^{-4}$, $\tau_g = 10.0$ and $K = 0.5$. We observe that the solution obtained by the AMR algorithm is in good agreement with the solution computed on a fixed grid.

In Figure 7.2 we show the difference between the solution obtained by the AMR algorithm and the solution computed on a fixed grid of 800 points, for several executions with the same parameters as in Figure 7.1 except $\tau_p$, which varies between $10^{-8}$ and 0.25. We plot the differences measured in the 1-norm, the 2-norm and the max-norm. In Figure 7.2(a) we observe that the error decreases linearly with respect to the tolerance parameter. In this case the grids have been adapted according to the interpolation error only, without a gradient sensor. This is the reason why an abrupt increase in the error can be observed for

Figure 7.1: Solution of the linear advection equation (7.1) with initial data (7.2) at time t = 0.5. (a) Full solution. (b) Zoom of the left contact discontinuity. (c) Grid hierarchy used in the last iteration. (d) Zoom of the right contact discontinuity. The plots show u(x) against x for the exact, fixed grid and AMR solutions; panel (c) also shows the refinement level against x.

Figure 7.2: Difference between the solution with the AMR algorithm and the solution on an equivalent fixed grid, with respect to the tolerance τp, for the linear advection equation (7.1) with initial data (7.2) at time t = 0.5. (a) Adaptation based on interpolation errors only. (b) Adaptation based on interpolation errors and gradient. The plots show the error in the 1-norm, the 2-norm and the max-norm against the tolerance τp, in logarithmic scales.

large values of $\tau_p$. Due to numerical diffusion, the contact discontinuity is smeared in all grids and, if the threshold is too high, the differences between the actual discrete values in a grid and the values predicted from the coarser grid by interpolation are smaller than the threshold, so the contact discontinuity is not refined. This phenomenon is avoided if we include the gradient sensor in the adaptation. In this case the contact discontinuity is always refined and the error remains nearly constant for large values of $\tau_p$. The errors corresponding to an adaptation based on both sensors are shown in Figure 7.2(b), where $\tau_g = 10$ and $\tau_p$ varies in the same range as in Fig. 7.2(a). Note that up to a certain tolerance, both graphs coincide.

Figure 7.3 shows the relation between the errors and the percentage of integrations made by the AMR algorithm with respect to the integrations needed by the algorithm on a fixed grid of 800 points. We observe the good performance of the algorithm, which is able to obtain a solution very close to the reference solution with a much smaller computational cost. As an example, the solution depicted in Figure 7.1, for which the algorithm makes only 12.66% of the integrations needed on a fixed grid, has a difference with respect to the reference solution equal to $1.3243 \cdot 10^{-5}$ (in the 1-norm), $4.3523 \cdot 10^{-5}$ (in the 2-norm) and $3.3932 \cdot 10^{-4}$ (in the max-norm).
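The three norms used to measure the difference between the AMR solution and the fixed grid solution can be computed as in the following Python sketch. The scaling of the discrete 1-norm and 2-norm by the grid spacing is an assumption of this sketch, since the precise definition is not given in this section; the function name is hypothetical. Both solutions are assumed to be sampled on the same uniform grid, as explained at the beginning of section 7.1.

```python
import math


def error_norms(amr, fixed, dx):
    """Discrete 1-norm, 2-norm and max-norm of the pointwise difference
    between two solutions sampled on the same uniform grid of spacing dx.

    The dx-weighting of the 1-norm and 2-norm is an assumption of this sketch.
    """
    diffs = [abs(a - b) for a, b in zip(amr, fixed)]
    norm1 = dx * sum(diffs)
    norm2 = math.sqrt(dx * sum(d * d for d in diffs))
    norm_max = max(diffs)
    return norm1, norm2, norm_max
```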

Figure 7.3: Difference between the solution with the AMR algorithm and the solution on an equivalent fixed grid, with respect to the percentage of integrations done by the AMR algorithm, for the linear advection equation (7.1) with initial data (7.2) at time t = 0.5. The plot shows the error in the 1-norm, the 2-norm and the max-norm (logarithmic scale) against the percentage of integrations.

7.1.2 Inviscid Burgers’ equation

One of the simplest nonlinear scalar equations in 1D is Burgers’ equation. In this section we solve the following problem:
\[
\begin{cases}
u_t + \left( \dfrac{u^2}{2} \right)_x = 0, & x \in [-1, 1],\ t \ge 0, \\
u(x, 0) = u_0(x), & x \in [-1, 1],
\end{cases}
\tag{7.3}
\]
with initial data (7.2). The solution consists of a shock wave moving to the right with speed $c = \tfrac12$ and a rarefaction wave. For $t < \tfrac45$ the solution of this problem is given by
\[
u(x, t) = \begin{cases}
0 & \text{if } x < -\tfrac15 \text{ or } x \ge \tfrac{t}{2} + \tfrac15, \\[4pt]
\dfrac{x + \tfrac15}{t} & \text{if } -\tfrac15 \le x \le t - \tfrac15, \\[4pt]
1 & \text{if } t - \tfrac15 < x < \tfrac{t}{2} + \tfrac15.
\end{cases}
\]
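The exact solution above is easy to evaluate pointwise, which is useful for comparing numerical results with it. The following Python sketch is an illustration only; the function name is hypothetical, and the check on t reflects the validity range of the formula (for t ≥ 4/5 the head of the rarefaction reaches the shock and the formula no longer applies).

```python
def burgers_exact(x, t):
    """Exact solution of problem (7.3) with initial data (7.2), valid for
    0 < t < 4/5."""
    if not 0.0 < t < 0.8:
        raise ValueError("the formula is valid only for 0 < t < 4/5")
    if x < -0.2 or x >= 0.5 * t + 0.2:
        return 0.0                    # constant states outside the wave pattern
    if x <= t - 0.2:
        return (x + 0.2) / t          # rarefaction fan centered at x = -1/5
    return 1.0                        # plateau between rarefaction and shock
```

At t = 0.7, for example, the rarefaction occupies the interval [−0.2, 0.5] and the shock is located at x = 0.55.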

In Figure 7.4, we compare the results obtained by the AMR algorithm with a reference solution computed on a fixed grid and with the exact solution. In this experiment we have used grid hierarchies of the same type as in the case of the linear advection equation in section 7.1.1 (coarsest grid of 50 points, 5 refinement levels, equivalent to a resolution of 800 points). The parameters have been set to $t_c = 0.7$, $\tau_p = 10^{-4}$, $\tau_g = 10.0$ and $K = 0.5$. The solution corresponds to time $t = 0.7$. We observe that the algorithm


has properly identified the regions corresponding to the shock wave and to the tail and the head of the rarefaction. The percentage of integrations of the algorithm with respect to the reference solution is a 12.96% for this experiment. It is interesting to observe what happens if we use a marking strategy based on a gradient sensor only, as is an usual approach in the literature (see e.g. [139, 82]). Figure 7.5 shows the results obtained using only the gradient sensor, with the same setup as the one used in Figure 7.4 and τg = 2.75. This parameter has been chosen such that the percentage of integrations (12.99% in this case) is similar in both experiments. A loss of accuracy can be seen in the tail and the head of the rarefaction, due to a lack of refinement. While shocks and contact discontinuities can be properly refined by the sensor gradient, it is not the case for rarefactions. In fact, the only way for the gradient sensor to refine these parts is to refine the whole rarefaction. To give more information about how the gradient sensor behaves, in Figure 7.6 we show the numerical solution corresponding to two different iterations of the algorithm for this case, along with the grid hierarchies used in those iterations. Figures 7.6(a) and 7.6(c) correspond to t = 0.339945 (iteration 17), and Figures 7.6(b) and 7.6(d) correspond to t = 0.379937 (iteration 19). While in iteration 17 the algorithm has refined the whole rarefaction, in iteration 19 only the shock is refined, because the slope of the rarefaction has become smaller than τg . Figure 7.7 shows the relation between the error incurred by the AMR algorithm, with respect to the solution on a fixed grid and the tolerance τp (Figure 7.7(a)) and with respect to the percentage of integrations (Figure 7.7(b)). The setup for this experiment is again the same: a grid hierarchy of 5 levels, with 50 points in the coarser level, τg = 10.0, tc = 0.7, K = 0.5. 

Figure 7.4: Solution of Burgers’ equation (7.3) with initial data (7.2) at time t = 0.7. Panels: (a) full solution; (b) zoom of the tail of the rarefaction; (c) zoom of the head of the rarefaction and the part at the left of the shock; (d) zoom of the part at the right of the shock. Curves: exact, fixed grid and AMR solutions, u(x) against x.

Figure 7.5: Solution of Burgers’ equation (7.3) with initial data (7.2) at time t = 0.7. Adaptation based on the gradient sensor only. Panels: (a) full solution; (b) zoom of the tail of the rarefaction; (c) zoom of the head of the rarefaction and the part at the left of the shock; (d) zoom of the part at the right of the shock. Curves: exact, fixed grid and AMR solutions, u(x) against x.

Figure 7.6: Solutions of Burgers’ equation (7.3) with initial data (7.2) at times t = 0.339945 and t = 0.379937. Adaptation based on the gradient sensor only. Panels: (a) solution for t = 0.339945 (iteration 17); (b) solution for t = 0.379937 (iteration 19); (c) grid hierarchy for iteration 17; (d) grid hierarchy for iteration 19. Axes: u(x) against x in (a) and (b), refinement level against x in (c) and (d).

Figure 7.7: Difference between the solution with the AMR algorithm and the solution on an equivalent fixed grid, with respect to the tolerance τp and to the percentage of integrations, for Burgers’ equation (7.3) with initial data (7.2) at time t = 0.7. Panels: (a) errors vs. tolerance; (b) errors vs. percentage of integrations. Curves: 1-norm, 2-norm and max-norm.

Figure 7.7(a) shows the same behavior as the case of the advection equation, plotted in Figure 7.2(b). The presence of the gradient sensor prevents large errors for large values of τp, producing a flat plot in the right part of the graph. For intermediate values, the errors behave linearly with respect to the tolerance. Unlike the case of the linear advection equation, the errors remain nearly constant and do not decrease for tolerances smaller than a certain value. The reason for this stabilization of the error can be traced back to the fact that the numerical solution at the shock is extremely sensitive to perturbations, and a small difference between the solution on a fixed grid and the solution of the AMR algorithm persists, provided the shock wave is refined, regardless of the refinement used. This difference can be caused by several factors, but the clearest one is the fact that, when working with a single grid corresponding to the finest resolution, the time step is updated after each iteration: the maximum characteristic speed is recomputed and the time step is adapted so that the CFL condition is satisfied. In the AMR algorithm, in turn, the time step is adapted only after every iteration of the coarsest grid. In our example, this corresponds to 16 iterations of the finest grid.

To illustrate this assertion, in Figure 7.8 we plot the difference uAMR − uREF between the solution uAMR of the AMR algorithm and the solution uREF computed on a fixed grid of 800 points, for various choices of the refinement tolerance. Figures 7.8(a), 7.8(b) and 7.8(c) correspond to τp = 10^{-4}, τp = 10^{-5} and τp = 10^{-6}, respectively. A pointwise convergence to the reference solution is clearly observed except at the shock, where the difference does not go to zero. Figure 7.8(d) is even clearer: there we compare the reference solution with a solution computed on a complete grid hierarchy, i.e., a grid hierarchy where the grid corresponding to each level is a complete fixed grid, of the corresponding size, covering the whole domain. In this case the finest grid is not affected by the solution at the coarser grids, since grid interpolation is never done. Also in this case, a difference similar to the cases depicted in Figures 7.8(a), 7.8(b) and 7.8(c) exists. The existence of this difference does not mean that one solution is more accurate than the other. The same thing can be observed in Figures 7.9(a) and 7.9(b), which show, respectively, a closeup of the differences in a part of the rarefaction and in the zone of the shock. The differences for the cases τp = 10^{-4}, τp = 10^{-5} and the case of using the complete grid hierarchy are plotted together.
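The figure of 16 fine-grid iterations per coarse iteration, and the equivalent resolution of 800 points, follow from the hierarchy parameters if each level halves the mesh size and the time step of the previous one. The Python sketch below makes that arithmetic explicit. The refinement factor of 2 is inferred from the numbers quoted in the text, and the function names are illustrative rather than part of the thesis code.

```python
def equivalent_resolution(coarse_points, levels, ratio=2):
    """Number of points of a uniform grid as fine as the finest level."""
    return coarse_points * ratio ** (levels - 1)


def fine_steps_per_coarse_step(levels, ratio=2):
    """Finest-grid time steps taken during one coarsest-grid time step."""
    return ratio ** (levels - 1)


print(equivalent_resolution(50, 5))    # 800
print(fine_steps_per_coarse_step(5))   # 16
```

On a fixed grid of 800 points the CFL time step is therefore recomputed 16 times over the interval in which the AMR algorithm keeps it frozen, which is the source of the small persistent difference at the shock discussed above.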

7.1.3 The Euler equations of gas dynamics

The Euler equations are one of the most important models of nonlinear hyperbolic systems of conservation laws. In this section we test the AMR algorithm for two common problems, namely the shock tube or Sod’s problem [166] and the interaction of a shock and an entropy wave [161]. The Euler equations in one dimension are given by (2.27), and we refer to section 2.3.3 for the description of the equations.

Shock tube problem

We solve a Riemann problem for the Euler equations (2.27). The initial data is given by

\[
u_0(x) = \begin{cases} u_L & \text{if } x < 0, \\ u_R & \text{if } x \geq 0, \end{cases} \tag{7.4}
\]

with

\[
u_L = (\rho_L, v_L, p_L) = (1, 0, 1), \qquad u_R = (\rho_R, v_R, p_R) = (0.125, 0, 0.1). \tag{7.5}
\]
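As an illustration, the initial data (7.4)-(7.5) can be set up on a grid as follows. This is a minimal sketch and not the thesis implementation: the domain [−5, 5] is read off the axes of Figure 7.10, the adiabatic exponent γ = 1.4 is assumed (its value is not stated in this section), and the conserved variables (ρ, ρv, E) with E = p/(γ − 1) + ρv²/2 are the usual ideal-gas choice. All function names are hypothetical.

```python
import numpy as np

GAMMA = 1.4  # assumed adiabatic exponent (not stated in this section)


def primitive_to_conserved(rho, v, p, gamma=GAMMA):
    """Map (density, velocity, pressure) to (density, momentum, total energy)."""
    energy = p / (gamma - 1.0) + 0.5 * rho * v ** 2
    return np.array([rho, rho * v, energy])


def sod_initial_data(x, gamma=GAMMA):
    """Conserved variables for the shock tube data (7.4)-(7.5) at cell centers x."""
    u_left = primitive_to_conserved(1.0, 0.0, 1.0, gamma)
    u_right = primitive_to_conserved(0.125, 0.0, 0.1, gamma)
    # Result has shape (3, len(x)): left state where x < 0, right state elsewhere.
    return np.where(x < 0.0, u_left[:, None], u_right[:, None])


# Coarsest grid of the experiment: 50 cells on the assumed domain [-5, 5].
edges = np.linspace(-5.0, 5.0, 51)
centers = 0.5 * (edges[:-1] + edges[1:])
u0 = sod_initial_data(centers)
```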

The solution consists of a shock wave, a contact discontinuity and a rarefaction wave. We evolve the solution until time t = 2 with the AMR algorithm using a grid hierarchy of 5 levels, with a coarsest grid of 50 points. We have used the parameters tc = 0.7, τg = 10.0, τp = 10^{-4} and K = 0.5. With this setup the AMR algorithm computes the solution depicted in Figure 7.10. In this example the AMR algorithm, using only 19.59% of the integrations required by the algorithm on a fixed grid, is able to properly resolve all the features of the solution. Refinement has been performed at the tail and the head of the rarefaction, at the contact discontinuity and at the shock wave. Figures 7.11 and 7.12 show zoomed regions of the density, velocity and pressure profiles around the relevant zones. Figure 7.13 shows the errors corresponding to the density field with respect to the tolerance parameter τp and the percentage of integrations. The error behaves in the same way as in the case of Burgers’ equation, shown in Figure 7.7.

Figure 7.8: Differences between the reference solution and several AMR solutions corresponding to different values of τp. Burgers’ equation (7.3) with initial data (7.2) at time t = 0.7. All figures are at the same scale. Panels: (a) τp = 10^{-4}; (b) τp = 10^{-5}; (c) τp = 10^{-6}; (d) complete grid hierarchy. Axes: uAMR − uREF (in units of 10^{-4}) against x.

Figure 7.9: Differences between the reference solution and several AMR solutions corresponding to different values of τp. Burgers’ equation (7.3) with initial data (7.2) at time t = 0.7. Panels: (a) zoom in a part of the rarefaction; (b) zoom in the zone of the shock. Curves: τp = 10^{-4}, τp = 10^{-5} and full gridlist (complete grid hierarchy). Axes: error (in units of 10^{-4}) against x.

Shock-entropy wave interaction

The solutions of the previous problems consist of some waves moving through regions in which the solution is piecewise constant. We consider now another problem, proposed by Shu and Osher [161], in which a Mach 3 shock wave interacts with sinusoidal waves in density. In this experiment the solution combines regions with discontinuities and regions with a complicated smooth structure. The Euler equations (2.27) are solved up to time t = 1.8 with initial data

\[
u_0(x) = \begin{cases} (\rho_L, v_L, p_L) = (3.857, 2.629, 10.33) & \text{if } x \leq -4, \\ (\rho_R, v_R, p_R) = (1 + 0.2 \sin(5x), 0, 1) & \text{otherwise.} \end{cases} \tag{7.6}
\]

The computational domain has been set to [−5, 5], with outflow boundary conditions at x = −5 and inflow boundary conditions at x = 5. The density, velocity and pressure corresponding to the initial data (7.6) are depicted in Figure 7.14. Figures 7.15, 7.16 and 7.17 show, respectively, the density, velocity and pressure distributions computed by the AMR algorithm, and a good agreement between the AMR and the reference solution can be observed. We have used the previous grid hierarchy (5 refinement levels, coarsest grid of 50 points), with the parameters τp = 5 · 10^{-3}, τg = 8.0, tc = 0.8, K = 0.5. With this setup the algorithm performs 29% of the integrations needed by the algorithm applied to a grid of 800 points.
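The left state in (7.6) can be checked against the Rankine-Hugoniot relations for a Mach 3 shock moving into the unperturbed right state (ρ, v, p) = (1, 0, 1). The Python sketch below uses the standard normal-shock formulas for an ideal gas and assumes γ = 1.4, a value that is not stated in this section. The function name is illustrative.

```python
import math


def post_shock_state(mach, rho0, p0, gamma=1.4):
    """State behind a shock of Mach number `mach` moving into gas at rest.

    The gas ahead of the shock has density rho0, zero velocity and pressure p0.
    Returns (rho1, v1, p1), with v1 measured in the direction of propagation.
    """
    m2 = mach * mach
    rho1 = rho0 * (gamma + 1.0) * m2 / ((gamma - 1.0) * m2 + 2.0)
    p1 = p0 * (1.0 + 2.0 * gamma / (gamma + 1.0) * (m2 - 1.0))
    shock_speed = mach * math.sqrt(gamma * p0 / rho0)
    # Mass conservation across the shock: rho0 * s = rho1 * (s - v1).
    v1 = shock_speed * (1.0 - rho0 / rho1)
    return rho1, v1, p1


rho1, v1, p1 = post_shock_state(3.0, 1.0, 1.0)
print(round(rho1, 3), round(v1, 3), round(p1, 2))
```

With these assumptions the computed values round to (3.857, 2.629, 10.33), in agreement with the left state of (7.6).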

Figure 7.10: Solution of the shock tube problem for the Euler equations (2.27) with initial data (7.4) for t = 2. Panels: (a) density; (b) velocity; (c) pressure; (d) grid hierarchy used in the last iteration. Curves in (a) to (c): fixed grid and AMR solutions against x. Panel (d) shows the refinement level against x.

Figure 7.11: Zoomed regions for the density field of the solution of the shock tube problem for the Euler equations (2.27) with initial data (7.4) for t = 2. Panels: (a) density, head of the rarefaction; (b) density, tail of the rarefaction; (c) density, contact discontinuity; (d) density, shock wave. Curves: fixed grid and AMR solutions against x.

Figure 7.12: Zoomed regions for the velocity and pressure fields of the solution of the shock tube problem for the Euler equations (2.27) with initial data (7.4) for t = 2. Panels: (a) velocity, head of the rarefaction; (b) velocity, tail of the rarefaction; (c) velocity, shock wave; (d) pressure, head of the rarefaction; (e) pressure, tail of the rarefaction; (f) pressure, shock wave. Curves: fixed grid and AMR solutions against x.

Figure 7.13: Difference in the density field between the solution computed with the AMR algorithm and the solution computed on an equivalent fixed grid, with respect to the tolerance τp (a) and to the percentage of integrations (b), for the shock tube problem for the Euler equations (2.27), with initial data (7.4), for time t = 2. Curves: 1-norm, 2-norm and max-norm.

Figure 7.14: Initial data (7.6). Panels: (a) density; (b) velocity; (c) pressure, each plotted against x.

Figure 7.15: Numerical solution of the shock-entropy wave interaction problem. Euler equations (2.27) with initial data (7.6). Density field for t = 1.8. Panels: (a) density field; (b), (c), (d) density, zoomed regions. Curves: fixed grid and AMR solutions against x.

Figure 7.16: Numerical solution of the shock-entropy wave interaction problem. Euler equations (2.27) with initial data (7.6). Velocity field for t = 1.8. Panels: (a) velocity field; (b), (c), (d) velocity, zoomed regions. Curves: fixed grid and AMR solutions against x.

Figure 7.17: Numerical solution of the shock-entropy wave interaction problem. Euler equations (2.27) with initial data (7.6). Pressure field for t = 1.8. Panels: (a) pressure field; (b), (c), (d) pressure, zoomed regions. Curves: fixed grid and AMR solutions against x.

Figure 7.18: Difference in the density field between the solution with the AMR algorithm and the solution on an equivalent fixed grid, with respect to the tolerance τp (a) and to the percentage of integrations (b), for the shock-entropy wave interaction problem for the Euler equations (2.27), with initial data (7.6) for t = 1.8. Curves: 1-norm, 2-norm and max-norm.

Figure 7.18(a) shows the errors corresponding to different values of the tolerance parameter τp, for the density distribution. The error in this case behaves differently from the cases previously studied (compare Figure 7.18(a) with Figures 7.2(b), 7.7(a) and 7.13(a)). We observe a combination of zones where the norm of the error does not change significantly and zones with abrupt changes in the error. This behavior can be explained as the result of the influence on the error of several sources of differences between the AMR solution and the fixed grid reference solution. When analyzing the relation between the error and the refinement tolerance in previous experiments (Burgers’ equation and the shock tube problem) we concluded that there was a difference between the solution on a fixed grid and the solution obtained by the AMR algorithm that is not related to the refinement but to the organization of the algorithm, and that is present even if the numerical solution is obtained with a complete grid hierarchy. We observed that this difference was due to the fact that the numerical solution computed at the shock wave is extremely sensitive to perturbations. In the case of the experiments in sections 7.1.2 and 7.1.3, this was the error source that dominated the error for tolerances smaller than a certain value, so that the error was nearly constant for that range of tolerances.

In the case of the shock-entropy wave interaction problem, the solution is much more complicated and, as the refinement changes, different factors can influence the error. Consider, for example, the abrupt error decrease that occurs near τp = 10^{-2}. We plot in Figure 7.19 the density distribution of two solutions of the AMR algorithm corresponding to two different, but close, tolerances, τp = 8.65 · 10^{-3} and τp = 1.30 · 10^{-2}. Although the AMR algorithm requires nearly the same percentage of integrations with respect to the reference solution for these two tolerances (24.55% for τp = 8.65 · 10^{-3} and 24.49% for τp = 1.30 · 10^{-2}), the errors for τp = 8.65 · 10^{-3} are, depending on the case, 4 to 5 times smaller than the errors for τp = 1.30 · 10^{-2}, and a difference in the numerical solutions is clearly visible in the plots of Figure 7.19.

To analyze why this happens, in Figure 7.20 we plot the differences between the AMR solution and the reference solution for several values of τp in the range from 8.65 · 10^{-3} to 2.92 · 10^{-2}. Comparing Figures 7.20(a), 7.20(b) and 7.20(c) we observe that the reduction in the tolerance parameter mainly produces a reduction of the error in the acoustic waves located to the left of the point x = 0, but does not produce a significant reduction in the zone near the shock wave. As the error at the shock is the largest one, the overall reduction of the error is small, as can be observed in Figure 7.18(a). If the tolerance is set to τp = 8.65 · 10^{-3} (Figure 7.20(d)), the solution at the shock is better approximated and the error drops abruptly. Of course, the adaptation algorithm has followed the shock as it moves for all the tolerances referred to, and the reason for the different quality of the solutions obtained with different tolerances is not a lack of refinement at the shock location for the larger tolerances, but the high sensitivity of this problem to perturbations.

In this particular case, the difference comes from the fact that for the smallest tolerance τp = 8.65 · 10^{-3} the algorithm has included two refinement levels (levels 0 and 1) during the whole integration process in the zone where the sinusoidal waves are initially located, whereas for the other tolerances it has not, as shown in Figure 7.21. Although the interpolation errors in that zone are small for all the tolerances considered, because of the smoothness of the solution, the smaller interpolation errors obtained with the smaller tolerance produce a better approximation of the states at both sides of the shock. The same argument explains the large decrease in the error that can be observed in Figure 7.18(a) for tolerances near 2 · 10^{-3}.
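The error curves in Figures 7.7, 7.13 and 7.18 report the difference between the AMR and the fixed grid solutions in the 1-norm, 2-norm and max-norm. A minimal Python sketch of such a comparison is given below. The scaling of the discrete norms by the mesh size h is a common convention and is assumed here, since the exact definition used for the plots is not given in this section. The function name and the toy data are illustrative.

```python
import numpy as np


def error_norms(u_amr, u_ref, h):
    """Discrete 1-, 2- and max-norms of u_amr - u_ref on a uniform grid of spacing h."""
    diff = np.asarray(u_amr, dtype=float) - np.asarray(u_ref, dtype=float)
    norm_1 = h * np.sum(np.abs(diff))
    norm_2 = np.sqrt(h * np.sum(diff ** 2))
    norm_max = np.max(np.abs(diff))
    return norm_1, norm_2, norm_max


# Toy data: the two solutions agree except in a single cell next to a shock.
h = 10.0 / 800
u_ref = np.zeros(800)
u_amr = u_ref.copy()
u_amr[440] = 0.2
norm_1, norm_2, norm_max = error_norms(u_amr, u_ref, h)
```

In this toy example a discrepancy confined to one cell gives a max-norm equal to the size of the discrepancy, while the 1-norm is smaller by a factor h. A difference localized at the shock can therefore control all three curves even when the rest of the solution has converged.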

Figure 7.19: Numerical solution obtained with the AMR algorithm for two values of τp, for the shock-entropy wave interaction problem. Density field for t = 1.8. Panels: (a) density field; (b), (c), (d) density, zoomed regions. Curves: fixed grid; AMR, τp = 8.65 · 10^{-3}; AMR, τp = 1.30 · 10^{-2}.

Figure 7.20: Difference at time t = 1.8 in the density field between the reference solution and several AMR solutions, for different refinement parameters. Shock-entropy wave interaction problem. All figures are at the same scale. Panels: (a) τp = 2.92 · 10^{-2}; (b) τp = 1.95 · 10^{-2}; (c) τp = 1.30 · 10^{-2}; (d) τp = 8.65 · 10^{-3}. Axes: uAMR − uREF (density) against x.

Figure 7.21: Grid hierarchies constructed by the AMR algorithm for the first time iteration, for two different refinement parameters. Shock-entropy wave interaction problem. Panels: (a) τp = 1.30 · 10^{-2}; (b) τp = 8.65 · 10^{-3}. Axes: refinement level against x.

7.1.4 Two-component Euler equations in 1D

The numerical experiment presented in this section consists of the one-dimensional version of the interaction of a shock wave traveling in air with a helium bubble. The physical problem was originally studied by Haas and Sturtevant in [60] and has been simulated by Karni [83] using a primitive formulation and a second order algorithm, with simulations including AMR in [142]. The integration algorithm presented in this work has been applied to the helium bubble problem on a fixed grid in [118].

The setup of the problem is as follows: the two-component Euler equations (2.42) are considered in the interval [0, 0.356], with initial data that represents a one-dimensional helium “bubble”, located in the interval [0.15, 0.20] and surrounded by air. A left-traveling Mach 1.22 shock wave is located at x = 0.225. The initial data is thus given by:

\[
u(x, 0) = u_0(x) = \begin{cases} u_A & \text{if } 0 \leq x < 0.15 \text{ or } 0.2 < x < 0.225, \\ u_B & \text{if } 0.15 \leq x \leq 0.2, \\ u_S & \text{if } 0.225 \leq x \leq 0.356, \end{cases} \tag{7.7}
\]

where

\[
\begin{aligned}
u_A &= (\rho_A, v_A, p_A, \phi_A) = (1, 0, 1, 1), \\
u_B &= (\rho_B, v_B, p_B, \phi_B) = (0.1819, 0, 1, 0), \\
u_S &= (\rho_S, v_S, p_S, \phi_S) = (1.3764, -0.3947, 1.5698, 1).
\end{aligned} \tag{7.8}
\]

Quiescent air is represented by uA, uB represents quiescent helium and uS is the state connected with uA by a Mach 1.22 shock traveling to the left. The initial data is depicted in Figure 7.22. This is the same setup used in [118] for this experiment.

Figure 7.22: Initial data (7.7) for the shock-bubble interaction problem in 1D. Panels: (a) density; (b) velocity; (c) pressure; (d) mass fraction φ, each plotted against x.

We have used the same grid hierarchy as in previous experiments (5 levels, coarsest grid of 50 points, equivalent to a uniform grid of 800 points). The sample result in Figure 7.23 corresponds to time t = 0.1, and has been obtained with the parameters τg = 8.0, τp = 10^{-4}, tc = 0.7 and K = 0.5. In this case the AMR algorithm performs 54.97% of the integrations needed by the algorithm applied to a fixed grid of 800 points. Zooms of the plots in Figure 7.23 are shown in Figures 7.24, 7.25 and 7.26. Figure 7.27 shows how the error is related to the tolerance parameter τp and to the percentage of integrations. The error behaves in the same way as in previous examples, with parts where the error decreases linearly and flat zones.

We use this experiment to show the effects of using the flux projection from fine to coarse grids, described in sections 5.4 and 6.1.4. It was argued there that the update of the coarse fluxes using finer fluxes enforces inter-grid conservation and provides a benefit in terms of computational cost, because the coarse solution updated with numerical fluxes coming from the finer grid has an increased resolution. If some data is then interpolated from coarse to fine grids, the results are more accurate if the conservative fix-up has been applied. This feedback mechanism provides a sharper coarse numerical solution which, after some time steps, is visually distinguishable from the coarse solution without the conservative fix-up. An example is shown in Figure 7.28, where the numerical solution computed on the coarsest grid, corresponding to the experiment shown in Figures 7.23, 7.24, 7.25 and 7.26, is shown for both cases (with and without flux projection). The reduced smearing in the case of using flux projection results in a reduction in the number of refined cells. Figure 7.29 shows that, for the same number of integrations, the numerical solution obtained using flux projection is more accurate than the solution obtained without using it (the figure shows the 2-norm of the difference with respect to the reference solution).
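The conservative fix-up discussed above can be illustrated with a generic one-dimensional sketch. It assumes a refinement factor r in time, so that one coarse step of size Δt corresponds to r fine steps of size Δt/r, and it replaces the coarse flux at a coarse-fine interface by the time average of the fine fluxes computed there. This is the usual refluxing idea and is not necessarily the exact formulation of sections 5.4 and 6.1.4. All names and numbers are illustrative.

```python
import numpy as np


def projected_flux(fine_fluxes):
    """Time average of the fine-grid fluxes computed at a coarse interface
    during the r fine substeps that make up one coarse time step."""
    return float(np.mean(fine_fluxes))


def apply_fixup(u_new, flux_used, flux_projected, dt, dx, face):
    """Correct an already updated coarse cell next to a refined patch.

    face = +1 if the shared interface is the right face of the cell,
    face = -1 if it is the left face.
    """
    return u_new - face * (dt / dx) * (flux_projected - flux_used)


# Coarse cell updated with its own fluxes: u_new = u - dt/dx * (F_right - F_left).
u, dt, dx = 1.0, 0.1, 0.2
f_left, f_right = 0.50, 0.40
u_new = u - dt / dx * (f_right - f_left)

# The right face borders a finer grid that took r = 2 substeps.
f_proj = projected_flux([0.44, 0.40])
u_fixed = apply_fixup(u_new, f_right, f_proj, dt, dx, face=+1)
```

After the correction the coarse cell has effectively been updated with the flux that the fine grid used on the other side of the interface, so the amount leaving one grid equals the amount entering the other.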
In all the previous experiments we have used the global efficiency, computed as the percentage of integrations required by the AMR algorithm with respect to a fixed grid of equivalent size, as an indicator of the performance of the AMR algorithm. It has been argued that this is a good measure of the efficiency of the algorithm, because of the relatively higher computational cost of the integration algorithm with respect to other processes as adaptation, interpolation and flux projection. The global efficiency is machine independent, is not affected by secondary processes running on the machine, as input/output operations and can be estimated without running the problem on a fixed grid. To end this section we show a comparison between the global efficiency and the actual percentage of time needed by the AMR algorithm with respect to a fixed grid algorithm, which is the interesting measure of efficiency in practice. Figure 7.30 shows a plot of the (wall-clock) percentage of time

7. Numerical experiments

171

1D Multiphase Euler, Shock−bubble interaction problem, t = 0.1

1D Multiphase Euler, Shock−bubble interaction problem, t = 0.1

1.6

0.1

Fixed grid AMR

Fixed grid AMR 0

1.2

−0.1

1

−0.2 Velocity

Density

1.4

0.8

−0.3

0.6

−0.4

0.4

−0.5

0.2

0

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0

0.05

0.1

0.15

x

0.2

0.25

0.3

0.35

x

(a) Density

(b) Velocity 1D Multiphase Euler, Shock−bubble interaction problem, t = 0.1

1.6

1.5

Pressure

1.4

1.3

1.2

1.1

1 Fixed grid AMR 0.9

0

0.05

0.1

0.15

0.2

0.25

0.3

0.35

x

(c) Pressure

Figure 7.23: Sample numerical solution of the shock-bubble interaction problem in 1D, t = 0.1.

172

7.1. One-dimensional tests

1D Multiphase Euler, Shock−bubble interaction problem, t = 0.1

1D Multiphase Euler, Shock−bubble interaction problem, t = 0.1 Fixed grid AMR

1.4

Fixed grid AMR

0.45

1.35 0.4 1.3

Density

Density

1.25

1.2

0.35

1.15 0.3 1.1

1.05 0.25 1

0.95 0.05

0.06

0.07

0.08

0.09

0.1

0.11

0.12

0.13

0.12

0.125

0.13

0.135

0.14

x 1D Multiphase Euler, Shock−bubble interaction problem, t = 0.1

0.145 x

0.15

0.155

0.16

0.165

0.17

1D Multiphase Euler, Shock−bubble interaction problem, t = 0.1 Fixed grid AMR

Fixed grid AMR

1.4

1.4

1.35

Density

Density

1.35

1.3

1.3

1.25

1.25 1.2

1.15

1.2 0.16

0.17

0.18

0.19 x

0.2

0.21

0.22

0.23

0.24

0.25

0.26

0.27

0.28

0.29

0.3

x

Figure 7.24: Zoomed regions for the sample numerical solution of the shockbubble interaction problem in 1D, t = 0.1. Density.

7. Numerical experiments

173

1D Multiphase Euler, Shock−bubble interaction problem, t = 0.1

1D Multiphase Euler, Shock−bubble interaction problem, t = 0.1

0.05 Fixed grid AMR

0

Fixed grid AMR

−0.38

−0.4 −0.05 −0.42 −0.1 −0.44 Velocity

Velocity

−0.15

−0.2

−0.46

−0.48

−0.25

−0.5

−0.3

−0.35

−0.52

−0.4

−0.54

−0.45

−0.56 0.05

0.06

0.07

0.08

0.09

0.1

0.11

0.12

0.13

0.18

0.2

0.22

x

0.24

0.26

0.28

x

Figure 7.25: Zoomed regions for the sample numerical solution of the shockbubble interaction problem in 1D, t = 0.1. Velocity.

Figure 7.26: Zoomed regions for the sample numerical solution of the shock-bubble interaction problem in 1D, t = 0.1. Pressure.

Figure 7.27: Difference in the density field between the solution obtained with the AMR algorithm and the solution on an equivalent fixed grid, with respect to the tolerance τp (a) and to the percentage of integrations (b), for the shock-bubble interaction problem. The error is shown in the 1-norm, the 2-norm and the max-norm. Multi-component Euler equations (2.42) with initial data (7.7), t = 0.1.

required by the AMR algorithm, measured with the time command provided by Linux, and the real global efficiency. We observe a linear relation between the percentage of time and the percentage of integrations, which is close to the optimal relation represented by the solid straight line in the plot. We conclude that the global efficiency gives a good estimate of the relative computational cost of the AMR algorithm. For the experiments shown in the plot, we ran the shock-bubble and shock-entropy wave interaction problems for different values of τp, and compared the time and the number of integrations they required with the same quantities for the algorithm applied on a fixed grid. To minimize the influence of the system, we use the average times over several runs, and we use setups with more points, so that the relative weight of the startup times is reduced. For the shock-bubble interaction problem we used a grid hierarchy of 5 levels, with a coarse grid of 100 points, which is equivalent to a fixed grid of 1600 points. For the shock-entropy wave problem we used 6 refinement levels with a coarsest grid of 60 points, equivalent to a fixed grid of 1920 points. The rest of the parameters were set to K = 0.5, τg = 8 and tc = 0.8 for both experiments.
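The equivalent fixed-grid sizes quoted above can be reproduced with a few lines of Python. This is only a sketch: the function name is hypothetical, and it assumes a refinement factor of 2 between consecutive levels, which is consistent with the figures given (100 points and 5 levels give 1600 points, 60 points and 6 levels give 1920 points).

```python
def equivalent_fixed_grid(coarse_points, levels, factor=2):
    """Points per dimension of the uniform grid that matches the finest AMR level.

    Assumption (not stated explicitly in the text): every level refines the
    previous one by the same integer factor, 2 by default.
    """
    if levels < 1:
        raise ValueError("a grid hierarchy needs at least one level")
    return coarse_points * factor ** (levels - 1)


# Setups used for the timing experiments.
shock_bubble_points = equivalent_fixed_grid(100, 5)
shock_entropy_points = equivalent_fixed_grid(60, 6)
print(shock_bubble_points, shock_entropy_points)
```

The same helper applies per dimension in 2D, for instance to a 32 × 32 coarse grid with 7 levels.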

Figure 7.28: Numerical solution on the coarsest grid (50 points), for the shock-bubble interaction problem, with and without using flux projection. Multi-component Euler equations (2.42) with initial data (7.7), t = 0.1. Same setup as in Figure 7.23. Panels: (a) density, (b) velocity, (c) pressure, (d) mass fraction.

Figure 7.29: 2-norm of the error (density) obtained with and without flux projection, with respect to the percentage of integrations. Shock-bubble interaction problem, t = 0.1.

Figure 7.30: Relation between the percentage of time and the global efficiency, for the shock-bubble and the shock-entropy wave interaction problems. The experimental data are plotted together with the line %time = %integrations.

7.2 Two-dimensional tests

Various problems for the Euler equations and the multi-component Euler equations in 2D are considered in this section. In section 7.2.1 we consider a Riemann problem whose initial data define four shock waves. We next address in section 7.2.2 the double Mach reflection problem, where a shock wave encounters a wedge. The interaction of a shock wave with a vortex is addressed in section 7.2.3. Finally, the 2D version of the shock-bubble interaction problem for the multi-component Euler equations, whose one-dimensional version was studied in section 7.1.4, is considered in section 7.2.4.

Schlieren-type images are used in some figures for the visualization of flow features. The discrete function

$$ s_{i,j} = \exp\left( -k \, \frac{|\nabla u_{i,j}|}{\max_{l,m} |\nabla u_{l,m}|} \right) $$

is computed, using a discrete approximation of the gradient, and is plotted as a grayscale image in which darker pixels correspond to stronger variations of the plotted variable. The quantity u represents here a physical variable, typically density or pressure. In the case of a single fluid k is a constant, and in the case of two fluids the value of k depends on the fluid, being higher for the lighter fluid. We refer to [118] for details.
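As an illustration of the schlieren function, the following Python sketch evaluates s on a small array stored as a list of lists. The function name, the value k = 10, the unit mesh sizes and the use of central differences (one-sided at the boundary) are assumptions made for this example; the values of k actually used are those discussed in [118]. The array is assumed to have at least two points in each direction.

```python
import math


def schlieren(u, k=10.0, dx=1.0, dy=1.0):
    """Return s[i][j] = exp(-k |grad u|_{i,j} / max |grad u|) for a 2D list u."""
    nx, ny = len(u), len(u[0])

    def diff(values, idx, n, h):
        # Central difference in the interior, one-sided difference at the ends.
        lo, hi = max(idx - 1, 0), min(idx + 1, n - 1)
        return (values(hi) - values(lo)) / ((hi - lo) * h)

    grad = [[0.0] * ny for _ in range(nx)]
    for i in range(nx):
        for j in range(ny):
            ux = diff(lambda a: u[a][j], i, nx, dx)
            uy = diff(lambda b: u[i][b], j, ny, dy)
            grad[i][j] = math.hypot(ux, uy)

    gmax = max(max(row) for row in grad)
    if gmax == 0.0:
        # Constant field: no variation anywhere, so the image is uniformly white.
        return [[1.0] * ny for _ in range(nx)]
    return [[math.exp(-k * g / gmax) for g in row] for row in grad]


# A jump in the first index: the cells next to the jump come out darkest.
field = [[1.0 if i < 2 else 2.0 for _ in range(3)] for i in range(4)]
image = schlieren(field)
```

Values of s close to 1 map to light pixels and values close to 0 to dark pixels, so increasing k darkens weaker features.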

7.2.1 A Riemann problem for the Euler equations

The first problem is a Riemann problem for the Euler equations (2.25)–(2.26), corresponding to configuration 3 of [155] (see also [101]). To define a Riemann problem in 2D, the domain, which is assumed here to be the unit square, is divided into four equal parts, as in Figure 7.31, and four constant states (ρk, uk, vk, pk), k ∈ {I, II, III, IV}, are considered as initial data, so that a single wave appears at each interface between two parts. In this particular case the initial data correspond to four shock waves and are given by

$$
\begin{array}{llll}
\rho_1 = 1.5, & u_1 = 0, & v_1 = 0, & p_1 = 1.5, \\
\rho_2 = 0.5323, & u_2 = 1.206, & v_2 = 0, & p_2 = 0.3, \\
\rho_3 = 0.138, & u_3 = 1.206, & v_3 = 1.206, & p_3 = 0.029, \\
\rho_4 = 0.5323, & u_4 = 0, & v_4 = 1.206, & p_4 = 0.3,
\end{array}
\tag{7.9}
$$

where the subscripts 1, 2, 3, 4 refer to the parts I, II, III, IV, respectively.

Figure 7.31: Sketch of the computational domain for a Riemann problem in 2D, showing the four parts I, II, III and IV.

We solve this problem up to time t = 0.5. The parameters used for this simulation are τp = 10^{-4}, tc = 0.9 and K = 0.45. We show the results corresponding to two different grid hierarchies. Figure 7.32 shows a schlieren image of the density field for a grid hierarchy of 7 levels, with a coarse grid of 32 × 32 points, equivalent to a resolution of 2048 × 2048 points. A zoom of the central part is shown in Figure 7.33. These figures can be compared with the results obtained with a fixed grid of resolution 2048 × 2048, shown in Figures 7.34 and 7.35. Due to the high sensitivity of the solution to small variations in the parameters, the solutions are slightly different, but they have the same quality, with a comparable resolution of the features of the solution. The AMR simulation requires 14% of the integrations required on a fixed grid and, running on 4 processors in parallel, takes a wall-clock time of 16496 seconds (around 4.6 hours). The result on a fixed grid was obtained with a single processor and took a wall-clock time of 326131 seconds (around 90.6 hours), which corresponds to an optimal timing of around 22.65 hours had it been run in parallel on 4 processors, as the AMR simulation was. Comparing these timings, we conclude that the AMR algorithm used (at most) 20.23% of the time used by the algorithm on a fixed grid. Since 14% corresponds to the integrations, the time spent in parallelization and in operations related to the AMR algorithm (adaptation, projection, interpolation, etc.) amounts to only about 6%.

Figure 7.32: Numerical schlieren image of the density field for the 4-shocks Riemann problem, computed with a grid hierarchy of 7 levels. Euler equations (2.25)–(2.26) with initial data (7.9). Time t = 0.5.

We have repeated the same experiment with the same setup as before but with 8 refinement levels instead of 7. The plots in Figures 7.36 and 7.37 correspond, respectively, to the plots in Figures 7.32 and 7.33. In this case the global efficiency is 12.8%.
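The timing comparison for the 7-level run can be checked with a short calculation. The sketch below uses a hypothetical helper name and assumes ideal (linear) speed-up of the fixed-grid code on 4 processors, which is the optimistic hypothesis made in the text; the input numbers are the wall-clock times and the percentage of integrations reported above.

```python
def amr_time_breakdown(amr_seconds, fixed_serial_seconds, processors,
                       integration_fraction):
    """Compare an AMR run with an ideally parallelized fixed-grid run.

    Returns the ideal fixed-grid wall-clock time, the fraction of that time
    used by the AMR run, and the part of that fraction not explained by the
    integrations (parallelization and AMR bookkeeping).
    """
    ideal_fixed_seconds = fixed_serial_seconds / processors  # linear speed-up assumed
    time_fraction = amr_seconds / ideal_fixed_seconds
    overhead_fraction = time_fraction - integration_fraction
    return ideal_fixed_seconds, time_fraction, overhead_fraction


ideal, time_fraction, overhead = amr_time_breakdown(16496, 326131, 4, 0.14)
print(f"ideal fixed-grid time on 4 processors: {ideal / 3600:.2f} h")
print(f"AMR time relative to the fixed grid:   {100 * time_fraction:.2f} %")
print(f"overhead beyond the integrations:      {100 * overhead:.2f} %")
```

Because a real parallel fixed-grid run would not reach linear speed-up, the ratio obtained this way is an upper bound for the relative cost of the AMR run, which is why the text says "at most".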

Figure 7.33: Zoom of the central part of Figure 7.32.

Figure 7.34: Numerical schlieren image of the density field for the 4-shocks Riemann problem, computed with a fixed grid of 2048 × 2048 points. Euler equations (2.25)–(2.26) with initial data (7.9). Time t = 0.5.

Figure 7.35: Zoom of the central part of Figure 7.34.

Figure 7.36: Numerical schlieren image of the density field for the 4-shocks Riemann problem, computed with a grid hierarchy of 8 levels. Euler equations (2.25)–(2.26) with initial data (7.9). Time t = 0.5.

Figure 7.37: Zoom of the central part of Figure 7.36.

7.2.2 Double Mach reflection

This is a classical experiment that appears in [39]. We consider the Euler equations (2.25)–(2.26). The problem consists of a vertical Mach 10 shock wave that moves horizontally to the right and encounters a wedge at some point x0 on the x axis. This is equivalent to sending a diagonal shock wave into a straight tube with a reflecting wall. The reflection of the shock on the wall produces a jet of dense gas with a complicated structure. The domain used in this experiment is sketched in Figure 7.38, and the corresponding initial data are given by

$$
\begin{array}{llll}
\rho_{I} = 8.0, & u_{I} = 8.25\cos(\alpha), & v_{I} = -8.25\sin(\alpha), & p_{I} = 116.5, \\
\rho_{II} = 1.4, & u_{II} = 0, & v_{II} = 0, & p_{II} = 1.0,
\end{array}
\tag{7.10}
$$

where the zones I and II are the ones indicated in Figure 7.38. The angle α is the inclination of the shock with respect to the vertical. We consider the computational domain [0, 4] × [0, 1]. As in [39] we assume an inclination of the shock equal to π/6, and we set x0 = 1/4. The numerical solution (density) obtained by the AMR algorithm for time t = 0.2 is shown in Figure 7.39. It has been obtained with a grid hierarchy of 6 levels, with a coarse grid of 80 × 20 points, which is equivalent to a fixed grid of 2560 × 640 points. The simulation has been run with a single processor, and the parameters have been set to the values τp = 10^{-4}, tc = 0.9 and K = 0.45. Figure 7.40 shows a contour plot of a part of the solution. These figures can be compared with the solution obtained with a fixed grid of size 2560 × 640, shown in Figures 7.41 and 7.42. Both solutions give the same resolution, and only some small differences appear in the roll-up structure in Figures 7.40 and 7.42, due to the sensitivity of this structure to small perturbations. This simulation has a global efficiency of 28%.

Figure 7.38: Sketch of the computational domain for the double Mach reflection problem.
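The initial data (7.10) can be evaluated pointwise as in the following Python sketch. The function name is hypothetical, and the geometry is an assumption based on the usual formulation of this test: the shock front is taken to be the straight line through (x0, 0) tilted by the angle α with respect to the vertical, with zone I (the post-shock state) to its left.

```python
import math


def double_mach_initial_state(x, y, x0=0.25, alpha=math.pi / 6):
    """Return (rho, u, v, p) at the point (x, y) for the initial data (7.10).

    Assumption: zone I lies to the left of the line x = x0 + y * tan(alpha).
    """
    zone_i = (8.0, 8.25 * math.cos(alpha), -8.25 * math.sin(alpha), 116.5)
    zone_ii = (1.4, 0.0, 0.0, 1.0)
    return zone_i if x < x0 + y * math.tan(alpha) else zone_ii


# Sample the data on the cell centers of the 80 x 20 coarse grid of [0, 4] x [0, 1].
nx, ny = 80, 20
dx, dy = 4.0 / nx, 1.0 / ny
coarse = [[double_mach_initial_state((i + 0.5) * dx, (j + 0.5) * dy)
           for j in range(ny)] for i in range(nx)]
```

Sampling at cell centers is a simplification; a finite-volume code would typically use cell averages for the cells cut by the shock.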

Figure 7.39: Plot of the density field for the double Mach reflection problem, computed with the AMR algorithm using a grid hierarchy of 6 levels. Euler equations (2.25)–(2.26) with initial data (7.10). Time t = 0.2.

Figure 7.40: Contour plot of a part of the solution of Figure 7.39.

Figure 7.41: Plot of the density field for the double Mach reflection problem, computed with a fixed grid. Euler equations (2.25)–(2.26) with initial data (7.10). Time t = 0.2.

Figure 7.42: Contour plot of a part of the solution of Figure 7.41.

Figure 7.43: Sketch of the computational domain used in the shock-vortex interaction problem. The sketch marks the regions I to IV, the inner and outer radii a and b of the vortex, the lines x = 0.25 and y = 0.5 through its center, and the shock position x = 0.5.

7.2.3 Shock-vortex interaction

This experiment shows the interaction of a planar stationary shock with a rotating vortex. We have used the same setup as in [144]. The flow is modeled with the Euler equations (2.25)–(2.26). Initially a shock and a vortex are placed in the rectangle [0, 2] × [0, 1], so that they are isolated from each other. The vortex is modeled as a rotating circle with uniform vorticity and an outer annulus with oppositely directed uniform vorticity and opposite total circulation. In our tests the shock is initially located at x = 0.5 and the vortex has its center at (0.25, 0.5), with radius a = 0.075 for the inner circle and b = 0.175 for the outer circle. Figure 7.43 shows a sketch of the computational domain. The shock Mach number is denoted by Ms. At the left of the shock and outside the vortex (region II in Figure 7.43) we fix the initial conditions

$$ \rho_{II} = 1, \qquad u_{II} = \sqrt{\gamma}\, M_s, \qquad v_{II} = 0, \qquad p_{II} = 1. $$

The initial data for region I are derived from standard conditions for a planar moving shock and are given by:

$$
\rho_{I} = \rho_{II}\, \frac{(\gamma + 1) M_s^2}{2 + (\gamma - 1) M_s^2}, \qquad
u_{I} = u_{II}\, \frac{2 + (\gamma - 1) M_s^2}{(\gamma + 1) M_s^2}, \qquad
v_{I} = 0, \qquad
p_{I} = p_{II}\, \frac{2 \gamma M_s^2 - \gamma + 1}{\gamma + 1}.
$$

At the vortex, the (counterclockwise) angular velocity is given by

$$
v_r = \begin{cases}
v_m \dfrac{r}{a} & \text{if } r \le a, \\[2mm]
v_m \dfrac{a}{a^2 - b^2} \left( r - \dfrac{b^2}{r} \right) & \text{if } a \le r \le b, \\[2mm]
0 & \text{if } r > b,
\end{cases}
$$

where $r = \sqrt{(x - 0.25)^2 + (y - 0.5)^2}$ is the distance of a point (x, y) to the center of the vortex and $v_m$ is the maximum angular velocity. The velocity field inside the vortex is simply given by the components of the angular velocity added to the velocity of region II, i.e.

$$ u_{III,IV} = u_{II} - \sin(\theta)\, v_r, \qquad v_{III,IV} = v_{II} + \cos(\theta)\, v_r, $$

where θ is the angle formed by the point (x, y) and the horizontal axis. The density and pressure are given by the expressions [144]

$$
p_{III,IV} = p_{II} \left( \frac{T}{T_{II}} \right)^{\frac{\gamma}{\gamma - 1}}, \qquad
\rho_{III,IV} = \rho_{II} \left( \frac{T}{T_{II}} \right)^{\frac{1}{\gamma - 1}},
$$

where T is the temperature, which is given by

$$
T(r) = T_{II} \left[ 1 + (\gamma - 1) M_v^2\, \frac{s}{(1 - s)^2} \left( \frac{s^2 - 1}{2 s} - \ln(s) \right) + (\gamma - 1) M_v^2\, \frac{s_b - s}{2 s} \right]
$$

if r ≤ a, and by

$$
T(r) = T_{II} \left[ 1 + (\gamma - 1) M_v^2\, \frac{s}{(1 - s)^2} \left( \frac{s_b^2 - 1}{2 s_b} - \ln(s_b) \right) \right]
$$

if a ≤ r ≤ b. The temperature $T_{II}$ is given by $T_{II} = \frac{p_{II}}{\rho_{II} R}$, and the rest of the quantities are given by $s = \frac{a^2}{b^2}$, $s_b = \frac{r^2}{b^2}$ and $M_v = \frac{v_m}{\sqrt{\gamma R T_{II}}}$, which is a measure of the strength of the vortex. Note the similarity between the definition of $M_v$ and the shock Mach number.

We have computed the solution for two configurations taken from [144]. In the first one, a weak shock (Ms = 1.1) interacts with a strong vortex (Mv = 1.7). The simulation has been run on one processor, setting τp = 10^{-4}, tc = 0.9, K = 0.45 and using a grid hierarchy of 6 levels, whose coarse grid is made of 200 × 100 points. This gives a resolution equivalent to a fixed grid of 6400 × 3200 points. Schlieren images obtained with the pressure field of the numerical solution, corresponding to time t = 0.6, are shown in Figures 7.44 and 7.45. The global efficiency for this case is 5.4%.

Figure 7.44: Numerical schlieren image obtained with the density field, corresponding to the interaction of a weak shock (Ms = 1.1) with a strong vortex (Mv = 1.7). Time t = 0.6.

Figure 7.45: A closeup view of a part of Figure 7.44.

The second configuration corresponds to a strong shock (Ms = 7) interacting with the same vortex. The setup is the same as in the previous case except for the number of levels, which has been set to 5, and the final time, which is t = 0.1. The grid hierarchy has the same resolution as a fixed grid of 3200 × 1600 points. In this case the solution has a much more complicated structure, shown in Figures 7.46 and 7.47. This example requires 17.2% of the integrations needed on a fixed grid.
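The vortex profile used in this test (angular velocity, temperature ratio and the resulting density and pressure) can be evaluated with the Python sketch below. The function name is hypothetical and γ = 1.4 is an assumption, since the value of γ is not restated in this section. The gas constant R does not need to be specified: the sound speed in region II is $\sqrt{\gamma R T_{II}} = \sqrt{\gamma p_{II}/\rho_{II}}$, and only the ratio T/T_II enters the density and the pressure.

```python
import math


def vortex_state(r, a=0.075, b=0.175, mach_v=1.7, gamma=1.4,
                 rho_ii=1.0, p_ii=1.0):
    """Return (v_r, rho, p) at distance r from the vortex center."""
    c_ii = math.sqrt(gamma * p_ii / rho_ii)  # sound speed in region II
    v_m = mach_v * c_ii                      # maximum angular velocity
    s = a * a / (b * b)
    s_b = r * r / (b * b)
    pref = (gamma - 1.0) * mach_v ** 2
    if r <= a:
        v_r = v_m * r / a
        ratio = (1.0
                 + pref * s / (1.0 - s) ** 2 * ((s * s - 1.0) / (2.0 * s) - math.log(s))
                 + pref * (s_b - s) / (2.0 * s))
    elif r <= b:
        v_r = v_m * a / (a * a - b * b) * (r - b * b / r)
        ratio = 1.0 + pref * s / (1.0 - s) ** 2 * (
            (s_b * s_b - 1.0) / (2.0 * s_b) - math.log(s_b))
    else:
        v_r, ratio = 0.0, 1.0
    rho = rho_ii * ratio ** (1.0 / (gamma - 1.0))
    p = p_ii * ratio ** (gamma / (gamma - 1.0))
    return v_r, rho, p


for radius in (0.0, 0.075, 0.125, 0.175, 0.3):
    print(radius, vortex_state(radius))
```

The profile is continuous at r = a and r = b, reaches the maximum velocity v_m at r = a, and matches the state of region II for r ≥ b, with density and pressure dropping towards the vortex center.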

Figure 7.46: Numerical schlieren image obtained with the density field, corresponding to the interaction of a strong shock (Ms = 7) with a strong vortex (Mv = 1.7). Time t = 0.1.

Figure 7.47: A closeup view of Figure 7.46.

7.2.4 Shock-bubble interaction

In this section we address the shock-bubble interaction problem in its two-dimensional version. The governing equations for this problem are the multi-component Euler equations (2.43), considered in the rectangle [0, 0.890] × [0, 0.089]. The helium bubble is a circle with center (0.42, 0.0445) and radius r = 0.025. A vertical Mach 1.22 shock wave, initially located at x = 0.6675, is moving left through air, see Figure 7.48. The initial data are the following:

$$
U(x, y, 0) = U_0(x, y) = \begin{cases}
U_A & \text{if } 0 \le x < 0.6675 \text{ and } (x, y) \notin B, \\
U_B & \text{if } (x, y) \in B, \\
U_S & \text{if } 0.6675 \le x \le 0.890,
\end{cases}
$$

where $B = \{(x, y) \in \mathbb{R}^2 : (x - 0.42)^2 + (y - 0.0445)^2 \le 0.025^2\}$ is the circle of center (0.42, 0.0445) and radius r = 0.025, and represents the helium bubble. The value $U_A$ represents quiescent air, $U_B$ represents helium contaminated with 28% of air, in equilibrium with the surrounding air, and $U_S$ is the state connected with quiescent air by a vertical left-moving Mach 1.22 shock wave. The respective values of $U_A$, $U_B$ and $U_S$ are:

$$
\begin{aligned}
U_A &= (\rho_A, u_A, v_A, p_A, \phi_A) = (1.225, 0, 0, 101325, 1), \\
U_B &= (\rho_B, u_B, v_B, p_B, \phi_B) = (0.2228, 0, 0, 101325, 0), \\
U_S &= (\rho_S, u_S, v_S, p_S, \phi_S) = (1.6861, -156.26, 0, 250638, 1).
\end{aligned}
$$

Figure 7.48: Computational domain of the two-dimensional experiment (not to scale). The domain is 890 mm long and 89 mm high, the bubble has a diameter of 50 mm, and the shock is located 222.5 mm from the right boundary.

Note that, in contrast with the one-dimensional experiment in Section 7.1.4, we indicate here the physical values of the magnitudes, with no normalization, so that our results can be compared with the ones in other sources, like [60, 142, 118]. We have used a coarse mesh of 640 × 32 cells to discretize the upper half of the computational domain. To obtain the lower part by symmetry we impose artificial reflecting boundary conditions, following the same approach as Quirk and Karni [142] and Marquina and Mulet [118]. Six levels of refinement with all refinement factors set to 2 have been used to obtain a resolution equivalent to a fixed grid of 20480 × 1024 cells. In this experiment we have used the following parameters: the CFL number has been set to K = 0.45, the refinement parameter is τp = 10^{-4} and the clustering parameter is tc = 0.7. The simulation has been run up to time 1.7385 · 10^{-3}. With this setup the AMR algorithm performs 10.08% of the integrations performed by a fixed grid algorithm. The execution has used 8 processors, with an execution time of 1856450 seconds ≈ 21.5 days. This leads to an estimate of the time needed on a fixed grid of around 7 months. In Figure 7.49 we display the bubble at several stages of the execution, as computed with the AMR algorithm. The times indicated correspond to the time elapsed since the shock arrives at the bubble.
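The piecewise definition of the initial data can be written as a small Python function. This is only an illustrative sketch: the function and constant names are hypothetical, the states are the tuples (ρ, u, v, p, φ) listed above, and no check is made that the point lies inside the computational domain.

```python
U_A = (1.225, 0.0, 0.0, 101325.0, 1.0)       # quiescent air
U_B = (0.2228, 0.0, 0.0, 101325.0, 0.0)      # helium bubble
U_S = (1.6861, -156.26, 0.0, 250638.0, 1.0)  # post-shock air


def initial_state(x, y, shock_x=0.6675, center=(0.42, 0.0445), radius=0.025):
    """Return (rho, u, v, p, phi) at the point (x, y) at time t = 0."""
    inside_bubble = (x - center[0]) ** 2 + (y - center[1]) ** 2 <= radius ** 2
    if inside_bubble:
        return U_B
    if x >= shock_x:
        return U_S
    return U_A


# Cell centers of the 640 x 32 coarse mesh covering the upper half of the domain.
nx, ny = 640, 32
dx, dy = 0.890 / nx, 0.0445 / ny
coarse = [[initial_state((i + 0.5) * dx, 0.0445 + (j + 0.5) * dy)
           for j in range(ny)] for i in range(nx)]
```

Only the upper half of the domain is sampled here, in line with the symmetry argument used in the text.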

Figure 7.49: Numerical schlieren images of the density field of the shock-bubble interaction problem for different times: (a) t = 4 µs, (b) t = 32 µs, (c) t = 143 µs, (d) t = 250 µs, (e) t = 326 µs, (f) t = 386 µs, (g) t = 528 µs, (h) t = 594 µs, (i) t = 812 µs, (j) t = 993 µs, (k) t = 1123 µs, (l) t = 1203 µs.

8 Conclusions and further work

8.1 Conclusions

In this work we describe a numerical method for the solution of hyperbolic systems of conservation laws. Our method results from the combination of a high order shock capturing scheme (built from Shu-Osher's conservative formulation, a fifth order WENO interpolatory technique, Donat-Marquina's flux-splitting method, and a third order Runge-Kutta algorithm) with the AMR technique developed by Berger and collaborators. We show how all these techniques can be merged to build a highly efficient numerical method, and we describe the practical implementation of the algorithm as a sequential or parallel computer program.

We have tested the algorithm with several one- and two-dimensional experiments, which show that our method is able to obtain solutions with the same quality as those obtained without adaptation, but in a much smaller computational time. The extensive testing of the algorithm gives insight into its properties that can be useful in practice to estimate the potential gains that it can provide, as well as its behavior with respect to the parameters involved. With the help of the experiments, we have explained several issues related to the AMR algorithm, in particular

• the behavior of the adaptive method with respect to the same integration algorithm applied on a fixed grid, in terms of the difference between solutions,
• the influence that the refinement procedure used in the adaptation stage has on the quality of the final result and on the performance obtained by the adaptive algorithm, and
• the role of the projection from fine to coarse in the algorithm.

We also present in the appendix a description of the AMR algorithm that is much more general than the descriptions currently found in the scientific literature and that tries to approach the foundations of the algorithms that are described and implemented in practice.

8.2 Further work

Although the performance of the algorithm is satisfactory, we have detected several aspects, mostly related to the implementation, that can improve the efficiency of the algorithm in some cases. In particular, the overhead produced by the AMR algorithm can be reduced by using fast search algorithms in the grid hierarchy, so that the cost of finding mesh connectivity is reduced. The grid connectivity is used in various processes, like computing numerical solutions in the cells at the pads of the grid patches, or creating a numerical solution for an adapted grid. Along the same lines, in some cases it can be useful to merge small patches into bigger ones, wherever this is possible, without relaxing the clustering parameter. This leads to a reduction in the number of cells in the pads of the patches, thus improving the overall efficiency of the algorithm, but a cost has to be paid in order to find adjacent patches that can be merged and to merge them. Some more effort could be invested in the refinement criterion based on interpolation errors to make it less dependent on the choice of the threshold.

The clearest extension of the method is the implementation in 3D. Although the 3D version of the method can be easily described, as is partly done in appendix A, and producing a 3D version from the 2D one could seem a minor task, the amount of detail to be taken into account to make such a code run in practice is so large that we are not currently considering its implementation. Instead, we aim to use our method for other interesting 2D problems, in particular hyperbolic systems of balance laws, where a source term is present (work in preparation), and problems where the characteristic structure may not be fully available analytically, like traffic flow problems (see [45]). Another possible extension is the combination of penalization methods [25] with the AMR algorithm.

A A generic description of the AMR algorithm

The adaptive mesh refinement algorithm is a general purpose framework for the efficient numerical integration of hyperbolic systems of equations. The algorithm was first developed by Berger in [21] and in the joint works with Oliger [23, 24] and Colella [22]. A simplified version was described by Quirk in [139]. Writing an AMR code is a challenging task. Most of the existing AMR implementations are designed for numerical methods based on the classical finite-volume approach, and the original developments of the algorithm were also conceived for finite-volume methods. Since the first descriptions of the algorithm, decisions regarding the optimal way to implement an AMR algorithm have been discussed and taken in the literature. The descriptions of the AMR technique typically focus on the particular implementation made in the text in question, but a general description of the AMR algorithm is missing.

Our description of the AMR algorithm has followed so far the same approach since, for the sake of clarity, we have centered our description on the way used in the implementation. In this chapter we describe an AMR algorithm from a more general point of view, trying to distinguish between an AMR algorithm and an AMR implementation, which is the particular AMR algorithm that results from choosing particular types of grids, integration algorithms, adaptation techniques, etc. The goal of this chapter is to describe the algorithm in wider generality, in order to shed some light on its inner workings, and it is aimed in particular at those readers who are interested in implementing an AMR algorithm in a way different from the common approaches that exist in the literature. Our algorithm can be obtained after some particular choices of the different elements that compose the AMR algorithm.

The advantageous numerical solution of hyperbolic systems of conservation laws using the AMR approach involves an efficient combination of several operations, performed on grids and on the numerical solutions defined on them. We will describe the main ingredients required to build such an algorithm using an AMR grid infrastructure, namely integration, adaptation and projection, for both the general case and the case of Cartesian grids, with special emphasis on the case of uniform cell-refined grids, which are the grids used in our implementation. An AMR algorithm can be built using these pieces and the pseudo-code included in chapter 5 (in particular the algorithm in Fig. 5.9). We start with the description of the grids and grid structures that lie at the basis of the algorithm.

A.1 Grid system

The AMR algorithm uses a hierarchical grid system composed of grids with different levels of resolution. The coarsest grid covers the whole computational domain, and grids with smaller cells are overlapped where more refinement is needed. More grids, with smaller and smaller cells, can be overlapped in turn over parts of the coarser grids, until the desired resolution is achieved. These grids are independent, in the sense that they can be, to some extent, handled in isolation, but some coherence has to be enforced between grids of different resolution, since they can cover the same spatial locations, and therefore different solutions corresponding to the same locations can exist. Although AMR grid hierarchies have not been formally introduced yet, for illustrative purposes we show an example of such a hierarchy in Fig. A.1. A similar example is shown in Fig. 6.1.

Figure A.1: A sample three-level AMR grid hierarchy, composed of the grids G0, G1 and G2.

The goal of this section is to describe the requirements on the sets of grids that form the basis of an AMR algorithm. We start with the definition of grids, which are the simplest elements in the framework.

Definition 9. Given an open bounded set $\Omega \subset \mathbb{R}^d$, a (complete) grid defined on Ω is a family of closed sets $A := \{c_i\}_{i \in I}$ such that:

$$
c_i = \overline{\mathring{c}_i} \quad \forall i \in I, \qquad
\mathring{c}_i \cap \mathring{c}_j = \emptyset \quad \forall i, j \in I,\ i \neq j, \qquad
\bigcup_{i \in I} c_i = \bar{\Omega},
$$

where $I \subset \mathbb{N}$ is a set of indexes, $\bar{\Omega}$ is the closure of Ω and $\mathring{c}_i$ is the interior of the set $c_i$.

A subset of a grid A will be called a subgrid. For practical purposes both complete grids and subgrids will often be called grids henceforth, making an explicit distinction when required. The set of all complete grids defined on Ω will be denoted by $\mathcal{A}_c(\Omega)$, and the set of all subgrids defined on Ω by $\mathcal{A}(\Omega)$. Obviously $\mathcal{A}_c(\Omega) \subset \mathcal{A}(\Omega)$.

Definition 10. Given $A \in \mathcal{A}(\Omega)$, we define the maximum grid size of A as

$$ |A|_{\max} = \max\{|c_i| : c_i \in A\}, $$

where $|c_i|$ denotes the (Lebesgue) measure in $\mathbb{R}^d$ of $c_i$. We define the minimum grid size of A by

$$ |A|_{\min} = \min\{|c_i| : c_i \in A\}. $$

We say that A is uniform if $|A|_{\min} = |A|_{\max}$, and in this case we denote $|A| = |A|_{\min} = |A|_{\max}$.

The definition of grid size provides a way of giving sense to the word resolution, by comparing grid sizes. This motivates the next definition.

Definition 11. Let $A_1, A_2 \in \mathcal{A}(\Omega)$. If $|A_1|_{\max} < |A_2|_{\min}$ we say that $A_1$ is finer than $A_2$, or equivalently that $A_2$ is coarser than $A_1$. If $A_1$ is finer than $A_2$ we will denote this fact by writing $A_1 < A_2$, or $A_2 > A_1$.

Our aim is to define a grid hierarchy, composed of several subgrids, so that we obtain more accurate (better resolved) solutions as the resolution increases (i.e., the grid size decreases). This concept is introduced in the next definition.

Definition 12. Given L > 1 and an open bounded set $\Omega \subset \mathbb{R}^d$, an L-level grid hierarchy defined on Ω is a set of L subgrids $\mathcal{A}_L = \{A_0, \dots, A_{L-1}\}$, with $A_\ell = \{c_i^\ell\}_{i \in I_\ell} \in \mathcal{A}(\Omega)$, such that the following conditions are verified:

$$ A_\ell < A_{\ell-1}, \qquad 1 \le \ell \le L - 1, \tag{A.1} $$

$$ \bigcup_{i \in I_0} c_i^0 = \bar{\Omega}. \tag{A.2} $$

An L-level complete grid hierarchy defined on Ω is a set of L complete grids $\mathcal{G}_L = \{G_0, \dots, G_{L-1}\}$ defined on Ω that verify condition (A.1) above (note that condition (A.2) is automatically satisfied for complete grids).

Condition (A.1) means that the grids get finer as the level increases. Condition (A.2) is necessary in order to be able to solve the PDE in the whole domain of definition of the equation at the coarsest resolution, so that every grid, except the coarsest one, overlaps a grid coarser than it. This property is necessary in order to handle the different procedures that involve data exchange between grids of different resolutions.

Although an AMR algorithm can be, at least formally, built over a grid hierarchy defined as in Definition 12, this could lead to an excessive size ratio between cells of different grids corresponding to the same spatial location. It is more convenient to work with grid hierarchies where each grid is fully contained in its coarser grid. Data transfers between grids that do not correspond to consecutive refinement levels are performed in cascade, passing through every intermediate grid. This approach reduces undesirable effects like high interpolation errors, which would ultimately lead to a loss of accuracy of the algorithm.

Definition 13. Given an L-level grid hierarchy $\mathcal{A}_L = \{A_0, \dots, A_{L-1}\}$ defined on an open bounded set $\Omega \subset \mathbb{R}^d$, with $A_\ell = \{c_i^\ell\}_{i \in I_\ell}$ (0 ≤ ℓ ≤ L − 1), we say that $\mathcal{A}_L$ is nested if it verifies:

$$ \bigcup_{i \in I_\ell} c_i^\ell \subseteq \bigcup_{j \in I_{\ell-1}} c_j^{\ell-1}, \qquad 1 \le \ell \le L - 1. \tag{A.3} $$
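To make Definitions 10 to 13 concrete, the following Python sketch checks them for one-dimensional grids, where each cell is a closed interval stored as a pair (left, right). All function names are hypothetical and the representation is an assumption made for illustration; it is not the data structure of the implementation described in this work. The coverage test only checks inclusion, so condition (A.2) is verified in the sense that the coarsest grid covers the domain.

```python
def uniform_cells(left, count, size):
    """Return `count` consecutive cells of length `size` starting at `left`."""
    return [(left + k * size, left + (k + 1) * size) for k in range(count)]


def grid_sizes(grid):
    """Return (|A|_min, |A|_max) for a grid given as a list of intervals."""
    sizes = [hi - lo for lo, hi in grid]
    return min(sizes), max(sizes)


def is_finer(a1, a2):
    """Definition 11: A1 < A2 if and only if |A1|_max < |A2|_min."""
    return grid_sizes(a1)[1] < grid_sizes(a2)[0]


def covers(grid, cell, tol=1e-12):
    """True if the interval `cell` is contained in the union of the cells of `grid`."""
    left, right = cell
    reach = left
    for lo, hi in sorted(grid):
        if hi <= reach:
            continue        # entirely to the left of the part still uncovered
        if lo > reach + tol:
            break           # gap: the remaining cells start even further right
        reach = hi
        if reach >= right - tol:
            break
    return reach >= right - tol


def is_hierarchy(levels, domain):
    """Conditions (A.1) and (A.2) of Definition 12."""
    finer = all(is_finer(levels[l], levels[l - 1]) for l in range(1, len(levels)))
    return finer and covers(levels[0], domain)


def is_nested(levels):
    """Condition (A.3): every cell of level l lies inside the union of level l - 1."""
    return all(covers(levels[l - 1], cell)
               for l in range(1, len(levels)) for cell in levels[l])


g0 = uniform_cells(0.0, 4, 0.25)       # covers [0, 1]
g1 = uniform_cells(0.25, 4, 0.125)     # covers [0.25, 0.75]
g2 = uniform_cells(0.375, 4, 0.0625)   # covers [0.375, 0.625]
stray = uniform_cells(0.0, 4, 0.0625)  # covers [0, 0.25], outside g1
```

With these grids, [g0, g1, g2] is a nested three-level hierarchy on (0, 1), whereas [g0, g1, stray] still satisfies Definition 12 but is not nested, because the finest grid is not contained in the region covered by g1.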

From now on, we will assume that every grid hierarchy is nested. We now further restrict the discussion to a class of grid hierarchies where each grid is obtained from the coarser one by means of cell subdivision. Almost every AMR algorithm in the literature is based on this kind of grid hierarchy.

Definition 14. Let $\mathcal{A}^L = \{A_0, \dots, A_{L-1}\}$ be a nested L-level grid hierarchy, with $A_\ell = \{c_i^\ell\}_{i \in I_\ell}$ ($0 \le \ell \le L-1$). We say that $\mathcal{A}^L$ is cell-refined if for $1 \le \ell \le L-1$ the following condition holds: let $c_j^{\ell-1} \in A_{\ell-1}$ be such that the set $I_\ell^j := \{i \in I_\ell : \mathring{c}_i^\ell \cap \mathring{c}_j^{\ell-1} \neq \emptyset\}$ is nonempty. Then

$$\bigcup_{i \in I_\ell^j} c_i^\ell = c_j^{\ell-1}. \tag{A.4}$$
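Conditions (A.3) and (A.4) can be checked mechanically in simple cases. The following is a minimal sketch for one-dimensional grids only, assuming (these choices are not part of the text) that each cell is a closed interval stored as a pair of exact fractions and that the cells of a grid overlap only at their endpoints. The function names are hypothetical.

```python
from fractions import Fraction as Fr

def interiors_meet(c, p):
    """True if the open intervals int(c) and int(p) intersect."""
    return max(c[0], p[0]) < min(c[1], p[1])

def is_nested(fine, coarse):
    """1D version of (A.3): each fine cell lies in the union of the coarse cells."""
    for a, b in fine:
        # Total length of the fine cell that is covered by coarse cells.
        covered = sum(min(b, q) - max(a, p)
                      for p, q in coarse if interiors_meet((a, b), (p, q)))
        if covered != b - a:
            return False
    return True

def is_cell_refined(fine, coarse):
    """1D version of (A.4): a coarse cell met by fine cells is exactly tiled by them."""
    for p, q in coarse:
        children = [c for c in fine if interiors_meet(c, (p, q))]
        if not children:
            continue  # the index set I_l^j is empty, nothing to check
        inside = all(p <= a and b <= q for a, b in children)
        if not inside or sum(b - a for a, b in children) != q - p:
            return False
    return True

coarse = [(Fr(0), Fr(1, 2)), (Fr(1, 2), Fr(1))]
good = [(Fr(1, 2), Fr(3, 4)), (Fr(3, 4), Fr(1))]  # subdivides the second coarse cell
bad = [(Fr(1, 4), Fr(3, 4))]                      # straddles a coarse cell interface

print(is_nested(good, coarse), is_cell_refined(good, coarse))  # True True
print(is_nested(bad, coarse), is_cell_refined(bad, coarse))    # True False
```

The second example is nested in the sense of (A.3) but not cell-refined, since its only cell is not obtained by subdividing a single coarse cell.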

The set $I_\ell^j$ is the subset of $I_\ell$ formed by the indexes of the cells of the grid $A_\ell$ whose interior intersects the interior of the cell $c_j^{\ell-1}$. Condition (A.4), along with the nestedness of the grid hierarchy, means that the refinement is performed following a cell-based approach: cells belonging to $A_\ell$ are obtained by subdivision of cells belonging to $A_{\ell-1}$. The grid hierarchy shown in Fig. A.2 is cell-refined. Fig. A.1 shows a cell-refined grid hierarchy composed of uniform grids.

Cell-refined grids are important if we aim to use the AMR technology to numerically solve a hyperbolic system by means of a conservative scheme. In such numerical methods, the solution at a cell is updated using approximations of the fluxes that cross the cell interfaces. If the grid hierarchy is cell-refined, it follows from (A.4) that, if $c_j^{\ell-1}$ is such that $I_\ell^j \neq \emptyset$, then

$$\partial c_j^{\ell-1} \subset \bigcup_{i \in I_\ell^j} \partial c_i^\ell, \tag{A.5}$$

Figure A.2: A sample three-level AMR cell-refined grid hierarchy (grids G0, G1 and G2).

where $\partial c$ indicates the boundary of the set $c$. Property (A.5) suggests that flux approximations computed on a grid can be used to update the solution at its coarser grid, so that conservation between grids is enforced, as was done in sections 5.4 and 6.1.4 when building the projection operator.

Once the basic definitions for grids have been introduced, we assign to each grid a discrete set of data, which conceptually corresponds to the numerical approximation of the solution of a hyperbolic system of conservation laws of the form

$$u_t + \sum_{q=1}^{d} \frac{\partial f^q(u)}{\partial x_q} = 0, \qquad x \in \mathbb{R}^d, \quad t \in \mathbb{R}^+. \tag{A.6}$$

The bundles composed of a grid and its associated numerical solution, and the combinations of these pairs into hierarchies, will be the basic elements used to build an AMR algorithm.

Definition 15. Given a grid $A = \{c_i\}_{i \in I} \in \mathcal{A}(\Omega)$, a numerical solution of (A.6) in $A$ is a map $U : A \to \mathbb{R}^m$. The images of the elements of $A$ through $U$ are denoted by $U(c_i) = U_i$. Both the map $U$ and its image $U(A) = \{U_i\}_{i \in I}$ will be referred to as the numerical solution. The pair $(A, U)$ will be collectively denoted by $M = \{m_i\}_{i \in I}$, where $m_i = (c_i, U_i)$, and will be called a computational grid. The set of all pairs $(A, U)$,


where $A \in \mathcal{A}(\Omega)$ and $U$ is a numerical solution for $A$, will be denoted by $\mathcal{M}(\Omega)$.

Definition 16. $\mathcal{M}_c(\Omega) = \{M = (A, U) \in \mathcal{M}(\Omega) : A \in \mathcal{A}_c(\Omega)\}$. The elements of $\mathcal{M}_c(\Omega)$ will be called complete computational grids.

We now introduce the concept of computational grid hierarchy, which is simply a grid hierarchy together with its associated numerical solutions.

Definition 17. An L-level computational grid hierarchy is a sequence of L computational grids $\{M_0, \dots, M_{L-1}\} = \{(A_0, U_0), \dots, (A_{L-1}, U_{L-1})\}$ such that $\{A_0, \dots, A_{L-1}\}$ is an L-level grid hierarchy. The set of all L-level computational grid hierarchies will be denoted by $\mathcal{M}^L(\Omega)$, and by $\mathcal{M}^L_c(\Omega)$ if all the grids in the grid hierarchy are complete grids, i.e.

$$\mathcal{M}^L_c(\Omega) = \big\{ \{(A_0, U_0), \dots, (A_{L-1}, U_{L-1})\} \in \mathcal{M}^L(\Omega) : A_\ell \in \mathcal{A}_c(\Omega),\ 0 \le \ell \le L-1 \big\}.$$

From now on we will omit the qualifier computational when referring to any type of grid, distinguishing, for example, between ordinary and computational grids or between complete grids and subgrids only when confusion is possible. An implicit distinction will sometimes be made by indicating the set to which the grid belongs.

So far we have defined the basic mathematical structures needed for the solution of (A.6) using an AMR algorithm. We define next some operations that represent the basic elements of any AMR algorithm. We start by establishing relationships between subgrids and complete grids. The basic elements are, on the one hand, embedding operators, which relate a given subgrid to a complete grid, and, on the other hand, restriction operators, which extract subsets from a complete grid. The goal of the restriction and embedding operators is to simplify the description of the other operators involved in the AMR algorithm. Once the operations that relate subgrids and complete grids have been defined, the basic operators can be described for complete grids, obtained through appropriate embeddings, and the result of the operation performed on the complete grid is transferred back to the subgrid using a restriction operator.

This approach of considering embedding and restriction operators as auxiliary procedures for the definition of operations on subgrids is a purely abstract construction, made with the sole goal of clarifying how operations like integration or cell marking are applied to subgrids. The idea is to disregard, by acting on complete grids only, the fact that subgrids

Figure A.3: An example of a restriction operator R in 2D (only the grids are depicted). The figure shows a complete grid M with numbered cells and the subgrid R(M, {9, 27, 34, 36, 37, 40, 48}).

can have arbitrary geometry and need information from a surrounding band (whose size depends on the particular operation), as commented in section 5.1. Of course, for efficiency reasons, the practical implementation of the algorithm does not follow this approach, and acts directly on subgrids.

Definition 18. An embedding operator is any map $E : \mathcal{M}(\Omega) \to \mathcal{M}_c(\Omega)$ such that if $(A, U) \in \mathcal{M}(\Omega)$ and we denote $(\tilde{A}, \tilde{U}) = E(A, U)$, then $A \subset \tilde{A}$ and $\tilde{U}$ is an extension of $U$ as a map.

We will use the following abuse of notation: if $(\tilde{A}, \tilde{U}) = E(A, U)$ we will denote $\tilde{A} = E(A)$ and $\tilde{U} = E(U)$. Note that if $M \in \mathcal{M}_c(\Omega)$ then $E(M) = M$.

Definition 19. A restriction operator is a map $R : \mathcal{M}_c(\Omega) \times \mathcal{P}(\mathbb{N}) \to \mathcal{M}(\Omega)$ defined by $R(\{m_i\}_{i \in I}, I_1) = \{m_j\}_{j \in I \cap I_1}$, where $\{m_i\}_{i \in I} \in \mathcal{M}_c(\Omega)$ and $I_1 \in \mathcal{P}(\mathbb{N})$.

We shall use the same abuse of notation as with the embedding operator: if $(\hat{A}, \hat{U}) = R(A, U, I_1)$ we will denote $\hat{A} = R(A, I_1)$ and $\hat{U} = R(U, I_1)$. An example of a restriction operator with a two-dimensional grid is shown in Fig. A.3.

Remark 3. Note that, if $E$ is an embedding and $R$ is a restriction, then for each $M \in \mathcal{M}(\Omega)$, with $M = \{m_i\}_{i \in I}$, one has $R(E(M), I) = M$. Another remark is that the embedding operator is not unique and that different


operations involved in the update process can be defined through different embeddings, according to the particular characteristics of the operator. These concepts are illustrated in Fig. A.4.

Now, given an operator $P_c : \mathcal{M}_c(\Omega) \to \mathcal{M}_c(\Omega)$ that acts on complete grids, an embedding $E$ and a restriction $R$, we can define an operator $P : \mathcal{M}(\Omega) \to \mathcal{M}(\Omega)$ by setting:

$$P = R \circ P_c \circ E. \tag{A.7}$$

This construction is illustrated in Fig. A.5.
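The construction (A.7) can be sketched in a few lines. The example below is only an illustration under stated assumptions: a one-dimensional computational grid is stored as a dictionary from cell index to value, the complete grid has indices 0 to n_cells - 1, and the embedding fills the missing cells with a constant. The names `restrict`, `embed`, `lift` and `smooth` are hypothetical, and the constant fill is just one of the many possible embeddings.

```python
def restrict(M, indices):
    """R(M, I1): keep the cells of the complete grid M whose index is in I1."""
    return {i: M[i] for i in M if i in indices}

def embed(M, n_cells, fill=0.0):
    """One possible embedding E: extend M to the indices 0..n_cells-1.

    Values already present in M are kept, so the result extends U.
    The constant `fill` used for the missing cells is an arbitrary choice.
    """
    return {i: M.get(i, fill) for i in range(n_cells)}

def lift(Pc, n_cells, fill=0.0):
    """Build P = R o Pc o E as in (A.7), restricting to the index set of M."""
    def P(M):
        return restrict(Pc(embed(M, n_cells, fill)), set(M))
    return P

def smooth(Mc):
    """Sample operator Pc on a complete periodic 1D grid: three-point average."""
    n = len(Mc)
    return {i: (Mc[(i - 1) % n] + Mc[i] + Mc[(i + 1) % n]) / 3 for i in Mc}

M = {2: 3.0, 3: 6.0}            # a subgrid with two cells
P = lift(smooth, n_cells=6)

print(restrict(embed(M, 6), set(M)) == M)  # True, as stated in Remark 3
print(P(M))                                # {2: 3.0, 3: 3.0}
```

The operator `smooth` needs the neighbours of each cell, which the subgrid alone does not provide; the embedding supplies them, and the restriction discards everything outside the original index set.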

A.2 Grid adaptation

The goal of the adaptation process is to track the moving features of the numerical solution as the integration advances it in time. Given a subgrid $A$, after some iterations of the integration algorithm it is necessary to change $A$ so that it adapts to the updated numerical solution. This is precisely the process carried out by the adaptation algorithm. As we will see later on, a correct grid adaptation strategy is essential for any AMR algorithm to make sense. In this section we will denote by $\tilde{A}$ the subgrid obtained from $A$ by adaptation.

The adaptation is composed of two major processes. On the one hand, a procedure that decides which cells will compose the grid $\tilde{A}$ is required. This process will be referred to as the flagging or marking procedure. On the other hand, once the composition of the new grid has been decided, the new cells have to be filled with a numerical solution. The only important restriction when constructing these building blocks is that the resulting grids have to compose a grid hierarchy in the sense of Definition 12. These two processes are described next.

A marking operator, which decides which cells of a subgrid need refinement, is one of the most important components of the AMR algorithm. It allows the algorithm to ensure that discontinuities never escape from a fine grid into a coarser one, which would lead to a loss of accuracy precisely in the region where the AMR algorithm is supposed to provide better resolution through refinement. The actual procedures used to decide whether a cell is refined or not are diverse, and our choices were studied in Section 5.5. We give the following generic definition of a marking operator:


Figure A.4: An example of the combination of restriction and embedding. In the figure, $E_1(M) = M_1$ and $E_2(M) = M_2$, and M = R(M2, {9, 27, 34, 36, 37, 40, 48}) = R(M1, {4, 17, 19, 21, 22, 25, 31}) (only the grids are depicted).

Figure A.5: Construction of an operator P between subgrids through an operator Pc acting on complete grids, an embedding operator E and a restriction operator R. (Diagram: E maps M(Ω) to Mc(Ω), Pc maps Mc(Ω) to Mc(Ω), R maps Mc(Ω) back to M(Ω), and P is the resulting operator from M(Ω) to M(Ω).)

Definition 20. A marking operator is a map

$$F_c : \mathcal{M}_c(\Omega) \to \mathcal{P}(\mathbb{N}). \tag{A.8}$$

Note that $F_c$ acts on complete grids. A marking operator for elements of $\mathcal{M}(\Omega)$, which we denote by $F$, can be defined through a construction similar to the one shown in Figure A.5: if $E$ is an embedding we define

$$F(M) := F_c(E(M)). \tag{A.9}$$

A marking operator is therefore a map that, given a grid as input, returns the set of indices of the cells that need to be included in the subgrid, according to a certain criterion. Once the desired cells have been selected for refinement using a marking operator, a new grid is created from those cells. We formalize this fact in the following definition:

Definition 21. Given a marking operator $F : \mathcal{M}(\Omega) \to \mathcal{P}(\mathbb{N})$, the adaptation operator associated to it is the map $Ad : \mathcal{M}(\Omega) \to \mathcal{M}(\Omega)$ defined by

$$Ad(M) = R(E(M), F(M)), \qquad \forall M \in \mathcal{M}(\Omega),$$

where $E$ is the embedding used to define $F$.

Remark 4. Note that Definition 21 applies to both subgrids and complete grids, the embedding being the identity in the latter case. If the initial grid


is complete, then the adaptation operator is nothing but a particular kind of restriction operator, where the selected indexes are determined by a marking operator.

Remark 5. An adaptation operator defined as in Definition 21 leaves unmodified the cells that are marked and belong to the initial grid. As a consequence, the coarsest grid, which is forced to cover the whole computational domain, remains fixed by adaptation.

Several points have to be taken into account when designing an adaptation strategy (i.e., a marking operator). The first one is to ensure that the result of adapting one or more grids within a grid hierarchy is still a grid hierarchy. If $\mathcal{A}^L = \{A_0, \dots, A_{L-1}\}$ is a grid hierarchy and a grid $A_\ell$ is adapted, $\tilde{A}_\ell = \{\tilde{c}_i^\ell\}_i$ being the adapted grid, then, looking at the definition of grid hierarchy (Definition 12), we have to ensure that conditions (A.1) and (A.2) hold, which means that, if $1 \le \ell < L-1$,

$$A_{\ell+1} < \tilde{A}_\ell < A_{\ell-1},$$

whereas if $\ell = L-1$, $\tilde{A}_{L-1} < A_{L-2}$, and if $\ell = 0$, $\tilde{A}_0 > A_1$ and $\bigcup_{i \in I_0} \tilde{c}_i^0 = \bar{\Omega}$. A simple practical way to proceed is to fix $|\tilde{A}_\ell| = |A_\ell|$ for $1 \le \ell \le L-1$, and to keep the coarsest grid $A_0$ fixed.

For a nested grid hierarchy we also need to enforce the nestedness of the resulting grid hierarchy, i.e.,

$$\tilde{A}_\ell \subset A_{\ell-1} \ \text{ if } 1 \le \ell \le L-1 \qquad \text{and} \qquad A_{\ell+1} \subset \tilde{A}_\ell \ \text{ if } 0 \le \ell < L-1.$$

Several particular strategies to ensure nestedness are possible, and we have discussed our choice in Sections 5.5 and 6.1.2 (page 121).

On the other hand, we have already commented that, because the grids have to follow the moving features of the solution, the adapted grid will in general not be contained in the original grid. Consider a subgrid $M = (A, U)$ and the subgrid $Ad(M) = (\tilde{A}, \tilde{U}) = R(E(M), F(M))$ that results from adaptation for some operators $R$, $E$ and $F$. The fact that $\tilde{A} \setminus A \neq \emptyset$ implies that there are several different possibilities for building a numerical solution for the newly created grid $\tilde{A}$ from the existing grid $A$. The particular definition of $\tilde{U}$ will in fact depend on the definition of the embedding operator, which is not unique. Recall that the value $\tilde{U}(c_i)$ assigned to a cell $c_i \in \tilde{A}$ is exactly the value that comes from the embedding operator. From the definition of the embedding, the numerical


solution for the complete grid has to be an extension of the numerical solution defined on the subgrid, i.e., if $c_i \in A \cap \tilde{A}$ then the value $\tilde{U}(c_i)$ assigned to $c_i$ as an element of $\tilde{A}$ is the same as the value assigned to it as an element of $E(A)$. As the restriction does not modify the numerical solution, we have

$$U(c_i) = \tilde{U}(c_i), \qquad \forall i \in \tilde{I} \cap I,$$

where $I$ is the set of indices that defines $M$ and $\tilde{I} = F(M)$ is the one that defines $\tilde{M}$. Therefore, for the cells that belong to both grids $M$ and $\tilde{M}$, the natural requirement that the numerical solution takes the same value in both grids is imposed by construction, but nothing can be deduced for the rest of the cells. An adaptation operator defined on subgrids can thus be viewed as an operator that changes the grid, while preserving the numerical solution in the cells shared by the original and the adapted grid.

For the cells corresponding to indices that belong to the set $\tilde{I} \setminus I$, the definition of the values of the numerical solution is in principle free, and is usually given by interpolation from the coarse grids, as is done in our case (see Sections 5.5, 5.6 and 6.1.2). For the cells lying on the boundary of the computational domain, external boundary data have to be supplied, as is commonly done when integrating the equations with a single grid. Some procedures to deal with (artificial) boundary conditions are described in Section 3.7.
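The behaviour of the adaptation operator of Definition 21 can be illustrated with a small one-dimensional sketch. Everything specific in it is an assumption made for the example: grids are dictionaries from cell index to value, the embedding fills new fine cells by copying the value of the coarse parent cell (a piecewise constant stand-in for the interpolation mentioned above), and the marking criterion flags cells next to a jump larger than a tolerance. The sketch does not enforce nestedness or cell refinement.

```python
def embed_from_coarse(M, coarse, ratio):
    """E(M): complete fine grid; keep M where defined, else copy the parent value."""
    n_fine = ratio * len(coarse)
    return {i: M.get(i, coarse[i // ratio]) for i in range(n_fine)}

def mark_jumps(Mc, tol):
    """Fc: flag the cells of a complete 1D grid adjacent to a jump larger than tol."""
    flagged = set()
    for i in range(len(Mc) - 1):
        if abs(Mc[i + 1] - Mc[i]) > tol:
            flagged.update({i, i + 1})
    return flagged

def adapt(M, coarse, ratio, tol):
    """Ad(M) = R(E(M), F(M)) with F(M) = Fc(E(M))."""
    Mc = embed_from_coarse(M, coarse, ratio)
    return {i: Mc[i] for i in sorted(mark_jumps(Mc, tol))}

coarse = [1.0, 1.0, 0.0, 0.0]   # values on a complete coarse grid with 4 cells
M = {2: 1.0, 3: 0.9}            # fine subgrid covering the second coarse cell

adapted = adapt(M, coarse, ratio=2, tol=0.5)
print(adapted)  # {3: 0.9, 4: 0.0}
```

Cell 3 belongs to both the original and the adapted grid and keeps its value 0.9, while the new cell 4 receives the value supplied by the embedding.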

A.3 Integration and projection

In this section we introduce into the generic framework described in Section A.1 the operations related to the update of the numerical solution of the system of equations under consideration from one time instant to the next, from a generic point of view. General aspects of numerical methods for fluid dynamics were discussed in chapter 3, while our particular numerical algorithm was described in chapter 4. In later sections we will specialize the discussion to our particular choice, that is, Shu-Osher's finite-difference algorithm with Donat-Marquina's flux split on a uniform cell-refined Cartesian grid hierarchy.

A generic integration algorithm can be expressed in terms of an operator that produces an updated solution from the solutions already computed, as in the following definition:


Definition 22. An integrator or integration operator is a map

$$\mathcal{I} : \mathcal{M}(\Omega) \to \mathcal{M}(\Omega)$$

such that, if $M = (A, U) \in \mathcal{M}(\Omega)$ and $\mathcal{I}(M) = (\hat{A}, \hat{U})$, then $\hat{A} = A$.

An integrator is thus an operator that modifies the values of the numerical solution without modifying the grid. If $\mathcal{I}_c : \mathcal{M}_c(\Omega) \to \mathcal{M}_c(\Omega)$ is an integrator that acts on complete grids, an integrator for subgrids can be defined through the construction (A.7), i.e.

$$\mathcal{I}(M) = R(\mathcal{I}_c(E(M)), I), \qquad \forall M \in \mathcal{M}(\Omega),$$

where $I$ is the set of indexes that defines $M$, $E$ is an embedding and $R$ is a restriction.

Within a grid hierarchy one has different numerical solutions corresponding to different grids. Sometimes these solutions need to be coherent in some sense, for example when they correspond to the same time instant. We define next an operation that represents the modification of the numerical solution at one grid using the solution at a different grid. It is assumed that finer grids can provide more accurate information than coarser grids, so this operation is forced to reduce to the identity if information from a coarser grid is to be used to modify the solution at a finer grid.

Projection operators involving more than two grids (and even maps from $\mathcal{M}^L(\Omega)$ to $\mathcal{M}^L(\Omega)$ that modify more than one grid using information from more than one grid) can be considered, but we will restrict the discussion to the case defined below, where only two grids are considered. Typically these grids will correspond to two consecutive refinement levels within a nested grid hierarchy, and the corrections are applied from the finest to the coarsest grid, passing through each pair of grids in the middle.

Definition 23. A projection operator or projector is a map

$$Pr : \mathcal{M}(\Omega)^2 \to \mathcal{M}(\Omega)$$

such that, given $M_1 = (A_1, U_1)$ and $M_2 = (A_2, U_2) \in \mathcal{M}(\Omega)$, then $Pr(M_1, M_2) = (A_2, \tilde{U})$, and if $A_2 \le A_1$ then $\tilde{U} = U_2$.

A projector is thus an operator that modifies the numerical solution at a grid ($M_2$ in the definition) using a finer grid ($M_1$ in the definition). As with the integration operators, a projector can be defined using a projector $Pr_c : \mathcal{M}_c(\Omega)^2 \to \mathcal{M}_c(\Omega)$ that acts on complete grids by

$$Pr(M_1, M_2) = R(Pr_c(E(M_1), E(M_2)), I), \qquad \forall M_1, M_2 \in \mathcal{M}(\Omega),$$


where $I$ is the set of indexes that defines $M_2$.

So far we have defined the basic components needed to build an adaptive mesh refinement algorithm for the numerical solution of (A.6). The description up to now is, however, very general, and several choices regarding each component, as well as the organization of the algorithm, still have to be made. We describe next an algorithm for the integration of a system of hyperbolic conservation laws using a conservative finite-difference scheme with an adaptive mesh refinement infrastructure based on Cartesian grids. The choice of this class of grid hierarchies is motivated by several factors:

• Our numerical method is based on Shu-Osher's finite-difference formulation, which is well suited to uniform Cartesian grids. Other parts of the numerical method, such as the Weighted ENO reconstruction, are also better suited to Cartesian grids than to other kinds of grids.

• The practical implementation of the algorithm in a computer program is much easier with a Cartesian grid, and there is no special advantage in using other kinds of grids for the kind of problems of interest in this work.

A.4 Cartesian grids

Cartesian grids are a class of grids of particular interest for the AMR algorithm. The original algorithm was actually described for rotated Cartesian grids [21, 23], and most practical implementations, including ours, follow a Cartesian approach. This is mainly due to the relative simplicity of Cartesian grids with respect to other kinds of grids. In this section we describe a generic AMR algorithm for a Cartesian grid infrastructure, using the building blocks defined in the previous section, particularized for this kind of grid. We start by defining Cartesian grids and grid hierarchies.$^1$

$^1$ In this section we will sometimes consider, for simplicity, only grids instead of computational grids, omitting the numerical solution associated to the grids. We will use the suitable abuses of notation where required.

Definition 24. Let Ω be an open cube in $\mathbb{R}^d$. A Cartesian (complete) grid defined on Ω is a grid $G \in \mathcal{A}_c(\Omega)$ that verifies: for each $k \in \{1, \dots, d\}$ there exists a set of indexes $I_k$ and ordered points $\{p^k_{i_k}\}_{i_k \in I_k}$, verifying $p^k_{i_k} < p^k_{i_k+1}$, $\forall i_k \in I_k$, and such that $G = \{c_{i_1,\dots,i_d} : i_k \in I_k,\ 1 \le k \le d\}$, where

$$c_{i_1,\dots,i_d} = \prod_{k=1}^{d} [p^k_{i_k}, p^k_{i_k+1}]. \tag{A.10}$$

Definition 25. Let Ω be an open cube in $\mathbb{R}^d$. A Cartesian subgrid defined on Ω is a subgrid $A \in \mathcal{A}(\Omega)$ such that there exists a complete Cartesian grid $G$ defined on Ω with $A \subseteq G$.

Complete Cartesian grids are therefore complete grids whose elements are hyperrectangles of $\mathbb{R}^d$ that result from the Cartesian product of intervals of the real line. Cartesian subgrids are just subsets of them. The centers of the cells defined by (A.10) are the points

$$x_{i_1,\dots,i_d} = (x^1_{i_1}, \dots, x^d_{i_d}), \qquad i_k \in I_k, \quad 1 \le k \le d, \tag{A.11}$$

where, for each $k$,

$$x^k_{i_k} := \frac{p^k_{i_k+1} + p^k_{i_k}}{2}, \qquad i_k \in I_k. \tag{A.12}$$

In Definitions 24 and 25, if, for a certain $k$, the differences $p^k_{i_k+1} - p^k_{i_k}$ are constant with respect to $i_k$, then the grid is said to be uniform in the k-th Cartesian direction. If the property holds for every $k$, then the grid is uniform, according to Definition 10. The corresponding constants $p^k_{i_k+1} - p^k_{i_k}$ will be denoted by $\Delta x_k$ and referred to as grid sizes. If $\Delta x_k$ does not depend on $k$, then the grid is made of hypercubes, and is said to be a square Cartesian grid, with grid size $\Delta x$. As an illustration, different types of grids are shown in Fig. A.6.

Let Ω be an open hyperrectangle in $\mathbb{R}^d$, that we can write as

$$\Omega = \prod_{k=1}^{d} (a_k, b_k),$$

for certain $a_k, b_k \in \mathbb{R}$, $a_k < b_k$, $1 \le k \le d$. A complete Cartesian grid is fully determined by the number of cells considered in each direction and the spacing of the cells. More precisely, given $d$ positive integers $n_1, \dots, n_d$ and $(n_1 - 1) + \dots + (n_d - 1)$ positive real numbers $\Delta x_1^0, \dots, \Delta x_1^{n_1-2}, \dots, \Delta x_d^0, \dots, \Delta x_d^{n_d-2}$ such that

$$\sum_{i=0}^{n_k-2} \Delta x_k^i < b_k - a_k, \qquad 1 \le k \le d,$$


Figure A.6: 2D Cartesian grids for $\Omega = (0, 1)^2$: a square Cartesian grid (top left), a uniform Cartesian grid (top right), a Cartesian grid that is not uniform in any Cartesian direction (bottom left) and a uniform grid that is not a Cartesian grid (bottom right).

a complete Cartesian grid defined on Ω can be defined as follows: define, for each $k$:

$$p^k_0 = a_k, \qquad p^k_{i+1} = a_k + \sum_{j=0}^{i} \Delta x_k^j = p^k_i + \Delta x_k^i, \quad 0 \le i < n_k - 1, \qquad p^k_{n_k} = b_k. \tag{A.13}$$

We can define $\prod_{k=1}^{d} n_k$ subsets $\{c_{i_1,\dots,i_d} : i_k \in \{0, \dots, n_k - 1\}\}$ of $\bar{\Omega}$, using (A.10), that trivially define a Cartesian complete grid of Ω. The grid obtained from the element

$$(n_1, \dots, n_d, \Delta x_1^0, \dots, \Delta x_1^{n_1-2}, \dots, \Delta x_d^0, \dots, \Delta x_d^{n_d-2}) \in \mathbb{N}^d \times \mathbb{R}^{(n_1-1)+\dots+(n_d-1)}$$

using the above construction will be denoted by

$$D(n_1, \dots, n_d, \Delta x_1^0, \dots, \Delta x_1^{n_1-2}, \dots, \Delta x_d^0, \dots, \Delta x_d^{n_d-2}).$$

This construction simplifies for square or uniform grids. A complete uniform grid is completely defined by a set of d positive integers


$\{n_1, \dots, n_d\}$ in the following way: since the grid spacings $\Delta x_k^i$ are constant for each $k$, the only choice is to take $\Delta x_k = \frac{b_k - a_k}{n_k}$, and (A.13) reduces to

$$p^k_0 = a_k, \qquad p^k_{i+1} = a_k + (i + 1)\,\Delta x_k, \quad 0 \le i \le n_k - 1. \tag{A.14}$$

The sets defined with (A.10) using the points defined in (A.14) form a complete Cartesian grid that is, by construction, uniform. This construction defines a map $D_u : \mathbb{N}^d \to \mathcal{A}_c(\Omega)$ that assigns to each vector $(n_1, \dots, n_d) \in \mathbb{N}^d$ its corresponding grid $D_u(n_1, \dots, n_d)$.

Square grids can only be defined for a grid size $\Delta x$ if

$$n_k := \frac{b_k - a_k}{\Delta x} \in \mathbb{Z}, \qquad \text{for } 1 \le k \le d.$$

In that case, we can repeat the same construction as in the case of uniform grids, according to (A.14) and (A.10), with $\Delta x_k = \Delta x$, $1 \le k \le d$. We will use the notation $D_s(\Delta x)$ for the map that assigns to the quantity $\Delta x$ its corresponding square grid.

The Cartesian grids depicted in Fig. A.6 correspond to $D_s(\frac{1}{16})$ (top left), $D_u(8, 4)$ (top right) and $D(6, 7, \frac{3}{14}, \frac{5}{14}, \frac{2}{14}, \frac{3}{14}, \frac{1}{14}, \frac{2}{14}, \frac{2}{14}, \frac{2}{14}, \frac{1}{14}, \frac{6}{14}, \frac{1}{14})$ (bottom left).

Before entering into Cartesian grid hierarchies, let us comment that, given a Cartesian subgrid according to Definition 25, there exist in general many complete grids that can produce the given subgrid by means of a restriction, see Fig. A.7. We will assume that a given subgrid comes from the restriction of a particular complete grid that is known and fixed for each problem: in other words, the numbers describing the complete grid by means of the operators $D$, $D_u$ or $D_s$ are known. This provides a unique embedding for each subgrid, namely the complete grid from which the subgrid comes.

A Cartesian grid hierarchy is defined as a grid hierarchy composed of Cartesian grids. We will focus on a particular class of Cartesian grid hierarchies: uniform cell-refined Cartesian grid hierarchies.
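The point constructions (A.13) and (A.14), and the divisibility condition for square grids, can be sketched for a single Cartesian direction as follows. This is a minimal illustration with hypothetical function names; exact fractions are used (an implementation choice made here) so that the divisibility test is not affected by rounding.

```python
from fractions import Fraction

def points_general(a, b, spacings):
    """Points p_0, ..., p_n of (A.13) in one direction, with n = len(spacings) + 1 cells."""
    if sum(spacings) >= b - a:
        raise ValueError("the spacings must sum to less than b - a")
    pts = [a]
    for dx in spacings:
        pts.append(pts[-1] + dx)
    pts.append(b)  # the last cell takes the remaining length
    return pts

def points_uniform(a, b, n):
    """Points of (A.14): n equal cells of size (b - a) / n."""
    dx = Fraction(b - a) / n
    return [a + i * dx for i in range(n + 1)]

def square_cell_counts(bounds, dx):
    """Numbers n_k = (b_k - a_k) / dx of a square grid; fails if some n_k is not an integer."""
    counts = []
    for a, b in bounds:
        n = Fraction(b - a) / Fraction(dx)
        if n.denominator != 1:
            raise ValueError("dx does not divide the side length")
        counts.append(int(n))
    return counts

print(points_uniform(0, 1, 4) == [0, Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), 1])  # True
print(points_general(0, 1, [Fraction(1, 4), Fraction(1, 2)]))  # three cells on (0, 1)
print(square_cell_counts([(0, 1), (0, 2)], Fraction(1, 4)))    # [4, 8]
```

A d-dimensional grid is then obtained by taking, as in (A.10), the products of consecutive point pairs over the d directions.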

Figure A.7: Two complete Cartesian grids (top figures) that can produce the same subgrid (bottom) by restriction.

A.4.1 Uniform cell-refined Cartesian grid hierarchies

Given an open cube $\Omega \subset \mathbb{R}^d$, a positive integer $L$, and a set of $L \cdot d$ positive integers $(n_1, \dots, n_d, r_1^0, \dots, r_d^0, \dots, r_1^{L-2}, \dots, r_d^{L-2})$, we can define a unique L-level complete grid hierarchy $\mathcal{G}^L = \{G_0, \dots, G_{L-1}\}$ on Ω as follows:

$$\begin{aligned} &\text{(i)} && G_0 = D_u(n_1, \dots, n_d), \\ &\text{(ii)} && \text{if } G_\ell = D_u(m_1, \dots, m_d) \text{ then } G_{\ell+1} = D_u(m_1 \cdot r_1^\ell, \dots, m_d \cdot r_d^\ell), \quad 0 \le \ell \le L-2. \end{aligned} \tag{A.15}$$

Condition (ii) of (A.15) is equivalent to

$$\text{(iib)} \quad G_\ell = D_u\Big(n_1 \prod_{i=0}^{\ell-1} r_1^i, \dots, n_d \prod_{i=0}^{\ell-1} r_d^i\Big), \qquad 1 \le \ell \le L-1. \tag{A.16}$$

In the grid hierarchies constructed using (A.15) every grid is complete, uniform and cell-refined. In fact, each cell at level $\ell$ is divided into $\prod_{k=1}^{d} r_k^\ell$ cells that belong to the grid at level $\ell + 1$. An example of a grid hierarchy defined in such a way is shown in Fig. A.8. A unique L-level complete Cartesian grid hierarchy on Ω is therefore fully determined by the numbers $(n_1, \dots, n_d, r_1^0, \dots, r_1^{L-2}, \dots, r_d^0, \dots, r_d^{L-2})$, so we can define a map

$$D^L_c : \mathbb{N}^{L \cdot d} \to \underbrace{\mathcal{A}_c(\Omega) \times \dots \times \mathcal{A}_c(\Omega)}_{L}. \tag{A.17}$$
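The recursion (A.15), or equivalently the closed form (A.16), amounts to multiplying the number of cells per direction by the refinement factors level after level. The sketch below illustrates this; the function names and the layout of the refinement factors (one tuple of d factors per level transition) are assumptions made for the example, not the argument ordering of $D^L_c$.

```python
from math import prod

def level_sizes(n, r):
    """Cells per direction on every level, following (A.15).

    n -- tuple (n_1, ..., n_d) for the coarsest grid
    r -- list of L-1 tuples; r[l][k] is the refinement factor between
         levels l and l+1 in direction k (hypothetical layout)
    """
    sizes = [tuple(n)]
    for factors in r:
        sizes.append(tuple(m * f for m, f in zip(sizes[-1], factors)))
    return sizes

def children_per_cell(r, level):
    """Number of level+1 cells into which each cell of the given level is divided."""
    return prod(r[level])

r = [(2, 2), (3, 2)]
print(level_sizes((3, 2), r))   # [(3, 2), (6, 4), (18, 8)]
print(children_per_cell(r, 1))  # 6
```

Each entry of the result is the argument of $D_u$ for the corresponding level, so the whole complete hierarchy is described by these few integers.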

For grid hierarchies we will modify the notation for cells and nodes by including an index that indicates the resolution level, so that a generic cell at level $\ell$ will be denoted by

$$c^\ell_{i_1,\dots,i_d} = \prod_{k=1}^{d} [p^{k,\ell}_{i_k}, p^{k,\ell}_{i_k+1}], \tag{A.18}$$

where

$$p^{k,\ell}_{i_k} = a_k + i_k\,\Delta x_k, \tag{A.19}$$

and $\Delta x_k$ denotes here the grid size of the level $\ell$ grid in the k-th direction. The center of the cell $c^\ell_{i_1,\dots,i_d}$ is given by

$$x^\ell_{i_1,\dots,i_d} = (x^{1,\ell}_{i_1}, \dots, x^{d,\ell}_{i_d}), \tag{A.20}$$

where $x^{k,\ell}_{i_k} = \frac{p^{k,\ell}_{i_k+1} + p^{k,\ell}_{i_k}}{2}$.

Figure A.8: A three-level grid hierarchy corresponding to $D^3_c(4, 2, 2, 2, 2, 4)$.

Definition 26. A Cartesian, uniform, cell-refined grid hierarchy is a cell-refined grid hierarchy $\mathcal{A}^L = \{A_0, \dots, A_{L-1}\}$ such that there exists a complete grid hierarchy $\mathcal{G}^L = \{G_0, \dots, G_{L-1}\}$, defined by

$$\mathcal{G}^L = D^L_c(n_1, \dots, n_d, r_1^0, \dots, r_1^{L-2}, \dots, r_d^0, \dots, r_d^{L-2}), \tag{A.21}$$

for a certain choice of $n_1, \dots, n_d, r_1^0, \dots, r_1^{L-2}, \dots, r_d^0, \dots, r_d^{L-2}$, and such that $A_\ell \subseteq G_\ell$, for $0 \le \ell \le L-1$.


In other words, we consider cell-refined grid hierarchies whose individual grids are obtained, by restriction, from a unique complete Cartesian grid hierarchy $\mathcal{G}^L$, which is in turn cell-refined and is defined by the operator $D^L_c$. In the AMR algorithm the grid hierarchy $\mathcal{A}^L$ changes in time. We will assume that the grid hierarchy $\mathcal{G}^L$ is fixed from the start, so that every grid hierarchy considered in the AMR algorithm is obtained from $\mathcal{G}^L$ using a suitable restriction. If $\mathcal{A}^L$ does not contain empty grids, then $\mathcal{G}^L$ is fully determined by $\mathcal{A}^L$ and by the fact that it is cell-refined and every grid in it is uniform. Moreover, we will assume that the flagging procedure provides an adaptation operator that, from a given grid, produces a cell-refined grid. We will show a simple way to design such a flagging operator in Section A.4.2.

To sum up, the grids under consideration in our AMR algorithm are assumed to verify the following: the current grid hierarchy will always be cell-refined, and will come from the same complete grid hierarchy, which is assumed to be a Cartesian grid hierarchy where each grid is uniform and which can be defined by (A.21). We show next a way to design a flagging operator that ensures that the grid hierarchy resulting from adaptation is cell-refined.

A.4.2 Adaptation for cell-refined Cartesian grids

If, at some time instant in the AMR algorithm, the current grid hierarchy is cell-refined, then from any flagging operator an adaptation operator that produces a cell-refined grid hierarchy can be easily built as follows. Let $\mathcal{A}^L = \{A_0, \dots, A_{L-1}\}$ be a cell-refined L-level uniform Cartesian grid hierarchy. Let

$$\mathcal{G}^L = \{G_0, \dots, G_{L-1}\} = D^L_c(n_1, \dots, n_d, r_1^0, \dots, r_1^{L-2}, \dots, r_d^0, \dots, r_d^{L-2}),$$

for certain values of $n_1, \dots, n_d, r_1^0, \dots, r_1^{L-2}, \dots, r_d^0, \dots, r_d^{L-2}$, be such that $A_\ell \subseteq G_\ell$, $0 \le \ell \le L-1$, a fact that we will also denote by $G_\ell = E(A_\ell)$, where $E$ stands for the embedding operator

$$E(A_\ell) = D_u\Big(n_1 \prod_{i=0}^{\ell-1} r_1^i, \dots, n_d \prod_{i=0}^{\ell-1} r_d^i\Big), \qquad 1 \le \ell \le L-1.$$

Let $C_\ell : \mathbb{N} \to \mathbb{N}$ be the operator that, given the index of a cell in $E(A_\ell)$, returns the index of the cell in $E(A_{\ell-1})$ that contains it. Let $T_\ell : \mathbb{N} \to \mathbb{N}^{r_1^\ell \cdots r_d^\ell}$ be the operator that, given a cell in $E(A_\ell)$, returns the indexes of the cells in $E(A_{\ell+1})$ that are contained in the given cell. Let $F$ be a flagging operator. Define:

$$\tilde{F}(A_\ell) = T_{\ell-1} C_\ell F(A_\ell).$$

The adaptation operator corresponding to $\tilde{F}$ (see Definition 21) ensures that the adapted grid is composed of subdivisions of coarse cells. These cells do not need to belong to the grid $A_{\ell-1}$, but only to the grid $E(A_{\ell-1})$, and the grid $A_{\ell+1}$ is not necessarily contained in the adapted grid. In order to ensure the nestedness of the grid hierarchy we can define $\tilde{F}$ by

$$\tilde{F}(A_\ell) = T_{\ell-1}\big(C_\ell(F(A_\ell) \cup C_{\ell+1}(A_{\ell+1})) \cap I_{\ell-1}\big), \tag{A.22}$$

where $I_{\ell-1}$ is the set of indices that defines $A_{\ell-1}$. Equation (A.22) is valid for $1 \le \ell \le L-2$. The coarsest grid $A_0$ is never adapted, and for the finest grid $A_{L-1}$ (A.22) reduces to

$$\tilde{F}(A_{L-1}) = T_{L-2}\big(C_{L-1}(F(A_{L-1})) \cap I_{L-2}\big). \tag{A.23}$$

The adaptation operator defined using $\tilde{F}$ from (A.22) and (A.23) will now produce an adapted grid hierarchy which is nested and cell-refined after adaptation of the grid at level $\ell$. An example can be seen in Fig. A.9. The top plots show three grids corresponding to three consecutive refinement levels, depicted with solid squares. The corresponding complete grids are shown with empty squares. If the flagging procedure $F(A_1)$ returns the indexes of the cells shown in Fig. A.9(d), then the result of $F(A_1) \cup C_2(A_2)$, which includes the cells of $A_1$ overlaid by $A_2$, is shown in Fig. A.9(e). Finally, Fig. A.9(f) shows the cells marked by the operator $\tilde{F}(A_1)$ defined by (A.22).

In our implementation, described in chapter 6, we have used this procedure for the adaptation operator, with the flagging operator described there (section 6.1.2), and with the addition of a system for clustering the marked cells into rectangular patches, included for performance reasons.
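The set operations in (A.22) and (A.23) are easy to sketch in one space dimension. The example below assumes (this is not stated in the text) that the cells of each complete grid are numbered from left to right starting at 0, so that the parent of fine cell i is i // r and the children of coarse cell j are r*j, ..., r*j + r - 1. The names `parents`, `children` and `nested_flags` are hypothetical.

```python
def parents(indices, ratio):
    """C: indices of the coarse cells that contain the given fine cells (1D)."""
    return {i // ratio for i in indices}

def children(indices, ratio):
    """T: indices of all the fine cells contained in the given coarse cells (1D)."""
    return {ratio * j + s for j in indices for s in range(ratio)}

def nested_flags(flagged, finer_cells, coarser_cells, r_down, r_up):
    """Index set F~(A_l) of (A.22) for a 1D hierarchy.

    flagged       -- F(A_l), indices at level l
    finer_cells   -- index set of A_{l+1} (empty for the finest level, as in (A.23))
    coarser_cells -- index set I_{l-1} of A_{l-1}
    r_down        -- refinement factor between levels l-1 and l
    r_up          -- refinement factor between levels l and l+1
    """
    keep = set(flagged) | parents(finer_cells, r_up)      # F(A_l) U C_{l+1}(A_{l+1})
    coarse = parents(keep, r_down) & set(coarser_cells)   # C_l(...) intersected with I_{l-1}
    return children(coarse, r_down)                       # T_{l-1}(...)

# Level 0 is complete with 4 cells; level 2 currently holds cells 8 and 9.
result = nested_flags(flagged={3}, finer_cells={8, 9},
                      coarser_cells={0, 1, 2, 3}, r_down=2, r_up=2)
print(sorted(result))  # [2, 3, 4, 5]
```

In this example the flagged cell 3 and the cell 4 overlaid by the finer grid are both kept, and each is completed to a full subdivision of its coarse parent, so the adapted level is cell-refined and still contains the region covered by the finer level.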

Figure A.9: An illustration of the flagging procedure for cell-refined cells. Thicker lines indicate cells of coarser levels. Panels: (a) Grid A0; (b) Grid A1; (c) Grid A2; (d) Cells marked by $F(A_1)$; (e) Cells marked to ensure that A2 will be included in the adapted grid; (f) Cells marked by $\tilde{F}(A_1)$.


A.4.3 Integration and Projection of Shu-Osher's finite-difference algorithm with Donat-Marquina's flux split on a Cartesian grid hierarchy

We describe in this section the integration of a uniform, cell-refined Cartesian grid hierarchy, and the procedure used to modify the numerical solution at a grid using the numerical fluxes computed in finer grids, for the case of the numerical method described in chapter 4. We essentially describe here a generalization to d dimensions of the methods described there. In our framework, based on Shu-Osher's flux formulation, the numerical solution is computed at the centers of the cells of each mesh.

Let $\mathcal{A}^L = \{A_0, \dots, A_{L-1}\}$ be a grid hierarchy as in Definition 26. According to (A.18), (A.19) and (A.20), consider a generic cell $c^\ell_{i_1,\dots,i_d} \in A_\ell$, whose center is $x^\ell_{i_1,\dots,i_d}$. The numerical solution for (A.6) at the node $x^\ell_{i_1,\dots,i_d}$ is advanced in time by solving the system of ODEs arising from the semi-discrete formulation:

$$\frac{\partial u}{\partial t}(x^\ell_{i_1,\dots,i_d}, t) + \sum_{q=1}^{d} \frac{\hat{f}^{t,\ell}_{i_1,\dots,i_{q-1},i_q+\frac{1}{2},i_{q+1},\dots,i_d} - \hat{f}^{t,\ell}_{i_1,\dots,i_{q-1},i_q-\frac{1}{2},i_{q+1},\dots,i_d}}{\Delta x_q} = 0, \tag{A.24}$$

where

$$\hat{f}^{t,\ell}_{i_1,\dots,i_{q-1},i_q+\frac{1}{2},i_{q+1},\dots,i_d} = \hat{f}\big(U^{t,\ell}_{i_1,\dots,i_q-p,\dots,i_d}, \dots, U^{t,\ell}_{i_1,\dots,i_q+p+1,\dots,i_d}\big)$$

is the numerical flux and $U^{t,\ell}_{i_1,\dots,i_d}$ is the numerical solution corresponding to the node $x^\ell_{i_1,\dots,i_d}$ and time $t$. The value $\hat{f}^{t,\ell}_{i_1,\dots,i_{q-1},i_q+\frac{1}{2},i_{q+1},\dots,i_d}$ is an approximation of the flux passing through the point

$$x^\ell_{i_1,\dots,i_{q-1},i_q+\frac{1}{2},i_{q+1},\dots,i_d} = \big(x^{1,\ell}_{i_1}, \dots, x^{q-1,\ell}_{i_{q-1}}, x^{q,\ell}_{i_q+\frac{1}{2}}, x^{q+1,\ell}_{i_{q+1}}, \dots, x^{d,\ell}_{i_d}\big),$$

which is computed using the algorithm described in chapter 4. For simplicity we assume that, if a multi-step explicit discretization of the time derivative in (A.24) is used, then $\hat{f}$ represents the accumulated numerical flux. In our case, where a three-stage Runge-Kutta method is used, the flux $\hat{f}^{t,\ell}_{i_1,\dots,i_{q-1},i_q+\frac{1}{2},i_{q+1},\dots,i_d}$ is given by the analogue of (6.11). We also recall that our solver works in a dimensional splitting fashion, so that

A. A generic description of the AMR algorithm

225

the computation of each of the d terms in the right hand side of (A.24) is one-dimensional. The time step for each grid is selected according to the formula ∆tℓ =

∆tℓ−1 max1≤k≤d rkℓ−1

,

1≤ℓ≤L−1

(A.25)

where the time step $\Delta t_0$ corresponding to the coarsest grid is selected so that the CFL condition is verified on it. The CFL condition for our numerical method can be written as follows: let $U^{t,\ell} = \{U^{t,\ell}_{i_1,\dots,i_d}\}$ be the numerical solution of (A.6) computed at the grid $A_\ell$ and time $t$, and let $U^{t,\ell}_{i_1,\dots,i_{q-1},i_q+\frac{1}{2},i_{q+1},\dots,i_d;L}$ and $U^{t,\ell}_{i_1,\dots,i_{q-1},i_q+\frac{1}{2},i_{q+1},\dots,i_d;R}$ be the two-sided reconstructions² of the conserved variables at the point $x^{\ell}_{i_1,\dots,i_{q-1},i_q+\frac{1}{2},i_{q+1},\dots,i_d}$. If we denote by $\{\lambda^q_p(u)\}_{p=1}^{m}$ the eigenvalues of the Jacobian matrix of the flux function $f^q(u)$, we define

\[
M^{\ell}_q = \max_{i_1,\dots,i_d} \, \max_{1 \le p \le m} \, \max\Bigl\{ \bigl|\lambda^q_p\bigl(U^{t,\ell}_{i_1,\dots,i_q+\frac{1}{2},\dots,i_d;L}\bigr)\bigr|, \bigl|\lambda^q_p\bigl(U^{t,\ell}_{i_1,\dots,i_q+\frac{1}{2},\dots,i_d;R}\bigr)\bigr| \Bigr\},
\]

which is the maximum characteristic speed computed over all the nodes of the grid $A_\ell$. The CFL condition for the grid $A_\ell$ can be written as:

\[
\Delta t_{\ell} \le \min_{1 \le q \le d} \frac{\Delta x^{\ell}_q}{M^{\ell}_q}.
\]

In practice the more restrictive condition

\[
\Delta t_{\ell} \le \frac{\min_{1 \le q \le d} \Delta x^{\ell}_q}{\max_{1 \le q \le d} M^{\ell}_q}
\]

is often used. Therefore we select

\[
\Delta t_0 \le \frac{\min_{1 \le q \le d} \Delta x^{0}_q}{\max_{1 \le q \le d} M^{0}_q}. \tag{A.26}
\]
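As an illustration of (A.25) and (A.26), the following sketch computes the time step of every level from the mesh sizes and maximum characteristic speeds of the coarsest grid. It is a minimal example under assumptions: the function name `level_time_steps`, the safety factor `cfl` (any value in $(0, 1]$ keeps the inequality (A.26)) and the sample values are hypothetical and are not taken from the implementation described in chapter 6.

```python
from typing import List, Sequence


def level_time_steps(dx0: Sequence[float],
                     max_speeds0: Sequence[float],
                     factors: Sequence[Sequence[int]],
                     cfl: float = 1.0) -> List[float]:
    """Time steps of all levels of a grid hierarchy.

    dx0[q]         mesh size of the coarsest grid A_0 in dimension q
    max_speeds0[q] maximum characteristic speed M_q^0 in dimension q
    factors[l][k]  refinement factor r_k^l between levels l and l+1
    cfl            hypothetical safety factor, 0 < cfl <= 1
    """
    # Restrictive CFL condition (A.26) on the coarsest grid.
    dt0 = cfl * min(dx0) / max(max_speeds0)
    steps = [dt0]
    # Recursion (A.25): divide by the largest refinement factor of each level.
    for r in factors:
        steps.append(steps[-1] / max(r))
    return steps


# Two-dimensional example with three levels (illustrative values).
steps = level_time_steps(dx0=[0.1, 0.2],
                         max_speeds0=[2.0, 4.0],
                         factors=[(2, 2), (2, 4)])
```

With these values $\Delta t_0 = 0.1/4 = 0.025$, $\Delta t_1 = \Delta t_0 / 2 = 0.0125$ and $\Delta t_2 = \Delta t_1 / 4 = 0.003125$, so level 2 performs four time steps for each time step of level 1.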

Selecting the time steps according to (A.25) and (A.26) ensures that the CFL condition is satisfied in each grid. Recall that different numerical methods require different CFL conditions.

² These two reconstructions are described in Section 4.2; see in particular (4.10).

The projection of the solution from a fine to a coarse grid is a generalization of the one described in Sections 5.4 and 6.1.4, where we considered a square one- or two-dimensional grid hierarchy. As explained in chapter 4, the numerical fluxes represent approximations to the true fluxes passing through a point, rather than through a cell face, which is the case of finite-volume methods. In general it is not possible, in our setup, to ensure that the solver, when acting on different grids, will compute numerical fluxes corresponding to the same point, so that the coarse flux could be corrected using fine fluxes computed at the same points. That approach would only be possible if all refinement factors were odd. Instead, we proceed as in section 6.1.4 and consider a correction of the coarse flux with a value coming from interpolation of the fine fluxes that correspond to points that lie in the same cell face as the coarse flux.

The point $x^{\ell}_{i_1,\dots,i_{q-1},i_q+\frac{1}{2},i_{q+1},\dots,i_d}$ lies on a face of the hyperrectangle (A.18), given by

\[
[p^{1,\ell}_{i_1}, p^{1,\ell}_{i_1+1}] \times \dots \times [p^{q-1,\ell}_{i_{q-1}}, p^{q-1,\ell}_{i_{q-1}+1}] \times \{p^{q,\ell}_{i_q+1}\} \times [p^{q+1,\ell}_{i_{q+1}}, p^{q+1,\ell}_{i_{q+1}+1}] \times \dots \times [p^{d,\ell}_{i_d}, p^{d,\ell}_{i_d+1}].
\]

In the finer grid $A_{\ell+1}$ there are $\prod_{k \ne q} r^{\ell}_k$ fine fluxes that are computed in the same face, at the points

\[
\bigl\{ x^{\ell+1}_{j_1,\dots,j_{q-1},\, r^{\ell}_q \cdot i_q + (r^{\ell}_q - 1) + \frac{1}{2},\, j_{q+1},\dots,j_d} \bigr\}, \tag{A.27}
\]

where each $j_k$, for $k \ne q$, varies between $r^{\ell}_k i_k$ and $r^{\ell}_k i_k + (r^{\ell}_k - 1)$. The numerical flux at $x^{\ell}_{i_1,\dots,i_{q-1},i_q+\frac{1}{2},i_{q+1},\dots,i_d}$ can therefore be updated with the fine fluxes that lie at the same interface. The $(d-1)$-linear interpolation of the fine flux values gives an approximation of the numerical flux function at the coarse point corresponding to a single time step of the fine grid. If (A.25) is used for the definition of the fine time step from the coarsest time step, then $\max_{1 \le k \le d} r^{\ell}_k$ time steps are performed on the grid $A_{\ell+1}$ for each time step of the grid $A_\ell$. The value of the coarse flux can thus be substituted with the result of adding the interpolated values for the $\max_{1 \le k \le d} r^{\ell}_k$ fine time steps needed to perform a coarse time step. If we denote by

\[
\mathcal{L}\bigl(\hat{f}^{t,\ell+1}, x^{\ell}_{i_1,\dots,i_{q-1},i_q+\frac{1}{2},i_{q+1},\dots,i_d}\bigr)
\]

the $(d-1)$-linear interpolation of the values of the numerical fluxes for time $t$ at the points indicated by (A.27), evaluated at the point $x^{\ell}_{i_1,\dots,i_{q-1},i_q+\frac{1}{2},i_{q+1},\dots,i_d}$, then the projected flux at that point is given by

\[
\hat{\hat{f}}^{t,\ell}_{i_1,\dots,i_{q-1},i_q+\frac{1}{2},i_{q+1},\dots,i_d} = \frac{1}{N^{\ell}} \sum_{i=1}^{N^{\ell}} \mathcal{L}\bigl(\hat{f}^{t+i\Delta t_{\ell+1},\ell+1}, x^{\ell}_{i_1,\dots,i_{q-1},i_q+\frac{1}{2},i_{q+1},\dots,i_d}\bigr),
\]

where we have denoted $N^{\ell} = \max_{1 \le k \le d} r^{\ell}_k$. We can therefore perform the substitution:

\[
\hat{f}^{t,\ell}_{i_1,\dots,i_{q-1},i_q+\frac{1}{2},i_{q+1},\dots,i_d} = \hat{\hat{f}}^{t,\ell}_{i_1,\dots,i_{q-1},i_q+\frac{1}{2},i_{q+1},\dots,i_d}. \tag{A.28}
\]
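The flux projection of this section can be sketched for $d = 2$ and a face normal to the first coordinate direction. In that case the face is a segment, so the $(d-1)$-linear interpolation reduces to ordinary linear interpolation along the second coordinate. The sketch below rests on several assumptions that are not taken from the implementation of chapter 6: the flux is scalar, cell centers are located at $(j + \frac{1}{2})$ times the mesh size measured from the domain origin, and the names `fine_face_indices`, `interpolate_linear` and `projected_flux` are hypothetical.

```python
from typing import List, Sequence


def fine_face_indices(i2: int, r2: int) -> List[int]:
    """Tangential indices j2 of the fine fluxes lying on the coarse face."""
    return list(range(r2 * i2, r2 * i2 + r2))


def interpolate_linear(x: float, xs: Sequence[float], ys: Sequence[float]) -> float:
    """Piecewise-linear interpolation of the data (xs, ys) at x; xs is increasing."""
    if len(xs) == 1 or x <= xs[0]:
        return ys[0]
    for k in range(len(xs) - 1):
        if x <= xs[k + 1]:
            w = (x - xs[k]) / (xs[k + 1] - xs[k])
            return (1.0 - w) * ys[k] + w * ys[k + 1]
    return ys[-1]


def projected_flux(fine_fluxes: Sequence[Sequence[float]],
                   i2: int, r2: int, dy_coarse: float) -> float:
    """Average, over the fine time steps, of the interpolated fine fluxes.

    fine_fluxes[n][m] is the fine flux at fine time step n and at the m-th
    fine point of the face; one row per fine step of the coarse time step.
    """
    dy_fine = dy_coarse / r2
    # Tangential coordinates of the fine flux points and of the coarse point.
    y_fine = [(j + 0.5) * dy_fine for j in fine_face_indices(i2, r2)]
    y_coarse = (i2 + 0.5) * dy_coarse
    values = [interpolate_linear(y_coarse, y_fine, row) for row in fine_fluxes]
    return sum(values) / len(values)


# Refinement factor 2 in both directions, hence two fine steps per coarse step.
result = projected_flux([[1.0, 3.0], [2.0, 4.0]], i2=3, r2=2, dy_coarse=1.0)
```

In this example the coarse flux point lies midway between the two fine flux points, so each fine step contributes the mean of its two values (2.0 and 3.0), and the projected flux is 2.5. With an odd refinement factor the coarse point coincides with the middle fine point and no interpolation is needed.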

Bibliography [1] T. Abel, G. L. Bryan, and M. L. Norman. The formation and fragmentation of primordial molecular clouds. The Astrophysical Journal, 540:39–44, 2000. [2] R. Abgrall. Multiresolution analysis on unstructured meshes: applications to CFD. In K.W. Morton and M.J. Baines, editors, Numerical methods for fluid dynamics, volume 5, pages 271–276. Oxford Science Publications, 1995. [3] V. Agoshkov, A. Quarteroni, and G.Rozza. A mathematical approach in the design of arterial bypass using unsteady Stokes equations. J. Sci. Comput., 28(2–3):139–161, 2006. [4] A. S. Almgren, J. B. Bell, P. Colella, L. H. Howell, and M. L. Welcome. A conservative adaptive projection method for the variable density incompressible Navier-Stokes equations. J. Comput. Phys., 142:1–46, 1998. [5] American National Standards Institute. ANSI Fortran X3.9–1978, 1978. [6] J. D. Anderson. Modern compressible flow. McGraw-Hill, 1982. [7] M. Anderson, E. W. Hirschmann, S. L. Liebling, and D. Neilsen. Relativistic MHD with adaptive mesh refinement. Classical Quantum Gravity, 23:6503–6524, 2006. [8] F. Arandiga, ` A. Baeza, and A. M. Belda. Interpolacion ´ WENO para valores puntuales. In Proceedings of the XVIII CEDYA/VIII CMA Conference, 2003. (in Spanish).

228

BIBLIOGRAPHY

[9] F. Arandiga ` and R. Donat. Nonlinear multiscale decompositions: the approach of A. Harten. Numer. Algorithms, 23:175–216, 2000. [10] D. C. Arney. An adaptive method with mesh moving and mesh refinement for solving the Euler equations. In ASME, SIAM, and APS, National Fluid Dynamics Congress, 1st, Cincinnati, OH, July 25-28, 1988, 1988. AIAA Paper 88–3567–CP. [11] N. Aslan and T. Kammash. A Riemann solver for the twodimensional MHD equations. Int. J. Numer. Meth. Fluids, 25(8):953–957, 1998. [12] I. Babuska and B. Guo. The h-p version of the finite element method for domains with curved boundaries. SIAM J. Numer. Anal., 25:837–861, 1998. [13] A. Baeza and P. Mulet. Adaptive mesh refinement techniques for high order shock capturing schemes for hyperbolic systems of conservation laws. Technical Report GrAN 04–02, Departament de Matematica ` Aplicada, Universitat de Val`encia, Spain, 2004. [14] A. Baeza and P. Mulet. Adaptive mesh refinement techniques for high order shock capturing schemes for multi-dimensional hydrodynamic simulations. Technical Report GrAN 05–01, Departament de Matematica ` Aplicada, Universitat de Val`encia, Spain, 2005. [15] A. Baeza and P. Mulet. Adaptive mesh refinement techniques for high-order shock capturing schemes for multi-dimensional hydrodynamic simulations. Int. J. Numer. Meth. Fluids, 52:455–471, 2006. [16] G. K. Batchelor. An introduction to fluid dynamics. Cambridge University Press, 2000. [17] P. D. Bates, S. N. Lane, and R. I. Ferguson, editors. Computational fluid dynamics: applications in environmental hydraulics. Wiley, 2005. [18] A. M. Belda. Weighted ENO y aplicaciones. Technical Report GrAN 04–03, Departament de Matematica ` Aplicada, Universitat de Val`encia, Spain, 2004. (in Spanish). [19] J. Bell, M. J. Berger, J. Saltzman, and M. Welcome. Threedimensional adaptive mesh refinement for hyperbolic conservation laws. SIAM J. Sci. Comput., 15(1):127–138, 1994.

BIBLIOGRAPHY

229

[20] T. Belytschko, Y. Krongauz, D. Organ, M. Fleming, and P. Krysl. Meshless methods: and overview and recent developments. Comput. Methods Appl. Mech. Engrg., 139:3–47, 1996. [21] M. J. Berger. Adaptive mesh refinement for hyperbolic partial differential equations. PhD thesis, Computer Science Dept., Stanford University, 1982. [22] M. J. Berger and P. Colella. Local adaptive mesh refinement for shock hydrodynamics. J. Comput. Phys., 82:64–84, 1989. [23] M. J. Berger and J. Oliger. Adaptive mesh refinement for hyperbolic partial differential equations. Technical Report NA–83–02, Computer Science Department, Stanford University, 1983. [24] M. J. Berger and J. Oliger. Adaptive mesh refinement for hyperbolic partial differential equations. J. Comput. Phys., 53:484–512, 1984. [25] O. Boiron, G. Chiavassa, and R. Donat. A high-resolution penalization method for large Mach number flows in the presence of obstacles. Computers & Fluids, 38(3):703–714, 2009. [26] J. H. Boldstad. An adaptive finite difference method for hyperbolic systems in one space dimension. PhD thesis, Computer Science Dept., Stanford University, 1982. [27] J. U. Brackbill and J. S. Saltzman. Adaptive zoning for singular problems in two dimensions. J. Comput. Phys., 46(3):342–368, 1982. [28] W. Briggs. A multigrid tutorial. SIAM, 1987. [29] Greg L. Bryan, T. Abel, and M. L. Norman. Achieving extreme resolution in numerical cosmology using adaptive mesh refinement: resolving primordial star formation. In Supercomputing ’01: Proceedings of the 2001 ACM/IEEE Conference on Supercomputing (published in CD-ROM format). ACM Press, 2001. [30] J. M. Burgers. A mathematical model illustrating the theory of turbulence. Adv. Appl. Mech., 1:171–199, 1948. [31] P. M. Campbell, K. D. Devine, J. E. Flaherty, L. G. Gervasio, and J. D. Teresco. Dynamic octree load balancing using space-filling curves. Technical Report CS-03-01, Williams College Department of Computer Science, 2003.

230

BIBLIOGRAPHY

[32] C. E. Castro and E. F. Toro. A Riemann solver and upwind methods for a two-phase flow model in non-conservative form. Int. J. Numer. Meth. Fluids, 50(3):275–307, 2006. [33] U.V. Catalyurek, E.G. Boman, K.D. Devine, D. Bozdag, R.T. Heaphy, and L.A. Riesen. Hypergraph-based dynamic load balancing for adaptive scientific computations. In Proc. of 21st International Parallel and Distributed Processing Symposium (IPDPS’07). IEEE, 2007. [34] W. L. Chen, F. S. Lien, and M. A. Leschziner. Local mesh refinement within a multi-block structured-grid scheme for general flows. Comput. Methods Appl. Mech. Engrg., 144:327–369, 1997. [35] G. Chiavassa and R. Donat. Point-value multiscale algorithms for 2D compressible flows. SIAM J. Sci. Comput., 20:805–823, 2001. [36] A. J. Chorin and J. E. Marsden. A mathematical introduction to fluid mechanics. Springer, New York, 3rd edition, 2000. [37] R. Clausius. Abhandlungen u ¨ ber die mechanische W¨ armetheorie. Braunschweig. F. Vieweg und Sohn, 1864. [38] Albert Cohen, Sidi Mahmoud Kaber, Siegfried Muller, ¨ and Marie Postel. Fully adaptive multiresolution finite volume schemes for conservation laws. Math. Comp., 72(241):183–225 (electronic), 2003. [39] P. Colella and P. Woodward. The piecewise parabolic method (PPM) for gas-dynamical simulations. J. Comput. Phys., 54:174–201, 1984. ¨ [40] R. Courant, K. Friedrichs, and H. Lewy. Uber die partiellen Differenzengleichungen der mathematischen Physik. Math. Ann., 100(1):32–74, 1928. English translation: ”On the partial difference equations of mathematical physics”, IBM Journal of Research and Development, 11:215–234, 1967. [41] C. M. Dafermos. Hyperbolic conservation laws in continuum physics. Springer, 2000. [42] A. Dervieux and J.-A. Desideri. Compressible flow solvers using unstructured grids. NASA STI/Recon Technical Report N, 94, June 1992.

BIBLIOGRAPHY

231

[43] K. D. Devine, E. G. Boman, R. T. Heaphy, B. A. Hendrickson, J. D. Teresco, J. Faik, J. E. Flaherty, and L. G. Gervasio. New challenges in dynamic load balancing. Appl. Numer. Math., 52(2–3):133–152, 2005. [44] R. Donat and A. Marquina. Capturing shock reflections: an improved flux formula. J. Comput. Phys., 125:42–58, 1996. [45] R. Donat and P. Mulet. Characteristic-based schemes for multiclass Lighthill-Whitham-Richards traffic models. J. Sci. Comput., 37(3):233–250, 2008. [46] B. Einfeldt. On Godunov–type schemes for gas dynamics. SIAM J. Numer. Anal., 25(2):294–318, 1988. [47] V. Elling. A Lax-Wendroff type theorem for unstructured grids. PhD thesis, Stanford University, 2004. [48] J. L. Ellzey, M. R. Hennecke, J. M. Picone, and E. S. Oran. The interaction of a shock with a vortex: shock distortion and the production of acoustic waves. Phys. Fluids, 7:172–184, 1995. [49] F. Eulderink and G. Mellema. General relativistic hydrodynamics with a Roe solver. Astron. Astrophys. Suppl. Ser., 110:587–623, 1995. [50] E. Fatemi, J. Jerome, and S. Osher. Solution of the hydrodynamic device model using high order non-oscillatory shock-capturing algorithms. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 10:232–244, 1991. [51] Fluent, Inc. http://www.fluent.com/software/fluent/index.htm. [52] S. Fromang, P. Hennebelle, and R. Teyssier. A high order Godunov scheme with constrained transport and adaptive mesh refinement for astrophysical magnetohydrodynamics. Astronomy and Astrophysics, 457(2):371–384, 2006. [53] K. Fukuyo. Application of computational fluid dynamics and pedestrian-behavior simulations to the design of task-ambient airconditioning systems of a subway station. Energy, 31(5), 2006. [54] P. Glaister. Approximate Riemann solutions of the shallow-water equations. J. Hydraul. Res., 26(3):293–306, 1988.

232

BIBLIOGRAPHY

[55] J. Glimm. Solutions in the large for nonlinear hyperbolic systems of equations. Comm. Pure Appl. Math., 18:697–715, 1965. [56] E. Godlewski and P.-A. Raviart. Numerical approximation of hyperbolic systems of conservation laws. Springer, New York, 1996. [57] S. K. Godunov. A finite difference method for the numerical computation of discontinuous solutions of the equations of fluid dynamics. Matematicheskii Sbornik, 47:271, 1959. [58] J. A. Greenough and W. J. Rider. A quantitative comparison of numerical methods for the compressible Euler equations: fifth order WENO and piecewise-linear Godunov. J. Comput. Phys., 196:259– 281, 2004. [59] B. E. Griffith, R. D. Hornung, D. M. McQueen, and C. S. Peskin. An adaptive, formally second order accurate version of the immersed boundary method. J. Comput. Phys., 223(1):10–49, 2007. [60] J.-F. Haas and B. Sturtevant. Interaction of weak shock waves with cylindrical and spherical inhomogeneities. J. Fluid Mech., 181:41– 76, 1987. [61] W. Hackbush. Verlag, 1985.

Multi-grid methods and applications.

Springer-

[62] A. Harten. Adaptive multiresolution schemes for shock computations. J. Comput. Phys., 115:319–338, 1994. [63] A. Harten. Multiresolution algorithms for the numerical solution of hyperbolic conservation laws. Comm. Pure Appl. Math., 48:1305– 1342, 1995. [64] A. Harten, B. Engquist, S. Osher, and S. R. Chakravarthy. Uniformly high order accurate essentially non-oscillatory schemes, III. J. Comput. Phys., 71(2):231–303, 1987. [65] A. Harten and J. M. Hyman. Self-adjusting grid methods for one-dimensional hyperbolic conservation laws. J. Comput. Phys., 50:235–269, 1983. [66] A. Harten, P. D. Lax, and B. van Leer. On upstream differencing and Godunov-type schemes for hyperbolic conservation laws. SIAM Rev., 25:35–61, 1983.

BIBLIOGRAPHY

233

[67] A. Harten and S. Osher. Uniformly high order accurate essentially non-oscillatory schemes, I. SIAM J. Numer. Anal., 24(2):279–309, 1987. [68] P. A. Henne, editor. Applied computational aerodynamics, volume 125 of Progress in Aeronautics and Astronautics. American Institute for Aeronautics and Astronautics, 1990. ¨ [69] D. Hilbert. Uber die stetige Abbildung einer Linie auf ein Fl¨achenstuck. ¨ Math. Ann., 38:459–460, 1891. [70] C. Hirsch. Numerical computation of internal and external flows (volume 1): fundamentals of numerical discretization. John Wiley & Sons, Inc., New York, NY, USA, 1988. [71] C. Hirsch. Numerical computation of internal and external flows (volume 2): computational methods for inviscid and viscous flow. John Wiley & Sons, Inc., New York, NY, USA, 1988. [72] T. Y. Hou and G. Le Floch. Why nonconservative schemes converge to wrong solutions: error analysis. Math. Comp., 62(206):497–530, 1994. [73] C. Hu. Numerical methods for hyperbolic equations on unstructured meshed. PhD thesis, Brown University, 1999. [74] M. E. Hubbard and N. Nikiforakis. A three-dimensional, adaptive, Godunov-type model for global atmospheric flows. Mon. Weather Rev., 130:1026–1039, 2003. [75] H. Hugoniot. Sur la propagation du movement dans les coprs et sp´ecialement dans les gaz parfaits. J. Ecole Polytechnique, 57:3– 97, 1887. [76] ISO. The ANSI C standard (C99). Technical Report WG14 N1124, ISO/IEC, 1999. [77] C. Jablonowski. Adaptive grids in weather and climate prediction. PhD thesis, University of Michigan, 2004. [78] A. Jameson. Aerodynamic design via control theory. J. Sci. Comput., 3:233–260, 1988. [79] A. Jameson, W. Schmidt, and E. Turkel. Numerical solution of the Euler equations by finite volume methods. In 14th AIAA Fluid and Plasma Dynamics Conference, 1981. AIAA Paper 81-1259.

234

BIBLIOGRAPHY

[80] L. Jameson. High order schemes for resolving waves: number of points per wavelength. J. Sci. Comput., 15(4):417–439, 2000. [81] G.-S. Jiang and C.-W. Shu. Efficient implementation of weighted ENO schemes. J. Comput. Phys., 126(1):202–28, 1996. [82] J.-C. Jouhaud and M. Borrel. Discontinuous Galerkin and MUSCL strategies for an adaptative mesh refinement method. In Fifteenth International Conference on Numerical Methods in Fluid Dynamics, pages 400–405, 1997. [83] S. Karni. Multicomponent flow calculations by a consistent primitive algorithm. J. Comput. Phys., 112(1):31–43, 1994. [84] Karypis Lab. http://glaros.dtc.umn.edu/gkhome/views/metis. [85] R. Keppens, M. Nool, G. Toth, ´ and J. P. Goedbloed. Adaptive mesh refinement for conservative systems: multi-dimensional efficiency evaluation. Comput. Phys. Comm., 153:39–339, 2003. [86] R. Kimura. Numerical weather prediction. J. Wind Eng. Ind. Aerodyn., 90(12–15):1403–1414, 2002. [87] H. O. Kreiss and J. Oliger. Comparison of accurate methods for the integration of hyperbolic equations. Tellus, XXIV:3, 1972. [88] N. Kroll. ADIGMA – A European project on the development of adaptive higher order variational methods for aerospace applications. In P. Wesseling, E. Onate, ˜ and J. P´eriaux, editors, European Conference on Computational Fluid Dynamics (ECCOMAS CFD 2006), 2006. [89] D. Kroner, M. Rokyta, and M. Wierse. A Lax-Wendrof type theorem for upwind finite volume schemes in 2D. East-West J. Numer. Math, 4:279–292, 1996. [90] P. K. Kundu and I. M. Cohen. Fluid mechanics. Academic Press, 4th edition, 2008. [91] S. H. Lamb. Hydrodynamics. Cambridge University Press, 6th edition, 1975. [92] Z. Lan, V. E. Taylor, and G. Bryan. A novel dynamic load balancing scheme for parallel systems. J. Parallel Distrib. Comput., 62(12):1763 – 1781, 2002.

BIBLIOGRAPHY

235

[93] L. D. Landau and E. M. Lifshitz. Fluid mechanics. Course of theoretical physics, vol. 6. Pergamon Press, Oxford, 2nd edition, 1987. [94] G. Lapenta. A recipe to detect the error in discretization schemes. Int. J. Numer. Meth. Engng., 59:2065–2087, 2004. [95] M. Latini, O. Schilling, and W. S. Don. Effects of WENO flux reconstruction order and spatial resolution on reshocked twodimensional Richtmyer-Meshkov instability. J. Comput. Phys., 221(2):805–836, 2007. [96] P. D. Lax. Weak solutions of nonlinear hyperbolic equations and their numerical computation. Comm. Pure Appl. Math., 7:159–193, 1954. [97] P. D. Lax. Asymptotic solutions of oscillatory initial value problems. Duke Math. J., 24:627–646, 1957. [98] P. D. Lax. Hyperbolic systems of conservation laws, II. Comm. Pure Appl. Math., 10:537–566, 1957. [99] P. D. Lax. Shock waves and entropy. In E.A. Zarantonello, editor, Contributions to nonlinear functional analysis, pages 603–634. Academic Press, 1971. [100] P. D. Lax. Hyperbolic systems of conservation laws and the mathematical theory of shock waves, volume 11 of CBMS-NSF Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics, 1973. [101] P. D. Lax and X.-D. Liu. Solution of two dimensional Riemann problem of gas dynamics by positive schemes. SIAM J. Sci. Comput., 19:319–340, 1995. [102] P. D. Lax and R. D. Richtmyer. Stability of difference equations. Comm. Pure Appl. Math., 9:267–293, 1956. [103] P. D. Lax and B. Wendroff. Systems of conservation laws. Comm. Pure Appl. Math., 13:217–237, 1960. [104] R. J. LeVeque. Numerical methods for conservation laws. Birkh¨auser Verlag, 1992. [105] R. J. LeVeque. Finite-volume methods for hyperbolic problems. Cambridge University Press, 2004.

236

BIBLIOGRAPHY

[106] S. Li. Adaptive mesh methods for time-dependent partial differential equations. PhD thesis, University of Minnesota, 1998. [107] S. Li. Comparison of refinement criteria for structured adaptive mesh refinement. J. Comput. Appl. Math., 2009. [108] S. Li and J. M. Hyman. Adaptive mesh refinement for finite difference WENO schemes. Technical Report LA-UR-03-8927, Los Alamos National Laboratory, 2003. [109] W. T. Lin, C. H. Wang, and X. S. Chen. A comparative study of conservative and nonconservative schemes. Adv. Atmos. Sci., 20(5):810–814, 2003. [110] R. Liska and B. Wendroff. Comparison of several difference schemes on 1D and 2D test problems for the Euler equations. SIAM J. Sci. Comput., 25(3):995–1017, 2003. [111] T.-P. Liu. The entropy condition and the admissibility of shocks. J. Math. Anal. Appl., 53:78–88, 1976. [112] T.-P. Liu. Admissible solutions of hyperbolic conservation laws. Mem. Amer. Math. Soc., 30(240), 1981. [113] X-D. Liu, S. Osher, and T. Chan. Weighted essentially nonoscillatory schemes. J. Comput. Phys., 115:200–212, 1994. [114] B. J. Lucier. A moving mesh numerical method for hyperbolic conservation laws. Math. Comp., 46:59–69, 1986. [115] R. W. MacCormack. The effect of viscosity in hypervelocity impact cratering. In AIAA Hypervelocity Impact Conference, 1969. AIAA Paper 69-354. [116] P. MacNeice, K. M. Olson, and C. Mobarry. PARAMESH: a parallel adaptive mesh refinement community toolkit. Comput. Phys. Comm., 126:330–354, 2000. Available at http://sourceforge.net/projects/paramesh/. [117] A. Marquina. Local piecewise hyperbolic reconstruction of numerical fluxes for nonlinear scalar conservation laws. SIAM J. Sci. Comput., 15(4):892–915, 1994. [118] A. Marquina and P. Mulet. A flux-split algorithm applied to conservative models for multicomponent compressible flows. J. Comput. Phys., 185(1):120–138, 2003.

BIBLIOGRAPHY

237

[119] B. Massey and J. Ward-Smith. Mechanics of fluids. Taylor & Francis, 8th edition, 2005. [120] S. F. McCormick. Multilevel adaptive methods for partial differential equations. SIAM Frontiers in Applied Mathematics, 1989. [121] B. Merriman. Understanding the Shu-Osher conservative finite difference form. J. Sci. Comput., 19(1–3):309–322, 2003. [122] E. E. Meshkov. Instability of the interface of two gases accelerated by a shock wave. Soviet Fluid Dynamics, 4:101–104, 1969. [123] Message Passing Interface Forum. MPI: A message-passing interface standard. Version 2.1. Available at http://www.mpi-forum.org/docs/docs.html. [124] R. Mittal and G. Iaccarino. Immersed boundary methods. Annu. Rev. Fluid Mech., 37:239–261, 2005. [125] S. Mizohata. Some remarks on the Cauchy problem. J. Math. Kyoto Univ., 1:109–127, 1961. [126] B. Mohammadi and O. Pironneau. Applied shape optimization for fluids. Oxford University Press, 2001. [127] MPICH Home Page. http://www.mcs.anl.gov/research/projects/mpi/mpich1/. [128] P. Mulet and A. Baeza. Highly accurate conservative finite difference schemes and adaptive mesh refinement techniques for hyperbolic systems of conservation laws. In A. Bermudez ´ de Castro, D. Gomez, ´ P. Quintela, and P. Salgado, editors, Numerical mathematics and advanced applications. Proceedings of ENUMATH 2005, pages 198–206, 2006. [129] N.-T. Nguyen and T. Wereley. Fundamentals and applications of microfluidics. Artech House MEMS Series, 2002. [130] O. Oleinik. Discontinuous solutions of nonlinear differential equations. Amer. Math. Soc. Transl. Ser. 2, 26:95–172, 1957. [131] OpenCFD, Ltd. http://www.opencfd.co.uk/openfoam/index.html.

238

BIBLIOGRAPHY

[132] B. O’Shea, G. Bryan, J. Bordner, M. Norman, T. Abel, R. Harkness, and A. Kritsuk. Introducing Enzo, an AMR cosmology application. In T. Plewa, T. Linde, and V. G. Weirs, editors, Adaptive mesh refinement - theory and applications, volume 41 of Lecture Notes in Computational Science and Engineering. Springer, 2005. [133] C. Othmer, E. de Villiers, and H. G. Weller. Implementation of a continuous adjoint for topology optimization of ducted flows. In 18th AIAA Computational Fluid Dynamics Conference, 2007. AIAA Paper 2007-3947. [134] M. Pandolfi and D. D’Ambrosio. Numerical instabilities in upwind methods: analysis and cures for the ’carbuncle’ phenomenon. J. Comput. Phys., 166:271–301, 2001. [135] M. Parashar and J. C. Browne. On partitioning dynamic adaptive grid hierarchies. In Proceedings of the 29th Annual Hawaii International Conference on System Sciences, pages 604–613, 1996. [136] G. Peano. Sur une courbe, qui remplit toute une aire plane. Math. Ann., 36(1):157–160, 1890. [137] R. B. Pember, J. B. Bell, P. Colella, W. Y. Crutchfield, and M. L. Welcome. An adaptive cartesian grid method for unsteady compressible flow in irregular regions. J. Comput. Phys., 120(2):278–304, 1995. [138] K.G. Powell, P.L. Roe, T.J. Linde, T.I. Gombosi, and D.L. De Zeeuw. A solution-adaptive upwind scheme for ideal magnetohydrodynamics. J. Comput. Phys., 154:284–309, 1999. [139] J. J. Quirk. An adaptive grid algorithm for computational shock hydrodynamics. PhD thesis, Cranfield Institute of Technology, 1991. [140] J. J. Quirk. A contribution to the great Riemann solver debate. Int. J. Numer. Meth. Fluids, 18(6):555–574, 1994. [141] J. J. Quirk. A parallel adaptive grid algorithm for computational shock hydrodynamics. Appl. Numer. Math., 20:427–453, 1996. [142] J. J. Quirk and S. Karni. On the dynamics of a shock-bubble interaction. J. Fluid Mech., 318:129–163, 1996. [143] W. J. M. Rankine. On the thermodynamic theory of waves of finite longitudinal disturbance. Phil. Trans. 
Roy. Soc. London, 160:277– 288, 1870.

BIBLIOGRAPHY

239

[144] A. Rault, G. Chiavassa, and R. Donat. Shock-vortex interactions at high mach numbers. J. Sci. Comput., 19(1-3):347–371, 2003. [145] C. A. Rendleman, V. E. Beckner, M. Lijewski, W. Y. Crutchfield, and J. B. Bell. Parallelization of structured, hierarchical adaptive mesh refinement algorithms. Comput. Visual. Sci., 3:137–147, 2000. [146] R. D. Richtmyer. Taylor instability in a shock acceleration of compressible fluids. Comm. Pure Appl. Math., 13:297–319, 1960. [147] R. D. Richtmyer and K. W. Morton. Difference methods for initialvalue problems, volume 4 of Interscience Tracts in Pure and Applied Mathematics. Wiley Interscience, New York, U.S.A., 2nd edition, 1967. [148] P. L. Roe. Approximate Riemann solvers, parameter vectors, and difference schemes. J. Comput. Phys., 43:357–372, 1981. [149] A. M. Roma, C. S. Peskin, and M. J. Berger. An adaptive version of the immersed boundary method. J. Comput. Phys., 153(2):509– 534, 1999. [150] O. Roussel and M. P. Errera. Adaptive mesh refinement: A wavelet point of view. In European Congress on Computational Methods in Applied Sciences and Engineering (ECCOMAS), 2000. [151] H. Sagan. Space-filling curves. Springer-Verlag, 1994. [152] Sandia National Laboratories. http://www.cs.sandia.gov/Zoltan/. [153] K. Schloegel, G. Karypis, and V. Kumar. Multilevel diffusion algorithms for repartitioning of adaptive meshes. J. Parallel Distrib. Comput., 47:109–124, 1997. [154] K. Schloegel, G. Karypis, and V. Kumar. A unified algorithm for load-balancing adaptive scientific simulations. In Supercomputing ’00: Proceedings of the 2000 ACM/IEEE Conference on Supercomputing (CDROM), page 59, Washington, DC, USA, 2000. IEEE Computer Society. [155] C.W. Schulz-Rinne, J.P. Collins, and H.M. Glaz. Numerical solution of the Riemann problem for two-dimensional gas dynamics. SIAM J. Sci. Comput., 14:1394–1414, 1993.

240

BIBLIOGRAPHY

[156] D. Schwamborn, T. Gerhold, and R. Heinrich. The DLR TAU-Code: Recent applications in research and industry. In P. Wesseling, E. Onate, ˜ and J. P´eriaux, editors, European Conference on Computational Fluid Dynamics (ECCOMAS CFD 2006), 2006. [157] J. Shi, Y.-T. Zhang, and C.-W. Shu. Resolution of high order WENO schemes for complicated flow structures. J. Comput. Phys., 186:690–696, 2003. [158] C.-W. Shu. Numerical experiments on the accuracy of ENO and modified ENO schemes. J. Sci. Comput., 5:127–149, 1990. [159] C.-W. Shu. Essentially non-oscillatory and weighted essentially non-oscillatory schemes for hyperbolic conservation laws. In Alfio Quarteroni, editor, Advanced numerical approximation of nonlinear hyperbolic equations, volume 1697 of Lecture Notes in Mathematics, pages 325–432. Springer, 1998. [160] C.-W. Shu and S. Osher. Efficient implementation of essentially non-oscillatory shock-capturing schemes. J. Comput. Phys., 77(2):439–471, 1988. [161] C.-W. Shu and S. Osher. Efficient implementation of essentially non-oscillatory shock-capturing schemes, II. J. Comput. Phys., 83(1):32–78, 1989. [162] U. Shumlak and J. Loverich. Approximate Riemann solver for the two-fluid plasma model. J. Comput. Phys., 187(2):620–638, 2003. [163] W. C. Skamarock and J. B. Klemp. Adaptive grid refinement for two-dimensional and three-dimensional nonhydrostatic atmospheric flow. Mon. Weather Rev., 121:788–804, 1993. [164] S. W. Skillman, B. W. O’Shea, E. J. Hallman, J. O. Burns, and M. L. Norman. Cosmological shocks in adaptive mesh refinement simulations and the acceleration of cosmic rays. Astrophys. J., 689(2):1063–1077, 2008. [165] J. Smoller. Shock waves and reaction-diffusion equations, volume 258 of A series of comprehensive studies in mathematics. SpringerVerlag, 1994. [166] G. A. Sod. A survey of several finite difference methods for systems of nonlinear hyperbolic conservation laws. J. Comput. Phys., 27:1– 31, 1978.

BIBLIOGRAPHY

241

[167] J. C. Strikverda. Finite difference schemes and partial differential equations. Wadsworth and Brooks, 1989. [168] Lord Rayleigh (J.W. Strutt). Investigation of the character of the equilibrium of an incompressible heavy fluid of variable density. Proceedings of the London Mathematical Society, 14:170–177, 1883. [169] Lord Rayleigh (J.W. Strutt). Scientific papers, volume II. Cambridge University Press, 1900. [170] H. Tang and T. Tang. Adaptive mesh methods for one- and twodimensional hyperbolic conservation laws. SIAM J. Numer. Anal., 41(2):487–515, 2004. [171] C. A. Taylor, T. J. R. Hughes, and C. K. Zarins. Finite element modeling of blood flow in arteries. Comput. Methods Appl. Mech. Engrg. [172] Sir G. I. Taylor. The instability of liquid surfaces when accelerated in a direction perpendicular to their planes. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci., 201:192–196, 1950. [173] The NPARC Alliance. NPARC Alliance policies and plans. Technical Report FY06-FY07, NPARC, 2005. [174] Lord Kelvin (Sir William Thomson). Mathematical and physical papers, Vol. 4, hydrodynamics and general dynamics. Cambridge University Press, 1910. [175] E. F. Toro. Riemann solvers and numerical methods for fluid dynamics. Springer-Verlag, third edition, 2009. [176] E. Turkel. Composite methods for hyperbolic equations. SIAM J. Numer. Anal., 44(4):744–759, 1981. [177] B. van Leer. Towards the ultimate conservative finite difference scheme, I. The quest of monotonicity. Lecture Notes in Physics, 18:163–168, 1973. [178] B. van Leer. Towards the ultimate conservative finite difference scheme, II. Monotonicity and conservation combined in a second order scheme. J. Comput. Phys., 14:361–370, 1974.


[179] B. van Leer. Towards the ultimate conservative finite difference scheme, III. Upstream-centered finite-difference schemes for ideal compressible flow. J. Comput. Phys., 23:263–275, 1977.
[180] B. van Leer. Towards the ultimate conservative finite difference scheme, IV. A new approach to numerical convection. J. Comput. Phys., 23:276–299, 1977.
[181] B. van Leer. Towards the ultimate conservative finite difference scheme, V. A second order sequel to Godunov’s method. J. Comput. Phys., 32:101–136, 1979.
[182] J.-L. Vay, P. Colella, J. W. Kwan, P. McCorquodale, D. B. Serafini, A. Friedman, D. P. Grote, G. Westenskow, J.-C. Adam, A. Héron, and I. Haber. Application of adaptive mesh refinement to particle-in-cell simulations of plasmas and beams. Phys. Plasmas, 11(5):2928–2934, 2004.
[183] R. Verfürth. A review of a posteriori error estimation and adaptive mesh-refinement techniques. John Wiley/Teubner, 1996.
[184] H.L.F. von Helmholtz. Über discontinuierliche Flüssigkeitsbewegungen. Monatsberichte der königl. Akad. der Wissenschaften zu Berlin, 23:215–228, 1868.
[185] R. Walden and D. Folini. A-MAZE: a code package to compute 3D magnetic flows, 3D NLTE radiative transfer and synthetic spectra. In Thermal and ionization aspects of flows from hot stars: observations and theory, ASP Conference Series, volume 204, pages 281–284, 2000.
[186] R. F. Warming and R. W. Beam. Upwind second order difference schemes with applications in aerodynamic flows. AIAA Journal, 24:1241–1249, 1976.
[187] B. Wendroff. The Riemann problem for materials with nonconvex equation of state. J. Math. Anal. Appl., 38:454–466, 1972.
[188] F. M. White. Fluid mechanics. McGraw-Hill, 5th edition, 2003.
[189] D. C. Wilcox. Turbulence modeling for CFD. D. C. W. Industries, 2002.
[190] A. Winslow. Adaptive mesh zoning by the equipotential method. Technical Report UCID19062, Lawrence Livermore Laboratory, 1981.


[191] M. Zhang, C.-W. Shu, G. C. K. Wong, and S. C. Wong. A weighted essentially non-oscillatory numerical scheme for a multi-class Lighthill-Whitham-Richards traffic flow model. J. Comput. Phys., 191(2):639–659, 2003.
[192] W. Zhang and A. I. MacFadyen. RAM: A relativistic adaptive mesh refinement hydrodynamics code. The Astrophysical Journal Supplement Series, 164:255–279, 2006.
[193] U. Ziegler. A three-dimensional Cartesian adaptive mesh code for compressible magnetohydrodynamics. Comput. Phys. Comm., 116:65–77, 1999.



Acknowledgements

First of all, thanks to my advisor Pep Mulet, to Rosa Donat and to Paco Aràndiga, who, more than advisor and colleagues, have always treated me as true friends. Thanks also to the rest of the Departament de Matemàtica Aplicada, which made me feel very welcome during the years I spent there.

Thanks also to Jacques Liandrat and Guillaume Chiavassa, and to the rest of the LATP, with whom I shared a few months in Marseille that were very important to me. To Enrique Zuazua and Francisco Palacios, who gave me the opportunity to join their project in Madrid, to Fernando Monge for making my stay at INTA possible, and to my colleagues at INTA, at IMDEA and at the other institutions I was involved with during the two years I spent there. To Norbert Kroll, Markus Widhalm and the rest of my colleagues at DLR in Braunschweig, for making my stay there in 2008 both possible and pleasant. To Vicent Caselles, who also trusted me for his projects, and to the Grup d’Imatge of Barcelona Media.

Thanks to the reviewers of this thesis for their comments and suggestions.

I especially want to thank my parents, Antonio and Nieves, and my sister, Anna, for always being on my side, as well as my family, which fortunately is very large and which I cannot name one by one, and Marta for bringing joy and peace to my life every day. To Kanke, Juan, Anna and Milena, for the great friendship that unites us despite the distance that separates us. Thanks to all my friends; you are many and I cannot name you one by one.


Summary

Introduction

In recent years the processing power of computers has increased greatly, which has allowed researchers to advance in the simulation of physical problems that could not be tackled without it. Computational Fluid Dynamics (CFD) is a discipline that makes massive use of numerical simulation to address problems such as vehicle design (cars, aircraft, submarines, etc.), traffic control, weather forecasting and others (see, for example, [164, 126, 191, 3, 171, 53, 86, 133, 17, 129]). Numerical solutions of ever greater accuracy are sought, so increasingly powerful methods have been developed, which are meant to be applied on computational meshes of ever higher resolution. Despite the power of current computing systems, the computational cost of a method of this kind can be prohibitive. Although there is a debate on whether a high order method applied on a low resolution mesh is preferable to a low order method applied on a high resolution mesh, when both provide solutions of the same quality [58, 80, 87, 110], the problems posed nowadays require high order methods applied on high resolution meshes. Examples of problems with these requirements are those involving Rayleigh-Taylor [168, 172] and Richtmyer-Meshkov [146, 122] instabilities, interactions of shock waves with vortices [48] and turbulence [189].

High resolution meshes are especially useful in regions where phenomena involving non-smooth structures, such as those just mentioned, appear. This idea has given rise to a multitude of methods based on using different resolutions in different regions, in order to reduce the computational cost of the algorithms. We can highlight methods based on non-uniform or unstructured meshes [42, 73], penalization methods [27, 190] and meshless methods [20]. A drawback of these methodologies is that the Courant-Friedrichs-Lewy (CFL) condition, which is necessary to ensure the stability of the method, imposes an upper bound on the time step that can be used, and this bound decreases with the size of the smallest cell of the mesh. Thus, using cells of different sizes does not reduce the number of time iterations needed to compute the solution.

The Adaptive Mesh Refinement (AMR) algorithm of Berger [21, 24, 22, 139] has as its most relevant distinguishing feature the double refinement in space and in time. The goal of the algorithm is not so much to reduce the number of cells on which computations are performed as to reduce the number of operations needed to update the solution between two time instants. Unlike the options named above, AMR exploits the fact that cells of different sizes can be updated with different time steps, creating overlapping meshes of different resolutions that can be integrated separately, each with its own time step, and in a coordinated way. This increases the total number of cells, because of the mesh overlap, but reduces the total number of integrations that must be performed, and consequently the computational cost of the algorithm.

Since the first description of AMR in 1982 [26, 21], a considerable amount of research has been carried out around it.
Berger and Oliger [24] described the algorithm for hyperbolic conservation laws. By the end of the 1980s there were already implementations for two-dimensional problems [22, 10], while the first results in three dimensions appeared in the early 1990s [19]. AMR algorithms combined with high order solvers appeared at the end of the last century. The PPM (Piecewise Parabolic Method) [39] is commonly used in conjunction with AMR [29], and work is currently under way on algorithms based on AMR and methods of ENO (Essentially Non-Oscillatory) and WENO (Weighted Essentially Non-Oscillatory) type [106, 108, 15, 192].

In this work we develop a high resolution numerical method for systems of hyperbolic conservation laws, based on some of the most modern numerical techniques in the area of computational fluid dynamics, and we address the main difficulties that arise when these techniques are used together. We also show how an algorithm of this kind can be implemented in parallel. Our algorithm is based on the combination of a High Resolution Shock Capturing (HRSC) method for the numerical solution of fluid equations [118] and an adaptive mesh refinement method [22]. The solver is built from the finite difference formulation of Shu and Osher [161], a fifth order WENO interpolator [81], the flux splitting method of Donat and Marquina [44] and a third order Runge-Kutta method [161]. Some important points that we have studied are: conservation of the physical quantities between meshes; refinement procedures; adaptation and generation of adaptive computational meshes; implementation and parallelization of the algorithm; and the description of the AMR algorithm in a more general form than the one commonly found in the scientific literature.

The thesis is structured in 8 chapters and an appendix. Chapter 2 introduces the basic concepts on the equations of fluid dynamics and some commonly used model equations. Chapter 3 establishes the general foundations of numerical methods for fluid equations. On these foundations, Chapter 4 builds a high order solver based on the finite difference formulation of Shu and Osher with the flux splitting of Donat and Marquina. Chapter 5 describes an adaptive mesh refinement method and how it is combined with the algorithm described in Chapter 4. The implementation of the resulting algorithm is described in Chapter 6. In Chapter 7 we carry out a set of numerical experiments to check the performance of the algorithm and to illustrate some of its properties. The conclusions of the work are presented in Chapter 8. Finally, Appendix A describes a generalization of the adaptive mesh refinement algorithm.

Fluid dynamics equations

In physics, a conservation law states that a certain property of an isolated physical system does not change in time. Conservation laws are usually modelled by integral equations, but in practice systems of partial differential equations (PDE) are used, which are equivalent to the integral form when the solution is smooth. In Chapter 2 we introduce the fundamental concepts on conservation laws and their solutions. We highlight as the main reference for that chapter the book by Landau and Lifshitz [93], and point to [16, 36, 41, 100, 91, 90, 119, 188] as fundamental references. Conservation laws have the form

\[ \frac{\partial u}{\partial t} + \sum_{q=1}^{d} \frac{\partial f^q(u)}{\partial x_q} = 0, \qquad x \in \mathbb{R}^d, \quad t \in \mathbb{R}^+, \tag{1} \]

where $d$ is the number of space dimensions, $u : \mathbb{R}^d \times \mathbb{R}^+ \to \mathbb{R}^m$ is the solution of the conservation law, and $f^q : \mathbb{R}^m \to \mathbb{R}^m$ are the fluxes. If $m = 1$ the conservation law is called scalar. The problem usually posed is a Cauchy problem, that is, finding the solution of the equation at a time $T > 0$ knowing the value of the solution at time $t = 0$, so an initial condition is prescribed:

\[ u(x, 0) = u_0(x), \qquad x \in \mathbb{R}^d. \]

System (1) can be written in the form

\[ \frac{\partial u}{\partial t} + \sum_{q=1}^{d} A^q(u) \frac{\partial u}{\partial x_q} = 0, \qquad x \in \mathbb{R}^d, \quad t \in \mathbb{R}^+, \]

where the matrices $A^q(u) = \frac{\partial f^q}{\partial u}$ are the Jacobians of the system. System (1) is called hyperbolic if any combination

\[ \sum_{q=1}^{d} \xi_q A^q, \qquad (\xi_q \in \mathbb{R}) \]

has $m$ real eigenvalues and $m$ linearly independent eigenvectors, that is, if each of these combinations is similar to a diagonal matrix. If the $m$ eigenvalues of $A^q(u)$ are distinct, the system is called strictly hyperbolic.

It is well known that solutions of hyperbolic systems can contain discontinuities, even if the initial data and the fluxes are smooth. To deal with this kind of solution the concept of weak solution is introduced: a function $u(x, t)$ is a weak solution of (1) with initial data $u(x, 0)$ if

\[ \int_{\mathbb{R}^+} \int_{\mathbb{R}^d} \left( u(x,t) \frac{\partial \phi}{\partial t}(x,t) + \sum_{q=1}^{d} f^q(u) \frac{\partial \phi}{\partial x_q}(x,t) \right) dx\,dt = - \int_{\mathbb{R}^d} \phi(x,0)\, u(x,0)\, dx \tag{2} \]

holds for every function $\phi \in C_0^1(\mathbb{R}^d \times \mathbb{R}^+)$, where $C_0^1(\mathbb{R}^d \times \mathbb{R}^+)$ is the space of continuously differentiable functions with compact support in $\mathbb{R}^d \times \mathbb{R}^+$.


The solutions of (2) are the solutions of the integral form of the equations. Moreover, solutions of (1) are weak solutions, and weak solutions are solutions of (1) whenever they are sufficiently smooth. Weak solutions are characterized as those functions that satisfy the differential form where they are smooth and the Rankine-Hugoniot conditions [143, 75] where they are not. The Rankine-Hugoniot conditions relate the values of the solution around a discontinuity to the speed at which the discontinuity propagates:

\[ [f] \cdot n_\Sigma = s\,[u] \cdot n_\Sigma, \]

where $f = (f^1, \dots, f^d)$ is a matrix containing the fluxes, $u$ is the solution, $s$ is the speed of the discontinuity and $n_\Sigma$ is the vector normal to the discontinuity. The notation $[\cdot]$ denotes the jump of a variable across the discontinuity.

It is also well known that more than one function can be a weak solution of an equation. Therefore, further conditions are added to complete the definition of weak solution, so that the physically correct solution, known as the entropy solution, can be identified. Some of these conditions are due to Oleinik [130], Lax [98], Wendroff [187] and Liu [111].

The vast majority of numerical methods for hyperbolic systems exploit in one way or another the fact that the Jacobian matrices are diagonalizable. If $R^q = [r^q_1, \dots, r^q_m]$ is the matrix of right eigenvectors of $A^q(u)$ and $\Lambda^q = \mathrm{diag}(\lambda^q_1, \dots, \lambda^q_m)$ is the diagonal matrix formed by the corresponding eigenvalues, then $A^q(u)$ can be written as $A^q = R^q \Lambda^q (R^q)^{-1}$. The eigenvectors $r^q_p$ define vector fields, called characteristic fields, along which the behaviour of the solutions can be interpreted. Two types of field are of particular importance.

The first are the genuinely nonlinear fields, defined as those for which

\[ \nabla \lambda_p(u) \cdot r_p(u) \neq 0, \qquad \forall u, \]

where $\nabla \lambda_p(u)$ is the gradient of $\lambda_p(u)$. The associated eigenvalue varies monotonically along integral curves of the characteristic field. Note that in linear systems $A^q$ does not depend on $u$, and therefore $\lambda_p$ is constant with respect to $u$. This implies that genuinely nonlinear fields cannot appear in linear systems and are specific to nonlinear systems. If, on the contrary,

\[ \nabla \lambda_p(u) \cdot r_p(u) = 0, \qquad \forall u, \tag{3} \]

then the field is called linearly degenerate, and it is characterized by the fact that the associated eigenvalue is constant along the integral curves of $r_p(u)$, as happens in linear systems, where $\nabla \lambda_p = 0$.

The eigenvalues of the Jacobian matrices determine some types of discontinuities that appear in hyperbolic systems. If $x = s(t)$ is a discontinuity separating two states $u_L(t)$ and $u_R(t)$, the discontinuity is said to be a shock associated with the $p$-th characteristic field if

\[ \lambda_p(u_L) \geq s'(t) \geq \lambda_p(u_R). \tag{4} \]
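For a scalar law in one space dimension the jump condition reduces to $f(u_R) - f(u_L) = s\,(u_R - u_L)$, and the only eigenvalue is $\lambda = f'(u)$, so condition (4) can be checked directly. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the flux $f(u) = u^2/2$ is chosen only for illustration.

```python
def rankine_hugoniot_speed(f, u_left, u_right):
    """Speed s of a discontinuity from the 1D jump condition [f] = s [u]."""
    if u_left == u_right:
        raise ValueError("no jump: u_left equals u_right")
    return (f(u_right) - f(u_left)) / (u_right - u_left)


def is_lax_shock(df, s, u_left, u_right):
    """Check condition (4): lambda(u_L) >= s >= lambda(u_R), with lambda = f'."""
    return df(u_left) >= s >= df(u_right)


# Illustrative flux (an assumption of this sketch): f(u) = u^2 / 2, so f'(u) = u.
def flux(u):
    return 0.5 * u * u


def dflux(u):
    return u


# Jump from u_L = 2 down to u_R = 0: characteristics run into the discontinuity.
s = rankine_hugoniot_speed(flux, 2.0, 0.0)
admissible = is_lax_shock(dflux, s, 2.0, 0.0)

# The reversed jump has the same Rankine-Hugoniot speed but violates (4).
s_rev = rankine_hugoniot_speed(flux, 0.0, 2.0)
reversed_jump = is_lax_shock(dflux, s_rev, 0.0, 2.0)
```

Both jumps satisfy the Rankine-Hugoniot condition with the same speed, but only the first one satisfies (4), which illustrates why an additional admissibility condition is needed to single out one weak solution.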

If equalities hold in (4), $s(t)$ is said to be a contact discontinuity associated with the $p$-th field. Rarefaction waves, in turn, are characterized by the eigenvalue being increasing along the corresponding characteristic field. These facts tell us that if two states lie on the same integral curve of a linearly degenerate field and are connected by a discontinuity, then, because of (3), the eigenvalues corresponding to those states are equal, and the discontinuity can only be a contact. Genuinely nonlinear fields, on the other hand, can exhibit shock waves and rarefactions, depending on the monotonicity of the eigenvalue.

We now present some model equations and systems of hyperbolic conservation laws, in which some of the features of the fluid equations described in this section can be observed. The simplest model is the scalar linear advection equation

\[ u_t + a u_x = 0, \]

where $a$ is a constant. For any function $F : \mathbb{R} \to \mathbb{R}$, a solution of the equation is given by $u(x, t) = F(x - at)$, that is, the function $F(x)$ is transported at constant speed in time. The only type of discontinuity that can appear in solutions of this equation is the contact discontinuity.

Among the nonlinear models, one of the simplest is the inviscid Burgers equation

\[ u_t + \left( \frac{u^2}{2} \right)_x = 0. \]

Although in quasi-linear form, $u_t + u u_x = 0$, it resembles the linear advection equation, the type of solutions it admits is completely different, and already includes shock waves and rarefactions. Solutions of the Burgers equation cannot contain contact discontinuities.

The next step in our description of model equations is linear systems of conservation laws in one dimension. They have the form

\[ u_t + A u_x = 0, \tag{5} \]

where $A$ is a square matrix of size $m \times m$, $m$ being the number of equations. Since the system is hyperbolic, the matrix $A$ admits the decomposition $A = R \Lambda R^{-1}$, where $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_m)$, with $\lambda_p \in \mathbb{R}$, and $R = [r_1, \dots, r_m]$, $r_p \in \mathbb{R}^m$. Changing to the characteristic variables, defined by $w = R^{-1} u$, the system can be written in the form $w_t + \Lambda w_x = 0$. This is a diagonal system, so that each row is an advection equation, given by

\[ \frac{\partial w_p}{\partial t} + \lambda_p \frac{\partial w_p}{\partial x} = 0. \tag{6} \]

If we consider initial data $u(x, 0) = u_0(x)$ for (5), then the solution of (6) is $w_p(x, t) = w_p^0(x - \lambda_p t)$, where $w_p^0$ is the $p$-th component of $w^0 = R^{-1} u_0$. The solution of (5) can therefore be written as $u = R \cdot w$, or in expanded form:

\[ u(x, t) = \sum_{p=1}^{m} w_p(x - \lambda_p t, 0)\, r_p. \]
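The characteristic decomposition above can be turned directly into an exact solver for (5). The sketch below is only illustrative: the $2 \times 2$ matrix $A$, its eigen-decomposition and the initial data are assumptions chosen so that everything can be written by hand, and the function names are hypothetical.

```python
import math

# Hypothetical example: A = [[0, 1], [1, 0]], written as A = R Lambda R^{-1}.
LAMBDAS = (-1.0, 1.0)                 # eigenvalues lambda_p
R = ((1.0, 1.0), (-1.0, 1.0))         # columns are the eigenvectors r_p
R_INV = ((0.5, -0.5), (0.5, 0.5))     # inverse of R


def u0(x):
    """Illustrative initial data u_0(x): a smooth pulse in the first component."""
    return (math.exp(-x * x), 0.0)


def characteristic_vars(u):
    """Characteristic variables w = R^{-1} u."""
    return tuple(R_INV[p][0] * u[0] + R_INV[p][1] * u[1] for p in range(2))


def solve(x, t):
    """Exact solution u(x, t) = sum_p w_p^0(x - lambda_p t) r_p."""
    u = [0.0, 0.0]
    for p, lam in enumerate(LAMBDAS):
        # Each characteristic variable is advected with its own speed lambda_p.
        w_p = characteristic_vars(u0(x - lam * t))[p]
        u[0] += w_p * R[0][p]
        u[1] += w_p * R[1][p]
    return tuple(u)
```

With these data the initial pulse splits into two halves travelling in opposite directions at speeds $\pm 1$, which is the superposition of advected characteristic variables expressed by the formula above.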

The most important model of a nonlinear system of hyperbolic conservation laws is given by the Euler equations, which in one dimension can be written as

\[ \begin{pmatrix} \rho \\ \rho v \\ E \end{pmatrix}_t + \begin{pmatrix} \rho v \\ \rho v^2 + p \\ v(E + p) \end{pmatrix}_x = 0, \tag{7} \]

where $\rho$ is density, $v$ velocity, $E$ energy and $p$ pressure. In Chapter 2 we give a detailed description of these equations, as well as of the multi-component Euler equations. Here we simply mention that, of the three characteristic fields, two are genuinely nonlinear and one is linearly degenerate, and therefore shock waves, contact discontinuities and rarefaction waves can all be found in the solutions.
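As a numerical check of the characteristic structure of (7), the sketch below approximates the flux Jacobian by central differences and verifies that $v - c$, $v$ and $v + c$ are roots of its characteristic polynomial, where $c$ is the sound speed. It assumes an ideal-gas equation of state $p = (\gamma - 1)(E - \rho v^2 / 2)$ with $\gamma = 1.4$, which this summary does not specify; the function names and the test state are hypothetical.

```python
import math

GAMMA = 1.4  # assumed ideal-gas adiabatic exponent (not fixed by the text)


def pressure(u):
    """Ideal-gas pressure from conserved variables u = (rho, rho*v, E)."""
    rho, mom, energy = u
    return (GAMMA - 1.0) * (energy - 0.5 * mom * mom / rho)


def flux(u):
    """Flux of (7) in conserved variables u = (rho, rho*v, E)."""
    rho, mom, energy = u
    v = mom / rho
    p = pressure(u)
    return (mom, mom * v + p, v * (energy + p))


def jacobian(u, eps=1e-6):
    """Central-difference approximation of A(u) = df/du, as a 3x3 list of rows."""
    cols = []
    for k in range(3):
        up = list(u)
        um = list(u)
        up[k] += eps
        um[k] -= eps
        fp, fm = flux(up), flux(um)
        cols.append([(fp[i] - fm[i]) / (2.0 * eps) for i in range(3)])
    return [[cols[k][i] for k in range(3)] for i in range(3)]


def char_poly(a, lam):
    """det(A - lam I) for a 3x3 matrix given as a list of rows."""
    m = [[a[i][j] - (lam if i == j else 0.0) for j in range(3)] for i in range(3)]
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Arbitrary illustrative state given by density, velocity and pressure.
rho, v, p = 1.0, 0.5, 1.0
state = (rho, rho * v, p / (GAMMA - 1.0) + 0.5 * rho * v * v)
c = math.sqrt(GAMMA * p / rho)  # sound speed of the ideal gas
A = jacobian(state)
residuals = [char_poly(A, lam) for lam in (v - c, v, v + c)]
```

The three residuals vanish up to the accuracy of the finite-difference Jacobian, so for this state the system has three distinct real eigenvalues, consistent with the three characteristic fields mentioned above.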


Numerical methods for fluid dynamics

The equations of fluid dynamics are, in general, impossible to solve exactly, except in some very simple cases. The aim of numerical methods is precisely the approximate computation of the solutions, in the form of a set of discrete values, which is often sufficient in practice. In this chapter we describe the basic notions and results on numerical methods for fluid equations, focusing almost exclusively on the one-dimensional scalar case, in order to clarify the ideas used in the description of the particular numerical method employed in this work. All the concepts introduced in this chapter are explained in any textbook on computational fluid dynamics, for example the books by Toro [175], LeVeque [104, 105] and Hirsch [70, 71].

Consider the equation corresponding to the one-dimensional scalar case of (1),

\[ u_t + f(u)_x = 0, \qquad (x, t) \in \mathbb{R} \times \mathbb{R}^+, \tag{8} \]

with initial condition $u(x, 0) = u_0(x)$. Consider a mesh obtained by discretizing an interval of the real line, taken as $I = [0, 1]$ for simplicity, given by the points $x_j = \left(j + \frac{1}{2}\right) \Delta x$, where $\Delta x = \frac{1}{N}$, with $N$ a positive integer, for $0 \leq j < N$. These points define subintervals $c_j = [x_{j-\frac{1}{2}}, x_{j+\frac{1}{2}}]$. The time variable is discretized on an interval $[0, T]$, $T > 0$, as $t_n = n \Delta t$, $\Delta t = \frac{T}{M}$, with $M$ a positive integer. We denote by $\{U_j\}_{j=0}^{N-1}$ the pointwise approximation of the solution of (8) at the points $x_j$. We consider numerical methods that are explicit in time, which obtain the numerical solution at a time $t_{n+1}$ from the solutions already computed at earlier times, and which can be written as

\[ U^{n+1} = H(U^n, U^{n-1}, \dots, U^{n-p+1}), \qquad p > 0. \tag{9} \]

In particular we consider one-step methods ($p = 1$), that is, $U^{n+1} = H(U^n)$. The goal of numerical methods is to compute accurate approximations to the solution of (8). To measure the quality of these approximations we use discrete norms, defined on $\mathbb{R}^N$, where $N$ is the number of points of the discretization. We want the numerical method to be convergent: given a sequence of meshes $\{G_k\}_{k=0}^{+\infty}$ with $\lim_{k \to +\infty} \Delta x_k = 0$, the method (9) is convergent in the norm $\|\cdot\|$ if $\lim_{k \to +\infty} \|U_{G_k} - u_{G_k}\| = 0$, where $u_{G_k}$ are the point values of the solution of (8) on the mesh $G_k$, and $U_{G_k}$ is the numerical solution computed on the same mesh.

To study the convergence of a method, one commonly uses the concepts of consistency and stability, which together with the Lax Theorem [102, 147, 167] can be used to prove convergence. We assume that the ratio $\frac{\Delta t}{\Delta x}$ is bounded by a constant. A method is called consistent if the local truncation error, defined as $L^n_{\Delta t} = \frac{1}{\Delta t} \left( H(u^n) - u^{n+1} \right)$, tends to zero as $\Delta t$ tends to zero, assuming that $u$ is smooth. If $L^n_{\Delta t} = O(\Delta t^p)$ we say that the method has order of accuracy $p$. On the other hand, a method is stable if the error that results from applying the numerical method $n$ times to the initial data, given by $E^n = H^n(U^0) - u^n$, can be bounded by some quantity that tends to zero as $\Delta t$ and $\Delta x$ tend to zero. The result known as the Lax Theorem allows us to exploit these concepts, since it states that, for a consistent linear one-step method applied to a well-posed Cauchy problem, stability and convergence are equivalent. The analysis of stability and consistency is, in general, simpler than that of convergence. Consistency is typically proved by means of Taylor expansions, while stability can be analysed using concepts such as total variation, monotonicity and contractivity of the method.

In this work we focus on conservative methods derived from a semi-discrete formulation. A method is said to be conservative if there exists a function $\hat f : \mathbb{R}^{p+q+1} \to \mathbb{R}$, called the numerical flux, such that

\[ H(U^n)_j = U^n_j - \frac{\Delta t}{\Delta x} \left( \hat f(U^n_{j-p+1}, \dots, U^n_{j+q}) - \hat f(U^n_{j-p}, \dots, U^n_{j+q-1}) \right), \tag{10} \]

for certain positive integers $p$ and $q$. We use the notation $\hat f_{j+\frac{1}{2}} = \hat f(U^n_{j-p+1}, \dots, U^n_{j+q})$. The function $\hat f$ is said to be consistent if $\hat f(U, \dots, U) = f(U)$. The advantage of conservative methods comes from the fact that, when they converge, they necessarily converge to a weak solution of the problem, according to the Lax-Wendroff Theorem [103].

One way to obtain a conservative method consists in first discretizing the term $f(u)_x$, by means of a formula

\[ f(u)_x \approx \frac{\hat f_{j+\frac{1}{2}} - \hat f_{j-\frac{1}{2}}}{\Delta x}, \]

leaving the term $u_t$ undiscretized, so that a system of ordinary differential equations is obtained:

\[ \frac{dU_j(t)}{dt} + \frac{\hat f_{j+\frac{1}{2}} - \hat f_{j-\frac{1}{2}}}{\Delta x} = 0, \qquad \forall j. \]
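The conservation property behind (10) can be seen in a few lines: interface fluxes cancel in pairs when the update is summed over the mesh. The sketch below is only illustrative and rests on several assumptions that are not taken from the text: a two-point Lax-Friedrichs-type numerical flux (the case $p = q = 1$), a periodic mesh, the Burgers flux as test case, and a forward Euler step in time. All names are hypothetical.

```python
def lax_friedrichs_flux(f, alpha, u_left, u_right):
    """Two-point numerical flux; consistent, since it returns f(U) when both states equal U."""
    return 0.5 * (f(u_left) + f(u_right)) - 0.5 * alpha * (u_right - u_left)


def spatial_operator(f, alpha, u, dx):
    """D(U)_j = (fhat_{j+1/2} - fhat_{j-1/2}) / dx on a periodic mesh (periodicity assumed here)."""
    n = len(u)
    # fhat[j] stores the flux at the interface x_{j+1/2}, between nodes j and j+1.
    fhat = [lax_friedrichs_flux(f, alpha, u[j], u[(j + 1) % n]) for j in range(n)]
    # fhat[j - 1] with j = 0 wraps around to the last interface, as periodicity requires.
    return [(fhat[j] - fhat[j - 1]) / dx for j in range(n)]


def euler_step(f, alpha, u, dx, dt):
    """One forward Euler step of dU_j/dt + D(U)_j = 0, which has the conservative form (10)."""
    d = spatial_operator(f, alpha, u, dx)
    return [u[j] - dt * d[j] for j in range(len(u))]


def burgers_flux(u):
    return 0.5 * u * u


N = 50
dx = 1.0 / N
dt = 0.5 * dx  # time step chosen so that dt/dx * max|f'(u)| = 0.5
u = [1.0 if (j + 0.5) * dx < 0.5 else 0.0 for j in range(N)]  # step initial data
mass_before = sum(u) * dx
for _ in range(20):
    u = euler_step(burgers_flux, 1.0, u, dx, dt)
mass_after = sum(u) * dx
```

Up to round-off, `mass_after` equals `mass_before`: the discrete total of $U$ changes only through fluxes at the domain boundary, and on a periodic mesh those cancel as well.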

This system is solved by means of a suitable ordinary differential equation solver. In this work we use a third order Runge-Kutta algorithm [160] that has the property of being total variation diminishing, and which has the form:

\[
\begin{aligned}
U^{(1)} &= U^n - \Delta t\, D(U^n), \\
U^{(2)} &= \frac{3}{4} U^n + \frac{1}{4} U^{(1)} - \frac{1}{4} \Delta t\, D(U^{(1)}), \\
U^{n+1} &= \frac{1}{3} U^n + \frac{2}{3} U^{(2)} - \frac{2}{3} \Delta t\, D(U^{(2)}),
\end{aligned}
\tag{11}
\]

where $D(U)$ is given by $D(U)_j = \dfrac{\hat f_{j+\frac{1}{2}} - \hat f_{j-\frac{1}{2}}}{\Delta x}$.

High order numerical fluxes can be computed following the methodology of flux reconstruction with limiters. The idea is to solve, at each point $x_{j+\frac{1}{2}}$, a problem given by constant data on each side of the interface (a Riemann problem), as is done in Godunov's method [57]. High order is achieved by computing these states as high order interpolations of the data, and by introducing a limiter so that stability can be ensured. Examples of methods of this type are MUSCL [181], PPM [39], PHM [117], ENO [64] and WENO [113, 81], to name a few.

To close this section, we mention that, since the method is applied on a mesh of finite size, and the computation of the numerical flux at the nodes close to the boundaries involves values that do not belong to the mesh, it is necessary to specify the behaviour of the solutions at the boundaries by imposing artificial boundary conditions. To this end we consider auxiliary nodes, with indices $\{-1, \dots, -p\}$ (on the left) and $\{N, \dots, N + q - 1\}$ (on the right), at which the value of the solution is prescribed. Some common boundary conditions are those of inflow and outflow type, where the fluid enters or leaves through the boundary, those of absorbing type, where the fluid is absorbed by the boundary, and those of reflecting type, where the fluid bounces back on reaching it. In the first case, one way of imposing the boundary conditions consists in setting the values at the boundary according to

\[ U^n_{-k} = U^n_0, \quad 1 \leq k \leq p, \qquad U^n_{N+k-1} = U^n_{N-1}, \quad 1 \leq k \leq q. \]

In the second case, the goal is essentially to set the numerical divergence at the boundary to zero, which, for conservative methods, can be achieved by means of
\[
U^n_{-k} = U^n_{k-1}, \quad 1 \le k \le p, \qquad
U^n_{N+k-1} = U^n_{N-k}, \quad 1 \le k \le q.
\tag{12}
\]


In the third case, the formulas used depend on the problem. Typically, equations (12) are applied to all variables except the velocity, whose sign is changed to indicate that the fluid reverses direction when it reaches the boundary. For example, for the one-dimensional Euler equations (7), we would set
\[
\begin{aligned}
\rho_{-k} &= \rho_{k-1}, & \rho_{N+k-1} &= \rho_{N-k},\\
u_{-k} &= -u_{k-1}, & u_{N+k-1} &= -u_{N-k},\\
E_{-k} &= E_{k-1}, & E_{N+k-1} &= E_{N-k},\\
1 &\le k \le p, & 1 &\le k \le q.
\end{aligned}
\]
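The ghost-node formulas above can be illustrated with a small sketch. The function name `add_ghost_cells` and its arguments are our own illustrative choices, not part of the original text; the `sign` argument is an assumed device to flip the velocity in the reflecting case while reusing (12) for the other variables.

```python
def add_ghost_cells(u, p, q, kind="constant", sign=1.0):
    """Return u extended with p ghost values on the left and q on the right.

    kind="constant": U_{-k} = U_0 and U_{N+k-1} = U_{N-1}
    kind="mirror":   U_{-k} = sign*U_{k-1} and U_{N+k-1} = sign*U_{N-k}, as in (12);
                     sign=-1.0 gives the velocity rule of the reflecting case.
    """
    n = len(u)
    if p > n or q > n:
        raise ValueError("not enough interior nodes to fill the ghost nodes")
    if kind == "constant":
        left = [u[0]] * p
        right = [u[n - 1]] * q
    elif kind == "mirror":
        # ghost index -k, listed for k = p, ..., 1 so that indices increase
        left = [sign * u[k - 1] for k in range(p, 0, -1)]
        # ghost index N+k-1 for k = 1, ..., q
        right = [sign * u[n - k] for k in range(1, q + 1)]
    else:
        raise ValueError("unknown boundary type")
    return left + list(u) + right


density = add_ghost_cells([1.0, 2.0, 3.0, 4.0], 2, 3, kind="mirror")
velocity = add_ghost_cells([1.0, 2.0, 3.0, 4.0], 2, 3, kind="mirror", sign=-1.0)
```

For the Euler example, density and energy would use `sign=1.0` and the velocity `sign=-1.0`.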

Shu-Osher finite-difference formulation with Donat-Marquina flux splitting

In this chapter we briefly describe the numerical method that we use to integrate the equations on each grid. Its building blocks are the finite-difference formulation of Shu and Osher [160, 161], the flux splitting of Donat and Marquina [44], the fifth order WENO reconstruction [81] and a third order TVD Runge-Kutta integrator [161]. The finite-difference formulation of Shu and Osher is an alternative to methods based on finite-volume discretizations that simplifies the implementation of the numerical method, especially for problems in more than one space dimension. A comparative description of both options can be found in [159]. The original description of the method, made for scalar equations, proposes the extension to systems by applying the method to each local characteristic field, obtained, for example, from a linearization such as that of Roe's method. An alternative was proposed in [44], where two Jacobian matrices are computed at each cell interface. This is a more natural extension of the Shu-Osher methodology to systems, and it has been shown to give better results in some pathological cases. Moreover, it can be used together with any reconstruction. In this work we use the WENO reconstruction as described in the work of Jiang and Shu [81]. The algorithm that results from combining all these techniques was used in [118] on a fixed grid, for a multicomponent fluid problem.

We next describe the Shu-Osher formulation for a one-dimensional scalar equation
\[
u_t + f(u)_x = 0.
\tag{13}
\]

The extension to more than one dimension is immediate, and the extension to systems will be carried out later. Let $h(x)$ be a function (depending on the grid size $\Delta x$) that satisfies
\[
f(u(x)) = \frac{1}{\Delta x} \int_{x-\frac{\Delta x}{2}}^{x+\frac{\Delta x}{2}} h(\xi)\, d\xi.
\tag{14}
\]
Then
\[
f(u(x))_x = \frac{h\left(x+\frac{\Delta x}{2}\right) - h\left(x-\frac{\Delta x}{2}\right)}{\Delta x},
\]
so that equation (13) is equivalent to
\[
u_t + \frac{h\left(x+\frac{\Delta x}{2}\right) - h\left(x-\frac{\Delta x}{2}\right)}{\Delta x} = 0.
\tag{15}
\]

In order to obtain a conservative method we need to approximate the derivative $f(u(x))_x$ by an expression of the form (see (10))
\[
\frac{1}{\Delta x}\left(\hat f^n_{j+\frac12} - \hat f^n_{j-\frac12}\right).
\]
Equation (15) suggests that the numerical flux $\hat f^n_{j+\frac12}$ should approximate the value of $h(x_{j+\frac12})$. In other words, if we can compute high order approximations of $h(x_{j+\frac12})$, these can be used as high order numerical fluxes for a conservative scheme. Note that, by (14), the values $f(u(x))$ at the grid nodes are the cell averages of $h$, so $h(x_{j+\frac12})$ can be approximated from these known values by means of any reconstruction $R$ of point values from cell averages. In this work we use the WENO reconstruction of Jiang and Shu [81]. An essential point when computing the reconstruction is upwinding, which means that the numerical method has to take into account the direction in which the solution moves, given by the signs of the eigenvalues of the Jacobian matrix. For scalar equations, this is simply the sign of $f'(u)$. We use the Roe indicator
\[
\kappa_{j+\frac12} = \frac{f(U_{j+1}) - f(U_j)}{U_{j+1} - U_j}
\tag{16}
\]
to determine its sign, and we compute suitably biased reconstructions: if $\kappa_{j+\frac12} > 0$ we compute the left-biased reconstruction $\hat f_{j+\frac12} = R(f(U_{j-s_1}), \dots, f(U_{j+s_2}), x_{j+\frac12})$, and if $\kappa_{j+\frac12} \le 0$ we compute $\hat f_{j+\frac12} = R(f(U_{j-s_1+1}), \dots, f(U_{j+s_2+1}), x_{j+\frac12})$.


It is well known that numerical fluxes computed in this way admit entropy-violating solutions at sonic points, so the computation has to be modified at these points, which are those where $f'(u)$ changes sign. Shu and Osher use the local Lax-Friedrichs (LLF) algorithm at sonic points. This algorithm is obtained from a flux splitting given by $f(U_j) = f^+(U_j) + f^-(U_j)$, with $f^\pm(u) = \frac12 (f(u) \pm \beta u)$, where $\beta$ is a local value given by $\beta_{j+\frac12} = \max_{u \in [U_j, U_{j+1}]} |f'(u)|$, and from computing the reconstruction $\hat f_{j+\frac12}$ as the sum $\hat f_{j+\frac12} = \hat f^+_{j+\frac12} + \hat f^-_{j+\frac12}$ of two biased reconstructions computed using, respectively, $f^+(u)$ and $f^-(u)$:
\[
\begin{aligned}
\hat f^+_{j+\frac12} &= R(f^+(U_{j-s_1}), \dots, f^+(U_{j+s_2})),\\
\hat f^-_{j+\frac12} &= R(f^-(U_{j-s_1+1}), \dots, f^-(U_{j+s_2+1})).
\end{aligned}
\]
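The upwind choice driven by the Roe indicator (16) and the local Lax-Friedrichs splitting just described can be sketched as follows. This is only a minimal illustration under stated assumptions: the reconstruction $R$ is replaced by a first order one (it simply returns the stencil value adjacent to the interface) instead of WENO, a sonic point is detected when the Roe indicators of two neighboring interfaces do not have the same sign, and $\beta$ is taken as the maximum of $|f'|$ at the two end states, which is valid for convex fluxes. The names `roe_speed` and `numerical_flux` are ours.

```python
def roe_speed(f, df, ul, ur):
    """Roe indicator (16); uses f'(u) when both states coincide."""
    if ur == ul:
        return df(ul)
    return (f(ur) - f(ul)) / (ur - ul)


def numerical_flux(f, df, u, j):
    """First order flux at x_{j+1/2}, valid for 1 <= j <= len(u) - 2."""
    k_minus = roe_speed(f, df, u[j - 1], u[j])
    k_plus = roe_speed(f, df, u[j], u[j + 1])
    if k_minus * k_plus > 0:
        # no sonic point nearby: plain upwinding
        if k_plus > 0:
            return f(u[j])        # left-biased "reconstruction"
        return f(u[j + 1])        # right-biased "reconstruction"
    # sonic point: local Lax-Friedrichs flux splitting
    beta = max(abs(df(u[j])), abs(df(u[j + 1])))   # assumes a convex flux
    f_plus = 0.5 * (f(u[j]) + beta * u[j])
    f_minus = 0.5 * (f(u[j + 1]) - beta * u[j + 1])
    return f_plus + f_minus


# Burgers flux as a usage example
burgers = lambda u: 0.5 * u * u
burgers_prime = lambda u: u
flux_right_moving = numerical_flux(burgers, burgers_prime, [1.0, 1.0, 0.5, 0.5], 1)
flux_sonic = numerical_flux(burgers, burgers_prime, [-1.0, -1.0, 1.0, 1.0], 1)
```

With a higher order reconstruction, only the three places where a single stencil value is returned would change; the selection logic stays the same.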

The final algorithm for the computation of the numerical fluxes reads as follows:

Algorithm 1.
    Define $\beta_{j+\frac12} = \max_{u \in [U_j, U_{j+1}]} |f'(u)|$
    Define $\kappa_{j+\frac12}$ by means of (16)
    if $\kappa_{j-\frac12} \cdot \kappa_{j+\frac12} > 0$
        if $\kappa_{j+\frac12} > 0$
            $\hat f_{j+\frac12} = R(f_{j-s_1}, \dots, f_{j+s_2}, x_{j+\frac12})$
        else
            $\hat f_{j+\frac12} = R(f_{j-s_1+1}, \dots, f_{j+s_2+1}, x_{j+\frac12})$
        end if
    else
        $\hat f^+_{j+\frac12} = R(\frac12 (f_{j-s_1} + \beta_{j+\frac12} U_{j-s_1}), \dots, \frac12 (f_{j+s_2} + \beta_{j+\frac12} U_{j+s_2}), x_{j+\frac12})$
        $\hat f^-_{j+\frac12} = R(\frac12 (f_{j-s_1+1} - \beta_{j+\frac12} U_{j-s_1+1}), \dots, \frac12 (f_{j+s_2+1} - \beta_{j+\frac12} U_{j+s_2+1}), x_{j+\frac12})$
        $\hat f_{j+\frac12} = \hat f^+_{j+\frac12} + \hat f^-_{j+\frac12}$
    end if

The extension of the above formulation to systems is carried out by means of characteristic variables and fluxes, following the flux splitting formula of Donat and Marquina [44], which is based on a double linearization of the Jacobian matrix at each interface. First, for each interface, two biased approximations of the conserved variables $u$ are computed. We denote them by $U^L_{j+\frac12}$ and $U^R_{j+\frac12}$, and they are computed using biased stencils that contain, respectively, the points $x_j$ and $x_{j+1}$. Note that what is computed are approximations to the point value of $u$ at $x_{j+\frac12}$ from point values of $u$ at the grid nodes. In this work we have used for this purpose a version of the WENO algorithm of Jiang and Shu [81], modified so that point values are reconstructed from point values, and not from cell averages as in the original method. Second, the linearizations corresponding to the two Jacobians $f'(U^L_{j+\frac12})$ and $f'(U^R_{j+\frac12})$ are computed. We denote by $l^p(U^L_{j+\frac12})$ and $r^p(U^L_{j+\frac12})$ the left and right eigenvectors, respectively, of $f'(U^L_{j+\frac12})$, and by $l^p(U^R_{j+\frac12})$ and $r^p(U^R_{j+\frac12})$ the same objects corresponding to $f'(U^R_{j+\frac12})$. Next, two sets of characteristic variables and fluxes are computed at the points of a certain stencil, according to the changes of variable given by the two matrices of left eigenvectors:
\[
\begin{aligned}
W^L_{p,k} &= l^p(U^L_{j+\frac12}) \cdot U_k, &
f^L_{p,k} &= l^p(U^L_{j+\frac12}) \cdot f(U_k), &
&\text{for } j-s_1 \le k \le j+s_2,\\
W^R_{p,k} &= l^p(U^R_{j+\frac12}) \cdot U_k, &
f^R_{p,k} &= l^p(U^R_{j+\frac12}) \cdot f(U_k), &
&\text{for } j-s_1+1 \le k \le j+s_2+1.
\end{aligned}
\tag{17}
\]

The numerical fluxes are now computed in a way similar to the Shu-Osher algorithm described for the scalar case, but using the characteristic variables corresponding to the left or the right linearization as appropriate. Which variables are used is determined by the sign of the eigenvalues of the two linearizations. The final algorithm for the computation of the numerical fluxes reads as follows:

Algorithm 2.
    if $\lambda^p(u)$ does not change sign in $[U_j, U_{j+1}]$
        if $\lambda^p(U_j) > 0$
            $\psi^L_{p,j} = R(f^L_{p,j-s_1}, \dots, f^L_{p,j+s_2}, x_{j+\frac12})$
            $\psi^R_{p,j} = 0$
        else
            $\psi^L_{p,j} = 0$
            $\psi^R_{p,j} = R(f^R_{p,j-s_1+1}, \dots, f^R_{p,j+s_2+1}, x_{j+\frac12})$
        end if
    else
        $\psi^L_{p,j} = R(\frac12 (f^L_{p,j-s_1} + \beta^p_{j+\frac12} W^L_{p,j-s_1}), \dots, \frac12 (f^L_{p,j+s_2} + \beta^p_{j+\frac12} W^L_{p,j+s_2}), x_{j+\frac12})$
        $\psi^R_{p,j} = R(\frac12 (f^R_{p,j-s_1+1} - \beta^p_{j+\frac12} W^R_{p,j-s_1+1}), \dots, \frac12 (f^R_{p,j+s_2+1} - \beta^p_{j+\frac12} W^R_{p,j+s_2+1}), x_{j+\frac12})$
    end if
    $\hat f_{j+\frac12} = \sum_p \psi^L_{p,j}\, r^p(U^L_{j+\frac12}) + \psi^R_{p,j}\, r^p(U^R_{j+\frac12})$

where $\beta^p_{j+\frac12} = \max_u |\lambda^p(u)|$, with $u$ varying along a curve in phase space that connects $U_j$ and $U_{j+1}$. For the model equations considered in this work, the characteristic fields are either genuinely nonlinear or linearly degenerate, so the eigenvalues are, respectively, monotone or constant as $u$ varies. Therefore the sign changes of $\lambda^p(u)$ can be studied by simply checking the sign of $\lambda^p(U_j) \cdot \lambda^p(U_{j+1})$, and $\beta^p_{j+\frac12}$ can be computed simply as $\beta^p_{j+\frac12} = \max\{|\lambda^p(U_j)|, |\lambda^p(U_{j+1})|\}$. We finally point out that for scalar equations Algorithms 1 and 2 are equivalent. Once the numerical fluxes have been computed, we solve the equation
\[
u_t + \frac{\hat f_{j+\frac12} - \hat f_{j-\frac12}}{\Delta x} = 0
\]
that results from the semi-discrete formulation by means of the third order Runge-Kutta method given by equations (11). Finally, the extension of the above algorithms to more than one space dimension can be carried out dimension by dimension in a simple way [161, 118].
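As an illustration of how the semi-discrete equation is advanced in time with the scheme (11), the following sketch implements one third order TVD Runge-Kutta step. The spatial operator used here is an assumption made only to keep the example self-contained: a periodic grid and a generic two-point numerical flux `flux(ul, ur)` instead of the WENO-based fluxes of Algorithms 1 and 2. The function names are ours.

```python
def divergence(u, flux, dx):
    """D(U)_j = (fhat_{j+1/2} - fhat_{j-1/2}) / dx on a periodic grid."""
    n = len(u)
    # fhat[j] approximates the flux at x_{j+1/2}
    fhat = [flux(u[j], u[(j + 1) % n]) for j in range(n)]
    # fhat[j - 1] with j = 0 wraps around to the last interface (periodicity)
    return [(fhat[j] - fhat[j - 1]) / dx for j in range(n)]


def tvd_rk3_step(u, dt, D):
    """One step of the third order TVD Runge-Kutta scheme (11)."""
    u1 = [a - dt * d for a, d in zip(u, D(u))]
    u2 = [0.75 * a + 0.25 * b - 0.25 * dt * d
          for a, b, d in zip(u, u1, D(u1))]
    return [a / 3.0 + 2.0 * b / 3.0 - 2.0 * dt * d / 3.0
            for a, b, d in zip(u, u2, D(u2))]


# usage: linear advection u_t + u_x = 0 with the upwind flux fhat = u_left
n = 20
dx = 1.0 / n
dt = 0.5 * dx                      # CFL number 0.5
u0 = [1.0 if 5 <= j < 10 else 0.0 for j in range(n)]
D = lambda u: divergence(u, lambda ul, ur: ul, dx)
u_new = tvd_rk3_step(u0, dt, D)
```

Each stage is a convex combination of forward Euler steps, which is why the total variation of the square pulse does not grow in this example.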

Adaptive mesh refinement

The adaptive mesh refinement (AMR) used in this work is a general purpose methodology for the efficient numerical integration of hyperbolic conservation laws. The algorithm was first described by Berger [21] and Berger and Oliger [24] for methods based on artificial viscosity, and later for finite-volume methods by Berger and Colella [22]. A simplified version was described by Quirk [139]. AMR tries to match the resolution of the computational grids to the needs of the numerical solution, using coarser grids where the solution is smooth and refining only where it shows non-smooth structures. Under favorable conditions the resulting algorithm requires only a fraction of the computing time needed to solve the problem on a uniform grid. This efficiency comes from both spatial and temporal refinement, and no special restrictions are imposed on the numerical method to be used, so that the algorithm keeps a good degree of generality. These goals are achieved by taking into account that the solutions of hyperbolic conservation laws are typically composed of waves that move through zones where the solution is smooth. High resolution shock capturing methods try to capture and resolve these waves, since high resolution is not needed in the smooth zones. The main difficulty of AMR is therefore to identify the non-smooth zones, which will be refined up to a suitable resolution level, and to ensure that the refinement follows the motion of the waves in time.

We now explain the algorithm in more detail, for the one-dimensional case. Given a bounded set $\Omega_1 \subset \mathbb{R}^d$, and due to the hyperbolicity of the problem under consideration, its solution $u$ on $\Omega_1 \times [t_0, t_0+\Delta t]$ depends only on the values of $u$ on a superset of $\Omega_1 \times \{t_0\}$, which is given by the domain of dependence of the equations and is another bounded set.
Numerical methods mimic this feature, and compute the solution on $\Omega_1 \times [t_0, t_0+\Delta t]$ using data from another bounded set, which is determined by the numerical domain of dependence of the method. Let then $\tilde\Omega_1 \times \{t_0\}$ be the numerical domain of dependence of the method under consideration, and assume that the CFL condition holds. Then, given an approximation of $u$ on a grid defined on $\tilde\Omega_1 \times \{t_0\}$, it is possible to compute approximations to $u(x,t)$, for $(x,t) \in \Omega_1 \times [t_0, t_0+\Delta t]$, by means of the numerical method. A new application of the same idea, in order to compute approximations of $u$ on a grid corresponding to $\Omega_1 \times [t_0+\Delta t, t_0+2\Delta t]$, would require the values of the approximations of $u$ on $\tilde\Omega_1 \times \{t_0+\Delta t\}$, but approximations are only available on its subset $\Omega_1 \times \{t_0+\Delta t\}$. The approximations on the band $\hat\Omega_1 := \tilde\Omega_1 \setminus \Omega_1$ have to be obtained by other means. If $u$ is smooth on $\hat\Omega_1$, these values can be accurately interpolated from the approximations of $u$ on a coarser grid defined on a domain $\tilde{\tilde\Omega}_1 \supseteq \tilde\Omega_1$.

AMR can be described as the recursive application of this idea to a hierarchy of grids of different resolutions, so that each grid contains the features of the solution that cannot be predicted using the information contained in coarser grids. The hierarchies considered in this work start from a grid $G^t_0$ defined on the whole domain $\Omega$ (we assume $\Omega = [0,1]$ for simplicity) where we want to solve the equations. This grid is composed of $N_0$ cells of size $\Delta x_0 = \frac{1}{N_0}$. From it, a set of $L$ grids of increasing resolution can be built by considering grids obtained by subdividing the cells of the immediately coarser grid into several parts (we assume that each cell is divided into two parts, to simplify). That is, the interval $[0,1]$ is divided into $N_0, \dots, N_{L-1}$ subintervals (cells) of length $\Delta x_\ell = 1/N_\ell$, where $N_\ell = 2^\ell N_0$, $\ell = 0, \dots, L-1$. The cell centers are denoted by $x^\ell_j = (j+\frac12)\Delta x_\ell$, $j = 0, \dots, N_\ell - 1$, $\ell = 0, \dots, L-1$, and the union of the cells indexed by the elements of $G_\ell$ by $\Omega_\ell(G_\ell)$. Within our hierarchy, a grid corresponding to a given level $\ell$ is a subset of the $N_\ell$ cells that correspond to this level, and can also be regarded as a subset of $\{0, \dots, N_\ell - 1\}$. Since the solution changes with time, so do the grids, and we denote by $G^{t_\ell}_\ell$ the grid corresponding to resolution level $\ell$ at time $t_\ell$. On each of these grids we consider a discrete function $u^{t_\ell}_{\ell,j} \approx u(x^\ell_j, t_\ell)$, where $j \in G^{t_\ell}_\ell$. The algorithm evolves these grids and their associated numerical solutions, starting with $t_\ell = 0$, $\ell = 0, \dots, L-1$, and finishing with $t_\ell = T$, $\ell = 0, \dots, L-1$, where $T$ is the final time up to which we want to solve the equations. At intermediate times we require $t_\ell \ge t_{\ell'}$ if $\ell \le \ell'$.
We further enforce the following condition: for $\ell > 0$, $G_\ell = \{2i, 2i+1, \text{ for some } i \in G_{\ell-1}\}$, which in particular implies that the grids are nested. The fundamental building blocks of AMR are: integration, which consists in applying the algorithm described in the previous section to each cell of each grid; adaptation, so that a suitable refinement is applied to each part of the domain at all times; and projection, which enforces conservation between grids where they overlap.

As for the integration, a time step $\Delta t_0$ is first chosen so that the CFL condition
\[
\Delta t_0 \le \frac{\Delta x_0}{\max_u |f'(u)|}
\]
holds on the grid $G^t_0$. The time steps for the remaining grids are taken as
\[
\Delta t_\ell = \frac{\Delta t_{\ell-1}}{2}, \qquad \ell = 1, \dots, L-1,
\]
which implies that the CFL condition corresponding to each grid holds. A time step of a grid therefore corresponds to two steps of the immediately finer grid, so that each grid can be integrated from time $t$ (initially the same for all grids) to time $t + \Delta t_0$ by performing $2^\ell$ iterations. All these iterations are performed sequentially according to the following criteria:

1. Each grid is integrated immediately after the grid corresponding to one resolution level less.
2. If a grid $G^{t_\ell}_\ell$ ($\ell > 0$) has been integrated up to a certain time, it will not be integrated again until all the grids finer than it have been integrated up to the same time.

Once all the grids have been integrated up to time $t + \Delta t_0$, the process is repeated for the next time step of the coarsest grid.

Finally, note that in order to integrate a grid $G_\ell$ ($\ell > 0$) from time $t$ to time $t + \Delta t_\ell$, it has to be supplied with data from the band $\tilde\Omega_\ell(G_\ell) \times \{t\}$. These data are obtained by interpolation in space from $\tilde\Omega_{\ell-1}(G_{\ell-1})$. On the other hand, to integrate the same grid from time $t + \Delta t_\ell$ to $t + 2\Delta t_\ell$, data are required on $\tilde\Omega_\ell(G_\ell) \times \{t + \Delta t_\ell\}$, and in this case they are obtained from $G^{t}_{\ell-1}$ and $G^{t+2\Delta t_\ell}_{\ell-1}$, which will have been integrated previously. Since the grid $G^t_\ell$ is integrated with a Runge-Kutta algorithm, defined by (11), data for this band have to be supplied at each intermediate stage. We do this by observing that the data $U^{(1)}$ and $U^{(2)}$ that appear in (11) can be interpreted as approximations of the solution at times $t + \Delta t_\ell$ and $t + \frac{\Delta t_\ell}{2}$, respectively.

Once $(u^{t_\ell + 2\Delta t_\ell}_\ell, G^{t_\ell}_\ell)$ has been computed, there exist data computed with different resolutions that correspond to the same set $\Omega_\ell(G^{t_\ell}_\ell)$.
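To make these data consistent, we perform a projection from the finer grids onto the coarser ones, which modifies the values $u^{t_\ell+2\Delta t_\ell}_{\ell-1,i}$ corresponding to cells of $G^{t_\ell}_{\ell-1}$ whose indices $i$ satisfy $\{2i, 2(i-1), 2(i+1)\} \cap G^{t_\ell}_\ell \neq \emptyset$. To obtain the correction to be applied on the coarse grid, note that the following relation holds:
\[
u^{t_\ell+2\Delta t_\ell}_{\ell,j} = u^{t_\ell}_{\ell,j} - \frac{\Delta t_\ell}{\Delta x_\ell}
\left( \left(\hat f^{t_\ell}_{\ell,j+\frac12} + \hat f^{t_\ell+\Delta t_\ell}_{\ell,j+\frac12}\right)
- \left(\hat f^{t_\ell}_{\ell,j-\frac12} + \hat f^{t_\ell+\Delta t_\ell}_{\ell,j-\frac12}\right) \right),
\tag{18}
\]
which implies, taking $j = 2i, 2i+1$,
\[
\frac{u^{t_\ell+2\Delta t_\ell}_{\ell,2i} + u^{t_\ell+2\Delta t_\ell}_{\ell,2i+1}}{2}
= \frac{u^{t_\ell}_{\ell,2i} + u^{t_\ell}_{\ell,2i+1}}{2}
- \frac{\Delta t_{\ell-1}}{\Delta x_{\ell-1}}
\left( \hat{\hat f}^{t_\ell}_{\ell-1,i+\frac12} - \hat{\hat f}^{t_\ell}_{\ell-1,i-\frac12} \right),
\tag{19}
\]
where we have defined
\[
\hat{\hat f}^{t_\ell}_{\ell-1,i+\frac12} = \frac{\hat f^{t_\ell}_{\ell,2i+\frac32} + \hat f^{t_\ell+\Delta t_\ell}_{\ell,2i+\frac32}}{2},
\qquad
\hat{\hat f}^{t_\ell}_{\ell-1,i-\frac12} = \frac{\hat f^{t_\ell}_{\ell,2i-\frac12} + \hat f^{t_\ell+\Delta t_\ell}_{\ell,2i-\frac12}}{2}.
\tag{20}
\]
Therefore, if we redefine the numerical flux according to
\[
\hat f^{t_\ell}_{\ell-1,i\pm\frac12} = \hat{\hat f}^{t_\ell}_{\ell-1,i\pm\frac12},
\tag{21}
\]
and assume that at time $t_\ell$ the relation
\[
u^{t_\ell}_{\ell-1,i} = \frac{u^{t_\ell}_{\ell,2i} + u^{t_\ell}_{\ell,2i+1}}{2}
\tag{22}
\]
holds, then, with the correction (21), the same relation holds at time $t_\ell + 2\Delta t_\ell = t_{\ell-1} + \Delta t_{\ell-1}$:
\[
u^{t_\ell+2\Delta t_\ell}_{\ell-1,i} = \frac{u^{t_\ell+2\Delta t_\ell}_{\ell,2i} + u^{t_\ell+2\Delta t_\ell}_{\ell,2i+1}}{2}.
\tag{23}
\]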
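The conservation argument of (18)-(23) can be checked numerically with a short sketch. It assumes, for illustration only, that the fluxes are stored per interface (entry `m` of a flux list is the flux at the fine interface located at $x = m\,\Delta x_\ell$, so that coarse interface `k` coincides with fine interface `2k`) and that the two fine steps are written exactly as in (18). All names are ours.

```python
def advance_fine(u, F0, F1, dt, dx):
    """Two fine steps written as in (18); F0, F1 are fine fluxes at t and t + dt."""
    return [u[j] - dt / dx * ((F0[j + 1] + F1[j + 1]) - (F0[j] + F1[j]))
            for j in range(len(u))]


def corrected_coarse_fluxes(F0, F1):
    """(20)-(21): time average of the fine fluxes at the shared interfaces."""
    n_coarse_interfaces = (len(F0) - 1) // 2 + 1
    return [0.5 * (F0[2 * k] + F1[2 * k]) for k in range(n_coarse_interfaces)]


def advance_coarse(U, F, dt, dx):
    """One conservative coarse step with interface fluxes F."""
    return [U[i] - dt / dx * (F[i + 1] - F[i]) for i in range(len(U))]


def restrict(u):
    """Average of the two fine cells contained in each coarse cell, as in (22)."""
    return [0.5 * (u[2 * i] + u[2 * i + 1]) for i in range(len(u) // 2)]


# arbitrary fine data and fluxes: the identity (23) does not depend on them
u_fine = [float((j * j) % 7) for j in range(8)]
F0 = [0.1 * m for m in range(9)]
F1 = [0.3 * m * m for m in range(9)]
dt_fine, dx_fine = 0.01, 0.125

u_fine_new = advance_fine(u_fine, F0, F1, dt_fine, dx_fine)
U_new = advance_coarse(restrict(u_fine), corrected_coarse_fluxes(F0, F1),
                       2 * dt_fine, 2 * dx_fine)
mismatch = max(abs(a - b) for a, b in zip(U_new, restrict(u_fine_new)))
```

The value of `mismatch` is at rounding level, which is the discrete statement of (23).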

The substitutions made above are meaningful because the cell interfaces of the two levels $\ell$ and $\ell-1$ coincide, since $x^\ell_{2i+\frac32} = x^{\ell-1}_{i+\frac12}$ and $x^\ell_{2i-\frac12} = x^{\ell-1}_{i-\frac12}$. We perform this correction each time a grid is integrated over one time step of the immediately coarser grid.

Finally we explain the adaptation process, which is needed so that the grids can evolve in time as the numerical solutions do. The goal is to ensure that the zones where there are discontinuities or other non-smooth structures are refined up to the appropriate level. Our proposal consists in including in the grid of a given level the cells whose values of the numerical solution cannot be accurately predicted from the previous level. More explicitly, if $x^\ell_j \in G^t_\ell$ and $I(u^t_{\ell-1}, x)$ is an interpolation operator that acts on the data $u^t_{\ell-1} = \{u^t_{\ell-1,i}\}_{i \in G^t_{\ell-1}}$, then the cell defined by $x^\ell_j$ is included in the refined grid if
\[
\left| u^t_{\ell,j} - I(u^t_{\ell-1}, x^\ell_j) \right| > \tau_p > 0,
\tag{24}
\]
and we make sure that the resulting grid is formed by subdivision of coarse cells, flagging the appropriate cells. In addition, we also include a cell in the refined grid if the modulus of the discrete gradient of the solution, computed on the coarse grid, exceeds a certain tolerance, so that the formation of shock waves from smooth data can be detected. For the discrete gradient we use the approximation
\[
\left| \frac{\partial u}{\partial x}(x^{\ell-1}_j, t) \right| \approx
\frac{\max\left( \left| u^t_{\ell-1,j+1} - u^t_{\ell-1,j} \right|, \left| u^t_{\ell-1,j} - u^t_{\ell-1,j-1} \right| \right)}{\Delta x_\ell}.
\tag{25}
\]
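The two flagging criteria (24) and (25) can be sketched as follows. Several choices here are assumptions made only for illustration: the interpolation operator $I$ is taken to be a linear interpolation from the coarse cell averages (the text does not fix it), the coarse grid is assumed to cover the whole domain, the current fine grid is stored as a dictionary from fine index to value, and a second tolerance `tau_g` is introduced for the gradient sensor. Boundary cells use a clamped neighbor, which is a simplification.

```python
def interpolate(U, j):
    """Hypothetical operator I: linear interpolation of coarse data U at fine cell j."""
    i = j // 2
    left = U[max(i - 1, 0)]
    right = U[min(i + 1, len(U) - 1)]
    # the fine cell centers lie a quarter of a coarse cell away from the coarse center
    sign = -1.0 if j % 2 == 0 else 1.0
    return U[i] + sign * (right - left) / 8.0


def flag_cells(u_fine, U, dx_fine, tau_p, tau_g):
    """Return the set of fine indices to be included in the refined grid."""
    flagged = set()
    # criterion (24): fine values that are badly predicted from the coarse grid
    for j, value in u_fine.items():
        if abs(value - interpolate(U, j)) > tau_p:
            flagged.add(j)
    # criterion (25): large discrete gradient on the coarse grid
    for i in range(len(U)):
        jump = max(abs(U[min(i + 1, len(U) - 1)] - U[i]),
                   abs(U[i] - U[max(i - 1, 0)]))
        if jump / dx_fine > tau_g:
            flagged.update((2 * i, 2 * i + 1))   # both children of coarse cell i
    return flagged


# a step in the coarse data flags the fine cells around it
step_flags = flag_cells({}, [0.0, 0.0, 0.0, 1.0, 1.0, 1.0], 1.0 / 12.0, 0.1, 1.0)
```

Flagging both children of a coarse cell in the gradient test keeps the refined grid a union of subdivided coarse cells.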

Finally we add the cells corresponding to a neighborhood whose size is at least one cell of the coarse grid. This allows us to adapt the grids after each iteration of the coarse grid, instead of after each iteration of the fine grid, since these extra cells cover the distance that a discontinuity can travel in one time step of the coarse grid, so that it cannot escape from the fine grid. We also point out that the adaptation process is carried out acting first on the finest grids, in order to ensure that $\Omega_\ell(G^t_\ell) \subseteq \Omega_{\ell-1}(G^t_{\ell-1})$ holds at all times. We also enforce the inclusion $\Omega_\ell(G^t_\ell) \supseteq \Omega_{\ell+1}(G^t_{\ell+1})$, so that the grid hierarchy satisfies the desired inclusions. These make it possible to compute the data needed on the bands defined around the flagged cells, discussed above, since we have $\tilde\Omega_\ell(G^t_\ell) \subseteq \tilde\Omega_{\ell-1}(G^t_{\ell-1})$. Once the new grid $\hat G^t_\ell$ has been computed, which satisfies $\Omega_\ell(\hat G^t_\ell) \subseteq \Omega_{\ell-1}(G^t_{\ell-1})$, we compute
\[
\hat u^t_{\ell,j} =
\begin{cases}
I(u^t_{\ell-1}, x^\ell_j) & \text{if } j \in \hat G^t_\ell \setminus G^t_\ell,\\
u^t_{\ell,j} & \text{if } j \in G^t_\ell,
\end{cases}
\tag{26}
\]
that is, the data are interpolated for the cells that are not in $G^t_\ell$. The refined grid is therefore defined by $(\hat G^t_\ell, \hat u^t_\ell)$.

The algorithm described in this section can be implemented by means of the pseudo-code in Fig. 1. The integration of the whole grid hierarchy over a time step $\Delta t_0$, as described above, is achieved with the call actualitzar(G, 0). The call projectar(Gℓ) updates the solution on the grid $G_{\ell-1}$ according to the correction (21), and the call adaptar(Gℓ+1) performs the adaptation process described above.

Function actualitzar(G: grid hierarchy, ℓ: integer)
    integrar(Gℓ)
    if (ℓ < L − 1)
        for k = 1 to 2
            actualitzar(G, ℓ + 1)
        end for
        projectar(Gℓ+1)
        adaptar(Gℓ+1)
    end if
end function

Figure 1: A recursive formulation of the AMR algorithm.
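The recursion of Fig. 1 can be traced with a few lines of code. In this sketch the operations `integrar`, `projectar` and `adaptar` are not implemented; the function only records the order in which they would be called, which is enough to check that level ℓ is integrated $2^\ell$ times per coarse time step. The `log` argument is our own addition.

```python
def actualitzar(L, level, log):
    """Record the sequence of operations of Fig. 1 for a hierarchy with L levels."""
    log.append(("integrar", level))
    if level < L - 1:
        for _ in range(2):                 # two fine steps per coarse step
            actualitzar(L, level + 1, log)
        log.append(("projectar", level + 1))
        log.append(("adaptar", level + 1))
    return log


trace = actualitzar(3, 0, [])
integrations = [sum(1 for op, lev in trace if op == "integrar" and lev == l)
                for l in range(3)]
```

For three levels the trace starts with the integration of levels 0, 1, 2, 2, followed by the projection and adaptation of level 2, which matches the two sequencing criteria stated for the integration.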

Implementation and parallelization of the algorithm

In this section we describe the practical implementation of the algorithm for the two-dimensional case, making a clear distinction between the sequential implementation, which is basically an extension to two dimensions of the algorithms described in the two previous sections, and the parallel implementation. We consider the problem
\[
\begin{aligned}
u_t(x,y,t) + f(u(x,y,t))_x + g(u(x,y,t))_y &= 0, & (x,y,t) &\in \Omega \times [0,T],\\
u(x,y,0) &= u_0(x,y), & (x,y) &\in \Omega,
\end{aligned}
\tag{27}
\]
where we have assumed $\Omega = [0,1]^2$ for simplicity. In this case we have, for each refinement level $\ell$, a discretization
\[
x^\ell_{i,j} := (x^\ell_i, y^\ell_j) = \left( \left(i + \tfrac12\right) \Delta x_\ell, \left(j + \tfrac12\right) \Delta y_\ell \right),
\qquad 0 \le i < N^x_\ell, \quad 0 \le j < N^y_\ell,
\tag{28}
\]
where, given positive integers $N^x_0$ and $N^y_0$, we have defined
\[
N^x_\ell = 2^\ell N^x_0, \qquad N^y_\ell = 2^\ell N^y_0, \qquad
\Delta x_\ell = \frac{1}{N^x_\ell}, \qquad \Delta y_\ell = \frac{1}{N^y_\ell}, \qquad 1 \le \ell < L.
\]
These points define cells
\[
c^\ell_{i,j} = \left[ x^\ell_i - \frac{\Delta x_\ell}{2}, x^\ell_i + \frac{\Delta x_\ell}{2} \right]
\times \left[ y^\ell_j - \frac{\Delta y_\ell}{2}, y^\ell_j + \frac{\Delta y_\ell}{2} \right].
\tag{29}
\]

We consider grid hierarchies where each grid $G_\ell$, corresponding to refinement level $\ell$, is defined as a subset of the discretization corresponding to that level, organized as a set of rectangular pieces of grid, which we call patches and denote by $\{G_{\ell,k}, 1 \le k \le K_\ell\}$. We build the grids and the grid hierarchies so that the following conditions hold:

• each patch is a subset of $\{c^\ell_{i,j} : 0 \le i < N^x_\ell, 0 \le j < N^y_\ell\}$,

• $\Omega(G_{\ell,k})$ is a rectangle for all $k$, $1 \le k \le K_\ell$,

• $\mathring\Omega(G_{\ell,k_1}) \cap \mathring\Omega(G_{\ell,k_2}) = \emptyset$ if $k_1 \neq k_2$ (patches can only intersect each other at their boundaries),

• $\Omega(G_{\ell,k}) \subseteq \bigcup_{k=1}^{K_{\ell-1}} \Omega(G_{\ell-1,k})$ (the grid of a level is contained in the grid of the immediately previous level),

• if $c^\ell_{i,j} \in G_{\ell,k}$ is such that $\mathring c^\ell_{i,j} \cap \mathring c^{\ell-1}_{p,q} \neq \emptyset$ for some $c^{\ell-1}_{p,q} \in G_{\ell-1}$, then $c^{\ell-1}_{p,q} \subseteq \Omega(G_{\ell,k})$ (grids are obtained by subdivision of cells of the immediately previous grid).

We enlarge each of the patches that make up a grid with auxiliary cells around it, so that the patches can be integrated separately. The data corresponding to these auxiliary cells are obtained analogously to the one-dimensional case, either by interpolating (in space, or in space and time) from data of the previous grid, or by copying data from other patches, since the auxiliary cells of a patch may correspond to interior cells of another one.

The adaptation of the grid hierarchy is performed similarly to the one-dimensional case described in the previous section. We include in the refined grid the cells that satisfy a condition of the type
\[
\left| U^\ell_{i,j} - I(U^{\ell-1}, x^\ell_{i,j}) \right| > \tau_p > 0,
\tag{30}
\]
where $U^\ell_{i,j}$ is the numerical solution corresponding to node $(i,j)$ of the grid of level $\ell$ and $I$ is the tensor product extension of a one-dimensional interpolation operator. In addition, we also include the cells that result from the subdivision of cells of the immediately coarser grid where the gradient sensor
\[
\max\left(
\frac{\max\left( \left| U^{\ell-1}_{i+1,j} - U^{\ell-1}_{i,j} \right|, \left| U^{\ell-1}_{i,j} - U^{\ell-1}_{i-1,j} \right| \right)}{\Delta x_\ell},
\frac{\max\left( \left| U^{\ell-1}_{i,j+1} - U^{\ell-1}_{i,j} \right|, \left| U^{\ell-1}_{i,j} - U^{\ell-1}_{i,j-1} \right| \right)}{\Delta y_\ell}
\right)
\tag{31}
\]
exceeds a certain tolerance. Finally we add the cells corresponding to at least one cell of the previous grid around each flagged cell, to make sure that discontinuities do not leave the fine grid during one time step of the coarse grid.

Starting from the flagged cells, we perform a clustering process that groups them into rectangular patches. The process consists in finding the minimal patch that contains all the flagged cells, and comparing the percentage of flagged cells that it contains with a preset tolerance. If the percentage is smaller than the tolerance, the patch is divided into pieces and the process is repeated for each of the resulting patches. To ensure that the grids are nested, we refine the finer grids before the coarser ones. In this way, when a grid of level $\ell$ is refined, the inclusion of the finer grid, which has already been refined, can be ensured simply by including in the refined grid of level $\ell$ the cells that correspond to cells of the grid of level $\ell+1$. Finally, to provide the refined grid with a numerical solution, we proceed as in the one-dimensional case, copying data from the unrefined grid wherever both grids coincide and interpolating from the previous grid in the remaining cells.

The integration of each patch can be performed separately, as mentioned above. The organization of the integrations of the different patches and grids is similar to the one-dimensional case. We only mention that in the two-dimensional case the time step for the coarsest grid is defined so that a CFL condition holds, while the remaining time steps are defined in the same way as in the one-dimensional case.
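The clustering of flagged cells into rectangular patches described above (smallest enclosing patch, comparison of the fraction of flagged cells with a tolerance, subdivision and repetition) can be sketched as follows. The text does not specify how a patch is divided; splitting the enclosing box in two halves along its longest side is an assumption of this sketch, as are the function name and the default tolerance.

```python
def cluster(flagged, min_fill=0.7):
    """Group flagged cells (i, j) into patches (i0, i1, j0, j1), bounds inclusive."""
    if not flagged:
        return []
    i0 = min(i for i, _ in flagged)
    i1 = max(i for i, _ in flagged)
    j0 = min(j for _, j in flagged)
    j1 = max(j for _, j in flagged)
    area = (i1 - i0 + 1) * (j1 - j0 + 1)
    # accept the smallest enclosing patch if it is filled well enough
    if len(flagged) / area >= min_fill:
        return [(i0, i1, j0, j1)]
    # otherwise split along the longest side (assumed rule) and recurse
    if i1 - i0 >= j1 - j0:
        mid = (i0 + i1) // 2
        parts = ({c for c in flagged if c[0] <= mid},
                 {c for c in flagged if c[0] > mid})
    else:
        mid = (j0 + j1) // 2
        parts = ({c for c in flagged if c[1] <= mid},
                 {c for c in flagged if c[1] > mid})
    return [patch for part in parts for patch in cluster(part, min_fill)]


# two separated 2x2 blocks of flagged cells end up in two patches
blocks = {(0, 0), (0, 1), (1, 0), (1, 1), (6, 6), (6, 7), (7, 6), (7, 7)}
patches = cluster(blocks)
```

Each half contains at least one flagged cell and is strictly smaller than its parent, so the recursion terminates for any tolerance not larger than one.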
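As for the projection, with which we conclude the part devoted to the sequential implementation, note that, given the discretization defined above, the nodes where the numerical fluxes are computed on two grids of different levels do not coincide (see Fig. 2), although they are located on coincident cell interfaces. In order to correct the numerical flux of the coarse grid from the flux of the fine grid, we compute interpolated values from the values of the fine grid and use these values for the correction of the coarse grid. By an analysis similar to that of the previous section, we observe that the integration of a node $x^\ell_{2i,2j} \in G_\ell$ from time $t$ to time $t + \Delta t_{\ell-1}$ can be expressed as
\[
\begin{aligned}
u^{t+2\Delta t_\ell}_{\ell,2i,2j} = u^t_{\ell,2i,2j}
&- \frac{\Delta t_\ell}{\Delta x_\ell} \left( \left( \hat f^{RK3,t}_{\ell,2i+\frac12,2j} + \hat f^{RK3,t+\Delta t_\ell}_{\ell,2i+\frac12,2j} \right) - \left( \hat f^{RK3,t+\Delta t_\ell}_{\ell,2i-\frac12,2j} + \hat f^{RK3,t}_{\ell,2i-\frac12,2j} \right) \right)\\
&- \frac{\Delta t_\ell}{\Delta y_\ell} \left( \left( \hat g^{RK3,t}_{\ell,2i,2j+\frac12} + \hat g^{RK3,t+\Delta t_\ell}_{\ell,2i,2j+\frac12} \right) - \left( \hat g^{RK3,t+\Delta t_\ell}_{\ell,2i,2j-\frac12} + \hat g^{RK3,t}_{\ell,2i,2j-\frac12} \right) \right),
\end{aligned}
\]
with analogous expressions for $u^{t+2\Delta t_\ell}_{\ell,2i+1,2j}$, $u^{t+2\Delta t_\ell}_{\ell,2i,2j+1}$ and $u^{t+2\Delta t_\ell}_{\ell,2i+1,2j+1}$. If we now define, for $-1 \le i \le N^x_\ell$ and $0 \le j \le N^y_\ell$,
\[
\hat{\hat f}^t_{\ell-1,i+\frac12,j} =
\frac{ \hat f^{RK3,t}_{\ell,2i+\frac32,2j} + \hat f^{RK3,t+\Delta t_\ell}_{\ell,2i+\frac32,2j}
+ \hat f^{RK3,t}_{\ell,2i+\frac32,2j+1} + \hat f^{RK3,t+\Delta t_\ell}_{\ell,2i+\frac32,2j+1} }{4},
\tag{32}
\]
and, for $0 \le i \le N^x_\ell$ and $-1 \le j \le N^y_\ell$,
\[
\hat{\hat g}^t_{\ell-1,i,j+\frac12} =
\frac{ \hat g^{RK3,t}_{\ell,2i,2j+\frac32} + \hat g^{RK3,t+\Delta t_\ell}_{\ell,2i,2j+\frac32}
+ \hat g^{RK3,t}_{\ell,2i+1,2j+\frac32} + \hat g^{RK3,t+\Delta t_\ell}_{\ell,2i+1,2j+\frac32} }{4},
\tag{33}
\]
then we have
\[
\begin{aligned}
\frac{ u^{t+2\Delta t_\ell}_{\ell,2i,2j} + u^{t+2\Delta t_\ell}_{\ell,2i+1,2j} + u^{t+2\Delta t_\ell}_{\ell,2i,2j+1} + u^{t+2\Delta t_\ell}_{\ell,2i+1,2j+1} }{4}
={}& \frac{ u^t_{\ell,2i,2j} + u^t_{\ell,2i+1,2j} + u^t_{\ell,2i,2j+1} + u^t_{\ell,2i+1,2j+1} }{4}\\
&- \frac{\Delta t_\ell}{\Delta x_\ell} \left( \hat{\hat f}^t_{\ell-1,i+\frac12,j} - \hat{\hat f}^t_{\ell-1,i-\frac12,j} \right)
- \frac{\Delta t_\ell}{\Delta y_\ell} \left( \hat{\hat g}^t_{\ell-1,i,j+\frac12} - \hat{\hat g}^t_{\ell-1,i,j-\frac12} \right).
\end{aligned}
\tag{34}
\]
Therefore we apply, on the coarse grid, the corrections
\[
\begin{aligned}
\hat f^t_{\ell-1,i+\frac12,j} &= \hat{\hat f}^t_{\ell-1,i+\frac12,j}, & -1 \le i \le N^x_{\ell-1}, &\quad 0 \le j \le N^y_{\ell-1} - 1,\\
\hat g^t_{\ell-1,i,j+\frac12} &= \hat{\hat g}^t_{\ell-1,i,j+\frac12}, & 0 \le i \le N^x_{\ell-1} - 1, &\quad -1 \le j \le N^y_{\ell-1}.
\end{aligned}
\tag{35}
\]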

With these corrections, if we assume that the relation
\[
u^t_{\ell-1,i,j} = \frac{ u^t_{\ell,2i,2j} + u^t_{\ell,2i+1,2j} + u^t_{\ell,2i,2j+1} + u^t_{\ell,2i+1,2j+1} }{4}
\tag{36}
\]
holds at time $t$, then the same relation holds at the next time $t + \Delta t_{\ell-1}$.

[Figure 2 shows a coarse node $x^{\ell-1}_{i,j}$ surrounded by the four fine nodes $x^\ell_{2i,2j}$, $x^\ell_{2i+1,2j}$, $x^\ell_{2i,2j+1}$ and $x^\ell_{2i+1,2j+1}$.]

Figure 2: Relative location of the nodes and of the points where numerical fluxes are computed for two grids of consecutive resolution levels. The nodes of the coarse grid are shown as black circles, and the fine ones as black squares. The points where numerical fluxes are computed are shown as white circles for the coarse grid and as white squares for the fine grid.

We next summarize the parallel implementation of the algorithm. The typical way of parallelizing AMR-based algorithms consists in dividing the domain into pieces and assigning to each processor the patches, of all levels, that correspond to each piece, exchanging data between processors when necessary. The problem with this methodology is that the existence of regions of small size but with a high degree of refinement complicates the design of an efficient partitioning strategy, which ideally should:

• balance the workloads of the processors, and

• minimize the data transfers between the different processors.

To achieve these goals, the most common practice is to use space filling curves (SFC) [151]. These are continuous curves that pass through all the points of the square $[0,1]^2$, and they are built iteratively, so that at each step $k$ of the construction a bijective map can be established between the two sets $\{1, \dots, K\}^2$ and $\{1, \dots, K^2\}$, where $K$ depends on the step considered in the construction and on the curve used. In this way, a sequential index can be assigned to the pieces into which the domain $[0,1]^2$ has been divided. The advantage of SFCs is that they try to assign close indices to close pieces, so that if the pieces are assigned to the processors following the order given by the SFC, nearby pieces will tend to be assigned to the same processor. Since data exchange is only necessary between adjacent pieces, this reduces the data transfer between processors [135, 31]. In this work we have used the Peano-Hilbert curves [136, 69], which at the $k$-th step of the construction divide the square $[0,1]^2$ into $4^k$ pieces.

The division of the domain into pieces is performed by looking at the coarsest grid and assigning to each cell a cost proportional to the number of integrations that have to be performed in order to integrate that cell, and all the cells of finer levels into which it has been subdivided, during one time step of the coarsest grid. Thus, if the cell has been refined up to a level $\ell \ge 0$, it is assigned a cost equal to
\[
\sum_{l=0}^{\ell} 2^{3l} = \frac{2^{3\ell+3} - 1}{7}.
\]
Each piece is given a cost equal to the sum of the costs of its cells. The algorithm that we have used to distribute the computations among the available processors works as follows. First, the number $k_{max}$ of divisions that can be performed in each dimension, so that each resulting subdomain contains an integer number of cells of the coarsest grid, is computed. Next, the number $k_{min}$ is computed, which corresponds to a step of the construction of the Peano-Hilbert curve that produces a number of subdomains $4^k \ge P$, where $P$ is the number of available processors. The cost corresponding to each piece is computed for the division given by $k = k_{min}$, and it is checked whether these pieces can be assigned to different processors so that the difference between the total cost assigned to a processor and the mean cost (obtained by dividing the total cost by the number of processors) is smaller than a certain quantity $\tau$. If this is the case, the division is accepted and the data are distributed. Otherwise, if $k \le k_{max}$, $k$ is increased by one and the process is repeated until a satisfactory load distribution is achieved. If such a division cannot be found for a value $k \le k_{max}$, the value of $\tau$ is increased and the process is repeated from the beginning. This process can be implemented as indicated in the pseudo-code of Fig. 3.
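The ingredients of the load balancing strategy described above can be illustrated with the following sketch. The index computation is the standard bitwise construction of the Hilbert curve on an $n \times n$ array of pieces ($n = 2^k$), and `cell_cost` evaluates the cost formula above. The function `partition` is only an assumed, simple greedy rule that cuts the curve into contiguous chunks of roughly equal cost; the text does not specify the assignment rule, and the acceptance test with the tolerance $\tau$ and the loop over $k$ are not reproduced here.

```python
def hilbert_index(n, x, y):
    """Position of piece (x, y) along the Hilbert curve on an n x n array, n = 2**k."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                      # rotate or flip the current quadrant
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def cell_cost(level):
    """Cost of a coarse cell refined up to `level`: sum of 2**(3*l), l = 0..level."""
    return (8 ** (level + 1) - 1) // 7


def partition(costs, P):
    """Assumed greedy rule: contiguous chunks of the curve, one per processor."""
    total = float(sum(costs))
    owner, accumulated, p = [], 0.0, 0
    for c in costs:
        # move on once the processors used so far have received their share
        if p < P - 1 and accumulated >= total * (p + 1) / P:
            p += 1
        owner.append(p)
        accumulated += c
    return owner


# order the pieces of a 4 x 4 division along the curve and split among 2 processors
n = 4
pieces = sorted(((x, y) for x in range(n) for y in range(n)),
                key=lambda c: hilbert_index(n, c[0], c[1]))
owners = partition([cell_cost(0)] * len(pieces), 2)
```

Because consecutive pieces along the curve are neighbors in the plane, each processor receives a connected chain of pieces.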

Function balance(g: grid hierarchy, P: integer, τ: real)
    Compute kmin and kmax
    do
        k = kmin
        do
            C = compute the Peano-Hilbert ordering for step k
            l = list of costs for C and g
            assign costs to the processors according to l, P and τ
            k = k + 1
        while (the cost assignment is not acceptable and k ≤ kmax)
        increase τ
    while (the cost assignment is not acceptable)
end function

Figure 3: Pseudo-code for load balancing between processors.

Numerical experiments

In this chapter we analyze the performance of the numerical method with several examples in one and two dimensions. The quantitative studies, such as the analysis of the numerical errors, have been carried out in the one-dimensional case, because it is easier to analyze isolated phenomena there than in the two-dimensional case. Moreover, the execution times are shorter, which allows us to run extensive tests in a reasonable time. With the two-dimensional examples we have shown the behavior of the method in complex situations, with a more qualitative methodology. We have selected a set of problems whose solutions exhibit a wide variety of phenomena, with the aim of showing that the algorithm performs well in the widest possible range of situations.

We have observed that, for the problems we have solved, both one-dimensional and two-dimensional, the method performs well, obtaining solutions of the same quality as those obtained with a fixed resolution grid when the parameters are set to adequate values. The performance of the algorithm depends on several factors, such as the complexity of the problem, the grid hierarchy used and the parameters, among others, and only experience can help in the choice of an adequate configuration.

In the one-dimensional case we have solved problems for the linear advection equation, the Burgers equation, two configurations for the Euler equations (Sod's shock tube problem and the interaction of a shock wave with entropy waves) and a problem for the multicomponent Euler equations. These problems have served to show how the error, understood as the difference between the solutions obtained with a fixed grid and with AMR, behaves with respect to the parameters, and the effect of each of the parts that make up the process of marking cells for refinement. We have also shown the importance of the projection of fluxes in the performance of the algorithm, and we have analyzed the computational cost of the algorithm, reaching the conclusion that the integration is by far the most expensive part, so that an estimate of the percentage of integrations to be performed by AMR with respect to the fixed grid algorithm can serve as an estimate of the percentage of time it will need.

In the two-dimensional case we have numerically solved several problems: a two-dimensional Riemann problem, where the initial states represent four shock waves; the problem known as double Mach reflection, in which a shock wave meets a ramp; the interaction of a shock wave with a vortex; and the interaction of a shock wave travelling through air with a helium bubble. In all cases the algorithm obtains solutions of the same quality as those obtained on a fixed grid or by other authors, at a lower computational cost.

Conclusions and future work

In this work we have described a numerical method for the solution of hyperbolic systems of conservation laws. The method results from the combination of a high order shock capturing method (built from the finite difference formulation of Shu and Osher, a fifth order WENO interpolation method, the flux splitting of Donat and Marquina and a third order Runge-Kutta algorithm) with the AMR technique developed by Berger and collaborators. We show how all these techniques can be put together to build a very efficient numerical method, and we describe the practical implementation of the algorithm in a sequential or parallel program.

We have checked the behavior of the algorithm with several numerical tests in one and two dimensions, which show that the method is able to obtain solutions of the same quality as those obtained without adaptation, but at a much lower computational cost. The large number of tests performed provides a good understanding of the properties of the algorithm, which can be useful in practice to obtain information about the potential gains it can provide, as well as about its behavior with respect to the parameters on which it depends. With the help of the experiments we have explained different aspects of the algorithm, in particular:

• the behavior of the adaptive method with respect to the same algorithm applied to a fixed resolution grid, in terms of the difference between the respective solutions,
• the influence that the refinement process has on the quality of the final result and on the performance obtained by the adaptive algorithm, and
• the importance of the projection of fluxes from finer grids to coarser grids in the algorithm.


Although the performance of the algorithm is satisfactory, we have detected several aspects, mainly related to the implementation, that could improve its efficiency in some cases. In particular, the overhead produced by AMR can be reduced by using fast search algorithms in the grid hierarchy, so that the cost of finding the connectivity between grids could be reduced. Grid connectivity is used in several processes within the algorithm, such as the computation of the numerical solution in the auxiliary cells of the patches, or the computation of the numerical solution of a grid after adaptation. Along the same lines, in some cases it could be useful to merge small patches into larger ones, whenever possible, and without reducing the percentage of marked cells present in each patch. This would reduce the number of auxiliary cells needed, and therefore the cost of the algorithm, in exchange for the cost of the merging process. The refinement criterion based on interpolation errors that we have used could be investigated further with the aim of reducing its dependence on the parameter $\tau_p$.

The most evident extension of the method described in this work is its implementation in three dimensions. Although the description of the algorithm in 3D follows easily from the two-dimensional case, and producing it might seem simple, the number of aspects and details that must be taken into account to write a code of this kind is so large that we are not currently considering its development. Instead, we intend to widen the range of application of the method to more general two-dimensional problems; in particular, we are preparing its application to balance laws, where a source term appears. It is also interesting to apply it to problems where the characteristic decomposition of the Jacobian matrices is not fully available in analytical form, such as traffic flow problems (see [45]). Another interesting case would be the combination of penalization methods [25] with adaptive mesh refinement.


Abstract

The numerical simulation of physical phenomena represented by nonlinear hyperbolic systems of conservation laws presents specific difficulties that are not present in other kinds of systems of partial differential equations. These are mainly due to the presence of discontinuities in the solution. State-of-the-art methods for the solution of such equations involve high resolution shock capturing schemes, which are able to produce sharp profiles at the discontinuities and high accuracy in smooth regions, together with some kind of grid adaptation, which reduces the computational cost by using finer grids near the discontinuities and coarser grids in smooth regions. The combination of both techniques presents intrinsic numerical and computational difficulties. In this work we present a method obtained by combining a high order shock capturing scheme, built from Shu-Osher’s conservative formulation, a fifth order weighted essentially non-oscillatory (WENO) interpolatory technique, Donat-Marquina’s flux-splitting method and a third order Runge-Kutta method, with the adaptive mesh refinement (AMR) technique of Berger and collaborators. We show how all these techniques can be merged to build a highly efficient numerical method, and we show how to parallelize such an algorithm. We also present a description of the AMR algorithm that is much more general than the descriptions currently found in the scientific literature, and that tries to approach the foundations of the algorithms that are described and implemented in practice. We test our implementation extensively to determine its extent of applicability and its relative benefits with respect to the non-adaptive algorithm.


1 Introduction

1.1 Motivation

The enormous growth of the processing power of modern computers allows scientists to go further in the simulation and analysis of physical problems that could not be tackled without their help. In particular, Computational Fluid Dynamics (CFD) is one of the fields that make extensive use of computer-aided numerical simulation. Some theoretical results, hypotheses and intuitions have been tested against numerical results obtained with the aid of computers. Many software packages have been developed to help scientists perform numerical simulations of fluid phenomena, and a great deal of knowledge about the fluid dynamics equations has been acquired in different contexts thanks to such software. CFD is nowadays applied to a wide and heterogeneous range of fields, such as the design of moving vehicles like aircraft, submarines, cars or satellites, traffic flow control, weather prediction, biomedical sciences, topological design, micro and nano-device cooling systems, environmental sciences and others (see e.g. [164, 126, 191, 3, 171, 53, 86, 133, 17, 129]).

The industry makes extensive use of in-house or professional software packages. It is important for industrial users to obtain results from their simulations as fast as possible and with the highest possible accuracy, in order to reduce their costs and production cycle times. The industry demands reliability, robustness and specificity in the tools used in the production processes. Cutting-edge CFD software packages, be they commercial like FLUENT [51], open-source like OpenFOAM [131] or in-house like the DLR-TAU code [156], implement many of the standard computational fluid dynamics numerical methods used nowadays by the industry. However, there is a big gap between the most modern numerical techniques developed in academic environments, such as universities and research centers, and their integration into engineering production processes. This is mainly due to the increasing complexity of such techniques, to the high costs involved in their implementation and integration with existing codes, and to the lack of confidence that the final efficiency obtained with more complex technologies will compensate for the investment made in their development. Practical problems involve several different steps that make use of various technologies, usually developed in different contexts, and advantage could be taken of the new developments obtained in each separate area.
For example, in the field of aeronautical optimal shape design [78, 68, 126], the goal is to define the shape of an object, say an airplane or a part of it, that travels through air or another fluid medium, in order to achieve a prescribed goal, like minimization of drag, maximization of lift or noise reduction. One approach to reach the desired goal is to start with an initial shape, and deform or modify it through an iterative optimization process, until a satisfactory solution is reached. The efficient solution to this problem involves, as leading processes, geometric shape definition, generation and deformation of computational grids, numerical solution of Partial Differential Equations (PDEs), geometric handling of surfaces and optimization. The efficient merging of technologies coming from these fields is an active field of research, as shown by the fact that the European Framework Programmes are supporting several projects related to the application of academic research to industrial problems, see e.g. [88]. In the United States considerable effort is also being made in the same direction, led by the NPARC Alliance [173], which concentrates efforts from partners such as Boeing, NASA and USAF.

With much more modest objectives, in this work we develop a numerical method based on some of the state-of-the-art numerical techniques present in the area of computational fluid dynamics, and we solve the main difficulties that appear when putting all those techniques together. We also show the way in which such a method can be parallelized and implemented in a computer program. Our implementation of the method has been written from scratch using ANSI C [76] and the Message Passing Interface (MPI) standard [123] for parallelization.

1.1.1 High resolution shock-capturing schemes

There is a (still open) debate about the relative advantages of high order (high resolution) and low order numerical methods for the numerical solution of hyperbolic systems of equations. Typically, low order methods are faster and easier to implement, but provide less accurate solutions. High order methods compute better numerical approximations, but at a higher computational cost per computational cell. Whether a high order method applied to a relatively coarse grid is more efficient than a lower order method applied to a finer grid, chosen so that both methods give solutions with the same accuracy, is a problem-dependent issue. It is widely accepted that high order methods are advantageous over low order schemes, although positions on both sides can be found in the literature, see e.g. [58, 80, 87, 110]. In particular, for time-dependent problems including flows with complicated structures, evidence shows that higher order methods outperform lower order ones [157, 95]. In real industrial applications high order methods are seldom used, mainly because of the implementation difficulties associated with those schemes. Obtaining a stable and robust code incorporating high resolution technologies requires a lot of development effort, and the industry continues to use the classical, well established codes and algorithms. However, there is an increasing demand for high fidelity tools for CFD simulation. Research has revealed some phenomena intrinsic to fluid mechanics that would need impractically fine meshes to be resolved with low order methods. Examples are the Rayleigh-Taylor [168, 172] and Richtmyer-Meshkov [146, 122] instabilities, acoustic waves generated by shock-vortex interactions [48] and turbulence [189].


1.1.2 Need of fine resolution computational grids

The numerical solution of partial differential equations can be obtained by means of discrete approximations to the continuous equation, obtained through a discretization process. The way in which one passes from the continuous equations to the discrete, finite-dimensional problem is determined by the choices made for each element that plays a role in the problem. Typically the computational domain is divided into cells, and the continuous equations are replaced by a discrete approximation at each cell. The discretization of the computational domain itself imposes a limit on the flow features that can be resolved. The numerical solution within a cell is often interpreted as an approximation to the average or point value of the true solution in that cell, which means that no method can resolve phenomena whose scale is smaller than the mesh size. The difference between numerical methods can be interpreted in terms of their relative ability to extract the information about the solution contained in a single computational cell. High order methods give better results than low order methods when applied to a given fixed grid, because they are able to better resolve the flow in a single cell, but to properly resolve small scale features it is necessary that the grid size be smaller than the scale of the phenomena to be resolved. To sum up, the optimal method would be a high order method applied on a very fine computational grid, but the requirements of such a method, both in storage and in computational power, would be far beyond what today’s technology can provide in a reasonable time.

1.1.3 AMR: spatial and temporal refinement

Accurate approximations of the true solution of the equations can be obtained wherever the solution is smooth enough using a relatively coarse mesh and low order methods. Small-scale features are often related to parts of the numerical solution where discontinuities appear. Fine grids are particularly helpful only in the parts of the solution which have non-smooth structure, or where the solution is rapidly changing. This idea led researchers to develop a variety of techniques in order to reduce the computational cost of the overall algorithm, mainly based on the use of non-uniform grids. These algorithms use a grid with cells of variable size, trying to use smaller cells in some regions of interest, while maintaining bigger cells in other regions where the solution is smooth. These grids are often difficult to manipulate in more than one space dimension, because the solution at a cell depends on the solution in some neighborhood around it. The use of cells of mixed size makes the computation of the solution at the next time step difficult, because of the variable number of neighbors with non-uniform sizes and relative locations.

Even more complex is the usage of unstructured grids. This approach is often used to model complex geometries. The lack of structure, added to the non-uniformity of the grid, makes these algorithms even harder to implement. Other methods allow the control points to move within a grid towards the region where high resolution is needed, such as penalty methods [27, 190]. These methods are efficient and robust when the small scale features are well separated and a sufficient number of grid points is used. For complex flows the grids can become badly behaved, because of high stretching, distortion, etc. One step further are meshless methods [20], where no discrete grid is used at all. A certain number of nodes is freely distributed in the computational domain. Node collapsing and bad behavior of the numerical methods on these structures are major problems for these methods, which are, on the other hand, very flexible.

Another important drawback of the previous approaches is stability. The Courant-Friedrichs-Lewy (CFL) condition [40] is a necessary condition for stability, and imposes an upper limit on the size of the time step, which depends on the size of the smallest cell. The smaller the smallest cell in the grid is, the higher the number of time steps necessary to compute the solution.
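To make the time step restriction concrete, the following sketch assumes a one-dimensional grid, a known maximum wave speed and the usual form of the CFL bound, $\Delta t \le C\,\Delta x_{\min}/\max|\lambda|$ with a Courant number $C \le 1$. The function names and the numbers are illustrative assumptions and are not taken from the thesis.

```python
import math

def cfl_time_step(cell_sizes, max_speed, courant=0.5):
    """Largest global time step allowed by the (assumed) CFL bound:
    dt = courant * min(dx) / max_speed."""
    return courant * min(cell_sizes) / max_speed

def steps_to_reach(final_time, cell_sizes, max_speed, courant=0.5):
    """Number of global time steps needed to reach `final_time`."""
    return math.ceil(final_time / cfl_time_step(cell_sizes, max_speed, courant))

# Unit interval covered by 64 equal cells ...
uniform = [1 / 64] * 64
# ... and the same interval where 4 coarse cells were replaced by 16 cells
# that are four times smaller (60/64 + 16/256 = 1).
mixed = [1 / 64] * 60 + [1 / 256] * 16

print(steps_to_reach(1.0, uniform, max_speed=1.0))
print(steps_to_reach(1.0, mixed, max_speed=1.0))
```

Refining only a small part of the grid multiplies the number of global time steps by four in this example, and every cell, coarse or fine, has to be updated at each of them.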
Adaptive Mesh Refinement (AMR) [21, 24, 22, 139] adds a new feature to this pool: temporal refinement. The goal of the AMR procedure is to perform as few cell updates as possible, rather than just to reduce the number of cells. It exploits the fact that cells of different sizes can be advanced in time with different time steps by splitting the cells into different grids with uniform grid size, which are integrated according to their corresponding time steps. The AMR approach is very general, and in practice a compromise between efficiency and complexity of the numerical algorithm has to be reached. In particular, in the final form of the algorithm that will be described in this work the underlying idea is to use a hierarchical set of Cartesian, uniform meshes that live at different resolution levels. At the coarsest level there is a set of coarse mesh patches covering the whole domain. Mesh patches at some resolution level are obtained by the subdivision of groups of immediately coarser cells according to a suitable refinement criterion. By repeating this subdivision procedure one can cover the regions of interest with mesh patches so that the non-smooth structure of the solution can be resolved with the desired resolution. The grids at different resolution levels co-exist, and some mesh connectivity information is needed to connect the solutions at different resolution levels. Given the connectivity information, each mesh patch can be viewed in isolation and can be integrated independently. The presence of discontinuities in a small part of the domain does not restrict the time step that can be used at the coarse grid. Note that, on the other hand, there is some redundancy in the solution, since grids that correspond to different resolutions can refer to the same spatial location.

Although the computational saving is much more spectacular in two or three space dimensions, a simple example in one space dimension illustrates the power of the AMR approach. Consider a computational domain composed of a coarse computational grid of 100 cells, whose grid size is $\Delta x$. At a certain point of the simulation an interval of 20 coarse cells is refined. From these coarse cells at level 0 we obtain a mesh patch at refinement level 1 of 40 cells of size $\Delta x/2$ each, obtained by the subdivision of each coarse cell into 2 sub-cells of equal size. Suppose that the process is repeated for an interval of 10 cells of the patch at level 1, obtaining a patch of 20 fine cells at level 2. If the time step imposed by the CFL condition at level 0 is $\Delta t$, then the time steps for levels 1 and 2 have to be taken, for stability, as $\Delta t/2$ and $\Delta t/4$ respectively, so the integration of the whole grid hierarchy from time $t$ to time $t + \Delta t$ requires 100 cell updates for level 0, 40 × 2 = 80 for level 1 and 20 × 4 = 80 for level 2, for a total of 260 cell updates. To perform the simulation on an equivalent fixed grid of 400 points of size $\Delta x/4$, a time step equal to $\Delta t/4$ is required, so 400 × 4 = 1600 cell updates have to be computed to integrate the solution from time $t$ to time $t + \Delta t$. Only 16.25% of that quantity is required by the AMR algorithm. As the integration of the solution is, by far, the most expensive part of the algorithm, especially for high order schemes, the overhead produced by the transfer of information between grids is small compared with the reduction of computational effort provided by the spatial and temporal refinement of the AMR algorithm. Other related algorithms, based on the idea of reducing the number of cell updates, are the local defect correction method [61], local multi-grid methods [28] and the Fast Adaptive Composite Grid (FAC) method [120].
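The count of cell updates in this example can be reproduced with a short script. It is only a sketch of the arithmetic above: the function names are hypothetical, a refinement factor of 2 between consecutive levels is assumed (as in the example), and the overhead of inter-grid transfers is ignored.

```python
def amr_cell_updates(cells_per_level, ratio=2):
    """Cell updates needed to advance the whole hierarchy by one coarse
    time step: level l holds cells_per_level[l] cells and takes ratio**l
    sub-steps."""
    return sum(n * ratio ** level for level, n in enumerate(cells_per_level))

def fixed_grid_cell_updates(coarse_cells, finest_level, ratio=2):
    """Same count for a single one-dimensional grid at the finest
    resolution: (number of cells) times (number of time steps)."""
    factor = ratio ** finest_level
    return (coarse_cells * factor) * factor

amr = amr_cell_updates([100, 40, 20])      # levels 0, 1 and 2 of the example
fixed = fixed_grid_cell_updates(100, 2)    # 400 cells, 4 time steps
print(amr, fixed, 100.0 * amr / fixed)     # 260 1600 16.25
```

In $d$ space dimensions the fixed grid count grows like `ratio ** ((d + 1) * finest_level)`, which is why the savings are larger in two and three dimensions.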


1.2 Previous work

Since the first descriptions of the AMR algorithm by Oliger’s group at Stanford University in 1982 [26, 21], much development work has been done on adaptive mesh refinement. The algorithm was established for hyperbolic equations by the paper of Berger and Oliger [24]. At the end of the 1980s some works implemented AMR for two-dimensional gas dynamics [22, 10]. Shortly afterwards, three-dimensional simulations were presented by Bell, Berger, Saltzman and Welcome [19]. These algorithms proved their efficiency in the experiments performed in the aforementioned papers, but implementing an AMR-based running code in more than one space dimension was (and still is) a Herculean task. In 1991 Quirk [139] simplified several parts of the (two-dimensional) algorithm, making it easier to implement, although some of the simplifications seem to be too restrictive. Nevertheless, this simplified version encouraged researchers to consider incorporating AMR variations into their research and computer programs. Nowadays several software packages include AMR infrastructures and templates to help others in the task of building their own AMR programs [116, 185]. Because of the time-refinement features of the AMR algorithm, explicit schemes for time integration seem to be better suited to AMR, but implicit schemes have also been used [4]. Solvers with order of accuracy higher than two were first combined with AMR in the last years of the twentieth century. The Piecewise Parabolic Method (PPM) of Colella and Woodward [39] is nowadays widely used in AMR codes [29], and efforts to build AMR codes including Essentially Non-Oscillatory (ENO) and Weighted Essentially Non-Oscillatory (WENO) methods are currently being made [106, 108, 15, 192].

1.3 Scope of the work

In this work we develop a numerical method for fluid dynamics that incorporates some of the most advanced techniques present in the literature. We describe the main algorithms and techniques that form the building blocks of the algorithm, and we address the difficulties that appear when putting them all together into a single algorithm. Some particular points of interest that have been investigated are: description of the AMR algorithm in wide generality, inter-grid conservation, refinement procedures, grid generation and handling, and implementation and parallelization of the algorithms. The building blocks of the algorithm are Shu-Osher’s conservative formulation [160, 161], a fifth order WENO interpolatory technique [81], Donat-Marquina’s flux-splitting method and a third order Runge-Kutta method [160], together with the AMR technique for mesh refinement. We have tested the resulting algorithm with several numerical experiments in order to investigate its behavior in different scenarios, and to what extent the algorithm can be applied to a particular problem. We have implemented a parallel version of the algorithm using the ANSI C language and the MPICH implementation [127] of the MPI standard for parallelization.

1.4 Organization of the text

The text is organized as follows. In chapter 2 we introduce the basic concepts and ideas of fluid dynamics, focusing on the Euler equations, which are one of our test models in the validation of the algorithm. The basics of numerical methods for fluid dynamics are introduced in chapter 3. We have tried to write a reasonably self-contained text, so as to make it easier to read for non-experts in the area. In chapters 4 and 5 we describe the basic building blocks of our algorithm. The flow solver is described in chapter 4. It consists mainly of three parts. In section 4.1 Shu-Osher’s finite difference approach is described. Conceptually, it represents a simplification of the finite-volume framework, but the combination of algorithms based on it with the AMR algorithm presents particular difficulties. In section 4.2 the basic flow solver, based on Donat-Marquina’s flux splitting, is described. It represents an extension of Shu-Osher’s algorithm to nonlinear systems and has proven to be a robust and efficient algorithm. The weighted essentially non-oscillatory (WENO) reconstruction procedure of Jiang and Shu is described in section 4.3. This is the reconstruction used within Donat-Marquina’s algorithm to achieve fifth order accuracy. Finally, the AMR algorithm used in our implementation is described in chapter 5. A more general AMR algorithm is described in appendix A. At the end of the appendix, the algorithm constructed in chapter 5 is described as a particular case. In chapter 6 we describe our implementation of the algorithm in parallel using message passing. Chapter 7 is devoted to the numerical validation of the algorithm. We have run several test cases in one and two dimensions, for scalar equations and systems, in order to define the extent of applicability of the algorithm. Conclusions and future research lines arising from this work are pointed out in chapter 8. Preliminary work related to this thesis can be found in [13, 14, 15, 128].


2 Fluid dynamics equations

In this chapter we review some basic facts about hyperbolic conservation laws, focusing on fluid dynamics equations, and more precisely on the Euler equations of gas dynamics. We will review the structural properties of such equations and their solutions, with the aim of deriving information that has to be taken into account when building numerical methods for their solution. Other well known model equations used in the numerical experiments will also be presented here.

There are many sources of information about hyperbolic conservation laws and fluid mechanics. The classic, essential book of Landau and Lifshitz [93] is the main reference in this field. More recent works, such as the books of Batchelor [16], Chorin and Marsden [36] and Dafermos [41], are good references. The work of Lax [100] is also a must. Another classical text is Lamb’s book [91], whose first edition dates back to 1879. This book was written at the time when the exciting works of Lord Rayleigh [168, collected in [169], pp. 200–207], Lord Kelvin [174] and Helmholtz [184], among others, appeared, and it reflects what was, at that time, a new conception of fluid mechanics. Other interesting references are [90, 119, 188].

2.1 Hyperbolic conservation laws

In physics, a conservation law states that a particular measurable property of an isolated physical system does not change as the system evolves in time. Conservation laws are often modeled by means of integral equations, but in practice these are represented by systems of partial differential equations that are equivalent to the integral formulation for smooth solutions. Hyperbolic conservation laws are a class of hyperbolic partial differential equations of special interest in fluid dynamics, since the most important models of fluid motion are represented by equations of this type. A leading characteristic of hyperbolic conservation laws is that they admit discontinuous solutions. This fact, together with the finite speed of propagation of information, is the main reason why numerical methods specific to hyperbolic conservation laws have to be developed. The goal of this work is to develop and test a numerical method to solve hyperbolic systems of conservation laws using quite sophisticated technology. Why such elaborate methods need to be developed has been explained in chapter 1, and will be assessed throughout the rest of the work.

A vast literature on numerical methods specially designed for hyperbolic conservation laws has been produced in recent years, see e.g. [175, 6, 105, 70, 71]. There are several reasons for this interest in the study of hyperbolic PDEs in general, and hyperbolic conservation laws in particular, separately from other types of PDEs. The main reasons are the following:

1. Many engineering and industrial problems involve conservation of some quantities. In particular, the equations of fluid mechanics reduce to the Euler equations when the effects of viscosity and heat conduction are neglected. The Euler equations are one of the best known hyperbolic systems of conservation laws.
2. The structure and properties of the solutions of hyperbolic conservation laws are particular to them, and must be treated carefully in order to devise convergent numerical methods.
3. Some non-hyperbolic systems can be numerically solved by adding a trivial discretization of the non-hyperbolic terms to a hyperbolic solver.

Conservation laws are represented by a system of partial differential equations with a particular structure:

$$\frac{\partial u}{\partial t} + \sum_{q=1}^{d} \frac{\partial f^q(u)}{\partial x_q} = 0, \qquad x \in \mathbb{R}^d, \quad t \in \mathbb{R}^+, \tag{2.1}$$

where $d$ is the number of spatial dimensions, $u : \mathbb{R}^d \times \mathbb{R}^+ \to \mathbb{R}^m$ is the solution of the conservation law, formed by the so-called conserved variables, and $f^q : \mathbb{R}^m \to \mathbb{R}^m$ are the flux functions. The number $m$ of conserved quantities is problem dependent. The particular case $m = 1$ is often referred to as a scalar conservation law.

Conservation laws often come from an integral equation representing the conservation of a certain quantity, whose density is represented by $u$. Conservation means that the amount of that quantity contained in a given volume can only change due to the flux crossing the interfaces of the volume. For a cube in $\mathbb{R}^d$ and a time interval $[t_1, t_2]$:

\[
\int_{c} \bigl(u(x, t_2) - u(x, t_1)\bigr)\,dx = \sum_{q=1}^{d} \int_{t_1}^{t_2} \left[ \int_{\bar c_q^1} f^q\bigl(u(x_1, \dots, x_{q-1}, x_q^1, x_{q+1}, \dots, x_d, t)\bigr)\,d\bar x_q - \int_{\bar c_q^2} f^q\bigl(u(x_1, \dots, x_{q-1}, x_q^2, x_{q+1}, \dots, x_d, t)\bigr)\,d\bar x_q \right] dt, \tag{2.2}
\]

where $c = [x_1^1, x_1^2] \times \dots \times [x_d^1, x_d^2] \subseteq \mathbb{R}^d$ is an arbitrary cell, $\bar x_q = (x_1, \dots, x_{q-1}, x_{q+1}, \dots, x_d)$ and $\bar c_q^i = [x_1^1, x_1^2] \times \dots \times [x_{q-1}^1, x_{q-1}^2] \times \{x_q^i\} \times [x_{q+1}^1, x_{q+1}^2] \times \dots \times [x_d^1, x_d^2]$ is a cell interface (a $(d-1)$-dimensional face of $c$), for $1 \le q \le d$ and $i = 1, 2$. The integral form is much more general than the differential form (2.1); the latter implies the former, but the converse is only true for


smooth functions. In practice the solution $u$ is not smooth in general, and only the integral form is valid in this case. Fortunately, a mixed formulation, where the differential form is used wherever $u$ and $f^q(u)$ are smooth, and additional conditions are given for the zones where discontinuities appear, can still be used (see section 2.2).

The problem statement is usually to solve a Cauchy problem, i.e., to find the state of the system after a certain time $t = T$, given the state at time $t = 0$. Equation (2.1) is thus augmented with initial conditions

\[
u(x, 0) = u_0(x), \qquad x \in \mathbb{R}^d. \tag{2.3}
\]
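The conservation statement behind the integral form can be checked on a small discrete example in one space dimension: when cell averages are updated only through interface fluxes, the interior fluxes cancel in pairs and the total amount of $u$ changes only through the fluxes at the two ends of the domain. The following is a minimal sketch, not the scheme developed in this work; the linear flux, the first-order upwind interface flux, the grid and the names `flux` and `conservative_step` are assumptions chosen for illustration.

```python
def flux(u, a=1.0):
    """Linear flux f(u) = a*u (an assumed example flux)."""
    return a * u


def conservative_step(u, dx, dt, a=1.0, u_left=0.0):
    """One conservative update of cell averages for u_t + f(u)_x = 0.

    Uses the upwind interface flux, which is appropriate here because a > 0.
    Returns the new averages and the fluxes at the two domain ends.
    """
    n = len(u)
    # F[i] is the flux through the left face of cell i; F[n] is the right end.
    F = [flux(u_left, a)] + [flux(u[i], a) for i in range(n)]
    new_u = [u[i] - dt / dx * (F[i + 1] - F[i]) for i in range(n)]
    return new_u, F[0], F[n]


n, dx, dt = 50, 0.02, 0.01
u = [1.0 if 0.2 <= (i + 0.5) * dx <= 0.4 else 0.0 for i in range(n)]
mass_before = sum(u) * dx
u_new, f_in, f_out = conservative_step(u, dx, dt)
mass_after = sum(u_new) * dx
# Discrete balance: change of mass = dt * (inflow - outflow).
balance = mass_after - mass_before - dt * (f_in - f_out)
```

Here `balance` vanishes up to rounding error regardless of the data inside the domain, which is the discrete counterpart of the statement that the content of a cell changes only through the fluxes across its interfaces.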

Boundary conditions also have to be specified when considering a bounded domain in $\mathbb{R}^d$. System (2.1) can be written in quasi-linear form as:

\[
\frac{\partial u}{\partial t} + \sum_{q=1}^{d} \frac{\partial f^q}{\partial u} \frac{\partial u}{\partial x_q} = 0, \qquad x \in \mathbb{R}^d, \quad t \in \mathbb{R}^+.
\]

The matrices

\[
A_q \equiv A_q(u) \equiv \frac{\partial f^q}{\partial u}
\]

are called the Jacobian matrices of the system. System (2.1) is said to be hyperbolic if any combination

\[
\sum_{q=1}^{d} \xi_q A_q, \qquad (\xi_q \in \mathbb{R})
\]

has $m$ real eigenvalues and a complete set of eigenvectors¹. If the eigenvalues of the Jacobian matrix are all distinct, the system is said to be strictly hyperbolic. For any $u$ the Jacobian matrices can thus be diagonalized as $A_q = R_q \Lambda_q R_q^{-1}$, where $\Lambda_q$ is a diagonal matrix whose entries are the matrix eigenvalues,

\[
\Lambda_q = \mathrm{diag}(\lambda_1^q, \dots, \lambda_m^q), \tag{2.4}
\]

and $R_q$ is the matrix whose column vectors are the (corresponding) right eigenvectors of $A_q$,

\[
R_q = [r_1^q | \cdots | r_m^q]. \tag{2.5}
\]

¹ Each eigenvalue is repeated as many times as its multiplicity indicates.
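For $m = 2$ the definition of hyperbolicity can be tested by hand, since the eigenvalues of a $2 \times 2$ matrix follow from its trace and determinant. The sketch below is only an illustration under that restriction; the function names and the two example Jacobians `A1` and `A2` are hypothetical and are not taken from any model used in this work.

```python
import math


def is_hyperbolic_2x2(a11, a12, a21, a22, tol=1e-12):
    """Return (hyperbolic, strictly_hyperbolic) for a real 2x2 matrix.

    The matrix is hyperbolic if it has real eigenvalues and two linearly
    independent eigenvectors; strictly hyperbolic if the eigenvalues differ.
    """
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    disc = trace * trace - 4.0 * det
    if disc > tol:
        return True, True          # two distinct real eigenvalues
    if disc < -tol:
        return False, False        # complex conjugate eigenvalues
    # Double eigenvalue: diagonalizable only if A is a multiple of the identity.
    lam = 0.5 * trace
    is_scalar = (abs(a12) <= tol and abs(a21) <= tol
                 and abs(a11 - lam) <= tol and abs(a22 - lam) <= tol)
    return is_scalar, False


def combination(xi, matrices):
    """Form sum_q xi_q * A_q for 2x2 matrices stored as 4-tuples."""
    return tuple(sum(x * A[k] for x, A in zip(xi, matrices)) for k in range(4))


# Assumed example: two symmetric Jacobians, so every combination is symmetric.
A1 = (0.0, 1.0, 1.0, 0.0)
A2 = (1.0, 0.0, 0.0, -1.0)
results = [is_hyperbolic_2x2(*combination((math.cos(t), math.sin(t)), (A1, A2)))
           for t in (0.0, 0.7, 1.5, 3.0)]
rotation = is_hyperbolic_2x2(0.0, -1.0, 1.0, 0.0)   # complex eigenvalues
jordan = is_hyperbolic_2x2(1.0, 1.0, 0.0, 1.0)      # defective double eigenvalue
```

The two counterexamples show the two ways in which hyperbolicity can fail: `rotation` has complex eigenvalues, and `jordan` has a real double eigenvalue but only one independent eigenvector.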


Hyperbolicity is necessary. As will be illustrated throughout the text, the solution of simple hyperbolic linear problems (e.g. Riemann problems) is composed of $m$ simple waves moving independently (in particular, see section 2.3.2). For the existence of such solutions it is necessary for the system to be hyperbolic (see [105] for a simple proof). For nonlinear systems the above argument applies at least locally, so hyperbolicity is also necessary for nonlinear systems. For smooth solutions of 1D linear systems, well-posedness of the system in $C^\infty$ also requires hyperbolicity (see the works of Lax [97] and Mizohata [125]).

For a nonlinear system in $\mathbb{R}^d$ the necessity of hyperbolicity can be seen by considering an initial value problem for the system written in quasi-linear form, with initial data that varies only in a direction given by $\xi = (\xi_1, \dots, \xi_d) \in \mathbb{R}^d$:

\[
\begin{cases}
\dfrac{\partial u}{\partial t} + \displaystyle\sum_{q=1}^{d} A_q(u) \dfrac{\partial u}{\partial x_q} = 0, \\[2mm]
u(x, 0) = u_0(\xi \cdot x), \qquad \xi = (\xi_1, \dots, \xi_d) \in \mathbb{R}^d.
\end{cases} \tag{2.6}
\]

System (2.6) is well-posed in $C^\infty$ if and only if, for any function $u_0 : \mathbb{R} \to \mathbb{R}^m$ in $C^\infty$, the combination $\sum_{q=1}^{d} \xi_q A_q(u_0)$ is diagonalizable with real eigenvalues.

2.2 Properties of hyperbolic conservation laws

In this section we study some important qualitative properties of hyperbolic equations and, more precisely, of hyperbolic systems of conservation laws. We mainly review the fact that these systems can develop discontinuous solutions, even if smooth initial data is provided. Discontinuous solutions lead to the concept of weak solution, introduced in section 2.2.2. We then briefly explore the spectral structure of such systems, and we show how it can help in understanding the nature of the equations, and hence in the development of numerical methods for their approximate solution.

2.2.1 Discontinuous solutions

In section 2.2.4 we will show how to exploit the possibility of diagonalizing the Jacobian matrix in a more general context. In this section we only aim to show, by means of simple examples, that hyperbolic equations can develop discontinuities in their solutions, and how these discontinuities can be treated using the spectral information contained in the Jacobian matrices.

Consider a Cauchy problem for a one-dimensional hyperbolic scalar equation of the form

\[
\begin{cases}
u_t + f(u)_x = 0, & x \in \mathbb{R}, \quad t \in \mathbb{R}^+, \\
u(x, 0) = u_0(x), & x \in \mathbb{R},
\end{cases}
\]

or in quasi-linear form:

\[
\begin{cases}
u_t + f'(u) u_x = 0, & x \in \mathbb{R}, \quad t \in \mathbb{R}^+, \\
u(x, 0) = u_0(x), & x \in \mathbb{R},
\end{cases} \tag{2.7}
\]

where we have used the notation $u_t = \frac{\partial u}{\partial t}$ and $f(u)_x = \frac{\partial f(u)}{\partial x}$. Let $x(t)$ be a parameterized curve in the $(x, t)$ plane verifying the ordinary differential equation

\[
x'(t) = f'(u(x(t), t)). \tag{2.8}
\]

For such a curve it holds

\[
\frac{d}{dt} u(x(t), t) = u_t + u_x x'(t) = u_t + f'(u) u_x = 0,
\]

i.e., the solution $u$ is constant along the curve $x(t)$ as time varies. Such a curve is called a characteristic curve of equation (2.7). As $u$ is constant along characteristics, by (2.8) so is $x'(t)$. The characteristic curves are therefore given by $x(t) = f'(u)t + C$, i.e., for scalar equations the characteristics are straight lines in the $(x, t)$ space, with speeds $dx/dt$ given by $f'(u)$. The solution at a given point $(x, t_1)$, with $t_1 > 0$, can in principle be obtained from the initial data by tracing the characteristic that passes through the point back to time $t = 0$.

The derivation so far assumes smooth solutions and fluxes, but in general this is not the case; as time evolves, characteristic curves can cross in the $(x, t)$ space. At a point where two different characteristics cross, the solution would take two different values, given by the initial data at two different spatial locations. Since multi-valued solutions are not physically admissible in fluid dynamics, only one value is possible. The physically correct solution is often composed of a jump discontinuity, located at the crossing point, that propagates in time. This situation corresponds to the formation of a shock wave.

Another possibility is that no characteristic passes through the given point $(x, t_1)$. In this case the solution at that point cannot be defined by


means of characteristics, and some information that was not present in the initial data has to be incorporated to build a feasible solution. In gas dynamics, this situation corresponds to the formation of an expansion wave, where the gas is being rarefied, and is therefore commonly called a rarefaction wave.

For linear equations the slope of the characteristics is constant, and thus they are parallel. In this case the formation of a shock or rarefaction wave is impossible. Another kind of wave, called contact discontinuity, is typical of linear equations with jump discontinuities in the initial data, and is characterized by the propagation of the data with constant speed.

The three kinds of phenomena described above (shocks, rarefactions, and contacts) represent, in a simplified form, the main typical features of the solution of hyperbolic systems of conservation laws. All these basic and intuitive ideas are extended, more formally, to hyperbolic nonlinear systems in section 2.2.4.

As an example consider a Cauchy problem for the inviscid Burgers' equation (cf. section 2.3.1):

\[
\begin{cases}
u_t + \left(\dfrac{u^2}{2}\right)_x = 0, & x \in \mathbb{R}, \quad t \in \mathbb{R}^+, \\[2mm]
u(x, 0) = u_0(x), & x \in \mathbb{R}.
\end{cases} \tag{2.9}
\]

For this equation we have $f'(u) = u$. For the following initial data,

\[
u_0(x) =
\begin{cases}
1 & \text{if } x < \frac13, \\
\frac32 (1 - x) & \text{if } \frac13 \le x \le \frac23, \\
\frac12 & \text{if } x > \frac23,
\end{cases}
\]

shown in the bottom left plot of Fig. 2.1, the characteristics, in the $(x, t)$ plane, are straight lines $t = mx + n$, with slopes $m$ given by

\[
m =
\begin{cases}
1 & \text{if } x < \frac13, \\
\frac{2}{3(1 - x)} & \text{if } \frac13 \le x \le \frac23, \\
2 & \text{if } x > \frac23.
\end{cases}
\]

These lines will cross in finite time. For example, the characteristics passing through $x = \frac13$ and $x = \frac23$ are respectively given by $x(t) = \frac13 + t$ and $x(t) = \frac23 + \frac{t}{2}$, and cross at time $t = \frac23$. In fact at this time all characteristics starting from points in the region $\frac13 \le x \le \frac23$ will cross at the same point (top left plot of Fig. 2.1).

If we consider the following initial data, depicted in the bottom right plot of Fig. 2.1:

\[
u_0(x) =
\begin{cases}
1 & \text{if } x > \frac12, \\
\frac12 & \text{if } x \le \frac12,
\end{cases}
\]

[Figure 2.1: four panels; horizontal axis x, vertical axis t (top panels) or U(x) (bottom panels); ticks at x = 0, 1/3, 2/3, 1 (left panels) and x = 0, 1/2, 1 (right panels); U ticks at 0, 1/2, 1.]

Figure 2.1: Characteristics in the (x, t) plane (top) and corresponding initial data (bottom). Left part: characteristics collide at finite time. Right part: No characteristic reaches the shaded zone.

then for $x > \frac12$ the characteristic passing through a point $x_0$ is given by $x(t) = x_0 + t$, and for $x \le \frac12$ by $x(t) = x_0 + \frac12 t$. These characteristics are sketched in the top right plot of Fig. 2.1. In this case the characteristics do not cross and no characteristic passes through the points in the region

\[
\frac12 < \frac{x - \frac12}{t} < 1,
\]

which is shaded in the figure.

In order to illustrate the concept of contact discontinuity, consider the linear advection equation (cf. section 2.3.1)

\[
\begin{cases}
u_t + a u_x = 0, & x \in \mathbb{R}, \quad t \in \mathbb{R}^+, \\
u(x, 0) = u_0(x), & x \in \mathbb{R},
\end{cases} \tag{2.10}
\]

where $u_0(x)$ is any function. In this case the characteristics are given by $x(t) = at + C$, so they are parallel lines, as shown in Fig. 2.2. If a discontinuity is present in the initial data, it will simply be advected with speed $a$. No discontinuities can form from smooth initial data.

[Figure 2.2: one panel; horizontal axis x, vertical axis t; ticks at x = 0, 1/3, 2/3, 1.]

Figure 2.2: Characteristics in the (x, t) plane corresponding to the case a = 1 in (2.10)
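The crossing of characteristics in the first Burgers example of this section can be reproduced numerically. Since the characteristic starting at $x_0$ is $x(t) = x_0 + u_0(x_0)t$, two characteristics starting at $x_1 < x_2$ meet at $t = (x_2 - x_1)/(u_0(x_1) - u_0(x_2))$ whenever $u_0(x_1) > u_0(x_2)$. The following sketch is illustrative only; the function names `u0` and `crossing` are assumptions introduced here.

```python
def u0(x):
    """First initial datum of section 2.2.1 (ramp from 1 down to 1/2)."""
    if x < 1.0 / 3.0:
        return 1.0
    if x <= 2.0 / 3.0:
        return 1.5 * (1.0 - x)
    return 0.5


def crossing(x1, x2):
    """Time and position where the characteristics from x1 < x2 meet.

    For Burgers' equation the characteristic from x0 is x0 + u0(x0) * t.
    Returns None when the two lines never meet for t > 0.
    """
    s1, s2 = u0(x1), u0(x2)
    if s1 <= s2:
        return None
    t = (x2 - x1) / (s1 - s2)
    return t, x1 + s1 * t


t_cross, x_cross = crossing(1.0 / 3.0, 2.0 / 3.0)
inner = [crossing(0.4, 0.6), crossing(0.35, 0.5)]
```

Up to rounding, every pair of starting points taken inside the ramp gives the same crossing time $t = 2/3$ and the same position $x = 1$, in agreement with the left part of Fig. 2.1, while two points in a constant region never cross.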

2.2.2 Weak solutions

In order to be able to consider non-smooth solutions, the classical concept of solution, i.e., a smooth function verifying (2.1), has to be relaxed. As the integral form of the equation is more general than the differential form, the latter being obtained from the former by means of smoothness assumptions that do not hold in general, the new concept of solution can be thought of as a solution of the integral form of the conservation law. Unfortunately the integral form is quite difficult to handle. One should prove that, for a given function, equation (2.2) holds for any choice of the control volume and of the time interval. An equivalent, more convenient form of the integral equation is provided by the theory of distributions.

Definition 1. A function $u(x, t)$ is a weak solution of (2.1) with given initial data $u(x, 0)$ if

\[
\int_{\mathbb{R}^+} \int_{\mathbb{R}^d} \left[ u(x, t) \frac{\partial \phi}{\partial t}(x, t) + \sum_{q=1}^{d} f^q(u) \frac{\partial \phi}{\partial x_q}(x, t) \right] dx\,dt = - \int_{\mathbb{R}^d} \phi(x, 0) u(x, 0)\,dx \tag{2.11}
\]

holds for all $\phi \in C_0^1(\mathbb{R}^d \times \mathbb{R}^+)$, where $C_0^1(\mathbb{R}^d \times \mathbb{R}^+)$ is the space of continuously differentiable functions with compact support in $\mathbb{R}^d \times \mathbb{R}^+$.

It is easy to check that (2.2) and (2.11) do have the same solutions. Of course strong solutions are also weak solutions, and continuously differentiable weak solutions are strong solutions. Weak solutions are often not unique, and a procedure to identify the physically correct one is needed. This is usually done by means of additional conditions imposed on the solutions of the equation, that should be satisfied in a discrete sense by the numerical method. This point is outlined in section 2.2.3.


2.2.3 Rankine-Hugoniot conditions

By applying the integral form of the conservation law on a small volume surrounding an isolated discontinuity, one can obtain the Rankine-Hugoniot conditions [143, 75]. These conditions characterize weak solutions in terms of the discontinuity movement, and give information about the behavior of the conserved variables across discontinuities. The derivation of the Rankine-Hugoniot conditions can be found in several sources, e.g. [70, 71, 36]. For a general conservation law the conditions read:

\[
[f] \cdot n_\Sigma = s[u] \cdot n_\Sigma, \tag{2.12}
\]

where $f = (f^1, \dots, f^d)$ is a matrix containing the fluxes, $u$ is the solution, $s$ is the discontinuity velocity and $n_\Sigma$ is the vector normal to the discontinuity. The notation $[\cdot]$ indicates the jump of a variable across the discontinuity.

Weak solutions necessarily satisfy the Rankine-Hugoniot conditions at discontinuities. In fact it can be shown that a function $u(x, t)$ is a weak solution of (2.1) if and only if equation (2.1) holds wherever $u$ is smooth and the Rankine-Hugoniot conditions are satisfied wherever $u$ is not smooth, see e.g. [36].

Since conservation laws can have more than one weak solution (some simple examples of this fact can be found e.g. in [104]), additional conditions have to be imposed on the equations in order to pick out the physically correct solution, known as the entropy solution. Several conditions were derived in the fifties and sixties, the best known being those due to Oleinik [130], Lax [98], Wendroff [187] and Liu [111]. Lax's E-condition is defined in (2.16) below.
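In one space dimension and for a scalar equation, condition (2.12) reduces to $f(u_L) - f(u_R) = s\,(u_L - u_R)$, which determines the speed of an isolated discontinuity from the two states it separates. A minimal sketch for Burgers' flux follows; the function names are hypothetical, and the states $u_L = 1$, $u_R = 1/2$ are those of the first example in section 2.2.1.

```python
def burgers_flux(u):
    """Flux of the inviscid Burgers' equation, f(u) = u^2 / 2."""
    return 0.5 * u * u


def shock_speed(u_left, u_right, flux=burgers_flux):
    """Speed s from the 1D jump relation f(uL) - f(uR) = s * (uL - uR)."""
    if u_left == u_right:
        raise ValueError("no jump: the two states coincide")
    return (flux(u_left) - flux(u_right)) / (u_left - u_right)


s = shock_speed(1.0, 0.5)
```

For Burgers' equation the result is the average of the two states, here $s = 3/4$, so the discontinuity moves more slowly than the characteristics on its left and faster than those on its right.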

2.2.4 Characteristic structure of a system of conservation laws

In section 2.2.1 we have illustrated the fact that characteristics are an extremely useful tool for the computation of solutions of hyperbolic conservation laws, since an important part of the structure of the solution is given by the characteristic curves. Roughly speaking, characteristics are the curves in the $(x, t)$ space that carry information. The fact that information propagates along characteristics is particular to hyperbolic systems and, to some extent, serves as a design tool for numerical methods and as a way to understand the behavior of the solutions. A brief summary of the leading facts about the characteristic structure of hyperbolic systems, extending the ideas presented in section 2.2.1 for scalar equations, is presented in this section. More complete studies on the topic can be found in [165, 41]. For simplicity we will restrict the study to one-dimensional problems,

\[
u_t + f(u)_x = 0. \tag{2.13}
\]

Since system (2.13) is hyperbolic, we can decompose the Jacobian matrix $A = f'(u)$ as $A = R \Lambda R^{-1}$, with $R$ and $\Lambda$ as in (2.4) and (2.5). From the properties of the characteristic structure of $A$, the admissible types of discontinuities in the flow solution and their properties are described next.

Each column vector $r_p$ of $R$ defines a vector field $r_p : \mathbb{R}^m \to \mathbb{R}^m$, $u \mapsto r_p(u)$, called the $p$-th characteristic field. For a constant-coefficient linear system of conservation laws the characteristic information suffices to completely solve the system, as will be explained in section 2.3.2. A nonlinear system cannot be solved by the same means, but an analysis analogous to the linear case gives qualitative information about the solution structure and allows one to tackle the numerical solution of the system in a more convenient way. The first concept to be introduced is that of characteristic curves:

Definition 2. Given a hyperbolic system of conservation laws $u_t + f(u)_x = 0$, let $\{\lambda_p(u)\}_{p=1}^{m}$ be the eigenvalues of the Jacobian matrix $f'(u)$. We say that a curve $x = x(t)$ is a characteristic curve of the system if it is a solution of the ordinary differential equation

\[
\frac{dx}{dt} = \lambda_p(u(x, t))
\]

for some $p$, $1 \le p \le m$.

Note that only strictly hyperbolic systems have $m$ different characteristic curves. Characteristic curves are often interpreted in the $(x, t)$ plane. A basic fact about characteristics is that, for linear systems, the solution of the system is constant along characteristics, and these are straight lines. For nonlinear systems characteristics are no longer straight lines, nor is the solution constant along them but, for small times, it is reasonable to suppose that the behavior of the nonlinear system can be mimicked by that of a linear system, coming from some suitable linearization (cf. sections 3.6 and 4.2).


Particular types of characteristic fields of interest are presented next. The first type are genuinely nonlinear fields. A characteristic field defined by an eigenvector $r_p(u)$ is called genuinely nonlinear if

\[
\nabla \lambda_p(u) \cdot r_p(u) \neq 0, \qquad \forall u, \tag{2.14}
\]

where $\nabla \lambda_p(u)$ is the gradient of $\lambda_p(u)$. Note that for a linear system $A$ does not depend on $u$ and therefore $\lambda_p$ is constant with respect to $u$, so genuinely nonlinear fields cannot appear in linear systems and are particular to nonlinear systems.

Let $u(\alpha)$ be a parameterized curve that is an integral curve of a genuinely nonlinear vector field $r_p$, i.e., a curve in phase space such that

\[
\frac{du}{d\alpha} = c(\alpha)\, r_p(u(\alpha)),
\]

for some $c(\alpha) \neq 0$. Because of the definition of genuinely nonlinear field, $\lambda_p(u)$ varies monotonically as $u$ varies along an integral curve of $r_p(u)$:

\[
\frac{d\lambda_p(u(\alpha))}{d\alpha} = \nabla \lambda_p(u(\alpha)) \cdot \frac{du(\alpha)}{d\alpha} = c(\alpha)\, \nabla \lambda_p(u(\alpha)) \cdot r_p(u(\alpha)) \neq 0.
\]

Another interesting type of characteristic fields are linearly degenerate fields, for which

\[
\nabla \lambda_p(u) \cdot r_p(u) = 0, \qquad \forall u. \tag{2.15}
\]

In linearly degenerate fields $\lambda_p(u)$ therefore remains constant along integral curves of $r_p(u)$ as $u$ varies, due to (2.15). These fields are a generalization of the characteristic fields of a constant-coefficient linear system, where $\nabla \lambda_p = 0$. Roughly speaking, the behavior of the solution with respect to a linearly degenerate field is similar to that of a linear system, whereas a genuinely nonlinear field implies types of discontinuous solutions that can never appear in a linear system.

Existence theory for the Cauchy problem for systems where all the characteristic fields are either linearly degenerate or genuinely nonlinear was developed by Glimm [55], using the solution of the Riemann problem found by Lax [98]. For general systems, existence of weak entropy solutions is much more difficult to state [112].

Some types of discontinuities can appear in the solution of a nonlinear system. The presence or absence of a specific type of discontinuity can be determined in certain cases from the characteristic structure of the Jacobian matrix. More precisely, discontinuities can often be associated with a single characteristic field, and certain types of discontinuities are associated with particular types of characteristic fields.
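As a quick check of definitions (2.14) and (2.15), consider the scalar case $m = 1$, where the only eigenvalue is $\lambda(u) = f'(u)$ and the eigenvector can be normalized to $r = 1$ (a normalization assumed here only for illustration). The two scalar models of section 2.3.1 then give one example of each type of field:

```latex
\begin{align*}
\text{Burgers' equation, } f(u) = \tfrac{u^2}{2}: &\quad \lambda(u) = u, \quad \nabla\lambda(u) \cdot r(u) = 1 \neq 0 \quad \text{(genuinely nonlinear)}, \\
\text{advection equation, } f(u) = a u: &\quad \lambda(u) = a, \quad \nabla\lambda(u) \cdot r(u) = 0 \quad \text{(linearly degenerate)}.
\end{align*}
```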


Let us introduce some definitions in order to start a brief study of the relationships between discontinuities and characteristic fields. A discontinuity defined by $x = s(t)$, that separates two states $u_L(t)$ and $u_R(t)$, is said to be a $p$-shock, or a shock wave associated with the $p$-th characteristic field, if

\[
\lambda_p(u_L) \ge s'(t) \ge \lambda_p(u_R). \tag{2.16}
\]

Condition (2.16) is called Lax's E-condition [99]. It is a particular case of the entropy conditions mentioned in section 2.2.3. We will only consider shocks where the following holds:

\[
\lambda_j(u_{L,R}) > \lambda_p(u_L) \ge s'(t) \ge \lambda_p(u_R) > \lambda_i(u_{L,R}), \qquad j > p > i. \tag{2.17}
\]

In general a shock wave is defined as a discontinuity verifying the Rankine-Hugoniot conditions (2.12), and can involve jumps in more than one conserved variable. Conditions (2.16) and (2.17) ensure that the shock is associated with a single characteristic field and that other characteristics are not interfering, in the sense that (2.16) cannot hold for two characteristic fields at the same time. This is often the case when $\lambda_p(u)$ is a simple eigenvalue. Shocks where the E-condition (2.16) is satisfied for more than one characteristic field simultaneously will not be considered here.

A $p$-contact discontinuity is a special case of a $p$-shock wave, where (2.16) holds with equalities, i.e.

\[
\lambda_p(u_L) = s'(t) = \lambda_p(u_R). \tag{2.18}
\]

In gas dynamics, a contact discontinuity represents the separation of two zones with different density, but in pressure equilibrium, whereas shock waves represent a discontinuity arising from an abrupt pressure change, resulting in a compression of the medium.

Rarefaction waves are a kind of wave typical of genuinely nonlinear fields. A rarefaction wave does not involve discontinuities in the conserved variables, and in gas dynamics represents the situation in which the fluid is expanding and there is a zone where the fluid is being rarefied. Rarefactions are characterized by the condition $\lambda_p(u_L) < \lambda_p(u_R)$.

Let us now analyze the relationships between the different types of waves introduced above and the different kinds of characteristic fields. In linearly degenerate fields corresponding to simple eigenvalues, if two states $u_L$ and $u_R$ lie in the same integral curve and compose a jump


discontinuity, then by (2.15) these two states propagate with the same velocity, i.e., λp (uL ) = λp (uR ) holds, forcing (2.18) to hold. Therefore discontinuities associated with these fields can only be contact discontinuities. These are the only type of discontinuities that can appear in the solution of linear systems. On the other hand, genuinely nonlinear fields can host both shocks and rarefaction waves, depending on the left and right states and on the kind of monotonicity of the variation of λp (u).
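For a scalar equation the classification above depends only on the characteristic speeds of the two states. The sketch below applies conditions (2.16) and (2.18) and the rarefaction condition $\lambda(u_L) < \lambda(u_R)$ to a pair of constant states. It is an illustration restricted to scalar laws with convex or linear flux; the function name `classify_wave` and its default speed $\lambda(u) = u$ (Burgers' equation) are assumptions introduced here.

```python
def classify_wave(u_left, u_right, speed=lambda u: u):
    """Classify the wave joining two constant states of a scalar conservation law.

    `speed` is the characteristic speed lambda(u) = f'(u); the default,
    lambda(u) = u, corresponds to Burgers' equation (an assumed example).
    """
    lam_l, lam_r = speed(u_left), speed(u_right)
    if u_left == u_right:
        return "constant state"
    if lam_l == lam_r:
        return "contact discontinuity"   # (2.18): equal speeds across the jump
    if lam_l > lam_r:
        return "shock"                   # (2.16): characteristics run into the jump
    return "rarefaction"                 # lambda(uL) < lambda(uR)


burgers_cases = [classify_wave(1.0, 0.5), classify_wave(0.5, 1.0)]
linear_case = classify_wave(0.0, 1.0, speed=lambda u: 2.0)
```

The two Burgers cases correspond to the two initial data of section 2.2.1: the decreasing data produce a shock and the increasing data a rarefaction, while for a constant speed any jump is a contact discontinuity.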

2.3 Model equations

In this section we introduce the main model equations used in fluid dynamics, namely the advection equation, Burgers' equation, linear hyperbolic systems and the Euler equations, as a model of nonlinear system of conservation laws. The two-component Euler equations are also introduced. All these models will be used in chapter 7 for the validation of the algorithm.

2.3.1 Scalar hyperbolic equations

In this section we will consider the advection equation and Burgers' equation, which represent two of the most studied examples of hyperbolic scalar equations. These models present many of the features of hyperbolic systems.

Advection equation

The advection equation is the simplest model of a conservation law. In one space dimension it is written as:

\[
u_t + a u_x = 0, \tag{2.19}
\]

where $a$ is a constant. The advection equation governs the motion of a (conserved) quantity, with density $u$, in a fluid as it is advected with constant velocity $a$. Advection with space or time-dependent velocities will not be considered here.


For any function $F : \mathbb{R} \to \mathbb{R}$, a solution of (2.19) is given by

\[
u(x, t) = F(x - at). \tag{2.20}
\]

These solutions represent the transport of a given perturbation, described by $F$, through the flow at constant speed $a$ without changing shape. If an initial condition $u(x, 0) = u_0(x)$ is given, the solution of the corresponding Cauchy problem is $u(x, t) = u_0(x - at)$. Note that even if $F$ is not continuous, (2.20) is still a weak solution of (2.19), and such a situation is a simple case of a contact discontinuity propagating with constant velocity.
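The exact solution (2.20) translates directly into code. The sketch below builds $u(\cdot, t)$ from an initial profile; the step profile and the names `advect` and `step` are assumptions chosen to show a discontinuity being transported.

```python
def advect(u0, a, t):
    """Return the function x -> u0(x - a*t), the solution (2.20) at time t."""
    return lambda x: u0(x - a * t)


def step(x):
    """Discontinuous initial profile (an assumed example)."""
    return 1.0 if x < 0.0 else 0.0


u = advect(step, a=2.0, t=0.5)   # the jump has moved from x = 0 to x = a*t = 1
values = [u(0.9), u(1.1)]
```

Evaluating on either side of $x = at = 1$ shows that the jump keeps its height and has simply been displaced, as expected for a contact discontinuity.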

Inviscid Burgers' equation

The inviscid Burgers' equation is defined by:

\[
u_t + \left(\frac{u^2}{2}\right)_x = 0. \tag{2.21}
\]

This equation is the inviscid version of the viscous equation $u_t + \left(\frac{u^2}{2}\right)_x = \epsilon u_{xx}$, with $\epsilon > 0$, studied by Burgers [30]. When written in quasi-linear form, $u_t + u u_x = 0$, the equation is similar to the advection equation, but with the particularity that the speed of propagation, given by $f'(u) = u$, is no longer constant, but depends on the solution itself. Despite this resemblance, the behavior of the solution of this equation is completely different from that of the advection equation. Here $u$ is not simply advected as time evolves, but can also be compressed or rarefied. Shocks and rarefaction waves typically appear in the solution of this equation, see section 2.2.1.

2.3.2 Linear hyperbolic systems

Linear systems represent a generalization to several variables of the scalar advection equation (2.19). In this section we will study the main properties of linear hyperbolic systems, in particular the solution of the system through a change of variables. Most of the knowledge acquired from linear systems will be carried over to nonlinear systems, since numerical methods for nonlinear systems are constructed under the assumption that nonlinear systems behave locally like linear systems.

A linear hyperbolic system is a particular case of the PDE (2.1) where the flux function $f(u)$ depends linearly on $u$, i.e., it can be written as $f(u) = Au$, where $A$ is an $m \times m$ constant-coefficient matrix. The equation thus reads for this case:

\[
u_t + A u_x = 0. \tag{2.22}
\]

Hyperbolicity of the system means that the matrix $A$ has $m$ real eigenvalues $\lambda_1, \dots, \lambda_m$ and $m$ linearly independent (right) eigenvectors $r_1, \dots, r_m$. This is equivalent to saying that the matrix $A$ is diagonalizable with real eigenvalues, i.e., it can be expressed as $A = R \Lambda R^{-1}$, where $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_m)$, with $\lambda_p \in \mathbb{R}$, and $R = [r_1, \dots, r_m]$, $r_p \in \mathbb{R}^m$.

The introduction of the change of basis given by the matrix $R$ produces a new linear system that is diagonal, and can hence be solved as $m$ (decoupled) advection equations, whose solution is known, see section 2.3.1. By applying the inverse change of basis to the solution of the diagonal system one obtains the general solution of the linear system. The variables $u$, when expressed in the basis given by $R$, are called the characteristic variables, as stated in the next definition.

Definition 3. Given a hyperbolic linear system, with matrix $A = R \Lambda R^{-1}$, the characteristic variables $w = [w_1, \dots, w_m]^T$ of the system are defined by $w = R^{-1} u$.

With these variables we can write equation (2.22) as:

\[
R w_t + R \Lambda R^{-1} R w_x = 0 \;\Leftrightarrow\; R w_t + R \Lambda w_x = 0 \;\Leftrightarrow\; w_t + \Lambda w_x = 0. \tag{2.23}
\]

The equation $w_t + \Lambda w_x = 0$ is called the characteristic form of the linear system (2.22), and is a diagonal linear system of PDEs, which can be written in expanded form as:

\[
\begin{pmatrix} w_1 \\ \vdots \\ w_m \end{pmatrix}_t +
\begin{pmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_m
\end{pmatrix}
\cdot
\begin{pmatrix} w_1 \\ \vdots \\ w_m \end{pmatrix}_x = 0. \tag{2.24}
\]

Each row in (2.24) reads

\[
\frac{\partial w_p}{\partial t} + \lambda_p \frac{\partial w_p}{\partial x} = 0,
\]

which is nothing but an advection equation with constant velocity $\lambda_p$. Given initial data $u(x, 0) = u_0(x)$ for (2.22), the solution $w$ of (2.24) is given by $w_p(x, t) = w_p^0(x - \lambda_p t)$, where $w_p^0$ is the $p$-th component of $w^0 = R^{-1} u_0$. By taking the inverse change of basis the solution of the original linear system (2.22) is obtained as $u = R \cdot w$, or in expanded form:

\[
u(x, t) = \sum_{p=1}^{m} w_p(x - \lambda_p t, 0)\, r_p.
\]
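The solution procedure of this section (project the initial data onto the characteristic variables, advect each one with its own speed, and change back to the original variables) can be sketched for a small example. The matrix $A$ with rows $(0, 1)$ and $(1, 0)$, its eigenvector matrix and the Gaussian initial data are assumptions chosen for illustration, as are the names used in the code.

```python
import math

# Assumed example: A = [[0, 1], [1, 0]], with eigenvalues -1 and +1.
LAMBDAS = (-1.0, 1.0)
R = ((1.0, 1.0), (-1.0, 1.0))        # columns are the right eigenvectors r_1, r_2
R_INV = ((0.5, -0.5), (0.5, 0.5))    # inverse of R


def solve_linear_system(u0, x, t):
    """Evaluate u(x, t) = sum_p w_p^0(x - lambda_p t) r_p for the 2x2 example.

    `u0` maps a position to the pair of initial values (u_1, u_2).
    """
    u = [0.0, 0.0]
    for p, lam in enumerate(LAMBDAS):
        y = x - lam * t                      # foot of the p-th characteristic
        u_init = u0(y)
        # p-th characteristic variable w_p^0 = (R^{-1} u0)_p at the foot.
        w_p = R_INV[p][0] * u_init[0] + R_INV[p][1] * u_init[1]
        u[0] += w_p * R[0][p]
        u[1] += w_p * R[1][p]
    return tuple(u)


def pulse(x):
    """Initial data: a Gaussian in the first component, zero in the second."""
    return (math.exp(-x * x), 0.0)


u_start = solve_linear_system(pulse, 0.3, 0.0)
u_later = solve_linear_system(pulse, 0.3, 2.0)
```

At $t = 0$ the routine returns the initial data, and for $t > 0$ the initial pulse splits into two halves travelling with speeds $-1$ and $+1$, which is the superposition of simple waves described above.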

2.3.3 Nonlinear hyperbolic systems

Nonlinear hyperbolic systems merge together two already presented models, namely nonlinear scalar equations, exemplified by Burgers' equation (2.21), and linear systems, introduced in section 2.3.2. Within this section we introduce the basics of two model equations, the Euler equations and the two-component Euler equations, that will be used as test problems and reference models throughout the text.

Euler equations

The Euler equations model the dynamics of a Newtonian, ideal, inviscid fluid. The Euler equations are derived from the conservation of mass, linear momentum and energy in the fluid as it moves, and represent a simplification of the Navier-Stokes equations, which are the most complete model used up to now for the simulation of fluid dynamics.

The two-dimensional Euler equations can be written as:

\[
u_t + f(u)_x + g(u)_y = 0 \tag{2.25}
\]

with

\[
u = \begin{pmatrix} \rho \\ \rho v_x \\ \rho v_y \\ E \end{pmatrix}, \qquad
f(u) = \begin{pmatrix} \rho v_x \\ \rho v_x^2 + p \\ \rho v_x v_y \\ v_x (E + p) \end{pmatrix}, \qquad
g(u) = \begin{pmatrix} \rho v_y \\ \rho v_y v_x \\ \rho v_y^2 + p \\ v_y (E + p) \end{pmatrix}, \tag{2.26}
\]

where $\rho$ denotes density, $v_x$ and $v_y$ are the Cartesian components of the velocity vector $v$, $E$ is energy and $p$ is pressure. The one-dimensional version of the equations is obtained by retaining the first two terms in the left hand side of (2.25) and canceling out the third row of $u$ and $f(u)$, to get

\[
\begin{pmatrix} \rho \\ \rho v_x \\ E \end{pmatrix}_t +
\begin{pmatrix} \rho v_x \\ \rho v_x^2 + p \\ v_x (E + p) \end{pmatrix}_x = 0. \tag{2.27}
\]

The Euler equations in $d$ space dimensions compose a system of $d + 2$ equations. Note that in the definition of the equations we use $d + 3$ variables, namely density, $d$ velocity components, total energy and pressure. To close the system we need to specify an additional relation linking all these variables. This relation is called the Equation Of State (EOS), and depends on the type of fluid under consideration. We will consider a special class of fluids called ideal fluids or ideal gases. An ideal gas can be defined as a gas in which all collisions between atoms or molecules are perfectly elastic and in which there are no intermolecular attractive forces. Ideal gases are characterized by three state variables: absolute pressure ($p$), density ($\rho$), and absolute temperature ($T$). The relationship between them may be deduced from kinetic theory and is called the Ideal Gas Law:

\[
p = \rho R T,
\]

where $R$ is the gas constant. With $\rho$ a mass density, $R$ here is the specific gas constant of the gas, i.e., the Universal Gas Constant $8.314472\ \mathrm{J \cdot K^{-1} \cdot mol^{-1}}$ divided by the molar mass of the gas.

Total energy can be decomposed into kinetic energy plus internal energy as follows:

\[
E = \underbrace{\tfrac12 \rho \|v\|_2^2}_{\text{kinetic energy}} + \underbrace{\rho e}_{\text{internal energy}}, \tag{2.28}
\]

where $e$ denotes the specific internal energy. Kinetic energy is due to the advection of the flow, whereas internal energy is the result of other forms of energy. It is often assumed that internal energy is a known function of pressure and density, $e = e(p, \rho)$.


For an ideal gas, internal energy is a function of temperature alone, i.e., a function of ρp . A further simplification is to suppose that it is proportional to temperature, e = cv T,

(2.29)

where cv is called the specific heat at constant volume. Ideal gases where (2.29) holds are called polytropic gases. In general the term specific heat defines the amount of heat required to change a unit property of a substance by one degree in temperature. For polytropic gases, cv is the specific heat for internal energy (i.e., the increment of internal energy depends linearly on temperature by the factor cv , when the gas volume is held fixed). If the volume is allowed to expand and the gas pressure is forced to remain constant the increment in internal energy does not depend linearly on temperature anymore, because some energy is used to expand the volume. The quantity that depends linearly on T is called specific enthalpy, denoted by h, and is defined by: h =e+

p ρ

(2.30)

For polytropic gases the following holds: h = cp T

(2.31)

The constant cp is called specific heat at constant pressure. The quantity γ=

cp cv

(2.32)

is called the specific heat ratio, and is just a number that depends on the gas. For air it takes the value γ ≈ 1.4. Using equations (2.30), (2.31) and (2.32), a simple computation leads to express the internal energy in terms of pressure and density as e=

p , ρ(γ − 1)

(2.33)

which is another form of the equation of state for a perfect gas. Substituting (2.33) into (2.28) gives
\[
E = \frac{1}{2}\rho\|v\|_2^2 + \frac{p}{\gamma - 1}.
\]
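As a minimal illustration of how (2.28) and (2.33) are typically used, the following Python sketch converts between total energy and pressure for a polytropic gas. The function names, the default value γ = 1.4 and the sample state are assumptions made for this example only.

```python
def total_energy(rho, velocity, p, gamma=1.4):
    """Total energy per unit volume: E = 0.5*rho*|v|^2 + p/(gamma - 1)."""
    kinetic = 0.5 * rho * sum(v * v for v in velocity)
    return kinetic + p / (gamma - 1.0)


def pressure(rho, velocity, energy, gamma=1.4):
    """Inverse relation: p = (gamma - 1)*(E - 0.5*rho*|v|^2)."""
    kinetic = 0.5 * rho * sum(v * v for v in velocity)
    return (gamma - 1.0) * (energy - kinetic)


def specific_internal_energy(rho, p, gamma=1.4):
    """Equation (2.33): e = p / (rho*(gamma - 1))."""
    return p / (rho * (gamma - 1.0))


# Round trip for an arbitrary two-dimensional state (illustrative values).
rho, velocity, p = 1.2, (3.0, -4.0), 2.5
E = total_energy(rho, velocity, p)
p_back = pressure(rho, velocity, E)
```

In a solver the conserved variables are the ones stored, so a function like `pressure` would be evaluated at every cell before computing fluxes.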

Some common gases and fluids can be considered, to a good approximation, polytropic. Examples are air, helium, carbon dioxide or even water. For real fluids the ratio of specific heats is not constant and varies with temperature, but very often the variations are small enough to be neglected (e.g. 2 · 10^{-3} for dry air between 0 °C and 100 °C, or 2 · 10^{-2} for water between 20 °C and 200 °C), and thus these fluids are in practice considered polytropic ideal gases.

An essential physical quantity is entropy, denoted by S = S(ρ, e). Entropy was first rigorously introduced by Rudolf Clausius in 1864 [37]. He actually defined the change in entropy as the ratio between the heat transferred to a system and its absolute temperature. As the change in internal energy can be expressed as the work done on the system, denoted by dW, plus the heat transmitted to the system, denoted by dQ, we can write dQ = de − dW. The amount of work done on the system can be taken as dW = −p dv, where v = 1/ρ is the specific volume, leading to dQ = de + p dv. The change in entropy is written as:
\[
dS = \frac{dQ}{T} = \frac{de + p\,dv}{T}.
\tag{2.34}
\]

Integrating (2.34) one gets the following expression for the entropy, valid for a polytropic gas:
\[
S = c_v \ln\left(\frac{p}{\rho^\gamma}\right) + C_1,
\]
where $C_1$ is a constant, or equivalently
\[
p = C_2\, e^{S/c_v} \rho^\gamma,
\tag{2.35}
\]

with $C_2$ constant. The following equation holds for the entropy (we indicate the 1-D version):
\[
\frac{\partial S}{\partial t} + v_x \frac{\partial S}{\partial x} = 0.
\tag{2.36}
\]
Note that
\[
\frac{dS}{dt} = \frac{\partial S}{\partial t} + \frac{\partial S}{\partial x}\frac{dx}{dt},
\]
therefore along particle paths, where $\frac{dx}{dt} = v_x$, it holds that $\frac{dS}{dt} = 0$, i.e., entropy remains constant as particles move, as long as the flow solution is smooth. Along particle paths equation (2.35) reduces to the isentropic law
\[
p = C_3 \rho^\gamma,
\tag{2.37}
\]


where $C_3$ is a constant that depends only on the initial entropy of the particle. Gases where entropy is constant everywhere are called isentropic gases. Such an assumption is often made when no shock waves are present in the gas. The quantity
\[
c = \sqrt{\left.\frac{\partial p}{\partial \rho}\right|_{S=\text{constant}}}
\]
is called the local speed of sound in the gas. Clearly, for isentropic gases, the local speed of sound can be written as:
\[
c = \sqrt{C_3 \gamma \rho^{\gamma-1}} = \sqrt{\frac{\gamma C_3 \rho^\gamma}{\rho}} = \sqrt{\frac{\gamma p}{\rho}}.
\tag{2.38}
\]
Note that the system of the Euler equations is hyperbolic if and only if the quantity
\[
\left.\frac{\partial p}{\partial \rho}\right|_{S=\text{constant}}
\]
is positive, as holds, for example, for polytropic gases. A quantity related to the speed of sound is the Mach number, introduced by Ernst Mach, and simply defined as the ratio between the fluid velocity and the local speed of sound:
\[
M = \frac{\|v\|_2}{c}.
\]
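The two definitions above translate directly into code. The following Python sketch evaluates (2.38) and the Mach number for a polytropic gas; the function names and the sample values (roughly sea-level air) are assumptions of this example.

```python
import math


def sound_speed(p, rho, gamma=1.4):
    """Local speed of sound of a polytropic gas, c = sqrt(gamma*p/rho), as in (2.38)."""
    return math.sqrt(gamma * p / rho)


def mach_number(velocity, p, rho, gamma=1.4):
    """Mach number M = |v|_2 / c for a velocity vector given as a tuple or list."""
    speed = math.sqrt(sum(v * v for v in velocity))
    return speed / sound_speed(p, rho, gamma)


# Illustrative values only: p in Pa, rho in kg/m^3, velocity in m/s.
c_air = sound_speed(101325.0, 1.225)
M = mach_number((3.0, 4.0), 1.4, 1.0)
```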

The local speed of sound describes the speed of acoustic waves passing through the medium at rest and plays a central role in the quantitative behavior of the fluid motion. Flows are often classified as subsonic (M < 1) and supersonic (M > 1). A more adequate classification for practical purposes includes transonic flows, in which the Mach number lies in a range near one (0.8 ≤ M ≤ 1.2 is commonly taken) and both subsonic and supersonic regions exist, and hypersonic flows (M > 5).

Because our numerical method needs the full spectral decomposition of the Jacobian matrices (see chapter 4), we compute next the required expressions for the Euler equations. For the two-dimensional equations (2.25) the Jacobian matrix can be written as:
\[
f'(u) = \begin{pmatrix}
0 & 1 & 0 & 0 \\
\frac{1}{2}(\gamma-1)(v_x^2+v_y^2) - v_x^2 & v_x(3-\gamma) & v_y(1-\gamma) & \gamma-1 \\
-v_x v_y & v_y & v_x & 0 \\
v_x\left(\frac{1}{2}(\gamma-1)(v_x^2+v_y^2) - H\right) & H + (1-\gamma)v_x^2 & (1-\gamma)v_x v_y & \gamma v_x
\end{pmatrix},
\]


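with eigenvalues $\lambda_1 = v_x - c$, $\lambda_2 = \lambda_3 = v_x$, $\lambda_4 = v_x + c$, and the matrices R and $L = R^{-1}$ given by:
\[
R = \begin{pmatrix}
1 & 1 & 0 & 1 \\
v_x - c & v_x & 0 & v_x + c \\
v_y & v_y & 1 & v_y \\
H - v_x c & \frac{v_x^2+v_y^2}{2} & v_y & H + v_x c
\end{pmatrix}
\tag{2.39}
\]
and
\[
L = \begin{pmatrix}
\dfrac{(\gamma-1)(v_x^2+v_y^2)+2v_x c}{4c^2} & -\dfrac{v_x(\gamma-1)+c}{2c^2} & -\dfrac{v_y(\gamma-1)}{2c^2} & \dfrac{\gamma-1}{2c^2} \\
1-\dfrac{(\gamma-1)(v_x^2+v_y^2)}{2c^2} & \dfrac{v_x(\gamma-1)}{c^2} & \dfrac{v_y(\gamma-1)}{c^2} & -\dfrac{\gamma-1}{c^2} \\
-v_y & 0 & 1 & 0 \\
\dfrac{(\gamma-1)(v_x^2+v_y^2)-2v_x c}{4c^2} & -\dfrac{v_x(\gamma-1)-c}{2c^2} & -\dfrac{v_y(\gamma-1)}{2c^2} & \dfrac{\gamma-1}{2c^2}
\end{pmatrix}.
\tag{2.40}
\]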

In (2.39), H represents the total enthalpy, given by
\[
H = \frac{E+p}{\rho} = \frac{1}{2}\|v\|^2 + \frac{c^2}{\gamma-1} = \frac{1}{2}\|v\|^2 + h.
\tag{2.41}
\]

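The eigenstructure of $g'(u)$ is obtained by interchanging the roles of $v_x$ and $v_y$, and the second and third components of each left and right eigenvector (see [71]). The one-dimensional versions of these matrices are obtained by removing the third row and column from the two-dimensional versions and setting $v_y = 0$:
\[
f'(u) = \begin{pmatrix}
0 & 1 & 0 \\
\frac{1}{2}(\gamma-3)v_x^2 & (3-\gamma)v_x & \gamma-1 \\
\frac{1}{2}(\gamma-2)v_x^3 - \frac{c^2 v_x}{\gamma-1} & \frac{3-2\gamma}{2}v_x^2 + \frac{c^2}{\gamma-1} & \gamma v_x
\end{pmatrix},
\]
\[
\Lambda = \begin{pmatrix}
v_x - c & 0 & 0 \\
0 & v_x & 0 \\
0 & 0 & v_x + c
\end{pmatrix},
\qquad
R = \begin{pmatrix}
1 & 1 & 1 \\
v_x - c & v_x & v_x + c \\
H - v_x c & \frac{1}{2}v_x^2 & H + v_x c
\end{pmatrix}.
\]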


The inverse of the matrix R is the matrix whose rows are the left eigenvectors:
\[
L = \begin{pmatrix}
\dfrac{v_x(v_x^2 - v_x c - 2H)}{2c(v_x^2-2H)} & -\dfrac{v_x^2 - 2v_x c - 2H}{2c(v_x^2-2H)} & -\dfrac{1}{v_x^2-2H} \\
\dfrac{2(v_x^2-H)}{v_x^2-2H} & -\dfrac{2v_x}{v_x^2-2H} & \dfrac{2}{v_x^2-2H} \\
-\dfrac{v_x(v_x^2 + v_x c - 2H)}{2c(v_x^2-2H)} & \dfrac{v_x^2 + 2v_x c - 2H}{2c(v_x^2-2H)} & -\dfrac{1}{v_x^2-2H}
\end{pmatrix}.
\]
From the one-dimensional version of (2.41) one has
\[
v_x^2 - 2H = -\frac{2c^2}{\gamma-1},
\]
so that the eigenstructure of $f'(u)$ depends only on two variables, c and $v_x$. A similar analysis can be done on the two-dimensional matrices R and L defined in (2.39) and (2.40) to show that in that case the eigenstructure of the system depends only on c, $v_x$ and $v_y$.
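The one-dimensional spectral decomposition can be checked numerically. The following pure-Python sketch, which is only a sanity check under the assumption of a polytropic gas with γ = 1.4, builds $f'(u)$, R, L and the eigenvalues from $v_x$ and c and verifies that $LR = I$ and $f'(u)R = R\Lambda$. All function names and the sample state are choices made for this example.

```python
def euler_1d_eigensystem(vx, c, gamma=1.4):
    """Return (A, R, L, lam): Jacobian, right/left eigenvector matrices, eigenvalues."""
    H = 0.5 * vx**2 + c**2 / (gamma - 1.0)  # total enthalpy, 1D version of (2.41)
    A = [
        [0.0, 1.0, 0.0],
        [0.5 * (gamma - 3.0) * vx**2, (3.0 - gamma) * vx, gamma - 1.0],
        [0.5 * (gamma - 2.0) * vx**3 - c**2 * vx / (gamma - 1.0),
         0.5 * (3.0 - 2.0 * gamma) * vx**2 + c**2 / (gamma - 1.0),
         gamma * vx],
    ]
    R = [
        [1.0, 1.0, 1.0],
        [vx - c, vx, vx + c],
        [H - vx * c, 0.5 * vx**2, H + vx * c],
    ]
    D = vx**2 - 2.0 * H  # equals -2 c^2 / (gamma - 1)
    L = [
        [vx * (vx**2 - vx * c - 2 * H) / (2 * c * D),
         -(vx**2 - 2 * vx * c - 2 * H) / (2 * c * D),
         -1.0 / D],
        [2 * (vx**2 - H) / D, -2 * vx / D, 2.0 / D],
        [-vx * (vx**2 + vx * c - 2 * H) / (2 * c * D),
         (vx**2 + 2 * vx * c - 2 * H) / (2 * c * D),
         -1.0 / D],
    ]
    lam = [vx - c, vx, vx + c]
    return A, R, L, lam


def matmul(X, Y):
    """Product of two small dense matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def max_abs_diff(X, Y):
    """Largest entrywise difference between two matrices of equal shape."""
    return max(abs(a - b) for rx, ry in zip(X, Y) for a, b in zip(rx, ry))


A, R, L, lam = euler_1d_eigensystem(vx=0.7, c=1.3)
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
R_lambda = [[R[i][j] * lam[j] for j in range(3)] for i in range(3)]
inverse_error = max_abs_diff(matmul(L, R), identity)
eigen_error = max_abs_diff(matmul(A, R), R_lambda)
```

Both `inverse_error` and `eigen_error` should be at the level of rounding errors for any state with c > 0.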

Two-component Euler equations

An extension of the Euler equations, for a fluid composed of a mixture of two or more perfect gases in thermal equilibrium, consists of adding a new equation that models the conservation of one of the gases; together with the conservation of total mass, this implies the conservation of both gases. We add a new variable φ which represents the mass fraction of one of the gases. Consequently, the quantity 1 − φ represents the mass fraction of the other gas. The resulting system is hyperbolic. We state in this section the spectral structure of the system in one and two dimensions, which will be needed in the numerical methods described in further sections. In one space dimension we write the equation of conservation of the mass of the first gas as:
\[
(\rho\phi)_t + (\rho\phi v_x)_x = 0,
\]
so that the new system of equations extending the Euler equations of gas dynamics becomes:
\[
\begin{pmatrix} \rho \\ \rho v_x \\ E \\ \rho\phi \end{pmatrix}_t
+
\begin{pmatrix} \rho v_x \\ \rho v_x^2 + p \\ v_x(E+p) \\ \rho\phi v_x \end{pmatrix}_x
= 0.
\tag{2.42}
\]


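The Jacobian matrix for this system can be written as:
\[
f'(u) = \begin{pmatrix}
0 & 1 & 0 & 0 \\
\frac{\gamma-3}{2}v_x^2 - \phi\gamma'(\phi)e & (3-\gamma)v_x & \gamma-1 & \gamma'(\phi)e \\
\frac{\gamma-1}{2}v_x^3 - v_x H - v_x\phi\gamma'(\phi)e & H - (\gamma-1)v_x^2 & \gamma v_x & v_x\gamma'(\phi)e \\
-\phi v_x & \phi & 0 & v_x
\end{pmatrix},
\]
where e is the specific internal energy. The ratio of specific heats γ = γ(φ) depends now on the composition of the mixture through the relation:
\[
\gamma = \frac{c_{p1}\phi + c_{p2}(1-\phi)}{c_{v1}\phi + c_{v2}(1-\phi)},
\]
where $c_{pi}$ and $c_{vi}$ are, respectively, the specific heats at constant pressure and constant volume for the i-th gas component, i = 1, 2. The eigenvalues of the Jacobian matrix are $\lambda_1 = v_x - c$, $\lambda_2 = \lambda_3 = v_x$, $\lambda_4 = v_x + c$, the corresponding right eigenvectors are
\[
\begin{aligned}
r_1 &= [1,\; v_x - c,\; H - v_x c,\; \phi]^T, \\
r_2 &= \left[1,\; v_x,\; \tfrac{1}{2}v_x^2,\; \phi\right]^T, \\
r_3 &= \left[0,\; 0,\; -\frac{\gamma'(\phi)e}{\gamma-1},\; 1\right]^T, \\
r_4 &= [1,\; v_x + c,\; H + v_x c,\; \phi]^T,
\end{aligned}
\]
and the (normalized) left eigenvectors are:
\[
\begin{aligned}
l_1 &= \left[\beta_2 + \frac{v_x}{2c} - \phi\beta_3,\; -\beta_1 v_x - \frac{1}{2c},\; \beta_1,\; \beta_3\right], \\
l_2 &= [1 - 2\beta_2 + 2\phi\beta_3,\; 2\beta_1 v_x,\; -2\beta_1,\; -2\beta_3], \\
l_3 &= [-\phi,\; 0,\; 0,\; 1], \\
l_4 &= \left[\beta_2 - \frac{v_x}{2c} - \phi\beta_3,\; -\beta_1 v_x + \frac{1}{2c},\; \beta_1,\; \beta_3\right],
\end{aligned}
\]
where
\[
\beta_1 = \frac{\gamma-1}{2c^2}, \qquad
\beta_2 = \beta_1\frac{v_x^2}{2}, \qquad
\beta_3 = \frac{\gamma'(\phi)}{2\gamma(\gamma-1)}.
\]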
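The quantities γ(φ) and γ'(φ) entering the Jacobian and the eigenvectors follow directly from the mixture relation. The sketch below, in Python, evaluates both; the closed form of the derivative is obtained by differentiating the quotient, and the specific heat values used are illustrative assumptions, not data from this work.

```python
def gamma_mixture(phi, cp1, cv1, cp2, cv2):
    """Ratio of specific heats of the mixture for mass fraction phi of gas 1."""
    return (cp1 * phi + cp2 * (1.0 - phi)) / (cv1 * phi + cv2 * (1.0 - phi))


def dgamma_dphi(phi, cp1, cv1, cp2, cv2):
    """Derivative of gamma with respect to phi (quotient rule, simplified)."""
    denominator = cv1 * phi + cv2 * (1.0 - phi)
    return (cp1 * cv2 - cp2 * cv1) / denominator**2


def beta3(phi, cp1, cv1, cp2, cv2):
    """Coefficient beta_3 = gamma'(phi) / (2*gamma*(gamma - 1))."""
    g = gamma_mixture(phi, cp1, cv1, cp2, cv2)
    return dgamma_dphi(phi, cp1, cv1, cp2, cv2) / (2.0 * g * (g - 1.0))


# Illustrative specific heats (arbitrary consistent units) for two gases.
heats = dict(cp1=1.005, cv1=0.718, cp2=5.19, cv2=3.12)
gamma_half = gamma_mixture(0.5, **heats)
slope_half = dgamma_dphi(0.5, **heats)
```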


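The multi-component Euler equations in 2D read:
\[
\begin{pmatrix} \rho \\ \rho v_x \\ \rho v_y \\ E \\ \rho\phi \end{pmatrix}_t
+
\begin{pmatrix} \rho v_x \\ \rho v_x^2 + p \\ \rho v_x v_y \\ v_x(E+p) \\ \rho\phi v_x \end{pmatrix}_x
+
\begin{pmatrix} \rho v_y \\ \rho v_x v_y \\ \rho v_y^2 + p \\ v_y(E+p) \\ \rho\phi v_y \end{pmatrix}_y
= 0.
\tag{2.43}
\]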

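The analysis made in the one-dimensional case for the Jacobian matrices can be repeated here for
\[
f(u) = \begin{pmatrix} \rho v_x \\ \rho v_x^2 + p \\ \rho v_x v_y \\ v_x(E+p) \\ \rho\phi v_x \end{pmatrix}
\qquad\text{and}\qquad
g(u) = \begin{pmatrix} \rho v_y \\ \rho v_x v_y \\ \rho v_y^2 + p \\ v_y(E+p) \\ \rho\phi v_y \end{pmatrix}.
\]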

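We simply state here the eigenstructure of $f'(u)$. By interchanging $v_x$ and $v_y$ and the second and third components of each left and right eigenvector, the eigenstructure of $g'(u)$ is obtained. See [118] and references therein for further details. The eigenvalues of $f'(u)$ are $\lambda_1 = v_x - c$, $\lambda_{2,3,4} = v_x$, $\lambda_5 = v_x + c$, and the corresponding right eigenvectors $r_i$ and left eigenvectors $l_i$, normalized so that $r_i \cdot l_j = \delta_{ij}$, are
\[
\begin{aligned}
r_1 &= \begin{bmatrix} 1 & v_x - c & v_y & H - v_x c & \phi \end{bmatrix}^T, \\
r_2 &= \begin{bmatrix} 1 & v_x & v_y & \frac{v_x^2+v_y^2}{2} & \phi \end{bmatrix}^T, \\
r_3 &= \begin{bmatrix} 0 & 0 & 1 & v_y & 0 \end{bmatrix}^T, \\
r_4 &= \begin{bmatrix} 0 & 0 & 0 & -\frac{\gamma'(\phi)e}{\gamma-1} & 1 \end{bmatrix}^T, \\
r_5 &= \begin{bmatrix} 1 & v_x + c & v_y & H + v_x c & \phi \end{bmatrix}^T, \\
l_1 &= \begin{bmatrix} \beta_2 + \frac{v_x}{2c} - \phi\beta_3 & -\beta_1 v_x - \frac{1}{2c} & -\beta_1 v_y & \beta_1 & \beta_3 \end{bmatrix}, \\
l_2 &= \begin{bmatrix} 1 - 2\beta_2 + 2\phi\beta_3 & 2\beta_1 v_x & 2\beta_1 v_y & -2\beta_1 & -2\beta_3 \end{bmatrix}, \\
l_3 &= \begin{bmatrix} -v_y & 0 & 1 & 0 & 0 \end{bmatrix}, \\
l_4 &= \begin{bmatrix} -\phi & 0 & 0 & 0 & 1 \end{bmatrix}, \\
l_5 &= \begin{bmatrix} \beta_2 - \frac{v_x}{2c} - \phi\beta_3 & -\beta_1 v_x + \frac{1}{2c} & -\beta_1 v_y & \beta_1 & \beta_3 \end{bmatrix},
\end{aligned}
\tag{2.44}
\]
where
\[
\beta_1 = \frac{\gamma-1}{2c^2}, \qquad
\beta_2 = \beta_1\frac{v_x^2+v_y^2}{2}, \qquad
\beta_3 = \frac{\gamma'(\phi)}{2\gamma(\gamma-1)},
\]
and H is the enthalpy, $H = \frac{c^2}{\gamma(\phi)-1} + \frac{1}{2}(v_x^2+v_y^2)$. As a final remark, it can be shown that for both the one- and the two-dimensional case, the left and right eigenvectors can be written in terms of c, φ and the velocity components only, i.e., one variable less than the number of conserved variables.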

3 Numerical methods for fluid dynamics

In chapter 2 we have presented the main features of partial differential equations related to fluid dynamics, and in particular hyperbolic conservation laws. Such equations are in general impossible to solve analytically, except in some trivial cases, like the linear advection equation presented in section 2.3.1. Numerical methods aim to obtain a discrete approximation of the true solution, which often suffices for practical applications. In this chapter we briefly review the main notions and results related to numerical methods for hyperbolic systems of conservation laws. We will center our description on one-dimensional scalar equations, with some notions on the application to one-dimensional systems. The ideas introduced here will be exploited in further chapters, where we will move the discussion to finite-difference methods for nonlinear systems of conservation laws in more dimensions. We will focus on the main facts that are useful for the particular class of numerical methods considered in this work, and the review is by no means intended to be comprehensive. Most of the concepts introduced in this chapter are explained in more detail and in a wider context in any basic textbook on the numerical solution of hyperbolic PDEs, for example the books of LeVeque [104, 105], Toro [175] and Hirsch [70, 71].

3.1 Discretization

To numerically solve partial differential equations, the continuous equations are replaced by a discrete representation of them. This is achieved by first discretizing the domain of definition of the PDE by means of a grid, to be introduced below; then the PDE is discretized on the grid, and the resulting discrete, finite-dimensional problem is solved. There are two major discretization strategies used in Computational Fluid Dynamics:
• Point-value discretization, where the discrete values correspond to the pointwise values of the unknown variables at the nodes of the grid, and
• Cell-average discretization, where the discrete values represent the average value of the variables at the cells of the grid.
Consider a scalar conservation law in one space dimension
\[
u_t + f(u)_x = 0, \qquad (x,t) \in \mathbb{R}\times\mathbb{R}^+,
\tag{3.1}
\]
where $u, f : \mathbb{R} \longrightarrow \mathbb{R}$, with initial data given by
\[
u(x,0) = u_0(x), \qquad x \in \mathbb{R}.
\]

Let us introduce the aforementioned concepts for this case. Consider a discrete subset of points of R, defined by nodes $\{x_j\}_{j\in\mathbb{Z}}$. We assume that the nodes are ordered, i.e., $x_j < x_{j+1}$ for all j and, from the points $\{x_j\}$, we define a set of cells by:
\[
c_j = \left[x_j - \frac{x_j - x_{j-1}}{2},\; x_j + \frac{x_{j+1} - x_j}{2}\right].
\tag{3.2}
\]
A grid is defined, depending on the context, to be either the set $\{c_j\}_{j\in\mathbb{Z}}$ or the set $\{x_j\}_{j\in\mathbb{Z}}$.

Figure 3.1: Nodes and cells of grids defined on R. A non-uniform grid (top) and a uniform grid (bottom).

For simplicity, we further assume that the grid is uniform, i.e., $x_j - x_{j-1} = \Delta x$ with Δx a positive constant. This constant is called mesh (or grid) size. For convenience we will use discrete grids indexed with the following convention:
\[
x_j = \left(j + \frac{1}{2}\right)\Delta x.
\tag{3.3}
\]
Sometimes we will use non-integer indexes to indicate points that do not correspond to nodes. For example the point $x_{j+\frac{1}{2}}$ represents the point (j + 1)Δx. Given nodes $\{x_j\}_{j\in\mathbb{Z}}$, under the above assumptions, formula (3.2) reduces to
\[
c_j = \left[x_j - \frac{\Delta x}{2},\; x_j + \frac{\Delta x}{2}\right] = [x_{j-\frac{1}{2}}, x_{j+\frac{1}{2}}] = [j\Delta x, (j+1)\Delta x],
\]

so that each cell is in this case a subinterval whose center is $x_j$. In Fig. 3.1 two discretizations of R are shown, for a uniform (bottom plot) and a non-uniform grid (top plot). The time variable is discretized by defining points in time $\{t^n\}_{n\in\mathbb{N}}$, with $t^n < t^{n+1}$. If $t^{n+1} - t^n$ is constant with respect to n, we denote it by Δt and call it the time increment. We will denote by $U^n = \{U_j^n\}_{j\in\mathbb{Z}}$ the computed approximation to the exact solution $u(x_j, t^n)$ of (3.1). We can also interpret the numerical solution as an approximation to the cell-average of the exact solution, defined as:
\[
\bar{u}_j^n = \frac{1}{\Delta x}\int_{x_{j-\frac{1}{2}}}^{x_{j+\frac{1}{2}}} u(x,t^n)\,dx.
\tag{3.4}
\]
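The two discretization strategies can be compared on a small uniform grid. The following Python sketch builds the nodes (3.3) and the cell edges on [0, 1], and computes point values and exact cell averages of u(x) = x². The function names, the choice of u and the use of a known antiderivative (instead of numerical quadrature) are assumptions of this example.

```python
def uniform_grid(n_cells, length=1.0):
    """Nodes x_j = (j + 1/2)*dx and cell edges x_{j-1/2} = j*dx on [0, length]."""
    dx = length / n_cells
    nodes = [(j + 0.5) * dx for j in range(n_cells)]
    edges = [j * dx for j in range(n_cells + 1)]
    return dx, nodes, edges


def cell_averages(antiderivative, edges):
    """Exact cell averages (3.4) of a function with known antiderivative."""
    return [(antiderivative(b) - antiderivative(a)) / (b - a)
            for a, b in zip(edges[:-1], edges[1:])]


dx, nodes, edges = uniform_grid(4)
point_values = [x**2 for x in nodes]                     # point-value discretization
averages = cell_averages(lambda x: x**3 / 3.0, edges)    # cell-average discretization
differences = [a - p for a, p in zip(averages, point_values)]
```

For this particular u the difference between the cell average and the point value at the cell center equals Δx²/12 in every cell, which illustrates that both discretizations agree up to second order for smooth data.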

In practical implementations the grid has to be restricted to a finite number of nodes or cells, or equivalently, the domain of definition of the equations has to be restricted to a bounded subset of R and a finite time interval. In particular we will consider an interval I = [0, 1] and a fixed time T > 0. We then take positive numbers N and M and a set of nodes {xj }0≤j [...] 0 in which we are interested to know the solution of the equations, and we assume that the ratio between the temporal and spatial step sizes, $\frac{\Delta t}{\Delta x}$, is constant, so it suffices to consider limits when one of the step sizes vanishes.

Definition 4. Let $\{G_k\}_{k=0}^{+\infty}$ be a sequence of grids with corresponding grid sizes $\Delta x_k$, verifying
\[
\lim_{k\to+\infty} \Delta x_k = 0.
\tag{3.6}
\]

Given a (discrete) norm $\|\cdot\|$ we say that a sequence of numerical solutions $\{U_{G_k}\}_{k=0}^{+\infty}$, associated with the numerical scheme H and the grids $\{G_k\}_k$, and corresponding to a fixed time T, converges to a function u in the norm $\|\cdot\|$ if
\[
\lim_{k\to+\infty} \|U_{G_k} - u_{G_k}\| = 0,
\]
where $u_{G_k}$ represents the discretization of the function u on the grid $G_k$. A numerical method is said to be convergent if any sequence of numerical solutions obtained through it on a sequence of grids verifying (3.6) converges to the true solution of the equation. Often used norms are the discrete $L^p$ norms
\[
\|v\|_p = \left(\Delta x \sum_{j\in\mathbb{Z}} |v_j|^p\right)^{\frac{1}{p}},
\]
and the discrete $L^\infty$ norm
\[
\|v\|_\infty = \max_{j\in\mathbb{Z}} |v_j|.
\]

Note that the concept of convergence is strongly dependent on the norm. Sequences of numerical solutions (and hence numerical methods) can converge in one norm but not in another. In this work we will almost exclusively consider the L1 and L2 norms. It is in general very difficult to show that a given numerical method is convergent in a given norm. The way in which one usually studies convergence is through the concepts of consistency and stability, making use of the Lax equivalence theorem. We briefly describe these concepts in the next sections.
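The dependence of convergence on the norm can be seen with a very small experiment. The Python sketch below implements the discrete norms defined above and applies them to a hypothetical error vector that equals one at a single node and zero elsewhere, on grids of N cells covering [0, 1]. The function names and this error model are assumptions of the example.

```python
def lp_norm(v, dx, p):
    """Discrete L^p norm: (dx * sum |v_j|^p)^(1/p)."""
    return (dx * sum(abs(x)**p for x in v)) ** (1.0 / p)


def linf_norm(v):
    """Discrete L^infinity norm: max |v_j|."""
    return max(abs(x) for x in v)


def spike_error(n_cells):
    """Hypothetical error vector: one at a single node, zero elsewhere."""
    error = [0.0] * n_cells
    error[n_cells // 2] = 1.0
    return error


sizes = (10, 100, 1000)
l1_errors = [lp_norm(spike_error(n), 1.0 / n, 1) for n in sizes]
l2_errors = [lp_norm(spike_error(n), 1.0 / n, 2) for n in sizes]
linf_errors = [linf_norm(spike_error(n)) for n in sizes]
```

Here the $L^1$ and $L^2$ errors decrease as the grid is refined, while the $L^\infty$ error stays equal to one, so this sequence converges in the first two norms but not in the third.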

3.2.1 Consistency

Consistency deals with the investigation of how a numerical method behaves locally, i.e. in a single time step. To prove consistency it suffices to show that the error produced by a single application of the numerical method (local truncation error, defined below) vanishes as Δt approaches zero. This is often an easy task, which is accomplished by simply using Taylor series expansions.

Definition 5. Given a one-step numerical method $U^{n+1} = H_{\Delta t}(U^n)$ the local truncation error is defined as:
\[
L^n_{\Delta t} = \frac{1}{\Delta t}\left(H_{\Delta t}(u^n) - u^{n+1}\right),
\tag{3.7}
\]
where $u^n$ is the discretization of the true solution of the PDE.

Definition 6. A one-step numerical method $U^{n+1} = H_{\Delta t}(U^n)$ is consistent in the norm $\|\cdot\|$ if
\[
\lim_{\Delta t\to 0} \|L_{\Delta t}(\cdot, t)\| = 0,
\]
provided u is a smooth function satisfying the PDE. We say that the order of the method is p if $L_{\Delta t}(\cdot, t) = O(\Delta t^p)$.


3.2.2 Stability

Consistency gives information about the accuracy of the numerical method in a single time step. We expect that the faster the local truncation error vanishes, the better the numerical approximations obtained through the numerical method are. This is true provided
• The method is convergent, and
• The solution u is sufficiently smooth.
In this section we deal with the first of these requirements. Consider a numerical method of the form (3.5). If we perform n steps of the method starting with given initial data $U^0$ we obtain an approximation $H^n(U^0)$ to the true solution $u^n$, with an error given by $E^n = H^n(U^0) - u^n$. If we advance the solution one step further we get
\[
\begin{aligned}
E^{n+1} &= H^{n+1}(U^0) - u^{n+1} = H(H^n(U^0)) - u^{n+1} \\
&= H(E^n + u^n) - u^{n+1} \\
&= \left(H(E^n + u^n) - H(u^n)\right) + H(u^n) - u^{n+1} \\
&= \left(H(E^n + u^n) - H(u^n)\right) + \Delta t\, L^n_{\Delta t}.
\end{aligned}
\]

Convergence means that the error vanishes in a given norm as Δt does. For a consistent method the term $\|\Delta t\, L^n_{\Delta t}\|$ will vanish by definition. Stability concerns the other term, which describes the effect that the errors have on the numerical method. Typically one wants to bound $\|E^n\|$ by some quantity that can be forced to vanish as Δt vanishes, for example the discretization error of the initial data, given by $E^0 = \|U^0 - u^0\|$.

Classical stability theory is applicable to linear methods using the 2-norm. It exploits the properties of the Fourier transform, in particular Parseval's identity, which relates the norm of the numerical solution to that of its Fourier transform. This theory is known as von Neumann stability analysis, and is widely used when studying the stability of linear methods for partial differential equations (see e.g. [70, 167]). Unfortunately efficient methods for hyperbolic conservation laws are nonlinear, and, on the other hand, the theory can only be applied to the linearized equations.


Stability can be shown, for example, for methods that verify, for some k ≥ 0:
\[
\|H(v) - H(w)\| \le (1 + k\Delta t)\|v - w\|,
\tag{3.8}
\]
for a certain norm. If (3.8) can be stated with k = 0 the method is called contractive in the norm $\|\cdot\|$:
\[
\|H(v) - H(w)\| \le \|v - w\|.
\]
For linear operators the condition of being contractive reduces to the fact that the powers of the operator H are uniformly bounded by a constant:
\[
\|H^n\| \le C, \qquad \forall n \in \mathbb{N}.
\]
For nonlinear methods it is very difficult to show that a method is contractive. A similar, but simpler argument is based on the total variation of the numerical solutions, defined as
\[
TV(U^n) = \sum_{j\in\mathbb{Z}} |U_j^n - U_{j-1}^n|.
\]
Boundedness of the total variation of the numerical solutions can be used to show stability by means of compactness arguments in normed functional spaces [147]. Total variation diminishing (TVD) (and more generally total variation bounded, TVB) methods exploit this idea. A method is said to be TVD if
\[
TV(U^{n+1}) \le TV(U^n),
\tag{3.9}
\]
and TVB if there exists a constant C independent of n such that $TV(U^n) \le C$. Monotonicity is also useful for stability, since monotone methods are contractive. A method of the form (3.5) is monotone if
\[
\frac{\partial H}{\partial U_j^n} \ge 0, \qquad \forall j.
\]
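The total variation defined above is straightforward to evaluate for a finite vector of data. The following Python sketch does so; the function name and the sample vectors are assumptions of this example, and the sum is restricted to the available indexes.

```python
def total_variation(u):
    """Total variation of a finite sequence: sum of |u_j - u_{j-1}|."""
    return sum(abs(u[j] - u[j - 1]) for j in range(1, len(u)))


step = [0.0, 0.0, 1.0, 1.0]           # a single jump
smeared = [0.0, 0.25, 0.75, 1.0]      # the same jump, smeared but monotone
oscillatory = [0.0, 1.2, -0.2, 1.0]   # overshoot and undershoot around the jump

tv_step = total_variation(step)
tv_smeared = total_variation(smeared)
tv_oscillatory = total_variation(oscillatory)
```

Smearing a jump monotonically leaves the total variation unchanged, whereas spurious oscillations increase it, which is why the TVD property (3.9) rules out the creation of new oscillations.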

A necessary condition for stability is the Courant-Friedrichs-Lewy (CFL) condition [40]. For example, a method can only be TVD if it satisfies this condition. The CFL condition relies on the hyperbolic nature of the equations. In these equations the information is carried along characteristics, so the physical solution at a given point (x, t) depends on the values of the initial data on a bounded domain, because no characteristic outside this domain can reach the point x at time t. More precisely, the domain of dependence of the point (x, t) is defined as the set of points corresponding to time t = 0 that completely determine the solution of problem (2.1) with initial data (2.3) at (x, t). By the above comments, the domain of dependence of a point is always a bounded set.

Given a numerical method of the form (3.5), the numerical solution $U_j^{n+1}$ depends on a finite number of components of $U^n$, say $\{U_\ell^n\}_{\ell\in\mathcal{L}}$, where $\mathcal{L}$ is a given finite set of indexes. By tracing back this idea, the numerical solution at a point $(x_j, t^n)$ for a given numerical method is fully determined by a finite set of points located at time t = 0. The convex hull of this set is called the numerical domain of dependence of the point $(x_j, t^n)$. The CFL condition states that the numerical domain of dependence of any point has to contain its domain of dependence. This condition is quite intuitive, since it states that the numerical method has to be able to take into account the information coming from any point that is actually influencing the solution at the next time step. The CFL condition is often expressed as an upper bound on some ratio between Δt and Δx.

As an example consider a Cauchy problem for the advection equation (2.19). From section 2.3.1 we know that the solution of this problem with initial data $u(x, 0) = u_0(x)$ is given by $u(x, t) = u_0(x - at)$. This means that the domain of dependence of the point (x, t) is the point x − at. In order to compute the correct solution, the numerical method should contain this point in its numerical domain of dependence. Note that, because a is constant, the domain of dependence of any point (x, t) is always located on the same side of x, at the left or at the right depending on the sign of a.
More generally, this idea can be exported to nonlinear systems in the sense that the system can be locally linearized, and this linear system is equivalent to a decoupled system of advection equations, as explained in section 2.3.2. Therefore, to perform a single time step, the relevant information that has to be taken into account by the method comes from one direction, given by the local sign of the eigenvalues of the linearized equations. This idea of following the direction where the characteristic information comes from is called upwinding. Methods that use this direction to decide the set of points that take part in the computation of the solution from one time step to the next are called upwind methods.

High resolution numerical schemes for nonlinear systems can rarely be shown to have any of the above properties (monotonicity, contractiveness or total variation boundedness), so stability cannot be ensured for them. These methods are constructed, however, around the ideas behind TVD and monotone methods, and all of them have the CFL condition as a requirement.
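The role of the CFL condition and of upwinding can be illustrated with the simplest upwind method for the advection equation with a > 0, which uses only the node to the left. The Python sketch below is a minimal example on a periodic grid; the scheme, the function names and the square pulse used as data are assumptions of this example and not the methods studied in this work. The parameter `nu` stands for the ratio aΔt/Δx.

```python
def upwind_step(u, nu):
    """One step of first-order upwind for u_t + a u_x = 0 with a > 0.

    nu = a*dt/dx; the grid is periodic, since u[j - 1] wraps around for j = 0.
    """
    return [u[j] - nu * (u[j] - u[j - 1]) for j in range(len(u))]


def total_variation_periodic(u):
    """Total variation on a periodic grid (includes the wrap-around jump)."""
    return sum(abs(u[j] - u[j - 1]) for j in range(len(u)))


pulse = [1.0 if 5 <= j < 10 else 0.0 for j in range(20)]
tv_initial = total_variation_periodic(pulse)

# nu = 0.5 satisfies the CFL condition nu <= 1 of this scheme.
stable = pulse
for _ in range(10):
    stable = upwind_step(stable, 0.5)
tv_stable = total_variation_periodic(stable)

# nu = 1.5 violates it: the point x - a*dt lies outside the stencil.
unstable = upwind_step(pulse, 1.5)
tv_unstable = total_variation_periodic(unstable)
```

For ν ≤ 1 each new value is a convex combination of two old values, so the total variation cannot grow; for ν = 1.5 a single step already creates an overshoot and an undershoot.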

3.2.3 The Lax equivalence theorem

The most important result relating the concepts of stability, consistency and convergence is the Lax equivalence theorem, reformulated from [102] (see e.g. [147, 167]). It can be stated as follows:

Theorem 1. For a consistent one-step linear scheme for a Cauchy problem of a well-posed linear PDE, stability is a necessary and sufficient condition for convergence.

It is in general easier to check consistency and stability than convergence. Checking consistency is often trivial, since the quantities that appear in the equation are replaced by discrete approximations of them, which produces consistent methods almost automatically. For conservative methods the approach is slightly different, but both consistency and the order of accuracy of the method are obtained from the procedure used in the reconstruction (see section 3.4 below). However, practical numerical methods are not linear, and the theorem cannot be applied directly. Moreover, stability is in general much harder to prove for nonlinear methods and equations.

3.3 Elementary methods

Most elementary numerical methods to solve (2.1) are based on the substitution of partial derivatives by consistent finite-difference approximations. One such method is the Lax-Friedrichs method, based on the following substitutions:
\[
u_t(x,t) \approx \frac{1}{\Delta t}\left(u(x, t+\Delta t) - \frac{u(x+\Delta x, t) + u(x-\Delta x, t)}{2}\right),
\]
\[
f(u)_x \approx \frac{1}{2\Delta x}\left(f(u(x+\Delta x, t)) - f(u(x-\Delta x, t))\right).
\]


With the notation introduced in section 3.1 the method can be written in the form
\[
U_j^{n+1} = \frac{U_{j+1}^n + U_{j-1}^n}{2} - \frac{\Delta t}{2\Delta x}\left(f(U_{j+1}^n) - f(U_{j-1}^n)\right).
\tag{3.10}
\]
The time derivative has been substituted by a first order approximation, while the space derivative has been approximated by a second order finite-difference approximation. Therefore, the method is first order accurate in time and second order in space. Because of stability restrictions, expressed by the CFL condition, the method is globally first-order accurate.

First order methods suffer from diffusion. The solution is strongly smeared near a discontinuity or a zone with high variation. This is due to the large amount of numerical dissipation added to the solution by the scheme, which is in fact approximating an advection-diffusion equation.

Elementary methods that are second (and higher) order accurate in both time and space can be obtained in several ways. Classical methods are the methods of Lax and Wendroff [96], Beam and Warming [186], MacCormack [115] or the Leapfrog method. These methods are not efficient in general when the solution is not smooth (in particular, none of them is monotone or TVD). The typical behavior of linear second order methods is that spurious oscillations are produced near the discontinuities, and these oscillations do not decrease as the grid size does. This is due in most cases to the lack of numerical dissipation in the solution, and in most cases they can be seen as schemes approximating dispersive equations without diffusive terms. However, these methods were the basis for the development of other second order methods, obtained by modifications of the former, some of which are still used nowadays in practice for particular applications (see e.g. [176], [79]).
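Formula (3.10) can be transcribed almost literally. The following Python sketch applies it to Burgers' flux f(u) = u²/2 on a periodic grid; the periodic boundary treatment, the function names, the initial pulse and the ratio Δt/Δx = 0.5 are assumptions of this example.

```python
def lax_friedrichs_step(u, flux, dt, dx):
    """One step of the Lax-Friedrichs method (3.10) on a periodic grid."""
    n = len(u)
    f = [flux(v) for v in u]
    return [0.5 * (u[(j + 1) % n] + u[j - 1])
            - dt / (2.0 * dx) * (f[(j + 1) % n] - f[j - 1])
            for j in range(n)]


def burgers_flux(u):
    """Flux of the inviscid Burgers equation."""
    return 0.5 * u * u


n_cells = 40
dx = 1.0 / n_cells
dt = 0.5 * dx   # max|f'(u)|*dt/dx = 0.5 for data in [0, 1]
u0 = [1.0 if 10 <= j < 20 else 0.0 for j in range(n_cells)]

u = u0
for _ in range(20):
    u = lax_friedrichs_step(u, burgers_flux, dt, dx)
```

After a few steps the initially sharp pulse is visibly smeared, in line with the diffusive behavior described above, while the sum of the values and the bounds of the data are preserved.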

3.4 Conservative methods

The methods described in section 3.3 are based on the assumption that the partial derivatives can be accurately approximated by finite differences. This is true at the points where the flow solution u(x, t) is smooth with respect to (x, t). In fact the Lax-Friedrichs and Lax-Wendroff methods will converge to the right solution as Δx and Δt tend to zero provided the solution is smooth for all (x, t).


If some singularity is present in the flow solution, the argument is no longer valid. As stated in section 2.2, more than one weak solution can exist in this situation and there is no reason to suppose that the method will converge to the right one. Moreover, the method can converge to a function that is not a weak solution of the PDE. Simple illustrative examples can be found e.g. in [104]. Conservative methods ensure that, if the method converges, the limit is a weak solution. This result, known as the Lax-Wendroff theorem, will be stated at the end of the section. We start by introducing conservative methods and their main properties.

Definition 7. A numerical method $U^{n+1} = H(U^n)$ is said to be conservative if there exists a function $\hat{f} : \mathbb{R}^{p+q} \to \mathbb{R}$ such that
\[
H(U^n)_j = U_j^n - \frac{\Delta t}{\Delta x}\left(\hat{f}(U_{j-p+1}^n, \dots, U_{j+q}^n) - \hat{f}(U_{j-p}^n, \dots, U_{j+q-1}^n)\right),
\tag{3.11}
\]

for some nonnegative integers p and q. The function $\hat{f}$ is called the numerical flux function.

Conservative methods aim to reproduce at a discrete level the conservation of the physical variables in the continuous equations. In fact (3.11) can be seen as a discrete version of the integral form (2.2) of the PDE. As we will see, convergent solutions computed by conservative methods respect the integral form, in the sense that they are discrete approximations to weak solutions of the equations.

Definition 8. Given a conservative numerical method we say that its numerical flux function is consistent with the conservation law if $\hat{f}(U, \dots, U) = f(U)$.

In order to ensure some smoothness in the way in which $\hat{f}$ approaches a certain value f(U), as its arguments tend to U, we will suppose in general that the flux function is locally Lipschitz continuous in each variable¹. Consistency of the numerical flux is a natural requirement. It is necessary to ensure that a discrete form of conservation, analogous to the conservation law, is provided by conservative methods and, in fact, it is equivalent to the consistency of the scheme itself. We are ready to introduce the main result concerning conservative methods.

¹ A function f defined on a normed space M is said to be locally Lipschitz continuous at a point x ∈ M if there exists a constant K and a neighborhood N(x) of x such that $\|f(y) - f(x)\| \le K\|y - x\|$, ∀y ∈ N(x).


Theorem 2. (Lax-Wendroff, [103]) Let $\{G_k\}_{k=0}^{+\infty}$ be a sequence of grids with corresponding grid sizes $(\Delta x_k, \Delta t_k)$, verifying
\[
\lim_{k\to+\infty} \Delta x_k = 0, \qquad \lim_{k\to+\infty} \Delta t_k = 0.
\]
Let $\{U_{G_k}\}_{k=0}^{+\infty}$ be the sequence of numerical solutions obtained by a conservative numerical method, consistent with (2.1), on the grids $\{G_k\}_k$. If $U_{G_k}$ converges to a function u(x, t), then u is a weak solution of (2.1).

The definition of convergence stated in the theorem has received much attention. The original definition used in the work of Lax and Wendroff [103] has been relaxed and extended to more general grids, see e.g. [89, 47]. This theorem is the main reason why so much attention has been focused on conservative schemes. However, there is also interest in nonconservative schemes, see e.g. [83, 32]. See also [72, 109] for discussions on both methodologies.
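Definitions 7 and 8 can be sketched in code for the case p = q = 1, where the numerical flux at the interface $x_{j+\frac{1}{2}}$ depends only on $U_j$ and $U_{j+1}$. The Python example below uses the Lax-Friedrichs method written in conservative form as numerical flux; the periodic grid, the function names and the data are assumptions of this example.

```python
def conservative_step(u, numerical_flux, dt, dx):
    """One step of a conservative method (3.11) with p = q = 1, periodic grid."""
    n = len(u)
    # fhat[j] approximates the flux through the interface x_{j+1/2}.
    fhat = [numerical_flux(u[j], u[(j + 1) % n]) for j in range(n)]
    return [u[j] - dt / dx * (fhat[j] - fhat[j - 1]) for j in range(n)]


def make_lax_friedrichs_flux(flux, dt, dx):
    """Two-point numerical flux that reproduces the Lax-Friedrichs method."""
    def fhat(left, right):
        return 0.5 * (flux(left) + flux(right)) - 0.5 * dx / dt * (right - left)
    return fhat


def burgers_flux(u):
    return 0.5 * u * u


dx, dt = 0.05, 0.02
fhat = make_lax_friedrichs_flux(burgers_flux, dt, dx)
u0 = [1.0 if 4 <= j < 9 else 0.2 for j in range(20)]
u1 = conservative_step(u0, fhat, dt, dx)

consistency_gap = abs(fhat(0.3, 0.3) - burgers_flux(0.3))   # Definition 8
mass_change = abs(sum(u1) - sum(u0))                        # discrete conservation
```

Because every interface flux is added to one cell and subtracted from its neighbor, the sum of the values changes only through the boundary, and not at all on a periodic grid.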

3.5 High resolution conservative methods

One of the first methods that combines all the ingredients discussed up to now is Godunov's method [57]. It is based on the exact solution of a Cauchy problem located at each cell interface, assuming that the solution is constant at each side of the interface and takes the cell-average values of the numerical solution corresponding to the cells at the left and right of the interface as initial data, i.e., for a given time step $t^n$ it finds, for each j, the exact solution at time $t^{n+1}$ of (3.1) with initial data given by
\[
u(x, t^n) = \begin{cases}
U_j^n & \text{if } x_{j-\frac{1}{2}} < x \le x_{j+\frac{1}{2}}, \\
U_{j+1}^n & \text{if } x_{j+\frac{1}{2}} < x \le x_{j+\frac{3}{2}}.
\end{cases}
\]

This local problem is called a Riemann problem. Due to the finite speed of propagation of information along characteristics, solutions of Riemann problems corresponding to adjacent cell interfaces will not interact for a short enough time. Once these Riemann problems are solved, the solution is averaged on each cell to pose a new Riemann problem for the next time step. Godunov's method is explained in more detail in section 3.6.

The idea of solving Riemann problems forward in time is at the basis of modern high-resolution shock-capturing methods. Godunov's method is, however, first order accurate. The order of accuracy can be increased by solving initial-value problems where the piecewise constant approximation used in Godunov's method is replaced by a more accurate approximation, for example piecewise linear data. One could solve (3.1) with initial data of the form
\[
u(x, t^n) = \begin{cases}
U_j^n + \sigma_j(x - x_j) & \text{if } x_{j-\frac{1}{2}} < x \le x_{j+\frac{1}{2}}, \\
U_{j+1}^n + \sigma_{j+1}(x - x_{j+1}) & \text{if } x_{j+\frac{1}{2}} < x \le x_{j+\frac{3}{2}},
\end{cases}
\tag{3.12}
\]

where $\sigma_j$ is a slope computed on the j-th cell from the data $U^n$. This gives a method that is second order accurate in space. Unfortunately, Cauchy problems with initial data that are not piecewise constant cannot be analytically solved in general, and stability properties such as TVD or monotonicity cannot be ensured anymore for an arbitrary choice of the slopes $\sigma_j$. For some choices of the slopes $\sigma_j$ this method will have the same bad behavior as the second order methods already studied, such as Lax-Wendroff and Beam-Warming. In fact, those methods can be obtained from particular choices of the slopes (see [105]). In order to control the total variation of the method, limiters are applied to the slopes. Almost any limiter is based on the minmod limiter due to Van Leer [181]:
\[
\sigma_j = \frac{1}{\Delta x}\,\mathrm{minmod}(U_{j+1}^n - U_j^n,\; U_j^n - U_{j-1}^n),
\]
where
\[
\mathrm{minmod}(u, v) = \begin{cases}
u & \text{if } |u| < |v| \text{ and } u \cdot v > 0, \\
v & \text{if } |v| \le |u| \text{ and } u \cdot v > 0, \\
0 & \text{if } u \cdot v \le 0.
\end{cases}
\]
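The minmod function and the limited slopes can be written in a few lines. The following Python sketch follows the definition above; setting the slope to zero in the first and last cells is a simplifying assumption of this example, made only to avoid discussing boundary conditions.

```python
def minmod(u, v):
    """Return the argument of smaller modulus if both have the same sign, else 0."""
    if u * v <= 0.0:
        return 0.0
    return u if abs(u) < abs(v) else v


def minmod_slopes(U, dx):
    """Limited slopes sigma_j for interior cells; boundary slopes are set to zero."""
    slopes = [0.0] * len(U)
    for j in range(1, len(U) - 1):
        slopes[j] = minmod(U[j + 1] - U[j], U[j] - U[j - 1]) / dx
    return slopes


linear_slopes = minmod_slopes([0.0, 1.0, 2.0, 3.0], 1.0)
step_slopes = minmod_slopes([0.0, 0.0, 1.0, 1.0], 1.0)
extremum_slopes = minmod_slopes([0.0, 1.0, 0.0], 1.0)
```

For linear data the exact slope is recovered in the interior cells, while next to a jump and at a local extremum the slope is set to zero, which locally reduces the reconstruction to the piecewise constant one of Godunov's method.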

A common practice to construct numerical methods with order of accuracy higher than one and suitable for nonlinear systems is to use piecewise constant initial data obtained by a higher order reconstruction at the cell interfaces. From the numerical solution at a given time step one reconstructs, by a certain interpolation or approximation procedure, two values $U^L_{j+\frac{1}{2}}$ and $U^R_{j+\frac{1}{2}}$ at each interface. Then the Riemann problem with initial data
\[
u(x, t^n) = \begin{cases}
U^L_{j+\frac{1}{2}} & \text{if } x_{j-\frac{1}{2}} < x \le x_{j+\frac{1}{2}}, \\
U^R_{j+\frac{1}{2}} & \text{if } x_{j+\frac{1}{2}} < x \le x_{j+\frac{3}{2}},
\end{cases}
\]
is solved. For example, using a piecewise linear reconstruction one has to solve a Riemann problem with initial data given by
\[
U^L_{j+\frac{1}{2}} = U_j^n + \sigma_j\frac{\Delta x}{2}, \qquad
U^R_{j+\frac{1}{2}} = U_{j+1}^n - \sigma_{j+1}\frac{\Delta x}{2}.
\]

TVD methods can be developed based on slope limiters. However it is known that order of accuracy higher than two cannot be achieved while ensuring the TVD or TVB properties [67]. To achieve higher order some techniques have been developed following similar ideas, but without strictly ensuring total variation boundedness. Examples are the essentially non-oscillatory (ENO) methods, introduced by Harten, Engquist, Osher and Chakravarthy in [64] and the weighted essentially non-oscillatory (WENO) methods [113, 81], that are explained in more detail in section 4.3.

3.5.1 Semi-discrete methods

Methods based on local reconstruction of the solution at the cell interfaces can increase the spatial accuracy of the solution by computing numerical flux functions that are better approximations of the true fluxes. In the study of the local truncation error, however, the first order time discretization

$$u_t(x_j, t^n) = \frac{U_j^{n+1} - U_j^n}{\Delta t} + O(\Delta t)$$

is already present in (3.11), giving a method that is only first order accurate in time. Semi-discrete methods address this drawback by applying a space discretization first, leaving the time derivative unchanged, which leads to a system of ordinary differential equations:

$$\frac{dU_j(t)}{dt} + D(U_j(t)) = 0, \qquad \forall j,$$

where $D(U_j(t))$ is some approximation of the spatial derivative $f(u)_x$ for a fixed time $t$. Typically, the spatial approximation is made by means of a conservative reconstruction of the numerical fluxes, giving a formulation

$$\frac{dU_j(t)}{dt} + \frac{\hat f_{j+\frac12} - \hat f_{j-\frac12}}{\Delta x} = 0, \qquad \forall j, \tag{3.13}$$


where $\hat f_{j+\frac12} = \hat f(U_{j-p+1}(t), \dots, U_{j+q}(t))$. System (3.13) is then solved by means of an ODE solver. A class of TVD Runge-Kutta methods specially tailored to solve this kind of ODE systems was developed by Shu and Osher [160]. The general formulation of these methods is as follows:

$$\begin{aligned} U^{(0)} &= U^n, \\ U^{(i)} &= \sum_{k=0}^{i-1} \left( \alpha_{ik} U^{(k)} - \beta_{ik} \Delta t\, D(U^{(k)}) \right), \qquad 1 \le i \le \bar r, \\ U^{n+1} &= U^{(\bar r)}, \end{aligned} \tag{3.14}$$

where $\bar r$ depends on the order of accuracy of the particular Runge-Kutta scheme and $\alpha_{ik}$, $\beta_{ik}$ are coefficients that also depend on the method (see [160, 161] for details). In this work we will use the third order version:

$$\begin{aligned} U^{(1)} &= U^n - \Delta t\, D(U^n), \\ U^{(2)} &= \frac34 U^n + \frac14 U^{(1)} - \frac14 \Delta t\, D(U^{(1)}), \\ U^{n+1} &= \frac13 U^n + \frac23 U^{(2)} - \frac23 \Delta t\, D(U^{(2)}). \end{aligned} \tag{3.15}$$
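The three stages of (3.15) translate directly into code. The following Python sketch is illustrative only: `tvd_rk3_step` and `upwind_D` are hypothetical names, the state is a plain list of floats, and `upwind_D` assumes a periodic grid and the first order upwind flux for linear advection with positive speed `a`, chosen just to have a concrete conservative operator $D$.

```python
def tvd_rk3_step(U, dt, D):
    """One step of the third order TVD Runge-Kutta scheme (3.15) for dU/dt + D(U) = 0."""

    def combine(a, X, b, Y, c, Z):
        # Componentwise a*X + b*Y - c*dt*Z.
        return [a * x + b * y - c * dt * z for x, y, z in zip(X, Y, Z)]

    U1 = combine(1.0, U, 0.0, U, 1.0, D(U))
    U2 = combine(0.75, U, 0.25, U1, 0.25, D(U1))
    return combine(1.0 / 3.0, U, 2.0 / 3.0, U2, 2.0 / 3.0, D(U2))


def upwind_D(U, dx, a=1.0):
    """D(U)_j = (fhat_{j+1/2} - fhat_{j-1/2}) / dx with fhat_{j+1/2} = a * U_j (a > 0).

    U[j - 1] with j = 0 wraps around to the last cell, i.e. the grid is periodic.
    """
    return [a * (U[j] - U[j - 1]) / dx for j in range(len(U))]
```

Because `upwind_D` is a difference of fluxes, the sum of the cell values is unchanged by a step on the periodic grid, which is the discrete conservation property discussed in the text.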

It has been proved [160] that the second and third order TVD Runge-Kutta methods are stable provided the forward Euler method, which is the first order case in (3.14), is stable, under the same CFL restriction as the forward Euler method. All these methods provide conservative schemes when used together with spatial operators that lead to ODEs of the form (3.13), since the composed time-advance step can be expressed in the conservative form (3.11). For example, given a numerical flux function $\hat f(U^n)$ arising from the spatial discretization, expanding (3.15) for each node $x_j$ we have

$$U_j^{n+1} = U_j^n - \frac{\Delta t}{\Delta x} \left[ \left( \frac16 \hat f_{j+\frac12}(U^n) + \frac16 \hat f_{j+\frac12}(U^{(1)}) + \frac23 \hat f_{j+\frac12}(U^{(2)}) \right) - \left( \frac16 \hat f_{j-\frac12}(U^n) + \frac16 \hat f_{j-\frac12}(U^{(1)}) + \frac23 \hat f_{j-\frac12}(U^{(2)}) \right) \right]. \tag{3.16}$$

Since $U^{(1)}$ and $U^{(2)}$ are obtained from $U^n$ we can write (3.16) in terms of a numerical flux function

$$\hat f^{RK3}(U^n) = \frac16 \hat f(U^n) + \frac16 \hat f(U^{(1)}) + \frac23 \hat f(U^{(2)}) \tag{3.17}$$

as

$$U_j^{n+1} = U_j^n - \frac{\Delta t}{\Delta x} \left( \hat f^{RK3}_{j+\frac12}(U^n) - \hat f^{RK3}_{j-\frac12}(U^n) \right).$$

Note that we have defined $\hat f$ as a function that depends on $p + q$ arguments (cf. Definition 7), which implies that $\hat f^{RK3}$ depends on $3(p + q) - 2$ arguments, because it involves three recursive applications of $\hat f$. This numerical flux function is consistent. Recall that for any $W = \{W_j\}$ we have

$$D(W)_j = D(W_{j-p}, \dots, W_{j+q}) = \frac{1}{\Delta x} \left( \hat f_{j+\frac12}(W) - \hat f_{j-\frac12}(W) \right),$$

therefore, because of the consistency of $\hat f$, we get that $D(w, \dots, w) = 0$ for any $w$. In particular, if $U^n$ is "locally constant" at $x_j$, i.e., $U^n_{j+k} = U^n_j$ for $k = -3p + 3, \dots, 3q$, then $U^{(1)}$ and $U^{(2)}$ are also locally constant at $j$:

$$\begin{aligned} U_j^{(1)} &= U_j^n - \Delta t\, D(U^n)_j = U_j^n, \\ U_j^{(2)} &= \frac34 U_j^n + \frac14 U_j^{(1)} - \frac14 \Delta t\, D(U^{(1)})_j = \frac34 U_j^n + \frac14 U_j^n - \frac14 \Delta t\, D(U^n)_j = U_j^n. \end{aligned}$$

This implies that if all the arguments of $\hat f^{RK3}$ take a common value $w$, by the consistency of $\hat f$:

$$\begin{aligned} \hat f^{RK3}(w, \dots, w) &= \frac16 \hat f(w, \dots, w) + \frac16 \hat f(w^{(1)}, \dots, w^{(1)}) + \frac23 \hat f(w^{(2)}, \dots, w^{(2)}) \\ &= \frac16 \hat f(w, \dots, w) + \frac16 \hat f(w, \dots, w) + \frac23 \hat f(w, \dots, w) = f(w), \end{aligned}$$

where we have denoted by $w^{(1)}$ and $w^{(2)}$ the (constant) values obtained in the first and second stage of the Runge-Kutta algorithm for a vector taking the constant value $w$.

3.6 Numerical methods for one-dimensional hyperbolic systems

A common way to extend the methods described in section 3.5 to systems of equations is based on the application of scalar techniques to the characteristic variables instead of the conserved variables. The idea is to exploit the hyperbolic character of the system, by approximately decoupling it into advection equations as explained in section 2.3.2. We focus our description on Godunov's method [57] and the approximate Riemann solver of Roe [148], which lie at the basis of the numerical methods to be described in chapter 4. Godunov's method is based on the solution of Riemann problems forward in time. Consider the linear system in 1D

$$\begin{aligned} &u_t + A u_x = 0, && x \in \mathbb{R},\; t \in \mathbb{R}^+, \\ &u(x, 0) = u_0(x), && x \in \mathbb{R}. \end{aligned}$$

Godunov's method solves the set of Riemann problems defined by

$$\begin{aligned} &u_t + A(u) u_x = 0, && x \in \mathbb{R},\; t \in (t^n, t^{n+1}], \\ &u(x, t^n) = U^L_{j+\frac12}, && \text{if } x \le x_{j+\frac12}, \\ &u(x, t^n) = U^R_{j+\frac12}, && \text{if } x > x_{j+\frac12}, \end{aligned}$$

for each interface $x_{j+\frac12}$, where $U^L_{j+\frac12}$ and $U^R_{j+\frac12}$ are reconstructions of the flow solution at the left and right of the interface. The numerical method involves essentially the same steps as for a scalar equation (cf. section 3.5):

1. Compute initial left and right states $U^L_{j+\frac12}$ and $U^R_{j+\frac12}$ for the Riemann problem. For the original first order method of Godunov take $U^L_{j+\frac12} = U_j^n$ and $U^R_{j+\frac12} = U_{j+1}^n$.

2. Compute the exact solution of the Riemann problem corresponding to time $t^{n+1}$ in each cell, with a time step $\Delta t$ sufficiently small so that waves emanating from a given cell interface do not interact with neighboring waves. The local Riemann solutions can then be glued together into a global solution $\tilde u$.

3. Compute the cell average of $\tilde u$ at each cell as the new numerical solution for time $t^{n+1}$.

Using the integral form (2.2) of the conservation law, and assuming that the exact solutions of the Riemann problems are known, Godunov's method can be written in conservation form with the numerical flux defined by

$$\hat f^n_{j+\frac12} = \frac{1}{\Delta t} \int_{t^n}^{t^{n+1}} f(\tilde u(x_{j+\frac12}, t))\, dt,$$

where $\tilde u$ denotes the solution of the corresponding Riemann problem. This formula is simplified using the fact that the solution of the Riemann problem is self-similar so that, for $t \in (t^n, t^{n+1})$, the solution is constant with respect to $t$ along the interface $x = x_{j+\frac12}$. This value depends only on the left and right states $U^L_{j+\frac12}$ and $U^R_{j+\frac12}$. If we denote $u^*(U^L_{j+\frac12}, U^R_{j+\frac12}) = \tilde u(x_{j+\frac12}, t)$, then the numerical flux for Godunov's method reduces to

$$\hat f^n_{j+\frac12} = f(\tilde u(x_{j+\frac12}, t^{n+1})).$$
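As an illustration of the three steps, the sketch below applies first order Godunov to the scalar Burgers equation, $f(u) = u^2/2$, for which the exact Riemann solution along the interface is elementary. The scalar setting, the function names and the outflow ghost cells are assumptions made for brevity; for systems, the interface state $u^*$ has to come from an exact or approximate Riemann solver.

```python
def burgers_flux(u):
    return 0.5 * u * u


def riemann_state(uL, uR):
    """Value of the exact Burgers Riemann solution along the interface (x/t = 0)."""
    if uL > uR:  # shock travelling with speed (uL + uR) / 2
        return uL if uL + uR > 0 else uR
    if uL > 0:  # rarefaction moving entirely to the right
        return uL
    if uR < 0:  # rarefaction moving entirely to the left
        return uR
    return 0.0  # transonic rarefaction: the sonic point sits on the interface


def godunov_step(U, dt, dx):
    """One Godunov step; one outflow ghost cell is added at each boundary."""
    ext = [U[0]] + list(U) + [U[-1]]
    # F[i] is the flux at the interface between ext[i] and ext[i + 1].
    F = [burgers_flux(riemann_state(ext[i], ext[i + 1])) for i in range(len(ext) - 1)]
    return [U[j] - dt / dx * (F[j + 1] - F[j]) for j in range(len(U))]
```

The time step passed to `godunov_step` is assumed to satisfy the CFL restriction, so that waves from neighboring interfaces do not interact within one step.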

Many methods have been built on the ideas of Godunov (see e.g. [66, 46]). The extension of Godunov's method to nonlinear systems requires, however, the solution of the Riemann problem for a nonlinear system. Although it is possible to solve some Riemann problems exactly, it is in general very expensive to find their solution. Approximate Riemann solvers exploit the fact that only some information about the solution of the Riemann problem is needed to update the solution of the PDE from time $t^n$ to time $t^{n+1}$. This is due to the fact that the full exact solution of the Riemann problem is not required, but only its cell-average, in steps 1 and 3 above, leading to a numerical flux that requires only the evaluation of the solution at the cell interface, which is, moreover, constant in time. A typical approach to build an approximate Riemann solver starts with the quasi-linear form of the nonlinear problem

$$u_t + A(u) u_x = 0,$$

and simplifies the problem by replacing the Jacobian matrix $A(u)$ by a matrix $\hat A_j$ that is constant for each interface, so that each Riemann problem is posed for a linear system. The matrix $\hat A_j$ varies from one interface to another, and it is therefore a function of the left and right states $U^L_{j+\frac12}$ and $U^R_{j+\frac12}$. The Riemann problems

$$\begin{aligned} &u_t + \hat A_j u_x = 0, \quad \text{in } \mathbb{R} \times (t^n, t^{n+1}], \\ &u(x, t^n) = \begin{cases} U^L_{j+\frac12} & \text{if } x < x_{j+\frac12}, \\ U^R_{j+\frac12} & \text{if } x > x_{j+\frac12}, \end{cases} \end{aligned} \tag{3.18}$$

are then solved exactly for each $j$, and the solution at time $t^{n+1}$ is constructed as in the original Godunov method, following steps 1, 2 and 3 described above. The way the matrices $\hat A_j$ are constructed depends on the system to be solved. Roe [148] proposed a set of general rules to build the linearization and constructed a matrix for the Euler equations. The numerical flux of Roe's method can be written as

$$\hat f^n_{j+\frac12} = f(U^L_{j+\frac12}) + \sum_{\lambda_j^p < 0} \lambda_j^p\, \alpha^p_{j+\frac12}\, r_j^p, \tag{3.19}$$

or, equivalently, as

$$\hat f^n_{j+\frac12} = f(U^R_{j+\frac12}) - \sum_{\lambda_j^p > 0} \lambda_j^p\, \alpha^p_{j+\frac12}\, r_j^p, \tag{3.20}$$

where $\lambda_j^p$ and $r_j^p$ are the eigenvalues and right eigenvectors of $\hat A_j$, and $\alpha^p_{j+\frac12}$ are the coefficients of the jump $U^R_{j+\frac12} - U^L_{j+\frac12}$ in the basis of eigenvectors. Taking the average of (3.19) and (3.20) we obtain the often used expression for the numerical flux

$$\hat f^n_{j+\frac12} = \frac12 \left( f(U^L_{j+\frac12}) + f(U^R_{j+\frac12}) - \sum_p |\lambda_j^p|\, \alpha^p_{j+\frac12}\, r_j^p \right). \tag{3.21}$$

From expressions (3.19), (3.20) and (3.21) the upwinding character and the characteristic-based structure of the method become clear. Matrices similar to Roe's matrix for the Euler equations can be built for other systems; see [56, 175] for general reviews on approximate Riemann solvers, and [11, 49, 54, 162] for some particular examples.
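The averaged form (3.21) is easy to evaluate once a constant matrix $\hat A_j$ is available. The following NumPy sketch assumes that the linearized matrix is supplied by the caller (building a Roe matrix for a particular system is not attempted here); the function name `roe_type_flux` is hypothetical, and the eigen-decomposition is computed numerically rather than from closed formulas.

```python
import numpy as np


def roe_type_flux(A_hat, fL, fR, UL, UR):
    """Flux (3.21): 0.5 * (fL + fR) - 0.5 * sum_p |lambda^p| alpha^p r^p.

    A_hat is assumed to have real eigenvalues and a full set of eigenvectors.
    """
    lam, R = np.linalg.eig(A_hat)        # columns of R are the right eigenvectors r^p
    alpha = np.linalg.solve(R, UR - UL)  # coefficients of the jump in that basis
    return 0.5 * (fL + fR) - 0.5 * R @ (np.abs(lam) * alpha)
```

For a linear system $f(u) = Au$ with $\hat A_j = A$, this reduces to the exact upwind flux, which gives a simple way of checking an implementation.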

3.7 Implementation of artificial boundary conditions

In this section we describe the discrete implementation of boundary conditions. Recall that we have posed our problems in an unbounded spatial domain but, in practice, the computation has to be performed in a bounded subdomain. A different case is the initial-boundary value problem, where both initial and boundary conditions are specified by the problem. These boundary conditions are used to modify the flux calculation depending on whether a given point is near the boundary or not. In our case we transform the unbounded problem into a bounded problem by specifying the behavior of the flow at the boundary. Special formulas for the flux computation near the boundaries can be devised, but in practice it is more efficient to consider the ghost cell approach, already mentioned in section 3.1.

Note that the numerical fluxes in (3.11) for a single point $x_j$ depend on $p + q + 1$ values. For nodes at the boundary or near the boundary some of these values may be unavailable, since the corresponding nodes may not belong to the grid. In particular, if we consider a spatial grid of the interval $[0, 1]$ composed of $N$ nodes $\{x_j\}_{j=0}^{N-1}$, with $x_j = (j + \frac12)\Delta x$, $\Delta x = \frac1N$ (cf. section 3.1), then, for example, at the node $x_0$ the points $x_{-1}, \dots, x_{-p}$ are required to compute $\hat f^n_{j-\frac12}$, and the same applies to the node $x_{N-1}$ at the right boundary, where $x_N, \dots, x_{N+q-1}$ are required. In general, $p$ and $q$ ghost cells are required, respectively, at the left and right boundary, in order to update the numerical solution. As an example, consider the Lax-Friedrichs method (3.10). Its numerical flux is defined as

$$\hat f^n_{j+\frac12} = -\frac{\Delta x}{2 \Delta t} \left( U_{j+1}^n - U_j^n \right) + \frac12 \left( f(U_{j+1}^n) + f(U_j^n) \right).$$

In this case we have $p = q = 1$ and one ghost cell is required at each boundary.

The idea is to augment the computational grid with some extra cells at the boundary. These cells are filled with data using the solution at the interior grid before the integration. For the implementation of the WENO method described in section 4.3 we need three ghost cells at each boundary. Moreover, ghost cells will also be used in the update of cells that are not near the domain boundaries, but at the boundary of a mesh patch in the AMR algorithm (see chapter 5).

Particular cases of boundary conditions implemented with the ghost cell technique, which will be used in the validation section, are inflow, outflow and reflecting boundary conditions. Inflow is the discrete version of Dirichlet boundary conditions, and outflow represents Neumann boundary conditions $\frac{\partial u}{\partial \vec n} = 0$. Qualitatively these represent the cases where the flow is entering (inflow) or exiting (outflow) the domain, and they are the most natural choice when solving in a finite domain a problem that is defined on the whole real line. In practice the outflow case is implemented by copying the values of the boundary cells into the ghost cells. For the sample grid used in this section we would have:

$$U^n_{-k} = U^n_0, \quad 1 \le k \le p, \qquad U^n_{N+k-1} = U^n_{N-1}, \quad 1 \le k \le q.$$

Another possibility, known as absorbing boundary conditions, is to copy the values in the interior of the domain to the ghost cells with symmetry at the boundary, which essentially produces zero numerical divergence at the boundary. In this case we fix

$$U^n_{-k} = U^n_{k-1}, \quad 1 \le k \le p, \qquad U^n_{N+k-1} = U^n_{N-k}, \quad 1 \le k \le q.$$

When the boundary represents a solid wall where the flow rebounds, reflecting boundary conditions can be imposed. These conditions are typically applied to systems of equations and depend on the particular equation to be solved. As a general rule, the velocity is set to the value at the interior with its sign changed, to express that the flow changes direction when it touches the solid wall. The rest of the variables are copied from the interior. The values are copied imposing symmetry at the boundary. As an example, for the Euler equations in 1D defined by (2.27), the conditions are as follows: at the left boundary we set

$$\rho_{-k} = \rho_{k-1}, \qquad u_{-k} = -u_{k-1}, \qquad E_{-k} = E_{k-1},$$

for $1 \le k \le p$. Similarly, at the right boundary the values

$$\rho_{N+k-1} = \rho_{N-k}, \qquad u_{N+k-1} = -u_{N-k}, \qquad E_{N+k-1} = E_{N-k},$$

are set for the ghost cells ($1 \le k \le q$). For multi-dimensional problems, the same rules are applied to each variable and boundary in a dimensional splitting fashion.
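The ghost cell formulas of this section can be collected in one small routine. The Python sketch below is illustrative: `fill_ghost_cells`, its `kind` argument and its `sign` argument are hypothetical names, and the routine acts on a single scalar variable, so that for reflecting conditions it would be called once per variable (with `sign=-1.0` for the velocity and `sign=1.0` for the rest).

```python
def fill_ghost_cells(U, p, q, kind="outflow", sign=1.0):
    """Return U extended with p ghost cells on the left and q on the right.

    kind = "outflow":   copy the boundary value into every ghost cell.
    kind = "symmetric": mirror the interior values about the boundary;
                        sign = -1.0 also flips them (velocity at a solid wall).
    """
    N = len(U)
    if kind == "outflow":
        left = [U[0]] * p
        right = [U[N - 1]] * q
    elif kind == "symmetric":
        # U_{-k} = sign * U_{k-1}, stored in the order U_{-p}, ..., U_{-1}.
        left = [sign * U[k - 1] for k in range(p, 0, -1)]
        # U_{N+k-1} = sign * U_{N-k}, stored in the order U_N, ..., U_{N+q-1}.
        right = [sign * U[N - k] for k in range(1, q + 1)]
    else:
        raise ValueError("unknown boundary type: " + kind)
    return left + list(U) + right
```

Returning an extended copy keeps the interior data untouched, so the same routine can be reused for every stage of the time integrator.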


4 Shu-Osher's finite difference with Donat-Marquina's flux splitting

Shu-Osher's conservative schemes [160, 161] constitute a conceptually new approach for the solution of hyperbolic systems, which actually simplifies the implementation of some finite volume High Resolution Shock Capturing (HRSC) methods. Examples of such methods are the MUSCL methods of Van Leer [177, 178, 179, 180, 181], the PPM method of Colella and Woodward [39], the ENO methods of Harten, Engquist, Osher and Chakravarthy [64], Marquina's PHM method [117] and the WENO methods developed by Jiang and Shu [81] and Liu, Osher and Chan [113]. Most of these methods are based on discretizations of the equations on control volumes, following the cell-average approach. This formulation has proved to be effective in one dimensional simulations (see references above and the books of LeVeque [104, 105] and Toro [175] for an introductory explanation), but it is difficult to implement in more than one space dimension with order of accuracy higher than two. The main reason is that the numerical flux is an integral over the control volume boundary, which has to be approximated by some quadrature formula. This imposes a complicated and computationally expensive connection between the cell-averages of the solution in the control volume and its reconstructed point-values at some nodal points on the control volume boundary. This combination of cell-averages and point-values becomes more complicated if the grids used in the discretization are non-uniform or unstructured, because of the additional technical complexity intrinsic to those kinds of grids.

To overcome these difficulties Shu and Osher developed a method that avoids the use of cell-averages by using a point-value discretization of the solution instead, where the numerical fluxes are computed from the point-values of the fluxes of the equations. A key point is that all the one-dimensional reconstruction procedures used in finite volume methods can be easily exported to Shu-Osher's approach, with the advantage that the computation of the numerical fluxes is performed in a dimensional splitting fashion. This significantly reduces the computational cost of the algorithm with respect to finite volume schemes of the same order. On the other hand, the main drawback of this approach is that, to obtain a high order conservative finite difference scheme, the mesh size is required to be uniform or at least smoothly varying [121]. A very complete description comparing finite volume and finite difference schemes can be found in a paper by Shu [159].
The original description of the method is for scalar equations, with the extension to systems being based on the application of the methodology to the local characteristic fields of a linearized system coming, for example, from an intermediate Jacobian matrix corresponding to an approximate Riemann solver [161]. An alternative approach has been proposed by Donat and Marquina [44], where two Jacobians are computed at each cell interface. This is a more natural extension of the flux-split character of Shu-Osher's methodology to systems, and has proved to be more efficient than the use of a single Jacobian in some pathological cases. Moreover, it can be used in conjunction with any reconstruction procedure. Donat-Marquina's flux formulation is described in section 4.2. High order reconstructions of the flow variables and the numerical fluxes are obtained in this work by means of fifth order WENO reconstructions, described in section 4.3. The overall algorithm that results from putting together these techniques has been described and tested by Marquina and Mulet in [118] on a fixed grid configuration, for a fluid flow composed of the mixture of two perfect gases in thermal equilibrium.

4.1 Shu-Osher's finite difference flux reconstruction

Let us describe the finite difference approach of Shu and Osher in one space dimension for a scalar equation. As mentioned above, generalizations to multi-dimensions will be straightforward because of the dimensional-splitting nature of the procedure. The extension to systems using Donat-Marquina's flux formulation will be discussed in section 4.2. The idea that enables Shu-Osher's approach comes from the following lemma:

Lemma 1. (Shu and Osher, [161]) If a function $h(x)$ satisfies

$$f(u(x)) = \frac{1}{\Delta x} \int_{x - \frac{\Delta x}{2}}^{x + \frac{\Delta x}{2}} h(\xi)\, d\xi, \tag{4.1}$$

then

$$f(u(x))_x = \frac{h\left(x + \frac{\Delta x}{2}\right) - h\left(x - \frac{\Delta x}{2}\right)}{\Delta x}.$$

In order to obtain a conservative scheme we aim to approximate the flux derivative, $f(u(x))_x$, by an expression of the form (see (3.11))

$$\frac{1}{\Delta x} \left( \hat f^n_{j+\frac12} - \hat f^n_{j-\frac12} \right).$$

The above lemma suggests that the numerical fluxes $\hat f^n_{j+\frac12}$ should approximate the values $h(x_{j+\frac12})$. In other words, if we can compute high order approximations to $h(x_{j+\frac12})$, they can play the role of high order numerical fluxes for a conservative scheme.

Shu-Osher's formulation can be derived from the cell-average form of the equations [121]. The point-values obtained through this formulation can be understood as estimates or deconvolutions of the cell-average values, obtained by the application of the inverse of the cell-average operator (3.4), with the advantage that no explicit knowledge of the cell-averages of the solution is needed in the process: the equation can be evolved from one time step to the following one using only the available nodal values.

Using the same notation as in (3.4), in what follows the cell-average of a function $h$ in the cell $[x_{j-\frac12}, x_{j+\frac12}]$ will be denoted by $\bar h_j$:

$$\bar h_j = \frac{1}{\Delta x} \int_{x_{j-\frac12}}^{x_{j+\frac12}} h(\xi)\, d\xi.$$
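Lemma 1 can be checked numerically for a concrete $h$. The sketch below is only a sanity check under illustrative assumptions: $h = \sin$ is an arbitrary smooth choice, the sliding average is evaluated from a primitive of $h$, and the derivative on the left-hand side is approximated by a centred difference with a small step `eps`.

```python
import math


def sliding_average(primitive, x, dx):
    """(1/dx) * integral of h over [x - dx/2, x + dx/2], given a primitive of h."""
    return (primitive(x + 0.5 * dx) - primitive(x - 0.5 * dx)) / dx


dx = 0.1
x0 = 0.7
h = math.sin                     # illustrative choice of h
H = lambda x: -math.cos(x)       # a primitive of h

# Left-hand side of the lemma: derivative of the sliding average of h,
# approximated here by a centred difference.
eps = 1e-5
lhs = (sliding_average(H, x0 + eps, dx) - sliding_average(H, x0 - eps, dx)) / (2 * eps)

# Right-hand side: an exact difference of point-values of h at the cell boundaries.
rhs = (h(x0 + 0.5 * dx) - h(x0 - 0.5 * dx)) / dx
```

The two numbers agree up to the error of the centred difference, independently of the size of `dx`, which reflects that the identity in the lemma is exact rather than an approximation in $\Delta x$.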

By the assumption (4.1), the problem has been translated into the classical cell-average framework of reconstructing the point-values of a function at the cell boundaries from its cell-averages. This automatically allows all the reconstruction algorithms used in classical finite-volume methods to be exported to Shu-Osher's framework. An important point is that no knowledge of the function $h(x)$ (other than its cell-averages) is required. We consider local reconstruction procedures, where the value of $h_{j+\frac12} := h(x_{j+\frac12})$ is obtained using information from a finite number of nodes around $x_j$. This is the case for the class of methods we are interested in, namely conservative methods.

A generic local reconstruction procedure for $h(x)$ from its cell-averages $\{\bar h_{j-s_1}, \dots, \bar h_{j+s_2}\}$ defined on the interval $[x_{j-s_1-\frac12}, x_{j+s_2+\frac12}]$ will be denoted by $R(\bar h_{j-s_1}, \dots, \bar h_{j+s_2}, x)$, where $s_1$ and $s_2$ are nonnegative integers, and has to verify the following properties:

• Preservation of the cell-averages:

$$\frac{1}{\Delta x} \int_{x_{k-\frac12}}^{x_{k+\frac12}} R(\bar h_{j-s_1}, \dots, \bar h_{j+s_2}, x)\, dx = \bar h_k, \qquad k = j - s_1, \dots, j + s_2. \tag{4.2}$$

• Accuracy. Wherever $h$ is smooth:

$$R(\bar h_{j-s_1}, \dots, \bar h_{j+s_2}, x) = h(x) + O(\Delta x^r), \qquad x \in [x_{j-s_1-\frac12}, x_{j+s_2+\frac12}], \tag{4.3}$$

for some $r > 0$.

• The total variation of $R(\bar h_{j-s_1}, \dots, \bar h_{j+s_2}, x)$ is essentially bounded by the total variation of $h(x)$, i.e., for some $p > 0$:

$$TV(R(\bar h_{j-s_1}, \dots, \bar h_{j+s_2}, x)) \le C \cdot TV(h(x)) + O(\Delta x^p). \tag{4.4}$$

Here the total variation of a differentiable function $h(x)$ in an interval $I$ is given by $TV(h) = \int_I |h'(x)|\, dx$.

When computing the reconstruction, another essential point is upwinding. Roughly speaking, upwinding means that the numerical scheme has to take into account the directions in which the solutions are moving, given by the signs of the eigenvalues of the Jacobian matrix. For a scalar equation, the direction of the movement of the solution is locally given by the sign of $f'(u)$. Shu and Osher use the Roe speed

$$\kappa_{j+\frac12} = \frac{f(U_{j+1}) - f(U_j)}{U_{j+1} - U_j} \tag{4.5}$$

to determine its sign and then perform reconstructions biased towards the correct direction. The computation of the numerical fluxes with Shu-Osher's procedure, using a generic local reconstruction procedure $R(\bar h_{j-s_1}, \dots, \bar h_{j+s_2}, x)$, where $\bar h_j = f(U_j)$, can be summarized as follows:

Algorithm 1.

1. If $\kappa_{j+\frac12} > 0$, compute $\hat f_{j+\frac12}$ by

$$\hat f_{j+\frac12} = R(\bar h_{j-s_1}, \dots, \bar h_{j+s_2}, x_{j+\frac12}).$$

2. Else, compute $\hat f_{j+\frac12}$ by

$$\hat f_{j+\frac12} = R(\bar h_{j-s_1+1}, \dots, \bar h_{j+s_2+1}, x_{j+\frac12}).$$

The numerical fluxes $\hat f_{j+\frac12}$ computed through Algorithm 1 are based on the first order Roe fluxes. In fact, for the zero-th order reconstruction ($s_1 = s_2 = 0$), where $\hat f_{j+\frac12} = \bar h_j$ or $\hat f_{j+\frac12} = \bar h_{j+1}$, the numerical flux is the Roe flux. It is known that the Roe solver admits only shock waves or contact discontinuities as solutions of the Riemann problem, but not rarefaction waves. In particular, transonic rarefactions are wrongly replaced by entropy-violating expansion shocks. This situation is often avoided by using an entropy fix, commonly obtained by means of a local correction of the numerical fluxes where transonic rarefactions take place, in order to introduce some smoothing that "breaks" the shock into a rarefaction wave. Shu and Osher [160] use a local Lax-Friedrichs (LLF) version of the ENO algorithms, which can be generalized to other piecewise polynomial reconstructions as:

Algorithm 2. (Local Lax-Friedrichs)

1. Take $\beta_{j+\frac12} = \max_{u \in [u_j, u_{j+1}]} |f'(u)|$.

2. Compute $\hat f^+_{j+\frac12}$ with step 1 of Algorithm 1 using $\frac12 (f(u) + \beta_{j+\frac12} u)$ instead of $f(u)$.

3. Compute $\hat f^-_{j+\frac12}$ with step 2 of Algorithm 1 using $\frac12 (f(u) - \beta_{j+\frac12} u)$ instead of $f(u)$.

4. Take $\hat f_{j+\frac12} = \hat f^+_{j+\frac12} + \hat f^-_{j+\frac12}$.

Since $f'(u)$ changes sign through a transonic rarefaction, the entropy-fix version of Algorithm 1 consists of computing $\hat f_{j+\frac12}$ with Algorithm 1 if $f'(u)$ does not change sign between $U_j$ and $U_{j+1}$, and with Algorithm 2 otherwise. The final algorithm for a scalar equation can be stated as follows:

Algorithm 3. (Shu-Osher's algorithm for scalar equations)

Define $\beta_{j+\frac12} = \max_{u \in [U_j, U_{j+1}]} |f'(u)|$
Define $\kappa_{j+\frac12}$ by (4.5)
if $\kappa_{j-\frac12} \cdot \kappa_{j+\frac12} > 0$
    if $\kappa_{j+\frac12} > 0$
        $\hat f_{j+\frac12} = R(f_{j-s_1}, \dots, f_{j+s_2}, x_{j+\frac12})$
    else
        $\hat f_{j+\frac12} = R(f_{j-s_1+1}, \dots, f_{j+s_2+1}, x_{j+\frac12})$
    end
else
    $\hat f^+_{j+\frac12} = R(\frac12 (f_{j-s_1} + \beta_{j+\frac12} U_{j-s_1}), \dots, \frac12 (f_{j+s_2} + \beta_{j+\frac12} U_{j+s_2}), x_{j+\frac12})$
    $\hat f^-_{j+\frac12} = R(\frac12 (f_{j-s_1+1} - \beta_{j+\frac12} U_{j-s_1+1}), \dots, \frac12 (f_{j+s_2+1} - \beta_{j+\frac12} U_{j+s_2+1}), x_{j+\frac12})$
    $\hat f_{j+\frac12} = \hat f^+_{j+\frac12} + \hat f^-_{j+\frac12}$
end
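The structure of Algorithm 3 can be illustrated with the zero-th order reconstruction $s_1 = s_2 = 0$, for which $R$ simply returns its single argument. The Python sketch below makes two further assumptions that are not part of the algorithm as stated: the flux is convex, so that a sign change of $f'$ between $U_j$ and $U_{j+1}$ can be detected at the end points and $\max |f'|$ is attained there, and the sign test is done on $f'(U_j) f'(U_{j+1})$ rather than on the product of neighboring Roe speeds. The name `shu_osher_flux` is hypothetical.

```python
def shu_osher_flux(Uj, Ujp1, f, df):
    """Numerical flux at x_{j+1/2} with R(h_j) = h_j (s1 = s2 = 0), convex flux f."""
    aL, aR = df(Uj), df(Ujp1)
    if aL * aR > 0:
        # f' keeps its sign: upwind according to the Roe speed (4.5).
        if Ujp1 != Uj:
            kappa = (f(Ujp1) - f(Uj)) / (Ujp1 - Uj)
        else:
            kappa = aL
        return f(Uj) if kappa > 0 else f(Ujp1)
    # f' changes sign (or vanishes): local Lax-Friedrichs entropy fix.
    beta = max(abs(aL), abs(aR))
    f_plus = 0.5 * (f(Uj) + beta * Uj)
    f_minus = 0.5 * (f(Ujp1) - beta * Ujp1)
    return f_plus + f_minus
```

For Burgers' equation, `shu_osher_flux(-1.0, 1.0, f, df)` falls in the entropy-fix branch, where the added dissipation prevents the stationary expansion shock that plain upwinding would keep.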


4.2 Donat-Marquina's flux formula

A possible extension of the framework described in section 4.1 to nonlinear systems involves some average Jacobian matrix computed at some intermediate state between the left and right states as, for example, the Roe matrix [148]. These average matrices can be difficult to compute for systems of equations other than the Euler equations and, on the other hand, numerical methods based on Roe's and other average matrices suffer from several failures that have been reported in the literature, see e.g. [134, 140] and references therein. Donat-Marquina's flux formula [44] is a generalization of Shu-Osher's algorithm to nonlinear systems that alleviates some of those pathologies and avoids the use of an average matrix. Its leading philosophy is to compute the numerical fluxes at the interfaces using two sets of characteristic information for each interface, one coming from the left state and the other coming from the right state. This mimics, in the context of nonlinear systems, the flux-splitting behavior of the formulation made by Shu and Osher for scalar equations. As Donat-Marquina's flux formula reduces to Shu-Osher's for scalar equations, we will center the discussion on a nonlinear system in one space dimension.

Let us briefly describe the numerical method corresponding to Roe's linearization first. The method of Roe is a conservative scheme whose numerical flux is given by

$$\hat f_{j+\frac12} = \frac12 \left( f(U^L_{j+\frac12}) + f(U^R_{j+\frac12}) - \sum_p |\lambda_j^p|\, \alpha^p_{j+\frac12}\, r_j^p \right), \tag{4.6}$$

where $\lambda_j^p$ are the eigenvalues of the Roe matrix $\hat A_j = \hat A(U^L_{j+\frac12}, U^R_{j+\frac12})$ computed at the cell interface $x_{j+\frac12}$, $r_j^p$ are the corresponding eigenvectors, $L_j$ is the matrix of left eigenvectors of $\hat A_j$ and $\alpha^p_{j+\frac12}$ is the $p$-th component of $L_j (U^R_{j+\frac12} - U^L_{j+\frac12})$. The first order method corresponds to $U^L_{j+\frac12} = U_j$ and $U^R_{j+\frac12} = U_{j+1}$.

Higher order versions are obtained, for example, by using high order reconstructions of the numerical solutions at the interfaces, as explained in section 3.5.

Donat-Marquina's flux formula starts with the same approach. Left and right states are computed at each cell interface. Note that the left and right states are computed from point-values, i.e. we need to compute approximations to the point-values of $u$ at the interfaces $x_{j+\frac12}$ from point-values of $u$ at the cell centers. For nonnegative integers $t_1$ and $t_2$, we denote by $S(W_{j-t_1}, \dots, W_{j+t_2}, x)$ such an approximation function for a generic function $w$, whose point-values are given by $\{W_i\}$. The approximation at the cell interface is given by $S(W_{j-t_1}, \dots, W_{j+t_2}, x_{j+\frac12})$ and the operator $S$ has to verify conditions analogous to (4.2)–(4.4):

• Interpolation:

$$S(W_{j-t_1}, \dots, W_{j+t_2}, x_{j+k}) = W_{j+k}, \qquad -t_1 \le k \le t_2. \tag{4.7}$$

• Accuracy. Wherever $w$ is smooth:

$$S(W_{j-t_1}, \dots, W_{j+t_2}, x) = w(x) + O(\Delta x^r), \qquad x \in [x_{j-t_1}, x_{j+t_2}], \tag{4.8}$$

for some $r > 0$.

• The total variation of $S(W_{j-t_1}, \dots, W_{j+t_2}, x)$ is essentially bounded by the total variation of $w(x)$. For some $p > 0$:

$$TV(S(W_{j-t_1}, \dots, W_{j+t_2}, x)) \le C \cdot TV(w(x)) + O(\Delta x^p). \tag{4.9}$$
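As a simple instance of an operator of this kind, the sketch below evaluates the centred four-point Lagrange interpolant at the interface, which corresponds to $t_1 = 1$, $t_2 = 2$ on a uniform grid. It is an assumption made for illustration only: plain Lagrange interpolation satisfies the interpolation and accuracy properties (with $r = 4$), but it gives no control of the total variation near discontinuities, which is the reason why nonlinear procedures such as WENO are used in this work. The name `interp_interface` is hypothetical.

```python
def interp_interface(W, j):
    """Value at x_{j+1/2} of the cubic through W[j-1], W[j], W[j+1], W[j+2].

    Assumes equally spaced nodes; the weights (-1, 9, 9, -1) / 16 are those
    of Lagrange interpolation at the midpoint of the four nodes.
    """
    return (-W[j - 1] + 9.0 * W[j] + 9.0 * W[j + 1] - W[j + 2]) / 16.0
```

Since the interpolant is a cubic, the formula reproduces any polynomial of degree at most three exactly, which provides a convenient test.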

The algorithm starts by computing the sided reconstructions $U^L_{j+\frac12}$ and $U^R_{j+\frac12}$ by:

$$\begin{aligned} U^L_{j+\frac12} &= S(U_{j-t_1}, \dots, U_{j+t_2}, x_{j+\frac12}), \\ U^R_{j+\frac12} &= S(U_{j-t_1+1}, \dots, U_{j+t_2+1}, x_{j+\frac12}). \end{aligned} \tag{4.10}$$

Now, instead of building a Jacobian matrix corresponding to an intermediate state between $U^L_{j+\frac12}$ and $U^R_{j+\frac12}$, a double linearization, corresponding to the Jacobians $f'(U^L_{j+\frac12})$ and $f'(U^R_{j+\frac12})$, is computed. Let us denote by $l^p(U^L_{j+\frac12})$ and $r^p(U^L_{j+\frac12})$ the left and right eigenvectors of $f'(U^L_{j+\frac12})$ and by $l^p(U^R_{j+\frac12})$ and $r^p(U^R_{j+\frac12})$ the same objects for $f'(U^R_{j+\frac12})$. Two sets of characteristic fluxes and variables are then computed according to these two Jacobians by:

$$\begin{aligned} &\left. \begin{aligned} W^L_{p,k} &= l^p(U^L_{j+\frac12}) \cdot U_k, \\ f^L_{p,k} &= l^p(U^L_{j+\frac12}) \cdot f(U_k), \end{aligned} \right\} \quad \text{for } j - s_1 \le k \le j + s_2, \\ &\left. \begin{aligned} W^R_{p,k} &= l^p(U^R_{j+\frac12}) \cdot U_k, \\ f^R_{p,k} &= l^p(U^R_{j+\frac12}) \cdot f(U_k), \end{aligned} \right\} \quad \text{for } j - s_1 + 1 \le k \le j + s_2 + 1. \end{aligned} \tag{4.11}$$


In (4.11) $s_1$ and $s_2$ denote two nonnegative integers that are used in the numerical flux computation in Algorithm 4 below. The numerical fluxes are then computed in a way similar to Shu-Osher's algorithm. First we compute the upwind characteristic fluxes $\psi^L_{p,j}$ and $\psi^R_{p,j}$. These fluxes are biased towards their corresponding side and are computed according, respectively, to the characteristic information corresponding to $f'(U^L_{j+\frac12})$ and $f'(U^R_{j+\frac12})$. At points where the eigenvalues change sign, the entropy fix based on the local Lax-Friedrichs flux is applied, as in the scalar case. The final algorithm is as follows:

Algorithm 4.

if $\lambda^p(u)$ does not change sign in a path in phase space connecting $U_j$ and $U_{j+1}$
    if $\lambda^p(U_j) > 0$
        $\psi^L_{p,j} = R(f^L_{p,j-s_1}, \dots, f^L_{p,j+s_2}, x_{j+\frac12})$
        $\psi^R_{p,j} = 0$
    else
        $\psi^L_{p,j} = 0$
        $\psi^R_{p,j} = R(f^R_{p,j-s_1+1}, \dots, f^R_{p,j+s_2+1}, x_{j+\frac12})$
    end
else
    $\psi^L_{p,j} = R(\frac12 (f^L_{p,j-s_1} + \beta^p_{j+\frac12} W^L_{p,j-s_1}), \dots, \frac12 (f^L_{p,j+s_2} + \beta^p_{j+\frac12} W^L_{p,j+s_2}), x_{j+\frac12})$
    $\psi^R_{p,j} = R(\frac12 (f^R_{p,j-s_1+1} - \beta^p_{j+\frac12} W^R_{p,j-s_1+1}), \dots, \frac12 (f^R_{p,j+s_2+1} - \beta^p_{j+\frac12} W^R_{p,j+s_2+1}), x_{j+\frac12})$
end

where $\beta^p_{j+\frac12} = \max_u |\lambda^p(u)|$, and $u$ varies in any curve in phase space connecting $U_j$ and $U_{j+1}$. The numerical flux is finally defined as:

$$\hat f_{j+\frac12} = \sum_p \left( \psi^L_{p,j}\, r^p(U^L_{j+\frac12}) + \psi^R_{p,j}\, r^p(U^R_{j+\frac12}) \right). \tag{4.12}$$

If the characteristic fields of the Jacobian matrices are linearly degenerate or genuinely nonlinear, the eigenvalues are, respectively, constant and monotone along integral curves of these fields (see section 2.2.4), and Algorithm 4 can be simplified to:

Algorithm 5.

if $\lambda^p(U_j) > 0$ and $\lambda^p(U_{j+1}) > 0$
    $\psi^L_{p,j} = R(f^L_{p,j-s_1}, \dots, f^L_{p,j+s_2}, x_{j+\frac12})$
    $\psi^R_{p,j} = 0$
else if $\lambda^p(U_j) < 0$ and $\lambda^p(U_{j+1}) < 0$
    $\psi^L_{p,j} = 0$
    $\psi^R_{p,j} = R(f^R_{p,j-s_1+1}, \dots, f^R_{p,j+s_2+1}, x_{j+\frac12})$
else
    $\psi^L_{p,j} = R(\frac12 (f^L_{p,j-s_1} + \beta^p_{j+\frac12} W^L_{p,j-s_1}), \dots, \frac12 (f^L_{p,j+s_2} + \beta^p_{j+\frac12} W^L_{p,j+s_2}), x_{j+\frac12})$
    $\psi^R_{p,j} = R(\frac12 (f^R_{p,j-s_1+1} - \beta^p_{j+\frac12} W^R_{p,j-s_1+1}), \dots, \frac12 (f^R_{p,j+s_2+1} - \beta^p_{j+\frac12} W^R_{p,j+s_2+1}), x_{j+\frac12})$
end

with $\beta^p_{j+\frac12} = \max\{|\lambda^p(U_j)|, |\lambda^p(U_{j+1})|\}$. Note that for scalar equations the above algorithms reduce to the entropy-fix version of Shu-Osher's algorithm (Algorithm 3 in section 4.1).
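A first order instance of Algorithm 5 for the 1D Euler equations can be sketched as follows. Several choices here are assumptions made for illustration and are not taken from the text: the reconstructions are the trivial ones ($U^L_{j+\frac12} = U_j$, $U^R_{j+\frac12} = U_{j+1}$, and $R$ returning its single argument), the gas is ideal with $\gamma = 1.4$, and the eigenvectors are obtained numerically with NumPy instead of from their closed-form expressions. All function names are hypothetical.

```python
import numpy as np

GAMMA = 1.4  # assumed ratio of specific heats (ideal gas)


def euler_flux(U):
    rho, m, E = U
    u = m / rho
    p = (GAMMA - 1.0) * (E - 0.5 * rho * u * u)
    return np.array([m, m * u + p, u * (E + p)])


def euler_eigensystem(U):
    """Eigenvalues (sorted), left eigenvectors (rows of L), right eigenvectors (columns of R)."""
    rho, m, E = U
    u = m / rho
    p = (GAMMA - 1.0) * (E - 0.5 * rho * u * u)
    H = (E + p) / rho
    A = np.array([
        [0.0, 1.0, 0.0],
        [0.5 * (GAMMA - 3.0) * u * u, (3.0 - GAMMA) * u, GAMMA - 1.0],
        [u * (0.5 * (GAMMA - 1.0) * u * u - H), H - (GAMMA - 1.0) * u * u, GAMMA * u],
    ])
    lam, R = np.linalg.eig(A)
    order = np.argsort(lam.real)  # u - c, u, u + c: same ordering on both sides
    lam, R = lam.real[order], R.real[:, order]
    return lam, np.linalg.inv(R), R


def marquina_flux(UL, UR):
    """Flux (4.12) with the characteristic fluxes of Algorithm 5 and s1 = s2 = 0."""
    lamL, LL, RL = euler_eigensystem(UL)
    lamR, LR, RR = euler_eigensystem(UR)
    fL, fR = euler_flux(UL), euler_flux(UR)
    flux = np.zeros(3)
    for p in range(3):
        if lamL[p] > 0 and lamR[p] > 0:
            psiL, psiR = LL[p] @ fL, 0.0
        elif lamL[p] < 0 and lamR[p] < 0:
            psiL, psiR = 0.0, LR[p] @ fR
        else:  # local Lax-Friedrichs splitting in this characteristic field
            beta = max(abs(lamL[p]), abs(lamR[p]))
            psiL = 0.5 * (LL[p] @ fL + beta * (LL[p] @ UL))
            psiR = 0.5 * (LR[p] @ fR - beta * (LR[p] @ UR))
        flux += psiL * RL[:, p] + psiR * RR[:, p]
    return flux
```

Two properties are easy to check with this sketch: the flux is consistent, `marquina_flux(U, U)` equals `euler_flux(U)`, and for a flow that is supersonic to the right on both sides the flux reduces to `euler_flux(UL)`, as expected from upwinding.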

Remark 1. In the practical implementation it is more convenient to use the operator $S$ to interpolate, at the cell interfaces, the variables on which the eigenstructure of the Jacobian matrix depends, instead of the conserved variables. The reconstructions in (4.10) are then applied to these variables. For example, the eigenstructure of the Jacobian matrix of the Euler equations in 1D depends only on two variables, while the equations are composed of three conserved variables (see section 7.1.3).

4.3 Reconstruction procedures

In this section we describe the WENO interpolatory techniques which we will use as the reconstructions R and S, defined on pages 64 and 68 respectively, used to compute the numerical fluxes $\hat f_{j+\frac12}$ at the cell boundaries. The WENO technique was developed by Liu, Osher and Chan in [113] and further improved by Jiang and Shu in [81]. Preliminary works dealing with the same ideas were produced by Shu [158] and Fatemi, Jerome and Osher [50]. The WENO algorithm is described in subsection 4.3.1, and constitutes an improvement of the ENO technique first introduced in [64]. It is less sensitive to perturbations, reduces the practical computational cost and increases the accuracy in regions where the solution is smooth.


This algorithm is directly used as the reconstruction procedure R. The algorithm can be modified to reconstruct point-values from point-values, rather than from cell-averages. In subsection 4.3.2 such a procedure is described. This reconstruction of point-values from point-values will be used within Donat-Marquina’s algorithm to compute the biased reconstructions $U^L_{j+\frac12}$ and $U^R_{j+\frac12}$ in (4.10). In the multi-dimensional case the reconstructions are performed dimension-by-dimension, therefore the extension from one dimension to multi-dimensions is trivial.

4.3.1 ENO and WENO reconstruction for cell-average discretizations

ENO is based on the fact that Lagrange interpolation on a cell $c_j = [x_{j-\frac12}, x_{j+\frac12}]$ can be computed using different sets of points or cells (stencils) whose convex hulls contain the given cell. Among the possible stencils the ENO procedure tries to select the stencil that produces the smoothest Lagrange interpolant. If the stencils contain $r$ cells, in which the cell-averages $\bar h_j$ of a function $h(x)$ are known, and $h(x)$ is smooth in the stencil, then the reconstruction of the point-values of $h(x)$ obtained by the ENO algorithm is $O(\Delta x^r)$ accurate. During the stencil selection procedure the ENO method considers $r$ possible stencils, which altogether contain $2r-1$ cells. The selection procedure is computationally expensive, since it involves many conditional branches. Moreover, in regions of smoothness a stencil formed by these $2r-1$ cells could be used, since the reconstructed functions are smooth regardless of the selected stencil, and we could increase the accuracy of the reconstruction up to $O(\Delta x^{2r-1})$.

The WENO reconstruction technique overcomes these drawbacks by using a convex combination of the interpolants corresponding to all the possible stencils considered, instead of selecting one of them. A weight, which depends on the smoothness of the function in the corresponding stencil, is assigned to each interpolant. These weights determine the contribution of each interpolant to the final approximation.

The generic reconstruction problem can be formulated as follows: given the cell-averages of a function $h(x)$:
\[ \bar h_j = \frac{1}{\Delta x} \int_{x_{j-\frac12}}^{x_{j+\frac12}} h(\xi)\, d\xi, \]


find, for each cell $c_j$, a polynomial $q_j^r(x)$ of degree at most $r-1$ such that it is an $r$th order approximation to $h(x)$ inside the cell, provided the function $h(x)$ is smooth enough:
\[ q_j^r(x) - h(x) = O(\Delta x^r), \quad \forall x \in c_j. \tag{4.13} \]

A procedure to compute such a polynomial on a cell $c_j$ consists of choosing a stencil of $r$ cells formed with $s_1$ cells to the left and $s_2$ cells to the right surrounding the cell $c_j$, with $s_1 + s_2 = r - 1$ ($s_1, s_2 \geq 0$):
\[ \bar S_j = \{c_{j-s_1}, \dots, c_{j+s_2}\}. \tag{4.14} \]

The (unique) polynomial $q_j^r(x)$ of degree at most $r-1$ that attains the same cell-averages as $h(x)$ in the cells in $\bar S_j$ is a solution of the proposed reconstruction problem, provided $h(x)$ is smooth enough in the region covered by the stencil $\bar S_j$. In our context we will need approximations to the point-values $h(x_{j+\frac12})$, that will be approximated by $q_j^r(x_{j+\frac12})$. To compute the polynomial $q_j^r(x)$ the “reconstruction via primitive function” procedure in [64] is applied. We briefly describe this approach next. Let $h$ be a function and $H$ a primitive of $h$:
\[ H(x) = \int_{-\infty}^{x} h(\xi)\, d\xi. \]
The point-values of the primitive function at the cell boundaries can be computed by:
\[ H(x_{j+\frac12}) = \int_{-\infty}^{x_{j+\frac12}} h(\xi)\, d\xi = \Delta x \sum_{i=-\infty}^{j} \bar h_i, \tag{4.15} \]
and the following relation is trivial to check:
\[ \frac{H(x_{j+\frac12}) - H(x_{j-\frac12})}{\Delta x} = \bar h_j. \tag{4.16} \]
Note that the left hand side of (4.16) is a first order divided difference of $H$. It is easy to check that all the higher order divided differences of $H$ can be computed from the available data due to the relation:
\[ H[x_{j-\frac12}, x_{j+\frac12}, \dots, x_{j+\ell+\frac12}] = \frac{1}{\ell+1}\, \bar h[u_j, \dots, u_{j+\ell}], \quad \ell \geq 0. \tag{4.17} \]


If we denote by $Q_j^r(x)$ the (unique) polynomial of degree at most $r$ that interpolates the primitive function $H(x)$ at the points that define the stencil $\bar S_j$ in (4.14), i.e., the points:
\[ S_j = \{x_{j-s_1-\frac12}, \dots, x_{j+s_2+\frac12}\}, \tag{4.18} \]
then $Q_j^r$ approximates $H(x)$ with order of accuracy $r+1$ provided $H$ is smooth in $S_j$. Its derivative
\[ q_j^r(x) := \frac{dQ_j^r(x)}{dx}, \tag{4.19} \]
interpolates the cell-averages of $h(x)$ in the stencil $\bar S_j$:
\[ \frac{1}{\Delta x} \int_{x_{k-\frac12}}^{x_{k+\frac12}} q_j^r(x)\, dx = \bar h_k, \quad j - s_1 \leq k \leq j + s_2, \]
and from standard theory on interpolation, if $Q_j^r(x) - H(x) = O(\Delta x^{r+1})$, then $q_j^r(x) - h(x) = O(\Delta x^r)$. So the accuracy requirement (4.13) is also satisfied by the piecewise polynomial function $q^r(x)$ defined by
\[ q^r(x) := q_j^r(x), \quad \forall x \in c_j. \]
The actual approximation of the function $h$ at the cell interfaces is computed by $h_{j+\frac12} \approx q^r(x_{j+\frac12})$.

To effectively compute the numerical approximation to $h(x_{j+\frac12})$ consider the Newton form of the polynomial that interpolates the primitive function $H(x)$ in the points in (4.18):
\[ Q_j^r(x) = \sum_{\ell=0}^{r} H[x_{j-s_1-\frac12}, \dots, x_{j-s_1+\ell-\frac12}] \prod_{m=0}^{\ell-1} (x - x_{j-s_1+m-\frac12}). \]
Its derivative is given by:
\[ q_j^r(x) = \sum_{\ell=1}^{r} H[x_{j-s_1-\frac12}, \dots, x_{j-s_1+\ell-\frac12}] \sum_{m=0}^{\ell-1} \prod_{\substack{n=0 \\ n \neq m}}^{\ell-1} (x - x_{j-s_1+n-\frac12}). \tag{4.20} \]

Remark 2. As only first or higher order divided differences of $H$ are used in (4.20), but not the nodal values $H(x_{j+\frac12})$, the primitive function does not need to be explicitly computed and so the summation in (4.15) is never performed, because of (4.17).
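The reconstruction via primitive function can be checked numerically with a short sketch. The function name `interface_value_via_primitive` and its arguments are assumptions made for this illustration. Instead of the Newton form (4.20), the sketch fits the interpolating polynomial of the primitive with `numpy.polyfit` and differentiates it; the primitive is accumulated only from the left edge of the stencil, which shifts $H$ by a constant and therefore does not affect its derivative.

```python
import numpy as np

def interface_value_via_primitive(hbar, dx, x_left, x_eval):
    """Evaluate q(x_eval), where q is the derivative of the polynomial Q that
    interpolates the primitive H at the r + 1 cell boundaries of the stencil.

    hbar   : the r cell-averages on the stencil, ordered left to right
    dx     : uniform cell width
    x_left : left boundary of the first cell of the stencil
    """
    hbar = np.asarray(hbar, dtype=float)
    r = hbar.size
    nodes = x_left + dx * np.arange(r + 1)             # cell boundaries
    H = np.concatenate(([0.0], dx * np.cumsum(hbar)))  # primitive up to a constant
    Q = np.polyfit(nodes, H, r)                        # degree r through r + 1 points
    q = np.polyder(Q)                                  # degree r - 1
    return float(np.polyval(q, x_eval))

# Coefficients of q(x_{j+1/2}) for the stencil {c_{j-2}, c_{j-1}, c_j} (r = 3),
# obtained by feeding unit vectors as cell-averages.
coeffs = [interface_value_via_primitive(e, 1.0, 0.0, 3.0) for e in np.eye(3)]
```

With unit cell width the three coefficients come out as 1/3, -7/6 and 11/6, the familiar third order weights of the leftmost candidate stencil.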


The reconstruction via primitive function is a standard procedure to reconstruct point-values from cell-averages. A more complete description of this technique can be found in [9]. We describe next the ENO and WENO reconstruction procedures used in this work. Note that there are $r$ possible stencils of $r$ cells that contain the cell $c_j$:
\[ \bar S_{j,k} = \{c_{j+k-r+1}, \dots, c_{j+k}\}, \quad 0 \leq k < r. \tag{4.21} \]
The ENO algorithm selects one of the stencils in (4.21) using divided differences as smoothness indicators, choosing the stencil which produces the smallest divided differences, in an attempt to produce less oscillatory interpolants, see [64, 9] for further details. We denote by $q^r_{j,k}(x_{j+\frac12})$ the Lagrangian approximation to $h_{j+\frac12}$ built from the $k$th candidate stencil in (4.21), according to (4.20). This would be the approximation of the numerical flux computed by the ENO algorithm if the stencil $\bar S_{j,k}$ had been chosen in the stencil selection procedure.

The stencil selection procedure of the ENO algorithm is used everywhere, regardless of the smoothness of the function to be reconstructed. In regions where the function is smooth, if we had used the stencil formed by the $2r-1$ cells contained in any of the candidate stencils, i.e. the stencil
\[ \hat S_j = \{c_{j-r+1}, \dots, c_{j+r-1}\}, \]
we could have obtained a $(2r-1)$th order approximation of $h_{j+\frac12}$, denoted by $q_j^{2r-1}(x_{j+\frac12})$. Jiang and Shu [81] found that there exist coefficients $C_k^r$ such that the value of the polynomial $q_j^{2r-1}(x)$ at the point $x_{j+\frac12}$ can be expressed as a combination of the values of the ENO polynomials at the same point:
\[ q_j^{2r-1}(x_{j+\frac12}) = \sum_{k=0}^{r-1} C_k^r\, q^r_{j,k}(x_{j+\frac12}), \tag{4.22} \]
and the coefficients verify
\[ \sum_{k=0}^{r-1} C_k^r = 1, \quad \forall r \geq 2. \]
These coefficients are called the optimal weights. Indexed by $k = 0, \dots, r-1$ as in (4.22), their values for $r = 2$ are:
\[ C_0^2 = \frac13, \quad C_1^2 = \frac23, \]
leading to a third order interpolation, and for $r = 3$:
\[ C_0^3 = \frac{1}{10}, \quad C_1^3 = \frac{6}{10}, \quad C_2^3 = \frac{3}{10}, \]
leading to fifth order accuracy.

The idea behind the WENO technique is to use a convex combination of the approximations $q^r_{j,k}(x_{j+\frac12})$ to compute a new approximation:
\[ \hat h(x_{j+\frac12}) = \sum_{k=0}^{r-1} w^r_{j,k}\, q^r_{j,k}(x_{j+\frac12}). \tag{4.23} \]

The key of the method is the computation of the weights $w^r_{j,k}$. Ideally one wants the weights to adapt automatically to the smoothness of the function. If the stencil $\bar S_{j,k}$ crosses a singularity, then $w^r_{j,k}$ should be essentially zero, canceling out the contribution of $q^r_{j,k}(x_{j+\frac12})$ to the reconstruction, and the approximation to $h(x_{j+\frac12})$ is then built from the remaining stencils that do not cross singularities, so that $r$th order of accuracy is maintained. On the other hand, if all the candidate stencils are contained in regions where the function is smooth, then $w^r_{j,k}$ has to approximate the optimal weight $C_k^r$, and thus the convex sum in the right hand side of (4.23) will approximate $q^{2r-1}(x_{j+\frac12})$ with optimal order. Weights verifying the above requirements were defined in [113] through the formula:
\[ w_k^r = \frac{\alpha_k^r}{\sum_{i=0}^{r-1} \alpha_i^r}, \tag{4.24} \]
where
\[ \alpha_k^r = \frac{C_k^r}{(\epsilon + IS_{j,k})^p}, \quad 0 \leq k \leq r - 1. \tag{4.25} \]

In (4.25), $\epsilon$ is a positive number used to prevent the denominator from becoming zero. $IS_{j,k}$ is a smoothness measurement of the function $h(x)$ in the stencil $\bar S_{j,k}$ and constitutes the key point to define the weights. Jiang and Shu gave a definition for $IS_{j,k}$ that achieves the desired goals for the weights:
\[ IS_{j,k} = \sum_{i=1}^{r-1} \int_{x_{j-\frac12}}^{x_{j+\frac12}} \Delta x^{2i-1} \left( \frac{d^i}{dx^i}\, q^r_{j,k}(x) \right)^2 dx. \tag{4.26} \]

With the algorithm described above, we construct the numerical fluxes of Donat-Marquina’s algorithm. We suppose, as in Shu-Osher’s lemma,


that the characteristic fluxes $f^L_{p,k}$ and $f^R_{p,k}$, defined in (4.11), are cell-averages of an unknown function $h(x)$. The values of $h(x_{j+\frac12})$ are therefore interpreted as the characteristic numerical fluxes $\psi^L_{p,j}$ and $\psi^R_{p,j}$ in Algorithm 4 and Algorithm 5, used to define the final numerical flux $\hat f_{j+\frac12}$ by (4.12). Note that two different reconstructions R, based on different stencils, are used, respectively, for the computation of $\psi^L_{p,j}$ and $\psi^R_{p,j}$. We have used fifth order ($r = 3$) WENO interpolants for the definition of R in our scheme.
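For $r = 3$ the whole procedure fits in a short function. The sketch below is not taken from the text: it uses the standard closed forms of the three candidate values and of the smoothness indicators for quadratic polynomials, as commonly quoted from Jiang and Shu, and the default values `eps=1e-6` and `p=2` are assumptions (the text only requires $\epsilon > 0$). The function name `weno5_cell_average` is hypothetical.

```python
import numpy as np

def weno5_cell_average(v, eps=1e-6, p=2):
    """Left-biased fifth order WENO value at x_{j+1/2} from the five
    cell-averages v = (v_{j-2}, v_{j-1}, v_j, v_{j+1}, v_{j+2})."""
    vm2, vm1, v0, vp1, vp2 = (float(x) for x in v)
    # Third order candidates, one per stencil (k = 0, 1, 2)
    q = np.array([
        (2 * vm2 - 7 * vm1 + 11 * v0) / 6,
        (-vm1 + 5 * v0 + 2 * vp1) / 6,
        (2 * v0 + 5 * vp1 - vp2) / 6,
    ])
    # Smoothness indicators, closed form for r = 3
    IS = np.array([
        13 / 12 * (vm2 - 2 * vm1 + v0) ** 2 + 0.25 * (vm2 - 4 * vm1 + 3 * v0) ** 2,
        13 / 12 * (vm1 - 2 * v0 + vp1) ** 2 + 0.25 * (vm1 - vp1) ** 2,
        13 / 12 * (v0 - 2 * vp1 + vp2) ** 2 + 0.25 * (3 * v0 - 4 * vp1 + vp2) ** 2,
    ])
    C = np.array([0.1, 0.6, 0.3])      # optimal weights for r = 3
    alpha = C / (eps + IS) ** p        # unnormalized weights, as in (4.25)
    w = alpha / alpha.sum()            # normalized weights, as in (4.24)
    return float(w @ q)                # convex combination, as in (4.23)
```

On smooth data the three indicators are of similar size and the weights stay close to the optimal ones; when a discontinuity lies inside some stencils, their indicators are large and the corresponding candidates are effectively switched off.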

4.3.2 ENO and WENO reconstructions for point-value discretizations

So far we have described the construction of the operator R that we will use to compute the characteristic numerical fluxes $\psi^L_{p,j}$ and $\psi^R_{p,j}$ from the characteristic fluxes $f^L_{p,j}$ and $f^R_{p,j}$ in Algorithms 4 and 5. The characteristic fluxes are projections of the equation fluxes following the characteristic fields corresponding to the states $U^L_{j+\frac12}$ and $U^R_{j+\frac12}$.

In order to achieve a high order of accuracy, a reconstruction of the point-values of the solution at the cell interfaces from its point-values at the cell centers is needed. In this section we describe the reconstruction used to compute the values $U^L_{j+\frac12}$ and $U^R_{j+\frac12}$ in Donat-Marquina’s algorithm, which is a modified version of the WENO algorithm described in the previous section. We refer to [8, 18] for further details.

To build a WENO interpolant that approximates point-values from point-values we follow a setup similar to the one used to compute approximations to point-values from cell-averages. Each reconstruction is computed using all the possible stencils of $r$ points that contain the point $x_j$ (for $U^L_{j+\frac12}$) or $x_{j+1}$ (for $U^R_{j+\frac12}$).

Using the same notation as in section 4.3.1, for the reconstruction of $U^L_{j+\frac12}$, we find that, for $r = 3$, there exist coefficients $C_k^r$ such that
\[ q_j^{2r-1}(x_{j+\frac12}) = C_0^r\, q^r_{j,0}(x_{j+\frac12}) + C_1^r\, q^r_{j,1}(x_{j+\frac12}) + C_2^r\, q^r_{j,2}(x_{j+\frac12}), \tag{4.27} \]

where $q^r_{j,k}$ is the Lagrange interpolatory polynomial of the solution $\{U_k\}$ constructed from the stencil $S_{j,k}$:
\[ S_{j,k} = \{x_{j-r+1+k}, \dots, x_{j+k}\}, \quad 0 \leq k < r. \]
For $r = 3$ we have the three stencils:
\[ S_{j,0} = \{x_{j-2}, x_{j-1}, x_j\}, \quad S_{j,1} = \{x_{j-1}, x_j, x_{j+1}\}, \quad S_{j,2} = \{x_j, x_{j+1}, x_{j+2}\}, \]
and the corresponding reconstructions
\[ \begin{aligned} q^3_{j,0} &= \frac{15}{8} U_j - \frac{5}{4} U_{j-1} + \frac{3}{8} U_{j-2}, \\ q^3_{j,1} &= \frac{3}{8} U_{j+1} + \frac{3}{4} U_j - \frac{1}{8} U_{j-1}, \\ q^3_{j,2} &= \frac{3}{8} U_j + \frac{3}{4} U_{j+1} - \frac{1}{8} U_{j+2}. \end{aligned} \tag{4.28} \]
The interpolated value at $x_{j+\frac12}$ that results from the stencil
\[ S_j := \{x_{j-2}, x_{j-1}, x_j, x_{j+1}, x_{j+2}\} \]
is given by
\[ q_j^5 = \frac{3}{128} U_{j-2} - \frac{20}{128} U_{j-1} + \frac{90}{128} U_j + \frac{60}{128} U_{j+1} - \frac{5}{128} U_{j+2}. \]
Solving (4.27), one obtains the coefficients:
\[ C_0^3 = \frac{1}{16}, \quad C_1^3 = \frac{10}{16}, \quad C_2^3 = \frac{5}{16}. \tag{4.29} \]
It can be shown that using the same weights as in the cell-average case, i.e., the ones defined by (4.24), fifth order accuracy is obtained in smooth regions using the combination (4.23) as an approximation of $h_{j+\frac12}$, whereas the weights approach zero for the reconstructions corresponding to stencils that contain discontinuities. Therefore, we take the approximation:
\[ U^L(x_{j+\frac12}) = \sum_{k=0}^{r-1} w^3_{j,k}\, q^3_{j,k}(x_{j+\frac12}), \]
with $q^r_{j,k}$ defined by (4.28), and $w^r_{j,k}$ defined by (4.24) using the optimal weights (4.29).

A similar analysis can be performed for the computation of $U^R_{j+\frac12}$, using the stencils
\[ S_{j,k} = \{x_{j-r+2+k}, \dots, x_{j+k+1}\}, \quad 0 \leq k < r. \]


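In this case, and for $r = 3$, the three interpolatory polynomials are given by
\[ \begin{aligned} q^3_{j,0} &= \frac{3}{8} U_{j+1} + \frac{3}{4} U_j - \frac{1}{8} U_{j-1}, \\ q^3_{j,1} &= \frac{3}{8} U_j + \frac{3}{4} U_{j+1} - \frac{1}{8} U_{j+2}, \\ q^3_{j,2} &= \frac{15}{8} U_{j+1} - \frac{5}{4} U_{j+2} + \frac{3}{8} U_{j+3}, \end{aligned} \]
and the optimal weights, which by symmetry with respect to $x_{j+\frac12}$ are those of (4.29) in reverse stencil order, are
\[ C_0^3 = \frac{5}{16}, \quad C_1^3 = \frac{10}{16}, \quad C_2^3 = \frac{1}{16}. \]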
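The relation (4.27) for the left-biased reconstruction can be verified with exact rational arithmetic. The sketch below is only a consistency check, not part of the scheme: each candidate in (4.28) is stored as a map from the offset $m$ of the node $x_{j+m}$ to its coefficient, and the helper name `combine` is hypothetical.

```python
from fractions import Fraction as F

def combine(weights, candidates):
    """Linear combination of candidate reconstructions, each given as a
    dictionary {offset m: coefficient of U_{j+m}}."""
    total = {}
    for w, cand in zip(weights, candidates):
        for offset, coeff in cand.items():
            total[offset] = total.get(offset, F(0)) + w * coeff
    return total

# Candidates (4.28) for U^L at x_{j+1/2}
left_candidates = [
    {0: F(15, 8), -1: F(-5, 4), -2: F(3, 8)},   # stencil S_{j,0}
    {1: F(3, 8), 0: F(3, 4), -1: F(-1, 8)},     # stencil S_{j,1}
    {0: F(3, 8), 1: F(3, 4), 2: F(-1, 8)},      # stencil S_{j,2}
]
left_weights = [F(1, 16), F(10, 16), F(5, 16)]  # optimal weights (4.29)

q5 = combine(left_weights, left_candidates)
# Five-point Lagrange interpolation at x_{j+1/2}
expected = {-2: F(3, 128), -1: F(-20, 128), 0: F(90, 128),
            1: F(60, 128), 2: F(-5, 128)}
assert q5 == expected
```

Using `Fraction` avoids any rounding, so the equality test is exact.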

4.4 The complete integration algorithm

For the sake of completeness and clarity we summarize here the complete numerical algorithm that results from putting together the following ingredients:
• Shu-Osher’s finite-difference formulation for flux reconstruction, described in section 4.1.
• Donat-Marquina’s flux formula, described in section 4.2.
• A fifth order flux reconstruction based on the WENO algorithm, described in section 4.3.
• The third order Runge-Kutta algorithm (3.15) for time integration.

Consider a one-dimensional system:
\[ u_t + f(u)_x = 0, \quad (x, t) \in \mathbb{R} \times [0, T], \tag{4.30} \]
where $u : \mathbb{R} \to \mathbb{R}^m$, and $f : \mathbb{R}^m \to \mathbb{R}^m$, for $m > 1$, with initial data
\[ u(x, 0) = u_0(x), \quad x \in \mathbb{R}. \tag{4.31} \]
We assume a uniform discretization of the interval $[0, 1] \times [0, T]$, defined in the same way as in section 3.1, i.e.,
\[ x_j = \left(j + \frac12\right) \Delta x, \quad t^n = n \Delta t, \tag{4.32} \]
where $\Delta x = \frac{1}{N}$ and $\Delta t = \frac{1}{M}$, for $N, M \in \mathbb{N} \setminus \{0\}$. The numerical solution of the problem defined by (4.30) and (4.31) consists of the following steps:


• Compute a discretization of the initial data by $U_j^0 = u_0(x_j)$.
• For $n = 0$ until $M - 1$ compute $U^{n+1}$ from $U^n$ as follows:
  1. From $\{U_j^n\}_{j=0}^{N-1}$ compute values $\{U_{-3}^n, U_{-2}^n, U_{-1}^n, U_N^n, U_{N+1}^n, U_{N+2}^n\}$, corresponding to the ghost nodes $\{x_{-3}, x_{-2}, x_{-1}, x_N, x_{N+1}, x_{N+2}\}$ using a procedure for artificial boundary conditions, as described in section 3.7.
  2. From $\{U_j^n\}_{j=-3}^{N+2}$ compute two reconstructed values $U^L_{j+\frac12}$ and $U^R_{j+\frac12}$ for $j = -1, \dots, N$, according to (4.10). The reconstruction operator S is built as described in section 4.3.2.
  3. For each $j$ compute two sets of characteristic variables $W^L_{p,k}$ and $W^R_{p,k}$ and characteristic fluxes $f^L_{p,k}$ and $f^R_{p,k}$ using (4.11), for the value $s_1 = s_2 = 2$, and for $j = -1, \dots, N$.
  4. For each $j = -1, \dots, N$ compute two sets of upwind characteristic fluxes $\psi^L_{j+\frac12}$ and $\psi^R_{j+\frac12}$ according to Algorithm 4 or Algorithm 5.
  5. Compute the numerical fluxes $\hat f_{j+\frac12}(U^n)$ using (4.12), for $j = -1, \dots, N$.
  6. Perform the first step of the Runge-Kutta algorithm (3.15), and compute $U_j^{(1)}$ according to
     \[ U_j^{(1)} = U_j^n - \frac{\Delta t}{\Delta x} \left( \hat f_{j+\frac12}(U^n) - \hat f_{j-\frac12}(U^n) \right), \]
     for $j = 0, \dots, N - 1$.
  7. Repeat steps (1) to (5) replacing $U_j^n$ by $U_j^{(1)}$ and compute numerical fluxes $\hat f_{j+\frac12}(U^{(1)})$, for $j = -1, \dots, N$.
  8. Perform the second step of the Runge-Kutta algorithm (3.15), and compute $U_j^{(2)}$ according to
     \[ U_j^{(2)} = \frac34 U_j^n + \frac14 U_j^{(1)} - \frac14 \frac{\Delta t}{\Delta x} \left( \hat f_{j+\frac12}(U^{(1)}) - \hat f_{j-\frac12}(U^{(1)}) \right), \]
     for $j = 0, \dots, N - 1$.
  9. Repeat steps (1) to (5) replacing $U_j^n$ by $U_j^{(2)}$ and compute numerical fluxes $\hat f_{j+\frac12}(U^{(2)})$, for $j = -1, \dots, N$.
  10. Perform the third step of the Runge-Kutta algorithm (3.15), and compute $U_j^{n+1}$ according to
     \[ U_j^{n+1} = \frac13 U_j^n + \frac23 U_j^{(2)} - \frac23 \frac{\Delta t}{\Delta x} \left( \hat f_{j+\frac12}(U^{(2)}) - \hat f_{j-\frac12}(U^{(2)}) \right), \]
     for $j = 0, \dots, N - 1$.
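The three Runge-Kutta stages above share the same conservative flux-difference update. The following sketch shows that structure only, under simplifying assumptions that are not those of the scheme described here: ghost values are filled periodically, and the numerical flux is a first order upwind flux for the linear equation $u_t + u_x = 0$, used as a stand-in for the WENO-based Donat-Marquina flux. The names `flux_difference`, `rk3_step` and `upwind_flux` are hypothetical.

```python
import numpy as np

def flux_difference(U, numerical_flux, dx):
    """Spatial operator -(fhat_{j+1/2} - fhat_{j-1/2}) / dx.
    numerical_flux(U)[j] approximates the flux at x_{j+1/2};
    periodic boundaries are assumed for brevity."""
    fhat = numerical_flux(U)
    return -(fhat - np.roll(fhat, 1)) / dx

def rk3_step(U, numerical_flux, dx, dt):
    """One step of the three-stage Runge-Kutta scheme, written with the same
    convex-combination coefficients as the update formulas in the text."""
    U1 = U + dt * flux_difference(U, numerical_flux, dx)
    U2 = 0.75 * U + 0.25 * U1 + 0.25 * dt * flux_difference(U1, numerical_flux, dx)
    return U / 3 + 2 / 3 * U2 + 2 / 3 * dt * flux_difference(U2, numerical_flux, dx)

def upwind_flux(U):
    """Placeholder flux for f(u) = u with positive speed: fhat_{j+1/2} = U_j."""
    return U

N = 64
dx = 1.0 / N
dt = 0.5 * dx
x = (np.arange(N) + 0.5) * dx       # cell centers x_j = (j + 1/2) dx
U0 = np.sin(2 * np.pi * x)
U1 = rk3_step(U0, upwind_flux, dx, dt)
```

Because every stage is a flux difference, the discrete sum of the solution is preserved up to rounding, whatever numerical flux is plugged in.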

5 Adaptive mesh refinement

In this chapter we introduce the infrastructure needed to build an adaptive mesh refinement algorithm for the numerical solution of hyperbolic systems of conservation laws. We will focus on conservative numerical methods, in particular on the high order numerical scheme described in chapter 4. For the sake of simplicity, we describe here the AMR algorithm in a very simple form, using uniform discretizations of an interval of the real line. In chapter 6 we describe how these ideas extend to two dimensions and we show the details of our actual two-dimensional implementation of the algorithm. A wider description of adaptive mesh refinement in a more general setup has been postponed to Appendix A; the algorithms described in this chapter are particular cases of the general approach described there. Mesh refinement in a wide variety of forms is nowadays present in almost any algorithm for the integration of hyperbolic equations. As already explained, the formation of discontinuities and small scale features

in the solutions of such equations forces the use of very fine meshes, whose scale is smaller than the scale of the flow features to be resolved, in the numerical algorithms. As stated in the introduction, most of the difficulties associated with the numerical solution of hyperbolic conservation laws come from the lack of smoothness of the solution in some regions of the computational domain. When this situation appears, numerical methods have to cope with weak solutions, which typically are piecewise smooth functions with corners or discontinuities. In complex problems, other features such as turbulence and vorticity can appear. High order methods are able to compute very accurate solutions on meshes coarser than the ones used with lower order methods, and in recent years there has been an active debate on the relative efficiency of low and high order methods. Nowadays, however, the increasing demands on CFD capabilities make clear that it is necessary to develop algorithms that combine high order methods with very fine meshes. The drawback of this approach is the prohibitive computational cost of such problems, even for massively parallel computers. Any kind of adaptation, enabling the use of high resolution only in a part of the computational domain, will produce a benefit in terms of computational efficiency. This reduction in the overall cost can be achieved in several ways. Multiresolution algorithms [62, 63, 35, 38], for example, try to reduce the cost by using a high order nonlinear integrator where needed, and switching to a much cheaper interpolation elsewhere. Adaptive grid algorithms try to fit the grid resolution to the needs of the numerical solution by refining the grid only in regions where the solution has non-smooth structure, and holding a coarser grid where such a high resolution is not needed. 
Around this idea several adaptive algorithms have been reported in the literature. Among the best known are moving mesh methods [65, 114, 170], multiresolution algorithms, h- and hp-adaptive methods [12] in the context of Galerkin methods, and approximations on non-uniform and unstructured meshes [42, 73]. Combinations of several of these techniques have also been addressed, as, for example, multiresolution algorithms on unstructured meshes [2]. Following a different approach, Berger and Oliger [21, 23, 24] developed the Adaptive Mesh Refinement (AMR) algorithm, with the aim of reducing the overall number of integrations by overlapping grids of different scales, constructing an approach whose main feature is the possibility of performing time refinement as well as space refinement. The algorithm was originally proposed for the implementation of artificial viscosity schemes, and later extended to more general finite volume schemes [22]. A simplified version was described by Quirk in [139]. Since then, much effort has been devoted to the development of new algorithms around the AMR idea. Current applications of AMR cover areas as different as cosmology [1], magnetohydrodynamics (MHD) [52, 7], weather prediction [77] or plasma physics [182]. More and more complex models are being incorporated into AMR algorithms. As an example, the development and implementation of immersed boundary methods [124], with applications in fluid-structure interaction problems, in combination with the AMR technique is currently being investigated [149, 59].

The AMR algorithm is a two-fold adaptive method. It refines the grid by overlapping fine grids over coarse grids in order to obtain grid sizes small enough to resolve the small features of the solution with arbitrary accuracy. The way in which the grids are overlapped also allows refinement in time, in the sense that each grid is integrated with temporal steps adapted to its spatial grid size. The largest time step allowable for a grid is essentially proportional to the size of its smallest cell, in order to ensure accuracy and stability. Therefore, coarser cells could be evolved with bigger time steps if they were considered in isolation, but, since every cell in the grid has to be evolved to the same time at each time iteration, the time step is fixed for every cell in the grid. The AMR algorithm allows fine cells to be evolved with smaller time steps than coarse cells, by grouping them into independent grids in such a way that all cells that belong to each grid have the same size. Each grid can overlap with coarser and finer grids. Even if the number of cells increases with respect to a grid containing cells of mixed size, one can expect that the total number of integrations required to evolve the solution for a given time step is reduced. 
Under favorable circumstances the AMR algorithm requires only a part of the computational power (and hence time) needed by an equivalent fine grid of uniform size or a grid of non constant size. In particular, if the integration algorithm is computationally expensive, as is the case of high order methods, the gain due to the reduction of the overall number of integrations can be very important. The differentiating idea of the AMR algorithm is therefore the creation of a collection of grids, each of them containing cells of constant size, that can be, at least in part, considered in isolation as independent grids.

The AMR algorithm rests on the fact that the solution of a hyperbolic system of conservation laws is composed of waves moving through regions in which the solution is smooth. The main difficulty of HRSC schemes consists of catching and resolving these waves, since in the regions where the solution is smooth the differential form of the equations holds, and the solution can be computed with no major difficulties. The leading issue of AMR is to dynamically adapt the resolution of the grids to the requirements of the actual numerical solution, which changes in time. It is well known that the movement of the waves is governed by the eigenstructure of the Jacobian matrix of the system; in particular, the speed of the waves is controlled by its eigenvalues. This information can be used to design a strategy to decide the moment when a grid has to be refined, so that the AMR algorithm can ensure that, if the waves are initially covered by a fine grid, they will always be, by adapting the grids before the waves can move to a region that is not covered by the fine grid. The different parts involved in an AMR algorithm have to be interrelated and organized in such a way that the aforementioned property is satisfied. In this chapter we motivate and construct the building blocks for the complete algorithm that results from putting together the integration algorithm described in chapter 4 and a version of the AMR algorithm.

5.1 Motivation

Given a bounded subset $\Omega_1 \subset \mathbb{R}^d$ and due to the hyperbolicity of system (2.1), its solution $u$ at $\Omega_1 \times [t_0, t_0 + \Delta t]$ only depends on the values of $u$ at a superset of $\Omega_1 \times \{t_0\}$ which is determined by the domain of dependence of the equations, which is also a bounded set. Numerical methods mimic this idea, and are designed in a way such that the numerical solution at $\Omega_1 \times [t_0, t_0 + \Delta t]$ provided by the scheme is determined using data from another bounded set, which is defined by the numerical domain of dependence of the method. These concepts have been introduced in section 3.2.2.

Let $\tilde\Omega_1 \times \{t_0\}$ be the numerical domain of dependence of a numerical scheme under consideration. Assume that the CFL condition is imposed, so that the numerical domain of dependence contains the domain of dependence of the equation. In this situation, given an approximation to $u$ on a computational mesh over $\tilde\Omega_1 \times \{t_0\}$, it is then possible to compute approximations to $u(x, t)$, for $(x, t) \in \Omega_1 \times [t_0, t_0 + \Delta t]$, by means of the numerical scheme. The weakness of this exposition is that further applications of this idea to compute an approximation of $u$ on a mesh over $\Omega_1 \times [t_0 + \Delta t, t_0 + 2\Delta t]$ would require the knowledge of approximations of $u$ on $\tilde\Omega_1 \times \{t_0 + \Delta t\}$, but we only have approximations of $u$ on its subset $\Omega_1 \times \{t_0 + \Delta t\}$. The approximated values of $u$ on the “surrounding band” $\hat\Omega_1 := \tilde\Omega_1 \setminus \Omega_1$ must be obtained by an auxiliary procedure. If $u$ is smooth on $\hat\Omega_1$, the values on the given mesh that lie in this band can be accurately interpolated from an approximation of $u$ on a coarse mesh defined on a domain $\tilde{\tilde\Omega}_1 \supseteq \tilde\Omega_1$. Fig. 5.1 illustrates this idea.

Figure 5.1: A region $\Omega_1$ and its surrounding band $\hat\Omega_1$. Data for $\hat\Omega_1$ can be obtained by interpolation from the coarser mesh.

The AMR algorithm can be described by the recursive application of this essential idea to an arbitrarily deep hierarchy of increasingly finer nested grids. At a given time, each grid conceptually contains the flow features that can not be accurately “predicted” from lower resolution levels. The transient nature of the non-smooth flow features endows a dynamical character to the grid hierarchy. An important feature of the algorithm is the time step adaptation referred to as “local time stepping”, which consists in the evolution of each resolution level by its own time step, instead of using a global time step that would inevitably be small, as dictated by CFL restrictions on the finest mesh. The local time stepping means that the algorithm performs more time steps in a certain resolution level than in the immediately coarser level, and this is another key feature for improving the overall performance of the algorithm. The algorithm, as described up to this point, only uses information on a coarse grid to predict the required values at the band surrounding a grid at the immediately finer level. Therefore, the approximated


solutions at each resolution level would not be synchronized. This lack of synchronization between resolution levels would then be reflected in worse fine resolution predictions that would negatively affect the efficiency of the algorithm. This behavior can be corrected by means of a “projection” of information from fine to immediately coarser resolutions that, although optional, is a key element that determines the foundations of the algorithm. In a finite volume setting, with the basic scheme evolving approximated cell-averages, it is reasonable to use cell-based mesh-refinement and to project approximated cell-averages at the fine cells covering a coarse cell by just averaging the values at the fine cells. When the basic scheme evolves accurate approximations to point values of the solution, which is the case in Shu-Osher’s finite-difference flux formulation, the most straightforward projection is to organize the grids so that the nodes of a coarse grid belong also to the immediately finer grid, so that the projection amounts to just copying the values at the fine grid to the corresponding position on the coarse grid. But this choice would entail a loss of information, because of the lack of conservation between meshes, and would render unclear the implementation of a conservative scheme. This is the reason for our choice of cell-based grids for our formulation based on point values, where the projection is based on flux projection rather than on solution projection. Fig. 5.2 shows two nested meshes with the nodes of the coarse and fine grids aligned so that the solution can be copied from fine to coarse (top plot) and another pair of meshes where the projection is achieved by means of fluxes thanks to the alignment of the cell boundaries (bottom plot). Let us briefly describe a one-dimensional version of the algorithm for a scalar equation. 
This description will provide an intuitive idea of how the algorithm works and will allow the reader to identify the main parts of the algorithm, to be described for a more general case in further sections.

Figure 5.2: A pair of grids suitable for solution projection (top) and a pair of grids suitable for flux projection (bottom).

5.2 Discretization and grid organization

Consider the problem
\[ \begin{aligned} u_t + f(u)_x &= 0, & x \in \mathbb{R},\ t > 0, \\ u(x, 0) &= u_0(x), & x \in \mathbb{R}, \end{aligned} \]
and assume that our computational domain is the interval $\Omega = [0, 1]$, and appropriate numerical boundary conditions are imposed where required (see section 3.7), i.e., assume that the integration algorithm can be applied on a fixed mesh defined on $\Omega$. For simplicity we consider an equally spaced mesh $G_0$ composed of $N_0$ cells of length $\Delta x_0 = \frac{1}{N_0}$. As already mentioned, meshes of higher and higher resolution are needed for the computation of small-scale flow features. A set of $L$ meshes with increasing level of refinement can be built from the mesh $G_0$ by considering meshes obtained by the subdivision of each cell in the immediately coarser level in two, i.e., the unit interval is divided into $N_0, \dots, N_{L-1}$ subintervals (cells) of length $\Delta x_\ell = 1/N_\ell$, with $N_\ell = 2^\ell N_0$, $\ell = 0, \dots, L - 1$. The centers of those cells will be denoted by $x_j^\ell = (j + \frac12)\Delta x_\ell$, $j = 0, \dots, N_\ell - 1$, $\ell = 0, \dots, L - 1$. An example illustrating this construction is depicted in Fig. 5.3.

Figure 5.3: Meshes with increasing resolution obtained by dyadic subdivision of a base mesh, with L = 4 and N0 = 2.

A grid $G_\ell$ at resolution level $\ell > 0$ can be interpreted either as a subset of $\{0, \dots, N_\ell - 1\}$ or as the family of cells represented by its elements. The extent of the grid is the union of the cells indexed by elements of $G_\ell$ and is denoted by $\Omega_\ell(G_\ell)$. The AMR algorithm can be described by the time evolution of a grid hierarchy, which is nothing but a tuple of “grid functions”
\[ \left( (u_\ell^{t_\ell}, G_\ell^{t_\ell}) \,/\, \ell = 0, \dots, L - 1 \right), \]
where $G_\ell^{t_\ell} \subseteq \{0, \dots, N_\ell - 1\}$ and $u_\ell^{t_\ell} = (u_{\ell,j}^{t_\ell} \,/\, j \in G_\ell^{t_\ell})$, with $u_{\ell,j}^{t_\ell} \approx u(x_j^\ell, t_\ell)$,

Figure 5.4: A sample grid hierarchy (left) and a set of grids that do not compose a conformal grid hierarchy (right). The grid at level 2 is not contained in the grid at level 1, and the grid at level 3 is not obtained from subdivision of cells of the grid at level 2.

for some time $t_\ell$. This time evolution starts with $t_\ell = 0$, $\ell = 0, \dots, L-1$, and ends when $t_\ell = T$, $\ell = 0, \dots, L-1$. Meanwhile, $t_\ell \ge t_{\ell'}$ if $\ell \le \ell'$. We assume the following conditions for the grid hierarchy (see Fig. 5.4 for examples):

• $G_0 = \{0, \dots, N_0 - 1\}$,
• For $\ell > 0$, $G_\ell = \{2i,\ 2i+1, \text{ for some } i \in G_{\ell-1}\}$.

In plain words, the above conditions mean that the coarsest grid covers the whole computational domain, and that the rest of the grids are obtained by dyadic subdivision of the immediately coarser grid. In particular, the second condition implies that the grids are nested, i.e., $\Omega_\ell(G_\ell) \subseteq \Omega_{\ell-1}(G_{\ell-1})$.

The main building blocks of an AMR algorithm for the numerical solution of a hyperbolic equation are: integration, which amounts to the application of the algorithm described in chapter 4 to each cell in each grid; adaptation, which ensures that an adequate degree of refinement is applied to each part of the domain; and projection, which is a procedure to enforce conservation between the different grids in the zones where they overlap.
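The mesh construction and the two conditions on the grid hierarchy can be illustrated with a short sketch. It is only an illustration under our own assumptions: the function names `cell_centers` and `is_valid_hierarchy` are hypothetical, grids are represented as Python sets of cell indices, and the second condition is read as "every fine cell has its parent in the coarser grid and appears together with its sibling".

```python
from fractions import Fraction


def cell_centers(level, n0):
    """Centers x_j^l = (j + 1/2) * dx_l of the uniform mesh with N_l = 2**l * N_0 cells on [0, 1]."""
    n_cells = 2**level * n0
    dx = Fraction(1, n_cells)
    return [(j + Fraction(1, 2)) * dx for j in range(n_cells)]


def is_valid_hierarchy(grids, n0):
    """Check the two conditions assumed for a grid hierarchy.

    grids[l] is the set of cell indices of level l (hypothetical representation).
    """
    # First condition: the coarsest grid covers the whole domain.
    if set(grids[0]) != set(range(n0)):
        return False
    # Second condition: each finer grid is made of complete sibling pairs
    # {2i, 2i+1} whose parent i belongs to the immediately coarser grid.
    for coarse, fine in zip(grids, grids[1:]):
        coarse, fine = set(coarse), set(fine)
        for j in fine:
            parent = j // 2
            if parent not in coarse:
                return False
            if not {2 * parent, 2 * parent + 1} <= fine:
                return False
    return True
```

For instance, with N0 = 2 the call `cell_centers(1, 2)` returns the nodes 1/8, 3/8, 5/8, 7/8 labeled in Fig. 5.3, and `is_valid_hierarchy([{0, 1}, {2, 3}, {4, 5, 6, 7}], 2)` accepts a nested three-level hierarchy.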

5.3 Integration

Let us describe the necessary steps for the time evolution of a grid hierarchy that initially corresponds to a time $t$ (the same for all grids), up to a time $t + \Delta t_0$, where $\Delta t_0$ is a suitable time step for the integration of the grid $G_0$. In particular, $\Delta t_0$ is chosen such that

$$\Delta t_0 \le \frac{\Delta x_0}{\max_u |f'(u)|}, \qquad (5.1)$$

and the time steps for the rest of the levels are recursively defined by

$$\Delta t_\ell = \frac{\Delta t_{\ell-1}}{2}, \qquad \ell = 1, \dots, L-1. \qquad (5.2)$$

In this way the CFL condition for each grid is satisfied. Note that one time step of a grid $G_\ell$ corresponds to two time steps of the immediately finer grid, so that all grids can be integrated up to the same time $t + \Delta t_0$. At that point, the process described here is repeated to integrate the equations from time $t + \Delta t_0$ up to time $t + 2\Delta t_0$, so that it suffices to describe the evolution of the grid hierarchy for a single time step of the coarsest grid. The integration is organized in a sequential fashion, based on the following conditions:

1. The integration of a grid is performed after the integration of the immediately coarser grid.
2. After a grid $G_\ell$, with $\ell > 0$, is integrated up to a time $t + k\Delta t_\ell$ ($k \in \{1, \dots, 2^\ell\}$), it is not integrated again before the immediately finer grid has been integrated up to time $t + k\Delta t_\ell$.

The pseudo-code of an algorithm that performs the required sequence of integrations on a set of grids $G := \{G_0, \dots, G_{L-1}\}$ in order to update the grid at a given refinement level $\ell$ and finer, from a given time $t$ to time $t + \Delta t_\ell$, is shown in Fig. 5.5. The call to integrate(Gℓ) performs the application of the numerical scheme to the grid $G_\ell$. An integration from time $t$ to time $t + \Delta t_0$ of the sequence of grids $G$ is achieved by the call update(G, 0). In Fig. 5.5 the gridlist data type stores a grid hierarchy. This data type is explained in detail in chapter 6.

As an example, the integration process from time $t$ to time $t + \Delta t_0$ for a grid hierarchy composed of three levels would be performed following the order indicated in Fig. 5.6. When all grids have been integrated up to the same time $t + \Delta t_0$, the process is repeated up to the next coarse time step.

Note that for the integration of a grid $G_\ell$ ($\ell > 0$) from time $t$ up to time $t + \Delta t_\ell$, data from the “surrounding band” $\tilde\Omega_\ell(G_\ell) \times \{t\}$ is required. This data is obtained by spatial interpolation from $\tilde\Omega_{\ell-1}(G_{\ell-1})$. On the other hand, for the integration of the same grid from time $t + \Delta t_\ell$ up to

Function update(G: gridlist, ℓ: integer)
    integrate(Gℓ)
    if (ℓ < L − 1)
        for k = 1 until 2
            update(G, ℓ + 1)
        end for
    end if
end function

Figure 5.5: A recursive algorithm to integrate level ℓ and finer of a grid hierarchy G = {G0, . . . , GL−1}, for one time step of the grid Gℓ. A call to update(G, 0) will integrate the whole grid hierarchy for a time step of the coarsest grid.

Order of integration   Grid   Time step   Time
1                      G0     ∆t0         t + ∆t0
2                      G1     ∆t0/2       t + ∆t0/2
3                      G2     ∆t0/4       t + ∆t0/4
4                      G2     ∆t0/4       t + ∆t0/2
5                      G1     ∆t0/2       t + ∆t0
6                      G2     ∆t0/4       t + 3∆t0/4
7                      G2     ∆t0/4       t + ∆t0

Figure 5.6: Sample sequence of integrations for a three-level grid hierarchy

$t + 2\Delta t_\ell$, data is required at $\tilde\Omega_\ell(G_\ell) \times \{t + \Delta t_\ell\}$. In this case it is obtained by interpolation from $(u_{\ell-1}^{t}, G_{\ell-1})$ and $(u_{\ell-1}^{t+2\Delta t_\ell}, G_{\ell-1})$, which must have been computed in previous steps. This interpolation procedure is described in detail in section 5.6.
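The ordering imposed by the recursive algorithm of Fig. 5.5, together with the time steps (5.2), can be reproduced with a small sketch. It does not integrate anything: the call to integrate is replaced by bookkeeping of the time reached by each level, and the names `update` and `schedule` and the use of exact fractions are our own choices for illustration.

```python
from fractions import Fraction


def update(level, num_levels, dt0, times, log):
    """Recursive traversal of Fig. 5.5; records (level, time step, time reached)."""
    dt = dt0 / 2**level              # time step of this level, as in (5.2)
    times[level] += dt               # stands in for integrate(G_level)
    log.append((level, dt, times[level]))
    if level < num_levels - 1:
        for _ in range(2):           # two fine steps per coarse step
            update(level + 1, num_levels, dt0, times, log)


def schedule(num_levels, dt0=Fraction(1)):
    """Sequence of integrations for one coarse time step, with t = 0."""
    times = [Fraction(0)] * num_levels
    log = []
    update(0, num_levels, dt0, times, log)
    return log
```

With three levels, `schedule(3)` visits the levels in the order 0, 1, 2, 2, 1, 2, 2, which is the sequence listed in Fig. 5.6 (times are expressed in units of ∆t0).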

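5.4 Projection

Once $(u_\ell^{t_\ell+2\Delta t_\ell}, G_\ell^{t_\ell})$ has been computed, there is data that overlays $\Omega_\ell(G_\ell^{t_\ell})$ at different resolution levels. It is at this point that the projection of the data at the fine resolution level should be applied to modify the values $u_{\ell-1,j}^{t_\ell+2\Delta t_\ell}$ of the immediately coarser grid function that correspond to cells overlaid by cells of $G_\ell^{t_\ell}$, and to cells adjacent to them as well, i.e., cells in $G_{\ell-1}^{t_\ell}$ whose indices $i$ verify $\{2i, 2(i-1), 2(i+1)\} \cap G_\ell^{t_\ell} \ne \emptyset$.

In order to explain how the values of the coarse grid are modified, let us analyze the detailed computation of $(u_\ell^{t_\ell+k\Delta t_\ell}, G_\ell^{t_\ell})$, for $k = 1, 2$. Let $i \in G_{\ell-1}^{t_\ell}$ be such that $2i, 2i+1 \in G_\ell^{t_\ell}$. The update of $G_\ell^{t_\ell}$ from $t_\ell$ to $t_\ell + 2\Delta t_\ell$ is performed by means of the following computations, for $j = 2i, 2i+1$:

$$u_{\ell,j}^{t_\ell+\Delta t_\ell} = u_{\ell,j}^{t_\ell} - \frac{\Delta t_\ell}{\Delta x_\ell}\left(\hat f_{\ell,j+\frac12}^{t_\ell} - \hat f_{\ell,j-\frac12}^{t_\ell}\right),$$
$$u_{\ell,j}^{t_\ell+2\Delta t_\ell} = u_{\ell,j}^{t_\ell+\Delta t_\ell} - \frac{\Delta t_\ell}{\Delta x_\ell}\left(\hat f_{\ell,j+\frac12}^{t_\ell+\Delta t_\ell} - \hat f_{\ell,j-\frac12}^{t_\ell+\Delta t_\ell}\right). \qquad (5.3)$$

From these expressions $u_{\ell,j}^{t_\ell+2\Delta t_\ell}$ can be written as

$$u_{\ell,j}^{t_\ell+2\Delta t_\ell} = u_{\ell,j}^{t_\ell} - \frac{\Delta t_\ell}{\Delta x_\ell}\left(\left(\hat f_{\ell,j+\frac12}^{t_\ell} + \hat f_{\ell,j+\frac12}^{t_\ell+\Delta t_\ell}\right) - \left(\hat f_{\ell,j-\frac12}^{t_\ell} + \hat f_{\ell,j-\frac12}^{t_\ell+\Delta t_\ell}\right)\right). \qquad (5.4)$$

From (5.4) we deduce:

$$\frac{u_{\ell,2i}^{t_\ell+2\Delta t_\ell} + u_{\ell,2i+1}^{t_\ell+2\Delta t_\ell}}{2} = \frac{u_{\ell,2i}^{t_\ell} + u_{\ell,2i+1}^{t_\ell}}{2} - \frac{\Delta t_{\ell-1}}{\Delta x_{\ell-1}}\left(\hat{\hat f}{}_{\ell-1,i+\frac12}^{t_\ell} - \hat{\hat f}{}_{\ell-1,i-\frac12}^{t_\ell}\right), \qquad (5.5)$$

where we have defined:

$$\hat{\hat f}{}_{\ell-1,i+\frac12}^{t_\ell} = \frac{\hat f_{\ell,2i+\frac32}^{t_\ell} + \hat f_{\ell,2i+\frac32}^{t_\ell+\Delta t_\ell}}{2}, \qquad \hat{\hat f}{}_{\ell-1,i-\frac12}^{t_\ell} = \frac{\hat f_{\ell,2i-\frac12}^{t_\ell} + \hat f_{\ell,2i-\frac12}^{t_\ell+\Delta t_\ell}}{2}, \qquad (5.6)$$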

and used that $\frac{\Delta t_\ell}{\Delta x_\ell} = \frac{\Delta t_{\ell-1}}{\Delta x_{\ell-1}}$. Therefore, if we redefine accordingly the coarse numerical fluxes, i.e., we impose

$$\hat f_{\ell-1,i\pm\frac12}^{t_\ell} = \hat{\hat f}{}_{\ell-1,i\pm\frac12}^{t_\ell}, \qquad (5.7)$$

and we assume that at time $t_\ell$ the relation

$$u_{\ell-1,i}^{t_\ell} = \frac{u_{\ell,2i}^{t_\ell} + u_{\ell,2i+1}^{t_\ell}}{2} \qquad (5.8)$$

holds, then after performing the above correction we have

$$u_{\ell-1,i}^{t_\ell+2\Delta t_\ell} = \frac{u_{\ell,2i}^{t_\ell+2\Delta t_\ell} + u_{\ell,2i+1}^{t_\ell+2\Delta t_\ell}}{2}, \qquad (5.9)$$

i.e., the same relation holds for time $t_\ell + 2\Delta t_\ell = t_{\ell-1} + \Delta t_{\ell-1}$. Note that the substitutions in (5.7) make sense because the values $\hat{\hat f}{}_{\ell-1,i\pm\frac12}^{t_\ell}$ correspond to an approximation of the numerical fluxes of the equation at the cell interfaces of the grid $G_{\ell-1}$. This follows from the definition of $\hat{\hat f}{}_{\ell-1,i\pm\frac12}^{t_\ell}$ in (5.6) and the choice of the nodes $x_i^{\ell-1} = (i + \frac12)\Delta x_{\ell-1}$ as the cell centers. The values $\hat{\hat f}{}_{\ell-1,i\pm\frac12}^{t_\ell}$ correspond, respectively, to approximations of the numerical fluxes at the points

$$x_{2i+\frac32}^{\ell} = (2i+2)\Delta x_\ell = (i+1)\Delta x_{\ell-1} = x_{i+\frac12}^{\ell-1}$$

and

$$x_{2i-\frac12}^{\ell} = 2i\,\Delta x_\ell = i\,\Delta x_{\ell-1} = x_{i-\frac12}^{\ell-1}.$$

These points are the cell interfaces of the cell whose center is the point $x_i^{\ell-1}$. Note that the same relation does not hold if the nodes are taken as the cell boundary points, i.e., $x_i^\ell = i\,\Delta x_\ell$. We therefore conclude that, even for the case of a discretization based on point values, a coherent projection can be based on a discrete form of conservation in the sense of cell averages.

In the adaptation procedure, described below, we have enforced the construction of adapted grids in such a way that they are obtained by dyadic subdivision of the underlying coarser grids. In other words, if a cell of a grid $G_\ell$, whose index can be $2i$ or $2i+1$ for some $i \in G_{\ell-1}$, is marked for refinement, then both cells $2i$ and $2i+1$ are forced to belong to the refined grid. In this way, the numerical fluxes required to build a

Function update(G: gridlist, ℓ: integer)
    integrate(Gℓ)
    if (ℓ < L − 1)
        for k = 1 until 2
            update(G, ℓ + 1)
        end for
        project(Gℓ+1)
    end if
end function

Figure 5.7: A recursive algorithm, that includes projection, to integrate level ℓ and finer of a grid hierarchy G = {G0, . . . , GL−1}, for one time step of the grid Gℓ.

conservative projection, in the sense of (5.8) – (5.9), are available in the finer grid. The indicated projection is performed each time a grid has been integrated one coarse time step. We can modify the algorithm in Fig. 5.5 to include the projection as indicated in Fig. 5.7. A call to project(Gℓ) updates the solution corresponding to the grid Gℓ−1 , according to the correction (5.7).
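The conservation argument of (5.3)–(5.9) can be checked numerically in a deliberately simple setting. The sketch below is not the scheme used in this work: it assumes linear advection with unit speed, a first-order upwind numerical flux, forward Euler time stepping, a periodic domain and a fine grid that covers the whole coarse grid. The function names are hypothetical.

```python
import numpy as np


def upwind_fluxes(u, a=1.0):
    """Numerical fluxes at the right interfaces j+1/2 for u_t + a u_x = 0, a > 0."""
    return a * u


def euler_step(u, flux_right, dt, dx):
    """Conservative update u_j - dt/dx * (f_{j+1/2} - f_{j-1/2}) on a periodic grid."""
    return u - dt / dx * (flux_right - np.roll(flux_right, 1))


def two_level_step(u_coarse, u_fine, dt_coarse, dx_coarse):
    """One coarse step and two fine steps, with and without flux projection."""
    dt_fine, dx_fine = dt_coarse / 2, dx_coarse / 2
    # Coarse step using the coarse grid's own numerical fluxes.
    u_coarse_plain = euler_step(u_coarse, upwind_fluxes(u_coarse), dt_coarse, dx_coarse)
    # Two fine steps, keeping the fluxes of both, as in (5.3).
    f_first = upwind_fluxes(u_fine)
    u_half = euler_step(u_fine, f_first, dt_fine, dx_fine)
    f_second = upwind_fluxes(u_half)
    u_fine_new = euler_step(u_half, f_second, dt_fine, dx_fine)
    # Projected fluxes (5.6): the coarse interface i+1/2 coincides with the
    # fine interface 2i+3/2, i.e. the right interface of the fine cell 2i+1.
    f_projected = 0.5 * (f_first[1::2] + f_second[1::2])
    # Coarse step with the substituted fluxes (5.7).
    u_coarse_proj = euler_step(u_coarse, f_projected, dt_coarse, dx_coarse)
    return u_coarse_plain, u_coarse_proj, u_fine_new
```

If the coarse data initially satisfy (5.8), the values returned in `u_coarse_proj` coincide, up to rounding, with the averages of pairs of entries of `u_fine_new`, which is relation (5.9); the uncorrected values `u_coarse_plain` do not in general.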

5.5 Adaptation

Another key step in the time evolution of the grid hierarchy is adaptation. The reason for adapting the grids is that refinement of the grids must somehow follow the motion of the flow features. Consequently, the grids corresponding to the various refinement levels have to be modified according to the actual characteristics of the flow. We discuss next some procedures to decide which cells are to be included in a given resolution level, and we give a procedure to decide at what moments of the evolution the grids have to be adapted. Finally, we describe how the integration, projection and adaptation processes can be interleaved.

The main goal of the adaptation process is to ensure that discontinuities that are initially covered by a grid corresponding to a given resolution level continue being covered by a grid with the same resolution, as long as the discontinuity persists. On the other hand, the adaptation should be able to catch newly generated discontinuities when they are forming. These goals can be accomplished in many ways, which can be split into two major groups: on the one hand, the ones based on accuracy, where estimators of the local truncation error are used, and a cell is refined or not depending on whether the estimated local truncation error exceeds a given tolerance; on the other hand, the ones based on the flow properties, where measures of the flow features are used instead, and a cell is refined if it is decided that it is near or within a region that has some feature that needs more refinement.

Accuracy-based criteria are, in general, more reliable: if a correct solution cannot be computed by the numerical method, then the zone is refined in order to gain accuracy. These criteria can be applied to very general problems regardless of their physical nature. The main drawback is their relatively high computational cost. Within this group we can mention the method described in [22] by Berger and Colella, who proposed the usage of estimators of the local truncation error, based on Richardson extrapolation. The idea is that for a difference method $H$, that we assume for simplicity in the form (3.5), with temporal and spatial order $q$, the quantity $u(x, t+\Delta t) - H_{\Delta t}(u(x,t))$, which is, up to a constant, the local truncation error defined by (3.7), can be estimated by the formula

$$u(x, t+\Delta t) - H_{\Delta t}(u(x,t)) \approx \frac{H_{\Delta t}^2(u(x,t)) - H_{2\Delta t}(u(x,t))}{2^{q+1} - 2}. \qquad (5.10)$$

We recall from section 3.1 that $H_{\Delta t}^2(u(x,t))$ denotes two successive applications of the numerical method $H$ for a step $\Delta t$ (see page 42). On the other hand, $H_{2\Delta t}$ is the result of the numerical method applied on a grid of size $2\Delta x$ with a time step $2\Delta t$. The approximation in (5.10) is accurate up to order $O(\Delta x^{q+2})$ if the function is smooth. One can therefore identify non-smooth zones as the ones where the approximation is not valid, which in practice amounts to flagging those cells that verify a criterion like

$$\frac{\left|H_{\Delta t}^2(u(x,t)) - H_{2\Delta t}(u(x,t))\right|}{2^{q+1} - 2} > \tau$$

for a given tolerance $\tau$. Similar techniques have been used by many researchers in different contexts, see e.g. [163, 85]. A review of local truncation estimators can be found in [183].

The main drawback of this approach is obviously the fact that extra numerical solutions have to be computed for each point in order to be able to estimate the local truncation error and then decide if the cell is refined or not. In the problems and methods under consideration in this work, such an expense of computational resources can be unacceptable, since the flux computation (i.e., the application of the numerical method) is precisely the most time-consuming part of the algorithm.

On the other hand, flow-based refinement criteria can reduce the computational complexity of the process of marking cells for refinement. The idea is that one can use some knowledge of the qualitative behavior of the solutions of the equations in order to decide which cells need to be flagged for refinement. Typical flow characteristics that one might want to detect are contact discontinuities and shock waves (jump discontinuities) and rarefaction waves (jumps in the first derivative at the tail and the head of the rarefaction). Since they all constitute variations in the solution or in the gradient of the solution, it could be enough to use information coming from the first and second derivatives of the solution as an indicator of the presence of such discontinuities. This is the approach used, for example, by Quirk [139], who used finite-difference approximations to the gradient as estimators. If the change in some seminorm of the gradient of the solution (e.g., the absolute value of the density component of the gradient for Euler equations, or some norm of the gradient for general equations) between two adjacent cells is above a given tolerance, then both cells are flagged for refinement. More precisely, a cell is refined if it verifies a condition of the type

$$\frac{|u(x+\Delta x, t) - u(x,t)|}{\Delta x} > \tau_g, \qquad (5.11)$$

where $\tau_g$ is a given tolerance and $u$ is any quantity of interest. In other contexts the quantities used as indicators may vary. In [74], for example, a flagging procedure based on vorticity is used. Indicators based on different characteristics of the flow are used in [138, 193] for MHD problems. The flow-based indicator has to be tailored to the specific application, and it presents the problem of distinguishing the continuous features at a discrete level. Numerical methods introduce some smearing of the discontinuities, and it is a serious problem to tell apart, for example, a continuous function with a steep gradient from a jump discontinuity.

Regardless of the approach used, be it flow-based or accuracy-based, a thresholding criterion has to be supplied to actually decide if a particular cell needs refinement or not. The choice of the thresholds is also

problem-dependent and has an important influence on the final result. If strict values of the threshold are chosen, more refinement will be needed, thus increasing the computational cost of the algorithm. If the values of the threshold are relaxed, then less refinement will be required, but the risk of leaving unrefined zones that should be refined increases.

Combinations of sensors based on the local truncation error estimation with sensors based on flow features have also been considered [34, 150]. Other authors have studied more general estimators of the local truncation error based on the interpolation reconstruction of the operators [94, 107].

Our proposal, which is similar to the one in [35, 38], mainly consists in including in the grid those cells whose associated values cannot be accurately predicted by interpolation from the next coarser level, thus ensuring that values at cells not refined at some level can be accurately predicted from coarser values. More precisely, for our cell-centered approach, if $x_j^\ell = (j + \frac12)\Delta x_\ell$ is the center of a cell belonging to a grid $G_\ell^t$ and $I(u_{\ell-1}^t, x)$ is an interpolation operator defined on the data $u_{\ell-1}^t = \{u_{\ell-1,i}^t\}_{i \in G_{\ell-1}^t}$, then the cell defined by $x_j^\ell$ will be selected for refinement if the difference

$$\left|u_{\ell,j}^t - I(u_{\ell-1}^t, x_j^\ell)\right| \qquad (5.12)$$

is above a given tolerance $\tau_p$. Note that only the cells present in the actual grid are considered for refinement. New cells are included only because of the addition of some extra cells around each marked cell. We also ensure that the refined grid is obtained by subdivision of coarse cells: if a cell $x_{\ell,j}^t$ is selected for refinement, then every cell that overlaps the same coarse cell as $x_{\ell,j}^t$ is also included in the refined grid.

Further, we also include a cell in the refinement list if the modulus of the discrete gradient, computed in the coarser grid, exceeds some large threshold, so that shock formation can be detected from steepened data. For the discrete gradient we use the approximation:

$$\left|\frac{\partial u}{\partial x}(x_j^{\ell-1}, t)\right| \approx \frac{\max\left(\left|u_{\ell-1,j+1}^t - u_{\ell-1,j}^t\right|, \left|u_{\ell-1,j}^t - u_{\ell-1,j-1}^t\right|\right)}{\Delta x_\ell}. \qquad (5.13)$$

The last observation for this refinement procedure is that it should be performed from fine to coarse resolution levels to ensure that at every moment of the update process it holds that $\Omega_\ell(G_\ell^t) \subseteq \Omega_{\ell-1}(G_{\ell-1}^t)$. We also enforce the inclusion $\Omega_\ell(G_\ell^t) \supseteq \Omega_{\ell+1}(G_{\ell+1}^t)$, so that the whole sequence

of grids verifies the desired inclusions. Finally, note that the process of computing data at the corresponding surrounding bands is possible because the grids are nested, and this implies that $\tilde\Omega_\ell(G_\ell^t) \subseteq \tilde\Omega_{\ell-1}(G_{\ell-1}^t)$. See page 85 for the definition of the “surrounding band”.

Once the new grid $\hat G_\ell^t$, which verifies $\Omega_\ell(\hat G_\ell^t) \subseteq \Omega_{\ell-1}(G_{\ell-1}^t)$, has been computed, one then sets

$$\hat u_{\ell,j}^t = \begin{cases} I(u_{\ell-1}^t, x_j^\ell) & \text{if } j \in \hat G_\ell^t \setminus G_\ell^t, \\ u_{\ell,j}^t & \text{if } j \in G_\ell^t, \end{cases} \qquad (5.14)$$

i.e., the value at the $j$-th cell is interpolated from data at the next coarser level for cells not in $G_\ell^t$. The refined grid is therefore defined by $(\hat G_\ell^t, \hat u_\ell^t)$. Discrete boundary conditions are also applied if the grid overlaps the domain boundary.

To complete the description of the adaptation process, it remains to decide when a given grid has to be adapted in order to prevent discontinuities from moving from a fine to a coarse grid. Since the flow evolves in time, a discontinuity can move from one cell to another during a time step. At this point we observe that the CFL condition, which is imposed for the numerical method to be stable, guarantees that, during a time interval $[t, t + \Delta t_\ell]$, the information about the properties of the fluid can move, at most, a distance $\Delta x_\ell$ around its initial location. This fact can be seen by looking at the CFL condition (cf. (5.1) and (5.2))

$$\max_u |f'(u)| \le \frac{\Delta x_\ell}{\Delta t_\ell},$$

which indicates that the maximum characteristic speed of the equation, given by $\max_u |f'(u)|$, is bounded by the “grid speed” $\frac{\Delta x_\ell}{\Delta t_\ell}$. The maximum distance that the information about the properties of the fluid can move during a time step $\Delta t_\ell$ is thus given by $\max_u |f'(u)| \cdot \Delta t_\ell \le \Delta x_\ell$. In Fig. 5.8 two lines with slopes $\frac{\Delta t_\ell}{\Delta x_\ell}$ and $-\frac{\Delta t_\ell}{\Delta x_\ell}$ are shown on a space-time grid of spatial size $\Delta x_\ell$ and temporal size $\Delta t_\ell$. These lines represent information propagating with speeds $\frac{\Delta x_\ell}{\Delta t_\ell}$ and $-\frac{\Delta x_\ell}{\Delta t_\ell}$, respectively, and determine the maximum distance that the information about the properties of the fluid can move during a time interval, so that data at a point $x$ can only affect the solution on the interval $[x - \Delta x_\ell, x + \Delta x_\ell]$ in the time interval $[t, t + \Delta t_\ell]$. This suggests that a band of width at least one cell has to be added to any grid in order to prevent discontinuities from escaping the grid during one time step.
Therefore, a strategy for the adaptation of the grids would consist of adapting the grid after each integration, adding the required extra cells to the adapted grid. In practice, a band of width equal to one

Figure 5.8: An illustration of the movement of the information of the characteristics of the fluid. The solid lines with slopes ∆tℓ/∆xℓ and −∆tℓ/∆xℓ determine the maximum velocities. The lines corresponding to information propagating with characteristic speeds (dashed lines, with slopes equal to the reciprocal of a characteristic speed) necessarily fall in the region determined by them. (Space-time diagram covering [x − ∆xℓ, x + ∆xℓ] × [t, t + ∆tℓ].)

Function update(G: gridlist, ℓ: integer)
    integrate(Gℓ)
    if (ℓ < L − 1)
        for k = 1 until 2
            update(G, ℓ + 1)
        end for
        project(Gℓ+1)
        adapt(Gℓ+1)
    end if
end function

Figure 5.9: A recursive algorithm, that includes projection, integration and adaptation, to update level ℓ and finer of a grid hierarchy G = {G0, . . . , GL−1}, for one time step of the grid Gℓ.

cell of the coarser grid (i.e., two cells of the grid Gℓ ) is added instead, allowing for the refinement of the grids after each integration of the coarser grid. Performing the refinement in this fashion has the advantage that, at the moment when the grid is to be refined, data from the coarse grid is available for the same time, thus enabling the procedure described above for the computation of the numerical solution for the adapted grid, as described in formula (5.14). The algorithm in Fig. 5.7, modified to include the adaptation process, is shown in Fig. 5.9, and the sequence of integrations, adaptations and projections for the example used in Fig. 5.6 is shown in Fig. 5.10. The algorithm in Fig. 5.9 is recursive, as is the nature of the sequence of integrations, adaptations and projections, and most implementations use recursion, even if it is not natively supported in some programming languages, like FORTRAN 77 [5] (see e.g. [139]). The algorithm can be, however, implemented in a sequential form as shown in Fig. 5.11.
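The flagging criteria (5.12) and (5.13), the buffer band, the completion of sibling pairs and the filling rule (5.14) can be combined as in the following sketch. It rests on several assumptions that are ours and not part of the method as implemented: the interpolation operator I is taken to be linear interpolation between coarse cell centers with constant extension at the boundary, the coarse level is assumed to cover the whole domain, and the names `interp_coarse`, `adapt` and `band` are hypothetical.

```python
import numpy as np


def interp_coarse(u_coarse, j):
    """Linear interpolation of coarse cell-centered data at the fine center x_j.

    The fine center lies a quarter of a coarse cell away from the center of
    its parent i = j // 2, towards i-1 (j even) or i+1 (j odd).
    """
    i = j // 2
    nb = i - 1 if j % 2 == 0 else i + 1
    nb = min(max(nb, 0), len(u_coarse) - 1)   # constant extension (assumption)
    return 0.75 * u_coarse[i] + 0.25 * u_coarse[nb]


def adapt(grid, u_fine, u_coarse, dx_fine, tau_p, tau_g, band=2):
    """Return the adapted fine grid (sorted indices) and its values.

    grid is a set of fine indices, u_fine a dict index -> value.
    """
    n_fine = 2 * len(u_coarse)
    # Criterion (5.12): interpolation error on cells already in the grid.
    flagged = {j for j in grid
               if abs(u_fine[j] - interp_coarse(u_coarse, j)) > tau_p}
    # Criterion (5.13): large discrete gradient on the coarser grid
    # (divided by the fine spacing, following the formula as printed).
    jumps = np.abs(np.diff(u_coarse))
    for i in range(len(u_coarse)):
        left = jumps[i - 1] if i > 0 else 0.0
        right = jumps[i] if i < len(jumps) else 0.0
        if max(left, right) / dx_fine > tau_g:
            flagged |= {2 * i, 2 * i + 1}
    # Buffer band of `band` fine cells on each side of every flagged cell.
    buffered = {j + k for j in flagged for k in range(-band, band + 1)}
    buffered = {j for j in buffered if 0 <= j < n_fine}
    # Complete sibling pairs so that the grid is a dyadic subdivision.
    new_grid = sorted({2 * (j // 2) + s for j in buffered for s in (0, 1)})
    # Rule (5.14): keep existing values, interpolate the new ones.
    new_u = {j: (u_fine[j] if j in grid else interp_coarse(u_coarse, j))
             for j in new_grid}
    return new_grid, new_u
```

As a usage example, a step profile on eight coarse cells with an empty fine grid, `adapt(set(), {}, np.array([1., 1., 1., 1., 0., 0., 0., 0.]), 1/16, 0.1, 8.0)`, creates a fine grid around the jump through the gradient criterion alone.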

5.6 Grid interpolation

In the description of the AMR algorithm we have highlighted the fact that, at some stages of the integration process, interpolation from coarse to fine grid is required. We describe here the interpolants used in such cases.

Step   Grid       Action      Time step   Time
1      G0         Integrate   ∆t0         t + ∆t0
2      G1         Integrate   ∆t0/2       t + ∆t0/2
3      G2         Integrate   ∆t0/4       t + ∆t0/4
4      G2         Integrate   ∆t0/4       t + ∆t0/2
5      G2 → G1    Project     −           t + ∆t0/2
6      G2         Adapt       −           t + ∆t0/2
7      G1         Integrate   ∆t0/2       t + ∆t0
8      G2         Integrate   ∆t0/4       t + 3∆t0/4
9      G2         Integrate   ∆t0/4       t + ∆t0
10     G2 → G1    Project     −           t + ∆t0
11     G2         Adapt       −           t + ∆t0
12     G1 → G0    Project     −           t + ∆t0
13     G1         Adapt       −           t + ∆t0

Figure 5.10: Sample sequence of integrations, projections and adaptations for a three-level grid hierarchy

Function update(G)
    integrate(G0)
    sℓ = 0 for all ℓ, 1 ≤ ℓ ≤ L − 1
    ℓ = 1
    while 1 ≤ ℓ < L do
        integrate(Gℓ)
        sℓ = sℓ + 1
        while ℓ = L − 1 and sℓ < 2 do
            integrate(Gℓ)
            sℓ = sℓ + 1
        end while
        if sℓ < 2 or ℓ < L − 1 then
            ℓ = ℓ + 1
        else
            while ℓ ≥ 1 and sℓ = 2 do
                project(Gℓ)
                adapt(Gℓ)
                sℓ = 0
                ℓ = ℓ − 1
            end while
        end if
    end while
end function

Figure 5.11: An iterative algorithm to update a grid hierarchy.
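To see that a loop-based traversal can reproduce the recursive one, the following sketch records the sequence of actions produced by the recursion of Fig. 5.9 and by an iterative variant written in the spirit of Fig. 5.11. The grid operations are replaced by log entries, and the function names and the counter array `steps` are our own; the iterative version merges the two inner loops of the figure into a single test, so it is an illustration rather than a transcription.

```python
def update_recursive(level, num_levels, log):
    """Action sequence of the recursive algorithm of Fig. 5.9."""
    log.append(("integrate", level))
    if level < num_levels - 1:
        for _ in range(2):
            update_recursive(level + 1, num_levels, log)
        log.append(("project", level + 1))
        log.append(("adapt", level + 1))


def update_iterative(num_levels):
    """Same action sequence produced without recursion."""
    log = [("integrate", 0)]
    steps = [0] * num_levels   # sub-steps completed by each level in its current pair
    level = 1
    while 1 <= level < num_levels:
        log.append(("integrate", level))
        steps[level] += 1
        if level < num_levels - 1:
            level += 1         # descend: the finer grid must catch up first
        else:
            # Climb back while the current level has completed its two steps.
            while level >= 1 and steps[level] == 2:
                log.append(("project", level))
                log.append(("adapt", level))
                steps[level] = 0
                level -= 1
    return log
```

For a three-level hierarchy both functions return the thirteen actions listed in Fig. 5.10, in the same order.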

Let us recall the integration algorithm shown in Fig. 5.5 when applied to a grid hierarchy $G = \{G_0, G_1\}$ composed of two refinement levels. Initially both grids correspond to the same time $t$ and we assume that all the required data for the integration is available for all grids, including the ghost cells that compose their respective “surrounding bands”. The first step is to integrate the grid $G_0$ from time $t$ up to time $t + \Delta t_0$. Then the grid $G_1$ is integrated up to time $t + \Delta t_1 = t + \frac{\Delta t_0}{2}$. At this point we notice that data corresponding to the ghost cells that compose the “surrounding band” of $G_1$ is required to perform the next integration for the grid $G_1$. It can happen that the ghost cells overlap the boundary of the computational domain, and in that case a numerical solution for them is computed from the boundary conditions. Otherwise, in order to obtain a numerical solution for the ghost cells corresponding to time $t + \Delta t_1$ one has two possibilities:

• Perform an integration from time $t$ to time $t + \Delta t_1$ of the cells in $G_0$ required to compute data on the ghost cells of $G_1$.
• Interpolate in time the numerical solutions corresponding to the grid $G_0$ for times $t$ and $t + \Delta t_0$.

The first option should produce a more accurate approximation to the data that we aim to compute, but the second option is much cheaper in terms of computational cost. These two approaches give exactly the same result if the time integration procedure is the forward Euler method, but are different in general (and in the particular case of the third order Runge-Kutta method used in this work). The fact that the ghost cells are away from discontinuities allows us to use the second approach, i.e., time interpolation between the known solutions at times $t$ and $t + \Delta t_0$. Since we only know these two approximations in the grid $G_0$, the only choice is linear interpolation, which amounts to computing

$$u_{0,i}^{t+\Delta t_1} = \frac{u_{0,i}^{t} + u_{0,i}^{t+\Delta t_0}}{2}$$

for some cells $i \in G_0$. The actual nodes where this interpolation is performed depend on the spatial interpolation to be applied on them. This provides data at time $t + \Delta t_1$ for some nodes in the grid $G_0$. This data is next interpolated in space to obtain values for the ghost nodes of the grid $G_1$. In particular, if linear interpolation in space is used, data is required in the nodes of $G_0$ whose associated cells overlap ghost cells of $G_1$ and in one extra coarse node around them. If a third order Lagrange

interpolant is used instead, we need to perform time interpolation for the coarse cells that overlap ghost nodes of $G_1$ and for a “band” of two coarse nodes around them. An illustration of this process is depicted in Fig. 5.12, for third order interpolation in space.

We see from the above example that there are two situations where a grid needs data that may not be available for its refinement level. In the first case, a fine grid needs to fill its ghost nodes with data, and a coarse grid corresponding to the same time exists (Fig. 5.12(f)). In this case a spatial interpolation is applied. The second case corresponds to the situation where no coarse grid for the same time exists, and therefore interpolation between two coarse grids, corresponding to different times, has to be performed prior to the spatial interpolation (Fig. 5.12(c) - 5.12(d)). A more general case is depicted in Fig. 5.13, where three grids are used. In that example space-time interpolation is required for the finest grid $G_2$ after steps (d) and (g), and for the intermediate grid $G_1$ after step (c). Interpolation in space only is required for the grid $G_2$ after steps (e) and (h) and for the grid $G_1$ after step (f).
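The two-stage filling of a fine ghost cell (linear interpolation in time between two coarse solutions, followed by interpolation in space) can be sketched as follows. For brevity the sketch uses linear interpolation in space rather than the third order interpolant mentioned above, it assumes that the coarse neighbor needed by the stencil exists, and the function name `ghost_value` and the parameter `theta` are hypothetical.

```python
import numpy as np


def ghost_value(u_old, u_new, theta, j):
    """Value for the fine ghost cell j at time t + theta * dt0.

    u_old, u_new: coarse solutions at times t and t + dt0.
    theta:        fraction of the coarse step (0.5 for t + dt1, 1.0 for t + dt0).
    """
    # Time interpolation between the two known coarse solutions.
    u_mid = (1.0 - theta) * u_old + theta * u_new
    # Space interpolation: the fine center lies a quarter of a coarse cell
    # away from its parent center, towards the left (j even) or right (j odd).
    i = j // 2
    nb = i - 1 if j % 2 == 0 else i + 1
    return 0.75 * u_mid[i] + 0.25 * u_mid[nb]
```

With `theta = 1.0` the time interpolation reduces to the coarse solution at time t + ∆t0, which is the case where only spatial interpolation is needed. For data that is linear in both x and t the procedure is exact, which gives a simple consistency check.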

5.6.1 Grid interpolation and Runge-Kutta time integration

Most semi-discrete methods use a multi-stage time integration method, such as the three-stage Runge-Kutta method (3.15) used in this work. Let us recall the algorithm, which reads as follows:

$$U^{(1)} = U^n - \Delta t\, D(U^n),$$
$$U^{(2)} = \frac34 U^n + \frac14 U^{(1)} - \frac14 \Delta t\, D(U^{(1)}),$$
$$U^{n+1} = \frac13 U^n + \frac23 U^{(2)} - \frac23 \Delta t\, D(U^{(2)}). \qquad (5.15)$$

Note that the values corresponding to a stage of the Runge-Kutta algorithm are obtained by means of a combination of the values corresponding to the previous stages and a spatial operator, that we have denoted by $D$, applied on the values of the previous stage. The operator $D$ is, in our case, the divided difference of the numerical fluxes corresponding to each cell, these fluxes being computed according to the algorithm described in chapter 4. The algorithm can be thus written, for a single node $x_j$, as

$$U_j^{(1)} = U_j^n - \frac{\Delta t}{\Delta x}\left(\hat f(U^n)_{j+\frac12} - \hat f(U^n)_{j-\frac12}\right),$$
$$U_j^{(2)} = \frac34 U_j^n + \frac14 U_j^{(1)} - \frac14 \frac{\Delta t}{\Delta x}\left(\hat f(U^{(1)})_{j+\frac12} - \hat f(U^{(1)})_{j-\frac12}\right),$$
$$U_j^{n+1} = \frac13 U_j^n + \frac23 U_j^{(2)} - \frac23 \frac{\Delta t}{\Delta x}\left(\hat f(U^{(2)})_{j+\frac12} - \hat f(U^{(2)})_{j-\frac12}\right). \qquad (5.16)$$

Figure 5.12: Illustration of the data interpolation process. The coarse nodes are depicted with circles, and the fine nodes with squares. The X signs indicate ghost nodes of the fine grid that do not have a numerical solution. Panels (time levels t, t + ∆t1 and t + ∆t0 on the vertical axis):
(a) The coarse grid is integrated and exists at times t and t + ∆t0.
(b) The fine grid is integrated up to time t + ∆t1.
(c) Linear interpolation in time is performed on some coarse nodes.
(d) Interpolation in space is performed to fill the ghost nodes of the fine grid with a numerical solution at time t + ∆t1.
(e) The fine grid is integrated up to time t + ∆t0.
(f) Interpolation in space is performed to fill the ghost nodes of the fine grid with a numerical solution at time t + ∆t0.

During each intermediate step of the Runge-Kutta integration the numerical fluxes are computed, therefore the ghost nodes need to be filled with data. The steps of the Runge-Kutta algorithm are sequentially applied to each grid separately, so we consider a single grid, which contains the numerical solution for a given time $t$. Assume that the ghost nodes contain a valid numerical solution, obtained by spatial interpolation from a coarser grid or from the boundary conditions. Let us analyze each step of the algorithm in order to provide a way to compute data for the ghost cells. We mainly need to identify each intermediate stage of the Runge-Kutta algorithm with a precise time, in order to be able to perform the space-time interpolation process described above.

Assume that $U^n$ is the numerical solution at a given time $t$. The intermediate result $U^{(1)}$ naturally corresponds to time $t + \Delta t$, since it is nothing but the value that results from an iteration of the forward Euler method corresponding to a time step $\Delta t$. The intermediate values $U^{(2)}$ can be interpreted as values that correspond to time $t + \frac{\Delta t}{2}$. This can be seen, for example, by writing

$$U^{(2)} = \frac34 U^n + \frac14 U^*, \qquad (5.17)$$

where $U^* = U^{(1)} - \Delta t\, D(U^{(1)})$. $U^*$ results from advancing $U^{(1)}$ in time by a step $\Delta t$ using the forward Euler method, so $U^*$ can be interpreted as a solution that corresponds to time $t + 2\Delta t$. Then, (5.17) is the result of applying linear interpolation in time to the pairs $(t, U^n)$ and $(t + 2\Delta t, U^*)$, for the time $t + \frac{\Delta t}{2}$. The final value $U^{n+1}$ corresponds of course to time $t + \Delta t$. As a remark, note that the same procedure used for $U^{(2)}$, if applied to $U^{n+1}$, shows that it can be considered as an interpolated value at time $t + \Delta t$ using the pairs $(t, U^n)$ and $(t + \frac32 \Delta t, U^{(2)} - \Delta t\, D(U^{(2)}))$.
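The identification of each Runge-Kutta stage with a time level can be checked on a scalar test problem. The sketch below applies the three stages of (5.15) to the equation u' = −D(u) with the hypothetical choice D(u) = u, whose exact solution is exp(−t), and compares each stage with the exact solution at the time assigned to it above. The function name `rk3_stages` is ours.

```python
import math


def rk3_stages(u, dt, D):
    """The three stages of the Runge-Kutta method (5.15) for u' = -D(u)."""
    u1 = u - dt * D(u)
    u2 = 0.75 * u + 0.25 * u1 - 0.25 * dt * D(u1)
    u_next = u / 3 + 2 / 3 * u2 - 2 / 3 * dt * D(u2)
    return u1, u2, u_next


def D(u):
    """Linear test operator (assumption): u' = -u, exact solution exp(-t)."""
    return u


dt = 0.1
u1, u2, u_next = rk3_stages(1.0, dt, D)

# Identity (5.17): the second stage is a linear interpolation between
# U^n (time t) and U* = U1 - dt * D(U1) (time t + 2 dt), evaluated at t + dt/2.
u_star = u1 - dt * D(u1)
identity_gap = abs(u2 - (0.75 * 1.0 + 0.25 * u_star))

# Distance of each stage to the exact solution at the assigned time.
err_stage1 = abs(u1 - math.exp(-dt))        # U1 is associated with t + dt
err_stage2 = abs(u2 - math.exp(-dt / 2))    # U2 is associated with t + dt/2
err_final = abs(u_next - math.exp(-dt))     # U^{n+1} is associated with t + dt
```

In this example the second stage is much closer to the exact solution at t + ∆t/2 than to the one at t + ∆t, which is consistent with the interpretation given above.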

Figure 5.13: A graphical representation of the sequence of integrations of Fig. 5.6. Panels: (a) Initial state, (b) Step 1, (c) Step 2, (d) Step 3, (e) Step 4, (f) Step 5, (g) Step 6, (h) Step 7. In each panel the horizontal axis shows the grids G0, G1, G2 and the vertical axis the times t, t + ∆t0/4, t + ∆t0/2, t + 3∆t0/4 and t + ∆t0.

6 Implementation and parallelization of the algorithm

In this chapter we describe the AMR algorithm in the form used in our 2D implementation. For the sake of clarity, we have followed up to now a very simple approach, based on one-dimensional examples with equally spaced points. The algorithm can be, however, applied to much more general grids, as is described in appendix A. In this chapter we focus on our implementation of the algorithm in 2D, that assumes a particular organization of the grids which, in turn, imposes some requirements and allows some simplifications in some auxiliary procedures. We describe the data structures used in the program, and the main routines implemented for the different parts of the algorithm in a generic form, using pseudocode, to make the explanation more understandable than the actual implementation, written in ANSI C, which contains a lot of auxiliary code that is not essential for the comprehension of the algorithms.

The chapter is divided into two sections, where we respectively explain the sequential and parallel implementation of the algorithm. Albeit our actual code is parallel (the sequential version is exactly the parallel code running on a single processor), we prefer to split the explanation in two parts, delaying the concerns of the parallelization to a separate section, and focus initially on what would be a pure sequential implementation. In section 6.1 we explain how we have implemented the major parts of the algorithm. In the actual, parallel code, each of these parts runs at each processor almost independently from the others, except for some data transfers between processors, and it will be more understandable to add these parts after the implementation of each building block has been described. Along with other parallelization issues, this is the goal of section 6.2.

6.1 Sequential implementation

Managing grids composed of isolated points is complicated. Processes such as flow integration, grid interpolation, projection of the solution, etc. would need to process information related to the grid to decide, for example, whether a point is surrounded by points of the same resolution, or where a point lies. To manage that kind of information, additional data structures containing it can be created and updated dynamically in the code, but this is impractical from the computational point of view, because substantial computational time and memory are needed to process and dynamically update the required information for each point. Moreover, the particular type of numerical method used in this work for flow integration, which is based on Shu-Osher's approach, relies on dimensional splitting, which is well suited for structured, Cartesian grids, but not for other kinds of grid.

For Cartesian grids, a common approach for Adaptive Mesh Refinement is to organize the grids into rectangular patches, rather than considering the nodes in isolation. This approach implies that some nodes that, in principle, do not belong to a grid are included in it in order to compose the rectangular patches, thus requiring more integrations. However, the savings in computational time due to the lack of decision-making mechanisms in the code, and the simplification of the final implementation, justify this approach. The number of redundant nodes included in the patches can, moreover, be controlled by the user, allowing a balance between redundancy and efficiency. This approach, with slight variants, has been adopted in many implementations for many different applications, see e.g. [132, 137, 192, 193].

In the rest of the section we describe our actual implementation of the algorithm, focusing on the 2D version. We start by describing the organization of the grid hierarchies and associated numerical solutions, and we later analyze the implementation of the main processes involved in the algorithm, namely adaptation, projection and integration.

Although our numerical method for the flow integration, and hence our AMR implementation, is based on nodal values rather than cell-average values, in our description we will mix cells and nodes in order to clarify the concepts explained. Some ideas regarding grid nestedness, flux projection or grid adaptation, for example, can be explained more clearly with this approach.

6.1.1 Hierarchical grid system

Our implementation of the AMR algorithm uses a hierarchical grid system composed of a set of Cartesian coarse mesh patches, which constitutes level 0 of the hierarchy and defines the computational domain. These patches can be refined locally by defining finer mesh patches that form level 1 of the hierarchy. The finer patches are obtained by the subdivision of groups of coarse cells that have been marked for refinement. This process can be repeated to obtain even finer patches at level 2. Grids of the desired resolution can be obtained by iterating this process. We will use the term grid to refer to the set of mesh patches that belong to the same refinement level, while the terms mesh, patch and mesh patch will be used interchangeably to refer to a single rectangular patch.

Let us consider the problem:

$$
\begin{aligned}
u_t(x, y, t) + f(u(x, y, t))_x + g(u(x, y, t))_y &= 0, && (x, y, t) \in \Omega \times [0, T], \\
u(x, y, 0) &= u_0(x, y), && (x, y) \in \Omega,
\end{aligned}
\tag{6.1}
$$

where for simplicity we assume $\Omega = [0, 1]^2$. In order to define a uniform Cartesian discretization of $\Omega$ we proceed similarly to the one-dimensional case, described in sections 3.1 and 5.2. We take positive integer numbers $N_0^x$ and $N_0^y$ and we define the grid sizes by $\Delta x_0 = \frac{1}{N_0^x}$ and $\Delta y_0 = \frac{1}{N_0^y}$. This defines a discretization given by the points

$$
(x_i^0, y_j^0) = \left( \left(i + \tfrac{1}{2}\right) \Delta x_0, \left(j + \tfrac{1}{2}\right) \Delta y_0 \right), \quad 0 \le i < N_0^x, \quad 0 \le j < N_0^y.
\tag{6.2}
$$

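Each point $(x_i^0, y_j^0)$ defines a cell $c^0_{i,j}$ by

$$
c^0_{i,j} = \left[ x_i^0 - \frac{\Delta x_0}{2}, x_i^0 + \frac{\Delta x_0}{2} \right] \times \left[ y_j^0 - \frac{\Delta y_0}{2}, y_j^0 + \frac{\Delta y_0}{2} \right].
\tag{6.3}
$$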

The coarsest grid in our grid hierarchy, denoted by $G_0$, is defined as a set of $K_0$ mesh patches, denoted by $\{G_{0,k} : 1 \le k \le K_0\}$, where each patch is composed of a subset of $\{c^0_{i,j} : 0 \le i < N_0^x, 0 \le j < N_0^y\}$, with the following restrictions:

• The extent of each patch, defined by
$$
\Omega(G_{0,k}) = \bigcup_{c^0_{i,j} \in G_{0,k}} c^0_{i,j},
$$
is a rectangle.

• Two patches can overlap only on their boundaries, i.e., $\mathring{\Omega}(G_{0,k_1}) \cap \mathring{\Omega}(G_{0,k_2}) = \emptyset$ if $k_1 \ne k_2$, where $\mathring{\Omega}(G_{0,k})$ denotes the interior of the set $\Omega(G_{0,k})$.

• The union of all the coarse mesh patches covers the computational domain:
$$
\bigcup_{k=1}^{K_0} \Omega(G_{0,k}) = [0, 1]^2.
$$

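In order to construct a grid hierarchy composed of $L$ levels of refinement we take integer numbers $r_\ell^x$ and $r_\ell^y$ such that $r_\ell^x, r_\ell^y \ge 2$ for $\ell = 0, \dots, L - 2$ and we define a new discretization on $\Omega$ based on the points defined by

$$
(x_i^\ell, y_j^\ell) = \left( \left(i + \tfrac{1}{2}\right) \Delta x_\ell, \left(j + \tfrac{1}{2}\right) \Delta y_\ell \right), \quad 0 \le i < N_\ell^x, \quad 0 \le j < N_\ell^y,
\tag{6.4}
$$

where, for $1 \le \ell \le L - 1$, we have defined:

$$
\begin{aligned}
N_\ell^x &= r_{\ell-1}^x N_{\ell-1}^x = \prod_{m=0}^{\ell-1} r_m^x \, N_0^x, &
N_\ell^y &= r_{\ell-1}^y N_{\ell-1}^y = \prod_{m=0}^{\ell-1} r_m^y \, N_0^y, \\
\Delta x_\ell &= \frac{\Delta x_{\ell-1}}{r_{\ell-1}^x} = \frac{\Delta x_0}{\prod_{m=0}^{\ell-1} r_m^x}, &
\Delta y_\ell &= \frac{\Delta y_{\ell-1}}{r_{\ell-1}^y} = \frac{\Delta y_0}{\prod_{m=0}^{\ell-1} r_m^y}.
\end{aligned}
$$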
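The recurrences above translate directly into a short loop. The following C sketch is not taken from the thesis code: the function name and the argument layout are assumptions made for illustration. It fills, for one coordinate direction, the number of nodes $N_\ell$ and the mesh size $\Delta x_\ell$ of every level from $N_0$ and the refinement factors, assuming the unit interval as domain.

```c
/* Fills n[l] (number of nodes) and h[l] (mesh size) for the levels
 * l = 0, ..., num_levels - 1 of one coordinate direction.
 * rf[l] is the refinement factor between the levels l and l + 1, so
 * rf must hold num_levels - 1 entries. The domain is assumed to be [0, 1]. */
void level_sizes(int num_levels, int n0, const int *rf, int *n, double *h)
{
    int l;
    n[0] = n0;
    h[0] = 1.0 / n0;
    for (l = 1; l < num_levels; l++) {
        n[l] = rf[l - 1] * n[l - 1];   /* N_l = r_{l-1} N_{l-1} */
        h[l] = h[l - 1] / rf[l - 1];   /* dx_l = dx_{l-1} / r_{l-1} */
    }
}
```

For instance, a three-level hierarchy with $N_0 = 10$ and refinement factors 2 and 3 has 10, 20 and 60 nodes per direction. The same routine can be called once with the $x$ data and once with the $y$ data.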

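We will denote by $x^\ell_{i,j}$ the node $(x_i^\ell, y_j^\ell)$. From the nodes in (6.4) we define the corresponding cells analogously to (6.3):

$$
c^\ell_{i,j} = \left[ x_i^\ell - \frac{\Delta x_\ell}{2}, x_i^\ell + \frac{\Delta x_\ell}{2} \right] \times \left[ y_j^\ell - \frac{\Delta y_\ell}{2}, y_j^\ell + \frac{\Delta y_\ell}{2} \right].
\tag{6.5}
$$

A subgrid $G_\ell$, defined on $\Omega$ for the refinement level $\ell$, given by $\Delta x_\ell$ and $\Delta y_\ell$, is defined as a set of $K_\ell$ mesh patches, denoted by $\{G_{\ell,k} : 1 \le k \le K_\ell\}$, where:

• Each mesh patch is a subset of $\{c^\ell_{i,j} : 0 \le i < N_\ell^x, 0 \le j < N_\ell^y\}$,

• $\Omega(G_{\ell,k})$ is a rectangle for all $k = 1, \dots, K_\ell$,

• $\mathring{\Omega}(G_{\ell,k_1}) \cap \mathring{\Omega}(G_{\ell,k_2}) = \emptyset$ if $k_1 \ne k_2$ (the patches can only overlap at their boundaries),

• $\Omega(G_{\ell,k}) \subseteq \bigcup_{k'=1}^{K_{\ell-1}} \Omega(G_{\ell-1,k'})$ (the grid at level $\ell$ is contained in the immediately coarser grid),

• if $c^\ell_{i,j} \in G_{\ell,k}$ is such that $\mathring{c}^\ell_{i,j} \cap \mathring{c}^{\ell-1}_{p,q} \ne \emptyset$ for some $c^{\ell-1}_{p,q} \in G_{\ell-1}$, then $c^{\ell-1}_{p,q} \subseteq \Omega(G_{\ell,k})$ (the grids are obtained by subdivision of cells in the coarser grid).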

An example of such a construction is depicted in Fig. 6.1, for a grid hierarchy of three levels, with all refinement factors equal to 2.

Figure 6.1: A sample three-level AMR grid hierarchy, with grids $G_0$, $G_1$ and $G_2$. All refinement factors are set to the value 2.

As indicated in sections 3.1 and 5.6, we augment each patch with a set of ghost cells that form a band surrounding the patch, required to integrate it. We define the pad of the patch as this set of cells. When required, we will explicitly indicate that a cell or node does not belong to the pad by referring to it as an interior cell or node. If $p$ is the width of the pad at each side of the patch, then the ghost cells have indices $i = -p, \dots, -1$ and $i = N^x_{\ell,k}, \dots, N^x_{\ell,k} + p - 1$ for the $x$ direction, and $j = -p, \dots, -1$ and $j = N^y_{\ell,k}, \dots, N^y_{\ell,k} + p - 1$ for the $y$ direction, where $N^x_{\ell,k}$ and $N^y_{\ell,k}$ denote the dimensions of the patch under consideration. A set of $2p(N^x_{\ell,k} + N^y_{\ell,k}) + 4p^2$ ghost cells is therefore assigned to the patch.

We can assign to each point $x^\ell_{i,j}$ a single global index $q$, defined by $q = j N_\ell^x + i$. Note that, from an index $q \in \{0, \dots, N_\ell^x \cdot N_\ell^y - 1\}$, the indices for each component can be recovered by $i = q \bmod N_\ell^x$ and $j = [q / N_\ell^x]$, where $[\cdot]$ indicates the integer part. We define the position of a patch as the indices of the node located at the bottom-left corner of the patch. A patch is therefore fully determined by its position and dimension. If a point with global indices $(i, j)$ belongs to a single patch $G_{\ell,k}$, whose position is given by $(i_{\ell,k}, j_{\ell,k})$ and whose dimensions are $N^x_{\ell,k}$ and $N^y_{\ell,k}$, indices relative to the patch can also be assigned to the point $x^\ell_{i,j}$ by

$$
\tilde{i} = i - i_{\ell,k}, \qquad \tilde{j} = j - j_{\ell,k},
$$

and a single index, relative to the patch, can be assigned by $\tilde{q} = \tilde{i} + \tilde{j} N^x_{\ell,k}$.

In our implementation a grid hierarchy is stored using a structure named gridlist, which essentially contains:

• The number of refinement levels.
• The dimension of the coarsest (fixed) grid.
• The refinement factors.
• A list of pointers to patch lists, one per level.

A simplified version of our gridlist structure is shown in Fig. 6.2.

```c
typedef struct gridlist {
    int num_levels;  /**< Number of levels */
    int **dim;       /**< Dimension of each level for a fixed grid */
    int **rf;        /**< Refinement factors for each level and direction */
    patch **base;    /**< List of pointers to patch lists, one per level */
} gridlist;
```

Figure 6.2: A code excerpt to define the gridlist structure.

The patch list for each level has been implemented using a linked list, each element being a structure containing:

• The dimension of the patch.
• The position of the patch, relative to a fixed grid.
• The pad in each direction.
• A vector to store the values of the numerical solution.
• A vector containing the physical fluxes.
• A vector containing the numerical fluxes. This data is necessary in order to perform the transfer of solution process.

A simplified version of our actual definition of the structure patch is shown in Fig. 6.3. The vectors containing the conserved variables and the physical and numerical fluxes store the data corresponding to the patch and to the nodes located in its surrounding band, defined by the pad. In order to deal with these two different kinds of data we define some variables that help in their management. In particular we define two pointers for the conserved variables, one of them pointing to the first value including the surrounding band and another pointing to the first value inside the patch (see Fig. 6.4).

```c
typedef struct patch {
    int dim[2];          /**< Dimension of the patch */
    int pos[2];          /**< Position of the patch */
    int pad[2];          /**< Pads of the patch */
    REAL *d0;            /**< Conserved variables, no shift */
    REAL *d;             /**< Conserved variables */
    REAL *hatf;          /**< Numerical fluxes. */
    REAL *f;             /**< Equation fluxes */
    struct patch *next;  /**< Next patch in a patch list */
} patch;
```

Figure 6.3: A code excerpt to define the patch structure. The data type REAL is a redefinition for float or double.

Figure 6.4: Sample patch with dim = {7, 7} and pad = {3, 3}, showing the positions referenced by the pointers d0 and d. Dashed cells correspond to ghost nodes.
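The index conventions of this section can be summarized in a few helper functions. The sketch below is only illustrative and is not part of the thesis code: the type patch_geom is a hypothetical reduction of the patch structure to its geometric fields, and the padded row-major storage assumed in padded_offset is an assumption about the memory layout, which the text does not specify.

```c
/* Hypothetical reduction of the patch structure to its geometry. */
typedef struct patch_geom {
    int dim[2];   /* number of interior nodes in x and y */
    int pos[2];   /* global indices of the bottom-left interior node */
    int pad[2];   /* width of the ghost band in x and y */
} patch_geom;

/* Global index q = j * Nx + i on a level with nx nodes per row. */
int global_index(int i, int j, int nx)
{
    return j * nx + i;
}

/* Patch-relative index of the node with global indices (i, j). */
int local_index(const patch_geom *p, int i, int j)
{
    int il = i - p->pos[0];
    int jl = j - p->pos[1];
    return il + jl * p->dim[0];
}

/* Offset of the interior node (il, jl) in a padded row-major array
 * (assumed layout, one value per node). padded_offset(p, 0, 0) is then
 * the shift between the first padded value and the first interior value. */
int padded_offset(const patch_geom *p, int il, int jl)
{
    int stride = p->dim[0] + 2 * p->pad[0];
    return (jl + p->pad[1]) * stride + (il + p->pad[0]);
}

/* Number of ghost cells: padded size minus interior size. */
int ghost_count(const patch_geom *p)
{
    int nx = p->dim[0], ny = p->dim[1];
    int px = p->pad[0], py = p->pad[1];
    return (nx + 2 * px) * (ny + 2 * py) - nx * ny;
}
```

For the sample patch of Fig. 6.4 (dim = {7, 7}, pad = {3, 3}), ghost_count returns 120, in agreement with $2p(N^x_{\ell,k} + N^y_{\ell,k}) + 4p^2 = 6 \cdot 14 + 36$.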


6.1.2 The adaptation process

Given a grid $G_\ell$, the adaptation process obtains a set of mesh patches that will compose the adapted grid $\tilde{G}_\ell$. This adaptation procedure is composed of three major processes: first, a flagging process decides which cells have to be included in the refined grid; second, a clustering procedure groups the selected cells into Cartesian patches; finally, the newly created mesh patches are filled with a solution. The only restriction is the nestedness property $G_\ell \subseteq G_{\ell-1}$.

Marking cells for refinement. For the selection of the cells that need refinement we use the approach described in section 5.5: we combine a criterion based on marking the cells that cannot be predicted with enough accuracy from coarse data with a gradient sensor that can detect the formation of shock waves from smooth data, allowing the refinement of the grids before the shock forms. We use high thresholds in this sensor in order to avoid the detection of rapidly varying smooth data as discontinuities. The adaptation is forced to generate grids from subdivision of coarse cells. For the gradient sensor we use the natural 2D extension of (5.13): if $u^{\ell-1}_{i,j}$ denotes the numerical solution at the node with indices $(i, j)$ of the grid $G_{\ell-1}$, then we mark for refinement (flag) the cells that result from the subdivision of the cell $(i, j)$ if

$$
\max\left( \frac{\max\left( |u^{\ell-1}_{i+1,j} - u^{\ell-1}_{i,j}|, |u^{\ell-1}_{i,j} - u^{\ell-1}_{i-1,j}| \right)}{\Delta x_\ell}, \frac{\max\left( |u^{\ell-1}_{i,j+1} - u^{\ell-1}_{i,j}|, |u^{\ell-1}_{i,j} - u^{\ell-1}_{i,j-1}| \right)}{\Delta y_\ell} \right)
\tag{6.6}
$$

is above a prescribed tolerance. For the identification of the cells that cannot be correctly predicted from the coarse grid we use again an extension of the procedure indicated in section 5.5: for each node $x^\ell_{i,j}$ of a patch we compute an approximate value interpolated in space using the operator $I$ given by the tensor product extension of the 1D interpolation explained in section 5.6, and we select it for refinement if

$$
\left| u^\ell_{i,j} - I(u^{\ell-1}, x^\ell_{i,j}) \right| > \tau_p,
\tag{6.7}
$$

with $\tau_p > 0$.
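A gradient sensor of the type (6.6) can be sketched in C as follows. This is a minimal illustration for a scalar field, under the assumption that the differences enter in absolute value; the function names, the row-major storage and the restriction to interior nodes are choices made here, not details taken from the thesis code.

```c
static double abs_d(double a) { return a < 0.0 ? -a : a; }
static double max_d(double a, double b) { return a > b ? a : b; }

/* Value of a sensor of the form (6.6) at the node (i, j) of a scalar field u
 * stored row-major with nx nodes per row. hx and hy are the mesh sizes used
 * in the denominators. Requires 1 <= i <= nx - 2 and at least one row above
 * and below the row j. */
double gradient_sensor(const double *u, int nx, int i, int j,
                       double hx, double hy)
{
    double c  = u[j * nx + i];
    double gx = max_d(abs_d(u[j * nx + i + 1] - c),
                      abs_d(c - u[j * nx + i - 1])) / hx;
    double gy = max_d(abs_d(u[(j + 1) * nx + i] - c),
                      abs_d(c - u[(j - 1) * nx + i])) / hy;
    return max_d(gx, gy);
}

/* Returns 1 if the cell (i, j) has to be flagged, 0 otherwise. */
int needs_refinement(const double *u, int nx, int i, int j,
                     double hx, double hy, double tol)
{
    return gradient_sensor(u, nx, i, j, hx, hy) > tol;
}
```

For a system of equations the same test could be applied to one or several components of the solution; which components are monitored is problem dependent.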


Figure 6.5: An example of the addition of safety flags. The original marked cells are indicated with white circles, and the safety flags with black circles.

Safety flags. Once the coarse grid has been flagged we add a certain number of safety flags to ensure that the cells adjacent to a singularity are refined. The safety flags prevent singularities from escaping from the fine grid during one coarse time step. Fig. 6.5 shows an example of the addition of a band of one safety flag to an already flagged patch. Another example can be seen in Figs. 6.7(a) and 6.7(b). Another criterion for adding safety flags is dictated by the need to interpolate ghost cell values from relatively smooth regions: the length of the stencil of the interpolation operator must be less than twice the number of safety flags. In our case we use third order linear interpolation, and this imposes the addition of 2 safety flags. For analogous reasons, if the computation of the numerical flux depends on $2n$ values of the fine grid, then, in order to ensure that it is computed using non-interpolated data, the number of safety flags has to be greater than $\frac{n}{2}$. In the case of the method used in this work, described in chapter 4, we have $n = 3$, and thus the number of safety flags added should be at least 2. According to the criteria above, we add 2 safety flags in our implementation.
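Adding a band of safety flags amounts to dilating the set of flagged cells. The following C sketch illustrates the idea on a single array of flags; the function name and the flag storage (one byte per cell, row-major) are assumptions. For simplicity it discards the flags that would fall outside the array, whereas in the setting described here such flags land in the pad of the patch and are handled by the synchronization between patches.

```c
#include <stdlib.h>
#include <string.h>

/* Marks every cell whose distance (in the maximum norm) to an originally
 * flagged cell is at most `width`. flags holds nx * ny bytes, row-major.
 * Returns 0 on success and -1 if the temporary copy cannot be allocated. */
int add_safety_flags(unsigned char *flags, int nx, int ny, int width)
{
    unsigned char *orig = malloc((size_t)nx * (size_t)ny);
    int i, j, a, b;

    if (orig == NULL)
        return -1;
    /* Work from a copy so that newly added flags are not dilated again. */
    memcpy(orig, flags, (size_t)nx * (size_t)ny);

    for (j = 0; j < ny; j++)
        for (i = 0; i < nx; i++) {
            if (!orig[j * nx + i])
                continue;
            for (b = j - width; b <= j + width; b++)
                for (a = i - width; a <= i + width; a++)
                    if (a >= 0 && a < nx && b >= 0 && b < ny)
                        flags[b * nx + a] = 1;
        }

    free(orig);
    return 0;
}
```

A single flagged cell in the interior of the array produces a block of $(2 \cdot \mathtt{width} + 1)^2$ flagged cells, which matches the band of one safety flag shown in Fig. 6.5 when width is 1.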


Figure 6.6: Synchronization of marked cells between patches of the same level. A marked cell in the patch at the left (solid light gray) induces three marked cells in the patch at the right due to the safety flags (dashed dark gray). A band of one safety flag is added (all dashed cells).

Synchronization of marked cells. It may happen that the addition of safety flags to a flagged patch produces marked cells that are located in the pad of the patch. If another patch is adjacent or sufficiently near the first one (depending on the width of the pad and the number of safety cells added), some of the marked cells in the pad may correspond to interior cells of the other patch. In this case, a mechanism to ensure that the cell will also be marked in the second patch is applied after every patch has been flagged, and before the procedure for grouping the marked cells into rectangular patches, described next. An example where this situation appears is shown in Fig. 6.6.

Grouping cells into rectangular patches. Once the desired coarse cells have been flagged for refinement, we group them into rectangular patches at the next refinement level. These patches contain every flagged cell and possibly some non-flagged cells. We process every patch independently from the others, grouping the cells that are marked and belong to that patch. A simple and effective clustering procedure consists of the following steps: first, the minimum Cartesian patch that contains all the flagged cells is found. If the ratio between the number of flagged cells and the total number of cells in the patch is above a prescribed tolerance, then the patch is accepted and the process ends for the current patch. Otherwise the patch is subdivided into smaller sub-patches and the same criterion is applied to each sub-patch. The procedure continues until a patch has been assigned to each marked cell. The procedure is then applied to the next patch, until every patch has been clustered. A new grid is then generated using the accepted rectangular patches.

For computational efficiency reasons we have implemented some procedures to avoid the formation of mesh patches that have a very high aspect ratio, or patches that are very small. More precisely, for each patch, we first check if it fulfills the requirement

$$
d_p := \frac{\text{Number of marked cells}}{\text{Total number of cells}} > t_c,
\tag{6.8}
$$

where $0 < t_c \le 1$ is a tolerance, or if the size (width · height) of the patch does not exceed a specified patch size $s_p$; in either case it is accepted "as is". Otherwise it is sent to the subroutine that subdivides the patch. This subroutine divides the patch along the directions whose length is bigger than a specified length $s_l$. As a result, the patch can be divided in four parts (if it can be divided in both directions), or in two (if the division can only be done in one direction). We sketch the process in the code shown in Fig. 6.8, and we show an example in Fig. 6.7.

In that example we start with a patch of size $16 \times 16$ that is to be refined. We fix $s_p = 4$, $s_l = 2$ and $t_c = 0.9$. Assume that the result of the flagging procedure is the one shown in Fig. 6.7(a), where the marked cells are indicated with a white circle. For each marked cell we mark a band of one cell around it, in order to cover the region where the information contained in the previously marked cells can move during one time step. The added cells are marked with black circles, as shown in Fig. 6.7(b). The result contains 85 marked cells out of the 256 cells of the patch, i.e., $d_p \approx 0.33 < t_c$, so it is not accepted. We divide it into four parts (Fig. 6.7(c)) and we crop each of these parts so that we only consider the minimum rectangle that contains marked cells (Fig. 6.7(d)). The cells that are discarded because of the cropping of the patches are shown in light gray. None of the four patches fulfills the requirement (6.8). Their respective values for $d_p$ are, from left to right and from top to bottom, $\frac{16}{48} \approx 0.33$, $\frac{18}{56} \approx 0.32$, $\frac{32}{64} = 0.5$ and $\frac{19}{48} \approx 0.40$. A new division is therefore performed, as shown in Fig. 6.7(e). After cropping, some patches are accepted (indicated in dark gray) because they do not contain unmarked cells (Fig. 6.7(f)). The four sub-patches that are not accepted have values for $d_p$ (from left to right and from top to bottom) equal to $\frac{7}{8} = 0.875$, $\frac{6}{8} = 0.75$, $\frac{9}{16} = 0.5625$ and $\frac{14}{16} = 0.875$, which are below the specified threshold $t_c = 0.9$. Fig. 6.7(g) shows the result of dividing these sub-patches and cropping them. Note that the two smaller patches have been divided only in two parts because one of their dimensions is equal to 2, which is not bigger than $s_l$. The sub-patches in Fig. 6.7(g) are all accepted (some of them verify $d_p = 1 > t_c$, and the size of all of them is not bigger than $s_p = 4$). The final refined patch is shown in Fig. 6.7(h).
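The clustering procedure just described can be sketched as a short recursive routine. The C code below is an illustrative reconstruction from the description in the text, not the thesis implementation: the types rect and cluster_params, the flag storage and the way accepted patches are returned are all assumptions. It crops the current rectangle to its flagged cells, accepts it if the density or size criteria hold or if it cannot be divided, and otherwise splits it in halves along every direction longer than $s_l$.

```c
typedef struct rect { int i0, j0, w, h; } rect;   /* corner and size */

typedef struct cluster_params {
    double tc;   /* density tolerance, 0 < tc <= 1 */
    int sp;      /* rectangles with w * h <= sp are accepted */
    int sl;      /* a direction is divided only if its length is > sl (sl >= 1) */
} cluster_params;

static int count_flags(const unsigned char *f, int nx, rect r)
{
    int i, j, cnt = 0;
    for (j = r.j0; j < r.j0 + r.h; j++)
        for (i = r.i0; i < r.i0 + r.w; i++)
            cnt += (f[j * nx + i] != 0);
    return cnt;
}

/* Smallest rectangle inside r containing all its flagged cells.
 * r must contain at least one flagged cell. */
static rect crop(const unsigned char *f, int nx, rect r)
{
    int i, j;
    int imin = r.i0 + r.w, imax = r.i0 - 1;
    int jmin = r.j0 + r.h, jmax = r.j0 - 1;
    for (j = r.j0; j < r.j0 + r.h; j++)
        for (i = r.i0; i < r.i0 + r.w; i++)
            if (f[j * nx + i]) {
                if (i < imin) imin = i;
                if (i > imax) imax = i;
                if (j < jmin) jmin = j;
                if (j > jmax) jmax = j;
            }
    r.i0 = imin; r.w = imax - imin + 1;
    r.j0 = jmin; r.h = jmax - jmin + 1;
    return r;
}

/* Appends the accepted rectangles to out (capacity cap) and counts them in *n. */
void cluster(const unsigned char *f, int nx, rect r, const cluster_params *cp,
             rect *out, int cap, int *n)
{
    int cnt = count_flags(f, nx, r);
    int sx, sy, a, b;

    if (cnt == 0)
        return;                       /* nothing to refine here */
    r = crop(f, nx, r);
    sx = r.w > cp->sl;                /* can be divided along x? */
    sy = r.h > cp->sl;                /* can be divided along y? */
    if ((double)cnt / (r.w * r.h) >= cp->tc || r.w * r.h <= cp->sp
        || (!sx && !sy)) {
        if (*n < cap)
            out[*n] = r;
        (*n)++;
        return;
    }
    for (b = 0; b <= sy; b++)
        for (a = 0; a <= sx; a++) {
            rect q;
            q.i0 = r.i0 + (a ? r.w / 2 : 0);
            q.w  = sx ? (a ? r.w - r.w / 2 : r.w / 2) : r.w;
            q.j0 = r.j0 + (b ? r.h / 2 : 0);
            q.h  = sy ? (b ? r.h - r.h / 2 : r.h / 2) : r.h;
            cluster(f, nx, q, cp, out, cap, n);
        }
}
```

As an example of use, an $8 \times 8$ array with two flagged $2 \times 2$ blocks in opposite corners, processed with $t_c = 0.9$, $s_p = 4$ and $s_l = 2$, is split once and yields exactly those two blocks as accepted patches. Splitting in halves is one possible choice; other splitting strategies fit the same structure.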

Ensuring nestedness. Due to the organization of the AMR algorithm it can happen that, at some point in the algorithm, several grids, corresponding to different resolutions, have to be adapted. More precisely, the fact that a grid which is not the finest grid has to be adapted implies that all grids that are finer than it have to be adapted as well (cf. Section 5.5). In this situation we perform the adaptation from the finest grid to the coarsest. This approach is motivated by the fact that some features that would need refinement at a certain resolution level might not be identified as such at coarser levels. Note that when a grid $G_\ell$ is adapted, the actual construction of the adapted grid $\tilde{G}_\ell$ is made using the coarser grid $G_{\ell-1}$ as well. If the adaptation is performed from coarse to fine, and the grid $G_{\ell-1}$ has been produced, in turn, by a previous adaptation, it can happen that it does not cover a region that needs to be included in the grid at level $\ell$, because the grid $G_{\ell-2}$, used to construct $G_{\ell-1}$, did not detect that the region needed adaptation, and therefore it was not included in $G_{\ell-1}$. Cells not belonging to $G_{\ell-1}$ are not considered for the construction of $G_\ell$ and, as a result, a loss of refinement can occur. The adaptation from fine to coarse prevents this, but requires an additional procedure to ensure that the grid hierarchy resulting from the adaptation of several levels is nested. This procedure amounts in practice to marking some cells to ensure that the adapted grid will contain the grid at the immediately finer level: if the grid $G_\ell$ to be adapted is not the finest grid, we mark the cells of $G_{\ell-1}$ that intersect the grid $G_{\ell+1}$, which was previously adapted.

Figure 6.7: An illustration of the clustering process. (a) Cells initially marked (white circles). (b) Cells marked because of the safety flags (black circles). (c) Division of the patch in four parts. (d) Resulting sub-patches after crop. Cells discarded are marked in light gray. (e) New division of the sub-patches. (f) Resulting sub-patches after crop. Accepted sub-patches are marked in dark gray. (g) New division of the unaccepted patches and crop. (h) Final result of the clustering.

```
Function cluster(p:patch, g:gridlist)
  cnt = number of flagged cells in p
  if cnt ≠ 0 then
    size = size (width · height) of the patch p
    dp = cnt/size
    if dp ≥ tc or size ≤ sp or (width ≤ sl and height ≤ sl) then
      accept_patch(p, g)
    else
      [q, npatches] = divide(p)
      for i = 1 until npatches
        q[i] = crop(q[i])
        cluster(q[i], g)
      end for
    end if
  end if
end function
```

Figure 6.8: A pseudo-code fragment corresponding to the cluster function.

Transfer of solution to the adapted grid. Each new fine grid obtained by the adaptation process needs to be filled with a numerical solution. This process has been described in section 5.5 (see page 98) for the 1D case, and essentially amounts to copying the solution from the grid that existed before adaptation to the adapted grid in the nodes where these grids overlap, and to computing interpolated values from a coarser grid where they do not overlap. Instead of checking, for each cell, whether it is contained in the unadapted grid, and then deciding if data has to be copied or interpolated from the coarser grid, in practice it is more efficient to perform both processes separately: first, as the nestedness property ensures that the new grid is wholly contained in the coarser grid, a numerical solution is interpolated from it. Second, the numerical solution is copied from the grid of the same level that existed before the adaptation process, in the regions in which both grids overlap, overwriting the interpolated solution. Finally, boundary conditions are applied wherever the patch boundary overlaps the domain boundary. A sketch of a function that performs these steps is shown in Fig. 6.9.

By merging together the procedures of marking cells for refinement, ensuring nestedness, clustering and transfer of solution to the adapted grid we obtain an adaptation algorithm with the structure shown in Fig. 6.10. A call to adapt($g_{initial}$, $g_{final}$, $l$) will produce, from the gridlist $g_{initial}$, another gridlist called $g_{final}$ which coincides with $g_{initial}$ at levels $0, \dots, l - 1$ and has adapted grids for levels $l, \dots, L - 1$. A numerical solution for the adapted grids is produced as well, using the transfer function.

```
Function transfer(g_initial:gridlist, g_final:gridlist, l:integer)
  for each patch p of level l in g_final
    Search overlaps O of p with patches in g_final of level l − 1
    for each overlap Oi in O
      interpolate_solution(Oi)
    end for
  end for
  for each patch p of level l in g_final
    Search overlaps P of p with patches in g_initial of level l
    for each overlap Pi in P
      copy_solution(Pi)
    end for
  end for
  for each patch p of level l in g_final
    if p intersects the domain boundary then
      put_boundary_conditions(p)
    end if
  end for
end function
```

Figure 6.9: A code fragment corresponding to the transfer function.
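The "search overlaps" steps of the transfer function reduce to intersecting index rectangles. The C sketch below shows one possible way of doing it; the box type and the function names are hypothetical and do not come from the thesis code. Patches of the same level are intersected directly, while a patch of level $l - 1$ is first expressed in the indices of level $l$ by multiplying its bounds by the refinement factors.

```c
/* Index box [lo[0], hi[0]) x [lo[1], hi[1]) on some level (hypothetical type). */
typedef struct box {
    int lo[2];
    int hi[2];
} box;

/* Box covered by the interior of a patch, from its position and dimension. */
box box_from_patch(const int pos[2], const int dim[2])
{
    box b;
    int d;
    for (d = 0; d < 2; d++) {
        b.lo[d] = pos[d];
        b.hi[d] = pos[d] + dim[d];
    }
    return b;
}

/* Expresses a box of level l - 1 in the indices of level l. */
box box_refine(box b, int rx, int ry)
{
    b.lo[0] *= rx; b.hi[0] *= rx;
    b.lo[1] *= ry; b.hi[1] *= ry;
    return b;
}

/* Returns 1 and stores the common region in *out if a and b share at least
 * one cell. Returns 0 otherwise, in which case *out is not meaningful. */
int box_intersect(const box *a, const box *b, box *out)
{
    int d;
    for (d = 0; d < 2; d++) {
        out->lo[d] = a->lo[d] > b->lo[d] ? a->lo[d] : b->lo[d];
        out->hi[d] = a->hi[d] < b->hi[d] ? a->hi[d] : b->hi[d];
        if (out->lo[d] >= out->hi[d])
            return 0;
    }
    return 1;
}
```

Two patches that only touch along an edge share no cell, so box_intersect returns 0 for them. To detect overlaps between the pad of one patch and the interior of another, the first box can be enlarged by the pad width before intersecting.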

```
Function adapt(g_initial:gridlist, g_final:gridlist, l:integer)
  L = number of levels in the gridlists
  for ℓ = L − 1 until l
    for each patch p of level ℓ in g_initial
      flags = flag_patch(p)
      if ℓ < L − 1
        flags = flags ∪ flag_from_finer(p)
      end if
      flags = flags ∪ put_safety_flags(p)
      crop(p, flags)
      cluster(p, g_final)
    end for
    transfer(g_initial, g_final, ℓ)
  end for
end function
```

Figure 6.10: A code fragment corresponding to the adapt function.


6.1.3 Integration algorithm

The ghost cell approach allows each patch to be integrated almost separately. When a grid $G_\ell$ is to be integrated, we call a procedure that sweeps over the patches in $G_\ell$ and integrates each of them. The integration of each patch essentially amounts to sequentially calling a function to compute the numerical fluxes corresponding to the current solution and functions to perform the time integration, in our case the third order Runge-Kutta algorithm (3.14). To complete the algorithm, a mechanism for the computation of a numerical solution for the ghost nodes is also required, and is applied before each Runge-Kutta step. Note that this process amounts to computing an interpolated solution from the coarse to the fine grid, and is performed independently for each patch. In addition, a procedure for the synchronization of the intermediate solutions between the different patches of the grid is applied before each Runge-Kutta step, in order to ensure that overlapping nodes have the same solution. This synchronization is the only process within the integration algorithm that requires data interchange between patches. The structure of the integration algorithm is shown in Fig. 6.11.

The computation of the numerical divergence, represented by the function compute_numerical_divergence, is performed by considering separately the fluxes in the $x$ and $y$ directions, corresponding to the functions $f$ and $g$ in (6.1), which are given by formulas analogous to (4.12). The computation of the numerical fluxes is described in detail in chapter 4, and the complete integration algorithm that is used for the update of each patch is described in section 4.4. In 2D the operator $D$ that represents the numerical divergence is simply given by the sum of the 1D numerical divergences in each Cartesian direction:

$$
D(U)_{i,j} = \frac{\hat{f}_{i+\frac{1}{2},j}(U) - \hat{f}_{i-\frac{1}{2},j}(U)}{\Delta x} + \frac{\hat{g}_{i,j+\frac{1}{2}}(U) - \hat{g}_{i,j-\frac{1}{2}}(U)}{\Delta y}.
$$
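Once the numerical fluxes at the cell interfaces are available, the numerical divergence is a pair of flux differences per node. The following C sketch illustrates it for a scalar equation; the function name is chosen here for illustration and the array layout of the fluxes is an assumption, not the layout used in the thesis code.

```c
/* Numerical divergence on a patch with nx * ny interior nodes.
 * fhat has (nx + 1) * ny entries: fhat[j * (nx + 1) + i] is the numerical
 * flux at the interface (i - 1/2, j). ghat has nx * (ny + 1) entries:
 * ghat[j * nx + i] is the numerical flux at the interface (i, j - 1/2).
 * div receives nx * ny values, stored row-major. */
void numerical_divergence(const double *fhat, const double *ghat, double *div,
                          int nx, int ny, double hx, double hy)
{
    int i, j;
    for (j = 0; j < ny; j++) {
        for (i = 0; i < nx; i++) {
            double dfx = fhat[j * (nx + 1) + i + 1] - fhat[j * (nx + 1) + i];
            double dgy = ghat[(j + 1) * nx + i] - ghat[j * nx + i];
            div[j * nx + i] = dfx / hx + dgy / hy;
        }
    }
}
```

For a system of equations the same differences would be taken for every component of the flux vectors.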

After each step of the Runge-Kutta algorithm has been applied to a grid, we update the ghost cells as described in section 5.6, using a space-time interpolation algorithm applied to the coarser grids corresponding to the solutions before and after the Runge-Kutta step. The two patches are first interpolated in time, at a suitable time instant (which depends on the particular step of the Runge-Kutta algorithm), in some coarse cells. These are the cells needed to compute spatial interpolations in the required ghost cells of the fine grid. The data that results from the time interpolation is then interpolated in space using the tensor product extension of the 1D algorithm explained in section 5.6. This process is represented by the function interpolate_ghost_cells.

```
Function integrate(g:gridlist, l:integer)
  for step = 1 until 3
    for each patch p of level l in g
      compute_numerical_divergence(p)
      update_total_fluxes(p, step)
      perform_Runge_Kutta_step(p, step)
      interpolate_ghost_cells(p, step)
    end for
    synchronize_level(g, l)
    impose_boundary_conditions(g, l)
  end for
end function
```

Figure 6.11: A code fragment corresponding to the integrate function.

An example is depicted in Fig. 6.12. We show some cells where the interpolation is to be computed as squares in Fig. 6.12(a). The circles represent coarse nodes. We assume that the time interpolation between the coarse grids before and after a Runge-Kutta step, for the corresponding time instant, has already been performed. The interpolation in space is first performed in the direction of the $x$ coordinate, which provides approximations at the points that correspond to the fine cells in a 1D grid (cf. Fig. 5.12). The crosses in Fig. 6.12(b) correspond to those points. The same procedure, applied to the nodes indicated by crosses, in the direction of the $y$ coordinate produces approximations at the fine nodes of the 2D grid, as shown in Fig. 6.12(c).

Figure 6.12: An illustration of the 2D interpolation used for the update of the ghost cells. (a) A coarse grid (circles) and some fine nodes (squares) where interpolation in space has to be computed. (b) 1D interpolation in space in the x direction produces approximations in the points marked with crosses. (c) 1D interpolation in space in the y direction produces approximations in the points corresponding to the fine nodes (squares) of the 2D grid.

Once every patch has been updated for a Runge-Kutta step, we ensure that the data contained in the different patches is coherent. This operation is represented by the function synchronize_level in Fig. 6.11. On the one hand, we check if a ghost cell of one patch overlaps an internal cell of another patch, in which case we copy the solution from the internal cell to the ghost cell. Pseudocode for this function is shown in Fig. 6.13. On the other hand, we impose appropriate boundary conditions wherever needed (this process depends on the particular kind of boundary conditions, see section 3.7).

```
function synchronize_level(g:gridlist, l:integer)
  for each patch p of level l in g
    Search overlaps P of p with patches in g of level l
    for each overlap Pi in P
      copy_solution(Pi)
    end for
  end for
end function
```

Figure 6.13: Pseudocode for the function synchronize_level.

The time steps for the different grid levels are computed in a way similar to the 1D case (cf. (5.1)-(5.2)). We define, for $1 \le \ell \le L - 1$:

$$
\Delta t_\ell = \frac{\Delta t_{\ell-1}}{\max\{r^x_{\ell-1}, r^y_{\ell-1}\}},
\tag{6.9}
$$

and $\Delta t_0$ as

$$
\Delta t_0 = K \frac{\min\{\Delta x_0, \Delta y_0\}}{M},
\tag{6.10}
$$

where $0 < K < 1$ and $M$ is the maximum numerical characteristic speed of the equation (see appendix A, section A.4.3 for a description of the computation of this quantity).
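Formulas (6.9) and (6.10) can be evaluated with a short routine. The C sketch below is illustrative only; the function name and the argument list are assumptions, and the maximum characteristic speed $M$ is taken as an input because its computation is described elsewhere.

```c
/* Fills dt[l] for l = 0, ..., num_levels - 1 following (6.9) and (6.10).
 * hx0, hy0:   mesh sizes of the coarsest grid.
 * cfl:        the constant K, with 0 < K < 1.
 * max_speed:  the maximum numerical characteristic speed M (must be > 0).
 * rfx, rfy:   refinement factors between consecutive levels in x and y,
 *             num_levels - 1 entries each. */
void time_steps(int num_levels, double hx0, double hy0, double cfl,
                double max_speed, const int *rfx, const int *rfy, double *dt)
{
    int l;
    double hmin = hx0 < hy0 ? hx0 : hy0;

    dt[0] = cfl * hmin / max_speed;                   /* (6.10) */
    for (l = 1; l < num_levels; l++) {
        int r = rfx[l - 1] > rfy[l - 1] ? rfx[l - 1] : rfy[l - 1];
        dt[l] = dt[l - 1] / r;                        /* (6.9) */
    }
}
```

Dividing by the larger of the two refinement factors keeps the ratio between the time step and the smallest mesh size from growing when going from one level to the next.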

6.1.4 Flux projection

The need for a transfer of information from fine to coarse grids was motivated in Section 5.1. This transfer can be performed in two ways: by transferring nodal values, i.e., copying the solution from the fine nodes to the coarse nodes that correspond to the same points, or by transferring numerical fluxes from fine to coarse cell interfaces and updating the coarse solution according to the corrected fluxes. In this work we choose to project the numerical fluxes in order to ensure conservation between meshes, which would be lost if nodal values were projected. The projection of numerical fluxes in 1D has been described in detail in Sections 5.1 and 5.4. We describe here the extension to two dimensions.

During the integration process we store the numerical fluxes that will be used later for flux projection. The numerical fluxes are accumulated over the Runge-Kutta steps so that, at the end, the numerical fluxes corresponding to a full time step (i.e., the three steps of the Runge-Kutta scheme) are available for their projection onto the coarser grid. In the 1D case this corresponds to computing the values given by (3.17) for each coarse cell interface. This process is represented in Fig. 6.11 by the function update total fluxes. The 2D fluxes, equivalent to (3.17), are

\[
\begin{aligned}
\hat f^{RK3}(U^n) &= \tfrac{1}{6}\,\hat f(U^n) + \tfrac{1}{6}\,\hat f(U^{(1)}) + \tfrac{2}{3}\,\hat f(U^{(2)}),\\
\hat g^{RK3}(U^n) &= \tfrac{1}{6}\,\hat g(U^n) + \tfrac{1}{6}\,\hat g(U^{(1)}) + \tfrac{2}{3}\,\hat g(U^{(2)}).
\end{aligned}
\tag{6.11}
\]
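The weights 1/6, 1/6 and 2/3 in (6.11) can be checked on a toy problem. The sketch below is only an illustration under stated assumptions: it uses the third-order Shu-Osher Runge-Kutta scheme in its usual convex-combination form, a first-order upwind flux for $u_t + u_x = 0$ on a periodic 1D grid (not the high order fluxes of this work), and function names invented for the example. It returns the accumulated flux, so that the full step can be written as a single conservative update.

```python
def upwind_flux(u):
    """Flux at the right face of each cell for u_t + u_x = 0 (flux = u_i)."""
    return list(u)


def rhs(u, dx):
    """Spatial operator -(f_{i+1/2} - f_{i-1/2})/dx with periodic boundaries."""
    f = upwind_flux(u)
    return [-(f[i] - f[i - 1]) / dx for i in range(len(u))]


def rk3_step_with_total_flux(u, dt, dx):
    """One Shu-Osher RK3 step; also returns the accumulated flux of (6.11)."""
    f0 = upwind_flux(u)
    u1 = [a + dt * b for a, b in zip(u, rhs(u, dx))]
    f1 = upwind_flux(u1)
    u2 = [0.75 * a + 0.25 * (b + dt * c)
          for a, b, c in zip(u, u1, rhs(u1, dx))]
    f2 = upwind_flux(u2)
    u_new = [a / 3.0 + 2.0 / 3.0 * (b + dt * c)
             for a, b, c in zip(u, u2, rhs(u2, dx))]
    # accumulated flux: 1/6, 1/6 and 2/3 of the three stage fluxes
    f_total = [x / 6.0 + y / 6.0 + 2.0 * z / 3.0
               for x, y, z in zip(f0, f1, f2)]
    return u_new, f_total
```

For any initial data, `u_new[i]` coincides (up to rounding) with `u[i] - dt/dx * (f_total[i] - f_total[i-1])`, which is what makes the accumulated flux suitable for projection.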

A first note is that within Shu-Osher's flux-splitting formulation the numerical fluxes correspond to nodal values located on the boundary of the cells whose centers are the nodes where the solution is computed. In 1D, the grid points can be organized so that the cell interfaces of each grid coincide with cell interfaces of the finer grid. We have described this grid organization in Section 5.1, and an example was shown in Fig. 5.2. In our 2D discretization, the locations of the coarse numerical fluxes coincide with locations of fine numerical fluxes only if the refinement factor is an odd number. Otherwise, the numerical fluxes for the fine grid are computed at points that belong to the boundary of the coarse cell, but do not coincide with the locations of the coarse numerical fluxes. An example of the distribution of the nodes and the locations of the numerical fluxes for refinement factors equal to 2 and 3 is shown in Fig. 6.14. If the refinement factor is an even number, Lagrange interpolation in space is performed to compute an approximation to the coarse numerical flux at the corresponding point. If the refinement factor is equal to 2, the approximation is simply the average of the two fine values.

Figure 6.14: Relative location of the numerical fluxes for coarse and fine grid points for refinement factors of 2 (left) and 3 (right). Coarse nodes are indicated with solid circles and fine nodes with solid squares. The locations of the coarse fluxes are indicated with empty circles and the locations of the fine numerical fluxes with empty squares.

Let us explain in more detail the most common case, where the refinement factors are set to 2. Consider a coarse node $x^{\ell-1}_{i,j}$ given by

\[
x^{\ell-1}_{i,j} = \left( \left(i + \tfrac{1}{2}\right)\Delta x_{\ell-1},\ \left(j + \tfrac{1}{2}\right)\Delta y_{\ell-1} \right).
\]

Figure 6.15: Relative location of the points involved in the flux projection at one cell.

The numerical fluxes required to update this node are located at the points $x^{\ell-1}_{i+\frac12,j}$ and $x^{\ell-1}_{i-\frac12,j}$ for the horizontal fluxes, and at the points $x^{\ell-1}_{i,j+\frac12}$ and $x^{\ell-1}_{i,j-\frac12}$ for the vertical fluxes. The nodes on the fine grid whose numerical fluxes are computed on the same cell interfaces are the points $x^{\ell}_{2i+1,2j}$, $x^{\ell}_{2i+1,2j+1}$, $x^{\ell}_{2i,2j}$ and $x^{\ell}_{2i,2j+1}$. The relative location of these points is depicted in Fig. 6.15, where circles indicate coarse points and squares fine points; solid objects indicate nodes and empty objects indicate points where the numerical flux is computed.

As in the 1D case described in Section 5.4, let us denote by $u^t_\ell$ the numerical solution corresponding to time $t$ in the grid $G^t_\ell$, where $u^t_{\ell,i,j} \approx u(x^{\ell}_{i,j}, t)$. The computation of the solution $u^{t+2\Delta t_\ell}_{\ell,2i,2j}$ from $u^t_\ell$ is performed by means of two sequential integrations with time step $\Delta t_\ell$, analogous to (5.3), that can be summarized in the 2D version of (5.4) as:

\[
\begin{aligned}
u^{t+2\Delta t_\ell}_{\ell,2i,2j} = u^{t}_{\ell,2i,2j}
&- \frac{\Delta t_\ell}{\Delta x_\ell}\left( \hat f^{RK3,t}_{\ell,2i+\frac12,2j} + \hat f^{RK3,t+\Delta t_\ell}_{\ell,2i+\frac12,2j} - \hat f^{RK3,t}_{\ell,2i-\frac12,2j} - \hat f^{RK3,t+\Delta t_\ell}_{\ell,2i-\frac12,2j} \right)\\
&- \frac{\Delta t_\ell}{\Delta y_\ell}\left( \hat g^{RK3,t}_{\ell,2i,2j+\frac12} + \hat g^{RK3,t+\Delta t_\ell}_{\ell,2i,2j+\frac12} - \hat g^{RK3,t}_{\ell,2i,2j-\frac12} - \hat g^{RK3,t+\Delta t_\ell}_{\ell,2i,2j-\frac12} \right),
\end{aligned}
\]


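with analogous expressions for $u^{t+2\Delta t_\ell}_{\ell,2i+1,2j}$, $u^{t+2\Delta t_\ell}_{\ell,2i,2j+1}$ and $u^{t+2\Delta t_\ell}_{\ell,2i+1,2j+1}$, from which we deduce that

\[
\begin{aligned}
&\frac{u^{t+2\Delta t_\ell}_{\ell,2i,2j} + u^{t+2\Delta t_\ell}_{\ell,2i+1,2j} + u^{t+2\Delta t_\ell}_{\ell,2i,2j+1} + u^{t+2\Delta t_\ell}_{\ell,2i+1,2j+1}}{4}
= \frac{u^{t}_{\ell,2i,2j} + u^{t}_{\ell,2i+1,2j} + u^{t}_{\ell,2i,2j+1} + u^{t}_{\ell,2i+1,2j+1}}{4}\\
&\quad - \frac{\Delta t_\ell}{\Delta x_\ell}\left(
\frac{\hat f^{RK3,t}_{\ell,2i+\frac32,2j} + \hat f^{RK3,t+\Delta t_\ell}_{\ell,2i+\frac32,2j} + \hat f^{RK3,t}_{\ell,2i+\frac32,2j+1} + \hat f^{RK3,t+\Delta t_\ell}_{\ell,2i+\frac32,2j+1}}{4}
- \frac{\hat f^{RK3,t}_{\ell,2i-\frac12,2j} + \hat f^{RK3,t+\Delta t_\ell}_{\ell,2i-\frac12,2j} + \hat f^{RK3,t}_{\ell,2i-\frac12,2j+1} + \hat f^{RK3,t+\Delta t_\ell}_{\ell,2i-\frac12,2j+1}}{4}
\right)\\
&\quad - \frac{\Delta t_\ell}{\Delta y_\ell}\left(
\frac{\hat g^{RK3,t}_{\ell,2i,2j+\frac32} + \hat g^{RK3,t+\Delta t_\ell}_{\ell,2i,2j+\frac32} + \hat g^{RK3,t}_{\ell,2i+1,2j+\frac32} + \hat g^{RK3,t+\Delta t_\ell}_{\ell,2i+1,2j+\frac32}}{4}
- \frac{\hat g^{RK3,t}_{\ell,2i,2j-\frac12} + \hat g^{RK3,t+\Delta t_\ell}_{\ell,2i,2j-\frac12} + \hat g^{RK3,t}_{\ell,2i+1,2j-\frac12} + \hat g^{RK3,t+\Delta t_\ell}_{\ell,2i+1,2j-\frac12}}{4}
\right).
\end{aligned}
\tag{6.12}
\]

If we define now, for $-1 \le i \le N^x_\ell$ and $0 \le j \le N^y_\ell$,

\[
\hat{\hat f}^{\,t}_{\ell-1,i+\frac12,j} =
\frac{\hat f^{RK3,t}_{\ell,2i+\frac32,2j} + \hat f^{RK3,t+\Delta t_\ell}_{\ell,2i+\frac32,2j} + \hat f^{RK3,t}_{\ell,2i+\frac32,2j+1} + \hat f^{RK3,t+\Delta t_\ell}_{\ell,2i+\frac32,2j+1}}{4},
\tag{6.13}
\]

and for $0 \le i \le N^x_\ell$ and $-1 \le j \le N^y_\ell$,

\[
\hat{\hat g}^{\,t}_{\ell-1,i,j+\frac12} =
\frac{\hat g^{RK3,t}_{\ell,2i,2j+\frac32} + \hat g^{RK3,t+\Delta t_\ell}_{\ell,2i,2j+\frac32} + \hat g^{RK3,t}_{\ell,2i+1,2j+\frac32} + \hat g^{RK3,t+\Delta t_\ell}_{\ell,2i+1,2j+\frac32}}{4},
\tag{6.14}
\]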

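then (6.12) reads

\[
\begin{aligned}
&\frac{u^{t+2\Delta t_\ell}_{\ell,2i,2j} + u^{t+2\Delta t_\ell}_{\ell,2i+1,2j} + u^{t+2\Delta t_\ell}_{\ell,2i,2j+1} + u^{t+2\Delta t_\ell}_{\ell,2i+1,2j+1}}{4}
= \frac{u^{t}_{\ell,2i,2j} + u^{t}_{\ell,2i+1,2j} + u^{t}_{\ell,2i,2j+1} + u^{t}_{\ell,2i+1,2j+1}}{4}\\
&\quad - \frac{\Delta t_\ell}{\Delta x_\ell}\left( \hat{\hat f}^{\,t}_{\ell-1,i+\frac12,j} - \hat{\hat f}^{\,t}_{\ell-1,i-\frac12,j} \right)
- \frac{\Delta t_\ell}{\Delta y_\ell}\left( \hat{\hat g}^{\,t}_{\ell-1,i,j+\frac12} - \hat{\hat g}^{\,t}_{\ell-1,i,j-\frac12} \right).
\end{aligned}
\tag{6.15}
\]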

If we assume that for time $t$ the relation

\[
u^{t}_{\ell-1,i,j} = \frac{u^{t}_{\ell,2i,2j} + u^{t}_{\ell,2i+1,2j} + u^{t}_{\ell,2i,2j+1} + u^{t}_{\ell,2i+1,2j+1}}{4}
\tag{6.16}
\]


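holds, then, using that the ratios $\frac{\Delta t_\ell}{\Delta x_\ell}$ and $\frac{\Delta t_\ell}{\Delta y_\ell}$ do not depend on the resolution level $\ell$, from (6.15) we deduce that the same relation holds for time $t + \Delta t_{\ell-1}$:

\[
u^{t+\Delta t_{\ell-1}}_{\ell-1,i,j} = \frac{u^{t+\Delta t_{\ell-1}}_{\ell,2i,2j} + u^{t+\Delta t_{\ell-1}}_{\ell,2i+1,2j} + u^{t+\Delta t_{\ell-1}}_{\ell,2i,2j+1} + u^{t+\Delta t_{\ell-1}}_{\ell,2i+1,2j+1}}{4},
\tag{6.17}
\]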

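provided that the coarse fluxes are corrected according to:

\[
\begin{aligned}
\hat f^{\,t}_{\ell-1,i+\frac12,j} &= \hat{\hat f}^{\,t}_{\ell-1,i+\frac12,j}, & -1 \le i \le N^x_{\ell-1},\quad & 0 \le j \le N^y_{\ell-1} - 1,\\
\hat g^{\,t}_{\ell-1,i,j+\frac12} &= \hat{\hat g}^{\,t}_{\ell-1,i,j+\frac12}, & 0 \le i \le N^x_{\ell-1} - 1,\quad & -1 \le j \le N^y_{\ell-1}.
\end{aligned}
\tag{6.18}
\]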

Note that the values

\[
\frac{\hat f^{RK3,t}_{\ell,2i+\frac32,2j} + \hat f^{RK3,t+\Delta t_\ell}_{\ell,2i+\frac32,2j}}{2}
\qquad\text{and}\qquad
\frac{\hat f^{RK3,t}_{\ell,2i+\frac32,2j+1} + \hat f^{RK3,t+\Delta t_\ell}_{\ell,2i+\frac32,2j+1}}{2}
\]

are, respectively, the numerical fluxes at the points $x^{\ell}_{2i+\frac32,2j}$ and $x^{\ell}_{2i+\frac32,2j+1}$, corresponding to a time step $2\Delta t_\ell = \Delta t_{\ell-1}$. Their average is exactly the right-hand side of (6.13) and is the linear approximation of the numerical flux, for the same time step $\Delta t_{\ell-1}$, at the midpoint of the vertical line joining the two points, which is the point

\[
x^{\ell}_{2i+\frac32,2j+\frac12} = \big((2i+2)\Delta x_\ell,\ (2j+1)\Delta y_\ell\big) = \big((i+1)\Delta x_{\ell-1},\ (j+\tfrac12)\Delta y_{\ell-1}\big) = x^{\ell-1}_{i+\frac12,j}.
\]

The same analysis can be performed for $\hat{\hat g}^{\,t}_{\ell-1,i,j+\frac12}$ in (6.14). The values $\hat{\hat f}^{\,t}_{\ell-1,i+\frac12,j}$ and $\hat{\hat g}^{\,t}_{\ell-1,i,j+\frac12}$ are therefore approximations to the coarse numerical fluxes at their respective locations.

A pseudocode for the projection algorithm is shown in Fig. 6.16. The subroutine update coarse fluxes performs the operations represented by (6.18). The final algorithm that updates a gridlist is the same as described in Chapter 5, see Fig. 5.9 (the notations used in Chapters 5 and 6 are slightly different).


Function project(g:gridlist, l:integer)
  if l > 0
    for each patch p in g of level l
      Search overlaps O of p with patches in g of level l − 1
      for each overlap Oi in O
        update coarse fluxes(Oi)
      end for
    end for
  end if
end function

Figure 6.16: A pseudo-code fragment corresponding to the project function.
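The conservation argument of (6.12) to (6.18) is purely algebraic, so it can be checked with arbitrary flux values. The following minimal sketch, for refinement factor 2, does so. The array layout (fluxes stored at the left and bottom faces of each cell) and all function names are assumptions made for this example, and random numbers stand in for the actual numerical fluxes.

```python
import random


def coarse_fluxes_x(fa, fb, nx, ny):
    """(6.13): average over two fine steps and two fine faces.
    fa[m][q], fb[m][q]: flux through the left face of fine cell (m, q)."""
    return [[(fa[2*i][2*j] + fb[2*i][2*j] + fa[2*i][2*j+1] + fb[2*i][2*j+1]) / 4
             for j in range(ny)] for i in range(nx + 1)]


def coarse_fluxes_y(ga, gb, nx, ny):
    """(6.14): ga[p][n], gb[p][n]: flux through the bottom face of cell (p, n)."""
    return [[(ga[2*i][2*j] + gb[2*i][2*j] + ga[2*i+1][2*j] + gb[2*i+1][2*j]) / 4
             for j in range(ny + 1)] for i in range(nx)]


def update(u, fx, gy, lam_x, lam_y):
    """Conservative update with lam_x = dt/dx and lam_y = dt/dy."""
    return [[u[p][q] - lam_x * (fx[p+1][q] - fx[p][q])
             - lam_y * (gy[p][q+1] - gy[p][q])
             for q in range(len(u[0]))] for p in range(len(u))]


def restrict(u, nx, ny):
    """Average of the four fine cells covering each coarse cell, as in (6.16)."""
    return [[(u[2*i][2*j] + u[2*i+1][2*j] + u[2*i][2*j+1] + u[2*i+1][2*j+1]) / 4
             for j in range(ny)] for i in range(nx)]


def conservation_defect(nx=3, ny=2, lam_x=0.3, lam_y=0.2, seed=0):
    """Largest mismatch between the coarse update with projected fluxes
    and the average of the fine solution after two fine steps."""
    rng = random.Random(seed)

    def rand(rows, cols):
        return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

    u = rand(2*nx, 2*ny)
    fa, fb = rand(2*nx + 1, 2*ny), rand(2*nx + 1, 2*ny)
    ga, gb = rand(2*nx, 2*ny + 1), rand(2*nx, 2*ny + 1)
    fine = update(update(u, fa, ga, lam_x, lam_y), fb, gb, lam_x, lam_y)
    coarse = update(restrict(u, nx, ny), coarse_fluxes_x(fa, fb, nx, ny),
                    coarse_fluxes_y(ga, gb, nx, ny), lam_x, lam_y)
    target = restrict(fine, nx, ny)
    return max(abs(coarse[i][j] - target[i][j])
               for i in range(nx) for j in range(ny))
```

The same ratios `lam_x` and `lam_y` are used on both levels because, as noted above, $\Delta t_\ell/\Delta x_\ell$ and $\Delta t_\ell/\Delta y_\ell$ do not depend on the level. The returned defect is at the level of rounding errors.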

6.2 Parallel implementation

After the description of the implementation of our algorithm, we focus on the parallelization of the code. Parallelization is important if one aims to face grand challenge problems that, even with the help of high order methods and adaptation, are impossible to solve on a single machine, due to both the spatial (data storage) and temporal (computational cost) requirements. Parallel implementations offer the potential for computing accurate solutions of such complex problems, at the cost of facing new challenges in resource allocation, data distribution, load balancing, and data communication and synchronization. See [43] for a good discussion on data partitioning and load balancing in a more general context.

The typical approach in parallel implementations of AMR applications splits the grid hierarchy into different portions that are distributed among the available processors. Each processor acts on one or more portions separately, sharing information with other processors when required. To reduce communication, a first requirement is that if a patch is assigned to a certain processor, the finer patches that overlay it are assigned to the same processor. This leads to a partition strategy based on the coarsest grid, which is split into pieces that are assigned to the available processors, along with the parts of the finer grids contained in them.

The dynamic nature of the grids present in AMR applications, which leads to deep hierarchies and small regions with high refinement, makes it difficult to design a strategy for finding a suitable partition of the grid, which ideally should:

• balance the load between the processors and
• minimize data transfers between processors.

In our algorithm, data transfers are needed each time the ghost nodes of the patches have to be filled with data. This includes, in particular, communicating after each step of the Runge-Kutta algorithm, after adaptation, and after the process of correcting the numerical solution using projected fluxes.

Several authors have addressed this problem in the last fifteen years. The simplest load balancing techniques do not consider data locality, and simply redistribute the work load among the available processors, using the data partition given by the adaptation algorithm [141]. Other authors further subdivide the domain if a good load balance is not achieved, and try to balance the loads by means, for example, of data exchanges between processors whose loads are higher than the average and processors whose loads are lower than the average [153, 92]. Dynamic programming techniques have been used as well for finding the distribution that optimizes the work load balance [145]. Partitioners based on graphs [154] (and more recently hypergraphs [33]), like ParMetis [84] or Zoltan [152], are often used in codes using unstructured meshes or adaptive finite elements, but their higher computational cost makes their application to dynamic load balancing problems, where the workloads vary as time evolves, very limited.

Most of the partitioning techniques used in parallel AMR codes use space filling curves [151] to increase data locality [135, 31]. Space filling curves are defined as continuous, surjective functions from the unit interval $[0, 1]$ to the $d$-dimensional unit hypercube $[0, 1]^d$. In particular, a 2-dimensional space-filling curve is a continuous curve that passes through every point of the unit square $[0, 1]^2$. The extension to general hypercubes is trivial.
A space-filling curve is typically defined as the limit of a sequence of curves. If the target hypercube is partitioned uniformly in a certain way into $N^d$ patches (obtained by making $N$ divisions along each of the $d$ spatial dimensions, where $N$ depends on the particular curve under consideration), then there is a curve in the sequence that visits each element in the partition following a particular order. Therefore, the intermediate curves, whose limit is the space filling curve under consideration, can be seen as bijections from $\{1, \dots, N\}^d$ to $\{1, \dots, N^d\}$ that assign an index to each of the $N^d$ elements of the discretization. The interest of using space filling curves to obtain indexings of discretizations comes from the fact that these curves tend to assign close indices to close elements of the discretization, thus producing orderings with high data locality. If the assignment of patches to processors is done following the ordering of the space filling curve, each processor will likely host patches that are neighbors of other patches hosted by the same processor. Because each patch in the grid hierarchy has to communicate only with its neighbors, data locality can reduce the total amount of data communication required by the parallel algorithm.

In this work we use the Peano-Hilbert space filling curves [136, 69]. These curves are defined iteratively as follows for the two-dimensional case (see Fig. 6.17): in the first stage we divide the unit square into four squares of equal size and assign a sequential number to these squares clockwise, starting at the top right square and thus ending at the top left square. If we draw the line that joins the centers of the squares in the order indicated we obtain the construction depicted in Fig. 6.17(a). For the second iteration the four squares of the first iteration are divided in turn into four squares, and each of these four quarters is considered in isolation. In the two bottom quarters the construction of the previous iteration is repeated on a $\frac{1}{4}$ scale. In the top left quarter we repeat the same process, but the "figure" of the previous iteration is rotated by an angle of $\frac{\pi}{2}$ counterclockwise and scaled. Finally, in the top right quarter we repeat again the figure of the previous iteration, but rotated by $\frac{\pi}{2}$ clockwise and scaled. The result is depicted in Fig. 6.17(b) with black lines. These four constructions are then merged together. The constructions in the bottom quarters are connected to their horizontal and vertical neighbors by the shortest lines that join them.
These lines are depicted in gray in Fig. 6.17(b). We finally assign numbers to each cell following the same ordering as in the first step: starting at the top right cell and following the line, which ends at the top left cell. The curve corresponding to the third step is shown in Fig. 6.17(c), where the colors of the lines have the same meaning as in Fig. 6.17(b). The construction is the same as for the previous case, but acting on the figure of the second step. Recall that at the $k$-th step the unit square is divided into $4^k$ squares of equal size.

Figure 6.17: Construction of the Peano-Hilbert space filling curve. (a) Iteration 1; (b) Iteration 2; (c) Iteration 3. [Cell numbering shown in the panels, rows listed from top to bottom. (a): 4 1 / 3 2. (b): 16 15 2 1 / 13 14 3 4 / 12 9 8 5 / 11 10 7 6.]

The application of this space-filling curve to load balancing of parallel programs divides the computational domain into $4^k$ subdomains, for a suitable value of $k$, and assumes that a weight can be assigned to each subdomain. The weight represents the computational effort or workload required to process that subdomain. The subdomains are then ordered according to the space filling curve and the resulting list of weights is divided into a number of pieces equal to the number of available processors. This division determines which subdomains are assigned to each processor. The division of the list is performed trying to ensure that the total loads assigned to the processors are as balanced as possible. In our case, for the partition of the list of loads, which we denote by $l = \{l_i\}_{i=1}^{4^k}$, into $P$ parts, we proceed as follows: we start with the full list of loads and we compute the position $p_1$ in the list whose accumulated load is the closest to the average load, given by

\[
L_1 = \frac{1}{P} \sum_{i=1}^{4^k} l_i.
\]

A simple way of doing that is to look for the position $s_1$ in the list that verifies

\[
\sum_{i=1}^{s_1} l_i < L_1 \qquad\text{and}\qquad \sum_{i=1}^{s_1+1} l_i \ge L_1.
\]

The position $s_1$ is the one whose accumulated load is the closest to the average while being smaller than it. Then we take

\[
p_1 = \begin{cases}
s_1 & \text{if } L_1 - \sum_{i=1}^{s_1} l_i < \sum_{i=1}^{s_1+1} l_i - L_1,\\
s_1 + 1 & \text{otherwise.}
\end{cases}
\]

The pieces assigned to the first processor are the ones corresponding to the labels $\{1, \dots, p_1\}$ in Hilbert's order.
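The selection rule for $p_1$ can be sketched in a few lines. In the minimal sketch below the same rule is applied to every processor in turn, recomputing the target as the average load of the pieces not yet assigned and giving the last processor whatever is left. The function names and the guard that gives each processor at least one piece are assumptions of this example; the weights are those of the worked example of this section.

```python
def split_by_load(loads, nproc):
    """Return the 1-based cut positions p_1 < ... < p_P for an ordered list
    of loads; processor j receives the pieces p_{j-1}+1, ..., p_j."""
    cuts = []
    start = 0                            # number of pieces already assigned
    for j in range(1, nproc + 1):
        if j == nproc:                   # the last processor takes the rest
            cuts.append(len(loads))
            break
        remaining = loads[start:]
        target = sum(remaining) / (nproc - j + 1)   # average of what is left
        acc, s = 0.0, 0
        # s: number of leading pieces whose accumulated load stays below target
        while s < len(remaining) and acc + remaining[s] < target:
            acc += remaining[s]
            s += 1
        # take s+1 pieces unless stopping at s is strictly closer to the target
        if s < len(remaining) and not (target - acc < acc + remaining[s] - target):
            s += 1
        s = max(s, 1)   # guard added for this sketch: never assign zero pieces
        cuts.append(start + s)
        start += s
    return cuts


def loads_per_processor(loads, cuts):
    """Total load assigned to each processor for the given cut positions."""
    bounds = [0] + cuts
    return [sum(loads[a:b]) for a, b in zip(bounds, bounds[1:])]


EXAMPLE_LOADS = [10, 3, 20, 4, 8, 18, 20, 14, 17, 9, 3, 12, 9, 2, 4, 7]
```

For `EXAMPLE_LOADS` and four processors the cuts are `[4, 7, 10, 16]` and the assigned loads are `[37, 46, 40, 37]`. Degenerate cases, such as fewer pieces than processors, are deliberately not handled.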

Figure 6.18: An example for the load balancing algorithm. (a) Division of the domain into patches; numbers indicate weights assigned to each patch. (b) Load assignment; patches of the same gray intensity are assigned to the same processor. [Weights shown in both panels, rows listed from top to bottom: 7 4 3 10 / 9 2 20 4 / 12 17 14 8 / 3 9 20 18.]
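The ordering of the subdomains along the curve of Fig. 6.17 can be reproduced with the well-known iterative index-to-cell mapping of the Hilbert curve. The sketch below is a hedged illustration, not the implementation used in this work: the standard mapping starts at the bottom left cell and ends at the bottom right one, so it is rotated by 180 degrees to start at the top right cell and end at the top left one. The weight grid is read off Fig. 6.18(a), which is an assumption of this example.

```python
def _rot(s, x, y, rx, ry):
    """Rotate or flip the coordinates inside a quadrant of side s."""
    if ry == 0:
        if rx == 1:
            x, y = s - 1 - x, s - 1 - y
        x, y = y, x
    return x, y


def d2xy(n, d):
    """Cell (x, y) visited at 0-based position d on an n-by-n grid, with n a
    power of 2. Standard orientation: starts bottom left, ends bottom right."""
    x = y = 0
    t, s = d, 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        x, y = _rot(s, x, y, rx, ry)
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_weights(grid):
    """List the entries of grid (rows given from top to bottom) along the
    curve rotated by 180 degrees (start top right, end top left)."""
    n = len(grid)
    ordered = []
    for d in range(n * n):
        x, y = d2xy(n, d)
        # rotated cell is (n-1-x, n-1-y); its row counted from the top is y
        ordered.append(grid[y][n - 1 - x])
    return ordered


# Weights read off Fig. 6.18(a), rows from top to bottom (assumed layout).
WEIGHTS = [[7, 4, 3, 10],
           [9, 2, 20, 4],
           [12, 17, 14, 8],
           [3, 9, 20, 18]]
```

Applied to `WEIGHTS`, `hilbert_weights` returns the list 10, 3, 20, 4, 8, 18, 20, 14, 17, 9, 3, 12, 9, 2, 4, 7, i.e. the list of weights used in the example of this section.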

For the rest of the processors we repeat the same computation, but considering the average load of the pieces that have not been assigned so far:

\[
L_j = \frac{1}{P-j+1} \sum_{i=p_{j-1}+1}^{4^k} l_i, \qquad j = 2, \dots, P,
\]

and we look for the position $p_j$ whose accumulated load, counting only the loads of the unassigned pieces, is the closest to the average load $L_j$.

As an example, assume that the computational domain is $[0, 1]^2$ and we have $P = 4$ processors. We divide the domain into 16 parts, as in Fig. 6.17(b). Consider the weight assignment of Fig. 6.18(a). If we order the subdomains according to the Hilbert curve corresponding to $k = 2$, we get the following list of weights:

\[
l = \{10, 3, 20, 4, 8, 18, 20, 14, 17, 9, 3, 12, 9, 2, 4, 7\}.
\]

The total load is 160, giving an average load of $L_1 = 40$. Therefore, we get $s_1 = 4$ and we take $p_1 = 4$, because $L_1 - \sum_{i=1}^{4} l_i = 40 - 37 = 3$ and $\sum_{i=1}^{5} l_i - L_1 = 45 - 40 = 5 > 3$. For the second processor we have $L_2 = 41$ and we therefore get $p_2 = 7$. Repeating the process we obtain $p_3 = 10$ and $p_4 = 16$. This gives the load assignment shown in Fig. 6.18(b), where the pieces of the domain that would be assigned to each processor are depicted in different grayscale levels. The loads assigned to the processors are, respectively, 37, 46, 40 and 37.

In the case of the AMR algorithm, we have grids and grid patches of several sizes, according to the different resolutions of each grid and to the result of the adaptation algorithm. Our approach considers the dimensions of the coarsest grid and computes the number $k_{max}$ of divisions that can be performed in each direction so that each subdomain contains an integer number of coarse cells (i.e., we do not divide coarse cells). Then, we compute the minimum number $k_{min}$ of divisions that have to be done in each direction in order to obtain a number of pieces greater than or equal to the number of processors, so that each processor receives at least one piece. Then we assign to each piece the load that corresponds to every cell whose extent is contained in the given piece. The computation of these loads is described below. We start with $k = k_{min}$ and divide the domain into $4^{k_{min}}$ pieces. The assignment of pieces to each processor is computed following the algorithm based on space-filling curves described above. We accept or reject the load distribution using a simple check: if the difference between the load assigned to each processor, $P_j$, given by

\[
P_j = \sum_{i=p_{j-1}+1}^{p_j} l_i,
\]

and the corresponding average load $L_j$ is smaller than a given threshold for every $j$, we keep that division and distribute the work according to it. Otherwise, if $k < k_{max}$ we increase $k$ and repeat the process until a division that fulfills the requirement is achieved. If it is not possible to find a division that satisfies the required conditions, we increase the threshold and repeat the load balancing procedure. A pseudocode algorithm for load balancing is shown in Fig. 6.19.

The load assigned to each piece is basically determined by the total number of integrations needed to integrate the cells of the grid hierarchy whose extent is included in the given piece. In the simplest case, the cost for the integration of a given patch $h_j$ during a coarse time step is given by:

\[
W_j = \sum_{\ell=0}^{L-1} \mathrm{Card}(C_{\ell,j})\, 2^{\ell}, \tag{6.19}
\]


Function balance(g:gridlist, P:integer, threshold:real)
  Compute kmin and kmax
  do
    k = kmin
    do
      C = compute Hilbert ordering(k)
      l = compute cost list(C, g)
      assign costs to processors(l, P, threshold)
      k = k+1
    while (cost assignment is not acceptable and k ≤ kmax)
    increase threshold
  while (cost assignment is not acceptable)
end function

Figure 6.19: A pseudocode for the balance function.

where all refinement factors of the grid hierarchy are assumed to be equal to 2 and $C_{\ell,j}$ is the set of cells in the grid $G_\ell$ that belong to the patch $h_j$. A penalization cost for the communication can also be considered. One can, for example, add a load defined by an increasing function of the perimeter of the logical patches defined by the cells in $C_{\ell,j}$, thus favoring pieces whose ratio between area and perimeter is as high as possible.

As an example, assume that we have 4 processors at our disposal and consider that, at some stage, the working grids of the AMR algorithm are as depicted in Fig. 6.20(a). The thicker lines indicate patch boundaries, and the fine lines indicate cell boundaries. We observe that the grid hierarchy is composed of three levels and that the coarsest grid has 8 × 8 cells, thus allowing for three possible divisions according to the Hilbert curve. Following (6.19), we assign a cost of 1 to a coarse cell, and a cost of $2^\ell$ to each cell at level $\ell$. A coarse cell corresponds to four cells of the next refinement level and to 16 cells of the finest level. If a coarse cell is overlaid by cells of the next refinement level only, it has a cost of 1 + 2 · 4 = 9, and if it is refined up to the finest level it has a cost of 1 + 2 · 4 + 4 · 16 = 73. The costs for each patch are depicted in Fig. 6.20(b). We assume that the threshold for the difference between the assigned load and the average is set to 20.

The load balancing algorithm proceeds as follows: the domain is divided into four pieces, and the loads for each piece are computed, giving the values indicated in Fig. 6.20(c). The average load is $L_1 = 432$. For the first processor the algorithm compares the choices $p_1 = 1$, with a load of 16, and $p_1 = 2$, with a load of 928. Neither of them is admissible for the given threshold, and therefore a second subdivision is made, giving 16 pieces, as indicated in Fig. 6.20(d). This subdivision does not give a satisfactory balance either, and the next division, into 64 pieces, is performed (Fig. 6.20(e)). In this case the algorithm is able to balance the load according to the given threshold, and produces the load assignment shown in Fig. 6.20(e), where the pieces assigned to each processor are painted in different gray levels. The loads assigned to the four target processors are, respectively, 417, 438, 437 and 436, with a maximum difference between the assigned load and the corresponding average of 15. The load assignment, along with the initial grid, is depicted in Fig. 6.20(f).

In some cases the parallel performance of the algorithm is not as good as would be desirable, even if the costs are balanced, because some processors have cells belonging to levels with more refinement than other processors. A big patch of the coarsest level can have a load similar to that of a small patch that is overlapped by finer grids. As communication is needed after each step of the Runge-Kutta algorithm, the processors whose number of cells in the coarsest level is smaller will wait for the other processors to end their computation and, conversely, after the coarsest level has been integrated, a processor with no fine grids is idle while the others are integrating the fine grids. In practice we have observed that this phenomenon has a reduced impact, because as soon as a piece has high refinement its load grows exponentially, and if a good load balance is achieved all processors are likely to get a piece of the refined region. Moreover, including a term in the load computation that accounts for a per-level load balancing leads to divisions into a large number of pieces, which produces higher transmission costs.
Especially in distributed systems, the time for data transfers is much higher than the computational cost, and thus the potential gain of the per-level balancing does not produce a speedup of the algorithm.
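Returning to the cost model (6.19), the per-cell costs 1, 9 and 73 of the example and the loads of the initial four pieces can be reproduced with the short sketch below. The function names and the dictionary describing how many coarse cells of a piece are refined up to each level are hypothetical; the compositions used in the usage note are one choice consistent with the loads shown in Fig. 6.20(c), not data taken from the implementation.

```python
def cell_cost(finest_level, ratio=2):
    """Cost, following (6.19), of one coarse cell refined up to finest_level
    (0 means that the cell is not refined) in two dimensions."""
    # level l contributes (ratio**2)**l cells, each integrated ratio**l times
    return sum((ratio * ratio) ** l * ratio ** l
               for l in range(finest_level + 1))


def piece_load(cells_by_depth):
    """Load of a piece given {finest level: number of coarse cells}."""
    return sum(count * cell_cost(depth)
               for depth, count in cells_by_depth.items())
```

For instance, `piece_load({2: 4, 1: 12})` gives 400, `piece_load({2: 12, 1: 4})` gives 912 and `piece_load({0: 16})` gives 16. Two pieces of load 400, one of 912 and one of 16 add up to 1728, which yields the average load of 432 for four processors.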

Figure 6.20: An example to illustrate the load balancing algorithm. (a) Initial grid; (b) Initial grid with weights; (c) Initial division and Hilbert curve (k = 1); (d) Hilbert curve for k = 2; (e) Hilbert curve and load assignment after load balancing for k = 3; (f) Grid assignment after load balancing.

7 Numerical experiments

In this chapter we analyze the performance of the numerical scheme for various one- and two-dimensional examples. Quantitative studies, such as the analysis of the numerical errors, have been made in the one-dimensional case only, because it allows us to analyze particular phenomena that are hard to isolate in 2D experiments, and because the reduced execution times allow extensive testing in a reasonable time. In the two-dimensional examples we show the behavior of our scheme in more complex problems, with a more qualitative approach. We have selected several problems whose solutions exhibit a wide variety of phenomena, in order to show that the AMR algorithm performs well in the widest possible range of situations. The aim of this chapter is to give an insight into how the algorithm behaves in different situations, through a wide range of test problems. The performance of the AMR algorithm depends on the complexity of the problem, on the resolution of the grid hierarchy and on the parameters, among other factors, and only experience can help in the choice of a particular setup for a particular problem.

Let us state some abbreviations and conventions used in this chapter. We will denote by $\tau_g$ the tolerance used in the gradient sensor defined in equations (5.11) for the 1D case and (6.6) for the 2D case. The tolerance used by the cell marking procedure based on the interpolation error, defined respectively in equations (5.12) and (6.7), will be denoted by $\tau_p$. The parameter $t_c$ is the one used to group marked cells into rectangular patches; it appears in equation (6.8). The CFL constant, denoted by $K$, is used to compute the time steps for each iteration, as in (6.10), so that the CFL condition is ensured with the approach described in Sections 5.3, 6.1.3 and A.4.3. In all tests we have set all refinement factors equal to 2 for simplicity. We will use the term error to refer to the difference between the solution computed by the AMR algorithm and the solution computed on a fixed grid with the same resolution as the finest grid in the grid hierarchy. As a measure of the performance of the algorithm we use the percentage of integrations that the AMR algorithm needs (with respect to the number of integrations on a fixed grid), because the integration is, by far, the most time consuming process in the algorithm. The choice of this quantity is justified in Section 7.1.4.

7.1 One-dimensional tests

We present in this section some tests performed on well known one-dimensional test cases. Let us state some facts about the figures and results presented in this section. The AMR algorithm produces as output a grid hierarchy with an associated numerical solution for each node of each grid. Therefore there is some redundancy in the output data, because of the grid overlapping intrinsic to the algorithm. It is not useful to plot all these data together, and we use two kinds of visualization in the case of 1D data. In some figures we plot the numerical solution using a mixture of the solutions in the various grids, using the finest grid available in the AMR grid hierarchy wherever several grids overlap, thus discarding the values in a grid if a finer grid exists for the same spatial location. The density of the plotted values gives an idea of the refinement used. In other cases we plot the solution on a uniform grid with the same resolution as the finest grid of the actual grid hierarchy. The solution at the points where the finest grid does not exist is computed using interpolation from coarser grids. In the computation of errors with respect to a fixed grid we always use the latter approach, in order to be able to compare both solutions point by point.

7.1.1 Linear advection equation

Our first test consists in the solution of the linear 1D advection equation

\[
\begin{cases}
u_t + c\,u_x = 0, & x \in [-1, 1],\ t \ge 0,\\
u(x, 0) = u_0(x), & x \in [-1, 1],
\end{cases}
\tag{7.1}
\]

where $c$ is a nonzero constant. The exact solution $u(x, t) = u_0(x - ct)$ can be easily obtained by the method of characteristics, and consists of the advection of the initial data $u_0(x)$ at speed $c$. An interesting test case is the advection of discontinuities. We have chosen the initial data

\[
u_0(x) = \begin{cases}
1 & \text{if } |x| < \frac{1}{5},\\
0 & \text{otherwise.}
\end{cases}
\tag{7.2}
\]

Figure 7.1 shows the numerical results obtained for $c = 1$ with the AMR algorithm, using five grid levels. The coarsest grid, at level 0, is a fixed grid of 50 points, and all refinement factors are equal to 2, giving a resolution equivalent to that of a fixed grid of 800 points. The solution has been evolved until time $t = 0.5$. Up to that time, the AMR algorithm performs 12.66% of the integrations made by the algorithm running on a fixed grid of 800 nodes (we will refer to this ratio as global efficiency). The values of the parameters used are $t_c = 0.7$, $\tau_p = 10^{-4}$, $\tau_g = 10.0$ and $K = 0.5$. We observe that the solution obtained by the AMR algorithm is in good agreement with the solution computed on a fixed grid.

Figure 7.1: Solution of the linear advection equation (7.1) with initial data (7.2) at time t = 0.5. (a) Full solution; (b) Zoom of the left contact discontinuity; (c) Grid hierarchy used in the last iteration; (d) Zoom of the right contact discontinuity. [Panels (a), (b) and (d) plot u(x) against x for the exact, fixed grid and AMR solutions; panel (c) also shows the refinement level against x.]

In Figure 7.2 we show the difference between the solution obtained by the AMR algorithm and the solution computed on a fixed grid of 800 points, for several executions with the same parameters as in Figure 7.1 except $\tau_p$, which varies between $10^{-8}$ and 0.25. We plot the differences measured in the 1-norm, the 2-norm and the max-norm. In Figure 7.2(a) we observe that the error decreases linearly with respect to the tolerance parameter. In this case the grids have been adapted according to the interpolation error only, without the gradient sensor. This is the reason why an abrupt increase in the error can be observed for big values of $\tau_p$. Due to numerical diffusion, the contact discontinuity is smeared in all grids and, if the threshold is too high, the differences between the actual discrete values in a grid and the values predicted from the coarser grid by interpolation are smaller than the threshold and the contact discontinuity is not refined. This phenomenon is avoided if we include the gradient sensor in the adaptation. In this case the contact discontinuity is always refined and the error remains nearly constant for big values of $\tau_p$. The errors corresponding to an adaptation based on both sensors are shown in Figure 7.2(b), where $\tau_g = 10$ and $\tau_p$ varies in the same range as in Fig. 7.2(a). Note that up to a certain tolerance both graphs coincide.

Figure 7.2: Difference between the solution with the AMR algorithm and the solution on an equivalent fixed grid, with respect to the tolerance $\tau_p$, for the linear advection equation (7.1) with initial data (7.2) at time t = 0.5. (a) Adaptation based on interpolation errors only; (b) Adaptation based on interpolation errors and gradient. [Logarithmic plots of the error in the 1-norm, 2-norm and max-norm against the tolerance $\tau_p$.]

Figure 7.3 shows the relation between the errors and the percentage of integrations made by the AMR algorithm with respect to the integrations needed by the algorithm on a fixed grid of 800 points. We observe the good performance of the algorithm, which is able to obtain a solution very close to the reference solution with a much smaller computational cost. As an example, the solution depicted in Figure 7.1, for which the algorithm makes only 12.66% of the integrations needed on a fixed grid, has a difference with respect to the reference solution equal to $1.3243 \cdot 10^{-5}$ (in the 1-norm), $4.3523 \cdot 10^{-5}$ (in the 2-norm) and $3.3932 \cdot 10^{-4}$ (in the max-norm).
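For reference, the exact solution of (7.1) with initial data (7.2) and the three norms used to measure differences between two grid functions can be written as in the sketch below. The scaling of the discrete 1-norm and 2-norm by the mesh size `h` is an assumption of this example; the precise definition used to produce the figures is not restated in this section.

```python
import math


def u0(x):
    """Initial data (7.2): a square pulse of width 2/5 centered at x = 0."""
    return 1.0 if abs(x) < 0.2 else 0.0


def exact_advection(x, t, c=1.0):
    """Exact solution u(x, t) = u0(x - c t), valid while the pulse has not
    reached the boundary of [-1, 1]."""
    return u0(x - c * t)


def difference_norms(u, v, h):
    """Discrete 1-norm, 2-norm and max-norm of u - v on a uniform grid of
    mesh size h (the h-weighting is an assumption of this sketch)."""
    d = [abs(a - b) for a, b in zip(u, v)]
    return h * sum(d), math.sqrt(h * sum(e * e for e in d)), max(d)
```

At $t = 0.5$ with $c = 1$ the two discontinuities are located at $x = 0.3$ and $x = 0.7$, which are the regions magnified in Fig. 7.1.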

[Figure 7.3 (plot not reproduced): "1D advection, t = 0.5". Error in the 1-norm, 2-norm and max-norm (logarithmic vertical axis) against the percentage of integrations (horizontal axis).]

Figure 7.3: Difference between the solution with the AMR algorithm and the solution on an equivalent fixed grid, with respect to the percentage of integrations done by the AMR algorithm, for the linear advection equation (7.1) with initial data (7.2) at time $t = 0.5$.

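7.1.2 Inviscid Burgers' equation

One of the simplest nonlinear scalar equations in 1D is Burgers' equation. In this section we solve the following problem:

$$
\begin{cases}
u_t + \left( \dfrac{u^2}{2} \right)_x = 0, & x \in [-1, 1],\ t \geq 0, \\[4pt]
u(x, 0) = u_0(x), & x \in [-1, 1],
\end{cases}
\tag{7.3}
$$

with initial data (7.2). The solution consists of a shock wave moving to the right with speed $c = \frac{1}{2}$ and a rarefaction wave. For $t < \frac{4}{5}$ the solution of this problem is given by

$$
u(x, t) =
\begin{cases}
0 & \text{if } x < -\frac{1}{5} \text{ or } x \geq \frac{t}{2} + \frac{1}{5}, \\[4pt]
\dfrac{x + \frac{1}{5}}{t} & \text{if } -\frac{1}{5} \leq x \leq t - \frac{1}{5}, \\[4pt]
1 & \text{if } t - \frac{1}{5} < x < \frac{t}{2} + \frac{1}{5}.
\end{cases}
$$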
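The exact solution above can be evaluated pointwise, for instance to produce the "Exact" curve of a comparison plot. The following is a minimal sketch; the function name `burgers_exact` is hypothetical, and the formula is only used in its stated range of validity, $0 < t < \frac{4}{5}$ (the time at which the head of the rarefaction, $x = t - \frac{1}{5}$, reaches the shock, $x = \frac{t}{2} + \frac{1}{5}$).

```python
def burgers_exact(x, t):
    """Exact solution of the Burgers problem quoted above, for 0 < t < 4/5."""
    if not 0.0 < t < 0.8:
        raise ValueError("formula only valid for 0 < t < 4/5")
    tail = -0.2              # tail of the rarefaction (fixed in time)
    head = t - 0.2           # head of the rarefaction
    shock = 0.5 * t + 0.2    # shock position, moving with speed 1/2
    if x < tail or x >= shock:
        return 0.0
    if x <= head:
        return (x + 0.2) / t  # linear profile inside the rarefaction
    return 1.0                # plateau between the rarefaction and the shock


# Values at the final time of the experiment, t = 0.7:
# rarefaction on [-0.2, 0.5], plateau on (0.5, 0.55), shock at x = 0.55.
samples = [burgers_exact(x, 0.7) for x in (-0.5, 0.15, 0.52, 0.6)]
```

At $t = 0.7$ the shock sits at $x = 0.55$ and the head of the rarefaction at $x = 0.5$, which agrees with the zoom windows used in the figures of this section.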

In Figure 7.4 we compare the results obtained by the AMR algorithm with a reference solution computed on a fixed grid and with the exact solution. In this experiment we have used a grid hierarchy of the same type as for the linear advection equation in section 7.1.1 (coarsest grid of 50 points, 5 refinement levels, equivalent to a resolution of 800 points). The parameters have been set to $t_c = 0.7$, $\tau_p = 10^{-4}$, $\tau_g = 10.0$ and $K = 0.5$. The solution corresponds to time $t = 0.7$. We observe that the algorithm has properly identified the regions corresponding to the shock wave and to the tail and the head of the rarefaction. In this experiment the algorithm performs 12.96% of the integrations required by the reference solution.

It is interesting to observe what happens if we use a marking strategy based on a gradient sensor only, as is usual in the literature (see e.g. [139, 82]). Figure 7.5 shows the results obtained using only the gradient sensor, with the same setup as in Figure 7.4 and $\tau_g = 2.75$. This parameter has been chosen so that the percentage of integrations (12.99% in this case) is similar in both experiments. A loss of accuracy can be seen in the tail and the head of the rarefaction, due to a lack of refinement. While shocks and contact discontinuities can be properly refined by the gradient sensor, this is not the case for rarefactions. In fact, the only way for the gradient sensor to refine these parts is to refine the whole rarefaction.

To give more information about how the gradient sensor behaves, in Figure 7.6 we show the numerical solution at two different iterations of the algorithm for this case, along with the grid hierarchies used in those iterations. Figures 7.6(a) and 7.6(c) correspond to $t = 0.339945$ (iteration 17), and Figures 7.6(b) and 7.6(d) correspond to $t = 0.379937$ (iteration 19). While in iteration 17 the algorithm has refined the whole rarefaction, in iteration 19 only the shock is refined, because the slope of the rarefaction has become smaller than $\tau_g$.

Figure 7.7 shows how the error incurred by the AMR algorithm, measured with respect to the solution on a fixed grid, depends on the tolerance $\tau_p$ (Figure 7.7(a)) and on the percentage of integrations (Figure 7.7(b)). The setup for this experiment is again the same: a grid hierarchy of 5 levels, with 50 points in the coarsest level, $\tau_g = 10.0$, $t_c = 0.7$, $K = 0.5$.
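The change of behaviour of the gradient-only strategy between iterations 17 and 19 can be checked with a short calculation. Inside the rarefaction the exact solution is $u = (x + \frac{1}{5})/t$, so its slope is $1/t$. The sketch below assumes, for illustration, that the gradient sensor compares an approximation of $|u_x|$ with $\tau_g$; the actual sensor acts on the discrete solution and is defined elsewhere in the thesis. The function name `rarefaction_slope` is hypothetical.

```python
def rarefaction_slope(t):
    """Slope u_x = 1/t of the exact solution inside the rarefaction."""
    return 1.0 / t


tau_g = 2.75  # gradient tolerance used for Figures 7.5 and 7.6

# (iteration, time) pairs quoted in the text.
for iteration, t in [(17, 0.339945), (19, 0.379937)]:
    slope = rarefaction_slope(t)
    print(iteration, round(slope, 2), "refined" if slope > tau_g else "not refined")

# Time at which the exact slope equals the threshold.
t_cross = 1.0 / tau_g
```

The slope is about 2.94 at iteration 17 and about 2.63 at iteration 19, so under this assumption it crosses $\tau_g = 2.75$ between the two iterations (at $t \approx 0.364$), consistent with the whole rarefaction being refined in one and not in the other.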
[Figure 7.4 (plots not reproduced): "1D Burgers, t = 0.7", $u(x)$ against $x$ for the exact, fixed grid and AMR solutions. (a) Full solution. (b) Zoom of the tail of the rarefaction. (c) Zoom of the head of the rarefaction and the part at the left of the shock. (d) Zoom of the part at the right of the shock.]

Figure 7.4: Solution of Burgers' equation (7.3) with initial data (7.2) at time $t = 0.7$.

[Figure 7.5 (plots not reproduced): "1D Burgers, t = 0.7", $u(x)$ against $x$ for the exact, fixed grid and AMR solutions. (a) Full solution. (b) Zoom of the tail of the rarefaction. (c) Zoom of the head of the rarefaction and the part at the left of the shock. (d) Zoom of the part at the right of the shock.]

Figure 7.5: Solution of Burgers' equation (7.3) with initial data (7.2) at time $t = 0.7$. Adaptation based on the gradient sensor only.

[Figure 7.6 (plots not reproduced): "1D Burgers", $u(x)$ against $x$ and refinement level against $x$. (a) Solution for $t = 0.339945$ (iteration 17). (b) Solution for $t = 0.379937$ (iteration 19). (c) Grid hierarchy for iteration 17. (d) Grid hierarchy for iteration 19.]

Figure 7.6: Solutions of Burgers' equation (7.3) with initial data (7.2) at times $t = 0.339945$ and $t = 0.379937$. Adaptation based on the gradient sensor only.

[Figure 7.7 (plots not reproduced): "1D Burgers, t = 0.7", error in the 1-norm, 2-norm and max-norm on a logarithmic scale. (a) Errors vs. tolerance $\tau_p$. (b) Errors vs. percentage of integrations.]

Figure 7.7: Difference between the solution with the AMR algorithm and the solution on an equivalent fixed grid, with respect to the tolerance $\tau_p$ and to the percentage of integrations, for Burgers' equation (7.3) with initial data (7.2) at time $t = 0.7$.

Figure 7.7(a) shows the same behavior as in the case of the advection equation, plotted in Figure 7.2(b). The presence of the gradient sensor avoids large errors for large values of $\tau_p$, producing a flat plot in the right part of the graph. For intermediate values, the errors behave linearly with respect to the tolerance. Unlike the case of the linear advection equation, the errors remain nearly constant and do not decrease for tolerances smaller than a certain value. The reason for this stabilization of the error can be traced back to the fact that the numerical solution at the shock is extremely sensitive to perturbations, and a small difference between the solution on a fixed grid and the solution of the AMR algorithm persists, provided the shock wave is refined, regardless of the refinement used. This difference can be caused by several factors, but the clearest one is that, when working with a single grid corresponding to the finest resolution, the time step is updated after each iteration: the maximum characteristic speed is recomputed and the time step is adapted so that the CFL condition is satisfied. In the AMR algorithm, in turn, the time step is adapted after every iteration of the coarsest grid. In our example, this corresponds to 16 iterations of the finest grid.

To illustrate this assertion, in Figure 7.8 we plot the difference $u_{AMR} - u_{REF}$ between the solution $u_{AMR}$ of the AMR algorithm and the solution $u_{REF}$ computed on a fixed grid of 800 points, for various choices of the refinement tolerance. Figures 7.8(a), 7.8(b) and 7.8(c) correspond to $\tau_p = 10^{-4}$, $\tau_p = 10^{-5}$ and $\tau_p = 10^{-6}$, respectively. A pointwise convergence to the reference solution is clearly observed except at the shock, where the difference does not go to zero. This is even clearer in Figure 7.8(d), where we compare the reference solution with a solution computed on a complete grid hierarchy, i.e., a grid hierarchy where the grid corresponding to each level is a complete fixed grid, of the corresponding size, covering the whole domain. In this case the finest grid is not affected by the solution at the coarser grids, since grid interpolation is never done. Also in this case, a difference similar to the one depicted in Figures 7.8(a), 7.8(b) and 7.8(c) exists. The existence of this difference does not mean that one solution is more accurate than the other. The same can be observed in Figures 7.9(a) and 7.9(b), which show, respectively, a closeup of the differences in a part of the rarefaction and in the zone of the shock. The differences for the cases $\tau_p = 10^{-4}$, $\tau_p = 10^{-5}$ and the case of the complete grid hierarchy are plotted together.
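The difference in how often the time step is updated can be made concrete with a minimal sketch. It assumes a refinement factor of 2 in space and time between consecutive levels (consistent with 50 coarse points and 5 levels being equivalent to 800 points) and a generic CFL restriction for Burgers' equation, whose characteristic speed is $f'(u) = u$. The function names and the parameter `cfl` are hypothetical and are not taken from the thesis.

```python
def cfl_time_step(u, h, cfl):
    """Largest time step allowed by a CFL restriction for Burgers' equation."""
    max_speed = max(abs(v) for v in u)  # maximum characteristic speed |u|
    if max_speed == 0.0:
        return float("inf")             # no restriction for a solution at rest
    return cfl * h / max_speed


def fine_steps_per_coarse_step(levels, ratio=2):
    """Finest-level steps contained in one coarsest-level step, assuming the
    same refinement ratio in time between consecutive levels."""
    return ratio ** (levels - 1)


# Hierarchy of the experiments: 5 levels, so 2**4 fine steps per coarse step.
steps = fine_steps_per_coarse_step(5)

# Fixed grid: the step is recomputed from the current data at every iteration.
# AMR (sketch): the coarse step is computed once and the finest level takes
# `steps` sub-steps of equal size before the step is updated again.
h_coarse = 2.0 / 50
u_now = [0.0, 0.5, 1.0]                 # placeholder data, for illustration
dt_fine_amr = cfl_time_step(u_now, h_coarse, cfl=0.5) / steps
```

With these assumptions `fine_steps_per_coarse_step(5)` gives 16, the number of finest-grid iterations per coarse iteration quoted above. During those 16 sub-steps the AMR time step does not react to changes in the maximum speed, whereas the fixed-grid step does, which is one way the two solutions can drift apart near the shock.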

7.1.3 The Euler equations of gas dynamics

The Euler equations are one of the most important models of nonlinear hyperbolic systems of conservation laws. In this section we test the AMR algorithm on two common problems, namely the shock tube or Sod's problem [166] and the interaction of a shock and an entropy wave [161]. The Euler equations in one dimension are given by (2.27), and we refer to section 2.3.3 for the description of the equations.

Shock tube problem

We solve a Riemann problem for the Euler equations (2.27). The initial data is given by

$$
u_0(x) =
\begin{cases}
u_L & \text{if } x < 0, \\
u_R & \text{if } x \geq 0,
\end{cases}
\tag{7.4}
$$

with

$$
u_L = (\rho_L, v_L, p_L) = (1, 0, 1), \qquad u_R = (\rho_R, v_R, p_R) = (0.125, 0, 0.1).
\tag{7.5}
$$
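The states in (7.5) are given in primitive variables (density, velocity, pressure), while a conservative scheme evolves density, momentum and total energy. A minimal sketch of the conversion follows. It assumes an ideal gas with ratio of specific heats $\gamma = 1.4$ and the usual total energy $E = p/(\gamma - 1) + \frac{1}{2}\rho v^2$; neither is restated in this section, so both are assumptions here, and the function names are hypothetical.

```python
GAMMA = 1.4  # assumed ratio of specific heats (ideal gas)


def primitive_to_conservative(rho, v, p, gamma=GAMMA):
    """Map (density, velocity, pressure) to (density, momentum, total energy)."""
    momentum = rho * v
    energy = p / (gamma - 1.0) + 0.5 * rho * v * v
    return (rho, momentum, energy)


def sod_initial_data(x):
    """Initial state (7.4)-(7.5) at position x, in conservative variables."""
    u_left = (1.0, 0.0, 1.0)      # (rho_L, v_L, p_L)
    u_right = (0.125, 0.0, 0.1)   # (rho_R, v_R, p_R)
    return primitive_to_conservative(*(u_left if x < 0.0 else u_right))
```

Under these assumptions the left and right states have total energies of about 2.5 and 0.25, respectively, and zero momentum.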

The solution consists of a shock wave, a contact discontinuity and a rarefaction wave. We evolve the solution until time $t = 2$ with the AMR algorithm using a grid hierarchy of 5 levels, with a coarsest grid of 50 points. We have used the parameters $t_c = 0.7$, $\tau_g = 10.0$, $\tau_p = 10^{-4}$ and $K = 0.5$. With this setup the AMR algorithm computes the solution depicted in Figure 7.10. In this example the AMR algorithm, using only 19.59% of the integrations required by the algorithm on a fixed grid, is able to properly resolve all the features of the solution. Refinement has been made at the tail and the head of the rarefaction, at the contact discontinuity and at the shock wave. Figures 7.11 and 7.12 show zoomed regions of the density, velocity and pressure profiles around the relevant zones. Figure 7.13 shows the errors corresponding to the density field with respect to the tolerance parameter $\tau_p$ and the percentage of integrations. The error behaves in the same way as in the case of Burgers' equation, shown in Figure 7.7.
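The percentage of integrations used throughout this section can be sketched as a simple count. The sketch assumes that one integration is the update of one cell over one time step of its own level, and that levels are refined by a factor of 2 in both space and time, so that a cell of level $\ell$ is updated $2^\ell$ times per coarse step. The counting rule actually used in the thesis may differ in detail, and the function name and the sample cell counts are hypothetical.

```python
def percentage_of_integrations(cells_per_level, coarse_cells, ratio=2):
    """Cell updates of an AMR hierarchy during one coarse time step, as a
    percentage of the updates needed by the equivalent fixed fine grid.

    cells_per_level[l] is the number of cells present at level l
    (level 0 is the coarsest grid and always covers the whole domain).
    """
    levels = len(cells_per_level)
    amr_updates = sum(n * ratio ** l for l, n in enumerate(cells_per_level))
    fine_cells = coarse_cells * ratio ** (levels - 1)
    fixed_updates = fine_cells * ratio ** (levels - 1)
    return 100.0 * amr_updates / fixed_updates


# Hypothetical hierarchy: 50 coarse cells and small refined patches above it.
sparse = percentage_of_integrations([50, 20, 24, 32, 40], coarse_cells=50)

# A complete hierarchy (every level covering the whole domain).
complete = percentage_of_integrations([50, 100, 200, 400, 800], coarse_cells=50)
```

With this counting rule a complete hierarchy costs more than 100% of the fixed-grid work, because the coarser levels are integrated in addition to the finest one. The savings of the AMR algorithm come entirely from the finer levels covering only a small part of the domain.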

[Figure 7.8 (plots not reproduced): "1D Burgers, t = 0.7", difference $u_{AMR} - u_{REF}$ against $x$, vertical axis in units of $10^{-4}$. (a) $\tau_p = 10^{-4}$. (b) $\tau_p = 10^{-5}$. (c) $\tau_p = 10^{-6}$. (d) Complete grid hierarchy.]

Figure 7.8: Differences between the reference solution and several AMR solutions corresponding to different values of $\tau_p$. Burgers' equation (7.3) with initial data (7.2) at time $t = 0.7$. All figures are at the same scale.

[Figure 7.9 (plots not reproduced): "1D Burgers, t = 0.7", error against $x$, vertical axis in units of $10^{-4}$, for $\tau_p = 10^{-4}$, $\tau_p = 10^{-5}$ and the complete grid hierarchy. (a) Zoom in a part of the rarefaction. (b) Zoom in the zone of the shock.]

Figure 7.9: Differences between the reference solution and several AMR solutions corresponding to different values of $\tau_p$. Burgers' equation (7.3) with initial data (7.2) at time $t = 0.7$.

Shock-entropy wave interaction

The solutions of the previous problems consist of some waves moving through regions in which the solution is piecewise constant. We now consider another problem, proposed by Shu and Osher [161], in which a Mach 3 shock wave interacts with sinusoidal waves in density. In this experiment the solution combines regions with discontinuities and regions with a complicated smooth structure. The Euler equations (2.27) are solved with initial data

$$
u_0(x) =
\begin{cases}
(\rho_L, v_L, p_L) = (3.857, 2.629, 10.33) & \text{if } x \leq -4, \\
(\rho_R, v_R, p_R) = (1 + 0.2 \sin(5x), 0, 1) & \text{otherwise,}
\end{cases}
\tag{7.6}
$$

and the solution is computed at time $t = 1.8$. The computational domain has been set to $[-5, 5]$, with outflow boundary conditions at $x = -5$ and inflow boundary conditions at $x = 5$. The density, velocity and pressure corresponding to the initial data (7.6) are depicted in Figure 7.14. Figures 7.15, 7.16 and 7.17 show, respectively, the density, velocity and pressure distributions computed by the AMR algorithm. A good agreement between the AMR and the reference solution can be observed. We have used the previous grid hierarchy (5 refinement levels, coarsest grid of 50 points), with the parameters $\tau_p = 5 \cdot 10^{-3}$, $\tau_g = 8.0$, $t_c = 0.8$, $K = 0.5$. With this setup the algorithm performs 29% of the integrations needed by the algorithm applied to a grid of 800 points.
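The initial data (7.6) and the resolution available for the sinusoidal density waves can be sketched as follows. The states are returned in primitive variables exactly as written in (7.6); the function names are hypothetical, and the resolution estimate refers to the unperturbed waves ahead of the shock only.

```python
import math


def shock_entropy_initial_data(x):
    """Primitive variables (density, velocity, pressure) of (7.6) at x."""
    if x <= -4.0:
        return (3.857, 2.629, 10.33)
    return (1.0 + 0.2 * math.sin(5.0 * x), 0.0, 1.0)


def cells_per_wavelength(domain_length, cells, wavenumber=5.0):
    """Number of grid cells covering one period of sin(wavenumber * x)."""
    h = domain_length / cells
    return (2.0 * math.pi / wavenumber) / h


# Domain [-5, 5]: coarsest grid of 50 cells and equivalent fine grid of 800.
coarse_resolution = cells_per_wavelength(10.0, 50)
fine_resolution = cells_per_wavelength(10.0, 800)
```

Each period of the initial density wave is covered by roughly 6 cells of the coarsest grid and roughly 100 cells of the finest one, which gives an idea of why the refinement of the smooth region matters in this test.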

[Figure 7.10 (plots not reproduced): "1D Euler, Sod's problem, t = 2", fixed grid and AMR solutions against $x \in [-5, 5]$. (a) Density. (b) Velocity. (c) Pressure. (d) Grid hierarchy used in the last iteration.]

Figure 7.10: Solution of the shock tube problem for the Euler equations (2.27) with initial data (7.4) for $t = 2$.

[Figure 7.11 (plots not reproduced): "1D Euler, Sod's problem, t = 2", fixed grid and AMR solutions. (a) Density, head of the rarefaction. (b) Density, tail of the rarefaction. (c) Density, contact discontinuity. (d) Density, shock wave.]

Figure 7.11: Zoomed regions for the density field of the solution of the shock tube problem for the Euler equations (2.27) with initial data (7.4) for $t = 2$.

[Figure 7.12 (plots not reproduced): "1D Euler, Sod's problem, t = 2", fixed grid and AMR solutions. (a) Velocity, head of the rarefaction. (b) Velocity, tail of the rarefaction. (c) Velocity, shock wave. (d) Pressure, head of the rarefaction. (e) Pressure, tail of the rarefaction. (f) Pressure, shock wave.]

Figure 7.12: Zoomed regions for the velocity and pressure fields of the solution of the shock tube problem for the Euler equations (2.27) with initial data (7.4) for $t = 2$.

[Figure 7.13 (plots not reproduced): "1D Euler, Sod's problem, t = 2", error in the 1-norm, 2-norm and max-norm on a logarithmic scale. (a) Errors vs. tolerance $\tau_p$. (b) Errors vs. percentage of integrations.]

Figure 7.13: Difference in the density field between the solution computed with the AMR algorithm and the solution computed on an equivalent fixed grid, with respect to the tolerance $\tau_p$ (a) and to the percentage of integrations (b), for the shock tube problem for the Euler equations (2.27), with initial data (7.4), for time $t = 2$.

[Figure 7.14 (plots not reproduced): "1D Euler, Shock-entropy wave interaction problem", profiles against $x \in [-5, 5]$. (a) Density. (b) Velocity. (c) Pressure.]

Figure 7.14: Initial data (7.6).

[Figure 7.15 (plots not reproduced): "1D Euler, Shock-entropy wave interaction problem, t = 1.8", fixed grid and AMR solutions. (a) Density field. (b), (c), (d) Density, zoomed regions.]

Figure 7.15: Numerical solution of the shock-entropy wave interaction problem. Euler equations (2.27) with initial data (7.6). Density field for $t = 1.8$.

[Figure 7.16 (plots not reproduced): "1D Euler, Shock-entropy wave interaction problem, t = 1.8", fixed grid and AMR solutions. (a) Velocity field. (b), (c), (d) Velocity, zoomed regions.]

Figure 7.16: Numerical solution of the shock-entropy wave interaction problem. Euler equations (2.27) with initial data (7.6). Velocity field for $t = 1.8$.

[Figure 7.17 (plots not reproduced): "1D Euler, Shock-entropy wave interaction problem, t = 1.8", fixed grid and AMR solutions. (a) Pressure field. (b), (c), (d) Pressure, zoomed regions.]

Figure 7.17: Numerical solution of the shock-entropy wave interaction problem. Euler equations (2.27) with initial data (7.6). Pressure field for $t = 1.8$.

[Figure 7.18 (plots not reproduced): "1D Euler, Shock-entropy wave problem, t = 1.8", error in the 1-norm, 2-norm and max-norm on a logarithmic scale. (a) Errors vs. tolerance $\tau_p$. (b) Errors vs. percentage of integrations.]

Figure 7.18: Difference in the density field between the solution with the AMR algorithm and the solution on an equivalent fixed grid, with respect to the tolerance $\tau_p$ (a) and to the percentage of integrations (b), for the shock-entropy wave interaction problem for the Euler equations (2.27), with initial data (7.6) for $t = 1.8$.

Figure 7.18(a) shows the errors corresponding to different values of the tolerance parameter $\tau_p$, for the density distribution. The error in this case behaves differently from the cases previously studied (compare Figure 7.18(a) with Figures 7.2(b), 7.7(a) and 7.13(a)). We observe a combination of zones where the norm of the error does not change significantly and zones with abrupt changes in the error. This behavior can be explained as the result of the influence on the error of several sources of differences between the AMR solution and the fixed grid reference solution.

When analyzing the relation between the error and the refinement tolerance in previous experiments (Burgers' equation and the shock tube problem), we concluded that there was a difference between the solution on a fixed grid and the solution obtained by the AMR algorithm that is not related to the refinement but to the organization of the algorithm, and that is present even if the numerical solution is obtained with a complete grid hierarchy. We observed that this difference was due to the fact that the numerical solution computed at the shock wave is extremely sensitive to perturbations. In the experiments in sections 7.1.2 and 7.1.3, this was the source that dominated the error for tolerances smaller than a certain value, so that the error was nearly constant for that range of tolerances. In the case of the shock-entropy wave interaction problem, the solution is much more complicated and, as the refinement changes, different factors can influence the error.

Consider, for example, the abrupt error decrease that occurs near $\tau_p = 10^{-2}$. We plot in Figure 7.19 the density distribution of two solutions of the AMR algorithm corresponding to two different, but close, tolerances, $\tau_p = 8.65 \cdot 10^{-3}$ and $\tau_p = 1.30 \cdot 10^{-2}$. Although the AMR algorithm requires nearly the same percentage of integrations with respect to the reference solution for these two tolerances (24.55% for $\tau_p = 8.65 \cdot 10^{-3}$ and 24.49% for $\tau_p = 1.30 \cdot 10^{-2}$), the errors for $\tau_p = 8.65 \cdot 10^{-3}$ are, depending on the case, 4 to 5 times smaller than the errors for $\tau_p = 1.30 \cdot 10^{-2}$, and a difference in the numerical solutions is clearly visible in the plots of Figure 7.19. To analyze why this happens, in Figure 7.20 we plot the differences between the AMR solution and the reference solution for several values of $\tau_p$ in the range from $8.65 \cdot 10^{-3}$ to $2.92 \cdot 10^{-2}$. Comparing Figures 7.20(a), 7.20(b) and 7.20(c), we observe that the reduction in the tolerance parameter mainly produces a reduction in the acoustic waves located to the left of the point $x = 0$, but does not produce a significant reduction in the zone near the shock wave. As the error at the shock is the largest one, the overall reduction of the error is small, as can be observed in Figure 7.18(a). If the tolerance is set to $\tau_p = 8.65 \cdot 10^{-3}$ (Figure 7.20(d)), the solution at the shock is better approximated and the error drops abruptly. Of course, the adaptation algorithm has followed the shock as it moves for all the tolerances considered, and the reason for the different quality of the solutions obtained with different tolerances is not a lack of refinement at the shock location for the larger tolerances, but the high sensitivity of this problem to perturbations.

In this particular case, the difference comes from the fact that, for the smallest tolerance $\tau_p = 8.65 \cdot 10^{-3}$, the algorithm has included two refinement levels (levels 0 and 1) during the whole integration process in the zone where the sinusoidal waves are initially located, whereas for the other tolerances it has not, as shown in Figure 7.21. Although the interpolation errors in that zone are small for all the tolerances considered, because of the smoothness of the solution, the smaller interpolation errors obtained with the smaller tolerance produce a better approximation of the states at both sides of the shock. The same argument explains the large decrease in the error that can be observed in Figure 7.18(a) for tolerances near $2 \cdot 10^{-3}$.
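The error measures compared in this discussion (1-norm, 2-norm and max-norm of the difference between the AMR and the reference solution) can be sketched as follows. The scaling of the 1- and 2-norms by the cell size $h$ is an assumption, since this section does not restate the normalisation used in the thesis, and the function name `difference_norms` is hypothetical.

```python
import math


def difference_norms(u_amr, u_ref, h):
    """Discrete 1-, 2- and max-norms of u_amr - u_ref on a uniform grid.

    Both solutions are expected on the same (finest) grid of spacing h.
    """
    if len(u_amr) != len(u_ref):
        raise ValueError("both solutions must be given on the same grid")
    diff = [a - r for a, r in zip(u_amr, u_ref)]
    norm_1 = h * sum(abs(d) for d in diff)
    norm_2 = math.sqrt(h * sum(d * d for d in diff))
    norm_max = max(abs(d) for d in diff)
    return norm_1, norm_2, norm_max


# Toy data: the two solutions differ at a single point, as happens when the
# discrepancy is concentrated at a shock.
n1, n2, nmax = difference_norms([1.0, 2.0, 3.0], [1.0, 2.0, 5.0], h=0.5)
```

A discrepancy concentrated in a few cells, such as the one at the shock, weighs most heavily on the max-norm and least on the 1-norm, which is one reason the three curves in the error plots are reported separately.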

[Figure 7.19 (plots not reproduced): "1D Euler, Shock-entropy wave interaction problem, t = 1.8", density against $x$ for the fixed grid solution and the AMR solutions with $\tau_p = 8.65 \cdot 10^{-3}$ and $\tau_p = 1.30 \cdot 10^{-2}$. (a) Density field. (b), (c), (d) Density, zoomed regions.]

Figure 7.19: Numerical solution obtained with the AMR algorithm for two values of $\tau_p$, for the shock-entropy wave interaction problem. Density field for $t = 1.8$.

[Figure 7.20 (plots not reproduced): "1D Euler, Shock-entropy wave interaction problem, t = 1.8", difference $u_{AMR} - u_{REF}$ (density) against $x \in [-5, 5]$. (a) $\tau_p = 2.92 \cdot 10^{-2}$. (b) $\tau_p = 1.95 \cdot 10^{-2}$. (c) $\tau_p = 1.30 \cdot 10^{-2}$. (d) $\tau_p = 8.65 \cdot 10^{-3}$.]

Figure 7.20: Difference at time $t = 1.8$ in the density field between the reference solution and several AMR solutions, for different refinement parameters. Shock-entropy wave interaction problem. All figures are at the same scale.

[Figure 7.21 (plots not reproduced): "1D Euler, Shock-entropy wave interaction problem, t = 1.8", refinement level against $x \in [-5, 5]$. (a) $\tau_p = 1.30 \cdot 10^{-2}$. (b) $\tau_p = 8.65 \cdot 10^{-3}$.]

Figure 7.21: Grid hierarchies constructed by the AMR algorithm for the first time iteration, for two different refinement parameters. Shock-entropy wave interaction problem.

7.1.4 Two-component Euler equations in 1D

The numerical experiment presented in this section consists of the one-dimensional version of the interaction of a shock wave traveling in air with a helium bubble. The physical problem was originally studied by Haas and Sturtevant in [60] and has been simulated by Karni [83] using a primitive formulation and a second order algorithm, with simulations including AMR in [142]. The integration algorithm presented in this work has been applied to the helium bubble problem on a fixed grid in [118]. The setup of the problem is as follows: the two-component Euler equations (2.42) are considered in the interval $[0, 0.356]$, with initial data that represents a one-dimensional helium "bubble", located in the interval $[0.15, 0.20]$ and surrounded by air. A left-traveling Mach 1.22 shock wave is located at $x = 0.225$. The initial data is thus given by:

$$
u(x, 0) = u_0(x) =
\begin{cases}
u_A & \text{if } 0 \leq x < 0.15 \text{ or } 0.2 < x < 0.225, \\
u_B & \text{if } 0.15 \leq x \leq 0.2, \\
u_S & \text{if } 0.225 \leq x \leq 0.356,
\end{cases}
\tag{7.7}
$$

where

$$
\begin{aligned}
u_A &= (\rho_A, v_A, p_A, \phi_A) = (1, 0, 1, 1), \\
u_B &= (\rho_B, v_B, p_B, \phi_B) = (0.1819, 0, 1, 0), \\
u_S &= (\rho_S, v_S, p_S, \phi_S) = (1.3764, -0.3947, 1.5698, 1).
\end{aligned}
\tag{7.8}
$$

Quiescent air is represented by $u_A$, $u_B$ represents quiescent helium and $u_S$ is the state connected with $u_A$ by a Mach 1.22 shock traveling to the left. The initial data is depicted in Figure 7.22. This is the same setup used in [118] for this experiment.

[Figure 7.22 (plots not reproduced): "1D Multiphase Euler, Shock-bubble interaction problem", profiles against $x \in [0, 0.356]$. (a) Density. (b) Velocity. (c) Pressure. (d) Mass fraction $\phi$.]

Figure 7.22: Initial data (7.7) for the shock-bubble interaction problem in 1D.

We have used the same grid hierarchy as in previous experiments (5 levels, coarsest grid of 50 points, equivalent to a uniform grid of 800 points). The sample result in Figure 7.23 corresponds to time $t = 0.1$, and has been obtained with the parameters $\tau_g = 8.0$, $\tau_p = 10^{-4}$, $t_c = 0.7$ and $K = 0.5$. In this case the AMR algorithm performs 54.97% of the integrations needed by the algorithm applied to a fixed grid of 800 points. Zooms of the plots in Figure 7.23 are shown in Figures 7.24, 7.25 and 7.26. Figure 7.27 shows how the error is related to the tolerance parameter $\tau_p$ and to the percentage of integrations. The error behaves in the same way as in previous examples, with parts where the error decreases linearly and flat zones.

We use this experiment to show the effects of using the flux projection from fine to coarse grids, described in sections 5.4 and 6.1.4. It was argued there that the update of the coarse fluxes using finer fluxes enforces inter-grid conservation and provides a benefit in terms of computational cost, because the coarse solution updated with numerical fluxes coming from the finer grid has an increased resolution. If some data is then interpolated from coarse to fine grids, the results are more accurate if the conservative fix-up has been applied. This feedback mechanism provides a sharper coarse numerical solution which, after some time steps, is visually distinguishable from the coarse solution without the conservative fix-up. An example is shown in Figure 7.28, where the numerical solution computed on the coarsest grid, corresponding to the experiment shown in Figures 7.23, 7.24, 7.25 and 7.26, is shown for both cases (with and without flux projection). The reduced smearing obtained with flux projection results in a reduction in the number of refined cells. Figure 7.29 shows that, for the same number of integrations, the numerical solution obtained using flux projection is more accurate than the solution obtained without it (the figure shows the 2-norm of the difference with respect to the reference solution).
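The post-shock state $u_S$ in (7.8) can be checked against the standard normal-shock (Rankine-Hugoniot) relations for an ideal gas. The sketch below assumes $\gamma = 1.4$ for air and a shock of Mach number 1.22 moving into the quiescent state $u_A = (1, 0, 1)$; the value of $\gamma$ is an assumption, since it is not restated in this section, and the function name `post_shock_state` is hypothetical.

```python
import math


def post_shock_state(mach, rho0=1.0, p0=1.0, gamma=1.4):
    """State behind a shock of given Mach number moving into gas at rest.

    Returns (density, speed, pressure), where speed is the magnitude of the
    gas velocity induced behind the shock, in the direction of propagation.
    """
    m2 = mach * mach
    rho = rho0 * (gamma + 1.0) * m2 / ((gamma - 1.0) * m2 + 2.0)
    p = p0 * (1.0 + 2.0 * gamma / (gamma + 1.0) * (m2 - 1.0))
    c0 = math.sqrt(gamma * p0 / rho0)  # sound speed ahead of the shock
    speed = 2.0 * c0 / (gamma + 1.0) * (mach - 1.0 / mach)
    return rho, speed, p


rho_s, speed_s, p_s = post_shock_state(1.22)
v_s = -speed_s  # the shock travels to the left, so the velocity is negative
```

Under these assumptions the computed state agrees with $(\rho_S, v_S, p_S) = (1.3764, -0.3947, 1.5698)$ to the four decimals given in (7.8).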
In all the previous experiments we have used the global efficiency, computed as the percentage of integrations required by the AMR algorithm with respect to a fixed grid of equivalent size, as an indicator of the performance of the AMR algorithm. It has been argued that this is a good measure of the efficiency of the algorithm, because the integration algorithm has a much higher computational cost than other processes such as adaptation, interpolation and flux projection. The global efficiency is machine independent, is not affected by secondary processes running on the machine, such as input/output operations, and can be estimated without running the problem on a fixed grid. To end this section we compare the global efficiency with the actual percentage of time needed by the AMR algorithm with respect to a fixed grid algorithm, which is the measure of efficiency that matters in practice. Figure 7.30 shows a plot of the (wall-clock) percentage of time

Figure 7.23: Sample numerical solution of the shock-bubble interaction problem in 1D, t = 0.1. Panels: (a) density, (b) velocity, (c) pressure, each plotted against x for the fixed grid and the AMR solutions.

Figure 7.24: Zoomed regions for the sample numerical solution of the shock-bubble interaction problem in 1D, t = 0.1. Density.

Figure 7.25: Zoomed regions for the sample numerical solution of the shock-bubble interaction problem in 1D, t = 0.1. Velocity.

Figure 7.26: Zoomed regions for the sample numerical solution of the shock-bubble interaction problem in 1D, t = 0.1. Pressure.

Figure 7.27: Difference in the density field between the solution obtained with the AMR algorithm and the solution on an equivalent fixed grid, with respect to the tolerance τp (a) and to the percentage of integrations (b), for the shock-bubble interaction problem. Multi-component Euler equations (2.42) with initial data (7.7), t = 0.1. Both panels show the 1-norm, the 2-norm and the max-norm of the error.

required by the AMR algorithm, measured with the time command provided by Linux, against the global efficiency. We observe a linear relation between the percentage of time and the percentage of integrations, which is close to the optimal relation represented by a solid straight line in the plot. We conclude that the global efficiency gives a good estimate of the relative computational cost of the AMR algorithm. For the experiments shown in the plot, we have run the shock-bubble and shock-entropy wave interaction problems for different values of τp, and compared the time and the number of integrations they require with the same quantities for the algorithm applied on a fixed grid. To minimize the influence of the system, we use the average times over several runs, and we use setups with more points, so that the relative weight of the startup times is reduced. For the shock-bubble interaction problem we used a grid hierarchy of 5 levels, with a coarse grid of 100 points, which is equivalent to a fixed grid of 1600 points. For the shock-entropy wave problem we used 6 refinement levels with a coarsest grid of 60 points, equivalent to a fixed grid of 1920 points. The rest of the parameters were set to K = 0.5, τg = 8 and tc = 0.8 for both experiments.
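The equivalence between a grid hierarchy and a fixed grid quoted above can be checked with a few lines of Python. This is only a sketch: the function name `equivalent_points` is hypothetical, and it assumes a refinement factor of 2 between every pair of consecutive levels, which is consistent with the figures quoted for these tests.

```python
def equivalent_points(coarse_points, levels, factor=2):
    """Points per direction of the uniform grid matching the finest level.

    Assumption: the same refinement factor is used between every pair of
    consecutive levels, so the finest level is factor**(levels - 1) times
    finer than the coarsest one.
    """
    return coarse_points * factor ** (levels - 1)


# Hierarchies quoted in the text.
shock_bubble = equivalent_points(100, 5)   # 5 levels, coarse grid of 100 points
shock_entropy = equivalent_points(60, 6)   # 6 levels, coarse grid of 60 points
```

The same rule, applied per direction, reproduces the two-dimensional resolutions quoted later in this chapter.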

Figure 7.28: Numerical solution on the coarsest grid (50 points), for the shock-bubble interaction problem, with and without using flux projection. Multi-component Euler equations (2.42) with initial data (7.7), t = 0.1. Same setup as in Figure 7.23. Panels: (a) density, (b) velocity, (c) pressure, (d) mass fraction, each plotted against x.

Figure 7.29: 2-norm of the error (density) obtained with and without flux projection, with respect to the percentage of integrations. Shock-bubble interaction problem, t = 0.1.

Figure 7.30: Relation between the percentage of time and the global efficiency (percentage of integrations), for the shock-bubble and the shock-entropy wave interaction problems. The experimental data are plotted together with the line %time = %integrations.

7.2 Two-dimensional tests

Various problems for the Euler equations and the multi-component Euler equations in 2D are considered in this section. In section 7.2.1 we consider a Riemann problem whose initial data define four shock waves. We next address in section 7.2.2 the double Mach reflection problem, where a shock wave encounters a wedge. The interaction of a shock wave with a vortex is addressed in section 7.2.3. Finally, the 2D version of the shock-bubble interaction problem for the multi-component Euler equations, whose one-dimensional version was studied in section 7.1.4, is considered in section 7.2.4.

Schlieren-type images are used in some figures for the visualization of flow features. The discrete function
\[
s_{i,j} = \exp\left( -k \, \frac{|\nabla u_{i,j}|}{\max_{l,m} |\nabla u_{l,m}|} \right)
\]
is computed, using a discrete approximation of the gradient, and is plotted as a grayscale image, in which darker pixels correspond to larger variations of u. The quantity u represents here a physical variable, typically density or pressure. In the case of a single fluid k is a constant, and in the case of two fluids the value of k depends on the fluid, being higher for the lighter fluid. We refer to [118] for details.
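The schlieren function described above can be sketched as follows for the single-fluid case (constant k). The text only states that "a discrete approximation of the gradient" is used; the central differences (one-sided at the boundary) in this sketch are an assumption, as are the function name `schlieren` and its arguments.

```python
import math


def schlieren(u, dx, dy, k):
    """Return s[i][j] = exp(-k * |grad u|[i][j] / max |grad u|).

    `u` is a rectangular list of lists with at least two points per
    direction; u[i][j] is the value at x-index i and y-index j.
    Assumption: central differences in the interior, one-sided
    differences at the boundary.
    """
    nx, ny = len(u), len(u[0])
    grad = []
    for i in range(nx):
        im, ip = max(i - 1, 0), min(i + 1, nx - 1)
        row = []
        for j in range(ny):
            jm, jp = max(j - 1, 0), min(j + 1, ny - 1)
            ux = (u[ip][j] - u[im][j]) / ((ip - im) * dx)
            uy = (u[i][jp] - u[i][jm]) / ((jp - jm) * dy)
            row.append(math.hypot(ux, uy))
        grad.append(row)
    gmax = max(max(row) for row in grad)
    if gmax == 0.0:
        # Constant field: no variation, so every pixel gets the value 1.
        return [[1.0] * ny for _ in range(nx)]
    return [[math.exp(-k * g / gmax) for g in row] for row in grad]


# Small example: a step in the x direction on a 4 x 3 grid.
u = [[1.0 if i >= 2 else 0.0 for j in range(3)] for i in range(4)]
s = schlieren(u, dx=1.0, dy=1.0, k=10.0)
```

Values of s close to 1 mark regions where u is nearly constant, and the smallest values, bounded below by exp(−k), mark the steepest variations.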

7.2.1 A Riemann problem for the Euler equations

The first problem is a Riemann problem for the Euler equations (2.25)–(2.26), corresponding to configuration 3 of [155] (see also [101]). To define a Riemann problem in 2D, the domain, which is assumed here to be the unit square, is divided into four equal parts, as in Figure 7.31, and four constant states (ρk, uk, vk, pk), k ∈ {I, II, III, IV}, are considered as initial data, so that, at each interface between two parts, a single wave appears. In this particular case the initial data correspond to four shock waves and are given by
\[
\begin{array}{llll}
\rho_{I} = 1.5, & u_{I} = 0, & v_{I} = 0, & p_{I} = 1.5, \\
\rho_{II} = 0.5323, & u_{II} = 1.206, & v_{II} = 0, & p_{II} = 0.3, \\
\rho_{III} = 0.138, & u_{III} = 1.206, & v_{III} = 1.206, & p_{III} = 0.029, \\
\rho_{IV} = 0.5323, & u_{IV} = 0, & v_{IV} = 1.206, & p_{IV} = 0.3.
\end{array}
\tag{7.9}
\]

Figure 7.31: Sketch of the computational domain for a Riemann problem in 2D. The sketch shows the unit square split into the regions I, II, III and IV, with the axis marks 0, 3/4 and 1.

We solve this problem up to time t = 0.5. The parameters used for this simulation are τp = 10^{-4}, tc = 0.9 and K = 0.45. We show the results corresponding to two different grid hierarchies. Figure 7.32 shows a schlieren image of the density field for a grid hierarchy of 7 levels, with a coarse grid of 32 × 32 points, equivalent to a resolution of 2048 × 2048 points. A zoom of the central part is shown in Figure 7.33. These figures can be compared with the results obtained with a fixed grid of resolution 2048 × 2048, shown in Figures 7.34 and 7.35. Due to the high sensitivity of the solution to small variations in the parameters, the solutions are slightly different, but they have the same quality, with a comparable resolution of the features of the solution. The simulation requires 14% of the integrations required on a fixed grid and, running on 4 processors in parallel, takes a wall-clock time of 16496 seconds (around 4.6 hours). The result on a fixed grid was obtained with a single processor and took a wall-clock time of 326131 seconds (around 90.6 hours), which corresponds to an optimal timing of around 22.65 hours if it were run in parallel with 4 processors, as the AMR simulation was. Comparing these timings, we conclude that the AMR algorithm used (at most) 20.23% of the time used by the algorithm on a fixed grid, which leaves a percentage of time for parallelization and operations related to the AMR algorithm (adaptation, projection, interpolation,

Figure 7.32: Numerical schlieren image of the density field for the 4-shocks Riemann problem, computed with a grid hierarchy of 7 levels. Euler equations (2.25)–(2.26) with initial data (7.9). Time t = 0.5.

etc.) of only about 6%. We have repeated the same experiment with the same setup as before but with 8 refinement levels instead of 7. The plots in Figures 7.36 and 7.37 correspond, respectively, to the plots in Figures 7.32 and 7.33. In this case the global efficiency is 12.8%.
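The timing comparison for the 7-level run can be reproduced with the following arithmetic sketch. The variable names are hypothetical; the figures are the ones quoted in the text, and the fixed grid run is assumed to scale ideally (linearly) to 4 processors, as in the estimate above.

```python
amr_wall_s = 16496          # AMR run, 4 processors (wall-clock seconds)
fixed_wall_s = 326131       # fixed grid run, 1 processor (wall-clock seconds)
processors = 4
integrations_pct = 14.0     # global efficiency of the AMR run

# Assumption: ideal (linear) speedup of the fixed grid run on 4 processors.
fixed_parallel_s = fixed_wall_s / processors
fixed_parallel_h = fixed_parallel_s / 3600.0

# Share of the (ideal) fixed grid time used by the AMR run.
time_pct = 100.0 * amr_wall_s / fixed_parallel_s

# Time not explained by the integrations: parallelization and AMR overhead.
overhead_pct = time_pct - integrations_pct
```

Since an ideal speedup is the best case for the fixed grid run, the resulting share of time is an upper bound for the AMR algorithm, which is why the text quotes it as "at most".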

7.2.2 Double Mach reflection

This is a classical experiment that appears in [39]. We consider the Euler equations (2.25)–(2.26), and the problem consists of a vertical Mach 10

Figure 7.33: Zoom of the central part of Figure 7.32.

Figure 7.34: Numerical schlieren image of the density field for the 4-shocks Riemann problem, computed with a fixed grid of 2048 × 2048 points. Euler equations (2.25)–(2.26) with initial data (7.9). Time t = 0.5.

Figure 7.35: Zoom of the central part of Figure 7.34.

Figure 7.36: Numerical schlieren image of the density field for the 4-shocks Riemann problem, computed with a grid hierarchy of 8 levels. Euler equations (2.25)–(2.26) with initial data (7.9). Time t = 0.5.

Figure 7.37: Zoom of the central part of Figure 7.36.

Figure 7.38: Sketch of the computational domain for the double Mach reflection problem. The sketch marks the zones I and II, the angle α, the point x0 and the axis marks 0, 1 and 4.

shock wave that moves horizontally to the right and encounters a wedge at some point x0 on the x axis. This is equivalent to sending a diagonal shock wave into a straight tube with a reflecting wall. The reflection of the shock on the wall produces a jet of dense gas with a complicated structure. The domain used in this experiment is sketched in Figure 7.38, and the corresponding initial data is given by
\[
\begin{array}{llll}
\rho_{I} = 8.0, & u_{I} = 8.25\cos(\alpha), & v_{I} = -8.25\sin(\alpha), & p_{I} = 116.5, \\
\rho_{II} = 1.4, & u_{II} = 0, & v_{II} = 0, & p_{II} = 1.0,
\end{array}
\tag{7.10}
\]

where the zones I and II are the ones indicated in Figure 7.38. The angle α is the inclination of the shock with respect to the vertical. We consider the computational domain [0, 4] × [0, 1]. As in [39] we assume an inclination of the shock equal to π/6, and we set x0 = 1/4. The numerical solution (density) obtained by the AMR algorithm for time t = 0.2 is shown in Figure 7.39. It has been obtained with a grid hierarchy of 6 levels, with a coarse grid of 80 × 20 points, which is equivalent to a fixed grid of 2560 × 640 points. The simulation has been run with a single processor, and the parameters have been set to the values τp = 10^{-4}, tc = 0.9 and K = 0.45. Figure 7.40 shows a contour plot of a part of the solution. These figures can be compared with the solution obtained with a fixed grid of size 2560 × 640, shown in Figures 7.41 and 7.42. Both solutions give the same resolution, and only some small differences appear in the roll-up structure in Figures 7.40 and 7.42, due to the sensitivity of this structure to small perturbations. This simulation has a global efficiency of 28%.
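As an illustration of the initial data (7.10), the following sketch returns the state at a given point of the domain. The location of the shock front is an assumption of this sketch: zone I is taken to lie to the left of the straight line through (x0, 0) inclined by α with respect to the vertical, which is the usual setup for this test but is not spelled out in the text. The function name `double_mach_state` is hypothetical.

```python
import math


def double_mach_state(x, y, alpha=math.pi / 6, x0=0.25):
    """Return (rho, u, v, p) from (7.10) at the point (x, y).

    Assumption: zone I lies to the left of the straight shock front
    x = x0 + y * tan(alpha); zone II lies to its right.
    """
    if x < x0 + y * math.tan(alpha):
        # Zone I: state behind the Mach 10 shock.
        return (8.0, 8.25 * math.cos(alpha), -8.25 * math.sin(alpha), 116.5)
    # Zone II: gas at rest ahead of the shock.
    return (1.4, 0.0, 0.0, 1.0)


behind = double_mach_state(0.0, 0.5)
ahead = double_mach_state(3.0, 0.5)
```

With this choice of front, the velocity in zone I is normal to the shock, since the front direction (sin α, cos α) is orthogonal to (cos α, −sin α).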

Figure 7.39: Plot of the density field for the double Mach reflection problem, computed with the AMR algorithm using a grid hierarchy of 6 levels. Euler equations (2.25)–(2.26) with initial data (7.10). Time t = 0.2.

Figure 7.40: Contour plot of a part of the solution of Figure 7.39.

Figure 7.41: Plot of the density field for the double Mach reflection problem, computed with a fixed grid. Euler equations (2.25)–(2.26) with initial data (7.10). Time t = 0.2.

Figure 7.42: Contour plot of a part of the solution of Figure 7.41.

Figure 7.43: Sketch of the computational domain used in the shock-vortex interaction problem. The sketch marks the regions I to IV, the radii a and b, and the lines x = 0.25, x = 0.5 and y = 0.5.

7.2.3 Shock-vortex interaction

This experiment shows the interaction of a planar stationary shock with a rotating vortex. We have used the same setup as in [144]. The flow is modeled with the Euler equations (2.25)–(2.26). Initially a shock and a vortex are placed in the rectangle [0, 2] × [0, 1], so that they are isolated from each other. The vortex is modeled as a rotating circle with uniform vorticity and an outer annulus with oppositely directed uniform vorticity and opposite total circulation. In our tests the shock is initially located at x = 0.5 and the vortex has its center at (0.25, 0.5), with radius a = 0.075 for the inner circle and b = 0.175 for the outer circle. Figure 7.43 shows a sketch of the computational domain. The shock Mach number is denoted by Ms. At the left of the shock and outside the vortex (region II in Figure 7.43) we fix the initial conditions
\[
\rho_{II} = 1, \qquad u_{II} = \sqrt{\gamma}\, M_s, \qquad v_{II} = 0, \qquad p_{II} = 1.
\]

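The initial data for region I are derived from the standard conditions for a planar moving shock and are given by:
\[
\rho_{I} = \rho_{II} \frac{(\gamma + 1) M_s^2}{2 + (\gamma - 1) M_s^2}, \qquad
u_{I} = u_{II} \frac{2 + (\gamma - 1) M_s^2}{(\gamma + 1) M_s^2}, \qquad
v_{I} = 0, \qquad
p_{I} = p_{II} \frac{2 \gamma M_s^2 - \gamma + 1}{\gamma + 1}.
\]
These relations satisfy ρI uI = ρII uII, i.e., the mass flux is continuous across the stationary shock.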

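At the vortex, the (counterclockwise) angular velocity is given by
\[
v_r =
\begin{cases}
v_m \dfrac{r}{a} & \text{if } r \le a, \\[2mm]
v_m \dfrac{a}{a^2 - b^2} \left( r - \dfrac{b^2}{r} \right) & \text{if } a \le r \le b, \\[2mm]
0 & \text{if } r > b,
\end{cases}
\]
where
\[
r = \sqrt{(x - 0.25)^2 + (y - 0.5)^2}
\]
is the distance of a point (x, y) to the center of the vortex and vm is the maximum angular velocity. The velocity field inside the vortex is simply given by the components of the angular velocity added to the velocity of region II, i.e.
\[
u_{III,IV} = u_{II} - \sin(\theta)\, v_r, \qquad v_{III,IV} = v_{II} + \cos(\theta)\, v_r,
\]
where θ is the angle between the horizontal axis and the segment that joins the center of the vortex with the point (x, y). The density and pressure are given by the expressions [144]
\[
p_{III,IV} = p_{II} \left( \frac{T}{T_{II}} \right)^{\frac{\gamma}{\gamma - 1}}, \qquad
\rho_{III,IV} = \rho_{II} \left( \frac{T}{T_{II}} \right)^{\frac{1}{\gamma - 1}},
\]
where T is the temperature, which is given by
\[
T(r) = T_{II} \left[ 1 + (\gamma - 1) M_v^2 \frac{s}{(1 - s)^2} \left( \frac{s^2 - 1}{2 s} - \ln(s) \right) + (\gamma - 1) M_v^2 \frac{s_b - s}{2 s} \right]
\]
if r ≤ a, and by
\[
T(r) = T_{II} \left[ 1 + (\gamma - 1) M_v^2 \frac{s}{(1 - s)^2} \left( \frac{s_b^2 - 1}{2 s_b} - \ln(s_b) \right) \right]
\]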

Figure 7.44: Numerical schlieren image obtained with the density field, corresponding to the interaction of a weak shock (Ms = 1.1) with a strong vortex (Mv = 1.7). Time t = 0.6.

if a ≤ r ≤ b. The temperature TII is given by TII = pII / (ρII R), and the rest of the quantities are given by
\[
s = \frac{a^2}{b^2}, \qquad s_b = \frac{r^2}{b^2} \qquad \text{and} \qquad M_v = \frac{v_m}{\sqrt{\gamma R T_{II}}},
\]
the latter being a measure of the strength of the vortex. Note the similarity between the definition of Mv and the shock Mach number.

We have computed the solution for two configurations taken from [144]. In the first one, a weak shock (Ms = 1.1) interacts with a strong vortex (Mv = 1.7). The simulation has been run with one processor, setting τp = 10^{-4}, tc = 0.9, K = 0.45 and using a grid hierarchy of 6 levels, whose coarse grid is made of 200 × 100 points. This gives a resolution equivalent to a fixed grid of 6400 × 3200 points. Schlieren images obtained with the pressure field of the numerical solution, corresponding to time t = 0.6, are shown in Figures 7.44 and 7.45. The global efficiency for this case is 5.4%. The second configuration corresponds to a strong shock (Ms = 7) interacting with the same vortex. The setup is the same as in the previous case except for the number of levels, which has been set to 5, and the final time, which is t = 0.1. The grid hierarchy has the same resolution as a fixed grid of 3200 × 1600 points. In this case the solution has a much more complicated structure, shown in Figures 7.46 and 7.47. This example requires 17.2% of the integrations needed on a fixed grid.

Figure 7.45: A closeup view of a part of Figure 7.44.
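The vortex model of this section (angular velocity, temperature, and the density and pressure derived from it) can be sketched as follows. The function names are hypothetical, the value γ = 1.4 is an assumption (the text does not restate it here), and the state outside the vortex (r > b) is taken to be that of region II.

```python
import math

A, B = 0.075, 0.175   # inner and outer radii used in the text


def vortex_speed(r, v_m, a=A, b=B):
    """Angular velocity v_r at distance r from the vortex center."""
    if r <= a:
        return v_m * r / a
    if r <= b:
        return v_m * a / (a * a - b * b) * (r - b * b / r)
    return 0.0


def vortex_temperature(r, T_II, M_v, gamma=1.4, a=A, b=B):
    """Temperature T(r); outside the vortex the region II value is assumed."""
    if r > b:
        return T_II
    s = (a / b) ** 2
    s_b = (r / b) ** 2
    c = (gamma - 1.0) * M_v ** 2

    def bracket(z):
        return s / (1.0 - s) ** 2 * ((z * z - 1.0) / (2.0 * z) - math.log(z))

    if r <= a:
        return T_II * (1.0 + c * bracket(s) + c * (s_b - s) / (2.0 * s))
    return T_II * (1.0 + c * bracket(s_b))


def vortex_density_pressure(T, T_II, rho_II=1.0, p_II=1.0, gamma=1.4):
    """Density and pressure obtained from the temperature ratio T / T_II."""
    ratio = T / T_II
    return (rho_II * ratio ** (1.0 / (gamma - 1.0)),
            p_II * ratio ** (gamma / (gamma - 1.0)))
```

Both profiles are continuous at r = a, the velocity vanishes at r = b and the temperature recovers TII there, so the vortex matches the surrounding region II state at its outer edge.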

7.2.4 Shock-bubble interaction

In this section we address the shock-bubble interaction problem in its two-dimensional version. The governing equations for this problem are the multi-component Euler equations (2.43), considered in the rectangle [0, 0.890] × [0, 0.089]. The helium bubble is a circle with center (0.42, 0.0445) and radius r = 0.025. A vertical Mach 1.22 shock wave, initially located at x = 0.6675, is moving left through air, see Figure 7.48. The initial data are the following:
\[
U(\mathbf{x}, 0) = U_0(\mathbf{x}) =
\begin{cases}
U_A & \text{if } 0 \le x < 0.6675 \text{ and } \mathbf{x} \notin B, \\
U_B & \text{if } \mathbf{x} \in B, \\
U_S & \text{if } 0.6675 \le x \le 0.890,
\end{cases}
\qquad \mathbf{x} = (x, y),
\]

where B = {(x, y) ∈ R^2 : (x − 0.42)^2 + (y − 0.0445)^2 ≤ 0.025^2} is the circle of center (0.42, 0.0445) and radius r = 0.025, which represents the helium bubble. The state UA represents quiescent air, UB represents helium contaminated with 28% of air, in equilibrium with the surrounding air, and US is the state connected with quiescent air by a vertical left-moving

Figure 7.46: Numerical schlieren image obtained with the density field, corresponding to the interaction of a strong shock (Ms = 7) with a strong vortex (Mv = 1.7). Time t = 0.1.

Figure 7.47: A closeup view of Figure 7.46.

Figure 7.48: Computational domain of the two-dimensional experiment (not to scale). The sketch is annotated with the shock and with the lengths 89 mm, 50 mm, 222.5 mm, 222.5 mm and 890 mm.

1.22 Mach shock wave. The respective values of UA, UB and US are:
\[
\begin{aligned}
U_A &= (\rho_A, u_A, v_A, p_A, \phi_A) = (1.225, 0, 0, 101325, 1), \\
U_B &= (\rho_B, u_B, v_B, p_B, \phi_B) = (0.2228, 0, 0, 101325, 0), \\
U_S &= (\rho_S, u_S, v_S, p_S, \phi_S) = (1.6861, -156.26, 0, 250638, 1).
\end{aligned}
\]
Note that, in contrast with the one-dimensional experiment in Section 7.1.4, we indicate here the physical values of the magnitudes, with no normalization, so that our results can be compared with the ones in other sources, like [60, 142, 118]. We have used a coarse mesh of 640 × 32 cells to discretize the upper half of the computational domain. To obtain the lower part by symmetry we impose artificial reflecting boundary conditions, following the same approach as Quirk and Karni [142] and Marquina and Mulet [118]. Six levels of refinement with all refinement factors set to 2 have been used to obtain a resolution equivalent to a fixed grid of 20480 × 1024 cells. In this experiment we have used the following parameters: the CFL number has been set to K = 0.45, the refinement parameter is τp = 10^{-4} and the clustering parameter is tc = 0.7. The simulation has been run up to time 1.7385 · 10^{-3}. With this setup the AMR algorithm performs 10.08% of the integrations required by a fixed grid algorithm. The execution has used 8 processors, with an execution time of 1856450 seconds ≈ 21.5 days. This leads to an estimate of the time needed on a fixed grid of around 7 months. In Figure 7.49 we display the bubble at several stages of the execution, as computed with the AMR algorithm. The times indicated correspond to the time elapsed since the shock arrives at the bubble.
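The estimate of the fixed grid time can be reproduced as follows. This is a sketch under two assumptions that the text does not state explicitly: the execution time is taken to be proportional to the number of integrations (as observed in Figure 7.30 for the one-dimensional tests), with the AMR overhead neglected, and the fixed grid run is assumed to use the same 8 processors. The symbols t_AMR and t_fixed are introduced here only for illustration.

```latex
\[
t_{\mathrm{AMR}} = 1856450\ \mathrm{s} = \frac{1856450}{86400}\ \mathrm{days} \approx 21.5\ \mathrm{days},
\qquad
t_{\mathrm{fixed}} \approx \frac{t_{\mathrm{AMR}}}{0.1008} \approx 213\ \mathrm{days} \approx 7\ \mathrm{months}.
\]
```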

Figure 7.49: Numerical schlieren images of the density field of the shock-bubble interaction problem for different times: (a) t = 4 µs, (b) t = 32 µs, (c) t = 143 µs, (d) t = 250 µs, (e) t = 326 µs, (f) t = 386 µs, (g) t = 528 µs, (h) t = 594 µs, (i) t = 812 µs, (j) t = 993 µs, (k) t = 1123 µs, (l) t = 1203 µs.

8 Conclusions and further work

8.1 Conclusions

In this work we describe a numerical method for the solution of hyperbolic systems of conservation laws. Our method results from the combination of a high order shock capturing scheme (built from Shu-Osher's conservative formulation, a fifth order WENO interpolatory technique, Donat-Marquina's flux-splitting method, and a third order Runge-Kutta algorithm) with the AMR technique developed by Berger and collaborators. We show how all these techniques can be merged to build a highly efficient numerical method, and we describe the practical implementation of the algorithm as a sequential or parallel computer program.

We have tested the algorithm with several one- and two-dimensional experiments, which show that our method is able to obtain solutions with the same quality as those obtained without adaptation, but with a much smaller computational time. The extensive testing gives insight into the properties of the algorithm that can be useful in practice to assess the potential gains that it can provide, as well as its behavior with respect to the parameters involved. With the help of the experiments, we have explained several issues related to the AMR algorithm, in particular

• the behavior of the adaptive method with respect to the same integration algorithm applied on a fixed grid, in terms of the difference between solutions,

• the influence that the refinement procedure used in the adaptation stage has on the quality of the final result and on the performance obtained by the adaptive algorithm, and

• the role of the projection from fine to coarse in the algorithm.

We also present in the appendix a description of the AMR algorithm that is much more general than the current descriptions found in the scientific literature and tries to approach the foundations of the running algorithms that are described and implemented in practice.

8.2 Further work

Although the performance of the algorithm is satisfactory, we have detected several aspects, mostly related to the implementation, that can improve the efficiency of the algorithm in some cases. In particular, the overhead produced by the AMR algorithm can be reduced by using fast search algorithms in the grid hierarchy, so that the cost of finding mesh connectivity is reduced. The grid connectivity is used in various processes, like computing numerical solutions in the cells at the pads of the grid patches, or the creation of a numerical solution for an adapted grid. Along the same lines, in some cases it can be useful to merge small patches into bigger ones, wherever this is possible, without relaxing the clustering parameter. This leads to a reduction in the number of cells in the pads of the patches, thus improving the overall efficiency of the algorithm, but a cost has to be paid in order to find adjacent patches that can be merged and to merge them. Some more effort could be invested in the refinement criterion based on interpolation errors to make it less dependent on the choice of the threshold.

The clearest extension of the method is the implementation in 3D. Although the 3D version of the method can be easily described, as is partly done in appendix A, and it could seem a minor task to produce a 3D version from the 2D one, the amount of details to be taken into account for making such a code run in practice is so large that we are not currently considering its implementation. Instead, we aim to use our method for other interesting 2D problems, in particular hyperbolic systems of balance laws, where a source term is present (work in preparation), and problems where the characteristic structure may not be fully available analytically, like traffic flow problems (see [45]). Another possible extension is the combination of penalization methods [25] with the AMR algorithm.


A A generic description of the AMR algorithm

The adaptive mesh refinement algorithm is a general purpose framework for the efficient numerical integration of hyperbolic systems of equations. The algorithm was first developed by Berger in [21] and in the joint works with Oliger [23, 24] and Colella [22]. A simplified version was described by Quirk in [139]. Writing an AMR code is a challenging task. Most of the existing AMR implementations are designed for numerical methods based on the classical finite-volume approach, and the original developments of the algorithm were also conceived for finite-volume methods. Since the first descriptions of the algorithm, decisions regarding the optimal way to implement an AMR algorithm have been discussed and taken in the literature. The descriptions of the AMR technique typically focus on the particular implementation made in each text, but a general description of the AMR algorithm is missing.

Our description of the AMR algorithm has followed so far the same approach since, for the sake of clarity, we have centered our description on the way used in the implementation. In this chapter we describe an AMR algorithm from a more general point of view, trying to distinguish between an AMR algorithm and an AMR implementation, which is the particular AMR algorithm that results from choosing particular types of grids, integration algorithms, adaptation techniques, etc. The goal of this chapter is to describe the algorithm in wider generality, in order to shed some light on its inner workings, with a view in particular to those readers who are interested in implementing an AMR algorithm in a way different from the common approaches that exist in the literature. After some particular choices on the different elements that compose the AMR algorithm, our algorithm can be obtained.

Solving hyperbolic systems of conservation laws advantageously with the AMR approach requires an efficient combination of several operations, performed on grids and on the numerical solutions defined on them. We will describe the main ingredients required to build such an algorithm using an AMR grid infrastructure, namely integration, adaptation and projection, for both the general case and the case of Cartesian grids, with special emphasis on the case of uniform cell-refined grids, which are the grids used in our implementation. An AMR algorithm can be built using these pieces and the pseudo-code included in chapter 5 (in particular the algorithm in Fig. 5.9). We start with the description of the grid and grid structures that lie at the basis of the algorithm.

A.1 Grid system

The AMR algorithm uses a hierarchical grid system composed of grids with different levels of resolution. The coarsest grid covers the whole computational domain, and grids with smaller cells are overlapped where more refinement is needed. More grids, with smaller and smaller cells, can be overlapped in turn over parts of the coarser grids, until the desired resolution is achieved. These grids are independent, in the sense that they can be, to some extent, handled in isolation, but some coherence has to be enforced between grids of different resolution, since they can cover the same spatial locations, and therefore different solutions corresponding to the same locations can exist. Although AMR grid hierarchies have not been formally introduced yet, for illustrative purposes we show an example of such a hierarchy in Fig. A.1. A similar example is shown in Fig. 6.1.

Figure A.1: A sample three-level AMR grid hierarchy, with grids G0, G1 and G2.

The goal of this section is to describe the requirements of the sets of grids that form the basis of an AMR algorithm. We start with the definition of grids, which are the simplest elements in the framework.

Definition 9. Given an open bounded set $\Omega \subset \mathbb{R}^d$, a (complete) grid defined on $\Omega$ is a family of closed sets $A := \{c_i\}_{i \in I}$ such that:
\[
c_i = \overline{\mathring{c}_i} \quad \forall i \in I, \qquad
\mathring{c}_i \cap \mathring{c}_j = \emptyset \quad \forall i, j \in I,\ i \neq j, \qquad
\bigcup_{i \in I} c_i = \bar{\Omega},
\]
where $I \subset \mathbb{N}$ is a set of indexes, $\bar{\Omega}$ is the closure of $\Omega$ and $\mathring{c}_i$ is the interior of the set $c_i$. A subset of a grid $A$ will be called a subgrid. For practical purposes both complete grids and subgrids will often be called grids henceforth, making an explicit distinction when required. The set of all complete grids defined on $\Omega$ will be denoted by $\mathcal{A}_c(\Omega)$, and the set of all subgrids defined on $\Omega$ by $\mathcal{A}(\Omega)$. Obviously $\mathcal{A}_c(\Omega) \subset \mathcal{A}(\Omega)$.

Definition 10. Given $A \in \mathcal{A}(\Omega)$, we define the maximum grid size of $A$ as
\[
|A|_{\max} = \max\{|c_i| : c_i \in A\},
\]

204

A.1. Grid system

where $|c_i|$ denotes the (Lebesgue) measure in $\mathbb{R}^d$ of $c_i$. We define the minimum grid size of $A$ by $|A|_{\min} = \min\{|c_i| : c_i \in A\}$. We say that $A$ is uniform if $|A|_{\min} = |A|_{\max}$, and in this case we denote $|A| = |A|_{\min} = |A|_{\max}$. The definition of grid size gives a precise meaning to the word resolution: resolutions are compared by comparing grid sizes. This motivates the next definition.

Definition 11. Let $A_1, A_2 \in \mathcal{A}(\Omega)$. If $|A_1|_{\max} < |A_2|_{\min}$ we say that $A_1$ is finer than $A_2$, or equivalently that $A_2$ is coarser than $A_1$. If $A_1$ is finer than $A_2$ we will denote this fact by writing $A_1 < A_2$, or $A_2 > A_1$.

Our aim is to define a grid hierarchy, composed of several subgrids, so that we obtain more accurate (better resolved) solutions as the resolution increases (i.e., the grid size decreases). This concept is introduced in the next definition.

Definition 12. Given $L > 1$ and an open bounded set $\Omega \subset \mathbb{R}^d$, an L-level grid hierarchy defined on $\Omega$ is a set of $L$ subgrids $\mathcal{A}^L = \{A_0, \dots, A_{L-1}\}$, with $A_\ell = \{c^\ell_i\}_{i \in I_\ell} \in \mathcal{A}(\Omega)$, such that the following conditions are verified:
\[ A_\ell < A_{\ell-1}, \quad 1 \le \ell \le L-1, \tag{A.1} \]
\[ \bigcup_{i \in I_0} c^0_i = \bar{\Omega}. \tag{A.2} \]
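As an illustration of Definitions 10 and 11 and of condition (A.1), the following minimal sketch works with one-dimensional cells stored as pairs of end points. The representation and all function names are assumptions made for this example only; they are not part of the framework described in the text, and condition (A.2) is not checked.

```python
from typing import List, Tuple

Cell = Tuple[float, float]   # a 1D closed cell [a, b] (illustrative representation)
Grid = List[Cell]            # a (sub)grid is just a list of cells here


def measure(cell: Cell) -> float:
    """Lebesgue measure of a 1D cell."""
    return cell[1] - cell[0]


def max_size(grid: Grid) -> float:
    """Maximum grid size |A|_max of Definition 10."""
    return max(measure(c) for c in grid)


def min_size(grid: Grid) -> float:
    """Minimum grid size |A|_min of Definition 10."""
    return min(measure(c) for c in grid)


def is_uniform(grid: Grid, tol: float = 1e-12) -> bool:
    """A grid is uniform when its minimum and maximum sizes coincide."""
    return max_size(grid) - min_size(grid) <= tol


def is_finer(a1: Grid, a2: Grid) -> bool:
    """A1 < A2 in the sense of Definition 11."""
    return max_size(a1) < min_size(a2)


def satisfies_a1(levels: List[Grid]) -> bool:
    """Condition (A.1): every level is finer than the previous one."""
    return all(is_finer(levels[l], levels[l - 1]) for l in range(1, len(levels)))


coarse = [(0.0, 0.5), (0.5, 1.0)]    # complete grid on (0, 1)
fine = [(0.25, 0.5), (0.5, 0.75)]    # subgrid with smaller cells
```

With these two grids, `is_finer(fine, coarse)` holds and `satisfies_a1([coarse, fine])` is true, whereas reversing the order of the levels violates (A.1).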

An L-level complete grid hierarchy defined on $\Omega$ is a set of $L$ complete grids $\mathcal{G}^L = \{G_0, \dots, G_{L-1}\}$ defined on $\Omega$ that verify condition (A.1) above (note that condition (A.2) is automatically satisfied for complete grids). Condition (A.1) means that the grids get finer as the level increases. Condition (A.2) is necessary in order to be able to solve the PDE in the whole domain of definition of the equation at the coarsest resolution, so that every grid, except the coarsest one, overlaps a grid coarser than it. This property is necessary in order to handle the different procedures that involve data exchange between grids of different resolutions. Although an AMR algorithm can be, at least formally, built over a grid hierarchy defined as in Definition 12, this could lead to an excessive size ratio between cells of different grids corresponding to the same spatial location. It is more convenient to work with grid hierarchies where each grid is fully contained in its coarser grid. Data transfers between grids


that do not correspond to consecutive refinement levels are performed in cascade, passing through every intermediate grid. This approach reduces undesirable effects such as high interpolation errors, which would ultimately lead to a loss of accuracy of the algorithm.

Definition 13. Given an L-level grid hierarchy $\mathcal{A}^L = \{A_0, \dots, A_{L-1}\}$ defined on an open bounded set $\Omega \subset \mathbb{R}^d$, with $A_\ell = \{c^\ell_i\}_{i \in I_\ell}$ ($0 \le \ell \le L-1$), we say that $\mathcal{A}^L$ is nested if it verifies:
\[ \bigcup_{i \in I_\ell} c^\ell_i \subseteq \bigcup_{j \in I_{\ell-1}} c^{\ell-1}_j, \quad 1 \le \ell \le L-1. \tag{A.3} \]

From now on, we will assume that every grid hierarchy is nested. We now further restrict the discussion to a class of grid hierarchies where each grid is obtained from the coarser one by means of cell subdivision. Almost every AMR algorithm in the literature is based on this kind of grid hierarchy.

Definition 14. Let $\mathcal{A}^L = \{A_0, \dots, A_{L-1}\}$ be a nested L-level grid hierarchy, with $A_\ell = \{c^\ell_i\}_{i \in I_\ell}$ ($0 \le \ell \le L-1$). We say that $\mathcal{A}^L$ is cell-refined if for $1 \le \ell \le L-1$ the following condition holds: let $c^{\ell-1}_j \in A_{\ell-1}$ be such that the set $I^j_\ell := \{i \in I_\ell : \mathring{c}^\ell_i \cap \mathring{c}^{\ell-1}_j \neq \emptyset\}$ is nonempty. Then
\[ \bigcup_{i \in I^j_\ell} c^\ell_i = c^{\ell-1}_j. \tag{A.4} \]
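Condition (A.4) can be checked mechanically in simple cases. The sketch below does so for one-dimensional cells stored as pairs of end points; the representation and the function names are assumptions of this example, and nestedness (A.3) is taken for granted rather than verified.

```python
from typing import List, Tuple

Cell = Tuple[float, float]   # a 1D closed cell [a, b] (illustrative representation)


def interiors_meet(c: Cell, d: Cell) -> bool:
    """True if the open intervals associated with c and d intersect."""
    return max(c[0], d[0]) < min(c[1], d[1])


def is_cell_refined_pair(fine: List[Cell], coarse: List[Cell]) -> bool:
    """Check (A.4) for two consecutive levels of a 1D hierarchy.

    For every coarse cell whose interior is met by some fine cell, the fine
    cells meeting it must tile the coarse cell exactly.
    """
    for cj in coarse:
        children = sorted(c for c in fine if interiors_meet(c, cj))
        if not children:
            continue  # the index set I_l^j is empty: nothing to check
        # The union must start and end exactly at the coarse cell boundaries ...
        if children[0][0] != cj[0] or children[-1][1] != cj[1]:
            return False
        # ... and consecutive fine cells must share an end point (no gaps).
        for left, right in zip(children, children[1:]):
            if left[1] != right[0]:
                return False
    return True


coarse = [(0.0, 0.5), (0.5, 1.0)]
fine_ok = [(0.5, 0.75), (0.75, 1.0)]    # subdivides the second coarse cell
fine_bad = [(0.25, 0.5), (0.5, 0.75)]   # covers only half of each coarse cell
```

Here `fine_ok` is obtained by subdividing one coarse cell, so the check succeeds, while `fine_bad` straddles two coarse cells without covering either of them and the check fails.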

The set $I^j_\ell$ is a subset of $I_\ell$ that represents the indexes of the cells of the grid $A_\ell$ that intersect the interior of the cell $c^{\ell-1}_j$. Condition (A.4), along with the nestedness of the grid hierarchy, means that the refinement is performed following a cell-based approach: cells belonging to $A_\ell$ are obtained by subdivision of cells belonging to $A_{\ell-1}$. The grid hierarchy shown in Fig. A.2 is cell-refined. Fig. A.1 shows a cell-refined grid hierarchy composed of uniform grids. Cell-refined grids are important if we aim to use the AMR technology to numerically solve a hyperbolic system by means of a conservative scheme. In such numerical methods, the solution at a cell is updated using approximations of the fluxes that cross the cell interfaces. If the grid hierarchy is cell-refined, it follows from (A.4) that, if $c^{\ell-1}_j$ is such that $I^j_\ell \neq \emptyset$, then
\[ \partial c^{\ell-1}_j \subset \bigcup_{i \in I^j_\ell} \partial c^\ell_i, \tag{A.5} \]

Figure A.2: A sample three-level AMR cell-refined grid hierarchy (levels G0, G1 and G2).

where $\partial c$ indicates the boundary of the set $c$. Property (A.5) suggests that flux approximations computed on a grid can be used to update the solution at its coarser grid, so that conservation between grids is enforced, as was done in Sections 5.4 and 6.1.4 when building the projection operator.

Once the basic definitions for grids have been introduced, we assign to each grid a discrete set of data, which conceptually corresponds to the numerical approximation to the solution of a hyperbolic system of conservation laws of the form
\[ u_t + \sum_{q=1}^{d} \frac{\partial f^q(u)}{\partial x_q} = 0, \quad x \in \mathbb{R}^d, \quad t \in \mathbb{R}^+. \tag{A.6} \]

The pairs formed by a grid and its associated numerical solution, and the combinations of these pairs into hierarchies, will be the basic elements used to build an AMR algorithm.

Definition 15. Given a grid $A = \{c_i\}_{i \in I} \in \mathcal{A}(\Omega)$, a numerical solution of (A.6) in $A$ is an application $U : A \longrightarrow \mathbb{R}^m$. The images of the elements of $A$ through $U$ are denoted by $U(c_i) = U_i$. Both the application $U$ and its image $U(A) = \{U_i\}_{i \in I}$ will be referred to as the numerical solution. The pair $(A, U)$ will be collectively denoted by $M = \{m_i\}_{i \in I}$, where $m_i = (c_i, U_i)$, and will be called a computational grid. The set of all pairs $(A, U)$,


where $A \in \mathcal{A}(\Omega)$ and $U$ is a numerical solution for $A$, will be denoted by $\mathcal{M}(\Omega)$.

Definition 16. $\mathcal{M}_c(\Omega) = \{M = (A, U) \in \mathcal{M}(\Omega) : A \in \mathcal{A}_c(\Omega)\}$. The elements of $\mathcal{M}_c(\Omega)$ will be called complete computational grids.

We introduce now the concept of computational grid hierarchy, which is simply a grid hierarchy with its respective associated numerical solutions.

Definition 17. An L-level computational grid hierarchy is a sequence of $L$ computational grids $\{M_0, \dots, M_{L-1}\} = \{(A_0, U_0), \dots, (A_{L-1}, U_{L-1})\}$ such that $\{A_0, \dots, A_{L-1}\}$ is an L-level grid hierarchy. The set of all L-level computational grid hierarchies will be denoted by $\mathcal{M}^L(\Omega)$, and by $\mathcal{M}^L_c(\Omega)$ if all the grids in the grid hierarchy are complete grids, i.e.
\[ \mathcal{M}^L_c(\Omega) = \left\{ \{(A_0, U_0), \dots, (A_{L-1}, U_{L-1})\} \in \mathcal{M}^L(\Omega) : A_\ell \in \mathcal{A}_c(\Omega),\ 0 \le \ell \le L-1 \right\}. \]

From now on we will omit the qualifier computational when referring to any type of grid, making distinctions, for example, between ordinary and computational grids or between complete grids and subgrids, only when confusion is possible. An implicit distinction will sometimes be made by indicating the set to which the grid belongs.

So far we have defined the basic mathematical structures needed for the solution of (A.6) using an AMR algorithm. We define next some operations that represent the basic elements of any AMR algorithm. We start by establishing relationships between subgrids and complete grids. The basic elements are, on the one hand, embedding operators, which relate a given subgrid to a complete grid, and, on the other hand, restriction operators, which obtain subsets of a complete grid. The goal of the restriction and embedding operators is to simplify the description of the other operators involved in the AMR algorithm. Once the operations that relate subgrids and complete grids have been defined, the basic operators can be described for complete grids, obtained through appropriate embeddings, and the result of the operation performed on the complete grid is transferred back to the subgrid using a restriction operator. This approach of considering embedding and restriction operators as auxiliary procedures for the definition of operations on subgrids is a purely abstract construction, made with the sole goal of clarifying how operations like integration or cell marking are applied to subgrids. The idea is to pay no regard, by acting on complete grids only, to the fact that subgrids

[Figure A.3 panels: a complete grid M with cells numbered 1 to 56, and the subgrid R(M, {9, 27, 34, 36, 37, 40, 48}) obtained from it through R.]

Figure A.3: An example of a restriction operator R in 2D (only the grids are depicted).

can have arbitrary geometry and need information from a surrounding band (whose size depends on the particular operation) around them, as commented in Section 5.1. Of course, for efficiency reasons, the practical implementation of the algorithm does not follow this approach, and acts directly on subgrids.

Definition 18. An embedding operator is any application $E : \mathcal{M}(\Omega) \longrightarrow \mathcal{M}_c(\Omega)$ such that if $(A, U) \in \mathcal{M}(\Omega)$ and we denote $(\tilde{A}, \tilde{U}) = E(A, U)$ then $A \subset \tilde{A}$ and $\tilde{U}$ is an extension of $U$ as an application.

We will use the following abuse of notation: if $(\tilde{A}, \tilde{U}) = E(A, U)$ we will denote $\tilde{A} = E(A)$ and $\tilde{U} = E(U)$. Note that if $M \in \mathcal{M}_c(\Omega)$ then $E(M) = M$.

Definition 19. A restriction operator is an application $R : \mathcal{M}_c(\Omega) \times \mathcal{P}(\mathbb{N}) \longrightarrow \mathcal{M}(\Omega)$ defined by $R(\{m_i\}_{i \in I}, I_1) = \{m_j\}_{j \in I \cap I_1}$, where $\{m_i\}_{i \in I} \in \mathcal{M}_c(\Omega)$ and $I_1 \in \mathcal{P}(\mathbb{N})$.

We shall use the same abuse of notation as with the embedding operator: if $(\hat{A}, \hat{U}) = R(A, U, I_1)$ we will denote $\hat{A} = R(A, I_1)$ and $\hat{U} = R(U, I_1)$. An example of a restriction operator with a two-dimensional grid is shown in Fig. A.3.

Remark 3. Note that, if $E$ is an embedding and $R$ is a restriction, then for each $M \in \mathcal{M}(\Omega)$, with $M = \{m_i\}_{i \in I}$, one has $R(E(M), I) = M$. Another remark is that the embedding operator is not unique and that different


operations involved in the update process can be defined through different embeddings, according to the particular characteristics of the operator. These concepts are illustrated in Fig. A.4.

Now, given an operator $P_c : \mathcal{M}_c(\Omega) \longrightarrow \mathcal{M}_c(\Omega)$ that acts on complete grids, an embedding $E$ and a restriction $R$, we can define an operator $P : \mathcal{M}(\Omega) \longrightarrow \mathcal{M}(\Omega)$ by setting:
\[ P = R \circ P_c \circ E. \tag{A.7} \]

This construction is illustrated in Fig. A.5.
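The construction (A.7) can be sketched in a few lines. In the following hedged example a computational grid is reduced to a dictionary from cell indexes to scalar values (the geometry is omitted), the embedding fills the missing cells of a known complete index set with a user-supplied rule, and the complete-grid operator is a simple neighbour average on a periodic 1D grid. All names, the fill rule and the choice of operator are assumptions of this example.

```python
from typing import Callable, Dict, Iterable

Grid = Dict[int, float]   # cell index -> numerical value (geometry omitted)


def restrict(m: Grid, indices: Iterable[int]) -> Grid:
    """Restriction R(M, I1): keep the cells whose index lies in I1."""
    wanted = set(indices)
    return {i: u for i, u in m.items() if i in wanted}


def make_embedding(n_cells: int, fill: Callable[[int], float]) -> Callable[[Grid], Grid]:
    """Build an embedding E into the complete grid with cells 0..n_cells-1.

    Values already present in the subgrid are kept (E extends U); the
    remaining cells receive fill(i), a hypothetical stand-in for whatever
    rule a concrete operation requires.
    """
    def embed(m: Grid) -> Grid:
        return {i: (m[i] if i in m else fill(i)) for i in range(n_cells)}
    return embed


def lift(pc: Callable[[Grid], Grid], embed: Callable[[Grid], Grid]) -> Callable[[Grid], Grid]:
    """Operator P = R o Pc o E of (A.7), restricted to the indexes of M."""
    def p(m: Grid) -> Grid:
        return restrict(pc(embed(m)), m.keys())
    return p


def neighbour_average(mc: Grid) -> Grid:
    """Example complete-grid operator Pc: average of the two periodic neighbours."""
    n = len(mc)
    return {i: 0.5 * (mc[(i - 1) % n] + mc[(i + 1) % n]) for i in mc}


embed = make_embedding(6, fill=lambda i: 0.0)
M = {2: 1.0, 3: 3.0}                 # a subgrid with two cells
P = lift(neighbour_average, embed)
result = P(M)                        # {2: 1.5, 3: 0.5}
```

The identity of Remark 3 can be observed directly: `restrict(embed(M), M.keys())` returns `M` unchanged, because the embedding only adds cells and the restriction removes exactly those.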

A.2 Grid adaptation

The goal of the adaptation process is to track the moving features of the numerical solution as the integration advances it in time. Given a subgrid $A$, after some iterations of the integration algorithm, it is necessary to change $A$ so that it adapts to the updated numerical solution. This is precisely the process carried out by the adaptation algorithm. As we will see later on, a correct grid adaptation strategy is absolutely essential for any AMR algorithm to make sense. In this section we will denote by $\tilde{A}$ the subgrid obtained from $A$ by adaptation. The adaptation consists of two major processes: on the one hand, a procedure that decides which cells will compose the grid $\tilde{A}$ is required. This process will be referred to as the flagging or marking procedure. On the other hand, once the composition of the new grid has been decided, the new cells have to be filled with a numerical solution. The only important restriction when constructing these building blocks is that the resulting grids have to form a grid hierarchy in the sense of Definition 12. These two leading processes are described next.

A marking operator, which decides which cells of a subgrid need refinement, is one of the most important components of the AMR algorithm. It allows the algorithm to ensure that discontinuities never escape from a fine grid into a coarser one, which would lead to a loss of accuracy precisely in the region where the AMR algorithm is supposed to provide better resolution through refinement. The actual procedures by means of which one decides whether a cell is refined or not are diverse, and our choices were studied in Section 5.5. We give the following generic definition of a marking operator:

[Figure A.4 panels: a subgrid M; a complete grid M2 with cells numbered 1 to 56, obtained through the embedding E2; a complete grid M1 with cells numbered 1 to 34, obtained through the embedding E1; restrictions R lead from M1 and M2 back to M.]

Figure A.4: An example of the combination of restriction and embedding. In the figure, E1 (M ) = M1 and E2 (M ) = M2 . M = R(M1 , {9, 27, 34, 36, 37, 40, 48}) = R(M3 , {4, 17, 19, 21, 22, 25, 31}) (only the grids are depicted).

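[Figure A.5 diagram: E maps $\mathcal{M}(\Omega)$ up to $\mathcal{M}_c(\Omega)$, $P_c$ maps $\mathcal{M}_c(\Omega)$ to $\mathcal{M}_c(\Omega)$, and R maps $\mathcal{M}_c(\Omega)$ back down to $\mathcal{M}(\Omega)$; the bottom arrow is P from $\mathcal{M}(\Omega)$ to $\mathcal{M}(\Omega)$.]

Figure A.5: Construction of an operator P between subgrids through an operator Pc acting on complete grids, an embedding operator E and a restriction operator R.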

Definition 20. A marking operator is an application
\[ F_c : \mathcal{M}_c(\Omega) \longrightarrow \mathcal{P}(\mathbb{N}). \tag{A.8} \]

Note that $F_c$ acts on complete grids. A marking operator for elements of $\mathcal{M}(\Omega)$, that we denote by $F$, can be defined through a construction similar to the one shown in Fig. A.5: if $E$ is an embedding we define:
\[ F(M) := F_c(E(M)). \tag{A.9} \]

A marking operator is therefore an application that, having a grid as input, returns a set of indices corresponding to cells that need to be included in the subgrid, according to a certain criterion. Once the desired cells have been selected for refinement using a marking operator, a new grid is created from those cells. We formalize this fact in the following definition:

Definition 21. Given a marking operator $F : \mathcal{M}(\Omega) \longrightarrow \mathcal{P}(\mathbb{N})$, the adaptation operator associated to it is the application $\mathrm{Ad} : \mathcal{M}(\Omega) \longrightarrow \mathcal{M}(\Omega)$, defined by
\[ \mathrm{Ad}(M) = R(E(M), F(M)), \quad \forall M \in \mathcal{M}(\Omega), \]
where $E$ is the embedding used to define $F$.

Remark 4. Note that Definition 21 applies to both subgrids and complete grids, the embedding being in the latter case the identity. If the initial grid


is complete, then the adaptation operator is nothing but a particular kind of restriction operator, where the selected indexes are determined by a marking operator.

Remark 5. An adaptation operator defined as in Definition 21 leaves unmodified the cells that are marked and belong to the initial grid. As a consequence, the coarsest grid, which is forced to cover the whole computational domain, remains fixed under adaptation.

Several points have to be taken into account when designing an adaptation strategy (i.e., a marking operator). The first one is to ensure that the result of adapting one or more grids within a grid hierarchy is still a grid hierarchy. If $\mathcal{A}^L = \{A_0, \dots, A_{L-1}\}$ is a grid hierarchy and a grid $A_\ell$ is adapted, $\tilde{A}_\ell = \{\tilde{c}^\ell_i\}_i$ being the adapted grid, then, looking at the definition of grid hierarchy (Definition 12), we have to ensure that conditions (A.1) and (A.2) hold, which means that, if $1 \le \ell < L-1$, $A_{\ell+1} < \tilde{A}_\ell < A_{\ell-1}$, whereas if $\ell = L-1$, $\tilde{A}_{L-1} < A_{L-2}$, and if $\ell = 0$, $\tilde{A}_0 > A_1$ and $\bigcup_{i \in I_0} \tilde{c}^0_i = \bar{\Omega}$. A simple practical way to proceed is to fix $|\tilde{A}_\ell| = |A_\ell|$, for $1 \le \ell \le L-1$, and to keep the coarsest grid $A_0$ fixed. For a nested grid hierarchy we also need to enforce the nestedness of the resulting grid hierarchy, i.e.,
\[ \tilde{A}_\ell \subset A_{\ell-1}, \ \text{if } 1 \le \ell \le L-1, \quad \text{and} \quad A_{\ell+1} \subset \tilde{A}_\ell, \ \text{if } 0 \le \ell < L-1. \]

Several particular strategies to ensure nestedness are possible, and we have discussed our choice in Sections 5.5 and 6.1.2 (page 121). On the other hand, we have already commented that, because the grids have to follow the moving features of the solution, in general the adapted grid will not be contained in the original grid. Consider a subgrid $M = (A, U)$ and the subgrid $\mathrm{Ad}(M) = (\tilde{A}, \tilde{U}) = R(E(M), F(M))$ that results from adaptation for some operators $R$, $E$ and $F$. The fact that $\tilde{A} \setminus A \neq \emptyset$ implies that there are several different possibilities for building a numerical solution for the newly created grid $\tilde{A}$ from the existing grid $A$. The particular definition of $\tilde{U}$ will depend in fact on the definition of the embedding operator, which is not unique. Recall that the value $\tilde{U}(c_i)$ assigned to a cell $c_i \in \tilde{A}$ is exactly the value that comes from the embedding operator. From the definition of the embedding, the numerical


solution for the complete grid has to be an extension of the numerical solution defined on the subgrid, i.e., if $c_i \in A \cap \tilde{A}$ then the value $\tilde{U}(c_i)$ assigned to $c_i$ as an element of $\tilde{A}$ is the same as the value assigned to it as an element of $E(A)$. As the restriction does not modify the numerical solution, we have
\[ U(c_i) = \tilde{U}(c_i), \quad \forall i \in \tilde{I} \cap I, \]
where $I$ is the set of indices that defines $M$ and $\tilde{I} = F(M)$ is the one that defines $\tilde{M}$. Therefore, for the cells that belong to both grids $M$ and $\tilde{M}$, the natural requirement that the numerical solution takes the same value in both grids is imposed by construction, but nothing can be deduced for the rest of the cells. An adaptation operator defined on subgrids can thus be viewed as an operator that changes the grid, while preserving the numerical solution in the cells shared by the original and the adapted grid. For the cells corresponding to indices that belong to the set $\tilde{I} \setminus I$, the definition of the values of $\tilde{U}$ is in principle free, and is usually given by interpolation from the coarse grids, as is done in our case (see Sections 5.5, 5.6 and 6.1.2). For the cells that touch the boundary of the computational domain, external boundary data have to be supplied, as is often done when integrating the equations with a single grid. Some procedures to deal with (artificial) boundary conditions are described in Section 3.7.
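The adaptation operator of Definition 21 can be sketched as follows for a 1D grid whose cells carry scalar values. The jump-based marking criterion, the fill rule used by the embedding (a stand-in for interpolation from the coarser grid) and all names are assumptions of this example; they are not the criteria of Section 5.5.

```python
from typing import Callable, Dict, Set

Grid = Dict[int, float]   # cell index -> numerical value (geometry omitted)


def embed(m: Grid, n_cells: int, fill: Callable[[int], float]) -> Grid:
    """Embedding E: keep the values of M and fill the other cells of 0..n_cells-1."""
    return {i: (m[i] if i in m else fill(i)) for i in range(n_cells)}


def mark_jumps(mc: Grid, threshold: float) -> Set[int]:
    """Marking operator Fc: flag both cells adjacent to every large jump."""
    flagged: Set[int] = set()
    for i in range(len(mc) - 1):
        if abs(mc[i + 1] - mc[i]) > threshold:
            flagged.update({i, i + 1})
    return flagged


def adapt(m: Grid, n_cells: int, fill: Callable[[int], float], threshold: float) -> Grid:
    """Adaptation Ad(M) = R(E(M), F(M)), with F(M) = Fc(E(M))."""
    mc = embed(m, n_cells, fill)
    flagged = mark_jumps(mc, threshold)
    return {i: mc[i] for i in sorted(flagged)}


def coarse_profile(i: int) -> float:
    """Hypothetical values supplied for cells outside M: a step located after cell 4."""
    return 0.0 if i < 5 else 1.0


M = {2: 0.0, 3: 0.0, 4: 0.0}          # old subgrid; the jump has moved to its edge
adapted = adapt(M, n_cells=8, fill=coarse_profile, threshold=0.5)   # {4: 0.0, 5: 1.0}
```

Cell 4 belongs to both the original and the adapted grid and keeps its value, whereas cell 5 is new and takes the value provided by the embedding, in agreement with the discussion above.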

A.3 Integration and projection

In this section we introduce into the generic framework described in Section A.1 the operations related to the update, from one time instant to the next, of the numerical solution of the system of equations under consideration. General aspects of numerical methods for fluid dynamics have been discussed in chapter 3, while our particular numerical algorithm has been described in chapter 4. In later sections we will restrict the discussion to our particular choice, that is, Shu-Osher's finite-difference algorithm with Donat-Marquina's flux split on a uniform cell-refined Cartesian grid hierarchy.

A generic integration algorithm can be expressed in terms of an operator that produces an updated solution from the solutions already computed, as in the following definition:


Definition 22. An integrator or an integration operator is an application $\mathcal{I} : \mathcal{M}(\Omega) \longrightarrow \mathcal{M}(\Omega)$ such that, if $M = (A, U) \in \mathcal{M}(\Omega)$ and $\mathcal{I}(M) = (\hat{A}, \hat{U})$, then $\hat{A} = A$.

An integrator is thus an operator that modifies the values of the numerical solution without modifying the grid. If $\mathcal{I}_c : \mathcal{M}_c(\Omega) \longrightarrow \mathcal{M}_c(\Omega)$ is an integrator that acts on complete grids, an integrator for subgrids can be defined through the construction (A.7), i.e.
\[ \mathcal{I}(M) = R(\mathcal{I}_c(E(M)), I), \quad \forall M \in \mathcal{M}(\Omega), \]

where $I$ is the set of indexes that defines $M$, $E$ is an embedding and $R$ is a restriction.

Within a grid hierarchy one has different numerical solutions corresponding to different grids. Sometimes these different solutions need to be coherent in some sense, for example when they correspond to the same time instant. We define next an operation that represents the modification of the numerical solution at one grid from the solution at a different grid. It is assumed that finer grids can provide more accurate information than coarser grids, so this operation is forced to reduce to the identity if information from a coarser grid is to be used to modify the solution at a finer grid. Projection operators involving more than two grids (and even applications from $\mathcal{M}^L(\Omega)$ to $\mathcal{M}^L(\Omega)$ that modify more than one grid using information from more than one grid) can be considered, but we will restrict the discussion to the case defined below, where only two grids are considered. Typically these grids will correspond to two consecutive refinement levels within a nested grid hierarchy, and the corrections are applied from the finest to the coarsest grid, passing through each intermediate pair of grids.

Definition 23. A projection operator or a projector is an application $\mathrm{Pr} : \mathcal{M}(\Omega)^2 \longrightarrow \mathcal{M}(\Omega)$ such that, given $M_1 = (A_1, U_1)$ and $M_2 = (A_2, U_2) \in \mathcal{M}(\Omega)$, then $\mathrm{Pr}(M_1, M_2) = (A_2, \tilde{U})$, and if $A_2 \le A_1$ then $\tilde{U} = U_2$.

A projector is thus an operator that modifies the numerical solution at a grid ($M_2$ in the definition) using a finer grid ($M_1$ in the definition). As with the integration operators, a projector can be defined using a projector $\mathrm{Pr}_c : \mathcal{M}_c(\Omega)^2 \longrightarrow \mathcal{M}_c(\Omega)$ that acts on complete grids by
\[ \mathrm{Pr}(M_1, M_2) = R(\mathrm{Pr}_c(E(M_1), E(M_2)), I), \quad \forall M_1, M_2 \in \mathcal{M}(\Omega), \]


where $I$ is the set of indexes that defines $M_2$.

So far we have defined the basic components needed to build an adaptive mesh refinement algorithm for the numerical solution of (A.6). The description up to now is however very general, and several choices regarding each component, as well as the organization of the algorithm, still have to be made. We describe next an algorithm for the integration of a system of hyperbolic conservation laws using a finite-difference conservative scheme with an adaptive mesh refinement infrastructure based on Cartesian grids. The choice of such a class of grid hierarchies is motivated by several factors:

• Our numerical method is based on Shu-Osher's finite difference formulation, which is well suited for application to uniform Cartesian grids. Other parts of the numerical method, such as the Weighted ENO reconstruction, are also better suited for Cartesian grids than for other kinds of grids.

• The practical implementation of the algorithm in a computer program is much easier with a Cartesian grid, and there is no special advantage in using other kinds of grids for the kind of problems of interest in this work.
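Returning to Definition 23, a projector can be illustrated with a deliberately simple rule: in 1D, with an integer refinement ratio between the two grids, each coarse value is replaced by the mean of the fine values covering it. This plain averaging is only an assumption chosen for brevity; it is not the flux-based projection built in Sections 5.4 and 6.1.4, and the index convention (fine cell `ratio * j + k` lies inside coarse cell `j`) and all names are likewise assumptions of this example.

```python
from typing import Dict

Grid = Dict[int, float]   # cell index -> numerical value (geometry omitted)


def project(fine: Grid, coarse: Grid, ratio: int) -> Grid:
    """Sketch of Pr(M1, M2): update the coarse grid M2 using the finer grid M1.

    The grid (set of indexes) of M2 is never modified; only its values are.
    If ratio <= 1 the first grid is not actually finer and the operator
    reduces to the identity, as required by Definition 23.
    """
    if ratio <= 1:
        return dict(coarse)
    updated = dict(coarse)
    for j in coarse:
        children = [ratio * j + k for k in range(ratio)]
        # Only coarse cells completely covered by fine cells are corrected.
        if all(c in fine for c in children):
            updated[j] = sum(fine[c] for c in children) / ratio
    return updated


coarse = {0: 1.0, 1: 1.0, 2: 1.0}
fine = {2: 2.0, 3: 4.0}               # covers coarse cell 1 with ratio 2
projected = project(fine, coarse, ratio=2)   # {0: 1.0, 1: 3.0, 2: 1.0}
```

Only coarse cell 1 is overlaid by the fine grid, so it is the only value that changes; the remaining cells keep the values they had.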

A.4 Cartesian grids

Cartesian grids are a class of grids of particular interest for the AMR algorithm. The original algorithm was actually described for rotated Cartesian grids [21, 23], and most practical implementations, including ours, follow a Cartesian approach. This is mainly due to the relative simplicity of Cartesian grids with respect to other kinds of grids. In this section we describe a generic AMR algorithm for a Cartesian grid infrastructure, using the building blocks defined in the previous sections, particularized for this kind of grid. We start by defining Cartesian grids and grid hierarchies.¹

¹ In this section we will sometimes consider, for simplicity, only grids instead of computational grids, omitting the numerical solution associated to the grids. We will use suitable abuses of notation where required.

Definition 24. Let $\Omega$ be an open cube in $\mathbb{R}^d$. A Cartesian (complete) grid defined on $\Omega$ is a grid $G \in \mathcal{A}_c(\Omega)$ that verifies: for each $k \in \{1, \dots, d\}$ there exists a set of indexes $I_k$ and ordered points $\{p^k_{i_k}\}_{i_k \in I_k}$, verifying $p^k_{i_k} < p^k_{i_k+1}$, $\forall i_k \in I_k$, and such that $G = \{c_{i_1,\dots,i_d} : i_k \in I_k,\ 1 \le k \le d\}$, where
\[ c_{i_1,\dots,i_d} = \prod_{k=1}^{d} [p^k_{i_k}, p^k_{i_k+1}]. \tag{A.10} \]

Definition 25. Let $\Omega$ be an open cube in $\mathbb{R}^d$. A Cartesian subgrid defined on $\Omega$ is a subgrid $A \in \mathcal{A}(\Omega)$ such that there exists a complete Cartesian grid $G$ defined on $\Omega$ with $A \subseteq G$.

Complete Cartesian grids are therefore complete grids whose elements are hyperrectangles of $\mathbb{R}^d$ that result from the Cartesian product of intervals of the real line. Cartesian subgrids are just subsets of them. The centers of the cells defined by (A.10) are the points
\[ x_{i_1,\dots,i_d} = (x^1_{i_1}, \dots, x^d_{i_d}), \quad i_k \in I_k, \quad 1 \le k \le d, \tag{A.11} \]
where, for each $k$,
\[ x^k_{i_k} := \frac{p^k_{i_k+1} + p^k_{i_k}}{2}, \quad i_k \in I_k. \tag{A.12} \]
In Definitions 24 and 25, if, for a certain $k$, the differences $p^k_{i_k+1} - p^k_{i_k}$ are constant with respect to $i_k$, then the grid is said to be uniform in the k-th Cartesian direction. If the property holds for every $k$, then the grid is uniform, according to Definition 10. The corresponding constants $p^k_{i_k+1} - p^k_{i_k}$ will be denoted by $\Delta x_k$ and referred to as grid sizes. If $\Delta x_k$ does not depend on $k$, then the grid is made of hypercubes, and is said to be a square Cartesian grid, with grid size $\Delta x$. As an illustration, different types of grids are shown in Fig. A.6.

Let $\Omega$ be an open hyperrectangle in $\mathbb{R}^d$, that we can write as
\[ \Omega = \prod_{k=1}^{d} (a_k, b_k), \]

for certain $a_k, b_k \in \mathbb{R}$, $a_k < b_k$, $1 \le k \le d$. A complete Cartesian grid is fully determined by positive integers defining the number of cells considered in each direction and by the relative spacing of the cells. More precisely, given $d$ positive integers $n_1, \dots, n_d$, and $(n_1 - 1) + \cdots + (n_d - 1)$ positive real numbers $\Delta x^0_1, \dots, \Delta x^{n_1-2}_1, \dots, \Delta x^0_d, \dots, \Delta x^{n_d-2}_d$ such that
\[ \sum_{i=0}^{n_k-2} \Delta x^i_k < b_k - a_k, \quad 1 \le k \le d, \]

[Figure A.6 panels: four grids on the unit square, with axes x and y ranging from 0 to 1.]

Figure A.6: 2D Cartesian grids for Ω = (0, 1)2 : a square Cartesian grid (top left), a uniform Cartesian grid (top right), a Cartesian grid that is not uniform in any Cartesian direction (bottom left) and a uniform grid that is not a Cartesian grid (bottom right).

a complete Cartesian grid defined on $\Omega$ can be defined as follows: define, for each $k$:
\[ p^k_0 = a_k, \qquad p^k_{i+1} = a_k + \sum_{j=0}^{i} \Delta x^j_k = p^k_i + \Delta x^i_k, \quad 0 \le i < n_k - 1, \qquad p^k_{n_k} = b_k. \tag{A.13} \]
We can define $\prod_{k=1}^{d} n_k$ subsets $\{c_{i_1,\dots,i_d} : i_k \in \{0, \dots, n_k - 1\}\}$ of $\bar{\Omega}$, using (A.10), that trivially define a Cartesian complete grid of $\Omega$. The grid obtained from the element
\[ \left(n_1, \dots, n_d, \Delta x^0_1, \dots, \Delta x^{n_1-2}_1, \dots, \Delta x^0_d, \dots, \Delta x^{n_d-2}_d\right) \in \mathbb{N}^d \times \mathbb{R}^{(n_1-1)+\cdots+(n_d-1)} \]
using the above construction will be denoted by
\[ D\left(n_1, \dots, n_d, \Delta x^0_1, \dots, \Delta x^{n_1-2}_1, \dots, \Delta x^0_d, \dots, \Delta x^{n_d-2}_d\right). \]

This construction simplifies for square or uniform grids. A complete uniform grid is completely defined by a set of $d$ positive integers $\{n_1, \dots, n_d\}$ in the following way: since the grid spacings $\Delta x^i_k$ are constant for each $k$, the only choice is to take $\Delta x_k = \frac{b_k - a_k}{n_k}$, and (A.13) reduces to
\[ p^k_0 = a_k, \qquad p^k_{i+1} = a_k + (i+1)\, \Delta x_k, \quad 0 \le i \le n_k - 1. \tag{A.14} \]

The sets defined with (A.10) using the points defined in (A.14) form a complete Cartesian grid that is, by construction, uniform. This construction defines an application $D_u : \mathbb{N}^d \longrightarrow \mathcal{A}_c(\Omega)$ that assigns to each vector $(n_1, \dots, n_d) \in \mathbb{N}^d$ its corresponding grid $D_u(n_1, \dots, n_d)$. Square grids can only be defined for a grid size $\Delta x$ if
\[ n_k := \frac{b_k - a_k}{\Delta x} \in \mathbb{Z}, \quad \text{for } 1 \le k \le d. \]
In that case, we can repeat the same construction as in the case of uniform grids, according to (A.14) and (A.10), with $\Delta x_k = \Delta x$, $1 \le k \le d$. We will use the notation $D_s(\Delta x)$ for the application that assigns to the quantity $\Delta x$ its corresponding square grid.

The Cartesian grids depicted in Fig. A.6 correspond to $D_s(\frac{1}{16})$ (top left), $D_u(8, 4)$ (top right) and $D(6, 7, \frac{3}{14}, \frac{5}{14}, \frac{2}{14}, \frac{3}{14}, \frac{1}{14}, \frac{2}{14}, \frac{2}{14}, \frac{2}{14}, \frac{1}{14}, \frac{6}{14}, \frac{1}{14})$ (bottom left).

Before entering into Cartesian grid hierarchies, let us comment that, given a Cartesian subgrid, according to Definition 25, in general there exist many complete grids that can produce the given subgrid by means of a restriction, see Fig. A.7. We will assume that a given subgrid comes from the restriction of a particular complete grid that is known and fixed for each problem: in other words, the numbers describing the complete grid by means of the operators $D$, $D_u$ or $D_s$ are known. This provides a unique embedding for each subgrid, so that the embedding is the complete grid from which the subgrid comes.

A Cartesian grid hierarchy is defined as a grid hierarchy composed of Cartesian grids. We will focus on a particular class of Cartesian grid hierarchies: uniform cell-refined Cartesian grid hierarchies.
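The one-dimensional ingredients of the constructions (A.12), (A.13) and (A.14) can be sketched as follows. Exact rational arithmetic is used so that equalities can be checked without rounding issues; the function names are assumptions of this example, and a d-dimensional grid would be obtained by taking Cartesian products of such point sets as in (A.10).

```python
from fractions import Fraction
from typing import List, Sequence


def points_from_spacings(a: Fraction, b: Fraction, spacings: Sequence[Fraction]) -> List[Fraction]:
    """Points p_0, ..., p_n of (A.13) in one direction, with n = len(spacings) + 1.

    The last cell takes whatever length remains up to b, which is why the
    spacings must add up to strictly less than b - a.
    """
    if any(dx <= 0 for dx in spacings) or sum(spacings) >= b - a:
        raise ValueError("spacings must be positive and sum to less than b - a")
    points = [a]
    for dx in spacings:
        points.append(points[-1] + dx)
    points.append(b)
    return points


def uniform_points(a: Fraction, b: Fraction, n: int) -> List[Fraction]:
    """Points of (A.14): n cells of equal size (b - a) / n."""
    dx = Fraction(b - a) / n
    return [a + i * dx for i in range(n + 1)]


def centers(points: Sequence[Fraction]) -> List[Fraction]:
    """Cell centers of (A.12): midpoints of consecutive points."""
    return [(p + q) / 2 for p, q in zip(points, points[1:])]


zero, one = Fraction(0), Fraction(1)
uniform = uniform_points(zero, one, 4)            # 0, 1/4, 1/2, 3/4, 1
nonuniform = points_from_spacings(zero, one, [Fraction(1, 4), Fraction(1, 2)])   # 0, 1/4, 3/4, 1
```

For the uniform case the centers are 1/8, 3/8, 5/8 and 7/8, and the non-uniform example has three cells of lengths 1/4, 1/2 and 1/4.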

Figure A.7: Two complete Cartesian grids (top figures) that can produce the same subgrid (bottom) by restriction

A.4.1 Uniform cell-refined Cartesian grid hierarchies

Given an open cube $\Omega \subset \mathbb{R}^d$, a positive integer $L$, and a set of $L \cdot d$ positive integers $(n_1, \dots, n_d, r^0_1, \dots, r^0_d, \dots, r^{L-2}_1, \dots, r^{L-2}_d)$ we can define a unique L-level complete grid hierarchy $\mathcal{G}^L = \{G_0, \dots, G_{L-1}\}$ on $\Omega$ as follows:
\[ \begin{aligned} &\text{(i)} \quad G_0 = D_u(n_1, \dots, n_d), \\ &\text{(ii)} \quad \text{if } G_\ell = D_u(m_1, \dots, m_d) \text{ then } G_{\ell+1} = D_u(m_1 \cdot r^\ell_1, \dots, m_d \cdot r^\ell_d), \text{ for } 0 \le \ell \le L-2. \end{aligned} \tag{A.15} \]
Condition (ii) of (A.15) is equivalent to
\[ \text{(iib)} \quad G_\ell = D_u\left(n_1 \prod_{i=0}^{\ell-1} r^i_1, \dots, n_d \prod_{i=0}^{\ell-1} r^i_d\right), \quad 1 \le \ell \le L-1. \tag{A.16} \]

In the grid hierarchies constructed using (A.15) every grid is complete, uniform and cell-refined. In fact each cell at level $\ell$ is divided into $\prod_{i=1}^{d} r^\ell_i$ cells that belong to the grid at level $\ell + 1$. An example of a grid hierarchy defined in such a way is shown in Fig. A.8. A unique L-level complete Cartesian grid hierarchy on $\Omega$ is therefore fully determined by the numbers $(n_1, \dots, n_d, r^0_1, \dots, r^{L-2}_1, \dots, r^0_d, \dots, r^{L-2}_d)$, so we can define an application
\[ D^L_c : \mathbb{N}^{L \cdot d} \longrightarrow \overbrace{\mathcal{A}_c(\Omega) \times \cdots \times \mathcal{A}_c(\Omega)}^{(L)}. \tag{A.17} \]
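The recursion (A.15) only manipulates the numbers of cells per direction, so it can be sketched without any geometry. In the example below the refinement factors are passed level by level, as tuples $(r^\ell_1, \dots, r^\ell_d)$; this grouping and the function names are assumptions of the example.

```python
from math import prod
from typing import List, Sequence, Tuple


def hierarchy_cell_counts(n: Sequence[int], ratios: Sequence[Sequence[int]]) -> List[Tuple[int, ...]]:
    """Cells per direction of G_0, ..., G_{L-1} following (A.15).

    n       -- (n_1, ..., n_d), cells per direction of the coarsest grid G_0
    ratios  -- ratios[l] = (r_1^l, ..., r_d^l), refinement factors from level l to l + 1
    """
    levels = [tuple(n)]
    for r in ratios:
        levels.append(tuple(m * rk for m, rk in zip(levels[-1], r)))
    return levels


def children_per_cell(ratios: Sequence[Sequence[int]]) -> List[int]:
    """Number of level l + 1 cells contained in each level l cell."""
    return [prod(r) for r in ratios]


# Two dimensions, three levels: refinement by (2, 2) and then by (2, 4).
levels = hierarchy_cell_counts((4, 2), [(2, 2), (2, 4)])   # [(4, 2), (8, 4), (16, 16)]
splits = children_per_cell([(2, 2), (2, 4)])               # [4, 8]
```

The numbers used here are those of the hierarchy written as $D^3_c(4, 2, 2, 2, 2, 4)$ in Fig. A.8; whether the refinement factors in that argument list are grouped by level or by direction, the resulting grids have 4 by 2, 8 by 4 and 16 by 16 cells.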

For grid hierarchies we will modify the notation for cells and nodes, by including an index to indicate the resolution level, so that a generic cell at level $\ell$ will be denoted by
\[ c^\ell_{i_1,\dots,i_d} = \prod_{k=1}^{d} [p^{k,\ell}_{i_k}, p^{k,\ell}_{i_k+1}], \tag{A.18} \]
where
\[ p^{k,\ell}_{i_k} = a_k + i_k \Delta x_k. \tag{A.19} \]
The center of the cell $c^\ell_{i_1,\dots,i_d}$ is given by
\[ x^\ell_{i_1,\dots,i_d} = \left(x^{1,\ell}_{i_1}, \dots, x^{d,\ell}_{i_d}\right), \tag{A.20} \]
where $x^{k,\ell}_{i_k} = \frac{p^{k,\ell}_{i_k+1} + p^{k,\ell}_{i_k}}{2}$.

Figure A.8: A three level grid hierarchy corresponding to $D^3_c(4, 2, 2, 2, 2, 4)$

Definition 26. A Cartesian, uniform, cell-refined grid hierarchy is a cell-refined grid hierarchy $\mathcal{A}^L = \{A_0, \dots, A_{L-1}\}$ such that there exists a complete grid hierarchy $\mathcal{G}^L = \{G_0, \dots, G_{L-1}\}$, defined by
\[ \mathcal{G}^L = D^L_c(n_1, \dots, n_d, r^0_1, \dots, r^{L-2}_1, \dots, r^0_d, \dots, r^{L-2}_d), \tag{A.21} \]
for a certain choice of $n_1, \dots, n_d, r^0_1, \dots, r^{L-2}_1, \dots, r^0_d, \dots, r^{L-2}_d$, and such that $A_\ell \subseteq G_\ell$, for $0 \le \ell \le L-1$.


In other words, we consider cell-refined grid hierarchies whose individual grids are obtained, by restriction, from a unique complete Cartesian grid hierarchy $\mathcal{G}^L$, which is in turn cell-refined, and is defined by the operator $D^L_c$. In the AMR algorithm the grid hierarchy $\mathcal{A}^L$ changes in time. We will assume that the grid hierarchy $\mathcal{G}^L$ is fixed from the start, so that every grid hierarchy considered in the AMR algorithm is obtained, using a suitable restriction, from $\mathcal{G}^L$. If $\mathcal{A}^L$ does not contain empty grids, then $\mathcal{G}^L$ is fully determined by $\mathcal{A}^L$ and by the fact that it is cell-refined and every grid in it is uniform. Moreover we will assume that the flagging procedure provides an adaptation operator that, from a given grid, produces a cell-refined grid. We will show a simple way to design such a flagging operator in Section A.4.2.

To sum up, the grids under consideration in our AMR algorithm are assumed to verify the following: the actual grid hierarchy will always be cell-refined, and will come from the same complete grid hierarchy, which is assumed to be a Cartesian grid hierarchy where each grid is uniform, and which can be defined by (A.21). We show next a way to design a flagging operator that ensures that the grid hierarchy resulting from adaptation is cell-refined.

A.4.2 Adaptation for cell-refined Cartesian grids

If, at some time instant in the AMR algorithm, the actual grid hierarchy is cell-refined, then from any flagging operator an adaptation operator that produces a cell-refined grid hierarchy can be easily built as follows: let $\mathcal{A}^L = \{A_0, \dots, A_{L-1}\}$ be a cell-refined L-level uniform Cartesian grid hierarchy. Let $\mathcal{G}^L = \{G_0, \dots, G_{L-1}\} = D^L_c(n_1, \dots, n_d, r^0_1, \dots, r^{L-2}_1, \dots, r^0_d, \dots, r^{L-2}_d)$, for certain values of $n_1, \dots, n_d, r^0_1, \dots, r^{L-2}_1, \dots, r^0_d, \dots, r^{L-2}_d$, such that $A_\ell \subseteq G_\ell$, $0 \le \ell \le L-1$, a fact that we will also denote by $H_\ell = E(A_\ell)$, where $E$ stands for the embedding operator
\[ E(A_\ell) = D_u\left(n_1 \prod_{i=0}^{\ell-1} r^i_1, \dots, n_d \prod_{i=0}^{\ell-1} r^i_d\right), \quad 1 \le \ell \le L-1. \]

Let $C_\ell : \mathbb{N} \to \mathbb{N}$ be the operator that, given an index corresponding to a cell in $E(A_\ell)$, returns the index of the cell in $E(A_{\ell-1})$ where the given cell is contained. Let $T_\ell : \mathbb{N} \to \mathbb{N}^{r^\ell_1 \cdots r^\ell_d}$ be the operator that, given a cell in $E(A_\ell)$, returns the indexes of the cells in $E(A_{\ell+1})$ that are contained in the given cell. Let $F$ be a flagging operator. Define: $\tilde{F}(A_\ell) = T_{\ell-1} C_\ell F(A_\ell)$. The adaptation operator corresponding to $\tilde{F}$ (see Definition 21) ensures that the adapted grid is composed of subdivided coarse cells. These cells do not need to belong to the grid $A_{\ell-1}$, but only to the grid $E(A_{\ell-1})$, and the grid $A_{\ell+1}$ is not necessarily contained in the adapted grid. In order to ensure the nestedness of the grid hierarchy we can define $\tilde{F}$ by
\[ \tilde{F}(A_\ell) = T_{\ell-1}\left(C_\ell\left(F(A_\ell) \cup C_{\ell+1}(A_{\ell+1})\right) \cap I_{\ell-1}\right), \tag{A.22} \]

where Iℓ−1 is the set of indices that defines Aℓ−1 . Equation (A.22) is valid for 1 ≤ ℓ ≤ L − 2. The coarsest grid A0 is never adapted, and for the finer grid AL−1 (A.22) reduces to F˜ (AL−1 ) = TL−2 (CL−1 (F (AL−1 )) ∩ IL−2 ) .

(A.23)

The adaptation operator defined using $\tilde F$ from (A.22) and (A.23) now produces an adapted grid hierarchy which is nested and cell-refined after adaptation of the grid at level $\ell$. An example can be seen in Fig. A.9. The top plots show three grids corresponding to three consecutive refinement levels, depicted in solid squares. The corresponding complete grids are shown with empty squares. If the flagging procedure $F(A_1)$ returns the indexes of the cells shown in Fig. A.9(d), then the result of $F(A_1) \cup C_2(A_2)$, which includes the cells of $A_1$ overlaid by $A_2$, is shown in Fig. A.9(e). Finally, Fig. A.9(f) shows the cells marked by the operator $\tilde F(A_1)$ defined by (A.22). In our implementation, described in chapter 6, we have used this procedure for the adaptation operator, with the flagging operator described there (section 6.1.2), and with the addition of a system for clustering the marked cells into rectangular patches, included for performance reasons.
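The construction in (A.22) and (A.23) can be sketched in Python for the one-dimensional case, where cell indices are plain integers. This is only an illustration under stated assumptions: the names `coarsen`, `refine` and `nested_flags`, the representation of each grid $A_\ell$ as a set of indices $I_\ell$, and the list `factors` of refinement factors are all hypothetical and do not come from the implementation of chapter 6.

```python
def coarsen(cells, factor):
    """Operator C: index of the coarse cell that contains each given cell."""
    return {i // factor for i in cells}


def refine(cells, factor):
    """Operator T: indices of the fine cells contained in each given cell."""
    return {factor * i + k for i in cells for k in range(factor)}


def nested_flags(flagged, level, grids, factors):
    """One-dimensional sketch of (A.22) and (A.23).

    flagged: F(A_level), a set of cell indices of the complete grid at `level`.
    grids[l]: set of indices I_l that defines the grid A_l.
    factors[l]: refinement factor between levels l and l + 1.
    """
    marked = set(flagged)
    if level + 1 < len(grids):
        # Add the cells overlaid by the next finer grid: C_{l+1}(A_{l+1}).
        marked |= coarsen(grids[level + 1], factors[level])
    # Move to level l - 1 and keep only cells that belong to A_{l-1}.
    parents = coarsen(marked, factors[level - 1]) & grids[level - 1]
    # Subdivide the selected coarse cells.
    return refine(parents, factors[level - 1])


factors = [2, 2]
grids = [set(range(8)), {4, 5, 6, 7, 8, 9}, {12, 13, 14, 15}]
middle = nested_flags({5}, 1, grids, factors)
finest = nested_flags({13}, 2, grids, factors)
print(sorted(middle), sorted(finest))
```

In this example the marked cells at level 1 cover both the flagged cell and the cells overlaid by the level 2 grid, and every marked cell is a child of a cell of $A_0$, which is the nestedness property discussed above.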

Figure A.9: An illustration of the flagging procedure for cell-refined cells. Thicker lines indicate cells of coarser levels. Panels: (a) Grid $A_0$; (b) Grid $A_1$; (c) Grid $A_2$; (d) Cells marked by $F(A_1)$; (e) Cells marked to ensure that $A_2$ will be included in the adapted grid; (f) Cells marked by $\tilde F(A_1)$.

A.4.3 Integration and Projection of Shu-Osher's finite-difference algorithm with Donat-Marquina's flux split on a Cartesian grid hierarchy

We describe in this section the integration of a uniform, cell-refined Cartesian grid hierarchy, and the procedure used to modify the numerical solution at a grid using the numerical fluxes computed in finer grids, for the case of the numerical method described in chapter 4. We essentially describe here a generalization to $d$ dimensions of the methods described there. In our framework, based on Shu-Osher's flux formulation, the numerical solution is computed at the centers of the cells of each mesh. Let $A_L = \{A_0, \dots, A_{L-1}\}$ be a grid hierarchy as in Definition 26. According to (A.18), (A.19) and (A.20), consider a generic cell $c^\ell_{i_1,\dots,i_d} \in A_\ell$, whose center is $x^\ell_{i_1,\dots,i_d}$. The numerical solution for (A.6) at the node $x^\ell_{i_1,\dots,i_d}$ is advanced in time by solving the system of ODEs arising from the semi-discrete formulation:

$$\frac{\partial u}{\partial t}(x^\ell_{i_1,\dots,i_d}, t) + \sum_{q=1}^{d} \frac{\hat f^{t,\ell}_{i_1,\dots,i_{q-1},i_q+\frac12,i_{q+1},\dots,i_d} - \hat f^{t,\ell}_{i_1,\dots,i_{q-1},i_q-\frac12,i_{q+1},\dots,i_d}}{\Delta x^\ell_q} = 0, \tag{A.24}$$

where

$$\hat f^{t,\ell}_{i_1,\dots,i_{q-1},i_q+\frac12,i_{q+1},\dots,i_d} = \hat f\big(U^{t,\ell}_{i_1,\dots,i_q-p,\dots,i_d}, \dots, U^{t,\ell}_{i_1,\dots,i_q+p+1,\dots,i_d}\big)$$

is the numerical flux and $U^{t,\ell}_{i_1,\dots,i_d}$ is the numerical solution corresponding to the node $x^\ell_{i_1,\dots,i_d}$ and time $t$. The value $\hat f^{t,\ell}_{i_1,\dots,i_{q-1},i_q+\frac12,i_{q+1},\dots,i_d}$ is an approximation of the flux passing through the point

$$x^\ell_{i_1,\dots,i_{q-1},i_q+\frac12,i_{q+1},\dots,i_d} = \big(x^{1,\ell}_{i_1}, \dots, x^{q-1,\ell}_{i_{q-1}}, x^{q,\ell}_{i_q+\frac12}, x^{q+1,\ell}_{i_{q+1}}, \dots, x^{d,\ell}_{i_d}\big),$$

which is computed using the algorithm described in chapter 4. For simplicity we assume that, if a multi-stage explicit discretization of the time derivative in (A.24) is used, then $\hat f$ represents the accumulated numerical flux. In our case, where a three-stage Runge-Kutta method is used, the flux $\hat f^{t,\ell}_{i_1,\dots,i_{q-1},i_q+\frac12,i_{q+1},\dots,i_d}$ is given by the analogue of (6.11). We also recall that our solver works in a dimensional splitting fashion, so that the computation of each of the $d$ flux-difference terms in (A.24) is one-dimensional. The time step for each grid is selected according to the formula

$$\Delta t_\ell = \frac{\Delta t_{\ell-1}}{\max_{1\le k\le d} r_k^{\ell-1}}, \qquad 1 \le \ell \le L-1, \tag{A.25}$$

where the time step $\Delta t_0$ corresponding to the coarsest grid is selected so that the CFL condition is verified on it. The CFL condition for our numerical method can be written as follows: let $U^{t,\ell} = \{U^{t,\ell}_{i_1,\dots,i_d}\}$ be the numerical solution of (A.6) computed at the grid $A_\ell$ and time $t$, and let $U^{t,\ell}_{i_1,\dots,i_{q-1},i_q+\frac12,i_{q+1},\dots,i_d;L}$ and $U^{t,\ell}_{i_1,\dots,i_{q-1},i_q+\frac12,i_{q+1},\dots,i_d;R}$ be the two-sided reconstructions$^2$ of the conserved variables at the point $x^\ell_{i_1,\dots,i_{q-1},i_q+\frac12,i_{q+1},\dots,i_d}$. If we denote by $\{\lambda^q_p(u)\}_{p=1}^m$ the eigenvalues of the Jacobian matrix of the flux function $f^q(u)$, we define

$$M^\ell_q = \max_{i_1,\dots,i_d} \, \max_{1\le p\le m} \, \max\Big\{ \big|\lambda^q_p(U^{t,\ell}_{i_1,\dots,i_d;L})\big|, \big|\lambda^q_p(U^{t,\ell}_{i_1,\dots,i_d;R})\big| \Big\},$$

which is the maximum characteristic speed computed over all the nodes of the grid $A_\ell$. The CFL condition for the grid $A_\ell$ can be written as:

$$\Delta t_\ell \le \min_{1\le q\le d} \frac{\Delta x^\ell_q}{M^\ell_q}.$$

In practice the more restrictive condition

$$\Delta t_\ell \le \frac{\min_{1\le q\le d} \Delta x^\ell_q}{\max_{1\le q\le d} M^\ell_q}$$

is often used. Therefore we select

$$\Delta t_0 \le \frac{\min_{1\le q\le d} \Delta x^0_q}{\max_{1\le q\le d} M^0_q}. \tag{A.26}$$
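The time step selection of (A.25) and (A.26) can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function names, the storage of the refinement factors as one tuple per level, and the optional safety factor `cfl` (which does not appear in (A.26) and defaults to one) are assumptions introduced here.

```python
def coarsest_time_step(dx0, speeds0, cfl=1.0):
    """Bound (A.26): smallest cell size over largest characteristic speed.

    dx0: (dx_1^0, ..., dx_d^0), cell sizes of the coarsest grid.
    speeds0: (M_1^0, ..., M_d^0), maximum characteristic speeds per dimension.
    cfl: hypothetical safety factor in (0, 1]; 1.0 reproduces the bound itself.
    """
    return cfl * min(dx0) / max(speeds0)


def level_time_steps(dt0, factors):
    """Formula (A.25): divide by the largest refinement factor at each level.

    factors[l] = (r_1^l, ..., r_d^l) are the refinement factors
    between levels l and l + 1.
    """
    steps = [dt0]
    for level_factors in factors:
        steps.append(steps[-1] / max(level_factors))
    return steps


dt0 = coarsest_time_step(dx0=(0.1, 0.2), speeds0=(2.0, 4.0))
steps = level_time_steps(dt0, factors=[(2, 2), (4, 2)])
print(steps)
```

Dividing by the largest refinement factor over all dimensions, rather than by a per-dimension factor, is what makes the number of fine steps per coarse step an integer, $\max_{1\le k\le d} r_k^\ell$.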

Selecting the time steps according to (A.25) – (A.26) ensures that the CFL condition is satisfied in each grid. Recall that different numerical methods require different CFL conditions.

$^2$ These two reconstructions are described in Section 4.2; see in particular (4.10).

The projection of the solution from a fine to a coarse grid is a generalization of the one described in Sections 5.4 and 6.1.4, where we considered a square one- or two-dimensional grid hierarchy. As explained in chapter 4, the numerical fluxes represent approximations to the true fluxes passing through a point, rather than through a cell face, which is the case of finite-volume methods. In general it is not possible, in our setup, to ensure that the solver, when acting on different grids, will compute numerical fluxes corresponding to the same point, so that the coarse flux could be corrected using fine fluxes computed at the same points. Only in the case where all refinement factors are odd numbers would that approach be possible. Instead, we proceed as in section 6.1.4 and consider a correction of the coarse flux with a value coming from interpolation of the fine fluxes that correspond to points that lie on the same cell face as the coarse flux. The point $x^\ell_{i_1,\dots,i_{q-1},i_q+\frac12,i_{q+1},\dots,i_d}$ lies on a face of the hyperrectangle (A.18), given by

$$[p^{1,\ell}_{i_1}, p^{1,\ell}_{i_1+1}] \times \dots \times [p^{q-1,\ell}_{i_{q-1}}, p^{q-1,\ell}_{i_{q-1}+1}] \times \{p^{q,\ell}_{i_q+1}\} \times [p^{q+1,\ell}_{i_{q+1}}, p^{q+1,\ell}_{i_{q+1}+1}] \times \dots \times [p^{d,\ell}_{i_d}, p^{d,\ell}_{i_d+1}].$$

In the finer grid $A_{\ell+1}$ there are $\prod_{k\ne q} r_k^\ell$ fine fluxes that are computed on the same face, at the points

$$\Big\{ x^{\ell+1}_{j_1,\dots,j_{q-1},\, r_q^\ell i_q + (r_q^\ell - 1) + \frac12,\, j_{q+1},\dots,j_d} \Big\}, \tag{A.27}$$

where each $j_k$, for $k \ne q$, varies between $r_k^\ell i_k$ and $r_k^\ell i_k + (r_k^\ell - 1)$. The numerical flux at $x^\ell_{i_1,\dots,i_{q-1},i_q+\frac12,i_{q+1},\dots,i_d}$ can therefore be updated with the fine fluxes that lie at the same interface. The $(d-1)$-linear interpolation of the fine flux values gives an approximation of the numerical flux function at the coarse point corresponding to a single time step of the fine grid. If (A.25) is used for the definition of the fine time step from the coarsest time step, then $\max_{1\le k\le d} r_k^\ell$ time steps are performed on the grid $A_{\ell+1}$ for each time step of the grid $A_\ell$. The value of the coarse flux can thus be substituted with the average of the interpolated values over the $\max_{1\le k\le d} r_k^\ell$ fine time steps needed to perform a coarse time step. If we denote by

$$L\big(\hat f^{t,\ell+1}, x^\ell_{i_1,\dots,i_{q-1},i_q+\frac12,i_{q+1},\dots,i_d}\big)$$

the $(d-1)$-linear interpolation of the values of the numerical fluxes for time $t$ at the points indicated by (A.27), evaluated at the point $x^\ell_{i_1,\dots,i_{q-1},i_q+\frac12,i_{q+1},\dots,i_d}$, then the projected flux at it is given by

$$\hat{\hat f}^{t,\ell}_{i_1,\dots,i_{q-1},i_q+\frac12,i_{q+1},\dots,i_d} = \frac{1}{N^\ell} \sum_{i=1}^{N^\ell} L\big(\hat f^{t+i\Delta t_{\ell+1},\ell+1}, x^\ell_{i_1,\dots,i_{q-1},i_q+\frac12,i_{q+1},\dots,i_d}\big),$$

where we have denoted $N^\ell = \max_{1\le k\le d} r_k^\ell$. We can therefore perform the substitution:

$$\hat f^{t,\ell}_{i_1,\dots,i_{q-1},i_q+\frac12,i_{q+1},\dots,i_d} = \hat{\hat f}^{t,\ell}_{i_1,\dots,i_{q-1},i_q+\frac12,i_{q+1},\dots,i_d}. \tag{A.28}$$
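The projection step can be illustrated with a small Python sketch. It enumerates the fine flux points that share a face with a coarse flux point, as described above, and then forms the time average of the interpolated fine fluxes that is substituted for the coarse flux. Several simplifications are assumptions of this sketch: the function names are hypothetical, the half-integer normal index is represented by its integer part, and the interpolation routine covers only the two-dimensional case ($d = 2$), where the fine points on a face are equally spaced along a single tangential direction.

```python
from itertools import product


def fine_face_indices(i, r, q):
    """Fine multi-indices on the face of coarse cell `i` normal to direction `q`.

    i: coarse multi-index (i_1, ..., i_d); r: refinement factors (r_1, ..., r_d).
    The normal component r_q * i_q + (r_q - 1) + 1/2 is stored without the 1/2.
    """
    ranges = [
        [r[k] * i[k] + r[k] - 1] if k == q else range(r[k] * i[k], r[k] * (i[k] + 1))
        for k in range(len(i))
    ]
    return list(product(*ranges))


def interpolate_at_face_center(values):
    """Linear interpolation at the midpoint of the face (d = 2 only).

    `values` are fine fluxes at the centers of equally sized fine cells
    along the face, ordered by tangential index.
    """
    n = len(values)
    if n % 2 == 1:
        return values[n // 2]  # odd factor: a fine point matches the coarse point
    return 0.5 * (values[n // 2 - 1] + values[n // 2])


def projected_flux(fine_fluxes_per_step):
    """Average, over the fine time steps, of the interpolated fine fluxes."""
    interpolated = [interpolate_at_face_center(v) for v in fine_fluxes_per_step]
    return sum(interpolated) / len(interpolated)


points = fine_face_indices(i=(3, 5), r=(2, 2), q=0)
flux = projected_flux([[1.0, 3.0], [2.0, 6.0]])  # two fine steps, two fine points
print(points, flux)
```

With an odd refinement factor the interpolation returns the fine flux located exactly at the coarse point, which matches the remark that only odd factors allow fine and coarse fluxes to be computed at the same points.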

Bibliography [1] T. Abel, G. L. Bryan, and M. L. Norman. The formation and fragmentation of primordial molecular clouds. The Astrophysical Journal, 540:39–44, 2000. [2] R. Abgrall. Multiresolution analysis on unstructured meshes: applications to CFD. In K.W. Morton and M.J. Baines, editors, Numerical methods for fluid dynamics, volume 5, pages 271–276. Oxford Science Publications, 1995. [3] V. Agoshkov, A. Quarteroni, and G.Rozza. A mathematical approach in the design of arterial bypass using unsteady Stokes equations. J. Sci. Comput., 28(2–3):139–161, 2006. [4] A. S. Almgren, J. B. Bell, P. Colella, L. H. Howell, and M. L. Welcome. A conservative adaptive projection method for the variable density incompressible Navier-Stokes equations. J. Comput. Phys., 142:1–46, 1998. [5] American National Standards Institute. ANSI Fortran X3.9–1978, 1978. [6] J. D. Anderson. Modern compressible flow. McGraw-Hill, 1982. [7] M. Anderson, E. W. Hirschmann, S. L. Liebling, and D. Neilsen. Relativistic MHD with adaptive mesh refinement. Classical Quantum Gravity, 23:6503–6524, 2006. [8] F. Arandiga, ` A. Baeza, and A. M. Belda. Interpolacion ´ WENO para valores puntuales. In Proceedings of the XVIII CEDYA/VIII CMA Conference, 2003. (in Spanish).

228

BIBLIOGRAPHY

[9] F. Arandiga ` and R. Donat. Nonlinear multiscale decompositions: the approach of A. Harten. Numer. Algorithms, 23:175–216, 2000. [10] D. C. Arney. An adaptive method with mesh moving and mesh refinement for solving the Euler equations. In ASME, SIAM, and APS, National Fluid Dynamics Congress, 1st, Cincinnati, OH, July 25-28, 1988, 1988. AIAA Paper 88–3567–CP. [11] N. Aslan and T. Kammash. A Riemann solver for the twodimensional MHD equations. Int. J. Numer. Meth. Fluids, 25(8):953–957, 1998. [12] I. Babuska and B. Guo. The h-p version of the finite element method for domains with curved boundaries. SIAM J. Numer. Anal., 25:837–861, 1998. [13] A. Baeza and P. Mulet. Adaptive mesh refinement techniques for high order shock capturing schemes for hyperbolic systems of conservation laws. Technical Report GrAN 04–02, Departament de Matematica ` Aplicada, Universitat de Val`encia, Spain, 2004. [14] A. Baeza and P. Mulet. Adaptive mesh refinement techniques for high order shock capturing schemes for multi-dimensional hydrodynamic simulations. Technical Report GrAN 05–01, Departament de Matematica ` Aplicada, Universitat de Val`encia, Spain, 2005. [15] A. Baeza and P. Mulet. Adaptive mesh refinement techniques for high-order shock capturing schemes for multi-dimensional hydrodynamic simulations. Int. J. Numer. Meth. Fluids, 52:455–471, 2006. [16] G. K. Batchelor. An introduction to fluid dynamics. Cambridge University Press, 2000. [17] P. D. Bates, S. N. Lane, and R. I. Ferguson, editors. Computational fluid dynamics: applications in environmental hydraulics. Wiley, 2005. [18] A. M. Belda. Weighted ENO y aplicaciones. Technical Report GrAN 04–03, Departament de Matematica ` Aplicada, Universitat de Val`encia, Spain, 2004. (in Spanish). [19] J. Bell, M. J. Berger, J. Saltzman, and M. Welcome. Threedimensional adaptive mesh refinement for hyperbolic conservation laws. SIAM J. Sci. Comput., 15(1):127–138, 1994.

BIBLIOGRAPHY

229

[20] T. Belytschko, Y. Krongauz, D. Organ, M. Fleming, and P. Krysl. Meshless methods: and overview and recent developments. Comput. Methods Appl. Mech. Engrg., 139:3–47, 1996. [21] M. J. Berger. Adaptive mesh refinement for hyperbolic partial differential equations. PhD thesis, Computer Science Dept., Stanford University, 1982. [22] M. J. Berger and P. Colella. Local adaptive mesh refinement for shock hydrodynamics. J. Comput. Phys., 82:64–84, 1989. [23] M. J. Berger and J. Oliger. Adaptive mesh refinement for hyperbolic partial differential equations. Technical Report NA–83–02, Computer Science Department, Stanford University, 1983. [24] M. J. Berger and J. Oliger. Adaptive mesh refinement for hyperbolic partial differential equations. J. Comput. Phys., 53:484–512, 1984. [25] O. Boiron, G. Chiavassa, and R. Donat. A high-resolution penalization method for large Mach number flows in the presence of obstacles. Computers & Fluids, 38(3):703–714, 2009. [26] J. H. Boldstad. An adaptive finite difference method for hyperbolic systems in one space dimension. PhD thesis, Computer Science Dept., Stanford University, 1982. [27] J. U. Brackbill and J. S. Saltzman. Adaptive zoning for singular problems in two dimensions. J. Comput. Phys., 46(3):342–368, 1982. [28] W. Briggs. A multigrid tutorial. SIAM, 1987. [29] Greg L. Bryan, T. Abel, and M. L. Norman. Achieving extreme resolution in numerical cosmology using adaptive mesh refinement: resolving primordial star formation. In Supercomputing ’01: Proceedings of the 2001 ACM/IEEE Conference on Supercomputing (published in CD-ROM format). ACM Press, 2001. [30] J. M. Burgers. A mathematical model illustrating the theory of turbulence. Adv. Appl. Mech., 1:171–199, 1948. [31] P. M. Campbell, K. D. Devine, J. E. Flaherty, L. G. Gervasio, and J. D. Teresco. Dynamic octree load balancing using space-filling curves. Technical Report CS-03-01, Williams College Department of Computer Science, 2003.

230

BIBLIOGRAPHY

[32] C. E. Castro and E. F. Toro. A Riemann solver and upwind methods for a two-phase flow model in non-conservative form. Int. J. Numer. Meth. Fluids, 50(3):275–307, 2006. [33] U.V. Catalyurek, E.G. Boman, K.D. Devine, D. Bozdag, R.T. Heaphy, and L.A. Riesen. Hypergraph-based dynamic load balancing for adaptive scientific computations. In Proc. of 21st International Parallel and Distributed Processing Symposium (IPDPS’07). IEEE, 2007. [34] W. L. Chen, F. S. Lien, and M. A. Leschziner. Local mesh refinement within a multi-block structured-grid scheme for general flows. Comput. Methods Appl. Mech. Engrg., 144:327–369, 1997. [35] G. Chiavassa and R. Donat. Point-value multiscale algorithms for 2D compressible flows. SIAM J. Sci. Comput., 20:805–823, 2001. [36] A. J. Chorin and J. E. Marsden. A mathematical introduction to fluid mechanics. Springer, New York, 3rd edition, 2000. [37] R. Clausius. Abhandlungen u ¨ ber die mechanische W¨ armetheorie. Braunschweig. F. Vieweg und Sohn, 1864. [38] Albert Cohen, Sidi Mahmoud Kaber, Siegfried Muller, ¨ and Marie Postel. Fully adaptive multiresolution finite volume schemes for conservation laws. Math. Comp., 72(241):183–225 (electronic), 2003. [39] P. Colella and P. Woodward. The piecewise parabolic method (PPM) for gas-dynamical simulations. J. Comput. Phys., 54:174–201, 1984. ¨ [40] R. Courant, K. Friedrichs, and H. Lewy. Uber die partiellen Differenzengleichungen der mathematischen Physik. Math. Ann., 100(1):32–74, 1928. English translation: ”On the partial difference equations of mathematical physics”, IBM Journal of Research and Development, 11:215–234, 1967. [41] C. M. Dafermos. Hyperbolic conservation laws in continuum physics. Springer, 2000. [42] A. Dervieux and J.-A. Desideri. Compressible flow solvers using unstructured grids. NASA STI/Recon Technical Report N, 94, June 1992.

BIBLIOGRAPHY

231

[43] K. D. Devine, E. G. Boman, R. T. Heaphy, B. A. Hendrickson, J. D. Teresco, J. Faik, J. E. Flaherty, and L. G. Gervasio. New challenges in dynamic load balancing. Appl. Numer. Math., 52(2–3):133–152, 2005. [44] R. Donat and A. Marquina. Capturing shock reflections: an improved flux formula. J. Comput. Phys., 125:42–58, 1996. [45] R. Donat and P. Mulet. Characteristic-based schemes for multiclass Lighthill-Whitham-Richards traffic models. J. Sci. Comput., 37(3):233–250, 2008. [46] B. Einfeldt. On Godunov–type schemes for gas dynamics. SIAM J. Numer. Anal., 25(2):294–318, 1988. [47] V. Elling. A Lax-Wendroff type theorem for unstructured grids. PhD thesis, Stanford University, 2004. [48] J. L. Ellzey, M. R. Hennecke, J. M. Picone, and E. S. Oran. The interaction of a shock with a vortex: shock distortion and the production of acoustic waves. Phys. Fluids, 7:172–184, 1995. [49] F. Eulderink and G. Mellema. General relativistic hydrodynamics with a Roe solver. Astron. Astrophys. Suppl. Ser., 110:587–623, 1995. [50] E. Fatemi, J. Jerome, and S. Osher. Solution of the hydrodynamic device model using high order non-oscillatory shock-capturing algorithms. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 10:232–244, 1991. [51] Fluent, Inc. http://www.fluent.com/software/fluent/index.htm. [52] S. Fromang, P. Hennebelle, and R. Teyssier. A high order Godunov scheme with constrained transport and adaptive mesh refinement for astrophysical magnetohydrodynamics. Astronomy and Astrophysics, 457(2):371–384, 2006. [53] K. Fukuyo. Application of computational fluid dynamics and pedestrian-behavior simulations to the design of task-ambient airconditioning systems of a subway station. Energy, 31(5), 2006. [54] P. Glaister. Approximate Riemann solutions of the shallow-water equations. J. Hydraul. Res., 26(3):293–306, 1988.

232

BIBLIOGRAPHY

[55] J. Glimm. Solutions in the large for nonlinear hyperbolic systems of equations. Comm. Pure Appl. Math., 18:697–715, 1965. [56] E. Godlewski and P.-A. Raviart. Numerical approximation of hyperbolic systems of conservation laws. Springer, New York, 1996. [57] S. K. Godunov. A finite difference method for the numerical computation of discontinuous solutions of the equations of fluid dynamics. Matematicheskii Sbornik, 47:271, 1959. [58] J. A. Greenough and W. J. Rider. A quantitative comparison of numerical methods for the compressible Euler equations: fifth order WENO and piecewise-linear Godunov. J. Comput. Phys., 196:259– 281, 2004. [59] B. E. Griffith, R. D. Hornung, D. M. McQueen, and C. S. Peskin. An adaptive, formally second order accurate version of the immersed boundary method. J. Comput. Phys., 223(1):10–49, 2007. [60] J.-F. Haas and B. Sturtevant. Interaction of weak shock waves with cylindrical and spherical inhomogeneities. J. Fluid Mech., 181:41– 76, 1987. [61] W. Hackbush. Verlag, 1985.

Multi-grid methods and applications.

Springer-

[62] A. Harten. Adaptive multiresolution schemes for shock computations. J. Comput. Phys., 115:319–338, 1994. [63] A. Harten. Multiresolution algorithms for the numerical solution of hyperbolic conservation laws. Comm. Pure Appl. Math., 48:1305– 1342, 1995. [64] A. Harten, B. Engquist, S. Osher, and S. R. Chakravarthy. Uniformly high order accurate essentially non-oscillatory schemes, III. J. Comput. Phys., 71(2):231–303, 1987. [65] A. Harten and J. M. Hyman. Self-adjusting grid methods for one-dimensional hyperbolic conservation laws. J. Comput. Phys., 50:235–269, 1983. [66] A. Harten, P. D. Lax, and B. van Leer. On upstream differencing and Godunov-type schemes for hyperbolic conservation laws. SIAM Rev., 25:35–61, 1983.

BIBLIOGRAPHY

233

[67] A. Harten and S. Osher. Uniformly high order accurate essentially non-oscillatory schemes, I. SIAM J. Numer. Anal., 24(2):279–309, 1987. [68] P. A. Henne, editor. Applied computational aerodynamics, volume 125 of Progress in Aeronautics and Astronautics. American Institute for Aeronautics and Astronautics, 1990. ¨ [69] D. Hilbert. Uber die stetige Abbildung einer Linie auf ein Fl¨achenstuck. ¨ Math. Ann., 38:459–460, 1891. [70] C. Hirsch. Numerical computation of internal and external flows (volume 1): fundamentals of numerical discretization. John Wiley & Sons, Inc., New York, NY, USA, 1988. [71] C. Hirsch. Numerical computation of internal and external flows (volume 2): computational methods for inviscid and viscous flow. John Wiley & Sons, Inc., New York, NY, USA, 1988. [72] T. Y. Hou and G. Le Floch. Why nonconservative schemes converge to wrong solutions: error analysis. Math. Comp., 62(206):497–530, 1994. [73] C. Hu. Numerical methods for hyperbolic equations on unstructured meshed. PhD thesis, Brown University, 1999. [74] M. E. Hubbard and N. Nikiforakis. A three-dimensional, adaptive, Godunov-type model for global atmospheric flows. Mon. Weather Rev., 130:1026–1039, 2003. [75] H. Hugoniot. Sur la propagation du movement dans les coprs et sp´ecialement dans les gaz parfaits. J. Ecole Polytechnique, 57:3– 97, 1887. [76] ISO. The ANSI C standard (C99). Technical Report WG14 N1124, ISO/IEC, 1999. [77] C. Jablonowski. Adaptive grids in weather and climate prediction. PhD thesis, University of Michigan, 2004. [78] A. Jameson. Aerodynamic design via control theory. J. Sci. Comput., 3:233–260, 1988. [79] A. Jameson, W. Schmidt, and E. Turkel. Numerical solution of the Euler equations by finite volume methods. In 14th AIAA Fluid and Plasma Dynamics Conference, 1981. AIAA Paper 81-1259.

234

BIBLIOGRAPHY

[80] L. Jameson. High order schemes for resolving waves: number of points per wavelength. J. Sci. Comput., 15(4):417–439, 2000. [81] G.-S. Jiang and C.-W. Shu. Efficient implementation of weighted ENO schemes. J. Comput. Phys., 126(1):202–28, 1996. [82] J.-C. Jouhaud and M. Borrel. Discontinuous Galerkin and MUSCL strategies for an adaptative mesh refinement method. In Fifteenth International Conference on Numerical Methods in Fluid Dynamics, pages 400–405, 1997. [83] S. Karni. Multicomponent flow calculations by a consistent primitive algorithm. J. Comput. Phys., 112(1):31–43, 1994. [84] Karypis Lab. http://glaros.dtc.umn.edu/gkhome/views/metis. [85] R. Keppens, M. Nool, G. Toth, ´ and J. P. Goedbloed. Adaptive mesh refinement for conservative systems: multi-dimensional efficiency evaluation. Comput. Phys. Comm., 153:39–339, 2003. [86] R. Kimura. Numerical weather prediction. J. Wind Eng. Ind. Aerodyn., 90(12–15):1403–1414, 2002. [87] H. O. Kreiss and J. Oliger. Comparison of accurate methods for the integration of hyperbolic equations. Tellus, XXIV:3, 1972. [88] N. Kroll. ADIGMA – A European project on the development of adaptive higher order variational methods for aerospace applications. In P. Wesseling, E. Onate, ˜ and J. P´eriaux, editors, European Conference on Computational Fluid Dynamics (ECCOMAS CFD 2006), 2006. [89] D. Kroner, M. Rokyta, and M. Wierse. A Lax-Wendrof type theorem for upwind finite volume schemes in 2D. East-West J. Numer. Math, 4:279–292, 1996. [90] P. K. Kundu and I. M. Cohen. Fluid mechanics. Academic Press, 4th edition, 2008. [91] S. H. Lamb. Hydrodynamics. Cambridge University Press, 6th edition, 1975. [92] Z. Lan, V. E. Taylor, and G. Bryan. A novel dynamic load balancing scheme for parallel systems. J. Parallel Distrib. Comput., 62(12):1763 – 1781, 2002.

BIBLIOGRAPHY

235

[93] L. D. Landau and E. M. Lifshitz. Fluid mechanics. Course of theoretical physics, vol. 6. Pergamon Press, Oxford, 2nd edition, 1987. [94] G. Lapenta. A recipe to detect the error in discretization schemes. Int. J. Numer. Meth. Engng., 59:2065–2087, 2004. [95] M. Latini, O. Schilling, and W. S. Don. Effects of WENO flux reconstruction order and spatial resolution on reshocked twodimensional Richtmyer-Meshkov instability. J. Comput. Phys., 221(2):805–836, 2007. [96] P. D. Lax. Weak solutions of nonlinear hyperbolic equations and their numerical computation. Comm. Pure Appl. Math., 7:159–193, 1954. [97] P. D. Lax. Asymptotic solutions of oscillatory initial value problems. Duke Math. J., 24:627–646, 1957. [98] P. D. Lax. Hyperbolic systems of conservation laws, II. Comm. Pure Appl. Math., 10:537–566, 1957. [99] P. D. Lax. Shock waves and entropy. In E.A. Zarantonello, editor, Contributions to nonlinear functional analysis, pages 603–634. Academic Press, 1971. [100] P. D. Lax. Hyperbolic systems of conservation laws and the mathematical theory of shock waves, volume 11 of CBMS-NSF Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics, 1973. [101] P. D. Lax and X.-D. Liu. Solution of two dimensional Riemann problem of gas dynamics by positive schemes. SIAM J. Sci. Comput., 19:319–340, 1995. [102] P. D. Lax and R. D. Richtmyer. Stability of difference equations. Comm. Pure Appl. Math., 9:267–293, 1956. [103] P. D. Lax and B. Wendroff. Systems of conservation laws. Comm. Pure Appl. Math., 13:217–237, 1960. [104] R. J. LeVeque. Numerical methods for conservation laws. Birkh¨auser Verlag, 1992. [105] R. J. LeVeque. Finite-volume methods for hyperbolic problems. Cambridge University Press, 2004.

236

BIBLIOGRAPHY

[106] S. Li. Adaptive mesh methods for time-dependent partial differential equations. PhD thesis, University of Minnesota, 1998. [107] S. Li. Comparison of refinement criteria for structured adaptive mesh refinement. J. Comput. Appl. Math., 2009. [108] S. Li and J. M. Hyman. Adaptive mesh refinement for finite difference WENO schemes. Technical Report LA-UR-03-8927, Los Alamos National Laboratory, 2003. [109] W. T. Lin, C. H. Wang, and X. S. Chen. A comparative study of conservative and nonconservative schemes. Adv. Atmos. Sci., 20(5):810–814, 2003. [110] R. Liska and B. Wendroff. Comparison of several difference schemes on 1D and 2D test problems for the Euler equations. SIAM J. Sci. Comput., 25(3):995–1017, 2003. [111] T.-P. Liu. The entropy condition and the admissibility of shocks. J. Math. Anal. Appl., 53:78–88, 1976. [112] T.-P. Liu. Admissible solutions of hyperbolic conservation laws. Mem. Amer. Math. Soc., 30(240), 1981. [113] X-D. Liu, S. Osher, and T. Chan. Weighted essentially nonoscillatory schemes. J. Comput. Phys., 115:200–212, 1994. [114] B. J. Lucier. A moving mesh numerical method for hyperbolic conservation laws. Math. Comp., 46:59–69, 1986. [115] R. W. MacCormack. The effect of viscosity in hypervelocity impact cratering. In AIAA Hypervelocity Impact Conference, 1969. AIAA Paper 69-354. [116] P. MacNeice, K. M. Olson, and C. Mobarry. PARAMESH: a parallel adaptive mesh refinement community toolkit. Comput. Phys. Comm., 126:330–354, 2000. Available at http://sourceforge.net/projects/paramesh/. [117] A. Marquina. Local piecewise hyperbolic reconstruction of numerical fluxes for nonlinear scalar conservation laws. SIAM J. Sci. Comput., 15(4):892–915, 1994. [118] A. Marquina and P. Mulet. A flux-split algorithm applied to conservative models for multicomponent compressible flows. J. Comput. Phys., 185(1):120–138, 2003.

BIBLIOGRAPHY

237

[119] B. Massey and J. Ward-Smith. Mechanics of fluids. Taylor & Francis, 8th edition, 2005. [120] S. F. McCormick. Multilevel adaptive methods for partial differential equations. SIAM Frontiers in Applied Mathematics, 1989. [121] B. Merriman. Understanding the Shu-Osher conservative finite difference form. J. Sci. Comput., 19(1–3):309–322, 2003. [122] E. E. Meshkov. Instability of the interface of two gases accelerated by a shock wave. Soviet Fluid Dynamics, 4:101–104, 1969. [123] Message Passing Interface Forum. MPI: A message-passing interface standard. Version 2.1. Available at http://www.mpi-forum.org/docs/docs.html. [124] R. Mittal and G. Iaccarino. Immersed boundary methods. Annu. Rev. Fluid Mech., 37:239–261, 2005. [125] S. Mizohata. Some remarks on the Cauchy problem. J. Math. Kyoto Univ., 1:109–127, 1961. [126] B. Mohammadi and O. Pironneau. Applied shape optimization for fluids. Oxford University Press, 2001. [127] MPICH Home Page. http://www.mcs.anl.gov/research/projects/mpi/mpich1/. [128] P. Mulet and A. Baeza. Highly accurate conservative finite difference schemes and adaptive mesh refinement techniques for hyperbolic systems of conservation laws. In A. Bermudez ´ de Castro, D. Gomez, ´ P. Quintela, and P. Salgado, editors, Numerical mathematics and advanced applications. Proceedings of ENUMATH 2005, pages 198–206, 2006. [129] N.-T. Nguyen and T. Wereley. Fundamentals and applications of microfluidics. Artech House MEMS Series, 2002. [130] O. Oleinik. Discontinuous solutions of nonlinear differential equations. Amer. Math. Soc. Transl. Ser. 2, 26:95–172, 1957. [131] OpenCFD, Ltd. http://www.opencfd.co.uk/openfoam/index.html.

238

BIBLIOGRAPHY

[132] B. O’Shea, G. Bryan, J. Bordner, M. Norman, T. Abel, R. Harkness, and A. Kritsuk. Introducing Enzo, an AMR cosmology application. In T. Plewa, T. Linde, and V. G. Weirs, editors, Adaptive mesh refinement - theory and applications, volume 41 of Lecture Notes in Computational Science and Engineering. Springer, 2005. [133] C. Othmer, E. de Villiers, and H. G. Weller. Implementation of a continuous adjoint for topology optimization of ducted flows. In 18th AIAA Computational Fluid Dynamics Conference, 2007. AIAA Paper 2007-3947. [134] M. Pandolfi and D. D’Ambrosio. Numerical instabilities in upwind methods: analysis and cures for the ’carbuncle’ phenomenon. J. Comput. Phys., 166:271–301, 2001. [135] M. Parashar and J. C. Browne. On partitioning dynamic adaptive grid hierarchies. In Proceedings of the 29th Annual Hawaii International Conference on System Sciences, pages 604–613, 1996. [136] G. Peano. Sur une courbe, qui remplit toute une aire plane. Math. Ann., 36(1):157–160, 1890. [137] R. B. Pember, J. B. Bell, P. Colella, W. Y. Crutchfield, and M. L. Welcome. An adaptive cartesian grid method for unsteady compressible flow in irregular regions. J. Comput. Phys., 120(2):278–304, 1995. [138] K.G. Powell, P.L. Roe, T.J. Linde, T.I. Gombosi, and D.L. De Zeeuw. A solution-adaptive upwind scheme for ideal magnetohydrodynamics. J. Comput. Phys., 154:284–309, 1999. [139] J. J. Quirk. An adaptive grid algorithm for computational shock hydrodynamics. PhD thesis, Cranfield Institute of Technology, 1991. [140] J. J. Quirk. A contribution to the great Riemann solver debate. Int. J. Numer. Meth. Fluids, 18(6):555–574, 1994. [141] J. J. Quirk. A parallel adaptive grid algorithm for computational shock hydrodynamics. Appl. Numer. Math., 20:427–453, 1996. [142] J. J. Quirk and S. Karni. On the dynamics of a shock-bubble interaction. J. Fluid Mech., 318:129–163, 1996. [143] W. J. M. Rankine. On the thermodynamic theory of waves of finite longitudinal disturbance. Phil. Trans. 
Roy. Soc. London, 160:277– 288, 1870.

BIBLIOGRAPHY

239

[144] A. Rault, G. Chiavassa, and R. Donat. Shock-vortex interactions at high mach numbers. J. Sci. Comput., 19(1-3):347–371, 2003. [145] C. A. Rendleman, V. E. Beckner, M. Lijewski, W. Y. Crutchfield, and J. B. Bell. Parallelization of structured, hierarchical adaptive mesh refinement algorithms. Comput. Visual. Sci., 3:137–147, 2000. [146] R. D. Richtmyer. Taylor instability in a shock acceleration of compressible fluids. Comm. Pure Appl. Math., 13:297–319, 1960. [147] R. D. Richtmyer and K. W. Morton. Difference methods for initialvalue problems, volume 4 of Interscience Tracts in Pure and Applied Mathematics. Wiley Interscience, New York, U.S.A., 2nd edition, 1967. [148] P. L. Roe. Approximate Riemann solvers, parameter vectors, and difference schemes. J. Comput. Phys., 43:357–372, 1981. [149] A. M. Roma, C. S. Peskin, and M. J. Berger. An adaptive version of the immersed boundary method. J. Comput. Phys., 153(2):509– 534, 1999. [150] O. Roussel and M. P. Errera. Adaptive mesh refinement: A wavelet point of view. In European Congress on Computational Methods in Applied Sciences and Engineering (ECCOMAS), 2000. [151] H. Sagan. Space-filling curves. Springer-Verlag, 1994. [152] Sandia National Laboratories. http://www.cs.sandia.gov/Zoltan/. [153] K. Schloegel, G. Karypis, and V. Kumar. Multilevel diffusion algorithms for repartitioning of adaptive meshes. J. Parallel Distrib. Comput., 47:109–124, 1997. [154] K. Schloegel, G. Karypis, and V. Kumar. A unified algorithm for load-balancing adaptive scientific simulations. In Supercomputing ’00: Proceedings of the 2000 ACM/IEEE Conference on Supercomputing (CDROM), page 59, Washington, DC, USA, 2000. IEEE Computer Society. [155] C.W. Schulz-Rinne, J.P. Collins, and H.M. Glaz. Numerical solution of the Riemann problem for two-dimensional gas dynamics. SIAM J. Sci. Comput., 14:1394–1414, 1993.

240

BIBLIOGRAPHY

[156] D. Schwamborn, T. Gerhold, and R. Heinrich. The DLR TAU-Code: Recent applications in research and industry. In P. Wesseling, E. Onate, ˜ and J. P´eriaux, editors, European Conference on Computational Fluid Dynamics (ECCOMAS CFD 2006), 2006. [157] J. Shi, Y.-T. Zhang, and C.-W. Shu. Resolution of high order WENO schemes for complicated flow structures. J. Comput. Phys., 186:690–696, 2003. [158] C.-W. Shu. Numerical experiments on the accuracy of ENO and modified ENO schemes. J. Sci. Comput., 5:127–149, 1990. [159] C.-W. Shu. Essentially non-oscillatory and weighted essentially non-oscillatory schemes for hyperbolic conservation laws. In Alfio Quarteroni, editor, Advanced numerical approximation of nonlinear hyperbolic equations, volume 1697 of Lecture Notes in Mathematics, pages 325–432. Springer, 1998. [160] C.-W. Shu and S. Osher. Efficient implementation of essentially non-oscillatory shock-capturing schemes. J. Comput. Phys., 77(2):439–471, 1988. [161] C.-W. Shu and S. Osher. Efficient implementation of essentially non-oscillatory shock-capturing schemes, II. J. Comput. Phys., 83(1):32–78, 1989. [162] U. Shumlak and J. Loverich. Approximate Riemann solver for the two-fluid plasma model. J. Comput. Phys., 187(2):620–638, 2003. [163] W. C. Skamarock and J. B. Klemp. Adaptive grid refinement for two-dimensional and three-dimensional nonhydrostatic atmospheric flow. Mon. Weather Rev., 121:788–804, 1993. [164] S. W. Skillman, B. W. O’Shea, E. J. Hallman, J. O. Burns, and M. L. Norman. Cosmological shocks in adaptive mesh refinement simulations and the acceleration of cosmic rays. Astrophys. J., 689(2):1063–1077, 2008. [165] J. Smoller. Shock waves and reaction-diffusion equations, volume 258 of A series of comprehensive studies in mathematics. SpringerVerlag, 1994. [166] G. A. Sod. A survey of several finite difference methods for systems of nonlinear hyperbolic conservation laws. J. Comput. Phys., 27:1– 31, 1978.


[167] J. C. Strikwerda. Finite difference schemes and partial differential equations. Wadsworth and Brooks, 1989.

[168] Lord Rayleigh (J.W. Strutt). Investigation of the character of the equilibrium of an incompressible heavy fluid of variable density. Proceedings of the London Mathematical Society, 14:170–177, 1883.

[169] Lord Rayleigh (J.W. Strutt). Scientific papers, volume II. Cambridge University Press, 1900.

[170] H. Tang and T. Tang. Adaptive mesh methods for one- and two-dimensional hyperbolic conservation laws. SIAM J. Numer. Anal., 41(2):487–515, 2004.

[171] C. A. Taylor, T. J. R. Hughes, and C. K. Zarins. Finite element modeling of blood flow in arteries. Comput. Methods Appl. Mech. Engrg.

[172] Sir G. I. Taylor. The instability of liquid surfaces when accelerated in a direction perpendicular to their planes. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci., 201:192–196, 1950.

[173] The NPARC Alliance. NPARC Alliance policies and plans. Technical Report FY06-FY07, NPARC, 2005.

[174] Lord Kelvin (Sir William Thomson). Mathematical and physical papers, Vol. 4, hydrodynamics and general dynamics. Cambridge University Press, 1910.

[175] E. F. Toro. Riemann solvers and numerical methods for fluid dynamics. Springer-Verlag, third edition, 2009.

[176] E. Turkel. Composite methods for hyperbolic equations. SIAM J. Numer. Anal., 44(4):744–759, 1981.

[177] B. van Leer. Towards the ultimate conservative finite difference scheme, I. The quest of monotonicity. Lecture Notes in Physics, 18:163–168, 1973.

[178] B. van Leer. Towards the ultimate conservative finite difference scheme, II. Monotonicity and conservation combined in a second order scheme. J. Comput. Phys., 14:361–370, 1974.


[179] B. van Leer. Towards the ultimate conservative finite difference scheme, III. Upstream-centered finite-difference schemes for ideal compressible flow. J. Comput. Phys., 23:263–275, 1977.

[180] B. van Leer. Towards the ultimate conservative finite difference scheme, IV. A new approach to numerical convection. J. Comput. Phys., 23:276–299, 1977.

[181] B. van Leer. Towards the ultimate conservative finite difference scheme, V. A second order sequel to Godunov’s method. J. Comput. Phys., 32:101–136, 1979.

[182] J.-L. Vay, P. Colella, J. W. Kwan, P. McCorquodale, D. B. Serafini, A. Friedman, D. P. Grote, G. Westenskow, J.-C. Adam, A. Héron, and I. Haber. Application of adaptive mesh refinement to particle-in-cell simulations of plasmas and beams. Phys. Plasmas, 11(5):2928–2934, 2004.

[183] R. Verfürth. A review of a posteriori error estimation and adaptive mesh-refinement techniques. John Wiley/Teubner, 1996.

[184] H.L.F. von Helmholtz. Über discontinuierliche Flüssigkeitsbewegungen. Monatsberichte der königl. Akad. der Wissenschaften zu Berlin, 23:215–228, 1868.

[185] R. Walden and D. Folini. A-MAZE: a code package to compute 3D magnetic flows, 3D NLTE radiative transfer and synthetic spectra. In Thermal and ionization aspects of flows from hot stars: observations and theory, ASP Conference Series, volume 204, pages 281–284, 2000.

[186] R. F. Warming and R. W. Beam. Upwind second order difference schemes with applications in aerodynamic flows. AIAA Journal, 24:1241–1249, 1976.

[187] B. Wendroff. The Riemann problem for materials with nonconvex equation of state. J. Math. Anal. Appl., 38:454–466, 1972.

[188] F. M. White. Fluid mechanics. McGraw-Hill, 5th edition, 2003.

[189] D. C. Wilcox. Turbulence modeling for CFD. D. C. W. Industries, 2002.

[190] A. Winslow. Adaptive mesh zoning by the equipotential method. Technical Report UCID19062, Lawrence Livermore Laboratory, 1981.


[191] M. Zhang, C.-W. Shu, G. C. K. Wong, and S. C. Wong. A weighted essentially non-oscillatory numerical scheme for a multi-class Lighthill-Whitham-Richards traffic flow model. J. Comput. Phys., 191(2):639–659, 2003.

[192] W. Zhang and A. I. MacFadyen. RAM: A relativistic adaptive mesh refinement hydrodynamics code. The Astrophysical Journal Supplementary Series, 164:255–279, 2006.

[193] U. Ziegler. A three-dimensional Cartesian adaptive mesh code for compressible magnetohydrodynamics. Comput. Phys. Comm., 116:65–77, 1999.