Efficient, Visual, and Interactive Architectural Design Optimization with Model-based Methods

Submitted by Thomas Wortmann

Thesis Advisors:
Prof. Thomas Schroepfer, Architecture and Sustainable Design
Giacomo Nannicini, IBM T. J. Watson Research Center

Submitted to the Singapore University of Technology and Design in fulfillment of the requirement for the degree of Doctor of Philosophy

2018

PhD Thesis Examination Committee

Chair: Bige Tunçer, Associate Professor, Architecture and Sustainable Design, SUTD

Main Advisor: Thomas Schroepfer, Professor, Architecture and Sustainable Design, SUTD

Co-Advisor: Giacomo Nannicini, IBM Thomas J. Watson Research Center; Assistant Professor, Engineering Systems and Design, SUTD (2011 to 2016)

Committee Member: Kristin Wood, Professor, Engineering Product Development, SUTD

Committee Member: Karthik Natarajan, Associate Professor, Engineering Systems and Design, SUTD

External Committee Member: Stephen Cairns, Professor, Swiss Federal Institute of Technology Zurich

Declaration

I hereby confirm the following:

• The thesis work is original and has not been submitted to any other University or Institution for higher degree purposes.

• I hereby grant SUTD the permission to reproduce and distribute publicly paper and electronic copies of this thesis document in whole or in part in any medium now known or hereafter created in accordance with the Policy on Intellectual Property, clause 4.2.2.

• I have fulfilled all requirements as prescribed by the University and provided 1 copy of this thesis in PDF.

• I have attached all publications and awards related to this thesis.

• The thesis does not contain patentable or confidential information.

• I certify that the thesis has been checked for plagiarism via iThenticate. The score is 27%.

Thomas Wortmann, July 16, 2018

Abstract

Increasing applications of parametric design and performance simulations by architectural designers present opportunities to design more resource- and energy-efficient buildings via optimization. But Architectural Design Optimization (ADO) is less widespread than one might expect, due to, among other challenges, (1) lacking knowledge on simulation-based optimization, (2) a bias towards inefficient optimization methods—such as genetic algorithms (GAs)—in the building optimization literature, (3) lacking state-of-the-art, easy-to-use optimization tools, and, perhaps most importantly, (4) the problematic integration of optimization with architectural design. This problematic integration stems from a contrast between “wicked” or “co-evolving” architectural design problems, which exhibit vague and changing problem definitions, and optimization problems, which require problem definitions to be explicit and unchanging.

This thesis presents an interdisciplinary study of ADO that draws on design theory, building optimization, mathematical optimization, and multivariate visualization. To address the first three challenges, the thesis (1) surveys existing optimization methods and benchmark results from the mathematical and building optimization literatures, (2) benchmarks a representative set of optimization methods on seven problems that involve structural, energy, and daylighting simulations, and (3) provides Opossum, a state-of-the-art, easy-to-use optimization tool. Opossum employs RBFOpt, a model-based optimization method that “machine-learns” the shapes of fitness landscapes while searching for well-performing design candidates. In the benchmark, RBFOpt emerges as the most efficient optimization method and the GA as the least efficient. To mitigate the contrast between architectural and optimization problems, the thesis (4) proposes performance-informed design space exploration (DSE), a novel concept that emphasizes selection, refinement, and understanding over finding highest-performing design candidates, (5) presents Performance Maps, a novel visualization method for fitness landscapes, (6) implements Performance Maps in the Performance Explorer, an interactive, visual tool for performance-informed DSE, and (7) evaluates the Performance Explorer through a user test with thirty participants. In this test, the Performance Explorer emerges as more supportive and enjoyable to use than manual search or optimization. In short, the thesis offers tools for ADO and performance-informed DSE that are more efficient and that better acknowledge the “wickedness” of architectural design problems.

Acknowledgement

I greatly appreciate the support I received over the past four years, without which this thesis would have been impossible. My advisor Prof. Thomas Schroepfer gave the initial impetus for this research and continuously supported it. My co-advisor Giacomo Nannicini displayed seemingly unending patience in explaining black-box optimization concepts to a novice, correcting factual errors, and—perhaps most importantly—taking suggestions for the development of RBFOpt into account. Without Giacomo’s collaboration, Opossum and the Performance Explorer would not exist in their current forms.

My committee members, and especially Prof. Stephen Cairns, not only offered their time and support, but also challenged me to sharpen my ideas. Assoc. Prof. Bige Tunçer chaired the committee and advised on the Performance Explorer’s user test. Special thanks are due to Prof. Kristin Wood and Assoc. Prof. Karthik Natarajan for joining the committee relatively late.

Researchers Dimitry Demin and Akbar Zuardin assisted in the development of Opossum and the Performance Explorer. As student researchers, Chu Wy Ton continues to improve them, and Christyasto Priyonggo Pambudi partially automated the benchmarking of optimization tools in Grasshopper. The development also benefited from the excellent advice and examples provided on the McNeel forums, frequently by David Rutten. For the energy optimization benchmarks, I collaborated with Christopher Waibel, a colleague and friend who is a PhD candidate at ETH Zurich. The SUTD-MIT International Design Centre supported the development and evaluation of RBFOpt, Opossum, and the Performance Explorer under grant numbers IDG215001100 and IDG2170010.

SUTD provided a stimulating and enjoyable environment in which to pursue this research, through its faculty and especially through my friends and colleagues in the ASD PhD program. Asst. Prof. Alstan Jakubiec always had an open ear for daylighting simulation-related questions and generously and dependably supplied free DIVA licenses. I would like to thank Zack Conti for many engaging and clarifying discussions, Elif Erdine for her conscientious reading of the draft and constructive comments, Ramanathan Subramanian for his advice on SUTD’s Institutional Review Board, and Ludovica Tomarchio for finding additional participants for the user test. I also would like to thank the user test’s participants for their time, enthusiasm, and feature requests.

Finally, I would like to thank my parents for their ever-indulgent patience.

Journal Publications and Book Chapters

Costa A, Nannicini G, Schroepfer T, Wortmann T (2015) Black-Box Optimization of Lighting Simulation in Architectural Design. In: Cardin M-A, Krob D, Lui PC, et al. (eds) Complex Systems Design & Management Asia. Springer, Cham, CH, pp 27–39

Wortmann T (2017) Model-based Optimization for Architectural Design: Optimizing Daylight and Glare in Grasshopper. Technology | Architecture + Design 1:2, pp 176–185

Wortmann T (2017) Surveying design spaces with performance maps. International Journal of Architectural Computing 15:1, pp 38–53

Wortmann T, Costa A, Nannicini G, Schroepfer T (2015) Advantages of Surrogate Models for Architectural Design Optimization. Artificial Intelligence for Engineering Design, Analysis and Manufacturing 29:4, pp 471–481

Wortmann T, Nannicini G (2017) Introduction to Architectural Design Optimization. In: Karakitsiou A, Migdalas A, Pardalos PM, Rassia S (eds) City Networks: Planning for Health and Sustainability. Springer International Publishing, Cham, CH, pp 259–278

Wortmann T, Tuncer B (2017) Differentiating parametric design: Digital workflows in contemporary architecture and construction. Design Studies 52, pp 173–197

Wortmann T, Stouffs R (2018) Algorithmic Complexity of Shape Grammar Implementation. Artificial Intelligence for Engineering Design, Analysis and Manufacturing 32:2, pp 138–146

Conference Publications

Poirriez C, Wortmann T, Hudson R, Bouzida Y (2016) From complex shape to simple construction: fast track design of “The Future of Us” gridshell in Singapore. In: Kawaguchi K, Ohsaki M, Takeuchi T (eds) Proceedings of the IASS Annual Symposium 2016 “Spatial Structures in the 21st Century.” IASS, Tokyo, JP

Royall E, Wortmann T (2015) Finding the State Space of Urban Regeneration: Modeling Gentrification as a Probabilistic Process using k-Means Clustering and Markov Models. In: Proceedings of CUPUM 2015 - Planning Support Systems and Smart Cities. Massachusetts Institute of Technology, Cambridge, MA

Wortmann T (2017) Opossum—Introducing and Evaluating a Model-based Optimization Tool for Grasshopper. In: Janssen P, Loh P, Raonic A, Schnabel MA (eds) Proceedings of the 22nd CAADRIA Conference. CAADRIA, Hong Kong, CN, pp 283–292

Wortmann T, Nannicini G (2016) Black-box optimization for architectural design: An overview and quantitative comparison of metaheuristic, direct search, and model-based optimization methods. In: Chien S-F, Choo S, Schnabel MA, et al. (eds) Proceedings of the 21st CAADRIA Conference. CAADRIA, Hong Kong, CN, pp 177–186

Wortmann T, Tuncer B (2015) Performative Design and Fabrication - A Modular Approach. In: Real Time - Proceedings of the 33rd eCAADe Conference. Vienna University of Technology, Vienna, AUT, pp 521–530

Wortmann T, Waibel C, Nannicini G, et al. (2017) Are Genetic Algorithms really the best choice for Building Energy Optimization? In: Proceedings of the Symposium on Simulation for Architecture & Urban Design. SCS, Toronto, CA, pp 51–58

Conference Workshops

Wortmann T, Lian A, Demin D (2016) Optimize this!, ETH Zurich, Advances in Architectural Geometry (AAG) Conference, Zurich, CH

Lian A, Zuardin A, Wortmann T, Cichocka J (2017) Workflows for Conceptual Architectural Design Optimization, Xi'an Jiaotong-Liverpool University, CAADRIA 2017: Protocols, Flows and Glitches, Suzhou, CN

Wortmann T, Cichocka J (2017) Interfacing Architecture, Engineering and Mathematical Optimization, HafenCity University Hamburg, Symposium of the International Association for Shell and Spatial Structures (IASS), Hamburg, DE

Awards for Opossum

Schroepfer T, Nannicini G, Wortmann T (2017) Good Design Award: Nominee

Schroepfer T, Nannicini G, Wortmann T (2017) SG Mark Award

Schroepfer T, Nannicini G, Wortmann T (2017) COIN-OR (Computational Infrastructure for Operations Research) Cup

Table of Contents

PhD Thesis Examination Committee
Declaration
Abstract
Acknowledgement
Journal Publications and Book Chapters
Conference Publications
Conference Workshops
Awards for Opossum
Figures
Tables
1 Introduction
1.1 Research Questions
1.2 Scope
1.3 Chapter Overview and Contributions
2 Towards Architectural Design Optimization
2.1 Challenges
2.2 Relevance
2.3 Chapter Summary
3 Optimization and Architectural Design Processes
3.1 Theories of Architectural Design Processes
3.2 Theorizing the Role of Computers in Architectural Design
3.3 Optimization in Architectural Design Processes
3.4 Chapter Summary
4 Visualizing Optimization Results
4.1 Representing Fitness Landscapes
4.2 Multivariate Visualizations
4.3 Chapter Summary
5 Black-Box Optimization
5.1 Single-Objective Optimization
5.2 Multi-Objective Optimization
5.3 Categories of Black-Box Optimization Methods
5.4 Metaheuristics
5.5 Direct Search
5.6 Model-Based Methods
5.7 Mathematical Benchmark Results
5.8 Black-Box Optimization Methods in ADO
5.9 Chapter Summary
6 Tools for Architectural Design Optimization
6.1 Optimization Tools for Architectural Designers
6.2 Opossum: A Novel, Model-based Optimization Tool
6.3 Chapter Summary
7 Quantitative Benchmarks
7.1 Benchmark Criteria and Methodology
7.2 Benchmark Problems
7.3 Benchmark Results
7.4 Limitations
7.5 Discussion
8 Introducing Performance Maps
8.1 A Novel Method for Visualizing Optimization Results
8.2 Visualizing an Example Problem
8.3 Applications of Performance Maps
8.4 Limitations of Performance Maps
8.5 Tools for Interactive, Performance-Informed DSE
9 Introducing and Evaluating the Performance Explorer
9.1 Performance Explorer Features and Interface
9.2 User Test Methodology
9.3 Responses for the Manual Method
9.4 Responses for the Automated Method
9.5 Responses for the Interactive Method
9.6 Quantitative Comparison of the Three Methods
9.7 Limitations
9.8 Discussion
10 Conclusion
10.1 Thesis Summary
10.2 Research Questions
10.3 Limitations
10.4 Contributions and Implications
10.5 Future Research
10.6 Conclusion
Appendix
Responses for the Manual Method
Responses for the Automated Method
Responses for the Interactive Method
Final Comments
Glossary
Bibliography
Index

Figures

Figure 1.1 Diagrams of (a) a design space in two dimensions with crosses representing design candidates, (b) design candidates with a performance dimension, i.e., points in a fitness landscape, and (c) a surrogate model of this fitness landscape.
Figure 2.1 The award-winning Future of Us Pavilion by SUTD’s Advanced Architecture Laboratory has a parametrically designed envelope. The envelope’s differentiated density lends itself to optimization.
Figure 3.1 Decomposition of requirements for an Indian village. Alexander (1964) develops a design by diagramming solutions for subproblems (A1, A2, etc.), integrating these diagrams into higher levels (A, B, C, D), and finally into a design for the entire village. Illustration: (Alexander, 1964, p. 151)
Figure 3.2 Four design strategies: Linear (top left), Trial-and-Error (bottom left), Successive Alternatives (top right), and Branching Alternatives (bottom right). Illustration: (Rittel, 1992, pp. 77–81)
Figure 3.3 Diagram of three paradigms of architectural design processes: Analysis-Synthesis, Generate-and-Test, and Co-Evolution. (P = Design Problem, D = Final Design, C = Design Candidates)
Figure 3.4 Compound model of a digital design process from Oxman (2006, p. 261). “R” stands for representation, “G” for generation, “E” for evaluation, and “P” for performance. Note that the human designer occupies the center and interacts with all the other elements (linkages marked with “i” for interaction).
Figure 3.5 Architectural designs “evolved” with a GA. Note the similarities between the designs, which indicate a restricted design space. Illustration: (Dillenburger & Lemmerzahl, 2011, p. 361)
Figure 3.6 Optimization result for the dome of the Louvre Abu Dhabi. Illustration: (Imbert et al. 2013)
Figure 3.7 Relationships of ADO and performance-based DSE to an architectural design cycle. This design cycle embeds a Generate-and-Test process in a larger, co-evolutionary cycle of problem (re-)definition and solution generation.
Figure 4.1 Each of the ten columns presents design candidates from one of 80 clusters. Illustration: (Stasiuk et al. 2014)
Figure 4.2 Pareto front of a generic structural design problem, with the two performance criteria of strength and cost. Yellow triangles indicate “Pareto-optimal” or “non-dominated” design candidates, where improvements in one performance criterion imply losses in the other. Blue dots indicate the remaining, “dominated” design candidates. Illustration: (Evins et al. 2012)
Figure 4.3 Four methods to visualize a set of 3-dimensional points in two dimensions: Parallel Coordinates (a), Radial Coordinates (b), Star Coordinates (c), and RadViz (d).
Figure 4.4 Screenshot of Design Explorer (http://tt-acm.github.io/DesignExplorer/, accessed 24.10.2017)
Figure 4.5 (a) represents design candidates in a three-dimensional design space, (b) indicates the section planes shown in (c), a Matrix of Contour Plots. For this example, all design candidates lie on section planes. Typically, this planarity is not the case.
Figure 4.6 Three types of mappings between a high-dimensional space ℝⁿ and a two-dimensional space ℝ²: bijective (a), injective (b), and surjective (c).
Figure 5.1 Four methods to find Pareto fronts: (a) nondominated sorting, (b) scalarizing with weighted sums of objectives, (c) crowding to ensure diversity, and (d) maximizing hypervolume. Illustration: (Knowles and Nakayama 2008)
Figure 5.2 Behavior of black-box algorithms on the Branin function. This function has two variables and three global minima (indicated as gray dots). The algorithms optimized this function until they found a solution within 1% of a global minimum. The design candidates evaluated during the algorithms’ searches are indicated as black dots. Note that the metaheuristics (GA, SA, and PSO) required more function evaluations than the global direct search (DIRECT) and model-based (RBFOpt) methods.
Figure 5.3 Diagram of three iterations of the DIRECT algorithm. DIRECT subdivides promising regions of the design space while ignoring others.
Figure 5.4 Three types of simulation-based, black-box optimization: optimize the output of the exact simulation directly (a), optimize the approximating output of the surrogate model constructed from prior simulation results (b), and optimize and update the surrogate model during the optimization process (c).
Figure 5.5 Diagram of three iterations of a global, model-based algorithm. The dashed lines indicate the approximated design space.
Figure 6.1 Octopus’ visualization of a trade-off space. A cube represents a design candidate. The visualization depicts the design candidates in a single generation. At the end of the optimization process, users can browse through all generations. Illustration: www.food4rhino.com/app/octopus, accessed 29.10.2017
Figure 6.2 Design candidates’ morphologies arranged in a trade-off space in Octopus. Illustration: www.food4rhino.com/app/octopus, accessed 29.10.2017
Figure 6.3 Screenshot of modeFRONTIER 4.5 visualization options. Illustration: www.esteco.com/sites/default/files/design_space45.png, accessed 25.10.2017
Figure 6.4 The four tabs of Opossum’s GUI. From top left in clockwise order: (1) default choice and convergence graph, (2) stopping conditions and benchmarking, (3) “expert” settings, and (4) results list.
Figure 6.5 Opossum in Grasshopper. The curves on the left link to the variables and the one on the right to the objective.
Figure 7.1 Problem 1’s small transmission tower. The left diagram represents each of the 8 beam sets with a color. The right diagram shows the tower deforming under load (1000% exaggerated).
Figure 7.2 Problem 2’s dome under dead load. Each color represents one of the 22 beam sets.
Figure 7.3 Energy models of office buildings for problem 3 (a) and problems 4 and 5 (b). The small model on the left simulates only two zones (i.e., rooms), while the larger model on the right simulates a complete floor.
Figure 7.4 Diagram of the room Problem 6 optimizes in terms of daylight. The crosses indicate the sensor grid for simulating UDI.
Figure 7.5 Diagram of the room Problem 7 optimizes in terms of daylight and glare. The crosses indicate the sensor grid for simulating UDI, the cone the camera position and view for simulating DGP, and the numbers the attractor points regulating the façade’s porosity.
Figure 7.6 Problem 1: Convergence
Figure 7.7 Problem 1: Stability
Figure 7.8 Problem 2: Convergence
Figure 7.9 Problem 2: Stability
Figure 7.10 Problem 3: Convergence
Figure 7.11 Problem 3: Stability
Figure 7.12 Problem 4: Convergence
Figure 7.13 Problem 4: Stability
Figure 7.14 Problem 5: Convergence
Figure 7.15 Problem 5: Stability
Figure 7.16 Problem 6: Convergence
Figure 7.17 Problem 6: Stability
Figure 7.18 Problem 7: Convergence. Note the rapid improvement of RBFOpt after 40 iterations. Until this point, RBFOpt is quasi-randomly simulating design candidates. After 40 iterations (i.e., the number of variables of this problem), RBFOpt constructs the first surrogate model, which almost immediately results in a substantial improvement.
Figure 7.19 Problem 7: Stability
Figure 7.20 Problem 7 (Hypervolume): Convergence
Figure 7.21 Problem 7 (Hypervolume): Stability
Figure 7.22 Pareto fronts found during each algorithm’s most representative (i.e., median) run. “Best” is the combined front from all algorithms and runs, i.e., the most accurate. (The markers’ colors indicate the algorithm.)
Figure 7.23 Non-dominated wall screen design with 84% UDI and 24% DGP. For office spaces in tropical climates, such wall screens could deliver substantial energy savings (from cooling and artificial lighting), while avoiding the need for louvers and other movable elements. Schroepfer + Hee proposed similar wall screens for the façade of the New Jurong Church (Schroepfer 2012, pp 120–129).
Figure 7.24 Problems 1–7: Convergence
Figure 7.25 Problems 1–7: Stability. Note that DIRECT exhibits identical performance on individual problems, but varying performance relative to different problems.
Figure 8.1 362 explored designs of the example optimization problem represented as two-dimensional points with Star Coordinates and triangulated with the Delaunay algorithm.
Figure 8.2 This performance map represents a design space in terms of the estimated performance values of unexplored design candidates and indicates explored candidates. Note the groupings of well-performing candidates in the upper left corner and near the origin.
Figure 8.3 Visual comparison between a performance map with directly interpolated performance values (left), an exact performance map with simulated values (center), and a performance map with values approximated with a surrogate model (right).
Figure 8.4 Prediction errors for the interpolated and approximated performance maps from Figure 8.3.
Figure 8.5 Matrix of Contour Plots representing the estimated performance of unexplored design candidates in terms of pair-wise interactions between parameters. The circle represents the best solution found, which is the base point for the contours. The diagonal from lower left to upper right represents the eight parameters individually. The matrix is symmetrical across the diagonal.
Figure 8.6 Visual comparison between Parallel Coordinates, Performance Maps, and Matrix of Contour Plots. The three visualizations employ the same surrogate model to approximate the performance of unexplored design candidates. (The interpolated Parallel Coordinates were drawn using the results from the performance map.)
Figure 8.7 Diagram showing the relationship between design parameters and a performance map. Changing design parameters changes the position of the changed design candidate on the performance map (and thus indicates its approximated performance), while changing the location on the performance map adjusts design parameters (and thus displays the corresponding design candidate). Also, note the area for further automated exploration (optimization) indicated on the performance map.
Figure 9.1 The Performance Explorer in Rhinoceros 3D. From left to right: (1) morphology (i.e., appearance) of the current design candidate in Rhinoceros 3D, (2) definition of the parametric model in Grasshopper (note the green box with the number sliders representing the design variables and their current values), and (3) the Performance Explorer window.
Figure 9.2 The Performance Explorer window. (1) The performance map with the position cross and variable axes is on the left, and (2) the performance scale, (3) variable plot, and (4) “Simulate” and “Refresh” buttons are on the right (from top to bottom).
Figure 9.3 User test with three participants.
Figure 9.4 Example design candidates’ morphologies from the design task’s parametric model. The numbers indicate the maximal structural displacement in centimeters and the pink color the areas of high displacement. The design candidate on the top left is close to the best-known solutions.
Figure 9.5 Diagram of the relationship between the parametric model, structural simulation, Opossum, surrogate model, and the Performance Explorer for the three performance-informed DSE methods. All methods receive the simulation’s performance values as inputs and generate parameter values as outputs. The interactive method also exchanges parameter and (approximated) performance values with the surrogate model. Note that the automated method includes the manual method, and that the interactive method includes the other two methods.
Figure 9.6 Maximal structural displacement of the preferred design candidates selected by the participants.
Figure 9.7 Responses to the statement “This design is a promising starting point for further development.”
Figure 9.8 Responses to the statement “I got a good overview over potential design candidates (i.e., the design space).”
Figure 9.9 Responses to the prompt “Please rank the design methods based on how much they supported you with the design task.”
Figure 9.10 Responses to the prompt “Please rank the design methods based on how much you enjoyed using them.”
Figure 9.11 Mean scores for the three performance-informed DSE methods. The three ranks for support (Figure 9.9) and enjoyment (Figure 9.10) are scored as 1, 3, and 5. The scores for promising starting points (Figure 9.7) and the overview over the design space (Figure 9.8) are scored as 1, 2, 3, 4, and 5.

Tables

Table 2.1 Fields and roles of registered users of Opossum, 26 February 2018.
Table 3.1 Relation of the three paradigms of architectural design processes (section 3.1) and the respective roles of computers (section 3.2) and optimization (section 3.3) to the computational design concepts discussed in section 3.3.5.
Table 4.1 Overview of representations for fitness landscapes discussed in this chapter.
Table 5.1 Overview of algorithms and implementations in this chapter. Local methods marked with a * perform repeated local search from different starting points.
Table 6.1 Overview of ADO tools with performance-informed DSE features.
Table 7.1 Overview of benchmark problems in this chapter. n_C indicates the number of continuous variables, n_D the number of discrete variables, and t the required time for a single function evaluation in milliseconds on an Intel i7 6700K CPU with 4.0 GHz and eight threads. Center indicates the objective value for a solution with values in the center of each variable range, and Best the objective value for the best solution found in the benchmarks. Due to their formulation, the improvement from Center to Best for problems 5 and 6 is comparatively small.
Table 8.1 Comparison summary.
Table 8.2 Example of a color scale overlaying three normalized performance criteria. Although the criteria’s sum is identical, their addition results in different colors. This example is merely illustrative, since the resulting palette is not perceptually adequate and thus potentially misleading.
Table 9.1 Numbers of participants using a performance-informed DSE strategy for each DSE method. These numbers are indicative only and are based on the author’s interpretation of the participants’ responses.
Table 9.2 Comparison of the Performance Explorer with other ADO tools with performance-informed DSE features (section 5.1).

1 Introduction

This thesis studies the integration of optimization into architectural design processes by developing, benchmarking, and user-testing novel computational design tools that are more efficient and better acknowledge designers’ preferences than existing ones. Improving this integration will help architects to design more resource- and energy-efficient buildings and thus contribute to a more sustainable built environment. From the perspective of design theory, the thesis contributes to the reconciliation of two contrasting paradigms of architectural design: Co-Evolution and Generate-and-Test.

Co-Evolution characterizes architectural design as an unstructured process, during which the definition of a design problem and its potential solutions co-evolve (Dorst and Cross 2001). Architectural design problems thus are “wicked problems” that “one cannot first understand, then solve” (Rittel and Webber 1973). The wide range of designs typically submitted to architectural competitions, which often address slightly different problems, illustrates this indefiniteness. Co-Evolution corresponds to the widespread notion that architectural design is largely subjective and intuitive.

Generate-and-Test characterizes architectural design as a structured process, during which design candidates are generated from well-defined rules and relations and tested according to explicit criteria (Mitchell 1990). Although Generate-and-Test does not require the use of computational design methods such as parametric design and performance simulation, their increasing use (Scheer 2014) underscores this paradigm’s relevance. Parametric design automates the generation of design candidates, and performance simulations predict, for example, a design candidate’s material or energy consumption.

An important concept related to Generate-and-Test is the design space, which is the set of all potential design candidates for a given problem definition (Woodbury and Burrow 2006). For example, a parametric model explicitly defines a space of design candidates (Figure 1.1a). In a design space, one can transition from one design candidate to others to find better designs. The computational design literature terms this search design space exploration (DSE) and uses it as a model to understand design processes. Summarizing his chapter on design problems, Johnson (2017) describes how a generate-and-test process can emerge from a co-evolving one, and how a performance dimension can be added to a design space, for example through a performance simulation:

… research into wicked and situated problems supports the notion of puzzle making or co-evolution of problem definitions and solutions. Characteristics by which the problem is defined become characteristics by which a solution is measured and constraints defined. These can be taken to define a design state space that is useful for conceptualizing the problem–solution pair. This space provides a domain over which a utility or objective function might be defined, providing a framework within which to assess solution quality and perform design iterations until a suitable design is recognized or a stopping rule satisfied.

Figure 1.1 Diagrams of (a) a design space in two dimensions with crosses representing design candidates, (b) design candidates with a performance dimension, i.e., points in a fitness landscape, and (c) a surrogate model of this fitness landscape.

Adding performance dimensions to design spaces frames DSE as an optimization problem, and mathematical optimization as automated DSE.

But this thesis distinguishes mathematical optimization from performance-informed DSE: Optimization is only about finding the “best” solution from a solution set, while performance-informed DSE emphasizes informed choices and understanding design spaces (section 2.2.6). As such, performance-informed DSE aims to support the co-evolution of architectural design problems by harnessing the potential of automated Generate-and-Test processes.

A design space with a performance dimension is also known as a fitness landscape (Figure 1.1b). When evaluating parametric models with performance simulations, the shapes of their fitness landscapes (i.e., their problem characteristics) are typically unknown. A related concept is the surrogate model, which uses statistical or machine-learning methods to approximate an entire fitness landscape based on simulated design candidates (Figure 1.1c). This thesis assumes that surrogate models can support performance-informed DSE by allowing much faster explorations and visualizations based on approximated fitness landscapes.

Most optimization problems can be formulated in analytical (i.e., explicit) form, but it is also possible to optimize “black-box” functions that are impractical to formulate explicitly. For example, we might want to find the values for a parametric model which, according to a numerical performance simulation, result in an especially energy-efficient design candidate. From the perspective of black-box optimization, the parameters are the decision variables, and the parametric model together with its simulated energy consumption constitutes the implicit objective function. This objective function does not have to be known as an explicit mathematical expression, but nevertheless provides a single objective value for every set of variables.
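Stated formally, as a generic sketch of the standard bound-constrained formulation (not an equation quoted from this thesis), such a problem reads

\[
\min_{x \in \Omega} f(x), \qquad
\Omega = \{\, x \in \mathbb{R}^n : l_i \le x_i \le u_i,\ i = 1, \dots, n \,\},
\]

where the vector \(x\) collects the parametric model’s \(n\) parameters, \(l\) and \(u\) are their lower and upper bounds (e.g., slider ranges), and \(f(x)\) is the performance value returned by the simulation. Only evaluations of \(f\) are available; its algebraic form and gradients are not.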

Mathematical optimization defines the possible solutions to a problem and their evaluation criteria unambiguously, in contrast to co-evolving architectural design problems. Nevertheless, given the potential of mathematical optimization for automated DSE on the one hand, and the increased use of computational methods such as parametric design and simulations by architectural designers—described as a “sea change” by Scheer (2014)—on the other, one might expect optimization to be integral to contemporary computational design processes. But, due to several challenges, this is not the case. These challenges—which are discussed further in the next chapter—include: (1) skepticism about using computers for architectural design, the limited application of (2) parametric design and (3) performance simulations, (4) lack of knowledge on and incentives for optimization, (5) a bias for inefficient optimization methods, (6) a lack of state-of-the-art, easy-to-use optimization tools, and (7) the problematic integration of optimization with architectural design.


To improve the integration of optimization into architectural design processes, this thesis aims to address some of these challenges with an interdisciplinary approach that draws from design theory, mathematical optimization, and multivariate visualization. The following section 1.1 formulates the thesis’ research questions, section 1.2 delineates its scope, and section 1.3 summarizes its chapters and contributions.

1.1 Research Questions

To date, most studies in architectural design optimization (ADO) present only specific algorithms or applications, with little comparison between optimization methods and little consideration of the role of optimization in architectural design processes (e.g., Hare et al. 2013; Evins 2013; Touloupaki and Theodosiou 2017). This thesis positions itself in this gap by raising the broader questions of how to integrate optimization into architectural design processes and which optimization method might be the most appropriate for this integration. To this end, the thesis formulates four research questions.

Research Question 1: Which optimization methods are especially suitable for architectural design optimization?

In Chapter 10, the thesis addresses this broader question in terms of the following three questions: If optimization methods are (1) more efficient on ADO problems and (2) better at supporting performance-informed DSE, it follows that they are especially suitable for ADO. The first premise reinforces the second: a more efficient optimization method yields solutions faster, and thus makes performance-informed DSE more responsive and accurate.

Research Question 2: Which optimization methods are efficient on building optimization problems?

The thesis addresses this question in Chapter 7. The chapter presents benchmark results of three distinct kinds of simulation-based optimization methods (metaheuristic, direct search, and model-based) on structural, building energy, and daylighting problems.

Research Question 3: Which visualization methods best exploit the opportunities of surrogate modeling for performance-informed design space exploration?

The thesis addresses this question in Chapter 8. The chapter presents Performance Maps, a novel visualization method for high-dimensional design spaces. It compares Performance Maps to two common visualizations for design spaces and fitness landscapes: Contour Plots and Parallel Coordinates.

Research Question 4: Do interactive visualizations support performance-informed design space exploration more than only manual search or optimization?

The thesis addresses this question in Chapter 9. The chapter presents the Performance Explorer, a prototypical software that combines model-based optimization and Performance Maps into an interactive, performance-informed DSE tool. A user test compares the Performance Explorer with two other DSE methods: manual search and automated search, i.e., optimization.

The thesis contends that answering these questions will not only result in more sustainable, i.e., resource- and energy-efficient, architectural designs, but also contribute to an important broader discussion: Given the increased use of quantitative technologies in architectural design processes, what is the best way to integrate them with existing design practices and methods?¹

¹ Questions 3 and 4 are relevant also for engineering design, where surrogate modeling and optimization are used much more widely. In their survey of “Metamodeling [i.e., Surrogate Modeling] in Multidisciplinary Design Optimization,” Viana et al. (2014) conclude that, among other factors, “limited visualization capabilities” make “high-dimensional [optimization] problems inherently difficult.”

1.2 Scope

This section delineates the scope of the thesis. The following chapters discuss specific limitations where appropriate.

1.2.1 Normative Theories of Architectural Design

Normative theories of architectural design that prescribe a certain style or method—such as “Parametricism” (Schumacher 2011) or “Performance-Oriented Architecture” (Hensel 2013)—are beyond the scope of this thesis. Instead, the thesis considers the potential and actual role of computational tools in architectural design processes.

1.2.2 Professional and Cultural Implications

The thesis focuses on improving the integration of ADO and performance-informed DSE into architectural design processes and does not consider what the professional and cultural implications (e.g., “deskilling” (Lawson 2004), loss of “haptic knowledge” (Scheer 2014), and “homogenization” (Llach 2017)) of such a shift might be.

1.2.3 Graphical User Interfaces

The design of graphical user interfaces (GUIs) constitutes a vast field. But this field is only peripheral to the research questions, which consider the underlying characteristics of mathematical optimization and visualization methods and the implications of these characteristics for their integration into design processes. The thesis follows existing GUIs as much as possible and assumes that the novel GUIs it presents are “good enough” to address the research questions.

1.2.4 Only Black-Box Optimization

The thesis considers optimization only in the context of linking parametric models with performance simulations. This kind of simulation-based optimization is also called “black-box,” because it presupposes no knowledge of the underlying objective functions.² Accordingly, the thesis discusses neither gradient-based optimization nor optimization methods for specialized applications, such as (analog) form finding (Frei and Rasch 1995), cross section optimization (Joyce et al. 2011), and topology optimization (Bendsoe and Sigmund 2003).

² In theory, one can analyze the numerical models that underlie performance simulations. In practice, such analyses are typically impractical. For example, one can estimate the gradient of a simulation-based objective function with repeated simulations. But this estimation can be very time-consuming, especially for problems with many variables and/or time-consuming simulations.
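To make this cost concrete (a generic illustration, not a calculation from the thesis): the standard forward-difference estimate

\[
\frac{\partial f}{\partial x_i}(x) \approx \frac{f(x + h\,e_i) - f(x)}{h}, \qquad i = 1, \dots, n,
\]

requires \(n + 1\) simulation runs for a single gradient, so a problem with 40 variables and one-minute simulations would spend about 41 minutes on every gradient estimate.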


1.2.5 Multi-Objective Optimization

The thesis addresses multi-objective optimization (MOO) only peripherally. MOO is a separate field from single-objective optimization (Marler and Arora 2004; Coello 2013) and less well grounded in mathematical theory and empirical benchmarks. Nevertheless, MOO is often applied in ADO (Evins 2013). Compared to single-objective optimization problems, MOO problems are “drastically” more difficult (Audet and Hare 2017). The ADO literature typically does not address this fact (e.g., Evins et al. 2012). Accordingly, this thesis focuses on single-objective optimization to establish a firmer foundation from which—in future work—to address MOO. Section 5.2 discusses MOO in general and section 5.8.4 its relevance to ADO. Section 7.3.7 presents benchmark results from a comparison of single-objective algorithms with a multi-objective one on a problem involving daylight and glare. To the knowledge of the author, this is the first benchmark in ADO that compares methods of both types. This comparison heightens the suspicion that the popularity of MOO reflects not only its relevance for ADO, but also a neglect of more rigorous and efficient single-objective methods. Nevertheless, extending the methods and tools developed in this thesis to allow MOO, visualization, and interactive, performance-informed DSE is a promising direction for future work.

1.2.6 Surrogate Modeling beyond Black-Box Optimization

There is an extensive literature on surrogate modeling from engineering design, due to its applicability to replace and/or augment time-intensive simulations. Topics include the accuracy of different model types, the efficiency of different model construction (i.e., sampling) strategies, and using surrogate modeling to understand the impact (i.e., sensitivity) of design variables, which is important for problems with many variables (Viana et al. 2014). Specifically, surrogate model accuracy remains a “critical” research topic. This thesis does not extensively study the accuracy of surrogate models and instead discusses surrogate modeling only in the context of simulation-based optimization. Nevertheless, the presented benchmark results and user test indicate that the employed surrogate models are “accurate enough” to enhance both optimization and design exploration. Yang et al. (2016) more closely study the accuracy of surrogate models on a daylighting problem and conclude that “the [model]-based approach has the potential to search the overall design space sufficiently and locate the promising regions.” But note that, next to sample number, sampling strategy, and model type, the accuracy of surrogate models depends on individual problem characteristics (Tseranidis et al. 2016).
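To make the surrogate-modeling idea concrete, the following minimal sketch (an illustration in Python with SciPy 1.7+, not the thesis’s actual implementation) fits a radial basis function (RBF) surrogate to a small set of simulated design candidates and queries it at unexplored ones. The function simulate is a hypothetical stand-in for a time-intensive performance simulation.

import numpy as np
from scipy.interpolate import RBFInterpolator

def simulate(x):
    # Hypothetical performance simulation: maps a design candidate
    # (a parameter vector) to a single performance value.
    return np.sin(3 * x[0]) + (x[1] - 0.5) ** 2

rng = np.random.default_rng(0)

# "Explored" candidates: a small sample of a two-parameter design space.
samples = rng.uniform(0.0, 1.0, size=(30, 2))
values = np.array([simulate(x) for x in samples])

# Fit an RBF surrogate (thin-plate splines are one common basis choice).
surrogate = RBFInterpolator(samples, values, kernel="thin_plate_spline")

# Querying the surrogate is nearly free compared to re-running the
# simulation, which enables fast exploration and visualization of the
# approximated fitness landscape.
grid = np.stack(np.meshgrid(np.linspace(0, 1, 50),
                            np.linspace(0, 1, 50)), axis=-1).reshape(-1, 2)
approximated = surrogate(grid)
print("Best approximated value on the grid:", approximated.min())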

1.3 Chapter Overview and Contributions

This chapter introduces the thesis and demarcates it in terms of research questions, limitations, and contributions. Chapters 2–5 flesh out its background, while chapters 6–9 present original research. Specifically, the chapters make the following contributions:

• Chapter 2 surveys challenges to the integration of optimization into architectural design processes and argues ADO’s relevance as an interdisciplinary field of study.

• Chapter 3 discusses the paradigms of Generate-and-Test and Co-Evolution in more depth. It proposes performance-informed DSE as a bridge between the two paradigms and as a key concept for the integration of optimization into architectural design processes.

• Chapter 4 surveys multivariate visualization methods and introduces “reversibility” as a key concept for exploiting such visualizations for interactive, performance-informed DSE.

• Chapter 5 addresses lacking knowledge on black-box optimization by surveying existing black-box optimization methods.

• Chapter 6 presents Opossum, a state-of-the-art, easy-to-use, model-based optimization tool, in the context of existing optimization tools (a brief code sketch of its RBFOpt engine follows this list).

• Chapter 7 presents benchmark results for Opossum and other ADO tools on simulation-based problems from structural design, building energy, and daylighting. These results reveal a bias towards inefficient methods in the ADO literature.

• Chapter 8 presents a novel visualization method for high-dimensional design spaces, Performance Maps, and compares it to existing ones.

• Chapter 9 describes a pioneering, visual and interactive, performance-informed DSE tool—the Performance Explorer—and presents results from a user test. These results indicate that, by better acknowledging designers’ preferences, performance-informed DSE supports the integration of optimization into architectural design processes.

• Chapter 10 summarizes the thesis and discusses directions for future research.
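As context for Opossum, whose optimization engine is RBFOpt, the following sketch shows RBFOpt optimizing a black-box function directly in Python, outside of Grasshopper. It follows the quickstart pattern from RBFOpt’s documentation; exact signatures may differ between versions, RBFOpt additionally requires a MINLP solver such as Bonmin, and the toy objective merely stands in for a parametric model coupled to a simulation.

import numpy as np
import rbfopt  # pip install rbfopt

def obj_funct(x):
    # Toy stand-in for a performance simulation.
    return (x[0] - 2.0) ** 2 + (x[1] - 1.0) ** 2 + x[2]

# Black box with three variables, each bounded between 0 and 10:
# two continuous ('R') and one integer ('I').
bb = rbfopt.RbfoptUserBlackBox(3,
                               np.array([0.0] * 3),
                               np.array([10.0] * 3),
                               np.array(['R', 'R', 'I']),
                               obj_funct)

# Cap the evaluation budget, as one would with a slow simulation.
settings = rbfopt.RbfoptSettings(max_evaluations=50)
alg = rbfopt.RbfoptAlgorithm(settings, bb)
val, x, itercount, evalcount, fast_evalcount = alg.optimize()
print("Best value:", val, "at", x)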


2 Towards Architectural Design Optimization

This chapter surveys challenges for the integration of optimization into architectural design processes (section 2.1) and argues for the importance of studying this integration in an interdisciplinary manner (section 2.2).

2.1 Challenges

Section 3.3.3 presents a small number of practical case studies of applications of optimization to architectural designs, which are the result of an extensive, albeit unsystematic, search. This small number reflects the contrast between co-evolving architectural design problems and well-defined optimization problems. The building optimization literature overwhelmingly consists of theoretical examples. Surveys typically do not discuss any case studies from architectural or engineering practice (e.g., Attia et al. 2013; Evins 2013; Nguyen et al. 2014; Machairas et al. 2014; Touloupaki and Theodosiou 2017). Apparently, the application of optimization to architectural design is not widespread, despite an extensive literature and large potential benefits, such as material and energy savings. This thesis attributes this limited application to the following challenges.

2.1.1 Skepticism about Using Computers for Architectural Design

Compared to engineering design fields, architectural designers have been slower in adopting digital technologies (Flager and Haymaker 2009). In part, this lag may be due to the more general skepticism about using computers for architectural design (section 3.2.1).


2.1.2 Limited Application of Parametric Design

To optimize architectural designs, one needs to automatically generate design candidates, most often via parametric design or scripting (i.e., computer programming). But it is unclear how widespread these methods are among architectural designers and consultants. In an online survey of 165 interns and professional architects on “Optimization in the Architectural Practice” (Cichocka et al. 2015), 83% of respondents indicated that they used parametric software in their practice, and 56% of respondents indicated that they used scripting. But these results are not representative, since the online survey targeted only architects that use parametric software. In another online survey, of 118 professional architects on their use of building performance simulation (Soebarto et al. 2015), most respondents indicated that they were developing their designs with hand sketches, physical models, and/or non-parametric software. In any case, it seems likely that parametric design will become more integrated into architectural practices: It is promoted by a growing number of books (e.g., Woodbury 2010; Burry 2011; Marble 2012; Jabi 2013; Andia and Spiegelhalter 2015), has been used in the design and realization of several important 21st century buildings (Wortmann and Tuncer 2017), and is increasingly included in architectural curricula. At the Singapore University of Technology and Design (SUTD), parametric design is part of the core curriculum in Architecture and Sustainable Design (ASD).


2.1.3 Limited Application of Performance Simulations

To optimize architectural designs, one also needs to evaluate design candidates quantitatively, which in most cases involves building performance simulations. But in the online survey by Soebarto et al. (2015), 74% of respondents indicated that they do not use such simulations in their day-to-day practices. 74% of respondents also indicated that they outsourced building performance analysis to internal or external specialist consultants. In other words, while specialist consultants conduct building performance simulations, architectural designers often do not. According to the survey, this inaction is due to (1) lacking incentives in terms of fee structure, (2) lack of knowledge and skills, and (3) lack of easy-to-use computational tools. At SUTD, building performance simulation is part of the core curriculum in ASD.

2.1.4 Lack of Knowledge on and Incentives for Black-Box Optimization

Next to automatically generating and evaluating design candidates, one also needs a theoretical and practical understanding of black-box (i.e., simulation-based) optimization to optimize architectural designs. But black-box optimization is an advanced subject, which architectural curricula (Becerik-Gerber et al. 2011) and textbooks typically do not include. For example, Scheer (2014) discusses computational design and simulations at length, but does not mention optimization. Two notable exceptions are Gerber and Flager (2011) and Pasternak and Kwiecinski (2015), who document architectural design studios that use optimization in the design of high-rises. Even engineering and computer science curricula typically include black-box optimization only at the graduate level.

When architectural theorists discuss black-box optimization, they do so indirectly and in the context of specific types of algorithms, namely “genetic” or “evolutionary” ones (e.g., De Landa 2002; Oxman 2006; Carpo 2015; Johnson 2017). But from the perspective of mathematical optimization, such algorithms are only one type among other, typically more efficient ones (Conn et al. 2009; Hendrix and G.-Tóth 2010). Unfortunately, the online survey by Cichocka et al. (2015) did not ask whether respondents had the relevant background knowledge for optimizing their designs or whether they actually did optimize their designs, but only whether they would like to do so. (87% of respondents did.) Attia et al. (2013) interview a group of experts on building optimization: 26 from academia and 2 from practice. The experts identify four “soft obstacles” for the use of optimization in architectural design and consultant practices: (1) low return and lacking appreciation, (2) high requirement for expertise, (3) lack of a standardized approach, and (4) low trust in the results. Note that these obstacles resemble the reasons for the limited application of simulations discussed above but extend even to practices that apply both parametric design and simulations.

2.1.5 Bias for Inefficient Optimization Methods in the Literature

The preference for evolutionary algorithms in architectural theory extends into the specialized building optimization literature, where they are by far the most widely used type of algorithm (Evins 2013). Generalizations such as “evolutionary algorithms are robust in exploring the search space for a wide range of building optimization problems” (Attia et al. 2013) accompany this popularity. But ironically, the experts interviewed in the same study “agreed that computation time is very long and this might well inhibit the initial take-up of optimization.” The survey by Cichocka et al. (2017) illustrates this challenge: 66% of respondents indicated that they were willing to wait one hour or less for optimization results. One hour is very short, considering that a single performance simulation often takes minutes and that, typically, hundreds—or, in the case of evolutionary algorithms, thousands—of simulations are necessary to find good results. In mathematical benchmarks, algorithms that rely less on randomness and more on mathematical foundations are typically much more efficient, i.e., they find good or better solutions with smaller numbers of simulations (e.g., Holmström 2008; Rios 2009; Costa and Nannicini 2014). On the example problems presented in Chapter 7, the evolutionary algorithm performs poorly as well. Consequently, the limited application of building optimization is also due to a bias for inefficient optimization methods in the literature. Chapter 5 identifies four potential reasons for this bias: (1) fragmentation of the literature on black-box optimization, (2) conceptual appeal of biological analogies, (3) a preference for Pareto-based optimization in the building optimization community, and (4) a small number of benchmarks on building optimization problems.

2.1.6 Lack of State-of-the-Art, Easy-to-Use Optimization Tools

The interview study by Attia et al. (2013) diagnoses a lack of “environments integrating … simulation and optimization seamlessly” and “friendly [graphical user interfaces] GUI allowing post processing and visualization techniques.” Indeed, there is only a small number of optimization tools that interoperate with architectural parametric design and simulation software without requiring specialized programming skills, and an even smaller number that implement state-of-the-art optimization algorithms and/or provide visualizations of the optimization results (section 6.1.3).

2.1.7 Problematic Integration of Optimization with Architectural Design

The contrast between co-evolving architectural design problems and well-defined optimization problems points towards a more conceptual reason for this limited application. Architectural designs must fulfill many quantitative and non-quantitative evaluation criteria which one often cannot formulate a priori. Rather, such criteria are continuously redefined during design processes (Dorst and Cross 2001). One possibility is to apply optimization to well-defined subproblems, instead of using it to generate full building designs (section 3.3.2). But, due to the “wickedness” of co-evolving architectural design problems, this is possible only when problem definitions have stabilized and the potential for efficiency-improving design changes has diminished, i.e., towards the end of design processes (Architectural/Engineering Productivity Committee 2004). The integration of optimization into architectural design processes is thus problematic.


The skepticism about computational tools in the architecture community is difficult to address directly, but parametric design and performance simulations are likely to become more widespread in the future. This thesis aims to address (1) the lack of knowledge on black-box optimization, (2) the bias towards inefficient optimization methods in building optimization, (3) the lack of state-of-the-art, easy-to-use optimization tools, and (4) the problematic integration of optimization with architectural design.

2.2 Relevance In most of the practical case studies identified in this thesis, a specialized, structural or environmental, consultant applied optimization and not the design architect. Nevertheless, this thesis discusses the optimization of building designs under the term Architectural Design Optimization (ADO) and studies it in an interdisciplinary manner. The following reasons support the relevance of this study, as well as the need for an interdisciplinary approach. 2.2.1 Need for more Resource- and Energy-Efficient Buildings Construction is the largest global consumer of resources and raw materials (De Almeida et al. 2016), while completed buildings account for about 20% of energy consumption and 48% of greenhouse gas emissions (Conti et al. 2016). Buildings also are the sector with the greatest potential and lowest cost for reducing carbon emissions (IPCC Core Writing Team et al. 2007). Large resource and energy consumptions combined with great reduction potentials represent compelling motivations to design more resource- and energy-efficient buildings and


highlight the potential benefits of ADO in mitigating the negative effects of climate change and designing a more sustainable built environment.

2.2.2 Precedents from 21st Century Architecture
There are few documented instances of the use of optimization in architectural design practice. Nevertheless, important examples of 21st century architecture owe significant parts of their appearance to performance simulations and optimizations conducted by specialized consultants. For the London City Hall, the architects Foster + Partners determined its overall stepped shape according to solar simulations and the interior volume of the debating chamber according to acoustic ones (Whitehead 2003). The multidisciplinary design consultancy ARUP supplied both simulations. For the Louvre Abu Dhabi—designed by Ateliers Jean Nouvel—structural consultants optimized not only the weight of its 165-meter-span dome, but also the regularity of its structure. This regularity was important for the patterned aesthetic of the light-modulating panels covering the dome's top and bottom (Shrubshall and Fisher 2011). Lighting consultants determined the density of these panels via simulations (Tourre and Miguet 2010).


Figure 2.1 The award-winning Future of Us Pavilion by SUTD’s Advanced Architecture Laboratory has a parametrically designed envelope. The envelope’s differentiated density lends itself to optimization.

Like the Louvre Abu Dhabi, the Future of Us pavilion—designed by SUTD's Advanced Architecture Laboratory—has a parametric envelope, whose generation and documentation the author automated (Wortmann and Tuncer 2017) (Figure 2.1). This envelope is differentiated in terms of its density and thus susceptible to environmental optimization.

2.2.3 Consultants as Co-Designers
This thesis argues that specialized consultants, such as structural and environmental engineers, are actively participating in architectural design processes, rather than passively executing the design architect's intent. Loukissas (2012), who terms such consultants "co-designers," explains that


simulations enhance the agency of such consultants in architectural design processes. Loukissas (2012) presents the roof of the Nasher Sculpture Center in Dallas, Texas as an example: Developed in collaboration between Renzo Piano Building Workshop and ARUP, this roof resulted from "a simulation process aimed at optimizing and homogenizing illuminance levels on the artwork." Accordingly, in applying optimization in their design practices, which often involve collaborating with design architects and other consultants in multidisciplinary teams, co-designers face similar challenges and opportunities as design architects.

2.2.4 Optimization is a Task (also) for Architects
Conversely, given that the literature related to the optimization of building designs overwhelmingly relates to specialized disciplines such as structural or environmental design, one might assume that architectural designers do not use optimization at all. But Schaffranek (2012) urges that "optimizing [is] a task for the architect" and that "we architects have to be aware of such developments [in ADO], and argue what optimization of architecture is about." The studies discussed above indicate that some architects use optimization in their design practices, or at least would like to (Bradner et al. 2014; Cichocka et al. 2015). 3

3 But note that these studies may not be representative of the entire field: Bradner et al. selected their subjects because they were using optimization and Cichocka et al. surveyed their subjects via an online survey relevant only to users of parametric design.


Opossum, the optimization tool developed during the author's thesis research (section 6.2), is available as a free download. A plurality of registered users indicates that they are professional architects (Table 2.1). The registered users most often name structural, building energy, and daylighting optimization as planned applications.

Table 2.1 Fields and roles of registered users of Opossum, 26 February 2018.

Role           Architecture   Structural Design   Environmental Design   Construction   Other
Professional   88             20                  12                     9              16
Student        70             14                  4                      1              4
Researcher     20             9                   3                      2              5

2.2.5 Fragmentation of the Building Optimization Literature
The current fragmentation of the literature related to the optimization of building designs into field-specific journals (such as "Energy and Buildings," "Computers and Structures," and "Automation in Construction") obscures the general applicability of insights from mathematical optimization to these fields and hinders the exchange of insights between them.

2.2.6 Need for Novel Computational Design Methods and Tools
Recognizing the problematic integration of optimization into architectural design processes, Johnson (2017) concludes that optimization supports "convergent" thinking, i.e., the refinement of design ideas, but that "stronger support for exploratory divergent thinking, support […] which provides responsive assistance, is needed." From this perspective, optimization not only improves almost completed architectural designs but, potentially, supports


designers' reflections at earlier, more conceptual stages of design processes (section 3.3.4). The survey by Cichocka et al. (2015) underscores the importance of choice: 91% indicated that they would like to "influence [optimization] outcomes by choosing promising designs" and 82% that they would prefer "a few high-quality solutions" as an optimization outcome, rather than a single result. The interview study by Bradner et al. (2014) concludes that "user interfaces should enable users to easily pivot between exploring a solution set and examining a specific solution." In short, the better integration of optimization into architectural design processes requires interactive, performance-informed DSE methods and tools that support the redefinition of architectural design problems by offering informed choices and fostering understanding. These methods and tools should be easy to use and implement efficient, state-of-the-art optimization algorithms. Conceptually, performance-informed DSE bridges between Generate-and-Test and Co-Evolution by combining optimization methods with interactive, visual representations (section 3.3.5). Given that architectural design is a largely visual discipline, the effective visualization of design spaces is an understudied but necessary foundation for performance-informed methods and tools that support understanding. Their development requires an interdisciplinary approach that draws from computational design, mathematical optimization, and multivariate visualization. 4, 5

2.3 Chapter Summary
In summary, there are practical, theoretical, and methodological reasons for the interdisciplinary study of ADO. This thesis reviews different streams of literature (computational design, mathematical optimization, building energy optimization, structural optimization, multivariate visualization), employs several methodologies (benchmarking optimization algorithms, developing novel visualization methods, prototyping and user testing computational design tools), and tests distinct kinds of optimization problems (structural, building energy, daylighting). The next chapter further discusses the role of optimization from a design theoretical perspective and presents applications of optimization in architectural practice.

4 Multivariate visualizations visualize data with multiple variables.

5 The need for visual and interactive, performance-informed DSE methods and tools exists also in engineering design. In their “Review of Metamodeling Techniques in Support of Engineering Design Optimization” Wang and Shan (2007) ask the following:

What are the more intuitive and easy-to-understand visualization techniques? What data in design need to be visualized? Why? What are the interactive means that the tool should and can provide to users? How will the visual aid help a designer to enhance the understanding of the problem or better direct the design?


3 Optimization and Architectural Design Processes
Architectural design is increasingly influenced by systematic, computational methods such as parametric design (Woodbury 2010), performance simulation (Malkawi 2005), performance-based design (Kolarevic 2005; Oxman 2006; Hensel 2013), and building information modelling (BIM) (Kensek 2014). Such methods allow designers to develop geometrically complex designs, while more accurately predicting their future performance (e.g., Luebkeman and Shea 2005; Lin and Gerber 2014; Wortmann and Tuncer 2017). When designers parametrically define design spaces and numerically simulate the performance of individual design candidates, mathematical optimization methods can identify well-performing designs. To further explore the challenges and potentials of integrating optimization with architectural design discussed in the previous chapter, this chapter reviews contemporary theories of architectural design processes (section 3.1), as well as theories on the role of computers (section 3.2) and optimization in such processes (section 3.3). It identifies two conflicting paradigms of (computational) design—Generate-and-Test and Co-Evolution—and proposes performance-informed DSE to partially reconcile these paradigms.

3.1 Theories of Architectural Design Processes
This section identifies three major phases of theories of architectural design processes that have developed since the post-war period, and that, to different extents, inform contemporary thinking on design processes. The "behaviorist" school frames design processes as sequential stages from analysis to synthesis, the rationalist "information processing" school aims to model (and potentially simulate) designers' internal cognitions computationally, while the "constructivist" school emphasizes the idiosyncratic, personal nature of design processes (Rowe 1991; Dorst and Dijkhuis 1995). Corresponding to these three schools of thought, this section identifies three paradigms of design processes: Analysis-Synthesis, Generate-and-Test, and Co-Evolution.

3.1.1 Analysis-Synthesis
The "behaviorist" school of thought understood design processes as orderly sequences of stages, which it mapped in increasingly complex diagrams (Rowe 1991, first published 1986). Rowe characterizes this school as "strongly deterministic," because of the "strong implication that the eventual synthesis of information in the form of some designed object follows in a straightforward fashion from an analysis of the problem at hand" (p 48). From this perspective, architectural design resembles an optimization problem, with the designer discovering variables, objectives, and constraints by gathering and analyzing information. Alexander (1964) presents a design method that synthesizes designs from explicitly listed requirements, with the design of an Indian village with 141 requirements as a "worked example" (Figure 3.1). A similar position is evident in Broadbent's "Design in Architecture" (1988, first published in 1973), which next to discussions on architectural design processes introduces materials on mathematics, statistics, and linear programming. According to Darke (1979), part of the motivation to formalize and quantify


Figure 3.1 Decomposition of requirements for an Indian village. Alexander (1964) develops a design by diagramming solutions for subproblems (A1, A2, etc.), integrating these diagrams into higher levels (A, B, C, D), and finally into a design for the entire village. Illustration: (Alexander, 1964, p. 151)

design problems in this way was "the possibility of transferring much of the process to the computer, which would not be limited by preconceptions."

3.1.2 Generate-and-Test
The Analysis-Synthesis paradigm's preoccupation with the definition of design problems contradicts the widely accepted suggestion that architectural design is "solution focusing" instead of "problem focusing" and that this cognitive strategy distinguishes design from other problem solving activities (Lawson 1979; Cross 2001). In his revised edition, Broadbent (1988) laments that "whatever kind of analysis has been brought to bear the architect, conditioned by the 'paradigm' within he works, will bring in sideways to the process the kind of design he wanted to do anyway!" (p 465).


In her interview study with architects 6, Darke (1979) identifies such preconceptions and terms them “primary generators.” Primary generators 7 are “a broad initial objective or small set of objectives, self-imposed by the architect, a value judgement rather than the product of rationality.” This initial objective serves to reduce “the variety of potential solutions to the as yet imperfectly understood problem, to a small class of solutions that is cognitively manageable.” In other words, architectural designers do not exhaustively analyze the structure of design problems, but search for solutions by generating and examining design candidates. This framing of design as heuristic search was first made explicit by the “information processing” school, which had firm links with early research in artificial intelligence (AI) and cognitive science. Buoyed by the early successes of artificial intelligence such as the “General Problem Solver” (Newell et al. 1958), theorists and practitioners aimed to computationally replicate the capabilities of human problem solvers, at least initially (Dreyfus 1992, first published in 1972). The definitive statement of the “information processing” position is Herbert Simon’s “The Sciences of the Artificial” (1996) (first published in 1969). Simon advocates formalizing design as a scientific, mathematical discipline with statistics, mathematical optimization, and means-ends analysis as core subjects (p 134). Design thus is a rational process of either choosing the optimal solution from a set of given alternatives or, if the set of alternatives is too large to examine

6 Including Alison and Peter Smithson, known as pioneers of the "Brutalist" style in modern architecture.

7 Rechristened as "design concepts", primary generators have become a staple of contemporary architectural design education (e.g., Makstutis 2010; Anderson 2010).


exhaustively, a process of "heuristic search" that finds an acceptable solution by "satisficing," i.e., by finding a solution that satisfies the performance criteria to a sufficient degree. Simon assumes that design problems themselves are well-defined but concludes that designers must rely on heuristic search instead of on problem analysis or exhaustive search, because in most practical cases one neither can "generate all the admissible alternatives and compare their respective merits," nor "recognize the best alternative, even if we are fortunate enough to generate it early" (p 120). Although Simon's vision has influenced the curriculum in architecture schools only to a small degree, it nevertheless informs many approaches to (computational) design research. For example, Akin (1988) analyzes the "expertise of the architect", with the goal of developing an automated "expert system." The paradigmatic Generate-and-Test cycle for computational design proposed by Mitchell (1990) originates with Simon as well (1996) (pp 128-130). The "information processing" approach initiated a close study of how architectural designers actually worked, which, with studies such as (Darke 1979), revealed that designers were much less obviously rational than previously supposed. According to Rowe (1991), designers indeed proceed using a variety of heuristics. But, in contrast to Simon's view of heuristics as a "small repertory of information processes" (1996), these heuristics "may be quite subjective" and "may or may not be explicit and repeatable" (p 76). For Cross and Roozenburg (1992), "consensus" models of architectural design processes assume ill-defined problems and opportunistic processes that start with solution-conjectures, while "consensus" models of engineering design


Figure 3.2 Four design strategies: Linear (top left), Trial-and-Error (bottom left), Successive Alternatives (top right), and Branching Alternatives (bottom right). Illustration: (Rittel, 1992, pp. 77-81)

processes assume well-defined problems and systematic processes that start with problem analysis (similar to the Analysis-Synthesis paradigm outlined in section 3.1.1). Rittel (1992) describes four heuristic design strategies: (1) the linear strategy of “grand masters”, (2) a “scanning”, trial-and-error process relying on the application of known precedents, (3) the systematic, successive generation of alternatives with a single solution chosen for further development at each step, and (4) the systematic, branching generation of alternatives that considers multiple solutions over different steps (Figure 3.2). Woodbury and Burrow (2006) introduce the concept of search as DSE:

Numerous studies have shown the utility of modeling designers as information processing systems that search to satisfy changing goals in a strongly constrained problem space […] exploration is a compelling model for designer action by


demonstrating a correspondence with the actions of designers and by suggesting means to work around human cognitive limits. This characterization of DSE contrasts with the “classic” information processing paradigm by Simon (1996) in an important respect: DSE assumes that, as “designs create the context for further decisions,” the goal of the search changes during exploration (Woodbury and Burrow 2006). This thesis distinguishes automated DSE, which applies only optimization, from

performance-informed DSE, which emphasizes human exploration and decision making. The next section further explores the concept of changing problem definitions.

3.1.3 Co-Evolution
Dorst and Dijkhuis (1995) characterize Simon's view of design as heuristic search as "positivist" and contrast it with the "constructivist" paradigm of "reflective practice" developed by Schön (1983). For Schön, architectural design processes are a "conversation with the situation" that start with developing candidate designs, which then lead to the discovery of (potentially unintended) consequences and new ideas. Schön calls such discoveries "the situation's backtalk … which generates a system of implications for further moves" (p 74). Importantly, such discoveries can "trigger a reframing of the problem" (p 184). Schön's view is influenced by the constructivist philosopher Nelson Goodman, who describes how "worlds" (i.e., worldviews) are fashioned by operations such as composition, decomposition, weighting, ordering, deletion, supplementation, and deformation (1975). For Schön, the designer's "ability to construct and


manipulate virtual worlds is a crucial component of his ability not only to perform artistically but to experiment rigorously" (1983). An example of such a virtual world is the "world of the drawing" (p 107). In other words, when designers create representations of a design, they fashion virtual worlds that yield unique insights into a design problem. Stiny (2006) proposes a design method of "seeing and doing" that involves the application of graphically expressed rules. Despite his insistence on the systematic application of rules, he emphasizes the ambiguities inherent in these rule applications and the serendipitous discoveries afforded by what he terms "visual calculating" (2011). Similarly, Gänshirt (2007) describes architectural design as a cycle of perceiving a situation or problem, mentally developing a candidate solution, representing the candidate solution in a design medium—for example, a sketch, drawing, or model—and reassessing the design in light of this new representation. Lawson (2006) describes design processes as "conversations" that negotiate between design problems and candidate solutions. Such conversations not only include members of a design team, but also drawings, images, or other types of representations. In other words, Schön, Stiny, Gänshirt, and Lawson understand architectural design as an iterative exchange between "external representation[s] and internal mental structure" (Lawson 2004). In his introduction to architectural design computing, Johnson (2017) describes a similar design cycle that alternates between convergent and divergent modes of thinking, and notes that this conception "invokes a number of concepts that connect to computing, including representation, transformation, collaboration,


solution spaces, generative processes, simulation, evaluation, and optimization” (p 7). Section 3.2 discusses these connections in more detail. Dorst and Cross (2001) develop Schön’s notion of “reframing” into an explicit model of “Co-Evolution”:

Design is not a matter of first fixing the problem and then searching for a satisfactory solution concept [but rather] a matter of developing and refining together both the formulation of a problem and ideas for a solution, with constant iteration of analysis, synthesis and evaluation processes between the two notional design 'spaces'—problem space and solution space.

According to Dorst and Cross, there are not only hypothetical design spaces that contain sets of design candidates—as proposed by Woodbury and Burrow (2006)—but there also are problem spaces that contain sets of formulations, or framings, of design problems (with different problem formulations suggesting different design spaces and vice versa). Protocol studies of designers have confirmed the notion of Co-Evolution (Cross 2001).

3.1.4 Three Paradigms of Architectural Design Processes
The preceding sections identify three increasingly complex and sophisticated paradigms of architectural design processes (Figure 3.3).

1. Analysis-Synthesis: The ideal design process proceeds from an exhaustive analysis of the design problem directly to its solution.

2. Generate-and-Test: An exhaustive analysis might not be possible, but we can find "satisficing" solutions by heuristically generating design candidates and testing them. Heuristics are rules of thumb that help to generate design candidates.

3. Co-Evolution: Not only is there more than one design candidate, but there also are several "framings" of the design problem. Designers uncover information about the design problem and generate design ideas by "reflecting" on design candidates represented in various media. Therefore, the definition of the design problem and the development of the design solution must "co-evolve" in parallel.

Different understandings of computers have greatly influenced some design theorists. The next section more explicitly examines the role of computers in architectural design processes.

Figure 3.3 Diagram of three paradigms of architectural design processes: Analysis Synthesis, Generate-and-Test, and Co-Evolution. (P = Design Problem, D = Final Design, C = Design Candidates)


3.2 Theorizing the Role of Computers in Architectural Design
Discussions on the roles of computers in architectural design processes elicit skepticism about their suitability as representational, let alone generative, tools from some, and perpetual optimism about their current and future potential from others. The below sections present a range of perspectives on the role of computers in architectural design processes that have emerged in the last forty years. These perspectives range from fears of deskilling (Lawson 2004) and of neglecting qualitative factors (Scheer 2014) to enthusiasms about the role of computers in "supporting complexity" (Oxman 2006), eliciting "a major shift in architectural thinking towards formal methods" (Kotnik 2010), and "transcend[ing] the small-data logic of causality and determinism" (Carpo 2015).

3.2.1 Critiques of Computers in Architectural Design
Already in the 1980s, designers expressed misgivings about having to use CAD systems that they experienced as lacking the formal freedom and ambiguity of hand sketches and drawings (Ehlers et al. 1986; Turkle et al. 2009). Lawson (2004) expresses a skeptical, conservative position:

Before computers the student architect had to learn to draw in order to design and also in order to see and record. It was of course possible that very poor architecture could be presented so beautifully that one was deceived. But the sensibilities needed to draw well and to design well are sufficiently similar for this to hardly ever happen.


Lawson fears that computer-use leads to "deskilling" not only with regard to traditional hand drawing, but also with regard to design skills more broadly, but provides no support for the assumption that the two require similar sensibilities. Similarly, Goldschmidt (2017) defends the relevance of hand sketching relative to designing with computers by stating that the latter results in "complex and exciting geometries" that often are not "resolved in terms of building performance and construction." Scheer (2014) is critical of BIM for "facilitating architecture's complete assimilation into the building production process and forcing it to embrace the latter's performative logic" and regards the increasing use of computers for architectural design not only as a professional, but also as a societal crisis:

Simulation dispenses with any relationship to materials and the haptic knowledge they provide … The public, which by and large experiences their environment as simulation, rewards works that satisfy its craving for stimulation. Scheer regards these changes as inevitable and irreversible. Rather than advocating a return to the traditional practice of architecture, he concludes that “to continue as architects, we must change our ideas.” Llach (2017), speaking from the perspective of science and technology studies, describes BIM as “the homogenization of a diverse ecology of design and construction practices” that enacts “an imperialist impulse to colonize and reorganize worlds of practice” and that emphasizes “the centrality of simulations.”


According to Johnson (2017), "skepticism about the machine's ability to contribute to a creative outcome persists." This thesis likewise assumes that the skepticisms expressed by these critics reflect the attitudes of a substantial number of architects but contends that adapting computational tools to existing design practices requires concrete proposals in addition to critiques of the status quo. It identifies such skepticisms as one of the challenges for integrating optimization into architectural design processes (section 2.1.1).

3.2.2 Computers as Designers
Nicholas Negroponte, an early proponent of computer-use for architectural design, envisioned intelligent design systems that, in time, would learn to generate designs based on the designers' (1973) or users' (1976) specifications. Such visions have close links both to behaviorist (section 3.1.1) and information processing (section 3.1.2) views of architectural design processes. However, researchers in AI have so far made little headway with automated or learning design systems 8, which, among other obstacles, is probably due to the ill-defined nature of architectural design problems (Rittel and Webber 1973; Cross and Roozenburg 1992).

3.2.3 Computers as Sources of Representations
From a phenomenological study of the information processing tasks required for architectural design, Schön (1992) concludes that "the practitioners of Artificial Intelligence in design would do better to aim at producing design assistants rather than knowledge systems phenomenologically equivalent to those of

8 For a critical reflection on the lack of progress in AI generally, see Dreyfus (1992).


designers.” Schön argues that computer environments should (1) “enhance the designer's seeing-drawing-seeing,” (2) “extend the designer's ability to construct and explore [microworlds],” and (3) “create an environment that helps the designer to discover and reflect upon his own design knowledge.” Consistent with his view of designing as a “reflective conversation with the materials of a design situation,” for Schön computers are not authors of designs, but sources of representations that allow designers to gain new insights. The visual and interactive tool presented in Chapter 9—the Performance Explorer—intends to support these activities by (1) presenting fitness landscapes in a visual, interactive manner that (2) allows exploration and (3) fosters reflection and understanding. Similarly, for Gänshirt (2007) the computer is chiefly a representational tool, albeit “a new medium that digitalizes all the traditional design tools and unites them within it, completely changing them in the process” (p 226). The computer’s main advantage over sketches, drawings, and models is that it “not just make[s] it possible to represent objects, but also to simulate events in time” (p 194). The next section discusses the computer as a tool for automated DSE. 3.2.4 Computer as Generators, Simulators, and Optimizers Radford and Gero (1980) distinguish three roles for computers in architectural design processes: simulation, generation, and optimization. For them, “the designer's principal need is for information that is prescriptive.” Optimization offers an advantage over generation in proposing only “relevant” design candidates, and an advantage over simulation by avoiding a tedious trial-anderror process.


Mitchell (1990) combines generation and simulation into a “Generate-and-Test process taking place in a search space” (pp 179-181). Here, a design system generates designs from an “architectural grammar” of “shape rules” until it finds a satisfactory design. The process consists of two key tasks, design generation and design evaluation (which, when automated, often involves simulation), both of which can be performed by humans or computers. However, since “the language specified by the grammar should include a depiction of every possible building in the class, and every design in the language should be for a possible building in that class,” Mitchell apparently assumes that a human designer defines the architectural grammar. This a priori definition of a bounded design space is the key difference between Negroponte’s AI-inspired approach and the approaches in this section. Today, designers use parametric design to represent a class of design candidates by relating a set of parameters through a hierarchical network of operations (Woodbury 2010), and analyze design candidates from this class via environmental or structural simulations and other quantitative performance metrics (Malkawi 2005). In that sense, Mitchell’s Generate-and-Test process is, if not a practical reality, at least realizable in principle. Indeed, Generate-and-Test lies at the heart of many “performance-driven” (Shea et al. 2005; Shi and Yang 2013) and “performance-based” (e.g., Kolarevic 2005; Lin and Gerber 2014) design processes. In such processes, computers derive a design by searching a parametrically defined design space with an optimization algorithm according to an objective function evaluated via simulations.
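To make the two key tasks of this cycle concrete, the following minimal sketch loops through generation and evaluation until it finds a "satisficing" candidate or exhausts its simulation budget. It is a toy illustration, not the thesis's implementation: generate() is a hypothetical stand-in for an architectural grammar or parametric model, and evaluate() stands in for a performance simulation.

```python
import random

# Hypothetical two-parameter design space with bounded variables.
BOUNDS = [(0.0, 1.0), (0.0, 1.0)]

def generate():
    """Stand-in for a generative step (grammar or parametric model):
    samples each design parameter within its bounds."""
    return [random.uniform(lo, hi) for lo, hi in BOUNDS]

def evaluate(candidate):
    """Stand-in for a performance simulation (lower is better)."""
    x, y = candidate
    return (x - 0.3) ** 2 + (y - 0.7) ** 2

def generate_and_test(threshold=0.01, budget=1000):
    """Generate-and-Test: loop until a 'satisficing' candidate is found
    or the simulation budget is exhausted."""
    best, best_value = None, float("inf")
    for _ in range(budget):
        candidate = generate()       # generation task
        value = evaluate(candidate)  # evaluation task
        if value < best_value:
            best, best_value = candidate, value
        if best_value <= threshold:  # satisficing: good enough, stop
            return best, best_value
    return best, best_value

print(generate_and_test())
```

In an optimization-driven process, a strategic sampling algorithm replaces the random generate() step, which is where differences in efficiency between algorithms arise.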


For Johnson (2017), "design then becomes a question of searching the design state-space efficiently for the ideal solution, which is identified by evaluating the design's fitness against the goal criteria." 9 Carpo (2015) describes a paradigm shift that "demarcate[s] the computational nature of today's process of optimization from its physical precursor – form-searching" and concludes that "using today's digital tools this [computational form-searching] is the best – perhaps the only – way for us to work." In its latest publications, Autodesk—one of the largest software companies in architecture, engineering and construction—refers to the combination of parametric models with optimization algorithms as "Generative Design" (e.g., Nagy et al. 2017). Oxman (2006) provides an elaborate framework of computational design that includes formation, generation, and performance. Formation describes the development of an architectural form through a mediated computational process, for example parametric design or animation, while generation encompasses design processes that rely on a set of rules to create a design such as visual calculating. In performance-based approaches, which can be formative or generative, "the object is generated by simulating its performance," with performance encompassing such factors as "environmental performance, financial cost, spatial, social, cultural, ecological and technological perspectives." An example of performance-based formation is the Greater London Authority Headquarters building by Foster and Partners, where "the optimization of energy and acoustical performance was achieved while the surface of the curvilinear

9 But the bias for GAs, which also is evident in (Johnson 2017), limits the potential for efficient automated DSE (section 5.9).


facade was minimized” (Oxman 2006). In Oxman’s view, optimization methods form an integral part of computational design because they allow the automated derivation of architectural form from the combination of a parametric model with a performance simulation. Oxman characterizes this performance-based approach as a “unique compound model of design.” Throughout her paper and especially in the accompanying diagrams (e.g., Figure 3.4), Oxman stresses the central role of human designers as authors and critics of the various automated processes, and the need for direct interaction with these processes. In framing the role of computers as supporting human designers instead of fully automating design processes, her conception of computational design relates to Schön’s and Gänshirt’s. However, Oxman and these authors would likely disagree on the importance of the difference between mediated,

Figure 3.4 Compound model of a digital design process from Oxman (2006, p. 261). “R” stands for representation, “G” for generation, “E” for evaluation, and “P” for performance. Note that the human designer occupies the center and interacts with all the other elements (linkages marked with “i” for interaction).


abstract representations, such as parametric models or formation processes, and unmediated, traditional representations, such as drawings or cardboard models. Oxman (2006) discerns a "paradigm shift of [a] new design culture," while Schön (1992) and Gänshirt (2007) stress the continuity between digital and traditional, non-digital design tools. Oxman (2008) defines "performative design" as "the potential of an integration of evaluative simulation processes with digital 'form generation' and 'form modification' models [which] implies that performance can in itself become a determinant … of architectural form." Here, Oxman frames the need for and method of human intervention in automated formation processes as a "challenging question" that might eventually be "obviated." This thesis views computers in architectural design less as drivers of automated processes, in which human designers can intervene, and more as assistants for human design processes.

3.2.5 Enhancing Human Capability and Understanding
There is no doubt that computers have an increasing role in architectural design processes; even strong critics of the use of computers for architectural design conclude that architects need to adapt to new technological realities (e.g., Scheer 2014; Llach 2017). This thesis likewise sees changes in architectural thought and practice as inevitable and proposes a method and tool intended to better integrate computation into existing architectural design processes. This thesis understands the role of computers not as imitating and replacing human design capabilities, but as enhancing and extending them: by automatically exploring human-defined design spaces and through interactive

and visual representations. Optimization methods explore bounded design spaces in a strategic and efficient manner. Model-based optimization in addition provides representations, namely surrogate models of the fitness landscapes corresponding to such bounded design spaces. The user test in Chapter 9 demonstrates that, when harnessed as representational tools, the surrogate models employed by model-based optimization afford "reflective conversations" that enhance designers' understandings of design problems.
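The following minimal sketch illustrates this representational role under stated assumptions: it uses SciPy's RBFInterpolator and a hypothetical simulate() stand-in for an expensive performance simulation. It is not the implementation used in this thesis, only an indication of how a radial basis function (RBF) surrogate turns a handful of simulated candidates into a cheaply queryable approximation of the fitness landscape.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def simulate(x):
    """Hypothetical stand-in for an expensive performance simulation
    (e.g., daylighting or structural analysis) over two parameters."""
    return np.sin(3 * x[:, 0]) + (x[:, 1] - 0.5) ** 2

rng = np.random.default_rng(0)
samples = rng.uniform(0, 1, size=(30, 2))  # 30 evaluated design candidates
values = simulate(samples)                 # their simulated performance

# Fit an RBF surrogate model to the evaluated candidates.
surrogate = RBFInterpolator(samples, values, kernel="thin_plate_spline")

# The surrogate predicts performance for unseen candidates almost instantly,
# yielding a height field that can be displayed as a fitness landscape.
axis = np.linspace(0, 1, 50)
grid = np.stack(np.meshgrid(axis, axis), axis=-1).reshape(-1, 2)
landscape = surrogate(grid).reshape(50, 50)
print(landscape.shape)  # (50, 50)
```

The next section examines ADO in more detail.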

3.3 Optimization in Architectural Design Processes
Despite theorists' interest and the extensive ADO literature, mathematical optimization is not widely applied in architectural design, especially when compared to engineering design (Flager and Haymaker 2009). Chapter 2 has discussed challenges to the wider application of ADO. This section presents different perspectives on, and various applications of, ADO.

3.3.1 Criticisms of ADO
The complexity of architectural design problems is best characterized by Rittel and Webber's oft-cited dictum, "planning problems are wicked problems" (1973). For such wicked problems, "setting up and constraining the solution space and constructing the measure of performance … is more essential than the remaining steps of searching for a solution." According to Cross and Roozenburg (1992), the view of architectural design problems as ill-defined by definition is one of the key differences between models of architectural design processes and models from engineering design. In other words, architects can define design problems


in diverse ways, with different implications for potential solutions. From this perspective, the Co-Evolution of design problems and solution spaces (Dorst and Cross 2001) results from the “wickedness” of architectural design problems. This “wickedness” also implies a severe difficulty in applying optimization methods, which demand explicitly defined problems, to architectural design problems. According to Lawson (2006), another aspect of the complexity of architectural design problems is that their solution requires a “holistic response” based on “skilled judgement.” Unsurprisingly, given his skeptical stance on the use of computers in architectural design processes (section 3.2.1), he states:

Rarely can the designer simply optimise one requirement without suffering some losses elsewhere.

Kotnik (2010) has a more positive view of computational design approaches, but fears that performance-based design and ADO detract from the potential of these approaches for a "systematization of knowledge and methods on design." In his view, the application of ADO hinders the designer from understanding architectural design problems:

For architectural design, the output-driven perspective onto the computational function of a performative design strategy is a pitfall because it encourages a tendency towards optimisation and with it, an economisation and closing up of architectural thinking towards parametric manipulation. The importance of an algorithmic description of the computational function … does not lie in the possibility of computing an optimal solution, but rather in the ability to control precisely the geometric relation between architectural elements under consideration.

Speaking from a theoretical perspective, Kotnik sees the rule-based definition of an architectural design as a contribution to design knowledge in the spirit of Mitchell's "architectural grammars." But, from a practical standpoint and for this thesis, the performance of individual design candidates is as important as the definition of the class of design candidates. Accordingly, this thesis regards optimization as a proven method for automated DSE, and as such complementary to the definition of bountiful design spaces. But the thesis acknowledges the difficulties highlighted by the theorists above and does not assume that optimization methods can solve architectural design problems outright. Rather, it contends that optimization methods contribute to architectural design processes by (1) offering good solutions for bounded subproblems and (2) providing a medium for reflection. In other words, this thesis aims to enhance designers' capabilities through optimization, instead of replacing them with it.

3.3.2 Optimization as a Source of Solutions for Architectural Sub-Problems
Optimization methods offer solutions for explicitly defined sub-problems. 10 For example, a test problem in Chapter 7 considers the daylight performance of a façade composed of louvered façade components. In this case, the architectural design is largely completed, and the only variables are the opening angles of the façade components. This optimization problem not only considers performance

10 This view does not entail that one can completely decompose architectural design problems as in (Alexander 1964), but merely that specific aspects of architectural designs can be optimized in isolation, with the definition of optimization problems ensuring a fit with overall design intentions. In other words, decompositions for design problems "coevolve" during design processes, but such decompositions are neither unique nor inevitable.


aspects, but also preserves the design intention by constraining the louvers’ opening angles to maintain the visual coherence of the façade. In the ADO literature, examples that develop a full architectural design through optimization are rare. Dillenburger and Lemmerzahl (2011) and Lin and Gerber (2014) present exceptions that employ genetic algorithms (GAs) to synthesize building designs from criteria such as fulfillment of the building’s program, energy efficiency, and cost. But the constrained design space considered by these examples appears to confirm Rittel’s insight that the definition of a design space is more decisive than the optimization result. For example, the designs in Figure 3.5 exhibit the same overall density, floor height, number of floors, circulation core locations, and similar room shapes. Examples such as these, where

Figure 3.5 Architectural designs “evolved” with a GA. Note the similarities between the designs, which indicate a restricted design space. Illustration: (Dillenburger & Lemmerzahl, 2011, p. 361)


optimization methods choose well-performing candidates from a highly constrained set of solutions, raise the question of whether optimization is necessary or meaningful as a design method for full-fledged building designs. More often, researchers and practitioners apply optimization methods to one or more specific dimensions of architectural design problems, such as energy consumption or structural weight, or to specific building components, such as the building envelope. Evins (2013) identifies six areas of applications of ADO to sustainable building design: envelope, form, HVAC systems, renewable energy, controls, and lighting. The architects interviewed by Cichocka et al. (2017) were interested in optimizing structure (21%), daylight availability (19%), massing in terms of building codes and regulations (19%), circulation (12%), layouts (12%), and views (11%). The ADO literature contains several examples of floorplan and site layout optimization (Damski and Gero 1997; Jagielski and Gero 1997; Yeh 2006; El Ansary and Shalaby 2014) and at least one example of maximizing building volume relative to daylighting regulations (Pasternak 2016). In structural design, optimization can be applied to the topology of a structure, the shape of a structure with a fixed topology, and the sizing of the individual elements of a structure with a fixed shape (Kicinger et al. 2005; Hare et al. 2013).
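These sub-problems share a common formulation: bounded variables plus a simulated objective. As a minimal sketch of such a formulation, the louver example from above might look as follows. The daylight_objective() function is a hypothetical stand-in for a daylight simulation, and SciPy's derivative-free Powell method stands in for the black-box solvers discussed in Chapter 5; the bounds encode the design intent of visually coherent louver angles.

```python
import numpy as np
from scipy.optimize import minimize

N_LOUVERS = 8
# Design intent as bounds: every louver angle stays between 20 and 60 degrees.
bounds = [(20.0, 60.0)] * N_LOUVERS

def daylight_objective(angles):
    """Hypothetical stand-in for a daylight simulation of the louvered facade;
    in practice an external simulation would be called here (lower is better)."""
    return float(np.sum((angles - 42.0) ** 2) + 5.0 * np.ptp(angles))

start = np.full(N_LOUVERS, 40.0)
# Powell is a derivative-free local method; for expensive simulations,
# a model-based black-box solver would take its place.
result = minimize(daylight_objective, start, method="Powell", bounds=bounds)
print(result.x, result.fun)
```

The next section reviews applications of ADO from architectural practice.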


3.3.3 Practical Applications of ADO
The focus of this section, which is adapted from (Wortmann and Nannicini 2017), is published examples of ADO from architectural design practice. Relative to the substantial amount of ADO research, such examples are rare. Lui (2015) documents an early example from the 1960s: the development and application of a "Building Optimization Program" at architecture and engineering firm Skidmore, Owings & Merrill LLP (SOM). Lui's account presents a textbook example of the pitfalls of formulating the holistic design of buildings as optimization problems: The program resulted in "cost-effective commercial architecture" that, in one case, "resulted in almost a caricature of the mundane office building." Most likely, this program employed linear programming, a commonly used optimization technique for problems that are formulated as linear functions. Luebkeman and Shea (2005) describe practical and experimental applications of black-box optimization at the Foresight Innovation and Incubation group of multidisciplinary design consultancy ARUP. They describe the minimization of the number of bracing elements for the Bishopsgate Tower in London and the number of members in a stadium roof. Other examples consider the panelization and rationalization of curved surfaces, with the objective of using only flat and ideally repeating panels, and the Pareto optimization of a building envelope in terms of energy and daylight. Also at ARUP, Hladik and Lewis (2010) document the optimization of the angles of the large louvers of Singapore's National Stadium in terms of shading and view. Binkley et al. (2014) provide a similar example: the application of a GA to the design of the roof of a multipurpose


sports hall and athletics stadium in Saudi Arabia. The algorithm optimized the clear height below the roof, as well as overall steel tonnage. Rüdenauer and Dohmen (2007) describe their use of a GA to optimize the weight of the timber structure of a mountain shelter on Switzerland's highest mountain, which reduced cost, waste, transport, and assembly time in a difficult-to-reach location. Scheurer (2007) describes a "proof of concept" developed in collaboration with structural engineering firm Bollinger+Grohmann that revolved around using a GA to optimize the shape of a large roof structure. In addition to the efficiency of the optimized solutions, Scheurer emphasizes their novelty: "Not one of the engineers on the project, with an impressive amount of experience between them, would have come up with the same engineering concepts that evolved from the [genetic] algorithm." Currently, Bollinger+Grohmann regularly employ multi-objective, Pareto-based GAs to generate efficient structures "with an aesthetic logic between order and disorder" (Heimrath 2017). Similarly, structural engineering firm Web Structures employs both single- and multi-objective GAs to optimize architectural forms in terms of structural performance (Bamford 2018). Besserud et al. (2013) document the use of gradient-based and black-box optimization algorithms for integrated structural and architectural design at SOM. Besserud (2015) discusses the use of GAs at SOM for structural, solar, and daylighting optimization. Imbert et al. (2013) mention the "novel iterative approach to structural optimization" developed for the design of the Louvre Abu Dhabi that achieved "a fine balance of … structure self-weight, aesthetics, cost and buildability" (Figure 3.6). Bañon (2014) documents the use of GAs for the site


Figure 3.6 Optimization result for the dome of the Louvre Abu Dhabi. Illustration: (Imbert et al. 2013)

layout of two of his architecture practice's projects: a running track and a park. Junghans (2015) mentions the optimization of the setpoints for temperature, CO2, and humidity—with a GA—for a zero-energy office building by Austrian Architects Baumschlager Eberle. The above examples illustrate the popularity of GAs for ADO and clarify that performance criteria in architectural design practice are often multiple and cover several disciplines, including a building's geometry, structure, and environmental design. This multidisciplinarity motivates studies that consider multidisciplinary, multi-objective optimization in the context of ADO (e.g., Lin and Gerber 2014; Yang et al. 2015). Nevertheless, most of the practical examples presented in this section employ single-objective, black-box optimization methods, but often with multiple performance criteria.


Most often, optimization was carried out in larger structural or multidisciplinary engineering consultancies, but smaller firms, including architects or environmental consultants, applied optimization as well. The participants' evident interest in and openness to computational design methods seem to be unifying characteristics that overcome at least the first three challenges outlined in section 2.1. The application of ADO by various participants further motivates the thesis' stance on ADO as a method for both "co-designers" and architects (sections 2.2.3 and 2.2.4). Specialized, large-scale building types such as, for example, high-rises and stadiums appear to provide opportunities for ADO most often. Consistent with the framing of ADO as a source of solutions for well-defined subproblems (section 3.3.2), these opportunities mostly concern subproblems of otherwise largely completed building designs, such as the Louvre Abu Dhabi's roof structure or the louvers of Singapore's National Stadium. As such, the potential of optimization as a medium for reflection—which supports not only design development but also earlier, conceptual design phases—is mostly absent. This absence likely is due to a lack of appropriate methods and tools. The next section discusses this potential.

3.3.4 Optimization as a Medium for Reflection
This thesis argues that, next to finding good solutions for sub-problems, optimization methods provide a medium for reflection and conceptual development of architectural designs. In other words, optimization methods can afford what Schön (1992) terms a "reflective conversation with the materials of a design situation." Schaffranek (2012) argues that architects should not accept

optimized solutions prepared by “computer scientists and mathematicians,” and instead learn to use optimization themselves:

Those algorithms might not generate an optimal solution but they help to understand the possible outcomes of the rules defined through the algorithm … Similarly, Chen et al. (2015) suggest that optimization methods should not only find high-performing solutions, but “give architects a better understanding of the relationship between architectural features and design performance.” The framing of optimization as a generator of insights also is supported by (Bradner et al. 2014):

One key finding is that professionals use design optimization to gain understanding about the design space, not simply to generate the highest performing solution. Professionals reported that the computed optimum was often used as the starting point for design exploration, not the end product. Following Schön (1983), Stouffs and Rafiq (2015) propose a similar conception of optimization in an editorial of a special issue on “Generative and evolutionary design exploration”:

… the aim is less on optimization per se and more on exploration: the results from optimization are about changing one’s way of thinking more than choosing a single design and then realizing it. Schaffranek (2012), Chen et al. (2015), Bradner et al. (2014), and Stouffs and Rafiq (2015) emphasize that, for ADO, understanding optimization problems, i.e., fitness landscapes, is more important than finding “optimal” solutions. Johnson (2017) concludes that ADO “workflows present powerful ways of executing the


convergent portion of the divergent–convergent design cycle," but that "stronger support for exploratory divergent thinking, support which doesn't place much cognitive load on the user but which provides responsive assistance, is needed" (p 173).

3.3.5 Towards Performance-Informed DSE
The "wickedness" of architectural design problems makes their solution with optimization problematic, especially for full building designs. Nevertheless, optimization is useful for a wide range of subproblems from, for example, sustainable and structural design, and has found a small number of (published) real applications in architectural practice. To further integrate optimization into architectural design processes, it is necessary to make it productive also for conceptual design phases, which implies harnessing its potential as a medium for reflection more fully. This thesis proposes the concept of performance-informed DSE to enhance understanding and to support both exploratory divergent thinking and exploitive convergent thinking. Performance-informed DSE should support (1) selection, i.e., present designers with a choice from (groups of) design candidates and their performance—as with, for example, clustering (section 4.1.1) or Pareto-based optimization (section 5.2)—instead of only a single solution, (2) refinement, i.e., allow direct parameter changes and indicate directions for potential improvement, and (3) understanding, i.e., represent the relationships between design parameters, morphology (i.e., appearance) and performance. Refinement can serve two distinct purposes: (1) adjusting a well-performing design candidate found by an automated process (e.g., an optimization

algorithm) according to preferences not formulated as part of the automation (e.g., aesthetics or other qualitative criteria) and (2) adjusting a design candidate preferred by a designer to improve its quantitative performance. 11 For example, the modular approach to parametric design presented by Wortmann and Tuncer (2015) supports refinement in the former sense: Instead of generating a design candidate only from explicit, quantitative performance criteria, their approach supports designers' granular interventions based on aesthetics or other unquantified criteria. As such, performance-informed DSE contrasts with other computational design concepts, such as "performance-driven design" (Shea et al. 2005), "performative" or "performance-based design" (Oxman 2008), "post-parametric automation" (Andia and Spiegelhalter 2015) and "generative design" (Nagy et al. 2017), which emphasize automated DSE, i.e., optimization. While these concepts mostly relate to the Analysis-Synthesis and Generate-and-Test paradigms of architectural design processes (and the corresponding roles for computers and optimization in such processes), performance-informed design relates to Generate-and-Test and Co-Evolution (Table 3.1). Figure 3.7 diagrams these relationships: It embeds a Generate-and-Test design cycle in a larger, co-evolutionary cycle of problem re(-definition) and solution generation (section 3.1.3). The diagram positions ADO as a support for Generate-

11 Arguably, refinement in the former sense also fulfills a psychological need to retain ownership of automated processes by customizing their results. Interactive optimization methods address this need by allowing designers to influence optimization processes.


and-Test, and Performance-informed DSE as a support for transitioning from a Generate-and-Test cycle to a problem redefinition. As such, ADO and performance-informed DSE do not represent a fundamental change to architectural design processes: Rather, they support and amplify activities that are inherent to the co-evolutionary paradigm of architectural design with a novel set of methods and tools. In Chapter 9, this thesis evaluates a novel, interactive, and visual tool for performance-informed DSE, the Performance Explorer, that supports selection, refinement, and understanding.

Table 3.1 Relation of the three paradigms of architectural design processes (section 3.1) and the respective roles of computers (section 3.2) and optimization (section 3.3) to the computational design concepts discussed in section 3.3.5.

                       Analysis-Synthesis    Generate-and-Test             Co-Evolution
Role of Computers      Designer              Generator, Simulator,         Source of
                                             and Optimizer                 Representations
Role of Optimization   Design Method for     Source of Solutions for       Medium for
                       Building Design       Architectural Sub-Problems    Reflection

Related Computational Design Concepts: Performance-Driven Design (Shea et al. 2005), Performative Design (Oxman 2008), Performance-Based Design (Oxman 2008), Post-Parametric Automation (Andia and Spiegelhalter 2015), and Generative Design (Nagy et al. 2017) relate to Analysis-Synthesis and Generate-and-Test; Performance-Informed Design Space Exploration relates to Generate-and-Test and Co-Evolution.
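To make the notion of selection concrete, the following minimal sketch groups evaluated design candidates into clusters and shortlists the best-performing candidate of each cluster, so that a designer chooses among diverse, well-performing options instead of receiving a single result. It assumes scikit-learn's KMeans and randomly generated stand-in data rather than real design candidates.

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in data: 200 evaluated candidates with four parameters each,
# plus a simulated performance value (lower is better).
rng = np.random.default_rng(2)
candidates = rng.uniform(0, 1, size=(200, 4))
performance = np.linalg.norm(candidates - 0.5, axis=1)

# Group similar candidates by their parameter vectors.
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(candidates)

# Shortlist the best candidate per cluster: a diverse set of choices.
shortlist = [int(np.flatnonzero(labels == k)[np.argmin(performance[labels == k])])
             for k in range(5)]
print(shortlist)  # indices of five diverse, well-performing candidates
```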


3.4 Chapter Summary This chapter introduces three paradigms that describe architectural design processes: (1) Analysis-Synthesis, (2) Generate-and-Test, and (3) Co-Evolution. There is a consensus in design theory that Analysis-Synthesis is too simplistic, but Generate-and-Test has proven relevant especially for computational design and ADO. Although holistic building design by automated DSE appears problematic, the chapter documents both potential and practical applications of ADO to architectural subproblems. Importantly, Co-Evolution emphasizes changing problem definitions as integral parts of architectural design cycles, and representation as a key tool for discovery Figure 3.7 Relationships of ADO and performance-based DSE to an architectural design cycle. This design cycle embeds a Generate-and-Test process in a larger, co-evolutionary cycle of problem re(-definition) and solution generation.


Co-Evolution thus motivates the need for ADO tools that enhance understanding through meaningful representations, in addition to the automated DSE afforded by mathematical optimization methods. This chapter proposes performance-informed DSE as an approach to bridge between Generate-and-Test and Co-Evolution by better accommodating changing problem definitions and reflection through selection, refinement, and understanding. The next chapter reviews examples of this small but emerging field in the context of visualizing optimization results.



4 Visualizing Optimization Results

Chapter 3 emphasizes the need for optimization tools that not only provide good solutions, but that also afford understanding, reflection, and interactive exploration. For Shneiderman (2000), "visualizing data and processes" and "exploring solutions" with "what-if" tools are two key activities that creativity-enhancing computational design tools should support. Such tools require appropriate representations. This chapter, adapted from (Wortmann 2017a), discusses multivariate visualization methods as an approach for representing design spaces and fitness landscapes. Parametric models (Woodbury 2010) generate (finite or infinite) numbers of design candidates. Accordingly, parametric models define design spaces, with the number of dimensions equal to the number of parameters, i.e., variables (section 3.1.2). Since design spaces often have more than two or three dimensions, they are difficult to visualize. 12 This difficulty motivates the need for multivariate visualizations. Such visualizations help designers to understand (1) the range of design candidates and their performance, (2) the similarities characterizing groupings of design candidates, and (3) relationships between design parameters and performance criteria. As such, multivariate visualizations are appropriate visualizations for performance-informed DSE.

12 Woodbury et al. (2017) propose "Interactive Design Galleries" that, on wall-size screens, present large numbers of parametric design candidates side by side, but do not represent design parameters and/or performance criteria.


Section 4.1 reviews indirect approaches for representing fitness landscapes, such as k-means clustering and Pareto fronts. Section 4.2 reviews direct approaches, such as Parallel Coordinates, Star Coordinates, and Matrix of Contour Plots.

4.1 Representing Fitness Landscapes

ADO combines the notion of a parametrically defined design space with one or more numerically expressed performance criteria. When such performance criteria extend a design space as extra dimensions, the literature often describes the result as a fitness landscape (e.g., Talbi 2009; Miles 2010; Koziel and Yang 2011; Rutten 2014). This thesis focuses on visualizing fitness landscapes with only one performance criterion. In a literal fitness landscape, the height of a coordinate corresponds to the performance value of a design candidate, and the remaining two coordinates specify the design itself (e.g., Mourshed et al. 2011; Rutten 2014). It therefore is impossible to represent design candidates defined by more than two parameters as a literal landscape. Some of the multivariate visualizations discussed in this chapter overcome this limitation. Recently, ADO has received new understandings both as a generative design tool that provides starting points for further design exploration and as a representational tool that aids the understanding of design problems (section 3.3.4). Mourshed et al. (2011) point out that "the discarded 'inferior' [i.e., suboptimal] solutions and their fitness contain useful information about underlying sensitivities of the system and can play an important role in creative decision making."


Conti and Kaijima (2017) use Bayesian networks to predict which values for the design parameters are likely to result in well-performing designs. Bayesian networks analyze relationships between parameters and performance in terms of probability. But, to make reasonably accurate predictions, such networks require large evaluation budgets, which can be impractical. 13

4.1.1 Clustering

Harding (2016) proposes Self-Organizing Maps to visualize high-dimensional fitness landscapes in ADO. Self-Organizing Maps are neural networks that arrange high-dimensional data in two-dimensional grids based on similarity. Such maps are well-suited to organizing design candidates but make it difficult to understand relationships between design parameters and performance, because this mapping is nondeterministic and different for each parameter. In other words, Self-Organizing Maps "distort" design spaces. A more popular approach to move from optimization to selection and understanding is k-means clustering (MacQueen 1967). K-means clustering is an unsupervised machine learning method that works for arbitrary numbers of parameters and sorts data into a pre-defined number of groups, again based on similarity.

13 Achieving a prediction error below 10% on a six-variable test problem requires approximately 4,000 simulated design candidates (Conti and Kaijima 2017). Wortmann et al. (2017) compare the results from 10,000 quasi-randomly selected design candidates with optimizing for 200 function evaluations (i.e., simulating the performance of 200 design candidates) on a building energy problem with 13 variables: All tested optimization algorithms found better solutions than the quasi-random sampling.


Figure 4.1 Each of the ten columns presents design candidates from one of 80 clusters. Illustration: (Stasiuk et al. 2014)

Stasiuk et al. (2014) cluster over 2,000 evaluated candidates of a bending-active structure—found by a GA—according to 18 characteristics. Their analysis yields 80 clusters, i.e., 80 archetypal design candidates (Figure 4.1). However, due to this considerable number, a human designer might struggle to select a design for further development or to understand the characteristics of these 80 candidates in relationship to their performance. Chen et al. (2015) define a twelve-dimensional parametric model of an abstract building geometry, resulting in a space of 752,640 design candidates. They search this space with a multi-objective GA that aims to minimize the building envelope's thermal transfer and cost and to maximize the available daylight, yielding 5,000 evaluated candidates. They then use k-means clustering to identify relationships between design parameters and performance, though with only partial success.
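The clustering step itself is straightforward to reproduce. The following is a minimal sketch, assuming scikit-learn and hypothetical, uniformly sampled candidates in place of actual, GA-evaluated ones; the cluster centers serve as archetypal candidates in the sense of Figure 4.1:

```python
# Minimal sketch: grouping evaluated design candidates with k-means.
# Hypothetical data (5,000 candidates, twelve parameters, as in Chen et al. 2015).
import numpy as np
from sklearn.cluster import KMeans

X = np.random.default_rng(4).uniform(0, 1, (5000, 12))  # evaluated candidates

km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X)
archetypes = km.cluster_centers_   # one representative per cluster
labels = km.labels_                # cluster membership per candidate
```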


Similarly, Brown and Mueller (2017) use k-means to represent the optimization results from a GA on two structural design problems with three and five clusters, respectively. Nevertheless, a principal disadvantage of k-means clustering is that one must provide the number of clusters a priori. 14

4.1.2 Pareto Fronts

Pareto fronts are a visualization method often employed in MOO (section 5.2). Such fronts are two- or three-dimensional plots of the relationship between two or three performance criteria (Figure 4.2).

Figure 4.2 Pareto front of a generic structural design problem, with the two performance criteria of strength and cost. Yellow triangles indicate "Pareto-optimal" or "non-dominated" design candidates, where improvements in one performance criterion imply losses in the other. Blue dots indicate the remaining, "dominated" design candidates. Illustration: (Evins et al. 2012)

14 Bipartite modularity, an alternative clustering method, does not require the number of clusters as an input, but the range of values which are considered similar (Barber 2007).


Although Pareto fronts provide a set of design candidates to select from and allow designers to understand tradeoffs between different criteria (Radford and Gero 1980)—i.e., tradeoff spaces—they are of limited use for DSE, since they cannot visualize relationships between performance criteria and design parameters, i.e., design spaces. These relationships are critical for understanding why certain kinds of designs perform better than others.

4.1.3 From Indirect to Direct Representations of Fitness Landscapes

The works discussed in this section are symptomatic of the need for human-understandable representations of fitness landscapes. Instead of trying to indirectly understand the relationships between design parameters and performance through statistical inferences, grouping design candidates, or analyzing their characteristics or tradeoffs, the remainder of this chapter considers methods that visualize such relationships directly by—without distortions—deterministically mapping high-dimensional design spaces into lower-dimensional ones. The next section discusses three types of multivariate visualizations—Parallel and Radial visualizations, Star Coordinates and RadViz, and Matrix of Contour Plots—and addresses the reversibility of these methods, a property that is necessary for turning them into interactive tools for performance-informed DSE.

4.2 Multivariate Visualizations

Multivariate visualizations represent data that have more than two or three dimensions and thus are difficult to display on print-outs and screens. They serve a range of purposes, for example summarizing data, identifying patterns in or


similarities between data, and displaying correlations between parameters. Hoffman and Grinstein (2002) survey multivariate visualization types. According to them, multivariate visualizations harness humans' perceptual abilities to "look for structure, features, patterns, trends, anomalies, and relationships in data." De Oliveira and Levkowitz (2003) describe how "[multivariate] visual mapping techniques are now being used both to convey results of [data] mining algorithms in a manner more understandable to end users and to help them understand how an algorithm works." This thesis proposes that performance-informed DSE can likewise benefit from multivariate visualization: Instead of presenting only one or a small selection of design candidates, such visualization provides designers with an overview of design spaces and/or fitness landscapes.

4.2.1 Parallel and Radial Visualizations

A straightforward method to represent data with several parameters is Parallel Coordinates (Wegman 1990). Parallel Coordinates introduces a set of parallel—usually vertical—axes equal to the number of parameters of the data. To display a datum, one marks the value of each parameter on the corresponding axis and connects the resulting points with a polyline (Figure 4.3a). A variation of this method uses radial axes, which allow the representation of data as closed polylines (Figure 4.3b).
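As a minimal sketch of this construction, the following plots one polyline per datum, assuming hypothetical candidates with three parameters normalized to [0, 1]:

```python
# Minimal sketch: Parallel Coordinates with matplotlib.
import numpy as np
import matplotlib.pyplot as plt

data = np.random.default_rng(1).uniform(0, 1, (20, 3))  # 20 candidates, 3 parameters
axes_x = np.arange(data.shape[1])                       # one vertical axis per parameter

fig, ax = plt.subplots()
for candidate in data:
    ax.plot(axes_x, candidate, color="steelblue", alpha=0.5)  # one polyline per datum
for x in axes_x:
    ax.axvline(x, color="black", linewidth=0.8)               # the parallel axes
ax.set_xticks(axes_x)
ax.set_xticklabels([f"p{i + 1}" for i in axes_x])
plt.show()
```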


Figure 4.3 Four methods to visualize a set of 3-dimensional points in two dimensions: Parallel Coordinates (a), Radial Coordinates (b), Star Coordinates (c) and RadViz (d).

Although such visualizations are easy to construct and understand, they become hard to read when representing many data, since the polylines tend to overlap (e.g., Figure 4.4). 15 Naturally, such visualization methods also become harder to understand as the number of parameters, i.e., coordinate axes, increases. The ordering of coordinate axes can influence the insightfulness of parallel and radial visualizations (Johansson et al. 2008).

15 Mueller (2014) reduces these overlaps with k-means clustering. She first groups similar design candidates and then plots these groups separately.


The visualizations presented in this thesis order coordinate axes according to the orders of parameters in the underlying optimization problems. Design Explorer (2015), an online visualization tool developed by structural engineering consultants Thornton Tomasetti, uses Parallel Coordinates to represent design candidates (Figure 4.4). Confusingly, the tool's representation does not distinguish between design parameters and performance objectives and plots them on the same set of axes. Ashour and Kolarevic (2015) supplement a Pareto front with a Parallel Coordinates visualization of multiple criteria, but do not visualize the relationship between those criteria and design parameters.

Figure 4.4 Screenshot of Design Explorer (http://tt-acm.github.io/DesignExplorer/, accessed 24.10.2017)

4.2.2 Star Coordinates and RadViz

The visualization method applied in this thesis—Star Coordinates—also introduces one coordinate axis for each parameter, with the axes typically arranged radially (Figure 4.3c). In contrast to Parallel Coordinates, however, Star


Coordinates displays a datum not as a closed polyline but as a single point. As a point-based representation, Star Coordinates can represent many more data than Parallel Coordinates, and even display a continuous field to represent the space of the data, although at the price of making individual parameters less readable. Note that, although in theory Star Coordinates might represent several data with an identical point, the novel visualization presented in Chapter 8 circumvents this limitation by preferring better performing design candidates.
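The underlying mapping is simple: each datum's two-dimensional location is the sum of the axis vectors scaled by its parameter values. A minimal sketch, assuming equally spaced radial axes and parameter values normalized to [0, 1]:

```python
# Minimal sketch: the Star Coordinates mapping from R^n to R^2.
import numpy as np

def star_coordinates(X):
    """Map points in R^n to R^2 as the sum of scaled axis vectors."""
    n = X.shape[1]
    angles = 2 * np.pi * np.arange(n) / n
    axes = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # n axis vectors in R^2
    return X @ axes                                            # one 2D point per datum

X = np.random.default_rng(2).uniform(0, 1, (100, 5))  # 100 candidates, 5 parameters
P = star_coordinates(X)                               # 100 points in the plane
```

Because this mapping is linear, every design candidate maps to exactly one point, while several candidates may map to the same point; section 4.2.4 returns to this property.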

RadViz—a method closely related to Star Coordinates—places data with a spring system, with each parameter represented by a spring (Figure 4.3d). Rubio-Sanchez et al. (2016) conclude in a comparison of the two methods that RadViz "introduces non-linear distortions, can encumber outlier detection, prevents associating the plots with useful linear mappings, and impedes estimating original data attributes accurately." An important advantage of parallel and radial visualizations is that their coordinate axes allow the estimation of numerical parameters. This advantage contrasts with other multivariate visualization methods such as clustering, which cannot directly represent numerical relationships.

4.2.3 Matrix of Contour Plots

The optimization community sometimes represents fitness landscapes with a Matrix of Contour Plots. Such a matrix consists of $n^2 - n$ two-dimensional plots, with each plot representing the relationship between a pair of parameters and

the performance criterion. The plots on the diagonal of the matrix remain blank or show this relationship for a single parameter (Figure 4.5). In other words, this


visualization consists of two-dimensional "sections" or "contours" through the higher-dimensional space. This architecturally inspired metaphor clarifies an important limitation: Although the number of contour plots increases quadratically with the number of parameters (e.g., ten parameters already yield 90 plots), these plots display only a very small portion of the space, since, to draw a single plot for two parameters, all other parameters must stay constant. In terms of the section metaphor, the sections are drawn through a single "base point" in the higher-dimensional space. This limitation is usually mitigated by basing the visualization on the best set of parameter values found during optimization. 16 Contour plots are unable to provide an overview of the whole design space and, due to their quadratically growing number, become increasingly difficult to understand even for relatively small numbers of parameters. Additionally, there is no guarantee that the selection of design candidates displayed is relevant for understanding the design space at hand, since the contour plots only capture interactions between pairs of parameters and thus miss other, potentially more

Figure 4.5 (a) represents design candidates in a three-dimensional design space, (b) indicates the section planes shown in (c), a Matrix of Contour Plots. For this example, all design candidates lie on section planes. Typically, this planarity is not the case.

16 Van Wijk and van Liere (1993) propose an interactive Matrix of Contour Plots that allows the user to navigate the space by changing the base point of the visualization.


important interactions. This inability to represent non-pairwise, non-linear interactions adds to the cognitive difficulty of integrating information from a quadratically growing number of contour plots.

4.2.4 Reversing Multivariate Visualizations

Reversibility of multivariate visualizations is a critical issue for DSE: To support designers with interactive tools that visualize design spaces and fitness landscapes, one should not only be able to map high-dimensional design candidates into low-dimensional visualizations, but also to map locations on these visualizations back into the high-dimensional design space. Ideally, such a mapping would be bijective (Figure 4.6a). In other words, every design in the higher-dimensional design space would map uniquely onto a lower-dimensional representation, and every lower-dimensional representation would map onto exactly one design in the higher-dimensional space. Many multivariate visualization methods do not support such a "reverse" mapping from low to high dimensions, and only Parallel and Radial Coordinates support a bijective mapping. But, as we have seen, in practice overlaps inhibit the understandability of this mapping.

Figure 4.6 Three types of mappings between a high-dimensional space $\mathbb{R}^n$ and a two-dimensional space $\mathbb{R}^2$: bijective (a), injective (b) and surjective (c).


Matrices of Contour Plots allow reverse mappings, but since this method only takes (relatively arbitrary) slices of the higher-dimensional space, it is not bijective. Instead, Matrices of Contour Plots are injective, with every location in the two-dimensional representation mapping onto a unique design candidate, but with many design candidates not representable at all (Figure 4.6b). Chapter 8 proposes an extension to Star Coordinates—Performance Maps—that allows designers to go back and forth between design spaces and their representations by using a surjective mapping (Figure 4.6c). Surjective mappings, while falling short of the bijective ideal, improve on injective mappings in that they can represent all designs from the higher-dimensional space, but with some design candidates potentially overlapping in the representation.

4.3 Chapter Summary

This chapter reviews current efforts in ADO to enhance the understanding that human designers can garner from (sub-optimal) optimization results. Beyond the Pareto fronts employed in MOO, most of these efforts employ clustering or other methods that represent fitness landscapes only indirectly. The chapter then discusses three types of multivariate visualization methods that can visualize fitness landscapes directly: Parallel and Radial Coordinates, Star Coordinates and RadViz, and Matrix of Contour Plots. Finally, the chapter introduces reversibility as a key property for employing multivariate visualizations in performance-informed DSE tools. Table 4.1 summarizes the discussed representations in terms of type and reversibility.


Table 4.1 Overview of representations for fitness landscapes discussed in this chapter

| | Method | Type | Reversibility |
| Indirect | Bayesian Networks | Statistical Inference | N/A |
| | Self-Organizing Maps | Clustering | Irreversible |
| | K-Means | Clustering | Irreversible |
| | Pareto Fronts | Tradeoff Analysis | Irreversible |
| Direct | Parallel Coordinates | Parallel Visualization | Bijective |
| | Radial Coordinates | Radial Visualization | Bijective |
| | Star Coordinates | Projection | Irreversible |
| | RadViz | Projection | Irreversible |
| | Matrix of Contour Plots | Sections | Injective |

This thesis assumes that direct, reversible representations of fitness landscapes are the most effective for performance-informed DSE, due to their ability to interactively visualize entire design spaces. Chapter 8 presents Performance Maps, a novel, direct and surjective visualization method based on Star Coordinates, and compares it with Parallel Coordinates and Matrix of Contour Plots in terms of its potential for performance-informed DSE. The next chapter surveys black-box optimization methods and their reception in the ADO literature.


5 Black-Box Optimization

This chapter, adapted from (Wortmann and Nannicini 2017) 17, introduces single- and multi-objective black-box optimization and provides an overview of three categories of black-box optimization methods: metaheuristics, direct search, and model-based methods. It further discusses mathematical benchmark results and contrasts these with recommendations from the ADO literature. Finally, it identifies a false dichotomy between performance-informed DSE and efficiency, which underscores the relevance of Research Question 2. Optimization problems are formulated with varying numbers of continuous and/or discrete variables and with or without constraints, resulting in linear or nonlinear optimization problems, and convex or nonconvex objective functions. Many optimization methods work only on specific kinds of optimization problems. For example, linear programming works only on optimization problems formulated as linear inequalities. Moreover, the effectiveness of optimization methods depends on individual problem characteristics, with problem-adapted optimization methods usually being more effective than general-purpose ones 18. A major distinction exists between gradient-based and derivative-free optimization methods (Choudhary and Michalek 2005).

17 Reprinted/adapted with permission from Springer International Publishing.

18 The No-Free-Lunch theorem states that the effectiveness of all possible optimization methods, when averaged over all possible optimization problems, is identical (Wolpert and Macready 1997). However, the theorem makes assumptions that limit its applicability to practical cases. For example, it assumes that all possible fitness landscapes are equally likely to occur.


Gradient-based optimization employs the derivative of the objective function that relates the input variables to the value one wants to optimize. Intuitively, this derivative describes the steepness of the fitness landscape at a certain point, which is useful in guiding the search. Gradient-based methods are fast and mathematically rigorous but require either the mathematical formulation of the objective function and its gradient, or many function evaluations to estimate the gradient via finite differences. But, when evaluating an objective function with time-intensive computer simulations, such a mathematical formulation is unavailable, while estimating the gradient requires a prohibitive number of function evaluations (Nannicini 2015).

5.1 Single-Objective Optimization

Black-box (or derivative-free) optimization attempts to solve optimization problems where the objective function is computable but not available in analytical form. In other words, the objective function's mathematical expression is unknown (Conn et al. 2009; Hendrix and G.-Tóth 2010). This unavailability is typical for ADO: A numerical simulation linked to a parametric model is a function that takes as input the design parameters (i.e., the values of the decision variables) and outputs a measure of performance of the corresponding design candidate (the objective function or fitness value). From this perspective, a single simulation is equivalent to a single objective function evaluation.

The goal of the optimization process then is to minimize (or maximize) the output of the numerical simulation, for example the total stress and displacement of a structure or the energy use of a building.


There are several types of mathematical optimization problems. This thesis focuses on problems with a single objective function and simple lower and upper bounding constraints on the decision variables. Such problems can be expressed as follows:

$$\min \{ f(x) : x \in [x^L, x^U] \subseteq \mathbb{R}^n,\ x_i \in \mathbb{Z}\ \forall i \in I \} \qquad (\mathrm{OPT})$$

where the vectors $x^L$, $x^U$ represent lower and upper bounds for the decision variables, and $I \subseteq \{1, \dots, n\}$ is the set of indices of the decision variables that are constrained to take on integer values (the remaining variables can take any fractional value). Here, a point in the design space $[x^L, x^U]$ with $x_i \in \mathbb{Z}\ \forall i \in I$, also called a solution, represents values of the design parameters and therefore defines a design candidate. A fitness landscape is a design space with an additional dimension that contains the objective values for all points in the space. One often characterizes optimization problems in terms of their fitness landscape, which, for more difficult problems, is "rugged." A rugged fitness landscape can contain discontinuities, sharp bends or ridges, noise, local optima, and/or outliers. A large part of the literature on black-box optimization deals with (OPT), or an even simpler, continuous problem in which $I = \emptyset$, i.e., all variables can take fractional values. When solving (OPT), the goal is to obtain a global minimum of

the function f, that is, a point that attains the minimum over the whole design

space (also known as the feasible region). But this minimization may be difficult to achieve in theory and practice, and in the worst case may require an infinite number of steps. One may therefore settle for the less ambitious goal of


determining a local minimum, that is, a point that attains the minimum over some area of the design space centered on the point itself. Problems with a single local minimum that is also the global minimum are known as unimodal, while problems with multiple local minima are known as multimodal. While unimodal problems are easier to optimize, in practice most simulation-based problems are multimodal (e.g., Wetter and Wright 2004). Known convergence results guarantee that one can determine a local minimum with high accuracy in a small number of steps if the objective function is sufficiently well-behaved (Vicente 2013; Dodangeh and Vicente 2016). Most black-box algorithms combine global search, which aims to determine the area of the design space that contains the global minimum (without being able to guarantee that they will find this area in finite time), and local search, which focuses on identifying a local minimum starting from a given point. This combination of (global) exploration and (local) exploitation has proven very successful: Despite the lack of strong theoretical guarantees, state-of-the-art optimization software consistently and rapidly finds near-optimal solutions for some classes of problems (OPT) (Rios and Sahinidis 2013).

5.2 Multi-Objective Optimization

While single-objective algorithms aim to optimize only a single objective function f—which can include a weighted sum of several performance criteria and penalty terms—multi-objective algorithms consider multiple, potentially conflicting objective functions simultaneously.


One can formulate a multi-objective (MOO) problem as the optimization of a vector of single-objective functions:

$$\min \{ F(x) = [f_1(x), f_2(x), \dots, f_k(x)] \}$$

A multi-objective problem may not have a well-defined solution, because the set of all interesting solutions (the set of all non-dominated solutions, also known as the Pareto front) may have infinite size and be very difficult to represent. For a non-dominated solution, it is impossible to improve an objective value without losses in other objective values. The numbers in Figure 5.1a indicate the solutions' Pareto ranks, i.e., by how many other solutions they are dominated.

A multi-objective problem may not have a well-defined solution, because the set of all interesting solutions (the set of all non-dominated solutions, also known as the Pareto front) may have infinite size and be very difficult to represent. For a non-dominated solution, it is impossible to improve an objective value without losses in other objective values. The numbers in Figure 5.1a indicate the solutions’ Pareto ranks, i.e., by how many other solutions they are dominated. Solutions

Figure 5.1 Four methods to find Pareto fronts: (a) nondominated sorting, (b) scalarizing with weighted sums of objectives, (c) crowding to ensure diversity, and (d) maximizing hypervolume. Illustration: (Knowles and Nakayama 2008)


Solutions with rank 1 are non-dominated. Accordingly, the true Pareto front—which for two objectives represents a curve—represents the trade-offs between conflicting objectives. Finding the Pareto front is usually more difficult than finding the global optimum of a single-objective problem (Coello 2013). Pareto-based algorithms, the most common type of multi-objective algorithm, aim to approximate this true Pareto front. One measures the quality of Pareto-based optimization algorithms in terms of how well they approximate the "true" Pareto front, and not in terms of how much they improve a single objective value. In other words, one can think of Pareto-based optimization algorithms as pursuing a single "meta"-objective: the approximation of the Pareto front. To achieve a good approximation, an algorithm needs to find not only a single, well-performing solution, but a set of solutions that is diverse with respect to how much they satisfy the individual objectives (Figure 5.1c). This diversity ensures that the approximated Pareto front will be as wide as possible. In a second step, the designer selects solutions from the Pareto front based on the trade-offs it represents and/or additional considerations or preferences. An alternative approach to MOO is to employ a weighted sum of objectives, which reduces the problem to a single-objective one. Weighted sums avoid the need for human decision makers by defining the relative importance of different performance criteria a priori, while choosing a solution from the Pareto front assigns a relative importance a posteriori. One can approximate a Pareto front by running a single-objective algorithm multiple times with different weights


(Figure 5.1b). But this approach finds only solutions that are in certain parts of the Pareto front, more specifically the convex parts, unless one takes special precautions in the formulation of the weighted sum (Steuer and Choo 1983).
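To make the notion of dominance concrete, the following minimal sketch identifies the non-dominated (rank 1) solutions of a small, hypothetical set of objective vectors for a minimization problem:

```python
# Minimal sketch: filtering non-dominated solutions (Pareto rank 1).
import numpy as np

def non_dominated(F):
    """Return a boolean mask of solutions not dominated by any other."""
    k = F.shape[0]
    mask = np.ones(k, dtype=bool)
    for i in range(k):
        for j in range(k):
            # j dominates i: no worse in all objectives, better in at least one
            if i != j and np.all(F[j] <= F[i]) and np.any(F[j] < F[i]):
                mask[i] = False
                break
    return mask

F = np.array([[1.0, 4.0], [2.0, 2.0], [3.0, 3.0], [4.0, 1.0]])  # two objectives
print(non_dominated(F))  # [ True  True False  True ]
```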

5.3 Categories of Black-Box Optimization Methods

Since the 1950s, many black-box optimization algorithms have been proposed, and the understanding of the algorithms' performance from a theoretical and practical point of view has increased considerably. The following sections discuss some of the major algorithm categories and outline their characteristics and limitations. Although these categories represent dichotomies, it is important to remark that many algorithms draw on ideas from both sides of these dichotomies. The first distinction is between iterative methods and metaheuristics. Among iterative methods, the categorization distinguishes direct search methods and

model-based methods. Metaheuristics, such as genetic algorithms (GAs), lack mathematical proofs of their rate of convergence towards the global optimum and often exhibit biological analogies and randomization. Both direct search and model-based methods rest on more solid mathematical foundations and tend to be deterministic, although stochastic methods exist as well. The difference between direct search and model-based methods is that the latter construct a mathematical model (either local or global) to guide the search. There is no widely agreed-upon taxonomy for black-box algorithms, which have been developed in different fields of mathematics and computer science.


Textbooks typically cover only one of the three categories (e.g., Talbi 2009; Koziel et al. 2011) or leave out important (sub-)categories such as global direct search (e.g., Koziel and Yang 2011) and/or global model-based methods (e.g., Hendrix and G.-Tóth 2010; Yang 2010; Audet and Hare 2017). It is likely that this fragmentation has contributed to the preference for metaheuristics in ADO (section 5.8.2). Wortmann and Nannicini (2016; 2017) propose three categories to better structure discussions in ADO (Table 5.1): metaheuristics, direct search, and model-based methods.

Table 5.1 Overview of algorithms and implementations in this chapter. Local methods marked with a * perform repeated local search from different starting points.

| | Local | Global | Pareto-based |
| Metaheuristics | Simulated Annealing (SA), Variable Neighborhood Search | Genetic Algorithms (GAs), Particle Swarm Optimization (PSO), Covariance Matrix Adaptation (CMA-ES) | NSGA-II, SPEA-2, HypE |
| Direct Search | Pattern Search (Hooke-Jeeves), Nelder-Mead, MADS, NOMAD | DIRECT, MCS, TOMLAB/GLCCLUSTER | MULTIMADS |
| Model-based | Trust-Region Methods, TOMLAB/MULTI-MIN*, KNITRO* | EGO, RBFOpt, SNOBFIT | ParEGO |

Figure 5.2 Behavior of black-box algorithms on the Branin function. This function has two variables and three global minima (indicated as gray dots). The algorithms optimized this function until they found a solution within 1% of a global minimum. The design candidates evaluated during the algorithms’ searches are indicated as black dots. Note that the metaheuristics (GA, SA, and PSO) required more function evaluations than the global direct search (DIRECT) and model-based (RBFOpt) methods.

Figure 5.2 gives an impression of the behaviors of algorithms from these categories on the Branin function, a mathematical test function for black-box optimization algorithms. The following sections examine these categories and algorithms in more detail.

5.4 Metaheuristics

A metaheuristic is a high-level procedure designed to construct a heuristic for an optimization problem.


5.4.1 Single-Objective Metaheuristics

There is a vast literature on single-objective metaheuristics: some of the most well-known are simulated annealing (SA) (Kirkpatrick et al. 1983), GAs (Holland 1992) and particle swarm optimization (PSO) (Kennedy and Eberhart 1995) (Figure 5.2). Covariance matrix adaptation evolution strategy (CMA-ES) (Hansen and Ostermeier 2001) and variable neighborhood search (Hansen and Mladenovic 2001) are more recent examples. Among metaheuristics, GAs are the most well-known class of algorithms and have been applied to many problems in architecture and engineering (Miles 2010). GAs mimic biological evolution by defining candidate solutions as a set of genes, and by generating "evolved" candidate solutions through "chromosome crossover" and random "mutations." Like many metaheuristics, GAs belong in the category of population-based algorithms: They generate a set of points for every iteration (Hendrix and G.-Tóth 2010) 19. In contrast, iterative methods generate a single point. (But the generation of this point may involve additional function evaluations, for example to estimate the gradient.) Another difference is that iterative methods often are deterministic, while metaheuristics tend to be stochastic, i.e., they employ or even embrace randomness. The main advantages of metaheuristics are their conceptual simplicity, ease of implementation, and wide applicability. Their main disadvantages are the lack of sound theoretical performance guarantees besides those that derive from

19 Depending on the algorithm, such a set is called a generation, swarm, colony, etc.


their inherent randomness, the need to tune many optimization parameters for every algorithm and problem (Talbi 2009), and the fact that metaheuristics often require thousands of evaluations of the objective function to be effective. In GAs, for example, a single generation typically consists of 25–100 candidate solutions, and GAs require several generations to achieve "evolved" results that are significantly better than random guessing. In a more extreme example, when comparing the effectiveness of various metaheuristics on benchmark problems from structural design, Hasançebi et al. (2009) examine 50,000 design candidates for each algorithm on easier problems, and 100,000 on the most difficult problem. The last disadvantage is a source of concern when it comes to the application of metaheuristics in ADO: In many situations, the evaluation of the performance of a design candidate requires a time-intensive numerical simulation, which makes the evaluation of thousands of design candidates impractical.

5.4.2 Pareto-Based Metaheuristics

Since diversity of solutions is an explicit goal of Pareto-based optimization, population-based metaheuristics appear well-suited to this task. Indeed, Pareto-based optimization algorithms typically employ a GA—for example NSGA-II (Deb et al. 2002), SPEA-2 (Zitzler et al. 2001) and HypE (Bader and Zitzler 2008)—or PSO. Pareto-based algorithms differ not only in their optimization method, but also in their method for comparing solutions, because in MOO there is no straightforward method to decide which of two solutions is better. NSGA-II and SPEA-2 rank solutions with non-dominated sorting and filter them with a crowding measure to ensure a good spread of solutions on the Pareto front

(Figure 5.1c). HypE estimates the solutions' hypervolumes in the objective space to more directly assess their contributions to the Pareto front (Figure 5.1d).
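Before turning to direct search, the following minimal sketch illustrates the single-objective GA mechanics described in section 5.4.1 (population, selection, crossover, and mutation) on a hypothetical objective; production-quality GAs add elitism, adaptive rates, and many other refinements:

```python
# Minimal sketch: a real-valued genetic algorithm for minimization.
import numpy as np

rng = np.random.default_rng(3)

def f(x):                                     # hypothetical objective to minimize
    return np.sum((x - 0.3) ** 2)

pop = rng.uniform(0, 1, (50, 6))              # one generation of 50 candidates
for gen in range(40):
    fitness = np.array([f(x) for x in pop])
    children = []
    for _ in range(len(pop)):
        i, j = rng.integers(0, len(pop), 2)   # tournament selection, parent a
        a = pop[i] if fitness[i] < fitness[j] else pop[j]
        i, j = rng.integers(0, len(pop), 2)   # tournament selection, parent b
        b = pop[i] if fitness[i] < fitness[j] else pop[j]
        mask = rng.random(6) < 0.5            # uniform "chromosome crossover"
        child = np.where(mask, a, b)
        child = child + rng.normal(0, 0.05, 6) * (rng.random(6) < 0.1)  # mutation
        children.append(np.clip(child, 0, 1))
    pop = np.array(children)                  # the next, "evolved" generation

best = min(pop, key=f)
```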

5.5 Direct Search

In contrast to metaheuristics, direct search methods (Kolda et al. 2003) typically are deterministic—although stochastic variations exist—and rest on solid mathematical foundations. Hooke and Jeeves (1961) coined the phrase "direct search" to describe the "sequential examination of trial solutions involving comparison of each trial solution with the 'best' obtained up to that time together with a strategy for determining (as a function of earlier results) what the next trial solution will be." Accordingly, direct search methods typically do not employ (estimated) derivatives, randomization, populations, or mathematical models.

5.5.1 Local Direct Search

Local direct search algorithms are among the earliest black-box optimization algorithms. The basic structure of such algorithms involves a polling step, which tests several directions (i.e., variable changes) starting from the current point for improvement of the objective function, and a search step, which determines the next point based on the results of the polling step. If the directions explored in the polling step and the length of the steps taken towards those directions satisfy certain conditions, one can show that a direct search converges to a stationary point (i.e., a point which is a candidate to be a local or global optimum) of the objective function.


A classic example of this structure is pattern search, also known as Hooke-Jeeves (Hooke and Jeeves 1961). A well-known variant is the Nelder-Mead algorithm (Nelder and Mead 1965), which samples points on a simplex (i.e., a generalization of a triangle to any dimension). At every iteration, the algorithm replaces one corner of the simplex with a better point. More recent examples are generating set search (Kolda et al. 2003) and mesh-adaptive direct search (MADS) (Audet and Dennis 2006). While direct search has been designed for single-objective problems, extensions to multi-objective problems exist, for example MULTIMADS (Audet et al. 2010).
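A minimal sketch of this poll-and-shrink structure follows, assuming a hypothetical smooth objective and the coordinate directions; the full Hooke-Jeeves algorithm adds a pattern move along promising directions, and the convergence safeguards mentioned above are simplified here to halving the step length:

```python
# Minimal sketch: coordinate-wise pattern search (poll step + step shrinking).
import numpy as np

def pattern_search(f, x, step=0.5, tol=1e-6, max_iter=200):
    n = len(x)
    fx = f(x)
    directions = np.vstack([np.eye(n), -np.eye(n)])   # +/- each coordinate
    for _ in range(max_iter):
        improved = False
        for d in directions:                          # polling step
            trial = x + step * d
            ft = f(trial)
            if ft < fx:                               # accept first improvement
                x, fx, improved = trial, ft, True
                break
        if not improved:
            step /= 2                                 # shrink the step length
            if step < tol:
                break
    return x, fx

x, fx = pattern_search(lambda z: np.sum((z - 0.7) ** 2), np.zeros(3))
```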

5.5.2 Global Direct Search

Literature on global direct search algorithms is scarce, possibly because guaranteeing global optimality for black-box optimization problems is difficult. The most well-known algorithm of this class is DIRECT (Jones et al. 1993), a variant of the Lipschitzian approach that does not require the Lipschitz constant to be specified. For a function f, the Lipschitz constant is a measure of the function's maximum change for a given interval. The algorithm proceeds by dividing the design space into boxes and further subdividing those boxes that—based on an approximation of the Lipschitz constant, which is unknown but bounded—potentially contain the optimum (Figure 5.2, Figure 5.3).

Figure 5.3 Diagram of three iterations of the DIRECT algorithm. DIRECT subdivides promising regions of the design space while ignoring others.


If a bound on the Lipschitz constant is known, it is guaranteed that the algorithm's recursive subdivision eventually finds the global optimum (with a given tolerance). In practice, this bound typically is unknown. In that case, one runs DIRECT for a specified number of iterations, which often is sufficient to obtain a solution with a close-to-optimal objective value. Another global direct search algorithm is multi-level coordinate search (MCS) (Huyer and Neumaier 1999), which also relies on recursively subdividing the design space. It performs especially well on problems with small numbers of variables (ibid.).
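The following one-dimensional sketch conveys only the flavor of this recursive subdivision: it greedily splits the interval with the best midpoint value, whereas DIRECT's actual selection rule samples box centers and considers all potentially optimal boxes based on the Lipschitz bound. The objective is hypothetical:

```python
# Greatly simplified, one-dimensional illustration of subdividing
# promising regions (not the full DIRECT rules).
import numpy as np

def f(x):
    return (x - 0.7) ** 2 + 0.1 * np.sin(20 * x)

intervals = [(0.0, 1.0)]
for _ in range(30):
    # pick the interval with the best midpoint value (a crude stand-in
    # for DIRECT's potentially-optimal box selection)
    best = min(intervals, key=lambda ab: f(0.5 * (ab[0] + ab[1])))
    intervals.remove(best)
    a, b = best
    m = 0.5 * (a + b)
    intervals += [(a, m), (m, b)]          # subdivide the promising region

x_best = min((0.5 * (a + b) for a, b in intervals), key=f)
```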

5.6 Model-Based Methods

Model-based methods choose an indirect route. To find good solutions, model-based methods do not evaluate design candidates according to a metaheuristic or direct search strategy but use evaluated design candidates to construct an approximation of the unknown black-box function. The resulting surrogate

model—also known as a meta-model or response surface—guides the search for good design candidates (Simpson et al. 2001; Koziel et al. 2011).

5.6.1 Surrogate Models

Surrogate models are mathematical, statistical, or machine-learning models of processes that take a long time and/or are difficult (i.e., costly) to evaluate, such as physical experiments and time-intensive simulations. Surrogate models interpret known outcomes to replace slow and exact processes, such as simulations, with faster and potentially less accurate processes.


Figure 5.4 Three types of simulation-based, black-box optimization: (a) optimize the output of the exact simulation directly, (b) optimize the approximate output of a surrogate model constructed from prior simulation results, and (c) optimize and iteratively update the surrogate model during the optimization process.

Importantly, surrogate models not only serve to guide optimization processes, but also provide performance estimates of unexplored design candidates to a reasonable degree of accuracy (Costa and Nannicini 2014). For example, a surrogate model can attempt to infer the unknown daylight performance of a given parametric design candidate from the known daylight performance of other candidates from the same parametric model.

5.6.2 Model-Based Optimization

A simple approach to using a surrogate model for optimization is to first train a model and then to replace the original time-intensive simulation with it. This replacement improves the speed of any optimization algorithm by providing much faster function evaluations and allows the application of statistical techniques that require many points, such as sensitivity analysis. Eisenhower et al. (2012) describe such an approach for building energy problems. A major disadvantage of this method is that the initial accuracy of the surrogate model limits the achievable optimization result. To improve this accuracy, one must increase the initial number of points, which can negate the original speed advantage. Model-based optimization methods, on the other hand, employ different strategies to iteratively improve the accuracy of the surrogate model by evaluating points during the optimization process (Figure 5.4).

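A minimal sketch of this iterative strategy (Figure 5.4c) follows, assuming SciPy's RBFInterpolator as the surrogate and a stand-in function for a time-intensive simulation. Note that this sketch searches the surrogate purely exploitatively; actual model-based algorithms such as RBFOpt additionally balance exploitation with exploration of unvisited regions (section 5.6.5):

```python
# Minimal sketch: iterative model-based optimization with a surrogate.
import numpy as np
from scipy.interpolate import RBFInterpolator
from scipy.optimize import differential_evolution

def simulate(x):  # stand-in for a time-intensive simulation
    return np.sum((x - 0.3) ** 2) + 0.1 * np.sin(10 * x).sum()

rng = np.random.default_rng(42)
bounds = [(0.0, 1.0)] * 4                      # four design parameters
X = rng.uniform(0, 1, size=(10, 4))            # initial sample of candidates
y = np.array([simulate(x) for x in X])         # expensive evaluations

for it in range(20):                           # small evaluation budget
    surrogate = RBFInterpolator(X, y, smoothing=1e-6)  # (re)fit to all data
    # Search the cheap surrogate instead of the simulation ...
    res = differential_evolution(lambda x: surrogate(x.reshape(1, -1))[0],
                                 bounds, seed=it, maxiter=50, tol=1e-6)
    y_new = simulate(res.x)                    # ... then verify with one simulation
    X = np.vstack([X, res.x])                  # refine the model with the new point
    y = np.append(y, y_new)

print("best candidate:", X[np.argmin(y)], "objective:", y.min())
```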

Model-based methods have found numerous successful applications in engineering design (Hemker 2008; Zhang et al. 2011; Koziel and Leifsson 2013; Ulaganathan and Asproulis 2013). They have proven especially effective when evaluations are costly (Regis and Shoemaker 2007), for example when one employs time-intensive simulations such as daylighting or computational fluid dynamics.

5.6.3 Local Model-Based Methods

A prominent example of a local model-based method is the trust-region framework for nonlinear optimization (Conn et al. 2000). This framework is used for classical nonlinear programming—problems for which gradient information is available—as well as for black-box problems, which are the focus of this chapter. A trust-region algorithm builds a local model of the objective function around the current point. The next search point lies inside the trust-region radius. The algorithm adjusts this radius based on the difference between the actual improvement and the improvement estimated by the model. The algorithm expands the radius when the estimate is accurate; otherwise, it shrinks it. Trust-region methods employ different types of models, such as linear models (Powell 1994) or radial basis functions (RBFs) (Wild et al. 2008). The most common choice is a quadratic model (e.g., Powell 2009) that, in effect, represents the gradient and Hessian matrix (i.e., the local curvature) of the objective function. Under suitable conditions, trust-region methods enjoy strong convergence guarantees.
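The radius update at the heart of this framework can be sketched in a few lines; the constants and the omitted model construction are hypothetical stand-ins for the many variants in the literature:

```python
# Minimal sketch: the trust-region radius update. f_old/f_new are objective
# values at the current and trial points; m_old/m_new are the local model's
# predictions at the same points. Building the model itself is omitted.
def update_radius(f_old, f_new, m_old, m_new, radius,
                  eta=0.25, shrink=0.5, grow=2.0):
    actual = f_old - f_new                 # actual improvement
    predicted = m_old - m_new              # improvement estimated by the model
    rho = actual / predicted if predicted != 0 else 0.0
    if rho < eta:                          # poor prediction: shrink, reject step
        return radius * shrink, False
    return radius * grow, True             # accurate prediction: expand, accept
```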

5.6.4 Global Model-Based Methods

The need in engineering and architecture to optimize problems that require time-intensive numerical simulations, such as complex finite element models, has led to considerable interest in global model-based methods. This thesis presents applications of a model-based algorithm to structural, building energy, daylighting, and glare problems. Global model-based algorithms approximate the unknown objective with a surrogate model. To determine a promising point to simulate next, the algorithm searches the surrogate model (deterministically, randomly or with a metaheuristic). The algorithm then refines the model with information gained from the simulation (Figure 5.5). This iterative refinement is an advantage over approaches that require an a priori decision on the model's quality. Global model-based algorithms construct surrogate models with a variety of statistical (e.g., Polynomial Regression and Kriging) and machine-learning-related (e.g., RBFs, Neural Networks, and Support Vector Machines) techniques. Due to their ability to model complex design spaces, Kriging and radial basis functions are particularly suitable for simulation-based problems from engineering design (Forrester et al. 2008) and, by extension, ADO. Opossum, the optimization tool presented here, interpolates design spaces with RBFs (Regis and Shoemaker 2007; Costa and Nannicini 2014) (Figure 5.2). Compared to other types of black-box algorithms, global model-based algorithms require additional calculations to, at every iteration, regenerate the surrogate model. But this additional cost is negligible once function evaluations become more expensive. For RBFOpt 1.0—an early version of the algorithm on which

Opossum relies—this is the case once a performance simulation takes more than a few seconds (Wortmann and Nannicini 2016). 20 One of the earliest global model-based algorithms is Efficient Global Optimization (EGO) (Jones et al. 1998), which fits a linear combination of Gaussian functions to points with known objective function values and selects the next point by maximizing the expected improvement of the objective function. With this type of model, the variance of the Gaussian functions at a point is a measure of the model's accuracy. In practice, experimental studies indicate that other types of models typically yield better results (Holmström 2008), for example RBFs (Gutmann 2001) or ensemble models (Müller and Shoemaker 2014). Wang et al. (2004) present a global, model-based optimization algorithm that does not search the (quadratic regression) surrogate model, and instead samples towards the best-known design candidate. But their algorithm requires a relatively large number of function evaluations per iteration and risks getting stuck in local optima (ibid.).

Figure 5.5 Diagram of three iterations of a global, model-based algorithm. The dashed lines indicate the approximated design space.

20 Note that, since version 2.1.0, RBFOpt has more than halved its computational cost (Nannicini 2017a).


5.6.5 The Radial Basis Function Method

RBFOpt implements a global, model-based algorithm using RBF models (Costa and Nannicini 2014). RBF models interpolate the unknown objective value $s_k(x)$ of a point $x$ by linearly combining $k$ radially symmetric functions $\phi$—one for every known point $x_i$—with a vector of model parameters $\lambda$ and a polynomial $p$:

$$s_k(x) := \sum_{i=1}^{k} \lambda_i \, \phi(\lVert x - x_i \rVert) + p(x)$$

In other words, the "influence" of a known point depends on its distance to the unknown point the model is interpolating. RBFOpt offers five types of functions $\phi$, i.e., types of interpolations: linear ($\phi(r) = r$), cubic ($\phi(r) = r^3$), thin plate spline ($\phi(r) = r^2 \ln r$), multiquadric ($\phi(r) = \sqrt{r^2 + \gamma^2}$), and Gaussian ($\phi(r) = \exp(-r^2 / (2\gamma^2))$), where $\gamma$ is an additional shape parameter that is usually set to one. For an individual problem, every type results in a model with different accuracy. Conveniently, RBFOpt's automatic model selection feature "eliminates the difficult choice of the basis function, yielding in the worst case a small performance loss with respect to the best basis function" (Costa and Nannicini 2014).
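A minimal sketch of fitting this model with a cubic basis and a linear polynomial tail, assuming hypothetical sample points; RBFOpt adds the automatic model selection and numerical safeguards described above:

```python
# Minimal sketch: solving for the RBF weights lambda and the polynomial
# coefficients of s_k(x), using a cubic basis phi(r) = r^3.
import numpy as np

def fit_rbf(X, y):
    k, n = X.shape
    r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise distances
    Phi = r ** 3                                # cubic basis evaluated at known points
    P = np.hstack([X, np.ones((k, 1))])         # linear polynomial tail [x, 1]
    # Interpolation conditions plus orthogonality of lambda to the polynomials:
    A = np.block([[Phi, P], [P.T, np.zeros((n + 1, n + 1))]])
    rhs = np.concatenate([y, np.zeros(n + 1)])
    coef = np.linalg.solve(A, rhs)
    lam, c = coef[:k], coef[k:]
    def s(x):                                   # the surrogate s_k(x)
        d = np.linalg.norm(X - x, axis=1)
        return lam @ d ** 3 + c[:-1] @ x + c[-1]
    return s

X = np.random.default_rng(0).uniform(0, 1, (8, 2))   # 8 known candidates
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2               # known objective values
s = fit_rbf(X, y)
print(s(np.array([0.5, 0.5])))                       # interpolated estimate
```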


A critical question for global model-based algorithms is how to balance the need for improving the accuracy of the model—which entails exploring a problem's design space in its entirety—with the goal of improving the value of the objective function—which entails exploiting a promising area of limited size. Algorithms using RBF models typically address this question by choosing the next point based on two criteria: the potential improvement of the objective function value and the likelihood of improving the model's accuracy. For example, Regis and Shoemaker (2007) assess the former as a candidate point's predicted objective function value, and the latter as the candidate point's distance from previously evaluated points. Alternatively, algorithms following Gutmann (2001) measure a model's bumpiness (i.e., a measure of how much the model oscillates) to assess the likelihood that a point will improve both the objective function value and the model's accuracy. Another important concept of the RBF-method is the "cycle" (Regis and Shoemaker 2007). A cycle consists of several iterations, during which the relative importance of the distance from previous points and the predicted objective function value changes. The beginning of the cycle emphasizes distance (i.e., global search or exploration), while the end emphasizes the objective function value (i.e., local search or exploitation). In other words, the RBF-method balances global and local search by repeatedly shifting from one to the other. RBFOpt 4.0.1—the implementation of the RBF-method employed by Opossum—augments this balance by also including a dedicated, local model-based algorithm (Nannicini 2017b). Under weak assumptions, the RBF-method is guaranteed to converge to the global optimum as the number of evaluations goes to infinity (Gutmann 2001). The RBF-method has proven to converge significantly faster than metaheuristics (Regis and Shoemaker 2007) and is competitive with state-of-the-art direct search methods (Costa and Nannicini 2014). Holmström et al. (2008) conclude that for multimodal black-box optimization problems with constraints "[model-based] deterministic derivative-free


methods compare well with the derivative-based ones, but the stochastic genetic algorithm solver is several orders of magnitude too slow for practical use."

5.6.6 Model-Based Pareto Methods

Increasingly, model-based optimization is employed to also approximate Pareto fronts. ParEGO (Knowles 2006)—a multi-objective extension of EGO—employs only a single surrogate model. ParEGO calculates the objective values for the surrogate model as a weighted sum and recalculates the model with different sets of weights at every iteration (Figure 5.1b). The algorithm approximates the Pareto front by finding good solutions for the different sets of weights. Knowles and Nakayama (2008) survey model-based Pareto optimization in more detail.

5.7 Mathematical Benchmark Results

This section summarizes benchmark results of single-objective black-box methods on mathematical test problems. There are far fewer benchmark results for Pareto-based algorithms, which are not considered here. Rios and Sahinidis (2013) benchmark 22 state-of-the-art black-box software implementations on an extensive set of 502 convex and non-convex, continuous and discontinuous mathematical test problems with up to 300 variables. Although no single algorithm outperforms all the others, they identify four algorithms that, in most cases, are sufficient to find the best result within 2,500 function evaluations: the commercial TOMLAB/MULTI-MIN and TOMLAB/GLCCLUSTER, and the free MCS and SNOBFIT. TOMLAB/MULTI-MIN is a multi-start algorithm that conducts repeated local searches from different starting points, while TOMLAB/GLCCLUSTER is an implementation of DIRECT


hybridized with clustering techniques. The open-source SNOBFIT (Huyer and Neumaier 2008) implements a global model-based algorithm that does not employ a single global model but multiple local ones. The tested metaheuristics, which include GAs and a variant of SA, perform poorly, except for the non-convex, discontinuous problems, on which PSO and CMA-ES came in third and sixth, respectively. Costa and Nannicini (2014) compare selected software on problems with and without integer variables, using a small budget of objective function evaluations ($30(n+1)$, where $n$ is the number of variables of each optimization problem).

In this setting, which is relevant for ADO whenever simulations take a long time,

efficiency is crucial. They conclude that on problems without integer variables, RBFOpt and KNITRO (a commercial implementation of a multi-start local algorithm) are more efficient than SNOBFIT and NOMAD (an open-source implementation of the local direct search algorithm MADS) and consistently find better solutions within the allowed small budget. Among the software tested in (Costa and Nannicini 2014), only RBFOpt and NOMAD handle integer variables, and RBFOpt outperforms NOMAD on a benchmark set. KNITRO, NOMAD and RBFOpt also participated in the 2015 and 2016 editions of the Black-Box Optimization Competition (Loshchilov and Glasmacher 2017). These extensive benchmarks show that KNITRO performs better than RBFOpt, which in turn is more efficient than NOMAD. Costa and Nannicini (2014) analyze the 2015 benchmark problems—which had up to 64 variables and function evaluation budgets of up to 6,400—and conclude that KNITRO works best because the relatively large budget of allowed function evaluations permits the estimation


of gradients by finite differences, whereas with smaller budgets, the model-based RBFOpt is more efficient. Recent results thus suggest that model-based and direct search methods are the best available methodologies, but the specific choice depends on the characteristics of the problem. Chapter 7 sheds additional light on the strengths and weaknesses of each algorithm with benchmark results from simulation-based problems.

5.8 Black-Box Optimization Methods in ADO

The ADO literature typically seems unaware of the state-of-the-art black-box optimization software discussed in the previous section, let alone of the associated benchmark results. Instead, the literature focuses on metaheuristics, which, despite the skepticism of the mathematical optimization community, are popular with architectural theorists and researchers alike.

5.8.1 Metaheuristics in Architectural Theory

Some architectural theorists find the randomized strategies and biological analogies employed by metaheuristics conceptually appealing. For example, De Landa (2002) argues "that the productive use of genetic algorithms necessitates the deployment of three philosophical forms of thought: populational, intensive and topological" and Frazer (1995) imagines a "new architecture [that] will emerge on the very edge of chaos, where all living things emerge, and [that] will inevitably share some characteristics of primitive live forms." According to Oxman (2006), "evolutionary techniques [such as genetic algorithms] have been part of a long research tradition exploring computational mechanisms of form


generation.” Carpo (2015) does not directly mention metaheuristic optimization, but endorses randomized, heuristic search strategies:

[The engineers] may not have seen it this way, but by simulation and iteration they … generated a vast and partly random corpus of many very similar structures that all failed under certain conditions; and they chose and ultimately replicated one that didn't. This is a far cry from how a modern engineer would have designed that structure – which is one reason why no modern engineer could have designed it.

Johnson (2017) succinctly summarizes the potential of parametric design for optimization, but, myopically, only through the lens of GAs (p 173):

Candidate solutions to design problems can be thought of as a set of parameters or values in the problem's solution space (its genome in a genetic space). In the absence of direct solutions or provably effective search strategies, research has focused on using computers to generate solution candidates systematically through rule-based mechanisms and parametric recombination (evolutionary algorithms), with candidates evaluated using a fitness function.

It is likely that statements such as these have contributed to an uncritical popularization of GAs and other metaheuristics in ADO.

5.8.2 Metaheuristics in ADO

Most of the applications from structural design reviewed by Kicinger et al. (2005) and Hare et al. (2013) employ genetic algorithms, and all employ metaheuristic algorithms (although sometimes in comparison with another type of algorithm). Other examples of metaheuristics used in architectural design are SA (Luebkeman and Shea 2005) and PSO (Felkner et al. 2013).


Of the 74 applications from sustainable building design surveyed by Evins (2013), 47 employ a GA, 12 another metaheuristic, and 16 local direct search. Nguyen et al. (2014) survey 90 applications related to building design, including 40 uses of GAs, 23 uses of other metaheuristics, and 9 uses of local direct search. Wortmann et al. (2015) advance two reasons for this popularity: "First, metaheuristics usually are much easier to implement than direct-search and surrogate-based methods; and second, metaheuristics are applicable to almost any kind of optimization problem, regardless of the type (e.g., continuous/discrete, linear/nonlinear, convex/nonconvex) and number of variables." Indeed, the ADO literature supposes that fitness landscapes in ADO are non-convex, non-smooth and discontinuous, and that, due to their stochastic and population-based characteristics, metaheuristics tackle such discontinuous fitness landscapes more easily than other methods and without getting trapped by local optima (Kämpf et al. 2010). Based on their seminal benchmark of six building energy optimization problems, Wetter and Wright (2004) recommend GAs as "a good choice" for non-smooth, simulation-based optimization. In their benchmark, pattern search, a local direct search method, performed well in general, but failed on one of the problems. The ADO literature cites this paper 240 times 21, often in support of—as shown in Chapter 7—unwarranted generalizations.

21

Google Scholar on 27.2.2018

118

search methods can be very efficient if the objective function doesn't have large discontinuities, otherwise it can fail or get trapped in local minima.” According to Touloupaki and Theodosiou (2017), “GAs are used to search the solution space (building parameter combinations) more efficiently, since they can handle nonlinear problems of many dimensions and are capable of processing large quantities of noisy data efficiently [and] can be applied in search spaces with many local minima.” Exceptionally in ADO, Hare et al. (2013) raise a critical voice and suggest direct search as an alternative to evolutionary algorithms. After noting the ease of implementation and versatility of evolutionary algorithms, they observe the following:

However, this [versatility] does not necessarily imply that evolutionary algorithms are the most appropriate method for black-box problems in structural engineering. In fact, we see multiple papers using evolutionary algorithms as benchmark methods …. In all of these papers, evolutionary algorithms are shown to perform comparably or worse with respect to efficiency and solution quality. This observation does spark the suggestion that evolutionary algorithms may be overused, specifically for continuous problems.

The popularity of metaheuristics in ADO is accompanied by a lack of benchmarking: Of the 74 works surveyed by Evins (2013), only Wetter and Wright (2004) compare more than three algorithms. Of the remaining works, only three compare more than two algorithms and only nine perform any comparison. Only four of these comparisons, including (Wetter and Wright 2004), are across categories, and none involve global direct search or global model-based methods.


To the knowledge of the author, structural design is the only field in ADO that employs a set of benchmark problems to test and compare optimization algorithms (Gandomi and Yang 2011). Chapter 7 challenges the popularity of metaheuristics by testing five algorithms on seven test problems.

Also note that, in operations research, typical applications of metaheuristics are combinatorial problems such as scheduling and vehicle routing (Gendreau and Potvin 2010). While such problems are very relevant for many industries, they differ from simulation-based problems in ADO: Design spaces in scheduling and vehicle routing tend to be very large, but function evaluations tend to be very fast. The popularity of metaheuristics thus might also stem from an uncritical transfer of methods from one problem domain to another.

5.8.3 Critiques of Metaheuristics

Despite the popularity of metaheuristics in ADO, the optimization community remains skeptical. For example, Conn et al. (2009) regard metaheuristics as "methods of last resort" (p 6) and Hendrix and G.-Tóth (2010) characterize developments in GA-based methods as "[aimless] genetic drift" (p 197). Audet and Hare (2017) are somewhat more positive and point out that "while heuristic methods may not have the mathematical structure of a [direct search or model-based] method, they may be nonetheless fairly effective in practice" and useful for problems "where function evaluations are very quick" or as "imbedded heuristic subroutines." 22

22 Incidentally, RBFOpt can use a GA as a subroutine (section 6.2.2).


Sörensen—an authority in the field of metaheuristics—criticizes the lack of scientific rigor in the development of new metaheuristics (2015):

The behavior of virtually any species of insects, the flow of water, musicians playing together—it seems that no idea is too far-fetched to serve as inspiration to launch yet another metaheuristic.

One reason for this skepticism is the already mentioned lack of sound theoretical performance guarantees. For direct search and model-based methods, these guarantees rest on assumptions that might not hold for practical, simulation-based problems. Nevertheless, they inspire more confidence than having (almost) no guarantees. The results from mathematical test problems in section 5.7 and from simulation-based test problems in Chapter 7 strengthen this confidence.

5.8.4 Pareto-Based Optimization in ADO

In the ADO community, many see the openness of Pareto-based optimization to human decision-making as an advantage (e.g., Evins et al. 2012). According to Radford and Gero (1980), the shape of Pareto curves provides valuable information about the trade-offs inherent in architectural design problems. 39% of the works discussed by Evins (2013) perform Pareto-based optimization, and 78% of the architects surveyed by Cichocka et al. (2017) would prefer to optimize multiple performance criteria at once. Since most Pareto-based algorithms are metaheuristics, their popularity in ADO is mutually reinforcing.

However, (OPT) can already be very difficult to solve, and multiple objectives increase this difficulty. Hamdy et al. (2016) find that, for the tested MOO algorithms and problem, Pareto fronts for the two criteria of building energy and cost stabilized only after 1,400-1,800 function evaluations. For problems in which a single function evaluation takes minutes or hours, such a large number is often impractical. Accordingly, Chiandussi et al. (2012) conclude in their benchmark of five MOO methods that "the large computational effort makes [the Pareto-based] method generally not acceptable in usual engineering problems where, e.g., the Finite Element Method is used and models with a large number of degrees of freedom are implemented."

But the ADO literature pays little heed to the increased difficulty of MOO. For example, Evins et al. (2012) characterize Pareto-based optimization as "getting more for less" and mention computational cost only as a general limitation. Similarly, Radford and Gero (1980) acknowledge the performance trade-offs of this approach only in a general sense.

In addition, the affinity between architectural design and the trade-offs reflected by Pareto fronts is imperfect. Architectural design problems tend to have many more (qualitative and quantitative) design criteria than one can formulate—let alone solve—as MOO problems. On the other hand, it sometimes is more appropriate to express performance criteria in ADO as constraints rather than as optimization objectives. For example, structural stress and deflection usually should not exceed safety thresholds, rather than be as low as possible. A similar point holds for environmental objectives such as glare and thermal comfort.

This thesis hypothesizes that architects often employ Pareto-based optimization because they prefer a range of alternatives to a single result (Cichocka et al. 2017), and not because it accurately represents trade-offs. In other words, architects might "misuse" Pareto-based optimization—which is about trade-offs, and not about the design space as such—for DSE. This hypothesis is supported by Cruz et al. (2017), whose interactive, Pareto-based algorithm finds design candidates that are less efficient but more diverse than the ones found by a non-interactive algorithm. Allegedly, this diversity "increases the probability of finding a relevant solution." In other words, their approach prioritizes exploration over performance. This thesis presents an example of the lower efficiency of Pareto-based optimization methods in section 7.3.7. Nevertheless, it agrees that exploration is important in ADO, but argues that, to be meaningful, exploration should be performance-informed, which entails finding and presenting high-performing solutions. Designers might prefer a less well-performing design candidate due to evaluation criteria that are external to the objective function, but should at least understand the characteristics of high-performing solutions. Understanding such characteristics allows an informed choice. This thesis therefore focuses on identifying efficient, single-objective optimization methods that also support understanding.

5.8.5 Model-Based Methods in ADO

Although surrogate models are more often employed in engineering design (Wang and Shan 2007), they are also relevant for architectural and urban design, since the applicable simulations, such as structural, energy, daylight, and wind simulations, often are time-consuming. For the Morpheus Hotel in Macao—designed by Zaha Hadid Architects—the structural simulation of its exoskeleton took more than twelve hours (Piermarini et al. 2016). Several recent ADO surveys


mention surrogate models, but not model-based algorithms (Evins 2013; Nguyen et al. 2014). While the ADO literature recommends metaheuristics over local direct search methods, it is largely unaware of global direct search and model-based methods, with surveys typically not discussing them at all (e.g., Attia et al. 2013; Machairas et al. 2014; Touloupaki and Theodosiou 2017). This unawareness is especially surprising since global model-based methods address the need in ADO (Attia et al. 2013; Cichocka et al. 2017) to find good solutions quickly, that is, with small function evaluation budgets.

Machairas et al. (2014) suggest that "the number of evaluations [for training a surrogate model] is probably bigger than or the same as the number needed when an optimization algorithm [has] been coupled to a building simulation engine." As such, for Machairas et al. (2014), surrogate models "have a long way to go before … being used by practicing professionals." Nevertheless, they see the ability "to create and train machine learning surrogate models … for the optimization of a difficult problem with a huge size of search space" as a desirable feature for future ADO tools. The large number of function evaluations for training is a valid point about replacing exact simulations with surrogate models, but it betrays an unawareness of model-based algorithms, which overcome this problem by refining the model during the optimization process.

Several authors present modifications of NSGA-II that construct and refine one surrogate model per objective (Brownlee and Wright 2015; Xu et al. 2016; Wood and Eames 2017; Xu et al. 2017). But these algorithms are not established model-based algorithms: They select design candidates for simulation based only on predicted performance, without taking the improvement of the models' accuracy into account. Nevertheless, on the tested problems, these algorithms are more efficient than a standard NSGA-II. Yang et al. (2015) apply a similar, proprietary Pareto-based algorithm (Montrone et al. 2014) to the design of a sports hall, without a comparison with other algorithms.

To the extent of the author's knowledge, the remaining applications of established model-based algorithms in ADO are the following: Zhang et al. (2013) and Sóbester et al. (2014) apply model-based algorithms to building energy problems. Tresidder et al. (2011; 2012) compare a Kriging-based algorithm (Forrester et al. 2008) to a GA on building energy problems and conclude that the former is more efficient. Tresidder (2014) finds a higher efficiency for the Kriging-based algorithm on six out of seven (single- and multi-objective) test problems. Nevertheless, he concludes that, due to the large computational cost of Kriging and its limited potential for parallel computing, "for most low-carbon building design problems, a stand-alone genetic algorithm is the most suitable optimisation method" (p 3). But he reaches this conclusion without comparisons to other optimization algorithms. In addition, these disadvantages are specific to the employed method and its implementation. Tseranidis et al. (2016) show that, compared to other kinds of surrogate models, Kriging models take longer to construct. RBFOpt exploits parallel computing (although this capability is not used in this thesis) and, as mentioned in section 5.6.4, has a relatively small computational cost.


This thesis presents the first applications of an RBF method to ADO problems. In these applications, the model-based algorithm usually is more efficient than the genetic algorithm, and it excels on daylighting problems, which require especially time-intensive simulations (sections 7.3.6 and 7.3.7). Presenting examples of the efficiency of model-based methods on ADO problems thus is among the key contributions of this thesis.

5.8.6 Advantages of Model-based Methods for ADO

Beyond efficiency, the surrogate models resulting from model-based methods have the potential to connect co-evolving architectural design problems with well-defined optimization problems: They allow more insightful visualizations (which otherwise would take too long to construct) and more interactive optimization processes that provide real-time feedback on a human designer's choices (section 3.3.5). For example, Geyer and Schlüter (2014) combine the parametric model of a building's façade with a surrogate model of its energy consumption. This combination allows designers to explore different design candidates relative to their predicted energy performance.

Jones et al. (1998) identify three advantages of "the response surface approach" (i.e., model-based optimization) for engineering design:

First, the technique often requires the fewest function evaluations of all competing methods. This is possible because, with typical engineering functions, one can often interpolate and extrapolate quite accurately over large distances in the design space. Intuitively, the method is able to 'see' obvious trends or patterns in the data and 'jump to conclusions' instead of having to move step-by-step along some trajectory.

Second, [for some model-based algorithms] the response surface approach provides a credible stopping rule based on the expected improvement from further searching … Third, the response surface approach provides a fast approximation to the computer model that can be used to identify important variables, visualize the nature of the input–output relationships, and quantify tradeoffs between multiple objectives. In short, the approach not only provides an estimate of the optimal point, but also facilitates the development of intuition and understanding about what is going on in the model.

Similarly, an article that the author partially intended as a roadmap for this thesis discusses four "Advantages of Surrogate Models for Architectural Design Optimization" (Wortmann et al. 2015): Speed, Exploration, Enhancement 23, and Changed Objective.

• Speed is the higher efficiency (i.e., the higher speed of convergence) of model-based optimization methods relative to others, as well as the much faster feedback of approximate surrogate models relative to exact simulations.

• Exploration is the potential of this real-time feedback for interactive, performance-informed DSE.

• Enhancement is the possibility of improving the accuracy of a surrogate model during a designer's exploration by recalculating it with additional simulation results from promising design candidates. With enhancement, designers can interactively improve the accuracy of surrogate models based on their preferences for certain regions of the design space. Indeed, Machairas et al. (2014) see the ability to direct "the search to selected areas, thus improving efficiency using expert knowledge" as desirable for future ADO tools.

23 The original paper refers to the "enhancement" of surrogate models as "refinement." This thesis uses "enhancement" of surrogate models to disambiguate it from "enhancement" as a concept in performance-informed design (section 3.3.5).

• Changed objective is the possibility of building different surrogate models from the same set of simulation results but with different objective functions (see the sketch below). In this way, designers can understand the impact of individual evaluation criteria when, as often is the case, an objective function consists of multiple terms. One example of such multiple terms is an objective function that contains separate terms for the conflicting criteria of daylight quality and glare. Another is an objective function that penalizes a structure's weight when it exceeds stress and/or displacement limits. To better understand the individual impact of these criteria, one can construct and examine several models with different weights given to, for example, glare or displacement. Since these models rely on a single set of simulation results, constructing them takes only a very small amount of time.
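To illustrate the changed-objective idea, the following minimal Python sketch recombines cached simulation terms under different criterion weights. The record fields, values, and weights are hypothetical placeholders for stored daylight and glare results, not data from this thesis.

    # Hypothetical cache of simulation results: each record stores the design
    # variables and the individual simulation terms (here: UDI and glare).
    records = [
        {"x": (0.2, 0.7), "udi": 0.81, "glare": 0.28},
        {"x": (0.5, 0.3), "udi": 0.74, "glare": 0.21},
    ]

    def objective_values(records, w_glare):
        # Recombine cached terms into a new objective without re-simulating,
        # so a surrogate model can be rebuilt for each weighting almost for free.
        return [(r["x"], (1.0 - r["udi"]) + w_glare * r["glare"]) for r in records]

    daylight_focused = objective_values(records, w_glare=0.5)
    glare_averse = objective_values(records, w_glare=2.0)
    print(daylight_focused, glare_averse)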

In Chapters 8 and 9, this thesis presents and evaluates a performance-informed DSE method and tool—the Performance Explorer—that harnesses the first three advantages. The fourth advantage is especially relevant for multi-objective, performance-informed DSE, which is a promising area for future work.


5.9 Chapter Summary

The ADO literature exhibits a preference for metaheuristics over local direct search methods and considers global direct search and model-based methods only sparingly. Considering (1) the small number of benchmarks in ADO, (2) the much more extensive, contrasting benchmark results from mathematical optimization, and (3) the untapped potential of model-based methods for simulation-based problems, one might well consider this preference a bias. This chapter identifies four potential reasons for this bias: (1) the fragmentation of the literature on black-box optimization, (2) the conceptual appeal of biological analogies, (3) a preference for Pareto-based optimization, and (4) the abovementioned small number of benchmarks on ADO problems.

An underlying reason might be a preference among designers for exploration over exploitation. But this would be a false dichotomy: If one cares about performance at all, what good is exploration when it is not well-informed? To understand the characteristics of well-performing solutions, one first needs to identify them, which requires efficient optimization methods. This thesis therefore rigorously tests the effectiveness of a global, model-based method and examines its potential for performance-informed DSE.

The next chapter surveys existing ADO tools—including Opossum, a model-based optimization tool developed during the author's thesis research—and their visualization features.


6 Tools for Architectural Design Optimization

This chapter surveys existing optimization tools for architectural designers and describes the development and implementation of Opossum (OPtimizatiOn Solver with SUrrogate Models), the model-based optimization tool developed by the author. 24

6.1 Optimization Tools for Architectural Designers

Most researchers in ADO rely on external programs for optimization, such as MATLAB or GenOpt® (Wetter 2001), whose connection to geometry-creation and simulation packages requires special expertise or custom tools (Nguyen et al. 2014). Palonen et al. (2013) survey such external optimization programs in the context of building energy optimization. Turrin et al. (2011) and Gerber and Lin (2014) present examples of ADO workflows based on custom tools. Exceptionally, modeFRONTIER® (1998) allows the linkage of its optimization algorithms to external geometry-creation and simulation packages via a graphical user interface (GUI).

This lack of publicly available, user-friendly tools is one of the reasons why most architectural designers do not use optimization during the design process (Bradner et al. 2014). To the knowledge of the author, there are only three publicly available architectural design tools that integrate performance simulation and black-box optimization within a single GUI and without requiring programming skills: Grasshopper® (Rutten 2010a), Dynamo Studio (2017), and DesignBuilder (Singh and Kensek 2014). In contrast to the emphasis on single-objective optimization in the mathematical optimization literature, the last two offer only Pareto-based optimization.

24 The SUTD-MIT International Design Centre supported the development of Opossum (IDG215001100, PIs: Thomas Schroepfer and Giacomo Nannicini).

Grasshopper is a parametric modeling tool that is constantly extended by an active community, due to its popularity and the openness of its SDK. 25 For example, structural analysis, energy analysis, and daylight analysis are available from third parties, either as plug-ins or as easy-to-use interfaces to standalone software. Grasshopper includes Galapagos, a plug-in implementing a GA and SA. There are six third-party black-box optimization plug-ins for Grasshopper, all of which are available for free and share a similar GUI: Goat, Octopus, Silvereye, Opossum, Nelder-Mead Optimization, and Design Space Exploration. This breadth of available simulation and optimization tools makes Grasshopper an attractive platform for benchmarking optimization algorithms on ADO problems. According to Touloupaki and Theodosiou (2017), "the most popular software for computational performance-driven design optimization among architects is, by far, Grasshopper for Rhinoceros 3D."

A recent alternative is Dynamo Studio (2017), a similar parametric design tool that has a smaller community and for now offers only one black-box optimization tool: Optimo. DesignBuilder is less modular than Grasshopper and Dynamo, and offers geometry creation, simulation, and optimization in one integrated package.

25 90% of the architects surveyed in (Cichocka et al. 2017) most often use Grasshopper for parametric design.

6.1.1 Optimization Tools in Grasshopper

• Galapagos (Rutten 2010a) is included in Grasshopper and offers two metaheuristics: GA and SA. The Galapagos GA allows designers to potentially inspect the performance values and morphologies (i.e., visual characteristics) of all evaluated design candidates (i.e., design candidates that the optimization process has simulated). 26 ARUP has employed Galapagos to optimize structural designs (Binkley et al. 2014).

• Goat (Flöry et al. 2012) offers an interface to NLopt (Johnson 2010), a free library containing various gradient-based and direct search optimization methods. Goat includes the following algorithms: DIRECT, the linear trust region method COBYLA (Powell 1994), the quadratic trust region method BOBYQA (Powell 2009), the Nelder-Mead variant SUBPLEX (Rowan 1990), and the stochastic direct search method CRS2 (Kaelo and Ali 2006).

• Octopus (Vierlinger 2012) implements the Pareto-based GAs SPEA-2 and HypE. It displays an interactive visualization of the trade-off space, including solutions not on the Pareto front (Figure 6.1). Octopus can also visualize the trade-off space in terms of design candidates' morphologies, i.e., appearances (Figure 6.2). Bollinger+Grohmann often use Octopus for structural design (Heimrath 2017).

• Silvereye (Cichocka et al. 2016) offers PSO.

26 For a maximum of up to 27 generations.

Figure 6.1 Octopus' visualization of a trade-off space. A cube represents a design candidate. The visualization depicts the design candidates in a single generation. At the end of the optimization process, users can browse through all generations. Illustration: www.food4rhino.com/app/octopus, accessed 29.10.2017

Figure 6.2 Design candidates’ morphologies arranged in a trade-off space in Octopus. Illustration: www.food4rhino.com/app/octopus, accessed 29.10.2017


• Opossum (Wortmann 2016) is a global model-based optimization tool. It provides an interface to the open-source RBFOpt library (Costa and Nannicini 2014), which offers several RBF models and model-based algorithms (section 5.6.5). Like Galapagos, Opossum 1.4 allows designers to potentially inspect the performance values and morphologies of all evaluated design candidates.

• FRoG (Framework for Optimization in Grasshopper) (Wortmann and Waibel 2017) is an open-source variant of Opossum that does not include an optimization algorithm, but that makes it easy to connect existing algorithms with Grasshopper. FRoG is intended to encourage benchmarking in the ADO community. Chapter 7 employs FRoG to test the simple genetic algorithm (SGA) of Wetter and Wright (2004), as implemented by Christoph Waibel.

• Nelder-Mead Optimization (Gregson 2017) offers a multi-start variant of the Nelder-Mead algorithm (Nelder and Mead 1965) with constraint aggregation (Martins and Poon 2005).

• Design Space Exploration (Mueller et al. 2017) is an optimization and analysis toolbox. It includes (1) the Pareto-based GA NSGA-II, (2) quasi-random sampling, (3) k-means clustering, (4) surrogate modelling with Ensemble Neural Network and Random Forest models (Tseranidis et al. 2016), and (5) Stormcloud, an interactive, single-objective GA (Danhaive and Mueller 2015). Stormcloud allows designers to select the parents for the next generation from the eight best solutions of the last generation. In this way, Stormcloud allows designers to influence the GA's automated search, but the small number of choices limits its potential for performance-informed DSE.

Some of the tools discussed above allow designers to examine not only the best solution found, but all solutions (e.g., Galapagos and Opossum). Among the Pareto-based tools, Octopus stands out for its interactive GUI, which allows the selection of solutions both on and off the Pareto front.

6.1.2 Other Tools

• modeFRONTIER® (1998) is a commercial optimization software aimed at engineers that provides connections to external geometry-creation and simulation packages, including Grasshopper. It implements several well-known metaheuristic and direct search algorithms, as well as a proprietary, Pareto-based algorithm using surrogate models (Montrone et al. 2014). modeFRONTIER provides a wide range of analysis and visualization options (Figure 6.3), such as k-means clustering, scatter plots, Parallel Coordinates, and Self-Organizing Maps (Chapter 4). But since modeFrontier links to Grasshopper only as a source of performance data, it does not support the inspection of the morphologies of evaluated design candidates.

• DesignBuilder (2005) is a commercial software aimed at architects and engineers that integrates geometry creation, simulation, and optimization. It simulates energy consumption, daylight, carbon emissions, and thermal comfort and optimizes these criteria with NSGA-II. DesignBuilder cannot optimize a design's shape or structure, but only choices for material and technical systems (such as glazing, wall sections, HVAC, and lighting). As such, its potential for ADO and performance-informed DSE is severely limited.

• Optimo (Asl et al. 2014), a free, open-source plug-in for Dynamo, also implements NSGA-II.

Figure 6.3 Screenshot of modeFrontier 4.5 visualization options. Illustration: www.esteco.com/sites/default/files/design_space45.png, accessed 25.10.2017

6.1.3 Summary

Although a wide range of optimization algorithms and tools is available in Grasshopper, the metaheuristic Galapagos and the Pareto-based GA Octopus are by far the most popular tools. For example, the works surveyed in (Touloupaki and Theodosiou 2017) use only these two. This popularity probably is due to (1) bias (section 5.9), (2) the fact that Grasshopper ships with Galapagos, and, in the case of Octopus, (3) its interactive visualization of trade-off spaces.

Goat offers the only global direct search algorithm and Opossum the only global model-based algorithm. Except for modeFrontier, tools outside of Grasshopper

offer only Pareto-based metaheuristics. modeFrontier offers a wide range of optimization algorithms, but no global direct search or global model-based ones. 27

Some of the tools considered in this section also provide features for performance-informed DSE (Table 6.1). Galapagos, Octopus, and Opossum allow the inspection not only of (Pareto) optimal solutions, but of all evaluated design candidates and their morphologies. 28 Design Space Exploration and modeFrontier offer functions to analyze and, in the case of modeFrontier, visualize optimization results, but do not allow (easy) inspections of the design candidates' morphologies. Stormcloud allows such inspections, but only for the eight best design candidates per generation.

Table 6.1 Overview of ADO tools with performance-informed DSE features.

Tool                        Provides all        Provides design     Visualization of     Interactive
                            design candidates   candidates'         fitness landscapes   optimization
                                                morphologies
Stormcloud                  -                   x                   -                    x
Galapagos                   x                   x                   -                    -
Octopus                     x                   x                   -                    x
Opossum                     x                   x                   -                    -
Design Space Exploration    x                   -                   -                    -
modeFrontier                x                   -                   x                    -

27 The proprietary, Pareto-based algorithm using surrogate models (Montrone et al. 2014) is not an established model-based algorithm (section 5.8.5).

28 To allow inspection of a design candidate's morphology, these tools reset and recalculate the parametric model according to the variable values of a design candidate. If the model includes a performance simulation, the default is to recalculate it as well.

6.2 Opossum: A Novel, Model-based Optimization Tool

This section, adapted from (Wortmann 2017b), discusses the development of Opossum (OPtimizatiOn Solver with SUrrogate Models), a free, model-based optimization tool for Grasshopper. The author has led the development of Opossum during his thesis research. To the author's knowledge, Opossum is the only model-based optimization tool that is accessible to architectural designers without specialized programming skills. Creating Opossum involved linking Grasshopper to the RBFOpt library and developing a GUI.

6.2.1 Linking Grasshopper and RBFOpt

RBFOpt is programmed in Python 3.4 and relies on libraries for numerical computations (NumPy and SciPy) and auxiliary optimization (Pyomo). But Grasshopper supports only IronPython, a Python variant integrated with Microsoft's .NET framework that does not support these libraries. Opossum is written in C#, which Grasshopper supports. The C# program starts an external Python 3.4 process that runs RBFOpt. Opossum and RBFOpt exchange data via a (hidden) command line window.
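On the Python side, driving RBFOpt amounts to wrapping an objective function as a black box. The following minimal sketch uses RBFOpt's documented Python interface (RbfoptUserBlackBox, RbfoptSettings, RbfoptAlgorithm); the quadratic toy objective stands in for a Grasshopper simulation, exact names may vary between RBFOpt versions, and Opossum itself communicates with such a script through a separate process rather than calling it in-process.

    import numpy as np
    import rbfopt

    # Toy stand-in for a simulation-based objective (e.g., a Grasshopper model).
    def objective(x):
        return float(np.sum((x - 0.3) ** 2))

    # Three continuous ('R') variables in [0, 1]; integer variables ('I') are
    # how discrete design parameters would be declared.
    black_box = rbfopt.RbfoptUserBlackBox(
        3, np.zeros(3), np.ones(3), np.array(['R'] * 3), objective)

    # A small evaluation budget, in the spirit of the benchmarks in Chapter 7.
    settings = rbfopt.RbfoptSettings(max_evaluations=50)
    algorithm = rbfopt.RbfoptAlgorithm(settings, black_box)
    best_value, best_point, *_ = algorithm.optimize()
    print(best_value, best_point)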

6.2.2 Algorithmic Parameters and GUI

RBFOpt has over 40 parameters (Nannicini 2016), some of which are interrelated and some of which dramatically change its behavior. Most fundamentally, one can choose between two model-based algorithms: Gutmann (2001) and MSRSM (Regis and Shoemaker 2007). Gutmann searches the surrogate model for a point that—if it has the desired objective value—minimizes the "bumpiness" of the fitness landscape. MSRSM searches the model for points that balance improving the model's accuracy with the promise of better solutions. Both methods can use either a GA, random sampling, or mathematical solvers to search the surrogate model.

To make this complexity accessible to non-experts, Opossum's GUI consists of tabs that afford increasing levels of control (Figure 6.4):

Figure 6.4 The four tabs of Opossum's GUI. From top left in clockwise order: (1) default choice and convergence graph, (2) stopping conditions and benchmarking, (3) "expert" settings, and (4) results list.

(1) The first tab lets users choose between minimization and maximization, select one of three presets of parameters, and start and stop the optimization. The presets (Fast, Extensive, and Alternative) are based on intensive testing with mathematical test functions. "Fast" runs MSRSM with a genetic algorithm. "Extensive" is identical but spends more time on searching the model. "Alternative" runs Gutmann, which works well in certain cases. The first tab also displays an animated convergence graph to inform users about the progress of the optimization.


(2) The second tab lets users define stopping conditions based on the number of iterations or the elapsed time, and conduct and log multiple optimization runs.

(3) The third tab accepts command line parameters for RBFOpt. When desired, this "expert" window gives the user full control, with the parameters entered here overriding parameters set by the first two tabs.

(4) The fourth tab presents a list of all design candidates that the optimization process evaluated, ordered by their performance values. Clicking a set of parameters in the results list resets the parametric model to these values. In future versions, the results tab should present only design candidates with meaningful differences, since the best-performing design candidates tend to be very similar. 29

29 Bipartite modularity (Barber 2007) is a promising approach for "filtering" design candidates.

Figure 6.5 Opossum in Grasshopper. The curves on the left link to the variables and the one on the right to the objective.

In Grasshopper, Opossum follows the look and behavior of existing optimization tools, including conventions regarding the colors of optimization components and their connections to variables and objective values (Figure 6.5). Double-clicking on an optimization component opens a window with a GUI unique to

each tool, which, for Opossum, contains the four tabs discussed above. Opossum thus presents a complex and innovative optimization library in a manner that is easy to use and familiar to users of Grasshopper.

6.2.3 Co-Evolution of RBFOpt and Opossum

Although the author has not contributed directly to the development of RBFOpt, Opossum and RBFOpt nevertheless have "co-evolved" to some extent. Specifically, the development of Opossum has contributed to the development of RBFOpt in three ways:

1. While RBFOpt is developed on Linux, Opossum is developed on and exclusively for Windows. As such, Opossum helps to identify bugs in RBFOpt that relate to different operating systems.

2. The benchmarking of Opossum on simulation-based problems revealed weaknesses in local search (e.g., Wortmann et al. 2017). In these tests, Opossum quickly identified well-performing regions of the fitness landscape but failed to completely capitalize on local optima in such regions. Compared to other algorithms, Opossum exhibited the fastest convergence early on, but was overtaken by slower algorithms later. RBFOpt 3.0.1 responded to these and similar results with the integration of a trust-region method for local search (Nannicini 2017b).

3. In order to harness RBFOpt's surrogate models for performance-informed DSE, it is necessary for them to be available after the optimization is completed. Unique among optimization libraries, RBFOpt provides for this necessity by "freezing" the optimization state. The Performance Explorer presented in Chapter 9 "unfreezes" this state to approximate fitness landscapes and to refine surrogate models with design candidates selected by designers.

6.3 Chapter Summary

Opossum makes RBFOpt—a state-of-the-art, global, model-based optimization library—accessible to the ADO community by linking RBFOpt to Grasshopper via an easy-to-use GUI. 30 Opossum indirectly contributes to the continuing development of RBFOpt by suggesting directions for improvements that are relevant for ADO and performance-informed DSE.

None of the tools discussed in this chapter support both interactive DSE and visualizations of fitness landscapes. Building on the capabilities of RBFOpt and a novel visualization method for fitness landscapes presented in Chapter 8, Chapter 9 presents the Performance Explorer, a tool that supports interactive and visual, performance-informed DSE.

The next chapter presents benchmark results on simulation-based problems for five single-objective, metaheuristic, or global optimization algorithms available in Grasshopper. 31

30 The mathematical optimization community recognized RBFOpt and Opossum with the COIN-OR Cup 2016. Opossum has also received the SG Mark Award 2017 from Singapore's Design Business Chamber and was nominated for the Japanese Good Design Award.

31 Excluding Stormcloud's interactive GA.


7 Quantitative Benchmarks

The popularity of metaheuristics and the lack of benchmarks in ADO (section 5.8) raise the question which optimization methods are especially suitable for ADO, and whether the ADO literature's preference for metaheuristics is justified. This chapter compares five single-objective algorithms (GA, SA, PSO, DIRECT, and RBFOpt) on seven simulation-based test problems (two structural, three building energy, one daylight, and one daylight and glare). The building energy problems (sections 7.2.3, 7.2.4, and 7.2.5) also compare the SGA. Based on their benchmark—which includes these problems—Wetter and Wright (2004) recommend the SGA. Daylight and glare problem 7 also compares the Pareto-based HypE. The problem evaluates HypE's performance as a single-objective algorithm and the quality of the Pareto fronts found by the single-objective algorithms.

The chapter thus tests five single-objective, global, black-box optimization algorithms available in Grasshopper, one Pareto-based one, and an unpublished implementation of the SGA: GA and SA (Galapagos), PSO (Silvereye), DIRECT (Goat), RBFOpt (Opossum), and HypE (Octopus) (section 6.1.1). Testing available black-box algorithms for the most popular platform for parametric design and ADO (Cichocka et al. 2017; Touloupaki and Theodosiou 2017) ensures the relevance of the benchmark for architectural practice.

Section 7.1 presents the criteria and methodology for the benchmark, section 7.2 the benchmark problems, section 7.3 the benchmark results, and section 7.4 the limitations. Section 7.5 discusses the results in light of the ADO and mathematical optimization literature.


7.1 Benchmark Criteria and Methodology

Selection criteria for single-objective algorithms include speed of convergence (section 7.1.1), algorithmic overhead (section 7.1.2), and stability (section 7.1.3). 32 When measuring these criteria, the function evaluation budget (section 7.1.4) and algorithmic parameters (section 7.1.5) are important methodological choices.

32 Note that speed of convergence and stability also depend on the characteristics of individual problems.

7.1.1 Speed of Convergence

Most often, speed of convergence (i.e., effectiveness) is the most critical criterion. It measures how fast an optimization algorithm improves the objective value in terms of the number of function evaluations. To benchmark the speed of convergence, one graphs the values of the objective function, or an aggregate thereof, over the number of function evaluations, i.e., simulations (Moré and Wild 2009).

7.1.2 Algorithmic Overhead

Measuring speed of convergence in terms of function evaluations makes it independent of (1) an optimization algorithm's complexity, (2) the speed of particular implementations, and (3) the employed computer. Algorithmic overhead (i.e., computational cost) thus is an additional consideration. Many metaheuristics employ relatively simple calculations that require little time per iteration. Model-based methods, by contrast, typically require many more calculations to construct and search surrogate models. Direct search methods display a broad range of workloads.

When objective functions require more than a few seconds to evaluate (e.g., long simulation times), time per iteration quickly becomes negligible, which is why, in many practical cases, the number of function evaluations is the most relevant performance criterion. This chapter measures speed of convergence only in terms of the number of function evaluations.

7.1.3 Stability

Optimization algorithms that exploit randomness achieve different results when applied repeatedly to the same problem. In such cases, more stable methods result in a smaller range of outcomes. Stability (i.e., reliability) is an important criterion since an unstable method can sometimes yield unsatisfactory results even if it displays satisfactory performance on average. Stability is crucial especially for metaheuristics, due to their heavy reliance on randomness.

The ADO literature typically considers stability in the context of "robustness," which refers to the stable performance of optimization algorithms across different problems and/or for different algorithmic parameters (Wright and Alajmi 2005). As discussed in section 5.8.2, the ADO literature typically considers GAs to be particularly robust (e.g., Wright and Alajmi 2005; Attia et al. 2013).

A typical measure of stability is the standard deviation of the objective function values attained by repeated runs of the same algorithm on the same problem, for a fixed evaluation budget. Direct search methods that are fully deterministic are the most stable and always result in identical outcomes. Many optimization algorithms, including most model-based methods that employ RBFs, combine random and deterministic elements.
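As a minimal illustration of these two criteria (with made-up numbers, not this chapter's actual tooling), the following Python sketch computes the median best-so-far curve used in the convergence figures and the standard deviation of final values used in the stability plots:

    import statistics

    # Hypothetical logs: one best-so-far history per optimization run,
    # recorded after each function evaluation.
    runs = [
        [319, 280, 251, 239, 231],
        [319, 300, 262, 244, 228],
        [319, 291, 258, 236, 233],
    ]

    # Speed of convergence: median best-so-far value at each evaluation count.
    convergence_curve = [statistics.median(column) for column in zip(*runs)]

    # Stability: standard deviation of the final objective values across runs.
    stability = statistics.stdev(run[-1] for run in runs)

    print(convergence_curve)  # [319, 291, 258, 239, 231]
    print(round(stability, 1))  # 2.5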

7.1.4 Function Evaluation Budget

This benchmark considers the median objective function value and standard deviation from twenty runs on each problem. 33 To allow comparisons across all problems in this benchmark, the evaluation budget is identical for all problems (500 function evaluations). Wetter and Wright (2004) employ this evaluation budget for the "detailed" problem (Problems 4 and 5 in this benchmark). Based on the author's experience, this budget represents a compromise between practical situations, where only a small number of function evaluations is possible, and ADO benchmarks, which—perhaps reflecting the relatively slow speed of convergence of the employed metaheuristic and local direct search methods (section 5.8.2)—sometimes employ thousands of function evaluations. For example, Kämpf et al. (2010) budget 3,000 function evaluations on building energy problems and Hasançebi et al. (2009) up to 100,000 function evaluations on structural problems.

33 Since DIRECT is deterministic, a single run suffices for it.

7.1.5 Algorithmic Parameters

The choice of parameters significantly affects the performance of optimization algorithms, especially for metaheuristics (Talbi 2009). Ideally, algorithmic parameters should be "tuned" to individual problem characteristics. Nevertheless, we assume that the algorithms' authors chose sensible default parameters. Sensible defaults are important when, as in architectural practice, time pressure does not allow for extensive parameter tuning but calls for algorithms that are immediately efficient and usable by non-experts. Accordingly, the benchmark employs default parameters, but reduces the population sizes of GA and HypE to 25 to achieve a larger number of generations, i.e., a larger number of optimization steps. 34 The settings for the SGA on problems 3, 4, and 5 follow (Wetter and Wright 2004).

34 Silvereye's PSO adjusts its "Max. Velocity" parameter based on the ranges of a problem's variables. This adjustment is a simple form of automated parameter tuning and is considered "default" for the benchmark in this chapter. If this automated tuning is effective, Silvereye should be more effective than a standard PSO.

7.2 Benchmark Problems

The following simulation-based ADO problems represent a wide range (Table 7.1). They include structural, building energy, and daylight design problems and have four to forty variables. One problem from each category has discrete variables. The definitions of structural problems 1 and 2 are adapted from (Wortmann and Nannicini 2016), of building energy problems 3, 4, and 5 from (Wortmann et al. 2017), and of daylight problems 6 and 7 from (Wortmann 2017c). Daylight problems 6 and 7 further develop designs that originally were conceived for the façade of the New Jurong Church by Schroepfer + Hee (Schroepfer 2012) (pp 120-129).

Table 7.1 Overview of benchmark problems in this chapter. n_C indicates the number of continuous variables, n_D the number of discrete variables, and t the required time for a single function evaluation in milliseconds on an Intel i7 6700K CPU with 4.0 GHz and eight threads. Center indicates the objective value for a solution with values in the center of each variable range and Best the objective value for the best solution found in the benchmarks. Due to their formulation, the improvement from Center to Best for problems 5 and 6 is comparatively small.

No.  Type                n_C  n_D  t [ms]    Center                    Best                      Units
1    Structural          0    8    ~25       319                       220                       kg
2    Structural          0    22   ~300      38,500                    4,581                     kg
3    Building Energy     4    0    ~4,000    153.2                     132.9                     kWh/m²a
4    Building Energy     13   0    ~8,000    214.6                     175.5                     kWh/m²a
5    Building Energy     0    13   ~8,000    179.0                     166.3                     kWh/m²a
6    Daylight            0    15   ~45,000   13% (87% UDI)             11% (89% UDI)             -
7    Daylight and Glare  40   0    ~120,000  45% (43% UDI / 32% DGP)   20% (85% UDI / 25% DGP)   -

7.2.1 Problem 1: Structural, 8 Discrete Variables

The first benchmark problem, which is a standard problem in the structural optimization literature (Hasançebi et al. 2009), concerns a small—5.08 meter tall—transmission tower with 25 structural members belonging to one of eight beam sets (Figure 7.1). Each beam set can be assigned one of 30 pre-selected, circular cross sections. The optimization problem consists in finding an assignment of cross sections to the eight beam sets that minimizes the weight of the tower while meeting stress and displacement constraints under horizontal and vertical point loads. In other words, the design problem is to find the lightest selection of cross sections that is structurally feasible, given a fixed geometry and load.

Figure 7.1 Problem 1’s small transmission tower. The left diagram represents each of the 8 beam sets with a color. The right diagram shows the tower deforming under load (1000% exaggerated).

As is typical for black-box optimization problems, the objective function includes "soft constraints": Such a problem does not formulate constraints explicitly but includes constraint violations as penalties in the objective function. For the objective function f(x) indicated below, the p(x) are the penalty terms for constraints of the form c(x) ≤ c_max, w(x) is the structural weight, and the penalty weight w_p is 200 kg.

$$p(x) = \begin{cases} 0 & c(x) \le c_{max} \\ \dfrac{c(x)}{c_{max}} - 1 & c(x) > c_{max} \end{cases}$$

$$f(x) = w(x) + w_p \cdot \sqrt{p_1(x) + p_2(x) + p_3(x)}$$

p_1(x) corresponds to maximum stress, p_2(x) to maximum displacement along the x-axis, and p_3(x) to maximum displacement along the y-axis. The Grasshopper plug-in Karamba (Preisinger 2017) performed the first-order analyses of these stresses and displacements for problems 1 and 2.
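A minimal Python sketch of this penalized objective (the constraint limits and simulation responses below are illustrative placeholders; the actual values come from the problem definition and the Karamba simulation):

    from math import sqrt

    # Illustrative limits only; the real ones follow Hasançebi et al. (2009).
    STRESS_MAX, DISP_MAX, W_P = 1.0, 0.01, 200.0

    def penalty(c, c_max):
        # p(x): zero when feasible, relative constraint violation otherwise.
        return 0.0 if c <= c_max else c / c_max - 1.0

    def soft_constrained_objective(weight, stress, disp_x, disp_y):
        # f(x) = w(x) + w_p * sqrt(p1 + p2 + p3); the four inputs stand in
        # for the structural weight and the simulation responses.
        p = (penalty(stress, STRESS_MAX) + penalty(disp_x, DISP_MAX)
             + penalty(disp_y, DISP_MAX))
        return weight + W_P * sqrt(p)

    print(soft_constrained_objective(250.0, 0.9, 0.005, 0.005))  # feasible: 250.0
    print(soft_constrained_objective(250.0, 1.2, 0.005, 0.005))  # penalized: ~339.4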

7.2.2 Problem 2: Structural, 22 Discrete Variables

The second structural design problem, also drawn from (Hasançebi et al. 2009), involves the assignment of cross sections to the 22 beam sets of a braced dome truss (Figure 7.2). The dome has a diameter of 40 meters and a height of 8.28 meters. This problem is a more complex variant of the previous problem and considers displacements and stresses for three different load cases (dead load and snow (1) without wind, (2) with negative wind pressure, and (3) with positive wind pressure). The objective is to minimize the weight of the dome, subject to the two "soft" constraints of displacement (along all axes) and stress utilization of the structural members.

Figure 7.2 Problem 2’s dome under dead load. Each color represents one of the 22 beam sets.


Accordingly, the objective function is the same as for Problem 1, but with two instead of three constraints and with w_p set to 20,000 kg.

7.2.3 Problem 3: Building Energy, 4 Continuous Variables

Building energy problems 3, 4, and 5 originate from the seminal benchmark by Wetter and Wright (2004). This study uses a newer EnergyPlus version but the original weather files. 35 Problem 3 concerns the annual energy consumption of an office building in Seattle with four variables: the building orientation α in degrees, the window widths for the West and East façades w_W and w_E, and the shading device transmittance τ. For simplicity, the simulation only calculates two zones, one east-facing and one west-facing (Figure 7.3a).

The objective function to be minimized is the annual energy consumption in kWh/m²a, calculated as the sum of the annual heating and cooling loads (Q_h(x) and Q_c(x), divided by the typical plant efficiencies η_h = 0.44 and η_c = 0.77, respectively) and the energy for artificial lighting (E_l(x), converted into primary fuel consumption with a multiplication by three):

$$f(x) = \frac{Q_h(x)}{\eta_h} + \frac{Q_c(x)}{\eta_c} + 3E_l(x)$$

EnergyPlus 8.5.0 performed the energy simulations for problems 3, 4, and 5.

35 Christoph Waibel (EMPA, ETH Zurich) obtained the original energy models from the authors of (Wetter and Wright 2004), updated them to EnergyPlus 8.5.0 with the official EnergyPlus file updater, and prepared custom scripts to run the models in Grasshopper.
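As a worked example of this formula (a sketch with made-up loads, not results from the benchmark):

    # Plant efficiencies as given in the text (Wetter and Wright 2004).
    ETA_H, ETA_C = 0.44, 0.77

    def annual_energy(q_heat, q_cool, e_light):
        # f(x) in kWh/m²a: loads divided by plant efficiencies, plus lighting
        # converted to primary fuel consumption with a factor of three.
        return q_heat / ETA_H + q_cool / ETA_C + 3.0 * e_light

    # Illustrative loads: 30/0.44 + 40/0.77 + 3*10 = ~150.1 kWh/m²a
    print(round(annual_energy(30.0, 40.0, 10.0), 1))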


7.2.4 Problem 4: Building Energy, 13 Continuous Variables

Problem 4 concerns a more detailed office building in Houston (Figure 7.3b). The thirteen variables control the window widths and heights for the North, West, East, and South façades (w_N, w_W, w_E, w_S), the depths of the window overhangs in the West, East, and South (o_W, o_E, o_S), the setpoints of the shading devices in W/m² at the West, East, and South façades (s_W, s_E, s_S), the setpoints for the zone air temperature for night cooling during summer and winter (T_u, T_i), and the cooling design supply air temperature used for the HVAC system sizing (T_d). The objective function is identical to problem 3.

7.2.5 Problem 5: Building Energy, 13 Discrete Variables

Problem 5 is identical to problem 4 but uses discrete variables and Chicago weather.

7.2.6 Problem 6: Daylight, 15 Discrete Variables

Problem 6 considers the daylight in a single room (11.25 by 7.5 meters, with a floor-to-ceiling height of 4.3 meters), located on the Southwest corner of the third floor of a building in Singapore (Figure 7.4). The room has two façades: One with nine façade components, and one with six.

Figure 7.3 Energy models of office buildings for problem 3 (a) and problems 4 and 5 (b). The small model on the left simulates only two zones (i.e., rooms), while the larger model on the right simulates a complete floor.


Every façade component is 1.25 meters wide and perforated with a grid of 426 circular holes, with a diameter of 50 millimeters each. The holes have “flaps,” or micro-louvers, that, for each façade component, can take angles between 0° and 180° (in increments of 5°). This discretization standardizes the façade components into 37 daylight-modulating types. The optimization problem searches for a configuration for the angles of the fifteen louvered components that maximizes Useful Daylight Illuminance (UDI). UDI measures the annual percentage of time during which a sensor point receives an amount of daylight that is sufficient for office work while avoiding glare and excessive heat gains (300-3000 lux) (Mardaljevic et al. 2012).
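A minimal sketch of the UDI metric as described here (the hourly illuminance values are illustrative; the benchmark computes UDI from RADIANCE-based annual simulations):

    # Share of hours in which illuminance falls in the useful 300-3000 lux band.
    def udi(hourly_lux, low=300.0, high=3000.0):
        useful_hours = sum(1 for lux in hourly_lux if low <= lux <= high)
        return useful_hours / len(hourly_lux)

    # Illustrative sensor readings: 3 of 5 hours fall in the useful band.
    print(udi([150.0, 450.0, 800.0, 3500.0, 1200.0]))  # 0.6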

Figure 7.4 Diagram of the room Problem 6 optimizes in terms of daylight. The crosses indicate the sensor grid for simulating UDI.


To ensure a coherent appearance of the façade from the outside and a subtle modulation of daylight on the interior, the objective function penalizes design candidates for angle differences larger than 10° between neighboring façade components. The penalty function below computes a penalty value p_i for an individual façade component with angle d_i: If the angle difference with the previous neighbor is smaller than 10°, the penalty p_i is zero; otherwise, it is a squared error term. The objective value to be minimized is the difference between 100% and the average UDI ū(x) reduced by the penalty sum over the fifteen façade components:

$$p_i(x) = \begin{cases} 0 & |d_i - d_{i-1}| \le 10° \\ \left(\dfrac{|d_i - d_{i-1}| - 10°}{170°}\right)^2 & |d_i - d_{i-1}| > 10° \end{cases}$$

$$f(x) = 100\% - \left(\bar{u}(x) - \sum_{i=0}^{n} p_i(x)\right)$$
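A minimal Python sketch of this penalty (the angle list is illustrative; the benchmark's fifteen component angles come from the parametric model):

    def component_penalty(d_i, d_prev):
        # Zero within the 10-degree tolerance, squared error term beyond it.
        diff = abs(d_i - d_prev)
        return 0.0 if diff <= 10.0 else ((diff - 10.0) / 170.0) ** 2

    def penalty_sum(angles):
        # Sum over consecutive neighbors of the facade components.
        return sum(component_penalty(a, b) for a, b in zip(angles[1:], angles[:-1]))

    angles = [0.0, 5.0, 30.0, 35.0, 40.0]  # the 5 -> 30 step exceeds 10 degrees
    print(penalty_sum(angles))  # ((25 - 10) / 170) ** 2 = ~0.0078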

RADIANCE performed the daylighting and glare simulations for problems 6 and 7, via the DIVA 4.0.2 Grasshopper plug-in (Jakubiec and Reinhart 2011).

7.2.7 Problem 7: Daylight and Glare, 40 Continuous Variables

Problem 7 considers a single room in Singapore (Figure 7.5). The rectangular room has a South-facing, 10.8 meters long and 3.6 meters high façade and is 7.2 meters deep. The room's floor is raised 20 meters above ground level. The façade has a porous screen with a triangular grid of 1,692 circular openings. To avoid controlling every opening with an individual variable and to create a graduated, cloudy appearance, a grid of forty "attractor points" controls the openings, with weights in the range [0.0, 1.0].


To create a soft falloff, the parametric model calculates the radius of every opening as the average of the values of all attractor points, weighted by the inverse squares of their distances to the opening, and multiplied by the maximum radius of 65 millimeters. Openings with a radius below 10 millimeters are closed completely. This formulation results in a problem with forty continuous variables.
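A minimal sketch of this inverse-distance-squared weighting (the coordinates and weights below are illustrative, not the benchmark's actual forty-point attractor grid):

    def opening_radius(opening_xy, attractors, r_max=65.0, r_min=10.0):
        # Radius in mm: weighted average of the attractor weights, with
        # inverse-square-distance weighting, scaled by the maximum radius.
        numerator = denominator = 0.0
        for (ax, ay), weight in attractors:
            d2 = (opening_xy[0] - ax) ** 2 + (opening_xy[1] - ay) ** 2
            inverse = 1.0 / max(d2, 1e-9)  # guard against zero distance
            numerator += inverse * weight
            denominator += inverse
        radius = r_max * numerator / denominator
        return radius if radius >= r_min else 0.0  # small openings close fully

    # Two illustrative attractor points with weights 0.8 and 0.2.
    print(opening_radius((1.0, 1.0), [((0.0, 0.0), 0.8), ((3.0, 0.0), 0.2)]))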

The objectives of the optimization are to (1) maximize UDI while (2) minimizing Daylight Glare Probability (DGP). This problem calculates UDI as the average from a seven-by-five grid of sensor points. DGP measures glare as a percentage for a specific camera view and a specific point in time (Wienold 2010). This value indicates whether the glare is imperceptible (DGP < 35%), perceptible (35% ≤ DGP < 40%), disturbing (40% ≤ DGP < 45%), or intolerable (DGP ≥ 45%). To reduce calculation time, this value is simulated only for a single camera. The south-facing camera points directly at the screen and is in the center of the room at a height of 1.6 meters from the floor. 36

36 For a more realistic glare assessment, several camera views representing the users' field of view should be evaluated.

Figure 7.5 Diagram of the room Problem 7 optimizes in terms of daylight and glare. The crosses indicate the sensor grid for simulating UDI, the cone the camera position and view for simulating DGP, and the numbers the attractor points regulating the façade's porosity.

An annual glare simulation calculates direct sunlight only for the daylight hours of five days (the 21st of June, August, September, October, and December). In Singapore, these days add up to 59 daylight hours. For the remaining hours of the year, the simulation interpolates the direct sunlight contribution. Nevertheless, annual glare simulations can take hours even at low quality settings. This time-intensiveness makes such simulations impractical as an optimization objective. Instead, the problem approximates annual glare as the average of the 59 DGP values corresponding to the 59 daylight hours on which the more extensive annual glare simulations rely. Although less accurate than a full annual simulation, this approach yields a good qualitative assessment of the presence or absence of discomfort glare. An and Mason (2010) apply a similar method.

If daylight quality and glare avoidance are equally important, subtracting the average annual UDI u(x) from 100% and adding the (approximated) average annual DGP g(x) yields a single minimization objective (UDI and DGP are in the range [0, 1]):

$$f(x) = \frac{1.0 - u(x) + g(x)}{2}$$

7.3 Benchmark Results

This section presents the benchmark results, individually for each problem and as a combination of the seven problems' results.


7.3.1 Problem 1: Structural, 8 Discrete Variables

For problem 1, RBFOpt converges most rapidly and the Galapagos GA most slowly. The remaining algorithms exhibit similar speeds of convergence (Figure 7.6).

Figure 7.6 Problem 1: Convergence. The legend displays the algorithms according to their final solutions' values, from bottom to top. (Axes: median structural weight [kg] over 20 runs vs. function evaluations.)

Figure 7.7 Problem 1: Stability. This figure orders the algorithms according to their results' standard deviations, increasing from left to right. In this "box-and-whisker plot," the top lines of the colored boxes represent the upper quartiles—i.e., the 25% worst solutions lie above this line—and the bottom lines the lower quartiles. The boxes' center lines represent medians, and the crosses averages. The parallel lines at the top and bottom represent the values' ranges, except for outliers, which exceed the respective quartile by more than 150% of the interquartile range.


Note that RBFOpt converges rapidly until about 100 evaluations, then improves more slowly until about 300 evaluations, and then again improves comparatively rapidly. This behavior probably is due to a combination of restarts (which trigger after 100 evaluations without improvement) and an increase in refinement (i.e., local searches). This "unlimited refinement" starts after the algorithm has expended 60% of the function evaluation budget, i.e., after 300 evaluations.

In terms of stability, the deterministic DIRECT is the most stable, followed by SA. GA, RBFOpt, and PSO display comparable variance (Figure 7.7). For PSO and especially for RBFOpt, this variance tends towards solutions that are better than those found by DIRECT and SA. In summary, DIRECT is a good choice when stability is critical, but RBFOpt finds better solutions 95% of the time.

7.3.2 Problem 2: Structural, 22 Discrete Variables

For problem 2, RBFOpt again is the fastest converging algorithm, with SA "catching up" at around 250 function evaluations. DIRECT converges far too slowly, probably due to this problem's considerable number of variables (22 discrete variables) (Figure 7.8). Interestingly, DIRECT's recursive subdivision is discernible in the "staircase" pattern of its convergence graph.

[Figure 7.8 Problem 2: Convergence. Median structural weight [kg] (20 runs) over 500 function evaluations; algorithms DIRECT, GA, PSO, SA, RBFOpt.]

Figure 7.9 Problem 2: Stability

Interestingly, DIRECT's recursive subdivision is discernible in the "staircase" pattern of its convergence graph. Apart from the deterministic DIRECT, RBFOpt is the most stable algorithm, again closely followed by SA (Figure 7.9). Accordingly, RBFOpt is the best choice for this problem and evaluation budget.

7.3.3 Problem 3: Building Energy, 4 Continuous Variables

Problems 3, 4, and 5 also compare the SGA. For problem 3, SA, PSO, DIRECT, and RBFOpt find median solutions of very similar quality (within 0.05 kWh/m²a) (Figure 7.10). SA finds the best median solution, but DIRECT, and especially RBFOpt, converge faster for the first hundred function evaluations. After DIRECT, RBFOpt is the most stable algorithm (Figure 7.11). Accordingly, DIRECT is the best choice for this problem and evaluation budget. GA and SGA perform comparatively poorly.


[Figure 7.10 Problem 3: Convergence. Median annual energy consumption [kWh/m²a] (20 runs) over 500 function evaluations; algorithms GA, SGA, RBFOpt, DIRECT, PSO, SA.]

Figure 7.11 Problem 3: Stability

7.3.4 Problem 4: Building Energy, 13 Continuous Variables

For problem 4, SA, DIRECT, and RBFOpt find the best median solutions (within 0.3 kWh/m²a of each other) (Figure 7.12). RBFOpt is the fastest-converging algorithm until about 200 function evaluations. After DIRECT, RBFOpt is the most stable algorithm (Figure 7.13). Accordingly, RBFOpt is the best choice for smaller function evaluation budgets and DIRECT the best choice when stability is critical. GA and SGA perform similarly poorly.


[Figure 7.12 Problem 4: Convergence. Median annual energy consumption [kWh/m²a] (20 runs) over 500 function evaluations; algorithms GA, SGA, PSO, RBFOpt, DIRECT, SA.]

Figure 7.13 Problem 4: Stability

7.3.5 Problem 5: Building Energy, 13 Discrete Variables

For problem 5, SA and RBFOpt find the best median solutions (within 0.2 kWh/m²a of each other) (Figure 7.14). RBFOpt is the fastest-converging algorithm until about 200 function evaluations. After DIRECT, RBFOpt is the most stable algorithm (Figure 7.15). SA exhibits two outliers. Accordingly, RBFOpt provides the best balance between speed of convergence and stability for this problem and evaluation budget.


Again, GA and SGA perform similarly poorly.

[Figure 7.14 Problem 5: Convergence. Median annual energy consumption [kWh/m²a] (20 runs) over 500 function evaluations; algorithms GA, SGA, PSO, DIRECT, RBFOpt, SA.]

Figure 7.15 Problem 5: Stability

7.3.6 Problem 6: Daylight, 15 Discrete Variables

For problem 6, RBFOpt and DIRECT find the best median solutions (within 0.1% of each other) (Figure 7.16). DIRECT evaluates the "Center" solution first; for this problem, this solution is very good (Table 7.1), since all of its panels have medium-size openings, which incurs no penalty and results in very good daylight.


RBFOpt catches up at around 100 function evaluations. PSO and SA converge more slowly, while the GA does not find useful solutions. After DIRECT, RBFOpt is the most stable algorithm (Figure 7.17). Accordingly, DIRECT provides the best balance between speed of convergence and stability for this problem and evaluation budget.

[Figure 7.16 Problem 6: Convergence. Y-axis: 100% − f(x) = median annual UDI (with penalty), median of 20 runs, over 500 function evaluations; algorithms GA, SA, PSO, DIRECT, RBFOpt.]

Figure 7.17 Problem 6: Stability


7.3.7 Problem 7: Daylight and Glare, 40 Continuous Variables

On problem 7, DIRECT is the worst-performing algorithm because its recursive subdivision proceeds too slowly in the forty dimensions corresponding to the variables (Figure 7.18). The metaheuristics SA, GA, HypE (a Pareto-based GA), and PSO perform similarly and comparatively poorly. This result indicates that, compared to the single-objective GA, SA, and PSO, the multi-objective GA is not less efficient when evaluated as a single-objective algorithm.

[Figure 7.18: convergence plot. Y-axis: 100% − 2 × f(x) = median annual UDI − annual DGP, median of 20 runs, over 500 function evaluations; algorithms DIRECT, SA, GA, HypE, PSO, RBFOpt.]

Figure 7.18 Problem 7: Convergence. Note the rapid improvement of RBFOpt after 40 iterations. Until this point, RBFOpt is quasi-randomly simulating design candidates. After 40 iterations (i.e., the number of variables of this problem), RBFOpt constructs the first surrogate model, which almost immediately results in a substantial improvement.

Figure 7.19 Problem 7: Stability


Opossum's RBFOpt is the best-performing algorithm and, except for the deterministic DIRECT, the most stable (Figure 7.19). The results for the hypervolume in Figure 7.20 indicate that, on this problem, the single-objective RBFOpt finds more accurate Pareto fronts than the multi-objective HypE. RBFOpt also is more stable (Figure 7.21).

[Figure 7.20: convergence plot. Y-axis: median hypervolume (percentage of the trade-off space left uncovered), median of 20 runs, over 500 function evaluations; algorithms DIRECT, SA, GA, HypE, PSO, RBFOpt.]

Figure 7.20 Problem 7 (Hypervolume): Convergence

Figure 7.21 Problem 7 (Hypervolume): Stability


[Figure 7.22: scatter plot of Pareto fronts. X-axis: UDI (Useful Daylight Illuminance), 30-90%; y-axis: DGP (Daylight Glare Probability), 15-37.5%; series DIRECT, SA, PSO, GA, HypE, RBFOpt, Best.]

Figure 7.22 Pareto fronts found during each algorithm's most representative (i.e., median) run. "Best" is the combined front from all algorithms and runs, i.e., the most accurate. (The markers' colors indicate the algorithm.)

Hypervolume is a quality measure for Pareto fronts (Zitzler et al. 2007). Here, the hypervolume is calculated as the percentage of the "non-dominated" trade-off space left uncovered by the design candidates evaluated so far (section 5.2). Except for the Pareto-based HypE, which exhibits a larger improvement relative to GA, SA, and PSO, the curves in Figure 7.20 resemble the ones in Figure 7.18. This similarity arises because improvements of the weighted objective value typically imply improvements of the Pareto front. In Figure 7.22, RBFOpt and HypE have found the closest approximations of the best-known Pareto front. The diagonal fronts indicate a tradeoff between maximizing daylight and minimizing glare—suggesting that the "useful" range of UDI avoids glare only imperfectly—although one can achieve large improvements of daylight quality by accepting small increases in glare. (But low average DGP values can contain isolated instances of intolerable glare.) This improvement is especially noticeable for RBFOpt: It finds high-daylight solutions that also suffer less glare than the next best daylight solutions by other algorithms (Figure 7.23).



HypE finds a wider front, i.e., a more diverse range of choices, but suggests a steep tradeoff between daylight and glare, while RBFOpt indicates a less dramatic tradeoff. Rather than promoting understanding, the results from HypE are misleading in terms of the true trade-off space.
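A minimal sketch of the uncovered-fraction hypervolume described above, for two minimization objectives normalized to [0, 1]; the unit reference point is an assumption, not necessarily the thesis's choice:

```python
import numpy as np

def uncovered_fraction(points, ref=(1.0, 1.0)):
    """Fraction of the trade-off space NOT dominated by the evaluated
    candidates (both objectives minimized, normalized to [0, 1]);
    lower values indicate a more accurate Pareto front."""
    pts = np.asarray(points, dtype=float)
    pts = pts[np.argsort(pts[:, 0])]        # sort by first objective
    front, best_f2 = [], np.inf
    for f1, f2 in pts:                      # keep non-dominated points
        if f2 < best_f2:
            front.append((f1, f2))
            best_f2 = f2
    covered, prev_f1 = 0.0, ref[0]
    for f1, f2 in reversed(front):          # sweep from largest f1 down
        covered += (prev_f1 - f1) * (ref[1] - f2)
        prev_f1 = f1
    return 1.0 - covered
```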

Figure 7.23 Non-dominated wall screen design with 84% UDI and 24% DGP. For office spaces in tropical climates, such wall screens could deliver substantial energy savings (from cooling and artificial lighting), while avoiding the need for louvers and other movable elements. Schroepfer + Hee proposed similar wall screens for the façade of the New Jurong Church (Schroepfer 2012, pp. 120-129).

7.3.8 Combined Results

Figure 7.24 and Figure 7.25 combine the results from all problems. They indicate the (median) objective value achieved by each algorithm as a percentage of the best-known solution. (Since all problems are minimization problems, these values are larger than 100%.) Apart from the first forty function evaluations, during which DIRECT often finds the best solutions due to its deterministic search sequence, RBFOpt is the fastest-converging algorithm, followed by PSO, SA, and DIRECT (Figure 7.24). Remarkably, given its specialization on small evaluation budgets, RBFOpt exhibits strong improvement also after 200 function evaluations, which is where PSO, SA, and DIRECT begin to stagnate. (This "later" convergence is likely due to RBFOpt's improved local search (section 6.2.3).)
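The normalization behind Figure 7.24 can be sketched as follows, assuming raw objective histories of shape runs × evaluations and a known best value per problem:

```python
import numpy as np

def relative_convergence(histories, best_known):
    """Median best-so-far value per evaluation as a percentage of the
    best-known solution (>= 100%, since all problems minimize)."""
    best_so_far = np.minimum.accumulate(np.asarray(histories), axis=1)
    return 100.0 * np.median(best_so_far, axis=0) / best_known
```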


[Figure 7.24 Problems 1-7: Convergence. Y-axis: median across problems 1-7, relative to the best solutions found for each problem (100-140%); x-axis: function evaluations (0-500); algorithms GA, DIRECT, SA, PSO, RBFOpt.]

Figure 7.25 Problems 1-7: Stability. Note that the deterministic DIRECT exhibits identical performance across runs on each individual problem, but varying relative performance across different problems.

RBFOpt is the most stable algorithm, and GA the most unstable (Figure 7.25). As such, RBFOpt is the only algorithm that performs well in terms of both convergence and robustness and that shows significant improvement after 200 function evaluations.


DIRECT performs well on problems with smaller numbers of variables. As such, it is worth trying on such problems, especially because its deterministic nature makes it more reliable than PSO and SA, which otherwise perform comparably. The GA performs significantly worse than the remaining algorithms. Given its popularity in architectural theory (section 5.8.1) and ADO (section 5.8.2), the GA's performance is remarkably disappointing, especially relative to other metaheuristics such as SA and PSO.

7.4 Limitations

In addition to the choices of the function evaluation budget (section 7.1.4) and algorithmic parameters (which were left as defaults (section 7.1.5)), this section identifies two limitations: (1) the size of the problem set and (2) the number of tested algorithms.

7.4.1 Size of the Problem Set

The most critical limitation of this benchmark is its problem set, since the performance of optimization algorithms is problem-dependent (section 5.1). Compared to mathematical benchmarks (section 5.7), this problem set is small, which limits its generalizability. Nevertheless, in the author's opinion, the problem set combines distinct types of simulation-based ADO problems, numbers and types of variables, and evaluation times in a diverse and relevant manner. Note that, in contrast to mathematical benchmarks, this benchmark employs simulation-based problems instead of mathematical test functions.

The number of problems (seven) also is comparable to well-cited ADO papers such as (Wetter and Wright 2004)—two building energy problems with three different weather files, i.e., six problems—and (Hasançebi et al. 2009), with five structural problems. Most ADO papers present only a single ADO problem and algorithm (e.g., Turrin et al. 2012; Lin and Gerber 2014; Weng et al. 2015; Nagy et al. 2017). Compared to other ADO papers, this problem set is more varied since it covers three ADO domains: (1) structure, (2) building energy, and (3) daylight. The author also is not aware of any other ADO benchmarks on daylighting, and especially glare, problems.

7.4.2 Number of Tested Algorithms

Similarly, the number of algorithms (five) that were tested across all problems is small compared to, for example, mathematical optimization competitions (section 5.7). It also is slightly smaller than in (Wetter and Wright 2004), with nine algorithms, and (Hasançebi et al. 2009), with seven algorithms. But Wetter and Wright (2004) test only local direct search, metaheuristic, and hybrid methods, while Hasançebi et al. (2009) test only metaheuristics. To the author's knowledge, no other ADO benchmark includes both global direct search and model-based algorithms, except (Wortmann and Nannicini 2016; Wortmann 2017b; Wortmann 2017c; Wortmann et al. 2017).


7.5 Discussion

This benchmark indicates that—for practical, simulation-based, and time-intensive ADO problems with modest evaluation budgets—a global model-based method such as RBFOpt is the most likely to yield the best results.

7.5.1 Critique of Genetic Algorithms in ADO

In the seminal study by Wetter and Wright (2004), the authors conclude that the hybrid PSO/Hooke-Jeeves algorithm finds the best solutions and the SGA offers faster convergence at a slight decrease in solution quality. In this benchmark, the Galapagos GA is the worst-performing algorithm on problems 1, 3, 4, and 5 and the SGA the second-worst on problems 3, 4, and 5. The latter three problems replicate problems from (Wetter and Wright 2004). The Galapagos GA also is the worst-performing algorithm overall. (Without the comparison with the SGA, one might have suspected that the poor performance of the GA is due to an ineffective implementation by Rutten (2010b). But the comparison shows that the GA is only slightly worse than the SGA.) These results contradict not only (Wetter and Wright 2004), but most of the ADO literature, which often exhibits a bias towards GAs (section 5.9). On the other hand, the results are unsurprising in light of benchmark results (section 5.7) and mathematical proofs of convergence (section 5.8.3). Generalizations such as "evolutionary algorithms are robust in exploring the search space for a wide range of building optimization problems" (Attia et al. 2013) demand critical scrutiny and more extensive benchmarking.


The poor performances of the GAs in this benchmark also problematize the common practice (Hare et al. 2013; Evins 2013) of using GAs as a baseline in (already rare) ADO benchmarks. At least compared to the Galapagos GA, almost any algorithm will be better most of the time, which can give rise to the spurious innovations decried by, for example, Hendrix and G.-Tóth (2010) and Sörensen (2015). Instead, benchmarks should include state-of-the-art methods such as DIRECT and RBFOpt.

7.5.2 Single- vs. Multi-objective Optimization

Although this benchmark touches on multi-objective optimization (MOO) only for problem 7, its results nevertheless indicate that one should apply Pareto-based MOO only when a large budget of function evaluations is available and the efficiency of finding good solutions is less important. One should not necessarily apply a Pareto-based algorithm when one wants to optimize a problem with more than one performance criterion: single-objective algorithms focus on finding good values for the objective function, and Pareto-based algorithms focus on providing a choice of different trade-offs between conflicting performance criteria. Single- and multi-objective algorithms thus have different applications both in terms of the goal of the optimization process and the required budget of function evaluations, a point which the ADO literature often does not discuss explicitly enough (e.g., Radford and Gero 1980; Evins 2013, see section 5.8.4). A near-total absence of benchmarks—with (Hamdy et al. 2016) a notable exception—makes many MOO ADO case studies hard to evaluate in terms of performance and practical relevance.

For example, Nagy et al. (2017) use a Pareto-based algorithm to optimize an office layout in terms of six objectives for 100 generations with a population size of 100 (i.e., 10,000 function evaluations). This problem likely is exceedingly difficult due to its large number of variables (45) and objectives (6), but the lack of a comparison with other, state-of-the-art optimization methods makes it very difficult to evaluate the result.

7.5.3 Conclusion

In short, this benchmark questions (1) the preference for Pareto-based algorithms in ADO (section 5.8.4) and (2) the bias towards GAs in the ADO literature (section 5.9). Instead—and in reference to Research Question 2—this benchmark confirms an insight from the mathematical optimization community: mathematically well-grounded methods often achieve reliable results also on practical problems where proofs of convergence might not necessarily hold, such as in ADO. Specifically, the global model-based method RBFOpt is the only algorithm that performs well across all test problems.


8 Introducing Performance Maps

This chapter, adapted from (Wortmann 2017a), presents a novel, reversible method to visualize high-dimensional parametric design spaces and fitness landscapes, with applications in performance-informed DSE. The method is an extension of Star Coordinates using triangulation-based interpolation via Barycentric Coordinates. It supports the understanding of design problems in ADO by allowing designers to move back and forth between a high-dimensional design space and a low-dimensional performance map. The chapter shows how to construct a performance map (section 8.1), followed by an example and comparison with existing techniques (section 8.2), a discussion of potential applications (section 8.3), and limitations (section 8.4).

8.1 A Novel Method for Visualizing Optimization Results

The method presented in this chapter extends the metaphor of fitness landscapes into higher dimensions through a multivariate visualization that represents performance not as elevation but as color. An extension of Star Coordinates, the method allows mappings from two-dimensional performance maps to high-dimensional design spaces. The method achieves this reversible, surjective mapping in four steps: (1) Project the evaluated design candidates (i.e., candidates whose performance has been simulated) onto two dimensions using Star Coordinates, (2) compute a Delaunay triangulation for the projected points in the two-dimensional map, (3) approximate the performance of unexplored design candidates (i.e., candidates with unknown performance values) by interpolating their parameter values based on the corner points of the triangulation's triangles, and (4) estimate their performance values via a surrogate model. (Alternatively, one can interpolate the performance of unexplored design candidates directly, but this method is less accurate.)

8.1.1 Star Coordinates

The method projects evaluated design candidates using Star Coordinates (Figure 8.1). Although other arrangements are possible, here the coordinate axes are spaced equally around a circle, which is typical. To avoid skewing the visualization, it is advisable to normalize the parameters to the same range (e.g., [0, 1]). Star Coordinates determines the two-dimensional position p of an n-dimensional design candidate by multiplying the value of each parameter with the vector representing the corresponding coordinate axis and adding the resulting scaled vectors. In other words, the two-dimensional embedding p ∈ ℝ² of the n-dimensional design candidate x ∈ ℝⁿ is a linear combination of n two-dimensional coordinate vectors v:

p = x₁v₁ + x₂v₂ + ⋯ + xₙvₙ
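A minimal sketch of this projection, assuming parameters already normalized to [0, 1] and axes spaced equally around a circle, as above:

```python
import numpy as np

def star_coordinates(X):
    """Project n-dimensional candidates (rows of X, normalized to
    [0, 1]) onto 2D: p = x1*v1 + ... + xn*vn."""
    n = X.shape[1]
    angles = 2.0 * np.pi * np.arange(n) / n
    V = np.column_stack([np.cos(angles), np.sin(angles)])  # n x 2 axes
    return X @ V
```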

This linear mapping makes Star Coordinates easy to understand and allows visual estimates of design parameters (Rubio-Sanchez et al. 2016). A complication of Star Coordinates is that two design candidates with different parameters can map onto an identical point. For example, candidates with the same value for all parameters (all zero, all one, etc.) map onto the origin of the coordinate axes. From a mathematical standpoint, a two-dimensional point maps to an infinite number of parameter sets in the higher-dimensional space, since the two-dimensional mapping is a linear combination of non-independent vectors. This mapping to more than one parameter set makes Star Coordinates surjective. Performance Maps resolve this complication by preferring the better-performing design candidate when several candidates map onto the same point. (In practice, candidates rarely overlap.) Performance Maps thus are biased slightly towards better-performing candidates.

8.1.2 Triangulating a Surjective Mapping

The Delaunay algorithm (Press et al. 2007) provides a quick (O(m log m)) and unique triangulation for m points in the plane. The triangulated points are termed vertices and the connections between the vertices edges. Since every set of 2D coordinates belongs to exactly one triangle (with coordinates coinciding with edges or vertices a special case), the triangulation provides a unique set of three vertices (i.e., three evaluated design candidates) for every set of 2D coordinates (Figure 8.1).
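In a concrete implementation—sketched here with scipy, an assumption about tooling rather than the thesis's actual code, and continuing the previous sketch—the triangulation and the triangle lookup for a queried map location take only a few lines:

```python
import numpy as np
from scipy.spatial import Delaunay

P = star_coordinates(X)            # m x 2 projections of evaluated candidates
tri = Delaunay(P)                  # unique O(m log m) triangulation

q = np.array([[0.4, 0.6]])         # a queried location on the map
simplex = tri.find_simplex(q)[0]   # its triangle (-1 if outside the hull)
corners = tri.simplices[simplex]   # indices of the three evaluated candidates
```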

Figure 8.1 362 explored designs of the example optimization problem represented as two-dimensional points with Star Coordinates and triangulated with the Delaunay algorithm.


Delaunay triangulations have the attractive property that every vertex is connected to its nearest neighbor. Therefore, a point in a performance map is always close—though not necessarily closest—to the three vertices of its triangle, i.e., it always relates closely to the three evaluated design candidates that its parameters are interpolated between. Determining (several) nearest neighbors for every point is a more computationally expensive and less visually intuitive alternative.

8.1.3 Interpolating Performance Values with Barycentric Coordinates

The method approximates the performance of unexplored design candidates by linearly interpolating between either the performance values or the design parameters of the three evaluated candidates associated with an unexplored candidate through a triangle in the Delaunay triangulation. (Press et al. (2007) describe a similar, triangulation-based interpolation method.) This interpolation is necessary since one cannot directly infer a unique set of design parameters from a performance map, because the radial coordinate axes are not linearly independent. There are three possibilities for drawing performance maps: (1) interpolating performance values directly, (2) interpolating a set of design parameters and obtaining the corresponding performance value by evaluating (i.e., simulating) the resulting design candidate, and (3) interpolating a set of design parameters and approximating the corresponding performance value via a surrogate model. The first method requires only a simple calculation. However, the resulting performance map will add little beyond a mapping of the evaluated design candidates alone.


The second method is the most accurate but will take a prohibitively long time in most cases. (Depending on their size and resolution, performance maps can require tens of thousands of performance values.) Figure 8.2 was created with the third method, discussed in more detail in the next section. One interpolates performance values or design parameters with Barycentric Coordinates (Press et al. 2007), a common technique for mapping textures in 3D computer graphics. Barycentric coordinates associate every point inside a triangle with three weights, one for each corner point. These weights always sum to one. For example, the weights for one of the corner points are (1, 0, 0). One uses these weights to interpolate between values associated with the corner points of a polygon, such as, in this case, a performance value or one of a set of design parameters. To interpolate a full set of design parameters, one interpolates the design parameters one by one.
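A sketch of the barycentric weights and the parameter-wise interpolation, continuing the previous sketches; the 2 × 2 solve follows directly from q = a + w_b(b − a) + w_c(c − a):

```python
import numpy as np

def barycentric_weights(a, b, c, q):
    """Weights (wa, wb, wc) of 2D point q in triangle (a, b, c);
    they sum to one and are (1, 0, 0) at corner a."""
    T = np.column_stack([b - a, c - a])
    wb, wc = np.linalg.solve(T, q - a)
    return 1.0 - wb - wc, wb, wc

def interpolate_parameters(xa, xb, xc, weights):
    """Interpolate a full parameter set from the three evaluated
    candidates at the triangle's corners (parameter by parameter,
    here vectorized)."""
    wa, wb, wc = weights
    return wa * xa + wb * xb + wc * xc
```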

8.1.4 Approximating Performance Values with a Surrogate Model

A more sophisticated method to approximate the performance of unexplored design candidates is the employment of surrogate models (section 5.6.1). Using interpolated parameter values, surrogate models quickly generate performance estimates that are more accurate than directly interpolated performance values. Even with a surrogate model, drawing the performance map in the example below takes several minutes. To further speed up the drawing of the map, one can obtain a performance approximation not for every pixel, but only for a grid of pixels. One then obtains the performance values for the remaining pixels with bilinear interpolation, a straightforward method for interpolating between points on a grid. Using parallel processing to obtain the (grid of) performance values from the surrogate model further accelerates the drawing of the performance map. Depending on the number of design candidates and the speed of the computer, drawing a performance map for a 4 x 4 grid of pixels and with parallel processing typically takes a couple of seconds.
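The grid-plus-bilinear speed-up can be sketched in two lines with scipy—again an assumption about tooling, not the Performance Explorer's actual code, and `surrogate_on_grid` is a hypothetical helper that queries the surrogate for every fourth pixel:

```python
from scipy.ndimage import zoom

coarse = surrogate_on_grid(step=4)  # hypothetical: estimates on a 4 x 4 grid
full = zoom(coarse, 4, order=1)     # order=1 -> bilinear interpolation
```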

8.2 Visualizing an Example Problem

This section presents a performance map for a small transmission tower to be optimized in terms of weight while meeting stress and deflection constraints (section 7.2.1). The tower consists of 25 structural members, which are categorized into eight beam groups. The problem thus has eight discrete variables. The visualizations presented here represent the optimization results and surrogate model derived from a ten-minute run with an early version of Opossum (i.e., RBFOpt). For every iteration, RBFOpt uses a surrogate model to decide which design candidate to simulate and then recalculates the model based on the simulation result. In total, RBFOpt evaluated 362 design candidates, with the lightest weighing 240 kilograms.

8.2.1 Example Performance Map

In Figure 8.1 and Figure 8.2, every evaluated design candidate is represented as a colored circle, with the position of each circle corresponding to the candidate's parameters, and the circle's fill color corresponding to its weight.


Due to the large weight difference between the best and the worst evaluated design candidates (240 kg and 1940 kg)—which also includes penalties for exceeding stress and deflection constraints—the colors are applied on a logarithmic scale. In this way, more detail is visible for the better, lighter design candidates. The colors follow a palette that is more perceptually adequate than conventional rainbow palettes due to its linearly increasing luminance (Niccoli and Lynch 2012). The color palette and scale are identical for all figures. Figure 8.1 indicates only the colored points of the evaluated design candidates on the left, with the Delaunay triangulation used to interpolate between them on the right. Figure 8.2 shows the approximated performance map derived from the surrogate model.
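The logarithmic color mapping can be sketched as follows; the bounds are the best and worst evaluated (penalized) weights quoted above, while the palette size is an assumption:

```python
import numpy as np

def color_index(weight, w_min=240.0, w_max=1940.0, n_colors=256):
    """Map a (penalized) weight to a palette index on a log scale,
    so more detail is visible among the lighter, better candidates."""
    t = (np.log(weight) - np.log(w_min)) / (np.log(w_max) - np.log(w_min))
    return int(np.clip(t, 0.0, 1.0) * (n_colors - 1))
```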

Figure 8.2 This performance map represents a design space in terms of the estimated performance values of unexplored design candidates and indicates explored candidates. Note the groupings of well-performing candidates in the upper left corner and near the origin.


The visualization indicates two separate areas of good (i.e., light) design candidates: one closer to the center, and one in the upper left corner. These groupings indicate two different types of solutions, with each type displaying similar parameters. The visualization also shows that the central group has been explored extensively, while the peripheral group consists of only a small number of explored points. The visualized surrogate model indicates that this area might contain high-performing design candidates and thus is promising for future exploration.

8.2.2 Comparison between Simulation, Interpolation and Approximation

Figure 8.3 compares an exact performance map of the example problem—all 379,922 design candidates, i.e., pixels, were simulated, with the sampling complete instead of on a grid (section 8.1.4)—with a directly interpolated one (section 8.1.3) and an approximated one that uses a surrogate model (section 8.1.4). On an Intel i7 6700K CPU with 4.0 GHz and eight threads, generating the interpolated map took 10 seconds, generating the approximated map 20 seconds, and generating the simulated map about one-and-a-half hours. The sharp contrasts in the simulated performance map are due to the discontinuities that often characterize structural optimization problems: failing designs can be very similar to high-performing ones in terms of their variables.


Figure 8.3 Visual comparison between a performance map with directly interpolated performance values (left), an exact performance map with simulated values (center), and a performance map with values approximated with a surrogate model (right).

The interpolated and approximated performance maps appear similar, but the approximated performance map is more accurate in some areas, for example on the bottom left. A quantitative comparison of their prediction errors confirms this impression: Although their median errors are similar (~10%), the average error for the interpolated map is 20%, compared with only 14% for the approximated map (Figure 8.4). In addition, the statistical distribution is tighter for the approximated map. In other words, most of the prediction errors of the approximated map are comparatively small, but both maps exhibit large outliers.

Figure 8.4 Prediction errors for the interpolated and approximated performance maps from Figure 8.3.


8.2.3 Comparison with Parallel Coordinates and Matrix of Contour Plots

This section compares these visualizations in terms of (1) the readability of design parameters, (2) the ability to visualize continuous design spaces and fitness landscapes, (3) the type of mapping, and (4) the ability to interpolate a continuous design space from a smaller number of known design candidates. This comparison does not consider clustering methods, such as self-organizing maps or k-means (section 4.1.1), because they non-uniformly project individual parameters. In other words, clustering methods "distort" design spaces, which is unsuitable for interpolation (which is necessary to establish reversibility (section 4.2.4)) and makes it very hard to visually estimate design parameters. One can directly read design parameters from the axes of the Parallel Coordinates and the position in each Contour Plot but must indirectly infer them in the case of Star Coordinates, i.e., Performance Maps. In other words, Parallel Coordinates and Contour Plots represent design parameters explicitly and Star Coordinates implicitly.


Figure 8.5 Matrix of Contour Plots representing the estimated performance of unexplored design candidates in terms of pair-wise interactions between parameters. The circle represents the best solution found, which is the base point for the contours. The diagonal from lower left to upper right represents the eight parameters individually. The matrix is symmetrical across the diagonal.

Since each contour plot visualizes the pairwise relationship between two design parameters and the performance criterion, one can identify the sensitivity of these design parameters. (In Figure 8.5, the color changes more in some contour plots and less in others.) However, Matrices of Contour Plots can display these relationships only relative to their base point, excluding more complex, nonlinear relationships. Parallel Coordinates can identify the sensitivity of design parameters through bands of equal color, i.e., performance (Figure 8.6). Figure 8.6 presents a side-by-side comparison of different visualizations using the identical optimization results and surrogate model. While both Parallel Coordinates and the performance map identify two groups of well-performing design candidates, only the performance map gives a continuous overview over the "geography" of the design space, due to the overlapping of Parallel Coordinates. (The polylines of the Parallel Coordinates are drawn in order of performance, with the polylines representing the best candidates on top.)


Figure 8.6 Visual comparison between Parallel Coordinates, Performance Maps, and Matrix of Contour Plots. The three visualizations employ the same surrogate model to approximate the performance of unexplored design candidates. (The interpolated Parallel Coordinates were drawn using the results from the performance map.)

The two groups are not identifiable in the Matrix of Contour Plots. This comparison confirms the disadvantages of contour plots identified in Chapter 4, such as the fragmentation of the design space into a quadratic number of contours and the inability to identify nonlinearities between more than two parameters. Parallel Coordinates are bijective (section 4.2.4). However, since it is unclear which design candidates from the design space one should visualize in this case, their visualization in Figure 8.6 relies on the interpolated results from the surjective performance map. The injective contour plots can indicate only the best optimization result because the remaining evaluated points lie on different section planes.


To summarize, Parallel Coordinates and Matrices of Contour Plots offer more explicit readings of design parameters than Performance Maps and give a limited sense of their sensitivity. However, of the three visualizations, Performance Maps are the only one that can both interpolate unexplored portions of design spaces and visualize them continuously and without overlap (Table 8.1). While this chapter focuses on Performance Maps, one should keep in mind that the other visualizations have their own strengths and can benefit from being combined. The next section discusses potential applications for Performance Maps, with an emphasis on performance-informed DSE and ADO.

Table 8.1 Comparison Summary

                      Parameters  Reversibility  Interpolation  Sensitivity  Growth per Parameter  Fitness Landscape
Parallel Coordinates  Explicit    Bijective      no             yes          Linear                Overlapping
Performance Maps      Implicit    Surjective     yes            no           Linear                Continuous
Contour Plots         Explicit    Injective      no             yes          Quadratic             Discontinuous

8.3 Applications of Performance Maps

The proposed visualization method not only supports designers' understanding of fitness landscapes, but also promises to improve the interactivity of ADO tools. Performance Maps (1) provide insights into the nature of the optimization problem, (2) represent the predicted performance of unexplored design candidates, (3) display design candidates that are estimated to perform well, and (4) identify promising areas for further exploration.


Beyond ADO, applications extend to higher-dimensional DSE more broadly. For example, the mapping can represent similarities between parametrically defined design candidates or display trajectories of design decisions in rule-based design spaces.

8.3.1 Analyzing Optimization Problems

Performance Maps capture salient characteristics of fitness landscapes. For instance, a fitness landscape is "smooth" when the relationships between design parameters and the performance criterion are relatively linear and without discontinuities, and "rugged" when they are not. Another important distinction is between "convex" fitness landscapes with only a single peak (i.e., optimum) and "non-convex" landscapes with additional peaks (i.e., sub-optima). Such characteristics, which define the difficulty of an optimization problem and inform the selection of an appropriate optimization algorithm, can be identified via a performance map. For example, Figure 8.2 shows multiple sub-optima and a rugged fitness landscape where design candidates with similar parameters can have very different performance values. This ruggedness is expected for a structural optimization problem, where the lightest, feasible design candidates are similar to failing ones.

8.3.2 Relating Parameters and Performance

Both the second and third applications imply linking performance maps to their underlying parametric models in real time (Figure 8.7). Designers can adjust parametric models according to design intentions not captured by numerical performance values and simultaneously obtain the estimated performance of adjusted design candidates as locations on a performance map (based either on direct interpolation or on a surrogate model).


Figure 8.7 Diagram showing the relationship between design parameters and a performance map. Changing design parameters changes the position of the changed design candidate on the performance map (and thus indicates its approximated performance), while changing the location on the performance map adjusts design parameters (and thus displays the corresponding design candidate). Also, note the area for further automated exploration (optimization) indicated on the performance map.

They can also identify similarities between the new design candidate and evaluated ones in terms of their locations in the two-dimensional parameter space of the map. When a new design candidate is promising, designers can evaluate its performance (i.e., simulate it) and add this information to the performance map and the underlying surrogate model. In this way, manual, map-based DSE can improve the accuracy of a performance map and its underlying surrogate model. Performance Maps thus afford real-time, performance-informed DSE by replacing costly performance simulations with estimates from a surrogate model, which can be refined during exploration. (Typically, parametric models are much faster to calculate than performance simulations.)


8.3.3 Examining Promising Design Candidates

Conversely, designers can select promising locations on a performance map and simultaneously see their representations by the original parametric model. In this way, designers can appreciate the values for the design parameters that correspond to a location on the performance map (and thus an estimated performance value), and what kind of design candidate these parameters imply.

8.3.4 Guiding Automated Design Exploration

Finally, based on the understanding gained from a performance map, designers can guide further automated DSE by limiting the ranges of design parameters, as sketched below. This limiting can be achieved by selecting a promising area on a performance map and restricting future exploration to the parameter ranges in the selected area. In this way, designers can point the optimization algorithm in a direction that supports their design intentions while ensuring a more efficient optimization process due to the smaller design space to be explored. Performance Maps thus allow designers to interact manually with an otherwise automated optimization process.
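A minimal sketch of this restriction, assuming the evaluated candidates inside the selected map region are known:

```python
import numpy as np

def restrict_bounds(selected_candidates):
    """Tightened variable bounds derived from the k x n parameter
    vectors of candidates inside a region selected on the map; the
    bounds can then be handed to the optimizer for further search."""
    S = np.asarray(selected_candidates)
    return S.min(axis=0), S.max(axis=0)
```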

8.4 Limitations of Performance Maps

Despite the potentials sketched in the preceding sections, there are a number of limitations that are important to point out.

8.4.1 Only One Parametric Model

The most critical limitation of Performance Maps as a tool for interactive, performance-informed DSE is that a performance map is only meaningful for an individual parametric model.


In other words, Performance Maps do not support changes to the parametric model, such as the introduction of new design parameters. This limitation is not only a challenge for Performance Maps, but for parametric design in general.

8.4.2 Only One Performance Criterion

Performance Maps only support DSE with a single performance criterion. However, further research is likely to overcome this limitation by overlaying two or more color scales in a perceptually adequate fashion. Heinrich and Ayres (2016) overlay different color palettes to visualize multiple performance criteria using contour plots. This approach implies creating an individual performance map for each criterion, which is more efficient than optimizing multiple criteria at once. Since different combinations of performance values result in different colors even when they sum to the same numerical value (Table 8.2), this approach avoids the drawbacks associated with computing only a (weighted) sum of performance values. In this way, one can not only understand relationships between design parameters and performance criteria and the tradeoffs between performance criteria, but also find better design candidates in a shorter amount of time.


Table 8.2 Example of a color scale overlaying three normalized performance criteria. Although the criteria's sum is identical, their addition results in different colors. This example is merely illustrative, since the resulting palette is not perceptually adequate and thus potentially misleading.

Criterion 1   Criterion 2   Criterion 3   Sum   RGB Color
0.0           0.0           1.0           1.0   0, 0, 255
0.0           0.7           0.3           1.0   0, 179, 77
0.5           0.3           0.2           1.0   128, 77, 51
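The mapping in Table 8.2 amounts to scaling each criterion to one RGB channel; a sketch, where the half-up rounding matches the table's values:

```python
def overlay_rgb(c1, c2, c3):
    """Combine three normalized criteria into one RGB color; distinct
    trade-offs yield distinct colors even with identical sums."""
    return tuple(int(255 * c + 0.5) for c in (c1, c2, c3))

assert overlay_rgb(0.0, 0.7, 0.3) == (0, 179, 77)  # second row of Table 8.2
```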

8.4.3 Not the Full Design Space

Lastly, Performance Maps do not provide a full picture of a design space. From a mathematical standpoint, the surjective mapping employed by Performance Maps implies that some (and potentially an infinite number of) design candidates are not accounted for. Instead, Performance Maps represent collections of design candidates that are relatively similar to already evaluated candidates. This similarity, however, increases the likelihood that performance estimates are accurate. In other words, it is of little use to represent portions of the design space that have not been explored at all. In addition, the accuracy of a performance map also depends on the method used to provide performance estimates. Most likely, a surrogate model will be more accurate than the direct interpolation of performance values. This thesis contends that limitations in terms of comprehensiveness and accuracy do not impede the usefulness of Performance Maps, since an incomplete and inaccurate map is preferable to no map at all.


This chapter's concluding section further illustrates the usefulness of Performance Maps by discussing their role in computational design processes.

8.5 Tools for Interactive, Performance-Informed DSE

Despite these limitations, the applications described above imply several innovative ways in which Performance Maps can integrate performance-informed DSE with existing computational design processes. They not only inform manual DSE, but also allow the integration of manual and automated (i.e., using optimization algorithms) exploration.

8.5.1 Informing Manual Design Space Exploration

In ADO, designers use optimization algorithms to automatically explore the performance of parametric models. Optimization algorithms suggest single, "optimal" solutions, but it is likely that designers will take these suggestions as starting points for further exploration rather than as endpoints (Bradner et al. 2014). Performance Maps support this further exploration by providing an overview of the parametric models' design spaces in relationship to their estimated performance (Figure 8.7). They allow designers to better understand the optimization problem and to identify promising directions for further exploration with real-time feedback on the estimated performance of design candidates. Performance maps can be updated when designers simulate the performance of additional design candidates. In short, Performance Maps are not only static representations, but also interactive design tools that support designers' performance-informed explorations.


8.5.2 Alternating between Manual and Automatic Design Space Exploration

Performance Maps also provide the option to alternate back and forth between manual and automatic DSE. For example, designers can select a portion of a performance map for an optimization algorithm to explore. This alternation is especially compelling when designers employ model-based optimization algorithms (section 5.6), because model-based algorithms can take advantage of surrogate models whose accuracy has been improved through manual DSE. Performance Maps thus enable the employment of model-based optimization algorithms for interactive optimization. Compared to interactive genetic algorithms (Takagi 2001), such interactive model-based optimization algorithms promise to be more efficient (Chapter 7) and to provide a more comprehensive form of interaction that is directed at the design space as a whole.

8.5.3 Summary

In the interactions discussed above, designers guide the exploration informed by performance estimates and values. But, importantly, designers do not have to limit their considerations to performance values; they can include criteria that have not been or cannot be defined numerically, such as aesthetics. This openness underscores the potential of Performance Maps as interactive, exploratory tools for performance-informed DSE.


Based on the comparison in section 8.2 and the applications in section 8.3, Performance Maps exploit the opportunities of surrogate modeling for performance-informed DSE more fully than Parallel Coordinates, Matrices of Contour Plots, or clustering methods (Research Question 3). The next chapter presents user test results of the Performance Explorer, a performance-informed design tool that implements the ideas outlined in section 8.5.1.


9 Introducing and Evaluating the Performance Explorer

This chapter presents the Performance Explorer, a novel, visual, and interactive performance-informed DSE tool utilizing surrogate models (Figure 9.1). Specifically, the Performance Explorer provides an interactive Performance Map (Chapter 8) of the surrogate models constructed by RBFOpt and uses RBFOpt to further enhance these models. The Performance Explorer has three critical functions: (1) It provides an (approximated) overview of the fitness landscape, (2) allows the interactive and targeted enhancement of the underlying surrogate model, and (3) provides multiple representations of a parametric design candidate (as a 3D model of its appearance, i.e., morphology, as a dot on the Performance Map, and as a radial plot of its parameter values). According to Yamamoto and Nakakoji (2005), such multiple representations are important for "fostering creativity." The chapter introduces the Performance Explorer (section 9.1) and presents the methodology (section 9.2) and results (sections 9.3-9.6) from a user test.

Figure 9.1 The Performance Explorer in Rhinoceros 3D. From left to right: (1) Morphology (i.e., appearance) of the current design candidate in Rhinoceros 3D, (2) definition of the parametric model in Grasshopper (note the green box with the number sliders representing the design variables and their current values), and (3) the Performance Explorer window.


The user test compared the interactive DSE afforded by the Performance Explorer to manual DSE (i.e., manipulating variable values by hand) and automated DSE (i.e., using Opossum for optimization). The user test had thirty participants. Its results consist of qualitative and quantitative responses to a questionnaire. Section 9.7 identifies limitations of the user test and section 9.8 concludes the chapter with a discussion of its results.

9.1 Performance Explorer Features and Interface

Like Opossum, the Performance Explorer is a plug-in for the parametric design software Grasshopper, which itself is a plug-in for the 3D-modelling software Rhinoceros 3D. In Grasshopper (Rutten 2010a), the Performance Explorer is a "component" on the Grasshopper "canvas" ((2) in Figure 9.1). To use the Performance Explorer, one first runs Opossum's RBFOpt for several function evaluations (i.e., simulations) to sample the design space and to construct a surrogate model (section 6.2). The larger the number of function evaluations, the more accurate the surrogate model's approximation will be. The Performance Explorer window has four main GUI (graphical user interface) elements: (1) the performance map, (2) the performance scale, (3) the variable plot, and (4) the "Simulate" and "Refresh" buttons (Figure 9.2).

9.1.1 Performance Map

The performance map (Chapter 8) is a two-dimensional, radial mapping of the fitness landscape, insofar as RBFOpt has explored it ((1) in Figure 9.2). Every dot represents a design candidate whose performance value RBFOpt has simulated. The remaining colored area represents design candidates whose parameter values the performance map has interpolated via barycentric coordinates and whose performance values RBFOpt has approximated with a surrogate model.

Figure 9.2 The Performance Explorer window. (1) The performance map with the position cross and variable axes is on the left, and (2) the performance scale, (3) variable plot, and (4) “Simulate” and “Refresh” buttons are on the right (from top to bottom).

The six radial axes indicate the radial mapping and are labelled according to the names of the number sliders, i.e., the variables, in Grasshopper ((2) in Figure 9.1). Designers interact with the performance map via the white "position cross." The position cross indicates the position of the current Grasshopper model (i.e., of the parametric variable values) in the fitness landscape ((1) in Figure 9.2). When the Performance Explorer starts, or the performance map is regenerated, the Performance Explorer (re-)sets the position cross to the best-known design candidate.


Designers can change the current variable values by dragging the position cross across the fitness landscape, which resets the slider values in Grasshopper, and thus the parametric model. Since the Performance Explorer derives these variable values by interpolating between the simulated design candidates, they change non-linearly, even when designers drag the position cross along one of the coordinate axes. Alternatively, designers can move the position cross by manipulating the variable values not on the Performance Map, but directly in Grasshopper via the number sliders. This direct manipulation is slightly less convenient, since it happens outside the Performance Explorer window, but allows more exact manipulations of individual variable values. Changing a single variable moves the position cross along a single axis, which can help designers to better understand the visualization. In other words, changing variable values directly moves the position cross, and moving the position cross changes variable values, but these changes are not commensurate. This incommensurability is a result of the performance map's surjective mapping and interpolation (section 8.4.3).

9.1.2 Performance Scale

The performance scale ((2) in Figure 9.2) provides a legend for the colors of the performance map. The white "performance bar" and the number next to it represent the (simulated or approximated) performance value corresponding to the current position of the position cross.


In this way, the Performance Explorer accompanies designers' variable changes with real-time performance values. To achieve this real-time feedback, the Performance Explorer temporarily disables the performance simulation in Grasshopper—Grasshopper otherwise responds to variable changes by triggering new simulations—and displays performance values that it has approximated from the surrogate model in advance or which RBFOpt or the designer have simulated earlier. (When a designer adjusts variable values via the number sliders, i.e., not with the position cross, the Performance Explorer has not approximated the corresponding performance value in advance. Instead, it displays the performance value for another design candidate that occupies the same location on the Performance Map as the current design candidate. In this case, the predicted performance thus does not relate to the current design candidate, but to another, typically similar, candidate. Drawing approximations from the surrogate model in real time would address this challenge.)

9.1.3 Variable Plot

The variable plot ((3) in Figure 9.2) is a radial plot of the current design candidate's variable values, colored with the corresponding performance value. Since it is hard to visually estimate variable values from the performance map, the variable plot supports designers' intuitions with an alternative visualization of the current design parameters and (approximated) performance value. The axes of the variable plot correspond to the "main" axes of the performance map, but with radial coordinates instead of a radial mapping (section 4.2.1).

9.1.4 Simulate and Refresh Buttons

The Simulate and Refresh buttons ((4) in Figure 9.2) implement two critical, novel features of the Performance Explorer, whose combination allows designers to manually enhance the surrogate models that underlie the performance map.
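The interplay of real-time estimates and the two buttons, as described in the following paragraphs, can be sketched as a conceptual outline—not the Performance Explorer's actual implementation—where `fit_surrogate` and `simulate` are assumed callables:

```python
class ExplorerSession:
    """Conceptual sketch of the Performance Explorer's update logic."""

    def __init__(self, samples, fit_surrogate):
        self.samples = list(samples)        # (parameters, value) pairs
        self.fit = fit_surrogate            # e.g., RBFOpt's model fitting
        self.model = self.fit(self.samples)

    def estimate(self, x):
        """Real-time feedback: approximate without simulating."""
        return self.model(x)

    def simulate_button(self, x, simulate):
        """Verify a promising candidate; the new sample is stored,
        but the map is not redrawn yet."""
        value = simulate(x)
        self.samples.append((x, value))
        return value

    def refresh_button(self):
        """Recalculate the surrogate model (and, conceptually,
        regenerate the performance map) from the enlarged samples."""
        self.model = self.fit(self.samples)
```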


As mentioned in section 9.1.2, the Performance Explorer disables the performance simulations of parametric models in Grasshopper to allow real-time performance feedback, especially for time-intensive simulations. Avoidance of simulations means that, for design candidates that have not been simulated previously, the performance values indicated by the performance map are only estimates. (The performance map indicates simulated design candidates as dots.) Designers can use the Simulate button to verify the performance values of promising design candidates. When a designer presses the Simulate button, the performance bar jumps to the "correct" (i.e., simulated) performance value, with larger jumps indicating a less accurate estimate from the surrogate model. By repeatedly using this feature, designers also can get a sense of the accuracy of the underlying surrogate model. Based on the experience of the author, these estimates are more precise for design candidates that are predicted to perform better, and less precise for design candidates that are predicted to perform worse. Importantly, performing a simulation in this way also implies generating an additional sample for the surrogate model. But performing a simulation does not immediately regenerate the performance map, since—depending on the speed of the computer—this regeneration can take several seconds. Rather, the recalculation of the surrogate model and regeneration of the performance map is triggered only when designers press the Refresh button, after performing one or more simulations. Such a regeneration can result in visible changes to the performance map, which typically indicate areas of the fitness landscape where the surrogate model's accuracy has improved.


Regeneration also resets the position cross to the best-known (simulated) design candidate, which potentially has changed. When optimization has not fully explored the boundaries of the design space, designers can move the position cross outside the colored area with Grasshopper's number sliders and manually expand the performance map by simulating the performance of design candidates and refreshing the performance map. (Future versions of the Performance Explorer could automate this process by calculating and simulating the design candidates at the corners of the radial mapping.) The following sections present the methodology and results from a user test that put the ideas behind the Performance Explorer, and their implementation, to a hands-on test.

9.2 User Test Methodology

Examining Research Question 4, the user test investigated the hypothesis that the Performance Explorer supports performance-informed DSE better than the other two methods by allowing designers to introduce qualitative criteria into their search for quantitatively well-performing design candidates. In addition, the user test examined the selection criteria and DSE strategies employed by the participants. The user test took place from July 19th to November 18th, 2017. (SUTD's Internal Review Board approved the study protocol on July 14th, 2017; application number 17-145.) The user test compared three different methods (i.e., tools) for tackling a performance-informed DSE task: (1) manual, (2) automated (using Opossum), and (3) interactive (using the Performance Explorer).


Figure 9.3 User Test with three participants.

9.2.1 Participants

A total of 30 subjects (15 males and 15 females) participated in the user test. All participants were either students or researchers in architecture with at least some familiarity with Grasshopper (13 undergraduates, 4 master's students, 11 PhD students, and 2 researchers). The participants used their personal laptops for the test (Figure 9.3).

9.2.2 Design Task

The user test involved the following performance-informed DSE task:

For a given parametric model, find a well-performing design that is a promising starting point for further architectural development.

This definition of the design task follows the insight by Bradner et al. (2014) that, in ADO, optimization results are more often used as starting points for further design development than as end results (section 3.3.4).


The parametric model—defined in Grasshopper—represented a small pavilion and had six parameters (three parameters for the heights of the pavilion's corners, one for its center height, one for the size of its openings, and one for the depth of the overhangs over the entrances) (Figure 9.4). The quantitative performance criterion was the pavilion's maximum displacement under dead load, as a relative measure of the pavilion's structural performance. The meaning of structural displacement was explained to the participants, who were told to interpret the displacement as a relative measure of structural performance for conceptual design and to ignore factors such as the pavilion's thickness or material.

Figure 9.4 Example design candidates' morphologies from the design task's parametric model. The numbers indicate the maximal structural displacement in centimeters, and the pink color indicates the areas of high displacement. The design candidate on the top left is close to the best-known solutions.


The participants were reminded that their task was not to find the pavilion with the lowest maximum displacement, but to find a well-performing pavilion that they considered a promising conceptual design from an architectural point of view (based on individual design criteria, for example conceptual, formal, or programmatic ones). This "preferred design candidate" could be the best-performing one, but not necessarily. The participants had to decide how much displacement was acceptable to them, but were told that the lowest displacement was around 1 cm. On an Intel i7 6700K CPU with 4.0 GHz and eight threads, performing the structural simulation takes about 300 milliseconds. This time is short enough to allow sufficient DSE within the ten minutes allotted for the design task, but long enough to make the real-time feedback afforded by the performance-informed DSE tool meaningful, especially on the participants' (slower) laptops.
9.2.3 Performance-Informed DSE Methods
The participants performed the performance-informed DSE task with three distinct methods, for ten minutes each (Figure 9.5):

• The "Manual" method involves manipulating the variable values of the parametric model directly and simulating the resulting design candidates.



• The "Automated" method involves optimizing the parametric model with Opossum—with the number of function evaluations and runs decided by the participants—and choosing a well-performing candidate from the results list (section 6.2.2). This method allows "fine-tuning" of design candidates by adjusting parameter values directly.
• The "Interactive" method involves first running an optimization to generate a surrogate model, and then—using the Performance Explorer introduced in section 8.1—searching a visualization of the surrogate model for well-performing design candidates. If desired, participants could run the optimization multiple times, potentially resulting in different surrogate models.

Figure 9.5 Diagram of the relationship between the parametric model, structural simulation, Opossum, surrogate model, and the Performance Explorer for the three performance-informed DSE methods. All methods receive the simulation's performance values as inputs and generate parameter values as outputs. The interactive method also exchanges parameter and (approximated) performance values with the surrogate model. Note that the automated method includes the manual method, and that the interactive method includes the other two methods.

Compared to "pure" optimization, which aims to find only a single, best-performing solution, the automated method introduces an element of choice. This element of choice facilitates a more meaningful comparison of the three methods in terms of performance-informed DSE, since optimization tools that return only one best-performing design candidate facilitate understanding and exploratory divergent thinking only to a small degree (sections 3.3.4 and 3.3.5). The three methods demand only minimal skill in using Grasshopper, since, beyond using the custom GUIs of Opossum and the Performance Explorer, only the manipulation of variable values is required. All methods were demonstrated to the participants before the user test.
9.2.4 Experimental Design
Each participant performed the performance-informed DSE task with each of the three methods (manual, automated, and interactive). The participants could use each method for at most ten minutes, or stop earlier when they had found a design candidate that was satisfactory to them. This experimental design introduced a potential bias, because the participants progressively learned more about the design task's design space with each method. To mitigate this bias, the order in which the participants used the methods was randomized, taking the participants' gender into account. Since there were three methods (i.e., A, B, and C), there were six possible orders (ABC, ACB, BAC, BCA, CAB, and CBA). Five participants followed each of the six orders. 51
51 Either three females and two males, or vice versa.
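The counterbalancing scheme described above is easy to reproduce; the following sketch assigns 30 hypothetical participants evenly to the six method orders. The additional balancing by gender within each order is omitted for brevity, and all names are illustrative:

```python
from itertools import permutations
from random import Random

methods = ["Manual", "Automated", "Interactive"]
orders = list(permutations(methods))  # the six orders: ABC, ACB, ..., CBA

rng = Random(42)                      # fixed seed for a reproducible assignment
participants = list(range(30))
rng.shuffle(participants)

# Five participants per order, 30 participants in total.
assignment = {p: orders[i // 5] for i, p in enumerate(participants)}
```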


9.2.5 Questionnaires
After using each method, the participants indicated the parameter and objective values of their preferred design candidate, their criteria for choosing it, and their strategy for finding or selecting it. Using five-level Likert scales, the participants rated to what extent the preferred design candidate was a promising starting point for further development and to what extent they got a good overview of potential design candidates (i.e., the design space). The participants also had the opportunity to note any other comments and/or feature requests regarding each design method. After using all three methods, the participants ranked them in terms of how much they supported them with the performance-informed DSE task and in terms of how much they enjoyed using them. They also had the opportunity to record any final comments. No personal data was collected. In summary, the user test consisted of the following steps:
1. Method A (10 minutes)
2. Questionnaire on Method A
3. Method B (10 minutes)
4. Questionnaire on Method B
5. Method C (10 minutes)
6. Questionnaire on Method C
7. Final ranking and comments
The following sections analyze the qualitative responses to the questionnaires for each method, and then compare the quantitative results. The Appendix contains the participants' full qualitative responses.


9.3 Responses for the Manual Method
This section discusses the manual method in terms of the selection criteria for the preferred design candidates, the DSE strategies for finding them, and further comments and feature requests.
9.3.1 Selection Criteria for the Manual Method
With the manual method, participants gave a wide range of motivations for choosing their preferred design candidates, such as symmetry, shading, "expressing the idea of flight," "the potential for interesting programs to occur," "inviting and noticeable entrances," and—in one case—ease of construction. 52 Reflecting the complexity of architectural design problems discussed in section 3.3.1, only sixteen of the thirty participants mentioned structural performance as a criterion for choosing the preferred design candidate, and only four mentioned it as the sole criterion.
9.3.2 Design Space Exploration Strategies for the Manual Method
One can identify three DSE strategies employed by the participants: (1) random manual exploration, i.e., manipulating the variable values unsystematically until a satisfactory design is found, (2) strategic manual exploration, i.e., manipulating the variable values systematically by setting them in a certain order or by trying to achieve a certain shape, and (3) analytic manual exploration, i.e., trying to understand the impact of the variables on the design and its displacement.

52 The author edited the participants' responses in terms of spelling and grammar.


Responses indicating strategy 1 included phrases such as "play around [with] the parameters" or "moving the sliders until I find a nice design." Responses indicating strategy 2 included phrases such as "decide the three heights first and then the center height" or "I immediately looked at stepping the heights." Responses indicating strategy 3 included phrases such as "I tried to figure out the relationship between center height and deformation" or "I … tested the impact [variable changes] made on the overall deflection." Based on their survey responses, ten participants pursued a mostly random exploration, fourteen a mostly strategic exploration, and six a mostly analytic exploration. 53
9.3.3 Comments and Feature Requests for the Manual Method
Two participants noted that, with the manual method, they had "more freedom" than with the others, but another noted that "I didn't know what small improvements I could do to improve the value of the displacement, without changing my shape." 54 Five participants noted that the manual method was more time-consuming than the other methods. One participant explained that "many interesting shapes can be achieved through six parameters, but I would not have noticed or found them (unless I spend 1,000 hours … to get all possible combinations)."

53 Here and elsewhere in this chapter, the numbers for the design strategies are indicative only and solely based on the interpretation of the participants' responses by the author. The Appendix contains the participants' full qualitative responses.
54 The Performance Explorer assists in such situations by presenting an overview of the fitness landscape.


Two participants requested a way to "save" the parameter values of promising design candidates during exploration. Another participant manually recorded the parameter values of promising design candidates to be able to return to these configurations. These responses indicate a need to store promising design candidates during performance-informed DSE. 55
9.3.4 Manual Method Summary
In summary and based on the participants' responses, the manual method affords more freedom than the other two methods, but also is more time-consuming and leads to a reduced interest in performance goals. The diversity of DSE strategies that the participants applied reflects this freedom.

9.4 Responses for the Automated Method
This section discusses the automated method.
9.4.1 Selection Criteria for the Automated Method
With the automated method, participants again gave a wide range of motivations for choosing their preferred design candidates, albeit—and perhaps unsurprisingly—with a stronger focus on structural performance. One participant mentioned that (s)he felt "compelled to compromise my massing shape for its performance."

55 This need could be addressed by "Interactive Design Galleries" (Woodbury et al. 2017), an approach that allows the storage, retrieval, and side-by-side comparison of parametric design candidates' morphologies.


Nineteen participants sought a compromise between structural performance and other criteria (e.g., aesthetics, spatial experience, shading, etc.), seven did not mention structural performance, and two chose the preferred design candidate solely based on structural performance. A typical response from the first group was "Aesthetically [the design] looks quite balanced and it has a good optimization value."
9.4.2 Design Space Exploration Strategies for the Automated Method
Twenty-two participants ran the optimization and then selected a design from the results list ((4) in Figure 6.4). A typical response from this group was "I ran the [optimization] for a while, then I checked all the possible solutions, and I picked the one following my design criteria among the ones with the best displacement values." Only three participants from this group "fine-tuned" their selected design candidates by manually adjusting variable values. One of the three responded as follows:

I was going through the results quickly from the worst values to the best to get an overall understanding of the changes in shapes. Then [I] decided on the one that I prefer and used the sliders to adjust it … Two participants used the best optimization result as a starting point for further manual exploration, and three simply accepted the best optimization result. One of the first two tried to “play with the parameters to understand what they do, and how they affect deflection.”


The remaining three participants followed idiosyncratic strategies (such as maximizing deflection or trying to stop the optimization when "the design looks interesting"). In short, the participants' responses reveal four additional DSE strategies: (4) automated exploration, i.e., accepting the most "optimal" design candidate, (5) selective automated exploration, i.e., selecting a design candidate from a set of well-performing candidates, (6) automated exploration with manual refinement, i.e., manually "fine-tuning" the best-known solution, and (7) selective exploration with manual refinement, i.e., fine-tuning a selected design candidate. Strategies 6 and 7 illustrate the concept of optimization providing a starting point for further DSE (section 3.3.4).
9.4.3 Comments and Feature Requests for the Automated Method
Several participants indicated that the well-performing designs in the results list were "very similar" or "do not change too much," and one participant even identified "repeated designs within bands of results," i.e., clusters. Accordingly, seven participants requested improvements to the results list, such as filtering the results according to similarity (e.g., with a clustering method), better representations of parameter values, and/or more convenient browsing of the results. Such features—which are lacking also in other ADO tools—indicate an important direction for the further development of Opossum, and of ADO tools more generally. One participant requested that the optimization should take initial parameter values into account—which implies limiting the parameters' ranges or using a local search method—and another the possibility to add constraints on the

design variables (e.g., that one corner point should always be higher than another). 56 These requests illustrate how different optimization methods can address diverse needs in performance-informed DSE.
9.4.4 Automated Method Summary
In summary and based on the participants' responses, the automated method increases the need to balance trade-offs between the explicit, quantitative performance goal and additional, implicit, and qualitative design criteria. The participants again followed different DSE strategies. The most prominent strategy was searching and choosing from the results list, with only a small number of participants accepting the optimization result outright. This small number and the participants' feature requests reinforce the need for enhanced performance-informed DSE tools (section 3.3.5).
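The similarity-filtering that participants requested could, for example, rely on a standard clustering method. The sketch below is one hypothetical way to pick diverse representatives from a results list; the arrays stand in for Opossum's actual results and are not part of the tool:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (200, 6))   # parameter vectors from an optimization run
f = rng.uniform(1, 10, 200)       # corresponding displacement values

# Keep the best quartile of candidates, cluster them by parameter similarity,
# and show one representative per cluster instead of many near-duplicates.
best = X[f <= np.quantile(f, 0.25)]
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(best)
representatives = [best[labels == k][0] for k in range(5)]
```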

9.5 Responses for the Interactive Method
This section discusses the interactive method.
9.5.1 Selection Criteria for the Interactive Method
With the interactive method, the motivations for choosing the preferred design were very similar to those for the automated method. Twenty participants again sought a compromise between structural performance and other criteria, seven participants selected a preferred design for non-structural, largely aesthetic reasons, and three considered structural performance only.

56 Including a local search method in Opossum would be straightforward. There also are plans to integrate constraint handling into future releases of RBFOpt.


But there is evidence that, compared to the automated method, the participants approached the trade-offs between different criteria with more deliberation and freedom: One participant mentioned that the "visualization offers an opportunity to change parameters according to some reason, not pure instinct or experiment," and another followed his original design concept of "expressing the idea of flight," but "the offset [parameters] were better controlled to better optimize the maximal deflection." Several participants indicated that the interactive method was more permissive in terms of accepting suboptimal performance values. One mentioned that the preferred design was "visually appealing, but not very efficient," and another that "even if the objective is not very low, the shape of the model is good-looking." A larger group of participants was very satisfied with the balance struck by the preferred design, for example finding a design "that has a nice structure and it also suits the design that I like." The following response expressed perhaps the highest satisfaction:

This design has the most desired shape with the lowest displacement value. When utilizing the [Performance Explorer], I was able to somewhat determine the parameters that I want … Through this method I was able to visualize which parameters will provide me with the most stable design.


9.5.2 Design Space Exploration Strategies for the Interactive Method
Remarkably, the participants utilized (variants of) most of the design strategies identified for the manual and automated methods. The participants also utilized two novel strategies, which, to the author's knowledge, are unique to the Performance Explorer: (8) selective exploration with informed refinement, i.e., selecting an appealing design candidate and using the performance map to refine it in terms of structural performance, and—as the most "sophisticated" strategy—(9) informed analytic exploration with selection, i.e., refining the visualization of the fitness landscape in terms of scope and/or accuracy and selecting an appealing design candidate with satisfactory performance. Two participants applied strategy 1 (e.g., "Navigating through the design space (quite randomly) and trying to find a solution that looks good according to my concept."). Compared to the other two performance-informed DSE methods, this strategy is easier to implement with the Performance Explorer, since, instead of adjusting individual variable values, one can simply drag the position cross over the performance map. Three participants applied strategy 2 (e.g., "A constraint of a smaller span was first set, with a greater variation in the three heights."), and one participant strategy 3 (e.g., "I tried to understand the relationship between design results regarding the displacement value."). No participants applied strategies 4 and 6, which imply accepting the best optimization results as the preferred design candidate. This non-acceptance indicates that the Performance Explorer encourages selection and/or exploration.

Nine participants applied strategy 5 (e.g., "I chose points in the [well-performing] sections of the graph and tested out which one simulates to give the best maximum displacement values and design."). Four participants applied strategy 7 (e.g., "I was clicking through the [design candidates] to see the overall range of results and fixed on the one that I felt was more sculpturally looking and began to tweak it with the sliders."). In contrast to strategies 5 and 7 with the automated method, with the Performance Explorer, this strategy can include an extra step: verifying the displacement value of a promising design candidate that is predicted to perform well. Six participants applied strategy 8 (e.g., "Find the region that closely [approximates] the final massing that I want by using the mapping, change the sliders manually to see how the objective changes (whether it is moving to a 'bad value' region or not)."). Five participants applied strategy 9, e.g.:

I explored the solution space extensively. First, I generated extra points to understand areas that also score highly, but not as high as the optimal zone. Then, I went around the solution space, [simulating] as I went, to explore everything. I settled for an area with a good (but not the best) score that resulted in a shape that I liked.


9.5.3 Comments and Feature Requests for the Interactive Method
The interactive method received several very positive comments, for example that it was enjoyable to use, "provides more options of the design shape," "makes a lot of sense," or that its visualization was "helpful" or even "very very helpful" and allowed for the "immediate pinpointing of ideal tests." This enthusiasm also resulted in a large and productive number of feature requests. The feature requested most often—by seven participants—was (1) being able to zoom in and out on the performance map, since the details of well-performing regions in the fitness landscape can be hard to distinguish. Another repeatedly requested feature—by four participants—was (2) being able to manually control individual parameter values in the Performance Explorer window instead of in Grasshopper. Two participants each requested the following three features: (4) being able to "bookmark" solutions (also requested by two other participants for the manual method (section 9.3.3)) 57, (5) being able to adjust individual parameter ranges or to lock individual parameters entirely, and (6) being able to rerun the optimization with smaller variable ranges, or based on a selected set of design candidates, which implies a novel form of interactive optimization (section 8.5.2).

57 With the Performance Explorer, it is much easier to revisit known design candidates than with the manual method, but it is likely hard to remember which dots exactly represent potentially preferred design candidates.


Individual participants requested another three features: (7) being able to delete design candidates from the performance map, and integration in a single GUI with (8) Opossum's optimization tab and (9) Opossum's results list ((4) in Figure 6.4). The participants also requested minor tweaks to the Performance Explorer's GUI, such as more clearly labeling the parameters, adding explanatory tooltips, and allowing a nonlinear color scale. 58 Many of these requested features would improve the support for the nine performance-informed DSE strategies even further. For example, zooming and bookmarking can help with selection, and controlling, limiting, or locking individual parameter values directly in the Performance Explorer window can help with refinement. A tighter integration between Opossum and the Performance Explorer would allow alternating between the automated and interactive methods and—by affording re-running the optimization with parameter ranges limited according to insights gained from interacting with the performance map—interactive optimization or "automated refinement."
9.5.4 Interactive Method Summary
In summary and based on the participants' responses, the interactive method accommodates a wide range of selection criteria and performance-informed DSE strategies. It allows designers to pursue strategies associated with the manual and automated methods, enhances these strategies with real-time feedback and the understanding afforded by the performance map (i.e., the visualization of an

58 The visualizations in Chapter 8 utilize a nonlinear color scale, but this functionality was not available to the participants.


approximated fitness landscape), and affords novel performance-informed DSE strategies. The interactive performance map enhances (1) selection (for strategies 5, 7, 8, and 9) by indicating promising areas of the fitness landscape, (2) refinement (for strategy 8) by indicating directions for potential local improvement and by allowing designers to manually enhance the surrogate model (for strategy 9), and (3) understanding (for strategies 3 and 9) by providing an overview of the fitness landscape and through multiple representations. These enhancements are discernible from the fact that the largest group of participants applied strategy 5, and comparatively large groups strategies 7, 8, and 9 (Table 9.1). As such, the three key functions of the interactive performance map—(1) overview of the performance map, (2) interactive and targeted enhancement of the underlying surrogate model, and (3) multiple representations—support two key concepts in performance-informed DSE: choice and understanding (section 3.3.5). These three functions rely on the three advantages of surrogate modelling discussed in section 5.8.6 (speed, exploration, and enhancement). In short, the interactive method both accommodates and transcends the design strategies associated with the manual and automated methods, which also is discernible in Table 9.1. The bounty of feature requests documents the participants' enthusiasm for the interactive method and the potential for an improved Performance Explorer and similar future tools to further enhance the support for performance-informed DSE. The quantitative comparison in the next section supports the conclusions from this section.
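For readers who want a concrete picture of how such a map can be computed, the following is a deliberately simplified sketch, not the Performance Explorer's actual implementation. It assumes a star-coordinates-style radial mapping from the n-dimensional design space to the 2D map (one plausible reading of the radial mapping mentioned above) and colors the map with interpolated performance values; all data and names are illustrative:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(0)
n_dim = 6
X = rng.uniform(0, 1, (40, n_dim))            # simulated design candidates
f = np.sum((X - 0.5) ** 2, axis=1)            # stand-in for displacement values

# Star-coordinates-style radial mapping: one 2D axis vector per parameter.
angles = np.linspace(0, 2 * np.pi, n_dim, endpoint=False)
axes = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # shape (n_dim, 2)
P = X @ axes                                  # candidates projected onto the map

# Interpolate the performance values between the projected candidates to
# obtain a continuous field that colors the performance map.
field = RBFInterpolator(P, f)
u = np.linspace(P[:, 0].min(), P[:, 0].max(), 64)
v = np.linspace(P[:, 1].min(), P[:, 1].max(), 64)
grid = np.stack(np.meshgrid(u, v), axis=-1).reshape(-1, 2)
colors = field(grid).reshape(64, 64)          # one performance value per pixel
```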


Table 9.1 Numbers of participants using a performance-informed DSE strategy for each DSE method. These numbers are indicative only and are based on the author's interpretation of the participants' responses.

Strategies                                          Manual Method  Automated Method  Interactive Method
    Idiosyncratic strategies                              –               3                 0
1   Random manual exploration                            10               –                 2
2   Strategic manual exploration                         14               –                 3
3   Analytic manual exploration                           6               –                 1
4   Automated exploration                                 –               3                 –
5   Selective automated exploration                       –              19                 9
6   Automated exploration with manual refinement          –               2                 –
7   Selective exploration with manual refinement          –               3                 4
8   Selective exploration with informed refinement        –               –                 6
9   Informed analytic exploration with selection          –               –                 5

9.6 Quantitative Comparison of the Three Methods
This section presents and discusses three kinds of quantitative comparisons: (1) the performance values of the preferred design candidates, (2) results from evaluating each method directly after using it, and (3) results from ranking the three methods after using all of them.
9.6.1 Structural Displacement
Figure 9.6 displays the maximal structural displacement of the preferred design candidates selected by the participants with each method.


Figure 9.6 Maximal structural displacement of the preferred design candidates selected by the participants.

The interactive method exhibits the smallest median and range, and the automated method the smallest mean. There are two outliers per method. These results suggest that the participants attached varying importance to structural displacement relative to other criteria, and that—with the manual method—they either attached a smaller importance to structural performance, or found it harder to find structurally well-performing solutions.
9.6.2 Evaluating Individual Methods
After using each method, the participants rated to what extent their preferred design candidates were promising starting points for further development and to what extent the methods provided an overview of potential design candidates (i.e., the design space), using five-level Likert scales. In terms of promising starting points, the interactive method resulted in the smallest number of disagreeing and undecided participants (Figure 9.7), but the manual method resulted in the largest number of strongly agreeing participants.


The automated method resulted in the largest number of disagreeing and undecided participants. In total, the manual method received 117 points, the automated method 115, and the interactive method 122 (on a scale from 1 to 5). Accordingly, the participants found the most promising starting points for further design development with the interactive method, followed by the manual method. 59

Figure 9.7 Responses to the statement "This design is a promising starting point for further development." (Stacked Likert counts, from Strongly Disagree to Strongly Agree, for the manual, automated, and interactive methods.)

Figure 9.8 Responses to the statement "I got a good overview over potential design candidates (i.e., the design space)." (Stacked Likert counts, from Strongly Disagree to Strongly Agree, for the manual, automated, and interactive methods.)

59 But note that, with longer simulation times, the manual method quickly becomes unfeasible.


The results for the quality of the provided overview were less ambiguous (Figure 9.8): The interactive method scored best (with 133 points), the automated method second (with 119 points), and the manual method last (with 105 points). Remarkably, almost all participants agreed that the interactive method afforded a "good overview" of potential design candidates. In summary, although some participants were very satisfied with the design candidates they discovered manually, the interactive method was the most effective in terms of supporting the discovery of promising design candidates and providing an overview of the design space.
9.6.3 Ranking the Three Methods
After completing the performance-informed design task with the three methods, the participants ranked the methods in terms of their helpfulness with the design task and in terms of their enjoyment in using them.

Figure 9.9 Responses to the prompt "Please rank the design methods based on how much they supported you with the design task." (Counts per rank, from least to most, for the manual, automated, and interactive methods.)

Figure 9.10 Responses to the prompt "Please rank the design methods based on how much you enjoyed using them." (Counts per rank, from least to most, for the manual, automated, and interactive methods.)

In this a-posteriori comparison, the differences between the methods are more pronounced. The interactive method scored the highest in both dimensions (Figure 9.9 and Figure 9.10), followed by the automated and manual methods. Almost two thirds of the participants found that the interactive method was the most helpful, and more than two thirds found that it was the most enjoyable.
9.6.4 Summary
In general, the differences between the three methods are more pronounced in the a-posteriori comparison than in the individual evaluations and the performance values of the preferred design candidates. This larger difference is probably due to the participants' development of a better understanding of potential design candidates (i.e., the design space) and of the strengths and limitations of the three methods in the course of the test, as well as the requirement to rank the methods (i.e., participants had to pick a best and a worst method).


Figure 9.11 Mean scores for the three performance-informed DSE methods. The three ranks for support (Figure 9.9) and enjoyment (Figure 9.10) are scored as 1, 3, and 5. The scores for promising starting points (Figure 9.7) and the overview over the design space (Figure 9.8) are scored as 1, 2, 3, 4, and 5.
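As a hedged illustration of this scoring scheme, the sketch below computes such mean scores from response counts; the counts shown are hypothetical placeholders, not the user test's data:

```python
import numpy as np

LIKERT_SCALE = np.array([1, 2, 3, 4, 5])  # Strongly Disagree ... Strongly Agree
RANK_SCALE = np.array([1, 3, 5])          # least, neutral, most

def mean_score(counts, scale):
    """Weighted mean of a response distribution over the given scale."""
    counts = np.asarray(counts)
    return float(counts @ scale) / counts.sum()

# Hypothetical counts for 30 participants (the real counts appear in Figures 9.7-9.10):
print(mean_score([1, 3, 7, 12, 7], LIKERT_SCALE))  # a Likert-rated dimension
print(mean_score([4, 9, 17], RANK_SCALE))          # a ranked dimension
```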

The interactive method emerges as (1) the most supportive and (2) the most enjoyable method, which (3) results in the most promising starting points for further development and (4) affords the most comprehensive overview of the design space (Figure 9.11). (But the average scores for promising starting points are very similar.) In three of these four dimensions, the automated method is second and the manual method last.

9.7 Limitations
This section discusses the three major limitations of the user test: (1) the selection of participants, (2) the method of data collection, and (3) the simplified DSE task.
9.7.1 Selection of Participants
The most serious limitation concerns the selection of the participants. All participants were from academia, and thirteen were undergraduates. 60 This

selection allowed a larger sample size but might limit the relevance of the user test for professional practice. On the other hand, many participants had at least some experience with professional practice, albeit mostly through internships. In addition, the different degrees of sophistication in using performance-informed DSE methods that are apparent in their responses in part reflect professional practice as well.
9.7.2 Method of Data Collection
Another limitation was the method of data collection: The participants' written responses sometimes were unclear or left out important aspects. Although this limitation was partially compensated by the—compared to similar studies 61—larger number of participants, the interpretation of the participants' written responses is somewhat subjective, especially in terms of the categorization of the responses into the nine strategies. The user test also did not directly test the participants' acquisition of "tacit" knowledge about the design space and fitness landscape. Future studies should gather richer data, for example by recording the participants' explorations and/or by verbally interviewing them.
9.7.3 Simplified DSE Task
The final limitation concerns the performance-informed DSE task itself. Compared to the simulations and parametric models used in architectural

60 The author also informally tested the Performance Explorer at a conference masterclass ("Interfacing Architecture, Engineering, and Mathematical Optimization," 22.9.-25.9.2017, annual symposium of the International Association for Shell and Spatial Structures, IASS 2017: Interfaces). The attendees, who included architects and structural designers, were enthusiastic about the Performance Explorer, but there was not enough time for a formal test due to technical problems with different operating systems.
61 Andersen et al. (2013) tested twelve participants, and Shireen et al. (2017) tested nine.


practice (e.g., Shepherd et al. 2011; Fisher 2012; Imbert et al. 2013), the design task was simplified dramatically. This simplification was necessary to reduce the user test's requirements for software, expertise, and—most critically—simulation time. Nevertheless, the participants' responses demonstrate that the simplified design task retained sufficient complexity to allow meaningful explorations.

9.8 Discussion
This section discusses the Performance Explorer's novelty, summarizes the user test results, and outlines the results' implications for theories on ADO.
9.8.1 Novelty of the Performance Explorer
To the author's knowledge, the Performance Explorer integrates several well-known computational DSE concepts in a unique and novel manner. The following existing tools implement aspects of the Performance Explorer, but none implement its totality:

• Geyer and Schlüter (2014) use a surrogate model to provide real-time feedback for the energy performance of a parametric model, but do not provide a visualization of the surrogate model itself (i.e., of the approximated fitness landscape).



• Design Space Exploration (section 6.1.1) is not a tool, but a tool box. As such, and in contrast to the Performance Explorer, it does not provide integration between its various functions, such as optimization and surrogate modelling. Design Space Exploration does not include visualizations.



• modeFrontier (section 6.1.2) offers various visualizations and analysis methods for optimization results (Di Stefano 2009), but none that represent high-dimensional fitness landscapes as continuous fields. Moreover, modeFrontier represents performance only relative to variable values, and not relative to the design candidates' morphologies. This inability to examine morphologies hinders designers from qualitatively assessing design candidates, and thus from redefining their selection criteria.

In short, the Performance Explorer's novelty resides in its interactive performance map (Chapter 8), which represents (approximated) high-dimensional fitness landscapes as continuous fields. As discussed in section 9.5.4, the interactive performance map in turn is critical to support performance-informed DSE in terms of selection, refinement, and understanding.

Table 9.2 Comparison of the Performance Explorer with other ADO tools with performance-informed DSE features (section 5.1).

Provides design candidates’ morphologies

Visualization of fitness landscapes

x

Galapagos

x

x

Octopus

x

x

Opossum

x

x

Design Exploration

x

x

mode Frontier

x

Performance Explorer

x

Interactive Optimization x

x x

x

231

The Performance Explorers’ remaining key functions—interactive and targeted enhancement of surrogate models and multiple representations—are also present in Design Space Exploration and modeFrontier. However, their inability to visualize high-dimensional fitness landscapes continuously and—in the case of modeFrontier—to represent design candidates’ morphologies, make these functions less compelling and convenient for performance-informed DSE. Compared to the optimization tools discussed in section 6.1, the Performance Explorer is the only tool that represents all design candidates, presents their morphologies (albeit sequentially), and visualizes the fitness landscape (Table 9.2). The author expects that such features will become more common in future performance-informed DSE tools. Design Space Exploration and modeFrontier—the two performance-informed DSE tools most comparable to the Performance Explorer—also employ surrogate modelling. In addition, the Performance Explorer employs model-based optimization. 62 This use of surrogate modelling underscores its importance for performance-informed DSE and the suitability of model-based optimization for ADO (Research Question 1). 9.8.2 Summary of the User Test Results Some participants felt “freer” with the manual method, but most participants nevertheless preferred the automated and interactive methods. Some participants mentioned that the latter methods allowed the examination of

62 Sections 5.6.1 and 5.6.2 discuss the difference between surrogate modelling and model-based optimization.


larger numbers of design candidates. For more realistic problems with longer simulation times, this advantage would become more pronounced. In addition to evaluating the manual, automated, and interactive methods, the user test also identified nine performance-informed DSE strategies. These strategies can serve as a framework for future research and development in performance-informed design. Notably, of the three tested methods, the interactive method is the only one that accommodates all nine strategies. Integrating Opossum's GUI more closely with the Performance Explorer and allowing alternation between manual, automated, and interactive explorations—as suggested by the participants—will further enhance this accommodation of diverse design strategies.
9.8.3 Implications for Theories on ADO
Only a small number of participants accepted the "best" design candidates from optimization. Although the experimental design likely encouraged this outcome, it nevertheless confirms the design-theoretical ideas on optimization as a medium for reflection discussed in section 3.3.4, and thus the need for enhanced performance-informed DSE tools identified in section 3.3.5. Designers indeed applied a variety of (implicit and explicit, quantitative and qualitative) design criteria, and—as hypothesized in section 5.8.4—prefer selection from a range of alternatives over being presented with a single, "optimal" solution. The metaphor of design as search continues to be fruitful, with refinement emerging as a crucial aspect of some performance-informed DSE strategies and some participants requesting the possibility to "bookmark"


promising design candidates. The Performance Explorer supports selection and refinement by providing an overview of the fitness landscape, including clusters of promising design candidates. Beyond selection, some designers prefer to understand the relationships between design parameters, the morphologies of design candidates, and their performance. In other words, some designers prefer to gain insight into the "black boxes" defined by parametric models and performance simulations. Although the Performance Explorer's interactive visualization of the fitness landscape and multiple representations do not guarantee understanding, they certainly support its acquisition. Integrating analysis methods such as sensitivity analysis into the Performance Explorer would further enhance this knowledge acquisition.
9.8.4 Conclusion
This thesis considers the Performance Explorer a prototypical example of a performance-informed DSE tool that acknowledges designers' preferences for selection and understanding: Although the participants suggested several potentially valuable improvements (section 9.5.3), the test results nevertheless validate the utility of the Performance Explorer as a performance-informed DSE tool. Since the key differences between using only Opossum and using it in combination with the Performance Explorer are the latter's visualizations and interactivity, the test results indicate that, at least in this case, interactive visualization supported performance-informed DSE more than manual search or optimization alone (Research Question 4).


10 Conclusion
This final chapter summarizes the thesis (section 10.1) and discusses the research questions from section 1.1 (section 10.2), general limitations (section 10.3), contributions and implications relative to different streams of literature (section 10.4), and future research directions (section 10.5).

10.1 Thesis Summary
The thesis aims to better integrate mathematical optimization with architectural design processes: It (1) introduces performance-informed Design Space Exploration (DSE) as a theoretical concept that relates optimization to computational architectural design processes better than existing concepts, (2) compares a model-based optimization algorithm with the currently popular genetic and other algorithms, and (3) presents and user-tests a novel tool for visual and interactive DSE.
10.1.1 Performance-Informed Design Space Exploration
The thesis introduces the notion of performance-informed DSE, which—in contrast to other Architectural Design Optimization (ADO) paradigms such as performance-based or performance-driven design—emphasizes selection and understanding, and not only automation (section 3.3.5). Performance-informed DSE bridges between two paradigms that claim to describe architectural design processes: (1) Generate-and-Test, which emphasizes rule-based design and, eventually, automation, and (2) Co-Evolution, which emphasizes increasing understanding through iterative problem definitions and design development (section 3.1.4).


10.1.2 Comparing Model-based Optimization with Genetic Algorithms
The thesis presents Opossum, the first optimization tool for architectural designers based on an established, global model-based method (section 6.1.3), and examines its efficacy in comparison to three other ADO tools on seven simulation-based ADO problems (Chapter 7). In this benchmark, the genetic algorithms—which are the most popular algorithms in ADO theory and practice—are among the worst-performing ones.
10.1.3 Visual and Interactive Design Space Exploration
Finally, the thesis presents Performance Maps, a novel method to visualize high-dimensional fitness landscapes as continuous fields (Chapter 8), and implements it in a visual and interactive tool for performance-informed DSE, the Performance Explorer (Chapter 9). In a user test, the participants found the Performance Explorer more supportive and more enjoyable to use than Opossum (which allows only limited interaction and does not visualize optimization results) or searching for well-performing design candidates manually. In addition, the thesis identifies nine performance-informed DSE strategies based on the responses to the user test (section 9.5.4).
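Opossum exposes RBFOpt inside Grasshopper; for reference, RBFOpt itself can also be driven directly from Python. The following sketch follows RBFOpt's documented usage (API as of RBFOpt 4.x; check the current documentation), with the objective function and evaluation budget as illustrative placeholders:

```python
import numpy as np
from rbfopt import RbfoptUserBlackBox, RbfoptSettings, RbfoptAlgorithm

def objective(x):
    # Placeholder for a time-intensive, simulation-based objective.
    return float(np.sum((x - 0.3) ** 2))

black_box = RbfoptUserBlackBox(
    dimension=6,
    var_lower=np.zeros(6), var_upper=np.ones(6),
    var_type=np.array(['R'] * 6),      # 'R' = real (continuous) variables
    obj_funct=objective)
settings = RbfoptSettings(max_evaluations=50)  # small evaluation budget
algorithm = RbfoptAlgorithm(settings, black_box)
objval, x_best, itercount, evalcount, fast_evalcount = algorithm.optimize()
```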

10.2 Research Questions
This section discusses the results of this thesis in terms of the four research questions posed in section 1.1.


Research Question 1

Which optimization methods are especially suitable for architectural design optimization?
There is an intimate connection between the efficiency of optimization methods and the understanding that, according to some computational design theorists (section 3.3.4), optimization methods should provide: Inefficient methods are likely to produce inaccurate results, which prevents understanding. Model-based methods find good design candidates and construct reasonably accurate surrogate models with comparatively small numbers of simulated design candidates, while surrogate models play a critical role for visualizations and interactivity by providing the performance estimates needed for visualizing fitness landscapes and for real-time feedback. Model-based methods thus are especially suitable for ADO, because (1) they are more efficient and reliable than the GAs popular in ADO and other optimization methods (at least for comparatively small evaluation budgets), (2) they afford better possibilities for visualization and interaction, and (3) visualization and interaction are important for performance-informed design, which—according to the user test in Chapter 9 and other empirical studies (Bradner et al. 2014; Cichocka et al. 2017)—most computational designers prefer over simple automation (i.e., optimization without visual and interactive features).
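To illustrate the model-based idea in its simplest form, the following deliberately reduced sketch alternates between fitting a surrogate to all simulated candidates and minimizing that surrogate to choose the next simulation. Real methods such as RBFOpt additionally balance this exploitation with explicit exploration; the objective, budget, and names here are illustrative:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator
from scipy.optimize import differential_evolution

def simulate(x):
    # Placeholder for an expensive, simulation-based objective.
    return float(np.sum((x - 0.3) ** 2))

bounds = [(0.0, 1.0)] * 6
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (10, 6))              # initial sample of candidates
f = np.array([simulate(x) for x in X])

for _ in range(20):                         # small evaluation budget
    surrogate = RBFInterpolator(X, f)       # fit a model to all simulated points
    # Minimize the cheap surrogate to propose the next expensive simulation.
    proposal = differential_evolution(
        lambda x: float(surrogate(x[None])[0]), bounds, seed=0).x
    # Tiny jitter as a crude stand-in for exploration; it also avoids
    # duplicate sample points, which would make the interpolation singular.
    x_next = np.clip(proposal + rng.normal(0, 0.01, 6), 0.0, 1.0)
    X = np.vstack([X, x_next])
    f = np.append(f, simulate(x_next))      # one simulation per iteration

best_candidate = X[np.argmin(f)]
```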

Research Question 2
Which optimization methods are efficient on building optimization problems?


The benchmark results on simulation-based, time-intensive ADO problems in Chapter 7 indicate that the global model-based optimization method converges faster and is more reliable than the direct search and metaheuristic methods. The model-based optimization method thus is the most efficient overall.

Research Question 3
Which visualization methods best exploit the opportunities of surrogate modeling for performance-informed design space exploration?
Chapter 8 concludes that Performance Maps better exploit the opportunities of surrogate modeling for performance-informed DSE than Contour Plots and Parallel Coordinates. The user test results from Chapter 9 reinforce this conclusion. These opportunities include speed, exploration, and enhancement (section 5.8.6) and enable three critical functions of the Performance Explorer: (1) visualizing approximated fitness landscapes, (2) targeted enhancement of these approximations, and (3) multiple representations (section 9.5.4).

Research Question 4
Do interactive visualizations support performance-informed design space exploration more than only manual search or optimization?
The user test results from Chapter 9 indicate that interactive visualizations support performance-informed DSE more than manual search or optimization alone. This better support likely results from a recognition of designers' preferences for selection and understanding (section 9.8.3).


10.3 Limitations
Beyond the limitations discussed in previous chapters, two general ones are worth discussing: generalizability and the role of parametric models in architectural design processes.
10.3.1 Generalizability
Generalizability is an issue both for the benchmark in Chapter 7 and the user test in Chapter 9: For the benchmark, the algorithms' relative performance might be quite different for another problem set or a larger budget of function evaluations. The thesis demonstrates that RBFOpt performs well for comparatively small budgets of function evaluations and for diverse simulation-based problems. In the author's opinion, this claim is broad enough to be relevant for many applications in ADO, but more modest than the claims regarding GAs discussed in section 5.8.2. Just as the performance of optimization algorithms is problem-dependent, the user test's responses in Chapter 9 reflect the design of the experiment and the tested software. One should thus hesitate to transfer these results to other settings or into theoretical frameworks without further testing. However, the—to the author's knowledge—only other empirical studies of architectural designers' use of ADO (Bradner et al. 2014; Cichocka et al. 2017) also indicate preferences for selection and understanding.
10.3.2 Role of Parametric Models
The author agrees with Kotnik (2010) that, in the context of architectural design processes, the definition of parametric models is more important than their

optimization (section 3.3.1). Considering the co-evolution of architectural design problems (section 3.1.3), computational designers likely develop various parametric models during a single design process. The author thus does not intend the design tools presented in this thesis—Opossum and the Performance Explorer—to contain entire design processes. Rather, the author imagines their sporadic use at relevant moments of design processes to inform the next iterations of co-evolutionary design cycles. This intended use reinforces the importance of finding good design candidates quickly over finding the optimal one at all costs, and the importance of performance-informed DSE over automation.

10.4 Contributions and Implications
Reflecting the thesis' interdisciplinarity, this section summarizes contributions to and implications for the fields of (1) mathematical optimization, (2) ADO, (3) multivariate visualization, (4) computational design theory, and (5) performance-informed design tools.
10.4.1 Mathematical Optimization
The thesis confirms the findings from mathematical benchmarks (section 5.7)—where GAs and other metaheuristics usually perform poorly—with a benchmark on simulation-based problems from architectural design. Typically, black-box optimization benchmarks employ either mathematical test problems or problems from engineering design.


This benchmark confirms the mathematical optimization literature's assumption that theoretical proofs of convergence and good performance on mathematical test problems indicate good performance also on practical, simulation-based problems (section 5.8.3). The thesis further contributes to the mathematical optimization literature by categorizing existing black-box optimization methods, presenting applications for black-box optimization in ADO, and presenting requirements for optimization tools from an ADO standpoint. (For example, optimization tools should be efficient, but also support selection, refinement, and understanding and, ideally, work for multiple objectives.)
10.4.2 Architectural Design Optimization
The thesis proposes the study of ADO as a unified field, challenges the ADO literature's preference for metaheuristics and, especially, GAs (section 5.8.2), highlights a need for more frequent and rigorous benchmarking (especially when assumptions contradict the mathematical benchmarking literature), and demonstrates the efficacy of a global model-based optimization algorithm on simulation-based ADO problems. Further developing a proposal for the façade of the New Jurong Church by Schroepfer + Hee (Schroepfer 2012) (pp 120-129), the innovative, daylight-maximizing and glare-minimizing wall screen presented in section 7.3.7 illustrates potential novel applications of model-based optimization for sustainable design.


With Opossum, the thesis provides the ADO community with a global model-based optimization tool that is easy to install and use. In addition, the thesis contributes to the ADO literature by presenting examples of annual daylight and glare optimization.
10.4.3 Architectural Pedagogy
Although this thesis does not contribute to architectural pedagogy as such, it raises questions on how to integrate ADO and performance-informed DSE into architectural curricula. Architectural students face challenges like the ones faced by practitioners (section 2.1). To apply ADO and performance-informed DSE, students need to also master parametric design and (some) performance simulations. Integrating such advanced computational methods into design processes can be challenging, especially for learning designers. Nevertheless, in the author's opinion, one should not understand architectural design and computational methods as separate, or even contradictory. Rather, a proper understanding of the strengths and limitations of such methods comes from applying them to design projects. To improve the integration of computational methods such as ADO into professional design processes, it is important to teach such methods early, and in combination with architectural design. Gerber and Flager (2011) and Pasternak and Kwiecinski (2015) present examples of successful integrations of such methods into architectural studio teaching, with both examples containing high-rises as a design brief.


In response to fears—expressed by Lawson (2006) and others (sections 3.2.1 and 3.3.1)—that such an emphasis will lead to a loss in architectural design skills, this author responds that (1) such concerns are long-standing (Ehlers et al. 1986) but ill-supported with evidence 63, (2) it is better to tackle the impending "sea change" (Scheer 2014) head-on, and (3) on the contrary, performance-informed DSE aims to support and amplify both convergent and divergent aspects of existing architectural design processes (section 3.3.5).
10.4.4 Multivariate Visualization
The thesis introduces "reversibility" as a key concept for exploiting visualizations for performance-informed DSE and presents Performance Maps, a novel, reversible method for visualizing high-dimensional fitness landscapes with applications in performance-informed design. Performance Maps might also prove useful for other optimization-related fields, such as fitness landscape analysis (Pitzer and Affenzeller 2012) or engineering design.

63 Critiquing the use of "computational crutch[es]" in design education and practice, Goldschmidt (2017) writes that "one may claim that we get more complex and exciting geometries than ever before, as in the Dalian Conference Center. Are these geometries resolved in terms of building performance and construction? As of now, often the answer is still negative." She then gives the Stata Center in Cambridge, MA as an example of a "flawed building that triggered a huge lawsuit." But this anecdotal evidence is selective: Thanks to computational technologies, numerous buildings with complex geometries have been completed successfully (Wortmann and Tuncer 2017). On the other hand, it is easy to find examples of flawed buildings without complex geometries, such as Berlin's new airport. The airport's rectangular main terminal was due to open in 2012 but remains closed due to various problems with design and execution, as well as "political and bureaucratic obstacles" (Ros 2017).


10.4.5 Computational Design Theory
The thesis tests the theoretical framing of optimization as promoting not only automation but also understanding with a practical software implementation and an empirical user test. To contrast this framing with concepts that focus on automation, the thesis introduces the concept of "performance-informed design." In addition to understanding, the user test identifies refinement (i.e., indicating potential directions for improvement) and, more importantly, selection (i.e., allowing choice) as requirements for better integrating optimization into architectural design processes. The user test also identifies nine performance-informed DSE strategies that provide a starting point for further empirical research into computational design processes. The importance of selection puts the popularity of Pareto-based optimization in ADO into a new light (section 5.8.6): Does the main advantage of Pareto-based optimization for ADO lie in its support for selection, and less in its (sometimes imperfect, see section 7.5.2) illumination of trade-offs? If so, appropriate single-objective optimization algorithms and tools can also provide this support, and potentially more efficiently than the Pareto-based GAs often used in ADO.
10.4.6 Performance-informed Design Tools
In contrast to most works in the emerging field of performance-informed design tools, such as (Conti and Kaijima 2017) or (Mueller et al. 2017), this thesis not only provides a software implementation, but also puts this implementation and, indirectly, its underlying ideas, to the test with prospective users.


The user test’s results increase the author’s confidence that, in the future, visual and interactive features will increasingly be common in ADO tools and help their wider adoption. The feature requests and design strategies resulting from the user test provide promising starting points for the further development and testing of such features.

10.5 Future Research
The thesis suggests three interrelated avenues for future research: (1) more extensive ADO benchmarks, (2) further software development, and (3) further investigation of performance-informed DSE.
10.5.1 More Extensive ADO Benchmarks
Benchmarking a wider set of simulation-based ADO problems would enhance the generalizability of benchmark results and thus contribute to better recommendations in the ADO literature on which algorithms to use when. Currently, the author is collaborating on an extensive benchmark (in terms of both the number of problems and the number of algorithms) on building energy problems. While there is a general lack of benchmarking in the ADO literature, this lack is especially apparent for Pareto-based optimization. Developing Pareto-based variants of RBFOpt and Opossum will allow comparisons of GAs and model-based methods also on simulation-based problems with multiple objectives. As a longer-term goal, optimization algorithms and ADO problems could be collected in an online database, which would enhance reproducibility, accelerate benchmarking via cloud computing, and allow recommendations based on

individual problem characteristics, such as the number of variables, the type of simulation, and the evaluation budget. 10.5.2 Further Software Development Another direction for future work is the further development of RBFOpt, Opossum, and the Performance Explorer. 64 Planned developments for RBFOpt include improving the interplay between more intensive local search and restarts and the accommodation of multiple-objectives. 65 Filtering optimization results according to diversity (for example with a clustering method) and defining a pre-set of algorithmic settings for RBFOpt that promote such diversity would improve Opossum’s support for selection. For the Performance Explorer, making it available as a free download is a priority. This distribution to a wider audience will undoubtedly necessitate additional bug fixes and features but might accelerate the adoption of performanceinformed design as a promising research direction and a valuable addition to computational architectural design practices. In addition, adding more types of visualizations and implementing some of the features requested by the user test’s participants promises to increase the support and enjoyment provided by the Performance Explorer.
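To illustrate the diversity-based filtering mentioned above, the following minimal sketch clusters a set of already-evaluated design candidates and keeps the best-performing candidate from each cluster. It is an illustrative example only, assuming scikit-learn’s k-means implementation and random stand-in data; it does not represent Opossum’s actual implementation.

```python
# Minimal sketch: selecting diverse, well-performing designs by clustering.
# Assumes `designs` is an (n_samples, n_vars) array of evaluated variable
# vectors and `scores` their objective values (lower is better).
import numpy as np
from sklearn.cluster import KMeans

def diverse_selection(designs, scores, k=5):
    """Cluster evaluated designs and return the best design per cluster."""
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(designs)
    selected = []
    for cluster in range(k):
        members = np.flatnonzero(labels == cluster)
        best = members[np.argmin(scores[members])]  # best score in cluster
        selected.append(best)
    return selected  # indices of k diverse, high-performing candidates

# Example with random stand-in data:
rng = np.random.default_rng(0)
X = rng.random((200, 6))  # 200 candidate designs, 6 variables each
y = rng.random(200)       # simulated performance scores
print(diverse_selection(X, y, k=5))
```

Returning one candidate per cluster yields a small but varied shortlist, which directly supports the selection requirement identified by the user test.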

[64] The SUTD-MIT International Design Centre currently supports this further development (IDG21700105).

[65] One can achieve the latter by either constructing one surrogate model per objective or by constructing a single surrogate model of a weighted sum of objectives, but with shifting weights.
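To illustrate the weighted-sum variant described in footnote 65, the following minimal sketch scalarizes two hypothetical objectives with weights that shift between iterations. All function names and the objectives themselves are illustrative assumptions, not part of RBFOpt’s API.

```python
# Minimal sketch of the weighted-sum idea: a single, surrogate-friendly
# scalar objective whose weights shift between evaluations, so the search
# is steered toward different regions of the trade-off over time.
import math

def objectives(x):
    """Two hypothetical competing objectives of a 1-D design variable."""
    return x ** 2, (x - 2.0) ** 2

def shifted_weight(iteration, period=10):
    """Cycle the first objective's weight smoothly between 0 and 1."""
    return 0.5 * (1.0 + math.sin(2.0 * math.pi * iteration / period))

def scalarized(x, iteration):
    """Weighted sum of both objectives with iteration-dependent weights."""
    f1, f2 = objectives(x)
    w = shifted_weight(iteration)
    return w * f1 + (1.0 - w) * f2

# Each scalarized value could feed a single surrogate model; because the
# weights shift over the run, repeated evaluations trace out different
# trade-offs between f1 and f2.
for i in range(5):
    print(i, scalarized(1.0, i))
```

In a surrogate-based setting, each scalarized evaluation would update one surrogate model; the shifting weights approximate the effect of optimizing several weighted sub-problems without maintaining one model per objective.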


10.5.3 Further Investigation of Performance-informed DSE

In the author’s opinion, the most interesting direction for future research is the further investigation of design practices in performance-informed DSE. Such an investigation could take the nine performance-informed DSE strategies as a starting point and empirically examine the concepts of selection, refinement, and understanding with additional user tests or case studies of the further-developed software (section 10.5.2). This line of research might also result in a deeper understanding of computational design processes more generally.

10.6 Conclusion

The conviction that one should put (both new and received) ideas to the test has been an important driver of this thesis. Interdisciplinarity can inspire and support such testing by contrasting different disciplinary assumptions and approaches to related questions. The empirical results from the benchmark and user test should discourage uncritical generalization and spur further inquiries into ADO fields such as building optimization and computational DSE, which sometimes overgeneralize from individual case studies or forgo testing entirely. Lawson (as cited in Johnson 2017) succinctly encapsulates this conviction in the context of design thinking:

I have found that one of the most penetrative inquiries you make into how designers think is to demand that they use a computer tool and then allow them to complain about it.


Appendix

This appendix contains the full qualitative responses from the user test discussed in Chapter 9, ordered according to the three performance-informed DSE methods (Manual, Automated, and Interactive) and numbered per participant. The appendix presents the responses for selection criteria, performance-informed DSE strategies, and comments and/or feature requests that were recorded after using each method, as well as final comments that were recorded after using all three methods. The responses have been lightly edited for grammar and spelling.

Responses for the Manual Method

This section presents responses for the manual method, which involved manipulating the variable values of the parametric model directly and simulating the resulting design candidates.

Selection Criteria for the Manual Method

Responses to the question “In one to three sentences, why did you choose this design?”:

1. I was attempting to design a pavilion with a largely undulating surface with irregular curvatures. This was done in consideration with the lowest possible deflection (