Fachhochschule Augsburg Fachbereich Informatik Diploma Thesis

Development and Verification of fast C/C++ Models for the Star12 Micro Controller

Joachim Schlosser

Examiner: Prof. Dr.-Ing. Christian Märtin
Co-examiner: Prof. Dr. rer. nat. Michael Lutz
Company: Motorola GmbH
Handed in: Summer Term 2001

Diploma Thesis Joachim Schlosser Development and Verification of fast C/C++ Models for the Star12 Micro Controller

Created: March–August 2001
Author's Address: Joachim Schlosser, Nördlinger Str. 107, 86343 Königsbrunn, http://schlosser.info

Abstract

With more and more transistors residing on a single chip and accomplishing more and more functions, chip development becomes increasingly challenging. New technologies emerge in shorter cycles, and devices have to reach the market faster. For software engineers in the embedded controller sector this means their programs must not delay this process. It is therefore vital to be able to write software even before the chip is available as hardware, and this is where simulation technology comes into play. With advanced simulators it is possible to write even complete operating systems, including low-level drivers, without having the device in silicon. Full-chip simulation is the buzzword: not only the CPU of a system is simulated, but also its peripherals such as network interfaces and I/O devices.

This diploma thesis contributes parts of such a simulator. The first part is the development of models of two peripherals of a current CPU, written to be used in a simulator. The second part is the implementation of an interface between a simulator engine and a general-purpose simulation application. The peripheral models are a step towards having all peripherals of the particular CPU available for simulation, while the interface allows a complete system test with all timing and functionality, covering software as well as network systems, without any piece of the target hardware being present.

Before diving into the details of the implementation, the necessary theoretical background is given. The theory part gives a detailed view of the simulation technology used in the practical experience part. Many important terms are introduced, such as models, model managers, simulators, taxonomy, classification, abstraction and interfaces, to mention only a few. A concrete implementation of a simulator backplane is presented, which is also used for the practical part. A refined taxonomy – a classification scheme for categorizing models along several axes such as time and data representation – was created by the Virtual Socket Interface Alliance. A standard for interfaces between models of hardware components and applications, named the Open Model Interface, has been published by the IEEE. Verification is, of course, an important subject; different reasons for and methodologies of testing are presented in the thesis. The implementation documentation describes the modeling of two peripherals of a micro controller and an adaptation layer between an application and the concrete simulator engine implementation.


Contents

1 Introduction
  1.1 Objective of the Thesis
    1.1.1 From Integration Boards to System-on-Chip
    1.1.2 Simulation
    1.1.3 My Job
  1.2 The Virtual Socket Interface Alliance
    1.2.1 Intent
    1.2.2 Organization

2 Theory and Background
  2.1 Model Abstraction Levels
    2.1.1 A basic abstraction model
    2.1.2 Gajski-Kuhn chart
    2.1.3 The VSIA approach
  2.2 Model Interfaces
    2.2.1 Basic OMI Concepts
    2.2.2 OMI Information Model
    2.2.3 Execution Stages
  2.3 Model Verification
    2.3.1 Intent Verification
    2.3.2 Equivalence Verification
    2.3.3 Verification Test Suite Migration
    2.3.4 VC Verification versus Integration Verification
    2.3.5 Summary: Functional Verification Mapping
  2.4 Octopus
    2.4.1 Event Based Simulation with Octopus
    2.4.2 Execution Logic
    2.4.3 Octopus classification
    2.4.4 Octopus through OMI Eyes
  2.5 Summary of the Theory Part

3 Modeling and Implementation
  3.1 Modules Modeling
    3.1.1 Environment
    3.1.2 Analog-to-Digital Converter
    3.1.3 Pulse Width Modulator
  3.2 Simulink Interface for Octopus
    3.2.1 Simulink S-Functions
    3.2.2 Implementation
  3.3 Summary of the Practical Experience Part

4 Conclusion

About Motorola
Bibliography
CD-ROM Contents
Tools and Applications used
Glossary of Abbreviations

List of Figures

2.1 Gajski-Kuhn Y-chart
2.2 Design Levels
2.3 Taxonomy Axes
2.4 Empty classification scheme
2.5 Symbol keys
2.6 Functional Model Classification
2.7 Behavioral Model Classification
2.8 Structural Model Classification
2.9 Interface Model Classification
2.10 Performance Model Classification
2.11 Mixed Level Model Classification
2.12 Data Flow Graph Model Classification
2.13 OMI model integration
2.14 Basic OMI elements
2.15 OMI execution model with callbacks
2.16 Event-driven simulation cycle with callbacks
2.17 Cycle-based simulation cycle with callbacks
2.18 General Concept
2.19 Octopus classification
2.20 Functional resolution of Octopus, another view

3.1 ATD Block Interface Diagram
3.2 Sample Hiwave output for ATD
3.3 PWM_8B6C Block Diagram
3.4 PWM Left Aligned Output
3.5 PWM Center Aligned Output
3.6 Hiwave sample output for PWM
3.7 PWM basic functional test output
3.8 PWM prescaler decrease test output
3.9 PWM re-enabling channels test output
3.10 Simulink block groups
3.11 Simulink Simulation Parameters
3.12 Simulink Block
3.13 Simulink simulation stages
3.14 Test Model for Adaptation Layer
3.15 Octopus S-function variable port number
3.16 Control and Data Flow for Adaptation Layer
3.17 Continuous sample time
3.18 Continuous sample time, zoomed
3.19 Fixed sample time, granularity 500 cycles
3.20 Fixed sample time, granularity 6500 cycles
3.21 Implementation Overview

List of Tables

2.1 Accessing information from an object
2.2 Functional Verification Mapping

3.1 Example for ATD Output Codes
3.2 ATD ports
3.3 Concatenation Mode Summary

Listings

3.1 ATD read routine
3.2 Checking Routine for PWM
3.3 PWM generic read check routines
3.4 PWM check callback routines
3.5 PWM eval callback
3.6 Port list creator routine
3.7 Signal monitor routine
3.8 Hack into model table
3.9 S-function output routine

1 Introduction

1.1 Objective of the Thesis

How did the topic of this thesis emerge? What is simulation used for? What is the objective of the practical part of this diploma thesis? A few basic thoughts have to be discussed before turning to concrete theories.

1.1.1 From Integration Boards to System-on-Chip

Since the invention of the silicon transistor, it has been possible to put more and more transistors on a single chip every year. It began with a few thousand in the late sixties and has reached many millions today, and an end is hardly conceivable. But not only the pure transistor density has grown; there has also been an evolution in functional density. In the past, different ICs resided on their own chips and were put together on an integration board. Such boards hold several chips and provide the circuits between them. In addition to the production processes of the chips – one for each chip, with the resulting time and cost – comes the assembly process. Although the complete boards can be protected against environmental influences too, it is more likely that a contact point on the board is faulty or damaged than a circuit within a chip. Another point is that each transition from chip to board and vice versa consumes additional power, no matter how perfect and precise the production process may be, and enlarges the distance a signal has to travel. So these systems cannot perform perfectly.

Now, with the increasing number of transistors placed on a chip, it is possible to integrate multiple ICs in a single chip. No additional integration board is needed; all the chips usually needed for a system can be pooled. The result is a compact unit that is relatively insusceptible to environmental influences and can easily become part of a bigger system, without many requirements with respect to space and power. The signal propagation delay becomes shorter because the different components are closer to each other. These highly integrated devices go by different names; they are referred to as System-on-Chip (SoC), system-ASIC, systems-on-a-chip or System Level Integration (SLI) devices [VSI01b]. Twenty or more ICs integrated on a single chip is nothing unusual nowadays. As mentioned before, this has great advantages regarding cost and robustness.


But with more and more chips being developed for the consumer electronics business, a problem also emerges from this: consumer electronics products change very quickly, and with this shorter product time-to-market the development cycles for SoCs are shrinking as well. As the complexity of these SoCs does not shrink but in fact explodes, the gap between the demands of the consumer electronics vendors and the abilities of the engineering community and its tools is becoming larger. It is an expanding dilemma between what is theoretically possible with the technology and what can practically be done while human and financial resources stay at the same level: the same number of engineers has to spend the same or even less money as in the past, but develop more complex systems in a much shorter time.

It becomes quite apparent that it is impossible and unnecessary to develop a new chip from scratch each time. Parts of old chips can be reused, thus splitting a chip logically into pre-designed blocks, much as it was done for a long time with off-the-shelf ICs on printed circuit integration boards [VSI01b]. The pre-designed blocks are no longer used in the form of chips that are assembled onto one board; they are a form of Intellectual Property, variously referred to as IP, IP blocks, cores, system-level blocks (SLB), macros or system level macros (SLMs). The Virtual Socket Interface Alliance (VSIA) – an organization introduced in the following section – calls these pre-designed blocks Virtual Components (VCs). The IP blocks are the replacement of the former chips. They are combined virtually long before the actual manufacturing, instead of being assembled physically. A whole system is manufactured at once, as a solid product.

This change in the design and production process leads to another problem: it is no longer possible to easily change the layout or function of a system after the chips are produced. With several chips residing on one integration board it is not an unsolvable situation to change the arrangement of ICs – simply print a new board and that is it. With a SoC the situation is different. All ICs are on one chip. If anything changes, chips produced before the change cannot be used any more. A new layout has to be generated, new masks have to be made, and a new production process has to be elaborated, with all the cost and time disadvantages. Whereas with boards it was possible to assemble the ICs on a bigger board for verification and testing, producing a chip just for verification and testing is a major expense factor, and so the number of different verification chips has to be kept as small as possible.

1.1.2 Simulation

To fulfill this requirement, new techniques have been developed and deployed, and well-known techniques have been refined and extended. Software for microprocessors is usually not created on the target systems themselves, for self-evident reasons: limited memory and speed, and often no operating system is present.


Development is done on different platforms like PCs or workstations, where the environment is much more convenient. Using cross compiler and cross linker tools, code for the microprocessor can be generated. To transfer the program to the target platform, either a burner is used – if a connection between the development host and the target can be made, e. g. evaluation boards with serial interfaces – or the program is burnt directly into the ROM later in the production process. For testing purposes the first solution is the better one. Usually the ROM on an evaluation board is replaced with another sort of non-volatile memory, flash.

For testing, a simulator is used more often than these hardware solutions. An emulator, acting as the interface between the debugger and the evaluation board, has to reproduce 100% of the behavior of the target processor. It has to support debugging, provide the ability to set breakpoints, to read memory and registers without changing the program counter, and so on. A simulator has none of these restrictions. It is only software pretending to be a processor. All internal states and information can be retrieved and even manipulated by the debugger. It is quite clear that an additional piece of software introduces new error possibilities, but the same holds for evaluation boards and emulators, which can contain errors too. The real disadvantage of simulators is their speed, because the host system has to simulate all the actions occurring in the microprocessor in each cycle in software, which means copying or moving data and deciding between different execution paths. Recent simulation techniques tackle this problem and make simulation faster and faster.

Now, as integrated systems cannot be physically debugged and tested, the whole SoC has to be simulated. What is usually simulated within a debugger is only the core processor. In modern chips the core is a rather small module, no more important than the other components. There are many components one can think of: complex timers with many stages and prescalers, digital-to-analogue and analogue-to-digital converters, signal generators, data transfer ports implementing even whole protocols within the module, sophisticated memory with multi-access capabilities, watchdogs, externally triggered input and output ports and many more. Regarding the space occupied on a chip, the processor with all its registers takes the smallest share.

By simulating whole SoCs, the interaction between the components can be evaluated, timing problems can be determined and – by embedding the simulator in a bigger environment – even the interaction with the rest of the world can be observed. The most exciting possibility opened by SoC simulation for commercial customers is certainly the ability to write software for a system, including low-level drivers, before the first silicon for the chip comes out of the factory. This reduces the time-to-market for a system, because in the past programming could not start before the chip was there, and the time it takes to program complex systems cannot be ignored. With simulation, parallel development is possible once the precise specifications of a system are released.


While the hardware developer works on the detailed layout and refines the design, the customer can write his software and operating systems. The chip can be virtually integrated into the environment it should work in later. When the chip finally becomes physically available, only small adaptations have to be made where the simulation differed from the real chip. Differences cannot be avoided, as certain effects in timing or electrical behavior are very difficult to simulate, and implementing them would not pay off, neither in performance nor in the resources needed.

1.1.3 My Job

Motorola is one of the largest vendors of embedded systems and, of course, also of SoCs. So there is an obvious need for a full-chip simulation tool. This tool already exists, but models must be created step by step for all the devices and peripherals, to allow their virtual integration into the chips that are to be simulated. Giving customers the ability to simulate SoCs is essential for the reasons mentioned above. On one side, simulators are a product of their own, but of course they are also bundled with hardware fabrication contracts and are therefore an instrument for customer loyalty. A customer who can use the chip virtually, long before having it physically, is also more communicative; more design issues can be found, and so he can help make the product – his product – better.

The first practical part of my job was to implement models for two peripherals of the Star12 microprocessor, also known as M68HC12. We chose the eight/sixteen-channel analogue-to-digital converter and the six/eight-channel pulse width modulator for modeling, i. e. one input and one output peripheral. See section 3.1.1.1 on page 73 for a description of the models and a short overview of the Star12 microprocessor. The simulator engine Motorola uses can be integrated into the debugger of the microprocessor development platform as well as into Verilog simulators and others. Within engineering, the toolkits Matlab and Simulink are widely used to model the circuit systems, for example complete engines. My second big job was to allow models running on the simulator engine to be used as blocks within Simulink, and thus to be able to virtually integrate a SoC into any other environment. To learn more about Motorola, see appendix 4 on page 115.

Concerning my job with Motorola, I would especially like to thank Mr. Manfred Thanner, who supported me greatly in acquiring the skills that were necessary for doing the job. Thanks also go to Dr. Andreas Both and the whole team of the Virtual Garage Munich, who were, like Mr. Thanner, always open to my questions – and there were quite a lot of questions, of course.


1.2 The Virtual Socket Interface Alliance

As all companies engaged in SoC face the same problems, it is logical that at a certain point even competitors sit down together and think about what can be done better.

1.2.1 Intent

Just as a SoC is built out of several IP blocks, a simulator cannot be a monolithic engine. IP blocks have to be reused either unchanged or patched, to match a different specification of a component of the same type. A simulation component manager, called the model manager, manages the models of the IP blocks and holds all simulation components together. Whoever starts to simulate SoCs needs a model manager, either self-developed or an external product. With the whole range of different model managers arising from this fact, and the large number of debuggers, a new problem appears: a large number of interfaces between debugger and simulator, all with different structures and views, complicates the reuse of a simulator in a different debugger environment or within a different debugger. One level deeper in the simulator, the number of interfaces between model managers and models is also as big as the number of simulators. The benefit of using Virtual Components is lost if their interfaces and data formats must be converted whenever they are integrated into a different environment, just to make them compatible with a certain model manager again.

The problem is much the same as with components on personal computer boards – different devices are combined and their function has to be tested. But there is a decisive difference: for PC boards an infrastructure of standard interfaces, integration, verification and tests exists, so mixing and matching ICs from different vendors is not a big problem at all. With VCs and system chips the situation is quite different. The infrastructure to support development and verification of such components is not well defined, so mixing VCs from different sources and integrating them into system chips is difficult. And the system-chip industry does not only consist of the VC vendors; there are of course companies which produce the chips, others that deliver the tools, like compilers and debuggers, companies that do only chip design without having fabrication facilities, and Electronic Design Automation (EDA) companies. So there is a "lack of open VC-to-VC interface standards, as a base for VC development and use" [VSI01b]. Standards are needed to give the different vendors a way to make their products compatible with each other – if not compatible in the exact calling paths, then at least compatible in structure and data flow.


1.2.2 Organization

In order to establish such a unifying vision, vendors of the system-chip industry formed the Virtual Socket Interface Alliance in September 1996. The basic goal of the VSI Alliance is to create the technical standards required to mix and match Virtual Components from different sources. With specified standards for VCs, SoC development can be accelerated dramatically. The VSIA is driven by so-called design working groups (DWGs), which concentrate on specific problems and write the specifications. Up to now, the VSIA has produced 19 documents. Volunteers from the 177 member companies and 29 individual members staff the DWGs. Among the member companies are nearly all big hardware vendors, such as Motorola, Infineon Technologies, IBM, Analog Devices and Intel, as well as research organizations like the Fraunhofer Institute and many more vendors of hardware, tools and services. Offices are maintained in Europe, the US and Japan.

The Board is a group that directs the work and coordinates public relations and releases. The Board elects a president each year, who guides direction and policy. The administration consists of an executive director, a staff and a webmaster. The Technical Committee (TC) is responsible for coordinating the DWGs. A DWG has a chairman and a number of volunteers from the members, and defines its own objectives. There are the following Design Working Groups:

• Analog Mixed-Signal
• Implementation/Verification
• IP Protection
• Manufacturing Related Tests
• On-Chip Buses
• System-Level Design
• Virtual Component Transfer

For more information on the individual DWGs and the VSIA in general, refer to their website [VSI01b].


2 Theory and Background

This chapter discusses the theory that is necessary to understand the simulation technology used in the practical experience part. It starts with a discussion of abstraction levels, presenting different schemes for the categorization of models. The subsequent part shows a standard for interfaces within simulation, followed by a summary of verification techniques. The last section of the theory and background chapter introduces a concrete implementation of a simulator backplane.

2.1 Model Abstraction Levels

The design of a chip is not a straightforward procedure. No one starts designing a chip by running a CAD program and drawing the layout of the circuits. Usually, what is known is a set of functionality that has to be covered by the system. The design team refines this set of functionality into a specification. Starting with the description of functionality, the specification is refined to register level, timing aspects appear, the specification is translated into a register transfer language (RTL) and a hardware description language (HDL), and at the end there is the layout. Different levels of abstraction are used to design the system, from high-level to low-level abstraction. Each level has its own aspects of behavior that it describes, be it timing or data values, functions or structure. Each aspect has to be verified and tested, so each level must be accompanied by a simulation model of the whole system. A high-level model allows verifying the requirements of the system, evaluating different architectural aspects and testing the usability of existing components with respect to the system's needs.

2.1.1 A basic abstraction model

There are different approaches to categorizing abstraction levels. [DH99] defines six abstraction levels for embedded systems in the automotive industry, ordered by incremental refinement:

Scenarios. A scenario is the highest abstraction level, describing systems with the greatest universality and the least amount of information. The concept of events and their dependencies is used to model use cases and "not use cases" by forming sequences of events.


Use cases are examples of the aims; "not use cases" show the possible exceptions or errors as well as tests – in short, how a system may not be used.

Functions. The functions abstraction level considers complete sequences and their structure. The sequences are still implementation independent. Structure is needed to provide a way to navigate through the usually huge system of functions. It is not important here to know which functions interact exclusively with or depend on other functions, only what they do.

Functional network. This level adds information about the instantiation of the defined functions into particular functional configurations, enlarging the name space. The mutual dependencies, which were not treated in the level above, now have to be defined and, in case of conflicts, also resolved. According to [DH99] this is the first level where a "simulation of the overall system is possible". A global time is introduced, allowing the communication between the functions to be defined as event-driven and synchronous, based upon multi-casting.

Logical system architecture. New orthogonal information classes are introduced, dealing with the logical distribution and the information about sensors and actuators that are part of the environment. Time is split into two parts: the real-time part, the universal time of the environment, and the system-time part, which includes clocking. The communication between the functions is no longer event-based but a clocked data flow. This makes the synchronization of the two times crucial: if they diverge beyond a certain tolerance, the real-time capability of the system is violated.

Technical system architecture. Here the model is filled with information about concrete realizations of the architecture specified at the level above, e. g. how buses are implemented. Additionally, the control unit specifications and the operating system are described. At this level a first performance simulation is possible.

Implementation. The complete model from the above abstraction levels is now implemented. Code optimizations take place, as well as adaptations to existing systems.

The presented taxonomy is perfectly valid, but from my point of view it is far from being differentiated enough. It is more or less possible to identify the level of abstraction at which a design step takes place, but what I miss is a description of what type of information is to be abstracted. Abstractions of different aspects like time, data, data flow and functions are mixed in a one-size-fits-all view. So for classifying models it is nothing more than a very rough overview.


For example, it is not clear to me where the boundary between levels 5 and 6 lies. Where is RTL and where is HDL coding done? Is it not too late to think of adaptations to existing systems when the actual implementation is supposed to be done?

2.1.2 Gajski-Kuhn chart

A more detailed abstraction model is the Y-chart that Gajski and Kuhn introduced in 1983, well explained in [DH99]. With this well-known chart it is possible to visualize design views as well as design hierarchies. It is widely used in VHDL design and can give us an idea for modeling abstraction levels, too. The name Y-chart arises from the three different design views, which are shown as radial axes forming a 'Y'.

Figure 2.1: Gajski-Kuhn Y-chart, pursuant to [DH99, p. 358]. The three radial axes are Behavioral (Transfer functions, Logic, Register-transfer, Algorithms, Systems), Structural (Transistors, Gates/Flipflops, ALUs/Registers, Subsystems/Buses, CPU/Memory) and Geometry (Polygons, Cells/Module Plans, Macros/Floor Plans, Clusters, Chips/Physical Partitions).

Five concentric circles characterize the hierarchical levels within the design process, with increasing abstraction from the inner to the outer circle. Each circle characterizes a model, explained after introducing the three domains.

Behavior. This domain describes the temporal and functional behavior of a system.

Structure. A system is assembled from subsystems. Here the different subsystems and their interconnections are considered for each level of abstraction.

Geometry. Important in this domain are the geometric properties of the system and its subsystems.


There is information about the size, the shape and the physical placement. Here lie the restrictions on what can be implemented, e. g. with respect to the length of connections.

With these three domains the most important properties of a system can be well specified. The domain axes intersect with the circles that show the abstraction levels. The five circles, from highest to lowest level of abstraction (outer to inner), are:

Architectural. A system's requirements and its basic concepts for meeting the requirements are specified here.

Algorithmic. The "how" aspect of a solution is refined. Functional descriptions of how the different subsystems interact, etc. are included.

Functional block or register-transfer. Detailed descriptions of what is going on – from which register over which line to where data is transferred – are the content of this level.

Logic. The single logic cell is in focus here, but not limited to AND and OR gates; flip-flops and the interconnections are also specified.

Circuit. This is the actual hardware level. The transistor with its electrical characteristics is used to describe the system. Information from this level printed on silicon results in the chip.

Figure 2.2: Design Levels, [DH99, p. 358]

A good illustration of the abstraction levels is Figure 2.2. I personally like this taxonomy, as it shows how the word abstraction can be seen from different points of view. It is a refined model, but simple enough to get an overview at a glance. Keeping in mind that the Gajski-Kuhn chart was created in 1983, one can see that the ideas of [BR01] are not totally new; their achievement was rather to transfer the approach of Gajski and Kuhn to the automotive sector and to determine the possibility of using modeling languages like UML for the problem.


2.1.3 The VSIA approach

The VSI Alliance wants to simplify and accelerate the integration and reuse of VCs and models of VCs in various systems. According to [VSI99] the Alliance achieves this goal by a "divide-and-conquer methodology" which breaks up the entire design reuse problem into easy-to-handle pieces and resolves each individually. A basic problem among the developers of VCs, CAD systems and semiconductors is differing terminology. Different developers or vendors use different terms which can actually have the same meaning. But for the VSIA goal it is essential to be able to clearly communicate ideas about modeling techniques. Compatibility means first of all compatibility of terminology, and more – a common terminology. The System-level Design Development Working Group (SLD DWG) of the VSIA collected the existing sets of terminology and brought them to a common base by removing overlapping definitions, adding missing ones and stating inexact definitions more precisely. The resulting document is simply titled "Model Taxonomy".

The SLD DWG was of course not the first group thinking about such problems. A sub-group of the Rapid-Prototyping of Application Specific Signal Processors (RASSP) program, named the RASSP Taxonomy Working Group (RTWG), was formed in 1995. It produced a document with the title "VHDL Modeling Terminology and Taxonomy" to address exactly this problem [Rap01]. The work of the SLD DWG is based upon the RTWG approach, so it is not necessary to write an extra section on the contents of the RTWG document here. Comparing the two documents, one can see that the SLD DWG uses most of the symbols and terms already defined by the RTWG; it only adds one chapter about implementation performance models (see section 2.1.3.2 on page 17). The following two sections are based on the SLD DWG document [VSI99], section 2.1.

2.1.3.1 Taxonomy Definition

A model can first of all be viewed from two different positions: internal and external. While the internal view describes all of the structure and behavior that actually takes place within the model, the external view only sees the interface, its interaction with other models. This is a major difference, as the interface is mostly part of some sort of protocol. An interface can be described as looking at the pins of an IC package: it is well known what the purpose of each pin is, but not how the pins are processed. To get implementation details you could remove the lid from the package and look inside; this would be the internal view – you see how the interface is wired. The most important fact to recognize in model taxonomy is that a model cannot be categorized with respect to a single aspect only, nor can all aspects be merged. The view has to be split up into several taxonomy axes, each for the internal and the external view.


For each of the internal and external views there are four characteristics a model can be classified on:

1. Temporal detail
2. Data Value detail
3. Functional detail
4. Structural detail

The relationship between the temporal and data axes is orthogonal, and so is the relationship between these two and the functional and structural axes. Functional and structural, although viewed from the same direction, are better considered on their own. These eight axes – four each for the internal and external view – do not cover the software programming aspect of a model; the appearance of the hardware model to software is so far undefined. Therefore the set of axes is augmented with a ninth aspect, characterizing the software programming detail. This axis exists only once, as the programming does not differentiate between internal and external view; it only sees the whole system.

A complete overview of the axes and their abstraction levels can be seen in Figure 2.3 on the facing page. The figure shows that there are different numbers of levels on each axis; the detailed meaning is explained in the subsequent sections. A model can now be categorized by specifying the level of abstraction for each axis. A textual description for the content of a particular RTL model could look like this:

Internal   (temporal=Gate Propagation, data=Bit, function=Digital Logic, structure=Register),
External   (temporal=Clock Accurate, data=Bit, function=Digital Logic, structure=I/O Pins),
SW-Program (programming-level=none)

Easier to read is a visualization of this information. The VSIA Model Taxonomy uses a chart with all axes drawn in a matrix. To mark where a model is located within this taxonomy, a few simple symbol keys are used. An example of partial resolution could be a model resolving control information but not data values or functionality. Containing no information is also important to know; it means that the model's purpose is not the simulation of the axis on which the X occurs.
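Such a classification record can also be rendered as a small data structure. The following C sketch is purely illustrative and not part of the VSIA taxonomy or of any tool; the enumerators cover only the levels used in the RTL example above, and all names (axis_view_t, classification_t, and so on) are my own.

/* Illustrative only: a possible in-code rendering of a VSIA-style
 * classification record.  The enumerators cover just the levels that
 * appear in the RTL example above; a real tool would list all levels. */
#include <stdio.h>

typedef enum { T_GATE_PROPAGATION, T_CLOCK_ACCURATE } temporal_t;
typedef enum { D_BIT } data_t;
typedef enum { F_DIGITAL_LOGIC } functional_t;
typedef enum { S_REGISTER, S_IO_PINS } structural_t;
typedef enum { P_NONE } programming_t;

typedef struct {
    temporal_t   temporal;
    data_t       data;
    functional_t function;
    structural_t structure;
} axis_view_t;

typedef struct {
    axis_view_t   internal;   /* resolution of internal (kernel) details    */
    axis_view_t   external;   /* resolution of external (interface) details */
    programming_t sw_program; /* the single software programming axis       */
} classification_t;

int main(void)
{
    /* The RTL model example from the text. */
    classification_t rtl = {
        .internal   = { T_GATE_PROPAGATION, D_BIT, F_DIGITAL_LOGIC, S_REGISTER },
        .external   = { T_CLOCK_ACCURATE,   D_BIT, F_DIGITAL_LOGIC, S_IO_PINS  },
        .sw_program = P_NONE,
    };
    printf("internal temporal level: %d\n", rtl.internal.temporal);
    return 0;
}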


Figure 2.3: Taxonomy Axes, pursuant to [VSI99, p. 10]. The figure independently describes the resolution of internal (kernel) details and of external (interface) details in terms of the following axes, each listed here from high to low resolution (low resolution of details corresponds to a high level of abstraction, and vice versa):

Temporal resolution: Gate Propagation (ps), Cycle Accurate (10s of ns), Cycle Approximate (100s of ns), Instruction Cycle (10s of µs), Token Cycle (100s of µs), System Event (10s of ms), Partial Order (concurrent, sequential).

Data resolution: Bit (0b01101), Format (processor-like), Value (13), Property (enumeration, e.g. Blue), Token.

Functional resolution: Digital Logic (Boolean operations), Algorithmic (bubble-sort procedure), Mathematical (W = R⁻¹b).

Structural resolution: Structural (registers, gate netlist, I/O pins – full implementation information), Block Diagram (major blocks, composite I/O ports – some implementation information), Single Black Box (no implementation information).

Software programming resolution: Object Code (1001011), Micro Code (ldmar; opA r1; opB r2; add; dst muxA), Assembly Code (fmul r1,r2), HLL statements (ADA, C; i := i + 1;), DSP Primitive block-oriented (FFT(a,b,c)), Major Modes (Search, Track).

Figure 2.4: Empty classification scheme – a matrix with columns for the internal and external view, rows for Temporal, Data Value, Functional and Structural resolution, and a single row for the SW Programming Level.


Figure 2.5: Symbol keys, pursuant to [VSI99, p. 19]. The symbols indicate whether a model resolves information at a specific level only, at one of the levels spanned, optionally at one of the levels spanned, resolves partial information at the levels spanned, or does not contain information on the attribute at all.

The following sections explain the different precision items for each axis, with the items ordered by descending abstraction level.

Temporal Resolution Axis

Different models aim at different purposes with respect to temporal precision. Each model has some information about time, be it that one thing occurs after another at an unspecified time, or that one thing occurs exactly n seconds after another.

Partially ordered event accurate. The order of events is specified completely or partially. There is no precise information in temporal units about when events start or finish. This level is common in dataflow analysis.

System event accurate. Start and end times of major system functions are measured in thousands or millions of clock cycles.

Token cycle accurate. The duration for the transfer or processing of data frames is specified.

Instruction cycle accurate. Information is gathered about how many instructions are needed for certain data processing steps.

Cycle-approximate accurate. A system clock is introduced, and it is known approximately how many cycles an instruction takes. Pipeline effects and caches are not considered, therefore the numbers may differ from the correct values.

Cycle-accurate. Accurate cycle counts are present for all instructions. For all sub-instructions it is defined within which cycle they occur.


Gate propagation accurate. All events are defined in time units within clock cycles, like ns or ps. The accuracy of the circuit level limits the accuracy of the event timings.

Data Resolution Axis

The representation format of data values specifies the position on the data resolution axis. A value could be binary 0b111, signed integer –1 or the enumeration Blue – the same data represented at different precision levels, which are:

Token. The highest abstraction level for data representation contains no implementation details. The data format is unspecified.

Property. Enumerations are introduced and only the values defined by the enumerations are allowed.

Value. Still no implementation details. Values may be integer or real, but the exact representation is not specified.

Processor-like. Exact data formats are introduced. Values are represented in certain data formats, like IEEE 754 for real numbers.

Bit Logical. The lowest abstraction level. No data formats are present any more; the values are only 1, 0, X, Z, representing high, low, undefined and tri-state.

The term Composite means a representation that is formed by a combination of the above types.

Functional Resolution Axis

The function of a model can be described in the following ways:

Mathematical Relationships. The functionality is specified as a set of mathematical equations. No sequencing is done.

Algorithmic Processes. Sequencing is introduced to allow the ordering of operations. No implementation details of the algorithms are considered, although the algorithmic processes can be decomposed into smaller portions.

(Algorithm Implementation). I personally would introduce this category to provide an intermediate step towards the next one. Here, implementation details of the algorithms would be addressed – how a certain algorithm is implemented. As this precision item is not present in the taxonomy, I will not use it, to prevent confusion.


Digital Logic. Basic Boolean operators (AND, OR, NOT, etc.) represent the whole functionality. The amount of information at this level will probably be very high.

Structural Resolution Axis

The information on how a component is constructed from its constituent parts can differ in level of detail, which is represented by this axis.

Single Black Box. No implementation information is given. The component appears as one large block.

Block Diagram. Some implementation information is given, but only the connections of large sub-blocks and those of computer networks.

Structural. The full implementation information is available. All connections of simple units, composite units and complex units are specified. A complex unit may be an adder, multiplier, shifter, etc., whereas composite units are e. g. flip-flops, and simple units only logic gates or registers. Resolution on the structural axis is not limited; it can be as precise as desired.

Software Programming Resolution Axis

The granularity of the software instructions that a model of a hardware component interprets defines the item on the software programming resolution axis. A model could interpret data flow instructions like vector operations or complete Fourier transforms, usually consisting of multiple lines of code; such a model would be very abstract, whereas a model that interprets opcode has a much higher resolution. The items on the axis are:

Major Modes. The software to be interpreted exists in terms of major working modes, such as searching, tracking, initialization, etc.

DSP Primitive block-oriented. This level can be seen as parameterized calls to complex functions.

High Level Language. The model interprets software written in a high level language, e. g. C, C++ or ADA.

Assembly Code. Processor-specific assembly code is the base for this sort of interpretation. Usually this software level is compiled from the one above.

Micro code. Code in the form of a set of control signals active in a certain clock cycle.


Object Code. The binary form of code. This is the final representation of a program; the model takes the same input as a real processor would.

The software axis is important in that a model has to accept some sort of input to be useful. What input it interprets affects its implementation extensively.

2.1.3.2 Model Classes

We have seen the different aspects of model abstraction and their nuances. Referring to this basic taxonomy, we can now classify models. Some modeling concepts occur again and again, therefore they can be named and grouped. There are three groups:

• Primary Model Classes:
  – Functional Model
  – Behavioral Model
  – Structural Model
  – Interface Model
• Specialized Model Classes:
  – Performance Model
  – Hybrid (Mixed Level) Model
• Computational Model Classes:
  – Data Flow Graph Model
  – Other Models

Knowing what these model classes mean is essential to understand the benefits of the taxonomy. Therefore I will briefly summarize the properties of the model classes in the subsequent sections, pursuant to [VSI99], chapter 3.

Functional Model

A functional model specifies the general function of a system or VC, but does not cater for a specific implementation. For this reason it is possible to create a functional model at any level of abstraction, depending on the precision of implementation details. It is also possible to define separate input and output functional models, or models in terms of mathematical functions. The temporal axis is completely ignored; the same applies to the internal structural axis. A functional model basically is a model without timing information.


Figure 2.6: Functional Model Classification, pursuant to [VSI99, p. 20]

Behavioral Model

Figure 2.7: Behavioral Model Classification, pursuant to [VSI99, p. 21]

A behavioral model adds information about timing; the rest remains the same as for the functional model. But the addition of the temporal axis changes the purpose of a model dramatically. With temporal information the order of functions can be specified, constituting a flow.

Structural Model

Figure 2.8: Structural Model Classification, pursuant to [VSI99, p. 21]

A structural model is interested in the interconnections of the constituent sub-components of a system or component. The sub-components can be functional, behavioral or structural, too.


With this type of model a hierarchy can be created, reflecting the organization or the communication topology of components. A structural model requires behavioral or functional models at the leaves of the hierarchy; the leaves control the effective resolution of the temporal, data value and functional axes. Structural models are possible at any level of abstraction.

Interface Model

Figure 2.9: Interface Model Classification, pursuant to [VSI99, p. 22]

As its name states, the interface model only models the external view of a component, with partial information on all axes. The internal view is not addressed in any way. Other terms for this model are "bus functional" and "interface behavior"; the former can easily lead to misunderstandings, because usually a bus represents a lower level of abstraction. An interface model provides the external connection points as ports or parameters, as well as timing details and functional constraints. A complete interface model acts as a black box without data values and constraints, whereas partial interface models usually do not contain external data values. In general, an interface model should implement all features of a certain level of interface specification, but no more. This implies that an interface model may also encapsulate other model aspects of the internal view, so the level of external abstraction can even be totally different from the internal one. In this case the interface model acts as an adapter between the environment and the functional models of a component, only translating information between different levels of abstraction. One could also think of nesting interface models, allowing different models to be plugged in regardless of their abstraction level.
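To make this adapter role concrete, here is a minimal C sketch. It is my own illustration, not VSIA or OMI code: an interface model samples a bit-level, four-state external port and translates it into the plain integer value that an internal functional model expects. All names (logic_t, interface_adapter, functional_model) are hypothetical.

/* Illustrative sketch only (not VSIA or OMI API): an interface model as
 * an adapter between a bit-level external port and a value-level
 * internal functional model. */
#include <stdio.h>

typedef enum { L0, L1, LX, LZ } logic_t;   /* four-state logic value */

/* Hypothetical internal functional model: works on plain values only. */
static int functional_model(int value) { return value + 1; }

/* The adapter: samples an 8-bit external port (bit 0 first) and converts
 * it to an integer; an X or Z bit makes the whole value undefined here. */
static int interface_adapter(const logic_t port[8], int *ok)
{
    int value = 0;
    for (int i = 0; i < 8; i++) {
        if (port[i] == LX || port[i] == LZ) { *ok = 0; return 0; }
        value |= (port[i] == L1) << i;
    }
    *ok = 1;
    return functional_model(value);
}

int main(void)
{
    logic_t port[8] = { L1, L0, L1, L0, L0, L0, L0, L0 }; /* 0b00000101 = 5 */
    int ok;
    int result = interface_adapter(port, &ok);
    printf(ok ? "result: %d\n" : "undefined input\n", result);
    return 0;
}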

Performance Model

The first of the two models in the Specialized Model Classes is the performance model. This so-called uninterpreted model implements internally only the temporal axis, and for the external view additionally a high abstraction of data values and maybe the interface structure at some level. The only purpose of a performance model is to allow measurements regarding the timeline of a system, i. e. the response times when reacting to stimuli. So performance here means timing and delay characteristics.


Figure 2.10: Performance Model Classification, pursuant to [VSI99, p. 24]

Mixed Level (Hybrid) Model

Figure 2.11: Mixed Level Model Classification, pursuant to [VSI99, p. 25]

A mixed-level model combines models of differing descriptive paradigms and therefore differing abstraction levels, so this class is sort of a wildcard for models.

Data Flow Graph Model

Figure 2.12: Data Flow Graph Model Classification, pursuant to [VSI99, p. 26]

The last group is the Computational Model Classes. Within this group we have the Data Flow Graph Model, considering only data values, software programming and internal function, all at high levels of abstraction. At the center of interest are the inherent data dependencies of the mathematical operations in an algorithm. Far from being a normal model that can be implemented, it is a directed graph consisting of nodes and arcs, representing mathematical transformations and their data dependencies and queues, respectively. The Data Flow Graph is architecture independent; as it is a formal notation supporting analytical methods, it can be decomposed, aggregated, transformed and of course analyzed. Data Flow Graphs are widely used within compilers that produce opcode for pipelined or parallel processors.
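As a purely illustrative sketch of this description (the type names are my own and not taken from [VSI99]), such a graph can be represented in C with nodes for the transformations and arcs for the data dependencies between them:

/* Illustrative only: a minimal directed-graph representation of a data
 * flow graph model -- nodes are mathematical transformations, arcs are
 * the data dependencies (queues) between them. */
#include <stdio.h>
#include <stddef.h>

struct dfg_node;

struct dfg_arc {                  /* a data dependency between two nodes   */
    struct dfg_node *from;
    struct dfg_node *to;
    size_t           queue_depth; /* tokens that may be buffered on the arc */
};

struct dfg_node {                 /* a mathematical transformation          */
    const char      *name;        /* e.g. "FFT", "add"                      */
    struct dfg_arc **in;          /* incoming dependencies                  */
    size_t           n_in;
    struct dfg_arc **out;         /* outgoing dependencies                  */
    size_t           n_out;
};

int main(void)
{
    struct dfg_node src = { "source", NULL, 0, NULL, 0 };
    struct dfg_node fft = { "FFT",    NULL, 0, NULL, 0 };
    struct dfg_arc  a   = { &src, &fft, 16 };
    struct dfg_arc *src_out[] = { &a }, *fft_in[] = { &a };
    src.out = src_out; src.n_out = 1;
    fft.in  = fft_in;  fft.n_in  = 1;
    printf("%s -> %s (queue depth %zu)\n", a.from->name, a.to->name, a.queue_depth);
    return 0;
}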


Other Computational Models

Besides the models introduced, there is of course a whole range of other models that can be used. For the sake of completeness a few of them should be mentioned: communicating sequential processes, discrete events, hierarchical communicating finite state machines, Petri nets, process networks, and many others.

2.1.3.3 Special Models

There are many other common models which can of course all be categorized with this taxonomy, e. g. an executable specification, a mathematical equation model, a token-based performance model, an abstract-behavioral model, an ISA model, an RTL model or a power model, to mention only some of them. To see all special model examples in detail, refer to chapters 4–8 of [VSI99].

2.2 Model Interfaces

The main scope of this diploma thesis is models that fall into the class of high-level abstraction, although low-level implementation is allowed, too. Detailed information will be given in section 2.4 on page 55, where the actual platform is described and analyzed. High-level models or logic models are needed within the design process to check the specification for insufficiencies in detail or logical problems, or to ensure that different components can work together. One could think of taking a common microprocessor of one vendor together with some peripherals that the processor supports, and adding an own peripheral to provide special capabilities, e. g. a network protocol. Of course, the customer wants to verify that his peripheral will fit into the system, and maybe wants to test whether the system fits into his application. But from the need to combine models from different designers a new problem arises: different designers use different modeling languages, such as VHDL, Verilog and C. They will implement their models using different techniques. And, even more, they will use totally different simulators, software-oriented ones and hardware-oriented ones. For testing the system integration, the ideal thing would be a simulator within a debugger where software can be tested, whereas a hardware design engineer will use a simulator allowing detailed monitoring of signal timings.


So you have several languages, even more techniques and a vast array of simulators, resulting in a big number of possible combinations. To enable models to cooperate with each other, it is a bad idea to simply exchange the sources of the models and let everyone adapt them to their own style. Protection of intellectual property is without any doubt a primary requirement when talking about interoperability, so this open-source solution is not available. The only way to bring model vendors together with each other and with simulator vendors is to offer a programming interface. To aid in increasing the availability of logic models, the Open Model Forum (OMF) was founded. The group addressed the lack of standardized interfaces by defining a programming interface between simulators and models. The interface, being language independent, was called the Open Model Interface (OMI) and allowed the interaction of models and simulators supporting different languages. To form the basis for a standard for hardware description models, the OMF transferred the OMI specification to a working group of the IEEE, which finally resulted in IEEE Std 1499-1998 with the title "IEEE Standard Interface for Hardware Description Models of Electronic Components" [IEE99]. The purpose of the OMI standard is to permit the usage of any OMI-compiled model with any OMI-compliant application supporting that class of OMI model, regardless of the programming language of the model. Before working with the standard in later sections, a brief overview of the concepts and definitions of IEEE Std 1499 as described in [IEE99] is given.

2.2.1 Basic OMI Concepts

A few basic terms are defined by the standard; they will be presented here. The architecture the OMI is founded on defines the Open Model Interface to be the linkage between an application and one or more model managers. An application can be a debugger or simulator front-end, basically a software tool that uses a number of models. A model in the OMI context is a “tailorable, functional representation” ([IEE99, p. 3]) of a VC, not a stand-alone application but a component delivered in object code format to ensure intellectual property protection. The model may cover aspects of the specification that can be simulated as well as timing information and physical characteristics. All data related to the VC can be integrated into the model. Usually a model is customized and then incorporated one or more times into a complete system; each incorporation is then called a model instance. So a model can have any number of instances associated with it. The application using the models can be specialized to a simulator, adding the capability of simulating the behavior of a hardware or system design. The OMI models do not interact directly with the application; they are integrated into a model manager, which coordinates the interactions.


In Figure 2.13 the structure is visualized. A model and its instances are associated with exactly one model manager. A model manager manages any number of models. An application uses one or more model managers, whereas a model manager is used by only one application. The interface between application and model manager is the OMI, as mentioned before.


Figure 2.13: OMI model integration, [IEE99, p. 4]

It is essential that an application never directly calls a model routine, as the model manager may carry out private communication with internal services like error handling. Models do not necessarily have to be delivered as single parts; they can be collected to form a model library, controlled by a common model manager, as shown in Figure 2.14 on the next page. For model usage it does not matter whether a model is part of a library or not. In the same way it is possible to deliver the OMI software elements as a monolithic block, e. g. a model that includes its own model manager. The lines between the elements in the two figures have arrows on both ends, indicating that information flows in both directions at each connection. Especially at the OMI boundary this means that the standard defines function calls across the interface in both directions, resulting in an application part and a model manager part of the OMI. As models can be implemented using a variety of languages, the IEEE standard uses the general term modeling language for all languages that are used to develop OMI models. The compiler which produces the object code representation from a high-level modeling language like an HDL is called a model compiler. So, for example, a VHDL or Verilog design is fed into a model compiler to get object code, whereas for C code a normal C compiler is sufficient. An OMI model package contains a so-called bootstrap file besides the dynamically loadable model manager object file. The only global symbol the object file may define is the omiBootstrapRoutineT function.



Figure 2.14: Basic OMI elements, [IEE99, p. 5]

The bootstrap file is needed by the application to bootstrap the model manager it is associated with; it gives the location-independent name of the model manager object file and the C name of the bootstrap routine for the model manager. The file contents look like this:

MANAGER: vendorX_MODELMANAGER_1.0.so
BOOTSTRAP: vendorX_bootstrap_1_0

A sketch of how an application might evaluate this file is given at the end of this subsection. An OMI application can be specified using two key characteristics, presented in detail in [IEE99], section 4.3 on pages 8–9. The terms “port”, “viewport” and “parameter” are explained in detail in the subsequent section 2.2.2 on page 26.

Model boundary class. This represents what and how many kinds of object data types the application environment supports regarding the model interface. The OMI defines four classes:

Two-state model boundary class. Ports and viewports can only pass a two-state logic value (‘0’, ‘1’) or arrays of it. A parameter is a Boolean, Integer or String.

Four-state model boundary class. Ports and viewports can only pass a two- or four-state logic value (‘0’, ‘1’, ‘Z’, ‘X’) or arrays of it. A parameter is a Boolean, Integer or String.

Simple-logic model boundary class. Ports and viewports pass any logic type or arrays of it, Real and Integer. A parameter is a Boolean, Integer or String.

Unrestricted model boundary class. Ports, viewports and parameters can be any OMI type.


Simulation style. Applications may provide different simulation environments:

Cycle-based. The application does not use or support delayed updates or event times. This restricts simulation to a simple clock-based model.

Event-driven. All callbacks and simulator routines can be used, allowing a general, event-based simulation paradigm. The event-driven style includes the cycle-based style.

The application characteristics are vital to the simulation environment as they may narrow the set of fitting model managers. It is required that an OMI-compliant simulator – remember, a simulator is a special type of application – supports at least the two-state model boundary class and the cycle-based simulation style. An OMI event-based simulator is a different kind of OMI-compliant simulator that supports at least the simple-logic model boundary class and the event-driven simulation style. The model or model manager determines the characteristics of the application and diagnoses incompatibilities between them and its own requirements or those of the models. The information gained can be used to auto-reconfigure the model manager or model to the simulation environment provided by the application during the bootstrap process.

Execution is activity – calls to OMI routines – during an OMI session, driven by an application and one or more model managers. Simulation is a primary kind of execution but not the only one; one could think of an execution that only extracts timing information. Of course the execution proceeds in an orderly fashion, separated into three sessions and five stages, presented in order of occurrence:

Model query session. The only stage residing in this session is the Bootstrap, where the nature of the session is established and general setup activities are done.

Elaboration session. The Elaboration stage creates the model instances required for the OMI session.

Simulation session. This is the only session consisting of two stages. The Initialization establishes the initial values of stateful data objects, while the Simulation after it approximates the behavior of a system by exercising a set of models, using stimulus and response propagation.

All stages can lead directly to the fifth stage, Termination, without executing the following stages, making it the final stage in every case.
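As announced above, the following C fragment is a minimal sketch of how an application could evaluate the bootstrap file and load the model manager on a POSIX system using dlopen and dlsym. The helper name load_model_manager is hypothetical, and the returned address would still have to be cast to the omiBootstrapRoutineT pointer type defined in [IEE99]; error handling is reduced to the bare minimum.

#include <stdio.h>
#include <dlfcn.h>   /* POSIX dynamic loading: dlopen, dlsym */

/* Hypothetical helper: reads the MANAGER and BOOTSTRAP entries of an OMI
   bootstrap file and returns the address of the bootstrap routine. */
static void *load_model_manager(const char *bootstrap_file)
{
    char line[256], library[128] = "", routine[128] = "";
    FILE *f = fopen(bootstrap_file, "r");
    if (f == NULL)
        return NULL;
    while (fgets(line, sizeof line, f) != NULL) {
        sscanf(line, "MANAGER: %127s", library);    /* model manager object file */
        sscanf(line, "BOOTSTRAP: %127s", routine);  /* C name of bootstrap routine */
    }
    fclose(f);

    void *manager = dlopen(library, RTLD_NOW);      /* load the model manager */
    if (manager == NULL)
        return NULL;
    return dlsym(manager, routine);                 /* look up the bootstrap routine */
}

The bootstrap routine obtained this way is the entry point for the bootstrap stage described in section 2.2.3.1.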


2.2.2 OMI Information Model

2.2.2.1 Methodology

For an application it is essential to understand accurately what model information a model manager is providing. Only with detailed information can the application correctly utilize the model manager and so accomplish its task. In the OMI, information is realized as a set of objects manipulated through the programming interface, thus forming the basis for the communication between model manager and application. The precise definition of all objects is specified in the OMI information model. This information model does not only describe the type and meaning of information, but also the kinds of routines available for accessing it. Some important terms for the information model must be clarified; they are briefly explained here and in more detail later when they are used. The three basic terms for model boundary classes are:

Port. A port allows signal information to flow back and forth between application and model manager during simulation in defined ways.

Parameter. A parameter is used at creation time to configure model instances for particular needs.

Viewport. A viewport is an internal, usually hidden, model object. It allows backdoor access to the model.

Then there are other terms that are used extensively to specify information:

Data object. Any kind of object that possesses certain common characteristics, like name and value, is called a data object; it has an associated data type.

Data type. The data type defines the set of values a data object may assume.

Attribute. An attribute is a particular kind of data object, used to provide additional information decorating OMI objects. The information exceeds the basic set of information items.

Property. A property is a primitive value. It does not provide access to other objects.

Relationship. A relationship defines access to zero or more objects.

Object. An object is a collection of information that describes a particular item.

Class. A class defines a group of objects with common characteristics.

Primitive type. A primitive type is the means of giving types in a programming-language-independent manner.


Two kinds of primitive types are distinguished: built-in primitive types and enumeration primitive types. The following primitive types are valid for the OMI information model:

Integer                : primitive type
String                 : primitive type
DataValue              : primitive type
Boolean                : primitive enumeration type
IODirection            : primitive enumeration type
StorageType            : primitive enumeration type
ClockValueRelationship : primitive enumeration type

There is a corresponding C type for each primitive type in this list, specified in the programming interface. To get the name of the C type, add the prefix omi and the suffix T to the name (Integer, for example, corresponds to omiIntegerT). The notation used for the OMI information model is based on a system of class definitions, object definitions, relationships and properties. It helps to formally relate the kinds of OMI objects to the interface routines operating upon them. An object meets the common definition it has in the object-oriented programming paradigm, which says that information comprised in an object should never be accessed directly, but read and written using interface functions. This allows implementing the internal representation in different ways. A class or object can have zero or more parent classes, which means the object definition or class inherits all properties and relationships of its parent classes. This corresponds to the multiple inheritance concept of some object-oriented programming languages. The definitions of classes and objects appear in the information model in the following form:

class ClassName
    parent classes
        list of parent classes
    properties
        list of properties
    relationships
        list of relationships

object ObjectName
    parent classes
        list of parent classes
    properties
        list of properties
    relationships
        list of relationships

Properties are primitive values, as defined above, that do not allow access to other objects. In contrast to attributes they provide required information, while attributes represent optional information. A relationship defines access to other objects. There is a difference between a singular relationship, which defines access to just one single object, and a multiple relationship that yields any number of objects, unordered in a set or ordered in a list. Each information item has a type and a unique name within the class or object. Relationships and properties share one name space, so a property cannot have the same name as a relationship within the same object. Named items inherited from more than one source occur only once in the current class or object. The names of objects are given by the Name property, which is an example of an OMI name. An OMI name has the following constraints:

• Zero-byte terminated
• Case sensitive
• No non-printable or control characters
• Additionally for the Name property: no whitespace

In order to solve the problem of possible name clashes in the overall system, the concept of scopes is introduced, including the name space technique. A scope is a set of objects, all having distinct values for their Name properties, so within the name space there is a set of unique names. An overview of the objects and their scope-defining relationships will be presented later. There are some basic objects which have to be explained in order to be able to use them later for classifying a concrete model manager. The definitions are taken from [IEE99], chapter 7.
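The name constraints above map directly onto a small check. The following C helper is purely illustrative and not part of the OMI; it accepts or rejects a candidate Name property value according to the rules just listed (case sensitivity needs no check here, as it only matters when names are compared).

#include <ctype.h>
#include <stdbool.h>

/* Illustrative check of the OMI name rules for a Name property value:
   zero-byte terminated string, printable characters only, no whitespace. */
static bool is_valid_name_property(const char *name)
{
    if (name == NULL)
        return false;
    for (const char *p = name; *p != '\0'; ++p) {
        unsigned char c = (unsigned char)*p;
        if (!isprint(c) || isspace(c))
            return false;   /* control, non-printable or whitespace character */
    }
    return true;
}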


object Root
    relationships
        Models     : set of Model
        Libraries  : set of Library
        Attributes : set of Attribute

2.2.2.3 Library Objects

Models can be collected in libraries. A library bears a property, Name, by which libraries are distinguished. The unordered set of models contained in the library is represented by the Models relationship. As in the root object, the Attributes relationship is used for additional characteristics. Certain attributes are predefined by the OMI.

object Library
    properties
        Name       : String
    relationships
        Models     : set of Model
        Libraries  : set of Library
        Attributes : set of Attribute

2.2.2.4 Model Objects

A model object contains information on its model boundary, which is the set of parameters, ports, viewports and timing definitions, as well as other information regarding functionality and timing characteristics.

object Model
    properties
        Name                   : String
        SupportsSaveRestore    : Boolean
        SupportsResetTimeZero  : Boolean
    relationships
        ParameterDefinitions   : list of ParameterDefinition
        PortDefinitions        : list of PortDefinition
        ViewportDefinitions    : set of ViewportDefinition
        DefaultTimingDelays    : set of TimingDefinition
        DefaultTimingChecks    : set of TimingDefinition
        ClockingSpecifications : set of ClockingSpecification
        EnclosingLibrary       : Library
        Attributes             : set of Attribute


Most of the properties and relationships should be self-explanatory due to their names. The types of objects not introduced before will be explained in the subsequent sections.

2.2.2.5 Model Instance Objects

When a model is instantiated, it is tailored to the specific needs of the application. The instantiation forms a new object, the ModelInstance object.

object ModelInstance
    relationships
        Definition   : Model
        Parameters   : list of Parameter
        Ports        : list of DefinedPort
        Viewports    : set of Viewport
        TimingDelays : set of TimingDefinition
        TimingChecks : set of TimingDefinition

All default relationships are specified here again, but now with the actual values.

2.2.2.6 Data Objects

A data object represents a modeling element that has a value and is thus capable of storing state information.

class DataObject
    properties
        Value      : DataValue
    relationships
        ObjectType : CompletelyCharacterizedType
        Definition : DataObjectDefinition

The Value property may not be available without an initialization of the object, and additionally it may change during simulation. The second piece of information in a data object is its completely characterized data type. A data object may have a Definition relationship; in this case it is derived from a data object definition object, which has the following definition:

class DataObjectDefinition
    properties
        Name       : String
    relationships
        ObjectType : DataType
        Attributes : set of Attribute


The value of the Name property of course has to be unique in the related scope. An application may have certain expectations for data objects and their names, but it must not assume all expectations to be fulfilled; rather, the application has to verify that the model has these characteristics using the OMI interfaces.

Parameters

A parameter is an instrument by which characteristics of a model can be customized at creation time, so it has a constant value. Usually a parameter is derived from a parameter definition.

object ParameterDefinition
    parent classes
        DataObjectDefinition
    properties
        DefaultValue : DataValue

The default value shall always be available, i. e. not depend on the instantiation of the model. An actual parameter object is a data object with a relationship to the parameter definition it is derived from:

object Parameter
    parent classes
        DataObject
    relationships
        Definition : ParameterDefinition

Ports

Ports are the dynamic communication channels of a model. A port is usually attached to an object in the application to allow the application and the model to interact during simulation. Like a parameter, a port is derived from a definition, too.

object PortDefinition
    parent classes
        DataObjectDefinition
    properties
        Direction : IODirection
        IsAtomic  : Boolean


The Direction property defines the direction in which the information flows, be it from or to the model, or both. A port can be manipulated as a single atomic unit or as a collection of separate elements. These possibilities are named atomic and non-atomic and are indicated by the IsAtomic property. The actual port object represents a port associated with a model instance. Like all other data objects, a port may be of a scalar type or a composite type, thus leading to scalar port and composite port objects. A port is related to the port definition of the corresponding model in such a way that for each port definition a corresponding port, called defined port, exists on the instance. A port definition which is configured to instantiate non-atomic ports additionally leads to a port per scalar element, called port element. All ports are either defined ports or port elements.

class Port
    parent classes
        DataObject
    properties
        NumElements : Integer

For a composite non-atomic port, a NumElements property with a value greater than 0 indicates the number of port elements. It is 0 only if the port is an atomic port. The defined port is one of the two kinds of ports as mentioned above; it is not part of another port but one immediately associated with a model.

class DefinedPort
    parent classes
        Port
    relationships
        Elements   : list of PortElement
        Definition : PortDefinition

If it is an atomic port, the list of the Elements relationship is empty; otherwise the list contains the elements’ objects.

class PortElement
    parent classes
        Port
    properties
        Name   : String
    relationships
        Parent : DefinedPort


A port element bears a name and of course a relationship to its parent port.

Viewports

Ports are usually visible to the application as they are the normal way of interaction. For special purposes, such as debugging exploration or action, a model contains access paths to hidden internal data objects. The means of access to these data objects is the viewport concept, which provides access to certain internal objects without disclosing the whole internal structure and information. Again, the actual viewport is derived from a viewport definition.

object ViewportDefinition
    parent classes
        DataObjectDefinition
    properties
        CanModify : Boolean

The CanModify property of a viewport definition distinguishes between read-only and read-writable viewports.

object Viewport
    parent classes
        DataObject
    relationships
        Definition : ViewportDefinition

Internally a viewport may represent a signal, a variable or a constant. Again the application is responsible for determining the correct type and must not rely on a specific realization. A composite viewport contains viewport elements.

object ViewportElement
    parent classes
        DataObject
    properties
        Name   : String
    relationships
        Parent : Viewport

A viewport element object bears a name and a relationship to its parent viewport.


2.2.2.7 Data Types

Data types are not only used within object and class definitions; they themselves are members of a class. A data type is generally used to specify either the values that a data object can adopt or the structure of a complex data type, like an array or a record.

class DataType
    properties
        Name        : String
        StorageType : StorageType
    relationships
        Attributes  : set of Attribute

Array Types

There are two other classes within the data type context. The array type defines a sequence of homogeneous elements that are referenced by an index.

class ArrayType
    parent classes
        DataType
    relationships
        ElementType : CompletelyCharacterizedType

There are some predefined array types, e. g. omiStringT for character arrays. The array is stored left-to-right with the left-bound element having the index 0.

Completely Characterized Data Types

In some of the former definitions I mentioned the completely characterized data type, which is one whose characteristics have been fully established. This is mandatory for data objects, and optional for parameter, port and viewport definitions. The latter ones may also be defined using incompletely characterized data types.

class CompletelyCharacterizedType
    parent classes
        DataType
    properties
        StorageSize : Integer


The StorageSize property is quite self-explanatory, specifying the size (in bytes) required to store a value of the data type within its omiDataValueT form. The introduced class is a super class for a whole set of data types. The complete definitions are omitted; a list of the types is sufficient:

• Scalar data types, having low and high bounds
  – NumericType
  – IntegerData
  – RealData
  – TimeData
• Enumeration types, being derived from the scalar data types
  – BooleanData
  – CharacterData
  – LogicType, with its subclasses
    ∗ 1164LogicData
    ∗ 1364LogicData
    ∗ MVL4LogicData
    ∗ MVL2LogicData
• FixedArrayData type for a fixed index range
• RecordData type for composites, similar to structures in programming languages
• RecordElement, being the element of RecordData

Incompletely Characterized Data Types

If not all characteristics of a data type have been fully established, it is an incompletely characterized one.

class IncompletelyCharacterizedType
    parent classes
        DataType

All incompletely characterized data types are array types. First, there are the unconstrained array type objects. As the name indicates, this type does not have a fixed index range associated with it. For such a parameter or port definition the bounds for the index range come from the application.


object UnconstrainedArrayData
    parent classes
        IncompletelyCharacterizedType
        ArrayType

Due to its unclear size it cannot be the element type of an array type, as this would interfere with the fixed element size of an array. Second, there are the parameterized array type objects. If the bounds of an array type cannot be specified at definition time but are fixed by model parameter values, this is called a parameterized array type. For this there has to be some link to the parameter type.

class ArrayBound

object FixedArrayBound
    parent classes
        ArrayBound
    properties
        FixedBound : Integer

object ParameterizedArrayBound
    parent classes
        ArrayBound
    relationships
        ParameterizedBound : ParameterDefinition

object ParameterizedArrayData
    parent classes
        IncompletelyCharacterizedType
        ArrayType
    relationships
        LeftBoundParameter  : ArrayBound
        RightBoundParameter : ArrayBound

The class ArrayBound carries no information of its own; it serves as a super class to the two objects representing the two cases of array bounds: fixed values or parameter values. Note that although it is possible to define a parameterized array type with two fixed bounds, in this case the fixed array type should be used. With the parameterized array type it is possible to create, for example, generic wrapper models.


Private Data Types

It is also possible in OMI to use data types that are not defined by the OMI. These are private data types, whose representations are implementation specific and may be proprietary to a particular model vendor.

object PrivateData
    parent classes
        CompletelyCharacterizedType

For usage it is necessary that the application utilizes the information provided by the model vendor about the nature of the private type.

2.2.2.8 Timing Definitions

A timing object represents a piece of timing information associated with a model.

object TimingDefinition
    properties
        TimingSpecification : String
    relationships
        Attributes          : set of Attribute

The TimingSpecification property is a string in the SDF (Standard Delay Format) syntax. There are two kinds of timing definitions: timing checks, defined by an SDF tc_spec rule, and timing delays, defined by an SDF del_def rule. The Standard Delay Format is expected to become IEEE Std 1497; it was developed by a group named Open Verilog International, which is now part of the Accellera group. Referring to the specification [Hor95], an SDF file “stores the timing data generated by EDA tools for use at any stage in the design process.” This can include delays, timing checks, timing constraints, timing environments, incremental and absolute delays, conditional and unconditional module path delays, design- or instance-specific data, type- or library-specific data and scaling, environmental and technology parameters. In addition to the SDF specification there are some further constraints for the TimingSpecification string, such as extra formatting restrictions and naming conventions.

2.2.2.9 Clocking Specifications

For cycle-based simulation, as described in section 2.2.1 on page 22, clock ports were introduced, which are described by clocking specifications. A clock port is required in cycle-based simulation to allow efficient communication between application and model, instead of the application assuming worst-case characteristics.


Therefore a model for cycle-based simulation should provide clocking specifications. A clock port is an input-only port with an MVL2LogicData object type.

class ClockingSpecification
    relationships
        Attributes : set of Attribute

A clock appears in a certain waveform, described by the ClockWaveform object.

object ClockWaveform
    parent classes
        ClockingSpecification
    properties
        ClockPeriod        : DataValue
        RisingPhaseLength  : DataValue
        FallingPhaseLength : DataValue
        RisingPhaseIsFirst : Boolean
    relationships
        Clock              : PortDefinition

As ports can depend on a particular clock, a clock domain object is needed. Such objects should exist either for none or for all clock ports, so that the simulator can easily determine whether there are any dependencies between ports and clocks.

object ClockDomain
    parent classes
        ClockingSpecification
    relationships
        Clock      : PortDefinition
        Dependents : set of PortDefinition

Two clocks can be related, too, using the ClockRelationship object.

object ClockRelationship
    parent classes
        ClockingSpecification
    properties
        Relationship : ClockValueRelationship
    relationships
        LeftClock    : PortDefinition
        RightClock   : PortDefinition
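As an illustration of what a ClockWaveform description carries, a model manager could represent it internally roughly as follows. The structure, field names and values are illustrative only and are not OMI data structures.

#include <stdbool.h>

/* Illustrative internal representation of a ClockWaveform description for a
   100 ns clock with a 40 ns rising (high) phase; not part of the OMI. */
typedef struct {
    unsigned long clock_period_ps;        /* ClockPeriod        */
    unsigned long rising_phase_ps;        /* RisingPhaseLength  */
    unsigned long falling_phase_ps;       /* FallingPhaseLength */
    bool          rising_phase_is_first;  /* RisingPhaseIsFirst */
    const char   *clock_port_name;        /* refers to the clock PortDefinition */
} clock_waveform;

static const clock_waveform example_clock = {
    100000,   /* 100 ns period              */
    40000,    /*  40 ns rising (high) phase */
    60000,    /*  60 ns falling (low) phase */
    true,     /* rising phase comes first   */
    "clk"     /* hypothetical clock port    */
};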

2.2.2.10 Attributes

Many of the objects introduced are decorated with attributes. Some attributes are predefined by the OMI.

object Attribute
    parent classes
        DataObject
    properties
        Name       : String
    relationships
        Attributes : set of Attribute

Each attribute is unique within its scope, which usually is the current object, by its name. Attributes may be nested.

2.2.2.11 Model Information Access

A small set of routines is used to provide access to the information described in OMI information models. There are access routines for all basic types, shown in Table 2.1.

Table 2.1: Accessing information from an object, [IEE99, p. 65, Table 10]

Information item                        From                             Access routine     Result
Integer property                        Object with given property       omiGetInteger      Integer value
Enumeration property                    Object with given property       omiGetInteger      Integer value
String property                         Object with given property       omiGetString       String value
DataValue property                      Object with given property       omiGetDataValue    Data value
Singular relationship                   Object with given relationship   omiGetHandle       Object
Multiple relationship                   Object with given relationship   omiGetIterator     Iterator object
Next element in multiple relationship   Iterator object                  omiScan            Object

A handle is used to denote an object. Any number of handles may point to a single object. A handle remains valid until it is explicitly released.


After releasing a handle it is erroneous to use it. A handle is thus analogous to a pointer in various programming languages. Consequently a handle is implementation dependent, so handles obtained through different model managers will be incompatible in practice.
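To show how these access routines fit together, the following C fragment sketches how an application could list the names of all stand-alone models known to a model manager, starting from the root object. The handle type and prototypes are simplified assumptions (including the assumption that omiScan signals the end of a multiple relationship by returning a null handle); the exact signatures are defined in the programming interface of [IEE99].

#include <stdio.h>

/* Stand-in handle type and simplified, assumed prototypes. */
typedef void *model_handle;
extern model_handle omiGetIterator(model_handle object, const char *relationship);
extern model_handle omiScan(model_handle iterator);
extern const char  *omiGetString(model_handle object, const char *property);

/* Walk the Models relationship of the root object and print each Name property. */
static void list_models(model_handle root)
{
    model_handle it = omiGetIterator(root, "Models");   /* multiple relationship */
    model_handle model;
    while ((model = omiScan(it)) != 0) {                 /* next element          */
        printf("model: %s\n", omiGetString(model, "Name"));
    }
}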

2.2.3 Execution Stages

The execution stages were introduced in section 2.2.1 on page 22. In this section the stages will be described in more detail, pursuant to [IEE99], chapters 5 and 6. As explained before, there are five consecutive execution stages, whereby the fifth and last stage can be a direct successor of each of the other stages.

A central means by which the OMI works are callbacks, registered and removed by the model manager. With the number and kind of callbacks it registers, the model manager can choose the granularity of interaction with the application. Basically a callback is a kind of function pointer in the C programming language, allowing a dynamic linkage of simulator and model manager. The omiCallbackDescriptionT structure is used during registration to characterize exactly how the callback should be executed. This includes a callback reason – an indication of the circumstances under which a callback is to occur. The registration routine is omiRegisterCallback; it may be invoked in the bootstrap and elaboration stages. On execution of a callback, an omiCallbackInfoT structure keeps information for both the caller and the callee, such as the current simulation time, caller-specific data, callee private data and additional state information. Figure 2.15 on the facing page shows the different execution stages with the most important callbacks. Of course these are only the main callbacks; the standard defines many more functions for different purposes.

An important concept for directing the simulation are control requests, which are used by a model manager to directly affect the flow of execution. A control request is not a command but – as the name suggests – a request for an action. An application may accept or decline a request. The request for termination is the only one that cannot be declined and does not return control to the model manager routine. A typical control request can, for example, be a request to revert to a previous state, to suspend the manager, or to save or load its state. Requests are executed either immediately or at the end of the simulation cycle, whatever is reasonable for the particular request. It is significant that during the omiControlRequest routine the application does not call any model manager callbacks. Therefore, from the model manager’s perspective, the application returns immediately, signaling acceptance. In the next steps the model manager returns control to the application, which then executes the requested operation and finally resumes execution at the appropriate point.

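Returning to the callback mechanism described above, the sketch below illustrates the kind of information a model manager bundles when registering a callback. The structure shapes and names are stand-ins invented for illustration; the normative omiCallbackDescriptionT, omiCallbackInfoT and the prototype of omiRegisterCallback are defined in [IEE99].

/* Stand-ins for the OMI callback structures -- shapes are assumptions. */
typedef struct { int placeholder; } callback_info;

typedef struct {
    int    reason;                          /* when the callback shall occur          */
    void (*routine)(callback_info *info);   /* model manager routine to be called     */
    void  *callee_data;                     /* private data handed back on invocation */
} callback_description;

/* A model manager routine reacting to one simulation step. */
static void on_simulation_step(callback_info *info)
{
    (void)info;   /* here the model would, e.g., advance its state by one step */
}

/* Builds a description that, in a real model manager, would be passed to
   omiRegisterCallback during the bootstrap or elaboration stage. */
static callback_description describe_step_callback(void)
{
    callback_description desc;
    desc.reason      = 0;                  /* illustrative reason code */
    desc.routine     = on_simulation_step;
    desc.callee_data = 0;
    return desc;
}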



Figure 2.15: OMI execution model with callbacks, [IEE99, p. 22, Fig. 6]

Another essential technique are the system exceptions, which include errors like segmentation violations, bus errors and process signals like break or end. All these exceptions are to be handled by the application. Although a model manager may install its own exception handler, the application’s handler shall be restored before returning execution control. Of course a model manager should only catch exceptions that it can handle completely.

The model manager can help to increase the usability of the models under its control by providing special capabilities, like state viewing and manipulation. These capabilities are accessed through so-called model manager commands, using a function of type omiCommandRoutineT. The actual command is issued through a string argument to this routine. Using the command facility it is possible to add dynamic operations that are not part of the set of OMI routines. The model manager is responsible for interpreting the arguments and performing the appropriate actions.
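As an illustration of such a command facility, the fragment below shows a model manager interpreting a command string. The signature and the command names are simplified assumptions for illustration only; the exact definition of omiCommandRoutineT is given in [IEE99].

#include <stdio.h>
#include <string.h>

/* Hypothetical command routine: the command is passed as a string and
   interpreted by the model manager. Returns 0 on success, -1 if unknown. */
static int model_manager_command(const char *command)
{
    if (strcmp(command, "dump_state") == 0) {
        /* hypothetical capability: print the internal state of all instances */
        printf("dumping model state...\n");
        return 0;
    }
    if (strncmp(command, "set_trace ", 10) == 0) {
        /* hypothetical capability: adjust tracing, argument follows the keyword */
        printf("trace level set to %s\n", command + 10);
        return 0;
    }
    return -1;
}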


Model manager commands are either called by the application or by the end user through an interface provided by the application. The following sections discuss not these generic techniques but the individual execution stages in detail.

2.2.3.1 Bootstrap Stage

The first execution stage takes place at startup time, where the application goes through a particular initialization protocol with each of the model managers it is using. The initialization protocol clarifies the OMI version for the interaction, the addresses of the callback routines and several session-specific options. In order to detect version clashes the bootstrap process is split into two steps, realized with two routines called by the application.

• The first routine to be called is the model manager’s omiBootstrap function, which initiates the communication. The application learns the range of OMI revisions the model manager supports. Based on this range the application selects a specific OMI version for the session. Additionally the function returns a pointer to the second routine, which should only be called if no version clashes occur and the pointer is valid.

• The second routine – only to be called if the first one succeeds – is the omiBeginSession function. The goal of this routine is to exchange all basic characteristics of the session. This includes the OMI revision as well as the nature of the OMI session and the reference simulation style. The model boundary type supported by the application is given, as are the time value precision and the timing mode. As functional information, besides the names of application and model manager, all callback pointers of both sides are exchanged.

2.2.3.2 Elaboration Stage

In the elaboration stage the application tells the model manager(s) to instantiate the models, including tailoring them to specific needs and binding their ports to the environment. For this the application goes through the following protocol for each model; interleaving calls are not allowed. (A sketch of the resulting call sequence is given at the end of section 2.2.3.3.)

• A call of omiCreateInstance triggers the actual model instantiation. For each model this routine is called once, creating a unique instance.

• With the omiSetParameter function the model instances are tailored. Parameters not set explicitly with function calls are set to default values by the model manager.


• After passing all parameters to the instance, the application elaborates it by calling omiElaborateInstance, forcing the parameters to take effect. Here the data objects like ports, viewports and parameters are created.

• As the ports are now created, each one can be connected to the application environment using the omiConnectPort routine. A port shall be entirely connected, which is important for non-atomic ports.

• The final function call for a model elaboration is omiEndPortConnect, signaling the end of the protocol.

The means by which objects are defined and created were explained in the above section about the information model. For all incompletely characterized types the model manager replenishes these objects with the missing values, like bounds for unconstrained arrays or storage sizes. The information for these replenishments is supplied by the application. Certain restrictions are imposed on the model boundaries: A parameter definition shall not be of a parameterized type, i. e. not depend on other parameter values. A port definition may not have an unconstrained array type, as the size of a port array would then be undefined and so impossible to use. The same restriction applies to viewport definitions.

2.2.3.3 Initialization Stage

After the pure elaboration stage the stateful OMI objects, like ports, viewports, nets or drivers, have no initial values yet and can yield erroneous values when accessed. Therefore the initialization stage assigns values first to the drivers associated with OMI output or bi-directional ports, then to output or bi-directional nets and finally to ports and viewports. The order of these three sub-stages does matter, since the port initialization may require nets to have valid values. The stage is triggered by the application calling the omiDriverInitialization routine. The model manager then invokes omiInitializeDriver for each driver. A model manager may register an omiNetInitialization callback in order to obtain the initial value of a net as soon as it becomes available.
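The sketch announced in section 2.2.3.2 summarizes the bootstrap and elaboration protocol as plain C. The argument lists, parameter and port names are deliberately simplified assumptions; only the call order reflects the protocol described above, and the normative signatures are defined in [IEE99].

/* Simplified, assumed prototypes. */
typedef int (*begin_session_routine)(void);            /* assumed shape of omiBeginSession */
extern begin_session_routine omiBootstrap(void);
extern void *omiCreateInstance(const char *model_name);
extern void  omiSetParameter(void *instance, const char *name, const char *value);
extern void  omiElaborateInstance(void *instance);
extern void  omiConnectPort(void *instance, const char *port_name, void *net);
extern void  omiEndPortConnect(void *instance);

/* Bootstrap and elaboration for one model; names are hypothetical. */
static void *set_up_one_model(const char *model_name, void *clk_net, void *bus_net)
{
    /* Bootstrap stage: version negotiation, then session setup through the
       routine returned by omiBootstrap. */
    begin_session_routine begin_session = omiBootstrap();
    if (begin_session == 0)
        return 0;                                   /* version clash or invalid pointer */
    begin_session();

    /* Elaboration stage: create, parameterize, elaborate and connect. */
    void *instance = omiCreateInstance(model_name);
    omiSetParameter(instance, "BusWidth", "16");    /* hypothetical parameter      */
    omiElaborateInstance(instance);                 /* parameters take effect here */
    omiConnectPort(instance, "clk", clk_net);       /* hypothetical port names     */
    omiConnectPort(instance, "data", bus_net);
    omiEndPortConnect(instance);
    return instance;
}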


2.2.3.4 Simulation Stage

Here the actual simulation takes place by executing a series of simulation cycles. The simulator and the model managers interact through the execution of callbacks registered by the model manager. The callbacks lead to multiple port changes by retrieving values from the simulator. The OMI standard describes the simulation stage in detail in [IEE99], chapter 6.

The OMI simulation is built from a system of processes. An OMI process is a black box that represents an independent thread of execution during simulation. The content of a process is not specified in detail in the standard, for interoperability reasons. Network propagation means that the process recomputes the OMI network in order to reflect changes in nets or their drivers. Pending updates are executed and removed from the queues. In a given simulation cycle, network propagation takes place only once. The propagation can be regarded as a transaction, implying that it is an atomic step: no update is visible prior to the network propagation step, and any change to OMI ports only takes effect after the propagation. This technique ensures that the set of OMI processes is always consistent. The set of OMI processes can be divided into three basic types, differing in the times at which they are executed:

New-time-step processes. These processes execute at the beginning of a time step, prior to network propagation in the first simulation cycle.

Post-propagation processes. As the name implies, these processes execute each cycle after the network propagation has taken place.

End-time-step processes. At the end of a time step, just before simulation time is advanced, this kind of process is executed. Here, no data in the current cycle should be affected.

The differences between the event-driven and the cycle-based simulation style can best be seen from the two flow charts in Figure 2.16 on the next page and Figure 2.17 on page 46. The time period that covers all of the activity that occurs between consecutive updates to the current simulation time is called a simulation time step. It may contain more than one simulation cycle, due to the possibility of scheduling zero-delay updates. Cycle-based simulation does not know the concept of time steps; only one of the three process kinds remains: the post-propagation process. In this simulation style it is not feasible to have transparent data paths, meaning a change in an input port’s value that directly – within the same cycle – affects an output port’s value. This is forbidden because cycle-based simulation, by concept, computes the next state, including output values, from the current state plus the input values. This concept does not work if the output can change within a particular state, as this would lead to another state before the next transition has been calculated. OMI defines some callbacks used to implement simulation cycles:

omiSimulationStep. This callback is executed when simulation time reaches a specified activation point.



Figure 2.16: Event-driven simulation cycle with callbacks, [IEE99, p. 30, Fig. 7]


Figure 2.17: Cycle-based simulation cycle with callbacks, [IEE99, p. 32, Fig. 8]

omiCycleSensitivity, omiNetSensitivity. These two functions react to changes in the values of nets. They are called at most once in a given simulation cycle, even if more than one change has occurred. omiCycleSensitivity is responsible for scheduling activity with unit delay, and omiNetSensitivity handles changes on one or more nets.

omiNetUpdate. This callback determines which nets are changing and simultaneously obtains their new values. It does not present a stable view of the OMI network, because it may be invoked during network propagation while a transaction is in progress.

The OMI standard additionally includes a concept that allows reverting to a previous state. This does not only mean that it is possible to reset the session to its beginning state, but also to return the session to a previously saved simulation state. This of course requires that the model manager and all attached models support state saving and loading; a serialization concept has to be implemented in all parts. The corresponding callbacks for the reversion functionality are omiReset, omiBeginSimulation, omiSave and omiRestore; all do what their names indicate. Additionally there are two functions for saving and restoring opaque references, usually implemented as pointers: omiSaveReference and omiRestoreReference.
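The cycle-based restriction discussed above – outputs are computed from the current state plus the inputs, never combinationally within the same cycle – can be illustrated by a small sketch of a post-propagation routine for a hypothetical counter peripheral. All names are illustrative and not part of the OMI.

#include <stdint.h>

/* Hypothetical cycle-based model state: outputs depend only on the state that
   was valid at the beginning of the cycle, never directly on current inputs. */
typedef struct {
    uint16_t counter;   /* current state                   */
    uint16_t output;    /* value driven on the output port */
} counter_model;

/* Post-propagation step: sample the input, compute the next state, and derive
   the output from the old state (no transparent data path). */
static void counter_post_propagation(counter_model *m, int enable_in)
{
    uint16_t next = m->counter;
    if (enable_in)
        next = (uint16_t)(next + 1);

    m->output  = m->counter;   /* output reflects the state of this cycle        */
    m->counter = next;         /* new state becomes visible in the next cycle    */
}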


2.2.3.5 Termination Stage

The final stage’s task is to gently shut down an OMI session. Termination is initiated for one of three reasons: There may be an explicit request, which can mean the session is ended by user action. The second possibility is the occurrence of a fatal error during execution, and the last one is the equivalent of natural death – the normal completion of the simulation due to the absence of any further scheduled activity. Termination callbacks of the model manager shall be called by the application to release resources like files or sockets. After the execution of the termination callbacks no further communication between the application and the model manager can take place, as handles and data structures are invalid. The application may also receive a control request for termination for a particular model manager. There are two possibilities of reacting to such a request: either it terminates the entire session, or it terminates the session only for the particular model manager and continues simulation with the remaining ones. With both choices the simulator terminates the particular model manager and ceases communication with it.

2.3 Model Verification

Of course, if a VC is created, it has to be ensured that it works correctly. This task is very complex, because there are different chip designs, different development environments, different physical implementations and many different tools. All these discrepancies have to be flattened to produce one final result: the answer to the question whether there are any errors to fix. It is quite obvious that there are several methodologies and techniques to verify VCs. They can be split up into four categories (all quotes taken from [VSI01a, p. 2]):

Intent Verification. “Process of determining whether a design fulfils a specification of its behavior.” The question is: Is it what we wanted to do?

Equivalence Verification. This is the process of determining whether multiple levels or formats of a design match in terms of functionality. Mostly the newly created model is verified against the “golden model”. The question is: Is it what we had to do?

Integration Verification. “Process of verifying the functionality of a system-on-chip (SoC) design that contains one or more virtual components.” The question is: Is it usable?

VC Verification. “Process of verifying the functionality of a virtual component, i. e. unit test of that component.” The question is: Does it work?


Without doubt there is a large overlap between these categories. Many processes are shared, e. g. between VC and integration verification, although the actual test code will differ for the different tasks. The VSIA Functional Verification Development Working Group has produced a taxonomy document reflecting the verification classes and their techniques, see [VSI01a]. In the subsequent sections the four main categories and their methodologies will be discussed. For a specific test usually not only one technique is used but a combination of several of them. Some depend on others, whereas some exclude each other.

2.3.1 Intent Verification

There are different techniques leveraged for intent verification, covered by several tools I will not mention here.

2.3.1.1 Physical Prototyping

The most direct approach for verifying a VC is to cast it into silicon. This device, while still being a model, comes close to target platform performance due to its hardware nature. With a physical prototype several tasks can be accomplished. The application and system software can be developed and debugged before the real SoC device is available. As this is possible with other techniques explained later, too, it will not be the main justification for the immense effort in time and cost. More interesting are tasks like system-level performance testing as well as enabling exhaustive test cycles by serving as a high-performance platform simulation for the target design. It is also needed for hardware and software co-verification and for providing a logic analyzer interface. But the only task listed in [VSI01a] that is important from my present point of view is that physical prototypes serve the marketing demonstration of the target device. And with the further evolution of virtual prototypes this task will dissolve, too. Implemented as real hardware, a physical prototype operates within the same order of performance as the target system and so is much faster than any software simulator. There are different sub-methods used for physical prototyping:

Emulation Systems. This technique adopts an emulation system as discussed in section 2.3.1.2 on the next page.

Reconfigurable Prototyping System. The different VC building blocks are implemented separately and assembled on an integration board. They are realized either as bonded-out silicon, non-volatile FPGAs or via in-circuit emulator (ICE) systems.


Application-Specific Prototype. For physical prototyping with an ASP a complete design must be developed using commercially available components; it will have limited expansion capability. As the package is very specific, tests based upon this technique are very costly.

2.3.1.2 Emulation

One small step away from pure hardware tests is emulation. An emulator usually consists of an FPGA and its connection to some other parts of the system. The execution speed is very high in comparison to all software solutions that will be explained later. Thanks to the high speed it is possible to perform even large and computation-intensive verification tasks, like running operating systems, which often comprise millions of lines of code, on a chip. An emulator can be implemented in different ways. Mostly there are one or more interconnected FPGAs, in addition to some custom processors and interfaces that allow controlling the software and the hardware as well as debugging the system.

2.3.1.3 Formal Verification

Far away from real hardware, formal verification utilizes mathematical techniques that allow checking the functional aspects of a design. A verification test suite is not required, since the design is analyzed purely mathematically. The purpose of formal verification includes equivalence checking, explained in section 2.3.2 below. For intent verification the following methodologies are applicable:

Property and Model Checking

This technique probes the entire state space of a design. All possible input conditions are explored, finding bugs that probably cannot be found with simulation. The model properties are verified using queries in a specification language. In case an error occurs, the complete path from an initial state to the failing state can be traced. In most designs not the entire state space has to be evaluated, because there is a set of input constraints. Compliance with these constraints has to be ensured by the model checker. For these bounds there is an extra constraint language, too. Of course, if the data path of an IC is very long, formal verification can be ineffective due to the wide and deep state space a long path implies. Better suited are control-intensive designs with a usually narrow state space. A model checker can also be useful within simulation, verifying assertions for certain conditions.

Theorem Proving

A specification language based on a formal logic and a set of strategies are the requirements for theorem proving verification systems. The strategies typically are present in the form of commands and allow constructing a proof of an assertion in the logic.


The kind of formal logic as well as the level of automation varies widely among theorem proving systems. This sort of verification is typically done by first describing the design model and the property under test in the specification language. Then a proof of the correctness criterion can be constructed. The size of the state space is not a problem here, allowing also the verification of data-path-oriented designs and high-level applications. This technique is used in equivalence checking, too, by forming an appropriate assertion. The biggest disadvantage of theorem proving is that it does not provide good possibilities for automation like model checking does. The proof has to be constructed interactively. In case of a failure in proof construction there is no automatic trace; the cause of the failure has to be found manually by analyzing the failed state.

2.3.1.4 Dynamic Verification

This is a very practical verification technique, using a set of stimuli to exercise one or more models of a design or a hardware implementation of the design. There are several sub-categories within dynamic verification:

Deterministic Simulation

In deterministic simulation a well-formed stimulus is fed into the model. The response from the model can be predicted, so the expected and the real response are compared for verification.

Random Pattern Simulation (Directed and Non-Directed)

By applying random address, data and control values as stimulus to a model, the robustness of a design is tested, i. e. whether false inputs lead to undefined states. Often one or more bus protocol checkers are employed to verify that no bus protocol violations occur. In contrast to bus protocol checking, where undirected random patterns form the stimuli, it is also possible to take directed random patterns as stimuli, which means the patterns are still random, but within certain boundaries. The operations are not purely random but generate actions to stress the design in specific ways, e. g. reading, writing or calculations (see the sketch at the end of this subsection). Algorithmic errors can be found very quickly using random pattern simulation, as boundary conditions are reached and very different operations can take place in an order that was never thought of.

Hardware Acceleration

One or more components of a design are mapped into hardware for performance reasons. Still being purely software based, the verification test bench profits from the speed-up of certain operations. The grade of hardware acceleration varies; even entire simulations could run in hardware.

Hardware Modeling

Hardware modeling means that components that are not available or insufficiently accurate are run in a hardware modeler, interacting with the software simulator. So this technique can be used in combination with any of the other ones.

Protocol Checkers

A protocol checker is a set of valid operations that can be monitored as transactions on an interface. With a protocol checker this monitoring can detect invalid operations and flag them as errors. A protocol checker is typically embedded in a verification test bench for simulation purposes. Embedding in the design is possible, too, to check for violations also during normal operation of a device.

Expected Result Checkers

Every system verification test bench needs to compare the gained responses to the expected results. Any discrepancies will be flagged as errors.
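As an illustration of directed random pattern simulation, the C fragment below sketches a stimulus generator that stays within certain boundaries: it only produces read and write operations inside a bounded address window. Structure, names and the address range are hypothetical and not taken from any particular test bench.

#include <stdint.h>
#include <stdlib.h>

/* Hypothetical directed random stimulus: operations are random, but
   constrained to reads and writes within a legal address window. */
typedef struct {
    int      is_write;
    uint16_t address;
    uint8_t  data;
} bus_operation;

static bus_operation next_directed_random_op(void)
{
    bus_operation op;
    op.is_write = rand() % 2;                             /* read or write          */
    op.address  = (uint16_t)(0x0800 + rand() % 0x0100);   /* stay inside the window */
    op.data     = (uint8_t)(rand() & 0xFF);
    return op;
}

Each generated operation would be applied to the model under test, while a protocol checker and an expected result checker observe the responses.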


2.3.1.5 Virtual Prototyping

Within several of the techniques introduced in the previous sections, virtual prototypes could be used as simulators. Other modeling terms differentiate models based on their characteristics, whereas the term virtual prototype has nothing to do with any particular model characteristics; it rather relates to the role of the model within a design process. A virtual prototype’s purpose is to:

• Explore design alternatives
• Demonstrate design concepts
• Test for requirements satisfaction and correctness.

As already discussed in section 2.1 on page 7, a virtual prototype is not tied to a special level of abstraction but can be constructed at any level. Typically a virtual prototype defines the interfaces to the rest of the design and so can be integrated into it, without disclosing its internal structure, which can be designed in a wide variety of ways. A virtual prototype can provide a detailed view of how a component will work without the need to design complex hardware just for figuring out alternatives, and so is more cost effective. Since the model exists as pure software, a very detailed view into the states of the component is possible, depending on how the model is implemented. The biggest disadvantage of virtual prototypes is that their speed is within simulation order rather than hardware order, so fewer verification tests can be run in the same time than on real hardware.

2.3.1.6 Verification Metrics

Verification itself has to be verified to gauge whether it is complete or not. Different points are to be tested.

Hardware Code Coverage

The first group is the hardware code coverage metrics. These can be determined automatically, producing as result a value for the percentage coverage of each property assessed and a list of those areas that are unexercised by the test. Coverage metrics can be applied to the following points (a small illustration follows at the end of this section):

Statement coverage. Counts how often each statement was executed.

Toggle coverage. Regards the toggled bits of the signals.

FSM arc coverage. Counts the transitions of the finite state machine.

Visited state coverage. Shows how many states of an FSM were passed.

Triggering coverage. Checks whether each process has been started uniquely by each of the signals.

Branch coverage. Shows which branch was selected in case of a decision.

Expression coverage. Explores the degree to which a Boolean expression in a conditional statement has been exercised.

Path coverage. Traces the routes the execution has taken through the branches.

Signal coverage. Shows how well addresses or state signals have been exercised.

Functional Coverage

The second group of metrics is the functional coverage. This cannot be derived automatically, as it measures the amount of features that have been verified. Generally the functional coverage can be described as a cross-combination between temporal behavior and data. This metric is very important, as it ensures that all features given by the specification are verified, ensuring that all bugs can be found. Typically the functional coverage is analyzed on the RTL view of the design.
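As announced above, the difference between statement, branch and expression coverage can be illustrated with a small, purely illustrative C fragment.

/* A single test with err = 1, irq = 0 executes every statement (full statement
   coverage), but takes only the "true" branch of the condition (incomplete
   branch coverage) and never exercises the case err = 0, irq = 1 of the
   Boolean expression (incomplete expression coverage). */
static int classify(int err, int irq)
{
    int severity = 0;
    if (err || irq)       /* expression coverage looks at err and irq separately */
        severity = 1;     /* branch coverage: "true" and "false" branch          */
    return severity;
}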


2.3.2 Equivalence Verification

During the development process of a design, several models with different levels of abstraction are created. The lower the level of abstraction gets, the more refined the models become. Of course each model has to describe the same functionality and the same original design intent, so the models have to be equivalent. The group of methods for proving this is called equivalence verification. The techniques in use are partly similar to those for intent verification.

2.3.2.1 Physical Verification

Again the most direct approach regards the hardware. Here, three distinct checks are used to verify that the physical implementation is a correct realization of the original logic design:

Electrical Rules Checks (ERC). The center of interest in this check is electrical correctness, i.e. that no electrical design rules are violated. Violations could be open and short circuits as well as unused outputs, floating inputs or loading violations.

Design Rules Checks (DRC). This concerns the process design, checking layer-to-layer spacing, overlaps of layers and their width. The rules are specified in a DRC rule file.

Layout Versus Schematic Checks (LVS). This requires a “golden netlist” to be present. From the physical layout a netlist is extracted by taking polygons and building devices; it is then compared to the “golden netlist”. Typically the LVS is performed as a last step before mask generation and fabrication.

2.3.2.2 Formal Equivalence Checking

Functional equivalence regarding the I/O boundaries and the cycle accuracy can be verified using formal equivalence checking tools, usually operating on RTL or gate level netlists. The benefit of the formal approach is that complete equivalence can be ensured, as opposed to simulation, which depends on the completeness of the test bench and takes more time to execute. Two different approaches for formal equivalence checking are known:

Boolean Equivalence Checking. This is the technique used in most cases, checking combinational logic. Automatic name mappings are made between the memory elements of the two designs. After this the tool compares the combinational logic at the input to each pair of memory elements. In this way it can be verified that for all possible combinations of inputs, the outputs are the same for both systems; a small brute-force illustration of this idea follows below.
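As a minimal sketch of what a Boolean equivalence check establishes, the following C program – an illustration, not a real checker – proves two differently written carry functions of a full adder equal by enumerating all input combinations, which is what a formal tool does symbolically for entire logic cones:

    #include <assert.h>

    /* Two functionally identical pieces of combinational logic: the carry
       output of a 1-bit full adder, written in two different ways.       */
    static int carry_a(int a, int b, int cin) { return (a & b) | (cin & (a ^ b)); }
    static int carry_b(int a, int b, int cin) { return (a & b) | (a & cin) | (b & cin); }

    int main(void)
    {
        for (int a = 0; a <= 1; ++a)
            for (int b = 0; b <= 1; ++b)
                for (int cin = 0; cin <= 1; ++cin)
                    assert(carry_a(a, b, cin) == carry_b(a, b, cin));
        return 0;   /* all eight input combinations produce the same output */
    }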


Sequential Equivalence Checking. If two designs that are to be proven equivalent differ in the arrangement or number of components while fulfilling the same specification, this is called sequential equivalence. An example could be two FSMs encoding 8 states, where one uses 3 latches and the other one 8. They produce the same outputs although acting in different ways. There exists only little tool support for sequential equivalence checking because only small finite state machines can be explored. For large designs sequential equivalence checking is possible in theory, but there remains an unsolved implementation problem.

2.3.2.3 Dynamic Verification

There are two techniques within dynamic verification that are the same as for intent verification. Deterministic simulation is explained in section 2.3.1.4 on page 50. The same test bench with the well-defined stimuli can be applied to different models to check whether the results are the same. Expected Result Checkers, introduced in section 2.3.1.4 on page 51, are used, too. There are three techniques not introduced before:

Golden Model Checkers. A “golden” or trusted model is a model that is assumed to be correct. The golden model checker monitors the responses of the golden model and the model under test. By comparing the responses to the input stimuli it can be determined whether the model to be verified is correct. No formal techniques are used; simply the behavior of the two models is explored.

Regression Testing. Regression testing is a technique that can be applied to all simulation techniques with two pre-requirements: First, all electronic design automation (EDA) tools, verification test benches and result analyzers must be able to run in batch mode without user interaction. Second, the decision between success and failure of a test must be possible in batch mode, too, usually by comparing responses against golden results. These requirements allow a setup where many tests can be run automatically. Regression tests allow checking whether design changes cause any existing verification tests to fail. As design changes also tend to implement new features, the test suites tend to grow.

2.3.3 Verification Test Suite Migration

With different views of the design it is necessary to migrate verification test suites to a format suitable for the other view. The stimulus has to be translated from one level of abstraction to the other; then the two suites can be applied to the models to find the points of divergence between the two result sets. With the additional level of detail a new version of the test suite can be extracted. Before describing different migration


paths, restrictions must be introduced to ease the migration from functional to lower levels:

Bit-true data representation. Data values in “C” do not imply a bus width concept; using such a concept within modeling at the functional level ensures convergence of results.

Fixed- and floating-point transparency. The same problem applies here: a fixed-point implementation helps in aligning the different implementations.

Now two common paths of migration will be introduced:

Functional to RTL Migration. The functional level is usually realized using tokens. For migration, the tokens have to be translated into pin- and bus-level cycles with the associated clocks. The original test suite can be compared with the derived one by regarding the memory contents of each model.

RTL to Netlist Migration. The RTL verification test suite is bus-based, so it has to be translated into a bit- and pin-accurate stimulus. Again the results of the former and the current suite are compared; the difference is that the comparison takes place at the end of each cycle.
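The bit-true data representation restriction above can be made concrete with a small, hedged C sketch: a functional model that is supposed to converge with, say, a 12-bit RTL datapath must not silently rely on the full width of its C integer type. The 12-bit width is an invented example value:

    #include <stdint.h>

    /* Invented example: a functional model of a 12-bit wide datapath.
     * Masking every result to the modeled bus width keeps the functional
     * results bit-true, so they converge with the later RTL implementation. */
    #define BUS_WIDTH   12u
    #define BUS_MASK    ((1u << BUS_WIDTH) - 1u)   /* 0x0FFF */

    static uint16_t bus_add(uint16_t a, uint16_t b)
    {
        /* Without the mask, the 16-bit C type would happily carry into bit 12
         * and beyond, which the real 12-bit bus cannot do. */
        return (uint16_t)((a + b) & BUS_MASK);
    }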

2.3.4 VC Verification versus Integration Verification

VC verification utilizes no techniques of its own but covers both intent verification and equivalence checking. Generally, a higher level of abstraction allows higher performance in testing. With RTL being the highest level of abstraction that many tools such as model and equivalence checkers support, it is the typical level for intent verification. Equivalence checking is usually done at the lowest levels of abstraction such as netlists. Like VC verification, integration verification is rather an application of the techniques than a technique of its own. The difference between model checking and integration verification is the different view: the whole SoC has to work. In most cases gray-box models will suffice, as they accurately model the VCs' interfaces.

2.3.5 Summary: Functional Verification Mapping

This section summarizes the different techniques and their usage. Table 2.2 shows which verification step utilizes which technologies on which models.

2.4 Octopus

After all the theory it is now time to go into practical experience. In the previous sections many aspects of simulation and modeling theory were discussed.


Table 2.2: Functional Verification Mapping

Verification Step      | Verification Technology            | Models
Hardware Intent        | Simulation                         | Functional, Behavioral, RTL
                       | Emulation                          | RTL
                       | Model Checking                     | RTL
                       | Theorem Proving                    | RTL
                       | Physical Prototype                 | Behavioral, RTL, Logic
                       | Virtual Prototype                  | Functional
                       | Code Coverage                      | RTL
Software Intent        | Hardware/Software Co-Verification  | Behavioral, RTL
                       | Emulation                          | RTL
                       | Physical Prototype                 | Behavioral, RTL, Logic
Hardware Equivalence   | Simulation                         | Functional, Behavioral, RTL, Logic, Gate, Switch, Circuit
                       | Emulation                          | RTL
                       | Equivalence Checking               | RTL, Gate
                       | Fault Coverage                     | Gate
                       | Physical Verification              | Geometric database, Circuit, Switch, Gate
Software Equivalence   | Hardware/Software Co-Verification  | Behavioral
                       | Emulation                          | RTL
                       | Physical Prototype                 | Behavioral, RTL, Logic


In section 2.2 on page 21 the concept of model managers, directing a set of models of several hardware components, was introduced. Octopus is Motorola's implementation of such a model manager, being the interface layer between a simulation application and the simulation models. The intent of Octopus is to allow rapid virtual prototyping of micro controllers with all their peripherals in different simulation environments. The usability in various simulation environments is important in order to benefit from reusing models throughout the design flow. The particular models are completely independent of the simulator; all they know is the Octopus interface. The main debugging features of Octopus – all explained in detail in subsequent paragraphs – cover all requirements for full-chip simulation: the user can activate multiple, model-specific debug levels, to be used in model development as well as model assembly; the model manager commands, as mentioned in section 2.2 on page 21, enable the user to inspect and manipulate the models' status during the simulation; and an interface allows generating and reusing Value Change Dump (VCD) files according to IEEE Std 1364-1995, chapter 15 [IEE95]. In order to test and validate single models in stand-alone mode there is also a test shell available, named Octopussy. Octopus itself as well as all models are written in C/C++.

2.4.1 Event Based Simulation with Octopus

Octopus' high simulation performance results from its rigorous implementation of the event-based simulation paradigm. Each single model is only evaluated if there is a possible change in its state. There is not only an abstraction of time but also an abstraction of data, so in addition to simulating only certain cycles, even complete bus signal structures can be centralized in one model port and thus processed as a single value. The values are called tokens; they include data and may include timing information as well. All this together leads to a high simulation speed. Yet, with all the time and data abstraction, the simulation stays cycle and bit accurate, as all the information is present anyway, only pooled for processing.

2.4.1.1 Models and Signals

Usually the models reflect a virtual component, i.e. a portion of a chip. They call Octopus functions and provide callback routines to be called by Octopus; they may never use any functions of other models. This allows the models to be reused in other chips. All the models are tied together using the configuration file, a C source file that includes structures and arrays for the port lists and function pointers for all models used. With the information in the configuration file, the models are registered and instantiated in Octopus in the elaboration stage. Instantiation means that a model can be used multiple times, representing the fact that a component is present more than once in a system.


[Figure 2.18: General Concept, [KLR99, p. 8, Fig. 1-1] – block diagram of the Octopus library with its model instances, each consisting of a functional kernel and an interface layer with ports, connected via component signals and component buses, plus external ports and external signals towards the processor and the outside world]

Also available is the so-called “base class”, a system of C++ classes that allows the easy implementation of micro controller peripherals. It supports control registers in addition to the bit and complex signals and the bus interface that come along with Octopus. The registers are usually updated as a result of certain state changes, but there is also a possibility to gain “backdoor” access to the registers, enabling a debugger to read values without any side effect and to modify them without advancing the timeline. The term “base class” hereafter refers not to a single base class as known from object-oriented programming languages but to the system of classes giving easier access to Octopus from the model view. A basic concept in Octopus is the signal technology. A signal is identified by a name that is unique in the particular hardware system. It is attached to an arbitrary number of component ports, one of which is external. A signal itself has no data type; it only establishes the connection without specifying the data type. This is both advantage and disadvantage in one: on one hand it is difficult to ensure type safety when reading and writing signals in different components; on the other hand, the signal can carry type information, which gives a signal the power to carry different, specified data types at different times, because the type is always transferred together with the information. The ports that signals are attached to can be configured as input, output or bi-directional. The other basic concept is the bus interface. It allows assigning a memory range to a component. If the bus driver generates a bus access, Octopus can determine to which model instance the event should be sent. So a component has a bus port, which is a special kind of port using the pre-defined bus signal data structure. Octopus itself is controlled by an application. The application's main task after the elaboration session is to stimulate the advance of time by calling the evaluation


function with the number of cycles to be simulated. All necessary actions are then performed within this call, so when control is returned, the complete system under Octopus is in the state it should be in at the given offset.

2.4.1.2 Simulator Interface

Due to its nature as a model manager, Octopus has two interfaces to be described in this section: the simulator interface for the application side, and the model interface for the model side. The simulator interface consists of a set of functions and callback definitions. The easiest way to describe the interface is to go through the three sessions like the table in the basic description of [Roh01b]: elaboration, simulation, and termination. As the first action in the elaboration session, the application prepares a data structure including function pointers to the callback functions that the application implements. This structure is passed to the motInit function, which registers the callbacks and initializes internal variables. Before returning, it calls the motInitialSetup function that resides in the configuration file of the modeled system and registers the models. Because Octopus now knows the callback addresses of the application, it can invoke the Prepare function of the application. Here the application is to set up the system configuration regarding the simulator itself. As we will later see in section 4.2, this is also the ideal place for manipulating model registration information. Back in the motInit function, Octopus processes potential command line arguments and the system configuration. The debug commands of the models are registered before the Init callback of each model is executed. The last action of motInit is to let the application perform the setup of its interface with the information of the already registered models. The next step for the simulator is to define the clock frequency and then to reset the model manager with the motReset function. motReset in turn invokes three application callbacks. The first one, EnterReset, takes place before Octopus resets time and ports. After the second one, Reset, all model initialization callbacks are invoked, followed by the LeaveReset callback. Finally the motReset function informs all models that the reset stage has ended. This system of callbacks calling other callbacks back and forth may appear chaotic and redundant at first sight, but it ensures that all possibilities are given to perform initialization actions at various times on both sides – application and models. For the simulation session the most important functions are motEval, to tell Octopus to work off a certain number of cycles, and motGetNextEvent, to determine when the next event will take place. With these two functions the application can easily perform its own actions, let Octopus react, and then find out when Octopus will do the next step. Of course these two functions alone do not make much sense if there is no data exchange. Ways of data exchange are explained in the practical part. The models may use debug output functions; for each one there is a callback in


the application interface, like the Debug or Message callbacks. In order to use model manager commands, the application interface can implement a possibility to accept user interaction and then pass it to Octopus with the motUserCommands function. If the application wants to terminate the simulation session, it calls the motExit function of Octopus. Now all dump files and stimulus files are closed and the Close callback of the application is invoked, which reverts all actions of the Setup callback in the elaboration session. Then Octopus destroys all internal structures created by motInit and raises the Exit callback of each model. The models terminate, and then Octopus executes the application's Cleanup function, the opposite of the Prepare routine. Now Octopus can close all message and log files and invoke the simulated system's ExitCode callback. Rarely used, it would close any globally opened files and release globally allocated memory. This is the very last function to be executed; after it, the motExit function returns, too.

2.4.1.3 Model Interface

Towards the model side, the interface of Octopus is much more refined, due to the various mechanisms that are necessary to let the models interact through Octopus. First, all callbacks a model has to register will be presented; later on, a selection of functions will be discussed. The exact type and function definitions can be read in the Octopus reference [Roh01a]. There are six callbacks a model has to provide. Function pointers to the callbacks have to be passed to Octopus when starting the session. As already mentioned in the previous section, the Init callback performs actions at the beginning of certain stages. Along with the function call always comes the TaskTypeT parameter, which reflects the cause for the call: it could be initialization, power-on, or a reset. At the end of these tasks the Exit callback is invoked. This is important because a reset, for example, could take several cycles, so it may not be possible to perform all necessary steps at once when full accuracy is to be preserved. The Update callback is activated when any of the input or bi-directional ports that are sensitive to modifications of an attached signal has been updated. The call carries only the pure information that something has happened, but not what. The function must determine this itself by calling the appropriate Octopus functions, as there may be more than one updated port. When not used, this function can be omitted. Very similar to the prior one is the Access callback, which is called when a bus access takes place. This is a separate function because the mechanism is different: signals are attached specifically to a model, whereas the bus passes all models. Octopus differentiates the targets by the memory address in the memory map. It is possible to overlap memory ranges; then all models associated with a particular address are accessed. The Access callback may be omitted. The Backdoor callback allows debug accesses while providing basically the same


interface as Update or Access. A model may schedule events only for itself. Octopus then executes these events by raising the Eval callback. The function carries a 31-bit wide TaskTypeT parameter that can be used to forward information about which actions are to be performed. This parameter is the combination (bitwise OR) of all task types that were scheduled for the particular cycle. So it becomes clear that there are not 2^31 possible actions but only 31. If the model does not schedule any events, this callback can of course be omitted. Although these callbacks have to be implemented separately for each model, in most cases the implementations will pass the information to the base class, which processes it and allows updates and events to be received in special handler functions registered in the base class at model instantiation time. The base class, e. g., encapsulates the Access callback in such a way that it is possible to define register-based access handler functions, one each for read and for write access. The analysis of the information that comes along with the Access callback does not have to be programmed manually but is done by the base class. In model creation, a handler function for a memory-mapped register is defined, registered, and called by the base class when exactly that register is accessed via the bus. More details on the base class will follow in the subsequent sections. Important for the model interface is the signal concept, introduced in one of the previous sections. A signal, in most cases, is a logical connection between two ports of different models, or between a port of a model and the outside world. Input ports receive update events, pre-processed by the base class, via handler callback functions. Output ports are written from model functions using the motPutPort function. Signal information, read or written, consists of a variable of type motSignalValueT and a pointer variable of type motTokenT, generally specifying the type of signal and the address of the carried data. motTokenT technically is simply a void pointer, so it would be erroneous to use it without the type information, as it may e. g. point to a double value reflecting an analog signal, to an integer reflecting a data line, or to a complex data structure. A special case is the usage of bit lines: for the simple information carried, it would be inefficient to use pointers, so the pointer here always has to be NULL; the actual value is ASCII-coded in the motSignalValueT number. The possible values for bit lines are '0', '1', 'Z' for tristate and '?' for the “undriven” state. The full bit signals of course again consist of the ASCII value and the NULL pointer. This concept of signals has the big advantage that it defines a communication channel without restricting the type of information using that channel.
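A minimal sketch may help to visualize the type/token convention. The function name motPutPort and the type names are taken from the text above; the prototype, the port-handle type and the NUMBER type constant are assumptions made only for illustration, since the exact declarations live in the Octopus headers and are not reproduced here.

    // Stand-in declarations (assumptions) so the fragment is self-contained.
    #include <cstddef>

    typedef unsigned long motSignalValueT;   // value / type part of a signal
    typedef void*         motTokenT;         // stated above: technically a void pointer
    typedef void*         motPortHandleT;    // assumed opaque handle for a port

    // Assumed parameter order: port handle, cycle offset, value, token.
    void motPutPort(motPortHandleT port, unsigned cycleOffset,
                    motSignalValueT value, motTokenT token);

    // A bit line: the value is the ASCII code of '0', '1', 'Z' or '?',
    // and the token pointer is always NULL.
    void driveReadyLine(motPortHandleT readyPort)
    {
        motPutPort(readyPort, 0 /* no delay */, motSignalValueT('1'), NULL);
    }

    // An analog value travels behind the token: the token points at a double,
    // while the value field carries the type information (numberType being a
    // hypothetical constant for the NUMBER signal type).
    void driveAnalogChannel(motPortHandleT channelPort, motSignalValueT numberType,
                            double* storage)
    {
        *storage = 2.56;   // volts; kept alive by the model, not on the stack
        motPutPort(channelPort, 0, numberType, static_cast<motTokenT>(storage));
    }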

2.4.2 Execution Logic

In this section, the flow of execution during the simulation session will be discussed. The actions of the base class will be clarified, as well as the correlation of model action


and model updates.

2.4.2.1 Event Queue

We presume that the elaboration session and the reset process have already been finished and simulation is about to begin. Even before the application triggers the evaluation of the first cycles, there may be pending events in Octopus' event queue, scheduled in the initialization stage of a model. A model can schedule events only for itself; it uses the motNextEval function with the cycle offset and a TaskTypeT value as parameters. In earlier versions of Octopus there were – in addition to the normal event-driven models – so-called active models, which were evaluated automatically every clock cycle. The concept was used for CPU models, which change their state in each cycle. The active models were discontinued because the administrative and performance expense is basically the same as for self-scheduling models. Especially for CPU models there are many instructions that take more than one clock cycle, e. g. due to memory speed, so it would be a waste of performance to evaluate the model in the meantime. Without active models there is only one type of model execution logic, making the structure and maintenance of Octopus easier. So if the event queue is empty, Octopus advances the simulation time and returns control to the application. In the other case, the time is set to the point of the first event in the queue and the corresponding models are evaluated, invoking the Eval callback of each model with a pending event. The model now determines the action it has to perform by interpreting the bits in the 31-bit TaskTypeT value. This is actually a 4-byte integer with the MSB being used by Octopus internally. Depending on the action to be performed, the model could now initiate an update of a memory-mapped module register by using the base class function motSetReg16 for 2-byte registers. The base class modifies that value in an internal data structure. The next bus read access to the address of the register will receive the new register content.

2.4.2.2 Port Update

The callback could update an output port using the motPutPort function of Octopus, or – better – the base class encapsulation SetPin. The latter has the advantage that only the port index, instead of the port handle and time offset, has to be passed besides the type-token combination. In the configuration file, a signal is attached to each of the ports by name; Octopus now searches the netlist for this signal name to ascertain to which ports the signal is assigned on the input side. Now it is known to which port the update has to be delivered. The offset specified for the output port reflects the number of cycles by which the port delays its output. After this time Octopus informs the other model, using the Update callback. As usual, the update information is parsed by the base class, which then knows exactly which port has been


updated. It raises the handler callback function that has been registered in the base class for receiving updates on that specific port. If the application, or respectively the user, has decided to monitor the signal, the update is passed to the signal dumper, too. The signal dumper is the piece of code that produces signal dump files, e. g. in VCD format, for all signals selected for monitoring. The update handler function for the addressed port receives the type-token pair of information and can easily react to the signal change, possibly by scheduling an event using motNextEval.

2.4.2.3 Bus Access

Unlike normal signals, which are usually connected to two ports only, a bus signal passes several models but addresses only a particular one. A “bus communication consists of two successive communication events” ([KLR99, section 3.4.5]): The bus master, usually the processor, requests a read or write access to a memory-mapped register of some component. The generated request keeps all necessary data, like the address of the accessed register, the data if it is a write access, and bus control information. Octopus parses the bus token generated by the master and sends it to the Access callback of the appropriate models, depending on the entries of the memory map. The base class takes the token and passes the extracted information to the read or write handler function assigned to the specific register. The addressed component responds, either acknowledging the access or signaling an error. Along with the acknowledgment comes the data if it has been a read access. In the implementation, a handler function for write accesses to a particular register could be registered. Within this handler function, the change of state of the module takes place. The base class generates the acknowledgment and returns it to Octopus.

2.4.2.4 Debug Functionality

Generally, there are two versions of Octopus used for simulation. In the model development process, the debug version allows sophisticated debugging, while for model delivery, when performance is most important, the fast version without any debug code is linked. The difference between the debug and the fast version lies not only in the different library files of Octopus, but also in the header files. All debug commands are C macros, like the command for printing debug information:

motMODELDEBUG1PARAM(level, string, param1)


Of course, when using macros it is not possible to pass a variable number of parameters as one would do with a printf statement, so there are macros for zero to nine parameters. The level parameter specifies the debug level. There are nine levels available for each model, so for execution it is feasible to run different models at different debug levels to obtain the level of detail of information needed. If the developer wants to use a whole code block as debug code, he can embed it in model debug brackets:

motMODELDEBUG_START(level)
    ... code ...
motMODELDEBUG_END

When compiling and linking for debug, the brackets are substituted with the debug level check code; when compiling for delivery they are replaced with an #if 0 ... #endif bracket to deactivate the code completely. While it is generally not desirable to have many macros in a source code, here macros are the means to make debugging convenient when debugging is wanted and to make the debug code nonexistent when models are delivered. Model manager commands are supported in Octopus by allowing models to register their own commands. With this technique, models can e. g. implement debug commands to show internal states, or also manipulate them. All commands follow a common syntax.

In later subsections of chapter 3.1 on page 73, examples of debug commands will be presented.
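Putting the execution logic and the debug facilities together, the following hedged sketch shows what the body of a self-scheduling model's Eval callback might look like. motNextEval, motSetReg16 and the debug macro are named in the preceding sections; their exact prototypes, the task bit values and the register offset are assumptions made purely for illustration.

    // Stand-in declarations (assumptions); the real ones come from the Octopus
    // and base class headers.
    typedef long TaskTypeT;                                      // 31 usable bits
    void motNextEval(unsigned cycleOffset, TaskTypeT task);      // assumed prototype
    void motSetReg16(unsigned regOffset, unsigned short value);  // assumed prototype

    #ifndef motMODELDEBUG1PARAM                                  // fallback so that
    #define motMODELDEBUG1PARAM(level, fmt, p1) ((void)0)        // the sketch compiles
    #endif                                                       // stand-alone

    static const TaskTypeT TASK_TIMER_TICK = 0x0001;   // hypothetical task bit
    static const unsigned  COUNTER_REG     = 0x06;     // hypothetical register offset
    static unsigned short  counter         = 0;

    // Eval callback: Octopus passes the OR of all task types scheduled for this
    // cycle; the model decodes the individual bits itself.
    void myModelEval(TaskTypeT pending)
    {
        if (pending & TASK_TIMER_TICK) {
            motSetReg16(COUNTER_REG, ++counter);        // base class updates the
                                                        // memory-mapped register
            motMODELDEBUG1PARAM(3, "timer tick, counter = %d\n", counter);
            motNextEval(100 /* cycles from now */, TASK_TIMER_TICK);   // re-arm
        }
    }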

2.4.3 Octopus classification

After introducing Octopus, this section tries to classify the model manager, with all its capabilities, using the VSI SLD Model Taxonomy as presented in section 2.1.3 on page 11. First of all, Figure 2.19 shows the summarized classification. The reasons for ranking Octopus at the particular levels in each category will be discussed in the subsequent paragraphs. The classification is not for Octopus as a programming platform, but for Octopus in conjunction with the models running on it. Note that there are several differences in the internal versus the external resolution. The disparity is a result of the wide range of possibilities to implement models and of Octopus to provide access to the models. The internal view here means the models as well as their interaction with Octopus, so it covers the model manager and the models. The external view means the interaction between Octopus


and the simulator application.

[Figure 2.19: Octopus classification – internal and external resolution of Octopus on the temporal, data value, functional, structural and software programming axes]

The subsequent sections will show the reasons for the particular classification on each axis, both for the internal and the external view.

2.4.3.1 Temporal Resolution Axis

As mentioned before, Octopus provides cycle-accurate simulation of hardware. All events are usually modeled on a cycle base. If a CPU is modeled, the actions of an instruction are split up into their cycle components. These actions can be monitored from the outside, from the application, too. So the left boundary for the temporal axis is clear: cycle-accurate. For Octopus it is not possible to achieve a gate propagation resolution, because the time-driving value within Octopus is the clock. The current simulation time can be obtained using the motGetSimulationTime function, which returns hours and seconds, but this time is only a calculation from the cycles and the clock frequency that was specified in the elaboration session. The right boundary reaches to cycle-approximate accuracy. Our classification has to include this level of abstraction, because it is well possible to develop a CPU model in a way that its state is only accurate at instruction boundaries. An implementation could be that a 3-cycle instruction does all it has to do in the first cycle and then waits another two cycles, instead of fetching information in the first, processing in the second and writing back in the third cycle. Of course the external resolution cannot be higher, so the ranks for internal and external view are exactly the same on the temporal axis.

2.4.3.2 Data Resolution Axis

On the data resolution axis the internal implementation has all possibilities of abstraction except token, so it is a composite. Register values are usually stored processor-like within defined data types like UINT08 or UINT16. For signals we are more flexible. Of course, bit signals use the bit logical abstraction, in MVL4LogicData format. But the concept of the type-token combination allows further possibilities in data resolution. Signals can contain numbers, either processor-like or as general values. And even properties are possible: a module could send a token in enumeration format to


another module, where the enumerated property is then interpreted. In this case, the signal is not an exact model of the reality, but a general communication channel. This could be used in a stage of development where the specification already defines basic flows but still lacks refinement. The abstraction does not reach up to the token precision level, because in a software tool like Octopus some storage information has to be specified; software completely without storage information is impossible. The external view's range of abstraction levels is basically the same as the internal view's, but without the property precision level. A property is difficult to pass across programming language borders; there will always emerge the need to convert the property on the sending side and reconvert it on the receiving one. The actual passing will take place in a data format that is either a value or processor-like. Therefore, the external view's lowest resolution is the value level.

2.4.3.3 Functional Resolution Axis

The rank on the functional resolution axis in the internal view is ambivalent. Here it would be quite useful to split up the internal view into another scheme, where the internal view describes a single model, while the external view describes the inter-model and model/model-manager interaction, i.e. the one at the model interface boundary.

[Figure 2.20: Functional resolution of Octopus, another view – internal (model) versus external (model interface) rank on the functional axis]

The internal implementation of an algorithm does not have to be the same as it will be in the final hardware; only the results have to be the same, so it can use the algorithmic processes level of abstraction. We will later see an A/D converter where the actual value is determined mathematically, instead of approximating it step by step like the real hardware does. Despite this freedom of implementation, the model has to implement its functionality in such a way that, if the algorithm needs interaction with other models, this communication models the way it will finally be. Therefore, the external functional resolution here is set to the digital logic level. This does not mean that the external view knows the actual model implementation, but it can assume the functionality to strictly follow the specification. The application view of the functional resolution may see functional details, but does not have to know all features on each level; therefore only partial information is provided on the algorithmic processes and digital logic levels.


2.4.3.4 Structural Resolution Axis

Octopus and the attached models describe a system either completely or partially in terms of connections of large blocks, such as the ALU and register files of a CPU and its peripherals. The interaction of all the components is well described, but the structure is not described on the full implementation level, because a model simply does not contain any information on how registers or, one level below, flip-flops are connected. Here again the situation is ambivalent. A model, internally, contains no structural information: the components are well known, but their connection is left out. From the model manager view, there is some implementation information: there are the different modules and the netlist describing the signals between them. Externally, the model manager and its models may show some implementation information, like a block diagram would do, so one can obtain the netlist and the memory map, and the system can also be seen as one large block. Of course no full implementation information is available, as it simply does not exist in the models.

2.4.3.5 Software Programming Resolution Axis

The level of software programming resolution depends on the kind of models. Software programming of course requires a CPU to be present. This CPU model has to support object code by nature; it may further support micro code or even assembly code for debugging. Of course, micro or assembly code requires much more intelligence in the CPU model. Another kind of software is a stimulus file, which can be used to feed data into some signals that are attached to input ports. The stimulus can be seen as software, because models may react differently to different input data. So the data format of the stimulus file has to be ranked. For ranking let us take a VCD file, as this is a very common file format for saving signals (a short excerpt is shown at the end of this section). The file contains several header sections: a date, a version, a comment, and information on the signals the file monitors. The body is a series of time stamps, each accompanied by information on which signals change their value at this point in time. The file is not binary, which excludes micro and object code. There are plain text sections as well as the ASCII-coded body, which can be read, but not like plain C programming language statements. So the probable rank for the stimulus file is the assembly code abstraction level. This section showed that the VSI Model Taxonomy can be applied to real model implementations to classify their level of abstraction.
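For illustration, a hand-written VCD excerpt of the kind ranked above might look as follows; the signal names and times are invented, while the keywords follow IEEE Std 1364-1995:

    $date August 2001 $end
    $version hand-written example $end
    $timescale 1 ns $end
    $scope module atd $end
    $var wire 1 ! etrig $end
    $var wire 1 " an0 $end
    $upscope $end
    $enddefinitions $end
    #0
    0!
    1"
    #250
    1!
    #400
    0!

The header declares the monitored signals and assigns them short identifier codes; the body lists time stamps (#0, #250, ...) followed by the value changes occurring at that time.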

2.4.4 Octopus through OMI Eyes

The last section dealt with abstraction levels; this section will show that Octopus is an implementation of the Open Model Interface (OMI) concept, as introduced in chapter 2.2 on page 21. Before going into detail, it has to be clear that Octopus cannot


be fully compliant with the OMI standard, because there are very few applications available that implement OMI, and none of these are used within Motorola. What Octopus can do is to provide a kind of OMI-compliant interface at the application boundary, but this interface has to be encapsulated with an adaptation layer that transforms the application's semantics to Octopus'. So what would be the application's part in implementing OMI is in reality done on the Octopus side. The section headlines will be similar to those in the OMI discussion to retain the way of contemplation.

2.4.4.1 Basic OMI Concepts

One of the first properties mentioned was the way of delivery of the models and model managers. With Octopus, models are always delivered to the end user as a unit together with the model manager, as a shared library on Solaris or as a dynamically linked library on Windows. Although it is theoretically possible to include models dynamically, Octopus does not cover this with any dynamic library handling. The common process is to link Octopus with all models needed for a particular system and give the end result away as a product. Of course the implementation is much more interesting than the way of delivery, so the next thing to review is the way of bootstrapping Octopus. There is no bootstrap file, as it was considered to be packaged together with the model manager. This is quite understandable, because there is no OMI-compliant application reading and interpreting a bootstrap file, so why generate one. The adaptation to the application is made within the library, and the adaptation layer of course knows what to do. A further discussion on bootstrapping and elaboration will follow in section 2.4.4.3 on page 70. Very important in the OMI are the model boundary class and the simulation style that applications using Octopus have to support. The answer is simple but also somewhat dissatisfying: the requirements totally depend on the adaptation layer. The Octopus simulator interface supports applications with an unrestricted model boundary class and an event-driven simulation style. But the adaptation layer – actually being on the application side – can convert semantics and data to any of the other model boundary classes, too. The same applies to the simulation style. An application only capable of cycle-based simulation can direct its calls to the adaptation layer, and this will take care of pending events and simply do nothing in unused cycles.

2.4.4.2 Information Model

Another interesting piece of information is the type concept of Octopus, the way information is stored and processed. Referring to the OMI standard, the three basic terms for model boundary classes are port, parameter and viewport. The port concept is implemented using the same term – port. Octopus ports can be used not


only for the internal interaction of models, but also for the communication between Octopus and the application. For data retrieval and manipulation usually ports are used; the implementation for this is the subject of a later section. A parameter, used for tailoring model instances, is realized by means of configurable models. A user-defined structure can be passed to a model at creation time. The particular model delimits the amount of adaptability by defining the structure. This will be shown in practical experience with two models in sections 3.1.2 on page 76 and 3.1.3 on page 84. The Octopus feature named backdoor register access implements the third basic term, viewport, not literally but conceptually, allowing the application to get a module's register contents without performing a normal access that may produce side effects. An implementation for the root object and library objects is missing, but again, this would be useful only for an OMI application. The counterpart of the model object is the structure motModelEntryT, which contains properties for name, timing information and type, as well as relationships for model manager commands and the addresses of the callback functions. The model instance object is included in the structure; important for this is the relationship to the parameter object named ConfigInfo, which holds the actual model parameters. For performance reasons, Octopus does not use a class-oriented data type system as introduced with the OMI Information Model. Anyhow, there are some structure definitions, e. g. motPortT for the definition of ports, specifying name, direction, sensitivity and the initial type-token combination. Another structure definition is the motSignalDefinitionT type. It specifies signal properties, especially of signals that are connected externally. Although Octopus defines several types for e. g. port directions, debug print types and reset types, it does not inherit any data object classes. Many types are enumerations to make the numbers they hide more readable. Timing definitions are provided not in standard delay format (SDF), but on a per-model basis using the motClockEntryT structure type. It describes the correlation of model timing and the global clock with respect to multiplier, offset, times to wake up and power off, and an own clock rate; it can even define the model as clock-less. There are some macros that allow an easier configuration of models that fall into one of the four following categories:

• Models without clock.
• Models using the system clock.
• Models using a scaled system clock.
• Models using an own clock frequency.

There are no special implementations for array types or records; the common C programming language constructs are used for these.
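To make the port concept more concrete, here is a hedged sketch of how a model's port list in the configuration file might look. The text only states that motPortT carries a name, a direction, a sensitivity and the initial type/token combination; the field names, the enumerator values and the example ports are therefore assumptions made for illustration.

    #include <cstddef>

    typedef void* motTokenT;   // stand-in: technically a void pointer (see above)

    // Assumed layout of the real Octopus structure.
    struct motPortT {
        const char*   name;          // unique signal/port name in the system
        int           direction;     // assumed enum: 0 = input, 1 = output, 2 = bidir
        int           sensitive;     // assumed: does an update wake the model?
        unsigned long initialValue;  // initial type information / bit value
        motTokenT     initialToken;  // initial payload; NULL for bit lines
    };

    // Hypothetical port list of a small peripheral model.
    static motPortT examplePorts[] = {
        { "AN0",   0 /* input  */, 1, '?' /* undriven bit */, NULL },
        { "ETRIG", 0 /* input  */, 1, '0',                    NULL },
        { "IRQ",   1 /* output */, 0, '0',                    NULL },
    };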


2.4.4.3 Sessions and Execution Stages

This section will go through the five stages introduced in section 2.2.3 on page 40: bootstrap, elaboration, initialization, simulation and termination. For each stage the OMI standard will be compared with the actual Octopus implementation. Some of the appropriate callbacks for the particular stages will be mentioned, without going too much into interaction details.

Bootstrap Stage. As already discussed, there is no bootstrapping in Octopus, as the name – respectively the address – of the bootstrap function does not have to be determined. If Octopus is to support OMI applications in the future, it will be trivial to add a bootstrap file and the appropriate functions.

Elaboration Stage. The omiCreateInstance function, to be invoked for each model, has its equivalent in the Init callback function that is registered in the motModelEntryT structure. In Octopus, elaboration starts even earlier, when it invokes the motInitialSetup function that is located in the configuration file models.cpp. This function is responsible for passing an array with all model entries, another one with all signal definitions – the netlist – and a third one specifying all model manager commands. Besides these main tasks, the dump code for signal dumping is registered. After this function, Octopus knows which models are to be used and which parameters they have and need. Then it goes through the array and invokes the Init callbacks. Each model's initialization is responsible for creating its registers and ports and then announcing them to Octopus. The registers have to contain initial values, while output ports need their initial type-token combination, which leads directly to the next stage.

Initialization Stage. The Init callback does not only create the model instance, it also configures the model with the information available, using the function motGetModelConfigInfo, which returns a pointer to the configuration structure mentioned above. In the models created for this thesis it will be seen that elaboration and initialization cannot be strictly separated, because information in the configuration structure is used to determine how many registers are to be created. So there are no exact analogies to the omiDriverInitialization, omiInitializeDriver and omiNetInitialization functions.

Simulation Stage. Octopus is driven by the application, which calls Octopus' motEval function. A simulation loop looks like the one specified in Figure 2.16 on page 45; all possibilities shown there are implemented in Octopus.

Termination Stage. The termination stage was explained in section 2.4.1.2 on page 59. Ending a session is no big deal, but it has to be clear that termination must follow strict rules to avoid memory leaks or access to already destroyed objects.
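A hedged sketch of the application side of such a session may illustrate the control flow. motInit, motReset, motEval, motGetNextEvent and motExit are the Octopus functions named in the preceding sections; their parameter lists are not given there, so the prototypes below (including the return convention of motGetNextEvent) are assumptions made only to show the structure of the loop.

    // Stand-in prototypes (assumptions); the real ones come from the Octopus headers.
    struct ApplicationCallbacks;                       // Prepare, Setup, Reset, ... hooks
    void          motInit(ApplicationCallbacks* cb);   // elaboration: registers models
    void          motReset();                          // reset stage for manager and models
    void          motEval(unsigned long cycles);       // advance the system by 'cycles'
    unsigned long motGetNextEvent();                   // assumed: cycles until the next event
    void          motExit();                           // termination stage

    void runSimulation(ApplicationCallbacks* cb, unsigned long totalCycles)
    {
        motInit(cb);
        motReset();

        unsigned long elapsed = 0;
        while (elapsed < totalCycles) {
            unsigned long step = motGetNextEvent();      // when is Octopus busy next?
            if (step == 0 || elapsed + step > totalCycles)
                step = totalCycles - elapsed;            // do not overshoot the budget
            motEval(step);                               // work off 'step' cycles
            elapsed += step;
            // ... application-side work: data exchange, user commands, co-simulation ...
        }
        motExit();
    }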


2.5 Summary of the Theory Part

With the introduction to Octopus as the finish, all major aspects of the theory behind this diploma thesis have now been discussed: several ways of structuring different abstraction levels were shown to allow a rating of models; a standardized concept for simulating hardware uncovered some important terms in simulation technique; possibilities and methodologies for verifying models brought in the quality aspect; and the base that the practical part is founded on was briefly analyzed.


3 Modeling and Implementation

This chapter documents the practical experience of the diploma thesis. The two tasks were:

• Implement models of two different peripherals of the Star12 microprocessor. Take the analog-to-digital converter as the first and the pulse width modulator as the second component. Test the modules.

• Explore the possibilities for integrating systems modeled with Octopus into MathWorks Simulink. Develop an adaptation layer that allows data exchange and takes care of the different time concepts.

The sub-chapters will present details and difficulties encountered when developing these pieces of software.

3.1 Modules Modeling

As mentioned in earlier chapters, it is desirable for a SoC customer to be able to write and especially test software, including low-level drivers and the operating system, without having any silicon of the target system available. So in the special case of Motorola, it would be best to have all CPUs with all peripherals accessible as models. Two of these peripherals for a small processor were selected. To have one input and one output component as practice, the ATD converter was picked as the input device and the Pulse Width Modulator as the output device.

3.1.1 Environment

For the practical experience description it is essential to know the processor the modules belong to, as well as the simulator engine that provides the CPU model and models of the core modules like memories, interrupt controller, bus interface, etc. The following two sections will cover these topics.

3.1.1.1 Star12 Core Overview

This section gives a brief impression of the Motorola Star12 V1.5 Core; the major features are summarized from [Mot00].


The Star12 is a 16-bit processing core that uses the 68HC12 instruction set architecture, allowing 68HC11 source code to be used directly as assembler input for the Star12 CPU. The core contains sub-blocks for interrupt (INT), module-mapping control (MMC), multiplexed external bus interface (MEBI), breakpoint (BKP) and background debug mode (BDM). The core is structured to be easily integrated into SoC designs. The Star12 allows instructions with odd byte counts, including many single-byte instructions, for more efficient use of program memory space. A three-stage instruction queue buffers program information for more efficient CPU execution. The set of indexed addressing capabilities includes:

• Using the stack pointer as an indexing register in all indexed operations
• Using the program counter as an indexing register in all but auto increment/decrement mode
• Accumulator offsets using the A, B or D accumulators
• Automatic index pre-decrement, pre-increment, post-decrement and post-increment (from −8 to +8)
• 5-bit, 9-bit or 16-bit signed constant offsets
• 16-bit offset indexed-indirect and accumulator D offset indexed-indirect addressing

The Star12 core provides 2 to 122 I-bit maskable interrupt vectors, 1 X-bit maskable interrupt vector, 2 non-maskable CPU interrupt vectors and 3 reset vectors; a register may configure the highest-priority I-bit maskable interrupt. There is on-chip memory and peripheral block interfacing with internal memory expansion capability and an external data chip select, plus configurable system memory and mapping options. For connection to the rest of the SoC, it provides an external bus interface (8-bit or 16-bit, multiplexed or non-multiplexed). For debugging, there is hardware breakpoint support for forced or tagged breakpoints with two modes of operation: either dual address mode to match on either of two addresses, or full breakpoint mode to match on an address and data combination. A single-wire background debug system is implemented in on-chip hardware, and additionally there exists a secured mode of operation. The Integrated Peripherals (IP) Bus and its interface, defined by the Motorola Semiconductor Reuse Standards (MSRS), connect the Star12 core to the system peripherals. The core communicates with the on-chip memory blocks either directly through the core interface signals or via the STAR bus. Other special features of the Star12 core were not used for the model implementation.


3.1.1.2 The Barracuda Integration Platform

The Barracuda Integration Platform is a package of models developed by a Motorola division named Global Software Group (GSG). The platform is a pilot project for the Full Chip Simulation initiative; it also serves as a starting point for the development of a Barracuda full chip simulator. Some of the models are developed for reuse in other simulators. Especially interesting for the thesis project is its usage as the base platform. The Barracuda Integration Platform consists of a Star12 ISS, accompanied by a Star12 integration model which connects the core to Octopus, referring to [WLvH01]. This is necessary because the CPU model was originally not developed for usage with Octopus. The integration model is designed especially for usage with the specific debugger for the processor. This was a major drawback when trying to apply another simulator interface adaptation layer to Octopus, because the integration model occupies the simulator interface of Octopus and so prevents attaching another one. This was the reason why no CPU could be used for testing the Simulink interface, as explained in section 3.2 on page 93. Of course the integration model itself has a purpose: the Star12 ISS serves as the master of the simulation in the sense that it controls the simulation loop. As Octopus can only “serve” one master, it is controlled by the integration model, which itself provides the interface for the debugger. Several Octopus models build the second part of the Barracuda Integration Platform. To support the programmable memory map the Barracuda has, the Module Map Control model can decode the 16-bit Star12 addresses into the Octopus address space and additionally recognize misaligned transactions. It implements the actual memory mapping and a configurable module size. The IP bus interface (IPBI) model implements features like input ports for peripheral interrupts and the real-time interrupt, conversion from high-active into low-active interrupts, and packaging information into a token and sending it to the interrupt controller. The clock and reset generator (CRG) model implements a COP watchdog and the real-time interrupt generator, but neither clock signals nor the start-up counter nor the oscillator and crystal monitor. The generic memory model is used to realize EEPROM and Flash memory arrays, extending the capabilities of the Octopus memory model. The two memories are split up into two models each: the actual memory array and an associated controller. A command token, generated by the controller, is able to either program, erase or retrieve cells of the array. The controller checks the alignment and the access size of the program command and takes care of the timing specifications. Details of the Flash and EEPROM models can be omitted here. For the interrupt controller model there is not much to say; it implements all


features mentioned in the specification. As an interesting feature, the Barracuda platform supports banked memory, i.e. pages within the Flash memory. In the real hardware, the software cannot access the memory banks directly; the PPAGE register in the Star12 CPU must be used to switch between the different memory banks. The platform models this behavior exactly, in order to guarantee that, in the software engineering process, the generated applications in linked form can be loaded directly onto the target hardware without changing anything in the memory accesses.

3.1.2 Analog-to-Digital Converter

3.1.2.1 Specification Overview

The first module chosen for modeling was the ATD, an analog-to-digital converter that provides 10/8-bit resolution and up to 16 analog input channels. For serving the 16 input channels, internal multiplexing is used. The A/D converter has a sample buffer amplifier; the sample time can be programmed. Output values can be left or right justified, signed or unsigned. Several conversions can be pooled in a sequence, where a sequence can contain up to 16 conversions, either on one channel or on a series of channels. The start of such a sequence can be triggered externally or by clock; the completion may be signaled with an interrupt. Additionally, the ATD provides the capability to use some or all of the analog input channels as digital input ports ([Sch01]). The register map comprises 48 bytes. There are six 8-bit control registers, ATDCTL0 to ATDCTL5. Four are actually used to configure the module, while the remaining two are reserved for future use. The general status register, ATDSTAT0, flags errors and informs about the next target register to be written to. Two other status registers, ATDSTAT2/1, are merged into one 16-bit register and show the sequence progress. A 16-bit test register, ATDTEST0/1, is active in special mode only and contains the values during the approximation process. The 16-bit register ATDDIEN0/1 configures the digital input capability; the values read then appear in PORTAD0/1, with a particular bit being 1 when configured not to act as digital input. The remaining space of the register map is occupied by the 16 data result registers, each 16 bits wide. There are 21 external pins: most obvious are the 16 analog input channel pins named PORTAD0-15. Two pins, VDDA and VSSA, act as power source for the analog part of the converter. Another two pins define the lower and upper boundaries for the conversion: VRL and VRH. The last external pin is ETRIG, which is the external trigger input. It may occupy a separate pin or use channel pin 15. For the actual conversion the component owns a “Sample and Hold Submodule”, which includes an analog input multiplexer that selects one of the 16 external analog input channels,


the sample buffer amplifier that aids the storage node, and the circuits for the conversion. The resistor and capacitor DAC arrays produce the comparison values. The comparator uses three stages to compare the input signal to the generated ones. The results are collected in the A/D state machine and the Successive Approximation Register (SAR), a part of the ATDTEST0/1 register.

[Figure 3.1: ATD Block Interface Diagram, [Sch01, p. 13, Fig. 1-3] – supplies VDDA/VSSA, references VRH/VRL, the external trigger ETRIG, the analog channel pins AD0/PA0 to AD15/PA15, the sequence control, the analog-to-digital machine, the sample and hold machine, the register file, the PORTAD register, the clock prescaler and the ATD IP bus interface unit attached to the IP bus]

Concerning the external trigger, there are six different modes of operation defined in section 1.6.2.2 of the specification [Sch01]:

• Ignore external trigger, perform one conversion sequence and then stop.
• Ignore external trigger and perform continuous conversion sequences.
• Falling edge triggered; perform one sequence per trigger.
• Rising edge triggered; perform one sequence per trigger.
• Trigger active low: as long as the trigger is low, perform continuous sequences.
• Trigger active high: as long as the trigger is high, perform continuous sequences.

77

3 Modeling and Implementation

For configuring these modes, four bits in two control registers are used. Sequences can be configured with respect to the number of conversions they consist of, whether they should work on a single channel or multiple, consecutive channels, and on which channel they are to start. The number of conversions is specified in the SxC bits in ATDCTL3. ATDCTL5 gives the input channel the conversion should be performed or started on with its CD, CC, CB and CA bits . The MULT bit in ATDCTL5 decides, whether to scan a single channel or multiple consecutive channels. Additionally, there is the possibility to specify whether the result registers should or should not map to the conversion sequence, specified with the FIFO bit in ATDCTL3. The time it takes to perform one conversion depends on different aspects. The conversion clock is a prescaled clock, whereas the prescaler is configured with the 5-bit PRSx value in ATDCTL4, providing a prescaler from 2 to 64 in steps of two. The duration of one of the three sampling phases can be configured with the 2-bit SMPx value in ATDCTL4, making the whole sampling stage either 8, 10, 14 or 22 conversion clock cycles long. The other aspect for timing is the selection of 8-bit or 10-bit conversion, which takes 8, resp. 10 cycles. This leads to another configure group: the data result configuration. So it is possible to have the results either right or left justified. In left justified mode it is possible to gain signed or unsigned data values. As an example, Table 3.1 shows an example for the output codes. Table 3.1: Example for ATD Output Codes, [Sch01, p. 29, Table 3-9] Input Signal VRL = 0V VRH = 5.12V

Signed 8-bit Codes

Unsigned 8-bit Codes

Signed 10-bit Codes

Unsigned 10-bit Codes

5,12 5,1 5,08 2,58 2,56 2,54 0,02 0

7F 7F 7E 01 00 FF 81 80

FF FF FE 81 80 7F 01 00

7FC0 7F00 0007 0100 0000 FF00 8100 8000

FFC0 FF00 FE00 8100 8000 7F00 0100 0000

The remaining bits of the ATDCTLx registers are used for general configuration, like enabling interrupts, how conversion complete flags are cleared and what has to be done with wait or power down modes. In test mode, the ATD module can be examined with respect to the power sources and reference pins.

78

3.1 Modules Modeling

An actual sampling and conversion consists of several phases: First, two cycles are used to sample the signals on the capacitor of the channel. After this, the sample is transferred to the A/D machine’s storage node, which takes another four cycles. Finally the external analog signal is attached directly to the storage node for the configured amount of cycles. This is done to remove possible noise from the signal. After the sampling the conversion takes place, a loop of 8 or 10 cycles, in which consecutively the result bits are generated. 3.1.2.2 Implementation The model is a full implementation of the specification. It includes all configurable operations as well as the test mode. To get a short overview of the interface it presents to Octopus, Table 3.2 shows the types of the ports. For modeling analog values, Octopus provides the NUMBER token, which is simply a pointer to a double value. For easier handling, the external trigger input pin is separate, and a digital port. Table 3.2: ATD ports, [Sch01, p. 5, section 1.2] Signal

Type

In/Out/Bi

Token Type

notices

bus intr ipg_stop ipg_wait testmode portad0-15 vssa vdda vrl vrh etrig

Bus ’0’, ’1’ ’0’, ’1’, ’Z’ ’0’, ’1’, ’Z’ ’0’, ’1’, ’Z’ Real Real Real Real Real ’0’, ’1’

Bidir Out In In In In In In In In In

motBusTokenT motLogic motLogic motLogic motLogic NUMBER NUMBER NUMBER NUMBER NUMBER motLogic

1 & 2 byte access high active (’1’) high active (’1’) high active (’1’) high active (’1’) double* double* double* double* double*

It was a major task to program all the logical connections between the registers and the module’s behavior, as all of the approximately 50 pages of the specification contain information that is linked to other pages. So e. g. if a result register is read from, the corresponding bit in the conversion complete flags register ATDSTAT2/1 has to be cleared, but only if the status register was read before or the AFFC flag was set. Of course, if the debugger generates a backdoor access, the bit must not be cleared to prevent from driving the module in a non-consistent state. The following piece of source code is the handler routine that the base class invokes, if any of the result registers is read from. The main conditional branch checks if the AFFC bit

79

3 Modeling and Implementation in ATDCTL2 is set. If so, the Conversion Complete Flag associated with the result register that was read can be cleared, although the status register was not read before. Additionally, a possible interrupt can be reset if there are no more unread values. In the other case, an internal variable, indicating the status read, is checked. Here the interrupt reset has not to be performed, because this is done on status reading. 1 2 3 4 5 6 7 8 9 10 11 12 13

14 15 16 17 18 19 20 21 22 23 24 25

/∗ Read a r e s u l t r e g i s t e r ∗/ void atd : : Read_ATDDRx ( motUINT08 i n d e x ) { i f (ATDCTL2.AFFC) { ATDSTAT0. SCF = 0 ; // r e s e t SCF b i t SetReg08 (ATDSTAT0, const_REG_ATD_ATDSTAT0) ; ATDSTAT2_1 &= ~(1> 8 , const_REG_ATD_ATDSTAT2) ; else SetReg08 ( motUINT08 ∗ (ATDSTAT2_1) , const_REG_ATD_ATDSTAT1) ; i f ( ! ATDSTAT2_1) // r e s e t ASCIF and I n t r i f no more unread data { ATDCTL2. ASCIF = 0 ; SetPin (const_PIN_ATD_INTR , motLOGIC0 , const_INTR_LATENCY) ; } } e l s e i f (bRead_ATDSTAT [ ( i n d e x > 7 ? 1 : 0 ) ] ) { ATDSTAT2_1 &= ~(1> 8 , const_REG_ATD_ATDSTAT2) ; else SetReg08 ( motUINT08 ∗ (ATDSTAT2_1) , const_REG_ATD_ATDSTAT1) ; } }

Listing 3.1: ATD read routine The actual sampling conversion is modeled in a different way to the real hardware, as presented at the end of the previous section. The samples are taken as a single snapshot, not as an average. As there is usually no noise in simulated systems, the process can be omitted to gain simulation performance. The same reason applies to the conversion process, which is performed at the end of the approximation loop. Modeling the conversion process with all details would mean no benefit, as the intermediate values are not used during normal operation. So in the model a conversion takes two stages: The method atd::ConversionTrigger does the sampling at the very beginning of the duration, and calls motNextEvent to evaluate the model 16 to 32 clock cycles later. The Eval callback determines which task is to be performed, and filters outdated events, then executes atd::ConversionExecute, which actually

80

3.1 Modules Modeling

calculates the result. The conversion uses a simple formula, which calculates the proportion of the input value to the range spanned by VRH and VRL , then transforming it into a 10-bit integer, which later can be processed in respect of justification, signed or unsigned, or 8-bit data. result = 1023 ·

input − VRL VRH − VRL

Benefits of using a simulator are not only that no real hardware is needed, but also that a model can aid the software developer with additional information. In the ATD model, i. e. if a write access is performed to a read-only register, like the ATDDRx registers, besides ignoring the access, the user is informed with a warning message. The same applies for all other invalid accesses. If any electrical specification is violated, an error is printed, too. The ATD module registers several model manager commands to allow the user to get advanced status information. The values of all registers and ports can be printed, in addition to the regular access; and the internal state of the module. The print state command, as an example, shows: • if any sequence or conversion is in progress, • the current analog channel to be sampled, • the next ATDDR register to be written to, • the external trigger control state, • how many conversions in a sequence are done and how many are pending, • the sample mode used, • the power down state, • the number of conversions triggered (not inevitably performed), • and if ATDSTAT2/1 was read since the preceded conversion. Other information that can be accessed with the commands is a calculation of the current sample rate, as the different control registers define it, and the current electrical specifications of the ATD module. A sample output of the Hiwave command line interface could look like this: The model is configurable at elaboration time regarding different topics. As the real ATD module can also be implemented with only 8 channels, the model can be configured to act as one of the both kinds of A/D converters. The implementation has e. g. to care about the correct wrap-around when writing to the result registers

81

3 Modeling and Implementation in>pt octo atd0 state samplerate ATD internal module state Sequence in progress: no Conversion in progress: no Next ATDDR to be written: 7 Current analog channel: 6 External trigger state: no external, single sequence Done conv. in sequence: 6 of 12 Sample mode simple Power down status: awake Total conv. triggered: 127 ATDSTAT2 read since conv. no ATDSTAT1 read since conv. yes ATD current sample rate clock frequency: 25.000000 MHz clock prescaler PRS: 12 unscaled clocks per conv.:14 scaled clocks per conv.: 1 conversion rate: 25.000000 MHz conv. per sequence: 4 sample rate: 6.250000 MHz in> Figure 3.2: Sample Hiwave output for ATD in FIFO mode. The other topic concerns the electrical specification. Although the specification defines the absolute maximum and minimum values for the analog input ports, the model allows them to be configured. Of course, if no explicit configuration is made, default values are used. 3.1.2.3 Testing For testing purposes, a second model was developed to produce stimuli for the external input ports of the A/D converter. Though Octopus provides the possibility to use stimuli files for input ports, I decided to use a stimuli model, because the tests can be more dynamic, the test can react on the results the module under test produces. The module is obviously called Digital to Analog Converter (DTA), but does not follow

82

3.1 Modules Modeling any specification. It is able to produce analog values from −1.0 to +7.0, by writing values from 0x0000 to 0xFFFF to the according registers. One 16-bit register is used for each of the 21 signals. The conversion is performed every time a value is written to any of the registers, it uses the formula: output = 8 ·

input −1 65536

The test cases are several C programs, compiled and executed automatically; they perform the following test cases: • All register read/write. Take care of unwritable bits etc. • Electrical range restrictions. • Digital port feature. • Resolutions, data justification, signed/unsigned. • Different sequence length 1-16 conversions, input channel selection. • FIFO mode, multi mode, scan mode • multi mode, scan mode, FIFOR bit • Clock Prescaler (SRES8, SMP, PRS). • External Trigger falling/rising edge • External Trigger active low/high • ETORF bit, has to be set if trigger acts too fast • Interrupt functionality. Is interrupt set/reset at the right times? • Power down features. Test with APDU flag. Test disabled AFFC functionality The test cases can determine on their own whether they passed, it can be seen by the return value, stored in an accumulator. Each bit of the return value stands for another sub test. Overall, the A/D model consists of about 2,300 lines of code, additionally of about 1,000 lines of test code.

83

3 Modeling and Implementation

3.1.3 Pulse Width Modulator 3.1.3.1 Specification Overview The second module chosen for modeling was the PWM_8B6C, a Pulse Width Modulator, incorporated in the HC12 micro controller family. A PWM is used e. g. for engine control, which needs periodic pulses for ignition, etc. The PWM used here, according to the specification [CO98], features six independent channels for pulse generation with programmable period and duty cycle, and a counter. The channels can be enabled and disabled separately; the same applies to the duty pulse polarity. Period and duty cycle are double buffered, which means their values can be changed without directly affecting the current pulse, the change takes effect at the end of the effective period, or if the channel is disabled. A register exists for specifying whether a particular channel’s pulse output should be left or center aligned. It is possible to pool the six 8-bit channels to three 16-bit channels, which increases resolution. Four clock sources (A, B, SA, SB), based on two clocks, provide for a wide range of frequencies. PWM_8B6C PWM Channels Channel 5

Bus Clock

Clock select

PWM Clock

Period and Duty

Counter

Channel 4 Period and Duty

Counter

PWM5

PWM4

Control Channel 3 Period and Duty

Counter

Channel 2

Enable

Polarity

Period and Duty

Channel 1 Period and Duty

Alignment

Counter

Counter

Channel 0 Period and Duty

Counter

Figure 3.3: PWM_8B6C Block Diagram, [CO98, p. 14, Fig. 1-1]

84

PWM3

PWM2

PWM1

PWM0

3.1 Modules Modeling

Before going through the register map, the clock concept has to be discussed. Of course, all four clocks depend on a system clock, as there is no external trigger input port in the PWM module. The two clocks, A and B, use a prescaler, which divides the system clock by 1, 2, 4, 8, 16, 32, 64, or 128. The prescaler can be specified separately for the two clocks, using two 3-bit values, fed into a multiplexer that selects the appropriate prescaler. The two other clocks, SA and SB, take the prescaled clocks, A and B, as input, and divide them further with a re-loadable 8-bit counter, then finally divide them by 2, which means SA and SB apply a divisor of 2, 4, 6, 8, . . . , 512 to the prescaled clocks. Clocks A resp. SA are used for channels 0, 1, 4 and 5; clocks B resp. SB for channels 2 and 3. The register map comprises 32 registers, each 8-bit wide. PWME enables and disables the channels, PWMPOL selects the polarity, and PWMCLK specifies if clock A/B or SA/SB should be used. PWMPRCLK avails bits 0-2 and 4-6 to pick the prescaler of the clocks. The pulse outputs can be left or center aligned; register PWMCAE selects one of the alternatives. The most important bits in PWMCTL indicate if channels 0 and 1, 2 and 3 or 4 and 5 should be concatenated to a 16-bit channel. The divisors for clocks SA and SB are stored in PWMSCLA/B, whereas a zero as value is interpreted as 256. The next registers are six registers, named PWMCNT0-5, that count the clock cycles during pulse generation. PWMPER0-5 specify the period length and PWMDTY0-5 give the duty cycles. Of course, the value of a PWMDTYx register cannot be greater than the according PWMPERx contents. The last register, PWMSDN, controls shutdown functions. All remaining registers are either reserved or test registers. The significant external pins include of course six pins ipp_pwm_do0–5 for the pulse output. Additionally there is an extra pin for each channel that signals if the channel is enabled; the pins are named ipp_port_en0-5. Besides the interrupt pin, some extra pins are specified, but not used during normal operation. A pulse signal in left aligned mode is generated as follows: At the beginning of the period, the output pin is set to the value in the according bit of the PWMPOL register. The counter PWMCNTx is incremented on each rising clock edge – whatever clock is used as source – and compared to the values in PWMDTYx and PWMPERx. When the counter reaches the value in PWMDTYx, the output pin is inverted, and then is hold on the level until the counter reaches the value in PWMPERx. Here the output pin is set to the original value, and the counter PWMCNTx reset to 0. The process for center aligned pulse generation is slightly different: The effective period is twice as long as in left aligned mode. The first half of the period matches the above one. If the counter reaches the PWMPERx value, the pin retains its level, and the counter begins to count backwards. When it passes the PWMDTYx value while it is decremented, now for the second time, the pin changes the level. When the counter gets to 0, again the count direction swaps and a new period begins. The 16-bit mode works similar to the 8-bit functionality. As mentioned before,

85

3 Modeling and Implementation PPolx=0

PPolx=1

PWMDTYx Period=PWMPERx

Figure 3.4: PWM Left Aligned Output, [CO98, p. 40, Fig. 4-3]

PPOLx=0 PPOLx=1 PWMDTYx

PWMDTYx PWMPERx

PWMPERx Period=PWMPERx*2

Figure 3.5: PWM Center Aligned Output, [CO98, p. 41, Fig. 4-5]

the PWMCNT, PWMDTY and PWMPER registers are grouped into pairs, the even registers (0, 2, 4) act as the high bytes, the odd registers (1, 3, 5) as the low bytes of the 16-bit values. The configuration for polarity, clock selection, alignment, and the output pins is done with the odd numbered bits and pins, as seen in Table 3.3. Table 3.3: Concatenation Mode Summary, [CO98, p. 44, Tab. 4-2] CONxx

PWMEx

PPOLx

PCLKx

CAEx

PWMx Output

CON45 CON23 CON01

PWME5 PWME3 PWME1

PPOL5 PPOL3 PPOL1

PCLK5 PCLK3 PCLK1

CAE5 CAE3 CAE1

PWM5 PWM3 PWM1

Exactly the same specification exists for an eight channel PWM; the two additional channels are associated to clocks B/SB. The register map increases then 6 bytes in size, and there are no longer unused bits in PWME, PWMPOL, PWMCAE, PWMCLK and PWMCTL.

86

3.1 Modules Modeling 3.1.3.2 Implementation The model implements the full specification, and turns its attention to performance. This is much more complicated than in the ATD module, because here we have to deal with four different clocks and six completely independent channels. As an example: The PWMCNTx registers contain the current counter value, so the program that runs on the Star12, can determine the progress of a period. A solution to this could be, that an event is generated for each PWM clock cycle, to update the values. But this would lead to excessive event scheduling – possibly every system clock cycle the PWM could produce an event. It is quite obvious that this way may result in poor performance, especially because it is also possible, that no one ever reads the register contents. So the right solution is to start the calculation, when the output signals have to be swapped, and schedule an event for this, and do not care about the counter register. If that is accessed at any time, its value can be calculated dynamically from the system cycle count; the model only has to know when the particular period has started. The following piece of code shows, how a cycle counter is calculated on demand. The value dTimestamp[x], used to determine when a period has started, is calculated on each period start. See comments for details. 1 2 3 4 5 6

7 8 9

10 11 12 13 14 15 16

17 18 19 20 21 22

// read a c c e s s t o PWMCNTx. void pwm : : ReadCheck_PWMCNTx( motUINT08 c h a n n e l ) { // c a l c u l a t e v a l u e on read t o s a v e e v e n t s motOffsetT o f f s e t ; motCyclesT c y c l e s ; motGetCycles(& o f f s e t , &c y c l e s ) ; // g e t c u r r e n t s i m u l a t i o n time i f ( bConcatenate [ channel >>1]) // i n c o n c a t e n a t i o n mode? { c h a n n e l = ( ( c h a n n e l +1) % 2 ) + c h a n n e l ; // c a l c . e f f e c t i v e channel i f ( reg_PWME & (1= 0 ) // a l r e a d y s t a r t e d ? { i f ( tmp < iValuePER [ c h a n n e l ] ) // t h e c o n c a t e n a t e d period { reg_PWMCNT[ c h a n n e l ] = static_cast(tmp ) ; reg_PWMCNT[ channel −1] = static_cast(tmp >> 8 ) ; } else { // i n CAE mode c o u n t e r may run backwa rds reg_PWMCNT[ c h a n n e l ] = static_cast(2∗iValuePER [ c h a n n e l ] − tmp ) ;

87

3 Modeling and Implementation

reg_PWMCNT[ channel −1] = static_cast((2∗ static_cast( iValuePER [ c h a n n e l ] )−tmp ) >> 8 ) ;

23

} SetReg08 (reg_PWMCNT[ c h a n n e l ] , const_PWMCNT0+c h a n n e l ) ; SetReg08 (reg_PWMCNT[ channel −1] , const_PWMCNT0+channel −1) ;

24 25 26

} } } else { // no c o n c a t e n a t i o n i f ( reg_PWME & (1= 0 ) // a l r e a d y s t a r t e d ? { i f ( tmp < reg_PWMPER[ c h a n n e l ] ) // c n t may run ba ckwards reg_PWMCNT[ c h a n n e l ] = tmp ; else reg_PWMCNT[ c h a n n e l ] = 2∗ static_cast(reg_PWMPER[ c h a n n e l ] )−tmp ; } SetReg08 (reg_PWMCNT[ c h a n n e l ] , pwm : : const_PWMCNT0+c h a n n e l ) ; } }

27 28 29 30 31 32

33 34 35 36 37 38

39 40 41 42 43

}

Listing 3.2: Checking Routine for PWM This may look expensive, but remember this code is only executed, when really a read access to a PWMCNTx register occurs. In line 13 you may notice the comment “already started”, which might be confusing at first sight, because the check whether the channel is enabled, takes place in line 10. But if a channel is enabled, it might be necessary to wait for the clock edge of the particular clock. So the channel may be enabled, though the pulse generation has not started yet. The branch beginning in line 19 shows the extra code necessary for center aligned pulses: The effective period is twice as long as usual, and the counter may run backwards. In most functions there are no discrete branches for 8-bit and 16-bit mode. Period and duty registers are not called directly, but through intermediate values, to ensure transparency for 8-bit and 16-bit concatenated mode of operation. But here, the PWMCNTx registers are actually written, so their values have to be calculated explicitly. In lines 23, 24 and 38 you clearly see that there are the so-called external registers, which are accessed by Octopus, and internal variables having the same values for faster access. This technique also allows a very simple implementation of the double buffering of PWMPERx and PWMDTYx: The Octopus registers are only read, when buffers, pursuant to the specification, are exchanged. Another difficulty is, that the model should cover two PWM modules. This also

88

3.1 Modules Modeling

is more difficult than in the ATD, because there the additional registers were at the end of the register map, so it is possible to program them statically and then mask the unused part out. Here, the variable part of registers is not at the end of the map, but interleaved. So the module must dynamically create the register map, including the pointers to the access handler functions, which must exist for the biggest possible number of channels, e. g. eight times. Of course these handler functions do nothing more than to invoke a generic handler function for that access type and the type of register. 1 2 3 4 5 6 7 8

s t a t i c motBoolT OnReadCheck_PWMCNT0 ( motBusTokenT∗ Token ) { ( (pwm∗ ) motGetModelData ( ) )−>ReadCheck_PWMCNTx( 0 ) ; return motT ; }; s t a t i c motBoolT OnReadCheck_PWMCNT1 ( motBusTokenT∗ Token ) { ( (pwm∗ ) motGetModelData ( ) )−>ReadCheck_PWMCNTx( 1 ) ; return motT ; };

Listing 3.3: PWM generic read check routines The function motGetModelData resolves to a pointer to the particular model instance. These callbacks execute the function presented above. To allow dynamic register definition, pointers to these callbacks are collected in an array, so are the register names. 1 2 3 4 5 6 7

motBoolT ( ∗ const pwm : : OnReadCheck_PWMCNT [ ] ) ( motBusTokenT ∗ ) = { pwm : : OnReadCheck_PWMCNT0, pwm : : OnReadCheck_PWMCNT1, pwm : : OnReadCheck_PWMCNT2, pwm : : OnReadCheck_PWMCNT3, pwm : : OnReadCheck_PWMCNT4, pwm : : OnReadCheck_PWMCNT5, pwm : : OnReadCheck_PWMCNT6, pwm : : OnReadCheck_PWMCNT7 };

8 9 10

const char∗ pwm : : names_PWMCNT [ ] = { "PWMCNT0" , "PWMCNT1" , "PWMCNT2" , "PWMCNT3" , "PWMCNT4" , "PWMCNT5" , "PWMCNT6" , "PWMCNT7" } ;

Listing 3.4: PWM check callback routines Of course the same technique applies for PWMDTYx and PWMPERx registers. The name and function pointers for each register are assembled in a structure, and merged to an array of register structures. This array is then passed to the base class, which will later use the names and callbacks. A very complicated case was the fact that scaler and prescaler for the clocks can be changed while channels are enabled. Although the specification warns of doing this, because of possible pulse truncations, I had to model the behavior. Additionally there raises the problem of outdated events: If a pulse is generated, an event is always

89

3 Modeling and Implementation

pending. If now the scaler is changed, this event becomes outdated. But Octopus provides no possibility to un-schedule events, so it has to be filtered out when the simulation time reaches that point. The following piece of code shows a part of the Eval callback, where the event filtering is done. motCyclesT modulo = fmod ( C u r r e n t C y c l e s − ModelInstance −>dTimestamp [ c h a n n e l ] , ModelInstance −>i C l o c k s P e r P e r i o d [ c h a n n e l ] ); int motUINT16 i S t a t e = static_cast( C u r r e n t C y c l e s − ModelInstance −>dTimestamp [ c h a n n e l ] ) / ModelInstance −> i C l o c k s P e r P e r i o d [ c h a n n e l ] ;

Listing 3.5: PWM eval callback The remainder of the division offset cycles by cycles per clock shows, if this event was released before a change of scale. It has to be 0 to be current. Of course, for the PWM, there are the same debugging aids like for the ATD model. If any of the registers PWMPOL, PWMCLK, PWMCAE, PWMPRCLK, PWMSCLA/B are written to, while channels are enabled, a message is generated that warns of possible pulse truncation. There are also warnings for accesses to registers that have no effect or are erroneous, when concatenated mode is active. Again, there are model manager commands that show the state of the module. Besides registers and ports, the state shows: • If any channel is enabled • Concatenation information • The current sample rate as implied by core frequency • Effective prescaler and scaler A sample output of the Hiwave command line interface looks as in Figure 3.6 on the facing page. The model can be configured with respect to the number of channels. It is possible to have 2, 4, 6 or 8 channels. Although the two existing specifications are only for 6 and 8 channels, it was no additional effort to allow also 2 and 4 channels. So the model can be used for four different PWM modules. 3.1.3.3 Testing For testing purposes, the VCD dumper, integrated in Octopus, was used. All test cases produce some output. The VCD files then are analyzed graphically, as the expense for rating them automatically would have been too big. As VCD viewer a free tool named GTKwave (see appendix 4 on page 121) was used.

90

3.1 Modules Modeling in>pt octo pwm state clock pwm internal module state channel(s) enabled: 0 1 2 3 4 5 concatenated channels: 4&5 channel 0 uses clock A channel 1 uses clock A channel 2 uses clock B channel 3 uses clock B channel 4 uses clock A channel 5 uses clock A pwm current sample rate clock frequency: clock A prescaler: clock B prescaler: clock SA scale: clock SB scale: clock A rate: clock B rate: clock SA rate: clock SB rate:

25.000000 1 8 2*2 2*1 25.000000 3.1250000 6.2500000 3.1250000

MHz

MHz MHz MHz MHz

in> Figure 3.6: Hiwave sample output for PWM Again the test cases are C programs, compiled and executed automatically. They perform the following test cases, accompanied with some screen shots. • Basic functional test • All registers read and write accesses, test on correct initial values • PWME functionality: enable/disable channels • 8-bit mode: set duty to 1 and increase period length from 0 to 255. • 8-bit mode: set period length to 255 and increase duty from 0 to 255. • 16-bit mode: set duty to 1 and increase period length from 0 to 65535 in steps of 255.

91

3 Modeling and Implementation

Figure 3.7: PWM basic functional test output

Figure 3.8: PWM prescaler decrease test output

Figure 3.9: PWM re-enabling channels test output

92

3.2 Simulink Interface for Octopus • 16-bit mode: set period to 255 and increase duty from 0 to 65535 in steps of 255. • Prescaler for both clocks, increase • Prescaler for both clocks, decrease • Scale for both clocks, increase • Scale for both clocks, decrease • PWMCAE and PWMPOL settings • Re-enabling channels (preserve PWMCNTx) • Reset and Restart operations (PWMRSTRT and PWMCNTx) For some test cases, of course, it is necessary to count cycles, but in most cases it becomes immediately obvious if a test case was passed or not. The PWM module consists of about 1,900 lines of code, additionally about 800 lines of test code.

3.2 Simulink Interface for Octopus Simulink is an interactive tool for modeling, simulating and analyzing dynamic systems. The MathWorks Inc., the leading developer and supplier of technical computing software, created it. Allowing interactive simulations, parameters can be changed “on the fly” and produce effects [Mat00]. All Simulink Models have three things in common: They have some sort of signal generator, maybe more than one, that produce values subject to time. The values may represent physical variables, electrical systems or simply mathematical numbers. Second, the signals are processed through an arbitrary number of operations like integrals, gain blocks, and much others. The elements of such a Simulink model are called blocks. Figure 48 shows the block groups in Simulink. With Simulink, one can simulate e. g. physical behavior like a bouncing ball, or automotive applications like an automatic transmission control, both presented with this thesis in a video demo, see appendix 4 on page 119 for details. The graphical user interface is one of the main benefits of Simulink. With drag-and-drop operations the user draws the models, like block diagrams. The blocks have the possibility to customize them, there are several variables for each block. It is also important that models are hierarchical, so it is possible to build models using both, top-down and bottom-up approaches. The user can view the system at a high level, then go down through the levels to see increasing levels of model detail. This approach, called sub

93

3 Modeling and Implementation

Figure 3.10: Simulink block groups, [Mat00]

block masking, provides insight into a model’s organization and how its parts interact [Mat00]. Basically, a Simulink block diagram is just a pictorial model of a dynamic system. It consists of a set of symbols, called blocks, interconnected by lines. Each block represents an elementary dynamic system that produces an output either continuously (a continuous block) or at specific points in time (a discrete block). The lines represent connections of block inputs to block outputs. The type of the block determines the relationship between a block’s outputs and its inputs, states, and time. A block diagram can contain any number of instances of any type of block needed to model a system. A block is the elementary dynamic system that Simulink knows how to stimulate. The output of a block is a function of time and the block’s inputs and states, if there are any. The particular function depends on the type of block. There are stateful and stateless blocks. The integrator is an example for a stateful block, because the output depends on the history of the input. So its only state is the integral value before the current time. An example for stateless blocks is the gain block, which simply outputs its input signal, multiplied with a constant called the gain. An important classification of blocks is, to divide them into continuous and discrete blocks. A continuous block responds immediately to continuously changing input. A discrete block, by contrast, responds to changes only at integral multiplies of a fixed sample time, holding its output constant in the meantime. Blocks that allow both working modes are called to have an implicit sample rate. The actual simulation itself allows several customizations, too. For different purposes, different solvers can be used. The solver is the engine that involves the numerical integration of sets of ordinary differential equations (ODEs). There is a major difference between variable-step and fixed-step solvers and the meaning of these is already included in their names. For most solvers the minimum and maximum step size, and the tolerances can be tailored to the specific requirements of the simulation. Scripting Simulink is possible, too, using the command line interface. Access is

94

3.2 Simulink Interface for Octopus

Figure 3.11: Simulink Simulation Parameters, [Mat00] provided to all functions and settings, so scripts can automate simulations.

3.2.1 Simulink S-Functions Automotive engineers could consider simulating an automatic window lifter with Simulink, to see how fast the electrical engine has to work to lift the window with a certain speed, and take friction into account. The ideal thing now would be, if they could put the micro controller, running the program that directs the lifter engine, directly into the Simulink model. The benefits of a simulation then could be to see whether the program reacts within range to unattended events, like someone putting his hand into the closing gap. In this case, which could be simulated by increasing the friction to a high value, the program has to stop the engine of course. Of course one could simulate the processor within Simulink, but that does not really match our re-use paradigm. So better use the controller already present in Octopus models. But how get the Octopus run controller into Simulink? The answer is to use the S-function interface that Simulink provides. Although there is a whole bunch of different blocks, there will always be the need for an engineer to use own, proprietary blocks that are not present in the library of Simulink. So there exists the capability to define blocks, using S-functions. These are computer language descriptions of Simulink blocks. An S-function can be written in MATLAB, C, C++, Ada, or Fortran. We will use C/C++ for our purpose. For usage in models, a block named S-function is dragged from the library into the model. Its dialog box takes the name of the S-function, which has to be also the name of the

95

3 Modeling and Implementation

file, the S-function is compiled in. Additionally, there is a second field, taking the parameters that should be passed to the S-function. As all other blocks, S-functions process inputs u and states x to receive an output y. It is up to the S-function to perform this task by reacting to the callbacks Simulink invokes. These will be discussed below. u (input)

x (states)

y (output)

Figure 3.12: Simulink Block, [Mat00] This diagram can be expressed using equations that show the mathematical relationship between the inputs, states and outputs, taken from [Mat01, p. 20].

where

y = fo (t, x, u)

(output)

x˙ c = fd (t, x, u)

(derivative)

ydk+1 = fu (t, x, u)

(update)

x = xc + xd

3.2.1.1 Simulation stages A Simulink model is executed in stages [Mat01]. The first one is the initialization phase, where Simulink incorporates library blocks into the model. Widths, data types and sample times are propagated, parameters are processed and the block execution order is determined. The second phase is the simulation!loop. A single pass through the loop is called a simulation!step. During a simulation step, Simulink executes all blocks by invoking functions that calculate states, derivatives and outputs of a block. In Figure 3.13 on the next page you see the execution stages that all blocks run through. Two terms were not mentioned before: Locate zero crossings is a mechanism that allows solving the problem that occurs when an input signal crosses the zero line. Many functions change their behavior and output dramatically when getting an input zero. In numerical simulation, as it is used here, normally the zero line is passed with the same step size as the rest of the simulation. But this could lead to a mis-behavior in the output vector. Simulink faces this problem by decreasing the step size when being near zero, to allow the model producing correct results. Minor time step is the part of the loop where outputs can be re-calculated. Here the zero crossing detection is done, too.

96

3.2 Simulink Interface for Octopus

Initialize model

Calculate time of next sample hit (only for variable sample time blocks)

Simulation loop

Calculate outputs

Update discrete states

Clean up final time step Calculate derivatives

Calculate outputs

Calculate derivatives

Locate zero crossings

Integration (minor time step)

Figure 3.13: Simulink simulation stages 3.2.1.2 Callbacks and Concepts The S-function has to provide a set of callbacks to allow Simulink directing the simulation. Subsequently there will be a list of the tasks, the callbacks perform: Initialization. This callback is invoked prior to the first simulation loop. It has to initialize a structure containing information about the S-function, called SimStruct. Then it sets the number and dimensions of input and output ports, and the block sample time or times. Finally, it allocates storage areas and arrays. The associated callbacks are mdlInitializeSizes and mdlInitializeSampleTimes for what the names suggest, then mdlStart for the startup process. Calculation of next sample hit. This function applies only when a variable sample time is used for the particular block. As the name suggests, it calculates the next step size, depending on state and current input. mdlGetTimeOfNextVarHit is the name of the function invoked for this. Calculation of outputs. The function mdlOutputs is called in every major time step

97

3 Modeling and Implementation

to make all output ports valid. Update discrete states. Called in major time step, too, this ensures that all discrete states are in the state they have to be for next simulation loop pass. The associated callback is mdlUpdate. Integration. If a model has continuous states or non-sampled zero crossings or both, the minor time steps are evaluated, calculating output and derivatives. Two callbacks are used for the integration phase: mdlDerivatives and mdlZeroCrossings. There are three key concepts that are important for implementing S-functions: Direct feed-through. If the output is controlled directly by the value of an input port, this is called direct feed-through. Two basic characteristics can indicate it: Either the output function mdlOutputs is a function of the input u, which means that u is accessed in mdlOutputs, or the “time of next hit” function mdlGetTimeOfNextVarHit accesses u. Dynamically sized inputs. Input dimensions to a block can be arbitrary; their size is determined dynamically in the initialization stage. The size then can lead to the number of continuous or discrete states, and the number of outputs. Dynamically sized inputs are very useful for attaching different sources to the block, e. g. a three-channel multiplexer that has a three-element output vector, or a clock that has a scalar output. Setting sample times and offset. This is the concept that explains how the block is acting in time. There are several different possibilities for timing: Continuous sample time. This kind of timing is used for S-functions that have continuous states and/or non-sampled zero crossings. The output changes in minor time steps. Continuous but fixed in minor time step. The S-function is executed at every major simulation step, but it does not change values in minor time step. Discrete sample time. The block acts as a function of discrete time intervals. Additionally, an offset can be specified for delaying the sample time hit. The effective sample time hit occurs at TimeHit = (n · period) + offset, with n starting at zero. Variable sample time. This is a special case of discrete sample time, where the intervals between sample hits can vary. The variable sample times for the next hit are queried at the start of each simulation step – a fact that was complicating the implementation.

98

3.2 Simulink Interface for Octopus Inherited Sample time. A block can have no inherent sample time characteristics and be dependent on its input or output. In this case, it has inherited sample time, which can be taken from either the driving block, the destination block, or the fastest block in the system. The sample times are usually given in pairs of sample time and offset time. According to [Mat01], the valid sample time pairs are: [CONTINUOUS_SAMPLE_TIME, 0.0] [CONTINUOUS_SAMPLE_TIME, FIXED_IN_MINOR_STEP_OFFSET] [, ] [VARIABLE_SAMPLE_TIME, 0.0] with the following definitions: CONTINUOUS_SAMPLE_TIME = 0.0 FIXED_IN_MINOR_STEP_OFFSET = 1.0 VARIABLE_SAMPLE_TIME = -2.0 3.2.1.3 Build process The build process of C MEX S-functions is basically not complicated. The source file, in which the callbacks are implemented, has to follow certain rules. So it has to include a given header, define mandatory callbacks and include C source files in the footer. The source file is then compiled and linked with the Matlab/Simulink libraries to get a dynamically loadable library (DLL). This DLL has to have exactly the same name as the included S-function has. Implementation For the implementation of the Simulink-Octopus interface several tasks had to be accomplished and parts to be written: • S-function interface • Octopus simulator interface • Time synchronization • Data exchange All tasks mentioned in this list will be discussed in the following sections on their own, although it is not always possible to explain one functionality without considering the other.

99

3 Modeling and Implementation

3.2.2 Implementation 3.2.2.1 Test Model As mentioned in section 3.1.1.2 on page 75, it is not possible to use the Barracuda Integration Platform – and so the Star12 CPU model – in another environment than the Hiwave debugger. For enabling the usage in various environments the integration platform will have to be modified. So for testing the adaptation layer between Octopus and Simulink, a small model was created, not reflecting any existent hardware. Delayer Model clock, rate=10000 cycles

de_clk_out

del_dt0_in

invert (1=0, 0=1)

del_dt0_out

del_dt1_in

delay 50000 cycles

del_dt1_out

del_an0_in

invert

del_an0_out

del_an1_in

delay 50000 cycles

del_an1_out

digital

analog

Figure 3.14: Test Model for Adaptation Layer As illustrated in Figure 3.14, the test model, called Delayer, has four input and five output ports. Besides the output-only clock port, there are two analog and two digital channels, which invert input signals on the first and delay input signals on the second channel. There are no registers, no bus or anything else. The Delayer is the only model in the test system, a circumstance, which removes all effects that could distract from the main problems. 3.2.2.2 S-Function Interface The S-function interface here means the implementation of all the callbacks that are required from the Simulink side. The first callback to be invoked is mdlInitializeSizes. Here, the number of input and output ports, as well as the number of sample times, is defined. This function also sets the number of parameters that are expected. Parameters are all variables that are entered in the parameter field of the S-function dialog box in Simulink. The Octopus S-function takes five parameters, each of them enclosed in double quotes when containing more that one word:

100

3.2 Simulink Interface for Octopus Initialization command. This parameter allows specifying commands that should be processed by Octopus at startup time. This could be telling Octopus to load a binary, in case that a CPU is integrated into the set of models. Besides the Octopus commands, all model manager commands that were registered, can be used. Output signals. The names of all Octopus signals that should be monitored and passed to Simulink are given in the second parameter. For each signal there will be an own output port. Up to now, only bit and analog signals are supported, but – theoretically – also complex signals, e. g. complete bus tokens, could be sent to Simulink. Details on signal monitoring will follow in section 3.2.2.4 on page 103. Input signals. Equivalent to the output signals, the names of all Octopus signals that should receive input from Simulink are given here. Again, only bit and analog signals are possible up to now. Frequency. Octopus is required to run at a specific frequency. The fourth parameter takes this frequency in Megahertz. This value is also important to control the synchronization. Granularity. This parameter takes effect on the synchronization, as explained in section 4.2.2.4. It affects the ratio of performance and accuracy. So mdlInitializeSizes checks, whether the number of parameters matches five, and makes all parameters permanent, which means, they cannot be changed during simulation. This is logical, because it would neither make any sense, nor be possible for Simulink to change the number of input or output ports during simulation; the same applies for granularity. As the callback is invoked every time the parameters for that block are changed, here the output and input signals parameter is processed by the means that the number of signals is reflected in the model’s block symbol, which therefore always shows the appropriate number of connection points. This ensures that the Octopus model can be integrated in various ways to test different functionalities of the SoC. The next callback to be executed by Simulink is the mdlInitializeSampleTimes function. It is possible to assign more than one sample time to a block, either by allocating different sample times to different ports, or by assigning several sample times to the complete block. In the concrete case, two sample times are specified. The final function in the initialization stage is mdlStart. Its main task is to start Octopus. It fills the motCallbackStruct with pointers to the appropriate Octopus callback functions, then calls moInit and so triggers the startup process. After this, the commands from the S-function parameter dialog box are executed.

101

3 Modeling and Implementation

octo_sfun

octo_sfun

3 inputs, 4 outputs

1 input, 3 outputs

Figure 3.15: Octopus S-function variable port number After initialization there starts the simulation loop. A major simulation time step starts with execution of the mdlGetTimeOfNextVarHit. As mentioned before, the problem for the variable time step technology arises from the fact that the size of the next time step has to be determined before the current step is executed. So this time is determined in another function and only transferred to Simulink in here. In mdlOutputs the actual Octopus simulation is executed, by determining the amount of cycles that have to be executed. Then with motEval the time for Octopus is advanced to the particular point in time. The other important task to be accomplished in mdlOutputs is to pass the input port signals to Octopus. The mdlUpdate callback is not used in this implementation, as there are no states that are useful for Simulink. 3.2.2.3 Octopus Simulator Interface The Octopus simulator interface implementation has to provide all callbacks that are necessary for correct execution, as presented earlier in section 2.4.1.2 on page 59. The list of functions starts with some output functions for different purposes, although doing the same: the callbacks Error, Warning, Debug and Print simply put a message on the Simulink command line window. They are invoked from Octopus for the specific reasons. The AskUser callback of Octopus is defined to allow user interaction by the means that the user can pass model manager commands to Octopus. This is not implemented up to now. A very important callback for in the elaboration session is the Prepare callback, which is invoked from motInit. It is executed after the information, contained in the models.cpp configuration file, has been read. The same applies to motInitialSetup function, where the arrays of motModelEntryT and motSignalT structures are passed to Octopus. In Prepare lies the most important task to enable the interface to monitor and affect Octopus signals. For this, a special model is dynamically registered in Octopus and then initialized with the signal names, specified as S-function parameters. The rest of the callbacks, specified in the Octopus simulator interface document,

102

3.2 Simulink Interface for Octopus

are not used for the particular implementation, although they are defined in the source code file for future usage. 3.2.2.4 Data exchange It is vital for the adaptation layer to allow data exchange between Simulink and Octopus, otherwise it would not make any sense to include Octopus simulators in a Simulink model. For the Simulink side, there are input and output ports available, as we are inside a block. But at the Octopus side there are only the model manager commands get_signal and put_signal available, to print and set signals. This would result in executing motUserCommands several times during each simulation step, and additionally pre- and post-process the string representations of the signals and the model manager command’s output. A loss in performance would be sure. Another idea was to use the dumper to get signals from Octopus. But besides that, this means that the signal dumper would not be available for its natural purpose, the problem of putting signals to Octopus would not be solved anyway. The more sophisticated way is to use an extra model that takes the signals we want to monitor as input, and those we get from Simulink as output. The direct approach would be to add a model to the configuration file models.cpp that has the complete set of signals present in the system once as input and once as output ports, or once as bi-directional ports. But on the one hand, this would mean that all ports – which can be several hundreds for big systems – would have to be monitored, with the resulting loss of performance. On the other hand, the user would have to specify all signals by hand, which is a very uncomfortable way. A much better solution is to determine dynamically which signals are present in the system. But then the model cannot be included in the configuration file, because when this file is executed, Octopus does not yet know the signals. And of course, if the signal names are determined dynamically anyway, it is possible to use only the signals specified as S-function parameter. So the signal monitor is created dynamically, after all the other models have been registered to Octopus. First the Prepare callback invokes a function named CreatePortList, which takes as argument two string arrays: one for the input signals, one for the output signals. All arrays are always NULL-pointer terminated to include array size information. This function – as the name suggests – creates the array of motPortT structures that will be used for the signal monitor model. As Octopus ports have no data type, the monitor will not know what type of signal to create when sending a signal to Octopus. So in the S-function parameter the input signals definition carries not only the name but also the type information: :, e. g. PORTAD:r, which means that the signal PORTAD should carry real number signals. The ports get the same name as the signals they monitor, only with a prefix _MONi for input, and _MONo for output signals. The following code excerpt is a part

103

3 Modeling and Implementation

Control,Data Control

Adaptation Layer

Control Control

Control, Data

Simulink

Octopus & Models Data

Data

Signal Monitor

Data

Figure 3.16: Control and Data Flow for Adaptation Layer of that function. 1 2 3

4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23

24 25

motPortT∗ C r e a t e P o r t L i s t ( const char ∗∗ s i g n a l s _ I n , const char ∗∗ signals_ Out ) // c r e a t e and r e t u r n p o r t l i s t t o be used w i t h i n a motModelEntryT struct . { motUINT16 iCount_In = 0 ; motUINT16 iCount_Out = 0 ; motUINT16 i ; string sBuffer ; // c h e c k i f i n p u t s i g n a l names a v a i l a b l e , co unt how many i f ( s i g n a l s _ I n != NULL) { for ( ; s i g n a l s _ I n [ iCount_In ] != NULL; iCount_In++) ; } // c h e c k i f o u t p u t s i g n a l names a v a i l a b l e , c ou nt how many i f ( s ignals_Ou t != NULL) { for ( ; signals_O u t [ iCount_Out ] != NULL; iCount_Out++) ; } // c r e a t e a new p o r t a r r a y w i t h t h e a p p r o p r i a t e s i z e motPortT∗ T h e P o r t L i s t = new motPortT [ iCount_In+iCount_Out + 1 ] ; // i f t h e r e a r e i n p u t s i g n a l s , c r e a t e p o r t s f o r them i f ( s i g n a l s _ I n != NULL) { fo r ( i =0; i < iCount_In ; i ++) { // f i l t e r o u t t h e t y p e i n f o r m a t i o n a f t e r t h e c o l o n sBuffer = string ( signals_In [ i ] ) ; i f ( s B u f f e r . f i n d ( " : " ) != s t r i n g : : npos ) sBuffer = sBuffer . erase ( sBuffer . find (" : ") , sBuffer . length () ) ; // s e t p r o p e r t i e s T h e P o r t L i s t [ i ] . Name = s t r d u p ( const_cast( s t r i n g (

104

3.2 Simulink Interface for Octopus

s B u f f e r + s t r i n g ( "_MONi" ) ) . c _ s t r ( ) ) ) ; T h e P o r t L i s t [ i ] . Type = motPORTTYPE_OUTPUT; // [ s e v e r a l member a s s i g n m e n t s l i k e t h e a bo ve one ]

26 27

} } // i f t h e r e a r e o u t p u t s i g n a l s , c r e a t e p o r t s f o r them // [ same l o o p f o r o u t p u t p o r t s , not p r i n t e d h e r e ] // c r e a t e t h e l a s t e n t r y . . . motPortT LastEntry = motPORT_LAST_ENTRY; // and copy i t t o t h e end o f t h e a r r a y memcpy(&( T h e P o r t L i s t [ iCount_In+iCount_Out ] ) , &LastEntry , s i z e o f ( motPortT ) ) ; // r e t u r n t h e g e n e r a t e d p o r t a r r a y return T h e P o r t L i s t ;

28 29 30 31 32 33 34 35

36 37 38

}

Listing 3.6: Port list creator routine See lines 21–22 of the code excerpt above, where the name is split off the arguments from the parameter list. The very convenient macros, Octopus offers for port definitions, cannot be used here, because they only work while initializing variables, not while dynamically filling the structures. In lines 24–25 the port structure’s members are filled with data. In lines 30 and 32, a trick is used to avoid having to fill in manually the last port, which has to have special values for its members. The macro motPORT_LAST_ENTRY is such a macro that can only be used for variable initialization, so a variable is initialized and then its contents is copied to the end of the list. It is quite obvious that this could have been done with the other members of the array, too, but that would have caused lots of memory copy actions, which are very critical to performance. Of course, the model has an initialization callback named ModelInit, where it determines the port handles of its ports by a series of calls to motPortStructure, which returns a pointer to a port for the port name as argument. The actual use is as follows: p o r t H a n d l e s [ i ] = m o t P o r t S t r u c t u r e ( const_cast( s t r i n g ( s t r i n g ( signalNameList_FromSL [ i ] ) + s t r i n g ( "_MONi" ) ) . c _ s t r ( ) ) ) ;

For the S-function output ports – actually input ports for the signal monitor – there is an Update callback named ModelUpdate. As defined by Octopus, this callback does not receive any arguments, it has to call motGetUpdatesCount to determine how many ports have been updated, and then invoke motGetUpdatePort repeatedly. For each port that was updated, the new value is directly saved to the S-function output port address range, which can be ascertained by calling Simulink’s ssGetOutputPortRealSignal. Also bit signals are converted to the double type, taking 0.0 for a logical 0, 1.0 for logical 1, 0.5 for tristate and 2.5 for undriven signals.


Back in Prepare, the type information of the signal definitions in the S-function parameter field is processed. As the only implemented types are bit and real signals, specifying 'b' or 'r' is sufficient. Then the motModelEntryT is created as a stand-alone variable:

motModelEntryT SignalMonitorSL_tmp = {
    "SignalMonitorSL",
    motMODEL__NO_CONFIG_INFO(motINACTIVE),
    motMODEL__USE_SYSTEM_CLOCK(0, 0, 0),
    motMODEL__PORT_LIST(PortList),
    ModelInit, NULL, NULL, NULL, ModelUpdate, NULL,
    motMODEL__NO_USER_CMDS
};

Listing 3.7: Signal monitor routine

Line 2 specifies the name, lines 3 and 4 pass general configuration and timing information, line 5 passes the port list created with the CreatePortList function, line 6 points to the two existing model callback functions, and for line 7 there are of course no model manager commands to register.

Of course, the signal monitor is of no use if Octopus does not have it in its list of models. Unfortunately, there is no function to register a model dynamically; usually all models are collected in an array that is passed in the configuration file's motInitialSetup function. So the model entry has to be hacked into the model table. This is done by first copying the model table into a memory area that is big enough to hold the whole model table plus the additional entry for the signal monitor, as seen in lines 1 and 2 of the code excerpt below. Then the signal monitor's motModelEntryT is copied to the table in line 3, and the obligatory last entry from the original table to the end, as stated in line 4. Now the original table can be deleted, as line 5 shows. Finally, the original model table pointer that Octopus uses must point to the new table, see line 6. Note that ModelTable is of type motModelEntryT**, an indirect array, because it is an argument to the Prepare callback.

motModelEntryT* ModelTableCopy = new motModelEntryT[EntryCount + 2];
memcpy(ModelTableCopy, *ModelTable, EntryCount * sizeof(motModelEntryT));
memcpy(&(ModelTableCopy[EntryCount]), &SignalMonitorSL_tmp, sizeof(motModelEntryT));
memcpy(&(ModelTableCopy[EntryCount + 1]), &((*ModelTable)[EntryCount]), sizeof(motModelEntryT));
delete [] *ModelTable;
*ModelTable = ModelTableCopy;
SignalMonitorSL = &(ModelTableCopy[EntryCount]);

Listing 3.8: Hack into model table


The last line of Listing 3.8 assigns the signal monitor's entry of the model table to a variable named SignalMonitorSL. This has nothing to do with the actual patching, but it is important for delivering the signals to Octopus. When signals are updated from Octopus, these events are handled in the model callback routine. But when getting signals from Simulink, we are not within an Octopus callback. If the motPutPort function, which does exactly what its name suggests, is used outside a callback, Octopus crashes, because a port manipulation function always has to occur in a particular model context. So how can Octopus be convinced that it is in the context of the signal monitor? The context in Octopus is set with a global pointer named motModel, which points to the model that is currently active, or NULL if no model is being worked on. Octopus sets this variable to the correct model before it invokes a callback. So the only thing to do is to set motModel to the SignalMonitorSL pointer defined before, do the port manipulation, and then reset motModel to NULL. The code excerpt of the S-function's mdlOutputs callback routine shows how the signals are sent to Octopus.

...
InputRealPtrsType uPtrs;
motModel = SignalMonitorSL;
for (int i = 0; i < listSize_FromSL; i++)
{
    if (signalTypeList_FromSL[i] == signalNUMBER)    // real signals
    {
        // put the signals on the output ports of the signal monitor
        uPtrs = ssGetInputPortRealSignalPtrs(S, i);
        motPutPort(portHandles[i], 0, motTOKEN(signalNUMBER, uPtrs[0]));
    }
    else if (signalTypeList_FromSL[i] == 0)          // bit signals
    {
        uPtrs = ssGetInputPortRealSignalPtrs(S, i);
        if (*uPtrs[0] == 0.0)
            motPutPort(portHandles[i], 0, motTOKEN('0', NULL));
        else if (*uPtrs[0] == 1.0)
            motPutPort(portHandles[i], 0, motTOKEN('1', NULL));
        else if (*uPtrs[0] == 0.5)
            motPutPort(portHandles[i], 0, motTOKEN('Z', NULL));
        else                                         // error case or undefined
            motPutPort(portHandles[i], 0, motTOKEN('?', NULL));
    }
} /* for (each input port) */
motModel = NULL;
...

Listing 3.9: S-function output routine

This section showed that quite a number of considerations were necessary to allow performant data exchange; the result, however, works completely transparently for both the Octopus user and the Simulink user.


3.2.2.5 Time synchronization

The most challenging task, without any doubt, was the synchronization of time between Simulink and Octopus. The ideal case would be to register the model with variable sample time, so that the simulation could advance from Octopus event to Octopus event. But this brings the disadvantage that inputs from Simulink would only be recognized when the next event occurs, even if they changed dramatically in the meantime. Octopus might have to react to an input change immediately, but would only see the new input at the point in time of its next already scheduled event. So the variable sample time concept alone would be perfect for a system that only produces outputs, but not for one that has inputs, too.

Therefore a second sample time has to be introduced to ensure that the input ports are updated regularly. In these regular intervals all input ports are read out and their values fed into Octopus. After this, Octopus is evaluated to determine whether the potentially changed input values caused a new event. If so, the variable sample time is adjusted to reflect the new event.

The variable sample time technique raises another problem. As mentioned before, the callback routine mdlGetTimeOfNextVarHit, which Simulink calls to determine the next step size, is invoked at the beginning of a simulation time step. At this point Octopus has not yet been evaluated, no inputs have been updated, and so no current information about the time of the next event is available. A big advantage, however, is that the callback, although required to be present, does not deliver the next step size by a return value, but by calling the Simulink function ssSetTNext. Fortunately this routine can also be called outside the callback, and Simulink considers its argument for the next simulation time step. So instead of passing the step size at the beginning of the loop, it is transferred at the end. Basically the scheduling works as follows:

time_T now = ssGetT(S);
todo = (now * dClockFrequency * 1000000);
motEval(todo);
motGetNextEvent(&cyclesToEval);
now += (cyclesToEval / dClockFrequency / 1000000);
ssSetTNext(S, now);

This is not the real code, but only the basic scheme, without any error and special case handling. In line 1 the current Simulink time is fetched. Line 2 determines the total number of simulation cycles Octopus has to be advanced to. The actual model evaluation takes place in line 3, followed by the determination of the next pending event in


line 4. The Simulink time for the next call is calculated in line 5 and finally passed in line 6. All these actions take place in the mdlOutputs callback. The update process can be outlined in four steps:

1. Bring Octopus to the current Simulink time by evaluating it.
2. Pass the input port values to Octopus.
3. Determine whether passing them caused a new event.
4. Schedule Simulink for execution at the next Octopus event.

The periodic execution of Octopus and the variable time step technique create another problem. First, the periodic input passing can be redundant when the particular Octopus system reacts to the changed values only very slowly. Second, a CPU will have events nearly every cycle, which means that Simulink would have to call Octopus every 40 nanoseconds, a much higher resolution than Simulink uses by default. As every evaluation of Octopus causes a series of function calls, performance would decrease dramatically. The problem becomes even worse considering that not every CPU cycle corresponds to a change in an output signal, assuming we only monitor signals that a micro controller would pass to the outside world in reality. So evaluating Octopus separately for each cycle is simply superfluous.

Usually the user who assembles a Simulink model knows what type of micro controller is simulated with Octopus and, where applicable, what software is running on it. So he knows at which approximate rate output changes occur. This is where the granularity S-function parameter comes in. The granularity value affects the overall simulation speed. Its value is the number of Octopus cycles that Simulink should let pass between subsequent calls to Octopus. A value of 1000, for example, means that Octopus is evaluated every 1000 cycles, regardless of whether there are events during this time or not.

Additionally it is possible to specify a granularity value of zero. As it makes no sense to have zero cycles between subsequent evaluations, a zero is interpreted as the command to assign continuous sample time, fixed in minor time steps, to the block. This means that the block is evaluated whenever the simulation enters a new simulation time step, so the rate depends on the simulation parameters, as seen before in Figure 3.11 on page 95. A sketch of how these two cases could map onto the S-function's sample time registration is given below.
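The registration itself is not printed in this thesis, so the following is only a sketch of how the two granularity cases could be expressed with Simulink's C S-function API. The parameter index GRANULARITY_PARAM is an assumed name, and dClockFrequency (the module clock in MHz, as used in the scheduling scheme above) would have to be made available to the S-function.

/* Sketch only: possible sample time registration for the two granularity cases.
   GRANULARITY_PARAM is an assumed parameter index, dClockFrequency the clock in MHz. */
static void mdlInitializeSampleTimes(SimStruct *S)
{
    double granularity = mxGetScalar(ssGetSFcnParam(S, GRANULARITY_PARAM));

    if (granularity == 0.0)
    {
        /* granularity 0: evaluate whenever Simulink enters a new time step,
           keeping the outputs fixed during minor time steps */
        ssSetSampleTime(S, 0, CONTINUOUS_SAMPLE_TIME);
        ssSetOffsetTime(S, 0, FIXED_IN_MINOR_STEP_OFFSET);
    }
    else
    {
        /* granularity N: evaluate Octopus every N cycles of the module clock */
        ssSetSampleTime(S, 0, granularity / (dClockFrequency * 1e6));
        ssSetOffsetTime(S, 0, 0.0);
    }
}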


The following paragraphs show the effects of the different settings. All of the following figures show a sine wave as input signal on the del_an0_in port of the Delayer model, the corresponding output del_an0_out, and the module-internal clock signal del_clk_out.

When configuring the granularity to zero, continuous sample time with fixed values in minor time steps is assumed; Figure 3.17 shows the scope diagram. It can be seen that the output is stepped, a consequence of the Simulink simulation step size. As the block was specified to have fixed values during minor time steps, the single points are not interpolated.

Figure 3.17: Continuous sample time

When zooming into the graph, as shown in Figure 3.18, it can be clearly seen that the steps are not equidistant, but sometimes have sub-steps. The regular steps result from the clock events, whereas the sub-steps are caused by Simulink's signal updates. This configuration is ideal for systems that produce few or no events, like a direct-feed-through circuit. Such a system is integrated into Simulink perfectly and performantly, because evaluation only takes place when the Simulink simulation requests it.

Figure 3.18: Continuous sample time, zoomed


Besides the possibility to make the simulation steps dependent on Simulink, a fixed step size can be given using the granularity S-function parameter. Figure 3.19 shows the graph for a granularity of 500 cycles. Keeping in mind that the Delayer module clock period is 10000 cycles, this is a high resolution. 500 cycles at 25 MHz correspond to 20 µs; because of this, the output graph is smooth, without any steps.

Figure 3.19: Fixed sample time, granularity 500 cycles

Of course the configurability imposes additional responsibility on the user, who must know what the system does and what its timing characteristics are. Selecting a granularity value larger than the interval of an important event of the system causes irregularities in the output, as presented in Figure 3.20. When looking at the clock signal, it can be seen that the rising and falling edges are not equidistant, as they should be. The mistake was to evaluate Octopus every 6500 cycles: the module clock period length is 10000 cycles, which means there is a transition every 5000 cycles, and thus more often than the evaluation occurs.

Figure 3.20: Fixed sample time, granularity 6500 cycles

So for the correct setting, knowledge about the Nyquist frequency is essential.


The Nyquist theorem states that the sampling rate must be greater than twice the highest frequency contained in the input signal. In our case, the "input" signal consists of all relevant signal-changing events of the Octopus system. In the example above, the clock output has a period of 10000 cycles, so the granularity must stay below 5000 cycles: 500 cycles satisfies this condition, 6500 cycles does not. Used correctly, the granularity parameter thus allows a highly performant simulation. For a video presentation refer to section 4 on page 119.

The synchronization between Simulink and Octopus took much time for testing and for interpreting the different effects. Several alternative implementations were considered, and the solution presented here was judged to be the best approach. The Octopus/Simulink adaptation layer sources comprise about 1,200 lines of code, not counting the test code.

3.3 Summary of the Practical Experience Part

This chapter discussed the practical experience part of the diploma thesis; the implementation and testing of the presented tasks took about 460 h. A summarizing visualization of what has been done can be seen in Figure 3.21. Both sides of simulation were covered: the development of models for simulation, and the attachment of a simulator engine to an application.

[Figure 3.21 is a block diagram: the Hiwave debugger and Matlab/Simulink are connected through their adaptation layers to the Octopus platform simulation backplane; attached to the backplane are the Star12 core, the existing models (Mem, Flash, ROM, IRQ, IPBI, CRG, MMC, ...), the ATD and PWM modules, the S-function dump I/O, and the signal dump and stimuli interface (VCD, Octopus stimuli); the parts labeled "Thesis Implementation" mark the contributions of this thesis.]

Figure 3.21: Implementation Overview

The figure shows, in abstracted form, the different components of the simulation technique: the two applications, Hiwave and Matlab/Simulink, as well as their adaptation layers, of which the lower one was a topic of this thesis, and the simulation backplane with the attached models, both the existing ones and the two covered by this thesis.


4 Conclusion

This diploma thesis deals with a topic that every embedded controller software engineer uses, but usually does not think of. The simulation of CPUs to allow software debugging is well known, but the simulation of full chips in various environments drives system development into new spheres. In times of decreasing product cycle times and new technologies becoming available faster and faster, the simulation sector will consequently become more important.

For me, it was fascinating to see that there are already many approaches and products facing the full-chip simulation challenge, that a few groups are already developing standards for this topic, and that the refinement grows every year. The main advances in my personal knowledge were the techniques used to simulate systems, how specifications of hardware are to be read and interpreted, and how they can be transferred into a piece of software.

It was an experience to experiment with software in order to find a very performant, but still comfortable, middle course between raw, fast and hard-to-read C programming and the object-oriented software paradigm of C++. In many ways, object-oriented software offers solutions that are nice to implement and nice to read, but a strict object-oriented design also often costs performance. Such points are better solved in plain C. In other parts, the usage of plain C does not result in better performance but in complicated statements and functions that are easier to express in C++. As an example, take the implementation of functions as class members: this may be nice to read, but the "this" pointer of object instances, passed to every method as a hidden argument, costs performance for often-called methods.

Another gain in experience was the transfer of standards to existing implementations. Things that were difficult to understand when reading the user's manual of the model manager became clearer when reading the corresponding standards.

Simulator engines that care for full systems and are integrated into various environments give engineers the power to effectively design electronic systems for the devices of the 21st century. I hope my diploma thesis can help a little to improve this.


About Motorola

Motorola is a globally acting vendor of embedded microprocessor products and integrated communication solutions. Paul V. Galvin founded it as Galvin Manufacturing Corporation in 1928 in Chicago, Illinois. Its first products were radios under the brand name "Motorola", which became the name of the company in 1947. After the Second World War Motorola's domain was military, space and commercial communication. When cellular telephone technology evolved, the company became one of the big players in this market, too. Today Motorola also focuses on the automotive sector, with cars using more and more electronic components.

Motorola has five main business units, which are described in [Mot01]:

Global Customer Solutions Operations (GCSO) has no products of its own but is responsible for providing complete customer solutions. The regional organizations all report to GCSO; marketing and support management are also part of this unit.

The Integrated Electronic Systems Sector (IESS) is the unit that provides products for the following markets: automotive, industrial, transportation, navigation, communication and energy systems.

Within the Networks Sector (NS), solutions for broadband communication, combined voice, data and multimedia IP communication and mobile intelligence are developed.

The most publicly known business unit is the Personal Communications Sector (PCS), because this unit produces the mobile phones. Paging and messaging devices as well as personal two-way radios also belong to this unit, which develops the corresponding software and accessories, too.

The biggest business unit is the Semiconductor Products Sector (SPS), which produces the microprocessors used within the other units and sold to other customers, e. g. the automobile industry. A buzzword in this unit is DigitalDNA, which means that the microprocessors enable customers to create intelligent solutions. In this unit, more precisely in the Transportation Systems & Products Group (TSPG), I worked for the practical part of this diploma thesis, in a team called Virtual Garage.


Bibliography

[BR01] Peter Braun and Martin Rappl. Abstraktionsebenen eingebetteter Systeme. In Andy Schürr, editor, OMER-2 Workshop Proceedings, page 27, München, March 2001. Universität der Bundeswehr München.

[CN01] CNET Networks, Inc. CNET glossary. http://www.cnet.com/Resources/Info/Glossary/, 2001.

[CO98] Cathy Cox and Einat Ophir. PWM_8B6C Block User Guide. Motorola, Inc., Gurgaon, India, 1998.

[DH99] Prof. Dr.-Ing. Gerd Dost and Dr.-Ing. Göran Herrman. Entwurf und Technologie von Mikroprozessoren. In Thomas Beierlein and Olaf Hagenbruch, editors, Taschenbuch Mikroprozessortechnik, pages 357–359, München/Wien, 1999. Fachbuchverlag Leipzig.

[Hor95] Lynn Horobin. Standard Delay Format Specification. Open Verilog International, Los Gatos, CA, 1995.

[IEE95] IEEE, New York. IEEE Std 1364-1995: IEEE Standard Hardware Description Language Based on the Verilog Hardware Description Language, 1995.

[IEE99] IEEE, New York. IEEE Std 1499-1998: Standard Interface for Hardware Description Models of Electronic Components, 1999.

[KLR99] Matthias Koch, Stefan Lenk, and Michael Rohleder. User Guide Octopus Generic Interface Library. Motorola Co., Munich, 1999.

[Mat00] The MathWorks, Inc., Natick, Massachusetts, US. Matlab/Simulink Online Help, 2000.

[Mat01] The MathWorks, Inc., Natick, Massachusetts, US. Writing S-Functions, 2001.

[Mot00] Motorola, Inc., Australia. STAR12 V1.5 Core User Guide, 2000. Version 1.0.

[Mot01] Motorola, Inc. Motorola homepage, 2001. http://www.motorola.com.

[Rap01] Rapid-Prototyping of Application Specific Signal Processors (RASSP). RASSP homepage, 2001. http://rassp.scra.org.

[Roh01a] Michael Rohleder. Octopus 0.88 Online Help. Motorola Co., Munich, 2001.

[Roh01b] Michael Rohleder. Octopus 0.88 Simulator I/F Description. Motorola Co., Munich, 2001.

[Sch01] Joachim Schlosser. Analog to Digital Converter Interface & Usage Documentation. Motorola Co., Munich, 2001.

[VSI99] VSI Alliance. Model Taxonomy Version 2.0 (SLD 2.2.0), 1999.

[VSI01a] VSI Alliance. Taxonomy of Functional Verification For Virtual Component Development and Integration, 2001.

[VSI01b] VSI Alliance. VSI homepage, 2001. http://www.vsi.org.

[WLvH01] X. Wu, S. Lane, and M. von Hoff. Barracuda Integration Platform User Manual. Motorola, Inc., Australia, 2001.


CD-ROM Contents

The video demos are part of the thesis. They have very big file sizes to preserve all details and full resolution.

/                                CD-ROM root
/autorun.inf                     CD-ROM startup file
/thesis.pdf                      Thesis document in Portable Document Format (PDF)
/start.html                      CD-ROM main file
/shellopn.exe                    ShellOpen, self-developed program, used within autorun.inf
/tools/                          Programs required for viewing the documents provided on the CD
/tools/ar500enu.exe              Adobe Acrobat Reader (for document)
/tools/QuickTimeInstaller.exe    Apple Quicktime Player (for video demos)
/video/                          Video demos, all in Quicktime format
/video/models_review.mov         Code review of model development video
/video/models_demo.mov           Hiwave debugger video
/video/sl_intro.mov              Simulink introduction video
/video/sl_review.mov             Simulink/Octopus interface code review video
/video/sl_demo.mov               Simulink/Octopus interface demo video
/html/                           CD navigation files in HTML format
/html/models_review.html
/html/models_demo.html
/html/sl_intro.html
/html/sl_review.html
/html/sl_demo.html


Tools and Applications used

All tools are listed in alphabetical order.

Practical Experience

Barracuda Integration Platform CD053_RELBL_1.0.5    Star12 base models
GNU make 3.77                                       test case build process and execution
GTKwave Analyzer 1.2.97                             VCD file viewer
Metrowerks ANSI-C/C++ Compiler for HC12 V-5.0.21    Star12 compiler
Metrowerks CodeWarrior IDE 4.1, build 0629          Star12 code generation
Metrowerks Hi-Wave 6.1                              Star12 debugger
Metrowerks SmartLinker V-5.0.15                     Star12 linker
Microsoft Visual C++ Version 6.0                    simulator compiler
Motorola Octopus 0.88                               model manager
VIM 5.7                                             editor

Thesis Document and Video Demos

LaTeX 2ε                                            documentation
METAPOST                                            figures
Corel Draw                                          figures, CD print
Corel PhotoPaint                                    figures, cover picture creation
GNU Emacs                                           documentation editor
TechSmith Camtasia                                  video demos capturing
TechSmith SnagIt Demo                               video demos capturing, screen shots


Glossary of Abbreviations

ASIC    Application Specific Integrated Circuit. As its full name implies, an ASIC is a custom microchip designed for a specific application. Of course, the chip doesn't reinvent the wheel: ASIC design involves taking common functions from a library and integrating them onto a circuit. [CN01]

FPGA    Field-Programmable Gate Array. Technique for a test implementation of ICs or for the production of very small series. The circuits are burnt into a matrix of transistors.

HDL     Hardware Description Language. A language used to describe hardware in such a way that a layout can be generated from the description. It is the last step before the physical layout is created. Here all entities like memory cells are modeled. Very common HDLs today are Verilog and VHDL.

IC      Integrated Circuit. A complete functionality encapsulated usually in a chip or integrated into a SoC.

ICE     In-Circuit Emulator. Technique for realizing a physical prototype by building a reconfigurable prototyping system.

ISA     Instruction Set Architecture. Describes how the instruction set of a CPU looks and how it works. It includes the list of instructions as well as their semantics and timings.

ODE     Ordinary Differential Equations. Differential equations arising from assembling circuit representations into simulation models.

OMF     Open Model Forum. Group "formed to solve the problem of logic model availability." [IEE99, p. iii]

OMI     Open Model Interface. Interface between models and simulators. Language independent. Specified in IEEE Std 1499.

OVI     Open Verilog International. Former group that developed standards for systems, semiconductors and design tools. It joined VHDL International in 2000 and founded Accellera. http://www.accellera.org/

RTL     Register Transfer Language. A concise way of specifying micro code instructions. Micro code is a level below assembly language, which is itself a level below high-level languages like C and C++. So RTL can be viewed as the opcode.

SDF     Standard Delay Format. The Standard Delay Format is subject to become IEEE Std 1497; it was developed by a group named Open Verilog International, which is now part of the Accellera group. Referring to the specification [Hor95], an SDF file "stores the timing data generated by EDA tools for use at any stage in the design process." This can include delays, timing checks, timing constraints, timing environments, incremental and absolute delays, conditional and unconditional module path delays, design- or instance-specific data, type- or library-specific data and scaling, environmental and technology parameters. http://www.eda.org/sdf/

SoC     System-on-Chip. Multiple ICs integrated into one chip. Replaces former integration boards.

VCD     Value Change Dump. VCD is a file format used to save signals, analog and digital. It is specified in IEEE Std 1364-1995 ([IEE95, chapter 15]).


Statement of Creation

I, Joachim Schlosser, born January 22nd, 1977, declare that I created this diploma thesis on my own. It was not presented anywhere else for the purpose of examination. No sources or aids other than those mentioned were used. All quotes, literal and logical ones, are labeled.

Königsbrunn, 2001-02-06

Joachim Schlosser
