[from the GUEST EDITORS]

Vesa Välimäki, Rudolf Rabenstein, Davide Rocchesso, Xavier Serra, and Julius O. Smith

Signal Processing for Sound Synthesis: Computer-Generated Sounds and Music for All

Digital sound synthesis is celebrating its 50th birthday in 2007: it was first tested at Bell Labs by Max Mathews and his colleagues in 1957, but it took quite some time before the technology was ready for the market. Digital music synthesizers were commercialized about 20 years later, starting at the end of the 1970s. In the mid-1980s, digital synthesizers became popular when the musical instrument digital interface (MIDI) specification was released and incorporated in products. This made it possible to play several synthesizers connected to each other, or to play music synthesizers under computer control using so-called sequencing software. During the past five years, the MIDI music synthesizer has found its way into mobile phones. At the time of this writing, there are about 2.5 billion mobile phones in the world. Many of them can play melodic (polyphonic) ring tones, which are usually generated from some sort of MIDI file (e.g., SP-MIDI) using wavetable synthesis. For this reason, almost everybody uses sound synthesis in their daily lives, although many may not know it.

Sound synthesis is a field of technology in which signal processing plays a key role. The research focus has spread from music synthesis to various multimedia and entertainment applications, such as sound effects and background music for gaming, ring tones and alarms, and interactive sound generation for virtual reality and other multimodal systems. In addition to musical tones, it is now practicable to devise algorithms for everyday sounds, for instance footsteps, hand clapping, or rolling wheels.
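
To make the ring-tone example concrete, the following minimal Python sketch illustrates the basic principle of wavetable synthesis: a stored single-cycle waveform is read out at a pitch-dependent rate with interpolation. It is only an illustration under assumed parameters (a sine-wave table and a 44.1 kHz sample rate), not an implementation of SP-MIDI or of any particular phone synthesizer.

```python
import numpy as np

def wavetable_oscillator(table, f0, fs, duration):
    """Read a stored single-cycle waveform at a rate set by the pitch f0."""
    n = int(duration * fs)
    increment = f0 * len(table) / fs            # table samples per output sample
    phase = (np.arange(n) * increment) % len(table)
    i0 = phase.astype(int)                      # integer part of the read position
    i1 = (i0 + 1) % len(table)                  # next table entry (wraps around)
    frac = phase - i0
    return (1.0 - frac) * table[i0] + frac * table[i1]   # linear interpolation

fs = 44100
table = np.sin(2 * np.pi * np.arange(2048) / 2048)        # one stored cycle
tone = wavetable_oscillator(table, f0=440.0, fs=fs, duration=0.5)
```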

Still, music synthesizers and related equipment, such as digital pianos, are popular, and they are continuously refined. Old sound synthesis technology does not disappear but is brought back to life with the help of new signal processing: take, for example, virtual analog synthesis, a methodology that emulates the oscillators and filters of analog synthesizers using digital signal processing.
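
To give a flavor of what such emulation involves, the short Python sketch below pairs a naive digital sawtooth oscillator with a one-pole lowpass filter derived from an analog RC prototype. It is a minimal illustration under simple assumptions, not a description of how any commercial virtual analog product works; notably, the naive sawtooth aliases audibly at high pitches, which is precisely the kind of problem addressed by the last article in this issue.

```python
import numpy as np

def naive_sawtooth(f0, fs, duration):
    """Trivial digital sawtooth oscillator (aliases audibly at high pitches)."""
    t = np.arange(int(duration * fs)) / fs
    return 2.0 * (t * f0 - np.floor(t * f0 + 0.5))   # values in [-1, 1)

def one_pole_lowpass(x, cutoff, fs):
    """Digital emulation of an analog RC lowpass filter:
    y[n] = (1 - a) x[n] + a y[n-1], with a = exp(-2*pi*cutoff/fs)."""
    a = np.exp(-2.0 * np.pi * cutoff / fs)
    y = np.zeros_like(x)
    prev = 0.0
    for n in range(len(x)):
        prev = (1.0 - a) * x[n] + a * prev
        y[n] = prev
    return y

fs = 44100
saw = naive_sawtooth(f0=110.0, fs=fs, duration=1.0)
filtered = one_pole_lowpass(saw, cutoff=1000.0, fs=fs)
```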

A major shift during the last decade has been that many sound synthesis products that were traditionally hardware electronics have become software. Some traditional problems seem to be waiting for perfect solutions forever. These include parametric, natural-sounding synthesis of musical instruments and the singing voice, automatic evaluation of sound quality, and musically meaningful, intelligent control of virtual musical instruments.

The exploitation of sound synthesis methods still seems to depend on computational power. Indeed, one may trace the waves of sound synthesis systems and products, from frequency-modulation synthesizers to time-domain physical models, in terms of increasingly powerful generations of computing devices. This issue of computational efficiency has two different aspects in sound synthesis: real-time processing and latency. The first aspect requires that a synthesis algorithm can be run at least as fast as the corresponding physical sound production process. The second aspect simply means that the sound output has to start at the same time as, or a very short instant after, a musician touches a key. There are no excuses for buffers, mesh generation, or matrix preconditioning. While the real-time processing capability may be achieved by raw digital signal processing power, low latency requires a careful design of the interrelations among the algorithm, the operating system, and the audio hardware.

Among the challenges for future researchers, we mention the problem of translating representations of sound that make sense for humans into sound-synthesis parameters. This and other open issues are part of the scope of the EU Coordination Action “Sound-to-Sense, Sense-to-Sound” (http://www.soundandmusiccomputing.org), which has supported the preparation of this special issue.

In the future, we foresee several new developments. Sound synthesis is likely to become even more ubiquitous in our daily lives, but perhaps in ways that are subtle and unobtrusive to the user. It is likely to become prominent in (nonmusical) interaction with devices, appliances, and artifacts of various kinds. In many contexts, sound synthesis is crucial for providing feedback that is immediate, engaging, and expressive. Even now, there are gaming consoles and electronic toys that are based on interactions of gestures and sound. Of course, only sound synthesis, as opposed to frozen prerecorded audio, can afford realistic continuous interaction. User interfaces and sonic enhancement of everyday objects are potentially growing application areas.

In the traditional music synthesis area, singing voice synthesis is not yet mainstream in produced music. Nevertheless, pieces of vocal synthesis software, such as the Yamaha Vocaloid, have been available for a few years. Can we expect to hear more computer-generated singing in music in the future? Perhaps. Particularly in popular music, the singer’s voice is often heavily processed using, for example, filters, dynamic compression, or pitch-shifting algorithms. The step to fully computer-generated voices will not be a large one in many categories, such as choral singing.

IN THIS ISSUE

This special issue is a collection of articles by active researchers in sound synthesis. For many years, the electronic keyboard has been almost synonymous with the synthesizer among musicians. Conversely, in the first article, Jukka Rauhala, Heidi-Maria Lehtonen, and Vesa Välimäki tell us how traditional keyboard instruments are going to be simulated in the near future. The second article, by Maarten van Walstijn, addresses the special problem of simulating wind instruments such as the clarinet or the trumpet. It is based on physical modeling sound synthesis, just like the next three articles. Robust algorithms for nonlinear oscillators, strings, and plates are presented in the third article by Stefan Bilbao. Here, energy considerations are applied to design numerical methods that are stable even for strongly nonlinear systems. The article by Rudolf Rabenstein, Stefan Petrausch, Augusto Sarti, Giovanni De Sanctis, Cumhur Erkut, and Matti Karjalainen provides a review of modular approaches to sound synthesis based on physical simulation. Particular emphasis is given to the problem of compatibly interconnecting modules of different types. In the article by Damian Murphy, Antti Kelloniemi, Jack Mullen, and Simon Shelley, a technique for simulating linear multidimensional acoustic resonators is presented, with applications in synthetic drum kits, room reverberation, and voice synthesis.

The next two articles are related to the development of sophisticated signal processing methods for software synthesizers. Jordi Bonada and Xavier Serra write about singing voice synthesis, which is now approaching natural quality. Eric Lindemann, in his article, describes a synthesis method called reconstructive phrase modeling. It is used in a piece of software called Synful Orchestra that enables high-quality synthesis of orchestral music. Diemo Schwarz generalizes the systems introduced in the previous two articles and presents what can be called corpus-based concatenative methods, which assemble sound samples using musical controls. Pietro Polotti and Gianpaolo Evangelista consider, in their article, wavelet-based methods for the analysis and synthesis of pseudoperiodic signals. A deterministic model for the harmonic components is combined with a stochastic model for noise-like fluctuations in musical signals. In the last article, Vesa Välimäki and Antti Huovilainen describe ways to achieve classic signal waveforms in the digital domain, such as pulse-train and sawtooth waveforms, without incurring the audible aliasing that is common in simple algorithms.

We may notice that out of these ten articles, five are related to physical modeling of musical instruments and three to spectral modeling. This reflects the fact that physical modeling has been the most popular synthesis approach in academic research since the early 1990s. Much technical literature related to physical modeling synthesis has appeared during the past few years. For those interested in this topic, we recommend the book by Trautmann and Rabenstein [1] and the online book by Julius Smith [2]. We can suggest several recent books as further reading for those interested in sound synthesis. The textbook by Perry R. Cook [3] describes the basics in a very accessible manner. Some textbooks are available for learning the fundamentals of sound synthesis in the context of computer music. We recommend the classics by Roads [4], Dodge and Jerse [5], and Boulanger [6].

Several edited books are available that cover special areas of sound synthesis. The book by Roads et al. [7] contains articles on spectral modeling, physical modeling, and granular synthesis, among others; Kahrs and Brandenburg [8] include in their book writings about wavetable synthesis, sinusoidal analysis/synthesis, and physical modeling; Zölzer [9] has compiled results from a series of conferences, covering, for example, source-filter and spectral synthesis models; and Rocchesso and Fontana [10] expand the horizon of sound synthesis from music to nonmusical noises, such as those related to collisions and breaking.

We hope that the articles published in this special issue will become important references for those working in sound synthesis in academia or industry.

ACKNOWLEDGMENTS

We would like to thank all the invited authors for their great work and the anonymous reviewers who contributed their valuable time to give detailed feedback and constructive criticism. We are grateful to the “Sound-to-Sense, Sense-to-Sound” Coordination Action and its coordinator Nicola Bernardini for providing funding for the meeting of the guest editors, which was held at the University of Erlangen-Nuremberg, Germany, in June 2006.

REFERENCES

[1] L. Trautmann and R. Rabenstein, Digital Sound Synthesis by Physical Modeling Using the Functional Transformation Method. New York: Kluwer/Plenum, 2003.
[2] J.O. Smith, Physical Audio Signal Processing: For Virtual Musical Instruments and Audio Effects, Aug. 2006 ed. [Online]. Available: http://ccrma.stanford.edu/~jos/pasp/
[3] P.R. Cook, Real Sound Synthesis for Interactive Applications. Wellesley, MA: A.K. Peters, 2002.
[4] C. Roads, The Computer Music Tutorial. Cambridge, MA: MIT Press, 1996.
[5] C. Dodge and T.A. Jerse, Computer Music: Synthesis, Composition, and Performance. New York: Wadsworth, 1997.
[6] R.C. Boulanger, Ed., The Csound Book: Perspectives in Software Synthesis, Sound Design, Signal Processing, and Programming. Cambridge, MA: MIT Press, 2000.
[7] C. Roads, S.T. Pope, A. Piccialli, and G. De Poli, Eds., Musical Signal Processing. Lisse, The Netherlands: Swets and Zeitlinger, 1997.
[8] M. Kahrs and K. Brandenburg, Eds., Applications of Digital Signal Processing to Audio and Acoustics. Norwell, MA: Kluwer, 1998.
[9] U. Zölzer, Ed., DAFX: Digital Audio Effects. Chichester, UK: Wiley, 2002.
[10] D. Rocchesso and F. Fontana, Eds., The Sounding Object. Florence, Italy: Mondo Estremo, 2003.

[SP]