We are moving towards an era of software-based
musical instruments, intelligent accompanists, and music as information,
says Ray Kurzweil in highlights from his keynote speech at the 2003 Audio
Engineering Society convention.
Highlights of the Richard C. Heyser Memorial Lecture to the 115th
Annual Convention of the Audio Engineering Society on Oct. 11, 2003. Published
on KurzweilAI.net Oct. 13, 2003.
Music technology is about to be radically transformed. Communication bandwidths,
the shrinking size of technology, our knowledge of the human brain, and
human knowledge in general are all accelerating. Three-dimensional molecular
computing will provide the hardware for human-level "strong" AI well before
2030. The more important software insights will be gained in part from
the reverse-engineering of the human brain, a process well under way.
Once nonbiological intelligence matches the range and subtlety of human
intelligence, it will necessarily soar past it because of the continuing
acceleration of information-based technologies, as well as the ability
of machines to instantly share their knowledge.
The impact of these developments will deeply affect all human endeavors,
including music. Music will remain the communication of human emotion
and insight through sound from musicians to their audience, but the concepts
and process of music will be transformed once again.
The Coming Revolution in Intellectual Property
The issue of protecting intellectual property goes far beyond music and
audio technologies, but the crisis has started in the music industry.
Already, music recording industry revenues are down sharply, despite an
overall increase in the distribution of music. The financial crisis has
caused music labels to become cautious and conservative, investing in
proven artists, with less support available for new and experimental musicians.
The breakdown of copyright protection is starting to impact musical instruments
themselves. Synthesizers, samplers, mixers, and audio processors can all
be emulated in software. It has been estimated that at least 90 percent
of the copies of "Reason," one of the emulation software leaders, are
pirated.
Music controllers still require hardware, but when full-immersion visual-auditory
virtual reality environments become ubiquitous, which I expect by the
end of this decade, we'll be using virtual controllers that essentially
consist of "just" software. When we have the full realization of nanotechnology-based
assembly in the 2020s, we will be creating actual hardware at almost no
cost from software.
We are not far from that reality today, and for the recording industry
it is already clear that the principal product – music – is pure information.
In all industries, the portion of products and services represented by
their information content is rapidly increasing. By the time we get to
the nanotechnology era, most products will be essentially information.
With file sharing, we've seen a breakdown of copyright protection. With
streaming and remote access technologies, the problem will become even
worse because existing copyright law doesn't even cover these situations.
If I call up a friend on the phone and play a new CD that I purchased,
that's not a violation of copyright law, nor should it be. But what is
a phone call? It's a streaming connection. File sharing networks will
evolve into file streaming networks.
So if you want to listen to a song, the network finds a machine with that
file and it is played on that machine. You listen in on a streaming connection.
No files or information are ever copied. Copyright law is based entirely
on the concept of copying, so if we bypass copying, there is no violation.
We can extend this concept to all forms of software, including interactive
software. In this case, the user effectively uses someone else's machine
using remote access software (such as pcAnywhere or Microsoft's Remote
Desktop). With continued acceleration in hardware power, running software
on someone else's machine is likely to occupy only a small fraction of
the power of the computers involved.
Clearly, intellectual property licenses, and copyright law itself, can
be amended to try to deal with this situation, but there are still problems.
How do you define what is to be proscribed? Playing songs or demonstrating
software to friends should still be allowed. Obviously, vast sharing networks
go beyond friendship. So the law will need to define what constitutes
a friend. Obviously there are some very slippery slopes here.
The educational challenge will be even greater. If consumers today understand
copyright at all, they understand it in terms of making copies of information.
How is the public to understand the concept if no actual copying takes
place?
There are workable schemes for protecting software by building in locks
that prevent software from working on machines other than authorized ones.
These rely on means to identify what computer is being used, and these
systems work reasonably well today. But the streaming approach bypasses
this form of protection.
Having cited some of the difficulties, we need to recognize that protection
of intellectual property is critical, otherwise we destroy the business
model that provides for the capital formation required to create the intellectual
property in the first place.
We could discuss at length various technical means for protecting information
such as music files, but the bottom line is that all of these systems
are easily breakable if that is what the public wants to do.
It may seem obvious that this is indeed what the public wants to do, but
that does not need to be the case. Educating consumers on the value to
them of protecting intellectual property is feasible, and without such
a social compact, technical approaches will inevitably fail.
Is such a social expectation feasible? We do have a successful example:
the cell-phone industry. Unlike the recording industry, this communications
industry did not stick with the business model of the 1950s and 1960s,
which included very high charges for a long distance call. The cost of
a long distance call has fallen from tens of dollars to pennies. Had that
not been the case, you can be sure that people would be routinely breaking
cell phone network access just as readily as they now share music files.
Although there are people who do break cell phone access codes, this is
not considered a cool thing to do.
In the recording industry, the fault lies primarily with the industry
for not having budged from a business model of charging tens of dollars
for an album, a pricing model that existed when my father was a child.
The current lawsuits may have an educational effect, but the industry
is being disingenuous in the extreme by launching these suits before they
have provided a viable legitimate system of file downloading. Apple's
music site is a good initiative, but under industry pressure they have
backed off their commitment to allow personal copies, and the service
still doesn't run on 98% of the installed base of personal computers.
As we've seen in the case of cell phones, people won't go to the trouble
of breaking technical protection schemes if an industry provides a system
of access and competitive pricing that the public views as tolerable and
With the entire economy headed towards the complete dominance of information,
this remains a critical challenge.
New Ways to Create Music
Musical expression also offers new challenges. It has always used the
most advanced technologies available, from ancient drums, the cabinet-making
crafts of the eighteenth century, the mechanical linkages of the nineteenth
century, the analog electronics of the mid-20th century, the digital technology
of the 1980s and 1990s to the artificial intelligence coming in the 21st
century. With digital samplers and synthesizers, we were able for the
first time in human history to create sounds that had the complexity of
acoustic sounds, but that did not originate from purely acoustic instruments.
For example, we could start with piano samples and modify them with a
variety of digital synthesis techniques to create sounds that had the
richness of the piano, but were impossible with acoustic means alone.
A particular challenge that we dealt with in creating the Kurzweil 250
was how to recreate the inharmonic overtones of a piano. Most instruments
have harmonic overtones; that is, the overtones are perfect multiples of
the fundamental frequency. In a piano, the overtones are slightly different
from being perfect multiples, and this is one of the features that gives
a piano its unique timbre. Conventional samplers at the time looped the
last waveform and applied a decay envelope. But their piano samples sounded
like organ samples (at the point of looping) because the overtones of a looped waveform are
simple multiples of the fundamental frequency, lacking the subtlety of
the complex waveforms generated by the piano and other natural instruments.
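
The difference between harmonic and slightly stretched overtones can be made concrete with a small calculation. The sketch below is only an illustration and is not the method used in the Kurzweil 250: it assumes the standard stiff-string approximation, in which partial n lies at n * f0 * sqrt(1 + B * n^2), and the stiffness coefficient B used here is a made-up illustrative value. The function names are hypothetical.

```python
import math


def harmonic_partials(f0, count):
    """Ideal string: partial n sits at exactly n * f0."""
    return [n * f0 for n in range(1, count + 1)]


def stiff_string_partials(f0, count, b):
    """Stiff string approximation: partial n is n * f0 * sqrt(1 + b * n^2).

    b is the inharmonicity coefficient; b = 0 gives the ideal string.
    """
    return [n * f0 * math.sqrt(1.0 + b * n * n) for n in range(1, count + 1)]


def cents_sharp(actual, reference):
    """Interval from reference up to actual, in cents (1200 per octave)."""
    return 1200.0 * math.log2(actual / reference)


if __name__ == "__main__":
    f0 = 220.0   # nominal fundamental in Hz
    b = 0.0004   # assumed, illustrative stiffness coefficient
    ideal = harmonic_partials(f0, 8)
    stiff = stiff_string_partials(f0, 8, b)
    for n, (h, s) in enumerate(zip(ideal, stiff), start=1):
        print(f"partial {n}: harmonic {h:8.2f} Hz, "
              f"stiff {s:8.2f} Hz, {cents_sharp(s, h):5.2f} cents sharp")
```

Under this approximation the deviation grows with the partial number, which is why a single looped cycle, whose overtones are forced to be exact multiples, cannot reproduce it.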
In recent years, we've seen the emergence of software-based samplers,
synthesizers, mixers, and sound processors. Although there still are significant
performance benefits in using hardware DSP-based devices, software-based
systems such as Reason are adequate to create professional recordings,
such as movie soundtracks.
The next wave of instruments will be based on physical modeling, actually
simulating the interaction of sound with the strings, curved wood, and
other components of physical instruments. It is then possible, of course,
to create simulated instruments that would be impossible to render physically.
The concept of physical modeling has been around for over a decade, but
available systems are limited to building instruments from limited sets
of building blocks.
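
As a rough illustration of the idea, the sketch below uses the Karplus-Strong plucked-string algorithm, one of the simplest and best-known string models. It is far cruder than the detailed simulation of wood and resin described here, and the function name, parameter names, and default values are assumptions chosen for the example.

```python
import random
from collections import deque


def karplus_strong(frequency, sample_rate=44100, duration=1.0,
                   damping=0.996, seed=0):
    """Return a list of samples for a plucked-string tone.

    A delay line one period long is filled with noise (the "pluck").
    Each output sample is fed back as the damped average of itself and
    its neighbor, which acts like energy loss in a vibrating string.
    """
    period = int(round(sample_rate / frequency))
    if period < 2:
        raise ValueError("frequency too high for this sample rate")

    rng = random.Random(seed)  # seeded so the result is repeatable
    delay_line = deque(rng.uniform(-1.0, 1.0) for _ in range(period))

    samples = []
    for _ in range(int(sample_rate * duration)):
        first = delay_line.popleft()
        samples.append(first)
        # Low-pass (averaging) feedback with a small loss per pass.
        delay_line.append(damping * 0.5 * (first + delay_line[0]))
    return samples


if __name__ == "__main__":
    tone = karplus_strong(440.0, sample_rate=8000, duration=0.5)
    print(len(tone), "samples generated")
```

Changing the delay length changes the pitch, and changing the feedback filter changes how the virtual string loses energy, so a "string" with properties no physical material has is just a different set of parameters.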
Future physical modeling systems will allow detailed emulation of highly
complex shapes and materials, including, for example, the special resins
used to create fine violins. The state of the art in physical modeling
requires high-end DSP chips today, but software-based physical modeling
synthesizers will be ubiquitous within five years. PCs will also increasingly
include DSPs, particularly since they are targeted at applications with
audio and image processing that can benefit from DSPs. Intel experimented
with this with a special version of the Pentium (Pentium MMX). This is
likely to continue to happen. Microprocessors used in synthesizers and
consumer products will also increasingly include DSP functionality.
We are also moving towards an era of intelligent accompanists. We've had
for many years "autoplay" features on home pianos for beginning students,
but these are largely unsatisfactory because they require the human player
to keep up with the automated players. What is needed in an intelligent
accompanist is a system that follows the user, not the other way around.
With such a system, a student could be playing a simple one-line melody,
and the system would fill in with appropriate walking bass lines, rhythmic
patterns, and harmonic progressions.
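
One small piece of "following the user" can be sketched in code: tracking the player's tempo so that the accompaniment is scheduled around the player rather than around a fixed clock. The class below is a hypothetical, minimal sketch, not a description of any existing product. It assumes the student plays one note per beat and smooths the measured intervals with a simple exponential average; all names and default values are illustrative.

```python
class TempoFollower:
    """Estimate the player's beat length from note onset times."""

    def __init__(self, initial_beat_seconds=0.5, smoothing=0.5):
        self.beat_seconds = initial_beat_seconds  # current estimate
        self.smoothing = smoothing                # 0 = ignore player, 1 = jump
        self.last_onset = None

    def note_played(self, onset_seconds):
        """Update the beat estimate when the player strikes a note."""
        if self.last_onset is not None:
            interval = onset_seconds - self.last_onset
            if interval > 0:
                # Move the estimate part of the way toward what was played.
                self.beat_seconds += self.smoothing * (interval - self.beat_seconds)
        self.last_onset = onset_seconds

    def next_beat_time(self):
        """Predict when the next accompaniment event should sound."""
        if self.last_onset is None:
            return None
        return self.last_onset + self.beat_seconds

    def tempo_bpm(self):
        return 60.0 / self.beat_seconds


if __name__ == "__main__":
    follower = TempoFollower()
    for onset in [0.0, 0.6, 1.2, 1.8]:  # a player slightly slower than 120 bpm
        follower.note_played(onset)
    print(f"estimated tempo: {follower.tempo_bpm():.1f} bpm")
    print(f"next bass note at: {follower.next_beat_time():.3f} s")
```

A real accompanist would also have to infer harmony and handle wrong or missing notes; this sketch covers only the timing side.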
Tod Machover has developed a series of interactive instruments that he
calls hyperinstruments. They effectively provide the serious musician
with intelligent accompanists. Although the human player stays in control,
a single player can match the richness and intricacy of an entire ensemble.
Music is a means of communicating human feelings and ideas from composers
and performers to an audience. It is a language, or we might say a set
of languages, that allows us to communicate emotions ranging from humor
to sorrow. Machines can amplify our ability to communicate musically by
providing richer palettes of sounds and means of manipulating and controlling
Machines can also provide narrow forms of intelligence that work in close
concert with human intelligence. The closeness of this connection will
grow over time, reflecting the overall growing intimacy between humans
and their machines.