Audio Tech
Introduction
This chapter introduces the kind of hardware and software needed in a Digital Audio
Workstation. Important hardware components and their possible specifications are explained
first, followed by software considerations, including codecs, editors and players.
1.1.1 CPU
The CPU is the foundation of an audio production PC. The processor’s core count and
speed will determine how quickly you can accomplish various editing tasks. If your PC
doesn’t have a powerful processor, it’s going to be slow, regardless of anything else.
Modern editing software will take advantage of many CPU cores and hyperthreading, so
investing in a good CPU is crucial when building a PC for audio production.
Generally speaking, the CPU is where you should invest the largest part of your budget.
Serious builders should be considering a 6-core CPU at a minimum or a 4-core/8-thread
CPU as a viable alternative.
In most situations, having more CPU cores should give you better performance. That said,
relatively modest projects will likely be fine with an 8-core or even lower core-count CPU;
depending on the complexity of your project and the latency you require, you may want to
upgrade to one of the monstrous 10-, 14-, or even 18-core CPU options on the market.
1.1.2 GPU
One thing in the current market you can relax about is the GPU for your build. If you are
using your workstation for more than audio production, then you will need to make sure you
have a balanced system; but by itself, the GPU is not really utilized when it comes to audio
work.
For most music production and general DAW (Digital Audio Workstation) builds, the GTX
1660 will give you plenty of power for any and all needs. However, if you are also planning
to use the workstation for 3D Rendering, high-resolution video editing, and/or gaming—or if
you just want a PC that can handle a wide variety of program tasks, just in case—then you
may want to consider upgrading to something like the RX 5700 XT or even the RTX 2070
Super.
1.1.3 RAM
Having enough RAM is critical when looking for good all-around performance for your build.
For music production and audio editing, having 16GB is a reasonable minimum to ensure
headroom for all relevant applications. Then, as you go up in the performance of your CPU,
you can jump up to 32GB or 64GB of RAM for a top-end system.
A useful alternative to just throwing more RAM at the problem is spending a bit more on
faster RAM. Especially if you are happy with overclocking your CPU, you can get very
noticeable performance improvements from 3000MHz (or higher) RAM over the standard
2133MHz. If you know you are not going to be running a lot of programs at once, then this is
a major alternative to simply doubling your RAM.
1.1.4 Storage
You can have the fastest system in the world, but if the CPU cannot read the data from the
HDD fast enough, then that performance is wasted.
The general rule of thumb is that you want a dedicated drive to store the files you are
editing, and then a main storage drive once you are done with them. This is why all of our
examples come with 2 drives as standard, a smaller-capacity (faster, more expensive)
NVMe SSD or SATA SSD and a larger-capacity (slower, cheaper) SATA HDD or SATA SSD.
As you go up in budget, you can get even more elaborate by having one drive for your
operating system and key programs, one for project files in use, and a third as a short-term
cache drive. You can also get quite smart with this, as you’ll find a lot of the newer M.2
SSDs have super-high sustained read/write speeds to and from the drive. So, although they
are more expensive, you are paying for pure performance.
An alternative to this or a further possibility (especially if avoiding data loss is a high priority)
is setting up multiple drives in a RAID configuration; this can be achieved by mounting
additional drives inside the case if there is space, or by purchasing an external enclosure.
1.1.5 Motherboard
For an audio workstation, the motherboard choice usually comes down to a very simple
question of, “how many things can I plug into this?” For that reason, we would usually
recommend full-size ATX boards, so that you are not losing a PCIe slot on the board.
Losing a PCIe slot may seriously limit you, especially if you’re browsing the sound card list.
A few other considerations when selecting a motherboard are likely to include exactly what
the onboard sound is capable of handling in terms of bitrates and the like, the number of
available USB ports, and the number of available SATA ports.
When it comes to audio processing, modern motherboards often have very good integrated
sound. If you’re especially concerned about the quality of sound as you’re editing, however,
check reviews of the motherboards you’re considering in order to make sure the audio
quality will meet your needs. As mentioned above, a good additional purchase for such
concerned users is a PCIe sound card.
Finally, as mentioned above, think about how many hard drives you’re planning to have.
While most PC users get away with having one or two, editors are very often strapped for
storage as you focus on maintaining good workflow. Standard SSDs and HDDs run off of
SATA ports, whereas the super-fast M.2 SSDs run off a special M.2 port on the board (or
can be mounted onto a PCIe slot adapter and plugged into the board that way).
1.1.6 Power Supply
The last thing you want is for your PC to instantly power down through lack of power, or
even to short out the entire board from an overload. So, it is important to not only get a
power supply that will handle what you throw at it, but also one which has good power
safety features in the unfortunate case of a power surge. For this reason, we would strongly
recommend getting a power supply only from a well-known and reputable manufacturer and
to opt for an 80+ rating of Bronze or better—which will also ensure your PC uses electricity
efficiently, and runs cooler and quieter.
There are various ways of producing audio, depending on the nature of the audio being
produced and the purposes for which it is going to be used. That means a digital audio
workstation can be a single simple PC, or it can involve a great deal of both hardware and
software. In all these circumstances, important software products, which may stand alone or
be integrated into other systems, include the following:
Audio composing software can be defined as software used for musical composition, digital
recording, the creation of electronic music, and other musical applications. There are many
kinds of software in this category, but the main standard, and the one that will be discussed
in detail, is MIDI.
A digital audio player, shortened to DAP, is a consumer electronic device that stores,
organizes and plays digital audio files. In contrast, analog audio players play music from
cassette tapes, or records. Often digital audio players are sold as MP3 players, even if they
support other file formats.
Audio compression can refer to two things. In audio engineering it means reducing a signal's
dynamic range, i.e. the difference between the loudest and quietest parts of an audio signal.
In the data sense, which is the focus here, it means encoding the audio so that the file
becomes smaller for storage or for transmission.
Introduction
This topic focuses on two main technologies in the multimedia industry: MIDI and MP3.
It describes the components of MIDI and its history, how it works, and the interfaces and the
cables it can use to transfer data. For MP3, we look at the compression method used
and the advantages associated with it, how to store music in the MP3 format and how to
share it. Finally, we look at how to convert other file formats into MP3 format.
3.1 Introduction
As indicated in the previous chapters, many hardware and software components are needed
in a real digital audio workstation, and most of them have been listed earlier in this module.
In this chapter we are going to look at two of the most important elements: the music
production protocol (MIDI) and the most common container in which music is stored and
played (MP3). Although there are many music production and storage technologies, an
understanding of these two will go a long way towards giving you the basic ideas behind
music-related technology.
3.2 MIDI
MIDI, which stands for Musical Instrument Digital Interface, is a connectivity standard for
transmitting digital instrument data. It is mainly utilised by computers, electronic keyboards and
synthesisers. In addition, it is supported by other instruments like beat boxes, electronic drums,
and even digital stringed instruments, like guitars and violins.
MIDI is a protocol designed for recording and playing back music on digital synthesisers,
and it is supported by many different types of personal computer sound cards. It was
originally designed to control one keyboard from another, but was quickly adapted for the
personal computer. At its most basic, MIDI sends data about how music is created. The
command set
contains note-ons, note-offs, key velocity, pitch bend, and many other methods of controlling a
synthesiser. The sound waves created are already saved in a wavetable in the receiving
instrument or sound card.
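To make this concrete, the sketch below (Python, as an illustration only) builds the raw bytes of a note-on and note-off message as defined by the MIDI 1.0 protocol: a status byte whose upper half gives the message type and whose lower half gives the channel, followed by two 7-bit data bytes for the note number and velocity.

```python
# Illustrative only: raw MIDI 1.0 note messages as bytes.
def note_on(note, velocity, channel=0):
    """3-byte note-on: status 0x9n (n = channel 0-15), note number, velocity."""
    return bytes([0x90 | (channel & 0x0F), note & 0x7F, velocity & 0x7F])

def note_off(note, velocity=0, channel=0):
    """3-byte note-off: status 0x8n, note number, release velocity."""
    return bytes([0x80 | (channel & 0x0F), note & 0x7F, velocity & 0x7F])

print(note_on(60, 100).hex(" "))   # 90 3c 64 -> middle C, fairly firm press
print(note_off(60).hex(" "))       # 80 3c 00 -> the matching release
```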
With a program that offers this interface, you can produce music using a standard keyboard, or
some other input device. You can play your MIDI creation with the same program (or with
another program), as long as you have a sound card to function as a music synthesizer. The
MIDI program may have a graphical user interface that looks like a sound studio control room.
Many sound cards come as a package with MIDI software.
MIDI data contains different kinds of information. When a single key on a synthesiser is
pressed, it sends the note played, the velocity (how firmly the key was pressed), and how
long the note is held. If several notes are played at once, the MIDI data for all the notes is
sent simultaneously. Other pieces of information that may be transmitted over a MIDI
connection include the instrument ID, sustain pedal timings, and controller data, such as
pitch bend and vibrato.
When a synthesiser is connected to a computer through a MIDI connection, the notes played
can be recorded by Digital Audio Workstation (DAW) software in MIDI format. The MIDI
information can be played back by transmitting the recorded MIDI notes to the keyboard, which
outputs them as audio samples, such as piano or strings. Most DAW software supports MIDI
editing, which allows you to adjust the timing and velocity of individual notes, change their pitch,
and add or delete notes. MIDI information is often displayed in a piano-roll style view, with
bars representing each note played. Many programs can also translate MIDI information into
a musical score.
A MIDI recording contains both instrument data and the notes played. The actual sound is
played back using samples from real instruments. By changing the output instrument, a MIDI
track for piano can be played back with a guitar sound, or vice versa.
Traditionally, MIDI connections used dedicated MIDI cables connected to a 5-pin MIDI port
on each device. Nowadays, most MIDI devices use standard computer interfaces, such as
USB or Thunderbolt ports. These modern interfaces offer more bandwidth than traditional
MIDI ports, which in turn allows more tracks with more data to be sent at the same time.
The MIDI protocol uses eight-bit serial transmission with one start bit and one stop bit. It has
a 31.25 kbit/s data rate and is asynchronous. Connection is made through a five-pin DIN
plug, of which three pins are in use at any given time.
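Those figures make it easy to estimate the throughput of a classic MIDI link. With one start bit and one stop bit around each 8-bit byte, every byte costs 10 bits on the wire, so the simple Python arithmetic below shows that a 3-byte note message takes roughly a millisecond to transmit.

```python
# Rough throughput of a 5-pin MIDI link, using the figures above.
baud_rate = 31250            # bits per second
bits_per_byte = 10           # 8 data bits + 1 start bit + 1 stop bit
bytes_per_second = baud_rate / bits_per_byte        # 3125 bytes/s
note_message_bytes = 3       # status + note number + velocity
messages_per_second = bytes_per_second / note_message_bytes
print(f"{bytes_per_second:.0f} bytes/s, "
      f"about {messages_per_second:.0f} note messages/s, "
      f"roughly {1000 / messages_per_second:.2f} ms per message")
```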
3.2.1 General MIDI
The MIDI standard outlines 128 General MIDI (GM) instruments, which already exist on most
computers as software instruments. The sound of each GM instrument may vary between
different computers and keyboards because the samples used for the instruments may be
different, but the GM instrument ID is consistent across devices. For example, GM instrument
#1 is assigned to an acoustic piano, whereas #20 is assigned to a church organ, and #61 is
assigned to a French horn. Modern synthesisers include hundreds, or even thousands, of
other instruments that can be chosen, most of which offer more realistic sound than the
original, standard GM options.
MIDI is a communication standard that allows digital music gear to speak the same language.
MIDI is short for Musical Instrument Digital Interface. It’s a protocol that allows computers,
musical instruments and other hardware to communicate.
MIDI was first developed in the early 80s to standardize the growing amount of digital music
hardware. Manufacturers needed a simple way to make their products compatible with those of
other brands. Roland founder Ikutaro Kakehashi proposed the idea of a standard instrument
language to the other major manufacturers, including Oberheim, Sequential Circuits and Moog,
in 1981. The project had some lofty goals. MIDI attempted to provide a way to communicate all
the features of a musical performance digitally. The architects of the MIDI standard had to define
all kinds of complex musical behaviour in a way that 1980s-era technology could work with—not
an easy task. Their choices had big consequences for the way electronic instruments were
designed for the next 40 years.
The finished MIDI standard was finally unveiled in 1982. Kakehashi and Dave Smith both later
received Technical Grammy Awards in 2013 for their key roles in the development of
MIDI—about time!
MIDI never transmits an actual audio signal—it’s information only. That means that if a MIDI
keyboard doesn’t have an onboard sound source like a synth or sampler, it won’t make any
sound!
That sheds some light on where MIDI can come into your workflow.
If you’re composing using plugins in your DAW, MIDI clips are the regions on your timeline that
control which notes your plugins play and when they play them.
When you connect a MIDI controller to your DAW to play virtual instruments, you’re simply
feeding them real time MIDI information.
The same is true when you sequence MIDI in your DAW and send the information to hardware
gear like an analog synth or drum machine.
The biggest benefit of MIDI is that you can easily edit performances note by note, change their
articulation, or even alter or replace the sound that plays them!
But that’s not all. You can control a lot more than just notes using MIDI. Many features of a
traditional musical performance have an equivalent in MIDI.
You can also use it to automate parameters or change patches on hardware or software
instruments or effects. That’s where MIDI messages come in.
They carry information about which parameters to change, how the system should behave or
which notes to play—and how they should be played.
MIDI messages can be broken down into two types: System messages and Channel messages.
Most of the time you’ll be dealing with channel messages, although some important functions
like clock and transport (stop, start and continue) are system messages.
For example, note on and off messages carry the note number value as well as the velocity
value—the intensity the note was played with.
● Note ON and OFF: which notes are depressed and released. Includes velocity.
● Aftertouch: the pressure a key is held down with after it's depressed.
● Control Change: changes a parameter value on the device.
● Program Change: changes the patch number on the device.
● Channel Pressure: the single greatest pressure value for all depressed keys.
● Pitch Bend Change: change in the pitch bend wheel or lever.
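The sketch below (Python, for illustration) shows how these channel messages look on the wire: the upper four bits of the status byte select the message type and the lower four bits select the channel, followed by one or two data bytes. The particular notes, controllers and channels used are just examples.

```python
# Illustrative channel-message builder: upper nibble = message type,
# lower nibble = MIDI channel (0-15). Data bytes are 7-bit values.
STATUS = {
    "note_off":         0x80,
    "note_on":          0x90,
    "aftertouch":       0xA0,  # polyphonic key pressure
    "control_change":   0xB0,
    "program_change":   0xC0,
    "channel_pressure": 0xD0,
    "pitch_bend":       0xE0,
}

def channel_message(kind, data, channel=0):
    status = STATUS[kind] | (channel & 0x0F)
    return bytes([status] + [d & 0x7F for d in data])

# Example: select GM program #20 (church organ) on channel 3
# (programs are 0-indexed on the wire, so #20 is sent as 19) ...
print(channel_message("program_change", [19], channel=2).hex(" "))  # c2 13
# ... and push the mod wheel (controller 1) to its maximum on channel 1.
print(channel_message("control_change", [1, 127]).hex(" "))         # b0 01 7f
```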
System messages control other essential data that digital instruments need to communicate
with each other.
A MIDI event is a MIDI message that occurs at a specific time. They’re especially relevant when
it comes to compositions that rely on MIDI sequencers.
In this configuration, the sequencer sends its data to all the different parts of your setup and
keeps them in sync with each other.
For the majority of producers, their DAW takes care of MIDI sequencer duties.
Some musicians prefer to use hardware sequencers for their unique workflow or capabilities.
Sequencers can control external hardware, virtual instruments in your DAW or a combination of
the two.
One stream of MIDI data has a total of 16 independent channels for messages and events.
You can think of these channels kind of like tracks in your DAW or sequencer—but don’t get
confused, you’re certainly not limited to only 16 MIDI tracks in your DAW!
MIDI channels are most important when you’re dealing with external hardware.
Each device in your MIDI setup can be set to send or receive data on a particular channel.
From there it’s as easy as setting the output channel on tracks in your sequencer to determine
which device should play what.
It also means you can chain all your devices together easily with a single cable using your MIDI
interface.
They’re perfect for working with external MIDI gear like hardware synths and drum machines.
Some even have multiple pairs of MIDI I/O to accommodate every possible device in your
studio.
MIDI controllers are a special type of MIDI interface with an input device built-in. Typical input
methods available on MIDI controllers are piano keys and pressure sensitive pads, but most
include knobs and sliders for controlling other parameters as well.
They’re called 5-pin DIN cables and they’re for connecting the inputs, outputs and thru outputs
on traditional MIDI gear.
Some extra compact gear uses MIDI over ¼” or ⅛” balanced TRS cable. In this situation you
may have to use special cables or converter boxes to interface with devices using the 5-pin
connector.
MIDI interfaces (and some forward-looking MIDI synths) often connect to the computer using
USB.
The correct way to connect them can be a bit confusing depending on the situation. To get it
right you have to follow the direction of your signal flow.
The output of the device sending MIDI information must always be connected to the input of the
device you intend to receive it.
MIDI thru is for sending the same MIDI information to multiple devices. Remember—you can set
each device to a different channel and use a single stream of MIDI to control them all.
MIDI THRU is how you make that connection. Simply connect the MIDI Thru of the first device
to the MIDI IN of the next device in the chain to duplicate the MIDI data and send it downstream.
With the help of VST plugins this setup turns your MIDI controller into whatever you want it to
be: Millions of different synths, drum machines, guitars, flutes, horns, or pretty much anything
else you can dream up.
You can edit sequences in the piano roll and input notes manually, or play your parts with
the help of the controller.
Plus many MIDI controllers come with knobs, pads and sliders that are assignable as well
through your DAW.
This setup is light and intuitive for composing all genres of music via MIDI.
In this situation, your DAW acts as the main hub for sending and sequencing all the MIDI
information.
Using the DAW Piano roll, each hardware unit can be instructed to play any sequence of notes
on any MIDI channel.
In this example a hardware sequencer takes the place of your DAW’s MIDI editing features.
Using MIDI THRU, the sequencer sends information to three devices: two synths and a drum
machine.
This setup is like a mini DAW rig made up of entirely hardware gear. This is how most producers
used MIDI before computers were cheap enough to be commonly used in music.
The original architects of the protocol did a fantastic job of creating a way for digital instruments
to communicate.
But a lot has happened in the world of technology since the beginning of MIDI.
At this point the standard needs to evolve to fit in with how music tech has changed around it.
For one thing, the power and speed of even the simplest modern digital gear is light years
ahead of what designers were working with in the 80s.
And ideas about how digital music devices should interact have changed too.
All this has led to the development of the all new MIDI 2.0 standard. It’s not completely out yet,
but it has the potential to greatly expand the possibilities of digital music production in the near
future.
Before you grab your pitchforks, the MIDI association guarantees that the new standard will be
perfectly backwards compatible with any MIDI 1.0 gear—that’s a relief!
3.3 MP3
On a CD, music is sampled 44,100 times per second and each sample is 2 bytes (16 bits)
long. Separate samples are taken for the left and right speakers in a stereo system. So a CD
stores a huge number of bits for each second of music: 44,100 samples per second × 2
bytes per sample × 2 channels = 176,400 bytes, or about 1.4 million bits, per second.
Let's break that down: 1.4 million bits per second equals 176,000 bytes per second. If an
average song is three minutes long, then the average song on a CD consumes about 32 million
bytes (or 32 megabytes) of space. Even with a high-speed cable or DSL modem, it can take
several minutes to download just one song. Over a 56K dial-up modem, it would take close to
two hours.
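Those figures can be checked with a few lines of arithmetic; the short Python sketch below simply reproduces the calculation described above.

```python
# Reproducing the CD arithmetic above.
sample_rate = 44_100          # samples per second, per channel
bytes_per_sample = 2          # 16 bits
channels = 2                  # stereo

bytes_per_second = sample_rate * bytes_per_sample * channels   # 176,400
bits_per_second = bytes_per_second * 8                         # 1,411,200 (~1.4 Mbit/s)
song_bytes = bytes_per_second * 3 * 60                         # a 3-minute song

print(f"{bits_per_second:,} bits per second")
print(f"about {song_bytes / 1_000_000:.0f} MB for a 3-minute song")
```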
The MP3 format is a compression system for music. The goal of using MP3 is to compress a
CD-quality song by a factor of 10 to 14 without noticeably affecting the CD-quality sound. With
MP3, a 32-megabyte song on a CD compresses down to about 3 MB. This lets you download a
song much more quickly, and store hundreds of songs on your computer's hard disk.
To make a good compression algorithm for sound, a technique called perceptual noise shaping
is used. It's "perceptual" partly because the MP3 format uses characteristics of the human ear to
design the compression algorithm. For example:
There are certain sounds that the human ear cannot hear.
There are certain sounds that the human ear hears much better than others.
If there are two sounds playing simultaneously, we hear the louder one but cannot hear the
softer one.
Using facts like these, certain parts of a song can be eliminated without significantly hurting the
quality of the song for the listener. Compressing the rest of the song with well-known
compression techniques shrinks the song considerably -- by a factor of 10 at least. When you're
done creating an MP3 file, what you have is a "near-CD-quality" song. The MP3 version of the
song does not sound exactly the same as the original CD song because some of it has been
removed.
Not all MP3 files are equal. Let's take a look at the different ends of the MP3 spectrum in the
next section.
The MP3 compression format creates files that don't sound exactly like the original recording --
it's a lossy format. In order to decrease the size of the file significantly, MP3 encoders have to
lose audio information. Lossless compression formats don't sacrifice any audio information. But
that also means that lossless compression files are larger than their lossy counterparts.
You can choose how much information an MP3 file will retain or lose during the encoding and
compression process. It's possible to create two different MP3 files with different sound quality
and file sizes from the same source of data. The key is the bit rate -- the number of bits per
second encoded in the MP3 file.
Most MP3 encoding software allows the user to select the bit rate when converting files into the
MP3 format. The lower the bit rate, the more information the encoder will discard when
compressing the file. Bit rates range from 96 to 320 kilobits per second (Kbps). Using a bit rate
of 128 Kbps usually results in a sound quality equivalent to what you'd hear on the radio. Many
music sites and blogs urge people to use a bit rate of 160 Kbps or higher if they want the MP3
file to have the same sound quality as a CD.
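As a rough illustration of what those bit rates mean in practice, the sketch below estimates the file size of a 3-minute song at several common MP3 bit rates and compares each with the uncompressed CD data rate from the previous section. Real MP3 files will vary slightly because of headers, metadata and variable bit rate encoding.

```python
# Rough file sizes for a 3-minute song at common MP3 bit rates,
# compared with the uncompressed CD rate worked out earlier.
song_seconds = 3 * 60
cd_bits_per_second = 1_411_200

for kbps in (96, 128, 160, 192, 256, 320):
    size_mb = kbps * 1000 * song_seconds / 8 / 1_000_000
    ratio = cd_bits_per_second / (kbps * 1000)
    print(f"{kbps:3d} kbps -> {size_mb:4.1f} MB  (about {ratio:.0f}x smaller than CD)")
```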
Some audiophiles -- people who seek out the best ways to experience music -- look down on
the MP3 format. They argue that even at the highest bit rate settings, MP3 files are inferior to
CDs and vinyl records. But other people argue that it's impossible for the human ear to detect
the difference between an uncompressed CD file and an MP3 encoded with a 320 Kbps bit rate.
Some musicians and audio engineers say that the MP3 format is changing the way music
studios mix recordings. They say that the MP3 format "flattens" out the dynamics, the
differences in volume, in a song. As a result, much of the new music coming out of the
industry has a similar sound, and there's not as much of a focus on creating a dynamic
listening experience.
From this description, you can see that MP3 is nothing magical. It's simply a file format that
compresses a song into a smaller size so it is easier to move around and store on your home
computer -- or your portable music player.
The name MPEG is an acronym for Moving Picture Experts Group. This group has developed
compression systems used for video data; for example, DVD movies, HDTV broadcasts and
DSS satellite systems use MPEG compression to fit video and movie data into smaller
spaces. The MPEG compression system includes a subsystem for compressing sound,
called MPEG Audio Layer-3, which we know by its abbreviation, MP3.
If you would like to download and listen to MP3 files on your computer, then you need:
● A computer
● A sound card and speakers for the computer (if your computer has speakers, it has a
sound card)
● An Internet connection (if you are browsing the Web to read this article, then you have a
working Internet connection)
● An MP3 player (a software application you can download from the Web in 10 minutes)
If you have recently purchased a new computer, chances are it already has software that can
play MP3 files installed on its hard disk. The easiest way to find out if you already have an MP3
player installed is to download an MP3 file and try to double-click on it. If it plays, you're set. If
not, you need to download a player, which is very easy to do.
There are literally thousands of sites on the Web where you can download MP3 files. Go to one
of these sites, find a song and download it to your hard disk. Most MP3 sites let you either listen
to the song as a streaming file or download it -- you'll probably want to download it, if you want
to save a copy for later. Most songs range between 2 and 4 MB, so it will take 10 to 15 minutes
unless you have a high-speed Internet connection. Once the song has finished downloading, try
to double-click on the file and see what happens. If your computer plays it, then you're set.
If you find that you cannot play it, then you need to download an MP3 player. There are dozens
of players available, and most of them are free or shareware -- shareware is extremely
inexpensive.
You're now ready to begin collecting MP3 files and saving them on your computer. Many people
have hundreds of songs they have collected, and they create jukebox-like playlists so that their
computer can play them all day long!
Many people who start collecting MP3 files find that they want to listen to them in all kinds of
places. Small, portable MP3 players answer this need. These players are like portable cassette
players except that they are smaller.
These players plug into your computer's FireWire or USB port to transfer the data. A software
application lets you transfer your MP3s into the player by simply dragging the files.
If you have a CD collection and would like to convert songs from your CDs into MP3 files, you
can use ripper and encoder software to do just that. A ripper copies the song's file from the CD
onto your hard disk. The encoder compresses the song into the MP3 format. By encoding
songs, you can play them on your computer or take them with you on your MP3 player.
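As a purely illustrative example of the encoding step, the sketch below uses the third-party pydub library (which in turn needs FFmpeg installed) to convert a ripped WAV file to MP3. The file names and the 192 kbps setting are placeholders, not recommendations, and many other rippers and encoders will do the same job.

```python
# Hypothetical encoding step using the third-party pydub library
# (pip install pydub; FFmpeg must also be installed). File names are placeholders.
from pydub import AudioSegment

# A ripper has already copied the CD track to an uncompressed WAV file;
# this performs the "encoder" half of the job described above.
track = AudioSegment.from_wav("track01.wav")
track.export("track01.mp3", format="mp3", bitrate="192k")
```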
If you have a writable CD drive in your computer, there are two ways to save your MP3 files on a
CD:
You can write the MP3 files themselves onto a data CD in order to save them and clear some
space on your hard disk. You can then listen to the files on any computer. Some car stereos and
DVD players let you play data-encoded MP3s, too. Because the file size is much smaller than a
CD file, you can fit many more songs onto a CD when you use the MP3 file format.
You can convert (decode) your MP3 files into full-sized CD tracks and then save them to an
audio CD. This allows you to listen to your MP3 files on any CD player. But remember that
converting MP3 files into CD tracks limits the number of files you can fit on a CD. Also,
converting an MP3 into a larger file format doesn't replace the information lost during the original
MP3 encoding. In other words, the music files won't sound any better than they did as MP3 files.
Many MP3 encoders have plug-ins that create full-size WAV files from MP3 files, and some of
the encoders will also decode. Once you have the full-size CD tracks, then the software that
comes with your CD-R drive will let you create an audio CD easily. Other MP3 encoders and
players have similar features. It's good to do a little research before you choose your MP3
application -- some are more reliable than others.
Your MP3 files are only as good as the encoder that created them. Inferior encoders using low
bit rates can produce errors called artifacts.
If you are an artist who is recording music at home or in a small studio, you can use MP3 files
and the Web to distribute your music to a larger audience. The first step is to create a song,
either on a cassette tape, minidisc or CD. If it's on a CD, you can use the ripper and encoder
tools described in the previous section to create an MP3 file. If it's on a cassette or other source,
you can connect the output of the audio source to the line-in or microphone jack of your sound
card and record the music digitally on your computer. Then you can encode that file to create
the MP3.
Once you have an MP3 file in hand, you have two distribution options:
You can go to an MP3-distribution site and let them distribute your music. The advantage of
this approach is that large MP3-distribution sites get millions of visitors every month, so your
potential audience is very large.
You can create your own Web site for your music or band and promote the site yourself. This
gives you more control and individuality, but requires you to get the word out on your own.
Some musicians distribute their music through a blog.
One good option is to make your MP3 files available on a large Web site and then link to the
download area from your band's Web site. This lets you get the best of both worlds, and you
can take advantage of the larger site's servers for those big MP3 files.
Introduction
This chapter introduces the basics of audio production and processing. Among the concepts
that are going to be explained herein are digital audio compression, audio resolution,
common audio file formats and sample ADCs.
2.1 ADC
A high quality digital audio representation can be achieved by directly measuring the
analogue waveform at equidistant time steps along a linear scale. This procedure is
indicated in figure 1. The distance between the grid markers on the time axis is defined by
the sample frequency, and on the amplitude axis by the maximal amplitude divided by the
number of bits. Therefore, higher sample frequencies give narrower grids along the time
axis and more bits (narrower grids) along the amplitude axis. The individual values to be
stored on computers and representing the sound wave can only be located on the cross
points of the underlying grid. Since the actual sound pressure will never be exactly on these
cross-points, errors are made, i.e. digitization noise is introduced. It is obvious that the
narrower the grids are the smaller the error will be. Therefore, a higher sample frequency
and a larger number of bits will increase the quality. However, for speech signals we know
that the highest frequency a child's voice can create is in the order of 8 kHz. According to
the Nyquist rule, a sampling frequency twice as high is sufficient to represent such a
frequency component properly, so 16 kHz is sufficient for speech. The human ear is said to
be sensitive up to about 20 kHz, which is why the HiFi norm requires sampling frequencies
above 40 kHz.
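The sketch below (Python, illustration only) places a 1 kHz sine wave onto such a grid: it samples the wave at an assumed 16 kHz and rounds each sample to the nearest of 2^8 amplitude levels, printing the small digitisation error introduced at each point.

```python
import math

# Illustration of the sampling/quantisation grid: a 1 kHz sine wave sampled
# at an assumed 16 kHz and rounded to the nearest of 2**8 amplitude levels.
sample_rate = 16_000    # grid spacing on the time axis (samples per second)
bits = 8                # grid spacing on the amplitude axis
levels = 2 ** bits

for n in range(5):                                   # the first few samples
    t = n / sample_rate
    x = math.sin(2 * math.pi * 1000 * t)             # true amplitude, -1..1
    q = round((x + 1) / 2 * (levels - 1))            # nearest grid level
    x_hat = q / (levels - 1) * 2 - 1                 # value actually stored
    print(f"t={t * 1000:5.3f} ms  true={x:+.4f}  stored={x_hat:+.4f}  "
          f"error={x - x_hat:+.5f}")
```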
In the real world, analog signals are signals that are continuous in time and can take
continuous values (though in some cases the range is finite). These types of signals can
come from sound, light, temperature and motion. Digital signals are represented by a
sequence of discrete values, where the signal is broken down into a sequence that depends
on the time series or sampling rate (more on this later). The easiest way to explain this is
through a visual: the diagram below shows a good example of what analog and digital
signals look like.
Microcontrollers can't read values unless the data is digital. This is because a
microcontroller can only see "levels" of the voltage, which depend on the resolution of the
ADC and the system voltage.
ADCs follow a sequence when converting analog signals to digital. They first sample the
signal, then quantize it according to the resolution of the converter, and finally assign binary
values and send them to the system as digital data. Two important aspects of an ADC are
its sampling rate and its resolution.
The ADC’s sampling rate, also known as sampling frequency, can be tied to the ADC’s
speed. The sampling rate is measured by using “samples per second”, where the units are
in SPS or S/s (or if you’re using sampling frequency, it would be in Hz). This simply means
how many samples or data points it takes within a second. The more samples the ADC
takes, the higher frequencies it can handle.
fs = 1/T
Where,
fs = Sample Rate/Frequency
T = Period of the sample or the time it takes before sampling again
For example, on the diagram above, it seems fs is 20 S/s (or 20 Hz), while T is 50 ms. The
sample rate is very slow, but the signal still came out similar to the original analog signal.
This is because the frequency of the original signal is a slow 1 Hz, meaning the frequency
rate was still good enough to reconstruct a similar signal.
"What happens when the sampling rate is considerably slower?" you might ask. It is
important to know the sampling rate of the ADC because you will need to know whether it
will cause aliasing. Aliasing means that, when a digital image or signal is reconstructed, it
differs greatly from the original because of undersampling.
If the sampling rate is slow and the frequency of the signal is high, the ADC will not be able
to reconstruct the original analog signal which will cause the system to read incorrect data.
A good example is shown below.
In this example, you can see where the sampling occurs in the analog input signal. The
output of the digital signal is not at all close to the original signal as the sampling rate is not
high enough to keep up with the analog signal. This causes aliasing and now the digital
system will be missing the full picture of the analog signal.
One rule of thumb when figuring out if aliasing will happen is using Nyquist Theorem.
According to the theorem, the sampling rate/frequency needs to be at least twice as much
as the highest frequency in the signal to recreate the original analog signal. The following
equation is used to find the Nyquist frequency:
fNyquist = 2fMax
Where,
fNyquist = minimum sampling rate required to avoid aliasing
fMax = highest frequency component in the signal
For example, if the signal that you input into the digital system has a max frequency of 100
kHz, then the sampling rate on your ADC needs to be equal or greater than 200 kS/s. This
will allow for a successful reconstruction of the original signal.
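A minimal sketch of this check, using the 100 kHz example above (the function names are just illustrative):

```python
# Minimal sketch of the Nyquist check described above.
def min_sample_rate(f_max_hz):
    """Smallest sampling rate that can represent a signal containing f_max_hz."""
    return 2 * f_max_hz

def will_alias(sample_rate_hz, f_max_hz):
    return sample_rate_hz < min_sample_rate(f_max_hz)

print(min_sample_rate(100_000))        # 200000 -> need at least 200 kS/s
print(will_alias(150_000, 100_000))    # True: 150 kS/s is too slow
print(will_alias(250_000, 100_000))    # False: comfortably above Nyquist
```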
It is also good to note that there are cases where outside noise can introduce unexpected
high frequencies into the analog signal, which can corrupt the result because the sample
rate cannot handle the added noise frequency. It is always a good idea to add an
anti-aliasing filter (a low-pass filter) before the ADC, so that unexpected high frequencies
are prevented from reaching the system.
The ADC’s resolution can be tied to the precision of the ADC. The resolution of the ADC
can be determined by its bit length. A quick example on how it helps the digital signal output
a more accurate signal is shown below. Here you can see that the 1-bit only has two
“levels”. As you increase the bit length, the levels increase making the signal more closely
represent the original analog signal.
If you need accurate voltage level for your system to read, then the bit resolution is
important to know. The resolution depends on both the bit length and the reference voltage.
These equations help you figure out the total resolution of the signal that you are trying to
input in voltage terms:
Step Size = VRef / N
Where,
VRef = reference voltage
N = 2^n
n = Bit Size
For example, let's say that a sine wave with a voltage range of 5V needs to be read, and the
ADC has a bit size of 12 bits. Plug 12 into n in the equation above and N will be 4096. With
that known, and the voltage reference set to 5V, the step size = 5V/4096, which works out to
around 0.00122V (or 1.22mV). This is accurate, as the digital system will be able to tell when
the voltage changes with an accuracy of 1.22mV.
If the ADC had a very small bit length, let's say only 2 bits, then the step size would grow to
1.25V, which is very poor, as the system could only distinguish four voltage levels (0V,
1.25V, 2.5V and 3.75V).
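The same calculation can be wrapped in a tiny helper, sketched below, which reproduces the 12-bit and 2-bit figures above.

```python
# The step-size calculation above, wrapped in a small helper.
def adc_step_size(v_ref, n_bits):
    """Smallest voltage change the ADC can resolve: VRef / 2**n."""
    return v_ref / (2 ** n_bits)

print(f"{adc_step_size(5.0, 12) * 1000:.2f} mV")   # ~1.22 mV for a 12-bit ADC
print(f"{adc_step_size(5.0, 2):.2f} V")            # 1.25 V for a 2-bit ADC
```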
The diagram below shows common bit lengths and their numbers of levels. It also shows
what the step size would be for a 5V reference. You can see how much more accurate the
conversion gets as the bit length increases.
By understanding both the resolution and the sample rate of an ADC, you can see how
important it is to know these values and what to expect from your converter.
Analog Devices has a great range of high-quality, reliable ADCs, covering both
general-purpose and special-purpose converters. Here are a few to consider for your next
design:
2.6.1 AD7175-2 (Max Resolution: 24-bit | Max Sample Rate: 250 kSPS)
The AD7175-2 is a sigma-delta analog-to-digital converter for low-bandwidth inputs. It is a
low-noise, fast-settling, multiplexed 2-/4-channel ADC with a maximum channel scan rate of
50 kSPS (20 µs) for fully settled data. Output data rates range from 5 SPS to 250 kSPS, an
individual setup can be configured for each analog input channel in use, and the resolution
is up to 24 bits. Applications include process control (PLC/DCS modules), temperature and
pressure measurement, medical and scientific multichannel instrumentation, and
chromatography.
2.6.2 AD9680 (Max Resolution: 14-bit | Max Sample Rate: 1.25 GSPS)
This ADC has a wide full-power bandwidth that supports IF sampling of signals up to 2 GHz.
It has four integrated wideband decimation filters, and its numerically controlled oscillator
(NCO) blocks support multiband receivers. Its buffered inputs with programmable input
termination ease filter design and implementation. Applications include communications,
general-purpose software radios, ultrawideband satellite receivers, instrumentation, radars
and much more.
2.6.3 AD7760 (Max Resolution: 24-bit | Max Sample Rate: 2.5 MSPS)
The AD7760 is a high-performance sigma-delta ADC that combines wide input bandwidth
and high speed with the benefits of sigma-delta conversion to achieve a performance of
100 dB SNR at 2.5 MSPS, making it ideal for high-speed data acquisition. Its wide dynamic
range, combined with significantly reduced antialiasing requirements, simplifies the design
process. Applications include data acquisition systems, vibration analysis, and
instrumentation.
2.7 File formats and standards
Audio files come in all types and sizes. And while we may all be familiar with MP3, what
about AAC, FLAC, OGG, or WMA? Why do so many audio standards exist? Is there a best
audio format? Which ones are important and which ones can you ignore?
It's actually quite simple once you realize that all audio formats fall into three major
categories. Once you know what the categories mean, you can just pick a format within the
category that best suits your needs.
PCM (pulse-code modulation) is the most straightforward digital audio format. It has a
"sampling rate" (how often a sample is made) and a "bit depth" (how many bits are used to
represent each sample), and there is no compression involved. The digital recording is a
close-to-exact representation of the analog sound.
PCM is the most common audio format used in CDs and DVDs. There is a subtype of PCM
called linear pulse-code modulation (LPCM), in which the quantization levels are linearly
uniform. LPCM is the most common form of PCM, which is why the two terms are almost
interchangeable at this point.
A lot of people assume that all WAV files are uncompressed audio files, but that's not
exactly true. WAV is actually a Windows container for different audio formats. This means
that a WAV file could potentially contain compressed audio, but it's rarely used for that.
Most WAV files contain uncompressed audio in PCM format. The WAV file is just a wrapper
for the PCM encoding, making it more suitable for use on Windows systems. However, Mac
systems can usually open WAV files without any issues.
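If you want to confirm what a particular WAV container actually holds, Python's standard-library wave module can read the header of a PCM WAV file; the file name below is a placeholder.

```python
# Inspecting what a WAV container actually holds, using only the standard
# library. "example.wav" is a placeholder for any PCM WAV file on disk.
import wave

with wave.open("example.wav", "rb") as wav:
    print("channels:   ", wav.getnchannels())
    print("sample rate:", wav.getframerate(), "Hz")
    print("bit depth:  ", wav.getsampwidth() * 8, "bits")
    print("frames:     ", wav.getnframes())
```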
Like WAV files, AIFF files can contain multiple kinds of audio formats. For example, there is
a compressed version called AIFF-C, and another version called Apple Loops which is used
by GarageBand and Logic Audio. They both use the same AIFF extension.
Most AIFF files contain uncompressed audio in PCM format. The AIFF file is just a wrapper
for the PCM encoding, making it more suitable for use on Mac systems. However, Windows
systems can usually open AIFF files without any issues.
Lossy compression means sacrificing sound quality and audio fidelity for smaller file sizes.
When it's done poorly, you'll hear artifacts and other weirdness in the audio. But when it's
done well, you won't be able to hear the difference.
The goal of MP3 compression is three-fold: 1) drop the sound data that exists beyond the
hearing range of normal people, 2) reduce the quality of sounds that aren't easy to hear, and
3) compress all remaining audio data as efficiently as possible.
Nearly every digital device in the world with audio playback can read and play MP3 files,
whether we're talking PCs, Macs, Androids, iPhones, Smart TVs, or whatever else. When
you need universal, MP3 will never let you down.
Even though MP3 is more of a household format, AAC is still widely used today. In fact, it's
the standard audio compression method used by YouTube, Android, iOS, iTunes, later
Nintendo portables, and later PlayStations.
Vorbis was first released in 2000 and grew in popularity due to two reasons: 1) it adheres to
the principles of open source software, and 2) it performs significantly better than most
other lossy compression formats (meaning it produces a smaller file size for equivalent
audio quality).
MP3 and AAC have such strong footholds that OGG has had a hard time breaking into the
spotlight---not many devices support it natively---but it's getting better with time. For now, it's
mostly used by hardcore proponents of open-source software.
Not unlike AAC and OGG, WMA was meant to address some of the flaws in the MP3
compression method---and it turns out that WMA's approach to compression is pretty similar
to AAC and OGG. Yes, in terms of objective compression quality, WMA is actually better
than MP3.
But since WMA is proprietary, not many devices and platforms support it. It also doesn't
offer any real benefits over AAC or OGG, so when MP3 isn't good enough, it's simply more
practical to go with one of those two instead of WMA.
The downside of lossless compression is that the files are bigger than lossy compressed
audio files, often 2x to 5x larger for the same source file.
Audio File Format: FLAC
FLAC stands for Free Lossless Audio Codec. A bit on the nose maybe, but it has quickly
become one of the most popular lossless formats available since its introduction in 2001.
What's nice is that FLAC can compress an original source file by up to 60 percent without
losing a single bit of data. What's even nicer is that FLAC is an open-source and
royalty-free audio file format, so it doesn't impose any intellectual property constraints.
FLAC is supported by most major programs and devices and is the main alternative to MP3
for music. With it, you basically get the full quality of raw uncompressed audio at half the file
size. That's why many see FLAC as the best audio format.
ALAC (Apple Lossless Audio Codec) is good, but it's slightly less efficient than FLAC when it
comes to compression. However, Apple users don't really have a choice between the two,
because iTunes and iOS both provide native support for ALAC and no support at all for
FLAC.
Compared to FLAC and ALAC, WMA Lossless is the worst in terms of compression
efficiency---but not by much. It's a proprietary format so it's no good for fans of open-source
software, but it's supported natively on both Windows and Mac systems.
The biggest issue with WMA Lossless is the limited hardware support. If you want to play
lossless compressed audio across multiple devices and platforms, you should stick with
FLAC.
● If you're capturing and editing raw audio, use an uncompressed format. This way
you're working with the truest quality of audio possible. When you're done, you can
export or convert to a compressed format.
● If you're listening to music and want faithful audio representation, use lossless audio
compression. This is why audiophiles always scramble for FLAC albums over MP3
albums. Note that you'll need a lot of storage space for these.
● If you're okay with "good enough" music quality, if your audio file doesn't have any
music, or if you need to conserve disk space, use lossy audio compression. Most
people actually can't hear the difference between lossy and lossless compression.
Digital audio broadcasting (DAB), also known as digital radio and high-definition radio, is
audio broadcasting in which analog audio is converted into a digital signal and transmitted
on an assigned channel in the AM or (more usually) FM frequency range. DAB is said to
offer compact disc (CD)- quality audio on the FM (frequency modulation) broadcast band
and to offer FM-quality audio on the AM (amplitude modulation) broadcast band. The
technology was first deployed in the United Kingdom in 1995, and has become common
throughout Europe.
Digital audio broadcast signals are transmitted in-band, on-channel (IBOC). Several stations
can be carried within the same frequency spectrum. Listeners must have a receiver
equipped to handle DAB signals. At the transmitting site, the signal is compressed using
MPEG algorithms and modulated using coded orthogonal frequency division multiplexing
(COFDM). A digital signal offers several advantages over conventional analog transmission,
including improved sound quality, reduced fading and multipath effects, enhanced immunity
to weather, noise, and other interference, and expansion of the listener base by increasing
the number of stations that can broadcast within a given frequency band.
A DAB receiver includes a small display that provides information about the audio content in
much the same way that the menu screen provides an overview of programs in digital
television (DTV). Some DAB stations provide up-to-the-minute news, sports, and weather
headlines or bulletins in a scrolled text format on the display. Using the DAB information, it
may also be possible to see what song is coming up next.
2. Data rate represents the speed of data transmission via a communication channel,
usually measured in bits/second. This measure is required due to the restriction of various
media such as the access speed to various storage, the capacity of a transmission channel
and the playback speed of a certain mechanical device.
3. Complexity means the amount of work required, and consequently the cost, of achieving
a certain compression or decompression task. The cost is not always proportional to the
amount of work, because of increasing computer power and developments in technology; in
the real world, the implementation cost is perhaps more important than anything else.
These measures may be quite subjective, but this is due to the nature of audio systems.
Audio compression methods rely heavily on the perceptual properties of the human auditory
system, and research in this area draws on multidisciplinary knowledge and techniques
much more heavily than most other areas. As in any other complex system design, an audio
compression system design in general aims at high fidelity with low data rates, while
keeping the complexity and delay as low as possible.
Two types of sound with distinct characteristics are speech and music. They are the most
commonly used types of sound in multimedia productions today. For example, if we plot the
amplitude over a period of time for both the sound generated by speech and by a piece of
music, we would find that the shape of the two waves is quite different.
The two types of audio also place different requirements on the media that carry them. For
example, the medium for telephone speech needs to handle signals with frequencies of
200–3400 Hz, and wideband audio covers 50–7000 Hz, while the medium for music
corresponds to CD-quality audio and needs to process signals with frequencies of
20–20,000 Hz.
Representations specific to speech and music have been developed to effectively suit their
unique characteristics. For example, speech may be represented in a sound model that is
based on the characteristics of the human vocal apparatus, but music can be represented
as instructions for playing on virtual instruments.
This leads to two big compression areas, namely speech compression and music
compression. They have been developed independently for some time. Conventionally,
voice compression aims at removing the silence and music compression at finding an
efficient way to reconstruct music to play to the end user. Today, almost every stage
between the source sound and the reconstructed sound involves a data compression
process of one type or another.
Both forms of joint stereo coding, intensity stereo and middle/side (MS) stereo, exploit
another perceptual property of the human auditory system. Psychoacoustic results show
that above about 2 kHz, and within each critical band, the human auditory system bases its
perception of stereo imaging more on the temporal envelope of the audio signal than on its
temporal fine structure.
In intensity stereo mode the encoder codes some upper-frequency subband outputs with a
single summed signal instead of sending independent left- and right-channel codes for each
of the 32 subband outputs. The intensity stereo decoder reconstructs the left and right
channels based only on a single summed signal and independent left- and right-channel
scale factors. With intensity stereo coding, the spectral shape of the left and right channels
is the same within each intensity-coded subband, but the magnitude differs.
The MS stereo mode encodes the left- and right-channel signals in certain frequency
ranges as middle (sum of left and right) and side (difference of left and right) channels. In
this mode, the encoder uses specially tuned threshold values to compress the side-channel
signal further.
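The middle/side idea itself is easy to sketch. The toy example below encodes a single left/right sample pair into sum and difference values and reconstructs them exactly; the real MS stereo mode works on frequency-domain values and applies further compression to the side channel, so this is only the core arithmetic.

```python
# Toy illustration of middle/side coding on a single sample pair.
def ms_encode(left, right):
    mid = (left + right) / 2     # "middle": the average of the two channels
    side = (left - right) / 2    # "side": the difference between the channels
    return mid, side

def ms_decode(mid, side):
    return mid + side, mid - side   # recovers (left, right) exactly

mid, side = ms_encode(0.75, 0.25)
print(mid, side)             # 0.5 0.25 -- for similar channels the side value is small
print(ms_decode(mid, side))  # (0.75, 0.25)
```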
Introduction
This topic offers a sample audio production process. The main activities associated with
producing an audio file are listed and explained. The stages include conception,
composition, arrangement, recording, editing, mixing and mastering.
4.1. Conception
Conception is the very first stage of the audio production process. All creative projects start
out with an idea or a burst of inspiration. This can be based on many things such as a
personal experience or certain life events.
This is the stage where you decide you want to create audio that makes the listener feel a
certain way or that carries a certain kind of message. From that initial spark, the audio
creation process begins as you start to imagine and develop a theme for how the audio
should sound and feel.
It can be happy music or sad audio; it can be aggressive audio or a chilled one. Either way,
the conception stage is where you allow your creative juices to roam freely and come up
with something that you like.
4.2. Composition
The next step is to materialize your idea or bring it to reality. For most audio producers, this
starts out with humming a certain tune, writing down lyrics, fleshing out chords on a guitar or
hitting piano notes to give a sort of melody to work with.
The initial idea doesn't have to be big, as long as you have some kind of loop or phrase that
forms the foundation of the audio you're going to create.
You can start to think about things like audio key, tempo, time-signature, chord progression,
strong motif, melody, countermelodies or harmonic layers.
Having a good basic understanding of audio theory goes a long way in helping you develop
your idea so that it sounds musically appealing and satisfying. The main pillars of audio
theory are:
● Rhythm
● Melody
● Harmony
Everything else you create will revolve around these 3 pillars of audio theory in order to
come up with audio that sounds amazing.
4.3. Arrangement
The arrangement stage is where you lay out the intro, verse, chorus, bridge and outro.
The goal of arrangement is to assemble musical ideas in a way that is easily absorbed and
understood, flowing from one phrase to the next with a strong sense of a start, then taking
the listener on a journey up to the finish.
If you're starting out, it's best to focus on your message and the harmony associated with it.
These are the two main points that matter in any audio, whether it be music, an audio
lesson or a speech rehearsal.
4.4. Recording
Most producers prefer to finish arranging the beat and then record vocals on top.
Recording is where you capture the vocals that will be used in the music. This can range
from recording demos, to give a rough idea of how the audio should sound, all the way up to
overdubs, where vocals are recorded over a track until they are perfect.
4. 5. Editing
This is the stage where you drag, drop, cut, trim, copy, paste, quantize and nudge the
various parts to fit together.
You'll need to make small adjustments such as adding and removing instruments,
automating levels, and lengthening or shortening certain sections. It is at this critical stage
that many musicians, engineers and producers get caught up in the small details.
Work quickly and keep things as simple as possible without making them overly complex.
4.6. Mixing
Mixing is where you get to adjust audio levels to give a desirable sound and effect. At this
stage is also where you get to tweak and clean up transitions and remove any unwanted
tracks which are less desirable in the overall mix.
Most importantly you get to use audio effects such as EQ, delay, reverb, compression and
distortion. However, mixing your tracks won’t compensate for bad vocals or poorly recorded
instruments. A good mix should have depth, width and height.
Mixing your music involves a lot of practice, listening skills and patience. Mixing is the stage
in which you use levels, EQ, panning and effects to create the stereo image and final sound
you want.
4.7. Mastering
Mastering is basically adjusting your mix for public use or distribution while still maintaining
maximum sound quality. Whether used in broadcast, downloaded, or streamed on a hi-fi
system, headphones, car system, or on any other playback system; it should give the
listener the most accurate interpretation.
There are various techniques and tools used to make sure the track is loud enough, has a
good stereo image and sounds professional. In mastering, the audio engineer is focusing on
the whole song or audio, EP, or album as a whole from start to finish.
4.8 Conclusion
The bottom line here is that any compromise in the production process is one that will leave
the overall product in a state that is less than ideal.
It’s best that you discover and develop your own music production process as you practice
your craft in order to become good and confident in creating your own music.
The stages look as though they are organised in a sequence from one to another. However,
in the real world, work does not always flow like that. Some stages may be omitted
depending on the nature of the audio being created, some may be repeated several times,
and some productions may follow the stages in a cyclical rather than a linear manner. All
this depends on the nature of the audio being produced and the circumstances under which
production is taking place. Some productions may even follow a process with more or fewer
stages than the ones mentioned here. All this should be taken into consideration.
Introduction
This chapter focuses on audio editing. In the current world of audio production there are
now several audio editors from which the producer can select, and in this chapter a sample
of some of the most common audio editors is presented. Important features of each sample
editor are listed, with advantages and disadvantages. The chapter goes on to list some of
the common features found within the most common editors. We also look at the common
file types the editors can deal with and explain the salient features of each file type or
codec.
5.1 Introduction
Audio editing is one of the most important stages in the audio production business. There
are several software products available in the industry for audio editing. Here are a few
common examples that you can use. For the purpose of demonstration and practice, we are
going to use Audacity, but as a student you can make your own choice of audio editor. What
is important is that you produce a professional product with the application you have chosen.
1. Adobe Audition
Adobe Audition is one of the most popular audio editing programs on the market. You can
customize the look to suit your workflow. The software has all the best tools for editing and
completing any audio project.
Adobe Audition is a complete multitrack voice recorder, mixer and digital audio clip editor for
Windows. When used with a Windows sound card, the program gives the user a complete
digital recording studio experience.
This is one of the few programs that allow multiple sources to be recorded simultaneously in
separate tracks. This facilitates post-production tasks such as editing and processing
effects.
It is sold as part of an Adobe Creative Cloud subscription, which is most useful if you also plan
on editing pictures or videos. Adobe offers discounts to students and teachers, as well as to
companies that require multiple licenses.
Sound Quality – Adobe Audition produces good-quality audio. It is not a GarageBand-style
music creation tool; it is primarily an audio editing package (so there is no MIDI support). But if
you can actually play an instrument, there’s no reason you can’t create multitrack music with it.
Features
The software has multi-track recording capabilities, excellent sound recovery tools that
deliver satisfactory results and great compatibility and file conversion capabilities. This is
one of the best audio editing programs available.
Adobe bills Audition as “designed to accelerate video workflows and complete audio”, and this
is truly its key market.
Advantages
– Provides a speech synthesis tool, which is rare in this type of software.
Disadvantage
2. Audacity
Audacity is a free audio editing software package compatible with Windows, Mac and Linux
computers, and it is widely considered the most popular free audio editor available.
The user interface is not pretty, but it makes it easy to find the editing tools, and its simple
layout can speed up your workflow. One-click repair and restore plugins are easy to use and
worked well in our tests.
The program struggled to run and crashed several times during testing, but this is not unusual
with free software. If you are new to editing podcasts or digitizing a vinyl record collection,
Audacity is a great place to start because it can teach you some of the basic editing features
for free.
Audacity offers a good selection of add-ons and editing effects, easily accessible from the
main menu bar. They might look as outdated as the user interface, but the controls are
simple to understand and most come with presets to help get you started.
The noise reduction tool worked well in our tests. After highlighting the part of the recording
that requires special attention, the plugin analyzes this section and automatically filters out
unwanted noise.
There are also sensitivity and reduction controls for adjusting the noise removal manually.
The preview feature lets you check changes before you make destructive edits.
Features
File compatibility is not as complete as in the paid programs we tested, but you can
import and export the most popular formats, such as MP3, WAV and AIFF.
You need to download a free LAME encoder file to export MP3s, but you can find that link
during the software download process. There are many useful keyboard shortcuts to
improve your workflow, and most tools and features are in the main menu bar.
Audacity is great software for novice podcast makers and vinyl lovers looking to digitize
their disc collection. The interface may seem tricky, but it’s easy to navigate and has all the
tools to edit and create a professional-quality project easily.
If you’re worried about the upfront cost associated with the paid programs, try Audacity first.
Audacity comes with a complete suite of built-in tools that let you edit pre-recorded files,
record audio through microphones, or even stream music and podcasts.
There is a wide range of audio formats for import and export, and the range of built-in
effects is impressive. With all its power and an amazing set of features, Audacity is the best
free sound editor you can download today.
Advantages
Disadvantages
– The interface seems outdated, and the design of some tools is inconsistent
3. Sound Forge Audio Studio
Sound Forge Audio Studio offers a handful of audio restoration tools, including the ability to
capture vinyl records directly into the program, provided you have a vinyl conversion deck.
This audio editing software also offers a handful of effects and presets that you can apply to
your audio recordings. It is a solid basic audio editor.
Features
Sound Forge Audio Studio comes with 11 audio effects supplied as DirectX plugins, and the
effects found in this audio editor include a large number of presets.
Between them, the included effects offer around 90 preset audio settings, allowing you to easily
find the sound you are looking for. You can also preview the effects before applying them to your sound.
One of the cool things about using Sound Forge is that the entire interface and toolbars they
contain are fully customizable.
In fact, this is one of the most customizable audio editing programs we’ve reviewed. You
can change the look and feel of the sound editing window. You can also edit the toolbars to
have the exact tools you want. Basically, you can create your own sound editing experience
in Sound Forge.
4. Ocenaudio
Ocenaudio is another powerful yet simple audio editor, and it is easier to manage than Audacity.
It is available on several platforms (Windows, Linux and Mac) and, although not packed with
features, it is a great tool for everyday audio editing.
Its real-time effect previews speed up your work, because you do not need to apply a change
just to hear it, and the extremely accurate selection tool makes it easy to apply the same effect
to multiple sections of a file.
You can use the software to work with files stored locally or even to open files stored online.
The clean editor interface quickly becomes a pleasure to use, and once you are familiar with
the keyboard shortcuts you will get through the usual tasks in no time. It’s definitely one of the
best audio editing apps available.
Features
Ocenaudio offers a good range of effects, with further add-ons available, and it is even
possible to export your creations as ringtones for your iPhone.
Advantage
– Clear interface
Disadvantage
– No effect stacking
5. Free Audio Editor
Unlike Audacity, this software will not help you create and master note-perfect recordings or
remove background noise, but that’s not what it’s designed for.
The Free Audio Editor interface is a simple set of icons, with no confusing menus or long lists
to scroll through.
The main attraction is a simple cutting tool, but Free Audio Editor also includes a useful
metadata editor for music files, as well as various export formats for saving a song in a
format suitable for your playback device of choice.
Features
Advantages
Disadvantages
Not all audio editing software packages share the same features, but most cover the same
core tasks of recording, editing and exporting audio.
Audio can be recorded in several different file formats; the most common ones are listed
below, with a short conversion example after the list:
● MP3
● AAC
● WMA
● WAV
● FLAC
● OGG
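Converting between these formats can be done inside any of the editors above, but it can also be scripted. The sketch below is a minimal example using the Python pydub library, which hands the actual encoding and decoding to ffmpeg; ffmpeg must be installed, and the file names are only examples.

    # Read a WAV file and write lossy (MP3) and lossless (FLAC) copies of it.
    from pydub import AudioSegment

    song = AudioSegment.from_file("take1.wav")
    song.export("take1.mp3", format="mp3", bitrate="192k")   # lossy, small file
    song.export("take1.flac", format="flac")                 # lossless, larger file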
Introduction
In this chapter, Audacity has been selected as the sample audio editing software for editing an
already existing audio file. A number of basic editing activities are performed, as per the manual
from Audacity. The editing includes importing, exporting, trimming, fading out and fading in, and
reducing noise, among other things.
6.1 Introduction
There are several editing features found in the Audacity audio editor, but in this module we are
going to focus on the most important and common ones. This is a general guideline. For detailed
processes and functions in Audacity you can visit the official site at
https://manual.audacityteam.org/#tutorials. The following is part of a tutorial extracted from the
official Audacity site. You can follow it as you do your own practice, or visit the site on your own.
If you have experience with audio editing you can even choose a different audio editor to perform
the tasks defined in the objectives of this tutorial.
The easiest way to use Audacity is to import an existing audio file and make small changes.
If you've never used Audacity before, this is a great place to start.
If you have just made a recording it is strongly recommended that you immediately
export your audio using File > Export > Export Audio... to WAV or AIFF (ideally to an
external drive) as a safety copy before you start editing the project.
Objective
The objective of this tutorial is to learn how to edit an audio file. To achieve this objective,
we are going to import an existing sound file, remove all but 10 seconds of this file, apply a
1-second fade-out at the end, export the results, and play it in your favorite audio player.
These steps will introduce the basic steps commonly used when editing the contents of an
audio file.
If you want to edit music that you have on an audio CD, you need to "rip" the music into an
audio file. See the Audio CDs page for information on getting the audio off of CDs and into
Audacity.
Don't have any audio files handy? There is lots of free music online! Here is one site where
you can download free music: Opsound
The recordings on this site are free, distributed under the Creative Commons
Attribution-Sharealike license, which gives you the right to create a derivative work without
paying royalties, as long as you give credit and make your derivative work free, too. This is
similar to the license for Audacity, which allows any developer to modify it and redistribute it
for free.
A quicker method is to just drag and drop the file as in the following examples:
Windows: Drag the audio file icon into the open Audacity window.
Mac: Drag the audio file icon to the Audacity icon in the Dock (does not work for all formats yet).
Linux: Drag the audio file icon into the open Audacity window.
● On Windows and Mac you can also drag to Audacity's icon in a file manager
application.
● On Mac and Linux you can drag the file to the Audacity icon in the Dock or Taskbar
respectively to import the file into Audacity.
○ On Windows, dragging the file to the Audacity icon in the Taskbar will either
switch the window to Audacity if it is running (from where you can drag the file
in), or if Audacity is closed, give the option to launch Audacity with the file
imported.
Command-line file importing: On all three platforms you can also import files by launching
Audacity at the command-line and passing the files you wish to import as arguments. For
example, on Linux Ubuntu:
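    audacity ~/Music/example-song.wav
(The file path above is only a placeholder; any audio files passed as arguments are imported when Audacity launches.)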
The image above shows a stereo waveform. The left channel is displayed in the top half of
the track and the right channel in the bottom half. The track name takes the name of the
imported audio file ("No Town" in this example). Where the waveform reaches closer to the
top and bottom of the track, the audio is louder (and vice versa).
The ruler above the waveform shows you the length of the audio in minutes and seconds.
You can use the Space key on the keyboard as a shortcut for Play or Stop.
Click on the Selection Tool, then click on the waveform to choose a place to start, then click
the Play button. Click and drag to create a selection; when you then click the Play button,
only the selection will play.
Keyboard use: You can select audio entirely using the left arrow, right arrow and other keys.
● Press Left or Right to move the cursor left or right respectively in the waveform.
● Hold down Shift while pressing Left or Right to create then extend a selection
leftwards or rightwards respectively.
● Hold down Shift and Ctrl while pressing Left or Right to contract an existing selection
leftwards or rightwards respectively.
Clicking the Skip to Start button or pressing the Home key will move the cursor to the
beginning of the track. It's kind of like rewind, but it's not for playback - it will only work when
playback is stopped.
Similarly, clicking the Skip to End button or pressing the End key will move the cursor to the
end of the track.
To jump the playback position forwards or backwards from where it is now, click on the
Timeline above the waveform at the point you wish to hear.
Keyboard use: You can use keyboard shortcuts to skip around the audio file while listening.
The amount the cursor moves in this situation is called the "seek time". The long and short
seek times (for example, one second and 15 seconds) can be set in the Seek Time When
Playing section of Playback Preferences.
The image above shows the Edit Toolbar with the Zoom buttons highlighted: the Zoom In
tool and the Zoom Out tool.
To zoom in to get a closer look at the waveform, first choose the Selection Tool , then click
near the point you're interested in, then click the Zoom In button. Keep clicking the Zoom In
button until you see the detail you need. Note that when you click the Zoom In button the
cursor is centered on the screen.
There are also menu commands and keyboard shortcuts for zooming. View > Zoom > Zoom
In (or Ctrl + 1) is the same as clicking the Zoom In button. View > Zoom > Zoom Out (or Ctrl
+ 3) is the same as clicking the Zoom Out button. View > Track Size > Fit to Width (or Ctrl +
F) will zoom the waveform so it fits in the window.
Use the Zoom commands so that you can make maximal use of your Audacity window to
see as much detail as you need, or to make sure you see the entire file when necessary.
These steps require a mouse, except for using Space on the keyboard to play the
selection and C to play either side of the selection. See below for how to create and
adjust selections using the keyboard.
1. With playback stopped, click near the point where you want the 10-second piece to
begin.
2. Zoom in until the Timeline shows 10 seconds or more before and after the cursor.
3. While holding down the Shift key, click 10 seconds to the right of the cursor.
○ Note that this is just like selecting a range of text in a word processor
4. Press Space to listen to the entire selection. Playback will stop when the end of the
selection is reached.
5. Adjust the start and end of the selection with the mouse as follows.
○ 5.1. Move the pointer over the start of the selection - the cursor will change to
a left-pointing hand.
○ 5.2. Click and drag to adjust the beginning of the selection.
○ 5.3. You can adjust the end of the selection in a similar manner.
6. Press Space to listen to the adjusted selection. You do not have to listen to all of it;
press Space again at any time to stop playback.
○ A convenient way to listen to only the adjusted start of the selection is to
move the mouse pointer a little after the start of the selection then press B.
The selection plays from the start of the selection to the pointer. To hear the
adjusted end of the selection, move the pointer close to the selection end,
then press B to play from the pointer to the selection end.
○ You can also play a length of audio either side of the selection by pressing C.
This lets you make sure there is no audio you want to keep that will be
removed. Playing either side of the selection would also be useful if you later
wanted to cut a small piece out of that selection - you would select the small
piece to be cut, then could preview how the audio would sound after the cut.
To adjust the length of audio played before and after the selection, go to Cut
Preview in the Playback Preferences.
Keyboard use: Use the arrow keys to adjust the selection start and end.
You've now selected the portion of the audio that you want to keep. Make sure you have
pressed Space to stop if the track is still playing, then to delete everything except the
selected audio, click on Edit > Remove Special > Trim Audio.
If you make a mistake, you can always click on Edit > Undo. Audacity has unlimited Undo
and Redo. You can undo your editing actions all the way back to when you imported the file.
You can also Redo actions that you have undone.
You now have a region of audio that starts several seconds (or perhaps minutes) from the
beginning of the track. You could move the audio to the beginning of the track, using Tracks
> Align Tracks > Start to Zero, but this is not a necessary step because when exporting,
Audacity will ignore the white space between time zero and the start of the audio.
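For comparison, the sketch below shows what "keep only a 10-second region" amounts to at the sample level, using the Python soundfile library. It is only an illustration of the idea, not how Audacity performs the trim; the file name and the 42-second start time are made-up examples.

    # Keep a 10-second region of an audio file, starting 42 seconds in.
    import soundfile as sf

    audio, sr = sf.read("no_town.wav")
    start = int(42.0 * sr)                    # example start time in samples
    clip = audio[start:start + 10 * sr]       # 10 seconds of samples
    sf.write("no_town_10s.wav", clip, sr)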
Step 6: Fade out the last second
● Click the Skip to End button .
● Zoom In until you can see the last two or three seconds of the waveform.
● Click in the waveform about 1 second before the end.
● Click on Select > Region > Cursor to Track End.
● Click on Effect > Fade Out. The last second of the audio is smoothly faded out.
Note that we always select some audio first, then choose what action we want to perform on
it.
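At the sample level, a fade-out is simply a gain ramp applied to the end of the selection. The sketch below applies a one-second linear ramp with Python's numpy and soundfile libraries; it only illustrates the idea and is not Audacity's own implementation, and the file names are made-up examples.

    # Apply a one-second linear fade-out to the end of a clip.
    import numpy as np
    import soundfile as sf

    clip, sr = sf.read("no_town_10s.wav")
    fade_len = sr                              # one second of samples
    ramp = np.linspace(1.0, 0.0, fade_len)     # gain falls from 1 to 0
    if clip.ndim == 2:                         # stereo: ramp both channels
        ramp = ramp[:, None]
    clip[-fade_len:] = clip[-fade_len:] * ramp
    sf.write("no_town_10s_fadeout.wav", clip, sr)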
Before we export this 10 second clip to a separate file we're going to simplify things a bit.
Go to the Import / Export Preferences, and under When exporting tracks to an audio file
uncheck "Show Metadata Editor prior to export step". Metadata Editor adds extra
information about the speech or music into the file - see For More Information below to learn
more. You can go back to the Import / Export Preferences at any time to re-enable Metadata
Editor.
● Click on File > Export > Export Audio...
● In the Save dialog, from the "Format" menu, choose "MP3 files".
● Then click the Options button to set the bit rate and other options for the MP3 file.
You cannot open an Audacity project in a media player. Only by exporting your project can
you listen to it in a media player.
Once you've exported your project you may want to keep the original project file (AUP) and
its associated _data folder around in case you want to make some changes to it in the
future.
Conclusion
The above tutorial is just one of the many tutorials that you can find on the official Audacity site,
which also covers the many other functions that Audacity can perform.
This implies that by learning to use Audacity, you will have become familiar with most of the
functions of audio editing software. Audacity is typically used to put the final touches on audio
that has already been recorded, but you can also use it to record new audio.