A better way of creating .wav files given note data

1 | 2 | 3 | 4 | 5 | 6 | Source Code

Today’s post in my little saga of learning audio programming features a program, quad.c. It creates a .wav file given a file of quadruples of the form (frequency in Hertz, duration in seconds, decay factor, amplitude). This approach has several advantages over the previous one, which was based on concatenating the text files produced by tfork. First, only a single intermediate text file containing waveform sample data is created. In the first version, one file was created for each note, plus one for the concatenation of all of them. The concatenated file can be quite large: it holds 44,100 numbers for each second of audio, one for each sample of the waveform. With quad, the file size is of course the same as that of the concatenated file. Second, the sampled waveform has continuously varying phase, whereas with the concatenated files produced by tfork the phase starts again at zero at the beginning of each note. Third, and more importantly, the volume of the individual notes can be controlled by setting the amplitude in foo.quad. You will notice the increase in volume from note to note when you play the file foo.wav.

In the next post, quad will be incorporated in solfa2sf.

Example. Here is a three-line file of quadruples that represents the notes A E A’, where A’ is an octave above A = 220 Hertz:

File: foo.quad
220 1.0 0.5 0.2
330 1.0 0.5 0.5
440 1.0 0.5 1.0

The comments in quad.c give more details, but suffice it to say here that running the following commands plays the sound represented by foo.quad:

./quad foo.quad foo.samp
text2sf foo.samp foo.wav 44100 1 .90
rm foo.samp
play foo.wav


Note the increase in volume from note to note.
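The idea behind quad can be sketched in a few lines of Python. This is not the code in quad.c, only a minimal illustration under stated assumptions: I take the decay factor to be the time constant of an exponential envelope, and the names parse_quads and render are made up for this example. The thing to look at is the running phase variable, which is carried from one note to the next instead of being reset to zero.

```python
import math

SAMPLE_RATE = 44100


def parse_quads(text):
    """Parse lines of 'frequency duration decay amplitude' into tuples of floats."""
    quads = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        quads.append(tuple(float(field) for field in fields))
    return quads


def render(quads, sample_rate=SAMPLE_RATE):
    """Return one list of samples for all notes, keeping the phase continuous."""
    samples = []
    phase = 0.0  # never reset between notes
    for freq, duration, decay, amplitude in quads:
        count = int(round(duration * sample_rate))
        step = 2.0 * math.pi * freq / sample_rate
        for i in range(count):
            # Assumption: the envelope is amplitude * exp(-t / decay).
            envelope = amplitude * math.exp(-(i / sample_rate) / decay)
            samples.append(envelope * math.sin(phase))
            phase += step
    return samples


FOO_QUAD = """220 1.0 0.5 0.2
330 1.0 0.5 0.5
440 1.0 0.5 1.0"""

samples = render(parse_quads(FOO_QUAD))
print(len(samples), "samples")
```

Writing the samples one per line would give a file like foo.samp, ready for text2sf.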


In version two of solfa2sf, we can enter rhythm as well as pitch values, and we can control tempo and note decay


Source Code for solfa2sf

The first version of solfa2sf took a string of solfa syllables and produced a .wav sound file representing it. This was good as a start, but we want to do more! In particular, we would like to be able to give different rhythm values to the notes. I’ve posted an improved version of solfa2sf that does just this. One can use the program on the command line like this:

solfa2sf foo '| q do2 . e mi2 do2 . q sol sol |'

The result is a file foo.wav:


It represents a melody that starts with a quarter note C above middle C, continues with a pair of eighth notes, and ends with a pair of quarter notes.

To work with longer bits of music, it is best to write the solfa in a text editor, then compute the sound file like this:

solfa2sf -f theme.solfa

The file theme.solfa contains the following text:

| q do2 . e mi2 do2 . q sol sol |
| do2 . e mi2 do2 . q sol sol2 |
| e fa2 mi2 re2 do2 . ti do2 ti do2 |
| re2 do2 ti la . q sol x |

The symbols || . | have no effect on the program and could be omitted. They do help the human who reads or composes this text, however. The possible rhythm values at this moment are w (whole note), h (half note), q (quarter note), e (eighth note), and s (sixteenth note). A rhythm value applies to all the notes that follow it until the next rhythm value appears. The command tempo:120 sets the tempo, and decay:0.1 determines how fast a note trails off into silence: increase it to make the sound more sustained, decrease it to make it more percussive.
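Here is a rough sketch of how such a string might be turned into a list of notes with durations. It is not the parser in solfa2sf; the function name parse_solfa, the default tempo and decay, and the convention that the tempo counts quarter notes per minute are all my assumptions for the sake of illustration. Tokens that are not recognized, such as the x in the last line of theme.solfa, are simply passed through as if they were syllables.

```python
# Beats per rhythm symbol, counting a quarter note as one beat (an assumption).
BEATS = {"w": 4.0, "h": 2.0, "q": 1.0, "e": 0.5, "s": 0.25}

# Symbols that only help the human reader.
IGNORED = {"|", "||", "."}


def parse_solfa(text, tempo=60.0, decay=0.5):
    """Return (notes, tempo, decay); notes is a list of (syllable, seconds)."""
    beats = BEATS["q"]  # hypothetical default: quarter notes
    notes = []
    for token in text.split():
        if token in IGNORED:
            continue
        if token.startswith("tempo:"):
            tempo = float(token.split(":", 1)[1])
        elif token.startswith("decay:"):
            decay = float(token.split(":", 1)[1])
        elif token in BEATS:
            beats = BEATS[token]  # applies to every following note
        else:
            notes.append((token, beats * 60.0 / tempo))
    return notes, tempo, decay


notes, tempo, decay = parse_solfa(
    "tempo:120 decay:0.1 | q do2 . e mi2 do2 . q sol sol |"
)
for syllable, seconds in notes:
    print(syllable, seconds)
```

At a tempo of 120 a quarter note lasts half a second and an eighth note a quarter of a second.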

Below is the sound file. Click the link to play it. Do you recognize the piece? I studied it as a child back in Iceland.



We combine C and Python programming to transform a sequence of solfege syllables, e.g., do re mi re do, into a sound file.


In our last post we used tfork and a few unix commands to construct a sound file for an A major chord, A C# E A’. The idea was (1) to make text files representing the sounds of the individual notes using tfork, (2) glue the text files end-to-end using cat, (3) convert the resulting big text file into a .wav file using text2sf.

This works fine for short “melodies,” but it soon gets tedious — and out of hand. A better way is to write a short program that does all this for you, given a text string of solfa syllables like “do re mi re do”. This is what we do in the Python program solfa2sf. Below is the sound produced by

./solfa2sf -w foo 0.3 0.1 do re mi re do


SOURCE CODE for solfa2sf

In running solfa2sf, the arguments are as follows: (1) an option: -d for “dry run”, i.e., no output; -v for an output file with verbose messages; -w for wet, the opposite of dry, which produces the output .wav file with few messages; (2) the file name: foo results in an output file foo.wav; (3) the note duration in seconds; (4) the decay time; (5) the solfa syllables.

If you use a small decay time, the sound is percussive, like a marimba, or even a drum. If you use a large one, it is more like an organ. Try decay times of 0.01, 0.1, and 1.0 to see what the effect is.

Python is very good with strings, lists, and dictionaries, which is what we need to parse and handle an input string like “do re mi re do”. C is the best tool for fast computation. So we use them together! As you can see from the source code, we call on the C program tfork using the Python command os.system. Like a carpenter, we use saw, chisel, hammer, etc. as needed for the task at hand.
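To give the flavor of this division of labor, here is a small sketch that turns a list of syllables into the shell commands one would hand to os.system. It is not the source of solfa2sf. The tuning (equal temperament, with do taken as middle C and la as A 440), the intermediate file names, and the function names frequency and build_commands are assumptions of mine. The sketch only builds and prints the command strings; it does not run them.

```python
A4 = 440.0  # la above middle C, in Hertz

# Semitones above do for each syllable of the major scale.
SEMITONES = {"do": 0, "re": 2, "mi": 4, "fa": 5, "sol": 7, "la": 9, "ti": 11}


def frequency(syllable):
    """Frequency in Hertz, assuming equal temperament with do = middle C."""
    name = syllable.rstrip("0123456789")
    digits = syllable[len(name):]
    octave = int(digits) - 1 if digits else 0  # do2 is one octave above do
    semitones_from_a4 = SEMITONES[name] - 9 + 12 * octave
    return A4 * 2.0 ** (semitones_from_a4 / 12.0)


def build_commands(basename, duration, decay, syllables, sample_rate=44100):
    """Return the tfork, cat, and text2sf command lines as strings."""
    commands = []
    note_files = []
    for index, syllable in enumerate(syllables):
        note_file = f"{basename}-{index}.txt"  # hypothetical naming scheme
        note_files.append(note_file)
        commands.append(
            f"./tfork {note_file} {duration} {frequency(syllable):.2f} "
            f"{sample_rate} {decay}"
        )
    commands.append(f"cat {' '.join(note_files)} > {basename}.txt")
    commands.append(f"text2sf {basename}.txt {basename}.wav {sample_rate} 1 1.0")
    return commands


commands = build_commands("foo", 0.3, 0.1, "do re mi re do".split())
for command in commands:
    print(command)  # each of these could be passed to os.system
```

Python does the bookkeeping with strings and dictionaries, and the number crunching is left to the compiled C programs.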


We combine the power of tfork and text2sf with the magic of unix to construct a sound file for an A major chord.


A Chord

In Audio Programming 1, we compiled the tfork.c program in chapter 1 of the Audio Programming Book, Richard Boulanger and Victor Lazzarini, editors. This program, together with text2sf, gave us the tools needed to algorithmically construct a simple sound, namely an exponentially decaying sine wave. We used these tools to fabricate a file that represents the sound of a tuning fork.

Simple as these two tools are, they give us the means to construct more complicated sounds without any additional C programming (fun though that is!). We will use the magic of unix. To begin, a short shell script:

# Language: unix/sh
# File: sound.sh
# Example: sh sound.sh 220 a
# writes a text representation of a 0.2 second 220 Hertz sound to the file a.txt
#
# $1 = frequency in Hertz, $2 = base name of the output file

./tfork "$2.txt" 0.2 "$1" 44100 0.2

Using sound.sh, we make four sounds, each 0.2 seconds in duration, with frequencies of 220, 275, 330, and 440 Hertz. These correspond to the notes A, C#, E, and A’ = A one octave higher. The frequency ratios are C#/A = 5/4, E/A = 3/2, and A’/A = 2. Thus we are using just intonation, in which pitch ratios in the scale are rational numbers with small numerator and denominator. (In Pythagorean tuning, by contrast, the major third C#/A would be 81/64.)
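The arithmetic is easy to check. The little sketch below multiplies 220 Hertz by each ratio and, purely for comparison, also prints the equal-tempered frequency of the same note. Equal temperament is not used anywhere in this post; it is there only to show how close the small-ratio values come.

```python
from fractions import Fraction

A = 220  # Hertz

# Frequency ratios relative to A, as given in the text.
RATIOS = {
    "A": Fraction(1),
    "C#": Fraction(5, 4),
    "E": Fraction(3, 2),
    "A'": Fraction(2),
}

# Semitones above A, used only for the equal-tempered comparison.
SEMITONES = {"A": 0, "C#": 4, "E": 7, "A'": 12}

just = {name: float(A * ratio) for name, ratio in RATIOS.items()}
equal = {name: A * 2 ** (steps / 12) for name, steps in SEMITONES.items()}

for name in RATIOS:
    print(f"{name:3s} ratio {RATIOS[name]}  "
          f"small-ratio {just[name]:7.2f} Hz  equal-tempered {equal[name]:7.2f} Hz")
```

The largest difference is for C#, where the small-ratio third is about 2 Hertz flatter than the equal-tempered one.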

Let us now execute the following commands:

% sh sound.sh 220 a
% sh sound.sh 275 c#
% sh sound.sh 330 e
% sh sound.sh 440 a2

The result is the creation of text files a.txt, c#.txt, etc., which represent the given sounds. Next, we concatenate these files, putting a.txt first, c#.txt next, etc:

% cat a.txt c#.txt e.txt a2.txt >chord.txt

Then, we convert chord.txt into a .wav file:

% text2sf chord.txt chord.wav 44100 1 1.0

To conclude, we play the file:

play chord.wav

Here is the sound:


Clearly there is more to do, including: (1) clean up this sound, which needs to fade cleanly into silence; (2) develop a mini language for transforming a sequence of pitch names into a sound file; (3) make more complex sounds.


Ever since I blogged about the eruption of the volcano under the Eyjafjallajokull glacier, my American friends have teased me about the mouth-twisting words in my country’s old and so-beautiful language. As a public service, I offer them this pronunciation link. See the second audio player on this NPR web page.

Eyjafjallajokull: Post 3 | Post 2 | Post 1


A 440

We will use the magic of unix to make a tuning fork


Things are a little slow at the shop these days, the aftermath of getting our big software development project done. With some time to spare, I’ve been reading The Audio Programming Book, edited by Richard Boulanger and Victor Lazzarini. As a substitute for note-taking as I work my way through this excellent text, I’ve decided to blog about it.

The first task, from Chapter 1, around page 162, is to write a playable file for the sound of a tuning fork. The end result sounds like what you will hear if you click this link:


Wasn’t that nice? You did play it, didn’t you? Here is how we did it. Proceed only if you speak C and enjoy the power and elegance of Unix:

Step 1. Compile the program tfork.c using gcc tfork.c -o tfork. You will find it on the CD in chapters/01..., or here. Then execute this command:

./tfork tfork.txt 1.0 440.0 44100 0.2

The result is a 44,100-line text file, tfork.txt, whose first three lines are as follows:


It looks like this:

Tuning fork sound: A 440, exponential decay.

The file represents a sine wave at 440 Hertz that decays exponentially to a small fraction of the starting amplitude after 1.0 seconds. The sample rate for the sound is 44100 Hertz. The tfork command is used this way:

./tfork outfile duration frequency sample_rate decay_constant
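For readers who would rather see the idea than the C, here is a Python sketch of the same kind of computation. It is not a translation of tfork.c. In particular, I assume that the decay constant is the time in seconds for the amplitude to fall to 1/e of its starting value, and the names tuning_fork and to_text are made up for this example.

```python
import math


def tuning_fork(duration, frequency, sample_rate, decay):
    """Return the samples of exp(-t / decay) * sin(2 * pi * frequency * t)."""
    count = int(duration * sample_rate)
    samples = []
    for i in range(count):
        t = i / sample_rate
        envelope = math.exp(-t / decay)  # assumed form of the decay
        samples.append(envelope * math.sin(2.0 * math.pi * frequency * t))
    return samples


def to_text(samples):
    """Format the samples one per line, as in a text file of sample data."""
    return "".join(f"{sample:.8f}\n" for sample in samples)


samples = tuning_fork(1.0, 440.0, 44100, 0.2)
text = to_text(samples)
print(len(samples), "samples")
print("envelope after one second:", math.exp(-1.0 / 0.2))
```

With a decay constant of 0.2 the envelope is down to well under one percent of its starting value after one second, which is the small fraction mentioned above.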

Step 2. Compile the code in text2sf.c and install the binary somewhere in your search path. Execute the command

text2sf tfork.txt tfork.wav 44100 1 1.0

You now have a one-second CD-quality “recording” of a tuning fork! It is in the file tfork.wav. You can play it using whatever means suits you and your computer best. On a Mac, the command

play tfork.wav

works fine.
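If you are curious about what happens on the way from a list of numbers to a .wav file, the following Python sketch does something similar with the standard wave module. It is not how text2sf is implemented, and it makes assumptions: samples lie between -1 and 1, the last argument is treated as a plain gain factor, and the output is 16-bit. The function name text_to_wav is hypothetical.

```python
import io
import struct
import wave


def text_to_wav(text, destination, sample_rate=44100, channels=1, gain=1.0):
    """Write whitespace-separated sample values as a 16-bit WAV file.

    destination may be a file name or a seekable binary file object.
    Returns the number of frames written.
    """
    values = [float(token) for token in text.split()]
    frames = bytearray()
    for value in values:
        # Assumption: gain is a simple multiplier; clip anything out of range.
        clipped = max(-1.0, min(1.0, value * gain))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(destination, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # two bytes per sample = 16 bits
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return len(values) // channels


# Write four samples to an in-memory buffer instead of a real file.
buffer = io.BytesIO()
frame_count = text_to_wav("0.0\n0.5\n-0.5\n1.0\n", buffer)
print(frame_count, "frames,", len(buffer.getvalue()), "bytes")
```

Passing a file name such as "tfork.wav" instead of the buffer would write the sound to disk.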

Isn’t it interesting that a sound can be represented so many different ways? As a list of numbers, that is, a text file. As an image. As the digital sound file that the computer can directly play. As vibrations in the air that tickle the hair cells in our inner ear. As a memory …

There must be some deep philosophical meaning in all this.


PS. Related links: Barry Threw: Art and Technology

The internet I want

Yesterday I went to see a friend, a retired history professor, who is still active in publishing scholarly articles. She wanted to show me a short film clip on YouTube. It was not more than a minute and a half long, yet we waited six minutes for it to download. Enough time to finish a cup of coffee and taste the excellent cookies that she had prepared. And enough time to reflect on the connection at our office, in the heart of Cambridge, Massachusetts. It is better, but it can take hours to download a four-gigabyte software install package. Amazing! And here comes the kicker:

   High-speed internet in Hong Kong

The internet connection I have

The company referred to in the article is offering 1000 megabit per second service for $26 per month! According to my computations, that means that I could download a 7.5 gigabyte file in one minute. Now that would make a difference in our productivity at the shop. Or think of my friends who work with video and have to transfer large files.

A comparison in the article: Verizon offers a “high-speed” connection of 20 megabits per second for $144.95 per month. Whoa!! More than five times as much money for a connection that is 50 times slower!
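For anyone who wants to check my computations, here they are as a few lines of Python. The only assumption is the use of decimal units: one gigabyte is 1000 megabytes and one byte is 8 bits. The function name download_seconds is just for this example.

```python
def download_seconds(size_gigabytes, speed_megabits_per_second):
    """Time to download a file, ignoring overhead and using decimal units."""
    megabits = size_gigabytes * 1000 * 8  # 1 GB = 1000 MB, 1 byte = 8 bits
    return megabits / speed_megabits_per_second


# Figures quoted from the article.
hong_kong = {"speed": 1000, "price": 26.00}
verizon = {"speed": 20, "price": 144.95}

seconds = download_seconds(7.5, hong_kong["speed"])
speed_ratio = hong_kong["speed"] / verizon["speed"]
price_ratio = verizon["price"] / hong_kong["price"]

print("7.5 GB at 1000 Mbit/s:", seconds, "seconds")
print("speed ratio:", speed_ratio)
print("price ratio:", round(price_ratio, 2))
```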

If this were April Fool’s Day, I could relax. This is just a joke. But it is the beginning of March. Perhaps I should move still further west, to Hong Kong. I hear that the food is excellent!!



A Different Way of Doing Things.

A friend sent me this link about a prison in Norway. I need to ponder this one. But here is an interesting statistic, taken from the article:

Only 16 percent of the prisoners in this island jail become repeat offenders in the first two years after leaving Bastøy as compared with 20 percent for Norway as a whole. In Germany, where recidivism is measured after three years, the rate is 50 percent.

Alas, Germany’s rate is not the highest:

Within three years of their release, 67% of former prisoners are rearrested and 52% are re-incarcerated, a recidivism rate that calls into question the effectiveness of America’s corrections system, which costs taxpayers $60 billion a year. Violence, overcrowding, poor medical and mental health care, and numerous other failings plague America’s 5,000 prisons and jails. source

Extreme cases are good food for thought, and the case of this Norwegian prison is certainly extreme! When I’ve mulled this over, I will post a few more comments. But I am on break and have to finish debugging a program before night falls.


Classical Jam Session

When I was recovering from a serious coding binge back in Iceland, long walks along the seashore and playing the piano were the two things that saved me. Since then, I’ve tried to lead a more balanced life: no coding after sundown, a mix of activities — reading, walking, seeing friends, hanging out at the neighborhood bar, cooking, playing music. The point is to do several things, not just one — one terrible thing that swallows up both the day and the night, demanding your full attention for five, ten, fifteen, twenty hours at a stretch until you finally lie exhausted, shipwrecked in the dawn, clinging to the foot of the bed as if it were a lifeboat, the mess of dishes and books piled high, crowding the sacred space before the altar of the computer.

I’ve tried to keep this healthy routine, part of which is to meet with friends every Thursday to play music. We are so-so amateur musicians, but we have a lot of fun, and also a new “activity:” each of us brings an original composition to play, either individually, or as a group. Well, at the beginning we were pretty bad, but we have learned a lot, and we have had a lot of good times. One of the house rules is that after a piece is played, we all have to improvise on it. This way we all share in the embarrassment, the good musical moments, as well as the beer! I promise to post something of my own soon, since the last person to do so has to buy food and drink for the whole group. I am, however, taking the liberty of linking to a piece by one of the other players:

short piece for solo cello

There is more … this is just the first line of the piece:-)


I found the following link on a friend’s Facebook page:

Where women of India rule the roost and men demand gender equality.

I highly recommend it: as field anthropology, as reportage from the domestic scene, and as food for thought. The thought for which this is food concerns the enduring Nature versus Nurture controversy. In particular, are the roles of men and women biologically determined, or are they socially conditioned? It is apparent to even the most politically correct that certain differences are biological. Women bear children, while men don’t. Men have, on average, much more muscle mass. But take another area. While men dominate professions such as violent crime and the design of video games, racing cars, and thermonuclear weapons, it has not been conclusively proved that this must forever be so. Do men do this because of the genetic codes they carry in their cells, or were they schooled to do so, however implicitly, at an early age? The further we move away from the obvious cases, the less clear the issue becomes. And confusion may set in at surprisingly close range, as shown by the article on the Khasi people referred to above. The unequal and subservient situation of Khasi men is so dire that one of them writes

Only mothers or mother-in-laws look after the children. Men are not even entitled to take part in family gatherings. The husband is up against a whole clan of people: his wife, his mother-in-law and his children. So all he can do is play the guitar, sing, take to drink and die young.

As Newton taught us, for every action there is an equal and opposite reaction. While the “equal” part probably does not apply in the human sphere, the general idea seems correct. Indeed, men of the Khasi people have organized themselves into the aptly named “Syngkhong Rympei Thymai” (SRT), a liberation movement whose goal is full equality of the sexes. Nature or Nurture? I don’t know.