JC and I have added mtalk to our sf2a — solfege to audio — distribution. Once installed, you can do this:

% mtalk 'Please get me a quart of milk at \
   the store.  Thanks.' -p -t tempo:144

The option -p means “play the audio file immediately,” and -t tempo:144 sets the tempo. Here is what the result sounds like:

Please get me a quart of milk at the store. Thanks.

This little programming project (we have to get our fun somehow :-) was inspired by James Gleick’s book, The Information. See the previous post.



A Poem

A friend sent me this link to a poem by Gjertrud Schnackenberg. It is lovely, musical, strong. I will quote just a few lines. It is from The Light Gray Soil:

My fingers touch
A penny, long forgotten in my coat,
Forgotten in the shock, December eighth,
Midnight emergency, a penny swept
Together with belongings from his coat
Into a sack of “Personal Effects,”
Then locked away, then given to the “Spouse.”
Nearly relinquished, nearly overlooked.
Surely the last he touched, now briefly mine.
A token of our parting, blindly kept.
Alloy of zinc, the copper thinly clad,

Many changes: (1) a new name, sf2a; (2) the source code is at github; (3) at github, click on the download button if you wish to download; (4) to install on a Mac, untar or unzip the downloaded file, cd to the resulting folder, and run sudo sh setup.sh -install YOUR_USER_NAME; (5) after installing, run sf2a 'do re mi'. A file out.wav should be created; this is the audio file. (6) There is a musical dictation program, dict, that creates audio files and a web page for dictation exercises based on the data in a text file. Use the file dictation.txt in the install folder as an example: just run dict -m in that folder, then open the web page index.html. (7) For a draft manual, see this web page.

All this works on a Mac. Adapting it to Linux is easy: just change the values of $INSTALL_DIR and $BIN_DIR in setup.sh. I don’t know enough about PCs to advise on them, but one ought to be able to modify setup.sh there as well.

Once upon a time there were two friends, a cricket and a millipede. The cricket admired the way his friend, Illacme Plenipes, could move forward with such grace, with such coordination of her 700 legs. And the millipede admired the singing of her friend, Gryllus Rubens. One day Gryllus asked Illacme, “How do you do it? Which leg do you move first? How do you keep the rhythm?” Illacme had never thought about these things. But it seemed like an interesting question. Motionless, she thought for a while, and then said, “Well, I think I do it like this …”. And there she remained in place, unable to move a single leg, or even the tip of her antennae. Night fell. It was not until Gryllus began to sing that the spell was broken and Illacme, the millipede, was able once again to creep along the damp earth.

An extract from my unpublished manuscript, “Lessons and Parables”

Song of Gryllus rubens


sf2sound now supports up to ten independent voices. Here is a two-voice example:

voice:1 decay 2.0
h mi q re ti_ h do

h do_ q sol_ sol__ h do_

See github for the source code, including the 10-channel mixer in mix.c.

Talking Drums

Two days ago I started reading James Gleick’s new book, The Information. I don’t know what the critics think, but it has met my two most important criteria: it held my attention; I learned something new. And a third criterion, good prose style — efficient if not poetic — is also satisfied. So here is today’s little gem from the book: Talking Drums. The first, brief report of these to the European public came from Francis Moore, who navigated the Gambia River in 1730 on a reconnaissance mission for the slave trade. A century later, Captain William Allen, on an expedition up the Niger River, noticed more. Speaking of his Cameroon pilot, he wrote:

Suddenly he became totally abstracted, and remained for a while in the attitude of listening. On being taxed with inattention, he said: “You no hear my son speak?” As we had heard no voice, he was asked how he knew it. He said, “Drum speak me, tell me come up deck.” This seemed to be very singular.

Singular it was indeed! It was not until the publication in 1949 of The Talking Drums of Africa, by the missionary John Carrington, that non-Africans understood and deciphered the drummers’ code. Carrington realized that drummers could communicate quite complex information — “birth announcements, warnings, prayers, even jokes” — over long distances through a specialized tonal language. It was a language that was nearing extinction just as its secret was uncovered.

I don’t want to take away your reading pleasure, so I will leave you with these: (1) the drummers had developed an amazingly sophisticated system of disambiguation and error correction that allowed them to communicate complex sentences using only two tones; (2) a man from the Lokele village, where Carrington lived for many years, had this to say about him:

He is not really European, despite the color of his skin. He used to be from our village, one of us. After he died, the spirits made a mistake and sent him off to a far away village of whites to enter into the body of a little baby who was born of a white woman instead of one of ours. But because he belongs to us, he could not forget where he came from, and so he came back.” The man added, “If he is a little bit awkward on the drums, this is because of the poor education that the whites gave him.

Postscriptum. As a fun little programming exercise, JC and I worked up something to transform text to an imaginary musical language. Below is the text and the “music.”

Hey man, look at the sun!
Hey man, it keep us warm!
It grow our food,
It keep us warm.
Hey man, look at the sun,
And feel it be warm on your face!!

Hey man, look at the sun!

For the source code for converting text to music, if you are interested in such things, see our github repository. It is part of our sf2sound project. The most relevant files are talk.py and talk.sh.

NOTES. The transformation of text to “music” effected by talk.py encodes vowels as quarter notes — a = do, e = re, i = mi, o = fa, u = sol. Consonants are encoded as a pair of eighth notes, e.g., p = do re, b = re do. All the plosives are encoded as a major second, unvoiced ones rising, the voiced ones falling. In general, members of a phonetic group — fricatives, liquids, etc. — share some musical feature, e.g. the same interval. Spaces and punctuation marks are coded by a short melodic fragment. See the code for talk.py for further information.
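To make the encoding concrete, here is a minimal Python sketch of the scheme just described. It is a simplification, not the actual talk.py; in particular the eighth-note token e and the little fragment used for spaces are my assumptions.

```python
# Sketch of the text-to-"music" encoding described above.
# Not the real talk.py: the "e" (eighth) rhythm token and the
# space fragment are assumed for illustration.

VOWELS = {"a": "do", "e": "re", "i": "mi", "o": "fa", "u": "sol"}

# Plosives as pairs of eighth notes: unvoiced rising, voiced falling.
PLOSIVES = {"p": ("do", "re"), "b": ("re", "do")}

def encode(text):
    tokens = []
    for ch in text.lower():
        if ch in VOWELS:
            tokens += ["q", VOWELS[ch]]        # vowels become quarter notes
        elif ch in PLOSIVES:
            tokens += ["e", *PLOSIVES[ch]]     # plosives: a pair of eighths
        elif ch == " ":
            tokens += ["e", "do", "mi", "do"]  # placeholder melodic fragment
    return " ".join(tokens)
```

With these tables, encode("pa") yields the token stream "e do re q do", which could then be fed to the synthesizer.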

I’ve set up a git repository for sf2sound. There is a two-voice example there, and JC has written some commentary on it, as well as posted the audio files.

The superposition principle makes this all quite easy — we just add together the waveform files that sf2sound produces for the two voices. That file represents the combined voices, and we use text2sf to produce the audio file. For more voices, simply add more waveform files!
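In code, the superposition step can be sketched in a few lines. This assumes the waveform files are plain text with one floating-point sample per line (the format text2sf consumes); the function name and the choice to pad shorter voices with silence are my own.

```python
# Sketch of the superposition step: sum several mono waveform files
# sample by sample. Assumes plain-text files, one float per line;
# shorter voices are padded with silence (0.0).

def mix(paths, out_path):
    voices = []
    for p in paths:
        with open(p) as f:
            voices.append([float(line) for line in f])
    length = max(len(v) for v in voices)
    with open(out_path, "w") as out:
        for i in range(length):
            s = sum(v[i] if i < len(v) else 0.0 for v in voices)
            out.write("%g\n" % s)
```

The resulting file is the combined waveform, ready for text2sf.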

Our work so far is quite primitive, both musically and as a software product. But we are having a lot of fun, and learning a lot. Eventually we hope something polished and elegant will come out of this.


Waveform shaping


The above is a visual representation of the opening measures of Muzio Clementi’s sonatina, Op. 36, No. 1, as rendered by sf2sound — a kind of command-line synthesizer that takes a stream of solfa symbols as input. This is homework for an eventual ear-training program. In any case, here is the audio file:


One of the challenges has been in shaping the waveforms of the individual notes so that they fit together without making annoying pops. A simple exponential decay did not work, although that is good for making the sound more or less percussive. What I discovered is that (at a minimum), one has to shape the attack and release of the note. For the moment I have done this by shaping the wave form with a simple quadratic function.
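Here is a small sketch of the idea, with a quadratic ramp on both the attack and the release. The real shaping lives in quad2samp; this function and its sample counts are illustrative only.

```python
# Quadratic attack/release shaping: the amplitude envelope rises like
# t^2 over the attack, holds at 1.0, and falls like t^2 over the
# release, so each note starts and ends near zero amplitude.

def envelope(i, n, attack, release):
    """Amplitude multiplier for sample i of an n-sample note."""
    if i < attack:
        t = i / attack
        return t * t                 # quadratic fade-in
    if i >= n - release:
        t = (n - i) / release
        return t * t                 # quadratic fade-out
    return 1.0                       # sustain
```

Multiplying each sample by envelope(i, n, attack, release) tapers the note at both ends, which is what lets adjacent notes join without pops.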

I’ve been experimenting with various settings and algorithms in order to get a better or more interesting sound. The sound file you hear is slightly more complex than the other ones I have posted. In previous versions the sound was either (1) an exponentially damped sine wave, or (2) the former with some shaping of the amplitude profile, as mentioned above. In the current version, higher harmonics are mixed with the fundamental tone. Here is the snippet of quad2samp where the mixing occurs:

// Form the sine wave and add harmonics to it
samp = sin(W*phase);
samp += -0.4*sin(2*W*phase);
samp += +0.2*sin(3*W*phase);
samp += -0.1*sin(4*W*phase);

I’ve observed an odd but likely well-known phenomenon (or is it an illusion?). When the sound consists of a shaped sine wave, i.e. no (deliberate) harmonic mixture, I find it painful to listen to, even at relatively low volumes. Painful in the most elementary sense of the word, not because of the poor artistry of sf2sound! When I mix in higher harmonics in some degree, the (physical) pain diminishes. I suspect this is because the acoustic energy in the first instance is concentrated near a single frequency, so a small number of hair cells in the inner ear are overstimulated. When the same energy is spread among the various harmonics, albeit in unequal proportions, it is also spread over more hair cells, so that individual cells are not overstimulated. Perhaps someone who really knows what is going on can comment.
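A quick back-of-the-envelope check is consistent with this hypothesis: taking the harmonic amplitudes from the quad2samp snippet (1.0, 0.4, 0.2, 0.1) and power proportional to amplitude squared, the fundamental retains only part of the total energy.

```python
# Power fraction carried by the fundamental, using the harmonic
# amplitudes from the quad2samp snippet (power ~ amplitude squared).
amps = [1.0, 0.4, 0.2, 0.1]
powers = [a * a for a in amps]
fraction = powers[0] / sum(powers)
print(round(fraction, 3))   # 0.826: roughly 17% of the energy has moved
                            # off the fundamental into higher harmonics
```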

I’ll close with one more image — a close-up shot that shows how the wave forms from two adjacent notes join smoothly. The jaggedness of the sound wave reflects the addition of higher harmonics to the sine wave representing the fundamental tone.


Sonatina: close-up

Improvements — more shapely waveforms produced by quad2samp eliminate those durn popping sounds — and much more! Source code for quad2samp and sf2sound.py.


Many changes, including new names for the commands. Here is an example:

% sf2sound test '|| fundamental:261 f: tempo:120 | stacc: q do re mi fa | leg: p: h sol sol ||'

And here is the audio:


The dynamics f: and p:, for forte and piano, now work thanks to the improved back-end program quad. I have renamed it quad2samp. The command fundamental:261 sets the frequency in Hertz of do. For longer examples, you can take a file as input:

sf2sound -f theme.solfa

The file theme.solfa represents the first 15 measures of a sonatina in C by Muzio Clementi (Op. 36 No. 1). See the code pages. Here is the sound file:


previous version

Notice that those terrible popping sounds are gone. Eliminating them turned out to be a problem in waveform shaping, which is accomplished in quad2samp using some nice little quadratic functions. I will discuss these improvements and others in the next few posts, as well as give all the source code.

One of the major changes is a clearer idea of what the input language is and can do. I am dubbing the language SF — for solfa, of course. It has three basic entities: note symbols, rhythm symbols, and commands. The symbols can be accented. Thus do+ is an accented do which is interpreted as C#, while do^ is one of two ways to write C an octave above the given C, the other being do2. An important accent for a rhythm symbol is the dot. Just as in music, it increases the value by one-half. Thus q. is a dotted quarter note.
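The accent rules above can be illustrated with a couple of hypothetical helpers. These are not sf2sound's actual parser: the semitone table, the beat table, and any token forms beyond the ones mentioned above (do+, do^, do2, q.) are assumptions.

```python
# Hypothetical illustration of the SF accent rules: '+' sharpens a
# note by a semitone, '^' or a trailing '2' raises it an octave, and
# a dot increases a rhythm symbol's value by one-half.

SEMITONES = {"do": 0, "re": 2, "mi": 4, "fa": 5, "sol": 7, "la": 9, "ti": 11}
BEATS = {"h": 2.0, "q": 1.0, "e": 0.5}   # "e" for eighth notes is a guess

def pitch(sym):
    """Semitone offset from the base do, honoring '+', '^', and '2'."""
    semis = 0
    if sym.endswith(("^", "2")):
        semis += 12                       # octave up
        sym = sym[:-1]
    if sym.endswith("+"):
        semis += 1                        # sharp
        sym = sym[:-1]
    return SEMITONES[sym] + semis

def beats(sym):
    """Beat value of a rhythm symbol; a dot adds half the value."""
    if sym.endswith("."):
        return BEATS[sym[:-1]] * 1.5
    return BEATS[sym]
```

So pitch("do+") gives 1 (a C#), pitch("do^") and pitch("do2") both give 12, and beats("q.") gives 1.5 beats for a dotted quarter.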

Commands may or may not take arguments. Thus we have tempo:144 but also allegro:. Some commands take more than one argument. An example is cresc:4:f, which means crescendo over four beats to forte. One can also say cresc:4, which means crescendo over four beats from the current level to whatever level results. The rapidity of the crescendo is a default constant which of course can be adjusted: crescendo-speed:1.2. The commands change the state of the “SF” machine that transforms a stream of tokens in the SF language into quadruples. So far this architecture seems to be remarkably easy to extend and maintain.
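The command syntax is uniform enough that the tokenizing step can be sketched in a couple of lines (illustrative only; the real machine is in sf2sound.py):

```python
# Split an SF command token into a name and its colon-separated
# arguments: "tempo:144" -> ("tempo", ["144"]),
# "allegro:" -> ("allegro", []), "cresc:4:f" -> ("cresc", ["4", "f"]).
# Illustrative only, not sf2sound's actual parser.

def parse_command(token):
    name, *rest = token.split(":")
    return name, [arg for arg in rest if arg]
```

The name then selects which register of the machine to update, and the argument list length distinguishes forms like cresc:4 from cresc:4:f.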

This little project is more work than I bargained for, but it beats playing video games. I am having fun and learning a lot. The Audio Programming Book by Boulanger and Lazzarini is a fantastic resource.


Immense destruction

There are no words for this. Move slider to left and right to see more/less of before/after. HH