Austrian composer Peter Ablinger has transformed a recording of a child speaking so that it can be played as MIDI events on a mechanically controlled piano, making the piano a kind of loudspeaker for speech. Via Matrixsynth, the readers at Hack a Day get fairly involved with how this may be working.

Strictly speaking, this seems less like vocoding than a simple transformation to a (much) lower frequency resolution, namely the 88 keys of the piano. Ablinger, for his part, describes the events as “pixels.”
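To make the "lower frequency resolution" idea concrete, here is a minimal sketch of snapping an arbitrary frequency to the nearest of the 88 keys. It assumes equal temperament with A4 at 440 Hz and the usual MIDI numbering (21 for the lowest A, 108 for the highest C); the function name `freq_to_piano_key` is made up for illustration and says nothing about how Ablinger actually did it.

```python
import math

A0, C8 = 21, 108  # MIDI note numbers of the lowest and highest piano keys

def freq_to_piano_key(freq_hz):
    """Return the MIDI note of the nearest piano key, or None if out of range.

    Assumes equal temperament with A4 (MIDI note 69) tuned to 440 Hz.
    """
    if freq_hz <= 0:
        return None
    # 12 semitones per octave, measured relative to A4.
    note = round(69 + 12 * math.log2(freq_hz / 440.0))
    return note if A0 <= note <= C8 else None
```

Everything between two adjacent keys collapses onto one of them, which is the sense in which the piano acts as a coarse frequency grid.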

It’s pretty extraordinary that without a bandpass filter, you get something approximating the noisy sibilance of the speech, but this seems to be the result of having lots of events (that is, lots of resolution in terms of time).

The child in the video is reciting the Proclamation of the European Environmental Criminal Court.

In other words, the basic process is:

1) convert the sound spectrum of the recorded voice to a series of MIDI events,

2) play back the translated MIDI file. 
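Step 1 could be sketched roughly as follows. This is only a guess at the general shape of such a conversion, not the tool that was actually used: it cuts the recording into short frames, measures how much energy sits at each piano key's frequency with a single-frequency Fourier sum, and turns that into a note velocity. The name `voice_to_events`, the 50 ms frame at 16 kHz, and the velocity threshold are all assumptions chosen for illustration, and samples are assumed to lie between -1 and 1.

```python
import math

def voice_to_events(samples, sample_rate, frame_len=800, min_velocity=16):
    """Turn audio samples into (time_seconds, midi_note, velocity) tuples.

    Hypothetical sketch: one event per frame for every piano key whose
    measured amplitude is loud enough to clear min_velocity.
    """
    events = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        for note in range(21, 109):  # the 88 piano keys as MIDI notes
            freq = 440.0 * 2 ** ((note - 69) / 12)
            if freq >= sample_rate / 2:
                continue  # above Nyquist, cannot be measured
            w = 2 * math.pi * freq / sample_rate
            # Correlate the frame with a cosine and a sine at this key's pitch.
            re = sum(x * math.cos(w * n) for n, x in enumerate(frame))
            im = sum(x * math.sin(w * n) for n, x in enumerate(frame))
            amplitude = 2 * math.hypot(re, im) / frame_len
            velocity = min(127, round(127 * amplitude))
            if velocity >= min_velocity:
                events.append((start / sample_rate, note, velocity))
    return events
```

Shorter frames give more events per second (more resolution in time) at the cost of blurrier pitch, especially for the low keys. Step 2 would then hand these events to whatever drives the piano.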

You can see that the MIDI playback is accomplished with Pd (Pure Data) running on a Linux/KDE netbook, though it’s not clear what was used to do the original conversion. (The screenshot with side-by-side audio and MIDI appears as though it may be for demonstration purposes only.)