Generating sounds

We saw earlier how to play sound samples and change their parameters. Though this technique is simple and easy to begin with, it is not enough for making breakthrough sound art projects. One way to go further is to generate and synthesize sounds without using samples at all. Another way is to use samples as raw material for processing methods such as morphing and granular synthesis. Both ways are based on low-level algorithms, which construct sound as an array of audio samples in real time.

openFrameworks gives us low-level access to sound input and output, and we process it in C++, so our sound processing pipeline can perform almost any trick with sound, works fast, and introduces only small lags.

Tip

One thing is currently not so convenient to implement with openFrameworks: processing a sound stream with a variety of standard filters and effects. To do this, you need to program the filters yourself or use libraries or addons. Alternatively, you can use software such as Max/MSP or Ableton Live for sound generation and then control it from openFrameworks via the OSC protocol. See Chapter 11, Networking, for more details.
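
For illustration, here is a minimal sketch of sending one control value over OSC with the ofxOsc addon that ships with openFrameworks. The host, port, and address here are assumptions; the receiving software must be configured to listen for them:

#include "ofxOsc.h"                   //Add the ofxOsc addon to the project

ofxOscSender sender;                  //Would be a member of testApp
sender.setup( "localhost", 12345 );   //Assumed host and port of the receiver

ofxOscMessage m;
m.setAddress( "/synth/freq" );        //Hypothetical address
m.addFloatArg( 440.0 );               //The value to send
sender.sendMessage( m );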

To generate sound in real time, you need to start the sound output stream and then provide audio samples whenever openFrameworks requests them. The corresponding additions to the project's code are as follows:

  1. Add a sound stream object and a function for audio output to the testApp class declaration as follows:
    ofSoundStream soundStream;
    
    void audioOut( float *output, int bufferSize, int nChannels );
  2. At the end of the testApp::setup() function definition add:
    soundStream.setup( this, 2, 0, 44100, 512, 4 );

    Here, this is a pointer to our testApp object, which will receive requests for audio data from openFrameworks via calls to its testApp::audioOut() function.

    Next, 2 is the number of output channels (hence, stereo output), 0 is the number of input channels (hence, no input), and 44100 is the sample rate, that is, the number of audio samples played per second. The value 44100 means CD quality and is good in most situations. The last two parameters, 512 and 4, are the size of the buffer for audio samples and the number of buffers respectively; they are discussed later.

  3. Add the function definition as follows:
    void testApp::audioOut(
            float *output, int bufferSize, int nChannels ){
    
      //... fill output array here
    
    }

    This is the function that should fill the output array with audio sample data; it is what actually generates the sound. Values in output should lie in the range from -1.0 to 1.0; otherwise, audio clipping will occur (you will hear clicks in the sound). The size of output is equal to bufferSize * nChannels, and the samples for the channels are interleaved. Namely, if nChannels is equal to 2, then this is a stereo signal, so output[0] and output[1] hold the first audio samples for the left and the right channels. Correspondingly, output[2] and output[3] hold the second audio samples, and so on.
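
For example, a minimal audioOut() body that fills the interleaved stereo buffer with a 440 Hz sine tone could look as follows. This is a sketch under the setup shown above (two output channels, 44100 Hz sample rate), assuming phase is a float member of testApp initialized to 0:

void testApp::audioOut(
        float *output, int bufferSize, int nChannels ){
  for (int i=0; i<bufferSize; i++) {
      phase += 440.0 / 44100.0;              //Advance phase by one sample
      phase = fmodf( phase, 1.0 );           //Keep phase in [0, 1]
      float v = 0.2 * sin( phase * TWO_PI ); //Stay well inside [-1, 1]
      output[ i*2 ] = v;                     //Left channel sample
      output[ i*2 + 1 ] = v;                 //Right channel sample
  }
}

Note that advancing phase by 440.0 / 44100.0 per sample completes exactly 440 wave cycles each second.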

Also, there are a number of functions for managing audio devices. They are as follows:

  • The soundStream.listDevices() function prints the list of available devices to the console.
  • The soundStream.setDeviceID( id ) function selects a device, where id has type int. You should call it before soundStream.setup(); if it is not called, the system's default device is used (see the sketch after this list).
  • The soundStream.stop() function stops calling audioOut().
  • The soundStream.start() function starts calling audioOut() again.
  • The soundStream.close() function stops the soundStream object's use of the audio device.
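
For instance, a device-selection sequence at the start of sound output might look like this; the id value 1 here is hypothetical and should be taken from the list that listDevices() prints:

soundStream.listDevices();       //Print available devices to the console
soundStream.setDeviceID( 1 );    //Hypothetical id taken from the printed list
soundStream.setup( this, 2, 0, 44100, 512, 4 );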

There are two important things about the sound generating function audioOut(). Firstly, the function is called by openFrameworks independently of the update() and draw() calls. Namely, it is called at the request of the sound card, whenever the next buffer of audio samples is needed for playing:

[Figure: the sound card requesting buffers of audio samples from the application]

Secondly, audioOut() should work fast. Otherwise, the sound card will not receive its buffer in time, and you will hear clicks in the output sound. You can tune this behavior by changing the last two parameters in the following line:

soundStream.setup( this, 2, 0, 44100, 512, 4 );

512 is the buffer size. If the buffer is bigger (for example, 1024), it is requested less often, so you have more time to fill it, which gives more robustness. Conversely, a smaller buffer size, for example 256, leads to better audio responsiveness (smaller latency), because the delay between filling a buffer and playing it through the audio system is smaller. The last parameter, 4, is the number of buffers used by the sound card for storing sound. Similarly, increasing this parameter leads to better robustness, and decreasing it leads to better audio responsiveness.
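
To make the trade-off concrete, here is a back-of-envelope estimate of the buffering latency (our own arithmetic, not part of the example's code):

//Rough latency estimate for soundStream.setup( this, 2, 0, 44100, 512, 4 )
int   bufferSize = 512;          //Samples per buffer
int   numBuffers = 4;            //Buffers queued by the sound card
float sampleRate = 44100.0;      //Samples per second

float perBufferMs = bufferSize / sampleRate * 1000.0;  //About 11.6 ms
float worstCaseMs = perBufferMs * numBuffers;          //About 46.4 ms

So halving the buffer size roughly halves the buffering latency, at the price of audioOut() being called twice as often.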

Now, we will consider an example of sound generation.

Warning

When using ofSoundStream for sound output in your projects, be careful! Due to possible errors in the projects' code and for other reasons, it can suddenly generate clicks and very loud sounds. To avoid the hazard of damaging your ears, do not listen to the output of such projects using headphones.

The PWM synthesis example

Let's build a simple sound generator using Pulse Width Modulation (PWM). In electronics, PWM is a method of sending analog values through wires using just two levels of voltage (logical 1 and 0). The value is coded by changing the length of the pulse with logical value 1, while the overall cycle length is fixed. The following diagram shows the coding of a value val in the range from 0 to 1, with a fixed cycle length c. You can see that the output signal is a periodic wave with period c, consisting of two segments with values 1 and 0, whose lengths are val * c and (1 - val) * c respectively:

[Diagram: a PWM wave coding val, with fixed cycle length c]

Such a signal can be considered a sound wave, with the wave frequency equal to 1.0 / c; for example, c = 1/100 second gives a 100 Hz wave.

If val is equal to 0.5, then 1 and 0 values have equal length in the wave, and such a waveform is called a square wave.

Note

PWM sound waves, and especially square waves, are widely used in subtractive synthesizers as the basic waveforms for sound generation. They have a fat and distinctly electronic sound. Alongside them, sinusoidal, triangle, and sawtooth waves are also used.

Let's consider an example of PWM sound generation. The frequency and PWM value of the wave will depend on x and y mouse coordinates, so when you move the mouse, you will hear the sound changing.

Note

This is example 06-Sound/03-PWMSynth.

Warning: To avoid the hazard of damaging your ears due to the possibility of suddenly generated very loud sounds, do not listen to the output of the project with headphones.

This example is based on the emptyExample project in openFrameworks.

Add the following code to testApp.h, in the testApp class declaration. Note that the sound control parameters are userFreq and userPwm, a frequency and a PWM value. There are also separate variables freq and pwm, which follow these parameters relatively slowly. This lets us always obtain a smooth sound, even when the user changes the sound parameters quickly (that is, moves the mouse rapidly).

//Function for generating audio
void audioOut( float *output, int bufferSize, int nChannels );

ofSoundStream soundStream;  //Object for sound output setup

//User-changing parameters
float userFreq;             //Frequency
float userPwm;              //PWM value
//Parameters, used during synthesis
float freq;                 //Current frequency
float pwm;                  //Current PWM value
float phase;                //Phase of the wave

//Copy of the last generated audio buffer, used for drawing
vector<float> buf;

At the beginning of the testApp.cpp file, after the #include "testApp.h" line, add declarations of some constants as follows:

int bufSize = 512;           //Sound card buffer size
int sampleRate = 44100;      //Sound sample rate
float volume = 0.1;          //Output volume

The setup() function sets the initial values and starts the sound output:

void testApp::setup(){
  userFreq = 100.0;          //Some initial frequency
  userPwm = 0.5;             //Some initial PWM value

  freq = userFreq;
  pwm = userPwm;
  phase = 0;
  buf.resize( bufSize );

  //Start the sound output
  soundStream.setup( this, 2, 0, sampleRate, bufSize, 4 );
}

The update() function is empty, and the draw() function draws the buffer with audio sample values on the screen:

void testApp::draw(){
  ofBackground( 255, 255, 255 );  //Set the background color
  //Draw the buffer values
  ofSetColor( 0, 0, 0 );
  for (int i=0; i<bufSize-1; i++) {
      ofLine( i, 100 - buf[i]*50, (i+1), 100 - buf[i+1]*50 );
  }
}

Also, we need to fill in the mouseMoved() function so it changes the parameters according to the mouse position. The frequency userFreq will change in a range from 1 to 2000 Hz, and the PWM value userPwm will change in a range from 0 to 1:

void testApp::mouseMoved( int x, int y ){
  userFreq = ofMap( x, 0, ofGetWidth(), 1, 2000 );
  userPwm = ofMap( y, 0, ofGetHeight(), 0, 1 );
}

Finally, add the audioOut() function, which generates the sound. You can see how we change the freq and pwm values at each loop iteration so that they approach userFreq and userPwm smoothly. Also note that phase holds a value in the range from 0 to 1, and it advances in correspondence with freq and sampleRate for each generated audio sample.

void testApp::audioOut( float *output,
        int bufferSize, int nChannels ){
  //Fill output buffer,
  //and also move freq to userFreq and pwm to userPwm slowly
  for (int i=0; i<bufferSize; i++) {
      //freq smoothly reaches userFreq
      freq += ( userFreq - freq ) * 0.001;
      //pwm smoothly reaches userPwm
      pwm += ( userPwm - pwm ) * 0.001;

      //Change phase, and push it into [0, 1] range
      phase += freq / sampleRate;
      phase = fmodf( phase, 1.0 );

      //Calculate the output audio sample value
      //Instead of 1 and 0 we use 1 and -1 output values,
      //so the sound wave is centered around zero
      float v = ( phase < pwm ) ? 1.0 : -1.0;

      //Set the computed value to the left and the right
      //channels of output buffer,
      //also using global volume value defined above
      output[ i*2 ] = output[ i*2 + 1 ] = v * volume;

      //Set the value to buffer buf, used for rendering
      //on the screen
      //Note: bufferSize can occasionally differ from bufSize
      if ( i < bufSize ) {
          buf[ i ] = v;
      }
  }
}

Run the code and move the mouse left-right and up-down. You will hear a distinctive PWM sound and will see its waves:

[Screenshot: the PWM waveform drawn by the example]

Move the mouse and explore the sound when the pointer is at the center of the screen and near the screen borders. Because the x coordinate of the mouse sets the frequency and the y coordinate sets the PWM value, you will notice that moving the mouse in the middle of the screen gives a fat square-wave sound, while moving it at the very top and bottom of the screen gives glitch-like pulse signals.

If you change the value 0.001 to 0.0001 in the lines freq += ( userFreq - freq ) * 0.001; and pwm += ( userPwm - pwm ) * 0.001;, then freq and pwm will move to userFreq and userPwm more slowly. So while moving the mouse, you will hear a glide effect of the kind used in synthesizers. Conversely, if you set these values to 1.0, freq and pwm will simply be equal to userFreq and userPwm, and you will hear a raw sound that changes rapidly as the mouse moves.
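
To make the role of this coefficient explicit, the smoothing can be factored into a tiny helper function. This is a hypothetical refactoring for illustration, not part of the example's code:

//One step of exponential smoothing: move current toward target.
//speed lies in (0, 1]: small values glide, 1.0 jumps immediately
float smoothStep( float current, float target, float speed ){
  return current + ( target - current ) * speed;
}

//Inside audioOut(), the lines above would become:
//freq = smoothStep( freq, userFreq, 0.001 );   //Smooth, as in the example
//freq = smoothStep( freq, userFreq, 0.0001 );  //Slow glide
//freq = smoothStep( freq, userFreq, 1.0 );     //Instant, raw sound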

Tip

With some compilers, you need to perform the Rebuild command for your project in order for the audioOut() function to be linked correctly. If the linking is not correct, you will just see a straight line on the screen and hear nothing. If you see the PWM waves on the screen but do not hear the sound, check your sound equipment and its volume settings.

You can extend the example by controlling its parameters with some analysis of live video taken from a camera, or with 3D camera data.

We will go further and see an example of transcoding image data into a sound signal directly.

Image-to-sound transcoder example

Let's take an image and consider its central horizontal line. This is a one-dimensional array of colors. Now take the brightness of each color in the array. We obtain an array of numbers, which can be treated as the PCM values of some sound and played in the audioOut() function.

Tip

Certainly, there exist other methods for converting visual data to audio data and back. Moreover, there exist ways to convert audio and video into commands controlling robot motors, 3D printers, smell printers, and other digital devices. All such transformations between different information types are called transcoding. Transcoding is possible due to the digital nature of all information representation in the computer. For example, the number 102 can simultaneously be interpreted as a pixel's color component, an audio sample value, and an angle for a robot's servo motor. For detailed philosophical considerations on transcoding, see the book The Language of New Media, Lev Manovich, The MIT Press.

Such an algorithm is thus a transcoding of image data to audio data. Let's code it, using frames from a camera as input images. For details on working with camera data, see Chapter 5, Working with Videos.

Note

This is example 06-Sound/04-ImageToSound.

Warning: To avoid the hazard of damaging your ears due to the possibility of suddenly generated very loud sounds, do not listen to the output of the project with headphones.

This example is based on the emptyExample project in openFrameworks. Add the following code to testApp.h in the class testApp declaration:

//Function for generating audio
void audioOut( float *output, int bufferSize, int nChannels );

ofSoundStream soundStream;  //Object for sound output setup

ofVideoGrabber grabber;     //Video grabber

At the beginning of testApp.cpp, after the #include "testApp.h" line, add constants and variables:

//Constants
const int grabW = 1024;            //Width of the camera frame
const int grabH = 768;             //Height of the camera frame
const int sampleRate = 44100;      //Sample rate of sound
const float duration = 0.25;       //Duration of the recorded 
                                   //sound in seconds
const int N = duration * sampleRate;  //Size of the PCM buffer 
const float volume = 0.5;             //Output sound volume
const int Y0 = grabH * 0.5;        //y-position of the scan line

//Variables
vector<float> arr;      //Temporary array of pixels brightness
vector<float> buffer;   //PCM buffer of sound sample
int playPos = 0;        //Current playing position in the buffer

The setup() function sets the buffer arrays' sizes, runs the video grabber, and starts the sound output:

void testApp::setup(){
  //Set array sizes and fill them with zeros
  arr.resize( grabW, 0.0 ); 
  buffer.resize( N, 0.0 );

  //Start camera
  grabber.initGrabber( grabW, grabH );

  //Start the sound output
  soundStream.setup( this, 2, 0, sampleRate, 512, 4 );
}

The update() function reads a frame from the camera and writes the brightness of its central line into the buffer. It saves the pixels' brightness values into the array arr, which has a size equal to the image width grabW. Next, arr is stretched into the buffer array, which has size N, using linear interpolation.

Also, the values of the buffer are shifted so that their mean value is equal to zero. This transformation is the simplest method of DC-offset removal. DC-offset removal is routinely used in sound recording for centering recorded signals. It is a crucial procedure when mixing several sounds, because it helps to reduce the dynamic range of the mixed signal without any audible changes:

void testApp::update(){
  grabber.update();                //Update camera
  if ( grabber.isFrameNew() ) {    //Check for new frame

      //Get pixels of the camera image
      ofPixels &pixels = grabber.getPixelsRef();

      //Read central line's pixels brightness to arr
      for (int x=0; x<grabW; x++) {
          //Get the pixel brightness
          float v = pixels.getColor( x, Y0 ).getLightness();
          //v lies in [0,255], convert it to [-1,1]
          arr[x] = ofMap( v, 0, 255, -1, 1, true );
      }

      //Stretch arr to buffer, using linear interpolation
      for (int i=0; i<N; i++) {
          //Get position in range [0, grabW]
          float pos = float(i) * grabW / N;

          //Get left and right indices,
          //clamped to the size grabW of arr
          int pos0 = int( pos );
          int pos1 = min( pos0 + 1, grabW - 1 );

          //Interpolate, guarding against pos0 == pos1
          //at the very last pixel
          buffer[i] = ( pos1 > pos0 )
              ? ofMap( pos, pos0, pos1, arr[pos0], arr[pos1] )
              : arr[pos0];
      }

      //DC-offset removal
      //Compute a mean value of buffer
      float mean = 0;
      for (int i=0; i<N; i++) {
          mean += buffer[i];
      }
      mean /= N;

      //Shift the buffer by mean value
      for (int i=0; i<N; i++) {
          buffer[i] -= mean;
      }
  }
}

The draw() function draws the camera image, marks the scan-line area with a yellow rectangle, and draws the buffer as a graph in the top part of the screen. See the draw() function code in the example's text.

Finally, the audioOut() function reads values from the buffer and pushes them into the output array. The playing position is held in the playPos variable. When the end of the buffer is reached, playPos is set back to 0, so the buffer plays in a loop:

void testApp::audioOut(
        float *output, int bufferSize, int nChannels ) {
  for (int i=0; i<bufferSize; i++) {
      //Push current audio sample value from buffer
      //into both channels of output.
      //Also global volume value is used
      output[ 2*i ] = output[ 2*i + 1 ]
                = buffer[ playPos ] * volume;
      //Shift to the next audio sample
      playPos++;
      //When the end of buffer is reached, playPos sets to 0
      //So we hear looped sound
      playPos %= N;
  }
}

Run the example and point the camera somewhere. You will see the camera image with the scan area marked by a yellow rectangle. At the top of the screen, you will see the corresponding graph of the sound, and you will hear this sound in a loop. Note how bright and dark pixels in the scan line correspond to high and low graph values. Most likely, the sound you hear will be quite strange. This is because our ears are accustomed to periodic signals, but data from a camera image is normally not periodic.

Now, direct the camera to this stripes image (yes, direct the camera right at this picture in the book, or print it on paper from the file stripesSin0To880Hz.png):

[Image: the stripes picture stripesSin0To880Hz.png]

If you align the scan line with the horizontal axis of the stripes image, you will hear a tone sweeping from low to high, and see an image like the one in the following screenshot:

[Screenshot: the camera viewing the stripes, with the transcoded sound graph at the top]

Actually, the stripes correspond to a sine wave with its frequency sweeping from 0 to 880 Hz, with a duration of one-fourth of a second. The corresponding graph of its PCM is shown in the following screenshot:

[Screenshot: the PCM graph of the original sine sweep]

You can see that the graph of the sound transcoded from the camera (at the top of the previous screenshot) is noisy but nevertheless similar to the original graph.

Now move the camera closer to the stripes image. You will notice how the tone of the sound decreases. If you move the camera very close, you will hear a bass sound.

Here is one more stripes image to play with. It encodes an "ar" sound (the stripesAr.png file):

[Image: the stripes picture stripesAr.png]

Tip

You can prepare stripe images by coding your own sounds using the loop sampler example. This is discussed in The loop sampler example section.

We hope that after you finish playing with this example, you will better understand and feel the nature of PCM sound representation.

Now we will consider how to get sound data from a microphone and other input sound devices.
