Granular Sampler

We are going to demonstrate how to implement a new, completely different type of synthesis (perhaps better called sound processing, since the sound is not really synthesized). You can control parameters interactively (start position, grain length and number of voices) and replace the grain source material (or part of it) in real time. As a result, this creates very interesting sounds and is a lot of fun to play.

The hardware is capable of sampling up to 1 second of stereophonic sound and playing pieces of it back in parallel, as up to 40 grains per ear. The rest of the memory is reserved for the echo/loop buffer; if that is not required, the sampling time can be increased.

You will learn how to:

  • add a new channel with any numeric code we want
  • initialize hardware using low level functions
  • implement sound processing without relying on existing channels
  • record samples from microphones, process them and output
  • play multiple voices in parallel from SRAM memory with a group of pointers
  • bind the new functionality to existing controls (sensors and buttons)
  • mix the signal output from our custom function into the loop with or without delay/echo

Requirements for any Gecho firmware programming

Getting Started

While some of the previous tutorials showed how to build a new channel on top of an existing one, this time the functionality we have in mind is so different that it is better to start pretty much from scratch. We will only re-use the low-level hardware driver functions.

To add a new channel, open Channels.cpp, find the function "void custom_program_init(uint64_t prog)" and add a new channel definition into it. The channel number can be anything that is not yet allocated in this function. To avoid getting lost, it is best to keep these blocks of code sorted by channel number.

There is currently (as of version 0.219 and below) an unallocated range starting at #411; let's use it.

if(prog == 411) //granular sampler
{
    ADC_configure_MIC(BOARD_MIC_ADC_CFG_STRING); //use built-in microphones
    granular_sampler(); //the function where everything happens
}

We will put this function into a new file, ideally within the "extensions" directory. Create new Granular.cpp and Granular.h files there. The function definition goes into the .cpp file; let's create an empty function that does nothing for now. The .cpp file also needs to include the .h file.

#include "Granular.h"
 
void granular_sampler()
{
 
}

The function declaration goes into the Granular.h file. Your IDE can probably generate this file automatically (and add the guard against recursive inclusion). What you need to add there is your function name, wrapped in two compiler directives that conditionally open and close an extern "C" block (a linkage specification). If it looks confusing, don't worry about it too much - it is only required here because our project mixes C and C++ code.

We also added a few #include directives to access functions from framework's other files.

#ifndef EXTENSIONS_GRANULAR_H_
#define EXTENSIONS_GRANULAR_H_
 
#include <stdlib.h>
#include <string.h>
#include "hw/gpio.h"
#include "hw/leds.h"
#include "hw/sensors.h"
#include "hw/signals.h"
#include <Interface.h>
 
#ifdef __cplusplus
 extern "C" {
#endif
 
void granular_sampler();
 
#ifdef __cplusplus
 }
#endif
 
#endif /* EXTENSIONS_GRANULAR_H_ */

Also, don't forget to include the newly created .h file in top of the Channels.cpp:

#include <extensions/Granular.h>

Now is a good time to recompile your project and check for possible errors or typos.

In the following paragraphs, the code will be explained in logical blocks rather than in the sequence in which it appears in the file. Some less interesting (but still important) bits and pieces have been left out of this detailed explanation, as they are pretty straightforward. You can download the complete source - Granular.cpp and .h files - to see how these chunks of code are arranged.

Sampling the Signal

First, let's define a few constants and variables. GRAIN_SAMPLES sets the size of the sampled data buffers. Since we are sampling at 16-bit resolution, each sample takes 2 bytes - we are using the int16_t type (signed integer) for this. Defining this value in relation to the sampling rate (I2S_AUDIOFREQ) is a handy way to express it, as it then directly corresponds to a fraction of a second.

We also create buffer variables and allocate their memory dynamically. The variables sampleCounter and sampleCounter2 will point to a particular sample within these buffers. We reset them to zero (to point to the beginning of the buffers), and initialize the audio codec and the I2S bus too.

#define GRAIN_SAMPLES (I2S_AUDIOFREQ/2) //memory will be allocated for up to half second grain
#define GRAIN_LENGTH_MIN (I2S_AUDIOFREQ/16) //one 16th of second will be shortest allowed grain length
 
int16_t *grain_left, *grain_right; //two buffers for stereo samples
 
//there is an external sampleCounter variable already, but we want
//a separate one for right channel to process it independently from left
unsigned int sampleCounter2;
 
//allocate memory for grain buffers
grain_left = (int16_t*)malloc(GRAIN_SAMPLES * sizeof(int16_t));
grain_right = (int16_t*)malloc(GRAIN_SAMPLES * sizeof(int16_t));
memset(grain_left, 0, GRAIN_SAMPLES * sizeof(int16_t));
memset(grain_right, 0, GRAIN_SAMPLES * sizeof(int16_t));
 
//reset counters
sampleCounter = 0;
sampleCounter2 = 0;
 
//init and configure audio codec
codec_init();
codec_ctrl_init();
I2S_Cmd(CODEC_I2S, ENABLE);
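One thing the allocation code above skips is checking whether malloc() actually succeeded. On a small embedded heap that can matter; a defensive variant might look like the sketch below. Note that alloc_grain_buffers() is a hypothetical helper for illustration, not a framework function, and the buffer size is an illustrative stand-in for the GRAIN_SAMPLES constant.

```c
#include <stdlib.h>
#include <string.h>
#include <stdint.h>

#define GRAIN_SAMPLES_ 22050 /* illustrative stand-in for GRAIN_SAMPLES */

/* Allocate and zero both grain buffers; return 1 on success, 0 on
   failure, never leaving the caller with a half-allocated pair. */
int alloc_grain_buffers(int16_t **left, int16_t **right)
{
    *left  = (int16_t*)malloc(GRAIN_SAMPLES_ * sizeof(int16_t));
    *right = (int16_t*)malloc(GRAIN_SAMPLES_ * sizeof(int16_t));
    if (*left == NULL || *right == NULL)
    {
        free(*left);  /* free(NULL) is a safe no-op */
        free(*right);
        *left = *right = NULL;
        return 0;
    }
    memset(*left, 0, GRAIN_SAMPLES_ * sizeof(int16_t));
    memset(*right, 0, GRAIN_SAMPLES_ * sizeof(int16_t));
    return 1;
}
```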

Note: By what we call a "sample" in the code, we actually mean one 16-bit wide piece of the sampled data, not the entire recording (i.e. a buffer full of sampled sound). To avoid confusion, we will call that other kind of sample "the grain".

The code that reads values from the A/D converters and stores them into the buffers is located in a loop, running periodically at 22.050 kHz (two ADC conversions - one per stereo channel). We also check whether the S2 proximity sensor is triggered beyond the defined threshold, and only allow sampling new grains if it is. Note that the sensor values in the ADC_last_result array are numbered from zero, so here we use index [1] to select sensor S2. Selecting the sound source with button #4 (mics vs. pickups) happens elsewhere; the ADCx_read() functions receive data from the pre-configured source.

//sensor S2 enables recording - when triggered, sample the data into the grain buffer (left channel first)
if (ADC_last_result[1] > IR_sensors_THRESHOLD_1)
{
    grain_left[sampleCounter] = (int16_t)(4096/2 - (int16_t)ADC1_read()) / 4096.0f * op_gain;
}

//sample data into the right channel grain buffer, when S2 triggered
if (ADC_last_result[1] > IR_sensors_THRESHOLD_1)
{
    grain_right[sampleCounter2] = (int16_t)(4096/2 - (int16_t)ADC2_read()) / 4096.0f * op_gain;
}

Playing the Grains

The magic happens here. We have an array of floating-point values, step[], containing pointers to a particular sample in the grain buffer. However, a pointer does not have to land exactly on a discrete sample (0, 1, 2, 3...); it can point anywhere in between (e.g. 2.48 - hence the floating point). Imagine this: if you simply incremented this pointer by one with each new sample, you would play back the original sound exactly as it was recorded. By incrementing the pointer by a constant value different from 1, however, you can speed up or slow down how fast the pointer "travels" through the buffer. As a result, the sound is played back at a correspondingly higher or lower pitch.

The trick is to increase the pointers smoothly but then truncate them, to pick the nearest discrete sample. Information about how much to increment the pointers in each step is held in another array, freq[], which contains floating-point numbers representing frequency directly. Each value is compared to the basic "reference" note C5, to determine whether the grain should play faster, slower or at equal speed (more on that later).
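The fractional-pointer idea can be sketched in plain host-side C, isolated from the hardware. This is an illustration only - play_grain() is a hypothetical helper, not firmware code - but it shows the same mechanics: a position advancing by a ratio, truncated to select the nearest recorded sample, wrapping at the end of the grain.

```c
#include <stdint.h>

/* Read n output samples from a grain buffer at a given speed ratio.
   ratio 1.0 plays at the original pitch; 2.0 plays an octave higher
   (the position skips every other sample); 0.5 an octave lower. */
void play_grain(const int16_t *grain, int grain_len, float ratio,
                int16_t *out, int n)
{
    float pos = 0.0f;
    for (int i = 0; i < n; i++)
    {
        if ((int)pos >= grain_len) /* wrap back to the start */
            pos = 0.0f;
        out[i] = grain[(int)pos];  /* truncate to nearest discrete sample */
        pos += ratio;
    }
}
```

With ratio 2.0, the pointer visits positions 0, 2, 4, 6... so the grain plays twice as fast, exactly as the freq[i] / FREQ_C5 increment does in the real loop.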

After mixing all the voices into the variable sample_mix, the result is transmitted to a particular channel of the codec's D/A converter over the SPI bus using the I2S protocol. Note how the sample is wrapped in the add_echo() function, so this effect is added automatically if enabled (by button #3). The line before it is responsible for proper timing, checking the SPI_I2S_FLAG_TXE flag to see whether the codec is ready to receive the next sample.

//mix samples for all voices (left channel)
sample_mix = 0;

for(int i=0;i<grain_voices;i++)
{
    step[i] += freq[i] / FREQ_C5;
    if ((int)step[i]>=grain_length)
    {
        step[i] = grain_start;
    }
    sample_mix += grain_left[(int)step[i]];
}

//send data to codec, with added echo
while (!SPI_I2S_GetFlagStatus(CODEC_I2S, SPI_I2S_FLAG_TXE));
SPI_I2S_SendData(CODEC_I2S, add_echo((int16_t)(sample_mix)));

//mix samples for all voices (right channel)
sample_mix = 0;

for(int i=0;i<grain_voices;i++)
{
    sample_mix += grain_right[(int)step[i]];
}

//send data to codec, with added echo
while (!SPI_I2S_GetFlagStatus(CODEC_I2S, SPI_I2S_FLAG_TXE));
SPI_I2S_SendData(CODEC_I2S, add_echo((int16_t)(sample_mix)));

Pitches and Frequencies

Two things are still not clear from the previous code: what freq[i] / FREQ_C5 does, and what is hidden in the freq array.

The sound recorded via the microphones is random and not necessarily musical, so it does not sit at any particular "pitch" (or the actual pitch may change during the sampling period); for the sake of simplicity, we will consider it to be at the note C5. The whole reason is that we need a handy way to manipulate this sound within the frame of common musical notation, e.g. to build chords from it. It does not matter that we picked the note C5 specifically: in the end we only care about frequency ratios, and those would be the same regardless of which note we chose as a reference. As you already know, for example, changing the pitch one octave up means doubling the frequency - and this ratio is the same for C6 vs. C5, C5 vs. C4, etc.

Essentially, the value in freq[i] determines how fast the grain will be played back relative to the original speed at which it was recorded.

The array as we defined it is only one of nearly infinitely many possible configurations (limited by hardware, of course, and by what sounds good) which we can define in order to take what was sampled and "spread it around" the spectrum. Here we will use the most common chord, the major triad, as the base, and add a few "detuned" frequencies to enrich it. First, let's define the frequency ratios of the notes in this chord. This approach is efficient enough (memory- and computation-wise), as the values MAJOR_THIRD and PERFECT_FIFTH will end up as pre-compiled constants, and the rest of the array is filled in once, when the program starts. Coefficients like 0.98 or 1.04 slightly detune the pitch, and multiplication or division by powers of two shifts it up or down by one or more octaves.

Now you can see how the first 3 positions of the freq array contain the basic chord, and then every subsequent three add more voices to it, shifting it up and down - first slightly, later by an octave or two. Think about what this will do in real time: as sensor S1 is activated, more and more of these voices become active - from only the first 3 at the beginning, to all 33 when the sensor is triggered to its full level.

//define some handy constants
#define FREQ_C5 523.25f //frequency of note C5
#define FREQ_E5 659.25f //frequency of note E5
#define FREQ_G5 783.99f //frequency of note G5
 
#define MAJOR_THIRD (FREQ_E5/FREQ_C5) //ratio between notes C and E
#define PERFECT_FIFTH (FREQ_G5/FREQ_C5) //ratio between notes C and G
 
//define frequencies for all voices
freq[0] = FREQ_C5;
freq[1] = freq[0]*MAJOR_THIRD;
freq[2] = freq[0]*PERFECT_FIFTH;
freq[3] = freq[0]*0.99;
...
freq[8] = freq[2]*1.01;
freq[9] = freq[0]*0.98/2;
freq[10] = freq[1]*0.98/2;
freq[11] = freq[2]*0.98/2;
freq[12] = freq[0]*1.02/2;
...
freq[24] = freq[0]*1.04*4;
freq[25] = freq[1]*1.04*4;
freq[26] = freq[2]*1.04*4;
...
freq[30] = freq[0]*1.05*8;
freq[31] = freq[1]*1.05*8;
freq[32] = freq[2]*1.05*8;
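Since every group of three entries follows the same pattern (triad x detune x octave shift), the table could equally be generated in a loop. The sketch below shows the idea on a small scale; the detune and octave tables here are illustrative stand-ins, not the firmware's actual values (several of which are elided from the listing above).

```c
#define FREQ_C5_      523.25f
#define MAJOR_THIRD_  (659.25f / 523.25f)
#define PERFECT_5TH_  (783.99f / 523.25f)

/* Fill groups*3 voices: each group is the C-major triad scaled by a
   per-group detune coefficient and octave multiplier. The coefficient
   tables are hypothetical examples modeled on the listing above. */
void fill_freq_table(float *freq, int groups)
{
    const float triad[3]  = { 1.0f, MAJOR_THIRD_, PERFECT_5TH_ };
    const float detune[4] = { 1.0f, 0.99f, 0.98f, 1.04f };
    const float octave[4] = { 1.0f, 1.0f,  0.5f,  4.0f  };

    for (int g = 0; g < groups && g < 4; g++)
        for (int v = 0; v < 3; v++)
            freq[g * 3 + v] = FREQ_C5_ * triad[v] * detune[g] * octave[g];
}
```

The explicit listing in the tutorial source was kept for clarity, but the loop form makes it easier to experiment with other chords or detune patterns.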

Interactive Controls

Apart from sensor S2 enabling the recording of new sound, which has been covered in previous paragraphs, we have S1 to control the number of voices, and S3+S4 to move the beginning and end of the grain's "active area" up and down, selecting which portion of it is played back.

Each sensor returns values from 0 to over 2000; we can decide how to parse this range and map it onto the parameter we want to control, to get optimal results. For example, considering how S1 and S2 are laid out on the board, it may be a good idea to make the inner one a little more sensitive than the outer one (as it is harder to reach without inadvertently triggering the other).

For this reason, we allow recording to start at the very lowest level of S2 (so it is not that hard to reach), while S1 values from 200 to 2200 cover the minimum to maximum number of voices (so there is no need to get really close to the sensor to activate them all). These values are not critical and were largely tuned by experimentation.

//if sensor S1 triggered, increase number of active voices
if (ADC_last_result[0] > 200)
{
    //when sensor value is 2200, enable all voices
    grain_voices = GRAIN_VOICES_MIN + ((ADC_last_result[0] - 200) / 2000.0f)
        * (GRAIN_VOICES_MAX - GRAIN_VOICES_MIN);

    if (grain_voices > GRAIN_VOICES_MAX)
    {
        grain_voices = GRAIN_VOICES_MAX;
    }
}
else //if sensor not triggered, use basic setting
{
    grain_voices = GRAIN_VOICES_MIN;
}

Note: This may look complicated, especially the long line that does the calculation, so here is the underlying formula. We want to map values between A and B coming from the sensor (i.e. 200 to 2200) into a range C to D (i.e. 3 to 33). Let's call the input value coming from the sensor X, and the output we want to get Y.

First, we find out how far the sensor value X is from the minimum, relative to the maximum (as if represented by 0 to 1 in floating point: 0 meaning minimum, 1 maximum, 0.5 exactly in the middle). This is simply r = (X-A) / (B-A), essentially a ratio of two distances - how far the value is from the minimum, and how big the whole range is. In the second step, this ratio is multiplied by the output parameter's possible range to get the end result: Y = C + r * (D-C).

The two equations combined read Y = C + (X-A) / (B-A) * (D-C), and if you compare this to the long line of code again, you can clearly see what is what (all constants and variables appear in the same order). There is also a necessary overflow check and limit, to cover the case where the equation produces a parameter outside the allowed range (i.e. when the sensor is triggered beyond the 2200 level).
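The formula from the note above can be packaged as a small helper (a host-side sketch; map_range() is introduced here for illustration and is not part of the firmware, which clamps only the upper bound inline):

```c
/* Map x from the input range [a, b] to the output range [c, d],
   clamping the result so inputs beyond the range cannot push the
   parameter out of bounds. Assumes a < b and c < d. */
float map_range(float x, float a, float b, float c, float d)
{
    float y = c + (x - a) / (b - a) * (d - c);
    if (y < c) y = c;
    if (y > d) y = d;
    return y;
}
```

With the values from the text, a sensor reading of 1200 (halfway between 200 and 2200) maps to 18 voices, halfway between 3 and 33.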

Sensors S3 and S4, which change where the grain's "active area" begins and ends, are treated similarly. For both, we take values 200-2200 and map them over the entire range where these parameters make sense. There is a constant GRAIN_LENGTH_MIN defined for the minimal length that still sounds good.

//if sensor S4 triggered, decrease grain length
if (ADC_last_result[3] > 200)
{
    //when sensor value is 2200 or higher, set length to minimum
    grain_length = GRAIN_SAMPLES - (float(ADC_last_result[3] - 200) / 2000.0f)
        * float(GRAIN_SAMPLES - GRAIN_LENGTH_MIN);

    if (grain_length < GRAIN_LENGTH_MIN)
    {
        grain_length = GRAIN_LENGTH_MIN;
    }
}
else //if sensor not triggered, use basic setting
{
    grain_length = GRAIN_SAMPLES;
}

//if sensor S3 triggered, move the grain start higher up in the buffer
if (ADC_last_result[2] > 200)
{
    //when sensor value is 2200 or higher, set the start to be closest to the end
    grain_start = (float(ADC_last_result[2] - 200) / 2000.0f) * (float)grain_length;

    if (grain_start < 0)
    {
        grain_start = 0;
    }
    if (grain_start > grain_length - GRAIN_LENGTH_MIN)
    {
        grain_start = grain_length - GRAIN_LENGTH_MIN;
    }
}
else //if sensor not triggered, use basic setting
{
    grain_start = 0;
}

Using sensor S2 just to enable/disable the sampling of new sound felt like a waste. (The sensor was chosen for a reason, though: the same could be accomplished by a button, but clicking it would be audible in the grain, and we don't want that.) There is one more thing this sensor can do, resulting in an interesting effect: shifting the sample counter of the second grain buffer against the first one. The two stereo channels then become timed differently, which creates a more widely panned stereo effect, giving the impression of a richer, deeper sound.

At the lowest threshold IR_sensors_THRESHOLD_1 both buffers are written to at the same position, which is at subsequent thresholds shifting by one sixth, fourth, third or a half of the buffer size.

Here the smooth calculation, shifting the offset continuously, would be counter-productive: we would end up shifting the recorded data up and down with each slight variation in the sensor's value, which does not sound great. Instead, let's just define a few threshold levels at which the change in the counter's value occurs. We rely on the GRAIN_SAMPLES constant here, to express the counter's position relative to the allocated buffer's size. Of course, the whole block could be written as a for loop (replacing the threshold defines with numbers), but that has been avoided for the sake of clarity.

//sensor S2 sets offset of right stereo channel against left
if (ADC_last_result[1] > IR_sensors_THRESHOLD_9)
{
    sampleCounter2 = sampleCounter + GRAIN_SAMPLES / 2;
}
else if (ADC_last_result[1] > IR_sensors_THRESHOLD_7)
{
    sampleCounter2 = sampleCounter + GRAIN_SAMPLES / 3;
}
else if (ADC_last_result[1] > IR_sensors_THRESHOLD_5)
{
    sampleCounter2 = sampleCounter + GRAIN_SAMPLES / 4;
}
else if (ADC_last_result[1] > IR_sensors_THRESHOLD_3)
{
    sampleCounter2 = sampleCounter + GRAIN_SAMPLES / 6;
}
else if (ADC_last_result[1] > IR_sensors_THRESHOLD_1)
{
    sampleCounter2 = sampleCounter;
}
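For completeness, here is what the for-loop version mentioned above could look like, as a host-side sketch. The threshold values passed in are hypothetical stand-ins for the firmware's IR_sensors_THRESHOLD_n constants, and offset_for_sensor() is a helper invented for this illustration.

```c
#define GRAIN_SAMPLES_ 22050 /* illustrative stand-in for GRAIN_SAMPLES */

/* Return the right-channel counter offset for a given sensor reading.
   thresholds[] holds four descending levels (standing in for thresholds
   9, 7, 5 and 3); the matching divisors 2, 3, 4 and 6 shift the counter
   by 1/2, 1/3, 1/4 or 1/6 of the buffer. Below the lowest level the
   offset is zero (both channels aligned). */
unsigned int offset_for_sensor(int sensor, const int *thresholds)
{
    const int divisor[4] = { 2, 3, 4, 6 };
    for (int i = 0; i < 4; i++)
    {
        if (sensor > thresholds[i])
            return GRAIN_SAMPLES_ / divisor[i];
    }
    return 0;
}
```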

Activating the Hardware

To enable the sensors and buttons functionality, we need this bit of code somewhere in the loop (so it is executed at regular moments, tens or hundreds of times per second, to make the controls smooth and responsive enough). Here we are using pre-defined macros like TIMING_BY_SAMPLE_EVERY_10_MS that rely on sampleCounter. All this is pretty standard and copied over from other channels, with only a small modification required, as we don't want certain settings to occur (microphones completely off, or the echo going beyond the maximum buffer length, which is shorter here than in other channels).

if (TIMING_BY_SAMPLE_EVERY_10_MS == 0) //100Hz periodically, at 0ms
{
    if (ADC_process_sensors()==1) //process the sensors
    {
        //values from S3 and S4 are inverted (by hardware)
        ADC_last_result[2] = -ADC_last_result[2];
        ADC_last_result[3] = -ADC_last_result[3];

        CLEAR_ADC_RESULT_RDY_FLAG;
        sensors_loop++;

        //enable indicating of sensor levels by LEDs
        IR_sensors_LED_indicators(ADC_last_result);
    }
}

//we will enable default controls too (user buttons B1-B4 to control volume, inputs and echo on/off)
if (TIMING_BY_SAMPLE_EVERY_100_MS==1234) //10Hz periodically, at sample #1234
{
    buttons_controls_during_play();

    //if the setting goes beyond allocated buffer size, set to maximum
    if(echo_dynamic_loop_length > GRANULAR_SAMPLER_ECHO_BUFFER_SIZE)
    {
        echo_dynamic_loop_length = GRANULAR_SAMPLER_ECHO_BUFFER_SIZE; //set to maximum allowed value
    }

    //it does not make sense to have microphones off here, skip this setting and re-enable them
    if(input_mux_current_step==INPUT_MUX_OFF)
    {
        ADC_set_input_multiplexer(input_mux_current_step = INPUT_MUX_MICS);
    }
}

//process any I2C commands in queue (e.g. volume change by buttons)
queue_codec_ctrl_process();
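As a hedged sketch of how such sample-counter timing checks are typically built (the framework's actual macros may differ - the names and the 44.1 kHz rate below are assumptions for illustration): at RATE samples per second, 10 ms corresponds to RATE/100 samples, so taking the counter modulo that period yields zero exactly once every 10 ms. Comparing against a nonzero remainder, as with the ==1234 check above, staggers tasks so they do not all run on the same sample.

```c
#define RATE_ 44100 /* hypothetical sample rate */

/* Remainder of the sample counter within a 10 ms / 100 ms period;
   a remainder of 0 marks the start of each period. */
#define EVERY_10_MS_(counter)  ((counter) % (RATE_ / 100))
#define EVERY_100_MS_(counter) ((counter) % (RATE_ / 10))
```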

 

UPDATE: Progressive version

It was easy enough to expand this into a mode that cycles through the chords of a song, and the result is even more interesting. After updating your unit's firmware, you'll find it under channel #412.