Embedded Binary

24 Jun 2014

I was playing around with a bare metal programming project where the goal was to play a sound from a raw audio file by pushing the content of the file, byte by byte, onto a special memory location in order to generate audio output.

In this project I try to keep things very minimal, so I don't have a C library to manipulate files. I don't even have an operating system running. It's just the raw binary code wreaking havoc directly on the hardware.

Now how do I load an audio file when I don't have any file operations or a filesystem? Well, one solution is to embed the audio file directly into the binary. This can be done using the GNU toolchain, and I will show you how to do it.

The first thing I had to do was to generate the raw data. I have an MP3 file as input and I would like to generate an audio file with 8-bit PCM samples. For this job I'm using ffmpeg.

ffmpeg -i nokia-tune.mp3 -f u8 -acodec pcm_u8 -ar 8000 sound

This will produce a file called sound, which contains headerless unsigned 8-bit samples at 8000 Hz. This is the data that we want to embed into our program. In order to place raw data together with compiled code we need to convert the raw data into an object file, which is usually a file in the ELF format. We can do this using the ld tool.

ld -r -b binary sound -o sound.o

-r means that we want to generate a relocatable object file. We need this because we want to link it with another object file before we can produce the final executable. -b binary tells ld to treat the input file sound as raw binary data instead of the default, which is an ELF object file. -o sound.o tells ld where to put the resulting file.

We can now inspect the generated object file to see what ld has created. For this we will use the nm tool, which lists the symbols in object files. We could also use the readelf tool, but I prefer the clean output from nm.

$ nm sound.o 
000000000001e000 D _binary_sound_end
000000000001e000 A _binary_sound_size
0000000000000000 D _binary_sound_start
$ wc -c sound
122880 sound

This output tells us that ld has created an object file which contains 3 symbols. _binary_sound_end and _binary_sound_start are placed in the data section (the D in the second column), while _binary_sound_size is an absolute symbol (the A) whose value is 0x1e000. Notice that the raw sound file is 122880 bytes long, and if we convert that into hex we get 0x1e000.
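
The symbol names are not random. As far as I can tell, ld builds them from the input path exactly as it was given on the command line, and replaces every character that is not a letter or a digit with an underscore. Here is a rough sketch of that rule in C. The helper binary_symbol_name and the example path audio/nokia-tune.raw are made up for illustration; they are not part of binutils.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of binutils): writes the symbol name that
 * "ld -b binary" is expected to derive from `path` into `out`.
 * `suffix` is "start", "end" or "size".
 * Returns 0 on success, -1 if the name does not fit in `out`. */
int binary_symbol_name(const char *path, const char *suffix,
                       char *out, size_t out_len)
{
    int n = snprintf(out, out_len, "_binary_%s_%s", path, suffix);
    if (n < 0 || (size_t)n >= out_len)
        return -1;

    /* Only the part that came from the path needs mangling. */
    size_t prefix_len = strlen("_binary_");
    size_t path_len = strlen(path);
    for (size_t i = 0; i < path_len; i++) {
        if (!isalnum((unsigned char)out[prefix_len + i]))
            out[prefix_len + i] = '_';
    }
    return 0;
}
```

With this rule, the input sound and the suffix start give _binary_sound_start, and audio/nokia-tune.raw with the suffix end gives _binary_audio_nokia_tune_raw_end. So if you run ld from another directory or add a file extension, expect the names you declare in C to change as well.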

To read this data we need to create a C program which links with these symbols and uses them.

#include <stdio.h>

extern unsigned char _binary_sound_end;
extern unsigned char _binary_sound_size;
extern unsigned char _binary_sound_start;

int main(void)
{
  // %p is the format for pointers and expects a void * argument
  printf("start=%p\n", (void *)&_binary_sound_start);
  printf("end=%p\n", (void *)&_binary_sound_end);
  printf("size=%p\n", (void *)&_binary_sound_size);
}

We know the names of the symbols in the sound.o file, so we declare those symbols as extern and give them an unsigned char type. The code will link with the external symbols, and we print out the addresses of the 3 symbols, which we will use later. This is how to compile and run this code.

$ gcc main.c sound.o -o main
$ ./main
start=0x601034
end=0x61f034
size=0x1e000

This output shows where in the program's address space the start and end of the sound data are located. It also shows that there are two ways to find the size of the binary data: take the address of the size symbol, or take the difference between the start and end addresses (0x61f034 - 0x601034 = 0x1e000). Now let's iterate over all the binary data.

#include <stdio.h>

extern unsigned char _binary_sound_end;
extern unsigned char _binary_sound_size;
extern unsigned char _binary_sound_start;

int main(void)
{
  // Using pointers: walk from the start symbol up to the end symbol
  unsigned char * begin = &_binary_sound_start;
  unsigned char * end = &_binary_sound_end;
  while (begin != end)
  {
    printf("%02x\n", *begin++);
  }

  // Using an array index: the address of the size symbol is the byte count
  size_t size = (size_t)&_binary_sound_size;
  unsigned char * data = &_binary_sound_start;

  size_t i;
  for (i=0; i<size; i++)
  {
    printf("%02x\n", data[i]);
  }
}
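
On the bare metal target there is no printf, so the loop body becomes a store to the audio output location instead. Below is a minimal sketch of how that could look. The function names embedded_size and play_samples are my own, the address of the output register depends entirely on the hardware, and the delay between samples is left out because it is hardware-specific too.

```c
#include <stddef.h>

/* Number of bytes between the start and end symbols. */
size_t embedded_size(const unsigned char *start, const unsigned char *end)
{
    return (size_t)(end - start);
}

/* Hypothetical playback loop: copies every sample to the output location,
 * one byte at a time. `out` is volatile so the compiler keeps every store
 * instead of collapsing them into the last one.
 * Returns the number of bytes written. */
size_t play_samples(volatile unsigned char *out,
                    const unsigned char *begin, const unsigned char *end)
{
    size_t written = 0;
    while (begin != end) {
        *out = *begin++;
        written++;
        /* A real player would wait one sample period here
         * (1/8000 s for the 8000 Hz file generated above). */
    }
    return written;
}
```

On the target this might be called as play_samples(AUDIO_OUT, &_binary_sound_start, &_binary_sound_end), where AUDIO_OUT is a placeholder for whatever address your board uses. I use the end minus start difference for the size here because, on toolchains that build position independent executables by default, the address of _binary_sound_size may be shifted by the load address, so the pointer difference looks like the safer choice.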

Now that you know how to add binary data directly into your binary, why should you ever care about files anymore?