John: “My ADSL broadband connection is only giving me 10Mbps today!”

Martin: “10Mbps? You’re lucky! Not long ago all I had was dual channel ISDN which gave me 128kbps… on a good day”.

John: “Huh, that’s nothing. I used to *dream* of ISDN dual channel communications. Back then, all I had was a 56kbps modem and, what’s more, I used to live in this tiny old house with great big holes in the roof.”

Martin: “56kbps? Luxury! When you were living it up with your fancy 56kbps modem, all I had was a 28.8k modem. And a house?! We used to live in one room, me and twenty-six analogue 1G mobile phones, each the size of a brick!”

John: “Well, when I say 56kbps, what I really mean is 14.4kbps. And it wasn’t really a house, more a corridor”.

Martin: “Oh, we used to fantasize about owning a sexy 14.4kbps modem. And of living in a corridor too. That would have been a palace to us, even with the US Robotics 2400 baud modem we all had to share.”


…and so on! Do any of you have conversations like this? Or is it just John and I?

Just John and I then? OK. I feared as much.

Well anyway, this blog post does actually have a purpose beyond perhaps prompting you to recommend me a good therapist!

John (@jcmrim) and I (@mdwrim) have spent some of this year investigating how to integrate accessories and peripherals with the BlackBerry 10 platform. We’ve written about Bluetooth LE a number of times and published various sample applications and video presentations on the subject. Bluetooth LE is certainly an excellent technology for allowing accessories to communicate with a BlackBerry 10 smart phone. In the past, we’ve written a great deal about Near Field Communication (NFC) too, another technology that allows one device to “talk” to another. There are other options, however, and for today’s blog post we decided to “go old school” and look at how you can use audio and the headphone socket on your BlackBerry 10 smart phone as an interface to other devices.

Now this may sound like a far-fetched idea, but it really isn’t; there are a number of commercial accessories on the market already using this very approach.

This is an integration method that is in use today, and there’s no reason why you couldn’t consider it for your own accessory and BlackBerry 10, should you have good reason not to use an alternative such as Bluetooth.

John and I discussed the various theoretical aspects of using audio to encode and transmit data and how we might exploit this in a real application. Let’s start with the theory part of our discussion before turning our attention to a real, sample application.

The Theoretical Bit (sic)

So, how can you use audio to send and/or receive digital data? Well, this is precisely what modems do (which was my excuse for the nostalgic trip down my own particular, technology memory lane in the opening of the article). Modems take digital data and turn it into an analogue representation that can be transmitted as an audio signal. This is called “modulation”. In addition, on receipt of such a signal, they can reverse the process and “demodulate” the analogue signal back into the digital data, hopefully the same digital data that was originally transmitted. Oh, and they make a really cool sound whilst they’re doing it too!

Download: dial_up_modem_noises.mp3

Modems like my old US Robotics 2400 are of course pieces of hardware. But you can perform the same modulation/demodulation process in software. Those of you who like to dabble with an Arduino may be familiar with the SoftModem Library, for example.

To use this technique in a BlackBerry 10 application so that it can communicate with an external device, via the headphone socket, you need three key ingredients in or available to your application:

  1. Access to the BlackBerry 10 audio sub-system via an API
  2. A suitable modulation/demodulation algorithm
  3. A data coding scheme with which to represent your digital data, including a suitable error detection and, perhaps, correction mechanism

We can depict these three magic ingredients, together with the application data, as layers in a stack:


Encoding and Modulation/Demodulation Choices

There are lots of ways you could choose to encode text as byte values with a suitable error detection/correction mechanism and, equally, there are many modulation schemes that you could consider. Now, I’ve never delved into the world of signal processing before, but luckily John knows a lot about radio (one of his hobbies), so I was able to benefit from his vastly greater knowledge. John came up with the specifics relating to data coding and modulation that we used in our app and, knowing that I was in the presence of a somewhat superior being, I mutely (there’s an audio joke in there somewhere!) did as I was told.

Data Coding Scheme

To encode our simple text in such a way that we could safely transmit it, we used something called Baudot code, which represents characters with only 5 bits, to which we add 2 parity check bits. Baudot code was used in early communications equipment like Telex machines and arguably derives from systems devised in the 16th century, seriously old school stuff! But it’s simple and…. it works, at least insofar as our proof of concept (PoC) application requirements are concerned. With only 5 bits per character, we’re restricted to sending upper case ASCII characters only, but again, for a PoC this was acceptable to us.

Let’s take a look at an example.

Imagine we receive the following byte value: 00011010. How would we decode it according to our chosen scheme?

The first thing to know is that the data is transmitted with the least significant bit first. So, reordering, what we really have is 01011000.

Bit 7 of our reordered byte is always 0 in this scheme. The next two bits, 10, are our parity check bits, whilst the remaining 5 bits, 11000, are the data payload. For an even number of 1-bits in the payload (11000 has two) the check bits must be 10, whilst for an odd number they must be 01.

So, in this example we see that the parity check passes and the message is 11000, or 0x18 in hex.

The final twist is that this value represents a 5-bit encoding of an ASCII character, with its code point (value) derived as an offset from the code point of ASCII “0”, which is 0x30. To find out which ASCII character 0x18 represents, you add 0x30 to 0x18, and that gives us 0x48, which is (drum roll)….. “H”. What fun :)
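To make the walkthrough concrete, here’s a minimal C sketch of those decoding steps. It’s our own illustration; the function names and the exact bit layout are our reading of the scheme, not code from the sample app:

```c
/* Reverse the bit order of a byte: the scheme transmits the least
 * significant bit first, so the receiver must reorder. */
static unsigned char reverse8(unsigned char b)
{
    unsigned char r = 0;
    for (int i = 0; i < 8; i++) {
        r = (unsigned char)((r << 1) | ((b >> i) & 1));
    }
    return r;
}

/* Decode one received byte: verify the parity check bits and return
 * the ASCII character, or -1 on a parity failure.
 * Layout after reordering: bit 7 = 0, bits 6-5 = check, bits 4-0 = payload. */
static int baudotDecode(unsigned char received)
{
    unsigned char r = reverse8(received);
    unsigned char check = (r >> 5) & 0x3;
    unsigned char payload = r & 0x1F;

    /* Count the 1-bits in the 5-bit payload */
    int ones = 0;
    for (int i = 0; i < 5; i++) ones += (payload >> i) & 1;

    /* Even number of 1s -> check bits must be 10; odd -> 01 */
    unsigned char expected = (ones % 2 == 0) ? 0x2 : 0x1;
    if (check != expected) return -1;

    return payload + 0x30; /* offset from ASCII '0' */
}
```

Feeding in the example byte 00011010 (0x1A) reorders it to 01011000, passes the parity check and yields 0x18 + 0x30 = 0x48, ASCII “H”.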

Modulation / Demodulation

John suggested we could use something called “AFSK” (Audio Frequency Shift Keying). I said “yes boss!”

You’ll find a reference to frequency shift keying at the end of this post. Basically, it involves the transmission of sine waves for a given length of time, at one frequency to represent a binary 1 and at another frequency to represent a binary 0. We used 3150Hz, a high tone, to represent a 1 and 1575Hz, a lower frequency, to represent a 0. The image below, taken from the audio application Audacity, shows this:


Now, sending a single cycle of a sine wave for a single bit value will not work. The tone’s duration would be far too short for reliable decoding. So in practice we send a series of sine wave cycles at the required frequency such that the tone is emitted for a greater length of time. The following image depicts this:


It’s pretty easy to appreciate what’s going on in that last image:

  • Time progresses from left to right
  • On the left there is a long preamble of a high frequency tone (HIGH)
  • This is followed by 4 intervals of a lower frequency tone (LOW)
  • Then 2 HIGH
  • Then 1 LOW
  • Then 1 HIGH
  • Then 1 LOW
  • Then 2 HIGH
  • Then “silence”

The preamble of high frequencies (1s) is a technique for indicating to the receiver that some data is about to be transmitted to it. The first 0 after this, a low frequency pulse, indicates that the next 8 bits contain data; it’s called the START bit for that reason. This framing comes from something called “Asynchronous Serial Data Transmission”.
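The framing just described can be sketched in C. This is purely illustrative; in particular, the preamble length is a parameter here because we’re not claiming to know the value the real application uses:

```c
/* Frame one data byte for Asynchronous Serial Data Transmission:
 * a preamble of 1s, a 0 START bit, the 8 data bits sent least
 * significant bit first, then a 1 STOP bit.
 * Returns the number of bit values written to out[]. */
static int frameByte(unsigned char data, int preambleBits, int out[])
{
    int n = 0;

    /* Preamble of high tones (1s) so the receiver can lock on */
    for (int i = 0; i < preambleBits; i++) out[n++] = 1;

    out[n++] = 0; /* START bit: the first low tone after the preamble */

    /* Data bits, least significant bit first */
    for (int i = 0; i < 8; i++) out[n++] = (data >> i) & 1;

    out[n++] = 1; /* STOP bit */

    return n;
}
```

Each 1 in the resulting sequence then becomes a burst of the high (3150Hz) tone, and each 0 a burst of the low (1575Hz) tone.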


Audio APIs and Sampling

BlackBerry 10 has extensive APIs for working with the audio system on the device, and there have been other blog posts about this (a reference to the API documentation is included below).

We have a couple of key considerations to address in the context of our application and its use of audio for data communications. First, when capturing or transmitting, we need to work at a particular sample rate. So, what is sampling, and what is a sample rate, exactly?

With our modulation/demodulation strategy, we’re converting digital data, characterized by discrete values in a fixed range, to analogue data, which is characterized by being “continuously variable”. I think of it like this: analogue systems are capable of nice smooth curves, whereas digital systems can only manage straight lines or blocks. So, if I want to express the idea of a curve in the digital world, I use a series of short, straight lines and if those lines are short enough then, when placed next to each other, my creation will look pretty much like a curve. Here’s a crude representation of that concept, with the red curve representing an analogue signal and my blocks representing a digital approximation of that curve:


Hopefully, you can see at a glance that if I use a small number of “large grained” digital values to represent my analogue curve, I will end up with a very poor approximation indeed. But if I use a very large number of very fine lines or blocks I will produce a good likeness.

And this is what sampling is about. Imagine a sound that lasts for 1 second. If I divide that sound into 4 quarter-second segments (I’d say my sample rate here was therefore “4”), and measure each segment to produce a single digital value corresponding to the signal during that quarter second, then this is just like producing a crude, blocky approximation of the original sound. But if (plucking a number out of the air) I were to “sample” the analogue signal, say, 44,100 times a second, I would end up with quite a fine and accurate representation of my analogue sound, albeit requiring a large number of digital values as a result.

Thus, a high sample rate will give us a good quality digital representation of a sound.

Now in our case, as you’ll see when we look at the code behind our application, we’re going to be generating our high frequency (1) and low frequency (0) tones using quite a low level API, and we’ll be providing those digital sample values, which the audio hardware will then convert to an analogue signal. We need to vary those values at the required frequency to get the tone we want. But in what manner? We could toggle abruptly from (+amplitude) to (-amplitude) at that frequency if we wanted to, giving a square wave. Or, as was our choice, we could vary our amplitude values along a mathematical curve, such as the one we’d recognize as a “sine wave”.

This sample-based approach to representing an analogue signal is known as Pulse Code Modulation, or PCM for short. There is of course a very solid mathematical foundation for all of this, but that’s another story altogether.

The Practical Bit – The AudioAccessory Application

We have a new sample application called AudioAccessory that we developed to illustrate the basic approach to using audio as a means of integrating devices as described in “The Theoretical Bit”. AudioAccessory lets you type some text and transmit this text as data, encoded as a series of audible beeps via the headphone socket. It can also receive audio via the microphone and if it’s correctly encoded, convert the audio into text and display it on the UI.

We didn’t develop every line of code in AudioAccessory from scratch. In fact, we plundered a couple of juicy sample applications that our colleagues Roberto Speranza and Gurtej Sandhu have already published in our GitHub repositories. I’ve included links to those applications at the end of this blog article. So thanks, Roberto and Gurtej, your samples made our job so much easier!

The Physical and Electrical Layers

Before we get into the detail of our AudioAccessory application though, let me fill in a couple of gaps in the theory. I presented a stack in the theory section but it wasn’t the complete stack. Lurking at the very bottom are of course the electrical and physical layers. If you intend to plug a gizmo into a BlackBerry 10 smart phone’s headphone socket, you’ll need to do so with the right kind of plug and for that plug to exhibit the right electrical properties.

The physical part is easy enough; the correct type of plug is a 4-conductor “TRRS” (Tip, Ring, Ring, Sleeve) plug. Here’s a picture of one in case you’re not familiar with the term TRRS:


If all you want to do is output audio, then there are no particular considerations regarding the electrical properties. But if you’re intending to use the headphone socket as an external microphone socket and receive audio for decoding, then for the BlackBerry 10 device to detect what it believes to be a microphone plugged into that socket, it must find a specific set of impedances between the pins: nominal values of 32 ohms, 32 ohms and 2 kohms between ground (Ring 2) and each of the T, R1 and S pins respectively.

As you’ll see in the accompanying video, we found a way of cheating here which didn’t require any soldering :-)

The AudioAccessory Application

Here’s what the UI for AudioAccessory looks like:


If you want to see the application in action, we’ve also published a short YouTube video to accompany this blog post and you’ll find it in our BlackBerry Developer YouTube channel. There’s a link at the end of this post.

AudioAccessory Architecture

AudioAccessory uses a form of the Producer/Consumer pattern to decouple input and output processing, each of which works at quite different rates.

Capturing audio

When capturing audio, one thread, the “capture thread”, reads PCM audio samples from the BlackBerry 10 audio system and deposits the data into a circular buffer. Meanwhile, another thread, the “demodulate thread”, performs blocking reads on the circular buffer, demodulates data that appears there and hands it to the higher layers of the application via callbacks.
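A minimal version of such a circular buffer might look like the following. Note that this is a single-threaded sketch of the data structure only; the real application also needs the mutex and condition-variable plumbing that makes the demodulate thread’s reads blocking:

```c
#define RING_SIZE 4096  /* capacity in samples (assumed) */

typedef struct {
    short data[RING_SIZE];
    int   head;   /* next write position */
    int   tail;   /* next read position */
    int   count;  /* samples currently buffered */
} RingBuffer;

/* Deposit one PCM sample; returns 0 if the buffer was full. */
static int ringWrite(RingBuffer *rb, short sample)
{
    if (rb->count == RING_SIZE) return 0;
    rb->data[rb->head] = sample;
    rb->head = (rb->head + 1) % RING_SIZE;
    rb->count++;
    return 1;
}

/* Remove the oldest sample; returns 0 if the buffer was empty. */
static int ringRead(RingBuffer *rb, short *sample)
{
    if (rb->count == 0) return 0;
    *sample = rb->data[rb->tail];
    rb->tail = (rb->tail + 1) % RING_SIZE;
    rb->count--;
    return 1;
}
```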


Emitting Tones

When we emit text encoded as audio tones, our text gets split into single character “messages” and placed into a queue by one thread, whilst another thread fetches messages from the queue and performs the encoding, modulation and emission of the tone via the BlackBerry audio APIs.


Let’s follow through the steps required to take some text entered by the user on the UI through to the point at which it is transmitted as a series of audio tones. As mentioned in the architecture section, we use a producer/consumer-style pattern, and the text entered by the user gets stored in a queue, one character at a time. Given the 5-bit encoding scheme we’re using, we make sure all characters are upper case as we do this.

void AudioService::addMessageToQueue(const QString &text)

– Adding text “messages” to processing queue

You’ll find the addMessageToQueue function in our AudioPcm.c source file and, if you take a look, you’ll see how it stores our characters away in memory in a linked list created using C pointers. If you prefer, I see no reason why you could not use one of the higher level Qt classes for this.
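For illustration, a character queue built as a linked list with C pointers might look like this (our own sketch, not the actual code from AudioPcm.c):

```c
#include <stdlib.h>

/* One queued single-character "message" */
typedef struct MsgNode {
    char            character;
    struct MsgNode *next;
} MsgNode;

typedef struct {
    MsgNode *head;  /* oldest message, dequeued first */
    MsgNode *tail;  /* newest message */
} MsgQueue;

/* Append one character to the queue; returns 0 on allocation failure. */
static int enqueueChar(MsgQueue *q, char c)
{
    MsgNode *node = malloc(sizeof *node);
    if (node == NULL) return 0;
    node->character = c;
    node->next = NULL;
    if (q->tail != NULL) q->tail->next = node;
    else q->head = node;
    q->tail = node;
    return 1;
}

/* Remove the oldest character; returns 0 if the queue is empty. */
static int dequeueChar(MsgQueue *q, char *c)
{
    if (q->head == NULL) return 0;
    MsgNode *node = q->head;
    *c = node->character;
    q->head = node->next;
    if (q->head == NULL) q->tail = NULL;
    free(node);
    return 1;
}
```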

So, we accumulate characters in our queue. Meanwhile, in the generateMessages() function, we wait for characters to arrive in the queue and then encode, modulate and transmit our data as audio. Here are a few of the key lines of code:

– Encode, modulate and transmit characters

Data Coding

In the baudotEncoding function, we convert a single character into a byte with a value according to the Baudot Code.

static int baudotEncoding(int number)

– Applying Baudot coding to a character
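An encoder consistent with the coding scheme described in the theory section might be sketched like this; the function name and the exact bit layout are our own assumptions, not the sample app’s implementation:

```c
/* Encode one ASCII character (in the range '0' upwards) as a byte
 * ready for LSB-first transmission: bit 7 = 0, two parity check
 * bits, then the 5-bit payload (the character's offset from '0'). */
static int encodeBaudot(int character)
{
    int payload = character - 0x30;
    if (payload < 0 || payload > 0x1F) return -1; /* out of 5-bit range */

    /* Count the 1-bits in the payload */
    int ones = 0;
    for (int i = 0; i < 5; i++) ones += (payload >> i) & 1;

    /* Even number of 1s -> check bits 10; odd -> 01 */
    int check = (ones % 2 == 0) ? 0x2 : 0x1;

    int value = (check << 5) | payload; /* bit 7 is already 0 */

    /* Reverse the bit order so the least significant bit goes out first */
    int wire = 0;
    for (int i = 0; i < 8; i++) wire = (wire << 1) | ((value >> i) & 1);
    return wire;
}
```

This is the mirror image of the decoding walkthrough earlier: encoding “H” produces the wire byte 00011010.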


In the modulateAndTransmit() function we implement the Asynchronous Serial Data Transmission (ASDT) functionality, creating sine waves of one of two frequencies, each for a given duration, to create a modulated, analogue representation of each of the bits in a Baudot encoded character, plus the additional bits (e.g. start and stop bits) required by ASDT. Having done so, we use the audio APIs to emit tones corresponding to those audio waves.

Here’s how we generate the data representing our sine waves in AudioAccessory:

You’ll find this code in the generateTone function of AudioPcm.c. The variable frequency takes one of two values, FREQUENCY_HIGH or FREQUENCY_LOW according to whether we’re modulating a 1 or a 0 of course.

Emitting Audio

In modulateAndTransmit, you’ll see that the final steps are to emit tones, each representing a single bit and overall, comprising the bits required by the Asynchronous Serial Data Transmission scheme.

In the writeTone function, we use the asound library functions to prepare and then emit a tone.

snd_pcm_plugin_write takes three parameters, comprising a handle to an audio device and the start address and size of a buffer containing our PCM sample.

That’s all folks!

So that’s it for our introduction to the world of audio based accessory integration. We hope you’ve found this interesting and look forward to seeing what you use these techniques for in your own applications.



Excerpt from:  

Beep Beeep Beeeeeeep. And Spandex.