SERIALIZATION FORMATS

Consistent Overhead Byte Stuffing (COBS)

Date Published:
Last Modified:

Overview

Consistent overhead byte stuffing (COBS) is an encoding algorithm for framing serial data which provides a consistent, guaranteed maximum overhead per quanta of data (i.e. fixed amount of overhead per n bytes of data). Framing allows the use of a special byte value which can be used mark the end of the packet, so that a decoder on the other end of a serial communication bus/network can reliably split up one packet from the next and decode the data.

The Encoding/Decoding Process

COBS encoding transforms bytes in the range [0, 255] to [1, 255] so that 0x00 (or any other one byte) can be used to unambiguously mark the end of the packet. The encoding process is best explained with a diagram as below:

A step-by-step diagram of the COBS encoding process.

A step-by-step diagram of the COBS encoding process.

Dealing With 255 Bytes

The one problem with using a single byte to store where the next 0x00 arises when the next 0x00 is more than 255 bytes away. It should now be obvious why, as a single byte cannot store a number higher than 255 (0xFF). So a special thing happens when this occurs — if you encounter a “pointer” byte with 0xFF, it doesn’t mean the next 0x00 is 255 bytes away, it means that there were no 0x00's in the next 254 bytes, and that the 255th byte will contain the next pointer.

Because we have reserved the pointer value 0xFF, we can only use the pointer values 0x01 through to 0xFE to indicate where a real 0x00 is next in the data.

An follow-on effect of this rule is that for a large amount of data containing no 0x00's, a 0xFF pointer byte will be added after every 254 bytes. This is where the Consistent Overhead part of the COBS acronym comes from. Many other packetization/framing algorithms suffer from the fact that the payload can increase significantly if the data contains the specific probelmatic values (e.g. many algorithms produce a payload which is double the size of the original data!).

Some Examples

One of the quickest ways to understand how COBS work is to look a simple examples of raw and encoded data. The following table shows basic examples which demonstrate the encoding process and highlight the edge-cases. All data is byte-wise in hexidecimal.

Raw Data (hex)Encoded Data (hex)
0001 01 00
0102 01 00
0202 02 00
0302 03 00
00 0001 01 01 00
00 0101 02 01 00
01 02 03 ... FD FE FFFF 01 02 03 ... FD FE 02 FF 00

Source Code

I wrote C++ functions which can peform COBS encoding and decoding as part of an open source serial communications protocol, they can be found at https://github.com/gbmhunter/SerialFiller/blob/master/src/CobsTranscoder.cpp.


Related Content:

Tags:

comments powered by Disqus