GeckoLoader: A Utility for the EFM32 Bootloader

Are you looking for a CLI or GUI utility to upload programs to EFM32 microcontrollers through the factory-programmed bootloader? Look no further.


Why EFM32 microcontrollers?

EFM32 microcontrollers are my first choice for low-power sensor-based systems because, well, they are “Energy Friendly Microcontrollers”, providing several features specially designed for that purpose and a nice set of peripherals (see Silicon Labs’ website for the full picture). With a 32-bit ARM Cortex processor at the core, performance is not a concern. They also support the GCC ARM Embedded toolchain which, in my opinion, is a HUGE plus, making them really nice and easy to get started with, without code size limits or having to pay for a commercial toolchain (like Keil or IAR). Silicon Labs provides all the startup code, linker scripts, an easy-to-use and well-documented peripheral library, tools to monitor power consumption and even an Eclipse-based IDE. Everything comes bundled in Simplicity Studio, which can be downloaded for free.

Disclosure: I don’t work at Silicon Labs and I am not in any way associated with them. I’m just an EFM32 enthusiast 🙂


A great feature of EFM32 microcontrollers is the factory-programmed UART bootloader, which may be used to upload programs into the microcontroller instead of using a commercial programmer. The bootloader uses the XMODEM-CRC protocol to transfer data and, as mentioned in application note AN0003 (UART Bootloader), TeraTerm, which supports that kind of transfer, may be used. However, if you want the capability of uploading a program through your own application, a command-line utility is what you need. Look no further: efm32_loader can run in CLI mode (when arguments are provided) or GUI mode (when none are).

Gecko Loader

CLI mode usage:

UART: efm32_loader.exe <port_name> <bin_file> uart <boot_pol>
USB:  efm32_loader.exe <port_name> <bin_file> usb
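
For example, to flash a program over UART on COM3 (the port name, binary file and BOOT pin polarity below are illustrative; adjust them to your setup):

efm32_loader.exe COM3 blink.bin uart 1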

Regarding hardware, all you need is a USB-to-UART converter connected to your computer. Connections are as follows:

TX  -- BOOT_RX (E11)
RX  -- BOOT_TX (E10)
DTR -- BOOT_EN (DBG_SWCLK)
RTS -- RESET

Please be aware that, in order to prevent the bootloader from being overwritten, the linker script must be modified as described in application note AN0003.

The source code is available on GitHub:

https://github.com/cidadao/efm32_loader

And there’s also a Windows executable, available here:

Download

Bug reports? Suggestions or feature requests? Please use GitHub or the comment section below.

Update (29/09/2015): added support for USB bootloader

The Art of Fixed-Point Representation

Have you been using float or double variables to perform mathematical operations on embedded systems without a Floating-Point Unit (FPU)? You are doing it wrong! That’s incredibly inefficient. Use fixed-point representation instead.

An FPU is a hardware block specially designed to carry out arithmetic operations on floating-point numbers. Your C/C++ code may work without an FPU, but it’s always much faster to use hardware designed for a specific purpose than to rely on a software implementation: the compiler will emulate floating-point for you, respecting your hardware’s restrictions, but not in an efficient manner. Essentially, it will generate a lot of assembly code, greatly increasing the size of your program and the time required to complete each operation. Thus, if you don’t have an FPU available and you still want to perform those arithmetic operations efficiently, you’ll have to convert those numbers to fixed-point representation. Integers! But how? By scaling them. Let’s see how that scaling value may be determined.

The scaling value, as well as the resulting scaled number (an integer), depends largely on the bitness of the CPU architecture being used. You want to use values that fit in the available registers, which have the same width as the CPU buses. So, whether you are working with an 8, 16 or 32-bit architecture, the range of integer values we can store in those registers, where b is the number of bits and numbers are represented in two’s complement, is given by:

    \[-2^{b-1} \leq value \leq 2^{b-1} - 1\]

Fixed-Point Representation

If one bit is used to represent the sign (and in this text we’ll always consider signed numbers), the remaining ones may be used to represent the integer and fractional parts of the floating-point number. We may textually represent this format as follows (denoted as Q-format):

    \[Qm.n\]

Where m corresponds to the bits available to represent the integer part and n corresponds to the bits available to represent the fractional part. If m is zero you may write just Qn. So, when you use a register to store both integer and fractional parts (and the sign bit!), the value range is given by:

    \[-2^{m} \leq value \leq 2^{m} - 2^{-n}\]

(note that the integer-only range given earlier is a particular case of this expression, with n=0 and m=b-1).

It’s up to you to decide how many bits are reserved for m and n (still, you should base your decision on good criteria: the more bits, the greater the precision you can achieve; more on this below). So, you are essentially fixing an imaginary point in your register that separates the integer and fractional parts. That’s why it’s called fixed-point.

Now, consider the following floating-point number:

    \[x=0.123456\]

Since the integer part is zero, you have n=b-1 bits to represent the fractional part. You do that by multiplying the floating-point number by 2^n. And that’s our scaler! Simple as that. For 8, 16 and 32-bit architectures, these are the resulting scalers and corresponding scaled values (i.e. the floating-point number represented in fixed-point):

Q-format   Scaler   x_scaled            x_rounded
Q7         2^7      15.802              16
Q15        2^15     4045.4              4045
Q31        2^31     265119741.247488    265119741

Yes, after scaling, the result may still be a floating-point number, so you have to round it to get an integer.
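
In C, that scale-and-round step can be written as a pair of small helpers. This is just a sketch, assuming Q15 and names of my own choosing:

#include <stdint.h>
#include <math.h>

/* Convert a float in [-1.0, 1.0) to Q15: scale by 2^15, round to nearest.
   (A production version would also clamp values very close to 1.0.) */
static inline int16_t float_to_q15(float x)
{
    return (int16_t)roundf(x * (1 << 15));
}

/* Convert back to float, useful for checking results. */
static inline float q15_to_float(int16_t x)
{
    return x / (float)(1 << 15);
}

For example, float_to_q15(0.123456f) returns 4045, matching the Q15 row of the table above.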

Precision (Or The Lack of It)

Using finite word lengths (i.e. 8, 16 or 32-bit registers/variables) limits the precision with which we can represent a number. The more bits we have, the greater the precision we can achieve. It’s a trade-off: usually, the fewer bits available (in the architecture as well as in physical memory), the cheaper the hardware.

Thus, it is important to answer the question: “how many bits do I need to represent a floating-point number in fixed-point format while retaining the same accuracy?”

The answer is given by this expression:

    \[d\,\log_2 10 \simeq 3.3\,d \quad \text{bits}\]

Where d is the number of decimal digits of the fractional number. For example, to retain the accuracy of 0.123456 we need at least 3.3 \times 6 = 19.8 \simeq 20 bits.

Arithmetic Operations

Now if you want to perform arithmetic operations with fixed-point numbers, there are some rules to follow.

Multiplication

Multiplying two numbers with the same base and different exponents follows the rule x^a \times x^b=x^{a+b}. Since each operand can hold 2^{b-1} possible signed values, multiplying them gives (2^{b-1})^2=2^{2b-2} possible signed results. That means we need twice the space to store the result of a multiplication, and that the result contains 2 sign bits. For example, if both operands are 16-bit values, we’ll need a 32-bit register to store the result, where the two most significant bits are sign bits and the remaining 30 bits contain the result.

The same principle applies to fixed-point multiplication and the output Q-format is given by:

    \[Qm_1.n_1 \times Qm_2.n_2=Q(m_1+m_2).(n_1+n_2)\]

In the example above, if we represent the operands in Q15 format, then we’ll get a result in Q30 format. To remove the extra sign bit you only need to shift the result left by one. You may also want to reuse the result in another multiplication; in that case, it’s useful to convert it to the same Q-format as the operands, which means truncating it. Discarding bits that contain information results in loss of precision, so to reduce that loss the result is rounded before truncation. Adding half the weight of the last discarded bit (i.e. 2^{n-1} when discarding n bits) does the rounding. Truncation is then accomplished by right-shifting the result by n bits, or by (n+1) bits in case the extra sign bit was already discarded (by left-shifting).

The following code snippet illustrates these operations by multiplying 0.123456 by itself. It can also be used to test fixed-point multiplication for different Q-formats.
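
A minimal sketch of what that snippet can look like (all names are mine, and depending on the exact rounding convention the last digit of a result may differ by one):

#include <stdio.h>
#include <stdint.h>
#include <math.h>

#define N 15    /* number of fractional bits (QN format) */

int main(void)
{
    const double scaler = (double)(1LL << N);
    double x = 0.123456;

    /* scale and round the operand to QN fixed-point */
    int32_t x_fixed = (int32_t)round(x * scaler);

    /* the product of two QN numbers is in Q2N and needs twice the bits */
    int64_t product = (int64_t)x_fixed * x_fixed;

    /* round (add half the weight of the last discarded bit),
       then truncate back to QN by right-shifting N bits */
    int32_t result = (int32_t)((product + (1LL << (N - 1))) >> N);

    printf("fixed: %ld   float: %f\n", (long)result, result / scaler);
    return 0;
}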
By running it with Q7, Q15 and Q31 formats we can build the following table:

Format   Result (Fixed)   Result (Float)   Error
Float    -                0.015241         -
Q7       2                0.015625         0.000384
Q15      500              0.015259         0.000017
Q31      32730621         0.015241         0

As you can see, the result obtained with Q31 is the only one where the error is zero. That’s because it’s the only format in which we can retain the same accuracy as the floating-point numbers used (the multiplication operands).

Division

Division is trickier. When dividing two numbers with the same base and different exponents, the exponents are subtracted: x^a / x^b = x^{a-b}, which means:

    \[Qm_1.n_1 / Qm_2.n_2 = Q(m_1-m_2).(n_1-n_2)\]

If both operands were represented in the same format, you would get the result in Q0. That’s a problem! But it has a very simple solution: if we convert the dividend to Q(2 \times n) format, then we get the result in Qn. This is done by left-shifting the dividend by n. We also want to round our values; in division this is accomplished by adding half the divisor to the dividend before proceeding with the division itself. The following code snippet illustrates these operations.
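
The same scaffolding as before works for division (again a sketch with names of my own, assuming positive operands):

#include <stdio.h>
#include <stdint.h>
#include <math.h>

#define N 15    /* number of fractional bits (QN format) */

int main(void)
{
    const double scaler = (double)(1LL << N);
    double x = 0.123456, y = 0.654321;

    int32_t x_fixed = (int32_t)round(x * scaler);
    int32_t y_fixed = (int32_t)round(y * scaler);

    /* pre-scale the dividend to Q2N so the quotient comes out in QN,
       and add half the divisor to round the result */
    int64_t dividend = ((int64_t)x_fixed << N) + (y_fixed / 2);

    int32_t result = (int32_t)(dividend / y_fixed);

    printf("fixed: %ld   float: %f\n", (long)result, result / scaler);
    return 0;
}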

There’s still another “trick”. Consider the following two operands and divisions:

    \[x = 0.123456 \qquad y = 0.654321  \]

    \[x/y = 0.18868 \qquad y/x = 5.3\]

The result of x/y can be represented with m = 0 (i.e., Q7, Q15, Q31). However, for y/x at least 3 bits must be reserved for the integer part. Hence, possible formats to use in this operation, for both operands and, consequently, the result, would be Q3.4, Q3.12 and Q3.28.
Unlike multiplication, the division of two fractional numbers with no integer part may or may not produce a number with an integer part. This is of extreme importance when choosing the most appropriate Q-format.

Again, the following tables can be built:

Format   x/y Result (Fixed)   x/y Result (Float)   Error
Float    -                    0.18868              -
Q7       23                   0.179688             -0.008992
Q15      6182                 0.18866              0.000012
Q31      405182993            0.188679             0.000002

Format   y/x Result (Fixed)   y/x Result (Float)   Error
Float    -                    5.3                  -
Q3.4     160                  10.0                 4.7
Q3.12    21737                5.306885             0.006885
Q3.28    1422717077           5.300034             0.000034

We can conclude that in Q3.4 format there aren’t enough bits to get a result with reasonable accuracy. We can also see that Q31 is the only format that, when its result is rounded to the same decimal places as the “original” (float) result, yields the exact same value.

Many small microprocessors don’t have a hardware divider, unlike multipliers, which are very common. So division may still be considered a heavy operation (to be avoided when possible). However, it’s always more efficient to do it with integers than with floats.

Addition and Subtraction

Addition and subtraction are equivalent operations since we are dealing with signed numbers. Two possible issues with these operations are overflow and underflow, which happen when the resulting value is larger or smaller than what can be stored in hardware.

The rule is: the sum of N individual b-bit binary numbers can require as many as b+log_2(N) bits to represent the result. For example, the sum of two 8-bit words requires an accumulator whose word length is 8+log_2(2)=9 bits to ensure no overflow occurs.

Also, in fixed-point representation the operands to be added or subtracted must be in the same Q-format.
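
If they aren’t, one of them must be shifted first so that both share the same format. A minimal sketch (formats and names chosen arbitrarily):

#include <stdint.h>

/* Add a Q12 value to a Q15 value: align the Q12 operand to Q15 first.
   A wider result type gives headroom for the extra bit the sum may need. */
static inline int32_t add_q15_q12(int16_t a_q15, int16_t b_q12)
{
    return (int32_t)a_q15 + ((int32_t)b_q12 << 3);  /* result in Q15 */
}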

Additional Notes

Why did we choose 2^n as the scaler value? Couldn’t it be another number? We humans tend to use 10^n values to scale or aid the representation of fractional numbers, as in scientific notation: it’s a value we easily understand, since it just moves the decimal point while the digits stay the same. However, in hardware, values are stored in binary, and so using powers of two is more efficient. That’s what allows us to convert numbers between different Q-formats by simply shifting them, improving the efficiency of arithmetic operations.

Now you should feel comfortable enough to embrace the world of fixed-point representation. If you don’t know where to use it (or where it’s commonly used), let me tell you one thing: fixed-point representation is used extensively in Digital Signal Processing (DSP). You may use it to create digital audio effects, build inertial measurement units or extract other meaningful data from digital (and noisy) signals by filtering and processing them.

In the next post I will write about some types of digital filters and how they can be implemented in fixed-point.

The full source code to test fixed-point multiplication and division is available here.

Did you find this interesting? Let me know what you think.