Readable and Maintainable Bitfields in C

Bitfields are very common in low-level C programming. You can use them to store a data structure with many flags efficiently, or to pass a set of flags between functions. Let us look at the different ways of doing this.

The straightforward way to deal with bitfields is to do the Boolean logic by hand:

Boolean Magic

#define FLAG_USER   (1u << 0)
#define FLAG_ZERO   (1u << 1)
#define FLAG_FORCE  (1u << 2)
/* bits 3-30 reserved */
#define FLAG_COMPAT (1u << 31)  /* unsigned: 1 << 31 overflows a 32-bit int */

void create_object(unsigned int flags)
{
        int is_compat = (flags & FLAG_COMPAT) != 0;

        if (is_compat)
                flags &= ~FLAG_FORCE;

        if (flags & FLAG_FORCE) {
                /* ... */
        }
        /* ... */
}

void create_object_zero(unsigned int flags)
{
        create_object(flags | FLAG_ZERO);
}

/* call site: */
        create_object(FLAG_FORCE | FLAG_COMPAT);

You can see code like this everywhere, so most C programmers can probably read and understand it quite easily. Unfortunately, this method is very error-prone: mixing up "&" and "&&" and omitting the "~" when doing "&=" are common oversights. And since the compiler only sees plain integer types, it cannot give you any kind of type safety either.
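To make the two oversights concrete, here is a minimal sketch. The function names are hypothetical and exist only for illustration; the two FLAG_* values mirror the ones above.

```c
#include <assert.h>

#define FLAG_ZERO   (1u << 1)
#define FLAG_FORCE  (1u << 2)

/* Correct: "&= ~" clears exactly one bit. */
static unsigned int clear_force(unsigned int flags)
{
        flags &= ~FLAG_FORCE;
        assert(!(flags & FLAG_FORCE));
        return flags;
}

/* Oversight 1: the "~" is missing, so every bit EXCEPT force is cleared. */
static unsigned int clear_force_missing_tilde(unsigned int flags)
{
        flags &= FLAG_FORCE;
        return flags;
}

/* Oversight 2: "&&" is a logical AND. The result is 1 whenever both
 * operands are non-zero, no matter which bits are set in flags. */
static int has_force_logical_and(unsigned int flags)
{
        return flags && FLAG_FORCE;
}
```

Both faulty variants are valid C, so at best the compiler emits a warning. With only FLAG_ZERO set, has_force_logical_and() still reports the force flag as present.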


Let us look at the same code using bitfields instead:

typedef unsigned int boolean_t;
#define FALSE 0
#define TRUE (!FALSE)

typedef union {
        struct {                        /* anonymous struct: C11 or a compiler extension */
                boolean_t user:1;
                boolean_t zero:1;
                boolean_t force:1;
                int :28;                /* unused */
                boolean_t compat:1;     /* bit 31 */
        };
        int raw;
} flags_t;

void create_object(flags_t flags)
{
        boolean_t is_compat = flags.compat;

        if (is_compat)
                flags.force = FALSE;

        if (flags.force) {
                /* ... */
        }
        /* ... */
}

void create_object_zero(flags_t flags)
{
        flags.zero = TRUE;
        create_object(flags);
}

/* call site: */
        create_object((flags_t) { { .force = TRUE, .compat = TRUE } });

Flags can just be used like any variables. The compiler abstracts the Boolean logic away. The only downside is that the code with the static initializer requires some advanced syntax.


Bitfields in C always start at bit 0. While this is the least significant bit (LSB) on little-endian systems (bit 0 has a weight of 2^0), most compilers on big-endian systems inconveniently treat the most significant bit (MSB) as bit 0 (so bit 0 has a weight of 2^31, 2^63, etc., depending on the word size). If your bitfield needs to be binary-compatible across machines with different endianness, you will therefore need to define two versions of the bitfield.
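One way to maintain the two versions is to select the field order with the preprocessor. The following is only a sketch under stated assumptions: it relies on the __BYTE_ORDER__ macros that GCC and Clang predefine, on a 32-bit unsigned int, and on those compilers' habit of allocating bitfields from the LSB on little-endian targets and from the MSB on big-endian targets. The names portable_flags_t and compat_mask() are hypothetical.

```c
#include <assert.h>

typedef union {
        struct {
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
                unsigned int user:1;    /* weight 2^0 */
                unsigned int zero:1;
                unsigned int force:1;
                unsigned int :28;       /* unused */
                unsigned int compat:1;  /* weight 2^31 */
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
                unsigned int compat:1;  /* weight 2^31 */
                unsigned int :28;       /* unused */
                unsigned int force:1;
                unsigned int zero:1;
                unsigned int user:1;    /* weight 2^0 */
#else
#error "unknown byte order"
#endif
        };
        unsigned int raw;
} portable_flags_t;

/* Returns the raw word with only the compat flag set. */
static unsigned int compat_mask(void)
{
        portable_flags_t f = { .raw = 0 };

        f.compat = 1;
        assert(sizeof(f) == sizeof(f.raw));     /* fields fill exactly one word */
        return f.raw;
}
```

Under these assumptions compat_mask() yields 0x80000000 on both kinds of machine. Other compilers need their own check, since the C standard leaves the allocation order to the implementation.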

The Raw Bitfield

Did you notice the "int raw" in the union? It lets us conveniently (and type-safely) export the raw bit value without having to use a cast.

	printf("raw flags: 0x%x\n", flags.raw);

We can use this to reconstruct the FLAG_* constants in the original example, in case some code needs it:

#define FLAG_USER   (((flags_t) { { .user   = TRUE } }).raw)
#define FLAG_ZERO   (((flags_t) { { .zero   = TRUE } }).raw)
#define FLAG_FORCE  (((flags_t) { { .force  = TRUE } }).raw)
#define FLAG_COMPAT (((flags_t) { { .compat = TRUE } }).raw)

This code constructs a temporary instance of the bitfield, sets one bit, and converts it into a raw integer - all at compile time.
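Because the compiler chooses the bit positions, it may be worth checking that the reconstructed constants match the hand-written masks from the first example. The sketch below does that at run time. The names check_flags_t, RAW_OF and layout_matches_masks() are hypothetical, and the union repeats the layout from above with an unsigned raw member.

```c
#include <assert.h>

typedef union {
        struct {
                unsigned int user:1;
                unsigned int zero:1;
                unsigned int force:1;
                unsigned int :28;       /* unused */
                unsigned int compat:1;
        };
        unsigned int raw;
} check_flags_t;

/* Raw word with exactly one named flag set. */
#define RAW_OF(field) (((check_flags_t) { { .field = 1 } }).raw)

/* Returns 1 if this compiler places the flags on the same bits as the
 * hand-written masks (1 << 0), (1 << 1), (1 << 2) and (1 << 31). */
static int layout_matches_masks(void)
{
        /* Whatever the layout, two flags must never share a bit. */
        assert((RAW_OF(user) & RAW_OF(zero)) == 0);
        assert((RAW_OF(force) & RAW_OF(compat)) == 0);

        return RAW_OF(user)   == (1u << 0) &&
               RAW_OF(zero)   == (1u << 1) &&
               RAW_OF(force)  == (1u << 2) &&
               RAW_OF(compat) == (1u << 31);
}
```

Note that in standard C a member of a compound literal is not an integer constant expression, so such a constant cannot be used in a case label or a _Static_assert. Whether the value is folded at compile time is up to the optimizer.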

Bitfield Access from Assembly

With the same trick, you can also access your bitfield from assembly, for example if the bitfield is part of the Thread Control Block in your operating system kernel, and the low level context switch code needs to access one of the bits. The "int raw" can be used to statically convert a flag into the corresponding raw mask:

typedef unsigned int boolean_t;
#define TRUE 1

typedef union {
	struct {
		boolean_t bit0:1;
		boolean_t bit1:1;
		int :29;		/* bits 2-30 unused */
		boolean_t bit31:1;
	};
	int raw;
} bitfield_t;

int test()
{
	int param = -1;
	int result;

	__asm__ volatile (
		"test    %2, %1    \n"
		"xor     %0, %0    \n"
		"setcc   %0        \n"
		: "=r" (result)
		: "r" (param),
		  "i" (((bitfield_t) { { .bit31 = TRUE } }).raw)
	);
	return result;
}

The corresponding x86 assembly code looks like this:

	.align	4,0x90
	.globl	_test
_test:
	pushl	%ebp
	movl	%esp, %ebp
	movl	$-1, %eax
	## InlineAsm Start
	test	$0x80000000, %eax
	xor	%eax, %eax
	setcc	%eax
	## InlineAsm End
	popl	%ebp
	ret


This works fine with LLVM, but unfortunately GCC (4.2.1) has problems detecting that the raw value is available at compile time, so the "i" has to be replaced with an "r": GCC will then pre-assign a register with the raw value instead of being able to use an immediate with the "test" instruction.

How to Not Do It

I have seen C++ code doing this:

enum {
        /* ... */
};

void create_object(bitfield_t flags)
{
        bool is_compat = flags.is_set(FLAG_COMPAT);

        if (is_compat)
                flags -= FLAG_FORCE;

        if (flags.is_set(FLAG_FORCE)) {
                /* ... */
        }
        /* ... */
}

void create_object_zero(bitfield_t flags)
{
        create_object(flags + FLAG_ZERO);
}

/* call site: */
        create_object(((bitfield_t)FLAG_FORCE) + FLAG_COMPAT);

This all looks quite weird. The constants are bit index values, and they are added and subtracted. The reason is C++ operator overloading:

class bitmask_t
{
    word_t      maskvalue;

public:
    /* ... */

    inline bitmask_t operator -= (int n)
    {
            maskvalue = maskvalue & ~(1UL << n);
            return (bitmask_t) maskvalue;
    }
};

This is horrible. The code that uses this class makes no sense unless you read and understand the implementation of the class. And you have to be very careful: While it is possible to "add" a flag to an existing bitfield, you cannot just add two flags - it would do the arithmetic and add the two values.

Mapping the setting and clearing of bits onto the addition and subtraction operators is clearly wrong in the first place: Flags in a bitfield are equivalent to elements in a set. Setting a flag is equivalent to the "union" operation, which even in Mathematics has its own symbol instead of overloading the "+" operator.
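The set view maps directly onto the bitwise operators. Here is a minimal sketch in C; the type flagset_t and the function names are hypothetical.

```c
#include <assert.h>

typedef unsigned int flagset_t;         /* one bit per set element */

static flagset_t flagset_union(flagset_t a, flagset_t b)        { return a | b; }
static flagset_t flagset_intersection(flagset_t a, flagset_t b) { return a & b; }
static flagset_t flagset_difference(flagset_t a, flagset_t b)   { return a & ~b; }

/* Adding an element is a union with a one-element set. */
static flagset_t flagset_add(flagset_t set, flagset_t element)
{
        flagset_t result = flagset_union(set, element);

        assert(flagset_intersection(result, element) == element);
        return result;
}
```

Unlike "+", the union is idempotent: adding the element 0x4 to a set that already contains it yields 0x4 again, whereas 0x4 + 0x4 carries into the next bit and gives 0x8.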


If you compile code that does something like "((bitfield_t) { { .bit31 = TRUE } }).raw" with GCC in C++ mode, it fails. Why?

37 thoughts on “Readable and Maintainable Bitfields in C”

  1. strik

    Regarding the bitfield:

    It is bad practice to define a bit-field entry with “int”; quoting N1124, footnote 105:

    “As specified in 6.7.2 above, if the actual type specifier used is int or a typedef-name defined as int,
    then it is implementation-defined whether the bit-field is signed or unsigned.”

    Thus, it may be that your comparison flags.compat == 1 is never true, as flags.compat is -1. (Of course, for flags, this is no problem. However, if you have some bit-field entry of “int abc : 4;”, you might be very surprised if abc is in the range -8..7 instead of 0..15.)

    Thus, better use “unsigned int compat : 1;”

    Furthermore, other than endianness issues, bit-fields are not very portable if you rely only on what C guarantees:

    “An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.”
    (N1124.pdf, § “Structure and union specifiers”, no. 10, p. 102)

    That is, very much here is unspecified. In particular, a bit-field that does not fit in the storage unit where the current bit-field is located can either straddle the boundary of that storage unit (“byte”) or begin at the next one.

    For example: In your example above, .compat could be placed in a byte of its own, if the compiler likes it this way.

    Regarding using the union to access the raw value: While it works in practice (on most machines), there might be obscure machines where this does not work, as the C standard does not guarantee anything.

    “Annex J (informative)
    Portability issues
    J.1 Unspecified behaviour
    The value of a union member other than the last one stored into (”
    (N1124.pdf, J.1, p. 488)


    “When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values.”
    (N1124.pdf, §, p. 50)

    So, to make a long story short: While you are right that it is possible to implement bit-fields with C bit-fields, it is not advisable if you must agree on an exact layout (for example, bits of some hardware register, or files which are read by other programs). Note also that even one compiler might “change its opinion” with some newer version; thus, even saying “but I know my compiler” will not help you.

    Regarding your question: Does C++ even allow struct initialisation with the “.bit31 = TRUE” syntax? Last time I looked at it, it did not. To be more precise, even now, there are still many C compilers around that do not support this, as it was added with C99. Before C99, it was not supported in C, either.
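strik's point about signedness can be illustrated with a small sketch. The struct and function names are hypothetical, and the signed ranges in the comments assume a two's complement machine.

```c
#include <assert.h>

struct fields {
        signed int   sflag:1;   /* holds 0 and -1 */
        unsigned int uflag:1;   /* holds 0 and 1 */
        signed int   snib:4;    /* holds -8..7 */
        unsigned int unib:4;    /* holds 0..15 */
};

/* Returns 1 if the unsigned fields can represent 1 and 15. */
static int unsigned_fields_hold_maximum(void)
{
        struct fields f = { 0 };

        f.uflag = 1;
        f.unib = 15;
        return f.uflag == 1 && f.unib == 15;
}

/* Stores the most negative value in each signed field and returns the sum. */
static int signed_fields_minimum_sum(void)
{
        struct fields f = { 0 };

        f.sflag = -1;
        f.snib = -8;
        assert(f.sflag < 0 && f.snib < 0);
        return f.sflag + f.snib;
}
```

A one-bit signed field has no way to represent 1, which is why a comparison such as flags.compat == 1 can silently fail when the field type is a plain int that the implementation treats as signed.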

  2. quackzilla

    If you’re going to advocate this kind of approach for bitfields, there’s a tiny gotcha which can make debugging a nightmare:

    typedef int boolean_t;

    should be

    typedef unsigned int boolean_t;

    You’ve taken other steps to mitigate sign debugging hell (using the high bit, TRUE is !FALSE, always use != FALSE rather than == TRUE comparisons, etc.), but it can be a terrible slice of purgatory to debug if many of these steps are missing, as they often are in others’ code.

  3. ZungBang

    Bit-fields are a tempting abstraction. The problem with bit-fields is that the normal use case for them is in a setting which tends to break the abstraction.

    Case in point: a former boss of mine had me debug machine exceptions caused by a piece of code he wrote that used bit-fields to represent and access hardware registers. He was quite in love with it, I swear.

    Turns out that the assembly code that was generated by the compiler, did byte-wide reads/writes on hardware that was designed for 32-bit wide accesses only. The code would’ve been perfectly fine for accessing normal memory, but not on our system.

    The fix was to use “boolean magic”, which worked nicely both on the actual system and in a simulation environment.
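A sketch of the kind of “boolean magic” fix ZungBang describes, assuming a register that tolerates only full 32-bit accesses. The function name reg32_update() is hypothetical. The volatile qualifier asks the compiler to perform each access as written, which in practice means one aligned 32-bit load and one 32-bit store.

```c
#include <assert.h>
#include <stdint.h>

/* Read-modify-write a 32-bit register using only full-width accesses.
 * Bits in clear_mask are cleared first, then bits in set_mask are set. */
static uint32_t reg32_update(volatile uint32_t *reg,
                             uint32_t clear_mask, uint32_t set_mask)
{
        uint32_t value;

        assert(reg != 0);
        value = *reg;                                   /* one 32-bit load */
        value = (value & ~clear_mask) | set_mask;
        *reg = value;                                   /* one 32-bit store */
        return value;
}
```

With a bitfield, the compiler is free to pick a narrower access for a single-bit update. Here the access width is fixed by the pointer type.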

  4. Anonymous

    You really should be changing it to say NOT to use this method! It is completely unreliable and will most likely cause you problems when compiling on different compilers or different architectures, as was thoroughly explained by the earlier comments. You’re relying on unspecified behaviors in several different ways.

  5. Bobby

    “Bitfields in C always start at bit 0.” is not true. The spec says “The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined.”

    As everybody else has been saying, relying on any details of how this code works is just asking for trouble. If you just want bitfields and don’t need to interact with code from other compilers, or hardware registers, and you don’t particularly care how it works behind the scenes, C bitfields are fine. For anything else, they aren’t suitable.

  6. lorenzo

    >> If you compile code that does something like “((bitfield_t) { { .bit31 = TRUE } }).raw” with GCC in C++ mode, it fails. Why?

    To build anonymous structs (and arrays too) on the fly, I use the syntax {field:value}.
    The code below compiles with g++ on Linux:

    #include <iostream>

    union flags_t {
        struct {
            unsigned int _user:1;
            unsigned int _zero:1;
            unsigned int _force:1;
            unsigned int _dummy:28;
            unsigned int _compat:1;
        } _bitfield;
        unsigned int _raw;
    };

    void print(const flags_t &flags) {
        std::cout << "flags._user: " << flags._bitfield._user << "\n"
                  << "flags._zero: " << flags._bitfield._zero << "\n"
                  << "flags._force: " << flags._bitfield._force << "\n"
                  << "flags._compat: " << flags._bitfield._compat << std::endl;
    }

    int main(int argc, char **argv) {
        print( (flags_t){{_user:1, _zero:0, _force:1, _dummy:0, _compat:0}} );

        return 0;
    }

  7. Pingback: // popular today

  8. bitfieldhater

    Please change the title to:
    Non-Portable Bitfields in C

    If you ever think that someone may possibly want to use your code in a cross-platform manner or even interfacing to it from code written in a different language on the same platform, please refrain from using bitfields. Use an unsigned int with masks instead.

    Thank you.

  9. Matt H


    How would masks be cross platform? You still have to deal with endianness issues.

    Bitfields are great in my opinion. I find it’s easiest to determine how your compiler is aligning them, and then adjust the structure for other compilers using preprocessor defines (if you want to interop).

  10. Anonymous

    @Matt H

    You have to deal with endianness issues anyway if you’re doing cross-platform code. Bitfields or masks aren’t going to change that. However, using bitfields adds a lot of other issues on top of that, none of which you actually have any control over!

  11. strik

    Michael, as others have also pointed out: DON’T DO IT! This type of code is the best way to be non-portable!

    It might work on your specific platform – NOW. But every little change (even time of compilation) can probably break it.

  12. Felix

    I also have to agree with the other commenters – bitfields suck. They are tempting, yes, but next to all the problems with theoretically “undefined behavior”, there are real world problems. The largest problem is endianness. The right way to handle the wrong endianness (which is little endian, obviously) is to treat reading binary data as “decoding” and writing binary data as “encoding”. Like treating user data in a web application as “unsafe”, you should treat “binary data” as “binary” (or 8-bit numbers, if you want), but never as “32-bit numbers”.

    The right way to handle endianness is not to swap data at whatever place where it’s “broken otherwise”; it’s to have a defined (and correct) function that puts “bytes of data” into “registers”, or generally speaking: that converts a stream of 8-bit values into a stream of 32-bit (or 64-bit) values. PowerPC for example does it right; there is no “bswap” instruction, but instead there is a “load non-native endian word” and “store non-native endian word”. Endianness is an issue with interfacing memory spaces with different data widths; registers have 32 (or 64) bit data width, and memory can be considered as an 8-bit memory space, if you do bytewise access.

    If you start having a bit-addressed memory space (like a bitfield), then WHENEVER you interface to a word-addressed memory space (register) or a byte-addressed memory space (memory) you have to take endianness into account. And while swapping in byte-based domains is cheap (bswap ftw), swapping bits is (usually) not cheap. Sure, you can just redefine your bitfield to magically match whatever the native bit-to-word conversion does (usually by inverting the order), but would you ever re-define any (word) constants because you’re too cheap to do proper endianness correction on bytes? No. Why would you redefine your bitfields then?
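The “decoding” and “encoding” view Felix describes can be sketched with two small helpers. The names load_be32() and store_be32() are hypothetical. They treat memory strictly as bytes, so the result is the same on little-endian and big-endian hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode four bytes, most significant first, into a 32-bit value. */
static uint32_t load_be32(const uint8_t *p)
{
        assert(p != NULL);
        return ((uint32_t)p[0] << 24) |
               ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] <<  8) |
                (uint32_t)p[3];
}

/* Encode a 32-bit value as four bytes, most significant first. */
static void store_be32(uint8_t *p, uint32_t v)
{
        assert(p != NULL);
        p[0] = (uint8_t)(v >> 24);
        p[1] = (uint8_t)(v >> 16);
        p[2] = (uint8_t)(v >> 8);
        p[3] = (uint8_t)v;
}
```

Because the shifts operate on values rather than on the memory representation, no byte swap conditional on the host's endianness is needed.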

  13. Pingback: Josh Haberman » Bit-fields in C99

  14. Graham

    u3_u5_u8_u12_DECODER {
    // see libs_apps/docs/

    Supposing you had a C++ class,
    for use at COMPILE_TIME or RUN_TIME

    BIT_FIELD { // probably template // probably derived from NAMED_OBJECT

    uns nbits_gap_right;
    uns nbits_data;
    bool is_signed;

    // thats all you need, maybe expand “is_signed” to TYPE_SPEC_of_enum_bit_field
    // plus NAME + SPEC of this
    // plus WHEN you know this, CT_COMPILE_TIME, RT_RUNTIME, const ?, changed?

    enum byte_order_of_memory_and_code_used_to_handle_it
    byte_order = byte_order_hilo_by_definition_probably_on_lohi_cpu;

    static enum compiler_bit_order_code_and_metrics
    bit_order = bit_order_gcc_on_AMD64_using_masks_AND_bit_fields_mixed; // !

    // starts to get virtual here

    u32 GET_mask_1s_rhj();
    u32 GET_mask_1s_in_situ(); // 1 where data is

    u32 GET_mask_0s_rhj();
    u32 GET_mask_0s_in_situ(); // 0 where data is

    bool GET_is_within_byte_boundry(); // does_not_cross_byte_boundry
    bool know_gap_right_is_zero;
    bool know_no_need_to_mask_off_upper_bits;


    class LIST_of_BIT_FIELD
    BIT_FIELD & LIST[4]; // however many you want
    bool generate_code( generator & );

    and some friends
    u32_cpu // probably typedef to one or the other, with C++ casting all over the place

    and some distant cousins: u32_cpu_lohi_holding_inverted_hilo.

    Plus of course some optimising bswap functions (HTON macros evaporate to swapb),
    and some failsafe compile-mode-use-masks-and-shifts,
    plus some unit tests,
    plus a community to maintain it for a range of CPUs, compilers, times-of-day,

    Plus, you also can run this data-driven C++ class,
    to prints the correct C/C++ for your machine/compiler/timeofday
    (or falls back on masks and swapb),
    as long as there is a non-cross-compiler available, at compile-time.

    Add to that, the attempt to access u32_hilo IN-SITU from a u32_cpu_lohi
    knowing that the sub-byte-values are easier, but the byte-boundry cross isnt,
    but never-the-less attempt to get a good engineering compromise

    Then what _YOU_ do with a family of types named

    u3_u5_u8_u16 // upper tray is blind to lower tray u16
    u3_u5_u8_u4_u12 // decoder finds common case of 4K pools
    u8_u8_u16 // decode u3_u5 as plain lookup[u8] // sparse: void * lookup[ decode[u8] ]
    u16_u16 // u16_upper_u16_lower
    u32_hilo // as_found_in_file_preferably_aligned

    Remember that decoding, will probably extract the values all-in-one-go,
    because you KNOW that you will decode the entire multi-step-address
    (which is an index not an offset, looking up the object in a tray_of_256_of_similar_type)

    (Q) If it is to be quick on all architectures, and all compiler-modes,
    what should the bit layout be: ?



    NB By using different names for each bitfield,
    it remains easier to prototype here, for a larger space,
    otherwise its:

    u8_C ….

    NB My storage layout allocates within the lower 16 bits (object per item)
    with multiple parallel worlds selected in the upper 16 bits.
    That upper layout is implemented as a (hidden) lower_tray_of_items
    to reuse code, but allow mixing trays_of_lower_u16 from different files

  15. kohlrak

    I know this is old, but truly, the instability comes with the datatype itself. An int on an older system is 16 bits, an int on a going-out-of-style system is 32 bits, and on a new system an int is 64 bits. Probably best to revert to non-changing “types” such as byte, word, dword, qtword, etc.

  16. Pingback: » Blog Archive » Tumblelog 090817

  17. woody

    I disagree strongly with all the bit field detractors.
    I do embedded firmware. Bit fields are critical. Most porting of embedded code is going to be to a similar platform, for example, various flavors of an 8051, using the same compiler.

    The PIC24 family uses the GNU compiler, and their entire method of accessing the bit fields in all the registers, depends on bit fields in unions.
    Every pic compiler does the same thing.

    I use Keil 8051 compiler in my work. I’ve never seen any problems when using the Keil compiler in the 8051 world, and have ported code back and forth across many 8051 platforms. Bit fields have been used extensively to unpack, and access registers and bits, and communication buffers.

    The moral of the story is: Bit fields are highly useful. Now when I go to port my 8051 code over to an ARM, I will of course have to use whatever compiler is furnished, and adhere to the way that compiler implements things, but most of that can be hidden behind defines, just as it is in the PIC24 family.

  18. Sean Ellis

    Scott said: “A poster who dislikes bitfields is one who has never seriously programmed an embedded system.”

    I do serious embedded work. I dislike bitfields. I would love to use them – they would make life so much easier – but they’re all but unusable for serious work.

    The reasons for not using them have all been set out above. They’re a mess of undefinedness. Which end do they pack from? Which end of what (bytes, words, halfwords)? What exact set of operations is used for a bitfield set, and how does it interact with memory mapped I/O?

    Even if you only use a single toolchain, as woody apparently does, any of the behaviors you rely on can be broken with an upgrade, at any time, for any reason.

    This is a real pity, because they would remove about half the evil #defines in the embedded world. But even the preprocessor isn’t quite as evil as undefined behavior.

  19. t

    I have programmed using C for numerous processors, dsps, and microcontrollers and I have never found the need to use bitfields. I use masks and macros when I want to manipulate bits.

  20. Rod

    Can someone point me to an embedded processor vendor who uses bitfields in their sample code to access registers? Up till now, no chip maker code I know has. I’ve lost count of how many embedded architectures I’ve worked with.

    Maybe I should have read the C spec before today, but I’ve been embedded programming (mostly self taught) for 6 years now. Yesterday was the first time I had ever heard of bitfields in C.

  21. Jon

    Rod, Microchip uses bitfields in their processor header files, in their mcc18 compiler. For example, the STATUS register in a PIC18F26K20 is defined as so in their header file:
    extern near unsigned char STATUS;
    extern near struct {
    unsigned C:1;
    unsigned DC:1;
    unsigned Z:1;
    unsigned OV:1;
    unsigned N:1;
    } STATUSbits;

    This way, you can clear (for example) the C bit by:
    STATUSbits.C = 0;

    Good luck!

  22. Frederick Eek

    I have been doing extensive embedded programming (amongst others) for 20 years. I fail to understand why you would not use them. All hardware have bit fields in their registers. Most protocols contain information with bit fields.
    As for their undefinedness, exactly the same applies for “boolean magic”. Here you use defines to map bits to positions in unsigned shorts or ints. You are therefore fixing the flags to the endianness of your system.
    The endianness problem does not lie in your compiler/system, but in how different systems interpret them (i.e. when communicating over a bus or network).
    I have been using bit fields in various systems (ported and sharing code) including 8086, .NET programming, 8051, Arm, TMS320, AVR, etc. I never have to modify flags or #defines. The worst I have ever had to do was to have two definitions for the same structure depending on the compiler used (the old 8051 Franklin compiler did not support 32 bit packed structures and the packed TIMESTAMP data type we used on the network was packed into 32 bits).
    The biggest advantage of bit fields is the fact that you do not continuously have to keep track of how flags and masks actually map to your memory. Once the structure is defined, you are completely abstracted from the memory representation whilst for boolean magic you have to remember the masks and sometimes even shifts at every point you use them.
    I think people that detract from bit fields have actually never really had to do anything but simple flag checking and setting.

  23. Eswar

    Can any body explain the memory size of a bit field variable? What is the advantage of using Bit fields particular to memory management point of view?

  24. tissit

    TI uses bitfields and so does Fujitsu. I have a feeling I’ve met others, but I’m not sure. I think AVR is the exception that doesn’t.

  25. Pingback: Adobe Interview Question for Software Engineer/Developer about Bit Magic « GeeksforGeeks

  26. Pingback: Glitch Kombat (Part 2) | Shanth's Blog

  27. chalapathi

    Hi ,
    I’m chalapathi , which is very helpful for me updating the technology.

    Thanks & Regards


  28. Jonathan

    There is a simple solution that will keep most people happy. (except the anal retentive types who should have given up programming when cobol went out of vogue)

    Write your code using bitfields to suit your implementation.

    Write test code that confirms that values loaded into your struct produce the correct output. If you can’t design a test that does this, you should not be developing anything in this day and age.

    If your tests fail, don’t deploy. Adjust your macros or unions to suit the architecture, and then you are good to go. A port is a port. Don’t expect everything to cross compile, that is a fantasy. Do some work and don’t expect the compiler to do your job for you.

  30. Fauss Hull

    I hate bitfields, but see no alternative for US Military Interface programming

    For the last 16 years, I have been fighting with endianness, bitfields and bitmasking. For better or worse, US military hardware typically has a Big Endian interface containing information with bit lengths from 1 to 32 bits. These interfaces can frequently have hundreds of individual data items in them.

    I have used bit masking in the past to bypass the problem that bitfields are typically allocated from LSB to MSB on little endian machines and in the reverse order on big endian machines. However, the problem with this approach is that of data reduction. In the military programming environment it is CRITICAL to record data traversing an interface and be able to reduce that data to a human readable format at a later date. To do this with a program using bitmasks means lots of hand coded diagnostic code.

    In my current programming environment, we have an automated data dictionary generation program which reads the .h files to produce a data dictionary which can be used to automatically reduce all defined fields in human readable format, e.g. if bit 7 is a bit named status defined as an enum of OK = 1 and FAULT = 0 and this is defined in a structure, the data parser will generate output such as status = OK.

    Bitmasking makes the above impossible; if you know how to do it, please tell me. So I am ‘forced’ to deal with the portability issue of bitfields. We have a number of ways to deal with this. The one for our case, where we have little endian programs on the application side, is to simply define the bitfields in each word in reverse order from how they are defined in the Interface Design Document (IDD) for the big endian hardware and do the required byte swapping. To prevent us having portability issues should we end up on a big endian platform (right), we keep representations of the structure with the fields in the order defined in the IDD and in reverse order. The only thing left to do is swap the bytes in each word when going from big endian to little endian. An example of the dual definition is:

    // Set LITTLE_ENDIAN to the type of the machine the code will run on,
    // 0 for big endian. Alternatively set LITTLE_ENDIAN in the makefile
    #ifndef LITTLE_ENDIAN
    #define LITTLE_ENDIAN 1
    #endif

    #if LITTLE_ENDIAN
    struct ExampleStruct
    {
        unsigned b : 22;
        unsigned a : 10;
        unsigned c;
        unsigned int e : 14;
        unsigned d : 18;
    };
    #else
    struct ExampleStruct
    {
        unsigned a : 10;
        unsigned b : 22;
        unsigned c;
        unsigned d : 18;
        unsigned int e : 14;
    };
    #endif


    The above may seem very hard to maintain; however, a utility program can be used to convert the first definition into the second. Further, changes in military hardware interfaces are VERY rare and typically use previously unused bits.

  31. Levi

    For those blessed embedded programmers who haven’t run into chip vendors that define their interfaces in terms of bitfields, Freescale is one of those vendors.

    For those of you who labor under the view that masks & shifts have just as many portability problems, I’m afraid you’re mistaken. When you deal with values that are the same width as your registers, the same mask and shift operations that work correctly on a big-endian machine will work correctly on a little-endian machine. This is because big- and little-endian issues are representation issues that are entirely absent once you have translated representation into value, and machine-level arithmetic operations only work on values.

    Now, there may be an issue of correctly translating documentation of a register on a big-endian architecture into mask and shift operations, but this can be alleviated by thinking correctly about the problem. You need to figure out which end of the diagram holds the bits that form the least-significant portion of the value; these are almost always the ones on the right regardless of endianness, but they usually have the highest numbers in a big-endian architecture’s diagrams. Now look at the first field in the least significant bits; this is the field with a shift of 0 and some number of bits, n_1, of width. The next field then has a shift of n_1 and a width of n_2. The third field has a shift of (n_1+n_2) and a width of n_3, and so on.

    You can create a mask of width n for all n less than the width of the register by shifting the value 1 left by n bits and then subtracting 1. I.e. mask(n) = (1 << n) - 1. So to get the mask for a field of width 'n' and shift 's', you have field_mask(s,n) = mask(n) << s. To prepare a value for a field, mask it to the field width and shift it, i.e. field_value(s,n,v) = (mask(n) & v) << s. To clear a field in a value, apply the inverted mask: clear_field(s,n,v) = v & ~field_mask(s,n). To replace a field in a value, combine the two: replace_field(s,n,old,new) = clear_field(s,n,old) | field_value(s,n,new).

    If you define the above via macros and use constant values for the shift and width parameters, you'll get expressions that compilers can usually simplify fairly well (the mask calculations become constant expressions, for example) and things remain fairly readable. If you can't put calculations in macros due to style guidelines, then you can pre-calculate the masks and the remaining shift/mask operations become a quickly-learned idiom.

    It should not be too difficult to create a document in a machine-readable form that, for each register, has its shift, width, and an enumerated list of possible values and their human-readable names. Since the values are values and not representations, they are again not vulnerable to representation issues. You should be able to easily generate from this document a set of header files with enums and constant-macros for each field’s shift, width, and values, along with portable data parsing code. There’s absolutely no need to use bitfields.

    I agree that this is ugly and it’s ridiculous that we can’t do better with portable C code, but it’s unlikely to change, and portable code is better than the syntactic convenience of bitfield syntax.
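Levi's formulas translate almost literally into C macros. The following is a sketch for 32-bit registers. EXTRACT_FIELD, the A_/B_ constants and pack_a_b() are hypothetical additions for illustration and are not part of the formulas in the comment.

```c
#include <assert.h>
#include <stdint.h>

/* mask(n) = (1 << n) - 1, valid for 0 <= n < 32 */
#define MASK(n)                 ((UINT32_C(1) << (n)) - 1u)
#define FIELD_MASK(s, n)        (MASK(n) << (s))
#define FIELD_VALUE(s, n, v)    ((MASK(n) & (uint32_t)(v)) << (s))
#define CLEAR_FIELD(s, n, v)    ((uint32_t)(v) & ~FIELD_MASK(s, n))
#define REPLACE_FIELD(s, n, old, newval) \
        (CLEAR_FIELD(s, n, old) | FIELD_VALUE(s, n, newval))

/* Inverse operation (an addition): read a field back out of a word. */
#define EXTRACT_FIELD(s, n, v)  (((uint32_t)(v) >> (s)) & MASK(n))

/* Hypothetical layout: 10-bit field "a" at shift 0, 22-bit field "b"
 * at shift 10. */
enum { A_SHIFT = 0, A_WIDTH = 10, B_SHIFT = 10, B_WIDTH = 22 };

static uint32_t pack_a_b(uint32_t a, uint32_t b)
{
        uint32_t word = 0;

        word = REPLACE_FIELD(A_SHIFT, A_WIDTH, word, a);
        word = REPLACE_FIELD(B_SHIFT, B_WIDTH, word, b);
        assert(EXTRACT_FIELD(A_SHIFT, A_WIDTH, word) == (a & MASK(A_WIDTH)));
        return word;
}
```

With constant shift and width arguments, the mask expressions are constant expressions, so they cost nothing at run time.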

