How to map enum to strings in C

How can we map enum values to strings in the C programming language?

Often times you need to display a C enum as a string, most often when debugging or handling error conditions.

Typical techniques usually involve: a) defining an array of strings (the strings being the enum names of course, and the string index inside the array matching the enum’s value); or b) defining a long switch-case function accepting the enum as a parameter and returning the corresponding string.
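For reference, here is a minimal sketch of both manual techniques. The color_t enum and the two function names are made up for this illustration, they are not part of the example used later in this post.

```c
/* Hypothetical enum, used only for illustration. */
typedef enum { COLOR_RED, COLOR_GREEN, COLOR_BLUE, COLOR_COUNT } color_t;

/* a) Array of strings: the order must be kept in sync with the enum by hand. */
static const char *const color_names[COLOR_COUNT] = {
    "COLOR_RED",
    "COLOR_GREEN",
    "COLOR_BLUE"
};

static const char *color_name_from_array(color_t c)
{
    /* The unsigned cast also rejects negative values. */
    return ((unsigned)c < (unsigned)COLOR_COUNT) ? color_names[c] : "UNKNOWN";
}

/* b) switch-case: one case per enum value, also maintained by hand. */
static const char *color_name_from_switch(color_t c)
{
    switch (c) {
    case COLOR_RED:   return "COLOR_RED";
    case COLOR_GREEN: return "COLOR_GREEN";
    case COLOR_BLUE:  return "COLOR_BLUE";
    default:          return "UNKNOWN";
    }
}
```

Note that every enum name appears three times here: once in the enum, once in the array and once in the switch.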

These solutions work, of course, but they have several disadvantages. Most importantly, they require manual work from the programmer whenever the enum changes, e.g. if the enum’s values are rearranged or if values are added or deleted.

Isn’t there a way to automate this whole business and have these string values created for us automatically?

Sure! And not only strings can be generated this way – you can maintain an enum and any list of associated data – strings, structs, etc!

Here’s one solution for mapping enums to strings or other data (given to me by Vasil Vasilev). At its core it boils down to this:
1. You use a macro which expands to different “things” (enum values, strings, structs) depending on where you call it from, e.g. #define ADDITEM( e, v1, v2 )
2. You create a header file and fill it with such macros, one for each enum value
3. You include this header file where you need to define your enum type or associated data, but just before the include you define the ADDITEM to do what you need it to do in this particular case.

Let’s look at a quick C example of how to map an enum to an array of structs containing the enum’s name as a char string in one of the fields. We will achieve this with 3 source files:
1. def.h : The def file – used by the other 2 files.
2. code.h : The header for the code.c file.
3. code.c : The source code file.

========================
def.h
========================
/**
In this file we define our “composite” data, via the ADDITEM defines. Essentially, we are linking the enum to all the other data that we want to relate to this enum. In this example I am linking the enum to a timeout value in milliseconds for each command as well as a comment string. The comment is not used below in the other files, but it could be used by some code in the system, e.g. in a Help command.

IMPORTANT: Rearranging, adding or removing enum values here will NOT break the integrity of the rest of the program! Of course, if you rearrange some lines here, then the numeric value of the affected enums will change but it will be reflected in all other places where this data is used so there will be no problem.
*/

...
ADDITEM( CMD_PING, 500, "Ping from subsystem" ),
ADDITEM( CMD_INFO, 500, "Subsystem info request" ),
ADDITEM( CMD_RESTART, 3000, "Subsystem restarting" ),
...

 

========================
code.h
========================
/**
This is the header for your source code file. Create and structure it as you would a normal C header, with the only difference that when you define your enum you will provide an appropriate define for the ADDITEM macro and then include the defs file. This will create your enum type automatically!

IMPORTANT: We guard the enum by 2 #undef ADDITEM statements to make sure ADDITEM does not remain visible after it’s served its purpose of defining our enum. This is important, as we will use it again in the .c file below (and possibly in some other places)!
*/

...
#undef  ADDITEM
#define ADDITEM( _etype, _msec, _comment )      _etype
enum
{
#include "def.h"
    CMD_MAX
};
#undef  ADDITEM
...

 

========================
code.c
========================
/**
The source file (as well as other source files, of course) is where we use our enum and its associated data.
In this example I chose to define the cmd_def_t struct as a local type, so it is only visible in the source file. Of course you could move your enum-based data types to the header, and then you could use them in other source files if need be.

IMPORTANT: Again, guard the array definition with #undef’s to make sure ADDITEM does not remain visible after usage.
*/

#include "code.h"
/**
    Define cmd_def_t - our data type based on the enum. 
    The 'name' field in cmd_def_t will contain the 
    corresponding enum name and the timeout_ms value will
    contain the timeout, as defined in the def.h file.
*/
typedef struct cmd_def_s
{
    char name[30];
    int timeout_ms;
} cmd_def_t;

#undef ADDITEM
#define ADDITEM( _etype, _msec, _comment ) { #_etype, _msec }
static const cmd_def_t command_defs[CMD_MAX] =
{
#include "def.h"
};
#undef ADDITEM
...
prin
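
To try the idea without creating three files, here is a single-file sketch of the same technique. It is a variant, not the layout described above: the CMD_LIST macro stands in for def.h, and CMD_LIST, cmd_name() and cmd_timeout_ms() are names I made up for this sketch.

```c
/* Stand-in for def.h: the same three entries, kept in a list macro.
   ADDITEM is only expanded where CMD_LIST is used, with whatever
   definition ADDITEM has at that point. */
#define CMD_LIST \
    ADDITEM( CMD_PING,    500,  "Ping from subsystem" ), \
    ADDITEM( CMD_INFO,    500,  "Subsystem info request" ), \
    ADDITEM( CMD_RESTART, 3000, "Subsystem restarting" ),

/* First expansion: each entry becomes an enum value. */
#define ADDITEM( _etype, _msec, _comment ) _etype
enum { CMD_LIST CMD_MAX };
#undef ADDITEM

typedef struct cmd_def_s
{
    char name[30];
    int timeout_ms;
} cmd_def_t;

/* Second expansion: each entry becomes a struct initializer. */
#define ADDITEM( _etype, _msec, _comment ) { #_etype, _msec }
static const cmd_def_t command_defs[CMD_MAX] = { CMD_LIST };
#undef ADDITEM

/* Hypothetical accessors with a bounds check. */
static const char *cmd_name(int cmd)
{
    return (cmd >= 0 && cmd < CMD_MAX) ? command_defs[cmd].name : "UNKNOWN";
}

static int cmd_timeout_ms(int cmd)
{
    return (cmd >= 0 && cmd < CMD_MAX) ? command_defs[cmd].timeout_ms : -1;
}
```

With this in place, something like printf("%s: %d ms\n", cmd_name(CMD_PING), cmd_timeout_ms(CMD_PING)); prints the enum name and its timeout without any hand-written string table.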

————————-
Ok, that’s all great, but aren’t there cases in which this method will NOT work? And – more importantly – how can we make it work? 😉

Well.. two scenarios which come to mind right away are:
1. Your enum is defined in another file, possibly in a third party lib header and you do not want to modify that header.
2. The second scenario is actually an expansion of the first: not only is your enum defined somewhere else where you have no control over it, but it also contains ‘holes’, e.g.:

enum {
    E_FIRST = 100,
    E_SECOND,
    ...
};

 

Addressing the first issue first:

… so your enum is already defined… so you can’t use your ADDITEM macro to define it… SO WHAT! The enum is already defined, hence you don’t need to define it – just define in the defs header the other data which is associated with your enum in exactly the same manner as you did before!

IMPORTANT: In this scenario the order in which the ADDITEM lines appear in your def.h file DOES MATTER! You want them to match the order of definition in the header file.

Hmm… What we were struggling for here was automation and as little maintenance as possible and we’re almost back to square one – having to watch the order of the ADDITEM lines, having to add and remove them if the enum changes.

That sucks. We don’t likes that. Let’s see what we can do about it…

Interestingly – solving the second problem (an enum with ‘holes’) will also provide a solution to the first one.

In C we can define and initialize an array like this:

int a[ARRAY_DIM] = {
    [3] = 33,
    [6] = 66
};

This creates the array ‘a’ and initializes the items at indexes 3 and 6 to the integers 33 and 66 respectively. These are designated initializers, available since C99, and the standard guarantees that every slot not named in the initializer list is set to 0. Whether it is a good idea to initialize arrays like this is a totally different topic. What’s important is that with this technique we can solve both of our problems: enums defined outside our own code, and enums containing holes! Knowing that the data will only be accessed with the enum as an index is a good safeguard in itself against accidentally indexing into something that is not initialized or does not exist. And of course other precautions can also be implemented to make sure this does not happen…
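
A small sketch to check this behaviour. It needs a C99 or later compiler; the value of ARRAY_DIM and the helper count_zero_slots() are assumptions made only for this example.

```c
#define ARRAY_DIM 8   /* size chosen only for this sketch */

/* Only indexes 3 and 6 are named; the remaining slots are zero. */
static const int a[ARRAY_DIM] = {
    [3] = 33,
    [6] = 66
};

/* Counts the elements of 'a' that are equal to zero. */
static int count_zero_slots(void)
{
    int zeros = 0;
    for (int i = 0; i < ARRAY_DIM; i++) {
        if (a[i] == 0) {
            zeros++;
        }
    }
    return zeros;
}
```

With 8 slots and 2 of them set explicitly, count_zero_slots() returns 6.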

With the above in mind we can re-define the ADDITEM macro in code.c as follows:
#define ADDITEM( _etype, _msec, _comment ) [_etype]={ #_etype, _msec }

This guarantees that the struct corresponding to a given enum value will ALWAYS appear at the proper index inside the array, no matter in which order the ADDITEM lines are listed.
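
Here is a single-file sketch of this variant, assuming a C99 compiler. The EXT_* enum pretends to be the third-party one, EXT_LIST stands in for def.h, and ext_name() is a hypothetical helper; none of these names come from the sample archive.

```c
/* Pretend this enum comes from a header we cannot edit.
   It has holes at 1, 3 and 4. */
enum { EXT_IDLE = 0, EXT_BUSY = 2, EXT_FAILED = 5, EXT_MAX };

typedef struct
{
    char name[30];
    int timeout_ms;
} ext_def_t;

/* Stand-in for def.h, deliberately listed out of order. */
#define EXT_LIST \
    ADDITEM( EXT_FAILED, 3000, "Operation failed" ), \
    ADDITEM( EXT_IDLE,   100,  "Nothing to do" ), \
    ADDITEM( EXT_BUSY,   500,  "Work in progress" ),

/* The designator [_etype] puts each entry at its own enum index. */
#define ADDITEM( _etype, _msec, _comment ) [_etype] = { #_etype, _msec }
static const ext_def_t ext_defs[EXT_MAX] = { EXT_LIST };
#undef ADDITEM

/* Holes are zero-initialized, so an empty name marks an unused slot. */
static const char *ext_name(int e)
{
    if (e < 0 || e >= EXT_MAX || ext_defs[e].name[0] == '\0') {
        return "UNKNOWN";
    }
    return ext_defs[e].name;
}
```

The empty-name check is one of the extra precautions mentioned above: asking for a value that falls into a hole gives "UNKNOWN" instead of an empty string.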

Of course one drawback of this solution in the case of enums with ‘holes’ (and one which cannot be overcome, or at least I could not think of a way to overcome it) is that if your enum starts at, e.g., 100, you will have an array of structs where the first 100 items are a useless waste of memory…
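
For what it’s worth, one possible way to reduce the waste when the lowest enum value is known is to subtract it inside the designator. This is only a sketch under that assumption and is not part of the original solution: the E_* values, E_LIST and e_lookup() are hypothetical, and holes in the middle of the enum would still cost a slot each.

```c
#include <stddef.h>

/* Pretend this is the externally defined enum that starts at 100. */
enum { E_FIRST = 100, E_SECOND, E_THIRD, E_LAST };

typedef struct
{
    char name[30];
    int timeout_ms;
} e_def_t;

/* Stand-in for def.h. */
#define E_LIST \
    ADDITEM( E_THIRD,  300, "third" ), \
    ADDITEM( E_FIRST,  100, "first" ), \
    ADDITEM( E_SECOND, 200, "second" ),

/* Shift every index down by the lowest value, so slot 0 is E_FIRST. */
#define ADDITEM( _etype, _msec, _comment ) \
    [(_etype) - E_FIRST] = { #_etype, _msec }
static const e_def_t e_defs[E_LAST - E_FIRST] = { E_LIST };
#undef ADDITEM

/* All access must go through the same offset; NULL means out of range. */
static const e_def_t *e_lookup(int e)
{
    if (e < E_FIRST || e >= E_LAST) {
        return NULL;
    }
    return &e_defs[e - E_FIRST];
}
```

The price is that the array can no longer be indexed directly with the enum value, so the offset has to be hidden behind an accessor like e_lookup().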

Oh, well – IMHO there’s not too many occasions when enums are defined to start at some arbitrary number and the advantages in my opinion definitely outweigh this remote possible flaw.

———

Sample code: map-enum.tar.gz

5 thoughts on “How to map enum to strings in C”

  1. … oh and one more thing – using the above you can also easily define multi-dimensional arrays of data! 🙂 Imagine that you keep all the strings in your multi-lingual application (e.g. menus, labels, etc) as a 2-d array of UTF8 strings and you index into it like this:

    str = strings[LANGUAGE][ID];

    To use the above technique to define your arrays do this:
    ...
    #define ADDITEM( lang, id, str ) [lang][id]=str
    char * strings[MAX_LANG][MAX_ID]={
        #include "names_en"
        #include "names_de"
        #include "names_fr"
        #include "names_ru"
    };

     
    Then in the language def files (names_en, names_de, etc) you define your enum-based strings. For example the names_en file would look like this:

    ADDITEM(LANG_EN, NM_TITLE_MAINMENU, "Main menu"),
    ADDITEM(LANG_EN, NM_LABEL_SEARCH, "Search"),
    ...

     
    Similarly, the names_de file would look like this:

    ADDITEM(LANG_DE, NM_TITLE_MAINMENU, "Hauptmenu"),
    ADDITEM(LANG_DE, NM_LABEL_SEARCH, "Suchen"),
    ...

     
    I don’t know German so the above may be wrong or sound funny, I don’t know… but you get the idea. 🙂


  3. An output from `gcc -E code.c` would be helpful.

    Also, if you’re just mapping to integers, as you are with ms here, why not just generate `case _etype: return _msec;` or something like that?

    • output from `gcc -E code.c` would be helpful

      It would, now, wouldn’t it!? 🙂 I attached a tar archive at the end of the post. It has example code and the preprocessor output as you requested.

      why not just generate `case _etype: return _msec;`

      If you read the post from the beginning you will see that the first 2-3 paragraphs explain why this is not the preferred solution.
