[EN] Generic data structures in C

Template meta programming has been a mainstay in C++ for years. It provides a system for compile-time polymorphism and generic programming with capabilities not present in many other languages, enriching the language and bringing new opportunities for expressiveness to the table.

Why doesn't the C language have such capabilities? Well, besides the fact that C++'s templates were added to the language years after C and C++ stopped resembling each other (at least in the common coding styles of each language), the reality is that it does. Sort of. Let me show you what I mean.

The C Preprocessor

Enter, the C preprocessor. For many, a glorified find-and-replace engine; but in reality it hides great opportunities for those looking to expand their frontiers. Let's go over the basics first, and then we'll take a look at the true power of the C preprocessor.

Object-like macros

This is how the GNU Foundation refers to the simplest form of define directive[1]. These consist of a #define directive followed by a name and a macro body. Here is an example:

#define PI 3.141592

In this case, we have defined the PI symbol to be replaced with the 3.141592 literal during compilation.

Function-like macros

These have both a name and a list of arguments. When used in code they look very much like a function. Once again, here is an example:

#define max(a,b) a > b ? a : b

This adds a layer of abstraction over a ternary operator to use as a max function.

Okay, so what?

That by itself might not seem like much but that changes when you realize that the arguments don't need to be values or variables. Particularly, we are interested in the fact that we can pass typenames as arguments and nothings goes particularly wrong.

for example, we can define ourselves a shorthand way of declaring a variable on the heap.
#define new(type, name, value) type *name = malloc(sizeof(type)); *name = value
this might seem like a lot, but let's go over what's going on here using an example.
new(int, myInt, 5);
when the preprocessor finds this expression, it replaces it following the #define'd function-like macro. This will be the result:
int *myInt = malloc(sizeof(int)); *myInt = 5;

Template-like Macros

If we take this concept a bit further we arrive at what i like to call template-like macros. These macros contain struct and function definitions that create an interface for interacting with data.

To start off i would like to show a struct that could be used to implement a dynamic array of int.

struct array {
    int *data;
    int capacity, index;
};

very simple stuff. if we wanted to make this a generic container in C++ we would just replace int* with T* and add a template<typename T> at the beginning and be done with it.

Achieving the same goal in C is not much different. Here is how i would go about doing that:
#define arrayTemplate(T) struct T ## Array{ T* data; int capacity; int index; };
note: the operator ## in a macro means concatenation (int ## Array expands to intArray)

Now, since we are going to be using the expression struct T ## Array many times i would like to make an extra define to make our code a bit more clean: #define array(T) struct T ## Array

Besides the actual struct, it would be good if we could define a constructor of sorts. Turns out, this can be done in a very similar fashion, we are gonna add our constructor definition to the same define where the generic struct was defined. This means that we are only allowing declarations of functions that have a corresponding struct, and viceversa. All in all, our code will look like

#define array(T) struct T ## Array
#define arrayTemplate(T) array(T){\
    T* data;\
    int capacity;\
    int index;\
};\
array(T) T ## ArrayConstruct(int length){\
    array(T) this;\
    this.index = 0;\
    this.capacity = length;\
    this.data = malloc(length * sizeof(T));\
    if(!this.data){\
        exit(1);\
    }\
    return this;\
}
note: the backslashes allow for multi-line macros. You can think of them as escaping the line ending character

I am going to add one more macro to wrap calls to T ## arrayConstruct expression, this will allow for neater code.

#define newArray(T, length) T ## ArrayConstruct(length)

And now, we reap the fruits of our labor. A simple C program, written in an unusual style.

#include "genericarray.h"
arrayTemplate(int);

int main() {
    array(int) myArray = newArray(int, 3);
    return 0;
}

I hope you have enjoyed this post and will use this technique, even if only to have some fun seeing what the preprocessor is capable of.

If you implement any data structure or otherwise interesting program using this style, please do throw me a comment right below, and show me what you have made.

Here is a gist that implements a couple of generic data structures in C, in case you were interested in seeing the concepts explained here applied. The style used is a bit different: function(type)(args...) is used instead of function(type, args...), but the same concepts still apply.

If you find any mistakes in the code or in my grammar, or have a suggestion to improve this and future posts, please let me know.

About me

My name is Sebastian. I'm a competitive programmer (ICPC World Finalist) and coach. I also enjoy learning about programming language theory and computer graphics.

Social links:

Comments

Popular posts from this blog

[EN] Writing a Compiler is Surprisingly Easy (part 1)

[EN] Representing Programs Within Programs is Surprisingly Easy

[EN] Writing a Compiler is Surprisingly Easy (part 2)