Glider
"In het verleden behaalde resultaten bieden geen garanties voor de toekomst"
About this blog

These are the ramblings of Matthijs Kooijman, concerning the software he hacks on, hobbies he has and occasionally his personal life.

Most content on this site is licensed under the WTFPL, version 2 (details).

Questions? Praise? Blame? Feel free to contact me.

My old blog (pre-2006) is also still available.

January
Sun Mon Tue Wed Thu Fri Sat
 
14
     
Tag Cloud
Powered by Blosxom &Perl onion
(With plugins: config, extensionless, hide, Markdown, macros, breadcrumbs, calendar, directorybrowse, entries_index, feedback, flavourdir, include, interpolate_fancy, listplugins, menu, pagetype, preview, seemore, storynum, storytitle, writeback_recent, moreentries)
Valid XHTML 1.0 Strict & CSS
Calculating a constant path basename at compiletime in C++

Tags: Nerd, Softare, Optimization, C++, Templates

In some Arduino / C++ project, I was using a custom assert() macro, that, if the assertion would fail show an error message, along with the current filename and line number. The filename was automatically retrieved using the __FILE__ macro. However, this macro returns a full path, while we only had little room to show it, so we wanted to show the filename only.

Until now, we've been storing the full filename, and when an assert was triggered we would use the strrchr function to chop off all but the last part of the filename (commonly called the "basename") and display only that. This works just fine, but it is a waste of flash memory, storing all these (mostly identical) paths. Additionally, when an assertion fails, you want to get a message out ASAP, since who knows what state your program is in.

Neither of these is really a showstopper for this particular project, but I suspected there would be some way to use C++ constexpr functions and templates to force the compiler to handle this at compiletime, and only store the basename instead of the full path. This week, I took up the challenge and made something that works, though it is not completely pretty yet.

Working out where the path ends and the basename starts is fairly easy using something like strrchr. Of course, that's a runtime version, but it is easy to do a constexpr version by implementing it recursively, which allows the compiler to evaluate these functions at compiletime.

For example, here are constexpr versions of strrchrnul(), basename() and strlen():

/**
 * Return the last occurence of c in the given string, or a pointer to
 * the trailing '\0' if the character does not occur. This should behave
 * just like the regular strrchrnul function.
 */
constexpr const char *static_strrchrnul(const char *s, char c) {
  /* C++14 version
    if (*s == '\0')
      return s;
    const char *rest = static_strrchr(s + 1, c);
    if (*rest == '\0' && *s == c)
      return s;
    return rest;
  */

  // Note that we cannot implement this while returning nullptr when the
  // char is not found, since looking at (possibly offsetted) pointer
  // values is not allowed in constexpr (not even to check for
  // null/non-null).
  return *s == '\0'
      ? s
      : (*static_strrchrnul(s + 1, c) == '\0' && *s == c)
        ? s
        : static_strrchrnul(s + 1, c);
}

/**
 * Return one past the last separator in the given path, or the start of
 * the path if it contains no separator.
 * Unlike the regular basename, this does not handle trailing separators
 * specially (so it returns an empty string if the path ends in a
 * separator).
 */
constexpr const char *static_basename(const char *path) {
  return (*static_strrchrnul(path, '/') != '\0'
      ? static_strrchrnul(path, '/') + 1
      : path
     );
}

/** Return the length of the given string */
constexpr size_t static_strlen(const char *str) {
  return *str == '\0' ? 0 : static_strlen(str + 1) + 1;
}

So, to get the basename of the current filename, you can now write:

constexpr const char *b = static_basename(__FILE__);

However, that just gives us a pointer halfway into the full string literal. In practice, this means the full string literal will be included in the link, even though only a part of it is referenced, which voids the space savings we're hoping for (confirmed on avr-gcc 4.9.2, but I do not expect newer compiler version to be smarter about this, since the linker is involved).

To solve that, we need to create a new char array variable that contains just the part of the string that we really need. As happens more often when I look into complex C++ problems, I came across a post by Andrzej KrzemieĊ„ski, which shows a technique to concatenate two constexpr strings at compiletime (his blog has a lot of great posts on similar advanced C++ topics, a recommended read!). For this, he has a similar problem: He needs to define a new variable that contains the concatenation of two constexpr strings.

For this, he uses some smart tricks using parameter packs (variadic template arguments), which allows to declare an array and set its initial value using pointer references (e.g. char foo[] = {ptr[0], ptr[1], ...}). One caveat is that the length of the resulting string is part of its type, so must be specified using a template argument. In the concatenation case, this can be easily derived from the types of the strings to concat, so that gives nice and clean code.

In my case, the length of the resulting string depends on the contents of the string itself, which is more tricky. There is no way (that I'm aware of, suggestions are welcome!) to deduce a template variable based on the value of an non-template argument automatically. What you can do, is use constexpr functions to calculate the length of the resulting string, and explicitly pass that length as a template argument. Since you also need to pass the contents of the new string as a normal argument (since template parameters cannot be arbitrary pointer-to-strings, only addresses of variables with external linkage), this introduces a bit of duplication.

Applied to this example, this would look like this:

constexpr char *basename_ptr = static_basename(__FILE__);
constexpr auto basename = array_string<static_strlen(basename_ptr)>(basename_ptr); \

This uses the static_string library published along with the above blogpost. For this example to work, you will need some changes to the static_string class (to make it accept regular char* as well), see this pull request for the version I used.

The resulting basename variable is an array_string object, which contains just a char array containing the resulting string. You can use array indexing on it directly to access variables, implicitly convert to const char* or explicitly convert using basename.c_str().

So, this solves my requirement pretty neatly (saving a lot of flash space!). It would be even nicer if I did not need to repeat the basename_ptr above, or could move the duplication into a helper class or function, but that does not seem to be possible.

 
0 comments -:- permalink -:- 21:33
Copyright by Matthijs Kooijman