Saturday, 31 December 2011

Error defaults

The other day, I wrote a function which threw an exception when it failed to convert an ASCII string into an enumerator:

type from_ascii(const char* c)
{
    for (int i = 0; i != sizeof(names)/sizeof(names[0]); ++i)
        if (!strcmp(c, names[i]))
            return (type)i;

    throw conversion_error_exception();
}

Arguably, the use of an exception here is slightly gratuitous. It's how C# does things, and it's not wrong, but it's not usually the best API when we come to use it. And of course there are performance concerns.

You usually see two solutions to this kind of problem, neither of which is particularly appealing to me.

First is the 'sentinel value' solution:

enum type
{
    Left,
    Right,
    Up,
    Down,

    Unknown
};

type from_ascii(const char* c)
{
    for (int i = 0; i != sizeof(names)/sizeof(names[0]); ++i)
        if (!strcmp(c, names[i]))
            return (type)i;

    return Unknown;
}

The Unknown state is somewhat untidy, though it is fine for enums with limited scope within a program, where the author of the enum is likely to be the only one using it and knows every context in which an Unknown enumerator needs to be handled. Additionally, there is likely to be no overhead, assuming that the addition of the extra state doesn't require the enum's underlying type to change.

The other popular solution is the pass-by-reference method:

bool from_ascii(const char* c, type& result)
{
    for (int i = 0; i != sizeof(names)/sizeof(names[0]); ++i)
        if (!strcmp(c, names[i]))
        {
            result = (type)i;
            return true;
        }

    return false;
}

Oddly enough, C# also supports this method. However, it's a bad method as you have lost referential transparency and strict value semantics.
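To make the complaint concrete, here is a rough sketch of what a call site tends to look like with the pass-by-reference version. The names array and the parse_or_left caller are my own stand-ins for illustration, not code from a real program:

```cpp
#include <cstddef>   // std::size_t
#include <cstring>   // std::strcmp

enum type { Left, Right, Up, Down };

const char* const names[] = { "Left", "Right", "Up", "Down" };

// Writes the matching enumerator to result and returns true,
// or leaves result untouched and returns false.
inline bool from_ascii(const char* c, type& result)
{
    for (std::size_t i = 0; i != sizeof(names)/sizeof(names[0]); ++i)
        if (!std::strcmp(c, names[i]))
        {
            result = static_cast<type>(i);
            return true;
        }

    return false;
}

// Hypothetical caller: the result needs a named, non-const variable
// that is declared before the call and starts out uninitialised.
inline type parse_or_left(const char* text)
{
    type result;
    if (!from_ascii(text, result))
        result = Left;
    return result;
}
```

The conversion can no longer appear directly inside an expression, and the variable receiving the result cannot be const.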

Here is another solution. Quite often, I find myself just defaulting an enum variable to a particular state if the conversion fails. So why not just make that part of the function?

type from_ascii(const char* c, type resultOnFailure)
{
    for (int i = 0; i != sizeof(names)/sizeof(names[0]); ++i)
        if (!strcmp(c, names[i]))
            return (type)i;

    return resultOnFailure;
}
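As a minimal sketch of how this reads at the call site, assuming the Direction namespace from the earlier posts (read_direction_setting is a made-up caller, purely for illustration):

```cpp
#include <cstddef>   // std::size_t
#include <cstring>   // std::strcmp

namespace Direction
{
    enum type { Left, Right, Up, Down };

    const char* const names[] = { "Left", "Right", "Up", "Down" };

    // Returns the enumerator named by c, or resultOnFailure if none matches.
    inline type from_ascii(const char* c, type resultOnFailure)
    {
        for (std::size_t i = 0; i != sizeof(names)/sizeof(names[0]); ++i)
            if (!std::strcmp(c, names[i]))
                return static_cast<type>(i);

        return resultOnFailure;
    }
}

// Hypothetical caller: a missing or misspelt setting falls back to Up.
inline Direction::type read_direction_setting(const char* text)
{
    return Direction::from_ascii(text, Direction::Up);
}
```

The result can initialise a const variable or be used inline, and the fallback is visible right where the conversion happens.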

This doesn't solve every problem though. Another technique coming soon.

Friday, 30 December 2011

decltype on Visual Studio

Visual Studio 2010 had one of the first implementations of C++11's decltype. As such, it has some deficiencies.

One of the deficiencies occurs when you try to get the type of the address of a function template specialisation:

template <typename T>
void f();

decltype(&f<int>) ptr; // error - incorrect argument to 'decltype'

Like every programming problem, this can be worked around with an extra layer of indirection:

template <typename T>
T identity(T);

decltype(identity(&f<int>)) ptr; // ptr has type void(*)()

Hopefully this will be sufficient to get you by.
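If you want to convince yourself that the workaround yields the right type, something along these lines should do it on a conforming compiler. The f_int_ptr typedef and get_f_int function are names I've invented for the check, and f is given an empty body here so that its address can be used:

```cpp
#include <type_traits>   // std::is_same

template <typename T>
void f() {}

// Declared but never defined: decltype does not evaluate the call,
// it only asks what type the call would return.
template <typename T>
T identity(T);

typedef decltype(identity(&f<int>)) f_int_ptr;

// Compile-time check that the indirection gives a plain function pointer.
static_assert(std::is_same<f_int_ptr, void(*)()>::value,
              "expected a pointer to void()");

// The typedef can then be used like any other function pointer type.
inline f_int_ptr get_f_int()
{
    return &f<int>;
}
```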

Thursday, 29 December 2011

Namespaced enums (2)

A short follow-on to yesterday's post.

One additional advantage of the namespaced version over enum classes is that you have a namespace for populating with useful enum-related functions, for example:

namespace Direction
{
    enum type
    {
        Left,
        Right,
        Up,
        Down
    };

    const char* names[] = { "Left", "Right", "Up", "Down" };

    inline const char* to_ascii(type e)
    {
        return names[e];
    }

    type from_ascii(const char* c)
    {
        for (int i = 0; i != sizeof(names)/sizeof(names[0]); ++i)
            if (!strcmp(c, names[i]))
                return (type)i;

        throw conversion_error_exception();
    }
}

auto e  = Direction::Up;
auto a  = to_ascii(e);
auto e2 = Direction::from_ascii(a);

Wednesday, 28 December 2011

Namespaced enums

One of the problems with enums is that the enumerators aren't scoped to the enum, but rather are in the same scope as the enum itself. That is:

enum Direction
{
    Left,
    Right,
    Up,
    Down
};

// Direction, Left, Right, Up and Down all defined here

This legacy behaviour from C causes problems when other similarly-named enumerators occur in the same scope:

enum QuestionResult
{
    Wrong,
    Right // error - Right already defined
};

People often add prefixes to limit this kind of collision:

enum Direction
{
    Dir_Left,
    Dir_Right,
    Dir_Up,
    Dir_Down
};

enum QuestionResult
{
    QR_Wrong,
    QR_Right
};

But this is C++! We have a solution for this type of problem, and it doesn't involve adding warts to our identifiers. We have namespaces:

namespace Direction
{
    enum type
    {
        Left,
        Right,
        Up,
        Down
    };
}

namespace QuestionResult
{
    enum type
    {
        Wrong,
        Right
    };
}

That way, you can use them like any namespaced entity:

obj.SetDirection(Direction::Right);
question.SetResult(QuestionResult::Right);

I've called my enums 'type' to match similar C++ constructs, like metafunctions, but all that really matters is being consistent within a program.

And this won't be an issue when C++11's enum class feature becomes sufficiently supported. Assuming no more portability disasters like nullptr.
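For comparison, here is roughly what the same pair of enums looks like with enum class, on a compiler that supports it. The is_horizontal helper is just something I've made up to show the enumerators in use:

```cpp
// Enumerators are scoped to their enum, so the two Rights don't collide.
enum class Direction
{
    Left,
    Right,
    Up,
    Down
};

enum class QuestionResult
{
    Wrong,
    Right
};

// Hypothetical helper: enumerators must be qualified with the enum's name.
inline bool is_horizontal(Direction d)
{
    return d == Direction::Left || d == Direction::Right;
}
```

Call sites look the same as in the namespaced version, but the enumerators no longer convert implicitly to int.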

Tuesday, 27 December 2011

NULL

Relating to yesterday's post, it also bothers me that I have to include a header just to get NULL. Actually, it doesn't bother me at all, because I never use it. Literal 0 does everything I need and more: it doesn't require a #include and is 3 fewer characters to type.

NULL is simply #defined to 0, at least in C++. It cannot be defined to (void*)0 as it sometimes is in C, because that would break C++'s stricter conversion rules:

int* p = NULL; // would be an error if NULL were (void*)0 - cannot implicitly convert void* to int*

I have seen claims that using NULL rather than 0 is a useful piece of documentation that shows readers of the code that a function argument is pointer-like rather than numerical, but I've always found it to be misleading at best. I've seen plenty of places (typically in Win32 code) where NULL is used where a numeric value is needed, and of course the compiler accepts it without question.

It can even be dangerous as calling an overload may not give you the result you expected:

void f(int); // 1
void f(void*); // 2

f(NULL); // Calls 1

Similarly when instantiating a template (which has a direct impact on perfect forwarding):

template <typename T>
void f(T);

f(NULL); // instantiates f<int>

Some implementations (e.g. GCC) will define NULL to a compiler-specific construct which will warn when using it in a non-pointer context, but it's not something you can rely on if you're working on cross-platform code. Not only that, but it will still have the same semantics as a literal 0, so you'll still get the unexpected overload and template instantiation semantics mentioned above.

C++11 has the nullptr keyword, which theoretically should solve all of these problems. NULL can't be redefined as nullptr, because that would break programs which are using NULL incorrectly, so existing code must be changed to use it explicitly and new code must adopt it. This puts a further nail in the coffin of NULL.
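Here is a small sketch of the difference on a compiler with native nullptr support. The which and deduces_nullptr_t functions are my own names, chosen just to make the overload and the deduced type observable:

```cpp
#include <cstddef>       // std::nullptr_t
#include <type_traits>   // std::is_same

inline int which(int)   { return 1; }
inline int which(void*) { return 2; }

// Reports whether the template parameter was deduced as std::nullptr_t.
template <typename T>
bool deduces_nullptr_t(T)
{
    return std::is_same<T, std::nullptr_t>::value;
}

// which(0)                   selects the int overload
// which(nullptr)             selects the void* overload
// deduces_nullptr_t(0)       deduces T as int
// deduces_nullptr_t(nullptr) deduces T as std::nullptr_t
```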

As for nullptr, all is fine unless you're writing code which is designed to be target- and compiler-independent. Why? Well, nullptr is also the name for C++/CLI's null reference literal, and it's not compatible with C++'s version. When it came to resolving the ambiguity between them when compiling conforming C++ code with the /clr switch on, Microsoft made the decision to side with their own technology rather than the C++ international standard:

http://channel9.msdn.com/Shows/Going+Deep/Stephan-T-Lavavej-Everything-you-ever-wanted-to-know-about-nullptr

They provide __nullptr instead, which will always be a 'native' nullptr, but of course this is a Visual C++ extension. It won't work when you are writing code which doesn't want to care about which compiler is building it. If that is your goal, you'll need to encapsulate it in a macro which expands out to __nullptr or nullptr depending on your compiler. Which is worse than the NULL macro!
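Such a macro might look something like the sketch below. NATIVE_NULLPTR is a name I've made up, and I'm assuming that checking _MSC_VER together with __cplusplus_cli is an adequate way to detect a /clr build; treat the detection as illustrative rather than exhaustive:

```cpp
// NATIVE_NULLPTR is a hypothetical name for the wrapper macro.
// Assumption: Visual C++ defines __cplusplus_cli when compiling with /clr,
// which is where plain nullptr means the C++/CLI null reference.
#if defined(_MSC_VER) && defined(__cplusplus_cli)
    #define NATIVE_NULLPTR __nullptr
#else
    #define NATIVE_NULLPTR nullptr
#endif

// Example use: compares a native pointer against the native null pointer.
inline bool is_null(const void* p)
{
    return p == NATIVE_NULLPTR;
}
```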

So it's best to simply get used to the fact that literal 0 can mean a null pointer. It still fails in the context of perfect forwarding, but that can be fixed by an explicit cast. Otherwise, literal 0 is portable, doesn't require pulling in an unnecessary header and doesn't result in surprising behaviour with respect to overloads or template instantiations.
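The perfect forwarding case looks roughly like this. Both is_null and forward_is_null are names invented for the sketch; the point is only that the forwarding template deduces int from a literal 0, and that an explicit cast restores the pointer type:

```cpp
#include <utility>   // std::forward

inline bool is_null(int* p)
{
    return p == 0;
}

// Forwards its argument on to is_null, preserving its deduced type.
template <typename T>
bool forward_is_null(T&& arg)
{
    return is_null(std::forward<T>(arg));
}

// forward_is_null(0);   // error - T is deduced as int, which won't convert to int*

// With an explicit cast, T is deduced as int* and the call compiles.
inline bool demo()
{
    return forward_is_null(static_cast<int*>(0));
}
```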

It's going to be with us for a while yet.

Monday, 26 December 2011

size_t

It bothers me that size_t, the type of the result of sizeof, is in a header. Why should I have to pull in <stddef.h> or <cstddef> just to get the name of something which the compiler knows intrinsically?

Fortunately, C++11 provides a solution:

decltype(sizeof 0)

The expression in the sizeof is immaterial; I chose 0 because it was short and it already has special meaning in the language, so it doesn't look too out of place.
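A quick sketch of it in use, with a made-up alias name (size_type) and a made-up function (count_chars). The headers are included here only so that the static_assert can confirm the alias really is std::size_t; real code using the trick wouldn't need them:

```cpp
#include <cstddef>       // std::size_t, only needed for the check below
#include <type_traits>   // std::is_same, only needed for the check below

// The type of any sizeof expression, named without a header.
typedef decltype(sizeof 0) size_type;

static_assert(std::is_same<size_type, std::size_t>::value,
              "decltype(sizeof 0) should be std::size_t");

// Counts the characters before the terminating null.
inline size_type count_chars(const char* s)
{
    size_type n = 0;
    while (s[n] != '\0')
        ++n;
    return n;
}
```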

Sunday, 25 December 2011

Christmas

_oTheWeatherOutside = frightful;
_fire = soDelightful;
if (_placesToGo.empty())
    for (int i = 0; i != 3; ++i)
        LetItSnow();