Friday, 2 April 2021

Code : Simple RTTI


With C++17 and support for CRTP, we have a few extra tools to work with when creating helper objects. Recently I was thinking about how kernel-mode code even today doesn't support dynamic_cast: the /kernel switch disallows it. There might be good reason for that; after all, do we really need such advanced C++ features in the kernel? Setting aside the argument of whether it is required for the gurus, I will focus on how to get it done. Whether it should be done is not the subject of this article at all.

The core reason RTTI doesn't work in the kernel is that the runtime libraries were never ported to it. Many people have, however, written their own versions of the runtime and added similar support with the goal of supporting C++ in kernel space. These efforts vary in complexity and in the features they offer, and they come with their own pitfalls. Out of the box, RTTI requires runtime support that isn’t available in the kernel, so there’s no way to simply enable it.

If you can’t (or are unwilling to) replace dynamic_cast usage with the visitor pattern, std::variant (which would need to be a custom implementation, since the STL isn’t available), or something similar, you could implement custom dynamic casting functionality.
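
For readers who have not used the visitor alternative mentioned above, here is a minimal sketch of how it recovers a concrete type without dynamic_cast. All names in it (shape, circle, square, shape_visitor, side_or_radius, measure) are hypothetical and chosen only for illustration; it assumes nothing beyond plain virtual functions, so it needs no runtime library support.

```cpp
#include <cassert>

struct circle;
struct square;

// Hypothetical visitor interface: one overload per concrete type.
struct shape_visitor
{
    virtual void visit(const circle&) = 0;
    virtual void visit(const square&) = 0;
};

struct shape
{
    virtual void accept(shape_visitor& visitor) const = 0;
};

struct circle : shape
{
    int radius = 2;
    void accept(shape_visitor& visitor) const override { visitor.visit(*this); }
};

struct square : shape
{
    int side = 3;
    void accept(shape_visitor& visitor) const override { visitor.visit(*this); }
};

// The concrete type is recovered through double dispatch, not a cast.
struct side_or_radius : shape_visitor
{
    int result = 0;
    void visit(const circle& c) override { result = c.radius; }
    void visit(const square& s) override { result = s.side; }
};

int measure(const shape& s)
{
    side_or_radius visitor;
    s.accept(visitor);
    return visitor.result;
}

void check_visitor()
{
    assert(measure(circle{}) == 2);
    assert(measure(square{}) == 3);
}
```

The trade-off is that every new concrete type requires a new visit() overload in the visitor interface, which is part of why a cast-style helper can still be attractive.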

Before we start: this is not the first time someone has attempted to do this. There are several examples on the internet that do it in other ways; some even emulate the COM model by adding GUIDs to each interface and then using the GUIDs to identify and cast pointers to the correct interfaces.

Here are some links from OSR discussing the problem in the past:
These are the links to some of the discussion forums and articles I have come across in the past which discuss how to add some form of C++ run time support to the Kernel and contract programming in general.
Enough has been said; let's get to the code...

// A unique runtime identifier for a type.
struct type_id
{
    //! Retrieve the `type_id` for a type.
    template <typename T>
    static type_id get()
    {
        return { &impl<T>::id }; // impl<T>::id will have a unique address for every T.
    }

    //! Determine if `T`'s `type_id` matches this `type_id` instance.
    template <typename T>
    bool is(T*) const
    {
        return get<T>().id == id;
    }

    const void* const id;

private:
    template <typename T> struct impl { static const T* const id; };
    template <typename T> struct impl<const T> : impl<T> {};
    template <typename T> struct impl<T*> : impl<T> {};
};

template <typename T> const T* const type_id::impl<T>::id = nullptr;
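
As a quick sanity check of the idea, the sketch below repeats the definition above (only so that the snippet compiles on its own) and verifies two properties: the const and pointer forms of a type all map to the same id, and two different types get different ids. The types apple and pear and the helper function names are hypothetical.

```cpp
#include <cassert>

// Same definition as above, repeated so this snippet is self-contained.
struct type_id
{
    template <typename T>
    static type_id get() { return { &impl<T>::id }; }

    template <typename T>
    bool is(T*) const { return get<T>().id == id; }

    const void* const id;

private:
    template <typename T> struct impl { static const T* const id; };
    template <typename T> struct impl<const T> : impl<T> {};
    template <typename T> struct impl<T*> : impl<T> {};
};

template <typename T> const T* const type_id::impl<T>::id = nullptr;

struct apple {}; // hypothetical types used only for this check
struct pear {};

// T, const T, T* and const T* all resolve to impl<T>::id.
bool same_id_for_qualified_forms()
{
    const void* plain = type_id::get<apple>().id;
    return plain == type_id::get<const apple>().id
        && plain == type_id::get<apple*>().id
        && plain == type_id::get<const apple*>().id;
}

// Each distinct T instantiates its own static, hence its own address.
bool distinct_ids_for_distinct_types()
{
    return type_id::get<apple>().id != type_id::get<pear>().id;
}

void check_type_ids()
{
    assert(same_id_for_qualified_forms());
    assert(distinct_ids_for_distinct_types());
}
```

The second property relies on the toolchain keeping the per-type statics at separate addresses, which is worth re-checking whenever build options change.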


With this defined, one can write a casting function that will be able to dynamically cast from a base class pointer to a derived class pointer.

// Dynamically cast from a base class pointer to a derived class pointer.
// The class hierarchy must define a virtual ‘bool is_type(type_id) const’
// function that checks whether the provided `type_id` matches the `type_id`
// of each type in the class hierarchy.
// \tparam To The pointer type of the target class. Can be const or non-const.
// \param from A pointer to the base class.
// \return A pointer to the derived instance if the cast succeeds.
//         `nullptr` if the pointer does not point to an instance of
//          the derived class.

template <typename To, typename From>
To runtime_cast(From* from)
{
    if ((from != nullptr) && from->is_type(type_id::get<To>()))
    {
        return static_cast<To>(from);
    }
    else
    {
        return nullptr;
    }
}

Of course, the class hierarchy must define a virtual bool is_type(type_id) const function that checks whether the provided type_id matches the type_id of each type in the class hierarchy.

struct base
{
    virtual bool is_type(type_id type_id) const
    {
        return type_id.is(this);
    }
};

The derived classes can be defined as follows:

struct derived1 : base
{
    bool is_type(type_id type_id) const override
    {
        return type_id.is(this) || base::is_type(type_id);
    }
};

struct derived2 : base
{
    bool is_type(type_id type_id) const override
    {
        return type_id.is(this) || base::is_type(type_id);
    }
};

Now the following code will test the concept.

auto instance = derived1{};
auto* b = static_cast<base*>(&instance);
runtime_cast<const derived1*>(b); // Succeeds      -- is an instance of derived1.
runtime_cast<const derived2*>(b); // Returns null  -- not an instance of derived2.

What about multiple inheritance?

struct base2
{
    virtual bool is_type(type_id type_id) const
    {
        return type_id.is(this);
    }
};

struct derived3 : derived1, base2
{
    bool is_type(type_id type_id) const override
    {
        return type_id.is(this) || derived1::is_type(type_id) || base2::is_type(type_id);
    }
};

Now the following code will test the concept.

auto instance = derived3{};
auto* b2 = static_cast<base2*>(&instance);
runtime_cast<const derived3*>(b2); // Succeeds -- is an instance of derived3.

auto* b = static_cast<base*>(&instance);
runtime_cast<derived1*>(b); // Succeeds -- inherits from derived1.
runtime_cast<derived2*>(b); // Returns null -- not an instance of derived2.

The crux of the idea is quite simple. We need some sort of unique identifier for each type. You could use __declspec(uuid("…")) to give each struct a uuid and change type_id::get() to return { __uuidof(T) };. However, this requires more work (adding a uuid to every class) and introduces the possibility of accidental reuse of uuids.
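
To illustrate the accidental-reuse hazard without relying on the MSVC-specific __declspec(uuid) and __uuidof, here is a rough portable stand-in that assumes each class carries a hand-assigned integer tag. Everything in it (tagged_type_id, class_tag, widget, gadget, copycat) is hypothetical and is not the uuid mechanism itself; it only shows what happens when two classes end up with the same manually chosen identifier.

```cpp
#include <cassert>

// Hypothetical identifier that stores a hand-assigned tag instead of an address.
struct tagged_type_id
{
    template <typename T>
    static constexpr tagged_type_id get() { return { T::class_tag }; }

    constexpr bool operator==(tagged_type_id other) const { return tag == other.tag; }

    unsigned tag;
};

struct widget  { static constexpr unsigned class_tag = 0x1001; };
struct gadget  { static constexpr unsigned class_tag = 0x1002; };

// Copy-paste mistake: this class reuses widget's tag.
struct copycat { static constexpr unsigned class_tag = 0x1001; };

void check_tags()
{
    // Correctly assigned tags tell the types apart.
    assert(!(tagged_type_id::get<widget>() == tagged_type_id::get<gadget>()));

    // The reused tag makes two unrelated types indistinguishable,
    // and nothing in the compiler or linker reports it.
    assert(tagged_type_id::get<widget>() == tagged_type_id::get<copycat>());
}
```

With identifiers derived from addresses, this particular mistake cannot be made by hand, which is the motivation for the approach used in this article.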

Instead, we use the address of a static variable that is instantiated once per type, and the addresses of distinct objects are guaranteed to be unique. (Or are they? What if the same address is reused after being freed? A point to ponder.)

I will leave it to the implementer to decide whether to use a uuid or another type of identifier. I would rather think about ways to improve the current design. In the last implementation, having to write the virtual function in each class, and then adjust its logic to include every class in its hierarchy, is a bit annoying. Is there a better way to do it? Can we automate the boilerplate?

It seems there actually is!

The as_type function (previously is_type) can be implemented in a base class as well.

template <typename Derived>
struct runtime_castable_impl
{
    void* as_type(type_id type_id) const
    {
        auto* derived = const_cast<Derived*>(static_cast<const Derived*>(this));
        return type_id.is(derived) ? derived : nullptr;
    }
};

template <typename Derived>
struct runtime_castable_base : runtime_castable_impl<Derived>
{  
    virtual void* as_type(type_id type_id) const
    {
        return runtime_castable_impl<Derived>::as_type(type_id);
    }
};

template <typename Derived, typename... Bases>
struct runtime_castable : runtime_castable_impl<Derived>, Bases...
{
    void* as_type(type_id type_id) const override
    {
        return as_type_impl<runtime_castable_impl<Derived>, Bases...>(type_id);
    }

private:
    template <typename Base, typename... Rest>
    void* as_type_impl(type_id type_id) const
    {
        if (auto* converted = Base::as_type(type_id))
        {
            return converted;
        }
        else
        {
            if constexpr (sizeof...(Rest) != 0)
            {
                return as_type_impl<Rest...>(type_id);
            }
            else
            {
                return nullptr;
            }
        }
    }
};

Similarly, the runtime_cast function has to change to use the new as_type method.

template <typename To, typename From>
To runtime_cast(From* from)
{
    if (from != nullptr)
    {
        return static_cast<To>(from->as_type(type_id::get<To>()));
    }
    else
    {
        return nullptr;
    }
}

and our test classes also change to:

struct base : runtime_castable_base<base>
{
};

struct derived1 : runtime_castable<derived1, base>
{
};

struct derived2 : runtime_castable<derived2, base>
{
};

struct base2 : runtime_castable_base<base2>
{
};

struct derived3 : runtime_castable<derived3, derived1, base2>
{
};

This is quite an improvement. There is no longer any need to add any code to the derived classes, because the base class takes care of it using CRTP. Without CRTP, every class in the hierarchy would inherit the same helper base, and we would run into a diamond inheritance problem. All the older tests work as well.

auto instance = derived1{};
auto* b = static_cast<base*>(&instance);
runtime_cast<const derived1*>(b); // Succeeds      -- is an instance of derived1.
runtime_cast<const derived2*>(b); // Returns null  -- not an instance of derived2.


Which raises the question, why did we move from is_type to as_type? To answer this, we need an example.

#include <cassert>
#include <iostream>

class I : public runtime_castable_base<I>
{
public:
    virtual ~I() {};
    virtual void Foo() = 0;

    
};

class I2 : public runtime_castable_base<I2>
{
public:
    virtual ~I2() {};
    virtual void Bar() const = 0;
        
};

class SomeOtherI : public runtime_castable_base<SomeOtherI>
{
public:
    virtual ~SomeOtherI() {};
    virtual void BarBar() = 0;
       
};

class CImpl : public runtime_castable<CImpl, I, I2>
{
public:
    CImpl() {}
    virtual ~CImpl() {}

    // Interface I
    virtual void Foo() { std::cout << "Foo Called.\n"; };

    // Interface I2
    virtual void Bar() const { std::cout << "Bar Called.\n"; };
};

void TestMultiInheritanceWithInterface()
{

    auto instance = CImpl{};
    I* i = static_cast<I*>(&instance);

    const I2* i2 = runtime_cast<const I2*>(i); // Succeeds -- is an instance of I2.
    if (i2)
    {
        std::cout << "Success -- is an instance of I2 as well.\n";
        i2->Bar();
    }
    else
    {
        std::cout << "Fail -- should be an instance of I2.\n";
        assert(false);
    }

    const SomeOtherI* isomeother = runtime_cast<const SomeOtherI*>(i); // Returns null -- not an instance of SomeOtherI.
    if (isomeother)
    {
        std::cout << "Fail --is not an instance of SomeOtherI as well.\n";
        assert(false);
    }
    else
    {
        std::cout << "Success --is not an instance of SomeOtherI as well.\n";
        
    }
}


This code doesn't compile with the original is_type() implementation. CImpl inherits from I and I2, so it contains the I vtable pointer at offset 0 and the I2 vtable pointer at offset 8. If you have a CImpl* at address 0x1000, static_cast<I*>(cimpl) will return 0x1000 and static_cast<I2*>(cimpl) will return 0x1008.

static_cast doesn’t compile because it only supports upcasts and downcasts. So you can static_cast an I* or I2* to CImpl* or some other derived class, but you can’t cast from I* to I2* because they’re unrelated types.

If we had used reinterpret_cast instead, which is a brute-force cast, as in reinterpret_cast<I2*>(static_cast<I*>(cimpl)), we would get an invalid I2* that points to address 0x1000 instead of 0x1008. Bar() is at the same index in I2’s vtable as Foo() is in I’s, so calling Bar() on this invalid pointer would actually call I::Foo().
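
To make the pointer adjustment concrete, here is a small sketch using hypothetical stand-in types (first_iface, second_iface and concrete are not part of the code above). It assumes the concrete type is known at the cast site, which is exactly what runtime_cast avoids needing, but it shows the same kind of adjustment that as_type() achieves by returning the result of its own static_cast.

```cpp
#include <cassert>

// Hypothetical stand-ins for I, I2 and CImpl.
struct first_iface  { virtual ~first_iface() {}  virtual int foo() { return 1; } };
struct second_iface { virtual ~second_iface() {} virtual int bar() { return 2; } };
struct concrete : first_iface, second_iface {};

// Cross-cast by going down to the most derived type and back up.
// Each static_cast adjusts the address; reinterpret_cast would not.
second_iface* cross_cast_via_derived(first_iface* p)
{
    return static_cast<second_iface*>(static_cast<concrete*>(p));
}

void check_pointer_adjustment()
{
    concrete object;
    first_iface*  as_first  = &object;
    second_iface* as_second = &object;

    // The two base subobjects live at different addresses inside `object`.
    assert(static_cast<void*>(as_first) != static_cast<void*>(as_second));

    // The adjusted cross-cast lands on the second_iface subobject,
    // so the virtual call dispatches to the right function.
    second_iface* crossed = cross_cast_via_derived(as_first);
    assert(crossed == as_second);
    assert(crossed->bar() == 2);
}
```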

Hence we changed the implementation to solve the problem.

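This solution is satisfactory and works in Visual Studio. However, when I tried it in kernel mode, it didn't work and ended up calling the wrong function. After some debugging, the core problem turned out to be additional optimizations in the kernel build that treat the 'const' statics more aggressively. A likely culprit is identical COMDAT folding (the linker's /OPT:ICF option), which may assign the same address to identical read-only data, so the ids of different types are no longer unique. This is the version that failed: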

// A unique runtime identifier for a type.
struct type_id
{
    //! Retrieve the `type_id` for a type.
    template <typename T>
    static type_id get()
    {
        return { &impl<T>::id }; // impl<T>::id will have a unique address for every T.
    }

    //! Determine if `T`'s `type_id` matches this `type_id` instance.
    template <typename T>
    bool is(T*) const
    {
        return get<T>().id == id;
    }

    const void* const id;

private:
    template <typename T> struct impl { static const T* const id; };
    template <typename T> struct impl<const T> : impl<T> {};
    template <typename T> struct impl<T*> : impl<T> {};
};

template <typename T> const T* const type_id::impl<T>::id = nullptr;


Once I eliminated those const qualifiers, it worked just fine.

// A unique runtime identifier for a type.
struct type_id
{
    //! Retrieve the `type_id` for a type.
    template <typename T>
    static type_id get()
    {
        return { &impl<T>::id }; // impl<T>::id will have a unique address for every T.
    }

    //! Determine if `T`'s `type_id` matches this `type_id` instance.
    template <typename T>
    bool is(T*) const
    {
        return get<T>().id == id;
    }

    const void* const id;

private:
    template <typename T> struct impl { static T* id; };
    template <typename T> struct impl<const T> : impl<T> {};
    template <typename T> struct impl<T*> : impl<T> {};
};

template <typename T> T* type_id::impl<T>::id = nullptr; // Must match the declaration `static T* id;` above.


A careful reader might find that one issue with the new implementation is it lets you silently cast away const (e.g. runtime_cast<I2*>(const I*) will work when you should only be able to runtime_cast<const I2*>(const I*)). It can be fixed by removing that const_cast, changing the return type from void* to const void*, and adding non-const overloads of as_type() to runtime_castable_impl, runtime_castable_base, and runtime_castable:

const void* as_type(type_id type_id) const
{
    auto* derived = static_cast<const Derived*>(this);
    return type_id.is(derived) ? derived : nullptr;
}

void* as_type(type_id type_id)
{
    auto* derived = static_cast<Derived*>(this);
    return type_id.is(derived) ? derived : nullptr;
}
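
Putting the pieces together, the following is a minimal sketch of what the const-correct version might look like once both overloads are added to all three templates. It is one possible arrangement, not a definitive implementation: the static helper named search is my own assumption (it replaces two near-identical copies of as_type_impl), the parameter is called type rather than type_id to avoid shadowing, and check_const_correct_casts is a hypothetical test function.

```cpp
#include <cassert>

struct type_id
{
    template <typename T> static type_id get() { return { &impl<T>::id }; }
    template <typename T> bool is(T*) const { return get<T>().id == id; }
    const void* const id;
private:
    template <typename T> struct impl { static T* id; };
    template <typename T> struct impl<const T> : impl<T> {};
    template <typename T> struct impl<T*> : impl<T> {};
};
template <typename T> T* type_id::impl<T>::id = nullptr;

template <typename Derived>
struct runtime_castable_impl
{
    const void* as_type(type_id type) const
    {
        auto* derived = static_cast<const Derived*>(this);
        return type.is(derived) ? derived : nullptr;
    }
    void* as_type(type_id type)
    {
        auto* derived = static_cast<Derived*>(this);
        return type.is(derived) ? derived : nullptr;
    }
};

template <typename Derived>
struct runtime_castable_base : runtime_castable_impl<Derived>
{
    virtual const void* as_type(type_id type) const { return runtime_castable_impl<Derived>::as_type(type); }
    virtual void* as_type(type_id type) { return runtime_castable_impl<Derived>::as_type(type); }
};

template <typename Derived, typename... Bases>
struct runtime_castable : runtime_castable_impl<Derived>, Bases...
{
    const void* as_type(type_id type) const override
    {
        return search<const runtime_castable, runtime_castable_impl<Derived>, Bases...>(*this, type);
    }
    void* as_type(type_id type) override
    {
        return search<runtime_castable, runtime_castable_impl<Derived>, Bases...>(*this, type);
    }

private:
    // Assumption of this sketch: one helper, templated on the (possibly const)
    // type of *this, serves both overloads. Self decides which as_type is chosen.
    template <typename Self, typename Base, typename... Rest>
    static auto search(Self& self, type_id type)
    {
        auto* found = self.Base::as_type(type); // qualified call, so non-virtual
        if constexpr (sizeof...(Rest) != 0)
        {
            if (found == nullptr) { return search<Self, Rest...>(self, type); }
        }
        return found;
    }
};

template <typename To, typename From>
To runtime_cast(From* from)
{
    return from != nullptr ? static_cast<To>(from->as_type(type_id::get<To>())) : nullptr;
}

struct base : runtime_castable_base<base> {};
struct derived1 : runtime_castable<derived1, base> {};
struct derived2 : runtime_castable<derived2, base> {};
struct base2 : runtime_castable_base<base2> {};
struct derived3 : runtime_castable<derived3, derived1, base2> {};

void check_const_correct_casts()
{
    derived3 object;
    base* mutable_view = &object;
    const base* const_view = &object;

    assert(runtime_cast<derived1*>(mutable_view) == &object);                     // downcast
    assert(runtime_cast<base2*>(mutable_view) == static_cast<base2*>(&object));   // cross-cast
    assert(runtime_cast<derived2*>(mutable_view) == nullptr);                     // unrelated type
    assert(runtime_cast<const derived3*>(const_view) == &object);                 // const preserved
    // runtime_cast<derived3*>(const_view); // would not compile: const cannot be dropped
}
```

With this arrangement, the const overload returns const void*, so a static_cast to a non-const target fails at compile time instead of silently discarding const.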


This generally works for me. One last note: it is not necessary for every class in the hierarchy to participate in runtime_cast.

class I : public runtime_castable_base<I>
{
public:
    virtual ~I() {};
    virtual void Foo() = 0;

    
};

class I2 : public runtime_castable_base<I2>
{
public:
    virtual ~I2() {};
    virtual void Bar() const = 0;
        
};

class SomeOtherI : public runtime_castable_base<SomeOtherI>
{
public:
    virtual ~SomeOtherI() {};
    virtual void BarBar() = 0;
       
};

class CImpl : public runtime_castable<CImpl, I, I2>
{
public:
    CImpl() {}
    virtual ~CImpl() {}

    // Interface I
    virtual void Foo() { std::cout << "Foo Called.\n"; };

    // Interface I2
    virtual void Bar() const { std::cout << "Bar Called.\n"; };
};

class B1
{

};

class B2
{

};

class CImplCommon : public B1, public B2
{
public: 
    virtual void CImplVfn() { std::cout << "CImplVfn Called.\n"; }

};

class CImpl2 : public CImplCommon, public runtime_castable<CImpl2, I, I2>
{
    // Interface I
    virtual void Foo() { std::cout << "Foo Called.\n"; };

    // Interface I2
    virtual void Bar() const { std::cout << "Bar Called.\n"; };

};

void TestComplexInheritanceWithInterface()
{
    std::cout << "TestComplexInheritanceWithInterface.\n";
    I* i = new CImpl2();

    I2* i2 = runtime_cast<I2*>(i); // Succeeds -- is an instance of I2. Kept non-const so it can be cast back to I* below.
    if (i2)
    {
        std::cout << "Success -- is an instance of I2 as well.\n";
        i2->Bar();
    }
    else
    {
        std::cout << "Fail -- should be an instance of I2.\n";
        assert(false);
    }

    const SomeOtherI* isomeother = runtime_cast<const SomeOtherI*>(i); // Returns null -- not an instance of SomeOtherI.
    if (isomeother)
    {
        std::cout << "Fail --is not an instance of SomeOtherI as well.\n";
        assert(false);
    }
    else
    {
        std::cout << "Success --is not an instance of SomeOtherI as well.\n";

    }

    I* newi = runtime_cast<I*>(i2); // Succeeds -- is an instance of I.
    if (newi)
    {
        std::cout << "Success --is an instance of I as well.\n";
        newi->Foo();

    }
    else
    {
        std::cout << "Fail --is an instance of I as well.\n";
        assert(false);
    }

    delete i;
}




I leave it to the reader of this code to decide if this is useful, or whether it works in all cases. As with anything else here, it comes with no guarantees and I am just a random guy on the internet. Please do your due diligence before taking this into production.

A few pitfalls to consider would be:
  • Code bloat because of templates.
  • Kernel stack is a scarce resource, and the nested calls generated by the templates consume additional stack frames; depending on where the call is made, this can lead to a stack overflow.
  • Virtual addresses are reusable, in rare cases this will cause bugs which are really hard to track down.