úlfurinn | code cave

Unrolling a list with variadic templates

Prelude

I’ve been entertaining the idea of making my own programming language, simply because it’s good fun. A lot of inspiration comes from Ruby, especially the ease of hooking native functions into the runtime:

rb_define_method( myClass, "method_name", &method_function, 2 );

and then calling it:

rb_funcall( instance, rb_intern("method_name"), 2, arg0, arg1 );

The runtime then takes care of forwarding the arguments into the appropriate C function. The underlying code, of course, is not quite so pretty. If you trace it down to the point where the actual function call takes place, it looks like this (in vm_insnhelper.c):

      ...
      case 0:
	return (*func) (recv);
	break;
      case 1:
	return (*func) (recv, argv[0]);
	break;
      case 2:
	return (*func) (recv, argv[0], argv[1]);
	break;
      ...
      case 15:
	return (*func) (recv, argv[0], argv[1], argv[2], argv[3], argv[4],
			argv[5], argv[6], argv[7], argv[8], argv[9], argv[10],
			argv[11], argv[12], argv[13], argv[14]);
	break;

My fingers itch to clean up all this repetition, and I don’t like the “magic” limit of 15 arguments (although, of course, having 15 positional arguments in a function is borderline insanity anyway, but let’s ignore that point for now). Besides, C allows you to fiddle with pointers like that, but I’m doing the project in C++ (because it’s also good fun, though in a kinkier way), which goes out of its way to discourage such things.

From the language features I have settled on so far I can anticipate that I will need to accumulate function arguments in temporary lists and then pass them on to the function as individual arguments, if I want to have native functions with natural-looking signatures; something like this:

    int f( int, int, int );
    ? f_p( &f );
    vector<int> v = { 1, 2, 3 };
    f_p( v );    //    => f( 1, 2, 3 );

Ruby offers this in the form of the splat operator; could there be a way of doing this in C++ without resorting to the sort of monstrosity shown above? Beasts like Boost.Function and Boost.Bind use the repetition pattern for dealing with arity… doesn’t look good.

The play

I should note that at this point I am in no way concerned with performance figures; all I care about is having fun with the type system and mining it for all the gold it has. I’m already using C++11 features in the project, so… enter variadic templates.

The common technique for processing argument packs is recursively splicing them into head and tail — something functional languages do on a daily basis; there’s an example of using it to convert an argument pack into a list. But the head and tail method also works for constructing lists; could we do something similar here to make the inverse conversion?

In fact, that’s exactly how it works. After several hours of trying out dead ends (marked by GCC saying “lol you can’t do that”… hey, I don’t have The Standard memorized, certainly not the parts that deal with template specialization), I found a way of doing precisely what I need.

The recursive part that does the actual unrolling looks like this:

    //  recursion step, argument accumulation
    template< int remaining, typename Ret, typename F, typename Container >
    class invoker {
    public:
        template< typename ... Args >
        static inline Ret invoke( F f, Container args, Args ... unrolled_args ) {
            return invoker< remaining - 1, Ret, F, Container >::invoke( f, args,
                    args[ remaining - 1 ], unrolled_args... );
        }
    };
    //  recursion terminator, actual call
    template< typename Ret, typename F, typename Container >
    class invoker< 0, Ret, F, Container > {
    public:
        template< typename ... Args >
        static inline Ret invoke( F f, Container args, Args ... unrolled_args ) {
            return f( unrolled_args... );
        }
    };

Initially, unrolled_args is an empty pack, and we take the last element of args and pass it in front of unrolled_args. In the next step, they are matched together as unrolled_args, which is the gist of head|tail construction. Then we prepend another element and so on. Keep in mind that is all done in compile time, so there’s no way of knowing how long args is, and we need to have the remaining counter to detect the recursion end. When it reaches zero, we hit the specialized template that does the actual call.

To initialize remaining and to compute the type F we use another class:

    template< typename Ret, typename ... P >
    class fun_wrap {
    public:
        static const unsigned arity = sizeof...(P);
        fun_wrap( Ret(*func)( P... ) ) :
                f( func ) {
        }
        template< typename Container >
        Ret invoke( Container args ) {
            if ( args.size() != arity )
                throw std::invalid_argument( "Arity mismatch" );
            return invoker< arity, Ret, Ret(*)( P... ), Container >::invoke( f, args );
        }
    private:
        Ret (*f)( P... );
    };
    template< typename Ret, typename ... P >
    fun_wrap< Ret, P... > wrap( Ret(*f)( P... ) ) {
        return fun_wrap< Ret, P... >( f ) );
    }

The entry point is the wrap function (the point of using functions instead of creating class instances by hand is that the compiler deduces the template arguments for functions, but for classes you have to specify them by hand). The function’s argument signature is bound to P, which lets us determine the function’s arity and seed remaining.

Finally, an example of using it:

    int fun1( int a ) {
        return a;
    }
    int fun2( int a, int b ) {
        return a + b;
    }
    int main( ) {
        auto f1 = wrap( &fun1 );
        auto f2 = wrap( &fun2 );
        cout << f1( vector< int >( { 1 } ) ) << endl;
        cout << f2( vector< int >( { 1, 2 } ) ) << endl;
    }

Goal accomplished.

The final version adds an opaque method class that hides all the templating nonsense from the client code and does not depend on function signature. It also works with class member functions and operates on class instances, but this does not affect the technique.

Limitations

The obvious one is that all arguments must be of the same type, being given in a homogenous container. Also, like I mentioned, performance is not a concern, but having N nested calls doesn’t look all that good, and there’s no way around using recursion with variadic templates. I can only hope GCC is smart enough to inline and/or unroll them, but I haven’t verified this.

Qt for 64-bit Windows

Working on cogscrobbler, I was a little frustrated about being able to produce 64-bit binaries for Linux but not for Windows. It’s not even an option when you download the SDK. But then one day I checked the mingw64 site, and it claimed to finally be stable enough, so I thought I’d try making a 64-bit build of Qt myself.

It turned out to be surprisingly straightforward.

I’m building the latest version today so I decided to document the process this time.

First, I need to download Qt (ignore the installer and just get the source archive) and Mingw64 (I usually go to “Personal builds” and choose a reasonably new one from sezero; others may work for you, I just go with what I’ve tried — it’s important for our purposes to have a build where binaries don’t have platform suffixes, and mingw64’s automated builds do, or did the last time I checked). Also, I get the OpenSSL build from “External binary packages / Binaries” (actually, OpenSSL also builds surprisingly cleanly using msys, but this time I’d rather use the provided one). Unpack everything and cd to Qt’s root directory. Set PATH to nothing but mingw64\bin to be on the safe side. Set INCLUDE and LIB to point to OpenSSL’s files.

I then run configure -release -opensource -qt-style-windowsvista -qt-style-windowsxp -openssl -phonon -no-s60 -stl -fast -saveconfig x64. (I use the 32-bit chain for development and debugging, because it works. I tried the 64-bit chain of Eclipse+gcc+gdb; I can’t remember now but I think it worked fine in the end, but you need to keep track of your PATH, so I don’t bother, and only make the release build here.) The extra options are there to enable stuff that for some reason doesn’t get enabled by default; the lack of native Windows decorations would be particularly odd.

You may notice that Qt only has the make spec for mingw32, and detects the platform as such. That may look discouraging, but the good news is that mingw64 successfully works as a drop-in replacement for mingw32, as long as the executables can be located (hence the need for a suffixless build).

configure will take some time to create Makefiles, and when it’s done, I run mingw32-make and find something to do for the next two hours (on a 2.8 GHz i7)

(configure complains about not having some DirectDraw/Direct3D files. mingw64 provides the headers, but configure also wants libraries in the VS format. I don’t know if it recognizes its mistake later on or simply disabled D3D. I don’t use it in my application, but if it proves to be a problem for you, drop me a note as I’d potentially be interested in resolving it for the sake of completeness.)

New home

So now that Mephisto appears to be completely abandoned (even by the guy who promised to pick it up and continue) to the point where it wouldn’t work with the latest RubyGems, which I was careless enough to upgrade on a production machine (well, I say “production” but it’s more of a glorified dev box, so these things happen a lot), I had to choose a different engine, and here it is. Good for me that there were only two posts to migrate ;)

I kind of miss Mephisto’s support for differently themed sections, so perhaps I’ll write it myself as Enki was allegedly designed to be hackable.

Substring matching in Erlang

I can’t remember if I’ve ever seen it in the official documentation and I can’t really be bothered to search for it, so I’ll just put it here regardless.

There’s this article on the internets that’s trying to expose Erlang’s weak points (sorry, not giving the link, it doesn’t really offer anything original and most of the points aren’t even valid, actually), and it’s mentioned there that string processing is one of the things that Erlang doesn’t do very well at all. Now, you have to keep in mind that strings are actually linked lists in Erlang, so you have to write your algorithms accordingly (i.e., the only O(1) operations you have are “take first element” and “add first element”; concatenation and array-like access are O(n), so if your code fetches individual characters a lot, it’ll be really sluggish), that is, mostly rely on recursive descent. So I was sort of meditating on this and then I got a small idea.

When you match a list, it is possible to specify an arbitrary (but fixed) number of elements at the front:

[a, b, c | Rest] = [a, b, c, d, e].

So I thought, perhaps there is a similar laconic way to match a substring, et voilà:

"abc" ++ Rest = "abcde".

I imagine it could be used in Yaws servlets that work with deep URLs, something like Rails’ /controller/action/id.

There are limitations though. For instance, for some reason I can’t quite figure out at this hour, you cannot use the “++” syntax for matching lists:

[a, b, c] ++ Rest = [a, b, c, d, e].
* 1: illegal pattern

although there shouldn’t be any apparent difference in internal representation.

Also, the head string must be a literal; you cannot pre-bind it to a variable, because the head must be known at compile-time to create a pattern match. (Otherwise it could be rather handy for building regular automata or something trie-like, I imagine.) But at least it’s there and it could be useful in some cases.

Emergency epmd recovery

Suppose you’re running an Erlang application (ejabberd, for instance). It’s been up for months and months but then you try to use a remote control script (like ejabberdctl) and it fails, probably saying things like “nodedown”, indicating it cannot communicate with the Erlang node; yet the application itself is apparently running just fine. Running out of ideas, you run epmd -names and, to your horror, it shows an empty list.

Every Erlang node uses a high port for communication, and epmd’s task is to map symbolic names like ejabberd to port numbers. (A bit like what DNS does for IP addresses, yes.) Without this information nodes cannot find each other. It’s quite interesting how epmd could just lose it, but right now it’s more important to find out how you can restore it without restarting your application. (Oh, the beautiful uptime!)

There are bits of information on the net that say epmd will gladly register any unused name with any port with an Erlang node attached when it’s told to do so, we just need a way to send the right packet. Turns out there’s the erl_epmd:register_node/2 function that does exactly what we need. Use netstat to find the port your orphan node is listening to, then start an anonymous Erlang node:

$ erl

It must be anonymous because you can only register once, and we’ll need to do it by hand instead.

> erl_epmd:start().

Since the node is anonymous, we must start the gen_server ourselves.

> erl_node:register_node(ejabberd, 23456).

…assuming these are the node name and the port you need.
Voila! Check the epmd -names and rejo…

But not quite yet. The thing about epmd is, although it registers whatever it’s told without asking any questions, it will also keep track of the connection that issued the registration request. The moment you disconnect your anonymous node, the restored registration will be gone again. We need to find a way to sneak into the application node and take control from there.

Start one more node, this time with a name:

$ erl -sname repair

Ping the application node to establish the connection:

repair@hostname> net_adm:ping(ejabberd@hostname).

The node will ask epmd “who’s ejabberd?” and receive the port we gave it a moment ago, so all is fine.
Now, the thing about Erlang connections is that once they’re established by whatever means, including net_adm:ping/1, you don’t need epmd to talk to that node anymore. So now you can turn to the first anonymous node and kill it:

> halt().

This will, among other things, break the connection to epmd and make the name ejabberd available for re-registration. Note that the connection we made from the second node is still very much alive, about time we used it:

repair@hostname> rpc:call(ejabberd@hostname, erl_epmd, register_node, [ejabberd, 23456]).

Note that this is essentially the same call to erl_epmd:register_node/2 as before, but this time we do it on behalf of the remote node using rpc:call. This means that the node name is now associated with its rightful owner once again! You can stop the second node now:

repair@hostname> halt().

Run epmd -names once more to make sure, and go on with your business.