The Linux kernel normally manages each page as an independent object. There are times, however, when pages are grouped into larger units, called "compound pages." This generally happens when physically contiguous allocations larger than one page are needed by the kernel; when this happens, a compound page is passed back to the caller. These pages are special in that they must be split back apart when they are released back into the system, and there may be other cleanup work to do. So compound pages have an attribute not found on normal pages: a destructor which is called when the page is freed.
So, if we look at how the exploit sets up its low-memory page structures, we see:
pages->flags = 1 << PG_compound; pages->private = (unsigned long) pages; pages->count = 1; pages->lru.next = (long) kernel_code;
When the kernel looks for a page structure at user-space address zero, it will find something which looks like a compound page. The destructor (stored in the lru.next field of the second page structure) is set to kernel_code(), a function defined within the exploit itself. Since the count is set to one, the call to page_cache_release() (which decrements that count) will conclude that there are no further references and, since the page looks like a compound page, the destructor will be called. At this point, the exploit has arbitrary code running in kernel mode, and the show is truly over. This code just sets the process's uid to zero (giving it root access), then engages in some assembly-language trickery to return immediately to user space, shorting out the rest of the cleanup process.
There are a couple of interesting implications from all of this. One, clearly, is that this exploit is not something which was bashed out by a script kiddie somewhere. It was written by somebody who understands low-level kernel code quite well and who is able to use that understanding to escalate an apparent information-disclosure vulnerability into a full code execution problem. It is, clearly, a mistake to underestimate those who write exploits, not all of whom immediately make their works known to the development community. One also should not assume that they have not already written exploits for other, still unfixed bugs.
Also worth noting is the fact that ordinary buffer overflow protection may well have not been effective against this vulnerability. The return address on the stack was not overwritten, and no exploit code was put in data areas. This episode has caused a renewed interested in technical security measures in the kernel. These measures are good, but it would be a mistake to think that they will fix the problem. What is really needed is stronger review of patches with security in mind; it is not yet clear that this review is happening.