Performance of the old kmalloc implementation was terrible.
We now use fixed-width linked list allocations for sizes <= 60 bytes.
This is much faster than variable size allocation.
We don't use bitmap scanning anymore since it was probably the slow
part. Instead we use headers that tell allocations size and aligment.
I removed the kmalloc_eternal, even though it was very fast, there is
not really any need for it, since the only place it was used in was IDT.
These changes allowed my psf (font) parsing to go from ~500 ms to ~20 ms.
(coming soon :D)
We can now use arbitary BAN::function<void(...)> as the Thread.
I also implemented multithreading for i386 since it was not done
on the initial multithreading commit.
We don't have to invalidate page in AllocatePage() if we don't make
any changes. We also should not assert on deallocating non-present
pages, just return early :)