Change Semaphore -> ThreadBlocker
This was not a semaphore, I just named it one because I didn't know
what semaphore was. I have meant to change this sooner, but it was in
no way urgent :D
Implement SMP events. Processors can now be sent SMP events through
IPIs. SMP events can be sent either to a single processor or broadcasted
to every processor.
PageTable::{map_page,map_range,unmap_page,unmap_range}() now send SMP
event to invalidate TLB caches for the changed pages.
Scheduler no longer uses a global run queue. Each processor has its own
scheduler that keeps track of the load on the processor. Once every
second schedulers do load balancing. Schedulers have no access to other
processors' schedulers, they just see approximate loads. If scheduler
decides that it has too much load, it will send a thread to another
processor through a SMP event.
Schedulers are currently run using the timer interrupt on BSB. This
should be not the case, and each processor should use its LAPIC timer
for interrupts. There is no reason to broadcast SMP event to all
processors when BSB gets timer interrupt.
Old scheduler only achieved 20% idle load on qemu. That was probably a
very inefficient implementation. This new scheduler seems to average
around 1% idle load. This is much closer to what I would expect. On my
own laptop idle load seems to be only around 0.5% on each processor.
This was broken when I added SMP support. This patch makes sse kind of
dumb as it is saved and restored on every interrupt, but now it at least
works properly... I'll have to look into how sse can get optimized
nicely with SMP. Simple way would be pinning each thread to a specific
processor and doing pretty much what I had before, but sse thread saved
in processor rather than static global.
Current context saving was very hacky and dependant on compiler
behaviour that was not consistent. Now we always use iret for
context saving. This makes everything more clean.
Signal handling code was way too complex. Now everything is
simplified and there is no need for ThreadBlockers.
Only complication that this patch includes is that blocking syscalls
have to manually be made interruptable by signal. There might be some
clever solution to combat this is make this happen automatically.
We now delete threads when
1. it is marked as Terminated and is the current thread
2. it tries to start execution in Terminated state
This allows us to never have thread executing in Terminated state
When we want to kill a process, we mark its threads as Terminating
or Terminated. If the thread is in critical section that has to be
finished, it will be in Terminating state until done. Once Scheduler
is trying to execute Terminated thread it will instead delete it.
Once processes last thread is marked Terminated, the processes will
turn it into a cleanup thread, that will allow blocks and memory
cleanup to be done.
I added a syscall for telling the kernel when signal execution has
finished. We should send a random hash or id to the signal trampoline
that we would include in the syscall, so validity of signal exit can
be confirmed.
We now load ELF files to VirtualRanges instead of using kmalloc.
We have only a fixed 1 MiB kmalloc for big allocations and this
allows loading files even when they don't fit in there.
This caused me to rewrite the whole ELF loading process since the
loaded ELF is not in memory mapped by every process.
Virtual ranges allow you to zero out the memory and to copy into
them from arbitary byte buffers.
MMU moved to namespace kernel
Kernel::Memory::Heap moved to just Kernel
MMU::map_{page,range} renamed to identity_map_{page,range}
Add MMU::get_page_flags
This disables interrupts for the current scope and restores them
after the scope. This is used in kmalloc, since scheduler might
call into kmalloc/kfree, but deadlock if some thread is currently
trying to allocate. This allows us to use kmalloc in Scheduler.
We can now use arbitary BAN::function<void(...)> as the Thread.
I also implemented multithreading for i386 since it was not done
on the initial multithreading commit.
This still uses only a single cpu, but we can now have 'parallelization'
This seems to work fine in qemu and bochs, but my own computer did not
like this when I last tried.
I have absolutely no idea how multithreading should actually be
implmemented and I just thought and implemented the most simple one I
could think of. This might not be in any way correct :D