In the past, the core-only privileged syscall `update_pd` was used only
to invalidate the TLB after removal of page-table entries.
By now, the whole TLB at least for one protection domain got invalidated,
but in preparation for optimization and upcomingARM v8 support,
it is necessary to deliver the virtual memory region that needs to get
invalidated. Moreover, the name of the call shall represent explicitely
that it is used to invalidate the TLB.
Ref #3405
This commit addresses several multiprocessing issues in base-hw:
* it reworks cross-cpu maintainance work for TLB invalidation by
introducing a generic Inter_processor_work and removes the so
called Cpu_domain_update
* thereby it solves the cross-cpu thread destruction, when the
corresponding thread is active on another cpu (fix#3043)
* it adds the missing TLB shootdown for x86 (fix#3042)
* on ARM it removes the TLB shootdown via IPIs, because this
is not needed on the multiprocessing ARM platforms we support
* it enables the per-cpu initialization of the kernel's cpu
objects, which means those object initialization is executed
by the proper cpu
* it rollbacks prior decision to make multiprocessing an aspect,
but puts back certain 'smp' mechanisms (like cross-cpu lock)
into the generic code base for simplicity reasons
* Instead of always re-load page-tables when a thread context is switched
only do this when another user PD's thread is the next target,
core-threads are always executed within the last PD's page-table set
* remove the concept of the mode transition
* instead map the exception vector once in bootstrap code into kernel's
memory segment
* when a new page directory is constructed for a user PD, copy over the
top-level kernel segment entries on RISCV and X86, on ARM we use a designated
page directory register for the kernel segment
* transfer the current CPU id from bootstrap to core/kernel in a register
to ease first stack address calculation
* align cpu context member of threads and vms, because of x86 constraints
regarding the stack-pointer loading
* introduce Align_at template for members with alignment constraints
* let the x86 hardware do part of the context saving in ISS, by passing
the thread context into the TSS before leaving to user-land
* use one exception vector for all ARM platforms including Arm_v6
Fix#2091
This commit enables multi-processing for all Cortex A9 SoCs we currently
support. Moreover, it thereby enables the L2 cache for i.MX6 that was not
enabled until now. However, the QEMU variants hw_pbxa9 and hw_zynq still
only use 1 core, because the busy cpu synchronization used when initializing
multiple Cortex A9 cores leads to horrible boot times on QEMU.
During this work the CPU initialization in general was reworked. From now
on lots of hardware specifics were put into the 'spec' specific files, some
generic hook functions and abstractions thereby were eliminated. This
results to more lean implementations for instance on non-SMP platforms,
or in the x86 case where cache maintainance is a non-issue.
Due to the fact that memory/cache coherency and SMP are closely coupled
on ARM Cortex A9 this commit combines so different aspects.
Fix#1312Fix#1807
Instead of organizing page tables within slab blocks and allocating such
blocks dynamically on demand, replace the page table allocator with a
simple, static alternative. The new page table allocator is dimensioned
at compile-time. When a PD runs out of page-tables, we simply flush its
current mappings, and re-use the freed tables. The only exception is
core/kernel that should not produce any page faults. Thereby it has to
be ensured that core has enough page tables to populate it's virtual
memory.
A positive side-effect of this static approach is that the accounting
of memory used for page-tables is now possible again. In the dynamic case
there was no protocol existent that solved the problem of donating memory
to core during a page fault.
Fix#1588
Instead of having an ID allocator per object class use one global allocator for
all. Thereby artificial limitations for the different object types are
superfluent. Moreover, replace the base-hw specific id allocator implementation
with the generic Bit_allocator, which is also memory saving.
Ref #1443