This commit fixes the following issues regarding cache maintainance
under ARM:
* read out I-, and D-cache line size at runtime and use the correct one
* remove 'update_data_region' call from unprivileged syscalls
* rename 'update_instr_region' syscall to 'cache_coherent_region' to
reflect what it doing, namely make I-, and D-cache coherent
* restrict 'cache_coherent_region' syscall to one page at a time
* lookup the region given in a 'cache_coherent_region' syscall in the
page-table of the PD to prevent machine exceptions in the kernel
* only clean D-cache lines, do not invalidate them when pages where
added on Cortex-A8 and ARMv6 (MMU sees phys. memory here)
* remove unused code relicts of cache maintainance
In addition it introduces per architecture memory clearance functions
used by core, when preparing new dataspaces. Thereby, it optimizes:
* on ARMv7 using per-word assignments
* on ARMv8 using cacheline zeroing
* on x86_64 using 'rept stosq' assembler instruction
Fix#3685
* Instead of always re-load page-tables when a thread context is switched
only do this when another user PD's thread is the next target,
core-threads are always executed within the last PD's page-table set
* remove the concept of the mode transition
* instead map the exception vector once in bootstrap code into kernel's
memory segment
* when a new page directory is constructed for a user PD, copy over the
top-level kernel segment entries on RISCV and X86, on ARM we use a designated
page directory register for the kernel segment
* transfer the current CPU id from bootstrap to core/kernel in a register
to ease first stack address calculation
* align cpu context member of threads and vms, because of x86 constraints
regarding the stack-pointer loading
* introduce Align_at template for members with alignment constraints
* let the x86 hardware do part of the context saving in ISS, by passing
the thread context into the TSS before leaving to user-land
* use one exception vector for all ARM platforms including Arm_v6
Fix#2091
Previously we did write the SPSR via an MSR instruction without
additional flags. Unfortunately, this tells the CPU to write the
register only partially. This often isn't a problem as the users PSR
reset value normally is conform to our expectations but in some cases
(e.g. PSR endianess bit on WandBoard core #4) the reset value is bad.
Thus, we have to add the CXSF flags (access Control + eXtension + Status
+ Flags) so the CPU overwrites the entire register.
Fixes#2254
Put the initialization of the cpu cores, setup of page-tables, enabling of
MMU and caches into a separate component that is only used to bootstrap
the kernel resp. core.
Ref #2092
This commit enables multi-processing for all Cortex A9 SoCs we currently
support. Moreover, it thereby enables the L2 cache for i.MX6 that was not
enabled until now. However, the QEMU variants hw_pbxa9 and hw_zynq still
only use 1 core, because the busy cpu synchronization used when initializing
multiple Cortex A9 cores leads to horrible boot times on QEMU.
During this work the CPU initialization in general was reworked. From now
on lots of hardware specifics were put into the 'spec' specific files, some
generic hook functions and abstractions thereby were eliminated. This
results to more lean implementations for instance on non-SMP platforms,
or in the x86 case where cache maintainance is a non-issue.
Due to the fact that memory/cache coherency and SMP are closely coupled
on ARM Cortex A9 this commit combines so different aspects.
Fix#1312Fix#1807
This commit separates certain SMP aspects into 'spec/smp' subdirectories.
Thereby it simplifies non-SMP implementations again, where no locking
and several platform specific maintainance operations are not needed.
Moreover, it moves several platform specifics to appropriated places,
removes dead code from x86, and starts to turn global static pointers
into references that are handed over.
Instead of having an ID allocator per object class use one global allocator for
all. Thereby artificial limitations for the different object types are
superfluent. Moreover, replace the base-hw specific id allocator implementation
with the generic Bit_allocator, which is also memory saving.
Ref #1443
* Introduce hw-specific crt0 for core that calls e.g.: init_main_thread
* re-map core's main thread UTCB to fit the right context area location
* switch core's main thread's stack to fit the right context area location
Fix#1440
Previously, we did the protection-domain switches without a transitional
translation table that contains only global mappings. This was fine as long
as the CPU did no speculative memory accesses. However, to enabling branch
prediction triggers such accesses. Thus, if we don't want to invalidate
predictors on every context switch, we need to switch more carefully.
ref #474