* Instead of always re-load page-tables when a thread context is switched
only do this when another user PD's thread is the next target,
core-threads are always executed within the last PD's page-table set
* remove the concept of the mode transition
* instead map the exception vector once in bootstrap code into kernel's
memory segment
* when a new page directory is constructed for a user PD, copy over the
top-level kernel segment entries on RISCV and X86, on ARM we use a designated
page directory register for the kernel segment
* transfer the current CPU id from bootstrap to core/kernel in a register
to ease first stack address calculation
* align cpu context member of threads and vms, because of x86 constraints
regarding the stack-pointer loading
* introduce Align_at template for members with alignment constraints
* let the x86 hardware do part of the context saving in ISS, by passing
the thread context into the TSS before leaving to user-land
* use one exception vector for all ARM platforms including Arm_v6
Fix#2091
* introduce new syscall (core-only) to create privileged threads
* take the privilege level of the thread into account
when doing a context switch
* map kernel segment as accessable for privileged code only
Ref #2091
Always switch to the "exception stack" instead of having a hardware initiated
stack switch during exceptions/interrupts when the privilege level changes only.
Moreover, this commit increases the exception stack slightly.
Ref #2091
* introduces central memory map for core/kernel
* on 32-bit platforms the kernel/core starts at 0x80000000
* on 64-bit platforms the kernel/core starts at 0xffffffc000000000
* mark kernel/core mappings as global ones (tagged TLB)
* move the exception vector to begin of core's binary,
thereby bootstrap knows from where to map it appropriately
* do not map boot modules into core anymore
* constrain core's virtual heap memory area
* differentiate in between user's and core's main thread's UTCB,
which now resides inside the kernel segment
Ref #2091
Some x86 machines do have a LAPIC speed < 1000 ticks per millisecond
when configured to use the maximum divider (as it was always the case).
But we need microseconds precision for the timeout framework. Thus,
reduce the divider dynamically until the frequency fullfills our
requirements.
Ref #2400
There are hardware timers whose frequency can't be expressed as
ticks-per-microsecond integer-value because only a ticks-per-millisecond
integer-value is precise enough. We don't want to use expensive
floating-point values here but nonetheless want to translate from ticks
to time with microseconds precision. Thus, we split the input in two and
translate both parts separately. This way, we can raise precision by
shifting the values to their optimal bit position. Afterwards, the results
are shifted back and merged together again.
As this algorithm is not so trivial anymore and used by at least three
timer drivers (base-hw/x86_64, base-hw/cortex_a9, timer/pit), move it to a
generic header to avoid redundancy.
Ref #2400
Due to the simplicity of the algorithm that translated from timer ticks
to time, we lost microseconds precision although the timer allows for it.
Ref #2400
rm_fault.run triggers write on read-only ROM provided by core, which
fails without this patch:
arm - "raised unhandled data abort"
x86 - (silent/invisible) busy loop because write fault gets never resolved
With this, we get rid of platform specific timer interfaces. The new
Timer class does the same as the old Clock class and has a generic
interface. The old Timer class was merely used by the old Clock class.
Also, we get rid of having only one timer instance which we tell with
each method call for which CPU it shall be done. Instead now each Cpu
object has its own Timer member that knows the CPU it works for.
Also, rename all "tics" to "ticks".
Fixes#2347
This commit moves the headers residing in `repos/base/include/spec/*/drivers`
to `repos/base/include/drivers/defs` or repos/base/include/drivers/uart`
respectively. The first one contains definitions about board-specific MMIO
iand RAM addresses, or IRQ lines. While the latter contains device driver
code for UART devices. Those definitions are used by driver implementations
in `repos/base-hw`, `repos/os`, and `repos/dde-linux`, which now need to
include them more explicitely.
This work is a step in the direction of reducing 'SPEC' identifiers overall.
Ref #2403
* Acknowledge receive of page-fault signal with ack_signal,
but restart thread execution separately
* use kill_signal_context when disolving a pager_object to prevent race
* Remove bureaucracy in form of Thread_event and Signal_ack_handler
* remove dead code in riscv, namely Thread_base definition
* translation_table_insertions function for ARM drops out,
which was overcautious
Put the initialization of the cpu cores, setup of page-tables, enabling of
MMU and caches into a separate component that is only used to bootstrap
the kernel resp. core.
Ref #2092
This cleans up the syscalls that are mainly used to control the
scheduling readiness of a thread. The different use cases and
requirements were somehow mixed together in the previous interface. The
new syscall set is:
1) pause_thread and resume_thread
They don't affect the state of the thread (IPC, signalling, etc.) but
merely decide wether the thread is allowed for scheduling or not, the
so-called pause state. The pause state is orthogonal to the thread state
and masks it when it comes to scheduling. In contrast to the stopped
state, which is described in "stop_thread and restart_thread", the
thread state and the UTCB content of a thread may change while in the
paused state. However, the register state of a thread doesn't change
while paused. The "pause" and "resume" syscalls are both core-restricted
and may target any thread. They are used as back end for the CPU session
calls "pause" and "resume". The "pause/resume" feature is made for
applications like the GDB monitor that transparently want to stop and
continue the execution of a thread no matter what state the thread is
in.
2) stop_thread and restart_thread
The stop syscall can only be used on a thread in the non-blocking
("active") thread state. The thread then switches to the "stopped"
thread state in wich it explicitely waits for a restart. The restart
syscall can only be used on a thread in the "stopped" or the "active"
thread state. The thread then switches back to the "active" thread state
and the syscall returns whether the thread was stopped. Both syscalls
are not core-restricted. "Stop" always targets the calling thread while
"restart" may target any thread in the same PD as the caller. Thread
state and UTCB content of a thread don't change while in the stopped
state. The "stop/restart" feature is used when an active thread wants to
wait for an event that is not known to the kernel. Actually the syscalls
are used when waiting for locks and on thread exit.
3) cancel_thread_blocking
Does cleanly cancel a cancelable blocking thread state (IPC, signalling,
stopped). The thread whose blocking was cancelled goes back to the
"active" thread state. It may receive a syscall return value that
reflects the cancellation. This syscall doesn't affect the pause state
of the thread which means that it may still not get scheduled. The
syscall is core-restricted and may target any thread.
4) yield_thread
Does its best that a thread is scheduled as few as possible in the
current scheduling super-period without touching the thread or pause
state. In the next superperiod, however, the thread is scheduled
"normal" again. The syscall is not core-restricted and always targets
the caller.
Fixes#2104
This is a redesign of the root and parent interfaces to eliminate
blocking RPC calls.
- New session representation at the parent (base/session_state.h)
- base-internal root proxy mechanism as migration path
- Redesign of base/service.h
- Removes ancient 'Connection::KEEP_OPEN' feature
- Interface change of 'Child', 'Child_policy', 'Slave', 'Slave_policy'
- New 'Slave::Connection'
- Changed child-construction procedure to be compatible with the
non-blocking parent interface and to be easier to use
- The child's initial LOG session, its binary ROM session, and the
linker ROM session have become part of the child's envirenment.
- Session upgrading must now be performed via 'env.upgrade' instead
of performing a sole RPC call the parent. To make RAM upgrades
easier, the 'Connection' provides a new 'upgrade_ram' method.
Issue #2120
base generic code:
* Remove unused verbosity code from mmio framework
* Remove escape sequence end heuristic from LOG
* replace Core_console with Core_log (no format specifiers)
* move test/printf to test/log
* remove `printf()` tests from the log test
* check for exact match of the log test output
base-fiasco:
* remove unused Fiasco::print_l4_threadid function
base-nova:
* remove unused hexdump utility from core
base-hw:
* remove unused Kernel::Thread::_print_* debug utilities
* always print resource summary of core during startup
* remove Kernel::Ipc_node::pd_label (not used anymore)
base*:
* Turn `printf`,`PWRN`, etc. calls into their log equivalents
Ref #1987Fix#2119
Besides adapting the components to the use of base/log.h, the patch
cleans up a few base headers, i.e., it removes unused includes from
root/component.h, specifically base/heap.h and
ram_session/ram_session.h. Hence, components that relied on the implicit
inclusion of those headers have to manually include those headers now.
While adjusting the log messages, I repeatedly stumbled over the problem
that printing char * arguments is ambiguous. It is unclear whether to
print the argument as pointer or null-terminated string. To overcome
this problem, the patch introduces a new type 'Cstring' that allows the
caller to express that the argument should be handled as null-terminated
string. As a nice side effect, with this type in place, the optional len
argument of the 'String' class could be removed. Instead of supplying a
pair of (char const *, size_t), the constructor accepts a 'Cstring'.
This, in turn, clears the way let the 'String' constructor use the new
output mechanism to assemble a string from multiple arguments (and
thereby getting rid of snprintf within Genode in the near future).
To enforce the explicit resolution of the char * ambiguity, the 'char *'
overload of the 'print' function is marked as deleted.
Issue #1987
Write tick count of next kernel timer to the guest timed events page if
present. This causes the guest VM to be preempted at the requested tick
count and ensures that the guest VM can not monopolize the CPU if no
traps occur.
The base-hw kernel expects a configured switch-event from the guest VM
to base-hw with ID 30 and target vector 32 to be present in the system
policy.
Issue #2016
* The Vm thread is always paused and on exception to make sure that guest VM
execution is suspended whenever we handle an interrupt. Also signal the Vm
session to poke waiting threads (e.g. Virtualbox EMT).
* Implement Vm::proceed
Switch to the mode transition assembly code declared at the _vt_vm_entry
label.
Issue #2016
The entry enables interrupts and initiates a handover to the guest VM by
invoking event number one. The sti instruction is placed at the start to
allow exits to Muen before handing off to the VM if window exiting is
requested.
Issue #2016
The sinfo function declared in sinfo_instance.h creates a static sinfo
object instance and returns a pointer to the caller.
- kernel timer and platform support to use sinfo() function to
instantiate sinfo object
- address and size of the base-hw RAM region via the sinfo API
- log_status() function in sinfo API
Use the new Sinfo::get_dev_info function to retrieve device information
in the platform-specific get_msi_params function. If the requested
device supports MSI, set the IRQ and MSI address/data register values to
enable MSIs in remappable format (see VT-d specification, section
5.1.2.2).
Currently only one MSI per device is supported as the subhandle in the
data register is always set to 0.
The new Sinfo::get_dev_info function can be used to retrieve information
for a PCI device with given source-id (SID). The function returns false
if no device information for the specified device exists.
The platform-specific get_msi_params function returns MSI parameters for
a device identified by PCI config space address. The function returns
false if either the platform or the device does not support MSI mode of
operation.
The new implementation of the FPU and FPU context is taken out to
separate architecture-dependent header files. The generic Cpu_lazy_state
is deleted. There is no hint about the existence of something like an
FPU in the generic non-architexture-dependent code anymore. Instead the
architecture-dependent CPU context of a thread is extended by an FPU
context where supported.
Moreover, the current FPU implementations are enhanced so that threads
that get deleted now release the FPU when still obtaining it.
Fix#1855
This commit enables multi-processing for all Cortex A9 SoCs we currently
support. Moreover, it thereby enables the L2 cache for i.MX6 that was not
enabled until now. However, the QEMU variants hw_pbxa9 and hw_zynq still
only use 1 core, because the busy cpu synchronization used when initializing
multiple Cortex A9 cores leads to horrible boot times on QEMU.
During this work the CPU initialization in general was reworked. From now
on lots of hardware specifics were put into the 'spec' specific files, some
generic hook functions and abstractions thereby were eliminated. This
results to more lean implementations for instance on non-SMP platforms,
or in the x86 case where cache maintainance is a non-issue.
Due to the fact that memory/cache coherency and SMP are closely coupled
on ARM Cortex A9 this commit combines so different aspects.
Fix#1312Fix#1807
This commit separates certain SMP aspects into 'spec/smp' subdirectories.
Thereby it simplifies non-SMP implementations again, where no locking
and several platform specific maintainance operations are not needed.
Moreover, it moves several platform specifics to appropriated places,
removes dead code from x86, and starts to turn global static pointers
into references that are handed over.
Other platforms implement Kernel::Cpu_context stuff in
kernel/cpu_context.cc. On x86_64, it was implemented in
kernel/thread.cc. The commit fixes this inconsistency to the other
platforms.
Ref #1652
The distinction between Kernel::Thread and Kernel::Thread_base is
unnecessary as currently all Hw platforms would have the same content in
the latter class. Thus I've merged Kernel::Thread_base into
Kernel::Thread. Thereby, Kernel::Thread_event can be moved to
kernel/thread.h.
Ref #1652
Move Platform::setup_irq_mode function from x86 platform_support.cc to
x86_64 specific file. This will enable the upcoming x86_64_muen platform
to provide a separate implementation.
Move the _core_only_mmio_regions function to the
x86_64/platform_support.cc file. This is required to make it overridable
for other platforms deriving from x86.
Perform all FPU-related setup in the Cpu class' init_fpu function instead of
the general system bring-up assembly code.
Set all required control register 0 and 4 flags according to Intel SDM Vol. 3A,
sections 9.2 and 9.6 instead of only enabling FPU error reporting and OSFXSR.
* Enable the use of the FXSAVE and FXRSTOR instructions, see Intel SDM
Vol. 3C, section 2.5.
* The state of the x87 floating point unit (FPU) is loaded and saved on
demand.
* Make the cr0 control register accessible in the Cpu class. This is in
preparation of the upcoming FPU management.
* Access to the FPU is disabled by setting the Task Switch flag in the cr0
register.
* Access to the FPU is enabled by clearing the Task Switch flag in the cr0
register.
* Implement FPU initialization
* Add is_fpu_enabled helper function
* Add pointer to CPU lazy state to CPU class
* Init FPU when finishing kernel initialization
* Add function to retry FPU instruction:
Similar to the ARM mechanism to retry undefined instructions, implement a
function for retrying an FPU instruction. If a floating-point instruction
causes an #NM exception due to the FPU being disabled, it can be retried
after the correct FPU state is restored, saving the current state and
enabling the FPU in the process.
* Disable FPU when switching to different user context:
This enables lazy save/restore of the FPU since trying to execute a
floating point instruction when the FPU is disabled will cause a #NM
exception.
* Declare constant for #NM exception
* Retry FPU instruction on #NM exception
* Assure alignment of FXSAVE area:
The FXSAVE area is 512-byte memory region that must be 16-byte aligned. As
it turns out the alignment attribute is not honored in all cases so add a
workaround to assure the alignment constraint is met by manually rounding
the start of the FXSAVE area to the next 16-byte boundary if necessary.
The implementation initializes the Local APIC (LAPIC) of CPU 0 in xapic
mode (mmio register access) and uses the I/O APIC to remap, mask and
unmask hardware IRQs. The remapping offset of IRQs is 48.
Also initialize the legacy PIC and mask all interrupts in order to
disable it.
For more information about LAPIC and I/O APIC see Intel SDM Vol. 3A,
chapter 10 and the Intel 82093AA I/O Advanced Programmable Interrupt
Controller (IOAPIC) specification
Set bit 9 in the RFLAGS register of user CPU context to enable
interrupts on kernel- to usermode switch.
Make the local APIC accessible via its MMIO region by adding a 2 MB
large page mapping at 0xfee00000 with memory type UC.
Note: The mapping is added to the initial page tables to make the APIC
usable prior to the activation of core's page tables, e.g. in the
constructor of the timer class.
The location in memory is arbitrary but we use the same address as the
ARM architecture. Adjust references to virtual addresses in the mode
transition pages to cope with 64-bit values.
The interrupt stack must reside in the mtc region in order to use it for
non-core threads. The size of the stack is set to 56 bytes in order to
hold the interrupt stack frame plus the additional vector number that is
pushed onto the stack by the ISR.
Call the _virt_mtc_addr function with the _mt_isrs label to calculate
the ISR base address in Idt::setup. Again, assume the address to be
below 0x10000.
Use parameter instead of class member variable because it would get
stored into the mtc region otherwise. In a further iteration only the
actual IDT should be saved into the mtc, not the complete class
instance. Currently the class instance size is equal to the IDT table
size.
The class provides the load() function which reloads the GDTR with the
GDT address in the mtc region. This is needed to make the segments
accessible to non-core threads.
Make the _gdt_start label global to use it in the call to
_virt_mtc_addr().
Use the _mt_tss label and the placement new operator to create the
Tss class instance in the mtc region. Update the hard-coded
TSS base address to use the virtual mtc address.
On exception, the CPU first checks the IDT in order to find the
associated ISR. The IDT must therefore be placed in the mode transition
pages to make them available for non-core threads.
The limit is set to match the TSS size - 1 and the base address is
hardcoded to the *current* address of the TSS instance (0x3a1100).
TODO: Set the base address using the 'tss' label. If the TSS descriptor
format were not so utterly unusable this would be straightforward.
Changes to the code that indirectly lead to a different location
of the tss result in #GP since the base address will be invalid.