genode/repos/base-linux/src/core/platform.cc
Norman Feske 132569d12b base-linux: socket descriptor caps for RPC
On Linux, Genode used to represent each RPC object by a socket
descriptor of the receiving thread (entrypoint) and a globally-unique
value that identifies the object. Because the latter was transferred as
plain message payload, clients had to be trusted to not forge the
values. For this reason, Linux could not be considered as a productive
Genode base platform but remained merely a development vehicle.

This patch changes the RPC mechanism such that each RPC object is
represented by a dedicated socket pair. Entrypoints wait on a set of
the local ends of the socket pairs of all RPC objects managed by the
respective entrypoint. The epoll kernel interface is used as the
underlying mechanism to wait for a set of socket descriptors at the
server side.

When delegating a capability, the remote end of the socket pair is
transferred to the recipient along with a plaintext copy of the
socket-descriptor value of the local end. The latter value serves as a
hint for re-identifiying a capability whenever it is delegated back to
its origin. Note that the client is not trusted to preserve this
information. The integrity of the hint value is protected by comparing
the inode values of incoming and already present capablities at the
originating site (whenever the capability is invoked or presented to the
owner of the RPC object).

The new mechanism effectively equips base-linux with Genode's capablity
model as described in the Chapter 3 of the Genode Foundations book.
That said, the sandboxing of components cannot be assumed at this point
because each component has still direct access to the Linux system-call
interface.

This patch is based on the extensive exploration work conducted by
Stefan Thoeni who strongly motivated the inclusion of this feature into
Genode.

Issue #3581
2020-04-17 12:40:13 +02:00

196 lines
4.6 KiB
C++

/*
* \brief Linux platform interface implementation
* \author Norman Feske
* \date 2006-06-13
*/
/*
* Copyright (C) 2006-2017 Genode Labs GmbH
*
* This file is part of the Genode OS framework, which is distributed
* under the terms of the GNU Affero General Public License version 3.
*/
/* Genode includes */
#include <base/lock.h>
#include <linux_dataspace/client.h>
/* base-internal includes */
#include <base/internal/native_thread.h>
#include <base/internal/parent_socket_handle.h>
#include <base/internal/capability_space_tpl.h>
/* local includes */
#include "platform.h"
#include "core_env.h"
#include "resource_path.h"
/* Linux includes */
#include <core_linux_syscalls.h>
using namespace Genode;
/**
* Memory pool used for for core-local meta data
*/
static char _core_mem[80*1024*1024];
/*
* Basic semaphore implementation based on the 'pipe' syscall.
*
* This alternative implementation is needed to be able to wake up the
* blocked main thread from a signal handler executed by the same thread.
*/
class Pipe_semaphore
{
private:
int _pipefd[2];
public:
Pipe_semaphore()
{
lx_pipe(_pipefd);
}
void down()
{
char dummy;
while(lx_read(_pipefd[0], &dummy, 1) != 1);
}
void up()
{
char dummy;
while (lx_write(_pipefd[1], &dummy, 1) != 1);
}
};
static Pipe_semaphore _wait_for_exit_sem; /* wakeup of '_wait_for_exit' */
static bool _do_exit = false; /* exit condition */
static void sigint_handler(int)
{
_do_exit = true;
_wait_for_exit_sem.up();
}
static void sigchld_handler(int)
{
_wait_for_exit_sem.up();
}
Platform::Platform()
: _core_mem_alloc(nullptr)
{
/* make 'mmap' behave deterministically */
lx_disable_aslr();
/* increase maximum number of open file descriptors to the hard limit */
lx_boost_rlimit();
/* catch control-c */
lx_sigaction(LX_SIGINT, sigint_handler, false);
/* catch SIGCHLD */
lx_sigaction(LX_SIGCHLD, sigchld_handler, false);
/* create resource directory under /tmp */
lx_mkdir(resource_path(), S_IRWXU);
_core_mem_alloc.add_range((addr_t)_core_mem, sizeof(_core_mem));
/*
* Occupy the socket handle that will be used to propagate the parent
* capability new processes. Otherwise, there may be the chance that the
* parent capability as supplied by the process creator will be assigned to
* this handle, which would result in a 'dup2' syscall taking
* PARENT_SOCKET_HANDLE as both source and target descriptor.
*/
lx_dup2(0, PARENT_SOCKET_HANDLE);
}
void Platform::wait_for_exit()
{
for (;;) {
/*
* Block until a signal occurs.
*/
_wait_for_exit_sem.down();
/*
* Each time, the '_wait_for_exit_sem' gets unlocked, we could have
* received either a SIGINT or SIGCHLD. If a SIGINT was received, the
* '_exit' condition will be set.
*/
if (_do_exit)
break;
/*
* Reflect SIGCHLD as exception signal to the signal context of the CPU
* session of the process. Because multiple children could have been
* terminated, we iterate until 'pollpid' (wrapper around 'wait4')
* returns -1.
*/
for (;;) {
int const pid = lx_pollpid();
if (pid <= 0)
break;
Platform_thread::submit_exception(pid);
}
}
lx_exit_group(0);
}
/****************************************************
** Support for Platform_env_base::Region_map_mmap **
****************************************************/
size_t Region_map_mmap::_dataspace_size(Capability<Dataspace> ds_cap)
{
if (!ds_cap.valid())
return Local_capability<Dataspace>::deref(ds_cap)->size();
/* use local function call if called from the entrypoint */
return core_env().entrypoint().apply(ds_cap, [] (Dataspace *ds) {
return ds ? ds->size() : 0; });
}
int Region_map_mmap::_dataspace_fd(Capability<Dataspace> ds_cap)
{
Capability<Linux_dataspace> lx_ds_cap = static_cap_cast<Linux_dataspace>(ds_cap);
/*
* Return a duplicate of the dataspace file descriptor, which will be freed
* immediately after mmap'ing the file (see 'Region_map_mmap').
*
* Handing out the original file descriptor would result in the premature
* release of the descriptor. So the descriptor could be reused (i.e., as a
* socket descriptor during the RPC handling). When later destroying the
* dataspace, the descriptor would unexpectedly be closed again.
*/
return core_env().entrypoint().apply(lx_ds_cap, [] (Linux_dataspace *ds) {
return ds ? lx_dup(Capability_space::ipc_cap_data(ds->fd()).dst.socket.value) : -1; });
}
bool Region_map_mmap::_dataspace_writable(Dataspace_capability ds_cap)
{
return core_env().entrypoint().apply(ds_cap, [] (Dataspace *ds) {
return ds ? ds->writable() : false; });
}