Runtime Measurement Hiccup
Hello seL4 developers, I want to be able to retrieve data from seL4's virtual Linux machine, in order to store it in a persistent way. Namely, I want to be able to simulate a seL4 kernel, boot its Linux virtual machine, compute some hash digests, and then export those hash digests. These digests are valuable because they represent the "clean room" runtime-state of the linux machine. Currently I can export these digests by way of hand-eye coordination, but I consider this unusable as a piece of software. To date I've taken two main approaches: CAmkES FileServer or virtual networking. I'm under the impression that the FileServer changes are not persistent through reboot, and even if they were, to change the boot image after compile-time would seem to fly in the face of seL4's principles. Virtual networking seems to promise I can host my digests on a webpage that is visible to my "root host" machine; that is, the simulated seL4's linux instance hosts a site available on my 192.168.x.x network. I know there is a seL4webserver app as part of the seL4 repositories which claims to do this, but unfortunately its prose is unhelpful and it doesn't seem to work even when it compiles and simulates. I've taken two distinct strategies to investigate the virtual network approach. First, I tried to get it to work on my normal stack: Windows 10 using WSL2 using a Docker container to simulate the seL4 image. The problem with this approach is that it appears I'm required to blindly thread 3 or 4 needles all at once, without getting feedback more descriptive than "you didn't do it." In other words, there does not appear to be a partial success available, and without ICMP ping, I honestly have no idea how to debug these "virtual" networks. Next, I tried simplifying my stack by installing the dependencies natively on a Debian 10 machine, which should bypass several layers of the virtual network I was suggesting in my first strategy. Unfortunately, I met with the same "AttributeError: module 'yaml' has no attribute 'FullLoader'" error that inspired me to begin using Docker several years ago. Of course I should note that "pip/pip2/pip3 install pyyaml" all report that pyyaml is already installed, so I would be in debt to anyone who has an idea about that error. To conclude, I find virtual networks opaque, and I would be grateful for any guidance. If you have a different idea how I might achieve my goal, I would be similarly effusive in my thanks. Cheers, Michael Neises
Hi Michael, I understand your frustration: I was at the same point some time ago... Documentation is a pending matter of the seL4 ecosystem... (I hope nobody kills me for this statement). Anyway I'm pretty sure this will improve over time (it's just a matter of survival). Having said that, it looks to me your problem is more about generic architecture than seL4 related, yet very interesting to me as looks like a common use case. So, if I understood, your intention is to be sure your Linux virtual machine running on top of seL4 has not been tampered. Correct? If so, I would suggest you don't do this task from inside the VM as you can't use the (TOE) target of evaluation (Linux VM) to evaluate itself as you may end up dealing with some attacker that modified the tools you use to compute hashes... (I personally did that on Linux system long ago by tampering a (compiled) kernel module that was responsible for integrity checks on a FIPS-2 platform... So, any decent integrity check must be done from outside of the TOE. The only reliable way is to do it from seL4 host. Also, here you have 2 common scenarios: 1) VM with Inmutable file system: easy, just compute hash of the VM before it boots 2) VM with mixed file system: Inmutable + writable stuff: complex as you need to set up a procedure to check integrity on the writable part and this is not easy and all attacks will target this part of the TOE. In any case, the tool/code doing the integrity check must be out of the TOE. If you are thinking about "embedding" the integrity tools in the inmutable part of the TOE (a common error of many vendors) to check the writable stuff, I strongly suggesr not to do it. There are countless tricks to fool this kind of checks at runtime in a compromised TOE. Thus, and trying to answer your question, thinking about "exporting" integrity checks from inside of the TOE is a very bad idea as you will never be sure if you can trust this data. In short: the TOE can't be used to self check integrity in any way. If you still want to go ahead I suggest to go for seL4webserver but remember it is a bad idea and makes no sense to use seL4 platform (trusted computing) in such a way (false sense of security). Please excuse me in advance if I misunderstood something. Cheers, El mar., 19 oct. 2021 22:49, Michael Neises <neisesmichael@gmail.com> escribió:
Hello seL4 developers,
I want to be able to retrieve data from seL4's virtual Linux machine, in order to store it in a persistent way. Namely, I want to be able to simulate a seL4 kernel, boot its Linux virtual machine, compute some hash digests, and then export those hash digests. These digests are valuable because they represent the "clean room" runtime-state of the linux machine. Currently I can export these digests by way of hand-eye coordination, but I consider this unusable as a piece of software.
To date I've taken two main approaches: CAmkES FileServer or virtual networking. I'm under the impression that the FileServer changes are not persistent through reboot, and even if they were, to change the boot image after compile-time would seem to fly in the face of seL4's principles. Virtual networking seems to promise I can host my digests on a webpage that is visible to my "root host" machine; that is, the simulated seL4's linux instance hosts a site available on my 192.168.x.x network. I know there is a seL4webserver app as part of the seL4 repositories which claims to do this, but unfortunately its prose is unhelpful and it doesn't seem to work even when it compiles and simulates.
I've taken two distinct strategies to investigate the virtual network approach. First, I tried to get it to work on my normal stack: Windows 10 using WSL2 using a Docker container to simulate the seL4 image. The problem with this approach is that it appears I'm required to blindly thread 3 or 4 needles all at once, without getting feedback more descriptive than "you didn't do it." In other words, there does not appear to be a partial success available, and without ICMP ping, I honestly have no idea how to debug these "virtual" networks.
Next, I tried simplifying my stack by installing the dependencies natively on a Debian 10 machine, which should bypass several layers of the virtual network I was suggesting in my first strategy. Unfortunately, I met with the same "AttributeError: module 'yaml' has no attribute 'FullLoader'" error that inspired me to begin using Docker several years ago. Of course I should note that "pip/pip2/pip3 install pyyaml" all report that pyyaml is already installed, so I would be in debt to anyone who has an idea about that error.
To conclude, I find virtual networks opaque, and I would be grateful for any guidance. If you have a different idea how I might achieve my goal, I would be similarly effusive in my thanks.
Cheers, Michael Neises _______________________________________________ Devel mailing list -- devel@sel4.systems To unsubscribe send an email to devel-leave@sel4.systems
Hello Michael, On 2021-10-19 22:46, Michael Neises wrote:
I want to be able to retrieve data from seL4's virtual Linux machine, in order to store it in a persistent way. [...] If you have a different idea how I might achieve my goal, I would be similarly effusive in my thanks.
If it is just one time, one way blob of data passing, you could add it to Linux's command line, or put it into the device tree before starting the VM. Or use shared memory between your seL4 application and the VM. You can tell Linux about it via device tree and then map it via /dev/mem or your own Linux (UIO) kernel driver. If using camkes, it may be easiest to create a virtual PCI UIO device (I have no experience with camkes). Or use HW specific device registers/SRAM to stash the values in. The best way forward depends on what other communication you need to do between seL4 and Linux. Greetings, Indan
Michael, if you need networking then just solve this problem and the you can use it also for exporting stuff. I'm available privately to try help you, just ping me. El mié., 20 oct. 2021 8:31, Indan Zupancic <indan@nul.nu> escribió:
Hello Michael,
On 2021-10-19 22:46, Michael Neises wrote:
I want to be able to retrieve data from seL4's virtual Linux machine, in order to store it in a persistent way. [...] If you have a different idea how I might achieve my goal, I would be similarly effusive in my thanks.
If it is just one time, one way blob of data passing, you could add it to Linux's command line, or put it into the device tree before starting the VM.
Or use shared memory between your seL4 application and the VM. You can tell Linux about it via device tree and then map it via /dev/mem or your own Linux (UIO) kernel driver. If using camkes, it may be easiest to create a virtual PCI UIO device (I have no experience with camkes).
Or use HW specific device registers/SRAM to stash the values in.
The best way forward depends on what other communication you need to do between seL4 and Linux.
Greetings,
Indan _______________________________________________ Devel mailing list -- devel@sel4.systems To unsubscribe send an email to devel-leave@sel4.systems
Hello seL4 developers, Thank you for the replies. For the sake of clarity, the system works like this: At compile time, some expected digest values are made available only to a distinct CAmkES component. At the time of first-Linux-boot, a kernel module takes several measurements of the other kernel modules present (including itself). It reports these digests outwards to CAmkES, where they are compared against the expected values. It is the "pre-compile-time provisioning of these expected digests" in which I am interested. At this time, I can simulate the system and compute these digests, but the only way I have to extract them is to copy them by hand off the screen. To be totally explicit, I want to extract these values in order to re-compile them into a system that knows its expected digest values. I want to have an initial simulation where I extract these digests, so that in the subsequent compilation and simulations, the system is aware what values these digests are required to take. Please correct me if I am wrong, but I think if the very first thing the Linux kernel does is measure itself, before it is even connected to a network, then there is simply no attack surface. Of course I'm very happy to be wrong, but I don't see who the attacker is in this situation. Certainly, there remains an open question of how to extend these measurements meaningfully into the space where there is a viable attack surface (after enabling a network adapter), but I consider that question to be beside the point for now (some future work). If there is some way for me to inspect the run-time data of the Linux system without relying somewhat on a tool inside the Linux instance, I would very much like to know about it. My strategy follows the same path as the vm-introspect example app (which I'm under the impression was created for this explicit purpose), which itself trusts implicitly the Linux instance. Again, to be entirely explicit, there does not appear to be any information anywhere on a way to meaningfully inspect a virtualized Linux system without trusting it even the slightest bit. I would be elated to be corrected; if someone can show me how to scrape (from the seL4 side exclusively) all the bits from a virtual Linux system and reassemble those bits into semantically-valuable information, I'm sure you will not hear from me for several months while I rebuild everything I have. At this time that does not appear to be possible. Cheers, Michael Neises On Tue, Oct 19, 2021 at 3:46 PM Michael Neises <neisesmichael@gmail.com> wrote:
Hello seL4 developers,
I want to be able to retrieve data from seL4's virtual Linux machine, in order to store it in a persistent way. Namely, I want to be able to simulate a seL4 kernel, boot its Linux virtual machine, compute some hash digests, and then export those hash digests. These digests are valuable because they represent the "clean room" runtime-state of the linux machine. Currently I can export these digests by way of hand-eye coordination, but I consider this unusable as a piece of software.
To date I've taken two main approaches: CAmkES FileServer or virtual networking. I'm under the impression that the FileServer changes are not persistent through reboot, and even if they were, to change the boot image after compile-time would seem to fly in the face of seL4's principles. Virtual networking seems to promise I can host my digests on a webpage that is visible to my "root host" machine; that is, the simulated seL4's linux instance hosts a site available on my 192.168.x.x network. I know there is a seL4webserver app as part of the seL4 repositories which claims to do this, but unfortunately its prose is unhelpful and it doesn't seem to work even when it compiles and simulates.
I've taken two distinct strategies to investigate the virtual network approach. First, I tried to get it to work on my normal stack: Windows 10 using WSL2 using a Docker container to simulate the seL4 image. The problem with this approach is that it appears I'm required to blindly thread 3 or 4 needles all at once, without getting feedback more descriptive than "you didn't do it." In other words, there does not appear to be a partial success available, and without ICMP ping, I honestly have no idea how to debug these "virtual" networks.
Next, I tried simplifying my stack by installing the dependencies natively on a Debian 10 machine, which should bypass several layers of the virtual network I was suggesting in my first strategy. Unfortunately, I met with the same "AttributeError: module 'yaml' has no attribute 'FullLoader'" error that inspired me to begin using Docker several years ago. Of course I should note that "pip/pip2/pip3 install pyyaml" all report that pyyaml is already installed, so I would be in debt to anyone who has an idea about that error.
To conclude, I find virtual networks opaque, and I would be grateful for any guidance. If you have a different idea how I might achieve my goal, I would be similarly effusive in my thanks.
Cheers, Michael Neises
Hi Michael Indan Zupancic suggestion about seL4 and Linux communication via shared memory by reserving a memory block in the memory map passed to the linux kernel sounds like an good initial step ( not sure if seL4 has some native mechanism to do so), you can have some telemetry code running inside the kernel by using ebpf ( https://www.brendangregg.com/ebpf.html), ebpf execution mechanism is already in the kernel and probably can be used to directly have access to the memory region shared with your seL4. basically ebpf could be used to inspect any runtime inside the linux system without trusting any tool available within linux user space. this is just a suggestion, an seL4 expert probably can give a more reliable option. -Jorge A. Garcia On Sun, Oct 24, 2021 at 12:46 PM Michael Neises <neisesmichael@gmail.com> wrote:
Hello seL4 developers,
Thank you for the replies.
For the sake of clarity, the system works like this: At compile time, some expected digest values are made available only to a distinct CAmkES component. At the time of first-Linux-boot, a kernel module takes several measurements of the other kernel modules present (including itself). It reports these digests outwards to CAmkES, where they are compared against the expected values. It is the "pre-compile-time provisioning of these expected digests" in which I am interested. At this time, I can simulate the system and compute these digests, but the only way I have to extract them is to copy them by hand off the screen. To be totally explicit, I want to extract these values in order to re-compile them into a system that knows its expected digest values. I want to have an initial simulation where I extract these digests, so that in the subsequent compilation and simulations, the system is aware what values these digests are required to take.
Please correct me if I am wrong, but I think if the very first thing the Linux kernel does is measure itself, before it is even connected to a network, then there is simply no attack surface. Of course I'm very happy to be wrong, but I don't see who the attacker is in this situation. Certainly, there remains an open question of how to extend these measurements meaningfully into the space where there is a viable attack surface (after enabling a network adapter), but I consider that question to be beside the point for now (some future work). If there is some way for me to inspect the run-time data of the Linux system without relying somewhat on a tool inside the Linux instance, I would very much like to know about it. My strategy follows the same path as the vm-introspect example app (which I'm under the impression was created for this explicit purpose), which itself trusts implicitly the Linux instance. Again, to be entirely explicit, there does not appear to be any information anywhere on a way to meaningfully inspect a virtualized Linux system without trusting it even the slightest bit. I would be elated to be corrected; if someone can show me how to scrape (from the seL4 side exclusively) all the bits from a virtual Linux system and reassemble those bits into semantically-valuable information, I'm sure you will not hear from me for several months while I rebuild everything I have. At this time that does not appear to be possible.
Cheers, Michael Neises
On Tue, Oct 19, 2021 at 3:46 PM Michael Neises <neisesmichael@gmail.com> wrote:
Hello seL4 developers,
I want to be able to retrieve data from seL4's virtual Linux machine, in order to store it in a persistent way. Namely, I want to be able to simulate a seL4 kernel, boot its Linux virtual machine, compute some hash digests, and then export those hash digests. These digests are valuable because they represent the "clean room" runtime-state of the linux machine. Currently I can export these digests by way of hand-eye coordination, but I consider this unusable as a piece of software.
To date I've taken two main approaches: CAmkES FileServer or virtual networking. I'm under the impression that the FileServer changes are not persistent through reboot, and even if they were, to change the boot image after compile-time would seem to fly in the face of seL4's principles. Virtual networking seems to promise I can host my digests on a webpage that is visible to my "root host" machine; that is, the simulated seL4's linux instance hosts a site available on my 192.168.x.x network. I know there is a seL4webserver app as part of the seL4 repositories which claims to do this, but unfortunately its prose is unhelpful and it doesn't seem to work even when it compiles and simulates.
I've taken two distinct strategies to investigate the virtual network approach. First, I tried to get it to work on my normal stack: Windows 10 using WSL2 using a Docker container to simulate the seL4 image. The problem with this approach is that it appears I'm required to blindly thread 3 or 4 needles all at once, without getting feedback more descriptive than "you didn't do it." In other words, there does not appear to be a partial success available, and without ICMP ping, I honestly have no idea how to debug these "virtual" networks.
Next, I tried simplifying my stack by installing the dependencies natively on a Debian 10 machine, which should bypass several layers of the virtual network I was suggesting in my first strategy. Unfortunately, I met with the same "AttributeError: module 'yaml' has no attribute 'FullLoader'" error that inspired me to begin using Docker several years ago. Of course I should note that "pip/pip2/pip3 install pyyaml" all report that pyyaml is already installed, so I would be in debt to anyone who has an idea about that error.
To conclude, I find virtual networks opaque, and I would be grateful for any guidance. If you have a different idea how I might achieve my goal, I would be similarly effusive in my thanks.
Cheers, Michael Neises
_______________________________________________ Devel mailing list -- devel@sel4.systems To unsubscribe send an email to devel-leave@sel4.systems
Hi Michael, "Please correct me if I am wrong, but I think if the very first thing the Linux kernel does is measure itself, before it is even connected to a network, then there is simply no attack surface" My 5 cents: it is not so simple... it depends on the specfic run time environment. Anyway, just as an example, some years ago I was challenged with a similar scenario: an appliance running Linux firmware with an embedded integrity mechanism in the kernel code that checked its own integrity and also the integrity of all loaded kernel modules (that were doing integrity checks of the file system). Once initial modules were loaded no more were allowed to be loaded. Anyway, the running kernel was very outdated, so I was able to find a vulnerability that allowed me to inject my own data/code in the kernel space. The problem was persistence: most of the file system was read only... with the exception of some config files in the compact flash storage... a second bug in the parsing of the config files (that allowed user space command execution to trigger the kernel vuln) gave me the persistence I wanted for my kernel level vulnerability in that "inmutable" system. Game over. So, it really depends on your environment. As long you have I/O data operation were an attacker can interact to some persistent storage, then there's room for persistent intrusion no matter the runtime checks you do on the kernel or the file system. There have been plenty of even more elaborated attacks/tricks on heavily siloed and isolated and "inmutable" systems that have been carried out in the computing history. In fact, those are the interesting ones... :-) Hopes this helps. El dom., 24 oct. 2021 19:46, Michael Neises <neisesmichael@gmail.com> escribió:
Hello seL4 developers,
Thank you for the replies.
For the sake of clarity, the system works like this: At compile time, some expected digest values are made available only to a distinct CAmkES component. At the time of first-Linux-boot, a kernel module takes several measurements of the other kernel modules present (including itself). It reports these digests outwards to CAmkES, where they are compared against the expected values. It is the "pre-compile-time provisioning of these expected digests" in which I am interested. At this time, I can simulate the system and compute these digests, but the only way I have to extract them is to copy them by hand off the screen. To be totally explicit, I want to extract these values in order to re-compile them into a system that knows its expected digest values. I want to have an initial simulation where I extract these digests, so that in the subsequent compilation and simulations, the system is aware what values these digests are required to take.
Please correct me if I am wrong, but I think if the very first thing the Linux kernel does is measure itself, before it is even connected to a network, then there is simply no attack surface. Of course I'm very happy to be wrong, but I don't see who the attacker is in this situation. Certainly, there remains an open question of how to extend these measurements meaningfully into the space where there is a viable attack surface (after enabling a network adapter), but I consider that question to be beside the point for now (some future work). If there is some way for me to inspect the run-time data of the Linux system without relying somewhat on a tool inside the Linux instance, I would very much like to know about it. My strategy follows the same path as the vm-introspect example app (which I'm under the impression was created for this explicit purpose), which itself trusts implicitly the Linux instance. Again, to be entirely explicit, there does not appear to be any information anywhere on a way to meaningfully inspect a virtualized Linux system without trusting it even the slightest bit. I would be elated to be corrected; if someone can show me how to scrape (from the seL4 side exclusively) all the bits from a virtual Linux system and reassemble those bits into semantically-valuable information, I'm sure you will not hear from me for several months while I rebuild everything I have. At this time that does not appear to be possible.
Cheers, Michael Neises
On Tue, Oct 19, 2021 at 3:46 PM Michael Neises <neisesmichael@gmail.com> wrote:
Hello seL4 developers,
I want to be able to retrieve data from seL4's virtual Linux machine, in order to store it in a persistent way. Namely, I want to be able to simulate a seL4 kernel, boot its Linux virtual machine, compute some hash digests, and then export those hash digests. These digests are valuable because they represent the "clean room" runtime-state of the linux machine. Currently I can export these digests by way of hand-eye coordination, but I consider this unusable as a piece of software.
To date I've taken two main approaches: CAmkES FileServer or virtual networking. I'm under the impression that the FileServer changes are not persistent through reboot, and even if they were, to change the boot image after compile-time would seem to fly in the face of seL4's principles. Virtual networking seems to promise I can host my digests on a webpage that is visible to my "root host" machine; that is, the simulated seL4's linux instance hosts a site available on my 192.168.x.x network. I know there is a seL4webserver app as part of the seL4 repositories which claims to do this, but unfortunately its prose is unhelpful and it doesn't seem to work even when it compiles and simulates.
I've taken two distinct strategies to investigate the virtual network approach. First, I tried to get it to work on my normal stack: Windows 10 using WSL2 using a Docker container to simulate the seL4 image. The problem with this approach is that it appears I'm required to blindly thread 3 or 4 needles all at once, without getting feedback more descriptive than "you didn't do it." In other words, there does not appear to be a partial success available, and without ICMP ping, I honestly have no idea how to debug these "virtual" networks.
Next, I tried simplifying my stack by installing the dependencies natively on a Debian 10 machine, which should bypass several layers of the virtual network I was suggesting in my first strategy. Unfortunately, I met with the same "AttributeError: module 'yaml' has no attribute 'FullLoader'" error that inspired me to begin using Docker several years ago. Of course I should note that "pip/pip2/pip3 install pyyaml" all report that pyyaml is already installed, so I would be in debt to anyone who has an idea about that error.
To conclude, I find virtual networks opaque, and I would be grateful for any guidance. If you have a different idea how I might achieve my goal, I would be similarly effusive in my thanks.
Cheers, Michael Neises
_______________________________________________ Devel mailing list -- devel@sel4.systems To unsubscribe send an email to devel-leave@sel4.systems
participants (4)
-
Hugo V.C.
-
Indan Zupancic
-
Jorge Alberto Garcia
-
Michael Neises