in src/arch/x86/machine/fpu.c:140 on 4.0.0, there is
    if (x86_cpuid_ebx(0x0d, 0x0) != CONFIG_XSAVE_SIZE) {
        printf("XSAVE buffer set set to %d, but should be %d\n", CONFIG_XSAVE_SIZE, x86_cpuid_ebx(0x0d, 0x0));
        return false;
    }
I want to force a large(r than 576) xsave region so that TCBBits is always 11, to help my userspace be vaguely more kernel-config-independent. I don't think it's a problem if the xsave buffer in the TCB is larger than the hardware will actually use? I also think this is buggy. Shouldn't it be querying ecx, since things like AVX aren't enabled in the kernel? To be clear, this is my suggested change:
    if (x86_cpuid_ecx(0x0d, 0x0) > CONFIG_XSAVE_SIZE) {
        printf("XSAVE buffer set set to %d, but should be at least %d\n", CONFIG_XSAVE_SIZE, x86_cpuid_ecx(0x0d, 0x0));
        return false;
    }
-- cmr http://octayn.net/ +16038524272
On Fri 06-Jan-2017 6:05 PM, Corey Richardson wrote:
I want to force a large(r than 576) xsave region so that TCBBits is always 11 [...] Shouldn't it be querying ecx, since things like AVX aren't enabled in the kernel?

Hi Corey,

The original intention of the 'aggressive' check was to guard against exactly your case, where the user has set the size higher than required and may inadvertently be wasting memory by having TCBBits end up larger than needed. I agree that it can be a little inconvenient, so I will change it to allow sizes that are larger, but emit a warning.

Comparing against ebx instead of ecx is quite deliberate, as this way we only measure the size required for the features that the user actually wanted to use, as defined in CONFIG_XSAVE_FEATURE_SET. This is why we do this check just after performing 'write_xcr0' with said desired features.

Adrian
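A minimal sketch of the relaxed check described in this reply: reject a buffer that is too small, accept one that is larger than needed but warn about it. This is not the actual seL4 patch. The enum, the function name check_xsave_size and the message wording are hypothetical, and the two sizes are passed in as plain arguments so the logic can be tried in userspace. In the kernel, `required` would be the value of x86_cpuid_ebx(0x0d, 0x0) read after write_xcr0, and `configured` would be CONFIG_XSAVE_SIZE.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical result codes for illustration; not part of the seL4 sources. */
enum xsave_check {
    XSAVE_SIZE_OK,        /* configured size matches the required size exactly */
    XSAVE_SIZE_OVERSIZED, /* larger than required: usable, but wastes TCB space */
    XSAVE_SIZE_TOO_SMALL  /* smaller than required: the state would not fit */
};

/*
 * required:   bytes needed for the features enabled in XCR0
 *             (assumed to come from CPUID leaf 0x0d, sub-leaf 0, EBX)
 * configured: bytes reserved per thread (stand-in for CONFIG_XSAVE_SIZE)
 */
enum xsave_check check_xsave_size(uint32_t required, uint32_t configured)
{
    if (configured < required) {
        printf("XSAVE buffer set to %u, but should be at least %u\n",
               (unsigned)configured, (unsigned)required);
        return XSAVE_SIZE_TOO_SMALL;
    }
    if (configured > required) {
        /* Not fatal: only warn that memory may be wasted. */
        printf("Warning: XSAVE buffer set to %u, but only %u is required\n",
               (unsigned)configured, (unsigned)required);
        return XSAVE_SIZE_OVERSIZED;
    }
    return XSAVE_SIZE_OK;
}
```

With this shape, a caller would fail initialisation only on XSAVE_SIZE_TOO_SMALL, which is what allows a deliberately oversized buffer to keep TCBBits constant across configurations.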
On 01/09/2017 12:37 AM, Adrian.Danis@data61.csiro.au wrote:
Hi Corey,
The original intention of the 'aggressive' check was to guard against exactly your case, where the user has set the size higher than required and may inadvertently be wasting memory by having TCBBits end up larger than needed. I agree that it can be a little inconvenient, so I will change it to allow sizes that are larger, but emit a warning.
Sounds great, thanks!
Comparing against ebx instead of ecx is quite deliberate as this way we only measure the size required for the features that the user actually wanted to use as defined in CONFIG_XSAVE_FEATURE_SET. This is why we do this check just after performing 'write_xcr0' with said desired features.
Hm, this isn't consistent with my observations of the actual behavior, although I'm not seeing why at this point. On my Ivy Bridge Xeon CPU (supports AVX) with CONFIG_XSAVE_FEATURE_SET=3, using Qemu/KVM with -cpu host and the linked kernel config (and a small patch to always print out ebx/ecx), I get 576 for ebx and 832 for ecx.

Kernel config: http://ix.io/1PCY
My only patch from 4.0.0: http://ix.io/1PCZ
Kernel log: http://ix.io/1PD3
Qemu invocation:
qemu-system-x86_64 -cpu host,+avx -enable-kvm -nographic -kernel ~/proj/robigalia/sel4/stage/kernel-x86_64-pc99 -initrd ~/proj/robigalia/hello-world/target/x86_64-sel4-robigalia/release/hello-world
Maybe there's another issue at play here? -- cmr http://octayn.net/ +16038524272
On Tue 10-Jan-2017 2:20 PM, Corey Richardson wrote:
On my Ivy Bridge Xeon CPU (supports AVX) with CONFIG_XSAVE_FEATURE_SET=3, using Qemu/KVM with -cpu host and the linked kernel config (and a small patch to always print out ebx/ecx), I get 576 for ebx and 832 for ecx.

This makes sense to me. You have set the feature mask to 3, which means support for SSE and FPU. The size for those two features, which is all that the kernel will enable, is 576 bytes. The 832 is what you would need if you were to also turn on AVX (which is bit 2). Now if you set the feature mask to 7, so that the SSE, FPU and AVX features are enabled, then you should get ebx also reporting 832.

Of course, if you're able to perform AVX instructions on AVX state with the feature mask set to 3 without getting a fault, then something is going very wrong.
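The 576 and 832 figures can be reproduced by hand. The sketch below assumes the standard (non-compacted) XSAVE layout and models only state-component bits 0 to 2: a 512-byte legacy region holding x87 and SSE state, a 64-byte XSAVE header, and 256 bytes of AVX state. The macro and function names are made up for this example. Real code should read the size from CPUID leaf 0x0d rather than hard-code it, since other components have hardware-defined offsets and sizes.

```c
#include <stdint.h>

/* State-component bits, as they would appear in CONFIG_XSAVE_FEATURE_SET / XCR0. */
#define XFEATURE_X87 (1u << 0) /* x87 FPU state */
#define XFEATURE_SSE (1u << 1) /* XMM registers and MXCSR */
#define XFEATURE_AVX (1u << 2) /* upper halves of the YMM registers */

#define XSAVE_LEGACY_SIZE 512u /* FXSAVE-compatible region, always present */
#define XSAVE_HEADER_SIZE 64u  /* XSAVE header, always present */
#define XSAVE_AVX_SIZE    256u /* AVX component, placed right after the header */

/*
 * Illustrative only: size of the XSAVE area for a mask that uses bits 0-2.
 * x87 and SSE both live in the legacy region, so they do not add to the size.
 */
uint32_t xsave_area_size(uint32_t feature_mask)
{
    uint32_t size = XSAVE_LEGACY_SIZE + XSAVE_HEADER_SIZE;

    if (feature_mask & XFEATURE_AVX) {
        size += XSAVE_AVX_SIZE;
    }
    return size;
}
```

A mask of 3 (XFEATURE_X87 | XFEATURE_SSE) gives 512 + 64 = 576 bytes, and a mask of 7 adds the AVX component for 832 bytes, matching the ebx and ecx values reported in the thread.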
On 01/09/2017 10:39 PM, Adrian.Danis@data61.csiro.au wrote:
This makes sense to me. You have set the feature mask to 3, which means support for SSE and FPU. The size for those two features, which is all that the kernel will enable, is 576 bytes. The 832 is what you would need if you were to also turn on AVX (which is bit 2). Now if you set the feature mask to 7, so that the SSE, FPU and AVX features are enabled, then you should get ebx also reporting 832.
Oh, wow, I completely misunderstood the help message for that. I thought 0 was the bitmask for FPU (i.e., "can't disable"), 1 for SSE, 2 for AVX, so 1 | 2 = 3 would be AVX + SSE + FPU. I see now that this understanding is flawed, and it's actually the raw state-component bitmap value.
Of course, if you're able to perform AVX instructions on AVX state with the feature mask set to 3 without getting a fault, then something is going very wrong.
This kernel doesn't even seem to boot at all with -cpu host -enable-kvm (or -cpu IvyBridge), regardless of the configuration. Curiously, with -cpu SandyBridge -enable-kvm, it does boot. It seems to be a qemu bug. It seems to be stuck in some sort of loop handling int_0e (page fault). When I attach gdb, I get this sort of behavior:
    <int_0e>     mov  0xffffffff80745008,%rsp
    <int_0e+8>   add  $0x3e8,%rsp
    <int_0e+15>  push %rcx
At the start, rsp is x64KSIRQStack, but after an si, $rsp is 0? Another si and it's 0x3e8, and trying to execute the push causes gdb to "wait" until I press Ctrl-c, at which point it's back to the top of int_0e and the stack as reported by bt is unchanged:

    #0 0xffffffff8071537c in int_0e ()
    #1 0x0000000000000002 in ?? ()
    #2 0xffffffff8071538b in int_0e ()

followed by some garbage. I expect the "waiting" is gdb not knowing what's happening when qemu makes an interrupt happen.

If it's at all useful, here's the kernel under consideration: https://files.octayn.net/sel4-4.0.0-x86_64-pc99
It's 4.0.0 with this config: http://ix.io/1PDr

Kofi was able to reproduce this in Qemu, but not on Haswell hardware. Any idea what the issue might be?

-- cmr http://octayn.net/ +16038524272
On Tue 10-Jan-2017 3:52 PM, Corey Richardson wrote:
Kofi was able to reproduce this in Qemu, but not on Haswell hardware. Any idea what the issue might be?

I've tracked this down to a combination of two problems.

1. There is actually a single GP fault happening to trigger everything, but the early boot code is not set up to handle and print out kernel faults correctly, and hence you end up in this infinite page fault loop.
2. KVM fails to emulate the PLATFORM_INFO_MSR properly, despite reporting through CPUID a CPU model where it is architecturally defined to exist. The workaround seems to be to detect GP faults to indicate a failure to read the MSR.

I cannot say when we will have a chance to fix these.

Adrian
participants (2)
- Adrian.Danis@data61.csiro.au
- Corey Richardson