Do what's easiest for you.

On Sun, Mar 12, 2023 at 8:05 PM Kent Mcleod <kent.mcleod72@gmail.com> wrote:
On Fri, Feb 3, 2023 at 10:29 AM Sam Leffler <sleffler@google.com> wrote:
On Thu, Feb 2, 2023 at 1:56 PM Kent Mcleod <kent.mcleod72@gmail.com> wrote:
On Fri, Feb 3, 2023 at 8:26 AM Sam Leffler via Devel <devel@sel4.systems> wrote:
I have a target platform with only 4M of memory. When the system image is generated and the shoehorn helper script is used to find a place in memory to load the build artifacts, it tacks on an extra 4M of memory use (aka fudge_factor). The comment in the code <https://github.com/AmbiML/sparrow-seL4_tools/blame/master/cmake-tool/helpers...> says this is to accommodate sel4test_driver. Needless to say, this breaks on my 4M target platform. So I made the fudge factor settable from the command line with a default of 0 and changed the sel4test build glue to set 4M when building elfloader. This works fine for my target platform, but the change breaks building a bootable image for rpi3 (AARCH64=1 bcm2837): shoehorn places elfloader such that it overlaps the image; e.g.
ELF-loader started on CPU: ARM Ltd. Cortex-A53 r0p4
  paddr=[335000..51a0ff]
No DTB passed in from boot loader.
Looking for DTB in CPIO archive...found at 378778.
Loaded DTB from 378778.
  paddr=[237000..23afff]
ELF-loading image 'kernel' to 0
  paddr=[0..236fff]
  vaddr=[ffffff8000000000..ffffff8000236fff]
  virt_entry=ffffff8000000000
ELF-loading image 'capdl-loader' to 23b000
  paddr=[23b000..33bfff]
  vaddr=[400000..500fff]
  virt_entry=4009a8
ERROR: image load address overlaps with ELF-loader!
ERROR: Physical address range invalid
ERROR: Could not load user image ELF
Debug output of shoehorn for this case:
shoehorn: debug: found CPIO identifying sequence b'070701' at offset 0x40 in /usr/local/google/home/sleffler/shodan/out/cantrip/aarch64-unknown-elf/release/elfloader/archive.o
shoehorn: debug: encountered CPIO entry name: kernel.elf
shoehorn: debug: encountered CPIO entry name: kernel.dtb
shoehorn: debug: encountered CPIO entry name: capdl-loader
shoehorn: debug: setting marker to 0x0 (region 0 start)
shoehorn: debug: setting marker to 0x237000 (kernel_end)
shoehorn: debug: setting marker to 0x23b000 (dtb_end)
shoehorn: debug: setting marker to 0x335000 (end of rootserver)
So two questions:
1. Where is the 4M under-count of sel4test_driver? (The code indicates this might be explained in JIRA SELFOUR-2335, but I couldn't locate it.)
Here is the referred to Jira issue, but it doesn't provide any additional context: https://sel4.atlassian.net/browse/SELFOUR-2335
shoehorn is attempting to calculate how the kernel and root server binaries will be unpacked into memory in order to place the elfloader's start address above the unpacked region. shoehorn calculates the region by iterating over the PT_LOAD segments from each ELF file. The elfloader then unpacks each ELF file at runtime by iterating over the PT_LOAD segments.
For some reason, the two implementations don't agree. In your case, the offline calculation expects the root server to be loaded at [0x23b000, 0x335000), whereas the online calculation attempts [0x23b000, 0x33c000), i.e. the paddr=[23b000..33bfff] shown in the elfloader output. Are you able to print the segment headers for the root server image you are loading?
I'm guessing (from quickly looking at the code) the issue is that the shoehorn calculation only sums the p_memsz amounts for each PT_LOAD segment and isn't taking into account any gaps between segments in the virtual address space.
Yes, that appears to be the issue. readelf of capdl-loader shows:
Program Headers:
  Type   Offset             VirtAddr           PhysAddr
         FileSiz            MemSiz             Flags  Align
  LOAD   0x0000000000000000 0x0000000000400000 0x0000000000400000
         0x00000000000a9130 0x00000000000a9130 RWE    0x1000
  LOAD   0x0000000000000000 0x00000000004b0000 0x00000000004b0000
         0x0000000000000000 0x0000000000050168 RW     0x1000
so there's a gap between the two load segments that isn't accounted for. Attached is a change that seems to DTRT. It also appears to eliminate the need for fudge_factor (in quick testing). You'll probably want to write your own fix as my python fu is basic.
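The mismatch can be reproduced with a few lines of Python. This is only an illustrative sketch, not shoehorn's or elfloader's actual code: the function names are made up for this example, and rounding to 4 KiB pages is an assumption that happens to reproduce the addresses logged in this thread. The segment values are the two PT_LOAD entries from the readelf output of capdl-loader.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB rounding, chosen to match the logged addresses


def align_up(value, alignment=PAGE_SIZE):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def align_down(value, alignment=PAGE_SIZE):
    """Round value down to a multiple of alignment (a power of two)."""
    return value & ~(alignment - 1)


# (p_vaddr, p_memsz) of the two PT_LOAD segments of capdl-loader,
# copied from the readelf output quoted in this thread.
CAPDL_LOADER_SEGMENTS = [
    (0x400000, 0xA9130),
    (0x4B0000, 0x50168),
]


def size_by_summing(segments):
    """Hypothetical 'offline' estimate: add up p_memsz and ignore gaps."""
    return align_up(sum(memsz for _, memsz in segments))


def size_by_span(segments):
    """Hypothetical 'online' footprint: lowest vaddr to highest segment end."""
    lowest = align_down(min(vaddr for vaddr, _ in segments))
    highest = align_up(max(vaddr + memsz for vaddr, memsz in segments))
    return highest - lowest


load_base = 0x23B000  # dtb_end marker from the shoehorn debug output
summed_end = load_base + size_by_summing(CAPDL_LOADER_SEGMENTS)
span_end = load_base + size_by_span(CAPDL_LOADER_SEGMENTS)

print(f"sum of p_memsz: [{load_base:#x}, {summed_end:#x})")
print(f"vaddr span:     [{load_base:#x}, {span_end:#x})")
print(f"under-count:    {span_end - summed_end:#x} bytes")
```

Under these assumptions the summed estimate ends at 0x335000, which is where elfloader was placed, while the span ends at 0x33c000, which matches the paddr=[23b000..33bfff] range that elfloader tried to use.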
Thanks for this fix, Sam. It looks appropriate. If https://github.com/seL4/seL4_tools/pull/158 passes the test suite then I'll try to get the fix merged. Can I use your commit and sign off on the certificate of originality, or would you prefer I rewrite it?
A fudge-factor wouldn't be needed if these two calculations weren't out of sync.
2. Should zeroing fudge_factor work? If yes, where should I look to remedy the above?
I looked upstream for changes that might address this issue but didn't see anything.
I suspect I can invert my logic and default fudge_factor to some value and then override as needed (e.g. 0 for my sparrow platform & 4M for sel4test builds).
This seems fine to me.
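For illustration, the default-plus-override arrangement discussed above could look like the following sketch. The `--fudge-factor` option name, the `parse_size` helper, and the 4M default are assumptions made for this example only; they are not taken from shoehorn's actual command-line interface.

```python
import argparse


def parse_size(text):
    """Parse sizes such as '0', '4096', '0x1000' or '4M' into bytes."""
    suffixes = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
    text = text.strip()
    if text and text[-1].upper() in suffixes:
        return int(text[:-1], 0) * suffixes[text[-1].upper()]
    return int(text, 0)


def build_parser(default_fudge="4M"):
    """Build a parser with a hypothetical --fudge-factor option."""
    parser = argparse.ArgumentParser(
        description="Sketch of an overridable fudge factor (hypothetical CLI)."
    )
    parser.add_argument(
        "--fudge-factor",
        type=parse_size,
        default=parse_size(default_fudge),
        help="extra bytes to reserve above the unpacked images",
    )
    return parser


# A sel4test-style build would keep the default; a small-memory
# platform would pass 0 explicitly.
default_args = build_parser().parse_args([])
small_platform_args = build_parser().parse_args(["--fudge-factor", "0"])
print(default_args.fudge_factor, small_platform_args.fudge_factor)
```

With a non-zero default, existing builds keep their current behaviour, and only platforms that cannot spare the memory need to pass an explicit value.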
-Sam