Some questions about group of camkes (adjust the format)
Hi,

* We learned about CAmkES groups from https://github.com/seL4/camkes-tool/blob/master/docs/index.md
* A group can colocate two component instances in a single address space. We want to use this to share variables, for example by passing the virtual address of a string via a direct call. But we are confused about global symbols (both functions and variables) that have the same name in a single address space. Their behavior seems undefined.
* Our CAmkES test project is below; client_1 and client_2 are in the same group:

    import <std_connector.camkes>;

    procedure MyInterface {
        void print(in string message);
    }

    component Client {
        control;
        uses MyInterface iface;
    }

    component Server {
        provides MyInterface iface;
    }

    assembly {
        composition {
            group g {
                component Client client_1;
                component Client client_2;
            }
            component Server server;

            connection seL4RPCCall conn(from g.client_1.iface, from g.client_2.iface, to server.iface);
        }
        configuration {
            client_1.iface_attributes = 11;
            client_2.iface_attributes = 22;
        }
    }

1. We tested functions in a group (a single address space):

* The client code is below:

    #include <camkes.h>

    int run(void)
    {
        const char *name = get_instance_name();
        iface_print(name);
        return 0;
    }

* The server code is below:

    #include <camkes.h>
    #include <stdio.h>

    void iface_print(const char *name)
    {
        /* Cast so the argument matches the %d conversion. */
        printf("[%s] ping, the badge is [%d]\n", name, (int)iface_get_sender_id());
    }

* In this CAmkES project we have two clients, client_1 and client_2. They share the same code, and both can call "iface_print" to access the server function. In this scenario we have two questions:

* Question 1: After compiling the test project, we found that the symbol table of g_group_bin contains two "iface_print" symbols. They have different addresses even though they have the same name, so it looks like each component calls its own "iface_print". We think that in a single address space components are like threads. Why do two symbols with the same name (iface_print) not conflict in one address space?

* Question 2: When a client accesses the server, we provide the _attributes (badge) for the client's interface. In our test, the two clients can each call their own "iface_print", but the server only ever receives one badge, 22. Our test output is below:

    [client_2] ping, the badge is [22]
    [client_1] ping, the badge is [22]

* The attribute (badge) of client_1 is overwritten by the attribute (badge) of client_2. The iface_print functions of the two clients have only one capability to access the IPC endpoint object.
* We cannot understand whether, in a single address space, the "iface_print" of the two components behaves as one function or as two different ones.

2. We tested global variables in a group (a single address space):

* We found that whether a global variable is initialized leads to completely different behavior. Our test code is below.

* Scene 1: client code below:

    #include <camkes.h>
    #include <stdio.h>
    #include <string.h>

    int temp_value;

    int run(void)
    {
        const char *name = get_instance_name();
        iface_print(name);
        printf("at start, in %s, the temp_value address is [%p], value is [0x%x]\n",
               name, (void *)&temp_value, temp_value);
        if (strcmp(name, "client_1") == 0) {
            /* Busy-wait so that client_2 runs first. */
            int count = 1000000000;
            while (count--) {
                asm volatile("");
            }
            /* client_1 reads temp_value */
            printf("at last, in %s, the temp_value address is [%p], value is [0x%x]\n",
                   name, (void *)&temp_value, temp_value);
        } else {
            /* client_2 changes the value of temp_value */
            temp_value = 0x12345;
            printf("at last, in %s, the temp_value address is [%p], value is [0x%x]\n",
                   name, (void *)&temp_value, temp_value);
        }
        return 0;
    }

* Phenomenon: client_1 and client_2 are different instances of the same code in the same group. From the above code, the symbol "temp_value" is in the bss section. The temp_value of client_1 and the temp_value of client_2 have the same virtual address, and if client_2 changes the value, client_1 sees the change. Test output below:

    at start, in client_1, the temp_value address is [0x659e70], value is [0x0]
    at start, in client_2, the temp_value address is [0x659e70], value is [0x0]
    at last, in client_2, the temp_value address is [0x659e70], value is [0x12345]
    at last, in client_1, the temp_value address is [0x659e70], value is [0x12345]

* Scene 2: client code below:

    #include <camkes.h>
    #include <stdio.h>
    #include <string.h>

    int temp_value = 1;

    int run(void)
    {
        const char *name = get_instance_name();
        iface_print(name);
        printf("at start, in %s, the temp_value address is [%p], value is [0x%x]\n",
               name, (void *)&temp_value, temp_value);
        if (strcmp(name, "client_1") == 0) {
            /* Busy-wait so that client_2 runs first. */
            int count = 1000000000;
            while (count--) {
                asm volatile("");
            }
            /* client_1 reads temp_value */
            printf("at last, in %s, the temp_value address is [%p], value is [0x%x]\n",
                   name, (void *)&temp_value, temp_value);
        } else {
            /* client_2 changes the value of temp_value */
            temp_value = 0x12345;
            printf("at last, in %s, the temp_value address is [%p], value is [0x%x]\n",
                   name, (void *)&temp_value, temp_value);
        }
        return 0;
    }

* Phenomenon: client_1 and client_2 are different instances of the same code in the same group. From the above code, the symbol "temp_value" is in the data section. The temp_value of client_1 and the temp_value of client_2 have different virtual addresses, and if client_2 changes the value, client_1 does not see the change. Test output below:

    at last, in client_2, the temp_value address is [0x445ba8], value is [0x12345]
    at last, in client_1, the temp_value address is [0x444008], value is [0x1]

* The code of scene 1 and scene 2 is almost the same. The only difference is whether temp_value is initialized, that is, whether it lands in the bss section or the data section. But the symbol tables are completely different: in scene 1 the symbol table of g_group_bin has only one "temp_value" symbol, but in scene 2 it has two symbols named temp_value. We also do not understand whether global variables with the same name should be one symbol or two in the single address space of a group. It looks random.

Thank you very much
On Dec 23, 2021, at 08:50, yadong.li <yadong.li@horizon.ai> wrote:
> Hi
>
> * We learned about CAmkES groups from https://github.com/seL4/camkes-tool/blob/master/docs/index.md
> * A group can colocate two component instances in a single address space. We want to use this to share variables, for example by passing the virtual address of a string via a direct call. But we are confused about global symbols (both functions and variables) that have the same name in a single address space. Their behavior seems undefined.
My information on how this works is out of date and I can’t answer all your questions, but I can tell you how this feature was originally implemented.

Single address space components were originally implemented using post-compilation symbol mangling. As you’ve noticed, naive linking of two component instances together presents two problems:

1. Unrelated homonyms between the instances now conflict. A symbol called `foo` in component instance A and a symbol called `foo` in component instance B now refer to the same thing when linked together.
2. References to glue code have the opposite problem: their names may differ in component instance A and component instance B, but need to refer to the same functions when the two are linked together.

Both problems were solved with the same mechanism: GNU objcopy. The invocations to objcopy were template-generated so they could take advantage of knowledge about the system being compiled. A generated objcopy invocation would do the following:

1. Adjust all non-generated symbols to have internal visibility. Component instance A and component instance B have already been (partially) linked, so at this point the only unresolved symbols that need to remain externally visible are those related to the connection(s) between them. This solves problem 1 above.
2. Name-mangle all remaining symbols to something prefixed with the relevant connection instance’s name. IIRC the name mangling scheme was something like ‘<connection instance name><space><original symbol name>’. This guaranteed uniqueness because <space> is not a character that can appear in C or assembly symbol names. Whether using a space in a symbol name is legal or not, I don’t know, but all the binutils seemed fine with this.
3. Rename the second component of the mangled name above to a common name. I don’t recall exactly how this name was chosen, but this solves problem 2 above.

This sounds pretty unorthodox and brittle, but it actually worked surprisingly well.
All combinations of single address space component systems seemed to Just Work, with a few notable sharp edges:

* Anything involving MMIO was tricky. These symbols frequently needed to remain externally visible and the two component instances would often have differing names but the same addresses for them.
* GNU ld was more or less required. The multiple steps involving partial linking were only supported by GNU ld and Gold at the time. LLVM’s lld may have caught up in the interim years.
* The objcopy name mangling broke cross-section references used by GCC’s implementation of Link-Time Optimization. As a result, any LTO compilation degraded to LTO being disabled. This wouldn’t have been a big deal except that one of the primary reasons to put two components in a single address space is to enable cross-component inlining, usually facilitated by LTO. AFAICT this (playing objcopy tricks and expecting LTO to still work) was simply not a supported use case. We explored how to work around this and got some one-off efforts working for benchmarking, but proper support would have involved altering the way binutils work.
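As a rough illustration of the name-mangling scheme recalled in step 2 above, here is a minimal C sketch that builds a name of the form ‘<connection instance name><space><original symbol name>’. The helper `mangle_symbol` is hypothetical and is not part of CAmkES or binutils; the real renaming was done by template-generated objcopy invocations, whose exact form is not given here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a CAmkES or binutils API): writes
 * "<connection> <symbol>" into out. The single space separator cannot
 * occur in a C identifier, so the result cannot collide with any
 * symbol written by hand in component code.
 *
 * Returns 0 on success, -1 if out is too small to hold the full name.
 */
int mangle_symbol(char *out, size_t out_len,
                  const char *connection, const char *symbol)
{
    assert(out != NULL && connection != NULL && symbol != NULL);

    int n = snprintf(out, out_len, "%s %s", connection, symbol);

    /* snprintf reports the length it wanted; detect truncation. */
    if (n < 0 || (size_t)n >= out_len) {
        return -1;
    }
    return 0;
}
```

For example, with a connection named `conn`, calling `mangle_symbol(buf, sizeof buf, "conn", "iface_print")` leaves `conn iface_print` in `buf`. If the glue symbols on both sides of a connection are first renamed to one agreed common name (step 3), both sides end up with the same mangled string, which is what lets the linker resolve them to a single definition.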
participants (2)
- Matthew Fernandez
- yadong.li