Hello.
I have a very complicated problem. I understand that it is hard to give advice without looking at the sources, but I am stuck and need some ideas, in 'brainstorm' format, as input. So, here is the problem:
Many tests from the sel4test suite are working. This is what the output looks like:
http://pastebin.com/vvnDaUe9
Unfortunately, I found several configurations of the tests that cause crashes with stable but different symptoms. One of them looks like this:
11 #include
12 #include "../test.h"
13 #include "../helpers.h"
14
15 #define MIN_EXPECTED_ALLOCATIONS 100
16
17
18
19
20
21
22
23 int test_allocator(env_t env)
24 {
25     /* Perform a bunch of allocations and frees */
26     vka_object_t endpoint;
27     int error;
28
29     for (int i = 0; i < MIN_EXPECTED_ALLOCATIONS; i++) {
30         error = vka_alloc_endpoint(&env->vka, &endpoint);
31         test_assert(error == 0);
32         test_assert(endpoint.cptr != 0);
33         vka_free_object(&env->vka, &endpoint);
34     }
35
36     return sel4test_get_result();
37 }
38 DEFINE_TEST(TRIVIAL0001, "Ensure the allocator works", test_allocator)
39 DEFINE_TEST(TRIVIAL0002, "Ensure the allocator works more than once", test_allocator)
As you can see, this is a slightly modified version of the trivial.c tests. Usually, when I run all the tests, there is no problem with this one. But when I run this file alone, I have a problem. You can also see that there are several blank lines, numbered 16 to 22. They are not accidental; they are what triggers the error. If test_allocator(env_t env) is located on line 23, the test has no problem. But if I add one more blank line, I get an error like this:
Starting test suite sel4test
Starting test 0: TEST_TRIVIAL0001
TEST_TRIVIAL0001
726:trap!, address error on store!
$0 : 0 fffffff8 7 10089000
$4 : 6a6ffc 3f 0 ffffffff
$8 : 1 7 1 20
$12 : 70f 0 2 6ae
$16 : 6a6ffc 5a6a50 ffffff30 6a6ffc
$20 : 3f 5a6e58 5a6a50 6a6fdc
$24 : 20 41bab0 0 0
$28 : 0 100848d8 fffffffe 41faf0
Hi : 20
Lo : 0
epc : 41f818
fi : 41f814
ra : 41faf0
Status: 20000002
Cause : 414
Config: 80008482
BadVaddr: f
This means that the test driver (I know the ASID) tries to store something to address 0x0000000f with the instruction located at 0x41f818 (EPC). Also, it fails before the trivial.c test even starts!
This is what I see at that address:
0041f7b4 <unbin>:
41f7b4: 8c83000c lw v1,12(a0)
41f7b8: 27bdffd8 addiu sp,sp,-40
41f7bc: 8c820008 lw v0,8(a0)
41f7c0: afb0001c sw s0,28(sp)
41f7c4: 00808021 move s0,a0
41f7c8: afbf0024 sw ra,36(sp)
41f7cc: 1462000e bne v1,v0,41f808
41f7d0: afb10020 sw s1,32(sp)
41f7d4: 00a03021 move a2,a1
41f7d8: 00002021 move a0,zero
41f7dc: 0c107afa jal 41ebe8 <.pic.__ashldi3>
41f7e0: 24050001 li a1,1
41f7e4: 3c04005a lui a0,0x5a
41f7e8: 00022827 nor a1,zero,v0
41f7ec: 24846a50 addiu a0,a0,27216
41f7f0: 0c107dac jal 41f6b0
41f7f4: 00038827 nor s1,zero,v1
41f7f8: 3c04005a lui a0,0x5a
41f7fc: 02202821 move a1,s1
41f800: 0c107dac jal 41f6b0
41f804: 24846a54 addiu a0,a0,27220
41f808: 8e030008 lw v1,8(s0)
41f80c: 8e02000c lw v0,12(s0)
41f810: 8fbf0024 lw ra,36(sp)
41f814: 8fb10020 lw s1,32(sp)
41f818: ac430008 sw v1,8(v0)
I am quite sure that the problem is not in the unbin function itself, but somewhere in the contexts, the alignment, or something else. Also, I am mapping this area read-only (RO), so I am also sure that there is no corruption of the user-space regions.
Also, I see a correlation between the size of the image and the faults:
1349892 ./sel4test-tests.bin_23
1349900 ./sel4test-tests.bin_24
The borderline is 1349900: if the size of the image is below this value, there is no problem. Unfortunately, 1349900 is not a 'round' value that is somehow related to TLB sizes or to anything else I know of.
I should also mention that the 'key' variable which changes when I add a new line is an argument of _sel4test_failure:
mips-mti-linux-gnu-objdump -d sel4test-tests.bin_23 > bin23.asm
mips-mti-linux-gnu-objdump -d sel4test-tests.bin_24 > bin24.asm
diff -ua bin23.asm bin24.asm
Disassembly of section .init:
@@ -3143,7 +3143,7 @@
4030f0: 3c050043 lui a1,0x43
4030f4: 2484c248 addiu a0,a0,-15800
4030f8: 24a5c964 addiu a1,a1,-13980
- 4030fc: 2406001f li a2,31
+ 4030fc: 24060020 li a2,32
403100: 0c107b51 jal 41ed44 <_sel4test_failure>
403104: 0000f021 move s8,zero
403108: 8fbf0074 lw ra,116(sp)
@@ -3383,7 +3383,7 @@
4034b0: 2484c950 addiu a0,a0,-14000
4034b4: 24a5c964 addiu a1,a1,-13980
4034b8: 0c107b51 jal 41ed44 <_sel4test_failure>
- 4034bc: 24060020 li a2,32
+ 4034bc: 24060021 li a2,33
4034c0: 03c01021 move v0,s8
4034c4: 8fbf0074 lw ra,116(sp)
4034c8: 8fbe0070 lw s8,112(sp)
That is all I have for now. Any ideas? At the moment I see no other choice but to gradually reduce the amount of code while keeping this 'border' situation, until I am left with a very small set of system functions.
Thank you
--
Vasily A. Sartakov
sartakov@ksyslabs.org