Hi,
I've been hunting a bug for weeks that is very difficult to track down. Finally, I decided to go down to the assembly level to find it. I am not allowed to share or post the code, and I am quite puzzled by the generated assembly. To simplify, the subroutine looks like this:
    subroutine anonymized(this, k)
      implicit none
      class(my_type), intent(inout) :: this
      integer, intent(in) :: k
      real(8) :: aux
      integer :: i1, i2
      aux = this%something
      ...
      do i1 = 1, this%n
        do i2 = 1, this%m
          if (this%value(i1) < 1.0e-10) then
            ...
and the code crashes at the first comparison of this%value(i1). The crash only shows up with certain flags, such as -O2 -heap-arrays 0. If I print the value of this%value(i1) just before it is used, the code runs fine to completion and the bug disappears. Sometimes, when I change code that comes *after* this point, the bug also disappears. It just drives me crazy.
So I had a look at the assembly code. The beginning of this code is given here.
    Dump of assembler code for function __anonymized:
    => 0x0000000000522970 <+0>:    push   %rbp
       0x0000000000522971 <+1>:    mov    %rsp,%rbp
       0x0000000000522974 <+4>:    push   %r12
       0x0000000000522976 <+6>:    push   %r13
       0x0000000000522978 <+8>:    push   %r14
       0x000000000052297a <+10>:   push   %r15
       0x000000000052297c <+12>:   push   %rbx
       0x000000000052297d <+13>:   sub    $0x148,%rsp
       0x0000000000522984 <+20>:   mov    (%rdi),%rbx
       0x0000000000522987 <+23>:   mov    %rsi,-0x80(%rbp)
       0x000000000052298b <+27>:   mov    %rdi,-0x78(%rbp)
       0x000000000052298f <+31>:   mov    0x79c58(%rbx),%rdx
       0x0000000000522996 <+38>:   neg    %rdx
       0x0000000000522999 <+41>:   movslq 0x7a6a8(%rbx),%rcx
       0x00000000005229a0 <+48>:   add    %rcx,%rdx
       0x00000000005229a3 <+51>:   mov    0x79c18(%rbx),%rax
       0x00000000005229aa <+58>:   movsd  0x7a688(%rbx),%xmm0
       0x00000000005229b2 <+66>:   mov    0x79ba0(%rbx),%r8d
       0x00000000005229b9 <+73>:   mov    %rcx,-0x88(%rbp)
       0x00000000005229c0 <+80>:   mulsd  (%rax,%rdx,8),%xmm0
       0x00000000005229c5 <+85>:   mov    %r8d,-0x48(%rbp)
       0x00000000005229c9 <+89>:   mov    0x79bc0(%rbx),%ecx
       0x00000000005229cf <+95>:   test   %r8d,%r8d
       0x00000000005229d2 <+98>:   jle    0x527bcf <__anonymized+21087>
       0x00000000005229d8 <+104>:  mov    %ecx,%r13d
       0x00000000005229db <+107>:  xor    %r12d,%r12d
       0x00000000005229de <+110>:  and    $0xfffffff8,%r13d
       0x00000000005229e2 <+114>:  pxor   %xmm2,%xmm2
       0x00000000005229e6 <+118>:  movslq -0x48(%rbp),%rax
       0x00000000005229ea <+122>:  pxor   %xmm3,%xmm3
       0x00000000005229ee <+126>:  movslq %r13d,%r10
       0x00000000005229f1 <+129>:  movslq %ecx,%rdx
       0x00000000005229f4 <+132>:  movsd  0x13bb9c(%rip),%xmm1        # 0x65e598
       0x00000000005229fc <+140>:  mov    %rax,-0x40(%rbp)
       0x0000000000522a00 <+144>:  mov    %r10,-0x160(%rbp)
       0x0000000000522a07 <+151>:  mov    %r13d,-0x168(%rbp)
       0x0000000000522a0e <+158>:  mov    %ecx,-0x30(%rbp)
       0x0000000000522a11 <+161>:  cmpl   $0x0,-0x30(%rbp)
       0x0000000000522a15 <+165>:  jle    0x522d10 <__anonymized+928>
       0x0000000000522a1b <+171>:  neg    %r11
       0x0000000000522a1e <+174>:  add    %r12,%r11
       0x0000000000522a21 <+177>:  mov    0x79fd8(%rbx),%rdi
       0x0000000000522a28 <+184>:  mov    0x7a018(%rbx),%r8
       0x0000000000522a2f <+191>:  mov    0x7a260(%rbx),%rsi
       0x0000000000522a36 <+198>:  comisd 0x8(%rdi,%r11,8),%xmm1
The code crashes on the comisd instruction. It seems that neither jle is taken (I am a beginner in assembly). On the comisd line, 0x8(%rdi,%r11,8) is obviously accessing the array at index %r11. I have checked %rdi, which contains the right address. What is surprising is that %r11 holds 140737488332700 at the beginning of the function, and the first instruction that touches it is the neg at 0x0000000000522a1b. So it looks to me as if %r11 is never initialized before it is used.
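To double-check my reading of that operand, here is a small Python sketch of the address arithmetic as I understand it. In AT&T syntax, disp(base,index,scale) means base + index*scale + disp. The %r11 value is the one I saw in the debugger, and %r12 is zero because of the xor at <+107>. The %rdi value is a made-up placeholder (RDI_PLACEHOLDER is my own name, not something from the real run), so only the offset is meaningful, not the final address.

```python
MASK64 = (1 << 64) - 1


def to_signed64(value):
    """Interpret the low 64 bits of value as a two's-complement integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def effective_address(base, index, scale, disp):
    """AT&T operand disp(base,index,scale): base + index*scale + disp, mod 2**64."""
    return (base + index * scale + disp) & MASK64


R11_ON_ENTRY = 140737488332700   # value of %r11 seen in the debugger at function entry
R12 = 0                          # xor %r12d,%r12d at <+107>
RDI_PLACEHOLDER = 0x600000       # ASSUMPTION: invented base address, real %rdi not shown

r11 = to_signed64(-R11_ON_ENTRY)  # neg %r11       at <+171>
r11 = to_signed64(r11 + R12)      # add %r12,%r11  at <+174>

# Offset from %rdi used by: comisd 0x8(%rdi,%r11,8),%xmm1
byte_offset = r11 * 8 + 0x8
address = effective_address(RDI_PLACEHOLDER, r11, 8, 0x8)

print("r11 on entry :", hex(R11_ON_ENTRY))
print("byte offset  :", byte_offset)
print("address      :", hex(address))
```

If this arithmetic is right, the operand ends up roughly 2^50 bytes below the start of the array, which would explain the crash. I also notice that the entry value of %r11 is 0x7fffffffa79c in hex, which looks more like a stack address than a loop index or array bound, but that is only a guess on my part.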
What do you think of that?
Best regards