Fortran interface only half-working with gfortran (MSYS2/mingw64)

  • While trying to build a scientific application under MSYS2/mingw64 with gfortran and MS MPI, I noticed odd behavior in tests on logical values. I did not see these problems with Intel Fortran or PGI Fortran.

    MS MPI version used: 9.0.12497.9, but note I also tried with 7.* and 8.* and saw the same issues. I verified this on different machines, just in case.

    The code:

    program example
    
       implicit none
    #include "mpif.h"
    
       integer :: ierr
       logical :: flag
       
       ierr = 42
       flag = .true.
    
       call MPI_INIT(ierr)
       print *, "MPI_INIT", ierr
    
       call MPI_INITIALIZED(flag, ierr )
       print *, "MPI_INITIALIZED", flag, ierr
       if ( .not. flag ) then
            print *, "bad: not inited; ", flag, ierr
       endif
       if ( flag ) then
            print *, "good: inited; ", flag, ierr
       endif
    
    end program

    The output when built with gfortran:

    $ ./example.exe
     MPI_INIT           0
     MPI_INITIALIZED T           0
     bad: not inited;  T           0
     good: inited;  T           0

    So, the boolean is both true and false! Well that can't be...

    This is how I built it within the MSYS2 MinGW 64 bit shell:

    gfortran -o example "-I$MSMPI_INC" "-I$MSMPI_INC/x64" "-L$MSMPI_LIB64" -DINT_PTR_KIND\(\)=8 example.F90 -lmsmpifmc -lmsmpi -fno-range-check -O0 -save-temps -fverbose-asm -masm=intel

    Note that I also tried -lmsmpifec (instead of the mc variant), which made no difference.

    When built with PGI or Intel Fortran, the output is, as expected:

    $ ./example.exe
     MPI_INIT           0
     MPI_INITIALIZED T           0
     good: inited;  T           0

    The command line for PGI is:

    pgfortran -o example example.F90 -Mmpi=msmpi -I"$MSMPI_INC/x64" -Mkeepasm -Manno

    (Note that -I"$MSMPI_INC/x64" shouldn't really be required, but there seems to be a small bug in how -Mmpi=msmpi currently works.)

    I tried to dig into the assembly to see what the difference could be, but I'm a bit lost there. At the end of this post you will find the assembly from each compiler.

    I would really like to get this issue solved together with the MS MPI team, but I need some expert help to understand what exactly the issue is.

    I'm looking forward to your replies! :)

    Maik

    This is gfortran:

    	.file	"example.F90"
    	.intel_syntax noprefix
     # GNU Fortran2008 (Rev1, Built by MSYS2 project) version 8.2.0 (x86_64-w64-mingw32)
     #	compiled by GNU C version 8.2.0, GMP version 6.1.2, MPFR version 4.0.1, MPC version 1.1.0, isl version isl-0.19-GMP
    
     # GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
     # options passed:  ../example.F90 -cpp=example.f90
     # -I C:\Program Files (x86)\Microsoft SDKs\MPI\Include\
     # -I C:\Program Files (x86)\Microsoft SDKs\MPI\Include\/x64
     # -iprefix C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/
     # -D_REENTRANT -D INT_PTR_KIND()=8 ../example.F90 -masm=intel
     # -mtune=generic -march=x86-64 -O0 -fno-range-check -fverbose-asm
     # -fintrinsic-modules-path C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/finclude
     # options enabled:  -faggressive-loop-optimizations
     # -fasynchronous-unwind-tables -fauto-inc-dec -fchkp-check-incomplete-type
     # -fchkp-check-read -fchkp-check-write -fchkp-instrument-calls
     # -fchkp-narrow-bounds -fchkp-optimize -fchkp-store-bounds
     # -fchkp-use-static-bounds -fchkp-use-static-const-bounds
     # -fchkp-use-wrappers -fcommon -fdelete-null-pointer-checks
     # -fdwarf2-cfi-asm -fearly-inlining -feliminate-unused-debug-types
     # -ffp-int-builtin-inexact -ffunction-cse -fgcse-lm -fgnu-runtime
     # -fgnu-unique -fident -finline-atomics -fira-hoist-pressure
     # -fira-share-save-slots -fira-share-spill-slots -fivopts
     # -fkeep-inline-dllexport -fkeep-static-consts -fleading-underscore
     # -flifetime-dse -flto-odr-type-merging -fmerge-debug-strings -fpeephole
     # -fpic -fplt -fprefetch-loop-arrays -freg-struct-return
     # -fsched-critical-path-heuristic -fsched-dep-count-heuristic
     # -fsched-group-heuristic -fsched-interblock -fsched-last-insn-heuristic
     # -fsched-rank-heuristic -fsched-spec -fsched-spec-insn-heuristic
     # -fsched-stalled-insns-dep -fschedule-fusion -fsemantic-interposition
     # -fset-stack-executable -fshow-column -fshrink-wrap-separate
     # -fsigned-zeros -fsplit-ivs-in-unroller -fssa-backprop -fstdarg-opt
     # -fstrict-volatile-bitfields -fsync-libcalls -ftrapping-math
     # -ftree-cselim -ftree-forwprop -ftree-loop-if-convert -ftree-loop-im
     # -ftree-loop-ivcanon -ftree-loop-optimize -ftree-parallelize-loops=
     # -ftree-phiprop -ftree-reassoc -ftree-scev-cprop -funit-at-a-time
     # -funwind-tables -fverbose-asm -fzero-initialized-in-bss
     # -m128bit-long-double -m64 -m80387 -maccumulate-outgoing-args
     # -malign-double -malign-stringops -mavx256-split-unaligned-load
     # -mavx256-split-unaligned-store -mfancy-math-387 -mfentry -mfp-ret-in-387
     # -mfxsr -mieee-fp -mlong-double-80 -mmmx -mms-bitfields -mno-sse4
     # -mpush-args -mred-zone -msse -msse2 -mstack-arg-probe -mstackrealign
     # -mvzeroupper
    
    	.text
    	.section .rdata,"dr"
    .LC0:
    	.ascii "../example.F90\0"
    .LC1:
    	.ascii "MPI_INIT"
    .LC2:
    	.ascii "MPI_INITIALIZED"
    .LC3:
    	.ascii "bad: not inited; "
    .LC4:
    	.ascii "good: inited; "
    	.text
    	.def	MAIN__;	.scl	3;	.type	32;	.endef
    	.seh_proc	MAIN__
    MAIN__:
    	push	rbp	 #
    	.seh_pushreg	rbp
    	sub	rsp, 576	 #,
    	.seh_stackalloc	576
    	lea	rbp, 128[rsp]	 #,
    	.seh_setframe	rbp, 128
    	.seh_endprologue
     # ../example.F90:9:    ierr = 42
    	mov	DWORD PTR 440[rbp], 42	 # ierr,
     # ../example.F90:10:    flag = .true.
    	mov	DWORD PTR 444[rbp], 1	 # flag,
     # ../example.F90:12:    call MPI_INIT(ierr)
    	lea	rax, 440[rbp]	 # tmp90,
    	mov	rcx, rax	 #, tmp90
    	call	mpi_init_	 #
     # ../example.F90:13:    Print *, "MPI_INIT", ierr
    	lea	rax, .LC0[rip]	 # tmp91,
    	mov	QWORD PTR -88[rbp], rax	 # dt_parm.0.common.filename, tmp91
    	mov	DWORD PTR -80[rbp], 13	 # dt_parm.0.common.line,
    	mov	DWORD PTR -96[rbp], 128	 # dt_parm.0.common.flags,
    	mov	DWORD PTR -92[rbp], 6	 # dt_parm.0.common.unit,
    	lea	rax, -96[rbp]	 # tmp92,
    	mov	rcx, rax	 #, tmp92
    	call	_gfortran_st_write	 #
    	lea	rax, -96[rbp]	 # tmp93,
    	mov	r8d, 8	 #,
    	lea	rdx, .LC1[rip]	 #,
    	mov	rcx, rax	 #, tmp93
    	call	_gfortran_transfer_character_write	 #
    	lea	rdx, 440[rbp]	 # tmp94,
    	lea	rax, -96[rbp]	 # tmp95,
    	mov	r8d, 4	 #,
    	mov	rcx, rax	 #, tmp95
    	call	_gfortran_transfer_integer_write	 #
    	lea	rax, -96[rbp]	 # tmp96,
    	mov	rcx, rax	 #, tmp96
    	call	_gfortran_st_write_done	 #
     # ../example.F90:15:    CALL MPI_INITIALIZED(flag, ierr )
    	lea	rdx, 440[rbp]	 # tmp97,
    	lea	rax, 444[rbp]	 # tmp98,
    	mov	rcx, rax	 #, tmp98
    	call	mpi_initialized_	 #
     # ../example.F90:16:    Print *, "MPI_INITIALIZED", flag, ierr
    	lea	rax, .LC0[rip]	 # tmp99,
    	mov	QWORD PTR -88[rbp], rax	 # dt_parm.1.common.filename, tmp99
    	mov	DWORD PTR -80[rbp], 16	 # dt_parm.1.common.line,
    	mov	DWORD PTR -96[rbp], 128	 # dt_parm.1.common.flags,
    	mov	DWORD PTR -92[rbp], 6	 # dt_parm.1.common.unit,
    	lea	rax, -96[rbp]	 # tmp100,
    	mov	rcx, rax	 #, tmp100
    	call	_gfortran_st_write	 #
    	lea	rax, -96[rbp]	 # tmp101,
    	mov	r8d, 15	 #,
    	lea	rdx, .LC2[rip]	 #,
    	mov	rcx, rax	 #, tmp101
    	call	_gfortran_transfer_character_write	 #
    	lea	rdx, 444[rbp]	 # tmp102,
    	lea	rax, -96[rbp]	 # tmp103,
    	mov	r8d, 4	 #,
    	mov	rcx, rax	 #, tmp103
    	call	_gfortran_transfer_logical_write	 #
    	lea	rdx, 440[rbp]	 # tmp104,
    	lea	rax, -96[rbp]	 # tmp105,
    	mov	r8d, 4	 #,
    	mov	rcx, rax	 #, tmp105
    	call	_gfortran_transfer_integer_write	 #
    	lea	rax, -96[rbp]	 # tmp106,
    	mov	rcx, rax	 #, tmp106
    	call	_gfortran_st_write_done	 #
     # ../example.F90:17:    IF ( .not. flag ) THEN
    	mov	eax, DWORD PTR 444[rbp]	 # flag.5_1, flag
    	xor	eax, 1	 # _2,
    	test	eax, eax	 # _2
    	je	.L2	 #,
     # ../example.F90:18:         Print *, "bad: not inited; ", flag, ierr
    	lea	rax, .LC0[rip]	 # tmp107,
    	mov	QWORD PTR -88[rbp], rax	 # dt_parm.2.common.filename, tmp107
    	mov	DWORD PTR -80[rbp], 18	 # dt_parm.2.common.line,
    	mov	DWORD PTR -96[rbp], 128	 # dt_parm.2.common.flags,
    	mov	DWORD PTR -92[rbp], 6	 # dt_parm.2.common.unit,
    	lea	rax, -96[rbp]	 # tmp108,
    	mov	rcx, rax	 #, tmp108
    	call	_gfortran_st_write	 #
    	lea	rax, -96[rbp]	 # tmp109,
    	mov	r8d, 17	 #,
    	lea	rdx, .LC3[rip]	 #,
    	mov	rcx, rax	 #, tmp109
    	call	_gfortran_transfer_character_write	 #
    	lea	rdx, 444[rbp]	 # tmp110,
    	lea	rax, -96[rbp]	 # tmp111,
    	mov	r8d, 4	 #,
    	mov	rcx, rax	 #, tmp111
    	call	_gfortran_transfer_logical_write	 #
    	lea	rdx, 440[rbp]	 # tmp112,
    	lea	rax, -96[rbp]	 # tmp113,
    	mov	r8d, 4	 #,
    	mov	rcx, rax	 #, tmp113
    	call	_gfortran_transfer_integer_write	 #
    	lea	rax, -96[rbp]	 # tmp114,
    	mov	rcx, rax	 #, tmp114
    	call	_gfortran_st_write_done	 #
    .L2:
     # ../example.F90:20:    IF ( flag ) THEN
    	mov	eax, DWORD PTR 444[rbp]	 # flag.6_3, flag
    	test	eax, eax	 # flag.6_3
    	je	.L4	 #,
     # ../example.F90:21:         Print *, "good: inited; ", flag, ierr
    	lea	rax, .LC0[rip]	 # tmp115,
    	mov	QWORD PTR -88[rbp], rax	 # dt_parm.3.common.filename, tmp115
    	mov	DWORD PTR -80[rbp], 21	 # dt_parm.3.common.line,
    	mov	DWORD PTR -96[rbp], 128	 # dt_parm.3.common.flags,
    	mov	DWORD PTR -92[rbp], 6	 # dt_parm.3.common.unit,
    	lea	rax, -96[rbp]	 # tmp116,
    	mov	rcx, rax	 #, tmp116
    	call	_gfortran_st_write	 #
    	lea	rax, -96[rbp]	 # tmp117,
    	mov	r8d, 14	 #,
    	lea	rdx, .LC4[rip]	 #,
    	mov	rcx, rax	 #, tmp117
    	call	_gfortran_transfer_character_write	 #
    	lea	rdx, 444[rbp]	 # tmp118,
    	lea	rax, -96[rbp]	 # tmp119,
    	mov	r8d, 4	 #,
    	mov	rcx, rax	 #, tmp119
    	call	_gfortran_transfer_logical_write	 #
    	lea	rdx, 440[rbp]	 # tmp120,
    	lea	rax, -96[rbp]	 # tmp121,
    	mov	r8d, 4	 #,
    	mov	rcx, rax	 #, tmp121
    	call	_gfortran_transfer_integer_write	 #
    	lea	rax, -96[rbp]	 # tmp122,
    	mov	rcx, rax	 #, tmp122
    	call	_gfortran_st_write_done	 #
    .L4:
     # ../example.F90:24: end program
    	nop	
    	add	rsp, 576	 #,
    	pop	rbp	 #
    	ret	
    	.seh_endproc
    	.def	__main;	.scl	2;	.type	32;	.endef
    	.globl	main
    	.def	main;	.scl	2;	.type	32;	.endef
    	.seh_proc	main
    main:
    	push	rbp	 #
    	.seh_pushreg	rbp
    	mov	rbp, rsp	 #,
    	.seh_setframe	rbp, 0
    	sub	rsp, 32	 #,
    	.seh_stackalloc	32
    	.seh_endprologue
    	mov	DWORD PTR 16[rbp], ecx	 # argc, argc
    	mov	QWORD PTR 24[rbp], rdx	 # argv, argv
     # ../example.F90:24: end program
    	call	__main	 #
    	mov	rax, QWORD PTR 24[rbp]	 # tmp89, argv
    	mov	rdx, rax	 #, tmp89
    	mov	ecx, DWORD PTR 16[rbp]	 #, argc
    	call	_gfortran_set_args	 #
    	lea	rdx, options.4.4034[rip]	 #,
    	mov	ecx, 7	 #,
    	call	_gfortran_set_options	 #
    	call	MAIN__	 #
    	mov	eax, 0	 # _7,
    	add	rsp, 32	 #,
    	pop	rbp	 #
    	ret	
    	.seh_endproc
    	.comm	mpifcmb5_, 4, 4
    	.comm	mpifcmb9_, 4, 4
    	.comm	mpipriv1_, 28, 4
    	.comm	mpipriv2_, 24, 4
    	.comm	mpiprivc_, 2, 4
    	.section .rdata,"dr"
    	.align 16
    options.4.4034:
    	.long	68
    	.long	8191
    	.long	0
    	.long	1
    	.long	1
    	.long	0
    	.long	31
    	.ident	"GCC: (Rev1, Built by MSYS2 project) 8.2.0"
    	.def	mpi_init_;	.scl	2;	.type	32;	.endef
    	.def	_gfortran_st_write;	.scl	2;	.type	32;	.endef
    	.def	_gfortran_transfer_character_write;	.scl	2;	.type	32;	.endef
    	.def	_gfortran_transfer_integer_write;	.scl	2;	.type	32;	.endef
    	.def	_gfortran_st_write_done;	.scl	2;	.type	32;	.endef
    	.def	mpi_initialized_;	.scl	2;	.type	32;	.endef
    	.def	_gfortran_transfer_logical_write;	.scl	2;	.type	32;	.endef
    	.def	_gfortran_set_args;	.scl	2;	.type	32;	.endef
    	.def	_gfortran_set_options;	.scl	2;	.type	32;	.endef
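
    One thing I notice in the listing above, without being sure it is the cause: gfortran stores .true. as 1 (mov DWORD PTR 444[rbp], 1), compiles the .not. flag test as xor eax, 1 followed by a zero test, and compiles the plain flag test as a zero test only. If mpi_initialized_ wrote anything other than 0 or 1 into flag, both IF blocks would run. I do not know which value MS MPI actually writes; the -1 and 2 below are assumptions for illustration only. Here is a small model of the two tests, written in Python just because the integer arithmetic is easy to check (the function names are mine):

```python
def to_i32(value):
    """Wrap a Python int to a signed 32-bit value, like the DWORD that holds flag."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def gfortran_if_flag(raw):
    """Model of 'test eax, eax / je .L4': the IF body runs when flag is non-zero."""
    return to_i32(raw) != 0


def gfortran_if_not_flag(raw):
    """Model of 'xor eax, 1 / test eax, eax / je .L2': runs when (flag xor 1) is non-zero."""
    return to_i32(to_i32(raw) ^ 1) != 0


# 0 and 1 are the values gfortran itself stores for .false. and .true.;
# -1 and 2 are hypothetical values that a foreign library might write.
for raw in (0, 1, -1, 2):
    print(raw,
          "IF(flag):", gfortran_if_flag(raw),
          "IF(.not. flag):", gfortran_if_not_flag(raw))
```

    In this model only 0 and 1 select exactly one of the two branches. Any other value selects both, which is the pattern in the gfortran output shown earlier.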

    This is PGI:

    	.file	"example.F90"
    ## PGF90 18.4 -opt 1 -norecursive
    ## PGF90 08/05/2018  14:17:53
    ## pgfortran ../example.F90 -o example -Mmpi=msmpi -IC:\Program Files (x86)\Microsoft SDKs\MPI\Include\/x64 -Mkeepasm -Manno
    ## C:\PROGRA~1\PGICE/win64/18.4/bin\pgf902.exe
    ## pgfortran2a9S0c3fz4x_KD.ilm -fn ../example.F90 -opt 1 -terse 1 -inform warn
    ## -x 51 0x20 -x 120 0x80000000 -x 59 4 -x 19 0x400000 -x 28 0x40000 -x 119 0x4a10400
    ## -x 122 0x40 -x 123 0x1000 -x 127 0x15 -x 129 0x10 -quad -y 80 0x1000 -x 80 0x10800000
    ## -tp haswell -x 70 0x8000 -x 122 1 -x 125 0x20000 -x 120 0x1000 -x 124 0x400
    ## -x 119 0x400000 -x 120 0x80 -y 15 2 -x 57 0x3b0000 -x 58 0x48000000 -x 15 2
    ## -x 49 0x100 -astype 0 -x 70 0x40000000 -x 124 1 -y 163 0xc0000000 -x 189 0x10
    ## -y 189 0x4000000 -anno -cmdline +pgfortran ../example.F90 -o example -Mmpi=msmpi -IC:\Program Files (x86)\Microsoft SDKs\MPI\Include\/x64 -Mkeepasm -Manno
    ## -asm example.s
    	.section	.dbginfo
    ..D1b:
    	.4byte	..D1e-..D1b-4
    	.2byte	0x2
    	.4byte	..labbrv.b
    	.byte	0x8
    	.byte	0x1
    	.string	"../example.F90"
    	.string	"pgifortran"
    	.string	"PGF90 18.4-0"
    	.byte	0x8
    	.quad	..text.b
    	.quad	..text.e
    	.4byte	..line.b
    	.section	.dbgabbr
    ..labbrv.b:
    	.byte	0x1
    	.byte	0x11
    	.byte	0x1
    	.byte	0x3
    	.byte	0x8
    	.byte	0x1b
    	.byte	0x8
    	.byte	0x25
    	.byte	0x8
    	.byte	0x13
    	.byte	0xb
    	.byte	0x11
    	.byte	0x1
    	.byte	0x12
    	.byte	0x1
    	.byte	0x10
    	.byte	0x6
    	.byte	0,0
    	.text
    ..text.b:
    	.section	.bss
    ..bss.b:
    	.data
    ..data.b:
    ##  (example,../example.F90)
    	.text
    	.align	16
    ..Lfb1:
    	.globl	MAIN_
    MAIN_:
    ## PGI Target haswell-64
    ..Dcfb0:
    	pushq	%rbp
    ..Dcfi0:
    	subq	$16, %rsp
    ..Dcfi1:
    	leaq	128(%rsp), %rbp
    ..Dcfi2:
    	movq	%rbx, -128(%rbp)
    ..Dcfi3:
    	pushq	%rax
    	pushq	%rax
    	stmxcsr	(%rsp)
    	popq	%rax
    	orq	$64, %rax
    	pushq	%rax
    	ldmxcsr	(%rsp)
    	popq	%rax
    	popq	%rax
    ##  lineno: 1
    
    ## program example
    
    ## 
    
    ##    implicit none
    
    ## #include "mpif.h"
    
    ## 
    
    ##    integer :: ierr
    
    ##    logical :: flag
    
    ##    
    
    	subq	$32, %rsp
    	leaq	.C1_283(%RIP), %rcx
    	call	pghpf_init
    	addq	$32, %rsp
    ..EN1_297:
    ##  lineno: 9
    ..LN1:
    
    ##    ierr = 42
    
    ##    flag = .true.
    
    ## 
    
    ##    call MPI_INIT(ierr)
    
    ##    Print *, "MPI_INIT", ierr
    
    ## 
    
    ##    CALL MPI_INITIALIZED(flag, ierr )
    
    ##    Print *, "MPI_INITIALIZED", flag, ierr
    
    ##    IF ( .not. flag ) THEN
    
    ##         Print *, "bad: not inited; ", flag, ierr
    
    	movl	$42, -116(%rbp)
    	movl	$-1, -120(%rbp)
    	subq	$32, %rsp
    	leaq	-116(%rbp), %rcx
    	call	mpi_init_
    	addq	$32, %rsp
    	subq	$32, %rsp
    	leaq	.C1_357(%RIP), %rcx
    	leaq	.C1_576(%RIP), %rdx
    	movl	$14, %r8d
    	call	pgf90io_src_info03
    	addq	$32, %rsp
    	subq	$32, %rsp
    	leaq	.C1_305(%RIP), %rcx
    	leaq	.C1_283(%RIP), %r8
    	xorl	%edx, %edx
    	movq	%r8, %r9
    	call	pgf90io_print_init
    	addq	$32, %rsp
    	movl	%eax, %ebx
    	subq	$32, %rsp
    	leaq	.C1_580(%RIP), %rcx
    	movl	$14, %edx
    	movl	$8, %r8d
    	call	pgf90io_sc_ch_ldw
    	addq	$32, %rsp
    	movl	%eax, %ebx
    	subq	$32, %rsp
    	movl	-116(%rbp), %ecx
    	movl	$25, %edx
    	call	pgf90io_sc_i_ldw
    	addq	$32, %rsp
    	movl	%eax, %ebx
    	subq	$32, %rsp
    	call	pgf90io_ldw_end
    	addq	$32, %rsp
    	movl	%eax, %ebx
    	subq	$32, %rsp
    	leaq	-120(%rbp), %rcx
    	leaq	-116(%rbp), %rdx
    	call	mpi_initialized_
    	addq	$32, %rsp
    	subq	$32, %rsp
    	leaq	.C1_368(%RIP), %rcx
    	leaq	.C1_576(%RIP), %rdx
    	movl	$14, %r8d
    	call	pgf90io_src_info03
    	addq	$32, %rsp
    	subq	$32, %rsp
    	leaq	.C1_305(%RIP), %rcx
    	leaq	.C1_283(%RIP), %r8
    	xorl	%edx, %edx
    	movq	%r8, %r9
    	call	pgf90io_print_init
    	addq	$32, %rsp
    	movl	%eax, %ebx
    	subq	$32, %rsp
    	leaq	.C1_586(%RIP), %rcx
    	movl	$14, %edx
    	movl	$15, %r8d
    	call	pgf90io_sc_ch_ldw
    	addq	$32, %rsp
    	movl	%eax, %ebx
    	subq	$32, %rsp
    	movl	-120(%rbp), %ecx
    	movl	$19, %edx
    	call	pgf90io_sc_i_ldw
    	addq	$32, %rsp
    	movl	%eax, %ebx
    	subq	$32, %rsp
    	movl	-116(%rbp), %ecx
    	movl	$25, %edx
    	call	pgf90io_sc_i_ldw
    	addq	$32, %rsp
    	movl	%eax, %ebx
    	subq	$32, %rsp
    	call	pgf90io_ldw_end
    	addq	$32, %rsp
    	movl	%eax, %ebx
    	testl	$1, -120(%rbp)
    	jne	.LB1_593
    	subq	$32, %rsp
    	leaq	.C1_339(%RIP), %rcx
    	leaq	.C1_576(%RIP), %rdx
    	movl	$14, %r8d
    	call	pgf90io_src_info03
    	addq	$32, %rsp
    	subq	$32, %rsp
    	leaq	.C1_305(%RIP), %rcx
    	leaq	.C1_283(%RIP), %r8
    	xorl	%edx, %edx
    	movq	%r8, %r9
    	call	pgf90io_print_init
    	addq	$32, %rsp
    	movl	%eax, %ebx
    	subq	$32, %rsp
    	leaq	.C1_587(%RIP), %rcx
    	movl	$14, %edx
    	movl	$17, %r8d
    	call	pgf90io_sc_ch_ldw
    	addq	$32, %rsp
    	movl	%eax, %ebx
    	subq	$32, %rsp
    	movl	-120(%rbp), %ecx
    	movl	$19, %edx
    	call	pgf90io_sc_i_ldw
    	addq	$32, %rsp
    	movl	%eax, %ebx
    	subq	$32, %rsp
    	movl	-116(%rbp), %ecx
    	movl	$25, %edx
    	call	pgf90io_sc_i_ldw
    	addq	$32, %rsp
    	movl	%eax, %ebx
    	subq	$32, %rsp
    	call	pgf90io_ldw_end
    	addq	$32, %rsp
    	movl	%eax, %ebx
    	.p2align	4,,3
    .LB1_593:
    ##  lineno: 19
    ..LN2:
    
    ##    ENDIF
    
    ##    IF ( flag ) THEN
    
    ##         Print *, "good: inited; ", flag, ierr
    
    	testl	$1, -120(%rbp)
    	je	.LB1_594
    	subq	$32, %rsp
    	leaq	.C1_361(%RIP), %rcx
    	leaq	.C1_576(%RIP), %rdx
    	movl	$14, %r8d
    	call	pgf90io_src_info03
    	addq	$32, %rsp
    	subq	$32, %rsp
    	leaq	.C1_305(%RIP), %rcx
    	leaq	.C1_283(%RIP), %r8
    	xorl	%edx, %edx
    	movq	%r8, %r9
    	call	pgf90io_print_init
    	addq	$32, %rsp
    	movl	%eax, %ebx
    	subq	$32, %rsp
    	leaq	.C1_588(%RIP), %rcx
    	movl	$14, %edx
    	movl	$14, %r8d
    	call	pgf90io_sc_ch_ldw
    	addq	$32, %rsp
    	movl	%eax, %ebx
    	subq	$32, %rsp
    	movl	-120(%rbp), %ecx
    	movl	$19, %edx
    	call	pgf90io_sc_i_ldw
    	addq	$32, %rsp
    	movl	%eax, %ebx
    	subq	$32, %rsp
    	movl	-116(%rbp), %ecx
    	movl	$25, %edx
    	call	pgf90io_sc_i_ldw
    	addq	$32, %rsp
    	movl	%eax, %ebx
    	subq	$32, %rsp
    	call	pgf90io_ldw_end
    	addq	$32, %rsp
    	.p2align	4,,3
    .LB1_594:
    ##  lineno: 22
    ..LN3:
    
    ##    ENDIF
    
    	movq	-128(%rbp), %rbx
    	leaq	-112(%rbp), %rsp
    	popq	%rbp
    	ret
    	.type	MAIN_,@function
    	.size	MAIN_,.-MAIN_
    	.text
    ..Dcfe0:
    ..Lfe1:
    	.section	.dbginfo
    	.byte	0x2
    	.byte	0x1
    	.string	"example"
    	.byte	0x1
    	.byte	0x1
    	.quad	..Lfb1
    	.quad	..Lfe1
    	.byte	0x2
    	.byte	0x1
    	.byte	0x56
    	.section	.dbgabbr
    	.byte	0x2
    	.byte	0x2e
    	.byte	0x1
    	.byte	0x3f
    	.byte	0xc
    	.byte	0x3
    	.byte	0x8
    	.byte	0x3a
    	.byte	0xb
    	.byte	0x3b
    	.byte	0xb
    	.byte	0x11
    	.byte	0x1
    	.byte	0x12
    	.byte	0x1
    	.byte	0x36
    	.byte	0xb
    	.byte	0x40
    	.byte	0xa
    	.byte	0,0
    __MAIN_END:
    	.section	.pdata
    	.align	4
    $pdata$MAIN_:
    	.4byte	MAIN_ # function start
    	.4byte	__MAIN_END # function end
    	.4byte	$unwind$MAIN_
    	.section	.xdata
    	.align	4
    $unwind$MAIN_:
    	.byte	0x01         # version 1, flag:0
    	.byte	..Dcfi3 - MAIN_  # size of prolog
    	.byte	0x5          # count of (2 byte) unwind nodes
    	.byte	0x85         # (128/16) sets rbp:5
    	    	             # start unwind code array
    	.byte	..Dcfi3-..Dcfb0	     # entry 4: offset
    	.byte	0x34	     # (%rbx)SAVE_NONVOL
    	.2byte	0x10         # to: offset(128)/8
    	.byte	..Dcfi2-..Dcfb0	     # entry 3: offset
    	.byte	0x83	     #  %rbp[rsp+128] SET_FPREG
    	.byte	..Dcfi1-..Dcfb0	     # entry 2: offset
    	.byte	0x12	     # (16/8 -8)ALLOC_SMALL
    	.byte	..Dcfi0-..Dcfb0	     # entry 1: offset
    	.byte	0x50	     # (%rbp)PUSH_NONVOL
    	.byte	0x00, 0x00   # Even Fill
    	.data
    	.align	1
    .C1_580:
    ## "MPI_INIT"
    	.byte	0x4d,0x50,0x49,0x5f,0x49,0x4e,0x49,0x54
    .C1_586:
    ## "MPI_INITIALIZED"
    	.byte	0x4d,0x50,0x49,0x5f,0x49,0x4e,0x49,0x54,0x49,0x41,0x4c
    	.byte	0x49,0x5a,0x45,0x44
    	.align	16
    .C1_587:
    ## "bad: not inited; "
    	.byte	0x62,0x61,0x64,0x3a,0x20,0x6e,0x6f,0x74,0x20,0x69,0x6e
    	.byte	0x69,0x74,0x65,0x64,0x3b,0x20
    	.byte	0x20,0x20,0x20,0x20,0x20,0x20,0x20,0x20,0x20,0x20,0x20
    	.byte	0x20,0x20,0x20,0x20
    	.align	1
    .C1_576:
    ## "../example.F90"
    	.byte	0x2e,0x2e,0x2f,0x65,0x78,0x61,0x6d,0x70,0x6c,0x65,0x2e
    	.byte	0x46,0x39,0x30
    .C1_588:
    ## "good: inited; "
    	.byte	0x67,0x6f,0x6f,0x64,0x3a,0x20,0x69,0x6e,0x69,0x74,0x65
    	.byte	0x64,0x3b,0x20
    	.align	4
    .C1_357:
    	.long	13
    .C1_368:
    	.long	16
    .C1_339:
    	.long	18
    .C1_361:
    	.long	21
    .C1_305:
    	.long	6
    .C1_283:
    	.long	0
    	.data
    ## COMMON BLOCKS
    ## STATIC VARIABLES
    	.data
    ## mpi_argvs_null	mpiprivc_+0
    ## mpi_argv_null	mpiprivc_+1
    ## mpi_weights_empty	mpifcmb9_+0
    ## mpi_unweighted	mpifcmb5_+0
    ## mpi_statuses_ignore	mpipriv2_+0
    ## mpi_errcodes_ignore	mpipriv2_+20
    ## mpi_bottom	mpipriv1_+0
    ## mpi_in_place	mpipriv1_+4
    ## mpi_status_ignore	mpipriv1_+8
    	.globl	mpi_init_
    	.globl	pgf90io_src_info03
    	.globl	pgf90io_print_init
    	.globl	pgf90io_sc_ch_ldw
    	.globl	pgf90io_sc_i_ldw
    	.globl	pgf90io_ldw_end
    	.globl	mpi_initialized_
    	.globl	pghpf_init
    	.section	.dbginfo
    	.byte	0x0
    	.text
    ..text.e:
    	.section	.bss
    ..bss.e:
    	.data
    ..data.e:
    	.section	.dbginfo
    	.byte	0x0
    	.section	.dbgabbr
    	.section	.dbginfo
    ..D1e:
    	.section	.dbgabbr
    	.byte	0
    	.section	.dbgline
    ..line.b:
    	.4byte	..line.e-..line.b-4
    	.2byte	0x2
    	.4byte	..linp.e-..linp.b
    ..linp.b:
    	.byte	0x1
    	.byte	0x1
    	.byte	0xf6
    	.byte	0xf5
    	.byte	0xa
    	.byte	0x0,0x1,0x1,0x1,0x1,0x0,0x0,0x0,0x1
    	.string	".."
    	.byte	0x0
    	.string	"example.F90"
    	.byte	0x1
    	.byte	0x0
    	.byte	0x0
    	.byte	0x0
    ..linp.e:
    	.byte	0x0
    	.byte	0x9
    	.byte	0x2
    	.quad	..text.b
    	.byte	0x0
    	.byte	0x9
    	.byte	0x2
    	.quad	..LN1
    	.byte	0x4
    	.byte	0x1
    	.byte	0x1c
    	.byte	0x0
    	.byte	0x9
    	.byte	0x2
    	.quad	..LN2
    	.byte	0x4
    	.byte	0x1
    	.byte	0x1e
    	.byte	0x0
    	.byte	0x9
    	.byte	0x2
    	.quad	..LN3
    	.byte	0x4
    	.byte	0x1
    	.byte	0x17
    	.byte	0x0
    	.byte	0x9
    	.byte	0x2
    	.quad	..text.e
    	.byte	0x0
    	.byte	0x1
    	.byte	0x1
    ..line.e:
    	.section	.dbgfram
    ..Dcieb0:
    	.4byte	..Dciee0-..Dcieb0-4
    	.4byte	0xffffffff
    	.byte	0x1
    	.byte	0x0
    	.byte	0x1
    	.byte	0x78
    	.byte	0x10
    	.byte	0xc
    	.byte	0x7
    	.byte	0x8
    	.byte	0x90
    	.byte	0x1
    	.align	8
    ..Dciee0:
    	.4byte	..Dfdee0-..Dfdeb0
    ..Dfdeb0:
    	.4byte	..Dcieb0
    	.quad	..Dcfb0
    	.quad	..Dcfe0-..Dcfb0
    	.byte	0x4
    	.4byte	..Dcfi0-..Dcfb0
    	.byte	0xe
    	.byte	0x10
    	.byte	0x86
    	.byte	0x2
    	.byte	0x4
    	.4byte	..Dcfi1-..Dcfi0
    	.byte	0xe
    	.byte	0x20
    	.byte	0x4
    	.4byte	..Dcfi2-..Dcfi1
    	.byte	0x12
    	.byte	0x6
    	.byte	0xc
    	.align	8
    ..Dfdee0:
    	.data
    	.align	8
    	.globl	f90_compiled
    	.quad	f90_compiled
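
    Again only my reading of the listing above: PGI stores .true. as -1 (movl $-1, -120(%rbp)), and both IF tests look only at the lowest bit (testl $1, -120(%rbp)). The .not. flag body is skipped with jne and the flag body is skipped with je, so the two conditions are exact opposites whatever value the MPI library writes into flag. Here is a small Python model of that scheme (the function names are mine, and the sample values are assumptions, not values I observed from MS MPI):

```python
def pgi_if_flag(raw):
    """Model of 'testl $1, flag / je .LB1_594': body runs when bit 0 is set."""
    return (raw & 1) != 0


def pgi_if_not_flag(raw):
    """Model of 'testl $1, flag / jne .LB1_593': body runs when bit 0 is clear."""
    return (raw & 1) == 0


# -1 is what PGI stores for .true.; the other values are hypothetical inputs.
for raw in (0, 1, -1, 2):
    print(raw,
          "IF(flag):", pgi_if_flag(raw),
          "IF(.not. flag):", pgi_if_not_flag(raw))
```

    With a low-bit test exactly one branch runs for every input, though an even non-zero value such as 2 would count as false.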

    This is Intel Fortran:

    ; mark_description "Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 17.0.7.272 Bui";
    ; mark_description "ld 20180408";
    ; mark_description "-nologo -Od -fpp -IC:\\Program Files (x86)\\Microsoft SDKs\\MPI\\Include\\ -IC:\\Program Files (x86)\\Micros";
    ; mark_description "oft SDKs\\MPI\\Include\\\\x64 -warn:interfaces -module:x64\\Debug\\ -object:x64\\Debug\\ -Fdx64\\Debug\\vc14";
    ; mark_description "0.pdb -FAs -Fax64\\Debug\\ -traceback -check:bounds -check:stack -libs:dll -threads -dbglibs -c -Qlocation,l";
    ; mark_description "ink,C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\\\bin\\amd64 -Qm64";
    	OPTION DOTNAME
    _TEXT	SEGMENT      'CODE'
    TXTST0:
    ; -- Begin  MAIN__
    _TEXT	ENDS
    _RDATA	SEGMENT     READ  'DATA'
    	ALIGN 004H
    __rtc_frame_desc_2.0	DD	14
    	DD 1 DUP (0H)	; pad
    	DQ	__rtc_frame_desc_2.0 + 0000000000000020H
    	DD	0
    	DD 1 DUP (0H)	; pad
    	DQ	__rtc_frame_desc_2.0 + 0000000000000100H
    	DD	32
    	DD	48
    	DQ	__rtc_var_name.1.0
    	DD	8
    	DD	4
    	DQ	__rtc_var_name.1.1
    	DD	20
    	DD	4
    	DQ	__rtc_var_name.1.2
    	DD	88
    	DD	16
    	DQ	__rtc_var_name.1.3
    	DD	112
    	DD	8
    	DQ	__rtc_var_name.1.4
    	DD	128
    	DD	16
    	DQ	__rtc_var_name.1.5
    	DD	152
    	DD	8
    	DQ	__rtc_var_name.1.6
    	DD	168
    	DD	8
    	DQ	__rtc_var_name.1.7
    	DD	184
    	DD	16
    	DQ	__rtc_var_name.1.8
    	DD	208
    	DD	8
    	DQ	__rtc_var_name.1.9
    	DD	224
    	DD	8
    	DQ	__rtc_var_name.1.10
    	DD	240
    	DD	16
    	DQ	__rtc_var_name.1.11
    	DD	264
    	DD	8
    	DQ	__rtc_var_name.1.12
    	DD	280
    	DD	8
    	DQ	__rtc_var_name.1.13
    __STRLITPACK_3	DD	1598640205
    	DD	1414090313
    	DB	0
    	DB 3 DUP ( 0H)	; pad
    __STRLITPACK_2	DD	1598640205
    	DD	1414090313
    	DD	1229734217
    	DD	4474202
    __STRLITPACK_1	DD	979657058
    	DD	1953459744
    	DD	1768843552
    	DD	996435316
    	DW	32
    	DB 2 DUP ( 0H)	; pad
    __STRLITPACK_0	DD	1685024615
    	DD	1852383290
    	DD	1684370537
    	DW	8251
    	DB	0
    __rtc_var_name.1.0	DD	1597396014
    	DB	0
    __rtc_var_name.1.1	DD	1296128069
    	DD	608521296
    	DD	1195461702
    	DB	0
    __rtc_var_name.1.2	DD	1296128069
    	DD	608521296
    	DD	1381123401
    	DB	0
    __rtc_var_name.1.3	DD	1111970369
    	DD	1262702412
    	DW	12383
    	DB	0
    __rtc_var_name.1.4	DD	1111970369
    	DD	1262702412
    	DW	12639
    	DB	0
    __rtc_var_name.1.5	DD	1111970369
    	DD	1262702412
    	DW	12895
    	DB	0
    __rtc_var_name.1.6	DD	1111970369
    	DD	1262702412
    	DW	13151
    	DB	0
    __rtc_var_name.1.7	DD	1111970369
    	DD	1262702412
    	DW	13407
    	DB	0
    __rtc_var_name.1.8	DD	1111970369
    	DD	1262702412
    	DW	13663
    	DB	0
    __rtc_var_name.1.9	DD	1111970369
    	DD	1262702412
    	DW	13919
    	DB	0
    __rtc_var_name.1.10	DD	1111970369
    	DD	1262702412
    	DW	14175
    	DB	0
    __rtc_var_name.1.11	DD	1111970369
    	DD	1262702412
    	DW	14431
    	DB	0
    __rtc_var_name.1.12	DD	1111970369
    	DD	1262702412
    	DW	14687
    	DB	0
    __rtc_var_name.1.13	DD	1111970369
    	DD	1262702412
    	DD	3158367
    _RDATA	ENDS
    _TEXT	SEGMENT      'CODE'
    ; mark_begin;
    
    	PUBLIC MAIN__
    ; --- EXAMPLE
    MAIN__	PROC 
    .B1.1::                         ; Preds .B1.0
                                    ; Execution count [0.00e+000]
    
    ;;; program example
    
    L1::
                                                               ;1.9
            push      rbp                                           ;1.9
            sub       rsp, 464                                      ;1.9
            lea       rbp, QWORD PTR [48+rsp]                       ;1.9
            mov       QWORD PTR [rsp], rax                          ;1.9
            mov       rax, 460                                      ;1.9
    L2:                                                             ;
            mov       DWORD PTR [rsp+rax], -858993460               ;1.9
            sub       rax, 4                                        ;1.9
            cmp       rax, 4                                        ;1.9
            jg        L2            ; Prob 50%                      ;1.9
            mov       rax, QWORD PTR [rsp]                          ;1.9
            mov       DWORD PTR [rsp], -858993460                   ;1.9
            mov       DWORD PTR [4+rsp], -858993460                 ;1.9
            mov       QWORD PTR [400+rbp], rsi                      ;1.9[spill]
            mov       QWORD PTR [392+rbp], rbx                      ;1.9[spill]
            lea       rax, QWORD PTR [__NLITPACK_0]                 ;1.9
            mov       rcx, rax                                      ;1.9
            call      for_set_reentrancy                            ;1.9
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.20::                        ; Preds .B1.1
                                    ; Execution count [0.00e+000]
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.2::                         ; Preds .B1.20
                                    ; Execution count [0.00e+000]
    
    ;;; 
    ;;;    implicit none
    ;;; #include "mpif.h"
    ;;; 
    ;;;    integer :: ierr
    ;;;    logical :: flag
    ;;;    
    ;;;    ierr = 42
    
            mov       DWORD PTR [20+rbp], 42                        ;9.4
    
    ;;;    flag = .true.
    
            mov       DWORD PTR [8+rbp], -1                         ;10.4
    
    ;;; 
    ;;;    call MPI_INIT(ierr)
    
            lea       rax, QWORD PTR [20+rbp]                       ;12.9
            mov       rcx, rax                                      ;12.9
            call      MPI_INIT                                      ;12.9
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.21::                        ; Preds .B1.2
                                    ; Execution count [0.00e+000]
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.3::                         ; Preds .B1.21
                                    ; Execution count [0.00e+000]
    
    ;;;    Print *, "MPI_INIT", ierr
    
            mov       QWORD PTR [32+rbp], 0                         ;13.4
            mov       QWORD PTR [88+rbp], 8                         ;13.4
            lea       rax, QWORD PTR [__STRLITPACK_3]               ;13.4
            mov       QWORD PTR [96+rbp], rax                       ;13.4
            lea       rax, QWORD PTR [32+rbp]                       ;13.4
            mov       edx, -1                                       ;13.4
            mov       rcx, 01208384ff00H                            ;13.4
            lea       rbx, QWORD PTR [__STRLITPACK_5]               ;13.4
            lea       rsi, QWORD PTR [88+rbp]                       ;13.4
            mov       QWORD PTR [32+rsp], rsi                       ;13.4
            mov       QWORD PTR [304+rbp], rcx                      ;13.4[spill]
            mov       rcx, rax                                      ;13.4
            mov       rax, QWORD PTR [304+rbp]                      ;13.4[spill]
            mov       r8, rax                                       ;13.4
            mov       r9, rbx                                       ;13.4
            call      for_write_seq_lis                             ;13.4
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.22::                        ; Preds .B1.3
                                    ; Execution count [0.00e+000]
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.4::                         ; Preds .B1.22
                                    ; Execution count [0.00e+000]
            mov       eax, DWORD PTR [20+rbp]                       ;13.4
            mov       DWORD PTR [112+rbp], eax                      ;13.4
            lea       rax, QWORD PTR [32+rbp]                       ;13.4
            lea       rdx, QWORD PTR [__STRLITPACK_6]               ;13.4
            lea       rcx, QWORD PTR [112+rbp]                      ;13.4
            mov       QWORD PTR [312+rbp], rcx                      ;13.4[spill]
            mov       rcx, rax                                      ;13.4
            mov       rax, QWORD PTR [312+rbp]                      ;13.4[spill]
            mov       r8, rax                                       ;13.4
            call      for_write_seq_lis_xmit                        ;13.4
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.23::                        ; Preds .B1.4
                                    ; Execution count [0.00e+000]
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.5::                         ; Preds .B1.23
                                    ; Execution count [0.00e+000]
    
    ;;; 
    ;;;    CALL MPI_INITIALIZED(flag, ierr )
    
            lea       rax, QWORD PTR [8+rbp]                        ;15.9
            lea       rdx, QWORD PTR [20+rbp]                       ;15.9
            mov       rcx, rax                                      ;15.9
            call      MPI_INITIALIZED                               ;15.9
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.24::                        ; Preds .B1.5
                                    ; Execution count [0.00e+000]
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.6::                         ; Preds .B1.24
                                    ; Execution count [0.00e+000]
    
    ;;;    Print *, "MPI_INITIALIZED", flag, ierr
    
            mov       QWORD PTR [32+rbp], 0                         ;16.4
            mov       QWORD PTR [128+rbp], 15                       ;16.4
            lea       rax, QWORD PTR [__STRLITPACK_2]               ;16.4
            mov       QWORD PTR [136+rbp], rax                      ;16.4
            lea       rax, QWORD PTR [32+rbp]                       ;16.4
            mov       edx, -1                                       ;16.4
            mov       rcx, 01208384ff00H                            ;16.4
            lea       rbx, QWORD PTR [__STRLITPACK_7]               ;16.4
            lea       rsi, QWORD PTR [128+rbp]                      ;16.4
            mov       QWORD PTR [32+rsp], rsi                       ;16.4
            mov       QWORD PTR [320+rbp], rcx                      ;16.4[spill]
            mov       rcx, rax                                      ;16.4
            mov       rax, QWORD PTR [320+rbp]                      ;16.4[spill]
            mov       r8, rax                                       ;16.4
            mov       r9, rbx                                       ;16.4
            call      for_write_seq_lis                             ;16.4
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.25::                        ; Preds .B1.6
                                    ; Execution count [0.00e+000]
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.7::                         ; Preds .B1.25
                                    ; Execution count [0.00e+000]
            mov       eax, DWORD PTR [8+rbp]                        ;16.4
            mov       DWORD PTR [152+rbp], eax                      ;16.4
            lea       rax, QWORD PTR [32+rbp]                       ;16.4
            lea       rdx, QWORD PTR [__STRLITPACK_8]               ;16.4
            lea       rcx, QWORD PTR [152+rbp]                      ;16.4
            mov       QWORD PTR [328+rbp], rcx                      ;16.4[spill]
            mov       rcx, rax                                      ;16.4
            mov       rax, QWORD PTR [328+rbp]                      ;16.4[spill]
            mov       r8, rax                                       ;16.4
            call      for_write_seq_lis_xmit                        ;16.4
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.26::                        ; Preds .B1.7
                                    ; Execution count [0.00e+000]
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.8::                         ; Preds .B1.26
                                    ; Execution count [0.00e+000]
            mov       eax, DWORD PTR [20+rbp]                       ;16.4
            mov       DWORD PTR [168+rbp], eax                      ;16.4
            lea       rax, QWORD PTR [32+rbp]                       ;16.4
            lea       rdx, QWORD PTR [__STRLITPACK_9]               ;16.4
            lea       rcx, QWORD PTR [168+rbp]                      ;16.4
            mov       QWORD PTR [336+rbp], rcx                      ;16.4[spill]
            mov       rcx, rax                                      ;16.4
            mov       rax, QWORD PTR [336+rbp]                      ;16.4[spill]
            mov       r8, rax                                       ;16.4
            call      for_write_seq_lis_xmit                        ;16.4
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.27::                        ; Preds .B1.8
                                    ; Execution count [0.00e+000]
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.9::                         ; Preds .B1.27
                                    ; Execution count [0.00e+000]
    
    ;;;    IF ( .not. flag ) THEN
    
            mov       eax, DWORD PTR [8+rbp]                        ;17.4
            test      al, 1                                         ;17.15
            jne       .B1.13        ; Prob 50%                      ;17.15
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.10::                        ; Preds .B1.9
                                    ; Execution count [0.00e+000]
    
    ;;;         Print *, "bad: not inited; ", flag, ierr
    
            mov       QWORD PTR [32+rbp], 0                         ;18.9
            mov       QWORD PTR [184+rbp], 17                       ;18.9
            lea       rax, QWORD PTR [__STRLITPACK_1]               ;18.9
            mov       QWORD PTR [192+rbp], rax                      ;18.9
            lea       rax, QWORD PTR [32+rbp]                       ;18.9
            mov       edx, -1                                       ;18.9
            mov       rcx, 01208384ff00H                            ;18.9
            lea       rbx, QWORD PTR [__STRLITPACK_10]              ;18.9
            lea       rsi, QWORD PTR [184+rbp]                      ;18.9
            mov       QWORD PTR [32+rsp], rsi                       ;18.9
            mov       QWORD PTR [344+rbp], rcx                      ;18.9[spill]
            mov       rcx, rax                                      ;18.9
            mov       rax, QWORD PTR [344+rbp]                      ;18.9[spill]
            mov       r8, rax                                       ;18.9
            mov       r9, rbx                                       ;18.9
            call      for_write_seq_lis                             ;18.9
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.28::                        ; Preds .B1.10
                                    ; Execution count [0.00e+000]
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.11::                        ; Preds .B1.28
                                    ; Execution count [0.00e+000]
            mov       eax, DWORD PTR [8+rbp]                        ;18.9
            mov       DWORD PTR [208+rbp], eax                      ;18.9
            lea       rax, QWORD PTR [32+rbp]                       ;18.9
            lea       rdx, QWORD PTR [__STRLITPACK_11]              ;18.9
            lea       rcx, QWORD PTR [208+rbp]                      ;18.9
            mov       QWORD PTR [352+rbp], rcx                      ;18.9[spill]
            mov       rcx, rax                                      ;18.9
            mov       rax, QWORD PTR [352+rbp]                      ;18.9[spill]
            mov       r8, rax                                       ;18.9
            call      for_write_seq_lis_xmit                        ;18.9
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.29::                        ; Preds .B1.11
                                    ; Execution count [0.00e+000]
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.12::                        ; Preds .B1.29
                                    ; Execution count [0.00e+000]
            mov       eax, DWORD PTR [20+rbp]                       ;18.9
            mov       DWORD PTR [224+rbp], eax                      ;18.9
            lea       rax, QWORD PTR [32+rbp]                       ;18.9
            lea       rdx, QWORD PTR [__STRLITPACK_12]              ;18.9
            lea       rcx, QWORD PTR [224+rbp]                      ;18.9
            mov       QWORD PTR [360+rbp], rcx                      ;18.9[spill]
            mov       rcx, rax                                      ;18.9
            mov       rax, QWORD PTR [360+rbp]                      ;18.9[spill]
            mov       r8, rax                                       ;18.9
            call      for_write_seq_lis_xmit                        ;18.9
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.30::                        ; Preds .B1.12
                                    ; Execution count [0.00e+000]
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.13::                        ; Preds .B1.30 .B1.9
                                    ; Execution count [0.00e+000]
    
    ;;;    ENDIF
    ;;;    IF ( flag ) THEN
    
            mov       eax, DWORD PTR [8+rbp]                        ;20.4
            test      al, 1                                         ;20.9
            je        .B1.17        ; Prob 50%                      ;20.9
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.14::                        ; Preds .B1.13
                                    ; Execution count [0.00e+000]
    
    ;;;         Print *, "good: inited; ", flag, ierr
    
            mov       QWORD PTR [32+rbp], 0                         ;21.9
            mov       QWORD PTR [240+rbp], 14                       ;21.9
            lea       rax, QWORD PTR [__STRLITPACK_0]               ;21.9
            mov       QWORD PTR [248+rbp], rax                      ;21.9
            lea       rax, QWORD PTR [32+rbp]                       ;21.9
            mov       edx, -1                                       ;21.9
            mov       rcx, 01208384ff00H                            ;21.9
            lea       rbx, QWORD PTR [__STRLITPACK_13]              ;21.9
            lea       rsi, QWORD PTR [240+rbp]                      ;21.9
            mov       QWORD PTR [32+rsp], rsi                       ;21.9
            mov       QWORD PTR [368+rbp], rcx                      ;21.9[spill]
            mov       rcx, rax                                      ;21.9
            mov       rax, QWORD PTR [368+rbp]                      ;21.9[spill]
            mov       r8, rax                                       ;21.9
            mov       r9, rbx                                       ;21.9
            call      for_write_seq_lis                             ;21.9
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.31::                        ; Preds .B1.14
                                    ; Execution count [0.00e+000]
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.15::                        ; Preds .B1.31
                                    ; Execution count [0.00e+000]
            mov       eax, DWORD PTR [8+rbp]                        ;21.9
            mov       DWORD PTR [264+rbp], eax                      ;21.9
            lea       rax, QWORD PTR [32+rbp]                       ;21.9
            lea       rdx, QWORD PTR [__STRLITPACK_14]              ;21.9
            lea       rcx, QWORD PTR [264+rbp]                      ;21.9
            mov       QWORD PTR [376+rbp], rcx                      ;21.9[spill]
            mov       rcx, rax                                      ;21.9
            mov       rax, QWORD PTR [376+rbp]                      ;21.9[spill]
            mov       r8, rax                                       ;21.9
            call      for_write_seq_lis_xmit                        ;21.9
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.32::                        ; Preds .B1.15
                                    ; Execution count [0.00e+000]
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.16::                        ; Preds .B1.32
                                    ; Execution count [0.00e+000]
            mov       eax, DWORD PTR [20+rbp]                       ;21.9
            mov       DWORD PTR [280+rbp], eax                      ;21.9
            lea       rax, QWORD PTR [32+rbp]                       ;21.9
            lea       rdx, QWORD PTR [__STRLITPACK_15]              ;21.9
            lea       rcx, QWORD PTR [280+rbp]                      ;21.9
            mov       QWORD PTR [384+rbp], rcx                      ;21.9[spill]
            mov       rcx, rax                                      ;21.9
            mov       rax, QWORD PTR [384+rbp]                      ;21.9[spill]
            mov       r8, rax                                       ;21.9
            call      for_write_seq_lis_xmit                        ;21.9
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.33::                        ; Preds .B1.16
                                    ; Execution count [0.00e+000]
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.17::                        ; Preds .B1.33 .B1.13
                                    ; Execution count [0.00e+000]
    
    ;;;    ENDIF
    ;;; 
    ;;; end program
            mov       eax, 1                                        ;24.1
            mov       QWORD PTR [296+rbp], rax                      ;24.1[spill]
            lea       rax, QWORD PTR [__rtc_frame_desc_2.0]         ;24.1
            mov       rdx, rax                                      ;24.1
            mov       rcx, rbp                                      ;24.1
            call      _RTC_CheckStackVars                           ;24.1
                                    ; LOE rbp rdi rsp r12 r13 r14 r15 rip zmm6 zmm7 zmm8 zmm9 zmm10 zmm11 zmm12 zmm13 zmm14 zmm15
    .B1.34::                        ; Preds .B1.17
                                    ; Execution count [0.00e+000]
            mov       rax, QWORD PTR [296+rbp]                      ;24.1[spill]
            mov       rbx, QWORD PTR [392+rbp]                      ;24.1[spill]
            mov       rsi, QWORD PTR [400+rbp]                      ;24.1[spill]
            lea       rsp, QWORD PTR [416+rbp]                      ;24.1
            pop       rbp                                           ;24.1
            ret                                                     ;24.1
                                    ; LOE
    .B1.18::
    ; mark_end;
    MAIN__ ENDP
    _TEXT	ENDS
    .xdata	SEGMENT  DWORD   READ  ''
    	ALIGN 004H
    .unwind.MAIN__.B1_B34	DD	889735681
    	DD	3617866
    	DD	3695683
    	DD	17302285
    	DD	1342242874
    .xdata	ENDS
    .pdata	SEGMENT  DWORD   READ  ''
    	ALIGN 004H
    .pdata.MAIN__.B1_B34	DD	imagerel .B1.1
    	DD	imagerel .B1.18
    	DD	imagerel .unwind.MAIN__.B1_B34
    .pdata	ENDS
    _RDATA	SEGMENT     READ  'DATA'
    __NLITPACK_0	DD	000000002H,000000000H
    __STRLITPACK_5	DD	132152
    	DB	0
    	DB 3 DUP ( 0H)	; pad
    __STRLITPACK_6	DD	65801
    	DB	0
    	DB 3 DUP ( 0H)	; pad
    __STRLITPACK_7	DD	132152
    	DB	0
    	DB 3 DUP ( 0H)	; pad
    __STRLITPACK_8	DD	131344
    	DB	0
    	DB 3 DUP ( 0H)	; pad
    __STRLITPACK_9	DD	65801
    	DB	0
    	DB 3 DUP ( 0H)	; pad
    __STRLITPACK_10	DD	132152
    	DB	0
    	DB 3 DUP ( 0H)	; pad
    __STRLITPACK_11	DD	131344
    	DB	0
    	DB 3 DUP ( 0H)	; pad
    __STRLITPACK_12	DD	65801
    	DB	0
    	DB 3 DUP ( 0H)	; pad
    __STRLITPACK_13	DD	132152
    	DB	0
    	DB 3 DUP ( 0H)	; pad
    __STRLITPACK_14	DD	131344
    	DB	0
    	DB 3 DUP ( 0H)	; pad
    __STRLITPACK_15	DD	65801
    	DB	0
    _RDATA	ENDS
    _DATA	SEGMENT      'DATA'
    _DATA	ENDS
    ; -- End  MAIN__
    _DATA	SEGMENT      'DATA'
    	COMM MPIPRIV1:BYTE:28
    	COMM MPIPRIV2:BYTE:24
    	COMM MPIFCMB5:BYTE:4
    	COMM MPIFCMB9:BYTE:4
    	COMM MPIPRIVC:BYTE:2
    _DATA	ENDS
    EXTRN	for_write_seq_lis_xmit:PROC
    EXTRN	for_write_seq_lis:PROC
    EXTRN	for_set_reentrancy:PROC
    EXTRN	MPI_DUP_FN:PROC
    EXTRN	MPI_NULL_DELETE_FN:PROC
    EXTRN	MPI_NULL_COPY_FN:PROC
    EXTRN	MPI_COMM_DUP_FN:PROC
    EXTRN	MPI_COMM_NULL_DELETE_FN:PROC
    EXTRN	MPI_COMM_NULL_COPY_FN:PROC
    EXTRN	MPI_WIN_DUP_FN:PROC
    EXTRN	MPI_WIN_NULL_DELETE_FN:PROC
    EXTRN	MPI_WIN_NULL_COPY_FN:PROC
    EXTRN	MPI_TYPE_DUP_FN:PROC
    EXTRN	MPI_TYPE_NULL_DELETE_FN:PROC
    EXTRN	MPI_TYPE_NULL_COPY_FN:PROC
    EXTRN	MPI_CONVERSION_FN_NULL:PROC
    EXTRN	MPI_INIT:PROC
    EXTRN	MPI_INITIALIZED:PROC
    EXTRN	_RTC_CheckStackVars:PROC
    EXTRN	__ImageBase:PROC
    	INCLUDELIB <ifconsol>
    	INCLUDELIB <libifcoremdd>
    	INCLUDELIB <libifportmd>
    	INCLUDELIB <libmmdd>
    	INCLUDELIB <MSVCRTD>
    	INCLUDELIB <libirc>
    	INCLUDELIB <svml_dispmd>
    	INCLUDELIB <OLDNAMES>
    	END






    Sunday, August 5, 2018 1:34 PM

All replies

  • Some more digging revealed this:

    #include <stdio.h>
    
    // Fortran interface
    void mpi_init_( int *ierr );
    void mpi_initialized_( int *flag, int *ierr );
    
    // C interface
    int MPI_Initialized( int* flag);
    
    int main(int argc, char *argv[])
    {
        int flag, ierr;
    
        mpi_init_(&ierr);
        printf("%d\n", ierr);
    	
        mpi_initialized_(&flag, &ierr);
        printf("Fortran: %d %d\n", flag, ierr);
    		
        ierr = MPI_Initialized(&flag);
        printf("C: %d %d\n", flag, ierr);
    }

    The output is the same across all compilers:

    0
    Fortran: -1 0
    C: 1 0

    That means MSMPI represents Fortran's TRUE as -1 (and FALSE as 0). Most compilers (Intel, PGI) are happy with this as they check whether the value (or the least significant bit) is 0 or not 0 when compiling an if-statement that checks a LOGICAL. However, other compilers (gfortran) check for the value of TRUE explicitly, and in the case of gfortran TRUE is represented as 1.

    gfortran's assembly from above:

    # ../example.F90:17:    IF ( .not. flag ) THEN
    	mov	eax, DWORD PTR 444[rbp]	 # flag.5_1, flag
    	xor	eax, 1	 # _2,
    	test	eax, eax	 # _2
    	je	.L2	 #,
    
    # ../example.F90:20:    IF ( flag ) THEN
    	mov	eax, DWORD PTR 444[rbp]	 # flag.6_3, flag
    	test	eax, eax	 # flag.6_3
    	je	.L4	 #,

    The "if (.not. flag)" test is equivalent to "if ((flag xor 1) != 0)", or shorter "if (flag != 1)". Since both 0 and -1 are not equal to 1, the branch is taken for FALSE and for MS MPI's TRUE alike. It only works correctly if TRUE is represented as 1.

    The "if (flag)" test is equivalent to "if (flag != 0)", which would work as expected in all cases.
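
    As a rough illustration, the two gfortran tests can be mirrored in C. This is only a sketch of the logic in the assembly above; the function names are made up for this example and are not part of gfortran or MS MPI.

    ```c
    #include <stdint.h>

    /* Mirrors "xor eax, 1; test eax, eax; je .L2":
       the body of IF (.not. flag) runs when (flag ^ 1) != 0, i.e. flag != 1. */
    int gfortran_style_not_taken(int32_t flag)
    {
        return (flag ^ 1) != 0;
    }

    /* Mirrors "test eax, eax; je .L4":
       the body of IF (flag) runs when flag != 0. */
    int gfortran_style_if_taken(int32_t flag)
    {
        return flag != 0;
    }
    ```

    With flag = -1 (the value returned by mpi_initialized_), both functions return 1, which matches the "bad" and "good" lines both being printed. With flag = 1 only the second one returns 1.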

    Intel Fortran's assembly:

    ;;;    IF ( .not. flag ) THEN
            mov       eax, DWORD PTR [8+rbp]                        ;17.4
            test      al, 1                                         ;17.15
            jne       .B1.13        ; Prob 50%                      ;17.15
    
    ;;;    IF ( flag ) THEN
            mov       eax, DWORD PTR [8+rbp]                        ;20.4
            test      al, 1                                         ;20.9
            je        .B1.17        ; Prob 50%                      ;20.9

    The "if (.not. flag)" test is equivalent to "if ((least_significant_byte(flag) & 1) == 0)", or shorter "if (least_significant_bit(flag) == 0)", where the least significant byte is -1 or 0 and the least significant bit is 1 or 0, respectively. This works as expected in all cases, no matter whether TRUE is -1 or 1.

    The "if (flag)" test is equivalent to "if ((least_significant_byte(flag) & 1) != 0)", or shorter "if (least_significant_bit(flag) != 0)", which again works for all cases.
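
    The same kind of C sketch for the least-significant-bit test used by Intel Fortran (and PGI). Again, the function names are hypothetical and only mirror the "test al, 1" logic shown above.

    ```c
    #include <stdint.h>

    /* Mirrors "test al, 1; jne .B1.13":
       the body of IF (.not. flag) runs when the lowest bit is 0. */
    int lsb_style_not_taken(int32_t flag)
    {
        return (flag & 1) == 0;
    }

    /* Mirrors "test al, 1; je .B1.17":
       the body of IF (flag) runs when the lowest bit is 1. */
    int lsb_style_if_taken(int32_t flag)
    {
        return (flag & 1) != 0;
    }
    ```

    Since -1 has all bits set in two's complement, its lowest bit is 1, so -1 and 1 are treated the same way here and exactly one of the two branches is taken.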

    PGI's assembly:

    ##    IF ( .not. flag ) THEN
    	testl	$1, -120(%rbp)
    	jne	.LB1_593
    
    ##    IF ( flag ) THEN
    	testl	$1, -120(%rbp)
    	je	.LB1_594

    The tests for PGI are identical to the ones of Intel Fortran.


    It seems that the fastest solution for this would be to patch MS MPI and use 1 as the value for TRUE. I can't see any downsides since it would be compatible with all major Fortran compilers on Windows.
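
    To illustrate what such a change amounts to, here is a minimal C sketch that maps any nonzero LOGICAL value to 1. The helper normalize_logical is hypothetical and is not MS MPI code; how the library actually stores the flag internally is not something I have looked at.

    ```c
    #include <stdint.h>

    /* Hypothetical helper: collapse any nonzero value (e.g. -1) to 1,
       keep 0 as 0. */
    int32_t normalize_logical(int32_t raw)
    {
        return raw != 0 ? 1 : 0;
    }

    /* gfortran-style IF (.not. flag): taken when (flag ^ 1) != 0. */
    int not_taken_after_normalize(int32_t raw)
    {
        return (normalize_logical(raw) ^ 1) != 0;
    }
    ```

    After normalization, -1 becomes 1, so the ".not." branch is no longer taken for TRUE, while the least-significant-bit test of the other compilers still sees a 1 in the lowest bit.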

    Does that sound like a plan?

    Cheers
    Maik

    Monday, August 6, 2018 8:45 PM
  • Thanks Maik for the report and investigation! The solution seems reasonable. We will verify it with other compilers, and hopefully we can include it in the next MSMPI release.

    Monday, August 6, 2018 9:03 PM
  • Maik,

    This has been fixed in MSMPI v10.0. Please give it a try.

    MSMPI is now open source; please find us at https://github.com/Microsoft/Microsoft-MPI

    Thanks,
    Jithin

    Friday, November 9, 2018 7:20 PM