01-26-2017 02:11 AM
I have been experimenting with high-performance linear algebra libraries and parallelisation on the NI Linux RT targets. Some tests I did yesterday showed that it makes no difference whether I call a library built with BLAS implementations of the matrix operations or one built with sequential FOR-loop implementations.
The thing I noticed afterwards was that these two functions:
float myArr[50000000];
int i = 0;

int seqMulti(void){
    for (i = 0; i < 50000000; i++) {
        myArr[i] = 256*i;
    }
    return i;
}

int paraMulti(void){
    #pragma omp parallel for
    for (i = 0; i < 50000000; i++) {
        myArr[i] = 256*i;
    }
    return i;
}
performed exactly the same when called from the Call Library Function node.
Is there something that I'm overlooking? Am I locked to a single thread when calling code with the Call Library Function node?
I compiled the shared object with -O3 -march=atom -mtune=atom -fopenmp -g3 -Wall -c -fmessage-length=0 -fPIC
The interesting part is that the paraMulti function returns nothing, while the seqMulti function returns the correct iteration count. Both calls take exactly 111 ms to execute on my cRIO.
Any ideas?
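A note on the return value, offered as a possible explanation rather than a confirmed diagnosis: OpenMP makes the loop variable of a `parallel for` private to each thread, so the global `i` is not updated by the parallel loop, which may be why paraMulti does not return the iteration count. Separately, `256*i` is computed as an int and overflows long before i reaches 50000000. The sketch below is not the code from the post; `paraFill` and its parameters are hypothetical names. It uses a local loop index, multiplies in float, and returns the element count explicitly.

```c
#include <stddef.h>

/* Sketch only (hypothetical function, not the original paraMulti):
 * fills dst[0..n-1] with 256*k and returns the number of elements written.
 * If the file is compiled without -fopenmp the pragma is ignored and the
 * loop simply runs sequentially. */
int paraFill(float *dst, int n)
{
    int k; /* local index; OpenMP gives each thread its own copy */

    #pragma omp parallel for
    for (k = 0; k < n; k++) {
        /* multiply in float so large k does not overflow an int */
        dst[k] = 256.0f * (float)k;
    }

    /* return the count directly instead of relying on the loop variable */
    return n;
}
```

Called with the array from the post, this would be `paraFill(myArr, 50000000)`.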
01-26-2017 11:09 AM - edited 01-26-2017 11:11 AM
The Call Library Function node does nothing to prevent you from creating threads. I'd instead suspect that the GCC tools being used haven't been configured with gomp support.
I'm away from my controllers at the moment, so I can't verify this myself, but check how the tools were configured with "gcc -v".
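Another way to check, from inside the shared object itself, is to export a couple of small diagnostic functions and call them through the Call Library Function node. This is only a sketch: `openmpVersion` and `parallelThreadCount` are hypothetical names, not part of any NI or GCC API. It relies on the standard `_OPENMP` macro, which the compiler defines only when OpenMP is enabled (for GCC, with -fopenmp), and on `omp_get_num_threads()` from omp.h.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: returns the value of the _OPENMP macro (a yyyymm
 * date identifying the supported OpenMP version), or 0 if this file was
 * compiled without OpenMP support. */
int openmpVersion(void)
{
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0;
#endif
}

/* Hypothetical helper: returns how many threads a parallel region
 * actually gets. Returns 1 when OpenMP is not compiled in or when the
 * runtime only provides a single thread. */
int parallelThreadCount(void)
{
    int count = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        /* exactly one thread records the team size */
        #pragma omp single
        count = omp_get_num_threads();
    }
#endif
    return count;
}
```

If `openmpVersion()` returns 0, the pragmas in the library were ignored at compile time. If it is nonzero but `parallelThreadCount()` returns 1, the runtime is limiting the code to one thread.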
01-27-2017 02:30 AM
I have two GCC versions installed. The default one:
admin@NI-cRIO-9039-01BE3F31:~# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/gcc/x86_64-nilrt-linux/4.8.2/lto-wrapper
Target: x86_64-nilrt-linux
Configured with: /builds/perforce/ThirdPartyExports/NIOpenEmbedded/trunk/3.5/objects/targettools/linuxU/x64/gcc-4.7-oe/release/build/tmp-glibc/work-shared/gcc-4.8.2-r0/gcc-4.8.2/configure --build=x86_64-linux --host=x86_64-nilrt-linux --target=x86_64-nilrt-linux --prefix=/usr --exec_prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --libexecdir=/usr/lib/gcc --datadir=/usr/share --sysconfdir=/etc --sharedstatedir=/com --localstatedir=/var --libdir=/usr/lib --includedir=/usr/include --oldincludedir=/usr/include --infodir=/usr/share/info --mandir=/usr/share/man --disable-silent-rules --disable-dependency-tracking --with-libtool-sysroot=/builds/perforce/ThirdPartyExports/NIOpenEmbedded/trunk/3.5/objects/targettools/linuxU/x64/gcc-4.7-oe/release/build/tmp-glibc/sysroots/x64 --with-gnu-ld --enable-shared --enable-languages=c,c++ --enable-threads=posix --enable-multilib --enable-c99 --enable-long-long --enable-symvers=gnu --enable-libstdcxx-pch --program-prefix=x86_64-nilrt-linux- --without-local-prefix --enable-target-optspace --enable-lto --enable-libssp --disable-bootstrap --disable-libmudflap --with-system-zlib --with-linker-hash-style=gnu --enable-linker-build-id --with-ppl=no --with-cloog=no --enable-checking=release --enable-cheaders=c_global --with-sysroot=/ --with-build-sysroot=/builds/perforce/ThirdPartyExports/NIOpenEmbedded/trunk/3.5/objects/targettools/linuxU/x64/gcc-4.7-oe/release/build/tmp-glibc/sysroots/x64 --with-native-system-header-dir=/builds/perforce/ThirdPartyExports/NIOpenEmbedded/trunk/3.5/objects/targettools/linuxU/x64/gcc-4.7-oe/release/build/tmp-glibc/sysroots/x64/usr/include --with-gxx-include-dir=/usr/include/c++/4.8.2 --enable-nls --enable-__cxa_atexit --with-arch=core2 --with-tune=core2
Thread model: posix
and a version I compiled with Fortran support in order to build ATLAS. I didn't specifically set the --disable-libgomp flag, so I don't know whether I have libgomp or not. I can't seem to find the flag to explicitly enable it...
admin@NI-cRIO-9039-01BE3F31:~/gcc-4.9.4/bin# ./gcc -v
Using built-in specs.
COLLECT_GCC=./gcc
COLLECT_LTO_WRAPPER=/home/admin/gcc-4.9.4/libexec/gcc/x86_64-unknown-linux-gnu/4.9.4/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ./configure --prefix=/home/admin/gcc-4.9.4 --enable-languages=c,c++,fortran --disable-multilib
Thread model: posix
gcc version 4.9.4 (GCC)
What I actually did was install libgomp through opkg:
admin@NI-cRIO-9039-01BE3F31:~/gcc-4.9.4/bin# ldconfig -p |grep omp
libnss_compat.so.2 (libc6,x86-64, OS ABI: Linux 3.14.3) => /lib/libnss_compat.so.2
libnss_compat.so (libc6,x86-64, OS ABI: Linux 3.14.3) => /usr/lib/libnss_compat.so
libgomp.so.1 (libc6,x86-64) => /usr/lib/libgomp.so.1
libXcomposite.so.1 (libc6,x86-64) => /usr/lib/libXcomposite.so.1
I think I'll try building GCC again with --enable-libgomp and see if that does something...
02-20-2020 03:40 AM
@tusrob wrote:
Is there something that I'm overlooking? Am I locked to a single thread when calling code with the Call Library Function node?
Any ideas?
I know it's probably late and the OP probably won't see this, but did you make sure to configure the Call Library Node to run in ANY thread? With the default "Run in UI thread" I could imagine such effects.