08-12-2022 10:36 PM
I wonder what lsni -v shows?
In my experience it shows the RIO alias string, although I have found that the latest driver versions don't do that: the alias is more of a PCI bus string, and I have to derive the RIO number from the index order of the devices reported by the System Configuration API. Did NI change something in how the RIO number is reported?
More to the point, I believe you need the correct RIO string for the device you're loading the bitfile into. Without it, the bitfile load will fail.
08-16-2022 01:01 AM
Here is the output of dmesg.
08-16-2022 01:04 AM
Here's the output of lsni -v.
Scanning localhost for devices...
System Configuration API Experts found:
NI Device Interconnect Manager 22.5.0 (nidim)
NI-CONTROLLER 22.5 (ni-controller)
NI Network Browser 22.5 (network)
NI MX Routing Utility 22.5.0 (nimru)
NI-MXI 22.5 (ni-mxi)
NI PXI Platform Services 22.5 (ni-pxi)
NI-QPXI 22.5 (ni-qpxi)
NI-RIO 22.5.0 (ni-rio)
NI System Configuration 22.5 (nisyscfg)
System Configuration API resources found:
PXIChassis1
--Primary Expert: NI PXI Platform Services 22.5
--Model Name: NI PXIe-1083
--Serial Number: redacted
mtms-Precision-3650-Tower
--Primary Expert: NI System Configuration 22.5
--Model Name: Precision 3650 Tower
--Serial Number: redacted
/sys/bus/pci/devices/0000:0e:00.0
--Primary Expert: NI System Configuration 22.5
--Bus/Dev/Func: 14/0/0
Ethernet Adapter docker0
--Primary Expert: NI System Configuration 22.5
Ethernet Adapter enp0s31f6
--Primary Expert: NI System Configuration 22.5
I have tried both RIO0 and PXI1Slot4 as the RIO string; both work on Windows.
08-16-2022 08:33 AM
Hmm, there are no errors or any indication of why the driver isn't loaded.
The driver on Linux for that device is nirseriesstc3k. Could you try the following:
1. sudo modprobe nirseriesstc3k
2. Check dmesg to see if any new errors were printed
3. If there weren't any errors above, check if the lsni output has changed to include the PXIe card
4. Maybe also check lspci -v to see if the PXIe-7820R has a "Kernel driver in use"
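Step 4 can be scripted if you want to repeat the check after each change. This is only a sketch: it assumes lspci -v prints one block per device separated by blank lines, and that the card sits at 0e:00.0 (Bus/Dev/Func 14/0/0 in the lsni output above). The function name is made up for illustration.

```python
import re


def kernel_driver_in_use(lspci_text, slot):
    """Return the driver bound to the device at `slot` (e.g. "0e:00.0"),
    or None if the device is absent or has no driver bound."""
    # lspci -v separates devices with blank lines.
    for block in re.split(r"\n\s*\n", lspci_text.strip()):
        lines = block.splitlines()
        if not lines or not lines[0].split():
            continue
        # The first line starts with the slot, optionally with a domain prefix.
        address = lines[0].split()[0]
        if address != slot and not address.endswith(":" + slot):
            continue
        for line in lines[1:]:
            key, _, value = line.strip().partition(":")
            if key == "Kernel driver in use":
                return value.strip()
        return None
    return None


# Usage on the affected machine (not run here):
#   import subprocess
#   text = subprocess.run(["lspci", "-v"], capture_output=True, text=True).stdout
#   print(kernel_driver_in_use(text, "0e:00.0"))
```

A result of None for a slot that lspci does list would mean no kernel driver is bound to the card.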
08-17-2022 01:05 AM
I think we are now understanding why it doesn't work. Running sudo modprobe nirseriesstc3k outputs
modprobe: FATAL: Module nirseriesstc3k not found in directory /lib/modules/5.14.0-1048-oem
Upon further investigation, it seems that I have two folders under /lib/modules/, 5.14.0-1047-oem/ and 5.14.0-1048-oem/. Running uname -r returns 5.14.0-1048-oem, BUT dkms status says
...
nirseriesstc3k, 21.5.0f98, 5.14.0-1047-oem, x86_64: installed
...
So I reckon this means that the driver is installed for the wrong kernel for some reason?
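The mismatch between uname -r and dkms status can be checked for every module at once. A rough sketch, assuming dkms status prints lines in the "module, version, kernel, arch: state" form quoted above (newer dkms releases print "module/version, kernel, arch: state", which is also handled). The function names are hypothetical.

```python
import re


def parse_dkms_status(text):
    """Return (module, version, kernel, state) tuples from `dkms status` text."""
    entries = []
    for line in text.splitlines():
        head, sep, state = line.partition(":")
        if not sep:
            continue  # not a status line (e.g. an elided "..." line)
        fields = [f.strip() for f in re.split(r"[,/]", head)]
        if len(fields) < 3:
            continue  # module is added but not built for any kernel
        module, version, kernel = fields[:3]
        entries.append((module, version, kernel, state.strip()))
    return entries


def missing_for_kernel(text, kernel):
    """Modules installed for some kernel, but not for `kernel`."""
    installed = {}
    for module, _version, built_for, state in parse_dkms_status(text):
        if state.startswith("installed"):
            installed.setdefault(module, set()).add(built_for)
    return sorted(m for m, kernels in installed.items() if kernel not in kernels)


# Usage on the affected machine (not run here):
#   import platform, subprocess
#   out = subprocess.run(["dkms", "status"], capture_output=True, text=True).stdout
#   print(missing_for_kernel(out, platform.release()))  # release() matches uname -r
```

Any module this lists is one that modprobe will not find under the running kernel.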
08-17-2022 01:15 AM - edited 08-17-2022 01:16 AM
I booted into the 5.14.0-1047-oem kernel and got a step further. Now the device is found, and the lsni -v output looks different too:
Scanning localhost for devices...
System Configuration API Experts found:
NI Device Interconnect Manager 22.5.0 (nidim)
NI-CONTROLLER 22.5 (ni-controller)
NI Network Browser 22.5 (network)
NI MX Routing Utility 22.5.0 (nimru)
NI-MXI 22.5 (ni-mxi)
NI PXI Platform Services 22.5 (ni-pxi)
NI-QPXI 22.5 (ni-qpxi)
NI-RIO 22.5.0 (ni-rio)
NI System Configuration 22.5 (nisyscfg)
System Configuration API resources found:
PXI1Slot4
--Primary Expert: NI-RIO 22.5.0
--Model Name: NI PXIe-7820R
--Serial Number: redacted
--Chassis: PXI1
--Slot: 4
--Trigger Bus Number: 1
--Bus/Dev/Func: 14/0/0
PXIChassis1
--Primary Expert: NI PXI Platform Services 22.5
--Model Name: NI PXIe-1083
--Serial Number: redacted
mtms-Precision-3650-Tower
--Primary Expert: NI System Configuration 22.5
--Model Name: Precision 3650 Tower
Ethernet Adapter docker0
--Primary Expert: NI System Configuration 22.5
Ethernet Adapter enp0s31f6
--Primary Expert: NI System Configuration 22.5
HOWEVER, now when I try to utilize FIFOs, my whole computer freezes. DMA protection should be disabled:
$ cat /sys/bus/thunderbolt/devices/domain0/iommu_dma_protection
0
$ cat /sys/bus/thunderbolt/devices/domain0/security
none
I wonder what the problem is now? Also, looking at the dkms status output, it seems like half of the drivers are built for kernel 5.14.0-1047-oem and half for 5.14.0-1048-oem. Could that cause this kind of problem?
08-17-2022 08:28 AM
dkms should take care of versioning modules for all available kernels. See if dkms autoinstall rebuilds everything for one or both kernels.
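To rebuild for both kernels explicitly, dkms autoinstall accepts a kernel release through -k. The sketch below only builds the command lines so they can be reviewed before running. It assumes the kernel headers for each release are installed (dkms needs them to build), and the helper names are made up.

```python
import os


def installed_kernels(modules_dir="/lib/modules"):
    """Kernel releases that have a directory under /lib/modules."""
    return sorted(
        entry for entry in os.listdir(modules_dir)
        if os.path.isdir(os.path.join(modules_dir, entry))
    )


def autoinstall_commands(kernels):
    """Build one `dkms autoinstall` command line per kernel release."""
    return [["sudo", "dkms", "autoinstall", "-k", k] for k in sorted(kernels)]


# Usage on the affected machine (not run here):
#   import subprocess
#   for cmd in autoinstall_commands(installed_kernels()):
#       print(" ".join(cmd))
#       # subprocess.run(cmd, check=True)  # uncomment to actually rebuild
```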
08-17-2022 08:51 AM
I assume this is an Intel CPU. Could you try disabling VT-d in the BIOS or specifying the kernel command line argument "intel_iommu=off" and see if you are able to start FIFOs without that?
This sounds a lot like an IOMMU issue even though you have Thunderbolt security disabled. This driver attempts to do things correctly to support IOMMUs on Linux, but after digging through the code a bit, I'm not sure it's registering its DMA link chain with the DMA APIs properly. If disabling the IOMMU fully works, I'll dig into that a little further.
08-18-2022 01:13 AM
Yeah it's an Intel CPU. I disabled VT-d in BIOS, didn't help. I then added the kernel command line argument "intel_iommu=off", didn't help. I tried having both, but that unfortunately didn't help either.
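It may be worth confirming that the argument actually reached the running kernel, since a bootloader entry that was edited but not regenerated silently drops it. A small sketch that inspects the contents of /proc/cmdline; it splits on whitespace only, so quoted arguments containing spaces are not handled, and the function names are hypothetical.

```python
def kernel_args(cmdline):
    """Map each kernel command-line argument to its value (None for bare flags)."""
    args = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        args[key] = value if sep else None
    return args


def intel_iommu_disabled(cmdline):
    """True if the command line contains intel_iommu=off."""
    return kernel_args(cmdline).get("intel_iommu") == "off"


# Usage on the affected machine (not run here):
#   with open("/proc/cmdline") as f:
#       print(intel_iommu_disabled(f.read()))
```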
08-18-2022 01:12 PM
Hmm, it's hard to say what's going on then.
You mentioned the machine was freezing. Could you try using sysrq and see if you can get the "Backtrace for active CPUs" option working, then try it in the frozen state?
https://en.wikipedia.org/wiki/Magic_SysRq_key
You might also try getting a serial console hooked up to see whether any messages are printed when the system goes down.
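Before relying on SysRq in the frozen state, check that the backtrace key is permitted at all. As far as I understand the kernel's SysRq documentation, /proc/sys/kernel/sysrq holds 0 (disabled), 1 (everything enabled), or a bitmask in which 0x8 enables the debugging dumps that the "l" (backtrace for active CPUs) key belongs to. A sketch under that assumption, with made-up names:

```python
SYSRQ_ENABLE_ALL = 1      # value 1 enables every SysRq function
SYSRQ_ENABLE_DUMP = 0x8   # bit assumed to gate debugging dumps such as "l"


def backtrace_key_enabled(sysrq_value):
    """True if the given /proc/sys/kernel/sysrq value permits SysRq-l."""
    if sysrq_value == SYSRQ_ENABLE_ALL:
        return True
    return bool(sysrq_value & SYSRQ_ENABLE_DUMP)


# Usage on the affected machine (not run here):
#   with open("/proc/sys/kernel/sysrq") as f:
#       print(backtrace_key_enabled(int(f.read())))
```

If this returns False, writing 1 to /proc/sys/kernel/sysrq as root before reproducing the freeze should enable all keys for that boot.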