USB thumb drive with corrupted filesystem mounts two devices

Staab_Engineering · ‎07-07-2016

We have a cRIO-9068 running 3.2.35-rt52-2.10.0f0 (from uname -r) whose LV app uses a USB thumb drive. This particular cRIO has mounted the drive twice, once read-only in response to what appears to be a file system corruption, and again as a r/w device with no files. Here are some details from my investigation of the issue:

The drive has lots of files on it from past uses of the LVRT app. According to the end user, it has never been removed from the cRIO. At one point, the cRIO was powered up and the drive failed to mount on "sda1" as a r/w device, so the kernel mounted it read-only. I can see all the files and folders and read them. Everything has 0777 (octal) mode bits, but I can't actually write, which is what I would expect from a read-only mount. I've attached the full output of "dmesg" where you can see the "FAT-fs invalid cluster chain" errors.

Here's the weird thing -- and by "weird", I mean that I don't expect a Linux auto-mounter to normally do this -- there's also an "sdb1" device mounted on the cRIO. It has no files or directories, but I can write new ones to it just fine. Is this an NI-only thing? Can someone explain more about it and whether I should consider it safe to use "sdb1"?

And how about troubleshooting or further investigating this problem? There's important test data on the drive (yes, important data on a thumb drive lol, I know), so I'm nervous about trying to use "fsck" or other auto-repair tools without first understanding the scenario better.

Staab_Engineering · ‎07-07-2016

Update: I also ran parted /dev/sda1 'print'. Here's the output:

Model: Unknown (unknown)
Disk /dev/sda1: 2011MB
Sector size (logical/physical): 512B/512B
Partition Table: loop
Disk Flags:

Number  Start  End     Size    File system  Flags
 1      0.00B  2011MB  2011MB  fat32

I tried to run parted /dev/sdb1 'print', but it says it's not there!

Error: Could not stat device /dev/sdb1 - No such file or directory.

I can still browse it and write to it with my SSH client and over SFTP.

BradM · ‎07-07-2016

The attached kernel log seems to indicate some hardware failure, either with the USB storage device or the 9068's USB controller. Something's causing the asynchronous part of the new device probing to run twice, which is likely the root cause of there being "two devices" and of both behaving oddly.
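To look for signs of repeated probing, a saved copy of the kernel log can be filtered down to its storage-related lines. This is only a minimal sketch: it assumes the log was first saved to a file (for example with dmesg > /tmp/dmesg.txt), the function names are made up for illustration, and the patterns are based on typical sd and FAT-fs messages, so they may need adjusting for a given kernel.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of NI Linux RT.

# Print only the lines that mention FAT-fs or an sdX disk/partition.
storage_lines() {
    grep -E 'FAT-fs|\[sd[a-z]+\]|sd[a-z]+: sd[a-z]+[0-9]' "$1"
}

# Count how many times the kernel reported attaching a removable disk.
attach_count() {
    grep -c 'Attached SCSI removable disk' "$1"
}
```

Something like attach_count /tmp/dmesg.txt returning more than the number of physical insertions could hint at the double probe, although that interpretation is an assumption rather than a confirmed diagnosis.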

Can you do a quick sanity check by trying to examine the USB stick in a different controller (or computer)?

On the last point, what are the contents of the /dev/ folder? Are there entries for /dev/sdb and /dev/sdb1?

Staab_Engineering · ‎07-07-2016

Is this what you need? The end user is across the country, so I've emailed them to ask if they can pop the USB stick and check it in one of their computers.

admin@cRIO:~# ls /dev/sda
/dev/sda
admin@cRIO:~# ls /dev/sda1
/dev/sda1
admin@cRIO:~# ls /dev/sdb1
ls: /dev/sdb1: No such file or directory
admin@cRIO:~# ls -lahF /media/sdb1
total 0
drwxr-xr-x    4 admin    administ     312 Jul 7 03:23 ./
drwxr-xr-x   12 admin    administ     808 Jun 2 16:36 ../
drwxr-xr-x    2 admin    administ     160 Jul 7 00:42 Staab Test/
drwxr-xr-x    2 admin    administ     160 Jul 7 03:23 Hello World/

BradM · ‎07-07-2016

Yes, that's what I needed. What does mount show as being mounted to /media/sdb1?

Staab_Engineering · ‎07-07-2016

/dev/sda1 on /media/sda1 type vfat (ro,relatime,fmask=0000,dmask=0000,allow_utime=0022,codepage=cp437,iocharset=iso8859-1,shortname=mixed,quiet,errors=remount-ro)

BradM · ‎07-07-2016

I just want to make sure that we're checking the right thing:

(safemode) admin@HerNameIsRIO:~# dmesg | tail
[   43.008869] sd 0:0:0:0: Attached scsi generic sg0 type 0
[   44.227758] sd 0:0:0:0: [sda] 121065984 512-byte logical blocks: (61.9 GB/57.7 GiB)
[   44.228233] sd 0:0:0:0: [sda] Write Protect is off
[   44.228246] sd 0:0:0:0: [sda] Mode Sense: 23 00 00 00
[   44.228735] sd 0:0:0:0: [sda] No Caching mode page found
[   44.228745] sd 0:0:0:0: [sda] Assuming drive cache: write through
[   44.233998] sda: sda1
[   45.476510] sd 0:0:0:0: [sda] Attached SCSI removable disk
[   46.089862] FAT-fs (sda1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
[ 110.315695] random: nonblocking pool is initialized
(safemode) admin@HerNameIsRIO:~# mount /media/sda1/
mount: /dev/sda1 is already mounted or /media/sda1 busy
       /dev/sda1 is already mounted on /media/sda1

Note that I'm using mount to show which device backs a mountpoint root. If the directory is not a filesystem root, I'll see something like the following:

(safemode) admin@HerNameIsRIO:~# mkdir /media/sdb1
(safemode) admin@HerNameIsRIO:~# mount /media/sdb1
mount: can't find /media/sdb1 in /etc/fstab

When a new removable device is inserted, the udev scripts essentially create a directory for it and mount the device there. If the hardware is flaky, the scripts may create the folder and then never get a removal event from the kernel, so the cleanup (removing the directory) never happens. I'm asking you to check whether the /media/sdb1 folder is actually a mountpoint or just a normal folder under the rootfs of the 9068 itself (in which case its contents live on the NAND).
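The "mountpoint or plain folder" question can also be answered by comparing device numbers: a mounted filesystem's root sits on a different device than its parent directory. The following is a rough sketch under that assumption; is_fs_root is a hypothetical helper name, it relies on a stat that supports -c, and it will not detect bind mounts that come from the same filesystem.

```shell
#!/bin/sh
# Hypothetical helper: succeed (0) if DIR is the root of a mounted filesystem,
# return 1 if it is an ordinary directory, 2 if it cannot be examined.
is_fs_root() {
    dir=$1
    [ -d "$dir" ] || return 2
    # "device:inode" of the directory itself and of its parent.
    self=$(stat -c '%d:%i' "$dir/.") || return 2
    parent=$(stat -c '%d:%i' "$dir/..") || return 2
    # Different device number: a separate filesystem is mounted here.
    [ "${self%%:*}" != "${parent%%:*}" ] && return 0
    # Same device and same inode: this is "/" (it is its own parent).
    [ "$self" = "$parent" ] && return 0
    return 1
}
```

For example, is_fs_root /media/sdb1 && echo mountpoint || echo "plain directory" would print "plain directory" for a leftover folder on the rootfs. Where it is installed, the mountpoint utility performs a similar check.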

Another way to check what's going on would be to look at the /proc/mounts file (or call mount without any arguments) and look for the device in question:

(safemode) admin@HerNameIsRIO:~# grep sd /proc/mounts
/dev/sda1 /media/sda1 vfat rw,relatime,fmask=0000,dmask=0000,allow_utime=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,quiet,errors=remount-ro 0 0

In this output, I see that /dev/sda1 is mounted to the mountpoint /media/sda1, followed by the mount options (unimportant for the most part, but look for rw or ro).
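Picking the device and the rw/ro flag out of that output by eye gets tedious, so the lookup can be scripted. This is a sketch only: mount_info is a hypothetical helper, it assumes the usual /proc/mounts column order (device, mountpoint, type, options), and it does not handle mountpoints whose paths contain spaces.

```shell
#!/bin/sh
# Hypothetical helper: print "<device> <ro|rw>" for a mountpoint, reading a
# file in /proc/mounts format (second argument, default /proc/mounts).
# Returns non-zero if the directory is not listed as a mountpoint.
mount_info() {
    dir=${1%/}                 # drop one trailing slash
    [ -n "$dir" ] || dir=/     # the argument was "/" itself
    awk -v d="$dir" '
        $2 == d {
            n = split($4, opts, ",")
            mode = "?"
            # Match whole options only, so errors=remount-ro is not mistaken for ro.
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro" || opts[i] == "rw") mode = opts[i]
            line = $1 " " mode   # keep the last match: later mounts shadow earlier ones
            found = 1
        }
        END { if (found) print line; exit !found }
    ' "${2:-/proc/mounts}"
}
```

With the output above, mount_info /media/sda1 would print /dev/sda1 rw, while a directory that is not a mountpoint prints nothing and returns a non-zero status.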

Staab_Engineering · ‎07-07-2016

Ah, I honestly just misread your original request as sda1, not sdb1. Sorry about that. And thanks for the detailed explanation! Now I know where your automount scripts are. I love looking through all this stuff to learn how the RIO's IT is managed.

sdb1 is not listed in the output of mount. sda1 is, and I pasted that line above. Also this:

admin@cRIO:~# grep sd /proc/mounts
/dev/sda1 /media/sda1 vfat ro,relatime,fmask=0000,dmask=0000,allow_utime=0022,codepage=cp437,iocharset=iso8859-1,shortname=mixed,quiet,errors=remount-ro 0 0

So as you might already suspect:

admin@cRIO:~# mount /media/sdb1/
mount: can't find /media/sdb1/ in /etc/fstab

ls with extra flags confirms the suspicion that /media/sdb1/ isn't a mountpoint:

admin@cRIO:~# ls -lahF /media
total 4
drwxr-xr-x   12 admin    administ     808 Jun 2 16:36 ./
drwxrwxr-x   17 webserv ni          1.6K Jul 6 07:33 ../
drwxr-xr-x    2 admin    administ     160 Dec 8 2014 card/
drwxr-xr-x    2 admin    administ     160 Dec 8 2014 cf/
drwxr-xr-x    2 admin    administ     160 Dec 8 2014 hdd/
drwxr-xr-x    2 admin    administ     160 Dec 8 2014 mmc1/
drwxr-xr-x    2 admin    administ     160 Dec 8 2014 net/
drwxr-xr-x    2 admin    administ     160 Dec 8 2014 ram/
drwxr-xr-x    2 admin    administ     160 Dec 8 2014 realroot/
drwxrwxrwx    8 admin    administ    4.0K Dec 31 1969 sda1/
drwxr-xr-x    4 admin    administ     312 Jul 7 03:23 sdb1/
drwxr-xr-x    2 admin    administ     160 Dec 8 2014 union/

It's got a recent modification date, a weird size, and the wrong mode bits.
