NFS nfsStBlksize and buffer overflows (NFS RPC: Timed out)

Discussion:

Peter Dufault

2014-09-10 13:53:24 UTC

My client is having problems similar to that described here:

http://www.rtems.org/rtems/maillistArchives/rtems-users/2011/march/msg00228.html

I don't understand the details, or why one needs to limit the I/O size based on the ethernet chip set, but I did verify that cpukit/libfs/src/nfsclient/src/nfs.c does not have changes described in the email and that changing nfs.c and then setting nfsStBlksize to 4096 before calling nfsInit() works around the issue. This is on the phycore_mpc5554 BSP with the smc91111 ethernet chip.

I'll gladly submit a bug and a patch for nfs.c, but I don't fully understand the issue. I could just describe the bug as "NFS does not honor nfsStBlksize" and submit it as that.

Does this make sense or is there a more fundamental bug at a different level?

Peter
-----------------
Peter Dufault
HD Associates, Inc. Software and System Engineering

Peter Dufault

2014-09-11 11:47:11 UTC

Post by Peter Dufault
http://www.rtems.org/rtems/maillistArchives/rtems-users/2011/march/msg00228.html
I don't understand the details, or why one needs to limit the I/O size based on the ethernet chip set, but I did verify that cpukit/libfs/src/nfsclient/src/nfs.c does not have changes described in the email and that changing nfs.c and then setting nfsStBlksize to 4096 before calling nfsInit() works around the issue. This is on the phycore_mpc5554 BSP with the smc91111 ethernet chip.

According to http://osr507doc.sco.com/en/PERFORM/NFS_tuning.html mounting the file system with a smaller rsize and wsize should address the problem:

"If the network adapter on an NFS client cannot handle full frames and back-to-back packets, reduce the NFS read and write transfer sizes below the default of 8KB. To do this, specify the mount(ADM) option modifiers rsize and wsize for each mounted filesystem. These must be added to the options defined for the mntopts keyword in the file /etc/default/filesys (see filesys(F) for more information). The following is an example of such an entry reducing the read and write transfer sizes to 1KB (1024 bytes):

" bdev=nfs_svr:/remote \
mountdir=/remote_mnt fstyp=NFS \
fsck=no fsckflags= \
init=yes initcmd="sleep 2" \
mntopts="bg,soft,rsize=1024,wsize=1024" \
rcmount=yes rcfsck=no mountflags=..."

Note that nfs.c already sets the blocksize based on nfsStBlksize:

/* Set to "preferred size" of this NFS client implementation */
buf->st_blksize = nfsStBlksize ? nfsStBlksize : fa->blocksize;

The existing code limits I/O transfers to nfsStBlksize or NFS_MAXDATA and not the file system block size. Is there a way to get back to the buf->st_blksize given the "rtems_libio_t *iop" in the NFS I/O routines? Then the file system blocksize could be used as the limit.

The above code-quote has an additional bug. nfsStBlksize has a default value of DEFAULT_NFS_ST_BLKSIZE which is NFS_MAXDATA which is 8K, so the default RTEMS behavior is that buf->st_blksize is always 8K even if fa->blocksize is 4K.
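To make the effect of that default concrete, here is a minimal standalone sketch of the selection expression quoted above. It does not use the RTEMS headers: reported_blksize, sketch_check_reported_blksize and SKETCH_NFS_MAXDATA are hypothetical names, and the 8K value is taken from this thread, not from nfs.c.

```c
#include <assert.h>

/* Stand-in for NFS_MAXDATA / DEFAULT_NFS_ST_BLKSIZE (8K per this thread);
 * an assumption, not copied from the RTEMS sources. */
#define SKETCH_NFS_MAXDATA 8192u

/* Hypothetical helper mirroring the quoted expression:
 *   buf->st_blksize = nfsStBlksize ? nfsStBlksize : fa->blocksize; */
static unsigned reported_blksize(unsigned st_blksize_tunable,
                                 unsigned fa_blocksize)
{
    return st_blksize_tunable ? st_blksize_tunable : fa_blocksize;
}

/* Checks the two cases discussed in the message. */
static void sketch_check_reported_blksize(void)
{
    /* A nonzero default (8K) always wins over a 4K file system block size. */
    assert(reported_blksize(SKETCH_NFS_MAXDATA, 4096u) == 8192u);
    /* Only a zero tunable lets the file system block size through. */
    assert(reported_blksize(0u, 4096u) == 4096u);
}
```

With a nonzero default, the fa->blocksize branch is never taken unless the application clears the tunable itself.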

So currently:
- I/O is not limited to the file system block size;
- The mounted block size is ignored and nfsStBlksize is used, which defaults to 4K.

The first problem can be fixed by getting from the iop back to the block size and then limiting the transfer at that size. The second can be fixed by initializing nfsStBlksize to 0 instead of NFS_MAXDATA.
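As a rough illustration of "limiting the transfer at that size", the sketch below caps each transfer at a given limit and shows how a larger request ends up split into several calls. It is standalone C, not the nfs.c code; clamp_transfer and transfers_needed are hypothetical names, and how the limit is obtained from the iop is left open, as in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: cap a single transfer at `limit` bytes. */
static size_t clamp_transfer(size_t requested, size_t limit)
{
    assert(limit > 0); /* a zero limit would never make progress */
    return requested > limit ? limit : requested;
}

/* Count how many capped transfers a request of `total` bytes needs,
 * assuming the caller simply retries with the remaining bytes. */
static size_t transfers_needed(size_t total, size_t limit)
{
    size_t n = 0;

    while (total > 0) {
        total -= clamp_transfer(total, limit);
        n++;
    }
    return n;
}
```

For example, with a 4096-byte limit a 10000-byte request would go out as transfers of 4096, 4096 and 1808 bytes.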

Peter
-----------------
Peter Dufault
HD Associates, Inc. Software and System Engineering

Peter Dufault

2014-09-11 13:11:09 UTC

Post by Peter Dufault
- I/O is not limited to the file system block size;
- The mounted block size is ignored and nfsStBlksize is used, which defaults to 4K.

I meant 8K above. Anyway, looking through the code and the associated change log, this behavior is intentional: apparently the newlib file system blocksize used to default to something small like 512 bytes, and increasing the default reported size to 8K improved performance (no buffer cache on NFS).

The value I'm getting now is a 4K file system blocksize on my NFS file system. I'm not sure if that comes from newlib or from somewhere in the NFS code; I don't see 4K there. Changing the code to go back to using the file system block size would hurt performance for everyone unless more work is done to get the default to 8K for NFS mounts. I'm just going to change the code to limit I/O to nfsStBlksize and add a comment explaining why it is the way it is.

If someone understands the state of the file system code today and how it's changed over the years and wants to suggest a better fix I'll listen.

So:
- Default for nfsStBlksize will be 8192.
- Default behavior is to ignore the file system block size (nfsStBlksize not 0), nfsStBlksize will be reported.
- nfsStBlksize set to 0 will pay attention to the file system block size and limit I/O at that.

I think that's the author's original intention.
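The three rules in the list can be sketched as one small policy function. This is an illustration of the behavior described in this message, not the eventual patch: sketch_io_limit, sketch_capped_count and SKETCH_DEFAULT_ST_BLKSIZE are hypothetical names, and the tunable is passed as a parameter instead of being read from the real nfsStBlksize variable.

```c
#include <assert.h>
#include <stddef.h>

/* Proposed default for the tunable, per the list above. */
#define SKETCH_DEFAULT_ST_BLKSIZE 8192u

/* Nonzero tunable: use it and ignore the file system block size.
 * Zero tunable: fall back to the file system block size. */
static size_t sketch_io_limit(unsigned st_blksize_tunable,
                              unsigned fs_blocksize)
{
    unsigned limit = st_blksize_tunable ? st_blksize_tunable : fs_blocksize;

    assert(limit > 0); /* both inputs zero would leave no usable limit */
    return limit;
}

/* Cap one read or write request at the limit chosen by the policy. */
static size_t sketch_capped_count(size_t count,
                                  unsigned st_blksize_tunable,
                                  unsigned fs_blocksize)
{
    size_t limit = sketch_io_limit(st_blksize_tunable, fs_blocksize);

    return count < limit ? count : limit;
}
```

Under these assumptions a 16384-byte request on a 4K file system is capped at 8192 bytes with the default tunable, and at 4096 bytes when the tunable is set to 4096 or to 0.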

Peter
-----------------
Peter Dufault
HD Associates, Inc. Software and System Engineering