ffilz

David Sohonet Chat

Apr 30th, 2019
  1. Apr 18 12:59:51 * _david_sohonet (~dcc@usa-67-224-101-18.sohonet.co.uk) has joined #ganesha
  2. Apr 18 12:59:55 <_david_sohonet> Hi. I am using ganesha 2.7.2 with an OS X client. I am seeing occasional errors on reading a folder. We are using NFSv3. The OS X client reports "Unknown error: 10006". Looking at the server side of things, that appears to map to NFS3ERR_SERVERFAULT. I don't have any log messages for errors at the time the NFS client reported the error.
  3. Apr 18 13:00:09 * skoduri has quit (Ping timeout: 255 seconds)
  4. Apr 18 13:00:47 <_david_sohonet> Is it possible to get LogDebug() messages into a log file with a config change to ganesha ?
  5. Apr 18 13:22:49 * michaeldexter has quit (Quit: michaeldexter)
  6. Apr 18 13:26:26 <_david_sohonet> In other words, how can I figure out what the server returned a 10006 error code to the NFS Client ?
  7. Apr 18 13:26:45 <_david_sohonet> *why* - excuse my typo
  8. Apr 18 13:33:02 <ffilzwin> _david_sohonet, you could start with LOG { COMPONENTS { NFSPROTO = FULL_DEBUG; } } in your config file
  9. Apr 18 13:33:45 <ffilzwin> you can send Ganesha a SIGHUP and it will re-read the config file and enable the debug (to disable you need to explicitly set NFSPROTO back to INFO or EVENT depending on your overall debug level)
  10. Apr 18 13:34:05 <ffilzwin> depending on the source of the error, enabling debug on other components will be useful
  11. Apr 18 13:43:05 <ffilzwin> hmm, now I can do things as root, but now I don't even see an attempt to map ffilz@LOCALDOMAIN...
  12. Apr 18 13:44:07 * mattbenjamin has quit (Quit: Leaving.)
  13. Apr 18 13:46:13 <_david_sohonet> Thanks ffilzwin - I will try that
  14.  
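Note: the 10006 reported by the OS X client matches the nfsstat3 value NFS3ERR_SERVERFAULT in RFC 1813, as _david_sohonet worked out above. A minimal Python sketch of that lookup follows. The table covers only the NFSv3-specific 100xx codes from the RFC, and the function name describe_nfs3_status is made up for illustration; it is not part of Ganesha.

```python
# NFSv3-specific status codes (nfsstat3) as listed in RFC 1813.
# Lower-numbered codes mirror errno values and are left out of this sketch.
NFS3_STATUS = {
    10001: "NFS3ERR_BADHANDLE",
    10002: "NFS3ERR_NOT_SYNC",
    10003: "NFS3ERR_BAD_COOKIE",
    10004: "NFS3ERR_NOTSUPP",
    10005: "NFS3ERR_TOOSMALL",
    10006: "NFS3ERR_SERVERFAULT",
    10007: "NFS3ERR_BADTYPE",
    10008: "NFS3ERR_JUKEBOX",
}


def describe_nfs3_status(code: int) -> str:
    """Return the symbolic name for a 100xx status code (hypothetical helper)."""
    return NFS3_STATUS.get(code, f"unknown ({code})")


print(describe_nfs3_status(10006))
```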
  15. Apr 24 10:15:22 <_david_sohonet> I mentioned last week in here an issue with a NFS client ( Mac OS X , NFSv3) getting error code 10006 using v2.7.2 ganesha. The app running on the client errors saying "Cannot read folder /Volumes/qstar/READYNAS/Storage/FACILITY: cannot read folder /FACILITY: Unknown error: 10006 (error 10006)". I asked in here, how do I work out why ganesha is returning 10006 to the client. I added "LOG {
  16. Apr 24 10:15:28 <_david_sohonet> COMPONENTS { NFSPROTO = FULL_DEBUG; } }" as suggested in here to my config and HUP'ed the daemon. The NFS client ran the workload (a backup from the OS X app goodsync) and I got more errors on the client. I looked through the 800MB ganesha.log but I can't see any errors or failures logged, no explanation. Any other ideas?
  17. Apr 24 10:26:05 <ffilzwin> kkeithley, oh, yea, we may not have a recent ntirpc pullup in next...
  18. Apr 24 10:29:30 <ffilzwin> _david_sohonet, we don't return that error from too many places, a wireshark trace showing the request that got the failure would help
  19. Apr 24 10:29:56 <ffilzwin> what FSAL are you using?
  20. Apr 24 10:30:16 <_david_sohonet> VFS FSAL
  21. Apr 24 10:31:22 <ffilzwin> hmm, that makes it likely it's from a READDIR_PLUS call
  22. Apr 24 10:31:36 <ffilzwin> you did say NFS v3 right?
  23. Apr 24 10:32:14 <_david_sohonet> correct. mount -v -t nfs -o rw,hard,intr,resvport,async,tcp,nolocks,nfc,locallocks,vers=3,wsize=1048576,rsize=1048576 10.0.0.150:/data/storage /Volumes/qstar
  24. Apr 24 10:32:38 * cliluw (~cliluw@unaffiliated/cliluw) has joined #ganesha
  25. Apr 24 10:32:46 <ffilzwin> and you didn't see a CRIT Mapping x to ERR_FSAL_SERVERFAULT in the Ganesha log right?
  26. Apr 24 10:35:13 <ffilzwin> what type of filesystem being exported? What's in the FACILITY directory?
  27. Apr 24 10:35:46 <_david_sohonet> I grep'ed and have no "CRIT" messages in the 800MB log file for the day when the client has the issue
  28. Apr 24 10:36:48 <_david_sohonet> The underlying filesystem is a FUSE-based archive filesystem, which is backed by S3 object storage. It's from a company called qstar https://www.qstar.com/archive-manager/
  29. Apr 24 10:42:06 <_david_sohonet> We have had multiple folders mentioned with the "Unknown error: 10006". Three on the same day that I was told about
  30. Apr 24 10:43:11 <_david_sohonet> OS X version is Mac OS 10.12.6 Sierra
  31. Apr 24 10:43:15 <_david_sohonet> if that matters
  32. Apr 24 10:43:56 <ffilzwin> for this one, client shouldn't matter
  33. Apr 24 10:44:17 <ffilzwin> no MAJOR messages about Space too small for handle?
  34. Apr 24 10:44:38 <ffilzwin> The filesystem could also be returning ENODATA for something
  35. Apr 24 10:47:28 <ffilzwin> I'm going to guess there's a file in the directory that has something a bit funny about it considering the involvement of FUSE and S3 backed storage...
  36. Apr 24 10:49:54 <_david_sohonet> 22/04/2019 18:15:06 : epoch 5cb4a000 : SM-X10SRI4B-S-1-LDCF1-GB : ganesha.nfsd-6600[svc_1359] nfs3_lookup :NFS3 :DEBUG :REQUEST PROCESSING: Calling nfs_Lookup handle: File Handle V3: Len=28 4300004d164400000000300000008100000000c6f73b008b95ae5c00 name: MAJOR TOM
  37. Apr 24 10:49:58 <_david_sohonet> 22/04/2019 18:15:06 : epoch 5cb4a000 : SM-X10SRI4B-S-1-LDCF1-GB : ganesha.nfsd-6600[svc_1392] nfs3_lookup :NFS3 :DEBUG :REQUEST PROCESSING: Calling nfs_Lookup handle: File Handle V3: Len=28 4300004d164400000000300000008100000000c9f73b008b95ae5c00 name: MAJOR TOM
  38. Apr 24 10:50:05 <_david_sohonet> only a file/folder called MAJOR TOM :-)
  39. Apr 24 10:56:07 <ffilzwin> hmm, maybe something about MAJOR TOM? A wireshark trace would definitely help (but maybe would reveal detail you can't share)
  40. Apr 24 10:58:08 <ffilzwin> you could turn on log component FSAL = FULL_DEBUG; and READDIR = FULL_DEBUG; but that might get you way too much if you don't have a small re-creator
  41. Apr 24 10:58:37 <ffilzwin> any errors from FUSE or the S3 backend?
  42. Apr 24 11:12:52 <_david_sohonet> I cannot see any errors in the "qstar" log during the time period when the NFS client had issues, no.
  43. Apr 24 11:14:06 <kkeithley> lol, so that wasn't completely unexpected.
  44. Apr 24 11:27:07 <_david_sohonet> I have upgraded from 2.7.2 to 2.7.3 (and the library thingy). In the list messages from Daniel it mentions "some fixes for readdir" - could that be related to my issue?
  45. Apr 24 12:17:45 <ffilzwin> I don't think so
  46. Apr 24 12:18:17 <ffilzwin> they have to do with the dirent cache, the error you're seeing isn't from the cache
  47. Apr 24 12:33:39 <_david_sohonet> Ok, in that case I will get the person to run the NFS transfer again on 2.7.3. If as expected it fails again I will try FSAL = FULL_DEBUG; and READDIR = FULL_DEBUG;
  48. Apr 24 12:33:43 <_david_sohonet> Thanks
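
Note: grepping an 800MB ganesha.log for CRIT or MAJOR by hand is error-prone (a file named "MAJOR TOM" already matches a plain grep for MAJOR). A rough Python sketch of filtering on the severity field instead is below. The regular expression is inferred only from the two nfs3_lookup lines pasted at 10:49 above, so it is an assumption about the log layout and may need adjusting for other Ganesha versions or log settings; parse_line and filter_levels are illustrative names.

```python
import re
from typing import Iterable, Iterator, Optional

# Layout inferred from the sample lines in this chat:
# date time : epoch X : host : prog[thread] function :COMPONENT :LEVEL :message
LINE_RE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{4}) (?P<time>\d{2}:\d{2}:\d{2})"
    r" : epoch (?P<epoch>\S+)"
    r" : (?P<host>\S+)"
    r" : (?P<prog>[^\[\s]+)\[(?P<thread>[^\]]+)\]"
    r" (?P<func>\S+)"
    r" :(?P<component>[^:]+?)"
    r" :(?P<level>[^:]+?)"
    r" :(?P<msg>.*)$"
)


def parse_line(line: str) -> Optional[dict]:
    """Split one log line into named fields, or return None if it does not match."""
    m = LINE_RE.match(line.rstrip("\n"))
    return m.groupdict() if m else None


def filter_levels(lines: Iterable[str], levels=("CRIT", "MAJOR")) -> Iterator[dict]:
    """Yield parsed records whose severity field is one of the requested levels."""
    wanted = set(levels)
    for line in lines:
        rec = parse_line(line)
        if rec and rec["level"].strip() in wanted:
            yield rec


# One of the lines pasted in the chat, used here as a sample.
SAMPLE = (
    "22/04/2019 18:15:06 : epoch 5cb4a000 : SM-X10SRI4B-S-1-LDCF1-GB : "
    "ganesha.nfsd-6600[svc_1359] nfs3_lookup :NFS3 :DEBUG :REQUEST PROCESSING: "
    "Calling nfs_Lookup handle: File Handle V3: Len=28 "
    "4300004d164400000000300000008100000000c6f73b008b95ae5c00 name: MAJOR TOM"
)

rec = parse_line(SAMPLE)
print(rec["component"], rec["level"], rec["func"])
```

To scan a real file, pass an open file object to filter_levels, since it iterates line by line and does not load the whole log into memory. Because the match is on the severity field, the "MAJOR TOM" lookup lines are not reported as MAJOR messages.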