Flearidden_Rat

Rickjb's ErrorSCC_WUs_200504.txt

May 6th, 2020
  1. Dear WCG Techs
  2. When I scanned my crunching "farm" earlier today (6 May 2020 AEST) with BoincTasks, I noticed an anomaly
  3. with an SCC WU that was about to finish.
  4. After it finished I looked up its result on WCG.
  5. It was deemed Valid but had claimed far more CPU time than it had actually run.
  6. The Valid runtimes for SCC under Linux are currently quite variable, ranging from about 30m to more than 2h,
  7. but not 5h or more.
  8.  
  9. I went through all results for SCC on this device that are currently online and have recorded the names of all
  10. SCC WUs with anomalous claimed runtimes. The list appears near the bottom of this file.
  11.  
  12. I also looked through Valid SCC results for all devices, and only this one device has given anomalous results.
  13.  
  14. Here are details of the WU I noticed while it was running on BoincTasks:
  15. ------------------------------------------------------------------------
  16. Device Name: Callisto-VM
  17.  
  18. Result Name: SCC1_ 0003840_ FoxO1-A_ 89973_ 0--
  19. OS type / OS version: Linux / 3.2.0-4-amd64; App Version Number: 708; Status: Valid
  20. Sent Time: 5/5/20 21:22:16; Time Due / Return Time: 5/6/20 03:31:06
  21. CPU Time / Elapsed Time (hours): 6.15 / 6.15
  22. Claimed / Granted BOINC Credit: 168.6 / 168.6
  23. ---------
  24. Log file:
  25. <core_client_version>7.0.27</core_client_version>
  26. <![CDATA[
  27. <stderr_txt>
  28. INFO: result number = 0
  29. INFO: No state to restore. Start from the beginning.
  30. [13:10:35] Number of tasks = 1
  31. [13:10:35] Running task 0,CPU time at start of task 0 was 0.000000
  32. [13:10:35] ./cmpd-1189973.pdbqt size = 20 3 ../../projects/www.worldcommunitygrid.org/scc1.FoxO1-A.pdbqt size = 468 0
  33. [13:41:54] Finished task #0 cpu time used 18446745959.383400
  34. 13:41:54 (13029): called boinc_finish(0)
  35.  
  36. </stderr_txt>
  37. ]]>
  38. Thus it ran for only 31 min, but claimed 6.15h !!
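The comparison above can be sketched as a small script. This is only an illustration, not a WCG or BOINC tool: the function name `check_log`, the regular expressions, and the 5% slack are assumptions made for this example, and it handles only the two stderr line formats quoted in this file. It also reports how far the logged CPU time lies above 2^64 ns expressed in seconds (about 18446744073.71 s). That last figure is an arithmetic observation only and does not establish the cause.

```python
import re
from datetime import datetime, timedelta

# The two stderr lines from the log above that carry the timing information.
ANOMALOUS_LOG = """\
[13:10:35] Running task 0,CPU time at start of task 0 was 0.000000
[13:41:54] Finished task #0 cpu time used 18446745959.383400
"""

# Patterns are assumptions based on the log excerpts quoted in this file.
START_RE = re.compile(r"\[(\d\d:\d\d:\d\d)\] Running task \d+")
END_RE = re.compile(r"\[(\d\d:\d\d:\d\d)\] Finished task #\d+ cpu time used ([\d.]+)")

TWO_POW_64_NS_IN_S = 2**64 / 1e9  # about 18446744073.71 seconds


def check_log(log, slack=0.05):
    """Compare logged CPU time with the wall-clock span of the log timestamps.

    `slack` (hypothetical default of 5%) tolerates small clock differences,
    e.g. a typical result here logs slightly more CPU time than wall time.
    """
    start = START_RE.search(log)
    end = END_RE.search(log)
    if start is None or end is None:
        raise ValueError("log does not contain both start and finish lines")

    t0 = datetime.strptime(start.group(1), "%H:%M:%S")
    t1 = datetime.strptime(end.group(1), "%H:%M:%S")
    if t1 < t0:  # task ran past midnight
        t1 += timedelta(days=1)

    wall_s = (t1 - t0).total_seconds()
    cpu_s = float(end.group(2))
    return {
        "wall_s": wall_s,
        "cpu_s": cpu_s,
        # Single-threaded CPU time should not exceed wall time by much.
        "suspect": cpu_s > wall_s * (1.0 + slack),
        # Distance of the logged CPU time above 2^64 ns, in seconds.
        "excess_over_2_64_ns_s": cpu_s - TWO_POW_64_NS_IN_S,
    }


if __name__ == "__main__":
    print(check_log(ANOMALOUS_LOG))
```

For the WU above, the timestamps span 1879 s (about 31 min), and the logged CPU time exceeds 2^64 ns by about 1885.67 s, which is close to that wall-clock figure.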
  39. ---------------------------------------------------
  40.  
  41. More typical is:
  42. Name -- CPU Time / Elapsed Time
  43. SCC1_ 0003855_ FoxO1-A_ 9691_ 0-- 0.76 / 0.76
  44. Log file:
  45. <core_client_version>7.0.27</core_client_version>
  46. <![CDATA[
  47. <stderr_txt>
  48. INFO: result number = 0
  49. INFO: No state to restore. Start from the beginning.
  50. [12:04:38] Number of tasks = 1
  51. [12:04:38] Running task 0,CPU time at start of task 0 was 0.000000
  52. [12:04:38] ./cmpd-2609691.pdbqt size = 26 5 ../../projects/www.worldcommunitygrid.org/scc1.FoxO1-A.pdbqt size = 468 0
  53. [12:50:09] Finished task #0 cpu time used 2753.408077
  54. 12:50:09 (12139): called boinc_finish(0)
  55.  
  56. </stderr_txt>
  57. ]]>
  58.  
  59. --------------------------------------------------
  60. Here are the names of all affected WUs that are still up in my WCG account Results section:
  61. SCC1_ 0003840_ FoxO1-A_ 89973_ 0-- (the one above)
  62. SCC1_ 0003849_ FoxO1-A_ 92170_ 0--
  63. SCC1_ 0003917_ FoxO1-B_ 35883_ 0--
  64. SCC1_ 0003861_ FoxO1-A_ 6450_ 0--
  65. SCC1_ 0003909_ FoxO1-B_ 76386_ 0--
  66. SCC1_ 0003861_ FoxO1-A_ 88142_ 0--
  67. SCC1_ 0003861_ FoxO1-A_ 88142_ 0-- (claimed 16.25h for 38m work)
  68. SCC1_ 0003921_ FoxO1-B_ 93377_ 1-- (still PV, will check wingman's when she's done)
  69. SCC1_ 0003917_ FoxO1-B_ 35883_ 0--
  70. SCC1_ 0003921_ FoxO1-B_ 93377_ 1--
  71. SCC1_ 0003918_ FoxO1-B_ 45681_ 0--
  72. SCC1_ 0003921_ FoxO1-B_ 11396_ 0--
  73. SCC1_ 0003921_ FoxO1-B_ 26482_ 0--
  74. SCC1_ 0003866_ FoxO1-A_ 71351_ 0--
  75. SCC1_ 0003918_ FoxO1-B_ 75346_ 0-- (return time 5/4/20 10:43:17)
  76.  
  77. --------------------------------------------------
  78. Probable explanation:
  79. One of my devices (i7-3770K, Debian Linux 7.x) crashed, dead, about 1 week ago.
  80. One of its 2 RAM sticks had died.
  81. I've temporarily replaced it with 2 dissimilar sticks, both in Channel B (so single-channel operation instead of dual).
  82. I stress tested the machine before putting it back online, but I seem to have missed its flaw(s).
  83. The machine has not crashed with the mixed-RAM setup.
  84. I will replace both RAM sticks shortly.
  85. --------------------------------------------------
  86.  
  87. Sincere apologies, but this may have highlighted a weakness in the verification system: results claiming impossible CPU times were still deemed Valid.
  88.  
  89. It would be interesting to compare my results to those of the resends, if those are issued.
  90. - Rickjb
  91.  
  92. --------------------------------------------------