WayGroovy

admincraft chat regarding ovh raid issues

Oct 5th, 2013
  1. WayGroovy: So... the new core protect upgrade doesn't spam the console? So I have zero clues as to how long this update is taking?
  2. WayGroovy: Well, I can tell something is happening, because /plugins/CoreProtect/database.db is increasing in file size. Nothing is logging to the console though.
  3. WayGroovy: I don't like how it says, in the startup, "don't restart until this is done" but gives zero indication of how done it is.
  4. WayGroovy: My bad. There was an update. 4% complete.
  5. chiisana: lol, that's gonna take some time
  6. WayGroovy: 10 minutes per %, so approx 16+ hours.
  7. WayGroovy: With no restarts. Gosh I hope this doesn't fail over.
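The estimate above is plain proportional arithmetic. A minimal sketch, assuming the upgrade keeps advancing at a steady rate per percent (the function name `remaining_hours` is made up for illustration):

```python
def remaining_hours(percent_done, minutes_per_percent):
    """Hours left if every remaining percent takes the same time."""
    return (100 - percent_done) * minutes_per_percent / 60

# 4% done at roughly 10 minutes per percent, as reported in the chat
print(remaining_hours(4, 10))  # 16.0
```

The result only holds while the rate is constant; the later reports of 20+ minutes between messages would roughly double it.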
  8. chiisana: should be generally okay
  9. WayGroovy: 7% complete
  10. WayGroovy: what does 8820520k cached in TOP mean?
  11. WayGroovy: is that 8.8 GB of data cached to write?
  12. WayGroovy: 11%. Users on all of my servers reporting heavy lag, but not very descriptive. Server with transfer in progress shows 0-1% cpu and 20% of 3072 MB ram usage. Most users log off within seconds.
  13. WayGroovy: Users aren't just logging off. Looks like they may be getting kicked.
  14. WayGroovy: Not sure which plugin is doing the kicking, or if it is a bukkit/spigot thing.
  15. dzadnik: sounds like network issues
  16. WayGroovy: Traffic is low: http://i.imgur.com/5BOBN2K.png , looking deeper
  17. WayGroovy: vnstat doesn't show much
  18. WayGroovy: Speed test shows 6.84 Mbps down and 15.32 Mbps up
  19. WayGroovy: subsequent tests much faster, possibly due to caching somewhere
  20. WayGroovy: Any ideas on what I can test to see if this is a network issue, and not related to the coreprotect upgrade, which coincidentally started at the same time as the issue?
  21. WayGroovy: http://pastebin.com/wdqbvNav another server running on the machine, but not in sleep mode. CP2 update started at 1:29 system clock time.
  22. WayGroovy: time dd if=/dev/zero of=/tmp/test-hd bs=1M count=1000
  23.  
  24. gives out
  25.  
  26. 1000+0 records in
  27. 1000+0 records out
  28. 1048576000 bytes (1.0 GB) copied, 18.1464 seconds, 57.8 MB/s
  29.  
  30. real 0m20.200s
  31. user 0m0.000s
  32. sys 0m0.876s
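The dd summary line can be sanity-checked by recomputing the throughput from the byte count and the elapsed time. A small sketch, assuming dd's decimal megabytes (10^6 bytes); the helper name `dd_throughput_mb_s` is hypothetical:

```python
import re

def dd_throughput_mb_s(summary):
    """Recompute MB/s (decimal, 10**6 bytes) from a dd summary line."""
    match = re.search(r"(\d+) bytes .*copied, ([\d.]+) seconds", summary)
    if match is None:
        raise ValueError("not a dd summary line")
    total_bytes = int(match.group(1))
    seconds = float(match.group(2))
    return total_bytes / seconds / 1e6

line = "1048576000 bytes (1.0 GB) copied, 18.1464 seconds, 57.8 MB/s"
print(round(dd_throughput_mb_s(line), 1))
```

Without a sync option (for example `conv=fdatasync` on GNU dd versions that support it), part of the data may still sit in the page cache when dd reports, so the figure is best read as an optimistic number.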
  33. WayGroovy: Any ideas on what I can do? Right now I've got 4 minecraft servers apparently out of commission due to the coreprotect update in progress. I've tried 'renice +1 {pid}' to give the server running the update a lower priority, but that has had no apparent effect.
  34. WayGroovy: Does the coreprotect update occur in the primary minecraft process in linux?
  35. dzadnik: no
  36. WayGroovy: If I bump the process id in nice for the main java running the server, would that have any effect? right now it looks like everything in top is running at 0, except cpuset, khelper, netns, and kintegrityd, which are at -20
  37. dzadnik: doubt it
  38. chiisana: cache is file system cache
  39. chiisana: linux kernel looks at files you use regularly and caches them in ram for you
  40. chiisana: if your application requires ram, it will make those resources available automatically w/o lag
  41. chiisana: renice'ing things only matters when your cpu is full
  42. chiisana: if your cpu is not full, it will just use what it can
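To put the "8820520k cached" figure in perspective: it is page cache (file data kept in RAM), not data queued for writing. Data waiting to be written shows up separately as `Dirty` in `/proc/meminfo`. A rough sketch that reads both fields from meminfo-style text; the `Cached` value is the one quoted from top, while the `Dirty` value is a made-up placeholder:

```python
def meminfo_kib(text, field):
    """Return the value (in kB) of one field from /proc/meminfo-style text."""
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name.strip() == field:
            return int(rest.split()[0])
    raise KeyError(field)

# Cached is the figure from the chat; Dirty is a hypothetical placeholder.
SAMPLE = """\
Cached:        8820520 kB
Dirty:             312 kB
"""

cached_gib = meminfo_kib(SAMPLE, "Cached") / 1024 ** 2
print(f"page cache: {cached_gib:.2f} GiB")
print(f"waiting to be written: {meminfo_kib(SAMPLE, 'Dirty')} kB")
```

On a live machine the text would come from `open("/proc/meminfo").read()` instead of the sample string.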
  43. dzadnik: fill dat CPU up!
  44. chiisana: Groovy, how frequently are the updated percentage messages appearing for you?
  45. chiisana: These:
  46. Upgrading... X% complete.
  47. WayGroovy: minutes and minutes apart
  48. WayGroovy: about 20+?
  49. chiisana: just scanning through the code right now and trying to see how it works, does it report every percent?
  50. WayGroovy: yep,
  51. WayGroovy: I'm not exactly sure what the holdup is. I imagine it's drive access possibly.
  52. WayGroovy: I'm very curious how an upgrade on a VPS would affect the other VPS on the shared machine.
  53. chiisana: can you please do `iostat` without the quotes for me in your ssh as root?
  54. chiisana: you're running things in VM? what's the drive setup on the host looking like? RAID?
  55. WayGroovy: I'm not in VM, but I am running raid 1 i believe
  56. chiisana: what do you mean for the VPS part then?
  57. WayGroovy: I'm curious whether, if it had been in a VPS environment, the other VPSes would be affected
  58. chiisana: ah
  59. chiisana: it really depends... I'm not sure what is the hold up right now, too
  60. WayGroovy: Nor I
  61. chiisana: what's the iostat command output looking like?
  62. WayGroovy: http://pastebin.com/utyvBaN5
  63. WayGroovy: note: that is currently outside my realm of understanding. I read spaghetti better than that.
  64. chiisana: well, the two columns we care about for the time being are Blk_read/s and Blk_wrtn/s
  65. chiisana: blocks read per second and blocks written per second
  66. chiisana: what we are seeing are two drives, each with 3 partitions, making 2 raid partitions somehow
  67. WayGroovy: Interesting. I have no idea why that is that way.
  68. WayGroovy: Filesystem Size Used Avail Use% Mounted on
  69. /dev/md1 10G 3.7G 5.8G 40% /
  70. /dev/md2 1.8T 69G 1.7T 4% /home
  71. tmpfs 7.8G 24K 7.8G 1% /dev/shm
  72. chiisana: probably one partition for boot and another for actual content, the third not in use one would be swap
  73. chiisana: yep, looks about same
  74. WayGroovy: I've tried shutting down the other mc servers, it had no visible effect
  75. chiisana: no, it wouldn't
  76. chiisana: from the looks of it, I'm not sure if the delay is coming from resources
  77. chiisana: I stopped working on code for a bit before Dan started the converter, so I don't really know it that well
  78. chiisana: I'm seeing a few places where we could potentially increase performance, but I really don't think it would be this drastic
  79. chiisana: something else must be holding things up.... but I don't know what.
  80. chiisana: CPU is definitely not the bottleneck
  81. WayGroovy: I feel bad, because 2 of the mc servers on my machine aren't my servers. I've sublet them, one to a coworker, one to an internet friend.
  82. chiisana: so shutting down other servers won't help at all
  83. chiisana: are other servers affected, too?
  84. WayGroovy: yep
  85. chiisana: what are the problems on that side?
  86. chiisana: any error log messages?
  87. WayGroovy: similar, read timed out, disconnected due to flying
  88. WayGroovy: but all network health applications I've run seem... wonderful
  89. chiisana: well
  90. WayGroovy: I even ran a rsync last night to my home machine, no issue, similar to usual speeds.
  91. chiisana: my guess from looking at the code and everything I can see points at potential disk io blocking
  92. chiisana: but, the weird thing is this
  93. chiisana: your %iowait is only at 2.22, while %idle is at 85.69
  94. chiisana: that means your CPU is waiting for disk roughly 2.22% of the time, but not doing anything 85.69% of the time
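The same reading of the avg-cpu line can be done mechanically. A minimal sketch that pairs the column names with their values; the numbers are taken from the iostat output pasted later in this log (not the pastebin link, which is not reproduced here), and `parse_avg_cpu` is an illustrative name:

```python
def parse_avg_cpu(header, values):
    """Pair iostat's avg-cpu column names with their float values."""
    names = header.split()[1:]  # skip the leading "avg-cpu:" label
    return dict(zip(names, (float(v) for v in values.split())))

cpu = parse_avg_cpu(
    "avg-cpu: %user %nice %system %iowait %steal %idle",
    "10.82 0.02 1.25 2.32 0.00 85.60",
)
# Whatever is neither idle nor waiting on disk is time spent doing work.
busy = 100.0 - cpu["%idle"] - cpu["%iowait"]
print(f"busy {busy:.2f}%, waiting on disk {cpu['%iowait']:.2f}%, idle {cpu['%idle']:.2f}%")
```

A low %iowait next to a high %idle is what makes a plain CPU or disk saturation story hard to support from this output alone.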
  95. chiisana: it really could be doing more stuff and waiting more... but I don't know what's going on and why it isn't
  96. WayGroovy: Hmm. I'm curious, but I'm clueless as to what it could be. I'm mostly surprised at how it's doing it.
  97. WayGroovy: I have noticed that the coredata folder size is shrinking over time, as the coreprotect db file grows
  98. WayGroovy: I'm assuming that's natural with the conversion process
  99. chiisana: yes, the idea is we progressively shave the old data into the database
  100. chiisana: that's how we can pick up even if the server crashes/shuts down during upgrade
  101. chiisana: so we know where to continue
  102. WayGroovy: Ah. Well that part is nice.
  103. WayGroovy: subsequent iostats are similar. negligible variation
  104. chiisana: are you using MySQL or SQLite?
  105. WayGroovy: flatfile to sqlite
  106. chiisana: hm... I wonder if that's what's causing the delay...
  107. WayGroovy: I've got a meeting to go to, need to grab my coat.
  108. chiisana: alright
  109. chiisana: I'll see if I can do something about it, but it would mean interrupting the update and losing the changes that happened up to that point :/
  110. WayGroovy: Well, no players have been online for more than 2 minutes since starting,
  111. WayGroovy: I think cumulative online time in the past... lemme check mcstats
  112. WayGroovy: http://mcdigr.com/dashboard/public/europa.waygroovys.com
  113. chiisana: ouch, your TPS took a huge hit :/
  114. WayGroovy: yeah.
  115. WayGroovy: i hadn't thought to look there yet.
  116. WayGroovy: averaging around 14-15, with lows as low as 3
  117. chiisana: hm... the thing I thought was hammering database looks like it is cached already...
  118. WayGroovy: blocks broken or placed since starting per mcdigr is 0/0, so no problem restarting.
  119. chiisana: alright, I'll try to poke around and see if I can find anything
  120. WayGroovy: k. i'll be by in an hour or two. thanks for looking.
  121. chiisana: don't thank me yet... thank after I fix it :P have a good meeting :P
  122. WayGroovy: (it's outside. in the cold :(
  123. WayGroovy: postponed. 15-inf minutes.
  124. WayGroovy: 64% complete
  125. WayGroovy: oh shit... how is rsync going to handle this sqlite file?
  126. WayGroovy: Fuckit. I'll take the conversion offline and install prism.
  127. WayGroovy: 24 hours downtime, x 4 servers, (* 8 players), = approx 52736 hours of playtime down due to the update process.
  128. WayGroovy: Maths. Not even once. 192 hours.
  129. chiisana: I suspect rsync won't like sqlite db too much if it is being written into at the same time...
  130. chiisana: that's not a core protect thing but more of a filesystem thing...
  131. chiisana: as for going to prism... :x
  132. chiisana: I was busy making data to test upgrade thing :x
  133. chiisana: only managed to fill about 450mb of data so far :x
  134. WayGroovy: Understandable. I was getting pinged on far too many communication fronts
  135. WayGroovy: I suppose I could send you a historical version of my coredata
  136. chiisana: how big is your coredata folder?
  137. WayGroovy: 3 GB
  138. WayGroovy: close to 4
  139. chiisana: root@builddit:/home/minecraft/CoreProtect/plugins/CoreData# du -hs
  140. 448M .
  141. WayGroovy: it's much smaller now
  142. chiisana: that's my size so far... even with a brush I branded my data generation brush
  143. WayGroovy: lols
  144. chiisana: "//br sphere 5%TNT,5%redstone_torch,5%iron_block 5"
  145. WayGroovy: throw some sand into that for fun
  146. chiisana: flying around using WorldEdit to generate spheres of death, lol
  147. WayGroovy: pretty fun indeed.
  148. chiisana: seriously though... it lags me due to graphics lag :/
  149. WayGroovy: I wish I had a clue about php
  150. chiisana: O.o
  151. chiisana: actually... 600mb of data should be enough... that's ~1/6 of what you've got so I should see 3 hours conversion, assuming linear scaling?
  152. WayGroovy: assuming that. I suppose.
  153. chiisana: I'll just pull the latest from bukkitdev and see what happens...
  154. WayGroovy: dan replied to my post, says that he can do 12GB in a few hours, so my machine must be pathetic in comparison.
  155. chiisana: well, FWIW, I'm trying it on a 4 core 4GB ram VPS
  156. WayGroovy: Fair enough.
  157. chiisana: so whatever result I get is going to be probably acceptable as a baseline
  158. chiisana: I'm using spigot, but I don't think that'd make any difference in the conversion process?
  159. chiisana: ah... whatever, I'll just use bukkit RB
  160. chiisana: then it would be more definitive
  161. WayGroovy: I was using spigot as well, but sure, that'll give a good baseline
  162. chiisana: 18:20:16 [INFO] [CoreProtect] Upgrading... 1% complete.
  163. 18:20:20 [INFO] [CoreProtect] Upgrading... 2% complete.
  164. 18:20:25 [INFO] [CoreProtect] Upgrading... 3% complete.
  165. 18:20:28 [INFO] [CoreProtect] Upgrading... 4% complete.
  166. 18:20:31 [INFO] [CoreProtect] Upgrading... 5% complete.
  167. chiisana: root@builddit:/home/minecraft/CoreProtect.1.x/plugins/CoreData# du -hs
  168. 722M .
  169. WayGroovy: obv leaps and bounds faster.
  170. chiisana: root@builddit:/home/minecraft# iostat
  171. Linux 2.6.32-5-amd64 (builddit) 02/28/2013 _x86_64_ (4 CPU)
  172.  
  173. avg-cpu:   %user   %nice %system %iowait  %steal   %idle
  174.             0.19    0.00    0.03    0.00    0.00   99.77
  175. 
  176. Device:      tps  Blk_read/s  Blk_wrtn/s     Blk_read    Blk_wrtn
  177. vda         0.15        0.30        6.67       249036     5623184
  178. dm-0        0.84        0.29        6.67       245466     5623168
  179. dm-1        0.00        0.00        0.00         1032           0
  180. [2/28/2013 9:21:44 PM] chiisana: iostat also shows much less stuff
  181. [2/28/2013 9:22:24 PM] chiisana: so perhaps something else was doing huge amounts of read and write on your server
  182. [2/28/2013 9:22:57 PM] WayGroovy: avg-cpu: %user %nice %system %iowait %steal %idle
  183. 10.82 0.02 1.25 2.32 0.00 85.60
  184.  
  185. Device:      tps  Blk_read/s  Blk_wrtn/s     Blk_read    Blk_wrtn
  186. sda        61.63     4641.20      520.42  27225812091  3052838408
  187. sda1        1.12        0.31       18.14      1843355   106439928
  188. sda2       60.50     4640.87      502.25  27223906936  2946230952
  189. sda3        0.00        0.01        0.03        61160      167528
  190. sdb        61.59     4641.24      520.39  27226038547  3052670880
  191. sdb1        1.12        0.31       18.14      1807753   106439928
  192. sdb2       60.46     4640.93      502.25  27224229322  2946230952
  193. sdb3        0.00        0.00        0.00          832           0
  194. md2        62.35        9.91      495.12     58107194  2904414464
  195. md1         1.90        0.61       14.99      3572786    87940432
  196. WayGroovy: perhaps so,
  197. WayGroovy: that's my current conditions
  198. WayGroovy: with it not currently doing anything
  199. chiisana: are you still converting?
  200. chiisana: O.o
  201. WayGroovy: no, just running mc server
  202. WayGroovy: s
  203. WayGroovy: dafaq
  204. chiisana: how many online?
  205. WayGroovy: 4
  206. WayGroovy: oh, players?
  207. chiisana: dafaq?
  208. WayGroovy: my poor disks!
  209. WayGroovy: how to determine source....
  210. WayGroovy: kill processes randomly?
  211. chiisana: no, don't do that
  212. WayGroovy: :D
  213. chiisana: look for iotop in your package system
  214. WayGroovy: reboot machine?
  215. WayGroovy: iotop. k
  216. WayGroovy: blasted missing dependency
  217. WayGroovy: python-ctypes
  218. WayGroovy: nope, that's not it
  219. chiisana: the only thing I can think of is if the software raid is doing some kind of background scrubbing... but even then, it _really_ shouldn't be doing it to the point where it is affecting performance that much
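Whether a software RAID array is running a background check can usually be read straight from sysfs. A hedged sketch: the `sync_action` file is part of the Linux md sysfs interface, but whether this particular kernel exposes it is an assumption, so the demo below runs against a temporary fake tree rather than the real `/sys/block`:

```python
import tempfile
from pathlib import Path

def md_sync_action(array, sysfs_root="/sys/block"):
    """Return the md array's current sync action, e.g. 'idle' or 'check'."""
    path = Path(sysfs_root) / array / "md" / "sync_action"
    return path.read_text().strip()

# Demo against a fake sysfs tree so the example runs anywhere.
with tempfile.TemporaryDirectory() as fake_sysfs:
    node = Path(fake_sysfs) / "md2" / "md"
    node.mkdir(parents=True)
    (node / "sync_action").write_text("check\n")
    print(md_sync_action("md2", sysfs_root=fake_sysfs))
```

On the real machine, calling `md_sync_action("md2")` with the default root would report `check` while a scrub is in progress and `idle` otherwise.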
  220. chiisana: doesn't your package manager help you with dependencies?
  221. WayGroovy: Setting up Install Process
  222. Resolving Dependencies
  223. --> Running transaction check
  224. ---> Package iotop.noarch 0:0.4.3-4.el5 set to be updated
  225. --> Processing Dependency: kernel >= 2.6.18-199 for package: iotop
  226. --> Finished Dependency Resolution
  227. iotop-0.4.3-4.el5.noarch from base has depsolving problems
  228. --> Missing Dependency: kernel >= 2.6.18-199 is needed by package iotop-0.4.3-4.el5.noarch (base)
  229. Error: Missing Dependency: kernel >= 2.6.18-199 is needed by package iotop-0.4.3-4.el5.noarch (base)
  230. You could try using --skip-broken to work around the problem
  231. You could try running: package-cleanup --problems
  232. package-cleanup --dupes
  233. rpm -Va --nofiles --nodigest
  234. The program package-cleanup is found in the yum-utils package.
  235. -bash-3.2#
  236. chiisana: wtf, it depends on a kernel version?
  237. WayGroovy: \o/
  238. WayGroovy: \_o_/
  239. WayGroovy: No package kernel available.
  240. chiisana: I have no idea how to work with rpm :/
  241. chiisana: 18:28:22 [INFO] [CoreProtect] Upgrading... 100% complete.
  242. 18:28:22 [INFO] [CoreProtect] Logging new data. Please wait...
  243. 18:28:23 [INFO] --------------------
  244. 18:28:23 [INFO] [CoreProtect] Upgrade successfully completed.
  245. 18:28:23 [INFO] --------------------
  246. >
  247. WayGroovy: I know how to so little
  248. dzadnik: ugh
  249. dzadnik: coreprotect 2.1 to-do list is so long already
  250. dzadnik: going to take a break for a week or so first before beginning any work on it :P
  251. dzadnik: also, most of the things in the prism 1.4 release, CoreProtect can already do :P
  252. WayGroovy: No ideas on my issue, other than my hardware isn't up to snuff?
  253. dzadnik: and nope
  254. [Edited 9:34:20 PM] dzadnik: try using smartctl maybe
  255. dzadnik: see if your drive is failing
  256. chiisana: My best guess is something is eating through your disk IO... but I have no idea what
  257. dzadnik: shouldn't be minecraft
  258. dzadnik: it uses way less disk i/o than people think ;P
  259. WayGroovy: /dev/sda passed, checking /dev/sdb
  260. WayGroovy: sdb passed
  261. chiisana: rpm -q kernel ?
  262. WayGroovy: package kernel is not installed
  263.  
  264. da...shit
  265. chiisana: not necessarily wrong
  266. WayGroovy: yum list | grep kernel
  267. chiisana: I don't know RPM, and that's just a command I found online that might reveal what kernel version is installed according to rpm
  268. WayGroovy: 2.6.18-53.1.21.el5
  269. WayGroovy: kernel-headers.x86_64 2.6.18-53.1.21.el5 installed
  270. chiisana: so you're using 2.6.18-53, but iotop wants 2.6.18-199
  271. chiisana: have you updated your system recently?
  272. WayGroovy:
  273. No Packages marked for Update
  274. chiisana: hm... :3
  275. chiisana: wget http://guichaz.free.fr/iotop/files/iotop-0.5-1.noarch.rpm
  276. rpm -i iotop-0.5-1.noarch.rpm
  277. chiisana: I wonder if you can install the rpm on iotop's homepage
  278. chiisana: not sure about that second command... I vaguely recall -i for install, but if it is wrong, man rpm
  279. chiisana: supper time, afk for now
  280. chiisana: hopefully we can somehow get iotop in there so we can check out what's eating your disk write
  281. chiisana: your md1 and md2 aren't seeing the read, so the high read is probably background scrubbing for software raid
  282. chiisana: but one of your md# is seeing a lot of write still... so something is creating a lot of data and writing to disk a lot
  283. chiisana: just shy of 500blk write/sec, and assuming conventional ish 512kb blocks, that's almost 250mb/s
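The conversion above hinges on the block size. According to the iostat man page, on 2.4 and later kernels the Blk_* columns count 512-byte sectors, in which case roughly 495 blocks per second is closer to 250 KB/s than 250 MB/s. A small sketch of both readings (the function name is illustrative):

```python
SECTOR_BYTES = 512  # iostat block size on 2.4+ kernels, per its man page

def blocks_to_kib_per_s(blocks_per_second, block_bytes=SECTOR_BYTES):
    """Convert an iostat Blk_*/s rate to KiB/s for a given block size."""
    return blocks_per_second * block_bytes / 1024

md2_write_rate = 495.12  # Blk_wrtn/s for md2 in the iostat paste

print(f"512-byte blocks: {blocks_to_kib_per_s(md2_write_rate):.1f} KiB/s")
mib = blocks_to_kib_per_s(md2_write_rate, 512 * 1024) / 1024
print(f"512 KiB blocks:  {mib:.1f} MiB/s")
```

Under the 512-byte reading the write load on md2 is modest, which changes how alarming the number should look.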
  284. WayGroovy: food sounds good, i'm getting frustrated.
  285. chiisana: rpm from website didn't help either?
  286. WayGroovy: no, it's not
  287. chiisana: (worry)
  288. WayGroovy: need to update python, looking for that now, had an alarm, had to leave keyboard for 20 min or so
  289. chiisana: try this old package:
  290. chiisana: wget http://repo.or.cz/w/iotop.git/snapshot/416f4c9419a7b5ecf3f1b3dcf366217b072ae79e.zip
  291. chiisana: unzip 416<tab>
  292. chiisana: cd iotop/bin
  293. chiisana: ./iotop
  294. WayGroovy: No module named iotop.ui
  295. To run an uninstalled copy of iotop,
  296. launch iotop.py in the top directory
  297. chiisana: O.o
  298. chiisana: ok,
  299. cd ..
  300. ./iotop.py
  301. WayGroovy: no love
  302. chiisana: what's the error this time? O.o
  303. WayGroovy:
  304. -bash-3.2# ./iotop.py
  305. -bash: ./iotop.py: No such file or directory
  306. chiisana: are you in the iotop dir?
  307. WayGroovy: yes, but i also tried from /
  308. chiisana: should go to the folder named iotop after you extract
  309. chiisana: do a `ls` there for me please
  310. chiisana: lemme see what's included in that package for you
  311. WayGroovy: AHAHHAHA
  312. WayGroovy: there it be
  313. chiisana: :P
  314. WayGroovy: oh hell
  315. WayGroovy: so many columns, but mostly 0.00 B/s or %
  316. chiisana: anything under Disk Write?
  317. chiisana: (that's not 0
  318. WayGroovy: http://pastebin.com/HsYVWrED
  319. WayGroovy: so many 0s
  320. chiisana: wtf all 0
  321. chiisana: total disk write is virtually nothing as well
  322. chiisana: what does iostat say now?
  323. WayGroovy:
  324. avg-cpu:   %user   %nice %system %iowait  %steal   %idle
  325.            10.81    0.02    1.25    2.34    0.00   85.58
  326. 
  327. Device:      tps  Blk_read/s  Blk_wrtn/s     Blk_read    Blk_wrtn
  328. sda        61.64     4638.13      520.72  27227255203  3056759024
  329. sda1        1.12        0.31       18.16      1844891   106603088
  330. sda2       60.50     4637.81      502.53  27225348512  2949987888
  331. sda3        0.00        0.01        0.03        61160      168048
  332. sdb        61.60     4638.15      520.69  27227387355  3056590976
  333. sdb1        1.12        0.31       18.16      1809217   106603088
  334. sdb2       60.47     4637.85      502.53  27225576666  2949987888
  335. sdb3        0.00        0.00        0.00          832           0
  336. md2        62.43       10.37      495.40     60896058  2908157032
  337. md1         1.90        0.61       15.01      3575738    88088776
  338. WayGroovy: kkwnwbllooooow
  339. chiisana: ok, now I am officially confused....
  340. chiisana: iotop says nothing is writing, but iostat says you've got something writing
  341. chiisana: what do you get when you do:
  342. cat /proc/mdstat
  343. WayGroovy: md1 1.90 0.61 15.01 3575738 88088776
  344.  
  345. -bash-3.2# ./iotop.py
  346. Total DISK READ: 0.00 B/s | Total DISK WRITE: 30.76 K/s
  347. TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
  348. 28002 be/4 visualke 0.00 B/s 11.54 K/s 0.00 % 0.00 % java -Xincgc -Xmx60M -Xms60M -XX:MaxPermSize=80M -jar Minecraft_RKit.jar kev:Babies
  349. 19867 be/4 europa 0.00 B/s 0.00 B/s 0.00 % 0.00 % java -Dfile.encoding=utf-8 -Djline.terminal~t.jar nogui -d yyyy-MM-dd HH:mm:ss -nojline
  350. 11877 be/4 visualke 0.00 B/s 0.00 B/s 0.00 % 0.00 % java -Xms1536M -Xmx1536M -Djline.terminal=j~r /home/visualkev/cbs/craftbukkit.jar nogui
  351. 11878 be/4 visualke 0.00 B/s 0.00 B/s 0.00 % 0.00 % java -Xms1536M -Xmx1536M -Djline.terminal=j~r /home/visualkev/cbs/craftbukkit.jar nogui
  352. 22528 be/4 visualke 0.00 B/s 0.00 B/s 0.00 % 0.00 % java -Xincgc -Xmx60M -Xms60M -XX:MaxPermSize=80M -jar Minecraft_RKit.jar kev:Babies
  353. 1 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % init [3]
  354. WayGroovy: ugly, but I caught some non zeros
  355. WayGroovy: so iotop is functioning, somewhat
  356. chiisana: there's only md1 in your /proc/mdstat ?
  357. chiisana: what the heck is going on with your md2 then.... can please also check:
  358. mdadm --detail /dev/md2
  359. WayGroovy: http://pastebin.com/WjVPJksn
  360. chiisana: looks okay
  361. WayGroovy: -bash-3.2# who -b
  362. system boot 2012-12-23 07:55
  363. chiisana: that's 2 months of uptime, nice and solid :)
  364. chiisana: compared to my crappy 9 days :P
  365. 19:39:22 up 9 days, 19:19, 2 users, load average: 0.44, 0.10, 0.03
  366. WayGroovy: I'd hate to break that, but if I must,
  367. chiisana: well, I don't know what's going on
  368. WayGroovy: Me neither.
  369. chiisana: I don't think rebooting would solve anything....
  370. WayGroovy: me neither...
  371. chiisana: since, if you didn't start anything, then whatever is doing it is doing it automagically
  372. chiisana: and if you did start something that you need, you'll likely start it again anyways :P
  373. chiisana: can you do: iostat -x
  374. WayGroovy:
  375. avg-cpu:   %user   %nice %system %iowait  %steal   %idle
  376.            10.81    0.02    1.25    2.34    0.00   85.58
  377. 
  378. Device:    rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await    svctm    %util
  379. sda          0.42    39.49    36.04    25.59  4637.54   520.66    83.70     0.03    12.39     2.67    16.46
  380. sda1         0.01     1.16     0.01     1.11     0.31    18.16    16.50     0.10    88.27    55.36     6.20
  381. sda2         0.41    38.33    36.03    24.47  4637.21   502.47    84.96     0.66    10.98     2.35    14.21
  382. sda3         0.00     0.00     0.00     0.00     0.01     0.03    13.95     0.00    37.67    24.80     0.01
  383. sdb          0.43    39.50    36.02    25.57  4637.56   520.63    83.75     0.58     9.38     2.69    16.57
  384. sdb1         0.00     1.16     0.01     1.11     0.31    18.16    16.51     0.10    86.00    54.38     6.08
  385. sdb2         0.43    38.34    36.00    24.46  4637.25   502.47    85.01     0.48     7.96     2.37    14.31
  386. sdb3         0.00     0.00     0.00     0.00     0.00     0.00    48.94     0.00     8.47     8.00     0.00
  387. md2          0.00     0.00     0.49    61.92    10.37   495.35     8.10     0.00     0.00     0.00     0.00
  388. md1          0.00     0.00     0.02     1.88     0.61    15.01     8.21     0.00     0.00     0.00     0.00
  389. chiisana: the -x should give us a breakdown of how long the average wait times are for the individual drives
  390. WayGroovy: ugly, let me pastebin
  391. WayGroovy: http://pastebin.com/mQQy6zuF
  392. chiisana: I'm not familiar with all the details, but what I do notice is your sdX1, which makes up your md1, used for boot, and main OS, is pretty slow
  393. chiisana: not a whole lot is done on it, so it shouldn't matter, though
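Picking the slow partitions out of `iostat -x` output by eye is error-prone, so here is a rough sketch that ranks devices by their `await` column. The rows are copied from the paste above; the column list matches that header, and `parse_rows` is an illustrative helper rather than anything iostat provides:

```python
COLUMNS = ["rrqm/s", "wrqm/s", "r/s", "w/s", "rsec/s", "wsec/s",
           "avgrq-sz", "avgqu-sz", "await", "svctm", "%util"]

def parse_rows(text):
    """Map each device name to a dict of its iostat -x columns."""
    rows = {}
    for line in text.strip().splitlines():
        device, *values = line.split()
        rows[device] = dict(zip(COLUMNS, map(float, values)))
    return rows

SAMPLE = """
sda1 0.01 1.16 0.01 1.11 0.31 18.16 16.50 0.10 88.27 55.36 6.20
sda2 0.41 38.33 36.03 24.47 4637.21 502.47 84.96 0.66 10.98 2.35 14.21
sdb1 0.00 1.16 0.01 1.11 0.31 18.16 16.51 0.10 86.00 54.38 6.08
sdb2 0.43 38.34 36.00 24.46 4637.25 502.47 85.01 0.48 7.96 2.37 14.31
"""

rows = parse_rows(SAMPLE)
# Highest average wait (ms per request) first.
slowest = sorted(rows, key=lambda dev: rows[dev]["await"], reverse=True)
for dev in slowest:
    print(dev, rows[dev]["await"])
```

The two sdX1 partitions come out on top, in line with the observation that md1's members are the slow ones while handling very little traffic.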
  394. WayGroovy: iotop/setup.py install worked
  395. chiisana: your sdX2, which makes up your md2, home directory, most probably where your minecraft server + worlds etc. are, is fairly fast... but has a lot of read and write happening on the sdX2 level, but not on md2 level
  396. chiisana: well, a lot of read, not happening on md2 level
  397. chiisana: but the writes are happening on md2 level
  398. chiisana: my best guess tells me your raid is doing background scrubbing with the read process, which is a very good thing to do... background scrubbing is when your drive is supposedly not doing anything, it will try to read every single byte from your drive to make sure there are no bad sectors etc.
  399. chiisana: so when a drive fail happens, you don't risk losing data because other drive(s) in the raid array is bad
  400. chiisana: however, this doesn't explain the writes we're seeing in iostat (but not in iotop)
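One possible explanation for the mismatch, offered as a hedged aside rather than a diagnosis: when iostat is run without an interval, its report covers the time since boot, while iotop samples live activity. If that is what happened here, dividing a cumulative counter by its per-second rate should give back roughly the machine's uptime. A quick check with the md2 figures from the paste:

```python
def implied_uptime_days(total_blocks, blocks_per_second):
    """Days over which a cumulative counter averages out to the given rate."""
    return total_blocks / blocks_per_second / 86400

# md2 row from the iostat paste: Blk_wrtn/s = 495.12, Blk_wrtn = 2904414464
days = implied_uptime_days(2904414464, 495.12)
print(f"{days:.1f} days")
```

The result is a little over two months, which would also fit the near-identical numbers across repeated runs. Running iostat with an interval and count (for example `iostat 5 3`) and reading the later reports would show current rates instead.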
  401. WayGroovy: hmm.
  402. chiisana: I'm afraid I'd have to admit defeat here as well :x
  403. WayGroovy: I will cry to my pillow tonight.
  404. chiisana: I'll do more research on this... but in the mean time, your guess as to what's going on is probably as good as mine...
  405. WayGroovy: 198.27.64.58/~europa/404.swf
  406. chiisana: lol
  407. WayGroovy: 1261 to a partition!
  408. md: md2: data-check done.
  409. grsec: From 66.249.86.225: signal 11 sent to /opt/avg/av/bin/avgscand[avgscand:28373] uid/euid:0/0 gid/egid:0/0, parent /opt/avg/av/bin/avgtcpd[avgtcpd:28224] uid/euid:0/0 gid/egid:0/0 by /opt/avg/av/bin/avgscand[avgscand:28381] uid/euid:0/0 gid/egid:0/0, parent /opt/avg/av/bin/avgtcpd[avgtcpd:28224] uid/euid:0/0 gid/egid:0/0
  410. grsec: From 66.249.86.225: denied hardlink of /home/europa/Super_Spice_Bros_64.swf (owned by 0.0) to /home/europa/public_html/Super for /usr/libexec/openssh/sftp-server[sftp-server:25761] uid/euid:500/500 gid/egid:500/500, parent /usr/sbin/sshd[sshd:25760] uid/euid:500/500 gid/egid:500/500
  411. The scan_unevictable_pages sysctl/node-interface has been disabled for lack of a legitimate use case. If you have one, please send an email to linux-mm@kvack.org.
  412. WayGroovy: other 5 lines
  413. chiisana: ?
  414. WayGroovy: No idea
  415. WayGroovy: http://serverfault.com/questions/236399/how-can-i-find-which-process-is-causing-this-io-read-tried-iotop-already trying to follow along with this
  416. chiisana: I just remembered these two threads:
  417. chiisana: http://ww.reddit.com/r/admincraft/comments/16f99f/trying_to_get_to_the_root_of_my_performance_issues/
  418. http://ww.reddit.com/r/admincraft/comments/1451ep/help_odd_tick_rate_no_lag_ramdisk/
  419.  
  420. Seems like OVH have had a history with RAID being weird
  421. WayGroovy: very interesting. I've always had no issue, even with upwards of 40 people on the main server,
  422. WayGroovy: BUT, you don't typically go investigating io r/w when there isn't an issue.
  423. chiisana: and typically you don't really have a problem on your server... only during excess disk io such as upgrading core protect do you start to notice the problem...
  424. WayGroovy: bash-3.2# hdparm -Tt /dev/sda
  425.  
  426. /dev/sda:
  427. Timing cached reads: 45820 MB in 2.00 seconds = 22934.12 MB/sec
  428. Timing buffered disk reads: 326 MB in 3.01 seconds = 108.42 MB/sec
  429. chiisana: root@builddit:/home/minecraft# hdparm -Tt /dev/vda
  430.  
  431. /dev/vda:
  432. Timing cached reads: 11938 MB in 2.00 seconds = 5975.69 MB/sec
  433. Timing buffered disk reads: 748 MB in 3.01 seconds = 248.82 MB/sec
  434. chiisana: looks like cached read is insanely awesome, but buffered disk read is not as good
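The two hdparm runs can be compared directly by pulling out the reported buffered-read rate. A minimal sketch using the two result lines pasted above; the helper name `buffered_read_mb_s` is made up for illustration:

```python
import re

def buffered_read_mb_s(output):
    """Extract the reported buffered disk read rate (MB/sec) from hdparm -t output."""
    match = re.search(r"Timing buffered disk reads:.*=\s*([\d.]+) MB/sec", output)
    if match is None:
        raise ValueError("no buffered read line found")
    return float(match.group(1))

sda = buffered_read_mb_s(
    "Timing buffered disk reads: 326 MB in 3.01 seconds = 108.42 MB/sec")
vda = buffered_read_mb_s(
    "Timing buffered disk reads: 748 MB in 3.01 seconds = 248.82 MB/sec")
ratio = vda / sda
print(f"vda reads about {ratio:.1f}x faster than sda")
```

A single three-second run is noisy, so repeating the measurement a few times on an otherwise quiet machine gives a fairer comparison.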
  435. WayGroovy: bash-3.2# mount
  436. /dev/md1 on / type ext3 (rw,errors=remount-ro)
  437. proc on /proc type proc (rw)
  438. sysfs on /sys type sysfs (rw)
  439. devpts on /dev/pts type devpts (rw)
  440. /dev/md2 on /home type ext3 (rw)
  441. tmpfs on /dev/shm type tmpfs (rw)
  442. none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
  443. chiisana: that looks about right
  444. WayGroovy: frizzing googling and not comprehending 100% but befuddling through
  445. WayGroovy: http://browser.primatelabs.com/geekbench2/view/1705220
  446. WayGroovy: shit. none of that info is disk io
  447. chiisana: lol
  448. WayGroovy: oh god, how did i wind up http://www.bringinthecats.com/
  449. WayGroovy: warning
  450. WayGroovy: seizure warning
  451. chiisana: http://browser.primatelabs.com/geekbench2/view/1705243 <-- your server rapes my VM
  452. WayGroovy: Damn
  453. WayGroovy: daaa FUCK
  454. chiisana: so it really is just that disk io that's holding you back
  455. WayGroovy: running iozone now
  456. WayGroovy: no idea
  457. WayGroovy: http://www.thegeekstuff.com/2011/05/iozone-examples/
  458. dzadnik: lol
  459. dzadnik: CoreProtect 2 versus CoreProtect 1
  460. dzadnik: http://i.imgur.com/6MmJA2q.png
  461. WayGroovy: lols
  462. WayGroovy: Quite impressive.
  463. WayGroovy: your own data?
  464. dzadnik: no
  465. chiisana: I blame dan and his weird anonymous thread thing and random sleeping :P
  466. dzadnik: converting 10.7GB of data took me 3 hours and 31 minutes
  467. WayGroovy: >:{ flip all the fucking tables
  468. dzadnik: :P
  469. dzadnik: also
  470. dzadnik: ugh
  471. dzadnik: was so hopeful about using http://coreix.net/ as my new UK DC
  472. dzadnik: but after dealing with their support team
  473. dzadnik: well
  474. dzadnik: their support team isn't as nice as their website design, I'll just say.
  475. chiisana: ovh/hetzner :P
  476. chiisana: except, groovy's having raid problems with ovh as with a few others on /r/admincraft...
  477. WayGroovy: I'm not entirely sure that my numbers indicate a problem exactly...
  478. WayGroovy: i don't know enough.. damn.
  479. chiisana: well, there is a.... notable amount of slow down when it comes to large amount of disk read + write...
  480. chiisana: and tests for other parts show your server is pretty awesome...
  481. WayGroovy: https://docs.google.com/spreadsheet/ccc?key=0AsUG-_IEV-_OdEVoWEd3SVh2TlRMRm5LdnFGWVdhUmc&usp=sharing
  482. WayGroovy: fucking numbers, what do they mean
  483. chiisana: Command line used: iozone -a -b iozoneout.xls
  484. Output is in Kbytes/sec
  485. WayGroovy: so, low numbers would be bad?
  486. chiisana: yes, I guess
  487. WayGroovy: I need some basis of comparison
  488. WayGroovy: http://blog.serverfault.com/2010/07/06/777852755/
  489. chiisana: running it on my test VPS already
  490. WayGroovy: https://code.google.com/p/iozone-results-comparator/wiki/Overview
  491. chiisana: https://docs.google.com/a/terrandin.com/spreadsheet/ccc?key=0AnVSLrpE2qnedDVZSE02VmZ1T1ZfdDB5a2I0T2ExRHc&usp=sharing
  492. chiisana: yours looks better tbh
  493. WayGroovy: shared to any with link?
  494. WayGroovy: or just waygroovy@?
  495. chiisana: should be any, no idea :D
  496. chiisana: @gmial?
  497. WayGroovy: hm. won't open for me, says i need to req access
  498. chiisana: mail
  499. chiisana: try again?
  500. WayGroovy: I'm in (uuuuunnnnhh)
  501. WayGroovy: I think the takeaway from all of this is I don't have a freaking clue, but I know how to google and run various diagnostics, but don't know how to properly interpret them.
  502. chiisana: well, it is telling me your server should theoretically perform way better than mine
  503. chiisana: which makes sense, i5 (2011+) vs E5450 (2009)
  504. chiisana: DDR3 vs DDR2
  505. WayGroovy: sure, but... why coreprotect, why did you take that arrow in the knee?
  506. chiisana: that is a mystery in life :/
  507. WayGroovy: could it have something to do with the number of open files?
  508. WayGroovy: cat /proc/sys/fs/file-max > 1613010
  509. chiisana: root@builddit:~# cat /proc/sys/fs/file-max
  510. 405036
  511. WayGroovy: various rage noises
  512. dzadnik: aroooo
  513. chiisana: so, um...
  514. dzadnik: slurp
  515. WayGroovy: slobber