- Okay, welcome everyone to the July 25th standup. Let's see what I have on my list today. One of the things I wanted to cover was block production times. We've been seeing some issues with missed blocks recently on mainnet on some of our Lido nodes. Tuyen has been looking into some of this. He has a proposal around when we poll for proposer duties ahead of the slot; I think that's 5409. I don't know if that has been reviewed yet, but I'll let Tuyen summarize the PRs he has for tackling some of this, and then we can go from there.
- So I analyzed the issue of missed proposals at the first slot of the epoch. There are two issues. First is the delay on the validator client side, because it uses a zero-epoch lookahead, so we only poll the proposer duties at the start of the epoch, and due to the I/O lag there is maybe a two-second delay. So I have one PR for that, to poll the proposer duties before the next epoch starts, because we already have the prepare-slot scheduler, which runs the epoch transition for us, and that usually takes less than three seconds. So I did the PR to poll the proposer duties one second in advance. If the epoch transition has not finished, it can wait for a few seconds and then return, so that we can save some time on the VC side, on the validator client.
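A minimal sketch of the approach described above: poll proposer duties shortly before the next epoch starts, and give the epoch transition a brief grace period if it has not completed yet. All names and constants here are illustrative assumptions, not Lodestar's actual validator client code.

```ts
// Hypothetical sketch, not Lodestar's actual code: poll proposer duties one
// second before the next epoch, waiting briefly if the epoch transition is not done.
const SECONDS_PER_SLOT = 12;
const SLOTS_PER_EPOCH = 32;
const LOOKAHEAD_MS = 1_000; // poll one second before the epoch boundary
const MAX_WAIT_MS = 3_000; // the epoch transition usually finishes in under three seconds

type ProposerDuty = {slot: number; validatorIndex: number};

// Stub for whatever signals that the beacon node finished the epoch transition.
async function waitForEpochTransition(_epoch: number, timeoutMs: number): Promise<void> {
  await new Promise((resolve) => setTimeout(resolve, Math.min(timeoutMs, 100)));
}

// Stub for the beacon-api call that returns proposer duties for an epoch.
async function getProposerDuties(_epoch: number): Promise<ProposerDuty[]> {
  return [];
}

export function scheduleDutiesPoll(currentEpoch: number, genesisTimeSec: number): void {
  const nextEpoch = currentEpoch + 1;
  const nextEpochStartMs = (genesisTimeSec + nextEpoch * SLOTS_PER_EPOCH * SECONDS_PER_SLOT) * 1000;
  const delayMs = Math.max(0, nextEpochStartMs - LOOKAHEAD_MS - Date.now());

  setTimeout(async () => {
    // Give the epoch transition a short grace period instead of failing outright,
    // then fetch duties for the upcoming epoch a little ahead of its first slot.
    await waitForEpochTransition(nextEpoch, MAX_WAIT_MS);
    const duties = await getProposerDuties(nextEpoch);
    console.log(`fetched ${duties.length} proposer duties for epoch ${nextEpoch}`);
  }, delayMs);
}
```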
Other than that, on the beacon node side, I see we have some delay in producing the phase0 beacon block body. I did not look into that yet.
- Great, thanks, Tuyen. Sorry, that was my mistake: he was describing 5794, which I posted in there.
- Did we figure out why the execution layer client took about 14 seconds to produce the block?
- I don't have an answer from that investigation yet. I think that is still open. Yeah, Gajinder did question that in our missed block proposal thread: for the missed slot 6940832, there was a huge delay in getting an execution block. I don't know if anybody has looked into that though. Okay.
So there's that; that's where the investigations currently stand. There is, of course, the fact that we are having a long epoch transition time, which was the 5409 that I had originally quoted. That is a blocker. I don't know if much investigation came out of it. I know Tuyen reopened it because of this block production delay issue that we're having, but we'll need to investigate it a little further. Unless there's anybody who wants to add to what's potentially happening here, I think we do know the cause, but we don't really have a plan yet to tackle it outside of the PRs Tuyen is working on.
- I see that sometimes we process a block in more than one second, which is huge.
- I mean, we should also look at why the builder response came so late, because the builder could have also saved us. But the builder response was very late.
- Okay.
[AUDIO OUT]
OK.
It sounds like there are a bunch of different pieces being tackled here, and I don't know that all of them really need to be fixed before we think about cutting a release. The epoch transition time seems like something we would like to fix, but it's also something that has already existed in our...
- Yeah, I should not say it's a blocker, to be honest. I used the wrong word. It's just a nice-to-have: in my PR, when the epoch transition did not finish, we can also wait for a while. So we can work on that later. But an epoch transition of more than three seconds is bad.
- Okay. Any other points that anybody wants to add to this? If not, we'll move forward with some other follow-up issues.
- I was curious about the status of the network thread, and whether anyone has the latest research on it. Thinking about where we're at for our next release, it seems like we're not ready; clearly we're still trying to get a few things in. But do we think that would land in this release, and what are the blockers that are still open for it?
- Yeah, I did message Tuyen about this yesterday. Maybe I'll give Matt a chance to also chime in on this, but one of the big remaining issues is the event loop lag. I don't know if there's an update from Matt about that, or whether any work was done on it.
- Yes, I put up a PR last night, and it's just about done getting merged, to add metrics for the last thing that Ben wanted, which was the message queue between the worker and the main thread. That is a PR now, and it's what I'm going to deploy to feature1 to try to pull some metrics. I'm also going to turn back on the extra new space in that same PR and see if that affects things. It didn't make a huge amount of difference; it did actually reduce a lot of the lower-level metrics, but it didn't seem to affect the loop much. So the last thing I needed to give Ben was the worker queue, the message queue between the worker and the main thread, and from there we were going to start the discussion again. I'll also fold a couple of the things I'm going to ask about in my update into what I send back to him.
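As a rough illustration of the metric being added, one common way to measure worker-to-main-thread message queue latency is to timestamp each message when it is posted and record the delta when it is handled. This is a generic node:worker_threads sketch with a stand-in for a real histogram, not the actual Lodestar metrics code.

```ts
// Sketch only: measure how long messages sit between the network worker
// posting them and the main thread handling them.
import {Worker, isMainThread, parentPort} from "node:worker_threads";

type TimedMessage = {postedAtMs: number; payload: unknown};

if (isMainThread) {
  const latencies: number[] = []; // stand-in for a Prometheus histogram

  const worker = new Worker(__filename);
  worker.on("message", (msg: TimedMessage) => {
    // Queue time plus structured-clone overhead for this message.
    latencies.push(Date.now() - msg.postedAtMs);
  });

  setInterval(() => {
    if (latencies.length > 0) {
      const avg = latencies.reduce((a, b) => a + b, 0) / latencies.length;
      console.log(`worker->main message latency avg=${avg.toFixed(1)}ms n=${latencies.length}`);
      latencies.length = 0;
    }
  }, 5_000);
} else {
  // Worker side: tag every message with the time it was posted.
  setInterval(() => {
    const msg: TimedMessage = {postedAtMs: Date.now(), payload: "gossip-batch"};
    parentPort?.postMessage(msg);
  }, 100);
}
```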
- Cool. Also, we merged in 5747, which verifies signature sets with the same message for attestations. Tuyen said that gave us more stable peering when the network thread was enabled, so we do have that now. I thought that 5729, batch-verifying attestation gossip messages, also had to be merged for that.
- Yeah, so the new BLS APIs have no consumer at this moment. It's still a work in progress to implement the indexing, the CPU work, and the validation logic in batch, which I should work on next week.
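To make the batching idea concrete, here is a hedged sketch of the general pattern: buffer incoming gossip attestation signature sets briefly and verify them in one batch call, falling back to per-set verification if the batch fails. The SignatureSet type and the verify functions are placeholders, not the real BLS API in use.

```ts
// Illustrative pattern only; the BLS verification functions below are placeholder stubs.
type SignatureSet = {publicKey: Uint8Array; message: Uint8Array; signature: Uint8Array};

function verifySignatureSet(_set: SignatureSet): boolean {
  return true; // placeholder for single-set verification
}
function verifySignatureSetsBatch(_sets: SignatureSet[]): boolean {
  return true; // placeholder for aggregate batch verification
}

const MAX_BATCH = 32;
const MAX_WAIT_MS = 50;

const pending: {set: SignatureSet; resolve: (ok: boolean) => void}[] = [];
let timer: NodeJS.Timeout | null = null;

export function verifyAttestationGossip(set: SignatureSet): Promise<boolean> {
  return new Promise((resolve) => {
    pending.push({set, resolve});
    if (pending.length >= MAX_BATCH) flush();
    else timer ??= setTimeout(flush, MAX_WAIT_MS);
  });
}

function flush(): void {
  if (timer) clearTimeout(timer);
  timer = null;
  const batch = pending.splice(0, pending.length);
  if (batch.length === 0) return;

  // One aggregate verification for the whole batch; if it fails, re-check each
  // set individually so a single invalid attestation cannot poison the rest.
  const allValid = verifySignatureSetsBatch(batch.map((b) => b.set));
  for (const b of batch) {
    b.resolve(allValid ? true : verifySignatureSet(b.set));
  }
}
```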
- Okay. And then there were some recent things that were also merged in to hopefully reduce the I/O traffic related to the network thread. That was the two subnets per node change. And I think Tuyen is also working on not subscribing to short-lived subnets too early, so that one is also up for inclusion, and then hopefully that leads to a much more stable network thread.
- Okay, I guess I was just thinking about release planning and what our strategy is for all of it. It seems like we don't have any major pending, urgent timeline where we need to cut a release right now.
- So yeah, I was actually thinking of maybe doing a hotfix. There were some PRs merged here related to helping potentially alleviate some of the sync committee issues we were having. I don't know if that's something you guys want to do either, but sorry, I don't want to get off topic with that.
- I think it's definitely on topic. I guess the question is: is our current unstable stable enough to attempt a release candidate, or just a scaled-down 1.10 with whatever we have here on unstable? Or is it unstable enough that we need to actually do the hotfix if we want to release those features?
- I don't see any blocker for 1.10. Otherwise, please update me.
- Do we want to update to Node 20 for that?
- Right. We probably don't want to; we want to stay on 18. But I think there was a fix for the issue I was looking into with Matthew. I think they merged it today, right?
- Yeah, this morning. But it still has to be merged through, was it cross-fetch? I'm not really able to just update the dependency resolution.
- Okay. So I guess we could keep 20 if we just fix the issue, or we can downgrade. I think ideally we would upgrade, but we'd want to test it though.
- Because in theory it should work. I mean, it should work, but you don't have a day or two of metrics on it, just to make sure it's stable.
- Right, right. Yeah, we would go through our release candidate testing process.
- It is ready for that, though.
- Yeah, if one of you wants to PR that resolution, or we could wait a few days for cross-fetch to bump the...
- That was going to be in my updates: I was going to message them today, because I was going to send them that info, and the stuff that Nazar had found as well. But if you want, Nazar, if you think it's a good idea, I'm down. We could just update the intermediate dependency, the lower dependency, and then do cross-fetch later, just so it's not a blocker.
- Yeah, I mean, I could quickly test the issue; it's pretty easy to reproduce, so I can see if it's actually fixed now. I fixed it in my PR, in the React application that I'm working on, by downgrading the lower dependency of cross-fetch, that is, the lower dependency of our cross-fetch package, to 3.1.5.
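If the team goes with pinning that transitive dependency, the usual mechanism is a resolution (Yarn) or override (npm) in the workspace root package.json. The snippet below is only a hedged illustration of that mechanism, with both fields shown side by side; a project would keep whichever matches its package manager, and the exact package path and version would need to match what Nazar verified (3.1.5 is the version mentioned above).

```json
{
  "resolutions": {
    "cross-fetch": "3.1.5"
  },
  "overrides": {
    "cross-fetch": "3.1.5"
  }
}
```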
- Okay. Yeah, if that is a safe and logical thing to do, I would rather take an action where we can control things ourselves rather than depend on other people to bump stuff. So if we do want to do it that way, let's get that in, and then we'll stay on Node 20, and then we'll try for an RC release right after, if that works for everyone. Because we have, I think it's close to 90 commits between stable and unstable right now, so it's quite a few changes, and ideally I'd like to push out a 1.10 release to get us all caught up, if it's stable enough and there's nothing blocking it right now. Cool. Any opposition to that at all? Okay.
- I had a question about 5225, which was the code coverage test PR. It's been sitting there for a while, so I'm just wondering if there was a reason why we've been waiting on that one.
- There is no reason, it was just a low-priority thing.
- Okay.
- I merged it because it seemed self-contained enough that we could just delete it if we wanted to, but it's a nice feature that we might try. I don't know why he had this drive-by PR, but he's a security researcher at the EF and figured it might help him with some things he's doing.
- Okay, cool. Thanks. I think to make it useful we should maybe see if we can aggregate the coverage and get an average, and then maybe also update the badge that we have in the readme. I'm not sure how easy that is with that coverage tool, but yeah.
- Right. Right now it's not in use anywhere. It just adds a script in package.json, which is not being used unless you decide to run it.
- Cool. Any other points specifically for planning at this point?
- Real quick, just looking at cross-fetch and node-fetch: they merged that fix into node-fetch 3, and 2 is the major version cross-fetch is on, so the update they just merged today is not going to be available to us on a minor semver bump. We wouldn't be able to actually use it unless they update, and they're on 2.
- I think we could probably just downgrade cross-fetch. I updated it to, I guess it was 4.0, in the Node 20 PR, but I think it actually wasn't even necessary, and it seems like it's created all these problems, so we can just downgrade it back to 3.something.
- Yeah, I think that's the right strategy, because if you look at the download stats, 4 is pretty immature right now; it does not have a lot of downloads compared to 3. So maybe we should wait until it becomes more widely used by other people, and then we start using it.
- Yeah, but this doesn't solve what Matthew mentioned, right? Cross-fetch might not work with node-fetch version 3 right now.
- Yeah, we'll take it offline so we don't do some weird stuff, but we should be able to figure something out. I think the downgrade is probably right: just go back and it should work. Because the addition that they basically took back out was the connection close header that got added. The agent is what's auto-adding keep-alive, and that's what's actually causing the successive closing of requests. I'll go into it in my update, but if we downgrade, we should be fine.
- If that's the case, actually, maybe we'll just go right into updates, and we'll start with you, Matthew, so you can give us the whole update.
- So there was an update to node-fetch that added a connection close header, and the agent has keep-alive that is also being applied, and that is basically causing an issue where it closes the socket instead of keeping it alive. On top of that, there's a socket recycling issue in Node, which is an existing bug. So what I did was look into both the bug in Node and the bug in node-fetch, and they're kind of conflicting, and that's really what's driving the issue with the upgrade to version 20. The fix for it is actually structural in Node. It's not an easy fix; it's an issue that had come up in Node 8, I believe, then kind of went away in Node 12, and then came back again at some point. It has to do with how the socket, the agent, and the readable stream interact. It's really a design issue, and that's the reason why it hasn't gotten fixed yet. I did add some information to the ticket that we have open, and I pinged the person who was supposed to be putting up a PR in April and asked whether he is still going to do that, or whether it's something he'd like us to help with in order to resolve it, because I've got a couple of good ideas for how it might be possible to fix it, looking at what was done before and how the classes interact. So it's possible that we might be able to fix it, but he says he's already got a PR; it just hasn't been put up yet. It's trickier than it looks, and my guess is that's why it's not done yet. Even Matteo Collina basically said this is going to take someone on the Node team a couple of days to look at, so it's a tricky, sticky wicket. It was surfaced because of a header that got added in node-fetch, which we're importing through cross-fetch. The PR that got merged takes that header off by default, so it basically falls back to the Node agent, and it should resolve the issue of auto-closing the sockets. We've got to test that. So it should, in theory, resolve that issue, but we'll see. Basically, his research all shows that it should work fine, but honestly, I didn't test it, so we'll have to double-check that it actually works. It all looked like it worked, and they tested it on the node-fetch side, so it has been tested; I just didn't personally do it to sign my name on it.
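For context on the interaction described above, here is a small sketch of the two settings that end up fighting each other: an HTTP agent configured to keep sockets alive, and a request that sends a Connection: close header. This is generic node:http usage to illustrate the conflict, not the node-fetch or Lodestar code involved.

```ts
// Sketch: a keep-alive agent tries to pool and reuse sockets, while a
// "Connection: close" request header tells the server to close the socket
// after responding. Sending both means sockets get torn down and re-created
// on every request instead of being reused from the pool.
import http from "node:http";

const keepAliveAgent = new http.Agent({keepAlive: true, maxSockets: 10});

function request(path: string, forceClose: boolean): void {
  const req = http.request(
    {
      host: "localhost",
      port: 9596, // e.g. a local beacon-node REST port (illustrative)
      path,
      agent: keepAliveAgent,
      // With forceClose=true this header defeats the agent's pooling.
      headers: forceClose ? {connection: "close"} : {},
    },
    (res) => {
      res.resume(); // drain the body
      res.on("end", () => console.log(`${path} done (forceClose=${forceClose})`));
    }
  );
  req.end();
}

request("/eth/v1/node/health", false); // socket can be kept alive and reused
request("/eth/v1/node/health", true); // socket is closed after the response
```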
Also, for my update, I did a PR for the du command. I had to restart my computer and found a weird thing where the du command was failing in the unit tests, so I put that up. I also put up a PR for the network worker message latency, in order to get some metrics for that, and then I'm going to follow up with Ben after I get some metrics running, just to let him know what's happened there. I'm also going to be adding a question from Tuyen about breaking up the runMicroTask function, to see if we can get it scheduled a little better to improve network performance, and a question from Nico about setTimeout versus setImmediate and strategies for how to use the scheduling methods we've been using, in case there are any suggestions.
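As background for that scheduling question: the practical difference is where the callback lands in the event loop. setImmediate callbacks run in the check phase, right after pending I/O callbacks, while setTimeout(fn, 0) waits for the timers phase of a later loop iteration, so yielding with setImmediate usually lets pending I/O be serviced sooner. Below is a minimal, generic sketch of yielding inside a long batch job; it is not Lodestar's actual scheduler.

```ts
// Process a large batch of items without monopolizing the event loop.
// Yielding with setImmediate lets pending I/O callbacks run between chunks;
// setTimeout(resolve, 0) would also work but adds timer-phase latency.
function yieldToEventLoop(): Promise<void> {
  return new Promise((resolve) => setImmediate(resolve));
}

export async function processInChunks<T>(
  items: T[],
  processOne: (item: T) => void,
  chunkSize = 64
): Promise<void> {
  for (let i = 0; i < items.length; i += chunkSize) {
    for (const item of items.slice(i, i + chunkSize)) {
      processOne(item);
    }
    // Give sockets, timers, and other queued work a chance to run.
    await yieldToEventLoop();
  }
}

// Example usage (processOne would be whatever per-item work is being batched):
// await processInChunks(messages, (msg) => verifyOne(msg));
```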
And then the other thing I'd like to be doing this week is the blst stuff; it's pretty close, and hopefully Gajinder will be able to get it over the hump, because when we were testing it a couple of weeks ago, it really did stabilize the network a lot, just by freeing up the main thread to process a lot of the other work that exists. So that's something I'd really like to push over the hump this week if possible, in particular by not having to deserialize and serialize the keys in order to convert from the state transition back through all of the validation functions. There are just a few things there that I think are going to free up a lot of resources, which I think is going to stabilize the node a lot, because it really was doing a really good job. Assuming everybody is okay with that, that's really going to be my goal: to see if we can get some metrics on that, and then following up with Ben.
- Awesome, thanks, Matthew. All right, that sounds exciting. I will now hand it over to Cayman for any updates that you might have.
- Yeah. So this past week, to be honest, I was not very productive. I got a small PR merged in the p2p library that allows us to manually dial the identify protocol, which should help with a very minor thing in Lodestar: it may help us identify peers a little better, identify the client versions a little better. I've seen that sometimes we have an unknown peer that might actually be related to a client that we know about. Other than that, I was trying to close out old PRs in our queue, and I'm going to keep doing that this week, specifically the discv5 PR using vanilla events; that's a prime candidate. And I would really love to look again at the multi-fork types PR that I had out. There was a type error that was blocking it, but I'd like to see if I can revisit it, because I think just having a better organization of our types is going to be helpful as we get more and more forks out there. Other than that, if anyone has any specific things they want me to review, I'm free to take a look, so ping me. But yeah, I'm back to full availability now that my family is no longer in my house.
- Cool, thanks, Cayman. All right, I'll hand it over to Nico.
- Hey, so I was mostly looking into the issues I opened last week. There was the one where the process was hanging; it turns out that's the network worker, so it was not related to any of the IPv6 updates we did. There was also the issue that our metrics server was actually not configured to listen on localhost; that's fixed now with the PR I did. Besides that, I was basically trying to investigate why our sim tests were hanging. For that, there was one PR I closed where I changed the order of how we shut down the peer manager, and then, looking at those logs, I found out that in our sim tests we actually have a lot of these "cannot set headers" errors. That led me to look further into that issue, and I think it is now finally correctly resolved. It was just a race condition in how we close the event stream: in some cases we still emitted events to the event stream even though it was already closed or the stream was no longer writable. So that should be fixed now.
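A generic illustration of the kind of race described above and the usual guard: before writing a server-sent event, check that the response stream has not already been closed by the client or by the shutdown path. This is a plain node:http sketch, not the actual Lodestar event-stream handler.

```ts
// Sketch: only write events while the response stream is still writable.
import http from "node:http";

function sendEvent(res: http.ServerResponse, data: unknown): void {
  // Writing or setting headers after the stream has ended is what produces
  // "write after end" / "cannot set headers after they are sent" errors,
  // so bail out if the stream is already gone.
  if (res.writableEnded || res.destroyed) return;
  res.write(`data: ${JSON.stringify(data)}\n\n`);
}

const server = http.createServer((_req, res) => {
  res.writeHead(200, {"content-type": "text/event-stream", "cache-control": "no-cache"});

  const timer = setInterval(() => sendEvent(res, {clock: Date.now()}), 1000);

  // Stop emitting as soon as the client disconnects or the server shuts down,
  // instead of racing a write against the close.
  res.on("close", () => clearInterval(timer));
});

server.listen(9596);
```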
Besides that, I was also looking a bit into that Node 20 issue we had. What I want to finalize now is the other open PR regarding the node health API. There are good suggestions there for how we can improve it; I think the approach Nazar suggested is pretty good, so I will implement that. And then hopefully this week I can make some progress on looking into how we can improve our regen strategy, and eventually talk with Lion about it.
- Which strategy?
- Just looking into how we do regen at the moment, and state caching. I was looking at the strategies that other clients use, but there are still some points that I need to understand better to really make proper decisions about what we can improve.
- I'd like to give a shout-out to Nico for holding down the help channel and always seeming to have the answer for everybody in there. I think it deserves commendation; you do an amazing job at all that stuff.
- Yeah, man, big ups. Big ups.
- It's really fun for me to have users.
- You're really good at it.
- The thing is, it teaches you a lot, because trying to answer questions has always been my favorite approach to learning; just helping other people with their software.
- It's very impressive to me, because I learned a ton from your answers in there as well; I just wanted to specifically call it out. What sparked it was the guy who started a testnet from genesis, which is pretty cool. I hope he actually puts that repo up, because I'd love to be able to see it.
- Yeah, I think he's doing great work. Let's see what comes out of it, because that whole devnet or testnet was not, I mean, I also tested it, but not as extensively as he is doing now. So I think it's good that we know it actually works. And he has a compose setup, and it's not starting from phase0; he started halfway through, which I think is also very cool. We should turn it into a highlight or maybe do a blog post on it; there's some opportunity there.
- Yeah, definitely. And I'd also like to add that the work you do, Nico, which helps other community members build tooling or guides or whatever else may help the community, is actually a flywheel we've been trying to get going with the community, specifically for Lodestar. If we're able to help other builders do some of the work that we otherwise wouldn't get to, or that will help improve the Ethereum community in some way, that's awesome work. It just exponentially increases the output of what we're trying to do here. So thanks a lot, Nico. Okay, we'll move on to Nazar.
- Thank you so much. Yeah, lately I was struggling with using the prover package in a React application. There is a very well-known pattern of conditional exports. It turned out that package conditional exports are not standardized, or rather most libraries are not using them properly, each in different ways. So I made some changes to make those conditional exports work for webpack. For reference, conditional exports are how a build tool, such as the TypeScript compiler, webpack, or any other build tool, can detect whether it's targeting a browser or a Node environment, or whatever other conditions apply, and then switch the import path appropriately. There was a bug which I fixed for webpack, and that was working, but then we have a package in our repo which lints the readme files, and that package stopped working, because it has a different way of detecting the conditional exports: it only resolves one level deep, not nested. Webpack, on the other hand, can handle nested conditional exports as well. Due to this limitation I banged my head a lot, but finally I went for a named export. So now, if you want to use the prover in the browser, there is a named export for the browser, and you can rely on that. That is much more streamlined across all build tools, so it will work.
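For readers unfamiliar with the mechanism being described, a package's exports map can expose both conditional entries, resolved differently per environment, and explicit named subpaths. The shape below is only an illustration of nested conditions versus a dedicated browser subpath, using a made-up package name; the real prover package layout may differ.

```json
{
  "name": "example-prover",
  "exports": {
    ".": {
      "import": {
        "browser": "./lib/index.browser.js",
        "default": "./lib/index.js"
      },
      "require": "./lib/cjs/index.js"
    },
    "./browser": "./lib/index.browser.js"
  }
}
```

Tools that only resolve one level of conditions can still handle the flat "./browser" subpath even if they choke on the nested import-then-browser combination, which is roughly why the named export ended up being the more portable choice.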
There was also one other discussion with Nico about a situation where, when we shut down a beacon node, we see an error message in the console log which says that the execution engine went offline, which is actually not the case, because the execution engine is there; we just shut down the node. Apparently it's an abort error which somehow gets detected as a communication error between the execution layer and the beacon node, and then the logic we have reports it as the execution engine going offline. So there is a PR I opened for it; I'm writing some tests for it, and I'll finalize those tests and make the draft PR ready for review. And there is one other PR I'm working on right now for a logical error I found in one of the implementations in the prover, for when we don't have enough finalized blocks: if we initialize the prover and we only have one finalized block at that time, there is a logical error which limits fetching some payloads. So I will open one PR for that, and then, if both work fine, I will make the React application PR ready. It's almost done; it's just blocked by this logical error, which I only found this morning. So you can expect three PRs from me, maybe today or tomorrow. Thank you.
- Okay, well, next up we've got Tuyen, if you have anything to add.
- Yeah, so I finished the new BLS API. Next I will work on the indexing, and hopefully I'll have a PR tomorrow. Other than that, I submitted two PRs. One is to poll proposer duties before the next epoch. The other one is to not subscribe to too many subnets. The context is that when we join a sync committee, a lot of long-lived subnet peers appear, up to 50 on average, and I see that we receive something like 120K message IDs in the gossipsub IHAVE messages, which increases the bandwidth a lot. For each of those message IDs we have to convert to a string, and we have a lot of I/O lag at that moment. This afternoon, when I looked into our rated.network rating with Lion, there were times when the rating decreased a lot, and that happened when we joined the sync committee too. So the fix is to not subscribe to too many subnet peers; right now the target is six subnet peers, so please review that. Next, the other thing I will work on is to not subscribe to short-lived subnets too early. Right now, if we have a duty in the next epoch, we start subscribing at the beginning of this epoch, which increases the bandwidth a lot; we should try to subscribe just some slots in advance.
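A rough sketch of that lookahead idea: instead of subscribing to an attestation subnet as soon as the next epoch's duty is known, wait until the duty is only a few slots away. The constants and function names here are illustrative assumptions, not the actual Lodestar network API.

```ts
// Sketch only: subscribe to a short-lived subnet a few slots before the duty,
// rather than a whole epoch ahead, to cut gossip bandwidth and IHAVE traffic.
const SUBSCRIBE_SLOTS_AHEAD = 2; // assumption: two slots of lead time to join the mesh

type SubnetDuty = {subnetId: number; dutySlot: number};

// Placeholder for the real gossip subscription call.
function subscribeToSubnet(subnetId: number): void {
  console.log(`subscribed to attnet ${subnetId}`);
}

export function onClockSlot(currentSlot: number, upcomingDuties: SubnetDuty[]): void {
  for (const duty of upcomingDuties) {
    const slotsUntilDuty = duty.dutySlot - currentSlot;
    // Previously: subscribe as soon as the next epoch's duties are known.
    // Instead, only subscribe once the duty is within the lookahead window.
    if (slotsUntilDuty >= 0 && slotsUntilDuty <= SUBSCRIBE_SLOTS_AHEAD) {
      subscribeToSubnet(duty.subnetId);
    }
  }
}
```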
And the last thing is that the noble guys have a new chacha-poly implementation, and in the last update he said that he will support the destination as an optional param, which is what we want. So I will do a performance test to see if it's actually better than our AssemblyScript implementation, and if it is, we can switch to it. That's it for me.
- Thanks, Tuyen. Yeah, some of those fixes that you're putting in to help with the sync committee issues were my rationale for potentially pushing out a hotfix release. But if we're going to go ahead and do a 1.10 anyway, and hopefully that goes well, we can sort of play it by ear, and seeing how the RC does instead might be a better way to go. Okay, next up we've got Lion.
- Hey, so last week was Paris; I think I gave an update on that last week. I guess I had an interesting conversation: we were chatting a lot with Terence about multiple things, and because we were discussing what's going to happen when the state gets really big and so on, he asked me, how long does it take Lodestar to process all the attestations within the aggregation window? And basically my answer is: we just don't. So I was spending a bunch of time trying to understand how bad this problem is. We have a new dashboard called "Lodestar group - good behavior", which is a selection of the things we do that do not affect us directly but affect others. Because, I don't know why it bothers me this much, but I don't think it's okay that Lodestar keeps growing while being a negative for the network. Especially the problem we have where we essentially drop messages; that means that messages being propagated through the network don't get there. If Lodestar had a significant share, this would be pretty catastrophic for the network. I guess at the rate we are at now, the redundancy in the network means it does not cause a significant effect, due to the fact that there are so many aggregators and everyone has a decent amount of mesh peers. But we are basically a sink: if you send us something, it will just not get through. So that's why I'm coordinating with Tuyen, and the realization is: if we are not processing attestations in time, why are we doing it at all? It's kind of stupid. We could even completely turn off our aggregator, just make Lodestar performant, and then focus on that part, because otherwise it doesn't make sense. We are spending all this time aggregating attestations only to not do it in time, and if it doesn't get to the aggregator within a slot, it's useless. So, yeah, working on that at least. I think doing something radical like this could give us a bit of time to not overload Lodestar while we get something more permanent, like the networking thread. I would be okay with that compromise, at least if we know it's temporary, and we know that otherwise it's kind of useless anyway. And yeah, besides that, all the different research paths that I mentioned last week just continue; no big news there. And that's it for me.
- Thanks, Lion. Great update, and something for us to think about in terms of how we might want to approach this. Okay, just in the essence of time, we have Gajinder and NC left. NC, if you have anything to say, do you have any updates on some of the work you've been doing?
- Right, yes. So, on ePBS, I finally had some capacity to start on this project last Friday, and since then a couple of things have happened. First off, Terence invited Lion and me to join the ePBS discussion over on the Prysm Discord, so I think from now on all of the ePBS discussion is going to happen over on that side. He also set up an initial touch-base with Lion and me, plus a couple of Prysm folks, next Wednesday, so I hope we can have a productive discussion there. And he posted his first draft of the P2P spec for ePBS, which is something I still need to review. So for this coming week, I need to get myself up to speed on the P2P side, especially the network layer stuff, mostly libp2p and the gossip stack, just enough that I can understand what Cayman is doing on the ePBS P2P side. And over the next two weeks or so, I want to write up a project document for ePBS, just to formalize the project a little bit: maybe set some objectives and goals, and split the project into a couple of phases, so that it's easier to organize and track progress. So it seems like right now we are focusing on the P2P side of things for ePBS, and for the rest we still don't have any meaningful discussion yet. Yeah, that's all from me.
- Awesome, thanks for the update. All right, and we have Gajinder.
- Hey guys. So I worked on PRs for forkchoiceUpdated v3 for devnet-8, then I worked on the broadcast validation PR, and I hope I've addressed the concerns raised by Cayman and Lion. Then I tried to sync Constantine, the Verkle testnet, but I was facing issues with loading the genesis. I spent quite a lot of time debugging them and finally figured out that Constantine has a change relative to my current local branch in the payload header: it now has an execution witness header rather than the execution witness. So I'll make that change and then try to run against the network again. And then I basically had discussions, as well as raised PRs, on the consensus specs regarding publishBlockV3. It seems that we will need to move our builder versus execution race to the beacon node rather than the validator, which is where we do it right now, because the format now is that all of the APIs basically assume that this race and selection happens in the beacon node. So that is something that I'll pick up.
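To illustrate the kind of change being discussed, moving the builder-versus-local-execution selection into the beacon node typically means racing the two block production paths with a timeout and falling back to the local payload when the builder is slow. This is a hedged, generic sketch with placeholder types and a made-up selection rule; the real publishBlockV3 flow and Lodestar's selection logic will differ.

```ts
// Generic sketch of a builder vs local execution race inside the beacon node.
// All types and helpers below are placeholders, not Lodestar's actual code.
type BlockSource = "builder" | "execution";
type ProducedBlock = {source: BlockSource; valueWei: bigint; block: unknown};

const BUILDER_TIMEOUT_MS = 2_000; // give up on the builder well before the slot deadline

async function produceLocalBlock(slot: number): Promise<ProducedBlock> {
  return {source: "execution", valueWei: 1n, block: {slot}}; // placeholder
}

async function produceBuilderBlock(slot: number): Promise<ProducedBlock> {
  return {source: "builder", valueWei: 2n, block: {slot}}; // placeholder
}

function withTimeout<T>(p: Promise<T>, ms: number): Promise<T | null> {
  return Promise.race([p, new Promise<null>((resolve) => setTimeout(() => resolve(null), ms))]);
}

export async function selectBlockForProposal(slot: number): Promise<ProducedBlock> {
  // Start both paths concurrently; never let a late builder response cost the proposal.
  const localPromise = produceLocalBlock(slot);
  const builderResult = await withTimeout(produceBuilderBlock(slot), BUILDER_TIMEOUT_MS);
  const local = await localPromise;

  // Simple selection rule for illustration: prefer the builder only if it
  // responded in time and pays more than the local payload.
  if (builderResult !== null && builderResult.valueWei > local.valueWei) {
    return builderResult;
  }
  return local;
}
```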
And then there is a PR on the consensus specs regarding the parent beacon block header. Again, it seems that the EL folks are in favor of the PR but the CL folks are not, and on tomorrow's call it will most probably be decided whether this PR will be included or not. Yep, that's all.
- Thanks, Gajinder. Okay. So thanks, guys, for coming out, and we'll see you later.
- Bye, guys.
- Bye.
- Bye.
- Have a great week.
- Bye-bye.
- Have a great week, everybody.
- Have a good week, bye-bye.