- Hey, everyone, and welcome to the June 13th stand up.
- Today, for planning,
- I have mostly questions, I guess, catching up with the chats and the issues in regards
- to 1.9.
- Thanks, guys, for cutting an RC yesterday and deploying it.
- Looking to see or hear of any updates and questions, I guess,
- that we might have on what's happening with v1.9.0-rc2.
- If anybody wants to give just a quick overview
- about where we're at, that would be great.
- Just have a look.
- Seems like for today, we dropped some mesh peers.
- But on beta previously, I deployed a branch
- to test the batch delete stuff.
- And when I look at the seven days chart,
- seems like it's very stable except for today.
- So maybe it's just nothing.
- We need to monitor more.
- Right.
- OK, then in the chat as well, I think there was an idea put up,
- Cayman, to deploy it on some mainnet validators,
- suggesting the CIP canary validators,
- to hopefully get some better metrics on that.
- Is there any sort of contention to that idea?
- - Yeah, my thought was that we could just deploy it
- to some nodes that have validators attached,
- 'cause it seems like, you know,
- the scoring is gonna be a little different.
- We're gonna be giving the network actual useful data
- rather than just forwarding stuff, so.
- Sorry, I lied, I said deploy to the whole CIP fleet.
- - Yeah.
- I mean, the worst that could happen is a bit less rewards,
- and we don't care.
- So we might just as well use them.
- >>Okay.
- Yeah, that makes sense.
- Okay.
- And then the--
- >>I feel pretty confident about this release.
- I feel like it's just the network
- we're seeing with this mesh peer stuff.
- Like it hasn't come up before in any of our other testing.
- And...
- >>Yeah, for stable mainnet,
- it has been stable in the last seven days
- except for the last few hours.
- So yeah, it's just like some incident in the network.
- - Yeah, I think it's pretty interesting
- 'cause like not all of the, like the beacon nodes
- that we have deployed necessarily perform the same.
- A lot of it seems to be also dependent on the luck
- of the mesh peering that you kind of get.
- I don't know if I'm mistaken in this,
- but there are definitely some beacon nodes
- that we had on mainnet with Lido
- that don't have this stable mesh peering.
- And it's very specific ones sometimes.
- So I'm not entirely sure what the solution is for that
- when we start seeing some poor mesh peering
- on specific nodes.
- I would say maybe we should make an issue about this just so we can track it, and I'm
- not exactly sure how we would go about debugging that, but I guess I would tend to agree that
- there's some luck, like if you're connecting to nodes that are more connected
- to the rest of the network,
- you might be getting messages a little bit faster
- and then not having as many
- missed attestations because you're better synced. I don't know, I have no data to back this up.
- OK, yeah, sounds good.
- All right, well, we definitely have some next steps here in regards to seeing if we can push out
- v1.9 RC2.
- So that's probably the most important thing
- on my list right now.
- There are a couple of people that are waiting to try this out.
- Some larger node operators like RockLogic.
- And I've also reached out
- to some relayers as another target group of users as well.
- We should all be in a chat with Aestus Relay on Telegram now. So they will be sort of like our
- test clients or customers in regards to diversifying that aspect of infrastructure on the network.
- Does anybody have any points for planning that they'd like to bring up specifically?
- Maybe we can talk about the network thread or the different strategies that we have for
- how to improve performance, and prioritize them.
- Yeah, that sounds good.
- I would say, has someone confirmed the fact that
- we mainly experience issues
- when we have a lot of keys?
- Say, the nodes that have zero keys or few keys,
- are they doing fine?
- - With the network thread enabled?
- - No, in general.
- - In general, I've noticed that the CIP validators
- seem to have better effectiveness
- comparatively to the Lido nodes.
- When you say, like, few keys,
- like I don't know if there's a specific number
- that you're considering,
- but our CIP validators have a division of,
- I think it's 16 canaries.
- So one node has 16
- and then the other one has 56 keys in it.
- Whereas all of our Lido ones have like,
- I think 200 per beacon node at this point.
- - Go ahead.
- - Yeah, also when I looked at the performance
- of the Lido nodes, when I debugged that,
- I also saw that there are way more late attestations
- and way more missed attestations
- compared to my private server.
- And also in the performance rate,
- there was, I think, a difference of
- about 3%.
- I think the timeframe was 30 days, so yeah.
- So it makes a huge difference, I think.
- - I see the same thing with my personal validators too.
- - Got it.
- I think there's also a piece that whenever we talk
- with operators, we can say confidently then
- that Lodestar is at a good level
- that we are comfortable recommending for, say,
- not huge stakers. And then for bigger stakers, we are working on it. And then we can put this
- relative-to-size beta tag or stable tag on the software, so we can cover both grounds.
- Cool. Then do we want the network, I guess the answer is yes, but do we want the network thread
- to be enabled by default eventually?
- >> Absolutely.
- I think, well, it seems like we're
- able to actually process all of the messages.
- Well, for larger stakers that are connected to more subnets,
- we're actually performing well for the network's sake.
- I remember we were talking about being a good network
- participant and how we aren't always a great network participant right now. I think enabling
- the network thread by default is kind of key to doing more work that we should already be doing.
- Okay, I agree. I can share the findings I got so far, which are not very conclusive.
- Tuyen has been able to take regular CPU profiles with the Chrome DevTools.
- I was building a hack in the API; I was able to take a perf recording of the whole process
- and then label stacks by thread ID so we can look exclusively at the network thread.
- And at least from there, there is nothing obvious that stands out in either of the two.
- So it looks like the node is doing what it's supposed to do.
- It's just doing Gossip, Mplex, Crypto, Allocations, TCP handling.
- There are two minor issues that Tuyen is already handling.
- Like we do PeerId conversion and deconversion.
- That's stupid, but we're not going to do that anymore.
- But that's about 4% of CPU time.
- So it just looks like for some reason the thread is overloaded.
- The thing that I don't understand, to be honest, is why the main thread was doing fine,
- and now the network thread is clogged.
- That's a question I haven't been able to answer yet.
- I don't know if someone has.
- There's a definite overhead to spinning up a second isolate.
- So it's definitely doing a lot of work just handling another node instance.
- One thing to be clarified, the main was not a thread, it was a process.
- And this network thread is the first time we introduced a thread.
- So there's definitely going to be context switching at the CPU level.
- Because if you have two separate sets of worker pools, it's basically switching between the two sets of worker pools in order to accommodate all the work at the processor level.
- It's possible.
- I mean, I don't really buy that because
- CPUs are very optimized for that purpose.
- And if you look at the profile, you don't see anything that would
- relate to that.
- One difference that
- Tuyen also was able to see
- in his profiles that he took: when
- we have everything in the main thread, we explicitly
- yield back to the macro queue,
- rather than have a bunch of these promises being
- awaited in a--
- We don't have huge uninterrupted periods of micro queue tasks.
- We explicitly yield back to the macro queue
- for timers and other things to be able to be
- triggered at the right time.
- And if you don't do that, then your timers
- won't be able to fire because there's too much work
- to be done.
- There's a huge uninterrupted stream of micro queue tasks.
- And that's one of the things we saw in the profile.
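(A minimal sketch of the yield-back-to-the-macro-queue pattern described here; the helper name and the 50 ms threshold are illustrative, not Lodestar's actual implementation.)

```typescript
// Hypothetical helper: schedule the continuation as a macrotask so timers,
// I/O callbacks, and other macro-queue work get a chance to run.
const yieldToMacroQueue = (): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, 0));

async function processJobs(jobs: (() => void)[]): Promise<void> {
  let lastYield = Date.now();
  for (const job of jobs) {
    job();
    // If we've been hogging the microtask queue for too long, yield back
    // to the macro queue so pending timers can fire roughly on time.
    if (Date.now() - lastYield > 50) {
      await yieldToMacroQueue();
      lastYield = Date.now();
    }
  }
}
```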
- Yeah, I would be... So that makes sense, but I would be cautious: when you look at the timeline
- of the profile, you have to take into account that that's a sample. So if there
- is a sample at time X with a stack that's rooted on the micro queue, and then the next
- one also in the micro queue, but it happened that between those there was a macro queue
- yield, if there is not a sample at that specific time, it would look like it's a continuous
- micro queue run when that's not true.
- Just an FYI.
- I feel like we do have a metric that kind of
- is a counterpoint to that, though: we see the event loop lag,
- which is the time between macro queue runs.
- And that is up to, like, a second or more
- when we're running the network thread.
- That should be, you know, in milliseconds.
- - We have two, 'cause I don't know why the hell
- Node.js doesn't expose that in a nice way.
- I'm not sure if they do.
- So, prom-client captures the metric in two ways.
- And if you look at the metrics in the Grafana dashboard,
- you will see that there are two different dashboards,
- sorry, two different charts for event loop lag,
- which are wildly different.
- Also for the GC metrics,
- there are two different sets of metrics
- because they use different techniques
- and are wildly different, which is really annoying.
- So I don't know, we just don't know what's happening.
- But anyway, so--
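(For reference, Node.js does expose an event loop delay histogram via perf_hooks; whether prom-client's two techniques use exactly this is an assumption. A minimal sketch:)

```typescript
import {monitorEventLoopDelay} from "node:perf_hooks";

// Samples event loop delay in the background; values are reported in nanoseconds.
const histogram = monitorEventLoopDelay({resolution: 20});
histogram.enable();

setInterval(() => {
  console.log({
    meanMs: histogram.mean / 1e6,
    p99Ms: histogram.percentile(99) / 1e6,
    maxMs: histogram.max / 1e6,
  });
  histogram.reset();
}, 10_000);
```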
- - Well, that would make sense,
- because the micro queues and the first garbage collection,
- I don't think, run on the full cycle.
- And the polling also runs on the full event loop,
- not on the, like, if you insert promises
- into the micro queue,
- it doesn't actually get to the next step.
- So it's not polling the network.
- It's not pulling new stuff off sockets.
- It's not running the major GC,
- but the minor GC is basically just a memory pointer
- that runs back and forth, so that would be running in between.
- Just a thought.
- - Got it.
- Okay, so if we want to enable it, then we have different strategies.
- Something that I'm not sure if we can do, but would the network thread be able to self-regulate
- itself to not choke?
- That would be great, but I'm not sure if we can do that.
- Then we can, I mean, we are already doing the network thread, so we are scaling horizontally.
- We are moving loads across different CPU threads.
- We cannot do that anymore because it's already one thread.
- It wouldn't make sense to have multiple threads.
- Then we can try to do fewer things.
- Reintroduce somehow a mechanism to drop messages,
- so we reduce load if the thread is overloaded.
- But that's -- I mean, it would be cool to do that.
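(A minimal sketch of the drop-messages-under-load idea; the class and its drop-oldest policy are illustrative, not Lodestar's actual gossip queue.)

```typescript
// Hypothetical bounded queue: when the network thread is overloaded,
// shed the oldest pending items instead of letting the backlog grow.
class DroppingQueue<T> {
  private items: T[] = [];
  droppedCount = 0;

  constructor(private readonly maxLength: number) {}

  push(item: T): void {
    if (this.items.length >= this.maxLength) {
      this.items.shift(); // drop the oldest item to shed load
      this.droppedCount++;
    }
    this.items.push(item);
  }

  next(): T | undefined {
    return this.items.shift();
  }
}
```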
- This is why I was bringing up how is Yamux doing?
- Because Yamux has back pressure built into it and may be able to help in these situations.
- Because it will not be sending the window updates,
- so you will stop receiving things.
- But I think Yamux is blocked.
- Because the performance was bad, right?
- No.
- There was a memory leak last time we tested it out.
- Okay.
- Because if the network thread is not doing anything stupid, which it's not,
- once we fix these two little things,
- then what options do we have?
- We just do fewer of the things we are already doing
- that we have to do, or we optimize the pipeline further,
- which would be another option,
- but that would take a while.
- Like, how is it going in libp2p land?
- Yeah, so I was going to mention this in my update,
- but I think I've got a branch open for upgrading to 0.45,
- which is the latest version of libp2p.
- I think we definitely need to prioritize that after we
- get this release out, because we have a lot of--
- so we have some fixes in the TCP library.
- We have some improvements that are open PRs in gossipsub.
- And all of those are blocked because we're
- running several versions behind in production,
- like several versions behind on gossipsub and on TCP.
- So we want to get back up and get all these latest goodies.
- That'll also let us retest Yamux.
- And hopefully-- there have been a few fixes.
- And that memory issue may have been resolved.
- The other interesting thing that is kind of related to this
- is that in a future version of libp2p, maybe 0.46 or maybe 0.47,
- we are thinking about replacing the underlying
- implementation of the streams with the--
- there's a standard for streams.
- They're called web streams, or WHATWG streams.
- And they basically provide a readable and a writable stream.
- And there is a promise of maybe being--
- that it may be more performant.
- Because you can use a readable stream in such a way
- that the consumer of the stream passes in a buffer
- to the stream to have data written directly to it
- to avoid additional memory copies when
- you're dealing with binary data.
- So this is pretty nice for us.
- And this would also allow us to stop--
- we're doing a lot of stupid wrapping of these streams
- to make them abortable at different levels of the stack.
- And the reason we have to do that is because our underlying
- source, our underlying streams are not abortable by default.
- So that's another piece of the puzzle
- that this implementation is abortable at the lowest level.
- So we don't have to do additional wrapping.
- And then another third thing about it
- is that there's built-in backpressure
- with these streams.
- And so this is not even just like Yamux backpressure,
- where we're sending data.
- It's like backpressure is built into the--
- we might be able to basically tell the TCP socket that we're
- not able to be writing more data.
- So we might be able to-- or not be able to be reading more data.
- There may be some ability to handle backpressure
- that way too. That's all in the future, as like, version 0.46 or
- 0.47 or something. But we need to get to 0.45 first. So, 0.45, then we can
- start testing Yamux. So hopefully in 0.46 and 0.47 we get more goodies.
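(A minimal sketch of the bring-your-own-buffer readable stream being described; this uses the standard WHATWG streams API, but how libp2p 0.46/0.47 will actually wire it up is not settled here.)

```typescript
// A byte ReadableStream that writes directly into buffers the consumer
// provides, avoiding an extra copy when handling binary data.
const source = new ReadableStream({
  type: "bytes",
  pull(controller) {
    const req = controller.byobRequest;
    const view = req?.view;
    if (req && view) {
      // Fill the consumer's buffer in place (dummy data here).
      new Uint8Array(view.buffer, view.byteOffset, view.byteLength).fill(0xaa);
      req.respond(view.byteLength);
    }
  },
});

// The consumer brings its own buffer ("BYOB") for the data to land in.
const reader = source.getReader({mode: "byob"});
const {value} = await reader.read(new Uint8Array(16 * 1024));
console.log("read", value?.byteLength, "bytes without an intermediate copy");
```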
- So with async iterables, we do have backpressure, because
- you are requesting each individual next item. If you
- don't request more items, the source should not produce more
- items. So the classic stream backpressure is that you have small buffers everywhere.
- But with async iterables, you have backpressure, it's just
- like a zero-item buffer, right?
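(A toy illustration of that pull-based backpressure, not libp2p code: the generator only produces a chunk when the consumer asks for the next one.)

```typescript
// The source produces a chunk only when the consumer pulls, so a slow
// consumer naturally throttles the producer: effectively a zero-item buffer.
async function* source(): AsyncGenerator<Uint8Array> {
  for (let i = 0; i < 5; i++) {
    console.log("producing chunk", i);
    yield new Uint8Array(1024);
  }
}

async function slowConsumer(): Promise<void> {
  for await (const chunk of source()) {
    // Nothing new is produced until this iteration finishes.
    await new Promise((resolve) => setTimeout(resolve, 100));
    console.log("consumed", chunk.byteLength, "bytes");
  }
}

void slowConsumer();
```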
- Okay, so with this, you basically have a, I guess the queuing strategy, or like the buffering
- strategy, is more explicit. So with this readable stream, you could tell it how
- to buffer the data and what to do when you're full.
- So right now we have these it-pushables in different places
- that are performing that task, where it's like we're buffering
- things, and then we have to set a max buffer size for these
- it-pushables.
- So it's kind of built in.
- So I guess, related to this, I'm confident that with the current
- design we could get rid of all the abort sources, because, I mean, probably the issue is that
- libp2p does the wiring of the transports, but if we can declare at the libp2p constructor level,
- I am aware of my muxer and I am aware of my protocols, and I know
- they all have abortable sources, so please don't wrap them.
- Like, we could do that today with today's implementation.
- Like, I guess moving away from async iterables into something else is
- a big thing.
- And I'm not sure if it's the right one.
- The thing is, these streams implement async iterables.
- So we're not losing async iterables.
- It's just that there's a concrete implementation backing
- everything.
- Can you send me an issue if you're designing this in the open?
- Yes.
- I commented on your -- you opened a draft PR in gossipsub to remove abortable source,
- and I commented there, where some of the discussion is happening.
- Okay.
- Is there a performance test somewhere in libp2p that we can use to test, like, the full throughput
- of the stack?
- Not yet.
- They're working on it.
- Like, I would like to test this hypothesis that our stack is slow.
- Because maybe it's not true.
- Maybe it's doing fine.
- I don't know.
- Like we have had surprises before.
- I actually just thought of something:
- if we're not getting out of the micro task queue,
- or we're basically stuck in the micro task queue,
- the sockets could be loading instead of into L1,
- because it isn't, like, going through the loop each time.
- And it's basically backing up all the socket data
- to, like, L2 or L3 cache, and it just takes much longer
- for the data to get to the CPU to process.
- So it would look like the CPU is doing
- what it's supposed to be doing, it's just waiting for data.
- Yeah, that's a hypothesis I don't know how to test.
- I don't either, though.
- I'd love to see what performance looks like if we just--
- I think we have something like this on the main thread,
- where it checks to see the last time that there was an event
- loop run, a macro queue event loop run.
- And if there's been a certain amount of time,
- then it will yield--
- it'll sleep zero.
- It'll yield back to the macro queue
- just to avoid any long periods of micro-queue tasks.
- And see what that looks like.
- Yeah, actually, there was a very old issue I opened,
- that I never got to do because I didn't have the expertise,
- but maybe Matthew, you can take it on,
- is to investigate the hypothesis that our OS socket buffers
- are being read and written slower than they should
- due to the event loop being clogged.
- Like, I think that with the knowledge you have now,
- you should be able to confirm that.
- - At least be able to dig and find out.
- That's a good question.
- Because the hypothesis is, if we write the attestation to the socket,
- and then we stay busy for a while,
- it will take maybe five loops to copy all the data.
- So the message will be actually sent off the wire much later than we thought.
- And that would definitely cause it to be processor-level,
- like pushed back to slower cache, for sure.
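(One rough way to start probing that hypothesis on the send side, using plain net.Socket events; this is a sketch under the assumption that the write-to-drain gap is a usable proxy, not Lodestar code.)

```typescript
import net from "node:net";

// socket.write() returning false means the data was buffered rather than
// flushed; the time until 'drain' tells us how long the flush took as
// observed from JS. If the event loop is clogged, this gap grows even if
// the kernel actually drained the buffer earlier.
function timedWrite(socket: net.Socket, data: Buffer): void {
  const start = Date.now();
  const flushedImmediately = socket.write(data);
  if (!flushedImmediately) {
    socket.once("drain", () => {
      console.log(`socket flush observed after ${Date.now() - start} ms`);
    });
  }
}
```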
- Cool. So regarding planning, we don't have any, well, we have multiple things to optimize
- the network. I think they all take a while, but we'll work on it. Just to confirm with Tuyen,
- like, the performance decrease that we're seeing now is, well, the main issue is that we're
- processing blocks late, so we vote on the wrong head. Is that correct?
- Yeah, I think with the single-thread model we mostly receive and process blocks late
- and then validate and vote for the wrong head. And does that explain all the
- attestation problems that you have, or is it also that our mesh sometimes
- gets into a bad situation so we cannot send the attestations to the right aggregators on time?
- I think when we publish, we publish to all of the topic peers, not mesh peers,
- so maybe it's not related to that.
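(If that is the flood-publish behavior, it maps to gossipsub's floodPublish option; a sketch of where it would be configured, assuming js-libp2p-gossipsub's documented flag rather than Lodestar's exact wiring.)

```typescript
import {gossipsub} from "@chainsafe/libp2p-gossipsub";

// floodPublish: publish our own messages to all known peers on the topic,
// not only the mesh peers, so publishing depends less on mesh quality.
const pubsubService = gossipsub({floodPublish: true});
```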
- Cool, so all we have to focus on now is getting those blocks processed as fast as possible.
- Yes.
- And then on the network thread, it
- seems like we're getting them a lot faster,
- but our peers are a lot lower.
- So we can't keep--
- so we just have an unstable peer set,
- and it's constantly rotating out.
- And so that becomes our problem.
- But we're getting things a lot faster.
- We're processing them a lot faster.
- I think after we get 1.9 out, we can also enable the network thread on the CIP nodes too.
- Cool.
- Great.
- That's all I wanted to discuss.
- Okay.
- Cool.
- So I guess, what are we on?
- Day two of testing that RC pretty much.
- So we can probably make a decision on that
- as early as Thursday,
- but we'll still deploy to the CIP nodes anyway, for sure.
- And then we'll enable,
- actually, are we enabling the network thread immediately,
- or is that something that we're gonna wait?
- - I was thinking let's test the RC this week
- and use it as additional test data for Thursday.
- And then once we cut RC or once we cut the full 1.9,
- then deploy with the network thread to our CIP nodes.
- - Cool.
- And through observation on that,
- then we will figure out if we want to,
- I don't know if we want to consider that for the Lido nodes, if we get good data.
- But we can make the decision later based on what we see.
- Any other points for planning?
- Otherwise we'll do a quick round of updates and that should cover up the last 20 minutes.
- All right, cool.
- Let's start with Gajinder.
- How are things going on the DevNet?
- Yep. So last week was all about pre-devnet-6 work, and
- basically we were working quite well last week,
- but today devnet 6 has started, and it basically
- increased the number of blobs that one can use in the network to six per block.
- And some issues have cropped up and I'm sort of debugging them.
- And I have also generated a fix, and we are trying to basically, again,
- sync back to devnet 6 so that we can bring it to a healthy state.
- Apart from that, I did a few PRs doing fixes here and there.
- Yeah, that's mostly it.
- - With the increase in the blobs,
- does that actually make a huge dent in our performance
- or anything like that?
- - No, no, it's, I mean, basically the problem started
- when I sent, like, 500 transactions with 500 blobs in the network
- for them to be included.
- So basically each block consequently
- was now getting six blobs, which was fine.
- But then, not in the network,
- basically the problems started showing up
- with the EL clients.
- So earlier, EL clients were not agreeing.
- And there was one problem with Lodestar where,
- when somebody would do a blob sidecars by range request,
- it was sending all six blob sidecars together
- rather than chunking them up one by one,
- which was basically because of a typo,
- for which a PR has been generated,
- and I've updated them there as well.
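(Roughly, the fix described amounts to streaming the sidecars one response chunk at a time instead of as a single batch. A hypothetical sketch; the names and types are illustrative, not the actual Lodestar handler.)

```typescript
type BlobSidecar = {slot: number; index: number; blob: Uint8Array};

// Hypothetical handler shape: yield each blob sidecar as its own response
// chunk rather than sending all sidecars for a block in one chunk.
async function* onBlobSidecarsByRange(
  startSlot: number,
  count: number,
  getBlobSidecars: (slot: number) => Promise<BlobSidecar[]>
): AsyncGenerator<BlobSidecar> {
  for (let slot = startSlot; slot < startSlot + count; slot++) {
    const sidecars = await getBlobSidecars(slot);
    for (const sidecar of sidecars) {
      yield sidecar; // one sidecar per chunk, not the whole array at once
    }
  }
}
```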
- But I mean, all this is right now running in one data center,
- So there would be actually no issues
- with respect to network latency,
- but I don't expect network latency to be an issue over here.
- - Cool, thanks Gajinder.
- All right, next up we have Matt.
- - Good morning.
- So I spent some time studying through
- how the network processor was working
- and the network thread was working
- in the beacon node and got BLST brought into there properly.
- So it's only doing attestations and aggregates and proofs.
- So I have a draft PR up
- and I'm hoping to get that deployed today
- but it was not the most efficient week.
- As you can tell, my background is here,
- where Jordan's got surgery in a couple of hours.
- It's only 6:30 in the morning here.
- So I'm just trying to get ready and be here for her.
- And then I'm gonna have to go online,
- 'cause we're gonna go wait with the doctor.
- So I'll probably be off most of today.
- I'm gonna bring my computer with me while I wait
- and try to get this deployed,
- this BLST version deployed to a feature node
- and get some metrics 'cause hopefully that will help
- with the network stuff, just taking some of the load
- off of main CPU might help with some of the other issues
- we're seeing.
- So that's my goal for today.
- And then I do have a small build issue,
- a Linux build issue that I wasn't seeing on Mac.
- So I'll resolve that pretty quickly, I would guess,
- 'cause I've seen it before,
- I just don't remember what the fix was.
- And keep on going.
- And then I have the second piece ready for Gajinder,
- but I know he's been super busy
- with getting the next step of BLST approved.
- - Great, thanks, Matt.
- All right, moving on, we've got Cayman.
- If you have anything else to add.
- Yeah.
- So two things.
- One, we mentioned the libp2p branch I'm working on; I'll try to keep it up to date.
- I think I'll push any fixes or any latest updates today on that. The other thing:
- I've been working on getting us ready for Node 20, which
- came out a while ago.
- I had some performance improvements.
- I've got a PR open to simplify our snappy frame decompression.
- We were using some kind of mildly supported, mildly
- unsupported libraries.
- And it could just all go away and be simpler.
- And in the process, we're updating this native library
- we're using, Snappy.
- So updating it to the latest version,
- which is Node 20 compatible.
- And that's it for me.
- Oh, actually, I got one thing.
- I made a comment on the networking channel.
- But it's kind of a cool type hack I found out about
- called branded types. And it's a way of like
- creating types,
- creating unique types, they call them nominal types. But
- basically, like, if we wanted to distinguish between a peer ID
- that is a string versus a, I don't know, a normal string, you
- can create this type that's called a PeerId string, which has
- a special little twist to it.
- And then anything that's typecast as a string
- does not satisfy a PeerId string.
- You would have to explicitly typecast it to a PeerId string
- or have some kind of function that would
- be able to typecast it for you.
- So it provides a little bit of assurances
- that you're not going to accidentally use a string
- when it should be a, you know,
- it needs to be validated first or whatever.
- So yeah, if anyone's interested in that,
- I wrote up a little comment
- and I've got an example library that uses it.
- So feel free to take a look at that.
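(For reference, a minimal sketch of the branded-type pattern being described; the PeerIdStr name and the validation check are illustrative.)

```typescript
// The "brand" is a phantom property that exists only at the type level,
// so plain strings no longer satisfy the branded type.
type PeerIdStr = string & {readonly __brand: "PeerIdStr"};

// The sanctioned way to produce a PeerIdStr: validate, then cast once.
// (Real validation would parse the encoded peer id; this is a placeholder.)
function toPeerIdStr(raw: string): PeerIdStr {
  if (raw.length === 0) throw new Error("invalid peer id string");
  return raw as PeerIdStr;
}

function scorePeer(peer: PeerIdStr): void {
  // ...only ever sees validated peer id strings
}

scorePeer(toPeerIdStr("16Uiu2HAmExamplePeer")); // ok
// scorePeer("some random string");             // compile error: string is not PeerIdStr
```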
- - Very cool, Cayman, thanks.
- All right, next up we have Tuyen.
- Hi, so I focused on 1.9.0 to investigate the external memory issue.
- Finally, I have a fix for the batch delete and it seems that's good.
- On the other side, I tried to revert some of the other API
- PRs. The one that I found is to introduce the head block
- hack, which has nothing to do with the external memory. So I think we just leave it as is.
- But I think the external memory is good for now. The other thing is to
- investigate the network thread. I was able to take a profile that I shared on a PR. From
- that, I found some low-hanging fruit in
- gossipsub. One is not to convert the PeerId
- when we report the validation result to
- gossipsub.
- The other thing is to unbundle two
- levels of metrics. Both of these PRs are likely
- to save us four percent of CPU time, but
- that is not the root cause of the network thread issue. We'll continue with this next.
- Cool. Thanks for the update.
- Next up we have Nico.
- So I guess the main thing was making the thread pool we use to
- decrypt keystores reusable. I also improved the error handling
- there a bit and found some other issues.
- Like for example, we could not terminate the decryption without
- force closing the process.
- So this should be all fixed now.
- Yeah.
- And then I also submitted the PR so we can use that in the key manager API, which
- was then quite a simple change.
- So that's not that huge of a diff and thanks for the review there.
- And all those things are in RC2.
- Yay.
- Yeah.
- Yeah.
- That's quite the advantage, I guess, that we delayed it so much.
- So I think we got most of the important things in now that we planned for 1.9, actually.
- So yeah.
- Yeah.
- Besides that, it was just fixing a few smaller things that came up on
- Discord or in GitHub issues.
- And one other issue I noticed is that if the beacon node is running for a while,
- sometimes, I'm not sure what's the cause, it does not seem to exit cleanly.
- So the process just keeps running.
- Um, I investigated it a bit, but did not find, like,
- the handle that keeps it active.
- I did not figure out what that is.
- Um, so yeah, maybe someone has an idea.
- It seems really random and not really testable.
- Um, but it definitely only happens after, like, maybe 10 minutes or something like that.
- So maybe my idea was that we explicitly process.exit once the beacon node is closed, to avoid that.
- But it seems so rare, so I'm not sure.
- Does it happen on Mac or Linux?
- I only tested on Linux.
- Linux. So, yeah.
- Because we run some nodes on the E2E test and the simulations,
- they run over 10 minutes.
- They exit fine, at least on the CI.
- Yeah. So I'm still not sure what happens, also in Docker.
- So when I updated my mainnet node, which runs in Docker,
- it looked like it was updating the container only after 60
- seconds, which indicates to me that that was the timeout when
- Docker force-closed it. So yeah, let's see. I still
- want to investigate that. And maybe we have to explicitly
- exit, which I kind of want to avoid. But yeah. So that's it.
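(A minimal sketch of the explicit-exit fallback being weighed here, with hypothetical names; the unref'd timer means a genuinely clean shutdown still exits on its own, which keeps the "did it exit cleanly" signal intact.)

```typescript
// Hypothetical shutdown wrapper: after a clean close, give lingering handles
// a short grace period, then force-exit. unref() keeps this timer itself from
// holding the process open if everything else has already wound down.
async function shutdown(beaconNode: {close(): Promise<void>}): Promise<void> {
  await beaconNode.close();

  const timer = setTimeout(() => {
    console.warn("Process still alive after clean close, forcing exit");
    process.exit(0);
  }, 5_000);
  timer.unref();
}
```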
- Interesting. Thanks, Nico.
- >>As for-- I just wanted to make one comment on the explicitly
- exiting.
- I know that for Geth, when you--
- I think you need Control-C to kill the process.
- But then if it doesn't die, it gives you--
- it's trying to shut down.
- And then I think if you hit Control-C again,
- then it doesn't-- the first time you do it, it then--
- the second time you do it, I guess it shuts down.
- Or I don't know.
- Maybe it's not the second time,
- but you have to do it like 10 times or something like that.
- And then the 10th time it kills it.
- But that's just always another option.
- - Yeah, I think, I mean, most process managers
- just have a timeout at some point.
- So usually like 30 seconds or one minute,
- but I guess it's still annoying if you update Lodestar
- and the container hangs for like one minute or something.
- If in reality, like it already shut down after a few seconds.
- So, but actually in geth, it may need like 20 minutes of graceful shutdown.
- Like at least in Dappnode, we used to have this problem.
- Like geth keeps an insane amount of memory of data in memory and it needs the clean shutdown
- to persist it.
- And if you don't wait 20 minutes, people had to like spend two days syncing.
- Luckily, we don't have that issue.
- - Wow.
- So what are, I guess, the side effects of doing that, Nico?
- Like force closing, is there anything that like-
- - Actually, no.
- So when we would force close is,
- so what happened before is that we appropriately exited
- because of this library we were using
- and that exited like uncontrolled
- in the middle of the shutdown process.
- Now, what I observed is that we always,
- also this beacon node close method we have always succeeds.
- But then after that,
- there might be still an active handler in really rare cases,
- which I need to identify.
- But yeah, I think nothing really happens if we exit there.
- So I guess one disadvantage would be that
- we would not detect as easily
- if the beacon node shuts down cleanly or not.
- because we always explicitly exit.
- So yeah.
- Maybe we could add a test for that, that just checks.
- - Would be nice to work.
- - Cool.
- Thanks for that Nico.
- All right.
- And then we have Nazar.
- - Thank you.
- I was working last week on some final features
- or refactoring of the prover.
- One of them was the batch request,
- which turned out to be a bit tricky
- because some providers, for example
- Ethers.js, do not have a public interface
- for the batch request.
- On the other hand, Web3.js does have a public interface
- for the batch request.
- But we wanted our prover to be compatible with both
- at the same time. That was tricky and took a lot of time to finalize. So the PR is open.
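(For context, a raw JSON-RPC 2.0 batch is just an array of request objects sent in one HTTP call, independent of either provider's public API. A minimal sketch with an illustrative endpoint and methods, not the prover's actual code.)

```typescript
type JsonRpcRequest = {jsonrpc: "2.0"; id: number; method: string; params: unknown[]};

// Send several requests in one round trip; responses come back as an array
// matched to requests by id.
async function batchRequest(url: string, requests: JsonRpcRequest[]): Promise<unknown[]> {
  const res = await fetch(url, {
    method: "POST",
    headers: {"content-type": "application/json"},
    body: JSON.stringify(requests),
  });
  return (await res.json()) as unknown[];
}

// Example: a balance and a block in a single HTTP call.
void batchRequest("http://localhost:8545", [
  {jsonrpc: "2.0", id: 1, method: "eth_getBalance", params: ["0x0000000000000000000000000000000000000000", "latest"]},
  {jsonrpc: "2.0", id: 2, method: "eth_getBlockByNumber", params: ["latest", false]},
]);
```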
- So in that PR, I covered a couple of things which were left from the major epic.
- And once this PR is merged, then I will hopefully close the epic issue that we have. And
- the only thing that will be left out from the epic will be the P2P interface for the light client.
- I will open a separate issue for that particular task and then close the epic.
- And in addition, I was going through some types and I saw a very useful ESLint rule
- for unnecessary typecasting. I enabled the rule and found out that
- there was a lot of unnecessary typecasting around in our source code. So I had to open a PR.
- If you guys find it fine, then we can merge it. Or if you think we should keep those unnecessary typecasts,
- for example, as bigint or as string when the values are already strings, if we think they are necessary
- in our source code, then we can close the PR.
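(If it is the typescript-eslint rule this sounds like, it would be @typescript-eslint/no-unnecessary-type-assertion, enabled as "error" with type-aware linting; the exact rule isn't named in the call. A sketch of the kind of cast it flags:)

```typescript
// With the (assumed) rule enabled, assertions that don't change the
// expression's type are reported.
function getName(user: {name: string}): string {
  // `user.name` is already `string`, so this assertion is a no-op
  // and would be flagged as an unnecessary type assertion.
  return user.name as string;
}
```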
- And in addition, I was doing research on MetaMask snaps
- for the future target of integrating Prover
- with MetaMask.
- It turns out that MetaMask snaps may not be the right
- framework or architecture for integrating the Prover
- because as per their documentation and architecture,
- they suggest that there should not be a long running process
- in the snaps.
- On the other hand, we need to run a light client
- that should keep running always.
- So we have to figure out a way of using snaps
- or if there's another way to do it, but I'm not sure yet.
- So I'm doing a bit research on this topic further.
- And yeah, next week I will continue the research
- on this topic and hopefully when v1.9 is released
- then we will make the Prover package public as well.
- And then I will update the Light Client demo that we have
- to use the Prover package instead of having this
- boilerplate code from the Light Client package itself.
- Yeah, that's all from me.
- Awesome, thanks, Nazar.
- If that's the case, maybe we should escalate a little bit further and maybe just talk to
- the MetaMask guys directly.
- We should have a channel in Slack with them to sort of figure this out.
- Yeah.
- If you don't mind, let me
- complete the research on our side, then we can talk to them.
- Because I foresee that the first thing they're going to suggest is to use the snaps.
- Because that's what they developed to extend the behavior of the MetaMask.
- But in our case, we have to think how the snaps can fit in our use case.
- Right.
- I will update you about my findings by tomorrow, hopefully.
- Okay, thank you.
- Cool.
- And, Lain, if you have any additional points you want to add.
- Otherwise, if you're good, I think that about covers it for today.
- Okay, cool.
- Thanks, guys.
- I'll get a summary of the notes out in a bit today and have a good week.
- And we'll talk to you on Discord.
- >> Sounds good.
- >> Thanks.
- >> Bye-bye.
- >> Thank you.
- >> Bye-bye.