SatoshiHunter

Untitled

Mar 6th, 2021 (edited)
On Tue, Nov 19, 2019 at 3:15 AM Xxxxx wrote:

Hey xxx,

I've run a bunch of analyses and, generally speaking, the results aren't super clear. I can't rule out that they're the same person. Some of the telltale patterns of a unique author are there -- the scores for things like Analytic, Clout, and Authentic are strikingly consistent (same with things like article use, auxiliary verbs, and 1st-person singular pronouns). However, there are some other language categories that I'd expect to be really consistent as well that aren't.

In looking at phrases/vocabulary, there's also a pretty wide gap between the two papers -- less overlap in 2- and 3-grams than I'd generally expect if they're the same person, but a lot of this could also be driven by how the content area of crypto has changed over the years. I'm also not at all well-versed in the crypto world, so I couldn't say how much of this could be driven by differences in how the two technologies shape the way they're described in the papers.

Readability and lexical richness scores are fairly consistent between the two texts, especially when stacked against some other scientific papers / types of documents. But, even here, there are some metrics that I'd expect to be basically identical between the two papers that show a pretty clear divergence. This is always a major difficulty -- it's easy to focus on the many statistical similarities between the two to support a "same author" conclusion, but it's important to weigh the evidence to the contrary as well.

tl;dr: I can't rule out the possibility that it's the same person, but it's definitely not a resounding "yes, same person" conclusion either. Forum posts might help disambiguate a bit more, especially if there are a good number (e.g., 30 or 40 posts) of a good length (at least 50-100 words apiece, longer is better) for each person. Authorship attribution on shorter/fewer texts is generally an unsolved problem in the field. But, if that type of data exists, more is always better, and throwing in a few people who definitely aren't the same person is helpful as well.

Best,
Xxxx
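
For anyone wanting to see what the 2- and 3-gram overlap and lexical richness comparisons mentioned above might look like in practice, here is a minimal Python sketch. It is only an illustration under stated assumptions: the email does not say which metrics or tools were actually used, so Jaccard overlap on word n-grams and a plain type-token ratio are stand-ins, and the function names (`ngrams`, `ngram_overlap`, `type_token_ratio`) are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(text, n):
    """Return the set of word n-grams (as tuples) found in the text."""
    words = tokenize(text)
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def ngram_overlap(text_a, text_b, n):
    """Jaccard overlap of the two texts' n-gram sets (assumed metric).

    Returns a value in [0, 1]; 0.0 if neither text has any n-grams.
    """
    a, b = ngrams(text_a, n), ngrams(text_b, n)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def type_token_ratio(text):
    """Crude lexical richness: distinct words divided by total words."""
    words = tokenize(text)
    return len(set(words)) / len(words) if words else 0.0
```

A score like this is sensitive to topic as well as author, which matches the caveat in the email: two papers on different technologies can share few 2- and 3-grams even when written by the same person. Type-token ratio also falls as texts get longer, so it is only comparable across texts of similar length.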