Content of https://news.ycombinator.com/item?id=44312324

a guest
Jun 19th, 2025
neuml/paperai "indexes databases previously built with paperetl" and does RAG with txtai; https://github.com/neuml/paperai :

> paperai is a combination of a txtai embeddings index and a SQLite database with the articles. Each article is parsed into sentences and stored in SQLite along with the article metadata. Embeddings are built over the full corpus.

paperai has a YAML report definition schema that's probably useful for meta-analyses.

paperetl can store articles with SQLite, Elasticsearch, JSON, YAML. It doesn't look like markdown from a tagged git repo is supported yet. Supported inputs include PDF, XML (arXiv, PubMed, TEI), CSV.

PaperQA2 has a CLI: https://github.com/Future-House/paper-qa#what-is-paperqa2 :

> PaperQA2 is engineered to be the best agentic RAG model for working with scientific papers.

> [ Semantic Scholar, CrossRef, ]

paperqa-zotero: https://github.com/lejacobroy/paperqa-zotero

The Oracle of Zotero is a fork of paper-qa with FAISS and langchain: https://github.com/Frost-group/The-Oracle-of-Zotero
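
For anyone wondering what the paperai layout quoted above (sentences in SQLite next to the article metadata, plus an index over the full corpus) looks like in practice, here's a rough stdlib-only sketch. This is not paperai's or txtai's code: the table names, `build`, `search`, and the bag-of-words `vectorize` are all made up for illustration, and the toy cosine similarity is just a stand-in for real txtai embeddings.

```python
import math
import re
import sqlite3
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter (stand-in for a real NLP tokenizer)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def vectorize(text):
    """Toy bag-of-words vector; a real setup would use learned embeddings."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build(articles):
    """Store (article_id, title, text) tuples sentence by sentence.

    Hypothetical schema: one row per article, one row per sentence.
    Returns the SQLite connection and an in-memory sentence index.
    """
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE articles (id TEXT PRIMARY KEY, title TEXT)")
    db.execute(
        "CREATE TABLE sentences ("
        "id INTEGER PRIMARY KEY, article TEXT REFERENCES articles(id), text TEXT)"
    )
    for article_id, title, text in articles:
        db.execute("INSERT INTO articles VALUES (?, ?)", (article_id, title))
        for sentence in split_sentences(text):
            db.execute(
                "INSERT INTO sentences (article, text) VALUES (?, ?)",
                (article_id, sentence),
            )
    db.commit()
    # The index is built over every sentence in the corpus.
    index = {
        row_id: vectorize(text)
        for row_id, text in db.execute("SELECT id, text FROM sentences")
    }
    return db, index


def search(db, index, query, limit=3):
    """Rank sentences against the query, then join back to article metadata."""
    qv = vectorize(query)
    ranked = sorted(index, key=lambda i: cosine(qv, index[i]), reverse=True)[:limit]
    return [
        db.execute(
            "SELECT a.title, s.text FROM sentences s "
            "JOIN articles a ON a.id = s.article WHERE s.id = ?",
            (row_id,),
        ).fetchone()
        for row_id in ranked
    ]
```

The point of the split is that the index only has to return sentence ids; titles and other metadata come from a plain SQL join, so the metadata can grow without touching the index.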