In this blog post we discuss how the transformer architecture naturally extends over external memories, and share empirical results that leverage this capability to succeed where retrieval-augmented generation (RAG) has struggled. These methods are innate to the architecture (they require no fine-tuning) and outperform popular RAG methods.
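To make the idea of "extending attention over an external memory" concrete, here is a minimal, hypothetical sketch: standard scaled dot-product attention where the key/value tensors are simply concatenated with key/value vectors drawn from an external store. This is an illustrative assumption about the general technique, not the specific method evaluated in this post; names such as `attend_with_external_memory`, `memory_k`, and `memory_v` are invented for the example.

```python
# Hypothetical sketch: one attention step whose keys/values are extended
# with entries from an external memory. Interface names are illustrative.
import torch
import torch.nn.functional as F

def attend_with_external_memory(q, k, v, memory_k, memory_v):
    """Scaled dot-product attention over local context plus external memory.

    q:                  (batch, n_queries, d)
    k, v:               (batch, n_ctx, d)   -- keys/values from the local context
    memory_k, memory_v: (batch, n_mem, d)   -- keys/values from the external memory
    """
    # Treat memory entries exactly like extra context tokens.
    k_ext = torch.cat([k, memory_k], dim=1)
    v_ext = torch.cat([v, memory_v], dim=1)
    scores = q @ k_ext.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    weights = F.softmax(scores, dim=-1)
    return weights @ v_ext

# Toy usage with random tensors.
if __name__ == "__main__":
    b, d = 1, 64
    q = torch.randn(b, 4, d)
    k, v = torch.randn(b, 16, d), torch.randn(b, 16, d)
    mem_k = torch.randn(b, 128, d)  # e.g., keys precomputed from documents
    mem_v = torch.randn(b, 128, d)
    out = attend_with_external_memory(q, k, v, mem_k, mem_v)
    print(out.shape)  # torch.Size([1, 4, 64])
```

The point of the sketch is only that no new machinery is needed: because attention already mixes information across arbitrary key/value pairs, an external memory can be attended over without fine-tuning.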