Ravi Theja Sep 21, 2023

LlamaIndex Update — 20/09/2023

Hello LlamaIndex Enthusiasts!

Welcome to the fifth edition of our LlamaIndex Update series.

Most Important Takeaways:

  1. We’ve open-sourced SECInsights.ai, your gateway to building production-grade RAG applications.
  2. Replit templates — kickstart your projects with zero environment setup hassles.
  3. Build RAG from scratch and get hands-on with our processes.

But wait, there’s more!

  • Feature Releases and Enhancements
  • Fine-Tuning Guides
  • Retrieval Tips for RAG
  • Building RAG from Scratch Guides
  • Tutorials
  • Integration with External Platforms
  • Events
  • Webinars

So, let’s embark on this journey together. Dive in and explore the offerings of the fifth edition of the LlamaIndex Update series!

Feature Releases and Enhancements

  1. Open-Sourced RAG Platform: LlamaIndex open-sourced http://secinsights.ai, accelerating RAG app development with chat-based Q&A features. Tweet
  2. Linear Adapter Fine-Tuning: LlamaIndex enables efficient fine-tuning of linear adapters on any embedding without re-embedding, enhancing retrieval/RAG across various models. Tweet, Docs, BlogPost
  3. Hierarchical Agents: By structuring LLM agents in a parent-child hierarchy, we enhance complex search and retrieval tasks across diverse data, offering more reliability than a standalone agent. Tweet
  4. SummaryIndex: We’ve renamed ListIndex to SummaryIndex to make it clearer what its main functionality is. Backward compatibility is maintained for existing code using ListIndex. Tweet
  5. Evaluation: LlamaIndex’s new RAG evaluation toolkit offers async capabilities, diverse assessment criteria, and a centralized BaseEvaluator for easier developer integrations. Tweet, Docs.
  6. Hybrid Search for Postgres/pgvector: LlamaIndex introduces a hybrid search for Postgres/pgvector. Tweet, Docs.
  7. Replit Templates: LlamaIndex partners with Replit to offer easy LLM app templates, including ready-to-use Streamlit apps and full TypeScript templates. Tweet, Replit Templates.
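The linear adapter fine-tuning in item 2 boils down to learning a matrix applied to query embeddings only, so the document embeddings never need to be recomputed. Here is a minimal pure-Python sketch of that idea with toy 2-D embeddings; it is illustrative only and is not the LlamaIndex API.

```python
# Toy sketch of linear adapter fine-tuning: learn a matrix W applied to
# QUERY embeddings only, leaving stored document embeddings untouched.

def matvec(W, v):
    return [sum(W[i][j] * v[j] for j in range(len(v))) for i in range(len(W))]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_adapter(pairs, dim, lr=0.1, epochs=200):
    # W starts as the identity, so the adapter initially changes nothing.
    W = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for _ in range(epochs):
        for q, d in pairs:  # (query_embedding, positive_doc_embedding)
            Wq = matvec(W, q)
            # Gradient of ||Wq - d||^2 with respect to W[i][j] is 2*(Wq[i]-d[i])*q[j].
            for i in range(dim):
                for j in range(dim):
                    W[i][j] -= lr * 2.0 * (Wq[i] - d[i]) * q[j]
    return W

# Hypothetical (query, relevant-document) embedding pairs.
pairs = [([1.0, 0.0], [0.0, 1.0]), ([0.0, 1.0], [1.0, 0.0])]
W = train_adapter(pairs, dim=2)
q, d = pairs[0]
# The adapted query should now sit closer to its relevant document.
print(sq_dist(matvec(W, q), d) < sq_dist(q, d))
```

Because only the adapter is trained, this works on top of any embedding model, including closed-source ones.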


LlamaIndex.TS:

  1. Launches with MongoDBReader and type-safe metadata. Tweet.
  2. Launches with chat history, enhanced keyword index, and Notion DB support. Tweet.

Fine-Tuning Guides:

  1. OpenAI Fine-Tuning: LlamaIndex unveils a fresh guide on harnessing OpenAI fine-tuning to embed knowledge from any text corpus. In short: generate QA pairs with GPT-4, format them into a training dataset, and proceed to fine-tuning. Tweet, Docs.
  2. Embedding Fine-Tuning: LlamaIndex now offers a more advanced embedding fine-tuning feature, enabling complex neural-network query transformations on top of any embedding model, including custom ones, with the ability to save intermediate checkpoints for greater control over training. Tweet, Docs.
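The workflow in the OpenAI fine-tuning guide ends with formatting QA pairs into OpenAI's chat-style JSONL training file. A minimal sketch of that formatting step, with hypothetical QA pairs standing in for ones generated from your corpus with GPT-4:

```python
import json

# Stand-in QA pairs; in the guide these are generated from your corpus with GPT-4.
qa_pairs = [
    ("What does RAG stand for?", "Retrieval-Augmented Generation."),
    ("What index replaced ListIndex?", "SummaryIndex."),
]

def to_finetune_records(pairs, system_msg="You answer questions about the corpus."):
    # OpenAI chat fine-tuning expects one {"messages": [...]} object per example.
    records = []
    for question, answer in pairs:
        records.append({
            "messages": [
                {"role": "system", "content": system_msg},
                {"role": "user", "content": question},
                {"role": "assistant", "content": answer},
            ]
        })
    return records

# Each record becomes one line of the JSONL training file.
jsonl = "\n".join(json.dumps(r) for r in to_finetune_records(qa_pairs))
print(jsonl.splitlines()[0])
```

The resulting file is what you upload to the fine-tuning API; LlamaIndex's abstractions wrap these steps for you.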

Retrieval Tips For RAG:

  • Use references (smaller chunks or summaries) that point back to the source text, instead of embedding the full text directly.
  • This results in a 10–20% improvement in retrieval quality.
  • Embeddings are decoupled from the main text chunks.
  • Smaller references allow the LLM to synthesize over compact context efficiently.
  • Deduplication is applied when multiple references point to the same chunk.
  • Evaluated on a synthetic dataset, this gave a 20–25% MRR boost.
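The reference-based retrieval pattern above can be sketched in a few lines of pure Python. This is a conceptual toy (word-overlap scoring stands in for embedding similarity, and all names are illustrative), not the LlamaIndex implementation:

```python
# Sketch of "retrieve by small references, synthesize from parent chunks".

def score(query, text):
    # Toy similarity: word overlap. A real system compares embeddings.
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t)

parent_chunks = {
    "c1": "Full text of chunk one about vector databases and indexing ...",
    "c2": "Full text of chunk two about fine-tuning embedding models ...",
}

# Small references (summaries / sub-chunks), each linked to a parent chunk.
references = [
    {"text": "vector databases", "parent": "c1"},
    {"text": "indexing strategies", "parent": "c1"},
    {"text": "fine-tuning embeddings", "parent": "c2"},
]

def retrieve(query, top_k=2):
    ranked = sorted(references, key=lambda r: score(query, r["text"]), reverse=True)
    seen, parents = set(), []
    for ref in ranked[:top_k]:
        # Deduplicate: many references may point at the same parent chunk.
        if ref["parent"] not in seen:
            seen.add(ref["parent"])
            parents.append(parent_chunks[ref["parent"]])
    return parents

print(retrieve("vector databases indexing"))
```

Only the small references are embedded and matched; the full parent chunks are fetched afterwards for synthesis, and duplicates collapse to a single chunk.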


Building RAG from Scratch Guides:

  1. Build Data Ingestion from scratch. Docs.
  2. Build Retrieval from scratch. Docs.
  3. Build Vector Store from scratch. Docs.
  4. Build Response Synthesis from scratch. Docs.
  5. Build Router from scratch. Docs.
  6. Build Evaluation from scratch. Docs.
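In the spirit of these guides, the core of a vector store plus retriever fits in a few lines. A minimal sketch with hand-made toy embeddings (class and variable names are illustrative, not from the docs):

```python
import math

# Minimal in-memory vector store + top-k retriever built from scratch.

class TinyVectorStore:
    def __init__(self):
        self.items = []  # list of (embedding, text) pairs

    def add(self, embedding, text):
        self.items.append((embedding, text))

    def query(self, embedding, top_k=1):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb)
        # Rank stored texts by cosine similarity to the query embedding.
        ranked = sorted(self.items, key=lambda it: cosine(it[0], embedding),
                        reverse=True)
        return [text for _, text in ranked[:top_k]]

store = TinyVectorStore()
store.add([1.0, 0.0, 0.1], "Doc about retrieval.")
store.add([0.0, 1.0, 0.1], "Doc about synthesis.")
print(store.query([0.9, 0.1, 0.0], top_k=1))  # → ['Doc about retrieval.']
```

The guides layer ingestion, response synthesis, routing, and evaluation around exactly this kind of core.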


Tutorials

  1. Wenqi Glantz’s tutorial on Fine-Tuning a GPT-3.5 RAG Pipeline with GPT-4 Training Data, using LlamaIndex’s fine-tuning abstractions.
  2. Wenqi Glantz’s tutorial on Fine-Tuning Your Embedding Model to Maximize Relevance Retrieval in a RAG Pipeline with LlamaIndex.

Integrations with External Platforms

  1. Integration with PortkeyAI: LlamaIndex integrates with PortkeyAI, boosting LLM providers like OpenAI with features like auto fallbacks and load balancing. Tweet, Documentation
  2. Collaboration with Anyscale: LlamaIndex collaborates with Anyscale, enabling easy tuning of open-source LLMs using Ray Serve/Train. Tweet, Documentation
  3. Integration with Elastic: LlamaIndex integrates with Elastic, enhancing capabilities such as vector search, text search, hybrid search models, enhanced metadata handling, and es_filters. Tweet, Documentation
  4. Integration with MultiOn: LlamaIndex integrates with MultiOn, enabling data agents to navigate the web and handle tasks via an LLM-designed browser. Tweet, Documentation
  5. Integration with Vectara: LlamaIndex collaborates with Vectara to streamline RAG processes from loaders to databases. Tweet, Blog Post
  6. Integration with LiteLLM: LlamaIndex integrates with LiteLLM, offering access to over 100 LLM APIs and features like chat, streaming, and async operations. Tweet, Documentation
  7. Integration with MonsterAPI: LlamaIndex integrates with MonsterAPI, allowing users to query data using LLMs like Llama 2 and Falcon. Tweet, Blog Post


Events

  1. Jerry Liu spoke on Production-Ready LLM Applications at the Arize AI event.
  2. Ravi Theja conducted a workshop at the LlamaIndex + Replit Pune Generative AI meetup.
  3. Jerry Liu’s session on Building a Lending Criteria Chatbot in Production with Stelios from MQube.


Webinars

  1. Webinar on How to Win an LLM Hackathon, with Alex Reibman, Rahul Parundekar, Caroline Frasca, and Yi Ding.
  2. Webinar on LLM Challenges in Production, with Mayo Oshin, AI Jason, and Dylan.