LlamaIndex • 2024-01-02

LlamaIndex Newsletter 2024-01-02

Hello, Llama Lovers 🦙,

Happy New Year! As we step into 2024, we’re thrilled to bring you a special edition of our newsletter, packed with updates from the last two weeks of 2023. This edition is brimming with the latest features, community demos, courses, insightful tutorials, guides, and webinars that we’ve curated for you.

Have you been working on an interesting project, written an engaging article, or created a video? We can’t wait to hear about it! Please share your work with us at news@llamaindex.ai. Don’t forget to subscribe to our newsletter via our website to receive all these exciting updates directly in your inbox.

🤩 First, the highlights:

LLMCompiler Implementation: A SOTA agent implementation for faster, more efficient handling of complex queries. Notebook, Tweet.
MultiDocAutoRetrieverPack: A RAG template for structured retrieval and dynamic responses to large documents and metadata. Tweet, LlamaPack.
Structured Hierarchical RAG: New RAG technique for optimized retrieval over multiple documents, ensuring precise, relevant responses. Docs, Tweet.
Custom Agents: A simple abstraction for custom agent reasoning loops, enabling easy integration with RAG, SQL, and other systems, and enhancing response refinement for complex queries. Docs, Tweet.
New lower-level agent API: For enhanced transparency, debuggability, and control, supporting step-wise execution and task modification. Docs, Tweet.

✨ Feature Releases and Enhancements:

We have introduced a simple abstraction for building custom agent reasoning loops that go beyond prepackaged frameworks like ReAct. It integrates easily with RAG, SQL, or other systems. We also demonstrated how to build an agent with retry logic for routers, which improves its ability to handle complex, multi-part questions and refine query responses. Docs, Tweet.
We have implemented the LLMCompiler project, a SOTA agent framework enabling DAG-based planning and parallel function execution. This surpasses traditional sequential methods in speed, allowing for quicker and more efficient handling of complex queries in any LLM and data pipeline. Notebook, Tweet.
We have introduced MultiDocAutoRetrieverPack, a RAG template for efficiently handling large documents and metadata, offering structured retrieval and dynamic responses tailored to specific queries. Tweet, LlamaPack.
We have introduced a Structured Hierarchical RAG technique, optimizing RAG over multiple documents. It involves modeling documents as structured metadata for auto-retrieval, indexed in a vector database. This method dynamically selects documents based on inferred properties and performs recursive retrieval within each document for precise, relevant responses in your RAG pipeline. Docs, Tweet.
We have launched a new feature for advanced RAG that allows step-wise feedback for complex query executions, improving interpretability and control. This is particularly beneficial for weaker models that struggle with multi-part tasks. We also introduced a step-by-step chat interface for enhanced user interaction and control. Notebook, Tweet.
We have integrated with OpenRouterAI, offering a unified API for easy LLM access, cost efficiency, and reliable fallback options. OpenRouterAI allows users to compare costs, latency, and throughput for various models, like mixtral-8x7b, directly on their platform. Notebook, Tweet.
We have introduced a new lower-level agent API that enhances transparency, debuggability, and control. This API allows for granular control over agents, decouples task creation from execution, and supports step-wise execution. It also enables viewing each step, upcoming steps, and soon, modifying intermediate steps with human feedback. Docs, Tweet.
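To make the DAG-based planning and parallel function execution behind the LLMCompiler item above more concrete, here is a minimal sketch using only Python's standard library. It is not the LlamaIndex or LLMCompiler API: the `PLAN` format, the `lookup` and `subtract` tools, and `run_plan` are all hypothetical names made up for illustration, and the plan is assumed to be acyclic.

```python
import asyncio


async def lookup(key: str) -> int:
    """Hypothetical slow tool call, standing in for a retrieval query."""
    await asyncio.sleep(0.1)
    return {"revenue_2022": 100, "revenue_2023": 130}[key]


async def subtract(later: int, earlier: int) -> int:
    """Hypothetical tool that combines the results of two earlier tasks."""
    return later - earlier


# Hypothetical plan format:
# task id -> (tool, literal args, ids of tasks whose results are appended as args)
PLAN = {
    "t1": (lookup, ["revenue_2023"], []),
    "t2": (lookup, ["revenue_2022"], []),
    "t3": (subtract, [], ["t1", "t2"]),
}


async def run_plan(plan):
    """Run every task as soon as its dependencies finish (plan must be acyclic)."""
    tasks = {}

    async def run_task(task_id):
        tool, args, deps = plan[task_id]
        # Wait only for this task's own dependencies, not for the whole plan.
        dep_results = [await tasks[dep] for dep in deps]
        return await tool(*args, *dep_results)

    # Schedule everything up front; independent tasks (t1, t2) overlap in time.
    for task_id in plan:
        tasks[task_id] = asyncio.create_task(run_task(task_id))

    results = await asyncio.gather(*tasks.values())
    return dict(zip(tasks, results))


if __name__ == "__main__":
    print(asyncio.run(run_plan(PLAN)))
```

In this sketch the two `lookup` calls have no dependencies, so they run concurrently, while `subtract` starts only once both have finished. A sequential loop would instead wait for each call in turn.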

👀 Community Demos:

Automated LeetCode Crash Course: This project integrates advanced ML with traditional algorithms to streamline LeetCode study for technical interviews. It involves extracting and summarizing LeetCode problems using an LLM, organizing these summaries in a vector store, and employing scikit-learn for clustering. Blog, Code.
RAG Assisted Auto Developer: A project by Ocean Li for building a devbot that understands and writes code. It integrates various tools: LlamaIndex for indexing codebases, Autogen / OpenAI Code Interpreter for code writing and testing, and lionagi.ai for orchestration. Notebook.

📚 Courses:

We’ve partnered with ActiveLoop AI to provide a free course on retrieval-augmented generation for production, featuring 33 lessons, 7 hands-on assignments, and a certification upon completion.
Beginner-friendly course from IBM Skills Network on using LlamaIndex with IBM Watsonx to create effective product recommendations.

🗺️ Guides:

Guide to Semi-Structured Image QA with Gemini: Learn to extract data from unlabeled images and query it, using multi-modal models and advanced retrieval techniques, as demonstrated with the SROIE v2 dataset which contains images of receipts/invoices.
Guide to Advanced RAG Concepts: A comprehensive survey by Ivan Ilin, covering twelve core concepts including chunking, hierarchical indexing, query rewriting, and more. Each section provides resources and guides from our system for deeper understanding and practical application.
Guide to Building Hybrid Search: Learn to build hybrid search for RAG from scratch. The process involves generating sparse vectors, fusing sparse and dense queries, and implementing this with the Qdrant vector database for effective RAG integration.
Guide to Building Structured Retrieval with LLMs: Set up auto-retrieval in a Pinecone vector database, monitor prompts with Arize AI Phoenix, and tailor prompts for specific queries to enhance your document handling and structured data analysis.
Guide on Evaluating LLM Evaluators: Our new evaluation method and dataset bundle are designed to benchmark LLMs as evaluators against human annotations. This involves comparing LLM judge predictions (scores from 1 to 5) with ground-truth judgments, using metrics like correlation, Hamming distance, and agreement rate.
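As a rough illustration of the evaluator metrics named in the last guide, the sketch below compares a list of LLM judge scores with human scores. The definitions are assumptions, not necessarily the ones used in the guide: correlation is taken to be Pearson correlation, Hamming distance the number of examples where the two scores differ, and agreement rate the fraction of exact matches. The function names and sample scores are made up for illustration.

```python
from math import sqrt


def _check(xs, ys):
    if len(xs) != len(ys) or not xs:
        raise ValueError("score lists must be non-empty and of equal length")


def pearson_correlation(xs, ys):
    """Pearson correlation (assumed definition); needs non-constant score lists."""
    _check(xs, ys)
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def hamming_distance(xs, ys):
    """Number of examples where judge and human scores differ (assumed definition)."""
    _check(xs, ys)
    return sum(x != y for x, y in zip(xs, ys))


def agreement_rate(xs, ys):
    """Fraction of examples where the two scores match exactly (assumed definition)."""
    _check(xs, ys)
    return sum(x == y for x, y in zip(xs, ys)) / len(xs)


# Made-up scores on a 1-5 scale, one pair per evaluated example.
human_scores = [5, 4, 1, 3, 2, 5]
judge_scores = [5, 3, 1, 3, 2, 4]

if __name__ == "__main__":
    print("correlation:", round(pearson_correlation(human_scores, judge_scores), 3))
    print("hamming distance:", hamming_distance(human_scores, judge_scores))
    print("agreement rate:", round(agreement_rate(human_scores, judge_scores), 3))
```

With these sample scores the judge disagrees with the human on two of six examples, so the Hamming distance is 2 and the agreement rate is 4/6, while the correlation stays high because both disagreements are off by only one point.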

✍️ Tutorials:

Ryan Nguyen tutorial on Processing Tables in RAG Pipelines with LlamaIndex and UnstructuredIO.
Wenqi Glantz tutorial on Safeguarding RAG Pipelines: A Step-by-Step Guide to Implementing Llama Guard with LlamaIndex.
Wenqi Glantz tutorial on 10+ Ways to Run Open-Source Models with LlamaIndex.
Jina AI tutorial on enhancing RAG applications by integrating Jina v2 embeddings with LlamaIndex and Mixtral LLM via Hugging Face.
Ankush Singal tutorial on Benchmarking RAG Pipelines With A Evaluation Pack in Forward-Looking Active Retrieval Augmented Generation (FLARE).
Laurie’s tutorial on Effortlessly Running Mistral AI’s Mixtral 8x7b: Learn to use Ollama with LlamaIndex for a one-line setup of a local, open-source retrieval-augmented generation app with an API, featuring Qdrant integration for vector storage.
Tomaz Bratanic tutorial on Multimodal RAG pipeline with LlamaIndex and Neo4j.
Sudarshan Koirala video tutorial on using Mistral API with LlamaIndex.
Chia Jeng Yang tutorial on Technical Considerations for Complex RAG.

🎥 Webinars:

Webinar with Google Developers on advanced RAG applications and multi-modal settings with Google Gemini.
Webinar with Jerry Liu and Louis-François on the Future of AI: LlamaIndex, LLMs, RAG, Prompting, and more.

🏢 Calling all enterprises:

Are you building with LlamaIndex? We are working hard to make LlamaIndex even more Enterprise-ready and have sneak peeks at our upcoming products available for partners. Interested? Get in touch.
