
Category: Use-Case

Due to the black-box nature of LLMs and the importance of the tasks they’re being trusted to handle, intelligent monitoring and optimization tools are essential to ensure they operate efficiently and effectively. The integration of Arize Phoenix with LlamaIndex’s newly released instrumentation module offers developers unprecedented power to fine-tune performance, diagnose issues, and enhance the…
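To make the integration concrete, here is a minimal sketch of tracing a LlamaIndex application with Phoenix. It assumes the documented `arize_phoenix` global handler and an illustrative `./data` directory of documents:

```python
import phoenix as px
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, set_global_handler

# Launch the local Phoenix collector and UI for inspecting traces.
session = px.launch_app()

# Route LlamaIndex's instrumentation events into Phoenix.
set_global_handler("arize_phoenix")

# From here on, any LlamaIndex workload is traced end to end.
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
print(query_engine.query("What does this corpus say about evaluation?"))
```

Each retrieval, synthesis, and LLM call from the query shows up as a span in the Phoenix UI, which is what enables the kind of performance tuning and issue diagnosis the article describes.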

Recently, I attended a workshop hosted by Arize AI’s Jason Lopatecki and Dat Ngo on large language model summarization, covering common challenges with the use case and how to evaluate generated summaries. Drawing from this session and additional research, this article dives into the concept of LLM summarization: why it is important, primary summarization…

This article is co-authored by Dustin Ngo. Large language model (LLM) applications are being deployed by an increasing number of companies to power everything from code generation to improved summarization of customer service calls. One area where LLMs with in-context learning show promise is text-to-SQL, or generating SQL queries from natural language. Achieving results is often…
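As a rough illustration of the technique, the sketch below prompts an OpenAI chat model with a table schema in context and asks it to translate a natural-language question into SQL. The `calls` schema, model choice, and question are all hypothetical placeholders:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Supplying the schema in-context lets the model ground its query
# in real table and column names.
SCHEMA = """CREATE TABLE calls (
    id INTEGER PRIMARY KEY,
    agent TEXT,
    duration_seconds INTEGER,
    satisfaction_score REAL
);"""

def text_to_sql(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,
        messages=[
            {"role": "system",
             "content": f"Translate the user's question into a single SQLite query.\nSchema:\n{SCHEMA}"},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(text_to_sql("Which agent has the longest average call duration?"))
```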

Recently, I attended a workshop organized by Arize AI titled “RAG Time! Evaluate RAG with LLM Evals and Benchmarking.” Hosted by Amber Roberts, ML Growth Lead at Arize AI, and Mikyo King, Head of Open Source at Arize AI, the talks provided valuable insights into an important field of study. Miss the event?…

This piece is co-authored by Roger Yang, Software Engineer at Arize AI. Observability in third-party large language models (LLMs) is largely approached with benchmarking and evaluations, since models like Anthropic’s Claude, OpenAI’s GPT models, and Google’s PaLM 2 are proprietary. In this blog post, we benchmark OpenAI’s GPT models with function calling and explanations against…
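For a flavor of what function-calling-based evaluation looks like, here is a minimal sketch of an LLM judge that labels document relevance. The function schema forces the model to return a structured label plus an explanation rather than free-form text; the `record_relevance` schema and prompts are illustrative, not the benchmark’s exact setup:

```python
import json
from openai import OpenAI

client = OpenAI()

# A function schema constrains the judge to a fixed output format:
# a binary label and a natural-language explanation.
EVAL_TOOL = {
    "type": "function",
    "function": {
        "name": "record_relevance",
        "description": "Record whether the document is relevant to the query.",
        "parameters": {
            "type": "object",
            "properties": {
                "label": {"type": "string", "enum": ["relevant", "irrelevant"]},
                "explanation": {"type": "string"},
            },
            "required": ["label", "explanation"],
        },
    },
}

def judge(query: str, document: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": f"Query: {query}\nDocument: {document}\n"
                       "Is the document relevant to answering the query?",
        }],
        tools=[EVAL_TOOL],
        tool_choice={"type": "function", "function": {"name": "record_relevance"}},
    )
    call = response.choices[0].message.tool_calls[0]
    return json.loads(call.function.arguments)

print(judge("How do I reset my password?", "Step-by-step instructions for changing an account password."))
```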