TNS

Pulumi Templates for GenAI Stacks: Pinecone, LangChain First

The AI professional, who may not have operations experience, can define and orchestrate an ML stack, using Python or another language of their choice.  
Feb 21st, 2024 9:00am by

To build a generative AI application, you typically need at least two components to start: a Large Language Model (LLM) and a vector data store. You probably need some sort of frontend component as well, such as a chatbot.

Organizations jumping into the GenAI space now face an orchestration challenge: moving these components from the developer’s laptop to the production environment can be error-prone and time-consuming.

To ease deployments, Infrastructure as Code (IaC) software provider Pulumi has introduced “providers,” or templates, for two essential GenAI tools, namely the Pinecone vector database and the LangChain framework for building LLMs.

“We find a lot of the tools out there, like LangChain, are great for local development. But then when you want to go into production, it’s left as a DIY exercise,” said Joe Duffy, CEO and co-founder of Pulumi, in an interview with TNS. “And it’s very challenging because you want to architect for infinite scale so that as you see success with your application, you’re able to scale to meet that demand. And that’s not very easy to do.”

Specifically, Pulumi is supporting the serverless version of Pinecone on AWS, which was unveiled in January, and support for LangChain comes through provisioning an Amazon ECS cluster with LangServe running as a service.
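A Pulumi program for the Pinecone piece might look something like the sketch below. The module and class names (`pulumi_pinecone`, `PineconeIndex`, and the spec arguments) are assumptions based on Pulumi's community Pinecone provider and may differ from the template's exact API; the cloud and region values are placeholders.

```python
import pulumi
from pulumi_pinecone import PineconeIndex, PineconeSpec, PineconeServerlessSpec

# Declare a serverless Pinecone index on AWS; Pulumi's engine
# provisions it when the stack is deployed with `pulumi up`.
index = PineconeIndex(
    "genai-index",
    name="genai-index",
    metric="cosine",
    spec=PineconeSpec(
        serverless=PineconeServerlessSpec(
            cloud="aws",          # serverless Pinecone currently runs on AWS
            region="us-west-2",   # placeholder region
        ),
    ),
)

# Expose the index host as a stack output for the application to consume.
pulumi.export("index_name", index.name)
```

The point of the template is that this declaration lives alongside the rest of the stack (the ECS cluster, the LangServe service), so the whole architecture deploys as one unit.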

The two templates join a portfolio that covers over 150 cloud and SaaS service providers, including many others used in the GenAI space, such as Vercel Next.js for the frontend and Apache Spark.

In addition to the templates themselves, Pulumi also mapped out a set of reference architectures that use Pinecone and LangChain.

How to Build a GenAI Stack Using IaC

The idea is that the AI professional, who may not have operations experience, can define and orchestrate an ML stack with Pulumi, using Python or another language.

As an IaC solution, Pulumi provides a way to declaratively define an infrastructure. Unlike other IaC approaches, Pulumi allows developers to build out their environment using any one of a number of programming languages, such as Python, Go, Java and TypeScript.
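In practice, a minimal Pulumi program looks like ordinary Python, following the pattern from Pulumi's own getting-started examples (here declaring an AWS S3 bucket; the resource name is illustrative):

```python
import pulumi
import pulumi_aws as aws

# Declaring a resource is just instantiating an object; Pulumi's engine
# diffs this desired state against the stack's current state.
bucket = aws.s3.Bucket("genai-artifacts")

# Stack outputs make provisioned values available to other tools and stacks.
pulumi.export("bucket_name", bucket.id)
```

Because it is a real program, the developer can use loops, functions and packages to parameterize the infrastructure rather than templating a markup language.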

The deployment engine then provisions the defined environment, and even checks to ensure that the operational state stays in sync with the defined state.
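That workflow maps onto a handful of CLI commands (shown here as a sketch; each requires cloud credentials and an initialized stack):

```shell
# Show the changes Pulumi would make, then apply them.
pulumi preview
pulumi up

# Re-read the actual cloud state and reconcile it with Pulumi's
# recorded state, surfacing any out-of-band drift.
pulumi refresh
```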

The GenAI reference architectures have been designed with best practices in mind, Duffy said. “A lot of the challenge is how to make this scalable, scalable across regions and scalable across the subnets, and networks. And so this blueprint is built for configurable scale.”

This is not Pulumi’s first foray into managing AI infrastructure. The company has already developed modules for AWS SageMaker and Microsoft’s Azure OpenAI service. There is also a blueprint for deploying an LLM from Hugging Face on Docker, Azure, or Runpod.

The company plans to further expand the roster.

“We’re seeing a lot of uptake in using Pulumi for these AI workloads,” Duffy said.

TNS owner Insight Partners is an investor in: Docker.