Node + React + Langchain Explainer: How to chat with your documents 😱

Build an app to chat with your documents using Langchain, Node, and React.


Building Large Language Models (LLMs) into your apps has never been easier. With frameworks like Langchain JS we can easily integrate LLMs into existing web applications. In this explainer we'll walk through how to build a fullstack LLM app using Node, React, and Langchain.

Setup & Chat

To save some time, let's clone this LLM React Node template and follow the install instructions. This template supports integrations with OpenAI and Hugging Face. Just make sure you set your API keys and ENABLED_MODEL_STORE in your .env file. In this tutorial we'll use OpenAI as our model.

git clone https://github.com/golivecosmos/llm-react-node-app-template.git
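Once cloned, your .env will contain something along these lines. ENABLED_MODEL_STORE comes straight from the template; the other variable name and the values shown here are illustrative, so double-check the template's README:

# .env (illustrative values; check the template's README for exact names and accepted values)
ENABLED_MODEL_STORE=OpenAI
OPENAI_API_KEY=sk-...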

Run yarn install to install the project's dependencies, then run yarn start-server to start your server. Verify that it's running by visiting http://localhost:3100/ in your browser. The server will return the following JSON:

{"message":"welcome to your LLM app server"}

You should now also be able to run yarn start to start the client. If everything is installed correctly, you can visit http://localhost:5173/ and try it out by uploading documents and asking questions. Don't forget to give your template some styling.

How does ingesting a document work?

Now that your application is running and answering your questions, let's walk through how this all works.

Let's give it a document to embed. (Side note: embeddings are numerical representations of your data; they're how LLMs process information.) You can start with one of the documents included in the project or bring your own. (Warning: testing with large documents may result in hanging queries; to upload larger documents, consider implementing a chunking strategy.)
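If you do want to handle bigger files, Langchain ships text splitters you could slot in before embedding. Here's a minimal sketch using RecursiveCharacterTextSplitter; it isn't part of the template, and the chunk sizes are just starting points:

import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';

const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,    // characters per chunk
  chunkOverlap: 100,  // overlap preserves context across chunk boundaries
});

// `docs` would be the array returned by a loader's load() call
const chunkedDocs = await splitter.splitDocuments(docs);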

Once you've selected a file, clicking the 'upload' button calls the /chat/ingest endpoint on your server and submits the file. In chat_handler.js, the ingestFile function first has to detect the file type, because different file formats require their own loaders.
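On the client side, the upload amounts to posting the selected file as multipart form data. Here's a rough sketch of what that request could look like; the field name and relative URL are assumptions, not the template's exact code:

// client-side sketch; 'file' and the relative URL are assumptions
const formData = new FormData();
formData.append('file', selectedFile);

const response = await fetch('/chat/ingest', { method: 'POST', body: formData });
const result = await response.json();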

The getFileLoader method is a simple switch statement that returns a Langchain loader object. For now we'll use three straightforward loaders: PDF, TXT, and CSV. Loaders provide a common interface for preparing our data into documents. Feel free to add your own loader; there are a ton to choose from in the Langchain docs.
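Conceptually, getFileLoader boils down to something like the sketch below. The import paths reflect Langchain JS at the time of writing and the switch keys are illustrative; the template's actual implementation may differ:

import { PDFLoader } from 'langchain/document_loaders/fs/pdf';
import { CSVLoader } from 'langchain/document_loaders/fs/csv';
import { TextLoader } from 'langchain/document_loaders/fs/text';

// return a loader that knows how to parse the uploaded file's format
const getFileLoader = (filePath, fileType) => {
  switch (fileType) {
    case 'pdf':
      return new PDFLoader(filePath);
    case 'csv':
      return new CSVLoader(filePath);
    case 'txt':
    default:
      return new TextLoader(filePath);
  }
};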

After we return the loader, we'll pass the result of loader.load() into a vector database. In this example we'll use HNSWLib. The HNSWLib.fromDocuments() method accepts the documents we just loaded and an embeddings function; we're using the OpenAIEmbeddings class available in Langchain. The resulting vector store can now reference the uploaded document. This is where the code might hang for a while if you're testing with files that are too big.
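Put together, the embedding step amounts to roughly the following sketch, assuming the classic Langchain JS import paths:

import { HNSWLib } from 'langchain/vectorstores/hnswlib';
import { OpenAIEmbeddings } from 'langchain/embeddings/openai';

// load the uploaded file into Langchain documents, then embed and index them in memory
const docs = await loader.load();
const vectorStore = await HNSWLib.fromDocuments(docs, new OpenAIEmbeddings());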

Lastly, we'll initialize a ContextualCompressionRetriever and a RetrievalQAChain. These are Langchain modules that help us manage where and how we process user inputs. (Note: there's a lot we're glossing over here; check out the retriever and chains documentation for more.)

One thing to note in ingestFile is that we assign the instance variables this.chain, this.vectorStore, and this.retriever here, because they aren't set when the OpenAIService is instantiated.
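The tail end of ingestFile therefore looks roughly like this. Treat it as a sketch of the idea rather than the template's exact code; this.model stands in for whatever LLM instance the service holds:

import { RetrievalQAChain } from 'langchain/chains';
import { ContextualCompressionRetriever } from 'langchain/retrievers/contextual_compression';
import { LLMChainExtractor } from 'langchain/retrievers/document_compressors/chain_extract';

// this.model is an assumption: the LLM instance created when the service was constructed
this.vectorStore = vectorStore;
this.retriever = new ContextualCompressionRetriever({
  baseCompressor: LLMChainExtractor.fromLLM(this.model),
  baseRetriever: this.vectorStore.asRetriever(),
});
this.chain = RetrievalQAChain.fromLLM(this.model, this.retriever);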

This takes care of generating embeddings for our document. Now let's take a look at how we actually submit queries to our model.

How does the app chat with your documents?

When a user enters a natural language query into the text bar, that query is routed to the /chat/ route on the server. Under the hood, the router calls startChat() in chat_handler.js, which is responsible for querying our LLM. Both model services currently supported in the template implement the call() method, so you can check out the Hugging Face implementation if you want to branch away from OpenAI.

Once a document has been uploaded, user queries get routed differently. Without a document, we use the ConversationChain with ConversationSummaryMemory to handle the query; after a document has been ingested, we invoke the RetrievalQAChain that was created in ingestFile(). That conditional logic is a shortcut to managing your chains and queries. In a production application you would want to compare the available chains to best suit your needs, and for more complex chains you'll want to learn about Agents.
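Stripped down, that routing decision might look something like the sketch below; see chat_handler.js for the real implementation, and note that the names here are illustrative:

import { ConversationChain } from 'langchain/chains';
import { ConversationSummaryMemory } from 'langchain/memory';

// created once when the service is constructed
this.conversationChain = new ConversationChain({
  llm: this.model,
  memory: new ConversationSummaryMemory({ llm: this.model }),
});

// method on the service class: prefer the retrieval chain once a document exists
async startChat ({ userInput }) {
  if (this.chain) {
    const { text } = await this.chain.call({ query: userInput });
    return text;
  }
  const { response } = await this.conversationChain.call({ input: userInput });
  return response;
}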

The response from call() (aka our LLM response) gets parsed and returned to the client for the user to see. You should be able to ask follow-up questions about the document until a new one is uploaded or the server is restarted. There are also plenty of pieces to tinker with: swapping out different types of memory (there are examples in the LLM React Node template; a quick sketch follows), using a different vector DB, using a Hugging Face model, or using different chains. The possibilities are endless.
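For instance, swapping the memory type is a small change. Here's a quick sketch that drops in Langchain's BufferMemory where the template uses ConversationSummaryMemory:

import { ConversationChain } from 'langchain/chains';
import { BufferMemory } from 'langchain/memory';

const conversation = new ConversationChain({
  llm: model,                  // your existing model instance
  memory: new BufferMemory(),  // keeps the full transcript instead of a rolling summary
});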

Next Steps

That's it! You've built an LLM-powered web app. Nice work. In this explainer we've walked through a simple Node and React app that lets users upload a document and ask questions about it. There is plenty more to dive into, and we'll be going into more detail in future posts.

If you found this helpful, please give us a ⭐ on GitHub. Curious for more? Follow us on X (Twitter) to continue the conversation.