I recently built my very first app using ChatGPT's API.

The thing about ChatGPT is that it's trained on information from the internet, but what if you want it to answer questions using your custom data? Like, let's say you have some data stored on Google Drive, and you want ChatGPT to answer questions based on that.

I ran into this super helpful post by Greg Bagues, explaining how to do just that using the Google Drive API, LangChain, and Python. I won't reproduce all the steps here, because he does a phenomenal job of explaining the concepts and the steps.

I did, however, run into some errors, which I want to document here.

Greg's instructions worked for the most part, except I ran into some credentials errors from the Google Drive API. LangChain expects the credentials.json file you download from Google's developer console to be located at ~/.credentials/credentials.json. In my case, putting the file there was not enough, and I had to manually set an environment variable using the export command (I found this fix in the GitHub repo issues for LangChain).

Here's the command I used to set the environment variable:

export GOOGLE_APPLICATION_CREDENTIALS="$HOME/.credentials/credentials.json"

You also need to set up the OpenAI API key as an environment variable:

export OPENAI_API_KEY="<YOUR OPENAI API KEY>"
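
Since both fixes come down to environment variables, a quick preflight check before running the script can save you a confusing stack trace. Here is a minimal sketch in Python that assumes only the two variable names above. The helper name check_environment is my own invention and is not part of LangChain or Google's libraries.

```python
import os


def check_environment(env=None):
    """Return a list of human-readable problems; an empty list means all good."""
    env = os.environ if env is None else env
    problems = []

    # Path to the credentials.json downloaded from Google's developer console.
    creds = env.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    if not creds:
        problems.append("GOOGLE_APPLICATION_CREDENTIALS is not set")
    elif not os.path.isfile(os.path.expanduser(creds)):
        # expanduser also copes with a value that still starts with a literal "~".
        problems.append(f"credentials file not found: {creds}")

    # The OpenAI key only needs to be present and non-blank for this check.
    if not env.get("OPENAI_API_KEY", "").strip():
        problems.append("OPENAI_API_KEY is not set")

    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("Problem:", problem)
```

Running it prints one line per problem and nothing at all when both variables look fine. It only checks that the credentials file exists, not that Google will accept it.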

Outside of that, Greg's instructions worked like a champ.

But it seems to work only for docs. What if I wanted to query a CSV?

This setup seems to work fine for documents, but when it comes to CSV files inside Google Drive, things get a little wonky. It looks like it only loads part of the CSV file, and the answers are not accurate. I will look into that in the next Mint!
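
To pin down how much of a CSV actually makes it through, one rough check is to compare a local copy of the file against whatever text the loader handed back. This is a minimal sketch, assuming you have the CSV contents and the loaded text as plain strings. missing_rows is a hypothetical helper of mine, and the crude substring match says nothing about how LangChain itself processes the file.

```python
import csv
import io


def missing_rows(csv_text, loaded_text):
    """Return the data rows whose first cell never appears in loaded_text."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    # A row counts as loaded if its first cell shows up anywhere in the text.
    return [row for row in reader if row and row[0] not in loaded_text]
```

It works best when the first column holds something unique per row, like an ID. If the list comes back long, that at least confirms the "only part of the file" hunch before digging any deeper.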

This was a pretty gratifying and exciting start. Definitely going to play around more with this.
