My LangChain

An exploration of LangChain.js to implement an LLM / ChatGPT application.
Getting started 🚀

  1. Clone this repo!
  2. Install dependencies: npm install
  3. Create a .env file with OPENAI_API_KEY='Your key here'
  4. Run the development server: npm run dev

Open http://localhost:3000 with your browser to see the result.

Experiment

This app is my experiment with LangChain.js: creating embeddings for a GPT/LLM over a specific knowledge area and application.

Embedding Function Using Pinecone / HNSWLib

  1. Write Next.js code for uploading a PDF file.
  2. LangChain.js uses the RecursiveCharacterTextSplitter to process the file into chunks.
  3. The processed chunks and the PDF are saved in local storage ("/folder").
  4. The user can choose to save to the Pinecone data store or to HNSWLib local storage.
  5. Write Next.js code to query the Pinecone data store or the HNSWLib store, and also OpenAI.
  1. Here's an example Next.js component for uploading a PDF file:
import { useState } from "react";

function UploadPDF() {
  const [file, setFile] = useState(null);

  const handleFileChange = (event) => {
    setFile(event.target.files[0]);
  };

  const handleSubmit = async (event) => {
    event.preventDefault();
    const formData = new FormData();
    formData.append("pdf", file);
    const response = await fetch("/api/upload-pdf", {
      method: "POST",
      body: formData,
    });
    const data = await response.json();
    console.log(data);
  };

  return (
    <form onSubmit={handleSubmit}>
      <input type="file" accept=".pdf" onChange={handleFileChange} />
      <button type="submit">Upload PDF</button>
    </form>
  );
}

export default UploadPDF;
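
The component above posts to /api/upload-pdf, which needs a server-side handler. Here's a minimal sketch of a matching Next.js API route, assuming the formidable package (v2) for multipart parsing; the route path and field name simply mirror the fetch call above, and the target directory follows the "/folder" convention from the list:

// pages/api/upload-pdf.js (hypothetical route matching the fetch above)
import formidable from "formidable";
import fs from "fs";

// Disable Next.js's default body parser so formidable can read the multipart stream
export const config = { api: { bodyParser: false } };

export default function handler(req, res) {
  const form = formidable();
  form.parse(req, (err, fields, files) => {
    if (err) {
      res.status(500).json({ error: "Upload failed" });
      return;
    }
    // Copy the uploaded PDF into the local /folder directory
    const pdf = files.pdf;
    fs.copyFileSync(pdf.filepath, `folder/${pdf.originalFilename}`);
    res.status(200).json({ name: pdf.originalFilename });
  });
}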

  2. Here's example code using LangChain.js to process the PDF into chunks with the RecursiveCharacterTextSplitter (a sketch using the library's PDFLoader, which needs the pdf-parse package installed):
const { PDFLoader } = require("langchain/document_loaders/fs/pdf");
const { RecursiveCharacterTextSplitter } = require("langchain/text_splitter");

// Load the PDF into LangChain Document objects (one per page)
const loader = new PDFLoader("path/to/pdf");
const docs = await loader.load();

// Split the documents into overlapping chunks for embedding
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const chunks = await splitter.splitDocuments(docs);
console.log(chunks);

  3. To save the processed chunks and the PDF file in local storage, you can use the fs module in Node.js:
const fs = require("fs");

// pdf is the raw PDF buffer (e.g. from the upload handler)
fs.writeFileSync("path/to/local/folder/pdf.pdf", pdf);
fs.writeFileSync("path/to/local/folder/chunks.json", JSON.stringify(chunks));

  4. To save the data in the Pinecone data store or HNSWLib local storage, you can use their respective APIs. The text chunks must be embedded before they can be indexed, so this sketch goes through LangChain's Pinecone wrapper with OpenAI embeddings (the index name and env vars are placeholders):
const { PineconeClient } = require("@pinecone-database/pinecone");
const { PineconeStore } = require("langchain/vectorstores/pinecone");
const { OpenAIEmbeddings } = require("langchain/embeddings/openai");

const pinecone = new PineconeClient();
await pinecone.init({
  apiKey: process.env.PINECONE_API_KEY,
  environment: process.env.PINECONE_ENVIRONMENT,
});
const pineconeIndex = pinecone.Index("my-index");

// Embed the chunks with OpenAI and upsert them into the index
await PineconeStore.fromDocuments(chunks, new OpenAIEmbeddings(), {
  pineconeIndex,
});

And here's example code for HNSWLib, again via LangChain's wrapper so the chunks are embedded on the way in:

const { HNSWLib } = require("langchain/vectorstores/hnswlib");
const { OpenAIEmbeddings } = require("langchain/embeddings/openai");

// Embed the chunks and build a local HNSW index on disk
const vectorStore = await HNSWLib.fromDocuments(chunks, new OpenAIEmbeddings());
await vectorStore.save("path/to/local/folder");

  5. To query the Pinecone data store or HNSWLib local storage, you can use their respective APIs. Here's example code for Pinecone:
const { PineconeClient } = require("@pinecone-database/pinecone");
const { PineconeStore } = require("langchain/vectorstores/pinecone");
const { OpenAIEmbeddings } = require("langchain/embeddings/openai");

const pinecone = new PineconeClient();
await pinecone.init({
  apiKey: process.env.PINECONE_API_KEY,
  environment: process.env.PINECONE_ENVIRONMENT,
});
const pineconeIndex = pinecone.Index("my-index");

// Open the existing index and run a similarity search (top 10 matches)
const store = await PineconeStore.fromExistingIndex(new OpenAIEmbeddings(), {
  pineconeIndex,
});
const results = await store.similaritySearch("your query here", 10);
console.log(results);

And here's example code for HNSWLib:

const { HNSWLib } = require("langchain/vectorstores/hnswlib");
const { OpenAIEmbeddings } = require("langchain/embeddings/openai");

// Load the saved index and run a similarity search (top 10 matches)
const vectorStore = await HNSWLib.load(
  "path/to/local/folder",
  new OpenAIEmbeddings()
);
const results = await vectorStore.similaritySearch("your query here", 10);
console.log(results);

To query OpenAI directly, you can use their API. Here's example code with the openai Node.js SDK (the v3-style client):

const { Configuration, OpenAIApi } = require("openai");

const openai = new OpenAIApi(
  new Configuration({ apiKey: process.env.OPENAI_API_KEY })
);

const prompt = "What is the meaning of life?";
const completion = await openai.createCompletion({
  model: "text-davinci-003",
  prompt,
  max_tokens: 10,
  n: 1,
  stop: "\n",
});
console.log(completion.data.choices[0].text);

Flow

  1. Chat Mode: Normal chat with the bot. You can select the bot type with an initial prompt. This is the default mode. Free and easy.
  2. Web Loader Mode: Load your reference web page. There is a text field for a web URL. When you submit a prompt with the URL filled in, the page is used as an embedding: get the embedding from OpenAI -> immediately get a response from OpenAI -> show a button asking whether to save this embedding in Pinecone. It is not saved by default. (A sketch of this flow follows the list.)
  3. PDF Loader Mode: Same as Web Loader Mode, except loading a PDF is more memory intensive and getting the embeddings takes longer. Because that is costly, the PDF embedding is saved in Pinecone by default.
  4. Agent Mode: Pick and choose the tools for the agent. Tools: [Pinecone Store, Calculator, Browser]. (See the agent sketch after this list.)
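
Here's a minimal sketch of the Web Loader flow, assuming LangChain's CheerioWebBaseLoader and an in-memory vector store; the URL, query, and variable names are illustrative:

const { CheerioWebBaseLoader } = require("langchain/document_loaders/web/cheerio");
const { MemoryVectorStore } = require("langchain/vectorstores/memory");
const { OpenAIEmbeddings } = require("langchain/embeddings/openai");
const { ChatOpenAI } = require("langchain/chat_models/openai");
const { RetrievalQAChain } = require("langchain/chains");

// Load and embed the reference page (embeddings come from OpenAI)
const loader = new CheerioWebBaseLoader("https://example.com/reference-page");
const docs = await loader.load();
const store = await MemoryVectorStore.fromDocuments(docs, new OpenAIEmbeddings());

// Answer the user's prompt against the page without persisting anything;
// saving the embedding to Pinecone would be a separate, explicit step
const chain = RetrievalQAChain.fromLLM(new ChatOpenAI(), store.asRetriever());
const answer = await chain.call({ query: "What does this page say about X?" });
console.log(answer.text);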
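
And a minimal sketch of Agent Mode with those three tools, assuming a vectorStore opened earlier (e.g. via PineconeStore.fromExistingIndex as above); the tool name, description, and input are illustrative:

const { initializeAgentExecutorWithOptions } = require("langchain/agents");
const { ChatOpenAI } = require("langchain/chat_models/openai");
const { OpenAIEmbeddings } = require("langchain/embeddings/openai");
const { Calculator } = require("langchain/tools/calculator");
const { WebBrowser } = require("langchain/tools/webbrowser");
const { ChainTool } = require("langchain/tools");
const { VectorDBQAChain } = require("langchain/chains");

const model = new ChatOpenAI({ temperature: 0 });
const embeddings = new OpenAIEmbeddings();

// Wrap the Pinecone-backed QA chain as a tool the agent can call
const pineconeTool = new ChainTool({
  name: "pinecone-store",
  description: "Answers questions from the saved PDF and web page embeddings",
  chain: VectorDBQAChain.fromLLM(model, vectorStore),
});

const tools = [pineconeTool, new Calculator(), new WebBrowser({ model, embeddings })];

const executor = await initializeAgentExecutorWithOptions(tools, model, {
  agentType: "zero-shot-react-description",
});
const result = await executor.call({ input: "What topics does the saved PDF cover?" });
console.log(result.output);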
