I have searched the existing issues, and I could not find an existing issue for this bug
Current Behavior
I ingested files into a Pinecone index (vector DB). I then deleted all the vectors from the index, but when I run a query it still returns the deleted documents. How is that possible?
Expected Behavior
After the vectors are deleted, a query should return no results from the index; instead I am still fetching vectors from it.
Steps To Reproduce
1- First, I create a test index.
2- After creating the index, I ingest my data with the script below:
```typescript
import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';
import { OpenAIEmbeddings } from 'langchain/embeddings/openai';
import { PineconeStore } from 'langchain/vectorstores/pinecone';
import { pinecone } from '@/utils/pinecone-client';
import { CustomPDFLoader } from '@/utils/customPDFLoader';
import { PINECONE_INDEX_NAME, PINECONE_NAME_SPACE } from '@/config/pinecone';
import { DirectoryLoader } from 'langchain/document_loaders/fs/directory';

/* Name of the directory to retrieve your files from */
const filePath = 'new docs';

export const run = async () => {
  try {
    /* Load raw docs from all files in the directory */
    const directoryLoader = new DirectoryLoader(filePath, {
      '.pdf': (path) => new CustomPDFLoader(path),
    });
    const rawDocs = await directoryLoader.load();

    // Extract the file name with a regular expression and update the metadata
    const processedDocs = rawDocs.map((doc) => {
      const fileName = doc.metadata.source.match(/[^\\\/]+$/)?.[0] || doc.metadata.source;
      const modifiedMetadata = { ...doc.metadata, source: fileName };
      return { ...doc, metadata: modifiedMetadata };
    });

    /* Split text into chunks */
    const textSplitter = new RecursiveCharacterTextSplitter({
      chunkSize: 1000,
      chunkOverlap: 200,
    });
    const docs = await textSplitter.splitDocuments(processedDocs);
    console.log('split docs', docs);

    console.log('creating vector store...');
    /* Create and store the embeddings in the vectorStore */
    const embeddings = new OpenAIEmbeddings();
    const index = pinecone.Index(PINECONE_INDEX_NAME); // Change to your own index name
    // Embed the PDF documents
    await PineconeStore.fromDocuments(docs, embeddings, {
      pineconeIndex: index,
      namespace: PINECONE_NAME_SPACE,
      textKey: 'text',
    });
  } catch (error) {
    console.log('error', error);
    throw new Error('Failed to ingest your data');
  }
};

(async () => {
  await run();
  console.log('ingestion complete');
})();
```
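For reference, the filename-extraction step in the script above can be checked in isolation: the regex keeps everything after the last slash or backslash. This is a standalone sketch, independent of the LangChain loaders; the sample paths are hypothetical.

```typescript
// Same regex as in the ingestion script: match the trailing run of
// characters that are neither '\' nor '/', i.e. the bare file name.
const basename = (source: string): string =>
  source.match(/[^\\\/]+$/)?.[0] ?? source;

console.log(basename('C:\\docs\\report.pdf')); // "report.pdf"
console.log(basename('new docs/report.pdf')); // "report.pdf"
console.log(basename('report.pdf')); // "report.pdf"
```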
3- Once the data is ingested, it creates a vector store and stores the document chunks and their metadata in it.
4- When I delete the index and then run a query, it still fetches data from the vector DB. This is my chat.ts code that handles the query:

```typescript
import type { NextApiRequest, NextApiResponse } from 'next';
import { OpenAIEmbeddings } from 'langchain/embeddings/openai';
import { PineconeStore } from 'langchain/vectorstores/pinecone';
import { makeChain } from '@/utils/makechain';
import { pinecone } from '@/utils/pinecone-client';
import { PINECONE_INDEX_NAME, PINECONE_NAME_SPACE } from '@/config/pinecone';

interface SourceDocument {
  pageContent: string;
  metadata: {
    'loc.lines.from': number;
    'loc.lines.to': number;
    pdf_numpages: number;
    source: string;
  };
}

export default async function handler(
  req: NextApiRequest,
  res: NextApiResponse,
) {
  const { question, history, username } = req.body;
  let chatbotResponse = '';
  let globalSourceDocs: SourceDocument[] = [];
  const { file_url, flaskservice } = require('../../public/config.json');

  console.log('question', question);
  console.log('Your history is', history);
  console.log('Your username is', username);

  // Only accept POST requests
  if (req.method !== 'POST') {
    res.status(405).json({ error: 'Method not allowed' });
    return;
  }
  if (!question && (!history || history.length === 0)) {
    // Start a new chat by clearing the history
    return res.status(200).json({
      text: 'What can I help you with now? ',
      sourceDocuments: [],
    });
  }
  if (!question) {
    return res.status(400).json({ message: 'No question in the request' });
  }

  // OpenAI recommends replacing newlines with spaces for best results
  const sanitizedQuestion = question.trim().replaceAll('\n', ' ');

  try {
    const index = pinecone.Index(PINECONE_INDEX_NAME);

    /* Create vector store */
    const vectorStore = await PineconeStore.fromExistingIndex(
      new OpenAIEmbeddings({}),
      {
        pineconeIndex: index,
        textKey: 'text',
        namespace: PINECONE_NAME_SPACE,
      },
    );

    // Create chain
    const chain = makeChain(vectorStore);
    // Ask a question using chat history
    const response = await chain.call({
      question: sanitizedQuestion,
      chat_history: history || [],
    });
    chatbotResponse = response.text;

    fetch(flaskservice, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        question: sanitizedQuestion,
        history: history,
        username: username,
        chatbot_response: chatbotResponse,
      }),
    })
      .then((response) => response.json())
      .then((response) => {
        console.log('data stored in flask', response);
      })
      .catch((error) => {
        console.error('Error storing in flask:', error);
      });

    globalSourceDocs = response.sourceDocuments;

    if (globalSourceDocs.length > 0) {
      const fullPath = globalSourceDocs[0].metadata.source;
      const filename = fullPath.split('\\').pop();

      const fetchFileUrl = fetch(file_url, {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          file_name: filename,
        }),
      })
        .then((response2) => response2.json())
        .then((response2) => {
          console.log('URL RECEIVED IS', response2.fileurl);
          return response2.fileurl; // Return the fileUrl value
        })
        .catch((error) => {
          console.error('Error fetching file URL:', error); // Handle the error
        });

      fetchFileUrl
        .then((fileUrl) => {
          // Update sourceDocuments with filename and fileUrl
          globalSourceDocs = globalSourceDocs.map((sourceDoc) => ({
            ...sourceDoc,
            metadata: {
              ...sourceDoc.metadata,
              source: filename, // Change source to filename
              fileUrl, // Add fileUrl to metadata
            },
          }));

          // Update the response with the modified sourceDocuments
          const modifiedResponse = { ...response, sourceDocuments: globalSourceDocs };
          res.status(200).json(modifiedResponse); // Send the modified response
        })
        .catch((error) => {
          console.error('Error:', error); // Print errors, if any
          res.status(500).json({ error: 'Something went wrong' }); // Return error response
        });
    } else {
      // If no source documents were found, return the original response
      console.log('No source documents found in the response.');
      res.status(200).json(response);
    }
  } catch (error: any) {
    console.log('error', error);
    res.status(500).json({ error: error.message || 'Something went wrong' });
  }
}
```
Relevant log output
No response
Environment
Additional Context
None.