Gather data from various sources, which can be either structured (like databases) or unstructured (like text documents).
After loading the data into your data warehouse or data lake, you need a separate process or tool that handles data embedding. This involves converting text (or other data) into numerical vectors that the AI model can compare and process. This is usually done with machine learning techniques such as word vectors or sentence embeddings (e.g., produced by models such as BERT or GPT). The resulting vectors are then stored in a vector database.
The chatbot application can then connect to this vector database to retrieve and use the relevant information based on user queries.
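The retrieval step can be sketched roughly as follows. This is a minimal illustration, not a production setup: `toy_embed` is a hypothetical hashed bag-of-words stand-in for a real embedding model (it is not BERT or GPT), and `InMemoryVectorStore` is a hypothetical stand-in for a real vector database. Both names and the dimension `DIM` are assumptions made for this example.

```python
import hashlib
import math

DIM = 64  # hypothetical embedding size, chosen only for this sketch


def toy_embed(text: str) -> list[float]:
    """Stand-in for a real embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        # Map each token to a bucket with a stable hash.
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, list[float]]] = []

    def add(self, text: str) -> None:
        # Store the original text next to its embedding.
        self._rows.append((text, toy_embed(text)))

    def search(self, query: str, k: int = 1) -> list[str]:
        # Vectors are unit length, so the dot product is the cosine similarity.
        q = toy_embed(query)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), text)
            for text, vec in self._rows
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]


if __name__ == "__main__":
    store = InMemoryVectorStore()
    store.add("refund policy allows returns within 30 days")
    store.add("shipping takes five business days")
    # The chatbot would pass the top matches to the model as context.
    print(store.search("what is the refund policy", k=1))
```

In a real system the chatbot would call an actual embedding model and query an actual vector database; the point here is only that the query is embedded the same way as the documents and matched by similarity.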
Decouple the embedding step from the vector storage step. I think performance can then be improved, because embedding can be processed asynchronously.
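One way this decoupling might look is sketched below, assuming a simple queue between the two stages. Everything here is hypothetical: `embed` only simulates a slow model or API call, a plain `dict` stands in for the vector database, and `None` is used as a shutdown signal. Whether this actually improves performance depends on where the real bottleneck is.

```python
import asyncio


async def embed(text: str) -> list[float]:
    """Hypothetical stand-in for a slow embedding call (model or API)."""
    await asyncio.sleep(0.01)  # simulate latency
    return [float(len(text)), float(text.count(" ") + 1)]


async def embed_worker(texts: asyncio.Queue, vectors: asyncio.Queue) -> None:
    # Embedding stage: reads raw text, emits (text, vector) pairs.
    while True:
        text = await texts.get()
        if text is None:  # shutdown signal, forwarded downstream
            await vectors.put(None)
            return
        await vectors.put((text, await embed(text)))


async def store_worker(vectors: asyncio.Queue, store: dict) -> None:
    # Storage stage: only writes vectors, never waits on the model itself.
    while True:
        item = await vectors.get()
        if item is None:
            return
        text, vec = item
        store[text] = vec  # a real system would write to a vector database


async def run_pipeline(docs: list[str]) -> dict:
    texts: asyncio.Queue = asyncio.Queue()
    vectors: asyncio.Queue = asyncio.Queue()
    store: dict = {}
    workers = asyncio.gather(
        embed_worker(texts, vectors),
        store_worker(vectors, store),
    )
    for doc in docs:
        await texts.put(doc)
    await texts.put(None)
    await workers
    return store


if __name__ == "__main__":
    print(asyncio.run(run_pipeline(["hello world", "vector databases"])))
```

Because the two stages only share a queue, either one could be scaled or replaced independently, for example by running several embedding workers in front of a single storage writer.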
Created by Shirish Kadam