A question-answering (QA) system is probably the first idea that comes to mind when people think about building an LLM-powered application. As of this writing, the only tutorial OpenAI provides for building ChatGPT applications is “How to build an AI that can answer questions about your website”.
The system described there consists of two main components: a search engine and a response synthesizer, and I believe this is the typical architecture for such systems. The search engine finds the texts in a given archive that are most likely to contain the answer to the user’s question. The response synthesizer then generates the answer from the selected texts and the question. Both parts are needed because the LLM has a limited context window, and in many cases you can’t fit the entire archive into the prompt. That limit keeps growing, and in more and more cases you don’t need the search engine at all - you can just load everything into a pre-prompt and, voilà, you have your QA system. But I believe the interesting thing is to combine knowledge from multiple sources. It might also be a way to get more factual answers and limit ‘hallucinations’ (that’s just a hunch).
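The two-component architecture can be sketched in a few lines. This is a toy illustration, not the tutorial’s actual code: the retriever here scores passages by simple word overlap (a real system would use embeddings or a search index), and instead of calling an LLM the synthesizer just assembles the prompt that would be sent to one. All names are illustrative.

```python
def retrieve(question: str, archive: list[str], k: int = 2) -> list[str]:
    """Return the k passages sharing the most words with the question.

    Toy stand-in for the search-engine component (word overlap, not embeddings).
    """
    q_words = set(question.lower().split())
    scored = sorted(
        archive,
        key=lambda p: len(q_words & set(p.lower().split())),
        reverse=True,
    )
    return scored[:k]


def build_prompt(question: str, passages: list[str]) -> str:
    """Assemble the response-synthesizer prompt from the selected passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


archive = [
    "The ledger records every accounting transaction.",
    "SAP modules cover finance, logistics and HR.",
    "Photosynthesis converts light into chemical energy.",
]
question = "Which SAP modules exist?"
prompt = build_prompt(question, retrieve(question, archive))
print(prompt)
```

In a real system the string returned by `build_prompt` would be passed to the LLM, and `retrieve` would be backed by an embedding index over the archive.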
Some examples of large archives that could be used here:
all academic textbooks
all the manuals of some large IT system (I’m thinking of SAP)
all accounting textbooks combined with a database of legal acts - so that the answers always reflect the current state of legislation.
Another basic problem with large contexts is that you pay for every load of the context, because processing it is a lot of work for the LLM. I believe OpenAI has announced API versions that will let you keep an LLM with the context already loaded - which could give significant savings if such a pre-loaded LLM were used to serve answers to many questions.
I imagine that you could experiment a lot with the search engine part. For example, you could use traditional indexes and keywords instead of embeddings. You could also use user feedback or interaction to refine or expand the search criteria and adjust the ranking or selection of texts. We still don’t know much about the possibilities offered by LLMs, so it would be advisable to give users a lot of control and let them experiment with many usage modes.
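The keyword alternative to embeddings can be made concrete with a tiny inverted index - a mapping from each word to the passages containing it. This is only a sketch of the idea: a production system would use a proper search library with BM25-style ranking, stemming and stopword removal.

```python
from collections import defaultdict


def build_index(archive: list[str]) -> dict:
    """Inverted index: each lowercase word maps to the passage indices containing it."""
    index = defaultdict(set)
    for i, passage in enumerate(archive):
        for word in passage.lower().split():
            index[word].add(i)
    return index


def search(index: dict, archive: list[str], query: str) -> list[str]:
    """Rank passages by how many query words they contain (most hits first)."""
    hits = defaultdict(int)
    for word in query.lower().split():
        for i in index.get(word, ()):
            hits[i] += 1
    ranked = sorted(hits, key=hits.get, reverse=True)
    return [archive[i] for i in ranked]


archive = [
    "depreciation of fixed assets",
    "vat rates for services",
    "fixed assets register maintenance",
]
index = build_index(archive)
print(search(index, archive, "fixed assets depreciation"))
```

User feedback could plug in here as a re-ranking step: boost or demote passages based on which results users actually found helpful.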
Another idea for experimenting would be multi-stage systems - for example, one that uses an additional LLM call to analyze the query and extract the relevant keywords before the search runs.