The ChatLSE project is a proof of concept of a full data pipeline for indexing data from LSE websites. In this project, we gathered all public LSE documents and webpages into a database and then developed a chat interface on top of it using an LLM. Think of it as a ChatGPT that is particularly knowledgeable about LSE documents. Using retrieval-augmented generation (RAG), the ChatLSE chatbot can answer queries from staff and students by consulting relevant LSE documents and regulations.
Because every part of this application is open-source, the project also aims to serve as a blueprint for a fully open-source RAG solution. The full workflow of the project is illustrated below:
The workflow improves upon vanilla implementations of RAG by adding two components: a query rewriter and a query classifier. Together they make the chatbot behave more naturally when interacting with users: the rewriter lets it handle follow-up questions by referring to previous context, and the classifier lets it decline questions that fall outside its intended scope.
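To make the roles of these two components concrete, here is a minimal sketch of where a query rewriter and a query classifier could sit in a RAG loop. It is an illustration under stated assumptions, not the ChatLSE implementation: every name (`rewrite_query`, `classify_query`, `retrieve`, `answer`, `Chat`), the keyword rules, and the two-document corpus are hypothetical stand-ins for what would normally be LLM calls and a document index.

```python
from dataclasses import dataclass, field

# Hypothetical toy corpus standing in for the indexed LSE documents.
CORPUS = {
    "exam-regs": "Exam regulations: students must bring their LSE card to every exam.",
    "library": "Library opening hours: the library is open 24 hours during term time.",
}

# Hypothetical in-scope vocabulary; a real classifier would likely be a model.
IN_SCOPE_TERMS = {"exam", "library", "lse", "regulations", "course", "hours"}

# Assumed markers of a follow-up question that depends on earlier context.
FOLLOWUP_MARKERS = {"it", "that", "they", "those", "there"}

REFUSAL = "Sorry, I can only answer questions about LSE documents and regulations."


@dataclass
class Chat:
    history: list = field(default_factory=list)  # earlier standalone queries


def tokenize(text):
    """Lowercase the text and split it into a set of punctuation-free words."""
    return {word.strip(".,:?!").lower() for word in text.split()}


def rewrite_query(query, history):
    """Turn a follow-up into a standalone query by prepending the previous one."""
    if history and tokenize(query) & FOLLOWUP_MARKERS:
        return f"{history[-1]} {query}"
    return query


def classify_query(query):
    """Return True when the query shares at least one term with the scope list."""
    return bool(tokenize(query) & IN_SCOPE_TERMS)


def retrieve(query):
    """Return the corpus document with the largest word overlap with the query."""
    words = tokenize(query)
    best = max(CORPUS, key=lambda key: len(words & tokenize(CORPUS[key])))
    return CORPUS[best]


def answer(chat, query):
    """Rewrite, classify, then retrieve; a real system would call an LLM last."""
    standalone = rewrite_query(query, chat.history)
    chat.history.append(standalone)
    if not classify_query(standalone):
        return REFUSAL
    return f"Based on: {retrieve(standalone)}"
```

In this sketch, the follow-up "Is it open at weekends?" contains no in-scope term on its own, so without the rewriter it would be refused. Once rewritten against the previous question about library opening hours, it passes the classifier and retrieves the library document. The ordering (rewrite first, classify second) is a design assumption made for this example.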
ChatLSE was initially created by a small team from the LSE Data Science Institute over the summer of 2024. We now hope to make it open-source and community-driven to allow everyone to contribute to this project. Everyone who contributes to this project, no matter how small or big their contributions are, is recognised in this project as a contributor and a community member.
The project is coordinated and managed by Jonathan Cardoso-Silva.
Please see the Contributors Table for the GitHub profiles of all our contributors.
This repository is always a work in progress and everyone is encouraged to help us build something that is useful to the many.
Everyone who joins the project should check out our contributing guidelines for more information on how to get started.
Community members are provided with opportunities to learn new skills, share their ideas and collaborate with others.
You can contact the ChatLSE team by emailing [email protected].
Thanks go to these wonderful people (emoji key):
This project follows the all-contributors specification. Contributions of any kind welcome!
