<img width="856" alt="Screenshot 2024-04-10 11 34 05" src="https://github.com/OpenGenerativeAI/llm-colosseum/assets/13414571/f1a89e25-e84d-44e4-be8e-a4839d47beda">

How is this ranking generated? If I add a new model, how can I reproduce this benchmark?