Release 0.3.5 · open-compass/opencompass

The OpenCompass team is thrilled to announce the release of OpenCompass v0.3.5!

🌟 Highlights

🚀 Introduction of two new datasets, CMO and AIME, expanding our evaluation capabilities.
📖 Several updates to our documentation, ensuring clearer guidance for all users.
⚙ Several enhancements and refactoring efforts to make our codebase more robust and maintainable.

🚀 New Features

🆕 Added support for the CMO and AIME datasets, broadening the scope of models we can evaluate. (#1610)
🆕 Introduced the CompassArenaSubjectiveBench, a new benchmark for subjective evaluations. (#1645)
🆕 Added configurations for running the DeepSeek model with lmdeploy. (#1656)

📖 Documentation

📚 Updated the documentation to reflect the latest changes and improvements. (#1655)

🐛 Bug Fixes

🔧 Fixed issues with the ruler_16k_gen component, ensuring more accurate and reliable results. (#1643)
🔧 Resolved an error in the get_loglikelihood function when using lmdeploy as the accelerator. (#1659)
🔧 Addressed problems with automatic downloads for certain datasets, streamlining the user experience. (#1652)

⚙ Enhancements and Refactors

💪 Enhanced the summarizer configurations for models. (#1600)
💪 Added new model configurations. (#1653)
💪 Updated the WildBench maximum sequence length, allowing for better handling of longer input sequences. (#1648)
💪 Updated the Needlebench OSS path, ensuring smoother data access and processing. (#1651)
💪 Improved the mmmlu_lite dataloader, optimizing data loading processes. (#1658)

🎉 Welcome New Contributors

👏 A warm welcome to @jnanliu, who made their first contribution by adding the CMO and AIME datasets! (#1610)

For a complete overview of all changes, please refer to the full changelog: 0.3.4...0.3.5
