The OpenCompass team is thrilled to announce the release of OpenCompass v0.3.5!
🌟 Highlights
- 🚀 Introduction of two new datasets: CMO&AIME, expanding our evaluation capabilities.
- 📖 Several updates to our documentation, ensuring clearer guidance for all users.
- ⚙ Several enhancements and refactoring efforts to make our codebase more robust and maintainable.
🚀 New Features
- 🆕 Added support for the CMO&AIME datasets, broadening the scope of models we can evaluate. (#1610)
- 🆕 Introduced the `CompassArenaSubjectiveBench`, a new benchmark for subjective evaluations. (#1645)
- 🆕 Added configurations for the lmdeploy DeepSeek model, enhancing compatibility with cutting-edge technologies. (#1656)
📖 Documentation
- 📚 Updated the documentation to reflect the latest changes and improvements, making it easier than ever to navigate and understand. (#1655)
🐛 Bug Fixes
- 🔧 Fixed issues with the `ruler_16k_gen` component, ensuring more accurate and reliable results. (#1643)
- 🔧 Resolved an error in the `get_loglikelihood` function when using lmdeploy as the accelerator. (#1659)
- 🔧 Addressed problems with automatic downloads for certain datasets, streamlining the user experience. (#1652)
⚙ Enhancements and Refactors
- 💪 Enhanced the summarizer configurations for models, improving the efficiency and effectiveness of summarization tasks. (#1600)
- 💪 Added new model configurations, keeping up with the latest advancements in machine learning. (#1653)
- 💪 Updated the WildBench maximum sequence length, allowing for better handling of longer input sequences. (#1648)
- 💪 Updated the Needlebench OSS path, ensuring smoother data access and processing. (#1651)
- 💪 Improved the `mmmlu_lite` dataloader, optimizing data loading processes. (#1658)
🎉 Welcome New Contributors
- 👏 A warm welcome to @jnanliu, who has made their first contribution by adding the CMO&AIME datasets! (#1610)
For a complete overview of all changes, please refer to the full changelog: 0.3.4...0.3.5