ERBench - Equation Recovery Benchmark

Welcome to the Equation Recovery Benchmark (ERBench), a benchmark for symbolic regression algorithms. This benchmark is designed to enable the next generation of algorithms that discover equations from data.

The main properties of this benchmark are:

📊 Comprehensive Evaluation: Multiple accuracy and interpretability measures for symbolic regressors.
📁 Diverse Regression Tasks: Split into a public set for training and a secret set for evaluation.
🏆 Leaderboard: Continuous tracking and comparison of the top-performing symbolic regression (SR) algorithms.

If you have any further questions, feel free to contact us.

🔗 Quick Links

🚀 Updates / News

2025-05-20: Leaderboard Launched! The first version of the public leaderboard is now live. Submit your results!
2025-05-15: Dataset Published! The public dataset is now available on Hugging Face.

ℹ️ About

Symbolic Regression (SR) aims to find interpretable models that solve a regression task. Instead of optimizing the parameters of a model from a fixed hypothesis class (e.g., linear regression), SR algorithms search the space of functions that can be composed from a given set of operators.
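
To make the idea of searching a space of composed functions concrete, here is a minimal sketch of a brute-force symbolic regressor. It is an illustration only and not part of ERBench: the operator set (`UNARY`, `BINARY`), the helper names (`build_expressions`, `mean_squared_error`, `search`), and the toy target function are all assumptions chosen for this example. Practical SR algorithms use far more efficient search strategies, such as genetic programming or neural-guided search.

```python
import itertools
import math

# Hypothetical operator set for illustration; real SR systems use richer ones.
UNARY = {"sin": math.sin, "square": lambda a: a * a}
BINARY = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


def build_expressions(depth):
    """Return a dict mapping expression strings to callables, up to `depth` nestings."""
    exprs = {"x": lambda x: x, "1": lambda x: 1.0}
    for _ in range(depth):
        new = dict(exprs)
        # Wrap every known expression in each unary operator.
        for op_name, op in UNARY.items():
            for name, f in exprs.items():
                new[f"{op_name}({name})"] = lambda x, op=op, f=f: op(f(x))
        # Combine every ordered pair of known expressions with each binary operator.
        for op_name, op in BINARY.items():
            for (n1, f1), (n2, f2) in itertools.product(exprs.items(), repeat=2):
                new[f"{op_name}({n1}, {n2})"] = (
                    lambda x, op=op, f1=f1, f2=f2: op(f1(x), f2(x))
                )
        exprs = new
    return exprs


def mean_squared_error(f, xs, ys):
    """Average squared difference between f(x) and the observed y."""
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def search(xs, ys, depth=2):
    """Exhaustively pick the enumerated expression with the lowest error."""
    candidates = build_expressions(depth)
    return min(
        candidates.items(),
        key=lambda item: mean_squared_error(item[1], xs, ys),
    )


# Toy data generated from y = x^2 + sin(x), which lies inside the search space.
xs = [i / 10 for i in range(-30, 31)]
ys = [x * x + math.sin(x) for x in xs]

best_name, best_fn = search(xs, ys)
print(best_name, mean_squared_error(best_fn, xs, ys))
```

In contrast to linear regression, where only coefficients are fitted, the result here is the expression itself. Several syntactically different expressions can describe the same function (for example, `square(x)` and `mul(x, x)`), so the exact string returned depends on enumeration order.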

The benchmark consists of two parts:

Development Set: A public set of ground-truth equations together with Python code for evaluation. Use this set to train and fine-tune your algorithm. It is accessible via Hugging Face.
Competition Set: A secret set of equations on which your algorithm’s performance will be evaluated. It is only accessible via the competition. Results are then made publicly available via the Leaderboard.

📝 References

If you use this benchmark in your research or projects, please cite the following publication:

Under review