ERBench - Equation Recovery Benchmark

Welcome to the Equation Recovery Benchmark (ERBench), a benchmark for symbolic regression algorithms. This benchmark is designed to enable the next generation of algorithms that discover equations from data.

The main properties of this benchmark are:

  • 📊 Comprehensive Evaluation: Multiple accuracy and interpretability measures for symbolic regressors.
  • Diverse Regression Tasks: Split into a public set for training and a secret set for evaluation.
  • 🏆 Leaderboard: Continuous tracking and comparison of the top-performing SR algorithms.

If you have any further questions, feel free to contact us.

🚀 Updates / News

  • 2025-05-20: Leaderboard Launched! The first version of the public leaderboard is now live. Submit your results!
  • 2025-05-15: Dataset Published! The public dataset is now available on Hugging Face.

ℹ️ About

In Symbolic Regression (SR), we want to find interpretable models that solve a regression task. Instead of optimizing the parameters of a model from a fixed hypothesis class (e.g. linear regression), SR algorithms search the space of functions that can be composed from a given set of operators.
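To make the idea of "searching the space of functions" concrete, here is a minimal sketch of a brute-force symbolic regressor. It is not part of ERBench and does not reflect any algorithm on the leaderboard. The operator set, the leaf set, the depth limit, and the names `enumerate_expressions`, `mse`, and `search` are all assumptions chosen for illustration. Practical SR algorithms replace the exhaustive enumeration with far more efficient search strategies.

```python
import itertools

# Hypothetical operator and leaf sets; these are illustrative assumptions.
BINARY_OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}
LEAVES = {
    "x": lambda x: x,
    "1": lambda x: 1.0,
    "2": lambda x: 2.0,
}


def enumerate_expressions(depth):
    """Yield (text, function) pairs for expression trees of height <= depth."""
    if depth == 0:
        yield from LEAVES.items()
        return
    # Every shallower expression is also a valid candidate.
    yield from enumerate_expressions(depth - 1)
    subtrees = list(enumerate_expressions(depth - 1))
    for (lt, lf), (rt, rf) in itertools.product(subtrees, repeat=2):
        for symbol, op in BINARY_OPS.items():
            # Default arguments bind the current subtrees to this lambda.
            fn = lambda x, lf=lf, rf=rf, op=op: op(lf(x), rf(x))
            yield f"({lt} {symbol} {rt})", fn


def mse(fn, xs, ys):
    """Mean squared error of fn on the data points."""
    return sum((fn(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def search(xs, ys, depth=2):
    """Return the best (text, error) pair; ties go to the shorter text."""
    text, fn = min(
        enumerate_expressions(depth),
        key=lambda cand: (mse(cand[1], xs, ys), len(cand[0])),
    )
    return text, mse(fn, xs, ys)


if __name__ == "__main__":
    xs = [-2, -1, 0, 1, 2, 3]
    ys = [x * x + x for x in xs]  # data generated by the target x*x + x
    print(search(xs, ys))
```

With the six sample points above, the search returns an expression equivalent to `x*x + x` with an error of `0.0`. Which of the equivalent forms is returned depends on enumeration order. Using text length as a tie-breaker is a crude stand-in for preferring simpler, more interpretable models.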

The benchmark consists of two parts:

  1. Development Set: A public set of ground truth equations together with Python code for evaluation. Use this set to train and fine-tune your algorithm. It is accessible via Hugging Face.
  2. Competition Set: A secret set of equations on which your algorithm's performance will be evaluated. It is only accessible via the competition. Results are then publicly available via the Leaderboard.
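As a rough illustration of what scoring a recovered equation against a ground truth equation can look like, the sketch below computes one accuracy measure and one simplicity proxy. This is not the benchmark's evaluation code. The choice of R² and AST node count, the single input variable `x`, and the helper names `evaluate`, `r2_score`, and `complexity` are assumptions made for this example; the actual measures are defined by the evaluation code shipped with the Development Set.

```python
import ast
import math

# Functions an expression string may call; an assumed, illustrative set.
ALLOWED_FUNCS = {"sin": math.sin, "cos": math.cos, "exp": math.exp}


def evaluate(expr, xs):
    """Evaluate an expression string in x at each point (trusted input only)."""
    code = compile(expr, "<expr>", "eval")
    return [eval(code, {"__builtins__": {}}, {**ALLOWED_FUNCS, "x": x}) for x in xs]


def r2_score(y_true, y_pred):
    """Coefficient of determination; 1.0 means a perfect fit."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def complexity(expr):
    """Count operators, calls, names, and constants in the expression tree."""
    counted = (ast.BinOp, ast.UnaryOp, ast.Call, ast.Name, ast.Constant)
    tree = ast.parse(expr, mode="eval")
    return sum(isinstance(node, counted) for node in ast.walk(tree))


if __name__ == "__main__":
    ground_truth = "x*x + x"    # hypothetical ground truth equation
    candidate = "x*(x + 1)"     # hypothetical output of an SR algorithm
    xs = [i / 2 for i in range(-6, 7)]
    y_true = evaluate(ground_truth, xs)
    y_pred = evaluate(candidate, xs)
    print("R^2:", r2_score(y_true, y_pred))
    print("complexity:", complexity(candidate))
```

The sketch uses `eval` for brevity, so it should only be run on expression strings you trust. A numerical score such as R² measures fit on the sampled points only; it does not by itself establish that two expressions are symbolically identical.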

πŸ“ References

If you use this benchmark in your research or projects, please cite the following publication:

Under review