Benchmarking for Computer Algebra

(Author Albert Heinle aheinle@uwaterloo.ca, 2015-02-19)

SDEval is a benchmarking toolbox built on top of the Symbolic Data database. Some of its main goals are

  1. providing an easy way of translating existing entries in Symbolic Data into executable code of computer algebra systems,
  2. providing a feasible way of trustfully reproducing computation results from current research papers and
  3. meeting the particularities of benchmarking in the field of computer algebra and flexibility in order to be applicable across different communities.

Motivations for developing SDEval

Benchmarking of software – i.e. measuring the quality of results and the time resp. memory consumption for a given, standardized set of examples as input – is a common way of evaluating implementations of algorithms in many areas of industry and academia. Unfortunately, for the computer algebra community there are no standards for benchmarking of software yet, and often it is hard to reproduce certain timings and computation results coming from the literature. With SDEval, we aimed to change this, and provide an easy and flexible way to create benchmarks, which can be reproduced and extended by anyone.

Benchmarking in computer algebra differs from other benchmarking tasks, since one can make the following observations:

  1. Sometimes, the results of computations are not unique; that is, several non identically equal outputs can be equivalently correct. It is not always possible to find a canonical form for an output. Even if this is the case, the transformation of output into the canonical form can be quite costly. Moreover, the latter transformation is not necessarily provided by every single computer algebra system.
  2. Related to the previous item: If an answer is not unique, then the evaluation of the correctness of the output is often far from trivial. In some cases, the correctness-evaluation of certain results is even subject of on-going research.
  3. The field of computer algebra deals with a large variety of topics, even though it can be divided into classes of areas where certain common computational problems do appear. Thus, there need to be collections of benchmarks, optimally one as a standard for each class. The benchmark creation process should be flexible to be applicable in a wide range of areas.
  4. Considering input formats, many computer algebra systems are going their own ways, i.e. for many computation problems, telling the respective system what to calculate differ a lot. The source of this problem is that the way of representing certain given mathematical objects may also not be unified across the corresponding community.

We tried to address these challenges as much as possible when designing our toolkit.

Item (1) is something that differs the creation of benchmarks for computer algebra problems from most other fields of studies. Item (3) leads to the conclusion to provide the maximum flexibility possible. Item (4) also encapsulates another big problem: It is a fair assumption that most researchers know how to operate only a small number of computer algebra systems effectively. Trying to use a different computer algebra system for a specific example – which might work better – involves a learning curve for the user, and one can assume that many refrain from taking this curve. In this circumstance, having a possibility to translate certain examples into executable code for these computer algebra systems would be helpful.

Current Features

SDEval currently contains the following features.

How to obtain SDEval

SDEval is part of the Symbolic Data distribution on GitHub. The latest developments can be obtained from Albert’s fork of the main repo.

Documentations/Manuals/Tutorials

A quick manual, some clarifications of certain terms used for Symbolic Data and FAQ’s can be found on the SD Doku project.

There is also a video tutorial/introduction for SDEval on Youtube. It covers the main functionality of the provided scripts.

Papers and Presentations:

News and History