Icms 2010
Icms 2010
1 Introduction
In order to use mathematical software for exploration, we often push the bound-
aries of available computing resources and continuously try to improve our imple-
mentations and algorithms. Most mathematical algorithms require basic build-
ing blocks, such as multiprecision numbers, fast polynomial arithmetic, exact
or numeric linear algebra, or more advanced algorithms such as Gröbner basis
computation or integer factorization. Though implementing some of these basic
foundations from scratch can be a good exercise, the resulting code may be slow
and buggy. Instead, one can build on existing optimized implementations of these
basic components, either by using a general computer algebra system, such as
Magma, Maple, Mathematica or MATLAB, or by making use of the many high
quality open source libraries that provide the desired functionality. These two
approaches both have significant drawbacks. This paper is about Sage,3 which
provides an alternative approach to this problem.
3
http://www.sagemath.org
2 Burçin Eröcal and William Stein
“Indeed, in almost all practical uses of Mathematica, issues about how Math-
ematica works inside turn out to be largely irrelevant. You might think that
knowing how Mathematica works inside would be necessary [...]” (See [Wol].)
Even if we manage to contact the developers, and they find time to make the
changes we request, it might still take months or years before these changes are
made available in a new release.
Fundamental questions of correctness, reproducibility and scientific value
arise when building a mathematical research program on top of proprietary
software (see, e.g., [SJ07]). There are many published refereed papers containing
results that rely on computations performed in Magma, Maple, or Mathemat-
ica.4 In some cases, a specific version of Magma is the only software that can
carry out the computation. This is not the infrastructure on which we want to
build the future of mathematical research.
In sharp contrast, open source libraries provide a great deal of flexibility, since
anyone can see and modify the source code as they wish. However, functionality
is often segmented into different specialized libraries and advanced algorithms
are hidden behind custom interpreted languages. One often runs into trouble
trying to install dependencies before being able use an open source software
package. Also, converting the output of one package to the input format of
another package can present numerous difficulties and introduce subtle errors.
Sage, which started in 2005 (see [SJ05]), attacks this problem by providing:
The rest of this paper goes into more detail about Sage. In Section 1.1,
we describe the Sage graphical user interface. Section 1.2 is about the Sage
development process, Sage days workshops, mailing lists, and documentation.
The subject of Section 2 is the sophisticated way in which Sage is built out of
a wide range of open source libraries and software. In Section 2.1 we explain
how we use Python and Cython as the glue that binds the compendium of
software included in Sage into a unified whole. We then delve deeper into Python,
Cython and the Sage preparser in Section 2.2, and illustrate some applications
to mathematics in Section 2.3. Sage is actively used for research, and in Section 3
we describe some capabilities of Sage in advanced areas of mathematics.
Sage: Unifying Free Mathematical Software 3
As illustrated in Figure 1, the graphical user interface for Sage is a web applica-
tion, inspired by Google Documents [Goo], which provides convenient access to
all capabilities of Sage, including 3D graphics. In single user mode, Sage works
like a regular application whose main window happens to be your web browser.
In multiuser mode, this architecture allows users to easily set up servers for ac-
cessing their work over the Internet as well as sharing and collaborating with
colleagues. One can try the Sage notebook by visiting www.sagenb.org, where
there are over 30,000 user accounts and over 2,000 published worksheets.
Users also download Sage to run it directly on their computers. We track all
downloads from www.sagemath.org, though there are several other high-profile
sites that provide mirrors of our binaries. Recently, people download about 6,000
copies of Sage per month directly from the Sage website.
There are over 200 developers from across the world who have contributed to the
Sage project. People often contribute because they write code using Sage as part
of a research project, and in this process find and fix bugs, speed up parts of Sage,
or want the code portion of their research to be peer reviewed. Each contribution
to Sage is first posted to the Sage Trac server trac.sagemath.org; it is then
peer reviewed, and finally added to Sage after all issues have been sorted out
and all requirements are met. Nothing about this process is anonymous; every
step of the peer review process is recorded indefinitely for all to see.
4
Including by the second author of this paper, e.g., [CES03]!
4 Burçin Eröcal and William Stein
The Sage Developer’s Guide begins with an easy-to-follow tutorial that guides
developers through each step involved in contributing code to Sage. Swift feed-
back is available through the sage-devel mailing list, and the #sage-devel IRC
chat room on irc.freenode.net (see www.sagemath.org/development.html).
Much development of Sage has taken place at the Sage Days workshops.
There have been two dozen Sage Days [Sagb] and many more are planned. These
are essential to sustaining the momentum of the Sage project and also help ensure
that developers work together toward a common goal, rather than competing
with each other and fragmenting our limited community resources.
A major goal is ensuring that there will be many Sage Days workshops for
the next couple of years. The topics will depend on funding, but will likely in-
clude numerical computation, large-scale bug fixing, L-functions and modular
forms, function fields, symbolic computation, topology, and combinatorics. The
combination of experienced developers with a group of enthusiastic mathemati-
cians at each of these workshops has rapidly increased the developer community,
and we hope that it will continue to do so.
With the motto “building the car instead of reinventing the wheel,” Sage
brings together numerous open source software packages (see Table 1 and [Saga]).
Many applications of Sage require using these libraries together. Sage handles
the conversion of data behind the scenes, automatically using the best tool for
the job, and allows the user to concentrate on the problem at hand.
In the following example, which we explain in detail below, Sage uses the
FLINT library [HH] for univariate polynomials over the ring Z of integers,
whereas Singular [DGPS10] is used for multivariate polynomials. The option
to use the NTL library [Sho] for univariate polynomials is still available, if the
user so chooses.
1 sage : R . <x > = ZZ []
2 sage : type ( R . an_element ())
3 < type ’ sage . rings ... P o l y n o m i a l _ i n t e g e r _ d e n s e _ f l i n t ’ >
4 sage : R . <x ,y > = ZZ []
5 sage : type ( R . an_element ())
6 < type ’ sage . rings ... MP ol y no m ia l_ l ib si n gu l ar ’ >
7 sage : R = PolynomialRing ( ZZ , ’x ’ , implementation = ’ NTL ’)
8 sage : type ( R . an_element ())
9 < type ’ sage . rings ... P o l y n o m i a l _ i n t e g e r _ d e n s e _ n t l ’ >
Sage: Unifying Free Mathematical Software 5
The first line in the example above constructs the univariate polynomial
ring R = Z[x], and assigns the variable x to be the generator of this ring. Note
that Z is represented by ZZ in Sage. The expression R.<x> = ZZ[] is not valid
Python, but can be used in Sage code as a shorthand as explained in Section 2.2.
The next line asks the ring R for an element, using the an_element function,
then uses the builtin Python function type to query its type. We learn that
it is an instance of the class Polynomial_integer_dense_flint. Similarly line
4 constructs R = Z[x, y] and line 7 defines R = Z[x], but this time using the
PolynomialRing constructor explicitly and specifying that we want the under-
lying implementation to use the NTL library.
Often these interfaces are used under the hood, without the user having to
know anything about the corresponding systems. Nonetheless, there are easy
ways to find out what is used by inspecting the source code, and users are
strongly encouraged to cite components they use in published papers. The fol-
lowing example illustrates another way to get a list of components used when a
specific command is run.
sage : from sage . misc . citation import get_systems
sage : get_systems ( ’ integrate ( x ^2 , x ) ’)
[ ’ ginac ’ , ’ Maxima ’]
sage : R . <x ,y ,z > = QQ []
sage : I = R . ideal ( x ^2+ y ^2 , z ^2+ y )
sage : get_systems ( ’I . pr imar y_dec ompo sitio n () ’)
[ ’ Singular ’]
6 Burçin Eröcal and William Stein
2.1 Interfaces
Pexpect axiom, ecm, fricas, frobby, gap, g2red, gfan, gnuplot, gp,
kash, lie, lisp, macaulay2, magma, maple, mathematica,
matlab, maxima, mupad, mwrank, octave, phc, polymake,
povray, qepcad, qsieve, r, rubik, scilab, singular, tachyon
C Library eclib, fplll, gap (in progress), iml, linbox, maxima,
ratpoints, r (via rpy2), singular, symmetrica
C Library arithmetic flint, mpir, ntl, pari, polybori, pynac, singular
The above interfaces are the result of many years writing Python and Cython
[BBS] code to adapt Singular [DGPS10], GAP [L+ ], Maxima [D+ ], Pari [PAR],
GiNaC/Pynac [B+ ], NTL [Sho], FLINT [HH], and many other libraries, so that
they can be used smoothly and efficiently in a unified way from Python [Ros].
Some of these programs were originally designed to be used only through their
own interpreter and made into a library by Sage developers. For example lib-
Singular was created by Martin Albrecht in order to use the fast multivariate
polynomial arithmetic in Singular from Sage. The libSingular interface is now
used by other projects, including Macaulay2 [GS] and GFan [Jen].
There are other approaches to linking mathematical software together. The
recent paper [LHK+ ] reports on the state of the art using OpenMath. Sage takes
a dramatically different approach to this problem. Instead of using a general
string-based XML protocol to communicate with other mathematical software,
Sage interfaces are tailor made to the specific software and problem at hand.
This results in far more efficient and flexible interfaces. The main disadvantage
compared to OpenMath is that the interfaces all go through Sage.
Having access to many programs which can perform the same computation,
without having to worry about data conversion, also makes it easier to double
check results. For example, below we first use Maxima, an open source symbolic
computation package distributed with Sage, to integrate a function, then perform
the same computation using Maple and Mathematica.
sage : var ( ’x ’)
sage : integrate ( sin ( x ^2) , x )
1/8*(( I - 1)* sqrt (2)* erf ((1/2* I - 1/2)* sqrt (2)* x ) + \
Sage: Unifying Free Mathematical Software 7
( I + 1)* sqrt (2)* erf ((1/2* I + 1/2)* sqrt (2)* x ))* sqrt ( pi )
sage : maple ( sin ( x ^2)). integrate ( x )
1/2*2^(1/2)* Pi ^(1/2)* FresnelS (2^(1/2)/ Pi ^(1/2)* x )
sage : mathematica ( sin ( x ^2)). Integrate ( x )
Sqrt [ Pi /2]* FresnelS [ Sqrt [2/ Pi ]* x ]
The most common type of interface, called a pexpect interface, communi-
cates with another command line program by reading and writing strings to a
text console, as if another user was in front of the terminal. Even though these
are relatively simple to develop, the overhead of having to print and parse strings
to represent the data makes this process potentially cumbersome and inefficient.
This is the default method of communication with most high level mathemat-
ics software, including commercial and open source programs, such as Maple,
Mathematica, Magma, KASH or GAP.
Sage provides a framework to represent elements over these interfaces, per-
form arithmetic with them or apply functions to the given object, as well as
using a file to pass the data if the string representation is too big. The following
demonstrates arithmetic with GAP elements.
sage : a = gap ( ’ 22 ’)
sage : a * a
484
It is also possible to use pexpect interfaces over remote consoles. In the
following code, we connect to the localhost as a different user and call Math-
ematica functions. Note that the interface can handle indexing vectors as well.
sage : mma = Mathematica ( server = " rmma60@localhost " )
sage : mma ( " 2+2 " )
4
sage : t = mma ( " Cos [ x ] " )
sage : t . Integrate ( ’x ’)
Sin [ x ]
sage : t = mma ( ’ {0 ,1 ,2 ,3} ’)
sage : t [2]
1
Sage also includes specialized libraries that are linked directly from compiled
code written in Cython. These are used to handle specific problems, such as the
characteristic polynomial computation in the example below.
sage : M = Matrix ( GF (5) , 10 , 10)
sage : M . randomize ()
sage : M . charpoly ( algorithm = ’ linbox ’)
x ^10 + 4* x ^9 + 4* x ^7 + 3* x ^4 + 3* x ^3 + 3* x ^2 + 4* x + 3
Many basic arithmetic types also use Cython to directly utilize data struc-
tures from efficient arithmetic libraries, such as MPIR or FLINT. An example
of this can be seen at the beginning of this section, where elements of the ring
Z[x] are represented by the class Polynomial_integer_dense_flint.
The Singular interface is one of the most advanced included in Sage. Singular
has a large library of code written in its own language. Previously the only way to
access these functions, which include algorithms for Gröbner basis and primary
8 Burçin Eröcal and William Stein
5
All timings in this paper were performed on an 2.66GHz Intel Xeon X7460 based
computer.
Sage: Unifying Free Mathematical Software 9
A fast interpreter In the following Singular session, we first declare the ring
r = Q[x, y, z] and the polynomial f ∈ r, then measures the time to square f
repeatedly, 10,000 times.
singular : int t = timer ; ring r = 0 ,(x ,y , z ) , dp ;
singular : def f = y ^2* z ^2 - x ^2* y ^3 - x * z ^3+ x ^3* y * z ;
singular : int j ; def g = f ;
singular : for ( j = 1; j <= 10^5; j ++) { g = f * f ; }
singular : ( timer - t ) , system ( " -- ticks - per - sec " );
990 1000
The elapsed time is 990 milliseconds. Next we use Sage to do the same com-
putation, using the same Singular data structures directly, but without going
through the interpreter.
sage : R . <x ,y ,z > = QQ []
sage : f = y ^2* z ^2 - x ^2* y ^3 - x * z ^3 + x ^3* y * z ; type ( f )
< type ’ sage . rings . polynomial ... M P ol y no mi a l_ l ib si n gu l ar ’ >
sage : timeit ( ’ for j in xrange (10^5): g = f * f ’)
5 loops , best of 3: 91.8 ms per loop
Sage takes only 91.8 milliseconds for the same operation. This difference is
because the Python interpreter is more efficient at performing for loops.
with its own type declarations, which behave as expected with respect to arith-
metic and have the advantage that they are backed by efficient multiprecision
arithmetic libraries such as MPIR [H+ ] and MPFR [Z+ ], which are thousands
of times faster than Python for large integer arithmetic.
To call the preparser directly on a given string, use the preparse function.
sage : preparse ( " 1/2 " )
’ Integer (1)/ Integer (2) ’
sage : preparse ( " 1.5 " )
" RealNumber ( ’1.5 ’) "
Adding a trailing r after a number indicates that the preparser should leave
that as the “raw” literal. The following illustrates division with Python integers.
sage : preparse ( " 1 r /2 r " )
’ 1/2 ’
sage : 1 r /2 r
0
Here is the result of performing the same division in Sage.
sage : 1/2
1/2
sage : type (1/2)
< type ’ sage . rings . rational . Rational ’ >
sage : (1/2). parent ()
Rational Field
The preparser also changes the ^ sign to the exponentiation operator **
and provides a shorthand to create new mathematical domains and name their
generator in one command.
sage : preparse ( " 2^3 " )
’ Integer (2)** Integer (3) ’
sage : preparse ( " R . <x ,y > = ZZ [] " )
" R = ZZ [ ’x , y ’]; (x , y ,) = R . _first_ngens (2) "
Performance is typically within a factor of two from what one gets using a direct
implementation in C or Fortran.
3 Afterword
In this article, we have showed that Sage is a powerful platform for developing
sophisticated mathematical software. Sage is actively used in research mathe-
matics, and people use Sage to develop state-of-the-art algorithms. Sage is par-
ticularly strong in number theory, algebraic combinatorics, and graph theory.
For further examples, see the 53 published articles, 11 Ph.D. theses, 10 books,
and 30 preprints at www.sagemath.org/library-publications.html.
For example, Sage has extensive functionality for computations related to
the Birch and Swinnerton-Dyer conjecture. In addition to Mordell-Weil group
computations using [Cre] and point counting over large finite fields using the
SEA package in [PAR], there is much novel elliptic curve code written directly
for Sage. This includes the fastest known algorithm for computation of p-adic
heights [Har07,MST06], and code for computing p-adic L-series of elliptic curves
at ordinary, supersingular, and split multiplicative primes. Sage combines these
capabilities to compute explicitly bounds on Shafarevich-Tate groups of elliptic
curves [SW10]. Sage also has code for computation with modular forms, modular
abelian varieties, and ideal class groups in quaternion algebras.
The MuPAD-combinat project, which was started by Florent Hivert and
Nicolas M. Thiéry in 2000, built the world’s preeminent system for algebraic
combinatorics on top of MuPAD (see [Des06] and [HT05]). Page 54 of [HT05]:
“They [MuPAD] also have promised to release the code source of the library
under a well known open-source license, some day.” In 2008, MuPAD was in-
stead purchased by MathWorks (makers of MATLAB), so MuPAD is no longer
available as a separate product, and will probably never be open source. Instead
it now suddenly costs $3000 (commercial) or $700 (academic).
As a result, the MuPAD-combinat group has spent several years reimplement-
ing everything in Sage (see [T+ ] for the current status). The MuPAD-combinat
group was not taken by surprise by the failure of MuPAD, but instead were
concerned from the beginning by the inherent risk in building their research
program on top of MuPAD. In fact, they decided to switch to Sage two months
before the bad news hit, and have made tremendous progress porting:
“It has been such a relief during the last two years not to have this
Damocles sword on our head!”
– Nicolas Thiéry
References
B+ . Christian Bauer et al., Ginac: is not a CAS, http://www.ginac.de/.
BBS. Stefan Behnel, Robert Bradshaw, and Dag Seljebotn, Cython: C-
Extensions for Python, http://www.cython.org/.
Sage: Unifying Free Mathematical Software 15