And GSoC ends!

This was the last week of my Google Summer of Code project on fast series expansions for SymPy and SymEngine. It has thoroughly been an amazing experience, challenging and rewarding in more ways than I had imagined. I was extremely lucky to have such awesome mentors as Ondrej and Thilina.

Though I couldn't achieve all that I had planned in my proposal, it taught me what I think is my biggest take away from the experience- things seldom work the way you want them to. In fact, I faced maximum difficulties in that part of my project which I had assumed to be trivial- ring series in SymPy. And as it turned out, it was the corner stone of what I had set out to do, and it needed to be done well.

Fortunately, things turned out rather well there and now most of the difficult questions with regard to ring series have been answered. ring series now has a general purpose rs_series function that expands an arbitrary SymPy expression really fast. Most of the important internal functions are also implemented now. I think as a module ring series has reached a stage where it can be helpful to people and others can help with improving and expanding it. Of course, a crazy amount of work still needs to be done and for that we need a lot of helping hands.

I have been writing a guide as well as documenting the internals in PR 9839. The importance of good documentation is another lesson I learnt during my project.

The most important thing is that people use these new capabilities. I hope more people will get involved. If all goes well, it is a potential replacement of the current series method of SymPy.

Other than that, I had a very fruitful discussion with Ondrej about how to implement polynomials and then series expansion in SymEngine. You can read the summary here. I am already excited about writing all this new stuff.

The end of GSoC is not really an end; it is a beginning, of more interesting times :)

Cheers!!

GSoC Week 12

Last week I told you why rs_series doesn't work with negative or fractional powers because of the constraints of a polynomial back-end and why we need to modify polys. The situation isn't that hopeless actually. Let's talk about negative and fractional powers one by one.

Negative Powers

The reason negative exponents work in ring_series is because I modified PolyElement to allow so. In hind sight, it wasn't the right decision and needs to be replaced with something that doesn't alter polys. It is rather surprising that I came across a possible solution so late (Now we know why good documentation is so important).

I already knew that polys allows us to create a FractionField. A fraction field over a domain R consists of elements of the form a/b where a and b belong to R. In our case we are interested in the fraction field of polynomial ring, i.e, fractions with polynomials as numerator and denominator. So a/bis not a * b**(-1) but is a / b, where a and b are polynomials.

What was new to me was that just like ring, polys also has sparse field. In effect, it allows us to create sparse rational functions without altering anything.

I modified some functions in ring_series to work with a rational function field here, and it works quite well indeed.

In [42]: from sympy.polys.fields import *

In [43]: F, a, b = field('a, b', QQ)

In [44]: p = rs_sin(a + b, a, 3)/a**5

In [45]: p*a
Out[45]: (-3*a**2*b - 3*a*b**2 + 6*a - b**3 + 6*b)/(6*a**4)

Note that all these are field operations and I haven't modified field.py in any way. Elegant!

But then again, having a field increases the complexity as we need to evaluate the numerator and denominator separately.

Fractional Powers

Fractional powers are a much trickier case as there is no simple solution to it as above. What we can do is optimise the option I had presented in my last post, i.e, have each fractional power as a generator. But doing that opens up a Pandora's box. Simple things such as sqrt(a)**2 == a do not hold true any more. The current rs_series treats sqrt(a) as a constant if we are expanding with respect to a:

In [22]: rs_series(sin(a**QQ(1,2)), a**QQ(1,2),5)
Out[22]: -1/6*(sqrt(a))**3 + (sqrt(a))

In [23]: rs_series(sin(a**QQ(1,2)), a,5)
Out[23]: sin(sqrt(a))

So, if we indeed decide to tread this path, we would need to replace a here with sqrt(a)**2. This really complicates the situation as we need to figure out what to replace with. In any calculation the powers change multiple times and each time we'll need to figure out how to rewrite the series.

Next Week

It is now mostly a design decision whether we want the ring_series to be confined within the polys module. The polys environment allows efficient manipulations of laurent series (with FracField), but I doubt we can achieve the speed we want with puiseux series without modifying polys. One possible solution is to separate the modified parts of polys along with ring_series from polys. We are using polys only because it has the data structure that we want. Separating them would allow us to simultaneously make use of its back-end and not introduce unnecessary complexity in our representation.

Other than that, documentation is another priority now. I had planned to do it earlier too, but couldn't. This week's discovery has reminded me of its importance.

Cheers!

GSoC Week 11

Sorry for the delayed post! Last week was extremely busy.

It's time to wrap up my work. The good new is that rs_series (I called it series_fast earlier) works well for taylor series. The speedups are impressive and it can handle all sorts of cases (so far!). Now, I need to make it work for laurent and puiseux series.

Given that ring_series functions work well for negative and fractional powers, ideally that shouldn't be difficult. However, my current strategy is to add variables as generators to the currently used ring. The backend of creating rings is in polys, which doesn't allow negative or fractional powers in the generators (that is the mathematical definition of polynomials). For example:

In [276]: sring(a**QQ(2,3))
Out[276]: (Polynomial ring in a**(1/3) over ZZ with lex order, (a**(1/3))**2)

In [277]: _[0].gens
Out[277]: ((a**(1/3)),)

Contrast this with:

In [285]: sring(a**2)
Out[285]: (Polynomial ring in a over ZZ with lex order, a**2)

Generators with negative or fractional powers are treated as symbolic atoms and not as some variable raised to some power. So these fractional powers will never simplify with other generators with the same base.

The easy way to fix this is to modify sring but that would mean changing the core polys. I am still looking for a better way out.

The polynomial wrappers PR had been lying dead for quite some time. It currently uses piranha's hash_set but it needs to work on unordered_set when piranha is not available. I am adding that here. It is mostly done, except for encode and decode functions. Once the wrappers are in, I can start porting ring_series functions.

Next Week

  • Make rs_series work for puiseux series.

  • Complete polynomial wrappers.

  • Port the low level ring_series functions.

Cheers!

GSoC Week 10

I spent a good amount of time this week trying to make the series function more intelligent about which ring it operates on. The earlier strategy of using the EX ring proved to be slow in many cases. I had discussions with Kalevi Suominen, who is a developer at SymPy and we figured out the following strategy:

  • The user inputs a Basic expr. We use sring over QQ to get the starting ring.

  • We call individual functions by recursing on expr. If expr has a constant term we create a new ring with additional generators required by the ring (e.g sin and cos in case of rs_sin and expand expr over that.)

  • This means that each ring_series function now can add generators to the ring it gets so that it can expand the expression.

This results in considerable speed-up as we do operations on the simplest possible ring as opposed to using EX which is the most complex (and hence slowest) ring. Because of this, the time taken by the series_fast function in faster_series is marginally more than direct function calls. The function doesn't yet have code for handling arbitrary expressions, which will add some overhead of its own.

Most of the extra time is taken by sring. The overhead is constant, however (for a given expression). So for series_fast(sin(a**2 + a*b), a, 10) the extra routines take about 50% of the total time (the function is 2-4 x slower). For series_fast(sin(a**2 + a*b), a, 100), they take 2% of the total time and the function is almost as fast as on QQ

There is, of course scope for speedup in sring (as mentioned in its file). Another option is to minimise the number of calls to sring, if possible to just one (in the series_fast function).

In my last post I talked about the new solveset module that Harsh and Amit are working on. I am working with them and I sent a patch to add a domain argument to the solveset function. It is pretty cool stuff in that solution is always guaranteed to be complete.

Next Week

I haven't been yet been able to start porting the code to Symengine as the Polynomial wrappers are not yet ready. Hopefully, they should be done by the next week. Till, then I will focus on improving series_fast and any interesting issues that come my way.

  • Write a fully functional series_fast. Profile it properly and optimise it.

  • Polynomial wrappers.

  • Document the functions and algorithms used in ring_series.py

Cheers!!

GSoC Week 9

Like I said in my last post, this was my first week in college after summer vacation. I had to reschedule my daily work according to my class timings (which are pretty arbitrary). Anyway, since I do not have a test anytime soon, things were manageable.

So Far

Ring Series

This week I worked on rs_series in PR 9614. As Donald Knuth succinctly said, 'Premature optimisation is the root of all evil', my first goal was to write a function that used ring_series to expand Basic expressions and worked in all cases. That has been achieved. The new function is considerably faster than SymPy's series in most cases. eg.

In [9]: %timeit rs_series(sin(a)*cos(a) - exp(a**2*b),a,10)
10 loops, best of 3: 46.7 ms per loop

In [10]: %timeit (sin(a)*cos(a) - exp(a**2*b)).series(a,0,10)
1 loops, best of 3: 1.08 s per loop

However, in many cases the speed advantage is not enough, especially considering that all elementary ring_series functions are faster than SymPy's series functions by factors of 20-100. Consider:

In [20]: q
Out[20]: (exp(a*b) + sin(a))*(exp(a**2 + a) + sin(a))*(sin(a) + cos(a))

In [21]: %timeit q.series(a,0,10)
1 loops, best of 3: 2.81 s per loop

In [22]: %timeit rs_series(q,a,10)
1 loops, best of 3: 3.99 s per loop

In this case, rs_series is in fact slower than the current series method!. This means that rs_series needs to be optimised, as expanding the same expression directly with rs_* functions is much faster.

In [23]: %timeit (rs_exp(x*y,x,10) + rs_sin(x,x,10))*(rs_exp(x**2+ x,x,10) + rs_sin(x,x,10))*(rs_sin(x,x,10) + rs_cos(x,x,10))
1 loops, best of 3: 217 ms per loop

I spent Friday playing with rs_series. Since the function is recursive, I even tried using a functional approach (with map, reduce, partial, etc). It was fun exploring SymPy's functional capabilities (which are quite decent, though Haskell's syntax is of course more natural). This didn't make much difference in speed. Code profiling revealed that rs_series is making too many function calls (which is expected). So, I plan to try a non-recursive approach to see if that makes much of a difference. Other than that, I will also try to make it smarter so that it does not go through needless iterations (which it currently does in many cases).

SymEngine

I had a discussion with Sumith about Polynomial wrappers. I am helping him with constructors and multiplication. We both want the basic Polynomial class done as soon as possible, so that I can start with writing series expansion of functions using it.

I also sent a PR 562 that adds C wrappers for Complex class. This will be especially helpful for Ruby wrappers that Abinash is working on. FQA is a nice place to read about writing C++/C wrappers and for some side entertainment too.

Other than that, I also happened to have a discussion with Harsh on the new solve-set he and Amit are working on. Their basic idea is that you always work with sets (input and output) and that the user can choose what domain he wants to work on. The latter idea is quite similar to what SymPy's polys does. Needless to say, their approach is much more powerful that solvers's. I will be working with them.

Next Week

Targets for the next week are as modest as they are crucial:

  • Play with rs_series to make it faster.

  • Finish Polynomial wrappers and start working on series expansion.

Cheers!