My Rhapsody

And GSoC ends!

This was the last week of my Google Summer of Code project on fast series expansions for SymPy and SymEngine. It has been a thoroughly amazing experience, challenging and rewarding in more ways than I had imagined. I was extremely lucky to have such awesome mentors as Ondrej and Thilina.

Though I couldn't achieve all that I had planned in my proposal, the experience taught me what I think is my biggest takeaway: things seldom work the way you want them to. In fact, I faced the most difficulty in the part of my project I had assumed to be trivial: ring series in SymPy. As it turned out, it was the cornerstone of what I had set out to do, and it needed to be done well.

Fortunately, things turned out rather well there, and most of the difficult questions with regard to ring series have now been answered. ring_series now has a general-purpose rs_series function that expands an arbitrary SymPy expression really fast. Most of the important internal functions are also implemented. I think the module has reached a stage where it can be useful to people and where others can help with improving and expanding it. Of course, an enormous amount of work still remains, and for that we need a lot of helping hands.
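
If you want a quick taste, here is a minimal example (it mirrors the benchmark expression from the Week 9 post below; rs_series lives in sympy.polys.ring_series):

from sympy import symbols, sin, cos, exp
from sympy.polys.ring_series import rs_series

a, b = symbols('a b')
# expand about a = 0 up to order 10; the work happens on a sparse polynomial ring
rs_series(sin(a)*cos(a) - exp(a**2*b), a, 10)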

I have been writing a guide as well as documenting the internals in PR 9839. The importance of good documentation is another lesson I learnt during my project.

The most important thing is that people use these new capabilities. I hope more people will get involved. If all goes well, it is a potential replacement for SymPy's current series method.

Other than that, I had a very fruitful discussion with Ondrej about how to implement polynomials and then series expansion in SymEngine. You can read the summary here. I am already excited about writing all this new stuff.

The end of GSoC is not really an end; it is the beginning of more interesting times :)

Cheers!!

GSoC Week 12

Last week I explained why rs_series doesn't work with negative or fractional powers, given the constraints of a polynomial back-end, and why we would need to modify polys. Actually, the situation isn't that hopeless. Let's talk about negative and fractional powers one by one.

Negative Powers

Negative exponents work in ring_series because I modified PolyElement to allow them. In hindsight, it wasn't the right decision, and it needs to be replaced with something that doesn't alter polys. It is rather surprising that I came across a possible solution so late (now we know why good documentation is so important).

I already knew that polys allows us to create a FractionField. A fraction field over a domain R consists of elements of the form a/b, where a and b belong to R. In our case we are interested in the fraction field of a polynomial ring, i.e., fractions with polynomials as numerator and denominator. So a/b is not stored as a * b**(-1) but as the fraction a / b, where a and b are polynomials.

What was new to me was that, just like ring, polys also has a sparse field. In effect, it allows us to create sparse rational functions without altering anything.

I modified some functions in ring_series to work with a rational function field here, and it works quite well indeed.

In [42]: from sympy.polys.fields import *

In [43]: F, a, b = field('a, b', QQ)

In [44]: p = rs_sin(a + b, a, 3)/a**5

In [45]: p*a
Out[45]: (-3*a**2*b - 3*a*b**2 + 6*a - b**3 + 6*b)/(6*a**4)

Note that all these are field operations and I haven't modified field.py in any way. Elegant!

But then again, having a field increases the complexity as we need to evaluate the numerator and denominator separately.
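
To make this concrete, here is a small self-contained sketch (plain polys, no ring_series involved): a FracElement carries its numerator and denominator as two separate polynomials, so any series manipulation has to deal with the two halves on its own.

from sympy import QQ
from sympy.polys.fields import field

F, a, b = field('a, b', QQ)
p = (a + b)/(a**2 - b)
# numer and denom are ordinary sparse polynomials over QQ[a, b]
print(p.numer)   # a + b
print(p.denom)   # a**2 - b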

Fractional Powers

Fractional powers are a much trickier case, as there is no simple solution like the one above. What we can do is optimise the option I presented in my last post, i.e., have each fractional power as a generator. But doing that opens up a Pandora's box. Simple identities such as sqrt(a)**2 == a no longer hold. The current rs_series treats sqrt(a) as a constant if we are expanding with respect to a:

In [22]: rs_series(sin(a**QQ(1,2)), a**QQ(1,2),5)
Out[22]: -1/6*(sqrt(a))**3 + (sqrt(a))

In [23]: rs_series(sin(a**QQ(1,2)), a,5)
Out[23]: sin(sqrt(a))

So, if we do decide to tread this path, we would need to replace a here with sqrt(a)**2. This really complicates the situation, as we need to figure out what to replace and with what. In any calculation the powers change multiple times, and each time we'll need to figure out how to rewrite the series.

Next Week

It is now mostly a design decision whether we want ring_series to be confined within the polys module. The polys environment allows efficient manipulation of Laurent series (with FracField), but I doubt we can achieve the speed we want with Puiseux series without modifying polys. One possible solution is to separate the modified parts of polys, along with ring_series, from polys proper. We are using polys only because it has the data structure we want. Separating them would allow us to make use of its back-end while not introducing unnecessary complexity into our representation.

Other than that, documentation is another priority now. I had planned to do it earlier too, but couldn't. This week's discovery has reminded me of its importance.

Cheers!

GSoC Week 11

Sorry for the delayed post! Last week was extremely busy.

It's time to wrap up my work. The good news is that rs_series (I called it series_fast earlier) works well for Taylor series. The speedups are impressive and it can handle all sorts of cases (so far!). Now, I need to make it work for Laurent and Puiseux series.

Given that the ring_series functions work well with negative and fractional powers, ideally that shouldn't be difficult. However, my current strategy is to add variables as generators to the ring currently in use. The back-end for creating rings is in polys, which doesn't allow negative or fractional powers in the generators (that is the mathematical definition of a polynomial). For example:

In [276]: sring(a**QQ(2,3))
Out[276]: (Polynomial ring in a**(1/3) over ZZ with lex order, (a**(1/3))**2)

In [277]: _[0].gens
Out[277]: ((a**(1/3)),)

Contrast this with:

In [285]: sring(a**2)
Out[285]: (Polynomial ring in a over ZZ with lex order, a**2)

Generators with negative or fractional powers are treated as symbolic atoms and not as a variable raised to some power. So these fractional powers will never simplify with other generators that have the same base.
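
A small sketch of what that means in practice (the generator lookup via from_expr is only for illustration):

from sympy import sqrt, symbols
from sympy.polys.rings import sring

a = symbols('a')
R, p = sring(a + sqrt(a))
print(R.gens)              # a and sqrt(a) end up as two unrelated generators
ga = R.from_expr(a)        # the generator corresponding to a
gs = R.from_expr(sqrt(a))  # the generator corresponding to sqrt(a)
print(gs**2 == ga)         # False: (sqrt(a))**2 is not recognised as a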

The easy way to fix this is to modify sring, but that would mean changing the core of polys. I am still looking for a better way out.

The polynomial wrappers PR had been lying dormant for quite some time. It currently uses Piranha's hash_set, but it needs to work with unordered_set when Piranha is not available. I am adding that here. It is mostly done, except for the encode and decode functions. Once the wrappers are in, I can start porting the ring_series functions.

Next Week

  • Make rs_series work for Puiseux series.

  • Complete polynomial wrappers.

  • Port the low level ring_series functions.

Cheers!

GSoC Week 10

I spent a good amount of time this week trying to make the series function more intelligent about which ring it operates on. The earlier strategy of using the EX ring proved to be slow in many cases. I had discussions with Kalevi Suominen, a SymPy developer, and we figured out the following strategy:

  • The user inputs a Basic expr. We use sring over QQ to get the starting ring.

  • We call the individual functions by recursing on expr. If expr has a constant term, we create a new ring with the additional generators required by the function (e.g., sin and cos in the case of rs_sin) and expand expr over that ring.

  • This means that each ring_series function can now add generators to the ring it receives, so that it can expand the expression (a rough sketch of the idea follows this list).
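
Here is a rough sketch of the idea (hypothetical code, not the actual rs_series implementation; the extra sin(b) generator is only for illustration):

from sympy import QQ, sin, symbols
from sympy.polys.rings import sring

a, b = symbols('a b')
# start from the simplest possible ring, QQ[a, b]
R, p = sring(a**2 + a*b, a, b, domain=QQ)
print(R)       # Polynomial ring in a, b over QQ with lex order

# if an expansion produces a constant term such as sin(b), rebuild the ring
# with that term as an extra generator instead of falling back to EX
R2, q = sring(a**2 + a*b + sin(b), a, b, sin(b), domain=QQ)
print(R2.gens)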

This results in a considerable speed-up, as we do operations on the simplest possible ring, as opposed to using EX, which is the most complex (and hence slowest) ring. As a result, the time taken by the series_fast function in faster_series is only marginally more than that of direct function calls. The function doesn't yet have code for handling arbitrary expressions, which will add some overhead of its own.

Most of the extra time is taken by sring. The overhead is constant, however (for a given expression). So for series_fast(sin(a**2 + a*b), a, 10) the extra routines take about 50% of the total time (the function is 2-4x slower), while for series_fast(sin(a**2 + a*b), a, 100) they take about 2% of the total time and the function is almost as fast as working directly over QQ.

There is, of course, scope for speedup in sring (as mentioned in its file). Another option is to minimise the number of calls to sring, ideally to just one (in the series_fast function).

In my last post I talked about the new solveset module that Harsh and Amit are working on. I am working with them, and I sent a patch to add a domain argument to the solveset function. It is pretty cool stuff, in that the solution is always guaranteed to be complete.
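
As a quick illustration of what the domain argument enables (written against the solveset interface as I understand it, so treat it as a sketch rather than the exact patch):

from sympy import S, Symbol, sin, solveset

x = Symbol('x')
# restricting the domain to the reals still returns the complete solution set,
# expressed as a union of image sets over the integers
print(solveset(sin(x), x, domain=S.Reals))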

Next Week

I haven't yet been able to start porting the code to SymEngine, as the polynomial wrappers are not yet ready. Hopefully, they should be done by next week. Till then, I will focus on improving series_fast and any interesting issues that come my way.

  • Write a fully functional series_fast. Profile it properly and optimise it.

  • Polynomial wrappers.

  • Document the functions and algorithms used in ring_series.py

Cheers!!

GSoC Week 9

As I said in my last post, this was my first week back in college after the summer vacation. I had to reschedule my daily work around my class timings (which are pretty arbitrary). Anyway, since I do not have a test anytime soon, things were manageable.

So Far

Ring Series

This week I worked on rs_series in PR 9614. Keeping in mind Donald Knuth's dictum that 'premature optimisation is the root of all evil', my first goal was to write a function that uses ring_series to expand Basic expressions and works in all cases. That has been achieved. The new function is considerably faster than SymPy's series in most cases, e.g.:

In [9]: %timeit rs_series(sin(a)*cos(a) - exp(a**2*b),a,10)
10 loops, best of 3: 46.7 ms per loop

In [10]: %timeit (sin(a)*cos(a) - exp(a**2*b)).series(a,0,10)
1 loops, best of 3: 1.08 s per loop

However, in many cases the speed advantage is not enough, especially considering that all elementary ring_series functions are faster than SymPy's series functions by factors of 20-100. Consider:

In [20]: q
Out[20]: (exp(a*b) + sin(a))*(exp(a**2 + a) + sin(a))*(sin(a) + cos(a))

In [21]: %timeit q.series(a,0,10)
1 loops, best of 3: 2.81 s per loop

In [22]: %timeit rs_series(q,a,10)
1 loops, best of 3: 3.99 s per loop

In this case, rs_series is in fact slower than the current series method! This means that rs_series needs to be optimised, as expanding the same expression directly with the rs_* functions is much faster:

In [23]: %timeit (rs_exp(x*y,x,10) + rs_sin(x,x,10))*(rs_exp(x**2+ x,x,10) + rs_sin(x,x,10))*(rs_sin(x,x,10) + rs_cos(x,x,10))
1 loops, best of 3: 217 ms per loop

I spent Friday playing with rs_series. Since the function is recursive, I even tried a functional approach (with map, reduce, partial, etc.). It was fun exploring Python's functional capabilities (which are quite decent, though Haskell's syntax is of course more natural). This didn't make much difference in speed. Code profiling revealed that rs_series is making too many function calls (which is expected). So, I plan to try a non-recursive approach to see if that makes much of a difference. Other than that, I will also try to make it smarter so that it does not go through needless iterations (which it currently does in many cases).

SymEngine

I had a discussion with Sumith about the polynomial wrappers. I am helping him with the constructors and multiplication. We both want the basic Polynomial class done as soon as possible, so that I can start writing series expansion of functions using it.

I also sent PR 562, which adds C wrappers for the Complex class. This will be especially helpful for the Ruby wrappers that Abinash is working on. FQA is a nice place to read about writing C++/C wrappers, and for some side entertainment too.

Other than that, I also had a discussion with Harsh about the new solveset he and Amit are working on. Their basic idea is that you always work with sets (both input and output) and that the user can choose which domain to work over. The latter idea is quite similar to what SymPy's polys does. Needless to say, their approach is much more powerful than that of solvers. I will be working with them.

Next Week

Targets for the next week are as modest as they are crucial:

  • Play with rs_series to make it faster.

  • Finish Polynomial wrappers and start working on series expansion.

Cheers!

GSoC Week 8

This week I started work on writing the series module for SymEngine. I am working with Sumith on polynomials in PR 511. It is still a work in progress. A lot of the code, especially the parts related to Piranha, was new to me, so I had to read a lot. I got PR 533 merged, which uses templates to simplify the code for equality and comparison of maps.

With regard to PR 9614, I had a meeting with my mentor Thilina. We decided that we should be able to call the RingSeries function on SymPy Basic expressions, so that the user need not bother about creating rings and all the extra function calls. So, I will be building on top of the classes I created earlier to make the series method accept Basic expressions. I created the RingFunction, RingMul and RingAdd classes. The taylor_series method now works with sums and products of functions. However, it is still not optimised, especially for series that have a low minimum exponent (say, cos). I need to find a better way.

I returned to college. Once classes begin on Monday, I will need to manage my time well, as my schedule will get busier.

Next Week

  • Finish the Polynomials PR.

  • Further optimise the taylor_series method to work in all cases.

  • Start writing a series method that works with Basic expressions.

I think it is good to have the flexibility to work with Basic objects, as that integrates the ring_series module better with the existing series infrastructure of SymPy. At the end of the day, we want as many people as possible to benefit from using the code.

Cheers!

GSoC Week 7

So Far

PR 9575 and PR 9495 got merged this week. All the basic functions are now in place in polys.ring_series. The module supports Laurent as well as Puiseux series expansion. The order of the day is to extend this support to nested functions and to encapsulate the whole thing with classes. The idea is that the user need not bother about calling ring_series functions directly and that the class should hold all the relevant information about the series.

Given that SymPy already has series infrastructure, we need to decide whether ring_series will be integrated with it or will remain distinct from it.

Next Week

  • Discuss and decide how ring_series is to be used and write its classes accordingly. I will build upon PR 9614.

  • Write a series function that makes use of ring_series functions to expand arbitrary expressions.

I also need to start porting some of it to SymEngine. The basic polynomial operations are in place there. I need to discuss with Ondrej and Sumith how the series module will best work with the Polynomial module. Once that is sorted, I can start porting the ring_series functions.

Cheers!

GSoC Week 6

I successfully passed my mid-term evaluation this week, and now the second half of my project has begun! It has been a challenging journey so far, one that has made me explore new algorithms (some very ingenious) and read a lot of code (much more difficult than I had imagined). This week, Mario, whose code I am working on, helped in a big way by showing how and where to improve the algorithms. It is clear now that all the functions need to guarantee the order of the series they output. We were planning to keep that optional, but since the ring_series functions call each other, an error in the order will propagate and eventually make the result unpredictable.

So Far

PR 9575 is ready for merge, except for a bug in sympify that converts PythonRational into a float.

I had a discussion with Mario on PR 9495. He has suggested a lot of improvements for dealing with fractional exponents, especially pointing out that Newton's method may not be ideal in these cases. It is very interesting to compare different algorithms and come up with ways to optimise them for our needs. The scope for improvement is immense, and we'll need to decide the order in which we push the optimisations.

I started writing the RingSeries class for series evaluation in PR 9614. I was supposed to update my blog yesterday, but I dozed off while working on it. In my current approach, I am writing classes for all the functions so that they can be represented symbolically. Another issue that needs to be tackled is the expansion of nested functions, things like sin(cos(sin(x))). This will need some work, as there are many approaches to tackling it. Currently, I evaluate the inner functions (if they exist) recursively with prec + 1. This will work in simple cases, but not if there are cancellations, e.g., sin(cos(x)) - sin(cos(x**2)).

Next Week

  • Get PR 9575 merged.

  • Improve PR 9495 and get it merged.

  • Finalise the series class hierarchy and the series evaluation function.

The next phase of my project is in SymEngine, and there have been a lot of improvements and changes there. I will need to play with the new stuff and perhaps also think about ways to port ring_series there.

Cheers!

GSoC Week 5

I am almost halfway through my project and it has been an amazing learning experience. According to my timeline, I am supposed to finish the ring_series work in SymPy by the mid-term evaluation (i.e., by next week) and start porting it to SymEngine after that. Though I faced some hiccups, especially in figuring out how to deal with a symbolic ring, the deadline is achievable.

Till now

  • Series reversion has been added to ring_series after this PR of mine got merged.

  • I have started modifying the functions to operate on constant terms in the input series with this PR. I plan to finish it by this week. We are using the EX domain to represent the symbolic coefficients (see the small sketch after this list).

  • The PR to add Puiseux series has not yet been merged. I have added a check_precision function that gives an approximate number of iterations with which the ring_series functions will output a series of the requested order.
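
As a small illustration of what the EX domain buys us (a standalone sring call, not the PR's actual code; the expression is made up):

from sympy import EX, sin, symbols
from sympy.polys.rings import sring

x, c = symbols('x c')
# over EX, arbitrary symbolic expressions (here sin(c) and c) are valid coefficients
R, p = sring(sin(c)*x**2 + c*x + 1, x, domain=EX)
print(R)   # Polynomial ring in x over EX with lex order
print(p)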

Next Week

I expect both of the pending pull requests to get merged this week. After that, the only major task remaining would be to write the new Series class structure. So, the tasks are:

  • Discuss and write the new Series class structure.

  • Finish pending tasks, if any, and write more tests.

GSoC Week 4

Though I had resolved to update my blog by Friday every week, this post is quite late. This is mostly because this week was one of the most confusing yet, in terms of figuring out how to get things done within the existing code structure of SymPy. And that process is still on.

So Far

PR 9495 is under review. It took some time to decide how the precision/order of a Puiseux series needs to be handled.

We had an interesting discussion on reverting a series here. Fredrik Johansson suggested a fast algorithm of his for it. I also got to know a very ingenious way to expand elementary functions such as the exponential and the trigonometric functions. For example, for the exponential:

from sympy import exp

def exp_series(A, n):
    # A holds the series coefficients of f(x); B will hold those of exp(f(x)).
    # The recurrence follows from B' = A'*B, i.e. k*B[k] = sum_j j*A[j]*B[k-j].
    B = [exp(A[0])]
    for k in range(1, n):
        B.append(sum(j*A[j]*B[k-j]/k for j in range(1, min(len(A), k+1))))
    return B
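
A quick sanity check (my own example, not from the linked discussion): with A = [0, 1], i.e. f(x) = x, the routine reproduces the Taylor coefficients of exp(x).

print(exp_series([0, 1], 5))   # [1, 1, 1/2, 1/6, 1/24]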

Possibly the most confusing part of the project is getting ring_series to work over an expression domain, i.e., one where the coefficients can be arbitrary SymPy symbolic expressions. Multivariate series need to have multiple gens, implying that in a multivariate series the coefficients can be symbolic functions of PolyElement objects. However, the PolyElement class inherits from CantSympify, which means I can't use it in SymPy functions. I had quite a few discussions with my mentors over this, and I now know what the issues are. I need to solve them next week.
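
To see what goes wrong, here is a minimal, self-contained reproduction (the ring and the call to sin are only illustrative):

from sympy import QQ, sin
from sympy.polys.rings import ring

R, x = ring('x', QQ)
# PolyElement inherits from CantSympify, so sympify refuses ring elements
# and SymPy functions cannot take them as arguments
try:
    sin(x)
except Exception as err:
    print(type(err).__name__)   # SympifyError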

Next Week

  • Finalise how to handle symbolic coefficients and finish implementing it.

  • Read Fredrik's paper and try to implement it.

Cheers!