
$$\ldots\; n - 1)\, S_{n-1} - (n - 2)\, S_{n-2} \qquad (1)$$

Finally, the number $E_n$ of expressions with $n$ internal nodes, $p_1$ unary operators, $p_2$ binary operators and $L$ possible leaves is recursively computed as

$$(n + 1)\, E_n = (p_1 + 2 L p_2)(2n - 1)\, E_{n-1} - p_1 (n - 2)\, E_{n-2} \qquad (2)$$

If $p_1 = p_2 = L = 1$, Equation 2 boils down to Equation 1. If $p_2 = L = 1$ and $p_1 = 0$, we have $(n + 1)\, E_n = 2(2n - 1)\, E_{n-1}$, which is the recurrence relation satisfied by Catalan numbers. The derivations and properties of all these formulas are provided in Section B of the appendix.

In Figure 1, we represent the number of binary trees ($C_n$) and unary-binary trees ($S_n$) for different numbers of internal nodes. We also represent the number of possible expressions ($E_n$) for different sets of operators and leaves.
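As a sanity check on these counts, here is a minimal Python sketch that computes $E_n$ in two ways: by decomposing on the root operator, and by a three-term recurrence of the same shape as Equation 2. The function names and the base cases $E_0 = L$ and $E_1 = (p_1 + L p_2) L$ are assumptions of this sketch (obtained by counting a lone leaf and a single operator applied to leaves), not taken from the text. The recurrence in the sketch uses $p_1^2$ in its last term. This coincides with Equation 2 as printed whenever $p_1 \in \{0, 1\}$, which covers both special cases above, and it is what the direct count reproduces for larger $p_1$; treat that as an observation from this sketch rather than a statement from the text.

```python
def count_direct(n_max, p1, p2, L):
    """Count expressions with 0..n_max internal nodes by decomposing on the root."""
    counts = [L]  # E_0: an expression with no internal node is a single leaf
    for n in range(1, n_max + 1):
        # Unary root: p1 operator choices over a subtree with n - 1 internal nodes.
        unary = p1 * counts[n - 1]
        # Binary root: p2 operator choices; the other n - 1 nodes are split
        # between the left and right subtrees in every possible way.
        binary = p2 * sum(counts[k] * counts[n - 1 - k] for k in range(n))
        counts.append(unary + binary)
    return counts


def count_by_recurrence(n_max, p1, p2, L):
    """Count expressions with a three-term recurrence shaped like Equation 2."""
    # Assumed base cases (not stated in the excerpt): E_0 and E_1.
    counts = [L, (p1 + L * p2) * L]
    for n in range(2, n_max + 1):
        # Assumption of this sketch: the last term carries p1 ** 2, which
        # equals p1 when p1 is 0 or 1.
        numerator = ((p1 + 2 * L * p2) * (2 * n - 1) * counts[n - 1]
                     - p1 ** 2 * (n - 2) * counts[n - 2])
        quotient, remainder = divmod(numerator, n + 1)
        if remainder:
            raise ArithmeticError("recurrence did not produce an integer")
        counts.append(quotient)
    return counts[: n_max + 1]


print(count_by_recurrence(5, 0, 1, 1))  # binary trees: [1, 1, 2, 5, 14, 42]
print(count_by_recurrence(5, 1, 1, 1))  # unary-binary trees: [1, 2, 6, 22, 90, 394]
```

The direct count costs quadratic time in `n_max`, while the recurrence is linear, which is the practical reason to prefer a closed recurrence when tabulating large problem spaces.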
Figure 1: Number of trees and expressions for different numbers of operators and leaves. $p_1$ and $p_2$ correspond to the number of unary and binary operators respectively, and $L$ to the number of possible leaves. The bottom two curves correspond to the number of binary and unary-binary trees (enumerated by Catalan and Schroeder numbers respectively). The top two curves represent the associated number of expressions. We observe that adding leaves and binary operators significantly increases the size of the problem space.

## 3 Generating Datasets

Having defined a syntax for mathematical problems and techniques to randomly generate expressions, we are now in a position to build the datasets our models will use. In the rest of the paper, we focus on two problems of symbolic mathematics: function integration and solving ordinary differential equations (ODE) of the first and second order.

To train our networks, we need datasets of problems and solutions. Ideally, we want to generate representative samples of the problem space, i.e. randomly generate functions to be integrated and differential equations to be solved. Unfortunately, solutions of random problems sometimes do not exist (e.g. the integrals of $f(x) = \exp(x^2)$ or $f(x) = \log(\log(x))$ cannot be expressed with usual functions), or cannot be easily derived. In this section, we propose techniques to generate large training sets for integration and first and second order differential equations.

### 3.1 Integration

We propose three approaches to generate functions with their associated integrals.

**Forward generation (FWD).** A straightforward approach is to generate random functions with up to $n$ operators (using methods from Section 2) and calculate their integrals with a computer algebra system. Functions that the system cannot integrate are discarded. This generates a representative sample of the subset of the problem space that can be successfully solved by an external symbolic mathematical framework.
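The generate-integrate-discard loop of the forward approach can be sketched in a few lines of Python. Everything here is a hypothetical stand-in chosen to keep the example self-contained: functions are Laurent polynomials stored as `{exponent: coefficient}` dictionaries rather than expression trees, `random_poly` replaces the generator of Section 2, and `toy_integrate` plays the role of the computer algebra system. The toy integrator deliberately fails on $x^{-1}$, whose integral exists but lies outside its vocabulary, which mirrors a real system failing on an integrable function.

```python
import random
from fractions import Fraction


def toy_integrate(poly):
    """Integrate a Laurent polynomial {exponent: coefficient}; return None on failure."""
    result = {}
    for exponent, coeff in poly.items():
        if exponent == -1:
            # The antiderivative is log(x), which this toy system cannot express.
            return None
        result[exponent + 1] = Fraction(coeff, exponent + 1)
    return result


def random_poly(rng, max_terms=3):
    """Hypothetical generator: a few terms with exponents in [-2, 3]."""
    n_terms = rng.randint(1, max_terms)
    exponents = rng.sample(range(-2, 4), n_terms)
    return {exponent: rng.randint(1, 5) for exponent in exponents}


def forward_generate(n_samples, integrate, seed=0):
    """Collect (function, integral) pairs, discarding functions the system cannot integrate."""
    rng = random.Random(seed)
    dataset = []
    while len(dataset) < n_samples:
        f = random_poly(rng)
        F = integrate(f)
        if F is None:
            continue  # discarded: the integrator failed on this function
        dataset.append((f, F))
    return dataset


def differentiate(poly):
    """Differentiate a Laurent polynomial, used here only to verify the pairs."""
    return {e - 1: c * e for e, c in poly.items() if e != 0}


pairs = forward_generate(5, toy_integrate)
for f, F in pairs:
    assert differentiate(F) == f  # each stored integral differentiates back to f
```

Because failures are silently dropped, the resulting dataset never contains a term in $x^{-1}$: the sample is representative only of what the integrator can solve, not of the generator's full output.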
**Backward generation (BWD).** An issue with the forward approach is that the dataset only contains functions that symbolic frameworks can solve (they sometimes fail to compute the integral of integrable functions). Also, integrating large expressions is time-consuming, which makes the overall method particularly slow. Instead, the backward approach generates a random function $f$, computes
