Motivating the Integral with Euler's Method

I have a fun idea for how to teach and think about the integral in the context of freshman calculus. I’ve never actually used this in a class, and I suspect it’s not actually a great idea. But it’s a fun idea and worth at least playing with, even if it’s a bit too weird to help calculus novices understand what’s going on.

But first, I want to mention that if you want to support my writing, I now have a Ko-Fi account. Any tips would be appreciated and would help me write more essays like this.

The Big Ideas of Calculus 1

When I teach calculus I emphasize two big ideas: differential equations, and numerical analysis.

Differential equations generalize the concept of “rate of change”, and they’re the core of why calculus is useful: you can describe the rules a system follows, encode them in math, and draw conclusions. Calculus 1 students don’t have the tools to solve differential equations, but they can—and should—understand how a sentence like “the acceleration is proportional to the displacement” relates to the equation \(y’’ = -ky\).

Numerical approximation is often the way we use calculus, and increasingly so as computers are more powerful and available. I motivate the derivative with the idea of linear approximation: if I want to pretend my function is a line, and write \(f(x) = f(a) + m (x-a)\), what number \(m\) will do the best job? This develops into other methods for approximating the answers to questions that are too hard to answer directly: it leads into ideas like quadratic approximation and Newton’s method, and provides a foundation for numerical integration and Taylor series in Calculus 2.

Euler’s Method

If we combine these two ideas, we can try to numerically approximate the solution to a differential equation. Suppose we have a differential equation \(f’(t) = f(t) - f(t)^2/2\), and we know the initial condition that \(f(0)=1\). If we want to know \(f(3)\) we can get a rough guess with a linear approximation: we know \(f(0) = 1 \) and thus that \(f’(0) = 1 - \frac{1^2}{2} = \frac{1}{2} \), so we get

\[ f(3) \approx f(0) + f’(0) (3-0) = 1 + \frac{1}{2} \cdot 3 = \frac{5}{2}. \]

That’s only a rough estimate; linear approximation generally isn’t very accurate when the starting point and ending point aren’t close together. In fact the true value is \( \frac{2e^3}{e^3+1} \approx 1.905\), which isn’t terribly far off from \(2.5 \) but isn’t especially close either. But this is the best estimate we can really get using only \(f(0)\) and \(f’(0)\).
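As a quick check on that true value, here is a sketch of the verification. It assumes the closed form \(f(t) = \frac{2e^t}{e^t+1}\), which the text above quotes only at \(t=3\), and confirms that this function satisfies both the initial condition and the differential equation:

\[ \begin{array}{rl} f(0) & = \frac{2e^0}{e^0+1} = \frac{2}{2} = 1, \\
f’(t) & = \frac{2e^t (e^t+1) - 2e^t \cdot e^t}{(e^t+1)^2} = \frac{2e^t}{(e^t+1)^2}, \\
f(t) - \frac{f(t)^2}{2} & = \frac{2e^t}{e^t+1} - \frac{2e^{2t}}{(e^t+1)^2} = \frac{2e^t (e^t+1) - 2e^{2t}}{(e^t+1)^2} = \frac{2e^t}{(e^t+1)^2}. \end{array} \]

The last two lines agree, so \(f’(t) = f(t) - f(t)^2/2\) as required.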

However, we know a lot more than that, because we have a formula for \(f’(x)\). It’s a bit hard to use, because we need to know \(f(x)\) to compute \(f’(x)\); but we know we can approximate \(f(x_2)\) if we already know \(f(x_1)\) and \(f’(x_1)\). That allows us to do a recursive calculation:

\[ \begin{array}{rl} f(1) & \approx f(0) + f’(0) (1-0) = 1 + \left(1 - \frac{1^2}{2} \right) \cdot (1) = 3/2. \\
f(2) & \approx f(1) + f’(1) (2-1) \approx \frac{3}{2} + \left( \frac{3}{2} - \frac{ \left(\frac{3}{2} \right)^2}{2} \right)\cdot (1) = \frac{15}{8}. \\
f(3) & \approx f(2) + f’(2) (3-2) \approx \frac{15}{8} + \left(\frac{15}{8} - \frac{\left(\frac{15}{8} \right)^2}{2} \right) \cdot (1) \\
& = \frac{255}{128}. \end{array} \]

Thus we estimate \(f(3) \approx \frac{255}{128} \approx 1.99\).

This still isn’t an exact value for \(f(3)\); but this approximation is much better than our first try. And if this isn’t close enough, we can do even better by breaking our approximation into more steps: with six steps we get \(f(3) \approx 1.95\) and with sixty we get \(f(3) \approx 1.909\). More steps take more work, but also give us a more precise answer.
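Nobody wants to do sixty steps by hand, so here is a minimal Python sketch of the recursive calculation above. The names `euler` and `rhs` and the choice of equal-size steps are illustrative assumptions, not anything from a library; the variable `exact` uses the closed-form value quoted earlier.

```python
import math

def euler(rhs, t0, y0, t_end, steps):
    """Approximate y(t_end) for y' = rhs(t, y) with y(t0) = y0, using equal steps."""
    h = (t_end - t0) / steps          # step size
    y = y0
    for k in range(steps):
        t = t0 + k * h                # left end of the current step
        y += rhs(t, y) * h            # linear approximation across the step
    return y

def rhs(t, y):
    """Right-hand side of the example equation f'(t) = f(t) - f(t)^2 / 2."""
    return y - y**2 / 2

exact = 2 * math.exp(3) / (math.exp(3) + 1)   # true value of f(3) quoted in the text

for n in (1, 3, 6, 60):
    estimate = euler(rhs, 0.0, 1.0, 3.0, n)
    print(f"{n:3d} steps: f(3) is about {estimate:.4f} (error {abs(estimate - exact):.4f})")
```

With one step and three steps this reproduces the hand calculations \(\frac{5}{2}\) and \(\frac{255}{128}\) from above.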

This approach is known as Euler’s method, and it allows us to numerically approximate the solution of any first-order ordinary differential equation given an initial condition. With a little bit of work, we can generalize this to any ordinary differential equation; it’s quite straightforward and flexible.

It’s also basically just integration.

What is an integral?

In a typical calculus course, we motivate the integral with the area problem: we have the graph of some function, and we want to find the area under that curve. We can approximate that area by chopping it up into rectangles, which gives us the Riemann sum. And then as the number of rectangles approaches infinity our approximation gets really good, which allows us to define the integral.

[Animation: a Riemann sum as the number of rectangles goes to infinity]

\[ \int_a^b f(t) \,dt = \lim_{n \to \infty} \sum_{k=1}^n f(x_k) \Delta x \]

This definition has a lot of symbols in it, and is generally intimidating to freshman calculus students. But it does accurately describe what we’re doing and why: the key idea of the integral is to break a calculation into pieces, do an approximation on each piece, and then add the results together. This will give us an approximate answer to our original question; as we use more and smaller pieces, the approximation gets better, and so in the limit we get an exact answer.
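To make the "more and smaller pieces" idea concrete, here is a small Python sketch of a Riemann sum. The formula above doesn’t pin down the sample points \(x_k\); this sketch assumes right endpoints, \(x_k = a + k \Delta x\), and uses \(f(t) = t^2\) on \([0,1]\) purely as an illustrative example (the exact area is \(\frac{1}{3}\)). The names `riemann_sum` and `square` are hypothetical.

```python
def riemann_sum(f, a, b, n):
    """Right-endpoint Riemann sum of f on [a, b] with n equal-width rectangles."""
    dx = (b - a) / n                                   # width of each rectangle
    return sum(f(a + k * dx) * dx for k in range(1, n + 1))

def square(t):
    """Illustrative integrand f(t) = t^2."""
    return t**2

# The sums should settle toward 1/3 as n grows.
for n in (4, 40, 400, 4000):
    print(f"n = {n:4d}: sum is about {riemann_sum(square, 0.0, 1.0, n):.6f}")
```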

So this formula directly answers the question that we’re asking. And when we want to think about applications of the integral, the Riemann sum definition is useful: it helps us figure out what the integral is actually computing, and so what problems it can help solve. But Riemann sums are a huge pain to actually do computations with, so we generally don’t.

Instead, we rely on the Fundamental Theorem of Calculus, which comes in two parts.

Fundamental Theorem of Calculus, Part 1:
Given a continuous function \(f(x)\) and a number \(a\), we can define a new function \(F(x) = \int_a^x f(t) \,dt\). Then \(F’(x) = f(x)\).

Part 1 tells us that the derivative undoes the integral; the derivative of the integral of \(f\) is just \(f\). This is conceptually cool, and it does allow us to compute something. But it doesn’t directly help us compute the integral. Instead, we use it to prove¹ a second statement.

Fundamental Theorem of Calculus, Part 2:
If \(F’(x) = f(x)\), then \(\int_a^b f(t) \,dt = F(b) - F(a)\).

This is also known as the Evaluation Theorem, or sometimes the Net Change Theorem. And it’s the tool we actually use in practice to compute integrals—to the extent that people mainly associate “integration” with finding the antiderivative \(F(x)\), not with finding the number corresponding to the area under the curve.
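A short Python sketch shows why the Evaluation Theorem is the tool we reach for in practice. The example \(f(t) = \cos t\) on \([0, \pi/2]\), with antiderivative \(F(t) = \sin t\), is an illustrative choice, and `riemann_sum` is a hypothetical helper using right endpoints.

```python
import math

def riemann_sum(f, a, b, n):
    """Right-endpoint Riemann sum of f on [a, b] with n equal-width rectangles."""
    dx = (b - a) / n
    return sum(f(a + k * dx) * dx for k in range(1, n + 1))

a, b = 0.0, math.pi / 2

by_sum = riemann_sum(math.cos, a, b, 100_000)   # 100,000 evaluations of the integrand
by_ftc = math.sin(b) - math.sin(a)              # two evaluations of an antiderivative

print(f"Riemann sum: {by_sum:.6f}")
print(f"F(b) - F(a): {by_ftc:.6f}")
```

The two numbers agree to several decimal places, but the second one costs two function evaluations instead of a hundred thousand.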

And this all works, but we’ve moved pretty far away from the original question, and the connections pass through some relatively abstract territory. It’s hard to really intuitively see how this calculation relates to the original question.

Maybe there’s a better way.

The antiderivative as a differential equation

Let’s start by asking this question backwards. Suppose there’s some function you’re interested in, but you don’t have a formula for it. Instead you just have a formula for the derivative. In practical terms, this happens in dead reckoning: if you can’t measure where you are, but you know where you started and how fast you’re moving, you can estimate where you end up.

So suppose we know our speed \(F’(x)\), and our starting position \(F(a)\), and we want a way to figure out our current position \(F(x)\). We want to compute an antiderivative! The FTC part 2 tells us that \(F(x) = F(a) + \int_a^x F’(t) \,dt \), so we could figure this out by doing an integral. But I want to follow a different thought process.

We can start by saying, we know what \(F(a)\) is, and since we have a formula for \(F’(x)\), we can compute \(F’(a)\). Then we can use the linear approximation formula to estimate \[ F(x) \approx F(a) + F’(a) (x-a). \] So if we know, say, that \(F(1)=3\) and \(F’(x) = 3x^2\), we can estimate that \(F(5) \approx 3 + 3(5-1) = 15\).

Linear approximation gives a pretty decent estimate if \(x\) and \(a\) are close, but if they’re far apart it’s not very good. Consequently it doesn’t really work here: the antiderivative of \(3x^2\) with \(F(1) = 3\) is \(F(x) = x^3 + 2\), so in reality \(F(5) = 127\).

But we can improve this exactly the same way we did before, by using Euler’s method! The problem is that the two points on my linear approximation are too far apart. But we can try to approximate the value of \(F\) somewhere closer to \(1\), like at \(3\).

\[ F(3) \approx F(1) + F’(1)(3-1) = 3 + 3(2) = 9. \]

And then, since we also know \(F’(3) = 27\), we can estimate

\[ F(5) \approx F(3) + F’(3)(5-3) \approx 9 + 27(5-3) = 63. \]

Still not right, but much better! And we can improve even further by doing more steps:

\[ \begin{array}{rl} F(2) & \approx F(1) + F’(1)(2-1) = 3 + 3 = 6 \\
F(3) & \approx F(2) + F’(2)(3-2) \approx 6 + 12 = 18 \\
F(4) & \approx F(3) + F’(3)(4-3) \approx 18 + 27 = 45 \\
F(5) & \approx F(4) + F’(4)(5-4) \approx 45 + 48 = 93. \end{array} \]

This still isn’t quite right, but it’s even closer; and as we take more and smaller steps, the approximation keeps getting better.
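Here is a minimal Python sketch of this stepping procedure, so we can watch the estimates climb toward \(127\). The names `euler_antiderivative` and `dF` are hypothetical, and the sketch assumes equal-size steps.

```python
def euler_antiderivative(dF, a, F_a, b, steps):
    """Estimate F(b) from F(a) = F_a and a formula dF for F', using equal steps."""
    h = (b - a) / steps               # step size
    F = F_a
    for k in range(steps):
        x = a + k * h                 # left end of the current step
        F += dF(x) * h                # linear approximation across the step
    return F

def dF(x):
    """The known derivative F'(x) = 3x^2 from the example."""
    return 3 * x**2

for n in (1, 2, 4, 40, 400, 4000):
    estimate = euler_antiderivative(dF, 1.0, 3.0, 5.0, n)
    print(f"{n:4d} steps: F(5) is about {estimate:.3f}")
```

One, two, and four steps reproduce the estimates \(15\), \(63\), and \(93\) computed by hand above.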

Riemann Sums as Euler’s Method

This is basically Euler’s method. But why is it an integral? Let’s reorganize the calculation to make it clearer what’s happening.

\[ \begin{array}{rl} F(5) & \approx F(4) + F’(4)(5-4) \\
& \approx F(3) + F’(3) (4-3) + F’(4) (5-4) \\
& \approx F(2) + F’(2) (3-2) + F’(3) (4-3) + F’(4) (5-4) \\
& \approx F(1) + F’(1) (2-1) + F’(2) (3-2) + F’(3) (4-3) + F’(4) (5-4) \\
& = 3 + 3 \cdot 1 + 12 \cdot 1 + 27 \cdot 1 + 48 \cdot 1 = 93. \end{array} \]

At this point this should be starting to look familiar. We’re taking a bunch of steps of size \(\Delta x = 1\), and for each step we’re multiplying the step size by the derivative at the start of that step. So we just computed

\[ F(5) \approx F(1) + \sum_{k=1}^4 F’ \big( 1 + (k-1) \cdot 1 \big) \cdot 1. \]

More generally, if we take \(n\) steps of size \(\Delta x = \frac{5-1}{n}\), we get

\[ F(5) \approx F(1) + \sum_{k=1}^n F’\big( 1 + (k-1) \Delta x \big) \Delta x. \]

And the sum on the right-hand side is exactly a left-endpoint Riemann sum for \(\int_1^5 F’(t) \,dt\); the only extra piece is the term \(F(1)\). If we rearrange it we get

\[ F(5) - F(1) \approx \sum_{k=1}^n F’\big( 1 + (k-1) \Delta x \big) \Delta x. \]
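The claim that stepping forward and summing are the same computation can be checked directly. The following Python sketch computes the estimate both ways for the running example; the function names are hypothetical, and the sum samples \(F’\) at the start of each step, matching the formula above.

```python
import math

def dF(x):
    """The known derivative F'(x) = 3x^2 from the example."""
    return 3 * x**2

def left_riemann_sum(f, a, b, n):
    """Sum of f(a + (k-1) dx) * dx for k = 1..n, sampling the start of each step."""
    dx = (b - a) / n
    return sum(f(a + (k - 1) * dx) * dx for k in range(1, n + 1))

def euler_endpoint(f, a, F_a, b, n):
    """Step from F(a) = F_a to an estimate of F(b), one linear approximation at a time."""
    dx = (b - a) / n
    F = F_a
    for k in range(1, n + 1):
        F += f(a + (k - 1) * dx) * dx
    return F

for n in (4, 40, 400):
    via_euler = euler_endpoint(dF, 1.0, 3.0, 5.0, n)
    via_sum = 3.0 + left_riemann_sum(dF, 1.0, 5.0, n)   # F(1) plus the Riemann sum
    print(n, via_euler, via_sum, math.isclose(via_euler, via_sum))
```

The two columns match up to floating-point rounding, since the loop and the sum add up the same terms in a different order.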

I see two ways to think about this formula. One is that the indefinite integral contains a \(+C\) term, because antiderivatives aren’t unique. So while \(\int F’(t) \,dt\) is an antiderivative of \(F’(x)\), we don’t necessarily get the same function as our original \(F(x)\). Instead, the FTC just guarantees we have \(F(x) +C\), and \(F(1)\) is just the \(+C\) term.

But the interpretation that seems clearer to me is that we’re really computing the change in the value of \(F\). This should make physical sense: the calculations with the speed tell us how far we’ve moved, not where we are. Thus the Euler’s method calculation tells us our displacement; but if we add that on to our starting position, we find our ending position.

Is this a good idea?

Mathematically, this all works out. It’s a cute argument and I’m glad I’ve found it. But there are plenty of fun math ideas that don’t belong in a freshman calculus course.

This approach has one obvious, major disadvantage: no one else teaches it like this, so it would probably leave students confused if they go on to take another course with someone else. And that’s probably enough to make it not worth doing², on its own.

But while that’s a real obstacle to adopting this approach in one class, it’s also kind of dodging the interesting questions about whether this would be a better approach. What if we could get everyone to switch? Should we?

One problem is that this argument isn’t at all rigorous. As long as we believe that Euler’s method will converge to the right answer, then the integral will as well; but I don’t know how you’d prove that Euler’s method converges without referencing the integral, so that seems fairly circular.

That objection seems fatal to me—in an upper-division Real Analysis course. In a freshman calculus course, nothing is ever going to be fully rigorous, and the proofs involving Riemann sums especially won’t be because getting the technical details of Riemann sums correct is hard. So I don’t mind a little non-rigor, especially if it helps students develop a clear intuitive understanding of what we’re trying to do.

In fact, having to avoid some of the abstraction involved in proving the Fundamental Theorem of Calculus might be a win, overall. That’s one of those lectures where I’m always confident my students aren’t really following the details, and are just hanging on trying to survive until we get back to computing things. On the other hand, it’s good for them to see some abstract formalism, even if they’re not ready to fully understand it yet. You have to see your first scary proof sometime!

Another problem is that this derivation captures the relationship between the Riemann sum and the antiderivative, but presents it exactly backwards. In most applications, the Riemann sum is the question we want to answer; the antiderivative is the tool we use to answer it. But the Euler’s method approach treats the antiderivative as the question, and the Riemann sum as the way we compute the answer—which is completely wrong since the Riemann sum is nearly impossible to compute outside of the simplest cases. I think this is a really deep problem with this approach. One of the big ideas I want my students to engage with is figuring out the difference between identifying a question, and computing the answer; giving it to them backwards seems like an obstacle to developing that understanding.

But I do really like the way this approach connects the integral back to the other big ideas in the class. Not just to the derivative; any presentation of the FTC will draw a link between integration and differentiation. But this makes the integral seem connected to the themes of numeric approximation and differential equations, which ties the entire course together neatly.

And really, that sums it up, I think. It’s always nice to tell a neat story that ties the whole class together. But it probably isn’t as important as making sure our students understand each piece well on its own. I have to resist the temptation to do something pretty, and elegant, and unnecessarily confusing. So this is a fun idea, but for now I’m going to teach this normally.

Do you have a clever way to motivate the integral? Do you think I should actually be using this approach in my course? Any other thoughts on teaching integration? Tweet me @ProfJayDaigle or leave a comment below.

¹ This proof relies heavily on specific special properties of the real numbers, and in particular the property that if \(f’(x)=0\) then \(f(x)\) is constant. This isn’t true if we allow functions to be defined solely for rational numbers; the real numbers are exactly the set that makes it work.
² Or at least not worth doing as the motivation to the integral. I think it’s fine to do this as a followup, or an application of the integral. If you have an extra day to spend on integration, this isn’t the worst thing you could do. But if you have extra days in your calculus syllabus please tell me how you got them.

Tags: math teaching calculus numerical analysis differential equations integration

Maybe-Mathematical Musings

Math, Teaching, Literature, and Life

March 15, 2023
