Bernoulli numbers turn up in a surprising number of contexts. This page will document the most outstanding and make sense of why these numbers appear therein.


Definition of Bernoulli Numbers
Trigonometric Power Series
Sums of Products of Bernoulli Numbers
Sums of Powers of Integers
Bernoulli Polynomials
Euler-Maclaurin Summation
Relations to Zeta Functions

Definition of Bernoulli Numbers

These numbers are defined by the generating function

tet-1 =n=0 Bn tnn!

The sum on the left-hand side is the inverse of an exponential series with powers shifted down by one. Rearranging the definition gives

1= k=0 Bk tk k!× m=1 tm-1 m! = k,l=0 Bk tk+l k!(l +1)!

The first term on the right-hand side is constant, so that B0=1 . The remaining powers must then have coefficients equal to zero. Picking out an arbitrary power gives

k=0 n Bk k!(n-k +1)! =0 = k=0n (n+1 k)Bk

This summation that equals zero is all important in detecting the presence of Bernoulli numbers, and will be used time and again in what follows.

Since binomial coefficients are positive by definition, some of the Bernoulli numbers must be negative for the total sum to equal zero. The numbers can be evaluated recursively using

Bn=1 n+1 k=0 n-1 (n+1 k)Bk

With the aid of a CAS or a bit of calculational persistence, the first few values are

B1=12 , B2=16 , B3=0 , B4=130 , B5=0, B6=142 , B7=0 , B8=130 , B9=0 , B10=566 ,

All odd numbers after B1 are identically zero. This is easy to show by separating the first two terms of the sum:

n=0 Bn tnn! =1-t2 +n=2 Bn tnn! =t et-1 1+n=2 Bn tnn! =t et-1 +t2 =t2 et+1 et-1 =t2 cott2

Since the final right-hand side is an even function, there can be no odd terms on the left-hand side. The coefficients of these terms are necessarily zero.

Even Bernoulli numbers alternate in sign, but this will not be important in what follows.

Trigonometric Power Series

Because of the generating function definition, Bernoulli numbers appear prominently in the power series of trigonometric functions other than sine and cosine. Consider first the hyperbolic cotangent:

cothx=ex +ex ex -ex =e2x +1 e2x -1 =1+2 e2x -1 cothx =1+1x k=0 Bk (2x)k k! cothx=1x +k=2 2kBk k! xk-1 =1x +k=1 22k B2k (2k)! x2k-1

The index of the sum can be rewritten in the last step because all the odd Bernoulli numbers in the sum are identically zero. The singular term can be combined with the sum but is kept separate for clarity.

For the circular cotangent, let xix and, noting that sinhix=isinx , multiply by the i from the denominator of the left-hand side:

cotx =1x +k=1 (1)k 22k B2k (2k)! x2k-1

The hyperbolic tangent has a positive sign in its denominator, requiring a change of sign in the denominator of the the generating function. First note the partial fraction decomposition

1e2t -1 =1(et -1)(et +1) =12 [1et -1 -1et +1]

so that one has

tet+1 =tet -1 -2t e2t -1 =k=0 Bk tk -(2t)k k!

The power series for the hyperbolic tangent is thus

tanhx=ex -ex ex +ex =e2x -1 e2x +1 =1-2 e2x +1 tanhx =1-1x k=0 Bk (2x)k -(4x)k k! tanhx= k=2 2k (2k-1) Bk k! xk-1 =k=1 22k (22k -1) B2k (2k)! x2k-1

where again the index of the sum can be rewritten in the last step because all the odd Bernoulli numbers in the sum are identically zero.

For the circular tangent, let xix and divide by the i from the numerator of the left-hand side:

tanx= k=1 (1 )k+1 22k (22k -1) B2k (2k)! x2k-1

For the hyperbolic cosecant, use an intermediate partial fraction decomposition:

cschx=2 ex-e x =2 ex e2x -1 =1ex -1 +1ex +1 cschx =1x k=0 Bk 2xk -(2x)k k! cschx=1x +k=2 (2-2k) Bkk! xk-1 =1x +k=1 (2-2 2k) B2k (2k)! x2k-1

One more time the index of the sum can be rewritten in the last step because all the odd Bernoulli numbers in the sum are identically zero. The singular term is again kept separate for clarity.

For the circular cosecant, let xix and multiply by the i from the denominator of the left-hand side:

cscx =1x +k=1 (1)k (2-2 2k) B2k (2k)! x2k-1

Series for the hyperbolic and circular secants cannot be written in terms of Bernoulli numbers, but rather define a new set known as Euler numbers.

Sums of Products of Bernoulli Numbers

Consider the consistency of power series in the context of the two derivatives

ddx cothx =csch2x ddx cschx =cschxcothx

Including singular terms in sums for conciseness, these two relations can be expressed as

k=0 22k B2k (2k)! (2k-1) x2k-2 = k,l=0 (2-2 2k) (2-2 2l) B2k B2l (2k)! (2l)! x2k +2l-2   k=0 (2-2 2k) B2k (2k)! (2k-1) x2k-2 = k,l=0 (2-2 2k) 22l B2k B2l (2k)! (2l)! x2k +2l-2

Picking out an arbitrary power of the independent variable on each side of the equation, these two expressions are equivalent to

(2n-1) 22n B2n (2n)! = k=0n (2-2 2k) (2-2 2n-2k) B2k B2n-2k (2k)! (2n-2k)!   (2n-1) (2-2 2n) B2n (2n)! = k=0n (2-2 2k) 22n-2k B2k B2n-2k (2k)! (2n-2k)!

This will appear much simpler with the abbreviations

S1 = k=0n (2n 2k) B2k B2n-2k S2 = k=0n (2n 2k) 22k B2k B2n-2k

Note that both sums exhibit an interchange symmetry between k and n-k . Keeping this in mind, the two equalities can be written

(1-2n) 22n B2n =(4 +22n) S1 -4S2 (1-2n) (2 -22n) B2n =2S2 -22n S1

A bit of algebra establishes that S1 =S2 , except for n=1 when the system is indeterminate. From either of these two relations, and for n1 , one then has

k=0 n (2n 2k) B2k B2n-2k = k=0n (2n 2k) 22k B2k B2n-2k =(1-2n) B2n

This is somewhat surprising: the extra power of two inside the second summation has no effect on the result. These summations can also be written without the first and last terms as

k=1 n-1 (2n 2k) B2k B2n-2k = k=1 n-1 (2n 2k) 22k B2k B2n-2k =(2n +1) B2n

This result has been known since Euler, but perhaps this is a novel derivation.

Sums of Powers of Integers

Bernoulli numbers first arose in the context of summing powers of integers. To motivate the general form, first consider a trivial sum over zeroth powers:

k=1 nk0 =k=1 n1 =n

The corresponding sum over first powers uses a trick of Gauss,

k=1 nk =n2 (n+1) =12n2 +12 n

where the first and last terms are added and multiplied by half the number of such intermediate sums. This form remains an integer for n both even and odd.

The sum over squares of integers is most easily done using a trick of Knuth. Consider an identity for the sum of cubes:

k=1 nk3 +(n+1 )3 =k=0 n (k+1)3 =k=0 n (k3 +3k2+3k +1) =k=1 nk3 +3 k=1n k2 +3 k=1n k +(n+1)

The sum over cubes cancels on both sides, but one can then solve for the sum of squares using previous results:

k=1 nk2 =13 [(n+1 )3 -32n (n+1) -(n+1) ] =13n3 +12n2 +16n

This method can be used for all subsequent sums, as well as the previous two results.

These examples are enough to see that the final result begins with a power one more than that in the sum, descending to a single power. Thus assume that the sum over integers to an arbitrary power has the form

k=1n kp =k=1 p+1 cknk

For p = 0, this form is satisfied with c1 = 1. Consider induction on the assumed form:

k=1 n+1 kp =(n+1 )p +k=1 nkp k=1 n+1 kp =(n+1 )p +k=1 p+1 cknk =k=1 p+1 ck (n+1)k

Before expanding binomials, let nn-1 in the second line to simplify,

np +k=1 p+1 ck (n-1)k =k=1 p+1ck nk

so that now the right-hand side can be easily eliminated:

np +k=1 p+1 m=1k ck (1)m (km) nk-m =0

The indices k and m describe a triangular region in their coordinate plane. Diagonals in this triangular region correspond to the difference between the two indices. These differences run from zero, when the two indices are equal, up to the power p, when the indices differ the most. Working down in value from the uppermost diagonal, when the indices are equal, produces an new upper limit on the second index m. The double sum can thus be rearranged as

np +i=0 p m=1 p+1-i ci+m (1)m (i+m m) ni =0

One can now set coefficients of powers of n equal to zero for coefficient values. The highest power gives

1-cp+1 (p+1 1) =0 cp+1 =1p+1

while the next highest power gives

cp+1 (p+1 2) -cp (p1) =0 cp=12

which is curiously independent of the power p. Both of these results are consistent with the examples at the beginning of this section. For the general case one has

m=1 p+1-i ci+m (1)m (i+m m) =0

and the result just above suggest one should look at a descending index defined by i+m =p+1-k . The general case then becomes

k=0 p-i cp+1-k (1 )k ( p+1-k p+1 -i-k) =0

where a possible overall minus sign is ignored as inconsequential. Now consider casting this in the form of the all-important summation for Bernoulli numbers:

k=0 n (n+1 k)Bk =0

One would think at first that the alternating nature of the sum in question would make this impossible, except for one fact: all odd Bernoulli numbers are zero except B1 . Apart from explicitly adjusting the sign of the coefficient containing this value, one can ignore the alternating sign in the series.

One thus needs to have

cp+1-k ( p+1-k p+1-i -k) (p -i+1k) Bk

Ignoring factors not dependent of the index of summation k, this gives

cp+1-k Bkk! (p+1-k)! ( p+1k) Bk

Renaming the index and adjusting the normalization factor to match the first two values gives the result

ck =(1 )δ k,p p+1 ( p+1k) Bp+1-k

The extra minus sign accounts for the one term in the alternating series that contains the nonzero odd Bernoulli number B1 .

The final form of the sum over integers to an arbitrary power is thus

k=1 nkp =k=1 p+1 (1 )δ k,p p+1 ( p+1k) Bp+1-k nk

and the leading terms are

k=1n kp =1p+1 np+1 +12np +p12 np-1 -p(p-1) (p-2) 6! np-3 +

where as noted cp-2 0  , along every other subsequent coefficient, due to odd Bernoulli numbers being zero after the first. The coefficients here are naturally consistent with the examples at the beginning of this section. The linear term will not appear if multiplied by an odd Bernoulli number other than B1 .

This form of the sum has a rather surprising implication,

k=1 nkp =np +k=1 n-1 kp k=1 n-1 kp =k=1 nkp -np =k=1 p+1 1p+1 ( p+1k) Bp+1-k nk

since only one term of the sum is modified. That is, a sum over one less integer can be written with the standard Bernoulli numbers without modification.

Bernoulli Polynomials

These polynomials are defined by the generating function

text et-1 =n=0 Bn(x) tnn!

Inserting the definition of the Bernoulli numbers gives

n=0 Bn(x) tnn! =ext k=0 Bk tkk! = k,m=0 Bk xm tk+m k!m! =n=0 k=0 n (n k)Bk xn-k tnn!

where the last rearrangement picks out individual powers of the variable t. The polynomials are thus given in terms of the numbers by

Bn(x) =k=0 n (n k)Bk xn-k =k=0 n (n k) Bn-k xk

Again with the aid of a CAS or a bit of calculational persistence, the first few polynomials are

B0(x) =1 B1(x) =x-12 B2(x) =x2-x +16 B3(x) =x3 -32x2 +12x B4(x) =x4 -2x3 +x2 -130

From the definition it is trivial to see that

Bn(0) =Bn

The value of the polynomials at unity is in most cases identical, which can be seen by shifting the quantity n in the all-important summation for Bernoulli numbers:

Bn(1) =k=0 n (n k)Bk =k=0 n-1 (n k)Bk +Bn =Bn

This relation does not hold for n=1 because the intermediate sum is nonzero: instead one has B1(1) =B1 .

The all-important summation leads to the following property of the polynomials for n>0 :

01 dx Bn(x) =k=0 n (nk) Bk xn-k +1 n-k+1 |01 =1n+1 k=0 n (n+1 k)Bk =0

The derivative of the polynomials is

ddx Bn(x) =k=0 n-1 (nk) Bk (n-k) xn-k-1 =n k=0 n-1 (n-1 k)Bk xn-1-k =nBn-1 (x)

This relation holds for all values of the index and represents an Appell sequence. The relation can also be written in terms of an integral as

dx Bn(x) =1n+1 Bn+1 (x)

with a shift in index. Constants of integration are here chosen proportional to Bernoulli numbers for consistency. Evaluating the left-hand side as a definite integral from zero to one is an alternate way to show that the polynomials have identical values at these two endpoints, apart from the exception noted above.

Euler-Maclaurin Summation

The formula derived here provides a way to evaluate the error between an integral and its approximation as a sum. While the derivation given in venerable Whittaker & Watson is accurate, it provides no hint as to why Bernoulli polynomials are employed as opposed to any other set of similarly defined polynomials. This derivation will be clearer in that crucial aspect.

An integral over a small region can be approximated to first order by a trapezoid,

x0 x1 dx f(x) x1 -x02 [f(x0) +f(x1)]

where the heights at the endpoints are averaged and multiplied by the base. The error in this first approximation is clearly the difference of the two sides of this equation:

ε1 =x1 -x02 [f(x0) +f(x1)] -x0 x1 dx f(x)

Consider integration by parts of the following integral:

x0 x1 dx(x-c) f(x) =(x-c) f(x) |x0 x1 -x0 x1 dx f(x) =(x1 -c) f(x1) -(x0 -c) f(x0) -x0 x1 dx f(x)

Setting the constant equal to the average of the two endpoints, this is identical to the expression for error, which can thus be represented

ε1 =x0 x1dx (x-x0 +x12) f(x)

Now make the change of variable

x=(x1 -x0)u +x0 =hu+x0 dxdu =h

so that the integral becomes

ε1 =h2 01 du(u -12) f(x)

where the argument of the function is left as is for conciseness.

The procedure going forward is to perform repeated integration by parts on this error integral to produce a sum of decreasing error terms. The polynomials that will appear during this process constitute an Appell sequence.

Most derivations at this point recognize the first factor in the integrand as the Bernoulli polynomial B1 (u) and proceed without further comment. Such a procedure is problematic because this is also the Euler polynomial E1 (u) , which differs from the Bernoulli polynomial by a constant of integration. Since constants of integration can be assigned arbitrarily in a definite integral, there needs to be justification for how they are chosen.

One way to do this is to focus on a final result that will be useful in evaluation. So far one has the exact statement

x0 x1 dx f(x) =h2 [f(x0) +f(x1)] -h2 01 du(u -12) f(x)

over a small region. For extended regions one needs to add the contributions from each small region,

ab dx f(x) =h2 [f(a) +f(b)] +hk=1 n-1 f(a+kh) -k=0 n-1 h2 01 du(u -12) f( xk)

where b=a+nh and xk=hu +a+kh . The first two terms represent the trapezoidal rule for estimating integrals: internal points contribute twice that of endpoints from adjacent small regions.

The third term also has contributions from adjacent regions, but given that it consists of integrals the contributions appear with opposite signs. If they can be made to cancel, the sum of error terms would then only depend on endpoint contributions. This will be the criterion for determining constants of integration.

Define an Appell-like sequence of polynomials with

Pi+1 (u) =du Pi(u)

The integration by parts of a general error integral can then be written

εi =hi+1 01 du Pi(u) f(i) (xk) εi =hi+1 [Pi+1 (1) f(i) (xk +1) -Pi+1 (0) f(i) (xk)] -hi+2 01 du Pi+1 (u) f(i+1) (xk)

Clearly the condition that allows cancellation of interior error contributions is

Pi(0) =Pi(1)

If this condition holds, the estimate to the integral becomes

ab dx f(x) =h2 [f(a) +f(b)] +hk=1 n-1 f(a+kh) - m=2N (1)m hm Pm(0) [f (m-1) (b) -f (m-1) (a)] +R

after repeated integration by parts. The remainder term R is simply the next error integral in the sequence and will not be considered further.

And now for the crucial part: determining the polynomials themselves. Starting from the one appearing in the integrand, the first few are

P1(u) =u-12 P2(u) =u22 -u2 +c2 P3(u) =u33! -u2 2·2 +c2u +c3 P4(u) =u44! -u3 2·3! +c2 u22 +c3u +c4

from which one can deduce the general form

Pn(u) =unn! -un-1 2(n-1)! +k=2 nck un-k (n-k)!

Equating the value of this polynomials and zero and one gives the equation

1n! -1 2(n-1)! +k=2 n-1 ck (n-k)! =0

With the choice ck =Bkk! this becomes

1n! k=0 n-1 (nk) Bk =0

which is the all-important summation for Bernoulli numbers! Simply requiring these polynomials to have equal values at zero and one is enough to produce Bernoulli numbers and confirm that Bernoulli polynomials are indeed appropriate for the process. Comparing the general form here to the definition of Bernoulli polynomials gives

Pn(u) =1n! Bn(u)

The approximation to the integral is thus

ab dx f(x) =h2 [f(a) +f(b)] +hk=1 n-1 f(a+kh) - m=2N (1)m hm Bmm! [f (m-1) (b) -f (m-1) (a)] +R

and since the odd Bernoulli numbers in the sum are zero, this can be written a bit more simply as

ab dx f(x) =h2 [f(a) +f(b)] +hk=1 n-1 f(a+kh) - k=1N h2k B2k (2k)! [f (2k-1) (b) -f (2k-1) (a)] +R

The sum here is in general asymptotic, requiring an upper cutoff to prevent divergence.

It is instructive to compare this derivation to that originally given by MacLaurin as described in this paper, but with modern notation replacing fluxions. Start with the Taylor series of a function at some initial point:

f(x) =k=0 f(k) (a) (x-a)k k!

Integrating this from the initial point to one separated by unity will define a sum

S0=a a+1 dx f(x) =k=0 1(k+1)! f(k) (a)

whose first term is f(a) . Define a sequence of such sums with

S1 =f(x) |a a+1 =k=1 1k! f(k) (a) S2 =f(x) |a a+1 =k=2 1(k-1)! f(k) (a) S3 =f(x) |a a+1= k=3 1(k-2)! f(k) (a)

where subscript on the sum indicates that its first term is f(n) (a) : initial constant terms cancel in the difference between endpoints, leaving only terms corresponding to the upper endpoint.

It is not hard to see from writing out a few terms in each sum that a particular derivative will appear only in a finite number of sums. Combining the sums with coefficients to be determined and picking out individual derivatives on the right-hand sides gives

m=0 cmSm =k=0 f(k) (a) m=0 k cm (k+1 -m)!

One can then solve for f(a) by choosing the coefficients to make remaining sums on the right-hand side zero. If this is done, then one can write

f(a) =1c0 m=0 cmSm f(a) =a a+1 dx f(x) +c1c0 [f(a+1) -f(a)] +m=2 cmc0 [f (m-1) (a+1) -f (m-1) (a)]

Summing over n unit steps to a final point, one can write the integral as

ab dx f(x) =k=0 n-1 f(a+k) -c1c0 [f(b) -f(a)] -m=2 cmc0 [f (m-1) (b) -f (m-1) (a)]

And how should the coefficients be chosen? With cm =Bmm! the remaining sums are all

m=0 k Bm m! (k+1 -m)! =1 (k+1)! m=0 k (k+1 m)Bm =0

which as anticipated is the all-important summation for Bernoulli numbers. The integral is thus

ab dx f(x) =k=0 n-1 f(a+k) +12 [f(b) -f(a)] -m=2 Bmm! [f (m-1) (b) -f (m-1) (a)] ab dx f(x) =12 [f(b) +f(a)] +k=1 n-1 f(a+k) -k=1 B2k (2k)! [f (2k-1) (b) -f (2k-1) (a)]

which as expected is equivalent to the result of the first derivation with h=1 .

While it may appear at first that Bernoulli numbers arise here in some sense from inverting a Taylor series, one can start the solution for derivatives at any arbitrary sum for the general result

f(n) (a) =m=0 cm Sn+m =m=0 Bmm! Sn+m

Bernoulli numbers thus appear here, not from inverting a Taylor series per se, but from inverting its factorial coefficients. Curious...

Relations to Zeta Functions

Begin with the definition of the gamma function as an integral:

Γ(z) =0 dt tz-1 et

Change the variable of integration using t=(k +x)u ,

Γ(z) =(k+x )z 0 du uz-1 e(k +x)u

rearrange slightly and and sum over the introduced index,

k=0 1(k+x )z =1 Γ(z) k=0 0 du uz-1 e(k +x)u =1 Γ(z) 0 du uz-1 exu 1-e u

where the intermediate step uses an infinite geometric series. The left-hand sum is the Hurwitz zeta function ζ(z,x) expressed as a summation, and the right-hand side is its representation as an integral.

This integral converges as long as z>1 to avoid the singularity at the origin. For general values of the independent variable, interpret this integral as part of a complex contour that comes in from positive infinity, circles the origin and returns to positive infinity. That is, consider the contour integral

duu (u)z exu 1-e u

The direct variable u has a complex argument of zero on the positive real axis. A contour that comes in from negative infinity has an argument of iπ below the negative real axis, and an argument of iπ upon returning to negative infinity above the negative real axis. When this contour is rotated to handle the negative variable u these arguments are inverted, with the negative value on the part of the contour coming in from positive infinity above the positive real axis, and the positive value upon returning to positive infinity below the positive real axis.

Around the origin parametrize the contour as u=ρ eiφ  . On this part of the contour the integral becomes

0 2π idφ ρz eizφ exp(xρ eiφ) 1-exp(ρe iφ) ρz-1 f(φ) , ρ0

where the left-hand side is exact, and approaches the right-hand value in the indicated limit. There is thus no contribution from this part of the contour, and the complete contour integral can be written

duu (u)z exu 1-e u =duu (e±iπ u)z exu 1-e u =eiπz 0 duu uz exu 1-e u +eiπz 0 duu uz exu 1-e u =2isinπz 0 duu uz exu 1-e u

where explicit values of the phase have been separated out in the second step as per the description above of the contour. Combining intermediate results gives

ζ(z,x) =12isinπz Γ(z) duu (u)z exu 1-e u =Γ(1-z) 2πi duu (u)z exu 1-e u

using a reflection formula for the gamma function. One can now change the variable of integration using u u to get

ζ(z,x) =Γ(1-z) 2πi du uz-1 exu 1-eu

where the contour is now rotated to the negative real axis, but still circles the origin. This representation is given here in a standard online reference.

Now insert the definition of Bernoulli polynomials,

ζ(z,x) =Γ(1 -z) 2πi du k=0 Bk(x) uk+z-2 k!

and let argument z be equal to a negative integer. Under the residue theorem, only terms with k-n-1=0 will contribute, giving

ζ(n,x) =Γ(n +1) 2πi k=0 Bk(x) du uk-n-2 k! =B n+1(x) n+1

which is here in the standard online reference. Renaming the index gives the general relation

Bn(x) =n ζ(1-n, x)

Since the Hurwitz zeta function is analytic in both parameters, this relation can be used to extend Bernoulli polynomials as analytic functions of both index and argument.

When the second argument of the Hurwitz zeta function is unity, it becomes the Riemann zeta function. Since the value of Bernoulli polynomials at unity is usually the corresponding Bernoulli number, one has immediately

Bn =n ζ(1-n) , n>1

This expression can appear with an additional alternating sign, which is unnecessary because the Bernoulli numbers that would be affected by such a sign are all zero.

Uploaded 2023.08.17 — Updated 2023.11.14