Curve Fitting Part 6: Linear Least Squares and its Errors

Linear least squares and its uncertainties

In [1]:
from collections import namedtuple
import numpy as np
import numpy.linalg
import matplotlib
matplotlib.rcParams['savefig.dpi'] = 144
%matplotlib inline
import matplotlib.pyplot as plt

Setting up a simple test problem

The goal is to demonstrate techniques of error estimation for linear least squares, so let's set up a problem about as simple as possible while still presenting the difficulties we might hope these techniques solve.

The problem will be a simple straight-line fit to data. I'm not even going to bother fooling with the best-fit values; the slope and intercept will be zero, and the data points are symmetrical about zero, so the slope/intercept parameterization should yield uncorrelated errors on the fitted parameters.

The goal of the fitting is to find \(x\) that minimizes \[ \left\|Ax-b\right\|^2_u, \] that is, the sum of squares of the residuals \(r\) divided by (our estimate of) the uncertainties \(u\). We will be interested in what happens when our estimate of the uncertainties differs from the true uncertainties, so here I make the distinction.

In [2]:
xs = np.linspace(-1,1,100)
true_uncerts = np.abs(xs)
claimed_uncerts = 0.35*np.ones_like(xs)

def make_fake_data(xs, true_uncerts):
    return true_uncerts*np.random.randn(len(xs))

ys = make_fake_data(xs, true_uncerts)
plt.errorbar(xs,ys,true_uncerts, linestyle='none')
<Container object of 3 artists>

The fitter

Here's a function to do the fitting. The actual calculation I defer to numpy's lstsq function, which uses the singular value decomposition under the hood and is supposed to be somewhat robust to degeneracies in the problem. (Not as robust as one could wish, unfortunately.) Because several of the error estimates are most easily calculated from some of these intermediate results, I calculate them here, though I will explain them below. Because we have quite a few values to return, I use a named tuple to return them.

(A brief aside: because scipy has only recently adopted the use of named return values, they don't want to break existing code that simply assigns the return values, as in x, chi2, rk, s = lstsq(...), and named tuples satisfy this criterion. But I believe that in some cases they have objects that behave a little more like python's function call signatures: the named tuple is extended by more arguments accessible only by name. This would be nice here, but I don't know where to find the machinery to do it easily. Even the named tuple machinery is kind of cumbersome. I could write some, it wouldn't be long, but... I'll go down that rabbit hole another time.)

The basic fitting procedure can be carried out by matrix algebra. If we first leave aside the possibility of uncertainties, we get \[ x = (A^TA)^{-1} A^T b. \] Of course, numerically, matrix inversion is a troublesome operation, particularly when you have nearly-degenerate problems; the SVD provides a way to manage the treatment of such matrices, and the degree of near-degeneracy is summarized in the rank (rk) and singular-value (s) returns from numpy.linalg.lstsq. In such a numerically-messy case, though, the covariance matrix is going to become completely horrible, so we're going to ignore that when we're looking at error estimates. But in any case, we can work with the "pseudo-inverse", which provides some of the same resistance to numerical headaches (and is also based on the SVD). Anyway, incorporating uncertainties simply amounts to scaling the rows of \(A\) and the elements of \(b\) by dividing by the claimed uncertainties.

In [3]:
linear_least_squares_result = namedtuple("linear_least_squares_result",
def linear_least_squares(Acols, b, uncerts):
    names = sorted(Acols.keys())
    A = np.array([Acols[n] for n in names]).T
    n = len(b)
    Ascaled = A/uncerts[:,None]
    x, chi2, rk, s = numpy.linalg.lstsq(A/uncerts[:,None], b/uncerts)
    res =,x)
    cov = np.linalg.pinv(,Ascaled))
    cov_scaled = cov*chi2/n
    leverage = np.diag(,,Ascaled.T)))
    # HC3, Mackinnon and White estimator
    cov_robust =
    return linear_least_squares_result(names=names,
        x=x, chi2=chi2, rk=rk, s=s, n=n,
        res=res, cov=cov, cov_scaled=cov_scaled, 
        leverage=leverage, cov_robust=cov_robust)
In [4]:
Acols = {"constant": np.ones(len(xs)),
         "linear": xs}
res = linear_least_squares(Acols, ys, true_uncerts)

print res.x
[-0.00994284 -0.12111   ]

Standard linear least squares

This just uses the standard formula for the covariance matrix of a linear least squares fit: \[ (A^T A)^{-1}, \] for an \(A\) with appropriately scaled rows. So it should work as long as it's provided with the correct uncertainties for the input values.

In [5]:
res = linear_least_squares(Acols, ys, true_uncerts)
print res.names
print res.cov
true_cov = res.cov
['constant', 'linear']
[[  4.15196328e-05   6.43500890e-19]
 [  6.43500890e-19   1.00000000e-02]]

Let's quickly check that, statistically. That is, we want to know, if we had the same true values but many different instances of inaccurate measurements, what would the distribution of fit parameters be? It would be multivariate normal, described by a covariance matrix. We want our fitting algorithm to return that matrix, or an estimate of it. We can test whether it's right by actually creating many fake data sets, fitting them, and taking the covariance matrix of the fitted parameters. This should converge (like \(\sqrt{n}\)) towards the true covariance matrix.

A note about how I structure the code: because the convergence requires many fake data sets, the more the better, I use three IPython blocks: one to set up the problem, one to run a bunch of simulations, and one to compute the results. They are written so I can rerun the middle block as many times as I have patience for and watch the computed results converge. If this gets slow, there are tricks to parallelize the middle block, possibly even running it in the background while I evaluate partial results.

In [6]:
def run_fake():
    ys = make_fake_data(xs, true_uncerts)
    res = linear_least_squares(Acols, ys, true_uncerts)
    return res.x
fake_xs = []
In [7]:
for i in range(100000):
In [8]:
print len(fake_xs)
fx = np.array(fake_xs)
print "simulation:"
print np.cov(fx.T)
print "claimed by linear_least_squares:"
print true_cov
[[  4.11844934e-05   4.03486016e-06]
 [  4.03486016e-06   1.00363575e-02]]
claimed by linear_least_squares:
[[  4.15196328e-05   6.43500890e-19]
 [  6.43500890e-19   1.00000000e-02]]

Well, they don't agree perfectly. The off-diagonal entries, which analytically should be exactly zero, are bad. But remember that error estimates converge like root n, and n here is not huge. So it looks like the errors here are okay.

Dealing with wrong error estimates

Of course, we just saw the best-case scenario: the errors were normal and we knew exactly their sizes. In general we don't always know our uncertainties this well. What can be done about this? That is, suppose we put in our best guess for the experimental uncertainties and do the fitting. Is there any way to produce more pessimistic error estimates that might be okay even if our estimated uncertainties are wrong?

There's one commonly-used method: compute the reduced \(\chi^2\) of the fit and rescale the uncertainties until that reduced \(\chi^2\) is one. This correctly compensates for the situation where all your uncertainties are under- or over-estimated by the same factor. Of course, it suffers from the severe drawback that it quietly absorbs errors that are due to unmodelled structure (for example if these data had a quadratic part) into inflated errors on model parameters. If your model is wrong - that is, missing some important effect - then inflating your error bars is not at all a satisfactory solution, and it leaves you without much in the way of quality-of-fit estimates.

Nevertheless, what if we are confident that our model correctly describes all features of the data but that our uncertainties are all wrong by differing amounts? This situation - uncertainties that are different from each other - is called "heteroskedasticity", and there is a fairly substantial statistical literature about how to deal with it.

First I'm going to repeat the trick above with the fake data to determine the "true" distribution of fitted parameters. It's going to be different from (worse than) the distribution above, because the fitting procedure is using the wrong weights in the least-squares fit.

In [9]:
def run_fake_het():
    ys = make_fake_data(xs, true_uncerts)
    res = linear_least_squares(Acols, ys, claimed_uncerts)
    return res.x
fake_het_xs = []
In [10]:
for i in range(100000):
In [11]:
print len(fake_het_xs)
fhx = np.array(fake_het_xs)
cov_het = np.cov(fhx.T)
print "Observed covariance when using wrong uncertainties:"
print cov_het
print "Observed covariance when using the right uncertainties:"
print true_cov
Observed covariance when using wrong uncertainties:
[[  3.38980301e-03  -5.14196812e-05]
 [ -5.14196812e-05   1.82065550e-02]]
Observed covariance when using the right uncertainties:
[[  4.15196328e-05   6.43500890e-19]
 [  6.43500890e-19   1.00000000e-02]]

Because we've weighted the data points uniformly, rather than based on their true uncertainty, we get an estimator that does a worse job than it could. I don't show it here, but it'll still converge to the right value eventually, if you keep adding data points, but it'll do so much more slowly than one using the right weights. So it pays to get your uncertainties as close to right as possible even if you're not going to trust them for error estimation.

Of course, if you make your covariance estimate with the wrong uncertainties, the covariance comes out badly wrong, in fact much too small in this case, and not even scaling so that the reduced \(\chi^2\) is one solves the problem:

In [12]:
res = linear_least_squares(Acols, ys, claimed_uncerts)

print "Claimed covariance with the wrong uncertainties:"
print res.cov
print "Claimed covariance with the wrong uncertainties, scaled:"
print res.cov_scaled
print "Observed covariance when using wrong uncertainties:"
print cov_het
Claimed covariance with the wrong uncertainties:
[[  1.22500000e-03   1.27950983e-19]
 [ -3.76251901e-19   3.60222772e-03]]
Claimed covariance with the wrong uncertainties, scaled:
[[  2.92846580e-03   3.05877615e-19]
 [ -8.99461898e-19   8.61142914e-03]]
Observed covariance when using wrong uncertainties:
[[  3.38980301e-03  -5.14196812e-05]
 [ -5.14196812e-05   1.82065550e-02]]


The most direct approach to dealing with heteroskedasticity - and actually a pretty broad range of other kinds of statistical challenge - is the bootstrap method. In this method, fake - even more fake than the ones above - data sets are created from the observed data itself. Then the fitting method is applied repeatedly to these fake data sets to produce a large collection of fitted parameters, and their distribution is described. This has the advantage over the method above that one need know nothing about the method used to generate the true data, in particular, one need not know the true uncertainties. Annoyingly, it does take as many simulations as you have patience for (as opposed to the lovely matrix calculations you can do in one shot), but it's at least simple to implement.

In [13]:
def run_boot():
    # draw from original data with replacement
    ix = np.random.randint(len(xs), size=len(xs))
    xs_l = xs[ix]
    ys_l = ys[ix]
    us_l = claimed_uncerts[ix]
    res = linear_least_squares({"constant":np.ones_like(xs_l), "linear":xs_l}, ys_l, us_l)
    return res.x
boots = []
In [14]:
for i in range(100000):
In [15]:
print len(boots)
ba = np.array(boots)
print "Bootstrap:"
print np.cov(ba.T)
print '"True":'
print cov_het
[[ 0.00294946 -0.00054469]
 [-0.00054469  0.01402717]]
[[  3.38980301e-03  -5.14196812e-05]
 [ -5.14196812e-05   1.82065550e-02]]

The bootstrap is doing a surprisingly bad job, here. The bootstrap might do better if we had more data points - a thousand rather than a hundred. Most of these techniques work only "asymptotically", that is, they converge towards being right as you supply more data points. The bootstrap is confusing because there are two numbers that go into it - the number of bootstrap trials and the number of data points. A small number of bootstrap trials has an obvious negative effect on the result, but even as the number of bootstrap trials goes to infinity, the number of data points still provides a limit on how well it can work.

There are smarter ways to apply the bootstrap to a linear least-squares problem; for example, maybe you could generate your fake data sets by adding residuals to the model, but with your residuals chosen with replacement from the set of residuals. This avoids the problem that some data points have more impact on the fitted parameters. This, by the way, is what the "leverage" measures; or rather, it measures the impact of the data points on the fitted model. There are derived measures to quantify the impact of data points on fitted parameters more directly. But in case you're curious, here's a plot of the leverage of these data points.

The leverage is computed from the projection matrix (so called because it takes a data set to the best-fit model evaluated at those points) \[ H = A(A^TA)^{-1}A^T \] by picking out the diagonal entries \[ h_i = \left[H\right]_{ii}. \]

In [16]:
res = linear_least_squares(Acols, ys, true_uncerts)
plt.plot(xs, res.leverage, ".")
plt.title("Using correct weights")

res = linear_least_squares(Acols, ys, claimed_uncerts)
plt.plot(xs, res.leverage, ".")
plt.title("Using bogus weights")
<matplotlib.text.Text at 0x7f64c095e450>

Note that when the correct weights are used, the points nearest zero have a great influence on the fitted model in their vicinity - their uncertainties are so small (and therefore their weights are so high) they're going to pull the best-fit line strongly towards them. With uniform weights, the endpoints have more pull just because of their influence on the slope.

Robust standard errors

Fortunately, linear least-squares is a very well-studied problem, and there's a nice matrix formula that is supposed to produce a very similar sort of calculation. Fundamentally what it tries to do is use the residual from each data point to estimate the uncertainty that data point should have had. Of course, this does not converge to a sensible estimate of the individual uncertainties, since no matter how many data points you have you have just as many uncertainties to estimate. But the uncertainties on your fit parameters are a finite set of modest size, and there's hope that an appropriate calculation could converge to those uncertainties. There are a few slightly different formulae for calculating this. I'm using the Mackinnon and White estimator, as described in Long and Ervin 1999 (as HC3). This is \[(A^TA)^{-1} A^T \text{diag}\left(\frac{r_i^2}{1-h_i^2}\right)_i (A^TA)^{-1}, \] where \(r_i\) is the \(i\)th residual, and \(h_i\) is the leverage of the \(i\)th point.

In [17]:
res = linear_least_squares(Acols, ys, claimed_uncerts)
print "Robust standard errors:"
print res.cov_robust
print '"True", approximated:'
print cov_het
Robust standard errors:
[[  3.78233363e-04  -7.43917638e-05]
 [ -7.43917638e-05   1.79503265e-03]]
"True", approximated:
[[  3.38980301e-03  -5.14196812e-05]
 [ -5.14196812e-05   1.82065550e-02]]

In [17]:

Full post


Since I now live in Europe, I find myself drawn in to the Eurovision song contest. On the one hand, it's an international contest even more fraught with under-the-surface (and on-the-surface) politics than the Olympics; one is drawn to root for one's team. But on the other hand, it is a festival of massively-produced pop music. I mean, the live performances use the very highest technology available — massive LED matrices, realtime projection mapping, elaborate robotic sets, even goofy two-wheeled contraptions — and the music is processed to within an inch of its life. Which is actually kind of fun to watch. Of course, all the songs are horrible earworms. So be warned; the video below is lovely and apropos for this blog, but it will be stuck in your head:

Oh, and of course, it's astonishingly gay — the hosts opened with a string of jokes about all the gay fans, and I saw at least as many rainbow flags as I did flags from any nation (should I make that any geographical nation?).

Below the jump I'll post a few others that I particularly liked (that is, they're hopelessly stuck in my head).

Full post

In defense of Duck and Cover

Now that the Cold War is over, people will occasionally look back at the horrifying situation — two opposing superpowers threatening the world with annihilation ­— view the threat as over, and laugh at some of the things people said and did back then. Now I know that this is partly because laughter is the only way to deal with such a nightmare, but some of the specifics they choose are just wrong. Take for example this John Oliver clip about nuclear weapons:
 He argues convincingly that there is still a threat, and that we need to do something about it. But he opens by mocking "Duck and Cover", which is actually misguided.

Full post

I for one welcome my new robot overlords, or at least housemates

I recently gave in to temptation and obtained a 3D printer. Combined with my robot vacuum cleaner, that means I am now outnumbered, and will be even more so when Coral (the 3D printer) has printed itself enough upgrades and starts printing parts for the upcoming Windeybot. Expect, therefore, many robot-oriented posts in the next little while. I'm still considering getting a cat to swing the balance a little towards organic life, but until then probably no cute cat posts. I'll see what I can do otherwise.

The 3D printer is from Builda3DPrinter, a Dutch company (actually just one guy living in Hardinxveld) that sells kits and supplies for delta printers. There's lots more to say about the robot and how it works, but as a hobby, 3D printing is an interesting mix of getting the machine to work well and coming up with cool things to print. I'll just say a bit about it below.

Full post

Numerical derivatives in parallel

My last post was about a fairly esoteric new language feature in python: coroutines. I wasn't reading up on them because they're whiz-bang cool (okay, not only because they're whiz-bang cool) but because I have an actual problem that they might help with. For the triple system, I find I need to do very slow numerical optimization problems. That is, the objective function takes seconds to evaluate. It can't be parallelized internally, and numerical minimization is largely a sequential operation: evaluate the function, figure out which way is downhill, go there, repeat. So this code currently uses just one core, and is almost as slow as the highly parallel Markov-Chain Monte Carlo code that does hundreds of times as much work. So obviously it would be nice to find some parallelism. I have a way to do that: parallelize gradient evaluations.

The other reason for this post is to make a point about code robustness. Getting numerical derivatives right is a notoriously difficult process, and getting them both efficient and right is even harder. The package I have been using, MINUIT, has been used by the particle physicists for decades, and they have hammered out all sorts of corner cases that a naive implementation might miss. So I'm not going to use a naive implementation, not least because my problem is numerically difficult. This means the code is not simple, and I don't want to try to restructure it manually to be more concurrent; I want the computer to handle the concurrency for me.

Full post

Numerical coroutines

Python 3.5 has new syntax for coroutines. Like many of python3's new features (unicode strings? who uses non-English languages in astronomy?), it is not immediately obvious that coroutines are of any use to people like me who mostly write relatively simple imperative scripts to wrangle large amounts of data or long computations. But cores aren't getting any faster, so we're going to have to start writing parallel code if we're going to take advantage of new machines. Now, if you want to take a gigantic Fourier transform, or do algebra on enormous matrices, you are going to need a witch design you a parallel algorithm. But our code is often full of easy opportunities for parallelism that we don't take advantage of because it's awkward. So I keep my eye out for ways to take advantage of this easy parallelism. I have found one, in the combination of coroutines and futures, and I'd like to describe it with a worked example. Maybe it'll sell you on some of these cool new language features!

Full post

Parallel ipython notebooks and the H test

The CPU on the nice shiny new server I log in to is really not much faster than that of the ratty old laptop I have in front of me. The server has more memory, and more disk space, but CPU-wise it just has more not faster. For that matter even my laptop has two cores. So if I have some heavy-duty computing task, I'd better find a way to make it use multiple cores in parallel. Some tasks are just plain hard to parallelize (solving ordinary differential equations, for example), but it turns up fairly often that I'm doing something embarrassingly parallel: there's a for loop somewhere that just does the same thing to lots of different pieces of input. If only it were easy to hand each piece to a different core! Well, there are various tools for doing this sort of thing, but most of them apply to scripts or programs that run non-interactively. It turns out that ipython offers tools for interactive parallel computing. I'm going to explain how I use them, by working through a test problem, checking some statistics on a periodicity test (the H test).

Full post

Remote ipython notebooks

I use ipython notebooks a lot. For interactive tinkering, for calculations with units, for interactive control of long-running processes, for easy parallelization (details to come), and to generate plots for all my papers. When I can, I run the notebooks on a server rather than my rather wimpy laptop; apart from the obvious advantages of more cores, more disk space, and more RAM, this also means I can leave things running for ages. It's all very convenient, but it requires some setup. As with the ssh configuration, I've found myself explaining the setup to several different people recently, so I thought I'd put it here. (Still working on that cute-kittens post.)

Full post

SSH incantations

I use ssh all the time, for shell access on remote machines, to transfer files, to carry VNC sessions, and to use different institutional affiliations to get at paywalled things. Over the years I have developed a particular way of setting it up that minimizes the pain involved in doing all of the above things. Several times in the past months I have tried to explain the whole setup to someone, so I think that's a sign it's time to post it here. I'll try to post something about, I don't know, fluffy kittens, boating on the Thames, seaborgium hexacarbonyl, or something soon for those of you who for some reason expected an interesting blog.

Full post

The Fortress of the North

I live in Groningen. It's a city of about 200,000, about 50,000 of whom are students at the universities. So I sort of thought it might resemble Kitchener-Waterloo, the college town where I did my undergraduate degree, but in fact I've never been anywhere quite like this before.

Full post

The problem with "beginner's mind"

I should say up front that I know nothing about authentic Zen Buddhism. But one of the concepts that has made it into popular culture that I rather like is "beginner's mind": an expert knows what she thinks about things, while a beginner sees everything anew and judges it on what it is. It sounds kind of charming, and like a recipe for creativity. But applying it too literally can get you in trouble.

Concretely, ASTRON (where I work) arranged for a company to come in and teach programming courses to anyone who wanted them. They taught three: Introductory Python, Numeric Python, and Advanced Python. Now, I think that's a brilliant idea: we astronomers all spend most of our time writing code at one level or another, and nobody seems to have bothered to teach us how to do it well. So that ASTRON (actually NWO I think) cares is a really good sign. So of course I wanted to participate. I have lots to learn about writing good programs. I don't think I'm an expert programmer, or even a Python expert, but that latter is partly because I try not to think of myself as an expert in anything. I knew the Introductory Python course would not be productive — I write python code every day. And the Numerical Python, again, was a good idea, but having written, for example, the reshape function in numpy, and the spatial module in scipy, I figured that was probably not going to be too productive either. But the Advanced Python course sounded promising. And I wanted to show my support for the whole idea. So I signed up.

Well, the course was the last three days, and it was a pretty good course, but not at all what I needed. Which should not have been much of a surprise: when I stopped to do the math, I realize that I wrote my first python program eighteen years ago. (Good God.) Still, it was interesting to see how they ran the course, and to think about how I would run one (because if I end up somewhere like McGill that has basically nothing for physics grad students, I will run one, official or not).

Full post

Visualizing the new pet

I recently wrote about a new object I am studying: a millisecond pulsar with two white dwarf companions. There is lots more I want to say about it, but I think it would be nice to show what it looks like, or at least, to show a video I made trying to make visible what's going on:
Edited to note that Blogger's YouTube embedding is distinctly flaky; video is here

This video shows the orbital motions in the triple system. The orbits are drawn to scale, showing the actual motions of the two stars (red and yellow) and the pulsar (white). The first ten seconds are played relatively slowly, showing the motion around the inner orbit, then we speed up to see the motion around the outer orbit. For a sense of the time scale, an "MJD" is a modified Julian day, so a single day long. The larger left panel shows all three bodies, with trails marking the motion of the outer companion and the center of mass of the inner system. The inset in the top right zooms in on the inner system, showing the pulsar and the companion, with trails marking their orbits. The dots that appear on the orbits mark moments when we have observations of the system, color-coded by telescope; it should be clear that we have quite thorough coverage of both orbits. Each measurement tells us the distance to the pulsar to within a kilometer, so that we can measure the tiny deviations of these orbits from perfect Keplerian ellipses, allowing us to reconstruct the orbit.

There's a little more to it than that.

Full post

New pet: PSR J0337+17

I did my PhD thesis on PSR J1023+0038, a millisecond pulsar that is at a fascinating point in its evolution. (In fact there have been developments since the thesis was submitted; more about that later.) But during a moment of procrastination, I got involved with a new and fascinating system. The name, unedifying as usual, is PSR J0337+17, and it is unique in that the pulsar has not just one white dwarf companion but two.

Full post

Flywheel energy storage

In the quest for something better to run our cars on than gasoline, one of the proposals is flywheels. In fact, for a while there were flywheel-powered buses running in Switzerland and Belgium. On one level, it makes a lot of sense: you're storing energy as mechanical motion, and we're pretty good at transmitting mechanical motion from place to place. On another level it scares the living daylights out of me: a car in motion uses tens of kilowatts, so the car must be able to store hundreds of kilowatt-hours. If you let all those loose at once bad things will happen: 100 g of TNT going off releases about a hundred kilowatt-hours. Fortunately it's hard to get gasoline to do this, but a flywheel is just itching to dump all its energy. Batteries are a little scary too, to be honest. But anyway, that's all a digression: I want to talk about some really staggering examples of flywheel energy storage: pulsars and black holes.

Full post

New job, new country

I just started a new job, and in the way of academic life, I had to move. I put some things in storage, hopped on a plane with two suitcases, and hey presto, I live in the Netherlands now. The new job is with ASTRON, the Dutch institute for radio astronomy, and it's going really well. Living in a new country actually takes more getting used to.

For one thing, in spite of the fact that I live in arguably the most pedestrianized city in Europe, and in a densely-populated country with good transit systems and cities built when rapid transit meant corn-fed horses, I find myself tempted to carpool to work. See, the thing is, ASTRON was built in the middle of a national park in order to minimize interference with the radio telescope that was also being built. Since the fifties, though, radio astronomy has made a few strides, and the telescope is now used chiefly in education. The telescopes we do use are all off-site, so we're in the middle of the park for historical reasons. It means we're twelve kilometers from the nearest train station, and even the bus drops you off in the village of Lhee, a kilometer or so away from work. So a ride to work is tempting. Fortunately, this is the Netherlands, so I've been biking.

Full post