On 5th January, just before the public launch of data.gov.uk, I spoke to Tim Berners-Lee about how he helped the British government open up public data. This interview was published on the Prospect website in January 2010. The full inside story of the government data initiative, written by James Crabtree and me, was published in the February issue of Prospect.
Tom Chatfield: How did you come to be working with the British government?
Tim Berners-Lee: It began with a lunch at Chequers, when the prime minister asked me what I felt the UK should do in order to make the best use of the internet, and I said, you should put all your government data onto the web. And he said, okay then, let’s do it. [laughs] So when one has spent a lot of one’s life persuading people to put things onto the web, and persuading people to be open, it’s almost disarming to have somebody say that straight away. The result of that was a team in the Cabinet Office under Andrew Stott. Various people in the UK government had experience of this area already, so it was a question of how to accelerate this as much as possible.
TC: I know it’s sometimes felt in the world of online innovation that governments are not the best people for creating radical change. What motivated you to put this effort into working with and within the British government?
TBL: The neat thing about this is that there is such a clear win: the data we’re talking about has already seen a lot of effort expended on creating it. It’s a valuable resource that has been produced by parliament for a particular purpose, and it has typically been sitting there as a really valuable but under-used resource.
The thing people were amazed about with the web itself is that when you put something online, you don’t know who is going to use it. You’re looking for something and you think it’s impossible that somebody will have done it before, but you find that they have, and the web saves your bacon. It’s the serendipity, the unexpected reuse, that is the value of the web. When you move to data, suddenly this is not applied because data actually is really desperately boring when you look at it by itself. When you put data together you can derive very powerful new insights; so I think that realisation that the UK had got all this resource that was under-utilised means the arguments become very obvious for putting them out there for people to re-use.
TC: You’ve been working in this field for some considerable time, but is this the first time you have worked in this way with a government?
TBL: I have encouraged governments on various occasions to adopt standards and to use them for websites, but this is the first time within government. What happened is that at the beginning of 2009 I decided that this is going to be the year in which I ask people to put data on the web. I gave a talk at TED, including getting people to chant “get raw data now.”
TC: Part of the interest of the story seems to be that from this first encounter with the prime minister there was a degree of serendipity involved: the time proved incredibly ripe in government for this initiative to gather a huge amount of momentum over a very short time in government terms.
TBL: People had different points of view coming from different places, but the consistency of the encouragement that we’ve had has been very gratifying. And I suppose this is really my first experience with getting involved with policy so I don’t have very much to compare it with, which means that my impression is that if people really do want to do something, and they are excited about it, then they can. I think we have to be very careful about having a burst of apparent momentum, though, and a lot of talk; we have to realise that it is going to mean a lot of pushing of other people, staying late for a bit, putting extra effort in maybe to go to a seminar to learn how to do things with linked data. I think that within departments there is bound to be resistance from a few individuals, who are going to have to be coaxed around it.
One of the really important things of course is that we should do all this without changing the way people work. It needs a very concerted ongoing push at each level: managerial, within departments, the material level, and the grass roots. This stuff happens because people are dedicated and put a lot of time into doing it. And then it’s great to celebrate the results: things like these “hackathons” which we had a couple of, where people got together and made visualisations of the existing data. Those have been good because they give you a good sense of the data being made into something really interesting.
TC: Did you feel that people really grasped what you were talking about, intellectually, the principles and ideas?
TBL: Yes. Obviously from the point of view of the timing it connected to a concern about transparency in government, which is not just the UK, the US is also very concerned at the moment; some people connected it directly to that, and that may have helped, the sense that this was holding the government accountable. One of my fears is that people would see that as the only motivator and they wouldn’t see that we also have an enormously valuable resource.
For instance, somebody blogged on DirectGov, putting up some data which was just the grid references and years of bike accidents over three years. And that was on 10th March, I think, and later on that day somebody pointed out that it had been put up in Microsoft Excel, and said you shouldn’t do that, it’s a proprietary format, you should put it up as a comma-separated file which anybody can read, and by the way here it is turned into a CSV; and then someone says they’ve turned it into a KML file which can be used with a mapping application; and then the next day someone from the Times blog says I have done the mash-up, so here is a map you can go to and zoom all over the location and find your journey to work and see where all the bike accidents have been and maybe modify your journey to take another route. That was within 48 hours: the data had been turned from a pile of figures into a really valuable resource, which can save lives, which perhaps in the long term can help the public put pressure on the government to deal with black spots, and that is immediately useful to anyone getting on a bike.
Now imagine if a government department in any country had decided they were going to have a bicycle accident website. They would probably have spent a long time drawing up a requirements document, put it out to tender, and eventually gone for the lowest bidder, and after a certain amount of time the company would have come in, and then there would have been a review, and eventually the site would have been launched, and with luck it would have been useful; but in fact the message is that there are people out there who are prepared to put the effort in to turn data around before you have gone to the trouble of doing it yourself. It’s about seeing whether the mash-up-sphere, if you like, will do it for you. And that sphere will always win because they have access to data from different departments and non-government sites and all kinds of things. Somebody who is out there mashing up data sources, or someone in government doing that, is always going to produce things that go far beyond one single data set.
TC: Coming in from the outside, how did you find the internal working practices of government?
TBL: The people I met were generally very switched on, and I have been very impressed with the way that people in the Cabinet Office have made things happen and explained to me how things work. Yes, people tend to send around word processor files in emails, where at W3C everything is on the web. The British Library has I think one of the largest public wi-fi areas in the country, possibly in the world, but the government doesn’t have open wi-fi so one has to work around that; you can’t just open your laptop and be connected. But I wasn’t there to complain or worry about that, and of course there are an awful lot of industries in the world that still operate by sending around copies of a document via email.
TC: One of the key things seems to be the Ordnance Survey data. As I understand it, you went in thinking that OS would not be your prime focus, but it ended up becoming a key component. Was there a eureka moment with OS data?
TBL: My initial feeling was that OS had a complex history, that the whole set-up for OS as a trading organisation was defined, so that it was not something to deal with in the first instance. But so many people we came to, who deal with data of almost any variety, said that government is to do with the government of the country, the place, and almost everything you do is to do with some physical place on a map. We met so many people who were very constrained by OS data or had a governmental right to use the data but as a result couldn’t pass on the information they created to the public, so we were under a huge amount of pressure from a large number of people to do something.
TC: I know historically it has proved very difficult to open up the OS data. Did you find there was a lot of opposition to that?
TBL: There had been various attempts to review the problem, some of which had been very conservative and focused on small changes to the model, but there was one report that focused on the economic side and said very strongly that everything should just be made public.
TC: Rufus Pollock’s report?
TBL: That sounds like it. Basically, the ideas in the report are correct. But one of the problems is that the value to the individual citizen of having the data available, that return on investment, is very difficult to measure. How can you measure the value in your life of the web: how do you put a pound sticker on it? You can’t. You can try in some ways, you can say you would have wasted this much time going down to the library which I don’t now do, but the whole thing is wrapped up and you do things now that you didn’t used to do. But this is one of the problems.
One of the things that was very important to us was to preserve the OS. A lot of us remembered at school being taught to use an OS map: we grew up on our holidays using OS maps to avoid traffic jams, find beaches, walk through the hills; and there are a huge number of people in Britain who are very attached to the Ordnance Survey and who value it; they know the OS as the people who make their maps, and see it as a jewel in the crown of Britain’s information resources. A lot of it is sentimental attachment too, to particular maps and the particular way they are presented.
TC: What are you worried about? Are there some great obstacles remaining—and is it possible that a different government might not care for the agenda so much?
TBL: The whole openness of data thing is so non-partisan that I can’t see a different government really wanting to reel back the openness. I think that once people have seen it, too, it will be easier to see that nothing horrible has happened. The fears that people tend to have when they are managing a particular bit of data sitting at a computer are to say, well, I’m worried that people will misinterpret the data, I’m worried that people will use it for the wrong thing, or I’m worried that people will think it’s more accurate than it is. Those are the sorts of things that you hear, from the very large standard excuse set. But once the data is out there those sorts of excuses won’t be there any more, because people will say, well, the data is out there and it’s not really being abused, and people do understand that it’s not very clean data and that nobody is perfect, but they are very grateful to you for making it available.
The things that I am concerned about: we need to keep the momentum going, and ensure that people are following up with data sets. There is also a temptation to mail out a DVD and say, okay, here’s the data, but obviously the data is changing and being made open is just part of the cycle of the development of data; we have to grow to learn how best to do this.
TC: Is this an area that Britain could lead the world in, or are there other countries we should be looking to and trying to emulate?
TBL: America is also at it, of course, with its data.gov site.
TC: They seem to have less emphasis on making it highly usable for developers and third-party APIs.
TBL: That’s right. I think when it comes to the quality of data and making it usable, Britain is ahead. Of course it is early days. There is an awful lot of data out there: in both countries there has been a call for a list of the things that are out there, and I think just producing that list of what data to call for is quite difficult. We need to move to an ethos where if somebody in government creates a database then by default they will create a path to making that available and usable publicly. There are different ways in which the UK and US can learn from each other. There are also efforts going on elsewhere: Australia, New Zealand, Toronto, New York and the State of Massachusetts all have public data projects.
TC: Is this a movement that could change the way people think about politics and interact with political systems?
TBL: Yes, I think it will have a big effect: the accountability of government and transparency will have a very healthy effect on the way that government is run.
I felt initially that we clearly needed to do this with the most developed countries, who understand about putting stuff on the web. But people are also pushing the idea of this in developing countries, because that’s where government and data transparency is needed, and you really need to establish trust in the government in order to justify investment from outside, for example. When I was recently in Uganda, talking to ministers and the prime minister there, I took the opportunity to mention the openness of data in Uganda, so it may be that some of the most important effects that you find early on actually come from developing countries.
TC: Do you see yourself having a long-term involvement with the government, or with governments, on this?
TBL: This has been a project of a certain length of time. I hope that the momentum that it has got will be self-driving, I hope this will take off exponentially, and that in future years I will be able to push other sorts of things. What should it be in 2010: should it be the year of scientific data, social networking data? There’s a lot of ways in which we have to go in how we use the web, and they all connect together. But putting government data on the web has been a very exciting journey. We have to keep pushing, though. Constant vigilance.
TC: And what about your personal motivation. Obviously you’re very driven: it would be quite possible for you to sit back if you wanted to.
TBL: It is very exciting, clearly, to make things that work and that allow computers to do things that help us. The whole new field of web science, learning about how the web as a very large system and how humanity connected by technology should evolve, has a lot of excitement. But there is also a certain amount of duty. The web is this big system which we did actually make, this artificial system created by the people who sit down and write protocol and machine specifications. The way computers interact over the web is defined by a system we invented and can change. We have a duty to make sure that, at the same time as we are putting data on the web, we look more broadly to make sure the web does serve humanity, and think about the 75 to 80 per cent of the world who don’t use it at all at the moment. As always, too, I’m motivated by meeting people who are also very fired up: people like Andrew Stott, who are excited about doing something that is going to be good and effective.
TC: What most concerns you, and most excites you, about the future of the web?
TBL: Most of my concerns are to do with the web being controlled by one party or one group, whether government or a large company that has got excessively powerful and is able to control what one sees or know what one does; if a government decides it is going to control or limit what people do, or spy on them. Those are the main fears.
Tim Berners-Lee studied physics at Oxford, before working in telecoms and software engineering. In 1989, while working as a fellow at the CERN research centre in Switzerland, he published the academic paper that defined what would come to be known as the world wide web. In 1994, he founded the World Wide Web Consortium, a group devoted to keeping the seething mass of pages he helped to create working together. In June 2009, alongside Nigel Shadbolt, he was appointed as an information advisor to the British government.