33 boroughs, two babies and one city modelling platform

We’ve been hard at work since our last update – in that time we’ve delivered our first release of Witan, and two babies! Our city modelling platform is now live and our first users in London have been given access this week.

On-demand demographic projections to meet housing demand

Witan is now able to generate demographic projections linked to demand for housing in London. This means that officers in the 33 London borough councils can enter housing data projections and see how this affects the spread of the population across wards in the borough, for years up to 2041.

These numbers are useful to the boroughs for planning in many ways – one important example is carrying out school roll projections. Previously, this process took weeks of specialist GLA staff time, and would typically happen once a year. Now we’ve automated it, allowing boroughs to run their own projections at any time they like, as often as they need to, creating new projections in under five minutes.

Excellent & functional, but not sexy

We’ve opened up Witan to the boroughs in London for testing, and as feedback comes in we will be using this to improve the user experience. Once borough officers experience the freedom of generating their own projections, we hope that the true potential of the platform will start to be realised – such as the benefits of moving beyond the limitations of Excel.

Our favourite description so far is from Andrew Collinge, Assistant Director of the GLA Intelligence Unit, that Witan is ‘excellent & functional, but not sexy’. We’re happy that we are building something with substance over style!

However, in the next phase we’ll start putting bells and whistles onto the platform, including visualisations, group sharing, a data API and a metadata API. We’ll also be building in our next set of models, which feature economic projections around jobs and commuters.

Would this work in your city?

We’re looking for pilot cities and local authorities to test Witan outside of London. If you are interested, please sign up to the Witan public mailing list on the landing page at witan.mastodonc.com, or contact us at witan@mastodonc.com.

Mastodon C is a small cloud computing company working closely with the GLA and funded by Innovate UK’s SBRI programme to develop a city-modelling platform. One of our aims is to open up government processes where possible and appropriate, so we are working in the open as much as possible.

To explore Witan and answer any queries you may have, please visit the Witan guide, check out the project, UI and app code (and star the GitHub repos if you really like them), or email us at witan@mastodonc.com.

We’d also really appreciate you sharing this blog with others you think might be interested in what we are doing.

Extracting Met Office weather data with Clojure

Having lived in York for the majority of my life, I’ve got used to flooding. With the number of floods increasing, and the probability of a flood causing long-lasting damage also increasing, it seems a good idea to start peeking at past data and seeing if there’s anything jumping out at us.

Getting Historical Met Office Data

The Met Office provided an open data set in 2011 on the data.gov.uk portal. There are a number of different feed types, so it’s certainly worth taking the time to have a look and see which best fits your needs. In this walkthrough I will concentrate on hourly observation data.

Historical data comes by way of a form which generates a file from the data archive and sends back the data as a CSV file.


The form can only deal with one date at a time, which is fine if you want one day but is rather a pain if you want to get data between two dates. With some Unix scripting (using curl, for example) you can easily pull the data in, but there is a catch: when you submit the search form, the data is pulled and you are then forwarded to another URL with a unique id. So unless you’re comfortable handling redirects in your scripts, it’s going to be difficult to get the data out. Well, it was difficult, but now I have good news for you.

A Clojure Alternative?

The Mastodon C team created an open source project called kixi.hecuba.weather, whose primary purpose is to pull historical weather data and send it up to our Hecuba platform. It’s fully open source so even if you’re not using Hecuba you can still use some of the component parts of k.h.w. to pull the Met Office data for your own needs.

The project is hosted on GitHub and anyone can use it for grabbing Met Office historical weather data.

Clone The Project

First of all you need to clone the project. Run the following git command from your terminal or command prompt.

git clone git@github.com:MastodonC/kixi.hecuba.weather.git

You’ll see the repository download to your machine.

dev:mctemp jasonbell$ git clone git@github.com:MastodonC/kixi.hecuba.weather.git
Cloning into 'kixi.hecuba.weather'...
remote: Counting objects: 209, done.
remote: Total 209 (delta 0), reused 0 (delta 0), pack-reused 209
Receiving objects: 100% (209/209), 39.05 KiB | 0 bytes/s, done.
Resolving deltas: 100% (68/68), done.
Checking connectivity... done.
dev:mctemp jasonbell$

With that done you can start the REPL and run the functions to pull down the data.

Starting The REPL

I’m going to compile the k.h.w. project first and then start the REPL from the command line:

dev:mctemp jasonbell$ lein compile
dev:mctemp jasonbell$ lein repl

Give it a few moments and you will see the output as the REPL starts, followed by the prompt. Once you have the prompt you can start working.

nREPL server started on port 55453 on host - nrepl://
REPL-y 0.3.7, nREPL 0.2.10
Clojure 1.7.0
Java HotSpot(TM) 64-Bit Server VM 1.8.0_60-b27
    Docs: (doc function-name-here)
          (find-doc "part-of-name-here")
  Source: (source function-name-here)
 Javadoc: (javadoc java-object-or-class-here)
    Exit: Control+D or (exit) or (quit)
 Results: Stored in vars *1, *2, *3, an exception in *e

The namespace where the Met Office functions live is called “kixi.hecuba.weather.metoffice-api”, so you will need to change namespace first.

kixi.hecuba.weather.core=> (ns kixi.hecuba.weather.metoffice-api)

Now we can turn our attention to retrieving the Met Office data.

Retrieving The Data

The run-data-pull function takes three parameters: a start date, an end date, and a path to save the files to. Dates are entered in dd/mm/yyyy format. So, for example, to pull historical data from the 1st of January 2013 to the 1st of February 2013 I would run the following:

kixi.hecuba.weather.metoffice-api=> (run-data-pull "01/01/2013" "01/02/2013" "/Users/jasonbell/Documents/work/projects/mctemp/")

The function will call the API for each hourly observation and save the data to the directory specified.

-rw-r--r-- 1 jasonbell staff 21427 7 Dec 13:18 01-01-2013-0000.csv
-rw-r--r-- 1 jasonbell staff 20759 7 Dec 13:18 01-01-2013-0100.csv
-rw-r--r-- 1 jasonbell staff 20690 7 Dec 13:18 01-01-2013-0200.csv
-rw-r--r-- 1 jasonbell staff 20755 7 Dec 13:18 01-01-2013-0300.csv
-rw-r--r-- 1 jasonbell staff 20734 7 Dec 13:18 01-01-2013-0400.csv
-rw-r--r-- 1 jasonbell staff 20809 7 Dec 13:18 01-01-2013-0500.csv
-rw-r--r-- 1 jasonbell staff 21328 7 Dec 13:18 01-01-2013-0600.csv
-rw-r--r-- 1 jasonbell staff 21234 7 Dec 13:18 01-01-2013-0700.csv
-rw-r--r-- 1 jasonbell staff 21312 7 Dec 13:18 01-01-2013-0800.csv
-rw-r--r-- 1 jasonbell staff 20789 7 Dec 13:18 01-01-2013-0900.csv
-rw-r--r-- 1 jasonbell staff 20673 7 Dec 13:18 01-01-2013-1000.csv
-rw-r--r-- 1 jasonbell staff 20825 7 Dec 13:18 01-01-2013-1100.csv
-rw-r--r-- 1 jasonbell staff 20812 7 Dec 13:18 01-01-2013-1200.csv
-rw-r--r-- 1 jasonbell staff 21003 7 Dec 13:18 01-01-2013-1300.csv
-rw-r--r-- 1 jasonbell staff 20888 7 Dec 13:18 01-01-2013-1400.csv

The make-up of the CSV files is documented by the Met Office; here’s a quick look at one of the files for your reference.

Site Code,Site Name,Latitude,Longitude,Region,Observation Time,Observation Date,Wind Direction,Wind Speed,Wind Gust,Visibility,Screen Temperature,Pressure,Pressure Tendency, Significant Weather
"3005","LERWICK (S. SCREEN) (3005)","60.1390","-1.1830","Orkney & Shetland","12:00","2013-01-01","WNW","10","","14000","3.80","989","R","Light rain shower (Day)",
"3031","LOCH GLACARNOCH SAWS (3031)","57.7250","-4.8960","Highland & Eilean Siar","12:00","2013-01-01","WNW","16","29","18000","3.70","997","R","Heavy Rain",
"3041","AONACH MOR (3041)","56.8200","-4.9700","Highland & Eilean Siar","12:00","2013-01-01","W","17","32","","-1.20","","#","N/A",
"3063","AVIEMORE (3063)","57.2060","-3.8270","Highland & Eilean Siar","12:00","2013-01-01","WSW","3","","16000","3.50","997","R","Light rain shower (Day)",
"3066","KINLOSS (3066)","57.6494","-3.5606","Grampian","12:00","2013-01-01","WSW","17","","40000","5.20","996","R","(Black) Low-level cloud",


The kixi.hecuba.weather project provides you with a simple way to retrieve the data published by the Met Office. The ability to pull hourly data across a number of days, weeks or months into CSV files makes it perfect for placing in Hadoop’s distributed file system (HDFS), enabling further analysis with Hadoop or Spark, for example.
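As a small taste of that further analysis, here’s a minimal sketch of reading one of the downloaded files back into Clojure. It isn’t part of kixi.hecuba.weather – it assumes you’ve added the org.clojure/data.csv library to your project – and it simply pulls out the site name and screen temperature from each row:

(require '[clojure.data.csv :as csv]
         '[clojure.java.io :as io])

(defn screen-temperatures
  "Returns [site-name screen-temperature] pairs from one observation CSV."
  [path]
  (with-open [reader (io/reader path)]
    (->> (csv/read-csv reader)
         rest                        ; drop the header row
         (mapv (fn [row]
                 [(nth row 1)        ; Site Name column
                  (nth row 11)]))))) ; Screen Temperature column

(screen-temperatures "01-01-2013-1200.csv")
;; => [["LERWICK (S. SCREEN) (3005)" "3.80"] ...]

Note the mapv rather than map: it forces the whole (lazy) sequence to be realised before with-open closes the file.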

The Mastodon in the room: Why Clojure?

Mastodon C advertises itself as a ‘Clojure shop’ when talking to clients and partners, but let’s delve a bit deeper into what this means and why we believe it’s a key advantage for us. Ultimately, it comes down to a simple idea:

“Simplicity is hard work. But, there’s a huge payoff. The person who has a genuinely simpler system – a system made out of genuinely simple parts – is going to be able to affect the greatest change with the least work. He’s going to kick your ass. He’s gonna spend more time simplifying things up front and in the long haul he’s gonna wipe the plate with you because he’ll have that ability to change things when you’re struggling to push mastodons around.”
– Rich Hickey, creator of the Clojure programming language (edited for effect)


Clojure is a programming language that promotes simplicity, whilst simultaneously being incredibly powerful and broad. It hails from a family of languages known as Lisps (LISt Processors), which are famous for their distinctive, fully parenthesised prefix notation. It looks a bit like this…

(require '[clojure.string :as string])

(defn say-hello [name]
  (->> name
       string/capitalize
       (println "Hello")))

#=> (say-hello "archibald")
Hello Archibald

If you’ve ever seen code from other languages, such as JavaScript or Python, you’ll notice immediately that this is aesthetically very different. However, like any programming language, once you’ve learnt the nuances and the idioms, writing code that is concise yet expressive becomes quite straightforward. Clojure is especially geared toward a style of programming known as declarative, whereby a programmer, rather than solving problems imperatively (“do this, then this”), is more descriptive (“change this data to look like this”), allowing the underlying language and platform (the JVM in Clojure’s case) to do the heavy lifting. The result is often much less actual code – and the less code there is, the less there is to go wrong!
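As a quick illustrative flavour of that declarative style (a toy example, not from any of our projects): rather than writing a loop that accumulates a total, we describe the whole transformation as a pipeline of simple steps.

(->> (range 10)     ; the numbers 0 to 9
     (filter even?) ; keep only the even ones
     (map #(* % %)) ; square each of them
     (reduce +))    ; add them all up
;; => 120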

Clojure is predominantly used as a ‘server’ language, which means that it’s usually the workhorse at the bottom of a technology stack, doing all of the business logic, logging, management and logistics. This suits us quite nicely, as this is where most of the Mastodon C magic tends to happen – reading in data, crunching lots of numbers, managing distributed computations across the cloud. Having all of our solutions written in Clojure means that we significantly reduce the effect of “silo-ing” people into specific projects. One person can easily move between jobs, and this is something we actually mandate as part of our support process.

Pretty things

Recently there has been a big movement to bring Clojure to more platforms. Where once it was almost exclusively found on big, beefy servers, Clojure is now also available in your browser via ClojureScript – a near-perfect reproduction of the Clojure language which transpiles to JavaScript. This opens up a vast number of opportunities for development across browser, mobile and desktop, as well as allowing seamless integration with lots of powerful and popular JavaScript libraries – such as React, jQuery, D3 and many more. It’s completely possible now to write entire client-side web applications using ClojureScript – and many have.
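To give a flavour, here’s a tiny hypothetical browser-side sketch – it assumes a page containing an element with the id "app" – showing the same Clojure syntax talking directly to the DOM via JavaScript interop:

(ns hello.core)

;; Write a greeting into the page – plain DOM interop, no libraries needed.
(defn greet! [name]
  (set! (.-textContent (.getElementById js/document "app"))
        (str "Hello, " name "!")))

(greet! "London")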

Considering Mastodon C is already bursting with experienced Clojure developers, using ClojureScript for our frontend work is a total no-brainer, and of course it comes with the added benefit that – once again – no one is left behind when it comes to rolling up their sleeves and contributing to the code. There is no “server team” and “frontend team” as is so typical at most polyglot organisations – everyone does everything.

Once more, with feeling

It’s important to note that although our preference lies firmly with Clojure and ClojureScript, we aren’t scared of cutting our teeth on other languages, especially if they are designed for and excel at a specific task – R is a good example of this. However, the benefits of working in a predominantly Clojure organisation are paying dividends for us.


We’re on the Whitehall steering group of “digital and data visionaries”

It was announced yesterday that Fran, our CEO, will be on the new steering group of “digital and data visionaries” helping the UK government become more and more data-driven. We’re obviously very excited about this – both because it’s an honour to be asked to participate, and because we’re genuinely keen to put data to work in a way that really affects our lives for the better.

The full list of members is:

  • Sir Nigel Shadbolt from the ODI
  • Mustafa Suleyman from Google DeepMind
  • Fran Bennett from Mastodon C
  • Xavier Rolet from the London Stock Exchange
  • Mark Thompson from Judge Business School
  • Dame Fiona Caldicott, former Chair of the National Information Governance Board for Health and Social Care

City modelling MVP – develop and share demographic projections

We’re getting very close to launching the minimum viable product (MVP) version of Witan, the city modelling platform we’ve been working on since this summer.

The MVP will be focussed on running and evolving demographic models. A picture of how many people, of what type, will be where in a city underpins all sorts of other critical services, so this is a good place to start in creating a data-driven, legible model of the city. Plus, it’s helping us to develop a really great user experience for the key parts of city modelling:

  • Working with input data from many places – in this case, 33 London Boroughs, providing various housing scenarios as inputs
  • Forecasting based on variable assumptions, and making those assumptions transparent – for example, how high migration is expected to be in future years
  • The ability to keep some scenarios and data private, but also to share your scenarios with colleagues and keep track of how they develop over time as policy or knowledge evolves

The interface is pretty clean and simple at the moment. We’re happy with how it’s coming together, and excited about having it live before Christmas – sneak preview below.


The team has also been busy on personal projects – while building Witan v1, we’ve also manufactured one and three-quarters babies between us.


If you’d like to keep up with Witan progress, request a demo, or request more very cute baby photos, please do get in touch.

How we’re building a new city platform

As we mentioned earlier on the blog, we’re working on a very big and ambitious project to create a new city modelling platform, Witan, with London as the first test bed.

Since there is so much that could be done in this field, and so many possible ways to do it, there are some tough choices for us to make, to ensure that the platform works and is widely adopted, both in London and in other world cities.

Oh, if only.

We wish that new product development were predictable – a straight line from seeing a user need, to building a tool which meets that need, then selling it to lots of other similar users. Unfortunately, it’s rarely that simple.


There’s no easy way to make that path predictable – but we can do things which reduce the overall risk, and this is a big part of our lives at the moment. This post outlines how we’re tackling the challenge for Witan, drawing some inspiration from the stellar resources shared by the GDS with their user research method wiki, how companies like Spotify build their products, and good agile (with a small A) practices.


Understanding the problem first

If we start by casting our net wide early on, we can improve our understanding of the problems our users have, and end up with a larger set of user needs.

We can’t serve every need, so we validate and prioritise which needs we want to meet with further research. At this point we focus on the problem, not on how we’d solve it.

Choosing the smallest way to meet a need

When we have a smaller set of needs we’re aiming to meet, we diverge again to come up with solutions to the problems we’ve decided to target. We’d typically describe these as user stories.

Checking we’ve actually solved it

We choose the smallest subset of user stories possible to deliver to users, build a solution which meets those needs, and see how well it works for the users in reality. When we’re building those solutions, we use multi-skilled teams because we want to minimise handoffs between experts, and we get our team to work directly with users in order to create a sense of empathy and make sure that the technology really meets their needs.

And repeat…

Inside Mastodon C, we’ve been working in weekly iterations for a while: each week we demo what we’ve made, and then we plan what we’ll do the following week. One key thing we’re trying now is using the mental model above to decide which activities in our ‘toolbox’ are most appropriate for that week.

Early on – when our focus is discovery

Early on in the project, most of our time is spent generating ideas and exploring them, so that we have a set of needs to validate; most of our activities sit on the left-hand side of our ‘toolbox’.


As we learn more, we’ll move more of our efforts to activities in the middle, where we’re validating our understanding of users’ needs and seeing how they currently try to solve their problems themselves. It’s not uncommon for some of our team to focus on activities on the left after discovering something new, while the rest focus on the middle, on problems that we have a better understanding of. Again, we share what we learn each week in a show-and-tell-style session (but with added cake).

Testing out a solution to a problem we feel we understand

Later on, when we’re confident we understand enough of the problem to build a solution to it, we’re likely to be spending most of our time on activities on the right.

That is, until we discover something interesting enough to warrant spending some time using tools from the left hand side of our toolbox again. In some cases, we might discover a killer new feature that makes sense to add to what we have built already.

In others, we might discover the beginnings of a new product that is interesting, but not key to meeting the needs of the users we’re designing for right now. We’ll save it for later to see if it comes up again, at which point we’ll make a call about whether we should investigate further.

Well, that’s the plan at least

We’re working on one of the most complex problems we’ve ever come across, in one of the greatest cities on earth – if you’d like to see how we get on, watch this space.

Witan – the flexible city modelling platform

As Fran mentioned in her previous blog post, Mastodon C is working together with the Greater London Authority to develop a flexible approach to city modelling. The aim is to take forecasting beyond the limitations of Excel, while providing modellers with the benefits of sophisticated data management tools, such as version control, security levels and scaling to very big or complex datasets.

We are now working on the first features of this platform together with the GLA demography team. Population projections are vital to London planning and provide a basis for the London Plan, the Mayor’s strategic plan for London up until 2036. London population projections are also used by multiple departments within the GLA and underlie many other models; including these models early on means that the Witan platform can be beneficial throughout the organisation.

Another feature we are implementing is an interface for London boroughs to input data and run population projection scenarios themselves, without having to rely on GLA modelling resources. The platform thus provides boroughs with a powerful tool to take charge of their own, local, modelling requirements.

Going forward, this interface can be used for other city modelling purposes and be expanded to other stakeholders beyond the London boroughs, providing the ability to run different models and access a host of datasets, both open and private, to play out a range of scenarios.

We will be implementing these first features of Witan in the autumn, and hope to provide updates on the platform build as we progress. The platform will be built using open-source software, and interested parties will be able to access the platform code through our GitHub account.

By the end of the year, the demographic forecasting modules of the platform will be in public beta, and we’d love to start working with more interested users – please do get in touch at theteam@mastodonc.com if this sounds like it would be useful for you!