This weekend we had the second annual Toronto OpenStreetMap developer weekend. The nice folks at the Ryerson Department of Geography hosted us. My focus this weekend was to work with Serge and Martijn on MapRoulette.
MapRoulette is software that presents users with an easy-to-do mapping task, which they can complete and then mark as done. Examples of past MapRoulette mapping challenges include fixing connectivity errors and fixing objects touched by the license change.
Serge and Martijn wanted to redesign parts of MapRoulette to make it easier to develop more challenge modules. I am going to discuss some of the design decisions we debated during the weekend.
A challenge is a type of mapping activity that MapRoulette can assign to users. Examples of challenges might be ‘Add bridges to roads where we suspect a bridge exists’ or ‘Add the street address tags to restaurants that don’t have addresses’.
A task is a specific work item within a challenge. In the ‘Add bridges to roads’ challenge a task might be to add a bridge at a specific intersection (or determine whether a bridge should be there).
MapRoulette needs to provide a web-based interface that displays challenge tasks to users (task assignments). This requires us to somehow store information about the tasks that make up a challenge. Challenge-specific modules will compute tasks (i.e. find missing bridges). Figuring out how the web interface and the programs that find challenge tasks should be connected was one of the questions we discussed this weekend.
Proposal 1: Discrete Modules
Each challenge has a challenge module that supports operations such as
- Give me some available tasks near this lat/lon
- Mark this task as assigned to a user
- Mark this task as complete/skipped
Each challenge module would be responsible for storing task data in its own
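To make the three operations above concrete, here is a minimal sketch of what a per-challenge module interface could look like. Everything in it is an assumption for illustration: the class names (ChallengeModule, InMemoryBridgeChallenge), the method names, the status strings and the crude distance calculation are made up here and are not the MapRoulette API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Task:
    task_id: str
    lat: float
    lon: float
    status: str = "available"  # available | assigned | complete | skipped
    assigned_to: Optional[str] = None


class ChallengeModule(ABC):
    """Hypothetical interface that every challenge module would implement."""

    @abstractmethod
    def tasks_near(self, lat: float, lon: float, limit: int = 5) -> List[Task]:
        """Give me some available tasks near this lat/lon."""

    @abstractmethod
    def assign(self, task_id: str, user: str) -> None:
        """Mark this task as assigned to a user."""

    @abstractmethod
    def finish(self, task_id: str, status: str) -> None:
        """Mark this task as complete or skipped."""


class InMemoryBridgeChallenge(ChallengeModule):
    """Toy module that keeps tasks in a dict; a real module picks its own storage."""

    def __init__(self, tasks: List[Task]) -> None:
        self._tasks: Dict[str, Task] = {t.task_id: t for t in tasks}

    def tasks_near(self, lat: float, lon: float, limit: int = 5) -> List[Task]:
        available = [t for t in self._tasks.values() if t.status == "available"]
        # Squared distance in degrees is crude, but good enough for a sketch.
        available.sort(key=lambda t: (t.lat - lat) ** 2 + (t.lon - lon) ** 2)
        return available[:limit]

    def assign(self, task_id: str, user: str) -> None:
        task = self._tasks[task_id]
        task.status = "assigned"
        task.assigned_to = user

    def finish(self, task_id: str, status: str) -> None:
        if status not in ("complete", "skipped"):
            raise ValueError("status must be 'complete' or 'skipped'")
        self._tasks[task_id].status = status


# Example: two made-up tasks, one in Toronto and one in Ottawa.
challenge = InMemoryBridgeChallenge(
    [Task("a", 43.65, -79.38), Task("b", 45.42, -75.70)]
)
nearest = challenge.tasks_near(43.66, -79.38, limit=1)[0]
challenge.assign(nearest.task_id, "some_mapper")
challenge.finish(nearest.task_id, "complete")
```

The web interface would only ever talk to the three methods, so each module stays free to store and compute its tasks however it likes.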
Proposal 2: Shared Task Representation
Challenge modules compute a list of tasks by examining a mixture of OSM and external data sources. The challenge modules then convert these tasks into a common representation made up of a task and an associated set of task_objects. A task to add a missing bridge might be made up of three objects: one for the place we think the bridge belongs, one for the road the bridge is part of, and a third for the road or stream that the bridge crosses over. These tasks and associated objects get stored in a database. Tasks and associated objects for different challenges get stored in the same pair of database tables. The web application that users interact with can then display and manipulate information about tasks from all challenges.
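A rough sketch of the shared pair of tables, using SQLite so it can be run as-is. The column names, the role values and the OSM way references are placeholders I made up for illustration; we did not settle on an actual schema in this post.

```python
import sqlite3

# Hypothetical schema: one row per task, plus one row per object the task refers to.
SCHEMA = """
CREATE TABLE tasks (
    id        INTEGER PRIMARY KEY,
    challenge TEXT NOT NULL,
    status    TEXT NOT NULL DEFAULT 'available'
);
CREATE TABLE task_objects (
    id      INTEGER PRIMARY KEY,
    task_id INTEGER NOT NULL REFERENCES tasks(id),
    role    TEXT NOT NULL,   -- what part this object plays in the task
    osm_ref TEXT,            -- e.g. 'way/1001' (placeholder id)
    lat     REAL,
    lon     REAL
);
"""


def add_task(conn, challenge, objects):
    """Store one task and its objects; return the new task id."""
    cur = conn.execute("INSERT INTO tasks (challenge) VALUES (?)", (challenge,))
    task_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO task_objects (task_id, role, osm_ref, lat, lon) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            (task_id, o["role"], o.get("osm_ref"), o.get("lat"), o.get("lon"))
            for o in objects
        ],
    )
    return task_id


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# The missing-bridge task from the text: three objects, made-up coordinates and ids.
bridge_task = add_task(
    conn,
    "missing-bridges",
    [
        {"role": "bridge_location", "lat": 43.6577, "lon": -79.3788},
        {"role": "road", "osm_ref": "way/1001"},
        {"role": "crossed_feature", "osm_ref": "way/1002"},
    ],
)

roles = [
    row[0]
    for row in conn.execute(
        "SELECT role FROM task_objects WHERE task_id = ? ORDER BY id",
        (bridge_task,),
    )
]
```

Because every challenge writes into the same two tables, the web application can list, assign and close tasks without knowing which challenge produced them.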
Contrasting the options
When designing software systems you often have two or more different approaches to a given problem. Deciding which option to proceed with is the challenge software designers have to face. Some of the questions I asked myself this weekend included:
- How can code between different challenges be shared or re-used in each of the options? The second approach sounds like it has more places where code can be re-used because all challenges use the same data model for assigning tasks.
- Is one approach more flexible than the other? Are there things different challenges might deal with differently that will be hard to accommodate with one of the designs? Our concern with the second proposal was that we were generalizing all challenges to object + geometry and not much else. What if some challenges needed to display extra things such as textual hints?
- How do the proposals make it easier or harder for someone to come along and develop a new challenge? In the first approach a challenge developer has free rein to implement things in any fashion they like as long as they implement the task assignment API described above. In the second proposal challenge authors can still pick their programming language and methods for detecting challenge tasks but they must use the database schema we provide for storing information about tasks. With the second approach challenge developers only need to worry about writing programs to find challenge tasks and store them in the database. Task assignment/management is handled by the common code.
A third proposal
We were still debating the merits of these different approaches late on Sunday morning when we decided to take a break and go for lunch. One would think that a group of mappers would know lots of places nearby for lunch in downtown Toronto but we seemed to be wandering aimlessly back and forth down the street. While we were doing this we came up with the idea of a shared database schema approach like proposal 2, but letting each challenge store a blob of JSON with the task. Each challenge would also implement challenge-specific rendering/display code that would know how to take the JSON describing the task and display information about the task to the user in a meaningful way.
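Here is a small sketch of how that third idea might fit together: a shared tasks table with a JSON column, plus a per-challenge rendering function. The table layout, the renderer registry, the challenge name and the sample task are all assumptions made up for this example, not what we actually built.

```python
import json
import sqlite3
from typing import Callable, Dict

conn = sqlite3.connect(":memory:")
# Shared table: the common code only understands challenge and status;
# 'details' is an opaque JSON blob owned by the challenge.
conn.execute(
    "CREATE TABLE tasks ("
    " id INTEGER PRIMARY KEY,"
    " challenge TEXT NOT NULL,"
    " status TEXT NOT NULL DEFAULT 'available',"
    " details TEXT NOT NULL)"
)

# Hypothetical registry mapping a challenge name to its display code.
RENDERERS: Dict[str, Callable[[dict], str]] = {}


def renderer(challenge: str):
    """Decorator that registers the display function for one challenge."""
    def register(func: Callable[[dict], str]) -> Callable[[dict], str]:
        RENDERERS[challenge] = func
        return func
    return register


@renderer("missing-addresses")
def render_missing_address(details: dict) -> str:
    # Only this challenge knows which keys its JSON blob contains.
    return f"Add address tags to {details['name']}. Hint: {details['hint']}"


def store_task(challenge: str, details: dict) -> int:
    cur = conn.execute(
        "INSERT INTO tasks (challenge, details) VALUES (?, ?)",
        (challenge, json.dumps(details)),
    )
    return cur.lastrowid


def display_task(task_id: int) -> str:
    challenge, blob = conn.execute(
        "SELECT challenge, details FROM tasks WHERE id = ?", (task_id,)
    ).fetchone()
    return RENDERERS[challenge](json.loads(blob))


task_id = store_task(
    "missing-addresses",
    {"name": "Example Diner", "hint": "check the storefront sign"},
)
message = display_task(task_id)
```

This keeps task assignment in the common code while still leaving room for things like textual hints that don't fit an object + geometry model.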
I want to thank Richard Weait for organizing this weekend and Serge and Martijn for letting me participate in the design process. Next we need to finish implementing what we discussed. This should help us figure out if our ideas were good ideas.