Heavy Lifting in Location
December 2, 2010
Location Labs works closely with application developers integrating location into their web and mobile services. Understanding and addressing their major pain points around location is our core business. So, what are the most important areas where help is needed, or in other words, what is the heavy lifting Location Labs does for its app developer community? I look at it this way:
1. Access to Location
For many developers, on the face of it, this sounds like a fairly straightforward question: if I'm building a mobile app, I simply get location through native APIs on the device. True enough, and if this fits your application you can safely skip ahead to the next section. However, for many applications that leverage location (or wish to), this is not enough.
For example, you may need to locate your user remotely, independent of a mobile application running on their device. (Consider: you offer 24-hour roadside assistance through a 1-800 service. Wouldn't it be great to locate the user from the network?) Network access to location requires a close relationship with the underlying network operator. We are in production with the top four in the US: Verizon, AT&T, Sprint and T-Mobile. We enable developers to access this through the Universal Location Service.
As a second example, users may access your service through a mobile application, but you may need the app to run in the background, 24/7, monitoring location. Sure, you could write this yourself, but there are a number of non-trivial aspects to building it that may not be where you want to spend your precious R&D dollars. You need a solution that runs in the background around the clock and is resilient to network outages, flaky GPS support, and the various conditions that cause the app to reset, all while managing power consumption efficiently. This is no easy task, which is why even the OS providers themselves (such as Apple with iOS and Google with Android) don't get this right.
2. Geofencing and Other Dynamic Spatial Processing
Geofencing is rapidly becoming an important aspect of location-based marketing strategies. Advertisers are finding high value in messaging customers in response to their current location (as well as their recent location history). Translating location streams (as detailed in the previous section) into these kinds of spatial triggers is an interesting technical challenge.
A general-purpose solution to this problem looks very much like a spatial (or spatio-temporal) expert system. That is, you combine a passive location data feed with a set of rules for when triggers should fire. Location Labs approaches this problem by considering the primary use cases first, such as geofencing and user proximity. These account for far and away the majority of interest in this space. We have also developed a more general-purpose solution that provides a type of "plug-in" architecture for adding new and different spatial trigger algorithms that access location data in a well-defined way. More information on Location Labs geofencing can be found here.
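To make the "location feed plus rules" idea concrete, here is a minimal sketch of an enter/exit geofence trigger. It is not Location Labs' implementation; the names `Geofence`, `GeofenceMonitor` and `update` are hypothetical, fences are assumed to be simple circles, and distances use the haversine formula on a spherical Earth.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


@dataclass(frozen=True)
class Geofence:
    """A circular fence (hypothetical rule type for this sketch)."""
    name: str
    lat: float
    lon: float
    radius_m: float

    def contains(self, lat, lon):
        return haversine_m(self.lat, self.lon, lat, lon) <= self.radius_m


class GeofenceMonitor:
    """Turns a stream of location fixes into enter/exit events."""

    def __init__(self, fences):
        self.fences = list(fences)
        self._inside = set()  # (user_id, fence name) pairs currently inside

    def update(self, user_id, lat, lon):
        """Feed one location fix; return the events it triggers."""
        events = []
        for fence in self.fences:
            key = (user_id, fence.name)
            now_inside = fence.contains(lat, lon)
            was_inside = key in self._inside
            if now_inside and not was_inside:
                self._inside.add(key)
                events.append(("enter", user_id, fence.name))
            elif was_inside and not now_inside:
                self._inside.discard(key)
                events.append(("exit", user_id, fence.name))
        return events
```

Each call to `update` is one fix from the passive feed. A trigger fires only when the user's state relative to a fence changes, so repeated fixes inside the same fence do not re-fire. A production system would also need to handle noisy fixes near the boundary, which this sketch ignores.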
3. Spatial Storage and Spatial Indexing
Developers often ask about location storage and how this problem should be approached. Like most interesting engineering questions, the answer is: "it depends". Spatial storage in an RDBMS setting with supporting spatial indexing (including native spatial access methods) is a mature technology, and in most cases commodity solutions suffice. There are a few exceptions worth mentioning…
First, if the data is truly big enough, a commodity RDBMS (such as PostgreSQL with PostGIS) may not suffice. If performance requirements dictate that the data needs to be in memory (even with native spatial indexing, such as an R-Tree or Quad-Tree), you have literally billions of records that live in a single index, and hand sharding (that is, manually partitioning the data across multiple hosts because sharding is not automated) is not an option, then a custom solution may be required. To be clear, there are a lot of "ifs" here, and in our experience working with developers this situation, although not completely nonexistent, is extremely rare. In fact, there are probably only a handful of common applications out there that approach this kind of scale.
To give you a better idea of how truly rare this case is, consider a weather application that tracks the location of its users in order to inform them of weather emergencies. In this case you will want to store the latest location of your users and perform a range search against these locations in response to a weather incident. Even the largest weather applications today have on the order of 10 million subscribers. Caching these in memory on a single host with native spatial indexing is commodity.
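As a rough sanity check, the raw coordinates for 10 million users, at two 8-byte floats each, come to about 160 MB before any indexing overhead. The sketch below shows the shape of the weather use case: keep each user's latest location in memory and answer a radius query. It uses a simple fixed grid rather than an R-Tree or Quad-Tree to stay short; the class name `LatestLocationIndex`, its methods, and the 0.1 degree cell size are assumptions for illustration only.

```python
import math
from collections import defaultdict

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


class LatestLocationIndex:
    """In-memory grid index over each user's most recent location."""

    def __init__(self, cell_deg=0.1):
        self.cell_deg = cell_deg
        self._cells = defaultdict(set)  # (row, col) -> user ids in that cell
        self._latest = {}               # user id -> (lat, lon)

    def _cell(self, lat, lon):
        return (math.floor(lat / self.cell_deg), math.floor(lon / self.cell_deg))

    def update(self, user_id, lat, lon):
        """Record a new fix, replacing the user's previous location."""
        old = self._latest.get(user_id)
        if old is not None:
            self._cells[self._cell(*old)].discard(user_id)
        self._latest[user_id] = (lat, lon)
        self._cells[self._cell(lat, lon)].add(user_id)

    def within(self, lat, lon, radius_m):
        """Return sorted ids of users whose latest fix is within radius_m."""
        # Conservative bounding box in degrees. Simplification: queries that
        # cross the antimeridian or sit at a pole are not handled.
        dlat = math.degrees(radius_m / EARTH_RADIUS_M)
        worst_lat = min(abs(lat) + dlat, 89.9)
        dlon = min(dlat / math.cos(math.radians(worst_lat)), 180.0)
        row_lo, col_lo = self._cell(lat - dlat, lon - dlon)
        row_hi, col_hi = self._cell(lat + dlat, lon + dlon)
        hits = []
        for row in range(row_lo, row_hi + 1):
            for col in range(col_lo, col_hi + 1):
                for user_id in self._cells.get((row, col), ()):
                    ulat, ulon = self._latest[user_id]
                    # Exact distance check on the candidates from the grid.
                    if haversine_m(lat, lon, ulat, ulon) <= radius_m:
                        hits.append(user_id)
        return sorted(hits)
```

In response to a weather incident you would call `within(lat, lon, radius_m)` with the incident's center and affected radius, then notify the returned users. The grid only narrows the candidate set; the haversine check decides membership.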
A second circumstance, and one I will go into in more detail below, is when your spatial data needs to be joined against content that you don't host. The most common example here is POI or "Place" data. This is in fact a very different problem. What's needed here is either a hosted service providing the developer with access to spatially indexed Place data, or better, wholesale access to Place data that can be hosted and processed locally. This is not really a storage problem per se but rather a content problem.
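Once Place data is available locally, the join itself is simple. As a minimal sketch, assuming Place records are plain `(name, lat, lon)` tuples and that a fix more than a chosen distance from every Place should match nothing, a brute-force nearest-Place lookup might look like this. The function name `nearest_place` and the 150 meter default are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_place(lat, lon, places, max_distance_m=150.0):
    """Return the name of the closest Place within max_distance_m, or None.

    `places` is an iterable of (name, lat, lon) tuples. This scans every
    record, so it only suits small Place sets; a spatial index would
    replace the loop for anything larger.
    """
    best_name = None
    best_dist = max_distance_m
    for name, plat, plon in places:
        dist = haversine_m(lat, lon, plat, plon)
        if dist <= best_dist:
            best_name, best_dist = name, dist
    return best_name
```

The hard part, as the paragraph above says, is obtaining and maintaining the Place records, not running the lookup.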
4. Place Data and User Check-ins
The most common way users share location today is the user-generated check-in. Services such as Foursquare, Gowalla and Facebook Places provide convenient, engaging ways for users to participate.
In this model, location is associated with a Place or POI. Of course, this means that there is now great competition over who will own the standard, accepted Place database. This is a whole subject on its own and I hope to have more to say about it in a later post; suffice it to say that the problem remains unresolved.
As it turns out, there's really no technology here. You could argue that collecting check-in data and spatially indexing that data for data mining purposes could readily turn into big data. That's true, but what's less clear is whether this is a mainstream problem service providers will face. I don't think so. What makes more sense is for these types of location-based network providers (or a limited set of agents working on their behalf) to do their own processing locally and expose useful data to third parties.
For now the space remains fractured, and in order to provide effective tools that relate to Place data, platform providers such as Location Labs are in the position of supporting an open-ended set of Place databases. This is unsustainable, but it is a fact of the market for now.
@sahotes
CTO, Location Labs