Mantle – Serverless Maps Using Lambda or Cloudflare Workers (protomaps.com)
113 points by bdon on April 7, 2022 | 19 comments


Looks like a nice offering. For those interested in an even more self-service style option -- check out https://github.com/onthegomap/planetiler. Spin up a high-memory spot instance on your favorite cloud provider, run a command, let it run a few hours, and now you've got a complete tileset for the planet at the cost of a couple of bucks. I'm not affiliated but I've been using it for my own project (http://www.lumathon.com/map) and I've been very happy.


Author of post here. One of the cool possibilities here is that Planetiler output is just another dataset, provided it's been converted to the S3-friendly PMTiles archive format (utility here: http://github.com/protomaps/PMTiles)

While Google Maps and Mapbox let you customize the appearance of vector maps, you can only modify and remove data that already exists at a zoom level. A self-hosted solution allows products like yours to have 100% custom basemap datasets specific to the application, and to serve overlay datasets (like those made with tippecanoe) through the same system.


It looks like PMTiles is basically Cloud Optimized GeoTIFF, but for PNG.

Is this format something that you expect will be widely supported in various clients? (e.g., there's serverless, and then there's static data)


Yep, it's inspired by COG as well as MBTiles, hence the name. Key differences are:

* like MBTiles it's agnostic to what individual tiles are. You can store vector data as SVG, Protobuf, raster PNGs, JPGs or even raw digital elevation models.

* It's not backwards compatible in the same way Cloud Optimized GeoTIFF is.

* It has a recursive index structure to avoid needing to load in huge indexes at once for large datasets (hundreds of millions of tiles).

* It is designed for remote HTTP range reads. If your application can instead directly access a filesystem, like on a mobile phone, MBTiles, which uses SQLite, is a more established solution.

The specification is open source on GitHub https://github.com/protomaps/PMTiles and is still evolving in response to user needs. I've created direct client support for Leaflet and MapLibre, and plan to work on OpenLayers next.
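To make the "remote HTTP range reads" point concrete, here is a minimal Python sketch of how a client could ask for one slice of a single large archive instead of fetching the whole file. The offsets, the fake archive contents, and the function names are illustrative assumptions, not the PMTiles layout or API.

```python
def range_header(offset: int, length: int) -> dict:
    """Build an HTTP Range header for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive on both ends.
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


def read_range(archive: bytes, offset: int, length: int) -> bytes:
    """Stand-in for a ranged GET: return the bytes a server would send
    back in a 206 Partial Content response."""
    return archive[offset:offset + length]


# Hypothetical archive: two "tiles" stored back to back in one file.
archive = b"TILE-A" + b"TILE-B-LONGER"
tile_b = read_range(archive, offset=6, length=13)
```

In a real client, an index lookup would supply the offset and length, and the header from `range_header` would be sent to the storage URL; only those bytes cross the network.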


Ah, so it's a general tile addressable system, where the individual tiles might be vector data, rendered vector data, or raster data. For rendered or raster data, the advantage is that it's a single file, rather than eleventy billion little files and directories. For the vector data, it's a tiling system, which allows you to do the standard things for dropping/combining features as mbtiles, just as one container file.


Really love the work here.

Any thoughts about pre-generating all of the tiles instead of serving them from the static PMTiles via Lambda? There must be levels of traffic at which Lambda is more expensive? Even an aggressively cached one?

It seems like the serverless implementation's worst-case is having to generate a planet's worth of uncached tiles, and the compute is a lot more expensive per tile than just generating them all + putting them behind a CDN.


The tiles in the PMTiles on static storage are already pre-generated. However, most storage solutions work in a single region, and don't offer great latency guarantees.

Lambda/Workers performs the computation to populate the edge cache with some vendor specific optimizations, which gets data closer to end users. There's also space to do more advanced filtering or data combination at lambda time that I'm exploring.

Agreed that the pure serverless pricing model may not be the best for all use cases. Future options I'm looking at are Fly.io and Cloud Run, and I can help evaluate these for customers. There's a lot of competition and emerging products in this space.


> To illustrate the cost savings, every additional 1,000 users that load a map on Google Maps costs 7 USD. An additional million hits to Cloudflare Workers costs fifteen cents.

That's certainly a selling point!

The whole ecosystem of tools is powerful, this tileserver is a great capstone and an interesting commercial model. What is your definition of End Product here? Eg if we have an app and website under one brand is that two end products or one?

How do updates work? Eg with mapbox or google you just get the "latest" data (modulo time it takes them to integrate data), does this hook automatically into the latest source?


> What is your definition of End Product here? Eg if we have an app and website under one brand is that two end products or one?

One brand with web + mobile app would be one product. To be more explicit, a software development consultancy with multiple clients would need one license per client.

For updates, you are 100% in control of the data once it is on your S3. This means that it won't change from under you, change pricing, or disappear. This also means you'll have to explicitly copy updates from upstream, and I'm figuring out the right cadence for that (likely ~quarterly)


Can Backblaze B2 be paired with Cloudflare for similar effect at a lower storage cost? Backblaze does support an S3 compatible target [1].

[1] https://www.backblaze.com/blog/backblaze-b2-s3-compatible-ap...

EDIT: Thank you for the reply!


I've tested Backblaze B2 as a storage option, and I found that with the level of traffic a typical map backend incurs, there were enough 503 errors to cause problems. This is an intentional trade-off of the B2 design that allows them to set an aggressive price point: https://www.backblaze.com/blog/b2-503-500-server-error/

I'm constantly evaluating all of the different deployment combinations and new features on cloud providers, so part of the service I'm offering is advice for your specific provider and this workload.


> To illustrate the cost savings, every additional 1,000 users that load a map on Google Maps costs 7 USD. An additional million hits to Cloudflare Workers costs fifteen cents. That's certainly a selling point!

It gets even better. For example, Mapbox charges for every map load and again for every 12-hour period the map remains open.
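A rough back-of-the-envelope version of the comparison quoted above, as a Python sketch. The two prices come from the quote; the number of tile requests per map load is a made-up assumption (a map load is not the same unit as a Worker hit), and storage, egress, and free tiers are ignored.

```python
GOOGLE_USD_PER_1000_LOADS = 7.00      # figure from the quoted post
WORKERS_USD_PER_MILLION_HITS = 0.15   # figure from the quoted post
TILE_REQUESTS_PER_LOAD = 20           # assumption, not from the post


def google_cost(map_loads: int) -> float:
    """Cost in USD of `map_loads` additional map loads at the quoted rate."""
    return map_loads / 1000 * GOOGLE_USD_PER_1000_LOADS


def workers_cost(map_loads: int,
                 requests_per_load: int = TILE_REQUESTS_PER_LOAD) -> float:
    """Cost in USD of the Worker hits those map loads would generate,
    assuming every tile request reaches a Worker."""
    hits = map_loads * requests_per_load
    return hits / 1_000_000 * WORKERS_USD_PER_MILLION_HITS
```

Changing `TILE_REQUESTS_PER_LOAD` shifts the Workers side linearly, so the gap depends on how many tiles a typical session actually pulls.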


The thing that's holding us back on switching fully to OSM maps is geocoder data. Any leads on an OSM equivalent for geocoding?


The stack I describe in the post is only for map tiles. Map tiles are a good fit for CDNs because the input space is small (just Z/X/Y coordinates on a square grid) and thus very cacheable.
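As a sketch of why the Z/X/Y input space is small, the Python below uses the standard Web Mercator tile formula to map a point to a tile address and counts how many addresses exist up to a given zoom. The function names and the `z/x/y` cache-key format are illustrative assumptions, not something taken from the post.

```python
import math


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> tuple:
    """Return the (z, x, y) Web Mercator tile containing a point."""
    n = 2 ** zoom  # tiles per side at this zoom level
    lat = max(min(lat, 85.0511), -85.0511)  # Web Mercator latitude limit
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so lon=180 or the latitude limit stays inside the grid.
    return zoom, min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def total_tiles(max_zoom: int) -> int:
    """Count every tile address from zoom 0 through max_zoom."""
    return sum(4 ** z for z in range(max_zoom + 1))


def cache_key(z: int, x: int, y: int) -> str:
    """A tile address doubles as a compact, exact cache key."""
    return f"{z}/{x}/{y}"
```

Every user looking at the same area at the same zoom resolves to the same handful of keys, which is what lets a CDN cache do most of the work.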

Geocoding is a very different problem because the input space - human language - is much, much larger, and answering queries quickly to support features like autocomplete really requires a server with hot data in memory.

One of my favorite projects in this space is Pelias https://pelias.io which is an open source auto-completing geocoder based on OSM plus other open data. It's backed by a great team that also runs a business: Geocode Earth https://geocode.earth


(Co-maintainer of Pelias and co-founder Geocode Earth here)

Thanks Brandon for yet another one of your shout outs.

I'd just like to underline one thing you said, Pelias/Geocode Earth are based on OSM _plus other data_: that last part is pulling a lot of weight.

OSM data is great, in fact the POI data in OSM is best-in-class in many parts of the world. But OSM in general doesn't have great address coverage. It's very difficult to manually map low density rural or suburban areas, as is the preferred method with OSM. Bulk address imports are possible, but rare. However there are a huge number of local governments that publish up to date, relatively complete and accurate address lists, and we heavily lean on those for good address coverage.

So to anyone looking for a geocoder who has been put off from Pelias or Geocode Earth because you saw it uses OSM, give it a try anyway. OSM data is a crucial piece of the puzzle, but not the only one.


This looks like a very compelling option. I do have a few questions:

- Can we use our own map data in addition to OSM?

- Is there a possibility of Azure Blob Storage and Azure Functions support?

- Perhaps Knative, for when you already have a Kubernetes cluster maintained?

I don't quite understand the "per-site" license. Currently we have one tile server used by several different map apps. Would each app need one, separate license even though they use same data in (roughly) the same way?

Thank you.


* Yes, the serverless part is a CDN accelerator for any tiled data, as long as you can get your own map data into PMTiles format. I have tested with Azure Functions, but it's not as mature as other platforms yet.

If your several map apps are part of a single organization, it should be fine. I'm happy to discuss this more if you email me at brandon@protomaps.com.


Thanks, will do.


Makes me want to find a project I need a map for!



