Exploring geocodr’s clusters

Here’s a slightly modified geocodr that returns details about all the clusters it finds and the photos that make up each cluster. It can return the bounding box of each cluster too. There are options to display just the main cluster, all clusters, and/or all points within each cluster. It also takes all the other standard geocodr params.
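
As a rough illustration of the bounding box part, here is a small sketch of how a cluster's extent could be worked out from its photos. The function name and the { lat, lon } photo shape are my own assumptions for the example, not the webservice's actual response format.

```javascript
// Bounding box of a cluster's photos. Each photo is { lat, lon } in decimal
// degrees (a hypothetical shape, not the service's real output).
function clusterBounds(photos) {
  if (photos.length === 0) return null; // an empty cluster has no extent
  const lats = photos.map((p) => p.lat);
  const lons = photos.map((p) => p.lon);
  return {
    minLon: Math.min(...lons),
    minLat: Math.min(...lats),
    maxLon: Math.max(...lons),
    maxLat: Math.max(...lats),
  };
}
```

This simple min/max version would give a misleading box for a cluster that straddles the 180° meridian.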

webservice url: http://geothings.ning.com/Flickr/flickrgeocodr_cluster.php documentation

Example application here: http://geothings.ning.com/geocodrjson.php.

Those with the Firebug extension can view the JSON nicely, but the page also “stringifies” it. The service returns both JSON and XML formats.

Flickr Geocodr – k-means cluster enabled geocoder

Here’s a geocoder that finds places based on combined knowledge: the tags, titles or descriptions of geotagged Flickr photos. Example Application. Now that Flickr has 10 million geotagged pics, there’s a fair chance that people will have added tags that describe the location. So searching for photos tagged with Manchester will probably bring up a lot of photos that are located in Manchester. However, it will also bring up other groups of photos, and this is where clustering comes into play.
geocodr.png

This work is inspired by Mikel Maron’s Flickr Geocoder, which grabs photos via the GeoRSS feed and uses the mean value of their locations. That approach has two limits. The Flickr GeoRSS feed Mikel used also includes photos that have no locational information, so we have to use an API call (flickr.photos.search) to grab more geotagged photos. And a simple mean value doesn’t take into account how the photos cluster when they are spread over multiple areas.
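
To see why a plain mean falls down, here is a tiny sketch with two made-up photo locations, one near Manchester in the UK and one near Manchester, New Hampshire. The coordinates are approximate and the function name is just for illustration.

```javascript
// Plain average of [lat, lon] pairs, the "simple mean" approach.
function meanLocation(points) {
  const sum = points.reduce(
    (acc, [lat, lon]) => [acc[0] + lat, acc[1] + lon],
    [0, 0]
  );
  return [sum[0] / points.length, sum[1] / points.length];
}

// Approximate, illustrative coordinates for the two Manchesters.
const manchesters = [[53.48, -2.24], [42.99, -71.45]];
const middle = meanLocation(manchesters);
console.log(middle); // roughly 48.2, -36.8
```

The averaged point lands in the middle of the North Atlantic, which is no use to anyone looking for either Manchester.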

I ported over and changed a Java k-means clustering algorithm into PHP. The clustering process seems to be very fast.

world-manchester4wide.jpg
This screenshot shows a search for Manchester across the world, and shows the number of clusters. The geocoder picks out the cluster with the largest number of points within it.
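
For anyone curious about the general idea, here is a minimal k-means sketch in JavaScript. It is not the PHP port itself: the function names, the evenly spaced starting centroids and the plain Euclidean distance on degrees are all simplifying assumptions of mine.

```javascript
// Minimal k-means over [lat, lon] points. Distance is plain Euclidean on
// degrees, which ignores the curvature of the earth (an assumption).
function kMeans(points, k, maxIterations = 50) {
  // Deterministic start: take k points spread evenly through the input.
  let centroids = Array.from({ length: k }, (_, i) =>
    points[Math.floor((i * points.length) / k)].slice()
  );
  const assignments = new Array(points.length).fill(-1);

  for (let iter = 0; iter < maxIterations; iter++) {
    let changed = false;
    // Assignment step: attach each point to its nearest centroid.
    points.forEach((p, i) => {
      let best = 0;
      let bestDist = Infinity;
      centroids.forEach((c, j) => {
        const d = (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2;
        if (d < bestDist) {
          bestDist = d;
          best = j;
        }
      });
      if (assignments[i] !== best) {
        assignments[i] = best;
        changed = true;
      }
    });
    if (!changed) break; // converged
    // Update step: move each centroid to the mean of its members.
    centroids = centroids.map((c, j) => {
      const members = points.filter((_, i) => assignments[i] === j);
      if (members.length === 0) return c; // keep an empty cluster's centroid
      const latSum = members.reduce((s, p) => s + p[0], 0);
      const lonSum = members.reduce((s, p) => s + p[1], 0);
      return [latSum / members.length, lonSum / members.length];
    });
  }
  return centroids.map((c, j) => ({
    center: c,
    points: points.filter((_, i) => assignments[i] === j),
  }));
}

// The geocoder's answer: the cluster holding the most points.
function largestCluster(clusters) {
  return clusters.reduce((a, b) => (b.points.length > a.points.length ? b : a));
}

// Illustrative data: three photos near Manchester UK, two near Manchester NH.
const photoPoints = [
  [53.48, -2.24], [53.47, -2.23], [53.49, -2.25],
  [42.99, -71.45], [43.0, -71.46],
];
const clusters = kMeans(photoPoints, 2);
const main = largestCluster(clusters);
```

With this sample data the largest cluster is the UK one, so its centre is what would be returned as the location.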

You can get different results by changing a number of parameters, both in the clustering and in the Flickr API call. I found that three or four clusters gave a good result and that around 50 points was sufficient, though a larger sample would give a better answer. Searching by tag or by text, using a bounding box and so on could improve or change the results.

In the Example Application, and as the default setting on the geocoder, photos are returned based on “interestingness” rather than “relevance” or date. This seemed to give a good spread of different authors and photos.

This is clustering based on geographical proximity, but how about clustering based on other variables? The similarity of other tags? Colours in the photo? Date or time taken? A multi-variate clustering may be worth looking at. Dan Catt has talked about clustering recently too.

Possible things for the future:
Automatically search by text if no results are given by tags.
Make pure clustering webservice.
Return photo, cluster and points information in the response.

Edits: For the Ning users:
I made use of <xn:head> to insert the relevant OpenLayers javascript code, and marker code.
Since Ning uses dojo, I used that to communicate via javascript to the webservice:

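// Look up a place via the geocodr webservice and hand the XML reply to doPlace().
// The wrapper name geocodePlace is illustrative.
function geocodePlace(place) {
  var bindArgs = {
    url: "Flickr/flickrgeocodr.php",
    method: "get",
    content: { "place": place },
    mimetype: "text/xml",
    load: function(type, data) {
      doPlace(data, place);
    }
  };
  dojo.io.bind(bindArgs);
}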

Grassroots remapping – OpenStreetMap conference for 2007?

Steve writes on OpenGeoData.org about the first OpenStreetMap conference mooted for mid 2007 in the UK. Looking forward to that; it should be great to get everyone together around tables that haven’t, for the most part of the day, got beer on them. The active social nature of OSM’ers, however, is a positive strength and an indication of the level of commitment and involvement by individuals to make this project succeed. Steve also posted an animated gif by RandomJunk of OSM activity in London. This is just one month’s worth of activity, from 16 October to 14 November 2006!
london.gif

Looking at the statistics for activity (users, people uploading tracks, editing map features etc.), growth continues at a very healthy rate… and it’s well worth asking when the globe will be covered.

I’ve been experimenting with Osmarender (a way to transform the XML-formatted .osm data into SVG and pretty maps), changing some of the default rules by which features are rendered: font sizes, symbols etc. It also showed up which segments and ways were out of order or needed reversing, helping to correct some early JOSM work and encouraging more thorough mapping. Here’s a little map of my neighbourhood. I am slowly walking around with a GPS, so things take time!
someosmstuff.png
It was painless, fun, very customisable, and educational. I use Inkscape to edit the resulting SVG file. (The blank white areas are not countryside but unmapped areas; maybe we should write “here be dragons”…)

Swap: cycletourer.com for a tourer cycle

Swap http://www.cycletourer.com for a tourer.

I’ve had the cycletourer.com website for about six years now and have not got round to making it into a definitive cycle route sharing, planning and mapping site. That is mainly because I was never a hard core cyclist (a day tripper if anything), but also because there were no Google Maps and similar tools… Now there are, and I’m glad to see little sites springing up for the cycling community. (Cycletourer actually had a few routes around Perth in Western Australia, with hard coded HTML maps, but that was as far as it got.) So who wants it? It’s expiring in Jan 2007. I want a bike for it: a road tourer, nothing fancy, as it would probably be stolen like my previous one, but one that can take panniers, ideally with suspension to help my back! I’m in Yorkshire, UK, if that helps.
Also, I’ll give preference to those in the cycling and geo communities (or with the nicest bike, heh).

Virtual property rights? Virtual Earth 3D & adverts.

“Because adverts are natural”… but this does bring up questions of rights over virtual representations of real property. Microsoft have just announced their new additional functionality, 3D views. It works in Windows only, in IE6 and 7, and is essentially a plugin. It does look and move very well, with the detail on the sides of buildings and on the roofs looking to have been rendered correctly from aerial photographs. Very impressive looking buildings! And they’ve also added adverts, making this virtual world more like the real world, for better or for worse.

ve3d.png
I saw this advert/billboard for “Eragon” almost immediately when I searched for San Francisco and zoomed directly in. (New York for some reason didn’t show any 3D buildings for me; London seems quite flat, but some areas are rendered, like Battersea Power Station.)
bps.png

I can see that there will be prime areas of a map: where does the map centre when someone searches for “Manchester”? That central point will be more expensive to have adverts in. So will the top of a high building, or the middle of a park. Perhaps we will see loads of billboards in the centre of the cities, and maybe not so many on the motorways and arterial routes into and out of them.

It also raises questions of ownership… (something we chatted about in #geo not too long ago). Do I have the right to stop a competitor putting their billboard on top of my headquarters? Can a town argue to stop certain businesses advertising on their map, if they have similarly banned them from the physical space? Do I have a right to say how my physical property is represented in a virtual (3D) world? Can state owned property have paid advertisements on the 3D representation? Does anyone have a right to the representation of reality?

Do you have any rights that you can cite in respect to how Microsoft/Google represent physical property over which you claim ownership? Doesn’t the representation of your property come under trademark/copyright? How about the physical space above it?

You would have the right to point out any inaccuracies in their representation (e.g. my house now has an extension), but I would guess that isn’t the same as the right to have your own representation: a giant flower, a purple mushroom etc. Perhaps in the future, ownership of the virtual representation of real property will come as standard when you come into ownership of the real property.

GeoRSS of flickr photos – “flicked”

You can now use a little application to get GeoRSS feeds of Flickr photos for wherever and whatever in the world. Choose the tags, choose the bounding box (where in the world), choose how many results to retrieve, and choose the format: as a list, as a map (see below), or as a GeoRSS feed. Don’t specify tags, and it gives you back the latest photos for that area. Users logged into Ning can save their searches/feeds and can get a shorter permalink to their feeds. Here’s an example feed showing some interesting trees in the UK.

flicked.png

The GeoRSS feeds can be viewed in a number of ways: using the ACME GeoRSS viewer, the OpenLayers Viewer, or in Google Earth via the Geonames RSS to GoogleEarth tool. The OpenLayers viewer on the site has a Yahoo! Maps layer, which you can switch on or off.

The GeoRSS is RSS 2.0 and the geo points are represented twice, using:

W3C Basic Geo
<geo:lat>31.114651</geo:lat>
<geo:lon>120.83571</geo:lon>

and

Simple GeoRSS
<georss:point>31.114651 120.83571</georss:point>
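
As a sketch of how one feed item could carry both encodings, here is a small JavaScript function. The function names and the photo object shape are hypothetical, and this is not the application's actual source. One assumption to flag: I have used geo:long for the longitude element, which is the name the W3C Basic Geo vocabulary defines, whereas the snippet above shows geo:lon.

```javascript
// Escape the characters that would break XML text content.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build one RSS <item> carrying the location twice: W3C Basic Geo elements
// and a Simple GeoRSS point ("lat lon", space separated).
// photo is a hypothetical { title, link, lat, lon } object.
function geoRssItem(photo) {
  return [
    "<item>",
    "  <title>" + escapeXml(photo.title) + "</title>",
    "  <link>" + escapeXml(photo.link) + "</link>",
    "  <geo:lat>" + photo.lat + "</geo:lat>",
    "  <geo:long>" + photo.lon + "</geo:long>", // W3C name; the post shows geo:lon
    "  <georss:point>" + photo.lat + " " + photo.lon + "</georss:point>",
    "</item>",
  ].join("\n");
}
```

The enclosing rss element would also need to declare the geo and georss namespaces for the feed to parse.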

Primarily, it has been a learning exercise in the Flickr APIs, Ning, XML, GeoRSS and OpenLayers. It’s on Ning, so anyone can clone it, look at the source and help me bug fix 😉. You can see some of the steps I took here.

At the moment, the Flickr services occasionally time out, causing a crash. If that happens, a reload will usually correct things. But it’s a reminder not to hit it too hard (fixes are planned, see below).

Note: You can speed up feed generation by choosing to have just the photo link and title (no description, owner name, or tags). These extra “Detailed Outputs” result in a Flickr API call for each individual photo, as opposed to one at the beginning, which slows things down somewhat.

Flickr themselves and Rev Dan Catt of Flickr and Geobloggers have hinted on the blog and on the Flickr:GeoTagging discussion group that this is what will be coming to Flickr.

Take any RSS feed and add &georss=1 onto the end of it, it’ll pop in a georss:point for you, if the photo has geo information.
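
A minimal sketch of doing that in code, assuming a runtime with the standard URL class (modern browsers, Node). The function name and the example.com feed address are placeholders, not real Flickr URLs.

```javascript
// Add georss=1 to a feed URL, whether or not it already has a query string.
function withGeoRss(feedUrl) {
  const url = new URL(feedUrl);
  url.searchParams.set("georss", "1");
  return url.toString();
}

// Placeholder feed address, for illustration only.
const geoFeed = withGeoRss("http://example.com/feeds/photos.rss?tags=geotagged");
console.log(geoFeed);
```

Going through the URL class rather than gluing "&georss=1" onto the string means a feed address without any existing parameters still ends up with a valid "?georss=1".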

An example of the geotagged tag’s RSS feed

Of course with that there’s no guarantee that any photos will have a georss:point. But it’s there if you want to use it on your own photos or whatever.

So hopefully we will soon be able to make these types of GeoRSS feeds easily, right from the Flickr map. In the meantime, this application builds upon that: by doing a photo search with a geographical bounding box, we are able to get all photos that have geo information.

The URL is quite hackable too (and subject to change), with key value pairs:

xn_auth=no sidesteps Ning authentication, thus makes it faster.
format=geoRSS|map|list
tags=thistag,thattag,theothertag (comma separated, no spaces)
andor=any|all
&numresults=30 – can go up to 500 apparently
These three give detailed information; any one set to true will slow down the feed:
&desc=true|false – description and date uploaded
&oname=true|false – show Flickr Username
&showtags=true|false – show tags (some photos have loads!)

&bbox= min_lon, min_lat, max_lon, max_lat – the guts of the feed.
And other things, like sorting by interestingness, relevance or date, etc
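
Here is a hedged sketch of putting such a URL together in JavaScript. The parameter names come from the list above; the function name, the option names and the idea of passing the base URL in are my own choices for the example, and which endpoint accepts these parameters is left to the caller.

```javascript
// Build a flicked-style query string from the key value pairs listed above.
// baseUrl is supplied by the caller; options is a hypothetical shape:
// { format, tags: [...], matchAll, numResults, bbox: [minLon, minLat, maxLon, maxLat] }
function buildFlickedUrl(baseUrl, options) {
  const params = new URLSearchParams();
  params.set("xn_auth", "no"); // sidestep Ning authentication
  params.set("format", options.format || "geoRSS");
  if (options.tags && options.tags.length > 0) {
    params.set("tags", options.tags.join(",")); // comma separated, no spaces
    params.set("andor", options.matchAll ? "all" : "any");
  }
  params.set("numresults", String(options.numResults || 30));
  if (options.bbox) {
    params.set("bbox", options.bbox.join(","));
  }
  return baseUrl + "?" + params.toString();
}

// Example: interesting trees somewhere over the UK (placeholder base URL).
const treesUrl = buildFlickedUrl("http://example.com/flicked", {
  tags: ["tree", "interesting"],
  bbox: [-8, 49, 2, 61],
});
```

URLSearchParams percent-encodes the commas (as %2C), which a server decodes back to commas in the usual way.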

Application: http://geothings.ning.com/flicked.php
Example Feed (30 of the newest photos): http://geothings.ning.com/flickedout.php?&id=2347607

WMS-C OpenLayers – (cached tiling WMS)

MetaCarta Labs / OpenLayers have produced an implementation of WMS-C, that’s WMS Tile Caching, proposed by the OSGeo.

It’s blisteringly fast! MapServer is quite fast, but serving the cached tiles makes it very quick. “So long as you can run python CGI, and write to disk, you can cache any WMS for your own use”. It has support for WMS and also TMS (Tile Map Service) requests.
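
As I understand it, the caching works because clients only ask for tiles on a fixed grid, so the same GetMap request comes round again and again. Below is a rough sketch of working out a tile's bounding box and the matching WMS request. The function names, the grid origin, the tile size and the base URL and layer passed in are all assumptions for illustration, not MetaCarta's actual configuration.

```javascript
// Bounding box of tile (x, y) on a fixed grid. resolution is map units per
// pixel; origin is the grid's bottom-left corner (assumed -180,-90 here).
function tileBounds(x, y, resolution, tileSize = 256, origin = [-180, -90]) {
  const span = tileSize * resolution;
  const minX = origin[0] + x * span;
  const minY = origin[1] + y * span;
  return [minX, minY, minX + span, minY + span];
}

// A standard WMS 1.1.1 GetMap request whose BBOX is snapped to the grid.
// baseUrl and layer are placeholders supplied by the caller.
function tileUrl(baseUrl, layer, x, y, resolution, tileSize = 256) {
  const params = new URLSearchParams({
    SERVICE: "WMS",
    VERSION: "1.1.1",
    REQUEST: "GetMap",
    LAYERS: layer,
    STYLES: "",
    SRS: "EPSG:4326",
    BBOX: tileBounds(x, y, resolution, tileSize).join(","),
    WIDTH: String(tileSize),
    HEIGHT: String(tileSize),
    FORMAT: "image/png",
  });
  return baseUrl + "?" + params.toString();
}
```

With a resolution of 0.703125 degrees per pixel, a 256 pixel tile spans 180 degrees, so two tiles side by side cover the whole world.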

Currently they are serving their Vmap0, Blue Marble, Human Footprint and USGS Digital Raster Graphic datasets.

Some examples: here for the UK and here for DRG over the USA somewhere.

Very impressive.