A Digital Gazetteer of Places for the Library of Congress and the NYPL

I’m proud to tell you that the project I worked on last year with Topomancy for the Library of Congress and the New York Public Library has just been released to the Internet. It’s an open source, fast, temporally enabled, versioned gazetteer using open data. It also works as a platform for a fully featured, function-laden historical gazetteer.

You can check out the official release at the Library of Congress’s GitHub page, and find issues, documentation and development branches on Topomancy’s Gazetteer project site.

Here is an introduction to the project giving an overview and discussion of the challenges and listing the features of this software application. Enjoy!

DEMO: Head over to http://loc.gazetteer.us/ for the Digital Gazetteer – a more global gazetteer – and to http://nypl.gazetteer.us/ for the NYPL’s NYC focused historic gazetteer. If you want to try out the conflation, remediation and administrative roles, let us know at team@topomancy.com

Introduction, Overview, Features

A gazetteer is a geographic dictionary sometimes found as an index to an atlas. It’s a geographic search engine for named places in the world. This application is a temporal gazetteer with data source remediation, relations between places and revision history, and it works with many different data sources. It’s fast, written in Python and uses ElasticSearch as a database. The application was primarily written as a Digital Gazetteer for the Library of Congress’s search and bibliographic geographic remediation needs and was also developed for the New York Public Library’s Chronology of Place project. It is currently being used in production by the Library of Congress to augment their search services. The software is MIT licensed.

Fig 1. Library of Congress Gazetteer


Architecture

* JSON API
* Backbone.js frontend
* Django
* ElasticSearch as revision enabled document store
* PostGIS

Data Model

* Simple data model.
* Core properties, name, type, geometry
* Alternate Names, (incl. language and local, colloq etc)
* Administrative hierarchy
* Timeframe
* Relations between places (conflation, between geography and between time, etc.)
* Edit History, versioning, rollbacks, reverts
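As a rough illustration of the data model above, here is a minimal Ruby sketch of a place record and how it might be serialised as a GeoJSON Feature. The struct and its field names are hypothetical, chosen only to mirror the bullet points; they are not taken from the gazetteer's actual schema or API.

```ruby
require 'json'

# Hypothetical in-memory shape for a gazetteer place. Field names are
# illustrative and are not the project's real schema.
Place = Struct.new(:id, :name, :feature_type, :geometry, :alternate_names,
                   :admin, :timeframe, :relations, keyword_init: true) do
  # Serialise the place as a GeoJSON Feature hash.
  def to_geojson
    {
      'type' => 'Feature',
      'id' => id,
      'geometry' => geometry,
      'properties' => {
        'name' => name,
        'feature_type' => feature_type,
        'alternate_names' => alternate_names,
        'admin' => admin,
        'timeframe' => timeframe,
        'relations' => relations
      }
    }
  end
end

liberty = Place.new(
  id: 'example-1',
  name: 'Statue of Liberty',
  feature_type: 'monument',
  geometry: { 'type' => 'Point', 'coordinates' => [-74.0445, 40.6892] },
  alternate_names: [{ 'name' => 'Liberty Enlightening the World', 'lang' => 'en' }],
  admin: ['New York', 'United States'],
  timeframe: { 'start' => '1886-10-28', 'end' => nil },
  relations: { 'conflated_by' => [] }
)

puts JSON.pretty_generate(liberty.to_geojson)
```

Keeping the core properties (name, type, geometry) at the top level and everything else in `properties` is just one possible layout.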

Features

Search

* Text search (with wildcard, AND and OR support – Lucene query syntax)
* Temporal search
* Search according to data source and data type
* Search within a geographic bounding box
* Search within the geography of another Place.
* GeoJSON and CSV results
* Search can consider alternate names and administrative boundaries, and address details.
* API search of historical tile layers
* Server side dynamic simplification option for complex polygon results

Fig 2. Gazetteer Text Search


Fig 3. Temporal Search


Fig 4. geographic search


Place

* Place has alternate names and administrative boundaries
* Similar Features search (similar names, distance, type etc)
* Temporal data type with fuzzy upper and lower bounds.
* Display of any associated source raster tile layer (e.g. historical map)
* Full vector editing interface for edit / creation.
* Creation of composite places from union of existing places.
* Full revision history of changes, rollback and rollforward.

Fig 5. Alternate Names


Fig 6. Similar Names


Fig 7. Vector Editing


Relations between places

These are:
* Conflates (A is the same place as B)
* Contains (A contains B spatially)
* Replaces (A is the same as B but over time has changed status, temporal)
* Subsumes (B is incorporated into A and loses independent existence, temporal)
* Comprises (B comprises A if A contains B, along with C,D and E)

We will delve into these relationship concepts later

Site Admin

* GeoDjango administrative pages
* Administrative Boundary operations
* Batch CSV Import of places for create / Update
* Edit feature code definitions
* Edit groups and users and roles etc
* Edit layers (tile layers optionally shown for some features)
* Add / Edit data origin definitions

Fig 8. feature code edition


Fig 9. Django origin edition


Background, Requirements and Challenges

Library of Congress and Bibliographic Remediation

The Library has lots of bibliographic metadata and lots of geographic information, much of it historical and almost all of it unstructured.
For example, they have lots of metadata about books: where they were published, their topics, subjects etc. They want to improve the quality of the geo information associated with the metadata, and to augment site search.

So the library needs an authoritative list of places. The Library fully understands the needs for authoritative lists – they have authority files for things, ideas, places, people, files etc, but no centralised listing of them, and where there are geographic records there may be no actual geospatial information about them.

Initial Challenges

So we start with a simple data model, where a named location on the Earth’s surface has a name, a type and a geometry. All very simple right? But actually it’s a complex problem. Take the name of a place, what name to use? What happens if a place has multiple names, and what happens if it has multiple records to describe the same place? Taxonomies are also a large concern, for example establishing a set schema for every different type of feature on the earth is not trivial!

What’s the geometry of a place? Is it a point, is it a polygon, and at what scale? For administrative datasets, it’s often impossible to get a good detailed global administrative dataset. In many places the data is simply not there. Timeframes and temporal gazetteers are another large area for research (see OpenHistoricalMap.org if this intrigues you!). But the way we describe places in time is very varied, for example “in the 1880s” or “mid 19th Century” or “1 May 2012 at 3pm”. What about places which are vague or composed of other places, like “The South” (of the US) – how would a gazetteer handle these? And the relationships between places are another very varied research topic.

Approach

We think the project has tried to address these challenges. For names, the system can accept multiple additional alternate names, and conflation enables multiple records to be linked together so that the search shows the correct authoritative results. The Digital Gazetteer allows places to have any type of geometry (e.g. point, line, polygon), and all the database needs is a centroid to make search work. For temporal support, places have datestamps for start and end dates, but crucially there are in addition fuzzy start and end values specified in days. This enables a place, for example, to have a fuzzy start date (sometime in the year 1911) and a clear end date (23 May, 1945). For “The US South” example, composite places were created. The system generates the union of the component places and makes a new one. The component places still exist – they just have a relationship with their siblings and with their new parent composite place. This brings us to how the Digital Gazetteer handles relations between places.
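To make the fuzzy date idea concrete, here is a small Ruby sketch. It assumes the fuzziness is a number of days applied symmetrically around the recorded date, which is one possible reading of the description above; the struct and method names are hypothetical and not the gazetteer's own.

```ruby
require 'date'

# Hypothetical timeframe: firm start and end dates plus a fuzziness in days.
# Assumption: the fuzz widens the possible range on the relevant side.
Timeframe = Struct.new(:start_date, :end_date, :start_fuzz_days, :end_fuzz_days) do
  # Earliest date on which the place may have existed.
  def earliest
    start_date - start_fuzz_days
  end

  # Latest date on which the place may have existed.
  def latest
    end_date + end_fuzz_days
  end

  # True if the place may have existed at some point between from and to.
  def overlaps?(from, to)
    earliest <= to && latest >= from
  end
end

# "Sometime in 1911" modelled as 2 July 1911 plus or minus 182 days,
# with a clear end date of 23 May 1945 (no fuzziness).
frame = Timeframe.new(Date.new(1911, 7, 2), Date.new(1945, 5, 23), 182, 0)

puts frame.earliest
puts frame.overlaps?(Date.new(1911, 1, 1), Date.new(1911, 1, 31))
puts frame.overlaps?(Date.new(1910, 1, 1), Date.new(1910, 12, 31))
```

A temporal search then becomes a range overlap test against the widened bounds, so a query for January 1911 can still match a place whose start is only known to the year.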

Fig 10. Composite Place


Relationships

Let’s look in a bit more detail at the relationship model. Basically the relationships between places help in conflation (reducing duplicate records) and in increasing search accuracy. The five relationships are as follows:

* Conflates
* Contains
* Replaces
* Subsumes
* Comprises

Conflates

This is the most common relationship between records initially. It effectively is an ontological statement that the place in one record is the same as described in another record, that entries A and B are the same place. It’s a spatial or a name type of relation. For example, if we had 5 records for the Statue of Liberty, and 4 of them were conflated to the one record, when you searched for the statue you’d get the one record, but with a link to each of the other four. Conflates hides the conflated records from search results.
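The effect of conflation on search can be sketched in a few lines of Ruby. This is a toy in-memory version with hypothetical record and field names, not the way the gazetteer stores or queries its ElasticSearch documents.

```ruby
# Hypothetical record: conflated_into holds the id of the authoritative
# record, or nil if this record is itself authoritative.
Record = Struct.new(:id, :name, :conflated_into)

# Return only authoritative matches, each with links to the records
# that have been conflated into it.
def search(records, query)
  needle = query.downcase
  visible = records.select do |r|
    r.conflated_into.nil? && r.name.downcase.include?(needle)
  end
  visible.map do |record|
    linked = records.select { |r| r.conflated_into == record.id }.map(&:id)
    { id: record.id, name: record.name, conflates: linked }
  end
end

records = [
  Record.new(1, 'Statue of Liberty', nil),
  Record.new(2, 'Statue of Liberty National Monument', 1),
  Record.new(3, 'Liberty Enlightening the World', 1),
  Record.new(4, 'Statue of Liberty', 1),
  Record.new(5, 'The Statue of Liberty', 1)
]

results = search(records, 'statue of liberty')
results.each do |r|
  puts "#{r[:name]} (conflates records #{r[:conflates].join(', ')})"
end
```

With five records and four of them conflated, the search returns a single result that links to the other four.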

Contains

Contains is a geographical relationship. Quite simply, Place A contains Place B. So for example, the town of Brighton would contain the Church of St. Matthews.

Replaces

Replaces is mainly a temporal relation, where one place replaces another place if the other place has significantly changed status, name, type or boundary. For example, the building representing the Council Offices of the town from 1830-1967 is replaced by a bank.

Subsumes

Subsumes is mainly a temporal relation, where a place B becomes incorporated into another place A and loses independent existence. For example, the ward of Ifield, which existed from 1780 to 1890, becomes subsumed into the ward of Crawley, so Crawley subsumes Ifield.

Comprises

Comprises is primarily a spatial or name relation. Place B, along with places C, D and E, comprises place A. This relation creates composite places, which inherit the geometries of the component places. For example, “The US South” can be considered a composite place. This place is comprised of Virginia, Alabama etc. Virginia in this case comprises “The US South”, and the composite place “The US South” has the union of the geometry of all the places it is comprised of.
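A simplified Ruby sketch of building a composite geometry follows. As a stated simplification, it collects the component polygons into a single GeoJSON MultiPolygon rather than dissolving shared borders; a true geometric union would normally be delegated to a spatial library or database function such as PostGIS `ST_Union`. The function name and the toy square geometries are hypothetical.

```ruby
# Combine component GeoJSON Polygon / MultiPolygon geometries into one
# MultiPolygon. Shared borders are NOT dissolved in this sketch.
def composite_geometry(components)
  polygons = components.flat_map do |geom|
    case geom['type']
    when 'Polygon' then [geom['coordinates']]
    when 'MultiPolygon' then geom['coordinates']
    else raise ArgumentError, "unsupported geometry type: #{geom['type']}"
    end
  end
  { 'type' => 'MultiPolygon', 'coordinates' => polygons }
end

# Two toy unit squares standing in for component places.
square_a = { 'type' => 'Polygon',
             'coordinates' => [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]] }
square_b = { 'type' => 'Polygon',
             'coordinates' => [[[1, 0], [2, 0], [2, 1], [1, 1], [1, 0]]] }

composite = composite_geometry([square_a, square_b])
puts composite['coordinates'].length
```

The component places keep their own geometries; only the new composite place gets the combined one.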

Data Sources

OpenStreetMap (OSM), Geonames, US Census Tiger/Line, Natural Earth, Historical Marker Database (HMDB), National Historical GIS (NHGIS), National Register of Historic Places Database (NRHP) and Library of Congress Authority Records

Further Challenges

Automatic Conflation

There remain two main areas for future development and research – Automatic Conflation and Search Ranking. Since there are multiple datasets, there will of course be multiple records for the same place. The challenge is how to automatically find the same place from similar records by some kind of search distance: for example, by geographic distance from each other, and by similarity of name and place type. It’s tricky to get right, but the system would be able to undo any of the robot’s mistakes. Further information about this topic can be found on the GitHub wiki: https://github.com/topomancy/gazetteer/wiki/Conflation
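One way such a "search distance" could be put together is sketched below in Ruby. It combines name similarity (Levenshtein edit distance), geographic proximity (haversine distance) and place type into a single score. The weights, the 1 km cut-off and the record layout are arbitrary assumptions for illustration, not the approach described on the project wiki.

```ruby
# Classic dynamic-programming Levenshtein edit distance between two strings.
def levenshtein(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

EARTH_RADIUS_KM = 6371.0

# Great-circle distance in kilometres between two lat/lon points.
def haversine_km(lat1, lon1, lat2, lon2)
  to_rad = Math::PI / 180
  dlat = (lat2 - lat1) * to_rad
  dlon = (lon2 - lon1) * to_rad
  a = Math.sin(dlat / 2)**2 +
      Math.cos(lat1 * to_rad) * Math.cos(lat2 * to_rad) * Math.sin(dlon / 2)**2
  2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a))
end

# Score in 0..1; higher means more likely to be the same place.
# The 0.5 / 0.3 / 0.2 weights are made up for this example.
def conflation_score(a, b, max_km: 1.0)
  longest = [a[:name].length, b[:name].length, 1].max
  name_sim = 1.0 - levenshtein(a[:name].downcase, b[:name].downcase).to_f / longest
  km = haversine_km(a[:lat], a[:lon], b[:lat], b[:lon])
  proximity = [1.0 - km / max_km, 0.0].max
  type_match = a[:type] == b[:type] ? 1.0 : 0.0
  0.5 * name_sim + 0.3 * proximity + 0.2 * type_match
end

osm = { name: 'Statue of Liberty', type: 'monument', lat: 40.6892, lon: -74.0445 }
geonames = { name: 'Statue Of Liberty', type: 'monument', lat: 40.6893, lon: -74.0446 }
island = { name: 'Ellis Island', type: 'island', lat: 40.6995, lon: -74.0396 }

puts conflation_score(osm, geonames).round(3)
puts conflation_score(osm, island).round(3)
```

Pairs scoring above some threshold could be proposed as conflation candidates for review, which fits with the system being able to undo mistakes.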

Search Ranking

By default the gazetteer uses full text search which also takes into account alternate names and administrative boundaries, but there is a need to float up the more relevant places in the search results. We can also sort by distance from the search centre if doing a search within geographic bounds, which is used for helping find similar places for conflation. We could probably look at weighting results based on place type, population and area, although population and area for many urban areas in the world may not be available. One of the most promising areas of research is using Wikipedia request logs as a proxy for importance – places could be more important if they are viewed on Wikipedia more than other places.
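As a sketch of how a popularity signal might be folded into ranking, the Ruby snippet below boosts a text relevance score by a log-scaled page view count. The formula, field names and view counts are all invented for illustration; nothing here reflects an implemented feature of the gazetteer.

```ruby
# Sort results by text relevance boosted by log-scaled page views, so
# popular places float up without swamping good textual matches.
def ranked(results)
  results.sort_by do |r|
    -(r[:text_score] * (1.0 + Math.log10(1 + r[:page_views])))
  end
end

# Made-up scores and view counts.
results = [
  { name: 'Paris, Texas', text_score: 1.0, page_views: 1_000 },
  { name: 'Paris, France', text_score: 1.0, page_views: 1_000_000 },
  { name: 'Parisot', text_score: 0.4, page_views: 10 }
]

ranked(results).each { |r| puts r[:name] }
```

Using a logarithm is one design choice among many: it keeps a place with a thousand times more views from scoring a thousand times higher.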

Further Issues

Some other issues which I haven’t got space to go into here include synchronising changes up and downstream to and from the various services and datasets. Licensing of the datasets could also be looked at, especially if they are being combined. Another question is what level of participation in the conflation and remediation steps a gazetteer should have, which depends on where the gazetteer is based and who it is being used for.

NYPL Chronology Of Place

I mentioned at the beginning of the post that the New York Public Library (NYPL) was also involved with the development of the Gazetteer. That project was called The Chronology of Place, and as the name suggests is more temporal in nature. But it’s also more focused geographically. Whereas the LoC are interested in the US and the World as a whole, the NYPL’s main focus is the City of New York. They wanted to deep dive into each building of the city, exploring the history and geography of buildings, streets and neighbourhoods.

Fig 11. NYPL Chronology of Place


Thus the level of detail was more fine grained, and is reflected in some custom default cartography in the web application client. A nondescript building in a street in a city, for example, is not usually considered a “place” worthy of a global gazetteer, but for the NYPL each building was significant. Also, the NYPL has extensive access to historical maps via the NYPL Map Warper which Topomancy developed for them, and around a hundred thousand digitized vector buildings from these historical map atlases. This data, along with data from the city, was able to be added to the system to augment the results. Additional data sources include the Census’s Historical Township boundary datasets, NYC Landmarks Preservation Commission Landmarks and NYC Building Footprints.

There were two additional features added to the application for the NYPL’s Chronology of Place. The first was expanding the data model to include street addresses, so that a building with no name can be used, and the second was to display raster tile layers (often from historical maps) for specific features. Thus, the building features which were digitized from the historical maps were able to be viewed alongside the source raster map that they came from.

Fig 12. Custom/Historical layers shown


A Web Maps Primer using MapWarper.net (via NYPL Blog)

Mauricio from the innovative NYPL Labs has just published an extensive tutorial on how to use MapWarper.net with GeoJSON, MapboxJS, and JSFiddle to create your own historical web map. As he says, it is “a primer on working with various free web mapping tools so you can make your own awesome maps.” The end result is worth checking out.


In the tutorial the following steps are included:

  1. geo-referencing the scanned map so that web tiles can be generated
  2. generating GeoJSON data to be overlaid
  3. creating a custom base map (to serve as reference/present day)
  4. integrating all assets in an interactive web page


It’s a very detailed introduction to a wide range of new, free and open geo tools on the web, and I cannot recommend it highly enough! It’s also great to see mapwarper.net being used in this way!

Devise Omniauth OAuth Strategy for MediaWiki (Wikipedia, WikiMedia Commons)

Authentication of MediaWiki users with a Rails Application using Devise and Omniauth

Wikimaps is a Wikimedia Commons project to georeference/georectify historical maps. Read the wikimaps blog here. It uses a customised version of the Mapwarper open source map georectification software, as seen on http://mapwarper.net, to speak with the Commons infrastructure, and runs on the Wikimedia Foundation’s Labs servers. We needed a way to allow Commons users to log in easily, and so I developed the omniauth-mediawiki strategy gem so your Ruby applications can authenticate on Wikimedia wikis, like Wikipedia.org and Wikimedia Commons.

The Wikimaps Warper application uses Devise, which works very nicely with OmniAuth. Its login page offers traditional login with username and password and, using OmniAuth, login via Wikimedia Commons, GitHub and OpenStreetMap.

After clicking the Wikimedia Commons button the user is presented with the Wikimedia OAuth authorization prompt.

It may not be that pretty, but once the user allows this they are redirected back to our app and logged in.

This library used the omniauth-osm library as an initial framework for building upon.

The code is on github here:   https://github.com/timwaters/omniauth-mediawiki

The gem on RubyGems is here: https://rubygems.org/gems/omniauth-mediawiki

And you can install it by including it in your Gemfile or by doing:

gem install omniauth-mediawiki

Create new registration

The mediawiki.org registration page is where you would create an OAuth consumer registration for your application. You can specify all wikimedia wikis or a specific one to work with. Registrations will create a key and secret which will work with your user so you can start developing straight away although currently a wiki admin has to approve each registration before other wiki users can use it.  Hopefully they will change this as more applications move away from HTTP Basic to more secure authentication and authorization strategies in the future!


Usage

Usage is as per any other OmniAuth 1.0 strategy. So let’s say you’re using Rails, you need to add the strategy to your `Gemfile` alongside omniauth:

gem 'omniauth'
gem 'omniauth-mediawiki'

Once these are in, you need to add the following to your `config/initializers/omniauth.rb`:

Rails.application.config.middleware.use OmniAuth::Builder do
 provider :mediawiki, "consumer_key", "consumer_secret"
end

If you are using Devise, this is what it looks like in your `config/initializers/devise.rb`:

config.omniauth :mediawiki, "consumer_key", "consumer_secret", 
    {:client_options => {:site => 'http://commons.wikimedia.org' }}

If you would like to use this plugin against a specific wiki, you can use the environment variable WIKI_AUTH_SITE to set the server to connect to. Alternatively you can pass the site as a client_option to the omniauth config as seen above. If no site is specified the http://www.mediawiki.org wiki will be used.

Notes

In general see the pages around https://www.mediawiki.org/wiki/OAuth/For_Developers for more information

When registering for a new OAuth consumer registration you need to specify the callback url properly. e.g. for development:

http://localhost:3000/u/auth/mediawiki/callback
http://localhost:3000/users/auth/mediawiki/callback

This is different from many other OAuth authentication providers which allow the consumer applications to specify what the callback should be. Here we have to define the URL when we register the application. It’s not possible to alter the URL after the registration has been made.

Internally the strategy library has to use `/w/index.php?title=` paths in a few places, like so:

:authorize_path => '/wiki/Special:Oauth/authorize',
:access_token_path => '/w/index.php?title=Special:OAuth/token',
:request_token_path => '/w/index.php?title=Special:OAuth/initiate',

This could be due to a bug in the OAuth extension, or due to how the wiki redirects from /wiki/Special pages to /w/index.php pages. I suspect this may change in the future.

Another thing to note is that the MediaWiki OAuth implementation uses a cool but non-standard way of identifying the user. OmniAuth and Devise need a way to get the identity of the user. Calling '/w/index.php?title=Special:OAuth/identify' returns a JSON Web Token (JWT). The JWT is signed using the OAuth secret, so the library verifies and decodes it to get the user information.
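To show what verifying and decoding such a token involves, here is a self-contained Ruby sketch using only the standard library. It assumes the token is signed with HMAC-SHA256 (HS256) using the consumer secret; the helper names and the demo claims are hypothetical, and this is not the gem's actual code. The demo token is built locally so the example runs without contacting a wiki.

```ruby
require 'json'
require 'openssl'

# Base64url helpers (URL-safe alphabet, padding stripped as in JWTs).
def b64url_encode(bytes)
  [bytes].pack('m0').tr('+/', '-_').delete('=')
end

def b64url_decode(text)
  padded = text + '=' * ((4 - text.length % 4) % 4)
  padded.tr('-_', '+/').unpack1('m')
end

def hs256(secret, data)
  OpenSSL::HMAC.digest(OpenSSL::Digest.new('SHA256'), secret, data)
end

# Verify the signature of a compact JWT and return its claims as a Hash.
def decode_identity(token, secret)
  header_b64, payload_b64, signature_b64 = token.split('.')
  raise ArgumentError, 'malformed token' unless header_b64 && payload_b64 && signature_b64

  header = JSON.parse(b64url_decode(header_b64))
  raise ArgumentError, 'unexpected algorithm' unless header['alg'] == 'HS256'

  expected = hs256(secret, "#{header_b64}.#{payload_b64}")
  # Production code should use a constant-time comparison here.
  raise ArgumentError, 'signature mismatch' unless expected == b64url_decode(signature_b64)

  JSON.parse(b64url_decode(payload_b64))
end

# Build a demo token with made-up claims and a placeholder secret.
secret = 'consumer_secret'
signing_input = [
  b64url_encode(JSON.generate('typ' => 'JWT', 'alg' => 'HS256')),
  b64url_encode(JSON.generate('username' => 'WikiUser', 'sub' => 12345))
].join('.')
token = "#{signing_input}.#{b64url_encode(hs256(secret, signing_input))}"

claims = decode_identity(token, secret)
puts claims['username']
```

In a real application a maintained JWT library would usually be preferable to hand-rolled verification; the sketch is only meant to show the moving parts.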

Calling the MediaWiki API

OmniAuth is mainly about authentication – it’s not really about using OAuth to do things on the user’s behalf – but it’s relatively easy to do so if you want to. The OmniAuth developers recommend using it in conjunction with other libraries: for example, if you are using omniauth-twitter, you should use the Twitter gem to use the OAuth authentication variables to post tweets. There is no such gem for MediaWiki which uses OAuth. Existing Ruby libraries such as MediaWiki Gateway and MediaWiki Ruby API currently only use usernames and passwords, but they should be looked at for help in crafting the necessary requests.

So we will have to use the OAuth library and call the MediaWiki API directly:

In this example we’ll call the Wikimedia Commons API

Within a Devise / Omniauth setup, in the callback method, you can directly get an OAuth::AccessToken via request.env["omniauth.auth"]["extra"]["access_token"] or you can get the token and secret from request.env["omniauth.auth"]["credentials"]["token"] and request.env["omniauth.auth"]["credentials"]["secret"]
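As a small illustration of the second option, the Ruby sketch below pulls the token and secret out of an auth hash so they can be saved on the user. A plain Hash stands in for `request.env["omniauth.auth"]`, and the helper name and the `auth_token` / `auth_secret` attribute names are assumptions for this example.

```ruby
# Extract the OAuth token and secret from an OmniAuth-style auth hash,
# raising KeyError if either is missing.
def oauth_credentials(auth)
  credentials = auth.fetch('credentials')
  {
    auth_token: credentials.fetch('token'),
    auth_secret: credentials.fetch('secret')
  }
end

# Stand-in for request.env["omniauth.auth"]; all values are made up.
auth = {
  'provider' => 'mediawiki',
  'uid' => '12345',
  'credentials' => { 'token' => 'example-token', 'secret' => 'example-secret' }
}

attrs = oauth_credentials(auth)
puts attrs.inspect
```

In a Devise callback these attributes could then be written to the user record, for example with something like `user.update(attrs)`.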

Assuming the authentication token and secret are stored in the user model, the following could be used to query the mediawiki API at a later date.

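require 'oauth'
require 'uri'

# Consumer key and secret come from your OAuth consumer registration
@consumer = OAuth::Consumer.new "consumer_key", "consumer_secret",
            {:site=>"https://commons.wikimedia.org"}
# Rebuild the access token from the values stored on the user model
@access_token = OAuth::AccessToken.new(@consumer, user.auth_token, user.auth_secret)
# URI.encode was removed in Ruby 3.0, so build the query string with URI.encode_www_form
params = {:action => 'query', :meta => 'userinfo', :uiprop => 'rights|editcount', :format => 'json'}
uri = "https://commons.wikimedia.org/w/api.php?#{URI.encode_www_form(params)}"
resp = @access_token.get(uri)
logger.debug resp.body.inspect
# {"query":{"userinfo":{"id":12345,"name":"WikiUser",
# "rights":["read","writeapi","purge","autoconfirmed","editsemiprotected","skipcaptcha"],
# "editcount":2323}}}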

Here we called the Query action for userinfo, asking for rights and editcount information.

Leeds Creative Labs – Initial steps and ideas around The Hajj

Cross posted from The Leeds Creative Labs blog.

I signed up to take part in Leeds Creative Labs Summer 2014 programme with the hope that it would result in something interesting, something that a techie would never get the opportunity to do normally. It’s certainly exceeded that expectation – it’s been a fascinating enthralling process so far, and I feel honoured to have been selected to participate.

 

I’m the designated “technologist” who is in partnership with Dr Seán McLoughlin and Jo Merrygold on this project around The Hajj and British Muslims. Usually I tend to do geospatial collaborative and open data projects, although I’m also a member of the Leeds group of Psychogeographers. Psychogeography is intentionally vague to describe but one definition is that it’s about the feelings and effects of space and place on people. It’s also about a critique of space – a way to see how modern day consumerism/capitalism is changing how our spaces are, and by definition how we in these spaces behave.

We had our first meeting last week – it was a “show and tell” by Seán and Jo to share some of the ideas, research, themes and topics that could be of relevance to what we will be doing.

Show and tell

Seán, from the School of Philosophy, Religion and The History of Science introduced his research on Islam and Muslim culture, politics and society in contexts of contemporary migration, diaspora and transnationalism. In particular his work has been around and with South Asian heritage British Muslim communities. The current focus of his work, and the primary subject of this project is about researching British Muslim pilgrims’ experiences of the Hajj.

The main resources are audio interviews, transcripts and on-line questionnaires from a number of different sources such as pilgrims of all ages and backgrounds, other people related to the Hajj “industry” such as tour operators and charities.

Towards the end of the year are a few set days for the Hajj – a once in a lifetime pilgrimage to the holy Saudi Arabian city of Mecca. You have probably seen similar photos such as this where thousands of pilgrims circle the Kaaba – the sacred cuboid house right in the centre of the most sacred Muslim mosque.

It’s literally the most sacred point in Islam. It’s the focal point for prayers and thoughts. Muslims orient themselves towards this building when praying. The place is thought about everywhere – for example, people may have paintings with this building in their homes in the UK, and they may bring back souvenirs of their Hajj pilgrimage . You can see that the psychogeography of space and place on the emotions and thoughts of people could be very applicable here!

And yet the Hajj itself is more than just about the Kaaba – it’s a number of activities around the area. Here’s a map!

The Hajj

These activities, all with their own days and particular ways of doing them, are literally meant to be in the footsteps of key religious figures in the past. I will let the interested reader discover them for themselves, but there are a number of fascinating issues surrounding the Hajj for British Muslims which Seán outlined.

Here’s a small example of some of these themes:

Organising the Hajj (tour operators, travel etc).
What the personal experiences of the pilgrims were.
How Mecca has changed, and how the Hajj has changed.
The commercial, the profane, the everyday and the transcendent and the sacred.
How this particular location and event works over time and space.
What are the differences and similarities of people and cultures, and possible experiences of poverty.
“Hajj is not a holiday” and Hajj Ratings.
Differences in approach of modern British Muslims to going to the Hajj (compared to say their grandparents).
Returning home and the meaning and expectations of returnees (called Hajjis).

What we did and didn’t do

We didn’t rush to define our project outputs – but we all agreed that we wanted to produce something!

Echoing Maria’s post earlier, we are trying to leave the options open for what we hope to do, allowing our imaginations to run and to explore options. I think this does justice to the concept of experimentation and collaboration, and should help us be more creative. I think that we can see which ideas spark our imaginations, which address the issues better, what examples and existing things are out there that can be re-appropriated or borrowed, and which things point us in the right direction.

What I did after

So after the show and tell my mind was spinning with new ideas and concepts. It took me a few days to go over the material and do some research of my own, and see what sorts of things I might be able to contribute to. It’s certainly sparked my curiosity!

I was to prepare for a show and tell (an ideas brain-dump) for the next meeting. The examples I prepared included things from cut and paste transcriptions, 3D maps, FourSquare and social media, to story maps, to interactive audio presentations and oral history applications. I also gave a few indications as to possible uses of psychogeography with the themes. I hope to use this blog to share some of these ideas in later posts.

Initially I mentioned the difference between a “hacker” approach and the straight client and consultant way of doing development. For example encouraging collaborative play and exploration rather than hands off development. Allowing things to remain open. The further steps would be crystallizing some of these ideas – finding better examples and working out what we want to look at or devote more time to. We’d then be able to focus on some aims and requirements for a creative interesting project.

State of the Map Europe 2014 – Pure OpenStreetMap.

Karlsruhe

State of the Map Europe 2014 was in the German city of Karlsruhe. The city was a planned city – designed and built around 1715 – pre motor car, but with wide avenues, and half of the city seems to be a park. It’s also famous for being the home of the Karlsruhe Addressing Scheme – an example of a folksonomy tagging convention that everyone pointed to and adopted, due to the great mappers there – including the folks from Geofabrik.de who also organised the conference. Here are some notes from the conference:

Nature of the conference

The European conference seemed much more intimate, with a focus on developers and contributors – compared to the US conference, which I think had more end users and people sent there by their bosses for their company. Pretty much every single session was on topic (except for the closing buzzword laden keynote!) – and as such there were no enlightening talks about psychogeography, general historical mapping, or other geospatial software. It was pure OSM.

All the talks are online and the video recordings are on youtube and I encourage you to view them.

3D maps

3D Maps, such as Mapzen and OSMBuildings were prominent – and both showed off some very creative ways of representing 3D maps.

Geocoder and Gazetteers

The only track in the conference – this was full of gazetteers, with an announcement from OpenCage and Mapzen – all appear to be using ElasticSearch – same as we (Topomancy) did last year for the NYPL and Library of Congress. Check out gazetteer here.

Other stuff

Trees – Jerry did a talk about mapping trees – about how they were represented in historical maps previously, and how we can use SVG symbols to display woods and trees in a better way. Jerry led an expedition and workshop on the morning of the hack day to show participants the different habitats, surface types and variance in the environment that mappers could take into consideration.

Mapbox WebGL – Constantine, a European engineer of Mapbox did a fascinating talk about the complexities of the technical challenges with vector tiles and 3D maps. I really enjoyed the talk.


OpenGeoFiction – using the OSM stack to create fictional worlds  – not fantasy or science fiction, but amazing experiments in amateur planning, utopian visions and creative map making. OpenGeoFiction.net

The fictional world of Opengeofiction is thought to be in modern times. So it doesn’t have orcs or elves, but rather power plants, motorways and housing projects. But also picturesque old towns, beautiful national parks and lonely beaches.

I love this project!

Vector Tiles – Andy Allan talked about his new vector tile software solution ThunderForest – being one of the only people to know the ins and outs of how Mapbox do the Mapnik / TileMill vector magic. ThunderForest powers the cycle map. Vector maps have lots of advantages and I think we’d probably use them for OpenHistoricalMap purposes at some stage. Contact Andy for your vector mapping and online cartographic needs!

POI Checker – from the same house as WheelMap.org comes POI Checker – it allows organisations to compare their data with data in OSM  – and gives a very neat diff view of Points of Interests. This could be a good project to follow.

Historical Stuff

OpenHistoricalMap – There were a few things about historical maps in the conference, although in my opinion fewer than at any other SOTM previously. I did a lightning talk about OpenHistoricalMap and completely failed to mention the cool custom UK centric version of the NYPL’s Building Inspector.

Opening Keynote  – this was peppered with the history of the city and gave a number of beautiful historical map examples. Watch the video.

Map Roulette v2 – Serge gave a talk about the new version of Map Roulette – it is being customised to be able to run almost any custom task on the system. We chatted at the hack day to see if the tasks from the Building Inspector could be a good fit for the new Map Roulette – I will look into this!

 

 

NYPL Adds 20,000 High Resolution Maps to the NYPL Warper – free to download

NYPL Warper – New Maps!

This week’s news was about a project I’ve been working on for the last few months with Topomancy – adding a whole load of new maps for one of the largest libraries around, the New York Public Library. These were added to an award winning crowdsourced geo-rectification, historical map exploration and discovery application. Users can download full resolution TIFF files without the need to log in, and if the map has been geo-referenced/rectified/warped, then you can freely download the warped versions too. The images are all under CC-Zero licenses – so, effectively Public Domain in nature. Credit to the library is appreciated though.

From motherboard.vice.com/read/new-york-public-library-releases-20000-beautiful-high-resolution-maps


 

The announcement of the freely available 20,000 maps from the NYPL this week has been covered in a few places including OpenGLAM, Motherboard, OpenCulture and InfoDocket amongst others!


How

Folks may recognise that the Warper has been around for a little while now, so here’s what we did. We hooked it up with the NYPL Digital Collections API – this changed the way it requested images: instead of internally requesting images from the Image Server, it uses the API properly. A whole suite of import processes was also created to enable the search of maps from the repository, the import of individual map sheets, the import of individual atlases or layers full of maps, and most usefully the import of newly digitized maps. A by-product of this was to extract some of the library code into the nypl_repo Ruby Gem. There’s even some documentation for the nypl_repo gem for interacting with the NYPL Digital Collections API.

The code for the NYPL Warper can be found on GitHub – although if you want to do this at home, have a look at the code for MapWarper.net, also available on GitHub.

geosearch

A little-used feature of the Warper – finding maps by using a map to search for them!
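For the curious, searching "by map" boils down to asking which maps have a bounding box that overlaps the area currently in view. The snippet below is only a rough sketch of that idea in plain Ruby; the Warper itself may well do this with a spatial query in its database, and MapRecord, intersects? and maps_in_view are made-up names with made-up coordinates.

```ruby
# Hypothetical record: a rectified map and its bounding box in degrees.
MapRecord = Struct.new(:title, :west, :south, :east, :north)

# True when the map's box overlaps the viewed box.
# Boxes that cross the antimeridian are not handled in this sketch.
def intersects?(map, west, south, east, north)
  map.west <= east && map.east >= west &&
    map.south <= north && map.north >= south
end

# Keep only the maps that overlap the area shown on screen.
def maps_in_view(maps, west:, south:, east:, north:)
  maps.select { |m| intersects?(m, west, south, east, north) }
end

# Illustrative data only, not real Warper records.
maps = [
  MapRecord.new("Lower Manhattan sheet", -74.02, 40.70, -74.00, 40.72),
  MapRecord.new("Leeds town plan", -1.60, 53.78, -1.50, 53.82)
]

hits = maps_in_view(maps, west: -74.05, south: 40.68, east: -73.95, north: 40.75)
puts hits.map(&:title)
```

Panning or zooming the search map just changes the four numbers passed to maps_in_view.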

Magic, Illusion, Perception @ March Leeds Superposition

A few days ago saw the most recent meeting of the Superposition group in Leeds. That night’s theme was “Magic, Illusion and Perception”. I’ve pinched a lot of the text in this post from that one!

There were four talks. The first was about the “curiosity” machine, the zoopraxiscope, which uses lasers to draw moving images on clouds. It was taken up in a small plane, and images of a moving horse were projected onto a cloud. Wonderful stuff.

IMG_1792

Ben Dalton’s talk ‘Zines in the age of ‘big data’?’ introduced and proposed the idea of bundle publishing. At odds with current trends in digital distribution, bundle publishing involves editing a large collection of digital content and then publishing it on a specific date as a single, large file. This was the most intriguing talk of the evening: instead of streams, or blogs, or things, media could be published and shared in huge bundles of files. I’m encouraged partly by online publications such as The New Inquiry as an alternative to a blog roll. Ben is also interested in pseudonyms. A team of writers may publish using the same pseudonym, and the pseudonym would have its own character and style of writing. There were also the pseudonyms used by “anon” users – names that become used and familiar to people.

Experimental jazz musician and neuroscientist Christophe de Bézenac talked about the blurring of self and other in music and psychosis. Having studied at Conservatoire de Strasbourg, and been a regular performer at international music festivals, he explained how perceptual ideas have guided his musical practice and how his musical work has, in turn, fed into his empirical/neuroscience research into psychosis. This talk really excited the audience, with discussions about what ambiguity is, in language, music and so on. What is the crowd? What is the mob? Can someone experience things as a group? Fascinating stuff.

IMG_1794

Professional Magician and Sleight-of-Hand artist Tony O’Neill discussed his creative process within the magical syllabus and shared his current findings on the power of suggestion and self belief. He showed that magic and fortune telling could be used to help people, even when they knew what the process was all about. I wonder if a city needs more magicians, or if this type of magic could be used on a group of people. Things discussed included how you can change someone’s mind by planting suggestions.

IMG_1797

Tube Sign – Service Information image generation service

Part of a series of posts to cover some small projects that I did whilst not being able to work. They cover things from the role of familiar strangers on the internet and anti-social networks, through to meteorological hacks, funny memes to twitter bots. This post is about a funny meme image generation service.

Sometimes I surf the internet for funny pictures. Although the ones with cats I have a healthy distrust for – there was one class of amusing image which caught my eye: those funny or inspirational London Underground passenger information signs. I was seeing these every week and thought… “I could do that”. So I did: I created tubesign.herokuapp.com, and a few other people found it funny. At one point there were about 50 people visiting at any one time, and when I put the statistics on there were 13,000 views on the second day, with an image being created every second. At time of writing it has had over 50,000 views.

An actual real life TfL service information Sign! from http://weknowmemes.com/wp-content/uploads/2012/09/apple-maps-london-tube-sign.jpg

How I did it.

First of all I looked into fonts – I wanted to get a good handwriting font which would look as if someone had used a marker on a whiteboard. Google Fonts delivered, and I chose Reenie Beanie.

It uses Sinatra, Ruby and RMagick and is hosted on the Heroku platform – even at its busiest it was able to cope on the free tier. It doesn’t use any database. It caches requests for images though.
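As an illustration of the caching idea (and not the actual Tube Sign code, which may cache in a different way, for example with HTTP headers or files on disk), here is a minimal sketch in plain Ruby. SignCache is a made-up name, and the "image" is just a placeholder string standing in for whatever the renderer returns.

```ruby
require "digest"

# Hypothetical in-memory cache keyed by a digest of the sign text,
# so the same text is only rendered once.
class SignCache
  def initialize
    @store = {}
  end

  # Returns the cached result for this text, or runs the block to
  # render it once and remembers the result.
  def fetch(text)
    key = Digest::SHA256.hexdigest(text)
    @store.fetch(key) { @store[key] = yield(text) }
  end
end

renders = 0
cache = SignCache.new
2.times do
  cache.fetch("Mind the gap") do |t|
    renders += 1 # stands in for the expensive image drawing step
    "<image bytes for #{t}>"
  end
end
puts renders # prints 1: the second request was served from the cache
```

Keying on a digest rather than the raw text keeps the keys a fixed length however long the message is.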

I use a bit of random number generation to change the angle the text is written at, and change the indent a bit.
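The jitter could be sketched roughly like this. It is only an illustration: jitter_lines and the two limits are made-up names and values, not the ones Tube Sign uses, and the drawing step itself (done with RMagick in the app) is left out.

```ruby
# Illustrative limits only; the real app's values are not known here.
MAX_ANGLE_DEGREES = 2.0
MAX_INDENT_PIXELS = 12

# Picks a small random rotation and left indent for each line of text
# so that the rendered sign looks hand-written rather than typeset.
def jitter_lines(lines, rng: Random.new)
  lines.map do |line|
    {
      text: line,
      angle: rng.rand(-MAX_ANGLE_DEGREES..MAX_ANGLE_DEGREES),
      indent: rng.rand(0..MAX_INDENT_PIXELS)
    }
  end
end

# Passing a seeded generator makes the layout repeatable.
layout = jitter_lines(["Good service", "on all lines"], rng: Random.new(42))
layout.each do |l|
  puts format("%-14s angle=%+.2f indent=%d", l[:text], l[:angle], l[:indent])
end
```

Seeding the generator is handy if the same text should always produce the same-looking sign, which also plays nicely with caching.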

The code for Tube Sign is on GitHub, but give it a go first: tubesign.herokuapp.com

Viral & coverage

I posted this on Facebook and my friends gave it a go, with some hilarious images being created, and then it spread to Twitter, where more and more people found it. Then blogs, mainly London based ones, found it.

The first image used – I could not find the source for this, and so the use of this image was discontinued

Someone said that the original image was someone’s copyright, so I changed it to a CC-By-SA image by Flickr user Lrosa, which also meant that all images created were under the same licence.

The Creative Commons Attribution-ShareAlike image now used by the application. Image from Flickr, Lrosa, http://www.flickr.com/photos/lrosa/1138285047/

The main media outlets that covered it were: BBC America, ITV, The Londonist, The Atlantic Cities, The Guardian, The Next Web and the B3ta.com newsletter (very proud of that one).

After that peak of about 50 people visiting at any one time and 13,000 views in a day, the traffic is now in the hundreds, with the number of people visiting right now small enough to be counted on one hand.

Future

  • Live preview
  • Better font rendering – defocus
  • Add range of images for different places (Bombay signs, Leeds Metro signs etc)
  • Store images, allow voting, create gallery