WDPA-GBIF
GBIF
Challenge
GBIF wanted to integrate its data with the World Database on Protected Areas (WDPA) from UNEP-WCMC. GBIF handles massive amounts of primary data collected from many different providers.
The project had to find a way to intersect more than 130 million records from GBIF with 150,000 areas from WDPA.
Providing users with a good tool in a clean and easy way was the main goal. Additionally, intersecting all this data could become a big issue, requiring several days of complicated processing.
Solution
The project started by focusing only on Spain and Madagascar; once the solutions were proven, it was scaled to all countries and protected areas. We tried two different processing strategies: relational databases for the short term and a Map/Reduce strategy for the long term. A relational database such as PostGIS handled the work but was limited in scalability, taking up to 17 hours to process the data. Map/Reduce, with Hadoop, was able to do the job much more quickly by parallelizing the problem on Amazon EC2.
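To illustrate the Map/Reduce idea described above, here is a minimal Python sketch, not the project's actual Hadoop code. It assumes occurrence records are simple (id, longitude, latitude) tuples and protected areas are simple polygons; the sample data and the names `AREAS`, `RECORDS`, `map_record`, `reduce_pairs`, and `intersect` are all hypothetical. A local thread pool stands in for the cluster workers: the map step tests each record against the areas independently, and the reduce step groups the matches by area.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample data: protected areas as simple polygons of (lon, lat) vertices.
AREAS = {
    "area_a": [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)],
    "area_b": [(5.0, 5.0), (9.0, 5.0), (7.0, 9.0)],
}

# Hypothetical occurrence records: (record_id, lon, lat).
RECORDS = [
    (1, 1.0, 1.0),
    (2, 3.5, 2.0),
    (3, 7.0, 6.0),
    (4, 10.0, 10.0),
    (5, 2.0, 3.0),
]


def point_in_polygon(x, y, polygon):
    """Ray casting: count how many edges a ray going right from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def map_record(record):
    """Map step: emit one (area_id, record_id) pair per area containing the record."""
    record_id, lon, lat = record
    return [
        (area_id, record_id)
        for area_id, polygon in AREAS.items()
        if point_in_polygon(lon, lat, polygon)
    ]


def reduce_pairs(pairs):
    """Reduce step: group the emitted record ids by area."""
    grouped = defaultdict(list)
    for area_id, record_id in pairs:
        grouped[area_id].append(record_id)
    return {area_id: sorted(ids) for area_id, ids in grouped.items()}


def intersect(records, workers=2):
    """Run the map step in parallel, then reduce the results."""
    with ThreadPoolExecutor(max_workers=workers) as executor:
        mapped = list(executor.map(map_record, records))
    return reduce_pairs(pair for pairs in mapped for pair in pairs)


if __name__ == "__main__":
    print(intersect(RECORDS))
```

Because each record is tested independently, the map step can be split across as many workers as are available, which is the property that lets this kind of job scale out on a cluster. A real pipeline would also need a spatial index or bounding-box prefilter rather than testing every record against every area.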
The widget can be embedded in multiple websites interested only in a certain country, a particular area, etc. Users can then download the data if necessary.
Highlights & technologies
The project is a great demonstration of the joint projects that GBIF and WCMC can handle together. The protected areas provide context to the biodiversity data from GBIF, and vice versa.
A mix of technologies and APIs was used: PostGIS, Java, AMF, Hadoop, Flex, Google Maps, GeoServer, and GeoWebCache.