Notice: This app will soon be replaced by a new version, currently available here for testing.

Infomap Bioregions: Interactive mapping of biogeographical regions from species distributions
Daniel Edler, Thaís Guedes, Alexander Zizka, Martin Rosvall, and Alexandre Antonelli

Biogeographical regions reveal how species are spatially grouped and are therefore important units for conservation, historical biogeography, ecology and evolution. Several methods have been developed to identify bioregions based on species distribution data rather than expert opinion. One approach successfully applies network theory to simplify and highlight the underlying structure in species distribution data. However, there are no tools that make this methodology simple and efficient to use. Here we present Infomap Bioregions, an interactive web application that inputs species distribution data and generates bioregion maps. Species distributions may be provided as georeferenced point occurrences or range maps, and can be of local, regional or global scale. The application uses a novel adaptive resolution method to make best use of often incomplete species distribution data. The results can be downloaded as vector graphics, shapefiles or in table format. We validate the tool by processing large datasets of publicly available species distribution data of the world's amphibians using species ranges, and mammals using point occurrences. Potential applications include ancestral range reconstructions in historical biogeography and identification of indicator species for targeted conservation.
FIG. 1 Bioregion map of the world's amphibians generated with Infomap Bioregions, using the IUCN species range maps. White areas have insufficient data and were excluded from the analysis. The inset shows a zoom-in of Central America, the West Indies and northwestern South America, depicting many small bioregions that reflect the high species turnover and narrow range distributions characteristic of the region.

How to cite

The original implementation is described in:
Edler, D., Guedes, T., Zizka, A., Rosvall, M., Antonelli, A. (2017) Infomap Bioregions: Interactive mapping of biogeographical regions from species distributions. Systematic Biology 66(2):197–204, doi: 10.1093/sysbio/syw087

It is based on the method described in:
Vilhena, D., Antonelli, A. (2015) A network approach for identifying and delimiting biogeographical regions. Nature Communications 6, 6848

The original clustering algorithm is described in:
Rosvall, M., Bergstrom, C. (2008) Maps of information flow reveal community structure in complex networks. PNAS 105, 1118

If you are using Infomap Bioregions in one of your research articles or otherwise want to refer to it, please cite the relevant publication above or use the following format:

D. Edler, T. Guedes, A. Zizka, M. Rosvall, and A. Antonelli, Infomap Bioregions, available online at https://bioregions.mapequation.org.

Feedback

If you have any questions, suggestions or issues regarding the software, please add them to GitHub issues.

Step-by-step explanation
From species distributions to bioregions

Schematics - Input to output
Input quality?

Tip: sampbias is an R package for cleaning erroneous geographical coordinates and assessing sampling biases in biological collection databases.

1
Input
Load species distribution data

As species distribution input, Infomap Bioregions supports both point occurrence data and species range maps.

Point occurrences

Point occurrences are specified in a text file with either comma-separated (CSV) or tab-separated (TSV) values. The application requires a header row with the column names, and the user must identify which columns should be parsed as name, latitude and longitude, respectively.

Example input format:

Name, Lat, Long
Sp1, -5.61, -53.43
Sp1, -8.22, -48.90
Sp2, -8.17, -50.06
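
To illustrate the format, the following is a minimal Python sketch of how such a file could be read once the three columns have been identified. It is not the application's own parser. The function name read_occurrences and the choice to skip rows with missing or out-of-range coordinates are assumptions made for this example.

```python
import csv
import io

def read_occurrences(text, name_col, lat_col, lon_col, delimiter=","):
    """Parse point occurrences into a list of (name, lat, lon) tuples."""
    # skipinitialspace lets "Name, Lat, Long" parse to clean column names.
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter,
                            skipinitialspace=True)
    records = []
    for row in reader:
        try:
            name = row[name_col].strip()
            lat = float(row[lat_col])
            lon = float(row[lon_col])
        except (KeyError, TypeError, ValueError, AttributeError):
            continue  # assumption: silently skip rows that cannot be parsed
        if name and -90 <= lat <= 90 and -180 <= lon <= 180:
            records.append((name, lat, lon))
    return records

sample = "Name, Lat, Long\nSp1, -5.61, -53.43\nSp1, -8.22, -48.90\nSp2, -8.17, -50.06\n"
records = read_occurrences(sample, "Name", "Lat", "Long")
```

For a tab-separated file, the same sketch would be called with delimiter="\t".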
          

Range maps

Range maps are specified in the shapefile format, which includes multiple files: a .shp file for species range polygons, a .dbf file for the attributes of each range polygon and, optionally, a .prj file for projection information. As for point occurrence data, the user must identify which attribute to parse as the name of the species.

Phylogenetic tree

Infomap Bioregions also supports loading a phylogenetic tree to show a phylogram and how it maps to the bioregions, provided that the names in the tree match the names in the species distribution data. It supports the NEXUS and Newick tree formats.

Example tree format:

(A:0.1,B:0.2,(C:0.3,D:0.4):0.5);

The above example represents this tree:

Newick tree example
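
To make the structure of the example string explicit, here is a small recursive parser sketched in Python. It is an illustration only, not the parser used by the application, and it handles just the plain subset of Newick shown above (no quoted labels or comments). The names parse_newick and leaf_names are hypothetical.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested dicts (name, length, children)."""
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("Newick string must end with ';'")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # skip '('
            children.append(parse_node())
            while text[pos] == ",":
                pos += 1  # skip ','
                children.append(parse_node())
            if text[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # skip ')'
        # Optional node label, read up to the next structural character.
        start = pos
        while text[pos] not in ",():;":
            pos += 1
        name = text[start:pos].strip()
        # Optional branch length after ':'.
        length = None
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",():;":
                pos += 1
            length = float(text[start:pos])
        return {"name": name, "length": length, "children": children}

    root = parse_node()
    if text[pos] != ";":
        raise ValueError(f"unexpected character at position {pos}")
    return root

def leaf_names(node):
    """Collect leaf names from left to right."""
    if not node["children"]:
        return [node["name"]]
    return [name for child in node["children"] for name in leaf_names(child)]

tree = parse_newick("(A:0.1,B:0.2,(C:0.3,D:0.4):0.5);")
leaves = leaf_names(tree)
```

In this sketch the root has three children: the leaves A and B, and an unnamed internal node with branch length 0.5 that holds C and D. The leaf names are what would need to match the species names in the distribution data.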

2
Adaptive resolution
Discretize geographical space

Schematics - Quadtree mapping

Infomap Bioregions bins the species records into square grid cells. To allow for adaptive spatial resolution, each grid cell can be recursively subdivided into four cells. The adaptive binning generates a so-called quadtree with subdivided grid cells that satisfy the following user-specified criteria, with decreasing priority from 1 to 3:

1. Given in degrees, no grid cell is larger than the specified max cell size or smaller than the specified min cell size.
2. Given as a natural number, no grid cell contains fewer records than the specified min cell capacity.
3. Given as a natural number, no grid cell contains more records than the specified max cell capacity.

For point occurrence data, these criteria make the adaptive binning straightforward. For range maps, the application first adds a species record to each cell of minimum size that intersects with the corresponding species range polygon, and then proceeds with the adaptive binning to satisfy the user-specified criteria.
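
The three criteria can be sketched for point occurrences as follows. This Python example is one possible reading of the priorities, not the application's actual implementation: it assumes that a split is abandoned when any non-empty child cell would fall below the min cell capacity, and that cells still below the min cell capacity are dropped. The function name adaptive_bins and the (name, lat, lon) record layout are also assumptions. Range maps are not covered here.

```python
import math

def adaptive_bins(records, max_size, min_size, min_capacity, max_capacity):
    """Bin (name, lat, lon) records into square cells keyed by (x0, y0, size)."""
    # Criterion 1: start from a regular grid of max-size cells.
    top = {}
    for rec in records:
        _, lat, lon = rec
        x0 = math.floor(lon / max_size) * max_size
        y0 = math.floor(lat / max_size) * max_size
        top.setdefault((x0, y0), []).append(rec)

    leaves = {}

    def visit(x0, y0, size, recs):
        half = size / 2
        # Criterion 3, limited by criterion 1: split crowded cells only
        # while the children stay at or above the minimum size.
        if len(recs) > max_capacity and half >= min_size:
            quads = {}
            for rec in recs:
                _, lat, lon = rec
                qx = x0 + half if lon >= x0 + half else x0
                qy = y0 + half if lat >= y0 + half else y0
                quads.setdefault((qx, qy), []).append(rec)
            # Criterion 2 outranks criterion 3 (assumed reading): abandon the
            # split if any non-empty child would hold too few records.
            if all(len(q) >= min_capacity for q in quads.values()):
                for (qx, qy), q in quads.items():
                    visit(qx, qy, half, q)
                return
        if len(recs) >= min_capacity:  # assumed: sparse cells are dropped
            leaves[(x0, y0, size)] = recs

    for (x0, y0), recs in top.items():
        visit(x0, y0, max_size, recs)
    return leaves

records = [
    ("Sp1", 0.2, 0.2), ("Sp1", 0.6, 0.4), ("Sp2", 0.3, 0.7),
    ("Sp2", 2.5, 2.5), ("Sp3", 2.2, 3.1), ("Sp3", 3.5, 3.5),
    ("Sp4", 10.5, 10.5),
]
cells = adaptive_bins(records, max_size=4.0, min_size=1.0,
                      min_capacity=2, max_capacity=3)
```

With these example settings, the crowded 4-degree cell is split once into two 2-degree cells holding three records each, while the isolated Sp4 record falls below the min cell capacity and is left out.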

Quadtree

A quadtree is a tree data structure in which each internal node has exactly four children. Quadtrees are most often used to partition a two-dimensional space by recursively subdividing it into four quadrants or regions.

3
Bipartite occurrence network
Connect grid cells through common species

Schematics - Network mapping

Infomap Bioregions generates a bipartite network of species and grid cells. Each species is connected by an unweighted link to each grid cell in which it is present. The network is clustered with the Infomap clustering algorithm for bipartite networks. In that process, the geographical grid cells are merged into different clusters which become the resulting bioregions.

Bipartite network

A bipartite network is a network with two types of nodes, in which every link connects a node of one type to a node of the other type.
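
As a rough illustration of this step, the Python sketch below turns binned records into unweighted species-to-cell links. The input layout (a dict from a cell identifier to the species names recorded in it) and the function names are assumptions for this example and do not reflect the application's internal data structures.

```python
def bipartite_links(cell_species):
    """Return sorted unweighted (species, cell) links from {cell: [species, ...]}."""
    links = set()
    for cell, names in cell_species.items():
        for name in names:
            # A set keeps one link per species-cell pair, so repeated
            # records of a species in the same cell do not add weight.
            links.add((name, cell))
    return sorted(links)

def cells_by_species(links):
    """Map each species to the grid cells it links together."""
    grouped = {}
    for name, cell in links:
        grouped.setdefault(name, []).append(cell)
    return grouped

cell_species = {"c1": ["Sp1", "Sp1", "Sp2"], "c2": ["Sp2", "Sp3"]}
links = bipartite_links(cell_species)
shared = cells_by_species(links)
```

Here Sp2 is the only species present in both cells, so it is the only node that connects c1 and c2 in the network.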

4
Clustering
Merge grid cells with similar species distribution

Schematics - Network clustering

The bipartite network of species and geographical grid cells is clustered with the Infomap clustering algorithm to find an optimal partition of the geographical space with respect to the map equation.
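
To give a feel for what "optimal with respect to the map equation" means, the sketch below evaluates the two-level map equation (the description length of a random walk, in bits) for a given partition of a small undirected, unweighted network. It only scores partitions and does not search for the best one, as Infomap does. The function names and the toy species and cell labels are assumptions for this example, and the bipartite-specific options of the Infomap software are not modelled.

```python
import math
from collections import defaultdict

def plogp(x):
    """Return x * log2(x), with the convention 0 * log2(0) = 0."""
    return x * math.log2(x) if x > 0 else 0.0

def codelength(links, module_of):
    """Two-level map equation for an undirected, unweighted network."""
    total = 2 * len(links)            # each link is walked in both directions
    visit = defaultdict(float)        # node visit rate = degree / (2 * links)
    exit_rate = defaultdict(float)    # per-module rate of leaving the module
    for a, b in links:
        visit[a] += 1 / total
        visit[b] += 1 / total
        if module_of[a] != module_of[b]:
            exit_rate[module_of[a]] += 1 / total
            exit_rate[module_of[b]] += 1 / total
    module_visit = defaultdict(float)
    for node, p in visit.items():
        module_visit[module_of[node]] += p
    q = sum(exit_rate.values())
    return (plogp(q)
            - 2 * sum(plogp(qi) for qi in exit_rate.values())
            - sum(plogp(p) for p in visit.values())
            + sum(plogp(exit_rate.get(m, 0.0) + pm)
                  for m, pm in module_visit.items()))

# Toy network: species A, B in cells c1, c2; species C, D in cells c3, c4;
# species E bridges c2 and c3.
links = [("A", "c1"), ("A", "c2"), ("B", "c1"), ("B", "c2"),
         ("C", "c3"), ("C", "c4"), ("D", "c3"), ("D", "c4"),
         ("E", "c2"), ("E", "c3")]
nodes = {n for link in links for n in link}
one_region = codelength(links, {n: 0 for n in nodes})
two_regions = codelength(links, {"A": 0, "B": 0, "E": 0, "c1": 0, "c2": 0,
                                 "C": 1, "D": 1, "c3": 1, "c4": 1})
```

In this toy case the two-module partition gives a shorter description than the single module, so under the map equation it is the better division of the grid cells into regions.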

5
Bioregions
Map clusters to geographical space

Schematics - Bioregions

Each cluster of grid cells is given a unique color and forms a bioregion.

Bioregion

Bioregions, or biogeographical regions, reveal how species are spatially grouped and are therefore important units for conservation, historical biogeography, ecology and evolution.

6
Output
Export bioregions map and statistics

Schematics - Bioregions

The bioregions can be exported both as a visual map (.png or .svg) and as structured data for further analysis (in the GeoJSON or shapefile format).
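
For readers who want to process such an export further, the sketch below shows how square grid cells with a bioregion label could be written as a GeoJSON FeatureCollection using only the Python standard library. The property name "bioregion", the input layout and the function name are assumptions for this example. The schema of the files actually exported by the application may differ.

```python
import json

def cells_to_geojson(cells):
    """Convert (x0, y0, size, bioregion) cells to a GeoJSON FeatureCollection."""
    features = []
    for x0, y0, size, region in cells:
        x1, y1 = x0 + size, y0 + size
        # GeoJSON positions are [longitude, latitude]; the exterior ring is
        # closed and listed counter-clockwise.
        ring = [[x0, y0], [x1, y0], [x1, y1], [x0, y1], [x0, y0]]
        features.append({
            "type": "Feature",
            "properties": {"bioregion": region},  # hypothetical property name
            "geometry": {"type": "Polygon", "coordinates": [ring]},
        })
    return {"type": "FeatureCollection", "features": features}

collection = cells_to_geojson([(0.0, 0.0, 2.0, 1), (2.0, 2.0, 2.0, 2)])
geojson_text = json.dumps(collection, indent=2)
```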

The application also identifies the most common and the most indicative species in each grid cell and bioregion, and shows the results as an interactive map together with supporting tables with information about the bioregions. These tables can be exported in the .csv format.
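
The idea of common versus indicative species can be sketched as follows. The scoring used here is an assumption made for illustration: a species' share of the records within a bioregion, divided by its share of the records in the whole dataset. It is not confirmed to be the formula the application uses, and the function name and record layout are likewise hypothetical.

```python
from collections import Counter

def species_summary(records):
    """Summarize a list of (species, bioregion) records per bioregion."""
    records = list(records)
    overall = Counter(name for name, _ in records)
    n_all = sum(overall.values())
    per_region = {}
    for name, region in records:
        per_region.setdefault(region, Counter())[name] += 1

    summary = {}
    for region, counts in per_region.items():
        n_region = sum(counts.values())
        # Assumed score: regional share of records over global share.
        score = {name: (c / n_region) / (overall[name] / n_all)
                 for name, c in counts.items()}
        summary[region] = {
            "most_common": counts.most_common(1)[0][0],
            "most_indicative": max(score, key=score.get),
        }
    return summary

records = ([("Sp1", 1)] * 3 + [("Sp2", 1)]
           + [("Sp1", 2)] * 3 + [("Sp3", 2)] * 2)
summary = species_summary(records)
```

In this example Sp1 is the most common species in both bioregions, but because it occurs everywhere it is not indicative of either. Sp2 and Sp3, which occur in only one bioregion each, get the highest scores there.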