ASCII file
Habitat data is input to the Toolkit in the form of an .asc or ASCII file produced by GIS software such as ArcGIS.
(Note: Input data cannot be in a 'geographic' (unprojected) coordinate system, because such a system uses units such as decimal degrees rather than meters.)
This format consists of an array of data values preceded by a header containing a set of keywords:
<NCOLS xxx>
<NROWS xxx>
<XLLCENTER xxx | XLLCORNER xxx>
<YLLCENTER xxx | YLLCORNER xxx>
<CELLSIZE xxx>
{NODATA_VALUE xxx}
row 1
row 2
.
.
.
row n
For example,
ncols 725
nrows 1229
xllcorner -2364022.7551035
yllcorner 1232329.6699838
cellsize 1000
NODATA_value -9999
-9999 -9999 -9999 -9999 -9999
.....
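As an illustration of the layout above, the following minimal Python sketch reads the header keywords and data rows from a small .asc-style sample. The function name read_asc and the sample values are hypothetical, it assumes one grid row per text line, and it is not the Toolkit's own reader.

```python
import io

def read_asc(stream):
    """Read an ASCII grid: return (header dict, list of data rows)."""
    header = {}
    rows = []
    for line in stream:
        parts = line.split()
        if not parts:
            continue
        # Header lines are "keyword value"; keywords begin with a letter,
        # whereas data rows begin with a digit, sign, or decimal point.
        if parts[0][0].isalpha():
            header[parts[0].lower()] = float(parts[1])
        else:
            rows.append([float(v) for v in parts])
    return header, rows

# Hypothetical 3 x 2 grid using the same header keywords as the example.
sample = """ncols 3
nrows 2
xllcorner -2364022.7551035
yllcorner 1232329.6699838
cellsize 1000
NODATA_value -9999
-9999 0.2 0.5
0.1 -9999 0.9
"""

header, rows = read_asc(io.StringIO(sample))
# Sanity check: the data array should match the declared dimensions.
assert len(rows) == int(header["nrows"])
assert all(len(r) == int(header["ncols"]) for r in rows)
```

Keywords are lower-cased on input so that 'NODATA_value' and 'nodata_value' are treated alike.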
This format can be produced by a variety of software. Here we briefly describe the methods available within ESRI's ArcGIS. From the ArcGIS Desktop interface (ArcMap or ArcToolbox), you can use the 'Raster to ASCII' function in the From Raster toolset to convert a raster dataset such as a grid to an .asc file. Alternatively, from the Arc command prompt of ArcGIS Workstation, you can use the gridascii command:
gridascii <input grid> <output .asc>
The resolution of the grid and thus of the .asc file should be appropriate to the resolution at which you plan to run the graph analysis (see below). You should typically retain enough resolution (if present in the original data) to have many (e.g., 100+) pixels within each hexagon, but excessively high resolution may slow hexmap generation somewhat.
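The rule of thumb above can be checked with simple arithmetic. The sketch below assumes that the hexagon area is known in squared map units (e.g., square meters); the function name and the example hexagon area are hypothetical and chosen only for illustration.

```python
def pixels_per_hexagon(hex_area, cellsize):
    """Approximate number of grid cells that fall within one hexagon.

    hex_area is in squared map units (e.g., m^2) and cellsize is in
    map units (e.g., m), as given by the CELLSIZE header keyword.
    """
    return hex_area / (cellsize ** 2)

# Hypothetical example: 50 km^2 hexagons.
coarse = pixels_per_hexagon(50e6, 1000)  # 1000 m cells
fine = pixels_per_hexagon(50e6, 100)     # 100 m cells
```

In this hypothetical case the 1000 m grid falls short of the suggested 100+ pixels per hexagon, while the 100 m grid exceeds it comfortably.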
Treatment of background or NODATA values
ASCII files such as those produced by ArcGIS typically represent areas without data (background, ocean, etc.) with a value such as -9999. These areas can be ignored (excluded) during creation of the graph file as described below.
Treatment of zero values
A more difficult question arises when some areas for which data are available are considered 'non-habitat' and assigned a value of zero in the ASCII file. In the context of connectivity analysis, an area with a conductance or capacity value of zero represents a complete barrier to dispersal. Thus, with the methods described here, no analysis of connectivity across such barriers can occur. If it is known that such habitat really is a complete barrier to dispersal, and other options (potential linkage zones) for linking the areas of interest are available, then assigning non-habitat a zero value may be appropriate. However, it is often more informative to assign such areas a low but non-zero value so that they do not represent a complete barrier to dispersal. If habitat values are derived from a probabilistic model such as Maxent, then it may be useful to round up values so that the lowest probability value is distinguishable from zero in the input ASCII file. Unlike areas with value -9999, areas with zero value are NOT excluded from graph creation by default.
If too much of the landscape is composed of non-habitat that has been assigned a value of zero, the centrality analysis may fail because no feasible connections exist between the areas of interest.
In min-cost-max-flow analyses, cost values of zero may also present problems. Therefore, even when using a uniform cost layer, cost should be set to a positive non-zero value.
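One way to apply the 'low but non-zero' approach before writing the ASCII file is to raise every valid cell to a small floor value while leaving NODATA cells unchanged. The sketch below is an assumption-laden illustration: the function name and the floor of 0.001 are hypothetical, and an appropriate floor depends on the range and precision of your own data. The same idea could be applied to a cost layer.

```python
def apply_floor(rows, nodata=-9999, floor=0.001):
    """Raise zero or near-zero values to a small positive floor.

    Cells equal to the NODATA value are left unchanged so that they
    can still be excluded during graph creation.
    """
    return [
        [v if v == nodata else max(v, floor) for v in row]
        for row in rows
    ]

# Hypothetical habitat values: a true zero, a value that would round
# to zero at three decimal places, a normal value, and a NODATA cell.
grid = [[0, 0.0004, 0.5, -9999]]
floored = apply_floor(grid)
```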
Source and target input files for subset centrality
As described below, subset centrality requires input of text files identifying the subset of nodes (hexes) forming the source and target areas. These files have the following format:
For Python-NetworkX-based functions (shortest-path and current flow betweenness centrality subset), the required format is a single column of node ids per file:
1
2
3
...
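Files in this single-column format can be written with a few lines of Python. The sketch below assumes integer node ids; the file names sources.txt and targets.txt and the ids themselves are hypothetical.

```python
import os
import tempfile

def write_node_ids(path, node_ids):
    """Write one node id per line."""
    with open(path, "w") as f:
        for node_id in node_ids:
            f.write(f"{node_id}\n")

def read_node_ids(path):
    """Read a single-column node id file back into a list of ints."""
    with open(path) as f:
        return [int(line) for line in f if line.strip()]

# Hypothetical file names and hex ids, written to a temporary folder.
workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "sources.txt")
target_path = os.path.join(workdir, "targets.txt")
write_node_ids(source_path, [1, 2, 3])
write_node_ids(target_path, [5, 6, 7])
```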
Network flow methods (e.g., min-cost-max-flow centrality) can also accept such input as a two-line file, with a space-separated row of source ids on the first line and target ids on the second line:
1 2 3 4
5 6 7 8
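A two-line file of this kind could be assembled from lists of source and target ids as follows. This is a minimal sketch; the function name is hypothetical and the ids are those of the example above.

```python
def format_two_line(sources, targets):
    """Return two-line text: source ids on line 1, target ids on line 2."""
    line1 = " ".join(str(n) for n in sources)
    line2 = " ".join(str(n) for n in targets)
    return f"{line1}\n{line2}\n"

text = format_two_line([1, 2, 3, 4], [5, 6, 7, 8])
```

The resulting string can then be written to a text file with an ordinary file write.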
While Python-based functions can analyze connections between a group of source nodes (A) and a group of target nodes (B), they cannot limit this analysis to specific node pairs. Instead, they consider all possible pairs that combine a node in A with a node in B.
Network flow functions, however, can analyze a list of specific node pairs, entered as a file of more than two lines in which each line specifies the space-separated ids of a source/target pair:
1 5
2 6
3 7
4 8
...
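The distinction between the two input conventions can be sketched in Python. The code below only illustrates the file layouts described in this section (two lines for groups, more than two lines for explicit pairs); the function names are hypothetical and this is not the Toolkit's own parser.

```python
from itertools import product

def parse_flow_input(text):
    """Classify network-flow input text as source/target groups or pairs."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        raise ValueError("expected at least two non-empty lines")
    if len(lines) == 2:
        # Two-line form: all source ids, then all target ids.
        sources = [int(v) for v in lines[0]]
        targets = [int(v) for v in lines[1]]
        return "groups", (sources, targets)
    # More than two lines: each line is one explicit source/target pair.
    return "pairs", [(int(s), int(t)) for s, t in lines]

def all_pairs(sources, targets):
    """All combinations of a node in A with a node in B."""
    return list(product(sources, targets))

kind_a, groups = parse_flow_input("1 2 3 4\n5 6 7 8\n")
kind_b, pairs = parse_flow_input("1 5\n2 6\n3 7\n4 8\n")
combos = all_pairs(*groups)
```

With four sources and four targets, the group form implies sixteen combinations, whereas the pair list above names only four of them.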