Lifemapper Species Point Data Preparation
Species Point data formatted for Lifemapper is a compressed data
package, named
The data file consists of a tab-delimited CSV file, with records
grouped by taxa using a field designated with the "groupby" role in the
metadata. Records MUST be grouped in the input data file. When Lifemapper
processes these data, groups of records are written to separate files
for Species Distribution Modeling and other analyses.
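A quick pre-flight check for the grouping requirement can be sketched in Python. This is a minimal sketch, not part of Lifemapper: it assumes "grouped" means that all records sharing a "groupby" value sit on adjacent lines, and the function name `find_ungrouped_keys` and the sample rows are hypothetical.

```python
import csv
import io


def find_ungrouped_keys(lines, groupby_idx, delimiter="\t"):
    """Return group keys that occur in more than one non-adjacent run.

    Assumption: "grouped" means all records with the same groupby value
    are on adjacent lines. `lines` is any iterable of text lines.
    """
    seen = set()
    previous = None
    ungrouped = []
    for row in csv.reader(lines, delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        key = row[groupby_idx]
        if key != previous:
            # A new run starts; if this key already had a run, it is split.
            if key in seen and key not in ungrouped:
                ungrouped.append(key)
            seen.add(key)
            previous = key
    return ungrouped


# Hypothetical data: column 1 is the groupby field, key "100" is split.
sample = io.StringIO("1\t100\n2\t100\n3\t200\n4\t100\n")
print(find_ungrouped_keys(sample, groupby_idx=1))  # ['100']
```

An empty list means every group is contiguous. If the data file has a header row, skip it before passing the lines in.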
The metadata file consists of a JSON-formatted object with one name/value
pair for each column. The name in each pair is either the column
index (zero-based) or the column name contained in the header row of the data
file. The value is a JSON object with the following name/value pairs:
- "name": Required. The name must be unique for each column and less than 10 characters long.
- "type": Required. The data type contained in the field. Accepted values: "int", "str", "float".
- "role": Required only for fields fulfilling one of the roles listed below.
The roles "groupby" and either "latitude" and "longitude" or "geopoint" are required. The role "taxaname" is useful if the "groupby" field is an identifier rather than a human-readable name.
- "groupby": Required. Indicates the field used to group records
- "longitude": Required if no "geopoint" role. The longitude/x value
- "latitude": Required if no "geopoint" role. The latitude/y value
- "geopoint": Required if no "latitude" and "longitude" roles. The field contains both x and y coordinates; CSV data will be in the format {"lat": -16.35, "lon": -67.616667}
- "uniqueid": Optional. The field contains a unique ID for each record. Values will be generated if missing.
- "taxaname": Optional. The field contains the taxa name for the recordset. If this role is not present, records will be named with the "groupby" field value. If the role is present and records in a group have different values, the value from the first record will be used as the dataset displayname.
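The rules above can be checked before submitting a package. The following Python function is a minimal sketch under stated assumptions: it takes the metadata already loaded into a dict, the name `check_metadata` is hypothetical, and it checks only the rules stated in this document (required "name" and "type" keys, accepted types, and required roles).

```python
ACCEPTED_TYPES = {"int", "str", "float"}


def check_metadata(metadata):
    """Return a list of human-readable problems found in a metadata dict."""
    problems = []
    roles = set()
    for column, spec in metadata.items():
        # "name" and "type" are required for every column.
        for key in ("name", "type"):
            if key not in spec:
                problems.append(f"column {column}: missing '{key}'")
        field_type = spec.get("type")
        if field_type is not None and field_type not in ACCEPTED_TYPES:
            problems.append(f"column {column}: unknown type '{field_type}'")
        if "role" in spec:
            roles.add(spec["role"])
    # "groupby" plus either latitude/longitude or geopoint must be present.
    if "groupby" not in roles:
        problems.append("no column has the 'groupby' role")
    has_lat_lon = {"latitude", "longitude"} <= roles
    if not has_lat_lon and "geopoint" not in roles:
        problems.append("need 'latitude' and 'longitude' roles, or 'geopoint'")
    return problems


# Hypothetical three-column metadata that satisfies the role requirements.
example = {
    "0": {"name": "taxonkey", "type": "int", "role": "groupby"},
    "1": {"name": "dec_lat", "type": "float", "role": "latitude"},
    "2": {"name": "dec_long", "type": "float", "role": "longitude"},
}
print(check_metadata(example))  # []
```

An empty list means no problems were found by these checks; it does not guarantee that Lifemapper will accept the file.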
Sample JSON metadata file contents:
{"0": {"name": "gbifid", "type": "int", "role": "uniqueid"},
"1": {"name": "occurid", "type": "int"},
"2": {"name": "taxonkey", "type": "int", "role": "groupby"},
"3": {"name": "datasetkey", "type": "str"},
"4": {"name": "puborgkey", "type": "str"},
"5": {"name": "basisofrec", "type": "str"},
"6": {"name": "kingdomkey", "type": "int"},
"7": {"name": "phylumkey", "type": "int"},
"8": {"name": "classkey", "type": "int"},
"9": {"name": "orderkey", "type": "int"},
"10": {"name": "familykey", "type": "int"},
"11": {"name": "genuskey", "type": "int"},
"12": {"name": "specieskey", "type": "int"},
"13": {"name": "sciname", "type": "str", "role": "taxaname"},
"14": {"name": "dec_lat", "type": "float", "role": "latitude"},
"15": {"name": "dec_long", "type": "float", "role": "longitude"},
"16": {"name": "day", "type": "int"},
"17": {"name": "month", "type": "int"},
"18": {"name": "year", "type": "int"},
"19": {"name": "rec_by", "type": "str"},
"20": {"name": "inst_code", "type": "str"},
"21": {"name": "coll_code", "type": "str"},
"22": {"name": "catnum", "type": "str"}
}
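To illustrate how index-keyed metadata like the sample above maps onto a data row, here is a minimal Python sketch. It is not Lifemapper's own parser: the function name `parse_row` is hypothetical, the metadata is trimmed to three columns for brevity, and treating empty strings as missing values is an assumption.

```python
import json

# Map the accepted "type" values to Python conversion functions.
CASTS = {"int": int, "float": float, "str": str}


def parse_row(row, metadata):
    """Convert one row of strings into a dict keyed by the metadata "name".

    Assumes metadata keys are zero-based column indices stored as strings,
    and that an empty string means the value is missing.
    """
    record = {}
    for column, spec in metadata.items():
        raw = row[int(column)]
        cast = CASTS[spec["type"]]
        record[spec["name"]] = cast(raw) if raw != "" else None
    return record


# Trimmed, hypothetical metadata in the same shape as the sample file.
metadata = json.loads("""
{"0": {"name": "gbifid", "type": "int", "role": "uniqueid"},
 "1": {"name": "dec_lat", "type": "float", "role": "latitude"},
 "2": {"name": "sciname", "type": "str", "role": "taxaname"}}
""")

row = "42\t-16.35\t".split("\t")  # one tab-delimited line, last field empty
print(parse_row(row, metadata))
# {'gbifid': 42, 'dec_lat': -16.35, 'sciname': None}
```

Metadata keyed by header names instead of indices would need a lookup from header name to column position first.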