Using iNaturalist data for research
Things to be aware of
Identifications
Certainty of ID.
iNaturalist does not have a reputation system, so it is impossible to know how reliable a "research grade" identification is. Basically, if someone proposes an ID and someone else agrees, the observation becomes research grade. But agreeing IDs can come about for many reasons, despite the guidelines (https://www.inaturalist.org/pages/help#identification). For instance:
- people may agree with someone they know (or trust), often just to support a friend;
- to get their own (or a friend's or associate's) observation to research grade;
- to stop getting messages about identifications on observations they have contributed to;
- to raise their profile in the number of identifications posted.
If you are interested in the quality of observations, here are some fields (available in downloads) that you can use:
- num_identification_agreements
- num_identification_disagreements
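As a minimal sketch of how these two fields might be used, the following Python keeps only the rows of a downloaded CSV that have a minimum number of agreeing identifications and no disagreements. Only the two column names come from the list above; the function name, the thresholds and the sample rows are illustrative assumptions, and a real download should be checked for how it represents empty values.

```python
import csv
import io

def well_supported(rows, min_agreements=2, max_disagreements=0):
    """Keep rows whose identification counts meet the chosen thresholds."""
    kept = []
    for row in rows:
        # Empty cells are treated as zero (an assumption about the export).
        agree = int(row.get("num_identification_agreements") or 0)
        disagree = int(row.get("num_identification_disagreements") or 0)
        if agree >= min_agreements and disagree <= max_disagreements:
            kept.append(row)
    return kept

# Made-up sample standing in for a real download.
SAMPLE_CSV = """id,num_identification_agreements,num_identification_disagreements
101,3,0
102,1,0
103,4,2
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
kept_ids = [row["id"] for row in well_supported(rows)]
print(kept_ids)
```

The thresholds are a judgment call: more agreements do not by themselves make an identification correct, for the reasons listed above.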
Another option is to see whether any experts or acclaimed enthusiasts, if you know of any, have contributed to the ID. You can do this by adding the &ident_user_id= parameter to your filter, for example:
https://www.inaturalist.org/observations?place_id=113055&subview=grid&taxon_id=563588&ident_user_id=383144
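If you build such filters often, a small helper can assemble the URL for you. This is only a sketch: the helper name observations_url is hypothetical, and the parameter values are simply those from the example URL above.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.inaturalist.org/observations"

def observations_url(**filters):
    """Return an observations URL with the given filters as a query string."""
    # Drop filters left as None so they do not appear in the URL.
    params = {key: value for key, value in filters.items() if value is not None}
    return f"{BASE_URL}?{urlencode(params)}"

url = observations_url(
    place_id=113055,
    subview="grid",
    taxon_id=563588,
    ident_user_id=383144,
)
print(url)
```

Changing ident_user_id to another identifier's user number gives the same filter for that person.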
By the same token, you may want to use observations that are not research grade but that have been identified by certain users (experts, enthusiasts). [Although, if there are not too many, why not just agree with these identifications and make the observations research grade?]
If you don't know who the experts are, look at the Identifiers tab on a filter for the group. But beware that regular observers may be high up the list even if they don't know the group, and that some identifiers may be interested in a group but not particularly competent.
Alternatively, if you are knowledgeable in a group and there are not too many observations, it is worthwhile using the curation tool to check the identifications before using the data (e.g. https://www.inaturalist.org/observations/identify?quality_grade=needs_id%2Cresearch%2Ccasual&verifiable=true&taxon_id=563588&place_id=any).
If you intend to use the data regularly, it is also worthwhile adding to the Data Quality Assessment (DQA) at the end of each observation (or on the last tab of the curation tool).
If you have special data needs, you can always add an Observation Field and annotate the observations, and then include these fields with any downloads you make.
Location
Obscuration
Obscuration on iNaturalist is necessary, but it is the bane of research. Obtaining obscured data is virtually impossible. (Note that private data are useless for most research, as even the country is not accessible.) You can find obscured observations with the following URL parameters:
- user obscured: &geoprivacy=obscured
- e.g. https://www.inaturalist.org/observations?geoprivacy=obscured&taxon_id=132845
- taxon obscured: &taxon_geoprivacy=obscured
- e.g. https://www.inaturalist.org/observations?taxon_geoprivacy=obscured&taxon_id=132845
Note that if you require a locality resolution finer than a 30 km radius, the coordinates provided in any download are meaningless wherever the field coordinates_obscured is True. Make sure that you download the field "coordinates_obscured" and exclude any such records if you need accurate localities.
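A minimal sketch of that exclusion step in Python follows. The column name coordinates_obscured is the field named above; the function name, the sample rows and the exact spelling of the flag in a real export ("true" or "True") are assumptions, so the comparison ignores case. The function refuses to run if the field was not downloaded, because silently keeping every row would defeat the purpose.

```python
import csv
import io

def drop_obscured(rows):
    """Return only the rows that are not flagged as having obscured coordinates."""
    kept = []
    for row in rows:
        if "coordinates_obscured" not in row:
            # Without the flag there is no way to tell which coordinates are real.
            raise KeyError("download lacks the coordinates_obscured field")
        flag = (row["coordinates_obscured"] or "").strip().lower()
        if flag != "true":
            kept.append(row)
    return kept

# Made-up sample standing in for a real download.
SAMPLE_CSV = """id,latitude,longitude,coordinates_obscured
1,-33.9,18.4,false
2,-34.1,18.5,true
3,-33.8,18.6,True
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
accurate_ids = [row["id"] for row in drop_obscured(rows)]
print(accurate_ids)
```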
The best way to obtain taxon-obscured data (and all IUCN Red List species are obscured by default) is to request the data from your community administrator (or from iNaturalist in California if you are not part of an iNaturalist Community). Note that this will not include any observations additionally obscured by users.
Obtaining user-obscured data is nigh impossible. The problem is simply the volume of users who need to be contacted, and the number of dead, inactive or unresponsive users. These data are effectively lost forever. (Users can manage their obscuration rights at https://www.inaturalist.org/relationships, but they cannot add new people there.)
For users who can still be reached, there are two ways to access user-obscured data:
- ask the user to trust you. There are many ways of doing this, but the best is a message to the user explaining why you need access to their obscured localities.
- create a project and ask users to join it and to trust you, allowing you to add your project to their observations and to see the coordinates. It is useless if users only trust you when they add the observation to the project themselves, because the amount of chasing up required is impossible: they need to allow anyone to add the project to the observation. This is the most efficient route, and the only option if you want the data to be useful in the long term (just make sure that curation of the project is passed on to the next generation of researchers).
Location Accuracy
(An unfortunate term, as higher values are less accurate; think of it as Location Error or Location Uncertainty. On iNaturalist it is measured as the radius (in m) of the circle within which the true location is likely to lie.)
Some useful filters:
- no positional accuracy recorded: https://www.inaturalist.org/observations?acc=false
- above a threshold: https://www.inaturalist.org/observations?acc_above=10000
- below a threshold: https://www.inaturalist.org/observations?acc_below=3
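The same kind of threshold can be applied to a file you have already downloaded. The sketch below assumes the download includes a column called positional_accuracy holding the radius in metres; that column name, the function name and the sample rows are assumptions to check against your own export. Rows with no recorded accuracy are dropped unless you ask to keep them, since their uncertainty is unknown.

```python
import csv
import io

def within_uncertainty(rows, max_metres, keep_unknown=False):
    """Keep rows whose recorded uncertainty radius is at most max_metres."""
    kept = []
    for row in rows:
        raw = (row.get("positional_accuracy") or "").strip()
        if not raw:
            # No accuracy recorded: keep only if the caller opts in.
            if keep_unknown:
                kept.append(row)
            continue
        if float(raw) <= max_metres:
            kept.append(row)
    return kept

# Made-up sample standing in for a real download.
SAMPLE_CSV = """id,positional_accuracy
1,5
2,2500
3,
4,2000
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
precise_ids = [row["id"] for row in within_uncertainty(rows, max_metres=2000)]
print(precise_ids)
```

Whether to keep records with no recorded accuracy depends on the analysis; excluding them is the cautious default used here.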
There are two issues here:
- What resolution of location do you require?
If your work requires resolution to m or km accuracy, then add a filter to exclude values of less certainty. For instance, if you are modelling distributions using a climate model at minute scale, which is about 2000 m in South Africa, discard coarser data.
- Are you working with smaller nature reserves?
The place filters exclude data that are too coarse. Conceptually, one does not want a locality to be considered inside a reserve if the probability that it is outside the reserve exceeds 50%. So iNaturalist excludes observations where the uncertainty is too large (details here: https://www.inaturalist.org/pages/help#placeindex).
This means that for very small reserves, lots of good data can be discarded where users are not aware of the implications. Many naive observers assume that making the circle of uncertainty just larger than the reserve will indicate that the data are from (somewhere in) the reserve. In reality, if the area outside of the actual reserve is too large while doing this, then the point will be considered probably outside.
This requires educating users, especially about the significance and importance of Location Accuracy. For app users, it is merely a matter of being aware that they need to let the app find their locality to a reasonable accuracy, as the app is quite precise thereafter. But users who add data and do their own mapwork need to be aware of the consequences of not recording the Location Accuracy, or of making it too precise or too imprecise.
Don't forget the DQA: mark any observations that have dubious localities, especially if you plan to download the data in the future for further research. "Location is accurate" and "Organism is wild" are the two fields that are most useful in this regard.