Making Biodiversity Data Social, Shareable, and Scalable: Reflections on iNaturalist & citizen science

Author(s):  
Carrie Seltzer

Since 2008, iNaturalist has been crowdsourcing identifications for biodiversity observations collected by citizen scientists. Today iNaturalist has over 25 million records of wild biodiversity with photo or audio evidence, from every country, representing more than 230,000 species, collected by over 700,000 people, and with 90,000 people helping others with identifications. Hundreds of publications have used iNaturalist data to advance research, conservation, and policy. There are three key themes that iNaturalist has embraced: social interaction; shareability of data, tools, and code; and scalability of the platform and community. The keynote will share reflections on what has (and has not) worked for iNaturalist while drawing on other examples from biodiversity informatics and citizen science. Insights about user motivations, synergistic collaborations, and strategic decisions about scaling offer some transferable approaches to address the broadly applicable questions: Which species is represented? How do we make the best use of the available biodiversity information? And how do we build something viable and enduring in the process?

Author(s):  
Natalya Ivanova ◽  
Maxim Shashkov

Currently Russia does not have a national biodiversity information system, and is still not a GBIF (Global Biodiversity Information Facility) member. Nevertheless, GBIF is the largest source of biodiversity data for Russia. As of August 2020, >5M species occurrences were available through the GBIF portal, of which 54% were published by Russian organisations. There are 107 institutions from Russia that have become GBIF publishers, and 357 datasets have been published. The important trend of data mobilization in Russia is driven by the considerable contribution of citizen science. The most popular platform is iNaturalist. This year, the related GBIF dataset (Ueda 2020) became the largest one for Russia (793,049 species occurrences as of 2020-08-11). The first observation for Russia was posted in 2011, but iNaturalist started becoming popular in 2017. That year, 88 observers added >4500 observations that represented 1390 new species for Russia, 7- and 2-fold more, respectively, than for the previous 6 years. Now we have nearly 12,000 observers, about 15,000 observed species and >1M research-grade observations. The ratio of observations for Tracheophyta, Chordata, and Arthropoda in Russia differs from the global ratio. In the global iNaturalist GBIF dataset, these groups have an almost equal number of observations. In Russia, by contrast, vascular plants make up two-thirds of the observations. That is due to the "Flora of Russia" project, which attracted many professional botanists both as observers and experts. Thanks to their activity, Russia has a high proportion of research-grade observations in iNaturalist, 78% versus 60% globally. Another consequence of wide participation by professional researchers is the high rate of species accumulation. For some taxonomic groups, the conspicuous species have already been recorded. There are about 850 bird species in Russia, of which 398 species were observed in 2018, and only 83 new species in 2019.
Currently, the number of new species recorded over time is decreasing despite the increase in observers and overall user activity. Russian iNaturalist observers have shared a lot of archive photos (taken during past years). In 2018, it was nearly 1/4 of the total number of observations and about 3/4 of new species for the year, with similar trends observed during 2019. Usually archive photos are posted from December until April, but the 2020 pandemic lockdown spurred a new wave of archive photo mobilisation in April and May. There are many iNaturalist projects for protected areas in Russia: 27 for strict nature reserves and national parks, and about 300 for others. About 100,000 observations (7.5% of all Russian observations) from the umbrella project "Protected areas of Russia" represent >34% of the species diversity observed in Russia. For some regions, e.g., Novosibirsk, Nizhniy Novgorod and Vladimir Oblasts, almost all protected areas are covered by iNaturalist projects, and are often their only source of available biodiversity data. There are also other popular citizen science platforms developed by Russian researchers. The first one is the Russian birdwatching network RU-BIRDS.RU. The related GBIF dataset (Ukolov et al. 2019) is the third largest dataset for Russia (>370,000 species occurrences). Another Russian citizen science system is wildlifemonitoring.ru, which includes thematic resources for different taxonomic groups of vertebrates. This is the crowd-sourced web-GIS maintained by the Siberian Environmental Center NGO in Novosibirsk. It is noteworthy that iNaturalist activities in Russia are developed more as a social network than as a way to attract volunteers to participate in scientific research. Of 746 citations in the iNaturalist dataset, only 18 articles include co-authors from Russia. 
iNaturalist data are used for the management of regional red lists (in the Republic of Bashkortostan, Novosibirsk Oblast and others), and as an additional information source for regional inventories. RU-BIRDS data were used in the European Russia Breeding Bird Atlas and the new edition of the European Breeding Bird Atlas. In Russia, citizen science activities significantly contribute to filling gaps in the global biodiversity map. However, Russian iNaturalist observations available through GBIF are recorded as originating from the USA, where the iNaturalist dataset is published. This is not ideal, because the iNaturalist GBIF dataset is growing rapidly, and in the future it will represent more than all other datasets for Russia combined. In our opinion, iNaturalist data should be repatriated during the process of publishing through GBIF, as is done for the eBird dataset (Levatich and Ligocki 2020).


2018 ◽  
Vol 2 ◽  
pp. e26367
Author(s):  
Yvette Umurungi ◽  
Samuel Kanyamibwa ◽  
Faustin Gashakamba ◽  
Beth Kaplin

Freshwater biodiversity is critically understudied in Rwanda, and to date there has not been an efficient mechanism to integrate freshwater biodiversity information or make it accessible to decision-makers, researchers, the private sector, or communities, where it is needed for planning, management and the implementation of the National Biodiversity Strategy and Action Plan (NBSAP). A framework to capture and distribute freshwater biodiversity data is crucial to understanding how economic transformation and environmental change are affecting freshwater biodiversity and the resulting ecosystem services. To optimize conservation efforts for freshwater ecosystems, detailed information is needed regarding current and historical species distributions and abundances across the landscape. From these data, specific conservation concerns can be identified, analyzed and prioritized. The purpose of this project is to establish and implement a long-term strategy for freshwater biodiversity data mobilization, sharing, processing and reporting in Rwanda. The expected outcome of the project is to support the mandates of the Rwanda Environment Management Authority (REMA), the national agency in charge of environmental monitoring and the implementation of Rwanda’s NBSAP, and the Center of Excellence in Biodiversity and Natural Resources Management (CoEB). The project also aligns with the mission of the Albertine Rift Conservation Society (ARCOS) to enhance sustainable management of natural resources in the Albertine Rift region. Specifically, organizational structure, technology platforms, and workflows for biodiversity data capture and mobilization are enhanced to promote data availability and accessibility, to improve Rwanda’s NBSAP, and to support other decision-making processes.
The project is enhancing the capacity of technical staff from relevant government and non-government institutions in biodiversity informatics, strengthening the capacity of CoEB to achieve its mission as the Rwandan national biodiversity knowledge management center. Twelve institutions have been identified as data holders and the digitization of these data using Darwin Core standards is in progress, as well as data cleaning for the data publication through the ARCOS Biodiversity Information System (http://arbmis.arcosnetwork.org/). The release of the first national State of Freshwater Biodiversity Report is the next step. CoEB is a registered publisher to the Global Biodiversity Information Facility (GBIF) and holds an Integrated Publishing Toolkit (IPT) account on the ARCOS portal. This project was developed for the African Biodiversity Challenge, a competition coordinated by the South African National Biodiversity Institute (SANBI) and funded by the JRS Biodiversity Foundation which supports on-going efforts to enhance the biodiversity information management activities of the GBIF Africa network. This project also aligns with SANBI’s Regional Engagement Strategy, and endeavors to strengthen both emerging biodiversity informatics networks and data management capacity on the continent in support of sustainable development.


2016 ◽  
Vol 11 ◽  
Author(s):  
Alex Asase ◽  
A. Townsend Peterson

Providing comprehensive, informative, primary, research-grade biodiversity information represents an important focus of biodiversity informatics initiatives. Recent efforts within Ghana have digitized >90% of primary biodiversity data records associated with specimen sheets in Ghanaian herbaria; additional herbarium data are available from other institutions via biodiversity informatics initiatives such as the Global Biodiversity Information Facility. However, data on the plants of Ghana have not as yet been integrated and assessed to establish how complete site inventories are, so that appropriate levels of confidence can be applied. In this study, we assessed inventory completeness and identified gaps in current Digital Accessible Knowledge (DAK) of the plants of Ghana, to prioritize areas for future surveys and inventories. We evaluated the completeness of inventories at ½° spatial resolution using statistics that summarize inventory completeness, and characterized gaps in coverage in terms of geographic distance and climatic difference from well-documented sites across the country. The southwestern and southeastern parts of the country held many well-known grid cells; the largest spatial gaps were found in central and northern parts of the country. Climatic difference showed contrasting patterns, with a dramatic gap in coverage in central-northern Ghana. This study provides a detailed case study of how to prioritize for new botanical surveys and inventories based on existing DAK.
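As a rough illustration of the kind of calculation described above, the sketch below bins occurrence records into ½° grid cells and computes a completeness ratio per cell. The abstract does not name the statistic used, so the choice here (observed richness divided by a bias-corrected Chao1 estimate of expected richness) is an assumption made for illustration, as are the function names and the (latitude, longitude, species) record layout.

```python
import math
from collections import defaultdict


def cell_id(lat, lon, size=0.5):
    """Return the south-west corner of the grid cell containing (lat, lon)."""
    return (math.floor(lat / size) * size, math.floor(lon / size) * size)


def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-species record counts.

    Assumption: Chao1 stands in for whatever estimator the study applied.
    """
    s_obs = len(counts)
    f1 = sum(1 for c in counts if c == 1)  # species recorded exactly once
    f2 = sum(1 for c in counts if c == 2)  # species recorded exactly twice
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def completeness_by_cell(records, size=0.5):
    """Map each grid cell to observed richness / estimated richness."""
    per_cell = defaultdict(lambda: defaultdict(int))
    for lat, lon, species in records:
        per_cell[cell_id(lat, lon, size)][species] += 1
    result = {}
    for cell, species_counts in per_cell.items():
        counts = list(species_counts.values())
        result[cell] = len(counts) / chao1(counts)
    return result
```

Cells with a ratio near 1 would count as well inventoried under this measure, while low ratios (or cells with no records at all) would flag candidate areas for new surveys. A real analysis would also set a minimum number of records per cell before trusting the ratio.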


Author(s):  
José Augusto Salim ◽  
Antonio Saraiva

For those biologists and biodiversity data managers who are unfamiliar with information science practices of data standardization, the use of complex software to assist in the creation of standardized datasets can be a barrier to sharing data. Since the ratification of the Darwin Core Standard (DwC) (Darwin Core Task Group 2009) by the Biodiversity Information Standards (TDWG) in 2009, many datasets have been published and shared through a variety of data portals. In the early stages of biodiversity data sharing, the protocol Distributed Generic Information Retrieval (DiGIR), progenitor of DwC, and later the protocols BioCASe and TDWG Access Protocol for Information Retrieval (TAPIR) (De Giovanni et al. 2010) were introduced for discovery, search and retrieval of distributed data, simplifying data exchange between information systems. Although these protocols are still in use, they are known to be inefficient for transferring large amounts of data (GBIF 2017). Because of that, in 2011 the Global Biodiversity Information Facility (GBIF) introduced the Darwin Core Archive (DwC-A), which allows more efficient data transfer, and has become the preferred format for publishing data in the GBIF network. DwC-A is a structured collection of text files, which makes use of the DwC terms to produce a single, self-contained dataset. Many tools for assisting data sharing using DwC-A have been introduced, such as the Integrated Publishing Toolkit (IPT) (Robertson et al. 2014), the Darwin Core Archive Assistant (GBIF 2010) and the Darwin Core Archive Validator. Although these tools promote and facilitate data sharing, many users have difficulties using them, mainly because of the lack of training in information science in the biodiversity curriculum (Convention on Biological Diversity 2012, Enke et al. 2012).
Most users are very familiar with spreadsheets as a way to store and organize their data, but the adoption of the available solutions requires data transformation and training in information science and, more specifically, biodiversity informatics. For an example of how spreadsheets can simplify data sharing see Stoev et al. (2016). In order to provide a more "familiar" approach to data sharing using DwC-A, we introduce a new tool as a Google Sheet Add-on. The Add-on, called Darwin Core Archive Assistant Add-on, can be installed in the user's Google Account from the G Suite MarketPlace and used in conjunction with the Google Sheets application. The Add-on assists the mapping of spreadsheet columns/fields to DwC terms (Fig. 1), similar to IPT, but with the advantage that it does not require the user to export the spreadsheet and import it into another application. Additionally, the Add-on facilitates the creation of a star schema in accordance with DwC-A, by the definition of a "CORE_ID" (e.g. occurrenceID, eventID, taxonID) field between sheets of a document (Fig. 2). The Add-on also provides an Ecological Metadata Language (EML) (Jones et al. 2019) editor (Fig. 3) with minimal fields to be filled in (i.e., mandatory fields required by IPT), and helps users to generate and share DwC-Archives stored in the user's Google Drive, which can be downloaded as a DwC-A or automatically uploaded to another public storage resource like a user's Zenodo Account (Fig. 4). We expect that the Google Sheet Add-on introduced here, in conjunction with IPT, will promote biodiversity data sharing in a standardized format, as it requires minimal training and simplifies the process of data sharing from the user's perspective, mainly for those users who are not familiar with IPT but have historically worked with spreadsheets.
Although the DwC-A generated by the add-on still needs to be published using IPT, it does provide a simpler interface (i.e., spreadsheet) for mapping data sets to DwC than IPT. Even though the IPT includes many more features than the Darwin Core Assistant Add-on, we expect that the Add-on can be a "starting point" for users unfamiliar with biodiversity informatics before they move on to more advanced data publishing tools. On the other hand, Zenodo integration allows users to share and cite their standardized data sets without publishing them via IPT, which can be useful for users without access to an IPT installation. Additionally, we are working on new features and future releases will include the automatic generation of Global Unique Identifiers for shared records, the possibility of adding additional data standards and DwC extensions, integration with GBIF REST API and with IPT REST API.
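To make the idea of mapping spreadsheet columns to DwC terms and packaging the result as a DwC-A more concrete, here is a minimal sketch in Python. It is not the Add-on's code: the function name `build_dwca`, the dictionary-based column mapping, and the fixed Occurrence core are assumptions for illustration. It also assumes every mapped term lives in the main Darwin Core namespace, and it omits the EML metadata file that a publishable archive would carry.

```python
import csv
import io
import zipfile
from xml.sax.saxutils import quoteattr

DWC = "http://rs.tdwg.org/dwc/terms/"


def build_dwca(rows, column_to_term):
    """Return the bytes of a minimal Darwin Core Archive (core file + meta.xml).

    rows: list of dicts keyed by spreadsheet column name.
    column_to_term: hypothetical mapping of column name -> DwC term name;
    it must include a column mapped to occurrenceID, used as the core id.
    """
    columns = list(column_to_term)
    terms = [column_to_term[c] for c in columns]
    id_index = terms.index("occurrenceID")

    # Core data file: tab-separated, one header line of DwC term names.
    data = io.StringIO()
    writer = csv.writer(data, delimiter="\t", lineterminator="\n")
    writer.writerow(terms)
    for row in rows:
        writer.writerow([row.get(c, "") for c in columns])

    # Descriptor file: tells a reader which column holds which term.
    fields = "\n".join(
        f'    <field index="{i}" term={quoteattr(DWC + t)}/>'
        for i, t in enumerate(terms)
    )
    meta = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<archive xmlns="http://rs.tdwg.org/dwc/text/">\n'
        '  <core encoding="UTF-8" fieldsTerminatedBy="\\t" linesTerminatedBy="\\n"\n'
        f'        ignoreHeaderLines="1" rowType="{DWC}Occurrence">\n'
        '    <files><location>occurrence.txt</location></files>\n'
        f'    <id index="{id_index}"/>\n'
        f'{fields}\n'
        '  </core>\n'
        '</archive>\n'
    )

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("occurrence.txt", data.getvalue())
        archive.writestr("meta.xml", meta)
    return buffer.getvalue()
```

A star schema would add extension files to the same zip, each with its own block in meta.xml and a column holding the core record's identifier. That is the role the "CORE_ID" field plays between sheets.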


2018 ◽  
Vol 2 ◽  
pp. e25298
Author(s):  
Siobhan Leachman

The Biodiversity Heritage Library (BHL) provides open access to over 54 million pages of biodiversity literature. Much of this literature is either in the public domain or is licensed for reuse under the Creative Commons framework. Anyone can therefore freely reuse much of the information and data provided by BHL. This presentation will outline how the work of a citizen scientist using BHL content might benefit research scientists. It will discuss how a citizen scientist can reuse and link BHL literature and data in Wikipedia and Wikidata. It will explain the research efficiencies that can be obtained through this reuse and linking, for example through the consolidation of database identifiers. The presentation will outline the subsequent reuse of the BHL data added to Wikipedia and Wikidata by the internet search engine Google. It will discuss an example of the linking of this information in the citizen science observation platform iNaturalist. The presentation will explain how BHL, as a result of its open reuse licensing of information and data, helps in the creation of more accurate citizen science generated biodiversity data and assists with the wider and more effective dissemination of biodiversity information.


Author(s):  
Jean Ganglo

Benin became a member of the Global Biodiversity Information Facility (GBIF) in 2004 and acceded to the status of voting member in 2011. GBIF Benin, through the constant efforts of its node, is now very active in the GBIF community with respect to capacity building, data mobilization and data use. GBIF Benin has published more than 400,000 occurrence records from about 125 datasets on the GBIF portal. As for capacity building, GBIF Benin organizes at least two workshops each year to enhance the capacities of national and regional partners in data mobilization and data use. At the regional level, GBIF Benin is leading a consortium of countries (including Senegal, Côte-d’Ivoire, Niger, Democratic Republic of Congo, Guinea, and Madagascar) to help overcome the challenges of data mobilization and data use. From the academic year 2017-2018, GBIF Benin, through its node manager, successfully cooperated with the University of Kansas to create a master program in biodiversity informatics. Biodiversity informatics is a relatively new field of science concerned with the large-scale collection of occurrence data on biodiversity and the environment, and with the processing, analysis, and representation of those data so as to derive sound research products that inform decisions on biodiversity conservation and sustainable use in the context of climate and global change. In Benin, the master program in biodiversity informatics is a permanent two-year program structured in teaching units with the following contents: 1) Basic concepts of biodiversity; 2) Biodiversity data capture; 3) Biodiversity inventories; 4) Biodiversity data analysis; 5) Climate change and biodiversity; 6) Ecological niche modeling and strategies for biodiversity conservation; 7) Data-science-policy interface; 8) Public health and applications of biodiversity data, etc.
On completing their studies, graduates of the program will be able to achieve the following objectives: 1) Use Geographic Information Systems to map the spatial distribution of species; 2) Model the current and future ecological niches of species in the context of climate and global change; 3) Characterize biodiversity on scales ranging from local to global; 4) Assess geographic patterns among suites of species (i.e., communities); 5) Refine the knowledge on particular taxonomic groups; 6) Define priority zones for biodiversity conservation; 7) Develop strategies for species conservation; 8) Implement biodiversity conservation strategies; 9) Predict the risks of propagation of infectious diseases (Lassa fever, Ebola fever, etc.) whose vectors are living organisms, so as to support preventive actions, etc. With such capacities, the graduates of the master program form a new generation of biodiversity information scientists who are able to address information needs and so contribute to biodiversity conservation and its sustainable use. Furthermore, in their respective countries and the rest of Africa, they will contribute to the achievement of the Sustainable Development Goals as defined by the United Nations in 2015. With respect to data use, more and more research products are accumulating in Benin and are being integrated into decision-making. In 2018, the results of our data use were integrated into the preparation of Benin's second communication on climate change.


2018 ◽  
Vol 6 ◽  
Author(s):  
A. Townsend Peterson ◽  
Alex Asase ◽  
Dora Canhos ◽  
Sidnei de Souza ◽  
John Wieczorek

The field of biodiversity informatics is in a massive, “grow-out” phase of creating and enabling large-scale biodiversity data resources. Because perhaps 90% of existing biodiversity data nonetheless remains unavailable for science and policy applications, the question arises as to how these existing and available data records can be mobilized most efficiently and effectively. This situation led to our analysis of several large-scale biodiversity datasets regarding birds and plants, detecting information gaps and documenting data “leakage” or attrition, in terms of data on taxon, time, and place, in each data record. We documented significant data leakage in each data dimension in each dataset. That is, significant numbers of data records are lacking crucial information in terms of taxon, time, and/or place; information on place was consistently the least complete, such that geographic referencing presently represents the most significant factor in degradation of usability of information from biodiversity information resources. Although the full process of digital capture, quality control, and enrichment is important to developing a complete digital record of existing biodiversity information, payoffs in terms of immediate data usability will be greatest with attention paid to the georeferencing challenge.
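The notion of data "leakage" across taxon, time, and place can be sketched as a simple completeness check over a set of records. The example below is only an illustration under stated assumptions: the Darwin Core field names used to represent each dimension, and the rule that a dimension counts as missing when any of its fields is empty, are choices made here and are not taken from the study, which may have used finer criteria (for example, taxonomic rank or date precision).

```python
# Assumed mapping of each data dimension to Darwin Core fields.
DIMENSIONS = {
    "taxon": ("scientificName",),
    "time": ("eventDate",),
    "place": ("decimalLatitude", "decimalLongitude"),
}


def has_value(record, field):
    """True if the field is present and not blank (0 and 0.0 count as values)."""
    value = record.get(field)
    return value is not None and str(value).strip() != ""


def leakage(records):
    """Return the fraction of records missing each dimension, plus 'any'."""
    total = len(records)
    missing = {name: 0 for name in DIMENSIONS}
    missing_any = 0
    for record in records:
        incomplete = False
        for name, fields in DIMENSIONS.items():
            if not all(has_value(record, f) for f in fields):
                missing[name] += 1
                incomplete = True
        if incomplete:
            missing_any += 1
    report = {name: count / total for name, count in missing.items()}
    report["any"] = missing_any / total
    return report
```

Under these assumptions, a high value for "place" relative to "taxon" and "time" would correspond to the pattern the abstract reports, in which geographic referencing is the main limit on usability.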


Author(s):  
Nora Escribano ◽  
David Galicia ◽  
Arturo H. Ariño

Building on the development of Biodiversity Informatics, the Global Biodiversity Information Facility (GBIF) undertook the task of enabling access to the world’s wealth of biodiversity data via the Internet. To date, GBIF has become, in many respects, the most extensive biodiversity information exchange infrastructure in the world, opening up a full range of possibilities for science. Science has benefited from such access to biodiversity data in research areas ranging from the effects of environmental change on biodiversity to the spread of invasive species, among many others. As of this writing, more than 7,000 published items (scientific papers, reviews, conference proceedings) have been indexed in the GBIF Secretariat’s literature tracking programme. On the basis of this database, we will present trends in GBIF users’ behaviour over time regarding openness, social structure, and other features associated with such scientific production: What is the measurable impact of research using GBIF data? How is the GBIF community of users growing? Is the science made with, and enabled by, open data actually open? Mapping GBIF users’ choices will show how biodiversity research is evolving through time, synthesising past and current priorities of this community in an attempt to forecast whether summer, or winter, is coming.


Author(s):  
Nina Filippova ◽  
Ilya Filippov ◽  
Natalya Ivanova

Biodiversity-related studies in the northern part of West Siberia are relatively recent, in line with the intensive industrial development of the region in recent decades. The region possesses few biological collections within its universities and nature reserves. Still, the Department of Natural Resources pays considerable attention to the sustainable use of natural resources. On the global scale, the success of biodiversity informatics goals largely depends on local initiatives and progress in data mobilization and sharing. Therefore, the organization of regional biodiversity portals is important to promote data mobilization, education and citizen science on a local scale. Previous experience with biodiversity information systems in the region was limited. The program on digitization of observations of Red Listed species was launched in 2010 with the support of the Department of Natural Resources of Yugra. The information system for Red Listed species registrations was developed through this project and currently includes about three thousand observations. Another example of digitization in Western Siberia was developed by the biological collection of Yugra State University. Its database is based on the database management system Specify and is available online through its web portal (http://bioportal.ugrasu.ru). Some collections of nature reserves have their catalogues in digital form. The need for biodiversity data mobilization is well understood and is discussed at regular workshops on biological collections management held in Khanty-Mansiysk. Recently, the biologists curating several biological collections in the region started a project to develop a regional biodiversity portal (https://nwsbios.org).
The portal has three major components: the database of collections based on Specify software (http://bioportal.ugrasu.ru); the metadata of different sources of biodiversity information in the region; and an educational platform for learning biodiversity informatics, using data published via GBIF and DwC standards. This initiative of biodiversity data mobilization in the region includes the organization of workshops, discussions and newsletters helping to reach potential data holders and coordinate work. Through this work four different organizations from Khanty-Mansi region have registered accounts in GBIF since 2019 and started uploading data to the GBIF portal. At present there are about 25,000 observations mobilized in GBIF from the Khanty-Mansi and Yamalo-Nenets regions. The integrated massive publishing of data in the portal will provide new opportunities for biodiversity research and sustainable management of nature resources in the northern part of West Siberia.


Author(s):  
Jerome Chie-Jen Ko ◽  
Huiling Chang ◽  
Yihong Chang ◽  
Tzu-Chien Kuo ◽  
You-Cheng Yu ◽  
...  

The importance of a data-exchanging culture accompanied by a supporting bioinformatic system is widely praised as an aid to sustainable development. Yet this is not always implemented as a top-down procedure in every governing environment. Common obstacles include lack of resources, lack of support from decision-makers, and lack of recognition from data providers. We demonstrate how citizen science (hereafter CS), which assumes a spirit of public information sharing, can be a critical tool to help database managers overcome these obstacles. CS data account for over 70% of the 4.5 million occurrence records currently openly distributed in Taiwan. Although CS projects emerged much earlier in a few taxa, such as Aves and Anura, CS was unknown to the wider public and politicians in the region until 2009. The change was probably due to the combination of the popularity of social media and improvements to wifi connections, which brought discoveries and impacts of CS data to the news spotlight. Such cases include roadkill projects that aided rabies-outbreak control, and amateur bird records that helped reduce the conflict between solar energy deployment and migratory wetland bird conservation. These cases also reinforced the call for more data to be open, an effect that was prominent among project managers in other CS communities and the previously reluctant expert researcher communities, and that even placed pressure on the data policies of several conservation agencies which previously were not supportive of open data. The inclusion of CS programs is also critical in forming alliances between agencies that were responsible for promoting and building the biodiversity informatics system. Previously, financial and human resources for such systems were split across agencies.
However, building a cutting-edge biodiversity information service platform, or empowering staff to handle the rapidly growing amount of data, requires joint partnerships across government agencies. CS brings government efforts into the public spotlight, which is an important strategy to maintain support from top decision-makers and politicians, who mostly rely on public votes in a democratic society. Currently, the national node of the Global Biodiversity Information Facility in Taiwan, the administration for conservation in Taiwan, and the main biodiversity consultancy in Taiwan have teamed up, answering the call for sharing data for a better future. As a tribute to the CS projects, a biodiversity informatics system named Taiwan Biodiversity Network is now enhancing its ability as a platform to promote data usage and provide technical aid to CS programs. Data visualization projects such as “Coldspots” point out regions that lack data, which can be used to decide where to focus efforts for the next field surveys. Online CS data platforms, such as the Taiwan Reptile Report Program, are also working to ease the previously intensive efforts that project managers needed to contribute to run event-based monitoring. Combined, these developments form a cultural and technical basis for the implementation of multi-taxa atlas projects, which was made possible by the mainstreaming of open data culture and biodiversity awareness through citizen science projects.

