hdxr is a package designed to allow you to interact with Humanitarian Data Exchange (HDX) in R, allowing for more reproducible workflow with humanitarian data.

Currently it is a read only package, that gathers information from:

  • Datasets
  • Resources

Connecting to HDX

HDX uses CKAN to store data, and to connect to its server you first use:

hdx_connect()

This simply uses the ckanr command ckanr_setup(), specifying HDX

Datasets

Listing Datasets

To see what datasets are available on HDX, use:

hdx_dataset_list()

This will provide a dataframe, with a column containing the first 5000 titles of datasets available (use the limit argument to get more than 5000)

Retreiving Datasets

To download further information about a dataset, use:

hdx_dataset_search(term = "blah")

This will retrieve a dataframe of the first 10 datasets that have titles similar to the provided search term.

To return more than 10 results use the rows argument.

When you know the correct title, and only want that result, use the exact argument.

hdx_dataset_search(term = "afghanistan-roads", exact = TRUE)

Resources

Extracting Resources

Once you’ve selected the dataset you want using the hdx_dataset_search() function, you can extract information about the resources (the data files themselves) using:

hdx_resource_list()

This can be piped, in a situation like:

hdx_dataset_search(term = "afghanistan-roads", exact = TRUE) %>%
  hdx_resource_list()

This will provide a dataframe, with one line for each resource found, joined on to the initial information about the datasets.

Loading CSV

The hdx_resource_csv() function will take the results of hdx_resource_list(), select the resources that are csv, and download them into the environment, as a nested dataframe. You can use this function to load multiple csv files into the environment at once.

Once downloaded, you can then unnest the csv file you want to explore further:

afghanistan <- hdx_dataset_search(term = "ocha-afghanistan-topline-figures") %>%
  hdx_resource_list() %>%
  hdx_resource_csv()

afghanistan[1,] %>%
  unnest()

Loading Shapefiles

The hdx_resource_shapefile() function will take the results of hdx_resource_list(), select the resources that are zipped shapefiles, and download the first one. It will unzip it into its own folder, then load the shapefiles into the environment using simple features.

afghanistan <- hdx_dataset_search(term = "afghanistan-roads") %>%
  hdx_resource_list() %>%
  hdx_resource_shapefile()