Title: | ILO Open Data via Ilostat Bulk Download Facility |
---|---|
Description: | Tools to download data from the [ilostat](<https://ilostat.ilo.org>) database together with search and manipulation utilities. |
Authors: | David Bescond [aut, cre] |
Maintainer: | David Bescond <[email protected]> |
License: | BSD_2_clause + file LICENSE |
Version: | 2.3.1 |
Built: | 2025-03-10 15:20:43 UTC |
Source: | https://github.com/ilostat/rilostat |
Deletes all cache files from your ilostat cache directory.
See get_ilostat for more on caching.
clean_ilostat_cache(
  cache_dir = getOption("ilostat_cache_dir", file.path(tempdir(), "ilostat")),
  cache_update = getOption("ilostat_cache_update", FALSE),
  quiet = getOption("ilostat_quiet", FALSE)
)
cache_dir | A character, path to a cache directory. The directory has to exist. The |
cache_update | A logical, whether to delete only out-of-date cache files. Useful when |
quiet | A logical, if |
David Bescond [email protected]
See citation("Rilostat")
ilostat bulk download facility user guidelines https://webapps.ilo.org/ilostat-files/Documents/ILOSTAT_BulkDownload_Guidelines.pdf
## Not run: 
clean_ilostat_cache()
## End(Not run)
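To make concrete what emptying a cache directory involves, the following base R sketch creates a throwaway directory, places two placeholder files in it and deletes them. It is an illustration only and not the package's implementation; the directory name `demo_cache` and the file names are hypothetical.

```r
# Hypothetical cache directory inside the session's temporary folder.
cache_dir <- file.path(tempdir(), "demo_cache")
dir.create(cache_dir, showWarnings = FALSE)

# Two placeholder files standing in for cached datasets (names are made up).
file.create(file.path(cache_dir, c("dataset_a.rds", "dataset_b.rds")))

# List what is there, then delete every file in the directory.
cached_before <- list.files(cache_dir, full.names = TRUE)
unlink(cached_before)
cached_after <- list.files(cache_dir)

length(cached_before)  # number of files found before cleaning
length(cached_after)   # number of files left after cleaning
```

The directory itself is left in place, so it can be reused as a cache location afterwards.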
Opens the ilostat data explorer app on your computer.
dataexplorer()
David Bescond [email protected]
See citation("Rilostat")
## Not run: 
require(shiny)
dataexplorer()
## End(Not run)
Computes the distribution (percentage shares) for ilostat indicators expressed as a number of persons only.
distribution_ilostat(x, var, .keep = FALSE)
x | Dataset to transform into a distribution. |
var | String, name of the variable used for the distribution, default |
.keep | If true, return only the new column called distribution, default |
This function uses the maximum of the corresponding grouping as the denominator, so it is important not to filter out any subset of the variable selected for the distribution at this level. For example, if you remove SEX_T, the distribution by sex will be computed as SEX_F or SEX_M / max(SEX_M, SEX_F) * 100, which is no longer a distribution.
In addition, a distribution is only applicable to indicators expressed as a number of persons (usually in thousands), so please do not compute distributions of ratios, earnings, hours of work, CPI, GDP, etc. No warning will prevent this; if in doubt, use distribution from get_ilostat() instead, where warnings will help you.
A data_frame. obs_value will no longer be a number of persons but a percentage.
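The caveat about not dropping the total category can be checked on a toy example. The base R sketch below divides each value by the maximum of its group, as described above, once with the total row present and once after removing it. The data are made up, the helper `share_of_max` is hypothetical, and this is not the package's own code.

```r
# Toy data: number of persons (thousands) by sex for one area and year.
# Column names mirror ilostat conventions; the figures are invented.
toy <- data.frame(
  ref_area  = "XXX",
  time      = "2020",
  sex       = c("SEX_T", "SEX_M", "SEX_F"),
  obs_value = c(200, 120, 80)
)

# Share of each row relative to the maximum of its ref_area/time group.
share_of_max <- function(df) {
  group_max <- ave(df$obs_value, df$ref_area, df$time, FUN = max)
  df$obs_value / group_max * 100
}

# With the total present, the maximum is the total: shares are 100, 60, 40.
full_shares <- share_of_max(toy)

# Without SEX_T, the maximum is SEX_M, so the shares no longer sum to 100.
partial_shares <- share_of_max(toy[toy$sex != "SEX_T", ])

print(full_shares)
print(partial_shares)
```

In the second call the male category is wrongly reported as 100 percent, which is the failure mode the note warns about.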
David Bescond [email protected]
See citation("Rilostat")
ilostat bulk download facility user guidelines https://webapps.ilo.org/ilostat-files/Documents/ILOSTAT_BulkDownload_Guidelines.pdf
## Not run: 
require(dplyr) # provides mutate()
dat <- get_ilostat("EMP_TEMP_SEX_STE_GEO_NB_A", cache = FALSE)
dat_dist <- distribution_ilostat(dat, "classif1")
dat_plus_dist <- mutate(dat, dist = distribution_ilostat(dat, "classif1", .keep = TRUE))
head(dat_dist)
clean_ilostat_cache()
## End(Not run)
Downloads datasets from ilostat (https://ilostat.ilo.org) via the bulk download facility (https://ilostat.ilo.org/data/bulk/).
get_ilostat(
  id,
  segment = getOption("ilostat_segment", "indicator"),
  type = getOption("ilostat_type", "code"),
  lang = getOption("ilostat_lang", "en"),
  time_format = getOption("ilostat_time_format", "raw"),
  filters = getOption("ilostat_filter", "none"),
  fixed = getOption("ilostat_fixed", TRUE),
  detail = getOption("ilostat_detail", "full"),
  cache = getOption("ilostat_cache", TRUE),
  cache_update = getOption("ilostat_cache_update", TRUE),
  cache_dir = getOption("ilostat_cache_dir", NULL),
  cache_format = getOption("ilostat_cache_format", "rds"),
  back = getOption("ilostat_back", TRUE),
  cmd = getOption("ilostat_cmd", "none"),
  quiet = getOption("ilostat_quiet", FALSE)
)
id | A code name for the dataset of interest. See |
segment | A character, way to get datasets by: |
type | A character, type of variables, |
lang | A character, code for language. Available are |
time_format | A string giving the type of conversion of the time column from the ilostat format. "raw" (default) does no conversion and returns time as character (i.e. '2017', '2017Q1', '2017M01'). A "date" converted to a |
filters | A list; |
fixed | A logical, if |
detail | A character, |
cache | A logical, whether to do caching. Default is |
cache_update | A logical, whether to update the cache. The check compares the last.update attribute stored in the cache file name with the one from the table of contents. Can also be set with options(ilostat_cache_update = FALSE). Default is |
cache_dir | A path to a cache directory. The directory has to exist. The |
cache_format | A character, format to store on the cache |
back | A logical, |
cmd | A character, R expression used to manipulate the internal data frame |
quiet | A logical, if |
A tibble with one column for each dimension in the data, a values column for the numerical values, the metadata columns, and a time column for the time dimension.
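The raw time strings mentioned under time_format ('2017', '2017Q1', '2017M01') can be turned into dates in several ways. The base R sketch below is one possibility, under the assumption that each period maps to its first day; the helper name `raw_time_to_date` is hypothetical and the package's own "date" conversion may follow a different convention.

```r
# Convert ilostat-style raw time strings to Date objects.
# Assumption: a year, quarter or month is represented by its first day.
raw_time_to_date <- function(x) {
  year  <- as.integer(substr(x, 1, 4))
  month <- rep(1L, length(x))  # annual values default to January

  is_quarter <- grepl("^[0-9]{4}Q[1-4]$", x)
  is_month   <- grepl("^[0-9]{4}M(0[1-9]|1[0-2])$", x)

  # Quarter n starts in month 3 * (n - 1) + 1.
  month[is_quarter] <- (as.integer(substr(x[is_quarter], 6, 6)) - 1L) * 3L + 1L
  month[is_month]   <- as.integer(substr(x[is_month], 6, 7))

  as.Date(sprintf("%04d-%02d-01", year, month))
}

converted <- raw_time_to_date(c("2017", "2017Q2", "2017M11"))
print(converted)
```

Keeping time as "raw" preserves the frequency information in the string itself, whereas a date column is easier to plot and sort.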
Datasets are downloaded from the ilostat bulk download facility. If only the table id is given, the whole table is downloaded from the bulk download facility.
The bulk download facility is the fastest method to download whole datasets. It is also often the only way, as the sdmx API has a limit of 300 000 records per request and whole datasets usually exceed that.
By default datasets from the bulk download facility are cached as they are often rather large.
Cache files are stored in a temporary directory by default, or in a named directory if cache_dir or the option ilostat_cache_dir is defined. The cache can be emptied with clean_ilostat_cache.
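As a minimal sketch of the named-directory case, the lines below create a directory, register it through the ilostat_cache_dir option, and read it back with the same getOption() fallback pattern that appears in the clean_ilostat_cache usage. Only base R is used; the directory name `r_cache` follows the package examples, and nothing is downloaded.

```r
# Create a dedicated cache directory (it has to exist before it is used).
cache_dir <- file.path(tempdir(), "r_cache")
dir.create(cache_dir, showWarnings = FALSE)

# Register it for the session through the documented option name.
old_options <- options(ilostat_cache_dir = cache_dir)

# Read the option back, falling back to a default path when it is unset.
resolved <- getOption("ilostat_cache_dir", file.path(tempdir(), "ilostat"))
print(resolved)

# Restore the previous option value so the session is left unchanged.
options(old_options)
```

Setting the option once avoids passing cache_dir to every call.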
The id, a code for the dataset, can be searched with get_ilostat_toc or on the [bulk download facility](https://ilostat.ilo.org/data/bulk/).
David Bescond [email protected]
See citation("Rilostat")
ilostat bulk download facility user guidelines https://ilostat.ilo.org/data/bulk/
get_ilostat_toc, label_ilostat
## Not run: 
############# get simple dataset
dat <- get_ilostat("UNE_2UNE_SEX_AGE_NB_A")
head(dat)
dat <- get_ilostat("NZL_Q", segment = "ref_area")
head(dat)

dir.create(file.path(tempdir(), "r_cache"))
dat <- get_ilostat("UNE_2UNE_SEX_AGE_NB_A",
                   cache_dir = file.path(tempdir(), "r_cache"))
head(dat)
clean_ilostat_cache(cache_dir = file.path(tempdir(), "r_cache"))

# option name as used in the function signature
options(ilostat_cache_update = TRUE)
dat <- get_ilostat("UNE_2UNE_SEX_AGE_NB_A")
head(dat)
options(ilostat_cache_update = FALSE)

options(ilostat_cache_dir = file.path(tempdir(), "r_cache"))
dat <- get_ilostat("UNE_2UNE_SEX_AGE_NB_A")
clean_ilostat_cache()

############# get multiple datasets
dat <- get_ilostat(c("CPI_ACPI_COI_RT_M", 'CPI_ACPI_COI_RT_Q'), cache = FALSE)
head(dat)
toc <- get_ilostat_toc(search = 'CPI_')
head(toc)
dat <- get_ilostat(toc, cache = FALSE) #id as a tibble

############# get datasets with filters
dat <- get_ilostat(id = c("UNE_2UNE_SEX_AGE_NB_A", 'EMP_2EMP_SEX_AGE_NB_A'),
                   filters = list(
                     ref_area = "FRA",
                     classif1 = "AGE_YTHADULT_YGE15",
                     time = "2016",
                     sex = c("T", 'SEX_F')),
                   quiet = TRUE)
head(dat)
clean_ilostat_cache()

############# store in other format
dir.create(file.path(tempdir(), "ilostat"))
dat <- get_ilostat("UNE_2UNE_SEX_AGE_NB_A",
                   cache_dir = file.path(tempdir(), "r_cache"),
                   cache_format = 'csv')
dat <- get_ilostat("UNE_2UNE_SEX_AGE_NB_A",
                   cache_dir = file.path(tempdir(), "r_cache"),
                   cache_format = 'dta')

############# advanced manipulation
dat <- get_ilostat("UNE_2UNE_SEX_AGE_NB_A",
                   cmd = "dat %>% count(ref_area)",
                   quiet = TRUE)
label_ilostat(dat, code = 'ref_area')
clean_ilostat_cache()
## End(Not run)
Downloads one ilostat dictionary from ilostat (https://ilostat.ilo.org) via the bulk download facility (https://ilostat.ilo.org/data/bulk/).
get_ilostat_dic(dic, lang = getOption("ilostat_lang", "en"))
dic | A character, dictionary for the variable to be downloaded, |
lang | A character, code for language. Available are |
For a given coded variable from ilostat (https://ilostat.ilo.org/), the dictionaries link codes with human-readable labels. To translate codes to labels, use label_ilostat.
A tibble with two columns: code names and full names.
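The way a two-column dictionary links codes to labels can be sketched with a plain lookup. The toy dictionary below is invented for illustration: the column names `code` and `label` and the label texts are assumptions, not the contents of any real ilostat dictionary.

```r
# Toy dictionary in the two-column layout described above.
# Column names and label texts are illustrative assumptions.
toy_dic <- data.frame(
  code  = c("SEX_T", "SEX_M", "SEX_F"),
  label = c("Total", "Male", "Female"),
  stringsAsFactors = FALSE
)

# Codes to translate; the last one is deliberately absent from the dictionary.
codes <- c("SEX_F", "SEX_T", "SEX_F", "SEX_X")

# match() gives the dictionary row of each code, or NA when there is none.
labels <- toy_dic$label[match(codes, toy_dic$code)]
print(labels)
```

Unknown codes come back as NA, which makes missing dictionary entries easy to spot.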
David Bescond [email protected]
See citation("Rilostat")
ilostat bulk download facility user guidelines https://webapps.ilo.org/ilostat-files/WEB_bulk_download/ILOSTAT_BulkDownload_Guidelines.pdf
ilostat bulk download facility main page https://ilostat.ilo.org/data/bulk/
## Not run: 
tmp <- get_ilostat_dic("indicator")
head(tmp)
tmp <- get_ilostat_dic("classif1", lang = "fr")
head(tmp)
## End(Not run)
Downloads one table of contents from ilostat (https://ilostat.ilo.org) via the bulk download facility (https://ilostat.ilo.org/data/bulk/).
get_ilostat_toc(
  segment = getOption("ilostat_segment", "indicator"),
  lang = getOption("ilostat_lang", "en"),
  search = getOption("ilostat_search", "none"),
  filters = getOption("ilostat_filter", "none"),
  fixed = getOption("ilostat_fixed", TRUE)
)
segment | A character, way to get datasets by: |
lang | A character, code for language. Available are |
search | A character vector, "none" (default). Datasets with this pattern in the description will be returned. The elements of a character vector are combined with AND; a single string containing '|' is treated as OR (see the examples). options(ilostat_time_format = 'date'), |
filters | A list; |
fixed | A logical, if |
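The AND and OR behaviour described for search can be mimicked on a toy vector of descriptions with grepl(). This is a sketch of the matching logic only, not the package's code: the descriptions are invented, the helper `match_all` is hypothetical, and lower-casing the text first is a choice made for this example.

```r
# Invented dataset descriptions standing in for a table of contents.
descriptions <- c(
  "Youth unemployment rate",
  "Adult employment",
  "Youth and adult labour force",
  "Consumer price index"
)
text <- tolower(descriptions)

# AND: every element of the pattern vector must be found in the text.
match_all <- function(text, patterns, fixed = TRUE) {
  hits <- lapply(patterns, function(p) grepl(p, text, fixed = fixed))
  Reduce(`&`, hits)
}
and_hits <- match_all(text, c("youth", "adult"))

# OR: a single pattern with '|' needs regular-expression matching.
or_hits <- grepl("youth|adult", text, fixed = FALSE)

# With fixed = TRUE the '|' is a literal character, so nothing matches.
literal_hits <- grepl("youth|adult", text, fixed = TRUE)

print(descriptions[and_hits])
print(descriptions[or_hits])
```

This is consistent with the package examples passing fixed = FALSE whenever the search string contains '|'.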
The TOC in English by indicator is downloaded from https://webapps.ilo.org/ilostat-files/WEB_bulk_download/indicator/table_of_contents_en.csv. The values in column 'id' should be used to download a selected dataset.
The TOC in English by ref_area is downloaded from https://webapps.ilo.org/ilostat-files/WEB_bulk_download/ref_area/table_of_contents_en.csv. The values in column 'id' should be used to download a selected dataset.
A tibble with ten columns, depending on the segment (indicator or ref_area):
id: The codename of the dataset or theme, used by the get_ilostat and get_ilostat_raw functions,
indicator or ref_area: The indicator or ref_area code of the dataset,
indicator.label or ref_area.label: The indicator or ref_area name of the dataset,
freq: The frequency code of the dataset,
freq.label: The frequency name of the dataset,
size: Size of the csv.gz files,
data.start: First time period of the dataset,
data.end: Last time period of the dataset,
last.update: Last update of the dataset,
...: Other relevant information
David Bescond [email protected]
See citation("Rilostat")
ilostat bulk download facility user guidelines https://webapps.ilo.org/ilostat-files/WEB_bulk_download/ILOSTAT_BulkDownload_Guidelines.pdf
## Not run: 
## default segment by indicator, default lang English
toc <- get_ilostat_toc()
head(toc)
toc <- get_ilostat_toc(segment = 'ref_area', lang = 'fr')
head(toc)
##
## search on toc
toc <- get_ilostat_toc(search = 'education')
head(toc)
toc <- get_ilostat_toc(lang = 'fr', search = 'éducation')
head(toc)
toc <- get_ilostat_toc(segment = 'ref_area', lang = 'fr', search = 'Albanie')
toc
toc <- get_ilostat_toc(segment = 'ref_area', lang = 'es', search = 'Trimestral')
head(toc)
##
## search multi on toc
toc <- get_ilostat_toc(segment = 'ref_area', lang = 'fr',
                       search = 'Albanie|France', fixed = FALSE)
head(toc)
toc <- get_ilostat_toc(search = 'youth|adult', fixed = FALSE)
head(toc)
toc <- get_ilostat_toc(search = c('youth','adult'), fixed = FALSE)
head(toc)
##
## End(Not run)
Gets definitions/labels for ilostat codes from ilostat dictionaries.
label_ilostat(
  x,
  dic = NULL,
  code = NULL,
  lang = getOption("ilostat_lang", "en")
)
x | A character or factor vector, or a data_frame, to be labelled. |
dic | A string (vector) naming the ilostat dictionary or dictionaries. If |
code | A vector of names of the columns for which code columns should be retained. Set to |
lang | A character, code for language. Available are |
A character or factor vector of codes returns a corresponding vector of definitions. label_ilostat also labels data_frames from get_ilostat. For vectors a dictionary
"time" and "values" columns are returned as they were, so you can supply a data_frame from get_ilostat and get a data_frame with definitions instead of codes.
a vector or a data_frame. The suffix ".label" is added to code column names.
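For the data_frame case, the effect of the ".label" suffix and of retaining a code column can be sketched in base R. The helper `label_column`, its `keep_code` argument, the toy data and the label texts are all hypothetical; the column order produced by the package may differ.

```r
# Toy data frame with one coded column plus time and value columns.
dat <- data.frame(
  sex       = c("SEX_M", "SEX_F"),
  time      = c("2020", "2020"),
  obs_value = c(120, 80),
  stringsAsFactors = FALSE
)

# Toy dictionary as a named vector (label texts are made up).
dic <- c(SEX_T = "Total", SEX_M = "Male", SEX_F = "Female")

# Add a "<column>.label" column and optionally drop the code column.
label_column <- function(df, column, dic, keep_code = FALSE) {
  out <- df
  out[[paste0(column, ".label")]] <- unname(dic[df[[column]]])
  if (!keep_code) out[[column]] <- NULL
  out
}

labelled_only <- label_column(dat, "sex", dic)
labelled_both <- label_column(dat, "sex", dic, keep_code = TRUE)

print(names(labelled_only))
print(names(labelled_both))
```

The time and value columns pass through untouched in both variants, mirroring the behaviour described above.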
David Bescond [email protected]
See citation("Rilostat")
ilostat bulk download facility user guidelines https://webapps.ilo.org/ilostat-files/Documents/ILOSTAT_BulkDownload_Guidelines.pdf
## Not run: 
dat <- get_ilostat("UNE_2UNE_SEX_AGE_NB_A", cache = FALSE)
dat_lab <- label_ilostat(dat)
head(dat_lab)

# add just ref_area label
require(tidyverse)
dat <- get_ilostat("UNE_TUNE_SEX_AGE_NB_A") %>%
  mutate(ref_area.label = ref_area %>% label_ilostat("ref_area", code = "all"),
         .after = ref_area)
clean_ilostat_cache()
## End(Not run)
New tutorials and examples are built on a regular basis and made available through this function.
Brief description of the package
Package: | Rilostat |
Type: | Package |
Version: | See sessionInfo() or DESCRIPTION file |
Date: | 2020-2025 |
License: | BSD_2_clause + LICENSE |
LazyLoad: | yes |
R Tools for ilostat Open Data
David Bescond [email protected]
See citation("Rilostat")
ilostat bulk download facility user guidelines https://webapps.ilo.org/ilostat-files/Documents/ILOSTAT_BulkDownload_Guidelines.pdf
## Not run: 
# check which documentation has been recently added:
# help(Rilostat)
# https://ilostat.github.io/Rilostat/
## End(Not run)