• Published: 22 Oct 2015

inegiR

Overview

inegiR is a package designed to interact with the two APIs of INEGI (the official statistics agency of Mexico). Because these APIs return JSON or XML, the package is essentially a wrapper around jsonlite and XML, plus some tidying transformations with plyr.

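The package uses two main functions, both shown in the examples below: serie_inegi(), which downloads a data series, and denue_inegi(), which queries the DENUE business directory.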

The remaining functions are convenience wrappers for common tasks. For example, inflacion_general() downloads monthly inflation data. Other functions make on-the-fly transformations easier, such as YoY(), which calculates the percentage change from a year earlier (year-over-year).


Example 1: downloading a data series

Install

To get the CRAN version (as of Nov-2015):

install.packages("inegiR")
library(inegiR)

To install the development version from GitHub, using devtools:

#install.packages("devtools")
library(devtools)
install_github("Eflores89/inegiR")
  # dependencies: zoo, XML, plyr, jsonlite
library(inegiR)

Download data

There are roughly two ways to download data series: the “general” and the “short” way (provided there is a wrapper function available).

In the first case, the function parses a URL provided by the user. The URLs for each data series can be found on the INEGI developer site. You must also sign up for an API token on that same site with your email.

Let us save the imaginary token:

token <- "abc123"
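
Hard-coding a token in a script makes it easy to leak. One common alternative, sketched here, is to read it from an environment variable. The variable name INEGI_TOKEN is an assumption for illustration, not something the package requires.

```r
# Hypothetical variable name; the package does not prescribe one.
# It is set here only so the example runs; normally it would live in ~/.Renviron.
Sys.setenv(INEGI_TOKEN = "abc123")

# unset = NA makes a missing variable easy to detect
token_env <- Sys.getenv("INEGI_TOKEN", unset = NA)
if (is.na(token_env) || !nzchar(token_env)) {
  stop("INEGI_TOKEN is not set")
}
token_env
```

The resulting string can be passed wherever the examples use token.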

Now, suppose we want the rate of inflation, which in the case of INEGI is the percent change of the INPC (the national consumer price index) data series.

This is the corresponding URL for the INPC data series:

urlINPC <- "http://www3.inegi.org.mx/sistemas/api/indicadores/v1//Indicador/216064/00000/es/false/xml/"

The JSON format is also accepted and the two are interchangeable (do not append the “?callback?” suffix shown in INEGI’s documentation):

urlINPC2 <- "http://www3.inegi.org.mx/sistemas/api/indicadores/v1//Indicador/216064/00000/es/false/json/"
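
Since the two URLs differ only in their final path segment, one can be derived from the other. A minimal base R sketch, assuming the format is always the trailing segment, as it is in the two URLs above:

```r
urlINPC <- "http://www3.inegi.org.mx/sistemas/api/indicadores/v1//Indicador/216064/00000/es/false/xml/"

# Swap the trailing "/xml/" segment for "/json/"; the rest of the URL is untouched.
urlINPC_json <- sub("/xml/$", "/json/", urlINPC)
urlINPC_json
```

The name urlINPC_json is only illustrative; the result is the same string as urlINPC2.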

Now, we are going to download this data series as a data.frame.

INPC <- serie_inegi(urlINPC, token)

# take a look
tail(INPC)
# Fechas         Valores
# 2014-12-01   116.05900000
# 2015-01-01   115.95400000
# 2015-02-01   116.17400000
# 2015-03-01   116.64700000
# 2015-04-01   116.34500000
# 2015-05-01   115.76400000

The optional “metadata” parameter in serie_inegi allows us to also download the metadata information from the data series, which includes date of update, units, frequency, etc.

If “metadata” is set to TRUE, the information is parsed as a list of two elements: the metadata and the data frame.

INPC_Metadata <- serie_inegi(urlINPC, token, metadata = TRUE)
class(INPC_Metadata)
# [1] "list"

To access any of these elements, index the result like any other list:

# date of last update
INPC_Metadata$MetaData$UltimaActualizacion
# [1] "2015/06/09"
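
To explore this kind of list without calling the API, here is a sketch on a hand-built stand-in. Only MetaData and UltimaActualizacion come from the output above; the element name Datos and the stand-in itself are assumptions for illustration.

```r
# Hand-built stand-in for the list returned with metadata = TRUE.
# "Datos" is an assumed name for the data frame element.
mock <- list(
  MetaData = list(UltimaActualizacion = "2015/06/09"),
  Datos = data.frame(
    Fechas  = as.Date(c("2015-04-01", "2015-05-01")),
    Valores = c(116.345, 115.764)
  )
)

names(mock)                          # element names at the top level
mock$MetaData$UltimaActualizacion    # same access pattern as above
last_value <- tail(mock$Datos$Valores, 1)
last_value
```

Running names() or str() on the real result is a quick way to check the actual element names before relying on them.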

Now that we have the INPC data series, we must apply a year-over-year change. For this we use the handy YoY() function, which lets us choose the number of periods to compare over (12 for a year-over-year change on a monthly series):

Inflation <- YoY(INPC$Valores, 
                 lapso = 12, 
                 decimal = FALSE)

# to get a data frame, bind the dates and the values together
Inflation_df <- cbind.data.frame(Fechas = INPC$Fechas, 
                                 Inflation = Inflation)

tail(Inflation_df)
# Fechas        Inflation
# 2014-12-01    4.081322
# 2015-01-01    3.065642
# 2015-02-01    3.000266
# 2015-03-01    3.137075
# 2015-04-01    3.062327
# 2015-05-01    2.876643
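
For intuition, the year-over-year change at period t compares the value with the one lapso periods earlier: (x[t] / x[t - lapso] - 1) * 100. The base R sketch below illustrates that arithmetic only. It is not the package's implementation; yoy_sketch is a hypothetical helper and the input series is made up.

```r
# Hypothetical helper: percent change versus the value `lag` periods earlier.
# The first `lag` positions have no earlier value to compare with, so they are NA.
yoy_sketch <- function(x, lag = 12) {
  n <- length(x)
  if (n <= lag) return(rep(NA_real_, n))
  change <- (x[(lag + 1):n] / x[1:(n - lag)] - 1) * 100
  c(rep(NA_real_, lag), change)
}

# Made-up series: 100, 101, ..., 112 (13 monthly values)
x <- 100 + 0:12
res <- yoy_sketch(x, lag = 12)
res   # twelve NAs, then the change from 100 to 112
```

With a monthly series, only the thirteenth observation onward has a value from twelve months earlier to compare against.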

This method works for any URL obtained from the INEGI documentation, but for the most used indicators, the package has built-in shortcut wrappers.

Let us obtain the same data series (inflation) via one of these specified shortcut functions:

Inflation_fast <- inflacion_general(token)
tail(Inflation_fast)
# Fechas        Inflacion
# 2014-12-01    4.081322
# 2015-01-01    3.065642
# 2015-02-01    3.000266
# 2015-03-01    3.137075
# 2015-04-01    3.062327
# 2015-05-01    2.876643

Example 2: downloading statistics from DENUE

The DENUE is a directory of businesses in Mexico and is accessible through a separate INEGI API. A different API token is used for these queries.

token_denue <- "abcdef1234"

To download the businesses within a certain radius, we need a pair of coordinates. Let’s use the ones for the main square of Monterrey, Mexico:

latitud_macro <- "25.669194"
longitud_macro <- "-100.309901"

Now we download the list of businesses within a 250-meter radius into a data.frame.

NegociosMacro <- denue_inegi(latitud = latitud_macro, 
                             longitud = longitud_macro, 
                             token_denue)

Let’s see only the first rows and columns…

head(NegociosMacro)[,1:2]
#     id                                       Nombre
# 2918696                   ESTACIONAMIENTO GRAN PLAZA
# 2918698             TEATRO DE LA CIUDAD DE MONTERREY
# 2918723                           CONGRESO DE ESTADO
# 2918793               SECRETARIA DE SALUD DEL ESTADO
# 2974150                           BIBLIOTECA CENTRAL
# 2974215      SOTANO RECURSOS HUMANOS Y ADQUISICIONES

Other parameters can be changed as well. For example, to use a 1 km radius and keep only businesses with “Restaurante” in the description:

RestaurantsMacro <- denue_inegi(latitud = latitud_macro, 
                                longitud = longitud_macro, 
                                token_denue, 
                                metros = 1000, 
                                keyword = "Restaurante")
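
A data frame that has already been downloaded can also be narrowed locally with base R, without another API call. The sketch below rebuilds by hand the six rows printed earlier (it does not query DENUE) and keeps the names containing "ESTADO".

```r
# Hand-copied from the head(NegociosMacro) output above; no API call is made.
negocios <- data.frame(
  id = c(2918696, 2918698, 2918723, 2918793, 2974150, 2974215),
  Nombre = c("ESTACIONAMIENTO GRAN PLAZA",
             "TEATRO DE LA CIUDAD DE MONTERREY",
             "CONGRESO DE ESTADO",
             "SECRETARIA DE SALUD DEL ESTADO",
             "BIBLIOTECA CENTRAL",
             "SOTANO RECURSOS HUMANOS Y ADQUISICIONES"),
  stringsAsFactors = FALSE
)

# fixed = TRUE treats the pattern as a literal string, not a regular expression.
estado <- negocios[grepl("ESTADO", negocios$Nombre, fixed = TRUE), ]
estado$Nombre
```

The same pattern works on any text column of the downloaded data frame.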