inegiR version 1.2
Version 1.2 of inegiR is now on CRAN, so I thought I’d write a few words (a short vignette of sorts) about what’s new or different. By the way, I’m writing in English because more people seem to read r-bloggers than my blog (no surprise there); however, the PDF manual and most of the documentation are still in Spanish.
Thanks to Diego Valle for reporting a slight bug: the less common date frequencies (“bienal” and “decenal”) were not being parsed correctly.
I also added warnings and error handling for when the data doesn’t exist for municipalities (issue is here).
Thanks to Arturo Cardenas, who unwittingly built a new function for the DENUE part of the package that’s incorporated in this version.
As he wrote in his blog, the DENUE API only allows us to download businesses within a radius of at most 5 kilometers. However, we can get around this limitation by querying the API at a series of coordinates whose circles we know overlap each other, so that together they cover a larger square. This is a picture, taken from that post, detailing what I mean:
Each circle is, of course, 5 km in radius, so the API gives us everything inside it.
The hacer_grid() function helps us in the process by creating a data.frame with a series of coordinates that form a grid like the one in the image, if we supply it with the latitude and longitude of two corners.
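To make the grid idea concrete, here is a minimal sketch in base R. It is not the package’s hacer_grid() implementation: the function name grid_sketch(), its arguments, the spacing rule and the corner coordinates are all assumptions for illustration. It spaces the query centers r·√2 apart, which is the widest square-grid spacing at which circles of radius r still leave no gaps.

```r
# Hypothetical helper (NOT inegiR's hacer_grid()): builds query centers
# between two corners so that circles of `radio_km` overlap.
grid_sketch <- function(lat1, lon1, lat2, lon2, radio_km = 5) {
  paso_km <- radio_km * sqrt(2)            # spacing that leaves no gaps
  km_por_grado_lat <- 111.32               # rough km per degree of latitude
  lat_media <- (lat1 + lat2) / 2
  km_por_grado_lon <- 111.32 * cos(lat_media * pi / 180)

  paso_lat <- paso_km / km_por_grado_lat   # step in degrees of latitude
  paso_lon <- paso_km / km_por_grado_lon   # step in degrees of longitude

  # Enough points so the last center reaches or passes the far corner
  n_lat <- ceiling(abs(lat2 - lat1) / paso_lat) + 1
  n_lon <- ceiling(abs(lon2 - lon1) / paso_lon) + 1
  lats <- min(lat1, lat2) + (seq_len(n_lat) - 1) * paso_lat
  lons <- min(lon1, lon2) + (seq_len(n_lon) - 1) * paso_lon

  # One row per query center
  expand.grid(latitud = lats, longitud = lons)
}

# Illustrative corners near Monterrey (not the ones used in this post)
centros <- grid_sketch(25.60, -100.45, 25.70, -100.30)
head(centros)
```

Each row would then be one call to the API with a 5 km radius. The kilometers-per-degree conversion is only approximate, which is fine here because the circles overlap anyway.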
But the more powerful denue_grid() does the interesting part. Using the former function, it also downloads the DENUE data and returns a data.frame of unique businesses in that grid (if you want duplicates as well, you can turn off the deduplication by setting the unicos = FALSE parameter).
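Because the circles overlap, the same business can come back from more than one query, which is why deduplication matters. The sketch below shows the idea in base R with made-up data; combinar() and the id column are hypothetical names, not part of inegiR, and only the unicos argument name is borrowed from the package.

```r
# Made-up results from two overlapping queries; `id` is an assumed
# business identifier column.
consulta_a <- data.frame(id = c(1, 2, 3), nombre = c("A", "B", "C"))
consulta_b <- data.frame(id = c(3, 4), nombre = c("C", "D"))

# Hypothetical helper: stack all results, optionally dropping repeats
combinar <- function(resultados, unicos = TRUE) {
  todos <- do.call(rbind, resultados)
  if (unicos) {
    # Keep the first occurrence of each business id
    todos <- todos[!duplicated(todos$id), , drop = FALSE]
  }
  rownames(todos) <- NULL
  todos
}

negocios <- combinar(list(consulta_a, consulta_b))
con_duplicados <- combinar(list(consulta_a, consulta_b), unicos = FALSE)
```

Here negocios keeps business 3 once, while con_duplicados keeps both copies.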
Example with Grids
Here is an example with the city of Monterrey. Let’s say I want all the businesses in San Pedro (a municipality that is part of the metropolitan area).
The total area is roughly 45 km², give or take (I know this is not geographically accurate):
I feed the upper-right and lower-left corner coordinates to the function, and voilà:
Simple as that!
Using two fairly consistent surveys that INEGI conducts on a monthly basis, I added two functions to calculate productivity by state in two important industries.
In both cases, productivity is defined as the total value produced in the state divided by the total number of people employed in that industry in the state. Bear in mind that value produced is in thousands of pesos, so 100 would be equal to 100 thousand pesos “produced” by each person.
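As a quick check on the units, the definition above can be written out by hand. The function and the numbers below are invented for illustration only; they are not the package’s functions or real INEGI figures.

```r
# Hypothetical helper: value produced (thousands of pesos) per employed person
productividad <- function(valor_miles_pesos, personal_ocupado) {
  valor_miles_pesos / personal_ocupado
}

# Invented figures: 500,000 thousand pesos produced by 5,000 people
ejemplo <- productividad(valor_miles_pesos = 500000, personal_ocupado = 5000)
ejemplo          # 100, i.e. 100 thousand pesos per person
ejemplo * 1000   # the same figure expressed in pesos
```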
We can simply get a time series by doing the following:
These last two examples lead me to another point: the state names used in the functions have changed. In the first version, the state of Nuevo León was “NuevoLeon”; it has been changed to “NL”. This is more concise, easier to read and consistent with the recent constitutional name change for Mexico City (it is now “CDMX”, as opposed to “DF”).
The other advantage is that these names will be consistent with Diego Valle’s mxmaps package for easily making choropleth maps (it’s available here). That package includes a nifty function to make these using inegiR, but now you can do this both ways!
To switch between the “old names” and the new ones, I’ve left the following catalog here:
|Name of State|Previous Name|New Name|
|---|---|---|
|Baja California Sur|BajaCaliforniaSur|BCS|
|Estado de México|EdoMexico|MEX|
|San Luis Potosí|SanLuisPotosi|SLP|
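If you have old code lying around, a named vector is a simple way to translate the names. This is only a sketch: it contains just the pairs mentioned in this post (not the full catalog), and actualizar_nombre() is a hypothetical helper, not a function in inegiR.

```r
# Old name -> new name, limited to the pairs mentioned in this post
nombres_nuevos <- c(
  NuevoLeon         = "NL",
  DF                = "CDMX",
  BajaCaliforniaSur = "BCS",
  EdoMexico         = "MEX",
  SanLuisPotosi     = "SLP"
)

# Hypothetical helper: translate old names, leaving unknown names untouched
actualizar_nombre <- function(x) {
  nuevo <- unname(nombres_nuevos[x])   # NA when the name is not in the lookup
  ifelse(is.na(nuevo), x, nuevo)
}

convertidos <- actualizar_nombre(c("NuevoLeon", "EdoMexico", "Jalisco"))
convertidos
```

Names that are not in the lookup (like “Jalisco” here) pass through unchanged, so the helper is safe to run on a mix of old and new names.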