GeoQuetzal

Geographic and census data for Guatemala.

GeoQuetzal gives researchers programmatic access to Guatemala’s administrative boundaries and 2018 Census microdata (INE), following the same philosophy as tidycensus for the US and geobr for Brazil.


What is GeoQuetzal?

GeoQuetzal brings together three things that normally require manual work:

  • Administrative boundaries: 22 departments and 340 municipalities with correct INE codes, ready to join with any dataset that uses the same codes for departments and municipalities.
  • Census microdata: persons, households, housing units, and emigration from the 2018 Census, partitioned by department for efficient downloads. Data can be loaded at the department or municipality level.
  • Sub-municipal data: 20,254 lugares poblados with pre-aggregated indicators for demographics, ethnicity, education, housing, and services.

Geographic levels in GeoQuetzal:

  • 1 country: national outline · gq.country()
  • 22 departments: major administrative units · gq.departamentos()
  • 340 municipalities: minor administrative units · gq.municipios()
  • 20,254 lugares poblados: pre-aggregated indicators · gq.lugares_poblados()

First sub-municipal database for Guatemala and Central America.
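
Joining an external table on INE codes, as the first bullet describes, amounts to a key lookup once both sides store the code the same way. The sketch below uses only the standard library. The department codes 1 and 3 match the download logs further down this page (Guatemala, Sacatepéquez), but the integer key type, the helper names, and the `clinics` values are assumptions made up for illustration, not census data or part of the GeoQuetzal API.

```python
# Department names keyed by INE code (names as they appear in the logs on this page).
departments = {1: "Guatemala", 3: "Sacatepéquez"}

# Hypothetical external dataset; the values are placeholders, not real statistics.
external_rows = [
    {"codigo_depto": "03", "clinics": 12},
    {"codigo_depto": "1", "clinics": 40},
    {"codigo_depto": "99", "clinics": 5},  # code with no matching department
]


def normalize_code(code):
    """Turn '03', ' 3 ' or 3 into the integer 3 so keys compare equal."""
    return int(str(code).strip())


def join_on_code(rows, lookup, key="codigo_depto"):
    """Inner-join rows to the lookup; also return rows that found no match."""
    matched, unmatched = [], []
    for row in rows:
        code = normalize_code(row[key])
        if code in lookup:
            matched.append({**row, key: code, "departamento": lookup[code]})
        else:
            unmatched.append(row)
    return matched, unmatched


matched, unmatched = join_on_code(external_rows, departments)
```

Inspecting `unmatched` before mapping is a cheap way to catch codes stored as zero-padded strings on one side and integers on the other.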

Installation

Python:
pip install geoquetzal

R:
devtools::install_github("geoquetzal/geoquetzal-r")

Example

GeoQuetzal supports analysis and visualization of census variables at the country, department, and municipality levels. Mapping a variable shows geographic variation that a single national figure hides. The examples below answer one question at each level: what share of households in Guatemala has Internet access? In the interactive maps, hover over an area to see its name and percentage.

Country-level analysis

import geoquetzal as gq

# Administrative boundaries, instant loading, no Internet required
deptos = gq.departamentos()

# Census microdata
df = gq.hogares()  # downloads all 22 departments; may take a while on a slow connection
⬇  Descargando hogares para los 22 departamentos...
   ✓ 3,275,931 hogares cargados
# % of households with Internet per department
acceso_internet = (
    df.groupby("DEPARTAMENTO")["PCH9_I"]
    .apply(lambda x: (x == 1).mean() * 100)
    .round(1)
    .reset_index(name="pct_internet")
)

# Join with geometry and visualize
resultado = deptos.merge(acceso_internet, left_on="codigo_depto", right_on="DEPARTAMENTO")
resultado.explore(
    column="pct_internet",
    cmap="YlGnBu",
    tooltip=["departamento", "pct_internet"],
    tooltip_kwds={"aliases": ["Departamento", "% Internet"]},
    tiles="CartoDB positron",
    style_kwds={"weight": 1, "color": "white", "fillOpacity": 0.8},
)
if (!requireNamespace("geoquetzal", quietly = TRUE)) {
  remotes::install_github("geoquetzal/geoquetzal-r")
}
library(geoquetzal)
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library(mapview)

# Administrative boundaries: load instantly, no Internet connection required
deptos <- departamentos()

# Census microdata
df <- hogares()  # all departments
⬇  Descargando hogar para los 22 departamentos...
⬇  Descargando hogar depto 01 (Guatemala)...
⬇  Descargando hogar depto 02 (El Progreso)...
⬇  Descargando hogar depto 03 (Sacatepéquez)...
⬇  Descargando hogar depto 04 (Chimaltenango)...
⬇  Descargando hogar depto 05 (Escuintla)...
⬇  Descargando hogar depto 06 (Santa Rosa)...
⬇  Descargando hogar depto 07 (Sololá)...
⬇  Descargando hogar depto 08 (Totonicapán)...
⬇  Descargando hogar depto 09 (Quetzaltenango)...
⬇  Descargando hogar depto 10 (Suchitepéquez)...
⬇  Descargando hogar depto 11 (Retalhuleu)...
⬇  Descargando hogar depto 12 (San Marcos)...
⬇  Descargando hogar depto 13 (Huehuetenango)...
⬇  Descargando hogar depto 14 (Quiché)...
⬇  Descargando hogar depto 15 (Baja Verapaz)...
⬇  Descargando hogar depto 16 (Alta Verapaz)...
⬇  Descargando hogar depto 17 (Petén)...
⬇  Descargando hogar depto 18 (Izabal)...
⬇  Descargando hogar depto 19 (Zacapa)...
⬇  Descargando hogar depto 20 (Chiquimula)...
⬇  Descargando hogar depto 21 (Jalapa)...
⬇  Descargando hogar depto 22 (Jutiapa)...
   ✓ 3,275,931 registros cargados
# % of households with Internet access per department
acceso_internet <- df |>
  group_by(DEPARTAMENTO) |>
  summarise(pct_internet = round(mean(PCH9_I == 1, na.rm = TRUE) * 100, 1))

# Merge with geometry, visualization
resultado <- merge(deptos, acceso_internet,
                   by.x = "codigo_depto", by.y = "DEPARTAMENTO")

mapview(resultado,
        zcol        = "pct_internet",
        layer.name  = "% Internet",
        col.regions = RColorBrewer::brewer.pal(9, "YlGnBu"),
        map.types   = "CartoDB.Positron",
        label       = "departamento")
Warning: Found less unique colors (9) than unique zcol values (19)! 
Interpolating color vector to match number of zcol values.
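
The Python and R versions above may not give identical percentages if PCH9_I contains missing values. With a pandas column that stores missing values as NaN, `x == 1` evaluates to False for those rows, so they stay in the denominator; in R, `NA == 1` is `NA` and `na.rm = TRUE` drops those rows. Whether the column actually has missing values is not stated on this page. The sketch below reproduces both conventions in plain Python; the function name, the coding `2` for "no", and the sample values are hypothetical.

```python
import math


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def pct_equal_to(values, target=1, skip_missing=False):
    """Percentage of values equal to target, rounded to one decimal."""
    if skip_missing:
        values = [v for v in values if not is_missing(v)]
    if not values:
        return float("nan")
    hits = sum(1 for v in values if not is_missing(v) and v == target)
    return round(hits / len(values) * 100, 1)


# Hypothetical responses: 1 = has Internet, 2 = does not, None = no answer.
sample = [1, 2, 1, None]

keep_missing = pct_equal_to(sample)                     # missing rows stay in the denominator
drop_missing = pct_equal_to(sample, skip_missing=True)  # missing rows are removed first
```

Here `keep_missing` is 50.0 and `drop_missing` is 66.7, so it is worth deciding which denominator you want before comparing results across the two languages.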

Department-level analysis (Sacatepéquez)

import geoquetzal as gq

# Administrative boundaries — instant load, no internet needed
deptos = gq.departamentos()

# Census microdata — downloads only the department you need
df = gq.hogares(departamento="Sacatepequez")

# % of households with internet per municipality
acceso_internet = (
    df.groupby("MUNICIPIO")["PCH9_I"]
    .apply(lambda x: (x == 1).mean() * 100)
    .round(1)
    .reset_index(name="pct_internet")
)

# Join with geometry and visualize
munis = gq.municipios("Sacatepequez")
resultado = munis.merge(acceso_internet, left_on="codigo_muni", right_on="MUNICIPIO")
resultado.explore(
    column="pct_internet",
    cmap="YlGnBu",
    tooltip=["municipio", "pct_internet"],
    tooltip_kwds={"aliases": ["Municipality", "% Internet"]},
    tiles="CartoDB positron",
    style_kwds={"weight": 1, "color": "white", "fillOpacity": 0.8},
)
if (!requireNamespace("geoquetzal", quietly = TRUE)) {
  remotes::install_github("geoquetzal/geoquetzal-r")
}
library(geoquetzal)
library(dplyr)
library(mapview)

# Administrative boundaries: load instantly, no Internet connection required
municipios <- municipios()

# Census microdata for Sacatepéquez
df <- hogares("Sacatepéquez")  # Department of Sacatepéquez
⬇  Descargando hogar depto 03 (Sacatepéquez)...
# % of households with Internet access per municipality
acceso_internet <- df |>
  group_by(MUNICIPIO) |>
  summarise(pct_internet = round(mean(PCH9_I == 1, na.rm = TRUE) * 100, 1))

# Merge with geometry, visualization
resultado <- merge(municipios, acceso_internet,
                   by.x = "codigo_muni", by.y = "MUNICIPIO")

mapview(resultado,
        zcol        = "pct_internet",
        layer.name  = "% Internet",
        col.regions = RColorBrewer::brewer.pal(9, "YlGnBu"),
        map.types   = "CartoDB.Positron",
        label       = "municipio")
Warning: Found less unique colors (9) than unique zcol values (16)! 
Interpolating color vector to match number of zcol values.
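
The Python example above passes "Sacatepequez" without an accent, while the R example passes "Sacatepéquez". This page does not say how the library matches names. If you need to match department or municipality names from your own data, one common approach is to compare accent-stripped, case-folded strings. The helper below is a sketch of that approach and is not part of the GeoQuetzal API.

```python
import unicodedata


def normalize_name(name):
    """Strip accents, fold case, and collapse whitespace for comparison."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_marks.casefold().split())


def same_place(a, b):
    """True when two names are equal after normalization."""
    return normalize_name(a) == normalize_name(b)


matches = same_place("Sacatepequez", "  Sacatepéquez ")
```

This normalization also maps "ñ" to "n", so joining on INE codes remains the safer choice whenever codes are available.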

Municipality-level analysis (Antigua Guatemala, Sacatepéquez)

import geoquetzal as gq

# Voronoi polygons — approximation of sub-municipal boundaries
vor = gq.voronoi_lugares_poblados(municipio="Antigua Guatemala")
   ℹ 2 lugares poblados excluidos por coordenadas nulas (códigos terminados en 999 — asentamientos sin nombre oficial).
   ✓ 57 polígonos Voronoi generados
# Pre-aggregated indicators per lugar poblado
lp = gq.lugares_poblados(municipio="Antigua Guatemala")
lp = lp.drop(columns=["nombre", "lat", "longitud", "area"])

# Join
gdf = vor.merge(lp, on=["departamento", "municipio", "lugar_poblado"])

# % of households with internet
gdf["pct_internet"] = (
    gdf["pch9_i_si"] / (gdf["pch9_i_si"] + gdf["pch9_i_no"]) * 100
).round(1)

# Interactive map
gdf.explore(
    column="pct_internet",
    cmap="YlGnBu",
    tooltip=["nombre", "pct_internet"],
    tooltip_kwds={"aliases": ["Lugar Poblado", "% with Internet"]},
    popup=["nombre", "pct_internet", "pch9_i_si", "pch9_i_no"],
    popup_kwds={"aliases": [
        "Lugar Poblado", "% with Internet",
        "HH with Internet", "HH without Internet"
    ]},
    legend=True,
    tiles="CartoDB positron",
    style_kwds={"weight": 0.5, "color": "white"},
)
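
The percentage above divides by the sum of the two counts. For a lugar poblado where both counts are zero, that division gives NaN in pandas rather than raising an error. A minimal sketch of the same formula for a single pair of counts, with the empty case handled explicitly (the function name is hypothetical and the numbers are illustrative, not census values):

```python
def pct_with_internet(yes, no):
    """Share of households with Internet, in percent, or None if there are no households."""
    total = yes + no
    if total == 0:
        return None  # no households reported, so the rate is undefined
    return round(yes / total * 100, 1)


example = pct_with_internet(30, 10)  # 30 of 40 households
empty = pct_with_internet(0, 0)      # undefined rate
```
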
if (!requireNamespace("geoquetzal", quietly = TRUE)) {
  remotes::install_github("geoquetzal/geoquetzal-r")
}
library(geoquetzal)
library(mapview)

# Voronoi polygons: approximation of sub-municipal boundaries
vor <- voronoi_lugares_poblados(municipio = "Antigua Guatemala")
Spherical geometry (s2) switched off
⬇  Descargando lugares poblados depto 03 (Sacatepéquez)...
   ✓ 264 lugares poblados cargados
   ℹ 2 lugares poblados excluidos por coordenadas nulas (códigos terminados en 999).
   ✓ 57 polígonos Voronoi generados
Spherical geometry (s2) switched on
# Census data pre-aggregated by lugar poblado
lp <- lugares_poblados(municipio = "Antigua Guatemala")
⬇  Descargando lugares poblados depto 03 (Sacatepéquez)...
   ✓ 264 lugares poblados cargados
# Drop duplicate columns before the merge
lp <- lp[, !names(lp) %in% c("nombre", "lat", "longitud", "area")]

# Join
gdf <- merge(vor, lp, by = c("departamento", "municipio", "lugar_poblado"))

# Compute % of households with Internet access
gdf$pct_internet <- round(
  gdf$pch9_i_si / (gdf$pch9_i_si + gdf$pch9_i_no) * 100,
  1
)

# Interactive map
mapview(gdf,
        zcol        = "pct_internet",
        col.regions = RColorBrewer::brewer.pal(9, "YlGnBu"),
        map.types   = "CartoDB.Positron",
        layer.name  = "% Internet",
        label       = "nombre")
Warning: Found less unique colors (9) than unique zcol values (50)! 
Interpolating color vector to match number of zcol values.
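
The Voronoi polygons used in this section are an approximation: each polygon covers the area that is closer to its lugar poblado than to any other, which is not the same as an official boundary. The sketch below illustrates that rule with plain planar distance. The site names and coordinates are hypothetical, and this is not how the library builds its polygons, only the property those polygons satisfy.

```python
import math

# Hypothetical point locations (x, y); not real lugar poblado coordinates.
sites = {"A": (0.0, 0.0), "B": (4.0, 0.0), "C": (0.0, 3.0)}


def nearest_site(point, sites):
    """Name of the site whose Voronoi cell contains the point."""
    return min(sites, key=lambda name: math.dist(point, sites[name]))


cell_1 = nearest_site((1.0, 0.5), sites)
cell_2 = nearest_site((3.0, 0.5), sites)
cell_3 = nearest_site((0.5, 2.5), sites)
```

Because each cell depends only on the point locations, a lugar poblado with missing coordinates cannot receive a polygon, which is consistent with the two exclusions reported in the logs above.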

Available datasets

Dataset                   | Records                             | Storage
Administrative boundaries | 22 departments / 340 municipalities | Bundled in package
Persons                   | 14,901,286                          | GitHub (~333 MB)
Households                | 3,275,931                           | GitHub (~38 MB)
Housing units             | ~3,300,000                          | GitHub (~30 MB)
Emigration                | 242,203                             | GitHub (~1.6 MB)
Lugares poblados          | 20,254                              | GitHub

Authors

Developed by Jorge Yass and Anasilvia Salazar, lecturers at Universidad del Valle de Guatemala (UVG) and PhD students at Iowa State University (ISU).

📦 PyPI · Python GitHub · R GitHub · 📄 MIT License