Examples: Other Data + Maps

This section combines external data sources with geographic visualizations. The general pattern is to aggregate microdata first, then merge the result with geometry. For the merge to work, the data must be at the department or municipality level and keyed by numeric code or name.

Tip

Key rule: Always aggregate first (with pandas in Python, with dplyr in R), then merge geometry onto the summary rows. Never use geometry= directly on large microdata — it attaches a polygon to every row and is very slow.
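The rule above can be sketched with plain pandas. The microdata columns (`depto_code`, `literate`) and the small `boundaries` table are hypothetical stand-ins used only for illustration; in practice the boundary table would be the GeoDataFrame returned by `gq.departamentos()`, and its key column may have a different name.

```python
import pandas as pd

# Hypothetical person-level microdata: one row per respondent.
microdata = pd.DataFrame({
    "depto_code": [1, 1, 1, 2, 2, 3],
    "literate":   [1, 0, 1, 1, 1, 0],
})

# Stand-in for the boundary table (one row per department).
# A real GeoDataFrame would carry polygons in its geometry column.
boundaries = pd.DataFrame({
    "depto_code": [1, 2, 3],
    "geometry":   ["POLYGON_1", "POLYGON_2", "POLYGON_3"],
})

# Step 1: aggregate the microdata down to one row per department.
summary = (
    microdata.groupby("depto_code", as_index=False)
    .agg(pct_literate=("literate", "mean"), n=("literate", "size"))
)
summary["pct_literate"] *= 100

# Step 2: merge geometry onto the summary (3 rows), not the microdata (6 rows).
result = boundaries.merge(summary, on="depto_code", how="left")
print(result)
```

Because the merge happens after aggregation, each polygon is attached once per department rather than once per respondent.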


Guatemala’s Literacy Rate Projections for 2024

In this example, we work with CONALFA's public dataset of literacy and illiteracy projections for 2024. The data source can be downloaded from this link. The first sheet, “DEPARTAMENTO”, contains the projections at the department level, with the department names in the first column. We use those names to merge the projections with the administrative boundary data from GeoQuetzal.

import geoquetzal as gq
import pandas as pd

# CONALFA data
xlsx = pd.ExcelFile("PROYECCION-ANALFABETISMO-POR-DEPARTAMENTO-Y-MUNICIPIO-GALP-2024.xlsx")
projections_df = xlsx.parse('DEPARTAMENTO', skiprows=5, usecols="A,C")
projections_df.columns = ['Depto_Name', 'Pct_literacy']

# Administrative boundaries — instant loading, no internet needed
deptos = gq.departamentos()

# Merge with geometry and visualize
resultado = deptos.merge(projections_df, left_on="departamento", right_on="Depto_Name")

resultado.explore(
    column="Pct_literacy",
    cmap="YlGnBu",
    tooltip=["Depto_Name", "Pct_literacy"],
    tooltip_kwds={"aliases": ["Department", "% Literacy"]},
    tiles="CartoDB positron",
    style_kwds={"weight": 1, "color": "white", "fillOpacity": 0.8},
)
The equivalent workflow in R:
library(geoquetzal)
library(readxl)
library(mapview)

# CONALFA data
projections_df <- read_excel(
  "PROYECCION-ANALFABETISMO-POR-DEPARTAMENTO-Y-MUNICIPIO-GALP-2024.xlsx",
  sheet     = "DEPARTAMENTO",
  skip      = 5,
  col_names = TRUE  # treat the first row after the skip as a header, as pandas does
)[, c(1, 3)]
names(projections_df) <- c("Depto_Name", "Pct_literacy")

# Administrative boundaries — instant loading, no internet needed
deptos <- departamentos()

# Merge with geometry and visualize
resultado <- merge(deptos, projections_df,
                   by.x = "departamento", by.y = "Depto_Name")

mapview(resultado,
        zcol        = "Pct_literacy",
        col.regions = RColorBrewer::brewer.pal(9, "YlGnBu"),
        layer.name  = "% Literacy",
        map.types   = "CartoDB.Positron",
        label       = "departamento")

As this example shows, working with external data sources greatly expands the range of scenarios GeoQuetzal can handle. The only requirement is that the department and municipality numeric codes or names in the external data match those used by the INE.
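When merging by name, small differences in accents, casing, or spacing are enough to drop a row from the merge. The sketch below shows one possible way to compare the two sets of names before merging. The helper `normalize_name` and both name lists are hypothetical and not part of GeoQuetzal; whether normalization is needed at all depends on how the external file spells each name.

```python
import unicodedata

def normalize_name(name: str) -> str:
    """Uppercase, strip accents, and collapse whitespace in a place name."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.upper().split())

# Hypothetical name lists: one from the boundary table, one from an external file.
boundary_names = ["Quetzaltenango", "Sololá", "Petén", "Guatemala"]
external_names = ["QUETZALTENANGO", "solola ", "Peten", "Totonicapan"]

boundary_keys = {normalize_name(n) for n in boundary_names}
external_keys = {normalize_name(n) for n in external_names}

# Names present on one side only would be dropped by an inner merge.
print("Missing from external data:", sorted(boundary_keys - external_keys))
print("Unknown in boundaries:", sorted(external_keys - boundary_keys))
```

If either list is non-empty, the mismatched names can be corrected in the external data, or the merge can be done on a normalized key column added to both tables.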