Analysis of missing persons in Mexico

Project Overview

This project was developed as the final assignment for Module IV of the diploma program Introduction to Data Science: Tools for Machine Learning in the Social Sciences and Humanities. It is a web application designed to bring visibility to, and support the analysis of, information about missing persons in Mexico. Its interactive interface lets users explore trends by age, gender, location, and more. The platform was built with Python, Flask, and modern web technologies, and is freely accessible to the public.
The project also invites critical reflection on the quality and meaning of the available data.

To view the project, please visit the following link.
The project is hosted on GitHub, where you can find the source code and documentation. GitHub Repository.

Technologies Used

The following technologies were used to develop this project:

  • Python: Used for data cleaning, transformation, and analysis. The pandas library was used for data processing and exploration.
  • Flask: Web framework used to build the application and manage the backend logic. It serves the web pages and handles user requests.
  • Highcharts: JavaScript library used to create interactive and dynamic visualizations. It displays the data in a clear and attractive way, which makes trends and patterns easier to analyze.

Database

The database used in the Cartografía de la Ausencia project contains information on 8,407 missing persons in Mexico. This information was collected by a citizen collective based on data provided by family members and close acquaintances, representing an effort to build an alternative record to the official one—centered on the dignity and memory of the victims. The database consists of 16 variables that cover personal, contextual, and geographic dimensions of each case. These include:

  • Demographic data such as full name, age at the time of disappearance, sex, and gender.
  • Physical characteristics such as body build, height, complexion, and hair type.
  • Circumstances of disappearance, including date, clothing, and distinguishing features.
  • Geographic location, referencing municipality and state.
  • Localization status, indicating whether the person was found alive, deceased, or remains missing.
  • Public information authorization status, determining whether the data can be made publicly visible.

While the dataset offers a rich and multidimensional perspective on the phenomenon of disappearance, it also has limitations related to geographic coverage, data collection processes, standardization of categories, and the availability of human and technical resources for its consolidation. The data is available as a CSV file, which can be downloaded at the following link.

Processing Data

Key fields such as name, age, disappearance date, gender, and location are complete, providing a solid foundation for analysis and visualization. The dataset was processed so that the platform presents clear, consistent, and reliable results.
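
The kind of processing described above could look like the following minimal sketch with pandas. The column names (name, age, disappearance_date, gender, municipality) and the sample rows are hypothetical placeholders, since the actual CSV schema is not listed here; the steps shown (parsing dates, normalizing labels, deriving a year column) are assumptions about typical cleaning, not the project's confirmed pipeline.

```python
import io

import pandas as pd

# Hypothetical sample rows; the real CSV columns may be named differently.
SAMPLE_CSV = """name,age,disappearance_date,gender,municipality
Persona A,34,2022-03-15, male ,Zapopan
Persona B,27,2021-11-02,FEMALE,Guadalajara
Persona C,41,2019-07-30,Male,Tonalá
"""


def load_and_clean(source):
    """Read the CSV and standardize the fields used by the charts."""
    df = pd.read_csv(source)
    # Parse dates; values that cannot be parsed become NaT instead of raising.
    df["disappearance_date"] = pd.to_datetime(
        df["disappearance_date"], errors="coerce"
    )
    # Normalize gender labels so that " male " and "Male" both become "MALE".
    df["gender"] = df["gender"].str.strip().str.upper()
    # Make sure age is numeric; non-numeric entries become NaN.
    df["age"] = pd.to_numeric(df["age"], errors="coerce")
    # Derive the year, which the time-based charts group by.
    df["year"] = df["disappearance_date"].dt.year
    return df


clean = load_and_clean(io.StringIO(SAMPLE_CSV))
print(clean[["name", "gender", "year"]])
```

Passing a file path instead of the in-memory sample would apply the same steps to the downloaded CSV, provided the column names are adjusted to match it.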

Data Visualization

Temporal Evolution of Disappearances by Gender


This line chart displays the annual evolution of the number of disappearances in Mexico, disaggregated by gender (MALE and FEMALE), from 1978 to 2024.

Key Findings:

  • Sharp increase since 2017: A significant rise in disappearances—especially among men—has been observed since 2017. This surge coincides with a national context marked by escalating violence, possibly linked to public security policies or the strengthening of organized crime.
  • Peak in 2022: The year 2022 recorded the highest number of disappearances for both genders, with over 1,500 cases among men and around 500 among women.
  • Downward trend in 2023–2024: In recent years, there is a noticeable decline in the number of recorded cases. This drop may be due to delays in data entry or updates.
  • Persistent gender gap: Throughout the entire period, the number of male disappearances consistently exceeds that of females. This may be related to gendered patterns of violence, socio-political risk profiles, and involvement in violent contexts.
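
A line chart like this one needs yearly counts per gender. The sketch below shows one way to produce them with pandas and shape the result the way Highcharts expects it (a list of categories for the x-axis and a list of name/data series). The `year` and `gender` column names, the sample rows, and the helper name `yearly_series_by_gender` are illustrative assumptions, not the project's actual code.

```python
import pandas as pd

# Hypothetical rows standing in for the cleaned dataset.
records = pd.DataFrame(
    {
        "year": [2021, 2021, 2022, 2022, 2022],
        "gender": ["MALE", "FEMALE", "MALE", "MALE", "FEMALE"],
    }
)


def yearly_series_by_gender(df):
    """Count cases per year and gender, shaped for a Highcharts line chart."""
    counts = (
        df.groupby(["year", "gender"])
        .size()                 # number of rows in each (year, gender) group
        .unstack(fill_value=0)  # one column per gender, 0 where no cases
        .sort_index()           # years in ascending order
    )
    # Convert to plain Python ints so the result is JSON-serializable.
    categories = [int(year) for year in counts.index]
    series = [
        {"name": gender, "data": [int(n) for n in counts[gender]]}
        for gender in counts.columns
    ]
    return categories, series


categories, series = yearly_series_by_gender(records)
print(categories)
print(series)
```

The two return values map onto the `xAxis.categories` and `series` options of a Highcharts chart, so a Flask view could send them to the page as JSON.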


Trends in Disappearances in the 5 Most Affected Municipalities


This line chart presents the annual evolution of disappearances in the most affected municipalities of the state of Jalisco, such as Zapopan, Guadalajara, Tlajomulco de Zúñiga, and San Pedro Tlaquepaque.

Key Findings:

  • Accelerated increase since 2018: Similar to the previous chart by gender, a significant rise in disappearances is observed beginning in 2018, especially in the metropolitan municipalities surrounding Guadalajara.
  • Municipalities with the highest number of disappearances:
    • Zapopan leads with the highest peak, surpassing 500 disappearances in 2022.
    • Guadalajara and Tlajomulco de Zúñiga follow, each reporting over 300 cases in the same year.
    • This pattern reinforces the territorial concentration of the phenomenon.
  • Urban concentration of the phenomenon: The most affected municipalities are part of densely populated urban areas, marked by organized crime, social mobility, and territorial disputes. This suggests that urban context may be an aggravating factor in disappearances.
  • Recent decline (2024–2025): As seen in the overall trend, there is a sharp drop in recent records. This may be due to delays in data updates or possible changes in case reporting and classification methods.



Top 8 Municipalities with the Highest Number of Reported Cases



This horizontal bar chart displays the total number of reported disappearances by municipality, highlighting the top 8 with the highest cumulative cases.

Key Findings:

  • Zapopan and Guadalajara lead the ranking:
    • Zapopan ranks first with 1,886 cases.
    • Guadalajara follows closely with 1,800 cases.
    • Both municipalities are part of the Guadalajara Metropolitan Area, reflecting a high concentration of disappearances in the state's urban core.
  • Tlajomulco de Zúñiga and San Pedro Tlaquepaque:
    • Rank third and fourth with 1,007 and 795 cases, respectively.
    • Both are also part of the metropolitan area, which is consistent with the concentration of cases in densely populated zones where organized crime is likely present.
  • Other municipalities with notable figures:
    • Tonalá (507), El Salto (281), Lagos de Moreno (241), and Puerto Vallarta (171) complete the top eight.
    • While their numbers are lower, their inclusion suggests that the phenomenon extends beyond the metropolitan area.
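
A ranking like the one above comes down to counting cases per municipality and keeping the largest totals. The following sketch uses only the Python standard library; the input list is a small made-up sample and `top_municipalities` is a hypothetical helper name, so the output does not reproduce the figures reported above.

```python
from collections import Counter

# Hypothetical sample: one entry per case, holding the municipality name.
municipalities = [
    "Zapopan",
    "Guadalajara",
    "Zapopan",
    "Tonalá",
    "Guadalajara",
    "Zapopan",
]


def top_municipalities(values, n=8):
    """Return the n most frequent municipalities as (name, count) pairs."""
    return Counter(values).most_common(n)


top = top_municipalities(municipalities, n=2)
print(top)
```

With the full dataset, calling the helper with the default `n=8` on the municipality column would give the data behind a top-8 bar chart.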


Distribution by Gender Identity


This horizontal bar chart shows the gender distribution among a total of 8,407 individuals, highlighting the proportions across different gender identity categories.

Key Findings:

  • Predominance of male gender: Men represent the majority with 77.3% (6,498 cases), indicating a marked disparity in the composition of the dataset.
  • Significant presence of women: Women account for 22.2% (1,864 cases), reflecting a substantially lower proportion compared to men.
  • Inclusion of trans identities and unspecified cases:
    • Trans women account for 0.1% (9 cases), while trans men represent just one case.
    • A small percentage (0.4% or 35 individuals) have no gender information recorded, which may point to limitations in data collection processes.
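
The percentages above follow directly from the case counts. As a quick check, this short sketch recomputes each share from the counts reported in this section; the category labels and the helper name `percentage_shares` are illustrative choices.

```python
# Case counts per category, as reported in this section.
counts = {
    "Male": 6498,
    "Female": 1864,
    "Trans women": 9,
    "Trans men": 1,
    "Not recorded": 35,
}


def percentage_shares(counts):
    """Return each category's share of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


total = sum(counts.values())
shares = percentage_shares(counts)
print(total)
for label, share in shares.items():
    print(f"{label}: {share}%")
```

The counts add up to the 8,407 individuals in the dataset, and the rounded shares match the reported 77.3%, 22.2%, 0.1%, and 0.4%.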

Conclusions

Forced disappearance in Mexico is one of the most painful and persistent humanitarian crises of our time. Each case is not just a number, but a silenced story, a broken family, and an absence that cries out for justice. For those waiting for the return of their loved ones, uncertainty becomes an open wound that deepens with each passing day, month, and year. This project is a modest attempt to shed light on this grave issue and to encourage thoughtful reflection. Through data, visualizations, and critical insights, it aims to reveal not only the scale of the phenomenon, but also its deeply human and territorial dimensions. As a society, we cannot remain indifferent. From our own trenches—whether academic, artistic, technological, or community-based—we can raise our voices, demand truth, and urge the government to take the structural and comprehensive actions needed to address the root causes of this violence.

Because naming absence is also an act of resistance.
Because those who are missing continue to exist in memory.
Because to be absent is not to disappear.