We Should Chart an Atlas of All the World's Models
Abstract
Public model repositories now contain millions of models, yet most models remain undocumented and effectively lost. In this position paper, we advocate for charting the world's model population in a unified structure we call the Model Atlas: a graph that captures models, their attributes, and the weight transformations that connect them. The Model Atlas enables applications in model forensics, meta-ML research, and model discovery, challenging tasks given today's unstructured model repositories. However, because most models lack documentation, large atlas regions remain uncharted. Addressing this gap motivates new machine learning methods that treat models themselves as data, inferring properties such as functionality, performance, and lineage directly from their weights. We argue that a scalable path forward is to bypass the unique parameter symmetries that plague model weights. Charting all the world's models will require a community effort, and we hope its broad utility will rally researchers toward this goal.

The Hugging Face atlas
While this is a small subset (63,000 models) of the documented regions of Hugging Face (HF), it already reveals significant trends.
Depth and structure. The LLM connected component (CC) is deep and complex. It includes almost a third of all models. In contrast, while Flux is also substantial, its structure is much simpler and more uniform.
Quantization. Zoom-in (A) highlights quantization practices across vision, language, and vision-language models (VLMs). Vision models barely use quantization, despite Flux containing more parameters (12B) than Llama (8B). Conversely, quantization is commonplace in LLMs, constituting a large proportion of models. VLMs strike a balance between these extremes.
Adapter and fine-tuning strategies. A notable distinction exists between discriminative (top) and generative (bottom) vision models. Discriminative models primarily employ fine-tuning, while generative models have widely adopted adapters like LoRA. The evolution of adapter adoption over time is evident: Stable-Diffusion 1.4 (SD) (1) mostly used full fine-tuning, while SD 1.5 (2), SD 2 (3), SD XL (4), and Flux (5) progressively use more adapters. Interestingly, the atlas reveals that audio models rarely use adapters, suggesting gaps in cross-community knowledge transfer.
This inter-community variation is particularly evident in model merging. LLMs have embraced model merging, with merged models frequently exceeding the popularity of their parents. This raises interesting questions about the limited role of merging in vision models. For enhanced visualization, we display the top 30% most downloaded models.
Model atlas demo
For the full version, visit our Hugging Face space.
Model attribute prediction using the atlas
Currently, most models are only partially documented. Since local atlas regions contain related models, the atlas can be used to predict missing model attributes, including task, accuracy, license, missing weights, and popularity.

Using the atlas structure improves the prediction of model accuracy and other attributes compared to naively using the majority label. In (b), we report the prediction accuracy.
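As a toy illustration of this idea (not the paper's actual method), a missing attribute can be predicted by a majority vote among a model's documented atlas neighbors. All model names and attribute values below are hypothetical:

```python
from collections import Counter

# Toy atlas fragment: child -> parent edges (all model names are hypothetical).
parent = {
    "ft-a": "base", "ft-b": "base", "ft-c": "base",
    "quant-a": "ft-a",
}
# Partially documented attribute (e.g., license); None marks missing metadata.
license_of = {"base": "apache-2.0", "ft-a": "apache-2.0",
              "ft-b": None, "ft-c": "apache-2.0", "quant-a": None}

def neighbors(node):
    """Parent, siblings, and children of a node in the atlas graph."""
    nbrs = set()
    p = parent.get(node)
    if p is not None:
        nbrs.add(p)
        nbrs |= {c for c, pp in parent.items() if pp == p and c != node}
    nbrs |= {c for c, pp in parent.items() if pp == node}
    return nbrs

def predict(node):
    """Majority label among documented neighbors, else the global majority."""
    labels = [license_of[n] for n in neighbors(node) if license_of.get(n)]
    if not labels:
        labels = [v for v in license_of.values() if v]
    return Counter(labels).most_common(1)[0][0]

print(predict("ft-b"))  # -> apache-2.0, inferred from its parent and siblings
```

This neighbor-vote baseline already beats a global majority label whenever related models cluster in the atlas, which is exactly the locality property the figure exploits.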


Charting the atlas
While we have seen the importance of the model atlas, in practice over 60% of it is uncharted. Using the known regions of the atlas, we identify high-confidence structural priors based on dominant real-world model training practices.

Quantizations are leaves: Our analysis of over 400,000 documented model relationships reveals that 99.41% of quantized models are leaf nodes. The figure illustrates this on a subset of Llama-based models: quantized models (magenta) are nearly always leaves, corroborating the statistical finding.
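This prior is straightforward to measure on any documented edge list. A minimal sketch, with hypothetical model names and a naive name-based quantization heuristic (a stand-in for real repository metadata):

```python
# Toy (parent, child) edge list; all model names are hypothetical.
edges = [("llama-base", "llama-ft"),
         ("llama-base", "llama-q4-gguf"),
         ("llama-ft", "llama-ft-awq")]

def is_quantized(name):
    # Naive heuristic: infer quantization from common naming conventions.
    return any(tag in name for tag in ("gguf", "awq", "gptq", "q4", "q8"))

def quantized_leaf_fraction(edges):
    """Fraction of quantized models that are leaves (have no children)."""
    parents = {p for p, _ in edges}
    nodes = parents | {c for _, c in edges}
    quant = [n for n in nodes if is_quantized(n)]
    return sum(n not in parents for n in quant) / len(quant)

print(quantized_leaf_fraction(edges))  # -> 1.0 on this toy graph
```

In charting, the prior works in reverse: a node identified as quantized can be assumed to sit at the bottom of its subtree.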

Temporal dynamics indicate edge directionality: We analyzed over 400,000 documented model relationships and observed that in 99.73% of cases, earlier upload times correlate with topologically higher positions in the DAG. Here, we visualize this trend on a subset of the Llama model family. Green nodes indicate models where earlier upload times align with topological order, while red nodes represent exceptions to this trend. The source (in gray) vacuously satisfies this assumption. Nearly all nodes satisfy it.
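Under this prior, an undocumented relation between two models can be oriented from the earlier-uploaded model to the later one, and the prior itself can be validated on documented edges. A sketch with made-up upload dates:

```python
from datetime import date

# Hypothetical upload dates for three related models.
uploaded = {"base": date(2024, 1, 5),
            "ft": date(2024, 2, 10),
            "quant": date(2024, 3, 1)}

def orient(a, b):
    """Orient an undirected relation from the earlier upload to the later one."""
    return (a, b) if uploaded[a] <= uploaded[b] else (b, a)

def agreement(edges):
    """Fraction of documented (parent, child) edges matching upload order."""
    return sum(uploaded[p] <= uploaded[c] for p, c in edges) / len(edges)

edges = [("base", "ft"), ("ft", "quant")]
print(orient("quant", "ft"))  # -> ('ft', 'quant')
print(agreement(edges))       # -> 1.0 on this toy family
```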

Snake vs. Fan patterns: Snake patterns often arise from sequential training checkpoints, while fan patterns typically result from hyperparameter sweeps. In both structures the model weight variance is low. However, in snake patterns the weight distance has high correlation with model upload time, whereas in fan patterns the correlation is lower.
Our approach computes the distance between model weights and combines it with these priors. It outperforms the baselines by a significant margin, even on in-the-wild models.
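The snake/fan distinction above can be illustrated with a correlation test between each model's weight distance to the common source and its upload order. The numbers below are synthetic stand-ins for real weight distances:

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Synthetic weight distances from the source model, by upload order.
snake_dist = [0.1, 0.2, 0.3, 0.4]      # grows with each sequential checkpoint
fan_dist = [0.25, 0.31, 0.22, 0.28]    # roughly constant across a sweep
order = [1, 2, 3, 4]

print(pearson(snake_dist, order))  # ~1.0 -> snake (sequential checkpoints)
print(pearson(fan_dist, order))    # ~0.0 -> fan (hyperparameter sweep)
```

A high correlation suggests a chain of sequential checkpoints; a low one suggests siblings fanning out from a single parent, matching the patterns described above.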

BibTeX
@article{horwitz2025charting,
  title={Charting and Navigating Hugging Face's Model Atlas},
  author={Horwitz, Eliahu and Kurer, Nitzan and Kahana, Jonathan and Amar, Liel and Hoshen, Yedid},
  journal={arXiv preprint arXiv:2503.10633},
  year={2025}
}