# Detecting Grokking via Local Intrinsic Dimensions of Contextual Language Models
*Grokking* is the phenomenon where a machine learning model trained on a small dataset learns to generalize well beyond the training set after a long period of overfitting.
We demonstrate that the grokking phenomenon can be detected by a change of the local intrinsic dimension (LID) of the model's hidden states.
This repository is based on an unofficial re-implementation of the paper [Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets](https://arxiv.org/abs/2201.02177) by Power et al.
The original codebase on which our work builds was written by Charlie Snell.
1. (Optional) If required, e.g. when planning to run jobs on a cluster via a custom Hydra launcher, set the correct environment variables in the `.env` file in the project root directory.
1. (Optional) For setting up the repository to support job submissions to a cluster using a Hydra multi-run launcher, follow the instructions in the [Hydra-HPC-Launcher repository](https://github.com/carelvniekerk/Hydra-HPC-Launcher).
## Usage
We define `uv run` commands in the `pyproject.toml` file, which can be used as entry points to run the code.
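For readers who want to inspect or extend these entry points: `uv run` commands of this kind are typically declared under `[project.scripts]` in `pyproject.toml`. A minimal sketch (the module path shown is hypothetical; only the `train_grokk` command name appears in this README):

```toml
[project.scripts]
# Hypothetical mapping from the command name to a callable in the package:
train_grokk = "grokking.train_grokk:main"
```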
The training script uses [Weights And Biases](https://wandb.ai/home) (wandb) by default to generate plots in real time.
If you want to disable wandb, set `wandb.use_wandb=False` in `config/train_grokk.yaml` or pass it as an argument when calling `train_grokk.py`.
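A toggle like this is often wrapped in a small factory so that wandb is only imported when it is actually enabled. The following is a generic sketch that assumes nothing about this repository's internals (the helper name `make_logger` and the project name are illustrative):

```python
def make_logger(use_wandb: bool):
    """Return a metrics-logging callable; a silent no-op when wandb is off."""
    if use_wandb:
        import wandb  # only imported when actually enabled

        wandb.init(project="grokking")  # illustrative project name
        return wandb.log
    # No-op fallback with the same call signature as wandb.log.
    return lambda metrics, step=None: None


log = make_logger(use_wandb=False)
log({"train/loss": 0.42}, step=100)  # silently ignored when wandb is disabled
```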
In our modified version of the repository, the logging includes:
- Training and validation loss curves and accuracy curves;
- Topological local estimates of the hidden states during training (with selected hyperparameters).
Note that since the computation of the local intrinsic dimension is expensive, we only compute it at certain intervals during training.
This can be controlled via the `topological_analysis.compute_estimates_every=500` parameter in the `config/train_grokk.yaml` file.
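The gating itself amounts to a modulus check on the global training step. A minimal sketch (the helper name is hypothetical; the parameter name mirrors the config key):

```python
def should_compute_estimates(step: int, compute_estimates_every: int = 500) -> bool:
    """True exactly on the steps where the expensive LID estimate runs."""
    return step % compute_estimates_every == 0


# Over 2000 training steps, the estimate is computed only four times:
trigger_steps = [s for s in range(2000) if should_compute_estimates(s)]
print(trigger_steps)  # [0, 500, 1000, 1500]
```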

For example, a training run can be launched with:

```shell
uv run train_grokk dataset.frac_train=0.5 wandb.use_wandb=false
```

You can try different operations or learning and architectural hyperparameters by modifying configurations in the `config/` directory.
### Experiments: Local Dimensions Detect Grokking
To reproduce the results in our paper, which compares the onset of grokking with the timing of the drop in local intrinsic dimension, you can run the following command:

The description of the local estimates contains the parameters used for their computation:

- `n-neighbors=64`: Number of neighbors (L) to use for the local intrinsic dimension estimate.
- `mean`: Log the mean of the local intrinsic dimension estimates over all token samples.
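The estimator used in this repository is topological, but the role of the two parameters above can be illustrated with the classical Levina-Bickel maximum-likelihood LID estimator, which likewise works from the `k` nearest neighbors of each sample and is then averaged over samples. This is a generic sketch, not the estimator used in this codebase:

```python
import numpy as np


def local_id_mle(points: np.ndarray, n_neighbors: int = 64) -> np.ndarray:
    """Levina-Bickel MLE of the local intrinsic dimension at every point.

    points: (n_samples, n_features); n_neighbors excludes the query point.
    """
    # Pairwise Euclidean distances (fine for small sample counts).
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs**2).sum(axis=-1))
    # Column 0 of each sorted row is the point itself (distance 0),
    # so the k nearest neighbors sit in columns 1..k.
    knn = np.sort(dists, axis=1)[:, 1 : n_neighbors + 1]
    # MLE: inverse of the mean log-ratio of the k-th to the j-th distance.
    log_ratios = np.log(knn[:, -1:] / knn[:, :-1])
    return (n_neighbors - 1) / log_ratios.sum(axis=1)


# Points on a 2-dimensional plane embedded in 8 dimensions: the mean
# local estimate should land near 2 regardless of the ambient dimension.
rng = np.random.default_rng(0)
plane = np.zeros((400, 8))
plane[:, :2] = rng.normal(size=(400, 2))
print(local_id_mle(plane, n_neighbors=20).mean())
```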
Note: We provide scripts for creating the figures in the paper from the wandb logs as part of the [Topo_LLM repository](https://github.com/aidos-lab/Topo_LLM) in `topollm/plotting/wandb_export/`.
## References
Further discussion of the results can be found in our paper [Less is More: Local Intrinsic Dimensions of Contextual Language Models](https://arxiv.org/abs/2506.01034).
```tex
@misc{ruppik2025morelocalintrinsicdimensions,
title={Less is More: Local Intrinsic Dimensions of Contextual Language Models},
author={Benjamin Matthias Ruppik and Julius von Rohrscheidt and Carel van Niekerk and Michael Heck and Renato Vukovic and Shutong Feng and Hsien-chin Lin and Nurul Lubis and Bastian Rieck and Marcus Zibrowius and Milica Gašić},
year={2025},
eprint={2506.01034},
archivePrefix={arXiv},
url={https://arxiv.org/abs/2506.01034},
}
```