Building the future of software personalization with decentralized memory networks.
We'd love to partner with early-stage companies to build together. Join us in the Tiles Discord server. Subscribe to our blog for updates on on-device AI and software personalization research.
Consider supporting our work through GitHub Sponsors.
Below is a living index of resources that inform and inspire our work.
- ✨ This is Cuba's Netflix, Hulu, and Spotify – all without the internet
- Cuba's Underground Gaming Network
- ✨ Weathering Software Winter
- ✨ The Memory Walled Garden
- ✨ Why Tech Needs Personalization
- ✨ Evolution of Apple M Series Chips | From M1 to M5
- The Bitter Lesson of LLM Extensions
- ✨ The Continual Learning Problem, Jessy Lin
- ✨ mem-agent: Equipping LLM Agents with Memory Using RL
- ✨ Xet is on the Hub
- ✨ MIR (Machine Intelligence Resource): a naming schema for AIGC/ML work, Darkshapes
- ✨ Jan-nano Technical Report
- ✨ Memory Layers at Scale, Meta FAIR
- ✨ Self-driving infrastructure
- ✨ LoRA Without Regret
- ✨ A Preliminary Report On Edge-Verified Machine Learning (evML), Exo Labs
- ✨ Pretraining with hierarchical memories: separating long-tail and common knowledge, Apple
- LoRA Learns Less and Forgets Less
- ✨ The Bitter Lesson is coming for Tokenization
- On the Way to LLM Personalization: Learning to Remember User Conversations, Apple Machine Learning Research
- ✨ Text-to-LoRA: Hypernetworks that adapt LLMs for specific benchmark tasks using only a textual task description as input, Sakana AI
- Transformer²: Self-Adaptive LLMs
- How memory augmentation can improve large language models, IBM Research
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
- ✨ The Power of Efficiency: Edge AI's Role in Sustainable Generative AI Adoption
- ✨ Small Language Models are the Future of Agentic AI, NVIDIA Research
- ✨ Defeating Prompt Injections by Design, Google DeepMind
- LLM in a flash: Efficient Large Language Model Inference with Limited Memory
- Introducing FlexOlmo: a new paradigm for language model training and data collaboration, Allen AI
- WhisperKit: On-device Real-time ASR with Billion-Scale Transformers, Argmax
- ✨ Towards Large-scale Training on Apple Silicon, Exo Labs
- Kinetics: Rethinking Test-Time Scaling Laws
- Enhancing Reasoning Capabilities of Small Language Models with Blueprints and Prompt Template Search
- LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning
- AirLLM: Diffusion Policy-based Adaptive LoRA for Remote Fine-Tuning of LLM over the Air
- Comparative Analysis of Retrieval Systems in the Real World
- FedVLM: Scalable Personalized Vision-Language Models through Federated Learning
- ✨ Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities
- ✨ Intent-Based Architectures and Their Risks
- Your LLM Knows the Future: Uncovering Its Multi-Token Prediction Potential
- Cephalo: Multi-Modal Vision-Language Models for Bio-Inspired Materials Analysis and Design
- Towards Feasible Private Distributed LLM Inference, Dria
- ✨ ChatGPT Memory and the Bitter Lesson
- ✨ OpenPoke: Recreating Poke's Architecture
- Anthropic's Opinionated Memory Bet
- Mind the Trust Gap: Fast, Private Local-to-Cloud LLM Chat
- MCP Colors: Systematically deal with prompt injection risk
- New physical attacks are quickly diluting secure enclave defenses from Nvidia, AMD, and Intel
- ✨ Foundation Models, Apple Developer Documentation
- ✨ User-Agent header, MDN
- ✨ Modelfile Reference, Ollama Documentation
- Dria Inference Arena – Compare Benchmarks Across LLMs, Inference Engines & Hardware
- The GPU-Poor LLM Gladiator Arena
- ✨ The Kaitchup Index: A Leaderboard for Quantized LLMs
- InferenceMAX™: Open Source Inference Benchmarking
- RFT, DPO, SFT: Fine-tuning with OpenAI, Ilan Bigio (OpenAI)
- ✨ A hand-picked selection of articles on AI fundamentals and concepts, covering the entire process from building neural nets to training them to evaluating results.
- ✨ The State of Open Models
- ✨ The State of On-Device LLMs
- How to Scale Your Model
- ✨ r/LocalLLaMA
- ✨ An Analogy for Understanding Transformers
- ✨ Neural networks, 3Blue1Brown
- GGUF Quantization Docs (Unofficial)
- Reverse-engineering GGUF | Post-Training Quantization
- Choosing a GGUF Model: K-Quants, I-Quants, and Legacy Formats
- Reference implementation of the Transformer architecture optimized for Apple Neural Engine
- ✨ LLMs on a Budget
- ✨ Personalized Machine Learning
Copyright © 2025, Tiles Privacy. All rights reserved.