An on-device text summarization app built with React Native and Expo. This app demonstrates how to run Large Language Models (LLMs) entirely on-device, providing zero-cost AI functionality with complete privacy—your data never leaves your device.
- On-device text summarization using quantized LLM models
- Privacy-first: All processing happens locally, no cloud API calls
- Zero cost: No per-token charges or server infrastructure needed
- Offline capable: Works without internet connection
- Multi-model support: Choose from different quantized models optimized for mobile
- Smart chunking: Automatically handles long documents by processing them in chunks
- Cross-platform: Runs on both iOS and Android
- Node.js (v18 or later)
- npm or yarn
- For iOS development:
  - macOS
  - Xcode (latest version)
  - CocoaPods (`sudo gem install cocoapods`)
- For Android development:
  - Android Studio
  - Android SDK
  - Java Development Kit (JDK)
```bash
git clone https://github.com/nearform/ai-beyond-the-cloud.git
cd ai-beyond-the-cloud
npm install
```

This app uses native modules (react-native-executorch) and requires a development build. Expo Go is not supported.
For iOS:
```bash
npm run ios
```

This builds the development build and launches it on the iOS simulator or a connected device.
For Android:
```bash
npm run android
```

This builds the development build and launches it on the Android emulator or a connected device.
Note:
- The first build may take a while as it compiles native code
- Subsequent runs will be faster with incremental builds
- Enter text: Paste or type the text you want to summarize in the input field
- Select model (optional): Tap the model selector to choose a different quantized model
- Generate summary: Tap the "Summarize" button
- View results: The summary will appear below the input field
- Clear: Tap "Clear" to reset the input and summary
The app automatically handles long texts by:
- Chunking text into manageable pieces
- Processing chunks sequentially with delays to prevent thermal throttling
- Combining chunk summaries into a final result (a simplified sketch of this flow follows)
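A minimal sketch of this chunk-and-combine flow, assuming hypothetical helper names (`chunkText`, `summarizeChunk`) and illustrative limits rather than the actual `summarizer.ts` internals:

```ts
// Hypothetical sketch of the chunk -> summarize -> combine pipeline.
// Names, sizes, and delays are illustrative, not the app's real values.

const CHUNK_SIZE = 2000;      // characters per chunk (assumed limit)
const CHUNK_DELAY_MS = 1500;  // pause between chunks to avoid thermal throttling

function chunkText(text: string, size: number = CHUNK_SIZE): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += size) {
    chunks.push(text.slice(i, i + size));
  }
  return chunks;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function summarizeLongText(
  text: string,
  summarizeChunk: (chunk: string) => Promise<string>,
): Promise<string> {
  const chunks = chunkText(text);
  const partials: string[] = [];
  for (const chunk of chunks) {
    partials.push(await summarizeChunk(chunk)); // sequential, not parallel
    await sleep(CHUNK_DELAY_MS);                // let the device cool down
  }
  // Combine the partial summaries; a real pipeline might re-summarize this.
  return partials.join('\n');
}
```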
```
├── app/
│   ├── _layout.tsx           # App layout and navigation
│   └── index.tsx             # Main screen component
├── components/               # Reusable UI components
├── utils/
│   ├── generation-core.ts    # Core LLM generation logic
│   ├── generation-service.ts # Generation session management
│   ├── model-manager.ts      # Model lifecycle management
│   ├── model-registry.ts     # Available models configuration
│   ├── summarizer.ts         # Text chunking utilities
│   └── use-model.ts          # React hook for model access
├── __tests__/                # Test suites
└── assets/                   # Images and static assets
```
- `npm start` - Start the Expo development server (required for development builds)
- `npm run ios` - Build and run on the iOS simulator/device
- `npm run android` - Build and run on the Android emulator/device
- `npm run web` - Run in a web browser (limited functionality; the on-device LLM is not available)
- `npm test` - Run the test suite
- `npm run test:watch` - Run tests in watch mode
- `npm run lint` - Run ESLint
The project includes comprehensive test coverage:
```bash
npm test
```

Tests cover:
- UI component interactions
- Text chunking logic
- Core generation functionality
- Model state management
- Error handling
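For illustration, a test for the chunking utility might look like this (hypothetical; it assumes a `chunkText` export that may not match the real `summarizer.ts` API):

```ts
// Hypothetical Jest test for the chunking utility; the real suites in
// __tests__/ may use different names and assertions.
import { chunkText } from '../utils/summarizer'; // assumed export

describe('chunkText', () => {
  it('splits long input into chunks no larger than the limit', () => {
    const text = 'a'.repeat(5000);
    const chunks = chunkText(text, 2000);
    expect(chunks).toHaveLength(3);
    expect(chunks.every((c) => c.length <= 2000)).toBe(true);
  });

  it('returns a single chunk for short input', () => {
    expect(chunkText('hello', 2000)).toEqual(['hello']);
  });
});
```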
The app uses react-native-executorch to run quantized PyTorch models on-device. Models are optimized for mobile with:
- INT4/INT8 quantization for reduced size and power consumption
- ExecuTorch runtime for efficient inference
- Smart memory management to prevent OOM crashes
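As an illustration, an entry in `model-registry.ts` might look roughly like this; the interface shape, field names, and values below are assumptions, not the file's actual contents:

```ts
// Illustrative shape for a model-registry.ts entry; the real
// configuration fields and values may differ.
export interface ModelConfig {
  id: string;
  displayName: string;
  quantization: 'int4' | 'int8';
  contextWindow: number;   // max tokens the model can attend to
  approxSizeMB: number;    // approximate size on disk (illustrative)
}

export const MODELS: ModelConfig[] = [
  {
    id: 'llama-3.2-1b-int4',
    displayName: 'Llama 3.2 1B (INT4)',
    quantization: 'int4',
    contextWindow: 2048,
    approxSizeMB: 700,
  },
];
```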
- Model Manager: Handles model loading, initialization, and lifecycle
- Generation Service: Manages generation sessions and prevents race conditions
- Generation Core: Core LLM inference logic with cooldown and locking mechanisms (a simplified sketch follows this list)
- React Hooks: `useLLMModel` provides reactive access to model state
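A minimal sketch of the cooldown-and-locking idea (hypothetical names and values; not the actual `generation-core.ts`):

```ts
// Hypothetical generation lock with a cooldown window; the real
// generation-core.ts logic may differ.
const COOLDOWN_MS = 3000; // assumed minimum gap between generations

let generating = false;
let lastFinishedAt = 0;

async function generateOnce(
  runInference: () => Promise<string>,
): Promise<string> {
  if (generating) {
    throw new Error('A generation is already in progress');
  }
  if (Date.now() - lastFinishedAt < COOLDOWN_MS) {
    throw new Error('Cooling down; please wait before generating again');
  }
  generating = true; // acquire the lock
  try {
    return await runInference();
  } finally {
    lastFinishedAt = Date.now();
    generating = false; // release the lock even on error
  }
}
```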
- Text is chunked to respect model context windows
- Delays between chunk processing prevent thermal throttling
- Input truncation prevents memory issues (a simple guard is sketched after this list)
- Generation cooldowns prevent rapid-fire requests
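A minimal sketch of such input guards, with illustrative (assumed) limits:

```ts
// Hypothetical input guards; the limits are illustrative assumptions,
// not the app's real constants.
const MAX_INPUT_CHARS = 20_000; // cap raw input size
const MAX_CHUNKS = 8;           // cap how many chunks one request may spawn

function truncateInput(text: string): string {
  return text.length > MAX_INPUT_CHARS ? text.slice(0, MAX_INPUT_CHARS) : text;
}

function limitChunks(chunks: string[]): string[] {
  return chunks.slice(0, MAX_CHUNKS);
}
```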
Model fails to load:

- Ensure you have sufficient device storage
- Check that the model files are properly bundled
- Try restarting the app

Generation is slow:

- Use a smaller model for faster inference
- Reduce input text length
- Close other apps to free up memory

App crashes or runs out of memory:

- The app limits input size and chunk count to prevent OOM
- If crashes persist, try a device with more RAM or a smaller model
- Check device logs for specific error messages
For a detailed technical deep-dive into on-device AI, quantization, and the architecture decisions behind this app, see BLOG.md.
This project is private and proprietary.
This is a demonstration project. For questions or issues, please open an issue on GitHub.