Resume Reader App

Introduction

This project is a Streamlit app that uses the newly launched LlamaExtract Beta API to extract structured data from unstructured PDF CVs/resumes. LlamaExtract is designed to extract structured information from a variety of document types.

About LlamaExtract

Introducing LlamaExtract Beta: Structured Data Extraction in Just a Few Clicks

LlamaExtract is a managed service from LlamaCloud for structured extraction from unstructured documents. Structured extraction is a core use case for large language models (LLMs) and a key step in data processing for retrieval and RAG (Retrieval-Augmented Generation) pipelines. More information is available on the LlamaIndex blog.

Key Features:

Schema Inference: LlamaExtract can infer a schema from an existing candidate set of documents. Users have the option to edit this schema later.
Value Extraction: Extracts values from documents according to a specified schema, whether inferred, specified by a human, or both.

LlamaExtract is available to LlamaCloud users through both a UI and API. Schema inference is currently limited to 5 files with a maximum of 10 pages per file. Schema extraction operates on a per-document level given an existing schema.
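
To make the two features and the stated limits concrete, here is a minimal sketch that uses only the Python standard library and does not call LlamaExtract. The schema and its field names (`name`, `email`, `skills`) and both helper functions are hypothetical illustrations, not part of the LlamaExtract API; only the limits of 5 files and 10 pages per file come from the text above.

```python
MAX_FILES = 5            # schema inference limit stated above
MAX_PAGES_PER_FILE = 10  # schema inference limit stated above

# Hypothetical resume schema in JSON Schema style; field names are illustrative.
RESUME_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "email": {"type": "string"},
        "skills": {"type": "array"},
    },
    "required": ["name"],
}

# Maps the schema's type names to the Python types used for checking.
PY_TYPES = {"string": str, "array": list, "object": dict}


def inference_batch_problems(page_counts: dict[str, int]) -> list[str]:
    """Return reasons a batch would exceed the schema inference limits."""
    problems = []
    if len(page_counts) > MAX_FILES:
        problems.append(f"{len(page_counts)} files exceeds the limit of {MAX_FILES}")
    for name, pages in page_counts.items():
        if pages > MAX_PAGES_PER_FILE:
            problems.append(
                f"{name}: {pages} pages exceeds the limit of {MAX_PAGES_PER_FILE}"
            )
    return problems


def schema_problems(values: dict, schema: dict = RESUME_SCHEMA) -> list[str]:
    """Check extracted values against the schema's required fields and types."""
    problems = [
        f"missing required field: {key}"
        for key in schema["required"]
        if key not in values
    ]
    for key, value in values.items():
        spec = schema["properties"].get(key)
        if spec is None:
            problems.append(f"unexpected field: {key}")
        elif not isinstance(value, PY_TYPES[spec["type"]]):
            problems.append(f"{key}: expected {spec['type']}")
    return problems
```

For example, `inference_batch_problems({"a.pdf": 3, "b.pdf": 12})` reports that `b.pdf` is over the page limit, and `schema_problems({"name": "Ada", "skills": ["python"]})` returns an empty list.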

Metadata Extraction in the LLM ETL Stack:

LLM applications need a new data ETL stack. The data loading, transformation, and indexing layer is crucial for downstream RAG and agent use cases over unstructured data. LlamaIndex built LlamaParse and LlamaCloud to serve these ETL needs, and they power thousands of production pipelines over complex documents. Beyond chunk-level embeddings, automated metadata extraction is vital for increasing transparency and control over unstructured data.

Getting Started

Prerequisites

Python 3.x
Streamlit
LlamaCloud API Key

Installation

Create a Virtual Environment:
```
python -m venv venv
```
Activate the Virtual Environment:

On Windows:
```
venv\Scripts\activate
```
On macOS/Linux:
```
source venv/bin/activate
```
Install Required Packages:
```
pip install -r requirements.txt
```
Set Up API Key:

Add your LLAMA_CLOUD_API_KEY in the resume_reader_app.py file.

Run the Streamlit App:
```
streamlit run resume_reader_app.py
```
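
The API key step above places the key directly in resume_reader_app.py. As an alternative that keeps the key out of source control, the sketch below reads it from an environment variable instead. This assumes you adapt the app to call the helper; `get_api_key` is a hypothetical name and is not part of this repository or of LlamaExtract.

```python
import os


def get_api_key(env_var: str = "LLAMA_CLOUD_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running the app"
        )
    return key
```

With this approach, the variable is set in the shell before `streamlit run resume_reader_app.py`, and the script fails early with a readable message if it is missing.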
Usage

Once the Streamlit server is running, you can upload PDF CVs/resumes through the app's interface. The app will use the LlamaExtract API to process the documents and display the extracted structured data.
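
The upload, process, and display flow described above could be sketched roughly as follows. This is not the code in resume_reader_app.py: `save_upload`, `extract_resume`, and `main` are hypothetical names, and `extract_resume` is a placeholder that returns a stub instead of calling LlamaExtract. Only `st.title`, `st.file_uploader`, `st.spinner`, and `st.json` are real Streamlit calls.

```python
import importlib.util
import tempfile
from pathlib import Path


def save_upload(data: bytes, filename: str) -> Path:
    """Write uploaded bytes to a temporary file and return its path."""
    suffix = Path(filename).suffix or ".pdf"
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(data)
    return Path(tmp.name)


def extract_resume(pdf_path: Path) -> dict:
    """Hypothetical placeholder: replace the body with the real extraction call."""
    return {"source_file": pdf_path.name, "fields": {}}


def main() -> None:
    # Imported here so the helpers above can be used without Streamlit installed.
    import streamlit as st

    st.title("Resume Reader App")
    uploaded = st.file_uploader("Upload a PDF CV/resume", type=["pdf"])
    if uploaded is not None:
        pdf_path = save_upload(uploaded.getvalue(), uploaded.name)
        with st.spinner("Extracting..."):
            result = extract_resume(pdf_path)
        st.json(result)  # show the structured data as collapsible JSON


# Start the UI only when Streamlit is available in the environment.
if __name__ == "__main__" and importlib.util.find_spec("streamlit"):
    main()
```

Writing the upload to a temporary file is a design choice for extraction clients that expect a file path rather than in-memory bytes; if the client accepts bytes directly, that step can be dropped.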


saadshaikh3/llamaextract-streamlit-resumereader
