10 changes: 10 additions & 0 deletions bayan/.gitignore
@@ -0,0 +1,10 @@
logs
logs/*
/*__pycache__
.venv
*/__pycache__
helpers/__pycache__/*
helpers/__pycache__/
.vscode
*.log
*/__pycache__/*
114 changes: 114 additions & 0 deletions bayan/README.md
@@ -0,0 +1,114 @@
# AI-Powered Software License Identification

This project uses large language models (LLMs) to identify and extract the chunks of text that pertain to software licenses. It is intended for developers and legal teams who need to locate licensing information quickly within large bodies of text.

## Table of Contents

- [AI-Powered Software License Identification](#ai-powered-software-license-identification)
- [Table of Contents](#table-of-contents)
- [Installation](#installation)
- [Step 1: Install dependencies](#step-1-install-dependencies)
- [Dependencies:](#dependencies)
- [Setting Up API Keys:](#setting-up-api-keys)
- [Step 1: Obtain an API Key](#step-1-obtain-an-api-key)
- [Step 2: Set Up Environment Variables](#step-2-set-up-environment-variables)
- [Alternative Method: Storing in .bashrc](#alternative-method-storing-in-bashrc)
- [Basic Usage](#basic-usage)

## Installation

To run this project, you need to install the required dependencies. The project uses Python 3.8+.

### Step 1: Install dependencies

You can install the required packages using the `requirements.txt` file:

```bash
pip install -r requirements.txt
```
### Dependencies:

The project relies on the following libraries:

* loguru==0.7.2
* openai==1.31.1
* pandas==2.0.3
* tenacity==8.3.0
* ratelimit==2.2.1
* Nirjas==1.0.1
* fuzzywuzzy[speedup]>=0.18.0
* langchain_groq
* tiktoken

## Setting Up API Keys:

To run the LLM models, you need to set up your API keys for the relevant services. The project primarily uses the OpenAI API for its LLM capabilities.

### Step 1: Obtain an API Key

* Visit the provider you plan to use (for example Groq or TogetherAI), sign up for an account if you don't have one, and generate an API key.

### Step 2: Set Up Environment Variables

Store your API key in an environment variable so that it stays out of the source code:

For Linux/MacOS:

```bash
export OPENAI_API_KEY='your-api-key-here'
```

For Windows (Command Prompt), omit the quotes, because `set` would store them as part of the value:

```bat
set OPENAI_API_KEY=your-api-key-here
```

Alternatively, you can create a `.env` file in the project root directory and add your API key there:

```bash
OPENAI_API_KEY=your-api-key-here
```
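
Whether the project reads the `.env` file on its own is not stated here. As a minimal, stdlib-only sketch (the helper name `load_env_file` is hypothetical and not part of this repository), the `KEY=value` pairs could be copied into the process environment like this:

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Copy KEY=value pairs from a .env-style file into os.environ.

    Variables that are already set in the environment are left untouched.
    Returns a dict of the pairs found in the file.
    """
    loaded = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        # Drop surrounding whitespace and optional quotes around the value.
        value = value.strip().strip("'\"")
        loaded[key] = value
        os.environ.setdefault(key, value)
    return loaded
```

Calling `load_env_file()` once at start-up, before any client is created, would make `OPENAI_API_KEY` available through `os.environ`. Values exported in the shell take precedence over the file because `setdefault` never overwrites an existing variable.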

### Alternative Method: Storing in .bashrc

You can also export your API keys from your `.bashrc` file. The keys then persist across sessions, because they are loaded each time you open a new terminal.

Steps:

1. Open your `.bashrc` file in a text editor (`sudo` is not needed for a file in your own home directory):

```
gedit ~/.bashrc
```

2. Add the following lines at the end of the file, replacing the placeholders with your actual API keys:

```
# Add your own API keys here
export GROQ_API_KEY=""
export NVIDIA_API_KEY=""
export TOGETHER_API_KEY=""
```

3. Save the file and close the text editor.

4. To apply the changes immediately, either close and reopen your terminal or run the following command:

```
source ~/.bashrc
```
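
To confirm that the keys are visible to Python after reloading the shell, a small check such as the following can help. It is a sketch only: the variable names come from the `.bashrc` snippet above, while the helper `missing_api_keys` is hypothetical and not part of this repository.

```python
import os

# Variable names taken from the .bashrc snippet above.
PROVIDER_KEYS = ("GROQ_API_KEY", "NVIDIA_API_KEY", "TOGETHER_API_KEY")


def missing_api_keys(names=PROVIDER_KEYS, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("All provider keys are set.")
```

Empty values are reported as missing on purpose, since the template above exports each key as an empty string until you fill it in.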

## Basic Usage

```python
from helpers.llm_client import LLMClient
from helpers.models import Models

client = LLMClient()

# Send a single prompt to the selected model.
client._infer(model=Models.GEMMA_2_9b, prompt='Hey, How are you?', temperature=0.1)
```

More details can be found in the [project-showcase](./project-showcase.ipynb) notebook.



49 changes: 49 additions & 0 deletions bayan/extras/NomosTestfiles/AAL/AAL.txt
@@ -0,0 +1,49 @@
Attribution Assurance License
Copyright (c) 2002 by AUTHOR
PROFESSIONAL IDENTIFICATION * URL
"PROMOTIONAL SLOGAN FOR AUTHOR'S PROFESSIONAL PRACTICE"

All Rights Reserved
ATTRIBUTION ASSURANCE LICENSE (adapted from the original BSD license)
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the conditions below are met.
These conditions require a modest attribution to <AUTHOR> (the
"Author"), who hopes that its promotional value may help justify the
thousands of dollars in otherwise billable time invested in writing
this and other freely available, open-source software.

1. Redistributions of source code, in whole or part and with or without
modification (the "Code"), must prominently display this GPG-signed
text in verifiable form.
2. Redistributions of the Code in binary form must be accompanied by
this GPG-signed text in any documentation and, each time the resulting
executable program or a program dependent thereon is launched, a
prominent display (e.g., splash screen or banner text) of the Author's
attribution information, which includes:
(a) Name ("AUTHOR"),
(b) Professional identification ("PROFESSIONAL IDENTIFICATION"), and
(c) URL ("URL").
3. Neither the name nor any trademark of the Author may be used to
endorse or promote products derived from this software without specific
prior written permission.
4. Users are entirely responsible, to the exclusion of the Author and
any other persons, for compliance with (1) regulations set by owners or
administrators of employed equipment, (2) licensing terms of any other
software, and (3) local regulations regarding use, including those
regarding import, export, and use of encryption software.

THIS FREE SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
EVENT SHALL THE AUTHOR OR ANY CONTRIBUTOR BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
EFFECTS OF UNAUTHORIZED OR MALICIOUS NETWORK ACCESS;
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
--End of License
47 changes: 47 additions & 0 deletions bayan/extras/NomosTestfiles/AAL/LICENSE
@@ -0,0 +1,47 @@
Copyright (c) 2007 - 2010 by Arslan Hassan et al
CLIPBUCKET * http://clip-bucket.com
"A way to broadcast yourself - free and opensource video sharing website script"

All Rights Reserved
ATTRIBUTION ASSURANCE LICENSE (adapted from the original BSD license)
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the conditions below are met.
These conditions require a modest attribution to <AUTHOR> (the
"Arslan Hassan"), who hopes that its promotional value may help justify the
thousands of dollars in otherwise billable time invested in writing
this and other freely available, open-source software.

1. Redistributions of source code, in whole or part and with or without
modification (the "Code"), must prominently display this GPG-signed
text in verifiable form.
2. Redistributions of the Code in binary form must be accompanied by
this GPG-signed text in any documentation and, each time the resulting
executable program or a program dependent thereon is launched, a
prominent display (e.g., splash screen or banner text) of the Author's
attribution information, which includes:
(a) Name ("Arslan Hassan"),
(b) Professional identification ("ClipBucket"), and
(c) URL ("http://clip-bucket.com").
3. Neither the name nor any trademark of the Author may be used to
endorse or promote products derived from this software without specific
prior written permission.
4. Users are entirely responsible, to the exclusion of the Author and
any other persons, for compliance with (1) regulations set by owners or
administrators of employed equipment, (2) licensing terms of any other
software, and (3) local regulations regarding use, including those
regarding import, export, and use of encryption software.

THIS FREE SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
EVENT SHALL THE AUTHOR OR ANY CONTRIBUTOR BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
EFFECTS OF UNAUTHORIZED OR MALICIOUS NETWORK ACCESS;
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
152 changes: 152 additions & 0 deletions bayan/extras/NomosTestfiles/ACAA/c32001a.ada
@@ -0,0 +1,152 @@
-- C32001A.ADA

-- Grant of Unlimited Rights
--
-- Under contracts F33600-87-D-0337, F33600-84-D-0280, MDA903-79-C-0687,
-- F08630-91-C-0015, and DCA100-97-D-0025, the U.S. Government obtained
-- unlimited rights in the software and documentation contained herein.
-- Unlimited rights are defined in DFAR 252.227-7013(a)(19). By making
-- this public release, the Government intends to confer upon all
-- recipients unlimited rights equal to those held by the Government.
-- These rights include rights to use, duplicate, release or disclose the
-- released technical data and computer software in whole or in part, in
-- any manner and for any purpose whatsoever, and to have or permit others
-- to do so.
--
-- DISCLAIMER
--
-- ALL MATERIALS OR INFORMATION HEREIN RELEASED, MADE AVAILABLE OR
-- DISCLOSED ARE AS IS. THE GOVERNMENT MAKES NO EXPRESS OR IMPLIED
-- WARRANTY AS TO ANY MATTER WHATSOEVER, INCLUDING THE CONDITIONS OF THE
-- SOFTWARE, DOCUMENTATION OR OTHER INFORMATION RELEASED, MADE AVAILABLE
-- OR DISCLOSED, OR THE OWNERSHIP, MERCHANTABILITY, OR FITNESS FOR A
-- PARTICULAR PURPOSE OF SAID MATERIAL.
--*
-- CHECK THAT IN MULTIPLE OBJECT DECLARATIONS FOR SCALAR TYPES, THE
-- SUBTYPE INDICATION AND THE INITIALIZATION EXPRESSIONS ARE EVALUATED
-- ONCE FOR EACH NAMED OBJECT THAT IS DECLARED AND THE SUBTYPE
-- INDICATION IS EVALUATED FIRST. ALSO, CHECK THAT THE EVALUATIONS
-- YIELD THE SAME RESULT AS A SEQUENCE OF SINGLE OBJECT DECLARATIONS.

-- RJW 7/16/86

WITH REPORT; USE REPORT;

PROCEDURE C32001A IS

BUMP : ARRAY (1 .. 8) OF INTEGER := (OTHERS => 0);

FUNCTION F (I : INTEGER) RETURN INTEGER IS
BEGIN
BUMP (I) := BUMP (I) + 1;
RETURN BUMP (I);
END F;

BEGIN
TEST ("C32001A", "CHECK THAT IN MULTIPLE OBJECT DECLARATION " &
"FOR SCALAR TYPES, THE SUBTYPE INDICATION " &
"AND THE INITIALIZATION EXPRESSIONS ARE " &
"EVALUATED ONCE FOR EACH NAMED OBJECT THAT " &
"IS DECLARED AND THE SUBTYPE INDICATION IS " &
"EVALUATED FIRST. ALSO, CHECK THAT THE " &
"EVALUATIONS YIELD THE SAME RESULT AS A " &
"SEQUENCE OF SINGLE OBJECT DECLARATIONS" );

DECLARE

TYPE DAY IS (MON, TUES, WED, THURS, FRI);
D1, D2 : DAY
RANGE MON .. DAY'VAL (F (1)) :=
DAY'VAL (F (1) - 1);
CD1, CD2 : CONSTANT DAY
RANGE MON .. DAY'VAL (F (2)) :=
DAY'VAL (F (2) - 1);

I1, I2 : INTEGER RANGE 0 .. F (3) :=
F (3) - 1;
CI1, CI2 : CONSTANT INTEGER RANGE 0 .. F (4)
:= F (4) - 1;

TYPE FLT IS DIGITS 3 RANGE -5.0 .. 5.0;
FL1, FL2 : FLT RANGE 0.0 .. FLT (F (5)) :=
FLT (F (5) - 1);
CFL1, CFL2 : CONSTANT FLT
RANGE 0.0 .. FLT (F (6)) :=
FLT (F (6) - 1);

TYPE FIX IS DELTA 1.0 RANGE -5.0 .. 5.0;
FI1, FI2 : FIX RANGE 0.0 .. FIX (F (7)) :=
FIX (F (7) - 1);
CFI1, CFI2 : CONSTANT FIX
RANGE 0.0 .. FIX (F (8)) :=
FIX (F (8) - 1);

BEGIN
IF D1 /= TUES THEN
FAILED ( "D1 NOT INITIALIZED TO CORRECT VALUE" );
END IF;

IF D2 /= THURS THEN
FAILED ( "D2 NOT INITIALIZED TO CORRECT VALUE" );
END IF;

IF CD1 /= TUES THEN
FAILED ( "CD1 NOT INITIALIZED TO CORRECT VALUE" );
END IF;

IF CD2 /= THURS THEN
FAILED ( "CD2 NOT INITIALIZED TO CORRECT VALUE" );
END IF;

IF I1 /= 1 THEN
FAILED ( "I1 NOT INITIALIZED TO CORRECT VALUE" );
END IF;

IF I2 /= 3 THEN
FAILED ( "I2 NOT INITIALIZED TO CORRECT VALUE" );
END IF;

IF CI1 /= 1 THEN
FAILED ( "CI1 NOT INITIALIZED TO CORRECT VALUE" );
END IF;

IF CI2 /= 3 THEN
FAILED ( "CI2 NOT INITIALIZED TO CORRECT VALUE" );
END IF;

IF FL1 /= 1.0 THEN
FAILED ( "FL1 NOT INITIALIZED TO CORRECT VALUE" );
END IF;

IF FL2 /= 3.0 THEN
FAILED ( "FL2 NOT INITIALIZED TO CORRECT VALUE" );
END IF;

IF CFL1 /= 1.0 THEN
FAILED ( "CFL1 NOT INITIALIZED TO CORRECT VALUE" );
END IF;

IF CFL2 /= 3.0 THEN
FAILED ( "CFL2 NOT INITIALIZED TO CORRECT VALUE" );
END IF;

IF FI1 /= 1.0 THEN
FAILED ( "FI1 NOT INITIALIZED TO CORRECT VALUE" );
END IF;

IF FI2 /= 3.0 THEN
FAILED ( "FI2 NOT INITIALIZED TO CORRECT VALUE" );
END IF;

IF CFI1 /= 1.0 THEN
FAILED ( "CFI1 NOT INITIALIZED TO CORRECT VALUE" );
END IF;

IF CFI2 /= 3.0 THEN
FAILED ( "CFI2 NOT INITIALIZED TO CORRECT VALUE" );
END IF;

END;

RESULT;
END C32001A;