Merged
1 change: 1 addition & 0 deletions .gitignore
@@ -5,3 +5,4 @@
**/output/
**/*.xml
**/*.fasta
**/.DS_Store
93 changes: 85 additions & 8 deletions README.md
@@ -6,16 +6,93 @@

## Objective

This is a Python implementation of the BC-CfE HLA algorithm. This algorithm has
several interpretations, including [BBLab's hla-easy.rb](https://github.com/cfe-lab/bblab-server/blob/main/alldata/hla_class/hla-easy.rb) file. This project aims
to consolidate all of our HLA algorithm versions into a single version.
This is a Python implementation of the BC-CfE HLA algorithm. This algorithm
has several implementations, including [BBLab's hla-easy.rb][bblab_hla] file.
This project aims to consolidate all of our HLA algorithm versions into a
single version.

## Testing
[bblab_hla]: https://github.com/cfe-lab/bblab-server/blob/main/alldata/hla_class/hla-easy.rb

FIXME
The `hla_algorithm` Python package provided by this repository exposes some
executables as well.

The current validation method is to go to the [BBLab HLA Class tool page](https://hivresearchtools.bccfe.ca/django/tools/hla_class/), upload [test.fasta](https://github.com/cfe-lab/bblab-server/blob/main/tests/test.fasta) in `HLA Type C`, and assert that its output matches [the output kept in version control](https://github.com/cfe-lab/bblab-server/blob/main/tests/hla_class/HLA-C%20batch%20mode%20test%20data%20OUTPUT.csv).
* [`interpret_from_json`][interpret_from_json]: A bare-bones script that
performs interpretation on an HLA sequence specified in JSON.
* This script is also available as `python -m hla_algorithm` (in the
environment where the module is installed).
* [`update_alleles`][update_alleles]: Retrieves alleles from IMGT/HLA (or an
alternative source) and creates a YAML configuration file suitable for use with
this code. This is used periodically to update our lists of alleles with
up-to-date information.
* [`update_frequency_file`][update_frequency_file]: Updates an "old-style"
HLA frequency file into the format used by this algorithm.
* As of the time of writing, the frequencies file has unknown origin and has
never been updated. Under typical operation, you'll likely never use this.
* [`reformat_old_alleles`][reformat_old_alleles]: Updates "old-style" HLA
allele lists into a YAML configuration file suitable for use with this code.
These allele lists may be either "unreduced" (with duplicates remaining) or
"reduced" (with duplicates collapsed into a single representative). This
script can be used to update old allele lists that you wish to continue using
with this code, e.g. if you already have an old implementation of this
algorithm running somewhere and wish to preserve its functionality.
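
To illustrate the "unreduced" versus "reduced" distinction, here is a minimal sketch. It assumes that "duplicates" means alleles with identical sequences, and that the representative is simply the first name in sorted order; neither assumption is confirmed by `reformat_old_alleles`, and the function name, output layout, and toy sequences are all hypothetical.

```python
from collections import defaultdict


def reduce_alleles(alleles):
    """Collapse alleles with identical sequences into one representative.

    `alleles` maps allele name -> sequence (an "unreduced" list).
    Assumption: the representative is the first name in sorted order;
    the real script may choose differently.
    """
    by_sequence = defaultdict(list)
    for name, sequence in alleles.items():
        by_sequence[sequence].append(name)

    reduced = {}
    for sequence, names in by_sequence.items():
        names.sort()
        # Keep the representative and remember which names it stands for.
        reduced[names[0]] = {"sequence": sequence, "duplicates": names[1:]}
    return reduced


# Toy data only; these are not real HLA sequences.
unreduced = {
    "A*01:01:01": "ACGT",
    "A*01:01:02": "ACGT",
    "A*02:01": "ACGA",
}
reduced = reduce_alleles(unreduced)
```

In this toy input the two `A*01:01` entries share a sequence, so the reduced list keeps one of them as the representative and records the other as its duplicate.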

## TODO
[interpret_from_json]: src/hla_algorithm/interpret_from_json.py
[update_alleles]: src/hla_algorithm/update_alleles.py
[update_frequency_file]: src/hla_algorithm/update_frequency_file.py
[reformat_old_alleles]: src/hla_algorithm/reformat_old_alleles.py

<https://waylonwalker.com/hatch-version/>
### Ruby wrapper

In addition to the Python module, this repository also contains a Ruby wrapper
enabling this code to be used in Ruby (through system calls). The Ruby wrapper
is intended as a drop-in replacement for the Ruby-based HLA algorithm that has
been used at the CfE for years; once installed, this Ruby module provides an
`HLAAlgorithm` class with an `analyze` method, as the old code did.
Additionally, this `HLAAlgorithm` class can optionally be provided with new
configuration files, so that the alleles (and frequencies, if that ever happens)
can be updated without having to update the code.

This Ruby module works by simply packing up its inputs into a JSON file and
invoking `interpret_from_json`. Therefore, the Python module must be installed
on the system where you're using the Ruby module.

## Installation

This code may be installed from our git repo. For example, using pip:

```
pip install git+https://github.com/cfe-lab/hla_algorithm.git@[tag]
```

Using uv:

```
uv add git+https://github.com/cfe-lab/hla_algorithm.git@[tag]
```

### Ruby wrapper

Once the Python module is installed, the Ruby module can be installed from our
GitHub package repository. First, you need to authenticate with the GitHub
RubyGems registry. In brief, a personal access token (classic) is required to
install from this registry (even though our repository is public). Then you
must update your gem sources/bundler configuration with this token along with
the username of the token creator. Full details may be found in
[the GitHub instructions][github_rubygem_auth].

From there, you can install using either `gem` or via a Gemfile (i.e. with
`bundler`). Instructions may be found on
[the GitHub page for the package][github_package_page].

The Ruby module must be instructed on how to invoke `interpret_from_json`,
using the environment variable `HLA_INTERPRET_FROM_JSON`. For example, if you
install Python using `uv` in your project directory (as recommended), then you
would set this to `uv run interpret_from_json`. If your project is
Docker-based, then you can set this in your Dockerfile:

```
ENV HLA_INTERPRET_FROM_JSON="uv run interpret_from_json"
```
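
As a rough illustration of how a wrapper might consume this variable, the sketch below (in Python rather than Ruby) splits `HLA_INTERPRET_FROM_JSON` into an argument list and writes its inputs to a temporary JSON file. The helper names, the fallback to a bare `interpret_from_json` when the variable is unset, the payload fields, and passing the file path as a positional argument are all assumptions for illustration, not the wrapper's actual behaviour.

```python
import json
import os
import shlex
import tempfile


def resolve_command(env=None):
    """Split HLA_INTERPRET_FROM_JSON into an argv list.

    Assumption: fall back to a bare `interpret_from_json` if unset.
    """
    env = os.environ if env is None else env
    raw = env.get("HLA_INTERPRET_FROM_JSON", "interpret_from_json")
    # shlex.split handles quoting, so "uv run interpret_from_json"
    # becomes three separate arguments.
    return shlex.split(raw)


def write_payload(payload):
    """Write `payload` to a temporary JSON file and return its path."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".json", delete=False
    ) as handle:
        json.dump(payload, handle)
        return handle.name


# Hypothetical payload; the real input schema is not shown here.
path = write_payload({"example_field": "placeholder"})
# Assumption: the input file is passed as a positional argument.
argv = resolve_command({"HLA_INTERPRET_FROM_JSON": "uv run interpret_from_json"})
argv.append(path)
print(argv)
os.unlink(path)
```

Splitting the value into a list means the command can be run without a shell, which is why a multi-word setting such as `uv run interpret_from_json` works as a single environment variable.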

[github_rubygem_auth]: https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-rubygems-registry
[github_package_page]: https://github.com/cfe-lab/hla_algorithm/pkgs/rubygems/hla_algorithm
10 changes: 7 additions & 3 deletions pyproject.toml
@@ -4,21 +4,23 @@ build-backend = "hatchling.build"

[project]
name = "hla_algorithm"
description = ''
description = 'Python implementation of the BC-CfE HLA interpretation algorithm'
readme = "README.md"
requires-python = ">=3.10"
license = "MIT"
keywords = []
authors = [
{ name = "Rosemary McCloskey", email = "[email protected]" },
{ name = "David Rickett", email = "[email protected]" },
{ name = "Richard Liang", email = "[email protected]" },
]
classifiers = [
"Development Status :: 4 - Beta",
"Development Status :: 5 - Production/Stable",
"Programming Language :: Python",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Programming Language :: Python :: Implementation :: CPython",
"Programming Language :: Python :: Implementation :: PyPy",
]
@@ -61,6 +63,7 @@ Source = "https://github.com/cfe-lab/hla_algorithm"
interpret_from_json = "hla_algorithm.interpret_from_json:main"
update_alleles = "hla_algorithm.update_alleles:main"
update_frequency_file = "hla_algorithm.update_frequency_file:main"
reformat_old_alleles = "hla_algorithm.reformat_old_alleles:main"

[tool.hatch.version]
source = "uv-dynamic-versioning"
@@ -75,6 +78,7 @@ include = [
"src/hla_algorithm/interpret_from_json.py",
"src/hla_algorithm/models.py",
"src/hla_algorithm/py.typed",
"src/hla_algorithm/reformat_old_alleles.py",
"src/hla_algorithm/update_alleles.py",
"src/hla_algorithm/update_frequency_file_lib.py",
"src/hla_algorithm/update_frequency_file.py",
@@ -125,6 +129,7 @@ omit = [
"src/hla_algorithm/__main__.py",
"tests/__init__.py",
"src/hla_algorithm/interpret_from_json.py",
"src/hla_algorithm/reformat_old_alleles.py",
"src/hla_algorithm/update_alleles.py",
"src/hla_algorithm/update_frequency_file.py",
"src/scripts/*.py",
@@ -149,7 +154,6 @@ select = ["E4", "E7", "E9", "F", "C"]
# https://www.flake8rules.com/rules/E712.html
ignore = ["E712"]


[tool.pydocstyle]
match = "src/**/*.py"

Binary file removed src/hla_algorithm/.DS_Store
Binary file not shown.