@scravikiran
Contributor

No description provided.

- GHSA-887c-mr87-cxwp: Upgrade torch from 2.7.1 to 2.8.0 in all affected environments
- GHSA-36rr-ww3j-vrjv: Upgrade keras from 3.11.0 to 3.11.3 in tensorflow environment
- GHSA-4xh5-x5gv-qwph: Upgrade pip to latest secure version across all environments
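A minimal sketch of how the torch and keras upgrades listed above could be checked inside a built environment. The minimum versions come from the list above; pip is left out because no specific target version is named. The helper names (`parse_version`, `find_outdated`, `installed_versions`) are hypothetical, and the simple numeric parser is an assumption that ignores pre-release tags and local suffixes such as `+cu126`.

```python
from importlib import metadata
from itertools import takewhile

# Minimum versions taken from the upgrade list above. pip is omitted because
# the list names no specific target version for it.
MIN_VERSIONS = {"torch": "2.8.0", "keras": "3.11.3"}


def parse_version(text):
    """Turn '2.8.0+cu126' into (2, 8, 0); anything after the digits is dropped."""
    core = text.split("+", 1)[0]
    parts = []
    for piece in core.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def find_outdated(installed, minimums=MIN_VERSIONS):
    """Return {name: (installed, required)} for packages below the minimum."""
    outdated = {}
    for name, required in minimums.items():
        current = installed.get(name)
        if current is not None and parse_version(current) < parse_version(required):
            outdated[name] = (current, required)
    return outdated


def installed_versions(names):
    """Look up installed versions, skipping packages that are absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return found


if __name__ == "__main__":
    print(find_outdated(installed_versions(MIN_VERSIONS)))
```

Packages that are not installed are skipped instead of being reported, since not every environment in this PR ships both torch and keras.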

Environments fixed:
- automl environments (ai-ml-automl-*)
- fine-tuning environments (acft-*)
- general ML environments (sklearn, lightgbm, tensorflow)
- vision processing environments
- pytorch environments

All fixes maintain backward compatibility while resolving critical security issues.

- PyTorch 2.8.0 requires Python 3.10 or higher
- Updated all AutoML environments using Python 3.9 to Python 3.10
- This resolves the conda solver error: 'nothing provides __cuda needed by pytorch-2.8.0'

Environments updated:
- ai-ml-automl
- ai-ml-automl-dnn
- ai-ml-automl-dnn-forecasting-gpu
- ai-ml-automl-dnn-gpu
- ai-ml-automl-dnn-text-gpu
- ai-ml-automl-dnn-vision-gpu
- ai-ml-automl-gpu

- torchvision 0.22.1 depends on torch==2.7.1
- torchvision 0.23.0 is compatible with torch==2.8.0
- This resolves pip dependency conflicts during installation
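The pairing described above can be sketched as a small lookup. The table holds only the two pairings stated in this commit message; the function name `check_pair` is hypothetical, and other torchvision releases are deliberately reported as unknown, not guessed.

```python
# Pairings stated in the commit message above; other releases are not covered.
TORCHVISION_TO_TORCH = {
    "0.22.1": "2.7.1",
    "0.23.0": "2.8.0",
}


def check_pair(torch_version, torchvision_version):
    """Return None if the pair matches, otherwise a short explanation."""
    expected = TORCHVISION_TO_TORCH.get(torchvision_version)
    if expected is None:
        return f"torchvision {torchvision_version} is not in the known pairings"
    if expected != torch_version:
        return (
            f"torchvision {torchvision_version} expects torch {expected}, "
            f"got {torch_version}"
        )
    return None


if __name__ == "__main__":
    # The combination that caused the pip conflict before this fix.
    print(check_pair("2.8.0", "0.22.1"))
```

A check like this could run before the image build to surface a mismatched pin early, so it does not fail later inside pip's resolver.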

Fixed environments:
- acpt-grpo
- ai-ml-automl-dnn-text-gpu
- ai-ml-automl-dnn-forecasting-gpu
- ai-ml-automl-dnn-vision-gpu
- ai-ml-automl-dnn-gpu
- automl-dnn-vision-gpu

Specialized fixes for environments with complex dependency conflicts:

1. ai-ml-automl-dnn-text-gpu:
   - Downgrade transformers from 4.53.0 to 4.48.0 (azureml-automl-dnn-nlp requirement)
   - Use torch==2.2.2 + torchvision==0.17.2 (azureml-automl-dnn-nlp requirement)
   - Downgrade urllib3 from 2.5.0 to 1.26.18 (azureml-automl-runtime requirement)

2. ai-ml-automl-gpu:
   - Downgrade urllib3 from 2.5.0 to 1.26.18 (azureml-automl-runtime requirement)
   - Keep pip security upgrade

These environments require older package versions due to azureml-automl-runtime
and azureml-automl-dnn-nlp compatibility constraints. The security vulnerabilities
in torch and transformers will need to be addressed through runtime updates
rather than package upgrades.

@github-actions

github-actions bot commented Oct 9, 2025

Test Results for assets-test

12 tests: 8 ✅ passed, 0 💤 skipped, 4 ❌ failed
12 suites, 12 files
Duration: 2h 15m 25s ⏱️

For more details on these failures, see this check.

Results for commit b258973.

♻️ This comment has been updated with latest results.

Contributor

@vizhur vizhur left a comment

These updates do not seem correct.

  • Use no cache, as installing torch and related packages otherwise results in image layers larger than 8 GB.
  • Do not reinstall torch if you are based on an aifx image; it defeats the purpose of using that base image.
  • Do not upgrade pip, especially in random envs that don't even exist in the image.
  • If you have to update packages like urllib3 or requests, you have a bigger problem with your dependencies.
  • If you create a new conda environment on top of an aifx image, there is absolutely no reason to use the aifx image; start from a light one from nvcr or one of the training base images.
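A rough sketch of the first two points, under assumptions: "no cache" is read here as pip's `--no-cache-dir` option, and a package counts as already provided by the base image if it can be imported under the same name as its requirement (true for torch, not for every distribution). The function name `build_install_command` and its `is_installed` hook are hypothetical.

```python
import importlib.util
import sys


def build_install_command(packages, is_installed=None):
    """Build a pip command that skips present packages and disables the cache.

    Returns None when nothing is left to install.
    """
    if is_installed is None:
        # Assumption: the import name equals the requirement name.
        def is_installed(name):
            return importlib.util.find_spec(name) is not None

    # Strip an '==version' pin before asking whether the package is present.
    missing = [pkg for pkg in packages if not is_installed(pkg.split("==", 1)[0])]
    if not missing:
        return None
    return [sys.executable, "-m", "pip", "install", "--no-cache-dir", *missing]


if __name__ == "__main__":
    # On an image that already ships torch, only torchvision would be installed.
    print(build_install_command(["torch==2.8.0", "torchvision==0.23.0"]))
```

The resulting list can be passed to `subprocess.run`. Skipping packages that the base image already provides keeps its preinstalled torch build in place, which is the point the review makes about aifx images.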

@github-actions

This pull request has been marked as stale because it has been inactive for 14 days.

@github-actions github-actions bot added the Stale label Oct 24, 2025
@github-actions

This pull request has been automatically closed due to inactivity.

@github-actions github-actions bot closed this Oct 31, 2025