Video Outlier Optimization

Paper

Website

Abstract

The quality of training samples affects the performance of deep learning models. Prior studies have explored how outlier samples relate to model performance for modalities such as text and images in the NLP and computer vision domains, but the question remains relatively underexplored for video classification. Existing work has focused on anomaly detection or on theoretical bounds on outliers in video classification. Explicit, systematic empirical studies of how these outliers affect video classification models are still lacking. To bridge this gap, we systematically analyze the impact of outliers, specifically in-distribution outliers, on video classification performance and show that reducing the outliers in the training set can improve it.

Results

Top-1 Accuracy

Top-5 Accuracy

Setup and Training

$ git clone https://github.com/ckeith26/video-outlier-optimization.git

Or, if you have SSH set up:

$ git clone git@github.com:ckeith26/video-outlier-optimization.git

Jupyter Notebooks Independent Model Training

You can train any of the models directly from the Jupyter notebooks. The trained models will be saved in the models directory.

Download Datasets

Download the UCF101 dataset and extract the files to ./data/UCF-101.
Download the UCF101 TrainTestlist and extract the files to ./data/ucfTrainTestlist.
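Once both archives are extracted, each entry in a split file can be mapped to a video under ./data/UCF-101. The following is a minimal sketch, not code from this repository: parse_split_line is a hypothetical helper, and it assumes each line holds a relative path such as ClassName/video.avi, optionally followed by an integer class index.

```python
from pathlib import Path


def parse_split_line(line, video_root="./data/UCF-101"):
    """Turn one split-file line into a path, class name, and optional label.

    Assumed line format (not confirmed by this README):
        <ClassName>/<video file> [<integer class index>]
    """
    parts = line.split()
    if not parts:
        return None  # blank line
    rel_path = parts[0]
    # The class index is treated as optional because some lists may omit it.
    label = int(parts[1]) if len(parts) > 1 else None
    class_name = rel_path.split("/")[0]
    return {
        "path": Path(video_root) / rel_path,
        "class_name": class_name,
        "label": label,
    }


example = parse_split_line("ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01.avi 1")
print(example["path"].as_posix(), example["class_name"], example["label"])
```

Checking that the resulting path exists before training is a cheap way to catch a wrong extraction directory.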

Conda Environment

If you don't have a conda environment set up already, create one, activate it, and install the dependencies:

$ conda create -n myenv python=3.8
$ conda activate myenv
$ conda install transformers tqdm torchvision torch numpy pandas seaborn av ipykernel

Model Training with Pipeline

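Install the Docker SDK for Python: $ pip install docker (this does not install Docker itself; the Docker engine must already be installed to build and run images).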
Run $ sh ./scripts/run.sh to build the Docker image and run the container.

Or run the steps manually:

Build the Docker image: $ docker build -t video_outlier_optimization -f ./container/Dockerfile .
Run the container: $ docker run -p 8888:8888 video_outlier_optimization

Troubleshooting: Docker docs

Model Modification

Training Parameters

train_size: Number of training samples to use (e.g., 100000).
train_test_split: The proportion of the dataset to include in the train split (e.g., 0.8).
epochs: Number of epochs used to train every model (e.g., 15).

Data Parameters

load_more_data: If True, the code loads a new dataset with the specified train_size.

Configuration File

Modify config.ini

The code reads its configuration values from an INI file named config.ini in the repository root (./config.ini). The file should have the following structure:

[models]
model1 = ./training_files/base_model.ipynb
model2 = ./training_files/outlier_model_5P.ipynb
model3 = ./training_files/outlier_model_5P.ipynb
model4 = ./training_files/outlier_model_5P.ipynb

[training]
train_size = 100000
train_test_split = 0.8
epochs = 15
batch_size = 128

[data]
load_more_data = False
train_dataset = ./data/train_data.pt
test_dataset = ./data/test_data.pt
train_subset_100k = ./data/train_subset_100k.pt
test_subset_20k = ./data/test_subset_20k.pt
train_subset_100k_rest = ./data/train_subset_100k_rest.pt
test_subset_20k_rest = ./data/test_subset_20k_rest.pt
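
To illustrate how values in this layout can be read with the right types, here is a minimal sketch using Python's standard-library configparser. It is not the repository's own loading code: load_settings and SAMPLE are hypothetical names, and SAMPLE is an abbreviated copy of the structure above so the example runs without the real file.

```python
import configparser

# Abbreviated copy of the structure shown above (illustration only).
SAMPLE = """
[models]
model1 = ./training_files/base_model.ipynb

[training]
train_size = 100000
train_test_split = 0.8
epochs = 15
batch_size = 128

[data]
load_more_data = False
train_dataset = ./data/train_data.pt
"""


def load_settings(text):
    """Parse INI text and convert each value to its expected Python type."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    training = parser["training"]
    return {
        "models": dict(parser["models"]),
        "train_size": training.getint("train_size"),
        "train_test_split": training.getfloat("train_test_split"),
        "epochs": training.getint("epochs"),
        "batch_size": training.getint("batch_size"),
        # getboolean accepts True/False, yes/no, on/off, and 1/0.
        "load_more_data": parser["data"].getboolean("load_more_data"),
        "train_dataset": parser["data"]["train_dataset"],
    }


settings = load_settings(SAMPLE)
print(settings)
```

To read the real file instead, parser.read("./config.ini") can replace the read_string call. All INI values are stored as strings, so the typed getters matter: the string "False" would otherwise be truthy.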
