File cache without kernel cache #1946

@natbprice

A common use case for blobfuse2 seems to be: download a file from blob storage when it is first accessed, cache it for a fixed period, and re-download it if it is accessed after the cache time has expired. During the cache period the file may or may not have been updated externally, but this is acceptable for this hypothetical use case so long as the local copy is not older than the specified cache time. This appears to be straightforward to configure using the --file-cache-timeout option.

However, in my testing it seems that I also need to use --disable-kernel-cache; otherwise I keep reading the old file even after the specified file cache timeout has expired. The default behavior includes kernel caching, which can lead to non-intuitive results if you are not aware of the multiple caching layers and how they interact. The option to disable kernel caching seems important, yet it doesn't appear on the landing page and is not included in the nice flowcharts on caching.
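To make the interaction I am describing concrete, here is a toy model of the two layers in Python. It is only a sketch of the behavior I observed, not blobfuse2's actual implementation: the class names, the `simulate` helper, and the 120-second timeout are all made up for illustration, and the model simply assumes the kernel layer keeps serving whatever it cached on first read.

```python
class BlobStore:
    """Stand-in for the remote blob; `content` can be changed externally."""
    def __init__(self, content):
        self.content = content


class FileCache:
    """Hypothetical file cache layer: re-downloads once `timeout` seconds have passed."""
    def __init__(self, store, timeout):
        self.store = store
        self.timeout = timeout
        self.data = None
        self.fetched_at = None

    def read(self, now):
        expired = self.data is None or now - self.fetched_at >= self.timeout
        if expired:
            self.data = self.store.content  # simulated download
            self.fetched_at = now
        return self.data


class KernelCache:
    """Hypothetical kernel layer. ASSUMPTION: once populated, it never asks the
    file cache again, which mirrors what I observed but is not a claim about
    how the real page cache decides to invalidate."""
    def __init__(self, file_cache, enabled=True):
        self.file_cache = file_cache
        self.enabled = enabled
        self.pages = None

    def read(self, now):
        if self.enabled and self.pages is not None:
            return self.pages  # served without consulting the file cache
        data = self.file_cache.read(now)
        if self.enabled:
            self.pages = data
        return data


def simulate(kernel_cache_enabled, timeout=120):
    """Read once, update the blob externally, then read before and after the timeout."""
    store = BlobStore("v1")
    mount = KernelCache(FileCache(store, timeout), enabled=kernel_cache_enabled)
    first = mount.read(now=0)
    store.content = "v2"  # external update during the cache period
    within = mount.read(now=timeout - 1)
    after = mount.read(now=timeout + 1)
    return first, within, after
```

In this model, `simulate(kernel_cache_enabled=True)` keeps returning the old content after the timeout, while `simulate(kernel_cache_enabled=False)` picks up the new content once the file cache timeout has passed, which matches what I see with and without --disable-kernel-cache.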

Somewhat related is the --direct-io option, and I found some discussion of it in the context of resolving caching issues. However, this seems like an extreme option because it disables not only kernel caching but also the blobfuse2 file caching. That seems like a less common use case, yet it has more discussion than the option of simply disabling the kernel cache.

Is my understanding correct that we must use --disable-kernel-cache to ensure blobfuse2 can manage the cache in accordance with the specified --file-cache-timeout?

Would it make sense to update the documentation to better explain the --disable-kernel-cache option?

By the way, I am using blobfuse through Azure Batch, configured with the Azure Batch SDK (https://learn.microsoft.com/en-us/azure/batch/virtual-file-mount?tabs=windows). I am not sure whether this causes any atypical behavior, but it does add another layer of abstraction, which makes it harder to configure and to debug issues.
