overriding rpath filename when downloading #51

@grlloyd

Web resources are currently downloaded to rpath, which is constructed by combining a unique id (if requested) and the file name extracted from the URL. However, some URLs don't include a filename, e.g.

src = 'https://pubchem.ncbi.nlm.nih.gov/sdq/sdqagent.cgi?infmt=json&outfmt=csv&query={%22download%22:%22*%22,%22collection%22:%22pathway%22,%22order%22:[%22relevancescore,desc%22],%22start%22:1,%22limit%22:10000000,%22downloadfilename%22:%22PubChem_pathway_text_Reactome%22,%22where%22:{%22ands%22:[{%22*%22:%22Reactome%22},{%22source%22:%22Reactome%22}]}}'

In this case the URL contains JSON, so I think the download fails because the filename generated for rpath isn't valid. However, any URL that doesn't end in a filename but still returns a file could end up with an unwieldy filename in the cache folder.

I tried to overcome this using bfcupdate to change rpath before downloading, but it fails because bfcupdate changes the rtype to "local".

One option would be to add an argument to bfcadd that lets the user override the default filename for rpath, e.g. rpath_filename = "new_filename.xyz", and construct rpath from that instead of trying to extract it from the URL.

Or you could try to extract the intended filename from the httr::GET response, if there is one.
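To make the second option concrete, here is a minimal base-R sketch of how a filename could be chosen from a response. It is only an illustration: `filename_from_response()` and its arguments are hypothetical names, not part of BiocFileCache or httr, and it assumes the server sends a plain `Content-Disposition: ...; filename=...` header (the RFC 5987 `filename*=` form is not handled). Whether the PubChem endpoint above actually sends such a header is not something I have checked.

```r
# Hypothetical helper (not part of BiocFileCache or httr): pick a cache
# filename from a Content-Disposition header value, falling back to the URL.
filename_from_response <- function(url, content_disposition = NULL,
                                   fallback = "download") {
    if (!is.null(content_disposition)) {
        # Match filename="name.ext" or filename=name.ext
        m <- regmatches(
            content_disposition,
            regexec('filename="?([^";]+)"?', content_disposition)
        )[[1]]
        if (length(m) == 2L) {
            # basename() guards against a header that smuggles in a path
            return(basename(trimws(m[2])))
        }
    }
    # No usable header: drop the query string and fragment,
    # then take the last path component of the URL.
    path <- sub("[?#].*$", "", url)
    name <- basename(path)
    # Replace anything that is not safe in a filename.
    name <- gsub("[^A-Za-z0-9._-]", "_", name)
    if (nzchar(name)) name else fallback
}
```

With httr, the header value could presumably be read with `httr::headers(resp)[["content-disposition"]]` and passed as the second argument. For the URL above, the fallback branch would give `sdqagent.cgi` rather than a name containing the JSON query.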

Is there a workaround for this that doesn't need an update to BiocFileCache?
