@quaxsze quaxsze commented Jan 27, 2020

No description provided.

quaxsze commented Feb 4, 2020

@abulte I need a review here. I left several comments in the code after implementing the double cache feature.
My question concerns the upload view.
An uploaded file has no URL hash, so the way we look it up in tableview or exportview will not work.

A workaround would be to change the endpoint from
"{scheme}://{request.host}/api/{urlhash}"
to
"{scheme}://{request.host}/api/{filehash}"

This would work, but would it make the URL hash useless?
We would then use it only to retrieve a DB entry for the first step of cache validation, and even that could perhaps be done by looking up the filehash directly?

@quaxsze quaxsze requested a review from abulte February 4, 2020 16:26

@abulte abulte left a comment

About the upload view: I think we can rely on the file hash only in this case, and store url hash = None.
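To make the suggestion concrete, here is a minimal sketch of a lookup that accepts either hash, so an uploaded file stored with a NULL URL hash can still be found by its file hash. The table name `cache_info`, its columns, and the `conn` parameter are assumptions for illustration only; the real `get_db_info` in this PR may look different.

```python
import sqlite3


def get_db_info(conn, urlhash=None, filehash=None):
    """Return the cache row matching urlhash, or filehash if no urlhash is given."""
    if urlhash is not None:
        query, params = "SELECT * FROM cache_info WHERE urlhash = ?", (urlhash,)
    elif filehash is not None:
        query, params = "SELECT * FROM cache_info WHERE filehash = ?", (filehash,)
    else:
        raise RuntimeError("get_db_info needs at least one non-None argument")
    return conn.execute(query, params).fetchone()


# Hypothetical schema: urlhash is nullable, filehash is always present.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache_info (urlhash TEXT, filehash TEXT NOT NULL)")
# Uploaded file: there is no source URL, so urlhash is stored as NULL.
conn.execute("INSERT INTO cache_info VALUES (?, ?)", (None, "abc123"))
# Downloaded file: both hashes are known.
conn.execute("INSERT INTO cache_info VALUES (?, ?)", ("u1", "def456"))

uploaded = get_db_info(conn, filehash="abc123")
downloaded = get_db_info(conn, urlhash="u1")
```

With this shape, the upload view would only ever call the function with `filehash`, while the URL-based views could keep using `urlhash` for the first cache check.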

else:
    raise RuntimeError('Func get_db_info need at least one not none argument')

res = c.fetchone()

The code below can probably be made cleaner/shorter by getting column values in a dict instead of a list, e.g. res['uuid'].
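One way to get name-based access, assuming the cursor `c` comes from the standard library `sqlite3` module, is to set `row_factory` to `sqlite3.Row` on the connection. The table and column names below are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# sqlite3.Row lets each fetched row be indexed by column name as well as by position.
conn.row_factory = sqlite3.Row

# Hypothetical table, for illustration only.
conn.execute("CREATE TABLE cache_info (uuid TEXT, urlhash TEXT, filehash TEXT)")
conn.execute("INSERT INTO cache_info VALUES (?, ?, ?)", ("u-1", None, "abc123"))

c = conn.cursor()
c.execute("SELECT * FROM cache_info WHERE filehash = ?", ("abc123",))
res = c.fetchone()

# Access by name instead of res[0], res[1], ...
uuid = res["uuid"]
# Convert to a plain dict if the caller needs one.
info = dict(res) if res is not None else None
```

This avoids index bookkeeping when columns are added or reordered.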

logger=app.logger,
sniff_limit=app.config.get('CSV_SNIFF_LIMIT'),
max_file_size=app.config.get('MAX_FILE_SIZE')
)

Fix the indentation here.

@abulte abulte left a comment

Add tests :-)

encoding = detect_encoding(filepath) if not encoding else encoding
table = from_csv(filepath, encoding=encoding, sniff_limit=sniff_limit)
- return to_sql(table, urlhash, storage)
+ return to_sql(table, urlhash, filehash, storage)

Missing newline at EOF

filehash = X.hexdigest()
logger.debug('* Downloaded %s', filehash)
if not is_hash_relevant(urlhash, filehash):
print("HASH IS NOT RELEVANT")

Remove this debug print.

except Exception as e:
raise APIError('Error parsing CSV: %s' % e)
else:
print("HASH IS RELEVANT")

Remove this debug print.

raise APIError('Error parsing CSV: %s' % e)
else:
app.logger.info(f"{urlhash}.db already exists, skipping parse.")
print("AGE IS OK")

Remove this debug print.
