@DougManuel

This PR updates only variables.csv, and only for the smoking variables. I will add the corresponding updates to variable_details.csv to this PR shortly (hopefully).

Additional, lower-priority changes to the smoking variables will be added in a later PR:

  • I am also working on updates to the documentation for smoking variables, but this work will need coordination with @mdiaspar.
  • Further, I will finish suggestions to the .R code for derived variables.

Changes to variables.csv (this PR) include:

  • Essentially all smoking variables and interview date variables are updated, because notes and descriptions were added or changed for all of them.
  • DatabaseStart to _m suffix: removed duplicates and typos.
  • Fixed variableStart typos.
  • Standardized units capitalization.
  • Added description and notes fields.
  • Updated version to 3.0.0-alpha.
  • Updated lastUpdated to 2025-10-26.
  • Added detailed reviewNotes (will be versionNotes in schema PR).
  • Cleared ICES.confirmation fields. I don't think these are needed as we move toward a penultimate review.
  • No changes to Observations.MD, as these are @mdiaspar's notes and she may need them for her review.

@DougManuel DougManuel added this to the v3.0.0-alpha milestone Oct 26, 2025
@DougManuel

CSV Standardization and validation

@yulric is working on variables.csv and variable_details.csv. In the meantime, I used standardise_csv() from the feature/csv-standardization-updates branch.

If you want to use it to verify the CSV formatting and structure:

# Source the csv-utils (from feature/csv-standardisation-updates branch)
source("R/csv-utils.R")

# Validate the updated variables.csv
result <- standardise_csv("inst/extdata/variables.csv", 
                         collaboration = FALSE,
                         validate_only = TRUE)

# Check results
print(result$valid)  # Should be TRUE
print(result$issues_remaining)  # Should be empty
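
For anyone running this check from a script rather than interactively, a minimal sketch of a fail-fast wrapper follows. It assumes only the two fields used above (`valid` and `issues_remaining`) and that `issues_remaining` can be pasted into a message; `check_validation()` is a hypothetical helper, not part of csv-utils, and the mock results are illustrative rather than real output.

```r
# Hypothetical helper: stop with a readable message unless validation passed.
# Assumes `result` is a list with `valid` and `issues_remaining`, as above.
check_validation <- function(result) {
  if (!isTRUE(result$valid)) {
    stop("CSV validation failed: ",
         paste(unlist(result$issues_remaining), collapse = "; "),
         call. = FALSE)
  }
  invisible(TRUE)
}

# Mock results for illustration only (not output from standardise_csv()).
ok  <- list(valid = TRUE,  issues_remaining = character(0))
bad <- list(valid = FALSE, issues_remaining = c("example issue"))

check_validation(ok)  # passes silently

# Capture the error message instead of halting, to show what a failure looks like.
outcome <- tryCatch(check_validation(bad),
                    error = function(e) conditionMessage(e))
print(outcome)
```

In practice, `check_validation(result)` would be called on the object returned by the validation call above.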

Clean diff achievement

The preserve_column_order = TRUE parameter was used during preparation to ensure minimal diff noise (we have a different column order in our metadata yaml):

  • 49 smoking variables updated (actual content changes)
  • No column reordering (would have shown 378 lines changed)
  • Only substantive changes are visible in the diff
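
To illustrate why preserving column order keeps the diff small, here is a rough base-R sketch of the general idea. It is not the implementation behind preserve_column_order; `restore_column_order()` and the toy data frame (including its column and variable names) are hypothetical and exist only for this example.

```r
# Toy stand-in for a CSV as it exists on disk; column order is what git sees.
original <- data.frame(
  variable = c("var_a", "var_b"),
  label    = c("Label A", "Label B"),
  units    = c("years", "N/A"),
  stringsAsFactors = FALSE
)

# Suppose a processing step returns the columns in a different order
# (for example, the order defined in a schema) plus one real content change.
processed <- original[, c("units", "variable", "label")]
processed$units[1] <- "Years"

# Hypothetical helper: put columns back in the reference order before writing.
# Columns not present in the reference are appended at the end.
restore_column_order <- function(df, reference_names) {
  kept  <- intersect(reference_names, names(df))
  added <- setdiff(names(df), reference_names)
  df[, c(kept, added), drop = FALSE]
}

restored <- restore_column_order(processed, names(original))

# Written back in this order, only the edited cell differs from the original,
# so every row is not rewritten by a column shuffle.
print(names(restored))
```

Without the reordering step, every row of the file would change in the diff even though only one cell was edited.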
