FOLIOSYNC-7 Prepare files for Hyacinth Sync #38
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| development: | ||
| download_directory: <%= Rails.root.join('tmp/working_data/development/folio_to_hyacinth/downloaded_files') %> | ||
|
|
||
| test: | ||
| download_directory: <%= Rails.root.join('tmp/working_data/test/folio_to_hyacinth/downloaded_files') %> |
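The config above mixes YAML with ERB tags, so it has to be rendered before it is parsed. The following is a minimal plain-Ruby sketch of that two-step load, assuming the file is consumed in roughly this way (the diff does not show the loading code). `load_config` and the `root` variable are hypothetical names; `root` stands in for `Rails.root`.

```ruby
require 'erb'
require 'yaml'
require 'pathname'

# Same shape as the config above, with a hypothetical `root` in place of Rails.root.
raw_config = <<~YAML
  development:
    download_directory: <%= root.join('tmp/working_data/development/folio_to_hyacinth/downloaded_files') %>
  test:
    download_directory: <%= root.join('tmp/working_data/test/folio_to_hyacinth/downloaded_files') %>
YAML

# Render the ERB tags first, then parse the resulting YAML and pick one environment.
def load_config(raw_yaml, env, root:)
  rendered = ERB.new(raw_yaml).result_with_hash(root: root)
  YAML.safe_load(rendered).fetch(env)
end

# '/srv/folio_sync' is an illustrative path, not one taken from the project.
config = load_config(raw_config, 'test', root: Pathname.new('/srv/folio_sync'))
puts config['download_directory']
```

Because the directory is derived from the application root, each environment writes its downloads under its own `tmp/working_data/<env>` tree.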
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,9 @@ | ||
| development: | ||
| url: https://example-dev.library.edu/api | ||
| email: example-email | ||
| password: example-password | ||
|
|
||
| test: | ||
| url: https://example-test.library.edu/api | ||
| email: example-email | ||
| password: example-password |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,79 @@ | ||
| # frozen_string_literal: true | ||
|
|
||
| class FolioSync::FolioToHyacinth::MarcDownloader | ||
| attr_reader :downloading_errors | ||
|
|
||
| def initialize | ||
| @folio_client = FolioSync::Folio::Client.instance | ||
| @folio_reader = FolioSync::Folio::Reader.new | ||
| @downloading_errors = [] | ||
| end | ||
|
|
||
| # Downloads all SRS MARC bibliographic records that have a 965 field that has a subfield $a value of '965hyacinth' | ||
| # AND were modified within the last `last_x_hours` hours. | ||
| # A `last_x_hours` value of `nil` means that we want to download ALL '965hyacinth' records, regardless of modification time. | ||
| def download_965hyacinth_marc_records(last_x_hours = nil) | ||
| modified_since = Time.now.utc - (3600 * last_x_hours) if last_x_hours | ||
| modified_since_utc = modified_since&.utc&.iso8601 | ||
| Rails.logger.info( | ||
| "Downloading MARC with 965hyacinth#{modified_since_utc ? " modified since: #{modified_since_utc}" : ' (all records)'}" | ||
| ) | ||
|
|
||
| @folio_client.find_source_marc_records(modified_since: modified_since_utc, with_965_value: '965hyacinth') do |parsed_record| | ||
| # The returned MARC record has been filtered to include records with "965hyacinth" identifiers | ||
| # but we want to double-check that the identifier lives in the 965$a field. | ||
| if has_965hyacinth_field?(parsed_record) | ||
| begin | ||
| save_marc_record_to_file(parsed_record) | ||
| rescue StandardError => e | ||
| record_id = extract_id(parsed_record) || 'unknown' | ||
| error_message = "Failed to save MARC record #{record_id}: #{e.message}" | ||
| @downloading_errors << error_message | ||
| Rails.logger.error(error_message) | ||
| end | ||
| end | ||
| end | ||
| end | ||
|
|
||
| # @param [Hash] marc_record A MARC record represented as a Hash | ||
| def has_965hyacinth_field?(marc_record) | ||
| fields = marc_record['fields'] | ||
|
|
||
| fields.any? do |field| | ||
| next unless field['965'] | ||
|
|
||
| field['965']['subfields']&.any? { |subfield| subfield['a'] == '965hyacinth' } | ||
| end | ||
| end | ||
|
|
||
| def save_marc_record_to_file(marc_record) | ||
| config = Rails.configuration.folio_to_hyacinth | ||
| filename = extract_id(marc_record) | ||
|
|
||
| raise FolioSync::Exceptions::Missing001Field, 'MARC record is missing required 001 field' if filename.nil? | ||
|
|
||
| file_path = File.join(config[:download_directory], "#{filename}.mrc") | ||
| formatted_marc = MARC::Record.new_from_hash(marc_record) | ||
|
|
||
| Rails.logger.info("Saving MARC record with 001=#{filename} to #{file_path}") | ||
| File.binwrite(file_path, formatted_marc) | ||
| end | ||
|
|
||
| # Downloads a single SRS MARC record to the download directory. Raises an exception if the record with the given `folio_hrid` | ||
| # does NOT have at least one 965 field with a subfield $a value of '965hyacinth'. | ||
| def download_single_965hyacinth_marc_record(folio_hrid) | ||
| source_record = @folio_client.find_source_record(instance_record_hrid: folio_hrid) | ||
| marc_record = source_record['parsedRecord']['content'] if source_record | ||
|
|
||
| unless has_965hyacinth_field?(marc_record) | ||
|
Member: Good move adding this check here! |
||
| raise "Source record with HRID #{folio_hrid} doesn't have a 965 field with subfield $a value of '965hyacinth'." | ||
| end | ||
|
|
||
| save_marc_record_to_file(marc_record) | ||
| end | ||
|
|
||
| def extract_id(marc_record) | ||
| field_001 = marc_record['fields']&.find { |f| f['001'] } | ||
| field_001 ? field_001['001'] : nil | ||
| end | ||
|
Comment on lines 75 to 78 (Member): It would be a problem if the record is missing an 001 (and hopefully that will never happen!), so I think it would be good to have this method return `nil` in that case. The overall restructuring I'm proposing will prevent the (hopefully rare/impossible) case where we would write out an "unknown.mrc" file to disk. And I think the code change would also make it clearer that a record with a missing 001 value is an exceptional case that would cause issues generally. |
||
| end | ||
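The two pieces of logic the downloader relies on, the modification cutoff and the 965 $a check, can be exercised without Rails or a FOLIO connection. The following is a rough sketch under stated assumptions: `modified_since_cutoff` is a hypothetical helper name, the sample record and its 001 value are invented for illustration, and the nil guards are an addition that the diff itself does not have.

```ruby
require 'time'

# Hours -> ISO 8601 UTC cutoff, or nil when no window is given (meaning "all records").
def modified_since_cutoff(last_x_hours, now: Time.now.utc)
  return nil if last_x_hours.nil?

  (now - (3600 * last_x_hours)).utc.iso8601
end

# True when the record hash has a 965 field whose subfield $a equals '965hyacinth'.
# Returns false (instead of raising) when the record or its fields are missing.
def has_965hyacinth_field?(marc_record)
  fields = marc_record.is_a?(Hash) ? marc_record['fields'] : nil
  return false unless fields.is_a?(Array)

  fields.any? do |field|
    subfields = field.dig('965', 'subfields')
    subfields.is_a?(Array) && subfields.any? { |subfield| subfield['a'] == '965hyacinth' }
  end
end

# Invented sample record in the same hash layout the diff iterates over.
sample_record = {
  'leader' => '00000nam a2200000 a 4500',
  'fields' => [
    { '001' => 'in00000000001' },
    { '965' => { 'ind1' => ' ', 'ind2' => ' ', 'subfields' => [{ 'a' => '965hyacinth' }] } }
  ]
}

puts has_965hyacinth_field?(sample_record)
puts modified_since_cutoff(24).inspect
```

The nil guard matters mostly for the single-record path, where a lookup that finds nothing would otherwise pass `nil` into the check.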
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| # frozen_string_literal: true | ||
|
|
||
| class FolioSync::Hyacinth::Client < HyacinthApi::Client | ||
| # HyacinthApi will be extracted to a gem in the future | ||
| def self.instance | ||
| @instance ||= self.new( | ||
| HyacinthApi::Configuration.new( | ||
| url: Rails.configuration.hyacinth['url'], | ||
| email: Rails.configuration.hyacinth['email'], | ||
| password: Rails.configuration.hyacinth['password'] | ||
| ) | ||
| ) | ||
| @instance | ||
| end | ||
| end |
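The `@instance ||=` idiom used above memoizes one client per class. Here is a minimal sketch of that pattern, with hypothetical class names (`SketchBaseClient`, `SketchAppClient`) and the placeholder settings from the example config standing in for `Rails.configuration.hyacinth`:

```ruby
# Hypothetical base class whose constructor requires a config object.
class SketchBaseClient
  attr_reader :config

  def initialize(config)
    @config = config
  end
end

# Hypothetical app-level subclass that builds its single instance from settings.
class SketchAppClient < SketchBaseClient
  def self.instance
    settings = {
      url: 'https://example-dev.library.edu/api',
      email: 'example-email',
      password: 'example-password'
    }
    # Built on the first call, then reused on every later call.
    @instance ||= new(settings)
  end
end

first = SketchAppClient.instance
second = SketchAppClient.instance
puts first.equal?(second)
```

In this sketch only the subclass defines `instance`, because the constructor needs a config argument and a bare `new` would raise an `ArgumentError`.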
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,66 @@ | ||
| # frozen_string_literal: true | ||
|
|
||
| class HyacinthApi::Client | ||
| include HyacinthApi::Finders | ||
| include HyacinthApi::DigitalObjects | ||
|
|
||
| attr_reader :config | ||
|
|
||
| def initialize(config) | ||
| @config = config | ||
| @auth_token = Base64.strict_encode64("#{@config.email}:#{@config.password}") | ||
| end | ||
|
|
||
| def self.instance | ||
| @instance ||= new | ||
| end | ||
|
|
||
| # Core HTTP methods | ||
| def get(path, params = {}) | ||
| response = connection.get(path, params) | ||
| handle_response(response) | ||
| end | ||
|
|
||
| def post(path, data = {}) | ||
| response = connection.post(path, data.to_json) | ||
| handle_response(response) | ||
| end | ||
|
|
||
| def put(path, data = {}) | ||
| response = connection.put(path, data.to_json) | ||
| handle_response(response) | ||
| end | ||
|
|
||
| def delete(path) | ||
| response = connection.delete(path) | ||
| handle_response(response) | ||
| end | ||
|
|
||
| def connection | ||
| @connection ||= Faraday.new( | ||
| url: @config.url, | ||
| headers: headers, | ||
| request: { timeout: @config.timeout } | ||
| ) do |faraday| | ||
| faraday.adapter Faraday.default_adapter | ||
| faraday.use Faraday::Response::RaiseError | ||
| end | ||
| end | ||
|
|
||
| def headers | ||
| { | ||
| 'Accept' => 'application/json, text/plain', | ||
| 'Content-Type' => 'application/json', | ||
| 'Authorization' => "Basic #{@auth_token}" | ||
| } | ||
| end | ||
|
|
||
| def handle_response(response) | ||
| return {} if response.body.blank? | ||
|
|
||
| JSON.parse(response.body) | ||
| rescue JSON::ParserError => e | ||
| Rails.logger.error("Invalid JSON response: #{response.body}") | ||
| raise "Invalid JSON response: #{e.message}" | ||
| end | ||
| end |
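The header construction and response handling in this client can be tried out without Faraday or Rails. The sketch below makes a few assumptions: `basic_auth_headers` and `parse_json_body` are hypothetical helper names, the credentials are placeholders, and the nil/whitespace check is a plain-Ruby stand-in for ActiveSupport's `blank?`.

```ruby
require 'base64'
require 'json'

# Build the request headers for HTTP Basic auth.
# strict_encode64 produces a single line with no embedded newlines.
def basic_auth_headers(email, password)
  token = Base64.strict_encode64("#{email}:#{password}")
  {
    'Accept' => 'application/json, text/plain',
    'Content-Type' => 'application/json',
    'Authorization' => "Basic #{token}"
  }
end

# Return {} for an empty body, the parsed JSON otherwise,
# and re-raise parse failures with a clearer message.
def parse_json_body(body)
  return {} if body.nil? || body.strip.empty?

  JSON.parse(body)
rescue JSON::ParserError => e
  raise "Invalid JSON response: #{e.message}"
end

headers = basic_auth_headers('example-email', 'example-password')
puts headers['Authorization'].start_with?('Basic ')
puts parse_json_body('{"pid": "abc"}').inspect
```

Treating an empty body as `{}` means callers can always index into the result without a separate nil check.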
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,16 @@ | ||
| # frozen_string_literal: true | ||
|
|
||
| module HyacinthApi | ||
| class Configuration | ||
| DEFAULT_TIMEOUT = 60 | ||
|
|
||
| attr_reader :url, :email, :password, :timeout | ||
|
|
||
| def initialize(url:, email:, password:, timeout: DEFAULT_TIMEOUT) | ||
| @url = url | ||
| @email = email | ||
| @password = password | ||
| @timeout = timeout | ||
| end | ||
| end | ||
| end |
Later in this file, I proposed that `extract_id` return `nil` if an 001 isn't present. If this method returns `nil` here, I think it would be good to raise an exception (something like `FolioSync::Exceptions::Missing001`, which would extend `FolioSyncException`, and `FolioSyncException` extends `StandardError`, so it would be caught by your line 27 rescue block).
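The proposal in this comment can be sketched as follows. This is an illustration and not the project's actual code: the `FolioSyncSketch` namespace and the `filename_for` helper are hypothetical, and the sample record is invented. It shows why a dedicated exception that descends from `StandardError` is still caught by a generic `rescue StandardError` block.

```ruby
# Hypothetical namespace mirroring the hierarchy described in the comment.
module FolioSyncSketch
  class FolioSyncException < StandardError; end
  class Missing001Field < FolioSyncException; end
end

# Returns the 001 value, or nil when the record has no 001 field.
def extract_id(marc_record)
  field_001 = marc_record['fields']&.find { |f| f['001'] }
  field_001 ? field_001['001'] : nil
end

# Raises instead of falling back to a placeholder such as "unknown.mrc".
def filename_for(marc_record)
  id = extract_id(marc_record)
  raise FolioSyncSketch::Missing001Field, 'MARC record is missing required 001 field' if id.nil?

  "#{id}.mrc"
end

errors = []
record_without_001 = { 'fields' => [{ '245' => { 'subfields' => [{ 'a' => 'A title' }] } }] }

begin
  filename_for(record_without_001)
rescue StandardError => e
  # The custom exception lands here because it descends from StandardError.
  errors << "Failed to save MARC record unknown: #{e.message}"
end

puts errors.inspect
```

With this shape, the word "unknown" appears only in the error message and never becomes a file name on disk.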