-
Notifications
You must be signed in to change notification settings - Fork 10
Adiciona tarefa para migrar dados de Institution para RawOrganization por coleção #1284
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adiciona tarefa para migrar dados de Institution para RawOrganization por coleção #1284
Conversation
…sks.py Co-authored-by: robertatakenaka <505143+robertatakenaka@users.noreply.github.com>
Co-authored-by: robertatakenaka <505143+robertatakenaka@users.noreply.github.com>
Co-authored-by: robertatakenaka <505143+robertatakenaka@users.noreply.github.com>
Co-authored-by: robertatakenaka <505143+robertatakenaka@users.noreply.github.com>
Co-authored-by: robertatakenaka <505143+robertatakenaka@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Adiciona uma nova task Celery para executar, em lote e por lista de coleções (e opcionalmente por ISSN), a migração de dados de Institution para os campos raw_* (RawOrganization) nos históricos (PublisherHistory, OwnerHistory, SponsorHistory, CopyrightHolderHistory) associados aos journals.
Changes:
- Cria a task
task_migrate_institution_history_to_raw_institutioncom filtros por coleção e ISSN e retorno de estatísticas. - Implementa contadores por tipo de histórico e registra erros via
UnexpectedEvent. - Adiciona uma suíte de testes cobrindo migração, filtros e cenário sem instituição.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
journal/tasks.py |
Nova task Celery para migração em lote de institution → raw_* em históricos, com filtros e estatísticas. |
journal/tests.py |
Novos testes unitários para validar migração, filtros por coleção/ISSN e contadores. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Create mock self with request attribute for Celery task | ||
| mock_self = MagicMock() | ||
| mock_self.request = MagicMock() | ||
|
|
||
| # Run the task | ||
| result = task_migrate_institution_history_to_raw_institution( | ||
| mock_self, |
Copilot
AI
Feb 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests call the Celery bound task with an explicit positional self (mock_self). For bind=True tasks, Celery supplies the task instance automatically when the task is called; passing a positional argument here will be treated as username and then conflicts with username="testuser" (TypeError: multiple values for argument 'username'). Call the task without the extra positional argument (or use .run(...) / .apply(kwargs=...)) and avoid mocking request in a way that makes _get_user attempt User.objects.get(pk=<MagicMock>).
| # Create mock self with request attribute for Celery task | |
| mock_self = MagicMock() | |
| mock_self.request = MagicMock() | |
| # Run the task | |
| result = task_migrate_institution_history_to_raw_institution( | |
| mock_self, | |
| # Run the task | |
| result = task_migrate_institution_history_to_raw_institution.run( |
| # Create mock self with request attribute for Celery task | ||
| mock_self = MagicMock() | ||
| mock_self.request = MagicMock() | ||
|
|
||
| # Run task only for "scl" collection | ||
| result = task_migrate_institution_history_to_raw_institution( | ||
| mock_self, | ||
| username="testuser", | ||
| collection_acron_list=["scl"], | ||
| ) |
Copilot
AI
Feb 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issue as above: this bound Celery task should not receive an explicit positional self argument in the test. Passing mock_self will be interpreted as username and will conflict with the username= kwarg, causing the test to error before exercising the task logic.
| # Create mock self with request attribute for Celery task | ||
| mock_self = MagicMock() | ||
| mock_self.request = MagicMock() | ||
|
|
||
| # Run task only for specific ISSN | ||
| result = task_migrate_institution_history_to_raw_institution( | ||
| mock_self, | ||
| username="testuser", | ||
| collection_acron_list=["scl"], | ||
| journal_issns=["1234-5678"], | ||
| ) |
Copilot
AI
Feb 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issue as above: remove the positional mock_self argument when invoking the bound Celery task (or call .run / .apply). As written, this invocation will raise due to multiple values for username and won't test the ISSN filtering behavior.
| # Create mock self with request attribute for Celery task | ||
| mock_self = MagicMock() | ||
| mock_self.request = MagicMock() | ||
|
|
||
| # Run the task | ||
| result = task_migrate_institution_history_to_raw_institution( | ||
| mock_self, | ||
| username="testuser", | ||
| collection_acron_list=["scl"], | ||
| ) |
Copilot
AI
Feb 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issue as above: invoking a bind=True Celery task in tests should not pass an explicit positional self. This currently prevents the test from running and can also make _get_user try to query with a MagicMock pk depending on how request is mocked.
| if journal.publisher_history.filter(institution__isnull=False).exists(): | ||
| migrated_items = journal.migrate_publisher_history_to_raw() | ||
| migrated_publishers += len(migrated_items) | ||
| logger.info( | ||
| f"Migrated {len(migrated_items)} publisher history records " | ||
| f"for journal {journal.id}" | ||
| ) | ||
|
|
||
| # Check and migrate owner_history records with institution != None | ||
| if journal.owner_history.filter(institution__isnull=False).exists(): | ||
| migrated_items = journal.migrate_owner_history_to_raw() | ||
| migrated_owners += len(migrated_items) | ||
| logger.info( | ||
| f"Migrated {len(migrated_items)} owner history records " | ||
| f"for journal {journal.id}" | ||
| ) | ||
|
|
||
| # Check and migrate sponsor_history records with institution != None | ||
| if journal.sponsor_history.filter(institution__isnull=False).exists(): | ||
| migrated_items = journal.migrate_sponsor_history_to_raw() | ||
| migrated_sponsors += len(migrated_items) | ||
| logger.info( | ||
| f"Migrated {len(migrated_items)} sponsor history records " | ||
| f"for journal {journal.id}" | ||
| ) | ||
|
|
||
| # Check and migrate copyright_holder_history records with institution != None | ||
| if journal.copyright_holder_history.filter(institution__isnull=False).exists(): | ||
| migrated_items = journal.migrate_copyright_holder_history_to_raw() | ||
| migrated_copyright_holders += len(migrated_items) | ||
| logger.info( | ||
| f"Migrated {len(migrated_items)} copyright holder history records " |
Copilot
AI
Feb 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The counters here use len(migrated_items), but Journal.migrate_*_history_to_raw() appends all history items (including those where institution is already null and therefore nothing is migrated). This will inflate the "migrated_*" stats whenever a journal has a mix of null/non-null institutions. Prefer counting the queryset that actually needs migration (e.g., publisher_history.filter(institution__isnull=False).count()) and increment by that value (and you can use that count to decide whether to call the migration method, avoiding the extra .exists() query too).
| if journal.publisher_history.filter(institution__isnull=False).exists(): | |
| migrated_items = journal.migrate_publisher_history_to_raw() | |
| migrated_publishers += len(migrated_items) | |
| logger.info( | |
| f"Migrated {len(migrated_items)} publisher history records " | |
| f"for journal {journal.id}" | |
| ) | |
| # Check and migrate owner_history records with institution != None | |
| if journal.owner_history.filter(institution__isnull=False).exists(): | |
| migrated_items = journal.migrate_owner_history_to_raw() | |
| migrated_owners += len(migrated_items) | |
| logger.info( | |
| f"Migrated {len(migrated_items)} owner history records " | |
| f"for journal {journal.id}" | |
| ) | |
| # Check and migrate sponsor_history records with institution != None | |
| if journal.sponsor_history.filter(institution__isnull=False).exists(): | |
| migrated_items = journal.migrate_sponsor_history_to_raw() | |
| migrated_sponsors += len(migrated_items) | |
| logger.info( | |
| f"Migrated {len(migrated_items)} sponsor history records " | |
| f"for journal {journal.id}" | |
| ) | |
| # Check and migrate copyright_holder_history records with institution != None | |
| if journal.copyright_holder_history.filter(institution__isnull=False).exists(): | |
| migrated_items = journal.migrate_copyright_holder_history_to_raw() | |
| migrated_copyright_holders += len(migrated_items) | |
| logger.info( | |
| f"Migrated {len(migrated_items)} copyright holder history records " | |
| publisher_to_migrate = journal.publisher_history.filter( | |
| institution__isnull=False | |
| ).count() | |
| if publisher_to_migrate: | |
| journal.migrate_publisher_history_to_raw() | |
| migrated_publishers += publisher_to_migrate | |
| logger.info( | |
| f"Migrated {publisher_to_migrate} publisher history records " | |
| f"for journal {journal.id}" | |
| ) | |
| # Check and migrate owner_history records with institution != None | |
| owner_to_migrate = journal.owner_history.filter( | |
| institution__isnull=False | |
| ).count() | |
| if owner_to_migrate: | |
| journal.migrate_owner_history_to_raw() | |
| migrated_owners += owner_to_migrate | |
| logger.info( | |
| f"Migrated {owner_to_migrate} owner history records " | |
| f"for journal {journal.id}" | |
| ) | |
| # Check and migrate sponsor_history records with institution != None | |
| sponsor_to_migrate = journal.sponsor_history.filter( | |
| institution__isnull=False | |
| ).count() | |
| if sponsor_to_migrate: | |
| journal.migrate_sponsor_history_to_raw() | |
| migrated_sponsors += sponsor_to_migrate | |
| logger.info( | |
| f"Migrated {sponsor_to_migrate} sponsor history records " | |
| f"for journal {journal.id}" | |
| ) | |
| # Check and migrate copyright_holder_history records with institution != None | |
| copyright_holder_to_migrate = journal.copyright_holder_history.filter( | |
| institution__isnull=False | |
| ).count() | |
| if copyright_holder_to_migrate: | |
| journal.migrate_copyright_holder_history_to_raw() | |
| migrated_copyright_holders += copyright_holder_to_migrate | |
| logger.info( | |
| f"Migrated {copyright_holder_to_migrate} copyright holder history records " |
| - migrated_copyright_holders: Total copyright holder history records migrated | ||
| - error_count: Number of errors encountered | ||
| """ | ||
| user = _get_user(self.request, username=username, user_id=user_id) |
Copilot
AI
Feb 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Variable user is not used.
| user = _get_user(self.request, username=username, user_id=user_id) | |
| _get_user(self.request, username=username, user_id=user_id) |
O que esse PR faz?
Implementa
task_migrate_institution_history_to_raw_institutionque executa migração de dados de Institution para RawOrganization em PublisherHistory, OwnerHistory, SponsorHistory e CopyrightHolderHistory para uma lista de coleções.A tarefa:
collection_acron_liste opcionalmente porjournal_issnsinstitution__isnull=FalseOnde a revisão poderia começar?
journal/tasks.pylinha 638 - função principal da tarefajournal/tests.pylinha 687 - testes cobrindo todos os cenários (filtros, migração, contadores)Como este poderia ser testado manualmente?
institutionforam setados para None e camposraw_*populadosOu executar os testes:
Algum cenário de contexto que queira dar?
Os métodos de migração já existem no modelo Journal desde a implementação de RawOrganizationMixin. Esta tarefa permite executar a migração em lote para múltiplas coleções via Celery, seguindo o padrão de outras tarefas similares no módulo (ex:
task_export_journals_to_articlemeta,task_replace_institution_by_raw_institution).A implementação usa
.iterator()para otimizar uso de memória ao processar grandes volumes de journals.Screenshots
N/A - tarefa backend sem interface gráfica
Quais são tickets relevantes?
Relacionado à issue sobre migração de dados Institution → RawOrganization
Referências
journal/models.pylinhas 1313-1375journal/tasks.pyOriginal prompt
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.