75 commits
f0c47e0
fix typo
zouharvi Aug 2, 2024
d64659b
clarify annotator tokens
zouharvi Aug 5, 2024
2666cb2
Update INSTALL.md
snukky Aug 19, 2024
82e9eab
Update INSTALL.md
snukky Aug 19, 2024
715ade9
Update INSTALL.md
snukky Aug 19, 2024
b0f6c16
Merge pull request #172 from AppraiseDev/fix-line-endings
zouharvi Jan 4, 2025
d07015d
remove constraints from loader
zouharvi Jan 4, 2025
d6ff83c
rename hit to set
zouharvi Jan 4, 2025
1954d7b
implement character-level alignment
zouharvi Jan 4, 2025
d8d8968
make src-tgt character highlights more prominent
zouharvi Jan 7, 2025
a9f4221
add prompt dialog for style of src-tgt highlight (testing only)
zouharvi Jan 8, 2025
e8bc144
Merge pull request #195 from AppraiseDev/footer-contributors
snukky Feb 11, 2025
8268137
add tests for more or less than 100 items; add info about the number …
zouharvi Feb 17, 2025
d97c1b2
set src-tgt char alignment as discussed
zouharvi Feb 17, 2025
bdd3840
remove unused/duplicate files
zouharvi Feb 17, 2025
42f5f7e
Merge pull request #196 from AppraiseDev/remove-100-constraint
snukky Jul 11, 2025
3ec2335
Merge branch 'main' into develop
Jul 11, 2025
ca266bd
Merge with develop
Jul 11, 2025
4164be2
Merge pull request #197 from AppraiseDev/rename-hit
snukky Jul 11, 2025
25dd575
Merge branch 'develop' into src-tgt-visual-alignment
Jul 11, 2025
84d5636
Merge pull request #198 from AppraiseDev/src-tgt-visual-alignment
snukky Jul 11, 2025
9d0e074
Merge branch 'develop' into remove-unused
Jul 11, 2025
82fa0c7
Merge pull request #202 from AppraiseDev/remove-unused
snukky Jul 11, 2025
5a95f92
upgrade requirements
Jul 11, 2025
dfbba06
upgrade Python to 3.12
Jul 11, 2025
53e848c
reformat with newer version of black
Jul 11, 2025
1c04c0c
bump version to #wmt25dev
Jul 11, 2025
a10669c
wmt24 -> wmt25
Jul 12, 2025
3298c23
dummy change
Jul 12, 2025
e9234f2
don't escape image context, ref #185'
zouharvi Jul 20, 2025
88a0c37
develop->vilemz/develop merge
zouharvi Jul 20, 2025
a832a62
add <img support, fix document count computation, render newlines as …
zouharvi Jul 20, 2025
e5642db
Merge pull request #205 from AppraiseDev/vilemz/develop
zouharvi Jul 20, 2025
c446187
add Maasai language
zouharvi Jul 21, 2025
50c6e5f
minor styling & instructions
zouharvi Jul 22, 2025
c4fc682
increase the max length for campaign names
Jul 23, 2025
92492a2
update next document button
zouharvi Jul 24, 2025
54ea604
update ESA slider anchor instructions
zouharvi Jul 24, 2025
9021b50
fix vertical videos
zouharvi Jul 24, 2025
31e1a9a
show ESA instructions by default
zouharvi Jul 24, 2025
8268e30
update ESA slider anchors
zouharvi Jul 24, 2025
56f5b65
merge
zouharvi Jul 24, 2025
3a90994
add note on LLM usage, resolve #201
zouharvi Jul 24, 2025
e79e4e9
ESA styling, add language tags on the side, resolve #50
zouharvi Jul 25, 2025
35bcf1e
make ESA interface neater by moving icons to the side
zouharvi Jul 25, 2025
b258936
Merge pull request #206 from AppraiseDev/vilemz/develop
snukky Jul 25, 2025
3c75e97
create campaign status page for ESA
zouharvi Jul 26, 2025
55715c8
update campaign-status styling
zouharvi Jul 26, 2025
ed89e7e
speed-up campaign-status and next document fetch
zouharvi Jul 26, 2025
7e8b9fa
edit style of times
zouharvi Jul 26, 2025
11cd259
use TaskAgenda to get the task for a user; display 0/xxx even if no a…
Jul 26, 2025
3456e64
Merge pull request #208 from AppraiseDev/romang/esa-status-task-agendas
snukky Jul 26, 2025
1118cf2
Merge pull request #207 from AppraiseDev/vilemz/develop
snukky Jul 26, 2025
b14ecc9
make campaign-status accessible without login
zouharvi Jul 26, 2025
f375cae
Merge pull request #209 from AppraiseDev/vilemz/develop
snukky Jul 26, 2025
8b8ad4b
update ESA instructions; resolve #211
zouharvi Jul 31, 2025
2a3c910
fix token generation when no QC exists, resolve #210
zouharvi Jul 31, 2025
401cab1
add sort_key functionality back, support multi-campaign status view
zouharvi Jul 31, 2025
2802711
Merge pull request #212 from AppraiseDev/vilemz/develop
snukky Jul 31, 2025
aed8807
fix campaign-status to work on single campaign
zouharvi Jul 31, 2025
f0a5c26
Merge pull request #213 from AppraiseDev/vilemz/develop
snukky Jul 31, 2025
1c092d6
fix >100% in campaign-status; more precise ESA time computation
zouharvi Aug 4, 2025
7932c7e
add tooltip to campaign-status
zouharvi Aug 4, 2025
aabf1c3
Add action to download/bulk-download annotated data from the admin view.
seblemaguer Aug 5, 2025
d497047
Merge pull request #214 from AppraiseDev/vilemz/develop
snukky Aug 5, 2025
66dcaa8
Merge pull request #215 from AppraiseDev/zouhar/admin-download
snukky Aug 5, 2025
e30cdc9
fix tests for ESA
zouharvi Aug 5, 2025
42ded89
restructure github tests to fail on fail
zouharvi Aug 5, 2025
c27b69e
fix version
zouharvi Aug 5, 2025
a7ab4bc
Merge pull request #216 from AppraiseDev/zouhar/admin-download
snukky Aug 5, 2025
fe7970a
Merge pull request #217 from AppraiseDev/zouhar/fix-gh-actions
snukky Aug 5, 2025
2131d6f
attempt fix duplicate campaign-status
zouharvi Aug 6, 2025
5e71313
fix tests due to changing csv export
zouharvi Aug 6, 2025
b511c2b
Merge pull request #218 from AppraiseDev/zouhar/develop
snukky Aug 6, 2025
52cbfd3
remove typo
zouharvi Aug 7, 2025
2 changes: 1 addition & 1 deletion .github/workflows/main.yml
Expand Up @@ -7,7 +7,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
- python-version: [3.9]
+ python-version: [3.12]

steps:
- uses: actions/checkout@v4
Expand Down
12 changes: 11 additions & 1 deletion .github/workflows/tests.yml
Expand Up @@ -20,18 +20,28 @@ jobs:
pip freeze | tee pip_freeze.log
- name: Unit tests
run: python3 manage.py test -v2

- name: Regression tests
id: regression_tests
run: bash RegressionTests/run.sh
# Continue even if tests fail, so that we can collect test outputs for debugging
continue-on-error: true

- name: Collect outputs
# This step will run even if the regression tests failed
run: |
find . -type f \( -name "*.log" -o -name "*.out" -o -name "*.diff" \) -print | cut -c3- > listing.txt
echo "Creating an artifact with the following files:"
cat listing.txt
7z a -tzip regression-tests-appraise.zip @listing.txt

- name: Publish outputs
- uses: actions/upload-artifact@v3
+ uses: actions/upload-artifact@v4
with:
name: regression-tests-appraise
path: regression-tests-appraise.zip

# Enforce the failure
- name: Check on failures
if: steps.regression_tests.outcome == 'failure'
run: exit 1
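
The control flow of this workflow change (let the regression step fail without stopping the job, always collect and publish the outputs, then fail the job at the end) can be sketched in plain Python. This is only an illustrative analogue, not part of the PR; `run_pipeline`, `run_tests`, and `collect_outputs` are hypothetical names.

```python
from typing import Callable


def run_pipeline(
    run_tests: Callable[[], bool],
    collect_outputs: Callable[[], None],
) -> int:
    """Return a process-style exit code: 0 if tests passed, 1 otherwise."""
    # Analogue of `continue-on-error: true`: record the outcome, keep going.
    passed = run_tests()
    # Analogue of "Collect outputs" / "Publish outputs": runs in both cases.
    collect_outputs()
    # Analogue of "Check on failures": enforce the failure only at the end.
    return 0 if passed else 1
```

Deferring the exit code means the logs and diffs needed for debugging are still collected when the regression tests fail.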
7 changes: 5 additions & 2 deletions Appraise/settings.py
Expand Up @@ -9,6 +9,7 @@
For the full list of settings and their values, see
https://docs.djangoproject.com/en/1.11/ref/settings/
"""

import logging
import os
import warnings
Expand Down Expand Up @@ -37,7 +38,9 @@

ALLOWED_HOSTS = os.environ.get('APPRAISE_ALLOWED_HOSTS', '127.0.0.1').split(',')

- CSRF_TRUSTED_ORIGINS = os.environ.get('APPRAISE_CSRF_TRUSTED_ORIGINS', 'https://*.127.0.0.1').split(',')
+ CSRF_TRUSTED_ORIGINS = os.environ.get(
+     'APPRAISE_CSRF_TRUSTED_ORIGINS', 'https://*.127.0.0.1'
+ ).split(',')

WSGI_APPLICATION = os.environ.get(
'APPRAISE_WSGI_APPLICATION', 'Appraise.wsgi.application'
Expand Down Expand Up @@ -208,7 +211,7 @@

# Base context for all views.
BASE_CONTEXT = {
- 'commit_tag': '#wmt24dev',
+ 'commit_tag': '#wmt25dev',
'title': 'Appraise evaluation system',
'static_url': STATIC_URL,
}
Expand Down
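
The `CSRF_TRUSTED_ORIGINS` hunk in this file only reformats the expression; the value is still a comma-separated environment variable split into a list. A minimal sketch of that pattern follows, assuming a hypothetical helper `env_list` that does not exist in Appraise:

```python
import os


def env_list(name: str, default: str) -> list[str]:
    """Read a comma-separated environment variable as a list of strings.

    Hypothetical helper mirroring the `os.environ.get(...).split(',')`
    pattern used in settings.py.
    """
    return os.environ.get(name, default).split(',')
```

Because `split(',')` does not strip whitespace, a value such as `https://a.example.org, https://b.example.org` would keep the leading space on the second entry, so the variable is best written without spaces after the commas.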
5 changes: 3 additions & 2 deletions Appraise/urls.py
Expand Up @@ -3,6 +3,7 @@

See LICENSE for usage details
"""

# pylint: disable=unused-import,import-error
from django.conf.urls import handler404
from django.conf.urls import handler500
Expand Down Expand Up @@ -187,8 +188,8 @@
name='pairwise-assessment-document',
),
re_path(
- r'^campaign-status/(?P<campaign_name>[a-zA-Z0-9]+)/'
- r'(?P<sort_key>[0123456])?/?$',
+ r'^campaign-status/(?P<campaign_name>[a-zA-Z0-9]+(,[a-zA-Z0-9]+)*)/'
+ r'(?P<sort_key>[a-zA-Z0-9_])?/?$',
campaign_views.campaign_status,
name='campaign_status',
),
Expand Down
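
To see what the new `campaign-status` pattern accepts, it can be exercised with the standard `re` module outside Django. This is a sketch: the `parse` helper and the example paths are made up for illustration, and only the pattern itself comes from the diff.

```python
import re

# Pattern copied from the new re_path() entry in Appraise/urls.py.
CAMPAIGN_STATUS = re.compile(
    r'^campaign-status/(?P<campaign_name>[a-zA-Z0-9]+(,[a-zA-Z0-9]+)*)/'
    r'(?P<sort_key>[a-zA-Z0-9_])?/?$'
)


def parse(path):
    """Return (list of campaign names, sort_key) or None if no match."""
    match = CAMPAIGN_STATUS.match(path)
    if match is None:
        return None
    # Several campaigns can now be given as one comma-separated segment.
    return match.group('campaign_name').split(','), match.group('sort_key')
```

As written, `sort_key` is a character class without a quantifier, so it matches at most one character: a path such as `campaign-status/a/ab` does not match.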
7 changes: 4 additions & 3 deletions Appraise/utils.py
Expand Up @@ -3,6 +3,7 @@

See LICENSE for usage details
"""

import logging

from Appraise.settings import LOG_HANDLER
Expand Down Expand Up @@ -33,8 +34,8 @@ def _compute_user_total_annotation_time(timestamps):
def _clamp_time(seconds):
# if a segment takes longer than 10 minutes, set it to 5 minutes
# it's likely due to inactivity
- if seconds >= 10*60:
- return 5*60
+ if seconds >= 10 * 60:
+ return 5 * 60
else:
return seconds

Expand All @@ -54,4 +55,4 @@ def _clamp_time(seconds):
# Update the previous end timestamp
previous_end_timestamp = end_timestamp

return total_annotation_time
return total_annotation_time
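
The clamping rule in `_clamp_time` can be tried in isolation. The following is a standalone sketch of that rule only; the name `clamp_time` and the sample durations are illustrative, and the rest of `_compute_user_total_annotation_time` is not reproduced.

```python
def clamp_time(seconds: float) -> float:
    """Cap a per-segment duration that is likely due to inactivity.

    Durations of 10 minutes or more are replaced by 5 minutes;
    shorter durations are returned unchanged.
    """
    if seconds >= 10 * 60:
        return 5 * 60
    return seconds


# Three segments: 30 s, 15 min (treated as inactivity), 45 s.
durations = [30, 900, 45]
total = sum(clamp_time(s) for s in durations)
```

Note that the rule is not monotonic: a 599-second segment counts in full, while a 600-second segment counts as 300 seconds.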
1 change: 1 addition & 0 deletions Appraise/wsgi.py
Expand Up @@ -6,6 +6,7 @@
For more information on this file, see
https://docs.djangoproject.com/en/1.11/howto/deployment/wsgi/
"""

import os

from django.core.wsgi import get_wsgi_application
Expand Down
136 changes: 96 additions & 40 deletions Campaign/admin.py
@@ -1,6 +1,7 @@
"""
Campaign admin.py
"""

# pylint: disable=C0330,import-error
from django.contrib import admin
from django.contrib.admin.filters import AllValuesFieldListFilter
Expand All @@ -10,14 +11,18 @@
from Campaign.models import CampaignTeam
from Campaign.models import TrustedUser
from EvalData.admin import BaseMetadataAdmin

from django.http import HttpResponse
import csv
import zipfile
from io import StringIO
import importlib

class DropdownFilter(AllValuesFieldListFilter):
"""
Experimental dropdown filter.
"""

- template = 'Campaign/filter_select.html'
+ template = "Campaign/filter_select.html"


class CampaignTeamAdmin(BaseMetadataAdmin):
Expand All @@ -26,33 +31,33 @@ class CampaignTeamAdmin(BaseMetadataAdmin):
"""

list_display = [
- 'teamName',
- 'owner',
- 'teamMembers',
- 'requiredAnnotations',
- 'requiredHours',
- 'completionStatus',
+ "teamName",
+ "owner",
+ "teamMembers",
+ "requiredAnnotations",
+ "requiredHours",
+ "completionStatus",
] + BaseMetadataAdmin.list_display # type: ignore
- list_filter = ['owner'] + BaseMetadataAdmin.list_filter # type: ignore
+ list_filter = ["owner"] + BaseMetadataAdmin.list_filter # type: ignore
search_fields = [
- 'teamName',
- 'owner__username',
- 'owner__first_name',
- 'owner__last_name',
+ "teamName",
+ "owner__username",
+ "owner__first_name",
+ "owner__last_name",
] + BaseMetadataAdmin.search_fields # type: ignore

- filter_horizontal = ['members']
+ filter_horizontal = ["members"]

fieldsets = (
(
None,
{
- 'fields': (
- 'teamName',
- 'owner',
- 'members',
- 'requiredAnnotations',
- 'requiredHours',
+ "fields": (
+ "teamName",
+ "owner",
+ "members",
+ "requiredAnnotations",
+ "requiredHours",
)
},
),
Expand All @@ -65,22 +70,22 @@ class CampaignDataAdmin(BaseMetadataAdmin):
"""

list_display = [
- 'dataName',
- 'market',
- 'metadata',
- 'dataValid',
- 'dataReady',
+ "dataName",
+ "market",
+ "metadata",
+ "dataValid",
+ "dataReady",
] + BaseMetadataAdmin.list_display # type: ignore
list_filter = [
- 'dataValid',
- 'dataReady',
+ "dataValid",
+ "dataReady",
] + BaseMetadataAdmin.list_filter # type: ignore
search_fields = [
# nothing model specific
] + BaseMetadataAdmin.search_fields # type: ignore

fieldsets = (
- (None, {'fields': ('dataFile', 'market', 'metadata')}),
+ (None, {"fields": ("dataFile", "market", "metadata")}),
) + BaseMetadataAdmin.fieldsets # type: ignore


Expand All @@ -89,47 +94,98 @@ class CampaignAdmin(BaseMetadataAdmin):
Model admin for Campaign instances.
"""

- list_display = ['campaignName'] + BaseMetadataAdmin.list_display + ['id'] # type: ignore
+ list_display = ["campaignName"] + \
+ BaseMetadataAdmin.list_display + ["id"] # type: ignore
list_filter = [
# nothing model specific
] + BaseMetadataAdmin.list_filter # type: ignore
search_fields = [
# nothing model specific
] + BaseMetadataAdmin.search_fields # type: ignore

- filter_horizontal = ['batches']
+ filter_horizontal = ["batches"]

fieldsets = (
(
None,
{
- 'fields': (
- 'campaignName',
- 'packageFile',
- 'teams',
- 'batches',
- 'campaignOptions',
+ "fields": (
+ "campaignName",
+ "packageFile",
+ "teams",
+ "batches",
+ "campaignOptions",
)
},
),
) + BaseMetadataAdmin.fieldsets # type: ignore


+    actions = ["export_results"]
+
+    def _retrieve_csv(self, current_campaign):
+        # Get the task type corresponding to the campaign
+        qs_name = current_campaign.get_campaign_type().lower()
+        qs_attr = "evaldata_{0}_campaign".format(qs_name)
+        qs_obj = getattr(current_campaign, qs_attr, None)
+        cls = type(qs_obj.all()[0])
+        cls_name = cls.__name__
+        cls_name = cls_name.replace("Task", "Result")
+        module = importlib.import_module(cls.__module__)
+        cls = getattr(module, cls_name)
+
+        # Now get the content
+        f = StringIO()
+        writer = csv.writer(f)
+        csv_content = cls.get_system_data(current_campaign.id, extended_csv=True)
+        for r in csv_content:
+            writer.writerow(r)
+
+        f.seek(0)
+        return f
+
+
+    def export_results(self, request, queryset):
+        if len(queryset) == 1:
+            current_campaign = queryset[0]
+            csv_content = self._retrieve_csv(current_campaign)
+            filename = f"results_{current_campaign.campaignName}.csv"
+            response = HttpResponse(csv_content, content_type="text/csv")
+            response["Content-Disposition"] = f"attachment; filename={filename}"
+        else:
+            response = HttpResponse(content_type='application/zip')
+            response['Content-Disposition'] = 'attachment; filename="campaign_results.zip"'
+
+            # Create a zip file with selected objects
+            with zipfile.ZipFile(response, 'w') as zipf:
+                for current_campaign in queryset:
+
+                    csv_content = self._retrieve_csv(current_campaign)
+                    # Add objects to the zip file, customize as per your model's data
+                    # For example, you can add an object's name and description to a text file in the zip
+                    filename = f"results_{current_campaign.campaignName}.csv"
+                    zipf.writestr(filename, csv_content.getvalue())
+        return response
+
+    export_results.short_description = "Download results"



class TrustedUserAdmin(admin.ModelAdmin):
"""
Model admin for Campaign instances.
"""

- list_display = ['user', 'campaign']
+ list_display = ["user", "campaign"]
list_filter = [
- ('campaign__campaignName', DropdownFilter),
- # 'campaign'
+ ("campaign__campaignName", DropdownFilter),
+ # "campaign"
]
search_fields = [ # type: ignore
# nothing model specific
]

- fieldsets = ((None, {'fields': ('user', 'campaign')}),)
+ fieldsets = ((None, {"fields": ("user", "campaign")}),)


admin.site.register(CampaignTeam, CampaignTeamAdmin)
Expand Down
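
The new `export_results` admin action in this file writes one CSV per selected campaign and, when several campaigns are selected, bundles them into a ZIP archive. The core of that idea can be sketched without Django. This is an illustration under assumptions: `rows_to_csv_text` and `zip_csv_files` are hypothetical helpers, and the archive is built in an in-memory `BytesIO` buffer rather than written directly into an `HttpResponse` as the PR does.

```python
import csv
import io
import zipfile


def rows_to_csv_text(rows):
    """Serialise an iterable of rows to CSV text in memory."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def zip_csv_files(named_rows):
    """Build a ZIP archive (as bytes) holding one CSV per (filename, rows) entry."""
    archive = io.BytesIO()
    with zipfile.ZipFile(archive, "w") as zipf:
        for filename, rows in named_rows.items():
            # writestr() accepts str or bytes and stores it under `filename`.
            zipf.writestr(filename, rows_to_csv_text(rows))
    return archive.getvalue()
```

For example, `zip_csv_files({"results_demo.csv": [["system", "score"], ["A", 90]]})` returns the bytes of an archive containing a single CSV file.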
1 change: 1 addition & 0 deletions Campaign/management/commands/ComputeSystemScores.py
Expand Up @@ -10,6 +10,7 @@
from EvalData.models import DirectAssessmentResult
from EvalData.models import DirectAssessmentTask


# pylint: disable=C0111,C0330,E1101
class Command(BaseCommand):
help = 'Computes system scores over all results'
Expand Down
2 changes: 1 addition & 1 deletion Campaign/management/commands/ComputeWMT21Results.py
Expand Up @@ -463,7 +463,7 @@ def handle(self, *args, **options):
wins_for_system = defaultdict(list)
losses_for_system = defaultdict(list)
p_level = 0.05
- for (sysA, sysB) in combinations_with_replacement(system_ids, 2):
+ for sysA, sysB in combinations_with_replacement(system_ids, 2):
sysA_ids = set([x[0] for x in system_z_scores[sysA]])
sysB_ids = set([x[0] for x in system_z_scores[sysB]])
good_ids = set.intersection(sysA_ids, sysB_ids)
Expand Down
4 changes: 2 additions & 2 deletions Campaign/management/commands/ComputeZScores.py
Expand Up @@ -427,7 +427,7 @@ def handle(self, *args, **options):

wins_for_system = defaultdict(list)
p_level = 0.05
- for (sysA, sysB) in combinations_with_replacement(system_ids, 2):
+ for sysA, sysB in combinations_with_replacement(system_ids, 2):
sysA_ids = set([x[0] for x in system_z_scores[sysA]])
sysB_ids = set([x[0] for x in system_z_scores[sysB]])
good_ids = set.intersection(sysA_ids, sysB_ids)
Expand Down Expand Up @@ -577,7 +577,7 @@ def sort_by_wins_and_z_score(x, y):
key = system_id[:4].upper()
vsystems[key].extend(system_z_scores[system_id])

- for (sysA, sysB) in combinations_with_replacement(
+ for sysA, sysB in combinations_with_replacement(
['GOOG', 'CAND', 'PROD'], 2
):
sysA_scores = [x[1] for x in vsystems[sysA]]
Expand Down
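
The loop changes in these scripts only drop the redundant parentheses around the tuple target, as the newer black release formats it; the iteration itself is unchanged. A small sketch of what `combinations_with_replacement` yields for the three keys used above (the variable names here are illustrative):

```python
from itertools import combinations_with_replacement

system_ids = ['GOOG', 'CAND', 'PROD']

# Unparenthesised tuple target, as in the reformatted code.
pairs = [(a, b) for a, b in combinations_with_replacement(system_ids, 2)]

# Parenthesised target, as in the old code; the result is identical.
pairs_old_style = [(a, b) for (a, b) in combinations_with_replacement(system_ids, 2)]
```

The result contains six unordered pairs, including the self-pairs such as `('GOOG', 'GOOG')`.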
1 change: 1 addition & 0 deletions Campaign/management/commands/InitCampaignMMT18Task1.py
Expand Up @@ -31,6 +31,7 @@
}
REDUNDANCY = 1


# pylint: disable=C0111,C0330,E1101
class Command(BaseCommand):
help = 'Initialises campaign MMT18 Task #1'
Expand Down
1 change: 1 addition & 0 deletions Campaign/management/commands/InitCampaignMMT18Task1b.py
Expand Up @@ -27,6 +27,7 @@
}
REDUNDANCY = 1


# pylint: disable=C0111,C0330,E1101
class Command(BaseCommand):
help = 'Initialises campaign MMT18 Task #1.b'
Expand Down