Update CSV import to better handle complex models, add download template functionality #11918
base: dev/8.1.x
return terms

def transform_value_for_tile(self, value, **kwargs):
I've actually been working on this in a branch off dev/7.6.x to enable more flexibility of acceptable value formats (e.g. a string pointing to a resourceinstance.legacyid, etc.)
def transform_value_for_tile(self, value, **kwargs):
    # Assumes uuid, json, ast, logger, Query, Bool, Terms and RESOURCES_INDEX
    # are already available at module level.
    #
    # kwargs config looks like this:
    # {
    #     "graphs": [
    #         {
    #             "name": "Person or Group",
    #             "graphid": "ccbd1537-ac5e-11e6-84a5-026d961c88e6",
    #             "relationshipConcept": "6f26aa04-52af-4b17-a656-674c547ade2a",
    #             "relationshipCollection": "00000000-0000-0000-0000-000000000005",
    #             "useOntologyRelationship": False,
    #             "inverseRelationshipConcept": "6f26aa04-52af-4b17-a656-674c547ade2a"
    #         }
    #     ],
    #     "searchDsl": "",
    #     "searchString": ""
    # }
    from arches.app.search.search_engine_factory import SearchEngineFactory

    # Map each relatable graph to its default relationship properties.
    relatable_graphs = kwargs.get("graphs", [])
    default_values_lookup = dict()
    for graph in relatable_graphs:
        if graph.get("useOntologyRelationship", False) or not graph.get(
            "relationshipConcept", None
        ):
            default_values_lookup[graph["graphid"]] = {
                "ontologyProperty": "",
                "inverseOntologyProperty": "",
            }
        else:
            default_values_lookup[graph["graphid"]] = {
                "ontologyProperty": graph["relationshipConcept"],
                "inverseOntologyProperty": graph["inverseRelationshipConcept"],
            }

    def build_resource_instance_object(hit):
        return {
            "resourceId": hit["_id"],
            "ontologyProperty": (
                default_values_lookup[hit["_source"]["graph_id"]][
                    "ontologyProperty"
                ]
            ),
            "inverseOntologyProperty": (
                default_values_lookup[hit["_source"]["graph_id"]][
                    "inverseOntologyProperty"
                ]
            ),
            "resourceXresourceId": str(uuid.uuid4()),
        }

    subtypes_dict = {
        "uuid": uuid.UUID,
        "dict": dict,
        "str": str,
        "int": int,
        "float": float,
    }

    if isinstance(value, str):
        # Try each parser in turn; the first one that succeeds wins.
        for test_method in [uuid.UUID, json.loads, ast.literal_eval]:
            try:
                converted_value = test_method(value)
                break
            except Exception:
                converted_value = False
        if converted_value is False and value != "":
            converted_value = value.split(",")  # is a string, likely legacyid
            converted_value = [val.strip() for val in converted_value if val]
            try:
                converted_value = [uuid.UUID(val) for val in converted_value]
            except Exception:
                pass  # not UUIDs; keep the plain strings
        elif converted_value is False:
            logger.warning("ResourceInstanceDataType: value is empty")
            return []
    else:
        converted_value = value

    # Normalize to a list and detect the subtype from the first element.
    value_type = None
    if not isinstance(converted_value, list):
        converted_value = [converted_value]
    for value_subtype_label, value_subtype_class in list(subtypes_dict.items()):
        if isinstance(converted_value[0], value_subtype_class):
            value_type = value_subtype_label
            break

    se = SearchEngineFactory().create()
    query = Query(se)
    query.include("graph_id")
    boolquery = Bool()
    transformed_value = []

    match value_type:
        case "uuid":
            results = query.search(
                index=RESOURCES_INDEX, id=[str(val) for val in converted_value]
            )
            for hit in results["docs"]:
                transformed_value.append(build_resource_instance_object(hit))
        case "dict":  # assume data correctly parsed via ast.literal
            for val in converted_value:
                try:
                    uuid.UUID(val["resourceId"])
                except Exception:
                    continue  # skip entries without a valid resourceId
                transformed_value.append(val)
        case _:  # default case (handles str/legacyid and any other types)
            if value_type != "str":
                converted_value = [str(val) for val in converted_value]
            boolquery.must(
                Terms(field="legacyid.keyword", terms=converted_value)
            )  # exact match on keyword
            query.add_query(boolquery)
            results = query.search(index=RESOURCES_INDEX)
            logger.debug(f"{len(results['hits']['hits'])} hits")
            for hit in results["hits"]["hits"]:
                transformed_value.append(build_resource_instance_object(hit))

    if len(transformed_value) == 0:
        logger.warning(
            f"ResourceInstanceDataType: no resources found for {converted_value}"
        )
        return
    return transformed_value
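The string-handling step of the snippet above can be tried in isolation, without a search index. The following is only a minimal sketch of that try-in-order idea: `coerce_value` is a hypothetical helper name, not part of Arches, and unlike the snippet it does not treat a parsed `False` as a failed parse.

```python
import ast
import json
import uuid


def coerce_value(value):
    """Hypothetical helper: turn an incoming value into a list of parsed items."""
    if not isinstance(value, str):
        return value if isinstance(value, list) else [value]
    if value.strip() == "":
        return []
    # Try the strictest parser first: a single UUID, then JSON, then a Python literal.
    for parser in (uuid.UUID, json.loads, ast.literal_eval):
        try:
            parsed = parser(value)
        except (ValueError, SyntaxError, TypeError):
            continue
        return parsed if isinstance(parsed, list) else [parsed]
    # Fallback: a comma-separated list of UUIDs or plain identifiers (e.g. legacyids).
    parts = [part.strip() for part in value.split(",") if part.strip()]
    try:
        return [uuid.UUID(part) for part in parts]
    except ValueError:
        return parts


print(coerce_value("LEG-001, LEG-002"))
print(coerce_value("{'resourceId': 'ccbd1537-ac5e-11e6-84a5-026d961c88e6'}"))
```

The order matters: a comma-separated identifier string fails all three parsers and only then falls through to the split, so well-formed JSON or dict input is never mistaken for a list of legacyids.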
chiatt left a comment
Because this adds new functionality it needs to target version 8. Can you change the target to 8.0.x?
* Allow date_datatype.transform_value_for_tile to process datetime objects, re archesproject#11878
* Add keyboard inputs for moving overlays
Types of changes
Description of Change
Upgrades the CSV import so the script can load complex files. This change correctly imports parent/child relations, handles None values and empty nodes, improves concept searching, fixes the case where names containing "," would break the import, fixes the use of an existing identifier when one exists, and fixes tile ordering when multiple tiles share the same nodegroup.
Adds the ability to download a CSV template for each model. The template contains one column for each node.
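As a rough illustration of what a one-column-per-node template could look like, here is a sketch using only the standard library. The function name `build_csv_template` and the node names are hypothetical and are not taken from this PR. Writing the header through `csv.writer` also quotes names that contain ",", which is the kind of input the description says used to break the import.

```python
import csv
import io


def build_csv_template(node_names):
    """Hypothetical helper: return CSV text with a header row, one column per node."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(node_names)  # names containing commas are quoted automatically
    return buffer.getvalue()


template = build_csv_template(["Name", "Place, Region", "Date"])
print(template)

# Reading the template back recovers the original column names unchanged.
header = next(csv.reader(io.StringIO(template)))
```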
Issues Solved
Closes #
Checklist