python - Grouping Django objects using the QuerySet API by shared text tags, using two models with a ForeignKey relationship
Hi, I am trying to write a Django web app that sits on top of a legacy database. I cannot control the existing model fields, but I can achieve the functionality I need by adding new models.
I have many documents to which users have given comma-separated tags. I want to group "related" documents based on shared tags.
    from django.db import models


    # Model expressing the legacy table
    class Document(models.Model):
        id = models.BigIntegerField(primary_key=True)
        metadata_id = models.CharField(max_length=384)
        tags_as_csv = models.TextField()


    # New model I created; tag_text is extracted from tags_as_csv
    class TagDB(models.Model):
        tagid = models.BigIntegerField(primary_key=True)
        # on_delete is required from Django 2.0 on; CASCADE was the implicit default before
        referencing_document = models.ForeignKey(Document, on_delete=models.CASCADE)
        tag_text = models.TextField(blank=True)
So a Document would contain:

    Document: id = 1, metadata_id = "a1ee3df3600c6f77a6e851781f7e70c6", tags_as_csv = "raw-data , high temperature , important"

The TagDB table would then have entries such as:

    id | referencing_document | tag_text
    1  | 1                    | "raw-data"
    2  | 1                    | "high temperature"
    3  | 1                    | "important"
    4  | 2                    | "important"
    5  | 2                    | "processed-data"
    6  | 3                    | "important"
    7  | 4                    | "processed-data"
Now I want to extract all Document objects that match any of the tags of a given parent document. I am doing this using the following get_queryset method.
    def get_queryset(self, **kwargs):
        parent_document = Document.objects.get(id=self.kwargs['slug'])
        tags_in_parent_document = [
            x.tag_text
            for x in TagDB.objects.filter(referencing_document=parent_document.id)
        ]

        # Will contain the ids of all documents that match any of the tags
        queryset_with_duplicates = []
        for tag in tags_in_parent_document:
            queryset_with_duplicates.extend(
                [x.referencing_document.id
                 for x in TagDB.objects.filter(tag_text__icontains=tag)]
            )

        # Make sure we have unique ids
        queryset_unique = set(queryset_with_duplicates)

        # Fetch the Document objects
        queryset = Document.objects.filter(id__in=queryset_unique)
        return queryset
My question: is there a better way? Can I somehow get all documents that contain any of the tags in the parent document, and filter out duplicates (since multiple documents can contain the same tag)?
You'd be better off creating 2 additional models: one for the tag, and one for the link between a tag and a document. If that is unacceptable for some reason, you can use something like:
    Document.objects.filter(tagdb__tag_text__in=doc.tags_as_csv.split(' , ')).distinct()
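To make the join and the role of .distinct() concrete, here is a minimal sketch that uses Python's built-in sqlite3 module instead of the Django ORM. The SQL is only roughly what the ORM would generate for the filter above. The table names document and tagdb and the helper related_document_ids are assumptions for illustration (Django would normally prefix table names with the app label). The rows are the sample TagDB entries from the question.

```python
import sqlite3

# Hypothetical, simplified schema: only the columns the query needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE document (id INTEGER PRIMARY KEY);
    CREATE TABLE tagdb (
        tagid INTEGER PRIMARY KEY,
        referencing_document_id INTEGER REFERENCES document(id),
        tag_text TEXT
    );
""")

# Sample rows from the question: (tagid, referencing_document_id, tag_text).
tag_rows = [
    (1, 1, "raw-data"), (2, 1, "high temperature"), (3, 1, "important"),
    (4, 2, "important"), (5, 2, "processed-data"),
    (6, 3, "important"), (7, 4, "processed-data"),
]
conn.executemany("INSERT INTO document (id) VALUES (?)", [(i,) for i in (1, 2, 3, 4)])
conn.executemany("INSERT INTO tagdb VALUES (?, ?, ?)", tag_rows)


def related_document_ids(conn, parent_tags, distinct=True):
    """Return ids of documents that carry at least one of parent_tags."""
    if not parent_tags:
        return []
    placeholders = ", ".join("?" for _ in parent_tags)
    sql = (
        f"SELECT {'DISTINCT ' if distinct else ''}d.id "
        "FROM document d "
        "JOIN tagdb t ON t.referencing_document_id = d.id "
        f"WHERE t.tag_text IN ({placeholders}) "
        "ORDER BY d.id"
    )
    return [row[0] for row in conn.execute(sql, list(parent_tags))]


parent_tags = ["raw-data", "high temperature", "important"]

# Without DISTINCT, a document appears once per matching tag row.
with_duplicates = related_document_ids(conn, parent_tags, distinct=False)
# With DISTINCT, each matching document appears once.
unique = related_document_ids(conn, parent_tags)
```

Two things are worth noting. The parent document (id 1) is itself part of the result; in the ORM it could be dropped by chaining .exclude(id=doc.id). Also, __in matches tags exactly, whereas the icontains lookup in the question matches substrings case-insensitively, so the two approaches can return different documents.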
Plus, add model methods for getting and setting the tags, to ease a possible refactoring later.
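As a rough sketch of what such getter/setter logic could look like, the two plain functions below parse and rebuild the comma-separated string. The names parse_tags and format_tags are hypothetical, and the choice to write ", " as the separator is an assumption; methods on Document (say, get_tags and set_tags, also hypothetical names) could simply delegate to them.

```python
def parse_tags(tags_as_csv):
    """Split a comma-separated tag string into a list of clean, unique tags.

    Whitespace around each tag is stripped, empty entries are dropped,
    and the original order is preserved.
    """
    tags = []
    for raw in tags_as_csv.split(","):
        tag = raw.strip()
        if tag and tag not in tags:
            tags.append(tag)
    return tags


def format_tags(tags):
    """Join tags into a comma-separated string (assumed separator: ', ')."""
    cleaned = []
    for tag in tags:
        tag = tag.strip()
        if tag and tag not in cleaned:
            cleaned.append(tag)
    return ", ".join(cleaned)


example = parse_tags("raw-data , high temperature , important")
round_trip = format_tags(example)
```

Because parse_tags strips whitespace, the result could be passed to the tagdb__tag_text__in lookup without depending on the exact spacing around the commas. Tags that themselves contain a comma cannot be represented in this format.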