python - Grouping Django objects using QuerySet API by shared text tags using two models with ForeignKey relationship

- May 15, 2012

Hi, I am trying to write a Django web app that sits on top of a legacy database, so I cannot control the model fields and have to achieve the functionality using new models.

I have many documents to which users have given comma-separated tags. I want to group "related" documents based on shared tags.

    from django.db import models

    # Model to express the legacy table
    class Document(models.Model):
        id = models.BigIntegerField(primary_key=True)
        metadata_id = models.CharField(max_length=384)
        tags_as_csv = models.TextField()

    # New model, created with tag_text extracted from tags_as_csv
    class TagDB(models.Model):
        tagid = models.BigIntegerField(primary_key=True)
        # Django 2.0+ also requires an explicit on_delete argument here
        referencing_document = models.ForeignKey(Document)
        tag_text = models.TextField(blank=True)

So a document can contain:

    Document:
        id = 1
        metadata_id = "a1ee3df3600c6f77a6e851781f7e70c6"
        tags_as_csv = "raw-data , high temperature , important"
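Because the tags live in a single comma-separated string, any grouping starts by splitting that string into clean tag values. A minimal sketch, assuming tags never contain commas themselves; `parse_tags` is a hypothetical helper name, not something from the legacy schema:

```python
def parse_tags(tags_as_csv):
    """Split a comma-separated tag string into stripped, non-empty tags."""
    return [tag.strip() for tag in tags_as_csv.split(",") if tag.strip()]


print(parse_tags("raw-data , high temperature , important"))
# ['raw-data', 'high temperature', 'important']
```

Stripping each piece makes the result independent of how much whitespace surrounds the commas.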

The TagDB table would then have entries such as:

    id , referencing_document , tag_text
    1  , 1 , "raw-data"
    2  , 1 , "high temperature"
    3  , 1 , "important"
    4  , 2 , "important"
    5  , 2 , "processed-data"
    6  , 3 , "important"
    7  , 4 , "processed-data"
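To make the grouping goal concrete, here is a plain-Python sketch (no Django involved) that treats the sample rows as tuples and finds every document sharing at least one tag with a given parent. The function name `related_documents` is hypothetical, and including the parent itself in the result is an assumption about the desired behaviour:

```python
from collections import defaultdict

# (id, referencing_document, tag_text) rows copied from the sample entries
ROWS = [
    (1, 1, "raw-data"),
    (2, 1, "high temperature"),
    (3, 1, "important"),
    (4, 2, "important"),
    (5, 2, "processed-data"),
    (6, 3, "important"),
    (7, 4, "processed-data"),
]


def related_documents(rows, parent_id):
    """Return the ids of documents sharing at least one tag with parent_id."""
    docs_by_tag = defaultdict(set)  # tag text -> ids of documents carrying it
    tags_by_doc = defaultdict(set)  # document id -> its tag texts
    for _row_id, doc_id, tag in rows:
        docs_by_tag[tag].add(doc_id)
        tags_by_doc[doc_id].add(tag)

    related = set()
    for tag in tags_by_doc[parent_id]:
        related |= docs_by_tag[tag]  # set union drops duplicate ids
    return related


print(sorted(related_documents(ROWS, 1)))
# [1, 2, 3]
```

Document 1 is linked to documents 2 and 3 only through "important"; document 4 shares nothing with it and is left out.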

Now I want to extract all the Document objects that match any of the tags of the corresponding parent document. I am doing this using the following get_queryset method.

    def get_queryset(self, **kwargs):
        parent_document = Document.objects.get(id=self.kwargs['slug'])
        tags_in_parent_document = [x.tag_text for x in TagDB.objects.filter(referencing_document=parent_document.id)]

        # This will contain all the document ids that match any of the tags
        queryset_with_duplicates = []
        for tag in tags_in_parent_document:
            queryset_with_duplicates.extend([x.referencing_document.id for x in TagDB.objects.filter(tag_text__icontains=tag)])

        # Make sure we have unique ids
        queryset_unique = set(queryset_with_duplicates)

        # Fetch the Document objects
        queryset = Document.objects.filter(id__in=queryset_unique)
        return queryset

My question: is there a better way to do this? Can I somehow get all the documents that contain any of the tags in the parent document, and filter out the duplicates (since multiple documents can contain the same tag)?

You'd better create two additional models: one for the tag, and one for the link between tag and document. If that is unacceptable for some reason, you can use something like:

    Document.objects.filter(tagdb__tag_text__in=doc.tags_as_csv.split(' , ')).distinct()
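Filtering across the reverse ForeignKey turns into a SQL join, which is why the `.distinct()` call matters: a document that matches several tags would otherwise come back once per matching tag row. The sketch below reproduces the idea with the standard-library `sqlite3` module rather than Django. The table and column names (`document`, `tagdb`, `referencing_document_id`) and the helper `related_ids` are assumptions for illustration, not the real legacy schema or the exact SQL Django emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE document (id INTEGER PRIMARY KEY);
    CREATE TABLE tagdb (
        tagid INTEGER PRIMARY KEY,
        referencing_document_id INTEGER REFERENCES document(id),
        tag_text TEXT
    );
""")
conn.executemany("INSERT INTO document VALUES (?)", [(1,), (2,), (3,), (4,)])
conn.executemany("INSERT INTO tagdb VALUES (?, ?, ?)", [
    (1, 1, "raw-data"),
    (2, 1, "high temperature"),
    (3, 1, "important"),
    (4, 2, "important"),
    (5, 2, "processed-data"),
    (6, 3, "important"),
    (7, 4, "processed-data"),
])


def related_ids(conn, tags, distinct=True):
    """Return ids of documents having at least one tag in `tags`."""
    if not tags:
        return []
    placeholders = ", ".join("?" for _ in tags)
    sql = (
        "SELECT {distinct} d.id FROM document d "
        "JOIN tagdb t ON t.referencing_document_id = d.id "
        "WHERE t.tag_text IN ({placeholders}) ORDER BY d.id"
    ).format(distinct="DISTINCT" if distinct else "", placeholders=placeholders)
    return [row[0] for row in conn.execute(sql, list(tags))]


parent_tags = ["raw-data", "high temperature", "important"]
print(related_ids(conn, parent_tags, distinct=False))  # [1, 1, 1, 2, 3]
print(related_ids(conn, parent_tags))                  # [1, 2, 3]
```

Note that `IN` matches tags exactly, whereas the `icontains` loop in the question also matches substrings, so the two approaches can return different documents when one tag is contained in another.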

Plus, add a model method for getting/setting tags, to ease possible refactoring.
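One way to read that last suggestion: keep all parsing and serialising of `tags_as_csv` behind a pair of methods, so callers never touch the raw string and a later move to a separate tag model only changes those methods. A minimal sketch as a plain-Python mixin; the names `TagsMixin`, `get_tags`, `set_tags` and `FakeDocument` are hypothetical, and the `" , "` separator is assumed from the sample data. In a real project the two methods would presumably live on the Document model; the stand-in class exists only so the example runs without Django.

```python
class TagsMixin:
    """Tag accessors for any object with a `tags_as_csv` string attribute."""

    TAG_SEPARATOR = " , "  # assumed from the sample data

    def get_tags(self):
        """Return the tags as a list of stripped, non-empty strings."""
        raw = self.tags_as_csv or ""
        return [tag.strip() for tag in raw.split(",") if tag.strip()]

    def set_tags(self, tags):
        """Write tags back to the CSV field, keeping order, dropping duplicates."""
        unique = []
        for tag in tags:
            tag = tag.strip()
            if tag and tag not in unique:
                unique.append(tag)
        self.tags_as_csv = self.TAG_SEPARATOR.join(unique)


class FakeDocument(TagsMixin):
    """Stand-in for the Django model, used only to exercise the mixin."""

    def __init__(self, tags_as_csv=""):
        self.tags_as_csv = tags_as_csv


doc = FakeDocument("raw-data , high temperature , important")
doc.set_tags(doc.get_tags() + ["processed-data", "important"])
print(doc.tags_as_csv)
# raw-data , high temperature , important , processed-data
```

This sketch does not keep the TagDB rows in sync with the CSV field; that step would have to be added wherever `set_tags` is followed by a save.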
