python - The most effective way to assign unique integer id to a string? -


the program write processes large number of objects, each own unique id, string of complicated structure (dozen of unique fields of object joined separator) , big length.

since have process lot of these objects fast , need reffer them id while processing , have no power change format (i retrieve them externally, network), want map complicated string id own internal integer id , further use comparison, transfering them further other processes, etc.

what i'm going use simple dict keys string id of object , integer values internal integer id of it.

my question is: there better way in python this? may there way calculate hash manually, whatever? may dict not best solution?

as numbers: there 100k of such unique objects in system @ time, integer capacity more enough.

for comparison purposes, can intern strings , compare them is instead of ==, simple pointer comparison , should fast (or faster than) comparing 2 integers:

>>> 'foo' * 100 'foo' * 100 false >>> intern('foo' * 100) intern('foo' * 100) true 

intern guarantees id(intern(a)) == id(intern(b)) iff a == b. sure intern string input. note intern called sys.intern in python 3.x.

but when have pass these strings other processes, dict solution seems best. in such situations is

str_to_id = {} s in strings:     str_to_id.setdefault(s, len(str_to_id)) 

so integer capacity more enough

python integers bigints, should never problem.


Comments

Popular posts from this blog

jasper reports - Fixed header in Excel using JasperReports -

media player - Android: mediaplayer went away with unhandled events -

python - ('The SQL contains 0 parameter markers, but 50 parameters were supplied', 'HY000') or TypeError: 'tuple' object is not callable -