Encoding that minimizes misreading / mistyping / misspeaking?


Let's say you have a system in which a long key value can be accurately communicated to a user on-screen, via email or via paper; the user then needs to be able to communicate the key back accurately by reading it over the phone, or by reading it and typing it into some other interface.

What is a "good" way to encode the key to make reading / hearing / typing easy and accurate?

This could be an invoice number, a document ID, a transaction ID or some other abstract value. Let's say, for the sake of discussion, that the underlying key value is a big number, say 40 digits in base 10.

Some thoughts:

Shorter keys are generally better

  • a 40-digit base 10 value may not fit in the space given, and it is easy to get lost in the middle of it
  • the same value could be represented in base 16 in 33-34 digits
  • the same value could be represented in base 36 in 26 digits
  • the same value could be represented in base 64 in 22-23 digits
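
The digit counts above can be checked with plain integer arithmetic. This is only a sketch: `digits_needed` is a hypothetical helper name, and the smallest and largest 40-digit decimal values are used to show the range for each base.

```python
def digits_needed(value: int, base: int) -> int:
    """Return how many digits are needed to write `value` in `base`."""
    if value < 0 or base < 2:
        raise ValueError("value must be >= 0 and base must be >= 2")
    count = 1
    # Each integer division by the base strips one digit off the value.
    while value >= base:
        value //= base
        count += 1
    return count


smallest = 10**39       # smallest 40-digit decimal value
largest = 10**40 - 1    # largest 40-digit decimal value

for base in (16, 36, 64):
    print(base, digits_needed(smallest, base), digits_needed(largest, base))
```

The saving from base 36 to base 64 is only a few characters, which is worth weighing against the harder-to-read symbols a larger alphabet needs.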

Characters that can't be visually confused with each other are better

  • e.g. an encoding that includes both O (oh) and 0 (zero), or S (ess) and 5 (five), would be bad
  • this issue depends on the font / face used to display the key, which you may be able to control in some cases (like printing on paper) but can't control in others (like web pages and email).
  • it also depends on whether you can control the exclusive use of upper and / or lower case: e.g. capital D (dee) may look like O (oh) but lower case d (dee) would not, while lower case l (ell) looks like 1 (one) but capital L (ell) would not. (With exceptions for especially exotic fonts / faces.)

Characters that can't be verbally / aurally confused with each other are better

  • A (ay) and 8 (eight)
  • B (bee), C (cee), D (dee), E (ee), G (gee), P (pee), T (tee), V (vee), Z (zee) and 3 (three)
  • this issue depends on the audio quality of the end-to-end channel; it is a bigger challenge if the expected user base could have a speech impediment, or may have to speak through a gas mask, or if the communication channel could include CB radios or choppy VoIP phone systems.
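
To see what these two constraints cost, here is a rough sketch that filters an upper-case alphabet so that no two kept characters share a confusable group. The groups are taken from the examples in this question and are an assumption, not a complete survey; `pick_alphabet` is a hypothetical helper name.

```python
# Confusable groups, upper case only. These come from the examples in the
# question and are NOT an exhaustive or authoritative list.
CONFUSABLE_GROUPS = (
    "0OD",          # visual: zero / oh / capital dee
    "5S",           # visual: five / ess
    "1I",           # visual: one / capital aye
    "A8",           # aural: ay / eight
    "BCDEGPTVZ3",   # aural: the "ee" sounds
)


def pick_alphabet(candidates: str, groups: tuple) -> str:
    """Greedily keep characters that clash with nothing kept so far."""
    kept = []
    for char in candidates:
        clashes = any(
            char in group and other in group
            for group in groups
            for other in kept
        )
        if not clashes:
            kept.append(char)
    return "".join(kept)


# Digits come first in the candidate list, so they win over letters.
SAFE_ALPHABET = pick_alphabet(
    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ", CONFUSABLE_GROUPS
)
print(SAFE_ALPHABET, len(SAFE_ALPHABET))
```

Applying both the visual and the aural groups leaves 23 of the 36 characters, so strict filtering pushes the base down and the key length up. In practice you would probably relax the aural groups if a spelling alphabet is used on the phone.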

Adding a check digit or two would detect errors, but would not resolve them.
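
As an illustration of the detect-but-not-resolve point, here is a minimal sketch using the Luhn algorithm on a decimal key. Luhn is just one well-known check-digit scheme chosen for the example, not something the question prescribes; the function names are hypothetical.

```python
def luhn_check_digit(payload: str) -> str:
    """Compute the Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk right to left; the rightmost payload digit is doubled first.
    for position, char in enumerate(reversed(payload)):
        digit = int(char)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return str((10 - total % 10) % 10)


def is_valid(key: str) -> bool:
    """Return True if the last digit of `key` is the correct check digit."""
    return luhn_check_digit(key[:-1]) == key[-1]


key = "7992739871" + luhn_check_digit("7992739871")
print(key, is_valid(key))            # the intact key passes
print(is_valid("79927398813"))       # one mistyped digit is caught
```

A failed check only says that something is wrong somewhere in the key; the user still has to re-read or re-type the whole thing.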

An alpha - bravo - charlie - delta type dialog can help with hearing errors, but not with reading errors.
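
A support screen could generate that dialog for the operator or the user. The sketch below expands a key into spelling-alphabet words; the letter words follow the familiar alpha - bravo - charlie alphabet, the plain English digit words are an assumption, and `spell_out` is a hypothetical helper name.

```python
LETTER_WORDS = (
    "Alpha Bravo Charlie Delta Echo Foxtrot Golf Hotel India Juliett "
    "Kilo Lima Mike November Oscar Papa Quebec Romeo Sierra Tango "
    "Uniform Victor Whiskey X-ray Yankee Zulu"
).split()
DIGIT_WORDS = "Zero One Two Three Four Five Six Seven Eight Nine".split()

# Map "A".."Z" and "0".."9" to their spoken words.
SPELLING_WORDS = {chr(ord("A") + i): word for i, word in enumerate(LETTER_WORDS)}
SPELLING_WORDS.update({str(i): word for i, word in enumerate(DIGIT_WORDS)})


def spell_out(key: str) -> str:
    """Expand each character of `key` into its spoken word."""
    # Characters without a word (e.g. a dash) are passed through unchanged.
    return " ".join(SPELLING_WORDS.get(char, char) for char in key.upper())


print(spell_out("k7q"))
```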

Possible choices of encoding:

  • base 64 -- compact, but too many hard-to-verbalize characters (underscore, dash etc.)
  • base 34 -- 0-9 and A-Z, but with O (oh) and I (aye) left out as the easiest to confuse with digits
  • base 32 -- same as base 34, but leave out 0 (zero) and 1 (one) as well
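
The base 34 and base 32 alphabets described above are easy to try out. This is a minimal sketch, not a standard encoding: the alphabets are built exactly as the list describes, and `encode` / `decode` are hypothetical helper names.

```python
import string

# 0-9 and A-Z with O (oh) and I (aye) left out.
ALPHABET_34 = "".join(
    c for c in string.digits + string.ascii_uppercase if c not in "OI"
)
# Same as above, with 0 (zero) and 1 (one) left out as well.
ALPHABET_32 = "".join(c for c in ALPHABET_34 if c not in "01")


def encode(value: int, alphabet: str) -> str:
    """Write a non-negative integer using the given digit alphabet."""
    if value < 0:
        raise ValueError("value must be non-negative")
    base = len(alphabet)
    digits = []
    while True:
        value, remainder = divmod(value, base)
        digits.append(alphabet[remainder])
        if value == 0:
            break
    return "".join(reversed(digits))


def decode(text: str, alphabet: str) -> int:
    """Parse `text` back into an integer; input case is ignored."""
    base = len(alphabet)
    value = 0
    for char in text.upper():
        # str.index raises ValueError for characters outside the alphabet.
        value = value * base + alphabet.index(char)
    return value


key_value = 10**40 - 1
for alphabet in (ALPHABET_34, ALPHABET_32):
    text = encode(key_value, alphabet)
    print(len(alphabet), text, len(text), decode(text, alphabet) == key_value)
```

Because `decode` rejects any character outside the alphabet, a typed O or I in a base 34 key fails immediately instead of silently producing a different key.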

Is there a recognized encoding that is a reasonable solution for this scenario?

When I first heard of it, I liked the article "A Proposal for Proquints: Identifiers that are Readable, Spellable, and Pronounceable". It encodes data as a sequence of consonants and vowels. It is tied to the English language, though. (In German, for example, F and V sound the same, so they should not both be used.) But I like the general idea.
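
A minimal sketch of the proquint idea, assuming the 16-consonant and 4-vowel tables from the proposal: each 16-bit word becomes a five-letter consonant-vowel-consonant-vowel-consonant syllable group. The function names and the fixed `word_count` parameter are hypothetical choices for this example.

```python
CONSONANTS = "bdfghjklmnprstvz"   # 16 consonants, 4 bits each
VOWELS = "aiou"                   # 4 vowels, 2 bits each


def word_to_proquint(word: int) -> str:
    """Encode one 16-bit word as con-vow-con-vow-con (4+2+4+2+4 bits)."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("word must fit in 16 bits")
    return (
        CONSONANTS[(word >> 12) & 0xF]
        + VOWELS[(word >> 10) & 0x3]
        + CONSONANTS[(word >> 6) & 0xF]
        + VOWELS[(word >> 4) & 0x3]
        + CONSONANTS[word & 0xF]
    )


def to_proquints(value: int, word_count: int) -> str:
    """Encode `value` as `word_count` proquints, most significant first."""
    if not 0 <= value < 1 << (16 * word_count):
        raise ValueError("value does not fit in the requested word count")
    words = [(value >> (16 * i)) & 0xFFFF for i in reversed(range(word_count))]
    return "-".join(word_to_proquint(word) for word in words)


# The 32-bit value 0x7F000001 (the address 127.0.0.1) as two proquints.
print(to_proquints(0x7F000001, 2))   # lusab-babad
# A 40-digit decimal key needs 133 bits, i.e. nine 16-bit words.
print(to_proquints(10**40 - 1, 9))
```

Nine five-letter groups is much longer on paper than 26 or 27 base 32 or base 34 characters, so the trade here is length for pronounceability.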

