An underline-based shorthand
Introduction
This page describes a simple shorthand I have developed. It is based on underlining and double-underlining some letters of the word. It was inspired by LZ77 and LZ78 (see Wikipedia).
Note that your browser may not show single and double underline correctly with some letters: e.g. Jj, Jj and Jj should all look distinct.
How it works
To repeat two or more letters immediately, underline them with a single underline. The repeated letters are highlighted in the decoded text like this:
Shorthand | Decoded |
---|---|
AB | ABAB |
ABCD | ABCBCD |
ABC | ABCABC |
ABCDE | ABCDBCDE |
To specify a nonzero distance between the underlined letters and where they should be repeated, double underline one or more of the underlined letters. The maximum distance is length×(length+1)/2 where length is the number of letters to repeat:
Shorthand | Decoded | Distance |
---|---|---|
ABCDEFGHI | ABCDABCEFGHI | 1 |
ABCDEFGHI | ABCDEABCFGHI | 2 |
ABCDEFGHI | ABCDEFABCGHI | 3 |
ABCDEFGHI | ABCDEFGABCHI | 4 |
ABCDEFGHI | ABCDEFGHABCI | 5 |
ABCDEFGHI | ABCDEFGHIABC | 6 |
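The single-replacement rule can be sketched in Python. This is only an illustration, not the author's tool: the function names `max_distance` and `expand` are hypothetical, and the underlined run is passed as an explicit start index, length and distance, because this page does not spell out which double-underlined letters encode which distance.

```python
def max_distance(length: int) -> int:
    """Largest distance allowed for an underlined run of `length` letters."""
    return length * (length + 1) // 2


def expand(shorthand: str, start: int, length: int, distance: int = 0) -> str:
    """Expand one underlined run (hypothetical helper).

    The run is shorthand[start:start + length]. Its copy is inserted
    `distance` letters after the end of the run (0 = immediately after).
    """
    if length < 2:
        raise ValueError("at least two letters must be underlined")
    if start < 0 or start + length > len(shorthand):
        raise ValueError("underlined run lies outside the word")
    if not 0 <= distance <= max_distance(length):
        raise ValueError("distance exceeds length*(length+1)/2")
    insert_at = start + length + distance
    if insert_at > len(shorthand):
        raise ValueError("distance runs past the end of the word")
    run = shorthand[start:start + length]
    return shorthand[:insert_at] + run + shorthand[insert_at:]


print(expand("ABCD", 1, 2))          # BC underlined, distance 0
print(expand("ABCDEFGHI", 0, 3, 4))  # ABC underlined, distance 4
```

Passing the distance as a number keeps the sketch independent of how the double underlines are read; only the upper bound length×(length+1)/2 is taken from the rule above.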
There may be multiple replacements in a single word. The replacements may overlap, in which case they must be evaluated from left to right. When measuring distance, each underlined substring and each inserted copy counts as only one unit, regardless of its length:
Shorthand | Partially decoded | Fully decoded |
---|---|---|
ABCDE | ABABCDE | ABABCDEDE |
ABCDE | ABCDEAB | ABCDEDEAB |
ABCDE | ABCDEAB | ABCDEABDE |
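One way to read the left-to-right rule is to split the word into units, where every underlined run is one unit and every other letter is its own unit, and then insert each copy a given number of units after its run. The sketch below follows that reading; it is my interpretation of the table above rather than a reference implementation, and the `decode` function and its `(start, length, distance)` tuples are hypothetical.

```python
def decode(shorthand: str, runs: list[tuple[int, int, int]]) -> str:
    """Decode a word with several underlined runs (hypothetical helper).

    Each run is (start, length, distance), indexed into `shorthand`.
    Distance is counted in units: an underlined run or an inserted copy
    counts as one unit, and so does every other single letter.
    """
    ordered = sorted(runs)
    units: list[str] = []       # the word, split into distance units
    run_index: list[int] = []   # position of each run inside `units`
    pos = 0
    prev_end = None
    for start, length, distance in ordered:
        if length < 2:
            raise ValueError("at least two letters must be underlined")
        if start < 0 or start + length > len(shorthand):
            raise ValueError("underlined run lies outside the word")
        if not 0 <= distance <= length * (length + 1) // 2:
            raise ValueError("distance exceeds length*(length+1)/2")
        if prev_end is not None and start <= prev_end:
            raise ValueError("runs must be separated by at least one letter")
        units.extend(shorthand[pos:start])      # plain letters, one unit each
        run_index.append(len(units))
        units.append(shorthand[start:start + length])
        pos = prev_end = start + length
    units.extend(shorthand[pos:])

    # Evaluate the runs from left to right.
    for i, (_, _, distance) in enumerate(ordered):
        target = run_index[i] + 1 + distance
        if target > len(units):
            raise ValueError("distance runs past the end of the word")
        units.insert(target, units[run_index[i]])
        # Runs located at or after the insertion point shift right by one.
        for j in range(i + 1, len(run_index)):
            if run_index[j] >= target:
                run_index[j] += 1
    return "".join(units)


print(decode("ABCDE", [(0, 2, 2), (3, 2, 0)]))
print(decode("ABCDE", [(0, 2, 2), (3, 2, 1)]))
```

The check that rejects touching runs mirrors the first limitation listed below: two adjacent underlined parts could not be told apart from one long part.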
Limitations:
- Two underlined parts may not be next to each other because they would be indistinguishable from a single long underlined part. E.g. ABABCDCD cannot be shortened into ABCD because that would mean ABCDABCD.
- The underlined letters cannot be repeated more than once.
Examples of English words
I found the words in Linux by using the grep command on the file /usr/share/dict/words.
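The exact grep patterns are not given here. As a rough stand-in, the Python sketch below looks for the longest substring of a word that occurs again later, within the allowed distance, and returns the resulting shorthand. The function name `best_single_repeat` and its `min_length` parameter are hypothetical, and this is a different technique from the grep search described above.

```python
def best_single_repeat(word: str, min_length: int = 2):
    """Find the longest run that reappears within the allowed distance.

    Returns (shorthand, start, length, distance), or None if the word
    has no usable repeat. Hypothetical helper, not the author's method.
    """
    n = len(word)
    for length in range(n // 2, min_length - 1, -1):
        limit = length * (length + 1) // 2      # maximum allowed distance
        for start in range(n - 2 * length + 1):
            run = word[start:start + length]
            first_copy = start + length         # copy position at distance 0
            last_copy = min(first_copy + limit, n - length)
            for copy_at in range(first_copy, last_copy + 1):
                if word[copy_at:copy_at + length] == run:
                    # Drop the second occurrence to obtain the shorthand.
                    shorthand = word[:copy_at] + word[copy_at + length:]
                    return shorthand, start, length, copy_at - first_copy
    return None


for w in ["beriberi", "nationalisation", "murmur", "cat"]:
    print(w, best_single_repeat(w))
```

To scan a whole word list, the function could be applied to each stripped line of /usr/share/dict/words, keeping the words whose result is not None and sorting them by the length of the repeated run.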
One repeated substring
- 5 letters:
- nationalis = nationalisation
- rationalis = rationalisation
- whippersna = whippersnappers
- 4 letters:
- abracad = abracadabra
- atherosclis = atherosclerosis
- bandst = bandstands
- beri = beriberi
- flibbertigt = flibbertigibbet
- grandst = grandstands
- handst = handstands
- hodgep = hodgepodge
- hots = hotshots
- lightwe = lightweight
- priestl = priestliest
- underfd = underfunded
- 3 letters:
- backp = backpack
- barian = barbarian
- conted = contented
- mur = murmur
- nonse = nonsense
- posses / posses = possesses
- racetk = racetrack
- senti = sentient
- undergro = underground
Two repeated substrings
- Non-overlapping:
- foredaing = foreordaining
- inseitivy = insensitivity
- nondeminatiol = nondenominational
- phosrescen = phosphorescence
- postmiress = postmistresses
- proietress = proprietresses
- relentssne = relentlessness
- senlessn = senselessness
- sheprdess = shepherdesses
- sleepssne = sleeplessness
- Overlapping:
- crisso = crisscross
- ingrati (double underlined “g”) = ingratiating
- knicka = knickknack
- waster = wastewater