There’s an idea that just as English is composed of words, thought is composed of essential parts too. Perhaps there is a direct relationship between the most essential words and the most essential thoughts. But how could we find out which are the atoms of thought, and which are just the molecules?

Here’s one (quite unscientific) method. Take a dictionary and throw away all the words that don’t feature in any definition: they can be described by other words, but aren’t needed to describe any word. Now rewrite the definitions so that each word is replaced with its own definition. Again, throw away any word that doesn’t feature in the new definitions. Continue boiling until you aren’t getting rid of words anymore.
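As a rough illustration, the boiling loop can be sketched in a few lines of Python. This is not the program that produced the numbers in this post. It assumes the dictionary has already been parsed into a mapping from each headword to the set of words in its definition, and it ignores definition words that are not headwords themselves. Under that assumption, "expand the definitions, then discard unused words" comes to the same thing as repeatedly keeping only the words that some surviving word's definition still uses. The names `boil` and `toy` are made up for the example.

```python
def boil(definitions):
    """Return the headwords that survive repeated pruning.

    `definitions` maps each headword to the set of words used in its
    definition (an assumed, already-parsed form of the dictionary).
    """
    kept = set(definitions)
    while True:
        # Collect every word used to define a word that is still kept.
        used = set()
        for word in kept:
            used |= definitions[word]
        # Only headwords can survive; other tokens are ignored here.
        survivors = used & kept
        if survivors == kept:
            return kept  # nothing was thrown away this round
        kept = survivors


# A toy dictionary: two circular pairs and a few words built on top of them.
toy = {
    "kitten": {"young", "cat"},
    "cat": {"small", "animal"},
    "young": {"little"},
    "small": {"little"},
    "little": {"small"},
    "animal": {"living", "thing"},
    "living": {"thing"},
    "thing": {"object"},
    "object": {"thing"},
}

print(sorted(boil(toy)))  # ['little', 'object', 'small', 'thing']
```

In the toy data only the words caught in circular definitions survive, which is one way of reading "atoms" here.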

I took Webster’s dictionary from the Project Gutenberg site. I started with 95712 words. After the initial throwing away of words that weren’t in any definitions, I was down to 4489 words. After expanding the definitions, and throwing away words that weren’t in the expanded definitions, I was down to 3601 words. Treating words with recursive definitions as atoms and continuing got me down to 2565 words, which should be enough to describe every thought you might have…

It turns out that a, b, c, d, e, f, g, h, i, j, k, l, n, o, q, s, t, u, v, w, y, z are the essential letters, leaving m, p, r and x as compound concepts.

I’m not sure how to reduce the number of words further, so here they are:

much, muck, vane, vamp, murk, mule, mull, vain, helve, mump, gland, tilth, musk, must, muss, mute, vase, soothe, souled, vast, glass, glare, scrimp, wrench, cove, cowl, pronged, hence, scurf, lance, scull, smouch, ketch, fy, glaze, ha, go, do, by, twinge, be, frock, itch, boast, cram, lynch, at, as, crab, an, tinge, no, of, realm, on, or, tined, board, pi, crew, me, my, ky, la, is, it, crib, in, slack, frond, front, up, to, frost, crow, crop, froth, so, reach, course, crup, frown, george, sleaved, gleet, club, clot, clog, lipped, inch, screw, scrim, scrip, fetch, screen, screed, billed, wythe, swathe, scrub, frill, clip, kibed, clew, clay, clan, clam, clap, claw, cock, code, large, coal, coat, pledge, quid, quit, quiz, frith, yacht, cope, cool, cook, coop, coot, cone, come, comb, waltz, cost, scribe, cork, corn, core, cord, scrap, colt, cold, coke, coin, tinned, coil, coif, taint, waffle, marked, glint, vive, reign, chit, chum, yeast, chop, corps, view, vice, yield, glide, terse, vine, gliff, char, chap, chat, scythe, fruit, nail, name, couch, spight, slish, waive, drain, waist, draft, slime, sling, slink, hoard, skilled, fierce, slide, plump, rinse, plumb, slice, pluck, drake, drape, right, scroll, slake, vest, slang, slant, vert, keeve, vent, vend, slate, slave, veil, vein, veer, taste, slight, cough, court, count, kedge, juice, warmth, wreath, hitch, glove, gloss, sleek, sleep, gloom, wright, sleid, cell, reeve, ceil, shelve, weight, globe, cent, cede, scaled, tared, slope, sloop, meet, dawk, daze, meek, crouch, brown, broth, rock, robe, scent, mend, scene, roan, roam, roar, melt, brose, mess, mesh, mere, broom, brook, brood, mete, clinch, strength, ridge, wraith, douche, crotch, fault, loose, draught, meat, mean, meal, seethe, vole, void, tierce, bilge, iced, vote, paint, ghost, deed, deep, cracked, deck, dear, deal, dean, dead, deaf, priest, haunch, desk, dent, nappe, made, farce, make, mail, main, scant, midst, beast, mace, scale, scald, scalp, beard, mawk, 
math, mate, scarp, scarf, scare, scaup, manx, male, malt, mall, mask, beach, mash, mass, mast, tenth, marl, mark, mart, mars, tense, rive, knife, ring, rind, ripe, risk, rise, brunt, rift, squab, oared, conch, brush, ride, brute, rich, squib, squid, nerved, phono, shive, dace, copse, bight, dauk, date, dash, dart, dark, snatch, shift, shine, damp, dame, shirk, shire, shirt, daft, basque, scrape, knell, scrawl, shawl, shave, tease, laugh, sprain, birth, teach, scowl, coarse, scorn, score, scots, scour, birch, scout, scope, scoop, read, coach, rear, reap, rhyme, reed, reef, reek, reel, scold, latch, rein, false, kneed, rent, rest, knead, coast, lurch, bitts, stretch, mood, wretch, mole, sheer, sheet, moll, mohr, move, mown, shears, moss, most, mote, moth, more, scream, sheal, moon, moot, sheaf, shear, prince, praise, shell, shelf, mode, mock, moan, moat, fresh, louse, sherd, rave, watch, rant, rank, hoarse, rapt, rape, faith, rare, rate, rash, spread, rasp, faint, mite, teens, steeve, mill, milk, mild, milt, teeth, mink, mine, mind, minx, mire, miss, mist, search, rhumb, bouffe, curl, curb, cure, calced, curt, cusp, cusk, rage, raft, rake, raid, knight, rail, rain, waste, mince, preach, cube, rack, race, nerve, waved, hunk, hunt, lunge, hulk, hull, slush, huff, huge, beached, hued, looped, husk, hurt, hurl, shade, frank, shaft, trench, frame, shake, shall, frail, shame, launch, launce, shank, shape, share, sharp, shark, brave, brawl, braze, brass, weave, frieze, nautch, throat, press, pshaw, bathe, batch, wait, wail, stanch, wage, boat, wale, walk, wall, barge, wake, preen, throng, throne, wade, bridge, grunt, braid, brain, brail, born, bore, boss, bote, bout, brace, bowl, boil, ward, ware, want, brant, brand, based, bolt, bold, boll, hooked, bomb, wave, hatch, friend, warn, warm, wart, warp, bond, bone, wash, brake, boot, book, boom, boon, bright, nosed, night, norse, haunt, weed, wedge, weep, weak, weal, wear, just, jump, smack, north, junk, sail, sage, safe, 
notch, sack, small, click, blur, blue, smart, witch, cloak, ninth, blot, quench, blow, hawse, flange, breed, sylph, close, bream, break, bread, prill, prime, clock, print, pride, price, prick, breve, tryst, clout, clove, clown, cloth, cloud, bite, bird, bing, bile, bill, bilk, bind, bitt, winged, drunk, hound, house, hours, suite, berth, flanged, nude, numb, breathe, bide, joke, join, barbed, splint, odds, guards, dredge, death, perch, shrike, shrill, shrimp, ruck, fringe, winch, wince, cheese, shriek, shuck, scheme, shunt, bribe, brick, rust, bride, rush, brief, brine, bring, brink, wine, wind, rude, wing, wink, shrine, shrink, sphere, ruff, wipe, wise, wisp, wire, rule, with, twill, twine, welsh, smile, twist, dunce, halve, best, bend, bell, belt, bent, twice, beef, pitch, bear, beat, thrist, bead, beak, beam, thrive, thrill, stepped, thrips, bench, smith, smite, whelked, dulse, sulks, roll, knock, roil, root, height, rope, roof, room, prawn, rote, ross, rose, broad, rout, prank, knosp, thrall, shield, knout, pounce, well, welt, wend, bruise, webbed, west, sphinx, clump, barn, bark, bare, bard, wide, surge, bath, bate, bass, base, ditch, bane, band, bank, bang, balm, wild, shoot, will, shook, jibe, shore, short, shote, smear, shove, wife, shout, back, tweak, bait, bail, bake, ball, balk, bale, bald, stored, priced, jilt, thread, what, whig, whim, whip, whir, toast, weird, when, shroud, whet, whey, harsh, weigh, shoal, smell, smelt, shock, twang, cage, zone, spar, writ, span, flute, flush, brusque, cheve, chest, fluke, wield, cave, drink, cheat, sour, soul, sort, sore, pinch, soph, soot, carl, carp, care, card, cart, cast, cash, case, cask, breach, drill, soon, cane, jest, camp, drive, sole, sold, cant, some, juggs, jerk, cake, cheep, cheer, cheek, soil, calm, call, calk, calf, check, soft, soak, soap, quitch, fleet, fleer, nine, taunt, snug, stuff, girth, snow, stunt, stump, hutch, flesh, stupe, voice, spoil, singe, spout, spoon, since, sinch, spoor, spool, 
sport, spore, strew, nice, nide, nick, nigh, strap, straw, stray, piled, strop, badge, strip, hurst, jeer, breath, breast, strum, strut, dried, glance, drift, spot, trine, spur, trill, school, spit, spin, trite, trice, trick, tride, tribe, breech, tried, schist, dress, clothe, chirk, spunk, chine, chink, troupe, child, chill, spurn, spurt, skid, quince, skip, skin, skim, chief, dream, trough, need, smoke, near, neat, neap, neck, style, smock, pique, spurred, wood, woof, flake, next, flail, newt, news, size, spleen, womb, nest, wolf, flash, flask, flare, judge, work, worm, flank, word, flame, float, smut, smug, flock, splent, snap, snag, hunch, flood, smee, smew, slow, slur, slug, dread, breeze, jack, jade, trend, floor, shrewd, sprig, slam, sled, width, flour, slim, flown, sprag, spray, slip, slit, cordate, tread, treat, slot, staves, truck, horse, writhe, branch, trump, trunk, sight, trust, truss, world, spice, noise, fleece, tooth, worth, worse, clang, clank, piece, chilled, clamp, pearl, peart, claim, probe, sphex, clare, clasp, class, sick, drove, side, pouched, sift, sign, deuce, prong, proof, butt, buzz, krems, clack, prose, proud, buoy, siss, prove, sire, bunk, bung, bump, site, bull, bulk, bulb, prowl, bush, burr, burn, buff, silk, silt, sink, toned, sine, sing, tongs, sham, zinc, shad, speck, speed, shed, childe, spell, spend, logged, shun, shut, legged, shin, ship, shop, show, shot, shoe, touch, tough, through, dense, note, nose, none, noon, scar, nonce, chase, charm, chart, chalk, chant, starch, scad, scab, whorled, furze, chair, chain, chafe, twitch, sand, salt, sake, chinch, flick, flirt, buck, trout, save, flint, fling, sash, split, droop, splay, grieve, keyed, clerk, smooth, noose, clear, cleat, clean, depth, seal, seam, cleft, sear, seat, gnash, seck, daunt, seed, seel, seek, troth, thrust, thrush, self, troll, send, brow, scow, flounce, spite, scum, starve, brun, ounce, nones, scye, spire, spike, spile, spill, spine, brag, prune, inched, vogue, 
bret, brew, brim, whiff, which, whirl, space, spade, whine, while, spalt, spasm, spare, spark, dance, zest, surd, sure, sunn, broach, suit, dough, salve, stern, such, doubt, suck, gneiss, crime, twelve, whisk, white, steak, steal, crisp, stub, bulge, stud, steam, sense, stop, stow, steep, steel, stir, there, stay, star, stab, stag, stew, stet, stem, step, poach, springe, thewed, douse, their, gift, ooze, gill, gild, knob, know, knot, knee, give, knit, bunch, spear, speak, gist, gird, girt, glib, glow, build, built, croak, crock, spright, glum, glue, sluice, glut, stark, start, stave, state, stake, stain, stall, stalk, stale, stamp, stand, flight, stack, stage, staff, serve, glad, blight, ache, eyed, goat, goal, goaf, goad, eyre, swim, swig, swop, croup, gnat, cross, crore, joule, crown, crowd, swab, sway, swan, joist, sized, speech, joint, thirst, stock, myrrh, glimpse, light, sauce, sault, else, crude, whole, crumb, grease, whoop, whore, crump, crust, crush, gaze, clause, gape, gang, gaul, gate, gash, gasp, crowned, garb, stork, store, gaff, storm, gage, stool, stoop, gain, gauze, stone, gall, gale, game, stove, stout, stoup, sarse, grace, grade, coursed, gauge, march, once, stink, sting, stint, still, stilt, stiff, stick, seize, marsh, polled, tail, tang, tamp, tame, tall, talk, take, tart, tare, garth, tape, task, south, sound, match, burst, geld, liege, burke, germ, sheathe, gear, tack, porte, wheeled, foxed, wicked, eight, dodge, fuse, moist, plague, stringed, funk, fund, twaite, full, thou, feud, they, fell, thin, this, fend, felt, quail, quake, fear, that, than, feed, feel, quart, choose, sconce, quash, tend, gripe, cowled, tent, tell, teil, grist, teem, pawl, pawn, grill, teat, tear, pave, quack, team, path, past, pass, parch, paste, vault, sponge, porch, lapsed, scratch, park, pare, faints, part, text, pale, palm, pall, pang, test, apse, page, pair, pain, paid, pack, rogue, term, scorch, fyke, thrack, young, crined, leaved, arched, arch, fish, graze, fist, 
firm, fire, grave, formed, fitz, five, roast, fizz, queue, graft, fife, fief, yoke, grail, grain, vexed, file, grasp, grass, tailed, querl, peaked, grate, quest, quell, grant, grand, finn, fine, find, fill, time, auld, queer, queen, tilt, till, tile, greed, crunch, tift, great, scotch, greek, green, veined, paned, kail, thus, tier, tide, tick, chord, chops, choke, peak, pear, forked, chock, flex, flea, solve, peck, flay, flaw, flax, flat, flap, flag, peel, peen, peer, perk, pelt, pest, toll, tomb, tone, groom, scouse, tool, tope, sparse, toed, groan, wharf, flow, sparge, whaup, flux, grout, tote, toss, flip, tour, tout, budge, town, gross, group, sexed, chyme, mouthed, clutch, ribbed, bit, foil, bin, big, bob, kern, bog, leech, forge, box, bow, boy, ban, bag, bad, baa, bay, bat, bar, foam, mesne, force, beg, bee, bed, bey, quick, bid, cat, can, cap, quilt, quill, wormed, bug, bud, but, buy, tongue, bye, quite, aim, air, blithe, aid, and, poise, all, quaint, abb, month, age, add, adz, progue, ace, act, choice, keen, keel, keep, fowl, four, foul, forth, are, arc, arm, ark, fork, form, art, ash, ask, ape, fore, apt, point, foot, fool, awe, font, food, awk, awl, ass, fond, fold, auk, whelm, patch, where, lithe, year, yean, mound, mount, horned, mouth, mouse, yelp, yell, quoth, quote, fret, free, fred, chuck, fray, frog, from, found, face, wheel, fade, fact, prompt, scrolled, tube, tuck, sober, crease, fail, fair, tuft, fake, saint, fall, churn, fame, tune, tump, groined, fawn, lived, fate, kiss, fast, fare, farm, kite, tree, trap, trot, troy, leave, least, forced, learn, leash, lease, trip, trim, true, leach, clumps, pounced, kink, king, kind, kill, kiln, jet, sleeve, guise, drop, jar, jam, jag, jaw, twin, twig, purge, job, joy, thought, jog, jig, pray, jib, drum, drug, wrest, prim, prig, ire, draw, haired, prod, drag, prop, dram, drab, purse, kid, kin, kit, black, pussy, key, sprout, league, guide, fight, turn, turf, jug, cranked, tush, tusk, tight, guild, boiled, 
length, goose, haw, rowed, hay, hew, her, hen, hem, hip, type, his, hit, sledge, doze, god, swipe, down, dorr, swish, gum, backed, switch, fifth, guy, gut, swing, swine, lorn, lord, lore, wreak, lope, doit, loop, loom, loon, look, wreck, lone, long, dock, ice, growth, roust, rouse, paunch, done, frogged, vague, puff, lout, love, inn, ink, loud, lost, loss, dome, ill, dole, lose, how, loan, pure, loam, loaf, hog, purl, hot, load, hop, hoe, field, hug, pull, pulp, round, hub, pump, yaws, yawn, loft, hum, should, rouge, rough, yarn, yard, push, lock, chance, few, fee, fib, guess, guest, change, fay, fat, luck, far, fan, fag, luce, eye, punch, eve, lodge, elf, elm, elk, ell, end, egg, eke, bleak, gin, gib, gig, swamp, get, swage, gem, dung, swath, gag, gas, gar, gap, dupe, squeeze, dusk, dust, sward, smeared, swart, swarm, bleed, lure, duff, fur, faced, duke, lute, dump, fra, dumb, lust, dull, fro, fry, fox, for, fog, foe, lump, duck, fly, fig, fin, fir, fit, fix, horn, hour, hood, hook, dab, hoop, dag, dam, day, hope, blaze, cup, cue, hold, home, hole, hone, cut, leafed, sweal, swear, blade, sweat, cod, cog, con, cot, sweep, sweet, cry, blank, bland, swell, blast, cit, blare, ebb, sterned, ear, eat, wrist, write, squint, eel, wring, squill, edh, dur, squirt, squire, dot, hock, spring, dog, hoar, dun, dup, due, dub, dry, dib, dew, den, dim, din, dip, die, dig, heel, guard, heck, pith, red, rep, dike, verge, heir, raw, ray, heft, dill, rap, rat, dine, ram, help, dint, rot, row, rob, hemp, rod, hell, helm, dirk, dire, dish, disk, disc, dirt, herd, rip, rig, rim, rib, rye, run, rum, rut, pick, rub, charge, sad, sag, pink, pine, ping, say, saw, pile, heat, heap, hear, heal, head, pipe, lynx, six, shy, sit, sip, sin, sew, set, sea, see, sou, sow, church, sol, son, sod, raised, fleeced, sky, limbed, sly, wrong, wroth, sty, sue, sum, sup, sun, blink, blind, tag, tax, taw, tau, tar, tap, tan, dice, snuff, pulse, par, pat, paw, pay, pad, pap, pan, chaste, loath, magged, mease, 
pew, plea, pen, per, pet, peg, ply, play, pie, pig, plan, pin, pip, pit, poy, pox, pop, pot, pry, curve, aisle, put, pun, pug, pyx, groove, harp, plot, hard, hall, hale, half, verse, hake, hang, hand, hawk, plum, plug, haze, hate, hash, hasp, have, haul, hive, eighth, net, sleave, new, nap, hind, hill, square, hire, grange, hint, gourd, nul, nun, high, nod, spruce, not, nor, now, gouge, chouse, nib, nip, chough, ohm, thyme, oil, poll, pole, strained, poke, odd, pound, grouse, pond, off, oak, own, mulch, owe, our, out, pouch, ground, ore, corked, orb, old, stitch, one, lag, lap, lar, law, lax, lay, lop, loo, log, lot, blotch, low, lug, pour, pout, port, prayer, pose, lex, post, let, leg, lee, nurse, pons, pool, lip, bowse, lin, poor, poop, lid, lie, mad, map, man, may, mat, mew, mob, mop, mow, mud, hewn, gorge, mix, mid, keeled, thump, thumb, stream, snood, snort, streak, skirt, bound, thatch, skill, skiff, jawed, edge, pierce, each, blown, blote, ease, east, bloom, earl, earn, fives, curled, blood, block, plate, trawl, yea, yen, yew, yet, trash, feast, tramp, plant, strain, plank, strait, plane, trail, train, trait, plash, strand, place, shirred, trade, plait, strath, plaid, plain, trace, track, tract, gyte, though, leet, leer, feign, hinge, lead, leaf, lean, leak, leap, blunt, strike, stride, strict, feint, blush, wry, thrid, wet, web, who, throw, wis, win, why, smudge, bluff, thrum, wan, wax, way, war, was, wag, wad, haft, hair, hail, hack, three, vow, vat, sniff, use, urn, snide, laze, stress, straight, brooch, lard, last, phrase, lash, tun, late, tub, tug, lave, two, try, lake, toe, land, lane, lank, toy, tow, ton, too, top, lamb, lame, lamp, tie, tin, tip, tic, tid, boots, ten, lade, tea, lace, lack, spired, the, bronze, gree, dyke, gray, boned, health, grit, grip, lick, lieu, life, lief, blanch, like, link, line, limp, lime, plight, lift, eared, wooled, cause, grub, swoop, gros, grow, sword, urge, sneer, crack, sneak, crane, crank, gold, thorn, crape, good, 
crass, gong, freight, crate, craft, gorm, gore, crake, gown, cramp, raise, left, lend, lere, craze, fanged, earth, less, stripe, chinned, string, lewd, strive, necked, range, plead, thick, snare, snarl, think, sleeved, heath, heart, third, heave, snail, snake, creak, cream, stroke, creep, carte, carve, strong, hearse, crest, creux, bogue, slough, plunge, first, gust, gush, browed, lint, gulf, live, list, lisp, catch, fence, hedge

4 thoughts on “Atoms of English”

  1. What school of thought are you referencing exactly?
    “blow” and “blown” are redundant, for example; you should work on word trunks like 🙂


  2. I would have liked a dictionary that gave me word trunks, but ideally in English. You can have the Java source if you fancy trying it on German. However, I’m not totally sure that all derived versions of words are necessarily redundant. You can never predict 100% the way a derived word will be used just from the derivation, so there is some level of distinctiveness in each derived word, and that distinctiveness may include an atomic concept.

    However, I’m not claiming this technique is even reasonable, just the best I could come up with in a few hours the other night…

    I’m planning to do a directed graph visualisation of all the words needed to define other words at some point, and I’ll post it here when I do. Have to finish detecting edges first though.


  3. Update: I noticed that some of my parsing was off – most of the words were fine, but some were a bit messed up. I’m not going to bother rerunning it just now, since it takes ages. Anyway, I’m more interested in visualising the graph…

    John: Do you want the list for programming with, or for looking at? And how is sorting it according to Java hashcode not useful? 🙂 Actually, you’ve made me want to publish a dictionary where each word is on the page given by its hashcode, computed by a simple human-computable method. I ran a couple of tests that combined features humans could easily spot (number of descenders, number of vowels, etc.) and took the result modulo the intended number of pages (500 seems reasonable), but I’m still not down to a reasonable maximum number of words per page. If anyone can help me, we can share the enormous profits from the publishing (lulu).

    Fantastic, a dictionary you can only use if you already know how to spell the word and can do mental arithmetic. You could make it even more evil by including aspects of the meaning in the hash code; then you could only use it if you knew what the word means and how to spell it. You know it’d be a hit with elitist geeks.
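For anyone who wants to play with the page-hash idea from the comments, here is a small Python sketch. It is only a guess at what such a hash might look like: the choice of descender letters, the weights 31 and 7, and the names `page_of` and `max_words_per_page` are all assumptions made for the example, and only the 500-page count comes from the comment.

```python
from collections import Counter

DESCENDERS = set("gjpqy")  # assumed: letters that drop below the baseline
VOWELS = set("aeiou")
PAGES = 500                # page count suggested in the comment


def page_of(word):
    """Map a word to a page using features a reader could count by eye."""
    descenders = sum(ch in DESCENDERS for ch in word)
    vowels = sum(ch in VOWELS for ch in word)
    # The weights are arbitrary; they just keep the features from colliding.
    return (31 * descenders + 7 * vowels + len(word)) % PAGES


def max_words_per_page(words):
    """Return the size of the most crowded page."""
    counts = Counter(page_of(w) for w in words)
    return max(counts.values())


print(page_of("much"), page_of("muck"), page_of("pledge"))  # 11 11 82
print(max_words_per_page(["much", "muck", "pledge"]))        # 2
```

Because short words share so few distinct feature counts, many of them land on the same page, which is consistent with the crowding problem described above.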


