Dirty Poetry

Have you seen the post, “Somebody Has to Pay for Easter…”? The weird text in that piece, reproduced here:

g/i« : L : 2t . = G = Z H i A ;i P e o x‘ i o . = = – AET 2 b 35 – . o S Y Z Z \ S 3 “A e e G e T TaE i & b T ‘ 2 Gl 0 > / b o g ko {1/ 3 B % e 2 o T biistie e ST R S f.‘fi::% (! S A e A m 3 2 i e g v g R . At ,t‘ o et il R S B e T ; e s %‘ A 7 . Sty T j E NS . s = – o / By e i i = Ee ¢/ ! i . e S 4 £ 0 i 3 T s ey & v N i T E o : o b i : e e, 000 Lo o = L oo e s i (i B W e 7 7 1 kB v‘/ e R L v i G L LN e e . s . e % s S B G s o o oA ¢ i . A e e e i & GRS i 2% s SRR B N

was generated by AI. When I was posting the image to Instagram, there was a button that offered to “auto generate an alt tag description.” Who could resist? I was amused by the results, which you see here. Well, Chad Nelson has done me one better. Introducing Dirty Poetry.

Poems made from the dirtiest OCR data available from the Library of Congress’s Chronicling America dataset of historical Newspapers… Right! I’ve poured over ((ok, grepped) ~500GB of Chroincling America data to find lines that meet my low standard for nonsence, basically ones that match egrep "[^a-zA-Z0-9 ]{3,}". Having found the absolute bottom of the barrel, I then ran each line through some parsing and analysis to determine each lines…
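For the curious, here is a rough sketch of what that filter does, written in Python rather than egrep. The regular expression is the one from Chad's description; everything else (the `dirty_lines` name, the sample lines) is my own stand-in for illustration, not his actual code.

```python
import re

# Same pattern as the egrep in the quote: a run of three or more
# characters that are not ASCII letters, digits, or spaces.
DIRTY = re.compile(r"[^a-zA-Z0-9 ]{3,}")


def dirty_lines(lines):
    """Yield the lines that contain at least one such run (hypothetical helper)."""
    for line in lines:
        # Drop the trailing newline so it is not counted as a "dirty" character.
        line = line.rstrip("\n")
        if DIRTY.search(line):
            yield line


# Two lines from the sample poem plus one clean sentence for contrast.
sample = [
    "covi ?:\\.-? loi: a BOX U BEBT LIT.",
    "The quick brown fox jumps over the lazy dog.",
    "Tsmphlet i-ont Irio I y t ? ..;lil.i!ir v :.",
]

found = list(dirty_lines(sample))
for line in found:
    print(line)
```

The clean sentence is skipped because it never has three non-alphanumeric characters in a row, while both OCR lines pass. Presumably the real pipeline streams files from disk instead of a list, but that part is a guess.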

OCR (Optical Character Recognition) may seem like a completely different thing than AI, but it’s not really. In both cases you have software attempting to regurgitate something it has scanned back at you. In the case of OCR, the program simply tries to interpret what it sees, while AI software tries to imitate what it has scanned.

The Library of Congress is Chronicling America by scanning old newspapers to create an ostensibly searchable archive of American history. Chad Nelson has found a way to use some of their material to generate Dirty Poetry.

I know my description here kinda sucks so just check it out. But before you go, here’s a sample:

covi ?:\.-? loi: a BOX U BEBT LIT.
upon ?getting away at eon .? i??? tibie
upon ?getting away at eon .? i??? tibie
Tsmphlet i-ont Irio I y t ? ..;lil.i!ir v :.

Chad calls it Dirty, but I think it’s Pure Poetry.
