eternally stressed semanticist (cqs) wrote,

Spell checking and the internet

I've spent the last week noodling around with spelling correction in Python, with no particularly good results. (I might need to do more than noodle if I want good results.) Part of the problem is deciding what to do with unfamiliar words. If your client wants you to be searching on Pepsi-Cola (not an actual example), you kind of want references to "Pespi" to be corrected to "Pepsi"... but you don't want any reference to "Cole Porter" to be corrected to "Cola Porter".
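To make the Pepsi/Cole problem concrete, here is a minimal sketch of one possible approach, not what I actually ended up with: only correct a token toward the client's terms if it isn't already a word we recognize. The word lists, the function names, and the 0.75 cutoff are all made up for illustration; the only library call is the standard library's difflib.get_close_matches.

```python
import difflib

# Hypothetical vocabularies, for illustration only.
TARGET_TERMS = ["Pepsi", "Cola"]          # terms the client cares about
KNOWN_WORDS = {"cole", "porter", "pepsi", "cola"}  # words we trust as-is


def correct_token(token, cutoff=0.75):
    """Return a target term if the token looks like a misspelling of one."""
    lowered = token.lower()
    # Guard: never "fix" a word we already recognize (e.g. "Cole").
    if lowered in KNOWN_WORDS:
        return token
    # Map lowercase forms back to their canonical capitalization.
    canonical = {term.lower(): term for term in TARGET_TERMS}
    matches = difflib.get_close_matches(lowered, list(canonical), n=1, cutoff=cutoff)
    return canonical[matches[0]] if matches else token


def correct_text(text):
    """Correct each whitespace-separated token (punctuation is not handled)."""
    return " ".join(correct_token(token) for token in text.split())


print(correct_text("I drank a Pespi with Cole Porter"))
```

The guard is doing the real work here: by difflib's similarity measure, "cole" scores 0.75 against "cola", so without the known-words check it would happily be rewritten. Of course, this only moves the problem to deciding what goes in the known-words list.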

Today's lesson, though, learned while looking through capitalized phrases in a corpus and finding "beret syndrome": no amount of spell-checking will help you when someone refers to Gion Beret Syndrome.