8.1 Introduction

  • Strings support many of the same sequence operations as lists and tuples
  • Strings, like tuples, are immutable
  • This chapter takes a deeper look at strings
  • It also introduces regular expressions and the re module for matching patterns in text
    • Particularly important in today’s data-rich applications
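
The points above can be illustrated with a minimal sketch. The sample strings and variable names here are made up for illustration and do not come from the book. It shows a few sequence operations that strings share with lists and tuples, what immutability means in practice, and a first use of the re module.

```python
import re

text = "Strings are sequences"  # illustrative sample string (an assumption)

# Sequence operations that strings share with lists and tuples
first_char = text[0]        # indexing
first_word = text[:7]       # slicing
length = len(text)          # length
has_are = "are" in text     # membership test

# Strings are immutable: assigning to an index raises TypeError
try:
    text[0] = "s"
    mutated = True
except TypeError:
    mutated = False

# "Changing" a string really means building a new string object
lowered = text[0].lower() + text[1:]

# A simple regular expression: find every run of digits in some text
digits = re.findall(r"\d+", "Order 66 shipped on day 12")

print(first_char, first_word, length, has_are)
print(mutated, lowered)
print(digits)
```

Because strings cannot be modified in place, operations that appear to change a string (such as lower or concatenation) return a new string and leave the original untouched.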

8.1 Introduction (cont.)

  • The table below lists many string-processing and natural language processing (NLP) applications
String and NLP applications
Anagrams
Automated grading of written homework
Automated teaching systems
Categorizing articles
Chatbots
Compilers and interpreters
Creative writing
Cryptography
Document classification
Document similarity
Document summarization
Electronic book readers
Fraud detection
Grammar checkers
Inter-language translation
Legal document preparation
Monitoring social media posts
Natural language understanding
Opinion analysis
Page-composition software
Palindromes
Parts-of-speech tagging
Project Gutenberg free books
Reading books, articles, documentation and absorbing knowledge
Search engines
Sentiment analysis
Spam classification
Speech-to-text engines
Spell checkers
Steganography
Text editors
Text-to-speech engines
Web scraping
Who authored Shakespeare’s works?
Word clouds
Word games
Writing medical diagnoses from x-rays, scans, blood tests
and many more…

©1992–2020 by Pearson Education, Inc. All Rights Reserved. This content is based on Chapter 5 of the book Intro to Python for Computer Science and Data Science: Learning to Program with AI, Big Data and the Cloud.
