PyPI Stats — Download analytics for Python packages

Download data sourced from BigQuery (Google). Counts may include CI/CD and mirror traffic.
Inspired by and built upon the work of pypistats.org

Best Python Text Processing & NLP Libraries

Libraries for parsing text, natural language processing, and linguistic analysis.

24 packages · ranked by health score & downloads

Trending this week

↑ regex · ↑ docutils · ↑ chardet · ↑ lark · ↑ junitparser
#1 charset-normalizer v3.4.4
83

The Real First Universal Charset Detector. Open, modern and actively maintained alternative to Chardet.

1.2B/mo · MIT · ★ 757
#2 Jinja2 v3.1.6
80

A very fast and expressive template engine.

477.3M/mo · ★ 11.5K
#4 MarkupSafe v3.0.3
75

Safely add untrusted strings to HTML/XML markup.

534.0M/mo · ★ 684
#5 lxml v6.0.2
90

Powerful and Pythonic XML processing library combining libxml2/libxslt with the ElementTree API.

248.0M/mo · BSD-3-Clause · ★ 3.0K
#6 regex v2026.2.28 ↑
82

Alternative regular expression module, to replace re.

286.3M/mo · ★ 581
#7 beautifulsoup4 v4.14.3
85

Screen-scraping library

248.9M/mo · MIT License
#8 docutils v0.22.4 ↑
84

Docutils -- Python Documentation Utilities

252.5M/mo
#9 pyparsing v3.3.2
80

pyparsing - Classes and methods to define and execute parsing grammars

302.1M/mo · ★ 2.5K
#10 chardet v7.0.0 ↑
85

Universal character encoding detector

140.6M/mo
#11 Markdown v3.10.2
83

Python implementation of John Gruber's Markdown.

94.3M/mo · ★ 4.2K
#13 tree-sitter v0.25.2
84

Python bindings to the Tree-sitter parsing library

72.2M/mo · ★ 1.4K
#14 nltk v3.9.3
86

Natural Language Toolkit

54.2M/mo · Apache License, Version 2.0 · ★ 14.6K
#15 lark v1.3.1 ↑
83

a modern parsing library

60.7M/mo · MIT
#16 humanfriendly v10.0
88

Human friendly output for text interfaces using Python

46.5M/mo · MIT
#17 Sphinx v9.1.0
79

Python documentation generator

67.9M/mo · ★ 7.8K
#18 junitparser v4.0.2 ↑
91

Manipulates JUnit/xUnit Result XML files

38.1M/mo · Apache-2.0
#19 text-unidecode v1.3 ↑
78

The most basic Text::Unidecode port

63.2M/mo · Artistic License · ★ 68
#20 tinycss2 v1.5.1
74

A tiny CSS parser

68.1M/mo · ★ 184
#21 langchain-text-splitters v1.1.1
82

LangChain text splitting utilities

39.1M/mo · MIT · ★ 131.3K
#22 html5lib v1.1
87

HTML parser based on the WHATWG HTML specification

29.9M/mo · MIT License
#23 prettytable v3.17.0
80

A simple Python library for easily displaying tabular data in a visually appealing ASCII table format

40.6M/mo · ★ 1.6K
#24 humanize v4.15.0
76

Python humanize utilities

46.9M/mo · ★ 719
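
The download and star figures in this listing are abbreviated with K, M, and B suffixes. For anyone who wants to sort or compare them numerically, here is a minimal sketch of converting such strings to integers. It assumes the suffixes are decimal (thousand, million, billion), which the page does not state, and the `parse_count` helper is a hypothetical name for illustration, not part of any site API.

```python
import re
from decimal import Decimal

# Assumed decimal multipliers for the suffixes seen in the listing.
_SUFFIXES = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000}

# A number, an optional K/M/B suffix, and an optional "/mo" unit.
_COUNT_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)([KMB]?)(?:/mo)?\s*$")


def parse_count(text: str) -> int:
    """Convert an abbreviated count such as '1.2B/mo' or '11.5K' to an int."""
    match = _COUNT_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized count: {text!r}")
    number, suffix = match.groups()
    # Decimal avoids float rounding noise on values like 477.3 * 1_000_000.
    return int(Decimal(number) * _SUFFIXES[suffix])


if __name__ == "__main__":
    for sample in ("1.2B/mo", "477.3M/mo", "11.5K", "757"):
        print(sample, parse_count(sample))
```

The displayed figures are already rounded, so the parsed integers are only as precise as the abbreviation allows.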