Carrot2: Text Clustering Algorithms and Applications
Carrot2 is a programming library for clustering text. It can automatically discover groups of related documents and label them with short key terms or phrases.
For example, Carrot2 can turn search result titles and snippets into labeled groups of related results.
Carrot2 is a software component and typically integrates with other software as a library dependency (see the API documentation available with each release).
Binary releases are published on GitHub. They ship with an HTTP/JSON REST API service called the DCS (document clustering server) for integration with other languages.
The documentation for the latest release is always at https://carrot2.github.io/release/latest.
Developer documentation and examples are part of binary releases. Once downloaded and unpacked, start the DCS:
    cd dcs
    ./dcs
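Once the DCS is running, a client in another language can talk to it over HTTP. The following is a minimal Python sketch, not taken from the Carrot2 documentation. It assumes the DCS listens on `http://localhost:8080`, exposes a `/service/cluster` endpoint, accepts a JSON body with `language`, `algorithm`, and `documents` fields, and answers with a `clusters` list whose entries carry `labels` and `documents` (indices into the request). Check these assumptions against the documentation shipped with your release. The helper names `build_request`, `cluster`, and `format_clusters` are hypothetical.

```python
import json
import urllib.request

# Assumed address of a locally started DCS; adjust host, port, and path
# to match the documentation bundled with your release.
DCS_URL = "http://localhost:8080/service/cluster"


def build_request(documents, language="English", algorithm="Lingo"):
    """Build the JSON-serializable request body.

    `documents` is a list of (title, snippet) pairs. The field names used
    here are assumptions about the DCS request format.
    """
    return {
        "language": language,
        "algorithm": algorithm,
        "documents": [
            {"title": title, "snippet": snippet}
            for title, snippet in documents
        ],
    }


def cluster(documents, url=DCS_URL, timeout=10.0):
    """POST the documents to the DCS and return the decoded JSON response."""
    body = json.dumps(build_request(documents)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def format_clusters(result):
    """Return one line per top-level cluster: its labels and document indices."""
    lines = []
    for group in result.get("clusters", []):
        labels = ", ".join(group.get("labels", []))
        lines.append(f"{labels}: {group.get('documents', [])}")
    return lines
```

With the DCS running, `format_clusters(cluster(docs))` would return one line per discovered group, where `docs` is a list of `(title, snippet)` pairs. Keeping the request construction separate from the network call makes it possible to inspect the payload without a server.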
The source code is hosted on GitHub.
Carrot2 is licensed under the BSD license.