
About the developer

abadojack
510 Stars 51 Forks MIT License 34 Commits 11 Opened issues

Description

Natural language detection library for Go

Whatlanggo


Natural language detection for Go.

Features

  • Supports 84 languages
  • 100% written in Go
  • No external dependencies
  • Fast
  • Recognizes not only a language, but also a script (Latin, Cyrillic, etc)

Getting started

Installation:

    go get -u github.com/abadojack/whatlanggo

Simple usage example:

```go
package main

import (
	"fmt"

	"github.com/abadojack/whatlanggo"
)

func main() {
	info := whatlanggo.Detect("Foje funkcias kaj foje ne funkcias")
	fmt.Println("Language:", info.Lang.String(), " Script:", whatlanggo.Scripts[info.Script], " Confidence: ", info.Confidence)
}
```

Blacklisting and whitelisting

```go
package main

import (
	"fmt"

	"github.com/abadojack/whatlanggo"
)

func main() {
	// Blacklist: languages listed here are never returned.
	options := whatlanggo.Options{
		Blacklist: map[whatlanggo.Lang]bool{
			whatlanggo.Ydd: true,
		},
	}

	info := whatlanggo.DetectWithOptions("האקדמיה ללשון העברית", options)
	fmt.Println("Language:", info.Lang.String(), "Script:", whatlanggo.Scripts[info.Script])

	// Whitelist: only the languages listed here are considered.
	options1 := whatlanggo.Options{
		Whitelist: map[whatlanggo.Lang]bool{
			whatlanggo.Epo: true,
			whatlanggo.Ukr: true,
		},
	}

	info = whatlanggo.DetectWithOptions("Mi ne scias", options1)
	fmt.Println("Language:", info.Lang.String(), " Script:", whatlanggo.Scripts[info.Script])
}
```

For more details, please check the documentation.

Requirements

Go 1.8 or higher

How does it work?

How does the language recognition work?

The algorithm is based on trigram language models, a particular case of n-grams. To understand the idea, see the original paper by Cavnar and Trenkle ('94), "N-Gram-Based Text Categorization".
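To make the idea concrete, here is a minimal sketch of trigram-profile matching in the spirit of Cavnar and Trenkle. It is not whatlanggo's implementation: the function names (`trigramCounts`, `rankedProfile`, `outOfPlace`), the `maxPenalty` constant, and the two toy training sentences are all assumptions made for illustration. A real detector would use profiles built from large corpora.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"unicode"
)

// maxPenalty is the cost charged for a trigram that is absent from a
// language profile. The value is an arbitrary choice for this sketch.
const maxPenalty = 300

// trigramCounts lowercases the text, treats every non-letter as a word
// separator, pads the result with one space on each side, and counts
// every window of three consecutive runes.
func trigramCounts(text string) map[string]int {
	words := strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r)
	})
	runes := []rune(" " + strings.Join(words, " ") + " ")
	counts := make(map[string]int)
	for i := 0; i+3 <= len(runes); i++ {
		counts[string(runes[i:i+3])]++
	}
	return counts
}

// rankedProfile returns the trigrams ordered from most to least frequent.
// Ties are broken alphabetically so the order is deterministic.
func rankedProfile(counts map[string]int) []string {
	profile := make([]string, 0, len(counts))
	for t := range counts {
		profile = append(profile, t)
	}
	sort.Slice(profile, func(i, j int) bool {
		a, b := profile[i], profile[j]
		if counts[a] != counts[b] {
			return counts[a] > counts[b]
		}
		return a < b
	})
	return profile
}

// outOfPlace sums, for each trigram of the text profile, how far its rank
// is from the rank of the same trigram in the language profile.
// A lower total means a closer match.
func outOfPlace(text, lang []string) int {
	rank := make(map[string]int, len(lang))
	for i, t := range lang {
		rank[t] = i
	}
	total := 0
	for i, t := range text {
		j, ok := rank[t]
		switch {
		case !ok:
			total += maxPenalty
		case i > j:
			total += i - j
		default:
			total += j - i
		}
	}
	return total
}

func main() {
	// Toy training data, hypothetical and far too small for real use.
	samples := map[string]string{
		"english": "the quick brown fox jumps over the lazy dog and then the dog sleeps",
		"spanish": "el rapido zorro marron salta sobre el perro perezoso y luego el perro duerme",
	}
	input := rankedProfile(trigramCounts("the dog jumps over the fox"))
	for _, name := range []string{"english", "spanish"} {
		profile := rankedProfile(trigramCounts(samples[name]))
		fmt.Printf("%s: distance %d\n", name, outOfPlace(input, profile))
	}
}
```

The language whose profile yields the smallest distance is the best guess, and the gap to the runner-up gives a rough sense of how clear the decision was.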

How is IsReliable calculated?

It is based on the following factors:

  • How many unique trigrams are in the given text
  • How big is the difference between the first and the second (not returned) detected languages? This metric is called `rate` in the code base.

The result can therefore be pictured as a 2D space, with the unique trigram count on one axis and the rate on the other, that a threshold function splits into "Reliable" and "Not reliable" areas. This function is a hyperbola and looks like the following:

[Image: Language recognition whatlang rust]

For more details, see the blog article "Introduction to Rust Whatlang Library and Natural Language Identification Algorithms".

License

MIT

Derivation

whatlanggo is a derivative of Franc (JavaScript, MIT) by Titus Wormer.

Acknowledgements

Thanks to greyblake (Potapov Sergey) for creating whatlang-rs, from which I got the idea and algorithms.
