com.peoplepattern.text

LanguageIdentifier

Related Docs: trait LanguageIdentifier | package text

object LanguageIdentifier extends LanguageIdentifier

Linear Supertypes
LanguageIdentifier, AnyRef, Any

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def classify(text: String, threshold: Double, minTextSize: Int): Option[(String, Double)]

    Classify an individual text to identify its language.

    text

    the text to classify

    threshold

    the minimum score threshold to consider a valid prediction

    minTextSize

    the minimum length for the text to make a prediction

    returns

    a pair of language code (ISO 639-1) and prediction score if a prediction could be made, otherwise None

    Definition Classes
    LanguageIdentifier → LanguageIdentifier
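
    The roles of threshold and minTextSize described above can be pictured with a minimal sketch. It does not show the library's implementation: the Scorer type and classifyWith helper are hypothetical, and it assumes that minTextSize counts characters and that both comparisons are inclusive.

```scala
object ClassifyGateSketch {
  // Hypothetical stand-in for the underlying model: maps a text to a
  // (language code, score) guess. Not part of the documented API.
  type Scorer = String => (String, Double)

  // Wraps a scorer with the two documented gates. Assumptions: minTextSize
  // counts characters, and both comparisons are inclusive.
  def classifyWith(scorer: Scorer)(
      text: String,
      threshold: Double,
      minTextSize: Int): Option[(String, Double)] =
    if (text.length < minTextSize) None // too short to attempt a prediction
    else {
      val (code, score) = scorer(text)
      // Keep the guess only when its score reaches the threshold.
      if (score >= threshold) Some((code, score)) else None
    }
}
```

    With the library on the classpath, the equivalent call would be LanguageIdentifier.classify(text, threshold, minTextSize), with argument values chosen for the data at hand.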
  6. def classify(text: String): Option[(String, Double)]

    Classify an individual text to identify its language.

    This method should use sensible defaults for the threshold and minTextSize parameters.

    returns

    a pair of language code (ISO 639-1) and prediction score if a prediction could be made, otherwise None

    Definition Classes
    LanguageIdentifier
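
    Because the result is an Option, callers need to handle the case where no prediction is made. A minimal sketch of consuming the documented return type follows; the describe helper and its output format are hypothetical, not part of the library.

```scala
object ClassifyResultSketch {
  // Turns the documented Option[(String, Double)] result into a label.
  def describe(result: Option[(String, Double)]): String = result match {
    case Some((code, score)) => s"$code (score $score)"
    case None                => "undetermined" // no prediction could be made
  }
}
```

    With the library on the classpath, this could be used as ClassifyResultSketch.describe(LanguageIdentifier.classify(text)).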
  7. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. lazy val defaultFrequency: Double

  9. lazy val defaultMinTextSize: Int

  10. lazy val defaultThreshold: Double

  11. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  12. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  13. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  14. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  15. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  16. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  17. val model: Model

  18. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  19. final def notify(): Unit

    Definition Classes
    AnyRef
  20. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  21. def summarize(texts: TraversableOnce[String], threshold: Double, frequency: Double, minTextSize: Int): Vector[(String, Double, Double)]

    Given a collection of strings, returns an ordered list of the unique languages identified, as ISO 639-1 language codes.

    This is built to classify many text entries, on the assumption that only the most common languages in the texts matter.

    Adjust threshold and frequency to deal with outlier data. Increasing threshold increases the confidence of the identified languages, while increasing frequency reduces the impact of minor second-language usage.

    texts

    the texts to classify and summarize

    threshold

    the

    returns

    Vector of 3-tuples (lang-code, avg-lang-classification-score, frequency)

    Definition Classes
    LanguageIdentifier → LanguageIdentifier
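
    One way to picture how frequency shapes the result is the sketch below, which aggregates per-text predictions into the documented 3-tuples. It does not show the library's implementation. The helper is hypothetical, and it assumes that frequency is the share of successfully classified texts, that the cutoff is inclusive, and that results are ordered by descending frequency.

```scala
object SummarizeSketch {
  // Aggregates per-text predictions into (code, average score, frequency).
  // Assumptions: frequency is measured over classified texts only, the
  // cutoff is inclusive, and output is ordered by descending frequency.
  def summarize(
      results: Seq[Option[(String, Double)]],
      frequency: Double): Vector[(String, Double, Double)] = {
    val hits = results.flatten // drop texts with no prediction
    if (hits.isEmpty) Vector.empty
    else
      hits
        .groupBy(_._1) // group predictions by language code
        .toVector
        .map { case (code, scored) =>
          val avgScore = scored.map(_._2).sum / scored.size
          val share = scored.size.toDouble / hits.size
          (code, avgScore, share)
        }
        .filter(_._3 >= frequency) // drop rarely seen languages
        .sortBy(t => -t._3)
  }
}
```

    Under these assumptions, raising frequency removes languages that appear in only a small share of the texts, which matches the documented effect of reducing the impact of minor second-language usage.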
  22. def summarize(texts: TraversableOnce[String]): Vector[(String, Double, Double)]

    Given a collection of strings, returns an ordered list of the unique languages identified, as ISO 639-1 language codes.

    This is built to classify many text entries, on the assumption that only the most common languages in the texts matter.

    Adjust threshold and frequency to deal with outlier data. Increasing threshold increases the confidence of the identified languages, while increasing frequency reduces the impact of minor second-language usage.

    returns

    Vector of 3-tuples (lang-code, avg-lang-classification-score, frequency)

    Definition Classes
    LanguageIdentifier
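
    A minimal sketch of reading the returned 3-tuples follows. The dominant helper is hypothetical. It picks the entry with the highest frequency explicitly rather than relying on the result ordering, since the ordering criterion is not stated here.

```scala
object SummaryReadSketch {
  // Picks the language code with the highest frequency from a summary of
  // (code, average score, frequency) tuples, or None for an empty summary.
  def dominant(summary: Vector[(String, Double, Double)]): Option[String] =
    if (summary.isEmpty) None
    else Some(summary.maxBy(_._3)._1)
}
```

    With the library on the classpath, this could be used as SummaryReadSketch.dominant(LanguageIdentifier.summarize(texts)).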
  23. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  24. def toString(): String

    Definition Classes
    AnyRef → Any
  25. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  26. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  27. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
