GitHub - chakki-works/sumeval: Well tested & Multi-language evaluation framework for text summarization.

Well tested & Multi-language
evaluation framework for Text Summarization.

Well tested
- The ROUGE-X scores are tested against the original Perl script (ROUGE-1.5.5.pl).
- The BLEU score is calculated by SacréBLEU, which produces the same values as the official script (mteval-v13a.pl) used by WMT.
Multi-language
- In addition to English, Japanese and Chinese are supported. Other languages can be added easily.

Of course, the implementation is pure Python!

How to use

from sumeval.metrics.rouge import RougeCalculator


rouge = RougeCalculator(stopwords=True, lang="en")

rouge_1 = rouge.rouge_n(
            summary="I went to the Mars from my living town.",
            references="I went to Mars",
            n=1)

rouge_2 = rouge.rouge_n(
            summary="I went to the Mars from my living town.",
            references=["I went to Mars", "It's my living town"],
            n=2)

rouge_l = rouge.rouge_l(
            summary="I went to the Mars from my living town.",
            references=["I went to Mars", "It's my living town"])

# You need spaCy to calculate ROUGE-BE

rouge_be = rouge.rouge_be(
            summary="I went to the Mars from my living town.",
            references=["I went to Mars", "It's my living town"])

print("ROUGE-1: {}, ROUGE-2: {}, ROUGE-L: {}, ROUGE-BE: {}".format(
    rouge_1, rouge_2, rouge_l, rouge_be
).replace(", ", "\n"))
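For intuition about what a ROUGE-N score measures, here is a minimal sketch of the n-gram overlap F-measure for a single reference. It is not sumeval's implementation: the helper names (ngrams, simple_tokenize, rouge_n_sketch) are made up for illustration, the tokenizer is a plain regular expression, and stopword removal and stemming are left out, so the value will generally differ from what RougeCalculator returns with stopwords=True.

```python
import re
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_tokenize(text):
    # Assumption: lowercase and keep alphanumeric runs only.
    # sumeval's own tokenization, stopword removal, and stemming are not reproduced.
    return re.findall(r"[a-z0-9]+", text.lower())


def rouge_n_sketch(summary, reference, n=1, alpha=0.5):
    """Illustrative ROUGE-N F-measure against a single reference."""
    summary_ngrams = ngrams(simple_tokenize(summary), n)
    reference_ngrams = ngrams(simple_tokenize(reference), n)
    # Clipped overlap: an n-gram counts only as often as it occurs in both texts.
    overlap = sum((summary_ngrams & reference_ngrams).values())
    summary_total = sum(summary_ngrams.values())
    reference_total = sum(reference_ngrams.values())
    if overlap == 0 or summary_total == 0 or reference_total == 0:
        return 0.0
    precision = overlap / summary_total
    recall = overlap / reference_total
    # alpha weights precision against recall; alpha=0.5 gives the usual F1,
    # alpha=1 returns precision, alpha=0 returns recall.
    return (precision * recall) / ((1 - alpha) * precision + alpha * recall)


score = rouge_n_sketch("I went to the Mars from my living town.",
                       "I went to Mars", n=1)
print(round(score, 4))
```

In this example all four reference unigrams appear in the nine-token summary, so recall is 1 and precision is 4/9.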

from sumeval.metrics.bleu import BLEUCalculator


bleu = BLEUCalculator()
score = bleu.bleu("I am waiting on the beach",
                  "He is walking on the beach")

bleu_ja = BLEUCalculator(lang="ja")
score_ja = bleu_ja.bleu("私はビーチで待ってる", "彼がベンチで待ってる")

From the command line

sumeval r-nlb "I'm living New York its my home town so awesome" "My home town is awesome"

Output:

{
  "options": {
    "stopwords": true,
    "stemming": false,
    "word_limit": -1,
    "length_limit": -1,
    "alpha": 0.5,
    "input-summary": "I'm living New York its my home town so awesome",
    "input-references": [
      "My home town is awesome"
    ]
  },
  "averages": {
    "ROUGE-1": 0.7499999999999999,
    "ROUGE-2": 0.6666666666666666,
    "ROUGE-L": 0.7499999999999999,
    "ROUGE-BE": 0
  },
  "scores": [
    {
      "ROUGE-1": 0.7499999999999999,
      "ROUGE-2": 0.6666666666666666,
      "ROUGE-L": 0.7499999999999999,
      "ROUGE-BE": 0
    }
  ]
}
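Because the output is JSON, it can be post-processed with a short script. The sketch below assumes the output has already been captured as a string; here an abbreviated copy of the result above is pasted in literally rather than captured from the command.

```python
import json

# Abbreviated copy of the command output shown above.
raw_output = """
{
  "options": {"stopwords": true, "stemming": false, "alpha": 0.5},
  "averages": {
    "ROUGE-1": 0.7499999999999999,
    "ROUGE-2": 0.6666666666666666,
    "ROUGE-L": 0.7499999999999999,
    "ROUGE-BE": 0
  },
  "scores": [
    {
      "ROUGE-1": 0.7499999999999999,
      "ROUGE-2": 0.6666666666666666,
      "ROUGE-L": 0.7499999999999999,
      "ROUGE-BE": 0
    }
  ]
}
"""

result = json.loads(raw_output)

# "averages" holds one value per metric; "scores" holds one entry per summary.
for metric, value in result["averages"].items():
    print(f"{metric}: {value:.3f}")
```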

You can also use files as input. For details, run sumeval -h.

Install

pip install sumeval

Dependencies

BLEU depends on SacréBLEU.
To calculate ROUGE-BE, spaCy is required.
To use lang ja, janome or MeCab is required.
- To get the ROUGE-BE score, GiNZA is additionally required.
To use lang zh, jieba is required.
- To get the ROUGE-BE score, pyhanlp is additionally required.

Test

sumeval uses two packages to verify its scores.

pythonrouge
- It calls the original Perl script
- pip install git+https://github.com/tagucci/pythonrouge.git
rougescore
- A simple Python implementation of the ROUGE score
- pip install git+https://github.com/bdusell/rougescore.git

Contributions Welcome 🎉

Adding a supported language

The tokenization and dependency parsing logic for each language is located in sumeval/metrics/lang.

You can add a language class by inheriting from BaseLang.
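The shape of such a class might look like the sketch below. The interface of BaseLang is not described in this README, so the sketch defines its own stand-in base class, and every name in it (BaseLangStandIn, LangXX, tokenize, is_stop_word, the "xx" code) is hypothetical. Check the existing classes in sumeval/metrics/lang for the actual methods to override.

```python
class BaseLangStandIn:
    """Hypothetical stand-in for sumeval's BaseLang; not the real class."""

    def __init__(self, code):
        self.lang = code

    def tokenize(self, text):
        raise NotImplementedError("Subclasses provide tokenization.")


class LangXX(BaseLangStandIn):
    """Hypothetical language class for a whitespace-delimited language."""

    # Illustrative stopword list, not taken from sumeval.
    STOPWORDS = {"a", "the"}

    def __init__(self):
        super().__init__("xx")

    def tokenize(self, text):
        # Assumption: words are separated by whitespace.
        return text.split()

    def is_stop_word(self, word):
        return word.lower() in self.STOPWORDS


lang = LangXX()
tokens = [t for t in lang.tokenize("The quick fox") if not lang.is_stop_word(t)]
print(tokens)
```

A language without whitespace word boundaries would replace the body of tokenize with a call to a morphological analyzer, as the ja and zh dependencies listed above suggest.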
