tokestermw/spacy_grammar: LanguageTool-style grammar handling with spaCy 2.0

spacy_grammar: rule-based grammar detection for spaCy

This package uses the spaCy 2.0 alpha, which provides support for adding custom attributes to Doc, Span, and Token objects. It also leverages spaCy's Matcher API to match quickly on spaCy tokens, in a manner not unlike regular expressions. It reads a grammar.yml file to load custom patterns and exposes the results as custom attributes on Doc, Span, and Token.

It is extensible by adding rules to grammar.yml (though currently only simple string matching is implemented).

doc = nlp('I can haz cheeseburger.')
doc._.has_grammar_error  # True

Many thanks to spacymoji and LanguageTool for the inspiration.

Install

This package uses Python 3.6.

python3.6 -m venv .
source bin/activate
pip install -r requirements.txt

You will also need to install a spaCy model:

python -m spacy download en_core_web_sm

Usage

From the root directory, you can check to see if the package is working.

python -m spacy_grammar.grammar

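This code checks whether someone wrote "as follow" instead of "as follows".

import spacy
# Assumed import path, based on the spacy_grammar.grammar module run above.
from spacy_grammar.grammar import Grammar

nlp = spacy.load('en_core_web_sm')
# Build the grammar component and add it to the pipeline.
grammar = Grammar(nlp)
nlp.add_pipe(grammar)
doc = nlp('We can elaborate this distinction as follow.')
# Each token carries a flag for the rule it matched.
print([i._.g_as_follow_as_follows for i in doc])
# [False, False, False, False, False, True, True, False]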
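To make the per-token output above easier to read, here is a minimal pure-Python sketch of how one flag per token could be derived from a matched phrase. It does not use spaCy or this package's internals; `flag_tokens` is a hypothetical helper, and the token list is written by hand to mirror the eight tokens in the example sentence.

```python
from typing import List, Sequence


def flag_tokens(tokens: Sequence[str], phrase: Sequence[str]) -> List[bool]:
    """Return one flag per token: True where the token is part of a match."""
    lowered = [t.lower() for t in tokens]
    target = [p.lower() for p in phrase]
    flags = [False] * len(tokens)
    n = len(target)
    if n == 0:
        return flags
    # Slide a window over the tokens and mark every token inside a match.
    for start in range(len(lowered) - n + 1):
        if lowered[start:start + n] == target:
            for i in range(start, start + n):
                flags[i] = True
    return flags


# Hand-written tokens standing in for spaCy's tokenization (an assumption).
tokens = ["We", "can", "elaborate", "this", "distinction", "as", "follow", "."]
print(flag_tokens(tokens, ["as", "follow"]))
# [False, False, False, False, False, True, True, False]
```

The comparison is on whole tokens, so "follows" does not trigger the "as follow" rule.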

Adding rules

In grammar.yml, you can add rules following this template:

CATEGORY_NAME:
  RULE_NAME:
    description: 
    patterns: 
      - PATTERN_1
      - PATTERN_2
      ...
    corrections:
      - CORRECTION_1
      - CORRECTION_2
      ...
    examples: 
      - EXAMPLE_1
      - EXAMPLE_2
      ...
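As a rough illustration of the template, the sketch below checks a rule set that has already been parsed into nested dictionaries (category, then rule, then fields). Treating all four fields as required is an assumption, not something the package is known to enforce; `missing_fields`, the `GRAMMAR` category, and the `AS_FOLLOW_AS_FOLLOWS` rule name are hypothetical.

```python
from typing import Any, Dict, List

# Field names taken from the template; requiring all of them is an assumption.
REQUIRED_KEYS = ("description", "patterns", "corrections", "examples")


def missing_fields(rules: Dict[str, Dict[str, Dict[str, Any]]]) -> List[str]:
    """List every template field that is absent from a rule."""
    problems = []
    for category, category_rules in rules.items():
        for rule_name, rule in category_rules.items():
            for key in REQUIRED_KEYS:
                if key not in rule:
                    problems.append(f"{category}.{rule_name}: missing '{key}'")
    return problems


# Hypothetical rule set, shaped like a parsed grammar.yml.
rules = {
    "GRAMMAR": {
        "AS_FOLLOW_AS_FOLLOWS": {
            "description": "Use 'as follows' instead of 'as follow'.",
            "patterns": ["as follow"],
            "corrections": ["as follows"],
        }
    }
}
print(missing_fields(rules))
# ["GRAMMAR.AS_FOLLOW_AS_FOLLOWS: missing 'examples'"]
```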

A pattern can be either a plain string or a list of token attributes.

# match on string
- as follow
# match on list of token attributes
-
  - LOWER: as
  - LOWER: follow
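The two pattern forms can be related with a short sketch: a plain string is expanded into one `LOWER` entry per word, while a list of token attributes is passed through unchanged. This is only an illustration under stated assumptions. `to_token_pattern` is a hypothetical helper, and splitting on whitespace is a simplification of spaCy's tokenizer, so it should not be read as the package's actual conversion.

```python
from typing import Dict, List, Union

TokenPattern = List[Dict[str, str]]


def to_token_pattern(pattern: Union[str, TokenPattern]) -> TokenPattern:
    """Normalize either pattern form into a list of token attribute dicts."""
    if isinstance(pattern, str):
        # Whitespace splitting stands in for real tokenization (an assumption).
        return [{"LOWER": word.lower()} for word in pattern.split()]
    # Already a list of token attributes: copy it so the input is not mutated.
    return [dict(token) for token in pattern]


print(to_token_pattern("as follow"))
# [{'LOWER': 'as'}, {'LOWER': 'follow'}]
```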
