24 commits
d1b2e76
Added ngram and skipgram schemes
pegasus-lynx Jul 27, 2021
9836984
Fixed circular import
pegasus-lynx Jul 28, 2021
5af79ea
Refactored code
pegasus-lynx Dec 16, 2021
a85dad9
Merge branch 'master' into mwe_schemes
pegasus-lynx Dec 17, 2021
ceca7b0
Fixes for pip install -e . to work
pegasus-lynx Dec 17, 2021
88b261b
Added get_scheme function
pegasus-lynx Dec 23, 2021
8c1c77c
Fixed naive pmi func
pegasus-lynx Jan 6, 2022
54c25d1
Added try catch to get the error while decoding
pegasus-lynx Jan 6, 2022
14fc9fd
Disabled bpe learn kwargs to pass by
pegasus-lynx Jan 6, 2022
bd9989e
Added Ext MWE Scheme. Fixed decode in skip scheme
pegasus-lynx Jan 8, 2022
8693928
Bug Fix in Ext MWE Scheme
pegasus-lynx Jan 8, 2022
5db8a7d
Fixed encoding and decoding for Ext MWE
pegasus-lynx Jan 9, 2022
f8cffbc
Fixed skip scheme decoding and changes to ext mwe
pegasus-lynx Jan 13, 2022
1fa75a1
Changes to ExtMWE Scheme for variable lists
pegasus-lynx Jan 13, 2022
48644aa
Fix in merge_types_list
pegasus-lynx Jan 13, 2022
e18a6ab
Added kids to the types
pegasus-lynx Jan 14, 2022
886d6cb
Fixed stochastic split function
pegasus-lynx Jan 14, 2022
55bdef1
Fixed decoding scheme for skipgrams
pegasus-lynx Feb 27, 2022
760955f
Added debugging in decode str
pegasus-lynx Feb 28, 2022
9681a84
Fixed kids for bpe tokens
pegasus-lynx Mar 2, 2022
850ff1a
Fixed kids in extmwe-scheme
pegasus-lynx Mar 2, 2022
a2f715f
Fixing ExtMweScheme loading
pegasus-lynx Mar 3, 2022
a37b499
Allow max_mwes parameter
pegasus-lynx Mar 27, 2022
1bd4280
Fixed working for max_mwes in ExtMWEScheme
pegasus-lynx Mar 27, 2022
1 change: 1 addition & 0 deletions nlcodec/__init__.py
@@ -17,6 +17,7 @@
DEF_MIN_CO_EV = 95 # recommended by Gowda and May (2020)
DEF_WORD_MIN_FREQ = 1 # minimum times a word should exist to be used for word vocab
DEF_CHAR_MIN_FREQ = 20 # minimum times a char should be seen to be included in init vocab
DEF_MWE_MIN_FREQ = 100 # minimum times an mwe should appear in the corpus
DEF_CHAR_COVERAGE = 0.9995 # Credits to google/sentencepiece for this idea;

import logging
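
The new `DEF_MWE_MIN_FREQ` default reads as a frequency cutoff for multi-word expression (MWE) candidates. As a minimal sketch of how such a cutoff might be applied, the snippet below counts whitespace-token n-grams and keeps those seen at least `min_freq` times. The `frequent_ngrams` helper and the toy corpus are hypothetical and written only for illustration; they are not part of nlcodec, and the PR's actual ngram, skipgram, and ExtMWE schemes may select candidates differently.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

DEF_MWE_MIN_FREQ = 100  # default introduced by this diff


def frequent_ngrams(sentences: Iterable[str], n: int = 2,
                    min_freq: int = DEF_MWE_MIN_FREQ) -> Dict[Tuple[str, ...], int]:
    """Hypothetical helper: count n-grams and drop those below min_freq."""
    counts: Counter = Counter()
    for sent in sentences:
        toks = sent.split()
        # Slide a window of size n over the whitespace tokens.
        for i in range(len(toks) - n + 1):
            counts[tuple(toks[i:i + n])] += 1
    return {ngram: c for ngram, c in counts.items() if c >= min_freq}


# Toy corpus; a lower threshold is passed so that something survives.
corpus = ["new york city"] * 3 + ["new delhi"]
kept = frequent_ngrams(corpus, n=2, min_freq=3)
# kept holds ("new", "york") and ("york", "city"), each with count 3;
# ("new", "delhi") occurs once and is filtered out.
```

With the default of 100, the same toy corpus yields no candidates, which is the intended effect of a high threshold: only expressions that recur often in a large corpus are considered.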