Commit f5df3e9

feat: Migrate ML model to PyTorch
1 parent 744d5b2 commit f5df3e9

15 files changed: +8559 -3060 lines changed

.ci/benchmark.txt

Lines changed: 12 additions & 12 deletions
@@ -228,21 +228,21 @@ TOTAL: 10343 16352935 12112 46625 49
 credsweeper result_cnt : 11860, lost_cnt : 0, true_cnt : 11648, false_cnt : 212
 Rules Positives Negatives Templates Reported TP FP TN FN FPR FNR ACC PRC RCL F1
 ------------------------------ ----------- ----------- ----------- ---------- ----- ---- ----- ---- -------- -------- -------- -------- -------- --------
-API 125 3172 187 122 122 0 3359 3 0.000000 0.024000 0.999139 1.000000 0.976000 0.987854
+API 125 3172 187 119 118 1 3358 7 0.000298 0.056000 0.997704 0.991597 0.944000 0.967213
 AWS Client ID 170 19 0 162 162 0 19 8 0.000000 0.047059 0.957672 1.000000 0.952941 0.975904
 AWS Multi 82 10 0 84 82 1 9 0 0.100000 0.000000 0.989130 0.987952 1.000000 0.993939
 AWS S3 Bucket 67 23 0 92 67 23 0 0 1.000000 0.000000 0.744444 0.744444 1.000000 0.853503
 Atlassian Old PAT token 3 7 0 10 3 7 0 0 1.000000 0.000000 0.300000 0.300000 1.000000 0.461538
-Auth 417 2744 81 396 395 1 2824 22 0.000354 0.052758 0.992906 0.997475 0.947242 0.971710
+Auth 417 2744 81 398 393 5 2820 24 0.001770 0.057554 0.991055 0.987437 0.942446 0.964417
 Azure Access Token 19 0 0 12 12 0 0 7 0.368421 0.631579 1.000000 0.631579 0.774194
 BASE64 Private Key 12 4 0 12 12 0 4 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
 BASE64 encoded PEM Private Key 7 0 0 5 5 0 0 2 0.285714 0.714286 1.000000 0.714286 0.833333
 Bitbucket Client ID 19 52 0 72 17 52 0 2 1.000000 0.105263 0.239437 0.246377 0.894737 0.386364
 Bitbucket Client Secret 29 75 1 104 27 75 1 2 0.986842 0.068966 0.266667 0.264706 0.931034 0.412214
 CMD ConvertTo-SecureString 13 4 0 13 13 0 4 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
-CMD Password 27 128 0 25 25 0 128 2 0.000000 0.074074 0.987097 1.000000 0.925926 0.961538
+CMD Password 21 128 6 21 21 0 134 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
 CMD Secret 1 1 0 1 1 0 1 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
-CMD Token 6 0 0 6 6 0 0 0 0.000000 1.000000 1.000000 1.000000 1.000000
+CMD Token 6 0 0 5 5 0 0 1 0.166667 0.833333 1.000000 0.833333 0.909091
 Certificate 24 471 0 19 19 0 471 5 0.000000 0.208333 0.989899 1.000000 0.791667 0.883721
 Credential 91 422 76 90 90 0 498 1 0.000000 0.010989 0.998302 1.000000 0.989011 0.994475
 Docker Swarm Token 2 0 0 1 1 0 0 1 0.500000 0.500000 1.000000 0.500000 0.666667
@@ -259,21 +259,21 @@ Grafana Provisioned API Key 22 1 0
 JSON Web Token 170 61 0 131 131 0 61 39 0.000000 0.229412 0.831169 1.000000 0.770588 0.870432
 Jira / Confluence PAT token 0 4 0 0 0 4 0 0.000000 1.000000
 Jira 2FA 21 0 1 15 15 0 1 6 0.000000 0.285714 0.727273 1.000000 0.714286 0.833333
-Key 3916 15714 482 3921 3902 19 16177 14 0.001173 0.003575 0.998359 0.995154 0.996425 0.995789
-Nonce 93 49 0 93 92 1 48 1 0.020408 0.010753 0.985915 0.989247 0.989247 0.989247
+Key 3911 15717 483 3953 3894 59 16141 17 0.003642 0.004347 0.996221 0.985075 0.995653 0.990336
+Nonce 93 49 0 92 92 0 49 1 0.000000 0.010753 0.992958 1.000000 0.989247 0.994595
 Other 9 7450 5 0 0 7455 9 0.000000 1.000000 0.998794 0.000000
 PEM Private Key 1019 1483 0 1023 1019 4 1479 0 0.002697 0.000000 0.998401 0.996090 1.000000 0.998041
-Password 2032 7527 2539 1951 1946 5 10061 86 0.000497 0.042323 0.992478 0.997437 0.957677 0.977153
-SQL Password 44 13 0 41 41 0 13 3 0.000000 0.068182 0.947368 1.000000 0.931818 0.964706
+Password 1941 7534 2623 1883 1835 48 10109 106 0.004726 0.054611 0.987271 0.974509 0.945389 0.959728
+SQL Password 44 13 0 42 41 1 12 3 0.076923 0.068182 0.929825 0.976190 0.931818 0.953488
 Salesforce Credentials 2 0 0 2 2 0 0 0 0.000000 1.000000 1.000000 1.000000 1.000000
-Salt 49 74 1 46 46 0 75 3 0.000000 0.061224 0.975806 1.000000 0.938776 0.968421
-Secret 1310 1567 799 1303 1303 0 2366 7 0.000000 0.005344 0.998096 1.000000 0.994656 0.997321
+Salt 48 75 1 45 43 2 74 5 0.026316 0.104167 0.943548 0.955556 0.895833 0.924731
+Secret 1310 1567 799 1299 1297 2 2364 13 0.000845 0.009924 0.995919 0.998460 0.990076 0.994251
 Seed 1 6 0 0 0 6 1 0.000000 1.000000 0.857143 0.000000
 Slack Token 4 1 0 4 4 0 1 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
 Stripe Credentials 2 0 0 2 2 0 0 0 0.000000 1.000000 1.000000 1.000000 1.000000
 Tencent WeChat API App ID 6 0 0 6 6 0 0 0 0.000000 1.000000 1.000000 1.000000 1.000000
-Token 647 4169 453 626 626 0 4622 21 0.000000 0.032457 0.996014 1.000000 0.967543 0.983504
+Token 645 4170 453 619 615 4 4619 30 0.000865 0.046512 0.993546 0.993538 0.953488 0.973101
 Twilio Credentials 30 39 0 30 30 0 39 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
 URL Credentials 224 168 197 223 223 0 365 1 0.000000 0.004464 0.998302 1.000000 0.995536 0.997763
 UUID 1075 265 0 1074 1073 1 264 2 0.003774 0.001860 0.997761 0.999069 0.998140 0.998604
-12112 46625 4907 11872 11648 212 46413 464 0.004547 0.038309 0.988491 0.982125 0.961691 0.971800
+12000 46639 5003 11809 11486 311 46328 514 0.006668 0.042833 0.985931 0.973637 0.957167 0.965332
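
The rate columns in the benchmark table (FPR, FNR, ACC, PRC, RCL, F1) follow from the TP/FP/TN/FN counts in each row. The sketch below is not taken from the CredSweeper benchmark tooling; it is a minimal reconstruction using the standard confusion-matrix definitions. It assumes that a cell is left empty whenever its denominator is zero, which is how rows such as "Azure Access Token" appear to behave. The function name `classification_metrics` is hypothetical.

```python
from typing import Dict, Optional


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, Optional[float]]:
    """Derive the rate columns from raw confusion-matrix counts.

    Assumption: a metric is undefined (None) when its denominator is zero,
    mirroring the empty cells in the benchmark table.
    """

    def ratio(num: float, den: float) -> Optional[float]:
        return num / den if den else None

    prc = ratio(tp, tp + fp)  # precision
    rcl = ratio(tp, tp + fn)  # recall
    # F1 is the harmonic mean of precision and recall, defined only if both are.
    if prc is None or rcl is None or prc + rcl == 0:
        f1 = None
    else:
        f1 = 2 * prc * rcl / (prc + rcl)
    return {
        "FPR": ratio(fp, fp + tn),                 # false positive rate
        "FNR": ratio(fn, fn + tp),                 # false negative rate
        "ACC": ratio(tp + tn, tp + fp + tn + fn),  # accuracy
        "PRC": prc,
        "RCL": rcl,
        "F1": f1,
    }


if __name__ == "__main__":
    # Counts from the updated "API" row: TP=118, FP=1, TN=3358, FN=7.
    for name, value in classification_metrics(118, 1, 3358, 7).items():
        print(name, "" if value is None else f"{value:.6f}")
```

With the counts from the updated "API" row, these definitions reproduce the listed values to six decimals (for example FPR 0.000298 and F1 0.967213), which suggests the table uses the same formulas.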

.github/workflows/check.yml

Lines changed: 2 additions & 2 deletions
@@ -38,8 +38,8 @@ jobs:
     - name: Check ml_config.json and ml_model.onnx integrity
       if: ${{ always() && steps.code_checkout.conclusion == 'success' }}
       run: |
-        md5sum --binary credsweeper/ml_model/ml_config.json | grep 3a4bfcd6f3ea74461b158d4ec073cc06
-        md5sum --binary credsweeper/ml_model/ml_model.onnx | grep 9725b166e07e60f94929fea986f84ae2
+        md5sum --binary credsweeper/ml_model/ml_config.json | grep dae9d27fe1b6de565c6c4c994ee7ba36
+        md5sum --binary credsweeper/ml_model/ml_model.onnx | grep c4c2a33a675e38dcc7024c176d9317ee
 
     # # # line ending
 

credsweeper/ml_model/ml_model.onnx

-6.54 KB
Binary file not shown.
