added Double Counting algorithm for review #1098
|
Thanks for your thoughts here. Your example is somewhat difficult to follow because there are a lot of boxes. Can you use the DeepForest bird model and the data at https://github.com/weecology/DoubleCounting/tree/main/tests/data/birds instead? I think that if we merge this we would need a separate pip install, since the dependencies are heavy compared to the rest of the repo. So I imagine something like pip install deepforest[double_counting]; I think this would be an extra in the .toml. Then we would need to collect several other datasets to get a handle on how well it generalizes and which parameters are sensitive. These parameters would need to go in the hydra config. The general workflow would be something like
All of this is in the module you attached, but it would need an integration into deepforest.main() plus a documentation page with examples. Roadmap
|
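The optional-install idea in the comment above (pip install deepforest[double_counting]) is usually paired with a guard that fails with a helpful message when the extra is not installed. Below is a minimal sketch of such a guard; the helper names `missing_modules` and `require_extra` are hypothetical and are not part of DeepForest, and the list of modules the extra would actually pull in is an assumption left to the caller.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


def require_extra(extra, module_names):
    """Raise a helpful ImportError if any module needed by an optional extra is absent.

    `extra` is the name used in the install command, e.g. "double_counting".
    `module_names` are top-level import names (hypothetical here; the real
    dependency list would come from the project's .toml).
    """
    missing = missing_modules(module_names)
    if missing:
        raise ImportError(
            f"Missing optional dependencies {missing}. "
            f"Install them with: pip install deepforest[{extra}]"
        )


# "json" ships with Python, so this check passes silently.
require_extra("double_counting", ["json"])
```

A feature such as predict_unique could call a guard like this at the top of the function, so users without the heavy dependencies only see the error when they actually use the feature.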
|
@bw4sz Thank you again for the detailed feedback and the clear roadmap. The plan for optional dependencies and integrating the feature as predict_unique makes perfect sense. As you suggested, I've rerun the workflow on the bird dataset to provide a clearer example of the result. I'll start testing it with different data and let you know the sensitive parameters. |
|
great, let me know if you need help. |
|
@bw4sz I could only find a few overlapping datasets on Kaggle and GitHub. The rest of the data I found was either not in order or consisted of aerial images of different places without overlap. Where can I find more datasets to test this? Are there any keywords that would help, or any place I could search for this type of dataset? |
|
We have a number of datasets, let me look into this today. Can you look at the roadmap above and summarize which pieces are completed, which you plan to do, and which I can help you with? This is great stuff and I have some time this week to assist in review and get it over the finish line. Thanks for the contribution! |
|
Sorry, for the past two months I haven't been able to focus on this as I was caught up with lots of things. Now I have much more time, so I will start with the documentation and creating the extra package install today, and once I have the data I will test and observe the sensitive parameters. I'll give regular updates on where I am stuck and what's completed. |
|
Great. I've collected data for a couple tests. Thanks for your contribution.
--
Ben Weinstein, Ph.D.
Research Scientist
University of Florida
https://bw4sz.github.io/
|
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1098      +/-   ##
==========================================
+ Coverage   87.43%   87.61%   +0.18%
==========================================
  Files          20       20
  Lines        2538     2544       +6
==========================================
+ Hits         2219     2229      +10
+ Misses        319      315       -4
|
…tions doc, removed DoubleCounting.py
|
@bw4sz I’ve moved the code to evaluate.py and main.py, and added documentation along with separate dependencies. |
|
@bw4sz How will you share the testing data with me? |
|
@bw4sz I hope you are doing well. |
Hi @henrykironde, @bw4sz and @jveitchmichaelis,
This Draft PR contains a working prototype script that implements the "predict and delete" strategy to handle the double-counting of objects in overlapping images. I've adapted the core logic from the DoubleCounting repo and integrated it
I tested the workflow on a dataset with 70-80% overlap using the "left-hand" strategy for clear visualization. The blue boxes are all initial predictions, while the pink boxes are the final, unique predictions for that image.
Output:

We can observe that the top predictions are in pink, which marks them as unique (new for that image). This indicates that the code identifies the overlap and works as intended. There were a total of 401 predictions, of which 194 were detected to be unique.
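To make the "predict and delete" idea concrete, here is a deliberately simplified sketch. It assumes the region of the current image already covered by an earlier image is known as an axis-aligned rectangle in the current image's pixel coordinates; how that region is obtained (image alignment) is not shown and is not taken from the PR. The names `box_center` and `keep_unique`, and the example boxes, are hypothetical.

```python
def box_center(box):
    """Return the (x, y) center of a box given as (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    return ((xmin + xmax) / 2.0, (ymin + ymax) / 2.0)


def keep_unique(boxes, overlap_region):
    """Keep only boxes whose center lies outside the already-covered region.

    Boxes centered inside `overlap_region` are assumed to have been counted
    in the earlier image, so they are deleted from this image's predictions.
    """
    oxmin, oymin, oxmax, oymax = overlap_region
    unique = []
    for box in boxes:
        cx, cy = box_center(box)
        inside = oxmin <= cx <= oxmax and oymin <= cy <= oymax
        if not inside:
            unique.append(box)
    return unique


# Hypothetical 100x100 image whose left 70% was covered by the previous image.
predictions = [(5, 5, 15, 15), (60, 40, 68, 50), (80, 10, 90, 20), (72, 60, 95, 80)]
overlap = (0, 0, 70, 100)
unique = keep_unique(predictions, overlap)
print(f"{len(unique)} of {len(predictions)} predictions are unique")
```

Using the box center rather than any-pixel overlap is one possible design choice: a box straddling the overlap boundary is then assigned to exactly one image instead of being dropped from, or counted in, both.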
This PR contains the standalone DoubleCounting.py script for review. Before I start integrating this into the main DeepForest library, I would greatly appreciate your feedback.