feat(stats): allow users to rank all opinions by representativeness for each group #105

takeruhukushima · 2025-09-06T05:01:39Z

- Changed select_consensus_statements to return up to 100 results by default
- Added comprehensive test coverage for multi-group consensus
- Implemented proper sorting by z-score (p-test) for representativeness
- Enhanced test output to show detailed metrics for all results

nicobao · 2025-09-08T12:08:09Z

Thank you @takeruhukushima for your PR :)

It's on the right track!

However, I think we shouldn't modify the existing API, as it is designed to be 1-1 identical with the pol.is API.

Instead, we should probably either create a new function for it, or add a configuration variable that controls what the returned list looks like (probably better).

Finally, our need is not only to provide the top 100, but to provide the list of ALL statements ranked by representativeness. I think it's probably not necessary to add another API. We can just add the relevant info in statements_df so that the library consumer can re-order the list of statements based on the representativeness for each group.

What we could do is:

- roll back the hard change that gives 100 values by default
- improve the existing API by implementing the following TODO comment https://github.com/polis-community/red-dwarf/pull/105/files#diff-643a4bd33f0031a0a8536ecb310b979a0117c11a4871bb6052b8b315f44ea234R42-R43
- make sure the representativeness (rat, pat, rdt, pdt) per group is in statements_df so end-users can sort by it
- expose reddwarf/utils/consensus.py in the library API consumed by end-users, so they can recalculate representativeness and do whatever they please with (rat, pat, rdt, pdt)
- ideally, merge "fix: representative opinions selection and comment stats formatting" #99 first, since it introduces changes to these files
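To make the third point concrete, here is a minimal sketch of how a library consumer could re-order statements once the per-group values are available. It uses plain dictionaries as a stand-in for statements_df; the column names (`statement_id`, `group_id`, `rat`, `rdt`) and the helper `rank_for_group` are hypothetical and only illustrate the idea, they are not the library's actual schema or API.

```python
from typing import Optional

# Hypothetical rows standing in for statements_df, one row per
# (statement, group) pair. Column names are assumptions for this sketch.
rows = [
    {"statement_id": 1, "group_id": 0, "rat": 2.1, "rdt": -1.0},
    {"statement_id": 2, "group_id": 0, "rat": 3.4, "rdt": -2.9},
    {"statement_id": 3, "group_id": 1, "rat": 0.7, "rdt": 0.2},
    {"statement_id": 4, "group_id": 0, "rat": -0.5, "rdt": 1.9},
]


def rank_for_group(rows: list, group_id: int, key: str = "rat",
                   limit: Optional[int] = None) -> list:
    """Return one group's rows sorted by `key`, highest value first.

    With limit=None every statement is returned, so nothing is cut off
    at an arbitrary top-N.
    """
    group_rows = [r for r in rows if r["group_id"] == group_id]
    ranked = sorted(group_rows, key=lambda r: r[key], reverse=True)
    return ranked if limit is None else ranked[:limit]


# All statements of group 0, most representative for "agree" first.
agree_ranking = [r["statement_id"] for r in rank_for_group(rows, 0)]
# Same group, ranked by the "disagree" z-score instead.
disagree_ranking = [r["statement_id"] for r in rank_for_group(rows, 0, key="rdt")]
```

If statements_df is a pandas DataFrame with such columns, the same ordering can be obtained by filtering on the group and calling `sort_values(key, ascending=False)`, which is why no new API would be needed for this.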

Takeru, @patcon is the main maintainer of the library so I'll address him to see what he thinks :)

Hey @patcon, just a heads up: Takeru did the Japanese translation in Agora, and he's a young and motivated student very eager to learn and contribute to civic tech tools :)

I told him we're interested at Agora in being able to retrieve more than just 5 representative opinions, but I didn't detail much.

Let me know what you think of the requirements, which I think we've already briefly discussed last time we spoke!

nicobao · 2025-09-08T12:10:17Z

Also @takeruhukushima, could you join the Polis User Discord group that @patcon manages?
https://discord.com/invite/wFWB8kzQpP
We can discuss things in the red-dwarf-polis-library channel as well

takeruhukushima · 2025-09-08T13:12:53Z

I'm sorry I messed with the existing functions.

I picked the top 100 entries because I thought there might be a theoretical risk of crashes or similar issues. That was my own arbitrary judgment, and I should have asked first.

nicobao · 2025-09-08T13:27:01Z

> I'm sorry I messed with the existing functions.
>
> I picked the top 100 entries because I thought there might be a theoretical risk of crashes or similar issues. That was my own arbitrary judgment, and I should have asked first.

It's chill. Thanks for your efforts, Takeru, it's on the right track!

I'll wait for @patcon's feedback first, and I'll give you more detailed feedback on your code and on the requirements in a few days, if you don't mind!

nicobao · 2025-09-19T12:40:10Z

Hi @takeruhukushima
I am reviewing this today.

nicobao · 2025-09-23T19:41:42Z

Hi @takeruhukushima
Sorry for the long delay. I'm taking over the role of maintainer for the time being.
This PR is a bit too early, as we haven't yet defined the scope of what should be done. We discussed it on Monday during the 1st RedDwarf Open Call, but we're still not sure how to proceed.

In general, @patcon, what I'd like is to have all these values available: pa, pat, ra, rat, pd, pdt, rd and rdt, so I can rank all the statements.
We'd modify statements_df to have all these values for each cluster.
We could then add a score that end-users (library consumers) can use to rank the statements accordingly. The score can be calculated via the following function:
https://github.com/nicobao/red-dwarf/blob/fix-repful-for/reddwarf/utils/stats.py#L507-L509
We'd also need to expose a way to calculate whether a statement is repful_for "agree", "disagree", or nothing at all. Better still, we'd want to create granular functions that can return "agree", "strong_agree", "disagree", "strong_disagree", "divisive" and "strong_divisive". Logically, the score calculated above would rank the "strong" repful_for statements first. And of course sometimes repful_for would be "not_representative" or just undefined.
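A rough sketch of what such a score and a granular repful_for label could look like is below. Everything in it is an assumption for discussion: the score is taken to be the product of the probability, the ratio, and their two z-scores (the linked function may compute something different), the cutoffs `Z_SIG` and `Z_STRONG` are illustrative one-tailed critical values, and the function names are hypothetical. The "divisive" labels are left out because how to identify them is still an open question in this thread.

```python
from typing import Mapping

Z_SIG = 1.2816     # one-tailed 90% critical value; cutoff assumed for this sketch
Z_STRONG = 1.6449  # one-tailed 95% critical value; hypothetical "strong" cutoff


def repness_score(p: float, p_test: float, r: float, r_test: float) -> float:
    """Assumed score: product of probability, ratio, and their z-scores."""
    return p * p_test * r * r_test


def repful_for(stats: Mapping[str, float]) -> str:
    """Label one statement for one cluster from its pat, rat, pdt, rdt values."""
    # Pick the side whose representativeness z-score is larger.
    side = "agree" if stats["rat"] > stats["rdt"] else "disagree"
    if side == "agree":
        p_test, r_test = stats["pat"], stats["rat"]
    else:
        p_test, r_test = stats["pdt"], stats["rdt"]
    # Both tests must clear a cutoff, so the weaker one decides the label.
    weakest = min(p_test, r_test)
    if weakest <= Z_SIG:
        return "not_representative"
    return f"strong_{side}" if weakest > Z_STRONG else side


# Made-up values for a single statement within a single cluster.
example = {"pa": 0.9, "pat": 3.0, "ra": 2.0, "rat": 2.5,
           "pd": 0.05, "pdt": -3.0, "rd": 0.3, "rdt": -2.5}
label = repful_for(example)
score = repness_score(example["pa"], example["pat"],
                      example["ra"], example["rat"])
```

With a label and a score per statement and cluster, a consumer could sort by score in descending order within each label, which tends to put the "strong" statements first because their z-scores are larger.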

@patcon could you provide feedback on this?

I'd also like more clarity on what "confidence" means in the source code. And what exactly do pat and rat represent, as opposed to pa and ra? (Same for disagree.)
Finally, how do we identify a "divisive statement for a specific cluster"?

@takeruhukushima I am sorry, we need a little more time to figure out the details; this is a complex feature. I'll come back to you about your "edit" PR in Agora, which should be easier to go through. Thank you again!

nicobao changed the title ~~feat(consensus): Return up to 100 opinions ranked by representativeness~~ feat(stats): allow users to rank all opinions by representativeness Sep 8, 2025

nicobao changed the title ~~feat(stats): allow users to rank all opinions by representativeness~~ feat(stats): allow users to rank all opinions by representativeness for each group Sep 8, 2025

takeruhukushima added 2 commits September 8, 2025 23:52

3a11d02 Revert "feat(consensus): Return up to 100 opinions ranked by representativeness"

This reverts commit 77799e4.

580c3e2 feat: make consensus and repness parameters optional

- Changed pick_max, confidence and prob_threshold parameters to be Optional
- Updated functions in consensus.py, stats.py, and base.py

Signed-off-by: takeru.fukushima <[email protected]>
