Contributor

@yuce yuce commented Sep 19, 2025

This is the initial asyncio support.

  • Added the asyncio module, which contains the public asyncio API.
  • Added the internal module and the internal/asyncio_ modules, which contain the private asyncio API/implementation.
  • Added asyncio support for the Map proxy. Currently, near cache, transactions, and lock-related methods are not supported. Near cache support will come in another PR. Transactions will likely not be supported. Locks may need a different design.
  • I didn't include the API docs, in order to make the PR smaller. I'll add them in another PR.
  • The following tests are ported from the old API, and they work without changes (besides making them compatible with async/await syntax):
    • tests/integration/asyncio/authentication_tests
    • tests/integration/asyncio/backup_acks_tests
    • tests/integration/asyncio/client_test (one test is not ported, due to its Topic DDS dependency)
    • tests/integration/asyncio/proxy/map_test

Most of the code in this PR was duplicated into the internal module, with the module names prefixed with asyncio_. For example, ROOT/cluster.py was duplicated and modified as internal/asyncio_cluster.py.

Here is the diff between modules in this PR vs their counterparts in the old API:
https://gist.github.com/yuce/56e79a29a1d4d1d996788381d489c0a4


codecov-commenter commented Sep 19, 2025

Codecov Report

❌ Patch coverage is 76.65837% with 563 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.48%. Comparing base (5c7a4d5) to head (a87a5c6).

Files with missing lines                       Patch %   Lines
hazelcast/internal/asyncio_connection.py       76.16%    149 Missing ⚠️
hazelcast/internal/asyncio_proxy/map.py        78.26%    125 Missing ⚠️
hazelcast/internal/asyncio_invocation.py       72.41%    80 Missing ⚠️
hazelcast/internal/asyncio_cluster.py          63.88%    65 Missing ⚠️
hazelcast/internal/asyncio_compact.py          32.89%    51 Missing ⚠️
hazelcast/internal/asyncio_listener.py         79.23%    38 Missing ⚠️
hazelcast/asyncio/client.py                    84.36%    33 Missing ⚠️
hazelcast/internal/asyncio_proxy/base.py       87.50%    14 Missing ⚠️
hazelcast/internal/asyncio_proxy/manager.py    90.00%    4 Missing ⚠️
hazelcast/internal/asyncio_reactor.py          96.49%    4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #741      +/-   ##
==========================================
- Coverage   95.33%   93.48%   -1.86%     
==========================================
  Files         378      389      +11     
  Lines       21992    24404    +2412     
==========================================
+ Hits        20967    22815    +1848     
- Misses       1025     1589     +564     

@yuce yuce added this to the 5.6.0 milestone Sep 25, 2025
Contributor Author

yuce commented Sep 26, 2025

I've added the diff between modules in this PR vs their counterparts in the old API to the PR description.

@gbarnett-hz gbarnett-hz left a comment

Had a quick look.

For others: probably easier to use https://difftastic.wilfred.me.uk/introduction.html to make life easier.

Haven't studied the reactor and its associated logic in detail. From the diff, the changes looked relatively minor, albeit numerous.

I can review those in-depth if needed (or you cannot find people).

_CLIENT_ID = AtomicInteger()

@classmethod
async def create_and_start(cls, config: Config = None, **kwargs) -> "HazelcastClient":

General note -- I'm sure there are others.

If we're trying to get this API nice w.r.t. typing as well then this will probably show some error by default as None is not applicable for Config. Might be more prominent here as I think this is entry point into client creation.

Same for __init__ -- config: Config | None.
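
To illustrate the typing point, here is a minimal sketch with hypothetical stand-in classes (Config and Client below are not the real Hazelcast classes). A parameter annotated as plain Config with a None default relies on implicit Optional, which strict type checkers typically reject; spelling out Config | None makes the default legal.

```python
from __future__ import annotations  # lets "Config | None" parse on older Pythons

import asyncio


class Config:
    """Hypothetical stand-in for the client configuration class."""

    def __init__(self, cluster_name: str = "dev") -> None:
        self.cluster_name = cluster_name


class Client:
    """Hypothetical stand-in, not the real HazelcastClient."""

    def __init__(self, config: Config | None = None) -> None:
        # The explicit "| None" tells type checkers that omitting the config is allowed.
        self.config = config if config is not None else Config()

    @classmethod
    async def create_and_start(cls, config: Config | None = None) -> Client:
        return cls(config)


client = asyncio.run(Client.create_and_start())
```

With the annotation written as config: Config = None instead, a checker run in strict mode would be expected to flag the None default.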

Might be missing something in the diff: in client.py there's a load of documentation -- is it elsewhere, or why omit it for this one? (most likely doc applicable from __init__)

Contributor Author

re: Docs, I mentioned that in the PR description:

I didn't include the API docs, in order to make the PR smaller. I'll add them in another PR.

If you use a better diff tool, it doesn't make a difference. Give the one I mentioned a go.

Contributor Author

I can review those in-depth if needed (or you cannot find people).

Yes, please.

Contributor Author

At least with PyCharm, the user gets the correct type annotation.
There is this project in case you'd like to try the asyncio module:
https://github.com/yuce/hazelcast-asyncio-sample

In any case, I pushed a PR that adds explicit |None:
baa3bc1

PyCharm may be more forgiving -- I've never used it. pyright or pyrefly are good tools to use to determine compliance to the type invariants specified using type annotations.

Contributor Author

We check the typings with mypy.

raise
_logger.info("Client started")

async def get_map(self, name: str) -> Map[KeyType, ValueType]:

Is the intention to also support only VC in the near (immediate) term?

Contributor Author

Map is the only proxy included, in order to make the PR small. Other proxies, except VC, may be excluded from the beta release.

}


class ProxyManager:

Note to others: is in proxy/__init__.py

Contributor Author

In the diff that I provided in the description I've shown that, but maybe it wasn't easy to notice.

The diff was not good.

from hazelcast.internal.asyncio_connection import Connection
from hazelcast.core import Address

_BUFFER_SIZE = 128000

Best to define centrally if possible given it's used across reactors.

Contributor Author

I think they should be independent of each other.

The other constant hasn't been changed for ~4 years, same value.

Contributor Author

It's unlikely to change.
But I would either have to import it from hazelcast.reactor, or refactor the code so that both the asyncore and the asyncio reactors import it from a common module.
The first option introduces a dependency between the two reactor modules, and the second requires changes in the "old" Python code, which I tried to avoid.

def shutdown(self):
    if not self._is_live:
        return
    # TODO: cancel tasks

compared to reactor.py is this correct?

Contributor Author

start, shutdown are not necessary for the AsyncioReactor.
Removed them in the 3rd PR with this commit:
58783dc

@yuce yuce changed the title Initial Asyncio Module PR Initial Asyncio Module PR [1/3] Oct 1, 2025
_CLIENT_ID = AtomicInteger()

@classmethod
async def create_and_start(cls, config: Config | None = None, **kwargs) -> "HazelcastClient":
Contributor

In Java, this method is named:

HazelcastClient.newHazelcastClient()

In C++, I used:

hazelcast::new_client()

Now for Python, it will be yet another name:

HazelcastClient.create_and_start()

Is having three different names too confusing?

Member

@emreyigit emreyigit Nov 20, 2025

In .NET it's StartNewClientAsync, and the client is created through a factory class.

Contributor

I would name it new_hazelcast_client, to be more consistent and easily understood by existing Hazelcast users.

@yuce yuce changed the title Initial Asyncio Module PR [1/3] Initial Asyncio Module PR [1/4] Nov 17, 2025
@yuce yuce changed the title Initial Asyncio Module PR [1/4] Initial Asyncio Module PR [1/5] Nov 19, 2025
@yuce yuce changed the title Initial Asyncio Module PR [1/5] Initial Asyncio Module PR [1/6] Nov 20, 2025
self._cluster_connect_timeout_text,
self._max_backoff,
)
time.sleep(sleep_time)

Should this be asyncio.sleep? It's called on line 550.

Contributor Author

I've fixed the WaitStrategy in 91bf1d1
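
For context on why the blocking call matters, here is a small self-contained sketch (not the client's WaitStrategy; ticker and count_ticks are hypothetical names). It counts how often a background task gets to run while another coroutine waits 50 ms, first with time.sleep and then with asyncio.sleep.

```python
import asyncio
import time


async def ticker(ticks: list) -> None:
    # Records a timestamp every 10 ms, as long as the event loop is free to run it.
    for _ in range(5):
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def count_ticks(blocking: bool) -> int:
    ticks: list = []
    task = asyncio.create_task(ticker(ticks))
    await asyncio.sleep(0)  # give the ticker its first turn
    if blocking:
        time.sleep(0.05)  # blocks the whole event loop; the ticker cannot run
    else:
        await asyncio.sleep(0.05)  # yields, so the ticker keeps running
    count = len(ticks)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return count


blocking_ticks = asyncio.run(count_ticks(blocking=True))
yielding_ticks = asyncio.run(count_ticks(blocking=False))
```

In the blocking case the ticker only gets its initial turn, while in the yielding case it keeps ticking during the wait. The same applies to any other task on the loop, such as connection reads.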


def callback(future):
    try:
        schema = future.result()

@gbarnett-hz gbarnett-hz Nov 20, 2025

Does .result() block if it's not ready? Shouldn't it be something like await future followed by future.result(), or schema = await future?

Contributor Author

No, it doesn't block.
Trying to get the result when !future.done() raises an exception.
The fetch_schema_future.add_done_callback(callback) call a few lines below ensures that callback receives a completed future.
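
The behaviour described here can be checked with a plain asyncio.Future. This is a minimal sketch assuming the standard library future semantics; the future in the PR may be a different type, and the "schema" value is a placeholder.

```python
import asyncio


async def main() -> list:
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    events = []

    # Calling result() on a pending asyncio future raises instead of blocking.
    try:
        future.result()
    except asyncio.InvalidStateError:
        events.append("not ready")

    def callback(fut: asyncio.Future) -> None:
        # A done callback is only invoked once the future is done,
        # so result() is safe to call here.
        events.append(fut.result())

    future.add_done_callback(callback)
    future.set_result("schema")
    await asyncio.sleep(0)  # done callbacks are scheduled on the loop, so yield once
    return events


events = asyncio.run(main())
```

Note that result() inside the callback still re-raises if the future completed with an exception, which is why the snippet under review wraps it in try.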

Contributor

@ihsandemir ihsandemir left a comment

will continue to review.

Contributor Author

yuce commented Nov 21, 2025

@ihsandemir #741 (comment) Let's discuss that in the TDD.


@property
def name(self) -> str:
    return self._name
Member

If I remember correctly, the missing docs will be part of the next PRs, to reduce the number of changed lines here.

)
return await self._invoke(request, handler)

async def flush(self) -> None:
Member

Why are force_unlock and lock removed?

Contributor

@ihsandemir ihsandemir left a comment

Looking at code coverage report #741 (comment), new code is around 76% covered. Will you complete it to 100%?

Contributor Author

yuce commented Nov 24, 2025

@ihsandemir

Looking at code coverage report #741 (comment), new code is around 76% covered. Will you complete it to 100%?

There are 5 more PRs that port more tests.

Member

@emreyigit emreyigit left a comment

I don't have anything left for the first PR, thanks.

@yuce yuce changed the title Initial Asyncio Module PR [1/6] [API-2326] Initial Asyncio Module PR [1/6] Nov 26, 2025

@gbarnett-hz gbarnett-hz left a comment

Had a few scans -- PR is too large to be that confident. I had a few notes though that I forgot from last time.

except KeyError:
partition_map[partition_id] = [entry]

async with asyncio.TaskGroup() as tg: # type: ignore[attr-defined]

How are we handling < 3.11? https://github.com/hazelcast/hazelcast-python-client/blob/master/setup.py

TaskGroup added in 3.11 https://docs.python.org/3.11/library/asyncio-task.html#asyncio.TaskGroup

Is it possible to use gather? Add the created tasks to a list, gather the list, and handle cancellation and exceptions ourselves? If we catch an exception, manually cancel the tasks, then gather again with return_exceptions=True?

Or, use TaskGroup for when >= 3.11. Otherwise, we need to assess impact of removing 3.7 support.
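
The gather-based alternative suggested above could look roughly like the following. This is a sketch only: run_all, ok, and fail are hypothetical names, and it approximates TaskGroup's cancel-siblings-on-failure behaviour without reproducing all of its semantics (for example, it re-raises only the first error rather than an exception group).

```python
import asyncio


async def run_all(awaitables):
    """On the first failure, cancel the remaining tasks and re-raise."""
    tasks = [asyncio.ensure_future(a) for a in awaitables]
    try:
        return await asyncio.gather(*tasks)
    except BaseException:
        for task in tasks:
            task.cancel()  # no-op for tasks that have already finished
        # Wait for the cancellations to complete; collect errors instead of raising.
        await asyncio.gather(*tasks, return_exceptions=True)
        raise


async def ok(delay: float, value: str) -> str:
    await asyncio.sleep(delay)
    return value


async def fail() -> None:
    raise ValueError("boom")


async def demo():
    results = await run_all([ok(0.01, "a"), ok(0.0, "b")])
    slow = asyncio.ensure_future(ok(10, "never"))
    error = None
    try:
        await run_all([slow, fail()])
    except ValueError as exc:
        error = str(exc)
    return results, error, slow.cancelled()


results, error, slow_cancelled = asyncio.run(demo())
```

The extra bookkeeping here is what TaskGroup provides out of the box on 3.11 and later.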

Contributor Author

3.7 was left unintentionally.
3.11 is the minimum we'll support.

It's possible to approximate TaskGroup with gather, but I think it's best to use TaskGroup directly to make our code a bit more robust.

What's more difficult -- the analysis to determine EOL'ing 3.7 or the technical compromise?

def close(self):
    self._transport.close()

def write(self, buf):

can we not just _transport.write(buf)? I think this already does what you do in _write_loop without polling.

Contributor Author

No, that causes buf to be written immediately, which is not ideal.
_write_loop tries to collect more data before flushing it.

@gbarnett-hz gbarnett-hz Nov 27, 2025

That looks like it is doing what we want for a non-blocking socket unless I misread something: the send will complete later if the call send on the socket would block.

Contributor Author

the send will complete later if the call send on the socket would block.

The problem is, it tries to write the data it gets immediately, no matter how small the data is.
Waiting for a bit for more data to arrive is more efficient.

In my first attempt, I used _transport.write directly, but then switched to the write loop because it was more efficient.
To confirm that, I ran a simple benchmark again today, and in every run the throughput when writing directly was worse than with the write loop.
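
To make the trade-off concrete, here is a minimal sketch of write coalescing. It is not the PR's _write_loop: BatchingWriter and FakeTransport are hypothetical, and this version batches by deferring a single flush to the next event loop iteration rather than polling. It only shows how several small writes can end up as one transport write.

```python
import asyncio


class FakeTransport:
    """Hypothetical stand-in for an asyncio transport; records each write call."""

    def __init__(self) -> None:
        self.writes = []

    def write(self, data: bytes) -> None:
        self.writes.append(bytes(data))


class BatchingWriter:
    """Buffers small writes and flushes them together on the next loop iteration."""

    def __init__(self, transport) -> None:
        self._transport = transport
        self._pending = bytearray()
        self._flush_scheduled = False

    def write(self, buf: bytes) -> None:
        self._pending += buf
        if not self._flush_scheduled:
            self._flush_scheduled = True
            asyncio.get_running_loop().call_soon(self._flush)

    def _flush(self) -> None:
        self._flush_scheduled = False
        if self._pending:
            self._transport.write(bytes(self._pending))
            self._pending.clear()


async def demo():
    direct = FakeTransport()
    for chunk in (b"a", b"b", b"c"):
        direct.write(chunk)  # one transport write per message

    batched = FakeTransport()
    writer = BatchingWriter(batched)
    for chunk in (b"a", b"b", b"c"):
        writer.write(chunk)  # buffered
    await asyncio.sleep(0)  # let the scheduled flush run
    return direct.writes, batched.writes


direct_writes, batched_writes = asyncio.run(demo())
```

Whether the reduced number of transport writes translates into better throughput or latency depends on the workload, which is what the benchmark discussion here is about.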

What are the performance benefits? IMO it's better to integrate as-is rather than invent what looks to be similar to what they are doing unless the improvements are compelling.

Contributor Author

@yuce yuce Nov 27, 2025

The numbers I got were 15% to 70% worse when I used _transport.write directly.

What they do is not similar.
Note that the asyncore API variant of the client also tries to batch socket writes.

@gbarnett-hz gbarnett-hz Nov 27, 2025

Do you have the results + test you can share? Personally I would go with the _transport.write(...) as it seems to be the idiomatic way unless the difference between approaches when you consider avg + tail latencies is significant.

@yuce yuce merged commit f49d8ab into hazelcast:master Nov 28, 2025
11 checks passed
@yuce yuce deleted the asyncio-module branch November 28, 2025 09:14