Skip to content

Conversation

@alexandrejbr
Copy link
Contributor

This PR aims at introducing the equivalent of openssh ClientAliveInterval/ClientAliveCountMax (https://man.openbsd.org/sshd_config#ClientAliveCountMax) and ServerAliveInterval/ServerAliveCountMax

There's 2 notable differences:

  1. In openssh "The default is 0, indicating that these messages will not be sent to the client.", but in this implementation infinity is used instead of 0 and ssh_options checks that a positive integer is presented;
  2. Keep-alive messages can't be sent during renegotiation, but since this feature acts as a keep-alive and a timeout, an equivalent timeout is established for the renegotiation procedure if alive is enable. This is implemented with a timeout called renegotiation_alive

@github-actions
Copy link
Contributor

github-actions bot commented Nov 27, 2024

CT Test Results

    2 files     29 suites   19m 45s ⏱️
  485 tests   478 ✅  6 💤 1 ❌
1 690 runs  1 663 ✅ 26 💤 1 ❌

For more details on these failures, see this check.

Results for commit 8814976.

♻️ This comment has been updated with latest results.

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally.

Artifacts

// Erlang/OTP Github Action Bot

@alexandrejbr
Copy link
Contributor Author

@u3s this a feature I promised long time ago, if you are still interested in it please continue reading :)

I know you are short on time, but if you could give me some tips on how to create some tests for this would be appreciated.

I wanted to add tests at least these 3 scenarios:

  • Normal scenario client and server send alive messages (This one I have one idea how to test which would be using the ssh_dbg module, ssh_dbg:on([ssh_messages]) reveals the sending of the messages but it's kind of ugly to use a regex to get the keep alive message);
  • Timeout scenario. I would like that the server or client become not responsive and I would like to check that the connections is terminated (this one we have tested before suspending the openssh process but it's not so nice);
  • The renegotiation scenario in which the during the renegotiation the peer becomes unreachable and the timeout should have effect.

@alexandrejbr alexandrejbr force-pushed the alexandrejbr/ssh-keep-alive branch 2 times, most recently from 62ef27e to 5b99b40 Compare November 27, 2024 21:13
@u3s u3s self-assigned this Nov 28, 2024
@u3s u3s added the team:PS Assigned to OTP team PS label Nov 28, 2024
@u3s u3s removed the priority:low label Jun 9, 2025
@u3s
Copy link
Contributor

u3s commented Jun 12, 2025

  • we would like to proceed with that one
  • can you convert it into PR? so we could start testing it and discussing implementation details
  • for testing, my 1st guess would be ssh_trpt_lib module and do it kind of whitebox way

@alexandrejbr
Copy link
Contributor Author

Sounds good. I’ll rebase and start working on the tests.

@alexandrejbr alexandrejbr marked this pull request as ready for review June 12, 2025 14:23
@alexandrejbr alexandrejbr force-pushed the alexandrejbr/ssh-keep-alive branch from 5b99b40 to aac13e4 Compare June 18, 2025 13:05
@u3s u3s requested review from Copilot and u3s July 2, 2025 14:30
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Adds client/server keep-alive support by introducing interval/count parameters and integrating timers into the SSH finite-state machines.

  • Defines a new alive_params option with default count/interval and validation
  • Schedules and handles keep-alive and renegotiation timeout events in userauth and key exchange FSMs
  • Implements keep-alive macros, timer logic, and record fields in the connection handler

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
ssh_options.erl Added alive_params option with default and check logic
ssh_fsm_userauth_server.erl Switched idle timeout to start_alive for keep-alive
ssh_fsm_userauth_client.erl Scheduled initial keep-alive timer on auth success
ssh_fsm_kexinit.erl Added renegotiation alive timer after new keys
ssh_connection_handler.erl Introduced macros and functions for keep-alive handling
ssh.hrl Extended #ssh{} record with keep-alive fields
Comments suppressed due to low confidence (3)

lib/ssh/src/ssh_connection_handler.erl:2191

  • No unit tests were added for the new keep-alive timeout logic. Consider covering get_next_alive_timeout, triggered_alive, and reset_alive with targeted tests.
get_next_alive_timeout(#ssh{alive_interval = AliveInterval,

lib/ssh/src/ssh.hrl:1285

  • [nitpick] The alive_started field is added but never referenced in the code. Either remove it or implement logic to set and check this flag.
          alive_started = false               :: boolean(),

lib/ssh/src/ssh_connection_handler.erl:100

  • [nitpick] The global request name is given as a list; consider using a binary (e.g., <<"[email protected]">>) to match typical SSH message encoding.
    {ssh_msg_global_request,"[email protected]", true,<<>>}).

Copy link
Contributor

@u3s u3s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please check inline comments.
Copilot repeated some of my observations.

we would need some docs and tests.

@IngelaAndin IngelaAndin added the waiting waiting for changes/input from author label Jul 15, 2025
@alexandrejbr alexandrejbr force-pushed the alexandrejbr/ssh-keep-alive branch from 7825483 to c456cc4 Compare August 13, 2025 14:17
@alexandrejbr
Copy link
Contributor Author

@u3s I have just pushed changes that address most of the comments.

There's 2 tests now in place. I started to add a third test that checks that the renegotiation_alive timeout works, but I could not start the renegotiation. I left a comment on the test and mentioned you there to see if you can try to help spot what am I doing wrong.

@alexandrejbr alexandrejbr requested a review from u3s August 18, 2025 07:12
@IngelaAndin IngelaAndin removed the waiting waiting for changes/input from author label Sep 2, 2025
@u3s u3s added testing currently being tested, tag is used by OTP internal CI and removed testing currently being tested, tag is used by OTP internal CI labels Sep 16, 2025
@alexandrejbr alexandrejbr force-pushed the alexandrejbr/ssh-keep-alive branch from 4339a10 to 4046e53 Compare September 22, 2025 13:59
@alexandrejbr alexandrejbr force-pushed the alexandrejbr/ssh-keep-alive branch from 3d92154 to fa6105b Compare October 10, 2025 12:13
@alexandrejbr alexandrejbr force-pushed the alexandrejbr/ssh-keep-alive branch from 2102453 to b0a6554 Compare October 22, 2025 11:44
@u3s u3s added testing currently being tested, tag is used by OTP internal CI and removed testing currently being tested, tag is used by OTP internal CI labels Oct 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

team:PS Assigned to OTP team PS

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants