jcmd error when pulling Java metrics when not running as Cassandra user #123

@joelsdc

ds-collector v2.0.2:

I've noticed the following error:

	executing `jcmd 8890 VM.system_properties > java_system_properties.txt`… com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file: target process not responding or HotSpot VM not loaded
	at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:106)
	at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:63)
	at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:208)
	at sun.tools.jcmd.JCmd.executeCommandForPid(JCmd.java:147)
	at sun.tools.jcmd.JCmd.main(JCmd.java:131)
failed
	executing `jcmd 8890 VM.command_line > java_command_line.txt`… com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file: target process not responding or HotSpot VM not loaded
	at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:106)
	at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:63)
	at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:208)
	at sun.tools.jcmd.JCmd.executeCommandForPid(JCmd.java:147)
	at sun.tools.jcmd.JCmd.main(JCmd.java:131)
failed

The issue here is that jcmd is being run by a different user than the one that owns the process. In this case, root is running the collector and cassandra is running the service, so jcmd should be run as cassandra instead of root.

As a workaround, I have added `sudo -u cassandra` to the jcmd entries here:

https://github.com/datastax/diagnostic-collection/blob/master/ds-collector/rust-commands/collect-info.rs#L919-L940

This workaround is ugly at best. 😂

The collector already finds the Cassandra PID in order to run jcmd, so a better approach would be to run something like `ps -o user= -p${cassandra_pid}` once ${cassandra_pid} is known, to get the specific user running Cassandra, and then run `sudo -u ${cassandra_pid_owner} jcmd ...` so that the command doesn't fail.
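To make the idea concrete, here is a rough shell sketch of that lookup. The function names `pid_owner` and `jcmd_command` are made up for illustration and are not part of the collector; it also assumes `ps` and `sudo` are available on the node and that the collector user is allowed to `sudo -u` to the process owner.

```shell
#!/bin/sh

# Hypothetical helper: print the user that owns the given PID.
# Prints nothing if the PID does not exist.
pid_owner() {
    ps -o user= -p "$1" | tr -d '[:space:]'
}

# Hypothetical helper: print the jcmd invocation for a PID.
# sudo -u is only added when the process owner is known and differs
# from the user running the collector.
jcmd_command() {
    owner=$1
    current=$2
    pid=$3
    subcommand=$4
    if [ -z "$owner" ] || [ "$owner" = "$current" ]; then
        echo "jcmd $pid $subcommand"
    else
        echo "sudo -u $owner jcmd $pid $subcommand"
    fi
}

# Example wiring (commented out so that sourcing this file runs nothing):
# cassandra_pid=8890
# cassandra_pid_owner=$(pid_owner "$cassandra_pid")
# $(jcmd_command "$cassandra_pid_owner" "$(id -un)" "$cassandra_pid" VM.system_properties) > java_system_properties.txt
```

With root running the collector and cassandra owning PID 8890, `jcmd_command cassandra root 8890 VM.command_line` prints `sudo -u cassandra jcmd 8890 VM.command_line`. When the collector already runs as cassandra, the sudo prefix is left out, so the existing behaviour would not change for that setup.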

I'm not sure what the best approach is code-wise for this one. I think the changes belong more on the Rust side of the collector, and I get lost there.
