YARN-11823: add new endpoints for getting jstacks of application and nodes #7726

Draft
wants to merge 1 commit into
base: trunk
@@ -0,0 +1,134 @@
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.hadoop.yarn.server.nodemanager;

import org.apache.hadoop.util.Shell;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.*;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DiagnosticJStackService {

private static final Logger LOG = LoggerFactory
.getLogger(DiagnosticJStackService.class);
private static final String PYTHON_COMMAND = "python3";
private static String scriptLocation = null;

static {
Contributor

This static block will prevent the NM from starting up until it has finished.

Contributor Author

In my testing, it is very fast when I access the JStack endpoint. Do you happen to have a better idea for loading the script file from the /resources folder?

try {
// Extract script from JAR to a temp file
InputStream in = DiagnosticJStackService.class.getClassLoader()
.getResourceAsStream("diagnostics/jstack_collector.py");
File tempScript = File.createTempFile("jstack_collector", ".py");
Files.copy(in, tempScript.toPath(), StandardCopyOption.REPLACE_EXISTING);
tempScript.setExecutable(true); // Set execute permission
scriptLocation = tempScript.getAbsolutePath();
} catch (IOException e) {
LOG.error("Failed to extract Python script from JAR", e);
}
}

public static String collectNodeJStack()
Contributor

At first I read this as NodeJS; can we use a different name here?

Contributor Author

Sure, I am thinking of changing it to collectNodeThreadDump().

throws Exception {
if (Shell.WINDOWS) {
throw new UnsupportedOperationException("Not implemented for Windows");
}

ProcessBuilder pb = createProcessBuilder();

return executeCommand(pb);

}



public static String collectAppJStack(String appId)
throws Exception {
if (Shell.WINDOWS) {
throw new UnsupportedOperationException("Not implemented for Windows.");
}
ProcessBuilder pb = createProcessBuilder(appId);

LOG.info("Diagnostic process environment: {}", pb.environment());

return executeCommand(pb);
}

protected static ProcessBuilder createProcessBuilder() {
List<String> commandList =
new ArrayList<>(Arrays.asList(PYTHON_COMMAND, scriptLocation));
Contributor

Why do we need an ArrayList?

Contributor Author

Because the ProcessBuilder constructor accepts 'command' as a list :)

public ProcessBuilder(List<String> command) {..}


return new ProcessBuilder(commandList);
}


protected static ProcessBuilder createProcessBuilder(String appId) {
List<String> commandList =
new ArrayList<>(Arrays.asList(PYTHON_COMMAND, scriptLocation, appId));

return new ProcessBuilder(commandList);
}

private static String executeCommand(ProcessBuilder pb)
throws Exception {
Process process = pb.start();
int exitCode;
StringBuilder outputBuilder = new StringBuilder();
StringBuilder errorBuilder = new StringBuilder();

try (
BufferedReader stdoutReader = new BufferedReader(new InputStreamReader(process.getInputStream(),
StandardCharsets.UTF_8));
BufferedReader stderrReader = new BufferedReader(new InputStreamReader(process.getErrorStream(),
StandardCharsets.UTF_8));
) {

String line;
while ((line = stdoutReader.readLine()) != null) {
outputBuilder.append(line).append("\n");
}

while ((line = stderrReader.readLine()) != null) {
errorBuilder.append(line).append("\n");
}
if (!errorBuilder.toString().isEmpty()) {
LOG.error("Python script stderr: {}", errorBuilder);
}

process.waitFor();
} catch (Exception e) {
LOG.error("Error getting JStack: {}", pb.command());
throw e;
}
exitCode = process.exitValue();
if (exitCode != 0) {
throw new IOException("The JStack collector script exited with non-zero " +
"exit code: " + exitCode);
}

return outputBuilder.toString();
}

}
@@ -31,12 +31,14 @@
import java.util.Set;

import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.yarn.server.nodemanager.DiagnosticJStackService;
import org.apache.hadoop.yarn.server.nodemanager.containermanager.records.AuxServiceRecord;
import org.apache.hadoop.yarn.server.nodemanager.containermanager.records.AuxServiceRecords;
import org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.ResourcePlugin;
import org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.ResourcePluginManager;
import org.apache.hadoop.yarn.server.nodemanager.webapp.dao.AuxiliaryServicesInfo;
import org.apache.hadoop.yarn.server.nodemanager.webapp.dao.NMResourceInfo;
import org.apache.hadoop.yarn.webapp.WebAppException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@@ -271,6 +273,35 @@ public ContainerInfo getNodeContainer(@javax.ws.rs.core.Context

}

@GET
@Path("/jstack")
Contributor

I think this name can be a bit misleading because we already have a /stacks API for jstack.

Contributor

Also, how does it differ from /stacks?

Contributor Author

Stack and JStack are totally different from each other. JStack is run against a currently running Java process to see what each thread is actually doing, while Stack is just a list of the active methods that have been called.

Here is an example of JStack:

2025-05-06 14:43:45
Full thread dump OpenJDK 64-Bit Server VM (25.232-b09 mixed mode):

"Attach Listener" #36 daemon prio=9 os_prio=0 tid=0x00007f8cf6288800 nid=0x5601e waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
	- None

"shuffle-client-4-1" #35 daemon prio=5 os_prio=0 tid=0x00007f8cd86e6800 nid=0x55f04 runnable [0x00007f8ccfc0e000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
	at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
	at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
	- locked <0x00000000c1402a30> (a io.netty.channel.nio.SelectedSelectionKeySet)
	- locked <0x00000000c1402a48> (a java.util.Collections$UnmodifiableSet)
	- locked <0x00000000c14029e8> (a sun.nio.ch.EPollSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
	at 

Here is an example of Stack:

Process Thread Dump: 
263 active threads
Thread 5938 (qtp2085713965-5938):
  State: RUNNABLE
  Blocked count: 2
  Waited count: 6
  Stack:
    sun.management.ThreadImpl.getThreadInfo1(Native Method)
    sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:178)
    sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:139)
    org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:169)
    org.apache.hadoop.http.HttpServer2$StackServlet.doGet(HttpServer2.java:1563)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799)
    org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1656)
    com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:89)
    com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941)
    com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875)
    org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebAppFilter.doFilter(RMWebAppFilter.java:178)
    com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829)
    com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
    com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
    com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
    com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
    com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
    com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
Thread 5937 (qtp2085713965-5937):

@Produces({MediaType.TEXT_PLAIN})
public Response getNodeJStack() {
try {
return Response.status(Status.OK)
.entity(DiagnosticJStackService.collectNodeJStack()) // Requires python3 to be installed on the NodeManager host
Contributor

What will happen if python3 is not present?

Contributor Author

The failure is quite ambiguous when python3 is not installed. The exception is only shown when I execute the script manually. If I access the endpoint through the RM without python3 installed on the NM, it just says 'Internal Server error 500', and the user has to check the corresponding NM to see the error. I will work on this to make the error less ambiguous.

.build();
} catch (Exception e) {
throw new WebAppException("Error collecting NodeManager JStack: " + e.getMessage() + ". " +
"For more information please check the NodeManager logs.");
}
}


@GET
@Path("/apps/{appid}/jstack")
@Produces({MediaType.TEXT_PLAIN})
public Response getApplicationJStack(@PathParam("appid") String appId) {
try {
return Response.status(Status.OK)
.entity(DiagnosticJStackService.collectAppJStack(appId)) // Requires python3 to be installed on the NodeManager host
.build();
} catch (Exception e) {
throw new WebAppException("Error collecting Application JStack: " + e.getMessage() + ". " +
"For more information please check the NodeManager logs.");
}
}

/**
* Returns log file's name as well as current file size for a container.
*
@@ -0,0 +1,95 @@
# Licensed to the Apache Software Foundation (ASF) under one

Check failure on line 1 in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/resources/diagnostics/jstack_collector.py (ASF Cloudbees Jenkins ci-hadoop / Apache Yetus)
pylint: [C0114(missing-module-docstring), ] Missing module docstring
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import subprocess
import sys

NUMBER_OF_JSTACK = 3
Contributor

This can be passed through REST.

Contributor Author

Good idea, it would be nice to make that number configurable from the REST API.
I will work on that. Thanks!
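
One way the count could be made configurable, as discussed in this thread, is for the script to accept it as a command-line option that the REST layer passes through. The sketch below is only an illustration under that assumption: the `--count` flag, the `parse_args` helper, and `DEFAULT_NUMBER_OF_JSTACK` are hypothetical names and are not part of this PR.

```python
import argparse

# Hypothetical default, mirroring NUMBER_OF_JSTACK = 3 in the script under review.
DEFAULT_NUMBER_OF_JSTACK = 3


def parse_args(argv):
    """Parse an optional application ID and an optional --count value."""
    parser = argparse.ArgumentParser(description="Collect jstack output")
    # The application ID stays an optional positional argument, as in the PR.
    parser.add_argument("app_id", nargs="?", default=None)
    # Hypothetical flag: how many thread dumps to take per process.
    parser.add_argument("--count", type=int, default=DEFAULT_NUMBER_OF_JSTACK)
    args = parser.parse_args(argv)
    if args.count < 1:
        parser.error("--count must be at least 1")
    return args
```

With this shape, the Java side would only need to append `--count` and a number to the command list when the caller supplies one; omitting the flag keeps the current behaviour of three dumps.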


def get_nodemanager_pid():

Check failure on line 22 (Apache Yetus): pylint: [C0116(missing-function-docstring), get_nodemanager_pid] Missing function or method docstring

Contributor

I believe that, from a security perspective, these should not be available in the REST API on a non-secure cluster, and we should perform authorization on secured clusters.

Contributor Author

Hmm, why is that? The script only finds the Java processes of the active containers and runs the jstack command on them; it is not as if the user could modify the script or perform some malicious activity?

    results = run_command("ps aux | grep nodemanager | grep -v grep")
# ps aux | grep nodemanager | grep -v grep
# root 414 1.3 1.7 8124480 434520 ? Sl 11:36 0:52 /usr/lib/jvm/java-8-openjdk//bin/java -Dproc_nodemanager -Djava.net.preferIPv4Stack=true -Dyarn.log.dir=/opt/hadoop/logs -Dyarn.log.file=hadoop.log -Dyarn.home.dir=/opt/hadoop -Dyarn.root.logger=INFO,console -Dhadoop.log.dir=/opt/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/hadoop -Dhadoop.id.str=root -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.security.logger=INFO,NullAppender -XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.math=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.text=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.zip=ALL-UNNAMED --add-opens=java.base/sun.security.util=ALL-UNNAMED --add-opens=java.base/sun.security.x509=ALL-UNNAMED org.apache.hadoop.yarn.server.nodemanager.NodeManager

Check failure on line 25 (Apache Yetus): pylint: [C0301(line-too-long), ] Line too long (1115/100)
    pids = []  # Some hosts may run more than one NodeManager
    for result in results.strip().splitlines():
        pid = result.split()[1]
        pids.append(pid)

    return pids
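
The `ps aux | grep nodemanager | grep -v grep` pipeline matches any process whose command line contains "nodemanager". A possible alternative, sketched below, runs `ps aux` without a shell and filters in Python. It assumes that every NodeManager JVM carries the `-Dproc_nodemanager` flag seen in the sample output; `parse_pids` and `list_nodemanager_pids` are hypothetical helper names, not part of this PR.

```python
import subprocess


def parse_pids(ps_output, marker):
    """Return the PID column of every `ps aux` line that contains marker."""
    pids = []
    for line in ps_output.strip().splitlines():
        fields = line.split()
        # Column 2 of `ps aux` is the PID; skip the header and malformed lines.
        if marker in line and len(fields) > 1 and fields[1].isdigit():
            pids.append(fields[1])
    return pids


def list_nodemanager_pids():
    """Run `ps aux` directly (no shell) and keep only NodeManager JVMs."""
    result = subprocess.run(["ps", "aux"], stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, check=True)
    # Assumption: NodeManager JVMs are started with -Dproc_nodemanager.
    return parse_pids(result.stdout.decode("utf-8"), "-Dproc_nodemanager")
```

Keeping the parsing in a separate function makes it testable against captured `ps` output without a running NodeManager.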


def get_app_pid(app_id):

Check failure on line 34 (Apache Yetus): pylint: [C0116(missing-function-docstring), get_app_pid] Missing function or method docstring
Check failure on line 34 (Apache Yetus): pylint: [W0613(unused-argument), get_app_pid] Unused argument 'app_id'

# results= '''
# root 413 1.7 2.0 8355580 512972 ? Sl 11:21 2:56 /usr/lib/jvm/java-8-openjdk//bin/java -Dproc_nodemanager -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/opt/hadoop/logs -Dhadoop.log.file=NODEMANAGER.log -Dyarn.log.dir=/opt/hadoop/logs -Dyarn.log.file=NODEMANAGER.log -Dyarn.home.dir=/opt/hadoop -Dyarn.root.logger=INFO,DRFA -Dhadoop.home.dir=/opt/hadoop -Dhadoop.id.str=root -Dhadoop.root.logger=INFO,DRFA -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.security.logger=INFO,NullAppender -XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.math=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.text=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.zip=ALL-UNNAMED --add-opens=java.base/sun.security.util=ALL-UNNAMED --add-opens=java.base/sun.security.x509=ALL-UNNAMED --enable-native-access=ALL-UNNAMED org.apache.hadoop.yarn.server.nodemanager.NodeManager

Check failure on line 37 (Apache Yetus): pylint: [C0301(line-too-long), ] Line too long (1158/100)
# root 41611 4.1 1.9 2414568 470660 ? Sl 14:08 0:16 /usr/lib/jvm/java-8-openjdk//bin/java -Xmx750m org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster --container_type GUARANTEED --container_memory 750 --container_vcores 1 --num_containers 500 --priority 0 --appname DistributedShell --homedir hdfs://namenode:9000/user/root

Check failure on line 38 (Apache Yetus): pylint: [C0301(line-too-long), ] Line too long (366/100)
# '''
    results = run_command("ps aux | grep jvm/java | grep -v -e /bin/bash -e grep") # TODO: later include "grep app_id" for long java application like mapreduce
Check failure on line 40 (Apache Yetus): pylint: [C0301(line-too-long), ] Line too long (160/100)
Check failure on line 40 (Apache Yetus): pylint: [W0511(fixme), ] TODO: later include "grep app_id" for long java application like mapreduce
    pids = []
    for result in results.strip().splitlines():
        pid = result.split()[1]
        pids.append(pid)

    return pids
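
The TODO in `get_app_pid` plans to add `grep app_id` to a command that `run_command` executes with `shell=True`, so the application ID taken from the REST path would reach a shell. A minimal sketch of validating the ID first is shown below. It assumes IDs always follow the `application_<digits>_<digits>` shape of the commented example in `main()`; `is_valid_app_id` and `APP_ID_PATTERN` are hypothetical names, not part of this PR.

```python
import re

# Assumed shape, based on the example "application_1748517687882_0013".
APP_ID_PATTERN = re.compile(r"application_\d+_\d+")


def is_valid_app_id(app_id):
    """Return True only if the whole string looks like a YARN application ID."""
    # fullmatch rejects any extra characters, such as shell metacharacters.
    return APP_ID_PATTERN.fullmatch(app_id) is not None
```

A caller could reject the request before building any command when this check fails, which also keeps the value safe to use as a plain substring filter.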


def execute_jstack(pids):

Check failure on line 49 (Apache Yetus): pylint: [C0116(missing-function-docstring), execute_jstack] Missing function or method docstring
    all_jstacks = []

    for pid in pids:
        for i in range(NUMBER_OF_JSTACK): # Get multiple jstack
            jstack_output = run_command("jstack", pid)
            all_jstacks.append("--- JStack iteration-{} for PID: {} ---\n{}".format(i, pid, jstack_output))
Check failure on line 55 (Apache Yetus): pylint: [C0301(line-too-long), ] Line too long (107/100)

    return "\n".join(all_jstacks)


def run_command(*argv):

Check failure on line 60 (Apache Yetus): pylint: [C0116(missing-function-docstring), run_command] Missing function or method docstring
    try:
        cmd = " ".join(arg for arg in argv)
        print("Running command with arguments:", cmd)
        response = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True, check=True)
Check failure on line 64 (Apache Yetus): pylint: [C0301(line-too-long), ] Line too long (110/100)
        response_str = response.stdout.decode('utf-8')
    except subprocess.CalledProcessError as e:
Check failure on line 66 (Apache Yetus): pylint: [C0103(invalid-name), run_command] Variable name "e" doesn't conform to snake_case naming style
        response_str = "Unable to run command: {}".format(e)
        print(response_str, file=sys.stderr)
    except Exception as e:
Check failure on line 69 (Apache Yetus): pylint: [W0703(broad-except), run_command] Catching too general exception Exception
Check failure on line 69 (Apache Yetus): pylint: [C0103(invalid-name), run_command] Variable name "e" doesn't conform to snake_case naming style
        response_str = "Exception occurred: {}".format(e)
        print(response_str, file=sys.stderr)

    return response_str
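
Two behaviours of `run_command` as written may be worth a second look. It returns the error message as if it were command output, so when the `ps` pipeline fails (the final `grep` exits with status 1 when nothing matches) the callers parse "Unable to run command: ..." for PIDs. It also prints its "Running command" message to stdout, which is the stream the Java side reads as the result. The sketch below is one possible variant, not the PR's implementation: it takes an argument list instead of a shell string, logs to stderr, and returns an empty string on failure.

```python
import subprocess
import sys


def run_command(*argv):
    """Run argv without a shell; return its stdout, or "" if it fails."""
    # Diagnostics go to stderr so that stdout carries only the real result.
    print("Running command:", " ".join(argv), file=sys.stderr)
    try:
        response = subprocess.run(list(argv), stdout=subprocess.PIPE,
                                  stderr=subprocess.PIPE, check=True)
    except (OSError, subprocess.CalledProcessError) as err:
        # OSError covers a missing executable, e.g. jstack not on PATH.
        print("Unable to run command: {}".format(err), file=sys.stderr)
        return ""
    return response.stdout.decode("utf-8")
```

Because this variant no longer goes through a shell, pipelines such as `ps aux | grep ...` would have to be replaced by filtering in Python, and callers would need to treat an empty string as "no output".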


def main():

Check failure on line 76 (Apache Yetus): pylint: [C0116(missing-function-docstring), main] Missing function or method docstring

    # app_id = "application_1748517687882_0013"
    if len(sys.argv) > 1:
        app_id = sys.argv[1]
        pids = get_app_pid(app_id)
    else:
        pids = get_nodemanager_pid()

    if not pids:
        print("No active process id in this NodeManager.")
        sys.exit(0)

    jstacks = execute_jstack(pids)
    print(jstacks)  # The Java ProcessBuilder that launched this script reads this stdout


if __name__ == "__main__":
    main()

Check failure on line 95 (Apache Yetus): pylint: [C0305(trailing-newlines), ] Trailing newlines