52 changes: 31 additions & 21 deletions README.md
@@ -10,19 +10,21 @@ Clone this git repo and use the `az-aks-ssh.sh` direction (see below for usage).

Dependencies:

* kubectl
* Azure CLI
* [`az`](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-linux) (Azure CLI)
* [`kubectl`](https://learn.microsoft.com/en-us/cli/azure/aks?view=azure-cli-latest#az-aks-install-cli)

## Usage

```
```sh
./az-aks-ssh.sh --help
Usage:
SSH into an AKS agent node (pass in -c to run a single command
or omit for an interactive session):
./az-aks-ssh.sh \
-g|--resource-group <resource_group> \
-n|--cluster-name <cluster> \
-d|--node-name <node_name|any> \
[-p|--pod-name <pod_name>] \
[-c|--command <command>] \
[-o|--output-file <file>]

@@ -36,48 +38,56 @@ Usage:
./az-aks-ssh.sh --cleanup
```

> NOTE: If `--resource-group` is not specified, it is read from `AZURE_DEFAULTS_GROUP`. That group is then searched for a singleton AKS cluster, which supplies default values where possible.
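
The fallback described in the note can be sketched roughly as follows. This is an illustration only: `resolve_resource_group` and `pick_singleton_cluster` are hypothetical helper names, not functions taken from `az-aks-ssh.sh`, and the script's actual lookup may differ.

```sh
# Sketch only: helper names are hypothetical, not taken from az-aks-ssh.sh.

# Print the explicit resource group if given, else $AZURE_DEFAULTS_GROUP.
resolve_resource_group() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"
  else
    printf '%s\n' "${AZURE_DEFAULTS_GROUP:-}"
  fi
}

# Read cluster names (one per line) on stdin. Print the name only when
# exactly one cluster is listed; otherwise print nothing and return 1.
pick_singleton_cluster() {
  _names=$(cat)
  _count=$(printf '%s\n' "$_names" | grep -c . || true)
  if [ "$_count" -eq 1 ]; then
    printf '%s\n' "$_names"
  else
    return 1
  fi
}

# Possible usage (requires a logged-in Azure CLI):
# rg=$(resolve_resource_group "")
# az aks list --resource-group "$rg" --query "[].name" -o tsv | pick_singleton_cluster
```

Returning a non-zero status when the group holds zero or several clusters lets the caller fall back to requiring `--cluster-name` explicitly.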

## Examples

**SSH into any agent node in an interactive SSH session**
### SSH into any agent node in an interactive SSH session

```
$ ./az-aks-ssh.sh -g rg1 -n aks1 -d any
```sh
./az-aks-ssh.sh -g rg1 -n aks1 -d any
```

**SSH into a specific agent node (get node name from `kubectl get no`)**
### SSH into a specific agent node (get node name from `kubectl get nodes`)

```
$ ./az-aks-ssh.sh -g rg1 -n aks1 -d cluster_node
```sh
./az-aks-ssh.sh -g rg1 -n aks1 -d cluster_node
```

**Run a single command non-interactively**
### SSH into a specific agent node hosting a particular pod

```
$ ./az-aks-ssh.sh -g rg1 -n aks1 -d any -c "ps -aux"
```sh
./az-aks-ssh.sh -p mypod
```
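
The node selection behind `-p` can be sketched as below. It roughly follows the `kubectl` queries used in `az-aks-ssh.sh`; the wrapper name `resolve_node` is a hypothetical name introduced for illustration.

```sh
# Sketch mirroring the script's selection logic; resolve_node is a
# hypothetical wrapper name.
# $1 = requested node name (or "any"), $2 = optional pod name.
resolve_node() {
  if [ -n "$2" ] && [ "$1" = "any" ]; then
    # Ask Kubernetes which node the pod was scheduled on.
    kubectl get pod "$2" -o jsonpath='{.spec.nodeName}'
  elif [ "$1" = "any" ]; then
    # No pod given: fall back to the first node in the list.
    kubectl get node -o jsonpath="{.items[0].metadata.labels['kubernetes\.io/hostname']}"
  else
    # An explicit node name always wins over the pod lookup.
    printf '%s' "$1"
  fi
}

# Possible usage:
# NODE_NAME=$(resolve_node any mypod)
```

Note that the pod lookup only applies when the node name is `any`; an explicit `-d` value takes precedence.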

**Run a command non-interactively and save the output to a file**
### Run a single command non-interactively

```
$ ./az-aks-ssh.sh -g rg1 -n aks1 -d any -c "ps -aux" -o ~/aks-ssh.out
```sh
./az-aks-ssh.sh -g rg1 -n aks1 -d any -c "ps -aux"
```

**Cleanup the environment (delete agent node SSH keys locally and remove the SSH proxy pod)**
### Run a command non-interactively and save the output to a file

```sh
./az-aks-ssh.sh -g rg1 -n aks1 -d any -c "ps -aux" -o ~/aks-ssh.out
```
$ ./az-aks-ssh.sh --cleanup

### Clean up the environment (delete agent node SSH keys locally and remove the SSH proxy pod)

```sh
./az-aks-ssh.sh --cleanup
```

## More information

**Design**
### Design

![Design](./design.png)

**SSH keys**
### SSH keys

The SSH keys are generated for individual nodes. This ensures that keys are not being reused for multiple hosts. `--cleanup` removes all keys that match the prefix: `~/.ssh/az_aks*`.
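
A minimal sketch of per-node keys and prefix-based cleanup, under stated assumptions: only the `az_aks` prefix comes from the text above. The exact file naming (`az_aks_<node>`) and both helper names are hypothetical.

```sh
# Hypothetical helpers: the script's real key file naming is not shown
# here; only the az_aks prefix is taken from the README.

# Print the key path for a node. $1 = node name, $2 = key directory
# (defaults to ~/.ssh).
key_path_for_node() {
  printf '%s/az_aks_%s\n' "${2:-$HOME/.ssh}" "$1"
}

# Remove every file whose name starts with az_aks in the given directory
# (defaults to ~/.ssh). Other keys in the directory are left alone.
cleanup_keys() {
  rm -f -- "${1:-$HOME/.ssh}"/az_aks*
}
```

Because each node maps to its own file, removing one node's key does not affect access to the others, and a single glob is enough to clean up everything the tool created.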

**SSH proxy pod**
### SSH proxy pod

This design uses a proxy pod that sleeps forever so that it can be reused. `--cleanup` deletes this pod from the Kubernetes cluster. To see this pod you can run `kubectl get po aks-ssh-session`.
This design uses a proxy pod that sleeps forever so that it can be reused. `--cleanup` deletes this pod from the Kubernetes cluster. To see this pod you can run `kubectl get pod aks-ssh-session-{NodeName}`.
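
The per-node pod name is the base name with the node name appended. A small sketch, where `proxy_pod_name` is a hypothetical helper and the node name in the usage comment is a placeholder:

```sh
# Base name, as set in SSH_POD_NAME in az-aks-ssh.sh.
SSH_POD_BASE="aks-ssh-session"

# proxy_pod_name is a hypothetical helper: print the proxy pod name
# for the node given as $1.
proxy_pod_name() {
  printf '%s-%s\n' "$SSH_POD_BASE" "$1"
}

# Possible usage (node name is a placeholder):
# kubectl get pod "$(proxy_pod_name aks-nodepool1-12345678-vmss000000)"
```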
25 changes: 23 additions & 2 deletions az-aks-ssh.sh
@@ -7,6 +7,7 @@ OUTPUT_FILE="/dev/stdout"
CLEAR_LOCAL_KEYS=""
DELETE_SSH_POD=""
SSH_POD_NAME="aks-ssh-session"
POD_NAME=""
CLEANUP=""
CLUSTER=""
RESOURCE_GROUP=""
@@ -25,6 +26,7 @@ function usage() {
echo " -g|--resource-group <resource_group> \\"
echo " -n|--cluster-name <cluster> \\"
echo " -d|--node-name <node_name|any> \\"
echo " [-p|--pod-name <pod_name>] \\"
echo " [-c|--command <command>] \\"
echo " [-o|--output-file <file>]"
echo ""
@@ -58,6 +60,11 @@ while [[ $# -gt 0 ]]; do
shift
shift
;;
-p|--pod-name)
POD_NAME="$2"
shift
shift
;;
-c|--command)
COMMAND="$2"
shift
@@ -137,13 +144,21 @@ if [[ -n "$CLEANUP" ]]; then
exit
fi

if [[ -n "$POD_NAME" ]] && [[ "$NODE_NAME" == "any" ]]; then
NODE_NAME=$(kubectl get pod "$POD_NAME" -o jsonpath="{.spec.nodeName}")
echo "Selecting node $NODE_NAME where pod $POD_NAME resides."
fi

if [[ "$NODE_NAME" == "any" ]]; then
echo "Selected 'any' node name, getting the first node"
NODE_NAME=$(kubectl get node -o jsonpath="{.items[0].metadata.labels['kubernetes\.io/hostname']}")
fi

echo "Using node: $NODE_NAME"

# Append the node name to the pod name.
SSH_POD_NAME="$SSH_POD_NAME-$NODE_NAME"

NODE_RESOURCE_GROUP=$(az aks show \
--resource-group "$RESOURCE_GROUP" \
--name "$CLUSTER" \
@@ -193,7 +208,7 @@ ACCESS_EXTENSION=$(az vmss show \
--resource-group "$NODE_RESOURCE_GROUP" \
--name "$CONTAINING_VMSS" \
--instance-id "$INSTANCE_ID" \
--query "instanceView.extensions[?name == 'VMAccessForLinux']" -o tsv)
--query "resources[?name == 'VMAccessForLinux']" -o tsv)

if [[ -z "$ACCESS_EXTENSION" || -n "$CREATED_KEY_FILE" ]]; then
echo "Access extension does not exist or new key generated, adding to VM"
@@ -221,7 +236,13 @@ echo "Instance IP is $INSTANCE_IP"

if ! kubectl get po "$SSH_POD_NAME"; then
echo "Proxy pod doesn't exist, setting it up"
kubectl run "$SSH_POD_NAME" --image ubuntu:bionic -- /bin/bash -c "sleep infinity"
# Need to place the pod on the same node to avoid some weird TTY issues.
kubectl run "$SSH_POD_NAME" \
--restart=Never \
--image debian:latest \
--overrides='{ "spec": { "hostNetwork": true, "nodeName": "'"$NODE_NAME"'" } }' \
-- \
/bin/bash -c "sleep infinity"
while true; do
echo "Waiting for proxy pod to be in a Running state"
POD_STATE=$(kubectl get po "$SSH_POD_NAME" -o jsonpath="{.status.phase}")