Troubleshooting

Find your failure mode and fix it — fast.

Last verified: 2026-04-06

Quick Triage

Pick your problem area:

Claude Code Issues
MCP Issues
Hook Issues
k3s / Infra Issues
Public Access Issues
ArgoCD Issues
GitHub Actions Issues

Start Here

Before diving into specific issues, try these five universal checks:

Update Claude — stale versions cause most weird bugs.
```
claude update
```
Restart your session — exit the terminal and run claude again. A fresh session clears transient state.
Check verbose mode — press Ctrl+O to see what Claude is doing under the hood.
Verify config locations — project settings live in .claude/settings.json; MCP config lives in .mcp.json (project) or ~/.claude.json (user).
Ask Claude with the pasted error — often the fastest path:
```
"I'm getting this error: <paste>"
```
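
The config locations in the fourth check can be confirmed from the terminal. This is a minimal sketch: the function name check_claude_config is hypothetical, and it only tests that the files exist, not that their contents are valid.

```shell
# Hypothetical helper: report which Claude config files exist.
# Arguments: project directory (default: current dir), home directory (default: $HOME).
check_claude_config() (
  project_dir="${1:-.}"
  home_dir="${2:-$HOME}"
  for f in \
    "$project_dir/.claude/settings.json" \
    "$project_dir/.mcp.json" \
    "$home_dir/.claude.json"
  do
    if [ -f "$f" ]; then
      echo "found:   $f"
    else
      echo "missing: $f"
    fi
  done
)

# Example: run from the project root.
# check_claude_config
```

A missing file is not always an error. You only need .mcp.json if the project defines its own MCP servers.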

Claude Code Issues

Symptom: Session hangs / no response

Check: Is the spinner moving? Press Ctrl+C to interrupt.

Likely cause: A long-running tool call timed out, or a network blip stalled the connection.

Fix:

Ctrl+C        # interrupt current turn
/clear        # reset session state

If still stuck, close the terminal entirely and run claude in a new window.

Success signal: Claude responds to your next prompt within a few seconds.

Symptom: Permission denied on every command

Check: Which permission mode are you running in?

Likely cause: Default mode prompts for approval before Claude edits files or runs shell commands, so the prompts pile up quickly.

Fix: Press Shift+Tab to cycle permission modes, or launch with a more permissive mode:

claude --permission-mode acceptEdits

You can also press the "a" key at a permission prompt to allow that tool for the rest of the session.

Success signal: Commands run without repeated confirmation prompts.

Symptom: CLAUDE.md not loading

Check: Run /memory inside a session to see what files are loaded.

Likely cause: The file is in the wrong directory or has a broken @path import.

Fix: Ensure the file is at ./CLAUDE.md or ./.claude/CLAUDE.md relative to where you launch Claude. Fix any @path references so they resolve correctly.

Success signal: /memory shows your CLAUDE.md content.
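
Both failure modes, wrong location and broken @path import, can be checked outside a session. This is a rough sketch under stated assumptions: the function name check_claude_md is hypothetical, it treats every whitespace-delimited token starting with @ as an import, and it resolves relative paths against the directory of the CLAUDE.md file. It will flag false positives such as @mentions.

```shell
# Hypothetical helper: find CLAUDE.md and list @path imports that do not resolve.
# Argument: directory you launch Claude from (default: current dir).
check_claude_md() (
  dir="${1:-.}"
  found=0
  for md in "$dir/CLAUDE.md" "$dir/.claude/CLAUDE.md"; do
    [ -f "$md" ] || continue
    found=1
    echo "found: $md"
    base="$(dirname "$md")"
    # Crude tokenizer: one token per line, keep tokens that start with "@".
    tr -s ' \t' '\n\n' < "$md" | grep '^@' | sed 's/^@//' | while IFS= read -r p; do
      case "$p" in
        /*)     target="$p" ;;                 # absolute path
        "~"/*)  target="$HOME/${p#"~"/}" ;;    # home-relative path
        *)      target="$base/$p" ;;           # relative to the CLAUDE.md file
      esac
      [ -e "$target" ] || echo "unresolved import: @$p"
    done
  done
  [ "$found" -eq 1 ] || echo "no CLAUDE.md found in $dir or $dir/.claude"
)
```

Treat the output as a hint, then confirm with /memory inside a session.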

Symptom: Skills not showing up

Check: Verify the skill file exists at the expected path.

Likely cause: The SKILL.md file is misplaced or the directory structure is wrong.

Fix: Skills must live at one of these paths:

Project: .claude/skills/<skill-name>/SKILL.md
User:    ~/.claude/skills/<skill-name>/SKILL.md

After fixing, start a new session — skills load at startup.

Success signal: The skill appears when Claude lists available capabilities.
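
The directory layout above can be checked with a short loop. A minimal sketch, assuming only the two paths listed; the function name check_skills is hypothetical.

```shell
# Hypothetical helper: flag skill directories that lack a SKILL.md file.
# Arguments: project directory (default: current dir), home directory (default: $HOME).
check_skills() (
  for root in "${1:-.}/.claude/skills" "${2:-$HOME}/.claude/skills"; do
    [ -d "$root" ] || continue
    for d in "$root"/*/; do
      [ -d "$d" ] || continue          # the glob matched nothing
      name="$(basename "$d")"
      if [ -f "$root/$name/SKILL.md" ]; then
        echo "ok:      $root/$name/SKILL.md"
      else
        echo "missing: $root/$name/SKILL.md"
      fi
    done
  done
)
```

On a case-sensitive filesystem, a file named skill.md is reported as missing, which is a common cause of this symptom.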

Symptom: Auto memory not working

Check: Confirm your version and settings.

Likely cause: Outdated Claude Code version or the feature is disabled in settings.

Fix:

claude --version   # needs v2.1.59+
claude update

Then verify ~/.claude/settings.json contains:

{ "autoMemoryEnabled": true }

Success signal: Claude proactively saves facts between sessions.
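
If you want to script the version check, a version-aware comparison avoids mistakes such as reading 2.1.9 as newer than 2.1.59. A minimal sketch: version_ge is a hypothetical helper, and the commented usage assumes claude --version prints the version number as its first field.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort), available in GNU coreutils and recent BSD/macOS sort.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Usage (assumes the version number is the first field of the output):
# installed="$(claude --version | awk '{print $1}')"
# version_ge "$installed" 2.1.59 || echo "Claude Code $installed is too old, run: claude update"
```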

Symptom: Context getting too long

Check: Is Claude forgetting earlier instructions or repeating itself?

Likely cause: The conversation has exceeded the effective context window.

Fix: Compact the conversation proactively — don't wait until Claude starts forgetting.

/compact

Success signal: Claude responds coherently with awareness of earlier context.

MCP Issues

Symptom: MCP server not connecting

Check: Is the config in the right file? Project-level: .mcp.json. User-level: ~/.claude.json.

Likely cause: Missing environment variables, wrong server command path, or the server binary isn't installed.

Fix: Verify the server command exists, set required env vars (e.g. GITHUB_TOKEN), then restart your Claude session — MCP config is only read at startup.

Success signal: Claude shows MCP tools in its capabilities on session start.
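
The first two causes, a missing binary and a missing environment variable, can be ruled out before restarting. A minimal sketch: check_mcp_prereqs is a hypothetical helper, and the command name in the usage line is a placeholder for whatever your MCP config launches.

```shell
# Hypothetical helper: check that a server command is on PATH and that
# each named environment variable is set and non-empty.
# Usage: check_mcp_prereqs <command> [ENV_VAR ...]
check_mcp_prereqs() (
  cmd="$1"; shift
  status=0
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "ok:      command '$cmd' found"
  else
    echo "missing: command '$cmd' is not on PATH"
    status=1
  fi
  for var in "$@"; do
    eval "val=\${$var:-}"        # indirect lookup; pass trusted variable names only
    if [ -n "$val" ]; then
      echo "ok:      $var is set"
    else
      echo "missing: $var is unset or empty"
      status=1
    fi
  done
  exit "$status"
)

# Usage (placeholder command name):
# check_mcp_prereqs <server-command> GITHUB_TOKEN
```

Run it in the same shell you launch claude from, since a variable set in another terminal will not be visible to the session.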

Symptom: MCP tools not appearing

Check: Press Ctrl+O to enable verbose mode and look for MCP handshake messages.

Likely cause: The server started but the tool registration failed — usually a schema mismatch or server crash during init.

Fix: Run the MCP server command manually in your terminal to see its error output. Fix any issues, then restart the Claude session.

Success signal: Tools from that MCP server appear and are callable.

Hook Issues

Symptom: Hooks not firing

Check: Enable verbose mode with Ctrl+O to see hook execution.

Likely cause: Event name casing is wrong, the matcher doesn't match, or the script isn't executable.

Fix:

# Validate your settings JSON
claude -p "validate the JSON in .claude/settings.json"

# Make sure the hook script is executable
chmod +x .claude/hooks/your-hook.sh

Event names are PascalCase: PreToolUse, not preToolUse. Exit codes matter: 0 = allow, 2 = block.

Success signal: Verbose mode shows the hook executing on the expected event.
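
To make the exit-code rule concrete, here is a sketch of a PreToolUse hook body. Claude Code passes the tool call to the hook as JSON on stdin. The function name hook_decision and the "rm -rf" rule are illustrative assumptions, and matching raw text is a shortcut; a real hook would normally parse the JSON with a tool such as jq.

```shell
# Sketch of a PreToolUse hook body: read the JSON payload from stdin,
# then allow (0) or block (2) the tool call.
hook_decision() {
  payload="$(cat)"
  case "$payload" in
    *"rm -rf"*)
      echo "Blocked: payload contains 'rm -rf'" >&2   # stderr is fed back to Claude
      return 2                                        # 2 = block the tool call
      ;;
  esac
  return 0                                            # 0 = allow
}

# In a standalone hook script, finish with:
# hook_decision
# exit $?
```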

Symptom: Hook blocking unexpectedly

Check: Which tool call is being blocked? Verbose mode (Ctrl+O) shows the hook name and exit code.

Likely cause: The matcher is too broad or the hook script returns exit code 2 on a path you didn't intend to block.

Fix: Narrow the matcher pattern in .claude/settings.json so it targets only the intended tool. Test by running the hook script manually with sample input.

Success signal: The tool call proceeds without being blocked.
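
Running the hook by hand means piping a sample payload into it and reading the exit code. A minimal sketch: run_hook_with_sample is a hypothetical helper, and the payload in the usage line is a made-up example of the JSON a Bash tool call would send.

```shell
# Hypothetical helper: pipe a sample payload into a hook script and report its exit code.
# Usage: run_hook_with_sample <hook-script> '<json-payload>'
run_hook_with_sample() (
  script="$1"
  payload="$2"
  rc=0
  printf '%s' "$payload" | "$script" || rc=$?
  case "$rc" in
    0) echo "exit 0: tool call would be allowed" ;;
    2) echo "exit 2: tool call would be blocked" ;;
    *) echo "exit $rc: non-blocking error" ;;
  esac
)

# Usage (sample payload is illustrative):
# run_hook_with_sample .claude/hooks/your-hook.sh \
#   '{"tool_name":"Bash","tool_input":{"command":"ls"}}'
```

Try one payload you expect to pass and one you expect to block. If both come back as exit 2, the script logic is too broad, not the matcher.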

Infrastructure Issues

Symptom: Can't SSH into droplet

Check: Run SSH with verbose output to see where it fails.

Likely cause: Wrong IP, SSH key not added during droplet creation, or firewall blocking port 22.

Fix:

ssh -v root@<droplet-ip>

Verify the IP in the DigitalOcean console. If the key is missing, add it in the DigitalOcean dashboard and rebuild the droplet.

Success signal: You get a root shell on the droplet.

Symptom: k3s not starting

Check: Look at the service status and logs.

Likely cause: Insufficient RAM (need at least 2 GB), port 6443 already in use, or firewall rules blocking required ports.

Fix:

sudo systemctl status k3s
sudo journalctl -u k3s -f

Success signal: systemctl status k3s shows active (running).
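
The RAM and port causes can be checked directly on the droplet. A minimal sketch for Linux: mem_total_mb is a hypothetical helper that reads /proc/meminfo by default, and the commented ss line is one way to see what is listening on 6443.

```shell
# Hypothetical helper: print total memory in MB from a meminfo-style file.
# Argument: path to the file (default: /proc/meminfo).
mem_total_mb() {
  awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# Usage on the droplet:
# mem_total_mb                     # compare against the 2 GB guideline
# sudo ss -ltnp | grep ':6443'     # shows any process already bound to 6443
```

A droplet sold as 2 GB reports slightly less than 2048 MB here because the kernel reserves some memory, so read the number as a rough guide.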

Symptom: kubectl connection refused

Check: Are you using the correct kubeconfig?

Likely cause: k3s uses its own kubeconfig path, not the default ~/.kube/config.

Fix:

# On the droplet
sudo kubectl --kubeconfig /etc/rancher/k3s/k3s.yaml get nodes

For local access, copy the kubeconfig and replace 127.0.0.1 with your droplet’s public IP.

Success signal: kubectl get nodes returns your node in Ready state.
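
The local-access step can be sketched as a copy plus a one-line substitution. The function name rewrite_kubeconfig and the file names in the usage lines are assumptions; adjust them to your setup.

```shell
# Hypothetical helper: copy a k3s kubeconfig and point it at a remote host.
# Usage: rewrite_kubeconfig <source> <destination> <droplet-ip>
rewrite_kubeconfig() {
  sed "s/127\.0\.0\.1/$3/g" "$1" > "$2"
}

# Usage from your local machine (file names are illustrative):
# scp root@<droplet-ip>:/etc/rancher/k3s/k3s.yaml ./k3s.yaml
# rewrite_kubeconfig ./k3s.yaml ./k3s-remote.yaml <droplet-ip>
# kubectl --kubeconfig ./k3s-remote.yaml get nodes
```

Writing to a separate file keeps your existing ~/.kube/config untouched.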

Symptom: Pods stuck in ImagePullBackOff

Check: Describe the pod to see the pull error.

Likely cause: The image doesn't exist, the tag is wrong, or Docker Hub is rate-limiting your pulls.

Fix:

kubectl describe pod <pod-name> -n ai-coderrank

Verify the image name and tag. For rate-limit issues, switch to ghcr.io.

Success signal: Pod transitions to Running.

Symptom: Pods stuck in CrashLoopBackOff

Check: Read the logs from the previous crashed container.

Likely cause: App crashes on startup — missing env var, missing ConfigMap/Secret, or port mismatch.

Fix:

kubectl logs <pod-name> -n ai-coderrank --previous

Success signal: Pod stays in Running state after you fix the config.

Public Access Issues

Symptom: App not accessible on public IP (NodePort 30080)

Check: Confirm the service type and that the port is reachable.

Likely cause: Service isn't NodePort, the port isn't 30080, or the DigitalOcean firewall blocks it.

Fix:

# Verify the service is NodePort on 30080
kubectl get svc -n ai-coderrank

# Test locally on the droplet first
curl http://localhost:30080

If local curl works but external doesn’t, open port 30080 in the DigitalOcean firewall: DO Console > Networking > Firewalls.

This course uses NodePort 30080 for public access, not Ingress. Make sure your Service spec sets type: NodePort and nodePort: 30080.

Success signal: curl http://<droplet-ip>:30080 returns your app's response.
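
For reference, a Service with the two required fields looks roughly like the manifest this sketch prints. Only type: NodePort, nodePort: 30080, and the ai-coderrank namespace come from this guide. The Service name, selector label, port, and targetPort are assumptions to replace with the values from your own Deployment.

```shell
# Prints an example Service manifest for comparison with your own.
# Service name, selector label, port, and targetPort are assumptions.
print_nodeport_service() {
  cat <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: ai-coderrank          # assumed Service name
  namespace: ai-coderrank
spec:
  type: NodePort
  selector:
    app: ai-coderrank         # assumed pod label
  ports:
    - port: 80                # assumed cluster-internal port
      targetPort: 3000        # assumed container port
      nodePort: 30080         # port exposed on the droplet
EOF
}

# Usage: compare against the live Service.
# print_nodeport_service
# kubectl get svc -n ai-coderrank -o yaml
```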

ArgoCD Issues

Symptom: ArgoCD UI not accessible

Check: Is the port-forward running?

Likely cause: No port-forward active, or the ArgoCD server pod isn't running.

Fix:

kubectl port-forward svc/argocd-server -n argocd 8080:443

Get the admin password:

kubectl -n argocd get secret argocd-initial-admin-secret \
  -o jsonpath="{.data.password}" | base64 -d

Success signal: Browser loads the ArgoCD dashboard at https://localhost:8080. Expect a self-signed certificate warning on first load.

Symptom: ArgoCD not syncing

Check: Look at the application status.

Likely cause: Repo URL typo, wrong branch name, or the path doesn't match your directory structure.

Fix:

kubectl get app -n argocd

Force a sync if the app exists but is stuck:

kubectl patch app ai-coderrank -n argocd \
  --type merge -p '{"operation":{"sync":{}}}'

Success signal: App status shows Synced and Healthy.
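
To read just the two fields named in the success signal, a jsonpath query on the Application resource is enough. A minimal sketch: argocd_app_status is a hypothetical helper, and it assumes the Application lives in the argocd namespace unless you pass another one.

```shell
# Hypothetical helper: print "<sync status> <health status>" for an ArgoCD Application.
# Usage: argocd_app_status <app-name> [namespace]
argocd_app_status() {
  kubectl get app "$1" -n "${2:-argocd}" \
    -o jsonpath='{.status.sync.status} {.status.health.status}'
}

# Usage:
# argocd_app_status ai-coderrank
```

Anything other than Synced and Healthy means the repo URL, branch, and path in the Application spec are the next things to check.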

Symptom: ArgoCD showing Degraded health

Check: Identify which resource is unhealthy.

Likely cause: A pod managed by ArgoCD is failing — this is usually a CrashLoopBackOff underneath.

Fix:

kubectl get app ai-coderrank -n argocd -o yaml | grep -A 20 health
kubectl logs -n ai-coderrank -l app=ai-coderrank --tail=50

Fix the underlying pod issue (see Infrastructure Issues) and ArgoCD will detect the recovery automatically.

Success signal: App health returns to Healthy.

GitHub Actions Issues

Symptom: Claude Action not triggering

Check: Is the workflow file present and the trigger correct?

Likely cause: The workflow file is not in .github/workflows/, the trigger event doesn't match, the if condition filters the comment out, the ANTHROPIC_API_KEY secret is missing, or the GitHub App lacks permissions.

Fix: Verify all five of these:

Workflow file lives in .github/workflows/
Trigger event matches (issue_comment, pull_request_review_comment)
The if condition matches your comment pattern
ANTHROPIC_API_KEY is set in repo Settings > Secrets
GitHub App permissions are configured

Success signal: The Actions tab shows a new run after you post a matching comment.
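
The first, second, and fourth items in the checklist can be spot-checked from the repo root. A rough sketch: check_claude_workflow is a hypothetical helper, and it only confirms that some workflow file mentions the trigger and the secret name. It cannot tell whether the secret is set in repo settings.

```shell
# Hypothetical helper: look for workflow files that mention the comment
# trigger and the API key secret named in the checklist.
# Argument: repository root (default: current dir).
check_claude_workflow() (
  dir="${1:-.}/.github/workflows"
  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
    exit 1
  fi
  status=0
  for needle in issue_comment ANTHROPIC_API_KEY; do
    if grep -rqF "$needle" "$dir"; then
      echo "ok:      '$needle' referenced in $dir"
    else
      echo "missing: no workflow in $dir mentions '$needle'"
      status=1
    fi
  done
  exit "$status"
)
```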

Symptom: Action runs but no output

Check: Open the run log in the GitHub Actions tab.

Likely cause: API key is invalid/expired, you're rate-limited, max_turns is set too low, or CLAUDE.md has overly restrictive instructions.

Fix:

Check your Anthropic dashboard for API key status and rate-limit errors
Increase max_turns if the workflow exits too early
Review the workflow log to confirm Claude actually received the trigger payload
Temporarily simplify restrictive CLAUDE.md instructions if the agent is getting blocked by policy

Success signal: The action run log shows Claude's response and it posts output to the PR/issue.

What To Paste Into Claude

When you’re stuck, give Claude structured context. Copy this template:

I'm working on the Claude Code 101 course. I hit this issue:

**Block**: [which block]
**Step**: [which step]
**Error**: [paste error]
**What I tried**: [what you already tried]

Help me diagnose and fix this.

The more specific you are about the block, step, and exact error text, the faster Claude can help.

If you’re repeatedly blocked, mentoring may be faster than self-debugging.