🤖 The tool that will replace your DevOps/SRE/System Engineering team — DevOps GPT 🤖

Prashant Lakhera
4 min read · Feb 3, 2025


“The tool that will replace your DevOps/SRE/System Engineering team” — Now that I have your attention, let’s talk about automation in the world of DevOps.
At a high level, the DevOps/SRE/System Engineering teams handle two critical tasks:
1️⃣ Writing and deploying the code
2️⃣ Monitoring the infrastructure post-deployment

While the first task has seen a lot of automation, the second, monitoring and troubleshooting, is still a manual, time-consuming process. Most solutions today just send logs to a centralized system and rely on alerts, leaving engineers to figure out the root cause themselves.

But what if LLMs could do the heavy lifting?
💡 Introducing DevOps GPT — An AI-driven agent that proactively monitors logs and suggests solutions in real-time.

🔍 How It Works:
✅ AI Agent runs directly on the server
✅ Detects errors from logs
✅ First, it checks its internal knowledge base, built from previously resolved errors
✅ Caches previous solutions to save costs and avoid redundant LLM calls
✅ If no match is found, it queries an LLM — OpenAI, Llama 3, or DeepSeek — for recommendations
✅ Sends the recommended solution via Slack (more integrations coming soon!)
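The cache-first flow above can be sketched in a few lines of Python. This is purely illustrative: the function names and cache shape are assumptions for the sake of the example, not DevOps GPT's actual internals.

```python
# Illustrative sketch of the agent's decision flow described above.
# Names (analyze_error, query_llm) are hypothetical, not DevOps GPT's code.

def query_llm(error: str) -> str:
    # Stand-in for a real OpenAI/Ollama call.
    return f"LLM suggestion for: {error}"

def analyze_error(error: str, cache: dict) -> str:
    """Return a suggested fix, preferring cached answers over LLM calls."""
    if error in cache:                      # knowledge-base / cache hit
        return cache[error]
    suggestion = query_llm(error)           # fall back to the LLM provider
    cache[error] = suggestion               # cache it to avoid redundant calls
    return suggestion

cache = {"disk full": "Rotate logs and expand the volume."}
print(analyze_error("disk full", cache))   # served from cache, no LLM call
print(analyze_error("OOM killed", cache))  # cache miss, queries the LLM
```

The point of the cache layer is cost control: repeated occurrences of the same error never trigger a second LLM call.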

Installation

Step 1: Install the RPM Package

Run the following command on a Red Hat-based system (the package targets EL9) to install the DevOps-GPT agent:

rpm -ivh https://github.com/thedevops-gpt/devops-gpt/releases/download/0.0.1/devops-gpt-0.0.1-0.el9.x86_64.rpm

Note: If you encounter dependency issues, resolve them with:

yum -y install python3-pip

Step 2: Configure the Agent

You can configure OpenAI, DeepSeek, or Llama 3 based on your requirements. Run the configuration command and follow the prompts:

# OpenAI
sudo devops-gpt-configure
Enter check interval in seconds [10]:
Enter batch size [1]:
Enter maximum errors per batch [10]:
Enter error window in seconds [3600]:
Available LLM providers:
1. OpenAI (requires API key)
2. Ollama (local)
Choose LLM provider (1/2) [1]:
Enter OpenAI API key: sk-XXXXX
Enable Slack notifications? (y/n) [y]:
Enter Slack webhook URL: https://hooks.slack.com/services/XXXXXXXXX
Configuration saved to /etc/devops-gpt/config.yaml
# Ollama (DeepSeek or Llama 3)
sudo devops-gpt-configure
Enter check interval in seconds [10]:
Enter batch size [1]:
Enter maximum errors per batch [10]:
Enter error window in seconds [3600]:
Available LLM providers:
1. OpenAI (requires API key)
2. Ollama (local)
Choose LLM provider (1/2) [1]: 2
Available Ollama models:
1. Llama 3.3
2. DeepSeek
Choose Ollama model (1/2) [1]: 2
Enable Slack notifications? (y/n) [y]:
Enter Slack webhook URL: https://hooks.slack.com/services/XXXXXX
Configuration saved to /etc/devops-gpt/config.yaml

Configuration Prompts:

  • Enter check interval in seconds [10]: Set the interval between checks (default: 10 seconds).
  • Enter batch size [1]: Define the number of errors to process in a batch (default: 1).
  • Enter maximum errors per batch [10]: Set the maximum number of errors in a batch (default: 10).
  • Enter error window in seconds [3600]: Specify the time window for error tracking (default: 1 hour).

Choose LLM provider:

  1. OpenAI (requires API key)
  2. Ollama (local; serves Llama 3.3 or DeepSeek as the model)
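Your answers are written to /etc/devops-gpt/config.yaml. Based on the debug line the service prints at startup, the file looks roughly like this (the output there is truncated, so keys beyond these may exist):

```yaml
# /etc/devops-gpt/config.yaml (approximate; key set taken from the
# service's startup debug output, which is truncated after llm_provider)
log_paths:
  - system
  - syslog
check_interval: 10
batch_size: 1
max_errors_per_batch: 10
error_window: 3600
llm_provider: openai
```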

Step 3: Start and Enable the Service

Start the DevOps-GPT Agent service:

sudo systemctl start devops-gpt

sudo systemctl enable devops-gpt

sudo systemctl status devops-gpt
● devops-gpt.service - DevOps GPT Service
     Loaded: loaded (/usr/lib/systemd/system/devops-gpt.service; disabled; preset: disabled)
     Active: active (running) since Sun 2025-02-02 01:53:24 UTC; 4s ago
   Main PID: 28448 (python3)
      Tasks: 1 (limit: 48154)
     Memory: 26.9M
        CPU: 618ms
     CGroup: /system.slice/devops-gpt.service
             └─28448 /usr/bin/python3 -m devops_gpt.main

Feb 02 01:53:24 plakhera.example.com systemd[1]: Started DevOps GPT Service.
Feb 02 01:53:24 plakhera.example.com devops-gpt[28448]: 2025-02-02 01:53:24,646 - __main__ - INFO - Starting DevOps GPT service...
Feb 02 01:53:24 plakhera.example.com devops-gpt[28448]: 2025-02-02 01:53:24,646 - devops_gpt.config_manager - INFO - Loading config from: /etc/devops-gpt/config.yaml
Feb 02 01:53:24 plakhera.example.com devops-gpt[28448]: Debug: Loaded config contents: {'log_paths': ['system', 'syslog'], 'check_interval': 10, 'batch_size': 1, 'max_errors_per_batch': 10, 'error_window': 3600, 'llm_provider': 'openai', 'op>
Feb 02 01:53:24 plakhera.example.com devops-gpt[28448]: 2025-02-02 01:53:24,654 - __main__ - INFO - Configuration loaded. Monitoring paths: ['system', 'syslog']
Feb 02 01:53:24 plakhera.example.com devops-gpt[28448]: 2025-02-02 01:53:24,654 - devops_gpt.log_monitor - INFO - DevOps GPT Log Monitor initialized with paths: ['system', 'syslog']
Feb 02 01:53:24 plakhera.example.com devops-gpt[28448]: 2025-02-02 01:53:24,672 - devops_gpt.cache.local_cache - INFO - DevOps GPT Cache initialized at /var/cache/devops-gpt/openai
Feb 02 01:53:24 plakhera.example.com devops-gpt[28448]: 2025-02-02 01:53:24,672 - devops_gpt.llms.openai_provider - INFO - DevOps GPT OpenAI provider initialized
Feb 02 01:53:24 plakhera.example.com devops-gpt[28448]: 2025-02-02 01:53:24,679 - devops_gpt.error_patterns - INFO - DevOps GPT Error Analyzer initialized with 20 patterns
Feb 02 01:53:24 plakhera.example.com devops-gpt[28448]: 2025-02-02 01:53:24,679 - __main__ - INFO - DevOps GPT initialized with OPENAI as LLM provider
  • Once the service is up and running, you will receive an alert on Slack.
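The Slack integration uses a standard incoming webhook, which accepts a JSON body with a "text" field. The exact message DevOps GPT posts isn't documented here, so the payload below is a minimal sketch of what such a notification could look like (the helper names are hypothetical):

```python
import json
import urllib.request

def build_slack_payload(error: str, suggestion: str) -> bytes:
    """Build a minimal Slack incoming-webhook payload (JSON with a "text" key)."""
    message = f":rotating_light: {error}\nSuggested fix: {suggestion}"
    return json.dumps({"text": message}).encode("utf-8")

def post_to_slack(webhook_url: str, payload: bytes) -> None:
    # POST the JSON body to the webhook URL configured during setup.
    req = urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

payload = build_slack_payload("disk full on /var", "Rotate logs and expand the volume")
# post_to_slack("https://hooks.slack.com/services/XXXXXX", payload)
```

Any tool that can POST JSON can consume the same webhook, which is presumably why Slack is the first integration.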

💬 Give it a try! Your feedback is invaluable as I refine this tool. I’ll be sharing more updates throughout the week — stay tuned!

🔗 For more details about the installation process, see the following link: https://github.com/thedevops-gpt/devops-gpt

Written by Prashant Lakhera

AWS Community Builder, Ex-Redhat, Author, Blogger, YouTuber, RHCA, RHCDS, RHCE, Docker Certified, 4X AWS, CCNA, MCP, Certified Jenkins, Terraform Certified, 1X GCP
