Bot

Keylogger and screenshot bot/malware

The Origin: Understanding Background Processes (2024)

Every software engineer eventually wants to understand how programs interact directly with a computer's hardware. Back in 2024, my goal was to learn how operating systems handle keyboard input and screen rendering. To explore this, I built a basic system monitoring tool in Python.

The original iteration was functional but fragmented. I wrote separate scripts: one used pynput to listen to global keystrokes and save them to a text file, while another used pyautogui to take screenshots. Both scripts connected to Discord to upload the files. It worked as a proof of concept on Windows, but having multiple scripts compete for the same bot connection was inefficient.

The Evolution: Moving to Linux and Better Architecture (Present)

Recently, I switched to Fedora Linux with KDE Plasma as my main operating system. When I brought this older project over, it immediately broke because of Linux's stricter security model. Fixing it forced me to completely re-architect the codebase.
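The breakage came down to which display system the desktop session was running, and a program can check that for itself before relying on it. What follows is a minimal sketch, assuming the login session sets the standard `XDG_SESSION_TYPE`, `WAYLAND_DISPLAY`, and `DISPLAY` environment variables. The function name `detect_session_type` is hypothetical and not taken from the project.

```python
import os


def detect_session_type(env=None):
    """Return 'wayland', 'x11', or 'unknown' for the current desktop session."""
    env = os.environ if env is None else env
    # XDG_SESSION_TYPE is normally set by the login manager on Linux desktops.
    session = env.get("XDG_SESSION_TYPE", "").strip().lower()
    if session in ("wayland", "x11"):
        return session
    # Fall back to the display variables. WAYLAND_DISPLAY is checked first
    # because Wayland sessions usually export DISPLAY as well (for XWayland).
    if env.get("WAYLAND_DISPLAY"):
        return "wayland"
    if env.get("DISPLAY"):
        return "x11"
    return "unknown"


if __name__ == "__main__":
    print(f"Session type: {detect_session_type()}")
```

Passing the environment in as a parameter keeps the function easy to test without changing the real session.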

I redesigned the project to follow more professional engineering practices. This is exactly the kind of systems-level problem-solving I want to pursue in computer science:

  • Unified Asynchronous Bot: Instead of having two separate scripts compete to log in as the same Discord bot, I combined the exfiltration logic into a single object-oriented class. Using discord.py's background tasks, the bot now uploads text files and images every 10 seconds without freezing or crashing.
  • Concurrent Processing: I created a central main.py orchestrator that uses Python's subprocess module to launch the keylogger and the Discord bot side by side. They run simultaneously as independent processes, so fast typing is still logged even while the screenshot timer fires.
  • Navigating OS Security: Porting this to Linux taught me a lot about system permissions. Fedora defaults to a Wayland session, which for security reasons prevents ordinary applications from recording the screen or reading keystrokes meant for other windows. I had to research how display systems work and switch my KDE Plasma session to X11, where applications are not isolated from one another in this way, before my script could capture input again.
  • Credential Security: I stripped my hardcoded Discord API tokens out of the source code and moved them into a .env file that is loaded at runtime and kept out of version control, so the code can be uploaded to public repositories without leaking credentials.
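The credential handling in the last point can be sketched as follows. The source does not say how the .env file is read; many projects use the third-party python-dotenv package, but this sketch sticks to the standard library so it runs anywhere. The variable name `DISCORD_TOKEN` and the helpers `load_env_file` and `get_token` are hypothetical names chosen for illustration.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines into a dict; a missing file yields {}."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Trim whitespace and optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_token(name="DISCORD_TOKEN", path=".env"):
    """Prefer a real environment variable, then fall back to the .env file."""
    token = os.environ.get(name) or load_env_file(path).get(name)
    if not token:
        raise RuntimeError(f"{name} is not set; add it to {path} or the environment")
    return token
```

For this to protect anything, the .env file itself has to be listed in .gitignore. Failing loudly when the token is missing also avoids the temptation to paste a fallback value back into the source.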

[Image: bot1.png]
One more thing worth noting: when I originally made this, I uploaded it to a malware detection site to see whether it would be flagged as malware on Windows. These sites actually run the files you upload so that antivirus companies can be better prepared. As a result, for a few months the bot's Discord server kept pinging me with screenshots of random virtual machines.

Technical Growth and Takeaways

This project represents my transition from writing simple, single-task scripts to designing a consolidated application that handles multiple background tasks at once. Working through the display server quirks on Linux gave me a much deeper understanding of how software interacts with the operating system it runs on. The project turned a basic coding experiment into a solid lesson in system telemetry, concurrent processing, and secure credential handling.