TruffleHog
TruffleHog is an open-source secret-scanning tool that searches filesystems, git repositories, and archives for common secret patterns (API keys, tokens, passwords, private keys, etc.). It runs multiple detectors and can emit structured JSON for automated processing.
Integration
TruffleHog is integrated as an analysis tool under jobs/trufflehog. The tool folder contains a small Python script (trufflehog.py) and a Dockerfile (Dockerfile_trufflehog) that:
- Installs TruffleHog (via the official install script) into the container.
- Registers a task handler using the project's tool_registry decorator so the worker listens on a Redis queue for tasks (the queue name is derived from the tool name, e.g. queue_trufflehog).
- Runs as a WorkerManager process which pulls payloads from Redis and invokes the worker handler for each task.
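The registration pattern described in this list can be sketched as follows. This is a minimal illustration, not the project's actual code: `register_tool`, `HANDLERS`, `DEPENDENCIES`, and `dispatch` are hypothetical names, and an in-memory dictionary stands in for the Redis-backed worker loop.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical in-memory registry: queue name -> handler function.
HANDLERS: Dict[str, Callable[[dict], str]] = {}
# Hypothetical record of which tools must run before each tool.
DEPENDENCIES: Dict[str, List[str]] = {}


def queue_name_for(tool_name: str) -> str:
    """Derive the queue name from the tool name, e.g. queue_trufflehog."""
    return f"queue_{tool_name}"


def register_tool(name: str, dependencies: Optional[List[str]] = None):
    """Decorator that registers a handler under the queue derived from `name`."""
    def decorator(func: Callable[[dict], str]) -> Callable[[dict], str]:
        HANDLERS[queue_name_for(name)] = func
        DEPENDENCIES[name] = list(dependencies or [])
        return func
    return decorator


@register_tool("trufflehog", dependencies=["binwalk"])
def handle_trufflehog(payload: dict) -> str:
    # A real handler would run the scan; this stub only reports a status.
    return "success"


def dispatch(queue: str, payload: dict) -> str:
    """Invoke the handler registered for `queue`, as a worker would per task."""
    return HANDLERS[queue](payload)
```

A worker process would then pop a payload from the queue and call `dispatch("queue_trufflehog", payload)` for each task.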
The worker declares dependencies=["binwalk"], so it expects an earlier tool to have extracted the image filesystem into an extracted directory that the scanner can operate on.
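As a rough sketch of that expectation, the lookup might resemble the helper below. The function name `find_extracted_dir` and the choice to raise `FileNotFoundError` are assumptions made for illustration; the worker's real error handling may differ.

```python
from pathlib import Path


def find_extracted_dir(image_path: str) -> Path:
    """Return the 'extracted' directory that sits next to the image file.

    Hypothetical helper: raises FileNotFoundError when the directory is
    missing, which would indicate the extraction step has not run.
    """
    image = Path(image_path).resolve()
    extracted = image.parent / "extracted"
    if not extracted.is_dir():
        raise FileNotFoundError(f"no extracted directory next to {image}")
    return extracted
```

Failing early here keeps the scanner from running against an empty or unrelated path when the dependency did not produce output.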
Worker behavior
When a task arrives on the TruffleHog queue, the worker code (jobs/trufflehog/trufflehog.py) performs the following actions:
- Resolve the provided image path and locate the extracted filesystem directory (the worker expects an extracted directory next to the image path, produced by earlier tools such as binwalk).
- Invoke the TruffleHog binary in filesystem mode with --json to produce structured output. The worker runs TruffleHog as a subprocess and enforces a timeout to avoid indefinite scans.
- Append human-readable stdout/stderr to the job's output.txt so operators and the frontend can inspect raw tool output.
- Read TruffleHog's line-delimited JSON output, parse each line, and filter for detection objects (the implementation treats objects containing DetectorName as detections).
- For each detection, extract metadata (filesystem file path, line number, detector/decoder names, raw secret values when present) and call DBConnector.insert_trufflehog_result(...) to persist the findings.
- Write the structured detections to a file named trufflehog-output.json next to the output file.
- Push a result/status message back to Redis on queue_return describing the job, tool, image and final status (success or failure) so the executor can update job state.
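The scan and parse steps above can be sketched as follows. This is an illustration under stated assumptions rather than the worker's actual code: the function names and the 1800-second default timeout are hypothetical, and the field layout (`SourceMetadata.Data.Filesystem.file`/`line`, `DecoderName`, `Raw`) reflects TruffleHog v3's JSON output as commonly observed, which may vary between versions.

```python
import json
import subprocess
from typing import Dict, Iterable, List


def run_trufflehog(extracted_dir: str, timeout_s: int = 1800) -> subprocess.CompletedProcess:
    """Run TruffleHog in filesystem mode with JSON output.

    subprocess.run raises subprocess.TimeoutExpired if the scan exceeds
    timeout_s (the default here is an assumed value).
    """
    return subprocess.run(
        ["trufflehog", "filesystem", extracted_dir, "--json"],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )


def parse_detections(lines: Iterable[str]) -> List[Dict]:
    """Keep only JSON objects that contain a DetectorName key."""
    detections = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines such as plain-text log output
        if isinstance(obj, dict) and "DetectorName" in obj:
            detections.append(obj)
    return detections


def summarize(detection: Dict) -> Dict:
    """Extract the fields a database row would need (assumed field layout)."""
    meta = detection.get("SourceMetadata") or {}
    fs = (meta.get("Data") or {}).get("Filesystem") or {}
    return {
        "file_path": fs.get("file"),
        "line_number": fs.get("line"),
        "detector_name": detection.get("DetectorName"),
        "decoder_name": detection.get("DecoderName"),
        "raw": detection.get("Raw"),
    }
```

Keeping the parsing separate from the subprocess call makes the filter easy to exercise with canned output, for example `parse_detections(result.stdout.splitlines())` after a real run.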
Outputs
The worker produces several artifacts that other components can consume:
- output.txt (appended): a human-readable log containing TruffleHog's stdout and stderr.
- trufflehog-output.json: a JSON file holding the structured detection objects, including the secrets and the file paths where they were found.
- Database rows via DBConnector.insert_trufflehog_result(...): a parsed, queryable representation of each detection (image id, file path, line number, detector/decoder names, and raw secret fields when available).
- Redis return message on queue_return: a small JSON payload describing the job/tool/image and whether the run finished with success or failure, so orchestration can continue.
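To make the return message concrete, here is one possible shape for it. The key names (`job_id`, `tool`, `image_id`, `status`) and both helper functions are assumptions for illustration; the project's real payload schema is not specified above. The `client` argument is assumed to be a redis-py style client, whose `rpush` method appends a value to a list; whether the project pushes with `rpush` or `lpush` is likewise an assumption.

```python
import json


def build_return_message(job_id: str, tool: str, image_id: str, ok: bool) -> str:
    """Serialize a status payload (hypothetical key names)."""
    return json.dumps({
        "job_id": job_id,
        "tool": tool,
        "image_id": image_id,
        "status": "success" if ok else "failure",
    })


def push_return_message(client, message: str, queue: str = "queue_return") -> None:
    """Append the message to the return queue via a redis-py style client."""
    client.rpush(queue, message)
```

With redis-py installed, `push_return_message(redis.Redis(), build_return_message("42", "trufflehog", "img-1", True))` would enqueue one message for the executor to consume.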