Produce LLM-oriented repo digests in CI: summary, tree, and file contents
When it’s useful
AI-assisted review, onboarding, or documentation needs curated text from a repository, not an ad-hoc zip of everything. Running that extraction in GitHub Actions keeps prompts and artifacts repeatable and close to the code you are analyzing.
What you can do
- Analyze the checked-out workspace, a remote URL, or another path, with optional branch or tag.
- Receive three text files under a configurable output directory: summary.txt (high-level stats), tree.txt (directory layout), and content.txt (concatenated sources), plus a job summary for quick reads in the Actions UI.
- Tighten scope with include/exclude glob patterns, opt into gitignored files or submodules when appropriate, and pass a token for private clones.
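To make the three outputs and the include/exclude filtering concrete, here is a minimal standard-library sketch. It is not the action's or gitingest's implementation: the helper names `select_files` and `write_digest`, the `fnmatch`-style matching, and the exact contents of each file are assumptions chosen for illustration.

```python
from fnmatch import fnmatch
from pathlib import Path


def select_files(root, include=("*",), exclude=()):
    """Return sorted relative paths under root that match at least one
    include pattern and no exclude pattern (hypothetical helper)."""
    root = Path(root)
    selected = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        # fnmatch's "*" also matches "/", so "*.py" matches "src/a.py".
        if not any(fnmatch(rel, pat) for pat in include):
            continue
        if any(fnmatch(rel, pat) for pat in exclude):
            continue
        selected.append(rel)
    return sorted(selected)


def write_digest(root, out_dir, include=("*",), exclude=()):
    """Write summary.txt, tree.txt, and content.txt into out_dir.

    Assumes out_dir lies outside root, so outputs are not re-ingested.
    """
    root, out_dir = Path(root), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    files = select_files(root, include, exclude)

    # summary.txt: high-level stats (file count and total size here).
    total_bytes = sum((root / rel).stat().st_size for rel in files)
    (out_dir / "summary.txt").write_text(
        f"Files: {len(files)}\nBytes: {total_bytes}\n", encoding="utf-8"
    )

    # tree.txt: a flat listing stands in for the directory layout.
    (out_dir / "tree.txt").write_text("\n".join(files) + "\n", encoding="utf-8")

    # content.txt: each selected file, concatenated under a header line.
    parts = []
    for rel in files:
        text = (root / rel).read_text(encoding="utf-8", errors="replace")
        parts.append(f"=== {rel} ===\n{text}\n")
    (out_dir / "content.txt").write_text("".join(parts), encoding="utf-8")
    return files
```

Calling `write_digest("repo", "digest", include=("*.py", "*.md"), exclude=("node_modules/*",))` would keep only Python and Markdown files outside `node_modules`. The real tool may interpret patterns differently, so check its documentation before relying on a specific glob.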
Limits and fit
The action wraps gitingest, which enforces guardrails so runs stay bounded: by default 10 MB per file, 500 MB cumulative, up to 10,000 files, and 20 levels of directory depth. Paths beyond those limits are skipped, so the digest may be partial. Large monorepos benefit from narrower patterns or a subdirectory source. Setup, error cases, and example workflows are in the action repository and Marketplace listing.
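The way such limits yield a partial digest can be sketched with a bounded directory walk. This is an illustration under stated assumptions, not gitingest's code: the `Limits` class and `bounded_walk` function are hypothetical, 1 MB is taken as 1024 * 1024 bytes, and the order in which limits are checked is a guess.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    # Defaults mirror the figures quoted above; MB = 1024 * 1024 is assumed.
    max_file_size: int = 10 * 1024 * 1024
    max_total_size: int = 500 * 1024 * 1024
    max_files: int = 10_000
    max_depth: int = 20


def bounded_walk(root, limits=Limits()):
    """Return (kept, skipped) relative paths; skipped entries exceeded a limit."""
    root = os.path.abspath(root)
    kept, skipped = [], []
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        depth = 0 if rel_dir == "." else rel_dir.count(os.sep) + 1
        dirnames.sort()  # deterministic traversal order
        if depth >= limits.max_depth:
            # Do not descend further; record the pruned directories.
            for d in dirnames:
                skipped.append(os.path.relpath(os.path.join(dirpath, d), root))
            dirnames[:] = []
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            try:
                size = os.path.getsize(path)
            except OSError:
                skipped.append(rel)  # unreadable entry, e.g. a broken symlink
                continue
            over_file = size > limits.max_file_size
            over_count = len(kept) >= limits.max_files
            over_total = total + size > limits.max_total_size
            if over_file or over_count or over_total:
                skipped.append(rel)
                continue
            kept.append(rel)
            total += size
    return kept, skipped
```

A non-empty `skipped` list is the signal that the digest is partial. In a workflow, that is the cue to narrow the include patterns or point the source at a subdirectory.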
