How to check what’s hiding in your Jira attachments
Most teams only think about their Jira attachments when something forces them to — an audit, a data-subject request, a security review, or a near-miss. Then the question lands: what is actually inside the tens of thousands of files attached to our issues? It breaks into two separate checks. Is any file dangerous in itself — the job of a virus scan? And is any file, however harmless-looking, exposing data we shouldn’t be holding? Here is a practical way to work through both, and where the usual tools quietly stop short.
Start with what a virus scan covers — and where it stops
A malware or virus scan asks whether a file is malicious: a disguised executable, a macro-laden spreadsheet, a rigged PDF. Atlassian Cloud runs native malware detection on uploads, and many teams add their own antivirus for files that get downloaded. Keep that layer — it is your defence against a dangerous file. But a virus scan reads a file’s structure, not its meaning. A clean screenshot containing a live password is not a virus, so it sails straight through. That is the gap the rest of this checklist closes.
Map your highest-risk attachments with JQL
You rarely need to scan everything at once. Start where risk concentrates: service-desk projects where customers upload documents, projects handling HR or finance data, and issues with image or PDF attachments. A JQL scope (by project, label, date range, or issue type) lets you point a scan precisely at those areas instead of scanning every issue in the instance, which also keeps any OCR cost predictable.
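As a rough illustration of such a scope, the sketch below assembles a JQL string from a list of projects, optional labels, and a start date. The helper name build_scan_scope, the project keys HR and FIN, and the label customer-upload are hypothetical placeholders, not anything the app prescribes; the JQL itself relies only on the standard project, labels, created, and attachments fields.

```python
from datetime import date
from typing import Optional, Sequence


def build_scan_scope(
    projects: Sequence[str],
    labels: Sequence[str] = (),
    created_after: Optional[date] = None,
) -> str:
    """Build a JQL string that narrows a scan to issues that have attachments.

    Hypothetical helper for illustration only. Values are wrapped in double
    quotes but not escaped, so it assumes keys and labels contain no quotes.
    """
    def quoted(values: Sequence[str]) -> str:
        return ", ".join(f'"{v}"' for v in values)

    # Always restrict to the chosen projects and to issues with attachments.
    clauses = [f"project in ({quoted(projects)})", "attachments IS NOT EMPTY"]
    if labels:
        clauses.append(f"labels in ({quoted(labels)})")
    if created_after is not None:
        # JQL accepts dates in yyyy-MM-dd form.
        clauses.append(f'created >= "{created_after.isoformat()}"')
    return " AND ".join(clauses) + " ORDER BY created DESC"


# Example scope: HR and finance projects, customer uploads, since 1 Jan 2024.
scope = build_scan_scope(
    ["HR", "FIN"],
    labels=["customer-upload"],
    created_after=date(2024, 1, 1),
)
print(scope)
```

Starting from a narrow scope like this and widening it later is usually easier to review than starting from every issue in the instance.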
Decide what “sensitive” means, as patterns
An attachment checker for Jira is only as good as the patterns you give it. Translate your policy into concrete patterns: literal strings like “password” or “secret,” or regular expressions for API keys, email addresses, national ID formats, or card-number shapes. Simple wildcard text covers the obvious cases; regex handles the structured ones.
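To make that concrete, here is a minimal sketch of what a small pattern set might look like in Python. The names PATTERNS, luhn_valid, and find_sensitive are illustrative, and the expressions are deliberately simple starting points rather than a complete policy. The card-number pattern only matches a shape (13 to 19 digits with optional spaces or hyphens), so the sketch adds a Luhn checksum to discard digit runs that cannot be real card numbers.

```python
import re

# Illustrative patterns only; tune them to your own policy before relying on them.
PATTERNS = {
    "password_keyword": re.compile(r"\b(?:password|passwd|secret)\b", re.IGNORECASE),
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    # AWS access key IDs for long-term credentials start with AKIA
    # followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # 13 to 19 digits, optionally separated by single spaces or hyphens.
    "card_number_shape": re.compile(r"\b(?:\d[ -]?){12,18}\d\b"),
}


def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in candidate pass the Luhn checksum."""
    digits = [int(c) for c in candidate if c.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return len(digits) >= 13 and total % 10 == 0


def find_sensitive(text: str) -> list[tuple[str, str]]:
    """Return (pattern_name, matched_text) pairs found in text."""
    hits = []
    for name, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            # Drop card-shaped numbers that fail the checksum.
            if name == "card_number_shape" and not luhn_valid(m.group()):
                continue
            hits.append((name, m.group()))
    return hits


print(find_sensitive("Login password: hunter2, contact ops@example.com"))
```

Pairing a loose shape pattern with a checksum is one common way to cut false positives from order numbers and other long digit strings.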
Read inside images and scanned PDFs — the step most tools skip
This is where ordinary scanning goes blind. Jira search indexes fields, not file contents; most DLP reads text in comments and fields, then stops. The riskiest content — a password in a screenshot, an ID on a scanned form — lives inside images and scans that text extraction can’t open. Attachment Scanner for Jira uses built-in OCR to read those files, so an image is treated like any other searchable document. Its full scan reads images and all PDF types; a document-only mode covers Office and text files alone, uses no OCR credits, and never contacts the OCR service.
Review every match in context, then remediate
A hit is only useful if you can judge it. The results dashboard shows each match with its issue key (click straight through to Jira), the file name, whether the text came from OCR or direct extraction, the matched text, and the surrounding context — so a false positive is obvious. When something genuinely shouldn’t be there, bulk-select the matches and delete those attachments. Deletion is always an explicit, admin-confirmed action and is written to an audit log; nothing is ever removed automatically.
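The value of surrounding context is easy to see in a small sketch. The code below is not the app's implementation; it is an assumed, simplified model of a result record with the fields described above, plus a hypothetical helper matches_with_context that cuts a fixed-width window of text around each match.

```python
import re
from dataclasses import dataclass


@dataclass
class Match:
    """One hit, modelled loosely on the dashboard fields described above."""
    issue_key: str
    file_name: str
    source: str          # "ocr" or "text", i.e. how the text was obtained
    matched_text: str
    context: str


def matches_with_context(issue_key, file_name, source, text, pattern, window=30):
    """Return a Match for every occurrence of pattern, with nearby text."""
    results = []
    for m in re.finditer(pattern, text):
        # Take up to `window` characters on each side, clamped to the text.
        start = max(0, m.start() - window)
        end = min(len(text), m.end() + window)
        results.append(
            Match(issue_key, file_name, source, m.group(), text[start:end].strip())
        )
    return results


# Hypothetical issue key, file name, and extracted text.
sample = "Staging login for the demo: password = hunter2 (rotate after Friday)"
hits = matches_with_context(
    "SD-123", "screenshot.png", "ocr", sample, r"password", window=10
)
for hit in hits:
    print(hit.issue_key, hit.file_name, hit.source, repr(hit.context))
```

With the context beside the match, a reviewer can tell a real credential from, say, a form label that merely contains the word "password".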
Make it repeatable — and know the limits
Save your scope and pattern set as a template and rerun it after busy periods or before an audit. Two honest caveats: scanning is on demand today, not continuous real-time monitoring, and the app is Jira Cloud only, with no Data Center or Confluence support yet. On privacy, OCR runs on dedicated EU/EEA GPU hardware with no public AI service involved; attachments are processed in memory and discarded, and only matched snippets are stored in Atlassian’s Forge storage. You can try it free for 30 days from the Atlassian Marketplace, with monthly evaluation credits to test OCR on your own files.
