Standard collection of rules for capa: the tool for enumerating the capabilities of programs
This is the standard collection of rules for capa - the tool to automatically identify capabilities of programs.
Rule writing should be easy and fun! A large rule corpus benefits everyone in the community and we encourage all kinds of contributions.
Anytime you see something neat in malware, we want you to think of expressing it in a capa rule. Then, we'll make it as painless as possible to share your rule here and distribute it to the capa users.
capa uses a collection of rules to identify capabilities within a program. These rules are easy to write, even for those new to reverse engineering. By authoring rules, you can extend the capabilities that capa recognizes. In some regards, capa rules are a mixture of the OpenIOC, Yara, and YAML formats.
Here's an example of a capa rule:
rule:
  meta:
    name: hash data with CRC32
    namespace: data-manipulation/checksum/crc32
    author: [email protected]
    scope: function
    examples:
      - 2D3EDC218A90F03089CC01715A9F047F:0x403CBD
      - 7D28CB106CB54876B2A5C111724A07CD:0x402350  # RtlComputeCrc32
  features:
    - or:
      - and:
        - mnemonic: shr
        - number: 0xEDB88320
        - number: 8
        - characteristic: nzxor
      - api: RtlComputeCrc32
capa interprets the content of these rules as it inspects executable files. If you follow the guidelines of this rule format, then you can teach capa to identify new capabilities.
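To make the and/or logic of the CRC32 rule above concrete, here is a minimal Python sketch of how such a feature tree could be evaluated. It is not capa's engine: the `evaluate` function, the `(kind, value)` tuple model, and the sample feature sets are all assumptions made for illustration.

```python
# Toy evaluator for a boolean feature tree. NOT capa's implementation.
# Assumption: extracted features are modelled as (kind, value) tuples.

def evaluate(node, features):
    """Return True if the feature tree `node` matches the set `features`."""
    if not isinstance(node, dict) or len(node) != 1:
        raise TypeError(f"unsupported node: {node!r}")
    (key, value), = node.items()
    if key == "and":
        return all(evaluate(child, features) for child in value)
    if key == "or":
        return any(evaluate(child, features) for child in value)
    # Leaf node: a single feature such as {"api": "RtlComputeCrc32"}.
    return (key, value) in features


# The `features` section of the CRC32 rule, written as Python data.
CRC32_FEATURES = {
    "or": [
        {"and": [
            {"mnemonic": "shr"},
            {"number": 0xEDB88320},
            {"number": 8},
            {"characteristic": "nzxor"},
        ]},
        {"api": "RtlComputeCrc32"},
    ]
}

# Hypothetical feature sets, as if extracted from three different functions.
inline_crc = {
    ("mnemonic", "shr"),
    ("number", 0xEDB88320),
    ("number", 8),
    ("characteristic", "nzxor"),
}
api_call = {("api", "RtlComputeCrc32")}
unrelated = {("mnemonic", "shr"), ("number", 8)}

print(evaluate(CRC32_FEATURES, inline_crc))  # True: the `and` branch matches
print(evaluate(CRC32_FEATURES, api_call))    # True: the `api` branch matches
print(evaluate(CRC32_FEATURES, unrelated))   # False: neither branch matches
```

The rule matches a function either when all four low-level features appear together or when the function simply calls the API, which is why the two branches sit under a single `or`.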
The doc/format.md file describes exactly how to construct rules. Please refer to it as you create rules for capa.
The organization of this repository mirrors the namespaces of the rules it contains. capa uses namespaces to group like things together, especially when it renders its final report. Namespaces are hierarchical, so the children of a namespace encode its specific techniques. In a few words each, the top-level namespaces are:
We can easily add more top-level namespaces as the need arises.
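Because the directory layout mirrors the namespaces, a rule's location can be derived from its metadata. The sketch below shows the idea; the `rule_path` helper and the file naming scheme (lower-cased rule name, spaces replaced by hyphens, `.yml` extension) are assumptions for illustration, not something this document specifies.

```python
from pathlib import PurePosixPath


def rule_path(namespace: str, name: str) -> PurePosixPath:
    """Map a rule's namespace and name to a repository-relative path.

    Assumption: the file name is the lower-cased rule name with whitespace
    replaced by hyphens and a `.yml` extension.
    """
    filename = "-".join(name.lower().split()) + ".yml"
    return PurePosixPath(namespace) / filename


print(rule_path("data-manipulation/checksum/crc32", "hash data with CRC32"))
# data-manipulation/checksum/crc32/hash-data-with-crc32.yml
```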
capa supports rules that match on other rule matches. For example, the following rule set describes various methods of persistence. Note that the rule "persistence" matches if either "run key" or "service" matches against a sample.
---
rule:
  meta:
    name: persistence
  features:
    or:
      - match: run key
      - match: service
---
rule:
  meta:
    name: run key
  features:
    string: /CurrentVersion\/Run/i
---
rule:
  meta:
    name: service
  features:
    api: CreateService
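The following Python sketch illustrates how a `match` statement can be resolved by evaluating the referenced rule. It is a toy model under stated assumptions, not capa's engine: the `matches` and `_eval` helpers, the `(kind, value)` feature tuples, and the sample feature sets are hypothetical, and only the `/pattern/i` regex form used above is handled.

```python
import re

# The three rules above, reduced to their names and feature trees.
RULES = {
    "persistence": {"or": [{"match": "run key"}, {"match": "service"}]},
    "run key": {"string": r"/CurrentVersion\/Run/i"},
    "service": {"api": "CreateService"},
}


def matches(name, features, rules):
    """Return True if the rule called `name` matches the feature set."""
    return _eval(rules[name], features, rules)


def _eval(node, features, rules):
    (key, value), = node.items()
    if key == "or":
        return any(_eval(child, features, rules) for child in value)
    if key == "and":
        return all(_eval(child, features, rules) for child in value)
    if key == "match":
        # A rule matching on another rule: evaluate the referenced rule.
        return matches(value, features, rules)
    if key == "string" and value.startswith("/"):
        # Regex form /pattern/flags; only the `i` flag is modelled here.
        pattern, _, flags = value[1:].rpartition("/")
        regex = re.compile(pattern, re.IGNORECASE if "i" in flags else 0)
        return any(
            kind == "string" and regex.search(text) is not None
            for kind, text in features
        )
    return (key, value) in features


# Hypothetical feature sets extracted from three samples.
run_key_sample = {("string", "software/microsoft/windows/currentversion/run")}
service_sample = {("api", "CreateService")}
other_sample = {("api", "WriteFile")}

print(matches("persistence", run_key_sample, RULES))  # True, via "run key"
print(matches("persistence", service_sample, RULES))  # True, via "service"
print(matches("persistence", other_sample, RULES))    # False
```

This sketch re-evaluates a referenced rule every time it is matched on; a real engine would more likely evaluate rules in dependency order and reuse the results.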
Using this feature, we can capture common logic into "library rules". These rules don't get rendered as results but are used as building blocks to create other rules. For example, there are quite a few ways to write to files on Windows, so the following library rule makes it easy for other rules to thoroughly match file writing.
rule:
  meta:
    name: write file
    lib: True
  features:
    or:
      - api: WriteFile
      - api: fwrite
      ...
Set rule.meta.lib=True to declare a lib rule, and place the rule file into the lib rule directory. Library rules should not have a namespace. Library rules will not be rendered as results. capa will only attempt to match lib rules that are referenced by other rules, so there's no performance overhead for defining many reusable library rules.
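One way to picture why unreferenced library rules cost nothing is to compute the set of rules that actually needs evaluating: every non-lib rule, plus any rule reachable from one through `match`. The sketch below is a toy model of that idea, not capa's implementation; the helper names and the "unused helper" and "drop payload" rules are hypothetical.

```python
def referenced_rules(node):
    """Yield rule names referenced via `match` anywhere in a feature tree."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "match":
                yield value
            else:
                yield from referenced_rules(value)
    elif isinstance(node, list):
        for child in node:
            yield from referenced_rules(child)


def rules_to_evaluate(rules):
    """Return non-lib rules plus every rule they reference, transitively."""
    selected = {name for name, rule in rules.items() if not rule.get("lib", False)}
    queue = list(selected)
    while queue:
        name = queue.pop()
        for dep in referenced_rules(rules[name]["features"]):
            if dep in rules and dep not in selected:
                selected.add(dep)
                queue.append(dep)
    return selected


RULES = {
    # Library rule from the example above.
    "write file": {
        "lib": True,
        "features": {"or": [{"api": "WriteFile"}, {"api": "fwrite"}]},
    },
    # Hypothetical library rule that nothing references.
    "unused helper": {"lib": True, "features": {"api": "Sleep"}},
    # Hypothetical regular rule that builds on the library rule.
    "drop payload": {
        "features": {"and": [{"match": "write file"}, {"api": "CreateProcess"}]},
    },
}

print(sorted(rules_to_evaluate(RULES)))  # ['drop payload', 'write file']
```

In this model "unused helper" is never selected, so defining it adds no matching work.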
The rule nursery is a staging ground for rules that are not quite polished. Nursery rule logic should still be solid, though metadata may be incomplete. For example, a rule may lack a public example of the technique.
The rule engine matches nursery rules like any other rule. However, for nursery rules our rule linter only enumerates missing rule data and does not fail the CI build, because it's understood that the rule is incomplete.
We encourage contributors to create rules in the nursery, and hope that the community will work to "graduate" the rule once things are acceptable.
Examples of things that would place a rule into the nursery:
- no real-world examples
- missing categorization
- (maybe) questions about fidelity (e.g. RC4 PRNG algorithm)
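The linter behavior described above (report what is missing, but do not fail the build for nursery rules) can be sketched as follows. This is a hypothetical illustration, not the project's linter: the `lint` function is invented here, and treating "categorization" as the `namespace` key is an assumption.

```python
def lint(meta, in_nursery):
    """Return (passed, missing) for a rule's metadata.

    Assumptions: `examples` holds real-world examples and `namespace`
    stands in for the rule's categorization.
    """
    missing = []
    if not meta.get("examples"):
        missing.append("no real-world examples")
    if not meta.get("namespace"):
        missing.append("missing categorization")
    # Nursery rules are reported on but never fail the check.
    passed = in_nursery or not missing
    return passed, missing


incomplete = {"name": "hash data with CRC32"}
print(lint(incomplete, in_nursery=True))
# (True, ['no real-world examples', 'missing categorization'])
print(lint(incomplete, in_nursery=False))
# (False, ['no real-world examples', 'missing categorization'])
```

A rule could "graduate" from the nursery in this model once `lint` returns an empty `missing` list.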