Age | Commit message | Author
2019-11-15 | WIP: Fix: Work around bug in oletools where vba macros are not detected properly [fix-oletools-bug-detect_vba_macros] | Matthias Beyer
This patch works around a bug[0] in oletools where the library does not detect VBA macros properly: `VBA_Parser.detect_vba_macros()` returns true but `VBA_Parser.extract_all_macros()` returns an empty list because there is no macro. This patch has not been tested by me because I do not have the test setup.
[0] https://github.com/decalage2/oletools/issues/501
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
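The workaround described above might look roughly like the following sketch. It assumes only the two `VBA_Parser` method names mentioned in the message; `FakeParser` and `has_real_macros` are hypothetical stand-ins for illustration, not Peekaboo or oletools code.

```python
class FakeParser:
    """Hypothetical stand-in mimicking the two VBA_Parser methods named above."""

    def __init__(self, detected, macros):
        self._detected = detected
        self._macros = macros

    def detect_vba_macros(self):
        return self._detected

    def extract_all_macros(self):
        return list(self._macros)


def has_real_macros(parser):
    """Trust detection only if extraction actually yields macro code."""
    if not parser.detect_vba_macros():
        return False
    # Work around the false positive: an empty extraction means no macro.
    return len(list(parser.extract_all_macros())) > 0
```

The check costs one extra extraction pass, which seems acceptable if detection alone cannot be trusted.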
2019-10-29 | Fix returned RuleResult instead of CuckooReport | Felix Bauer
RuleResult was returned in case of CuckooSubmitFailedException, which led to "AttributeError: RuleResult instance has no attribute 'score'" from expressions. Now None is returned and handled in the expression rule to return Result.failed.
2019-10-29 | Replaced logging of entire exception with precise warning | Felix Bauer
Exception loading cuckoo json report is no longer logged. Instead we now log a warning message.
2019-10-29 | Removed raise OSError when Cuckoo report not found | Felix Bauer
CuckooEmbed mode raised OSError if the Cuckoo report file didn't exist.
2019-10-29 | Removed overly specific exception handling | Felix Bauer
vbaparser.type was compared against a hard coded list. OleNotAnOfficeDocumentException has been removed. The additional test is unnecessary, olevba takes care of it.
2019-10-29 | Python3 Exception TypeError in scan_file.py | Felix Bauer
Must be string not bytes. Data from socket is now correctly decoded as UTF-8.
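As a minimal illustration of the bytes-versus-string issue, the sketch below reads from a socket and decodes explicitly. The helper name `read_text` and the sample message are made up for this example and are not taken from scan_file.py.

```python
import socket


def read_text(sock, bufsize=1024):
    """Receive bytes from a socket and return them as text (str)."""
    data = sock.recv(bufsize)      # recv() returns bytes under Python 3
    return data.decode("utf-8")    # decode explicitly before string handling


# A connected socket pair stands in for the real client/server connection.
left, right = socket.socketpair()
left.sendall("Datei geprüft\n".encode("utf-8"))
message = read_text(right)
left.close()
right.close()
```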
2019-10-15 | add some more default bad signatures to ruleset.conf.sample (#103) | Marcel Caspar
* add more default bad signatures
They have been tested for a few days on our side with little traffic, but I think they are bad enough that they don't need to be tested extensively.
2019-10-02 | Remove file extension check from oletools module (#105) | Felix Bauer
The file extension was checked against a list of office file extensions, preventing analysis of non-office files. Therefore the oletools module couldn't be used on files with other extensions, e.g. rtf. Now any file can be processed with the oletools module. Tests and documentation have been adapted accordingly.
2019-10-01 | Add olereport to expression context (#100) | Felix Bauer
olereport can now be used in expressions. Previously, oletools was only available as a separate rule. Now has_office_macro and vba_code can be used in generic rule expressions.
2019-10-01 | Add None type to expressions (#101) | Felix Bauer
Expressions can now contain None values, e.g. `variable is None`.
2019-09-05 | Handles deprecation warnings in unit tests with python3 (#99) | Felix Bauer
* Removes DeprecationWarning in unittests with python3
  "DeprecationWarning: Please use assertRaisesRegex instead" is due to a change in python3 where functions have been renamed. The new names do not exist in earlier versions. This commit introduces handling python2 and python3 differently with regard to those functions.
* Fix AttributeError 'IdentifierMissingException' object has no attribute 'message' in ruleset.py
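One possible way to bridge the renamed assertion is sketched below. This is an assumption about the general technique, not the code from the commit; `CompatTestCase` and `ExampleTest` are hypothetical names.

```python
import unittest


class CompatTestCase(unittest.TestCase):
    """Hypothetical base class exposing the new name on every version."""

    if not hasattr(unittest.TestCase, "assertRaisesRegex"):
        # Old interpreters only know the previous spelling.
        assertRaisesRegex = unittest.TestCase.assertRaisesRegexp


class ExampleTest(CompatTestCase):
    def test_message(self):
        with self.assertRaisesRegex(ValueError, "bad value"):
            raise ValueError("bad value given")


# Run the single test programmatically and collect the outcome.
result = unittest.TestResult()
ExampleTest("test_message").run(result)
```

Tests then call `assertRaisesRegex` everywhere and no longer trigger the deprecation warning on python3.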
2019-08-27 | Ignore gpg signatures (#97) | Felix Bauer
Add new rule to ignore gpg signatures (Closes #40). Includes test case.
2019-08-27 | Add implementation of generic rules (expressions) (#96) | Felix Bauer
Generic rules allow access to Sample attributes and Cuckoo Report attributes from the ruleset configuration and the construction of expressions that evaluate to a result. Implement a parser for generic rules based on pyparsing. Implement a rule that can use generic logical expressions to categorize samples. Smime signatures can now directly be ignored in the ruleset as a combination of declared name and declared type (closes #83, closes #42).
2019-08-20 | Add an additional static test to the ruleset to check for office macros with suspicious keywords (#87) | Felix Bauer
The new rule is deactivated by default. It uses oletools to check SUSPICIOUSKEYWORDS in the macro code of office documents. Ole is now an analyser module much like Cuckoo inside the toolbox. All logic has been moved there. Sample is merely there for caching. Evaluation of rules uses the toolbox and reports back. Regex-based matching of MS office files for configurable keywords. Also, detection of macros has been improved.
- Tests for correct handling of non-office file extension
- Tests for correct handling of empty file with correct extension
- Tests for correct detection of office file with suspicious macro
- Tests for correct pass of blank office document
- Tests for correct handling of empty word doc
- Tests for correct non-detection of Excel file with macro
2019-08-06 | Merge pull request #91 from Jack28/cuckoo-api-authtoken | Felix Bauer
Add support for Cuckoo API Authentication Bearer Token
2019-08-06 | Add support for Cuckoo API Authentication Bearer Token | Felix Bauer
Since version 2.0.7 the Cuckoo API supports authentication via a bearer auth token set in the cuckoo configuration and passed to the API with every request in the HTTP header. Add support for this mechanism to our cuckoo module and a new configuration option to specify the token.
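Passing a bearer token in the HTTP header can be sketched with the standard library as follows. The URL, the token value and the helper name `build_request` are placeholders chosen for this example; the actual configuration option name is not given in the message.

```python
from urllib.request import Request


def build_request(url, api_token=None):
    """Build a request, adding the bearer token header only if one is set."""
    headers = {}
    if api_token:
        headers["Authorization"] = "Bearer " + api_token
    return Request(url, headers=headers)


# Placeholder URL and token, for illustration only.
request = build_request("http://127.0.0.1:8090/tasks/list", api_token="secret-token")
```

Leaving the token unset keeps the request unauthenticated, which would match API versions without token support.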
2019-06-24 | Merge pull request #86 from michaelweiser/setup-doc | Michael Weiser
Deprecate installation using setup.py
2019-06-05 | Deprecate installation using setup.py | Michael Weiser
Document that installation using setup.py is not a good idea. Closes #80.
2019-06-05 | Merge pull request #85 from michaelweiser/augment-dist | Michael Weiser
Augment distribution
2019-06-05 | Add documentation on PyPI interaction testing | Michael Weiser
Document how we test PyPI interaction using devpi.
2019-05-21 | Update documentation with sample config paths | Michael Weiser
Add mention to the documentation that peekaboo now installs sample peekaboo.conf and ruleset.conf files.
2019-05-21 | Update systemd unit | Michael Weiser
Add startup order relative to cuckoo-api to systemd unit. It's not a hard requirement (no Require=) but only ordering so that if a cuckoo-api service exists, we're started after it in case we're running in api mode and will actually try to talk to it.
2019-05-21 | Distribute sample configuration files | Michael Weiser
Add ruleset.conf.sample to the distribution and configure setuptools to install it together with the README, peekaboo.conf.sample and our systemd unit into an fhs-like documentation directory. Add a number of comments to setup.py so we remember why this seems to be the least disruptive way out of the limited options presented by the python/setuptools/distutils/pip mechanics of providing these files at a location visible to users and still somewhat manageable by alternative packagers (e.g. Linux distributions). In short again:
- package_data would hide the files away below the site-packages directory making them invisible to the user and hard(er) to find programmatically
- absolute paths really don't work well with data_files
- relative paths work somewhat better but still not fully consistently between setup.py and pip
- overriding the install command solves the installation but not the packaging problem
2019-05-09 | Merge pull request #84 from michaelweiser/orig-filename | Michael Weiser
Orig filename
2019-05-09 | Update change log | Michael Weiser
Update the change log with original filename and REST API robustness details.
2019-05-09 | More robustly poll Cuckoo REST API jobs | Michael Weiser
Downloading the full list of jobs gets less and less efficient as the number of current and past jobs increases. There is no way to filter down to specific job IDs. The limit and offset parameters of the list action of the API cannot be used to achieve a similar effect because the job list is not sorted by job/task ID and the parameters seem only meant to iterate over the whole list in blocks, not to extract specific jobs from it. The previous logic of determining the highest job ID at startup and requesting the next million entries from that offset on was therefore likely not working as expected and making us "blind" to status changes of jobs which end up below our offset in the job list.
This change adjusts the CuckooAPI to make use of the list of running jobs we've had for some time now to work around this. Instead of getting a list of all jobs starting from the highest job ID we saw at startup, we just get each job's status individually. While this makes for more requests, it should over a longer runtime make for less network traffic and reliably get us the data we need about our jobs.
Also, turn the shutdown_requested flag into an event so we can use its wait() method to also implement the poll interval and get immediate reaction to a shutdown request.
Finally, we switch to endless retrying of failed job status requests paired with the individual request retry logic introduced earlier. On submission we still fail the submission process after timeouts or retries on the assumption that without the job being submitted to Cuckoo, our feedback to the client that analysis failed will cause it to resubmit and still avoid duplicates. Closes #43.
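The event-based poll loop described above can be sketched like this. Only the name `shutdown_requested` comes from the message; `poll_jobs`, the job IDs and the 30-second interval are hypothetical, and the status request is reduced to a placeholder.

```python
import threading

shutdown_requested = threading.Event()
polled = []  # records which job IDs were queried, for illustration only


def poll_jobs(running_jobs, interval):
    """Query each running job individually until shutdown is requested."""
    while True:
        for job_id in list(running_jobs):
            polled.append(job_id)  # placeholder for one status request per job
        # wait() serves as the poll interval and returns True as soon as
        # the event is set, so shutdown does not have to wait out the interval.
        if shutdown_requested.wait(interval):
            break


worker = threading.Thread(target=poll_jobs, args=([7, 9], 30.0), name="JobPoller")
worker.start()
shutdown_requested.set()
worker.join(timeout=5)
```

Even with a 30-second interval the worker exits almost immediately after `set()`, which is the benefit over a plain boolean flag combined with `time.sleep()`.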
2019-05-09 | Submit sample with its original filename to Cuckoo | Michael Weiser
When using the REST API, submit the sample with its original filename if available via the new name_declared (meta info) property. Closes #81 and #82 when using api mode. No plans to add this to embed mode as well since it's deprecated anyway.
2019-05-09 | Use super() for other Cuckoo classes as well | Michael Weiser
Switch parent method calling to the python3 no-argument version of super() as we've done with our WhitelistRetry before.
2019-05-09 | Let requests and urllib3 do our retries for us | Michael Weiser
Use a session configured with various retry parameters and a special whitelisting retry class to do our retries for us. This offloads all kinds of error handling we've not even thought of yet to urllib3 while still staying highly customizable. This also gives us a nice backoff algorithm when retrying which we only need to parametrize. While at it, unroll the __get() method into the one user of POST (submit()) to make it more readable.
2019-05-09 | Merge pull request #76 from michaelweiser/ruleset-rework | Michael Weiser
Ruleset rework
2019-04-25 | Update change log | Michael Weiser
Update the change log with ruleset rework details.
2019-04-25 | Avoid TypeError on ruleset config validation with python3 | Michael Weiser
python3 changes the type of dict().keys() from list to dict_keys which can not be concatenated with a list easily. Options are in-place concatenation using += and explicit cast using list(dict().keys()). We choose the former.
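The difference between the two operations can be shown with a few lines. The option names below are invented for the example.

```python
config = {"enabled": "yes", "keyword.1": "AutoOpen"}

# Plain concatenation of a list and dict_keys fails on python3.
try:
    ["rules"] + config.keys()
    concatenation_failed = False
except TypeError:
    concatenation_failed = True

# In-place concatenation accepts any iterable, including dict_keys.
known_options = ["rules"]
known_options += config.keys()
```

The explicit alternative would be `["rules"] + list(config.keys())`, which creates a new list instead of extending the existing one.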
2019-04-25 | Fix test regex for python3 | Michael Weiser
TypeError messages use quotes in python3 which need to be matched in our test.
2019-04-25 | Avoid config access boilerplate in rule classes | Michael Weiser
Extend PeekabooConfigParser's get_by_default_type (and rename to get_by_type) to also support lists, tuples, floats, loglevels and relists (lists of regular expressions). Use this to eliminate multiple wrapper methods from the Rule base class and instead access config values based on the type of their default value, as is already done with the main configuration. Log levels and relists still lack distinct types for automatic detection. Instead we allow overriding their type explicitly on access and do so. Eliminate exception catching from get_by_type() because it's inconsistent with the other getters. If not avoided by providing fallback values or only accessing known-to-exist settings, exceptions should be caught elsewhere. For this reason the signature of Rule.get_config_value() enforces provision of a default.
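Choosing a getter from the type of the default value could look roughly like the sketch below. It is a simplified assumption built on the standard configparser module and only covers bool, int, float and str; `TypedConfigParser`, the section and the option names are hypothetical, not the real PeekabooConfigParser.

```python
import configparser


class TypedConfigParser(configparser.ConfigParser):
    """Hypothetical parser that picks a getter from the fallback's type."""

    def get_by_type(self, section, option, fallback, option_type=None):
        # The type may be overridden explicitly, otherwise the default decides.
        if option_type is None:
            option_type = type(fallback)
        getters = {
            bool: self.getboolean,
            int: self.getint,
            float: self.getfloat,
        }
        getter = getters.get(option_type, self.get)
        return getter(section, option, fallback=fallback)


parser = TypedConfigParser()
parser.read_string("[cuckoo]\npoll_interval = 5\nenabled = no\n")
poll_interval = parser.get_by_type("cuckoo", "poll_interval", fallback=10)
enabled = parser.get_by_type("cuckoo", "enabled", fallback=True)
timeout = parser.get_by_type("cuckoo", "timeout", fallback=1.5)
```

Because every call supplies a fallback, a missing option yields the default rather than an exception, which fits the requirement that a default is always provided.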
2019-04-25 | Lower default for in-flight lock staleness | Michael Weiser
Lower the default for when to consider an in-flight lock of another instance as stale from one hour to 15 minutes to speed up recovery at the price of potential redundant analysis in case an instance is legitimately very busy.
2019-04-25 | Detect unknown config sections and options | Michael Weiser
Check for and report unknown configuration sections and options to help the user detect misconfiguration and typos. Add respective logic to the main configuration as well as the ruleset config.
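A check of this kind can be sketched with configparser as follows. The table of known sections and options and the function name `find_unknown` are invented for this example and do not reflect Peekaboo's actual configuration layout.

```python
import configparser

# Hypothetical table of known sections and their options.
KNOWN = {"global": {"user", "group"}, "logging": {"log_level"}}


def find_unknown(parser, known):
    """Return messages for sections and options that are not known."""
    problems = []
    for section in parser.sections():
        if section not in known:
            problems.append("unknown section: %s" % section)
            continue
        for option in parser.options(section):
            if option not in known[section]:
                problems.append("unknown option: %s.%s" % (section, option))
    return problems


parser = configparser.ConfigParser()
parser.read_string(
    "[global]\nuser = peekaboo\nusr = typo\n[loging]\nlog_level = INFO\n"
)
problems = find_unknown(parser, KNOWN)
```

Both the misspelled option `usr` and the misspelled section `loging` are reported instead of being silently ignored.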
2019-04-25 | Validate ruleset config | Michael Weiser
Validate the ruleset configuration at startup to inform the user about misconfiguration and exit immediately instead of giving warnings during seemingly normal operation. This also gives us a chance to pre-compile regexes for more efficient matching later on.
We give rules a new method get_config() which retrieves their configuration. This is called for each rule by the new method validate_config() of the ruleset engine to catch errors. This way the layout and extent of configuration is still completely governed by the rule and we can interview it about its happiness with what's provided in the configuration file.
As an incidental cleanup, merge class PeekabooRulesetConfig into PeekabooRulesetParser because there's nothing left where it could and would need to help the rules with an abstraction of the config file. Also switch class PeekabooConfig to be a subclass of PeekabooConfigParser so it can (potentially) benefit from the list parsing code there. By moving the special log level and by-default-type getters over there as well we end up with nicely generic config classes that can benefit directly from improvements in the configparser module.
Update the test suite to test and use this new functionality. Incidentally, remove the convoluted inheritance-based config testing layout in favour of creating subclasses of the config classes.
2019-04-25 | Allow rules to run to be configured | Michael Weiser
Make the list of rules to run a section in the ruleset configuration. This allows rules to be reordered and (potentially) run more than once without code changes. This also obsoletes the enabled setting per rule because they're now implicitly enabled if they're listed. Commenting out disables a rule.
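To illustrate the idea, the sketch below reads an ordered rule list from a config section. The `rule.N = name` layout, the rule names and the function `rules_to_run` are assumptions made for this example; the message does not spell out the actual syntax.

```python
import configparser

# Assumed layout: numbered entries in a [rules] section (hypothetical).
SAMPLE = """
[rules]
rule.1 = known
# rule.2 = file_larger_than
rule.3 = expressions
rule.4 = known
"""


def rules_to_run(text):
    """Return rule names in configured order; commented-out lines are skipped."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    items = parser.items("rules")
    items.sort(key=lambda item: int(item[0].split(".")[1]))
    return [name for _, name in items]


rules = rules_to_run(SAMPLE)
```

The commented-out entry is simply absent from the result, and a rule listed twice appears twice, so no separate enabled flag is needed.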
2019-04-25 | Merge pull request #79 from michaelweiser/python3 | Michael Weiser
Python3
2019-04-24 | Remove explicit references to python 2 | Michael Weiser
We no longer care which python version we run under. So remove explicit references to python2 and just let the tools and user choose what they like. The one exception is the interpreter override for embedded mode. Here we change the default to /usr/bin/python2 to force that version for execution of Cuckoo since it doesn't run with python3 yet.
2019-04-24 | Leave DB engine disposal to garbage collector | Michael Weiser
Explicitly disposing of the SQLAlchemy engine in __del__ of PeekabooDatabase causes an exception with python3:

    Exception ignored in: <bound method PeekabooDatabase.__del__ of <peekaboo.db.PeekabooDatabase object at 0x7f7c335ed160>>
    Traceback (most recent call last):
      File "peekaboo/db.py", line 465, in __del__
      File "lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 2014, in dispose
      File "lib/python3.6/site-packages/sqlalchemy/pool/impl.py", line 251, in recreate
      File "<string>", line 2, in __init__
      File "lib/python3.6/site-packages/sqlalchemy/util/deprecations.py", line 130, in warned
      File "lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 221, in __init__
      File "lib/python3.6/site-packages/sqlalchemy/event/base.py", line 150, in _update
      File "lib/python3.6/site-packages/sqlalchemy/event/attr.py", line 367, in _update
      File "lib/python3.6/site-packages/sqlalchemy/event/registry.py", line 115, in _stored_in_collection_multi
    KeyError: (<weakref at 0x7f7c37bd2e58; to 'function' at 0x7f7c335b3378 (go)>,)

Best guess: SQLAlchemy is working with weakrefs (https://docs.python.org/3/library/weakref.html) to get around problems which might prevent proper garbage collection (like cyclic dependencies between objects) and in our case this backfires particularly because we're doing cleanup when garbage collection is already underway.
Doing complex cleanup work in __del__ is not generally recommended, most notably because it is highly uncertain when garbage collection will actually happen. (https://stackoverflow.com/questions/1481488/what-is-the-del-method-how-to-call-it)
Since we've been doing the disposal only on garbage collection anyway and have wrapped the engine inside the PeekabooDatabase object so that new instances also get a new engine, we can just as well rely on garbage collection to also dispose of the engine of a disposed PeekabooDatabase.
If there are problems down the road, an explicit close method would be the way to go if reliable cleanup is required at a specific point in the code.
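The explicit close method mentioned as a possible future direction might be sketched as below. This is not part of the commit; `Database` and `FakeEngine` are hypothetical stand-ins, with `FakeEngine.dispose()` playing the role of the engine's disposal call.

```python
class FakeEngine:
    """Hypothetical stand-in for an engine object with a dispose() method."""

    def __init__(self):
        self.disposed = 0

    def dispose(self):
        self.disposed += 1


class Database:
    """Hypothetical wrapper that cleans up on request instead of in __del__."""

    def __init__(self, engine):
        self._engine = engine

    def close(self):
        # Idempotent: a second call does nothing.
        if self._engine is not None:
            self._engine.dispose()
            self._engine = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False


engine = FakeEngine()
with Database(engine) as db:
    pass  # work with the database here
db.close()  # harmless repeat
```

Cleanup then happens at a well-defined point in the code rather than whenever garbage collection gets around to it.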
2019-04-24 | db: Remove unnecessary else clause | Michael Weiser
Remove an unnecessary else clause for the non-error case to more clearly show standard execution flow.
2019-04-24 | Raise oletools version for python3 compatibility | Michael Weiser
Require at least oletools 0.54 to make sure we're compatible with python3 at runtime.
2019-04-24 | Merge pull request #78 from michaelweiser/prepare-1.7 | Michael Weiser
Prepare release 1.7
2019-04-23 | Bump version to 1.7 and update ASCII logo | Michael Weiser
2019-04-23 | Add a change log | Michael Weiser
Add a log of user-visible changes to alert them to changes that might need looking into on update or might warrant an update in the first place. Update for 1.7 release.
2019-04-17 | Merge pull request #75 from michaelweiser/kill-the-authors | Michael Weiser
Trim in-source authorship info in favour of VCS metadata
2019-04-17 | Merge pull request #74 from michaelweiser/cuckoo-hang | Michael Weiser
Cuckoo hang
2019-04-16 | Trim in-source authorship info in favour of VCS metadata | Michael Weiser
Instead of maintaining ever-growing lists of authors in comments and constants in-source, name a main author where necessary and prudent and rely on VCS metadata instead. Closes #64 (again and terminally)
2019-04-16 | Give threads names | Michael Weiser
Name threads according to function to improve log readability.
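How thread names show up in log output can be seen in the following sketch. The thread name `CuckooJobTracker`, the logger name and the log format are chosen for illustration only.

```python
import io
import logging
import threading

# Capture log output in memory so the effect of the name is visible.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(threadName)s: %(message)s"))
logger = logging.getLogger("thread_name_demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False


def work():
    logger.info("polling")


# Without name=, the log would show a generic name such as Thread-1.
worker = threading.Thread(target=work, name="CuckooJobTracker")
worker.start()
worker.join()
output = stream.getvalue()
```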