Age | Commit message (Collapse) | Author |
|
* Adds ignore to smime rule for p7m and p7c
Now also signed and encrypted messages (smime.p7m) and certificates (smime.p7c)
are ignored.
Before, only signatures (smime.p7s) were ignored.
* Fix TypeError when comparing regex with none value
TypeError: expected string or buffer in expressions.py line 148 when
comparing a regex with an unset name_declared.
Now the result of comparing a regex with None is False.
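The fix can be sketched like this (regex_matches is a hypothetical helper for illustration; the actual change lives in expressions.py):

```python
import re

def regex_matches(pattern, value):
    """Compare a regex against a value that may be None.

    re.search() raises TypeError ("expected string or buffer") when
    handed None, so an unset value now simply counts as no match.
    """
    if value is None:
        return False
    return re.search(pattern, value) is not None

# name_declared may be unset for samples without a declared filename
print(regex_matches(r"\.exe$", None))       # False instead of TypeError
print(regex_matches(r"\.exe$", "mal.exe"))  # True
```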
|
|
Add subcommand scan_file with the same behaviour as scan_file.py had
previously. Moved everything into class PeekabooUtil.
Allow more commands to be added to the API.
The server can now easily be extended to understand and handle new API
requests.
|
|
A RuleResult was returned in case of CuckooSubmitFailedException,
which led to AttributeError: RuleResult instance has no attribute
'score' in expressions.
Now None is returned and handled in the expression rule to return
Result.failed.
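The pattern can be sketched as follows (class and function names are illustrative stand-ins for the Peekaboo internals, not the actual code):

```python
class CuckooSubmitFailedException(Exception):
    """Raised when a sample cannot be submitted to Cuckoo."""

class Result:
    failed = "failed"
    bad = "bad"

def submit_to_cuckoo(sample):
    # stand-in for the real submission, always failing here
    raise CuckooSubmitFailedException("connection refused")

def evaluate(sample):
    """Return the report, or None if submission failed."""
    try:
        return submit_to_cuckoo(sample)
    except CuckooSubmitFailedException:
        return None

def expression_rule(sample):
    report = evaluate(sample)
    if report is None:
        # previously a RuleResult came back here and accessing
        # report.score raised AttributeError
        return Result.failed
    return Result.bad if report.score > 4.0 else Result.failed

print(expression_rule("sample"))  # failed
```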
|
|
The exception raised when loading the cuckoo json report fails is no
longer logged. Instead we now log a warning message.
|
|
CuckooEmbed mode raised OSError if the Cuckoo report file doesn't exist.
|
|
vbaparser.type was compared against a hard-coded list.
OleNotAnOfficeDocumentException has been removed.
The additional test is unnecessary, olevba takes care of it.
|
|
Must be a string, not bytes.
Data from the socket is now correctly decoded as UTF-8.
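The decoding step can be sketched like this, assuming a simple recv loop (read_request is a hypothetical helper, not the actual server code):

```python
import socket

def read_request(sock):
    """Read raw bytes from a socket and decode them as UTF-8.

    Under python3 socket.recv() returns bytes; string operations on
    the payload fail unless it is decoded first.
    """
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:
            break
        chunks.append(data)
    return b"".join(chunks).decode("utf-8")

# demonstration with a local socket pair
client, server = socket.socketpair()
client.sendall("scan-file Übung.eml".encode("utf-8"))
client.close()
print(read_request(server))  # scan-file Übung.eml
server.close()
```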
|
|
* add more default bad signatures
They have been tested for a few days on our side with less traffic,
but I think they are clear-cut enough and don't need extensive
testing.
|
|
The file extension was checked against a list of office file
extensions, preventing analysis of non-office files.
Therefore the oletools module couldn't be used on files with other
extensions, e.g. rtf.
Now any file can be processed with oletools module.
Tests and documentation have been adapted accordingly.
|
|
olereport can now be used in expressions
Before, oletools was only available as a separate rule.
Now has_office_macro and vba_code can be used in generic rule
expressions.
|
|
Expressions can now contain None values.
e.g. variable is None
|
|
* Removes DeprecationWarning in unittests with python3
DeprecationWarning: Please use assertRaisesRegex instead is due to
a change in python3 where these functions have been renamed, but not
in earlier versions. This commit introduces handling python2 and
python3 differently with those functions.
* Fix AttributeError
'IdentifierMissingException' object has no attribute 'message' in ruleset.py
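One common way to bridge the rename is sketched below (not necessarily the exact shim used in the commit):

```python
import unittest

class CompatTestCase(unittest.TestCase):
    """python3 offers assertRaisesRegex, python2 only knows
    assertRaisesRegexp; alias the old name when the new one is
    missing so test code can use assertRaisesRegex everywhere."""
    if not hasattr(unittest.TestCase, "assertRaisesRegex"):
        assertRaisesRegex = unittest.TestCase.assertRaisesRegexp

class ExampleTest(CompatTestCase):
    def test_raises(self):
        with self.assertRaisesRegex(ValueError, "invalid"):
            raise ValueError("invalid literal")
```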
|
|
Add new rule to ignore gpg signatures (Closes #40)
Includes test case
|
|
Generic rules allow access to Sample attributes and Cuckoo report attributes from the ruleset configuration and construct expressions that evaluate to a result.
Implement a parser for generic rules based on pyparsing.
Implement a rule that can use generic logical expressions to categorize
samples.
Smime signatures can now directly be ignored in the ruleset
as a combination of declared name and declared type (closes #83,
closes #42).
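Conceptually a generic rule is an expression over sample attributes that evaluates to a result. The real implementation builds a proper grammar with pyparsing; this sketch only illustrates the idea with a restricted eval(), and the attribute values are made up:

```python
class Sample:
    """Illustrative stand-in for Peekaboo's Sample."""
    def __init__(self, name_declared, type_declared):
        self.name_declared = name_declared
        self.type_declared = type_declared

def evaluate_rule(expression, sample):
    # pyparsing builds a safe parser in the real code; eval() with a
    # stripped namespace only sketches what an expression computes
    return eval(expression, {"__builtins__": {}}, {"sample": sample})

smime = Sample("smime.p7s", "application/pkcs7-signature")
rule = ("sample.name_declared == 'smime.p7s' and "
        "sample.type_declared == 'application/pkcs7-signature'")
print(evaluate_rule(rule, smime))  # True
```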
|
|
suspicious keywords (#87)
The new rule is deactivated by default; it uses oletools to check for
suspicious keywords in the macro code of office documents.
Ole is now an analyser module inside the toolbox, much like Cuckoo.
All logic has been moved there. Sample is merely there for caching.
Evaluation of rules uses the toolbox and reports back.
Regex-based matching of MS office files for configurable keywords.
Also, detection of macros has been improved.
Tests for correct handling of non-office file extensions.
Tests for correct handling of an empty file with correct extension.
Tests for correct detection of an office file with a suspicious macro.
Tests for correct pass of a blank office document.
Tests for correct handling of an empty word doc.
Tests for correct non-detection of an Excel file with a macro.
|
|
Add support for Cuckoo API Authentication Bearer Token
|
|
Since version 2.0.7 the Cuckoo API supports authentication via a bearer
auth token set in the cuckoo configuration and passed to the API with every
request in the HTTP header. Add support for this mechanism to our cuckoo
module and a new configuration option to specify the token.
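With the standard library alone, attaching the token could look like this (URL and token value are placeholders; the real module builds on its own request helpers):

```python
from urllib.request import Request

def authorized_request(url, api_token=None):
    """Build a request for the Cuckoo REST API, adding the bearer
    token (supported since Cuckoo 2.0.7) if one is configured."""
    request = Request(url)
    if api_token:
        request.add_header("Authorization", "Bearer %s" % api_token)
    return request

req = authorized_request("http://127.0.0.1:8090/tasks/list",
                         api_token="s3cret")
print(req.get_header("Authorization"))  # Bearer s3cret
```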
|
|
Deprecate installation using setup.py
|
|
Document that installation using setup.py is not a good idea.
Closes #80.
|
|
Augment distribution
|
|
Document how we test PyPI interaction using devpi.
|
|
Add mention to the documentation that peekaboo now installs sample
peekaboo.conf and ruleset.conf files.
|
|
Add startup order relative to cuckoo-api to systemd unit. It's not a
hard requirement (no Requires=) but only ordering so that if a cuckoo-api
service exists, we're started after it in case we're running in api mode
and will actually try to talk to it.
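The relevant unit directives could look roughly like this (service and unit names assumed from the description):

```ini
[Unit]
Description=Peekaboo email analysis daemon
# ordering only: if a cuckoo-api service exists, start after it so
# api mode can talk to it; no Requires=, so it is not a hard
# dependency and Peekaboo still starts without it
After=network.target cuckoo-api.service
```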
|
|
Add ruleset.conf.sample to the distribution and configure setuptools to
install it together with the README, peekaboo.conf.sample and our
systemd unit into an fhs-like documentation directory.
Add a number of comments to setup.py so we remember why this seems to be
the least disruptive way out of the limited options presented by the
python/setuptools/distutils/pip mechanics of providing these files at a
location visible to users and still somewhat manageable by alternative
packagers (e.g. Linux distributions).
In short again:
- package_data would hide the files away below the site-packages
directory making them invisible to the user and hard(er) to find
programmatically
- absolute paths really don't work well with data_files
- relative paths work somewhat better but still not fully consistently
between setup.py and pip
- overriding the install command solves the installation but not the
packaging problem
|
|
Orig filename
|
|
Update the change log with original filename and REST API robustness
details.
|
|
Downloading the full list of jobs gets less and less efficient as the
number of current and past jobs increases. There is no way to filter
down to specific job IDs. The limit and offset parameters of the list
action of the API cannot be used to achieve a similar effect because the
job list is not sorted by job/task ID and the parameters seem only meant
to iterate over the whole list in blocks, not to extract specific jobs
from it.
The previous logic of determining the highest job ID at startup and
requesting the next million entries from that offset on was therefore
likely not working as expected and making us "blind" to status changes
of jobs which end up below our offset in the job list.
This change adjusts the CuckooAPI to make use of the list of running
jobs we've had for some time now to work around this. Instead of getting
a list of all jobs starting from the highest job id we saw at startup we
just get each job's status individually. While this makes for more
requests, it should over a longer runtime make for less network traffic
and reliably get us the data we need about our jobs.
Also, turn the shutdown_requested flag into an event so we can use its
wait() method to also implement the poll interval and get immediate
reaction to a shutdown request.
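The event-based poll loop can be sketched like this (class and method names are illustrative, not the actual CuckooAPI code):

```python
import threading

class JobTracker:
    """Sketch: a threading.Event doubles as shutdown flag and
    poll-interval timer through its wait() method."""

    def __init__(self, poll_interval):
        self.poll_interval = poll_interval
        self.shutdown_requested = threading.Event()
        self.polls = 0

    def run(self):
        # wait() returns True as soon as the event is set, so a
        # shutdown request interrupts the sleep immediately instead
        # of waiting out the full poll interval
        while not self.shutdown_requested.wait(self.poll_interval):
            self.polls += 1  # stand-in for per-job status requests

    def shut_down(self):
        self.shutdown_requested.set()

tracker = JobTracker(poll_interval=0.01)
worker = threading.Thread(target=tracker.run)
worker.start()
threading.Timer(0.1, tracker.shut_down).start()
worker.join()
print(tracker.polls > 0)  # True
```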
Finally, we switch to endless retrying of failed job status requests
paired with the individual request retry logic introduced earlier. On
submission we still fail the submission process after timeouts or
retries on the assumption that without the job being submitted to
Cuckoo, our feedback to the client that analysis failed will cause it to
resubmit and still avoid duplicates.
Closes #43.
|
|
When using the REST API, submit the sample with its original filename if
available via the new name_declared (meta info) property.
Closes #81 and #82 when using api mode. No plans to add this to embed
mode as well since it's deprecated anyway.
|
|
Switch parent method calling to python3 no-argument version of super()
as we've done with our WhitelistRetry before.
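For illustration, with hypothetical class names:

```python
class Rule:
    def __init__(self, config):
        self.config = config

class OleRule(Rule):
    def __init__(self, config):
        # python2 style would be: super(OleRule, self).__init__(config)
        # the python3 no-argument form fills in class and instance itself
        super().__init__(config)

rule = OleRule({"enabled": True})
print(rule.config)  # {'enabled': True}
```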
|
|
Use a session configured with various retry parameters and a special
whitelisting retry class to do our retries for us. This offloads all kinds
of error handling we've not even thought of yet to urllib3 while still
staying highly customizable. This also gives us a nice backoff algorithm
when retrying which we only need to parametrize.
While at it, unroll the __get() method into the one user of POST
(submit()) to make it more readable.
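A sketch of such a session, assuming requests/urllib3 and illustrative parameter values (the actual WhitelistRetry and its whitelist logic may differ):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

class WhitelistRetry(Retry):
    """Retry subclass that additionally retries POST requests on
    whitelisted status codes, which urllib3 avoids by default
    because POST is not idempotent."""
    def is_retry(self, method, status_code, has_retry_after=False):
        if method.upper() == "POST" and status_code in self.status_forcelist:
            return True
        return super().is_retry(method, status_code, has_retry_after)

def make_retry_session(retries=5, backoff=0.5):
    # backoff_factor gives exponentially growing pauses between tries
    retry = WhitelistRetry(total=retries,
                           backoff_factor=backoff,
                           status_forcelist=frozenset([500, 502, 503, 504]))
    session = requests.Session()
    adapter = HTTPAdapter(max_retries=retry)
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session
```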
|
|
Ruleset rework
|
|
Update the change log with ruleset rework details.
|
|
python3 changes the type of dict().keys() from list to dict_keys which
can not be concatenated with a list easily. Options are in-place
concatenation using += and explicit cast using list(dict().keys()). We
choose the former.
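Both options side by side (dict contents are made up):

```python
defaults = {"max_age": 900, "base_dir": "/var/lib/peekaboo"}
extra = ["log_level"]

# explicit cast: behaves the same under python2 and python3
options = list(defaults.keys()) + extra

# in-place concatenation (the variant chosen here): += accepts any
# iterable, including the dict_keys view python3 returns
inplace = ["log_level"]
inplace += defaults.keys()

print(sorted(options) == sorted(inplace))  # True
```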
|
|
TypeError messages use quotes in python3 which need to be matched in our
test.
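For example, matching the quoted python3 wording requires escaping the quotes (the triggering expression here is only illustrative):

```python
import re
import unittest

class QuoteTest(unittest.TestCase):
    def test_concat_error_message(self):
        # python3 wording: can only concatenate list (not "dict_keys")
        # to list -- the quotes must survive into the regex
        expected = re.escape(
            'can only concatenate list (not "dict_keys") to list')
        with self.assertRaisesRegex(TypeError, expected):
            [] + {}.keys()
```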
|
|
Extend PeekabooConfigParser's get_by_default_type (and rename to
get_by_type) to also support lists, tuples, floats, loglevels and
relists (lists of regular expressions). Use this to eliminate multiple
wrapper methods from the Rule base class and instead access config
values based on the type of their default value, as is already done with
the main configuration.
Log levels and relists still lack distinct types for automatic
detection. Instead we allow to and do override their type explicitly on
access.
Eliminate exception catching from get_by_type() because it's
inconsistent with the other getters. If not avoided by providing
fallback values or only accessing known-to-exist settings they should be
caught elsewhere. For this reason the signature of
Rule.get_config_value() enforces provision of a default.
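A simplified sketch of the idea: dispatch the getter by the type of the fallback value, with an explicit override for types that have no distinct python representation (the list syntax is assumed comma-separated here; the real parser may differ):

```python
import configparser

def get_by_type(parser, section, option, fallback, option_type=None):
    """Pick the getter from the fallback's type unless an explicit
    override is given (needed e.g. for log levels and relists)."""
    option_type = option_type or type(fallback)
    if option_type is bool:
        return parser.getboolean(section, option, fallback=fallback)
    if option_type is int:
        return parser.getint(section, option, fallback=fallback)
    if option_type is float:
        return parser.getfloat(section, option, fallback=fallback)
    if option_type is list:
        value = parser.get(section, option, fallback=None)
        if value is None:
            return fallback
        return [part.strip() for part in value.split(",")]
    return parser.get(section, option, fallback=fallback)

config = configparser.ConfigParser()
config.read_string("[rule]\nmax_size = 1024\nextensions = exe, dll\n")
print(get_by_type(config, "rule", "max_size", 50))    # 1024
print(get_by_type(config, "rule", "extensions", []))  # ['exe', 'dll']
print(get_by_type(config, "rule", "missing", 3.5))    # 3.5
```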
|
|
Lower the default for when to consider an in-flight lock of another
instance as stale from one hour to 15 minutes to speed up recovery at
the price of potential redundant analysis in case an instance is
legitimately very busy.
|
|
Check for and report unknown configuration sections and options to help
the user detect misconfiguration and typos. Add respective logic to the
main configuration as well as the ruleset config.
|
|
Validate the ruleset configuration at startup to inform the user about
misconfiguration and exit immediately instead of giving warnings during
seemingly normal operation. This also gives us a chance to pre-compile
regexes for more efficient matching later on.
We give rules a new method get_config() which retrieves their
configuration. This is called for each rule by the new method
validate_config() of the ruleset engine to catch errors. This way the
layout and extent of configuration is still completely governed by the
rule and we can interview it about its happiness with what's provided in
the configuration file.
As an incidental cleanup, merge class PeekabooRulesetConfig into
PeekabooRulesetParser because there's nothing left where it could and
would need to help the rules with an abstraction of the config file.
Also switch class PeekabooConfig to be a subclass of
PeekabooConfigParser so it can (potentially) benefit from the list
parsing code there. By moving the special log level and by-default-type
getters over there as well we end up with nicely generic config classes
that can benefit directly from improvements in the configparser module.
Update the test suite to test and use this new functionality.
Incidentally, remove the convoluted inheritance-based config testing
layout in favour of creating subclasses of the config classes.
|
|
Make the list of rules to run a section in the ruleset configuration.
This allows rules to be reordered and (potentially) run more than once
without code changes.
This also obsoletes the enabled setting per rule because they're now
implicitly enabled if they're listed. Commenting out disables a rule.
|
|
Python3
|
|
We no longer care which python version we run under. So remove explicit
references to python2 and just let the tools and user choose what they
like.
The one exception is the interpreter override for embedded mode. Here we
change the default to /usr/bin/python2 to force that version for
execution of Cuckoo since it doesn't run with python3 yet.
|
|
Explicitly disposing of the SQLAlchemy engine in __del__ of
PeekabooDatabase causes an exception with python3:
Exception ignored in: <bound method PeekabooDatabase.__del__ of
<peekaboo.db.PeekabooDatabase object at 0x7f7c335ed160>>
Traceback (most recent call last):
File "peekaboo/db.py", line 465, in __del__
File "lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 2014, in dispose
File "lib/python3.6/site-packages/sqlalchemy/pool/impl.py", line 251, in recreate
File "<string>", line 2, in __init__
File "lib/python3.6/site-packages/sqlalchemy/util/deprecations.py", line 130, in warned
File "lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 221, in __init__
File "lib/python3.6/site-packages/sqlalchemy/event/base.py", line 150, in _update
File "lib/python3.6/site-packages/sqlalchemy/event/attr.py", line 367, in _update
File "lib/python3.6/site-packages/sqlalchemy/event/registry.py", line 115, in _stored_in_collection_multi
KeyError: (<weakref at 0x7f7c37bd2e58; to 'function' at 0x7f7c335b3378 (go)>,)
Best guess: SQLAlchemy is working with weakrefs
(https://docs.python.org/3/library/weakref.html) to get around problems
which might prevent proper garbage collection (like cyclic dependencies
between objects) and in our case this backfires particularly because
we're doing cleanup when garbage collection is already underway.
Doing complex cleanup work in __del__ is not generally recommended, most
notably because it is highly uncertain when garbage collection will
actually happen.
(https://stackoverflow.com/questions/1481488/what-is-the-del-method-how-to-call-it)
Since we've been doing the disposal only on garbage collection anyway
and have wrapped the engine inside the PeekabooDatabase object so that
new instances also get a new engine, we can just as well also rely on
garbage collection to also dispose of the engine of a disposed
PeekabooDatabase.
If there are problems down the road, an explicit close method would be
the way to go if reliable cleanup is required at a specific point in the
code.
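Such an explicit close method might look like this (sketched with a fake engine; not part of the actual change, which simply drops the __del__ disposal):

```python
class PeekabooDatabase:
    """Sketch only: no disposal in __del__; close() gives callers a
    deterministic cleanup point if one is ever needed."""

    def __init__(self, engine):
        self.engine = engine

    def close(self):
        # idempotent: safe to call more than once
        if self.engine is not None:
            self.engine.dispose()
            self.engine = None

class FakeEngine:
    """Stand-in for an SQLAlchemy engine."""
    def __init__(self):
        self.disposed = False

    def dispose(self):
        self.disposed = True

engine = FakeEngine()
database = PeekabooDatabase(engine)
database.close()
database.close()  # no error on repeated close
print(engine.disposed)  # True
```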
|
|
Remove an unnecessary else clause for the non-error case to more clearly
show standard execution flow.
|
|
Require at least oletools 0.54 to make sure we're compatible with
python3 at runtime.
|
|
Prepare release 1.7
|
|
|
|
Add a log of user-visible changes to alert users to changes that might
need looking into on update or might warrant an update in the first
place.
Update for 1.7 release.
|
|
Trim in-source authorship info in favour of VCS metadata
|
|
Cuckoo hang
|
|
Instead of maintaining ever-growing lists of authors in comments and
constants in-source, name a main author where necessary and prudent and
rely on VCS metadata instead.
Closes #64 (again and terminally)
|