|
This caused the program never to return if the running jobs resulted in an error
and no artifact was sent to the parent - which caused the tokio::join!() to
never return, thus the futures never to be polled, and thus the whole program to
sleep in a strange state that looked like some filesystem operations did not
return.
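A minimal sketch (stand-in types and channel wiring, not butido's actual
code) of the hang: if the failing job keeps its sender alive without ever
sending, the receiving end of tokio::join!() can never complete.

```rust
use std::time::Duration;
use tokio::sync::mpsc;

#[tokio::main]
async fn main() {
    let (tx, mut rx) = mpsc::channel::<u32>(1);

    // Simulated failing job: it errors out and never sends an artifact,
    // but the sender stays alive, so the receiver never observes a close.
    let producer = async move {
        let _keep_alive = tx;
        std::future::pending::<()>().await
    };
    let consumer = async move { rx.recv().await };

    // Without the timeout, this join!() would never return - the state
    // this commit describes.
    let hung = tokio::time::timeout(Duration::from_secs(1), async {
        tokio::join!(producer, consumer)
    })
    .await;
    assert!(hung.is_err());
}
```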
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
Tested-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Because tokio 1.0 does not ship the Stream trait anymore, this patch also
introduces tokio_stream as a new dependency.
For more information, see:
https://docs.rs/tokio/1.0.3/tokio/stream/index.html
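A minimal sketch of the new import path, assuming the tokio_stream crate's
iter()/StreamExt API:

```rust
// The Stream trait and its adapters now come from the tokio-stream crate.
use tokio_stream::{self as stream, StreamExt};

#[tokio::main]
async fn main() {
    let mut numbers = stream::iter(vec![1, 2, 3]);
    while let Some(n) = numbers.next().await {
        println!("{n}");
    }
}
```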
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
Tested-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Before this change, it returned the dbmodels::Artifact objects, for which we
needed to fetch the filestore::Artifact again.
This change removes that indirection (improving runtime, of course).
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
Tested-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
This reimplements the Orchestrator::run() function _again_.
Commit 889649ac16367fe671ce61363bb6ce82531e5a6b was the basis for this work,
improving the baseline so we can take a step further in this commit.
The approach before the change from 889649ac16367fe671ce61363bb6ce82531e5a6b had
one flaw. In the following scenario:
    A
   / \
  B   E
 / \   \
C   D   F
The nodes C, D and F are selected and then run.
After they all succeed, the next iteration is computed, which yields that B
and E can be built.
But if F takes extremely long, B and E both have to wait until it is done
(because that's how the implementation works), although B could be built as
soon as C and D are done.
This patch changes the implementation to the following:
1. For each job, there is a task.
2. The task has a channel on which it receives results from its dependencies.
In above example, B would receive the results of the job runs for C and D,
and E would receive the result from the job run of F.
3. The task also has a sender where it can send its resulting artifacts to a
parent task.
The task _also_ sends the results of its children. This way we propagate the
built artifacts up to the root node.
All these tasks are started concurrently.
The "root" task sends the result to the orchestrator.
The task itself is responsible for sending the job to the scheduler and
processing the result.
If the job errored, the task sends that error to its parent.
If a child errored, the task aborts its own work and propagates that error.
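A minimal sketch (stand-in types and a stubbed job run, not the actual
Orchestrator code) of this per-job task wiring, shown for the B <- {C, D}
part of the tree above:

```rust
use tokio::sync::mpsc::{channel, Receiver, Sender};

type Artifact = String; // stand-in for the real artifact type
type JobResult = Result<Vec<Artifact>, String>;

async fn job_task(
    name: &'static str,
    mut from_children: Receiver<JobResult>,
    n_children: usize,
    to_parent: Sender<JobResult>,
) {
    // Collect the results of all dependencies first.
    let mut artifacts = Vec::new();
    for _ in 0..n_children {
        match from_children.recv().await {
            Some(Ok(a)) => artifacts.extend(a),
            Some(Err(e)) => {
                // A child errored: abort our own work, propagate the error.
                let _ = to_parent.send(Err(e)).await;
                return;
            }
            None => return, // all senders dropped unexpectedly
        }
    }
    // "Run" the job (stubbed here), then send our artifacts _plus_ the
    // children's artifacts upwards, so results propagate to the root.
    artifacts.push(format!("artifact-of-{name}"));
    let _ = to_parent.send(Ok(artifacts)).await;
}

#[tokio::main]
async fn main() {
    let (to_b, b_rx) = channel(16);
    let (to_root, mut root_rx) = channel(16);

    // C and D are leaves: their receivers are created but never used.
    let (_c_tx, c_rx) = channel::<JobResult>(1);
    let (_d_tx, d_rx) = channel::<JobResult>(1);

    tokio::join!(
        job_task("C", c_rx, 0, to_b.clone()),
        job_task("D", d_rx, 0, to_b.clone()),
        job_task("B", b_rx, 2, to_root),
    );
    // The "root" task's result arrives at the orchestrator.
    println!("{:?}", root_rx.recv().await);
}
```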
What does not yet work in this commit:
* Artifacts that were built before the error occurred are not reported yet.
* The staging/release stores may contain artifacts that can be re-used.
They are completely ignored for now.
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
Tested-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch reimplements the running of the computed jobs.
The old implementation was structured as follows:
1. Compute a Tree of dependencies for the requested package
2. Make sets of this tree (see below)
3. For each set
3.1. Run set in parallel by submitting each job in the set to the scheduler
3.2. collect outputs and errors
3.3. Record outputs and return errors (if any)
The complexity here was not only the computation of the JobSets, but also the
running of each job in a set in parallel.
The code was non-trivial to understand.
But that's not even the biggest concern with this approach.
Consider the following tree of jobs:
        A
       / \
      B   E
     / \   \
    C   D   F
           / \
          G   H
               \
                I
Each node here represents a package, the edges represent dependencies on the
lower-hanging package.
This tree would result in 5 sets of jobs:
[
  [ I ]
  [ G, H ]
  [ C, D, F ]
  [ B, E ]
  [ A ]
]
because each "layer" in the tree would be run in parallel.
It can easily be seen that, in the tree above, the jobs for [ I, G, D, C ]
can be run in parallel right away, because they do not have dependencies -
but the set-based approach cannot start them all at once, because they live
in different "layers".
The reimplementation has another (crucial) benefit: The implementation does
not depend on a structure of artifact path names anymore.
Before, the artifacts needed to have a name as follows:
<name of the package>-<version of the package>.<something>
which was extremely restrictive.
With the changes from this patch, the implementation does not depend on such
a format anymore.
Instead, dependencies are associated with a job via the outputs of the jobs
run for the dependency packages.
That means that, considering the above tree of packages:
deps_of(B) = outputs_of(job_for(C)) + outputs_of(job_for(D))
in text:
The dependencies of package B are the outputs of the job run for package C
plus the outputs of the job run for package D.
With that change in place, a job run for a package can yield output files
with arbitrary names, and as long as the build script for the package can
process them, everything is fine.
The new algorithm that solves this issue is rather simple:
1. Hold a list of errors
2. Hold a list of artifacts that were built
3. Hold a list of jobs that were run
4. Iterate over all jobs, filtered by
- If the job appears in the "already run jobs" list, ignore it
- If a job has dependencies (on outputs of other jobs) that do not appear in
the "already run jobs", ignore it (for now)
5. Run these jobs, and for each job:
5.1. Take the job UUID and put it in the "already run jobs" list.
5.2. Take the result of the job,
5.2.1. if it is an error, put it in the "list of errors"
5.2.2. if it is ok, put the artifact in the "list of artifacts"
6. if the list of errors is not empty, goto 9
7. if all jobs are in the "already run jobs" list, goto 9
8. goto 4
9. return all artifacts and all errors
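A minimal sketch of this loop (stand-in types; jobs run sequentially here,
whereas the real implementation runs step 5 in parallel):

```rust
use std::collections::HashSet;

type JobId = u64;

struct Job {
    id: JobId,
    depends_on: Vec<JobId>, // UUIDs of the jobs whose outputs we need
}

fn run_all(jobs: &[Job]) -> (Vec<String>, Vec<String>) {
    let mut errors: Vec<String> = Vec::new();      // 1. list of errors
    let mut artifacts: Vec<String> = Vec::new();   // 2. list of built artifacts
    let mut done: HashSet<JobId> = HashSet::new(); // 3. "already run jobs" list

    while errors.is_empty() && done.len() < jobs.len() { // 6./7./8.
        // 4. ignore jobs that already ran or whose dependencies did not run yet
        let runnable: Vec<&Job> = jobs
            .iter()
            .filter(|j| !done.contains(&j.id))
            .filter(|j| j.depends_on.iter().all(|d| done.contains(d)))
            .collect();
        if runnable.is_empty() {
            break; // nothing can make progress (e.g. a dependency cycle)
        }

        for job in runnable { // 5. (sequential here; parallel in the real code)
            done.insert(job.id); // 5.1.
            match run_single_job(job) { // 5.2.
                Err(e) => errors.push(e),     // 5.2.1.
                Ok(a) => artifacts.extend(a), // 5.2.2.
            }
        }
    }
    (artifacts, errors) // 9. return all artifacts and all errors
}

fn run_single_job(job: &Job) -> Result<Vec<String>, String> {
    Ok(vec![format!("artifact-of-job-{}", job.id)]) // stub
}
```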
Because this approach is fundamentally different than the previous approach, a
lot of things had to be rewritten:
- The `JobSet` type was completely removed
- There is a new type `crate::job::Tree` that gets built from the
`crate::package::Tree`.
It is a mapping of a UUID (the job UUID) to a `JobDefinition`.
The `JobDefinition` type is
- A Job
- A list of UUIDs of other jobs whose outputs this job depends on
It is therefore a mapping of `Job -> outputs(jobs_of(dependencies))`
The `crate::job::Tree` type is now responsible for building a `Job` object for
each `crate::package::Package` object from the `crate::package::Tree` object.
Because the `crate::package::Tree` object contains all packages required for
the complete build, the implementation of `crate::job::Tree::build_tree()`
does not perform sanity checks.
It is assumed that the input tree to the function contains all mappings.
Despite the name `crate::job::Tree` ("Tree"), the actual structure stored in
the type is not a real tree.
- The `MergedStores::get_artifact_by_path()` function was adapted because the
previous implementation used `StagingStore::load_from_path()`, which tried
to load the file from the filesystem and put it into the internal map, and
which failed if the file was already there.
The adapted version checks whether the artifact already exists in the
internal map and returns that object instead.
(Accordingly for the release store.)
- The interface of the `RunnableJob::build_from_job()` function was adapted, as
this function does not need to access the `MergedStores` object anymore to
load dependency-Artifacts from the filesystem.
Instead, these Artifacts are passed to the function now.
- The Orchestrator code
- Got a type alias `JobResult`, which represents the result of a job run and
is either
- A number of artifacts (for optimization reasons, with their associated
database artifact entry)
- or an error with the UUID of the job that failed (again, for optimization
reasons)
- Got an implementation of the algorithm described above
- Got a new implementation of run_job(), which
- Fetches the paths of dependency artifacts from the database by using
the job uuids from the JobDefinition object
- Creates the RunnableJob object for that
- Schedules the RunnableJob object in the scheduler
- For each output artifact (the database object representing it),
- gets the filesystem Artifact object for it
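A sketch of what that `JobResult` alias could look like (all types here are
stand-ins for the real ones named above):

```rust
// Stand-ins for the real types referenced in the text.
struct FsArtifact; // filestore::Artifact
struct DbArtifact; // dbmodels::Artifact
struct JobUuid;    // the job's UUID
struct Error;      // the error type

// Ok: the built artifacts, each paired with its database entry (so callers
// do not have to fetch it again).
// Err: the error together with the UUID of the job that failed.
type JobResult = Result<Vec<(FsArtifact, DbArtifact)>, (JobUuid, Error)>;
```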
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
Tested-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch adds more verbose error reporting in case of a build error.
It alters the Orchestrator::run() interface to return the Job UUID
alongside the error object.
The UUID object can then be (and is) used in the "build" subcommand
implementation to fetch information about the failed job from the
database and print it to the user.
The number of log lines is configurable.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This changes the Orchestrator interface to return a Vec of errors which
happened during the container run, rather than just reporting "something
errored".
This way, we can use the build subcommand implementation to do the
reporting instead of the Orchestrator implementation, which is way
cleaner (especially for future features).
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch rewrites the container error reporting to be simpler.
The ContainerError type was rewritten to not wrap other errors anymore,
but be the root error itself.
It only has one variant now, because there's only one kind of error: "It
didn't work".
The reporting in the calling functions can now use anyhow::Result<_>
instead of std::result::Result because of that.
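A hypothetical sketch of such a single-variant root error (the field name and
message are assumptions, and whether butido uses the thiserror crate here is
one too):

```rust
use thiserror::Error;

// Only one kind of error: the container did not exit successfully.
#[derive(Debug, Error)]
#[error("container {container_id} did not exit with success")]
pub struct ContainerError {
    pub container_id: String,
}

// Calling code can now use anyhow::Result<_> directly:
fn report(result: anyhow::Result<()>) {
    if let Err(e) = result {
        eprintln!("build failed: {e:#}");
    }
}
```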
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Apparently, this fixes the rendering bug we had with indicatif.
The issue was that we called `indicatif::ProgressBar::set_message()`
before the bar was added to the `indicatif::MultiProgress` object.
This caused the bar to be rendered, and as soon as we added it to the
MultiProgress object and re-called set_message(), it was rendered again.
This is of course a bug / inconvenience in indicatif.
Either way, the issue was solved by not calling `set_message()` in our
`ProgressBars` helper object, but only returning a preconfigured bar
object.
Because not calling the set_message() function renders the whole bunch of
helper functions unnecessary, these were removed and the interface was
boiled down to `pub fn ProgressBars::bar(&self) -> indicatif::ProgressBar`,
which in turn results in a few modifications all over the place.
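A sketch of the boiled-down helper (the stored style and bar length are
assumptions):

```rust
use indicatif::{ProgressBar, ProgressStyle};

pub struct ProgressBars {
    style: ProgressStyle,
}

impl ProgressBars {
    // Hand out a preconfigured bar. set_message() is deliberately not
    // called here; the caller does that after adding the bar to the
    // indicatif::MultiProgress object.
    pub fn bar(&self) -> ProgressBar {
        let bar = ProgressBar::new(100);
        bar.set_style(self.style.clone());
        bar
    }
}
```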
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
... and into orchestrator implementation.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This reduces the runtime, because the MergedStores object does not have
to be created all the time; also, fewer parameters are passed to the
run_jobset() function.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Diesel is an exception here, because the generated src/schema.rs file
does not automatically contain the necessary imports.
All imports were added where necessary.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch removes the passing around of additional environment
variables that were specified on the commandline and adds them directly
to the Job object instance upon creation.
This does not result in a net loss of code, but in a net loss of
complexity.
For this to be possible, we had to derive Clone for `JobResource`, which
we have to clone when creating the `Job` objects during the creation of
the jobsets from the `Tree` object.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch adds strict script interpolation, which means that the script
interpolation will result in an error if a variable is referenced that
does not exist.
Before this patch, referencing an absent variable resulted in an empty
string, possibly causing an error at runtime.
This feature is on by default.
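A minimal sketch of what strict interpolation means, assuming a
handlebars-style templating engine (whether butido uses the handlebars crate
for this is an assumption):

```rust
use handlebars::Handlebars;
use serde_json::json;

fn main() {
    let mut hb = Handlebars::new();
    hb.set_strict_mode(true); // referencing absent variables now errors

    let data = json!({ "present": "value" });
    assert!(hb.render_template("{{present}}", &data).is_ok());
    assert!(hb.render_template("{{absent}}", &data).is_err());
}
```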
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch cleans up the imports: it removes unused ones and moves imports,
wherever possible, to the outer scope.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch adds the code to pass the additional environment to the
container job.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch implements error reporting if a container job did not end
successfully.
It does so by adding an error type `ContainerError`, which is either an
error that describes that a container did not exit with success, or an
anyhow::Error (that describes an error from the container management
code).
The log-aggregation algorithm is now intercepted to catch any
exit-state log items.
If there is no exit-state from the container (no line with
"#BUTIDO:STATE:..."), no error is assumed.
A warning could be emitted here later on.
The state aggregated this way is then passed up to the orchestrator, which
then collects the errors and prints them.
If the implementation is correct (which is not tested yet, because this
is rather difficult to test), all other containers should continue
running until they are finished, before the errors are handled.
The code responsible for this (in the Orchestrator implementation) was
adapted to not stop collecting at the first error, but to collect everything
and then check for errors.
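A minimal sketch of intercepting such exit-state log items (the
"#BUTIDO:STATE:" marker is taken from this commit message; the state values
and parsing details are assumptions):

```rust
// Scan aggregated log lines for an exit-state marker. No marker means
// no error is assumed (a warning could be emitted here later on).
fn exit_state(log: &str) -> Option<Result<(), String>> {
    log.lines()
        .find_map(|line| line.strip_prefix("#BUTIDO:STATE:"))
        .map(|state| {
            let state = state.trim();
            if state == "OK" { // hypothetical success value
                Ok(())
            } else {
                Err(state.to_string())
            }
        })
}

fn main() {
    assert_eq!(exit_state("building...\n#BUTIDO:STATE:OK"), Some(Ok(())));
    assert_eq!(exit_state("no marker at all"), None);
}
```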
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch rewrites the codebase to not use std::sync::RwLock, but
tokio::sync::RwLock.
tokio's RwLock is an async RwLock, which is what we want in an
async-await context. The more I use tokio, the more I understand what
you should do and what you shouldn't do.
Some parts of this patch are a rewrite, for example,
JobSet::into_runables()
was completely rewritten.
That was necessary because the function used inside is
`RunnableJob::build_from_job()`, which uses an RwLock internally and thus
becomes `async` in this patch.
Because transforming the function to return an iterator of futures is way
more difficult, this patch simply rewrites `JobSet::into_runables()` to
return a `Result<Vec<RunnableJob>>` instead.
Internally, the jobs are now submitted to tokio via a
`futures::stream::FuturesUnordered<_>`.
This is not the most performant implementation for the problem at hand,
but it is a reasonably simple one. Optimization could happen here, of
course.
Also, the implementation of resource preparation inside
`RunnableJob::build_from_job()` got a rewrite using the same technique.
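A minimal sketch of that technique (stand-in types, hypothetical function
names, not butido's actual signatures):

```rust
use futures::stream::{FuturesUnordered, StreamExt};

struct Job;
struct RunnableJob;

// Stand-in for RunnableJob::build_from_job(), now async because it takes
// async locks internally.
async fn build_from_job(_job: Job) -> Result<RunnableJob, String> {
    Ok(RunnableJob)
}

// Stand-in for JobSet::into_runables(): drive all conversions through a
// FuturesUnordered (results arrive in completion order, not submission
// order) and collect them into a Result<Vec<_>>.
async fn into_runnables(jobs: Vec<Job>) -> Result<Vec<RunnableJob>, String> {
    jobs.into_iter()
        .map(build_from_job)
        .collect::<FuturesUnordered<_>>()
        .collect::<Vec<_>>()
        .await
        .into_iter()
        .collect()
}
```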
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This needs to be done to prepare for one thing:
We need to be able to call `multibar.join()` (which blocks the current
thread) _while_ running the jobs on the scheduler.
That's ugly, but that's the way indicatif works.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This is needed because we need to refer to the submit when creating a
Job entry in the database.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This renames the method, because this is way more descriptive of what
the method actually does.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch moves the log-setup and -collecting to the endpoint
implementation, that takes care of running the job.
This is required because this is where the log is aggregated and where we
know all the information that needs to be written to the database, so it
seems natural to move the log-collecting code (and everything else related
to the database entry for the job) to this position in the code.
This implies that the "job" is written to the database _after_ it was
run.
This is due to the fact that we cannot write the job entry before we
have the logs.
We _have to_ collect the logs in memory (we can also write to a
log-dump-file, of course), but we cannot create the job object in the
database and then append the logs to it.
This might change in the future, but for now that's the way it is done.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|