|
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch rewrites the unpacking of the tar from the stream so that the
unpacking function returns the written paths, and therefore we don't have to
pass over the tar twice.
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
The issue here is that we copy all build results (packages) in the container to
/outputs and then butido uses that directory to fetch the outputs of the build.
But, because of how the docker API works, we get a TAR stream from docker that
_contains_ the /outputs directory. But of course, we don't want that.
Until now, that was no issue. But it has become one now that we start adopting
butido for our real-world scenarios.
This patch adds filtering out that /outputs portion of the paths from the tar
archive when writing all the things to disk.
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
Tested-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
And replace it with a getset::Getters implementation.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Because passing by value is simply not necessary here.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
|
|
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
The concept of the MergedStores type was okay in the beginning, but it got more
and more complex to use properly and most of the time, we used the
release/staging stores directly anyway.
So this removes the MergedStores type, which is a preparation for the change to
have multiple release stores.
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
This patch changes the code so that the MergedStores object is known in the
endpoints and job execution code.
This is necessary, because the jobs must be able to load artifacts from the
release store as well, for example if a library was released in another submit
and we can reuse it because the script for that library didn't change.
For that, the interface of StoreRoot::join() was changed
- -> Result<FullArtifactPath<'a>>
+ -> Result<Option<FullArtifactPath<'a>>>
where it returns an Ok(None) if the artifact requested does not exist.
This was necessary to allow try-joining a path on the store root of the staging store,
and if there is no such file continue with try-joining the path on the release
store.
The calling code got a bit more complex with that change, though.
Next, the MergedStores got a `derive(Clone)` because clone()ing it is cheap
(two `Arc`s) and necessary to pass it easily to the jobs.
Each instance of the code where the staging store was consulted was changed to
consult the release store as well.
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
This patch rewrites the replacement searching algorithm, to try the staging
store first and then the release store.
It does so by sorting the artifacts by whether they are in the staging store or
not (hence the FullArtifactPath::is_in_staging_store() function).
It filters out not-found artifacts and returns only ones that were found in
either store.
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
Tested-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
This patch implements the find-artifact subcommand, which is the basis for
artifact finding.
Right now we have the problem that finding artifacts after the fact is non-trivial.
What we can do, though, is search for them in the database using the same inputs
as the ones that created the artifact.
That means that if we want to find an artifact or multiple artifacts for a
package P in version V, we have to search the database for jobs that built that
package in that version, with the same script and the same variables
(environment, used container image, etc).
So if a package was built with the same script, the same environment variables
and on an image that is, for example, not in the denylist of the package,
chances are good that the artifacts produced by the job for the package are the
ones we search for.
In the `crate::command::find_artifact()` function, results are sorted before
they are printed, so that we preferably print results with a release date.
Env filtering is also implemented, so a user has to provide the appropriate
additional environment variables, as they were submitted for the `build` command
when the artifact was built.
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
Tested-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
This patch follows up on the shrinking of the `Artifact` type and removes it
entirely.
The type is not needed. Only the `ArtifactPath` type is needed, which is a thin
wrapper around `PathBuf`, ensuring that the path is relative to the store root.
The `Artifact` type used `pom` to parse the name and version of the package from
the `ArtifactPath` object it contained, which resulted in the restriction that
the path must always be
<name>-<version>...
Which should not be a requirement and actually caused issues with a package
named "foo-bar" (as an example).
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
Tested-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
This patch removes the `Artifact` type almost entirely.
The type parsed its path to extract name and version.
This information was _never_ needed, only resulting in the restriction that the
path must be parsable (which technically is not required at all).
This patch boils down the `Artifact` type to the absolute minimum, as a baseline
for its complete removal.
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
Tested-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Before that change, it returned the dbmodels::Artifact objects, for which we
needed to fetch the filestore::Artifact again.
This change removes that indirection (improving runtime, of course).
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
Tested-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch reimplements the running of the computed jobs.
The old implementation was structured as follows:
1. Compute a Tree of dependencies for the requested package
2. Make sets of this tree (see below)
3. For each set
3.1. Run set in parallel by submitting each job in the set to the scheduler
3.2. collect outputs and errors
3.3. Record outputs and return errors (if any)
The complexity here was the computing of the JobSets but also the running of
each job in a set in parallel.
The code was non-trivial to understand.
But that's not even the biggest concern with this approach.
Consider the following tree of jobs:
      A
     / \
    B   E
   / \   \
  C   D   F
         / \
        G   H
             \
              I
Each node here represents a package, the edges represent dependencies on the
lower-hanging package.
This tree would result in 5 sets of jobs:
[
  [ I ]
  [ G, H ]
  [ C, D, F ]
  [ B, E ]
  [ A ]
]
because each "layer" in the tree would be run in parallel.
It can easily be seen that in the tree from above, the jobs for [ I, G, D, C ]
can be run in parallel easily, because they do not have dependencies.
The reimplementation also has another (crucial) benefit: The implementation does
not depend on a structure of artifact path names anymore.
Before, the artifacts needed to have a name as follows:
<name of the package>-<version of the package>.<something>
which was extremely restrictive.
With the changes from this patch, the implementation does not depend on such
a format anymore.
Instead, dependencies are associated with a job via the outputs of the jobs
run for the packages it depends on.
That means that, considering the above tree of packages:
deps_of(B) = outputs_of(job_for(C)) + outputs_of(job_for(D))
in text:
The dependencies of package B are the outputs of the job run for package C
plus the outputs of the job run for package D.
With that change in place, the outputs of a job run for a package can yield
arbitrary file names and as long as the build script for the package can process
them, everything is fine.
The new algorithm, that solves that issue, is rather simple:
1. Hold a list of errors
2. Hold a list of artifacts that were built
3. Hold a list of jobs that were run
4. Iterate over all jobs, filtered by
- If the job appears in the "already run jobs" list, ignore it
- If a job has dependencies (on outputs of other jobs) that do not appear in
the "already run jobs", ignore it (for now)
5. Run these jobs, and for each job:
5.1. Take the job UUID and put it in the "already run jobs" list.
5.2. Take the result of the job,
5.2.1. if it is an error, put it in the "list of errors"
5.2.2. if it is ok, put the artifact in the "list of artifacts"
6. if the list of errors is not empty, goto 9
7. if all jobs are in the "already run jobs" list, goto 9
8. goto 4
9. return all artifacts and all errors
Because this approach is fundamentally different than the previous approach, a
lot of things had to be rewritten:
- The `JobSet` type was completely removed
- There is a new type `crate::job::Tree` that gets built from the
`crate::package::Tree`
It is a mapping of a UUID (the job UUID) to a `JobDefinition`.
The `JobDefinition` type is
- A Job
- A list of UUIDs of other jobs, where this job depends on the outputs
It is therefore a mapping of `Job -> outputs(jobs_of(dependencies)`
The `crate::job::Tree` type is now responsible for building a `Job` object for
each `crate::package::Package` object from the `crate::package::Tree` object.
Because the `crate::package::Tree` object contains all required packages for
the complete build, the implementation of `crate::job::Tree::build_tree()`
does not check sanity.
It is assumed that the input tree to the function contains all mappings.
Despite the name `crate::job::Tree` ("Tree"), the actual structure stored in
the type is not a real tree.
- The `MergedStores::get_artifact_by_path()` function was adapted because in the
previous implementation, it used `StagingStore::load_from_path()`, which tried
to load the file from the filesystem and put it into the internal map, which
failed if it was already there.
The adaption checks if the artifact already exists in the internal map and
returns that object instead.
(For the release store accordingly)
- The interface of the `RunnableJob::build_from_job()` function was adapted, as
this function does not need to access the `MergedStores` object anymore to
load dependency-Artifacts from the filesystem.
Instead, these Artifacts are passed to the function now.
- The Orchestrator code
- Got a type alias `JobResult` which represents the result of a job run which
is either
- A number of artifacts (for optimization reasons with their associated
database artifact entry)
- or an error with the job uuid that failed (again, for optimization
reasons)
- Got an implementation of the algorithm described above
- Got a new implementation of run_job(), which
- Fetches the paths of dependency-artifacts from the database by using
the job uuids from the JobDefinition object
- Creates the RunnableJob object for that
- Schedules the RunnableJob object in the scheduler
- For each output artifact (database object representing it)
- get the filesystem Artifact object for it
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
Tested-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Because the constructors now have side-effects (accessing the
filesystem), this patch adds unchecked variants of the constructors, so
that tests can be implemented with paths that do not point to valid
filesystem items.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
In the process, FullArtifactPath::is_file() and FullArtifactPath::is_dir()
as well as ArtifactPath::file_stem() were removed.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This removes the ArtifactPath::is_dir() function, which should not exist
because we try to guarantee that ArtifactPath is relative, so this
function would always return false results.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch reduces the interface of StoreRoot even further.
The ::stripped_from() and ::walk() interfaces were removed and replaced
with a StoreRoot::find_artifacts_recursive() function, which returns an
iterator over the found ArtifactPath objects.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch reduces the mis-useable parts of the StoreRoot interface
further by removing the AsRef<Path> implementation for StoreRoot.
The implementation was required to unpack the tar archive stream in the
staging store implementation.
A function was added to do it instead.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch reduces the mis-useable parts of the StoreRoot interface
further by removing the StoreRoot::as_path() function.
A function to get a walkdir::WalkDir was added instead.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch changes the interfaces to not load the StoreRoot object
internally, but to get it from the caller.
The StoreRoot::new() function got more restrictive, to only be able to
load the StoreRoot object when pointing to an existing directory.
This makes is_dir() checks at runtime unnecessary, which reduces runtime
overall, because this test is only done once during construction, not N
times during usage of the store objects.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This adds a runtime check whether the artifact path is indeed absolute.
A relative artifact path is a bug.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This adds a runtime check whether the artifact path is indeed relative.
An absolute artifact path is a bug.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Reduce mis-use possibilities by removing the join_path() method for
StoreRoot.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This changes the implementation of FullArtifactPath from holding a
complete PathBuf object, to holding references to StoreRoot and
ArtifactPath objects.
This not only reduces memory usage but, more importantly, is another step in
the right direction concerning strong path types.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch removes the AsRef<Path> implementation for FullArtifactPath,
to make everything more explicit.
It also alters the FileStoreImpl implementation to contain a map of
ArtifactPath -> Artifact, rather than PathBuf -> Artifact, which is also
more type-safe than before this way.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This is the first step towards strong path typing for distinction
between artifact paths.
It adds a type for store root paths and a type for artifact paths.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Diesel is an exception here, because the generated src/schema.rs file
does not automatically contain the necessary imports.
All imports were added where necessary.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|