path: root/src/filestore
Age  Commit message  Author
2021-01-25  Let the JobHandle::run() return a Vec<Artifact>  Matthias Beyer
Before that change, it returned the dbmodels::Artifact objects, for which we needed to fetch the filestore::Artifact again. This change removes that restriction (improving runtime, of course).
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
Tested-by: Matthias Beyer <matthias.beyer@atos.net>
2021-01-25  Fix: Filter each entry, strip prefix  Matthias Beyer
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
2021-01-21  Fix clippy: Remove noop drop() call  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2021-01-21  Reimplement: Orchestrator::run()  Matthias Beyer
This patch reimplements the running of the computed jobs.

The old implementation was structured as follows:

1. Compute a Tree of dependencies for the requested package
2. Make sets of this tree (see below)
3. For each set
   3.1. Run set in parallel by submitting each job in the set to the scheduler
   3.2. Collect outputs and errors
   3.3. Record outputs and return errors (if any)

The complexity here was the computing of the JobSets but also the running of each job in a set in parallel. The code was non-trivial to understand. But that's not even the biggest concern with this approach.

Consider the following tree of jobs:

        A
       / \
      B   E
     / \   \
    C   D   F
           / \
          G   H
               \
                I

Each node here represents a package, the edges represent dependencies on the lower-hanging package. This tree would result in 5 sets of jobs:

    [
      [ I ]
      [ G, H ]
      [ C, D, F ]
      [ B, E ]
      [ A ]
    ]

because each "layer" in the tree would be run in parallel. It is easy to see that, in the tree above, the jobs for [ I, G, D, C ] can be run in parallel, because they do not have dependencies.

The reimplementation also has another (crucial) benefit: The implementation does not depend on a structure of artifact path names anymore. Before, the artifacts needed to have a name as follows:

    <name of the package>-<version of the package>.<something>

which was extremely restrictive. With the changes from this patch, the implementation does not depend on such a format anymore. Instead: Dependencies are associated with a job, by the output of jobs run for dependent packages. That means that, considering the above tree of packages:

    deps_of(B) = outputs_of(job_for(C)) + outputs_of(job_for(D))

in text: The dependencies of package B are the outputs of the job run for package C plus the outputs of the job run for package D.

With that change in place, the outputs of a job run for a package can yield arbitrary file names and as long as the build script for the package can process them, everything is fine.

The new algorithm, which solves that issue, is rather simple:

1. Hold a list of errors
2. Hold a list of artifacts that were built
3. Hold a list of jobs that were run
4. Iterate over all jobs, filtered by
   - If the job appears in the "already run jobs" list, ignore it
   - If a job has dependencies (on outputs of other jobs) that do not appear in the "already run jobs", ignore it (for now)
5. Run these jobs, and for each job:
   5.1. Take the job UUID and put it in the "already run jobs" list.
   5.2. Take the result of the job,
        5.2.1. if it is an error, put it in the "list of errors"
        5.2.2. if it is ok, put the artifact in the "list of artifacts"
6. if the list of errors is not empty, goto 9
7. if all jobs are in the "already run jobs" list, goto 9
8. goto 4
9. return all artifacts and all errors

Because this approach is fundamentally different from the previous approach, a lot of things had to be rewritten:

- The `JobSet` type was completely removed
- There is a new type `crate::job::Tree` that gets built from the `crate::package::Tree`.
  It is a mapping of a UUID (the job UUID) to a `JobDefinition`. The `JobDefinition` type is
  - A Job
  - A list of UUIDs of other jobs, where this job depends on the outputs
  It is therefore a mapping of `Job -> outputs(jobs_of(dependencies))`.
  The `crate::job::Tree` type is now responsible for building a `Job` object for each `crate::package::Package` object from the `crate::package::Tree` object.
  Because the `crate::package::Tree` object contains all required packages for the complete build, the implementation of `crate::job::Tree::build_tree()` does not check sanity. It is assumed that the input tree to the function contains all mappings.
  Despite the name `crate::job::Tree` ("Tree"), the actual structure stored in the type is not a real tree.
- The `MergedStores::get_artifact_by_path()` function was adapted because in the previous implementation, it used `StagingStore::load_from_path()`, which tried to load the file from the filesystem and put it into the internal map, which failed if it was already there. The adaptation checks if the artifact already exists in the internal map and returns that object instead. (For the release store accordingly)
- The interface of the `RunnableJob::build_from_job()` function was adapted, as this function does not need to access the `MergedStores` object anymore to load dependency-Artifacts from the filesystem. Instead, these Artifacts are passed to the function now.
- The Orchestrator code
  - Got a type alias `JobResult` which represents the result of a job run, which is either
    - A number of artifacts (for optimization reasons with their associated database artifact entry)
    - or an error with the job uuid that failed (again, for optimization reasons)
  - Got an implementation of the algorithm described above
  - Got a new implementation of run_job(), which
    - Fetches the paths of dependency-artifacts from the database by using the job uuids from the JobDefinition object
    - Creates the RunnableJob object for that
    - Schedules the RunnableJob object in the scheduler
    - For each output artifact (database object representing it)
      - get the filesystem Artifact object for it

Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
Tested-by: Matthias Beyer <matthias.beyer@atos.net>
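The scheduling loop described in this commit message can be sketched roughly as follows. This is a minimal, synchronous illustration and not the actual Orchestrator code: `JobId` (a plain integer standing in for the job UUID), `Artifact` (a plain string), this simplified `JobDefinition`, and `run_all` with its `run_job` callback are all assumptions made for the example. The real implementation submits the ready jobs to the scheduler so that they run in parallel.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical stand-ins: the real code uses job UUIDs and Artifact objects.
type JobId = u32;
type Artifact = String;

/// A job plus the IDs of the jobs whose outputs it depends on.
struct JobDefinition {
    name: String,
    dependencies: Vec<JobId>,
}

/// Runs every job once all of its dependencies have been run.
/// Returns all artifacts and all errors (step 9).
fn run_all<F>(
    jobs: &BTreeMap<JobId, JobDefinition>,
    mut run_job: F,
) -> (Vec<Artifact>, Vec<(JobId, String)>)
where
    F: FnMut(&JobDefinition) -> Result<Vec<Artifact>, String>,
{
    let mut errors = Vec::new(); // step 1
    let mut artifacts = Vec::new(); // step 2
    let mut already_run: BTreeSet<JobId> = BTreeSet::new(); // step 3

    loop {
        // Step 4: jobs that were not run yet and whose dependencies were all run.
        let ready: Vec<JobId> = jobs
            .iter()
            .filter(|(id, _)| !already_run.contains(*id))
            .filter(|(_, def)| def.dependencies.iter().all(|d| already_run.contains(d)))
            .map(|(id, _)| *id)
            .collect();

        // Extra guard (not in the description): stop if nothing can be run,
        // for example because a dependency ID is missing from the map.
        if ready.is_empty() {
            break;
        }

        // Step 5: run the ready jobs and record their results.
        for id in ready {
            already_run.insert(id); // 5.1
            match run_job(&jobs[&id]) {
                Ok(mut outputs) => artifacts.append(&mut outputs), // 5.2.2
                Err(e) => errors.push((id, e)),                    // 5.2.1
            }
        }

        // Steps 6 and 7: stop on errors or when every job was run; else repeat (8).
        if !errors.is_empty() || already_run.len() == jobs.len() {
            break;
        }
    }

    (artifacts, errors)
}
```

For the example tree from the commit message, the first round of this sketch picks exactly the jobs without dependencies (C, D, G and I), instead of waiting for a whole "layer" to finish.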
2021-01-21  Add derive(Debug) for FillArtifactPathDisplay  Matthias Beyer
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
2021-01-21  Add MergedStores::get_artifact_by_path()  Matthias Beyer
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
2021-01-21  Add FullArtifactPath::exists()  Matthias Beyer
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
2021-01-18  Run `cargo fmt`  Matthias Beyer
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
2021-01-15  Fix clippy: this import is redundant  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2021-01-15  Fix clippy: this `else { if .. }` block can be collapsed  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2021-01-13  Add LICENSE file and license headers  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-12-08  Make StagingStore get()able from MergedStores, to simplify Orchestrator impl  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-12-07  Reduce interface of paths by restricting visibility of functions  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-12-07  Fix: Add unchecked variants of constructors for tests  Matthias Beyer
Because the constructors now have side-effects (accessing the filesystem), this patch adds unchecked variants of the constructors, so that tests can be implemented with paths that do not point to valid filesystem items.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-12-07  Add checks so that FullArtifactPath can only be constructed for existing files  Matthias Beyer
In the process, FullArtifactPath::is_file() and FullArtifactPath::is_dir() as well as ArtifactPath::file_stem() were removed.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-12-07  Fix: ArtifactPath::is_dir() should not exist  Matthias Beyer
This removes the ArtifactPath::is_dir() function, which should not exist because we try to guarantee that ArtifactPath is relative, so this function would always return false results.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-12-07  Remove StoreRoot::stripped_from() and StoreRoot::walk()  Matthias Beyer
This patch reduces the interface of StoreRoot even further. The ::stripped_from() and ::walk() interfaces were removed and replaced with a StoreRoot::find_artifacts_recursive() function, which returns an iterator over the found ArtifactPath objects.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
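A function like the one described here could look roughly like the sketch below. It is an assumption-laden illustration, not the code from the repository: the type definitions are simplified, the directory walk uses only `std::fs` rather than a directory-walking crate, and the result is collected eagerly into a sorted `Vec` for brevity, whereas the function named in the commit returns an iterator.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Simplified stand-in for the store root type.
pub struct StoreRoot(PathBuf);

/// Simplified stand-in for a path relative to a store root.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct ArtifactPath(PathBuf);

impl StoreRoot {
    /// Collects all files below the root as paths relative to the root.
    pub fn find_artifacts_recursive(&self) -> io::Result<Vec<ArtifactPath>> {
        let mut found = Vec::new();
        let mut pending = vec![self.0.clone()];

        // Walk the directory tree iteratively, directory by directory.
        while let Some(dir) = pending.pop() {
            for entry in fs::read_dir(&dir)? {
                let path = entry?.path();
                if path.is_dir() {
                    pending.push(path);
                } else if let Ok(relative) = path.strip_prefix(&self.0) {
                    // Callers only ever see the relative part of the path.
                    found.push(ArtifactPath(relative.to_path_buf()));
                }
            }
        }

        found.sort();
        Ok(found)
    }
}
```

The point of the design is that the prefix stripping happens in one place inside `StoreRoot`, so callers never handle raw absolute paths.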
2020-12-07  Remove AsRef<Path> for StoreRoot impl  Matthias Beyer
This patch reduces the misusable parts of the StoreRoot interface further by removing the AsRef<Path> implementation for StoreRoot. The implementation was required to unpack the tar archive stream in the staging store implementation. A function was added to do it instead.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-12-07  Remove StoreRoot::as_path()  Matthias Beyer
This patch reduces the misusable parts of the StoreRoot interface further by removing the StoreRoot::as_path() function. A function to get a walkdir::WalkDir was added instead.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-12-07  Make stores only loadable with StoreRoot object  Matthias Beyer
This patch changes the interfaces to not load the StoreRoot object internally, but to get it from the caller. The StoreRoot::new() function got more restrictive, to only be able to load the StoreRoot object when pointing to an existing directory. This makes is_dir() checks during the runtime unnecessary, which reduces runtime overall, because this test is only done once during construction, not N times during usage of the store objects.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-12-07  Add runtime check in StoreRoot::new  Matthias Beyer
This adds a runtime check whether the artifact path is indeed absolute. A relative artifact path is a bug.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-12-07  Add runtime check in ArtifactPath::new  Matthias Beyer
This adds a runtime check whether the artifact path is indeed relative. An absolute artifact path is a bug.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
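The constructor checks described in these commits (absolute, existing directory for the store root; relative path for the artifact) can be sketched as follows. This is only an illustration: the `String` error type and the error messages are assumptions, and the real types carry more functionality.

```rust
use std::path::PathBuf;

/// Root directory of a store: absolute and pointing to an existing directory.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StoreRoot(PathBuf);

/// Path of an artifact, always relative to some store root.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ArtifactPath(PathBuf);

impl StoreRoot {
    /// Checks once at construction, so users of the type need no further checks.
    pub fn new(root: PathBuf) -> Result<Self, String> {
        if !root.is_absolute() {
            return Err(format!("store root is not absolute: {}", root.display()));
        }
        if !root.is_dir() {
            return Err(format!("store root is not a directory: {}", root.display()));
        }
        Ok(StoreRoot(root))
    }
}

impl ArtifactPath {
    /// Rejects absolute paths; an absolute artifact path is a bug.
    pub fn new(path: PathBuf) -> Result<Self, String> {
        if path.is_relative() {
            Ok(ArtifactPath(path))
        } else {
            Err(format!("artifact path is not relative: {}", path.display()))
        }
    }
}
```

Because the invariants are established in `new()`, every value of these types that exists at runtime has already passed the checks.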
2020-12-07  Remove StoreRoot::join_path()  Matthias Beyer
Reduce misuse possibilities by removing the join_path() method for StoreRoot.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-12-07  Make FullArtifactPath a tuple of StoreRoot and ArtifactPath  Matthias Beyer
This changes the implementation of FullArtifactPath from holding a complete PathBuf object, to holding references to StoreRoot and ArtifactPath objects. This not only reduces memory but more importantly is another step in the right direction concerning strong path types.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
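A borrowed pair of this kind might be sketched as below. The method names `join` and `joined` and the simplified type definitions are assumptions for the example, not necessarily the names used in the codebase.

```rust
use std::path::PathBuf;

/// Simplified stand-ins for the two path types.
pub struct StoreRoot(PathBuf);
pub struct ArtifactPath(PathBuf);

/// Holds references to both parts instead of an owned, already joined PathBuf.
pub struct FullArtifactPath<'a>(&'a StoreRoot, &'a ArtifactPath);

impl StoreRoot {
    /// Pairs this root with an artifact path without allocating.
    pub fn join<'a>(&'a self, artifact: &'a ArtifactPath) -> FullArtifactPath<'a> {
        FullArtifactPath(self, artifact)
    }
}

impl<'a> FullArtifactPath<'a> {
    /// Builds the complete path only when it is actually needed.
    pub fn joined(&self) -> PathBuf {
        let FullArtifactPath(root, artifact) = self;
        root.0.join(&artifact.0)
    }
}
```

Keeping the two halves separate means the type system still knows which part is the store root and which part is the relative artifact path.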
2020-12-07  Remove AsRef<Path> for FullArtifactPath impl  Matthias Beyer
This patch removes the AsRef<Path> implementation for FullArtifactPath, to make everything more explicit. It also alters the FileStoreImpl implementation to contain a map of ArtifactPath -> Artifact, rather than PathBuf -> Artifact, which is also more type-safe than before.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-12-07  Add ArtifactPath, StoreRoot  Matthias Beyer
This is the first step towards strong path typing for distinction between artifact paths. It adds a type for Store root paths and a type for artifact paths.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-12-07  Deny macro_use from external crate  Matthias Beyer
Diesel is an exception here, because the generated src/schema.rs file does not automatically contain the necessary imports. All imports were added where necessary.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-12-07  Remove unused imports  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-12-07  Remove Artifact::create{,_path}() (unused)  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-12-07  Remove StagingStore::load_from_path() (unused)  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-12-07  Remove MergedStores::get_artifact_by_name() (unused)  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-12-07  Remove FileStoreImpl::get{,_artifact_by_name{,_and_version}}() (unused)  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-12-03  Make sure only the relative paths are written to database  Matthias Beyer
This patch is a bit of a mess. It makes sure that a crate::filestore::Artifact only knows its _relative_ path to the store it is put into. First of all, this is what it should be like. Secondly, we only want to track relative paths in the database (to reduce database size, but more importantly because it is just duplicated data). The full path of an Artifact can always be reconstructed from the submit id, the configured paths of the staging/release store and the artifact information (relative path) tracked in the database.
This patch would not be necessary if we introduced strong typing for the different kinds of paths. This is most certainly on our todo list here.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-12-03  Cleanup imports  Matthias Beyer
This patch cleans the imports, removes the unused ones and moves imports, wherever possible, to the outer scope.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-11-14  Rewrite to use tokio::sync::RwLock  Matthias Beyer
This patch rewrites the codebase to not use std::sync::RwLock, but tokio::sync::RwLock. tokio's RwLock is an async RwLock, which is what we want in an async-await context. The more I use tokio, the more I understand what you should do and what you shouldn't do.
Some parts of this patch are a rewrite, for example, JobSet::into_runables() was completely rewritten. That was necessary because the function used inside is `Runnable::build_from_job()`, which uses an RwLock internally and thus becomes `async` in this patch. Because of this, `JobSet::into_runables()` needed a complete rewrite as well. Because transforming the function to return an iterator of futures would be far more difficult, this patch simply rewrites it to return a `Result<Vec<RunnableJob>>` instead. Internally, tokio jobs are submitted via the `futures::stream::FuturesUnordered<_>` now. This is not the most performant implementation for the problem at hand, but it is a reasonably simple one. Optimization could happen here, of course.
Also, the implementation of resource preparation inside `RunnableJob::build_from_job()` got a rewrite using the same technique.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-11-09  Fix: Checking resulting paths into the staging store  Matthias Beyer
Unfortunately, these are the kinds of things that Rust does not find. I could have built the whole thing so that the Rust compiler can find such things, by introducing even more typing. But that would slow down the development of this project by a huge bit, so we have to deal with these kinds of bugs for now.
The issue solved with this patch is that, with an empty staging store, the example_1 failed. This was because the job for pkgA did not find the already built pkgB. When rerunning butido right after the error, though, everything worked.
This is due to the fact that we filtered the result-TAR archive for files, to ignore directories when checking results into the staging store. This was done by filtering the archive with

    .filter(|path| path.is_file())

but Path::is_file() checks _on the filesystem_, not whether the path itself looks like a file-path. Thus, we alter the iteration here to do the check whether a resulting path is a file-path right when we want to load the artifact. This fixes the bug.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
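The pitfall can be reproduced with a small sketch. Both functions and the idea of passing the unpack destination as `root` are hypothetical and only illustrate the difference between checking before and after the files exist on disk; the actual fix lives in the staging store code.

```rust
use std::path::{Path, PathBuf};

/// Buggy variant: `Path::is_file()` queries the filesystem, so entries of an
/// archive that has not been unpacked yet are all filtered out.
fn files_checked_too_early(entries: &[PathBuf]) -> Vec<&PathBuf> {
    entries.iter().filter(|path| path.is_file()).collect()
}

/// Fixed variant: perform the check once the entries exist below `root`,
/// that is, at the point where the artifact is about to be loaded.
fn files_checked_after_unpack<'a>(root: &Path, entries: &'a [PathBuf]) -> Vec<&'a PathBuf> {
    entries
        .iter()
        .filter(|path| root.join(path).is_file())
        .collect()
}
```

With an empty staging store, the first variant yields no entries at all, which matches the failure described above; on a second run the files from the first run are present, so the check happens to succeed.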
2020-11-09  Add more trace output  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-11-08  Remove unused imports, sort imports  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-11-08  Fix artifact tests  Matthias Beyer
Well, that's developing without CI, isn't it? Somewhere in the last merges we messed up and did merge without running the tests first. This patch fixes the issue that artifact path extensions are now considered in the parsing process.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-11-07  Fix: Parse artifact name and version from file stem  Matthias Beyer
This fixes a bug where we considered the extension of the filename when parsing the artifact name and version, which is clearly something we shouldn't do, as the file extension is not part of the version at all! ;-)
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
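Parsing from the file stem instead of the full file name might look like the sketch below. The function name and the rule of splitting at the last `-` are assumptions for illustration; the parser used in the project at that time may have split name and version differently.

```rust
use std::path::Path;

/// Parses "<name>-<version>.<something>" into (name, version), ignoring the
/// file extension by working on the file stem only.
fn parse_name_and_version(path: &Path) -> Option<(String, String)> {
    // file_stem() drops the part after the last '.', e.g. ".pkg".
    let stem = path.file_stem()?.to_str()?;

    // Assumption: the version is everything after the last '-'.
    let (name, version) = stem.rsplit_once('-')?;
    if name.is_empty() || version.is_empty() {
        return None;
    }
    Some((name.to_string(), version.to_string()))
}
```

Parsing the full file name `foo-1.2.3.pkg` with the same splitting rule would yield the version `1.2.3.pkg`, which is the kind of bug this commit describes.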
2020-11-07  Add logging when finding artifacts in the store  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-11-07  Add some more error context messages  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-11-07  Fix: We only care about file entries from the result TAR  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-11-07  Impl FileStoreImpl::is_sub_path()  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-11-06  Implement Debug for store frontend types  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-11-06  Add helper functions in filestore implementations  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-11-05  Change MergedStores interface to return clones  Matthias Beyer
This changes the interface of MergedStores for less complicated implementation and use. Artifact implements Clone and isn't too big, so this is not that expensive.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-11-05  Change MergedStore to take Release/StagingStore via Arc<>  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-11-04  Impl equality traits on Artifact  Matthias Beyer
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
2020-11-04  Remove all non-equality "Version constraint" implementation  Matthias Beyer
This patch removes the idea of "version constraints" except for the equality constraint. This is due to the fact that everything else might result in impurities. This might be reverted in the future and actual operators ("<" or ">" or ranges...) might be implemented. Thus, we keep the "=" equality sign as prefix for a version string, to be extensible here.
This commit also fixes (automatically, because the implementation changed from the ground up) the issue that there was no difference between a version string and a version constraint string.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
(cherry picked from commit 7bbcca0356795ac60bf7761819b56430e0905a3c)
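An equality-only constraint that still requires the "=" prefix could be sketched like this. The enum, its method names, and the `String` error type are hypothetical and only illustrate the idea; they are not the types from the repository.

```rust
/// Only equality is supported; an enum leaves room for further operators later.
#[derive(Debug, PartialEq, Eq)]
enum VersionConstraint {
    Equal(String),
}

impl VersionConstraint {
    /// Parses a constraint string such as "=1.2.3". A bare version string
    /// without the "=" prefix is rejected, so the two cannot be confused.
    fn parse(s: &str) -> Result<Self, String> {
        match s.strip_prefix('=') {
            Some(version) if !version.is_empty() => {
                Ok(VersionConstraint::Equal(version.to_string()))
            }
            _ => Err(format!("not an equality constraint: {:?}", s)),
        }
    }

    /// Returns true if the given version satisfies the constraint.
    fn matches(&self, version: &str) -> bool {
        let VersionConstraint::Equal(expected) = self;
        expected == version
    }
}
```

Keeping the prefix mandatory means that adding operators such as "<" or ">" later would not change how existing constraint strings are read.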