Age | Commit message | Author |
|
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch rewrites the unpacking of the tar archive from the stream so that
the unpacking function returns the written paths. We therefore no longer have to
pass over the tar archive twice.
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
The issue here is that we copy all build results (packages) in the container to
/outputs, and butido then uses that directory to fetch the outputs of the build.
But because of how the docker API works, we get a TAR stream from docker that
_contains_ the /outputs directory itself, which we do not want in the written paths.
Until now, that was no issue. But it has become one now that we start adopting
butido for our real-world scenarios.
This patch filters that /outputs portion out of the paths from the tar
archive when writing the files to disk.
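The filtering can be sketched with `Path::strip_prefix` from the standard library. This is only an illustration under the assumption that entry paths look like `outputs/<file>` or `/outputs/<file>`; the helper name `strip_outputs_prefix` is hypothetical and not taken from the butido code.

```rust
use std::path::{Path, PathBuf};

/// Drop the leading `outputs` component from a path taken from the tar stream.
/// A leading `/` is dropped as well. Paths that do not start with `outputs`
/// keep all their remaining components.
/// (Hypothetical helper, for illustration only.)
pub fn strip_outputs_prefix(entry: &Path) -> PathBuf {
    // Tolerate both "outputs/x" and "/outputs/x".
    let relative = entry.strip_prefix("/").unwrap_or(entry);
    relative
        .strip_prefix("outputs")
        .unwrap_or(relative)
        .to_path_buf()
}
```

`strip_prefix` works on whole path components, so a directory such as `outputs-old` would not be affected by this sketch.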
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
Tested-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
And replace it with a getset::Getters implementation.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Because passing by value is simply not necessary here.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch follows up on the shrinking of the `Artifact` type and removes it
entirely.
The type is not needed. Only the `ArtifactPath` type is needed, which is a thin
wrapper around `PathBuf`, ensuring that the path is relative to the store root.
The `Artifact` type used `pom` to parse the name and version of the package from
the `ArtifactPath` object it contained, which resulted in the restriction that
the path must always be
<name>-<version>...
Which should not be a requirement and actually caused issues with a package
named "foo-bar" (as an example).
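Why a `<name>-<version>` convention is ambiguous can be shown with a deliberately naive parser. This is a stand-in that splits at the first `-`; it is not the `pom`-based parser from the removed code, and the function name is made up for this example.

```rust
/// Naive stand-in for a `<name>-<version>` parser: split at the first '-'.
/// Not the `pom`-based parser from the removed code, only an illustration.
pub fn split_name_version(file_name: &str) -> Option<(&str, &str)> {
    file_name.split_once('-')
}
```

For a package named "foo-bar", such a parser cannot tell where the name ends, so `foo-bar-1.2.3` is read as package `foo` with version `bar-1.2.3`.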
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
Tested-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Before that change, it returned the dbmodels::Artifact objects, for which we
needed to fetch the filestore::Artifact again.
This change removes that restriction (improving runtime, of course).
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
Tested-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
This patch reimplements the running of the computed jobs.
The old implementation was structured as follows:
1. Compute a Tree of dependencies for the requested package
2. Make sets of this tree (see below)
3. For each set
3.1. Run set in parallel by submitting each job in the set to the scheduler
3.2. collect outputs and errors
3.3. Record outputs and return errors (if any)
The complexity here lay in computing the JobSets, but also in running
each job in a set in parallel.
The code was non-trivial to understand.
But that is not even the biggest concern with this approach.
Consider the following tree of jobs:
          A
         / \
        B   E
       / \   \
      C   D   F
             / \
            G   H
                 \
                  I
Each node here represents a package, the edges represent dependencies on the
lower-hanging package.
This tree would result in 5 sets of jobs:
[
[ I ]
[ G, H ]
[ C, D, F ]
[ B, E ]
[ A ]
]
because each "layer" in the tree would be run in parallel.
It is easy to see that, in the tree above, the jobs for [ I, G, D, C ]
could all run in parallel right away, because they do not have dependencies.
The layered approach nevertheless spreads them over three consecutive sets.
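The difference between the layered sets and the jobs that are actually ready can be reproduced with a small sketch. The map below encodes the example tree (package to its dependencies); the function names `example_tree`, `layers` and `leaves` are chosen for this illustration and do not come from the original code.

```rust
use std::collections::BTreeMap;

/// Dependency edges of the example tree: package -> packages it depends on.
pub fn example_tree() -> BTreeMap<char, Vec<char>> {
    BTreeMap::from([
        ('A', vec!['B', 'E']),
        ('B', vec!['C', 'D']),
        ('C', vec![]),
        ('D', vec![]),
        ('E', vec!['F']),
        ('F', vec!['G', 'H']),
        ('G', vec![]),
        ('H', vec!['I']),
        ('I', vec![]),
    ])
}

/// Group packages by their depth below `root`, deepest layer first.
/// Every package reachable from `root` must be a key in `tree`.
pub fn layers(tree: &BTreeMap<char, Vec<char>>, root: char) -> Vec<Vec<char>> {
    let mut result = Vec::new();
    let mut current = vec![root];
    while !current.is_empty() {
        // The next layer consists of all dependencies of the current layer.
        let next: Vec<char> = current
            .iter()
            .flat_map(|pkg| tree[pkg].iter().copied())
            .collect();
        result.push(current);
        current = next;
    }
    result.reverse();
    result
}

/// Packages without dependencies, i.e. the ones that could start immediately.
pub fn leaves(tree: &BTreeMap<char, Vec<char>>) -> Vec<char> {
    tree.iter()
        .filter(|(_, deps)| deps.is_empty())
        .map(|(pkg, _)| *pkg)
        .collect()
}
```

`layers` yields the five sets listed above, while `leaves` yields C, D, G and I in one go.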
The reimplementation also has another (crucial) benefit: The implementation does
not depend on a structure of artifact path names anymore.
Before, the artifacts needed to have a name as follows:
<name of the package>-<version of the package>.<something>
which was extremely restrictive.
With the changes from this patch, the implementation does not depend on such
a format anymore.
Instead, dependencies are associated with a job through the outputs of the jobs run
for the packages it depends on.
That means that, considering the above tree of packages:
deps_of(B) = outputs_of(job_for(C)) + outputs_of(job_for(D))
in text:
The dependencies of package B are the outputs of the job run for package C
plus the outputs of the job run for package D.
With that change in place, the outputs of a job run for a package can yield
arbitrary file names and as long as the build script for the package can process
them, everything is fine.
The new algorithm, which solves that issue, is rather simple:
1. Hold a list of errors
2. Hold a list of artifacts that were built
3. Hold a list of jobs that were run
4. Iterate over all jobs, filtered by
- If the job appears in the "already run jobs" list, ignore it
- If a job has dependencies (on outputs of other jobs) that do not appear in
the "already run jobs", ignore it (for now)
5. Run these jobs, and for each job:
5.1. Take the job UUID and put it in the "already run jobs" list.
5.2. Take the result of the job,
5.2.1. if it is an error, put it in the "list of errors"
5.2.2. if it is ok, put the artifact in the "list of artifacts"
6. if the list of errors is not empty, goto 9
7. if all jobs are in the "already run jobs" list, goto 9
8. goto 4
9. return all artifacts and all errors
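The steps above can be sketched roughly as follows. This is a simplified, synchronous illustration and not the Orchestrator code: `JobId` is a plain integer standing in for the job UUID, artifacts and errors are plain strings, and `run_all` and its `run` callback are names assumed for this example. Each batch runs sequentially here for simplicity.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Stand-in for the job UUID (a simplification for this sketch).
pub type JobId = u32;

/// Run all jobs in dependency order and collect artifacts and errors.
///
/// `jobs` maps each job to the jobs whose outputs it depends on.
/// `run` executes one job and yields an artifact name or an error message.
pub fn run_all<F>(jobs: &BTreeMap<JobId, Vec<JobId>>, mut run: F) -> (Vec<String>, Vec<String>)
where
    F: FnMut(JobId) -> Result<String, String>,
{
    let mut errors = Vec::new(); // step 1
    let mut artifacts = Vec::new(); // step 2
    let mut already_run: BTreeSet<JobId> = BTreeSet::new(); // step 3

    loop {
        // Step 4: jobs not yet run whose dependencies have all been run.
        let ready: Vec<JobId> = jobs
            .iter()
            .filter(|(id, _)| !already_run.contains(*id))
            .filter(|(_, deps)| deps.iter().all(|dep| already_run.contains(dep)))
            .map(|(id, _)| *id)
            .collect();

        // Guard that is not part of the described steps: stop if nothing is
        // runnable (e.g. a dependency missing from `jobs`) to avoid looping forever.
        if ready.is_empty() {
            break;
        }

        // Step 5: run the batch, one job after the other in this sketch.
        for id in ready {
            already_run.insert(id); // 5.1
            match run(id) {
                Err(error) => errors.push(error),          // 5.2.1
                Ok(artifact) => artifacts.push(artifact), // 5.2.2
            }
        }

        // Steps 6 and 7: stop on errors or when every job has been run.
        if !errors.is_empty() || already_run.len() == jobs.len() {
            break;
        }
        // Step 8: otherwise continue with step 4.
    }

    (artifacts, errors) // step 9
}
```

With the example tree, the first batch contains the jobs for C, D, G and I, because none of them has dependencies.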
Because this approach is fundamentally different from the previous approach, a
lot of things had to be rewritten:
- The `JobSet` type was completely removed
- There is a new type `crate::job::Tree` that gets built from the
`crate::package::Tree`
It is a mapping of a UUID (the job UUID) to a `JobDefinition`.
The `JobDefinition` type is
- A Job
- A list of UUIDs of other jobs on whose outputs this job depends
It is therefore a mapping of `Job -> outputs(jobs_of(dependencies))`
The `crate::job::Tree` type is now responsible for building a `Job` object for
each `crate::package::Package` object from the `crate::package::Tree` object.
Because the `crate::package::Tree` object contains all required packages for
the complete build, the implementation of `crate::job::Tree::build_tree()`
does not perform sanity checks.
It is assumed that the input tree to the function contains all mappings.
Despite the name `crate::job::Tree` ("Tree"), the actual structure stored in
the type is not a real tree.
- The `MergedStores::get_artifact_by_path()` function was adapted because in the
previous implementation, it used `StagingStore::load_from_path()`, which tried
to load the file from the filesystem and put it into the internal map, which
failed if it was already there.
The adapted version checks if the artifact already exists in the internal map and
returns that object instead.
(The same applies to the release store.)
- The interface of the `RunnableJob::build_from_job()` function was adapted, as
this function does not need to access the `MergedStores` object anymore to
load dependency-Artifacts from the filesystem.
Instead, these Artifacts are passed to the function now.
- The Orchestrator code
- Got a type alias `JobResult` which represents the result of a job run, which
is either
- A number of artifacts (for optimization reasons with their associated
database artifact entry)
- or an error with the job uuid that failed (again, for optimization
reasons)
- Got an implementation of the algorithm described above
- Got a new implementation of run_job(), which
- Fetches the paths of dependency-artifacts from the database by using
the job uuids from the JobDefinition object
- Creates the RunnableJob object for that
- Schedules the RunnableJob object in the scheduler
- For each output artifact (database object representing it)
- get the filesystem Artifact object for it
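The shape of the new types can be approximated as follows. All definitions here are simplified stand-ins written for illustration: `JobId` replaces the UUID, `Job` is a placeholder, artifacts are plain strings, and the method names on `Tree` are assumptions rather than the actual interface.

```rust
use std::collections::HashMap;

/// Stand-in for a job UUID.
pub type JobId = u64;

/// Placeholder for the real `Job` type.
#[derive(Debug, Clone, PartialEq)]
pub struct Job {
    pub package_name: String,
}

/// A job plus the ids of the jobs whose outputs it depends on.
#[derive(Debug, Clone, PartialEq)]
pub struct JobDefinition {
    pub job: Job,
    pub dependencies: Vec<JobId>,
}

/// Despite the name, a flat mapping of job id -> job definition.
#[derive(Debug, Default)]
pub struct Tree {
    inner: HashMap<JobId, JobDefinition>,
}

impl Tree {
    /// Register the definition for one job.
    pub fn insert(&mut self, id: JobId, definition: JobDefinition) {
        self.inner.insert(id, definition);
    }

    /// Ids of the jobs whose outputs the job `id` needs, if the job is known.
    pub fn dependencies_of(&self, id: JobId) -> Option<&[JobId]> {
        self.inner.get(&id).map(|def| def.dependencies.as_slice())
    }
}

/// Either the produced artifacts or the id of the failed job with a message.
pub type JobResult = Result<Vec<String>, (JobId, String)>;
```

Carrying the job id in the error variant mirrors the description above, where the error is returned together with the uuid of the job that failed.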
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
Tested-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch further reduces the misusable parts of the StoreRoot interface
by removing the AsRef<Path> implementation for StoreRoot.
The implementation was required to unpack the tar archive stream in the
staging store implementation.
A dedicated function was added for that purpose instead.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch changes the interfaces to not load the StoreRoot object
internally, but to get it from the caller.
The StoreRoot::new() function became more restrictive: it can only
construct the StoreRoot object when pointing to an existing directory.
This makes is_dir() checks during the runtime unnecessary, which reduces
runtime overall, because this test is only done once during
construction, not N times during usage of the store objects.
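A constructor that validates once could look like the sketch below. The error type and message are assumptions made for this example; the actual StoreRoot::new() signature may differ.

```rust
use std::io;
use std::path::PathBuf;

/// Root directory of a store. Can only be constructed for an existing directory.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StoreRoot(PathBuf);

impl StoreRoot {
    /// Succeeds only if `root` is an existing directory.
    /// The check happens here once, so users of `StoreRoot` need not repeat it.
    pub fn new(root: PathBuf) -> io::Result<Self> {
        if root.is_dir() {
            Ok(StoreRoot(root))
        } else {
            Err(io::Error::new(
                io::ErrorKind::NotFound,
                format!("not an existing directory: {}", root.display()),
            ))
        }
    }
}
```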
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This adds a runtime check whether the artifact path is indeed relative.
An absolute artifact path is a bug.
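Such a check can be sketched with `Path::is_relative()`. The constructor name, the `String` error and the accessor below are assumptions for illustration, not the actual ArtifactPath interface.

```rust
use std::path::{Path, PathBuf};

/// Path of an artifact relative to the store root.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ArtifactPath(PathBuf);

impl ArtifactPath {
    /// Rejects absolute paths at construction time.
    pub fn new(path: PathBuf) -> Result<Self, String> {
        if path.is_relative() {
            Ok(ArtifactPath(path))
        } else {
            Err(format!("artifact path is not relative: {}", path.display()))
        }
    }

    /// Read-only access to the wrapped relative path.
    pub fn as_path(&self) -> &Path {
        &self.0
    }
}
```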
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Reduce the possibilities for misuse by removing the join_path() method from
StoreRoot.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This changes the implementation of FullArtifactPath from holding a
complete PathBuf object to holding references to StoreRoot and
ArtifactPath objects.
This not only reduces memory usage but, more importantly, is another step in
the right direction concerning strong path types.
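A borrowed variant of FullArtifactPath might be shaped like this. The field names, the `joined()` method and the public tuple fields of the two wrapper types are assumptions made to keep the sketch self-contained.

```rust
use std::path::PathBuf;

/// Simplified stand-ins for the two strong path types.
pub struct StoreRoot(pub PathBuf);
pub struct ArtifactPath(pub PathBuf);

/// Refers to an artifact inside a store without owning a joined `PathBuf`.
pub struct FullArtifactPath<'a> {
    root: &'a StoreRoot,
    artifact: &'a ArtifactPath,
}

impl<'a> FullArtifactPath<'a> {
    /// Combine a store root and a relative artifact path by reference.
    pub fn new(root: &'a StoreRoot, artifact: &'a ArtifactPath) -> Self {
        FullArtifactPath { root, artifact }
    }

    /// Build the joined path only when it is actually needed.
    pub fn joined(&self) -> PathBuf {
        self.root.0.join(&self.artifact.0)
    }
}
```

Because the struct only borrows, the joined path is allocated on demand instead of being stored for every artifact.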
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch removes the AsRef<Path> implementation for FullArtifactPath,
to make everything more explicit.
It also alters the FileStoreImpl implementation to contain a map of
ArtifactPath -> Artifact, rather than PathBuf -> Artifact, which is
more type-safe than before.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This is the first step towards strong path typing to distinguish
between the different kinds of artifact paths.
It adds a type for store root paths and a type for artifact paths.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Diesel is an exception here, because the generated src/schema.rs file
does not automatically contain the necessary imports.
All imports were added where necessary.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch is a bit of a mess.
It makes sure that a crate::filestore::Artifact only knows its
_relative_ path to the store it is put into. First of all, this is what
it should be like. Secondly, we only want to track relative paths in
the database (to reduce database size, but more importantly because
anything else is just duplicated data). The full path of an
Artifact can always be reconstructed from the submit id, the configured
paths of the staging/release store and the artifact
information (relative path) tracked in the database.
This patch would not be necessary if we introduced strong typing
for the different kinds of paths.
This is most certainly on our todo list.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
This patch cleans the imports, removes the unused ones and moves
imports, wherever possible, to the outer scope.
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Unfortunately, these are the kind of things that the Rust compiler does not find.
I could have built the whole thing so that the Rust compiler can find
such things, by introducing even more typing. But that would slow down
the development of this project considerably, so we have to deal with
these kinds of bugs for now.
The issue solved with this patch is that, with an empty staging store,
the example_1 failed.
This was because the job for pkgA did not find the already built pkgB.
When rerunning butido right after the error, though, everything worked.
This is due to the fact that we filtered the result-TAR archive for
files, to ignore directories when checking results into the staging
store.
This was done by filtering the archive with
.filter(|path| path.is_file())
but Path::is_file() checks _on the filesystem_, not whether the path
itself looks like a file-path.
Thus, we alter the iteration here to do the check whether a resulting
path is a file-path right when we want to load the artifact.
This fixes the bug.
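The behaviour of `Path::is_file()` can be demonstrated in isolation. The sketch below uses a throwaway file in the system temp directory; the function name and the file name are made up for this example.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Returns `(before, after)`: what `Path::is_file()` reports for a
/// file-looking path before and after the file has been written to disk.
pub fn is_file_before_and_after_write() -> io::Result<(bool, bool)> {
    let path: PathBuf =
        std::env::temp_dir().join(format!("is-file-demo-{}.pkg", std::process::id()));
    // Make sure the file does not exist yet; ignore the error if it is absent.
    let _ = fs::remove_file(&path);

    let before = path.is_file(); // queries the filesystem, not the path's shape
    fs::write(&path, b"artifact")?;
    let after = path.is_file();

    fs::remove_file(&path)?;
    Ok((before, after))
}
```

This is why the filter dropped every entry when the staging store was still empty, and why a second run, with the files already on disk, succeeded.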
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|
|
Signed-off-by: Matthias Beyer <mail@beyermatthias.de>
|