Arvados 2.5.0 Release Notes
December 22, 2022
The Arvados team is pleased to announce Arvados 2.5.0. This is a major upgrade, with many new features, changes and bugfixes. We recommend that new and existing installations of 2.4.4 or earlier upgrade to 2.5.0. See Upgrading Arvados for upgrade instructions.
New Features
Workbench 2
Workbench 2 features a new color scheme, designed for improved contrast and consistency. #19462
The process page features new Input and Output panels displaying the input and output parameters of the process. #16073 #19700 #19848
The process page also features a new Resources panel showing what resources were requested by the container, and (on cloud installs) what instance type it was actually run on. #19438
The picking dialog now includes the ability to search for projects and to filter the list of collections within a project. #19783
The process panel includes new information about how much a workflow cost to run. #19319
When you view a workflow step or output collection in Workbench 2, there’s now a row of breadcrumbs at the top of the page that shows you the parent workflow and project. This helps users understand how the object was created, and navigate to those parent objects if desired. #19504
The project view now offers the ability to configure a variety of additional columns for display: UUID, Created at/Trash at/Delete at, Properties, Portable Data Hash, Version, File Count, Requesting Container UUID, Container UUID, Output UUID, Log UUID, and Modified By User UUID. #19690
Workbench 2’s left navigation pane is now collapsible. #19434
Workbench 2 fully supports frozen projects: projects can be frozen and unfrozen, and frozen projects can’t be edited in any way. #18692
When you maximize a Workbench 2 panel, there is now an un-maximize button to easily restore the previous panels. #19300
Installer
The Arvados installer includes a new script to manage the process of setting up a fully-featured multi-node cluster. The multi host install guide has been rewritten, updated, and expanded. In addition, we now provide a Terraform script to assist in deploying on AWS. #19175 #19215
Client tools
arv-mount now features a memory-mapped disk-backed cache instead of a pure RAM cache. This enables the cache to be much bigger by default and leverage the kernel’s filesystem cache without competing with user programs for RAM. The result is the same or better performance when reading files, without needing to tune cache parameters. This is available both when running arv-mount locally and when running containers on Arvados. The size of the cache is controlled with --file-cache like before, and the location is controlled with --disk-cache-dir. #18842 #19872
arvados-client shell can now connect to containers running under LSF and SLURM, in addition to the cloud dispatcher. #19166
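The idea behind a memory-mapped, disk-backed cache can be illustrated with Python's standard mmap module. This is only a sketch of the general technique, not arv-mount's implementation: the function name open_cached_block, the one-file-per-block layout, and the cache directory are all assumptions made for illustration.

```python
import mmap
import os


def open_cached_block(cache_dir: str, block_name: str, data: bytes) -> mmap.mmap:
    """Store data in a file under cache_dir and return a read-only memory map of it.

    Hypothetical helper for illustration only. data must be non-empty,
    because an empty file cannot be memory-mapped.
    """
    path = os.path.join(cache_dir, block_name)
    with open(path, "wb") as f:
        f.write(data)
    with open(path, "rb") as f:
        # The mapping remains valid after the file object is closed.
        # Reads go through the kernel's page cache, so the process does not
        # need to hold its own copy of the data in RAM.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
```

Because the kernel decides which pages stay resident, a cache built this way can be much larger than available RAM without starving other programs.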
Running Workflows
Container records now have cost and subrequest_cost fields with information about the container’s cost to run. See the containers reference for details. Container request records now have a cumulative_cost field with information about the request’s total cost to run. See the container requests reference for details. This is calculated using the Price field of the instance type definition in the configuration file. #18205
Workflow steps submitted by arvados-cwl-runner now have properties cwl_input and cwl_output on the container request. See the API properties documentation for details. #19466
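To give a feel for how a cost figure can follow from an instance type's Price, here is a rough Python sketch. It assumes that Price is an hourly rate and that cost is simply that rate multiplied by the container's runtime; both are assumptions for illustration, and the function name estimate_container_cost is hypothetical. The containers reference remains the authority on how the cost field is actually defined.

```python
from datetime import datetime


def estimate_container_cost(price_per_hour: float,
                            started_at: datetime,
                            finished_at: datetime) -> float:
    """Estimate cost as hourly price times runtime in hours.

    Hypothetical helper: the real calculation may differ from this.
    """
    runtime_hours = (finished_at - started_at).total_seconds() / 3600
    return price_per_hour * runtime_hours
```

For example, under these assumptions a container that runs for 90 minutes on an instance type priced at 0.5 per hour would be estimated at 0.75.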
SDK
The Python SDK documentation now includes a full overview of how the API client works, with lots of realistic examples for demonstration. #19791
The Python SDK features a disk-backed cache for Keep. This is used by arv-mount as discussed above, but is also available to other users of the Python SDK. #18842 #19872
Administrator tools
Arvados includes a new tool sync-users-tool to synchronize Arvados user records described by a CSV file. You can use this to keep Arvados user records up-to-date with other identity sources like Active Directory or SSO providers. The admin guide on synchronizing has more details. #18858
arvados-client diagnostics is now documented in the admin guide. When possible, the diagnostics tool also performs a cluster “health check” (arvados-server check, described below) and includes any problems in the diagnostics report. #19364 #19377
arvados-server check is now documented in the admin guide. The arvados-server check tool now detects and reports several likely error conditions on a cluster: when different services in the cluster are not using the same configuration; when different services in the cluster are running different versions of Arvados; and when clock times reported by different services in the cluster are a minute apart or more. #18794
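The clock check described above (times a minute or more apart) amounts to a simple comparison. A minimal Python sketch, assuming each service's reported time has already been collected into a dictionary; the function name find_clock_skew and the input shape are hypothetical and do not reflect how arvados-server check is implemented:

```python
from datetime import datetime, timedelta
from typing import Dict, Optional

# Threshold taken from the release note: "a minute apart or more".
MAX_SKEW = timedelta(minutes=1)


def find_clock_skew(reported: Dict[str, datetime]) -> Optional[str]:
    """Return a problem description if any two clocks differ by MAX_SKEW or more."""
    if len(reported) < 2:
        return None
    earliest = min(reported, key=reported.get)
    latest = max(reported, key=reported.get)
    skew = reported[latest] - reported[earliest]
    if skew >= MAX_SKEW:
        return (f"clock skew of {skew.total_seconds():.0f}s "
                f"between {earliest} and {latest}")
    return None
```

Comparing only the earliest and latest readings is enough, since any other pair of services must be closer together than those two.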
Arvados API
The API server respects a new configuration option Users.CanCreateRoleGroups. If you set this to false, non-admin users will not be allowed to create role groups. This is helpful to set when your cluster uses arvados-group-sync as the single source of roles. #19513
The API server now allows sharing objects to be “writable” by the “All Users” group. This entailed some changes to the permissions model for roles: in general, updating a role now requires can_manage permission. Refer to the API permissions model documentation for details. #19269
Group objects returned by the API server now include boolean can_manage and can_write fields indicating the current user’s level of access. The writable_by field is now deprecated in favor of these new fields. #19146
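Client code can use the new boolean fields on group objects directly. A small Python sketch, assuming the group record is available as a plain dictionary as returned by an API call; the function name access_level and the returned labels are hypothetical:

```python
def access_level(group: dict) -> str:
    """Summarize the current user's access from a group record's boolean fields."""
    if group.get("can_manage"):
        return "manage"
    if group.get("can_write"):
        return "write"
    # Assumption: a record the user was able to fetch is at least readable.
    return "read"
```

Using dict.get keeps the helper tolerant of records from older servers that do not include these fields; such records fall through to "read".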
Keep
If an unauthenticated user visits keep-web, they will now be redirected to Workbench 2 to log in first. Once the user is logged in, they will be redirected back to keep-web. #17807
Keepstore now uses a new S3 driver by default. This driver features improved S3 read performance. The old v1 driver will be removed in Arvados 2.6.0. #19582
The keep-web S3 API now publishes project and collection properties through the X-Amz-Meta- headers. Control characters in values are MIME encoded. Refer to the new section in our S3 API documentation for details. #19088 #19249
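A client that reads these X-Amz-Meta- headers may need to decode the values. The sketch below assumes that "MIME encoded" means RFC 2047 encoded-word syntax, which these notes do not confirm; check the S3 API documentation for the exact scheme. The function name decode_meta_value is hypothetical. It relies only on Python's standard email.header module:

```python
from email.header import decode_header, make_header


def decode_meta_value(raw: str) -> str:
    """Decode a header value that may contain RFC 2047 encoded words.

    Plain values pass through unchanged.
    """
    return str(make_header(decode_header(raw)))
```

For instance, the encoded word =?UTF-8?b?bGluZTEKbGluZTI=?= is the base64 form of two lines of text separated by a newline.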
Other changes and bug fixes
Salt Installer
The Salt installer’s nginx configuration now includes the connection upgrade configuration required to support container shells. #19603
Fixed a bug in the Salt installer where the webshell’s nginx configuration always used localhost as upstream instead of shell.domain. #19283
The Salt installer now correctly obtains TLS certificates from Let’s Encrypt for single-node installs. #19169
The Salt installer’s configuration comments and error messages clarify that cluster IDs need to be lowercase. #19169
Workbench 2
Workbench 2 now supports search terms with whitespace when they are surrounded by double quotes. #19051
Workbench 2 no longer displays an extraneous “This field is required” error when editing collection properties. #19732
Process and subprocess listings in Workbench 2 now provide an “Open in new tab” action. #19569
Clicking on the buttons at the top of a collection or process page now jumps directly to the pane instead of auto-scrolling. #19465
If an error occurs when you try to create or edit a project, Workbench 2 will display that error, and let you edit the project again to resolve it. #19691
Workbench 2 now resolves CWL $include and $import directives in subprocess parameters, and displays the resulting object when found. #19684
Workbench 2 now displays a friendly error message when a subprocess parameter does not match the expected type and so cannot be rendered. #19684
Workbench 2 shows a warning when a container previously failed and is being retried. #19093
When Workbench 2 can’t show all the log messages for a running container, the output includes a message explaining this. #19851
Workbench 2 now displays the user who submitted a container request and, if different, the user it’s running as. #19315
Workbench 2 now validates new VM logins before they can be submitted. #18979
The “Add User” button has been removed from Workbench 2 because users must be managed by the external identity provider. #19627
Fixed a bug in Workbench 2 where copying the URL of items in the left hand project tree view did not always include the full URL. #19567
Fixed a bug where Workbench 2 would needlessly re-render the collection file browser. #18787
Fixed a bug in Workbench 2 that would cause the top search bar to become unresponsive. #19275
Restored the action menu (as an alternate way to access the context menu for a file or directory) on the collection file browser. #19007
The “Subprocesses” panel is now labeled. #19631
Fixed several bugs in Workbench 2 that would cause it to display VM logins incorrectly. #18979
Fixed a bug that would cause some projects and collections to be incorrectly marked with icons for “favorite” and/or “public favorite”. #19260
Scrolling container logs now works correctly on Safari. #19687
Arvados API
When you configure services in /etc/arvados/config.yml, the InternalURL is now always the address published to clients on the same network. If that address is not where the service should bind to listen for connections, you can specify the bind address with a ListenURL field nested under each InternalURL. Refer to the URLs section of the install guide for details. #16561
The user/unsetup API method now removes all of a user’s permission links as part of its work. This ensures the user won’t have lingering permissions if they are reactivated later. #19501
The API server will only return new authentication tokens to trusted endpoints. The cluster’s Workbench 1 and 2 endpoints are both trusted by default. If you have other endpoints, make sure to add them to Login.TrustedClients in your cluster configuration. See the upgrade notes for details. #19240
The API server no longer allows admin users to edit frozen projects. #19145
The API server used to require that a container’s output was recorded before the exit_code. This restriction has been relaxed, to allow reporting the exit code while output is still being saved. #18948
The API server now enforces that all users are owned by the system user. #19139
The API server now enforces that the root user cannot be disabled or have their admin status revoked. #19206
The API server now reports a helpful error message when a user tries to start a container shell using a too-old version of arvados-client. #19166
The API server now logs when a user works with a project, collection, or container request, once per configured time period. This provides improved reporting about general cluster activity levels. #19388
The API server’s old logs are now deleted with a background task that replaces the old rake delete_old_container_logs cron job. If you have this cron job running on your cluster, you can remove it after you upgrade to Arvados 2.5.0 or later. See the upgrade notes for details. #18863
The API controller now uses locks to ensure only one instance each of a dispatcher and keep-balance runs at a time. #18071
Fixed a bug in the API server where if you tried to save an object with save_with_unique_name=true, but there was a problem saving the object, the unique name logic would cause a second error that masked the first. #19698
Container request runtime constraints now include keep_cache_disk, with an amount of cache storage to allocate. #18842
Client tools
arvados-client diagnostics now uploads a small “hello world” Docker image as part of its tests, and uses this image by default to test Crunch if you don’t specify one yourself. This means the diagnostics can run without having Docker images pre-uploaded to the cluster. #19281
arvados-client diagnostics now cancels the container request it submitted after its check times out. #19364
arv-keepdocker properly lists Docker images with a port number in their repository name. #19840
arv-keepdocker properly stores images that specify the default port 443 in their repository name. #19840
Fixed a bug in arvados-client diagnostics where the tool tested keep-web with a bad filename. #19379
Fixed a bug in Workbench 1 that would send users to a controller error page when they needed to log in on a cluster using LDAP or PAM authentication. #19880
Fixed broken links to the Arvados wiki throughout tools’ documentation. #19710
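The arv-keepdocker items above concern repository names that contain a port number, which is a common source of parsing mistakes because a colon can introduce either a port or a tag. The following Python sketch illustrates the general rule; it is not arv-keepdocker's code, and the function name split_repo_tag and the example registry name are hypothetical:

```python
from typing import Tuple


def split_repo_tag(image: str) -> Tuple[str, str]:
    """Split an image reference into (repository, tag).

    A colon introduces a tag only when it appears after the last slash;
    an earlier colon belongs to the registry's host:port.
    """
    last_slash = image.rfind("/")
    last_colon = image.rfind(":")
    if last_colon > last_slash:
        return image[:last_colon], image[last_colon + 1:]
    # No explicit tag: fall back to Docker's conventional default.
    return image, "latest"
```

With this rule, registry.example.com:5000/team/tool keeps its port as part of the repository name instead of having 5000/team/tool mistaken for a tag.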
SDKs
The Python SDK’s arvados.api() constructor now returns a ThreadSafeApiCache. This is an API-compatible wrapper around the original API client object that stores a separate object per thread, because the original client is not natively thread-safe. For more information, see the arvados.api reference. #19686
The Java SDK has a new KeepWebApiClient.upload() method to upload files to Keep via keep-web. #19220
The Java SDK’s KeepWebApiClient now gets its API token from configuration dynamically, so it’s easier to use as a Spring Bean. #19282
Go SDK functions that load data from Keep will now include a block ID in their error message if a requested block is not found. #19600
The R SDK has been updated. It features a new method Collection$readArvFile which handles reading files from Arvados in a variety of formats. #19704
The Perl SDK, which has not been maintained in years, has been removed from this release. #19712
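The per-thread wrapper pattern mentioned above for arvados.api() can be sketched with Python's threading.local. This is an illustration of the general technique only, not the SDK's ThreadSafeApiCache implementation; the class name PerThreadCache and its factory argument are hypothetical:

```python
import threading


class PerThreadCache:
    """Wrap a factory so that each thread lazily gets its own object."""

    def __init__(self, factory):
        self._factory = factory
        self._local = threading.local()

    def get(self):
        """Return this thread's object, creating it on first use."""
        try:
            return self._local.obj
        except AttributeError:
            self._local.obj = self._factory()
            return self._local.obj

    def __getattr__(self, name):
        # Called only for attributes not found on the wrapper itself, so
        # the wrapper exposes the same interface as the wrapped object.
        return getattr(self.get(), name)
```

Callers use the wrapper as if it were the wrapped object, while objects created in different threads are never shared between them.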
Keep
The keepstore S3 v2 driver automatically fills in an AWS-style region parameter when deployed on Google Cloud Platform. #19234
Fixed a bug in keep-web where it did not correctly invalidate its internal caches when files were added to a collection via the S3 API. As part of this work, cache invalidation got more efficient. #19362
Crunch
Crunch now records a container’s exit code before it starts uploading and recording the output, so users can see that information sooner. #18948
Crunch now logs the RAM use of various components used to run a container: crunch-run, arv-mount, and keepstore. #19563
The default value for Containers.ReserveExtraRAM has been increased from 256MiB to 550MiB, to account for typical keepstore overhead. #19702
The Crunch LSF dispatcher now uses the cluster’s configured InstanceTypes to determine whether or not it is possible to run a given container, and cancels it if not. #19418
When running with the Singularity runtime engine, Crunch now sets SINGULARITY_NO_EVAL=1 in the environment to improve container portability across runtimes. #19081
Administrator tools
arv-user-activity reports now exclude automatic token updates from arvados-login-sync. #19179
Fixed a bug in arv-user-activity where it would crash when working on objects without a UUID (e.g., a collection loaded by portable data hash). #19594
Fixed a bug in arvados-login-sync so it checks token status on the correct cluster, and does not constantly issue new tokens. #19400
Most Arvados cluster web services (everything except the old API server and both versions of Workbench) provide a status endpoint that reports the longest-running active requests, including whether or not the client has abandoned them. #19205
Development changes
arvados-server can now start all cluster services, including Workbench 2, keepproxy, keep-web, and arv-git-httpd. #18700 #18947
arvados-server init will now try to automatically obtain SSL certificates from Let’s Encrypt when the cluster uses public DNS names and ports 80 and 443 are usable on the server. #16552
arvados-server boot can now be configured to start multiple clusters, for testing federation functionality. #18699
arvados-dispatch-cloud can now be configured with a loopback driver that mimics a cloud environment. It allows Crunch to create an “instance” that sets up work to run on the local node. This is mainly useful for testing. #15370
When test suites reset the database or its fixtures, test runners suppress SQL logs. This makes it easier to find relevant output from failing tests. #19217
The arvados-server-easy package received several updates to make it more useful, but we still don’t recommend it for any production deployments just yet. #17344
arvbox now builds Ruby gems in a specific order to support arvados-login-sync. #19683
The Arvados Coding Standards now include detailed guidelines for Python docstrings, including markup recommendations that present well in both web and plaintext documentation. #18797
Updated Dependencies
Across Arvados services, we have updated our dependencies to get the latest bug fixes and improvements. #19620 #19629 #19745 #19862 #19877 #19878