Arvados 2.5.0 Release Notes

December 22, 2022

The Arvados team is pleased to announce Arvados 2.5.0. This is a major upgrade, with many new features, changes and bugfixes. We recommend that new and existing installations of 2.4.4 or earlier upgrade to 2.5.0. See Upgrading Arvados for upgrade instructions.

New Features

Workbench 2

Workbench 2 features a new color scheme, designed for improved contrast and consistency. #19462

The process page features new Input and Output panels displaying the input and output parameters of the process. #16073 #19700 #19848

The process page also features a new Resources panel showing what resources were requested by the container, and (on cloud installs) what instance type it was actually run on. #19438

The picking dialog now includes the ability to search for projects and to filter the list of collections within a project. #19783

The process panel includes new information about how much a workflow cost to run. #19319

When you view a workflow step or output collection in Workbench 2, there’s now a row of breadcrumbs at the top of the page that shows you the parent workflow and project. This helps users understand how the object was created, and navigate to those parent objects if desired. #19504

The project view now offers the ability to configure a variety of additional columns for display: UUID, Created at/Trash at/Delete at, Properties, Portable Data Hash, Version, File Count, Requesting Container UUID, Container UUID, Output UUID, Log UUID, and Modified By User UUID. #19690

Workbench 2’s left navigation pane is now collapsible. #19434

Workbench 2 fully supports frozen projects: projects can be frozen and unfrozen, and frozen projects can’t be edited in any way. #18692

When you maximize a Workbench 2 panel, there is now an un-maximize button to easily restore the previous panels. #19300

Installer

The Arvados installer includes a new script to manage the process of setting up a fully-featured multi-node cluster. The multi host install guide has been rewritten, updated, and expanded. In addition, we now provide a Terraform script to assist in deploying on AWS. #19175 #19215

Client tools

arv-mount now features a memory-mapped disk-backed cache instead of a pure RAM cache. This enables the cache to be much bigger by default and leverage the kernel’s filesystem cache without competing with user programs for RAM. The result is the same or better performance when reading files, without needing to tune cache parameters. This is available both when running arv-mount locally and when running containers on Arvados. The size of the cache is controlled with --file-cache like before, and the location is controlled with --disk-cache-dir. #18842 #19872
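
To illustrate the general idea of a memory-mapped disk-backed cache, here is a minimal sketch. It is not the arv-mount implementation: the function names, the one-file-per-block layout, and the block name are all assumptions made for the example. It only shows how data written to a cache directory can be read back through a memory map, so that the kernel's filesystem cache, rather than the process heap, holds the bytes.

```python
import mmap
import os
import tempfile


def cache_block(cache_dir, block_name, data):
    """Write one block to a file in cache_dir and return its path.

    Hypothetical layout: one file per cached block.
    """
    path = os.path.join(cache_dir, block_name)
    with open(path, "wb") as f:
        f.write(data)
    return path


def read_cached(path, offset, length):
    """Read a slice of a cached block through a read-only memory map."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            # Slicing an mmap copies only the requested bytes.
            return view[offset:offset + length]


# A temporary directory stands in for the cache directory.
with tempfile.TemporaryDirectory() as cache_dir:
    path = cache_block(cache_dir, "example-block", b"hello keep cache")
    cached_slice = read_cached(path, 6, 4)

print(cached_slice)
```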

arvados-client shell can now connect to containers running under LSF and SLURM, in addition to the cloud dispatcher. #19166

Running Workflows

Container records now have cost and subrequest_cost fields with information about the container’s cost to run. See the containers reference for details. Container request records now have a cumulative_cost field with information about the request’s total cost to run. See the container requests reference for details. This is calculated using the Price field of the instance type definition in the configuration file. #18205
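
As a rough illustration of how a cost figure could be derived from an instance type's Price, here is a sketch. It assumes that Price is an hourly rate and that cost scales linearly with run time, and it assumes that a request's total is the container's own cost plus the cost of its subrequests. The function names are hypothetical; the containers and container requests references describe the actual calculation.

```python
def container_cost(price_per_hour, runtime_seconds):
    """Estimate a container's own cost (assumes an hourly Price)."""
    return price_per_hour * runtime_seconds / 3600


def total_cost(own_cost, subrequest_costs):
    """Estimate a total: own cost plus the cost of all subrequests."""
    return own_cost + sum(subrequest_costs)


# A container that ran for two hours on an instance priced at 0.50 per hour,
# with two subrequests that cost 0.25 and 0.75.
own = container_cost(0.5, 7200)
total = total_cost(own, [0.25, 0.75])
print(own, total)
```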

Workflow steps submitted by arvados-cwl-runner now have properties cwl_input and cwl_output on the container request. See the API properties documentation for details. #19466

SDK

The Python SDK documentation now includes a full overview of how the API client works, with lots of realistic examples for demonstration. #19791

The Python SDK features a disk-backed cache for Keep. This is used by arv-mount as discussed above, but also available to other users of the Python SDK. #18842 #19872

Administrator tools

Arvados includes a new tool sync-users-tool to synchronize Arvados user records described by a CSV file. You can use this to keep Arvados user records up-to-date with other identity sources like Active Directory or SSO providers. The admin guide on synchronizing has more details. #18858

arvados-client diagnostics is now documented in the admin guide. When possible, the diagnostics also performs a cluster “health check” (arvados-server check described below) and includes any problems in the diagnostics report. #19364 #19377

arvados-server check is now documented in the admin guide. The tool now detects and reports several likely error conditions on a cluster: when different services in the cluster are not using the same configuration; when different services in the cluster are running different versions of Arvados; and when clock times reported by different services in the cluster are a minute apart or more. #18794
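
The three conditions above can be sketched as a simple consistency check over per-service reports. This is only an illustration of the logic, not how arvados-server check is implemented. The report fields (service, config_hash, version, clock) are hypothetical names chosen for the example.

```python
from datetime import datetime, timedelta, timezone


def find_problems(reports, max_skew=timedelta(minutes=1)):
    """Return a list of problem descriptions for a set of service reports.

    Each report is a dict with hypothetical keys:
    "service", "config_hash", "version", and "clock" (a datetime).
    """
    problems = []
    if len({r["config_hash"] for r in reports}) > 1:
        problems.append("services are not using the same configuration")
    if len({r["version"] for r in reports}) > 1:
        problems.append("services are running different versions")
    clocks = [r["clock"] for r in reports]
    # Flag the cluster when the earliest and latest clocks are too far apart.
    if clocks and max(clocks) - min(clocks) >= max_skew:
        problems.append("service clocks differ by a minute or more")
    return problems


now = datetime(2022, 12, 22, 12, 0, 0, tzinfo=timezone.utc)
reports = [
    {"service": "controller", "config_hash": "abc", "version": "2.5.0", "clock": now},
    {"service": "keepstore", "config_hash": "abc", "version": "2.4.4",
     "clock": now + timedelta(seconds=90)},
]
problems = find_problems(reports)
print(problems)
```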

Arvados API

The API server respects a new configuration option Users.CanCreateRoleGroups. If you set this false, non-admin users will not be allowed to create role groups. This is helpful to set when your cluster uses arvados-group-sync as the single source of roles. #19513

The API server now allows sharing objects to be “writable” by the “All Users” group. This entailed some changes to the permissions model for roles: in general, updating a role now requires can_manage permission. Refer to the API permissions model documentation for details. #19269

Group objects returned by the API server now include boolean can_manage and can_write fields indicating the current user’s level of access. The writable_by field is now deprecated in favor of these new fields. #19146
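
A client might use the new boolean fields along these lines. This is a minimal sketch that works on a plain dictionary standing in for a group record. The access_level helper and the sample records are hypothetical, and treating a record with neither flag set as read-only is an assumption made for the example.

```python
def access_level(group):
    """Summarize the current user's access to a group record (a dict)."""
    if group.get("can_manage"):
        return "manage"
    if group.get("can_write"):
        return "write"
    # Assumption: a record the user can see but not write is read-only.
    return "read"


owned_project = {"name": "My project", "can_manage": True, "can_write": True}
shared_project = {"name": "Shared data", "can_manage": False, "can_write": True}
public_project = {"name": "Public data", "can_manage": False, "can_write": False}

for project in (owned_project, shared_project, public_project):
    print(project["name"], access_level(project))
```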

Keep

If an unauthenticated user visits keep-web they will now be redirected to Workbench 2 to log in first. Once the user is logged in, they will be redirected back to keep-web. #17807

Keepstore now uses a new S3 driver by default. This driver features improved S3 read performance. The old v1 driver will be removed in Arvados 2.6.0. #19582

The keep-web S3 API now publishes project and collection properties through the X-Amz-Meta- headers. Control characters in values are MIME encoded. Refer to the new section in our S3 API documentation for details. #19088 #19249
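
A client reading these headers may need to decode MIME encoded values. The sketch below assumes the encoded values use RFC 2047 encoded-words, which Python's standard email.header module can decode, and that the property name is whatever follows the X-Amz-Meta- prefix. The sample header names and values are made up for illustration; the S3 API documentation describes the actual mapping.

```python
from email.header import decode_header

PREFIX = "x-amz-meta-"


def decode_value(raw):
    """Decode a header value that may contain MIME encoded-words."""
    pieces = []
    for text, charset in decode_header(raw):
        # Encoded parts come back as bytes; plain values come back as str.
        if isinstance(text, bytes):
            text = text.decode(charset or "ascii")
        pieces.append(text)
    return "".join(pieces)


def properties_from_headers(headers):
    """Collect metadata headers into a dict keyed by the name after the prefix."""
    return {
        name[len(PREFIX):]: decode_value(value)
        for name, value in headers.items()
        if name.lower().startswith(PREFIX)
    }


# Hypothetical response headers; the second value contains an encoded newline.
sample_headers = {
    "Content-Type": "text/plain",
    "X-Amz-Meta-Sample": "NA12878",
    "X-Amz-Meta-Note": "=?UTF-8?q?line_one=0Aline_two?=",
}
properties = properties_from_headers(sample_headers)
print(properties)
```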

Other changes and bug fixes

Salt Installer

The Salt installer’s nginx configuration now includes the connection upgrade configuration required to support container shells. #19603

Fixed a bug in the Salt installer where the webshell’s nginx configuration always used localhost as upstream instead of shell.domain. #19283

The Salt installer now correctly obtains TLS certificates from Let’s Encrypt for single-node installs. #19169

The Salt installer’s configuration comments and error messages clarify that cluster IDs need to be lowercase. #19169

Workbench 2

Workbench 2 now supports search terms with whitespace when they are surrounded by double quotes. #19051

Workbench 2 no longer displays an extraneous “This field is required” error when editing collection properties. #19732

Process and subprocess listings in Workbench 2 now provide an “Open in new tab” action. #19569

Clicking on the buttons at the top of a collection or process page now jumps directly to the pane instead of auto-scrolling. #19465

If an error occurs when you try to create or edit a project, Workbench 2 will display that error, and let you edit the project again to resolve it. #19691

Workbench 2 now resolves CWL $include and $import directives in subprocess parameters, and displays the resulting object when found. #19684

Workbench 2 now displays a friendly error message when a subprocess parameter does not match the expected type and so cannot be rendered. #19684

Workbench 2 shows a warning when a container previously failed and is being retried. #19093

When Workbench 2 can’t show all the log messages for a running container, the output includes a message explaining this. #19851

Workbench 2 now displays the user who submitted a container request, and the user it’s running as if they’re different. #19315

Workbench 2 now validates new VM logins before they can be submitted. #18979

The “Add User” button has been removed from Workbench 2 because users must be managed by the external identity provider. #19627

Fixed a bug in Workbench 2 where copying the URL of items in the left hand project tree view did not always include the full URL. #19567

Fixed a bug where Workbench 2 would needlessly re-render the collection file browser. #18787

Fixed a bug in Workbench 2 that would cause the top search bar to become unresponsive. #19275

Restored the action menu (as an alternate way to access the context menu for a file or directory) on the collection file browser. #19007

The “Subprocesses” panel is now labeled. #19631

Fixed several bugs in Workbench 2 that would cause it to display VM logins incorrectly. #18979

Fixed a bug that would cause some projects and collections to be incorrectly marked with icons for “favorite” and/or “public favorite”. #19260

Scrolling container logs now works correctly on Safari. #19687

Arvados API

When you configure services in /etc/arvados/config.yml, the InternalURL is now always the address published to clients on the same network. If that address is not where the service should bind to listen for connections, you can specify the bind address with a ListenURL field nested under each InternalURL. Refer to the URLs section of the install guide for details. #16561
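
The relationship between the published address and the bind address can be sketched as follows. The dictionary stands in for one service's internal URL mapping as it might look after the configuration file is parsed. The hostname, port, and the listen_addresses helper are hypothetical. Falling back to the published URL when no ListenURL is given reflects the behavior described above, not a reading of the actual code.

```python
def listen_addresses(internal_urls):
    """Map each published internal URL to the address the service binds to.

    If an entry has no ListenURL, the published URL is used for both.
    """
    return {
        url: (options or {}).get("ListenURL", url)
        for url, options in internal_urls.items()
    }


# Hypothetical entries for one service.
internal_urls = {
    "http://keep0.example.com:25107": {"ListenURL": "http://0.0.0.0:25107"},
    "http://keep1.example.com:25107": {},
}
addresses = listen_addresses(internal_urls)
print(addresses)
```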

The user/unsetup API method now removes all of a user’s permission links as part of its work. This ensures the user won’t have lingering permissions if they are reactivated later. #19501

The API server will only return new authentication tokens to trusted endpoints. The cluster’s Workbench 1 and 2 endpoints are both trusted by default. If you have other endpoints, make sure to add them to Login.TrustedClients in your cluster configuration. See the upgrade notes for details. #19240

The API server no longer allows admin users to edit frozen projects. #19145

The API server used to require that a container’s output was recorded before the exit_code. This restriction has been relaxed, to allow reporting the exit code while output is still being saved. #18948

The API server now enforces that all users are owned by the system user. #19139

The API server now enforces that the root user cannot be disabled or have their admin status revoked. #19206

The API server now reports a helpful error message when a user tries to start a container shell using a too-old version of arvados-client. #19166

The API server now logs, once per configured time period, when a user works with a project, collection, or container request. This provides improved reporting about general cluster activity levels. #19388

The API server’s old logs are now deleted with a background task that replaces the old rake delete_old_container_logs cron job. If you have this cron job running on your cluster, you can remove it after you upgrade to Arvados 2.5.0 or later. See the upgrade notes for details. #18863

The API controller now uses locks to ensure only one instance each of a dispatcher and keep-balance run at a time. #18071

Fixed a bug in the API server where if you tried to save an object with save_with_unique_name=true, but there was a problem saving the object, the unique name logic would cause a second error that masked the first. #19698

Container request runtime constraints now include keep_cache_disk, with an amount of cache storage to allocate. #18842

Client tools

arvados-client diagnostics now uploads a small “hello world” Docker image as part of its tests, and uses this image by default to test Crunch if you don’t specify one yourself. This means the diagnostics can run without having Docker images pre-uploaded to the cluster. #19281

arvados-client diagnostics now cancels the container request it submitted after its check times out. #19364

arv-keepdocker properly lists Docker images with a port number in their repository name. #19840

arv-keepdocker properly stores images that specify the default port 443 in their repository name. #19840

Fixed a bug in arvados-client diagnostics where the tool tested keep-web with a bad filename. #19379

Fixed a bug in Workbench 1 that would send users to a controller error page when they needed to log in on a cluster using LDAP or PAM authentication. #19880

Fixed broken links to the Arvados wiki throughout tools’ documentation. #19710

SDKs

The Python SDK’s arvados.api() constructor now returns a ThreadSafeApiCache. This is an API-compatible wrapper around the original API client object. Because the original client is not natively thread-safe, the wrapper stores a separate client object per thread. For more information, see the arvados.api reference. #19686
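
The per-thread wrapper pattern can be sketched with the standard library alone. This is not the SDK's ThreadSafeApiCache code. PerThreadCache and FakeClient are hypothetical stand-ins that show how a wrapper can lazily build one underlying object per thread and forward attribute access to it.

```python
import itertools
import threading


class PerThreadCache:
    """Forward attribute access to a client object created once per thread."""

    def __init__(self, factory):
        self._factory = factory
        self._local = threading.local()

    def _client(self):
        # Build the client lazily, the first time each thread needs it.
        if not hasattr(self._local, "client"):
            self._local.client = self._factory()
        return self._local.client

    def __getattr__(self, name):
        # Only called for names not found on the wrapper itself.
        return getattr(self._client(), name)


_serial = itertools.count(1)


class FakeClient:
    """Stand-in for a client that is not safe to share across threads."""

    def __init__(self):
        self.serial = next(_serial)


shared = PerThreadCache(FakeClient)
seen_in_thread = []
worker = threading.Thread(target=lambda: seen_in_thread.append(shared.serial))

main_first = shared.serial
worker.start()
worker.join()
main_second = shared.serial
print(main_first, seen_in_thread, main_second)
```

The main thread sees the same underlying object on both accesses, while the worker thread gets its own.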

The Java SDK has a new KeepWebApiClient.upload() method to upload files to Keep via keep-web. #19220

The Java SDK’s KeepWebApiClient now gets its API token from configuration dynamically, so it’s easier to use as a Spring Bean. #19282

Go SDK functions that load data from Keep will now include a block ID in their error message if a requested block is not found. #19600

The R SDK has been updated. It features a new method Collection$readArvFile which handles reading files from Arvados in a variety of formats. #19704

The Perl SDK, which has not been maintained in years, has been removed from this release. #19712

Keep

The keepstore S3 v2 driver automatically fills in an AWS-style region parameter when deployed on Google Cloud Platform. #19234

Fixed a bug in keep-web where it did not correctly invalidate its internal caches when files were added to a collection via the S3 API. As part of this work, cache invalidation got more efficient. #19362

Crunch

Crunch now records a container’s exit code before it starts uploading and recording the output, so users can see that information sooner. #18948

Crunch now logs the RAM use of various components used to run a container: crunch-run, arv-mount, and keepstore. #19563

The default value for Containers.ReserveExtraRAM has been increased from 256MiB to 550MiB, to account for typical keepstore overhead. #19702

The Crunch LSF dispatcher now uses the cluster’s configured InstanceTypes to determine whether or not it is possible to run a given container, and cancel it if not. #19418
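
The interaction between a reserved RAM margin and instance type selection can be sketched as below. This is a simplified illustration, not the dispatcher's actual logic: it assumes a container fits when the instance has at least the requested VCPUs and at least the requested RAM plus the reserve. The dictionary keys and function names are chosen for the example.

```python
MIB = 2**20


def fits(instance, constraints, reserve_extra_ram=550 * MIB):
    """Check whether one instance type can hold a container (simplified)."""
    need_ram = constraints["ram"] + reserve_extra_ram
    return instance["VCPUs"] >= constraints["vcpus"] and instance["RAM"] >= need_ram


def can_run(instance_types, constraints, reserve_extra_ram=550 * MIB):
    """Return True if any configured instance type can run the container."""
    return any(fits(i, constraints, reserve_extra_ram) for i in instance_types)


# Hypothetical cluster with a single 4 GiB instance type.
instance_types = [{"Name": "small", "VCPUs": 2, "RAM": 4096 * MIB}]
constraints = {"vcpus": 1, "ram": 3800 * MIB}

runnable_with_old_default = can_run(instance_types, constraints, 256 * MIB)
runnable_with_new_default = can_run(instance_types, constraints)
print(runnable_with_old_default, runnable_with_new_default)
```

Under these assumptions, a request sized close to an instance's total RAM can stop fitting once the larger reserve applies, so it may be worth reviewing such requests after upgrading.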

When running with the Singularity runtime engine, Crunch now sets SINGULARITY_NO_EVAL=1 in the environment to improve container portability across runtimes. #19081

Administrator tools

arv-user-activity reports now exclude automatic token updates from arvados-login-sync. #19179

Fixed a bug in arv-user-activity where it would crash when working on objects without a UUID (e.g., a collection loaded by portable data hash). #19594

Fixed a bug in arvados-login-sync so it checks token status on the correct cluster, and does not constantly issue new tokens. #19400

Most Arvados cluster web services (everything except the old API server and both versions of Workbench) provide a status endpoint that reports the longest-running active requests, including whether or not the client has abandoned them. #19205

Development changes

arvados-server can now start all cluster services, including Workbench 2, keepproxy, keep-web, and arv-git-httpd. #18700 #18947

arvados-server init will now try to automatically obtain SSL certificates from Let’s Encrypt when the cluster uses public DNS names and ports 80 and 443 are usable on the server. #16552

arvados-server boot can now be configured to start multiple clusters, for testing federation functionality. #18699

arvados-dispatch-cloud can now be configured with a loopback driver that mimics a cloud environment. It allows Crunch to create an “instance” that sets up work to run on the local node. This is mainly useful for testing. #15370

When test suites reset the database or its fixtures, test runners suppress SQL logs. This makes it easier to find relevant output from failing tests. #19217

The arvados-server-easy package received several updates to make it more useful, but we still don’t recommend it for any production deployments just yet. #17344

arvbox now builds Ruby gems in a specific order to support arvados-login-sync. #19683

The Arvados Coding Standards now include detailed guidelines for Python docstrings, including markup recommendations that present well in both web and plaintext documentation. #18797

Updated Dependencies

Across Arvados services, we have updated our dependencies to get the latest bug fixes and improvements. #19620 #19629 #19745 #19862 #19877 #19878