Load entire scarlet model dataset locally when not specifying a blend_id parameter

Description

In https://rubinobs.atlassian.net/browse/DM-49537 we implemented storing scarlet models as zip files so that individual blends could be loaded quickly without loading the entire dataset. This has been causing I/O issues when the entire dataset is loaded (i.e. to populate an entire source catalog), because it makes thousands of separate I/O calls. This ticket fixes that by loading the entire dataset into memory when no blend IDs are specified.
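
For illustration only, here is a minimal sketch of the idea, not the actual scarlet or butler code. It assumes a hypothetical archive layout with one JSON entry per blend named `<blend_id>.json`; the names `build_example_archive` and `load_models` are made up for this example. When no blend IDs are given, the whole archive is pulled into memory with a single read and the entries are then unpacked from that buffer. When blend IDs are given, only those entries are read from the source. The same function accepts either a path or an open binary stream.

```python
import io
import json
import os
import zipfile


def build_example_archive(n_blends):
    """Create an in-memory zip with one JSON entry per (hypothetical) blend."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for blend_id in range(n_blends):
            archive.writestr(f"{blend_id}.json", json.dumps({"blend_id": blend_id}))
    return buffer.getvalue()


def load_models(source, blend_ids=None):
    """Load blend entries from a zip archive given as a path or binary stream.

    With ``blend_ids=None`` the archive is read into memory in one call,
    so the per-entry reads that follow never touch the original storage.
    """
    if blend_ids is None:
        # One sequential read of the whole archive.
        if isinstance(source, (str, os.PathLike)):
            with open(source, "rb") as handle:
                data = handle.read()
        else:
            data = source.read()
        stream = io.BytesIO(data)
    else:
        # Targeted access: let zipfile seek to just the requested entries.
        stream = source

    with zipfile.ZipFile(stream) as archive:
        if blend_ids is None:
            names = archive.namelist()
        else:
            names = [f"{blend_id}.json" for blend_id in blend_ids]
        # Entry names are assumed to look like "<blend_id>.json".
        return {int(name.split(".")[0]): json.loads(archive.read(name)) for name in names}


archive_bytes = build_example_archive(5)
all_models = load_models(io.BytesIO(archive_bytes))
subset = load_models(io.BytesIO(archive_bytes), blend_ids=[1, 3])
```

The trade-off is memory for I/O count: the full read holds the whole archive in memory at once, which is acceptable when every blend is needed anyway.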

Activity

Tim Jenness
April 28, 2025 at 9:22 PM

The NotImplemented line is not being hit because the butler downloads to a local file and reads that when it knows the file is going to be cached, which is the default butler configuration. There is a test in obs_base that explicitly prevents caching, but it's probably not worth your while here. I think IN2P3 hit the problem because in batch we are more targeted in specifying which files should be cached.

Tim Jenness
April 28, 2025 at 9:09 PM

(sorry, pressed the wrong button first time)

Thanks for unifying the read code for stream and path.

Fred Moolekamp
April 24, 2025 at 7:41 PM
(edited)
Done

Details

Assignee

Reporter

Reviewers

Tim Jenness

Story Points

RubinTeam

Checklist

Created April 24, 2025 at 6:36 PM
Updated April 30, 2025 at 2:14 AM
Resolved April 30, 2025 at 2:13 AM