Use PyYAML safe_dump() to write datasets

Description

Using dump() in the YamlFormatter implementation can write datasets that full_load() (which is currently used for reading) cannot read back. Using safe_dump() would catch this problem earlier, at write time.
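A minimal sketch of the mismatch, assuming only that PyYAML is installed. The Point class and the two helper functions are hypothetical and are not part of the formatter. The sketch reads with safe_load() rather than full_load(), because how full_load() treats Python-specific tags may differ between PyYAML releases.

```python
import yaml


class Point:
    """Hypothetical stand-in for an arbitrary in-memory dataset."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def can_safe_dump(obj):
    """Return True if safe_dump() is able to serialize obj."""
    try:
        yaml.safe_dump(obj)
    except yaml.representer.RepresenterError:
        return False
    return True


def can_safe_load(text):
    """Return True if safe_load() is able to parse text."""
    try:
        yaml.safe_load(text)
    except yaml.constructor.ConstructorError:
        return False
    return True


point = Point(1, 2)

# dump() accepts the object and emits a Python-specific object tag.
unsafe_text = yaml.dump(point)

# The failure only shows up later, when a restricted loader reads the file.
readable_later = can_safe_load(unsafe_text)

# safe_dump() refuses the same object immediately, at write time.
writable_safely = can_safe_dump(point)
```

With safe_dump() the error is raised by the writer instead of being discovered by whoever reads the dataset later.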

On the other hand, a tagged value is used to ensure round-tripping of Astropy times. Make sure that this usage is not obstructed by changing to safe_dump().
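A sketch of how a tagged value can still round-trip through safe_dump() and safe_load(), assuming the representer and constructor are registered on SafeDumper and SafeLoader. FakeTime is a hypothetical stand-in for an Astropy time, and the tag name "!example_time" is made up for illustration. Neither reflects the actual hooks in the code base.

```python
import yaml


class FakeTime:
    """Hypothetical stand-in for an Astropy time; stores an ISO string."""

    def __init__(self, iso):
        self.iso = iso

    def __eq__(self, other):
        return isinstance(other, FakeTime) and self.iso == other.iso


TAG = "!example_time"  # hypothetical tag name, for illustration only


def represent_time(dumper, data):
    # Emit the time as a scalar carrying the custom tag.
    return dumper.represent_scalar(TAG, data.iso)


def construct_time(loader, node):
    # Rebuild the object from the tagged scalar.
    return FakeTime(loader.construct_scalar(node))


# Register the hooks on the safe classes so that safe_dump() and
# safe_load() both know about the tag.
yaml.add_representer(FakeTime, represent_time, Dumper=yaml.SafeDumper)
yaml.add_constructor(TAG, construct_time, Loader=yaml.SafeLoader)

original = {"start": FakeTime("2020-11-04T01:34:00")}
text = yaml.safe_dump(original)
restored = yaml.safe_load(text)
```

If the hooks were registered only on the default Dumper and Loader, safe_dump() would reject the object, so the registration target is the thing to check when switching.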

The documentation of _readFile() in the formatter also incorrectly claims that UnsafeLoader is used; please fix this.

Activity

Kian-Tat Lim
November 4, 2020 at 11:57 PM

I guess the extra parameter is OK, though if people start copying it we'll be back to where we were before.

One code conciseness suggestion.

Tim Jenness
November 4, 2020 at 10:23 PM

I switched to safe_dump and safe_load by default, but to allow PropertyList to work I needed to add a write parameter to the formatter to let it use the unsafe dump. I think this should be fine, since you have to go out of your way to enable it, and it won't help you in the long term because safe_load would fail anyhow if you haven't added the proper YAML hooks. If you object, can I put the "write a new PropertyList formatter" work in a new ticket?

Feel free to ignore the YAML formatter stuff, but can you please take a look at the changes to test_butler.py to see if I got the tags vs collection right? Everything seemed to be okay, except that the datasetExists calls after the pruneDatasets were confusing me a bit.

Done

Details

Reviewers

Kian-Tat Lim


Created November 4, 2020 at 1:34 AM
Updated November 5, 2020 at 6:20 PM
Resolved November 5, 2020 at 1:55 AM