Choose the collection you’d like to edit and go to its homepage
Example: “Chemical Engineering Faculty Publication Series”
From the administrator menu bar on the left-hand side of the screen, choose the fourth menu item
Click Export and choose “Export Metadata”. ScholarWorks will list the collection you’re in as the first choice. If that’s correct, choose it. If not, search for the collection you’d like to edit
ScholarWorks will ask if you’re sure you want to export the metadata. Click “Export” to confirm.
You’ll be taken to a Process overview page. You’ll know the process is completed when there’s a .csv file in the Output files
Click the csv file to download it.
Open OpenRefine
OpenRefine will ask you where your data is coming from. Choose This Computer and click the “Choose Files” button
Select the file that you just downloaded from ScholarWorks and click “Next”.
OpenRefine will show you a preview. Confirm that the columns are separated by commas (CSV) and then click “Create Project”
You will delete every column except for id, collection, dc.date.issued, dc.identifier.doi, dc.relation.url, dc.subject, and dc.type
To remove columns, click on the dropdown arrow at the top of a column and choose Edit Column > Remove this Column
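For a large export, the same column pruning can be checked or reproduced outside OpenRefine. The following is a minimal sketch only, assuming the export uses exactly the column names listed above; the function name `keep_columns` is hypothetical and not part of ScholarWorks or OpenRefine.

```python
import csv
import io

# Columns the procedure says to keep; every other column is dropped.
KEEP = [
    "id", "collection", "dc.date.issued", "dc.identifier.doi",
    "dc.relation.url", "dc.subject", "dc.type",
]


def keep_columns(csv_text, keep=KEEP):
    """Return CSV text that contains only the columns named in `keep`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Keep only the wanted columns that actually exist in this export.
    fields = [name for name in keep if name in (reader.fieldnames or [])]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fields, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()
```

Columns come out in the order given in `KEEP`, which may differ from the order in the original export.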
To edit dc.date.issued
Click the dropdown arrow at the top of the column and choose Facet > Text facet
Edit dates in the facet window that appears on the left side of the screen
We only need the year. A year and month is fine, as long as the month is not January. If you see a date in the form YYYY-MM-01, delete the “01”
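The YYYY-MM-01 cleanup can be sketched as a small function. This is an illustration under two assumptions that the steps above do not confirm: that the “01” is a placeholder day, and that the hyphen in front of it should be removed along with it. It does not touch January year-month values, which still need a manual decision in the facet window. The name `drop_placeholder_day` is hypothetical.

```python
import re


def drop_placeholder_day(date):
    """Turn 'YYYY-MM-01' into 'YYYY-MM'; leave every other value unchanged."""
    # Assumption: a day of exactly '01' is a placeholder, not a real date.
    return re.sub(r"^(\d{4}-\d{2})-01$", r"\1", date)
```

For example, `drop_placeholder_day("2019-05-01")` gives `"2019-05"`, while `"2019-05-17"` and `"2019"` are returned as they are.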
To edit dc.identifier.doi
Click the dropdown arrow at the top of the column and choose Edit cells > Transform
In the formula box, type value.replace("https://doi.org/", "") using straight quotation marks, not curly ones. This removes the https://doi.org/ prefix from every cell that has a DOI
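To preview what the DOI transform does before running it on the whole column, here is a minimal Python sketch of the same replacement. The names `DOI_PREFIX` and `strip_doi_prefix` are hypothetical, chosen for illustration.

```python
DOI_PREFIX = "https://doi.org/"


def strip_doi_prefix(value):
    """Remove the https://doi.org/ prefix, leaving the bare DOI."""
    # Like the OpenRefine formula, this replaces every occurrence of the prefix.
    return value.replace(DOI_PREFIX, "")
```

A cell that already holds a bare DOI passes through unchanged. Variants such as `http://doi.org/` or `https://dx.doi.org/` are not matched by this exact string, so cells that use them would still need to be fixed by hand or with a second transform.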
If you need to convert commas in dc.subject to ||’s, click the dropdown arrow at the top of the column and choose Edit cells > Transform
In the formula box, type value.replace(",", "||") using straight quotation marks, not curly ones
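One thing to watch for with the subject transform is the space that usually follows each comma. The sketch below shows the literal equivalent of the formula next to a variant that also trims whitespace. Both function names are hypothetical, and whether ScholarWorks trims the stray space on import is not stated here, so treat the second variant as an optional precaution.

```python
def commas_to_pipes(value):
    """Literal equivalent of the formula: every comma becomes '||'."""
    return value.replace(",", "||")


def commas_to_pipes_trimmed(value):
    """Same idea, but strips the whitespace around each subject term."""
    return "||".join(part.strip() for part in value.split(","))
```

With the input `"Catalysis, Polymers"`, the first function returns `"Catalysis|| Polymers"` (note the leftover space) and the second returns `"Catalysis||Polymers"`. Neither can tell a separator comma from a comma that belongs inside a single subject term, so it is worth scanning the column after the transform.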
The dc.relation.url column might be tricky to edit because it will occasionally contain items that we really want to link out to, so we’ll come back to this later.
To edit dc.type
Click the dropdown arrow at the top of the column and choose Facet > Text facet
Edit types in the facet window that appears on the left side of the screen.
These are the default types currently loaded in ScholarWorks:
| Type |
|---|
| Animation |
| Article |
| Book |
| Book chapter |
| Dataset |
| Learning Object |
| Image |
| Image, 3-D |
| Map |
| Musical Score |
| Newsletter |
| Plan or blueprint |
| Preprint |
| Presentation |
| Recording, acoustical |
| Recording, musical |
| Recording, oral |
| Software |
| Technical Report |
| Thesis |
| Video |
| Working Paper |
| Other |
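After editing, a quick check against the list above can catch values that still do not match a default type. This is a minimal sketch, assuming each cell holds a single type value and that matching is case-sensitive; `DEFAULT_TYPES` is copied from the table above and `unknown_types` is a hypothetical helper name.

```python
# The default types listed in the table above.
DEFAULT_TYPES = {
    "Animation", "Article", "Book", "Book chapter", "Dataset",
    "Learning Object", "Image", "Image, 3-D", "Map", "Musical Score",
    "Newsletter", "Plan or blueprint", "Preprint", "Presentation",
    "Recording, acoustical", "Recording, musical", "Recording, oral",
    "Software", "Technical Report", "Thesis", "Video", "Working Paper",
    "Other",
}


def unknown_types(values):
    """Return the distinct non-empty values that are not default types."""
    return sorted({v for v in values if v and v not in DEFAULT_TYPES})
```

For example, `unknown_types(["Article", "article", "Journal Article", ""])` returns `["Journal Article", "article"]`, which are the values that would still need attention in the facet window.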
Once you’re done with these edits, you’ll export the .csv file.
Click Export in the top right corner and choose comma-separated value. Save the file wherever you’d like.
To import the file back into ScholarWorks, go back to the ScholarWorks admin menu and choose “import > metadata” (third item from the top).
ScholarWorks will default to CSV validation, so be sure to click on “Import metadata”.
Choose the file you exported from OpenRefine. If you’d prefer to validate the file before importing it, click the maroon validate button.