Thanks for your fantastically useful json2csv script - I've been using it to parse data from OpenLibrary dumps. It's working very well, even though the OL data is very inconsistently structured. One question, though, if I may...
In a case where a value is a list of several items, e.g.
{"subjects": ["Books and reading -- Fiction.", "Storytelling -- Fiction.", "Death -- Fiction.", "Jews -- Germany -- History -- 1933-1945 -- Fiction."]}
json2csv appears to strip out the commas between the items, so the four different subjects all get merged into one. With -k subjects, the output looks like this:
[Books and reading -- Fiction. Storytelling -- Fiction. Death -- Fiction. Jews -- Germany -- History -- 1933-1945 -- Fiction.]
Is there a straightforward way to get it to preserve those multiple items within a value? (I don't need them as separate fields in the CSV, but would like to preserve the distinction within the 'subjects' field, if you see what I mean - so they could be delimited by something other than a comma.)
(I tried using the -d flag to set a different field delimiter, e.g. semicolon, but it still stripped out the commas as above.)
Edit: another example...
"subject_places": ["United States", "China"]
comes out as
[United States China]
so, alas, there's no practical way to split the items apart again automatically.
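For what it's worth, one workaround I can imagine is pre-processing each JSON line before it reaches json2csv, so that list values are already joined with an unambiguous separator. The sketch below is only that: it is not a json2csv feature, the helper names `join_lists` and `flatten_line` are made up for illustration, and it assumes one flat JSON object per line and that the separator (`" | "` here) never occurs in the data.

```python
import json

def join_lists(record, sep=" | "):
    """Return a copy of record with every list value joined into one string.

    Hypothetical helper: only top-level keys are handled, nested objects
    are passed through unchanged.
    """
    out = {}
    for key, value in record.items():
        if isinstance(value, list):
            out[key] = sep.join(str(item) for item in value)
        else:
            out[key] = value
    return out

def flatten_line(line, sep=" | "):
    """Parse one JSON line, join its list values, and re-serialize it."""
    return json.dumps(join_lists(json.loads(line), sep), ensure_ascii=False)

if __name__ == "__main__":
    sample = '{"subject_places": ["United States", "China"]}'
    print(flatten_line(sample))
```

The re-serialized lines could then be fed to json2csv as usual, and the 'subjects' field would keep its items separated by the chosen delimiter.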