Add a subset function that takes the voiced or non-voiced audio segments and writes them to a new audio file
Preferably with av (linked to ropensci/av#52)
- status: after some hacking on the internal C code in av, I haven't managed to do this yet, but it should be doable
- workaround: read the audio data into R, append all the voiced segments in R, and then write a wav file again (this already works)
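
The workaround could look roughly like the base R sketch below. It is not the package's implementation: `subset_samples()` is a hypothetical helper, and it assumes the audio is already in R as a mono numeric vector, and that the segments are a data.frame with columns `start` and `end` (in seconds) and `has_voice` (logical). Reading and writing the wav file is left out.

```r
# Hypothetical helper: keep only the voiced (or non-voiced) samples.
# Assumes `segments` has columns start, end (seconds) and has_voice (logical).
subset_samples <- function(samples, sample_rate, segments, voiced = TRUE) {
  keep <- segments[segments$has_voice == voiced, , drop = FALSE]
  # Convert segment boundaries from seconds to 1-based sample indices
  from <- round(keep$start * sample_rate) + 1
  to   <- pmin(round(keep$end * sample_rate), length(samples))
  # Skip segments that are empty after rounding or lie beyond the audio
  ok   <- from <= to
  idx  <- unlist(Map(seq, from[ok], to[ok]), use.names = FALSE)
  samples[idx]
}

# Toy example: 10 samples at 10 Hz, voiced between 0.2 s and 0.6 s
samples  <- as.numeric(1:10)
segments <- data.frame(
  start     = c(0, 0.2, 0.6),
  end       = c(0.2, 0.6, 1.0),
  has_voice = c(FALSE, TRUE, FALSE)
)
voiced   <- subset_samples(samples, 10, segments, voiced = TRUE)   # 3 4 5 6
unvoiced <- subset_samples(samples, 10, segments, voiced = FALSE)  # 1 2 7 8 9 10
```

The resulting vector can then be written out with whatever wav writer is already used in the workaround.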
The subset should keep the segment information (the original start and end times) so that the transcription can be aligned back to the original audio afterwards
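
One way to keep that information is to store, for each kept segment, its position in both the original and the subset audio. The sketch below is only an illustration under assumptions: `segment_map()` and `to_original_time()` are hypothetical names, and the segments are assumed to be a data.frame with columns `start` and `end` (in seconds) and `has_voice` (logical).

```r
# Hypothetical helper: link each kept segment to its place in the subset audio.
segment_map <- function(segments, voiced = TRUE) {
  keep <- segments[segments$has_voice == voiced, , drop = FALSE]
  duration   <- keep$end - keep$start
  subset_end <- cumsum(duration)
  data.frame(
    orig_start   = keep$start,
    orig_end     = keep$end,
    subset_start = subset_end - duration,
    subset_end   = subset_end
  )
}

# Map timestamps (seconds, >= 0) in the subset audio back to the original timeline.
to_original_time <- function(t, map) {
  # Index of the kept segment whose subset_start is the last one <= t
  i <- findInterval(t, map$subset_start)
  map$orig_start[i] + (t - map$subset_start[i])
}

# Toy example: voiced speech at 2-5 s and 9-12 s of the original audio
segments <- data.frame(
  start     = c(0, 2, 5, 9),
  end       = c(2, 5, 9, 12),
  has_voice = c(FALSE, TRUE, FALSE, TRUE)
)
map <- segment_map(segments)
original_times <- to_original_time(c(1, 4), map)  # 3 10
```

Timestamps returned by the transcription of the subset file could be passed through `to_original_time()` to place each word or sentence on the original recording.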