Why Is the Tar Flag --strip-components Sometimes Ignored?

tar

I have an automated Docker image build in which I download an Elasticsearch archive and extract it with:

tar zxf archive.tar.gz --strip-components=1 -C /path/to/dir

This always worked until the latest releases (6.8.5 and 7.4.2). It no longer works for 6.8.5: the --strip-components flag has no effect. It still works fine for 7.4.2. Comparing the two archives, the only difference I found is the ownership of the files inside them: 631:503 in 6.8.5 vs root:root in 7.4.2. However, if ownership were the problem, the --no-same-owner or --user flags should have fixed it, but they didn't. I even created a user and group with those IDs and extracted the archive as that user, which also had no effect.

This is how you can reproduce it (replace 6.8.5 with 7.4.2 to try both):

$ docker run --rm -ti alpine:3.10.3 sh

### from the container

$ wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.8.5.tar.gz
$ apk add --update tar
$ mkdir elastic
$ tar zxvf elasticsearch-6.8.5.tar.gz --strip-components=1 -C elastic
$ ls -la elastic

With 6.8.5 you'll see the intermediate directory that wasn't stripped; with 7.4.2 you won't, even though it exists in both archives.

As you may notice, I'm not using Alpine's default (BusyBox) tar; I use the GNU version from the Alpine packages (version 1.32), which has been there for a few months already. I use this package with the same flags in many other builds and it works just fine for me.

Best Answer

As an Elastic staffer explained to me on GitHub, this happens because of a leading ./ on the paths within the archive:

/ # tar tvf elasticsearch-6.8.5.tar.gz --numeric-owner | head -n 2
drwxr-xr-x 631/503           0 2019-11-14 14:20 ./
drwxr-xr-x 631/503           0 2019-11-13 20:07 ./elasticsearch-6.8.5/
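
To see why the leading ./ matters, here is a simplified model of how --strip-components counts path components. This is only a sketch: strip_path is a hypothetical helper written for illustration, not part of tar, and the member path bin/elasticsearch is just an example entry. It assumes each slash-separated piece, including the leading ".", counts as one component.

```shell
# Simplified model of --strip-components: drop the first N slash-separated
# components from an archive member path.
# strip_path is a hypothetical helper for illustration, not part of tar.
strip_path() {
    p=$1
    n=$2
    while [ "$n" -gt 0 ]; do
        p=${p#*/}        # remove everything up to and including the first "/"
        n=$((n - 1))
    done
    printf '%s\n' "$p"
}

# With a "./" prefix, stripping 1 component only removes the ".":
strip_path ./elasticsearch-6.8.5/bin/elasticsearch 1
# Stripping 2 components removes the top-level directory as well:
strip_path ./elasticsearch-6.8.5/bin/elasticsearch 2
# Without the "./" prefix, 1 is enough:
strip_path elasticsearch-7.4.2/bin/elasticsearch 1
```

In this model the first call still prints a path that begins with elasticsearch-6.8.5/, which matches the unstripped directory seen in the reproduction.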

So in this case --strip-components should be 2, not 1. To handle this situation universally, you can list the archive before extracting and, if the first entry is ./, set the --strip-components count dynamically:

$ if tar tf elasticsearch-6.8.5.tar.gz | head -n 1 | grep -q '^\./$'; then STRIP_COMPONENTS_COUNT=2; else STRIP_COMPONENTS_COUNT=1; fi
$ tar zxvf elasticsearch-6.8.5.tar.gz --strip-components="$STRIP_COMPONENTS_COUNT" -C elastic
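
If the same check is needed in several builds, it can be wrapped in a small shell function. This is a minimal sketch rather than anything from the Elastic thread: the name strip_count is made up for illustration, and it assumes the archive has exactly one top-level directory and that a ./ prefix on the first listed entry means every path in the archive carries it. Unlike the one-liner above, it also treats a first entry such as ./elasticsearch-6.8.5/ (with no separate ./ entry) as needing a count of 2.

```shell
# Hypothetical helper: reads a `tar tf` listing on stdin and prints the
# --strip-components value needed to drop the single top-level directory.
strip_count() {
    first=""
    # Only the first entry of the listing is inspected.
    IFS= read -r first || true
    # Drain the rest of the listing so the producer is not killed by SIGPIPE.
    cat >/dev/null
    case "$first" in
        ./*) echo 2 ;;   # "./" or "./dir/...": the leading "." is a component
        *)   echo 1 ;;
    esac
}

# Usage (commented out so the snippet can be sourced without the archive):
# n=$(tar tf elasticsearch-6.8.5.tar.gz | strip_count)
# tar zxf elasticsearch-6.8.5.tar.gz --strip-components="$n" -C elastic
```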

But honestly, good archives should be created without a leading ./, which is very confusing: unless you list the files in the archive, you won't notice any difference.
