Note: this is a self-answered question, to help anyone in a similar situation.
While upgrading Terraform to 0.15, we got the following error message (along with similar messages for the aws and random providers):
> terraform -chdir=aws init --upgrade
Upgrading modules...
Downloading git@github.com:penngineering/ops-terraform-lambda.git?ref=tf-15 for stream_handler...
- stream_handler in .terraform/modules/stream_handler
Initializing the backend...
Successfully configured the backend "s3"! Terraform will automatically
use this backend unless the backend configuration changes.
╷
│ Error: Invalid legacy provider address
│
│ This configuration or its associated state refers to the unqualified provider "pagerduty".
│
│ You must complete the Terraform 0.13 upgrade process before upgrading to later versions.
╵
...
However, we don't use the pagerduty provider, and neither does any module that our root module references. Running terraform providers gives this output:
Providers required by configuration:
.
├── provider[registry.terraform.io/hashicorp/archive] 2.2.0
├── provider[registry.terraform.io/hashicorp/aws] 3.39.0
├── provider[registry.terraform.io/hashicorp/local] 2.1.0
├── provider[registry.terraform.io/hashicorp/null] 3.1.0
└── provider[registry.terraform.io/hashicorp/random] 3.1.0
Providers required by state:
provider[registry.terraform.io/hashicorp/aws]
provider[registry.terraform.io/hashicorp/local]
provider[registry.terraform.io/hashicorp/null]
provider[registry.terraform.io/hashicorp/random]
provider[terraform.io/builtin/terraform]
provider[registry.terraform.io/hashicorp/archive]
Downloading the external state and grepping for providers shows the same list of providers.
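For anyone repeating that check, here is a minimal sketch of one way to do it. The helper name list_state_providers is made up for illustration, and it assumes the pretty-printed JSON that terraform state pull emits, with one key per line:

```shell
# Hypothetical helper: print the distinct "provider" entries found in a
# Terraform state file read from standard input.
# Assumes pretty-printed JSON with one key per line.
list_state_providers() {
    grep '"provider":' | sed 's/^[[:space:]]*//' | sort -u
}
```

It could be used as terraform -chdir=aws state pull | list_state_providers, which should print one line per provider address recorded in the state.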
Turning on logging didn't help: we could see where Terraform was making calls to retrieve remote state, then it printed the error messages and exited. No indication of what it was doing to try to figure out the provider config.
Googling didn't turn up much; in most cases people hadn't done the 0.13 upgrade correctly. We did update the state so that provider names exactly matched our required_providers config, as indicated by this StackOverflow answer, but it didn't help.
Best Answer
Googling did turn up this thread, which led us to the answer.
Terraform has the notion of a "default" workspace, and if you run terraform init without specifying a workspace, it will initialize based on this workspace. However, we don't use the default workspace; each of our root modules is in an explicit workspace.
Our build process used terraform workspace select ... before plan and apply. But you can't select a workspace this way until you init. And init was using the default workspace (which I guess has an actual default config inside the Terraform binary, used for initializing completely new projects).
The solution is to use the TF_WORKSPACE environment variable when running init (we now use it everywhere).
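As a rough sketch of what that can look like in a build script: the wrapper name tf_in_workspace and the workspace name example-workspace are placeholders invented for illustration, while -chdir=aws matches the command shown in the question:

```shell
# Hypothetical wrapper: run any Terraform command against the "aws" root
# module with TF_WORKSPACE set, so that init (and everything after it)
# uses the named workspace instead of "default".
tf_in_workspace() {
    workspace="$1"
    shift
    TF_WORKSPACE="$workspace" terraform -chdir=aws "$@"
}

# Example usage (placeholder workspace name):
#   tf_in_workspace example-workspace init --upgrade
#   tf_in_workspace example-workspace plan
```

Setting the variable per command, rather than exporting it once, keeps the workspace choice explicit at each call site. Exporting TF_WORKSPACE at the top of the script would work just as well if every command targets the same workspace.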