Analytics token as a query parameter
We really like the analytics functionality offered by this registry but we're curious if the analytics token can be provided via a query parameter instead of the URL path.
As an example, the following URL:
myregistry.io/my-token__namespace/module/aws
would instead be:
myregistry.io/namespace/module/aws?analytics-token=my-token
This is admittedly mostly cosmetic, but the consumers of the modules find the explicit query parameter more intuitive, and it makes it more obvious that the example value needs to be replaced. It also makes linking back to the registry from code a bit easier.
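For context, this is roughly how the current path-based token appears in a module call (a sketch using the example values above; the version is made up):

```hcl
module "example" {
  # The analytics token sits in the first path segment, separated from the
  # namespace by a double underscore.
  source  = "myregistry.io/my-token__namespace/module/aws"
  version = "1.0.0" # hypothetical version
}
```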
If you think this is possible to support then I'd be willing to try and contribute towards it. But I'm not sure if there are certain restrictions or limitations of Terraform registries that would prevent this from working.
Github reference: https://github.com/MatthewJohn/terrareg/issues/66
Activity
- Author Guest
From @MatthewJohn
Hey @frittsy ,
Whilst this would be lovely - the main reason for the double underscore was just to get around the limitations of Terraform/OpenTofu.
Placing a URL with get parameters into a Terraform call produces:
```
Initializing the backend...
Initializing modules...
╷
│ Error: Invalid registry module source address
│
│   on main.tf line 38, in module "example-submodule":
│   38:   source = "local-dev.dock.studio:5000/test/example-module/null?something=yes"
│
│ Failed to parse module registry address: module registry addresses may not include a query string portion.
│
│ Terraform assumed that you intended a module registry source address because you also set the argument "version", which applies only to registry modules.
╵
```
I'm up for any suggestions, but, as I say, Terraform is the main limiting factor - if you find a more elegant URL structure, I'm all ears ;) (Just try terraform init a module with the desired URL).
The ticket in which I investigated how to structure the URLs can be found here: #8 (closed), including the link to the various regexes that Terraform uses against the URL
Many thanks Matt
Link: https://github.com/MatthewJohn/terrareg/issues/66#issuecomment-2442132160
github-comment-id:2442132160
- Author Guest
From @frittsy
@MatthewJohn sorry for taking so long to truly review this. I figured there would be limitations but it's great to know where they happen, so thank you.
In your original investigation you mentioned this one option "Inject 'data' terraform object", could you elaborate on that a bit? I wouldn't mind "invasive" in my particular use case, but I can't picture what you meant by this. Would it involve invoking some sort of `local-exec` command upon planning?
Link: https://github.com/MatthewJohn/terrareg/issues/66#issuecomment-2472171779
github-comment-id:2472171779
- Author Guest
From @MatthewJohn
Would it involve invoking some sort of local-exec command upon planning?
Yes exactly, this is precisely the idea, though what I specifically meant by the "data" terraform object was the https://registry.terraform.io/providers/hashicorp/http/latest/docs/data-sources/http data source :) But local-exec could certainly achieve the same thing (though, since it would be notifying the Terrareg API, local-exec would need to determine a platform-specific command etc., so the http data source would make this easier).
In terms of revisiting this and reviewing possible alternatives, from your description you mention that it's not obvious to your users what needs to be replaced - is there a way this can be tackled? Are they using the Terrareg UI to discover the modules? Is there some way we can improve how the UI shows users how to use the modules and add the analytics token?
I guess, from the point of view of the data-source alternative, it would probably result in having to inject an additional variable into the module as well, which would probably be easy to understand.
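As a rough sketch of that alternative (purely illustrative: the endpoint URL and variable name are assumptions, and no such endpoint existed at this point in the discussion):

```hcl
# Hypothetical variable that would be injected into the module alongside the snippet
variable "terrareg_analytics_token" {
  type        = string
  description = "Analytics token to report back to Terrareg"
  default     = ""
}

# Hypothetical http data source that notifies the registry when the module is planned
data "http" "terrareg_analytics" {
  url    = "https://terrareg.example.com/v1/terrareg/analytics/${var.terrareg_analytics_token}"
  method = "POST"

  request_headers = {
    Content-Type = "application/json"
  }
}
```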
Link: https://github.com/MatthewJohn/terrareg/issues/66#issuecomment-2472609324
github-comment-id:2472609324
- Author Guest
From @frittsy
I may have painted an inaccurate picture of my motives here, and our goals have changed a little bit in the past couple weeks as we prep this for Production use, so bear with me as I'll probably go outside the scope of the original issue.
The registry does a great job of making it clear on the module's landing page how these tokens work, it returns 401s when the token isn't replaced, we're writing internal documentation as well, etc. - all of this is good.
Ultimately the goal here would be to gain the benefits of analytics without relying on consumers to do anything. Frankly we just don't trust our consumers to always do it correctly, especially given the high percentage of Terraform code that gets copied and pasted in our org. I think (some) consumers would initially get the module code from Terrareg, but then copy their repository code + analytics token repeatedly.
We also want to have a standardized format for our analytics tokens that identifies the team/division within the org and the repository in which the module is consumed (i.e. `MatthewJohn:inf-repo`). Again, we don't 100% trust consumers to always do this correctly, and I would prefer to automate it for them, either via this http data source idea or by injecting the analytics token into the URL as part of our CI/CD pipeline templates.

I was actually exploring the latter option, so consumers would reference the modules without an analytics token, and the pipeline running the TF plan would calculate the correct analytics token we want + inject it. We enabled `ALLOW_UNIDENTIFIED_DOWNLOADS` so people can run local TF plans on their personal machines, and then set a single `ANALYTICS_AUTH_KEYS` so that only downloads performed from our build agents are logged for analytics. The only issue with this approach is that the module page in Terrareg always includes the instructions for analytics tokens unless `DISABLE_ANALYTICS` is true, which is not what we want either.

So I think the simplest solution might be an additional environment variable to opt out of the instructions + examples for analytics tokens, but still collect analytics? The http data source is probably a more all-encompassing solution, but would also require hiding the instructions too. Really open to anything you have in mind. Thanks in advance for reading this :)
Link: https://github.com/MatthewJohn/terrareg/issues/66#issuecomment-2477202191
github-comment-id:2477202191
- Author Guest
From @MatthewJohn
Hey @frittsy,
Thank you for the explanation - it makes complete sense. Before suggesting any technical implementations on the Terrareg side, could I suggest, or at least propose, something, although it's somewhat dirty.
If you have got standardised CI/CD pipelines, could they scan the Terraform files for any module sources and fail if they provide an analytics token (or remove it)? And then compile an analytics token of your choice using the CI/CD env variables and inject it back into the module source?
A rough idea:
```bash
# Remove any existing analytics tokens from module sources
find . -type f -name '*.tf' -exec sed -i -E 's#(yourregistry\.example\.com/)[^/]*__([^/]+)#\1\2#g' "{}" \;

# Generate new analytics token
analytics_token="${CI_ORG}_${CI_REPO}"

# Remove any invalid characters in analytics token
analytics_token=$(echo "$analytics_token" | sed 's/[^a-zA-Z0-9_-]/_/g')

# Inject analytics token before the namespace
find . -type f -name '*.tf' -exec sed -i -E "s#(yourregistry\.example\.com/)([^/]+)#\1${analytics_token}__\2#g" "{}" \;
```
But happy to discuss further options that the registry can do to make this easier
Many thanks Matt
Link: https://github.com/MatthewJohn/terrareg/issues/66#issuecomment-2478049871
github-comment-id:2478049871
- Author Guest
From @frittsy
Yes I can absolutely do this, and probably should as a fail safe. One of the many benefits of having standardized and highly-adopted pipeline templates for Terraform.
It just becomes a tad confusing and conflicting when reading Terrareg, that's all. But our internal documentation can clear this up, and people will get used to it quickly, so it's not a deal-breaker by any means.
Link: https://github.com/MatthewJohn/terrareg/issues/66#issuecomment-2479695247
github-comment-id:2479695247
- Author Guest
From @MatthewJohn
The main issue with ensuring users pass the correct value and avoid copying and pasting... the "interface" between the Terraform code and Terrareg is one of a few things:
- The URL they provide
- The Terraform auth token (which arguably could be used for analytics)
- The parameters they provide (if we start manipulating the module)
From these options, the first and last are in the users' control - if they don't provide an analytics token correctly in the URL, then injecting custom variables into the module would still require them to populate those. If we were to inject code into modules, we could use `local-exec` and environment variables, but not only is this somewhat platform dependent, it would also likely lead down a dark rabbit hole - reading environment variables and making API calls outside of the users' control, which is possibly a bit of a moral barrier I'm reluctant to break, aside from the harsh compatibility issues it would incur. The "onboarding" of Terrareg "deployers" into how this works would also be quite troublesome, and many people would likely want to inject different environment variables and different Terraform code. And, aside from any of this, it would only work if the Terrareg "archive" of modules is used, rendering the git URL passthrough (which I think is the most commonly used) either unusable or providing different functionality to the archive workflow. From here, we could instead allow the deployer (yourself) to ensure the analytics token in the URL is valid - you can perform any manipulation that you see reasonable within your own pipelines.

The last option is the auth token - I could see this providing more benefit, having it identify both the environment and include the analytics token for the deployment, though it would require some "re-jigging".
Apart from these, I see few other options available :D
Hope this all makes sense
Many thanks Matt
Link: https://github.com/MatthewJohn/terrareg/issues/66#issuecomment-2479737349
github-comment-id:2479737349
- Author Guest
From @frittsy
Agree wholeheartedly about the registry's role in injecting code into the modules, I wouldn't want Terrareg to attempt to do that either. I was thinking about an additional API endpoint for Terrareg that would allow manual logging of analytics. Upon publishing a module to Terrareg, I can inject the custom `http` data source code I want with all my preferences, platform assumptions, etc. that will make the API call to Terrareg (and yes, I do use the s3 backend for module hosting instead of passthrough). Maybe the data source injection is overkill and I could just make the API call from the pipeline for each module reference I find. Either way, I would probably want to disable the automated analytics and just rely on these API calls.

I'm definitely into the auth token solution as well, without knowing the full implications of that either.
Since all these options end up abstracting the analytics token away from the end user, I think my concern still stands about the token instructions being displayed unless `DISABLE_ANALYTICS` is set to true, which then also removes the analytics tab from the UI.

Thanks for your continued support and brainstorming on this. Have a great weekend
Link: https://github.com/MatthewJohn/terrareg/issues/66#issuecomment-2480317452
github-comment-id:2480317452
- Author Guest
From @MatthewJohn
I was thinking about an additional API endpoint for Terrareg that would allow manual logging of analytics
Yes - apologies - this is a good idea - I was actually thinking the same yesterday morning and completely forgot to mention it in my reply. The main things I was considering, that I hadn't worked out, were how the following information might be provided:
- Terraform version (this is provided by Terraform in headers when calling a module). The only way I could imagine doing this is if your pipeline determined the terraform version (if you're using tfswitch or a static version of Terraform) and provided it as an env variable to your local-exec
- Module version - I guess this could either be determined after a terraform init and then provided as an environment variable, though a little convoluted. Alternatively, depending on whether you are providing the module via Git or uploading an archive, I guess the version could be injected into your module prior to uploading to the registry. If you use git, I assume you would just need forward thinking to update some version constant in the module prior to tagging it.
- Deployment environment - This could be determined by Terrareg based on some method of authentication that the local-exec uses in the API key header to call Terrareg (some thoughts on this below); otherwise, you could again provide this as an environment variable to your local-exec and pass the environment name directly to Terrareg
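As a very rough sketch of the env-variable route (everything here is an assumption: the endpoint path, the environment variable names, and the API key header; note also that a local-exec provisioner only runs on apply, not plan):

```hcl
# Hypothetical snippet injected into a module: notify a Terrareg analytics
# endpoint via local-exec, reading values the pipeline exports as environment
# variables (TERRAFORM_VERSION, DEPLOY_ENVIRONMENT, TERRAREG_ANALYTICS_KEY).
resource "null_resource" "terrareg_analytics" {
  provisioner "local-exec" {
    command = <<-EOT
      curl -s -X POST "https://terrareg.example.com/v1/terrareg/analytics" \
        -H "Content-Type: application/json" \
        -H "X-Terrareg-ApiKey: $TERRAREG_ANALYTICS_KEY" \
        -d "{\"terraform_version\": \"$TERRAFORM_VERSION\", \"environment\": \"$DEPLOY_ENVIRONMENT\"}"
    EOT
  }
}
```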
For authentication... whilst I could foresee the auth tokens that Terraform uses in the pipeline being used to authenticate the analytics request, I wonder if the token is overly permissive. That is, your pipeline can inject the token into the .terraformrc file, but to be usable by the local-exec it would, I assume, probably need to be set as an environment variable (assuming the local-exec doesn't scrape the terraformrc file ;) ). This would mean exposing the token further, and this token would allow both pushing analytics and downloading/querying the modules. Would we want a specific API key in this case for the analytics registration endpoint? If so, I wonder if this should also be an API key per environment, or a static API key that takes the environment as part of the data in the request.
Maybe some food for thought here. I'm fully with this idea and happy to implement. I guess, as a first iteration, it could re-use the existing API keys, and specific API key(s) and/or "environment" arguments could be added to the endpoint at a later date. The terraform version could be optional for now. For the module version, I'm not sure if this should be optional - I will have a think about the consequences of this - but let me know what you think about the feasibility of identifying and passing through the module version.
It just becomes a tad confusing and conflicting when reading Terrareg, that's all
Any suggestions/pointers here are much appreciated - I'm aware the documentation isn't the best, but I haven't spent the time to tackle it properly. If there's anything specific, I'll happily take a look sooner.
I think my concern still stands about the token instructions being displayed unless DISABLE_ANALYTICS is set to true. Which then also removes the analytics tab from the UI.
Bad documentation probably hits again here: the configuration ALLOW_UNIDENTIFIED_DOWNLOADS (https://matthewjohn.github.io/terrareg/CONFIG/#allow_unauthenticated_access) will actually do this job - analytics are visible in the UI, but not enforced. If the plan is to not show analytics tokens in the documentation (because your modules are calling this new endpoint), some work will need to be done to tidy this up in the UI. But this configuration, at least for now, should be what you're looking for.
Many thanks Matt
Link: https://github.com/MatthewJohn/terrareg/issues/66#issuecomment-2480469856
github-comment-id:2480469856
- Author Guest
From @MatthewJohn
Some implementation notes: I wanted to have support for both a `local-exec` and `http` data source, so I tested with the data source to determine the headers:

```hcl
data "http" "this" {
  url    = "https://localhost:5000/v1/terrareg/analytics/adgADG2/example-submodule/null/1.1.0"
  method = "POST"

  request_headers = {
    Content-Type = "application/json"
  }
}
```
I received the headers:
```
Host: localhost:5000
User-Agent: Go-http-client/1.1
Content-Length: 2
Content-Type: application/json
Accept-Encoding: gzip
```
So Terraform doesn't pass the standard Terraform user-agent and `X-Terraform-Version` headers, so I guess the local-exec and http data source will basically act the same.
Link: https://github.com/MatthewJohn/terrareg/issues/66#issuecomment-2480493934
github-comment-id:2480493934
- Matt created branch 548-analytics-token-as-a-query-parameter to address this issue
- Matt mentioned in merge request !435 (merged)
- Author Guest
From @frittsy
Setting the data source aside for a moment, this is a potential script that could be added to our pipelines after `terraform init` is run to log analytics for all downloaded modules:

```bash
terraform_version=$(terraform version --json | jq -r '.terraform_version')
terraform_workspace=$(terraform workspace show)

jq -c '.Modules[] | select(.Source | contains("terrareg.mydomain.io"))' ./.terraform/modules/modules.json | while read -r module; do
  source=$(echo "$module" | jq -r '.Source')
  version=$(echo "$module" | jq -r '.Version')
  module=$(echo "$source" | awk -F/ '{print $2"/"$3"/"$4}')
  body=$(jq -n --arg TerraformVersion "$terraform_version" --arg ModuleVersion "$version" --arg Environment "$terraform_workspace" '$ARGS.named')
  curl -X POST "https://terrareg.mydomain.io/v1/terrareg/modules/$module/analytics" \
    -H "Content-Type: application/json" \
    -d "$body"
done
```
Snippets of this could be used in a `local-exec` to output a file, then you could use a `local_file` data source to read the contents of that file, maybe with `jsondecode()` to more easily extract the properties we want from that file.

Where I'm stuck mentally is where/when this injection would happen. Would it be upon module upload? If so, wouldn't we know the module name and version from the API endpoint parameters? Then they could be hardcoded into what gets injected into that version of the module.
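To illustrate the `local_file` + `jsondecode()` part, a minimal sketch (the file name and its fields are assumptions, mirroring the body built in the script above):

```hcl
# Assumes an earlier local-exec (or pipeline step) wrote the collected details
# to a hypothetical analytics.json in the module directory.
data "local_file" "analytics" {
  filename = "${path.module}/analytics.json"
}

locals {
  # Decode the JSON so individual properties can be referenced elsewhere
  analytics = jsondecode(data.local_file.analytics.content)

  terraform_version = local.analytics.TerraformVersion
  module_version    = local.analytics.ModuleVersion
  environment       = local.analytics.Environment
}
```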
The data source injection idea is cool but might be ever-so-slightly convoluted enough to make me question whether it's a good idea or not
I was dancing around another topic to avoid blowing up the scope even further but since you said
pass the environment name directly to Terrareg ... take the environment as part of the data to the request
I really would love to be able to do this, and log the current Terraform workspace as the "environment". We run all our Terraform from centralized build agents, so we don't have separate deployment environments. Terraform workspaces are what determine which environment we're dealing with. The issue here is that while there are guidelines for naming these workspaces, there is quite a bit of variance, so I'm not sure how I could define the hierarchy of environments for Terrareg (i.e. `dev` -> `prod` vs `test-us-east-1` -> `production-us-east-1`). So maybe it's not viable.

If the plan is to not show analytics tokens in the documentation (because your modules are calling this new endpoint), some work will need to be done to tidy this up in the UI

This is all I was trying to say before. I am using `ALLOW_UNIDENTIFIED_DOWNLOADS`. It's just confusing for consumers to see instructions saying to provide an analytics token when it will be done for them. Nothing major and nothing I can't communicate to consumers on my own in the meantime, but it would be nice to have the option to hide these things in the UI.

Sorry my thoughts are disjointed here and I have more to respond to, but wanted to get some thoughts down while I had a window of time.
Link: https://github.com/MatthewJohn/terrareg/issues/66#issuecomment-2484189698
github-comment-id:2484189698
- Author Guest
From @MatthewJohn
The data source injection idea is cool but might be ever-so-slightly convoluted enough to make me question whether it's a good idea or not
I do agree - if you have full control over the pipeline and are already injecting a script (assuming I'm understanding correctly), having this be self-reliant, as you have here, is probably much more reliable than having a script in the CI pipeline and expecting the module to produce a "magic file" that is read by the script. But, if you mean completely moving all the logic to a local-exec, then yes, I suspect the pipeline that uploads the module could certainly inject it - as you say, it would have all the necessary details about the namespace, name and version. The only attributes that would need to be gathered at runtime are the analytics token and terraform version, which would be set as env variables. The current implementation for this feature has nothing Terraform specific, so it can be executed in Terraform (via http/local-exec) or externally in a script - so you should be able to go whichever direction fits you best.
I really would love to be able to do this, and log the current Terraform workspace as the "environment". We run all our Terraform from centralized build agents, so we don't have separate deployment environments.
I've seen instances of Terrareg be used like this before - the main thing was that it's not the build agent that contains the API key in the .terraformrc file, but the pipeline has an environment-specific variable for this - so the file is generated during the pipeline and the API key matching the environment is used.
I'm not sure how I could define the hierarchy of environments for Terrareg (i.e. dev -> prod vs test-us-east-1 -> production-us-east-1). So maybe it's not viable.
It mostly comes down to what you care about seeing. In my experience, if we need to move away from a particular module version, though it's sometimes useful to see what's deployed to dev environments only, we assume the main migration work for big upgrades will occur for instances where the module has actually been deployed to production. When it comes to multi-region - to a degree, production is production, meaning the infrastructure is "live". But this is completely down to you. That said, the analytics are all recorded. That is, if you have `test-us-east-1`, `prod-us-east-1`, `test-us-east-2`, although the UI only shows the "highest" environment, the information about the others is still present - so if changes are made to display this data differently in the UI, the data will already be present :)

would be nice to have the option to hide these things in the UI.
Agreed, can do this as part of this work :)
Matt
Link: https://github.com/MatthewJohn/terrareg/issues/66#issuecomment-2484797541
github-comment-id:2484797541