Looking for help fixing a broken web UI. API works perfectly.
Copying over from GitHub issue #8874:
What did you do?
Prometheus is installed on one of our servers. I was tasked with finding the source of errors in the web interface that make the UI unusable, even though the API endpoints work perfectly (Alertmanager and Grafana use them with no problems whatsoever).
What did you expect to see?
Being able to execute queries and see graphs, alerts, and runtime information in the web UI (e.g. at localhost:9090).
What did you see instead? Under which circumstances?
Under all circumstances: I see error messages every single time I try to execute a query, view information about the server, etc., except for a few cases when I use the classic UI.
Pages not working in the new UI:
- All of them => failing with errors citing “JSON.parse: unexpected character at line 1 column 1 of the JSON data”
Pages not working in the classic UI:
- Graph => Error loading available metrics! (the raw response is inspected below)
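For reference, the classic UI’s Graph page fills its metric dropdown from the label-values endpoint, so the raw response the UI receives can be inspected directly (assuming the default localhost:9090 address):

  # the endpoint behind the classic UI’s metric dropdown
  curl -i 'http://localhost:9090/api/v1/label/__name__/values'

A non-JSON body here (for example an HTML error page injected by a proxy) would match the “JSON.parse: unexpected character at line 1 column 1” message.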
Environment
- System information: Linux 4.19.0-10-amd64 x86_64
- Prometheus version: prometheus, version 2.25.2 (branch: HEAD, revision: bda05a23ada314a0b9806a362da39b7a1a4e04c3)
    build user: root@de38ec01ef10
    build date: 20210316-18:07:52
    go version: go1.15.10
    platform: linux/amd64
- Alertmanager version: although Alertmanager is now up and running, it wasn’t set up when the bug started
- Prometheus configuration file:
global:
  scrape_interval: 1m
  evaluation_interval: 1m
  # scrape_timeout is set to the global default (10s).
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # Override the global default and scrape targets from this job every 30 seconds.
    scrape_interval: 30s
    scrape_timeout: 30s
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: operations-dashboard
    scheme: https
    scrape_interval: 10s
    scrape_timeout: 5s
    honor_labels: true
    static_configs:
      - targets:
          - ops-dashboard.example.com
        labels: { "host": "operations-dashboard" }
  - job_name: node
    # If prometheus-node-exporter is installed, grab stats about the local
    # machine by default.
    #   [scheme http]
    static_configs:
      # BEGIN ANSIBLE MANAGED BLOCK example_host
      - targets: ["example_host_name:9100"]
        labels:
          host: "example_host_name"
          team: "IT"
      # END ANSIBLE MANAGED BLOCK example_host
      # ... a few more blocks almost identical to this one, irrelevant to the bug
- Logs: no log entries reporting errors or warnings; everything is “working as expected”.
Here’s what I’ve tried so far:
- understanding the entire configuration file
- checking that the configuration file is valid
- checking that the configuration file passes promtool check config (see the commands below)
- running the configuration file in a local Docker instance of Prometheus with the same version (2.25.2)
  - the UI works on the local instance
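For reference, these are roughly the commands used for the validation and the local reproduction (the config path is illustrative):

  # validate the configuration file
  promtool check config /etc/prometheus/prometheus.yml

  # run the same Prometheus version locally against the same configuration
  docker run --rm -p 9090:9090 \
    -v /etc/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml \
    prom/prometheus:v2.25.2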
 
- restarting the service (sudo systemctl restart prometheus)
- reloading the config (see below for the exact methods)
- checking the logs for more information (sudo journalctl -f -u prometheus)
- changing the log level of Prometheus (adding the argument --log.level=debug in /etc/systemd/system/prometheus.service)
  - this does not yield any more useful information
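For reference, the reload can be triggered in either of these ways (the HTTP endpoint only works if Prometheus was started with --web.enable-lifecycle):

  # reload the configuration by sending SIGHUP to the process
  sudo kill -HUP $(pidof prometheus)

  # alternative, only if --web.enable-lifecycle is set
  curl -X POST http://localhost:9090/-/reload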
 
- inspecting the available information in the classic UI
  - label statistics are available and all nodes are up and scraped properly (the same check via the API is shown below)
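Since the API itself reportedly works, the same target health can be double-checked from the command line (assuming the default address):

  # list scrape targets and their health via the HTTP API
  curl -s 'http://localhost:9090/api/v1/targets'

  # or query the up metric directly
  curl -s 'http://localhost:9090/api/v1/query?query=up'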
 
