I would really like to monitor network traffic by client/remote IP for a couple of Windows servers and ship that data to Prometheus, but I’m unable to find a good way to accomplish this and wanted to ask the community for suggestions.
It seems that the windows-exporter tcp and net collectors expose network traffic statistics from WMI per NIC interface, but nothing as granular as traffic statistics per remote IP.
I’m aware of how much data this could generate and don’t really need layer 4/7 detail per client, but it’d be helpful to see that server A has sent/received x bytes over x days, broken down by remote IP address. I’m also aware that Wireshark and TCPView could accomplish this, but I’d be concerned about the load on the server from leaving them running continuously (not to mention that they don’t ship to a central monitoring platform with time-series functionality like Prometheus).
Thank you in advance for any assistance you can provide!
Prometheus probably isn’t suitable for this. Even if you don’t include detail about the ports or protocols used, the number of possible remote IP addresses is huge (the IPv4 space has about 4 billion addresses and IPv6 has many orders of magnitude more). The generally accepted rough limit for a single Prometheus server is in the low millions of time series. If you assume a handful of metrics per remote IP (send & receive counts & traffic, etc.) and multiple servers, you might only have capacity for, say, 10k IP addresses (and that doesn’t take into account Prometheus doing anything “normal” like also monitoring applications, node exporter, databases, etc.). Unless the servers only talk to a limited selection of internal servers (meaning the possible IP addresses are constrained to, say, a few hundred), you are likely to exceed this limit rapidly and significantly - the Internet is full of systems doing random IP scans, DDoS attacks and search engine bot checks, as well as real users/customers.
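To make that arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. The function name and the example numbers (a budget of 2 million series, 4 metrics per remote IP, 50 servers) are assumptions picked purely for illustration, not measured limits; plug in your own figures.

```python
def max_remote_ips(series_budget: int, metrics_per_ip: int, servers: int) -> int:
    """Rough upper bound on distinct remote IPs per server.

    Each remote IP seen by each server creates one time series per metric,
    so total series = remote_ips * metrics_per_ip * servers.
    """
    if metrics_per_ip <= 0 or servers <= 0:
        raise ValueError("metrics_per_ip and servers must be positive")
    return series_budget // (metrics_per_ip * servers)


if __name__ == "__main__":
    # Illustrative numbers only (assumed, not measured).
    print(max_remote_ips(series_budget=2_000_000, metrics_per_ip=4, servers=50))
```

With those assumed inputs the whole budget is used up at 10,000 remote IPs per server, before Prometheus has scraped anything else.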
In general it would be much better to think of this sort of data as events rather than metrics. So store the flow log information in a system/database such as Elasticsearch or InfluxDB, which can still be visualised using Grafana dashboards and summarised into aggregated metrics.
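As a minimal sketch of what "summarised into aggregated metrics" could look like, the Python below rolls individual flow events up into per-remote-IP byte totals, keeps only the busiest few addresses and folds everything else into a single "other" bucket so the number of resulting series stays bounded. The `FlowEvent` record and the `summarise_by_remote_ip` function are hypothetical names for illustration; a real flow log will have its own schema, and in practice the aggregation would more likely be a query in the event store itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class FlowEvent(NamedTuple):
    """Hypothetical flow record; real flow logs will have more fields."""
    remote_ip: str
    bytes_sent: int
    bytes_received: int


def summarise_by_remote_ip(
    events: Iterable[FlowEvent], top_n: int
) -> Dict[str, Tuple[int, int]]:
    """Return (bytes_sent, bytes_received) for the top_n busiest remote IPs.

    All remaining addresses are merged into one "other" entry.
    """
    totals = defaultdict(lambda: [0, 0])
    for event in events:
        totals[event.remote_ip][0] += event.bytes_sent
        totals[event.remote_ip][1] += event.bytes_received

    # Busiest first; ties are broken by IP string so the result is stable.
    ranked = sorted(
        totals.items(),
        key=lambda item: (-(item[1][0] + item[1][1]), item[0]),
    )

    summary = {ip: (sent, received) for ip, (sent, received) in ranked[:top_n]}
    remainder = ranked[top_n:]
    if remainder:
        summary["other"] = (
            sum(counts[0] for _ip, counts in remainder),
            sum(counts[1] for _ip, counts in remainder),
        )
    return summary
```

The raw events stay in the event store for drill-down, while only the small summary would ever need to become time series.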
You also mentioned concern about the overhead of running something like Wireshark on a server. This can indeed be fairly significant depending upon exactly what is being done, but that overhead doesn’t go away with other solutions - asking the kernel to intercept all network traffic and send it to a capture application in userspace (which could just be a logging tool) is a fair amount of work. In the networking space a standard way to overcome this is to run the capture tool on another server and use mirrored network ports on the switch the monitored server is plugged into. The switch duplicates everything crossing those ports, so you don’t have to run anything extra at all on the server being monitored.