I will always advise "start simple".
https://news.ycombinator.com/item?id=46055919
>We have successfully replaced thousands of complicated deep net time series based anomaly detectors at a FANG with statistical (nonparametric, semiparametric) process control ones.
They use 3 to 4 orders lower number of trained parameters and have just enough complexity that a team of 3 or four can handle several thousands of such streams.
Could you explain how? I'm working on essentially this right now, and it seems management wants to go the deep-NN route for our customers.
Without knowing the details it's very hard to give specific recommendations. However, if you follow that thread you will see folks have commented on what has worked for them.
In general I would recommend getting Hyndman's (free) book on forecasting. That will definitely get you up to speed.
https://news.ycombinator.com/item?id=46058611
Wishing you the best.
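To make the quoted approach concrete, here is a minimal sketch of one kind of nonparametric process-control detector. The rolling median/MAD control chart, the window, and the k multiplier are my assumptions for illustration, not necessarily what that team actually used:

    import numpy as np

    def mad_control_chart(x, window=96, k=5.0):
        # Nonparametric analogue of a Shewhart individuals chart:
        # center = trailing rolling median, spread = rolling MAD
        # (scaled by 1.4826 so it estimates sigma under normality).
        x = np.asarray(x, dtype=float)
        flags = np.zeros(len(x), dtype=bool)
        for i in range(window, len(x)):
            ref = x[i - window:i]                    # trailing reference window
            center = np.median(ref)
            mad = 1.4826 * np.median(np.abs(ref - center))
            mad = max(mad, np.finfo(float).eps)      # guard against flat windows
            flags[i] = abs(x[i] - center) > k * mad
        return flags

Note there are no trained parameters at all here, just two knobs per stream, which is plausibly what makes thousands of streams tractable for a team of three or four.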
If you will ship the code over the client's fence and be done with it (that is, no commitments regarding maintenance), then I'd say do what management wants. If you will remain responsible for the ongoing performance of the tool, you will be better off choosing a model you understand.
MBAs do love their neural nets. As a data scientist you have to figure out what game you’re playing: is it the accuracy game or the marketing game? Back when I was a data scientist, I got far better results from “traditional” models than NN, and I was able to run off dozens of models some weeks to get a lot of exposure across the org. Combined with defensible accuracy, this was a winning combination for me. Sometimes you just have to give people what they want, and sometimes that’s cool modeling and a big compute spend rather than good results.
Without getting into specifics (just joined a new firm), we’re working with a bunch of billing data.
Management is leaning toward a deep learning forecasting approach: train a neural net to predict expected cost, then use multiple deviation scorers (including Wasserstein distance) to flag anomalies (a rough sketch of that scoring step follows below).
A simpler v1 is already live, and this newer approach isn’t my call. I’m still fairly new to anomaly detection, so for now I’m mostly trying to learn and ship within the existing direction rather than fight it.
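Here is a minimal sketch of the deviation-scoring step described above, assuming one scorer that compares the recent forecast-residual distribution against older residuals using scipy's 1-D Wasserstein distance; the function name, window, and threshold are placeholders you would calibrate on your own data:

    import numpy as np
    from scipy.stats import wasserstein_distance

    def wasserstein_alarm(actual, forecast, window=48, threshold=0.5):
        # Score how far the recent residual distribution has drifted
        # from the older "business as usual" residuals.
        resid = np.asarray(actual, float) - np.asarray(forecast, float)
        reference, recent = resid[:-window], resid[-window:]
        # Standardize by a robust scale so the threshold is unit-free.
        scale = 1.4826 * np.median(np.abs(reference - np.median(reference)))
        scale = max(scale, np.finfo(float).eps)
        d = wasserstein_distance(reference / scale, recent / scale)
        return d, d > threshold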
I do not understand how a time series can be forecast without training on data from the relevant domain. Like, would these be able to predict EEG/fMRI time series?
The promise is similar to LLMs: if you pretrain on sufficiently large and diverse time-series datasets, you will be able to transfer the model to a completely different use case that exhibits somewhat similar characteristics (in latent space). But it's always good to check what kind of data the model was trained on; e.g., Chronos 2.0's training data is described in Appendix A, Table 6 here: https://arxiv.org/pdf/2510.15821
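In practice a zero-shot forecast with such a pretrained model can look like the sketch below. This assumes the first-generation chronos-forecasting package and its ChronosPipeline API; the Chronos 2.0 release described in the linked paper may expose a different interface:

    import torch
    from chronos import ChronosPipeline  # pip install chronos-forecasting

    # Zero-shot: no fitting on our own series; the pretrained weights do the work.
    pipeline = ChronosPipeline.from_pretrained("amazon/chronos-t5-small")

    context = torch.sin(torch.arange(200) / 10.0)  # stand-in for a real history
    samples = pipeline.predict(context, prediction_length=24)
    # samples has shape [num_series, num_samples, prediction_length]
    low, median, high = torch.quantile(
        samples[0], torch.tensor([0.1, 0.5, 0.9]), dim=0
    )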
Isn’t this the ultimate black box? If a forecasting system is a black box, then you have no chance of understanding why its performance might deteriorate. Once that happens, it essentially becomes a digital paperweight.
That's not a good argument: it's like saying an LLM is a black box, yet we use LLMs all day, every day. The two share the same engineering and operating principles.
Before picking this I would benchmark against my existing data using, e.g., the regression models in darts (https://unit8co.github.io/darts/index.html#regression-models); a rough sketch follows.
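Something like the following, assuming darts' TimeSeries API and a couple of cheap baselines; the synthetic series, lag count, and holdout size are stand-ins for your real data and backtest setup:

    import numpy as np
    from darts import TimeSeries
    from darts.metrics import mape
    from darts.models import LinearRegressionModel, NaiveSeasonal

    # Synthetic hourly series with daily seasonality, kept positive for MAPE.
    values = 2 + np.sin(np.arange(480) * 2 * np.pi / 24) + np.random.normal(0, 0.1, 480)
    series = TimeSeries.from_values(values)
    train, val = series[:-48], series[-48:]  # hold out the last two "days"

    for model in [NaiveSeasonal(K=24), LinearRegressionModel(lags=24)]:
        model.fit(train)
        pred = model.predict(len(val))
        print(type(model).__name__, f"MAPE={mape(val, pred):.2f}")

Any foundation model would need to beat these baselines on your own billing data to justify the extra complexity.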
If these worked we would have heard a lot more about them.
It looks like this is a SaaS with an open-source client only, right?
Moreover, some of the models used, as listed at https://faim.it.com/models, are open models developed by third parties, and how you host and call them is up to you.
How does it compare to tabpfn?
How does next-token prediction work for time series data?
Would be good if the site had a couple of case studies