That's the best way I can describe the performance of the locally run instance of KoboldAI when powered by an RTX 4080.
The previous card I'd had in the server was a GTX 1050 Ti. When I prompted the AI with a scenario, the resulting response took 193 seconds.
After installing and configuring the RTX 4080? The same reply took just three seconds, roughly a 64x speedup.
The models I've been using—either Erebus or Pygmalion, depending on how froggy I feel—have been quite impressive.
Eventually, I'd like to write my own neural network from scratch to cut down on potential bloat, and train the model to my own liking.
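For a sense of what "from scratch" involves, here's a toy sketch in Python with NumPy: a two-layer network trained on XOR with plain backpropagation. This is purely illustrative and has nothing to do with KoboldAI's actual models (which are large pretrained transformers); the layer sizes, learning rate, and epoch count are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR dataset: the classic example a single linear layer can't solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Randomly initialized weights for a 2 -> 8 -> 1 network.
W1 = rng.normal(0, 1, (2, 8))
b1 = np.zeros(8)
W2 = rng.normal(0, 1, (8, 1))
b2 = np.zeros(1)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

lr = 1.0
losses = []
for _ in range(2000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    losses.append(float(np.mean((out - y) ** 2)))

    # Backward pass: gradients of mean squared error w.r.t. each parameter.
    d_out = 2 * (out - y) / len(X) * out * (1 - out)
    d_h = d_out @ W2.T * h * (1 - h)

    # Gradient descent update.
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Training a model to your own liking at any useful scale is a very different problem, of course, but the forward/backward/update loop above is the core idea everything else builds on.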