One of the focus areas at Together Research is new architectures that improve on the Transformer in long-context capability, training, and inference performance. Spinning out of a research program from our team and academic collaborators, with roots in signal processing-inspired sequence models, we are excited to introduce the StripedHyena models.
Deploy virtually any SentenceTransformer: serve the models you already know from the SentenceTransformers library.
Fast inference backends: the inference server is built on top of torch, fastembed (ONNX on CPU), and CTranslate2, getting the most out of your CUDA or CPU hardware (see the request sketch below).
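As a minimal sketch of how such an embedding server is typically queried once a SentenceTransformer model is deployed, the snippet below posts an OpenAI-style embeddings request to a locally running instance. The port, the /embeddings route, and the model id are assumptions for illustration, not guaranteed defaults; adjust them to your deployment.

```python
# Minimal sketch: query a locally running embedding server over an
# OpenAI-compatible /embeddings route. Port, route, and model id are
# assumptions for illustration -- adjust to your deployment.
import requests

payload = {
    # Any SentenceTransformer-style model id the server was started with (assumed).
    "model": "BAAI/bge-small-en-v1.5",
    "input": ["StripedHyena is a new architecture for long context."],
}

resp = requests.post("http://localhost:7997/embeddings", json=payload, timeout=30)
resp.raise_for_status()

# OpenAI-compatible response shape (assumed): data[i].embedding holds the vector.
embedding = resp.json()["data"][0]["embedding"]
print(len(embedding))  # dimensionality of the returned embedding
```

Because the request body mirrors the OpenAI embeddings schema, existing client code can usually be pointed at the local server by changing only the base URL.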