Serverless

Serverless endpoints for LLM inference

Instantly access popular generative AI models without managing infrastructure. Pay only for what you use and scale indefinitely with Nscale Serverless.

Models & Pricing

Prices for Chat, Multimodal, Language and Code models are per 1 million tokens, counting both input and output tokens. Prices for Image models are based on image size and steps.
Request Access
Serverless Endpoint                  Author             Type                Price
Flux.1 [schnell]                     Black Forest Labs  Text-to-Image       0.0015 per megapixel
Stable Diffusion XL 1.0              Stability AI       Text-to-Image       0.003 per megapixel
Llama 3.2 11B Vision Instruct Turbo  Meta               Image-Text-to-Text  0.15 per 1M tokens
Llama 3 70B Instruct Turbo           Meta               Text Generation     0.2 per 1M tokens
Mixtral 8x22B Instruct               Mistral AI         Text Generation     0.15 per 1M tokens
AMD Llama 135m                       AMD                Text Generation     0.01 per 1M tokens
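
As a rough illustration of how the prices listed above turn into a bill, here is a minimal Python sketch. The function names `token_cost` and `image_cost` are hypothetical and not part of any Nscale SDK. The sketch assumes that one megapixel is 1,000,000 pixels, that input and output tokens are billed at the same rate, and it ignores any effect of step count on image pricing, since the page does not say how steps are charged. The page also does not state a currency, so results are in the same unit as the listed prices.

```python
# Prices copied from the list above. The currency is not stated on the page.
TOKEN_PRICE_PER_MILLION = {
    "Llama 3.2 11B Vision Instruct Turbo": 0.15,
    "Llama 3 70B Instruct Turbo": 0.2,
    "Mixtral 8x22B Instruct": 0.15,
    "AMD Llama 135m": 0.01,
}
IMAGE_PRICE_PER_MEGAPIXEL = {
    "Flux.1 [schnell]": 0.0015,
    "Stable Diffusion XL 1.0": 0.003,
}


def token_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a text request; input and output tokens are summed."""
    total_tokens = input_tokens + output_tokens
    return TOKEN_PRICE_PER_MILLION[model] * total_tokens / 1_000_000


def image_cost(model: str, width: int, height: int, images: int = 1) -> float:
    """Estimate the cost of generated images by pixel area.

    Assumption: 1 megapixel = 1,000,000 pixels, and step count is ignored.
    """
    megapixels = width * height / 1_000_000
    return IMAGE_PRICE_PER_MEGAPIXEL[model] * megapixels * images


if __name__ == "__main__":
    # 600k prompt tokens plus 400k completion tokens on Llama 3 70B.
    print(token_cost("Llama 3 70B Instruct Turbo", 600_000, 400_000))
    # Ten 1024x1024 images on Stable Diffusion XL 1.0.
    print(image_cost("Stable Diffusion XL 1.0", 1024, 1024, images=10))
```

Treat the result as an estimate only; the actual invoice may differ in rounding or in how steps are charged.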
Need dedicated infrastructure?
Contact Sales

Instant access to leading models

No more idle costs, infrastructure headaches, or cold starts. Get instant access to all models in the Nscale ecosystem and scale as much as you need.
Get Started

Built on high-performance GPU compute

Our inference service is built on the latest AMD Instinct-series GPU accelerators. Combined with high-speed networking and fast storage, we deliver unmatched computational power for AI workloads.
Learn More
OUR HARDWARE

Access cutting-edge hardware

Get Started
AMD MI300X

Harness the power of AMD's MI300X GPUs for unparalleled compute performance and efficiency.

Contact Sales
AMD MI250X

Instant access to AMD MI250X GPUs to drive results for all your computational needs.

Contact Sales
NVIDIA H100

Experience the pinnacle of AI performance with NVIDIA H100 GPUs, available instantly.

Contact Sales

Performance

80% LOWER COST
More performance for less
Nscale delivers an average cost saving of 80% compared with hyperscalers.
30% FASTER
On time to insights
Nscale Cloud accelerates time to insights by up to 30% thanks to its AI-optimised stack.
+40% EFFICIENCY
Improved resource utilisation
Up to 40% improvement in efficiency.
100%
Renewable Energy
Our data centres run on 100% renewable energy generated by hydroelectric dams.

Get access to a fully integrated suite of AI services and compute

Reduce costs, grow revenue, and run your AI workloads more efficiently on a fully integrated platform. Whether you're using Nscale's built-in AI/ML tools or your own, our platform is designed to simplify the journey from development to production.

Marketplace
Training
Inference
GPU Nodes
Nscale's Data Centres
Powered by 100% renewable energy
LLM Library
Pre-configured Software
Pre-configured Infrastructure
Job Management
Job Scheduling
Container Orchestration
Optimised Libraries
Optimised Compilers and Tools
Optimised Runtime

FAQs

What is GPU Nodes from Nscale and how does it work?

Nscale's GPU Nodes offering allows users to access powerful graphics processing units (GPUs) remotely over the internet. At Nscale, we provide on-demand access to high-performance GPUs for tasks such as AI training, rendering, and scientific computing. Users can easily provision and scale GPU resources based on their specific needs.

What types of GPUs does Nscale offer?

Nscale offers a range of GPUs to suit different requirements, including NVIDIA and AMD GPUs. Our lineup includes models such as the NVIDIA A100, H100, and V100, as well as AMD MI300X and MI250X GPUs. These GPUs are optimised for various workloads, from deep learning and machine learning to graphics rendering and scientific simulations.

What are the benefits of using GPU Nodes from Nscale?

By leveraging GPU Nodes from Nscale, users can enjoy several benefits, including:
1. Access to high-performance GPUs without the need for upfront hardware investment.
2. Scalability to easily adjust GPU resources based on workload demands.
3. Cost-effectiveness by paying only for the GPU resources used.
4. Flexibility to choose from a variety of GPU models to suit specific application requirements.
5. Simplified management and provisioning of GPU resources through our user-friendly platform.
6. Reliable performance and uptime, backed by Nscale's robust infrastructure and support services.

What industries can benefit from GPU Nodes?

GPU Nodes from Nscale is beneficial for a wide range of industries, including:
1. Artificial Intelligence (AI) and machine learning research and development.
2. Gaming and entertainment for graphics rendering and simulation.
3. Healthcare for medical imaging and analysis.
4. Finance for quantitative analysis and risk modelling.
5. Automotive for autonomous driving and vehicle simulation.
6. Aerospace and engineering for simulation and modelling.

How secure is GPU Nodes with Nscale?

Security is a top priority at Nscale, and we employ industry-leading security measures to protect our GPU infrastructure and user data. Our platform features robust encryption, access controls, and network security protocols to ensure the confidentiality, integrity, and availability of GPU resources.

Can I try GPU Nodes from Nscale before committing?

Yes, Nscale offers a trial period for users to experience our GPU Nodes platform before making a commitment. During the trial period, users can explore our platform, provision GPU resources, and test their workloads to ensure compatibility and performance. Contact our sales team to inquire about our trial options and get started today.

Access thousands of GPUs tailored to your requirements.